Query 033035
Match_columns 129
No_of_seqs 151 out of 1142
Neff 6.6
Searched_HMMs 46136
Date Fri Mar 29 09:01:06 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/033035.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/033035hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PTZ00199 high mobility group p 99.9 4.1E-26 9E-31 155.4 10.9 85 42-126 9-94 (94)
2 cd01389 MATA_HMG-box MATA_HMG- 99.9 1.2E-21 2.6E-26 128.3 7.8 70 55-125 1-70 (77)
3 cd01388 SOX-TCF_HMG-box SOX-TC 99.8 5.1E-21 1.1E-25 123.9 8.0 69 56-125 2-70 (72)
4 PF00505 HMG_box: HMG (high mo 99.8 1.1E-20 2.3E-25 120.2 8.9 69 56-125 1-69 (69)
5 PF09011 HMG_box_2: HMG-box do 99.8 2.3E-20 4.9E-25 121.2 9.1 72 53-125 1-73 (73)
6 cd01390 HMGB-UBF_HMG-box HMGB- 99.8 2.8E-20 6.2E-25 117.1 8.9 65 56-121 1-65 (66)
7 smart00398 HMG high mobility g 99.8 3.7E-20 7.9E-25 117.2 9.2 70 55-125 1-70 (70)
8 COG5648 NHP6B Chromatin-associ 99.8 5.5E-20 1.2E-24 140.2 7.9 84 44-128 59-142 (211)
9 KOG0381 HMG box-containing pro 99.8 1.3E-18 2.8E-23 117.5 10.6 76 52-128 17-95 (96)
10 cd00084 HMG-box High Mobility 99.8 3.2E-18 6.9E-23 107.1 8.9 65 56-121 1-65 (66)
11 KOG0527 HMG-box transcription 99.7 1.2E-17 2.7E-22 135.7 6.1 76 49-125 56-131 (331)
12 KOG0526 Nucleosome-binding fac 99.7 4E-17 8.6E-22 138.1 7.3 78 45-127 525-602 (615)
13 KOG4715 SWI/SNF-related matrix 99.2 2E-11 4.3E-16 98.5 7.6 78 48-126 57-134 (410)
14 KOG3248 Transcription factor T 99.2 2.3E-11 4.9E-16 98.9 5.8 72 54-126 190-261 (421)
15 KOG0528 HMG-box transcription 99.1 1.3E-11 2.7E-16 103.9 2.0 78 47-125 317-394 (511)
16 KOG2746 HMG-box transcription 98.3 4.8E-07 1E-11 79.0 4.2 72 48-120 174-247 (683)
17 PF14887 HMG_box_5: HMG (high 98.3 6.3E-06 1.4E-10 54.4 7.5 71 55-127 3-73 (85)
18 COG5648 NHP6B Chromatin-associ 97.3 0.00022 4.8E-09 55.0 3.1 68 54-122 142-209 (211)
19 PF06382 DUF1074: Protein of u 97.1 0.0013 2.8E-08 49.7 5.2 48 60-112 83-130 (183)
20 PF04690 YABBY: YABBY protein; 96.7 0.005 1.1E-07 46.2 5.6 48 51-99 117-164 (170)
21 PF08073 CHDNT: CHDNT (NUC034) 95.6 0.015 3.2E-07 36.0 3.0 40 60-100 13-52 (55)
22 PF06244 DUF1014: Protein of u 91.9 0.26 5.7E-06 35.1 3.8 48 53-101 69-117 (122)
23 PF04769 MAT_Alpha1: Mating-ty 91.1 0.79 1.7E-05 35.3 6.1 56 49-111 37-92 (201)
24 TIGR03481 HpnM hopanoid biosyn 90.4 0.79 1.7E-05 34.9 5.4 44 83-126 65-110 (198)
25 PRK15117 ABC transporter perip 88.1 1.5 3.3E-05 33.6 5.6 47 79-126 66-114 (211)
26 KOG3223 Uncharacterized conser 85.7 1.6 3.4E-05 33.7 4.4 53 54-110 162-215 (221)
27 PF05494 Tol_Tol_Ttg2: Toluene 78.6 3.6 7.8E-05 30.0 4.0 44 83-126 39-84 (170)
28 PF13875 DUF4202: Domain of un 75.6 6.6 0.00014 29.9 4.7 41 61-105 130-170 (185)
29 PF01352 KRAB: KRAB box; Inte 64.1 5.4 0.00012 22.9 1.6 29 84-112 3-32 (41)
30 COG2854 Ttg2D ABC-type transpo 60.3 13 0.00028 28.8 3.5 42 87-128 76-118 (202)
31 PF11304 DUF3106: Protein of u 54.8 53 0.0012 22.5 5.6 11 94-104 55-65 (107)
32 PRK09706 transcriptional repre 50.8 46 0.00099 23.2 4.9 42 86-127 87-128 (135)
33 PF06945 DUF1289: Protein of u 50.4 23 0.0005 21.1 2.8 24 84-112 24-47 (51)
34 PF12881 NUT_N: NUT protein N 40.5 88 0.0019 25.9 5.6 51 63-114 232-282 (328)
35 PF15581 Imm35: Immunity prote 38.9 32 0.00069 23.3 2.4 23 83-105 31-53 (93)
36 PRK12751 cpxP periplasmic stre 37.3 86 0.0019 23.3 4.7 33 86-118 118-150 (162)
37 PRK12750 cpxP periplasmic repr 37.2 1.1E+02 0.0023 22.8 5.3 35 87-121 126-160 (170)
38 cd07081 ALDH_F20_ACDH_EutE-lik 36.6 93 0.002 26.4 5.4 39 86-124 6-44 (439)
39 PF00887 ACBP: Acyl CoA bindin 34.2 1E+02 0.0022 19.9 4.3 53 63-117 30-86 (87)
40 cd07133 ALDH_CALDH_CalB Conife 33.4 1.2E+02 0.0026 25.4 5.6 40 85-124 4-43 (434)
41 PRK10363 cpxP periplasmic repr 31.3 1.4E+02 0.003 22.4 5.0 34 86-119 112-145 (166)
42 PF12650 DUF3784: Domain of un 30.5 37 0.0008 22.4 1.7 16 94-109 25-40 (97)
43 KOG1610 Corticosteroid 11-beta 30.1 1.6E+02 0.0034 24.5 5.6 48 66-113 188-247 (322)
44 cd07132 ALDH_F3AB Aldehyde deh 29.5 1.4E+02 0.0031 25.1 5.4 39 86-124 5-43 (443)
45 cd07122 ALDH_F20_ACDH Coenzyme 28.2 1.5E+02 0.0033 25.2 5.4 39 86-124 6-44 (436)
46 PRK10236 hypothetical protein; 28.0 60 0.0013 25.7 2.7 26 87-112 118-143 (237)
47 cd07087 ALDH_F3-13-14_CALDH-li 27.5 1.8E+02 0.0039 24.3 5.6 39 86-124 5-43 (426)
48 cd07136 ALDH_YwdH-P39616 Bacil 26.5 1.8E+02 0.004 24.6 5.6 41 84-124 3-43 (449)
49 PF05388 Carbpep_Y_N: Carboxyp 24.5 1.1E+02 0.0025 21.3 3.3 29 84-112 45-73 (113)
50 cd07085 ALDH_F6_MMSDH Methylma 24.3 2E+02 0.0043 24.4 5.4 38 86-123 45-82 (478)
51 TIGR00787 dctP tripartite ATP- 23.7 1.8E+02 0.0038 22.3 4.7 27 92-118 213-239 (257)
52 cd07150 ALDH_VaniDH_like Pseud 23.5 2E+02 0.0044 24.0 5.3 38 86-123 28-65 (451)
53 cd08317 Death_ank Death domain 23.1 41 0.0009 21.7 0.9 21 79-100 3-23 (84)
54 PRK00197 proA gamma-glutamyl p 23.0 2E+02 0.0044 24.0 5.2 39 86-124 11-49 (417)
55 cd07077 ALDH-like NAD(P)+-depe 22.9 1.6E+02 0.0035 24.2 4.5 36 88-123 3-38 (397)
56 COG4281 ACB Acyl-CoA-binding p 22.5 98 0.0021 20.5 2.5 60 56-117 17-85 (87)
57 PF14399 Transpep_BrtH: NlpC/p 22.3 3.1E+02 0.0067 21.3 5.9 48 78-125 259-314 (317)
58 PRK13252 betaine aldehyde dehy 22.3 2.5E+02 0.0053 23.9 5.6 38 86-123 51-88 (488)
59 cd07152 ALDH_BenzADH NAD-depen 22.2 2.3E+02 0.005 23.6 5.4 38 86-123 20-57 (443)
60 cd07137 ALDH_F3FHI Plant aldeh 22.1 2.4E+02 0.0051 23.7 5.4 40 85-124 5-44 (432)
61 cd07129 ALDH_KGSADH Alpha-Keto 21.8 2.2E+02 0.0048 24.0 5.2 38 86-123 6-43 (454)
62 cd07084 ALDH_KGSADH-like ALDH 21.6 2.1E+02 0.0046 24.1 5.1 39 85-123 5-43 (442)
63 PRK13968 putative succinate se 21.5 2.4E+02 0.0052 23.9 5.4 39 86-124 36-74 (462)
64 KOG1827 Chromatin remodeling c 21.5 6.7 0.00015 35.1 -4.1 44 59-103 552-595 (629)
65 cd07098 ALDH_F15-22 Aldehyde d 21.0 2.7E+02 0.0059 23.4 5.6 39 86-124 25-63 (465)
66 PRK11241 gabD succinate-semial 20.8 2.4E+02 0.0053 24.1 5.3 37 87-123 56-92 (482)
67 TIGR01780 SSADH succinate-semi 20.5 2.9E+02 0.0062 23.2 5.6 38 86-123 26-63 (448)
68 cd07099 ALDH_DDALDH Methylomon 20.3 2.7E+02 0.0058 23.2 5.4 38 86-123 25-62 (453)
69 COG1638 DctP TRAP-type C4-dica 20.3 2.2E+02 0.0048 23.3 4.8 36 90-125 242-277 (332)
70 PF13945 NST1: Salt tolerance 20.3 2.4E+02 0.0052 21.6 4.6 27 85-111 101-127 (190)
71 cd07108 ALDH_MGR_2402 Magnetos 20.3 2.6E+02 0.0057 23.4 5.3 39 86-124 26-64 (457)
No 1
>PTZ00199 high mobility group protein; Provisional
Probab=99.94 E-value=4.1e-26 Score=155.45 Aligned_cols=85 Identities=45% Similarity=0.694 Sum_probs=78.6
Q ss_pred ccccccccCCCCCCCCCCCChHHHHHHHHHHHHHHhCCCCc-cHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHH
Q 033035 42 RTKNVKSAKKDPNKPKRPPSAFFVFLEEFRKVYKQEHPNVK-AVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKL 120 (129)
Q Consensus 42 ~~~~~~k~~kdp~~PKRP~say~lF~~e~r~~~k~e~p~~~-~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e 120 (129)
..++++++.+||+.|+||+|||||||+++|..|..+||+++ +|.+|+++||++|++||+++|.+|.+.|..++.+|..+
T Consensus 9 ~~k~~~k~~kdp~~PKrP~sAY~~F~~~~R~~i~~~~P~~~~~~~evsk~ige~Wk~ls~eeK~~y~~~A~~dk~rY~~e 88 (94)
T PTZ00199 9 LVRKNKRKKKDPNAPKRALSAYMFFAKEKRAEIIAENPELAKDVAAVGKMVGEAWNKLSEEEKAPYEKKAQEDKVRYEKE 88 (94)
T ss_pred cccccCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHCcCCcccHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHH
Confidence 34555667899999999999999999999999999999985 48999999999999999999999999999999999999
Q ss_pred HHHHhh
Q 033035 121 MTAYNK 126 (129)
Q Consensus 121 ~~~Y~~ 126 (129)
|.+|+.
T Consensus 89 ~~~Y~~ 94 (94)
T PTZ00199 89 KAEYAK 94 (94)
T ss_pred HHHHhC
Confidence 999963
No 2
>cd01389 MATA_HMG-box MATA_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include the fungal mating type gene products MC, MATA1 and Ste11.
Probab=99.86 E-value=1.2e-21 Score=128.29 Aligned_cols=70 Identities=23% Similarity=0.395 Sum_probs=67.8
Q ss_pred CCCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHHh
Q 033035 55 KPKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAYN 125 (129)
Q Consensus 55 ~PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y~ 125 (129)
+|+||+||||||+++.|..|+.+||++ ++.+|+++||.+|+.||+++|.+|.++|..++++|..++++|.
T Consensus 1 ~~kRP~naf~lf~~~~r~~~~~~~p~~-~~~eisk~~g~~Wk~ls~eeK~~y~~~A~~~k~~~~~~~p~Yk 70 (77)
T cd01389 1 KIPRPRNAFILYRQDKHAQLKTENPGL-TNNEISRIIGRMWRSESPEVKAYYKELAEEEKERHAREYPDYK 70 (77)
T ss_pred CCCCCCcHHHHHHHHHHHHHHHHCCCC-CHHHHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHHHCCCCc
Confidence 489999999999999999999999999 7899999999999999999999999999999999999999986
No 3
>cd01388 SOX-TCF_HMG-box SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif.
Probab=99.85 E-value=5.1e-21 Score=123.92 Aligned_cols=69 Identities=36% Similarity=0.523 Sum_probs=66.8
Q ss_pred CCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHHh
Q 033035 56 PKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAYN 125 (129)
Q Consensus 56 PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y~ 125 (129)
.|||+||||+|++++|..|+.+||++ ++.+|+++||++|+.||+++|.+|.+.|..++++|..++++|.
T Consensus 2 iKrP~naf~~F~~~~r~~~~~~~p~~-~~~eisk~l~~~Wk~ls~~eK~~y~~~a~~~k~~y~~~~p~y~ 70 (72)
T cd01388 2 IKRPMNAFMLFSKRHRRKVLQEYPLK-ENRAISKILGDRWKALSNEEKQPYYEEAKKLKELHMKLYPDYK 70 (72)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCCCC-CHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHCcCCC
Confidence 68999999999999999999999999 7899999999999999999999999999999999999999985
No 4
>PF00505 HMG_box: HMG (high mobility group) box; InterPro: IPR000910 High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair []. The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes; the SOX family of transcription factors; SRY sex determining region Y protein and related proteins []; LEF1 lymphoid enhancer binding factor 1 []; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor []; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.; GO: 0003677 DNA binding; PDB: 1I11_A 1J3C_A 1J3D_A 1WZ6_A 1WGF_A 2D7L_A 1GT0_D 3U2B_C 2CRJ_A 2CS1_A ....
Probab=99.84 E-value=1.1e-20 Score=120.17 Aligned_cols=69 Identities=41% Similarity=0.755 Sum_probs=65.3
Q ss_pred CCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHHh
Q 033035 56 PKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAYN 125 (129)
Q Consensus 56 PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y~ 125 (129)
|+||+|||+||+.+++..++.+||++ ++.+|+++||.+|++||+++|.+|.+.|..++..|..+|.+|+
T Consensus 1 PkrP~~af~lf~~~~~~~~k~~~p~~-~~~~i~~~~~~~W~~l~~~eK~~y~~~a~~~~~~y~~~~~~y~ 69 (69)
T PF00505_consen 1 PKRPPNAFMLFCKEKRAKLKEENPDL-SNKEISKILAQMWKNLSEEEKAPYKEEAEEEKERYEKEMPEYK 69 (69)
T ss_dssp SSSS--HHHHHHHHHHHHHHHHSTTS-THHHHHHHHHHHHHCSHHHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred CcCCCCHHHHHHHHHHHHHHHHhccc-ccccchhhHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 89999999999999999999999999 6899999999999999999999999999999999999999995
No 5
>PF09011 HMG_box_2: HMG-box domain; InterPro: IPR015101 This domain is predominantly found in Maelstrom homologue proteins. It has no known function. ; GO: 0005634 nucleus; PDB: 2EQZ_A 1V64_A 2CTO_A 1H5P_A 3TQ6_A 3FGH_A 3TMM_A 1J3X_A 2YRQ_A 1AAB_A ....
Probab=99.84 E-value=2.3e-20 Score=121.18 Aligned_cols=72 Identities=44% Similarity=0.762 Sum_probs=63.3
Q ss_pred CCCCCCCCChHHHHHHHHHHHHHHh-CCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHHh
Q 033035 53 PNKPKRPPSAFFVFLEEFRKVYKQE-HPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAYN 125 (129)
Q Consensus 53 p~~PKRP~say~lF~~e~r~~~k~e-~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y~ 125 (129)
|++||+|+|||+||+.+++..++.. ++.. ++.++++.|+..|++||++||.+|.++|..++.+|..+|..|+
T Consensus 1 p~kpK~~~say~lF~~~~~~~~k~~G~~~~-~~~e~~k~~~~~Wk~Ls~~EK~~Y~~~A~~~k~~y~~e~~~~~ 73 (73)
T PF09011_consen 1 PKKPKRPPSAYNLFMKEMRKEVKEEGGQKQ-SFREVMKEISERWKSLSEEEKEPYEERAKEDKERYEREMKEWN 73 (73)
T ss_dssp SSS--SSSSHHHHHHHHHHHHHHHHT-T-S-SHHHHHHHHHHHHHHS-HHHHHHHHHHHHHHHHHHHHHHHHH-
T ss_pred CcCCCCCCCHHHHHHHHHHHHHHHhcccCC-CHHHHHHHHHHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 7899999999999999999999988 5655 7899999999999999999999999999999999999999996
No 6
>cd01390 HMGB-UBF_HMG-box HMGB-UBF_HMG-box, class II and III members of the HMG-box superfamily of DNA-binding proteins. These proteins bind the minor groove of DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions.
Probab=99.83 E-value=2.8e-20 Score=117.14 Aligned_cols=65 Identities=51% Similarity=0.787 Sum_probs=63.3
Q ss_pred CCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHH
Q 033035 56 PKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLM 121 (129)
Q Consensus 56 PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~ 121 (129)
|++|+|||++|+.++|..+..+||++ ++.+|++.||.+|++||+++|.+|.+.|..++.+|..+|
T Consensus 1 Pkrp~saf~~f~~~~r~~~~~~~p~~-~~~~i~~~~~~~W~~ls~~eK~~y~~~a~~~~~~y~~e~ 65 (66)
T cd01390 1 PKRPLSAYFLFSQEQRPKLKKENPDA-SVTEVTKILGEKWKELSEEEKKKYEEKAEKDKERYEKEM 65 (66)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCcCC-CHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhh
Confidence 89999999999999999999999998 789999999999999999999999999999999999987
No 7
>smart00398 HMG high mobility group.
Probab=99.83 E-value=3.7e-20 Score=117.25 Aligned_cols=70 Identities=47% Similarity=0.772 Sum_probs=67.5
Q ss_pred CCCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHHh
Q 033035 55 KPKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAYN 125 (129)
Q Consensus 55 ~PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y~ 125 (129)
+|++|+|+|++|++++|..+..+||++ ++.+|+++||.+|+.||+++|.+|.+.|..++.+|..++..|.
T Consensus 1 ~pkrp~~~y~~f~~~~r~~~~~~~~~~-~~~~i~~~~~~~W~~l~~~ek~~y~~~a~~~~~~y~~~~~~y~ 70 (70)
T smart00398 1 KPKRPMSAFMLFSQENRAKIKAENPDL-SNAEISKKLGERWKLLSEEEKAPYEEKAKKDKERYEEEMPEYK 70 (70)
T ss_pred CcCCCCcHHHHHHHHHHHHHHHHCcCC-CHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 589999999999999999999999999 6899999999999999999999999999999999999999984
No 8
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=99.81 E-value=5.5e-20 Score=140.18 Aligned_cols=84 Identities=43% Similarity=0.703 Sum_probs=79.5
Q ss_pred ccccccCCCCCCCCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHH
Q 033035 44 KNVKSAKKDPNKPKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTA 123 (129)
Q Consensus 44 ~~~~k~~kdp~~PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~ 123 (129)
+...++.+|||.|+||+||||+|+.++|++|..++|++ .|++|++++|++|++|+++++.+|...+..++++|+.++..
T Consensus 59 k~~~r~k~dpN~PKRp~sayf~y~~~~R~ei~~~~p~l-~~~e~~k~~~e~WK~Ltd~eke~y~k~~~~~~erYq~ek~~ 137 (211)
T COG5648 59 KRLVRKKKDPNGPKRPLSAYFLYSAENRDEIRKENPKL-TFGEVGKLLSEKWKELTDEEKEPYYKEANSDRERYQREKEE 137 (211)
T ss_pred HHHHHHhcCCCCCCCchhHHHHHHHHHHHHHHHhCCCC-ChHHHHHHHHHHHHhccHhhhhhHHHHHhhHHHHHHHHHHh
Confidence 45567889999999999999999999999999999999 79999999999999999999999999999999999999999
Q ss_pred Hhhhc
Q 033035 124 YNKKQ 128 (129)
Q Consensus 124 Y~~k~ 128 (129)
|+.+.
T Consensus 138 y~~k~ 142 (211)
T COG5648 138 YNKKL 142 (211)
T ss_pred hhccc
Confidence 98753
No 9
>KOG0381 consensus HMG box-containing protein [General function prediction only]
Probab=99.79 E-value=1.3e-18 Score=117.54 Aligned_cols=76 Identities=49% Similarity=0.779 Sum_probs=72.5
Q ss_pred CC--CCCCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHH-HHhhhc
Q 033035 52 DP--NKPKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMT-AYNKKQ 128 (129)
Q Consensus 52 dp--~~PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~-~Y~~k~ 128 (129)
|| +.|++|+|||++|+.+.+..++.+||++ ++.+|+++||++|++|+++++.+|...+..++.+|..+|. .|+...
T Consensus 17 ~p~~~~pkrp~sa~~~f~~~~~~~~k~~~p~~-~~~~v~k~~g~~W~~l~~~~k~~y~~ka~~~k~~Y~~~~~~~~~~~~ 95 (96)
T KOG0381|consen 17 DPNAQAPKRPLSAFFLFSSEQRSKIKAENPGL-SVGEVAKALGEMWKNLAEEEKQPYEEKASKLKEKYEKELAGEYKASL 95 (96)
T ss_pred CCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCC-CHHHHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHhhcc
Confidence 77 5999999999999999999999999998 7899999999999999999999999999999999999999 998764
No 10
>cd00084 HMG-box High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III member
Probab=99.77 E-value=3.2e-18 Score=107.14 Aligned_cols=65 Identities=49% Similarity=0.772 Sum_probs=62.6
Q ss_pred CCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHH
Q 033035 56 PKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLM 121 (129)
Q Consensus 56 PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~ 121 (129)
|++|+|+|++|+++++..+..+||++ ++.+|+.+||.+|+.||+++|.+|.+.|..++.+|..++
T Consensus 1 pkrp~~af~~f~~~~~~~~~~~~~~~-~~~~i~~~~~~~W~~l~~~~k~~y~~~a~~~~~~y~~~~ 65 (66)
T cd00084 1 PKRPLSAYFLFSQEHRAEVKAENPGL-SVGEISKILGEMWKSLSEEEKKKYEEKAEKDKERYEKEM 65 (66)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCcCC-CHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhh
Confidence 78999999999999999999999998 689999999999999999999999999999999999875
No 11
>KOG0527 consensus HMG-box transcription factor [Transcription]
Probab=99.71 E-value=1.2e-17 Score=135.73 Aligned_cols=76 Identities=29% Similarity=0.522 Sum_probs=72.2
Q ss_pred cCCCCCCCCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHHh
Q 033035 49 AKKDPNKPKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAYN 125 (129)
Q Consensus 49 ~~kdp~~PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y~ 125 (129)
.....++.||||||||||++..|..|..++|+++| .||+++||.+|+.|+++||.+|++.|++++..|++++.+|+
T Consensus 56 ~k~~~~hIKRPMNAFMVWSq~~RRkma~qnP~mHN-SEISK~LG~~WK~Lse~EKrPFi~EAeRLR~~HmkehPdYK 131 (331)
T KOG0527|consen 56 DKTSTDRIKRPMNAFMVWSQGQRRKLAKQNPKMHN-SEISKRLGAEWKLLSEEEKRPFVDEAERLRAQHMKEYPDYK 131 (331)
T ss_pred CCCCccccCCCcchhhhhhHHHHHHHHHhCcchhh-HHHHHHHHHHHhhcCHhhhccHHHHHHHHHHHHHHhCCCcc
Confidence 45667899999999999999999999999999976 89999999999999999999999999999999999999995
No 12
>KOG0526 consensus Nucleosome-binding factor SPN, POB3 subunit [Transcription; Replication, recombination and repair; Chromatin structure and dynamics]
Probab=99.69 E-value=4e-17 Score=138.11 Aligned_cols=78 Identities=40% Similarity=0.664 Sum_probs=73.5
Q ss_pred cccccCCCCCCCCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHH
Q 033035 45 NVKSAKKDPNKPKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAY 124 (129)
Q Consensus 45 ~~~k~~kdp~~PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y 124 (129)
++.++.+|||+||||+||||||++..|..|+.+ ++ ++++|++.+|++|+.||. |.+|++.|+.++++|+.+|.+|
T Consensus 525 k~~kk~kdpnapkra~sa~m~w~~~~r~~ik~d--gi-~~~dv~kk~g~~wk~ms~--k~~we~ka~~dk~ry~~em~~y 599 (615)
T KOG0526|consen 525 KKGKKKKDPNAPKRATSAYMLWLNASRESIKED--GI-SVGDVAKKAGEKWKQMSA--KEEWEDKAAVDKQRYEDEMKEY 599 (615)
T ss_pred cCcccCCCCCCCccchhHHHHHHHhhhhhHhhc--Cc-hHHHHHHHHhHHHhhhcc--cchhhHHHHHHHHHHHHHHHhh
Confidence 666789999999999999999999999999988 77 799999999999999999 9999999999999999999999
Q ss_pred hhh
Q 033035 125 NKK 127 (129)
Q Consensus 125 ~~k 127 (129)
+.-
T Consensus 600 k~g 602 (615)
T KOG0526|consen 600 KNG 602 (615)
T ss_pred cCC
Confidence 853
No 13
>KOG4715 consensus SWI/SNF-related matrix-associated actin-dependent regulator of chromatin [Chromatin structure and dynamics]
Probab=99.25 E-value=2e-11 Score=98.53 Aligned_cols=78 Identities=24% Similarity=0.507 Sum_probs=73.8
Q ss_pred ccCCCCCCCCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHHhh
Q 033035 48 SAKKDPNKPKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAYNK 126 (129)
Q Consensus 48 k~~kdp~~PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y~~ 126 (129)
...+.|.+|-+|+-+||.|+...|++|+..||++ .+-+|+++||.+|..|+++||..|...++.++..|++.|..|..
T Consensus 57 t~pkpPkppekpl~pymrySrkvWd~VkA~nPe~-kLWeiGK~Ig~mW~dLpd~EK~ey~~EYeaEKieY~~smkayh~ 134 (410)
T KOG4715|consen 57 TRPKPPKPPEKPLMPYMRYSRKVWDQVKASNPEL-KLWEIGKIIGGMWLDLPDEEKQEYLNEYEAEKIEYNESMKAYHN 134 (410)
T ss_pred cCCCCCCCCCcccchhhHHhhhhhhhhhccCcch-HHHHHHHHHHHHHhhCcchHHHHHHHHHHHHHHHHHHHHHHhhC
Confidence 4566788899999999999999999999999999 68999999999999999999999999999999999999999976
No 14
>KOG3248 consensus Transcription factor TCF-4 [Transcription]
Probab=99.20 E-value=2.3e-11 Score=98.87 Aligned_cols=72 Identities=24% Similarity=0.429 Sum_probs=64.8
Q ss_pred CCCCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHHhh
Q 033035 54 NKPKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAYNK 126 (129)
Q Consensus 54 ~~PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y~~ 126 (129)
...|+|+||||||+.|.|..|.+++- +...++|.++||.+|..||.+|.+.|.++|..+++.+...+..|-.
T Consensus 190 phiKKPLNAFmlyMKEmRa~vvaEct-lKeSAaiNqiLGrRWH~LSrEEQAKYyElArKerqlH~qlYP~WSA 261 (421)
T KOG3248|consen 190 PHIKKPLNAFMLYMKEMRAKVVAECT-LKESAAINQILGRRWHALSREEQAKYYELARKERQLHMQLYPGWSA 261 (421)
T ss_pred ccccccHHHHHHHHHHHHHHHHHHhh-hhhHHHHHHHHhHHHhhhhHHHHHHHHHHHHHHHHHHHHhcCCcch
Confidence 46799999999999999999999885 5445899999999999999999999999999999999988877754
No 15
>KOG0528 consensus HMG-box transcription factor SOX5 [Transcription]
Probab=99.15 E-value=1.3e-11 Score=103.88 Aligned_cols=78 Identities=24% Similarity=0.437 Sum_probs=70.5
Q ss_pred cccCCCCCCCCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHHh
Q 033035 47 KSAKKDPNKPKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAYN 125 (129)
Q Consensus 47 ~k~~kdp~~PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y~ 125 (129)
+-+...++..|||+||||+|..+.|..|...+||+++ .+|+++||.+|+.||..||++|.+.-.++...|.+..++|+
T Consensus 317 rg~~ss~PHIKRPMNAFMVWAkDERRKILqA~PDMHN-SnISKILGSRWKaMSN~eKQPYYEEQaRLSk~HlEk~PdYr 394 (511)
T KOG0528|consen 317 RGRASSEPHIKRPMNAFMVWAKDERRKILQAFPDMHN-SNISKILGSRWKAMSNTEKQPYYEEQARLSKLHLEKYPDYR 394 (511)
T ss_pred cCcCCCCccccCCcchhhcccchhhhhhhhcCccccc-cchhHHhcccccccccccccchHHHHHHHHHhhhccCcccc
Confidence 3455567788999999999999999999999999987 69999999999999999999999999988888888888886
No 16
>KOG2746 consensus HMG-box transcription factor Capicua and related proteins [Transcription]
Probab=98.33 E-value=4.8e-07 Score=79.03 Aligned_cols=72 Identities=31% Similarity=0.417 Sum_probs=65.1
Q ss_pred ccCCCCCCCCCCCChHHHHHHHHH--HHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHH
Q 033035 48 SAKKDPNKPKRPPSAFFVFLEEFR--KVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKL 120 (129)
Q Consensus 48 k~~kdp~~PKRP~say~lF~~e~r--~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e 120 (129)
.-+.|.....+|||+|++|++.+| ..+...||+.. -..|++|||+.|-.|-+.||..|.++|.+.++.|.+.
T Consensus 174 pnkr~k~HirrPMnaf~ifskrhr~~g~vhq~~pn~D-NrtIskiLgewWytL~~~Ekq~yhdLa~Qvk~Ahfka 247 (683)
T KOG2746|consen 174 PNKRDKDHIRRPMNAFHIFSKRHRGEGRVHQRHPNQD-NRTISKILGEWWYTLGPNEKQKYHDLAFQVKEAHFKA 247 (683)
T ss_pred CCcCcchhhhhhhHHHHHHHhhcCCccchhccCcccc-chhHHHHHhhhHhhhCchhhhhHHHHHHHHHHHHhhh
Confidence 345566788999999999999999 88999999994 4899999999999999999999999999999988775
No 17
>PF14887 HMG_box_5: HMG (high mobility group) box 5; PDB: 1L8Y_A 1L8Z_A 2HDZ_A.
Probab=98.27 E-value=6.3e-06 Score=54.36 Aligned_cols=71 Identities=23% Similarity=0.289 Sum_probs=58.4
Q ss_pred CCCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHHhhh
Q 033035 55 KPKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAYNKK 127 (129)
Q Consensus 55 ~PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y~~k 127 (129)
.|-.|-++--||.+.....+...+++. ...+ .+.+...|++|++.+|-+|...|.++..+|+.+|.+|+.-
T Consensus 3 lPE~PKt~qe~Wqq~vi~dYla~~~~d-r~K~-~kam~~~W~~me~Kekl~WIkKA~EdqKrYE~el~e~r~~ 73 (85)
T PF14887_consen 3 LPETPKTAQEIWQQSVIGDYLAKFRND-RKKA-LKAMEAQWSQMEKKEKLKWIKKAAEDQKRYERELREMRSA 73 (85)
T ss_dssp -S----THHHHHHHHHHHHHHHHTTST-HHHH-HHHHHHHHHTTGGGHHHHHHHHHHHHHHHHHHHHHCCS-C
T ss_pred CCCCCCCHHHHHHHHHHHHHHHHhhHh-HHHH-HHHHHHHHHHhhhhhhhHHHHHHHHHHHHHHHHHHHHhcC
Confidence 467788999999999999999999888 3334 5589999999999999999999999999999999999763
No 18
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=97.28 E-value=0.00022 Score=54.97 Aligned_cols=68 Identities=21% Similarity=0.290 Sum_probs=59.6
Q ss_pred CCCCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHH
Q 033035 54 NKPKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMT 122 (129)
Q Consensus 54 ~~PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~ 122 (129)
.+|..|...|+-|-.+.|..+...+|+. +..++++++|..|++|++.-+.+|.+.+..++.+|...+.
T Consensus 142 ~~~~~~~~~~~e~~~~~r~~~~~~~~~~-~~~e~~k~~~~~w~el~~skK~~~~~~~Kk~k~~~~~~~~ 209 (211)
T COG5648 142 LPNKAPIGPFIENEPKIRPKVEGPSPDK-ALVEETKIISKAWSELDESKKKKYIDKYKKLKEEYDSFYP 209 (211)
T ss_pred cCCCCCCchhhhccHHhccccCCCCcch-hhhHHhhhhhhhhhhhChhhhhHHHHHHHHHHHHHhhhcc
Confidence 4566777788888888888888888888 6789999999999999999999999999999999987664
No 19
>PF06382 DUF1074: Protein of unknown function (DUF1074); InterPro: IPR024460 This family consists of several proteins which appear to be specific to Insecta. The function of this family is unknown.
Probab=97.06 E-value=0.0013 Score=49.66 Aligned_cols=48 Identities=27% Similarity=0.522 Sum_probs=41.3
Q ss_pred CChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHH
Q 033035 60 PSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAK 112 (129)
Q Consensus 60 ~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~ 112 (129)
-++||-|+.+++. .|.++ +..|+....+..|..||+.+|..|..++..
T Consensus 83 nnaYLNFLReFRr----kh~~L-~p~dlI~~AAraW~rLSe~eK~rYrr~~~~ 130 (183)
T PF06382_consen 83 NNAYLNFLREFRR----KHCGL-SPQDLIQRAARAWCRLSEAEKNRYRRMAPS 130 (183)
T ss_pred chHHHHHHHHHHH----HccCC-CHHHHHHHHHHHHHhCCHHHHHHHHhhcch
Confidence 4689999999887 45788 568999999999999999999999986553
No 20
>PF04690 YABBY: YABBY protein; InterPro: IPR006780 YABBY proteins are a group of plant-specific transcription factors involved in the specification of abaxial polarity in lateral organs such as leaves and floral organs [, ].
Probab=96.65 E-value=0.005 Score=46.24 Aligned_cols=48 Identities=29% Similarity=0.508 Sum_probs=40.9
Q ss_pred CCCCCCCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCC
Q 033035 51 KDPNKPKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLT 99 (129)
Q Consensus 51 kdp~~PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls 99 (129)
+.|.+-.|-+|||..|+.+.-..|+..+|++ +-.|.....+..|...+
T Consensus 117 kPPEKRqR~psaYn~f~k~ei~rik~~~p~i-shkeaFs~aAknW~h~p 164 (170)
T PF04690_consen 117 KPPEKRQRVPSAYNRFMKEEIQRIKAENPDI-SHKEAFSAAAKNWAHFP 164 (170)
T ss_pred CCccccCCCchhHHHHHHHHHHHHHhcCCCC-CHHHHHHHHHHhhhhCc
Confidence 3344445779999999999999999999999 56899999999998765
No 21
>PF08073 CHDNT: CHDNT (NUC034) domain; InterPro: IPR012958 The CHD N-terminal domain is found in PHD/RING fingers and chromo domain-associated helicases [].; GO: 0003677 DNA binding, 0005524 ATP binding, 0008270 zinc ion binding, 0016818 hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus
Probab=95.62 E-value=0.015 Score=36.02 Aligned_cols=40 Identities=15% Similarity=0.365 Sum_probs=35.2
Q ss_pred CChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCCh
Q 033035 60 PSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTD 100 (129)
Q Consensus 60 ~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~ 100 (129)
++.|=+|++..|+.|...||.+ .+..|...++..|+.-+.
T Consensus 13 lt~yK~Fsq~vRP~l~~~NPk~-~~sKl~~l~~AKwrEF~~ 52 (55)
T PF08073_consen 13 LTNYKAFSQHVRPLLAKANPKA-PMSKLMMLLQAKWREFQE 52 (55)
T ss_pred HHHHHHHHHHHHHHHHHHCCCC-cHHHHHHHHHHHHHHHHh
Confidence 3668899999999999999999 678999999999987543
No 22
>PF06244 DUF1014: Protein of unknown function (DUF1014); InterPro: IPR010422 This family consists of several hypothetical eukaryotic proteins of unknown function.
Probab=91.90 E-value=0.26 Score=35.13 Aligned_cols=48 Identities=21% Similarity=0.334 Sum_probs=39.9
Q ss_pred CCCCC-CCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChh
Q 033035 53 PNKPK-RPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDA 101 (129)
Q Consensus 53 p~~PK-RP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~ 101 (129)
..+|- |---||.-|.....+.++.++|+| -..++-.+|-.+|...++.
T Consensus 69 drHPErR~KAAy~afeE~~Lp~lK~E~PgL-rlsQ~kq~l~K~w~KSPeN 117 (122)
T PF06244_consen 69 DRHPERRMKAAYKAFEERRLPELKEENPGL-RLSQYKQMLWKEWQKSPEN 117 (122)
T ss_pred CCCcchhHHHHHHHHHHHHhHHHHhhCCCc-hHHHHHHHHHHHHhcCCCC
Confidence 34444 444789999999999999999999 6789999999999877754
No 23
>PF04769 MAT_Alpha1: Mating-type protein MAT alpha 1; InterPro: IPR006856 This family includes Saccharomyces cerevisiae (Baker's yeast) mating type protein alpha 1 (P01365 from SWISSPROT). MAT alpha 1 is a transcription activator that activates mating-type alpha-specific genes with the help of the MADS-box containing MCM1 transcription factor, which together bind cooperatively to PQ elements upstream of alpha-specific genes. The MCM1-MATalpha1 complex is required for the proper DNA-bending that is needed for transcriptional activation []. Alpha 1 interacts in vivo with STE12, linking expression of alpha-specific genes to the alpha-pheromone (IPR006742 from INTERPRO) response pathway [].; GO: 0000772 mating pheromone activity, 0003677 DNA binding, 0045895 positive regulation of transcription, mating-type specific, 0005634 nucleus
Probab=91.14 E-value=0.79 Score=35.28 Aligned_cols=56 Identities=21% Similarity=0.357 Sum_probs=39.9
Q ss_pred cCCCCCCCCCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHH
Q 033035 49 AKKDPNKPKRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAA 111 (129)
Q Consensus 49 ~~kdp~~PKRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~ 111 (129)
.......++||+|+||+|..-.- ...|+. ...+++..|+..|..=+ -+..|.-.+.
T Consensus 37 ~~~~~~~~kr~lN~Fm~FRsyy~----~~~~~~-~Qk~~S~~l~~lW~~dp--~k~~W~l~ak 92 (201)
T PF04769_consen 37 RKRSPEKAKRPLNGFMAFRSYYS----PIFPPL-PQKELSGILTKLWEKDP--FKNKWSLMAK 92 (201)
T ss_pred ccccccccccchhHHHHHHHHHH----hhcCCc-CHHHHHHHHHHHHhCCc--cHhHHHHHhh
Confidence 34456678999999999987765 344666 45799999999998632 2455655544
No 24
>TIGR03481 HpnM hopanoid biosynthesis associated membrane protein HpnM. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins are members of the pfam05494 family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the members included here have anything to do with toluene per-se.
Probab=90.37 E-value=0.79 Score=34.88 Aligned_cols=44 Identities=18% Similarity=0.388 Sum_probs=37.9
Q ss_pred cHHHHHH-HHHHhhcCCChhhhHHHHHHHHH-HHHHHHHHHHHHhh
Q 033035 83 AVSAVGK-AGGEKWKSLTDAEKAPFEAKAAK-RKLDYEKLMTAYNK 126 (129)
Q Consensus 83 ~~~ei~k-~l~~~Wk~ls~~eK~~Y~~~a~~-~k~~Y~~e~~~Y~~ 126 (129)
+|..|++ .+|.-|+.+|+++++.|.+.... ....|-..+..|..
T Consensus 65 Df~~mar~vLG~~W~~~s~~Qr~~F~~~F~~~l~~tY~~~l~~y~~ 110 (198)
T TIGR03481 65 DLPAMARLTLGSSWTSLSPEQRRRFIGAFRELSIATYASQFKSYAG 110 (198)
T ss_pred CHHHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhcC
Confidence 5677766 68899999999999999998888 77889999988865
No 25
>PRK15117 ABC transporter periplasmic binding protein MlaC; Provisional
Probab=88.11 E-value=1.5 Score=33.64 Aligned_cols=47 Identities=26% Similarity=0.367 Sum_probs=38.4
Q ss_pred CCCccHHHHHH-HHHHhhcCCChhhhHHHHHHHHH-HHHHHHHHHHHHhh
Q 033035 79 PNVKAVSAVGK-AGGEKWKSLTDAEKAPFEAKAAK-RKLDYEKLMTAYNK 126 (129)
Q Consensus 79 p~~~~~~ei~k-~l~~~Wk~ls~~eK~~Y~~~a~~-~k~~Y~~e~~~Y~~ 126 (129)
|.. +|..+++ .+|.-|+.+|++++..|.+.... +..-|-..+..|..
T Consensus 66 p~~-Df~~~s~~vLG~~wr~as~eQr~~F~~~F~~~Lv~tYa~~l~~y~~ 114 (211)
T PRK15117 66 PYV-QVKYAGALVLGRYYKDATPAQREAYFAAFREYLKQAYGQALAMYHG 114 (211)
T ss_pred ccC-CHHHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHHhCC
Confidence 555 6777766 67899999999999999987766 55679999998865
No 26
>KOG3223 consensus Uncharacterized conserved protein [Function unknown]
Probab=85.74 E-value=1.6 Score=33.69 Aligned_cols=53 Identities=30% Similarity=0.461 Sum_probs=43.6
Q ss_pred CCC-CCCCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHH
Q 033035 54 NKP-KRPPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKA 110 (129)
Q Consensus 54 ~~P-KRP~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a 110 (129)
.+| +|=.-||.-|-....+.|+.+||++ ...++-.+|-.+|..-++. ||.+.+
T Consensus 162 rHPEkRmrAA~~afEe~~LPrLK~e~P~l-rlsQ~Kqll~Kew~KsPDN---P~Nq~~ 215 (221)
T KOG3223|consen 162 RHPEKRMRAAFKAFEEARLPRLKKENPGL-RLSQYKQLLKKEWQKSPDN---PFNQAA 215 (221)
T ss_pred cChHHHHHHHHHHHHHhhchhhhhcCCCc-cHHHHHHHHHHHHhhCCCC---hhhHHh
Confidence 344 3445679999999999999999999 6889999999999988886 776654
No 27
>PF05494 Tol_Tol_Ttg2: Toluene tolerance, Ttg2 ; InterPro: IPR008869 Toluene tolerance is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation []. Many proteins are involved in these processes. This family is a transporter which shows similarity to ABC transporters [].; PDB: 2QGU_A.
Probab=78.57 E-value=3.6 Score=29.99 Aligned_cols=44 Identities=16% Similarity=0.332 Sum_probs=32.4
Q ss_pred cHHHHHH-HHHHhhcCCChhhhHHHHHHHHH-HHHHHHHHHHHHhh
Q 033035 83 AVSAVGK-AGGEKWKSLTDAEKAPFEAKAAK-RKLDYEKLMTAYNK 126 (129)
Q Consensus 83 ~~~ei~k-~l~~~Wk~ls~~eK~~Y~~~a~~-~k~~Y~~e~~~Y~~ 126 (129)
+|..|++ .||.-|+.||++++..|.+.... ....|-..+..|..
T Consensus 39 D~~~~ar~~LG~~w~~~s~~q~~~F~~~f~~~l~~~Y~~~l~~y~~ 84 (170)
T PF05494_consen 39 DFERMARRVLGRYWRKASPAQRQRFVEAFKQLLVRTYAKRLDEYSG 84 (170)
T ss_dssp -HHHHHHHHHGGGTTTS-HHHHHHHHHHHHHHHHHHHHHHHHT-SS
T ss_pred CHHHHHHHHHHHhHhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhCC
Confidence 5666665 46789999999999999987766 55678888888864
No 28
>PF13875 DUF4202: Domain of unknown function (DUF4202)
Probab=75.59 E-value=6.6 Score=29.95 Aligned_cols=41 Identities=17% Similarity=0.378 Sum_probs=34.2
Q ss_pred ChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHH
Q 033035 61 SAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAP 105 (129)
Q Consensus 61 say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~ 105 (129)
-+.++|+..+.+.+...| +-..+..||...|+.||+.-++.
T Consensus 130 vacLVFL~~~f~~F~~~~----deeK~v~Il~KTw~KMS~~g~~~ 170 (185)
T PF13875_consen 130 VACLVFLEYYFEDFAAKH----DEEKIVDILRKTWRKMSERGHEA 170 (185)
T ss_pred hHHHHhHHHHHHHHHhcC----CHHHHHHHHHHHHHHCCHHHHHH
Confidence 358999999999998888 23578899999999999987653
No 29
>PF01352 KRAB: KRAB box; InterPro: IPR001909 The Krueppel-associated box (KRAB) is a domain of around 75 amino acids that is found in the N-terminal part of about one third of eukaryotic Krueppel-type C2H2 zinc finger proteins (ZFPs) []. It is enriched in charged amino acids and can be divided into subregions A and B, which are predicted to fold into two amphipathic alpha-helices. The KRAB A and B boxes can be separated by variable spacer segments and many KRAB proteins contain only the A box []. The functions currently known for members of the KRAB-containing protein family include transcriptional repression of RNA polymerase I, II, and III promoters, binding and splicing of RNA, and control of nucleolus function. The KRAB domain functions as a transcriptional repressor when tethered to the template DNA by a DNA-binding domain. A sequence of 45 amino acids in the KRAB A subdomain has been shown to be necessary and sufficient for transcriptional repression. The B box does not repress by itself but does potentiate the repression exerted by the KRAB A subdomain [, ]. Gene silencing requires the binding of the KRAB domain to the RING-B box-coiled coil (RBCC) domain of the KAP-1/TIF1-beta corepressor. As KAP-1 binds to the heterochromatin proteins HP1, it has been proposed that the KRAB-ZFP-bound target gene could be silenced following recruitment to heterochromatin [, ]. KRAB-ZFPs probably constitute the single largest class of transcription factors within the human genome []. Although the function of KRAB-ZFPs is largely unknown, they appear to play important roles during cell differentiation and development. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B.; GO: 0003676 nucleic acid binding, 0006355 regulation of transcription, DNA-dependent, 0005622 intracellular; PDB: 1V65_A.
Probab=64.15 E-value=5.4 Score=22.89 Aligned_cols=29 Identities=21% Similarity=0.279 Sum_probs=16.7
Q ss_pred HHHHHHHHH-HhhcCCChhhhHHHHHHHHH
Q 033035 84 VSAVGKAGG-EKWKSLTDAEKAPFEAKAAK 112 (129)
Q Consensus 84 ~~ei~k~l~-~~Wk~ls~~eK~~Y~~~a~~ 112 (129)
|.+|+--++ +.|..|.+.+|..|.+...+
T Consensus 3 f~Dvav~fs~eEW~~L~~~Qk~ly~dvm~E 32 (41)
T PF01352_consen 3 FEDVAVYFSQEEWELLDPAQKNLYRDVMLE 32 (41)
T ss_dssp ----TT---HHHHHTS-HHHHHHHHHHHHH
T ss_pred EEEEEEEcChhhcccccceecccchhHHHH
Confidence 344444444 66999999999999886543
No 30
>COG2854 Ttg2D ABC-type transport system involved in resistance to organic solvents, auxiliary component [Secondary metabolites biosynthesis, transport, and catabolism]
Probab=60.34 E-value=13 Score=28.78 Aligned_cols=42 Identities=12% Similarity=0.205 Sum_probs=34.4
Q ss_pred HHHHHHHhhcCCChhhhHHHHHHHHH-HHHHHHHHHHHHhhhc
Q 033035 87 VGKAGGEKWKSLTDAEKAPFEAKAAK-RKLDYEKLMTAYNKKQ 128 (129)
Q Consensus 87 i~k~l~~~Wk~ls~~eK~~Y~~~a~~-~k~~Y~~e~~~Y~~k~ 128 (129)
-..-+|.-|+.+|+++++.|...... ....|-..+..|+..+
T Consensus 76 a~~vLGk~~k~aspeQ~~~F~~aF~~yl~q~Y~~aL~~Y~~q~ 118 (202)
T COG2854 76 AKLVLGKYYKTASPEQRQAFFKAFRTYLEQTYGQALLDYKGQT 118 (202)
T ss_pred HHHHhccccccCCHHHHHHHHHHHHHHHHHHHHHHHHHccCCC
Confidence 34567899999999999999987666 5567999999998753
No 31
>PF11304 DUF3106: Protein of unknown function (DUF3106); InterPro: IPR021455 Some members in this family of proteins are annotated as transmembrane proteins however this cannot be confirmed. Currently no function is known.
Probab=54.79 E-value=53 Score=22.48 Aligned_cols=11 Identities=18% Similarity=0.806 Sum_probs=4.8
Q ss_pred hhcCCChhhhH
Q 033035 94 KWKSLTDAEKA 104 (129)
Q Consensus 94 ~Wk~ls~~eK~ 104 (129)
.|.+||++++.
T Consensus 55 ~W~~LspeqR~ 65 (107)
T PF11304_consen 55 RWAALSPEQRQ 65 (107)
T ss_pred HHHhCCHHHHH
Confidence 34444444443
No 32
>PRK09706 transcriptional repressor DicA; Reviewed
Probab=50.82 E-value=46 Score=23.21 Aligned_cols=42 Identities=19% Similarity=0.179 Sum_probs=36.6
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHHhhh
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAYNKK 127 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y~~k 127 (129)
.-...+-..|+.||++++.............|..-+++|-.+
T Consensus 87 ~~~~~ll~~~~~L~~~~~~~~l~~l~~~~~~~~~~~~~~~~~ 128 (135)
T PRK09706 87 EDQKELLELFDALPESEQDAQLSEMRARVENFNKLFEELLKA 128 (135)
T ss_pred HHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 445778899999999999999999999999999988888654
No 33
>PF06945 DUF1289: Protein of unknown function (DUF1289); InterPro: IPR010710 This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N terminus. The function of this family is unknown.
Probab=50.43 E-value=23 Score=21.09 Aligned_cols=24 Identities=25% Similarity=0.560 Sum_probs=17.2
Q ss_pred HHHHHHHHHHhhcCCChhhhHHHHHHHHH
Q 033035 84 VSAVGKAGGEKWKSLTDAEKAPFEAKAAK 112 (129)
Q Consensus 84 ~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~ 112 (129)
..||.. |..||+++|.........
T Consensus 24 ~dEI~~-----W~~~s~~er~~i~~~l~~ 47 (51)
T PF06945_consen 24 LDEIRD-----WKSMSDDERRAILARLRA 47 (51)
T ss_pred HHHHHH-----HhhCCHHHHHHHHHHHHH
Confidence 456654 999999998877665444
No 34
>PF12881 NUT_N: NUT protein N terminus; InterPro: IPR024309 This domain is found in the N-terminal region of Nuclear Testis (NUT) proteins. It is also found in FAM22, which are a family of uncharacterised mammalian proteins.
Probab=40.49 E-value=88 Score=25.91 Aligned_cols=51 Identities=14% Similarity=0.183 Sum_probs=35.4
Q ss_pred HHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHHH
Q 033035 63 FFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKRK 114 (129)
Q Consensus 63 y~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k 114 (129)
|.-|+.-....+....|.+ ++.|.-...-..|.-.|.=+|-.|.++|++=.
T Consensus 232 lSCFLIpvLrsLar~kPtM-tlEeGl~ra~qEW~~~SnfdRmifyemaekFm 282 (328)
T PF12881_consen 232 LSCFLIPVLRSLARLKPTM-TLEEGLWRAVQEWQHTSNFDRMIFYEMAEKFM 282 (328)
T ss_pred hhhhHHHHHHHHHhcCCCc-cHHHHHHHHHHHhhccccccHHHHHHHHHHHc
Confidence 3333333333444445666 56677777778899999999999999998743
No 35
>PF15581 Imm35: Immunity protein 35
Probab=38.87 E-value=32 Score=23.31 Aligned_cols=23 Identities=9% Similarity=0.280 Sum_probs=18.6
Q ss_pred cHHHHHHHHHHhhcCCChhhhHH
Q 033035 83 AVSAVGKAGGEKWKSLTDAEKAP 105 (129)
Q Consensus 83 ~~~ei~k~l~~~Wk~ls~~eK~~ 105 (129)
++..+...|.+.|+.|+.++-..
T Consensus 31 ~i~~l~~lIe~eWRGl~~~qV~~ 53 (93)
T PF15581_consen 31 TIRNLESLIEHEWRGLPEEQVLY 53 (93)
T ss_pred HHHHHHHHHHHHHcCCCHHHHHH
Confidence 45678889999999999886543
No 36
>PRK12751 cpxP periplasmic stress adaptor protein CpxP; Reviewed
Probab=37.33 E-value=86 Score=23.26 Aligned_cols=33 Identities=18% Similarity=0.272 Sum_probs=25.7
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYE 118 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~ 118 (129)
++.+..-.++..|++++|..|.+..++...+..
T Consensus 118 ~~~~~~~qmy~lLTPEQra~l~~~~e~r~~~~~ 150 (162)
T PRK12751 118 EMAKVRNQMYNLLTPEQKEALNKKHQERIEKLQ 150 (162)
T ss_pred HHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHH
Confidence 345566788899999999999998877666554
No 37
>PRK12750 cpxP periplasmic repressor CpxP; Reviewed
Probab=37.24 E-value=1.1e+02 Score=22.81 Aligned_cols=35 Identities=20% Similarity=0.209 Sum_probs=28.7
Q ss_pred HHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHH
Q 033035 87 VGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLM 121 (129)
Q Consensus 87 i~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~ 121 (129)
+.+..-+.+.-|++++|..|.+...+-.+.|...+
T Consensus 126 ~~~~~~~~~~vLTpEQRak~~e~~~~r~~~~~~~~ 160 (170)
T PRK12750 126 MLEKRHQMLSILTPEQKAKFQELQQERMQECQDKM 160 (170)
T ss_pred HHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHHH
Confidence 34455567999999999999999888888887766
No 38
>cd07081 ALDH_F20_ACDH_EutE-like Coenzyme A acylating aldehyde dehydrogenase (ACDH), Ethanolamine utilization protein EutE, and related proteins. Coenzyme A acylating aldehyde dehydrogenase (ACDH), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, acetylating (EC=1.2.1.10), functions as a single enzyme (such as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium) or as part of a multifunctional enzyme to convert acetaldehyde into acetyl-CoA. The E. coli aldehyde-alcohol dehydrogenase includes the functional domains, alcohol dehydrogenase (ADH), ACDH, and pyruvate-formate-lyase deactivase; and the Entamoeba histolytica aldehyde-alcohol dehydrogenase 2 (ALDH20A1) includes the functional domains ADH and ACDH, and may be critical enzymes in the fermentative pathway.
Probab=36.64 E-value=93 Score=26.42 Aligned_cols=39 Identities=13% Similarity=-0.014 Sum_probs=32.7
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAY 124 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y 124 (129)
+.++.....|+.+|..+|..+........+++..++...
T Consensus 6 ~~A~~A~~~W~~~~~~~R~~iL~~~a~~l~~~~~ela~~ 44 (439)
T cd07081 6 AAAKVAQQGLSCKSQEMVDLIFRAAAEAAEDARIDLAKL 44 (439)
T ss_pred HHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 455666778999999999999999999888888887654
No 39
>PF00887 ACBP: Acyl CoA binding protein; InterPro: IPR000582 Acyl-CoA-binding protein (ACBP) is a small (10 Kd) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters []. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor []. ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species []. Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal enoyl-CoA Delta(3), Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats (IPR002110 from INTERPRO) []. The ACB domain consists of four alpha-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein []. Other proteins containing an ACB domain include: Endozepine-like peptide (ELP) (gene DBIL5) from mouse []. ELP is a testis-specific ACBP homologue that may be involved in the energy metabolism of the mature sperm. MA-DBI, a transmembrane protein of unknown function which has been found in mammals. MA-DBI contains a N-terminal ACB domain. DRS-1 [], a human protein of unknown function that contains a N-terminal ACB domain and a C-terminal enoyl-CoA isomerase/hydratase domain. ; GO: 0000062 fatty-acyl-CoA binding; PDB: 2CB8_A 2FJ9_A 2LBB_A 1ST7_A 3EPY_B 2FDQ_C 1NTI_A 1HB8_A 1ACA_A 1NVL_A ....
Probab=34.17 E-value=1e+02 Score=19.90 Aligned_cols=53 Identities=13% Similarity=0.257 Sum_probs=32.0
Q ss_pred HHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCC----hhhhHHHHHHHHHHHHHH
Q 033035 63 FFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLT----DAEKAPFEAKAAKRKLDY 117 (129)
Q Consensus 63 y~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls----~~eK~~Y~~~a~~~k~~Y 117 (129)
|-||.+.....+....|+.-+ -+.+.--.-|+.+. ++-+..|.+...+....|
T Consensus 30 YalyKQAt~Gd~~~~~P~~~d--~~~~~K~~AW~~l~gms~~eA~~~Yi~~v~~~~~~~ 86 (87)
T PF00887_consen 30 YALYKQATHGDCDTPRPGFFD--IEGRAKWDAWKALKGMSKEEAMREYIELVEELIPKY 86 (87)
T ss_dssp HHHHHHHHTSS--S-CTTTTC--HHHHHHHHHHHTTTTTHHHHHHHHHHHHHHHHHHHH
T ss_pred HHHHHHHHhCCCcCCCCcchh--HHHHHHHHHHHHccCCCHHHHHHHHHHHHHHHHHhc
Confidence 777777776666666676633 34445556787776 355667777777766655
No 40
>cd07133 ALDH_CALDH_CalB Coniferyl aldehyde dehydrogenase-like. Coniferyl aldehyde dehydrogenase (CALDH, EC=1.2.1.68) of Pseudomonas sp. strain HR199 (CalB) which catalyzes the NAD+-dependent oxidation of coniferyl aldehyde to ferulic acid, and similar sequences, are present in this CD.
Probab=33.40 E-value=1.2e+02 Score=25.40 Aligned_cols=40 Identities=13% Similarity=-0.092 Sum_probs=32.6
Q ss_pred HHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHH
Q 033035 85 SAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAY 124 (129)
Q Consensus 85 ~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y 124 (129)
.+.++.....|+.++..+|..+........+.+..++...
T Consensus 4 ~~~a~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~el~~~ 43 (434)
T cd07133 4 LERQKAAFLANPPPSLEERRDRLDRLKALLLDNQDALAEA 43 (434)
T ss_pred HHHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 4556666778999999999999998888888888777654
No 41
>PRK10363 cpxP periplasmic repressor CpxP; Reviewed
Probab=31.31 E-value=1.4e+02 Score=22.43 Aligned_cols=34 Identities=18% Similarity=0.279 Sum_probs=28.5
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEK 119 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~ 119 (129)
++.++--.++.-|+|++|..|.+..+....++..
T Consensus 112 em~k~~nqmy~lLTPEQKaq~~~~~~~rm~~~~~ 145 (166)
T PRK10363 112 EMAKVRNQMYRLLTPEQQAVLNEKHQQRMEQLRD 145 (166)
T ss_pred HHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHH
Confidence 5677778899999999999999988887777744
No 42
>PF12650 DUF3784: Domain of unknown function (DUF3784); InterPro: IPR017259 This group represents an uncharacterised conserved protein.
Probab=30.52 E-value=37 Score=22.43 Aligned_cols=16 Identities=19% Similarity=0.561 Sum_probs=13.6
Q ss_pred hhcCCChhhhHHHHHH
Q 033035 94 KWKSLTDAEKAPFEAK 109 (129)
Q Consensus 94 ~Wk~ls~~eK~~Y~~~ 109 (129)
-|+.||++||+.|...
T Consensus 25 Gyntms~eEk~~~D~~ 40 (97)
T PF12650_consen 25 GYNTMSKEEKEKYDKK 40 (97)
T ss_pred hcccCCHHHHHHhhHH
Confidence 4899999999999753
No 43
>KOG1610 consensus Corticosteroid 11-beta-dehydrogenase and related short chain-type dehydrogenases [Secondary metabolites biosynthesis, transport and catabolism; General function prediction only]
Probab=30.15 E-value=1.6e+02 Score=24.46 Aligned_cols=48 Identities=19% Similarity=0.329 Sum_probs=35.3
Q ss_pred HHHHHHHHHHHhC-------C-----CCccHHHHHHHHHHhhcCCChhhhHHHHHHHHHH
Q 033035 66 FLEEFRKVYKQEH-------P-----NVKAVSAVGKAGGEKWKSLTDAEKAPFEAKAAKR 113 (129)
Q Consensus 66 F~~e~r~~~k~e~-------p-----~~~~~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~ 113 (129)
|+...|.++..=. | ++-+...+.+.+.+.|..||++.++.|-+.+..+
T Consensus 188 f~D~lR~EL~~fGV~VsiiePG~f~T~l~~~~~~~~~~~~~w~~l~~e~k~~YGedy~~~ 247 (322)
T KOG1610|consen 188 FSDSLRRELRPFGVKVSIIEPGFFKTNLANPEKLEKRMKEIWERLPQETKDEYGEDYFED 247 (322)
T ss_pred HHHHHHHHHHhcCcEEEEeccCccccccCChHHHHHHHHHHHhcCCHHHHHHHHHHHHHH
Confidence 7777777765221 2 2223467889999999999999999998766654
No 44
>cd07132 ALDH_F3AB Aldehyde dehydrogenase family 3 members A1, A2, and B1 and related proteins. NAD(P)+-dependent, aldehyde dehydrogenase, family 3 members A1 and B1 (ALDH3A1, ALDH3B1, EC=1.2.1.5) and fatty aldehyde dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3), and similar sequences are included in this CD. Human ALDH3A1 is a homodimer with a critical role in cellular defense against oxidative stress; it catalyzes the oxidation of various cellular membrane lipid-derived aldehydes. Corneal crystalline ALDH3A1 protects the cornea and underlying lens against UV-induced oxidative stress. Human ALDH3A2, a microsomal homodimer, catalyzes the oxidation of long-chain aliphatic aldehydes to fatty acids. Human ALDH3B1 is highly expressed in the kidney and liver and catalyzes the oxidation of various medium- and long-chain saturated and unsaturated aliphatic aldehydes.
Probab=29.55 E-value=1.4e+02 Score=25.13 Aligned_cols=39 Identities=5% Similarity=-0.173 Sum_probs=32.0
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAY 124 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y 124 (129)
+.++.....|+.++..+|..+........+.+..++..-
T Consensus 5 ~~A~~A~~~w~~~~~~~R~~~L~~~a~~l~~~~~~l~~~ 43 (443)
T cd07132 5 RRAREAFSSGKTRPLEFRIQQLEALLRMLEENEDEIVEA 43 (443)
T ss_pred HHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHhHHHHHHH
Confidence 456666778999999999999998888888888777653
No 45
>cd07122 ALDH_F20_ACDH Coenzyme A acylating aldehyde dehydrogenase (ACDH), ALDH family 20-like. Coenzyme A acylating aldehyde dehydrogenase (ACDH, EC=1.2.1.10), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, functions as a single enzyme (such as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium) or as part of a multifunctional enzyme to convert acetaldehyde into acetyl-CoA. The E. coli aldehyde-alcohol dehydrogenase includes the functional domains, alcohol dehydrogenase (ADH), ACDH, and pyruvate-formate-lyase deactivase; and the Entamoeba histolytica aldehyde-alcohol dehydrogenase 2 (ALDH20A1) includes the functional domains ADH and ACDH and may be critical enzymes in the fermentative pathway.
Probab=28.23 E-value=1.5e+02 Score=25.15 Aligned_cols=39 Identities=5% Similarity=0.082 Sum_probs=32.0
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAY 124 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y 124 (129)
+.++.....|+.+|.++|..+...+....+++..++...
T Consensus 6 ~~A~~A~~~W~~~~~~eR~~~L~~~a~~l~~~~eela~~ 44 (436)
T cd07122 6 ERARKAQREFATFSQEQVDKIVEAVAWAAADAAEELAKM 44 (436)
T ss_pred HHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 455666678999999999999998888888888877654
No 46
>PRK10236 hypothetical protein; Provisional
Probab=28.00 E-value=60 Score=25.73 Aligned_cols=26 Identities=15% Similarity=0.311 Sum_probs=21.4
Q ss_pred HHHHHHHhhcCCChhhhHHHHHHHHH
Q 033035 87 VGKAGGEKWKSLTDAEKAPFEAKAAK 112 (129)
Q Consensus 87 i~k~l~~~Wk~ls~~eK~~Y~~~a~~ 112 (129)
+.+++...|..||++|++.+.+.-..
T Consensus 118 l~kll~~a~~kms~eE~~~L~~~l~~ 143 (237)
T PRK10236 118 LEQFLRNTWKKMDEEHKQEFLHAVDA 143 (237)
T ss_pred HHHHHHHHHHHCCHHHHHHHHHHHhh
Confidence 57889999999999999888764433
No 47
>cd07087 ALDH_F3-13-14_CALDH-like ALDH subfamily: Coniferyl aldehyde dehydrogenase, ALDH families 3, 13, and 14, and other related proteins. ALDH subfamily which includes NAD(P)+-dependent, aldehyde dehydrogenase, family 3 member A1 and B1 (ALDH3A1, ALDH3B1, EC=1.2.1.5) and fatty aldehyde dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3), and also plant ALDH family members ALDH3F1, ALDH3H1, and ALDH3I1, fungal ALDH14 (YMR110C) and the protozoan family 13 member (ALDH13), as well as coniferyl aldehyde dehydrogenases (CALDH, EC=1.2.1.68), and other similar sequences, such as the Pseudomonas putida benzaldehyde dehydrogenase I that is involved in the metabolism of mandelate.
Probab=27.50 E-value=1.8e+02 Score=24.27 Aligned_cols=39 Identities=10% Similarity=-0.164 Sum_probs=31.4
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAY 124 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y 124 (129)
+.++..-..|+.++..+|..+........+++..++..-
T Consensus 5 ~~a~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~el~~~ 43 (426)
T cd07087 5 ARLRETFLTGKTRSLEWRKAQLKALKRMLTENEEEIAAA 43 (426)
T ss_pred HHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHhHHHHHHH
Confidence 455666678999999999999998888888887777643
No 48
>cd07136 ALDH_YwdH-P39616 Bacillus subtilis aldehyde dehydrogenase ywdH-like. Uncharacterized Bacillus subtilis ywdH aldehyde dehydrogenase (locus P39616) most closely related to the ALDHs and fatty ALDHs of families 3 and 14, and similar sequences, are included in this CD.
Probab=26.53 E-value=1.8e+02 Score=24.64 Aligned_cols=41 Identities=10% Similarity=-0.118 Sum_probs=33.7
Q ss_pred HHHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHH
Q 033035 84 VSAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAY 124 (129)
Q Consensus 84 ~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y 124 (129)
+.+.++..-..|..++..+|..+...+......+..++...
T Consensus 3 ~v~~a~~a~~~W~~~~~~~R~~~L~~~a~~l~~~~~ela~~ 43 (449)
T cd07136 3 LVEKQRAFFKTGATKDVEFRIEQLKKLKQAIKKYENEILEA 43 (449)
T ss_pred HHHHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHhHHHHHHH
Confidence 34666777788999999999999998888888888877654
No 49
>PF05388 Carbpep_Y_N: Carboxypeptidase Y pro-peptide; InterPro: IPR008442 This signature is found at the N terminus of carboxypeptidase Y, which belong to MEROPS peptidase family S10. This region contains the signal peptide and pro-peptide regions [,].; GO: 0004185 serine-type carboxypeptidase activity, 0005773 vacuole
Probab=24.53 E-value=1.1e+02 Score=21.26 Aligned_cols=29 Identities=28% Similarity=0.243 Sum_probs=25.1
Q ss_pred HHHHHHHHHHhhcCCChhhhHHHHHHHHH
Q 033035 84 VSAVGKAGGEKWKSLTDAEKAPFEAKAAK 112 (129)
Q Consensus 84 ~~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~ 112 (129)
+.-+++.+++.++.|+.+-|..|.++...
T Consensus 45 ~~~~~~~l~e~l~~Lt~e~k~~W~E~~~~ 73 (113)
T PF05388_consen 45 LEKISKYLNEPLKSLTSEAKALWDEMMLL 73 (113)
T ss_pred HHHHHHHHHHHHhhccHHHHHHHHHHHHH
Confidence 55678889999999999999999998754
No 50
>cd07085 ALDH_F6_MMSDH Methylmalonate semialdehyde dehydrogenase and ALDH family members 6A1 and 6B2. Methylmalonate semialdehyde dehydrogenase (MMSDH, EC=1.2.1.27) [acylating] from Bacillus subtilis is involved in valine metabolism and catalyses the NAD+- and CoA-dependent oxidation of methylmalonate semialdehyde into propionyl-CoA. Mitochondrial human MMSDH ALDH6A1 and Arabidopsis MMSDH ALDH6B2 are also present in this CD.
Probab=24.32 E-value=2e+02 Score=24.36 Aligned_cols=38 Identities=13% Similarity=0.056 Sum_probs=30.0
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTA 123 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~ 123 (129)
+.++.....|+.+|.++|..+...+......+..++..
T Consensus 45 ~~A~~A~~~w~~~~~~~R~~~L~~~a~~l~~~~~el~~ 82 (478)
T cd07085 45 AAAKAAFPAWSATPVLKRQQVMFKFRQLLEENLDELAR 82 (478)
T ss_pred HHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 34555567899999999999998888888777766654
No 51
>TIGR00787 dctP tripartite ATP-independent periplasmic transporter solute receptor, DctP family. TRAP-T (Tripartite ATP-independent Periplasmic Transporter) family proteins generally consist of three components, and these systems have so far been found in Gram-negative bacteria, Gram-positive bacteria and archaea. The best characterized example is the DctPQM system of Rhodobacter capsulatus, a C4 dicarboxylate (malate, fumarate, succinate) transporter. This model represents the DctP family, one of at least three major families of extracytoplasmic solute receptors for TRAP family transporters. Others are the SnoM family (see pfam03480) and the TAXI (TRAP-associated extracytoplasmic immunogenic) family.
Probab=23.71 E-value=1.8e+02 Score=22.26 Aligned_cols=27 Identities=22% Similarity=0.164 Sum_probs=20.6
Q ss_pred HHhhcCCChhhhHHHHHHHHHHHHHHH
Q 033035 92 GEKWKSLTDAEKAPFEAKAAKRKLDYE 118 (129)
Q Consensus 92 ~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~ 118 (129)
...|..||++.+....+.+.+.-..+.
T Consensus 213 ~~~~~~L~~e~q~~i~~a~~~~~~~~~ 239 (257)
T TIGR00787 213 KAFWKSLPPDLQAVVKEAAKEAGEYQR 239 (257)
T ss_pred HHHHhcCCHHHHHHHHHHHHHHHHHHH
Confidence 467999999999999887766544433
No 52
>cd07150 ALDH_VaniDH_like Pseudomonas putida vanillin dehydrogenase-like. Vanillin dehydrogenase (Vdh, VaniDH) involved in the metabolism of ferulic acid and other related sequences are included in this CD. The E. coli vanillin dehydrogenase (LigV) preferred NAD+ to NADP+ and exhibited a broad substrate preference, including vanillin, benzaldehyde, protocatechualdehyde, m-anisaldehyde, and p-hydroxybenzaldehyde.
Probab=23.47 E-value=2e+02 Score=23.95 Aligned_cols=38 Identities=18% Similarity=0.114 Sum_probs=30.2
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTA 123 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~ 123 (129)
+.++.....|+.++..+|..+..........+..++..
T Consensus 28 ~~A~~A~~~w~~~~~~~R~~~L~~~a~~l~~~~~ela~ 65 (451)
T cd07150 28 AAAYDAFPAWAATTPSERERILLKAAEIMERRADDLID 65 (451)
T ss_pred HHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHhHHHHHH
Confidence 44555667899999999999998888887777776654
No 53
>cd08317 Death_ank Death domain associated with Ankyrins. Death Domain (DD) associated with Ankyrins. Ankyrins are modular proteins comprising three conserved domains, an N-terminal membrane-binding domain containing ANK repeats, a spectrin-binding domain and a C-terminal DD. Ankyrins function as adaptor proteins and they interact, through ANK repeats, with structurally diverse membrane proteins, including ion channels/pumps, calcium release channels, and cell adhesion molecules. They play critical roles in the proper expression and membrane localization of these proteins. In mammals, this family includes ankyrin-R for restricted (or ANK1), ankyrin-B for broadly expressed (or ANK2) and ankyrin-G for general or giant (or ANK3). They are expressed in different combinations in many tissues and play non-overlapping functions. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-associati
Probab=23.09 E-value=41 Score=21.67 Aligned_cols=21 Identities=14% Similarity=0.504 Sum_probs=16.3
Q ss_pred CCCccHHHHHHHHHHhhcCCCh
Q 033035 79 PNVKAVSAVGKAGGEKWKSLTD 100 (129)
Q Consensus 79 p~~~~~~ei~k~l~~~Wk~ls~ 100 (129)
+++ .+..|+..||..|..|-.
T Consensus 3 ~~~-~l~~ia~~lG~dW~~LAr 23 (84)
T cd08317 3 ADI-RLADISNLLGSDWPQLAR 23 (84)
T ss_pred ccc-hHHHHHHHHhhHHHHHHH
Confidence 344 678999999999987654
No 54
>PRK00197 proA gamma-glutamyl phosphate reductase; Provisional
Probab=22.98 E-value=2e+02 Score=24.01 Aligned_cols=39 Identities=21% Similarity=0.046 Sum_probs=31.9
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAY 124 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y 124 (129)
+.++.....|..+|..+|..+........+++..++...
T Consensus 11 ~~A~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~~la~~ 49 (417)
T PRK00197 11 RRAKAASRKLAQLSTAQKNRALLAIADALEANAAEILAA 49 (417)
T ss_pred HHHHHHHHHHHhCCHHHHHHHHHHHHHHHHhhHHHHHHH
Confidence 455666778999999999999998888888888777654
No 55
>cd07077 ALDH-like NAD(P)+-dependent aldehyde dehydrogenase-like (ALDH-like) family. The aldehyde dehydrogenase-like (ALDH-like) group of the ALDH superfamily of NAD(P)+-dependent enzymes which, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. This group includes families ALDH18, ALDH19, and ALDH20 and represents such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group.
Probab=22.86 E-value=1.6e+02 Score=24.22 Aligned_cols=36 Identities=11% Similarity=0.084 Sum_probs=27.9
Q ss_pred HHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHH
Q 033035 88 GKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTA 123 (129)
Q Consensus 88 ~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~ 123 (129)
++.....|..+|..+|..+........+++..++..
T Consensus 3 A~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~~la~ 38 (397)
T cd07077 3 AKNAQRTLAVNHDEQRDLIINAIANALYDTRQRLAS 38 (397)
T ss_pred HHHHHHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 345567799999999999888888877777766654
No 56
>COG4281 ACB Acyl-CoA-binding protein [Lipid metabolism]
Probab=22.45 E-value=98 Score=20.55 Aligned_cols=60 Identities=20% Similarity=0.361 Sum_probs=37.6
Q ss_pred CCCCCCh-----HHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCC----hhhhHHHHHHHHHHHHHH
Q 033035 56 PKRPPSA-----FFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLT----DAEKAPFEAKAAKRKLDY 117 (129)
Q Consensus 56 PKRP~sa-----y~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls----~~eK~~Y~~~a~~~k~~Y 117 (129)
|.+|-|- |.||-+.--.....+.|++- .-+.+---+-|..|- +.-++.|.....+++..|
T Consensus 17 ~~kP~~d~LLkLYAL~KQ~s~GD~~~ekPG~~--d~~gr~K~eAW~~LKGksqedA~qeYialVeeLkak~ 85 (87)
T COG4281 17 SEKPSNDELLKLYALFKQGSVGDNDGEKPGFF--DIVGRYKYEAWAGLKGKSQEDARQEYIALVEELKAKY 85 (87)
T ss_pred ccCCCcHHHHHHHHHHHhccccccCCCCCCcc--ccccchhHHHHhhccCccHHHHHHHHHHHHHHHHhhc
Confidence 4456554 66676655555555567773 334555557786664 455678888888877765
No 57
>PF14399 Transpep_BrtH: NlpC/p60-like transpeptidase
Probab=22.32 E-value=3.1e+02 Score=21.33 Aligned_cols=48 Identities=15% Similarity=0.151 Sum_probs=31.0
Q ss_pred CCCCccHHHHHHHHHHhhcCCChhhh--------HHHHHHHHHHHHHHHHHHHHHh
Q 033035 78 HPNVKAVSAVGKAGGEKWKSLTDAEK--------APFEAKAAKRKLDYEKLMTAYN 125 (129)
Q Consensus 78 ~p~~~~~~ei~k~l~~~Wk~ls~~eK--------~~Y~~~a~~~k~~Y~~e~~~Y~ 125 (129)
+|.+....++...++..|..+...-- ..+...+.....-...|-..|.
T Consensus 259 ~~~~~~~~~~~~~i~~~W~~~~~~~~k~~~~~~~~~~~~i~~~l~~i~~~E~~~~~ 314 (317)
T PF14399_consen 259 NPELAEAAELFEEIAQLWRQLANLLVKASLSKSPDDLEEIADILEKIAELEEELYE 314 (317)
T ss_pred ChhhHHHHHHHHHHHHHHHHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHH
Confidence 45554567889999999988764322 4566666666665655555443
No 58
>PRK13252 betaine aldehyde dehydrogenase; Provisional
Probab=22.29 E-value=2.5e+02 Score=23.90 Aligned_cols=38 Identities=21% Similarity=0.328 Sum_probs=30.6
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTA 123 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~ 123 (129)
+.++.....|..+|..+|..+..........+..++..
T Consensus 51 ~~A~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~ela~ 88 (488)
T PRK13252 51 ASAKQGQKIWAAMTAMERSRILRRAVDILRERNDELAA 88 (488)
T ss_pred HHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHhHHHHHH
Confidence 44566677899999999999998888887778777654
No 59
>cd07152 ALDH_BenzADH NAD-dependent benzaldehyde dehydrogenase II-like. NAD-dependent, benzaldehyde dehydrogenase II (XylC, BenzADH, EC=1.2.1.28) is involved in the oxidation of benzyl alcohol to benzoate. In Acinetobacter calcoaceticus, this process is carried out by the chromosomally encoded, benzyl alcohol dehydrogenase (xylB) and benzaldehyde dehydrogenase II (xylC) enzymes; whereas in Pseudomonas putida they are encoded by TOL plasmids.
Probab=22.15 E-value=2.3e+02 Score=23.61 Aligned_cols=38 Identities=18% Similarity=0.198 Sum_probs=30.3
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTA 123 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~ 123 (129)
+.++.....|+.+|..+|..+..........+..++..
T Consensus 20 ~~A~~A~~~w~~~~~~~R~~~L~~~a~~l~~~~~ela~ 57 (443)
T cd07152 20 ARAAAAQRAWAATPPRERAAVLRRAADLLEEHADEIAD 57 (443)
T ss_pred HHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHhHHHHHH
Confidence 44556667899999999999998888877777777654
No 60
>cd07137 ALDH_F3FHI Plant aldehyde dehydrogenase family 3 members F1, H1, and I1 and related proteins. Aldehyde dehydrogenase family members 3F1, 3H1, and 3I1 (ALDH3F1, ALDH3H1, and ALDH3I1), and similar plant sequences, are in this CD. In Arabidopsis thaliana, stress-regulated expression of ALDH3I1 was observed in leaves and osmotic stress expression of ALDH3H1 was observed in root tissue, whereas, ALDH3F1 expression was not stress responsive. Functional analysis of ALDH3I1 suggest it may be involved in a detoxification pathway in plants that limits aldehyde accumulation and oxidative stress.
Probab=22.14 E-value=2.4e+02 Score=23.71 Aligned_cols=40 Identities=5% Similarity=-0.213 Sum_probs=32.5
Q ss_pred HHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHH
Q 033035 85 SAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAY 124 (129)
Q Consensus 85 ~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y 124 (129)
.+.++..-..|+.++..+|..+...+.....++..++..-
T Consensus 5 ~~~a~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~~l~~~ 44 (432)
T cd07137 5 VRELRETFRSGRTRSAEWRKSQLKGLLRLVDENEDDIFAA 44 (432)
T ss_pred HHHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 3556677778999999999999998888888888777643
No 61
>cd07129 ALDH_KGSADH Alpha-Ketoglutaric Semialdehyde Dehydrogenase. Alpha-Ketoglutaric Semialdehyde (KGSA) Dehydrogenase (KGSADH, EC 1.2.1.26) catalyzes the NAD(P)+-dependent conversion of KGSA to alpha-ketoglutarate. This CD contains such sequences as those seen in Azospirillum brasilense, KGSADH-II (D-glucarate/D-galactarate-inducible) and KGSADH-III (hydroxy-L-proline-inducible). Both show similar high substrate specificity for KGSA and different coenzyme specificity; KGSADH-II is NAD+-dependent and KGSADH-III is NADP+-dependent. Also included in this CD is the NADP(+)-dependent aldehyde dehydrogenase from Vibrio harveyi which catalyzes the oxidation of long-chain aliphatic aldehydes to acids.
Probab=21.79 E-value=2.2e+02 Score=23.98 Aligned_cols=38 Identities=24% Similarity=0.248 Sum_probs=30.1
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTA 123 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~ 123 (129)
+.++.....|+.++..+|..+..........+..++..
T Consensus 6 ~~A~~A~~~w~~~~~~~R~~~L~~~a~~l~~~~~ela~ 43 (454)
T cd07129 6 AAAAAAFESYRALSPARRAAFLEAIADEIEALGDELVA 43 (454)
T ss_pred HHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHhHHHHHH
Confidence 44556667899999999999998888888777776654
No 62
>cd07084 ALDH_KGSADH-like ALDH subfamily: NAD(P)+-dependent alpha-ketoglutaric semialdehyde dehydrogenases and plant delta(1)-pyrroline-5-carboxylate dehydrogenase, ALDH family 12-like. ALDH subfamily which includes the NAD(P)+-dependent, alpha-ketoglutaric semialdehyde dehydrogenases (KGSADH, EC 1.2.1.26); plant delta(1)-pyrroline-5-carboxylate dehydrogenase (P5CDH, EC=1.5.1.12 ), ALDH family 12; the N-terminal domain of the MaoC (monoamine oxidase C) dehydratase regulatory protein; and orthologs of MaoC, PaaZ and PaaN, which are putative ring-opening enzymes of the aerobic phenylacetic acid catabolic pathway.
Probab=21.62 E-value=2.1e+02 Score=24.05 Aligned_cols=39 Identities=15% Similarity=0.090 Sum_probs=31.8
Q ss_pred HHHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHH
Q 033035 85 SAVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTA 123 (129)
Q Consensus 85 ~ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~ 123 (129)
.+.++.....|+.++..+|......+......+..++..
T Consensus 5 v~~A~~A~~~W~~~~~~~R~~~L~~~a~~l~~~~~ela~ 43 (442)
T cd07084 5 LLAADISTKAARRLALPKRADFLARIIQRLAAKSYDIAA 43 (442)
T ss_pred HHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHhhHHHHHH
Confidence 355666778899999999999998888888888777654
No 63
>PRK13968 putative succinate semialdehyde dehydrogenase; Provisional
Probab=21.52 E-value=2.4e+02 Score=23.90 Aligned_cols=39 Identities=15% Similarity=0.118 Sum_probs=31.4
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAY 124 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y 124 (129)
+.++..-..|+.++..+|..+..........+..++...
T Consensus 36 ~~A~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~ela~~ 74 (462)
T PRK13968 36 QLAAAGFRDWRETNIDYRAQKLRDIGKALRARSEEMAQM 74 (462)
T ss_pred HHHHHHHHHHhcCCHHHHHHHHHHHHHHHHhHHHHHHHH
Confidence 445556678999999999999988888888888777653
No 64
>KOG1827 consensus Chromatin remodeling complex RSC, subunit RSC1/Polybromo and related proteins [Chromatin structure and dynamics; Transcription]
Probab=21.51 E-value=6.7 Score=35.12 Aligned_cols=44 Identities=25% Similarity=0.425 Sum_probs=39.2
Q ss_pred CCChHHHHHHHHHHHHHHhCCCCccHHHHHHHHHHhhcCCChhhh
Q 033035 59 PPSAFFVFLEEFRKVYKQEHPNVKAVSAVGKAGGEKWKSLTDAEK 103 (129)
Q Consensus 59 P~say~lF~~e~r~~~k~e~p~~~~~~ei~k~l~~~Wk~ls~~eK 103 (129)
-+++|++|+.+.+..+-..+|+. .+++++.+.|..|..|+..-+
T Consensus 552 ~~~~~~~~s~~~~~~~~~~np~v-~~~~~~~~vg~~~~~lp~~~k 595 (629)
T KOG1827|consen 552 SPEPYILDSIENRTIIWFENPTV-GFGEVSIIVGNDWDKLPNINK 595 (629)
T ss_pred CCccccccccccCceeeeeCCCc-ccceeEEeecCCcccCccccc
Confidence 45889999999999999999999 789999999999999995444
No 65
>cd07098 ALDH_F15-22 Aldehyde dehydrogenase family 15A1 and 22A1-like. Aldehyde dehydrogenase family members ALDH15A1 (Saccharomyces cerevisiae YHR039C) and ALDH22A1 (Arabidopsis thaliana, EC=1.2.1.3), and similar sequences, are in this CD. Significant improvement of stress tolerance in tobacco plants was observed by overexpressing the ALDH22A1 gene from maize (Zea mays) and was accompanied by a reduction of malondialdehyde derived from cellular lipid peroxidation.
Probab=20.99 E-value=2.7e+02 Score=23.38 Aligned_cols=39 Identities=18% Similarity=0.221 Sum_probs=30.7
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAY 124 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y 124 (129)
+.++.....|..+|.++|..+.........++..++...
T Consensus 25 ~~A~~A~~~w~~~~~~~R~~~L~~~a~~l~~~~~ela~~ 63 (465)
T cd07098 25 AAARAAQREWAKTSFAERRKVLRSLLKYILENQEEICRV 63 (465)
T ss_pred HHHHHHHHHhccCCHHHHHHHHHHHHHHHHHhHHHHHHH
Confidence 445556678999999999999988888887887776543
No 66
>PRK11241 gabD succinate-semialdehyde dehydrogenase I; Provisional
Probab=20.82 E-value=2.4e+02 Score=24.10 Aligned_cols=37 Identities=14% Similarity=0.214 Sum_probs=29.5
Q ss_pred HHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHH
Q 033035 87 VGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTA 123 (129)
Q Consensus 87 i~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~ 123 (129)
-++.....|+.++..+|..+..........+..++..
T Consensus 56 ~A~~a~~~W~~~~~~~R~~~L~~~a~~l~~~~~ela~ 92 (482)
T PRK11241 56 AANRALPAWRALTAKERANILRRWFNLMMEHQDDLAR 92 (482)
T ss_pred HHHHHHHHHhcCCHHHHHHHHHHHHHHHHHhHHHHHH
Confidence 3445556799999999999998888888888877654
No 67
>TIGR01780 SSADH succinate-semialdehyde dehydrogenase. SSADH enzyme belongs to the aldehyde dehydrogenase family (pfam00171), sharing a common evolutionary origin and enzymatic mechanism with lactaldehyde dehydrogenase. Like in lactaldehyde dehydrogenase and succinate semialdehyde dehydrogenase, the mammalian catalytic glutamic acid and cysteine residues are conserved in all the enzymes of this family (PS00687, PS00070).
Probab=20.49 E-value=2.9e+02 Score=23.16 Aligned_cols=38 Identities=13% Similarity=0.168 Sum_probs=30.5
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTA 123 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~ 123 (129)
+.++.....|..++.++|..+...+......+..++..
T Consensus 26 ~~A~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~~la~ 63 (448)
T TIGR01780 26 RAAYEAFKTWKNTTAKERSSLLRKWYNLMMENKDDLAR 63 (448)
T ss_pred HHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHhHHHHHH
Confidence 45566678899999999999998888887777777644
No 68
>cd07099 ALDH_DDALDH Methylomonas sp. 4,4'-diapolycopene-dialdehyde dehydrogenase-like. The 4,4'-diapolycopene-dialdehyde dehydrogenase (DDALDH) involved in C30 carotenoid synthesis in Methylomonas sp. strain 16a and other similar sequences are present in this CD. DDALDH converts 4,4'-diapolycopene-dialdehyde into 4,4'-diapolycopene-diacid.
Probab=20.34 E-value=2.7e+02 Score=23.22 Aligned_cols=38 Identities=16% Similarity=0.163 Sum_probs=29.8
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTA 123 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~ 123 (129)
+.++.....|+.++..+|..+...+......+..++..
T Consensus 25 ~~a~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~~la~ 62 (453)
T cd07099 25 ARARAAQRAWAALGVEGRAQRLLRWKRALADHADELAE 62 (453)
T ss_pred HHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 44555667899999999999998888877777766654
No 69
>COG1638 DctP TRAP-type C4-dicarboxylate transport system, periplasmic component [Carbohydrate transport and metabolism]
Probab=20.28 E-value=2.2e+02 Score=23.29 Aligned_cols=36 Identities=17% Similarity=0.291 Sum_probs=27.1
Q ss_pred HHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHHh
Q 033035 90 AGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAYN 125 (129)
Q Consensus 90 ~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y~ 125 (129)
+-...|..||++.+....+.+.+..........+++
T Consensus 242 ~s~~~w~~L~~e~q~il~~aa~e~~~~~~~~~~~~e 277 (332)
T COG1638 242 VSKAFWDSLPEEDQTILLEAAKEAAEEQRKLVEELE 277 (332)
T ss_pred EcHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 445779999999999999988887665555554444
No 70
>PF13945 NST1: Salt tolerance down-regulator
Probab=20.28 E-value=2.4e+02 Score=21.57 Aligned_cols=27 Identities=19% Similarity=0.273 Sum_probs=21.0
Q ss_pred HHHHHHHHHhhcCCChhhhHHHHHHHH
Q 033035 85 SAVGKAGGEKWKSLTDAEKAPFEAKAA 111 (129)
Q Consensus 85 ~ei~k~l~~~Wk~ls~~eK~~Y~~~a~ 111 (129)
.+....|-+.|-+|+++||...+..-+
T Consensus 101 ~eEre~LkeFW~SL~eeERr~LVkIEK 127 (190)
T PF13945_consen 101 QEEREKLKEFWESLSEEERRSLVKIEK 127 (190)
T ss_pred HHHHHHHHHHHHccCHHHHHHHHHhhH
Confidence 456678999999999999987665433
No 71
>cd07108 ALDH_MGR_2402 Magnetospirillum NAD(P)+-dependent aldehyde dehydrogenase MSR-1-like. NAD(P)+-dependent aldehyde dehydrogenase of Magnetospirillum gryphiswaldense MSR-1 (MGR_2402) , and other similar sequences, are present in this CD.
Probab=20.26 E-value=2.6e+02 Score=23.39 Aligned_cols=39 Identities=18% Similarity=0.205 Sum_probs=31.0
Q ss_pred HHHHHHHHhhcCCChhhhHHHHHHHHHHHHHHHHHHHHH
Q 033035 86 AVGKAGGEKWKSLTDAEKAPFEAKAAKRKLDYEKLMTAY 124 (129)
Q Consensus 86 ei~k~l~~~Wk~ls~~eK~~Y~~~a~~~k~~Y~~e~~~Y 124 (129)
+-++.....|+.+|.++|..+..........+..++..-
T Consensus 26 ~~a~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~ela~~ 64 (457)
T cd07108 26 AAAKAAFPEWAATPARERGKLLARIADALEARSEELARL 64 (457)
T ss_pred HHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHhHHHHHHH
Confidence 445566778999999999999988888888887777543
Done!