Query 032101
Match_columns 147
No_of_seqs 162 out of 1153
Neff 6.7
Searched_HMMs 46136
Date Fri Mar 29 09:36:20 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/032101.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/032101hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PTZ00199 high mobility group p 99.9 2.4E-25 5.2E-30 155.3 11.4 86 39-124 7-93 (94)
2 cd01389 MATA_HMG-box MATA_HMG- 99.8 6.4E-21 1.4E-25 127.8 8.1 72 54-126 1-72 (77)
3 PF00505 HMG_box: HMG (high mo 99.8 2.9E-20 6.2E-25 121.1 9.0 69 55-124 1-69 (69)
4 cd01388 SOX-TCF_HMG-box SOX-TC 99.8 2.6E-20 5.7E-25 123.4 8.2 70 55-125 2-71 (72)
5 cd01390 HMGB-UBF_HMG-box HMGB- 99.8 1E-19 2.2E-24 117.4 9.0 65 55-120 1-65 (66)
6 COG5648 NHP6B Chromatin-associ 99.8 5E-20 1.1E-24 143.6 8.8 89 43-132 59-147 (211)
7 smart00398 HMG high mobility g 99.8 1.8E-19 4E-24 116.8 9.3 70 54-124 1-70 (70)
8 PF09011 HMG_box_2: HMG-box do 99.8 2.3E-19 5.1E-24 119.1 9.0 72 52-124 1-73 (73)
9 KOG0381 HMG box-containing pro 99.8 3.6E-18 7.8E-23 118.2 10.8 76 51-127 17-95 (96)
10 cd00084 HMG-box High Mobility 99.7 9.6E-18 2.1E-22 107.5 9.0 65 55-120 1-65 (66)
11 KOG0527 HMG-box transcription 99.7 1.3E-17 2.7E-22 139.0 6.2 79 48-127 56-134 (331)
12 KOG0526 Nucleosome-binding fac 99.7 3E-17 6.4E-22 142.1 7.9 78 44-126 525-602 (615)
13 KOG4715 SWI/SNF-related matrix 99.3 2.2E-12 4.9E-17 106.3 7.3 78 48-126 58-135 (410)
14 KOG3248 Transcription factor T 99.3 2.3E-12 4.9E-17 107.0 5.9 71 54-125 191-261 (421)
15 KOG0528 HMG-box transcription 99.1 1.4E-11 3E-16 106.1 1.8 80 50-130 321-400 (511)
16 KOG2746 HMG-box transcription 98.6 4.1E-08 8.9E-13 87.6 4.8 76 43-119 170-247 (683)
17 PF14887 HMG_box_5: HMG (high 98.3 7.4E-06 1.6E-10 55.1 7.7 74 54-129 3-76 (85)
18 PF06382 DUF1074: Protein of u 97.2 0.0007 1.5E-08 52.2 5.1 49 59-112 83-131 (183)
19 PF04690 YABBY: YABBY protein; 97.2 0.00097 2.1E-08 51.2 5.7 49 49-98 116-164 (170)
20 COG5648 NHP6B Chromatin-associ 97.1 0.00041 8.8E-09 54.7 3.0 69 52-121 141-209 (211)
21 PF08073 CHDNT: CHDNT (NUC034) 95.2 0.031 6.7E-07 35.3 3.5 39 60-99 14-52 (55)
22 PF06244 DUF1014: Protein of u 93.1 0.15 3.3E-06 37.2 4.0 50 50-100 68-117 (122)
23 PF04769 MAT_Alpha1: Mating-ty 90.5 0.87 1.9E-05 35.9 5.9 53 50-109 39-91 (201)
24 TIGR03481 HpnM hopanoid biosyn 89.8 0.91 2E-05 35.4 5.5 44 83-126 66-111 (198)
25 PRK15117 ABC transporter perip 89.3 1.1 2.3E-05 35.4 5.6 48 78-126 66-115 (211)
26 KOG3223 Uncharacterized conser 79.6 3.9 8.5E-05 32.2 4.6 53 53-109 162-215 (221)
27 PF05494 Tol_Tol_Ttg2: Toluene 78.3 2.7 5.9E-05 31.5 3.3 47 78-125 36-84 (170)
28 COG2854 Ttg2D ABC-type transpo 69.3 7.3 0.00016 30.8 3.8 42 88-129 78-120 (202)
29 PF13875 DUF4202: Domain of un 68.0 13 0.00027 29.1 4.8 40 60-103 130-169 (185)
30 PF12881 NUT_N: NUT protein N 67.9 16 0.00034 30.9 5.6 53 59-112 229-281 (328)
31 PRK10363 cpxP periplasmic repr 54.7 52 0.0011 25.3 6.0 40 84-124 111-150 (166)
32 PRK09706 transcriptional repre 49.3 48 0.001 23.7 5.0 44 85-128 87-130 (135)
33 PRK12751 cpxP periplasmic stre 47.9 44 0.00096 25.4 4.7 34 85-118 118-151 (162)
34 PRK12750 cpxP periplasmic repr 46.8 78 0.0017 24.1 6.0 35 86-120 126-160 (170)
35 PF11304 DUF3106: Protein of u 44.5 98 0.0021 21.7 5.8 9 94-102 56-64 (107)
36 PF00887 ACBP: Acyl CoA bindin 39.4 58 0.0013 21.6 3.9 53 62-116 30-86 (87)
37 PF06945 DUF1289: Protein of u 38.5 43 0.00093 20.3 2.8 24 83-111 24-47 (51)
38 KOG1610 Corticosteroid 11-beta 36.1 1.1E+02 0.0023 26.0 5.6 57 65-124 188-256 (322)
39 PF01352 KRAB: KRAB box; Inte 35.5 29 0.00063 20.2 1.6 28 83-110 3-31 (41)
40 PF12650 DUF3784: Domain of un 34.8 25 0.00054 23.8 1.5 15 94-108 26-40 (97)
41 TIGR00787 dctP tripartite ATP- 31.8 99 0.0021 24.3 4.7 28 91-118 213-240 (257)
42 cd07081 ALDH_F20_ACDH_EutE-lik 31.1 1.2E+02 0.0026 26.4 5.4 40 85-124 6-45 (439)
43 COG4281 ACB Acyl-CoA-binding p 31.0 52 0.0011 22.3 2.4 61 54-116 16-85 (87)
44 PF05388 Carbpep_Y_N: Carboxyp 30.4 75 0.0016 22.7 3.4 29 83-111 45-73 (113)
45 PRK10236 hypothetical protein; 30.4 49 0.0011 26.8 2.6 26 86-111 118-143 (237)
46 KOG1827 Chromatin remodeling c 29.8 3.9 8.5E-05 37.4 -4.1 44 58-102 552-595 (629)
47 COG1638 DctP TRAP-type C4-dica 29.3 1.1E+02 0.0024 25.7 4.7 35 91-125 244-278 (332)
48 PF06394 Pepsin-I3: Pepsin inh 28.9 59 0.0013 21.7 2.4 31 95-133 38-68 (76)
49 cd07133 ALDH_CALDH_CalB Conife 26.4 1.8E+02 0.0038 25.1 5.6 42 84-125 4-45 (434)
50 PRK10455 periplasmic protein; 26.3 1.4E+02 0.003 22.6 4.3 28 85-112 118-145 (161)
51 PHA02662 ORF131 putative membr 25.4 1.6E+02 0.0035 23.7 4.7 45 60-105 22-98 (226)
52 PF12290 DUF3802: Protein of u 25.2 2.6E+02 0.0057 20.1 5.3 40 72-111 49-101 (113)
53 PF03480 SBP_bac_7: Bacterial 25.2 1.1E+02 0.0024 24.3 4.0 31 91-121 213-243 (286)
54 cd07122 ALDH_F20_ACDH Coenzyme 23.7 1.9E+02 0.0041 25.2 5.3 40 85-124 6-45 (436)
55 cd07132 ALDH_F3AB Aldehyde deh 23.6 2E+02 0.0043 24.9 5.4 40 85-124 5-44 (443)
56 PF15581 Imm35: Immunity prote 23.2 93 0.002 21.5 2.6 21 83-103 32-52 (93)
57 cd01145 TroA_c Periplasmic bin 23.1 1.4E+02 0.0031 22.7 4.0 48 82-129 116-163 (203)
58 cd07087 ALDH_F3-13-14_CALDH-li 21.2 2.4E+02 0.0051 24.2 5.4 40 85-124 5-44 (426)
59 cd07085 ALDH_F6_MMSDH Methylma 21.0 2.4E+02 0.0052 24.5 5.4 37 87-123 47-83 (478)
60 PTZ00037 DnaJ_C chaperone prot 20.7 2E+02 0.0043 25.1 4.8 43 66-108 46-88 (421)
61 smart00271 DnaJ DnaJ molecular 20.4 1.7E+02 0.0037 17.2 3.3 34 67-100 20-58 (60)
No 1
>PTZ00199 high mobility group protein; Provisional
Probab=99.93 E-value=2.4e-25 Score=155.28 Aligned_cols=86 Identities=44% Similarity=0.699 Sum_probs=79.9
Q ss_pred ccccccccccCCCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCC-cHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHH
Q 032101 39 KRQGKREKKAKKDPNKPKRPPSAFFVFLEEFRKTFKKENPNVT-AVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYG 117 (147)
Q Consensus 39 kk~~kk~kk~~kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~-~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~ 117 (147)
++.+++++++.+||+.|+||+||||+|++++|..|..+||+++ ++++|+++||++|++||+++|.+|++.|..++++|.
T Consensus 7 ~~~~k~~~k~~kdp~~PKrP~sAY~~F~~~~R~~i~~~~P~~~~~~~evsk~ige~Wk~ls~eeK~~y~~~A~~dk~rY~ 86 (94)
T PTZ00199 7 KVLVRKNKRKKKDPNAPKRALSAYMFFAKEKRAEIIAENPELAKDVAAVGKMVGEAWNKLSEEEKAPYEKKAQEDKVRYE 86 (94)
T ss_pred CccccccCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHCcCCcccHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHH
Confidence 4445667778899999999999999999999999999999985 479999999999999999999999999999999999
Q ss_pred HHHHHHh
Q 032101 118 KKMNAYN 124 (147)
Q Consensus 118 k~~~~Y~ 124 (147)
.+|.+|+
T Consensus 87 ~e~~~Y~ 93 (94)
T PTZ00199 87 KEKAEYA 93 (94)
T ss_pred HHHHHHh
Confidence 9999996
No 2
>cd01389 MATA_HMG-box MATA_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include the fungal mating type gene products MC, MATA1 and Ste11.
Probab=99.84 E-value=6.4e-21 Score=127.80 Aligned_cols=72 Identities=31% Similarity=0.517 Sum_probs=69.4
Q ss_pred CCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhh
Q 032101 54 KPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKK 126 (147)
Q Consensus 54 ~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~ 126 (147)
.|+||+||||||+++.|..|+.+||+++ +.+|+++||++|++||+++|++|.++|..++++|..++++|...
T Consensus 1 ~~kRP~naf~lf~~~~r~~~~~~~p~~~-~~eisk~~g~~Wk~ls~eeK~~y~~~A~~~k~~~~~~~p~Yky~ 72 (77)
T cd01389 1 KIPRPRNAFILYRQDKHAQLKTENPGLT-NNEISRIIGRMWRSESPEVKAYYKELAEEEKERHAREYPDYKYT 72 (77)
T ss_pred CCCCCCcHHHHHHHHHHHHHHHHCCCCC-HHHHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHHHCCCCccc
Confidence 4899999999999999999999999997 79999999999999999999999999999999999999999875
No 3
>PF00505 HMG_box: HMG (high mobility group) box; InterPro: IPR000910 High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair []. The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes; the SOX family of transcription factors; SRY sex determining region Y protein and related proteins []; LEF1 lymphoid enhancer binding factor 1 []; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor []; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.; GO: 0003677 DNA binding; PDB: 1I11_A 1J3C_A 1J3D_A 1WZ6_A 1WGF_A 2D7L_A 1GT0_D 3U2B_C 2CRJ_A 2CS1_A ....
Probab=99.83 E-value=2.9e-20 Score=121.08 Aligned_cols=69 Identities=43% Similarity=0.797 Sum_probs=65.5
Q ss_pred CCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101 55 PKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN 124 (147)
Q Consensus 55 PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~ 124 (147)
|+||+|||++|+.+++..++.+||+++ +.+|+++||++|++||+++|.+|.+.|..++.+|..++.+|+
T Consensus 1 PkrP~~af~lf~~~~~~~~k~~~p~~~-~~~i~~~~~~~W~~l~~~eK~~y~~~a~~~~~~y~~~~~~y~ 69 (69)
T PF00505_consen 1 PKRPPNAFMLFCKEKRAKLKEENPDLS-NKEISKILAQMWKNLSEEEKAPYKEEAEEEKERYEKEMPEYK 69 (69)
T ss_dssp SSSS--HHHHHHHHHHHHHHHHSTTST-HHHHHHHHHHHHHCSHHHHHHHHHHHHHHHHHHHHHHHHHHH
T ss_pred CcCCCCHHHHHHHHHHHHHHHHhcccc-cccchhhHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 899999999999999999999999998 799999999999999999999999999999999999999995
No 4
>cd01388 SOX-TCF_HMG-box SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif.
Probab=99.83 E-value=2.6e-20 Score=123.41 Aligned_cols=70 Identities=36% Similarity=0.527 Sum_probs=67.4
Q ss_pred CCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhh
Q 032101 55 PKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNK 125 (147)
Q Consensus 55 PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~ 125 (147)
.+||+||||+|++++|..++.+||+++ +.+|+++||++|+.||+++|++|.+.|..++++|..++++|+.
T Consensus 2 iKrP~naf~~F~~~~r~~~~~~~p~~~-~~eisk~l~~~Wk~ls~~eK~~y~~~a~~~k~~y~~~~p~y~y 71 (72)
T cd01388 2 IKRPMNAFMLFSKRHRRKVLQEYPLKE-NRAISKILGDRWKALSNEEKQPYYEEAKKLKELHMKLYPDYKW 71 (72)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCCCCC-HHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHCcCCCC
Confidence 589999999999999999999999997 7999999999999999999999999999999999999999963
No 5
>cd01390 HMGB-UBF_HMG-box HMGB-UBF_HMG-box, class II and III members of the HMG-box superfamily of DNA-binding proteins. These proteins bind the minor groove of DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions.
Probab=99.82 E-value=1e-19 Score=117.38 Aligned_cols=65 Identities=54% Similarity=0.836 Sum_probs=63.3
Q ss_pred CCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHH
Q 032101 55 PKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKM 120 (147)
Q Consensus 55 PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~ 120 (147)
|++|+|||++|++++|..+..+||+++ +.+|++.||++|++||+++|.+|.+.|..++.+|..+|
T Consensus 1 Pkrp~saf~~f~~~~r~~~~~~~p~~~-~~~i~~~~~~~W~~ls~~eK~~y~~~a~~~~~~y~~e~ 65 (66)
T cd01390 1 PKRPLSAYFLFSQEQRPKLKKENPDAS-VTEVTKILGEKWKELSEEEKKKYEEKAEKDKERYEKEM 65 (66)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCcCCC-HHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhh
Confidence 899999999999999999999999997 89999999999999999999999999999999999887
No 6
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=99.82 E-value=5e-20 Score=143.59 Aligned_cols=89 Identities=42% Similarity=0.689 Sum_probs=84.5
Q ss_pred ccccccCCCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHH
Q 032101 43 KREKKAKKDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNA 122 (147)
Q Consensus 43 kk~kk~~kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~ 122 (147)
+...+.++|||.|+||+||||+|+.++|.+|..++|.++ |.+|++.+|++|++|++++|.+|...|..++++|..++..
T Consensus 59 k~~~r~k~dpN~PKRp~sayf~y~~~~R~ei~~~~p~l~-~~e~~k~~~e~WK~Ltd~eke~y~k~~~~~~erYq~ek~~ 137 (211)
T COG5648 59 KRLVRKKKDPNGPKRPLSAYFLYSAENRDEIRKENPKLT-FGEVGKLLSEKWKELTDEEKEPYYKEANSDRERYQREKEE 137 (211)
T ss_pred HHHHHHhcCCCCCCCchhHHHHHHHHHHHHHHHhCCCCC-hHHHHHHHHHHHHhccHhhhhhHHHHHhhHHHHHHHHHHh
Confidence 566778899999999999999999999999999999998 9999999999999999999999999999999999999999
Q ss_pred HhhhCCCCCC
Q 032101 123 YNKKQVTNLV 132 (147)
Q Consensus 123 Y~~~~~~~~~ 132 (147)
|+.+..+...
T Consensus 138 y~~k~~~~~~ 147 (211)
T COG5648 138 YNKKLPNKAP 147 (211)
T ss_pred hhcccCCCCC
Confidence 9999887753
No 7
>smart00398 HMG high mobility group.
Probab=99.81 E-value=1.8e-19 Score=116.81 Aligned_cols=70 Identities=47% Similarity=0.793 Sum_probs=67.7
Q ss_pred CCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101 54 KPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN 124 (147)
Q Consensus 54 ~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~ 124 (147)
+|++|+|+|++|++++|..+..+||+++ +.+|++.||++|+.||+++|.+|.+.|..++.+|..++..|.
T Consensus 1 ~pkrp~~~y~~f~~~~r~~~~~~~~~~~-~~~i~~~~~~~W~~l~~~ek~~y~~~a~~~~~~y~~~~~~y~ 70 (70)
T smart00398 1 KPKRPMSAFMLFSQENRAKIKAENPDLS-NAEISKKLGERWKLLSEEEKAPYEEKAKKDKERYEEEMPEYK 70 (70)
T ss_pred CcCCCCcHHHHHHHHHHHHHHHHCcCCC-HHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 5899999999999999999999999997 899999999999999999999999999999999999999984
No 8
>PF09011 HMG_box_2: HMG-box domain; InterPro: IPR015101 This domain is predominantly found in Maelstrom homologue proteins. It has no known function. ; GO: 0005634 nucleus; PDB: 2EQZ_A 1V64_A 2CTO_A 1H5P_A 3TQ6_A 3FGH_A 3TMM_A 1J3X_A 2YRQ_A 1AAB_A ....
Probab=99.80 E-value=2.3e-19 Score=119.15 Aligned_cols=72 Identities=43% Similarity=0.785 Sum_probs=63.3
Q ss_pred CCCCCCCCCHHHHHHHHHHHHHHHh-CCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101 52 PNKPKRPPSAFFVFLEEFRKTFKKE-NPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN 124 (147)
Q Consensus 52 p~~PKRP~sAy~lF~~e~r~~ik~e-~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~ 124 (147)
|+.||+|+|||+||+.+++..++.+ ++... +.++++.|++.|++||++||.+|.++|..++.+|..+|..|+
T Consensus 1 p~kpK~~~say~lF~~~~~~~~k~~G~~~~~-~~e~~k~~~~~Wk~Ls~~EK~~Y~~~A~~~k~~y~~e~~~~~ 73 (73)
T PF09011_consen 1 PKKPKRPPSAYNLFMKEMRKEVKEEGGQKQS-FREVMKEISERWKSLSEEEKEPYEERAKEDKERYEREMKEWN 73 (73)
T ss_dssp SSS--SSSSHHHHHHHHHHHHHHHHT-T-SS-HHHHHHHHHHHHHHS-HHHHHHHHHHHHHHHHHHHHHHHHH-
T ss_pred CcCCCCCCCHHHHHHHHHHHHHHHhcccCCC-HHHHHHHHHHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 6899999999999999999999988 66665 899999999999999999999999999999999999999995
No 9
>KOG0381 consensus HMG box-containing protein [General function prediction only]
Probab=99.77 E-value=3.6e-18 Score=118.24 Aligned_cols=76 Identities=49% Similarity=0.831 Sum_probs=72.3
Q ss_pred CC--CCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHH-HHhhhC
Q 032101 51 DP--NKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMN-AYNKKQ 127 (147)
Q Consensus 51 dp--~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~-~Y~~~~ 127 (147)
|| +.|+||+|||++|+.+.|..++.+||+++ +.+|++++|++|++|++++|.+|...+..++++|..+|. .|+...
T Consensus 17 ~p~~~~pkrp~sa~~~f~~~~~~~~k~~~p~~~-~~~v~k~~g~~W~~l~~~~k~~y~~ka~~~k~~Y~~~~~~~~~~~~ 95 (96)
T KOG0381|consen 17 DPNAQAPKRPLSAFFLFSSEQRSKIKAENPGLS-VGEVAKALGEMWKNLAEEEKQPYEEKASKLKEKYEKELAGEYKASL 95 (96)
T ss_pred CCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCC-HHHHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHhhcc
Confidence 66 59999999999999999999999999987 899999999999999999999999999999999999999 998754
No 10
>cd00084 HMG-box High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III member
Probab=99.75 E-value=9.6e-18 Score=107.52 Aligned_cols=65 Identities=51% Similarity=0.807 Sum_probs=62.7
Q ss_pred CCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHH
Q 032101 55 PKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKM 120 (147)
Q Consensus 55 PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~ 120 (147)
|+||+|||++|+++.|..+..+||+++ +.+|++.||++|+.|++++|.+|.+.|..++.+|..++
T Consensus 1 pkrp~~af~~f~~~~~~~~~~~~~~~~-~~~i~~~~~~~W~~l~~~~k~~y~~~a~~~~~~y~~~~ 65 (66)
T cd00084 1 PKRPLSAYFLFSQEHRAEVKAENPGLS-VGEISKILGEMWKSLSEEEKKKYEEKAEKDKERYEKEM 65 (66)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCcCCC-HHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhh
Confidence 799999999999999999999999997 79999999999999999999999999999999999875
No 11
>KOG0527 consensus HMG-box transcription factor [Transcription]
Probab=99.71 E-value=1.3e-17 Score=139.03 Aligned_cols=79 Identities=32% Similarity=0.590 Sum_probs=74.3
Q ss_pred cCCCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhhC
Q 032101 48 AKKDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKKQ 127 (147)
Q Consensus 48 ~~kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~~ 127 (147)
.+......||||||||+|.+..|.+|.++||.+.| .||+++||.+|+.|+++||.+|+++|++++..|.++..+|+.+-
T Consensus 56 ~k~~~~hIKRPMNAFMVWSq~~RRkma~qnP~mHN-SEISK~LG~~WK~Lse~EKrPFi~EAeRLR~~HmkehPdYKYRP 134 (331)
T KOG0527|consen 56 DKTSTDRIKRPMNAFMVWSQGQRRKLAKQNPKMHN-SEISKRLGAEWKLLSEEEKRPFVDEAERLRAQHMKEYPDYKYRP 134 (331)
T ss_pred CCCCccccCCCcchhhhhhHHHHHHHHHhCcchhh-HHHHHHHHHHHhhcCHhhhccHHHHHHHHHHHHHHhCCCccccc
Confidence 44556789999999999999999999999999997 89999999999999999999999999999999999999998763
No 12
>KOG0526 consensus Nucleosome-binding factor SPN, POB3 subunit [Transcription; Replication, recombination and repair; Chromatin structure and dynamics]
Probab=99.70 E-value=3e-17 Score=142.08 Aligned_cols=78 Identities=41% Similarity=0.708 Sum_probs=73.6
Q ss_pred cccccCCCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHH
Q 032101 44 REKKAKKDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAY 123 (147)
Q Consensus 44 k~kk~~kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y 123 (147)
++.++.+|||+|||++||||+|++..|..|+.+ +++ +++|++.+|++|+.||. |.+|++.|+.++++|+.+|.+|
T Consensus 525 k~~kk~kdpnapkra~sa~m~w~~~~r~~ik~d--gi~-~~dv~kk~g~~wk~ms~--k~~we~ka~~dk~ry~~em~~y 599 (615)
T KOG0526|consen 525 KKGKKKKDPNAPKRATSAYMLWLNASRESIKED--GIS-VGDVAKKAGEKWKQMSA--KEEWEDKAAVDKQRYEDEMKEY 599 (615)
T ss_pred cCcccCCCCCCCccchhHHHHHHHhhhhhHhhc--Cch-HHHHHHHHhHHHhhhcc--cchhhHHHHHHHHHHHHHHHhh
Confidence 667788999999999999999999999999987 887 99999999999999999 9999999999999999999999
Q ss_pred hhh
Q 032101 124 NKK 126 (147)
Q Consensus 124 ~~~ 126 (147)
+.-
T Consensus 600 k~g 602 (615)
T KOG0526|consen 600 KNG 602 (615)
T ss_pred cCC
Confidence 943
No 13
>KOG4715 consensus SWI/SNF-related matrix-associated actin-dependent regulator of chromatin [Chromatin structure and dynamics]
Probab=99.35 E-value=2.2e-12 Score=106.25 Aligned_cols=78 Identities=28% Similarity=0.558 Sum_probs=73.9
Q ss_pred cCCCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhh
Q 032101 48 AKKDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKK 126 (147)
Q Consensus 48 ~~kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~ 126 (147)
..+.|.+|-+|+-.||.|+..++++|+..||++. +.+|+++||.||..|+++||..|...++.++.+|++.|.+|+..
T Consensus 58 ~pkpPkppekpl~pymrySrkvWd~VkA~nPe~k-LWeiGK~Ig~mW~dLpd~EK~ey~~EYeaEKieY~~smkayh~s 135 (410)
T KOG4715|consen 58 RPKPPKPPEKPLMPYMRYSRKVWDQVKASNPELK-LWEIGKIIGGMWLDLPDEEKQEYLNEYEAEKIEYNESMKAYHNS 135 (410)
T ss_pred CCCCCCCCCcccchhhHHhhhhhhhhhccCcchH-HHHHHHHHHHHHhhCcchHHHHHHHHHHHHHHHHHHHHHHhhCC
Confidence 4457889999999999999999999999999998 99999999999999999999999999999999999999999764
No 14
>KOG3248 consensus Transcription factor TCF-4 [Transcription]
Probab=99.32 E-value=2.3e-12 Score=106.96 Aligned_cols=71 Identities=23% Similarity=0.416 Sum_probs=65.0
Q ss_pred CCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhh
Q 032101 54 KPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNK 125 (147)
Q Consensus 54 ~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~ 125 (147)
..|+|+||||+|++++|..|..++. ++...+|.++||.+|++||-||.+.|+++|+++++.+......|-+
T Consensus 191 hiKKPLNAFmlyMKEmRa~vvaEct-lKeSAaiNqiLGrRWH~LSrEEQAKYyElArKerqlH~qlYP~WSA 261 (421)
T KOG3248|consen 191 HIKKPLNAFMLYMKEMRAKVVAECT-LKESAAINQILGRRWHALSREEQAKYYELARKERQLHMQLYPGWSA 261 (421)
T ss_pred cccccHHHHHHHHHHHHHHHHHHhh-hhhHHHHHHHHhHHHhhhhHHHHHHHHHHHHHHHHHHHHhcCCcch
Confidence 6799999999999999999999996 4446899999999999999999999999999999999988877744
No 15
>KOG0528 consensus HMG-box transcription factor SOX5 [Transcription]
Probab=99.14 E-value=1.4e-11 Score=106.11 Aligned_cols=80 Identities=29% Similarity=0.474 Sum_probs=73.0
Q ss_pred CCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhhCCC
Q 032101 50 KDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKKQVT 129 (147)
Q Consensus 50 kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~~~~ 129 (147)
..++..||||||||+|.++.|..|.+.+|++.| ..|+++||.+|+.||..||++|++.-.++-..|.+..+.|+.+-.+
T Consensus 321 ss~PHIKRPMNAFMVWAkDERRKILqA~PDMHN-SnISKILGSRWKaMSN~eKQPYYEEQaRLSk~HlEk~PdYrYkPRP 399 (511)
T KOG0528|consen 321 SSEPHIKRPMNAFMVWAKDERRKILQAFPDMHN-SNISKILGSRWKAMSNTEKQPYYEEQARLSKLHLEKYPDYRYKPRP 399 (511)
T ss_pred CCCccccCCcchhhcccchhhhhhhhcCccccc-cchhHHhcccccccccccccchHHHHHHHHHhhhccCcccccCCCC
Confidence 334577999999999999999999999999997 6999999999999999999999999999999999999999987655
Q ss_pred C
Q 032101 130 N 130 (147)
Q Consensus 130 ~ 130 (147)
.
T Consensus 400 K 400 (511)
T KOG0528|consen 400 K 400 (511)
T ss_pred C
Confidence 4
No 16
>KOG2746 consensus HMG-box transcription factor Capicua and related proteins [Transcription]
Probab=98.61 E-value=4.1e-08 Score=87.60 Aligned_cols=76 Identities=26% Similarity=0.453 Sum_probs=69.7
Q ss_pred ccccccCCCCCCCCCCCCHHHHHHHHHH--HHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHH
Q 032101 43 KREKKAKKDPNKPKRPPSAFFVFLEEFR--KTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKK 119 (147)
Q Consensus 43 kk~kk~~kdp~~PKRP~sAy~lF~~e~r--~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~ 119 (147)
..+-..+.|....+||||+|++|++.+| ..+.+.||+..| ..|+++||+.|-.|.+.||+.|.++|.+.++.|.++
T Consensus 170 dgrspnkr~k~HirrPMnaf~ifskrhr~~g~vhq~~pn~DN-rtIskiLgewWytL~~~Ekq~yhdLa~Qvk~Ahfka 247 (683)
T KOG2746|consen 170 DGRSPNKRDKDHIRRPMNAFHIFSKRHRGEGRVHQRHPNQDN-RTISKILGEWWYTLGPNEKQKYHDLAFQVKEAHFKA 247 (683)
T ss_pred ccCCCCcCcchhhhhhhHHHHHHHhhcCCccchhccCccccc-hhHHHHHhhhHhhhCchhhhhHHHHHHHHHHHHhhh
Confidence 4455566777789999999999999999 899999999997 899999999999999999999999999999999886
No 17
>PF14887 HMG_box_5: HMG (high mobility group) box 5; PDB: 1L8Y_A 1L8Z_A 2HDZ_A.
Probab=98.25 E-value=7.4e-06 Score=55.11 Aligned_cols=74 Identities=19% Similarity=0.274 Sum_probs=61.0
Q ss_pred CCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhhCCC
Q 032101 54 KPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKKQVT 129 (147)
Q Consensus 54 ~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~~~~ 129 (147)
.|..|-++--+|.+.....+...+++.. ..+ .+.+...|++|++.+|.+|+..|.++..+|+.+|.+|+.-...
T Consensus 3 lPE~PKt~qe~Wqq~vi~dYla~~~~dr-~K~-~kam~~~W~~me~Kekl~WIkKA~EdqKrYE~el~e~r~~~~~ 76 (85)
T PF14887_consen 3 LPETPKTAQEIWQQSVIGDYLAKFRNDR-KKA-LKAMEAQWSQMEKKEKLKWIKKAAEDQKRYERELREMRSAPAD 76 (85)
T ss_dssp -S----THHHHHHHHHHHHHHHHTTSTH-HHH-HHHHHHHHHTTGGGHHHHHHHHHHHHHHHHHHHHHCCS-CCCT
T ss_pred CCCCCCCHHHHHHHHHHHHHHHHhhHhH-HHH-HHHHHHHHHHhhhhhhhHHHHHHHHHHHHHHHHHHHHhcCCCC
Confidence 5778889999999999999999999885 344 5699999999999999999999999999999999999876443
No 18
>PF06382 DUF1074: Protein of unknown function (DUF1074); InterPro: IPR024460 This family consists of several proteins which appear to be specific to Insecta. The function of this family is unknown.
Probab=97.20 E-value=0.0007 Score=52.20 Aligned_cols=49 Identities=29% Similarity=0.480 Sum_probs=42.8
Q ss_pred CCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHH
Q 032101 59 PSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKL 112 (147)
Q Consensus 59 ~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~ 112 (147)
-+||+-|+.++|.. |.+++ ..|+....+..|..||+++|..|..++...
T Consensus 83 nnaYLNFLReFRrk----h~~L~-p~dlI~~AAraW~rLSe~eK~rYrr~~~~~ 131 (183)
T PF06382_consen 83 NNAYLNFLREFRRK----HCGLS-PQDLIQRAARAWCRLSEAEKNRYRRMAPSV 131 (183)
T ss_pred chHHHHHHHHHHHH----ccCCC-HHHHHHHHHHHHHhCCHHHHHHHHhhcchh
Confidence 47899999998875 57897 789999999999999999999999876543
No 19
>PF04690 YABBY: YABBY protein; InterPro: IPR006780 YABBY proteins are a group of plant-specific transcription factors involved in the specification of abaxial polarity in lateral organs such as leaves and floral organs [, ].
Probab=97.18 E-value=0.00097 Score=51.23 Aligned_cols=49 Identities=33% Similarity=0.542 Sum_probs=42.8
Q ss_pred CCCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCC
Q 032101 49 KKDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMS 98 (147)
Q Consensus 49 ~kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls 98 (147)
.+.|.+-.|-+|||..|+++.-.+|+..||+++ ..|.-...+..|...+
T Consensus 116 ~kPPEKRqR~psaYn~f~k~ei~rik~~~p~is-hkeaFs~aAknW~h~p 164 (170)
T PF04690_consen 116 NKPPEKRQRVPSAYNRFMKEEIQRIKAENPDIS-HKEAFSAAAKNWAHFP 164 (170)
T ss_pred cCCccccCCCchhHHHHHHHHHHHHHhcCCCCC-HHHHHHHHHHhhhhCc
Confidence 344555567789999999999999999999998 7999999999998765
No 20
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=97.10 E-value=0.00041 Score=54.74 Aligned_cols=69 Identities=22% Similarity=0.312 Sum_probs=62.1
Q ss_pred CCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHH
Q 032101 52 PNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMN 121 (147)
Q Consensus 52 p~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~ 121 (147)
..++..|...|+-+-..+|..+...+|... ..++++++|..|.+|++.-|..|.+.+..++..|...+.
T Consensus 141 k~~~~~~~~~~~e~~~~~r~~~~~~~~~~~-~~e~~k~~~~~w~el~~skK~~~~~~~Kk~k~~~~~~~~ 209 (211)
T COG5648 141 KLPNKAPIGPFIENEPKIRPKVEGPSPDKA-LVEETKIISKAWSELDESKKKKYIDKYKKLKEEYDSFYP 209 (211)
T ss_pred ccCCCCCCchhhhccHHhccccCCCCcchh-hhHHhhhhhhhhhhhChhhhhHHHHHHHHHHHHHhhhcc
Confidence 457788888999999999999999999886 789999999999999999999999999999999987654
No 21
>PF08073 CHDNT: CHDNT (NUC034) domain; InterPro: IPR012958 The CHD N-terminal domain is found in PHD/RING fingers and chromo domain-associated helicases [].; GO: 0003677 DNA binding, 0005524 ATP binding, 0008270 zinc ion binding, 0016818 hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus
Probab=95.21 E-value=0.031 Score=35.32 Aligned_cols=39 Identities=21% Similarity=0.422 Sum_probs=35.1
Q ss_pred CHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCCh
Q 032101 60 SAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSP 99 (147)
Q Consensus 60 sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~ 99 (147)
+.|-+|.+-.|..|...||++. ++.|..+++.+|+..++
T Consensus 14 t~yK~Fsq~vRP~l~~~NPk~~-~sKl~~l~~AKwrEF~~ 52 (55)
T PF08073_consen 14 TNYKAFSQHVRPLLAKANPKAP-MSKLMMLLQAKWREFQE 52 (55)
T ss_pred HHHHHHHHHHHHHHHHHCCCCc-HHHHHHHHHHHHHHHHh
Confidence 5688999999999999999997 89999999999987554
No 22
>PF06244 DUF1014: Protein of unknown function (DUF1014); InterPro: IPR010422 This family consists of several hypothetical eukaryotic proteins of unknown function.
Probab=93.11 E-value=0.15 Score=37.16 Aligned_cols=50 Identities=22% Similarity=0.324 Sum_probs=41.2
Q ss_pred CCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChh
Q 032101 50 KDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPA 100 (147)
Q Consensus 50 kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~e 100 (147)
-|..|-+|---||.-|....-..|+.+||++. .+++-.+|-..|..-++.
T Consensus 68 ~drHPErR~KAAy~afeE~~Lp~lK~E~PgLr-lsQ~kq~l~K~w~KSPeN 117 (122)
T PF06244_consen 68 IDRHPERRMKAAYKAFEERRLPELKEENPGLR-LSQYKQMLWKEWQKSPEN 117 (122)
T ss_pred CCCCcchhHHHHHHHHHHHHhHHHHhhCCCch-HHHHHHHHHHHHhcCCCC
Confidence 34434455557899999999999999999998 899999999999876653
No 23
>PF04769 MAT_Alpha1: Mating-type protein MAT alpha 1; InterPro: IPR006856 This family includes Saccharomyces cerevisiae (Baker's yeast) mating type protein alpha 1 (P01365 from SWISSPROT). MAT alpha 1 is a transcription activator that activates mating-type alpha-specific genes with the help of the MADS-box containing MCM1 transcription factor, which together bind cooperatively to PQ elements upstream of alpha-specific genes. The MCM1-MATalpha1 complex is required for the proper DNA-bending that is needed for transcriptional activation []. Alpha 1 interacts in vivo with STE12, linking expression of alpha-specific genes to the alpha-pheromone (IPR006742 from INTERPRO) response pathway [].; GO: 0000772 mating pheromone activity, 0003677 DNA binding, 0045895 positive regulation of transcription, mating-type specific, 0005634 nucleus
Probab=90.52 E-value=0.87 Score=35.90 Aligned_cols=53 Identities=21% Similarity=0.353 Sum_probs=38.1
Q ss_pred CCCCCCCCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHH
Q 032101 50 KDPNKPKRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKA 109 (147)
Q Consensus 50 kdp~~PKRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A 109 (147)
.....++||.|+||.|..=.-. -.|+.. ..+++..|+..|..=+. |..|.-.|
T Consensus 39 ~~~~~~kr~lN~Fm~FRsyy~~----~~~~~~-Qk~~S~~l~~lW~~dp~--k~~W~l~a 91 (201)
T PF04769_consen 39 RSPEKAKRPLNGFMAFRSYYSP----IFPPLP-QKELSGILTKLWEKDPF--KNKWSLMA 91 (201)
T ss_pred ccccccccchhHHHHHHHHHHh----hcCCcC-HHHHHHHHHHHHhCCcc--HhHHHHHh
Confidence 3455789999999999766553 346776 68999999999986332 45555444
No 24
>TIGR03481 HpnM hopanoid biosynthesis associated membrane protein HpnM. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins are members of the pfam05494 family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the members included here have anything to do with toluene per se.
Probab=89.76 E-value=0.91 Score=35.43 Aligned_cols=44 Identities=18% Similarity=0.511 Sum_probs=38.7
Q ss_pred HHHHHH-HHHHHhcCCChhhhhhHHHHHHH-HHHHHHHHHHHHhhh
Q 032101 83 VSAVGK-AAGGKWKSMSPAEKAPYESKAEK-LKSEYGKKMNAYNKK 126 (147)
Q Consensus 83 ~~eisk-~lge~Wk~Ls~eeK~~Y~~~A~~-~k~~y~k~~~~Y~~~ 126 (147)
|..|++ .||..|+.+|+++++.|.+.... ....|-..+..|...
T Consensus 66 f~~mar~vLG~~W~~~s~~Qr~~F~~~F~~~l~~tY~~~l~~y~~~ 111 (198)
T TIGR03481 66 LPAMARLTLGSSWTSLSPEQRRRFIGAFRELSIATYASQFKSYAGE 111 (198)
T ss_pred HHHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhcCc
Confidence 778876 68999999999999999998888 778899999999763
No 25
>PRK15117 ABC transporter periplasmic binding protein MlaC; Provisional
Probab=89.33 E-value=1.1 Score=35.36 Aligned_cols=48 Identities=29% Similarity=0.562 Sum_probs=40.3
Q ss_pred CCCCcHHHHHH-HHHHHhcCCChhhhhhHHHHHHH-HHHHHHHHHHHHhhh
Q 032101 78 PNVTAVSAVGK-AAGGKWKSMSPAEKAPYESKAEK-LKSEYGKKMNAYNKK 126 (147)
Q Consensus 78 P~~~~~~eisk-~lge~Wk~Ls~eeK~~Y~~~A~~-~k~~y~k~~~~Y~~~ 126 (147)
|... |..+++ .||..|+++|++++..|.+.... ....|-..+..|...
T Consensus 66 p~~D-f~~~s~~vLG~~wr~as~eQr~~F~~~F~~~Lv~tYa~~l~~y~~q 115 (211)
T PRK15117 66 PYVQ-VKYAGALVLGRYYKDATPAQREAYFAAFREYLKQAYGQALAMYHGQ 115 (211)
T ss_pred ccCC-HHHHHHHHhhhhhhhCCHHHHHHHHHHHHHHHHHHHHHHHHHhCCc
Confidence 5564 778875 68999999999999999987777 667899999999764
No 26
>KOG3223 consensus Uncharacterized conserved protein [Function unknown]
Probab=79.56 E-value=3.9 Score=32.24 Aligned_cols=53 Identities=30% Similarity=0.483 Sum_probs=43.3
Q ss_pred CCC-CCCCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHH
Q 032101 53 NKP-KRPPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKA 109 (147)
Q Consensus 53 ~~P-KRP~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A 109 (147)
..| +|=.-||.-|-....+.|+.+||++. ++++-.+|-.+|..-++. ||.+.+
T Consensus 162 rHPEkRmrAA~~afEe~~LPrLK~e~P~lr-lsQ~Kqll~Kew~KsPDN---P~Nq~~ 215 (221)
T KOG3223|consen 162 RHPEKRMRAAFKAFEEARLPRLKKENPGLR-LSQYKQLLKKEWQKSPDN---PFNQAA 215 (221)
T ss_pred cChHHHHHHHHHHHHHhhchhhhhcCCCcc-HHHHHHHHHHHHhhCCCC---hhhHHh
Confidence 445 45556799999999999999999998 899999999999887775 565544
No 27
>PF05494 Tol_Tol_Ttg2: Toluene tolerance, Ttg2 ; InterPro: IPR008869 Toluene tolerance is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation []. Many proteins are involved in these processes. This family is a transporter which shows similarity to ABC transporters [].; PDB: 2QGU_A.
Probab=78.27 E-value=2.7 Score=31.47 Aligned_cols=47 Identities=21% Similarity=0.548 Sum_probs=36.1
Q ss_pred CCCCcHHHHHH-HHHHHhcCCChhhhhhHHHHHHH-HHHHHHHHHHHHhh
Q 032101 78 PNVTAVSAVGK-AAGGKWKSMSPAEKAPYESKAEK-LKSEYGKKMNAYNK 125 (147)
Q Consensus 78 P~~~~~~eisk-~lge~Wk~Ls~eeK~~Y~~~A~~-~k~~y~k~~~~Y~~ 125 (147)
|... |..|++ .||..|+.||++++..|.+.... ....|-..+..|..
T Consensus 36 ~~~D-~~~~ar~~LG~~w~~~s~~q~~~F~~~f~~~l~~~Y~~~l~~y~~ 84 (170)
T PF05494_consen 36 PYFD-FERMARRVLGRYWRKASPAQRQRFVEAFKQLLVRTYAKRLDEYSG 84 (170)
T ss_dssp GGB--HHHHHHHHHGGGTTTS-HHHHHHHHHHHHHHHHHHHHHHHHT-SS
T ss_pred HhCC-HHHHHHHHHHHhHhhCCHHHHHHHHHHHHHHHHHHHHHHHHhhCC
Confidence 4554 677775 47889999999999999987776 66788999999975
No 28
>COG2854 Ttg2D ABC-type transport system involved in resistance to organic solvents, auxiliary component [Secondary metabolites biosynthesis, transport, and catabolism]
Probab=69.26 E-value=7.3 Score=30.83 Aligned_cols=42 Identities=19% Similarity=0.421 Sum_probs=35.9
Q ss_pred HHHHHHhcCCChhhhhhHHHHHHH-HHHHHHHHHHHHhhhCCC
Q 032101 88 KAAGGKWKSMSPAEKAPYESKAEK-LKSEYGKKMNAYNKKQVT 129 (147)
Q Consensus 88 k~lge~Wk~Ls~eeK~~Y~~~A~~-~k~~y~k~~~~Y~~~~~~ 129 (147)
..||.-|+++|+++++.|...... ..+.|-..+..|+.+...
T Consensus 78 ~vLGk~~k~aspeQ~~~F~~aF~~yl~q~Y~~aL~~Y~~q~~~ 120 (202)
T COG2854 78 LVLGKYYKTASPEQRQAFFKAFRTYLEQTYGQALLDYKGQTLK 120 (202)
T ss_pred HHhccccccCCHHHHHHHHHHHHHHHHHHHHHHHHHccCCCce
Confidence 458999999999999999987776 678899999999887543
No 29
>PF13875 DUF4202: Domain of unknown function (DUF4202)
Probab=67.97 E-value=13 Score=29.12 Aligned_cols=40 Identities=23% Similarity=0.379 Sum_probs=33.9
Q ss_pred CHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhh
Q 032101 60 SAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKA 103 (147)
Q Consensus 60 sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~ 103 (147)
-+-++|+..+...+...|. -..+..+|...|..||+.-++
T Consensus 130 vacLVFL~~~f~~F~~~~d----eeK~v~Il~KTw~KMS~~g~~ 169 (185)
T PF13875_consen 130 VACLVFLEYYFEDFAAKHD----EEKIVDILRKTWRKMSERGHE 169 (185)
T ss_pred hHHHHhHHHHHHHHHhcCC----HHHHHHHHHHHHHHCCHHHHH
Confidence 3578999999999998882 357888999999999998765
No 30
>PF12881 NUT_N: NUT protein N terminus; InterPro: IPR024309 This domain is found in the N-terminal region of Nuclear Testis (NUT) proteins. It is also found in FAM22, which are a family of uncharacterised mammalian proteins.
Probab=67.86 E-value=16 Score=30.86 Aligned_cols=53 Identities=21% Similarity=0.281 Sum_probs=41.9
Q ss_pred CCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHH
Q 032101 59 PSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKL 112 (147)
Q Consensus 59 ~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~ 112 (147)
..||..|+.-+...+....|.++ +.|-....-+.|.-.|.-+|..|+++|++=
T Consensus 229 ~EAlSCFLIpvLrsLar~kPtMt-lEeGl~ra~qEW~~~SnfdRmifyemaekF 281 (328)
T PF12881_consen 229 AEALSCFLIPVLRSLARLKPTMT-LEEGLWRAVQEWQHTSNFDRMIFYEMAEKF 281 (328)
T ss_pred hhhhhhhHHHHHHHHHhcCCCcc-HHHHHHHHHHHhhccccccHHHHHHHHHHH
Confidence 35566666655555666678887 888888888999999999999999999774
No 31
>PRK10363 cpxP periplasmic repressor CpxP; Reviewed
Probab=54.73 E-value=52 Score=25.27 Aligned_cols=40 Identities=10% Similarity=0.280 Sum_probs=31.3
Q ss_pred HHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101 84 SAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN 124 (147)
Q Consensus 84 ~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~ 124 (147)
.++.++-.+++.-|+||.|..|.+..+....++.. +..+.
T Consensus 111 Vem~k~~nqmy~lLTPEQKaq~~~~~~~rm~~~~~-~~~~q 150 (166)
T PRK10363 111 VEMAKVRNQMYRLLTPEQQAVLNEKHQQRMEQLRD-VTQWQ 150 (166)
T ss_pred HHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHH-HHhcC
Confidence 45667778999999999999999888887777754 54443
No 32
>PRK09706 transcriptional repressor DicA; Reviewed
Probab=49.25 E-value=48 Score=23.67 Aligned_cols=44 Identities=11% Similarity=0.071 Sum_probs=38.3
Q ss_pred HHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhhCC
Q 032101 85 AVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKKQV 128 (147)
Q Consensus 85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~~~ 128 (147)
+-...|-..|+.|+++++.............|...+++|-....
T Consensus 87 ~~~~~ll~~~~~L~~~~~~~~l~~l~~~~~~~~~~~~~~~~~~~ 130 (135)
T PRK09706 87 EDQKELLELFDALPESEQDAQLSEMRARVENFNKLFEELLKARK 130 (135)
T ss_pred HHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHh
Confidence 34567889999999999999999999999999999999977643
No 33
>PRK12751 cpxP periplasmic stress adaptor protein CpxP; Reviewed
Probab=47.94 E-value=44 Score=25.39 Aligned_cols=34 Identities=12% Similarity=0.262 Sum_probs=25.9
Q ss_pred HHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHH
Q 032101 85 AVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGK 118 (147)
Q Consensus 85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k 118 (147)
+..+...+++..|++++|..|.+..++...+...
T Consensus 118 ~~~~~~~qmy~lLTPEQra~l~~~~e~r~~~~~~ 151 (162)
T PRK12751 118 EMAKVRNQMYNLLTPEQKEALNKKHQERIEKLQQ 151 (162)
T ss_pred HHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHh
Confidence 3445567888999999999999888776665543
No 34
>PRK12750 cpxP periplasmic repressor CpxP; Reviewed
Probab=46.80 E-value=78 Score=24.10 Aligned_cols=35 Identities=17% Similarity=0.231 Sum_probs=28.7
Q ss_pred HHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHH
Q 032101 86 VGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKM 120 (147)
Q Consensus 86 isk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~ 120 (147)
+.+..-+++..|++++|..|.+.-.+..+.|...+
T Consensus 126 ~~~~~~~~~~vLTpEQRak~~e~~~~r~~~~~~~~ 160 (170)
T PRK12750 126 MLEKRHQMLSILTPEQKAKFQELQQERMQECQDKM 160 (170)
T ss_pred HHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHHH
Confidence 34456678999999999999999888888887766
No 35
>PF11304 DUF3106: Protein of unknown function (DUF3106); InterPro: IPR021455 Some members in this family of proteins are annotated as transmembrane proteins; however, this cannot be confirmed. Currently no function is known.
Probab=44.51 E-value=98 Score=21.69 Aligned_cols=9 Identities=33% Similarity=1.180 Sum_probs=3.5
Q ss_pred hcCCChhhh
Q 032101 94 WKSMSPAEK 102 (147)
Q Consensus 94 Wk~Ls~eeK 102 (147)
|.+||++++
T Consensus 56 W~~LspeqR 64 (107)
T PF11304_consen 56 WAALSPEQR 64 (107)
T ss_pred HHhCCHHHH
Confidence 333333333
No 36
>PF00887 ACBP: Acyl CoA binding protein; InterPro: IPR000582 Acyl-CoA-binding protein (ACBP) is a small (10 Kd) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters []. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor []. ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species []. Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal enoyl-CoA Delta(3), Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats (IPR002110 from INTERPRO) []. The ACB domain consists of four alpha-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein []. Other proteins containing an ACB domain include: Endozepine-like peptide (ELP) (gene DBIL5) from mouse []. ELP is a testis-specific ACBP homologue that may be involved in the energy metabolism of the mature sperm. MA-DBI, a transmembrane protein of unknown function which has been found in mammals. MA-DBI contains a N-terminal ACB domain. 
DRS-1 [], a human protein of unknown function that contains an N-terminal ACB domain and a C-terminal enoyl-CoA isomerase/hydratase domain.; GO: 0000062 fatty-acyl-CoA binding; PDB: 2CB8_A 2FJ9_A 2LBB_A 1ST7_A 3EPY_B 2FDQ_C 1NTI_A 1HB8_A 1ACA_A 1NVL_A ....
Probab=39.45 E-value=58 Score=21.58 Aligned_cols=53 Identities=17% Similarity=0.340 Sum_probs=32.6
Q ss_pred HHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCC----hhhhhhHHHHHHHHHHHH
Q 032101 62 FFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMS----PAEKAPYESKAEKLKSEY 116 (147)
Q Consensus 62 y~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls----~eeK~~Y~~~A~~~k~~y 116 (147)
|-+|.+.....+....|+..+ -+.+.--+.|+.|. ++-+..|.+...+....|
T Consensus 30 YalyKQAt~Gd~~~~~P~~~d--~~~~~K~~AW~~l~gms~~eA~~~Yi~~v~~~~~~~ 86 (87)
T PF00887_consen 30 YALYKQATHGDCDTPRPGFFD--IEGRAKWDAWKALKGMSKEEAMREYIELVEELIPKY 86 (87)
T ss_dssp HHHHHHHHTSS--S-CTTTTC--HHHHHHHHHHHTTTTTHHHHHHHHHHHHHHHHHHHH
T ss_pred HHHHHHHHhCCCcCCCCcchh--HHHHHHHHHHHHccCCCHHHHHHHHHHHHHHHHHhc
Confidence 667777666655566677643 44555567798776 555667777777666555
No 37
>PF06945 DUF1289: Protein of unknown function (DUF1289); InterPro: IPR010710 This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N terminus. The function of this family is unknown.
Probab=38.47 E-value=43 Score=20.34 Aligned_cols=24 Identities=25% Similarity=0.488 Sum_probs=17.7
Q ss_pred HHHHHHHHHHHhcCCChhhhhhHHHHHHH
Q 032101 83 VSAVGKAAGGKWKSMSPAEKAPYESKAEK 111 (147)
Q Consensus 83 ~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~ 111 (147)
..||.. |..|++++|.........
T Consensus 24 ~dEI~~-----W~~~s~~er~~i~~~l~~ 47 (51)
T PF06945_consen 24 LDEIRD-----WKSMSDDERRAILARLRA 47 (51)
T ss_pred HHHHHH-----HhhCCHHHHHHHHHHHHH
Confidence 456665 999999998877665543
No 38
>KOG1610 consensus Corticosteroid 11-beta-dehydrogenase and related short chain-type dehydrogenases [Secondary metabolites biosynthesis, transport and catabolism; General function prediction only]
Probab=36.11 E-value=1.1e+02 Score=26.01 Aligned_cols=57 Identities=18% Similarity=0.367 Sum_probs=39.9
Q ss_pred HHHHHHHHHHHh-------CCC-----CCcHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101 65 FLEEFRKTFKKE-------NPN-----VTAVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN 124 (147)
Q Consensus 65 F~~e~r~~ik~e-------~P~-----~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~ 124 (147)
|+...|.++..= -|+ +.+...+.+.+.++|..|+++.|+.|-+.+..+ |+..+..|.
T Consensus 188 f~D~lR~EL~~fGV~VsiiePG~f~T~l~~~~~~~~~~~~~w~~l~~e~k~~YGedy~~~---~~~~~~~~~ 256 (322)
T KOG1610|consen 188 FSDSLRRELRPFGVKVSIIEPGFFKTNLANPEKLEKRMKEIWERLPQETKDEYGEDYFED---YKKSLEKYL 256 (322)
T ss_pred HHHHHHHHHHhcCcEEEEeccCccccccCChHHHHHHHHHHHhcCCHHHHHHHHHHHHHH---HHHHHHhhh
Confidence 777777776522 122 323478889999999999999999998877554 455555554
No 39
>PF01352 KRAB: KRAB box; InterPro: IPR001909 The Krueppel-associated box (KRAB) is a domain of around 75 amino acids that is found in the N-terminal part of about one third of eukaryotic Krueppel-type C2H2 zinc finger proteins (ZFPs) []. It is enriched in charged amino acids and can be divided into subregions A and B, which are predicted to fold into two amphipathic alpha-helices. The KRAB A and B boxes can be separated by variable spacer segments and many KRAB proteins contain only the A box []. The functions currently known for members of the KRAB-containing protein family include transcriptional repression of RNA polymerase I, II, and III promoters, binding and splicing of RNA, and control of nucleolus function. The KRAB domain functions as a transcriptional repressor when tethered to the template DNA by a DNA-binding domain. A sequence of 45 amino acids in the KRAB A subdomain has been shown to be necessary and sufficient for transcriptional repression. The B box does not repress by itself but does potentiate the repression exerted by the KRAB A subdomain [, ]. Gene silencing requires the binding of the KRAB domain to the RING-B box-coiled coil (RBCC) domain of the KAP-1/TIF1-beta corepressor. As KAP-1 binds to the heterochromatin proteins HP1, it has been proposed that the KRAB-ZFP-bound target gene could be silenced following recruitment to heterochromatin [, ]. KRAB-ZFPs probably constitute the single largest class of transcription factors within the human genome []. Although the function of KRAB-ZFPs is largely unknown, they appear to play important roles during cell differentiation and development. The KRAB domain is generally encoded by two exons. The regions coded by the two exons are known as KRAB-A and KRAB-B.; GO: 0003676 nucleic acid binding, 0006355 regulation of transcription, DNA-dependent, 0005622 intracellular; PDB: 1V65_A.
Probab=35.53 E-value=29 Score=20.23 Aligned_cols=28 Identities=21% Similarity=0.369 Sum_probs=15.9
Q ss_pred HHHHHHHHH-HHhcCCChhhhhhHHHHHH
Q 032101 83 VSAVGKAAG-GKWKSMSPAEKAPYESKAE 110 (147)
Q Consensus 83 ~~eisk~lg-e~Wk~Ls~eeK~~Y~~~A~ 110 (147)
|.+|+--++ +.|..|.+.+|..|.+.-.
T Consensus 3 f~Dvav~fs~eEW~~L~~~Qk~ly~dvm~ 31 (41)
T PF01352_consen 3 FEDVAVYFSQEEWELLDPAQKNLYRDVML 31 (41)
T ss_dssp ----TT---HHHHHTS-HHHHHHHHHHHH
T ss_pred EEEEEEEcChhhcccccceecccchhHHH
Confidence 445554444 5699999999999987653
No 40
>PF12650 DUF3784: Domain of unknown function (DUF3784); InterPro: IPR017259 This group represents an uncharacterised conserved protein.
Probab=34.85 E-value=25 Score=23.84 Aligned_cols=15 Identities=40% Similarity=0.718 Sum_probs=13.2
Q ss_pred hcCCChhhhhhHHHH
Q 032101 94 WKSMSPAEKAPYESK 108 (147)
Q Consensus 94 Wk~Ls~eeK~~Y~~~ 108 (147)
|+.||+|||+.|...
T Consensus 26 yntms~eEk~~~D~~ 40 (97)
T PF12650_consen 26 YNTMSKEEKEKYDKK 40 (97)
T ss_pred cccCCHHHHHHhhHH
Confidence 899999999999754
No 41
>TIGR00787 dctP tripartite ATP-independent periplasmic transporter solute receptor, DctP family. TRAP-T (Tripartite ATP-independent Periplasmic Transporter) family proteins generally consist of three components, and these systems have so far been found in Gram-negative bacteria, Gram-positive bacteria and archaea. The best characterized example is the DctPQM system of Rhodobacter capsulatus, a C4 dicarboxylate (malate, fumarate, succinate) transporter. This model represents the DctP family, one of at least three major families of extracytoplasmic solute receptors for TRAP family transporters. Others are the SnoM family (see pfam03480) and the TAXI (TRAP-associated extracytoplasmic immunogenic) family.
Probab=31.82 E-value=99 Score=24.31 Aligned_cols=28 Identities=25% Similarity=0.310 Sum_probs=21.6
Q ss_pred HHHhcCCChhhhhhHHHHHHHHHHHHHH
Q 032101 91 GGKWKSMSPAEKAPYESKAEKLKSEYGK 118 (147)
Q Consensus 91 ge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k 118 (147)
.+.|..||++.|....+.+...-..+..
T Consensus 213 ~~~~~~L~~e~q~~i~~a~~~~~~~~~~ 240 (257)
T TIGR00787 213 KAFWKSLPPDLQAVVKEAAKEAGEYQRK 240 (257)
T ss_pred HHHHhcCCHHHHHHHHHHHHHHHHHHHH
Confidence 5779999999999998877765444443
No 42
>cd07081 ALDH_F20_ACDH_EutE-like Coenzyme A acylating aldehyde dehydrogenase (ACDH), Ethanolamine utilization protein EutE, and related proteins. Coenzyme A acylating aldehyde dehydrogenase (ACDH), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, acetylating (EC=1.2.1.10), functions as a single enzyme (such as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium) or as part of a multifunctional enzyme to convert acetaldehyde into acetyl-CoA. The E. coli aldehyde-alcohol dehydrogenase includes the functional domains, alcohol dehydrogenase (ADH), ACDH, and pyruvate-formate-lyase deactivase; and the Entamoeba histolytica aldehyde-alcohol dehydrogenase 2 (ALDH20A1) includes the functional domains ADH and ACDH, and may be critical enzymes in the fermentative pathway.
Probab=31.05 E-value=1.2e+02 Score=26.39 Aligned_cols=40 Identities=13% Similarity=-0.030 Sum_probs=32.2
Q ss_pred HHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101 85 AVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN 124 (147)
Q Consensus 85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~ 124 (147)
+.++..-..|+.++.++|..+...+....+++..++....
T Consensus 6 ~~A~~A~~~W~~~~~~~R~~iL~~~a~~l~~~~~ela~~~ 45 (439)
T cd07081 6 AAAKVAQQGLSCKSQEMVDLIFRAAAEAAEDARIDLAKLA 45 (439)
T ss_pred HHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 4455566789999999999999988888888887777663
No 43
>COG4281 ACB Acyl-CoA-binding protein [Lipid metabolism]
Probab=31.03 E-value=52 Score=22.28 Aligned_cols=61 Identities=20% Similarity=0.390 Sum_probs=38.1
Q ss_pred CCCCCCCH-----HHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCC----hhhhhhHHHHHHHHHHHH
Q 032101 54 KPKRPPSA-----FFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMS----PAEKAPYESKAEKLKSEY 116 (147)
Q Consensus 54 ~PKRP~sA-----y~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls----~eeK~~Y~~~A~~~k~~y 116 (147)
.+.+|.|- |.||-+..-.....+-|++. .-+.+.--+.|.+|- ++-++.|.....+.+..|
T Consensus 16 L~~kP~~d~LLkLYAL~KQ~s~GD~~~ekPG~~--d~~gr~K~eAW~~LKGksqedA~qeYialVeeLkak~ 85 (87)
T COG4281 16 LSEKPSNDELLKLYALFKQGSVGDNDGEKPGFF--DIVGRYKYEAWAGLKGKSQEDARQEYIALVEELKAKY 85 (87)
T ss_pred hccCCCcHHHHHHHHHHHhccccccCCCCCCcc--ccccchhHHHHhhccCccHHHHHHHHHHHHHHHHhhc
Confidence 35666665 66666655544555668774 345566668897765 555667777777666544
No 44
>PF05388 Carbpep_Y_N: Carboxypeptidase Y pro-peptide; InterPro: IPR008442 This signature is found at the N terminus of carboxypeptidase Y, which belong to MEROPS peptidase family S10. This region contains the signal peptide and pro-peptide regions [,].; GO: 0004185 serine-type carboxypeptidase activity, 0005773 vacuole
Probab=30.41 E-value=75 Score=22.70 Aligned_cols=29 Identities=17% Similarity=0.193 Sum_probs=25.3
Q ss_pred HHHHHHHHHHHhcCCChhhhhhHHHHHHH
Q 032101 83 VSAVGKAAGGKWKSMSPAEKAPYESKAEK 111 (147)
Q Consensus 83 ~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~ 111 (147)
+..+++.+++.++.|+.+.|+.|.+....
T Consensus 45 ~~~~~~~l~e~l~~Lt~e~k~~W~E~~~~ 73 (113)
T PF05388_consen 45 LEKISKYLNEPLKSLTSEAKALWDEMMLL 73 (113)
T ss_pred HHHHHHHHHHHHhhccHHHHHHHHHHHHH
Confidence 56677889999999999999999998754
No 45
>PRK10236 hypothetical protein; Provisional
Probab=30.38 E-value=49 Score=26.83 Aligned_cols=26 Identities=15% Similarity=0.351 Sum_probs=21.2
Q ss_pred HHHHHHHHhcCCChhhhhhHHHHHHH
Q 032101 86 VGKAAGGKWKSMSPAEKAPYESKAEK 111 (147)
Q Consensus 86 isk~lge~Wk~Ls~eeK~~Y~~~A~~ 111 (147)
+.+.+.+.|..||++|++.+.+.-..
T Consensus 118 l~kll~~a~~kms~eE~~~L~~~l~~ 143 (237)
T PRK10236 118 LEQFLRNTWKKMDEEHKQEFLHAVDA 143 (237)
T ss_pred HHHHHHHHHHHCCHHHHHHHHHHHhh
Confidence 46889999999999999887765443
No 46
>KOG1827 consensus Chromatin remodeling complex RSC, subunit RSC1/Polybromo and related proteins [Chromatin structure and dynamics; Transcription]
Probab=29.81 E-value=3.9 Score=37.44 Aligned_cols=44 Identities=25% Similarity=0.404 Sum_probs=39.6
Q ss_pred CCCHHHHHHHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhh
Q 032101 58 PPSAFFVFLEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEK 102 (147)
Q Consensus 58 P~sAy~lF~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK 102 (147)
-.++|++|..+.+..+-..+|++. +++++.++|.-|..|+...+
T Consensus 552 ~~~~~~~~s~~~~~~~~~~np~v~-~~~~~~~vg~~~~~lp~~~k 595 (629)
T KOG1827|consen 552 SPEPYILDSIENRTIIWFENPTVG-FGEVSIIVGNDWDKLPNINK 595 (629)
T ss_pred CCccccccccccCceeeeeCCCcc-cceeEEeecCCcccCccccc
Confidence 558899999999999999999997 89999999999999994444
No 47
>COG1638 DctP TRAP-type C4-dicarboxylate transport system, periplasmic component [Carbohydrate transport and metabolism]
Probab=29.33 E-value=1.1e+02 Score=25.68 Aligned_cols=35 Identities=14% Similarity=0.232 Sum_probs=27.1
Q ss_pred HHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhh
Q 032101 91 GGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNK 125 (147)
Q Consensus 91 ge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~ 125 (147)
...|..||++.+....+.+.+......+...+++.
T Consensus 244 ~~~w~~L~~e~q~il~~aa~e~~~~~~~~~~~~e~ 278 (332)
T COG1638 244 KAFWDSLPEEDQTILLEAAKEAAEEQRKLVEELED 278 (332)
T ss_pred HHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 57899999999999999888866666555555544
No 48
>PF06394 Pepsin-I3: Pepsin inhibitor-3-like repeated domain; InterPro: IPR010480 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. The members of this group of proteins belong to MEROPS inhibitor family I33, clan IR; the nematode aspartyl protease inhibitors or Aspins. They are restricted to parasitic nematode species. Structural features common to the nematode Aspins include the presence of a signal peptide sequence and the conservation of all four cysteine residues in the mature protein. The Y[V.A]RDLT sequence motif has been suggested as being of crucial functional importance in several filarial nematode inhibitors [], this sequence is not conserved in Tco-API-1 from Trichostrongylus colubriformis (Black scour worm) and it has been demonstrated that Tco-API-1, is not an Aspin as it does not inhibit porcine pepsin []. Related inhibitors from Onchocerca volvulus, Ov33 [] and Ascaris suum (Pig roundworm), PI-3 [] inhibit the in vitro activity of aspartyl proteases such as pepsin and cathepsin E (MEROPS peptidase family A1). Aspin may facilitate the safe passage of the eggs of Ascaris through the host stomach without digestion by pepsin [, ]. 
The other parasitic nematodes known to express homologous proteins do not pass through the stomach of their hosts []. Several proteins in the family are potent allergens in mammals. The three-dimensional structures of pepsin inhibitor-3 (PI-3) from A. suum and of the complex between PI-3 and porcine pepsin at 1.75 A and 2.45 A resolution, respectively, have revealed the mechanism of aspartic protease inhibition. PI-3 has a new fold consisting of two identical domains, each comprising an antiparallel beta-sheet flanked by an alpha-helix. In the enzyme-inhibitor complex, the N-terminal beta-strand of PI-3 pairs with one strand of the 'active site flap' (residues 70-82) of pepsin, thus forming an eight-stranded beta-sheet that spans the two proteins. PI-3 has a novel mode of inhibition, using its N-terminal residues to occupy and therefore block the first three binding pockets in pepsin for substrate residues C-terminal to the scissile bond (S1'-S3') [].; PDB: 1F32_A 1F34_B.
Probab=28.86 E-value=59 Score=21.74 Aligned_cols=31 Identities=23% Similarity=0.464 Sum_probs=18.4
Q ss_pred cCCChhhhhhHHHHHHHHHHHHHHHHHHHhhhCCCCCCC
Q 032101 95 KSMSPAEKAPYESKAEKLKSEYGKKMNAYNKKQVTNLVP 133 (147)
Q Consensus 95 k~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~~~~~~~~ 133 (147)
+.|+++|+ ..+..|.+++..|...+...-..
T Consensus 38 R~Lt~~E~--------~eL~~y~~~v~~y~~~l~~~iq~ 68 (76)
T PF06394_consen 38 RDLTPDEQ--------QELKTYQKKVAAYKEQLQQQIQE 68 (76)
T ss_dssp EE--HHHH--------HHHHHHHHHHHHHHHHHTT----
T ss_pred ccCCHHHH--------HHHHHHHHHHHHHHHHHHHHHHH
Confidence 45666665 45678888888888876655443
No 49
>cd07133 ALDH_CALDH_CalB Coniferyl aldehyde dehydrogenase-like. Coniferyl aldehyde dehydrogenase (CALDH, EC=1.2.1.68) of Pseudomonas sp. strain HR199 (CalB) which catalyzes the NAD+-dependent oxidation of coniferyl aldehyde to ferulic acid, and similar sequences, are present in this CD.
Probab=26.37 E-value=1.8e+02 Score=25.07 Aligned_cols=42 Identities=14% Similarity=-0.096 Sum_probs=32.0
Q ss_pred HHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhh
Q 032101 84 SAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNK 125 (147)
Q Consensus 84 ~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~ 125 (147)
.+.++..-..|+.++..+|..+........+++..++.....
T Consensus 4 ~~~a~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~el~~~~~ 45 (434)
T cd07133 4 LERQKAAFLANPPPSLEERRDRLDRLKALLLDNQDALAEAIS 45 (434)
T ss_pred HHHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 345556666799999999998888888888888877776544
No 50
>PRK10455 periplasmic protein; Reviewed
Probab=26.33 E-value=1.4e+02 Score=22.59 Aligned_cols=28 Identities=18% Similarity=0.311 Sum_probs=21.4
Q ss_pred HHHHHHHHHhcCCChhhhhhHHHHHHHH
Q 032101 85 AVGKAAGGKWKSMSPAEKAPYESKAEKL 112 (147)
Q Consensus 85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~ 112 (147)
+..+.-.++|..|++++|..|.+..++.
T Consensus 118 ~~~~~~~qiy~vLTPEQr~q~~~~~ekr 145 (161)
T PRK10455 118 AHMETQNKIYNVLTPEQKKQFNANFEKR 145 (161)
T ss_pred HHHHHHHHHHHhCCHHHHHHHHHHHHHH
Confidence 4556667789999999999998765443
No 51
>PHA02662 ORF131 putative membrane protein; Provisional
Probab=25.37 E-value=1.6e+02 Score=23.68 Aligned_cols=45 Identities=16% Similarity=0.262 Sum_probs=33.9
Q ss_pred CHHHHHHHHHHHHHHHh--------------------------------CCCCCcHHHHHHHHHHHhcCCChhhhhhH
Q 032101 60 SAFFVFLEEFRKTFKKE--------------------------------NPNVTAVSAVGKAAGGKWKSMSPAEKAPY 105 (147)
Q Consensus 60 sAy~lF~~e~r~~ik~e--------------------------------~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y 105 (147)
+=|-+|+..+-..+-.- +...+ |.-+.+.+.|....|++++|..-
T Consensus 22 tLY~lf~~ryL~kLs~~s~~a~a~C~IhIG~I~g~~k~C~v~V~N~C~sna~~s-f~lll~Al~Et~~~Lp~~qK~~i 98 (226)
T PHA02662 22 SLYDVFLARFLRRLAARAAPASAACAVRVGAVRGRLRNCELVVLNRCHTDAADA-LALASAALAETLAELPRADRLAV 98 (226)
T ss_pred hHHHHHHHHHHHHHHhccCccccccceEEeeEeeecCCceEEEEecccCCHHHH-HHHHHHHHHHHHHhCCHHHHHHH
Confidence 34999999887776421 22332 88889999999999999998653
No 52
>PF12290 DUF3802: Protein of unknown function (DUF3802); InterPro: IPR020979 This family of proteins is found in bacteria and are typically between 114 and 143 amino acids in length. There is a conserved KNLFD sequence motif. The annotation with this family suggests that it may be the B subunit of bacterial type IIA DNA topoisomerase but there is no evidence to support this annotation.
Probab=25.25 E-value=2.6e+02 Score=20.10 Aligned_cols=40 Identities=13% Similarity=0.260 Sum_probs=29.7
Q ss_pred HHHHhCCCCCc-------------HHHHHHHHHHHhcCCChhhhhhHHHHHHH
Q 032101 72 TFKKENPNVTA-------------VSAVGKAAGGKWKSMSPAEKAPYESKAEK 111 (147)
Q Consensus 72 ~ik~e~P~~~~-------------~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~ 111 (147)
.+..+||+++. +.++...|+..|...+-.+..-|.+..--
T Consensus 49 ~vc~Qnp~L~~~~R~~iirE~Daiv~DLeEVLa~V~~~~aT~eQ~~Fi~Ef~~ 101 (113)
T PF12290_consen 49 AVCEQNPELEFSQRFQIIREADAIVYDLEEVLASVWNQKATNEQIAFIEEFIG 101 (113)
T ss_pred HHHccCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHcCCCCHHHHHHHHHHHH
Confidence 45677888862 55777889999999888888777765544
No 53
>PF03480 SBP_bac_7: Bacterial extracellular solute-binding protein, family 7; InterPro: IPR018389 This family of proteins are involved in binding extracellular solutes for transport across the bacterial cytoplasmic membrane. This family includes a C4-dicarboxylate-binding protein DctP [, ] and the sialic acid-binding protein SiaP. The structure of the SiaP receptor has revealed an overall topology similar to ATP binding cassette ESR (extracytoplasmic solute receptors) proteins []. Upon binding of sialic acid, SiaP undergoes domain closure about a hinge region and kinking of an alpha-helix hinge component [].; GO: 0006810 transport, 0030288 outer membrane-bounded periplasmic space; PDB: 2HZK_C 2HZL_B 2HPG_C 2XWI_A 2XWK_A 2WX9_A 2CEY_A 2WYP_A 3B50_A 2CEX_B ....
Probab=25.24 E-value=1.1e+02 Score=24.34 Aligned_cols=31 Identities=10% Similarity=0.288 Sum_probs=22.0
Q ss_pred HHHhcCCChhhhhhHHHHHHHHHHHHHHHHH
Q 032101 91 GGKWKSMSPAEKAPYESKAEKLKSEYGKKMN 121 (147)
Q Consensus 91 ge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~ 121 (147)
.+.|..||++.|+...+.+.+....+.....
T Consensus 213 ~~~w~~L~~e~q~~l~~~~~~~~~~~~~~~~ 243 (286)
T PF03480_consen 213 KDWWDSLPDEDQEALDDAADEAEARAREYYE 243 (286)
T ss_dssp HHHHHHS-HHHHHHHHHHHHHHHHHHHHHHH
T ss_pred HHHHhcCCHHHHHHHHHHHHHHHHHHHHHHH
Confidence 4679999999999999887776554444333
No 54
>cd07122 ALDH_F20_ACDH Coenzyme A acylating aldehyde dehydrogenase (ACDH), ALDH family 20-like. Coenzyme A acylating aldehyde dehydrogenase (ACDH, EC=1.2.1.10), an NAD+ and CoA-dependent acetaldehyde dehydrogenase, functions as a single enzyme (such as the Ethanolamine utilization protein, EutE, in Salmonella typhimurium) or as part of a multifunctional enzyme to convert acetaldehyde into acetyl-CoA. The E. coli aldehyde-alcohol dehydrogenase includes the functional domains, alcohol dehydrogenase (ADH), ACDH, and pyruvate-formate-lyase deactivase; and the Entamoeba histolytica aldehyde-alcohol dehydrogenase 2 (ALDH20A1) includes the functional domains ADH and ACDH and may be critical enzymes in the fermentative pathway.
Probab=23.66 E-value=1.9e+02 Score=25.16 Aligned_cols=40 Identities=5% Similarity=0.070 Sum_probs=31.0
Q ss_pred HHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101 85 AVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN 124 (147)
Q Consensus 85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~ 124 (147)
+.++..-..|..++.++|..+...+....+++..++....
T Consensus 6 ~~A~~A~~~W~~~~~~eR~~~L~~~a~~l~~~~eela~~~ 45 (436)
T cd07122 6 ERARKAQREFATFSQEQVDKIVEAVAWAAADAAEELAKMA 45 (436)
T ss_pred HHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 3445555679999999999999888888888887776664
No 55
>cd07132 ALDH_F3AB Aldehyde dehydrogenase family 3 members A1, A2, and B1 and related proteins. NAD(P)+-dependent, aldehyde dehydrogenase, family 3 members A1 and B1 (ALDH3A1, ALDH3B1, EC=1.2.1.5) and fatty aldehyde dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3), and similar sequences are included in this CD. Human ALDH3A1 is a homodimer with a critical role in cellular defense against oxidative stress; it catalyzes the oxidation of various cellular membrane lipid-derived aldehydes. Corneal crystalline ALDH3A1 protects the cornea and underlying lens against UV-induced oxidative stress. Human ALDH3A2, a microsomal homodimer, catalyzes the oxidation of long-chain aliphatic aldehydes to fatty acids. Human ALDH3B1 is highly expressed in the kidney and liver and catalyzes the oxidation of various medium- and long-chain saturated and unsaturated aliphatic aldehydes.
Probab=23.63 E-value=2e+02 Score=24.91 Aligned_cols=40 Identities=8% Similarity=-0.111 Sum_probs=30.7
Q ss_pred HHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101 85 AVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN 124 (147)
Q Consensus 85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~ 124 (147)
+.++..-..|..++..+|..+........+++..++.+-.
T Consensus 5 ~~A~~A~~~w~~~~~~~R~~~L~~~a~~l~~~~~~l~~~~ 44 (443)
T cd07132 5 RRAREAFSSGKTRPLEFRIQQLEALLRMLEENEDEIVEAL 44 (443)
T ss_pred HHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHhHHHHHHHH
Confidence 4455566779999999999999888888787777666543
No 56
>PF15581 Imm35: Immunity protein 35
Probab=23.17 E-value=93 Score=21.52 Aligned_cols=21 Identities=5% Similarity=0.311 Sum_probs=17.0
Q ss_pred HHHHHHHHHHHhcCCChhhhh
Q 032101 83 VSAVGKAAGGKWKSMSPAEKA 103 (147)
Q Consensus 83 ~~eisk~lge~Wk~Ls~eeK~ 103 (147)
+.-+...|.+.|+.|++++=.
T Consensus 32 i~~l~~lIe~eWRGl~~~qV~ 52 (93)
T PF15581_consen 32 IRNLESLIEHEWRGLPEEQVL 52 (93)
T ss_pred HHHHHHHHHHHHcCCCHHHHH
Confidence 556788999999999987643
No 57
>cd01145 TroA_c Periplasmic binding protein TroA_c. These proteins are predicted to function as initial receptors in the ABC metal ion uptake in eubacteria and archaea. They belong to the TroA superfamily of helical backbone metal receptor proteins that share a distinct fold and ligand binding mechanism. A typical TroA protein is comprised of two globular subdomains connected by a single helix and can bind its ligands in the cleft between these domains.
Probab=23.09 E-value=1.4e+02 Score=22.74 Aligned_cols=48 Identities=15% Similarity=0.289 Sum_probs=38.6
Q ss_pred cHHHHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHhhhCCC
Q 032101 82 AVSAVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYNKKQVT 129 (147)
Q Consensus 82 ~~~eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~~~~~~ 129 (147)
+...++..|++....+.++.+..|.+.+.....+.......|......
T Consensus 116 ~~~~~a~~I~~~L~~~dP~~~~~y~~N~~~~~~~l~~l~~~~~~~l~~ 163 (203)
T cd01145 116 NAPALAKALADALIELDPSEQEEYKENLRVFLAKLNKLLREWERQFEG 163 (203)
T ss_pred HHHHHHHHHHHHHHHhCcccHHHHHHHHHHHHHHHHHHHHHHHHHhhc
Confidence 356778888899999999999999999988877777777777766554
No 58
>cd07087 ALDH_F3-13-14_CALDH-like ALDH subfamily: Coniferyl aldehyde dehydrogenase, ALDH families 3, 13, and 14, and other related proteins. ALDH subfamily which includes NAD(P)+-dependent, aldehyde dehydrogenase, family 3 member A1 and B1 (ALDH3A1, ALDH3B1, EC=1.2.1.5) and fatty aldehyde dehydrogenase, family 3 member A2 (ALDH3A2, EC=1.2.1.3), and also plant ALDH family members ALDH3F1, ALDH3H1, and ALDH3I1, fungal ALDH14 (YMR110C) and the protozoan family 13 member (ALDH13), as well as coniferyl aldehyde dehydrogenases (CALDH, EC=1.2.1.68), and other similar sequences, such as the Pseudomonas putida benzaldehyde dehydrogenase I that is involved in the metabolism of mandelate.
Probab=21.21 E-value=2.4e+02 Score=24.17 Aligned_cols=40 Identities=13% Similarity=-0.078 Sum_probs=29.9
Q ss_pred HHHHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHHh
Q 032101 85 AVGKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAYN 124 (147)
Q Consensus 85 eisk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y~ 124 (147)
+.++..-..|..++..+|..+...+....+++..++.+..
T Consensus 5 ~~a~~a~~~w~~~~~~~R~~~L~~~a~~l~~~~~el~~~~ 44 (426)
T cd07087 5 ARLRETFLTGKTRSLEWRKAQLKALKRMLTENEEEIAAAL 44 (426)
T ss_pred HHHHHHHHhcCCCCHHHHHHHHHHHHHHHHHhHHHHHHHH
Confidence 3445556679999999998888888888877777766554
No 59
>cd07085 ALDH_F6_MMSDH Methylmalonate semialdehyde dehydrogenase and ALDH family members 6A1 and 6B2. Methylmalonate semialdehyde dehydrogenase (MMSDH, EC=1.2.1.27) [acylating] from Bacillus subtilis is involved in valine metabolism and catalyses the NAD+- and CoA-dependent oxidation of methylmalonate semialdehyde into propionyl-CoA. Mitochondrial human MMSDH ALDH6A1 and Arabidopsis MMSDH ALDH6B2 are also present in this CD.
Probab=20.96 E-value=2.4e+02 Score=24.49 Aligned_cols=37 Identities=19% Similarity=0.159 Sum_probs=29.0
Q ss_pred HHHHHHHhcCCChhhhhhHHHHHHHHHHHHHHHHHHH
Q 032101 87 GKAAGGKWKSMSPAEKAPYESKAEKLKSEYGKKMNAY 123 (147)
Q Consensus 87 sk~lge~Wk~Ls~eeK~~Y~~~A~~~k~~y~k~~~~Y 123 (147)
++.....|..++.++|..+...+....+++..++..-
T Consensus 47 A~~A~~~w~~~~~~~R~~~L~~~a~~l~~~~~el~~~ 83 (478)
T cd07085 47 AKAAFPAWSATPVLKRQQVMFKFRQLLEENLDELARL 83 (478)
T ss_pred HHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 4445567999999999999988888887777666553
No 60
>PTZ00037 DnaJ_C chaperone protein; Provisional
Probab=20.66 E-value=2e+02 Score=25.08 Aligned_cols=43 Identities=19% Similarity=0.184 Sum_probs=31.7
Q ss_pred HHHHHHHHHHhCCCCCcHHHHHHHHHHHhcCCChhhhhhHHHH
Q 032101 66 LEEFRKTFKKENPNVTAVSAVGKAAGGKWKSMSPAEKAPYESK 108 (147)
Q Consensus 66 ~~e~r~~ik~e~P~~~~~~eisk~lge~Wk~Ls~eeK~~Y~~~ 108 (147)
-+.+|...+.-||+...-.+..+.|.+.|..|++.+|...++.
T Consensus 46 KkAYrkla~k~HPDk~~~~e~F~~i~~AYevLsD~~kR~~YD~ 88 (421)
T PTZ00037 46 KKAYRKLAIKHHPDKGGDPEKFKEISRAYEVLSDPEKRKIYDE 88 (421)
T ss_pred HHHHHHHHHHHCCCCCchHHHHHHHHHHHHHhccHHHHHHHhh
Confidence 3456666778899985335777889999999998876655554
No 61
>smart00271 DnaJ DnaJ molecular chaperone homology domain.
Probab=20.41 E-value=1.7e+02 Score=17.18 Aligned_cols=34 Identities=18% Similarity=0.227 Sum_probs=19.2
Q ss_pred HHHHHHHHHhCCCCCc-----HHHHHHHHHHHhcCCChh
Q 032101 67 EEFRKTFKKENPNVTA-----VSAVGKAAGGKWKSMSPA 100 (147)
Q Consensus 67 ~e~r~~ik~e~P~~~~-----~~eisk~lge~Wk~Ls~e 100 (147)
..++..++.-||+... ..+....|.+-|..|.+.
T Consensus 20 ~ay~~l~~~~HPD~~~~~~~~~~~~~~~l~~Ay~~L~~~ 58 (60)
T smart00271 20 KAYRKLALKYHPDKNPGDKEEAEEKFKEINEAYEVLSDP 58 (60)
T ss_pred HHHHHHHHHHCcCCCCCchHHHHHHHHHHHHHHHHHcCC
Confidence 3445555666787752 234555666666665543
Done!