Query 019122
Match_columns 346
No_of_seqs 150 out of 310
Neff 6.9
Searched_HMMs 46136
Date Fri Mar 29 06:52:04 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/019122.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/019122hhsearch_cdd -cpu 12 -v 0
 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF10536 PMD: Plant mobile dom  100.0 1.1E-33 2.4E-38  277.9   8.2  198  127-325     1-221 (363)
  2 PTZ00199 high mobility group p  99.3 2.8E-12 6.1E-17  102.8   4.4   70   25-94     16-86  (94)
  3 cd01390 HMGB-UBF_HMG-box HMGB-  99.1 3.8E-11 8.3E-16   89.0   3.4   62   32-94      1-62  (66)
  4 PF09011 HMG_box_2: HMG-box do   99.1 6.4E-11 1.4E-15   90.4   2.3   66   29-94      1-66  (73)
  5 PF00505 HMG_box: HMG (high mo   99.0 9.6E-11 2.1E-15   87.7   2.5   62   32-94      1-62  (69)
  6 cd01389 MATA_HMG-box MATA_HMG-  99.0 3.9E-10 8.5E-15   86.9   4.0   64   31-95      1-64  (77)
  7 cd01388 SOX-TCF_HMG-box SOX-TC  99.0 4.3E-10 9.2E-15   85.6   3.4   63   32-95      2-64  (72)
  8 KOG0381 HMG box-containing pro  99.0 5.7E-10 1.2E-14   89.3   4.2   66   28-94     17-84  (96)
  9 cd00084 HMG-box High Mobility   98.9 6.2E-10 1.3E-14   82.1   3.4   62   32-94      1-62  (66)
 10 PF09331 DUF1985: Domain of un   98.9 2.3E-09 5.1E-14   92.3   6.2  124  156-279    14-142 (142)
 11 smart00398 HMG high mobility g  98.9 1.5E-09 3.3E-14   81.0   3.8   63   31-94      1-63  (70)
 12 COG5648 NHP6B Chromatin-associ  98.5 6.8E-08 1.5E-12   86.8   3.2   68   26-94     65-132 (211)
 13 KOG0527 HMG-box transcription   98.2 9.5E-07 2.1E-11   85.7   3.3   64   31-95     62-125 (331)
 14 KOG0526 Nucleosome-binding fac  97.6 3.9E-05 8.4E-10   77.4   2.5   69   21-94    525-593 (615)
 15 KOG3248 Transcription factor T  95.3   0.013 2.9E-07   56.4   3.0   55   33-88    193-247 (421)
 16 KOG0528 HMG-box transcription   94.4   0.021 4.6E-07   57.5   2.0   64   33-97    327-390 (511)
 17 PF06382 DUF1074: Protein of u   92.1     0.1 2.2E-06   46.2   2.3   47   37-88     84-130 (183)
 18 PF03078 ATHILA: ATHILA ORF-1    89.6     3.5 7.5E-05   42.2  10.9  187  101-297    63-286 (458)
 19 KOG4715 SWI/SNF-related matrix  87.6    0.52 1.1E-05   45.4   3.2   63   31-94     64-126 (410)
 20 PF14887 HMG_box_5: HMG (high    84.1    0.82 1.8E-05   35.2   2.2   60   33-94      5-64  (85)
 21 KOG2746 HMG-box transcription   67.8     3.2   7E-05   43.9   2.0   62   33-95    183-246 (683)
 22 PF04690 YABBY: YABBY protein;   65.9     8.3 0.00018   34.3   3.9   48   28-76    118-165 (170)
 23 KOG2062 26S proteasome regulat  43.3      28 0.00061   37.7   4.0   54  113-183   565-620 (929)
 24 PF11304 DUF3106: Protein of u   41.1      47   0.001   27.1   4.3   15   68-82     35-49  (107)
 25 PF02919 Topoisom_I_N: Eukaryo   35.3      16 0.00035   33.6   0.7   42   41-82     63-113 (215)
 26 cd03490 Topoisomer_IB_N_1 Topo  34.1      37  0.0008   31.2   2.8   44   41-84     60-113 (217)
 27 cd00660 Topoisomer_IB_N Topois  33.5      47   0.001   30.5   3.4   44   41-84     62-114 (215)
 28 COG5648 NHP6B Chromatin-associ  32.7      22 0.00048   32.5   1.2   37   58-94    169-205 (211)
 29 cd03488 Topoisomer_IB_N_htopoI  32.6      49  0.0011   30.4   3.4   44   41-84     62-114 (215)
 30 cd03489 Topoisomer_IB_N_Ldtopo  30.7      55  0.0012   30.0   3.3   44   41-84     60-111 (212)
 31 PF11304 DUF3106: Protein of u   27.4 1.2E+02  0.0027   24.6   4.6   63   62-136    11-79  (107)
 32 PF06945 DUF1289: Protein of u   24.8      44 0.00096   23.4   1.3   20   71-90     30-49  (51)
 33 COG2920 DsrC Dissimilatory sul  24.7      59  0.0013   26.6   2.1   34   36-69     45-78  (111)
 34 KOG4370 Ral-GTPase effector RL  22.7 1.5E+02  0.0032   30.3   5.0   31   16-50     55-85  (514)
 35 PRK00114 hslO Hsp33-like chape  22.7 1.6E+02  0.0035   28.2   5.3   57   96-177   233-290 (293)
 36 PF12169 DNA_pol3_gamma3: DNA    22.6 1.2E+02  0.0025   25.2   3.8  129  167-306     1-134 (143)
 37 PF05494 Tol_Tol_Ttg2: Toluene   22.6      77  0.0017   27.5   2.7   30   59-88     40-69  (170)
 38 PF10234 Cluap1: Clusterin-ass   21.7      36 0.00079   32.4   0.5   31  125-155     2-37  (267)
 39 PF13875 DUF4202: Domain of un   20.4      67  0.0014   29.0   1.9   39   38-80    131-169 (185)
 40 PF03457 HA: Helicase associat   20.4      62  0.0014   23.5   1.5   16  113-128    52-67  (68)
 41 PF12650 DUF3784: Domain of un   20.2      64  0.0014   25.4   1.6   17   69-85     24-40  (97)
No 1
>PF10536 PMD: Plant mobile domain; InterPro: IPR019557 This entry represents a domain found in a variety of transposases [].
Probab=100.00 E-value=1.1e-33 Score=277.94 Aligned_cols=198 Identities=18% Similarity=0.303 Sum_probs=167.4
Q ss_pred Ccccccccc--ccccchHHHHHHHhccccccceEEECCeEeecCHhHHHHHhcccCCCCcccccCChhhHHHHHhhcC--
Q 019122 127 GFGSLLQLN--CGRLKRNLCGWLVEKIDIARCILQLNGVEVELSPKSFSYVMGISDGGKPLQLEGESSEVCAYVDNFT-- 202 (346)
Q Consensus 127 GFg~LL~i~--~~~l~~~L~~wL~~~~d~~t~~~~l~g~~i~it~~dV~~VLGLP~gG~~v~~~~~~~~~~~l~~~~~-- 202 (346)
|||+|+.|. ..++++.|+.+|+++|+++|++|+++++||+||++||.+|+|||+.|.+|......+ ..++.+.+.
T Consensus 1 ~~g~~~~i~~s~~~~~~~li~al~erW~~et~tF~~~~gEmtiTL~DV~~llGLpi~G~pv~~~~~~~-~~~~~~~ll~~ 79 (363)
T PF10536_consen 1 GFGILDAIMASRITIDRSLISALVERWDPETNTFHFPWGEMTITLEDVAMLLGLPIDGRPVTGPLPPD-WRDLCEELLGV 79 (363)
T ss_pred CchhHhhhhhhcCCCCHHHHHHHHHHhCcccCeeecccccccchhhhhhhccccccccccccCccccc-hhhHHHHHhcc
Confidence 899999999 899999999999999999999999999999999999999999999999998654433 333333332
Q ss_pred ------CCCCCcchHHHHHHHhhccCC-CchHhHHHHHHhhhhhccCCCCC-cccccccccccccCCCCcccchHHHHHH
Q 019122 203 ------PTSRGINITVLAGILQKLKSA-DDQFKVTFMMFALCTILCPPGGV-HISSGFLFSLKDVESIPKRNWATFCFHR 274 (346)
Q Consensus 203 ------~~~~~i~i~~L~~~l~~~~~~-~d~f~r~Fll~~i~~~L~Pts~~-~vs~~yl~~l~D~d~i~~ynW~~~Vl~~ 274 (346)
..+..+.+++|++.+...+++ .+.+.||||++++|++|||+++. +|+..|++++.|++.+++||||++||++
T Consensus 80 ~~~~~~~~~~~~~~~wl~~~~~~~~~~d~~~~~rAFll~~lg~~lfp~~~~~~v~~~~l~~~~~l~~~~~~~wg~a~La~ 159 (363)
T PF10536_consen 80 SPQIKSKKGSSIRLSWLEEFFSNRPEDDEEQYHRAFLLYWLGSFLFPDKSGDYVSPRYLPLAVDLARIKRYAWGSAVLAY 159 (363)
T ss_pred cccccccccccchhhheeccccccccchHHHHHHHHHHHhhhceeccCCCcceeeeeEEeeeeccccccccccHHHHHHH
Confidence 123456778898888655544 24899999999999999999988 8999999999999999999999999999
Q ss_pred HHHHHHhhhccC--ccccchhHHHHHHHHHhhccCCCcC-----C----CCCCCCccccCHH
Q 019122 275 LIQGITRHKEEQ--VAYVGGCLLYLQMLYFNSIVYGKLE-----R----DRSMCPLALWNVN 325 (346)
Q Consensus 275 L~~~i~k~~~~k--~~~i~GCl~lLqi~Yld~l~~g~~~-----~----~~~~ppI~~w~~~ 325 (346)
|++++.++..+. ..+++||+.|||+|+|||++.+... . +...|++..|.+.
T Consensus 160 ly~~L~~~~~~~~~~~~~~g~~~llq~W~werf~~~rP~~~~~~~~~~~~~~~P~~~rW~~~ 221 (363)
T PF10536_consen 160 LYRDLCKASRKSASQSNIGGPLWLLQLWAWERFPVGRPKLITAQKPNPIPDRPPRAARWCDR 221 (363)
T ss_pred HHHHHHHHhhhcccccccccceeeeccchhheeecccccccccccccccccCCCeeeeeecc
Confidence 999999988776 7899999999999999999977541 1 1233349999873
No 2
>PTZ00199 high mobility group protein; Provisional
Probab=99.28 E-value=2.8e-12 Score=102.85 Aligned_cols=70 Identities=17% Similarity=0.240 Sum_probs=64.6
Q ss_pred cCCCCCCCccCCcchhhhhHHHHHHHHhcCCCcc-hHHHHHHHhccCCCCCChhhhhhhhHHhhhcccccc
Q 019122 25 ASNDGGRKRQCRNGFIRYFGEVVRQIKANDGMAC-ITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKI 94 (346)
Q Consensus 25 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~ 94 (346)
+..|||.||||+||||.|+.+.|.++++.||+.. .+.+|.+.+|+.|++||++||.+|.+.|.+....|.
T Consensus 16 ~~kdp~~PKrP~sAY~~F~~~~R~~i~~~~P~~~~~~~evsk~ige~Wk~ls~eeK~~y~~~A~~dk~rY~ 86 (94)
T PTZ00199 16 KKKDPNAPKRALSAYMFFAKEKRAEIIAENPELAKDVAAVGKMVGEAWNKLSEEEKAPYEKKAQEDKVRYE 86 (94)
T ss_pred CCCCCCCCCCCCcHHHHHHHHHHHHHHHHCcCCcccHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHH
Confidence 3579999999999999999999999999999875 589999999999999999999999999999877764
No 3
>cd01390 HMGB-UBF_HMG-box HMGB-UBF_HMG-box, class II and III members of the HMG-box superfamily of DNA-binding proteins. These proteins bind the minor groove of DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions.
Probab=99.12 E-value=3.8e-11 Score=89.04 Aligned_cols=62 Identities=19% Similarity=0.305 Sum_probs=58.4
Q ss_pred CccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhcccccc
Q 019122 32 KRQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKI 94 (346)
Q Consensus 32 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~ 94 (346)
||+|+|||+.|+.+++.++++.||+ ....++.+.+|..|++||+++|.+|.++|++...+|.
T Consensus 1 Pkrp~saf~~f~~~~r~~~~~~~p~-~~~~~i~~~~~~~W~~ls~~eK~~y~~~a~~~~~~y~ 62 (66)
T cd01390 1 PKRPLSAYFLFSQEQRPKLKKENPD-ASVTEVTKILGEKWKELSEEEKKKYEEKAEKDKERYE 62 (66)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCcC-CCHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHH
Confidence 7999999999999999999999998 6799999999999999999999999999999877764
No 4
>PF09011 HMG_box_2: HMG-box domain; InterPro: IPR015101 This domain is predominantly found in Maelstrom homologue proteins. It has no known function. ; GO: 0005634 nucleus; PDB: 2EQZ_A 1V64_A 2CTO_A 1H5P_A 3TQ6_A 3FGH_A 3TMM_A 1J3X_A 2YRQ_A 1AAB_A ....
Probab=99.06 E-value=6.4e-11 Score=90.37 Aligned_cols=66 Identities=23% Similarity=0.333 Sum_probs=55.3
Q ss_pred CCCCccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhcccccc
Q 019122 29 GGRKRQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKI 94 (346)
Q Consensus 29 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~ 94 (346)
|++||||+|||+.|+.|.+.+.+..-.......++.+++|..|++||++||.+|.++|++.+.+|.
T Consensus 1 p~kpK~~~say~lF~~~~~~~~k~~G~~~~~~~e~~k~~~~~Wk~Ls~~EK~~Y~~~A~~~k~~y~ 66 (73)
T PF09011_consen 1 PKKPKRPPSAYNLFMKEMRKEVKEEGGQKQSFREVMKEISERWKSLSEEEKEPYEERAKEDKERYE 66 (73)
T ss_dssp SSS--SSSSHHHHHHHHHHHHHHHHT-T-SSHHHHHHHHHHHHHHS-HHHHHHHHHHHHHHHHHHH
T ss_pred CcCCCCCCCHHHHHHHHHHHHHHHhcccCCCHHHHHHHHHHHHHhcCHHHHHHHHHHHHHHHHHHH
Confidence 789999999999999999999999833366678999999999999999999999999999877663
No 5
>PF00505 HMG_box: HMG (high mobility group) box; InterPro: IPR000910 High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair []. The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes; the SOX family of transcription factors; SRY sex determining region Y protein and related proteins []; LEF1 lymphoid enhancer binding factor 1 []; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor []; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.; GO: 0003677 DNA binding; PDB: 1I11_A 1J3C_A 1J3D_A 1WZ6_A 1WGF_A 2D7L_A 1GT0_D 3U2B_C 2CRJ_A 2CS1_A ....
Probab=99.04 E-value=9.6e-11 Score=87.73 Aligned_cols=62 Identities=29% Similarity=0.378 Sum_probs=56.3
Q ss_pred CccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhcccccc
Q 019122 32 KRQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKI 94 (346)
Q Consensus 32 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~ 94 (346)
||||+|||+.|+++.+.++++.||+.. ..+|.+.+|..|++||+++|.+|.+.|.+....|.
T Consensus 1 PkrP~~af~lf~~~~~~~~k~~~p~~~-~~~i~~~~~~~W~~l~~~eK~~y~~~a~~~~~~y~ 62 (69)
T PF00505_consen 1 PKRPPNAFMLFCKEKRAKLKEENPDLS-NKEISKILAQMWKNLSEEEKAPYKEEAEEEKERYE 62 (69)
T ss_dssp SSSS--HHHHHHHHHHHHHHHHSTTST-HHHHHHHHHHHHHCSHHHHHHHHHHHHHHHHHHHH
T ss_pred CcCCCCHHHHHHHHHHHHHHHHhcccc-cccchhhHHHHHhcCCHHHHHHHHHHHHHHHHHHH
Confidence 799999999999999999999999877 99999999999999999999999999999877764
No 6
>cd01389 MATA_HMG-box MATA_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include the fungal mating type gene products MC, MATA1 and Ste11.
Probab=98.98 E-value=3.9e-10 Score=86.86 Aligned_cols=64 Identities=27% Similarity=0.352 Sum_probs=59.3
Q ss_pred CCccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhccccccC
Q 019122 31 RKRQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKIG 95 (346)
Q Consensus 31 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~~ 95 (346)
++|||+||||.|+.+++.++++.+|+ ....++.+.+|..|++||+++|.+|.+.|++....|..
T Consensus 1 ~~kRP~naf~lf~~~~r~~~~~~~p~-~~~~eisk~~g~~Wk~ls~eeK~~y~~~A~~~k~~~~~ 64 (77)
T cd01389 1 KIPRPRNAFILYRQDKHAQLKTENPG-LTNNEISRIIGRMWRSESPEVKAYYKELAEEEKERHAR 64 (77)
T ss_pred CCCCCCcHHHHHHHHHHHHHHHHCCC-CCHHHHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHH
Confidence 47999999999999999999999997 47889999999999999999999999999998887754
No 7
>cd01388 SOX-TCF_HMG-box SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif.
Probab=98.96 E-value=4.3e-10 Score=85.58 Aligned_cols=63 Identities=24% Similarity=0.328 Sum_probs=58.3
Q ss_pred CccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhccccccC
Q 019122 32 KRQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKIG 95 (346)
Q Consensus 32 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~~ 95 (346)
-|||+||||.|+.+.+.++++.||+ ..+.++.|.+|..|++||+++|.+|.+.|++...+|..
T Consensus 2 iKrP~naf~~F~~~~r~~~~~~~p~-~~~~eisk~l~~~Wk~ls~~eK~~y~~~a~~~k~~y~~ 64 (72)
T cd01388 2 IKRPMNAFMLFSKRHRRKVLQEYPL-KENRAISKILGDRWKALSNEEKQPYYEEAKKLKELHMK 64 (72)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCCC-CCHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHH
Confidence 4899999999999999999999997 58899999999999999999999999999998877753
No 8
>KOG0381 consensus HMG box-containing protein [General function prediction only]
Probab=98.96 E-value=5.7e-10 Score=89.26 Aligned_cols=66 Identities=26% Similarity=0.342 Sum_probs=61.3
Q ss_pred CC--CCCccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhcccccc
Q 019122 28 DG--GRKRQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKI 94 (346)
Q Consensus 28 ~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~ 94 (346)
|| +.||||++|||.|+.+++..+++.||+ ..+.+|+|++|+.|++|++++|.+|.++|.+....|.
T Consensus 17 ~p~~~~pkrp~sa~~~f~~~~~~~~k~~~p~-~~~~~v~k~~g~~W~~l~~~~k~~y~~ka~~~k~~Y~ 84 (96)
T KOG0381|consen 17 DPNAQAPKRPLSAFFLFSSEQRSKIKAENPG-LSVGEVAKALGEMWKNLAEEEKQPYEEKASKLKEKYE 84 (96)
T ss_pred CCCCCCCCCCCcHHHHHHHHHHHHHHHhCCC-CCHHHHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHH
Confidence 55 599999999999999999999999999 8899999999999999999999999999988777764
No 9
>cd00084 HMG-box High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III member
Probab=98.94 E-value=6.2e-10 Score=82.13 Aligned_cols=62 Identities=23% Similarity=0.303 Sum_probs=57.5
Q ss_pred CccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhcccccc
Q 019122 32 KRQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKI 94 (346)
Q Consensus 32 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~ 94 (346)
||||+|||+.|+.+++..+++.+|+ ....++.+.+|..|++||+++|.+|.++|++...+|.
T Consensus 1 pkrp~~af~~f~~~~~~~~~~~~~~-~~~~~i~~~~~~~W~~l~~~~k~~y~~~a~~~~~~y~ 62 (66)
T cd00084 1 PKRPLSAYFLFSQEHRAEVKAENPG-LSVGEISKILGEMWKSLSEEEKKKYEEKAEKDKERYE 62 (66)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCcC-CCHHHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHH
Confidence 6899999999999999999999988 5588999999999999999999999999999877764
No 10
>PF09331 DUF1985: Domain of unknown function (DUF1985); InterPro: IPR015410 This domain is functionally uncharacterised; it is found in a set of Arabidopsis thaliana (Mouse-ear cress) hypothetical proteins.
Probab=98.90 E-value=2.3e-09 Score=92.30 Aligned_cols=124 Identities=17% Similarity=0.311 Sum_probs=95.4
Q ss_pred ceEEECCeEeecCHhHHHHHhcccCCCCcccccCChhhH---HHHHhhcCCCCCCcchHHHHHHHhhc-cCCCchHhHHH
Q 019122 156 CILQLNGVEVELSPKSFSYVMGISDGGKPLQLEGESSEV---CAYVDNFTPTSRGINITVLAGILQKL-KSADDQFKVTF 231 (346)
Q Consensus 156 ~~~~l~g~~i~it~~dV~~VLGLP~gG~~v~~~~~~~~~---~~l~~~~~~~~~~i~i~~L~~~l~~~-~~~~d~f~r~F 231 (346)
.-|.++|..|.++..+.+.|+|||++..|-......... ..+.+.+-..+..+++..+.++|... ..+.++=+|..
T Consensus 14 ~W~~~~g~piRfsl~Ef~lvTGL~C~~~p~~~~~~~~~~~~~~~fw~~Lf~~~~~vtv~dv~~~L~~~~~~~~~~Rlrla 93 (142)
T PF09331_consen 14 IWFVFNGVPIRFSLREFALVTGLNCGPYPKEKKVDKKGKKEKGSFWNKLFGREEDVTVEDVIAKLKKMKKWDSEDRLRLA 93 (142)
T ss_pred EEEEECCEeeEecHHHHHhhcCCcCCCCCcccchhhccccchhhhhhhhccccccCcHHHHHHHHhhcccCChhhHHHHH
Confidence 568899999999999999999999998877654321111 13333333346779999999999886 23555666666
Q ss_pred HHHhhhhhccCCC-CCcccccccccccccCCCCcccchHHHHHHHHHHH
Q 019122 232 MMFALCTILCPPG-GVHISSGFLFSLKDVESIPKRNWATFCFHRLIQGI 279 (346)
Q Consensus 232 ll~~i~~~L~Pts-~~~vs~~yl~~l~D~d~i~~ynW~~~Vl~~L~~~i 279 (346)
+|+++..+|+|++ +..|+..++..+.|++...+|-||.+.++.++++|
T Consensus 94 ~L~~v~gvl~~~~~~~~i~~~~~~~v~Dl~~f~~yPWGr~sF~~~~~sI 142 (142)
T PF09331_consen 94 LLLFVDGVLIATSKTTKIPKEHLKMVDDLEKFLNYPWGRYSFDMLMKSI 142 (142)
T ss_pred HHHhhheeeeccCCCCCCCHHHHHHHhhHHHHhcCCcHHHHHHHHHhcC
Confidence 6666666666665 55899999999999999999999999999999864
No 11
>smart00398 HMG high mobility group.
Probab=98.88 E-value=1.5e-09 Score=80.99 Aligned_cols=63 Identities=27% Similarity=0.357 Sum_probs=57.5
Q ss_pred CCccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhcccccc
Q 019122 31 RKRQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKI 94 (346)
Q Consensus 31 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~ 94 (346)
+||+|+|||+.|+.+++..+++.+|+. ...++.+.+|..|++||+++|.+|.+.|++...+|.
T Consensus 1 ~pkrp~~~y~~f~~~~r~~~~~~~~~~-~~~~i~~~~~~~W~~l~~~ek~~y~~~a~~~~~~y~ 63 (70)
T smart00398 1 KPKRPMSAFMLFSQENRAKIKAENPDL-SNAEISKKLGERWKLLSEEEKAPYEEKAKKDKERYE 63 (70)
T ss_pred CcCCCCcHHHHHHHHHHHHHHHHCcCC-CHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHH
Confidence 489999999999999999999999975 478999999999999999999999999998776663
No 12
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=98.50 E-value=6.8e-08 Score=86.78 Aligned_cols=68 Identities=21% Similarity=0.236 Sum_probs=63.3
Q ss_pred CCCCCCCccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhcccccc
Q 019122 26 SNDGGRKRQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKI 94 (346)
Q Consensus 26 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~ 94 (346)
-.|||.||||-++||.|..+-|.++...+|+. -+.+++|.+|++|++||++||.+|-+.|..-+..|-
T Consensus 65 k~dpN~PKRp~sayf~y~~~~R~ei~~~~p~l-~~~e~~k~~~e~WK~Ltd~eke~y~k~~~~~~erYq 132 (211)
T COG5648 65 KKDPNGPKRPLSAYFLYSAENRDEIRKENPKL-TFGEVGKLLSEKWKELTDEEKEPYYKEANSDRERYQ 132 (211)
T ss_pred hcCCCCCCCchhHHHHHHHHHHHHHHHhCCCC-ChHHHHHHHHHHHHhccHhhhhhHHHHHhhHHHHHH
Confidence 35899999999999999999999999999988 889999999999999999999999999988777763
No 13
>KOG0527 consensus HMG-box transcription factor [Transcription]
Probab=98.19 E-value=9.5e-07 Score=85.74 Aligned_cols=64 Identities=25% Similarity=0.371 Sum_probs=60.3
Q ss_pred CCccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhccccccC
Q 019122 31 RKRQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKIG 95 (346)
Q Consensus 31 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~~ 95 (346)
+=|||-|||.|+..+.|+.|-+.+|+- =+.++.|.+|..||.|+++||.+|.+.|+++..++++
T Consensus 62 hIKRPMNAFMVWSq~~RRkma~qnP~m-HNSEISK~LG~~WK~Lse~EKrPFi~EAeRLR~~Hmk 125 (331)
T KOG0527|consen 62 RIKRPMNAFMVWSQGQRRKLAKQNPKM-HNSEISKRLGAEWKLLSEEEKRPFVDEAERLRAQHMK 125 (331)
T ss_pred ccCCCcchhhhhhHHHHHHHHHhCcch-hhHHHHHHHHHHHhhcCHhhhccHHHHHHHHHHHHHH
Confidence 449999999999999999999999965 6789999999999999999999999999999999987
No 14
>KOG0526 consensus Nucleosome-binding factor SPN, POB3 subunit [Transcription; Replication, recombination and repair; Chromatin structure and dynamics]
Probab=97.56 E-value=3.9e-05 Score=77.41 Aligned_cols=69 Identities=12% Similarity=0.223 Sum_probs=59.9
Q ss_pred EeeccCCCCCCCccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhcccccc
Q 019122 21 VGVDASNDGGRKRQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKI 94 (346)
Q Consensus 21 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~ 94 (346)
.+.--.-|||.|||+-+||..|+++=|..+|+. .-.+.+|+|.+|.+||.||. |.++.++|+.-+.+|.
T Consensus 525 k~~kk~kdpnapkra~sa~m~w~~~~r~~ik~d---gi~~~dv~kk~g~~wk~ms~--k~~we~ka~~dk~ry~ 593 (615)
T KOG0526|consen 525 KKGKKKKDPNAPKRATSAYMLWLNASRESIKED---GISVGDVAKKAGEKWKQMSA--KEEWEDKAAVDKQRYE 593 (615)
T ss_pred cCcccCCCCCCCccchhHHHHHHHhhhhhHhhc---CchHHHHHHHHhHHHhhhcc--cchhhHHHHHHHHHHH
Confidence 566677899999999999999999999999994 88899999999999999999 7777777776665553
No 15
>KOG3248 consensus Transcription factor TCF-4 [Transcription]
Probab=95.29 E-value=0.013 Score=56.40 Aligned_cols=55 Identities=22% Similarity=0.368 Sum_probs=47.4
Q ss_pred ccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhh
Q 019122 33 RQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKR 88 (346)
Q Consensus 33 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~ 88 (346)
|+|.+||.-||.|-|+..-+.-. .|--+++-|++|.+|+.||.||-+.|-.-|++
T Consensus 193 KKPLNAFmlyMKEmRa~vvaEct-lKeSAaiNqiLGrRWH~LSrEEQAKYyElArK 247 (421)
T KOG3248|consen 193 KKPLNAFMLYMKEMRAKVVAECT-LKESAAINQILGRRWHALSREEQAKYYELARK 247 (421)
T ss_pred cccHHHHHHHHHHHHHHHHHHhh-hhhHHHHHHHHhHHHhhhhHHHHHHHHHHHHH
Confidence 78999999999999998888553 45556899999999999999999998888775
No 16
>KOG0528 consensus HMG-box transcription factor SOX5 [Transcription]
Probab=94.41 E-value=0.021 Score=57.47 Aligned_cols=64 Identities=25% Similarity=0.302 Sum_probs=56.3
Q ss_pred ccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhccccccCCC
Q 019122 33 RQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKIGKA 97 (346)
Q Consensus 33 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~~~~ 97 (346)
|||-+||.++-.|=|+.+-.+.|+-- +-++.|-+|..||+||..||.+|-..-.++++.++.+-
T Consensus 327 KRPMNAFMVWAkDERRKILqA~PDMH-NSnISKILGSRWKaMSN~eKQPYYEEQaRLSk~HlEk~ 390 (511)
T KOG0528|consen 327 KRPMNAFMVWAKDERRKILQAFPDMH-NSNISKILGSRWKAMSNTEKQPYYEEQARLSKLHLEKY 390 (511)
T ss_pred cCCcchhhcccchhhhhhhhcCcccc-ccchhHHhcccccccccccccchHHHHHHHHHhhhccC
Confidence 99999999999999999999998432 23578899999999999999999999999999888753
No 17
>PF06382 DUF1074: Protein of unknown function (DUF1074); InterPro: IPR024460 This family consists of several proteins which appear to be specific to Insecta. The function of this family is unknown.
Probab=92.06 E-value=0.1 Score=46.20 Aligned_cols=47 Identities=19% Similarity=0.355 Sum_probs=34.5
Q ss_pred cchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhh
Q 019122 37 NGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKR 88 (346)
Q Consensus 37 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~ 88 (346)
+||+.||.+|++. |.+ ....++-..+...|..||+.+|..|...+-.
T Consensus 84 naYLNFLReFRrk----h~~-L~p~dlI~~AAraW~rLSe~eK~rYrr~~~~ 130 (183)
T PF06382_consen 84 NAYLNFLREFRRK----HCG-LSPQDLIQRAARAWCRLSEAEKNRYRRMAPS 130 (183)
T ss_pred hHHHHHHHHHHHH----ccC-CCHHHHHHHHHHHHHhCCHHHHHHHHhhcch
Confidence 5788887776654 433 5556666777778999999999999875443
No 18
>PF03078 ATHILA: ATHILA ORF-1 family; InterPro: IPR004312 ATHILA is a group of Arabidopsis thaliana retrotransposons [] belonging to the Ty3/gypsy family of the long terminal repeat (LTR) class of eukaryotic retrotransposons[, ]. The central region of ATHILA retrotransposons contains two or three open reading frames (ORFs). This family represents the ORF1 product. The function of ORF1 is unknown.
Probab=89.58 E-value=3.5 Score=42.20 Aligned_cols=187 Identities=15% Similarity=0.191 Sum_probs=106.8
Q ss_pred ccc-CHHHHHHHHhhCCHHHHHHHHHcCccccccccccccchHHHHHHHhc----c---cc--------ccceEEECCeE
Q 019122 101 TRC-APDRLAAAVLQLTDEQRVVVREMGFGSLLQLNCGRLKRNLCGWLVEK----I---DI--------ARCILQLNGVE 164 (346)
Q Consensus 101 tRc-S~~~~~~~i~~Ls~~qk~~I~~~GFg~LL~i~~~~l~~~L~~wL~~~----~---d~--------~t~~~~l~g~~ 164 (346)
||. ++..+..+ .|.++-..+++.+|.+.|..++...-+...+.+|+.. + ++ ..-+|.|.+.+
T Consensus 63 TRyp~~etl~~L--Gl~~dV~~lf~~~gL~~f~~~~~~~Y~eet~qFLaTl~v~~~~~~~~~~~e~~glG~l~F~V~~~~ 140 (458)
T PF03078_consen 63 TRYPDPETLQKL--GLLEDVEYLFKKCGLGTFMSYPYPTYPEETRQFLATLKVTFYNPSEPRAKELDGLGYLTFFVYGVE 140 (458)
T ss_pred cccCCHHHHHHh--ccHHHHHHHHHhcCchhhccCCCCCcHHHHHHhhheeeeeecccccchhhcccCcceEEEEEccee
Confidence 443 45555555 7888889999999999999888865554444444321 1 11 22467778999
Q ss_pred eecCHhHHHHHhcccCCCCcccccCChhhHHHHHhhcCCCCCCcchHHHHHHHhhccCCCchHhHHHHHHhhhhhccCCC
Q 019122 165 VELSPKSFSYVMGISDGGKPLQLEGESSEVCAYVDNFTPTSRGINITVLAGILQKLKSADDQFKVTFMMFALCTILCPPG 244 (346)
Q Consensus 165 i~it~~dV~~VLGLP~gG~~v~~~~~~~~~~~l~~~~~~~~~~i~i~~L~~~l~~~~~~~d~f~r~Fll~~i~~~L~Pts 244 (346)
..+|-.+.+.++|.|.++. +......++...|....|.+. .++...-... ..-+-+.+|+-=+++..|+|..
T Consensus 141 y~lsi~~L~~i~GF~~~~~-i~~~~~~~el~~~W~~ig~~~-p~~~~~~ks~------~Ir~PviRy~hr~iA~tlf~R~ 212 (458)
T PF03078_consen 141 YSLSIKHLERIFGFPSGDE-IKPDFDPEELNDFWATIGGGK-PFNSARSKSN------QIRSPVIRYFHRLIANTLFARE 212 (458)
T ss_pred eeeeHHHHHHHhCCCCccc-cCCCCCchHHHHHHHHhcCCC-cccccccccc------cccChHHHHHHHHHHhhhcccc
Confidence 9999999999999999854 333444455566666666321 1111111111 1122234444455565566555
Q ss_pred CC-ccccccccc-----------cccc----CCCCcccchHHHHHHHHHHHHhhhcc--C---ccccchhHHHH
Q 019122 245 GV-HISSGFLFS-----------LKDV----ESIPKRNWATFCFHRLIQGITRHKEE--Q---VAYVGGCLLYL 297 (346)
Q Consensus 245 ~~-~vs~~yl~~-----------l~D~----d~i~~ynW~~~Vl~~L~~~i~k~~~~--k---~~~i~GCl~lL 297 (346)
.+ .|...-|.+ ..|. .+..+.+-+...++||...=..+... + .-.+||=+.-+
T Consensus 213 ~~~~v~~~El~~l~~~L~~~Lr~~~~g~~l~~d~~dt~~~~vl~~hL~~yk~wa~~~~~~~~~~l~vGgviTpI 286 (458)
T PF03078_consen 213 ETGTVRNDELEMLDQALKHLLRRTKDGKLLRGDLNDTNVSMVLLDHLCSYKGWAKTNKKKARGSLCVGGVITPI 286 (458)
T ss_pred ccCceechhHHHHHHHHHHHHHhcCCCccccCcccccchhHHHHHHHHhhhHHHhhcCCCCCceeeecchhhhH
Confidence 33 555444332 1221 12366777777788877553222111 2 34677766544
No 19
>KOG4715 consensus SWI/SNF-related matrix-associated actin-dependent regulator of chromatin [Chromatin structure and dynamics]
Probab=87.56 E-value=0.52 Score=45.37 Aligned_cols=63 Identities=29% Similarity=0.294 Sum_probs=51.6
Q ss_pred CCccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhcccccc
Q 019122 31 RKRQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKI 94 (346)
Q Consensus 31 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~ 94 (346)
-|.+|.-++.+|..-.-.+.|++||+.+. =++||.+|.-|.-|+++||..|+..=.--+.+|-
T Consensus 64 ppekpl~pymrySrkvWd~VkA~nPe~kL-WeiGK~Ig~mW~dLpd~EK~ey~~EYeaEKieY~ 126 (410)
T KOG4715|consen 64 PPEKPLMPYMRYSRKVWDQVKASNPELKL-WEIGKIIGGMWLDLPDEEKQEYLNEYEAEKIEYN 126 (410)
T ss_pred CCCcccchhhHHhhhhhhhhhccCcchHH-HHHHHHHHHHHhhCcchHHHHHHHHHHHHHHHHH
Confidence 45677788889988888999999999874 4899999999999999999988776555445553
No 20
>PF14887 HMG_box_5: HMG (high mobility group) box 5; PDB: 1L8Y_A 1L8Z_A 2HDZ_A.
Probab=84.06 E-value=0.82 Score=35.18 Aligned_cols=60 Identities=12% Similarity=0.046 Sum_probs=45.2
Q ss_pred ccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhcccccc
Q 019122 33 RQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKI 94 (346)
Q Consensus 33 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~ 94 (346)
..|-+|-=.+....+.-|-+.+++..... -+++..+|++|++++|.+--++|.+.-+.|.
T Consensus 5 E~PKt~qe~Wqq~vi~dYla~~~~dr~K~--~kam~~~W~~me~Kekl~WIkKA~EdqKrYE 64 (85)
T PF14887_consen 5 ETPKTAQEIWQQSVIGDYLAKFRNDRKKA--LKAMEAQWSQMEKKEKLKWIKKAAEDQKRYE 64 (85)
T ss_dssp ----THHHHHHHHHHHHHHHHTTSTHHHH--HHHHHHHHHTTGGGHHHHHHHHHHHHHHHHH
T ss_pred CCCCCHHHHHHHHHHHHHHHHhhHhHHHH--HHHHHHHHHHhhhhhhhHHHHHHHHHHHHHH
Confidence 34455555667888888998888765443 5688999999999999999999998777664
No 21
>KOG2746 consensus HMG-box transcription factor Capicua and related proteins [Transcription]
Probab=67.84 E-value=3.2 Score=43.85 Aligned_cols=62 Identities=21% Similarity=0.256 Sum_probs=44.6
Q ss_pred ccCCcchhhhhHHHH--HHHHhcCCCcchHHHHHHHhccCCCCCChhhhhhhhHHhhhccccccC
Q 019122 33 RQCRNGFIRYFGEVV--RQIKANDGMACITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKIG 95 (346)
Q Consensus 33 ~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~~ 95 (346)
+||-++|-.|...-+ .--.+.| -|.-+..|.|.+|+.|++|-+.||..|-+.|.+++..+.+
T Consensus 183 rrPMnaf~ifskrhr~~g~vhq~~-pn~DNrtIskiLgewWytL~~~Ekq~yhdLa~Qvk~Ahfk 246 (683)
T KOG2746|consen 183 RRPMNAFHIFSKRHRGEGRVHQRH-PNQDNRTISKILGEWWYTLGPNEKQKYHDLAFQVKEAHFK 246 (683)
T ss_pred hhhhHHHHHHHhhcCCccchhccC-ccccchhHHHHHhhhHhhhCchhhhhHHHHHHHHHHHHhh
Confidence 778888866632222 1111222 4566789999999999999999999999988887766554
No 22
>PF04690 YABBY: YABBY protein; InterPro: IPR006780 YABBY proteins are a group of plant-specific transcription factors involved in the specification of abaxial polarity in lateral organs such as leaves and floral organs [, ].
Probab=65.93 E-value=8.3 Score=34.27 Aligned_cols=48 Identities=17% Similarity=0.348 Sum_probs=36.7
Q ss_pred CCCCCccCCcchhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCCh
Q 019122 28 DGGRKRQCRNGFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPP 76 (346)
Q Consensus 28 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~ 76 (346)
-|.+..|-|+||=+||.|-+..+|+.+|+-. =+++-+++.+-|...++
T Consensus 118 PPEKRqR~psaYn~f~k~ei~rik~~~p~is-hkeaFs~aAknW~h~ph 165 (170)
T PF04690_consen 118 PPEKRQRVPSAYNRFMKEEIQRIKAENPDIS-HKEAFSAAAKNWAHFPH 165 (170)
T ss_pred CccccCCCchhHHHHHHHHHHHHHhcCCCCC-HHHHHHHHHHhhhhCcc
Confidence 3445678899999999999999999998754 45666666666876553
No 23
>KOG2062 consensus 26S proteasome regulatory complex, subunit RPN2/PSMD1 [Posttranslational modification, protein turnover, chaperones]
Probab=43.35 E-value=28 Score=37.75 Aligned_cols=54 Identities=22% Similarity=0.381 Sum_probs=33.0
Q ss_pred hhCCHH-HHHHHHHcCccccccccccccchHHHHHHHhccccccceEEEC-CeEeecCHhHHHHHhcccCCCC
Q 019122 113 LQLTDE-QRVVVREMGFGSLLQLNCGRLKRNLCGWLVEKIDIARCILQLN-GVEVELSPKSFSYVMGISDGGK 183 (346)
Q Consensus 113 ~~Ls~~-qk~~I~~~GFg~LL~i~~~~l~~~L~~wL~~~~d~~t~~~~l~-g~~i~it~~dV~~VLGLP~gG~ 183 (346)
+.-+++ .|.+|-.+||==+- ...++ ...+.-|.++||| |+. | +++.|||-+.|.
T Consensus 565 sD~nDDVrRaAVialGFVl~~--dp~~~-~s~V~lLses~N~-----HVRyG---------aA~ALGIaCAGt 620 (929)
T KOG2062|consen 565 SDVNDDVRRAAVIALGFVLFR--DPEQL-PSTVSLLSESYNP-----HVRYG---------AAMALGIACAGT 620 (929)
T ss_pred cccchHHHHHHHHHheeeEec--Chhhc-hHHHHHHhhhcCh-----hhhhh---------HHHHHhhhhcCC
Confidence 344444 46688999985221 11122 2345678888887 443 4 788999988775
No 24
>PF11304 DUF3106: Protein of unknown function (DUF3106); InterPro: IPR021455 Some members in this family of proteins are annotated as transmembrane proteins however this cannot be confirmed. Currently no function is known.
Probab=41.07 E-value=47 Score=27.06 Aligned_cols=15 Identities=20% Similarity=0.552 Sum_probs=7.1
Q ss_pred ccCCCCCChhhhhhh
Q 019122 68 GSTYKNLPPEEKCRY 82 (346)
Q Consensus 68 ~~~~~sl~~~~k~~~ 82 (346)
...|.+||++++..+
T Consensus 35 a~r~~~mspeqq~r~ 49 (107)
T PF11304_consen 35 AERWPSMSPEQQQRL 49 (107)
T ss_pred HHHHhcCCHHHHHHH
Confidence 344555555544443
No 25
>PF02919 Topoisom_I_N: Eukaryotic DNA topoisomerase I, DNA binding fragment; InterPro: IPR008336 DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (5.99.1.2 from EC; topoisomerases I, III and V) break single-strand DNA, and type II enzymes (5.99.1.3 from EC; topoisomerases II, IV and VI) break double-strand DNA. Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA. This entry represents the N-terminal DNA-binding domain found in eukaryotic topoisomerase I, which is a type IB enzyme. To cleave the DNA backbone, these enzymes must make a transient phosphotyrosine bond. The N-terminal domain of human topoisomerase I is thought to coordinate the restriction of free strand rotation during the topoisomerisation step of catalysis. A conserved tryptophan residue may be important for the DNA-interaction ability of the N-terminal domain. Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes. More information about this protein can be found at Protein of the Month: DNA Topoisomerase.; GO: 0003677 DNA binding, 0003917 DNA topoisomerase type I activity, 0006265 DNA topological change, 0005694 chromosome; PDB: 1TL8_A 1K4T_A 1A36_A 1RR8_C 1T8I_A 1SC7_A 1EJ9_A 1LPQ_A 1RRJ_A 1A31_A ....
Probab=35.33 E-value=16 Score=33.57 Aligned_cols=42 Identities=19% Similarity=0.287 Sum_probs=26.7
Q ss_pred hhhHHHHHHHHhcCCC------cchHHHH---HHHhccCCCCCChhhhhhh
Q 019122 41 RYFGEVVRQIKANDGM------ACITKEI---RKEIGSTYKNLPPEEKCRY 82 (346)
Q Consensus 41 ~~~~~~~~~~~~~~~~------~~~~~~~---~k~~~~~~~sl~~~~k~~~ 82 (346)
-||+||++.+...++. .|--..+ =.+.-+..++||.|||..-
T Consensus 63 NFf~Df~~~l~~~~~~~i~~~~kcDF~~i~~~~~~~~e~kk~~skeEK~~~ 113 (215)
T PF02919_consen 63 NFFKDFRKVLTKEERKKIKDFDKCDFSPIYEYFEKEKEKKKNMSKEEKKAL 113 (215)
T ss_dssp HHHHHHHHHHCHCCHHH-S-GGGEETHHHHHHHHHHHHHHCTS-CCHHHHH
T ss_pred HHHHHHHHHhhhccCcccCchhhCCCHHHHHHHHHHHHHHHhcCHHHHHHH
Confidence 3889999999987742 2222222 2333577899999998853
No 26
>cd03490 Topoisomer_IB_N_1 Topoisomer_IB_N_1: A subgroup of the N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB. Topo IB proteins include the monomeric yeast and human topo I and heterodimeric topo I from Leishmania donovani. Topo I enzymes are divided into: topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick; the broken strand can then rotate around the unbroken strand to remove DNA supercoils, and the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes. Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit religation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. In addition to differences in structure and some biochemical properties, Trypanoso
Probab=34.08 E-value=37 Score=31.23 Aligned_cols=44 Identities=23% Similarity=0.194 Sum_probs=27.7
Q ss_pred hhhHHHHHHHHhcCCCcchHH----------HHHHHhccCCCCCChhhhhhhhH
Q 019122 41 RYFGEVVRQIKANDGMACITK----------EIRKEIGSTYKNLPPEEKCRYKS 84 (346)
Q Consensus 41 ~~~~~~~~~~~~~~~~~~~~~----------~~~k~~~~~~~sl~~~~k~~~~~ 84 (346)
-||+||++.++..++...|.. +-=.+..++.++||.|||+.-..
T Consensus 60 NFf~df~~~l~~~~~~~~i~~f~kcDF~~i~~~~~~~ke~kK~~tkeEKk~~K~ 113 (217)
T cd03490 60 NFWKVFVNSFEKDHKFIRRCKLSDADFSLIKNHLEEEKEKKKNLNKEEKEAKKK 113 (217)
T ss_pred HHHHHHHHHhccccCcccccchhhCCCHHHHHHHHHHHHHHHhcCHHHHHHHHH
Confidence 388999999976553222211 11223457788999999886533
No 27
>cd00660 Topoisomer_IB_N Topoisomer_IB_N: N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the monomeric yeast and human topo I and heterodimeric topo I from Leishmania donovani. Topo I enzymes are divided into: topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick; the broken strand can then rotate around the unbroken strand to remove DNA supercoils, and the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes. Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit re-ligation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topo I diffe
Probab=33.52 E-value=47 Score=30.55 Aligned_cols=44 Identities=18% Similarity=0.242 Sum_probs=26.7
Q ss_pred hhhHHHHHHHHhcCC------CcchH---HHHHHHhccCCCCCChhhhhhhhH
Q 019122 41 RYFGEVVRQIKANDG------MACIT---KEIRKEIGSTYKNLPPEEKCRYKS 84 (346)
Q Consensus 41 ~~~~~~~~~~~~~~~------~~~~~---~~~~k~~~~~~~sl~~~~k~~~~~ 84 (346)
-||+||++.+..... +.|-- .+-=.+..++.|+||.|||+.-..
T Consensus 62 NFf~Df~~~l~~~~~~~i~~f~kcDF~~i~~~~~~~~e~kK~~s~eEKk~~K~ 114 (215)
T cd00660 62 NFFKDFRKILTKEEKHIIKKLSKCDFTPIYQYFEEEKEKKKAMSKEEKKAIKE 114 (215)
T ss_pred HHHHHHHHHhccccCccccchhhCCCHHHHHHHHHHHHHHHcCCHHHHHHHHH
Confidence 378999999955432 11111 122233467889999999886433
No 28
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=32.72 E-value=22 Score=32.53 Aligned_cols=37 Identities=19% Similarity=0.177 Sum_probs=31.2
Q ss_pred chHHHHHHHhccCCCCCChhhhhhhhHHhhhcccccc
Q 019122 58 CITKEIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKI 94 (346)
Q Consensus 58 ~~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~ 94 (346)
++.-+.+|.+|+.|++|++.-|.+|.+.++....+|.
T Consensus 169 ~~~~e~~k~~~~~w~el~~skK~~~~~~~Kk~k~~~~ 205 (211)
T COG5648 169 KALVEETKIISKAWSELDESKKKKYIDKYKKLKEEYD 205 (211)
T ss_pred hhhhHHhhhhhhhhhhhChhhhhHHHHHHHHHHHHHh
Confidence 3444789999999999999999999998888777764
No 29
>cd03488 Topoisomer_IB_N_htopoI_like Topoisomer_IB_N_htopoI_like: N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the monomeric yeast and human topo I. Topo I enzymes are divided into: topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick; the broken strand can then rotate around the unbroken strand to remove DNA supercoils, and the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes. Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit religation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. This family may represent more than one structural domain.
Probab=32.62 E-value=49 Score=30.40 Aligned_cols=44 Identities=18% Similarity=0.222 Sum_probs=26.8
Q ss_pred hhhHHHHHHHHhcCC------CcchH---HHHHHHhccCCCCCChhhhhhhhH
Q 019122 41 RYFGEVVRQIKANDG------MACIT---KEIRKEIGSTYKNLPPEEKCRYKS 84 (346)
Q Consensus 41 ~~~~~~~~~~~~~~~------~~~~~---~~~~k~~~~~~~sl~~~~k~~~~~ 84 (346)
-||+||++.+..... +.|-- .+-=.+..++.|+||.|||+.-..
T Consensus 62 NFf~Df~~~l~~~~~~~I~~f~kcDF~~i~~~~~~~~e~kK~~tkeEKk~~K~ 114 (215)
T cd03488 62 NFFKDFKKVMTKEEKVIIKDFSKCDFTQMFAYFKAQKEEKKAMSKEEKKAIKA 114 (215)
T ss_pred HHHHHHHHHhccccCccccchhhCCCHHHHHHHHHHHHHHHcCCHHHHHHHHH
Confidence 388999999955331 11111 122234567889999999886433
No 30
>cd03489 Topoisomer_IB_N_LdtopoI_like Topoisomer_IB_N_LdtopoI_like: N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the heterodimeric topo I from Leishmania donovani. Topo I enzymes are divided into: topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick; the broken strand can then rotate around the unbroken strand to remove DNA supercoils, and the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes. Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit re-ligation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topo I differ from human
Probab=30.73 E-value=55 Score=30.02 Aligned_cols=44 Identities=18% Similarity=0.129 Sum_probs=27.0
Q ss_pred hhhHHHHHHHHhcCCC-----cchHH---HHHHHhccCCCCCChhhhhhhhH
Q 019122 41 RYFGEVVRQIKANDGM-----ACITK---EIRKEIGSTYKNLPPEEKCRYKS 84 (346)
Q Consensus 41 ~~~~~~~~~~~~~~~~-----~~~~~---~~~k~~~~~~~sl~~~~k~~~~~ 84 (346)
-||+||++.+...+.. .|--. +-=.+..+..|+||.|||+.-..
T Consensus 60 NFf~df~~~l~~~~~~I~~f~kcDF~~i~~~~~~~~e~kK~~tkeEKk~~K~ 111 (212)
T cd03489 60 NFFESWREILDKRHHPIRKLELCDFTPIYEWHLREKEKKKSRTKEEKKALKE 111 (212)
T ss_pred HHHHHHHHHhcccCccccchhhCCCHHHHHHHHHHHHHHHhCCHHHHHHHHH
Confidence 3889999999765411 11111 12233457789999999886533
No 31
>PF11304 DUF3106: Protein of unknown function (DUF3106); InterPro: IPR021455 Some members in this family of proteins are annotated as transmembrane proteins; however, this cannot be confirmed. Currently no function is known.
Probab=27.43 E-value=1.2e+02 Score=24.57 Aligned_cols=63 Identities=24% Similarity=0.333 Sum_probs=38.9
Q ss_pred HHHHHhccCCCCCChhhhhhhhHHhhhccccccCCCCcccccCH------HHHHHHHhhCCHHHHHHHHHcCcccccccc
Q 019122 62 EIRKEIGSTYKNLPPEEKCRYKSEAKRVGKSKIGKAKFPTRCAP------DRLAAAVLQLTDEQRVVVREMGFGSLLQLN 135 (346)
Q Consensus 62 ~~~k~~~~~~~sl~~~~k~~~~~~a~~~~~~~~~~~~~~tRcS~------~~~~~~i~~Ls~~qk~~I~~~GFg~LL~i~ 135 (346)
.+-.-+...|.+|+++.|......|.. ...-|| ..-+.--..|||+|++.+++- |..+-.|+
T Consensus 11 ~~L~pl~~~W~~l~~~qr~k~l~~a~r-----------~~~mspeqq~r~~~rm~~W~~LspeqR~~~R~~-~~~~~~Lp 78 (107)
T PF11304_consen 11 QALAPLAERWNSLPPEQRRKWLQIAER-----------WPSMSPEQQQRLRERMRRWAALSPEQRQQAREN-YQRFKQLP 78 (107)
T ss_pred HHHHHHHHHHhcCCHHHHHHHHHHHHH-----------HhcCCHHHHHHHHHHHHHHHhCCHHHHHHHHHH-HHHHHcCC
Confidence 334555677999999988777666554 122333 222334468888888887766 66555554
Q ss_pred c
Q 019122 136 C 136 (346)
Q Consensus 136 ~ 136 (346)
.
T Consensus 79 p 79 (107)
T PF11304_consen 79 P 79 (107)
T ss_pred H
Confidence 3
No 32
>PF06945 DUF1289: Protein of unknown function (DUF1289); InterPro: IPR010710 This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N terminus. The function of this family is unknown.
Probab=24.78 E-value=44 Score=23.42 Aligned_cols=20 Identities=10% Similarity=0.227 Sum_probs=15.8
Q ss_pred CCCCChhhhhhhhHHhhhcc
Q 019122 71 YKNLPPEEKCRYKSEAKRVG 90 (346)
Q Consensus 71 ~~sl~~~~k~~~~~~a~~~~ 90 (346)
|+.||+++|..-.++.....
T Consensus 30 W~~~s~~er~~i~~~l~~R~ 49 (51)
T PF06945_consen 30 WKSMSDDERRAILARLRARR 49 (51)
T ss_pred HhhCCHHHHHHHHHHHHHHh
Confidence 99999999987777665543
No 33
>COG2920 DsrC Dissimilatory sulfite reductase (desulfoviridin), gamma subunit [Inorganic ion transport and metabolism]
Probab=24.75 E-value=59 Score=26.58 Aligned_cols=34 Identities=15% Similarity=0.384 Sum_probs=29.5
Q ss_pred CcchhhhhHHHHHHHHhcCCCcchHHHHHHHhcc
Q 019122 36 RNGFIRYFGEVVRQIKANDGMACITKEIRKEIGS 69 (346)
Q Consensus 36 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~ 69 (346)
|=--++|..||-.+|+-+-|.--.||.+.|++|.
T Consensus 45 HWevv~fvR~fy~ef~tsPaiRMLvK~~~~~~g~ 78 (111)
T COG2920 45 HWEVVRFVREFYEEFNTSPAIRMLVKAMAKKLGE 78 (111)
T ss_pred HHHHHHHHHHHHHHHCCCchHHHHHHHHHHHhCc
Confidence 4445889999999999998888899999999985
No 34
>KOG4370 consensus Ral-GTPase effector RLIP76 [Signal transduction mechanisms]
Probab=22.73 E-value=1.5e+02 Score=30.28 Aligned_cols=31 Identities=32% Similarity=0.305 Sum_probs=23.0
Q ss_pred CCceeEeeccCCCCCCCccCCcchhhhhHHHHHHH
Q 019122 16 RGTEVVGVDASNDGGRKRQCRNGFIRYFGEVVRQI 50 (346)
Q Consensus 16 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 50 (346)
--|+-|-+|.+-|++..+. ||+|-.+|++++
T Consensus 55 ~~~~~v~~d~e~d~~~~~~----~f~~~~~~~e~~ 85 (514)
T KOG4370|consen 55 PLTESVSADPELDGIPLPS----FFRYAIDFVEEN 85 (514)
T ss_pred CCCcccccCcccCCCcCcc----cchhhhhhhhcc
Confidence 3455677777777776665 999999988775
No 35
>PRK00114 hslO Hsp33-like chaperonin; Reviewed
Probab=22.72 E-value=1.6e+02 Score=28.21 Aligned_cols=57 Identities=19% Similarity=0.333 Sum_probs=40.7
Q ss_pred CCCcccccCHHHHHHHHhhCCHHHHHHH-HHcCccccccccccccchHHHHHHHhccccccceEEECCeEeecCHhHHHH
Q 019122 96 KAKFPTRCAPDRLAAAVLQLTDEQRVVV-REMGFGSLLQLNCGRLKRNLCGWLVEKIDIARCILQLNGVEVELSPKSFSY 174 (346)
Q Consensus 96 ~~~~~tRcS~~~~~~~i~~Ls~~qk~~I-~~~GFg~LL~i~~~~l~~~L~~wL~~~~d~~t~~~~l~g~~i~it~~dV~~ 174 (346)
..++..+||..++.+++..|..+..+.+ ++ .+-..+.| ++++..-.+|++|+..
T Consensus 233 ~v~f~C~CS~er~~~~L~~Lg~~El~~i~~e---~~~iev~C----------------------~FC~~~Y~f~~~dl~~ 287 (293)
T PRK00114 233 PVEFKCDCSRERSANALKSLGKEELQEMIAE---DGGAEMVC----------------------QFCGNKYLFDEEDLEE 287 (293)
T ss_pred cCceeCCCCHHHHHHHHHhCCHHHHHHHHHc---CCCEEEEE----------------------eCCCCEEEeCHHHHHH
Confidence 4567899999999999999999886543 33 23333333 4567778888888877
Q ss_pred Hhc
Q 019122 175 VMG 177 (346)
Q Consensus 175 VLG 177 (346)
++.
T Consensus 288 l~~ 290 (293)
T PRK00114 288 LIA 290 (293)
T ss_pred HHh
Confidence 653
No 36
>PF12169 DNA_pol3_gamma3: DNA polymerase III subunits gamma and tau domain III; InterPro: IPR022754 This domain is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. It is found in association with PF00004 from PFAM. This domain is also present in the tau subunit before it undergoes cleavage. Domains I-III are shared between the tau and the gamma subunits, while most of the DnaB-binding Domain IV and all of the alpha-interacting Domain V are unique to tau.; GO: 0003887 DNA-directed DNA polymerase activity; PDB: 1NJF_B 3GLG_G 1XXH_I 1NJG_A 3GLF_B 3GLI_G.
Probab=22.65 E-value=1.2e+02 Score=25.22 Aligned_cols=129 Identities=12% Similarity=0.146 Sum_probs=55.7
Q ss_pred cCHhHHHHHhcccCCCCcccccCChhhHHHHHhhcCCCCCCcchHHHHHHHhhccCCCchHhHHHHHHhhhhhccCCCC-
Q 019122 167 LSPKSFSYVMGISDGGKPLQLEGESSEVCAYVDNFTPTSRGINITVLAGILQKLKSADDQFKVTFMMFALCTILCPPGG- 245 (346)
Q Consensus 167 it~~dV~~VLGLP~gG~~v~~~~~~~~~~~l~~~~~~~~~~i~i~~L~~~l~~~~~~~d~f~r~Fll~~i~~~L~Pts~- 245 (346)
||.++|..+||+...+.-.. +.+.....+..-.+..+.+.+.. +.+-..|.+-++-++=..+++..+.
T Consensus 1 It~e~V~~~lG~v~~~~i~~----------l~~ai~~~d~~~~l~~~~~l~~~-G~d~~~~l~~L~~~~R~ll~~k~~~~ 69 (143)
T PF12169_consen 1 ITAEDVREILGLVDEEQIFE----------LLDAILEGDAAEALELLNELLEQ-GKDPKQFLDDLIEYLRDLLLYKITGD 69 (143)
T ss_dssp B-HHHHHHHHTHTSTHHHHH----------HHHHHHTT-HHHHHHHHHHHHHC-T--HHHHHHHHHHHHHHHHHHTTSGG
T ss_pred CCHHHHHHHHCCCCHHHHHH----------HHHHHHcCCHHHHHHHHHHHHHh-CCCHHHHHHHHHHHHHHHHHHHhCCc
Confidence 68999999999976654332 22221111100011222222222 2122333333333332333322221
Q ss_pred ----CcccccccccccccCCCCcccchHHHHHHHHHHHHhhhccCccccchhHHHHHHHHHhhcc
Q 019122 246 ----VHISSGFLFSLKDVESIPKRNWATFCFHRLIQGITRHKEEQVAYVGGCLLYLQMLYFNSIV 306 (346)
Q Consensus 246 ----~~vs~~yl~~l~D~d~i~~ynW~~~Vl~~L~~~i~k~~~~k~~~i~GCl~lLqi~Yld~l~ 306 (346)
..++..+...+.+...--..+=-...++.|.++..+.+....+.+.=-+.++.+..+.++.
T Consensus 70 ~~~~~~~~~~~~~~~~~~a~~~~~~~l~~~~~~l~~~~~~lr~s~~pr~~lE~~llrl~~~~~~~ 134 (143)
T PF12169_consen 70 KSNLLELSEEEEEKLKELAKKFSPERLQRILQILLEAENELRYSSNPRILLEMALLRLCQLKSLP 134 (143)
T ss_dssp GS-SG--CTTTHHHHHHHHHHS-HHHHHHHHHHHHHHHHHTTTSSSHHHHHHHHHHHHHHTC---
T ss_pred hhhcccCCHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHhccCCChHHHHHHHHHHHHHHhhcc
Confidence 2345555555544433333333344455555666655555555666677777777766654
No 37
>PF05494 Tol_Tol_Ttg2: Toluene tolerance, Ttg2; InterPro: IPR008869 Toluene tolerance is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation. Many proteins are involved in these processes. This family is a transporter which shows similarity to ABC transporters.; PDB: 2QGU_A.
Probab=22.61 E-value=77 Score=27.50 Aligned_cols=30 Identities=17% Similarity=0.453 Sum_probs=22.0
Q ss_pred hHHHHHHHhccCCCCCChhhhhhhhHHhhh
Q 019122 59 ITKEIRKEIGSTYKNLPPEEKCRYKSEAKR 88 (346)
Q Consensus 59 ~~~~~~k~~~~~~~sl~~~~k~~~~~~a~~ 88 (346)
...-++-++|.-|+.+|++++..|.++-++
T Consensus 40 ~~~~ar~~LG~~w~~~s~~q~~~F~~~f~~ 69 (170)
T PF05494_consen 40 FERMARRVLGRYWRKASPAQRQRFVEAFKQ 69 (170)
T ss_dssp HHHHHHHHHGGGTTTS-HHHHHHHHHHHHH
T ss_pred HHHHHHHHHHHhHhhCCHHHHHHHHHHHHH
Confidence 334556677988999999999998666555
No 38
>PF10234 Cluap1: Clusterin-associated protein-1; InterPro: IPR019366 This protein of 413 amino acids contains a central coiled-coil domain, possibly the region that binds to clusterin. Cluap1 expression is highest in the nucleus and gradually increases during late S to G2/M phases of the cell cycle and returns to the basal level in the G0/G1 phases. In addition, it is upregulated in colon cancer tissues compared to corresponding non-cancerous mucosa. It thus plays a crucial role in the life of the cell.
Probab=21.71 E-value=36 Score=32.43 Aligned_cols=31 Identities=23% Similarity=0.493 Sum_probs=24.0
Q ss_pred HcCccccccccccccc-----hHHHHHHHhcccccc
Q 019122 125 EMGFGSLLQLNCGRLK-----RNLCGWLVEKIDIAR 155 (346)
Q Consensus 125 ~~GFg~LL~i~~~~l~-----~~L~~wL~~~~d~~t 155 (346)
.+||.-++.|....-| .+++.||+.+|||+.
T Consensus 2 ~LGypr~iSmenFrtPNF~LVAeiL~WLv~rydP~~ 37 (267)
T PF10234_consen 2 ALGYPRLISMENFRTPNFELVAEILRWLVKRYDPDA 37 (267)
T ss_pred CCCCCCCCcHHHcCCCChHHHHHHHHHHHHHcCCCC
Confidence 5799999888764433 467889999999964
No 39
>PF13875 DUF4202: Domain of unknown function (DUF4202)
Probab=20.38 E-value=67 Score=28.98 Aligned_cols=39 Identities=13% Similarity=0.227 Sum_probs=32.4
Q ss_pred chhhhhHHHHHHHHhcCCCcchHHHHHHHhccCCCCCChhhhh
Q 019122 38 GFIRYFGEVVRQIKANDGMACITKEIRKEIGSTYKNLPPEEKC 80 (346)
Q Consensus 38 ~~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~~~~~sl~~~~k~ 80 (346)
+=++|+++....|.+.|.+.|++.-+.| +|.-||++-+.
T Consensus 131 acLVFL~~~f~~F~~~~deeK~v~Il~K----Tw~KMS~~g~~ 169 (185)
T PF13875_consen 131 ACLVFLEYYFEDFAAKHDEEKIVDILRK----TWRKMSERGHE 169 (185)
T ss_pred HHHHhHHHHHHHHHhcCCHHHHHHHHHH----HHHHCCHHHHH
Confidence 3689999999999999988888877766 68889998554
No 40
>PF03457 HA: Helicase associated domain; InterPro: IPR005114 This short domain is found in multiple copies in bacterial helicase proteins. The domain is predicted to contain 3 alpha helices. The function of this domain may be to bind nucleic acid.; PDB: 2KTA_A.
Probab=20.36 E-value=62 Score=23.51 Aligned_cols=16 Identities=31% Similarity=0.409 Sum_probs=11.3
Q ss_pred hhCCHHHHHHHHHcCc
Q 019122 113 LQLTDEQRVVVREMGF 128 (346)
Q Consensus 113 ~~Ls~~qk~~I~~~GF 128 (346)
..|+++|.+.++++||
T Consensus 52 g~L~~er~~~L~~lg~ 67 (68)
T PF03457_consen 52 GKLTPERIERLDALGF 67 (68)
T ss_dssp T---HHHHHHHHHHT-
T ss_pred CCCCHHHHHHHHcCCC
Confidence 4599999999999998
No 41
>PF12650 DUF3784: Domain of unknown function (DUF3784); InterPro: IPR017259 This group represents an uncharacterised conserved protein.
Probab=20.22 E-value=64 Score=25.41 Aligned_cols=17 Identities=29% Similarity=0.616 Sum_probs=13.4
Q ss_pred cCCCCCChhhhhhhhHH
Q 019122 69 STYKNLPPEEKCRYKSE 85 (346)
Q Consensus 69 ~~~~sl~~~~k~~~~~~ 85 (346)
.-++.||+|||+.+.++
T Consensus 24 aGyntms~eEk~~~D~~ 40 (97)
T PF12650_consen 24 AGYNTMSKEEKEKYDKK 40 (97)
T ss_pred hhcccCCHHHHHHhhHH
Confidence 56899999999877443
Done!