Query 021140
Match_columns 317
No_of_seqs 221 out of 415
Neff 6.9
Searched_HMMs 46136
Date Fri Mar 29 07:54:45 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/021140.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/021140hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF10536 PMD: Plant mobile dom 100.0 2.8E-33 6.2E-38 270.7 7.4 182 133-314 1-196 (363)
2 PTZ00199 high mobility group p 99.9 8.5E-23 1.9E-27 161.4 5.3 76 22-97 13-89 (94)
3 COG5648 NHP6B Chromatin-associ 99.8 9E-20 2E-24 160.1 3.0 76 22-98 61-136 (211)
4 cd01390 HMGB-UBF_HMG-box HMGB- 99.7 5.1E-18 1.1E-22 124.5 4.3 65 32-97 1-65 (66)
5 cd01389 MATA_HMG-box MATA_HMG- 99.7 2E-17 4.4E-22 125.8 4.5 66 31-97 1-66 (77)
6 PF09011 HMG_box_2: HMG-box do 99.7 1.2E-17 2.7E-22 125.7 2.2 68 29-97 1-69 (73)
7 KOG0526 Nucleosome-binding fac 99.7 1.1E-17 2.4E-22 163.2 1.3 73 21-98 525-597 (615)
8 cd01388 SOX-TCF_HMG-box SOX-TC 99.7 9.5E-17 2.1E-21 120.5 5.1 65 32-97 2-66 (72)
9 KOG0381 HMG box-containing pro 99.6 1.9E-16 4.1E-21 125.1 5.7 70 28-98 17-88 (96)
10 smart00398 HMG high mobility g 99.6 1.5E-16 3.2E-21 117.5 4.0 66 31-97 1-66 (70)
11 PF00505 HMG_box: HMG (high mo 99.6 2.2E-16 4.8E-21 116.8 4.3 64 32-96 1-64 (69)
12 cd00084 HMG-box High Mobility 99.6 5.4E-16 1.2E-20 113.2 4.4 64 32-96 1-64 (66)
13 KOG0527 HMG-box transcription 99.1 5.9E-11 1.3E-15 113.0 3.1 71 26-97 57-127 (331)
14 PF09331 DUF1985: Domain of un 98.9 2.8E-09 6E-14 90.4 6.2 123 162-284 14-142 (142)
15 KOG4715 SWI/SNF-related matrix 98.7 9.9E-09 2.1E-13 95.3 3.8 74 24-98 57-130 (410)
16 KOG3248 Transcription factor T 98.5 6.5E-08 1.4E-12 90.6 4.6 59 33-92 193-251 (421)
17 KOG0528 HMG-box transcription 98.1 1E-06 2.2E-11 86.4 2.4 61 33-94 327-387 (511)
18 PF06382 DUF1074: Protein of u 97.3 0.00043 9.2E-09 60.1 5.7 46 37-87 84-129 (183)
19 PF14887 HMG_box_5: HMG (high 97.3 0.00015 3.2E-09 54.5 2.0 64 32-97 4-67 (85)
20 KOG2746 HMG-box transcription 96.8 0.00057 1.2E-08 69.8 1.6 62 33-95 183-246 (683)
21 COG5648 NHP6B Chromatin-associ 96.7 0.00066 1.4E-08 60.5 1.3 68 29-97 141-208 (211)
22 PF04690 YABBY: YABBY protein; 96.6 0.0036 7.7E-08 54.5 4.9 49 26-75 116-164 (170)
23 PF08073 CHDNT: CHDNT (NUC034) 93.7 0.088 1.9E-06 37.3 3.3 39 36-75 13-51 (55)
24 PF03078 ATHILA: ATHILA ORF-1 91.3 5 0.00011 40.5 13.5 189 69-283 41-262 (458)
25 PF06244 DUF1014: Protein of u 87.3 0.89 1.9E-05 37.6 4.1 49 26-75 67-115 (122)
26 KOG2062 26S proteasome regulat 65.4 4.6 0.0001 42.8 2.5 147 122-309 569-731 (929)
27 PF06945 DUF1289: Protein of u 53.2 10 0.00022 26.3 1.8 23 69-91 28-50 (51)
28 PRK15117 ABC transporter perip 53.0 21 0.00045 32.2 4.3 33 54-86 65-97 (211)
29 PF11304 DUF3106: Protein of u 52.1 20 0.00044 28.7 3.7 23 65-87 14-36 (107)
30 PF04769 MAT_Alpha1: Mating-ty 50.1 36 0.00079 30.5 5.3 44 25-73 37-80 (201)
31 PF12650 DUF3784: Domain of un 48.0 11 0.00024 29.4 1.5 18 70-87 25-42 (97)
32 PF05494 Tol_Tol_Ttg2: Toluene 46.0 16 0.00035 31.3 2.4 34 54-87 35-68 (170)
33 PF12169 DNA_pol3_gamma3: DNA 43.5 48 0.001 27.1 4.8 59 253-311 76-134 (143)
34 PF11304 DUF3106: Protein of u 41.6 47 0.001 26.6 4.3 22 66-87 33-54 (107)
35 TIGR03481 HpnM hopanoid biosyn 39.1 23 0.00049 31.6 2.3 31 56-87 63-94 (198)
36 KOG0493 Transcription factor E 35.8 51 0.0011 30.8 4.0 63 22-92 235-304 (342)
37 PF10234 Cluap1: Clusterin-ass 35.1 20 0.00044 33.6 1.3 32 131-162 2-38 (267)
38 KOG3223 Uncharacterized conser 30.5 32 0.00068 30.7 1.7 49 27-76 160-208 (221)
39 PRK10236 hypothetical protein; 29.6 36 0.00078 31.3 2.0 26 63-88 118-143 (237)
40 PF04994 TfoX_C: TfoX C-termin 26.3 73 0.0016 24.2 2.9 34 48-82 40-77 (81)
41 PF03457 HA: Helicase associat 25.1 49 0.0011 23.7 1.7 16 119-134 52-67 (68)
42 PF05914 RIB43A: RIB43A; Inte 22.6 29 0.00064 34.2 0.1 38 47-84 249-287 (379)
43 cd09071 FAR_C C-terminal domai 22.6 79 0.0017 23.8 2.5 21 266-287 70-90 (92)
44 cd07321 Extradiol_Dioxygenase_ 22.0 1E+02 0.0022 23.1 3.0 32 120-154 34-65 (77)
No 1
>PF10536 PMD: Plant mobile domain; InterPro: IPR019557 This entry represents a domain found in a variety of transposases.
Probab=99.98 E-value=2.8e-33 Score=270.72 Aligned_cols=182 Identities=22% Similarity=0.349 Sum_probs=158.1
Q ss_pred Ccchhhccc--cccccHHHHHHHHhcccCCcceEEEcCEEeeeChhhhhhhhcccCCCcccccCCCh---hHHHHHHHHh
Q 021140 133 GFESLLELR--CGKLKRKLCHWLVNQFKPERNIIELHGQKLELCPKMFSKIMGVKDGGMAIKINGAS---DHIAEVRRIF 207 (317)
Q Consensus 133 GFg~LL~i~--~~~l~~~L~~wL~~~~d~~t~~~~l~g~~i~it~~dV~~VLGLP~gG~~v~~~~~~---~~~~~l~~~~ 207 (317)
|||+|+.|. ..++++.|+.+|+++|+++|++|++++++++||++||.+|+|||+.|.+|...... +.++++.+..
T Consensus 1 ~~g~~~~i~~s~~~~~~~li~al~erW~~et~tF~~~~gEmtiTL~DV~~llGLpi~G~pv~~~~~~~~~~~~~~ll~~~ 80 (363)
T PF10536_consen 1 GFGILDAIMASRITIDRSLISALVERWDPETNTFHFPWGEMTITLEDVAMLLGLPIDGRPVTGPLPPDWRDLCEELLGVS 80 (363)
T ss_pred CchhHhhhhhhcCCCCHHHHHHHHHHhCcccCeeecccccccchhhhhhhccccccccccccCccccchhhHHHHHhccc
Confidence 899999999 89999999999999999999999999999999999999999999999999875432 3344443332
Q ss_pred CC----CCCCcchHHHHHHHhhcccc-cchhhhhHHhhhhcceecccCCC--CCcccccccccccccccccchHHHHHHH
Q 021140 208 QP----TVKGIRIRTLEEVIEQLDEA-NKIFKVAFTLFAIATLLCPIGSY--ISTLFLHPIMDVSSIKSLNWATFCYDWL 280 (317)
Q Consensus 208 ~~----~~~~i~l~~L~~~l~~~~~~-~d~f~r~Fll~~i~~~L~Ptts~--vs~~yl~~l~D~~~i~~yNW~~~Vld~L 280 (317)
.. .+..+.++++++.+.+.+++ .+.+.||||++.+|++|||+++. |+..|++++.|++.+++||||.+||++|
T Consensus 81 ~~~~~~~~~~~~~~wl~~~~~~~~~~d~~~~~rAFll~~lg~~lfp~~~~~~v~~~~l~~~~~l~~~~~~~wg~a~La~l 160 (363)
T PF10536_consen 81 PQIKSKKGSSIRLSWLEEFFSNRPEDDEEQYHRAFLLYWLGSFLFPDKSGDYVSPRYLPLAVDLARIKRYAWGSAVLAYL 160 (363)
T ss_pred ccccccccccchhhheeccccccccchHHHHHHHHHHHhhhceeccCCCcceeeeeEEeeeeccccccccccHHHHHHHH
Confidence 21 24566788999888544333 24899999999999999999877 9999999999999999999999999999
Q ss_pred HHHHHHHhhcC--CccccccHHHHHHHHHhcCCCcc
Q 021140 281 VKSICRFQNQQ--AAYIGGCLHFLQVRPLLQLKLSI 314 (317)
Q Consensus 281 ~~~i~k~~~~~--~~~i~GCl~lLqi~Yld~l~~~~ 314 (317)
+++|++++.+. ..+++||+.|||+|+||||+++.
T Consensus 161 y~~L~~~~~~~~~~~~~~g~~~llq~W~werf~~~r 196 (363)
T PF10536_consen 161 YRDLCKASRKSASQSNIGGPLWLLQLWAWERFPVGR 196 (363)
T ss_pred HHHHHHHhhhcccccccccceeeeccchhheeeccc
Confidence 99999998877 78999999999999999999763
No 2
>PTZ00199 high mobility group protein; Provisional
Probab=99.87 E-value=8.5e-23 Score=161.37 Aligned_cols=76 Identities=18% Similarity=0.314 Sum_probs=71.1
Q ss_pred hhhcCCCCCCCCCCchhhHHHHHHHHHHHHhhCCCCc-hhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCCC
Q 021140 22 ILRANGQDAVPTSSTYGFVSYFNEEVKRLRSENSDLS-VTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS 97 (317)
Q Consensus 22 ~~~~~~~~~~pkr~~~~~~~f~~~~r~~~~~~~p~~~-~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~k 97 (317)
+.|+.+|||+||||+||||+|++++|.+++++||+++ .+++|++++|++|++||++||++|+++|++++.+|..++
T Consensus 13 ~~k~~kdp~~PKrP~sAY~~F~~~~R~~i~~~~P~~~~~~~evsk~ige~Wk~ls~eeK~~y~~~A~~dk~rY~~e~ 89 (94)
T PTZ00199 13 NKRKKKDPNAPKRALSAYMFFAKEKRAEIIAENPELAKDVAAVGKMVGEAWNKLSEEEKAPYEKKAQEDKVRYEKEK 89 (94)
T ss_pred cCCCCCCCCCCCCCCcHHHHHHHHHHHHHHHHCcCCcccHHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHH
Confidence 4577899999999999999999999999999999986 238999999999999999999999999999999998775
No 3
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=99.77 E-value=9e-20 Score=160.09 Aligned_cols=76 Identities=21% Similarity=0.312 Sum_probs=73.3
Q ss_pred hhhcCCCCCCCCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCCCC
Q 021140 22 ILRANGQDAVPTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNSH 98 (317)
Q Consensus 22 ~~~~~~~~~~pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~k~ 98 (317)
.+|+++|||+||||.||||+|+++.|.+++++||+++| ++|||.+|++||+|+|+||+||.+.|..++.||..+++
T Consensus 61 ~~r~k~dpN~PKRp~sayf~y~~~~R~ei~~~~p~l~~-~e~~k~~~e~WK~Ltd~eke~y~k~~~~~~erYq~ek~ 136 (211)
T COG5648 61 LVRKKKDPNGPKRPLSAYFLYSAENRDEIRKENPKLTF-GEVGKLLSEKWKELTDEEKEPYYKEANSDRERYQREKE 136 (211)
T ss_pred HHHHhcCCCCCCCchhHHHHHHHHHHHHHHHhCCCCCh-HHHHHHHHHHHHhccHhhhhhHHHHHhhHHHHHHHHHH
Confidence 67889999999999999999999999999999999999 99999999999999999999999999999999998863
No 4
>cd01390 HMGB-UBF_HMG-box HMGB-UBF_HMG-box, class II and III members of the HMG-box superfamily of DNA-binding proteins. These proteins bind the minor groove of DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions.
Probab=99.71 E-value=5.1e-18 Score=124.46 Aligned_cols=65 Identities=26% Similarity=0.458 Sum_probs=62.7
Q ss_pred CCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCCC
Q 021140 32 PTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS 97 (317)
Q Consensus 32 pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~k 97 (317)
||||+||||+|++++|..++++||++++ .+|++.+|+.|++||++||++|.++|++++.+|..++
T Consensus 1 Pkrp~saf~~f~~~~r~~~~~~~p~~~~-~~i~~~~~~~W~~ls~~eK~~y~~~a~~~~~~y~~e~ 65 (66)
T cd01390 1 PKRPLSAYFLFSQEQRPKLKKENPDASV-TEVTKILGEKWKELSEEEKKKYEEKAEKDKERYEKEM 65 (66)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCcCCCH-HHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHhh
Confidence 8999999999999999999999999988 9999999999999999999999999999999998653
No 5
>cd01389 MATA_HMG-box MATA_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include the fungal mating type gene products MC, MATA1 and Ste11.
Probab=99.69 E-value=2e-17 Score=125.76 Aligned_cols=66 Identities=23% Similarity=0.350 Sum_probs=63.7
Q ss_pred CCCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCCC
Q 021140 31 VPTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS 97 (317)
Q Consensus 31 ~pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~k 97 (317)
.||||++|||+|++++|.+++++||+.++ .+|++++|+.|+.||++||++|.++|++++.+|..+.
T Consensus 1 ~~kRP~naf~lf~~~~r~~~~~~~p~~~~-~eisk~~g~~Wk~ls~eeK~~y~~~A~~~k~~~~~~~ 66 (77)
T cd01389 1 KIPRPRNAFILYRQDKHAQLKTENPGLTN-NEISRIIGRMWRSESPEVKAYYKELAEEEKERHAREY 66 (77)
T ss_pred CCCCCCcHHHHHHHHHHHHHHHHCCCCCH-HHHHHHHHHHHhhCCHHHHHHHHHHHHHHHHHHHHHC
Confidence 38999999999999999999999999998 9999999999999999999999999999999999776
No 6
>PF09011 HMG_box_2: HMG-box domain; InterPro: IPR015101 This domain is predominantly found in Maelstrom homologue proteins. It has no known function. ; GO: 0005634 nucleus; PDB: 2EQZ_A 1V64_A 2CTO_A 1H5P_A 3TQ6_A 3FGH_A 3TMM_A 1J3X_A 2YRQ_A 1AAB_A ....
Probab=99.68 E-value=1.2e-17 Score=125.68 Aligned_cols=68 Identities=19% Similarity=0.293 Sum_probs=59.8
Q ss_pred CCCCCCCchhhHHHHHHHHHHHHhh-CCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCCC
Q 021140 29 DAVPTSSTYGFVSYFNEEVKRLRSE-NSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS 97 (317)
Q Consensus 29 ~~~pkr~~~~~~~f~~~~r~~~~~~-~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~k 97 (317)
||+||||+|||++|+.+++.+++++ ++...+ .++.+.+|+.|++||++||++|.++|++++.+|+.++
T Consensus 1 p~kpK~~~say~lF~~~~~~~~k~~G~~~~~~-~e~~k~~~~~Wk~Ls~~EK~~Y~~~A~~~k~~y~~e~ 69 (73)
T PF09011_consen 1 PKKPKRPPSAYNLFMKEMRKEVKEEGGQKQSF-REVMKEISERWKSLSEEEKEPYEERAKEDKERYEREM 69 (73)
T ss_dssp SSS--SSSSHHHHHHHHHHHHHHHHT-T-SSH-HHHHHHHHHHHHHS-HHHHHHHHHHHHHHHHHHHHHH
T ss_pred CcCCCCCCCHHHHHHHHHHHHHHHhcccCCCH-HHHHHHHHHHHHhcCHHHHHHHHHHHHHHHHHHHHHH
Confidence 8999999999999999999999999 887778 8999999999999999999999999999999998654
No 7
>KOG0526 consensus Nucleosome-binding factor SPN, POB3 subunit [Transcription; Replication, recombination and repair; Chromatin structure and dynamics]
Probab=99.67 E-value=1.1e-17 Score=163.24 Aligned_cols=73 Identities=12% Similarity=0.272 Sum_probs=69.9
Q ss_pred chhhcCCCCCCCCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCCCC
Q 021140 21 PILRANGQDAVPTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNSH 98 (317)
Q Consensus 21 ~~~~~~~~~~~pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~k~ 98 (317)
++.|++|||||||||.||||+|+|..|..+|++ +.++ ++|+|.+|++||+||. |++|+++|+.+|.||+.++.
T Consensus 525 k~~kk~kdpnapkra~sa~m~w~~~~r~~ik~d--gi~~-~dv~kk~g~~wk~ms~--k~~we~ka~~dk~ry~~em~ 597 (615)
T KOG0526|consen 525 KKGKKKKDPNAPKRATSAYMLWLNASRESIKED--GISV-GDVAKKAGEKWKQMSA--KEEWEDKAAVDKQRYEDEMK 597 (615)
T ss_pred cCcccCCCCCCCccchhHHHHHHHhhhhhHhhc--CchH-HHHHHHHhHHHhhhcc--cchhhHHHHHHHHHHHHHHH
Confidence 477899999999999999999999999999999 9988 9999999999999998 99999999999999999884
No 8
>cd01388 SOX-TCF_HMG-box SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif.
Probab=99.66 E-value=9.5e-17 Score=120.53 Aligned_cols=65 Identities=14% Similarity=0.192 Sum_probs=62.3
Q ss_pred CCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCCC
Q 021140 32 PTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS 97 (317)
Q Consensus 32 pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~k 97 (317)
-|||+||||+|++++|.+++++||++++ .+|+|.+|++|+.||++||++|.++|+.++++|..+.
T Consensus 2 iKrP~naf~~F~~~~r~~~~~~~p~~~~-~eisk~l~~~Wk~ls~~eK~~y~~~a~~~k~~y~~~~ 66 (72)
T cd01388 2 IKRPMNAFMLFSKRHRRKVLQEYPLKEN-RAISKILGDRWKALSNEEKQPYYEEAKKLKELHMKLY 66 (72)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCCCCCH-HHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHC
Confidence 4899999999999999999999999998 8999999999999999999999999999999998665
No 9
>KOG0381 consensus HMG box-containing protein [General function prediction only]
Probab=99.64 E-value=1.9e-16 Score=125.05 Aligned_cols=70 Identities=21% Similarity=0.404 Sum_probs=66.6
Q ss_pred CCC--CCCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCCCC
Q 021140 28 QDA--VPTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNSH 98 (317)
Q Consensus 28 ~~~--~pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~k~ 98 (317)
||+ +||||+||||+|++++|..++++||++++ .+|+|++|+.|+.|+++||.+|..++..++.+|..++.
T Consensus 17 ~p~~~~pkrp~sa~~~f~~~~~~~~k~~~p~~~~-~~v~k~~g~~W~~l~~~~k~~y~~ka~~~k~~Y~~~~~ 88 (96)
T KOG0381|consen 17 DPNAQAPKRPLSAFFLFSSEQRSKIKAENPGLSV-GEVAKALGEMWKNLAEEEKQPYEEKASKLKEKYEKELA 88 (96)
T ss_pred CCCCCCCCCCCcHHHHHHHHHHHHHHHhCCCCCH-HHHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHH
Confidence 885 99999999999999999999999999888 99999999999999999999999999999999987653
No 10
>smart00398 HMG high mobility group.
Probab=99.64 E-value=1.5e-16 Score=117.51 Aligned_cols=66 Identities=24% Similarity=0.428 Sum_probs=63.3
Q ss_pred CCCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCCC
Q 021140 31 VPTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS 97 (317)
Q Consensus 31 ~pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~k 97 (317)
+||||+|||++|++++|..++++||++++ ++|++.+|+.|+.||++||++|.++|++++.+|..+.
T Consensus 1 ~pkrp~~~y~~f~~~~r~~~~~~~~~~~~-~~i~~~~~~~W~~l~~~ek~~y~~~a~~~~~~y~~~~ 66 (70)
T smart00398 1 KPKRPMSAFMLFSQENRAKIKAENPDLSN-AEISKKLGERWKLLSEEEKAPYEEKAKKDKERYEEEM 66 (70)
T ss_pred CcCCCCcHHHHHHHHHHHHHHHHCcCCCH-HHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHHHHHH
Confidence 69999999999999999999999999998 9999999999999999999999999999999998654
No 11
>PF00505 HMG_box: HMG (high mobility group) box; InterPro: IPR000910 High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair. The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes; the SOX family of transcription factors; SRY sex determining region Y protein and related proteins; LEF1 lymphoid enhancer binding factor 1; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.; GO: 0003677 DNA binding; PDB: 1I11_A 1J3C_A 1J3D_A 1WZ6_A 1WGF_A 2D7L_A 1GT0_D 3U2B_C 2CRJ_A 2CS1_A ....
Probab=99.63 E-value=2.2e-16 Score=116.81 Aligned_cols=64 Identities=28% Similarity=0.449 Sum_probs=59.9
Q ss_pred CCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCC
Q 021140 32 PTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSN 96 (317)
Q Consensus 32 pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~ 96 (317)
||||+|||++|++++|..++++||+.+. .+|++.+|+.|++||++||++|.+.|++.+.+|..+
T Consensus 1 PkrP~~af~lf~~~~~~~~k~~~p~~~~-~~i~~~~~~~W~~l~~~eK~~y~~~a~~~~~~y~~~ 64 (69)
T PF00505_consen 1 PKRPPNAFMLFCKEKRAKLKEENPDLSN-KEISKILAQMWKNLSEEEKAPYKEEAEEEKERYEKE 64 (69)
T ss_dssp SSSS--HHHHHHHHHHHHHHHHSTTSTH-HHHHHHHHHHHHCSHHHHHHHHHHHHHHHHHHHHHH
T ss_pred CcCCCCHHHHHHHHHHHHHHHHhccccc-ccchhhHHHHHhcCCHHHHHHHHHHHHHHHHHHHHH
Confidence 8999999999999999999999999997 999999999999999999999999999999998754
No 12
>cd00084 HMG-box High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III member
Probab=99.61 E-value=5.4e-16 Score=113.20 Aligned_cols=64 Identities=23% Similarity=0.420 Sum_probs=61.6
Q ss_pred CCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCC
Q 021140 32 PTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSN 96 (317)
Q Consensus 32 pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~ 96 (317)
||||+|||+.|++++|..++++||+.+. .+|.+.+|++|+.||++||++|.++|++.+.+|..+
T Consensus 1 pkrp~~af~~f~~~~~~~~~~~~~~~~~-~~i~~~~~~~W~~l~~~~k~~y~~~a~~~~~~y~~~ 64 (66)
T cd00084 1 PKRPLSAYFLFSQEHRAEVKAENPGLSV-GEISKILGEMWKSLSEEEKKKYEEKAEKDKERYEKE 64 (66)
T ss_pred CCCCCcHHHHHHHHHHHHHHHHCcCCCH-HHHHHHHHHHHHhCCHHHHHHHHHHHHHHHHHHHHh
Confidence 8999999999999999999999999988 899999999999999999999999999999998754
No 13
>KOG0527 consensus HMG-box transcription factor [Transcription]
Probab=99.07 E-value=5.9e-11 Score=112.98 Aligned_cols=71 Identities=15% Similarity=0.261 Sum_probs=64.9
Q ss_pred CCCCCCCCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCCC
Q 021140 26 NGQDAVPTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS 97 (317)
Q Consensus 26 ~~~~~~pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~k 97 (317)
++....=|||..|||++.+.+|..+-++||+..- +||.|.+|+.||.|+|+||.||.+.|++.+..|.++=
T Consensus 57 k~~~~hIKRPMNAFMVWSq~~RRkma~qnP~mHN-SEISK~LG~~WK~Lse~EKrPFi~EAeRLR~~Hmkeh 127 (331)
T KOG0527|consen 57 KTSTDRIKRPMNAFMVWSQGQRRKLAKQNPKMHN-SEISKRLGAEWKLLSEEEKRPFVDEAERLRAQHMKEY 127 (331)
T ss_pred CCCccccCCCcchhhhhhHHHHHHHHHhCcchhh-HHHHHHHHHHHhhcCHhhhccHHHHHHHHHHHHHHhC
Confidence 4444555999999999999999999999999966 9999999999999999999999999999999988763
No 14
>PF09331 DUF1985: Domain of unknown function (DUF1985); InterPro: IPR015410 This domain is functionally uncharacterised; it is found in a set of Arabidopsis thaliana (Mouse-ear cress) hypothetical proteins.
Probab=98.89 E-value=2.8e-09 Score=90.43 Aligned_cols=123 Identities=14% Similarity=0.265 Sum_probs=90.5
Q ss_pred ceEEEcCEEeeeChhhhhhhhcccCCCcccccCCChh---HHHHHHHHhCCCCCCcchHHHHHHHhhc--ccccchhhhh
Q 021140 162 NIIELHGQKLELCPKMFSKIMGVKDGGMAIKINGASD---HIAEVRRIFQPTVKGIRIRTLEEVIEQL--DEANKIFKVA 236 (317)
Q Consensus 162 ~~~~l~g~~i~it~~dV~~VLGLP~gG~~v~~~~~~~---~~~~l~~~~~~~~~~i~l~~L~~~l~~~--~~~~d~f~r~ 236 (317)
..|.++|..|.++..+.+.|+|||++..+-....... ....+.+.+-..+..+++.++.++|... .+.++.+.-+
T Consensus 14 ~W~~~~g~piRfsl~Ef~lvTGL~C~~~p~~~~~~~~~~~~~~~fw~~Lf~~~~~vtv~dv~~~L~~~~~~~~~~Rlrla 93 (142)
T PF09331_consen 14 IWFVFNGVPIRFSLREFALVTGLNCGPYPKEKKVDKKGKKEKGSFWNKLFGREEDVTVEDVIAKLKKMKKWDSEDRLRLA 93 (142)
T ss_pred EEEEECCEeeEecHHHHHhhcCCcCCCCCcccchhhccccchhhhhhhhccccccCcHHHHHHHHhhcccCChhhHHHHH
Confidence 4678899999999999999999999887766542111 0113333322345679999999999865 2234445555
Q ss_pred HHhhhhcceecccCCC-CCcccccccccccccccccchHHHHHHHHHHH
Q 021140 237 FTLFAIATLLCPIGSY-ISTLFLHPIMDVSSIKSLNWATFCYDWLVKSI 284 (317)
Q Consensus 237 Fll~~i~~~L~Ptts~-vs~~yl~~l~D~~~i~~yNW~~~Vld~L~~~i 284 (317)
+++++.|.+++++.+. |+..++..+.|++.+.+|-||.+.++.++++|
T Consensus 94 ~L~~v~gvl~~~~~~~~i~~~~~~~v~Dl~~f~~yPWGr~sF~~~~~sI 142 (142)
T PF09331_consen 94 LLLFVDGVLIATSKTTKIPKEHLKMVDDLEKFLNYPWGRYSFDMLMKSI 142 (142)
T ss_pred HHHhhheeeeccCCCCCCCHHHHHHHhhHHHHhcCCcHHHHHHHHHhcC
Confidence 5555555555555556 99999999999999999999999999999874
No 15
>KOG4715 consensus SWI/SNF-related matrix-associated actin-dependent regulator of chromatin [Chromatin structure and dynamics]
Probab=98.70 E-value=9.9e-09 Score=95.27 Aligned_cols=74 Identities=19% Similarity=0.290 Sum_probs=70.1
Q ss_pred hcCCCCCCCCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCCCC
Q 021140 24 RANGQDAVPTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNSH 98 (317)
Q Consensus 24 ~~~~~~~~pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~k~ 98 (317)
+..+-|.+|-+|+-.||-|+...+.++|++||+++. =+|||++|..|+-|+|+||..|+.-.+..|.+|+..++
T Consensus 57 t~pkpPkppekpl~pymrySrkvWd~VkA~nPe~kL-WeiGK~Ig~mW~dLpd~EK~ey~~EYeaEKieY~~smk 130 (410)
T KOG4715|consen 57 TRPKPPKPPEKPLMPYMRYSRKVWDQVKASNPELKL-WEIGKIIGGMWLDLPDEEKQEYLNEYEAEKIEYNESMK 130 (410)
T ss_pred cCCCCCCCCCcccchhhHHhhhhhhhhhccCcchHH-HHHHHHHHHHHhhCcchHHHHHHHHHHHHHHHHHHHHH
Confidence 447789999999999999999999999999999999 89999999999999999999999999999999998873
No 16
>KOG3248 consensus Transcription factor TCF-4 [Transcription]
Probab=98.55 E-value=6.5e-08 Score=90.64 Aligned_cols=59 Identities=17% Similarity=0.350 Sum_probs=53.7
Q ss_pred CCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccC
Q 021140 33 TSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNS 92 (317)
Q Consensus 33 kr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~ 92 (317)
|.|+.|||.||+|+|+.+-+|-- ++-.++|-+++|.+|-+||-||.++|-+.|++++.-
T Consensus 193 KKPLNAFmlyMKEmRa~vvaEct-lKeSAaiNqiLGrRWH~LSrEEQAKYyElArKerql 251 (421)
T KOG3248|consen 193 KKPLNAFMLYMKEMRAKVVAECT-LKESAAINQILGRRWHALSREEQAKYYELARKERQL 251 (421)
T ss_pred cccHHHHHHHHHHHHHHHHHHhh-hhhHHHHHHHHhHHHhhhhHHHHHHHHHHHHHHHHH
Confidence 88999999999999999999986 544499999999999999999999999999988653
No 17
>KOG0528 consensus HMG-box transcription factor SOX5 [Transcription]
Probab=98.14 E-value=1e-06 Score=86.41 Aligned_cols=61 Identities=15% Similarity=0.312 Sum_probs=52.6
Q ss_pred CCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCC
Q 021140 33 TSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGN 94 (317)
Q Consensus 33 kr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~ 94 (317)
|||.+|||++.+|.|..+-...||.-- ..+.|++|.+||+||..||+||-+--....+.|-
T Consensus 327 KRPMNAFMVWAkDERRKILqA~PDMHN-SnISKILGSRWKaMSN~eKQPYYEEQaRLSk~Hl 387 (511)
T KOG0528|consen 327 KRPMNAFMVWAKDERRKILQAFPDMHN-SNISKILGSRWKAMSNTEKQPYYEEQARLSKLHL 387 (511)
T ss_pred cCCcchhhcccchhhhhhhhcCccccc-cchhHHhcccccccccccccchHHHHHHHHHhhh
Confidence 999999999999999999999999977 7899999999999999999988444333444444
No 18
>PF06382 DUF1074: Protein of unknown function (DUF1074); InterPro: IPR024460 This family consists of several proteins which appear to be specific to Insecta. The function of this family is unknown.
Probab=97.32 E-value=0.00043 Score=60.10 Aligned_cols=46 Identities=17% Similarity=0.382 Sum_probs=41.2
Q ss_pred hhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhh
Q 021140 37 YGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDE 87 (317)
Q Consensus 37 ~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~ 87 (317)
.||+.|+.|||. .|.+++. .++-..++..|..||++||.+|..++.
T Consensus 84 naYLNFLReFRr----kh~~L~p-~dlI~~AAraW~rLSe~eK~rYrr~~~ 129 (183)
T PF06382_consen 84 NAYLNFLREFRR----KHCGLSP-QDLIQRAARAWCRLSEAEKNRYRRMAP 129 (183)
T ss_pred hHHHHHHHHHHH----HccCCCH-HHHHHHHHHHHHhCCHHHHHHHHhhcc
Confidence 699999999886 5889998 799999999999999999999988654
No 19
>PF14887 HMG_box_5: HMG (high mobility group) box 5; PDB: 1L8Y_A 1L8Z_A 2HDZ_A.
Probab=97.27 E-value=0.00015 Score=54.51 Aligned_cols=64 Identities=6% Similarity=0.066 Sum_probs=52.7
Q ss_pred CCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCCC
Q 021140 32 PTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS 97 (317)
Q Consensus 32 pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~k 97 (317)
|..|-+|==.+.+.-+..|-+.+|+... .+ -++.+.+|+.|++.||.++.++|.++.++|+.++
T Consensus 4 PE~PKt~qe~Wqq~vi~dYla~~~~dr~-K~-~kam~~~W~~me~Kekl~WIkKA~EdqKrYE~el 67 (85)
T PF14887_consen 4 PETPKTAQEIWQQSVIGDYLAKFRNDRK-KA-LKAMEAQWSQMEKKEKLKWIKKAAEDQKRYEREL 67 (85)
T ss_dssp S----THHHHHHHHHHHHHHHHTTSTHH-HH-HHHHHHHHHTTGGGHHHHHHHHHHHHHHHHHHHH
T ss_pred CCCCCCHHHHHHHHHHHHHHHHhhHhHH-HH-HHHHHHHHHHhhhhhhhHHHHHHHHHHHHHHHHH
Confidence 4444455567788999999999999987 44 6699999999999999999999999999999775
No 20
>KOG2746 consensus HMG-box transcription factor Capicua and related proteins [Transcription]
Probab=96.77 E-value=0.00057 Score=69.85 Aligned_cols=62 Identities=13% Similarity=0.219 Sum_probs=58.3
Q ss_pred CCCchhhHHHHHHHH--HHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCC
Q 021140 33 TSSTYGFVSYFNEEV--KRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNS 95 (317)
Q Consensus 33 kr~~~~~~~f~~~~r--~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~ 95 (317)
.||.+||++|++..| ..+...||+..- ..|.|++|+.|-+|-+.||+.|.+.|.+.|.-+-+
T Consensus 183 rrPMnaf~ifskrhr~~g~vhq~~pn~DN-rtIskiLgewWytL~~~Ekq~yhdLa~Qvk~Ahfk 246 (683)
T KOG2746|consen 183 RRPMNAFHIFSKRHRGEGRVHQRHPNQDN-RTISKILGEWWYTLGPNEKQKYHDLAFQVKEAHFK 246 (683)
T ss_pred hhhhHHHHHHHhhcCCccchhccCccccc-hhHHHHHhhhHhhhCchhhhhHHHHHHHHHHHHhh
Confidence 799999999999999 999999999988 89999999999999999999999999998877765
No 21
>COG5648 NHP6B Chromatin-associated proteins containing the HMG domain [Chromatin structure and dynamics]
Probab=96.70 E-value=0.00066 Score=60.47 Aligned_cols=68 Identities=15% Similarity=0.182 Sum_probs=63.0
Q ss_pred CCCCCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhhhhccCCCCCC
Q 021140 29 DAVPTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDERMGNSGNSNS 97 (317)
Q Consensus 29 ~~~pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~~~~~~y~~~k 97 (317)
.-+|++|+.+|..+..+-|......+|+... .+.+|.+|+.|++|++.-|++|.+.+.+++.+|++..
T Consensus 141 k~~~~~~~~~~~e~~~~~r~~~~~~~~~~~~-~e~~k~~~~~w~el~~skK~~~~~~~Kk~k~~~~~~~ 208 (211)
T COG5648 141 KLPNKAPIGPFIENEPKIRPKVEGPSPDKAL-VEETKIISKAWSELDESKKKKYIDKYKKLKEEYDSFY 208 (211)
T ss_pred ccCCCCCCchhhhccHHhccccCCCCcchhh-hHHhhhhhhhhhhhChhhhhHHHHHHHHHHHHHhhhc
Confidence 3467999999999999999999999999988 7999999999999999999999999999999998764
No 22
>PF04690 YABBY: YABBY protein; InterPro: IPR006780 YABBY proteins are a group of plant-specific transcription factors involved in the specification of abaxial polarity in lateral organs such as leaves and floral organs.
Probab=96.56 E-value=0.0036 Score=54.51 Aligned_cols=49 Identities=20% Similarity=0.373 Sum_probs=43.7
Q ss_pred CCCCCCCCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCC
Q 021140 26 NGQDAVPTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELP 75 (317)
Q Consensus 26 ~~~~~~pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~ 75 (317)
.+-|-+..|-||||-.||+|.-.++|++||+++- .|.=+++++.|+-.+
T Consensus 116 ~kPPEKRqR~psaYn~f~k~ei~rik~~~p~ish-keaFs~aAknW~h~p 164 (170)
T PF04690_consen 116 NKPPEKRQRVPSAYNRFMKEEIQRIKAENPDISH-KEAFSAAAKNWAHFP 164 (170)
T ss_pred cCCccccCCCchhHHHHHHHHHHHHHhcCCCCCH-HHHHHHHHHhhhhCc
Confidence 4556666799999999999999999999999999 899999999998755
No 23
>PF08073 CHDNT: CHDNT (NUC034) domain; InterPro: IPR012958 The CHD N-terminal domain is found in PHD/RING fingers and chromo domain-associated helicases.; GO: 0003677 DNA binding, 0005524 ATP binding, 0008270 zinc ion binding, 0016818 hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides, 0006355 regulation of transcription, DNA-dependent, 0005634 nucleus
Probab=93.66 E-value=0.088 Score=37.31 Aligned_cols=39 Identities=10% Similarity=0.146 Sum_probs=35.6
Q ss_pred chhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCC
Q 021140 36 TYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELP 75 (317)
Q Consensus 36 ~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~ 75 (317)
.|.|=.|.+-.|..+.++||++.. ..+-.+++.+|++-+
T Consensus 13 lt~yK~Fsq~vRP~l~~~NPk~~~-sKl~~l~~AKwrEF~ 51 (55)
T PF08073_consen 13 LTNYKAFSQHVRPLLAKANPKAPM-SKLMMLLQAKWREFQ 51 (55)
T ss_pred HHHHHHHHHHHHHHHHHHCCCCcH-HHHHHHHHHHHHHHH
Confidence 577889999999999999999998 899999999998744
No 24
>PF03078 ATHILA: ATHILA ORF-1 family; InterPro: IPR004312 ATHILA is a group of Arabidopsis thaliana retrotransposons belonging to the Ty3/gypsy family of the long terminal repeat (LTR) class of eukaryotic retrotransposons. The central region of ATHILA retrotransposons contains two or three open reading frames (ORFs). This family represents the ORF1 product. The function of ORF1 is unknown.
Probab=91.30 E-value=5 Score=40.45 Aligned_cols=189 Identities=15% Similarity=0.118 Sum_probs=108.4
Q ss_pred hhhcCCChHHhhhhhhhhhhhccCCCCCCCCCCCCcceeeechhHHHHHhhcCCHHHHHHHHHhCcchhhccccccccHH
Q 021140 69 KTYKELPPEQKARYKKRDERMGNSGNSNSHSGDNEIVETKCVPERFCALVKSLSEEKKKAIREIGFESLLELRCGKLKRK 148 (317)
Q Consensus 69 ~~wk~l~~~ek~~y~~~a~~~~~~y~~~k~~~k~~~~~~RcS~~~~~~~i~~Ls~~qk~~I~~~GFg~LL~i~~~~l~~~ 148 (317)
.+|+..+++|--.|.+.-+--..||- ++..+..+ .|.++-..+++.+|.+.|..++...-+..
T Consensus 41 ~k~~~~t~~eyy~~l~~~~~~~TRyp---------------~~etl~~L--Gl~~dV~~lf~~~gL~~f~~~~~~~Y~ee 103 (458)
T PF03078_consen 41 KKKDKLTPSEYYQLLKKIEFAPTRYP---------------DPETLQKL--GLLEDVEYLFKKCGLGTFMSYPYPTYPEE 103 (458)
T ss_pred cccccCChHHHHHHHhhccccccccC---------------CHHHHHHh--ccHHHHHHHHHhcCchhhccCCCCCcHHH
Confidence 35566666665555555444444443 34445555 67888899999999999998887655544
Q ss_pred HHHHHHhcccC----------------CcceEEEcCEEeeeChhhhhhhhcccCCCcccccCCChhHHHHHHHHhCCCCC
Q 021140 149 LCHWLVNQFKP----------------ERNIIELHGQKLELCPKMFSKIMGVKDGGMAIKINGASDHIAEVRRIFQPTVK 212 (317)
Q Consensus 149 L~~wL~~~~d~----------------~t~~~~l~g~~i~it~~dV~~VLGLP~gG~~v~~~~~~~~~~~l~~~~~~~~~ 212 (317)
.+..|+.. .. ..-+|.|.|....+|-.+...++|.|.|+. +....+.+....|++..|.+.
T Consensus 104 t~qFLaTl-~v~~~~~~~~~~~e~~glG~l~F~V~~~~y~lsi~~L~~i~GF~~~~~-i~~~~~~~el~~~W~~ig~~~- 180 (458)
T PF03078_consen 104 TRQFLATL-KVTFYNPSEPRAKELDGLGYLTFFVYGVEYSLSIKHLERIFGFPSGDE-IKPDFDPEELNDFWATIGGGK- 180 (458)
T ss_pred HHHhhhee-eeeecccccchhhcccCcceEEEEEcceeeeeeHHHHHHHhCCCCccc-cCCCCCchHHHHHHHHhcCCC-
Confidence 44444432 21 123577789999999999999999998743 434444444566777665431
Q ss_pred CcchHHHHHHHhhcccccchhhhhHHhhhhcceecccCCC--CCcccccc-----------cccc----cccccccchHH
Q 021140 213 GIRIRTLEEVIEQLDEANKIFKVAFTLFAIATLLCPIGSY--ISTLFLHP-----------IMDV----SSIKSLNWATF 275 (317)
Q Consensus 213 ~i~l~~L~~~l~~~~~~~d~f~r~Fll~~i~~~L~Ptts~--vs~~yl~~-----------l~D~----~~i~~yNW~~~ 275 (317)
.++...-.... .-+=+.+++--+++..|+|.+.. |..+-|.+ ..|. .+..+.|-+..
T Consensus 181 p~~~~~~ks~~------Ir~PviRy~hr~iA~tlf~R~~~~~v~~~El~~l~~~L~~~Lr~~~~g~~l~~d~~dt~~~~v 254 (458)
T PF03078_consen 181 PFNSARSKSNQ------IRSPVIRYFHRLIANTLFAREETGTVRNDELEMLDQALKHLLRRTKDGKLLRGDLNDTNVSMV 254 (458)
T ss_pred ccccccccccc------ccChHHHHHHHHHHhhhccccccCceechhHHHHHHHHHHHHHhcCCCccccCcccccchhHH
Confidence 11111111111 11223344444566666664433 55544332 1121 12356667777
Q ss_pred HHHHHHHH
Q 021140 276 CYDWLVKS 283 (317)
Q Consensus 276 Vld~L~~~ 283 (317)
.++||+..
T Consensus 255 l~~hL~~y 262 (458)
T PF03078_consen 255 LLDHLCSY 262 (458)
T ss_pred HHHHHHhh
Confidence 77777654
No 25
>PF06244 DUF1014: Protein of unknown function (DUF1014); InterPro: IPR010422 This family consists of several hypothetical eukaryotic proteins of unknown function.
Probab=87.27 E-value=0.89 Score=37.60 Aligned_cols=49 Identities=12% Similarity=0.220 Sum_probs=43.9
Q ss_pred CCCCCCCCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCC
Q 021140 26 NGQDAVPTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELP 75 (317)
Q Consensus 26 ~~~~~~pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~ 75 (317)
..|.++-+|---||--|-...-.++|++||++.. .++--++-+.|+--|
T Consensus 67 ~~drHPErR~KAAy~afeE~~Lp~lK~E~PgLrl-sQ~kq~l~K~w~KSP 115 (122)
T PF06244_consen 67 PIDRHPERRMKAAYKAFEERRLPELKEENPGLRL-SQYKQMLWKEWQKSP 115 (122)
T ss_pred CCCCCcchhHHHHHHHHHHHHhHHHHhhCCCchH-HHHHHHHHHHHhcCC
Confidence 5577788888899999999999999999999999 899999999997644
No 26
>KOG2062 consensus 26S proteasome regulatory complex, subunit RPN2/PSMD1 [Posttranslational modification, protein turnover, chaperones]
Probab=65.42 E-value=4.6 Score=42.79 Aligned_cols=147 Identities=16% Similarity=0.189 Sum_probs=73.2
Q ss_pred CHHHHHHHHHhCcchhhccccccccHHHHHHHHhcccCCcceEEEcCEEeeeChhhhhhhhcccCCCcccccCCChhHHH
Q 021140 122 SEEKKKAIREIGFESLLELRCGKLKRKLCHWLVNQFKPERNIIELHGQKLELCPKMFSKIMGVKDGGMAIKINGASDHIA 201 (317)
Q Consensus 122 s~~qk~~I~~~GFg~LL~i~~~~l~~~L~~wL~~~~d~~t~~~~l~g~~i~it~~dV~~VLGLP~gG~~v~~~~~~~~~~ 201 (317)
++..|.+|-.+||==+-+ ...=...+.-|.++||| |+.- -+.+.|||-+.|- +..+.++
T Consensus 569 DDVrRaAVialGFVl~~d---p~~~~s~V~lLses~N~-----HVRy--------GaA~ALGIaCAGt-----G~~eAi~ 627 (929)
T KOG2062|consen 569 DDVRRAAVIALGFVLFRD---PEQLPSTVSLLSESYNP-----HVRY--------GAAMALGIACAGT-----GLKEAIN 627 (929)
T ss_pred hHHHHHHHHHheeeEecC---hhhchHHHHHHhhhcCh-----hhhh--------hHHHHHhhhhcCC-----CcHHHHH
Confidence 344678999999842211 11123455677777775 3431 2788999987773 4444432
Q ss_pred HHHHHhCCCCCCcchHHHHHHHhhcccccchhhhh-------HHhhhhcceecccCCCCCcccccccccccccccccchH
Q 021140 202 EVRRIFQPTVKGIRIRTLEEVIEQLDEANKIFKVA-------FTLFAIATLLCPIGSYISTLFLHPIMDVSSIKSLNWAT 274 (317)
Q Consensus 202 ~l~~~~~~~~~~i~l~~L~~~l~~~~~~~d~f~r~-------Fll~~i~~~L~Ptts~vs~~yl~~l~D~~~i~~yNW~~ 274 (317)
- |+-++. +-..|+|- +||.-..--+||..+.+...|...+.|=.+=.=.-.|.
T Consensus 628 l----------------Lepl~~----D~~~fVRQgAlIa~amIm~Q~t~~~~pkv~~frk~l~kvI~dKhEd~~aK~GA 687 (929)
T KOG2062|consen 628 L----------------LEPLTS----DPVDFVRQGALIALAMIMIQQTEQLCPKVNGFRKQLEKVINDKHEDGMAKFGA 687 (929)
T ss_pred H----------------Hhhhhc----ChHHHHHHHHHHHHHHHHHhcccccCchHHHHHHHHHHHhhhhhhHHHHHHHH
Confidence 2 222221 22345544 44444444445544434444544444322111112233
Q ss_pred HHHHHHHHHHHHH-----h----hcCCccccccHHHHHHHHHhc
Q 021140 275 FCYDWLVKSICRF-----Q----NQQAAYIGGCLHFLQVRPLLQ 309 (317)
Q Consensus 275 ~Vld~L~~~i~k~-----~----~~~~~~i~GCl~lLqi~Yld~ 309 (317)
.+-.-|.++=.+= + +.+...|-|-+.|+|.|||--
T Consensus 688 ilAqGildaGGrNvtislqs~tg~~~~~~vvGl~~Flq~WyWfP 731 (929)
T KOG2062|consen 688 ILAQGILDAGGRNVTISLQSMTGHTKLDAVVGLVVFLQYWYWFP 731 (929)
T ss_pred HHHhhhhhcCCceEEEEEeccCCCCchHHHHHHHHHHHHHHHHH
Confidence 3333333321110 0 011357899999999999954
No 27
>PF06945 DUF1289: Protein of unknown function (DUF1289); InterPro: IPR010710 This family consists of a number of hypothetical bacterial proteins. The aligned region spans around 56 residues and contains 4 highly conserved cysteine residues towards the N terminus. The function of this family is unknown.
Probab=53.19 E-value=10 Score=26.25 Aligned_cols=23 Identities=9% Similarity=0.189 Sum_probs=18.3
Q ss_pred hhhcCCChHHhhhhhhhhhhhcc
Q 021140 69 KTYKELPPEQKARYKKRDERMGN 91 (317)
Q Consensus 69 ~~wk~l~~~ek~~y~~~a~~~~~ 91 (317)
..|++||++||....++.....+
T Consensus 28 ~~W~~~s~~er~~i~~~l~~R~~ 50 (51)
T PF06945_consen 28 RDWKSMSDDERRAILARLRARRA 50 (51)
T ss_pred HHHhhCCHHHHHHHHHHHHHHhc
Confidence 37999999999988877665543
No 28
>PRK15117 ABC transporter periplasmic binding protein MlaC; Provisional
Probab=53.00 E-value=21 Score=32.17 Aligned_cols=33 Identities=21% Similarity=0.261 Sum_probs=24.9
Q ss_pred CCCCchhhhhhhHHhhhhcCCChHHhhhhhhhh
Q 021140 54 NSDLSVTLGLRKHIGKTYKELPPEQKARYKKRD 86 (317)
Q Consensus 54 ~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a 86 (317)
.|.+.|..--...+|..|+.+|+++++.|.+.=
T Consensus 65 ~p~~Df~~~s~~vLG~~wr~as~eQr~~F~~~F 97 (211)
T PRK15117 65 LPYVQVKYAGALVLGRYYKDATPAQREAYFAAF 97 (211)
T ss_pred cccCCHHHHHHHHhhhhhhhCCHHHHHHHHHHH
Confidence 366777333356899999999999999886543
No 29
>PF11304 DUF3106: Protein of unknown function (DUF3106); InterPro: IPR021455 Some members in this family of proteins are annotated as transmembrane proteins however this cannot be confirmed. Currently no function is known.
Probab=52.10 E-value=20 Score=28.74 Aligned_cols=23 Identities=26% Similarity=0.618 Sum_probs=12.4
Q ss_pred hHHhhhhcCCChHHhhhhhhhhh
Q 021140 65 KHIGKTYKELPPEQKARYKKRDE 87 (317)
Q Consensus 65 k~~g~~wk~l~~~ek~~y~~~a~ 87 (317)
.-+.+.|.+|+++.|..+...++
T Consensus 14 ~pl~~~W~~l~~~qr~k~l~~a~ 36 (107)
T PF11304_consen 14 APLAERWNSLPPEQRRKWLQIAE 36 (107)
T ss_pred HHHHHHHhcCCHHHHHHHHHHHH
Confidence 34445566666666655554443
No 30
>PF04769 MAT_Alpha1: Mating-type protein MAT alpha 1; InterPro: IPR006856 This family includes Saccharomyces cerevisiae (Baker's yeast) mating type protein alpha 1 (P01365 from SWISSPROT). MAT alpha 1 is a transcription activator that activates mating-type alpha-specific genes with the help of the MADS-box containing MCM1 transcription factor, which together bind cooperatively to PQ elements upstream of alpha-specific genes. The MCM1-MATalpha1 complex is required for the proper DNA-bending that is needed for transcriptional activation. Alpha 1 interacts in vivo with STE12, linking expression of alpha-specific genes to the alpha-pheromone (IPR006742 from INTERPRO) response pathway.; GO: 0000772 mating pheromone activity, 0003677 DNA binding, 0045895 positive regulation of transcription, mating-type specific, 0005634 nucleus
Probab=50.13 E-value=36 Score=30.55 Aligned_cols=44 Identities=11% Similarity=0.133 Sum_probs=36.8
Q ss_pred cCCCCCCCCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcC
Q 021140 25 ANGQDAVPTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKE 73 (317)
Q Consensus 25 ~~~~~~~pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~ 73 (317)
.+..+..+|||..+||.| |.-+....|+... .++...++.-|+.
T Consensus 37 ~~~~~~~~kr~lN~Fm~F----Rsyy~~~~~~~~Q-k~~S~~l~~lW~~ 80 (201)
T PF04769_consen 37 RKRSPEKAKRPLNGFMAF----RSYYSPIFPPLPQ-KELSGILTKLWEK 80 (201)
T ss_pred ccccccccccchhHHHHH----HHHHHhhcCCcCH-HHHHHHHHHHHhC
Confidence 455666789999999987 5667788889988 8999999999977
No 31
>PF12650 DUF3784: Domain of unknown function (DUF3784); InterPro: IPR017259 This group represents an uncharacterised conserved protein.
Probab=48.02 E-value=11 Score=29.36 Aligned_cols=18 Identities=28% Similarity=0.613 Sum_probs=14.6
Q ss_pred hhcCCChHHhhhhhhhhh
Q 021140 70 TYKELPPEQKARYKKRDE 87 (317)
Q Consensus 70 ~wk~l~~~ek~~y~~~a~ 87 (317)
-|+.||+|||+.|.++.-
T Consensus 25 Gyntms~eEk~~~D~~~l 42 (97)
T PF12650_consen 25 GYNTMSKEEKEKYDKKKL 42 (97)
T ss_pred hcccCCHHHHHHhhHHHH
Confidence 478899999999876654
No 32
>PF05494 Tol_Tol_Ttg2: Toluene tolerance, Ttg2 ; InterPro: IPR008869 Toluene tolerance is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation. Many proteins are involved in these processes. This family is a transporter which shows similarity to ABC transporters.; PDB: 2QGU_A.
Probab=45.97 E-value=16 Score=31.31 Aligned_cols=34 Identities=15% Similarity=0.358 Sum_probs=23.9
Q ss_pred CCCCchhhhhhhHHhhhhcCCChHHhhhhhhhhh
Q 021140 54 NSDLSVTLGLRKHIGKTYKELPPEQKARYKKRDE 87 (317)
Q Consensus 54 ~p~~~~~~~~~k~~g~~wk~l~~~ek~~y~~~a~ 87 (317)
.|-+.+..-..-.+|..|+.+|++|++.|.+.-.
T Consensus 35 ~~~~D~~~~ar~~LG~~w~~~s~~q~~~F~~~f~ 68 (170)
T PF05494_consen 35 DPYFDFERMARRVLGRYWRKASPAQRQRFVEAFK 68 (170)
T ss_dssp GGGB-HHHHHHHHHGGGTTTS-HHHHHHHHHHHH
T ss_pred HHhCCHHHHHHHHHHHhHhhCCHHHHHHHHHHHH
Confidence 3666774445667899999999999998865443
No 33
>PF12169 DNA_pol3_gamma3: DNA polymerase III subunits gamma and tau domain III; InterPro: IPR022754 This domain is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. It is found in association with PF00004 from PFAM. This domain is also present in the tau subunit before it undergoes cleavage. Domains I-III are shared between the tau and the gamma subunits, while most of the DnaB-binding Domain IV and all of the alpha-interacting Domain V are unique to tau. ; GO: 0003887 DNA-directed DNA polymerase activity; PDB: 1NJF_B 3GLG_G 1XXH_I 1NJG_A 3GLF_B 3GLI_G.
Probab=43.47 E-value=48 Score=27.15 Aligned_cols=59 Identities=12% Similarity=0.008 Sum_probs=31.2
Q ss_pred CCcccccccccccccccccchHHHHHHHHHHHHHHhhcCCccccccHHHHHHHHHhcCC
Q 021140 253 ISTLFLHPIMDVSSIKSLNWATFCYDWLVKSICRFQNQQAAYIGGCLHFLQVRPLLQLK 311 (317)
Q Consensus 253 vs~~yl~~l~D~~~i~~yNW~~~Vld~L~~~i~k~~~~~~~~i~GCl~lLqi~Yld~l~ 311 (317)
++..+...+.+...--..+=-...++.|.++..+.+....+.+.=-+.++.+..+.++.
T Consensus 76 ~~~~~~~~~~~~a~~~~~~~l~~~~~~l~~~~~~lr~s~~pr~~lE~~llrl~~~~~~~ 134 (143)
T PF12169_consen 76 LSEEEEEKLKELAKKFSPERLQRILQILLEAENELRYSSNPRILLEMALLRLCQLKSLP 134 (143)
T ss_dssp -CTTTHHHHHHHHHHS-HHHHHHHHHHHHHHHHHTTTSSSHHHHHHHHHHHHHHTC---
T ss_pred CCHHHHHHHHHHHHcCCHHHHHHHHHHHHHHHHHhccCCChHHHHHHHHHHHHHHhhcc
Confidence 55655555554443333333444556666666666655556666666777777765543
No 34
>PF11304 DUF3106: Protein of unknown function (DUF3106); InterPro: IPR021455 Some members in this family of proteins are annotated as transmembrane proteins however this cannot be confirmed. Currently no function is known.
Probab=41.59 E-value=47 Score=26.64 Aligned_cols=22 Identities=27% Similarity=0.599 Sum_probs=16.7
Q ss_pred HHhhhhcCCChHHhhhhhhhhh
Q 021140 66 HIGKTYKELPPEQKARYKKRDE 87 (317)
Q Consensus 66 ~~g~~wk~l~~~ek~~y~~~a~ 87 (317)
.++++|.+||++|++.+..+..
T Consensus 33 ~~a~r~~~mspeqq~r~~~rm~ 54 (107)
T PF11304_consen 33 QIAERWPSMSPEQQQRLRERMR 54 (107)
T ss_pred HHHHHHhcCCHHHHHHHHHHHH
Confidence 3678899999999987655543
No 35
>TIGR03481 HpnM hopanoid biosynthesis associated membrane protein HpnM. The genomes containing members of this family share the machinery for the biosynthesis of hopanoid lipids. Furthermore, the genes of this family are usually located proximal to other components of this biological process. The proteins are members of the pfam05494 family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the members included here have anything to do with toluene per-se.
Probab=39.10 E-value=23 Score=31.58 Aligned_cols=31 Identities=19% Similarity=0.428 Sum_probs=23.7
Q ss_pred CCchhhhhh-hHHhhhhcCCChHHhhhhhhhhh
Q 021140 56 DLSVTLGLR-KHIGKTYKELPPEQKARYKKRDE 87 (317)
Q Consensus 56 ~~~~~~~~~-k~~g~~wk~l~~~ek~~y~~~a~ 87 (317)
.+.| ..++ ..+|..|+.+|+++++.|.+.-.
T Consensus 63 ~~Df-~~mar~vLG~~W~~~s~~Qr~~F~~~F~ 94 (198)
T TIGR03481 63 AFDL-PAMARLTLGSSWTSLSPEQRRRFIGAFR 94 (198)
T ss_pred hCCH-HHHHHHHhhhhhhhCCHHHHHHHHHHHH
Confidence 4566 4454 48899999999999998876544
No 36
>KOG0493 consensus Transcription factor Engrailed, contains HOX domain [General function prediction only]
Probab=35.77 E-value=51 Score=30.83 Aligned_cols=63 Identities=16% Similarity=0.264 Sum_probs=40.5
Q ss_pred hhhcCCCCC-CCCCCchhhHHHHHHHHHHHHhhCCCCchhh-----hhhhHHhhhhcCCChHHh-hhhhhhhhhhccC
Q 021140 22 ILRANGQDA-VPTSSTYGFVSYFNEEVKRLRSENSDLSVTL-----GLRKHIGKTYKELPPEQK-ARYKKRDERMGNS 92 (317)
Q Consensus 22 ~~~~~~~~~-~pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~-----~~~k~~g~~wk~l~~~ek-~~y~~~a~~~~~~ 92 (317)
|.|++++++ --|||-||| ..|+-..+|+|.-++...+ +|+.++| |.|..= .=|+.|..+-|+.
T Consensus 235 k~kkkk~~~~eeKRPRTAF---taeQL~RLK~EF~enRYlTEqRRQ~La~ELg-----LNEsQIKIWFQNKRAKiKKs 304 (342)
T KOG0493|consen 235 KPKKKKSSSKEEKRPRTAF---TAEQLQRLKAEFQENRYLTEQRRQELAQELG-----LNESQIKIWFQNKRAKIKKS 304 (342)
T ss_pred cccccCCccchhcCccccc---cHHHHHHHHHHHhhhhhHHHHHHHHHHHHhC-----cCHHHhhHHhhhhhhhhhhc
Confidence 334445443 349999996 5789999999988886543 4555664 777654 4566555554443
No 37
>PF10234 Cluap1: Clusterin-associated protein-1; InterPro: IPR019366 This protein of 413 amino acids contains a central coiled-coil domain, possibly the region that binds to clusterin. Cluap1 expression is highest in the nucleus and gradually increases during late S to G2/M phases of the cell cycle and returns to the basal level in the G0/G1 phases. In addition, it is upregulated in colon cancer tissues compared to corresponding non-cancerous mucosa. It thus plays a crucial role in the life of the cell.
Probab=35.08 E-value=20 Score=33.63 Aligned_cols=32 Identities=19% Similarity=0.643 Sum_probs=25.4
Q ss_pred HhCcchhhcccccccc-----HHHHHHHHhcccCCcc
Q 021140 131 EIGFESLLELRCGKLK-----RKLCHWLVNQFKPERN 162 (317)
Q Consensus 131 ~~GFg~LL~i~~~~l~-----~~L~~wL~~~~d~~t~ 162 (317)
.+||-.+++|....-| -+++.||+.+|||+..
T Consensus 2 ~LGypr~iSmenFrtPNF~LVAeiL~WLv~rydP~~~ 38 (267)
T PF10234_consen 2 ALGYPRLISMENFRTPNFELVAEILRWLVKRYDPDAD 38 (267)
T ss_pred CCCCCCCCcHHHcCCCChHHHHHHHHHHHHHcCCCCC
Confidence 5799999998774444 5677899999999873
No 38
>KOG3223 consensus Uncharacterized conserved protein [Function unknown]
Probab=30.53 E-value=32 Score=30.70 Aligned_cols=49 Identities=16% Similarity=0.307 Sum_probs=42.2
Q ss_pred CCCCCCCCCchhhHHHHHHHHHHHHhhCCCCchhhhhhhHHhhhhcCCCh
Q 021140 27 GQDAVPTSSTYGFVSYFNEEVKRLRSENSDLSVTLGLRKHIGKTYKELPP 76 (317)
Q Consensus 27 ~~~~~pkr~~~~~~~f~~~~r~~~~~~~p~~~~~~~~~k~~g~~wk~l~~ 76 (317)
.|-+|-||=--||--|-...-++++.+||++.. .+.-.++-+.|+.=||
T Consensus 160 ddrHPEkRmrAA~~afEe~~LPrLK~e~P~lrl-sQ~Kqll~Kew~KsPD 208 (221)
T KOG3223|consen 160 DDRHPEKRMRAAFKAFEEARLPRLKKENPGLRL-SQYKQLLKKEWQKSPD 208 (221)
T ss_pred cccChHHHHHHHHHHHHHhhchhhhhcCCCccH-HHHHHHHHHHHhhCCC
Confidence 445666888899999999999999999999999 8888899888976554
No 39
>PRK10236 hypothetical protein; Provisional
Probab=29.58 E-value=36 Score=31.34 Aligned_cols=26 Identities=19% Similarity=0.404 Sum_probs=21.5
Q ss_pred hhhHHhhhhcCCChHHhhhhhhhhhh
Q 021140 63 LRKHIGKTYKELPPEQKARYKKRDER 88 (317)
Q Consensus 63 ~~k~~g~~wk~l~~~ek~~y~~~a~~ 88 (317)
+.|.+++.|+.||++|++.+.+.-..
T Consensus 118 l~kll~~a~~kms~eE~~~L~~~l~~ 143 (237)
T PRK10236 118 LEQFLRNTWKKMDEEHKQEFLHAVDA 143 (237)
T ss_pred HHHHHHHHHHHCCHHHHHHHHHHHhh
Confidence 48999999999999999887665443
No 40
>PF04994 TfoX_C: TfoX C-terminal domain; InterPro: IPR007077 This domain is found in a number of bacterial proteins including the TfoX gene product of Haemophilus influenzae. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes. This family corresponds to the C-terminal presumed domain of TfoX. The domain is found in association with the N-terminal domain in some, but not all members of this group, suggesting this is an autonomous and functionally unrelated domain. For example it is found associated with Q9JZR1 from SWISSPROT in IPR002125 from INTERPRO.; PDB: 3BQT_A 3MAB_A.
Probab=26.29 E-value=73 Score=24.16 Aligned_cols=34 Identities=24% Similarity=0.363 Sum_probs=20.2
Q ss_pred HHHHhhCCCCchhhhh----hhHHhhhhcCCChHHhhhh
Q 021140 48 KRLRSENSDLSVTLGL----RKHIGKTYKELPPEQKARY 82 (317)
Q Consensus 48 ~~~~~~~p~~~~~~~~----~k~~g~~wk~l~~~ek~~y 82 (317)
..+++.++++.. .-+ |=.-|..|..||+++|+.-
T Consensus 40 ~~Lk~~~~~~~~-~~L~aL~gAi~g~~~~~L~~~~K~~L 77 (81)
T PF04994_consen 40 LRLKASGPSVCL-NLLYALEGAIQGIHWADLPDEEKQEL 77 (81)
T ss_dssp HHHHHH-TT--H-HHHHHHHHHHCTS-GGGS-HHHHHHH
T ss_pred HHHHHHCCCCCH-HHHHHHHHHHcCCCHHHCCHHHHHHH
Confidence 356777888876 344 4445569999999999753
No 41
>PF03457 HA: Helicase associated domain; InterPro: IPR005114 This short domain is found in multiple copies in bacterial helicase proteins. The domain is predicted to contain 3 alpha helices. The function of this domain may be to bind nucleic acid.; PDB: 2KTA_A.
Probab=25.09 E-value=49 Score=23.66 Aligned_cols=16 Identities=25% Similarity=0.578 Sum_probs=11.3
Q ss_pred hcCCHHHHHHHHHhCc
Q 021140 119 KSLSEEKKKAIREIGF 134 (317)
Q Consensus 119 ~~Ls~~qk~~I~~~GF 134 (317)
..|+++|.+.++++||
T Consensus 52 g~L~~er~~~L~~lg~ 67 (68)
T PF03457_consen 52 GKLTPERIERLDALGF 67 (68)
T ss_dssp T---HHHHHHHHHHT-
T ss_pred CCCCHHHHHHHHcCCC
Confidence 4699999999999998
No 42
>PF05914 RIB43A: RIB43A; InterPro: IPR008805 This family consists of several RIB43A-like eukaryotic proteins. Ciliary and flagellar microtubules contain a specialised set of protofilaments, termed ribbons, that are composed of tubulin and several associated proteins. RIB43A was first characterised in the unicellular biflagellate, Chlamydomonas reinhardtii although highly related sequences are present in several higher eukaryotes including humans. The function of this protein is unknown although the structure of RIB43A and its association with the specialised protofilament ribbons and with basal bodies is relevant to the proposed role of ribbons in forming and stabilising doublet and triplet microtubules and in organising their three-dimensional structure. Human RIB43A homologues could represent a structural requirement in centriole replication in dividing cells.
Probab=22.62 E-value=29 Score=34.18 Aligned_cols=38 Identities=26% Similarity=0.458 Sum_probs=27.9
Q ss_pred HHHHHhhCCCCch-hhhhhhHHhhhhcCCChHHhhhhhh
Q 021140 47 VKRLRSENSDLSV-TLGLRKHIGKTYKELPPEQKARYKK 84 (317)
Q Consensus 47 r~~~~~~~p~~~~-~~~~~k~~g~~wk~l~~~ek~~y~~ 84 (317)
...+-.|||++.. ..+-.+.+...||.||+++.+-+.+
T Consensus 249 ~sdlLtEnp~~a~s~~gp~Rv~~d~wKGMs~eQl~~i~~ 287 (379)
T PF05914_consen 249 TSDLLTENPQVAQSSFGPHRVIPDRWKGMSPEQLEEIRK 287 (379)
T ss_pred cCccccCCchhccccCCCCCCCCcccCCCCHHHHHHHHH
Confidence 3456678999853 1233677889999999999997743
No 43
>cd09071 FAR_C C-terminal domain of fatty acyl CoA reductases. C-terminal domain of fatty acyl CoA reductases, a family of SDR-like proteins. SDRs or short-chain dehydrogenases/reductases are Rossmann-fold NAD(P)H-binding proteins. Many proteins in this FAR_C family may function as fatty acyl-CoA reductases (FARs), acting on medium and long chain fatty acids, and have been reported to be involved in diverse processes such as the biosynthesis of insect pheromones, plant cuticular wax production, and mammalian wax biosynthesis. In Arabidopsis thaliana, proteins with this particular architecture have also been identified as the MALE STERILITY 2 (MS2) gene product, which is implicated in male gametogenesis. Mutations in MS2 inhibit the synthesis of exine (sporopollenin), rendering plants unable to reduce pollen wall fatty acids to corresponding alcohols. The function of this C-terminal domain is unclear.
Probab=22.57 E-value=79 Score=23.82 Aligned_cols=21 Identities=14% Similarity=0.754 Sum_probs=17.8
Q ss_pred ccccccchHHHHHHHHHHHHHH
Q 021140 266 SIKSLNWATFCYDWLVKSICRF 287 (317)
Q Consensus 266 ~i~~yNW~~~Vld~L~~~i~k~ 287 (317)
++.++||..++.++ +.|+++|
T Consensus 70 D~~~idW~~Y~~~~-~~G~r~y 90 (92)
T cd09071 70 DIRSIDWDDYFENY-IPGLRKY 90 (92)
T ss_pred CCCCCCHHHHHHHH-HHHHHHH
Confidence 35799999999999 8888876
No 44
>cd07321 Extradiol_Dioxygenase_3A_like Subunit A of Class III extradiol dioxygenases. Extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings. There are two major groups of dioxygenases according to the cleavage site of the aromatic ring. Intradiol enzymes cleave the aromatic ring between two hydroxyl groups, whereas extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Extradiol dioxygenases can be divided into three classes. Class I and II enzymes are evolutionary related and show sequence similarity, with the two domain class II enzymes evolving from the class I enzyme through gene duplication. Class III enzymes are different in sequence and structure and usually have two subunits, designated A and B, which form a tetramer composed of two copies of each subunit. This model represents subunit A of c
Probab=22.02 E-value=1e+02 Score=23.11 Aligned_cols=32 Identities=22% Similarity=0.212 Sum_probs=25.4
Q ss_pred cCCHHHHHHHHHhCcchhhccccccccHHHHHHHH
Q 021140 120 SLSEEKKKAIREIGFESLLELRCGKLKRKLCHWLV 154 (317)
Q Consensus 120 ~Ls~~qk~~I~~~GFg~LL~i~~~~l~~~L~~wL~ 154 (317)
.||++|+++|.+--+.+|+++.. |..++..+.
T Consensus 34 ~Lt~eE~~al~~rD~~~L~~lG~---~~~~l~k~~ 65 (77)
T cd07321 34 GLTPEEKAALLARDVGALYVLGV---NPMLLMHFA 65 (77)
T ss_pred CCCHHHHHHHHcCCHHHHHHcCC---CHHHHHHHH
Confidence 69999999999999999999874 445544443
Done!