Query 042031
Match_columns 565
No_of_seqs 250 out of 1582
Neff 9.2
Searched_HMMs 46136
Date Fri Mar 29 13:16:21 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/042031.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/042031hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PLN03097 FHY3 Protein FAR-RED 100.0 8.7E-69 1.9E-73 580.4 41.2 444 16-483 70-623 (846)
2 PF10551 MULE: MULE transposas 99.8 8.7E-19 1.9E-23 143.4 7.3 76 205-285 18-93 (93)
3 PF00872 Transposase_mut: Tran 99.7 3.2E-18 7E-23 175.7 5.2 224 120-362 113-349 (381)
4 PF03108 DBD_Tnp_Mut: MuDR fam 99.7 7.5E-17 1.6E-21 122.4 8.9 67 17-83 1-67 (67)
5 COG3328 Transposase and inacti 99.3 8.5E-11 1.8E-15 118.4 14.2 219 119-362 98-328 (379)
6 smart00575 ZnF_PMZ plant mutat 99.0 2.1E-10 4.6E-15 69.7 2.3 28 441-468 1-28 (28)
7 PF08731 AFT: Transcription fa 98.9 7E-09 1.5E-13 83.5 8.1 68 26-93 1-110 (111)
8 PF03101 FAR1: FAR1 DNA-bindin 98.7 1.8E-08 4E-13 81.7 5.8 61 34-95 1-90 (91)
9 PF04434 SWIM: SWIM zinc finge 98.3 4.9E-07 1.1E-11 60.6 3.3 30 436-465 10-39 (40)
10 PF00098 zf-CCHC: Zinc knuckle 96.4 0.0018 3.9E-08 34.7 1.4 18 546-563 1-18 (18)
11 PF15288 zf-CCHC_6: Zinc knuck 96.2 0.0021 4.6E-08 41.8 1.0 18 546-563 2-21 (40)
12 PF13696 zf-CCHC_2: Zinc knuck 94.5 0.019 4.2E-07 35.5 1.2 21 544-564 7-27 (32)
13 PF03106 WRKY: WRKY DNA -bindi 94.1 0.077 1.7E-06 38.7 3.9 39 55-93 21-59 (60)
14 PF06782 UPF0236: Uncharacteri 94.0 1.8 3.8E-05 46.2 15.6 130 223-361 235-376 (470)
15 PF04684 BAF1_ABF1: BAF1 / ABF 93.2 0.19 4E-06 51.3 6.2 67 10-76 12-79 (496)
16 COG3316 Transposase and inacti 92.9 1.3 2.8E-05 41.2 10.6 116 130-274 33-152 (215)
17 PF01610 DDE_Tnp_ISL3: Transpo 91.3 0.16 3.5E-06 49.3 3.1 66 218-288 30-96 (249)
18 PF13610 DDE_Tnp_IS240: DDE do 90.9 0.04 8.8E-07 48.3 -1.5 60 205-271 22-81 (140)
19 smart00774 WRKY DNA binding do 90.7 0.35 7.5E-06 35.0 3.5 38 55-92 21-59 (59)
20 PF04500 FLYWCH: FLYWCH zinc f 90.5 0.45 9.9E-06 34.7 4.2 46 43-92 14-62 (62)
21 smart00343 ZnF_C2HC zinc finge 87.1 0.32 7E-06 28.7 1.0 17 547-563 1-17 (26)
22 PF14392 zf-CCHC_4: Zinc knuck 87.0 0.24 5.1E-06 34.6 0.5 20 544-563 30-49 (49)
23 PF13565 HTH_32: Homeodomain-l 86.5 1.8 3.9E-05 33.2 5.4 41 107-147 34-76 (77)
24 PF03050 DDE_Tnp_IS66: Transpo 83.1 1.1 2.4E-05 44.0 3.5 36 250-290 121-156 (271)
25 COG5179 TAF1 Transcription ini 82.0 0.86 1.9E-05 48.1 2.1 25 539-563 931-957 (968)
26 COG5431 Uncharacterized metal- 78.3 1.6 3.5E-05 34.9 2.1 34 428-463 39-77 (117)
27 PF02178 AT_hook: AT hook moti 76.7 1.1 2.4E-05 21.8 0.5 9 524-532 2-10 (13)
28 PHA02517 putative transposase 75.6 26 0.00056 34.4 10.5 42 107-149 30-72 (277)
29 PRK09335 30S ribosomal protein 73.8 2.7 5.9E-05 33.3 2.3 27 520-553 2-28 (95)
30 PLN00186 ribosomal protein S26 69.4 3.8 8.3E-05 33.3 2.2 27 520-553 2-28 (109)
31 PTZ00172 40S ribosomal protein 68.6 4.1 8.9E-05 33.1 2.3 27 520-553 2-28 (108)
32 smart00384 AT_hook DNA binding 64.6 4.1 8.8E-05 23.9 1.1 12 523-534 1-12 (26)
33 PF13917 zf-CCHC_3: Zinc knuck 58.1 6.1 0.00013 26.4 1.3 18 545-562 4-21 (42)
34 COG5082 AIR1 Arginine methyltr 57.7 5.2 0.00011 36.3 1.2 16 546-561 98-113 (190)
35 PF13592 HTH_33: Winged helix- 56.4 20 0.00044 26.0 3.9 30 119-148 2-31 (60)
36 COG4715 Uncharacterized conser 55.6 35 0.00077 36.4 6.9 51 415-467 41-97 (587)
37 PF04937 DUF659: Protein of un 52.7 48 0.001 29.4 6.5 62 225-290 74-138 (153)
38 PF04800 ETC_C1_NDUFA4: ETC co 51.5 27 0.00058 28.5 4.2 45 7-55 34-80 (101)
39 COG4830 RPS26B Ribosomal prote 50.9 10 0.00023 30.1 1.7 27 520-553 2-28 (108)
40 PHA00689 hypothetical protein 50.6 10 0.00022 25.8 1.4 12 544-555 16-27 (62)
41 PF01283 Ribosomal_S26e: Ribos 49.9 12 0.00025 31.0 1.9 27 520-553 2-28 (113)
42 PF08766 DEK_C: DEK C terminal 46.5 52 0.0011 23.2 4.7 38 107-144 4-43 (54)
43 PF01498 HTH_Tnp_Tc3_2: Transp 43.0 20 0.00043 26.9 2.2 36 112-148 4-39 (72)
44 PF00665 rve: Integrase core d 40.2 36 0.00077 28.1 3.6 54 205-263 29-82 (120)
45 PF09713 A_thal_3526: Plant pr 36.8 32 0.0007 24.4 2.2 26 121-146 12-38 (54)
46 COG4279 Uncharacterized conser 34.0 26 0.00057 33.3 1.8 24 440-466 124-147 (266)
47 PF05741 zf-nanos: Nanos RNA b 33.3 13 0.00029 26.5 -0.1 20 544-563 32-54 (55)
48 COG5082 AIR1 Arginine methyltr 32.5 22 0.00047 32.5 1.0 18 545-562 60-77 (190)
49 PRK14892 putative transcriptio 32.4 35 0.00075 27.7 2.1 9 544-552 20-28 (99)
50 PRK12286 rpmF 50S ribosomal pr 29.6 54 0.0012 23.6 2.5 32 520-553 4-35 (57)
51 PF10045 DUF2280: Uncharacteri 28.9 61 0.0013 26.3 2.9 24 123-146 21-44 (104)
52 PF13276 HTH_21: HTH-like doma 28.3 1.5E+02 0.0033 21.1 4.8 42 108-149 6-48 (60)
53 PF13877 RPAP3_C: Potential Mo 26.9 47 0.001 26.5 2.0 33 304-336 5-37 (94)
54 COG5222 Uncharacterized conser 26.2 43 0.00092 32.5 1.9 25 540-564 171-195 (427)
55 PF05634 APO_RNA-bind: APO RNA 25.8 45 0.00098 30.7 1.9 20 544-563 97-121 (204)
56 PTZ00368 universal minicircle 25.8 43 0.00092 29.4 1.7 19 546-564 78-96 (148)
57 TIGR01031 rpmF_bact ribosomal 25.3 79 0.0017 22.6 2.7 41 520-561 2-42 (55)
58 PTZ00368 universal minicircle 24.8 41 0.0009 29.5 1.5 20 545-564 52-71 (148)
59 TIGR01589 A_thal_3526 uncharac 24.7 64 0.0014 23.2 2.1 26 122-147 16-42 (57)
60 PF14201 DUF4318: Domain of un 24.5 1.3E+02 0.0029 22.9 3.9 32 23-54 11-42 (74)
61 PF13551 HTH_29: Winged helix- 24.0 1.8E+02 0.0039 23.5 5.2 37 111-147 65-107 (112)
62 PF14952 zf-tcix: Putative tre 23.7 50 0.0011 22.2 1.3 21 544-564 10-31 (44)
63 KOG2044 5'-3' exonuclease HKE1 23.5 32 0.00069 38.2 0.6 20 544-563 259-278 (931)
64 COG4888 Uncharacterized Zn rib 22.7 71 0.0015 25.8 2.2 8 546-553 47-54 (104)
65 KOG0341 DEAD-box protein abstr 22.0 42 0.00091 34.2 1.0 19 545-563 570-588 (610)
66 PF01644 Chitin_synth_1: Chiti 20.9 2.1E+02 0.0045 25.6 5.0 47 206-258 101-149 (163)
67 PF11645 PDDEXK_5: PD-(D/E)XK 20.7 2.8E+02 0.0062 24.1 5.5 45 29-73 6-50 (149)
68 PF01316 Arg_repressor: Argini 20.5 1.8E+02 0.0039 21.9 3.9 36 111-147 9-44 (70)
69 KOG4602 Nanos and related prot 20.2 44 0.00095 31.7 0.7 20 544-563 267-289 (318)
No 1
>PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional
Probab=100.00 E-value=8.7e-69 Score=580.40 Aligned_cols=444 Identities=16% Similarity=0.260 Sum_probs=344.0
Q ss_pred CCCCccccCeeCCHHHHHHHHHHHHHhccceEEEEeecCe-------EEEEEeec-------------------------
Q 042031 16 AEPTLSIGQEFPDVETCRRTLKDIAIALHFDLRIVKSDRS-------RFIAKCSK------------------------- 63 (565)
Q Consensus 16 ~~~~l~~G~~F~s~~e~~~a~~~ya~~~gf~~~~~kS~~~-------r~~~~C~~------------------------- 63 (565)
....|.+||+|+|.|||++||+.||.+.||++|+.+|.++ ..+++|++
T Consensus 70 ~~~~P~vGMeF~S~eeA~~FYn~YA~~~GFsVRi~~srrsk~~~~ii~r~fvCsreG~~~~~~~~~~~~~~~~~k~~~~~ 149 (846)
T PLN03097 70 TNLEPLSGMEFESHGEAYSFYQEYARSMGFNTAIQNSRRSKTSREFIDAKFACSRYGTKREYDKSFNRPRARQTKQDPEN 149 (846)
T ss_pred CCccCcCCCeECCHHHHHHHHHHHHhhcCceEEeeceeccCCCCcEEEEEEEEcCCCCCcccccccccccccccccCccc
Confidence 4456899999999999999999999999999998755432 23567764
Q ss_pred ---------CCCccEEEEEEcCCCCceEEeeeccceeeeCcccccccccchhhHHHHHHHHHhcCCCCCHHHHHHHHHHH
Q 042031 64 ---------EGCPWRVHVAKCPGVPTFSIRTLHGEHTCEGVQNLHHQQASVGWVARSVEARIRDNPQYKPKEILQDIRDQ 134 (565)
Q Consensus 64 ---------~~C~~~v~~~~~~~~~~~~V~~~~~~H~c~~~~~~~~~~~~~~~i~~~~~~~l~~~~~~~~~~i~~~~~~~ 134 (565)
+||+++|.+.+ ..+|.|+|+.+..+|||+..+.......+...+. .+...+....++.
T Consensus 150 ~~~rR~~tRtGC~A~m~Vk~-~~~gkW~V~~fv~eHNH~L~p~~~~~~~~r~~~~-~~~~~~~~~~~v~----------- 216 (846)
T PLN03097 150 GTGRRSCAKTDCKASMHVKR-RPDGKWVIHSFVKEHNHELLPAQAVSEQTRKMYA-AMARQFAEYKNVV----------- 216 (846)
T ss_pred ccccccccCCCCceEEEEEE-cCCCeEEEEEEecCCCCCCCCccccchhhhhhHH-HHHhhhhcccccc-----------
Confidence 37999999987 4568999999999999976543221110111111 0100010000000
Q ss_pred cCccc-CHHHHHHHHHHHHHHhhCChHHhhccHHHHHHHHhhcCCCcEEEEEEcCCcc---------cchhhhhc-----
Q 042031 135 HGVAV-SYMQAWRGKERSMAALHGTYEEGYRLLPAYCEQIRKTNPGSIASVFATGQKI---------VSIGSLFL----- 199 (565)
Q Consensus 135 ~g~~~-s~~~~~r~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~v~~~~~~~---------~~i~~f~~----- 199 (565)
+... .....-+.|++ +... .+.+.|++||++++.+||+|+|+|++|++++ .|+..|.+
T Consensus 217 -~~~~d~~~~~~~~r~~--~~~~----gD~~~ll~yf~~~q~~nP~Ffy~~qlDe~~~l~niFWaD~~sr~~Y~~FGDvV 289 (846)
T PLN03097 217 -GLKNDSKSSFDKGRNL--GLEA----GDTKILLDFFTQMQNMNSNFFYAVDLGEDQRLKNLFWVDAKSRHDYGNFSDVV 289 (846)
T ss_pred -ccchhhcchhhHHHhh--hccc----chHHHHHHHHHHHHhhCCCceEEEEEccCCCeeeEEeccHHHHHHHHhcCCEE
Confidence 0000 00011111221 1112 3457899999999999999999999999886 44444432
Q ss_pred -ccccc------------cccCCCCceeEeEEEEeeccccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHHHhh
Q 042031 200 -IVHQY------------MGVDAEDALFPLAIAIVDVESDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEAVET 266 (565)
Q Consensus 200 -l~~~y------------~g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~ 266 (565)
+|++| +|+|+|+|+++|||||+.+|+.|+|.|||++|+++|+ +.+|.+||||++.+|.+||++
T Consensus 290 ~fDTTY~tN~y~~Pfa~FvGvNhH~qtvlfGcaLl~dEt~eSf~WLf~tfl~aM~----gk~P~tIiTDqd~am~~AI~~ 365 (846)
T PLN03097 290 SFDTTYVRNKYKMPLALFVGVNQHYQFMLLGCALISDESAATYSWLMQTWLRAMG----GQAPKVIITDQDKAMKSVISE 365 (846)
T ss_pred EEeceeeccccCcEEEEEEEecCCCCeEEEEEEEcccCchhhHHHHHHHHHHHhC----CCCCceEEecCCHHHHHHHHH
Confidence 55555 9999999999999999999999999999999999999 899999999999999999999
Q ss_pred hCCCCcchhhHHHHHHHHHhhcCc-----hhhHHHHHHHHH-hhcHHHHHHHHHHHHhc-ccchhhHhhhC--CCCCccc
Q 042031 267 HFPSAFHCFCLRYVSENFRDTFKN-----TKLVNIFWNAVY-ALTTVEFEAKISEMVEI-SQDVIPWFQQF--PPQLWAI 337 (565)
Q Consensus 267 vfP~~~h~~C~~Hi~~n~~~~~~~-----~~~~~~~~~~~~-~~t~~eF~~~~~~l~~~-~~~~~~~l~~~--~~~~W~~ 337 (565)
|||++.|++|.|||++|+.++++. +.|...|..|++ +.+++||+..|.+|... +++.++||+.+ .|++|++
T Consensus 366 VfP~t~Hr~C~wHI~~~~~e~L~~~~~~~~~f~~~f~~cv~~s~t~eEFE~~W~~mi~ky~L~~n~WL~~LY~~RekWap 445 (846)
T PLN03097 366 VFPNAHHCFFLWHILGKVSENLGQVIKQHENFMAKFEKCIYRSWTEEEFGKRWWKILDRFELKEDEWMQSLYEDRKQWVP 445 (846)
T ss_pred HCCCceehhhHHHHHHHHHHHhhHHhhhhhHHHHHHHHHHhCCCCHHHHHHHHHHHHHhhcccccHHHHHHHHhHhhhhH
Confidence 999999999999999999999853 589999999998 89999999999999876 89999999999 8999999
Q ss_pred cccCCccc-cccccchhHHHHHHhhhC--CCCCHHHHHHHHHHHHHHHHHHHHHh-----------------hhhccCCC
Q 042031 338 AYFEGVRY-GHFTLGVTELLYNWALEC--HELPVVQMMEYIRHQLTSWFNDRREM-----------------GMRWTSIL 397 (565)
Q Consensus 338 a~~~~~~~-~~~ttn~~Es~n~~lk~~--~~~pi~~~~e~i~~~~~~~~~~r~~~-----------------~~~~~~~~ 397 (565)
+|+++.++ |+.||+++||+|+.|++. ...+|..|++.+...+..+..+..+. ..+....+
T Consensus 446 aY~k~~F~agm~sTqRSES~Ns~fk~yv~~~tsL~~Fv~qye~~l~~~~ekE~~aD~~s~~~~P~l~t~~piEkQAs~iY 525 (846)
T PLN03097 446 TYMRDAFLAGMSTVQRSESINAFFDKYVHKKTTVQEFVKQYETILQDRYEEEAKADSDTWNKQPALKSPSPLEKSVSGVY 525 (846)
T ss_pred HHhcccccCCcccccccccHHHHHHHHhCcCCCHHHHHHHHHHHHHHHHHHHHHhhhhcccCCcccccccHHHHHHHHHh
Confidence 99966666 667999999999999974 55889999988877665544332211 11235689
Q ss_pred CchHHHHHHHHHHhccceEEEeeC----CeeEEEEe--cCceEEEEc----cCceeccCccccCCCCchhHHHHHHhcCC
Q 042031 398 VPSTERQIMEAIADARCYQVLRAN----EIEFEIVS--TERTNIVDI----RSRVCSCRRWQLYGLPCAHAAAALLSCGQ 467 (565)
Q Consensus 398 tp~~~~~~~~~~~~~~~~~v~~~~----~~~~~V~~--~~~~~~V~l----~~~~CsC~~~~~~GiPC~H~lav~~~~~~ 467 (565)
||.+|++||+++..+..|.+...+ ..+|.|.. ....|.|.. .+.+|+|++|+..||||+|||+||.+.++
T Consensus 526 T~~iF~kFQ~El~~~~~~~~~~~~~dg~~~~y~V~~~~~~~~~~V~~d~~~~~v~CsC~kFE~~GILCrHaLkVL~~~~v 605 (846)
T PLN03097 526 THAVFKKFQVEVLGAVACHPKMESQDETSITFRVQDFEKNQDFTVTWNQTKLEVSCICRLFEYKGYLCRHALVVLQMCQL 605 (846)
T ss_pred HHHHHHHHHHHHHHhhheEEeeeccCCceEEEEEEEecCCCcEEEEEecCCCeEEeeccCeecCccchhhHHHHHhhcCc
Confidence 999999999999998888776542 24688876 345677743 36799999999999999999999999998
Q ss_pred --CcccccccchhHHHHH
Q 042031 468 --NVHLFAEPCFTVASYR 483 (565)
Q Consensus 468 --~~~~~v~~~y~~~~~~ 483 (565)
.|+.||.+|||+++-.
T Consensus 606 ~~IP~~YILkRWTKdAK~ 623 (846)
T PLN03097 606 SAIPSQYILKRWTKDAKS 623 (846)
T ss_pred ccCchhhhhhhchhhhhh
Confidence 4999999999988743
No 2
>PF10551 MULE: MULE transposase domain; InterPro: IPR018289 This entry represents a domain found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif. This domain is also found in a transposase for the insertion sequence element IS256 in transposon Tn4001.
Probab=99.76 E-value=8.7e-19 Score=143.39 Aligned_cols=76 Identities=30% Similarity=0.649 Sum_probs=73.2
Q ss_pred cccCCCCceeEeEEEEeeccccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCCcchhhHHHHHHHH
Q 042031 205 MGVDAEDALFPLAIAIVDVESDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEAVETHFPSAFHCFCLRYVSENF 284 (565)
Q Consensus 205 ~g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~~h~~C~~Hi~~n~ 284 (565)
+|+|++++.+|+||+++++|+.++|.|||+.+++.++ .. |.+||||++.|+.+||+++||++.|++|.||+.+|+
T Consensus 18 ~~~d~~~~~~~v~~~l~~~e~~~~~~~~l~~~~~~~~----~~-p~~ii~D~~~~~~~Ai~~vfP~~~~~~C~~H~~~n~ 92 (93)
T PF10551_consen 18 VGIDGNGRGFPVAFALVSSESEESYEWFLEKLKEAMP----QK-PKVIISDFDKALINAIKEVFPDARHQLCLFHILRNI 92 (93)
T ss_pred EEEcCCCCEEEEEEEEEcCCChhhhHHHHHHhhhccc----cC-ceeeeccccHHHHHHHHHHCCCceEehhHHHHHHhh
Confidence 7999999999999999999999999999999999998 45 999999999999999999999999999999999997
Q ss_pred H
Q 042031 285 R 285 (565)
Q Consensus 285 ~ 285 (565)
+
T Consensus 93 k 93 (93)
T PF10551_consen 93 K 93 (93)
T ss_pred C
Confidence 4
No 3
>PF00872 Transposase_mut: Transposase, Mutator family; InterPro: IPR001207 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. The mutator family of transposases consists of a number of elements that include mutator from maize, IsT2 from Thiobacillus ferrooxidans, Is256 from Staphylococcus aureus, Is1201 from Lactobacillus helveticus, Is1081 from Mycobacterium bovis, IsRm3 from Rhizobium meliloti and others. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=99.72 E-value=3.2e-18 Score=175.69 Aligned_cols=224 Identities=18% Similarity=0.187 Sum_probs=167.4
Q ss_pred CCCCHHHHHHHHHHHcC-cccCHHHHHHHHHHHHHHhhCChHHhhccHHHHHHHHhhcCCCcEEEEEEcCCcc-cchhhh
Q 042031 120 PQYKPKEILQDIRDQHG-VAVSYMQAWRGKERSMAALHGTYEEGYRLLPAYCEQIRKTNPGSIASVFATGQKI-VSIGSL 197 (565)
Q Consensus 120 ~~~~~~~i~~~~~~~~g-~~~s~~~~~r~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~v~~~~~~~-~~i~~f 197 (565)
.+++..+|.+.++..+| ..+|.+++.|..+...+.+ ..|..+-....|-.++ .+|+-.. .-.+|-
T Consensus 113 ~G~Str~i~~~l~~l~g~~~~S~s~vSri~~~~~~~~-----------~~w~~R~L~~~~y~~l--~iD~~~~kvr~~~~ 179 (381)
T PF00872_consen 113 KGVSTRDIEEALEELYGEVAVSKSTVSRITKQLDEEV-----------EAWRNRPLESEPYPYL--WIDGTYFKVREDGR 179 (381)
T ss_pred cccccccccchhhhhhcccccCchhhhhhhhhhhhhH-----------HHHhhhccccccccce--eeeeeecccccccc
Confidence 67889999999999999 7899999988776654432 2333333333322222 2221100 000000
Q ss_pred hcccccc--cccCCCCceeEeEEEEeeccccchHHHHHHHHH-HHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCCcch
Q 042031 198 FLIVHQY--MGVDAEDALFPLAIAIVDVESDENWMWFMSELR-KLLGVNTDNMPRLTILSERQRGIVEAVETHFPSAFHC 274 (565)
Q Consensus 198 ~~l~~~y--~g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~-~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~~h~ 274 (565)
..-...| +|+|.+|+..+||+.+.+.|+.++|.-||+.|+ +.+. .|..|++|.++||.+||+++||++.++
T Consensus 180 ~~~~~~~v~iGi~~dG~r~vLg~~~~~~Es~~~W~~~l~~L~~RGl~------~~~lvv~Dg~~gl~~ai~~~fp~a~~Q 253 (381)
T PF00872_consen 180 VVKKAVYVAIGIDEDGRREVLGFWVGDRESAASWREFLQDLKERGLK------DILLVVSDGHKGLKEAIREVFPGAKWQ 253 (381)
T ss_pred cccchhhhhhhhhcccccceeeeecccCCccCEeeecchhhhhcccc------ccceeeccccccccccccccccchhhh
Confidence 0001123 999999999999999999999999999999997 4544 588999999999999999999999999
Q ss_pred hhHHHHHHHHHhhcCch---hhHHHHHHHHHhhcHHHHHHHHHHHHh----cccchhhHhhhCCCCCccccccCCccc-c
Q 042031 275 FCLRYVSENFRDTFKNT---KLVNIFWNAVYALTTVEFEAKISEMVE----ISQDVIPWFQQFPPQLWAIAYFEGVRY-G 346 (565)
Q Consensus 275 ~C~~Hi~~n~~~~~~~~---~~~~~~~~~~~~~t~~eF~~~~~~l~~----~~~~~~~~l~~~~~~~W~~a~~~~~~~-~ 346 (565)
.|.+|+++|+.++++.+ .+...+..+..+.+.++....++++.+ .+|.+.++|++...+.|+..-++...+ -
T Consensus 254 rC~vH~~RNv~~~v~~k~~~~v~~~Lk~I~~a~~~e~a~~~l~~f~~~~~~kyp~~~~~l~~~~~~~~tf~~fP~~~~~~ 333 (381)
T PF00872_consen 254 RCVVHLMRNVLRKVPKKDRKEVKADLKAIYQAPDKEEAREALEEFAEKWEKKYPKAAKSLEENWDELLTFLDFPPEHRRS 333 (381)
T ss_pred hheechhhhhccccccccchhhhhhccccccccccchhhhhhhhcccccccccchhhhhhhhccccccceeeecchhccc
Confidence 99999999999998653 777788888778888888888888765 388899999887667676544444444 5
Q ss_pred ccccchhHHHHHHhhh
Q 042031 347 HFTLGVTELLYNWALE 362 (565)
Q Consensus 347 ~~ttn~~Es~n~~lk~ 362 (565)
+.|||..||+|+.|+.
T Consensus 334 i~TTN~iEsln~~irr 349 (381)
T PF00872_consen 334 IRTTNAIESLNKEIRR 349 (381)
T ss_pred cchhhhccccccchhh
Confidence 5699999999999986
No 4
>PF03108 DBD_Tnp_Mut: MuDR family transposase; InterPro: IPR004332 The plant MuDR transposase domain is present in plant proteins that are presumed to be the transposases for Mutator transposable elements. The function of these proteins is unknown. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=99.70 E-value=7.5e-17 Score=122.44 Aligned_cols=67 Identities=40% Similarity=0.840 Sum_probs=65.1
Q ss_pred CCCccccCeeCCHHHHHHHHHHHHHhccceEEEEeecCeEEEEEeecCCCccEEEEEEcCCCCceEE
Q 042031 17 EPTLSIGQEFPDVETCRRTLKDIAIALHFDLRIVKSDRSRFIAKCSKEGCPWRVHVAKCPGVPTFSI 83 (565)
Q Consensus 17 ~~~l~~G~~F~s~~e~~~a~~~ya~~~gf~~~~~kS~~~r~~~~C~~~~C~~~v~~~~~~~~~~~~V 83 (565)
++.|.+||+|+|++|++.||..||++.+|++++.+|++.+++++|...+|||+|++++.++++.|+|
T Consensus 1 n~~l~~G~~F~~~~e~k~av~~yai~~~~~~~v~ksd~~r~~~~C~~~~C~Wrv~as~~~~~~~~~I 67 (67)
T PF03108_consen 1 NPELEVGQTFPSKEEFKEAVREYAIKNGFEFKVKKSDKKRYRAKCKDKGCPWRVRASKRKRSDTFQI 67 (67)
T ss_pred CCccccCCEECCHHHHHHHHHHHHHhcCcEEEEeccCCEEEEEEEcCCCCCEEEEEEEcCCCCEEEC
Confidence 5789999999999999999999999999999999999999999999999999999999999999986
No 5
>COG3328 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=99.25 E-value=8.5e-11 Score=118.39 Aligned_cols=219 Identities=19% Similarity=0.186 Sum_probs=152.2
Q ss_pred CCCCCHHHHHHHHHHHcCcccCHHHHHHHHHHHHHHhhCChHHhhccHHHHHHHHhhcCCCcEEEEEEcCCcc-cchhhh
Q 042031 119 NPQYKPKEILQDIRDQHGVAVSYMQAWRGKERSMAALHGTYEEGYRLLPAYCEQIRKTNPGSIASVFATGQKI-VSIGSL 197 (565)
Q Consensus 119 ~~~~~~~~i~~~~~~~~g~~~s~~~~~r~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~v~~~~~~~-~~i~~f 197 (565)
..++++.++...++..++..+|...+.+.....++ .+.+++..-++-+-.+.+|.... .. ++
T Consensus 98 ~~gv~Tr~i~~~~~~~~~~~~s~~~iS~~~~~~~e---------------~v~~~~~r~l~~~~~v~~D~~~~k~r--~v 160 (379)
T COG3328 98 AKGVTTREIEALLEELYGHKVSPSVISVVTDRLDE---------------KVKAWQNRPLGDYPYVYLDAKYVKVR--SV 160 (379)
T ss_pred HcCCcHHHHHHHHHHhhCcccCHHHhhhHHHHHHH---------------HHHHHHhccccCceEEEEecceeehh--hh
Confidence 36789999999999998888888777665544433 33444444443333333332211 00 00
Q ss_pred hcccccc--cccCCCCceeEeEEEEeeccccchHHHHHHHHH-HHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCCcch
Q 042031 198 FLIVHQY--MGVDAEDALFPLAIAIVDVESDENWMWFMSELR-KLLGVNTDNMPRLTILSERQRGIVEAVETHFPSAFHC 274 (565)
Q Consensus 198 ~~l~~~y--~g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~-~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~~h~ 274 (565)
.-...| +|++.+|+-.++|+.+-+.|+ ..|.-||..|+ +.+. ....+++|..+|+.+||.++||.+.++
T Consensus 161 -~~~~~~ia~Gv~~eG~reilg~~~~~~e~-~~w~~~l~~l~~rgl~------~v~l~v~Dg~~gl~~aI~~v~p~a~~Q 232 (379)
T COG3328 161 -RNKAVYIAIGVTEEGRREILGIWVGVRES-KFWLSFLLDLKNRGLS------DVLLVVVDGLKGLPEAISAVFPQAAVQ 232 (379)
T ss_pred -hhheeeeeeccCcccchhhhceeeecccc-hhHHHHHHHHHhcccc------ceeEEecchhhhhHHHHHHhccHhhhh
Confidence 011112 999999999999999999999 99997777776 4455 346677899999999999999999999
Q ss_pred hhHHHHHHHHHhhcCch---hhHHHHHHHHHhhcHHHHHHHHHHHHh----cccchhhHhhhCCCCCc-cccccCCcccc
Q 042031 275 FCLRYVSENFRDTFKNT---KLVNIFWNAVYALTTVEFEAKISEMVE----ISQDVIPWFQQFPPQLW-AIAYFEGVRYG 346 (565)
Q Consensus 275 ~C~~Hi~~n~~~~~~~~---~~~~~~~~~~~~~t~~eF~~~~~~l~~----~~~~~~~~l~~~~~~~W-~~a~~~~~~~~ 346 (565)
.|..|+.+|+..+...+ .+...+..+..+.+.++-...|..+.. ..|....++.+..-+.| ..+|....+--
T Consensus 233 ~C~vH~~Rnll~~v~~k~~d~i~~~~~~I~~a~~~e~~~~~~~~~~~~w~~~yP~i~~~~~~~~~~~~~F~~fp~~~r~~ 312 (379)
T COG3328 233 RCIVHLVRNLLDKVPRKDQDAVLSDLRSIYIAPDAEEALLALLAFSELWGKRYPAILKSWRNALEELLPFFAFPSEIRKI 312 (379)
T ss_pred hhhhHHHhhhhhhhhhhhhHHHHhhhhhhhccCCcHHHHHHHHHHHHhhhhhcchHHHHHHHHHHHhcccccCcHHHHhH
Confidence 99999999999988765 344445555556677766666666443 47787777777644444 33343333335
Q ss_pred ccccchhHHHHHHhhh
Q 042031 347 HFTLGVTELLYNWALE 362 (565)
Q Consensus 347 ~~ttn~~Es~n~~lk~ 362 (565)
+.|||..|++|+.++.
T Consensus 313 i~ttN~IE~~n~~ir~ 328 (379)
T COG3328 313 IYTTNAIESLNKLIRR 328 (379)
T ss_pred hhcchHHHHHHHHHHH
Confidence 6799999999997763
No 6
>smart00575 ZnF_PMZ plant mutator transposase zinc finger.
Probab=98.99 E-value=2.1e-10 Score=69.73 Aligned_cols=28 Identities=50% Similarity=1.068 Sum_probs=25.5
Q ss_pred ceeccCccccCCCCchhHHHHHHhcCCC
Q 042031 441 RVCSCRRWQLYGLPCAHAAAALLSCGQN 468 (565)
Q Consensus 441 ~~CsC~~~~~~GiPC~H~lav~~~~~~~ 468 (565)
.+|||++||..||||+|+|+|+...+++
T Consensus 1 ~~CsC~~~~~~gipC~H~i~v~~~~~~~ 28 (28)
T smart00575 1 KTCSCRKFQLSGIPCRHALAAAIHIGLS 28 (28)
T ss_pred CcccCCCcccCCccHHHHHHHHHHhCCC
Confidence 4799999999999999999999988763
No 7
>PF08731 AFT: Transcription factor AFT; InterPro: IPR014842 AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This entry includes the paralogous pair of transcription factors AFT1 and AFT2.
Probab=98.89 E-value=7e-09 Score=83.49 Aligned_cols=68 Identities=21% Similarity=0.451 Sum_probs=64.5
Q ss_pred eCCHHHHHHHHHHHHHhccceEEEEeecCeEEEEEeec------------------------------------------
Q 042031 26 FPDVETCRRTLKDIAIALHFDLRIVKSDRSRFIAKCSK------------------------------------------ 63 (565)
Q Consensus 26 F~s~~e~~~a~~~ya~~~gf~~~~~kS~~~r~~~~C~~------------------------------------------ 63 (565)
|.+++|++.+|+.++...||++.+.+|+...+.|.|..
T Consensus 1 F~~k~~ikpwlq~~~~~~Gi~iVIerSd~~ki~FkCk~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~k~t~srk 80 (111)
T PF08731_consen 1 FDDKDEIKPWLQKIFYPQGIGIVIERSDKKKIVFKCKNGKRYRHKKKKKGQAQAQQKESTSGNKNKSSKKKKKKRTKSRK 80 (111)
T ss_pred CCchHHHHHHHHHHhhhcCceEEEEecCCceEEEEEecCCCcccccccccccccccccccccccccccccccCCcccccc
Confidence 88999999999999999999999999999999999963
Q ss_pred CCCccEEEEEEcCCCCceEEeeeccceeee
Q 042031 64 EGCPWRVHVAKCPGVPTFSIRTLHGEHTCE 93 (565)
Q Consensus 64 ~~C~~~v~~~~~~~~~~~~V~~~~~~H~c~ 93 (565)
..|||+|+|+.....+.|.|+.+++.|||+
T Consensus 81 ~~CPFriRA~yS~k~k~W~lvvvnn~HnH~ 110 (111)
T PF08731_consen 81 NTCPFRIRANYSKKNKKWTLVVVNNEHNHP 110 (111)
T ss_pred cCCCeEEEEEEEecCCeEEEEEecCCcCCC
Confidence 379999999999999999999999999996
No 8
>PF03101 FAR1: FAR1 DNA-binding domain; InterPro: IPR004330 Phytochrome A is the primary photoreceptor for mediating various far-red light-induced responses in higher plants. It has been found that the proteins governing this response, which include FAR-RED ELONGATED HYPOCOTYL3 (FHY3) and FAR-RED-IMPAIRED RESPONSE1 (FAR1), are a pair of homologous proteins sharing significant sequence homology to mutator-like transposases. These proteins appear to be novel transcription factors, which are essential for activating the expression of FHY1 and FHL (for FHY1-like) and related genes, whose products are required for light-induced phytochrome A nuclear accumulation and subsequent light responses in plants. The FRS (FAR1 Related Sequences) family of proteins share a similar domain structure to mutator-like transposases, including an N-terminal C2H2 zinc finger domain, a central putative core transposase domain, and a C-terminal SWIM motif (named after SWI2/SNF and MuDR transposases). It seems plausible that the FRS family represents transcription factors derived from mutator-like transposases. This entry represents a domain found in FAR1 and FRS proteins. It contains a WRKY-like fold and is therefore most likely a zinc-binding, DNA-binding domain.
Probab=98.73 E-value=1.8e-08 Score=81.65 Aligned_cols=61 Identities=23% Similarity=0.314 Sum_probs=53.3
Q ss_pred HHHHHHHHhccceEEEEeecCe-------EEEEEeec----------------------CCCccEEEEEEcCCCCceEEe
Q 042031 34 RTLKDIAIALHFDLRIVKSDRS-------RFIAKCSK----------------------EGCPWRVHVAKCPGVPTFSIR 84 (565)
Q Consensus 34 ~a~~~ya~~~gf~~~~~kS~~~-------r~~~~C~~----------------------~~C~~~v~~~~~~~~~~~~V~ 84 (565)
+||+.||..+||.+++.+|.+. ++.++|++ +||||+|.+.+.. ++.|.|.
T Consensus 1 ~fy~~yA~~~GF~vr~~~s~~~~~~~~~~~~~~~C~r~G~~~~~~~~~~~~~r~~~s~ktgC~a~i~v~~~~-~~~w~v~ 79 (91)
T PF03101_consen 1 DFYNSYARRHGFSVRKSSSRKSKKNGEIKRVTFVCSRGGKYKSKKKNEEKRRRNRPSKKTGCKARINVKRRK-DGKWRVT 79 (91)
T ss_pred CHHHHhcCcCCeEEEEeeeEeCCCCceEEEEEEEECCcccccccccccccccccccccccCCCEEEEEEEcc-CCEEEEE
Confidence 4899999999999999876654 57889985 5899999999877 8999999
Q ss_pred eeccceeeeCc
Q 042031 85 TLHGEHTCEGV 95 (565)
Q Consensus 85 ~~~~~H~c~~~ 95 (565)
.+..+|||+..
T Consensus 80 ~~~~~HNH~L~ 90 (91)
T PF03101_consen 80 SFVLEHNHPLC 90 (91)
T ss_pred ECcCCcCCCCC
Confidence 99999999753
No 9
>PF04434 SWIM: SWIM zinc finger; InterPro: IPR007527 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1). It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination. SWIM domains are also found in the homologous recombination protein Sws1, as well as in several hypothetical proteins. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding
Probab=98.32 E-value=4.9e-07 Score=60.61 Aligned_cols=30 Identities=43% Similarity=0.802 Sum_probs=27.2
Q ss_pred EEccCceeccCccccCCCCchhHHHHHHhc
Q 042031 436 VDIRSRVCSCRRWQLYGLPCAHAAAALLSC 465 (565)
Q Consensus 436 V~l~~~~CsC~~~~~~GiPC~H~lav~~~~ 465 (565)
+++...+|+|..|+..|.||+|++|++...
T Consensus 10 ~~~~~~~CsC~~~~~~~~~CkHi~av~~~~ 39 (40)
T PF04434_consen 10 VSIEQASCSCPYFQFRGGPCKHIVAVLLAL 39 (40)
T ss_pred ccccccEeeCCCccccCCcchhHHHHHHhh
Confidence 567788999999999999999999998765
No 10
>PF00098 zf-CCHC: Zinc knuckle; InterPro: IPR001878 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence C-X2-C-X4-H-X4-C, where X can be any amino acid and each number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses, where they are required for viral genome packaging and for the early infection process. They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0003676 nucleic acid binding, 0008270 zinc ion binding; PDB: 2L44_A 1A1T_A 1WWG_A 1U6P_A 1WWD_A 1WWE_A 1A6B_B 1F6U_A 1MFS_A 1NCP_C ....
Probab=96.41 E-value=0.0018 Score=34.72 Aligned_cols=18 Identities=28% Similarity=0.650 Sum_probs=16.3
Q ss_pred EEcCccCCCCCCCCCCCC
Q 042031 546 VQCGRCHLLGHSQKKCTM 563 (565)
Q Consensus 546 ~~C~~C~~~gHn~~tC~~ 563 (565)
++|-+|++.||.++.||+
T Consensus 1 ~~C~~C~~~GH~~~~Cp~ 18 (18)
T PF00098_consen 1 RKCFNCGEPGHIARDCPK 18 (18)
T ss_dssp SBCTTTSCSSSCGCTSSS
T ss_pred CcCcCCCCcCcccccCcc
Confidence 379999999999999995
No 11
>PF15288 zf-CCHC_6: Zinc knuckle
Probab=96.20 E-value=0.0021 Score=41.85 Aligned_cols=18 Identities=44% Similarity=1.128 Sum_probs=15.9
Q ss_pred EEcCccCCCCCCC--CCCCC
Q 042031 546 VQCGRCHLLGHSQ--KKCTM 563 (565)
Q Consensus 546 ~~C~~C~~~gHn~--~tC~~ 563 (565)
++|++||+.||.+ ++||+
T Consensus 2 ~kC~~CG~~GH~~t~k~CP~ 21 (40)
T PF15288_consen 2 VKCKNCGAFGHMRTNKRCPM 21 (40)
T ss_pred ccccccccccccccCccCCC
Confidence 6899999999998 77875
No 12
>PF13696 zf-CCHC_2: Zinc knuckle
Probab=94.53 E-value=0.019 Score=35.51 Aligned_cols=21 Identities=29% Similarity=0.520 Sum_probs=18.9
Q ss_pred ceEEcCccCCCCCCCCCCCCC
Q 042031 544 RIVQCGRCHLLGHSQKKCTMP 564 (565)
Q Consensus 544 ~~~~C~~C~~~gHn~~tC~~~ 564 (565)
..+.|.+|++.||..+.||+.
T Consensus 7 ~~Y~C~~C~~~GH~i~dCP~~ 27 (32)
T PF13696_consen 7 PGYVCHRCGQKGHWIQDCPTN 27 (32)
T ss_pred CCCEeecCCCCCccHhHCCCC
Confidence 458999999999999999974
No 13
>PF03106 WRKY: WRKY DNA -binding domain; InterPro: IPR003657 The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene. Structural studies indicate that this domain is a four-stranded beta-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure. The WRKYGQK residues correspond to the most N-terminal beta-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the beta-sheet.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 2AYD_A 1WJ2_A 2LEX_A.
Probab=94.14 E-value=0.077 Score=38.71 Aligned_cols=39 Identities=31% Similarity=0.606 Sum_probs=32.4
Q ss_pred eEEEEEeecCCCccEEEEEEcCCCCceEEeeeccceeee
Q 042031 55 SRFIAKCSKEGCPWRVHVAKCPGVPTFSIRTLHGEHTCE 93 (565)
Q Consensus 55 ~r~~~~C~~~~C~~~v~~~~~~~~~~~~V~~~~~~H~c~ 93 (565)
.|.-+.|+..+|+++-.+.+..+++...++++.++|||+
T Consensus 21 pRsYYrCt~~~C~akK~Vqr~~~d~~~~~vtY~G~H~h~ 59 (60)
T PF03106_consen 21 PRSYYRCTHPGCPAKKQVQRSADDPNIVIVTYEGEHNHP 59 (60)
T ss_dssp EEEEEEEECTTEEEEEEEEEETTCCCEEEEEEES--SS-
T ss_pred eeEeeeccccChhheeeEEEecCCCCEEEEEEeeeeCCC
Confidence 456689999999999999988888899999999999996
No 14
>PF06782 UPF0236: Uncharacterised protein family (UPF0236); InterPro: IPR009620 This is a group of proteins of unknown function.
Probab=94.05 E-value=1.8 Score=46.20 Aligned_cols=130 Identities=15% Similarity=0.192 Sum_probs=91.2
Q ss_pred ccccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCCcchhhHHHHHHHHHhhcCc-hhhHHHHHHHH
Q 042031 223 VESDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEAVETHFPSAFHCFCLRYVSENFRDTFKN-TKLVNIFWNAV 301 (565)
Q Consensus 223 ~E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~~h~~C~~Hi~~n~~~~~~~-~~~~~~~~~~~ 301 (565)
..+.+-|.-+...+.+.... ....-+++.+|+.+.|.+++. .||.+.|.+..+|+.+.+.+.++. +.+...++.+.
T Consensus 235 ~~~~~~~~~v~~~i~~~Y~~--~~~~~iiingDGa~WIk~~~~-~~~~~~~~LD~FHl~k~i~~~~~~~~~~~~~~~~al 311 (470)
T PF06782_consen 235 ESAEEFWEEVLDYIYNHYDL--DKTTKIIINGDGASWIKEGAE-FFPKAEYFLDRFHLNKKIKQALSHDPELKEKIRKAL 311 (470)
T ss_pred cchHHHHHHHHHHHHHhcCc--ccceEEEEeCCCcHHHHHHHH-hhcCceEEecHHHHHHHHHHHhhhChHHHHHHHHHH
Confidence 55678899888888877763 123357788999999988776 999999999999999999988863 46777777788
Q ss_pred HhhcHHHHHHHHHHHHhc--cc-------chhhHhhhCCCCCccc--cccCCccccccccchhHHHHHHhh
Q 042031 302 YALTTVEFEAKISEMVEI--SQ-------DVIPWFQQFPPQLWAI--AYFEGVRYGHFTLGVTELLYNWAL 361 (565)
Q Consensus 302 ~~~t~~eF~~~~~~l~~~--~~-------~~~~~l~~~~~~~W~~--a~~~~~~~~~~ttn~~Es~n~~lk 361 (565)
+.....+++..++.+... .+ +...||.. +|-. .|... -|.......|+.+..+.
T Consensus 312 ~~~d~~~l~~~L~~~~~~~~~~~~~~~i~~~~~Yl~~----n~~~i~~y~~~--~~~~g~g~ee~~~~~~s 376 (470)
T PF06782_consen 312 KKGDKKKLETVLDTAESCAKDEEERKKIRKLRKYLLN----NWDGIKPYRER--EGLRGIGAEESVSHVLS 376 (470)
T ss_pred HhcCHHHHHHHHHHHHHhhhchHHHHHHHHHHHHHHH----CHHHhhhhhhc--cCCCccchhhhhhhHHH
Confidence 877888888888887754 22 23444444 4422 22210 23334455777777664
No 15
>PF04684 BAF1_ABF1: BAF1 / ABF1 chromatin reorganising factor; InterPro: IPR006774 ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 in Saccharomyces cerevisiae (P14164 from SWISSPROT)). The N-terminal two-thirds of the protein are necessary for DNA binding, and the N terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilise the protein structure.; GO: 0003677 DNA binding, 0006338 chromatin remodeling, 0005634 nucleus
Probab=93.25 E-value=0.19 Score=51.32 Aligned_cols=67 Identities=22% Similarity=0.533 Sum_probs=56.5
Q ss_pred CCcCCCCCCCccccCeeCCHHHHHHHHHHHHHhccceEEEEeecCe-EEEEEeecCCCccEEEEEEcC
Q 042031 10 NDSLSLAEPTLSIGQEFPDVETCRRTLKDIAIALHFDLRIVKSDRS-RFIAKCSKEGCPWRVHVAKCP 76 (565)
Q Consensus 10 ~~~~~~~~~~l~~G~~F~s~~e~~~a~~~ya~~~gf~~~~~kS~~~-r~~~~C~~~~C~~~v~~~~~~ 76 (565)
+..|...++.-..+..|++.++-+.+++.|.++....|..+.|-+. .++|.|.-..|||+|.++..+
T Consensus 12 n~~l~~~~~~~~~~~~f~tl~~wy~v~ndyefq~rcpiilknsh~nkhftfachlk~c~fkillsy~g 79 (496)
T PF04684_consen 12 NKSLASGDPQSAQARKFPTLEAWYNVINDYEFQSRCPIILKNSHRNKHFTFACHLKNCPFKILLSYCG 79 (496)
T ss_pred hhhhccCCcccccccCCCcHHHHHHHHhhhhhhhcCceeecccccccceEEEeeccCCCceeeeeecc
Confidence 3445455556666888999999999999999999999999988764 599999999999999998654
No 16
>COG3316 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=92.89 E-value=1.3 Score=41.16 Aligned_cols=116 Identities=12% Similarity=0.149 Sum_probs=80.9
Q ss_pred HHHHHcCcccCHHHHHHHHHHHHHHhhCChHHhhccHHHHHHHHhhcCCCcEEEEEEcCCcccchhhhhcccccc----c
Q 042031 130 DIRDQHGVAVSYMQAWRGKERSMAALHGTYEEGYRLLPAYCEQIRKTNPGSIASVFATGQKIVSIGSLFLIVHQY----M 205 (565)
Q Consensus 130 ~~~~~~g~~~s~~~~~r~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~v~~~~~~~~~i~~f~~l~~~y----~ 205 (565)
.+....|+.+++.++.|.=++. -+.+...+...++.....+-+|+ .+..+.|++ .
T Consensus 33 e~l~~rgi~v~h~Ti~rwv~k~--------------~~~~~~~~~~r~~~~~~~w~vDE-------t~ikv~gkw~ylyr 91 (215)
T COG3316 33 EMLAERGIEVDHETIHRWVQKY--------------GPLLARRLKRRKRKAGDSWRVDE-------TYIKVNGKWHYLYR 91 (215)
T ss_pred HHHHHcCcchhHHHHHHHHHHH--------------hHHHHHHhhhhccccccceeeee-------eEEeeccEeeehhh
Confidence 3445679999999988764432 13455566666665444444443 122256665 7
Q ss_pred ccCCCCceeEeEEEEeeccccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCCcch
Q 042031 206 GVDAEDALFPLAIAIVDVESDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEAVETHFPSAFHC 274 (565)
Q Consensus 206 g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~~h~ 274 (565)
++|.+| .++.+-|...-+...=.-||..+++.-+ .|.+|+||+.+....|+.++-+.+.|+
T Consensus 92 Aid~~g--~~Ld~~L~~rRn~~aAk~Fl~kllk~~g------~p~v~vtDka~s~~~A~~~l~~~~ehr 152 (215)
T COG3316 92 AIDADG--LTLDVWLSKRRNALAAKAFLKKLLKKHG------EPRVFVTDKAPSYTAALRKLGSEVEHR 152 (215)
T ss_pred hhccCC--CeEEEEEEcccCcHHHHHHHHHHHHhcC------CCceEEecCccchHHHHHhcCcchhee
Confidence 888884 4567777777777777777877776656 689999999999999999999977665
No 17
>PF01610 DDE_Tnp_ISL3: Transposase; InterPro: IPR002560 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. This family includes the IS204, IS1001, IS1096 and IS1165 transposases. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=91.31 E-value=0.16 Score=49.29 Aligned_cols=66 Identities=14% Similarity=0.023 Sum_probs=54.5
Q ss_pred EEEeeccccchHHHHHHHH-HHHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCCcchhhHHHHHHHHHhhc
Q 042031 218 IAIVDVESDENWMWFMSEL-RKLLGVNTDNMPRLTILSERQRGIVEAVETHFPSAFHCFCLRYVSENFRDTF 288 (565)
Q Consensus 218 ~a~~~~E~~e~~~w~l~~l-~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~~h~~C~~Hi~~n~~~~~ 288 (565)
++++++-+.++..=||..+ -... .....+|++|...+..+|+++.||+|.+..-.|||++++.+.+
T Consensus 30 l~i~~~r~~~~l~~~~~~~~~~~~-----~~~v~~V~~Dm~~~y~~~~~~~~P~A~iv~DrFHvvk~~~~al 96 (249)
T PF01610_consen 30 LDILPGRDKETLKDFFRSLYPEEE-----RKNVKVVSMDMSPPYRSAIREYFPNAQIVADRFHVVKLANRAL 96 (249)
T ss_pred EEEcCCccHHHHHHHHHHhCcccc-----ccceEEEEcCCCccccccccccccccccccccchhhhhhhhcc
Confidence 3578888888888787766 3332 3467899999999999999999999999999999999886644
No 18
>PF13610 DDE_Tnp_IS240: DDE domain
Probab=90.87 E-value=0.04 Score=48.29 Aligned_cols=60 Identities=15% Similarity=0.128 Sum_probs=51.5
Q ss_pred cccCCCCceeEeEEEEeeccccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCC
Q 042031 205 MGVDAEDALFPLAIAIVDVESDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEAVETHFPSA 271 (565)
Q Consensus 205 ~g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~ 271 (565)
-.+|.+++ +|++-|-..-+.+.=..||..+++..+ ..|..|+||..++...|+++++|..
T Consensus 22 ~aiD~~~~--~l~~~ls~~Rd~~aA~~Fl~~~l~~~~-----~~p~~ivtDk~~aY~~A~~~l~~~~ 81 (140)
T PF13610_consen 22 RAIDAEGN--ILDFYLSKRRDTAAAKRFLKRALKRHR-----GEPRVIVTDKLPAYPAAIKELNPEG 81 (140)
T ss_pred Eeeccccc--chhhhhhhhcccccceeeccccceeec-----cccceeecccCCccchhhhhccccc
Confidence 88999998 888888888888888888888776653 3789999999999999999998864
No 19
>smart00774 WRKY DNA binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.
Probab=90.71 E-value=0.35 Score=35.03 Aligned_cols=38 Identities=32% Similarity=0.573 Sum_probs=32.0
Q ss_pred eEEEEEeec-CCCccEEEEEEcCCCCceEEeeeccceee
Q 042031 55 SRFIAKCSK-EGCPWRVHVAKCPGVPTFSIRTLHGEHTC 92 (565)
Q Consensus 55 ~r~~~~C~~-~~C~~~v~~~~~~~~~~~~V~~~~~~H~c 92 (565)
.|.-++|+. .+|+++=.+.+..+++...++++.++|||
T Consensus 21 pRsYYrCt~~~~C~a~K~Vq~~~~d~~~~~vtY~g~H~h 59 (59)
T smart00774 21 PRSYYRCTYSQGCPAKKQVQRSDDDPSVVEVTYEGEHTH 59 (59)
T ss_pred cceEEeccccCCCCCcccEEEECCCCCEEEEEEeeEeCC
Confidence 345579998 89999988887777788899999999997
No 20
>PF04500 FLYWCH: FLYWCH zinc finger domain; InterPro: IPR007588 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. 
C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets. This entry represents a potential FLYWCH Zn-finger domain found in a number of eukaryotic proteins. FLYWCH is a C2H2-type zinc finger characterised by five conserved hydrophobic residues, containing the conserved sequence motif: F/Y-X(n)-L-X(n)-F/Y-X(n)-WXCX(6-12)CX(17-22)HXH where X indicates any amino acid. This domain was first characterised in Drosophila Modifier of mdg4 proteins, Mod(mdg4), putative chromatin modulators involved in higher order chromatin domains. Mod(mdg4) proteins share a common N-terminal BTB/POZ domain, but differ in their C-terminal region, most containing C-terminal FLYWCH zinc finger motifs. The FLYWCH domain in Mod(mdg4) proteins has a putative role in protein-protein interactions; for example, Mod(mdg4)-67.2 interacts with DNA-binding protein Su(Hw) via its FLYWCH domain. FLYWCH domains have been described in other proteins as well, including suppressor of killer of prune, Su(Kpn), which contains 4 terminal FLYWCH zinc finger motifs in a tandem array and a C-terminal glutathione S-transferase (GST) domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; PDB: 2RPR_A.
Probab=90.50 E-value=0.45 Score=34.68 Aligned_cols=46 Identities=20% Similarity=0.322 Sum_probs=24.9
Q ss_pred ccceEEEEeecCeEEEEEeecC---CCccEEEEEEcCCCCceEEeeeccceee
Q 042031 43 LHFDLRIVKSDRSRFIAKCSKE---GCPWRVHVAKCPGVPTFSIRTLHGEHTC 92 (565)
Q Consensus 43 ~gf~~~~~kS~~~r~~~~C~~~---~C~~~v~~~~~~~~~~~~V~~~~~~H~c 92 (565)
.|+.|...+.........|... +|+++|... .+.-.|.....+|||
T Consensus 14 ~Gy~y~~~~~~~~~~~WrC~~~~~~~C~a~~~~~----~~~~~~~~~~~~HnH 62 (62)
T PF04500_consen 14 DGYRYYFNKRNDGKTYWRCSRRRSHGCRARLITD----AGDGRVVRTNGEHNH 62 (62)
T ss_dssp TTEEEEEEEE-SS-EEEEEGGGTTS----EEEEE------TTEEEE-S---SS
T ss_pred CCeEEECcCCCCCcEEEEeCCCCCCCCeEEEEEE----CCCCEEEECCCccCC
Confidence 3677777776677788999864 899999997 233455566688987
No 21
>smart00343 ZnF_C2HC zinc finger.
Probab=87.08 E-value=0.32 Score=28.74 Aligned_cols=17 Identities=29% Similarity=0.755 Sum_probs=15.3
Q ss_pred EcCccCCCCCCCCCCCC
Q 042031 547 QCGRCHLLGHSQKKCTM 563 (565)
Q Consensus 547 ~C~~C~~~gHn~~tC~~ 563 (565)
.|.+|++.||..+.||.
T Consensus 1 ~C~~CG~~GH~~~~C~~ 17 (26)
T smart00343 1 KCYNCGKEGHIARDCPK 17 (26)
T ss_pred CCccCCCCCcchhhCCc
Confidence 48899999999999983
No 22
>PF14392 zf-CCHC_4: Zinc knuckle
Probab=86.99 E-value=0.24 Score=34.57 Aligned_cols=20 Identities=35% Similarity=0.642 Sum_probs=17.6
Q ss_pred ceEEcCccCCCCCCCCCCCC
Q 042031 544 RIVQCGRCHLLGHSQKKCTM 563 (565)
Q Consensus 544 ~~~~C~~C~~~gHn~~tC~~ 563 (565)
-...|.+|+..||+.+.||.
T Consensus 30 lp~~C~~C~~~gH~~~~C~k 49 (49)
T PF14392_consen 30 LPRFCFHCGRIGHSDKECPK 49 (49)
T ss_pred cChhhcCCCCcCcCHhHcCC
Confidence 34689999999999999984
No 23
>PF13565 HTH_32: Homeodomain-like domain
Probab=86.49 E-value=1.8 Score=33.17 Aligned_cols=41 Identities=24% Similarity=0.462 Sum_probs=34.6
Q ss_pred hHHHHHHHHHhcCCCCCHHHHHHHHHHHcCccc--CHHHHHHH
Q 042031 107 WVARSVEARIRDNPQYKPKEILQDIRDQHGVAV--SYMQAWRG 147 (565)
Q Consensus 107 ~i~~~~~~~l~~~~~~~~~~i~~~~~~~~g~~~--s~~~~~r~ 147 (565)
.+...+.+.+..+|.+++.+|.+.|.+++|+.+ |.+++||.
T Consensus 34 e~~~~i~~~~~~~p~wt~~~i~~~L~~~~g~~~~~S~~tv~R~ 76 (77)
T PF13565_consen 34 EQRERIIALIEEHPRWTPREIAEYLEEEFGISVRVSRSTVYRI 76 (77)
T ss_pred HHHHHHHHHHHhCCCCCHHHHHHHHHHHhCCCCCccHhHHHHh
Confidence 344567777788999999999999999999876 99999874
No 24
>PF03050 DDE_Tnp_IS66: Transposase IS66 family; InterPro: IPR004291 Transposase proteins are necessary for efficient DNA transposition. This family includes the bacterial insertion sequence (IS) element, IS66, from Agrobacterium tumefaciens. IS66 may cause genetic and structural variations of the T region and the vir region of the octopine Ti plasmids. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=83.13 E-value=1.1 Score=43.98 Aligned_cols=36 Identities=14% Similarity=0.293 Sum_probs=28.8
Q ss_pred EEEEeCCcccHHHHHhhhCCCCcchhhHHHHHHHHHhhcCc
Q 042031 250 LTILSERQRGIVEAVETHFPSAFHCFCLRYVSENFRDTFKN 290 (565)
Q Consensus 250 ~~iitD~~~~l~~Ai~~vfP~~~h~~C~~Hi~~n~~~~~~~ 290 (565)
-+++||+-.+-.. +..+.|..|..|+.+.+.+-...
T Consensus 121 GilvsD~y~~Y~~-----~~~~~hq~C~AH~~R~~~~~~~~ 156 (271)
T PF03050_consen 121 GILVSDGYSAYNK-----LAGITHQLCWAHLRRDFQDAAES 156 (271)
T ss_pred eeeeccccccccc-----ccccccccccccccccccccccc
Confidence 4899999987654 33789999999999998776654
No 25
>COG5179 TAF1 Transcription initiation factor TFIID, subunit TAF1 [Transcription]
Probab=81.98 E-value=0.86 Score=48.12 Aligned_cols=25 Identities=32% Similarity=0.695 Sum_probs=19.4
Q ss_pred CCCCCceEEcCccCCCCCCC--CCCCC
Q 042031 539 FKRPKRIVQCGRCHLLGHSQ--KKCTM 563 (565)
Q Consensus 539 ~~~~~~~~~C~~C~~~gHn~--~tC~~ 563 (565)
+++...+++|++|||.||=+ +.||+
T Consensus 931 ~GRK~Ttr~C~nCGQvGHmkTNK~CP~ 957 (968)
T COG5179 931 KGRKNTTRTCGNCGQVGHMKTNKACPK 957 (968)
T ss_pred CCCCCcceecccccccccccccccCcc
Confidence 45555689999999999965 46775
No 26
>COG5431 Uncharacterized metal-binding protein [Function unknown]
Probab=78.29 E-value=1.6 Score=34.95 Aligned_cols=34 Identities=32% Similarity=0.545 Sum_probs=24.7
Q ss_pred EecCceEEEEccCceeccCccc-----cCCCCchhHHHHHH
Q 042031 428 VSTERTNIVDIRSRVCSCRRWQ-----LYGLPCAHAAAALL 463 (565)
Q Consensus 428 ~~~~~~~~V~l~~~~CsC~~~~-----~~GiPC~H~lav~~ 463 (565)
...++.|+++.+ -|||..|- .-.-||.|++.+-.
T Consensus 39 vG~~rdYIl~~g--fCSCp~~~~svvl~Gk~~C~Hi~glk~ 77 (117)
T COG5431 39 VGKERDYILEGG--FCSCPDFLGSVVLKGKSPCAHIIGLKV 77 (117)
T ss_pred EccccceEEEcC--cccCHHHHhHhhhcCcccchhhhheee
Confidence 346678999877 89998876 22357999997533
No 27
>PF02178 AT_hook: AT hook motif; InterPro: IPR017956 AT hooks are DNA-binding motifs with a preference for A/T rich regions. These motifs are found in a variety of proteins, including the high mobility group (HMG) proteins, in DNA-binding proteins from plants and in hBRG1 protein, a central ATPase of the human switching/sucrose non-fermenting (SWI/SNF) remodeling complex. High mobility group (HMG) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG-I and HMG-Y (HMGA) are proteins of about 100 amino acid residues which are produced by the alternative splicing of a single gene. HMG-I/Y proteins bind preferentially to the minor groove of AT-rich regions in double-stranded DNA in a non-sequence-specific manner. It is suggested that these proteins could function in nucleosome phasing and in the 3' end processing of mRNA transcripts. They are also involved in the transcription regulation of genes containing, or in close proximity to, AT-rich regions.; GO: 0003677 DNA binding; PDB: 2EZE_A 2EZD_A 2EZF_A 2EZG_A.
Probab=76.67 E-value=1.1 Score=21.78 Aligned_cols=9 Identities=56% Similarity=1.066 Sum_probs=3.5
Q ss_pred CCCCCCCcc
Q 042031 524 RPPGRPKKK 532 (565)
Q Consensus 524 ~~~GRpkk~ 532 (565)
+++|||+|.
T Consensus 2 r~RGRP~k~ 10 (13)
T PF02178_consen 2 RKRGRPRKN 10 (13)
T ss_dssp --SS--TT-
T ss_pred CcCCCCccc
Confidence 678999875
No 28
>PHA02517 putative transposase OrfB; Reviewed
Probab=75.56 E-value=26 Score=34.39 Aligned_cols=42 Identities=12% Similarity=0.317 Sum_probs=31.7
Q ss_pred hHHHHHHHHHhc-CCCCCHHHHHHHHHHHcCcccCHHHHHHHHH
Q 042031 107 WVARSVEARIRD-NPQYKPKEILQDIRDQHGVAVSYMQAWRGKE 149 (565)
Q Consensus 107 ~i~~~~~~~l~~-~~~~~~~~i~~~~~~~~g~~~s~~~~~r~~~ 149 (565)
.+.+.+.+.... .+.+..+.|...|.+. |+.+|.++++|..+
T Consensus 30 ~l~~~I~~i~~~~~~~~G~r~I~~~L~~~-g~~vs~~tV~Rim~ 72 (277)
T PHA02517 30 WLKSEILRVYDENHQVYGVRKVWRQLNRE-GIRVARCTVGRLMK 72 (277)
T ss_pred HHHHHHHHHHHHhCCCCCHHHHHHHHHhc-CcccCHHHHHHHHH
Confidence 455555666554 5788999999998755 99999999998644
No 29
>PRK09335 30S ribosomal protein S26e; Provisional
Probab=73.83 E-value=2.7 Score=33.29 Aligned_cols=27 Identities=41% Similarity=0.609 Sum_probs=20.3
Q ss_pred CCCCCCCCCCCcccccccCCCCCCceEEcCccCC
Q 042031 520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHL 553 (565)
Q Consensus 520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~ 553 (565)
|..++..||-|+.|-.. +..+|.+||.
T Consensus 2 ~kKRrn~GR~K~~rGhv-------~~V~C~nCgr 28 (95)
T PRK09335 2 PKKRENRGRRKGDKGHV-------GYVQCDNCGR 28 (95)
T ss_pred CcccccCCCCCCCCCCC-------ccEEeCCCCC
Confidence 56778888887765432 5789999997
No 30
>PLN00186 ribosomal protein S26; Provisional
Probab=69.40 E-value=3.8 Score=33.27 Aligned_cols=27 Identities=33% Similarity=0.570 Sum_probs=20.2
Q ss_pred CCCCCCCCCCCcccccccCCCCCCceEEcCccCC
Q 042031 520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHL 553 (565)
Q Consensus 520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~ 553 (565)
|..++..||-|+.|-.. +..+|++|+.
T Consensus 2 ~kKRrN~GR~K~~rGhv-------~~V~C~nCgr 28 (109)
T PLN00186 2 TKKRRNGGRNKHGRGHV-------KRIRCSNCGK 28 (109)
T ss_pred CcccccCCCCCCCCCCC-------cceeeCCCcc
Confidence 56778888887765433 5789999997
No 31
>PTZ00172 40S ribosomal protein S26; Provisional
Probab=68.56 E-value=4.1 Score=33.10 Aligned_cols=27 Identities=33% Similarity=0.556 Sum_probs=20.4
Q ss_pred CCCCCCCCCCCcccccccCCCCCCceEEcCccCC
Q 042031 520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHL 553 (565)
Q Consensus 520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~ 553 (565)
|..++..||-|+.|-.. +..+|.+|+.
T Consensus 2 ~kKRrN~GR~K~~rGhv-------~~V~C~nCgr 28 (108)
T PTZ00172 2 TSKRRNNGRSKHGRGHV-------KPVRCSNCGR 28 (108)
T ss_pred CcccccCCCCCCCCCCC-------ccEEeCCccc
Confidence 56778888887765433 5789999997
No 32
>smart00384 AT_hook DNA binding domain with preference for A/T rich regions. Small DNA-binding motif first described in the high mobility group non-histone chromosomal protein HMG-I(Y).
Probab=64.57 E-value=4.1 Score=23.85 Aligned_cols=12 Identities=42% Similarity=0.733 Sum_probs=9.4
Q ss_pred CCCCCCCCcccc
Q 042031 523 RRPPGRPKKKVL 534 (565)
Q Consensus 523 ~~~~GRpkk~R~ 534 (565)
.+++|||+|...
T Consensus 1 kRkRGRPrK~~~ 12 (26)
T smart00384 1 KRKRGRPRKAPK 12 (26)
T ss_pred CCCCCCCCCCCC
Confidence 378999998754
No 33
>PF13917 zf-CCHC_3: Zinc knuckle
Probab=58.12 E-value=6.1 Score=26.42 Aligned_cols=18 Identities=33% Similarity=0.779 Sum_probs=16.6
Q ss_pred eEEcCccCCCCCCCCCCC
Q 042031 545 IVQCGRCHLLGHSQKKCT 562 (565)
Q Consensus 545 ~~~C~~C~~~gHn~~tC~ 562 (565)
...|.+|++.||-..-||
T Consensus 4 ~~~CqkC~~~GH~tyeC~ 21 (42)
T PF13917_consen 4 RVRCQKCGQKGHWTYECP 21 (42)
T ss_pred CCcCcccCCCCcchhhCC
Confidence 468999999999999999
No 34
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=57.68 E-value=5.2 Score=36.33 Aligned_cols=16 Identities=31% Similarity=0.812 Sum_probs=14.5
Q ss_pred EEcCccCCCCCCCCCC
Q 042031 546 VQCGRCHLLGHSQKKC 561 (565)
Q Consensus 546 ~~C~~C~~~gHn~~tC 561 (565)
.+|.+||+.||-++-|
T Consensus 98 ~~C~~Cg~~GH~~~dC 113 (190)
T COG5082 98 KKCYNCGETGHLSRDC 113 (190)
T ss_pred cccccccccCcccccc
Confidence 5899999999999999
No 35
>PF13592 HTH_33: Winged helix-turn helix
Probab=56.37 E-value=20 Score=25.99 Aligned_cols=30 Identities=27% Similarity=0.280 Sum_probs=25.8
Q ss_pred CCCCCHHHHHHHHHHHcCcccCHHHHHHHH
Q 042031 119 NPQYKPKEILQDIRDQHGVAVSYMQAWRGK 148 (565)
Q Consensus 119 ~~~~~~~~i~~~~~~~~g~~~s~~~~~r~~ 148 (565)
+..++.++|...|.+.||+.+|.+.+|+.-
T Consensus 2 ~~~wt~~~i~~~I~~~fgv~ys~~~v~~lL 31 (60)
T PF13592_consen 2 GGRWTLKEIAAYIEEEFGVKYSPSGVYRLL 31 (60)
T ss_pred CCcccHHHHHHHHHHHHCCEEcHHHHHHHH
Confidence 355788999999999999999999998753
No 36
>COG4715 Uncharacterized conserved protein [Function unknown]
Probab=55.58 E-value=35 Score=36.40 Aligned_cols=51 Identities=22% Similarity=0.348 Sum_probs=31.0
Q ss_pred eEEEeeCCeeEEEEecCceE--EEEc----cCceeccCccccCCCCchhHHHHHHhcCC
Q 042031 415 YQVLRANEIEFEIVSTERTN--IVDI----RSRVCSCRRWQLYGLPCAHAAAALLSCGQ 467 (565)
Q Consensus 415 ~~v~~~~~~~~~V~~~~~~~--~V~l----~~~~CsC~~~~~~GiPC~H~lav~~~~~~ 467 (565)
..+..-++..--|+.+++.| .|++ .+..|||.. ...| -|.|++||+....-
T Consensus 41 ~~i~~~g~~v~A~V~Gs~~y~v~vtL~~~~~ss~CTCP~-~~~g-aCKH~VAvvl~~~~ 97 (587)
T COG4715 41 LKITIRGGTVRAVVEGSRRYRVRVTLEGGALSSICTCPY-GGSG-ACKHVVAVVLEYLD 97 (587)
T ss_pred eEEeecCCeEEEEEeccceeeEEEEeecCCcCceeeCCC-CCCc-chHHHHHHHHHHhh
Confidence 34443333333334455554 4455 346999997 5555 59999999888644
No 37
>PF04937 DUF659: Protein of unknown function (DUF 659); InterPro: IPR007021 These are transposase-like proteins with no known function.
Probab=52.69 E-value=48 Score=29.36 Aligned_cols=62 Identities=11% Similarity=0.244 Sum_probs=42.7
Q ss_pred ccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHHHh---hhCCCCcchhhHHHHHHHHHhhcCc
Q 042031 225 SDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEAVE---THFPSAFHCFCLRYVSENFRDTFKN 290 (565)
Q Consensus 225 ~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~---~vfP~~~h~~C~~Hi~~n~~~~~~~ 290 (565)
+.+...-+|....+.+| .....-||||....+.+|-+ +-+|.....-|..|-+.-+.+.+..
T Consensus 74 ~a~~l~~ll~~vIeeVG----~~nVvqVVTDn~~~~~~a~~~L~~k~p~ifw~~CaaH~inLmledi~k 138 (153)
T PF04937_consen 74 TAEYLFELLDEVIEEVG----EENVVQVVTDNASNMKKAGKLLMEKYPHIFWTPCAAHCINLMLEDIGK 138 (153)
T ss_pred cHHHHHHHHHHHHHHhh----hhhhhHHhccCchhHHHHHHHHHhcCCCEEEechHHHHHHHHHHHHhc
Confidence 34444444444445555 44566789999999888744 4488888889999988877666543
No 38
>PF04800 ETC_C1_NDUFA4: ETC complex I subunit conserved region; InterPro: IPR006885 This entry represents prokaryotic NADH-ubiquinone oxidoreductase subunits (1.6.5.3 from EC, 1.6.99.3 from EC) from complex I of the electron transport chain initially identified in Neurospora crassa as a 21 kDa protein [].; GO: 0016651 oxidoreductase activity, acting on NADH or NADPH, 0022900 electron transport chain, 0005743 mitochondrial inner membrane; PDB: 2JYA_A 2LJU_A.
Probab=51.45 E-value=27 Score=28.49 Aligned_cols=45 Identities=16% Similarity=0.158 Sum_probs=27.9
Q ss_pred cccCCcCCCCCCCcc--ccCeeCCHHHHHHHHHHHHHhccceEEEEeecCe
Q 042031 7 TVPNDSLSLAEPTLS--IGQEFPDVETCRRTLKDIAIALHFDLRIVKSDRS 55 (565)
Q Consensus 7 ~~~~~~~~~~~~~l~--~G~~F~s~~e~~~a~~~ya~~~gf~~~~~kS~~~ 55 (565)
.+||-+-..+++.+. +.+.|+|+|+|. .||.++|..|.|..-...
T Consensus 34 ~~PLMGWtss~D~~~q~v~l~F~skE~Ai----~yaer~G~~Y~V~~p~~r 80 (101)
T PF04800_consen 34 ENPLMGWTSSGDPLSQSVRLKFDSKEDAI----AYAERNGWDYEVEEPKKR 80 (101)
T ss_dssp --TTT-SSSS--SEEE-CEEEESSHHHHH----HHHHHCT-EEEEE-STT-
T ss_pred CCCccCCCCCCChhhCeeEeeeCCHHHHH----HHHHHcCCeEEEeCCCCC
Confidence 345555444444553 899999999985 579999999999865443
No 39
>COG4830 RPS26B Ribosomal protein S26 [Translation, ribosomal structure and biogenesis]
Probab=50.95 E-value=10 Score=30.06 Aligned_cols=27 Identities=44% Similarity=0.724 Sum_probs=20.6
Q ss_pred CCCCCCCCCCCcccccccCCCCCCceEEcCccCC
Q 042031 520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHL 553 (565)
Q Consensus 520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~ 553 (565)
|..++..||.|+.|-.. .-.+|-+||.
T Consensus 2 pkkR~N~GR~K~~rGhv-------~~v~CdnCg~ 28 (108)
T COG4830 2 PKKRRNRGRNKKGRGHV-------KYVRCDNCGK 28 (108)
T ss_pred cchhhhcCCCCCCCCCc-------cceeeccccc
Confidence 67788889988866433 4579999997
No 40
>PHA00689 hypothetical protein
Probab=50.58 E-value=10 Score=25.79 Aligned_cols=12 Identities=50% Similarity=0.991 Sum_probs=10.2
Q ss_pred ceEEcCccCCCC
Q 042031 544 RIVQCGRCHLLG 555 (565)
Q Consensus 544 ~~~~C~~C~~~g 555 (565)
+..+|++||..|
T Consensus 16 ravtckrcgktg 27 (62)
T PHA00689 16 RAVTCKRCGKTG 27 (62)
T ss_pred ceeehhhccccC
Confidence 678999999876
No 41
>PF01283 Ribosomal_S26e: Ribosomal protein S26e; InterPro: IPR000892 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. One of these families, the S26E family, includes mammalian S26; Octopus S26; Drosophila S26 (DS31); plant cytoplasmic S26; and fungal S26. These proteins have 114 to 127 amino acids.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005622 intracellular, 0005840 ribosome; PDB: 3U5G_a 3U5C_a 2XZM_5 2XZN_5.
Probab=49.86 E-value=12 Score=31.02 Aligned_cols=27 Identities=37% Similarity=0.545 Sum_probs=13.1
Q ss_pred CCCCCCCCCCCcccccccCCCCCCceEEcCccCC
Q 042031 520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHL 553 (565)
Q Consensus 520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~ 553 (565)
|..++.-||-|+.|-. -..++|.+|+.
T Consensus 2 ~~KRrN~Gr~KkgrGh-------v~~V~C~nCgr 28 (113)
T PF01283_consen 2 TKKRRNNGRSKKGRGH-------VQPVRCDNCGR 28 (113)
T ss_dssp ----TTTTSS-SSSS----------EEE-TTTB-
T ss_pred CcccccCCCCCCCCCC-------CcCEeeCcccc
Confidence 4567777777765533 25789999986
No 42
>PF08766 DEK_C: DEK C terminal domain; InterPro: IPR014876 DEK is a chromatin-associated protein that is linked with cancers and autoimmune disease. This domain is found at the C terminus of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients. The structure of this domain shows it to be homologous to the E2F/DP transcription factor family. This domain is also found in chitin synthase proteins like Q8TF96 from SWISSPROT, and in protein phosphatases such as Q6NN85 from SWISSPROT.; PDB: 1Q1V_A.
Probab=46.48 E-value=52 Score=23.25 Aligned_cols=38 Identities=13% Similarity=0.333 Sum_probs=24.5
Q ss_pred hHHHHHHHHHhc-C-CCCCHHHHHHHHHHHcCcccCHHHH
Q 042031 107 WVARSVEARIRD-N-PQYKPKEILQDIRDQHGVAVSYMQA 144 (565)
Q Consensus 107 ~i~~~~~~~l~~-~-~~~~~~~i~~~~~~~~g~~~s~~~~ 144 (565)
-+...+.+.|+. + .+++.++|.+.|.+.+|++++..+.
T Consensus 4 ~i~~~i~~iL~~~dl~~vT~k~vr~~Le~~~~~dL~~~K~ 43 (54)
T PF08766_consen 4 EIREAIREILREADLDTVTKKQVREQLEERFGVDLSSRKK 43 (54)
T ss_dssp HHHHHHHHHHTTS-GGG--HHHHHHHHHHH-SS--SHHHH
T ss_pred HHHHHHHHHHHhCCHhHhhHHHHHHHHHHHHCCCcHHHHH
Confidence 455566777775 2 5678999999999999999996543
No 43
>PF01498 HTH_Tnp_Tc3_2: Transposase; InterPro: IPR002492 Transposase proteins are necessary for efficient DNA transposition. This family includes the amino-terminal region of Tc1, Tc1A, Tc1B and Tc2B transposases of Caenorhabditis elegans. The region encompasses the specific DNA binding and second DNA recognition domains as well as an amino-terminal region of the catalytic domain of Tc3. Tc3 is a member of the Tc1/mariner family of transposable elements. This entry also includes histone-lysine N-methyltransferase SETMAR, which is a SET domain and mariner transposase fusion gene-containing protein. This histone methyltransferase has sequence-specific DNA-binding activity and recognises the 19-mer core of the 5'-terminal inverted repeats (TIRs) of the Hsmar1 element. This protein has DNA nicking activity, and has in vivo end joining activity and may mediate genomic integration of foreign DNA. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated, 0015074 DNA integration; PDB: 3K9K_B 3F2K_B 3K9J_B 1U78_A.
Probab=42.97 E-value=20 Score=26.93 Aligned_cols=36 Identities=28% Similarity=0.486 Sum_probs=16.0
Q ss_pred HHHHHhcCCCCCHHHHHHHHHHHcCcccCHHHHHHHH
Q 042031 112 VEARIRDNPQYKPKEILQDIRDQHGVAVSYMQAWRGK 148 (565)
Q Consensus 112 ~~~~l~~~~~~~~~~i~~~~~~~~g~~~s~~~~~r~~ 148 (565)
+...++.+|..+..+|...+.+. |..+|..+++|.-
T Consensus 4 I~~~v~~~p~~s~~~i~~~l~~~-~~~vS~~TI~r~L 39 (72)
T PF01498_consen 4 IVRMVRRNPRISAREIAQELQEA-GISVSKSTIRRRL 39 (72)
T ss_dssp ------------HHHHHHHT----T--S-HHHHHHHH
T ss_pred HHHHHHHCCCCCHHHHHHHHHHc-cCCcCHHHHHHHH
Confidence 34566778999999999999888 9999999998763
No 44
>PF00665 rve: Integrase core domain; InterPro: IPR001584 Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus (ASV) have been studied most carefully with respect to the structural basis of catalysis. Although the active site of ASV integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis []. Retroviral integrase is synthesised as part of the POL polyprotein that contains; an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group. HIV integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS []. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration. 
To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity [].; GO: 0015074 DNA integration; PDB: 3AO3_A 3OVN_A 3AO5_A 3AO4_A 3AO1_A 1C6V_D 3HPG_A 3HPH_A 3OYD_A 3OYF_B ....
Probab=40.20 E-value=36 Score=28.09 Aligned_cols=54 Identities=15% Similarity=0.144 Sum_probs=37.9
Q ss_pred cccCCCCceeEeEEEEeeccccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHH
Q 042031 205 MGVDAEDALFPLAIAIVDVESDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEA 263 (565)
Q Consensus 205 ~g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~A 263 (565)
+.+|..- -+.+++.+-..++.+.+.-+|.......+ ...|.+|+||+..+..+.
T Consensus 29 ~~iD~~S-~~~~~~~~~~~~~~~~~~~~l~~~~~~~~----~~~p~~i~tD~g~~f~~~ 82 (120)
T PF00665_consen 29 VFIDDYS-RFIYAFPVSSKETAEAALRALKRAIEKRG----GRPPRVIRTDNGSEFTSH 82 (120)
T ss_dssp EEEETTT-TEEEEEEESSSSHHHHHHHHHHHHHHHHS-----SE-SEEEEESCHHHHSH
T ss_pred EEEECCC-CcEEEEEeecccccccccccccccccccc----cccceecccccccccccc
Confidence 5555543 44557777777788888888887777766 333899999999987643
No 45
>PF09713 A_thal_3526: Plant protein 1589 of unknown function (A_thal_3526); InterPro: IPR006476 This plant-specific family of proteins is defined by an uncharacterised region 57 residues in length. It is found toward the N terminus of most proteins that contain it. Examples include several proteins from Arabidopsis thaliana (Mouse-ear cress) and Oryza sativa (Rice). The function of these proteins is unknown.
Probab=36.84 E-value=32 Score=24.41 Aligned_cols=26 Identities=12% Similarity=0.411 Sum_probs=18.8
Q ss_pred CCCHHHHHHHHHHHcCcccCHH-HHHH
Q 042031 121 QYKPKEILQDIRDQHGVAVSYM-QAWR 146 (565)
Q Consensus 121 ~~~~~~i~~~~~~~~g~~~s~~-~~~r 146 (565)
.++..++++.|.++.|+.+... .+|+
T Consensus 12 yMsk~E~v~~L~~~a~I~P~~T~~VW~ 38 (54)
T PF09713_consen 12 YMSKEECVRALQKQANIEPVFTSTVWQ 38 (54)
T ss_pred cCCHHHHHHHHHHHcCCChHHHHHHHH
Confidence 3567889999988888876554 4553
No 46
>COG4279 Uncharacterized conserved protein [Function unknown]
Probab=33.97 E-value=26 Score=33.29 Aligned_cols=24 Identities=42% Similarity=0.751 Sum_probs=18.8
Q ss_pred CceeccCccccCCCCchhHHHHHHhcC
Q 042031 440 SRVCSCRRWQLYGLPCAHAAAALLSCG 466 (565)
Q Consensus 440 ~~~CsC~~~~~~GiPC~H~lav~~~~~ 466 (565)
...|||.. .-.||.|+.||..+..
T Consensus 124 ~~dCSCPD---~anPCKHi~AvyY~la 147 (266)
T COG4279 124 STDCSCPD---YANPCKHIAAVYYLLA 147 (266)
T ss_pred ccccCCCC---cccchHHHHHHHHHHH
Confidence 34699986 4579999999987743
No 47
>PF05741 zf-nanos: Nanos RNA binding domain; InterPro: IPR024161 Nanos is a highly conserved RNA-binding protein in higher eukaryotes and functions as a key regulatory protein in translational control using a 3' untranslated region during the development and maintenance of germ cells. Nanos comprises a non-conserved amino-terminus and highly conserved carboxy- terminal regions. The C-terminal region has two conserved Cys-Cys-His-Cys (CCHC)-type zinc-finger motifs that are indispensable for nanos function [, , ]. The structure of the nanos-type zinc finger is composed of two independent zinc-finger (ZF) lobes, the N-terminal ZF1 and the C-terminal ZF2, which are connected by a linker helix []. These lobes create a large cleft. Zinc ions in ZF1 and ZF2 are bound to the CCHC motif by tetrahedral coordination.; PDB: 3ALR_B.
Probab=33.33 E-value=13 Score=26.45 Aligned_cols=20 Identities=30% Similarity=0.514 Sum_probs=8.5
Q ss_pred ceEEcCccCC---CCCCCCCCCC
Q 042031 544 RIVQCGRCHL---LGHSQKKCTM 563 (565)
Q Consensus 544 ~~~~C~~C~~---~gHn~~tC~~ 563 (565)
+.+.|..||. .+|+.+-||+
T Consensus 32 r~y~Cp~CgAtGd~AHT~~yCP~ 54 (55)
T PF05741_consen 32 RKYVCPICGATGDNAHTIKYCPK 54 (55)
T ss_dssp GG---TTT---GGG---GGG-TT
T ss_pred hcCcCCCCcCcCccccccccCcC
Confidence 5689999998 5678888886
No 48
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=32.48 E-value=22 Score=32.45 Aligned_cols=18 Identities=28% Similarity=0.608 Sum_probs=16.4
Q ss_pred eEEcCccCCCCCCCCCCC
Q 042031 545 IVQCGRCHLLGHSQKKCT 562 (565)
Q Consensus 545 ~~~C~~C~~~gHn~~tC~ 562 (565)
...|-+||+.||-++.||
T Consensus 60 ~~~C~nCg~~GH~~~DCP 77 (190)
T COG5082 60 NPVCFNCGQNGHLRRDCP 77 (190)
T ss_pred ccccchhcccCcccccCC
Confidence 468999999999999999
No 49
>PRK14892 putative transcription elongation factor Elf1; Provisional
Probab=32.42 E-value=35 Score=27.75 Aligned_cols=9 Identities=44% Similarity=1.228 Sum_probs=6.1
Q ss_pred ceEEcCccC
Q 042031 544 RIVQCGRCH 552 (565)
Q Consensus 544 ~~~~C~~C~ 552 (565)
....|.+||
T Consensus 20 t~f~CP~Cg 28 (99)
T PRK14892 20 KIFECPRCG 28 (99)
T ss_pred cEeECCCCC
Confidence 456777777
No 50
>PRK12286 rpmF 50S ribosomal protein L32; Reviewed
Probab=29.64 E-value=54 Score=23.64 Aligned_cols=32 Identities=22% Similarity=0.435 Sum_probs=20.0
Q ss_pred CCCCCCCCCCCcccccccCCCCCCceEEcCccCC
Q 042031 520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHL 553 (565)
Q Consensus 520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~ 553 (565)
|+.+.++.|..++|-. ..........|+.||.
T Consensus 4 PKrk~S~srr~~RRsh--~~l~~~~l~~C~~CG~ 35 (57)
T PRK12286 4 PKRKTSKSRKRKRRAH--FKLKAPGLVECPNCGE 35 (57)
T ss_pred CcCcCChhhcchhccc--ccccCCcceECCCCCC
Confidence 6666777776666543 2223334678999988
No 51
>PF10045 DUF2280: Uncharacterized conserved protein (DUF2280); InterPro: IPR018738 This entry is represented by Burkholderia phage Bups phi1, Orf2.36. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches.
Probab=28.93 E-value=61 Score=26.33 Aligned_cols=24 Identities=25% Similarity=0.491 Sum_probs=21.2
Q ss_pred CHHHHHHHHHHHcCcccCHHHHHH
Q 042031 123 KPKEILQDIRDQHGVAVSYMQAWR 146 (565)
Q Consensus 123 ~~~~i~~~~~~~~g~~~s~~~~~r 146 (565)
+|.++.+.+++.||+.+|..++-.
T Consensus 21 TPs~v~~aVk~eFgi~vsrQqve~ 44 (104)
T PF10045_consen 21 TPSEVAEAVKEEFGIDVSRQQVES 44 (104)
T ss_pred CHHHHHHHHHHHhCCccCHHHHHH
Confidence 799999999999999999887643
No 52
>PF13276 HTH_21: HTH-like domain
Probab=28.26 E-value=1.5e+02 Score=21.11 Aligned_cols=42 Identities=19% Similarity=0.374 Sum_probs=32.4
Q ss_pred HHHHHHHHHhc-CCCCCHHHHHHHHHHHcCcccCHHHHHHHHH
Q 042031 108 VARSVEARIRD-NPQYKPKEILQDIRDQHGVAVSYMQAWRGKE 149 (565)
Q Consensus 108 i~~~~~~~l~~-~~~~~~~~i~~~~~~~~g~~~s~~~~~r~~~ 149 (565)
+...+.+.... .+.+....|...|..+.|+.+|..+++|..+
T Consensus 6 l~~~I~~i~~~~~~~yG~rri~~~L~~~~~~~v~~krV~RlM~ 48 (60)
T PF13276_consen 6 LRELIKEIFKESKPTYGYRRIWAELRREGGIRVSRKRVRRLMR 48 (60)
T ss_pred HHHHHHHHHHHcCCCeehhHHHHHHhccCcccccHHHHHHHHH
Confidence 34445555555 4788999999999999889999999988754
No 53
>PF13877 RPAP3_C: Potential Monad-binding region of RPAP3
Probab=26.86 E-value=47 Score=26.48 Aligned_cols=33 Identities=15% Similarity=0.343 Sum_probs=26.9
Q ss_pred hcHHHHHHHHHHHHhcccchhhHhhhCCCCCcc
Q 042031 304 LTTVEFEAKISEMVEISQDVIPWFQQFPPQLWA 336 (565)
Q Consensus 304 ~t~~eF~~~~~~l~~~~~~~~~~l~~~~~~~W~ 336 (565)
.|..||+..|..+.......++||..++++...
T Consensus 5 ~~~~eF~~~w~~~~~~~~~~~~yL~~i~p~~l~ 37 (94)
T PF13877_consen 5 KNSYEFERDWRRLKKDPEERYEYLKSIPPDSLP 37 (94)
T ss_pred CCHHHHHHHHHHHcCCHHHHHHHHHhCChHHHH
Confidence 467899999999987767899999998766543
No 54
>COG5222 Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]
Probab=26.20 E-value=43 Score=32.50 Aligned_cols=25 Identities=32% Similarity=0.530 Sum_probs=20.5
Q ss_pred CCCCceEEcCccCCCCCCCCCCCCC
Q 042031 540 KRPKRIVQCGRCHLLGHSQKKCTMP 564 (565)
Q Consensus 540 ~~~~~~~~C~~C~~~gHn~~tC~~~ 564 (565)
+.+-..+-|=+||+.||-...||..
T Consensus 171 kppPpgY~CyRCGqkgHwIqnCpTN 195 (427)
T COG5222 171 KPPPPGYVCYRCGQKGHWIQNCPTN 195 (427)
T ss_pred CCCCCceeEEecCCCCchhhcCCCC
Confidence 3344569999999999999999863
No 55
>PF05634 APO_RNA-bind: APO RNA-binding; InterPro: IPR008512 This family consists of plant APO (accumulation of photosystem 1) proteins.
Probab=25.76 E-value=45 Score=30.67 Aligned_cols=20 Identities=30% Similarity=0.775 Sum_probs=17.0
Q ss_pred ceEEcCccCC-----CCCCCCCCCC
Q 042031 544 RIVQCGRCHL-----LGHSQKKCTM 563 (565)
Q Consensus 544 ~~~~C~~C~~-----~gHn~~tC~~ 563 (565)
.+..|+.|.+ .||..+||..
T Consensus 97 pV~~C~~C~EVHVG~~GH~irtC~g 121 (204)
T PF05634_consen 97 PVKACGYCPEVHVGPVGHKIRTCGG 121 (204)
T ss_pred eeeecCCCCCeEECCCcccccccCC
Confidence 3678999977 8999999964
No 56
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=25.75 E-value=43 Score=29.40 Aligned_cols=19 Identities=26% Similarity=0.642 Sum_probs=15.6
Q ss_pred EEcCccCCCCCCCCCCCCC
Q 042031 546 VQCGRCHLLGHSQKKCTMP 564 (565)
Q Consensus 546 ~~C~~C~~~gHn~~tC~~~ 564 (565)
..|.+|++.||..+.||.+
T Consensus 78 ~~C~~Cg~~GH~~~~C~~~ 96 (148)
T PTZ00368 78 RSCYNCGQTGHISRECPNR 96 (148)
T ss_pred cccCcCCCCCcccccCCCc
Confidence 5688888888888888764
No 57
>TIGR01031 rpmF_bact ribosomal protein L32. This protein describes bacterial ribosomal protein L32. The noise cutoff is set low enough to include the equivalent protein from mitochondria and chloroplasts. No related proteins from the Archaea nor from the eukaryotic cytosol are detected by this model. This model is a fragment model; the putative L32 of some species shows similarity only toward the N-terminus.
Probab=25.26 E-value=79 Score=22.58 Aligned_cols=41 Identities=20% Similarity=0.398 Sum_probs=22.1
Q ss_pred CCCCCCCCCCCcccccccCCCCCCceEEcCccCCCCCCCCCC
Q 042031 520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHLLGHSQKKC 561 (565)
Q Consensus 520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~~gHn~~tC 561 (565)
|..+.++-|..++|-... .........|+.||+.--.=+-|
T Consensus 2 PKrk~Sksr~~~RRah~~-kl~~p~l~~C~~cG~~~~~H~vc 42 (55)
T TIGR01031 2 PKRKTSKSRKRKRRSHDA-KLTAPTLVVCPNCGEFKLPHRVC 42 (55)
T ss_pred CCCcCCcccccchhcCcc-cccCCcceECCCCCCcccCeeEC
Confidence 555566666655554321 12233567899999843333334
No 58
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=24.81 E-value=41 Score=29.48 Aligned_cols=20 Identities=25% Similarity=0.576 Sum_probs=17.9
Q ss_pred eEEcCccCCCCCCCCCCCCC
Q 042031 545 IVQCGRCHLLGHSQKKCTMP 564 (565)
Q Consensus 545 ~~~C~~C~~~gHn~~tC~~~ 564 (565)
...|.+|++.||-.+.||.+
T Consensus 52 ~~~C~~Cg~~GH~~~~Cp~~ 71 (148)
T PTZ00368 52 ERSCYNCGKTGHLSRECPEA 71 (148)
T ss_pred CcccCCCCCcCcCcccCCCc
Confidence 35799999999999999975
No 59
>TIGR01589 A_thal_3526 uncharacterized plant-specific domain TIGR01589. This model represents an uncharacterized plant-specific domain 57 residues in length. It is found toward the N-terminus of most proteins that contain it. Examples include at least 10 proteins from Arabidopsis thaliana and at least one from Oryza sativa.
Probab=24.69 E-value=64 Score=23.21 Aligned_cols=26 Identities=12% Similarity=0.287 Sum_probs=20.2
Q ss_pred CCHHHHHHHHHHHcCcccCHH-HHHHH
Q 042031 122 YKPKEILQDIRDQHGVAVSYM-QAWRG 147 (565)
Q Consensus 122 ~~~~~i~~~~~~~~g~~~s~~-~~~r~ 147 (565)
++..++++.+.++.|+.+... .+|+.
T Consensus 16 Msk~E~v~~L~~~a~I~P~~T~~VW~~ 42 (57)
T TIGR01589 16 MSKEETVSFLFENAGISPKFTRFVWYL 42 (57)
T ss_pred CCHHHHHHHHHHHcCCCchhHHHHHHH
Confidence 567899999999999987664 56754
No 60
>PF14201 DUF4318: Domain of unknown function (DUF4318)
Probab=24.53 E-value=1.3e+02 Score=22.93 Aligned_cols=32 Identities=16% Similarity=0.314 Sum_probs=27.0
Q ss_pred cCeeCCHHHHHHHHHHHHHhccceEEEEeecC
Q 042031 23 GQEFPDVETCRRTLKDIAIALHFDLRIVKSDR 54 (565)
Q Consensus 23 G~~F~s~~e~~~a~~~ya~~~gf~~~~~kS~~ 54 (565)
-..++|.+++-.++..|+.+++-.+...+-+.
T Consensus 11 ~~~yPs~e~i~~aIE~YC~~~~~~l~Fisr~~ 42 (74)
T PF14201_consen 11 SPKYPSKEEICEAIEKYCIKNGESLEFISRDK 42 (74)
T ss_pred CCCCCCHHHHHHHHHHHHHHcCCceEEEecCC
Confidence 34588999999999999999999998875443
No 61
>PF13551 HTH_29: Winged helix-turn helix
Probab=24.04 E-value=1.8e+02 Score=23.49 Aligned_cols=37 Identities=22% Similarity=0.369 Sum_probs=28.9
Q ss_pred HHHHHHhcCC-----CCCHHHHHHHH-HHHcCcccCHHHHHHH
Q 042031 111 SVEARIRDNP-----QYKPKEILQDI-RDQHGVAVSYMQAWRG 147 (565)
Q Consensus 111 ~~~~~l~~~~-----~~~~~~i~~~~-~~~~g~~~s~~~~~r~ 147 (565)
.+.+.+..+| .+++..|.+.+ ++.+|+.+|.+++++.
T Consensus 65 ~l~~~~~~~p~~g~~~~t~~~l~~~l~~~~~~~~~s~~ti~r~ 107 (112)
T PF13551_consen 65 QLIELLRENPPEGRSRWTLEELAEWLIEEEFGIDVSPSTIRRI 107 (112)
T ss_pred HHHHHHHHCCCCCCCcccHHHHHHHHHHhccCccCCHHHHHHH
Confidence 4555666655 47889999876 8999999999999875
No 62
>PF14952 zf-tcix: Putative treble-clef, zinc-finger, Zn-binding
Probab=23.67 E-value=50 Score=22.20 Aligned_cols=21 Identities=24% Similarity=0.521 Sum_probs=15.9
Q ss_pred ceEEcCccCC-CCCCCCCCCCC
Q 042031 544 RIVQCGRCHL-LGHSQKKCTMP 564 (565)
Q Consensus 544 ~~~~C~~C~~-~gHn~~tC~~~ 564 (565)
..++|..||- -|+-.-.|++.
T Consensus 10 GirkCp~CGt~NG~R~~~CKN~ 31 (44)
T PF14952_consen 10 GIRKCPKCGTYNGTRGLSCKNK 31 (44)
T ss_pred ccccCCcCcCccCcccccccCC
Confidence 4679999999 66666677764
No 63
>KOG2044 consensus 5'-3' exonuclease HKE1/RAT1 [Replication, recombination and repair; RNA processing and modification]
Probab=23.51 E-value=32 Score=38.20 Aligned_cols=20 Identities=30% Similarity=0.586 Sum_probs=17.1
Q ss_pred ceEEcCccCCCCCCCCCCCC
Q 042031 544 RIVQCGRCHLLGHSQKKCTM 563 (565)
Q Consensus 544 ~~~~C~~C~~~gHn~~tC~~ 563 (565)
+.++|-.||+.||+...|..
T Consensus 259 ~~~~C~~cgq~gh~~~dc~g 278 (931)
T KOG2044|consen 259 KPRRCFLCGQTGHEAKDCEG 278 (931)
T ss_pred CcccchhhcccCCcHhhcCC
Confidence 44569999999999999964
No 64
>COG4888 Uncharacterized Zn ribbon-containing protein [General function prediction only]
Probab=22.74 E-value=71 Score=25.80 Aligned_cols=8 Identities=50% Similarity=1.294 Sum_probs=3.7
Q ss_pred EEcCccCC
Q 042031 546 VQCGRCHL 553 (565)
Q Consensus 546 ~~C~~C~~ 553 (565)
..|+.||+
T Consensus 47 ~~Cg~CGl 54 (104)
T COG4888 47 AVCGNCGL 54 (104)
T ss_pred EEcccCcc
Confidence 34444443
No 65
>KOG0341 consensus DEAD-box protein abstrakt [RNA processing and modification]
Probab=21.97 E-value=42 Score=34.17 Aligned_cols=19 Identities=32% Similarity=0.628 Sum_probs=17.2
Q ss_pred eEEcCccCCCCCCCCCCCC
Q 042031 545 IVQCGRCHLLGHSQKKCTM 563 (565)
Q Consensus 545 ~~~C~~C~~~gHn~~tC~~ 563 (565)
..-|-.||+.||....||+
T Consensus 570 ~kGCayCgGLGHRItdCPK 588 (610)
T KOG0341|consen 570 EKGCAYCGGLGHRITDCPK 588 (610)
T ss_pred ccccccccCCCcccccCch
Confidence 4579999999999999996
No 66
>PF01644 Chitin_synth_1: Chitin synthase; InterPro: IPR004834 This region is commonly found in chitin synthases of classes I, II and III (EC 2.4.1.16). Chitin, a linear homopolymer of GlcNAc residues, is an important component of the cell wall of fungi and is synthesised on the cytoplasmic surface of the cell membrane by membrane-bound chitin synthases []. ; GO: 0004100 chitin synthase activity, 0006031 chitin biosynthetic process
Probab=20.88 E-value=2.1e+02 Score=25.65 Aligned_cols=47 Identities=13% Similarity=0.325 Sum_probs=32.6
Q ss_pred ccCCCCceeEeEEEEeec--cccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcc
Q 042031 206 GVDAEDALFPLAIAIVDV--ESDENWMWFMSELRKLLGVNTDNMPRLTILSERQR 258 (565)
Q Consensus 206 g~d~~~~~~~la~a~~~~--E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~ 258 (565)
+.+.+..+..+-|++=+. ..-.|-+|||+.|-+.+. |..|+.-|-..
T Consensus 101 ~~~~~~~PvQ~ifclKe~N~kKinSHrWfFnaf~~~l~------P~vcvllDvGT 149 (163)
T PF01644_consen 101 GPEKNIVPVQIIFCLKEKNAKKINSHRWFFNAFCRQLQ------PNVCVLLDVGT 149 (163)
T ss_pred ccccCCCCEEEEEEeccccccccchhhHHHHHHHhhcC------CcEEEEEecCC
Confidence 334445556666666444 234899999999999987 66888888654
No 67
>PF11645 PDDEXK_5: PD-(D/E)XK endonuclease; InterPro: IPR021671 This family consists of putative endonuclease proteins that are restricted to Synechocystis. ; PDB: 2OST_D.
Probab=20.71 E-value=2.8e+02 Score=24.12 Aligned_cols=45 Identities=18% Similarity=0.321 Sum_probs=34.9
Q ss_pred HHHHHHHHHHHHHhccceEEEEeecCeEEEEEeecCCCccEEEEE
Q 042031 29 VETCRRTLKDIAIALHFDLRIVKSDRSRFIAKCSKEGCPWRVHVA 73 (565)
Q Consensus 29 ~~e~~~a~~~ya~~~gf~~~~~kS~~~r~~~~C~~~~C~~~v~~~ 73 (565)
.+++..++....++.|+.+-+.-.+..+|.++=..+||-|||.+.
T Consensus 6 GDite~~ii~~ll~~GY~V~~P~gDn~~YDLV~d~eg~L~RIQvK 50 (149)
T PF11645_consen 6 GDITEAKIINRLLEKGYSVSIPFGDNLKYDLVFDKEGILWRIQVK 50 (149)
T ss_dssp HHHHHHHHHHHHHHTT-EEEEESSTTSS-SEEEEETTEEEEEEEE
T ss_pred chHHHHHHHHHHHHcCcEEEeecCCCCCcCEEEecCCcEEEEEEe
Confidence 356677888889999999999988887766555578999999875
No 68
>PF01316 Arg_repressor: Arginine repressor, DNA binding domain; InterPro: IPR020900 The arginine dihydrolase (AD) pathway is found in many prokaryotes and some primitive eukaryotes, an example of the latter being Giardia lamblia (Giardia intestinalis) []. The three-enzyme anaerobic pathway breaks down L-arginine to form 1 mol of ATP, carbon dioxide and ammonia. In simpler bacteria, the first enzyme, arginine deiminase, can account for up to 10% of total cell protein []. Most prokaryotic arginine deiminase pathways are under the control of a repressor gene, termed ArgR []. This is a negative regulator, and will only release the arginine deiminase operon for expression in the presence of arginine []. The crystal structure of apo-ArgR from Bacillus stearothermophilus has been determined to 2.5A by means of X-ray crystallography []. The protein exists as a hexamer of identical subunits, and is shown to have six DNA-binding domains, clustered around a central oligomeric core when bound to arginine. It predominantly interacts with A.T residues in ARG boxes. This hexameric protein binds DNA at its N terminus to repress arginine biosynthesis or activate arginine catabolism. Some species have several ArgR paralogs. In a neighbour-joining tree, some of these paralogous sequences show long branches and differ significantly from the well-conserved C-terminal region. ; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0006355 regulation of transcription, DNA-dependent, 0006525 arginine metabolic process; PDB: 1AOY_A 3V4G_A 3LAJ_D 3FHZ_A 3LAP_B 3ERE_D 2P5L_C 1F9N_D 2P5K_A 1B4A_A ....
Probab=20.52 E-value=1.8e+02 Score=21.93 Aligned_cols=36 Identities=14% Similarity=0.288 Sum_probs=24.1
Q ss_pred HHHHHHhcCCCCCHHHHHHHHHHHcCcccCHHHHHHH
Q 042031 111 SVEARIRDNPQYKPKEILQDIRDQHGVAVSYMQAWRG 147 (565)
Q Consensus 111 ~~~~~l~~~~~~~~~~i~~~~~~~~g~~~s~~~~~r~ 147 (565)
.+...|..+.-.+-.+|++.|.+. |+.+|..++.|-
T Consensus 9 ~I~~li~~~~i~sQ~eL~~~L~~~-Gi~vTQaTiSRD 44 (70)
T PF01316_consen 9 LIKELISEHEISSQEELVELLEEE-GIEVTQATISRD 44 (70)
T ss_dssp HHHHHHHHS---SHHHHHHHHHHT-T-T--HHHHHHH
T ss_pred HHHHHHHHCCcCCHHHHHHHHHHc-CCCcchhHHHHH
Confidence 466777777777889999999775 999999998875
No 69
>KOG4602 consensus Nanos and related proteins [General function prediction only]
Probab=20.22 E-value=44 Score=31.72 Aligned_cols=20 Identities=35% Similarity=0.718 Sum_probs=16.1
Q ss_pred ceEEcCccCCCC---CCCCCCCC
Q 042031 544 RIVQCGRCHLLG---HSQKKCTM 563 (565)
Q Consensus 544 ~~~~C~~C~~~g---Hn~~tC~~ 563 (565)
|.+.|..||..| |+++-||.
T Consensus 267 R~YVCPiCGATgDnAHTiKyCPl 289 (318)
T KOG4602|consen 267 RSYVCPICGATGDNAHTIKYCPL 289 (318)
T ss_pred hhhcCccccccCCcccceecccc
Confidence 578999999865 77777885
Done!