Query 007973
Match_columns 583
No_of_seqs 262 out of 1689
Neff 9.4
Searched_HMMs 46136
Date Thu Mar 28 17:48:04 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/007973.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/007973hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PLN03097 FHY3 Protein FAR-RED 100.0 2.9E-76 6.3E-81 639.0 44.4 471 4-499 71-624 (846)
2 PF10551 MULE: MULE transposas 99.9 4E-23 8.6E-28 169.3 8.1 90 206-301 1-93 (93)
3 PF00872 Transposase_mut: Tran 99.8 1E-21 2.2E-26 201.7 5.8 223 106-378 112-350 (381)
4 PF03108 DBD_Tnp_Mut: MuDR fam 99.7 5.6E-17 1.2E-21 123.0 8.8 67 4-70 1-67 (67)
5 COG3328 Transposase and inacti 99.6 1.3E-13 2.7E-18 138.7 17.6 235 106-392 98-347 (379)
6 smart00575 ZnF_PMZ plant mutat 99.0 3.9E-10 8.4E-15 68.4 2.3 27 456-482 1-27 (28)
7 PF08731 AFT: Transcription fa 98.9 5.2E-09 1.1E-13 84.1 8.9 69 13-81 1-111 (111)
8 PF03101 FAR1: FAR1 DNA-bindin 98.7 2.4E-08 5.2E-13 81.0 6.3 61 21-82 1-90 (91)
9 PF04434 SWIM: SWIM zinc finge 98.3 6.9E-07 1.5E-11 59.8 3.0 30 451-480 10-39 (40)
10 PF00098 zf-CCHC: Zinc knuckle 96.3 0.0027 5.9E-08 33.9 1.6 18 557-574 1-18 (18)
11 PF06782 UPF0236: Uncharacteri 95.8 0.41 8.9E-06 51.0 17.0 93 239-334 235-328 (470)
12 PF15288 zf-CCHC_6: Zinc knuck 95.6 0.0064 1.4E-07 39.5 1.3 20 556-575 1-22 (40)
13 PF01610 DDE_Tnp_ISL3: Transpo 95.3 0.033 7.1E-07 54.3 6.0 94 202-305 1-97 (249)
14 PF13610 DDE_Tnp_IS240: DDE do 95.2 0.012 2.6E-07 51.7 2.4 81 199-287 1-81 (140)
15 PF03106 WRKY: WRKY DNA -bindi 94.5 0.083 1.8E-06 38.5 4.7 39 42-80 21-59 (60)
16 PF13696 zf-CCHC_2: Zinc knuck 93.5 0.043 9.4E-07 33.9 1.4 25 554-578 6-30 (32)
17 PF04500 FLYWCH: FLYWCH zinc f 91.2 0.47 1E-05 34.6 4.8 46 30-79 14-62 (62)
18 PF04684 BAF1_ABF1: BAF1 / ABF 91.1 0.49 1.1E-05 48.4 6.2 55 9-63 24-79 (496)
19 PF03050 DDE_Tnp_IS66: Transpo 90.5 0.96 2.1E-05 44.6 7.8 134 106-306 18-156 (271)
20 smart00774 WRKY DNA binding do 90.3 0.44 9.5E-06 34.4 3.7 38 42-79 21-59 (59)
21 COG3316 Transposase and inacti 90.0 5.1 0.00011 37.3 11.3 127 109-290 26-152 (215)
22 PF00665 rve: Integrase core d 89.4 1.2 2.7E-05 37.3 6.7 75 199-278 6-81 (120)
23 PHA02517 putative transposase 88.9 1.8 4E-05 42.7 8.4 152 94-277 30-182 (277)
24 PF14392 zf-CCHC_4: Zinc knuck 88.1 0.2 4.3E-06 34.9 0.6 21 554-574 29-49 (49)
25 PF13565 HTH_32: Homeodomain-l 85.7 2.2 4.9E-05 32.7 5.5 42 93-134 33-76 (77)
26 smart00343 ZnF_C2HC zinc finge 85.4 0.43 9.3E-06 28.1 1.0 19 558-576 1-19 (26)
27 PF02178 AT_hook: AT hook moti 83.3 0.51 1.1E-05 22.9 0.5 9 535-543 2-10 (13)
28 PRK14702 insertion element IS2 82.9 32 0.0007 33.6 13.5 147 93-276 11-164 (262)
29 PF04937 DUF659: Protein of un 81.6 15 0.00033 32.6 9.7 91 213-307 45-139 (153)
30 COG5431 Uncharacterized metal- 79.5 4.8 0.0001 32.3 5.0 37 440-478 35-77 (117)
31 smart00384 AT_hook DNA binding 73.3 2.1 4.6E-05 24.9 1.1 14 534-547 1-14 (26)
32 PRK09335 30S ribosomal protein 72.2 3.1 6.6E-05 33.0 2.2 27 531-564 2-28 (95)
33 PLN00186 ribosomal protein S26 67.4 4.4 9.6E-05 32.9 2.2 27 531-564 2-28 (109)
34 PTZ00172 40S ribosomal protein 66.6 4.6 9.9E-05 32.8 2.2 27 531-564 2-28 (108)
35 PF12762 DDE_Tnp_IS1595: ISXO2 65.8 15 0.00033 32.3 5.7 69 200-277 4-87 (151)
36 PRK09409 IS2 transposase TnpB; 63.6 38 0.00083 33.9 8.7 147 94-276 51-203 (301)
37 COG4279 Uncharacterized conser 61.1 4.7 0.0001 38.1 1.5 24 455-481 124-147 (266)
38 COG4830 RPS26B Ribosomal prote 60.6 5.9 0.00013 31.4 1.7 44 531-581 2-46 (108)
39 COG5082 AIR1 Arginine methyltr 59.3 4.9 0.00011 36.5 1.3 20 455-474 60-80 (190)
40 PF13917 zf-CCHC_3: Zinc knuck 58.9 5.9 0.00013 26.5 1.3 20 555-574 3-22 (42)
41 PF13592 HTH_33: Winged helix- 58.0 18 0.00039 26.2 3.9 31 107-137 3-33 (60)
42 COG5179 TAF1 Transcription ini 55.8 7.4 0.00016 41.5 2.0 25 550-574 931-957 (968)
43 PF01283 Ribosomal_S26e: Ribos 48.8 11 0.00025 31.1 1.7 27 531-564 2-28 (113)
44 PRK13907 rnhA ribonuclease H; 45.9 1.1E+02 0.0024 25.8 7.7 78 201-284 3-81 (128)
45 PF01498 HTH_Tnp_Tc3_2: Transp 44.9 22 0.00047 26.8 2.7 38 99-137 4-41 (72)
46 PHA00689 hypothetical protein 44.4 13 0.00027 25.3 1.1 15 552-566 13-27 (62)
47 PF08069 Ribosomal_S13_N: Ribo 41.2 30 0.00066 25.1 2.7 34 91-124 22-60 (60)
48 PF13276 HTH_21: HTH-like doma 39.9 75 0.0016 22.8 4.8 43 95-137 6-49 (60)
49 PRK00766 hypothetical protein; 39.0 1.8E+02 0.0038 27.0 8.0 91 200-290 10-128 (194)
50 PF08766 DEK_C: DEK C terminal 37.6 76 0.0017 22.4 4.4 38 94-131 4-43 (54)
51 PF13082 DUF3931: Protein of u 37.4 92 0.002 21.5 4.4 28 213-240 35-62 (66)
52 PRK14892 putative transcriptio 35.0 28 0.0006 28.3 1.9 9 555-563 20-28 (99)
53 PRK12286 rpmF 50S ribosomal pr 34.7 37 0.0008 24.4 2.3 32 531-564 4-35 (57)
54 PF11645 PDDEXK_5: PD-(D/E)XK 34.0 1.4E+02 0.003 26.0 6.0 54 17-70 7-63 (149)
55 PF04800 ETC_C1_NDUFA4: ETC co 33.7 53 0.0012 26.8 3.4 29 9-41 51-79 (101)
56 COG4715 Uncharacterized conser 32.8 1.4E+02 0.0031 32.1 7.1 26 455-482 72-97 (587)
57 PF14420 Clr5: Clr5 domain 32.3 1.3E+02 0.0027 21.3 4.8 26 106-131 18-43 (54)
58 COG0626 MetC Cystathionine bet 31.7 6.2E+02 0.013 26.4 11.7 70 187-257 167-243 (396)
59 PF13877 RPAP3_C: Potential Mo 30.3 50 0.0011 26.3 2.8 33 320-352 5-37 (94)
60 PF09713 A_thal_3526: Plant pr 30.3 51 0.0011 23.4 2.4 22 108-129 12-33 (54)
61 PF14201 DUF4318: Domain of un 30.2 96 0.0021 23.7 4.0 31 10-40 11-41 (74)
62 PF08459 UvrC_HhH_N: UvrC Heli 29.9 92 0.002 27.7 4.5 67 201-285 32-100 (155)
63 COG5082 AIR1 Arginine methyltr 28.1 34 0.00073 31.2 1.5 17 559-575 157-173 (190)
64 PF13551 HTH_29: Winged helix- 27.7 1.4E+02 0.003 24.2 5.2 41 97-137 64-110 (112)
65 PF13719 zinc_ribbon_5: zinc-r 27.5 31 0.00067 22.3 0.8 14 551-564 20-33 (37)
66 PF14952 zf-tcix: Putative tre 27.0 41 0.00088 22.6 1.3 22 554-575 9-31 (44)
67 TIGR01031 rpmF_bact ribosomal 26.9 67 0.0014 22.9 2.5 42 531-573 2-43 (55)
68 PF05741 zf-nanos: Nanos RNA b 25.6 25 0.00054 25.1 0.2 21 555-575 32-55 (55)
69 PF13717 zinc_ribbon_4: zinc-r 25.4 38 0.00083 21.7 1.0 13 552-564 21-33 (36)
70 PTZ00368 universal minicircle 25.0 51 0.0011 28.9 2.1 21 556-576 52-72 (148)
71 PF01316 Arg_repressor: Argini 24.0 1.9E+02 0.0042 21.8 4.7 40 98-138 9-48 (70)
72 PF10045 DUF2280: Uncharacteri 24.0 88 0.0019 25.4 3.0 22 110-131 21-42 (104)
73 TIGR01589 A_thal_3526 uncharac 23.9 73 0.0016 22.9 2.2 27 108-134 15-42 (57)
74 PF08086 Toxin_17: Ergtoxin fa 23.2 8.9 0.00019 24.3 -2.1 11 562-572 24-34 (41)
75 PTZ00368 universal minicircle 23.0 55 0.0012 28.7 1.9 18 557-574 130-147 (148)
76 PF10122 Mu-like_Com: Mu-like 22.8 51 0.0011 23.0 1.3 25 555-579 23-47 (51)
77 PF14787 zf-CCHC_5: GAG-polypr 22.7 56 0.0012 20.9 1.3 19 557-575 3-21 (36)
78 COG4888 Uncharacterized Zn rib 21.9 67 0.0015 25.9 1.9 8 556-563 22-29 (104)
79 COG5222 Uncharacterized conser 21.4 62 0.0013 31.4 1.9 25 552-576 172-196 (427)
80 KOG2044 5'-3' exonuclease HKE1 21.2 42 0.00092 37.3 0.9 25 553-577 257-281 (931)
81 COG3915 Uncharacterized protei 20.5 1.3E+02 0.0027 25.9 3.3 38 191-229 97-136 (155)
No 1
>PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional
Probab=100.00 E-value=2.9e-76 Score=639.00 Aligned_cols=471 Identities=16% Similarity=0.222 Sum_probs=377.6
Q ss_pred CCcccccCeeCCHHHHHHHHHHHHHhcceeEEEEeecC-------eEEEEEeec--------------------------
Q 007973 4 DHNFVVGQEFPDVKAFRNAIKEAAIAQHFELRIIKSDL-------IRYFAKCVT-------------------------- 50 (583)
Q Consensus 4 ~~~l~~G~~F~s~~e~~~ai~~ya~~~gf~~~~~ks~~-------~r~~~~C~~-------------------------- 50 (583)
+..|.+||+|+|.+||++||+.||...||++++.+|.+ ...+++|++
T Consensus 71 ~~~P~vGMeF~S~eeA~~FYn~YA~~~GFsVRi~~srrsk~~~~ii~r~fvCsreG~~~~~~~~~~~~~~~~~k~~~~~~ 150 (846)
T PLN03097 71 NLEPLSGMEFESHGEAYSFYQEYARSMGFNTAIQNSRRSKTSREFIDAKFACSRYGTKREYDKSFNRPRARQTKQDPENG 150 (846)
T ss_pred CccCcCCCeECCHHHHHHHHHHHHhhcCceEEeeceeccCCCCcEEEEEEEEcCCCCCcccccccccccccccccCcccc
Confidence 45689999999999999999999999999999855432 123466653
Q ss_pred --------CCCceEEEEEEeCCCCceEEEeecccceecCCCccccccchhhHHHHHHHHHhhcCCCCChHHHHHHHHHHh
Q 007973 51 --------EGCPWRIRAVKLPNAPTFTIRSLEGTHTCGKNAQIGHHQASVDWIVSFIEERLRDNINYKPKDILQDIHKQY 122 (583)
Q Consensus 51 --------~gC~wrv~~~~~~~~~~~~v~~~~~~H~c~~~~~~~~~~~s~~~la~~~~~~l~~~~~~~p~~i~~~l~~~~ 122 (583)
+||++++++++. .+++|.|+.++.+|||++.+.......+.+.+.. +...+....++.. +..+.
T Consensus 151 ~~rR~~tRtGC~A~m~Vk~~-~~gkW~V~~fv~eHNH~L~p~~~~~~~~r~~~~~-~~~~~~~~~~v~~------~~~d~ 222 (846)
T PLN03097 151 TGRRSCAKTDCKASMHVKRR-PDGKWVIHSFVKEHNHELLPAQAVSEQTRKMYAA-MARQFAEYKNVVG------LKNDS 222 (846)
T ss_pred cccccccCCCCceEEEEEEc-CCCeEEEEEEecCCCCCCCCccccchhhhhhHHH-HHhhhhccccccc------cchhh
Confidence 479999999874 5588999999999999998654321111111111 1000111001000 00000
Q ss_pred CcccCHHHHHHHHHHHHHHHhCChHhhhcchHHHHHHHhhhCCCCEEEEEecCCCCceEEEEEehHhHHHhhhcCcccEE
Q 007973 123 GIIIPYKQAWRAKERGLAAIYGSSEEGYCLLPSYCEQIKRTNPGSIAEVFTTGADNRFQRLFVSFNASIYGFLNGCLPIV 202 (583)
Q Consensus 123 g~~~~~~~~~~~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~~vv 202 (583)
.......++. .. ..+..+.|.+|+++++.+||+|+|+ ++.|++++++++||+++.++.+|. +|+|||
T Consensus 223 -----~~~~~~~r~~--~~----~~gD~~~ll~yf~~~q~~nP~Ffy~-~qlDe~~~l~niFWaD~~sr~~Y~-~FGDvV 289 (846)
T PLN03097 223 -----KSSFDKGRNL--GL----EAGDTKILLDFFTQMQNMNSNFFYA-VDLGEDQRLKNLFWVDAKSRHDYG-NFSDVV 289 (846)
T ss_pred -----cchhhHHHhh--hc----ccchHHHHHHHHHHHHhhCCCceEE-EEEccCCCeeeEEeccHHHHHHHH-hcCCEE
Confidence 0001111111 11 1234677999999999999999999 799999999999999999999999 699999
Q ss_pred EeeceEeecccccEEEEEeeecCCCCeeEEEEEEeeccccchHHHHHHHHHHHhcccccCCCCeEEEccCcccHHHHhhh
Q 007973 203 SIGGIQLKSKYLGTLLSATSFDADGGLFPIAFGVIDVENDESWMWFLSEFHKALEIHAESMPQLTFISDGQKGIADAVRR 282 (583)
Q Consensus 203 ~~D~t~~~~~y~~~l~~~~g~d~~~~~~~~a~~~~~~E~~~~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~ 282 (583)
.||+||++|+|++||..++|+|+|+|+++||+||+.+|+.|+|.|||++|+++|+ +.+|.+||||++.+|.+||++
T Consensus 290 ~fDTTY~tN~y~~Pfa~FvGvNhH~qtvlfGcaLl~dEt~eSf~WLf~tfl~aM~----gk~P~tIiTDqd~am~~AI~~ 365 (846)
T PLN03097 290 SFDTTYVRNKYKMPLALFVGVNQHYQFMLLGCALISDESAATYSWLMQTWLRAMG----GQAPKVIITDQDKAMKSVISE 365 (846)
T ss_pred EEeceeeccccCcEEEEEEEecCCCCeEEEEEEEcccCchhhHHHHHHHHHHHhC----CCCCceEEecCCHHHHHHHHH
Confidence 9999999999999999999999999999999999999999999999999999999 999999999999999999999
Q ss_pred cCCCCccccchhHHHHHHhhhccC-----chhhHHHHHHHH-ccCHHHHHHHHHHHH-hcCHHHHHHHHhC--CCCCccc
Q 007973 283 KFPNSSLAFCMRHLSESIGKEFKN-----SRLTHLLWKVAY-ATTTMAFKERMGEIE-DVSSEAAKWIQQY--PPSHWAL 353 (583)
Q Consensus 283 vfP~a~~~~C~~Hi~~n~~~~~~~-----~~~~~~~~~~~~-~~t~~~f~~~~~~l~-~~~~~~~~~l~~~--~~~~W~~ 353 (583)
|||++.|++|+|||++|+.++++. ..|...|..+++ +.+++||+..|..|. .++.+.++||..+ .+++|++
T Consensus 366 VfP~t~Hr~C~wHI~~~~~e~L~~~~~~~~~f~~~f~~cv~~s~t~eEFE~~W~~mi~ky~L~~n~WL~~LY~~RekWap 445 (846)
T PLN03097 366 VFPNAHHCFFLWHILGKVSENLGQVIKQHENFMAKFEKCIYRSWTEEEFGKRWWKILDRFELKEDEWMQSLYEDRKQWVP 445 (846)
T ss_pred HCCCceehhhHHHHHHHHHHHhhHHhhhhhHHHHHHHHHHhCCCCHHHHHHHHHHHHHhhcccccHHHHHHHHhHhhhhH
Confidence 999999999999999999999864 378889998876 569999999999976 5688999999998 5899999
Q ss_pred cccCCccc-ccccccHHH-HHHHHhhc--cCCcHHHHHHHHHHHHHHHHHHHHhh-----------------hhcccCCC
Q 007973 354 VHFEGTRY-GHLSSNIEE-FNRWILEA--RELPIIQVIEQIHCKLMAEFEARRLK-----------------SSSWFSVL 412 (583)
Q Consensus 354 ~~~~~~~~-~~~ttn~~E-~n~~lk~~--~~~~i~~~~e~i~~~~~~~~~~r~~~-----------------~~~~~~~~ 412 (583)
+|+.+.++ |+.||+++| +|+.|++. ..++|..|++.+...+..+.+...+. ..+....|
T Consensus 446 aY~k~~F~agm~sTqRSES~Ns~fk~yv~~~tsL~~Fv~qye~~l~~~~ekE~~aD~~s~~~~P~l~t~~piEkQAs~iY 525 (846)
T PLN03097 446 TYMRDAFLAGMSTVQRSESINAFFDKYVHKKTTVQEFVKQYETILQDRYEEEAKADSDTWNKQPALKSPSPLEKSVSGVY 525 (846)
T ss_pred HHhcccccCCcccccccccHHHHHHHHhCcCCCHHHHHHHHHHHHHHHHHHHHHhhhhcccCCcccccccHHHHHHHHHh
Confidence 99976666 555777999 99999994 68999999999888776655443322 12345799
Q ss_pred ChhHHHHHHHHHhhcCceEEEecC----CeeEEEEe--cCceEEEec----cCcccccCCccccCCCchhHHHHHHhcCC
Q 007973 413 APSAEKRMIEAINHASMYQVLRSD----EVEFEVLS--AERSDIVNI----GTHCCSCRDWQLYGIPCSHAVAALISCRK 482 (583)
Q Consensus 413 tp~~~~~~~~~~~~a~~~~v~~~~----~~~~~V~~--~~~~~~V~~----~~~~CsC~~~~~~giPC~H~lav~~~~~~ 482 (583)
||.+|++||+++..+..|.+...+ ..+|.|.. ....|.|.. .+.+|+|++|+..||||+|||.|+.+.++
T Consensus 526 T~~iF~kFQ~El~~~~~~~~~~~~~dg~~~~y~V~~~~~~~~~~V~~d~~~~~v~CsC~kFE~~GILCrHaLkVL~~~~v 605 (846)
T PLN03097 526 THAVFKKFQVEVLGAVACHPKMESQDETSITFRVQDFEKNQDFTVTWNQTKLEVSCICRLFEYKGYLCRHALVVLQMCQL 605 (846)
T ss_pred HHHHHHHHHHHHHHhhheEEeeeccCCceEEEEEEEecCCCcEEEEEecCCCeEEeeccCeecCccchhhHHHHHhhcCc
Confidence 999999999999999888876642 24788876 345677743 36899999999999999999999999998
Q ss_pred --Ccccccccccccccccc
Q 007973 483 --DVYAFAEKCFTVASYRQ 499 (583)
Q Consensus 483 --~~~~~v~~~yt~~~~~~ 499 (583)
.|+.||.+|||+++-..
T Consensus 606 ~~IP~~YILkRWTKdAK~~ 624 (846)
T PLN03097 606 SAIPSQYILKRWTKDAKSR 624 (846)
T ss_pred ccCchhhhhhhchhhhhhc
Confidence 59999999999998543
No 2
>PF10551 MULE: MULE transposase domain; InterPro: IPR018289 This entry represents a domain found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif. This domain is also found in a transposase for the insertion sequence element IS256 in transposon Tn4001.
Probab=99.88 E-value=4e-23 Score=169.29 Aligned_cols=90 Identities=33% Similarity=0.667 Sum_probs=86.5
Q ss_pred ceEeecccccEEEE---EeeecCCCCeeEEEEEEeeccccchHHHHHHHHHHHhcccccCCCCeEEEccCcccHHHHhhh
Q 007973 206 GIQLKSKYLGTLLS---ATSFDADGGLFPIAFGVIDVENDESWMWFLSEFHKALEIHAESMPQLTFISDGQKGIADAVRR 282 (583)
Q Consensus 206 ~t~~~~~y~~~l~~---~~g~d~~~~~~~~a~~~~~~E~~~~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~ 282 (583)
|||++|+| ++++. ++|+|++|+.+|+||+++++|+.++|.|||+.+++.++ .. |.+||||++.|+.+||++
T Consensus 1 ~T~~tn~~-~~l~~~~~~~~~d~~~~~~~v~~~l~~~e~~~~~~~~l~~~~~~~~----~~-p~~ii~D~~~~~~~Ai~~ 74 (93)
T PF10551_consen 1 GTYKTNKY-GPLLYLMIAVGIDGNGRGFPVAFALVSSESEESYEWFLEKLKEAMP----QK-PKVIISDFDKALINAIKE 74 (93)
T ss_pred Cccccccc-cccceeceEEEEcCCCCEEEEEEEEEcCCChhhhHHHHHHhhhccc----cC-ceeeeccccHHHHHHHHH
Confidence 79999999 98886 99999999999999999999999999999999999998 56 999999999999999999
Q ss_pred cCCCCccccchhHHHHHHh
Q 007973 283 KFPNSSLAFCMRHLSESIG 301 (583)
Q Consensus 283 vfP~a~~~~C~~Hi~~n~~ 301 (583)
+||++.|++|.||+.+|++
T Consensus 75 vfP~~~~~~C~~H~~~n~k 93 (93)
T PF10551_consen 75 VFPDARHQLCLFHILRNIK 93 (93)
T ss_pred HCCCceEehhHHHHHHhhC
Confidence 9999999999999999985
No 3
>PF00872 Transposase_mut: Transposase, Mutator family; InterPro: IPR001207 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. The mutator family of transposases consists of a number of elements, including mutator from maize, IsT2 from Thiobacillus ferrooxidans, Is256 from Staphylococcus aureus, Is1201 from Lactobacillus helveticus, Is1081 from Mycobacterium bovis, IsRm3 from Rhizobium meliloti and others. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=99.84 E-value=1e-21 Score=201.68 Aligned_cols=223 Identities=21% Similarity=0.281 Sum_probs=184.3
Q ss_pred CCCCChHHHHHHHHHHhC-cccCHHHHHHHHHHHHHHHhCChHhhhcchHHHHHHHhhhCCCCEEEEEecCCCCceEEEE
Q 007973 106 NINYKPKDILQDIHKQYG-IIIPYKQAWRAKERGLAAIYGSSEEGYCLLPSYCEQIKRTNPGSIAEVFTTGADNRFQRLF 184 (583)
Q Consensus 106 ~~~~~p~~i~~~l~~~~g-~~~~~~~~~~~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~~~~~~~~~~~~~~f 184 (583)
-.|++.++|.+.+..-+| ..+|.+++.+..+.+.+.. ..|. ...
T Consensus 112 ~~G~Str~i~~~l~~l~g~~~~S~s~vSri~~~~~~~~-----------~~w~----~R~-------------------- 156 (381)
T PF00872_consen 112 LKGVSTRDIEEALEELYGEVAVSKSTVSRITKQLDEEV-----------EAWR----NRP-------------------- 156 (381)
T ss_pred ccccccccccchhhhhhcccccCchhhhhhhhhhhhhH-----------HHHh----hhc--------------------
Confidence 468999999999999999 7899999988877765542 1121 111
Q ss_pred EehHhHHHhhhcCc-ccEEEeeceEeecccc-----cEEEEEeeecCCCCeeEEEEEEeeccccchHHHHHHHHHHHhcc
Q 007973 185 VSFNASIYGFLNGC-LPIVSIGGIQLKSKYL-----GTLLSATSFDADGGLFPIAFGVIDVENDESWMWFLSEFHKALEI 258 (583)
Q Consensus 185 ~~~~~~~~~~~~~~-~~vv~~D~t~~~~~y~-----~~l~~~~g~d~~~~~~~~a~~~~~~E~~~~~~w~l~~l~~~~~~ 258 (583)
.. .. -++|++||+|.+.+.+ ..+++++|+|.+|+..+||+.+.+.|+.++|.-||+.|++.+.
T Consensus 157 ---------L~-~~~y~~l~iD~~~~kvr~~~~~~~~~~~v~iGi~~dG~r~vLg~~~~~~Es~~~W~~~l~~L~~RGl- 225 (381)
T PF00872_consen 157 ---------LE-SEPYPYLWIDGTYFKVREDGRVVKKAVYVAIGIDEDGRREVLGFWVGDRESAASWREFLQDLKERGL- 225 (381)
T ss_pred ---------cc-cccccceeeeeeecccccccccccchhhhhhhhhcccccceeeeecccCCccCEeeecchhhhhccc-
Confidence 00 23 3789999999987643 5689999999999999999999999999999999999999875
Q ss_pred cccCCCCeEEEccCcccHHHHhhhcCCCCccccchhHHHHHHhhhccCc---hhhHHHHHHHHccCHHHHHHHHHHHHh-
Q 007973 259 HAESMPQLTFISDGQKGIADAVRRKFPNSSLAFCMRHLSESIGKEFKNS---RLTHLLWKVAYATTTMAFKERMGEIED- 334 (583)
Q Consensus 259 ~~~~~~~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~~---~~~~~~~~~~~~~t~~~f~~~~~~l~~- 334 (583)
..|..||+|+++||.+||+++||++.++.|.+|+++|+.++++.. .+...++.+..+.+.+++...++++.+
T Consensus 226 ----~~~~lvv~Dg~~gl~~ai~~~fp~a~~QrC~vH~~RNv~~~v~~k~~~~v~~~Lk~I~~a~~~e~a~~~l~~f~~~ 301 (381)
T PF00872_consen 226 ----KDILLVVSDGHKGLKEAIREVFPGAKWQRCVVHLMRNVLRKVPKKDRKEVKADLKAIYQAPDKEEAREALEEFAEK 301 (381)
T ss_pred ----cccceeeccccccccccccccccchhhhhheechhhhhccccccccchhhhhhccccccccccchhhhhhhhcccc
Confidence 569999999999999999999999999999999999999999765 667778888888888888888888754
Q ss_pred ---cCHHHHHHHHhCCCCCcccccc-CCcccccccccHHH-HHHHHhhc
Q 007973 335 ---VSSEAAKWIQQYPPSHWALVHF-EGTRYGHLSSNIEE-FNRWILEA 378 (583)
Q Consensus 335 ---~~~~~~~~l~~~~~~~W~~~~~-~~~~~~~~ttn~~E-~n~~lk~~ 378 (583)
.+|.+.+++++...+.|+..-+ ...+--+.|||..| +|+.|++.
T Consensus 302 ~~~kyp~~~~~l~~~~~~~~tf~~fP~~~~~~i~TTN~iEsln~~irrr 350 (381)
T PF00872_consen 302 WEKKYPKAAKSLEENWDELLTFLDFPPEHRRSIRTTNAIESLNKEIRRR 350 (381)
T ss_pred cccccchhhhhhhhccccccceeeecchhccccchhhhccccccchhhh
Confidence 4788999999877677765534 44445678999999 99999883
No 4
>PF03108 DBD_Tnp_Mut: MuDR family transposase; InterPro: IPR004332 The plant MuDR transposase domain is present in plant proteins that are presumed to be the transposases for Mutator transposable elements. The function of these proteins is unknown. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=99.70 E-value=5.6e-17 Score=123.03 Aligned_cols=67 Identities=46% Similarity=0.852 Sum_probs=65.1
Q ss_pred CCcccccCeeCCHHHHHHHHHHHHHhcceeEEEEeecCeEEEEEeecCCCceEEEEEEeCCCCceEE
Q 007973 4 DHNFVVGQEFPDVKAFRNAIKEAAIAQHFELRIIKSDLIRYFAKCVTEGCPWRIRAVKLPNAPTFTI 70 (583)
Q Consensus 4 ~~~l~~G~~F~s~~e~~~ai~~ya~~~gf~~~~~ks~~~r~~~~C~~~gC~wrv~~~~~~~~~~~~v 70 (583)
||.|.+||+|+|++|++.||..||+..+|++++.+|++.+++++|...||||+|+|+..++++.|+|
T Consensus 1 n~~l~~G~~F~~~~e~k~av~~yai~~~~~~~v~ksd~~r~~~~C~~~~C~Wrv~as~~~~~~~~~I 67 (67)
T PF03108_consen 1 NPELEVGQTFPSKEEFKEAVREYAIKNGFEFKVKKSDKKRYRAKCKDKGCPWRVRASKRKRSDTFQI 67 (67)
T ss_pred CCccccCCEECCHHHHHHHHHHHHHhcCcEEEEeccCCEEEEEEEcCCCCCEEEEEEEcCCCCEEEC
Confidence 7899999999999999999999999999999999999999999999999999999999999999976
No 5
>COG3328 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=99.56 E-value=1.3e-13 Score=138.70 Aligned_cols=235 Identities=15% Similarity=0.150 Sum_probs=177.3
Q ss_pred CCCCChHHHHHHHHHHhCcccCHHHHHHHHHHHHHHHhCChHhhhcchHHHHHHHhhhCCCCEEEEEecCCCCceEEEEE
Q 007973 106 NINYKPKDILQDIHKQYGIIIPYKQAWRAKERGLAAIYGSSEEGYCLLPSYCEQIKRTNPGSIAEVFTTGADNRFQRLFV 185 (583)
Q Consensus 106 ~~~~~p~~i~~~l~~~~g~~~~~~~~~~~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~~~~~~~~~~~~~~f~ 185 (583)
..|++++++.+.+++.++..++...+.+......+.+ .+++..-+
T Consensus 98 ~~gv~Tr~i~~~~~~~~~~~~s~~~iS~~~~~~~e~v---------------~~~~~r~l-------------------- 142 (379)
T COG3328 98 AKGVTTREIEALLEELYGHKVSPSVISVVTDRLDEKV---------------KAWQNRPL-------------------- 142 (379)
T ss_pred HcCCcHHHHHHHHHHhhCcccCHHHhhhHHHHHHHHH---------------HHHHhccc--------------------
Confidence 4689999999999999998888888777766655442 22222211
Q ss_pred ehHhHHHhhhcCcccEEEeeceEeecc--cccEEEEEeeecCCCCeeEEEEEEeeccccchHHHHHHHHHHHhcccccCC
Q 007973 186 SFNASIYGFLNGCLPIVSIGGIQLKSK--YLGTLLSATSFDADGGLFPIAFGVIDVENDESWMWFLSEFHKALEIHAESM 263 (583)
Q Consensus 186 ~~~~~~~~~~~~~~~vv~~D~t~~~~~--y~~~l~~~~g~d~~~~~~~~a~~~~~~E~~~~~~w~l~~l~~~~~~~~~~~ 263 (583)
+..+++++||+|++-+ -+..+++|+|++.+|+..++|+.+...|+ ..|.-||..|+..+. .
T Consensus 143 -----------~~~~~v~~D~~~~k~r~v~~~~~~ia~Gv~~eG~reilg~~~~~~e~-~~w~~~l~~l~~rgl-----~ 205 (379)
T COG3328 143 -----------GDYPYVYLDAKYVKVRSVRNKAVYIAIGVTEEGRREILGIWVGVRES-KFWLSFLLDLKNRGL-----S 205 (379)
T ss_pred -----------cCceEEEEecceeehhhhhhheeeeeeccCcccchhhhceeeecccc-hhHHHHHHHHHhccc-----c
Confidence 2457899999999987 45789999999999999999999999999 999999999998764 4
Q ss_pred CCeEEEccCcccHHHHhhhcCCCCccccchhHHHHHHhhhccCch---hhHHHHHHHHccCHHHHHHHHHHHH----hcC
Q 007973 264 PQLTFISDGQKGIADAVRRKFPNSSLAFCMRHLSESIGKEFKNSR---LTHLLWKVAYATTTMAFKERMGEIE----DVS 336 (583)
Q Consensus 264 ~~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~~~---~~~~~~~~~~~~t~~~f~~~~~~l~----~~~ 336 (583)
....+++|+.+|+.+||.++||.+.++.|..|+.+|+..+...+. ....+..+..+.+.++....|..+. ...
T Consensus 206 ~v~l~v~Dg~~gl~~aI~~v~p~a~~Q~C~vH~~Rnll~~v~~k~~d~i~~~~~~I~~a~~~e~~~~~~~~~~~~w~~~y 285 (379)
T COG3328 206 DVLLVVVDGLKGLPEAISAVFPQAAVQRCIVHLVRNLLDKVPRKDQDAVLSDLRSIYIAPDAEEALLALLAFSELWGKRY 285 (379)
T ss_pred ceeEEecchhhhhHHHHHHhccHhhhhhhhhHHHhhhhhhhhhhhhHHHHhhhhhhhccCCcHHHHHHHHHHHHhhhhhc
Confidence 556677799999999999999999999999999999999988763 3344555555667776666666644 346
Q ss_pred HHHHHHHHhCCCCCcc-ccccCCcccccccccHHH-HHHHHhhc----cCCcHHHHHHHHHH
Q 007973 337 SEAAKWIQQYPPSHWA-LVHFEGTRYGHLSSNIEE-FNRWILEA----RELPIIQVIEQIHC 392 (583)
Q Consensus 337 ~~~~~~l~~~~~~~W~-~~~~~~~~~~~~ttn~~E-~n~~lk~~----~~~~i~~~~e~i~~ 392 (583)
|....++.+..-+.|. ..|....+--+.|||..| +|+.++.. ...|-.+.+..+..
T Consensus 286 P~i~~~~~~~~~~~~~F~~fp~~~r~~i~ttN~IE~~n~~ir~~~~~~~~fpn~~sv~k~~y 347 (379)
T COG3328 286 PAILKSWRNALEELLPFFAFPSEIRKIIYTTNAIESLNKLIRRRTKVVGIFPNEESVEKLVY 347 (379)
T ss_pred chHHHHHHHHHHHhcccccCcHHHHhHhhcchHHHHHHHHHHHHHhhhccCCCHHHHHHHHH
Confidence 7777777776444442 333344445678999999 99977752 24555555554433
No 6
>smart00575 ZnF_PMZ plant mutator transposase zinc finger.
Probab=98.95 E-value=3.9e-10 Score=68.42 Aligned_cols=27 Identities=56% Similarity=1.009 Sum_probs=25.1
Q ss_pred cccccCCccccCCCchhHHHHHHhcCC
Q 007973 456 HCCSCRDWQLYGIPCSHAVAALISCRK 482 (583)
Q Consensus 456 ~~CsC~~~~~~giPC~H~lav~~~~~~ 482 (583)
.+|||++|+..||||+|+|+|+...++
T Consensus 1 ~~CsC~~~~~~gipC~H~i~v~~~~~~ 27 (28)
T smart00575 1 KTCSCRKFQLSGIPCRHALAAAIHIGL 27 (28)
T ss_pred CcccCCCcccCCccHHHHHHHHHHhCC
Confidence 479999999999999999999998875
No 7
>PF08731 AFT: Transcription factor AFT; InterPro: IPR014842 AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This entry includes the paralogous pair of transcription factors AFT1 and AFT2.
Probab=98.94 E-value=5.2e-09 Score=84.08 Aligned_cols=69 Identities=23% Similarity=0.364 Sum_probs=65.0
Q ss_pred eCCHHHHHHHHHHHHHhcceeEEEEeecCeEEEEEeec------------------------------------------
Q 007973 13 FPDVKAFRNAIKEAAIAQHFELRIIKSDLIRYFAKCVT------------------------------------------ 50 (583)
Q Consensus 13 F~s~~e~~~ai~~ya~~~gf~~~~~ks~~~r~~~~C~~------------------------------------------ 50 (583)
|.+++|++.+|+.++...|+++.+.+|+...+.|+|..
T Consensus 1 F~~k~~ikpwlq~~~~~~Gi~iVIerSd~~ki~FkCk~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~k~t~srk 80 (111)
T PF08731_consen 1 FDDKDEIKPWLQKIFYPQGIGIVIERSDKKKIVFKCKNGKRYRHKKKKKGQAQAQQKESTSGNKNKSSKKKKKKRTKSRK 80 (111)
T ss_pred CCchHHHHHHHHHHhhhcCceEEEEecCCceEEEEEecCCCcccccccccccccccccccccccccccccccCCcccccc
Confidence 88999999999999999999999999999999999952
Q ss_pred CCCceEEEEEEeCCCCceEEEeecccceecC
Q 007973 51 EGCPWRIRAVKLPNAPTFTIRSLEGTHTCGK 81 (583)
Q Consensus 51 ~gC~wrv~~~~~~~~~~~~v~~~~~~H~c~~ 81 (583)
..|||+|+|......+.|.|..++..|+|++
T Consensus 81 ~~CPFriRA~yS~k~k~W~lvvvnn~HnH~l 111 (111)
T PF08731_consen 81 NTCPFRIRANYSKKNKKWTLVVVNNEHNHPL 111 (111)
T ss_pred cCCCeEEEEEEEecCCeEEEEEecCCcCCCC
Confidence 2799999999999999999999999999974
No 8
>PF03101 FAR1: FAR1 DNA-binding domain; InterPro: IPR004330 Phytochrome A is the primary photoreceptor for mediating various far-red light-induced responses in higher plants. It has been found that the proteins governing this response, which include FAR-RED ELONGATED HYPOCOTYL3 (FHY3) and FAR-RED-IMPAIRED RESPONSE1 (FAR1), are a pair of homologous proteins sharing significant sequence homology to mutator-like transposases. These proteins appear to be novel transcription factors, which are essential for activating the expression of FHY1 and FHL (for FHY1-like) and related genes, whose products are required for light-induced phytochrome A nuclear accumulation and subsequent light responses in plants. The FRS (FAR1 Related Sequences) family of proteins share a similar domain structure to mutator-like transposases, including an N-terminal C2H2 zinc finger domain, a central putative core transposase domain, and a C-terminal SWIM motif (named after SWI2/SNF and MuDR transposases). It seems plausible that the FRS family represent transcription factors derived from mutator-like transposases. This entry represents a domain found in FAR1 and FRS proteins. It contains a WRKY-like fold and is therefore most likely a zinc-binding DNA-binding domain.
Probab=98.72 E-value=2.4e-08 Score=81.00 Aligned_cols=61 Identities=20% Similarity=0.196 Sum_probs=53.5
Q ss_pred HHHHHHHHhcceeEEEEeecC-------eEEEEEeec----------------------CCCceEEEEEEeCCCCceEEE
Q 007973 21 NAIKEAAIAQHFELRIIKSDL-------IRYFAKCVT----------------------EGCPWRIRAVKLPNAPTFTIR 71 (583)
Q Consensus 21 ~ai~~ya~~~gf~~~~~ks~~-------~r~~~~C~~----------------------~gC~wrv~~~~~~~~~~~~v~ 71 (583)
+||+.||..+||++++.+|.+ .++.++|.+ +||||+|.+.... ++.|.|.
T Consensus 1 ~fy~~yA~~~GF~vr~~~s~~~~~~~~~~~~~~~C~r~G~~~~~~~~~~~~~r~~~s~ktgC~a~i~v~~~~-~~~w~v~ 79 (91)
T PF03101_consen 1 DFYNSYARRHGFSVRKSSSRKSKKNGEIKRVTFVCSRGGKYKSKKKNEEKRRRNRPSKKTGCKARINVKRRK-DGKWRVT 79 (91)
T ss_pred CHHHHhcCcCCeEEEEeeeEeCCCCceEEEEEEEECCcccccccccccccccccccccccCCCEEEEEEEcc-CCEEEEE
Confidence 479999999999999977654 378899974 4899999999877 8999999
Q ss_pred eecccceecCC
Q 007973 72 SLEGTHTCGKN 82 (583)
Q Consensus 72 ~~~~~H~c~~~ 82 (583)
.+..+|||++.
T Consensus 80 ~~~~~HNH~L~ 90 (91)
T PF03101_consen 80 SFVLEHNHPLC 90 (91)
T ss_pred ECcCCcCCCCC
Confidence 99999999985
No 9
>PF04434 SWIM: SWIM zinc finger; InterPro: IPR007527 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1). 
It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination []. SWIM domains are also found in the homologous recombination protein Sws1 [], as well as in several hypothetical proteins. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0008270 zinc ion binding
Probab=98.27 E-value=6.9e-07 Score=59.83 Aligned_cols=30 Identities=43% Similarity=0.817 Sum_probs=27.2
Q ss_pred EeccCcccccCCccccCCCchhHHHHHHhc
Q 007973 451 VNIGTHCCSCRDWQLYGIPCSHAVAALISC 480 (583)
Q Consensus 451 V~~~~~~CsC~~~~~~giPC~H~lav~~~~ 480 (583)
+++...+|||..|+..|.||+|++|++...
T Consensus 10 ~~~~~~~CsC~~~~~~~~~CkHi~av~~~~ 39 (40)
T PF04434_consen 10 VSIEQASCSCPYFQFRGGPCKHIVAVLLAL 39 (40)
T ss_pred ccccccEeeCCCccccCCcchhHHHHHHhh
Confidence 566789999999999999999999999864
No 10
>PF00098 zf-CCHC: Zinc knuckle; InterPro: IPR001878 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the CysCysHisCys (CCHC) type zinc finger domains, and have the sequence: C-X2-C-X4-H-X4-C where X can be any amino acid, and number indicates the number of residues. These 18 residues CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses. It is required for viral genome packaging and for early infection process [, , ]. 
It is also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding []. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0003676 nucleic acid binding, 0008270 zinc ion binding; PDB: 2L44_A 1A1T_A 1WWG_A 1U6P_A 1WWD_A 1WWE_A 1A6B_B 1F6U_A 1MFS_A 1NCP_C ....
Probab=96.30 E-value=0.0027 Score=33.94 Aligned_cols=18 Identities=28% Similarity=0.798 Sum_probs=16.3
Q ss_pred eeCCccCcCCCCCCCCCC
Q 007973 557 VHCSRCNQTGHYKTTCKA 574 (583)
Q Consensus 557 ~~Cs~C~~~gHn~~tC~~ 574 (583)
++|-+|++.||.++.||.
T Consensus 1 ~~C~~C~~~GH~~~~Cp~ 18 (18)
T PF00098_consen 1 RKCFNCGEPGHIARDCPK 18 (18)
T ss_dssp SBCTTTSCSSSCGCTSSS
T ss_pred CcCcCCCCcCcccccCcc
Confidence 369999999999999985
No 11
>PF06782 UPF0236: Uncharacterised protein family (UPF0236); InterPro: IPR009620 This is a group of proteins of unknown function.
Probab=95.84 E-value=0.41 Score=51.05 Aligned_cols=93 Identities=15% Similarity=0.238 Sum_probs=74.0
Q ss_pred ccccchHHHHHHHHHHHhcccccCCCCeEEEccCcccHHHHhhhcCCCCccccchhHHHHHHhhhccCc-hhhHHHHHHH
Q 007973 239 VENDESWMWFLSEFHKALEIHAESMPQLTFISDGQKGIADAVRRKFPNSSLAFCMRHLSESIGKEFKNS-RLTHLLWKVA 317 (583)
Q Consensus 239 ~E~~~~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~~-~~~~~~~~~~ 317 (583)
..+.+-|.-+.+.+.+..... ...-+++.+|+...+.+++. .+|++.+.+..+|+.+.+.+.++.. .+.+.++.+.
T Consensus 235 ~~~~~~~~~v~~~i~~~Y~~~--~~~~iiingDGa~WIk~~~~-~~~~~~~~LD~FHl~k~i~~~~~~~~~~~~~~~~al 311 (470)
T PF06782_consen 235 ESAEEFWEEVLDYIYNHYDLD--KTTKIIINGDGASWIKEGAE-FFPKAEYFLDRFHLNKKIKQALSHDPELKEKIRKAL 311 (470)
T ss_pred cchHHHHHHHHHHHHHhcCcc--cceEEEEeCCCcHHHHHHHH-hhcCceEEecHHHHHHHHHHHhhhChHHHHHHHHHH
Confidence 455788999999888877621 22246778999999988776 8999999999999999999988643 6677677777
Q ss_pred HccCHHHHHHHHHHHHh
Q 007973 318 YATTTMAFKERMGEIED 334 (583)
Q Consensus 318 ~~~t~~~f~~~~~~l~~ 334 (583)
+.....+++..++.+..
T Consensus 312 ~~~d~~~l~~~L~~~~~ 328 (470)
T PF06782_consen 312 KKGDKKKLETVLDTAES 328 (470)
T ss_pred HhcCHHHHHHHHHHHHH
Confidence 77888888888888764
No 12
>PF15288 zf-CCHC_6: Zinc knuckle
Probab=95.56 E-value=0.0064 Score=39.52 Aligned_cols=20 Identities=30% Similarity=0.800 Sum_probs=17.3
Q ss_pred ceeCCccCcCCCCC--CCCCCc
Q 007973 556 TVHCSRCNQTGHYK--TTCKAE 575 (583)
Q Consensus 556 ~~~Cs~C~~~gHn~--~tC~~~ 575 (583)
+++|++||+.||.+ ++||..
T Consensus 1 k~kC~~CG~~GH~~t~k~CP~~ 22 (40)
T PF15288_consen 1 KVKCKNCGAFGHMRTNKRCPMY 22 (40)
T ss_pred CccccccccccccccCccCCCC
Confidence 46899999999998 789875
No 13
>PF01610 DDE_Tnp_ISL3: Transposase; InterPro: IPR002560 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. This family includes the IS204, IS1001, IS1096 and IS1165 transposases. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=95.32 E-value=0.033 Score=54.25 Aligned_cols=94 Identities=14% Similarity=0.183 Sum_probs=69.0
Q ss_pred EEeeceEeecccccEEEEEeeecC--CCCeeEEEEEEeeccccchHHHHHHHH-HHHhcccccCCCCeEEEccCcccHHH
Q 007973 202 VSIGGIQLKSKYLGTLLSATSFDA--DGGLFPIAFGVIDVENDESWMWFLSEF-HKALEIHAESMPQLTFISDGQKGIAD 278 (583)
Q Consensus 202 v~~D~t~~~~~y~~~l~~~~g~d~--~~~~~~~a~~~~~~E~~~~~~w~l~~l-~~~~~~~~~~~~~~~iitD~~~~l~~ 278 (583)
|+||=+........ +..+.+|. ++..+ +.++++-+.++..-||..+ -... .....+|++|...+...
T Consensus 1 lgiDE~~~~~g~~~--y~t~~~d~~~~~~~i---l~i~~~r~~~~l~~~~~~~~~~~~-----~~~v~~V~~Dm~~~y~~ 70 (249)
T PF01610_consen 1 LGIDEFAFRKGHRS--YVTVVVDLDTDTGRI---LDILPGRDKETLKDFFRSLYPEEE-----RKNVKVVSMDMSPPYRS 70 (249)
T ss_pred CeEeeeeeecCCcc--eeEEEEECccCCceE---EEEcCCccHHHHHHHHHHhCcccc-----ccceEEEEcCCCccccc
Confidence 46777666543332 33344444 33322 4578888888888888866 3332 46778999999999999
Q ss_pred HhhhcCCCCccccchhHHHHHHhhhcc
Q 007973 279 AVRRKFPNSSLAFCMRHLSESIGKEFK 305 (583)
Q Consensus 279 Ai~~vfP~a~~~~C~~Hi~~n~~~~~~ 305 (583)
|+++.||+|.+..-.|||++++.+.+.
T Consensus 71 ~~~~~~P~A~iv~DrFHvvk~~~~al~ 97 (249)
T PF01610_consen 71 AIREYFPNAQIVADRFHVVKLANRALD 97 (249)
T ss_pred cccccccccccccccchhhhhhhhcch
Confidence 999999999999999999999987553
No 14
>PF13610 DDE_Tnp_IS240: DDE domain
Probab=95.24 E-value=0.012 Score=51.68 Aligned_cols=81 Identities=15% Similarity=0.083 Sum_probs=67.5
Q ss_pred ccEEEeeceEeecccccEEEEEeeecCCCCeeEEEEEEeeccccchHHHHHHHHHHHhcccccCCCCeEEEccCcccHHH
Q 007973 199 LPIVSIGGIQLKSKYLGTLLSATSFDADGGLFPIAFGVIDVENDESWMWFLSEFHKALEIHAESMPQLTFISDGQKGIAD 278 (583)
Q Consensus 199 ~~vv~~D~t~~~~~y~~~l~~~~g~d~~~~~~~~a~~~~~~E~~~~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~ 278 (583)
++.+.+|-||.+.+ +-..+...++|.+++ +|++-+.+.-+...=..||+.+.+... ..|..|+||+.++...
T Consensus 1 ~~~w~~DEt~iki~-G~~~yl~~aiD~~~~--~l~~~ls~~Rd~~aA~~Fl~~~l~~~~-----~~p~~ivtDk~~aY~~ 72 (140)
T PF13610_consen 1 GDSWHVDETYIKIK-GKWHYLWRAIDAEGN--ILDFYLSKRRDTAAAKRFLKRALKRHR-----GEPRVIVTDKLPAYPA 72 (140)
T ss_pred CCEEEEeeEEEEEC-CEEEEEEEeeccccc--chhhhhhhhcccccceeeccccceeec-----cccceeecccCCccch
Confidence 36789999998754 224566888999999 889999998888888888888877663 6889999999999999
Q ss_pred HhhhcCCCC
Q 007973 279 AVRRKFPNS 287 (583)
Q Consensus 279 Ai~~vfP~a 287 (583)
|+++++|..
T Consensus 73 A~~~l~~~~ 81 (140)
T PF13610_consen 73 AIKELNPEG 81 (140)
T ss_pred hhhhccccc
Confidence 999999874
No 15
>PF03106 WRKY: WRKY DNA-binding domain; InterPro: IPR003657 The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene. Structural studies indicate that this domain is a four-stranded beta-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure. The WRKYGQK residues correspond to the most N-terminal beta-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the beta-sheet.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 2AYD_A 1WJ2_A 2LEX_A.
Probab=94.49 E-value=0.083 Score=38.50 Aligned_cols=39 Identities=26% Similarity=0.585 Sum_probs=32.6
Q ss_pred eEEEEEeecCCCceEEEEEEeCCCCceEEEeecccceec
Q 007973 42 IRYFAKCVTEGCPWRIRAVKLPNAPTFTIRSLEGTHTCG 80 (583)
Q Consensus 42 ~r~~~~C~~~gC~wrv~~~~~~~~~~~~v~~~~~~H~c~ 80 (583)
.|-.++|+..+|+++-.+.+..+++...++++..+|||+
T Consensus 21 pRsYYrCt~~~C~akK~Vqr~~~d~~~~~vtY~G~H~h~ 59 (60)
T PF03106_consen 21 PRSYYRCTHPGCPAKKQVQRSADDPNIVIVTYEGEHNHP 59 (60)
T ss_dssp EEEEEEEECTTEEEEEEEEEETTCCCEEEEEEES--SS-
T ss_pred eeEeeeccccChhheeeEEEecCCCCEEEEEEeeeeCCC
Confidence 566799999999999999998888889999999999996
No 16
>PF13696 zf-CCHC_2: Zinc knuckle
Probab=93.53 E-value=0.043 Score=33.89 Aligned_cols=25 Identities=32% Similarity=0.673 Sum_probs=21.3
Q ss_pred cCceeCCccCcCCCCCCCCCCcccc
Q 007973 554 KHTVHCSRCNQTGHYKTTCKAEIMK 578 (583)
Q Consensus 554 k~~~~Cs~C~~~gHn~~tC~~~~~~ 578 (583)
...+.|..|++.||..+.||...+.
T Consensus 6 P~~Y~C~~C~~~GH~i~dCP~~~Pk 30 (32)
T PF13696_consen 6 PPGYVCHRCGQKGHWIQDCPTNKPK 30 (32)
T ss_pred CCCCEeecCCCCCccHhHCCCCCCC
Confidence 3568899999999999999986554
No 17
>PF04500 FLYWCH: FLYWCH zinc finger domain; InterPro: IPR007588 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array.
C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets. This entry represents a potential FLYWCH Zn-finger domain found in a number of eukaryotic proteins. FLYWCH is a C2H2-type zinc finger characterised by five conserved hydrophobic residues, containing the conserved sequence motif: F/Y-X(n)-L-X(n)-F/Y-X(n)-WXCX(6-12)CX(17-22)HXH where X indicates any amino acid. This domain was first characterised in Drosophila Modifier of mdg4 proteins, Mod(mdg4), putative chromatin modulators involved in higher order chromatin domains. Mod(mdg4) proteins share a common N-terminal BTB/POZ domain, but differ in their C-terminal region, most containing C-terminal FLYWCH zinc finger motifs. The FLYWCH domain in Mod(mdg4) proteins has a putative role in protein-protein interactions; for example, Mod(mdg4)-67.2 interacts with DNA-binding protein Su(Hw) via its FLYWCH domain. FLYWCH domains have been described in other proteins as well, including suppressor of killer of prune, Su(Kpn), which contains 4 terminal FLYWCH zinc finger motifs in a tandem array and a C-terminal glutathione SH-transferase (GST) domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; PDB: 2RPR_A.
Probab=91.16 E-value=0.47 Score=34.61 Aligned_cols=46 Identities=17% Similarity=0.275 Sum_probs=25.8
Q ss_pred cceeEEEEeecCeEEEEEeecC---CCceEEEEEEeCCCCceEEEeeccccee
Q 007973 30 QHFELRIIKSDLIRYFAKCVTE---GCPWRIRAVKLPNAPTFTIRSLEGTHTC 79 (583)
Q Consensus 30 ~gf~~~~~ks~~~r~~~~C~~~---gC~wrv~~~~~~~~~~~~v~~~~~~H~c 79 (583)
.|+.|...+.........|... +|++++... .+.-.+.....+|||
T Consensus 14 ~Gy~y~~~~~~~~~~~WrC~~~~~~~C~a~~~~~----~~~~~~~~~~~~HnH 62 (62)
T PF04500_consen 14 DGYRYYFNKRNDGKTYWRCSRRRSHGCRARLITD----AGDGRVVRTNGEHNH 62 (62)
T ss_dssp TTEEEEEEEE-SS-EEEEEGGGTTS----EEEEE------TTEEEE-S---SS
T ss_pred CCeEEECcCCCCCcEEEEeCCCCCCCCeEEEEEE----CCCCEEEECCCccCC
Confidence 5778887777688888999874 899999997 233455666688887
No 18
>PF04684 BAF1_ABF1: BAF1 / ABF1 chromatin reorganising factor; InterPro: IPR006774 ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 in Saccharomyces cerevisiae (P14164 from SWISSPROT)). The N-terminal two thirds of the protein are necessary for DNA binding, and the N terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilise the protein structure.; GO: 0003677 DNA binding, 0006338 chromatin remodeling, 0005634 nucleus
Probab=91.10 E-value=0.49 Score=48.39 Aligned_cols=55 Identities=18% Similarity=0.387 Sum_probs=49.8
Q ss_pred ccCeeCCHHHHHHHHHHHHHhcceeEEEEeecC-eEEEEEeecCCCceEEEEEEeC
Q 007973 9 VGQEFPDVKAFRNAIKEAAIAQHFELRIIKSDL-IRYFAKCVTEGCPWRIRAVKLP 63 (583)
Q Consensus 9 ~G~~F~s~~e~~~ai~~ya~~~gf~~~~~ks~~-~r~~~~C~~~gC~wrv~~~~~~ 63 (583)
.|..|+++++-+.+|+.|......+|..+.|-+ +.++|.|.-..|||+|.++...
T Consensus 24 ~~~~f~tl~~wy~v~ndyefq~rcpiilknsh~nkhftfachlk~c~fkillsy~g 79 (496)
T PF04684_consen 24 QARKFPTLEAWYNVINDYEFQSRCPIILKNSHRNKHFTFACHLKNCPFKILLSYCG 79 (496)
T ss_pred cccCCCcHHHHHHHHhhhhhhhcCceeecccccccceEEEeeccCCCceeeeeecc
Confidence 467899999999999999999999999998865 6799999999999999998643
No 19
>PF03050 DDE_Tnp_IS66: Transposase IS66 family; InterPro: IPR004291 Transposase proteins are necessary for efficient DNA transposition. This family includes the bacterial insertion sequence (IS) element, IS66, from Agrobacterium tumefaciens. IS66 may cause genetic and structural variations of the T region and the vir region of the octopine Ti plasmids. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=90.52 E-value=0.96 Score=44.58 Aligned_cols=134 Identities=18% Similarity=0.168 Sum_probs=86.1
Q ss_pred CCCCChHHHHHHHHHHhCcccCHHHHHHHHHHHHHHHhCChHhhhcchHHHHHHHhhhCCCCEEEEEecCCCCceEEEEE
Q 007973 106 NINYKPKDILQDIHKQYGIIIPYKQAWRAKERGLAAIYGSSEEGYCLLPSYCEQIKRTNPGSIAEVFTTGADNRFQRLFV 185 (583)
Q Consensus 106 ~~~~~p~~i~~~l~~~~g~~~~~~~~~~~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~~~~~~~~~~~~~~f~ 185 (583)
...++-..+.+.+.+. |+.+|.+++.+...+..+.. ....+.+.+.
T Consensus 18 ~~~lp~~r~~~~~~~~-G~~is~~ti~~~~~~~~~~l-----------~~~~~~l~~~---------------------- 63 (271)
T PF03050_consen 18 VYHLPLYRIQQMLEDL-GITISRGTIANWIKRVAEAL-----------KPLYEALKEE---------------------- 63 (271)
T ss_pred cCCCCHHHHhhhhhcc-ceeeccchhHhHhhhhhhhh-----------hhhhhhhhhh----------------------
Confidence 4466777777777777 99999998887766654432 1122222211
Q ss_pred ehHhHHHhhhcCcccEEEeeceEee----ccc-ccEEEEEeeecCCCCeeEEEEEEeeccccchHHHHHHHHHHHhcccc
Q 007973 186 SFNASIYGFLNGCLPIVSIGGIQLK----SKY-LGTLLSATSFDADGGLFPIAFGVIDVENDESWMWFLSEFHKALEIHA 260 (583)
Q Consensus 186 ~~~~~~~~~~~~~~~vv~~D~t~~~----~~y-~~~l~~~~g~d~~~~~~~~a~~~~~~E~~~~~~w~l~~l~~~~~~~~ 260 (583)
.. -.+|+.+|-|..+ ++. ++-+.++++-+ .+.|.+.++=+.+.-.-+|..
T Consensus 64 --------~~--~~~~~~~DET~~~vl~~~~g~~~~~Wv~~~~~------~v~f~~~~sR~~~~~~~~L~~--------- 118 (271)
T PF03050_consen 64 --------LR--SSPVVHADETGWRVLDKGKGKKGYLWVFVSPE------VVLFFYAPSRSSKVIKEFLGD--------- 118 (271)
T ss_pred --------cc--ccceeccCCceEEEeccccccceEEEeeeccc------eeeeeecccccccchhhhhcc---------
Confidence 11 2578888988877 433 33444444433 555666666666655555442
Q ss_pred cCCCCeEEEccCcccHHHHhhhcCCCCccccchhHHHHHHhhhccC
Q 007973 261 ESMPQLTFISDGQKGIADAVRRKFPNSSLAFCMRHLSESIGKEFKN 306 (583)
Q Consensus 261 ~~~~~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~ 306 (583)
-.-+++||+..+-.. +.++.|+.|..|+.+.+..-...
T Consensus 119 ---~~GilvsD~y~~Y~~-----~~~~~hq~C~AH~~R~~~~~~~~ 156 (271)
T PF03050_consen 119 ---FSGILVSDGYSAYNK-----LAGITHQLCWAHLRRDFQDAAES 156 (271)
T ss_pred ---cceeeeccccccccc-----ccccccccccccccccccccccc
Confidence 224899999987655 22789999999999999876654
No 20
>smart00774 WRKY DNA binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.
Probab=90.27 E-value=0.44 Score=34.45 Aligned_cols=38 Identities=26% Similarity=0.571 Sum_probs=32.1
Q ss_pred eEEEEEeec-CCCceEEEEEEeCCCCceEEEeeccccee
Q 007973 42 IRYFAKCVT-EGCPWRIRAVKLPNAPTFTIRSLEGTHTC 79 (583)
Q Consensus 42 ~r~~~~C~~-~gC~wrv~~~~~~~~~~~~v~~~~~~H~c 79 (583)
-|-.++|+. .||+++=.+.+..+++...++.+..+|||
T Consensus 21 pRsYYrCt~~~~C~a~K~Vq~~~~d~~~~~vtY~g~H~h 59 (59)
T smart00774 21 PRSYYRCTYSQGCPAKKQVQRSDDDPSVVEVTYEGEHTH 59 (59)
T ss_pred cceEEeccccCCCCCcccEEEECCCCCEEEEEEeeEeCC
Confidence 355689998 89999988888776778888899999997
No 21
>COG3316 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=89.96 E-value=5.1 Score=37.31 Aligned_cols=127 Identities=15% Similarity=0.079 Sum_probs=82.9
Q ss_pred CChHHHHHHHHHHhCcccCHHHHHHHHHHHHHHHhCChHhhhcchHHHHHHHhhhCCCCEEEEEecCCCCceEEEEEehH
Q 007973 109 YKPKDILQDIHKQYGIIIPYKQAWRAKERGLAAIYGSSEEGYCLLPSYCEQIKRTNPGSIAEVFTTGADNRFQRLFVSFN 188 (583)
Q Consensus 109 ~~p~~i~~~l~~~~g~~~~~~~~~~~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~~~~~~~~~~~~~~f~~~~ 188 (583)
++-..+.+. ..+.|+.+...++++.-++.-.. +.+.+.+.++.
T Consensus 26 Ls~r~v~e~-l~~rgi~v~h~Ti~rwv~k~~~~--------------~~~~~~~r~~~---------------------- 68 (215)
T COG3316 26 LSLRDVEEM-LAERGIEVDHETIHRWVQKYGPL--------------LARRLKRRKRK---------------------- 68 (215)
T ss_pred hhhccHHHH-HHHcCcchhHHHHHHHHHHHhHH--------------HHHHhhhhccc----------------------
Confidence 334444443 45569999999988876654322 23333333322
Q ss_pred hHHHhhhcCcccEEEeeceEeecccccEEEEEeeecCCCCeeEEEEEEeeccccchHHHHHHHHHHHhcccccCCCCeEE
Q 007973 189 ASIYGFLNGCLPIVSIGGIQLKSKYLGTLLSATSFDADGGLFPIAFGVIDVENDESWMWFLSEFHKALEIHAESMPQLTF 268 (583)
Q Consensus 189 ~~~~~~~~~~~~vv~~D~t~~~~~y~~~l~~~~g~d~~~~~~~~a~~~~~~E~~~~~~w~l~~l~~~~~~~~~~~~~~~i 268 (583)
-.+++.+|-||.+.+-+. .+.-.++|.+|+ ++.+-+...-+...=.-||..+++..+ .|.+|
T Consensus 69 ---------~~~~w~vDEt~ikv~gkw-~ylyrAid~~g~--~Ld~~L~~rRn~~aAk~Fl~kllk~~g------~p~v~ 130 (215)
T COG3316 69 ---------AGDSWRVDETYIKVNGKW-HYLYRAIDADGL--TLDVWLSKRRNALAAKAFLKKLLKKHG------EPRVF 130 (215)
T ss_pred ---------cccceeeeeeEEeeccEe-eehhhhhccCCC--eEEEEEEcccCcHHHHHHHHHHHHhcC------CCceE
Confidence 256788999998754222 233445566544 566777777666666667776666544 68899
Q ss_pred EccCcccHHHHhhhcCCCCccc
Q 007973 269 ISDGQKGIADAVRRKFPNSSLA 290 (583)
Q Consensus 269 itD~~~~l~~Ai~~vfP~a~~~ 290 (583)
+||+.+....|+.++-+.+.|+
T Consensus 131 vtDka~s~~~A~~~l~~~~ehr 152 (215)
T COG3316 131 VTDKAPSYTAALRKLGSEVEHR 152 (215)
T ss_pred EecCccchHHHHHhcCcchhee
Confidence 9999999999999998866665
No 22
>PF00665 rve: Integrase core domain; InterPro: IPR001584 Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus (ASV) have been studied most carefully with respect to the structural basis of catalysis. Although the active site of ASV integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis. Retroviral integrase is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group. HIV integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration.
To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity.; GO: 0015074 DNA integration; PDB: 3AO3_A 3OVN_A 3AO5_A 3AO4_A 3AO1_A 1C6V_D 3HPG_A 3HPH_A 3OYD_A 3OYF_B ....
Probab=89.43 E-value=1.2 Score=37.28 Aligned_cols=75 Identities=15% Similarity=0.034 Sum_probs=55.2
Q ss_pred ccEEEeeceEee-cccccEEEEEeeecCCCCeeEEEEEEeeccccchHHHHHHHHHHHhcccccCCCCeEEEccCcccHH
Q 007973 199 LPIVSIGGIQLK-SKYLGTLLSATSFDADGGLFPIAFGVIDVENDESWMWFLSEFHKALEIHAESMPQLTFISDGQKGIA 277 (583)
Q Consensus 199 ~~vv~~D~t~~~-~~y~~~l~~~~g~d~~~~~~~~a~~~~~~E~~~~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~ 277 (583)
..++.+|.++.. ...++..+..+.+|..-.. .+++.+-..++.+.+..+|.......+ ...|.+|+||+..+..
T Consensus 6 ~~~~~~D~~~~~~~~~~~~~~~~~~iD~~S~~-~~~~~~~~~~~~~~~~~~l~~~~~~~~----~~~p~~i~tD~g~~f~ 80 (120)
T PF00665_consen 6 GERWQIDFTPMPIPDKGGRVYLLVFIDDYSRF-IYAFPVSSKETAEAALRALKRAIEKRG----GRPPRVIRTDNGSEFT 80 (120)
T ss_dssp TTEEEEEEEEETGGCTT-CEEEEEEEETTTTE-EEEEEESSSSHHHHHHHHHHHHHHHHS-----SE-SEEEEESCHHHH
T ss_pred CCEEEEeeEEEecCCCCccEEEEEEEECCCCc-EEEEEeecccccccccccccccccccc----cccceecccccccccc
Confidence 468999999765 3456688889999987765 457777777788888888887666665 3349999999998876
Q ss_pred H
Q 007973 278 D 278 (583)
Q Consensus 278 ~ 278 (583)
.
T Consensus 81 ~ 81 (120)
T PF00665_consen 81 S 81 (120)
T ss_dssp S
T ss_pred c
Confidence 4
No 23
>PHA02517 putative transposase OrfB; Reviewed
Probab=88.93 E-value=1.8 Score=42.68 Aligned_cols=152 Identities=16% Similarity=0.093 Sum_probs=84.5
Q ss_pred HHHHHHHHHhhc-CCCCChHHHHHHHHHHhCcccCHHHHHHHHHHHHHHHhCChHhhhcchHHHHHHHhhhCCCCEEEEE
Q 007973 94 WIVSFIEERLRD-NINYKPKDILQDIHKQYGIIIPYKQAWRAKERGLAAIYGSSEEGYCLLPSYCEQIKRTNPGSIAEVF 172 (583)
Q Consensus 94 ~la~~~~~~l~~-~~~~~p~~i~~~l~~~~g~~~~~~~~~~~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~~~ 172 (583)
.+.+.+.+.... .+.++.+.|...|++. |+.++.++++|..+.. |-... .. .+... ...-. .
T Consensus 30 ~l~~~I~~i~~~~~~~~G~r~I~~~L~~~-g~~vs~~tV~Rim~~~-----gl~~~--------~~-~k~~~-~~~~~-~ 92 (277)
T PHA02517 30 WLKSEILRVYDENHQVYGVRKVWRQLNRE-GIRVARCTVGRLMKEL-----GLAGV--------LR-GKKVR-TTISR-K 92 (277)
T ss_pred HHHHHHHHHHHHhCCCCCHHHHHHHHHhc-CcccCHHHHHHHHHHc-----CCceE--------ec-CCCcC-CCCCC-C
Confidence 455556666554 5788999999998765 9999999998876542 11000 00 00000 00000 0
Q ss_pred ecCCCCceEEEEEehHhHHHhhhcCcccEEEeeceEeecccccEEEEEeeecCCCCeeEEEEEEeeccccchHHHHHHHH
Q 007973 173 TTGADNRFQRLFVSFNASIYGFLNGCLPIVSIGGIQLKSKYLGTLLSATSFDADGGLFPIAFGVIDVENDESWMWFLSEF 252 (583)
Q Consensus 173 ~~~~~~~~~~~f~~~~~~~~~~~~~~~~vv~~D~t~~~~~y~~~l~~~~g~d~~~~~~~~a~~~~~~E~~~~~~w~l~~l 252 (583)
.....+.+.+-|-+. .-..++..|.||....- +..+.++.+|...+ .++|+.+...++.+...-+|+..
T Consensus 93 ~~~~~n~~~r~f~~~---------~pn~~w~~D~t~~~~~~-g~~yl~~iiD~~sr-~i~~~~~~~~~~~~~~~~~l~~a 161 (277)
T PHA02517 93 AVAAPDRVNRQFVAT---------RPNQLWVADFTYVSTWQ-GWVYVAFIIDVFAR-RIVGWRVSSSMDTDFVLDALEQA 161 (277)
T ss_pred CCCCCCcccCCCCCC---------CCCCeEEeceeEEEeCC-CCEEEEEecccCCC-eeeecccCCCCChHHHHHHHHHH
Confidence 000011111112111 13468999999986543 55677777777665 46788888777777554444444
Q ss_pred HHHhcccccCCCCeEEEccCcccHH
Q 007973 253 HKALEIHAESMPQLTFISDGQKGIA 277 (583)
Q Consensus 253 ~~~~~~~~~~~~~~~iitD~~~~l~ 277 (583)
....+ ...+.+|.||+.....
T Consensus 162 ~~~~~----~~~~~i~~sD~G~~y~ 182 (277)
T PHA02517 162 LWARG----RPGGLIHHSDKGSQYV 182 (277)
T ss_pred HHhcC----CCcCcEeecccccccc
Confidence 33333 2334577799987653
No 24
>PF14392 zf-CCHC_4: Zinc knuckle
Probab=88.10 E-value=0.2 Score=34.90 Aligned_cols=21 Identities=24% Similarity=0.676 Sum_probs=18.4
Q ss_pred cCceeCCccCcCCCCCCCCCC
Q 007973 554 KHTVHCSRCNQTGHYKTTCKA 574 (583)
Q Consensus 554 k~~~~Cs~C~~~gHn~~tC~~ 574 (583)
+-+..|++|+..||..+.||.
T Consensus 29 ~lp~~C~~C~~~gH~~~~C~k 49 (49)
T PF14392_consen 29 RLPRFCFHCGRIGHSDKECPK 49 (49)
T ss_pred CcChhhcCCCCcCcCHhHcCC
Confidence 356789999999999999984
No 25
>PF13565 HTH_32: Homeodomain-like domain
Probab=85.70 E-value=2.2 Score=32.67 Aligned_cols=42 Identities=14% Similarity=0.325 Sum_probs=34.9
Q ss_pred hHHHHHHHHHhhcCCCCChHHHHHHHHHHhCccc--CHHHHHHH
Q 007973 93 DWIVSFIEERLRDNINYKPKDILQDIHKQYGIII--PYKQAWRA 134 (583)
Q Consensus 93 ~~la~~~~~~l~~~~~~~p~~i~~~l~~~~g~~~--~~~~~~~~ 134 (583)
..+...+.+.+..+|.+++.+|.+.|.+++|+.+ |.+++||.
T Consensus 33 ~e~~~~i~~~~~~~p~wt~~~i~~~L~~~~g~~~~~S~~tv~R~ 76 (77)
T PF13565_consen 33 PEQRERIIALIEEHPRWTPREIAEYLEEEFGISVRVSRSTVYRI 76 (77)
T ss_pred HHHHHHHHHHHHhCCCCCHHHHHHHHHHHhCCCCCccHhHHHHh
Confidence 3344567777778899999999999999999877 99999875
No 26
>smart00343 ZnF_C2HC zinc finger.
Probab=85.41 E-value=0.43 Score=28.14 Aligned_cols=19 Identities=26% Similarity=0.787 Sum_probs=16.3
Q ss_pred eCCccCcCCCCCCCCCCcc
Q 007973 558 HCSRCNQTGHYKTTCKAEI 576 (583)
Q Consensus 558 ~Cs~C~~~gHn~~tC~~~~ 576 (583)
.|.+|++.||..+.||...
T Consensus 1 ~C~~CG~~GH~~~~C~~~~ 19 (26)
T smart00343 1 KCYNCGKEGHIARDCPKXX 19 (26)
T ss_pred CCccCCCCCcchhhCCccc
Confidence 4899999999999998443
No 27
>PF02178 AT_hook: AT hook motif; InterPro: IPR017956 AT hooks are DNA-binding motifs with a preference for A/T rich regions. These motifs are found in a variety of proteins, including the high mobility group (HMG) proteins, in DNA-binding proteins from plants and in hBRG1 protein, a central ATPase of the human switching/sucrose non-fermenting (SWI/SNF) remodeling complex. High mobility group (HMG) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG-I and HMG-Y (HMGA) are proteins of about 100 amino acid residues which are produced by the alternative splicing of a single gene. HMG-I/Y proteins bind preferentially to the minor groove of AT-rich regions in double-stranded DNA in a non-sequence specific manner. It is suggested that these proteins could function in nucleosome phasing and in the 3' end processing of mRNA transcripts. They are also involved in the transcription regulation of genes containing, or in close proximity to, AT-rich regions.; GO: 0003677 DNA binding; PDB: 2EZE_A 2EZD_A 2EZF_A 2EZG_A.
Probab=83.33 E-value=0.51 Score=22.89 Aligned_cols=9 Identities=56% Similarity=0.981 Sum_probs=3.6
Q ss_pred CCCCCCCCc
Q 007973 535 RPPGRPEKK 543 (583)
Q Consensus 535 r~~GRpkk~ 543 (583)
+++|||+|.
T Consensus 2 r~RGRP~k~ 10 (13)
T PF02178_consen 2 RKRGRPRKN 10 (13)
T ss_dssp --SS--TT-
T ss_pred CcCCCCccc
Confidence 679999886
No 28
>PRK14702 insertion element IS2 transposase InsD; Provisional
Probab=82.88 E-value=32 Score=33.61 Aligned_cols=147 Identities=13% Similarity=0.085 Sum_probs=87.7
Q ss_pred hHHHHHHHHHhhcCCCCChHHHHHHHHHH---hCc-ccCHHHHHHHHHHH-HHHHhCChHhhhcchHHHHHHHhhhCCCC
Q 007973 93 DWIVSFIEERLRDNINYKPKDILQDIHKQ---YGI-IIPYKQAWRAKERG-LAAIYGSSEEGYCLLPSYCEQIKRTNPGS 167 (583)
Q Consensus 93 ~~la~~~~~~l~~~~~~~p~~i~~~l~~~---~g~-~~~~~~~~~~~~~~-~~~~~g~~~~~~~~l~~~~~~l~~~np~~ 167 (583)
..+...+.+....++.++...|...|+.+ .|+ .++..+++|..+.+ +.... +...+.+
T Consensus 11 ~~l~~~I~~~~~~~~~yG~rri~~~L~~~~~~~g~~~v~~krV~rlmr~~gL~~~~-----------------r~~~~~~ 73 (262)
T PRK14702 11 TDVLLRIHHVIGELPTYGYRRVWALLRRQAELDGMPAINAKRVYRLMRQNALLLER-----------------KPAVPPS 73 (262)
T ss_pred HHHHHHHHHHHHhCcccChHHHHHHHHhhhcccCccccCHHHHHHHHHHhCCcccc-----------------CCCCCCC
Confidence 44555666666677889999999999875 366 48999988876553 10000 0000000
Q ss_pred EEEEEecCCCCceEEEEEehHhHHHhhhcCcccEEEeeceEeecccccEEEEEeeecCCCCeeEEEEEEeec-cccchHH
Q 007973 168 IAEVFTTGADNRFQRLFVSFNASIYGFLNGCLPIVSIGGIQLKSKYLGTLLSATSFDADGGLFPIAFGVIDV-ENDESWM 246 (583)
Q Consensus 168 ~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~~vv~~D~t~~~~~y~~~l~~~~g~d~~~~~~~~a~~~~~~-E~~~~~~ 246 (583)
. ...... |. ...-..++..|-||.....++.++.++-+|...+ .+|||++... .+.+.-.
T Consensus 74 ~-----~~~~~~----~~---------~~~pn~~W~~DiT~~~~~~g~~~Yl~~viD~~sR-~ivg~~is~~~~~~~~v~ 134 (262)
T PRK14702 74 K-----RAHTGR----VA---------VKESNQRWCSDGFEFCCDNGERLRVTFALDCCDR-EALHWAVTTGGFNSETVQ 134 (262)
T ss_pred C-----cCCCCc----cc---------cCCCCCEEEeeeEEEEecCCcEEEEEEEEecccc-eeeeEEeccCcCCHHHHH
Confidence 0 000000 10 1112468999999986554557888888988777 6789999874 5656555
Q ss_pred HHHHHHHHH-hcccccCCCCeEEEccCcccH
Q 007973 247 WFLSEFHKA-LEIHAESMPQLTFISDGQKGI 276 (583)
Q Consensus 247 w~l~~l~~~-~~~~~~~~~~~~iitD~~~~l 276 (583)
-+|+...+. .+. .....|.+|.||+....
T Consensus 135 ~~l~~A~~~~~~~-~~~~~~~iihSD~Gsqy 164 (262)
T PRK14702 135 DVMLGAVERRFGN-DLPSSPVEWLTDNGSCY 164 (262)
T ss_pred HHHHHHHHHHhcc-cCCCCCeEEEcCCCccc
Confidence 566544333 220 00346788999998653
No 29
>PF04937 DUF659: Protein of unknown function (DUF 659); InterPro: IPR007021 These are transposase-like proteins with no known function.
Probab=81.57 E-value=15 Score=32.56 Aligned_cols=91 Identities=13% Similarity=0.203 Sum_probs=57.8
Q ss_pred cccEEEEEeeecCCCCeeEEEEEEe-eccccchHHHHHHHHHHHhcccccCCCCeEEEccCcccHHHHhh---hcCCCCc
Q 007973 213 YLGTLLSATSFDADGGLFPIAFGVI-DVENDESWMWFLSEFHKALEIHAESMPQLTFISDGQKGIADAVR---RKFPNSS 288 (583)
Q Consensus 213 y~~~l~~~~g~d~~~~~~~~a~~~~-~~E~~~~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~---~vfP~a~ 288 (583)
.+.+|+.++..-+.|-.|.=..-.- ...+.+...-+|+...+.++ ...-..||||....+..|-+ +.+|...
T Consensus 45 ~~~~lInf~v~~~~g~~Flksvd~s~~~~~a~~l~~ll~~vIeeVG----~~nVvqVVTDn~~~~~~a~~~L~~k~p~if 120 (153)
T PF04937_consen 45 KGRSLINFMVYCPEGTVFLKSVDASSIIKTAEYLFELLDEVIEEVG----EENVVQVVTDNASNMKKAGKLLMEKYPHIF 120 (153)
T ss_pred CCCeEEEEEEEcccccEEEEEEecccccccHHHHHHHHHHHHHHhh----hhhhhHHhccCchhHHHHHHHHHhcCCCEE
Confidence 3344444444444444332222111 12345555566666655555 55667789999999888844 5689999
Q ss_pred cccchhHHHHHHhhhccCc
Q 007973 289 LAFCMRHLSESIGKEFKNS 307 (583)
Q Consensus 289 ~~~C~~Hi~~n~~~~~~~~ 307 (583)
..-|.-|-+.-+.+.+...
T Consensus 121 w~~CaaH~inLmledi~k~ 139 (153)
T PF04937_consen 121 WTPCAAHCINLMLEDIGKL 139 (153)
T ss_pred EechHHHHHHHHHHHHhcC
Confidence 9999999988888877654
No 30
>COG5431 Uncharacterized metal-binding protein [Function unknown]
Probab=79.50 E-value=4.8 Score=32.31 Aligned_cols=37 Identities=35% Similarity=0.630 Sum_probs=25.5
Q ss_pred EEEEe-cCceEEEeccCcccccCCcc-----ccCCCchhHHHHHH
Q 007973 440 FEVLS-AERSDIVNIGTHCCSCRDWQ-----LYGIPCSHAVAALI 478 (583)
Q Consensus 440 ~~V~~-~~~~~~V~~~~~~CsC~~~~-----~~giPC~H~lav~~ 478 (583)
|.|.- .++.|+++.. .|||..|- .-.-||.|++.+=.
T Consensus 35 ~fVyvG~~rdYIl~~g--fCSCp~~~~svvl~Gk~~C~Hi~glk~ 77 (117)
T COG5431 35 FFVYVGKERDYILEGG--FCSCPDFLGSVVLKGKSPCAHIIGLKV 77 (117)
T ss_pred EEEEEccccceEEEcC--cccCHHHHhHhhhcCcccchhhhheee
Confidence 33433 5668999877 89998885 22357999986533
No 31
>smart00384 AT_hook DNA binding domain with preference for A/T rich regions. Small DNA-binding motif first described in the high mobility group non-histone chromosomal protein HMG-I(Y).
Probab=73.27 E-value=2.1 Score=24.92 Aligned_cols=14 Identities=36% Similarity=0.510 Sum_probs=10.4
Q ss_pred CCCCCCCCCccccc
Q 007973 534 RRPPGRPEKKRICL 547 (583)
Q Consensus 534 ~r~~GRpkk~R~~~ 547 (583)
.|++|||+|.....
T Consensus 1 kRkRGRPrK~~~~~ 14 (26)
T smart00384 1 KRKRGRPRKAPKDX 14 (26)
T ss_pred CCCCCCCCCCCCcc
Confidence 47899999885543
No 32
>PRK09335 30S ribosomal protein S26e; Provisional
Probab=72.16 E-value=3.1 Score=32.99 Aligned_cols=27 Identities=30% Similarity=0.587 Sum_probs=19.9
Q ss_pred CCCCCCCCCCCCccccccccCCccCceeCCccCc
Q 007973 531 PKFRRPPGRPEKKRICLEDLNREKHTVHCSRCNQ 564 (583)
Q Consensus 531 p~~~r~~GRpkk~R~~~~~~~~~k~~~~Cs~C~~ 564 (583)
|..++..||-|+.|... ...+|++|+.
T Consensus 2 ~kKRrn~GR~K~~rGhv-------~~V~C~nCgr 28 (95)
T PRK09335 2 PKKRENRGRRKGDKGHV-------GYVQCDNCGR 28 (95)
T ss_pred CcccccCCCCCCCCCCC-------ccEEeCCCCC
Confidence 56677788887664433 5789999997
No 33
>PLN00186 ribosomal protein S26; Provisional
Probab=67.39 E-value=4.4 Score=32.88 Aligned_cols=27 Identities=33% Similarity=0.620 Sum_probs=19.6
Q ss_pred CCCCCCCCCCCCccccccccCCccCceeCCccCc
Q 007973 531 PKFRRPPGRPEKKRICLEDLNREKHTVHCSRCNQ 564 (583)
Q Consensus 531 p~~~r~~GRpkk~R~~~~~~~~~k~~~~Cs~C~~ 564 (583)
|..++..||-|+.|... +..+|++|+.
T Consensus 2 ~kKRrN~GR~K~~rGhv-------~~V~C~nCgr 28 (109)
T PLN00186 2 TKKRRNGGRNKHGRGHV-------KRIRCSNCGK 28 (109)
T ss_pred CcccccCCCCCCCCCCC-------cceeeCCCcc
Confidence 55677778877664433 5789999997
No 34
>PTZ00172 40S ribosomal protein S26; Provisional
Probab=66.64 E-value=4.6 Score=32.80 Aligned_cols=27 Identities=33% Similarity=0.614 Sum_probs=19.8
Q ss_pred CCCCCCCCCCCCccccccccCCccCceeCCccCc
Q 007973 531 PKFRRPPGRPEKKRICLEDLNREKHTVHCSRCNQ 564 (583)
Q Consensus 531 p~~~r~~GRpkk~R~~~~~~~~~k~~~~Cs~C~~ 564 (583)
|..++..||-|+.|... +..+|++|+.
T Consensus 2 ~kKRrN~GR~K~~rGhv-------~~V~C~nCgr 28 (108)
T PTZ00172 2 TSKRRNNGRSKHGRGHV-------KPVRCSNCGR 28 (108)
T ss_pred CcccccCCCCCCCCCCC-------ccEEeCCccc
Confidence 55677788887664433 5789999997
No 35
>PF12762 DDE_Tnp_IS1595: ISXO2-like transposase domain; InterPro: IPR024445 This domain probably functions as an integrase that is found in a wide variety of transposases, including ISXO2.
Probab=65.85 E-value=15 Score=32.32 Aligned_cols=69 Identities=19% Similarity=0.197 Sum_probs=43.2
Q ss_pred cEEEeeceEeeccc--------------ccEEEEEeeecCC-CCeeEEEEEEeeccccchHHHHHHHHHHHhcccccCCC
Q 007973 200 PIVSIGGIQLKSKY--------------LGTLLSATSFDAD-GGLFPIAFGVIDVENDESWMWFLSEFHKALEIHAESMP 264 (583)
Q Consensus 200 ~vv~~D~t~~~~~y--------------~~~l~~~~g~d~~-~~~~~~a~~~~~~E~~~~~~w~l~~l~~~~~~~~~~~~ 264 (583)
.+|-+|.||..++- .....++++++-+ +..--+-..++++.+.++..-+++... .+
T Consensus 4 G~VEiDEty~~~~~~~~~~~~~~~gr~~~~k~~V~~~ver~~~~~~~~~~~~v~~~~~~tl~~~i~~~i---------~~ 74 (151)
T PF12762_consen 4 GIVEIDETYFGGRKNKKPRRKGKRGRGSKNKVPVFGAVERNDGGTGRVFMFVVPDRSAETLKPIIQEHI---------EP 74 (151)
T ss_pred CEEEeCcCEECCcccccccCCCCCCCcCCCCcEEEEEEeecccCCceEEEEeecccccchhHHHHHHhh---------hc
Confidence 47888888876432 2233445555544 444444555667888888877765442 34
Q ss_pred CeEEEccCcccHH
Q 007973 265 QLTFISDGQKGIA 277 (583)
Q Consensus 265 ~~~iitD~~~~l~ 277 (583)
..+|+||...+-.
T Consensus 75 gs~i~TD~~~aY~ 87 (151)
T PF12762_consen 75 GSTIITDGWRAYN 87 (151)
T ss_pred cceeeecchhhcC
Confidence 5689999998764
No 36
>PRK09409 IS2 transposase TnpB; Reviewed
Probab=63.57 E-value=38 Score=33.86 Aligned_cols=147 Identities=12% Similarity=0.093 Sum_probs=87.4
Q ss_pred HHHHHHHHHhhcCCCCChHHHHHHHHHHh---Cc-ccCHHHHHHHHHHHHHHHhCChHhhhcchHHHHHHHhhhCCCCEE
Q 007973 94 WIVSFIEERLRDNINYKPKDILQDIHKQY---GI-IIPYKQAWRAKERGLAAIYGSSEEGYCLLPSYCEQIKRTNPGSIA 169 (583)
Q Consensus 94 ~la~~~~~~l~~~~~~~p~~i~~~l~~~~---g~-~~~~~~~~~~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~ 169 (583)
.+...+.+.....+..+.+.|...|+++. |+ .++..+++|..+.+ |- .. ..+...+.+.
T Consensus 51 ~l~~~I~~i~~~~~~yG~Rri~~~L~~~g~~~g~~~v~~k~V~RlMr~~-----Gl--------~~---~~~~~~~~~~- 113 (301)
T PRK09409 51 DVLLRIHHVIGELPTYGYRRVWALLRRQAELDGMPAINAKRVYRIMRQN-----AL--------LL---ERKPAVPPSK- 113 (301)
T ss_pred HHHHHHHHHHHhCccCCHHHHHHHHHhhhcccCccccCHHHHHHHHHHc-----CC--------cc---cccCCCCCCC-
Confidence 44555666656678899999999998752 66 58888888776543 10 00 0000000000
Q ss_pred EEEecCCCCceEEEEEehHhHHHhhhcCcccEEEeeceEeecccccEEEEEeeecCCCCeeEEEEEEeec-cccchHHHH
Q 007973 170 EVFTTGADNRFQRLFVSFNASIYGFLNGCLPIVSIGGIQLKSKYLGTLLSATSFDADGGLFPIAFGVIDV-ENDESWMWF 248 (583)
Q Consensus 170 ~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~~vv~~D~t~~~~~y~~~l~~~~g~d~~~~~~~~a~~~~~~-E~~~~~~w~ 248 (583)
......| . ...-..|++.|-||....-++-++.++.+|...+ .+|||++... .+.+.-.-+
T Consensus 114 ----~~~~~~~----~---------~~~pN~~W~tDiT~~~~~~g~~~Yl~~ViD~~sR-~ivg~~~s~~~~~~~~v~~~ 175 (301)
T PRK09409 114 ----RAHTGRV----A---------VKESNQRWCSDGFEFCCDNGERLRVTFALDCCDR-EALHWAVTTGGFNSETVQDV 175 (301)
T ss_pred ----CCCCCCc----C---------CCCCCCEEEeeeEEEEeCCCCEEEEEEEeecccc-eEEEEEeccCCCCHHHHHHH
Confidence 0000111 0 1123468999999976544556888888898877 6889999875 566666666
Q ss_pred HHH-HHHHhcccccCCCCeEEEccCcccH
Q 007973 249 LSE-FHKALEIHAESMPQLTFISDGQKGI 276 (583)
Q Consensus 249 l~~-l~~~~~~~~~~~~~~~iitD~~~~l 276 (583)
|+. +....+. .....|.+|.||+....
T Consensus 176 l~~a~~~~~~~-~~~~~~~iihSDrGsqy 203 (301)
T PRK09409 176 MLGAVERRFGN-DLPSSPVEWLTDNGSCY 203 (301)
T ss_pred HHHHHHHHhcc-CCCCCCcEEecCCCccc
Confidence 654 3333320 00235788999998654
No 37
>COG4279 Uncharacterized conserved protein [Function unknown]
Probab=61.07 E-value=4.7 Score=38.12 Aligned_cols=24 Identities=33% Similarity=0.707 Sum_probs=19.6
Q ss_pred CcccccCCccccCCCchhHHHHHHhcC
Q 007973 455 THCCSCRDWQLYGIPCSHAVAALISCR 481 (583)
Q Consensus 455 ~~~CsC~~~~~~giPC~H~lav~~~~~ 481 (583)
...|||..+ -.||.|+-||..+..
T Consensus 124 ~~dCSCPD~---anPCKHi~AvyY~la 147 (266)
T COG4279 124 STDCSCPDY---ANPCKHIAAVYYLLA 147 (266)
T ss_pred ccccCCCCc---ccchHHHHHHHHHHH
Confidence 356999986 469999999998863
No 38
>COG4830 RPS26B Ribosomal protein S26 [Translation, ribosomal structure and biogenesis]
Probab=60.62 E-value=5.9 Score=31.37 Aligned_cols=44 Identities=32% Similarity=0.511 Sum_probs=28.7
Q ss_pred CCCCCCCCCCCCccccccccCCccCceeCCccCc-CCCCCCCCCCccccccc
Q 007973 531 PKFRRPPGRPEKKRICLEDLNREKHTVHCSRCNQ-TGHYKTTCKAEIMKSIE 581 (583)
Q Consensus 531 p~~~r~~GRpkk~R~~~~~~~~~k~~~~Cs~C~~-~gHn~~tC~~~~~~~~~ 581 (583)
|..++.+||.|+.|... .-.+|-+|+. .--.+.-|-..+-.-||
T Consensus 2 pkkR~N~GR~K~~rGhv-------~~v~CdnCg~~vPkdKAikr~~i~s~Ve 46 (108)
T COG4830 2 PKKRRNRGRNKKGRGHV-------KYVRCDNCGKAVPKDKAIKRTAIRSPVE 46 (108)
T ss_pred cchhhhcCCCCCCCCCc-------cceeeccccccCCccceeeEeeccCccc
Confidence 66778888888775433 4678999998 55555555555544444
No 39
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=59.31 E-value=4.9 Score=36.51 Aligned_cols=20 Identities=20% Similarity=0.579 Sum_probs=12.3
Q ss_pred Cccc-ccCCccccCCCchhHH
Q 007973 455 THCC-SCRDWQLYGIPCSHAV 474 (583)
Q Consensus 455 ~~~C-sC~~~~~~giPC~H~l 474 (583)
...| .|+.-.+..-=|.|.|
T Consensus 60 ~~~C~nCg~~GH~~~DCP~~i 80 (190)
T COG5082 60 NPVCFNCGQNGHLRRDCPHSI 80 (190)
T ss_pred ccccchhcccCcccccCChhH
Confidence 4555 6776666666666644
No 40
>PF13917 zf-CCHC_3: Zinc knuckle
Probab=58.94 E-value=5.9 Score=26.46 Aligned_cols=20 Identities=35% Similarity=0.958 Sum_probs=17.5
Q ss_pred CceeCCccCcCCCCCCCCCC
Q 007973 555 HTVHCSRCNQTGHYKTTCKA 574 (583)
Q Consensus 555 ~~~~Cs~C~~~gHn~~tC~~ 574 (583)
....|.+|++.||-..-||+
T Consensus 3 ~~~~CqkC~~~GH~tyeC~~ 22 (42)
T PF13917_consen 3 ARVRCQKCGQKGHWTYECPN 22 (42)
T ss_pred CCCcCcccCCCCcchhhCCC
Confidence 34679999999999999994
No 41
>PF13592 HTH_33: Winged helix-turn helix
Probab=58.01 E-value=18 Score=26.25 Aligned_cols=31 Identities=19% Similarity=0.330 Sum_probs=26.6
Q ss_pred CCCChHHHHHHHHHHhCcccCHHHHHHHHHH
Q 007973 107 INYKPKDILQDIHKQYGIIIPYKQAWRAKER 137 (583)
Q Consensus 107 ~~~~p~~i~~~l~~~~g~~~~~~~~~~~~~~ 137 (583)
.-.+..+|.+.|.+.||+.+|.+.+++...+
T Consensus 3 ~~wt~~~i~~~I~~~fgv~ys~~~v~~lL~r 33 (60)
T PF13592_consen 3 GRWTLKEIAAYIEEEFGVKYSPSGVYRLLKR 33 (60)
T ss_pred CcccHHHHHHHHHHHHCCEEcHHHHHHHHHH
Confidence 4567899999999999999999998877654
No 42
>COG5179 TAF1 Transcription initiation factor TFIID, subunit TAF1 [Transcription]
Probab=55.76 E-value=7.4 Score=41.48 Aligned_cols=25 Identities=40% Similarity=0.812 Sum_probs=18.6
Q ss_pred cCCccCceeCCccCcCCCCCC--CCCC
Q 007973 550 LNREKHTVHCSRCNQTGHYKT--TCKA 574 (583)
Q Consensus 550 ~~~~k~~~~Cs~C~~~gHn~~--tC~~ 574 (583)
+++...+++|++||+.||-+. .||.
T Consensus 931 ~GRK~Ttr~C~nCGQvGHmkTNK~CP~ 957 (968)
T COG5179 931 KGRKNTTRTCGNCGQVGHMKTNKACPK 957 (968)
T ss_pred CCCCCcceecccccccccccccccCcc
Confidence 344445899999999999764 5664
No 43
>PF01283 Ribosomal_S26e: Ribosomal protein S26e; InterPro: IPR000892 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. One of these families, the S26E family, includes mammalian S26; Octopus S26; Drosophila S26 (DS31); plant cytoplasmic S26; and fungal S26. These proteins have 114 to 127 amino acids.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005622 intracellular, 0005840 ribosome; PDB: 3U5G_a 3U5C_a 2XZM_5 2XZN_5.
Probab=48.82 E-value=11 Score=31.07 Aligned_cols=27 Identities=37% Similarity=0.671 Sum_probs=12.9
Q ss_pred CCCCCCCCCCCCccccccccCCccCceeCCccCc
Q 007973 531 PKFRRPPGRPEKKRICLEDLNREKHTVHCSRCNQ 564 (583)
Q Consensus 531 p~~~r~~GRpkk~R~~~~~~~~~k~~~~Cs~C~~ 564 (583)
|..+|..||-|+.|... ...+|.+|+.
T Consensus 2 ~~KRrN~Gr~KkgrGhv-------~~V~C~nCgr 28 (113)
T PF01283_consen 2 TKKRRNNGRSKKGRGHV-------QPVRCDNCGR 28 (113)
T ss_dssp ----TTTTSS-SSSS----------EEE-TTTB-
T ss_pred CcccccCCCCCCCCCCC-------cCEeeCcccc
Confidence 45667777777664433 5789999986
No 44
>PRK13907 rnhA ribonuclease H; Provisional
Probab=45.86 E-value=1.1e+02 Score=25.79 Aligned_cols=78 Identities=12% Similarity=0.093 Sum_probs=42.5
Q ss_pred EEEeeceEeecccccEEEEEeeecCCCCeeEEEE-EEeeccccchHHHHHHHHHHHhcccccCCCCeEEEccCcccHHHH
Q 007973 201 IVSIGGIQLKSKYLGTLLSATSFDADGGLFPIAF-GVIDVENDESWMWFLSEFHKALEIHAESMPQLTFISDGQKGIADA 279 (583)
Q Consensus 201 vv~~D~t~~~~~y~~~l~~~~g~d~~~~~~~~a~-~~~~~E~~~~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~A 279 (583)
.|.+||.+..+.-.+-.-.++ .|..+.. .+++ .-..+.+..-|.-++..|+.+... +..++.|.||. ..+.++
T Consensus 3 ~iy~DGa~~~~~g~~G~G~vi-~~~~~~~-~~~~~~~~~tn~~AE~~All~aL~~a~~~---g~~~v~i~sDS-~~vi~~ 76 (128)
T PRK13907 3 EVYIDGASKGNPGPSGAGVFI-KGVQPAV-QLSLPLGTMSNHEAEYHALLAALKYCTEH---NYNIVSFRTDS-QLVERA 76 (128)
T ss_pred EEEEeeCCCCCCCccEEEEEE-EECCeeE-EEEecccccCCcHHHHHHHHHHHHHHHhC---CCCEEEEEech-HHHHHH
Confidence 377898887653222111111 4444433 2332 122344566677788888777652 34567788886 445555
Q ss_pred hhhcC
Q 007973 280 VRRKF 284 (583)
Q Consensus 280 i~~vf 284 (583)
+...+
T Consensus 77 ~~~~~ 81 (128)
T PRK13907 77 VEKEY 81 (128)
T ss_pred HhHHH
Confidence 55543
No 45
>PF01498 HTH_Tnp_Tc3_2: Transposase; InterPro: IPR002492 Transposase proteins are necessary for efficient DNA transposition. This family includes the amino-terminal region of Tc1, Tc1A, Tc1B and Tc2B transposases of Caenorhabditis elegans. The region encompasses the specific DNA binding and second DNA recognition domains as well as an amino-terminal region of the catalytic domain of Tc3. Tc3 is a member of the Tc1/mariner family of transposable elements. This entry also includes histone-lysine N-methyltransferase SETMAR, which is a SET domain and mariner transposase fusion gene-containing protein. This histone methyltransferase has sequence-specific DNA-binding activity and recognises the 19-mer core of the 5'-terminal inverted repeats (TIRs) of the Hsmar1 element. This protein has DNA nicking activity, and has in vivo end joining activity and may mediate genomic integration of foreign DNA. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated, 0015074 DNA integration; PDB: 3K9K_B 3F2K_B 3K9J_B 1U78_A.
Probab=44.95 E-value=22 Score=26.76 Aligned_cols=38 Identities=21% Similarity=0.330 Sum_probs=17.2
Q ss_pred HHHHhhcCCCCChHHHHHHHHHHhCcccCHHHHHHHHHH
Q 007973 99 IEERLRDNINYKPKDILQDIHKQYGIIIPYKQAWRAKER 137 (583)
Q Consensus 99 ~~~~l~~~~~~~p~~i~~~l~~~~g~~~~~~~~~~~~~~ 137 (583)
+...+..+|..+..+|...+.+. |..+|..++++....
T Consensus 4 I~~~v~~~p~~s~~~i~~~l~~~-~~~vS~~TI~r~L~~ 41 (72)
T PF01498_consen 4 IVRMVRRNPRISAREIAQELQEA-GISVSKSTIRRRLRE 41 (72)
T ss_dssp ------------HHHHHHHT----T--S-HHHHHHHHHH
T ss_pred HHHHHHHCCCCCHHHHHHHHHHc-cCCcCHHHHHHHHHH
Confidence 44566688999999999999988 999999998877644
No 46
>PHA00689 hypothetical protein
Probab=44.38 E-value=13 Score=25.31 Aligned_cols=15 Identities=47% Similarity=1.057 Sum_probs=11.7
Q ss_pred CccCceeCCccCcCC
Q 007973 552 REKHTVHCSRCNQTG 566 (583)
Q Consensus 552 ~~k~~~~Cs~C~~~g 566 (583)
+..|..+|++||++|
T Consensus 13 qepravtckrcgktg 27 (62)
T PHA00689 13 QEPRAVTCKRCGKTG 27 (62)
T ss_pred cCcceeehhhccccC
Confidence 445778999999876
No 47
>PF08069 Ribosomal_S13_N: Ribosomal S13/S15 N-terminal domain; InterPro: IPR012606 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. This domain is found at the N terminus of ribosomal S13 and S15 proteins. This domain is also identified as NUC021.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005840 ribosome; PDB: 3U5C_N 3O30_G 3IZB_O 3O2Z_G 3U5G_N 2XZN_O 2XZM_O 3IZ6_O.
Probab=41.21 E-value=30 Score=25.12 Aligned_cols=34 Identities=26% Similarity=0.455 Sum_probs=24.6
Q ss_pred hhhHH---HHHHHHHhh--cCCCCChHHHHHHHHHHhCc
Q 007973 91 SVDWI---VSFIEERLR--DNINYKPKDILQDIHKQYGI 124 (583)
Q Consensus 91 s~~~l---a~~~~~~l~--~~~~~~p~~i~~~l~~~~g~ 124 (583)
.+.|+ ++++.+.|. +..|++|.+|--.|+++||+
T Consensus 22 ~P~W~~~~~~eVe~~I~klakkG~tpSqIG~iLRD~~GI 60 (60)
T PF08069_consen 22 PPSWLKYSPEEVEELIVKLAKKGLTPSQIGVILRDQYGI 60 (60)
T ss_dssp --TT--S-HHHHHHHHHHHCCTTHCHHHHHHHHHHSCTC
T ss_pred CCCCcCCCHHHHHHHHHHHHHcCCCHHHhhhhhhhccCC
Confidence 45565 356666665 56799999999999999985
No 48
>PF13276 HTH_21: HTH-like domain
Probab=39.91 E-value=75 Score=22.78 Aligned_cols=43 Identities=19% Similarity=0.367 Sum_probs=34.3
Q ss_pred HHHHHHHHhhcC-CCCChHHHHHHHHHHhCcccCHHHHHHHHHH
Q 007973 95 IVSFIEERLRDN-INYKPKDILQDIHKQYGIIIPYKQAWRAKER 137 (583)
Q Consensus 95 la~~~~~~l~~~-~~~~p~~i~~~l~~~~g~~~~~~~~~~~~~~ 137 (583)
+...+.+....+ +.++...|...|.++.|+.+|..++++..+.
T Consensus 6 l~~~I~~i~~~~~~~yG~rri~~~L~~~~~~~v~~krV~RlM~~ 49 (60)
T PF13276_consen 6 LRELIKEIFKESKPTYGYRRIWAELRREGGIRVSRKRVRRLMRE 49 (60)
T ss_pred HHHHHHHHHHHcCCCeehhHHHHHHhccCcccccHHHHHHHHHH
Confidence 445566666654 7899999999999998999999999887654
No 49
>PRK00766 hypothetical protein; Provisional
Probab=39.05 E-value=1.8e+02 Score=27.01 Aligned_cols=91 Identities=11% Similarity=0.095 Sum_probs=51.3
Q ss_pred cEEEee-ceEeecccccEEEEEeeecCCCCeeEEEEEEeeccccchHHHHHHHHHHHhc---------------------
Q 007973 200 PIVSIG-GIQLKSKYLGTLLSATSFDADGGLFPIAFGVIDVENDESWMWFLSEFHKALE--------------------- 257 (583)
Q Consensus 200 ~vv~~D-~t~~~~~y~~~l~~~~g~d~~~~~~~~a~~~~~~E~~~~~~w~l~~l~~~~~--------------------- 257 (583)
.|++|| +.|..+.-+-..++-+-+-++.-..-++|+.+.-.-.|.=.-+.+.++....
T Consensus 10 rvlGidds~f~~~~~~~~~lvGvv~r~~~~idGv~~~~itvdG~DaT~~i~~mv~~~~~r~~i~~V~L~Git~agFNvvD 89 (194)
T PRK00766 10 RVLGIDDGTFLFKSSEKVILVGVVMRGGDWVDGVLSRWITVDGLDATEAIIEMVNSSRHKGQLRVIMLDGITYGGFNVVD 89 (194)
T ss_pred eEEEEecCccccCCCCCEEEEEEEEECCeEEeeEEEEEEEECCccHHHHHHHHHHhcccccceEEEEECCEeeeeeEEec
Confidence 578887 4444432234555555555666666677777766666666666666554211
Q ss_pred ---ccccCCCCeEEEccCcc---cHHHHhhhcCCCCccc
Q 007973 258 ---IHAESMPQLTFISDGQK---GIADAVRRKFPNSSLA 290 (583)
Q Consensus 258 ---~~~~~~~~~~iitD~~~---~l~~Ai~~vfP~a~~~ 290 (583)
++.+..-|..+++...+ +|.+||++-||+...+
T Consensus 90 ~~~l~~~tg~PVI~V~r~~p~~~~ie~AL~k~f~~~~~R 128 (194)
T PRK00766 90 IEELYRETGLPVIVVMRKKPDFEAIESALKKHFSDWEER 128 (194)
T ss_pred HHHHHHHHCCCEEEEEecCCCHHHHHHHHHHHCCCHHHH
Confidence 00011345555533333 6788888888886554
No 50
>PF08766 DEK_C: DEK C terminal domain; InterPro: IPR014876 DEK is a chromatin associated protein that is linked with cancers and autoimmune disease. This domain is found at the C-terminal of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients. The structure of this domain shows it to be homologous to the E2F/DP transcription factor family. This domain is also found in chitin synthase proteins like Q8TF96 from SWISSPROT, and in protein phosphatases such as Q6NN85 from SWISSPROT.; PDB: 1Q1V_A.
Probab=37.55 E-value=76 Score=22.37 Aligned_cols=38 Identities=18% Similarity=0.348 Sum_probs=23.9
Q ss_pred HHHHHHHHHhhc-C-CCCChHHHHHHHHHHhCcccCHHHH
Q 007973 94 WIVSFIEERLRD-N-INYKPKDILQDIHKQYGIIIPYKQA 131 (583)
Q Consensus 94 ~la~~~~~~l~~-~-~~~~p~~i~~~l~~~~g~~~~~~~~ 131 (583)
-+...+.+.++. + ..++.++|...|.+.+|..++..+.
T Consensus 4 ~i~~~i~~iL~~~dl~~vT~k~vr~~Le~~~~~dL~~~K~ 43 (54)
T PF08766_consen 4 EIREAIREILREADLDTVTKKQVREQLEERFGVDLSSRKK 43 (54)
T ss_dssp HHHHHHHHHHTTS-GGG--HHHHHHHHHHH-SS--SHHHH
T ss_pred HHHHHHHHHHHhCCHhHhhHHHHHHHHHHHHCCCcHHHHH
Confidence 345566777773 2 4689999999999999999996543
No 51
>PF13082 DUF3931: Protein of unknown function (DUF3931)
Probab=37.40 E-value=92 Score=21.54 Aligned_cols=28 Identities=14% Similarity=0.067 Sum_probs=20.9
Q ss_pred cccEEEEEeeecCCCCeeEEEEEEeecc
Q 007973 213 YLGTLLSATSFDADGGLFPIAFGVIDVE 240 (583)
Q Consensus 213 y~~~l~~~~g~d~~~~~~~~a~~~~~~E 240 (583)
|...-++++|-.++|+..++...+..+|
T Consensus 35 yefssfvlcgetpdgrrlvlthmistde 62 (66)
T PF13082_consen 35 YEFSSFVLCGETPDGRRLVLTHMISTDE 62 (66)
T ss_pred EEEEEEEEEccCCCCcEEEEEEEecchh
Confidence 3445578889999999888877776655
No 52
>PRK14892 putative transcription elongation factor Elf1; Provisional
Probab=34.99 E-value=28 Score=28.28 Aligned_cols=9 Identities=33% Similarity=1.073 Sum_probs=6.3
Q ss_pred CceeCCccC
Q 007973 555 HTVHCSRCN 563 (583)
Q Consensus 555 ~~~~Cs~C~ 563 (583)
....|.+|+
T Consensus 20 t~f~CP~Cg 28 (99)
T PRK14892 20 KIFECPRCG 28 (99)
T ss_pred cEeECCCCC
Confidence 456688887
No 53
>PRK12286 rpmF 50S ribosomal protein L32; Reviewed
Probab=34.70 E-value=37 Score=24.43 Aligned_cols=32 Identities=25% Similarity=0.513 Sum_probs=20.6
Q ss_pred CCCCCCCCCCCCccccccccCCccCceeCCccCc
Q 007973 531 PKFRRPPGRPEKKRICLEDLNREKHTVHCSRCNQ 564 (583)
Q Consensus 531 p~~~r~~GRpkk~R~~~~~~~~~k~~~~Cs~C~~ 564 (583)
|..+.++.|..++|... .........|+.||+
T Consensus 4 PKrk~S~srr~~RRsh~--~l~~~~l~~C~~CG~ 35 (57)
T PRK12286 4 PKRKTSKSRKRKRRAHF--KLKAPGLVECPNCGE 35 (57)
T ss_pred CcCcCChhhcchhcccc--cccCCcceECCCCCC
Confidence 55666666666666552 233445677999998
No 54
>PF11645 PDDEXK_5: PD-(D/E)XK endonuclease; InterPro: IPR021671 This family consists of putative endonuclease proteins restricted to Synechocystis.; PDB: 2OST_D.
Probab=34.02 E-value=1.4e+02 Score=25.97 Aligned_cols=54 Identities=17% Similarity=0.258 Sum_probs=41.4
Q ss_pred HHHHHHHHHHHHhcceeEEEEeecCeEEEEEeecCCCceEEEEE---EeCCCCceEE
Q 007973 17 KAFRNAIKEAAIAQHFELRIIKSDLIRYFAKCVTEGCPWRIRAV---KLPNAPTFTI 70 (583)
Q Consensus 17 ~e~~~ai~~ya~~~gf~~~~~ks~~~r~~~~C~~~gC~wrv~~~---~~~~~~~~~v 70 (583)
++...++...++..|+.+-+.-.+..+|.++=..+||-|||-++ ...+.+.+.+
T Consensus 7 Dite~~ii~~ll~~GY~V~~P~gDn~~YDLV~d~eg~L~RIQvKT~w~s~~~g~~~v 63 (149)
T PF11645_consen 7 DITEAKIINRLLEKGYSVSIPFGDNLKYDLVFDKEGILWRIQVKTGWFSDDRGIYVV 63 (149)
T ss_dssp HHHHHHHHHHHHHTT-EEEEESSTTSS-SEEEEETTEEEEEEEEE-EEETTTSCEEE
T ss_pred hHHHHHHHHHHHHcCcEEEeecCCCCCcCEEEecCCcEEEEEEeeEEEecCceEEEE
Confidence 56778888999999999999999998888777799999999886 2333444443
No 55
>PF04800 ETC_C1_NDUFA4: ETC complex I subunit conserved region; InterPro: IPR006885 This entry represents prokaryotic NADH-ubiquinone oxidoreductase subunits (1.6.5.3 from EC, 1.6.99.3 from EC) from complex I of the electron transport chain initially identified in Neurospora crassa as a 21 kDa protein.; GO: 0016651 oxidoreductase activity, acting on NADH or NADPH, 0022900 electron transport chain, 0005743 mitochondrial inner membrane; PDB: 2JYA_A 2LJU_A.
Probab=33.74 E-value=53 Score=26.79 Aligned_cols=29 Identities=10% Similarity=0.075 Sum_probs=22.6
Q ss_pred ccCeeCCHHHHHHHHHHHHHhcceeEEEEeecC
Q 007973 9 VGQEFPDVKAFRNAIKEAAIAQHFELRIIKSDL 41 (583)
Q Consensus 9 ~G~~F~s~~e~~~ai~~ya~~~gf~~~~~ks~~ 41 (583)
+.+.|+|+|+|.. ||.++|..|.+..-..
T Consensus 51 v~l~F~skE~Ai~----yaer~G~~Y~V~~p~~ 79 (101)
T PF04800_consen 51 VRLKFDSKEDAIA----YAERNGWDYEVEEPKK 79 (101)
T ss_dssp CEEEESSHHHHHH----HHHHCT-EEEEE-STT
T ss_pred eEeeeCCHHHHHH----HHHHcCCeEEEeCCCC
Confidence 7889999999875 7899999999865544
No 56
>COG4715 Uncharacterized conserved protein [Function unknown]
Probab=32.82 E-value=1.4e+02 Score=32.09 Aligned_cols=26 Identities=27% Similarity=0.600 Sum_probs=20.4
Q ss_pred CcccccCCccccCCCchhHHHHHHhcCC
Q 007973 455 THCCSCRDWQLYGIPCSHAVAALISCRK 482 (583)
Q Consensus 455 ~~~CsC~~~~~~giPC~H~lav~~~~~~ 482 (583)
+..|||.. ...| -|.|+.||+...-.
T Consensus 72 ss~CTCP~-~~~g-aCKH~VAvvl~~~~ 97 (587)
T COG4715 72 SSICTCPY-GGSG-ACKHVVAVVLEYLD 97 (587)
T ss_pred CceeeCCC-CCCc-chHHHHHHHHHHhh
Confidence 68899987 4444 69999999987644
No 57
>PF14420 Clr5: Clr5 domain
Probab=32.30 E-value=1.3e+02 Score=21.31 Aligned_cols=26 Identities=12% Similarity=0.212 Sum_probs=22.6
Q ss_pred CCCCChHHHHHHHHHHhCcccCHHHH
Q 007973 106 NINYKPKDILQDIHKQYGIIIPYKQA 131 (583)
Q Consensus 106 ~~~~~p~~i~~~l~~~~g~~~~~~~~ 131 (583)
+.+.+..+|++.+...||..+|..+.
T Consensus 18 ~e~~tl~~v~~~M~~~~~F~at~rqy 43 (54)
T PF14420_consen 18 DENKTLEEVMEIMKEEHGFKATKRQY 43 (54)
T ss_pred hCCCcHHHHHHHHHHHhCCCcCHHHH
Confidence 67889999999999999999996553
No 58
>COG0626 MetC Cystathionine beta-lyases/cystathionine gamma-synthases [Amino acid transport and metabolism]
Probab=31.66 E-value=6.2e+02 Score=26.43 Aligned_cols=70 Identities=10% Similarity=-0.041 Sum_probs=38.0
Q ss_pred hHhHHHhhhcCcccEEEeeceEeecccccEEEE-------EeeecCCCCeeEEEEEEeeccccchHHHHHHHHHHHhc
Q 007973 187 FNASIYGFLNGCLPIVSIGGIQLKSKYLGTLLS-------ATSFDADGGLFPIAFGVIDVENDESWMWFLSEFHKALE 257 (583)
Q Consensus 187 ~~~~~~~~~~~~~~vv~~D~t~~~~~y~~~l~~-------~~g~d~~~~~~~~a~~~~~~E~~~~~~w~l~~l~~~~~ 257 (583)
+-+.+....+..+-++.+|.||.+.-+.-||-. ....=-+|+.-.+|= ++-.-+++-|..++......++
T Consensus 167 DI~~i~~~A~~~g~~vvVDNTfatP~~q~PL~~GaDIVvhSaTKyl~GHsDvl~G-~v~~~~~~~~~~~~~~~~~~~G 243 (396)
T COG0626 167 DIPAIARLAKAYGALVVVDNTFATPVLQRPLELGADIVVHSATKYLGGHSDVLGG-VVLTPNEELYELLFFAQRANTG 243 (396)
T ss_pred cHHHHHHHHHhcCCEEEEECCcccccccChhhcCCCEEEEeccccccCCcceeee-EEecChHHHHHHHHHHHHhhcC
Confidence 333344443345678999999999888776632 222222344434443 4444455556655454444344
No 59
>PF13877 RPAP3_C: Potential Monad-binding region of RPAP3
Probab=30.32 E-value=50 Score=26.34 Aligned_cols=33 Identities=12% Similarity=0.363 Sum_probs=27.3
Q ss_pred cCHHHHHHHHHHHHhcCHHHHHHHHhCCCCCcc
Q 007973 320 TTTMAFKERMGEIEDVSSEAAKWIQQYPPSHWA 352 (583)
Q Consensus 320 ~t~~~f~~~~~~l~~~~~~~~~~l~~~~~~~W~ 352 (583)
.|..||+..|..+.......++||..+.++...
T Consensus 5 ~~~~eF~~~w~~~~~~~~~~~~yL~~i~p~~l~ 37 (94)
T PF13877_consen 5 KNSYEFERDWRRLKKDPEERYEYLKSIPPDSLP 37 (94)
T ss_pred CCHHHHHHHHHHHcCCHHHHHHHHHhCChHHHH
Confidence 477899999999987667899999998776653
No 60
>PF09713 A_thal_3526: Plant protein 1589 of unknown function (A_thal_3526); InterPro: IPR006476 This plant-specific family of proteins is defined by an uncharacterised region 57 residues in length. It is found toward the N terminus of most proteins that contain it. Examples include at least several proteins from Arabidopsis thaliana (Mouse-ear cress) and Oryza sativa (Rice). The function of these proteins is unknown.
Probab=30.26 E-value=51 Score=23.40 Aligned_cols=22 Identities=14% Similarity=0.212 Sum_probs=16.9
Q ss_pred CCChHHHHHHHHHHhCcccCHH
Q 007973 108 NYKPKDILQDIHKQYGIIIPYK 129 (583)
Q Consensus 108 ~~~p~~i~~~l~~~~g~~~~~~ 129 (583)
.++..++++.|.++.++.+...
T Consensus 12 yMsk~E~v~~L~~~a~I~P~~T 33 (54)
T PF09713_consen 12 YMSKEECVRALQKQANIEPVFT 33 (54)
T ss_pred cCCHHHHHHHHHHHcCCChHHH
Confidence 5678889999988888776553
No 61
>PF14201 DUF4318: Domain of unknown function (DUF4318)
Probab=30.21 E-value=96 Score=23.68 Aligned_cols=31 Identities=23% Similarity=0.368 Sum_probs=26.4
Q ss_pred cCeeCCHHHHHHHHHHHHHhcceeEEEEeec
Q 007973 10 GQEFPDVKAFRNAIKEAAIAQHFELRIIKSD 40 (583)
Q Consensus 10 G~~F~s~~e~~~ai~~ya~~~gf~~~~~ks~ 40 (583)
--.|+|.+++-.+|..|+.++|-.+...+-+
T Consensus 11 ~~~yPs~e~i~~aIE~YC~~~~~~l~Fisr~ 41 (74)
T PF14201_consen 11 SPKYPSKEEICEAIEKYCIKNGESLEFISRD 41 (74)
T ss_pred CCCCCCHHHHHHHHHHHHHHcCCceEEEecC
Confidence 3458999999999999999999998876543
No 62
>PF08459 UvrC_HhH_N: UvrC Helix-hairpin-helix N-terminal; InterPro: IPR001162 During the process of Escherichia coli nucleotide excision repair, DNA damage recognition and processing are achieved by the action of the uvrA, uvrB, and uvrC gene products. The UvrC proteins contain 4 conserved regions: a central region which interacts with UvrB (Uvr domain), a Helix hairpin Helix (HhH) domain important for 5' incision of damaged DNA and the homology regions 1 and 2 of unknown function. UvrC homology region 2 is specific for UvrC proteins, whereas UvrC homology region 1 is also shared by a few other nucleases. Proteins that contain the UvrC homology region 1, IPR000305 from INTERPRO, are listed below: Prokaryotic UvrC proteins. Bacteriophage T4 END2 protein. Small subunit of ribonucleotide reductase enzyme. T4 TEV1 protein. Endonuclease specific to the thymidylate synthase (td) gene splice junction. Found in putative intron-homing endonucleases encoded by group I introns of fungi and phage. Mycobacterium hypothetical protein Y002. Exonuclease by similarity. Bacillus subtilis hypothetical protein YURQ.; GO: 0003677 DNA binding, 0004518 nuclease activity, 0006289 nucleotide-excision repair; PDB: 3C65_A 2NRZ_A 2NRR_A 2NRX_A 2NRV_A 2NRT_A 2NRW_A.
Probab=29.94 E-value=92 Score=27.68 Aligned_cols=67 Identities=19% Similarity=0.255 Sum_probs=41.4
Q ss_pred EEEeeceEeecccccEEEEEeeecCCCCeeEEEEEEeeccccchHHHHHHHHHHHhccccc--CCCCeEEEccCcccHHH
Q 007973 201 IVSIGGIQLKSKYLGTLLSATSFDADGGLFPIAFGVIDVENDESWMWFLSEFHKALEIHAE--SMPQLTFISDGQKGIAD 278 (583)
Q Consensus 201 vv~~D~t~~~~~y~~~l~~~~g~d~~~~~~~~a~~~~~~E~~~~~~w~l~~l~~~~~~~~~--~~~~~~iitD~~~~l~~ 278 (583)
|++.||-+.++.|+- |-+-+.+..++|.-+-+.+.+....... ...|-.|+.|+.++-.+
T Consensus 32 Vvf~~G~~~k~~YR~------------------f~i~~~~~~dDy~~M~Evl~RR~~~~~~~~~~lPDLilIDGG~gQl~ 93 (155)
T PF08459_consen 32 VVFENGKPDKSEYRR------------------FNIKTVDGGDDYAAMREVLTRRFKRLKEEKEPLPDLILIDGGKGQLN 93 (155)
T ss_dssp EEEETTEE-GGG-EE------------------EEEE--STT-HHHHHHHHHHHHHCCCHHHT----SEEEESSSHHHHH
T ss_pred EEEECCccChhhCce------------------EecCCCCCCcHHHHHHHHHHHHHhcccccCCCCCCEEEEcCCHHHHH
Confidence 566777777777764 3333345568898888888887752111 14688999999999988
Q ss_pred HhhhcCC
Q 007973 279 AVRRKFP 285 (583)
Q Consensus 279 Ai~~vfP 285 (583)
|..+++-
T Consensus 94 aa~~~l~ 100 (155)
T PF08459_consen 94 AAKEVLK 100 (155)
T ss_dssp HHHHHHH
T ss_pred HHHHHHH
Confidence 8887653
No 63
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=28.10 E-value=34 Score=31.25 Aligned_cols=17 Identities=35% Similarity=1.036 Sum_probs=7.3
Q ss_pred CCccCcCCCCCCCCCCc
Q 007973 559 CSRCNQTGHYKTTCKAE 575 (583)
Q Consensus 559 Cs~C~~~gHn~~tC~~~ 575 (583)
|-.|+..||--..|+.+
T Consensus 157 cy~c~~~~H~~~dc~~~ 173 (190)
T COG5082 157 CYSCGSAGHFGDDCKEP 173 (190)
T ss_pred ccccCCccccCCCCCCC
Confidence 44444444444444433
No 64
>PF13551 HTH_29: Winged helix-turn helix
Probab=27.72 E-value=1.4e+02 Score=24.25 Aligned_cols=41 Identities=20% Similarity=0.332 Sum_probs=31.1
Q ss_pred HHHHHHhhcCC-----CCChHHHHHHH-HHHhCcccCHHHHHHHHHH
Q 007973 97 SFIEERLRDNI-----NYKPKDILQDI-HKQYGIIIPYKQAWRAKER 137 (583)
Q Consensus 97 ~~~~~~l~~~~-----~~~p~~i~~~l-~~~~g~~~~~~~~~~~~~~ 137 (583)
..+.+.+..+| .+++..|.+.| .+.+|+.+|.+++++.-++
T Consensus 64 ~~l~~~~~~~p~~g~~~~t~~~l~~~l~~~~~~~~~s~~ti~r~L~~ 110 (112)
T PF13551_consen 64 AQLIELLRENPPEGRSRWTLEELAEWLIEEEFGIDVSPSTIRRILKR 110 (112)
T ss_pred HHHHHHHHHCCCCCCCcccHHHHHHHHHHhccCccCCHHHHHHHHHH
Confidence 34556666555 47889999966 8899999999999877543
No 65
>PF13719 zinc_ribbon_5: zinc-ribbon domain
Probab=27.45 E-value=31 Score=22.25 Aligned_cols=14 Identities=21% Similarity=0.691 Sum_probs=10.7
Q ss_pred CCccCceeCCccCc
Q 007973 551 NREKHTVHCSRCNQ 564 (583)
Q Consensus 551 ~~~k~~~~Cs~C~~ 564 (583)
....++.+|++|+.
T Consensus 20 ~~~~~~vrC~~C~~ 33 (37)
T PF13719_consen 20 PAGGRKVRCPKCGH 33 (37)
T ss_pred ccCCcEEECCCCCc
Confidence 34457899999985
No 66
>PF14952 zf-tcix: Putative treble-clef, zinc-finger, Zn-binding
Probab=26.99 E-value=41 Score=22.57 Aligned_cols=22 Identities=23% Similarity=0.690 Sum_probs=17.1
Q ss_pred cCceeCCccCc-CCCCCCCCCCc
Q 007973 554 KHTVHCSRCNQ-TGHYKTTCKAE 575 (583)
Q Consensus 554 k~~~~Cs~C~~-~gHn~~tC~~~ 575 (583)
+..++|..||. -|+-.-.|+++
T Consensus 9 RGirkCp~CGt~NG~R~~~CKN~ 31 (44)
T PF14952_consen 9 RGIRKCPKCGTYNGTRGLSCKNK 31 (44)
T ss_pred hccccCCcCcCccCcccccccCC
Confidence 45788999999 67777678875
No 67
>TIGR01031 rpmF_bact ribosomal protein L32. This protein describes bacterial ribosomal protein L32. The noise cutoff is set low enough to include the equivalent protein from mitochondria and chloroplasts. No related proteins from the Archaea nor from the eukaryotic cytosol are detected by this model. This model is a fragment model; the putative L32 of some species shows similarity only toward the N-terminus.
Probab=26.88 E-value=67 Score=22.93 Aligned_cols=42 Identities=21% Similarity=0.347 Sum_probs=23.3
Q ss_pred CCCCCCCCCCCCccccccccCCccCceeCCccCcCCCCCCCCC
Q 007973 531 PKFRRPPGRPEKKRICLEDLNREKHTVHCSRCNQTGHYKTTCK 573 (583)
Q Consensus 531 p~~~r~~GRpkk~R~~~~~~~~~k~~~~Cs~C~~~gHn~~tC~ 573 (583)
|..+.++.|..++|....+ ........|+.||+.-..=+-||
T Consensus 2 PKrk~Sksr~~~RRah~~k-l~~p~l~~C~~cG~~~~~H~vc~ 43 (55)
T TIGR01031 2 PKRKTSKSRKRKRRSHDAK-LTAPTLVVCPNCGEFKLPHRVCP 43 (55)
T ss_pred CCCcCCcccccchhcCccc-ccCCcceECCCCCCcccCeeECC
Confidence 4555556666666555322 22345677999998443333443
No 68
>PF05741 zf-nanos: Nanos RNA binding domain; InterPro: IPR024161 Nanos is a highly conserved RNA-binding protein in higher eukaryotes and functions as a key regulatory protein in translational control using a 3' untranslated region during the development and maintenance of germ cells. Nanos comprises a non-conserved amino-terminus and highly conserved carboxy- terminal regions. The C-terminal region has two conserved Cys-Cys-His-Cys (CCHC)-type zinc-finger motifs that are indispensable for nanos function [, , ]. The structure of the nanos-type zinc finger is composed of two independent zinc-finger (ZF) lobes, the N-terminal ZF1 and the C-terminal ZF2, which are connected by a linker helix []. These lobes create a large cleft. Zinc ions in ZF1 and ZF2 are bound to the CCHC motif by tetrahedral coordination.; PDB: 3ALR_B.
Probab=25.57 E-value=25 Score=25.07 Aligned_cols=21 Identities=19% Similarity=0.504 Sum_probs=8.3
Q ss_pred CceeCCccCc---CCCCCCCCCCc
Q 007973 555 HTVHCSRCNQ---TGHYKTTCKAE 575 (583)
Q Consensus 555 ~~~~Cs~C~~---~gHn~~tC~~~ 575 (583)
|.+.|..||. ..|..+-||.+
T Consensus 32 r~y~Cp~CgAtGd~AHT~~yCP~k 55 (55)
T PF05741_consen 32 RKYVCPICGATGDNAHTIKYCPKK 55 (55)
T ss_dssp GG---TTT---GGG---GGG-TT-
T ss_pred hcCcCCCCcCcCccccccccCcCC
Confidence 5688999999 45677778863
No 69
>PF13717 zinc_ribbon_4: zinc-ribbon domain
Probab=25.36 E-value=38 Score=21.72 Aligned_cols=13 Identities=31% Similarity=0.856 Sum_probs=10.1
Q ss_pred CccCceeCCccCc
Q 007973 552 REKHTVHCSRCNQ 564 (583)
Q Consensus 552 ~~k~~~~Cs~C~~ 564 (583)
...++.+|++|+.
T Consensus 21 ~~g~~v~C~~C~~ 33 (36)
T PF13717_consen 21 PKGRKVRCSKCGH 33 (36)
T ss_pred CCCcEEECCCCCC
Confidence 3456899999986
No 70
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=24.99 E-value=51 Score=28.93 Aligned_cols=21 Identities=29% Similarity=0.702 Sum_probs=17.7
Q ss_pred ceeCCccCcCCCCCCCCCCcc
Q 007973 556 TVHCSRCNQTGHYKTTCKAEI 576 (583)
Q Consensus 556 ~~~Cs~C~~~gHn~~tC~~~~ 576 (583)
...|.+|++.||-.+.||.+.
T Consensus 52 ~~~C~~Cg~~GH~~~~Cp~~~ 72 (148)
T PTZ00368 52 ERSCYNCGKTGHLSRECPEAP 72 (148)
T ss_pred CcccCCCCCcCcCcccCCCcc
Confidence 457999999999999998865
No 71
>PF01316 Arg_repressor: Arginine repressor, DNA binding domain; InterPro: IPR020900 The arginine dihydrolase (AD) pathway is found in many prokaryotes and some primitive eukaryotes, an example of the latter being Giardia lamblia (Giardia intestinalis) []. The three-enzyme anaerobic pathway breaks down L-arginine to form 1 mol of ATP, carbon dioxide and ammonia. In simpler bacteria, the first enzyme, arginine deiminase, can account for up to 10% of total cell protein []. Most prokaryotic arginine deiminase pathways are under the control of a repressor gene, termed ArgR []. This is a negative regulator, and will only release the arginine deiminase operon for expression in the presence of arginine []. The crystal structure of apo-ArgR from Bacillus stearothermophilus has been determined to 2.5A by means of X-ray crystallography []. The protein exists as a hexamer of identical subunits, and is shown to have six DNA-binding domains, clustered around a central oligomeric core when bound to arginine. It predominantly interacts with A.T residues in ARG boxes. This hexameric protein binds DNA at its N terminus to repress arginine biosyntheis or activate arginine catabolism. Some species have several ArgR paralogs. In a neighbour-joining tree, some of these paralogous sequences show long branches and differ significantly from the well-conserved C-terminal region. ; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0006355 regulation of transcription, DNA-dependent, 0006525 arginine metabolic process; PDB: 1AOY_A 3V4G_A 3LAJ_D 3FHZ_A 3LAP_B 3ERE_D 2P5L_C 1F9N_D 2P5K_A 1B4A_A ....
Probab=24.03 E-value=1.9e+02 Score=21.78 Aligned_cols=40 Identities=13% Similarity=0.204 Sum_probs=26.5
Q ss_pred HHHHHhhcCCCCChHHHHHHHHHHhCcccCHHHHHHHHHHH
Q 007973 98 FIEERLRDNINYKPKDILQDIHKQYGIIIPYKQAWRAKERG 138 (583)
Q Consensus 98 ~~~~~l~~~~~~~p~~i~~~l~~~~g~~~~~~~~~~~~~~~ 138 (583)
.+++.|..+.-.+-.+|++.|.+. |+.+|-.++.|.-+.+
T Consensus 9 ~I~~li~~~~i~sQ~eL~~~L~~~-Gi~vTQaTiSRDLkeL 48 (70)
T PF01316_consen 9 LIKELISEHEISSQEELVELLEEE-GIEVTQATISRDLKEL 48 (70)
T ss_dssp HHHHHHHHS---SHHHHHHHHHHT-T-T--HHHHHHHHHHH
T ss_pred HHHHHHHHCCcCCHHHHHHHHHHc-CCCcchhHHHHHHHHc
Confidence 466777777777889999998775 9999999988776543
No 72
>PF10045 DUF2280: Uncharacterized conserved protein (DUF2280); InterPro: IPR018738 This entry is represented by Burkholderia phage Bups phi1, Orf2.36. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches.
Probab=23.99 E-value=88 Score=25.42 Aligned_cols=22 Identities=18% Similarity=0.581 Sum_probs=20.0
Q ss_pred ChHHHHHHHHHHhCcccCHHHH
Q 007973 110 KPKDILQDIHKQYGIIIPYKQA 131 (583)
Q Consensus 110 ~p~~i~~~l~~~~g~~~~~~~~ 131 (583)
+|+++.+.+++.||+.+|..++
T Consensus 21 TPs~v~~aVk~eFgi~vsrQqv 42 (104)
T PF10045_consen 21 TPSEVAEAVKEEFGIDVSRQQV 42 (104)
T ss_pred CHHHHHHHHHHHhCCccCHHHH
Confidence 7999999999999999997754
No 73
>TIGR01589 A_thal_3526 uncharacterized plant-specific domain TIGR01589. This model represents an uncharacterized plant-specific domain 57 residues in length. It is found toward the N-terminus of most proteins that contain it. Examples include at least 10 proteins from Arabidopsis thaliana and at least one from Oryza sativa.
Probab=23.93 E-value=73 Score=22.89 Aligned_cols=27 Identities=11% Similarity=0.256 Sum_probs=20.3
Q ss_pred CCChHHHHHHHHHHhCcccCHH-HHHHH
Q 007973 108 NYKPKDILQDIHKQYGIIIPYK-QAWRA 134 (583)
Q Consensus 108 ~~~p~~i~~~l~~~~g~~~~~~-~~~~~ 134 (583)
.++..++++.|.++.|+.+... .+|+.
T Consensus 15 yMsk~E~v~~L~~~a~I~P~~T~~VW~~ 42 (57)
T TIGR01589 15 YMSKEETVSFLFENAGISPKFTRFVWYL 42 (57)
T ss_pred HCCHHHHHHHHHHHcCCCchhHHHHHHH
Confidence 5788899999999998887764 34543
No 74
>PF08086 Toxin_17: Ergtoxin family; InterPro: IPR012622 The Ergtoxin (ErgTx) family is a class of peptides from scorpion venom that specifically block ERG (ether-a-go-go-related gene) K+ channels of the nerve, heart and endocrine cells [, , ]. Peptides of the ErgTx family have from 42 to 47 amino acid residues cross-linked by four disulphide bridges. The four disulphide bridges have been assigned as C1-C4, C2-C6, C3-C7 and C5-C8 (see the schematic representation below) []. ErgTxs consist of a triple-stranded beta-sheet and an alpha-helix, as is typical of K+ channel scorpion toxins. There is a large hydrophobic patch on the surface of the toxin, surrounding a central lysine residue located near the beta-hairpin loop between the second and third strands of the beta-sheet. It has been postulated that this hydrophobic patch is likely to form part of the binding surface of the toxin []. Peptides of the ErgTx family possess a Knottin scaffold (see http://knottin.cbs.cnrs.fr). Some proteins known to belong to the ErgTx family are listed below: ErgTx1, ErgTx2 and ErgTx3 from Centruroides elegans (Bark scorpion). ErgTx1, ErgTx2, ErgTx3 and ErgTx4 from Centruroides exilicauda (Bark scorpion). ErgTx1, ErgTx2 and ErgTx3 from Centruroides gracilis (Slenderbrown scorpion) (Florida bark scorpion). ErgTx1, ErgTx2, ErgTx3 and ErgTx4 from Centruroides limpidus limpidus (Mexican scorpion). ErgTx1, ErgTx2, ErgTx3, ErgTx4, ErgTx5 and gamma-KTx 4.12 from Centruroides sculpturatus (Bark scorpion). ErgTx, ErgTx2, ErgTx3, ErgTx4 and ErgTx5 from Centruroides noxius (Mexican scorpion). ; GO: 0019870 potassium channel inhibitor activity, 0009405 pathogenesis, 0005576 extracellular region; PDB: 1NE5_A 1PX9_A.
Probab=23.24 E-value=8.9 Score=24.28 Aligned_cols=11 Identities=45% Similarity=1.120 Sum_probs=7.9
Q ss_pred cCcCCCCCCCC
Q 007973 562 CNQTGHYKTTC 572 (583)
Q Consensus 562 C~~~gHn~~tC 572 (583)
|+..|||..||
T Consensus 24 ckkag~~~gtc 34 (41)
T PF08086_consen 24 CKKAGHRGGTC 34 (41)
T ss_dssp HHHHTSS-EEE
T ss_pred HHHhCCCCcee
Confidence 45599999888
No 75
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=23.02 E-value=55 Score=28.71 Aligned_cols=18 Identities=39% Similarity=0.890 Sum_probs=9.9
Q ss_pred eeCCccCcCCCCCCCCCC
Q 007973 557 VHCSRCNQTGHYKTTCKA 574 (583)
Q Consensus 557 ~~Cs~C~~~gHn~~tC~~ 574 (583)
..|-+|++.||..+.||.
T Consensus 130 ~~C~~Cg~~gH~~~dCp~ 147 (148)
T PTZ00368 130 KTCYNCGQTGHLSRDCPD 147 (148)
T ss_pred CccccCCCcCcccccCCC
Confidence 455555555555555554
No 76
>PF10122 Mu-like_Com: Mu-like prophage protein Com; InterPro: IPR019294 Members of this entry belong to the Com family of proteins that act as translational regulators of mom [, ].
Probab=22.79 E-value=51 Score=22.97 Aligned_cols=25 Identities=24% Similarity=0.469 Sum_probs=20.6
Q ss_pred CceeCCccCcCCCCCCCCCCccccc
Q 007973 555 HTVHCSRCNQTGHYKTTCKAEIMKS 579 (583)
Q Consensus 555 ~~~~Cs~C~~~gHn~~tC~~~~~~~ 579 (583)
-..+|.+|+...|=+.+=|.+.+.|
T Consensus 23 leIKCpRC~tiN~~~a~~~~~~p~~ 47 (51)
T PF10122_consen 23 LEIKCPRCKTINHVRATSPEPEPLS 47 (51)
T ss_pred EEEECCCCCccceEeccCCCCCchh
Confidence 4688999999999988887776654
No 77
>PF14787 zf-CCHC_5: GAG-polyprotein viral zinc-finger; PDB: 1CL4_A 1DSV_A.
Probab=22.74 E-value=56 Score=20.95 Aligned_cols=19 Identities=26% Similarity=0.809 Sum_probs=11.9
Q ss_pred eeCCccCcCCCCCCCCCCc
Q 007973 557 VHCSRCNQTGHYKTTCKAE 575 (583)
Q Consensus 557 ~~Cs~C~~~gHn~~tC~~~ 575 (583)
-.|.+|++-.|=...|...
T Consensus 3 ~~CprC~kg~Hwa~~C~sk 21 (36)
T PF14787_consen 3 GLCPRCGKGFHWASECRSK 21 (36)
T ss_dssp -C-TTTSSSCS-TTT---T
T ss_pred ccCcccCCCcchhhhhhhh
Confidence 4699999999999999765
No 78
>COG4888 Uncharacterized Zn ribbon-containing protein [General function prediction only]
Probab=21.87 E-value=67 Score=25.89 Aligned_cols=8 Identities=50% Similarity=1.281 Sum_probs=3.6
Q ss_pred ceeCCccC
Q 007973 556 TVHCSRCN 563 (583)
Q Consensus 556 ~~~Cs~C~ 563 (583)
+-.|..|+
T Consensus 22 ~FtCp~Cg 29 (104)
T COG4888 22 TFTCPRCG 29 (104)
T ss_pred eEecCccC
Confidence 34444444
No 79
>COG5222 Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]
Probab=21.43 E-value=62 Score=31.45 Aligned_cols=25 Identities=28% Similarity=0.631 Sum_probs=20.7
Q ss_pred CccCceeCCccCcCCCCCCCCCCcc
Q 007973 552 REKHTVHCSRCNQTGHYKTTCKAEI 576 (583)
Q Consensus 552 ~~k~~~~Cs~C~~~gHn~~tC~~~~ 576 (583)
.+..-+-|-+||+.||--+.||..-
T Consensus 172 ppPpgY~CyRCGqkgHwIqnCpTN~ 196 (427)
T COG5222 172 PPPPGYVCYRCGQKGHWIQNCPTNQ 196 (427)
T ss_pred CCCCceeEEecCCCCchhhcCCCCC
Confidence 3345688999999999999999754
No 80
>KOG2044 consensus 5'-3' exonuclease HKE1/RAT1 [Replication, recombination and repair; RNA processing and modification]
Probab=21.16 E-value=42 Score=37.35 Aligned_cols=25 Identities=28% Similarity=0.573 Sum_probs=20.3
Q ss_pred ccCceeCCccCcCCCCCCCCCCccc
Q 007973 553 EKHTVHCSRCNQTGHYKTTCKAEIM 577 (583)
Q Consensus 553 ~k~~~~Cs~C~~~gHn~~tC~~~~~ 577 (583)
+....+|-.||+.||+...|...+-
T Consensus 257 P~~~~~C~~cgq~gh~~~dc~g~~~ 281 (931)
T KOG2044|consen 257 PNKPRRCFLCGQTGHEAKDCEGKPR 281 (931)
T ss_pred CCCcccchhhcccCCcHhhcCCcCC
Confidence 3455669999999999999987743
No 81
>COG3915 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=20.53 E-value=1.3e+02 Score=25.87 Aligned_cols=38 Identities=21% Similarity=0.217 Sum_probs=30.7
Q ss_pred HHhhhcCcccEE--EeeceEeecccccEEEEEeeecCCCCe
Q 007973 191 IYGFLNGCLPIV--SIGGIQLKSKYLGTLLSATSFDADGGL 229 (583)
Q Consensus 191 ~~~~~~~~~~vv--~~D~t~~~~~y~~~l~~~~g~d~~~~~ 229 (583)
+.++. .|.|++ -.||.|++.+.+|||+.+--.|.+...
T Consensus 97 ~sDi~-kynpIlA~~~nGn~M~IRerGPl~~IYplds~peL 136 (155)
T COG3915 97 YSDIE-KYNPILAIQNNGNYMQIRERGPLWSIYPLDSSPEL 136 (155)
T ss_pred HHHhh-hcccEEEEEeCCcEEEEeccCceEEEeecCCChhh
Confidence 45666 478875 569999999999999999988887653
Done!