Query 043258
Match_columns 454
No_of_seqs 189 out of 1558
Neff 8.8
Searched_HMMs 46136
Date Fri Mar 29 03:05:52 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/043258.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/043258hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PLN03097 FHY3 Protein FAR-RED 100.0 3.9E-64 8.4E-69 531.4 33.1 398 12-435 156-624 (846)
2 PF00872 Transposase_mut: Tran 99.9 8.5E-24 1.8E-28 211.2 6.8 236 58-345 100-367 (381)
3 PF10551 MULE: MULE transposas 99.9 9.5E-23 2.1E-27 163.1 8.9 90 172-265 1-93 (93)
4 COG3328 Transposase and inacti 99.7 1E-15 2.3E-20 150.2 17.0 223 68-344 98-345 (379)
5 smart00575 ZnF_PMZ plant mutat 99.1 4.4E-11 9.6E-16 71.9 2.2 27 392-418 1-27 (28)
6 PF03108 DBD_Tnp_Mut: MuDR fam 98.2 1.9E-06 4.1E-11 63.9 4.6 34 1-34 34-67 (67)
7 PF04434 SWIM: SWIM zinc finge 98.2 9.2E-07 2E-11 58.3 2.6 30 387-416 10-39 (40)
8 PF03101 FAR1: FAR1 DNA-bindin 97.5 0.00012 2.7E-09 57.6 4.5 33 14-47 59-91 (91)
9 PF08731 AFT: Transcription fa 97.3 0.00038 8.2E-09 55.6 4.3 31 15-45 81-111 (111)
10 PF13610 DDE_Tnp_IS240: DDE do 96.9 0.00053 1.1E-08 58.8 2.1 81 165-251 1-81 (140)
11 PF06782 UPF0236: Uncharacteri 96.1 0.15 3.4E-06 52.7 14.7 78 205-283 235-313 (470)
12 PF03106 WRKY: WRKY DNA -bindi 95.9 0.026 5.7E-07 40.5 5.6 42 3-44 18-59 (60)
13 PHA02517 putative transposase 95.8 0.11 2.3E-06 49.9 11.4 151 56-240 30-181 (277)
14 PF01610 DDE_Tnp_ISL3: Transpo 95.7 0.023 4.9E-07 53.7 6.4 93 168-268 1-96 (249)
15 PF03050 DDE_Tnp_IS66: Transpo 95.5 0.065 1.4E-06 51.3 8.6 134 68-270 18-156 (271)
16 smart00774 WRKY DNA binding do 93.4 0.16 3.6E-06 36.1 4.3 40 4-43 19-59 (59)
17 COG3316 Transposase and inacti 92.0 2.3 4.9E-05 38.8 10.8 83 165-254 70-152 (215)
18 PRK09409 IS2 transposase TnpB; 89.9 5.5 0.00012 38.8 12.3 147 55-240 50-203 (301)
19 PF13565 HTH_32: Homeodomain-l 89.6 0.75 1.6E-05 34.5 4.9 40 57-96 35-76 (77)
20 PRK14702 insertion element IS2 89.4 5.5 0.00012 37.9 11.7 147 55-240 11-164 (262)
21 PF00665 rve: Integrase core d 88.1 2.4 5.1E-05 34.5 7.3 76 165-243 6-82 (120)
22 PF04937 DUF659: Protein of un 78.7 25 0.00055 30.4 9.9 64 206-271 73-139 (153)
23 PF12762 DDE_Tnp_IS1595: ISXO2 71.5 14 0.0003 31.6 6.6 69 166-241 4-87 (151)
24 PRK13907 rnhA ribonuclease H; 70.5 35 0.00076 28.1 8.6 78 167-248 3-81 (128)
25 PF04500 FLYWCH: FLYWCH zinc f 68.6 5.2 0.00011 28.2 2.7 35 5-43 25-62 (62)
26 COG5431 Uncharacterized metal- 67.3 1.5 3.3E-05 34.6 -0.4 31 381-413 41-76 (117)
27 PF13592 HTH_33: Winged helix- 64.8 12 0.00025 26.7 3.9 31 69-99 3-33 (60)
28 COG4279 Uncharacterized conser 60.7 5.1 0.00011 37.2 1.6 24 391-417 124-147 (266)
29 PF01498 HTH_Tnp_Tc3_2: Transp 59.8 9 0.0002 28.2 2.7 38 61-99 4-41 (72)
30 PF08766 DEK_C: DEK C terminal 53.5 33 0.00071 23.8 4.5 38 56-93 4-43 (54)
31 PF13276 HTH_21: HTH-like doma 45.2 46 0.001 23.4 4.3 42 57-98 6-48 (60)
32 PF08069 Ribosomal_S13_N: Ribo 44.3 20 0.00043 25.7 2.1 29 58-86 30-60 (60)
33 PRK00766 hypothetical protein; 44.1 2.5E+02 0.0053 25.4 9.7 89 166-254 10-128 (194)
34 PF13082 DUF3931: Protein of u 42.7 54 0.0012 22.4 3.9 28 179-206 35-62 (66)
35 PF13551 HTH_29: Winged helix- 39.5 54 0.0012 26.0 4.4 40 59-98 64-109 (112)
36 TIGR00334 5S_RNA_mat_M5 ribonu 36.7 51 0.0011 29.1 4.0 44 214-260 37-83 (174)
37 PF14420 Clr5: Clr5 domain 34.3 90 0.0019 21.6 4.2 26 68-93 18-43 (54)
38 PF10045 DUF2280: Uncharacteri 33.0 50 0.0011 26.3 3.0 23 72-94 21-43 (104)
39 PF09713 A_thal_3526: Plant pr 31.7 44 0.00096 23.3 2.2 25 70-94 12-37 (54)
40 KOG4027 Uncharacterized conser 30.2 72 0.0016 27.6 3.7 35 170-204 70-107 (187)
41 PF11447 DUF3201: Protein of u 29.9 2.8E+02 0.0061 23.4 6.9 72 113-212 8-84 (150)
42 PF01316 Arg_repressor: Argini 27.9 1.2E+02 0.0025 22.5 4.1 41 58-99 7-47 (70)
43 PF03705 CheR_N: CheR methyltr 26.5 1.8E+02 0.0039 19.9 4.8 47 74-123 6-52 (57)
44 KOG2909 Vacuolar H+-ATPase V1 26.5 2.1E+02 0.0046 28.1 6.5 29 107-135 139-167 (381)
45 TIGR01529 argR_whole arginine 23.5 1.4E+02 0.003 25.6 4.4 36 60-96 6-41 (146)
46 cd00131 PAX Paired Box domain 22.9 1.7E+02 0.0037 24.4 4.7 39 59-98 82-125 (128)
47 PF07761 DUF1617: Protein of u 20.4 4.3E+02 0.0092 22.6 6.5 36 71-106 5-40 (143)
No 1
>PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional
Probab=100.00 E-value=3.9e-64 Score=531.39 Aligned_cols=398 Identities=14% Similarity=0.224 Sum_probs=312.5
Q ss_pred eccCCCccEEEEEEeCCCCeEEEEEecCCcccCCCCCCCc-cchHHHHHHHhhhhccCCCCCHHHHHHHHHHHhCCccCH
Q 043258 12 CSDLSCDWQVTAIRDVRGKGFVITQFSPKHNCPRLDHAFH-PASKWISAMFLHRWKEQPSISTTEVRNEIESMYGIKCPE 90 (454)
Q Consensus 12 C~~~gC~~~v~~~~~~~~~~~~v~~~~~~Hnc~~~~~~~~-~~s~~i~~~~~~~~~~~~~~~~~~i~~~l~~~~g~~~s~ 90 (454)
|+.+||+++|++++. .++.|.|+.++.+|||++.+.... ..++.+-..+...+....++.. ++.. ..
T Consensus 156 ~tRtGC~A~m~Vk~~-~~gkW~V~~fv~eHNH~L~p~~~~~~~~r~~~~~~~~~~~~~~~v~~------~~~d-----~~ 223 (846)
T PLN03097 156 CAKTDCKASMHVKRR-PDGKWVIHSFVKEHNHELLPAQAVSEQTRKMYAAMARQFAEYKNVVG------LKND-----SK 223 (846)
T ss_pred ccCCCCceEEEEEEc-CCCeEEEEEEecCCCCCCCCccccchhhhhhHHHHHhhhhccccccc------cchh-----hc
Confidence 566799999999875 447899999999999999974321 1111111111111111000000 0000 00
Q ss_pred HHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcEEEEEecccccccccceeEEEEeecchHHHHHhhcccEEEE
Q 043258 91 WKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNIVLVETETHESREEERFKRVFVCCARTSYAFKVHCRGILAV 170 (454)
Q Consensus 91 ~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~~~~~~~~~~~~~~~~f~~~f~~~~~~~~~~~~~~~~vi~~ 170 (454)
....+.+.+. + ...+.+.|.+|++.++.+||+|+|++++ |++++++++||+++.++.+|. +|+|||.+
T Consensus 224 ~~~~~~r~~~---~---~~gD~~~ll~yf~~~q~~nP~Ffy~~ql-----De~~~l~niFWaD~~sr~~Y~-~FGDvV~f 291 (846)
T PLN03097 224 SSFDKGRNLG---L---EAGDTKILLDFFTQMQNMNSNFFYAVDL-----GEDQRLKNLFWVDAKSRHDYG-NFSDVVSF 291 (846)
T ss_pred chhhHHHhhh---c---ccchHHHHHHHHHHHHhhCCCceEEEEE-----ccCCCeeeEEeccHHHHHHHH-hcCCEEEE
Confidence 0111111111 1 1235678999999999999999999998 889999999999999999997 59999999
Q ss_pred eCeeecCCCCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHHhhhcCC
Q 043258 171 DGWEINNPCNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAVEEFLPR 250 (454)
Q Consensus 171 D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~ 250 (454)
|+||++|+|++||++++|+|+|+++++||+||+.+|+.|+|.|+|++|+++| ++..|.+||||++.+|.+||.+|||+
T Consensus 292 DTTY~tN~y~~Pfa~FvGvNhH~qtvlfGcaLl~dEt~eSf~WLf~tfl~aM--~gk~P~tIiTDqd~am~~AI~~VfP~ 369 (846)
T PLN03097 292 DTTYVRNKYKMPLALFVGVNQHYQFMLLGCALISDESAATYSWLMQTWLRAM--GGQAPKVIITDQDKAMKSVISEVFPN 369 (846)
T ss_pred eceeeccccCcEEEEEEEecCCCCeEEEEEEEcccCchhhHHHHHHHHHHHh--CCCCCceEEecCCHHHHHHHHHHCCC
Confidence 9999999999999999999999999999999999999999999999999999 58999999999999999999999999
Q ss_pred ccccccHHHHHHhHHhhCC-----CchhHHHHHHHhhhhcc--------------CCHHHHHHhhhc--CcccceeeecC
Q 043258 251 AVYRQCCHRIFNEMVRRFP-----TAPVQHLFWSACRTTSA--------------TSQECHDWLKNS--NWERWALFCMP 309 (454)
Q Consensus 251 a~~~~C~~Hi~~n~~~~~~-----~~~~~~~~~~~~~~~~~--------------~~~~~~~~l~~~--~~~~W~~~~~~ 309 (454)
+.|++|.|||++|+.+++. .+.+...|+++++.+.. ++...++||+.+ .|++|+++|++
T Consensus 370 t~Hr~C~wHI~~~~~e~L~~~~~~~~~f~~~f~~cv~~s~t~eEFE~~W~~mi~ky~L~~n~WL~~LY~~RekWapaY~k 449 (846)
T PLN03097 370 AHHCFFLWHILGKVSENLGQVIKQHENFMAKFEKCIYRSWTEEEFGKRWWKILDRFELKEDEWMQSLYEDRKQWVPTYMR 449 (846)
T ss_pred ceehhhHHHHHHHHHHHhhHHhhhhhHHHHHHHHHHhCCCCHHHHHHHHHHHHHhhcccccHHHHHHHHhHhhhhHHHhc
Confidence 9999999999999999885 34889999999886543 677888999998 89999999999
Q ss_pred CcceeccccCChhHHHhhhhhh--hccccHHHHHHHHHHHHHHHHHHHHHh-Hh---------------hhcccccChhH
Q 043258 310 HWVKCTCVTLTITEKLRTSFDH--YLEMSITRRFTAIARSTAEIFERRRMV-VW---------------KWYREKVTPTV 371 (454)
Q Consensus 310 ~~~~~~~~Ttn~~Es~n~~lk~--~r~~pi~~~~~~i~~~~~~~~~~r~~~-~~---------------~~~~~~~tp~~ 371 (454)
+.|..|+.||+++||+|+.|++ .+..+|..|++++...+..+.++..+. .. +.....+||.+
T Consensus 450 ~~F~agm~sTqRSES~Ns~fk~yv~~~tsL~~Fv~qye~~l~~~~ekE~~aD~~s~~~~P~l~t~~piEkQAs~iYT~~i 529 (846)
T PLN03097 450 DAFLAGMSTVQRSESINAFFDKYVHKKTTVQEFVKQYETILQDRYEEEAKADSDTWNKQPALKSPSPLEKSVSGVYTHAV 529 (846)
T ss_pred ccccCCcccccccccHHHHHHHHhCcCCCHHHHHHHHHHHHHHHHHHHHHhhhhcccCCcccccccHHHHHHHHHhHHHH
Confidence 9989999999999999999999 467889999998877665544433211 10 11113788888
Q ss_pred HHHhhh------cc-------------------CCCceEEEEcc----cceeecCCcccCCCCchhHHHHHHHhcC--Ch
Q 043258 372 QDIIHD------RC-------------------SDGRRFILNMD----AMSCSCGLWQISGIPCAHACRGIKYMRR--KI 420 (454)
Q Consensus 372 ~~i~~~------~~-------------------~~~~~~~V~l~----~~~CsC~~~~~~giPC~H~lav~~~~~~--~~ 420 (454)
++.++. .| .....|.|..+ +.+|+|++||..||||+|||+||...++ .|
T Consensus 530 F~kFQ~El~~~~~~~~~~~~~dg~~~~y~V~~~~~~~~~~V~~d~~~~~v~CsC~kFE~~GILCrHaLkVL~~~~v~~IP 609 (846)
T PLN03097 530 FKKFQVEVLGAVACHPKMESQDETSITFRVQDFEKNQDFTVTWNQTKLEVSCICRLFEYKGYLCRHALVVLQMCQLSAIP 609 (846)
T ss_pred HHHHHHHHHHhhheEEeeeccCCceEEEEEEEecCCCcEEEEEecCCCeEEeeccCeecCccchhhHHHHHhhcCcccCc
Confidence 774432 12 11234556443 5799999999999999999999999998 59
Q ss_pred hhhhhhhccHHHHHh
Q 043258 421 EDYVDSMMSVQNYMS 435 (454)
Q Consensus 421 ~~~v~~~~t~~~~~~ 435 (454)
+.||.++||+++-..
T Consensus 610 ~~YILkRWTKdAK~~ 624 (846)
T PLN03097 610 SQYILKRWTKDAKSR 624 (846)
T ss_pred hhhhhhhchhhhhhc
Confidence 999999999998754
No 2
>PF00872 Transposase_mut: Transposase, Mutator family; InterPro: IPR001207 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. The mutator family of transposases consists of a number of elements that include mutator from maize, IsT2 from Thiobacillus ferrooxidans, Is256 from Staphylococcus aureus, Is1201 from Lactobacillus helveticus, Is1081 from Mycobacterium bovis, IsRm3 from Rhizobium meliloti and others. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=99.89 E-value=8.5e-24 Score=211.24 Aligned_cols=236 Identities=16% Similarity=0.209 Sum_probs=184.1
Q ss_pred HHHHhhhhcc--CCCCCHHHHHHHHHHHhC-CccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcEEEEE
Q 043258 58 SAMFLHRWKE--QPSISTTEVRNEIESMYG-IKCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNIVLVE 134 (454)
Q Consensus 58 ~~~~~~~~~~--~~~~~~~~i~~~l~~~~g-~~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~~~~~ 134 (454)
.+.+.+.|.. -.+++.++|.+.++..+| ..+|.+++.|..+.+.+.. ..| +...
T Consensus 100 ~~~l~~~i~~ly~~G~Str~i~~~l~~l~g~~~~S~s~vSri~~~~~~~~-----------~~w----~~R~-------- 156 (381)
T PF00872_consen 100 EDSLEELIISLYLKGVSTRDIEEALEELYGEVAVSKSTVSRITKQLDEEV-----------EAW----RNRP-------- 156 (381)
T ss_pred hhhhhhhhhhhhccccccccccchhhhhhcccccCchhhhhhhhhhhhhH-----------HHH----hhhc--------
Confidence 4444444443 468999999999999999 8899999988766654422 111 1110
Q ss_pred ecccccccccceeEEEEeecchHHHHHhhc-ccEEEEeCeeecCCCCc-----ceEEEEEEeCCCCeEEEEEEEeecccc
Q 043258 135 TETHESREEERFKRVFVCCARTSYAFKVHC-RGILAVDGWEINNPCNS-----VMLVAAGLDGNNGILPVAFCEVQVEDL 208 (454)
Q Consensus 135 ~~~~~~~~~~~f~~~f~~~~~~~~~~~~~~-~~vi~~D~t~~~~~y~~-----~l~~~~g~d~~~~~~~la~al~~~E~~ 208 (454)
- ... .|+|++||+|.+.+.++ .+++++|+|.+|+..+||+.+.+.|+.
T Consensus 157 -------L-------------------~~~~y~~l~iD~~~~kvr~~~~~~~~~~~v~iGi~~dG~r~vLg~~~~~~Es~ 210 (381)
T PF00872_consen 157 -------L-------------------ESEPYPYLWIDGTYFKVREDGRVVKKAVYVAIGIDEDGRREVLGFWVGDRESA 210 (381)
T ss_pred -------c-------------------ccccccceeeeeeecccccccccccchhhhhhhhhcccccceeeeecccCCcc
Confidence 0 013 58899999999987544 589999999999999999999999999
Q ss_pred hhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHHhhhcCCccccccHHHHHHhHHhhCCCc---hhHHHHHHHhhhhc
Q 043258 209 DSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAVEEFLPRAVYRQCCHRIFNEMVRRFPTA---PVQHLFWSACRTTS 285 (454)
Q Consensus 209 e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~~---~~~~~~~~~~~~~~ 285 (454)
++|.-||+.|++++ ...|..||+|+++||.+||.++||++.+|.|++|+++|+.++++.+ .+...++.+..+.+
T Consensus 211 ~~W~~~l~~L~~RG---l~~~~lvv~Dg~~gl~~ai~~~fp~a~~QrC~vH~~RNv~~~v~~k~~~~v~~~Lk~I~~a~~ 287 (381)
T PF00872_consen 211 ASWREFLQDLKERG---LKDILLVVSDGHKGLKEAIREVFPGAKWQRCVVHLMRNVLRKVPKKDRKEVKADLKAIYQAPD 287 (381)
T ss_pred CEeeecchhhhhcc---ccccceeeccccccccccccccccchhhhhheechhhhhccccccccchhhhhhccccccccc
Confidence 99999999999998 5679999999999999999999999999999999999999999755 56677776666554
Q ss_pred c----------------CCHHHHHHhhhcCcccceeeecCCcceeccccCChhHHHhhhhhhh----ccccHHHHHHHHH
Q 043258 286 A----------------TSQECHDWLKNSNWERWALFCMPHWVKCTCVTLTITEKLRTSFDHY----LEMSITRRFTAIA 345 (454)
Q Consensus 286 ~----------------~~~~~~~~l~~~~~~~W~~~~~~~~~~~~~~Ttn~~Es~n~~lk~~----r~~pi~~~~~~i~ 345 (454)
. ..|.+.+++++...+.|+..-|+...+--+.|||.+||+|+.+|+. +..|-.+.+..+.
T Consensus 288 ~e~a~~~l~~f~~~~~~kyp~~~~~l~~~~~~~~tf~~fP~~~~~~i~TTN~iEsln~~irrr~~~~~~Fp~~~s~lr~~ 367 (381)
T PF00872_consen 288 KEEAREALEEFAEKWEKKYPKAAKSLEENWDELLTFLDFPPEHRRSIRTTNAIESLNKEIRRRTKVVGIFPNEESALRLV 367 (381)
T ss_pred cchhhhhhhhcccccccccchhhhhhhhccccccceeeecchhccccchhhhccccccchhhhccccccCCCHHHHHHHH
Confidence 3 5677777777766666666555655455678999999999999882 3466555444433
No 3
>PF10551 MULE: MULE transposase domain; InterPro: IPR018289 This entry represents a domain found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif. This domain is also found in a transposase for the insertion sequence element IS256 in transposon Tn4001.
Probab=99.88 E-value=9.5e-23 Score=163.07 Aligned_cols=90 Identities=30% Similarity=0.470 Sum_probs=85.7
Q ss_pred CeeecCCCCcceEE---EEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHHhhhc
Q 043258 172 GWEINNPCNSVMLV---AAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAVEEFL 248 (454)
Q Consensus 172 ~t~~~~~y~~~l~~---~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai~~vf 248 (454)
|||++|+| ++++. ++|+|++|+.+|+||+++.+|+.++|.|||+.+++.++ .. |.+||||++.|+.+||+++|
T Consensus 1 ~T~~tn~~-~~l~~~~~~~~~d~~~~~~~v~~~l~~~e~~~~~~~~l~~~~~~~~--~~-p~~ii~D~~~~~~~Ai~~vf 76 (93)
T PF10551_consen 1 GTYKTNKY-GPLLYLMIAVGIDGNGRGFPVAFALVSSESEESYEWFLEKLKEAMP--QK-PKVIISDFDKALINAIKEVF 76 (93)
T ss_pred Cccccccc-cccceeceEEEEcCCCCEEEEEEEEEcCCChhhhHHHHHHhhhccc--cC-ceeeeccccHHHHHHHHHHC
Confidence 79999999 98885 99999999999999999999999999999999999994 45 99999999999999999999
Q ss_pred CCccccccHHHHHHhHH
Q 043258 249 PRAVYRQCCHRIFNEMV 265 (454)
Q Consensus 249 P~a~~~~C~~Hi~~n~~ 265 (454)
|++.|++|.||+.+|++
T Consensus 77 P~~~~~~C~~H~~~n~k 93 (93)
T PF10551_consen 77 PDARHQLCLFHILRNIK 93 (93)
T ss_pred CCceEehhHHHHHHhhC
Confidence 99999999999999974
No 4
>COG3328 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=99.68 E-value=1e-15 Score=150.17 Aligned_cols=223 Identities=15% Similarity=0.180 Sum_probs=163.2
Q ss_pred CCCCCHHHHHHHHHHHhCCccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcEEEEEeccccccccccee
Q 043258 68 QPSISTTEVRNEIESMYGIKCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNIVLVETETHESREEERFK 147 (454)
Q Consensus 68 ~~~~~~~~i~~~l~~~~g~~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~~~~~~~~~~~~~~~~f~ 147 (454)
..+++++++.+.++..++..++...+.+...++.+ .+.+++..-+
T Consensus 98 ~~gv~Tr~i~~~~~~~~~~~~s~~~iS~~~~~~~e---------------~v~~~~~r~l-------------------- 142 (379)
T COG3328 98 AKGVTTREIEALLEELYGHKVSPSVISVVTDRLDE---------------KVKAWQNRPL-------------------- 142 (379)
T ss_pred HcCCcHHHHHHHHHHhhCcccCHHHhhhHHHHHHH---------------HHHHHHhccc--------------------
Confidence 56899999999999999988888877765554443 3333333211
Q ss_pred EEEEeecchHHHHHhhcccEEEEeCeeecCC--CCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhcccc
Q 043258 148 RVFVCCARTSYAFKVHCRGILAVDGWEINNP--CNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALRLE 225 (454)
Q Consensus 148 ~~f~~~~~~~~~~~~~~~~vi~~D~t~~~~~--y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~ 225 (454)
+..+++++||+|++.+ -+..+++|+|++.+|+..++|+.+...|+ ..|.-||..|+..+
T Consensus 143 ---------------~~~~~v~~D~~~~k~r~v~~~~~~ia~Gv~~eG~reilg~~~~~~e~-~~w~~~l~~l~~rg--- 203 (379)
T COG3328 143 ---------------GDYPYVYLDAKYVKVRSVRNKAVYIAIGVTEEGRREILGIWVGVRES-KFWLSFLLDLKNRG--- 203 (379)
T ss_pred ---------------cCceEEEEecceeehhhhhhheeeeeeccCcccchhhhceeeecccc-hhHHHHHHHHHhcc---
Confidence 1458899999999988 45579999999999999999999999999 99999999999997
Q ss_pred CCCcEEEEcCCcchHHHHHhhhcCCccccccHHHHHHhHHhhCCCch---hHHHHHHHhhhhcc----------------
Q 043258 226 NGKGLCILGDGDNGVEYAVEEFLPRAVYRQCCHRIFNEMVRRFPTAP---VQHLFWSACRTTSA---------------- 286 (454)
Q Consensus 226 ~~~~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~~~---~~~~~~~~~~~~~~---------------- 286 (454)
......+++|+.+|+.+||.++||.+.+|+|..|+.+|+.++.+.++ ....+..+..+.+.
T Consensus 204 l~~v~l~v~Dg~~gl~~aI~~v~p~a~~Q~C~vH~~Rnll~~v~~k~~d~i~~~~~~I~~a~~~e~~~~~~~~~~~~w~~ 283 (379)
T COG3328 204 LSDVLLVVVDGLKGLPEAISAVFPQAAVQRCIVHLVRNLLDKVPRKDQDAVLSDLRSIYIAPDAEEALLALLAFSELWGK 283 (379)
T ss_pred ccceeEEecchhhhhHHHHHHhccHhhhhhhhhHHHhhhhhhhhhhhhHHHHhhhhhhhccCCcHHHHHHHHHHHHhhhh
Confidence 55666777899999999999999999999999999999999887662 22333332222221
Q ss_pred CCHHHHHHhhhcCcccceeeecCCcceeccccCChhHHHhhhhhh-h---ccccHHHHHHHH
Q 043258 287 TSQECHDWLKNSNWERWALFCMPHWVKCTCVTLTITEKLRTSFDH-Y---LEMSITRRFTAI 344 (454)
Q Consensus 287 ~~~~~~~~l~~~~~~~W~~~~~~~~~~~~~~Ttn~~Es~n~~lk~-~---r~~pi~~~~~~i 344 (454)
..|....|+.+.-.+.|...-|+...+--+.|||..|++|+.++. . ..+|-.+.+..+
T Consensus 284 ~yP~i~~~~~~~~~~~~~F~~fp~~~r~~i~ttN~IE~~n~~ir~~~~~~~~fpn~~sv~k~ 345 (379)
T COG3328 284 RYPAILKSWRNALEELLPFFAFPSEIRKIIYTTNAIESLNKLIRRRTKVVGIFPNEESVEKL 345 (379)
T ss_pred hcchHHHHHHHHHHHhcccccCcHHHHhHhhcchHHHHHHHHHHHHHhhhccCCCHHHHHHH
Confidence 334444444444334443333333324457899999999998775 2 235655555443
No 5
>smart00575 ZnF_PMZ plant mutator transposase zinc finger.
Probab=99.10 E-value=4.4e-11 Score=71.94 Aligned_cols=27 Identities=41% Similarity=0.785 Sum_probs=25.2
Q ss_pred ceeecCCcccCCCCchhHHHHHHHhcC
Q 043258 392 MSCSCGLWQISGIPCAHACRGIKYMRR 418 (454)
Q Consensus 392 ~~CsC~~~~~~giPC~H~lav~~~~~~ 418 (454)
.+|||++||..||||+|+|+|+...|+
T Consensus 1 ~~CsC~~~~~~gipC~H~i~v~~~~~~ 27 (28)
T smart00575 1 KTCSCRKFQLSGIPCRHALAAAIHIGL 27 (28)
T ss_pred CcccCCCcccCCccHHHHHHHHHHhCC
Confidence 479999999999999999999998876
No 6
>PF03108 DBD_Tnp_Mut: MuDR family transposase; InterPro: IPR004332 The plant MuDR transposase domain is present in plant proteins that are presumed to be the transposases for Mutator transposable elements. The function of these proteins is unknown. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=98.21 E-value=1.9e-06 Score=63.92 Aligned_cols=34 Identities=26% Similarity=0.576 Sum_probs=31.8
Q ss_pred CCCCCeEEEEEeccCCCccEEEEEEeCCCCeEEE
Q 043258 1 MENRSHIVSCECSDLSCDWQVTAIRDVRGKGFVI 34 (454)
Q Consensus 1 ~ks~~~r~~~~C~~~gC~~~v~~~~~~~~~~~~v 34 (454)
.||+++|+.++|...||||+|+|++.++++.|+|
T Consensus 34 ~ksd~~r~~~~C~~~~C~Wrv~as~~~~~~~~~I 67 (67)
T PF03108_consen 34 KKSDKKRYRAKCKDKGCPWRVRASKRKRSDTFQI 67 (67)
T ss_pred eccCCEEEEEEEcCCCCCEEEEEEEcCCCCEEEC
Confidence 4799999999999999999999999999999976
No 7
>PF04434 SWIM: SWIM zinc finger; InterPro: IPR007527 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1).
It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination. SWIM domains are also found in the homologous recombination protein Sws1, as well as in several hypothetical proteins. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding
Probab=98.21 E-value=9.2e-07 Score=58.32 Aligned_cols=30 Identities=30% Similarity=0.703 Sum_probs=26.4
Q ss_pred EEcccceeecCCcccCCCCchhHHHHHHHh
Q 043258 387 LNMDAMSCSCGLWQISGIPCAHACRGIKYM 416 (454)
Q Consensus 387 V~l~~~~CsC~~~~~~giPC~H~lav~~~~ 416 (454)
+++...+|||..|+..|.||+|++|++...
T Consensus 10 ~~~~~~~CsC~~~~~~~~~CkHi~av~~~~ 39 (40)
T PF04434_consen 10 VSIEQASCSCPYFQFRGGPCKHIVAVLLAL 39 (40)
T ss_pred ccccccEeeCCCccccCCcchhHHHHHHhh
Confidence 345678999999999999999999998765
No 8
>PF03101 FAR1: FAR1 DNA-binding domain; InterPro: IPR004330 Phytochrome A is the primary photoreceptor for mediating various far-red light-induced responses in higher plants. It has been found that the proteins governing this response, which include FAR-RED ELONGATED HYPOCOTYL3 (FHY3) and FAR-RED-IMPAIRED RESPONSE1 (FAR1), are a pair of homologous proteins sharing significant sequence homology to mutator-like transposases. These proteins appear to be novel transcription factors, which are essential for activating the expression of FHY1 and FHL (for FHY1-like) and related genes, whose products are required for light-induced phytochrome A nuclear accumulation and subsequent light responses in plants. The FRS (FAR1 Related Sequences) family of proteins share a similar domain structure to mutator-like transposases, including an N-terminal C2H2 zinc finger domain, a central putative core transposase domain, and a C-terminal SWIM motif (named after SWI2/SNF and MuDR transposases). It seems plausible that the FRS family represent transcription factors derived from mutator-like transposases. This entry represents a domain found in FAR1 and FRS proteins. It contains a WRKY-like fold and is therefore most likely a zinc-binding DNA-binding domain.
Probab=97.52 E-value=0.00012 Score=57.65 Aligned_cols=33 Identities=21% Similarity=0.362 Sum_probs=29.7
Q ss_pred cCCCccEEEEEEeCCCCeEEEEEecCCcccCCCC
Q 043258 14 DLSCDWQVTAIRDVRGKGFVITQFSPKHNCPRLD 47 (454)
Q Consensus 14 ~~gC~~~v~~~~~~~~~~~~v~~~~~~Hnc~~~~ 47 (454)
.+||||+|.+.+.+ ++.|.|+.+..+|||++.+
T Consensus 59 ktgC~a~i~v~~~~-~~~w~v~~~~~~HNH~L~P 91 (91)
T PF03101_consen 59 KTGCKARINVKRRK-DGKWRVTSFVLEHNHPLCP 91 (91)
T ss_pred ccCCCEEEEEEEcc-CCEEEEEECcCCcCCCCCC
Confidence 36999999999887 7899999999999999864
No 9
>PF08731 AFT: Transcription factor AFT; InterPro: IPR014842 AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This entry includes the paralogous pair of transcription factors AFT1 and AFT2.
Probab=97.26 E-value=0.00038 Score=55.64 Aligned_cols=31 Identities=19% Similarity=0.423 Sum_probs=29.1
Q ss_pred CCCccEEEEEEeCCCCeEEEEEecCCcccCC
Q 043258 15 LSCDWQVTAIRDVRGKGFVITQFSPKHNCPR 45 (454)
Q Consensus 15 ~gC~~~v~~~~~~~~~~~~v~~~~~~Hnc~~ 45 (454)
.+|||+|+|......+.|.|..+++.|||++
T Consensus 81 ~~CPFriRA~yS~k~k~W~lvvvnn~HnH~l 111 (111)
T PF08731_consen 81 NTCPFRIRANYSKKNKKWTLVVVNNEHNHPL 111 (111)
T ss_pred cCCCeEEEEEEEecCCeEEEEEecCCcCCCC
Confidence 4899999999999999999999999999985
No 10
>PF13610 DDE_Tnp_IS240: DDE domain
Probab=96.88 E-value=0.00053 Score=58.83 Aligned_cols=81 Identities=19% Similarity=0.126 Sum_probs=68.9
Q ss_pred ccEEEEeCeeecCCCCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHH
Q 043258 165 RGILAVDGWEINNPCNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAV 244 (454)
Q Consensus 165 ~~vi~~D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai 244 (454)
++.+.+|-||.+.+-+ ..++..++|.+|+ +|.+-|-..-+...=..||..+.+.. ...|..|+||+.++...|+
T Consensus 1 ~~~w~~DEt~iki~G~-~~yl~~aiD~~~~--~l~~~ls~~Rd~~aA~~Fl~~~l~~~---~~~p~~ivtDk~~aY~~A~ 74 (140)
T PF13610_consen 1 GDSWHVDETYIKIKGK-WHYLWRAIDAEGN--ILDFYLSKRRDTAAAKRFLKRALKRH---RGEPRVIVTDKLPAYPAAI 74 (140)
T ss_pred CCEEEEeeEEEEECCE-EEEEEEeeccccc--chhhhhhhhcccccceeeccccceee---ccccceeecccCCccchhh
Confidence 3678999999876533 5667899999999 89999999999988889998888876 4789999999999999999
Q ss_pred hhhcCCc
Q 043258 245 EEFLPRA 251 (454)
Q Consensus 245 ~~vfP~a 251 (454)
+++.|.-
T Consensus 75 ~~l~~~~ 81 (140)
T PF13610_consen 75 KELNPEG 81 (140)
T ss_pred hhccccc
Confidence 9998763
No 11
>PF06782 UPF0236: Uncharacterised protein family (UPF0236); InterPro: IPR009620 This is a group of proteins of unknown function.
Probab=96.08 E-value=0.15 Score=52.72 Aligned_cols=78 Identities=18% Similarity=0.287 Sum_probs=62.5
Q ss_pred cccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHHhhhcCCccccccHHHHHHhHHhhCCC-chhHHHHHHHhhh
Q 043258 205 VEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAVEEFLPRAVYRQCCHRIFNEMVRRFPT-APVQHLFWSACRT 283 (454)
Q Consensus 205 ~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~-~~~~~~~~~~~~~ 283 (454)
..+.+-|.-+.+.+.+...+....-+++.+|+.+.+.+++. .+|++.+.+..+|+.+.+.+.++. +++.+.++++.+.
T Consensus 235 ~~~~~~~~~v~~~i~~~Y~~~~~~~iiingDGa~WIk~~~~-~~~~~~~~LD~FHl~k~i~~~~~~~~~~~~~~~~al~~ 313 (470)
T PF06782_consen 235 ESAEEFWEEVLDYIYNHYDLDKTTKIIINGDGASWIKEGAE-FFPKAEYFLDRFHLNKKIKQALSHDPELKEKIRKALKK 313 (470)
T ss_pred cchHHHHHHHHHHHHHhcCcccceEEEEeCCCcHHHHHHHH-hhcCceEEecHHHHHHHHHHHhhhChHHHHHHHHHHHh
Confidence 55678899999999888864444468889999999998776 999999999999999999988753 3566656666554
No 12
>PF03106 WRKY: WRKY DNA-binding domain; InterPro: IPR003657 The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene. Structural studies indicate that this domain is a four-stranded beta-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure. The WRKYGQK residues correspond to the most N-terminal beta-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the beta-sheet.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 2AYD_A 1WJ2_A 2LEX_A.
Probab=95.87 E-value=0.026 Score=40.49 Aligned_cols=42 Identities=19% Similarity=0.339 Sum_probs=33.3
Q ss_pred CCCeEEEEEeccCCCccEEEEEEeCCCCeEEEEEecCCcccC
Q 043258 3 NRSHIVSCECSDLSCDWQVTAIRDVRGKGFVITQFSPKHNCP 44 (454)
Q Consensus 3 s~~~r~~~~C~~~gC~~~v~~~~~~~~~~~~v~~~~~~Hnc~ 44 (454)
|.-.|--++|+..+||++-.+.+..+++...++++.++|||+
T Consensus 18 ~~~pRsYYrCt~~~C~akK~Vqr~~~d~~~~~vtY~G~H~h~ 59 (60)
T PF03106_consen 18 SPYPRSYYRCTHPGCPAKKQVQRSADDPNIVIVTYEGEHNHP 59 (60)
T ss_dssp TTCEEEEEEEECTTEEEEEEEEEETTCCCEEEEEEES--SS-
T ss_pred CceeeEeeeccccChhheeeEEEecCCCCEEEEEEeeeeCCC
Confidence 344566799999999999999998877778888999999996
No 13
>PHA02517 putative transposase OrfB; Reviewed
Probab=95.78 E-value=0.11 Score=49.94 Aligned_cols=151 Identities=16% Similarity=0.079 Sum_probs=83.8
Q ss_pred HHHHHHhhhhcc-CCCCCHHHHHHHHHHHhCCccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcEEEEE
Q 043258 56 WISAMFLHRWKE-QPSISTTEVRNEIESMYGIKCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNIVLVE 134 (454)
Q Consensus 56 ~i~~~~~~~~~~-~~~~~~~~i~~~l~~~~g~~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~~~~~ 134 (454)
.+.+.+.+.... .+.+..+.|...|++. |+.+|.++++|..+.. |-... ....-.....+.. .
T Consensus 30 ~l~~~I~~i~~~~~~~~G~r~I~~~L~~~-g~~vs~~tV~Rim~~~-----gl~~~-------~~~k~~~~~~~~~-~-- 93 (277)
T PHA02517 30 WLKSEILRVYDENHQVYGVRKVWRQLNRE-GIRVARCTVGRLMKEL-----GLAGV-------LRGKKVRTTISRK-A-- 93 (277)
T ss_pred HHHHHHHHHHHHhCCCCCHHHHHHHHHhc-CcccCHHHHHHHHHHc-----CCceE-------ecCCCcCCCCCCC-C--
Confidence 455566666554 5788999999999765 9999999998764432 11000 0000000000000 0
Q ss_pred ecccccccccceeEEEEeecchHHHHHhhcccEEEEeCeeecCCCCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHH
Q 043258 135 TETHESREEERFKRVFVCCARTSYAFKVHCRGILAVDGWEINNPCNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYF 214 (454)
Q Consensus 135 ~~~~~~~~~~~f~~~f~~~~~~~~~~~~~~~~vi~~D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~ 214 (454)
....+.+.+-|-+ ..-..++..|.||..... +..++.+.+|...+ .++|+.+...++.+...-.
T Consensus 94 -----~~~~n~~~r~f~~---------~~pn~~w~~D~t~~~~~~-g~~yl~~iiD~~sr-~i~~~~~~~~~~~~~~~~~ 157 (277)
T PHA02517 94 -----VAAPDRVNRQFVA---------TRPNQLWVADFTYVSTWQ-GWVYVAFIIDVFAR-RIVGWRVSSSMDTDFVLDA 157 (277)
T ss_pred -----CCCCCcccCCCCC---------CCCCCeEEeceeEEEeCC-CCEEEEEecccCCC-eeeecccCCCCChHHHHHH
Confidence 0001111111100 013468999999986543 56677777776554 4667888877888766555
Q ss_pred HHHHHhhccccCCCcEEEEcCCcchH
Q 043258 215 LKNINSALRLENGKGLCILGDGDNGV 240 (454)
Q Consensus 215 l~~l~~~~~~~~~~~~~iitD~~~~l 240 (454)
|+......+ ...+.+|.||+....
T Consensus 158 l~~a~~~~~--~~~~~i~~sD~G~~y 181 (277)
T PHA02517 158 LEQALWARG--RPGGLIHHSDKGSQY 181 (277)
T ss_pred HHHHHHhcC--CCcCcEeeccccccc
Confidence 555444431 223457789998864
No 14
>PF01610 DDE_Tnp_ISL3: Transposase; InterPro: IPR002560 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. This family includes the IS204, IS1001, IS1096 and IS1165 transposases. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=95.74 E-value=0.023 Score=53.69 Aligned_cols=93 Identities=15% Similarity=0.175 Sum_probs=69.5
Q ss_pred EEEeCeeecCCCCcceEEEEEEeC--CCCeEEEEEEEeecccchhHHHHHHHH-HhhccccCCCcEEEEcCCcchHHHHH
Q 043258 168 LAVDGWEINNPCNSVMLVAAGLDG--NNGILPVAFCEVQVEDLDSWVYFLKNI-NSALRLENGKGLCILGDGDNGVEYAV 244 (454)
Q Consensus 168 i~~D~t~~~~~y~~~l~~~~g~d~--~~~~~~la~al~~~E~~e~~~w~l~~l-~~~~~~~~~~~~~iitD~~~~l~~Ai 244 (454)
|+||=+........ +..+.+|. +++.+ ++++++-+.++..-||..+ -... ......|++|..++..+|+
T Consensus 1 lgiDE~~~~~g~~~--y~t~~~d~~~~~~~i---l~i~~~r~~~~l~~~~~~~~~~~~---~~~v~~V~~Dm~~~y~~~~ 72 (249)
T PF01610_consen 1 LGIDEFAFRKGHRS--YVTVVVDLDTDTGRI---LDILPGRDKETLKDFFRSLYPEEE---RKNVKVVSMDMSPPYRSAI 72 (249)
T ss_pred CeEeeeeeecCCcc--eeEEEEECccCCceE---EEEcCCccHHHHHHHHHHhCcccc---ccceEEEEcCCCccccccc
Confidence 45666665443332 45555555 44332 4588999999999888876 3332 4567899999999999999
Q ss_pred hhhcCCccccccHHHHHHhHHhhC
Q 043258 245 EEFLPRAVYRQCCHRIFNEMVRRF 268 (454)
Q Consensus 245 ~~vfP~a~~~~C~~Hi~~n~~~~~ 268 (454)
++.||+|.+..--||+++++.+.+
T Consensus 73 ~~~~P~A~iv~DrFHvvk~~~~al 96 (249)
T PF01610_consen 73 REYFPNAQIVADRFHVVKLANRAL 96 (249)
T ss_pred cccccccccccccchhhhhhhhcc
Confidence 999999999999999999987644
No 15
>PF03050 DDE_Tnp_IS66: Transposase IS66 family; InterPro: IPR004291 Transposase proteins are necessary for efficient DNA transposition. This family includes the bacterial insertion sequence (IS) element, IS66, from Agrobacterium tumefaciens. IS66 may cause genetic and structural variations of the T region and the vir region of the octopine Ti plasmids. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=95.49 E-value=0.065 Score=51.31 Aligned_cols=134 Identities=12% Similarity=0.119 Sum_probs=87.0
Q ss_pred CCCCCHHHHHHHHHHHhCCccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcEEEEEeccccccccccee
Q 043258 68 QPSISTTEVRNEIESMYGIKCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNIVLVETETHESREEERFK 147 (454)
Q Consensus 68 ~~~~~~~~i~~~l~~~~g~~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~~~~~~~~~~~~~~~~f~ 147 (454)
...++...+.+.+... |+++|..++.+.-.++.+.. ....+.+.+.-
T Consensus 18 ~~~lp~~r~~~~~~~~-G~~is~~ti~~~~~~~~~~l-----------~~~~~~l~~~~--------------------- 64 (271)
T PF03050_consen 18 VYHLPLYRIQQMLEDL-GITISRGTIANWIKRVAEAL-----------KPLYEALKEEL--------------------- 64 (271)
T ss_pred cCCCCHHHHhhhhhcc-ceeeccchhHhHhhhhhhhh-----------hhhhhhhhhhc---------------------
Confidence 4456777788888777 99999999887766554322 12222222211
Q ss_pred EEEEeecchHHHHHhhcccEEEEeCeeec----CCCC-cceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhc
Q 043258 148 RVFVCCARTSYAFKVHCRGILAVDGWEIN----NPCN-SVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSAL 222 (454)
Q Consensus 148 ~~f~~~~~~~~~~~~~~~~vi~~D~t~~~----~~y~-~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~ 222 (454)
--.+++.+|-|..+ ++.. +-+.++++-+ .+.|.+.++-+.+...-+|..
T Consensus 65 ---------------~~~~~~~~DET~~~vl~~~~g~~~~~Wv~~~~~------~v~f~~~~sR~~~~~~~~L~~----- 118 (271)
T PF03050_consen 65 ---------------RSSPVVHADETGWRVLDKGKGKKGYLWVFVSPE------VVLFFYAPSRSSKVIKEFLGD----- 118 (271)
T ss_pred ---------------cccceeccCCceEEEeccccccceEEEeeeccc------eeeeeecccccccchhhhhcc-----
Confidence 13578888888877 4433 3344444433 666777777777777666533
Q ss_pred cccCCCcEEEEcCCcchHHHHHhhhcCCccccccHHHHHHhHHhhCCC
Q 043258 223 RLENGKGLCILGDGDNGVEYAVEEFLPRAVYRQCCHRIFNEMVRRFPT 270 (454)
Q Consensus 223 ~~~~~~~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~ 270 (454)
-.-+++||+-.+-.. +..+.|+.|+.|+.|.+.+-...
T Consensus 119 -----~~GilvsD~y~~Y~~-----~~~~~hq~C~AH~~R~~~~~~~~ 156 (271)
T PF03050_consen 119 -----FSGILVSDGYSAYNK-----LAGITHQLCWAHLRRDFQDAAES 156 (271)
T ss_pred -----cceeeeccccccccc-----ccccccccccccccccccccccc
Confidence 223789999987654 22788999999999998876653
No 16
>smart00774 WRKY DNA binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.
Probab=93.36 E-value=0.16 Score=36.11 Aligned_cols=40 Identities=13% Similarity=0.142 Sum_probs=32.2
Q ss_pred CCeEEEEEecc-CCCccEEEEEEeCCCCeEEEEEecCCccc
Q 043258 4 RSHIVSCECSD-LSCDWQVTAIRDVRGKGFVITQFSPKHNC 43 (454)
Q Consensus 4 ~~~r~~~~C~~-~gC~~~v~~~~~~~~~~~~v~~~~~~Hnc 43 (454)
...|--.+|+. .|||++=.+.+..+++...++.+.++|||
T Consensus 19 ~~pRsYYrCt~~~~C~a~K~Vq~~~~d~~~~~vtY~g~H~h 59 (59)
T smart00774 19 PFPRSYYRCTYSQGCPAKKQVQRSDDDPSVVEVTYEGEHTH 59 (59)
T ss_pred cCcceEEeccccCCCCCcccEEEECCCCCEEEEEEeeEeCC
Confidence 34455689998 89999988888766666777899999998
No 17
>COG3316 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=91.99 E-value=2.3 Score=38.77 Aligned_cols=83 Identities=14% Similarity=0.071 Sum_probs=60.6
Q ss_pred ccEEEEeCeeecCCCCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHH
Q 043258 165 RGILAVDGWEINNPCNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAV 244 (454)
Q Consensus 165 ~~vi~~D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai 244 (454)
++.+.+|-||.+.+-+. .+.-.+||.+| .+|.+-|...-+...=.-||..+.+.- +.|.+|+||+.+....|+
T Consensus 70 ~~~w~vDEt~ikv~gkw-~ylyrAid~~g--~~Ld~~L~~rRn~~aAk~Fl~kllk~~----g~p~v~vtDka~s~~~A~ 142 (215)
T COG3316 70 GDSWRVDETYIKVNGKW-HYLYRAIDADG--LTLDVWLSKRRNALAAKAFLKKLLKKH----GEPRVFVTDKAPSYTAAL 142 (215)
T ss_pred ccceeeeeeEEeeccEe-eehhhhhccCC--CeEEEEEEcccCcHHHHHHHHHHHHhc----CCCceEEecCccchHHHH
Confidence 46678999998765433 23334556664 466777777777777777887766664 478899999999999999
Q ss_pred hhhcCCcccc
Q 043258 245 EEFLPRAVYR 254 (454)
Q Consensus 245 ~~vfP~a~~~ 254 (454)
.++-+.+.|+
T Consensus 143 ~~l~~~~ehr 152 (215)
T COG3316 143 RKLGSEVEHR 152 (215)
T ss_pred HhcCcchhee
Confidence 9998866655
No 18
>PRK09409 IS2 transposase TnpB; Reviewed
Probab=89.90 E-value=5.5 Score=38.76 Aligned_cols=147 Identities=11% Similarity=0.021 Sum_probs=87.4
Q ss_pred HHHHHHHhhhhccCCCCCHHHHHHHHHHHh---CC-ccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcE
Q 043258 55 KWISAMFLHRWKEQPSISTTEVRNEIESMY---GI-KCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNI 130 (454)
Q Consensus 55 ~~i~~~~~~~~~~~~~~~~~~i~~~l~~~~---g~-~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~ 130 (454)
..+.+.+.+.....+.+..+.|...|+++. |+ .++.++++|..+.+ |-.. ...+..+.+.
T Consensus 50 ~~l~~~I~~i~~~~~~yG~Rri~~~L~~~g~~~g~~~v~~k~V~RlMr~~-----Gl~~-----------~~~~~~~~~~ 113 (301)
T PRK09409 50 TDVLLRIHHVIGELPTYGYRRVWALLRRQAELDGMPAINAKRVYRIMRQN-----ALLL-----------ERKPAVPPSK 113 (301)
T ss_pred HHHHHHHHHHHHhCccCCHHHHHHHHHhhhcccCccccCHHHHHHHHHHc-----CCcc-----------cccCCCCCCC
Confidence 345555666655678899999999998752 66 58999988764432 1000 0000000000
Q ss_pred EEEEecccccccccceeEEEEeecchHHHHHhhcccEEEEeCeeecCCCCcceEEEEEEeCCCCeEEEEEEEeec-ccch
Q 043258 131 VLVETETHESREEERFKRVFVCCARTSYAFKVHCRGILAVDGWEINNPCNSVMLVAAGLDGNNGILPVAFCEVQV-EDLD 209 (454)
Q Consensus 131 ~~~~~~~~~~~~~~~f~~~f~~~~~~~~~~~~~~~~vi~~D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~-E~~e 209 (454)
......| . ...-..++..|-||....-++.++.++-+|...+ .++|+++... .+.+
T Consensus 114 ---------~~~~~~~----~---------~~~pN~~W~tDiT~~~~~~g~~~Yl~~ViD~~sR-~ivg~~~s~~~~~~~ 170 (301)
T PRK09409 114 ---------RAHTGRV----A---------VKESNQRWCSDGFEFCCDNGERLRVTFALDCCDR-EALHWAVTTGGFNSE 170 (301)
T ss_pred ---------CCCCCCc----C---------CCCCCCEEEeeeEEEEeCCCCEEEEEEEeecccc-eEEEEEeccCCCCHH
Confidence 0001111 0 0123468999999987655556888888888776 6889999875 5666
Q ss_pred hHHHHHHH-HHhhccc-cCCCcEEEEcCCcchH
Q 043258 210 SWVYFLKN-INSALRL-ENGKGLCILGDGDNGV 240 (454)
Q Consensus 210 ~~~w~l~~-l~~~~~~-~~~~~~~iitD~~~~l 240 (454)
.-.-+|+. +....+. ....|.+|.||+....
T Consensus 171 ~v~~~l~~a~~~~~~~~~~~~~~iihSDrGsqy 203 (301)
T PRK09409 171 TVQDVMLGAVERRFGNDLPSSPVEWLTDNGSCY 203 (301)
T ss_pred HHHHHHHHHHHHHhccCCCCCCcEEecCCCccc
Confidence 65566654 4443320 1235788999998753
No 19
>PF13565 HTH_32: Homeodomain-like domain
Probab=89.63 E-value=0.75 Score=34.52 Aligned_cols=40 Identities=18% Similarity=0.302 Sum_probs=34.1
Q ss_pred HHHHHhhhhccCCCCCHHHHHHHHHHHhCCcc--CHHHHHHH
Q 043258 57 ISAMFLHRWKEQPSISTTEVRNEIESMYGIKC--PEWKVFCA 96 (454)
Q Consensus 57 i~~~~~~~~~~~~~~~~~~i~~~l~~~~g~~~--s~~~~~ra 96 (454)
+.+.+.+.+..+|.+++.+|.+.|.+++|+.+ |.+++||.
T Consensus 35 ~~~~i~~~~~~~p~wt~~~i~~~L~~~~g~~~~~S~~tv~R~ 76 (77)
T PF13565_consen 35 QRERIIALIEEHPRWTPREIAEYLEEEFGISVRVSRSTVYRI 76 (77)
T ss_pred HHHHHHHHHHhCCCCCHHHHHHHHHHHhCCCCCccHhHHHHh
Confidence 44566677778899999999999999999876 99999874
No 20
>PRK14702 insertion element IS2 transposase InsD; Provisional
Probab=89.44 E-value=5.5 Score=37.89 Aligned_cols=147 Identities=12% Similarity=0.045 Sum_probs=87.7
Q ss_pred HHHHHHHhhhhccCCCCCHHHHHHHHHHH---hCC-ccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcE
Q 043258 55 KWISAMFLHRWKEQPSISTTEVRNEIESM---YGI-KCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNI 130 (454)
Q Consensus 55 ~~i~~~~~~~~~~~~~~~~~~i~~~l~~~---~g~-~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~ 130 (454)
..+...+.+.....+.+..+.|...|++. .|+ .++.++++|..+.+- +.+. .++..+-+.
T Consensus 11 ~~l~~~I~~~~~~~~~yG~rri~~~L~~~~~~~g~~~v~~krV~rlmr~~g--L~~~--------------~r~~~~~~~ 74 (262)
T PRK14702 11 TDVLLRIHHVIGELPTYGYRRVWALLRRQAELDGMPAINAKRVYRLMRQNA--LLLE--------------RKPAVPPSK 74 (262)
T ss_pred HHHHHHHHHHHHhCcccChHHHHHHHHhhhcccCccccCHHHHHHHHHHhC--Cccc--------------cCCCCCCCC
Confidence 45666666666667889999999999875 477 499999987644320 0000 000000000
Q ss_pred EEEEecccccccccceeEEEEeecchHHHHHhhcccEEEEeCeeecCCCCcceEEEEEEeCCCCeEEEEEEEeec-ccch
Q 043258 131 VLVETETHESREEERFKRVFVCCARTSYAFKVHCRGILAVDGWEINNPCNSVMLVAAGLDGNNGILPVAFCEVQV-EDLD 209 (454)
Q Consensus 131 ~~~~~~~~~~~~~~~f~~~f~~~~~~~~~~~~~~~~vi~~D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~-E~~e 209 (454)
......| . ...-..++..|-||.....++.++.++-+|...+ .++|+++... .+.+
T Consensus 75 ---------~~~~~~~----~---------~~~pn~~W~~DiT~~~~~~g~~~Yl~~viD~~sR-~ivg~~is~~~~~~~ 131 (262)
T PRK14702 75 ---------RAHTGRV----A---------VKESNQRWCSDGFEFCCDNGERLRVTFALDCCDR-EALHWAVTTGGFNSE 131 (262)
T ss_pred ---------cCCCCcc----c---------cCCCCCEEEeeeEEEEecCCcEEEEEEEEecccc-eeeeEEeccCcCCHH
Confidence 0000001 0 0113468999999987654556888888887776 6788888874 5666
Q ss_pred hHHHHHHH-HHhhccc-cCCCcEEEEcCCcchH
Q 043258 210 SWVYFLKN-INSALRL-ENGKGLCILGDGDNGV 240 (454)
Q Consensus 210 ~~~w~l~~-l~~~~~~-~~~~~~~iitD~~~~l 240 (454)
.-.-+|+. +....+. ....|..|.||+....
T Consensus 132 ~v~~~l~~A~~~~~~~~~~~~~~iihSD~Gsqy 164 (262)
T PRK14702 132 TVQDVMLGAVERRFGNDLPSSPVEWLTDNGSCY 164 (262)
T ss_pred HHHHHHHHHHHHHhcccCCCCCeEEEcCCCccc
Confidence 65556654 3333210 1335788999998754
No 21
>PF00665 rve: Integrase core domain; InterPro: IPR001584 Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus (ASV) have been studied most carefully with respect to the structural basis of catalysis. Although the active site of ASV integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis. Retroviral integrase is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of insertion sequences also belong to this group. HIV integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration.
To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity.; GO: 0015074 DNA integration; PDB: 3AO3_A 3OVN_A 3AO5_A 3AO4_A 3AO1_A 1C6V_D 3HPG_A 3HPH_A 3OYD_A 3OYF_B ....
Probab=88.08 E-value=2.4 Score=34.47 Aligned_cols=76 Identities=12% Similarity=-0.037 Sum_probs=54.1
Q ss_pred ccEEEEeCeeec-CCCCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHH
Q 043258 165 RGILAVDGWEIN-NPCNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYA 243 (454)
Q Consensus 165 ~~vi~~D~t~~~-~~y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~A 243 (454)
...+.+|.+... ...++..++.+.+|..-.. .+++.+-..++.+....+|.......+ ...|.+|+||++....+.
T Consensus 6 ~~~~~~D~~~~~~~~~~~~~~~~~~iD~~S~~-~~~~~~~~~~~~~~~~~~l~~~~~~~~--~~~p~~i~tD~g~~f~~~ 82 (120)
T PF00665_consen 6 GERWQIDFTPMPIPDKGGRVYLLVFIDDYSRF-IYAFPVSSKETAEAALRALKRAIEKRG--GRPPRVIRTDNGSEFTSH 82 (120)
T ss_dssp TTEEEEEEEEETGGCTT-CEEEEEEEETTTTE-EEEEEESSSSHHHHHHHHHHHHHHHHS---SE-SEEEEESCHHHHSH
T ss_pred CCEEEEeeEEEecCCCCccEEEEEEEECCCCc-EEEEEeecccccccccccccccccccc--cccceecccccccccccc
Confidence 457889999766 3456688899999977665 446777777788888888876666652 333999999999987643
No 22
>PF04937 DUF659: Protein of unknown function (DUF 659); InterPro: IPR007021 These are transposase-like proteins with no known function.
Probab=78.72 E-value=25 Score=30.41 Aligned_cols=64 Identities=13% Similarity=0.209 Sum_probs=46.9
Q ss_pred ccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHH---hhhcCCccccccHHHHHHhHHhhCCCc
Q 043258 206 EDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAV---EEFLPRAVYRQCCHRIFNEMVRRFPTA 271 (454)
Q Consensus 206 E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai---~~vfP~a~~~~C~~Hi~~n~~~~~~~~ 271 (454)
.+.+..--+|+...+.+ +....+-||||......+|- .+-+|.....-|..|-+.-+.+.+...
T Consensus 73 ~~a~~l~~ll~~vIeeV--G~~nVvqVVTDn~~~~~~a~~~L~~k~p~ifw~~CaaH~inLmledi~k~ 139 (153)
T PF04937_consen 73 KTAEYLFELLDEVIEEV--GEENVVQVVTDNASNMKKAGKLLMEKYPHIFWTPCAAHCINLMLEDIGKL 139 (153)
T ss_pred ccHHHHHHHHHHHHHHh--hhhhhhHHhccCchhHHHHHHHHHhcCCCEEEechHHHHHHHHHHHHhcC
Confidence 44555556666655555 35566779999999988884 455899999999999998887766543
No 23
>PF12762 DDE_Tnp_IS1595: ISXO2-like transposase domain; InterPro: IPR024445 This domain probably functions as an integrase that is found in a wide variety of transposases, including ISXO2.
Probab=71.52 E-value=14 Score=31.62 Aligned_cols=69 Identities=17% Similarity=0.181 Sum_probs=44.2
Q ss_pred cEEEEeCeeecCCC--------------CcceEEEEEEeCC-CCeEEEEEEEeecccchhHHHHHHHHHhhccccCCCcE
Q 043258 166 GILAVDGWEINNPC--------------NSVMLVAAGLDGN-NGILPVAFCEVQVEDLDSWVYFLKNINSALRLENGKGL 230 (454)
Q Consensus 166 ~vi~~D~t~~~~~y--------------~~~l~~~~g~d~~-~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~~~~~~ 230 (454)
.+|-+|.||..++- .....++++++-+ +..--+...++++.+.++..-+++...+ +..
T Consensus 4 G~VEiDEty~~~~~~~~~~~~~~~gr~~~~k~~V~~~ver~~~~~~~~~~~~v~~~~~~tl~~~i~~~i~-------~gs 76 (151)
T PF12762_consen 4 GIVEIDETYFGGRKNKKPRRKGKRGRGSKNKVPVFGAVERNDGGTGRVFMFVVPDRSAETLKPIIQEHIE-------PGS 76 (151)
T ss_pred CEEEeCcCEECCcccccccCCCCCCCcCCCCcEEEEEEeecccCCceEEEEeecccccchhHHHHHHhhh-------ccc
Confidence 36778888875433 2245666666665 4444555556688888887766654333 335
Q ss_pred EEEcCCcchHH
Q 043258 231 CILGDGDNGVE 241 (454)
Q Consensus 231 ~iitD~~~~l~ 241 (454)
+|+||+.++-.
T Consensus 77 ~i~TD~~~aY~ 87 (151)
T PF12762_consen 77 TIITDGWRAYN 87 (151)
T ss_pred eeeecchhhcC
Confidence 78999998753
No 24
>PRK13907 rnhA ribonuclease H; Provisional
Probab=70.53 E-value=35 Score=28.14 Aligned_cols=78 Identities=18% Similarity=0.079 Sum_probs=44.8
Q ss_pred EEEEeCeeecCCCCcceEEEEEEeCCCCeEEEEEEE-eecccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHHh
Q 043258 167 ILAVDGWEINNPCNSVMLVAAGLDGNNGILPVAFCE-VQVEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAVE 245 (454)
Q Consensus 167 vi~~D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al-~~~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai~ 245 (454)
.|.+||.+..+.-.+-.-.++ .|..+.. .+++.. ..+.+..-+.-++..|+.+.. .+..++.|-||. +.+.+++.
T Consensus 3 ~iy~DGa~~~~~g~~G~G~vi-~~~~~~~-~~~~~~~~~tn~~AE~~All~aL~~a~~-~g~~~v~i~sDS-~~vi~~~~ 78 (128)
T PRK13907 3 EVYIDGASKGNPGPSGAGVFI-KGVQPAV-QLSLPLGTMSNHEAEYHALLAALKYCTE-HNYNIVSFRTDS-QLVERAVE 78 (128)
T ss_pred EEEEeeCCCCCCCccEEEEEE-EECCeeE-EEEecccccCCcHHHHHHHHHHHHHHHh-CCCCEEEEEech-HHHHHHHh
Confidence 378999998765332222222 4555543 333221 234455567777888887774 234567788876 55666666
Q ss_pred hhc
Q 043258 246 EFL 248 (454)
Q Consensus 246 ~vf 248 (454)
..+
T Consensus 79 ~~~ 81 (128)
T PRK13907 79 KEY 81 (128)
T ss_pred HHH
Confidence 654
No 25
>PF04500 FLYWCH: FLYWCH zinc finger domain; InterPro: IPR007588 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array.
C2H2 Znfs can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znfs are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znfs (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znfs can also bind to RNA and protein targets. This entry represents a potential FLYWCH Zn-finger domain found in a number of eukaryotic proteins. FLYWCH is a C2H2-type zinc finger characterised by five conserved hydrophobic residues, containing the conserved sequence motif: F/Y-X(n)-L-X(n)-F/Y-X(n)-WXCX(6-12)CX(17-22)HXH where X indicates any amino acid. This domain was first characterised in Drosophila Modifier of mdg4 proteins, Mod(mdg4), putative chromatin modulators involved in higher order chromatin domains. Mod(mdg4) proteins share a common N-terminal BTB/POZ domain, but differ in their C-terminal region, most containing C-terminal FLYWCH zinc finger motifs. The FLYWCH domain in Mod(mdg4) proteins has a putative role in protein-protein interactions; for example, Mod(mdg4)-67.2 interacts with DNA-binding protein Su(Hw) via its FLYWCH domain. FLYWCH domains have been described in other proteins as well, including suppressor of killer of prune, Su(Kpn), which contains 4 terminal FLYWCH zinc finger motifs in a tandem array and a C-terminal glutathione S-transferase (GST) domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; PDB: 2RPR_A.
Probab=68.61 E-value=5.2 Score=28.17 Aligned_cols=35 Identities=14% Similarity=0.163 Sum_probs=16.0
Q ss_pred CeEEEEEecc---CCCccEEEEEEeCCCCeEEEEEecCCccc
Q 043258 5 SHIVSCECSD---LSCDWQVTAIRDVRGKGFVITQFSPKHNC 43 (454)
Q Consensus 5 ~~r~~~~C~~---~gC~~~v~~~~~~~~~~~~v~~~~~~Hnc 43 (454)
.....-+|+. .+|+++|... .++ -.|.....+|||
T Consensus 25 ~~~~~WrC~~~~~~~C~a~~~~~--~~~--~~~~~~~~~HnH 62 (62)
T PF04500_consen 25 DGKTYWRCSRRRSHGCRARLITD--AGD--GRVVRTNGEHNH 62 (62)
T ss_dssp SS-EEEEEGGGTTS----EEEEE----T--TEEEE-S---SS
T ss_pred CCcEEEEeCCCCCCCCeEEEEEE--CCC--CEEEECCCccCC
Confidence 3455678885 3899999988 222 345566688998
No 26
>COG5431 Uncharacterized metal-binding protein [Function unknown]
Probab=67.31 E-value=1.5 Score=34.60 Aligned_cols=31 Identities=32% Similarity=0.480 Sum_probs=21.5
Q ss_pred CCceEEEEcccceeecCCccc-----CCCCchhHHHHH
Q 043258 381 DGRRFILNMDAMSCSCGLWQI-----SGIPCAHACRGI 413 (454)
Q Consensus 381 ~~~~~~V~l~~~~CsC~~~~~-----~giPC~H~lav~ 413 (454)
.++.|+++.+ .|||..|-. ---||.|+++.-
T Consensus 41 ~~rdYIl~~g--fCSCp~~~~svvl~Gk~~C~Hi~glk 76 (117)
T COG5431 41 KERDYILEGG--FCSCPDFLGSVVLKGKSPCAHIIGLK 76 (117)
T ss_pred cccceEEEcC--cccCHHHHhHhhhcCcccchhhhhee
Confidence 3446666665 999998762 235899998753
No 27
>PF13592 HTH_33: Winged helix-turn helix
Probab=64.85 E-value=12 Score=26.69 Aligned_cols=31 Identities=23% Similarity=0.303 Sum_probs=26.3
Q ss_pred CCCCHHHHHHHHHHHhCCccCHHHHHHHHHH
Q 043258 69 PSISTTEVRNEIESMYGIKCPEWKVFCAANR 99 (454)
Q Consensus 69 ~~~~~~~i~~~l~~~~g~~~s~~~~~rak~~ 99 (454)
+-++..+|.+.|.+.+|+.+|.+.+++.-++
T Consensus 3 ~~wt~~~i~~~I~~~fgv~ys~~~v~~lL~r 33 (60)
T PF13592_consen 3 GRWTLKEIAAYIEEEFGVKYSPSGVYRLLKR 33 (60)
T ss_pred CcccHHHHHHHHHHHHCCEEcHHHHHHHHHH
Confidence 4578899999999999999999999876443
No 28
>COG4279 Uncharacterized conserved protein [Function unknown]
Probab=60.73 E-value=5.1 Score=37.15 Aligned_cols=24 Identities=25% Similarity=0.588 Sum_probs=19.4
Q ss_pred cceeecCCcccCCCCchhHHHHHHHhc
Q 043258 391 AMSCSCGLWQISGIPCAHACRGIKYMR 417 (454)
Q Consensus 391 ~~~CsC~~~~~~giPC~H~lav~~~~~ 417 (454)
...|||..| -.||.|+-||....+
T Consensus 124 ~~dCSCPD~---anPCKHi~AvyY~la 147 (266)
T COG4279 124 STDCSCPDY---ANPCKHIAAVYYLLA 147 (266)
T ss_pred ccccCCCCc---ccchHHHHHHHHHHH
Confidence 456999975 579999999987764
No 29
>PF01498 HTH_Tnp_Tc3_2: Transposase; InterPro: IPR002492 Transposase proteins are necessary for efficient DNA transposition. This family includes the amino-terminal region of Tc1, Tc1A, Tc1B and Tc2B transposases of Caenorhabditis elegans. The region encompasses the specific DNA binding and second DNA recognition domains as well as an amino-terminal region of the catalytic domain of Tc3. Tc3 is a member of the Tc1/mariner family of transposable elements. This entry also includes histone-lysine N-methyltransferase SETMAR, which is a SET domain and mariner transposase fusion gene-containing protein. This histone methyltransferase has sequence-specific DNA-binding activity and recognises the 19-mer core of the 5'-terminal inverted repeats (TIRs) of the Hsmar1 element. This protein has DNA nicking activity and in vivo end joining activity, and may mediate genomic integration of foreign DNA. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated, 0015074 DNA integration; PDB: 3K9K_B 3F2K_B 3K9J_B 1U78_A.
Probab=59.76 E-value=9 Score=28.20 Aligned_cols=38 Identities=18% Similarity=0.326 Sum_probs=17.0
Q ss_pred HhhhhccCCCCCHHHHHHHHHHHhCCccCHHHHHHHHHH
Q 043258 61 FLHRWKEQPSISTTEVRNEIESMYGIKCPEWKVFCAANR 99 (454)
Q Consensus 61 ~~~~~~~~~~~~~~~i~~~l~~~~g~~~s~~~~~rak~~ 99 (454)
+...+..+|..+..+|...+... |..+|..++.|.-..
T Consensus 4 I~~~v~~~p~~s~~~i~~~l~~~-~~~vS~~TI~r~L~~ 41 (72)
T PF01498_consen 4 IVRMVRRNPRISAREIAQELQEA-GISVSKSTIRRRLRE 41 (72)
T ss_dssp ------------HHHHHHHT----T--S-HHHHHHHHHH
T ss_pred HHHHHHHCCCCCHHHHHHHHHHc-cCCcCHHHHHHHHHH
Confidence 44556678999999999999988 999999999876443
No 30
>PF08766 DEK_C: DEK C terminal domain; InterPro: IPR014876 DEK is a chromatin-associated protein that is linked with cancers and autoimmune disease. This domain is found at the C terminus of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients. The structure of this domain shows it to be homologous to the E2F/DP transcription factor family. This domain is also found in chitin synthase proteins like Q8TF96 from SWISSPROT, and in protein phosphatases such as Q6NN85 from SWISSPROT.; PDB: 1Q1V_A.
Probab=53.48 E-value=33 Score=23.78 Aligned_cols=38 Identities=18% Similarity=0.346 Sum_probs=24.7
Q ss_pred HHHHHHhhhhcc-C-CCCCHHHHHHHHHHHhCCccCHHHH
Q 043258 56 WISAMFLHRWKE-Q-PSISTTEVRNEIESMYGIKCPEWKV 93 (454)
Q Consensus 56 ~i~~~~~~~~~~-~-~~~~~~~i~~~l~~~~g~~~s~~~~ 93 (454)
-+...+.+.+.+ + ..++.++|...|.+.+|.+++..+.
T Consensus 4 ~i~~~i~~iL~~~dl~~vT~k~vr~~Le~~~~~dL~~~K~ 43 (54)
T PF08766_consen 4 EIREAIREILREADLDTVTKKQVREQLEERFGVDLSSRKK 43 (54)
T ss_dssp HHHHHHHHHHTTS-GGG--HHHHHHHHHHH-SS--SHHHH
T ss_pred HHHHHHHHHHHhCCHhHhhHHHHHHHHHHHHCCCcHHHHH
Confidence 355667777775 3 4689999999999999999986553
No 31
>PF13276 HTH_21: HTH-like domain
Probab=45.17 E-value=46 Score=23.38 Aligned_cols=42 Identities=17% Similarity=0.252 Sum_probs=33.5
Q ss_pred HHHHHhhhhccC-CCCCHHHHHHHHHHHhCCccCHHHHHHHHH
Q 043258 57 ISAMFLHRWKEQ-PSISTTEVRNEIESMYGIKCPEWKVFCAAN 98 (454)
Q Consensus 57 i~~~~~~~~~~~-~~~~~~~i~~~l~~~~g~~~s~~~~~rak~ 98 (454)
+.+.+.+....+ +.+....|...|+++.|+.+|..+++|..+
T Consensus 6 l~~~I~~i~~~~~~~yG~rri~~~L~~~~~~~v~~krV~RlM~ 48 (60)
T PF13276_consen 6 LRELIKEIFKESKPTYGYRRIWAELRREGGIRVSRKRVRRLMR 48 (60)
T ss_pred HHHHHHHHHHHcCCCeehhHHHHHHhccCcccccHHHHHHHHH
Confidence 455566666654 788999999999999899999999987643
No 32
>PF08069 Ribosomal_S13_N: Ribosomal S13/S15 N-terminal domain; InterPro: IPR012606 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. This domain is found at the N terminus of ribosomal S13 and S15 proteins. This domain is also identified as NUC021.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005840 ribosome; PDB: 3U5C_N 3O30_G 3IZB_O 3O2Z_G 3U5G_N 2XZN_O 2XZM_O 3IZ6_O.
Probab=44.26 E-value=20 Score=25.65 Aligned_cols=29 Identities=10% Similarity=0.218 Sum_probs=23.6
Q ss_pred HHHHhhhhcc--CCCCCHHHHHHHHHHHhCC
Q 043258 58 SAMFLHRWKE--QPSISTTEVRNEIESMYGI 86 (454)
Q Consensus 58 ~~~~~~~~~~--~~~~~~~~i~~~l~~~~g~ 86 (454)
++++++.|.. ..+++|.+|--.|+.+||+
T Consensus 30 ~~eVe~~I~klakkG~tpSqIG~iLRD~~GI 60 (60)
T PF08069_consen 30 PEEVEELIVKLAKKGLTPSQIGVILRDQYGI 60 (60)
T ss_dssp HHHHHHHHHHHCCTTHCHHHHHHHHHHSCTC
T ss_pred HHHHHHHHHHHHHcCCCHHHhhhhhhhccCC
Confidence 4666666664 6789999999999999985
No 33
>PRK00766 hypothetical protein; Provisional
Probab=44.09 E-value=2.5e+02 Score=25.42 Aligned_cols=89 Identities=19% Similarity=0.212 Sum_probs=51.1
Q ss_pred cEEEEeC-eeecCCCCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhcc---------------------
Q 043258 166 GILAVDG-WEINNPCNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALR--------------------- 223 (454)
Q Consensus 166 ~vi~~D~-t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~--------------------- 223 (454)
.|+.||- .|..+.-+-..++-+-.-++.-..-++|+-+.-.-.|.=.-+.+.++....
T Consensus 10 rvlGidds~f~~~~~~~~~lvGvv~r~~~~idGv~~~~itvdG~DaT~~i~~mv~~~~~r~~i~~V~L~Git~agFNvvD 89 (194)
T PRK00766 10 RVLGIDDGTFLFKSSEKVILVGVVMRGGDWVDGVLSRWITVDGLDATEAIIEMVNSSRHKGQLRVIMLDGITYGGFNVVD 89 (194)
T ss_pred eEEEEecCccccCCCCCEEEEEEEEECCeEEeeEEEEEEEECCccHHHHHHHHHHhcccccceEEEEECCEeeeeeEEec
Confidence 5788884 444433333555555566666666777777666555555555544443110
Q ss_pred -----ccCCCcEEEEcCCcc---hHHHHHhhhcCCcccc
Q 043258 224 -----LENGKGLCILGDGDN---GVEYAVEEFLPRAVYR 254 (454)
Q Consensus 224 -----~~~~~~~~iitD~~~---~l~~Ai~~vfP~a~~~ 254 (454)
...+-|+.+++..-+ +|.+|+++.||+...+
T Consensus 90 ~~~l~~~tg~PVI~V~r~~p~~~~ie~AL~k~f~~~~~R 128 (194)
T PRK00766 90 IEELYRETGLPVIVVMRKKPDFEAIESALKKHFSDWEER 128 (194)
T ss_pred HHHHHHHHCCCEEEEEecCCCHHHHHHHHHHHCCCHHHH
Confidence 001246666644444 6888888888886554
No 34
>PF13082 DUF3931: Protein of unknown function (DUF3931)
Probab=42.68 E-value=54 Score=22.39 Aligned_cols=28 Identities=14% Similarity=0.025 Sum_probs=21.0
Q ss_pred CCcceEEEEEEeCCCCeEEEEEEEeecc
Q 043258 179 CNSVMLVAAGLDGNNGILPVAFCEVQVE 206 (454)
Q Consensus 179 y~~~l~~~~g~d~~~~~~~la~al~~~E 206 (454)
|...-++++|-+++|+..++.-.|-.+|
T Consensus 35 yefssfvlcgetpdgrrlvlthmistde 62 (66)
T PF13082_consen 35 YEFSSFVLCGETPDGRRLVLTHMISTDE 62 (66)
T ss_pred EEEEEEEEEccCCCCcEEEEEEEecchh
Confidence 3445677889999999888877776655
No 35
>PF13551 HTH_29: Winged helix-turn helix
Probab=39.49 E-value=54 Score=25.96 Aligned_cols=40 Identities=18% Similarity=0.267 Sum_probs=29.9
Q ss_pred HHHhhhhccCC-----CCCHHHHHHHH-HHHhCCccCHHHHHHHHH
Q 043258 59 AMFLHRWKEQP-----SISTTEVRNEI-ESMYGIKCPEWKVFCAAN 98 (454)
Q Consensus 59 ~~~~~~~~~~~-----~~~~~~i~~~l-~~~~g~~~s~~~~~rak~ 98 (454)
+.+.+.+..+| .+++..|...| ++.+|+.+|..++++.-+
T Consensus 64 ~~l~~~~~~~p~~g~~~~t~~~l~~~l~~~~~~~~~s~~ti~r~L~ 109 (112)
T PF13551_consen 64 AQLIELLRENPPEGRSRWTLEELAEWLIEEEFGIDVSPSTIRRILK 109 (112)
T ss_pred HHHHHHHHHCCCCCCCcccHHHHHHHHHHhccCccCCHHHHHHHHH
Confidence 34455555544 47889999976 888999999999987644
No 36
>TIGR00334 5S_RNA_mat_M5 ribonuclease M5. This family of orthologous proteins shows a weak but significant similarity to the central region of the DnaG-type DNA primase. The region of similarity is termed the Toprim (topoisomerase-primase) domain and is also shared by RecR, OLD family nucleases, and type IA and II topoisomerases.
Probab=36.68 E-value=51 Score=29.13 Aligned_cols=44 Identities=23% Similarity=0.244 Sum_probs=34.0
Q ss_pred HHHHHHhhccccCCCcEEEEcCCcch---HHHHHhhhcCCccccccHHHH
Q 043258 214 FLKNINSALRLENGKGLCILGDGDNG---VEYAVEEFLPRAVYRQCCHRI 260 (454)
Q Consensus 214 ~l~~l~~~~~~~~~~~~~iitD~~~~---l~~Ai~~vfP~a~~~~C~~Hi 260 (454)
.++.++.+. ....+.|+||.|.+ |.+-|.+.+|++.|.+=...-
T Consensus 37 ~i~~i~~~~---~~rgVIIfTDpD~~GekIRk~i~~~vp~~khafi~~~~ 83 (174)
T TIGR00334 37 TINLIKKAQ---KKQGVIILTDPDFPGEKIRKKIEQHLPGYENCFIPKHL 83 (174)
T ss_pred HHHHHHHHh---hcCCEEEEeCCCCchHHHHHHHHHHCCCCeEEeeeHHh
Confidence 455666655 67789999999974 888999999999998754443
No 37
>PF14420 Clr5: Clr5 domain
Probab=34.32 E-value=90 Score=21.63 Aligned_cols=26 Identities=15% Similarity=0.180 Sum_probs=22.4
Q ss_pred CCCCCHHHHHHHHHHHhCCccCHHHH
Q 043258 68 QPSISTTEVRNEIESMYGIKCPEWKV 93 (454)
Q Consensus 68 ~~~~~~~~i~~~l~~~~g~~~s~~~~ 93 (454)
+.+.+..+|++.++..||..+|..+.
T Consensus 18 ~e~~tl~~v~~~M~~~~~F~at~rqy 43 (54)
T PF14420_consen 18 DENKTLEEVMEIMKEEHGFKATKRQY 43 (54)
T ss_pred hCCCcHHHHHHHHHHHhCCCcCHHHH
Confidence 56788999999999999999986653
No 38
>PF10045 DUF2280: Uncharacterized conserved protein (DUF2280); InterPro: IPR018738 This entry is represented by Burkholderia phage Bups phi1, Orf2.36. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches.
Probab=32.96 E-value=50 Score=26.33 Aligned_cols=23 Identities=22% Similarity=0.410 Sum_probs=20.7
Q ss_pred CHHHHHHHHHHHhCCccCHHHHH
Q 043258 72 STTEVRNEIESMYGIKCPEWKVF 94 (454)
Q Consensus 72 ~~~~i~~~l~~~~g~~~s~~~~~ 94 (454)
+|+++.+.+++++|+.+|..++-
T Consensus 21 TPs~v~~aVk~eFgi~vsrQqve 43 (104)
T PF10045_consen 21 TPSEVAEAVKEEFGIDVSRQQVE 43 (104)
T ss_pred CHHHHHHHHHHHhCCccCHHHHH
Confidence 79999999999999999987764
No 39
>PF09713 A_thal_3526: Plant protein 1589 of unknown function (A_thal_3526); InterPro: IPR006476 This plant-specific family of proteins is defined by an uncharacterised region 57 residues in length. It is found toward the N terminus of most proteins that contain it. Examples include at least several proteins from Arabidopsis thaliana (Mouse-ear cress) and Oryza sativa (Rice). The function of the proteins is unknown.
Probab=31.74 E-value=44 Score=23.34 Aligned_cols=25 Identities=16% Similarity=0.089 Sum_probs=18.0
Q ss_pred CCCHHHHHHHHHHHhCCccCHH-HHH
Q 043258 70 SISTTEVRNEIESMYGIKCPEW-KVF 94 (454)
Q Consensus 70 ~~~~~~i~~~l~~~~g~~~s~~-~~~ 94 (454)
.++..++++.|.++.|+++... .+|
T Consensus 12 yMsk~E~v~~L~~~a~I~P~~T~~VW 37 (54)
T PF09713_consen 12 YMSKEECVRALQKQANIEPVFTSTVW 37 (54)
T ss_pred cCCHHHHHHHHHHHcCCChHHHHHHH
Confidence 5678899999988888775543 344
No 40
>KOG4027 consensus Uncharacterized conserved protein [Function unknown]
Probab=30.24 E-value=72 Score=27.59 Aligned_cols=35 Identities=11% Similarity=0.046 Sum_probs=30.0
Q ss_pred EeCeee-cCCCCcc--eEEEEEEeCCCCeEEEEEEEee
Q 043258 170 VDGWEI-NNPCNSV--MLVAAGLDGNNGILPVAFCEVQ 204 (454)
Q Consensus 170 ~D~t~~-~~~y~~~--l~~~~g~d~~~~~~~la~al~~ 204 (454)
||.||+ ++.|+.| ++...|.|+-|+-.+.|+|.+.
T Consensus 70 ievt~KstsPygWPqivl~vfg~d~~G~d~v~GYg~~h 107 (187)
T KOG4027|consen 70 IEVTLKSTSPYGWPQIVLNVFGKDHSGKDCVTGYGMLH 107 (187)
T ss_pred eEEEeccCCCCCCceEEEEEecCCcCCcceeeeeeeEe
Confidence 788997 4789987 6677899999999999999875
No 41
>PF11447 DUF3201: Protein of unknown function (DUF3201); InterPro: IPR024505 This archaeal family of proteins has no known function.; PDB: 1YB3_A.
Probab=29.90 E-value=2.8e+02 Score=23.35 Aligned_cols=72 Identities=13% Similarity=0.066 Sum_probs=41.9
Q ss_pred HHHHHHHHHHHhhCCCcEEEEEecccccccccceeEEEEeecchHHHHHhhcccEEEEeCeeecCCCCcceEEE-----E
Q 043258 113 AMLHQFKEEMERIDRDNIVLVETETHESREEERFKRVFVCCARTSYAFKVHCRGILAVDGWEINNPCNSVMLVA-----A 187 (454)
Q Consensus 113 ~~l~~~~~~l~~~npg~~~~~~~~~~~~~~~~~f~~~f~~~~~~~~~~~~~~~~vi~~D~t~~~~~y~~~l~~~-----~ 187 (454)
..+...-++|++.-||+.++--. . .|.-.|.+||-+..-+|--|.+.+ +
T Consensus 8 ~eif~l~eELkeel~gf~vE~v~--------e------------------vFnayi~lDgeW~em~YPhPaf~ikp~gEv 61 (150)
T PF11447_consen 8 EEIFRLNEELKEELKGFKVEEVE--------E------------------VFNAYIYLDGEWREMKYPHPAFEIKPQGEV 61 (150)
T ss_dssp HHHHHHHHHHHHHSTTSEE---E--------E------------------E-S-EEEETTEEEE--S-EEEEEEETTEEE
T ss_pred HHHHHHHHHHHHHcCCCcceeHh--------h------------------hhheeEEecCeeeeecCCCCceeeccCccc
Confidence 34556667888888898776421 1 255669999999999999887765 6
Q ss_pred EEeCCCCeEEEEEEEeecccchhHH
Q 043258 188 GLDGNNGILPVAFCEVQVEDLDSWV 212 (454)
Q Consensus 188 g~d~~~~~~~la~al~~~E~~e~~~ 212 (454)
|.+..+ +-+-+|+...+-.+.+.
T Consensus 62 Gat~q~--~YFvfav~kE~is~~Fv 84 (150)
T PF11447_consen 62 GATPQG--FYFVFAVPKEKISEEFV 84 (150)
T ss_dssp EEETTE--EEEEEEEEGGG--HHHH
T ss_pred ccccce--EEEEEEeeHHHhhHHHH
Confidence 666554 44555555555444443
No 42
>PF01316 Arg_repressor: Arginine repressor, DNA binding domain; InterPro: IPR020900 The arginine dihydrolase (AD) pathway is found in many prokaryotes and some primitive eukaryotes, an example of the latter being Giardia lamblia (Giardia intestinalis) []. The three-enzyme anaerobic pathway breaks down L-arginine to form 1 mol of ATP, carbon dioxide and ammonia. In simpler bacteria, the first enzyme, arginine deiminase, can account for up to 10% of total cell protein []. Most prokaryotic arginine deiminase pathways are under the control of a repressor gene, termed ArgR []. This is a negative regulator, and will only release the arginine deiminase operon for expression in the presence of arginine []. The crystal structure of apo-ArgR from Bacillus stearothermophilus has been determined to 2.5A by means of X-ray crystallography []. The protein exists as a hexamer of identical subunits, and is shown to have six DNA-binding domains, clustered around a central oligomeric core when bound to arginine. It predominantly interacts with A.T residues in ARG boxes. This hexameric protein binds DNA at its N terminus to repress arginine biosynthesis or activate arginine catabolism. Some species have several ArgR paralogs. In a neighbour-joining tree, some of these paralogous sequences show long branches and differ significantly from the well-conserved C-terminal region. ; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0006355 regulation of transcription, DNA-dependent, 0006525 arginine metabolic process; PDB: 1AOY_A 3V4G_A 3LAJ_D 3FHZ_A 3LAP_B 3ERE_D 2P5L_C 1F9N_D 2P5K_A 1B4A_A ....
Probab=27.93 E-value=1.2e+02 Score=22.50 Aligned_cols=41 Identities=15% Similarity=0.126 Sum_probs=26.2
Q ss_pred HHHHhhhhccCCCCCHHHHHHHHHHHhCCccCHHHHHHHHHH
Q 043258 58 SAMFLHRWKEQPSISTTEVRNEIESMYGIKCPEWKVFCAANR 99 (454)
Q Consensus 58 ~~~~~~~~~~~~~~~~~~i~~~l~~~~g~~~s~~~~~rak~~ 99 (454)
.+.+.+.|...+-.+-.+|++.|++. |+.++..++.|--+.
T Consensus 7 ~~~I~~li~~~~i~sQ~eL~~~L~~~-Gi~vTQaTiSRDLke 47 (70)
T PF01316_consen 7 QELIKELISEHEISSQEELVELLEEE-GIEVTQATISRDLKE 47 (70)
T ss_dssp HHHHHHHHHHS---SHHHHHHHHHHT-T-T--HHHHHHHHHH
T ss_pred HHHHHHHHHHCCcCCHHHHHHHHHHc-CCCcchhHHHHHHHH
Confidence 34566677777767788999999875 999999999876443
No 43
>PF03705 CheR_N: CheR methyltransferase, all-alpha domain; InterPro: IPR022641 CheR proteins are part of the chemotaxis signaling mechanism which methylates the chemotaxis receptor at specific glutamate residues. This entry refers to the N-terminal domain of the CherR-type MCP methyltransferases, which are found in bacteria, archaea and green plants. This entry is found in association with PF01739 from PFAM. Methyl transfer from the ubiquitous S-adenosyl-L-methionine (AdoMet) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms ranging from bacteria to plants and mammals. The reaction is catalysed by methyltransferases (Mtases) and modifies DNA, RNA, proteins and small molecules, such as catechol for regulatory purposes. The various aspects of the role of DNA methylation in prokaryotic restriction-modification systems and in a number of cellular processes in eukaryotes including gene regulation and differentiation is well documented. Three classes of DNA Mtases transfer the methyl group from AdoMet to the target base to form either N-6-methyladenine, or N-4-methylcytosine, or C-5- methylcytosine. In C-5-cytosine Mtases, ten conserved motifs are arranged in the same order []. Motif I (a glycine-rich or closely related consensus sequence; FAGxGG in M.HhaI []), shared by other AdoMet-Mtases [], is part of the cofactor binding site and motif IV (PCQ) is part of the catalytic site. In contrast, sequence comparison among N-6-adenine and N-4-cytosine Mtases indicated two of the conserved segments [], although more conserved segments may be present. One of them corresponds to motif I in C-5-cytosine Mtases, and the other is named (D/N/S)PP(Y/F). Crystal structures are known for a number of Mtases [, , , ]. The cofactor binding sites are almost identical and the essential catalytic amino acids coincide. 
The comparable protein folding and the existence of equivalent amino acids in similar secondary and tertiary positions indicate that many (if not all) AdoMet-Mtases have a common catalytic domain structure. This permits tertiary structure prediction of other DNA, RNA, protein, and small-molecule AdoMet-Mtases from their amino acid sequences []. Flagellated bacteria swim towards favourable chemicals and away from deleterious ones. Sensing of chemoeffector gradients involves chemotaxis receptors, transmembrane (TM) proteins that detect stimuli through their periplasmic domains and transduce the signals via their cytoplasmic domains []. Signalling outputs from these receptors are influenced both by the binding of the chemoeffector ligand to their periplasmic domains and by methylation of specific glutamate residues on their cytoplasmic domains. Methylation is catalysed by CheR, an S-adenosylmethionine-dependent methyltransferase [], which reversibly methylates specific glutamate residues within a coiled coil region, to form gamma-glutamyl methyl ester residues [, ]. The structure of the Salmonella typhimurium chemotaxis receptor methyltransferase CheR, bound to S-adenosylhomocysteine, has been determined to a resolution of 2.0 A []. The structure reveals CheR to be a two-domain protein, with a smaller N-terminal helical domain linked via a single polypeptide connection to a larger C-terminal alpha/beta domain. The C-terminal domain has the characteristics of a nucleotide-binding fold, with an insertion of a small anti-parallel beta-sheet subdomain. The S-adenosylhomocysteine-binding site is formed mainly by the large domain, with contributions from residues within the N-terminal domain and the linker region [].; PDB: 1AF7_A 1BC5_A.
Probab=26.51 E-value=1.8e+02 Score=19.86 Aligned_cols=47 Identities=19% Similarity=0.211 Sum_probs=22.3
Q ss_pred HHHHHHHHHHhCCccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHH
Q 043258 74 TEVRNEIESMYGIKCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEME 123 (454)
Q Consensus 74 ~~i~~~l~~~~g~~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~ 123 (454)
..+.+.|.+..|+.++..+-.-.+.++...+. ..++..+.+|++.++
T Consensus 6 ~~~~~~i~~~~Gi~l~~~K~~~l~rRl~~rm~---~~~~~~~~~y~~~L~ 52 (57)
T PF03705_consen 6 ERFRELIYRRTGIDLSEYKRSLLERRLARRMR---ALGLPSFAEYYELLR 52 (57)
T ss_dssp HHHHHHHHHHH-----GGGHHHHHHHHHHHHH---HHT---HHHHHHHHH
T ss_pred HHHHHHHHHHHCCCCchhhHHHHHHHHHHHHH---HcCCCCHHHHHHHHH
Confidence 45778899999999887664444444433322 223345556666653
No 44
>KOG2909 consensus Vacuolar H+-ATPase V1 sector, subunit C [Energy production and conversion]
Probab=26.50 E-value=2.1e+02 Score=28.12 Aligned_cols=29 Identities=10% Similarity=0.082 Sum_probs=24.8
Q ss_pred ChHHHHHHHHHHHHHHHhhCCCcEEEEEe
Q 043258 107 DYDDGYAMLHQFKEEMERIDRDNIVLVET 135 (454)
Q Consensus 107 ~~~~~~~~l~~~~~~l~~~npg~~~~~~~ 135 (454)
...+.|+.+..=++.++++.-|+.....+
T Consensus 139 ~r~a~yn~ak~nl~nlerK~~GsL~~rsL 167 (381)
T KOG2909|consen 139 TRAAAYNNAKGNLQNLERKKTGSLSTRSL 167 (381)
T ss_pred HHHHHHHhHHHHHHHHhhhccCChhhhhH
Confidence 45678999999999999999999887755
No 45
>TIGR01529 argR_whole arginine repressor. This model includes most members of the arginine-responsive transcriptional regulator family ArgR. This hexameric protein binds DNA at its amino end to repress arginine biosynthesis or activate arginine catabolism. Some species have several ArgR paralogs. In a neighbor-joining tree, some of these paralogous sequences show long branches and differ significantly in an otherwise well-conserved C-terminal region motif GT[VIL][AC]GDDT. These paralogs are excluded from the seed and score in the gray zone of this model, between trusted and noise cutoffs.
Probab=23.47 E-value=1.4e+02 Score=25.56 Aligned_cols=36 Identities=14% Similarity=0.050 Sum_probs=29.0
Q ss_pred HHhhhhccCCCCCHHHHHHHHHHHhCCccCHHHHHHH
Q 043258 60 MFLHRWKEQPSISTTEVRNEIESMYGIKCPEWKVFCA 96 (454)
Q Consensus 60 ~~~~~~~~~~~~~~~~i~~~l~~~~g~~~s~~~~~ra 96 (454)
.++..+.+++-.+..+|.+.|+ +.|+.+|..++||.
T Consensus 6 ~i~~Li~~~~i~tqeeL~~~L~-~~G~~vsqaTIsRd 41 (146)
T TIGR01529 6 RIKEIITEEKISTQEELVALLK-AEGIEVTQATVSRD 41 (146)
T ss_pred HHHHHHHcCCCCCHHHHHHHHH-HhCCCcCHHHHHHH
Confidence 4555666777788999999995 45999999999983
No 46
>cd00131 PAX Paired Box domain
Probab=22.93 E-value=1.7e+02 Score=24.37 Aligned_cols=39 Identities=13% Similarity=0.065 Sum_probs=29.5
Q ss_pred HHHhhhhccCCCCCHHHHHHHHHHHhCC-----ccCHHHHHHHHH
Q 043258 59 AMFLHRWKEQPSISTTEVRNEIESMYGI-----KCPEWKVFCAAN 98 (454)
Q Consensus 59 ~~~~~~~~~~~~~~~~~i~~~l~~~~g~-----~~s~~~~~rak~ 98 (454)
+.+...+..+|+.+..+|.+.|.. .|+ .+|.++++|+-+
T Consensus 82 ~~i~~~v~~~p~~Tl~El~~~L~~-~gv~~~~~~~s~stI~R~L~ 125 (128)
T cd00131 82 KKIEIYKQENPGMFAWEIRDRLLQ-EGVCDKSNVPSVSSINRILR 125 (128)
T ss_pred HHHHHHHHHCCCCCHHHHHHHHHH-cCCcccCCCCCHHHHHHHHH
Confidence 334445678999999999999864 466 579999988743
No 47
>PF07761 DUF1617: Protein of unknown function (DUF1617); InterPro: IPR011675 This entry is represented by Bacteriophage phi3396 (Streptococcus phage phi3396), Orf51 (Orf: phi3396_51). The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. This entry is found in a family of hypothetical bacterial and bacteriophage proteins. The region represented by this entry is approximately 150 residues long and is highly conserved throughout the family.
Probab=20.44 E-value=4.3e+02 Score=22.60 Aligned_cols=36 Identities=14% Similarity=0.090 Sum_probs=24.2
Q ss_pred CCHHHHHHHHHHHhCCccCHHHHHHHHHHHHHHhcc
Q 043258 71 ISTTEVRNEIESMYGIKCPEWKVFCAANRAKQILGL 106 (454)
Q Consensus 71 ~~~~~i~~~l~~~~g~~~s~~~~~rak~~~~~~~~g 106 (454)
++-++|.....--.++.+.-.++.|+|.+.++.+..
T Consensus 5 lkN~eL~~i~~~L~~iklk~~kaSraRtKLi~~v~~ 40 (143)
T PF07761_consen 5 LKNKELNPIYNFLEKIKLKNMKASRARTKLIKLVEE 40 (143)
T ss_pred eehHHHHHHHHHHHhcccccchhhHHHHHHHHHHHH
Confidence 344556655555556777666888999998876643
Done!