Query         043258
Match_columns 454
No_of_seqs    189 out of 1558
Neff          8.8 
Searched_HMMs 46136
Date          Fri Mar 29 03:05:52 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/043258.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/043258hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PLN03097 FHY3 Protein FAR-RED  100.0 3.9E-64 8.4E-69  531.4  33.1  398   12-435   156-624 (846)
  2 PF00872 Transposase_mut:  Tran  99.9 8.5E-24 1.8E-28  211.2   6.8  236   58-345   100-367 (381)
  3 PF10551 MULE:  MULE transposas  99.9 9.5E-23 2.1E-27  163.1   8.9   90  172-265     1-93  (93)
  4 COG3328 Transposase and inacti  99.7   1E-15 2.3E-20  150.2  17.0  223   68-344    98-345 (379)
  5 smart00575 ZnF_PMZ plant mutat  99.1 4.4E-11 9.6E-16   71.9   2.2   27  392-418     1-27  (28)
  6 PF03108 DBD_Tnp_Mut:  MuDR fam  98.2 1.9E-06 4.1E-11   63.9   4.6   34    1-34     34-67  (67)
  7 PF04434 SWIM:  SWIM zinc finge  98.2 9.2E-07   2E-11   58.3   2.6   30  387-416    10-39  (40)
  8 PF03101 FAR1:  FAR1 DNA-bindin  97.5 0.00012 2.7E-09   57.6   4.5   33   14-47     59-91  (91)
  9 PF08731 AFT:  Transcription fa  97.3 0.00038 8.2E-09   55.6   4.3   31   15-45     81-111 (111)
 10 PF13610 DDE_Tnp_IS240:  DDE do  96.9 0.00053 1.1E-08   58.8   2.1   81  165-251     1-81  (140)
 11 PF06782 UPF0236:  Uncharacteri  96.1    0.15 3.4E-06   52.7  14.7   78  205-283   235-313 (470)
 12 PF03106 WRKY:  WRKY DNA -bindi  95.9   0.026 5.7E-07   40.5   5.6   42    3-44     18-59  (60)
 13 PHA02517 putative transposase   95.8    0.11 2.3E-06   49.9  11.4  151   56-240    30-181 (277)
 14 PF01610 DDE_Tnp_ISL3:  Transpo  95.7   0.023 4.9E-07   53.7   6.4   93  168-268     1-96  (249)
 15 PF03050 DDE_Tnp_IS66:  Transpo  95.5   0.065 1.4E-06   51.3   8.6  134   68-270    18-156 (271)
 16 smart00774 WRKY DNA binding do  93.4    0.16 3.6E-06   36.1   4.3   40    4-43     19-59  (59)
 17 COG3316 Transposase and inacti  92.0     2.3 4.9E-05   38.8  10.8   83  165-254    70-152 (215)
 18 PRK09409 IS2 transposase TnpB;  89.9     5.5 0.00012   38.8  12.3  147   55-240    50-203 (301)
 19 PF13565 HTH_32:  Homeodomain-l  89.6    0.75 1.6E-05   34.5   4.9   40   57-96     35-76  (77)
 20 PRK14702 insertion element IS2  89.4     5.5 0.00012   37.9  11.7  147   55-240    11-164 (262)
 21 PF00665 rve:  Integrase core d  88.1     2.4 5.1E-05   34.5   7.3   76  165-243     6-82  (120)
 22 PF04937 DUF659:  Protein of un  78.7      25 0.00055   30.4   9.9   64  206-271    73-139 (153)
 23 PF12762 DDE_Tnp_IS1595:  ISXO2  71.5      14  0.0003   31.6   6.6   69  166-241     4-87  (151)
 24 PRK13907 rnhA ribonuclease H;   70.5      35 0.00076   28.1   8.6   78  167-248     3-81  (128)
 25 PF04500 FLYWCH:  FLYWCH zinc f  68.6     5.2 0.00011   28.2   2.7   35    5-43     25-62  (62)
 26 COG5431 Uncharacterized metal-  67.3     1.5 3.3E-05   34.6  -0.4   31  381-413    41-76  (117)
 27 PF13592 HTH_33:  Winged helix-  64.8      12 0.00025   26.7   3.9   31   69-99      3-33  (60)
 28 COG4279 Uncharacterized conser  60.7     5.1 0.00011   37.2   1.6   24  391-417   124-147 (266)
 29 PF01498 HTH_Tnp_Tc3_2:  Transp  59.8       9  0.0002   28.2   2.7   38   61-99      4-41  (72)
 30 PF08766 DEK_C:  DEK C terminal  53.5      33 0.00071   23.8   4.5   38   56-93      4-43  (54)
 31 PF13276 HTH_21:  HTH-like doma  45.2      46   0.001   23.4   4.3   42   57-98      6-48  (60)
 32 PF08069 Ribosomal_S13_N:  Ribo  44.3      20 0.00043   25.7   2.1   29   58-86     30-60  (60)
 33 PRK00766 hypothetical protein;  44.1 2.5E+02  0.0053   25.4   9.7   89  166-254    10-128 (194)
 34 PF13082 DUF3931:  Protein of u  42.7      54  0.0012   22.4   3.9   28  179-206    35-62  (66)
 35 PF13551 HTH_29:  Winged helix-  39.5      54  0.0012   26.0   4.4   40   59-98     64-109 (112)
 36 TIGR00334 5S_RNA_mat_M5 ribonu  36.7      51  0.0011   29.1   4.0   44  214-260    37-83  (174)
 37 PF14420 Clr5:  Clr5 domain      34.3      90  0.0019   21.6   4.2   26   68-93     18-43  (54)
 38 PF10045 DUF2280:  Uncharacteri  33.0      50  0.0011   26.3   3.0   23   72-94     21-43  (104)
 39 PF09713 A_thal_3526:  Plant pr  31.7      44 0.00096   23.3   2.2   25   70-94     12-37  (54)
 40 KOG4027 Uncharacterized conser  30.2      72  0.0016   27.6   3.7   35  170-204    70-107 (187)
 41 PF11447 DUF3201:  Protein of u  29.9 2.8E+02  0.0061   23.4   6.9   72  113-212     8-84  (150)
 42 PF01316 Arg_repressor:  Argini  27.9 1.2E+02  0.0025   22.5   4.1   41   58-99      7-47  (70)
 43 PF03705 CheR_N:  CheR methyltr  26.5 1.8E+02  0.0039   19.9   4.8   47   74-123     6-52  (57)
 44 KOG2909 Vacuolar H+-ATPase V1   26.5 2.1E+02  0.0046   28.1   6.5   29  107-135   139-167 (381)
 45 TIGR01529 argR_whole arginine   23.5 1.4E+02   0.003   25.6   4.4   36   60-96      6-41  (146)
 46 cd00131 PAX Paired Box domain   22.9 1.7E+02  0.0037   24.4   4.7   39   59-98     82-125 (128)
 47 PF07761 DUF1617:  Protein of u  20.4 4.3E+02  0.0092   22.6   6.5   36   71-106     5-40  (143)

No 1  
>PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional
Probab=100.00  E-value=3.9e-64  Score=531.39  Aligned_cols=398  Identities=14%  Similarity=0.224  Sum_probs=312.5

Q ss_pred             eccCCCccEEEEEEeCCCCeEEEEEecCCcccCCCCCCCc-cchHHHHHHHhhhhccCCCCCHHHHHHHHHHHhCCccCH
Q 043258           12 CSDLSCDWQVTAIRDVRGKGFVITQFSPKHNCPRLDHAFH-PASKWISAMFLHRWKEQPSISTTEVRNEIESMYGIKCPE   90 (454)
Q Consensus        12 C~~~gC~~~v~~~~~~~~~~~~v~~~~~~Hnc~~~~~~~~-~~s~~i~~~~~~~~~~~~~~~~~~i~~~l~~~~g~~~s~   90 (454)
                      |+.+||+++|++++. .++.|.|+.++.+|||++.+.... ..++.+-..+...+....++..      ++..     ..
T Consensus       156 ~tRtGC~A~m~Vk~~-~~gkW~V~~fv~eHNH~L~p~~~~~~~~r~~~~~~~~~~~~~~~v~~------~~~d-----~~  223 (846)
T PLN03097        156 CAKTDCKASMHVKRR-PDGKWVIHSFVKEHNHELLPAQAVSEQTRKMYAAMARQFAEYKNVVG------LKND-----SK  223 (846)
T ss_pred             ccCCCCceEEEEEEc-CCCeEEEEEEecCCCCCCCCccccchhhhhhHHHHHhhhhccccccc------cchh-----hc
Confidence            566799999999875 447899999999999999974321 1111111111111111000000      0000     00


Q ss_pred             HHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcEEEEEecccccccccceeEEEEeecchHHHHHhhcccEEEE
Q 043258           91 WKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNIVLVETETHESREEERFKRVFVCCARTSYAFKVHCRGILAV  170 (454)
Q Consensus        91 ~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~~~~~~~~~~~~~~~~f~~~f~~~~~~~~~~~~~~~~vi~~  170 (454)
                      ....+.+.+.   +   ...+.+.|.+|++.++.+||+|+|++++     |++++++++||+++.++.+|. +|+|||.+
T Consensus       224 ~~~~~~r~~~---~---~~gD~~~ll~yf~~~q~~nP~Ffy~~ql-----De~~~l~niFWaD~~sr~~Y~-~FGDvV~f  291 (846)
T PLN03097        224 SSFDKGRNLG---L---EAGDTKILLDFFTQMQNMNSNFFYAVDL-----GEDQRLKNLFWVDAKSRHDYG-NFSDVVSF  291 (846)
T ss_pred             chhhHHHhhh---c---ccchHHHHHHHHHHHHhhCCCceEEEEE-----ccCCCeeeEEeccHHHHHHHH-hcCCEEEE
Confidence            0111111111   1   1235678999999999999999999998     889999999999999999997 59999999


Q ss_pred             eCeeecCCCCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHHhhhcCC
Q 043258          171 DGWEINNPCNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAVEEFLPR  250 (454)
Q Consensus       171 D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~  250 (454)
                      |+||++|+|++||++++|+|+|+++++||+||+.+|+.|+|.|+|++|+++|  ++..|.+||||++.+|.+||.+|||+
T Consensus       292 DTTY~tN~y~~Pfa~FvGvNhH~qtvlfGcaLl~dEt~eSf~WLf~tfl~aM--~gk~P~tIiTDqd~am~~AI~~VfP~  369 (846)
T PLN03097        292 DTTYVRNKYKMPLALFVGVNQHYQFMLLGCALISDESAATYSWLMQTWLRAM--GGQAPKVIITDQDKAMKSVISEVFPN  369 (846)
T ss_pred             eceeeccccCcEEEEEEEecCCCCeEEEEEEEcccCchhhHHHHHHHHHHHh--CCCCCceEEecCCHHHHHHHHHHCCC
Confidence            9999999999999999999999999999999999999999999999999999  58999999999999999999999999


Q ss_pred             ccccccHHHHHHhHHhhCC-----CchhHHHHHHHhhhhcc--------------CCHHHHHHhhhc--CcccceeeecC
Q 043258          251 AVYRQCCHRIFNEMVRRFP-----TAPVQHLFWSACRTTSA--------------TSQECHDWLKNS--NWERWALFCMP  309 (454)
Q Consensus       251 a~~~~C~~Hi~~n~~~~~~-----~~~~~~~~~~~~~~~~~--------------~~~~~~~~l~~~--~~~~W~~~~~~  309 (454)
                      +.|++|.|||++|+.+++.     .+.+...|+++++.+..              ++...++||+.+  .|++|+++|++
T Consensus       370 t~Hr~C~wHI~~~~~e~L~~~~~~~~~f~~~f~~cv~~s~t~eEFE~~W~~mi~ky~L~~n~WL~~LY~~RekWapaY~k  449 (846)
T PLN03097        370 AHHCFFLWHILGKVSENLGQVIKQHENFMAKFEKCIYRSWTEEEFGKRWWKILDRFELKEDEWMQSLYEDRKQWVPTYMR  449 (846)
T ss_pred             ceehhhHHHHHHHHHHHhhHHhhhhhHHHHHHHHHHhCCCCHHHHHHHHHHHHHhhcccccHHHHHHHHhHhhhhHHHhc
Confidence            9999999999999999885     34889999999886543              677888999998  89999999999


Q ss_pred             CcceeccccCChhHHHhhhhhh--hccccHHHHHHHHHHHHHHHHHHHHHh-Hh---------------hhcccccChhH
Q 043258          310 HWVKCTCVTLTITEKLRTSFDH--YLEMSITRRFTAIARSTAEIFERRRMV-VW---------------KWYREKVTPTV  371 (454)
Q Consensus       310 ~~~~~~~~Ttn~~Es~n~~lk~--~r~~pi~~~~~~i~~~~~~~~~~r~~~-~~---------------~~~~~~~tp~~  371 (454)
                      +.|..|+.||+++||+|+.|++  .+..+|..|++++...+..+.++..+. ..               +.....+||.+
T Consensus       450 ~~F~agm~sTqRSES~Ns~fk~yv~~~tsL~~Fv~qye~~l~~~~ekE~~aD~~s~~~~P~l~t~~piEkQAs~iYT~~i  529 (846)
T PLN03097        450 DAFLAGMSTVQRSESINAFFDKYVHKKTTVQEFVKQYETILQDRYEEEAKADSDTWNKQPALKSPSPLEKSVSGVYTHAV  529 (846)
T ss_pred             ccccCCcccccccccHHHHHHHHhCcCCCHHHHHHHHHHHHHHHHHHHHHhhhhcccCCcccccccHHHHHHHHHhHHHH
Confidence            9989999999999999999999  467889999998877665544433211 10               11113788888


Q ss_pred             HHHhhh------cc-------------------CCCceEEEEcc----cceeecCCcccCCCCchhHHHHHHHhcC--Ch
Q 043258          372 QDIIHD------RC-------------------SDGRRFILNMD----AMSCSCGLWQISGIPCAHACRGIKYMRR--KI  420 (454)
Q Consensus       372 ~~i~~~------~~-------------------~~~~~~~V~l~----~~~CsC~~~~~~giPC~H~lav~~~~~~--~~  420 (454)
                      ++.++.      .|                   .....|.|..+    +.+|+|++||..||||+|||+||...++  .|
T Consensus       530 F~kFQ~El~~~~~~~~~~~~~dg~~~~y~V~~~~~~~~~~V~~d~~~~~v~CsC~kFE~~GILCrHaLkVL~~~~v~~IP  609 (846)
T PLN03097        530 FKKFQVEVLGAVACHPKMESQDETSITFRVQDFEKNQDFTVTWNQTKLEVSCICRLFEYKGYLCRHALVVLQMCQLSAIP  609 (846)
T ss_pred             HHHHHHHHHHhhheEEeeeccCCceEEEEEEEecCCCcEEEEEecCCCeEEeeccCeecCccchhhHHHHHhhcCcccCc
Confidence            774432      12                   11234556443    5799999999999999999999999998  59


Q ss_pred             hhhhhhhccHHHHHh
Q 043258          421 EDYVDSMMSVQNYMS  435 (454)
Q Consensus       421 ~~~v~~~~t~~~~~~  435 (454)
                      +.||.++||+++-..
T Consensus       610 ~~YILkRWTKdAK~~  624 (846)
T PLN03097        610 SQYILKRWTKDAKSR  624 (846)
T ss_pred             hhhhhhhchhhhhhc
Confidence            999999999998754


No 2  
>PF00872 Transposase_mut:  Transposase, Mutator family;  InterPro: IPR001207 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. The mutator family of transposases consists of a number of elements that include mutator from maize, IsT2 from Thiobacillus ferrooxidans, Is256 from Staphylococcus aureus, Is1201 from Lactobacillus helveticus, Is1081 from Mycobacterium bovis, IsRm3 from Rhizobium meliloti and others. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=99.89  E-value=8.5e-24  Score=211.24  Aligned_cols=236  Identities=16%  Similarity=0.209  Sum_probs=184.1

Q ss_pred             HHHHhhhhcc--CCCCCHHHHHHHHHHHhC-CccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcEEEEE
Q 043258           58 SAMFLHRWKE--QPSISTTEVRNEIESMYG-IKCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNIVLVE  134 (454)
Q Consensus        58 ~~~~~~~~~~--~~~~~~~~i~~~l~~~~g-~~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~~~~~  134 (454)
                      .+.+.+.|..  -.+++.++|.+.++..+| ..+|.+++.|..+.+.+..           ..|    +...        
T Consensus       100 ~~~l~~~i~~ly~~G~Str~i~~~l~~l~g~~~~S~s~vSri~~~~~~~~-----------~~w----~~R~--------  156 (381)
T PF00872_consen  100 EDSLEELIISLYLKGVSTRDIEEALEELYGEVAVSKSTVSRITKQLDEEV-----------EAW----RNRP--------  156 (381)
T ss_pred             hhhhhhhhhhhhccccccccccchhhhhhcccccCchhhhhhhhhhhhhH-----------HHH----hhhc--------
Confidence            4444444443  468999999999999999 8899999988766654422           111    1110        


Q ss_pred             ecccccccccceeEEEEeecchHHHHHhhc-ccEEEEeCeeecCCCCc-----ceEEEEEEeCCCCeEEEEEEEeecccc
Q 043258          135 TETHESREEERFKRVFVCCARTSYAFKVHC-RGILAVDGWEINNPCNS-----VMLVAAGLDGNNGILPVAFCEVQVEDL  208 (454)
Q Consensus       135 ~~~~~~~~~~~f~~~f~~~~~~~~~~~~~~-~~vi~~D~t~~~~~y~~-----~l~~~~g~d~~~~~~~la~al~~~E~~  208 (454)
                             -                   ... .|+|++||+|.+.+.++     .+++++|+|.+|+..+||+.+.+.|+.
T Consensus       157 -------L-------------------~~~~y~~l~iD~~~~kvr~~~~~~~~~~~v~iGi~~dG~r~vLg~~~~~~Es~  210 (381)
T PF00872_consen  157 -------L-------------------ESEPYPYLWIDGTYFKVREDGRVVKKAVYVAIGIDEDGRREVLGFWVGDRESA  210 (381)
T ss_pred             -------c-------------------ccccccceeeeeeecccccccccccchhhhhhhhhcccccceeeeecccCCcc
Confidence                   0                   013 58899999999987544     589999999999999999999999999


Q ss_pred             hhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHHhhhcCCccccccHHHHHHhHHhhCCCc---hhHHHHHHHhhhhc
Q 043258          209 DSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAVEEFLPRAVYRQCCHRIFNEMVRRFPTA---PVQHLFWSACRTTS  285 (454)
Q Consensus       209 e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~~---~~~~~~~~~~~~~~  285 (454)
                      ++|.-||+.|++++   ...|..||+|+++||.+||.++||++.+|.|++|+++|+.++++.+   .+...++.+..+.+
T Consensus       211 ~~W~~~l~~L~~RG---l~~~~lvv~Dg~~gl~~ai~~~fp~a~~QrC~vH~~RNv~~~v~~k~~~~v~~~Lk~I~~a~~  287 (381)
T PF00872_consen  211 ASWREFLQDLKERG---LKDILLVVSDGHKGLKEAIREVFPGAKWQRCVVHLMRNVLRKVPKKDRKEVKADLKAIYQAPD  287 (381)
T ss_pred             CEeeecchhhhhcc---ccccceeeccccccccccccccccchhhhhheechhhhhccccccccchhhhhhccccccccc
Confidence            99999999999998   5679999999999999999999999999999999999999999755   56677776666554


Q ss_pred             c----------------CCHHHHHHhhhcCcccceeeecCCcceeccccCChhHHHhhhhhhh----ccccHHHHHHHHH
Q 043258          286 A----------------TSQECHDWLKNSNWERWALFCMPHWVKCTCVTLTITEKLRTSFDHY----LEMSITRRFTAIA  345 (454)
Q Consensus       286 ~----------------~~~~~~~~l~~~~~~~W~~~~~~~~~~~~~~Ttn~~Es~n~~lk~~----r~~pi~~~~~~i~  345 (454)
                      .                ..|.+.+++++...+.|+..-|+...+--+.|||.+||+|+.+|+.    +..|-.+.+..+.
T Consensus       288 ~e~a~~~l~~f~~~~~~kyp~~~~~l~~~~~~~~tf~~fP~~~~~~i~TTN~iEsln~~irrr~~~~~~Fp~~~s~lr~~  367 (381)
T PF00872_consen  288 KEEAREALEEFAEKWEKKYPKAAKSLEENWDELLTFLDFPPEHRRSIRTTNAIESLNKEIRRRTKVVGIFPNEESALRLV  367 (381)
T ss_pred             cchhhhhhhhcccccccccchhhhhhhhccccccceeeecchhccccchhhhccccccchhhhccccccCCCHHHHHHHH
Confidence            3                5677777777766666666555655455678999999999999882    3466555444433


No 3  
>PF10551 MULE:  MULE transposase domain;  InterPro: IPR018289 This entry represents a domain found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif. This domain is also found in a transposase for the insertion sequence element IS256 in transposon Tn4001.
Probab=99.88  E-value=9.5e-23  Score=163.07  Aligned_cols=90  Identities=30%  Similarity=0.470  Sum_probs=85.7

Q ss_pred             CeeecCCCCcceEE---EEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHHhhhc
Q 043258          172 GWEINNPCNSVMLV---AAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAVEEFL  248 (454)
Q Consensus       172 ~t~~~~~y~~~l~~---~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai~~vf  248 (454)
                      |||++|+| ++++.   ++|+|++|+.+|+||+++.+|+.++|.|||+.+++.++  .. |.+||||++.|+.+||+++|
T Consensus         1 ~T~~tn~~-~~l~~~~~~~~~d~~~~~~~v~~~l~~~e~~~~~~~~l~~~~~~~~--~~-p~~ii~D~~~~~~~Ai~~vf   76 (93)
T PF10551_consen    1 GTYKTNKY-GPLLYLMIAVGIDGNGRGFPVAFALVSSESEESYEWFLEKLKEAMP--QK-PKVIISDFDKALINAIKEVF   76 (93)
T ss_pred             Cccccccc-cccceeceEEEEcCCCCEEEEEEEEEcCCChhhhHHHHHHhhhccc--cC-ceeeeccccHHHHHHHHHHC
Confidence            79999999 98885   99999999999999999999999999999999999994  45 99999999999999999999


Q ss_pred             CCccccccHHHHHHhHH
Q 043258          249 PRAVYRQCCHRIFNEMV  265 (454)
Q Consensus       249 P~a~~~~C~~Hi~~n~~  265 (454)
                      |++.|++|.||+.+|++
T Consensus        77 P~~~~~~C~~H~~~n~k   93 (93)
T PF10551_consen   77 PDARHQLCLFHILRNIK   93 (93)
T ss_pred             CCceEehhHHHHHHhhC
Confidence            99999999999999974


No 4  
>COG3328 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=99.68  E-value=1e-15  Score=150.17  Aligned_cols=223  Identities=15%  Similarity=0.180  Sum_probs=163.2

Q ss_pred             CCCCCHHHHHHHHHHHhCCccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcEEEEEeccccccccccee
Q 043258           68 QPSISTTEVRNEIESMYGIKCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNIVLVETETHESREEERFK  147 (454)
Q Consensus        68 ~~~~~~~~i~~~l~~~~g~~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~~~~~~~~~~~~~~~~f~  147 (454)
                      ..+++++++.+.++..++..++...+.+...++.+               .+.+++..-+                    
T Consensus        98 ~~gv~Tr~i~~~~~~~~~~~~s~~~iS~~~~~~~e---------------~v~~~~~r~l--------------------  142 (379)
T COG3328          98 AKGVTTREIEALLEELYGHKVSPSVISVVTDRLDE---------------KVKAWQNRPL--------------------  142 (379)
T ss_pred             HcCCcHHHHHHHHHHhhCcccCHHHhhhHHHHHHH---------------HHHHHHhccc--------------------
Confidence            56899999999999999988888877765554443               3333333211                    


Q ss_pred             EEEEeecchHHHHHhhcccEEEEeCeeecCC--CCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhcccc
Q 043258          148 RVFVCCARTSYAFKVHCRGILAVDGWEINNP--CNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALRLE  225 (454)
Q Consensus       148 ~~f~~~~~~~~~~~~~~~~vi~~D~t~~~~~--y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~  225 (454)
                                     +..+++++||+|++.+  -+..+++|+|++.+|+..++|+.+...|+ ..|.-||..|+..+   
T Consensus       143 ---------------~~~~~v~~D~~~~k~r~v~~~~~~ia~Gv~~eG~reilg~~~~~~e~-~~w~~~l~~l~~rg---  203 (379)
T COG3328         143 ---------------GDYPYVYLDAKYVKVRSVRNKAVYIAIGVTEEGRREILGIWVGVRES-KFWLSFLLDLKNRG---  203 (379)
T ss_pred             ---------------cCceEEEEecceeehhhhhhheeeeeeccCcccchhhhceeeecccc-hhHHHHHHHHHhcc---
Confidence                           1458899999999988  45579999999999999999999999999 99999999999997   


Q ss_pred             CCCcEEEEcCCcchHHHHHhhhcCCccccccHHHHHHhHHhhCCCch---hHHHHHHHhhhhcc----------------
Q 043258          226 NGKGLCILGDGDNGVEYAVEEFLPRAVYRQCCHRIFNEMVRRFPTAP---VQHLFWSACRTTSA----------------  286 (454)
Q Consensus       226 ~~~~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~~~---~~~~~~~~~~~~~~----------------  286 (454)
                      ......+++|+.+|+.+||.++||.+.+|+|..|+.+|+.++.+.++   ....+..+..+.+.                
T Consensus       204 l~~v~l~v~Dg~~gl~~aI~~v~p~a~~Q~C~vH~~Rnll~~v~~k~~d~i~~~~~~I~~a~~~e~~~~~~~~~~~~w~~  283 (379)
T COG3328         204 LSDVLLVVVDGLKGLPEAISAVFPQAAVQRCIVHLVRNLLDKVPRKDQDAVLSDLRSIYIAPDAEEALLALLAFSELWGK  283 (379)
T ss_pred             ccceeEEecchhhhhHHHHHHhccHhhhhhhhhHHHhhhhhhhhhhhhHHHHhhhhhhhccCCcHHHHHHHHHHHHhhhh
Confidence            55666777899999999999999999999999999999999887662   22333332222221                


Q ss_pred             CCHHHHHHhhhcCcccceeeecCCcceeccccCChhHHHhhhhhh-h---ccccHHHHHHHH
Q 043258          287 TSQECHDWLKNSNWERWALFCMPHWVKCTCVTLTITEKLRTSFDH-Y---LEMSITRRFTAI  344 (454)
Q Consensus       287 ~~~~~~~~l~~~~~~~W~~~~~~~~~~~~~~Ttn~~Es~n~~lk~-~---r~~pi~~~~~~i  344 (454)
                      ..|....|+.+.-.+.|...-|+...+--+.|||..|++|+.++. .   ..+|-.+.+..+
T Consensus       284 ~yP~i~~~~~~~~~~~~~F~~fp~~~r~~i~ttN~IE~~n~~ir~~~~~~~~fpn~~sv~k~  345 (379)
T COG3328         284 RYPAILKSWRNALEELLPFFAFPSEIRKIIYTTNAIESLNKLIRRRTKVVGIFPNEESVEKL  345 (379)
T ss_pred             hcchHHHHHHHHHHHhcccccCcHHHHhHhhcchHHHHHHHHHHHHHhhhccCCCHHHHHHH
Confidence            334444444444334443333333324457899999999998775 2   235655555443


No 5  
>smart00575 ZnF_PMZ plant mutator transposase zinc finger.
Probab=99.10  E-value=4.4e-11  Score=71.94  Aligned_cols=27  Identities=41%  Similarity=0.785  Sum_probs=25.2

Q ss_pred             ceeecCCcccCCCCchhHHHHHHHhcC
Q 043258          392 MSCSCGLWQISGIPCAHACRGIKYMRR  418 (454)
Q Consensus       392 ~~CsC~~~~~~giPC~H~lav~~~~~~  418 (454)
                      .+|||++||..||||+|+|+|+...|+
T Consensus         1 ~~CsC~~~~~~gipC~H~i~v~~~~~~   27 (28)
T smart00575        1 KTCSCRKFQLSGIPCRHALAAAIHIGL   27 (28)
T ss_pred             CcccCCCcccCCccHHHHHHHHHHhCC
Confidence            479999999999999999999998876


No 6  
>PF03108 DBD_Tnp_Mut:  MuDR family transposase;  InterPro: IPR004332 The plant MuDR transposase domain is present in plant proteins that are presumed to be the transposases for Mutator transposable elements. The function of these proteins is unknown. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=98.21  E-value=1.9e-06  Score=63.92  Aligned_cols=34  Identities=26%  Similarity=0.576  Sum_probs=31.8

Q ss_pred             CCCCCeEEEEEeccCCCccEEEEEEeCCCCeEEE
Q 043258            1 MENRSHIVSCECSDLSCDWQVTAIRDVRGKGFVI   34 (454)
Q Consensus         1 ~ks~~~r~~~~C~~~gC~~~v~~~~~~~~~~~~v   34 (454)
                      .||+++|+.++|...||||+|+|++.++++.|+|
T Consensus        34 ~ksd~~r~~~~C~~~~C~Wrv~as~~~~~~~~~I   67 (67)
T PF03108_consen   34 KKSDKKRYRAKCKDKGCPWRVRASKRKRSDTFQI   67 (67)
T ss_pred             eccCCEEEEEEEcCCCCCEEEEEEEcCCCCEEEC
Confidence            4799999999999999999999999999999976


No 7  
>PF04434 SWIM:  SWIM zinc finger;  InterPro: IPR007527 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1). It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination. SWIM domains are also found in the homologous recombination protein Sws1, as well as in several hypothetical proteins. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding
Probab=98.21  E-value=9.2e-07  Score=58.32  Aligned_cols=30  Identities=30%  Similarity=0.703  Sum_probs=26.4

Q ss_pred             EEcccceeecCCcccCCCCchhHHHHHHHh
Q 043258          387 LNMDAMSCSCGLWQISGIPCAHACRGIKYM  416 (454)
Q Consensus       387 V~l~~~~CsC~~~~~~giPC~H~lav~~~~  416 (454)
                      +++...+|||..|+..|.||+|++|++...
T Consensus        10 ~~~~~~~CsC~~~~~~~~~CkHi~av~~~~   39 (40)
T PF04434_consen   10 VSIEQASCSCPYFQFRGGPCKHIVAVLLAL   39 (40)
T ss_pred             ccccccEeeCCCccccCCcchhHHHHHHhh
Confidence            345678999999999999999999998765


No 8  
>PF03101 FAR1:  FAR1 DNA-binding domain;  InterPro: IPR004330 Phytochrome A is the primary photoreceptor for mediating various far-red light-induced responses in higher plants. It has been found that the proteins governing this response, which include FAR-RED ELONGATED HYPOCOTYL3 (FHY3) and FAR-RED-IMPAIRED RESPONSE1 (FAR1), are a pair of homologous proteins sharing significant sequence homology to mutator-like transposases. These proteins appear to be novel transcription factors, which are essential for activating the expression of FHY1 and FHL (for FHY1-like) and related genes, whose products are required for light-induced phytochrome A nuclear accumulation and subsequent light responses in plants. The FRS (FAR1 Related Sequences) family of proteins share a similar domain structure to mutator-like transposases, including an N-terminal C2H2 zinc finger domain, a central putative core transposase domain, and a C-terminal SWIM motif (named after SWI2/SNF and MuDR transposases). It seems plausible that the FRS family represent transcription factors derived from mutator-like transposases. This entry represents a domain found in FAR1 and FRS proteins. It contains a WRKY-like fold and is therefore most likely a zinc-binding DNA-binding domain.
Probab=97.52  E-value=0.00012  Score=57.65  Aligned_cols=33  Identities=21%  Similarity=0.362  Sum_probs=29.7

Q ss_pred             cCCCccEEEEEEeCCCCeEEEEEecCCcccCCCC
Q 043258           14 DLSCDWQVTAIRDVRGKGFVITQFSPKHNCPRLD   47 (454)
Q Consensus        14 ~~gC~~~v~~~~~~~~~~~~v~~~~~~Hnc~~~~   47 (454)
                      .+||||+|.+.+.+ ++.|.|+.+..+|||++.+
T Consensus        59 ktgC~a~i~v~~~~-~~~w~v~~~~~~HNH~L~P   91 (91)
T PF03101_consen   59 KTGCKARINVKRRK-DGKWRVTSFVLEHNHPLCP   91 (91)
T ss_pred             ccCCCEEEEEEEcc-CCEEEEEECcCCcCCCCCC
Confidence            36999999999887 7899999999999999864


No 9  
>PF08731 AFT:  Transcription factor AFT;  InterPro: IPR014842 AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This entry includes the paralogous pair of transcription factors AFT1 and AFT2. 
Probab=97.26  E-value=0.00038  Score=55.64  Aligned_cols=31  Identities=19%  Similarity=0.423  Sum_probs=29.1

Q ss_pred             CCCccEEEEEEeCCCCeEEEEEecCCcccCC
Q 043258           15 LSCDWQVTAIRDVRGKGFVITQFSPKHNCPR   45 (454)
Q Consensus        15 ~gC~~~v~~~~~~~~~~~~v~~~~~~Hnc~~   45 (454)
                      .+|||+|+|......+.|.|..+++.|||++
T Consensus        81 ~~CPFriRA~yS~k~k~W~lvvvnn~HnH~l  111 (111)
T PF08731_consen   81 NTCPFRIRANYSKKNKKWTLVVVNNEHNHPL  111 (111)
T ss_pred             cCCCeEEEEEEEecCCeEEEEEecCCcCCCC
Confidence            4899999999999999999999999999985


No 10 
>PF13610 DDE_Tnp_IS240:  DDE domain
Probab=96.88  E-value=0.00053  Score=58.83  Aligned_cols=81  Identities=19%  Similarity=0.126  Sum_probs=68.9

Q ss_pred             ccEEEEeCeeecCCCCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHH
Q 043258          165 RGILAVDGWEINNPCNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAV  244 (454)
Q Consensus       165 ~~vi~~D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai  244 (454)
                      ++.+.+|-||.+.+-+ ..++..++|.+|+  +|.+-|-..-+...=..||..+.+..   ...|..|+||+.++...|+
T Consensus         1 ~~~w~~DEt~iki~G~-~~yl~~aiD~~~~--~l~~~ls~~Rd~~aA~~Fl~~~l~~~---~~~p~~ivtDk~~aY~~A~   74 (140)
T PF13610_consen    1 GDSWHVDETYIKIKGK-WHYLWRAIDAEGN--ILDFYLSKRRDTAAAKRFLKRALKRH---RGEPRVIVTDKLPAYPAAI   74 (140)
T ss_pred             CCEEEEeeEEEEECCE-EEEEEEeeccccc--chhhhhhhhcccccceeeccccceee---ccccceeecccCCccchhh
Confidence            3678999999876533 5667899999999  89999999999988889998888876   4789999999999999999


Q ss_pred             hhhcCCc
Q 043258          245 EEFLPRA  251 (454)
Q Consensus       245 ~~vfP~a  251 (454)
                      +++.|.-
T Consensus        75 ~~l~~~~   81 (140)
T PF13610_consen   75 KELNPEG   81 (140)
T ss_pred             hhccccc
Confidence            9998763


No 11 
>PF06782 UPF0236:  Uncharacterised protein family (UPF0236);  InterPro: IPR009620 This is a group of proteins of unknown function.
Probab=96.08  E-value=0.15  Score=52.72  Aligned_cols=78  Identities=18%  Similarity=0.287  Sum_probs=62.5

Q ss_pred             cccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHHhhhcCCccccccHHHHHHhHHhhCCC-chhHHHHHHHhhh
Q 043258          205 VEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAVEEFLPRAVYRQCCHRIFNEMVRRFPT-APVQHLFWSACRT  283 (454)
Q Consensus       205 ~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~-~~~~~~~~~~~~~  283 (454)
                      ..+.+-|.-+.+.+.+...+....-+++.+|+.+.+.+++. .+|++.+.+..+|+.+.+.+.++. +++.+.++++.+.
T Consensus       235 ~~~~~~~~~v~~~i~~~Y~~~~~~~iiingDGa~WIk~~~~-~~~~~~~~LD~FHl~k~i~~~~~~~~~~~~~~~~al~~  313 (470)
T PF06782_consen  235 ESAEEFWEEVLDYIYNHYDLDKTTKIIINGDGASWIKEGAE-FFPKAEYFLDRFHLNKKIKQALSHDPELKEKIRKALKK  313 (470)
T ss_pred             cchHHHHHHHHHHHHHhcCcccceEEEEeCCCcHHHHHHHH-hhcCceEEecHHHHHHHHHHHhhhChHHHHHHHHHHHh
Confidence            55678899999999888864444468889999999998776 999999999999999999988753 3566656666554


No 12 
>PF03106 WRKY:  WRKY DNA-binding domain;  InterPro: IPR003657 The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene. Structural studies indicate that this domain is a four-stranded beta-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure. The WRKYGQK residues correspond to the most N-terminal beta-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the beta-sheet.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 2AYD_A 1WJ2_A 2LEX_A.
Probab=95.87  E-value=0.026  Score=40.49  Aligned_cols=42  Identities=19%  Similarity=0.339  Sum_probs=33.3

Q ss_pred             CCCeEEEEEeccCCCccEEEEEEeCCCCeEEEEEecCCcccC
Q 043258            3 NRSHIVSCECSDLSCDWQVTAIRDVRGKGFVITQFSPKHNCP   44 (454)
Q Consensus         3 s~~~r~~~~C~~~gC~~~v~~~~~~~~~~~~v~~~~~~Hnc~   44 (454)
                      |.-.|--++|+..+||++-.+.+..+++...++++.++|||+
T Consensus        18 ~~~pRsYYrCt~~~C~akK~Vqr~~~d~~~~~vtY~G~H~h~   59 (60)
T PF03106_consen   18 SPYPRSYYRCTHPGCPAKKQVQRSADDPNIVIVTYEGEHNHP   59 (60)
T ss_dssp             TTCEEEEEEEECTTEEEEEEEEEETTCCCEEEEEEES--SS-
T ss_pred             CceeeEeeeccccChhheeeEEEecCCCCEEEEEEeeeeCCC
Confidence            344566799999999999999998877778888999999996


No 13 
>PHA02517 putative transposase OrfB; Reviewed
Probab=95.78  E-value=0.11  Score=49.94  Aligned_cols=151  Identities=16%  Similarity=0.079  Sum_probs=83.8

Q ss_pred             HHHHHHhhhhcc-CCCCCHHHHHHHHHHHhCCccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcEEEEE
Q 043258           56 WISAMFLHRWKE-QPSISTTEVRNEIESMYGIKCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNIVLVE  134 (454)
Q Consensus        56 ~i~~~~~~~~~~-~~~~~~~~i~~~l~~~~g~~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~~~~~  134 (454)
                      .+.+.+.+.... .+.+..+.|...|++. |+.+|.++++|..+..     |-...       ....-.....+.. .  
T Consensus        30 ~l~~~I~~i~~~~~~~~G~r~I~~~L~~~-g~~vs~~tV~Rim~~~-----gl~~~-------~~~k~~~~~~~~~-~--   93 (277)
T PHA02517         30 WLKSEILRVYDENHQVYGVRKVWRQLNRE-GIRVARCTVGRLMKEL-----GLAGV-------LRGKKVRTTISRK-A--   93 (277)
T ss_pred             HHHHHHHHHHHHhCCCCCHHHHHHHHHhc-CcccCHHHHHHHHHHc-----CCceE-------ecCCCcCCCCCCC-C--
Confidence            455566666554 5788999999999765 9999999998764432     11000       0000000000000 0  


Q ss_pred             ecccccccccceeEEEEeecchHHHHHhhcccEEEEeCeeecCCCCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHH
Q 043258          135 TETHESREEERFKRVFVCCARTSYAFKVHCRGILAVDGWEINNPCNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYF  214 (454)
Q Consensus       135 ~~~~~~~~~~~f~~~f~~~~~~~~~~~~~~~~vi~~D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~  214 (454)
                           ....+.+.+-|-+         ..-..++..|.||..... +..++.+.+|...+ .++|+.+...++.+...-.
T Consensus        94 -----~~~~n~~~r~f~~---------~~pn~~w~~D~t~~~~~~-g~~yl~~iiD~~sr-~i~~~~~~~~~~~~~~~~~  157 (277)
T PHA02517         94 -----VAAPDRVNRQFVA---------TRPNQLWVADFTYVSTWQ-GWVYVAFIIDVFAR-RIVGWRVSSSMDTDFVLDA  157 (277)
T ss_pred             -----CCCCCcccCCCCC---------CCCCCeEEeceeEEEeCC-CCEEEEEecccCCC-eeeecccCCCCChHHHHHH
Confidence                 0001111111100         013468999999986543 56677777776554 4667888877888766555


Q ss_pred             HHHHHhhccccCCCcEEEEcCCcchH
Q 043258          215 LKNINSALRLENGKGLCILGDGDNGV  240 (454)
Q Consensus       215 l~~l~~~~~~~~~~~~~iitD~~~~l  240 (454)
                      |+......+  ...+.+|.||+....
T Consensus       158 l~~a~~~~~--~~~~~i~~sD~G~~y  181 (277)
T PHA02517        158 LEQALWARG--RPGGLIHHSDKGSQY  181 (277)
T ss_pred             HHHHHHhcC--CCcCcEeeccccccc
Confidence            555444431  223457789998864


No 14 
>PF01610 DDE_Tnp_ISL3:  Transposase;  InterPro: IPR002560 Autonomous mobile genetic elements such as transposon or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families [, , ]. This family includes the IS204 [], IS1001 [], IS1096 [] and IS1165 [] transposases. More information about these proteins can be found at Protein of the Month: Transposase [].; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=95.74  E-value=0.023  Score=53.69  Aligned_cols=93  Identities=15%  Similarity=0.175  Sum_probs=69.5

Q ss_pred             EEEeCeeecCCCCcceEEEEEEeC--CCCeEEEEEEEeecccchhHHHHHHHH-HhhccccCCCcEEEEcCCcchHHHHH
Q 043258          168 LAVDGWEINNPCNSVMLVAAGLDG--NNGILPVAFCEVQVEDLDSWVYFLKNI-NSALRLENGKGLCILGDGDNGVEYAV  244 (454)
Q Consensus       168 i~~D~t~~~~~y~~~l~~~~g~d~--~~~~~~la~al~~~E~~e~~~w~l~~l-~~~~~~~~~~~~~iitD~~~~l~~Ai  244 (454)
                      |+||=+........  +..+.+|.  +++.+   ++++++-+.++..-||..+ -...   ......|++|..++..+|+
T Consensus         1 lgiDE~~~~~g~~~--y~t~~~d~~~~~~~i---l~i~~~r~~~~l~~~~~~~~~~~~---~~~v~~V~~Dm~~~y~~~~   72 (249)
T PF01610_consen    1 LGIDEFAFRKGHRS--YVTVVVDLDTDTGRI---LDILPGRDKETLKDFFRSLYPEEE---RKNVKVVSMDMSPPYRSAI   72 (249)
T ss_pred             CeEeeeeeecCCcc--eeEEEEECccCCceE---EEEcCCccHHHHHHHHHHhCcccc---ccceEEEEcCCCccccccc
Confidence            45666665443332  45555555  44332   4588999999999888876 3332   4567899999999999999


Q ss_pred             hhhcCCccccccHHHHHHhHHhhC
Q 043258          245 EEFLPRAVYRQCCHRIFNEMVRRF  268 (454)
Q Consensus       245 ~~vfP~a~~~~C~~Hi~~n~~~~~  268 (454)
                      ++.||+|.+..--||+++++.+.+
T Consensus        73 ~~~~P~A~iv~DrFHvvk~~~~al   96 (249)
T PF01610_consen   73 REYFPNAQIVADRFHVVKLANRAL   96 (249)
T ss_pred             cccccccccccccchhhhhhhhcc
Confidence            999999999999999999987644


No 15 
>PF03050 DDE_Tnp_IS66:  Transposase IS66 family ;  InterPro: IPR004291 Transposase proteins are necessary for efficient DNA transposition. This family includes the bacterial insertion sequence (IS) element, IS66, from Agrobacterium tumefaciens []. IS66 may cause genetic and structural variations of the T region and the vir region of the octopine Ti plasmids []. More information about these proteins can be found at Protein of the Month: Transposase [].
Probab=95.49  E-value=0.065  Score=51.31  Aligned_cols=134  Identities=12%  Similarity=0.119  Sum_probs=87.0

Q ss_pred             CCCCCHHHHHHHHHHHhCCccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcEEEEEeccccccccccee
Q 043258           68 QPSISTTEVRNEIESMYGIKCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNIVLVETETHESREEERFK  147 (454)
Q Consensus        68 ~~~~~~~~i~~~l~~~~g~~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~~~~~~~~~~~~~~~~f~  147 (454)
                      ...++...+.+.+... |+++|..++.+.-.++.+..           ....+.+.+.-                     
T Consensus        18 ~~~lp~~r~~~~~~~~-G~~is~~ti~~~~~~~~~~l-----------~~~~~~l~~~~---------------------   64 (271)
T PF03050_consen   18 VYHLPLYRIQQMLEDL-GITISRGTIANWIKRVAEAL-----------KPLYEALKEEL---------------------   64 (271)
T ss_pred             cCCCCHHHHhhhhhcc-ceeeccchhHhHhhhhhhhh-----------hhhhhhhhhhc---------------------
Confidence            4456777788888777 99999999887766554322           12222222211                     


Q ss_pred             EEEEeecchHHHHHhhcccEEEEeCeeec----CCCC-cceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhc
Q 043258          148 RVFVCCARTSYAFKVHCRGILAVDGWEIN----NPCN-SVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSAL  222 (454)
Q Consensus       148 ~~f~~~~~~~~~~~~~~~~vi~~D~t~~~----~~y~-~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~  222 (454)
                                     --.+++.+|-|..+    ++.. +-+.++++-+      .+.|.+.++-+.+...-+|..     
T Consensus        65 ---------------~~~~~~~~DET~~~vl~~~~g~~~~~Wv~~~~~------~v~f~~~~sR~~~~~~~~L~~-----  118 (271)
T PF03050_consen   65 ---------------RSSPVVHADETGWRVLDKGKGKKGYLWVFVSPE------VVLFFYAPSRSSKVIKEFLGD-----  118 (271)
T ss_pred             ---------------cccceeccCCceEEEeccccccceEEEeeeccc------eeeeeecccccccchhhhhcc-----
Confidence                           13578888888877    4433 3344444433      666777777777777666533     


Q ss_pred             cccCCCcEEEEcCCcchHHHHHhhhcCCccccccHHHHHHhHHhhCCC
Q 043258          223 RLENGKGLCILGDGDNGVEYAVEEFLPRAVYRQCCHRIFNEMVRRFPT  270 (454)
Q Consensus       223 ~~~~~~~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~  270 (454)
                           -.-+++||+-.+-..     +..+.|+.|+.|+.|.+.+-...
T Consensus       119 -----~~GilvsD~y~~Y~~-----~~~~~hq~C~AH~~R~~~~~~~~  156 (271)
T PF03050_consen  119 -----FSGILVSDGYSAYNK-----LAGITHQLCWAHLRRDFQDAAES  156 (271)
T ss_pred             -----cceeeeccccccccc-----ccccccccccccccccccccccc
Confidence                 223789999987654     22788999999999998876653


No 16 
>smart00774 WRKY DNA binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.
Probab=93.36  E-value=0.16  Score=36.11  Aligned_cols=40  Identities=13%  Similarity=0.142  Sum_probs=32.2

Q ss_pred             CCeEEEEEecc-CCCccEEEEEEeCCCCeEEEEEecCCccc
Q 043258            4 RSHIVSCECSD-LSCDWQVTAIRDVRGKGFVITQFSPKHNC   43 (454)
Q Consensus         4 ~~~r~~~~C~~-~gC~~~v~~~~~~~~~~~~v~~~~~~Hnc   43 (454)
                      ...|--.+|+. .|||++=.+.+..+++...++.+.++|||
T Consensus        19 ~~pRsYYrCt~~~~C~a~K~Vq~~~~d~~~~~vtY~g~H~h   59 (59)
T smart00774       19 PFPRSYYRCTYSQGCPAKKQVQRSDDDPSVVEVTYEGEHTH   59 (59)
T ss_pred             cCcceEEeccccCCCCCcccEEEECCCCCEEEEEEeeEeCC
Confidence            34455689998 89999988888766666777899999998


No 17 
>COG3316 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=91.99  E-value=2.3  Score=38.77  Aligned_cols=83  Identities=14%  Similarity=0.071  Sum_probs=60.6

Q ss_pred             ccEEEEeCeeecCCCCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHH
Q 043258          165 RGILAVDGWEINNPCNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAV  244 (454)
Q Consensus       165 ~~vi~~D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai  244 (454)
                      ++.+.+|-||.+.+-+. .+.-.+||.+|  .+|.+-|...-+...=.-||..+.+.-    +.|.+|+||+.+....|+
T Consensus        70 ~~~w~vDEt~ikv~gkw-~ylyrAid~~g--~~Ld~~L~~rRn~~aAk~Fl~kllk~~----g~p~v~vtDka~s~~~A~  142 (215)
T COG3316          70 GDSWRVDETYIKVNGKW-HYLYRAIDADG--LTLDVWLSKRRNALAAKAFLKKLLKKH----GEPRVFVTDKAPSYTAAL  142 (215)
T ss_pred             ccceeeeeeEEeeccEe-eehhhhhccCC--CeEEEEEEcccCcHHHHHHHHHHHHhc----CCCceEEecCccchHHHH
Confidence            46678999998765433 23334556664  466777777777777777887766664    478899999999999999


Q ss_pred             hhhcCCcccc
Q 043258          245 EEFLPRAVYR  254 (454)
Q Consensus       245 ~~vfP~a~~~  254 (454)
                      .++-+.+.|+
T Consensus       143 ~~l~~~~ehr  152 (215)
T COG3316         143 RKLGSEVEHR  152 (215)
T ss_pred             HhcCcchhee
Confidence            9998866655


No 18 
>PRK09409 IS2 transposase TnpB; Reviewed
Probab=89.90  E-value=5.5  Score=38.76  Aligned_cols=147  Identities=11%  Similarity=0.021  Sum_probs=87.4

Q ss_pred             HHHHHHHhhhhccCCCCCHHHHHHHHHHHh---CC-ccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcE
Q 043258           55 KWISAMFLHRWKEQPSISTTEVRNEIESMY---GI-KCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNI  130 (454)
Q Consensus        55 ~~i~~~~~~~~~~~~~~~~~~i~~~l~~~~---g~-~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~  130 (454)
                      ..+.+.+.+.....+.+..+.|...|+++.   |+ .++.++++|..+.+     |-..           ...+..+.+.
T Consensus        50 ~~l~~~I~~i~~~~~~yG~Rri~~~L~~~g~~~g~~~v~~k~V~RlMr~~-----Gl~~-----------~~~~~~~~~~  113 (301)
T PRK09409         50 TDVLLRIHHVIGELPTYGYRRVWALLRRQAELDGMPAINAKRVYRIMRQN-----ALLL-----------ERKPAVPPSK  113 (301)
T ss_pred             HHHHHHHHHHHHhCccCCHHHHHHHHHhhhcccCccccCHHHHHHHHHHc-----CCcc-----------cccCCCCCCC
Confidence            345555666655678899999999998752   66 58999988764432     1000           0000000000


Q ss_pred             EEEEecccccccccceeEEEEeecchHHHHHhhcccEEEEeCeeecCCCCcceEEEEEEeCCCCeEEEEEEEeec-ccch
Q 043258          131 VLVETETHESREEERFKRVFVCCARTSYAFKVHCRGILAVDGWEINNPCNSVMLVAAGLDGNNGILPVAFCEVQV-EDLD  209 (454)
Q Consensus       131 ~~~~~~~~~~~~~~~f~~~f~~~~~~~~~~~~~~~~vi~~D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~-E~~e  209 (454)
                               ......|    .         ...-..++..|-||....-++.++.++-+|...+ .++|+++... .+.+
T Consensus       114 ---------~~~~~~~----~---------~~~pN~~W~tDiT~~~~~~g~~~Yl~~ViD~~sR-~ivg~~~s~~~~~~~  170 (301)
T PRK09409        114 ---------RAHTGRV----A---------VKESNQRWCSDGFEFCCDNGERLRVTFALDCCDR-EALHWAVTTGGFNSE  170 (301)
T ss_pred             ---------CCCCCCc----C---------CCCCCCEEEeeeEEEEeCCCCEEEEEEEeecccc-eEEEEEeccCCCCHH
Confidence                     0001111    0         0123468999999987655556888888888776 6889999875 5666


Q ss_pred             hHHHHHHH-HHhhccc-cCCCcEEEEcCCcchH
Q 043258          210 SWVYFLKN-INSALRL-ENGKGLCILGDGDNGV  240 (454)
Q Consensus       210 ~~~w~l~~-l~~~~~~-~~~~~~~iitD~~~~l  240 (454)
                      .-.-+|+. +....+. ....|.+|.||+....
T Consensus       171 ~v~~~l~~a~~~~~~~~~~~~~~iihSDrGsqy  203 (301)
T PRK09409        171 TVQDVMLGAVERRFGNDLPSSPVEWLTDNGSCY  203 (301)
T ss_pred             HHHHHHHHHHHHHhccCCCCCCcEEecCCCccc
Confidence            65566654 4443320 1235788999998753


No 19 
>PF13565 HTH_32:  Homeodomain-like domain
Probab=89.63  E-value=0.75  Score=34.52  Aligned_cols=40  Identities=18%  Similarity=0.302  Sum_probs=34.1

Q ss_pred             HHHHHhhhhccCCCCCHHHHHHHHHHHhCCcc--CHHHHHHH
Q 043258           57 ISAMFLHRWKEQPSISTTEVRNEIESMYGIKC--PEWKVFCA   96 (454)
Q Consensus        57 i~~~~~~~~~~~~~~~~~~i~~~l~~~~g~~~--s~~~~~ra   96 (454)
                      +.+.+.+.+..+|.+++.+|.+.|.+++|+.+  |.+++||.
T Consensus        35 ~~~~i~~~~~~~p~wt~~~i~~~L~~~~g~~~~~S~~tv~R~   76 (77)
T PF13565_consen   35 QRERIIALIEEHPRWTPREIAEYLEEEFGISVRVSRSTVYRI   76 (77)
T ss_pred             HHHHHHHHHHhCCCCCHHHHHHHHHHHhCCCCCccHhHHHHh
Confidence            44566677778899999999999999999876  99999874


No 20 
>PRK14702 insertion element IS2 transposase InsD; Provisional
Probab=89.44  E-value=5.5  Score=37.89  Aligned_cols=147  Identities=12%  Similarity=0.045  Sum_probs=87.7

Q ss_pred             HHHHHHHhhhhccCCCCCHHHHHHHHHHH---hCC-ccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHHhhCCCcE
Q 043258           55 KWISAMFLHRWKEQPSISTTEVRNEIESM---YGI-KCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEMERIDRDNI  130 (454)
Q Consensus        55 ~~i~~~~~~~~~~~~~~~~~~i~~~l~~~---~g~-~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~~~npg~~  130 (454)
                      ..+...+.+.....+.+..+.|...|++.   .|+ .++.++++|..+.+-  +.+.              .++..+-+.
T Consensus        11 ~~l~~~I~~~~~~~~~yG~rri~~~L~~~~~~~g~~~v~~krV~rlmr~~g--L~~~--------------~r~~~~~~~   74 (262)
T PRK14702         11 TDVLLRIHHVIGELPTYGYRRVWALLRRQAELDGMPAINAKRVYRLMRQNA--LLLE--------------RKPAVPPSK   74 (262)
T ss_pred             HHHHHHHHHHHHhCcccChHHHHHHHHhhhcccCccccCHHHHHHHHHHhC--Cccc--------------cCCCCCCCC
Confidence            45666666666667889999999999875   477 499999987644320  0000              000000000


Q ss_pred             EEEEecccccccccceeEEEEeecchHHHHHhhcccEEEEeCeeecCCCCcceEEEEEEeCCCCeEEEEEEEeec-ccch
Q 043258          131 VLVETETHESREEERFKRVFVCCARTSYAFKVHCRGILAVDGWEINNPCNSVMLVAAGLDGNNGILPVAFCEVQV-EDLD  209 (454)
Q Consensus       131 ~~~~~~~~~~~~~~~f~~~f~~~~~~~~~~~~~~~~vi~~D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~-E~~e  209 (454)
                               ......|    .         ...-..++..|-||.....++.++.++-+|...+ .++|+++... .+.+
T Consensus        75 ---------~~~~~~~----~---------~~~pn~~W~~DiT~~~~~~g~~~Yl~~viD~~sR-~ivg~~is~~~~~~~  131 (262)
T PRK14702         75 ---------RAHTGRV----A---------VKESNQRWCSDGFEFCCDNGERLRVTFALDCCDR-EALHWAVTTGGFNSE  131 (262)
T ss_pred             ---------cCCCCcc----c---------cCCCCCEEEeeeEEEEecCCcEEEEEEEEecccc-eeeeEEeccCcCCHH
Confidence                     0000001    0         0113468999999987654556888888887776 6788888874 5666


Q ss_pred             hHHHHHHH-HHhhccc-cCCCcEEEEcCCcchH
Q 043258          210 SWVYFLKN-INSALRL-ENGKGLCILGDGDNGV  240 (454)
Q Consensus       210 ~~~w~l~~-l~~~~~~-~~~~~~~iitD~~~~l  240 (454)
                      .-.-+|+. +....+. ....|..|.||+....
T Consensus       132 ~v~~~l~~A~~~~~~~~~~~~~~iihSD~Gsqy  164 (262)
T PRK14702        132 TVQDVMLGAVERRFGNDLPSSPVEWLTDNGSCY  164 (262)
T ss_pred             HHHHHHHHHHHHHhcccCCCCCeEEEcCCCccc
Confidence            65556654 3333210 1335788999998754


No 21 
>PF00665 rve:  Integrase core domain;  InterPro: IPR001584 Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus (ASV) have been studied most carefully with respect to the structural basis of catalysis. Although the active site of ASV integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis [].  Retroviral integrase is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group.  HIV integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS []. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration. 
To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity [].; GO: 0015074 DNA integration; PDB: 3AO3_A 3OVN_A 3AO5_A 3AO4_A 3AO1_A 1C6V_D 3HPG_A 3HPH_A 3OYD_A 3OYF_B ....
Probab=88.08  E-value=2.4  Score=34.47  Aligned_cols=76  Identities=12%  Similarity=-0.037  Sum_probs=54.1

Q ss_pred             ccEEEEeCeeec-CCCCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHH
Q 043258          165 RGILAVDGWEIN-NPCNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYA  243 (454)
Q Consensus       165 ~~vi~~D~t~~~-~~y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~A  243 (454)
                      ...+.+|.+... ...++..++.+.+|..-.. .+++.+-..++.+....+|.......+  ...|.+|+||++....+.
T Consensus         6 ~~~~~~D~~~~~~~~~~~~~~~~~~iD~~S~~-~~~~~~~~~~~~~~~~~~l~~~~~~~~--~~~p~~i~tD~g~~f~~~   82 (120)
T PF00665_consen    6 GERWQIDFTPMPIPDKGGRVYLLVFIDDYSRF-IYAFPVSSKETAEAALRALKRAIEKRG--GRPPRVIRTDNGSEFTSH   82 (120)
T ss_dssp             TTEEEEEEEEETGGCTT-CEEEEEEEETTTTE-EEEEEESSSSHHHHHHHHHHHHHHHHS---SE-SEEEEESCHHHHSH
T ss_pred             CCEEEEeeEEEecCCCCccEEEEEEEECCCCc-EEEEEeecccccccccccccccccccc--cccceecccccccccccc
Confidence            457889999766 3456688899999977665 446777777788888888876666652  333999999999987643


No 22 
>PF04937 DUF659:  Protein of unknown function (DUF 659);  InterPro: IPR007021 These are transposase-like proteins with no known function.
Probab=78.72  E-value=25  Score=30.41  Aligned_cols=64  Identities=13%  Similarity=0.209  Sum_probs=46.9

Q ss_pred             ccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHH---hhhcCCccccccHHHHHHhHHhhCCCc
Q 043258          206 EDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAV---EEFLPRAVYRQCCHRIFNEMVRRFPTA  271 (454)
Q Consensus       206 E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai---~~vfP~a~~~~C~~Hi~~n~~~~~~~~  271 (454)
                      .+.+..--+|+...+.+  +....+-||||......+|-   .+-+|.....-|..|-+.-+.+.+...
T Consensus        73 ~~a~~l~~ll~~vIeeV--G~~nVvqVVTDn~~~~~~a~~~L~~k~p~ifw~~CaaH~inLmledi~k~  139 (153)
T PF04937_consen   73 KTAEYLFELLDEVIEEV--GEENVVQVVTDNASNMKKAGKLLMEKYPHIFWTPCAAHCINLMLEDIGKL  139 (153)
T ss_pred             ccHHHHHHHHHHHHHHh--hhhhhhHHhccCchhHHHHHHHHHhcCCCEEEechHHHHHHHHHHHHhcC
Confidence            44555556666655555  35566779999999988884   455899999999999998887766543


No 23 
>PF12762 DDE_Tnp_IS1595:  ISXO2-like transposase domain;  InterPro: IPR024445 This domain probably functions as an integrase that is found in a wide variety of transposases, including ISXO2.
Probab=71.52  E-value=14  Score=31.62  Aligned_cols=69  Identities=17%  Similarity=0.181  Sum_probs=44.2

Q ss_pred             cEEEEeCeeecCCC--------------CcceEEEEEEeCC-CCeEEEEEEEeecccchhHHHHHHHHHhhccccCCCcE
Q 043258          166 GILAVDGWEINNPC--------------NSVMLVAAGLDGN-NGILPVAFCEVQVEDLDSWVYFLKNINSALRLENGKGL  230 (454)
Q Consensus       166 ~vi~~D~t~~~~~y--------------~~~l~~~~g~d~~-~~~~~la~al~~~E~~e~~~w~l~~l~~~~~~~~~~~~  230 (454)
                      .+|-+|.||..++-              .....++++++-+ +..--+...++++.+.++..-+++...+       +..
T Consensus         4 G~VEiDEty~~~~~~~~~~~~~~~gr~~~~k~~V~~~ver~~~~~~~~~~~~v~~~~~~tl~~~i~~~i~-------~gs   76 (151)
T PF12762_consen    4 GIVEIDETYFGGRKNKKPRRKGKRGRGSKNKVPVFGAVERNDGGTGRVFMFVVPDRSAETLKPIIQEHIE-------PGS   76 (151)
T ss_pred             CEEEeCcCEECCcccccccCCCCCCCcCCCCcEEEEEEeecccCCceEEEEeecccccchhHHHHHHhhh-------ccc
Confidence            36778888875433              2245666666665 4444555556688888887766654333       335


Q ss_pred             EEEcCCcchHH
Q 043258          231 CILGDGDNGVE  241 (454)
Q Consensus       231 ~iitD~~~~l~  241 (454)
                      +|+||+.++-.
T Consensus        77 ~i~TD~~~aY~   87 (151)
T PF12762_consen   77 TIITDGWRAYN   87 (151)
T ss_pred             eeeecchhhcC
Confidence            78999998753


No 24 
>PRK13907 rnhA ribonuclease H; Provisional
Probab=70.53  E-value=35  Score=28.14  Aligned_cols=78  Identities=18%  Similarity=0.079  Sum_probs=44.8

Q ss_pred             EEEEeCeeecCCCCcceEEEEEEeCCCCeEEEEEEE-eecccchhHHHHHHHHHhhccccCCCcEEEEcCCcchHHHHHh
Q 043258          167 ILAVDGWEINNPCNSVMLVAAGLDGNNGILPVAFCE-VQVEDLDSWVYFLKNINSALRLENGKGLCILGDGDNGVEYAVE  245 (454)
Q Consensus       167 vi~~D~t~~~~~y~~~l~~~~g~d~~~~~~~la~al-~~~E~~e~~~w~l~~l~~~~~~~~~~~~~iitD~~~~l~~Ai~  245 (454)
                      .|.+||.+..+.-.+-.-.++ .|..+.. .+++.. ..+.+..-+.-++..|+.+.. .+..++.|-||. +.+.+++.
T Consensus         3 ~iy~DGa~~~~~g~~G~G~vi-~~~~~~~-~~~~~~~~~tn~~AE~~All~aL~~a~~-~g~~~v~i~sDS-~~vi~~~~   78 (128)
T PRK13907          3 EVYIDGASKGNPGPSGAGVFI-KGVQPAV-QLSLPLGTMSNHEAEYHALLAALKYCTE-HNYNIVSFRTDS-QLVERAVE   78 (128)
T ss_pred             EEEEeeCCCCCCCccEEEEEE-EECCeeE-EEEecccccCCcHHHHHHHHHHHHHHHh-CCCCEEEEEech-HHHHHHHh
Confidence            378999998765332222222 4555543 333221 234455567777888887774 234567788876 55666666


Q ss_pred             hhc
Q 043258          246 EFL  248 (454)
Q Consensus       246 ~vf  248 (454)
                      ..+
T Consensus        79 ~~~   81 (128)
T PRK13907         79 KEY   81 (128)
T ss_pred             HHH
Confidence            654


No 25 
>PF04500 FLYWCH:  FLYWCH zinc finger domain;  InterPro: IPR007588 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.  C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. 
C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2 []. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes []. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA []. C2H2 Znf's can also bind to RNA and protein targets []. This entry represents a potential FLYWCH Zn-finger domain found in a number of eukaryotic proteins. FLYWCH is a C2H2-type zinc finger characterised by five conserved hydrophobic residues, containing the conserved sequence motif:  F/Y-X(n)-L-X(n)-F/Y-X(n)-WXCX(6-12)CX(17-22)HXH  where X indicates any amino acid. This domain was first characterised in Drosophila Modifier of mdg4 proteins, Mod(mdg4), putative chromatin modulators involved in higher order chromatin domains. Mod(mdg4) proteins share a common N-terminal BTB/POZ domain, but differ in their C-terminal region, most containing C-terminal FLYWCH zinc finger motifs []. The FLYWCH domain in Mod(mdg4) proteins has a putative role in protein-protein interactions; for example, Mod(mdg4)-67.2 interacts with DNA-binding protein Su(Hw) via its FLYWCH domain. FLYWCH domains have been described in other proteins as well, including suppressor of killer of prune, Su(Kpn), which contains 4 terminal FLYWCH zinc finger motifs in a tandem array and a C-terminal glutathione S-transferase (GST) domain []. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; PDB: 2RPR_A.
Probab=68.61  E-value=5.2  Score=28.17  Aligned_cols=35  Identities=14%  Similarity=0.163  Sum_probs=16.0

Q ss_pred             CeEEEEEecc---CCCccEEEEEEeCCCCeEEEEEecCCccc
Q 043258            5 SHIVSCECSD---LSCDWQVTAIRDVRGKGFVITQFSPKHNC   43 (454)
Q Consensus         5 ~~r~~~~C~~---~gC~~~v~~~~~~~~~~~~v~~~~~~Hnc   43 (454)
                      .....-+|+.   .+|+++|...  .++  -.|.....+|||
T Consensus        25 ~~~~~WrC~~~~~~~C~a~~~~~--~~~--~~~~~~~~~HnH   62 (62)
T PF04500_consen   25 DGKTYWRCSRRRSHGCRARLITD--AGD--GRVVRTNGEHNH   62 (62)
T ss_dssp             SS-EEEEEGGGTTS----EEEEE----T--TEEEE-S---SS
T ss_pred             CCcEEEEeCCCCCCCCeEEEEEE--CCC--CEEEECCCccCC
Confidence            3455678885   3899999988  222  345566688998


No 26 
>COG5431 Uncharacterized metal-binding protein [Function unknown]
Probab=67.31  E-value=1.5  Score=34.60  Aligned_cols=31  Identities=32%  Similarity=0.480  Sum_probs=21.5

Q ss_pred             CCceEEEEcccceeecCCccc-----CCCCchhHHHHH
Q 043258          381 DGRRFILNMDAMSCSCGLWQI-----SGIPCAHACRGI  413 (454)
Q Consensus       381 ~~~~~~V~l~~~~CsC~~~~~-----~giPC~H~lav~  413 (454)
                      .++.|+++.+  .|||..|-.     ---||.|+++.-
T Consensus        41 ~~rdYIl~~g--fCSCp~~~~svvl~Gk~~C~Hi~glk   76 (117)
T COG5431          41 KERDYILEGG--FCSCPDFLGSVVLKGKSPCAHIIGLK   76 (117)
T ss_pred             cccceEEEcC--cccCHHHHhHhhhcCcccchhhhhee
Confidence            3446666665  999998762     235899998753


No 27 
>PF13592 HTH_33:  Winged helix-turn helix
Probab=64.85  E-value=12  Score=26.69  Aligned_cols=31  Identities=23%  Similarity=0.303  Sum_probs=26.3

Q ss_pred             CCCCHHHHHHHHHHHhCCccCHHHHHHHHHH
Q 043258           69 PSISTTEVRNEIESMYGIKCPEWKVFCAANR   99 (454)
Q Consensus        69 ~~~~~~~i~~~l~~~~g~~~s~~~~~rak~~   99 (454)
                      +-++..+|.+.|.+.+|+.+|.+.+++.-++
T Consensus         3 ~~wt~~~i~~~I~~~fgv~ys~~~v~~lL~r   33 (60)
T PF13592_consen    3 GRWTLKEIAAYIEEEFGVKYSPSGVYRLLKR   33 (60)
T ss_pred             CcccHHHHHHHHHHHHCCEEcHHHHHHHHHH
Confidence            4578899999999999999999999876443


No 28 
>COG4279 Uncharacterized conserved protein [Function unknown]
Probab=60.73  E-value=5.1  Score=37.15  Aligned_cols=24  Identities=25%  Similarity=0.588  Sum_probs=19.4

Q ss_pred             cceeecCCcccCCCCchhHHHHHHHhc
Q 043258          391 AMSCSCGLWQISGIPCAHACRGIKYMR  417 (454)
Q Consensus       391 ~~~CsC~~~~~~giPC~H~lav~~~~~  417 (454)
                      ...|||..|   -.||.|+-||....+
T Consensus       124 ~~dCSCPD~---anPCKHi~AvyY~la  147 (266)
T COG4279         124 STDCSCPDY---ANPCKHIAAVYYLLA  147 (266)
T ss_pred             ccccCCCCc---ccchHHHHHHHHHHH
Confidence            456999975   579999999987764


No 29 
>PF01498 HTH_Tnp_Tc3_2:  Transposase;  InterPro: IPR002492 Transposase proteins are necessary for efficient DNA transposition. This family includes the amino-terminal region of Tc1, Tc1A, Tc1B and Tc2B transposases of Caenorhabditis elegans. The region encompasses the specific DNA binding and second DNA recognition domains as well as an amino-terminal region of the catalytic domain of Tc3 as described in []. Tc3 is a member of the Tc1/mariner family of transposable elements. This entry also includes histone-lysine N-methyltransferase SETMAR, which is a SET domain and mariner transposase fusion gene-containing protein. This histone methyltransferase has sequence-specific DNA-binding activity and recognises the 19-mer core of the 5'-terminal inverted repeats (TIRs) of the Hsmar1 element. This protein has DNA nicking activity, and has in vivo end joining activity and may mediate genomic integration of foreign DNA [, , , ]. More information about these proteins can be found at Protein of the Month: Transposase [].; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated, 0015074 DNA integration; PDB: 3K9K_B 3F2K_B 3K9J_B 1U78_A.
Probab=59.76  E-value=9  Score=28.20  Aligned_cols=38  Identities=18%  Similarity=0.326  Sum_probs=17.0

Q ss_pred             HhhhhccCCCCCHHHHHHHHHHHhCCccCHHHHHHHHHH
Q 043258           61 FLHRWKEQPSISTTEVRNEIESMYGIKCPEWKVFCAANR   99 (454)
Q Consensus        61 ~~~~~~~~~~~~~~~i~~~l~~~~g~~~s~~~~~rak~~   99 (454)
                      +...+..+|..+..+|...+... |..+|..++.|.-..
T Consensus         4 I~~~v~~~p~~s~~~i~~~l~~~-~~~vS~~TI~r~L~~   41 (72)
T PF01498_consen    4 IVRMVRRNPRISAREIAQELQEA-GISVSKSTIRRRLRE   41 (72)
T ss_dssp             ------------HHHHHHHT----T--S-HHHHHHHHHH
T ss_pred             HHHHHHHCCCCCHHHHHHHHHHc-cCCcCHHHHHHHHHH
Confidence            44556678999999999999988 999999999876443


No 30 
>PF08766 DEK_C:  DEK C terminal domain;  InterPro: IPR014876 DEK is a chromatin associated protein that is linked with cancers and autoimmune disease. This domain is found at the C-terminal of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients []. The structure of this domain shows it to be homologous to the E2F/DP transcription factor family []. This domain is also found in chitin synthase proteins like Q8TF96 from SWISSPROT, and in protein phosphatases such as Q6NN85 from SWISSPROT. ; PDB: 1Q1V_A.
Probab=53.48  E-value=33  Score=23.78  Aligned_cols=38  Identities=18%  Similarity=0.346  Sum_probs=24.7

Q ss_pred             HHHHHHhhhhcc-C-CCCCHHHHHHHHHHHhCCccCHHHH
Q 043258           56 WISAMFLHRWKE-Q-PSISTTEVRNEIESMYGIKCPEWKV   93 (454)
Q Consensus        56 ~i~~~~~~~~~~-~-~~~~~~~i~~~l~~~~g~~~s~~~~   93 (454)
                      -+...+.+.+.+ + ..++.++|...|.+.+|.+++..+.
T Consensus         4 ~i~~~i~~iL~~~dl~~vT~k~vr~~Le~~~~~dL~~~K~   43 (54)
T PF08766_consen    4 EIREAIREILREADLDTVTKKQVREQLEERFGVDLSSRKK   43 (54)
T ss_dssp             HHHHHHHHHHTTS-GGG--HHHHHHHHHHH-SS--SHHHH
T ss_pred             HHHHHHHHHHHhCCHhHhhHHHHHHHHHHHHCCCcHHHHH
Confidence            355667777775 3 4689999999999999999986553


No 31 
>PF13276 HTH_21:  HTH-like domain
Probab=45.17  E-value=46  Score=23.38  Aligned_cols=42  Identities=17%  Similarity=0.252  Sum_probs=33.5

Q ss_pred             HHHHHhhhhccC-CCCCHHHHHHHHHHHhCCccCHHHHHHHHH
Q 043258           57 ISAMFLHRWKEQ-PSISTTEVRNEIESMYGIKCPEWKVFCAAN   98 (454)
Q Consensus        57 i~~~~~~~~~~~-~~~~~~~i~~~l~~~~g~~~s~~~~~rak~   98 (454)
                      +.+.+.+....+ +.+....|...|+++.|+.+|..+++|..+
T Consensus         6 l~~~I~~i~~~~~~~yG~rri~~~L~~~~~~~v~~krV~RlM~   48 (60)
T PF13276_consen    6 LRELIKEIFKESKPTYGYRRIWAELRREGGIRVSRKRVRRLMR   48 (60)
T ss_pred             HHHHHHHHHHHcCCCeehhHHHHHHhccCcccccHHHHHHHHH
Confidence            455566666654 788999999999999899999999987643


No 32 
>PF08069 Ribosomal_S13_N:  Ribosomal S13/S15 N-terminal domain;  InterPro: IPR012606 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [, ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.  Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [, ]. This domain is found at the N terminus of ribosomal S13 and S15 proteins. This domain is also identified as NUC021 [].; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005840 ribosome; PDB: 3U5C_N 3O30_G 3IZB_O 3O2Z_G 3U5G_N 2XZN_O 2XZM_O 3IZ6_O.
Probab=44.26  E-value=20  Score=25.65  Aligned_cols=29  Identities=10%  Similarity=0.218  Sum_probs=23.6

Q ss_pred             HHHHhhhhcc--CCCCCHHHHHHHHHHHhCC
Q 043258           58 SAMFLHRWKE--QPSISTTEVRNEIESMYGI   86 (454)
Q Consensus        58 ~~~~~~~~~~--~~~~~~~~i~~~l~~~~g~   86 (454)
                      ++++++.|..  ..+++|.+|--.|+.+||+
T Consensus        30 ~~eVe~~I~klakkG~tpSqIG~iLRD~~GI   60 (60)
T PF08069_consen   30 PEEVEELIVKLAKKGLTPSQIGVILRDQYGI   60 (60)
T ss_dssp             HHHHHHHHHHHCCTTHCHHHHHHHHHHSCTC
T ss_pred             HHHHHHHHHHHHHcCCCHHHhhhhhhhccCC
Confidence            4666666664  6789999999999999985


No 33 
>PRK00766 hypothetical protein; Provisional
Probab=44.09  E-value=2.5e+02  Score=25.42  Aligned_cols=89  Identities=19%  Similarity=0.212  Sum_probs=51.1

Q ss_pred             cEEEEeC-eeecCCCCcceEEEEEEeCCCCeEEEEEEEeecccchhHHHHHHHHHhhcc---------------------
Q 043258          166 GILAVDG-WEINNPCNSVMLVAAGLDGNNGILPVAFCEVQVEDLDSWVYFLKNINSALR---------------------  223 (454)
Q Consensus       166 ~vi~~D~-t~~~~~y~~~l~~~~g~d~~~~~~~la~al~~~E~~e~~~w~l~~l~~~~~---------------------  223 (454)
                      .|+.||- .|..+.-+-..++-+-.-++.-..-++|+-+.-.-.|.=.-+.+.++....                     
T Consensus        10 rvlGidds~f~~~~~~~~~lvGvv~r~~~~idGv~~~~itvdG~DaT~~i~~mv~~~~~r~~i~~V~L~Git~agFNvvD   89 (194)
T PRK00766         10 RVLGIDDGTFLFKSSEKVILVGVVMRGGDWVDGVLSRWITVDGLDATEAIIEMVNSSRHKGQLRVIMLDGITYGGFNVVD   89 (194)
T ss_pred             eEEEEecCccccCCCCCEEEEEEEEECCeEEeeEEEEEEEECCccHHHHHHHHHHhcccccceEEEEECCEeeeeeEEec
Confidence            5788884 444433333555555566666666777777666555555555544443110                     


Q ss_pred             -----ccCCCcEEEEcCCcc---hHHHHHhhhcCCcccc
Q 043258          224 -----LENGKGLCILGDGDN---GVEYAVEEFLPRAVYR  254 (454)
Q Consensus       224 -----~~~~~~~~iitD~~~---~l~~Ai~~vfP~a~~~  254 (454)
                           ...+-|+.+++..-+   +|.+|+++.||+...+
T Consensus        90 ~~~l~~~tg~PVI~V~r~~p~~~~ie~AL~k~f~~~~~R  128 (194)
T PRK00766         90 IEELYRETGLPVIVVMRKKPDFEAIESALKKHFSDWEER  128 (194)
T ss_pred             HHHHHHHHCCCEEEEEecCCCHHHHHHHHHHHCCCHHHH
Confidence                 001246666644444   6888888888886554


No 34 
>PF13082 DUF3931:  Protein of unknown function (DUF3931)
Probab=42.68  E-value=54  Score=22.39  Aligned_cols=28  Identities=14%  Similarity=0.025  Sum_probs=21.0

Q ss_pred             CCcceEEEEEEeCCCCeEEEEEEEeecc
Q 043258          179 CNSVMLVAAGLDGNNGILPVAFCEVQVE  206 (454)
Q Consensus       179 y~~~l~~~~g~d~~~~~~~la~al~~~E  206 (454)
                      |...-++++|-+++|+..++.-.|-.+|
T Consensus        35 yefssfvlcgetpdgrrlvlthmistde   62 (66)
T PF13082_consen   35 YEFSSFVLCGETPDGRRLVLTHMISTDE   62 (66)
T ss_pred             EEEEEEEEEccCCCCcEEEEEEEecchh
Confidence            3445677889999999888877776655


No 35 
>PF13551 HTH_29:  Winged helix-turn helix
Probab=39.49  E-value=54  Score=25.96  Aligned_cols=40  Identities=18%  Similarity=0.267  Sum_probs=29.9

Q ss_pred             HHHhhhhccCC-----CCCHHHHHHHH-HHHhCCccCHHHHHHHHH
Q 043258           59 AMFLHRWKEQP-----SISTTEVRNEI-ESMYGIKCPEWKVFCAAN   98 (454)
Q Consensus        59 ~~~~~~~~~~~-----~~~~~~i~~~l-~~~~g~~~s~~~~~rak~   98 (454)
                      +.+.+.+..+|     .+++..|...| ++.+|+.+|..++++.-+
T Consensus        64 ~~l~~~~~~~p~~g~~~~t~~~l~~~l~~~~~~~~~s~~ti~r~L~  109 (112)
T PF13551_consen   64 AQLIELLRENPPEGRSRWTLEELAEWLIEEEFGIDVSPSTIRRILK  109 (112)
T ss_pred             HHHHHHHHHCCCCCCCcccHHHHHHHHHHhccCccCCHHHHHHHHH
Confidence            34455555544     47889999976 888999999999987644


No 36 
>TIGR00334 5S_RNA_mat_M5 ribonuclease M5. This family of orthologous proteins shows a weak but significant similarity to the central region of the DnaG-type DNA primase. The region of similarity is termed the Toprim (topoisomerase-primase) domain and is also shared by RecR, OLD family nucleases, and type IA and II topoisomerases.
Probab=36.68  E-value=51  Score=29.13  Aligned_cols=44  Identities=23%  Similarity=0.244  Sum_probs=34.0

Q ss_pred             HHHHHHhhccccCCCcEEEEcCCcch---HHHHHhhhcCCccccccHHHH
Q 043258          214 FLKNINSALRLENGKGLCILGDGDNG---VEYAVEEFLPRAVYRQCCHRI  260 (454)
Q Consensus       214 ~l~~l~~~~~~~~~~~~~iitD~~~~---l~~Ai~~vfP~a~~~~C~~Hi  260 (454)
                      .++.++.+.   ....+.|+||.|.+   |.+-|.+.+|++.|.+=...-
T Consensus        37 ~i~~i~~~~---~~rgVIIfTDpD~~GekIRk~i~~~vp~~khafi~~~~   83 (174)
T TIGR00334        37 TINLIKKAQ---KKQGVIILTDPDFPGEKIRKKIEQHLPGYENCFIPKHL   83 (174)
T ss_pred             HHHHHHHHh---hcCCEEEEeCCCCchHHHHHHHHHHCCCCeEEeeeHHh
Confidence            455666655   67789999999974   888999999999998754443


No 37 
>PF14420 Clr5:  Clr5 domain
Probab=34.32  E-value=90  Score=21.63  Aligned_cols=26  Identities=15%  Similarity=0.180  Sum_probs=22.4

Q ss_pred             CCCCCHHHHHHHHHHHhCCccCHHHH
Q 043258           68 QPSISTTEVRNEIESMYGIKCPEWKV   93 (454)
Q Consensus        68 ~~~~~~~~i~~~l~~~~g~~~s~~~~   93 (454)
                      +.+.+..+|++.++..||..+|..+.
T Consensus        18 ~e~~tl~~v~~~M~~~~~F~at~rqy   43 (54)
T PF14420_consen   18 DENKTLEEVMEIMKEEHGFKATKRQY   43 (54)
T ss_pred             hCCCcHHHHHHHHHHHhCCCcCHHHH
Confidence            56788999999999999999986653


No 38 
>PF10045 DUF2280:  Uncharacterized conserved protein (DUF2280);  InterPro: IPR018738 This entry is represented by Burkholderia phage Bups phi1, Orf2.36. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches.
Probab=32.96  E-value=50  Score=26.33  Aligned_cols=23  Identities=22%  Similarity=0.410  Sum_probs=20.7

Q ss_pred             CHHHHHHHHHHHhCCccCHHHHH
Q 043258           72 STTEVRNEIESMYGIKCPEWKVF   94 (454)
Q Consensus        72 ~~~~i~~~l~~~~g~~~s~~~~~   94 (454)
                      +|+++.+.+++++|+.+|..++-
T Consensus        21 TPs~v~~aVk~eFgi~vsrQqve   43 (104)
T PF10045_consen   21 TPSEVAEAVKEEFGIDVSRQQVE   43 (104)
T ss_pred             CHHHHHHHHHHHhCCccCHHHHH
Confidence            79999999999999999987764


No 39 
>PF09713 A_thal_3526:  Plant protein 1589 of unknown function (A_thal_3526);  InterPro: IPR006476 This plant-specific family of proteins is defined by an uncharacterised region 57 residues in length. It is found toward the N terminus of most proteins that contain it. Examples include at least several proteins from Arabidopsis thaliana (Mouse-ear cress) and Oryza sativa (Rice). The function of these proteins is unknown.
Probab=31.74  E-value=44  Score=23.34  Aligned_cols=25  Identities=16%  Similarity=0.089  Sum_probs=18.0

Q ss_pred             CCCHHHHHHHHHHHhCCccCHH-HHH
Q 043258           70 SISTTEVRNEIESMYGIKCPEW-KVF   94 (454)
Q Consensus        70 ~~~~~~i~~~l~~~~g~~~s~~-~~~   94 (454)
                      .++..++++.|.++.|+++... .+|
T Consensus        12 yMsk~E~v~~L~~~a~I~P~~T~~VW   37 (54)
T PF09713_consen   12 YMSKEECVRALQKQANIEPVFTSTVW   37 (54)
T ss_pred             cCCHHHHHHHHHHHcCCChHHHHHHH
Confidence            5678899999988888775543 344


No 40 
>KOG4027 consensus Uncharacterized conserved protein [Function unknown]
Probab=30.24  E-value=72  Score=27.59  Aligned_cols=35  Identities=11%  Similarity=0.046  Sum_probs=30.0

Q ss_pred             EeCeee-cCCCCcc--eEEEEEEeCCCCeEEEEEEEee
Q 043258          170 VDGWEI-NNPCNSV--MLVAAGLDGNNGILPVAFCEVQ  204 (454)
Q Consensus       170 ~D~t~~-~~~y~~~--l~~~~g~d~~~~~~~la~al~~  204 (454)
                      ||.||+ ++.|+.|  ++...|.|+-|+-.+.|+|.+.
T Consensus        70 ievt~KstsPygWPqivl~vfg~d~~G~d~v~GYg~~h  107 (187)
T KOG4027|consen   70 IEVTLKSTSPYGWPQIVLNVFGKDHSGKDCVTGYGMLH  107 (187)
T ss_pred             eEEEeccCCCCCCceEEEEEecCCcCCcceeeeeeeEe
Confidence            788997 4789987  6677899999999999999875


No 41 
>PF11447 DUF3201:  Protein of unknown function (DUF3201);  InterPro: IPR024505 This archaeal family of proteins has no known function.; PDB: 1YB3_A.
Probab=29.90  E-value=2.8e+02  Score=23.35  Aligned_cols=72  Identities=13%  Similarity=0.066  Sum_probs=41.9

Q ss_pred             HHHHHHHHHHHhhCCCcEEEEEecccccccccceeEEEEeecchHHHHHhhcccEEEEeCeeecCCCCcceEEE-----E
Q 043258          113 AMLHQFKEEMERIDRDNIVLVETETHESREEERFKRVFVCCARTSYAFKVHCRGILAVDGWEINNPCNSVMLVA-----A  187 (454)
Q Consensus       113 ~~l~~~~~~l~~~npg~~~~~~~~~~~~~~~~~f~~~f~~~~~~~~~~~~~~~~vi~~D~t~~~~~y~~~l~~~-----~  187 (454)
                      ..+...-++|++.-||+.++--.        .                  .|.-.|.+||-+..-+|--|.+.+     +
T Consensus         8 ~eif~l~eELkeel~gf~vE~v~--------e------------------vFnayi~lDgeW~em~YPhPaf~ikp~gEv   61 (150)
T PF11447_consen    8 EEIFRLNEELKEELKGFKVEEVE--------E------------------VFNAYIYLDGEWREMKYPHPAFEIKPQGEV   61 (150)
T ss_dssp             HHHHHHHHHHHHHSTTSEE---E--------E------------------E-S-EEEETTEEEE--S-EEEEEEETTEEE
T ss_pred             HHHHHHHHHHHHHcCCCcceeHh--------h------------------hhheeEEecCeeeeecCCCCceeeccCccc
Confidence            34556667888888898776421        1                  255669999999999999887765     6


Q ss_pred             EEeCCCCeEEEEEEEeecccchhHH
Q 043258          188 GLDGNNGILPVAFCEVQVEDLDSWV  212 (454)
Q Consensus       188 g~d~~~~~~~la~al~~~E~~e~~~  212 (454)
                      |.+..+  +-+-+|+...+-.+.+.
T Consensus        62 Gat~q~--~YFvfav~kE~is~~Fv   84 (150)
T PF11447_consen   62 GATPQG--FYFVFAVPKEKISEEFV   84 (150)
T ss_dssp             EEETTE--EEEEEEEEGGG--HHHH
T ss_pred             ccccce--EEEEEEeeHHHhhHHHH
Confidence            666554  44555555555444443


No 42 
>PF01316 Arg_repressor:  Arginine repressor, DNA binding domain;  InterPro: IPR020900 The arginine dihydrolase (AD) pathway is found in many prokaryotes and some primitive eukaryotes, an example of the latter being Giardia lamblia (Giardia intestinalis) []. The three-enzyme anaerobic pathway breaks down L-arginine to form 1 mol of ATP, carbon dioxide and ammonia. In simpler bacteria, the first enzyme, arginine deiminase, can account for up to 10% of total cell protein []. Most prokaryotic arginine deiminase pathways are under the control of a repressor gene, termed ArgR []. This is a negative regulator, and will only release the arginine deiminase operon for expression in the presence of arginine []. The crystal structure of apo-ArgR from Bacillus stearothermophilus has been determined to 2.5A by means of X-ray crystallography []. The protein exists as a hexamer of identical subunits, and is shown to have six DNA-binding domains, clustered around a central oligomeric core when bound to arginine. It predominantly interacts with A.T residues in ARG boxes. This hexameric protein binds DNA at its N terminus to repress arginine biosynthesis or activate arginine catabolism. Some species have several ArgR paralogs. In a neighbour-joining tree, some of these paralogous sequences show long branches and differ significantly from the well-conserved C-terminal region. ; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0006355 regulation of transcription, DNA-dependent, 0006525 arginine metabolic process; PDB: 1AOY_A 3V4G_A 3LAJ_D 3FHZ_A 3LAP_B 3ERE_D 2P5L_C 1F9N_D 2P5K_A 1B4A_A ....
Probab=27.93  E-value=1.2e+02  Score=22.50  Aligned_cols=41  Identities=15%  Similarity=0.126  Sum_probs=26.2

Q ss_pred             HHHHhhhhccCCCCCHHHHHHHHHHHhCCccCHHHHHHHHHH
Q 043258           58 SAMFLHRWKEQPSISTTEVRNEIESMYGIKCPEWKVFCAANR   99 (454)
Q Consensus        58 ~~~~~~~~~~~~~~~~~~i~~~l~~~~g~~~s~~~~~rak~~   99 (454)
                      .+.+.+.|...+-.+-.+|++.|++. |+.++..++.|--+.
T Consensus         7 ~~~I~~li~~~~i~sQ~eL~~~L~~~-Gi~vTQaTiSRDLke   47 (70)
T PF01316_consen    7 QELIKELISEHEISSQEELVELLEEE-GIEVTQATISRDLKE   47 (70)
T ss_dssp             HHHHHHHHHHS---SHHHHHHHHHHT-T-T--HHHHHHHHHH
T ss_pred             HHHHHHHHHHCCcCCHHHHHHHHHHc-CCCcchhHHHHHHHH
Confidence            34566677777767788999999875 999999999876443


No 43 
>PF03705 CheR_N:  CheR methyltransferase, all-alpha domain;  InterPro: IPR022641  CheR proteins are part of the chemotaxis signaling mechanism which methylates the chemotaxis receptor at specific glutamate residues. This entry refers to the N-terminal domain of the CheR-type MCP methyltransferases, which are found in bacteria, archaea and green plants. This entry is found in association with PF01739 from PFAM.  Methyl transfer from the ubiquitous S-adenosyl-L-methionine (AdoMet) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms ranging from bacteria to plants and mammals. The reaction is catalysed by methyltransferases (Mtases) and modifies DNA, RNA, proteins and small molecules, such as catechol for regulatory purposes. The various aspects of the role of DNA methylation in prokaryotic restriction-modification systems and in a number of cellular processes in eukaryotes including gene regulation and differentiation are well documented. Three classes of DNA Mtases transfer the methyl group from AdoMet to the target base to form either N-6-methyladenine, or N-4-methylcytosine, or C-5- methylcytosine. In C-5-cytosine Mtases, ten conserved motifs are arranged in the same order []. Motif I (a glycine-rich or closely related consensus sequence; FAGxGG in M.HhaI []), shared by other AdoMet-Mtases [], is part of the cofactor binding site and motif IV (PCQ) is part of the catalytic site. In contrast, sequence comparison among N-6-adenine and N-4-cytosine Mtases indicated two of the conserved segments [], although more conserved segments may be present. One of them corresponds to motif I in C-5-cytosine Mtases, and the other is named (D/N/S)PP(Y/F). Crystal structures are known for a number of Mtases [, , , ]. The cofactor binding sites are almost identical and the essential catalytic amino acids coincide. 
The comparable protein folding and the existence of equivalent amino acids in similar secondary and tertiary positions indicate that many (if not all) AdoMet-Mtases have a common catalytic domain structure. This permits tertiary structure prediction of other DNA, RNA, protein, and small-molecule AdoMet-Mtases from their amino acid sequences []. Flagellated bacteria swim towards favourable chemicals and away from deleterious ones. Sensing of chemoeffector gradients involves chemotaxis receptors, transmembrane (TM) proteins that detect stimuli through their periplasmic domains and transduce the signals via their cytoplasmic domains []. Signalling outputs from these receptors are influenced both by the binding of the chemoeffector ligand to their periplasmic domains and by methylation of specific glutamate residues on their cytoplasmic domains. Methylation is catalysed by CheR, an S-adenosylmethionine-dependent methyltransferase [], which reversibly methylates specific glutamate residues within a coiled coil region, to form gamma-glutamyl methyl ester residues [, ]. The structure of the Salmonella typhimurium chemotaxis receptor methyltransferase CheR, bound to S-adenosylhomocysteine, has been determined to a resolution of 2.0 A []. The structure reveals CheR to be a two-domain protein, with a smaller N-terminal helical domain linked via a single polypeptide connection to a larger C-terminal alpha/beta domain. The C-terminal domain has the characteristics of a nucleotide-binding fold, with an insertion of a small anti-parallel beta-sheet subdomain. The S-adenosylhomocysteine-binding site is formed mainly by the large domain, with contributions from residues within the N-terminal domain and the linker region [].; PDB: 1AF7_A 1BC5_A.
Probab=26.51  E-value=1.8e+02  Score=19.86  Aligned_cols=47  Identities=19%  Similarity=0.211  Sum_probs=22.3

Q ss_pred             HHHHHHHHHHhCCccCHHHHHHHHHHHHHHhccChHHHHHHHHHHHHHHH
Q 043258           74 TEVRNEIESMYGIKCPEWKVFCAANRAKQILGLDYDDGYAMLHQFKEEME  123 (454)
Q Consensus        74 ~~i~~~l~~~~g~~~s~~~~~rak~~~~~~~~g~~~~~~~~l~~~~~~l~  123 (454)
                      ..+.+.|.+..|+.++..+-.-.+.++...+.   ..++..+.+|++.++
T Consensus         6 ~~~~~~i~~~~Gi~l~~~K~~~l~rRl~~rm~---~~~~~~~~~y~~~L~   52 (57)
T PF03705_consen    6 ERFRELIYRRTGIDLSEYKRSLLERRLARRMR---ALGLPSFAEYYELLR   52 (57)
T ss_dssp             HHHHHHHHHHH-----GGGHHHHHHHHHHHHH---HHT---HHHHHHHHH
T ss_pred             HHHHHHHHHHHCCCCchhhHHHHHHHHHHHHH---HcCCCCHHHHHHHHH
Confidence            45778899999999887664444444433322   223345556666653


No 44 
>KOG2909 consensus Vacuolar H+-ATPase V1 sector, subunit C [Energy production and conversion]
Probab=26.50  E-value=2.1e+02  Score=28.12  Aligned_cols=29  Identities=10%  Similarity=0.082  Sum_probs=24.8

Q ss_pred             ChHHHHHHHHHHHHHHHhhCCCcEEEEEe
Q 043258          107 DYDDGYAMLHQFKEEMERIDRDNIVLVET  135 (454)
Q Consensus       107 ~~~~~~~~l~~~~~~l~~~npg~~~~~~~  135 (454)
                      ...+.|+.+..=++.++++.-|+.....+
T Consensus       139 ~r~a~yn~ak~nl~nlerK~~GsL~~rsL  167 (381)
T KOG2909|consen  139 TRAAAYNNAKGNLQNLERKKTGSLSTRSL  167 (381)
T ss_pred             HHHHHHHhHHHHHHHHhhhccCChhhhhH
Confidence            45678999999999999999999887755


No 45 
>TIGR01529 argR_whole arginine repressor. This model includes most members of the arginine-responsive transcriptional regulator family ArgR. This hexameric protein binds DNA at its amino end to repress arginine biosynthesis or activate arginine catabolism. Some species have several ArgR paralogs. In a neighbor-joining tree, some of these paralogous sequences show long branches and differ significantly in an otherwise well-conserved C-terminal region motif GT[VIL][AC]GDDT. These paralogs are excluded from the seed and score in the gray zone of this model, between trusted and noise cutoffs.
Probab=23.47  E-value=1.4e+02  Score=25.56  Aligned_cols=36  Identities=14%  Similarity=0.050  Sum_probs=29.0

Q ss_pred             HHhhhhccCCCCCHHHHHHHHHHHhCCccCHHHHHHH
Q 043258           60 MFLHRWKEQPSISTTEVRNEIESMYGIKCPEWKVFCA   96 (454)
Q Consensus        60 ~~~~~~~~~~~~~~~~i~~~l~~~~g~~~s~~~~~ra   96 (454)
                      .++..+.+++-.+..+|.+.|+ +.|+.+|..++||.
T Consensus         6 ~i~~Li~~~~i~tqeeL~~~L~-~~G~~vsqaTIsRd   41 (146)
T TIGR01529         6 RIKEIITEEKISTQEELVALLK-AEGIEVTQATVSRD   41 (146)
T ss_pred             HHHHHHHcCCCCCHHHHHHHHH-HhCCCcCHHHHHHH
Confidence            4555666777788999999995 45999999999983


No 46 
>cd00131 PAX Paired Box domain
Probab=22.93  E-value=1.7e+02  Score=24.37  Aligned_cols=39  Identities=13%  Similarity=0.065  Sum_probs=29.5

Q ss_pred             HHHhhhhccCCCCCHHHHHHHHHHHhCC-----ccCHHHHHHHHH
Q 043258           59 AMFLHRWKEQPSISTTEVRNEIESMYGI-----KCPEWKVFCAAN   98 (454)
Q Consensus        59 ~~~~~~~~~~~~~~~~~i~~~l~~~~g~-----~~s~~~~~rak~   98 (454)
                      +.+...+..+|+.+..+|.+.|.. .|+     .+|.++++|+-+
T Consensus        82 ~~i~~~v~~~p~~Tl~El~~~L~~-~gv~~~~~~~s~stI~R~L~  125 (128)
T cd00131          82 KKIEIYKQENPGMFAWEIRDRLLQ-EGVCDKSNVPSVSSINRILR  125 (128)
T ss_pred             HHHHHHHHHCCCCCHHHHHHHHHH-cCCcccCCCCCHHHHHHHHH
Confidence            334445678999999999999864 466     579999988743


No 47 
>PF07761 DUF1617:  Protein of unknown function (DUF1617);  InterPro: IPR011675 This entry is represented by Bacteriophage phi3396 (Streptococcus phage phi3396), Orf51 (Orf: phi3396_51). The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. This entry is found in a family of hypothetical bacterial and bacteriophage proteins. The region represented by this entry is approximately 150 residues long and is highly conserved throughout the family.
Probab=20.44  E-value=4.3e+02  Score=22.60  Aligned_cols=36  Identities=14%  Similarity=0.090  Sum_probs=24.2

Q ss_pred             CCHHHHHHHHHHHhCCccCHHHHHHHHHHHHHHhcc
Q 043258           71 ISTTEVRNEIESMYGIKCPEWKVFCAANRAKQILGL  106 (454)
Q Consensus        71 ~~~~~i~~~l~~~~g~~~s~~~~~rak~~~~~~~~g  106 (454)
                      ++-++|.....--.++.+.-.++.|+|.+.++.+..
T Consensus         5 lkN~eL~~i~~~L~~iklk~~kaSraRtKLi~~v~~   40 (143)
T PF07761_consen    5 LKNKELNPIYNFLEKIKLKNMKASRARTKLIKLVEE   40 (143)
T ss_pred             eehHHHHHHHHHHHhcccccchhhHHHHHHHHHHHH
Confidence            344556655555556777666888999998876643


Done!