Query         042031
Match_columns 565
No_of_seqs    250 out of 1582
Neff          9.2 
Searched_HMMs 46136
Date          Fri Mar 29 13:16:21 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/042031.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/042031hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PLN03097 FHY3 Protein FAR-RED  100.0 8.7E-69 1.9E-73  580.4  41.2  444   16-483    70-623 (846)
  2 PF10551 MULE:  MULE transposas  99.8 8.7E-19 1.9E-23  143.4   7.3   76  205-285    18-93  (93)
  3 PF00872 Transposase_mut:  Tran  99.7 3.2E-18   7E-23  175.7   5.2  224  120-362   113-349 (381)
  4 PF03108 DBD_Tnp_Mut:  MuDR fam  99.7 7.5E-17 1.6E-21  122.4   8.9   67   17-83      1-67  (67)
  5 COG3328 Transposase and inacti  99.3 8.5E-11 1.8E-15  118.4  14.2  219  119-362    98-328 (379)
  6 smart00575 ZnF_PMZ plant mutat  99.0 2.1E-10 4.6E-15   69.7   2.3   28  441-468     1-28  (28)
  7 PF08731 AFT:  Transcription fa  98.9   7E-09 1.5E-13   83.5   8.1   68   26-93      1-110 (111)
  8 PF03101 FAR1:  FAR1 DNA-bindin  98.7 1.8E-08   4E-13   81.7   5.8   61   34-95      1-90  (91)
  9 PF04434 SWIM:  SWIM zinc finge  98.3 4.9E-07 1.1E-11   60.6   3.3   30  436-465    10-39  (40)
 10 PF00098 zf-CCHC:  Zinc knuckle  96.4  0.0018 3.9E-08   34.7   1.4   18  546-563     1-18  (18)
 11 PF15288 zf-CCHC_6:  Zinc knuck  96.2  0.0021 4.6E-08   41.8   1.0   18  546-563     2-21  (40)
 12 PF13696 zf-CCHC_2:  Zinc knuck  94.5   0.019 4.2E-07   35.5   1.2   21  544-564     7-27  (32)
 13 PF03106 WRKY:  WRKY DNA -bindi  94.1   0.077 1.7E-06   38.7   3.9   39   55-93     21-59  (60)
 14 PF06782 UPF0236:  Uncharacteri  94.0     1.8 3.8E-05   46.2  15.6  130  223-361   235-376 (470)
 15 PF04684 BAF1_ABF1:  BAF1 / ABF  93.2    0.19   4E-06   51.3   6.2   67   10-76     12-79  (496)
 16 COG3316 Transposase and inacti  92.9     1.3 2.8E-05   41.2  10.6  116  130-274    33-152 (215)
 17 PF01610 DDE_Tnp_ISL3:  Transpo  91.3    0.16 3.5E-06   49.3   3.1   66  218-288    30-96  (249)
 18 PF13610 DDE_Tnp_IS240:  DDE do  90.9    0.04 8.8E-07   48.3  -1.5   60  205-271    22-81  (140)
 19 smart00774 WRKY DNA binding do  90.7    0.35 7.5E-06   35.0   3.5   38   55-92     21-59  (59)
 20 PF04500 FLYWCH:  FLYWCH zinc f  90.5    0.45 9.9E-06   34.7   4.2   46   43-92     14-62  (62)
 21 smart00343 ZnF_C2HC zinc finge  87.1    0.32   7E-06   28.7   1.0   17  547-563     1-17  (26)
 22 PF14392 zf-CCHC_4:  Zinc knuck  87.0    0.24 5.1E-06   34.6   0.5   20  544-563    30-49  (49)
 23 PF13565 HTH_32:  Homeodomain-l  86.5     1.8 3.9E-05   33.2   5.4   41  107-147    34-76  (77)
 24 PF03050 DDE_Tnp_IS66:  Transpo  83.1     1.1 2.4E-05   44.0   3.5   36  250-290   121-156 (271)
 25 COG5179 TAF1 Transcription ini  82.0    0.86 1.9E-05   48.1   2.1   25  539-563   931-957 (968)
 26 COG5431 Uncharacterized metal-  78.3     1.6 3.5E-05   34.9   2.1   34  428-463    39-77  (117)
 27 PF02178 AT_hook:  AT hook moti  76.7     1.1 2.4E-05   21.8   0.5    9  524-532     2-10  (13)
 28 PHA02517 putative transposase   75.6      26 0.00056   34.4  10.5   42  107-149    30-72  (277)
 29 PRK09335 30S ribosomal protein  73.8     2.7 5.9E-05   33.3   2.3   27  520-553     2-28  (95)
 30 PLN00186 ribosomal protein S26  69.4     3.8 8.3E-05   33.3   2.2   27  520-553     2-28  (109)
 31 PTZ00172 40S ribosomal protein  68.6     4.1 8.9E-05   33.1   2.3   27  520-553     2-28  (108)
 32 smart00384 AT_hook DNA binding  64.6     4.1 8.8E-05   23.9   1.1   12  523-534     1-12  (26)
 33 PF13917 zf-CCHC_3:  Zinc knuck  58.1     6.1 0.00013   26.4   1.3   18  545-562     4-21  (42)
 34 COG5082 AIR1 Arginine methyltr  57.7     5.2 0.00011   36.3   1.2   16  546-561    98-113 (190)
 35 PF13592 HTH_33:  Winged helix-  56.4      20 0.00044   26.0   3.9   30  119-148     2-31  (60)
 36 COG4715 Uncharacterized conser  55.6      35 0.00077   36.4   6.9   51  415-467    41-97  (587)
 37 PF04937 DUF659:  Protein of un  52.7      48   0.001   29.4   6.5   62  225-290    74-138 (153)
 38 PF04800 ETC_C1_NDUFA4:  ETC co  51.5      27 0.00058   28.5   4.2   45    7-55     34-80  (101)
 39 COG4830 RPS26B Ribosomal prote  50.9      10 0.00023   30.1   1.7   27  520-553     2-28  (108)
 40 PHA00689 hypothetical protein   50.6      10 0.00022   25.8   1.4   12  544-555    16-27  (62)
 41 PF01283 Ribosomal_S26e:  Ribos  49.9      12 0.00025   31.0   1.9   27  520-553     2-28  (113)
 42 PF08766 DEK_C:  DEK C terminal  46.5      52  0.0011   23.2   4.7   38  107-144     4-43  (54)
 43 PF01498 HTH_Tnp_Tc3_2:  Transp  43.0      20 0.00043   26.9   2.2   36  112-148     4-39  (72)
 44 PF00665 rve:  Integrase core d  40.2      36 0.00077   28.1   3.6   54  205-263    29-82  (120)
 45 PF09713 A_thal_3526:  Plant pr  36.8      32  0.0007   24.4   2.2   26  121-146    12-38  (54)
 46 COG4279 Uncharacterized conser  34.0      26 0.00057   33.3   1.8   24  440-466   124-147 (266)
 47 PF05741 zf-nanos:  Nanos RNA b  33.3      13 0.00029   26.5  -0.1   20  544-563    32-54  (55)
 48 COG5082 AIR1 Arginine methyltr  32.5      22 0.00047   32.5   1.0   18  545-562    60-77  (190)
 49 PRK14892 putative transcriptio  32.4      35 0.00075   27.7   2.1    9  544-552    20-28  (99)
 50 PRK12286 rpmF 50S ribosomal pr  29.6      54  0.0012   23.6   2.5   32  520-553     4-35  (57)
 51 PF10045 DUF2280:  Uncharacteri  28.9      61  0.0013   26.3   2.9   24  123-146    21-44  (104)
 52 PF13276 HTH_21:  HTH-like doma  28.3 1.5E+02  0.0033   21.1   4.8   42  108-149     6-48  (60)
 53 PF13877 RPAP3_C:  Potential Mo  26.9      47   0.001   26.5   2.0   33  304-336     5-37  (94)
 54 COG5222 Uncharacterized conser  26.2      43 0.00092   32.5   1.9   25  540-564   171-195 (427)
 55 PF05634 APO_RNA-bind:  APO RNA  25.8      45 0.00098   30.7   1.9   20  544-563    97-121 (204)
 56 PTZ00368 universal minicircle   25.8      43 0.00092   29.4   1.7   19  546-564    78-96  (148)
 57 TIGR01031 rpmF_bact ribosomal   25.3      79  0.0017   22.6   2.7   41  520-561     2-42  (55)
 58 PTZ00368 universal minicircle   24.8      41  0.0009   29.5   1.5   20  545-564    52-71  (148)
 59 TIGR01589 A_thal_3526 uncharac  24.7      64  0.0014   23.2   2.1   26  122-147    16-42  (57)
 60 PF14201 DUF4318:  Domain of un  24.5 1.3E+02  0.0029   22.9   3.9   32   23-54     11-42  (74)
 61 PF13551 HTH_29:  Winged helix-  24.0 1.8E+02  0.0039   23.5   5.2   37  111-147    65-107 (112)
 62 PF14952 zf-tcix:  Putative tre  23.7      50  0.0011   22.2   1.3   21  544-564    10-31  (44)
 63 KOG2044 5'-3' exonuclease HKE1  23.5      32 0.00069   38.2   0.6   20  544-563   259-278 (931)
 64 COG4888 Uncharacterized Zn rib  22.7      71  0.0015   25.8   2.2    8  546-553    47-54  (104)
 65 KOG0341 DEAD-box protein abstr  22.0      42 0.00091   34.2   1.0   19  545-563   570-588 (610)
 66 PF01644 Chitin_synth_1:  Chiti  20.9 2.1E+02  0.0045   25.6   5.0   47  206-258   101-149 (163)
 67 PF11645 PDDEXK_5:  PD-(D/E)XK   20.7 2.8E+02  0.0062   24.1   5.5   45   29-73      6-50  (149)
 68 PF01316 Arg_repressor:  Argini  20.5 1.8E+02  0.0039   21.9   3.9   36  111-147     9-44  (70)
 69 KOG4602 Nanos and related prot  20.2      44 0.00095   31.7   0.7   20  544-563   267-289 (318)

No 1  
>PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional
Probab=100.00  E-value=8.7e-69  Score=580.40  Aligned_cols=444  Identities=16%  Similarity=0.260  Sum_probs=344.0

Q ss_pred             CCCCccccCeeCCHHHHHHHHHHHHHhccceEEEEeecCe-------EEEEEeec-------------------------
Q 042031           16 AEPTLSIGQEFPDVETCRRTLKDIAIALHFDLRIVKSDRS-------RFIAKCSK-------------------------   63 (565)
Q Consensus        16 ~~~~l~~G~~F~s~~e~~~a~~~ya~~~gf~~~~~kS~~~-------r~~~~C~~-------------------------   63 (565)
                      ....|.+||+|+|.|||++||+.||.+.||++|+.+|.++       ..+++|++                         
T Consensus        70 ~~~~P~vGMeF~S~eeA~~FYn~YA~~~GFsVRi~~srrsk~~~~ii~r~fvCsreG~~~~~~~~~~~~~~~~~k~~~~~  149 (846)
T PLN03097         70 TNLEPLSGMEFESHGEAYSFYQEYARSMGFNTAIQNSRRSKTSREFIDAKFACSRYGTKREYDKSFNRPRARQTKQDPEN  149 (846)
T ss_pred             CCccCcCCCeECCHHHHHHHHHHHHhhcCceEEeeceeccCCCCcEEEEEEEEcCCCCCcccccccccccccccccCccc
Confidence            4456899999999999999999999999999998755432       23567764                         


Q ss_pred             ---------CCCccEEEEEEcCCCCceEEeeeccceeeeCcccccccccchhhHHHHHHHHHhcCCCCCHHHHHHHHHHH
Q 042031           64 ---------EGCPWRVHVAKCPGVPTFSIRTLHGEHTCEGVQNLHHQQASVGWVARSVEARIRDNPQYKPKEILQDIRDQ  134 (565)
Q Consensus        64 ---------~~C~~~v~~~~~~~~~~~~V~~~~~~H~c~~~~~~~~~~~~~~~i~~~~~~~l~~~~~~~~~~i~~~~~~~  134 (565)
                               +||+++|.+.+ ..+|.|+|+.+..+|||+..+.......+...+. .+...+....++.           
T Consensus       150 ~~~rR~~tRtGC~A~m~Vk~-~~~gkW~V~~fv~eHNH~L~p~~~~~~~~r~~~~-~~~~~~~~~~~v~-----------  216 (846)
T PLN03097        150 GTGRRSCAKTDCKASMHVKR-RPDGKWVIHSFVKEHNHELLPAQAVSEQTRKMYA-AMARQFAEYKNVV-----------  216 (846)
T ss_pred             ccccccccCCCCceEEEEEE-cCCCeEEEEEEecCCCCCCCCccccchhhhhhHH-HHHhhhhcccccc-----------
Confidence                     37999999987 4568999999999999976543221110111111 0100010000000           


Q ss_pred             cCccc-CHHHHHHHHHHHHHHhhCChHHhhccHHHHHHHHhhcCCCcEEEEEEcCCcc---------cchhhhhc-----
Q 042031          135 HGVAV-SYMQAWRGKERSMAALHGTYEEGYRLLPAYCEQIRKTNPGSIASVFATGQKI---------VSIGSLFL-----  199 (565)
Q Consensus       135 ~g~~~-s~~~~~r~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~v~~~~~~~---------~~i~~f~~-----  199 (565)
                       +... .....-+.|++  +...    .+.+.|++||++++.+||+|+|+|++|++++         .|+..|.+     
T Consensus       217 -~~~~d~~~~~~~~r~~--~~~~----gD~~~ll~yf~~~q~~nP~Ffy~~qlDe~~~l~niFWaD~~sr~~Y~~FGDvV  289 (846)
T PLN03097        217 -GLKNDSKSSFDKGRNL--GLEA----GDTKILLDFFTQMQNMNSNFFYAVDLGEDQRLKNLFWVDAKSRHDYGNFSDVV  289 (846)
T ss_pred             -ccchhhcchhhHHHhh--hccc----chHHHHHHHHHHHHhhCCCceEEEEEccCCCeeeEEeccHHHHHHHHhcCCEE
Confidence             0000 00011111221  1112    3457899999999999999999999999886         44444432     


Q ss_pred             -ccccc------------cccCCCCceeEeEEEEeeccccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHHHhh
Q 042031          200 -IVHQY------------MGVDAEDALFPLAIAIVDVESDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEAVET  266 (565)
Q Consensus       200 -l~~~y------------~g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~  266 (565)
                       +|++|            +|+|+|+|+++|||||+.+|+.|+|.|||++|+++|+    +.+|.+||||++.+|.+||++
T Consensus       290 ~fDTTY~tN~y~~Pfa~FvGvNhH~qtvlfGcaLl~dEt~eSf~WLf~tfl~aM~----gk~P~tIiTDqd~am~~AI~~  365 (846)
T PLN03097        290 SFDTTYVRNKYKMPLALFVGVNQHYQFMLLGCALISDESAATYSWLMQTWLRAMG----GQAPKVIITDQDKAMKSVISE  365 (846)
T ss_pred             EEeceeeccccCcEEEEEEEecCCCCeEEEEEEEcccCchhhHHHHHHHHHHHhC----CCCCceEEecCCHHHHHHHHH
Confidence             55555            9999999999999999999999999999999999999    899999999999999999999


Q ss_pred             hCCCCcchhhHHHHHHHHHhhcCc-----hhhHHHHHHHHH-hhcHHHHHHHHHHHHhc-ccchhhHhhhC--CCCCccc
Q 042031          267 HFPSAFHCFCLRYVSENFRDTFKN-----TKLVNIFWNAVY-ALTTVEFEAKISEMVEI-SQDVIPWFQQF--PPQLWAI  337 (565)
Q Consensus       267 vfP~~~h~~C~~Hi~~n~~~~~~~-----~~~~~~~~~~~~-~~t~~eF~~~~~~l~~~-~~~~~~~l~~~--~~~~W~~  337 (565)
                      |||++.|++|.|||++|+.++++.     +.|...|..|++ +.+++||+..|.+|... +++.++||+.+  .|++|++
T Consensus       366 VfP~t~Hr~C~wHI~~~~~e~L~~~~~~~~~f~~~f~~cv~~s~t~eEFE~~W~~mi~ky~L~~n~WL~~LY~~RekWap  445 (846)
T PLN03097        366 VFPNAHHCFFLWHILGKVSENLGQVIKQHENFMAKFEKCIYRSWTEEEFGKRWWKILDRFELKEDEWMQSLYEDRKQWVP  445 (846)
T ss_pred             HCCCceehhhHHHHHHHHHHHhhHHhhhhhHHHHHHHHHHhCCCCHHHHHHHHHHHHHhhcccccHHHHHHHHhHhhhhH
Confidence            999999999999999999999853     589999999998 89999999999999876 89999999999  8999999


Q ss_pred             cccCCccc-cccccchhHHHHHHhhhC--CCCCHHHHHHHHHHHHHHHHHHHHHh-----------------hhhccCCC
Q 042031          338 AYFEGVRY-GHFTLGVTELLYNWALEC--HELPVVQMMEYIRHQLTSWFNDRREM-----------------GMRWTSIL  397 (565)
Q Consensus       338 a~~~~~~~-~~~ttn~~Es~n~~lk~~--~~~pi~~~~e~i~~~~~~~~~~r~~~-----------------~~~~~~~~  397 (565)
                      +|+++.++ |+.||+++||+|+.|++.  ...+|..|++.+...+..+..+..+.                 ..+....+
T Consensus       446 aY~k~~F~agm~sTqRSES~Ns~fk~yv~~~tsL~~Fv~qye~~l~~~~ekE~~aD~~s~~~~P~l~t~~piEkQAs~iY  525 (846)
T PLN03097        446 TYMRDAFLAGMSTVQRSESINAFFDKYVHKKTTVQEFVKQYETILQDRYEEEAKADSDTWNKQPALKSPSPLEKSVSGVY  525 (846)
T ss_pred             HHhcccccCCcccccccccHHHHHHHHhCcCCCHHHHHHHHHHHHHHHHHHHHHhhhhcccCCcccccccHHHHHHHHHh
Confidence            99966666 667999999999999974  55889999988877665544332211                 11235689


Q ss_pred             CchHHHHHHHHHHhccceEEEeeC----CeeEEEEe--cCceEEEEc----cCceeccCccccCCCCchhHHHHHHhcCC
Q 042031          398 VPSTERQIMEAIADARCYQVLRAN----EIEFEIVS--TERTNIVDI----RSRVCSCRRWQLYGLPCAHAAAALLSCGQ  467 (565)
Q Consensus       398 tp~~~~~~~~~~~~~~~~~v~~~~----~~~~~V~~--~~~~~~V~l----~~~~CsC~~~~~~GiPC~H~lav~~~~~~  467 (565)
                      ||.+|++||+++..+..|.+...+    ..+|.|..  ....|.|..    .+.+|+|++|+..||||+|||+||.+.++
T Consensus       526 T~~iF~kFQ~El~~~~~~~~~~~~~dg~~~~y~V~~~~~~~~~~V~~d~~~~~v~CsC~kFE~~GILCrHaLkVL~~~~v  605 (846)
T PLN03097        526 THAVFKKFQVEVLGAVACHPKMESQDETSITFRVQDFEKNQDFTVTWNQTKLEVSCICRLFEYKGYLCRHALVVLQMCQL  605 (846)
T ss_pred             HHHHHHHHHHHHHHhhheEEeeeccCCceEEEEEEEecCCCcEEEEEecCCCeEEeeccCeecCccchhhHHHHHhhcCc
Confidence            999999999999998888776542    24688876  345677743    36799999999999999999999999998


Q ss_pred             --CcccccccchhHHHHH
Q 042031          468 --NVHLFAEPCFTVASYR  483 (565)
Q Consensus       468 --~~~~~v~~~y~~~~~~  483 (565)
                        .|+.||.+|||+++-.
T Consensus       606 ~~IP~~YILkRWTKdAK~  623 (846)
T PLN03097        606 SAIPSQYILKRWTKDAKS  623 (846)
T ss_pred             ccCchhhhhhhchhhhhh
Confidence              4999999999988743


No 2  
>PF10551 MULE:  MULE transposase domain;  InterPro: IPR018289 This entry represents a domain found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif. This domain is also found in a transposase for the insertion sequence element IS256 in transposon Tn4001.
Probab=99.76  E-value=8.7e-19  Score=143.39  Aligned_cols=76  Identities=30%  Similarity=0.649  Sum_probs=73.2

Q ss_pred             cccCCCCceeEeEEEEeeccccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCCcchhhHHHHHHHH
Q 042031          205 MGVDAEDALFPLAIAIVDVESDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEAVETHFPSAFHCFCLRYVSENF  284 (565)
Q Consensus       205 ~g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~~h~~C~~Hi~~n~  284 (565)
                      +|+|++++.+|+||+++++|+.++|.|||+.+++.++    .. |.+||||++.|+.+||+++||++.|++|.||+.+|+
T Consensus        18 ~~~d~~~~~~~v~~~l~~~e~~~~~~~~l~~~~~~~~----~~-p~~ii~D~~~~~~~Ai~~vfP~~~~~~C~~H~~~n~   92 (93)
T PF10551_consen   18 VGIDGNGRGFPVAFALVSSESEESYEWFLEKLKEAMP----QK-PKVIISDFDKALINAIKEVFPDARHQLCLFHILRNI   92 (93)
T ss_pred             EEEcCCCCEEEEEEEEEcCCChhhhHHHHHHhhhccc----cC-ceeeeccccHHHHHHHHHHCCCceEehhHHHHHHhh
Confidence            7999999999999999999999999999999999998    45 999999999999999999999999999999999997


Q ss_pred             H
Q 042031          285 R  285 (565)
Q Consensus       285 ~  285 (565)
                      +
T Consensus        93 k   93 (93)
T PF10551_consen   93 K   93 (93)
T ss_pred             C
Confidence            4


No 3  
>PF00872 Transposase_mut:  Transposase, Mutator family;  InterPro: IPR001207 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. The mutator family of transposases consists of a number of elements that include mutator from maize, IsT2 from Thiobacillus ferrooxidans, Is256 from Staphylococcus aureus, Is1201 from Lactobacillus helveticus, Is1081 from Mycobacterium bovis, IsRm3 from Rhizobium meliloti and others. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=99.72  E-value=3.2e-18  Score=175.69  Aligned_cols=224  Identities=18%  Similarity=0.187  Sum_probs=167.4

Q ss_pred             CCCCHHHHHHHHHHHcC-cccCHHHHHHHHHHHHHHhhCChHHhhccHHHHHHHHhhcCCCcEEEEEEcCCcc-cchhhh
Q 042031          120 PQYKPKEILQDIRDQHG-VAVSYMQAWRGKERSMAALHGTYEEGYRLLPAYCEQIRKTNPGSIASVFATGQKI-VSIGSL  197 (565)
Q Consensus       120 ~~~~~~~i~~~~~~~~g-~~~s~~~~~r~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~v~~~~~~~-~~i~~f  197 (565)
                      .+++..+|.+.++..+| ..+|.+++.|..+...+.+           ..|..+-....|-.++  .+|+-.. .-.+|-
T Consensus       113 ~G~Str~i~~~l~~l~g~~~~S~s~vSri~~~~~~~~-----------~~w~~R~L~~~~y~~l--~iD~~~~kvr~~~~  179 (381)
T PF00872_consen  113 KGVSTRDIEEALEELYGEVAVSKSTVSRITKQLDEEV-----------EAWRNRPLESEPYPYL--WIDGTYFKVREDGR  179 (381)
T ss_pred             cccccccccchhhhhhcccccCchhhhhhhhhhhhhH-----------HHHhhhccccccccce--eeeeeecccccccc
Confidence            67889999999999999 7899999988776654432           2333333333322222  2221100 000000


Q ss_pred             hcccccc--cccCCCCceeEeEEEEeeccccchHHHHHHHHH-HHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCCcch
Q 042031          198 FLIVHQY--MGVDAEDALFPLAIAIVDVESDENWMWFMSELR-KLLGVNTDNMPRLTILSERQRGIVEAVETHFPSAFHC  274 (565)
Q Consensus       198 ~~l~~~y--~g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~-~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~~h~  274 (565)
                      ..-...|  +|+|.+|+..+||+.+.+.|+.++|.-||+.|+ +.+.      .|..|++|.++||.+||+++||++.++
T Consensus       180 ~~~~~~~v~iGi~~dG~r~vLg~~~~~~Es~~~W~~~l~~L~~RGl~------~~~lvv~Dg~~gl~~ai~~~fp~a~~Q  253 (381)
T PF00872_consen  180 VVKKAVYVAIGIDEDGRREVLGFWVGDRESAASWREFLQDLKERGLK------DILLVVSDGHKGLKEAIREVFPGAKWQ  253 (381)
T ss_pred             cccchhhhhhhhhcccccceeeeecccCCccCEeeecchhhhhcccc------ccceeeccccccccccccccccchhhh
Confidence            0001123  999999999999999999999999999999997 4544      588999999999999999999999999


Q ss_pred             hhHHHHHHHHHhhcCch---hhHHHHHHHHHhhcHHHHHHHHHHHHh----cccchhhHhhhCCCCCccccccCCccc-c
Q 042031          275 FCLRYVSENFRDTFKNT---KLVNIFWNAVYALTTVEFEAKISEMVE----ISQDVIPWFQQFPPQLWAIAYFEGVRY-G  346 (565)
Q Consensus       275 ~C~~Hi~~n~~~~~~~~---~~~~~~~~~~~~~t~~eF~~~~~~l~~----~~~~~~~~l~~~~~~~W~~a~~~~~~~-~  346 (565)
                      .|.+|+++|+.++++.+   .+...+..+..+.+.++....++++.+    .+|.+.++|++...+.|+..-++...+ -
T Consensus       254 rC~vH~~RNv~~~v~~k~~~~v~~~Lk~I~~a~~~e~a~~~l~~f~~~~~~kyp~~~~~l~~~~~~~~tf~~fP~~~~~~  333 (381)
T PF00872_consen  254 RCVVHLMRNVLRKVPKKDRKEVKADLKAIYQAPDKEEAREALEEFAEKWEKKYPKAAKSLEENWDELLTFLDFPPEHRRS  333 (381)
T ss_pred             hheechhhhhccccccccchhhhhhccccccccccchhhhhhhhcccccccccchhhhhhhhccccccceeeecchhccc
Confidence            99999999999998653   777788888778888888888888765    388899999887667676544444444 5


Q ss_pred             ccccchhHHHHHHhhh
Q 042031          347 HFTLGVTELLYNWALE  362 (565)
Q Consensus       347 ~~ttn~~Es~n~~lk~  362 (565)
                      +.|||..||+|+.|+.
T Consensus       334 i~TTN~iEsln~~irr  349 (381)
T PF00872_consen  334 IRTTNAIESLNKEIRR  349 (381)
T ss_pred             cchhhhccccccchhh
Confidence            5699999999999986


No 4  
>PF03108 DBD_Tnp_Mut:  MuDR family transposase;  InterPro: IPR004332 The plant MuDR transposase domain is present in plant proteins that are presumed to be the transposases for Mutator transposable elements. The function of these proteins is unknown. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=99.70  E-value=7.5e-17  Score=122.44  Aligned_cols=67  Identities=40%  Similarity=0.840  Sum_probs=65.1

Q ss_pred             CCCccccCeeCCHHHHHHHHHHHHHhccceEEEEeecCeEEEEEeecCCCccEEEEEEcCCCCceEE
Q 042031           17 EPTLSIGQEFPDVETCRRTLKDIAIALHFDLRIVKSDRSRFIAKCSKEGCPWRVHVAKCPGVPTFSI   83 (565)
Q Consensus        17 ~~~l~~G~~F~s~~e~~~a~~~ya~~~gf~~~~~kS~~~r~~~~C~~~~C~~~v~~~~~~~~~~~~V   83 (565)
                      ++.|.+||+|+|++|++.||..||++.+|++++.+|++.+++++|...+|||+|++++.++++.|+|
T Consensus         1 n~~l~~G~~F~~~~e~k~av~~yai~~~~~~~v~ksd~~r~~~~C~~~~C~Wrv~as~~~~~~~~~I   67 (67)
T PF03108_consen    1 NPELEVGQTFPSKEEFKEAVREYAIKNGFEFKVKKSDKKRYRAKCKDKGCPWRVRASKRKRSDTFQI   67 (67)
T ss_pred             CCccccCCEECCHHHHHHHHHHHHHhcCcEEEEeccCCEEEEEEEcCCCCCEEEEEEEcCCCCEEEC
Confidence            5789999999999999999999999999999999999999999999999999999999999999986


No 5  
>COG3328 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=99.25  E-value=8.5e-11  Score=118.39  Aligned_cols=219  Identities=19%  Similarity=0.186  Sum_probs=152.2

Q ss_pred             CCCCCHHHHHHHHHHHcCcccCHHHHHHHHHHHHHHhhCChHHhhccHHHHHHHHhhcCCCcEEEEEEcCCcc-cchhhh
Q 042031          119 NPQYKPKEILQDIRDQHGVAVSYMQAWRGKERSMAALHGTYEEGYRLLPAYCEQIRKTNPGSIASVFATGQKI-VSIGSL  197 (565)
Q Consensus       119 ~~~~~~~~i~~~~~~~~g~~~s~~~~~r~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~v~~~~~~~-~~i~~f  197 (565)
                      ..++++.++...++..++..+|...+.+.....++               .+.+++..-++-+-.+.+|.... ..  ++
T Consensus        98 ~~gv~Tr~i~~~~~~~~~~~~s~~~iS~~~~~~~e---------------~v~~~~~r~l~~~~~v~~D~~~~k~r--~v  160 (379)
T COG3328          98 AKGVTTREIEALLEELYGHKVSPSVISVVTDRLDE---------------KVKAWQNRPLGDYPYVYLDAKYVKVR--SV  160 (379)
T ss_pred             HcCCcHHHHHHHHHHhhCcccCHHHhhhHHHHHHH---------------HHHHHHhccccCceEEEEecceeehh--hh
Confidence            36789999999999998888888777665544433               33444444443333333332211 00  00


Q ss_pred             hcccccc--cccCCCCceeEeEEEEeeccccchHHHHHHHHH-HHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCCcch
Q 042031          198 FLIVHQY--MGVDAEDALFPLAIAIVDVESDENWMWFMSELR-KLLGVNTDNMPRLTILSERQRGIVEAVETHFPSAFHC  274 (565)
Q Consensus       198 ~~l~~~y--~g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~-~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~~h~  274 (565)
                       .-...|  +|++.+|+-.++|+.+-+.|+ ..|.-||..|+ +.+.      ....+++|..+|+.+||.++||.+.++
T Consensus       161 -~~~~~~ia~Gv~~eG~reilg~~~~~~e~-~~w~~~l~~l~~rgl~------~v~l~v~Dg~~gl~~aI~~v~p~a~~Q  232 (379)
T COG3328         161 -RNKAVYIAIGVTEEGRREILGIWVGVRES-KFWLSFLLDLKNRGLS------DVLLVVVDGLKGLPEAISAVFPQAAVQ  232 (379)
T ss_pred             -hhheeeeeeccCcccchhhhceeeecccc-hhHHHHHHHHHhcccc------ceeEEecchhhhhHHHHHHhccHhhhh
Confidence             011112  999999999999999999999 99997777776 4455      346677899999999999999999999


Q ss_pred             hhHHHHHHHHHhhcCch---hhHHHHHHHHHhhcHHHHHHHHHHHHh----cccchhhHhhhCCCCCc-cccccCCcccc
Q 042031          275 FCLRYVSENFRDTFKNT---KLVNIFWNAVYALTTVEFEAKISEMVE----ISQDVIPWFQQFPPQLW-AIAYFEGVRYG  346 (565)
Q Consensus       275 ~C~~Hi~~n~~~~~~~~---~~~~~~~~~~~~~t~~eF~~~~~~l~~----~~~~~~~~l~~~~~~~W-~~a~~~~~~~~  346 (565)
                      .|..|+.+|+..+...+   .+...+..+..+.+.++-...|..+..    ..|....++.+..-+.| ..+|....+--
T Consensus       233 ~C~vH~~Rnll~~v~~k~~d~i~~~~~~I~~a~~~e~~~~~~~~~~~~w~~~yP~i~~~~~~~~~~~~~F~~fp~~~r~~  312 (379)
T COG3328         233 RCIVHLVRNLLDKVPRKDQDAVLSDLRSIYIAPDAEEALLALLAFSELWGKRYPAILKSWRNALEELLPFFAFPSEIRKI  312 (379)
T ss_pred             hhhhHHHhhhhhhhhhhhhHHHHhhhhhhhccCCcHHHHHHHHHHHHhhhhhcchHHHHHHHHHHHhcccccCcHHHHhH
Confidence            99999999999988765   344445555556677766666666443    47787777777644444 33343333335


Q ss_pred             ccccchhHHHHHHhhh
Q 042031          347 HFTLGVTELLYNWALE  362 (565)
Q Consensus       347 ~~ttn~~Es~n~~lk~  362 (565)
                      +.|||..|++|+.++.
T Consensus       313 i~ttN~IE~~n~~ir~  328 (379)
T COG3328         313 IYTTNAIESLNKLIRR  328 (379)
T ss_pred             hhcchHHHHHHHHHHH
Confidence            6799999999997763


No 6  
>smart00575 ZnF_PMZ plant mutator transposase zinc finger.
Probab=98.99  E-value=2.1e-10  Score=69.73  Aligned_cols=28  Identities=50%  Similarity=1.068  Sum_probs=25.5

Q ss_pred             ceeccCccccCCCCchhHHHHHHhcCCC
Q 042031          441 RVCSCRRWQLYGLPCAHAAAALLSCGQN  468 (565)
Q Consensus       441 ~~CsC~~~~~~GiPC~H~lav~~~~~~~  468 (565)
                      .+|||++||..||||+|+|+|+...+++
T Consensus         1 ~~CsC~~~~~~gipC~H~i~v~~~~~~~   28 (28)
T smart00575        1 KTCSCRKFQLSGIPCRHALAAAIHIGLS   28 (28)
T ss_pred             CcccCCCcccCCccHHHHHHHHHHhCCC
Confidence            4799999999999999999999988763


No 7  
>PF08731 AFT:  Transcription factor AFT;  InterPro: IPR014842 AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This entry includes the paralogous pair of transcription factors AFT1 and AFT2. 
Probab=98.89  E-value=7e-09  Score=83.49  Aligned_cols=68  Identities=21%  Similarity=0.451  Sum_probs=64.5

Q ss_pred             eCCHHHHHHHHHHHHHhccceEEEEeecCeEEEEEeec------------------------------------------
Q 042031           26 FPDVETCRRTLKDIAIALHFDLRIVKSDRSRFIAKCSK------------------------------------------   63 (565)
Q Consensus        26 F~s~~e~~~a~~~ya~~~gf~~~~~kS~~~r~~~~C~~------------------------------------------   63 (565)
                      |.+++|++.+|+.++...||++.+.+|+...+.|.|..                                          
T Consensus         1 F~~k~~ikpwlq~~~~~~Gi~iVIerSd~~ki~FkCk~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~k~t~srk   80 (111)
T PF08731_consen    1 FDDKDEIKPWLQKIFYPQGIGIVIERSDKKKIVFKCKNGKRYRHKKKKKGQAQAQQKESTSGNKNKSSKKKKKKRTKSRK   80 (111)
T ss_pred             CCchHHHHHHHHHHhhhcCceEEEEecCCceEEEEEecCCCcccccccccccccccccccccccccccccccCCcccccc
Confidence            88999999999999999999999999999999999963                                          


Q ss_pred             CCCccEEEEEEcCCCCceEEeeeccceeee
Q 042031           64 EGCPWRVHVAKCPGVPTFSIRTLHGEHTCE   93 (565)
Q Consensus        64 ~~C~~~v~~~~~~~~~~~~V~~~~~~H~c~   93 (565)
                      ..|||+|+|+.....+.|.|+.+++.|||+
T Consensus        81 ~~CPFriRA~yS~k~k~W~lvvvnn~HnH~  110 (111)
T PF08731_consen   81 NTCPFRIRANYSKKNKKWTLVVVNNEHNHP  110 (111)
T ss_pred             cCCCeEEEEEEEecCCeEEEEEecCCcCCC
Confidence            379999999999999999999999999996


No 8  
>PF03101 FAR1:  FAR1 DNA-binding domain;  InterPro: IPR004330 Phytochrome A is the primary photoreceptor for mediating various far-red light-induced responses in higher plants. It has been found that the proteins governing this response, which include FAR-RED ELONGATED HYPOCOTYL3 (FHY3) and FAR-RED-IMPAIRED RESPONSE1 (FAR1), are a pair of homologous proteins sharing significant sequence homology to mutator-like transposases. These proteins appear to be novel transcription factors, which are essential for activating the expression of FHY1 and FHL (for FHY1-like) and related genes, whose products are required for light-induced phytochrome A nuclear accumulation and subsequent light responses in plants. The FRS (FAR1 Related Sequences) family of proteins share a similar domain structure to mutator-like transposases, including an N-terminal C2H2 zinc finger domain, a central putative core transposase domain, and a C-terminal SWIM motif (named after SWI2/SNF and MuDR transposases). It seems plausible that the FRS family represents transcription factors derived from mutator-like transposases.  This entry represents a domain found in FAR1 and FRS proteins. It contains a WRKY-like fold and is therefore most likely a zinc-binding DNA-binding domain.
Probab=98.73  E-value=1.8e-08  Score=81.65  Aligned_cols=61  Identities=23%  Similarity=0.314  Sum_probs=53.3

Q ss_pred             HHHHHHHHhccceEEEEeecCe-------EEEEEeec----------------------CCCccEEEEEEcCCCCceEEe
Q 042031           34 RTLKDIAIALHFDLRIVKSDRS-------RFIAKCSK----------------------EGCPWRVHVAKCPGVPTFSIR   84 (565)
Q Consensus        34 ~a~~~ya~~~gf~~~~~kS~~~-------r~~~~C~~----------------------~~C~~~v~~~~~~~~~~~~V~   84 (565)
                      +||+.||..+||.+++.+|.+.       ++.++|++                      +||||+|.+.+.. ++.|.|.
T Consensus         1 ~fy~~yA~~~GF~vr~~~s~~~~~~~~~~~~~~~C~r~G~~~~~~~~~~~~~r~~~s~ktgC~a~i~v~~~~-~~~w~v~   79 (91)
T PF03101_consen    1 DFYNSYARRHGFSVRKSSSRKSKKNGEIKRVTFVCSRGGKYKSKKKNEEKRRRNRPSKKTGCKARINVKRRK-DGKWRVT   79 (91)
T ss_pred             CHHHHhcCcCCeEEEEeeeEeCCCCceEEEEEEEECCcccccccccccccccccccccccCCCEEEEEEEcc-CCEEEEE
Confidence            4899999999999999876654       57889985                      5899999999877 8999999


Q ss_pred             eeccceeeeCc
Q 042031           85 TLHGEHTCEGV   95 (565)
Q Consensus        85 ~~~~~H~c~~~   95 (565)
                      .+..+|||+..
T Consensus        80 ~~~~~HNH~L~   90 (91)
T PF03101_consen   80 SFVLEHNHPLC   90 (91)
T ss_pred             ECcCCcCCCCC
Confidence            99999999753


No 9  
>PF04434 SWIM:  SWIM zinc finger;  InterPro: IPR007527 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.  This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1).
It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination. SWIM domains are also found in the homologous recombination protein Sws1, as well as in several hypothetical proteins. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding
Probab=98.32  E-value=4.9e-07  Score=60.61  Aligned_cols=30  Identities=43%  Similarity=0.802  Sum_probs=27.2

Q ss_pred             EEccCceeccCccccCCCCchhHHHHHHhc
Q 042031          436 VDIRSRVCSCRRWQLYGLPCAHAAAALLSC  465 (565)
Q Consensus       436 V~l~~~~CsC~~~~~~GiPC~H~lav~~~~  465 (565)
                      +++...+|+|..|+..|.||+|++|++...
T Consensus        10 ~~~~~~~CsC~~~~~~~~~CkHi~av~~~~   39 (40)
T PF04434_consen   10 VSIEQASCSCPYFQFRGGPCKHIVAVLLAL   39 (40)
T ss_pred             ccccccEeeCCCccccCCcchhHHHHHHhh
Confidence            567788999999999999999999998765


No 10 
>PF00098 zf-CCHC:  Zinc knuckle;  InterPro: IPR001878 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.  This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence C-X2-C-X4-H-X4-C, where X can be any amino acid and each number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses, where they are required for viral genome packaging and for the early infection process.
They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0003676 nucleic acid binding, 0008270 zinc ion binding; PDB: 2L44_A 1A1T_A 1WWG_A 1U6P_A 1WWD_A 1WWE_A 1A6B_B 1F6U_A 1MFS_A 1NCP_C ....
Probab=96.41  E-value=0.0018  Score=34.72  Aligned_cols=18  Identities=28%  Similarity=0.650  Sum_probs=16.3

Q ss_pred             EEcCccCCCCCCCCCCCC
Q 042031          546 VQCGRCHLLGHSQKKCTM  563 (565)
Q Consensus       546 ~~C~~C~~~gHn~~tC~~  563 (565)
                      ++|-+|++.||.++.||+
T Consensus         1 ~~C~~C~~~GH~~~~Cp~   18 (18)
T PF00098_consen    1 RKCFNCGEPGHIARDCPK   18 (18)
T ss_dssp             SBCTTTSCSSSCGCTSSS
T ss_pred             CcCcCCCCcCcccccCcc
Confidence            379999999999999995


No 11 
>PF15288 zf-CCHC_6:  Zinc knuckle
Probab=96.20  E-value=0.0021  Score=41.85  Aligned_cols=18  Identities=44%  Similarity=1.128  Sum_probs=15.9

Q ss_pred             EEcCccCCCCCCC--CCCCC
Q 042031          546 VQCGRCHLLGHSQ--KKCTM  563 (565)
Q Consensus       546 ~~C~~C~~~gHn~--~tC~~  563 (565)
                      ++|++||+.||.+  ++||+
T Consensus         2 ~kC~~CG~~GH~~t~k~CP~   21 (40)
T PF15288_consen    2 VKCKNCGAFGHMRTNKRCPM   21 (40)
T ss_pred             ccccccccccccccCccCCC
Confidence            6899999999998  77875


No 12 
>PF13696 zf-CCHC_2:  Zinc knuckle
Probab=94.53  E-value=0.019  Score=35.51  Aligned_cols=21  Identities=29%  Similarity=0.520  Sum_probs=18.9

Q ss_pred             ceEEcCccCCCCCCCCCCCCC
Q 042031          544 RIVQCGRCHLLGHSQKKCTMP  564 (565)
Q Consensus       544 ~~~~C~~C~~~gHn~~tC~~~  564 (565)
                      ..+.|.+|++.||..+.||+.
T Consensus         7 ~~Y~C~~C~~~GH~i~dCP~~   27 (32)
T PF13696_consen    7 PGYVCHRCGQKGHWIQDCPTN   27 (32)
T ss_pred             CCCEeecCCCCCccHhHCCCC
Confidence            458999999999999999974


No 13 
>PF03106 WRKY:  WRKY DNA-binding domain;  InterPro: IPR003657 The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene. Structural studies indicate that this domain is a four-stranded beta-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure. The WRKYGQK residues correspond to the most N-terminal beta-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the beta-sheet.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 2AYD_A 1WJ2_A 2LEX_A.
Probab=94.14  E-value=0.077  Score=38.71  Aligned_cols=39  Identities=31%  Similarity=0.606  Sum_probs=32.4

Q ss_pred             eEEEEEeecCCCccEEEEEEcCCCCceEEeeeccceeee
Q 042031           55 SRFIAKCSKEGCPWRVHVAKCPGVPTFSIRTLHGEHTCE   93 (565)
Q Consensus        55 ~r~~~~C~~~~C~~~v~~~~~~~~~~~~V~~~~~~H~c~   93 (565)
                      .|.-+.|+..+|+++-.+.+..+++...++++.++|||+
T Consensus        21 pRsYYrCt~~~C~akK~Vqr~~~d~~~~~vtY~G~H~h~   59 (60)
T PF03106_consen   21 PRSYYRCTHPGCPAKKQVQRSADDPNIVIVTYEGEHNHP   59 (60)
T ss_dssp             EEEEEEEECTTEEEEEEEEEETTCCCEEEEEEES--SS-
T ss_pred             eeEeeeccccChhheeeEEEecCCCCEEEEEEeeeeCCC
Confidence            456689999999999999988888899999999999996


No 14 
>PF06782 UPF0236:  Uncharacterised protein family (UPF0236);  InterPro: IPR009620 This is a group of proteins of unknown function.
Probab=94.05  E-value=1.8  Score=46.20  Aligned_cols=130  Identities=15%  Similarity=0.192  Sum_probs=91.2

Q ss_pred             ccccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCCcchhhHHHHHHHHHhhcCc-hhhHHHHHHHH
Q 042031          223 VESDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEAVETHFPSAFHCFCLRYVSENFRDTFKN-TKLVNIFWNAV  301 (565)
Q Consensus       223 ~E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~~h~~C~~Hi~~n~~~~~~~-~~~~~~~~~~~  301 (565)
                      ..+.+-|.-+...+.+....  ....-+++.+|+.+.|.+++. .||.+.|.+..+|+.+.+.+.++. +.+...++.+.
T Consensus       235 ~~~~~~~~~v~~~i~~~Y~~--~~~~~iiingDGa~WIk~~~~-~~~~~~~~LD~FHl~k~i~~~~~~~~~~~~~~~~al  311 (470)
T PF06782_consen  235 ESAEEFWEEVLDYIYNHYDL--DKTTKIIINGDGASWIKEGAE-FFPKAEYFLDRFHLNKKIKQALSHDPELKEKIRKAL  311 (470)
T ss_pred             cchHHHHHHHHHHHHHhcCc--ccceEEEEeCCCcHHHHHHHH-hhcCceEEecHHHHHHHHHHHhhhChHHHHHHHHHH
Confidence            55678899888888877763  123357788999999988776 999999999999999999988863 46777777788


Q ss_pred             HhhcHHHHHHHHHHHHhc--cc-------chhhHhhhCCCCCccc--cccCCccccccccchhHHHHHHhh
Q 042031          302 YALTTVEFEAKISEMVEI--SQ-------DVIPWFQQFPPQLWAI--AYFEGVRYGHFTLGVTELLYNWAL  361 (565)
Q Consensus       302 ~~~t~~eF~~~~~~l~~~--~~-------~~~~~l~~~~~~~W~~--a~~~~~~~~~~ttn~~Es~n~~lk  361 (565)
                      +.....+++..++.+...  .+       +...||..    +|-.  .|...  -|.......|+.+..+.
T Consensus       312 ~~~d~~~l~~~L~~~~~~~~~~~~~~~i~~~~~Yl~~----n~~~i~~y~~~--~~~~g~g~ee~~~~~~s  376 (470)
T PF06782_consen  312 KKGDKKKLETVLDTAESCAKDEEERKKIRKLRKYLLN----NWDGIKPYRER--EGLRGIGAEESVSHVLS  376 (470)
T ss_pred             HhcCHHHHHHHHHHHHHhhhchHHHHHHHHHHHHHHH----CHHHhhhhhhc--cCCCccchhhhhhhHHH
Confidence            877888888888887754  22       23444444    4422  22210  23334455777777664


No 15 
>PF04684 BAF1_ABF1:  BAF1 / ABF1 chromatin reorganising factor;  InterPro: IPR006774 ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 in Saccharomyces cerevisiae (P14164 from SWISSPROT)). The N-terminal two thirds of the protein are necessary for DNA binding, and the N terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilise the protein structure.; GO: 0003677 DNA binding, 0006338 chromatin remodeling, 0005634 nucleus
Probab=93.25  E-value=0.19  Score=51.32  Aligned_cols=67  Identities=22%  Similarity=0.533  Sum_probs=56.5

Q ss_pred             CCcCCCCCCCccccCeeCCHHHHHHHHHHHHHhccceEEEEeecCe-EEEEEeecCCCccEEEEEEcC
Q 042031           10 NDSLSLAEPTLSIGQEFPDVETCRRTLKDIAIALHFDLRIVKSDRS-RFIAKCSKEGCPWRVHVAKCP   76 (565)
Q Consensus        10 ~~~~~~~~~~l~~G~~F~s~~e~~~a~~~ya~~~gf~~~~~kS~~~-r~~~~C~~~~C~~~v~~~~~~   76 (565)
                      +..|...++.-..+..|++.++-+.+++.|.++....|..+.|-+. .++|.|.-..|||+|.++..+
T Consensus        12 n~~l~~~~~~~~~~~~f~tl~~wy~v~ndyefq~rcpiilknsh~nkhftfachlk~c~fkillsy~g   79 (496)
T PF04684_consen   12 NKSLASGDPQSAQARKFPTLEAWYNVINDYEFQSRCPIILKNSHRNKHFTFACHLKNCPFKILLSYCG   79 (496)
T ss_pred             hhhhccCCcccccccCCCcHHHHHHHHhhhhhhhcCceeecccccccceEEEeeccCCCceeeeeecc
Confidence            3445455556666888999999999999999999999999988764 599999999999999998654


No 16 
>COG3316 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=92.89  E-value=1.3  Score=41.16  Aligned_cols=116  Identities=12%  Similarity=0.149  Sum_probs=80.9

Q ss_pred             HHHHHcCcccCHHHHHHHHHHHHHHhhCChHHhhccHHHHHHHHhhcCCCcEEEEEEcCCcccchhhhhcccccc----c
Q 042031          130 DIRDQHGVAVSYMQAWRGKERSMAALHGTYEEGYRLLPAYCEQIRKTNPGSIASVFATGQKIVSIGSLFLIVHQY----M  205 (565)
Q Consensus       130 ~~~~~~g~~~s~~~~~r~~~~~~~~~~g~~~~~~~~l~~~~~~l~~~np~~~~~v~~~~~~~~~i~~f~~l~~~y----~  205 (565)
                      .+....|+.+++.++.|.=++.              -+.+...+...++.....+-+|+       .+..+.|++    .
T Consensus        33 e~l~~rgi~v~h~Ti~rwv~k~--------------~~~~~~~~~~r~~~~~~~w~vDE-------t~ikv~gkw~ylyr   91 (215)
T COG3316          33 EMLAERGIEVDHETIHRWVQKY--------------GPLLARRLKRRKRKAGDSWRVDE-------TYIKVNGKWHYLYR   91 (215)
T ss_pred             HHHHHcCcchhHHHHHHHHHHH--------------hHHHHHHhhhhccccccceeeee-------eEEeeccEeeehhh
Confidence            3445679999999988764432              13455566666665444444443       122256665    7


Q ss_pred             ccCCCCceeEeEEEEeeccccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCCcch
Q 042031          206 GVDAEDALFPLAIAIVDVESDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEAVETHFPSAFHC  274 (565)
Q Consensus       206 g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~~h~  274 (565)
                      ++|.+|  .++.+-|...-+...=.-||..+++.-+      .|.+|+||+.+....|+.++-+.+.|+
T Consensus        92 Aid~~g--~~Ld~~L~~rRn~~aAk~Fl~kllk~~g------~p~v~vtDka~s~~~A~~~l~~~~ehr  152 (215)
T COG3316          92 AIDADG--LTLDVWLSKRRNALAAKAFLKKLLKKHG------EPRVFVTDKAPSYTAALRKLGSEVEHR  152 (215)
T ss_pred             hhccCC--CeEEEEEEcccCcHHHHHHHHHHHHhcC------CCceEEecCccchHHHHHhcCcchhee
Confidence            888884  4567777777777777777877776656      689999999999999999999977665


No 17 
>PF01610 DDE_Tnp_ISL3:  Transposase;  InterPro: IPR002560 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. This family includes the IS204, IS1001, IS1096 and IS1165 transposases. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=91.31  E-value=0.16  Score=49.29  Aligned_cols=66  Identities=14%  Similarity=0.023  Sum_probs=54.5

Q ss_pred             EEEeeccccchHHHHHHHH-HHHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCCcchhhHHHHHHHHHhhc
Q 042031          218 IAIVDVESDENWMWFMSEL-RKLLGVNTDNMPRLTILSERQRGIVEAVETHFPSAFHCFCLRYVSENFRDTF  288 (565)
Q Consensus       218 ~a~~~~E~~e~~~w~l~~l-~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~~h~~C~~Hi~~n~~~~~  288 (565)
                      ++++++-+.++..=||..+ -...     .....+|++|...+..+|+++.||+|.+..-.|||++++.+.+
T Consensus        30 l~i~~~r~~~~l~~~~~~~~~~~~-----~~~v~~V~~Dm~~~y~~~~~~~~P~A~iv~DrFHvvk~~~~al   96 (249)
T PF01610_consen   30 LDILPGRDKETLKDFFRSLYPEEE-----RKNVKVVSMDMSPPYRSAIREYFPNAQIVADRFHVVKLANRAL   96 (249)
T ss_pred             EEEcCCccHHHHHHHHHHhCcccc-----ccceEEEEcCCCccccccccccccccccccccchhhhhhhhcc
Confidence            3578888888888787766 3332     3467899999999999999999999999999999999886644


No 18 
>PF13610 DDE_Tnp_IS240:  DDE domain
Probab=90.87  E-value=0.04  Score=48.29  Aligned_cols=60  Identities=15%  Similarity=0.128  Sum_probs=51.5

Q ss_pred             cccCCCCceeEeEEEEeeccccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHHHhhhCCCC
Q 042031          205 MGVDAEDALFPLAIAIVDVESDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEAVETHFPSA  271 (565)
Q Consensus       205 ~g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~~  271 (565)
                      -.+|.+++  +|++-|-..-+.+.=..||..+++..+     ..|..|+||..++...|+++++|..
T Consensus        22 ~aiD~~~~--~l~~~ls~~Rd~~aA~~Fl~~~l~~~~-----~~p~~ivtDk~~aY~~A~~~l~~~~   81 (140)
T PF13610_consen   22 RAIDAEGN--ILDFYLSKRRDTAAAKRFLKRALKRHR-----GEPRVIVTDKLPAYPAAIKELNPEG   81 (140)
T ss_pred             Eeeccccc--chhhhhhhhcccccceeeccccceeec-----cccceeecccCCccchhhhhccccc
Confidence            88999998  888888888888888888888776653     3789999999999999999998864


No 19 
>smart00774 WRKY DNA binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.
Probab=90.71  E-value=0.35  Score=35.03  Aligned_cols=38  Identities=32%  Similarity=0.573  Sum_probs=32.0

Q ss_pred             eEEEEEeec-CCCccEEEEEEcCCCCceEEeeeccceee
Q 042031           55 SRFIAKCSK-EGCPWRVHVAKCPGVPTFSIRTLHGEHTC   92 (565)
Q Consensus        55 ~r~~~~C~~-~~C~~~v~~~~~~~~~~~~V~~~~~~H~c   92 (565)
                      .|.-++|+. .+|+++=.+.+..+++...++++.++|||
T Consensus        21 pRsYYrCt~~~~C~a~K~Vq~~~~d~~~~~vtY~g~H~h   59 (59)
T smart00774       21 PRSYYRCTYSQGCPAKKQVQRSDDDPSVVEVTYEGEHTH   59 (59)
T ss_pred             cceEEeccccCCCCCcccEEEECCCCCEEEEEEeeEeCC
Confidence            345579998 89999988887777788899999999997


No 20 
>PF04500 FLYWCH:  FLYWCH zinc finger domain;  InterPro: IPR007588 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets. This entry represents a potential FLYWCH Zn-finger domain found in a number of eukaryotic proteins. FLYWCH is a C2H2-type zinc finger characterised by five conserved hydrophobic residues, containing the conserved sequence motif F/Y-X(n)-L-X(n)-F/Y-X(n)-WXCX(6-12)CX(17-22)HXH, where X indicates any amino acid. This domain was first characterised in Drosophila Modifier of mdg4 proteins, Mod(mdg4), putative chromatin modulators involved in higher order chromatin domains. Mod(mdg4) proteins share a common N-terminal BTB/POZ domain, but differ in their C-terminal region, most containing C-terminal FLYWCH zinc finger motifs. The FLYWCH domain in Mod(mdg4) proteins has a putative role in protein-protein interactions; for example, Mod(mdg4)-67.2 interacts with DNA-binding protein Su(Hw) via its FLYWCH domain. FLYWCH domains have been described in other proteins as well, including suppressor of killer of prune, Su(Kpn), which contains 4 terminal FLYWCH zinc finger motifs in a tandem array and a C-terminal glutathione S-transferase (GST) domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; PDB: 2RPR_A.
Probab=90.50  E-value=0.45  Score=34.68  Aligned_cols=46  Identities=20%  Similarity=0.322  Sum_probs=24.9

Q ss_pred             ccceEEEEeecCeEEEEEeecC---CCccEEEEEEcCCCCceEEeeeccceee
Q 042031           43 LHFDLRIVKSDRSRFIAKCSKE---GCPWRVHVAKCPGVPTFSIRTLHGEHTC   92 (565)
Q Consensus        43 ~gf~~~~~kS~~~r~~~~C~~~---~C~~~v~~~~~~~~~~~~V~~~~~~H~c   92 (565)
                      .|+.|...+.........|...   +|+++|...    .+.-.|.....+|||
T Consensus        14 ~Gy~y~~~~~~~~~~~WrC~~~~~~~C~a~~~~~----~~~~~~~~~~~~HnH   62 (62)
T PF04500_consen   14 DGYRYYFNKRNDGKTYWRCSRRRSHGCRARLITD----AGDGRVVRTNGEHNH   62 (62)
T ss_dssp             TTEEEEEEEE-SS-EEEEEGGGTTS----EEEEE------TTEEEE-S---SS
T ss_pred             CCeEEECcCCCCCcEEEEeCCCCCCCCeEEEEEE----CCCCEEEECCCccCC
Confidence            3677777776677788999864   899999997    233455566688987


No 21 
>smart00343 ZnF_C2HC zinc finger.
Probab=87.08  E-value=0.32  Score=28.74  Aligned_cols=17  Identities=29%  Similarity=0.755  Sum_probs=15.3

Q ss_pred             EcCccCCCCCCCCCCCC
Q 042031          547 QCGRCHLLGHSQKKCTM  563 (565)
Q Consensus       547 ~C~~C~~~gHn~~tC~~  563 (565)
                      .|.+|++.||..+.||.
T Consensus         1 ~C~~CG~~GH~~~~C~~   17 (26)
T smart00343        1 KCYNCGKEGHIARDCPK   17 (26)
T ss_pred             CCccCCCCCcchhhCCc
Confidence            48899999999999983


No 22 
>PF14392 zf-CCHC_4:  Zinc knuckle
Probab=86.99  E-value=0.24  Score=34.57  Aligned_cols=20  Identities=35%  Similarity=0.642  Sum_probs=17.6

Q ss_pred             ceEEcCccCCCCCCCCCCCC
Q 042031          544 RIVQCGRCHLLGHSQKKCTM  563 (565)
Q Consensus       544 ~~~~C~~C~~~gHn~~tC~~  563 (565)
                      -...|.+|+..||+.+.||.
T Consensus        30 lp~~C~~C~~~gH~~~~C~k   49 (49)
T PF14392_consen   30 LPRFCFHCGRIGHSDKECPK   49 (49)
T ss_pred             cChhhcCCCCcCcCHhHcCC
Confidence            34689999999999999984


No 23 
>PF13565 HTH_32:  Homeodomain-like domain
Probab=86.49  E-value=1.8  Score=33.17  Aligned_cols=41  Identities=24%  Similarity=0.462  Sum_probs=34.6

Q ss_pred             hHHHHHHHHHhcCCCCCHHHHHHHHHHHcCccc--CHHHHHHH
Q 042031          107 WVARSVEARIRDNPQYKPKEILQDIRDQHGVAV--SYMQAWRG  147 (565)
Q Consensus       107 ~i~~~~~~~l~~~~~~~~~~i~~~~~~~~g~~~--s~~~~~r~  147 (565)
                      .+...+.+.+..+|.+++.+|.+.|.+++|+.+  |.+++||.
T Consensus        34 e~~~~i~~~~~~~p~wt~~~i~~~L~~~~g~~~~~S~~tv~R~   76 (77)
T PF13565_consen   34 EQRERIIALIEEHPRWTPREIAEYLEEEFGISVRVSRSTVYRI   76 (77)
T ss_pred             HHHHHHHHHHHhCCCCCHHHHHHHHHHHhCCCCCccHhHHHHh
Confidence            344567777788999999999999999999876  99999874


No 24 
>PF03050 DDE_Tnp_IS66:  Transposase IS66 family;  InterPro: IPR004291 Transposase proteins are necessary for efficient DNA transposition. This family includes the bacterial insertion sequence (IS) element, IS66, from Agrobacterium tumefaciens. IS66 may cause genetic and structural variations of the T region and the vir region of the octopine Ti plasmids. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=83.13  E-value=1.1  Score=43.98  Aligned_cols=36  Identities=14%  Similarity=0.293  Sum_probs=28.8

Q ss_pred             EEEEeCCcccHHHHHhhhCCCCcchhhHHHHHHHHHhhcCc
Q 042031          250 LTILSERQRGIVEAVETHFPSAFHCFCLRYVSENFRDTFKN  290 (565)
Q Consensus       250 ~~iitD~~~~l~~Ai~~vfP~~~h~~C~~Hi~~n~~~~~~~  290 (565)
                      -+++||+-.+-..     +..+.|..|..|+.+.+.+-...
T Consensus       121 GilvsD~y~~Y~~-----~~~~~hq~C~AH~~R~~~~~~~~  156 (271)
T PF03050_consen  121 GILVSDGYSAYNK-----LAGITHQLCWAHLRRDFQDAAES  156 (271)
T ss_pred             eeeeccccccccc-----ccccccccccccccccccccccc
Confidence            4899999987654     33789999999999998776654


No 25 
>COG5179 TAF1 Transcription initiation factor TFIID, subunit TAF1 [Transcription]
Probab=81.98  E-value=0.86  Score=48.12  Aligned_cols=25  Identities=32%  Similarity=0.695  Sum_probs=19.4

Q ss_pred             CCCCCceEEcCccCCCCCCC--CCCCC
Q 042031          539 FKRPKRIVQCGRCHLLGHSQ--KKCTM  563 (565)
Q Consensus       539 ~~~~~~~~~C~~C~~~gHn~--~tC~~  563 (565)
                      +++...+++|++|||.||=+  +.||+
T Consensus       931 ~GRK~Ttr~C~nCGQvGHmkTNK~CP~  957 (968)
T COG5179         931 KGRKNTTRTCGNCGQVGHMKTNKACPK  957 (968)
T ss_pred             CCCCCcceecccccccccccccccCcc
Confidence            45555689999999999965  46775


No 26 
>COG5431 Uncharacterized metal-binding protein [Function unknown]
Probab=78.29  E-value=1.6  Score=34.95  Aligned_cols=34  Identities=32%  Similarity=0.545  Sum_probs=24.7

Q ss_pred             EecCceEEEEccCceeccCccc-----cCCCCchhHHHHHH
Q 042031          428 VSTERTNIVDIRSRVCSCRRWQ-----LYGLPCAHAAAALL  463 (565)
Q Consensus       428 ~~~~~~~~V~l~~~~CsC~~~~-----~~GiPC~H~lav~~  463 (565)
                      ...++.|+++.+  -|||..|-     .-.-||.|++.+-.
T Consensus        39 vG~~rdYIl~~g--fCSCp~~~~svvl~Gk~~C~Hi~glk~   77 (117)
T COG5431          39 VGKERDYILEGG--FCSCPDFLGSVVLKGKSPCAHIIGLKV   77 (117)
T ss_pred             EccccceEEEcC--cccCHHHHhHhhhcCcccchhhhheee
Confidence            346678999877  89998876     22357999997533


No 27 
>PF02178 AT_hook:  AT hook motif;  InterPro: IPR017956 AT hooks are DNA-binding motifs with a preference for A/T rich regions. These motifs are found in a variety of proteins, including the high mobility group (HMG) proteins, in DNA-binding proteins from plants and in hBRG1 protein, a central ATPase of the human switching/sucrose non-fermenting (SWI/SNF) remodeling complex. High mobility group (HMG) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG-I and HMG-Y (HMGA) are proteins of about 100 amino acid residues which are produced by the alternative splicing of a single gene. HMG-I/Y proteins bind preferentially to the minor groove of AT-rich regions in double-stranded DNA in a non-sequence specific manner. It is suggested that these proteins could function in nucleosome phasing and in the 3' end processing of mRNA transcripts. They are also involved in the transcription regulation of genes containing, or in close proximity to, AT-rich regions.; GO: 0003677 DNA binding; PDB: 2EZE_A 2EZD_A 2EZF_A 2EZG_A.
Probab=76.67  E-value=1.1  Score=21.78  Aligned_cols=9  Identities=56%  Similarity=1.066  Sum_probs=3.5

Q ss_pred             CCCCCCCcc
Q 042031          524 RPPGRPKKK  532 (565)
Q Consensus       524 ~~~GRpkk~  532 (565)
                      +++|||+|.
T Consensus         2 r~RGRP~k~   10 (13)
T PF02178_consen    2 RKRGRPRKN   10 (13)
T ss_dssp             --SS--TT-
T ss_pred             CcCCCCccc
Confidence            678999875


No 28 
>PHA02517 putative transposase OrfB; Reviewed
Probab=75.56  E-value=26  Score=34.39  Aligned_cols=42  Identities=12%  Similarity=0.317  Sum_probs=31.7

Q ss_pred             hHHHHHHHHHhc-CCCCCHHHHHHHHHHHcCcccCHHHHHHHHH
Q 042031          107 WVARSVEARIRD-NPQYKPKEILQDIRDQHGVAVSYMQAWRGKE  149 (565)
Q Consensus       107 ~i~~~~~~~l~~-~~~~~~~~i~~~~~~~~g~~~s~~~~~r~~~  149 (565)
                      .+.+.+.+.... .+.+..+.|...|.+. |+.+|.++++|..+
T Consensus        30 ~l~~~I~~i~~~~~~~~G~r~I~~~L~~~-g~~vs~~tV~Rim~   72 (277)
T PHA02517         30 WLKSEILRVYDENHQVYGVRKVWRQLNRE-GIRVARCTVGRLMK   72 (277)
T ss_pred             HHHHHHHHHHHHhCCCCCHHHHHHHHHhc-CcccCHHHHHHHHH
Confidence            455555666554 5788999999998755 99999999998644


No 29 
>PRK09335 30S ribosomal protein S26e; Provisional
Probab=73.83  E-value=2.7  Score=33.29  Aligned_cols=27  Identities=41%  Similarity=0.609  Sum_probs=20.3

Q ss_pred             CCCCCCCCCCCcccccccCCCCCCceEEcCccCC
Q 042031          520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHL  553 (565)
Q Consensus       520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~  553 (565)
                      |..++..||-|+.|-..       +..+|.+||.
T Consensus         2 ~kKRrn~GR~K~~rGhv-------~~V~C~nCgr   28 (95)
T PRK09335          2 PKKRENRGRRKGDKGHV-------GYVQCDNCGR   28 (95)
T ss_pred             CcccccCCCCCCCCCCC-------ccEEeCCCCC
Confidence            56778888887765432       5789999997


No 30 
>PLN00186 ribosomal protein S26; Provisional
Probab=69.40  E-value=3.8  Score=33.27  Aligned_cols=27  Identities=33%  Similarity=0.570  Sum_probs=20.2

Q ss_pred             CCCCCCCCCCCcccccccCCCCCCceEEcCccCC
Q 042031          520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHL  553 (565)
Q Consensus       520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~  553 (565)
                      |..++..||-|+.|-..       +..+|++|+.
T Consensus         2 ~kKRrN~GR~K~~rGhv-------~~V~C~nCgr   28 (109)
T PLN00186          2 TKKRRNGGRNKHGRGHV-------KRIRCSNCGK   28 (109)
T ss_pred             CcccccCCCCCCCCCCC-------cceeeCCCcc
Confidence            56778888887765433       5789999997


No 31 
>PTZ00172 40S ribosomal protein S26; Provisional
Probab=68.56  E-value=4.1  Score=33.10  Aligned_cols=27  Identities=33%  Similarity=0.556  Sum_probs=20.4

Q ss_pred             CCCCCCCCCCCcccccccCCCCCCceEEcCccCC
Q 042031          520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHL  553 (565)
Q Consensus       520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~  553 (565)
                      |..++..||-|+.|-..       +..+|.+|+.
T Consensus         2 ~kKRrN~GR~K~~rGhv-------~~V~C~nCgr   28 (108)
T PTZ00172          2 TSKRRNNGRSKHGRGHV-------KPVRCSNCGR   28 (108)
T ss_pred             CcccccCCCCCCCCCCC-------ccEEeCCccc
Confidence            56778888887765433       5789999997


No 32 
>smart00384 AT_hook DNA binding domain with preference for A/T rich regions. Small DNA-binding motif first described in the high mobility group non-histone chromosomal protein HMG-I(Y).
Probab=64.57  E-value=4.1  Score=23.85  Aligned_cols=12  Identities=42%  Similarity=0.733  Sum_probs=9.4

Q ss_pred             CCCCCCCCcccc
Q 042031          523 RRPPGRPKKKVL  534 (565)
Q Consensus       523 ~~~~GRpkk~R~  534 (565)
                      .+++|||+|...
T Consensus         1 kRkRGRPrK~~~   12 (26)
T smart00384        1 KRKRGRPRKAPK   12 (26)
T ss_pred             CCCCCCCCCCCC
Confidence            378999998754


No 33 
>PF13917 zf-CCHC_3:  Zinc knuckle
Probab=58.12  E-value=6.1  Score=26.42  Aligned_cols=18  Identities=33%  Similarity=0.779  Sum_probs=16.6

Q ss_pred             eEEcCccCCCCCCCCCCC
Q 042031          545 IVQCGRCHLLGHSQKKCT  562 (565)
Q Consensus       545 ~~~C~~C~~~gHn~~tC~  562 (565)
                      ...|.+|++.||-..-||
T Consensus         4 ~~~CqkC~~~GH~tyeC~   21 (42)
T PF13917_consen    4 RVRCQKCGQKGHWTYECP   21 (42)
T ss_pred             CCcCcccCCCCcchhhCC
Confidence            468999999999999999


No 34 
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=57.68  E-value=5.2  Score=36.33  Aligned_cols=16  Identities=31%  Similarity=0.812  Sum_probs=14.5

Q ss_pred             EEcCccCCCCCCCCCC
Q 042031          546 VQCGRCHLLGHSQKKC  561 (565)
Q Consensus       546 ~~C~~C~~~gHn~~tC  561 (565)
                      .+|.+||+.||-++-|
T Consensus        98 ~~C~~Cg~~GH~~~dC  113 (190)
T COG5082          98 KKCYNCGETGHLSRDC  113 (190)
T ss_pred             cccccccccCcccccc
Confidence            5899999999999999


No 35 
>PF13592 HTH_33:  Winged helix-turn helix
Probab=56.37  E-value=20  Score=25.99  Aligned_cols=30  Identities=27%  Similarity=0.280  Sum_probs=25.8

Q ss_pred             CCCCCHHHHHHHHHHHcCcccCHHHHHHHH
Q 042031          119 NPQYKPKEILQDIRDQHGVAVSYMQAWRGK  148 (565)
Q Consensus       119 ~~~~~~~~i~~~~~~~~g~~~s~~~~~r~~  148 (565)
                      +..++.++|...|.+.||+.+|.+.+|+.-
T Consensus         2 ~~~wt~~~i~~~I~~~fgv~ys~~~v~~lL   31 (60)
T PF13592_consen    2 GGRWTLKEIAAYIEEEFGVKYSPSGVYRLL   31 (60)
T ss_pred             CCcccHHHHHHHHHHHHCCEEcHHHHHHHH
Confidence            355788999999999999999999998753


No 36 
>COG4715 Uncharacterized conserved protein [Function unknown]
Probab=55.58  E-value=35  Score=36.40  Aligned_cols=51  Identities=22%  Similarity=0.348  Sum_probs=31.0

Q ss_pred             eEEEeeCCeeEEEEecCceE--EEEc----cCceeccCccccCCCCchhHHHHHHhcCC
Q 042031          415 YQVLRANEIEFEIVSTERTN--IVDI----RSRVCSCRRWQLYGLPCAHAAAALLSCGQ  467 (565)
Q Consensus       415 ~~v~~~~~~~~~V~~~~~~~--~V~l----~~~~CsC~~~~~~GiPC~H~lav~~~~~~  467 (565)
                      ..+..-++..--|+.+++.|  .|++    .+..|||.. ...| -|.|++||+....-
T Consensus        41 ~~i~~~g~~v~A~V~Gs~~y~v~vtL~~~~~ss~CTCP~-~~~g-aCKH~VAvvl~~~~   97 (587)
T COG4715          41 LKITIRGGTVRAVVEGSRRYRVRVTLEGGALSSICTCPY-GGSG-ACKHVVAVVLEYLD   97 (587)
T ss_pred             eEEeecCCeEEEEEeccceeeEEEEeecCCcCceeeCCC-CCCc-chHHHHHHHHHHhh
Confidence            34443333333334455554  4455    346999997 5555 59999999888644


No 37 
>PF04937 DUF659:  Protein of unknown function (DUF 659);  InterPro: IPR007021 These are transposase-like proteins with no known function.
Probab=52.69  E-value=48  Score=29.36  Aligned_cols=62  Identities=11%  Similarity=0.244  Sum_probs=42.7

Q ss_pred             ccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHHHh---hhCCCCcchhhHHHHHHHHHhhcCc
Q 042031          225 SDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEAVE---THFPSAFHCFCLRYVSENFRDTFKN  290 (565)
Q Consensus       225 ~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~Ai~---~vfP~~~h~~C~~Hi~~n~~~~~~~  290 (565)
                      +.+...-+|....+.+|    .....-||||....+.+|-+   +-+|.....-|..|-+.-+.+.+..
T Consensus        74 ~a~~l~~ll~~vIeeVG----~~nVvqVVTDn~~~~~~a~~~L~~k~p~ifw~~CaaH~inLmledi~k  138 (153)
T PF04937_consen   74 TAEYLFELLDEVIEEVG----EENVVQVVTDNASNMKKAGKLLMEKYPHIFWTPCAAHCINLMLEDIGK  138 (153)
T ss_pred             cHHHHHHHHHHHHHHhh----hhhhhHHhccCchhHHHHHHHHHhcCCCEEEechHHHHHHHHHHHHhc
Confidence            34444444444445555    44566789999999888744   4488888889999988877666543


No 38 
>PF04800 ETC_C1_NDUFA4:  ETC complex I subunit conserved region;  InterPro: IPR006885 This entry represents prokaryotic NADH-ubiquinone oxidoreductase subunits (EC 1.6.5.3, EC 1.6.99.3) from complex I of the electron transport chain initially identified in Neurospora crassa as a 21 kDa protein.; GO: 0016651 oxidoreductase activity, acting on NADH or NADPH, 0022900 electron transport chain, 0005743 mitochondrial inner membrane; PDB: 2JYA_A 2LJU_A.
Probab=51.45  E-value=27  Score=28.49  Aligned_cols=45  Identities=16%  Similarity=0.158  Sum_probs=27.9

Q ss_pred             cccCCcCCCCCCCcc--ccCeeCCHHHHHHHHHHHHHhccceEEEEeecCe
Q 042031            7 TVPNDSLSLAEPTLS--IGQEFPDVETCRRTLKDIAIALHFDLRIVKSDRS   55 (565)
Q Consensus         7 ~~~~~~~~~~~~~l~--~G~~F~s~~e~~~a~~~ya~~~gf~~~~~kS~~~   55 (565)
                      .+||-+-..+++.+.  +.+.|+|+|+|.    .||.++|..|.|..-...
T Consensus        34 ~~PLMGWtss~D~~~q~v~l~F~skE~Ai----~yaer~G~~Y~V~~p~~r   80 (101)
T PF04800_consen   34 ENPLMGWTSSGDPLSQSVRLKFDSKEDAI----AYAERNGWDYEVEEPKKR   80 (101)
T ss_dssp             --TTT-SSSS--SEEE-CEEEESSHHHHH----HHHHHCT-EEEEE-STT-
T ss_pred             CCCccCCCCCCChhhCeeEeeeCCHHHHH----HHHHHcCCeEEEeCCCCC
Confidence            345555444444553  899999999985    579999999999865443


No 39 
>COG4830 RPS26B Ribosomal protein S26 [Translation, ribosomal structure and biogenesis]
Probab=50.95  E-value=10  Score=30.06  Aligned_cols=27  Identities=44%  Similarity=0.724  Sum_probs=20.6

Q ss_pred             CCCCCCCCCCCcccccccCCCCCCceEEcCccCC
Q 042031          520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHL  553 (565)
Q Consensus       520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~  553 (565)
                      |..++..||.|+.|-..       .-.+|-+||.
T Consensus         2 pkkR~N~GR~K~~rGhv-------~~v~CdnCg~   28 (108)
T COG4830           2 PKKRRNRGRNKKGRGHV-------KYVRCDNCGK   28 (108)
T ss_pred             cchhhhcCCCCCCCCCc-------cceeeccccc
Confidence            67788889988866433       4579999997


No 40 
>PHA00689 hypothetical protein
Probab=50.58  E-value=10  Score=25.79  Aligned_cols=12  Identities=50%  Similarity=0.991  Sum_probs=10.2

Q ss_pred             ceEEcCccCCCC
Q 042031          544 RIVQCGRCHLLG  555 (565)
Q Consensus       544 ~~~~C~~C~~~g  555 (565)
                      +..+|++||..|
T Consensus        16 ravtckrcgktg   27 (62)
T PHA00689         16 RAVTCKRCGKTG   27 (62)
T ss_pred             ceeehhhccccC
Confidence            678999999876


No 41 
>PF01283 Ribosomal_S26e:  Ribosomal protein S26e;  InterPro: IPR000892 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [, ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.  Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [, ]. A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. One of these families, the S26E family, includes mammalian S26 []; Octopus S26 []; Drosophila S26 (DS31) []; plant cytoplasmic S26; and fungal S26 []. These proteins have 114 to 127 amino acids.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005622 intracellular, 0005840 ribosome; PDB: 3U5G_a 3U5C_a 2XZM_5 2XZN_5.
Probab=49.86  E-value=12  Score=31.02  Aligned_cols=27  Identities=37%  Similarity=0.545  Sum_probs=13.1

Q ss_pred             CCCCCCCCCCCcccccccCCCCCCceEEcCccCC
Q 042031          520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHL  553 (565)
Q Consensus       520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~  553 (565)
                      |..++.-||-|+.|-.       -..++|.+|+.
T Consensus         2 ~~KRrN~Gr~KkgrGh-------v~~V~C~nCgr   28 (113)
T PF01283_consen    2 TKKRRNNGRSKKGRGH-------VQPVRCDNCGR   28 (113)
T ss_dssp             ----TTTTSS-SSSS----------EEE-TTTB-
T ss_pred             CcccccCCCCCCCCCC-------CcCEeeCcccc
Confidence            4567777777765533       25789999986


No 42 
>PF08766 DEK_C:  DEK C terminal domain;  InterPro: IPR014876 DEK is a chromatin associated protein that is linked with cancers and autoimmune disease. This domain is found at the C-terminus of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients []. The structure of this domain shows it to be homologous to the E2F/DP transcription factor family []. This domain is also found in chitin synthase proteins like Q8TF96 from SWISSPROT, and in protein phosphatases such as Q6NN85 from SWISSPROT. ; PDB: 1Q1V_A.
Probab=46.48  E-value=52  Score=23.25  Aligned_cols=38  Identities=13%  Similarity=0.333  Sum_probs=24.5

Q ss_pred             hHHHHHHHHHhc-C-CCCCHHHHHHHHHHHcCcccCHHHH
Q 042031          107 WVARSVEARIRD-N-PQYKPKEILQDIRDQHGVAVSYMQA  144 (565)
Q Consensus       107 ~i~~~~~~~l~~-~-~~~~~~~i~~~~~~~~g~~~s~~~~  144 (565)
                      -+...+.+.|+. + .+++.++|.+.|.+.+|++++..+.
T Consensus         4 ~i~~~i~~iL~~~dl~~vT~k~vr~~Le~~~~~dL~~~K~   43 (54)
T PF08766_consen    4 EIREAIREILREADLDTVTKKQVREQLEERFGVDLSSRKK   43 (54)
T ss_dssp             HHHHHHHHHHTTS-GGG--HHHHHHHHHHH-SS--SHHHH
T ss_pred             HHHHHHHHHHHhCCHhHhhHHHHHHHHHHHHCCCcHHHHH
Confidence            455566777775 2 5678999999999999999996543


No 43 
>PF01498 HTH_Tnp_Tc3_2:  Transposase;  InterPro: IPR002492 Transposase proteins are necessary for efficient DNA transposition. This family includes the amino-terminal region of Tc1, Tc1A, Tc1B and Tc2B transposases of Caenorhabditis elegans. The region encompasses the specific DNA binding and second DNA recognition domains as well as an amino-terminal region of the catalytic domain of Tc3 as described in []. Tc3 is a member of the Tc1/mariner family of transposable elements. This entry also includes histone-lysine N-methyltransferase SETMAR, which is a SET domain and mariner transposase fusion gene-containing protein. This histone methyltransferase has sequence-specific DNA-binding activity and recognises the 19-mer core of the 5'-terminal inverted repeats (TIRs) of the Hsmar1 element. This protein has DNA nicking activity, and has in vivo end joining activity and may mediate genomic integration of foreign DNA [, , , ]. More information about these proteins can be found at Protein of the Month: Transposase [].; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated, 0015074 DNA integration; PDB: 3K9K_B 3F2K_B 3K9J_B 1U78_A.
Probab=42.97  E-value=20  Score=26.93  Aligned_cols=36  Identities=28%  Similarity=0.486  Sum_probs=16.0

Q ss_pred             HHHHHhcCCCCCHHHHHHHHHHHcCcccCHHHHHHHH
Q 042031          112 VEARIRDNPQYKPKEILQDIRDQHGVAVSYMQAWRGK  148 (565)
Q Consensus       112 ~~~~l~~~~~~~~~~i~~~~~~~~g~~~s~~~~~r~~  148 (565)
                      +...++.+|..+..+|...+.+. |..+|..+++|.-
T Consensus         4 I~~~v~~~p~~s~~~i~~~l~~~-~~~vS~~TI~r~L   39 (72)
T PF01498_consen    4 IVRMVRRNPRISAREIAQELQEA-GISVSKSTIRRRL   39 (72)
T ss_dssp             ------------HHHHHHHT----T--S-HHHHHHHH
T ss_pred             HHHHHHHCCCCCHHHHHHHHHHc-cCCcCHHHHHHHH
Confidence            34566778999999999999888 9999999998763


No 44 
>PF00665 rve:  Integrase core domain;  InterPro: IPR001584 Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus (ASV) have been studied most carefully with respect to the structural basis of catalysis. Although the active site of ASV integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis [].  Retroviral integrase is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group.  HIV integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS []. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration.
To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity [].; GO: 0015074 DNA integration; PDB: 3AO3_A 3OVN_A 3AO5_A 3AO4_A 3AO1_A 1C6V_D 3HPG_A 3HPH_A 3OYD_A 3OYF_B ....
Probab=40.20  E-value=36  Score=28.09  Aligned_cols=54  Identities=15%  Similarity=0.144  Sum_probs=37.9

Q ss_pred             cccCCCCceeEeEEEEeeccccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcccHHHH
Q 042031          205 MGVDAEDALFPLAIAIVDVESDENWMWFMSELRKLLGVNTDNMPRLTILSERQRGIVEA  263 (565)
Q Consensus       205 ~g~d~~~~~~~la~a~~~~E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~~l~~A  263 (565)
                      +.+|..- -+.+++.+-..++.+.+.-+|.......+    ...|.+|+||+..+..+.
T Consensus        29 ~~iD~~S-~~~~~~~~~~~~~~~~~~~~l~~~~~~~~----~~~p~~i~tD~g~~f~~~   82 (120)
T PF00665_consen   29 VFIDDYS-RFIYAFPVSSKETAEAALRALKRAIEKRG----GRPPRVIRTDNGSEFTSH   82 (120)
T ss_dssp             EEEETTT-TEEEEEEESSSSHHHHHHHHHHHHHHHHS-----SE-SEEEEESCHHHHSH
T ss_pred             EEEECCC-CcEEEEEeecccccccccccccccccccc----cccceecccccccccccc
Confidence            5555543 44557777777788888888887777766    333899999999987643


No 45 
>PF09713 A_thal_3526:  Plant protein 1589 of unknown function (A_thal_3526);  InterPro: IPR006476 This plant-specific family of proteins is defined by an uncharacterised region 57 residues in length. It is found toward the N terminus of most proteins that contain it. Examples include at least several proteins from Arabidopsis thaliana (Mouse-ear cress) and Oryza sativa (Rice). The function of these proteins is unknown.
Probab=36.84  E-value=32  Score=24.41  Aligned_cols=26  Identities=12%  Similarity=0.411  Sum_probs=18.8

Q ss_pred             CCCHHHHHHHHHHHcCcccCHH-HHHH
Q 042031          121 QYKPKEILQDIRDQHGVAVSYM-QAWR  146 (565)
Q Consensus       121 ~~~~~~i~~~~~~~~g~~~s~~-~~~r  146 (565)
                      .++..++++.|.++.|+.+... .+|+
T Consensus        12 yMsk~E~v~~L~~~a~I~P~~T~~VW~   38 (54)
T PF09713_consen   12 YMSKEECVRALQKQANIEPVFTSTVWQ   38 (54)
T ss_pred             cCCHHHHHHHHHHHcCCChHHHHHHHH
Confidence            3567889999988888876554 4553


No 46 
>COG4279 Uncharacterized conserved protein [Function unknown]
Probab=33.97  E-value=26  Score=33.29  Aligned_cols=24  Identities=42%  Similarity=0.751  Sum_probs=18.8

Q ss_pred             CceeccCccccCCCCchhHHHHHHhcC
Q 042031          440 SRVCSCRRWQLYGLPCAHAAAALLSCG  466 (565)
Q Consensus       440 ~~~CsC~~~~~~GiPC~H~lav~~~~~  466 (565)
                      ...|||..   .-.||.|+.||..+..
T Consensus       124 ~~dCSCPD---~anPCKHi~AvyY~la  147 (266)
T COG4279         124 STDCSCPD---YANPCKHIAAVYYLLA  147 (266)
T ss_pred             ccccCCCC---cccchHHHHHHHHHHH
Confidence            34699986   4579999999987743


No 47 
>PF05741 zf-nanos:  Nanos RNA binding domain;  InterPro: IPR024161 Nanos is a highly conserved RNA-binding protein in higher eukaryotes and functions as a key regulatory protein in translational control using a 3' untranslated region during the development and maintenance of germ cells. Nanos comprises a non-conserved amino-terminus and highly conserved carboxy- terminal regions. The C-terminal region has two conserved Cys-Cys-His-Cys (CCHC)-type zinc-finger motifs that are indispensable for nanos function [, , ]. The structure of the nanos-type zinc finger is composed of two independent zinc-finger (ZF) lobes, the N-terminal ZF1 and the C-terminal ZF2, which are connected by a linker helix []. These lobes create a large cleft. Zinc ions in ZF1 and ZF2 are bound to the CCHC motif by tetrahedral coordination.; PDB: 3ALR_B.
Probab=33.33  E-value=13  Score=26.45  Aligned_cols=20  Identities=30%  Similarity=0.514  Sum_probs=8.5

Q ss_pred             ceEEcCccCC---CCCCCCCCCC
Q 042031          544 RIVQCGRCHL---LGHSQKKCTM  563 (565)
Q Consensus       544 ~~~~C~~C~~---~gHn~~tC~~  563 (565)
                      +.+.|..||.   .+|+.+-||+
T Consensus        32 r~y~Cp~CgAtGd~AHT~~yCP~   54 (55)
T PF05741_consen   32 RKYVCPICGATGDNAHTIKYCPK   54 (55)
T ss_dssp             GG---TTT---GGG---GGG-TT
T ss_pred             hcCcCCCCcCcCccccccccCcC
Confidence            5689999998   5678888886


No 48 
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=32.48  E-value=22  Score=32.45  Aligned_cols=18  Identities=28%  Similarity=0.608  Sum_probs=16.4

Q ss_pred             eEEcCccCCCCCCCCCCC
Q 042031          545 IVQCGRCHLLGHSQKKCT  562 (565)
Q Consensus       545 ~~~C~~C~~~gHn~~tC~  562 (565)
                      ...|-+||+.||-++.||
T Consensus        60 ~~~C~nCg~~GH~~~DCP   77 (190)
T COG5082          60 NPVCFNCGQNGHLRRDCP   77 (190)
T ss_pred             ccccchhcccCcccccCC
Confidence            468999999999999999


No 49 
>PRK14892 putative transcription elongation factor Elf1; Provisional
Probab=32.42  E-value=35  Score=27.75  Aligned_cols=9  Identities=44%  Similarity=1.228  Sum_probs=6.1

Q ss_pred             ceEEcCccC
Q 042031          544 RIVQCGRCH  552 (565)
Q Consensus       544 ~~~~C~~C~  552 (565)
                      ....|.+||
T Consensus        20 t~f~CP~Cg   28 (99)
T PRK14892         20 KIFECPRCG   28 (99)
T ss_pred             cEeECCCCC
Confidence            456777777


No 50 
>PRK12286 rpmF 50S ribosomal protein L32; Reviewed
Probab=29.64  E-value=54  Score=23.64  Aligned_cols=32  Identities=22%  Similarity=0.435  Sum_probs=20.0

Q ss_pred             CCCCCCCCCCCcccccccCCCCCCceEEcCccCC
Q 042031          520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHL  553 (565)
Q Consensus       520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~  553 (565)
                      |+.+.++.|..++|-.  ..........|+.||.
T Consensus         4 PKrk~S~srr~~RRsh--~~l~~~~l~~C~~CG~   35 (57)
T PRK12286          4 PKRKTSKSRKRKRRAH--FKLKAPGLVECPNCGE   35 (57)
T ss_pred             CcCcCChhhcchhccc--ccccCCcceECCCCCC
Confidence            6666777776666543  2223334678999988


No 51 
>PF10045 DUF2280:  Uncharacterized conserved protein (DUF2280);  InterPro: IPR018738 This entry is represented by Burkholderia phage Bups phi1, Orf2.36. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches.
Probab=28.93  E-value=61  Score=26.33  Aligned_cols=24  Identities=25%  Similarity=0.491  Sum_probs=21.2

Q ss_pred             CHHHHHHHHHHHcCcccCHHHHHH
Q 042031          123 KPKEILQDIRDQHGVAVSYMQAWR  146 (565)
Q Consensus       123 ~~~~i~~~~~~~~g~~~s~~~~~r  146 (565)
                      +|.++.+.+++.||+.+|..++-.
T Consensus        21 TPs~v~~aVk~eFgi~vsrQqve~   44 (104)
T PF10045_consen   21 TPSEVAEAVKEEFGIDVSRQQVES   44 (104)
T ss_pred             CHHHHHHHHHHHhCCccCHHHHHH
Confidence            799999999999999999887643


No 52 
>PF13276 HTH_21:  HTH-like domain
Probab=28.26  E-value=1.5e+02  Score=21.11  Aligned_cols=42  Identities=19%  Similarity=0.374  Sum_probs=32.4

Q ss_pred             HHHHHHHHHhc-CCCCCHHHHHHHHHHHcCcccCHHHHHHHHH
Q 042031          108 VARSVEARIRD-NPQYKPKEILQDIRDQHGVAVSYMQAWRGKE  149 (565)
Q Consensus       108 i~~~~~~~l~~-~~~~~~~~i~~~~~~~~g~~~s~~~~~r~~~  149 (565)
                      +...+.+.... .+.+....|...|..+.|+.+|..+++|..+
T Consensus         6 l~~~I~~i~~~~~~~yG~rri~~~L~~~~~~~v~~krV~RlM~   48 (60)
T PF13276_consen    6 LRELIKEIFKESKPTYGYRRIWAELRREGGIRVSRKRVRRLMR   48 (60)
T ss_pred             HHHHHHHHHHHcCCCeehhHHHHHHhccCcccccHHHHHHHHH
Confidence            34445555555 4788999999999999889999999988754


No 53 
>PF13877 RPAP3_C:  Potential Monad-binding region of RPAP3
Probab=26.86  E-value=47  Score=26.48  Aligned_cols=33  Identities=15%  Similarity=0.343  Sum_probs=26.9

Q ss_pred             hcHHHHHHHHHHHHhcccchhhHhhhCCCCCcc
Q 042031          304 LTTVEFEAKISEMVEISQDVIPWFQQFPPQLWA  336 (565)
Q Consensus       304 ~t~~eF~~~~~~l~~~~~~~~~~l~~~~~~~W~  336 (565)
                      .|..||+..|..+.......++||..++++...
T Consensus         5 ~~~~eF~~~w~~~~~~~~~~~~yL~~i~p~~l~   37 (94)
T PF13877_consen    5 KNSYEFERDWRRLKKDPEERYEYLKSIPPDSLP   37 (94)
T ss_pred             CCHHHHHHHHHHHcCCHHHHHHHHHhCChHHHH
Confidence            467899999999987767899999998766543


No 54 
>COG5222 Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]
Probab=26.20  E-value=43  Score=32.50  Aligned_cols=25  Identities=32%  Similarity=0.530  Sum_probs=20.5

Q ss_pred             CCCCceEEcCccCCCCCCCCCCCCC
Q 042031          540 KRPKRIVQCGRCHLLGHSQKKCTMP  564 (565)
Q Consensus       540 ~~~~~~~~C~~C~~~gHn~~tC~~~  564 (565)
                      +.+-..+-|=+||+.||-...||..
T Consensus       171 kppPpgY~CyRCGqkgHwIqnCpTN  195 (427)
T COG5222         171 KPPPPGYVCYRCGQKGHWIQNCPTN  195 (427)
T ss_pred             CCCCCceeEEecCCCCchhhcCCCC
Confidence            3344569999999999999999863


No 55 
>PF05634 APO_RNA-bind:  APO RNA-binding;  InterPro: IPR008512 This family consists of plant APO (accumulation of photosystem 1) proteins.
Probab=25.76  E-value=45  Score=30.67  Aligned_cols=20  Identities=30%  Similarity=0.775  Sum_probs=17.0

Q ss_pred             ceEEcCccCC-----CCCCCCCCCC
Q 042031          544 RIVQCGRCHL-----LGHSQKKCTM  563 (565)
Q Consensus       544 ~~~~C~~C~~-----~gHn~~tC~~  563 (565)
                      .+..|+.|.+     .||..+||..
T Consensus        97 pV~~C~~C~EVHVG~~GH~irtC~g  121 (204)
T PF05634_consen   97 PVKACGYCPEVHVGPVGHKIRTCGG  121 (204)
T ss_pred             eeeecCCCCCeEECCCcccccccCC
Confidence            3678999977     8999999964


No 56 
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=25.75  E-value=43  Score=29.40  Aligned_cols=19  Identities=26%  Similarity=0.642  Sum_probs=15.6

Q ss_pred             EEcCccCCCCCCCCCCCCC
Q 042031          546 VQCGRCHLLGHSQKKCTMP  564 (565)
Q Consensus       546 ~~C~~C~~~gHn~~tC~~~  564 (565)
                      ..|.+|++.||..+.||.+
T Consensus        78 ~~C~~Cg~~GH~~~~C~~~   96 (148)
T PTZ00368         78 RSCYNCGQTGHISRECPNR   96 (148)
T ss_pred             cccCcCCCCCcccccCCCc
Confidence            5688888888888888764


No 57 
>TIGR01031 rpmF_bact ribosomal protein L32. This protein describes bacterial ribosomal protein L32. The noise cutoff is set low enough to include the equivalent protein from mitochondria and chloroplasts. No related proteins from the Archaea nor from the eukaryotic cytosol are detected by this model. This model is a fragment model; the putative L32 of some species shows similarity only toward the N-terminus.
Probab=25.26  E-value=79  Score=22.58  Aligned_cols=41  Identities=20%  Similarity=0.398  Sum_probs=22.1

Q ss_pred             CCCCCCCCCCCcccccccCCCCCCceEEcCccCCCCCCCCCC
Q 042031          520 PKTRRPPGRPKKKVLRVENFKRPKRIVQCGRCHLLGHSQKKC  561 (565)
Q Consensus       520 P~~~~~~GRpkk~R~~~~~~~~~~~~~~C~~C~~~gHn~~tC  561 (565)
                      |..+.++-|..++|-... .........|+.||+.--.=+-|
T Consensus         2 PKrk~Sksr~~~RRah~~-kl~~p~l~~C~~cG~~~~~H~vc   42 (55)
T TIGR01031         2 PKRKTSKSRKRKRRSHDA-KLTAPTLVVCPNCGEFKLPHRVC   42 (55)
T ss_pred             CCCcCCcccccchhcCcc-cccCCcceECCCCCCcccCeeEC
Confidence            555566666655554321 12233567899999843333334


No 58 
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=24.81  E-value=41  Score=29.48  Aligned_cols=20  Identities=25%  Similarity=0.576  Sum_probs=17.9

Q ss_pred             eEEcCccCCCCCCCCCCCCC
Q 042031          545 IVQCGRCHLLGHSQKKCTMP  564 (565)
Q Consensus       545 ~~~C~~C~~~gHn~~tC~~~  564 (565)
                      ...|.+|++.||-.+.||.+
T Consensus        52 ~~~C~~Cg~~GH~~~~Cp~~   71 (148)
T PTZ00368         52 ERSCYNCGKTGHLSRECPEA   71 (148)
T ss_pred             CcccCCCCCcCcCcccCCCc
Confidence            35799999999999999975


No 59 
>TIGR01589 A_thal_3526 uncharacterized plant-specific domain TIGR01589. This model represents an uncharacterized plant-specific domain 57 residues in length. It is found toward the N-terminus of most proteins that contain it. Examples include at least 10 proteins from Arabidopsis thaliana and at least one from Oryza sativa.
Probab=24.69  E-value=64  Score=23.21  Aligned_cols=26  Identities=12%  Similarity=0.287  Sum_probs=20.2

Q ss_pred             CCHHHHHHHHHHHcCcccCHH-HHHHH
Q 042031          122 YKPKEILQDIRDQHGVAVSYM-QAWRG  147 (565)
Q Consensus       122 ~~~~~i~~~~~~~~g~~~s~~-~~~r~  147 (565)
                      ++..++++.+.++.|+.+... .+|+.
T Consensus        16 Msk~E~v~~L~~~a~I~P~~T~~VW~~   42 (57)
T TIGR01589        16 MSKEETVSFLFENAGISPKFTRFVWYL   42 (57)
T ss_pred             CCHHHHHHHHHHHcCCCchhHHHHHHH
Confidence            567899999999999987664 56754


No 60 
>PF14201 DUF4318:  Domain of unknown function (DUF4318)
Probab=24.53  E-value=1.3e+02  Score=22.93  Aligned_cols=32  Identities=16%  Similarity=0.314  Sum_probs=27.0

Q ss_pred             cCeeCCHHHHHHHHHHHHHhccceEEEEeecC
Q 042031           23 GQEFPDVETCRRTLKDIAIALHFDLRIVKSDR   54 (565)
Q Consensus        23 G~~F~s~~e~~~a~~~ya~~~gf~~~~~kS~~   54 (565)
                      -..++|.+++-.++..|+.+++-.+...+-+.
T Consensus        11 ~~~yPs~e~i~~aIE~YC~~~~~~l~Fisr~~   42 (74)
T PF14201_consen   11 SPKYPSKEEICEAIEKYCIKNGESLEFISRDK   42 (74)
T ss_pred             CCCCCCHHHHHHHHHHHHHHcCCceEEEecCC
Confidence            34588999999999999999999998875443


No 61 
>PF13551 HTH_29:  Winged helix-turn helix
Probab=24.04  E-value=1.8e+02  Score=23.49  Aligned_cols=37  Identities=22%  Similarity=0.369  Sum_probs=28.9

Q ss_pred             HHHHHHhcCC-----CCCHHHHHHHH-HHHcCcccCHHHHHHH
Q 042031          111 SVEARIRDNP-----QYKPKEILQDI-RDQHGVAVSYMQAWRG  147 (565)
Q Consensus       111 ~~~~~l~~~~-----~~~~~~i~~~~-~~~~g~~~s~~~~~r~  147 (565)
                      .+.+.+..+|     .+++..|.+.+ ++.+|+.+|.+++++.
T Consensus        65 ~l~~~~~~~p~~g~~~~t~~~l~~~l~~~~~~~~~s~~ti~r~  107 (112)
T PF13551_consen   65 QLIELLRENPPEGRSRWTLEELAEWLIEEEFGIDVSPSTIRRI  107 (112)
T ss_pred             HHHHHHHHCCCCCCCcccHHHHHHHHHHhccCccCCHHHHHHH
Confidence            4555666655     47889999876 8999999999999875


No 62 
>PF14952 zf-tcix:  Putative treble-clef, zinc-finger, Zn-binding
Probab=23.67  E-value=50  Score=22.20  Aligned_cols=21  Identities=24%  Similarity=0.521  Sum_probs=15.9

Q ss_pred             ceEEcCccCC-CCCCCCCCCCC
Q 042031          544 RIVQCGRCHL-LGHSQKKCTMP  564 (565)
Q Consensus       544 ~~~~C~~C~~-~gHn~~tC~~~  564 (565)
                      ..++|..||- -|+-.-.|++.
T Consensus        10 GirkCp~CGt~NG~R~~~CKN~   31 (44)
T PF14952_consen   10 GIRKCPKCGTYNGTRGLSCKNK   31 (44)
T ss_pred             ccccCCcCcCccCcccccccCC
Confidence            4679999999 66666677764


No 63 
>KOG2044 consensus 5'-3' exonuclease HKE1/RAT1 [Replication, recombination and repair; RNA processing and modification]
Probab=23.51  E-value=32  Score=38.20  Aligned_cols=20  Identities=30%  Similarity=0.586  Sum_probs=17.1

Q ss_pred             ceEEcCccCCCCCCCCCCCC
Q 042031          544 RIVQCGRCHLLGHSQKKCTM  563 (565)
Q Consensus       544 ~~~~C~~C~~~gHn~~tC~~  563 (565)
                      +.++|-.||+.||+...|..
T Consensus       259 ~~~~C~~cgq~gh~~~dc~g  278 (931)
T KOG2044|consen  259 KPRRCFLCGQTGHEAKDCEG  278 (931)
T ss_pred             CcccchhhcccCCcHhhcCC
Confidence            44569999999999999964


No 64 
>COG4888 Uncharacterized Zn ribbon-containing protein [General function prediction only]
Probab=22.74  E-value=71  Score=25.80  Aligned_cols=8  Identities=50%  Similarity=1.294  Sum_probs=3.7

Q ss_pred             EEcCccCC
Q 042031          546 VQCGRCHL  553 (565)
Q Consensus       546 ~~C~~C~~  553 (565)
                      ..|+.||+
T Consensus        47 ~~Cg~CGl   54 (104)
T COG4888          47 AVCGNCGL   54 (104)
T ss_pred             EEcccCcc
Confidence            34444443


No 65 
>KOG0341 consensus DEAD-box protein abstrakt [RNA processing and modification]
Probab=21.97  E-value=42  Score=34.17  Aligned_cols=19  Identities=32%  Similarity=0.628  Sum_probs=17.2

Q ss_pred             eEEcCccCCCCCCCCCCCC
Q 042031          545 IVQCGRCHLLGHSQKKCTM  563 (565)
Q Consensus       545 ~~~C~~C~~~gHn~~tC~~  563 (565)
                      ..-|-.||+.||....||+
T Consensus       570 ~kGCayCgGLGHRItdCPK  588 (610)
T KOG0341|consen  570 EKGCAYCGGLGHRITDCPK  588 (610)
T ss_pred             ccccccccCCCcccccCch
Confidence            4579999999999999996


No 66 
>PF01644 Chitin_synth_1:  Chitin synthase;  InterPro: IPR004834 This region is commonly found in chitin synthases of classes I, II and III (EC 2.4.1.16). Chitin, a linear homopolymer of GlcNAc residues, is an important component of the cell wall of fungi and is synthesised on the cytoplasmic surface of the cell membrane by membrane-bound chitin synthases []. ; GO: 0004100 chitin synthase activity, 0006031 chitin biosynthetic process
Probab=20.88  E-value=2.1e+02  Score=25.65  Aligned_cols=47  Identities=13%  Similarity=0.325  Sum_probs=32.6

Q ss_pred             ccCCCCceeEeEEEEeec--cccchHHHHHHHHHHHhCCCCCCCCcEEEEeCCcc
Q 042031          206 GVDAEDALFPLAIAIVDV--ESDENWMWFMSELRKLLGVNTDNMPRLTILSERQR  258 (565)
Q Consensus       206 g~d~~~~~~~la~a~~~~--E~~e~~~w~l~~l~~~~~~~~~~~~~~~iitD~~~  258 (565)
                      +.+.+..+..+-|++=+.  ..-.|-+|||+.|-+.+.      |..|+.-|-..
T Consensus       101 ~~~~~~~PvQ~ifclKe~N~kKinSHrWfFnaf~~~l~------P~vcvllDvGT  149 (163)
T PF01644_consen  101 GPEKNIVPVQIIFCLKEKNAKKINSHRWFFNAFCRQLQ------PNVCVLLDVGT  149 (163)
T ss_pred             ccccCCCCEEEEEEeccccccccchhhHHHHHHHhhcC------CcEEEEEecCC
Confidence            334445556666666444  234899999999999987      66888888654


No 67 
>PF11645 PDDEXK_5:  PD-(D/E)XK endonuclease;  InterPro: IPR021671  This family are putative endonuclease proteins which are restricted to Synechocystis. ; PDB: 2OST_D.
Probab=20.71  E-value=2.8e+02  Score=24.12  Aligned_cols=45  Identities=18%  Similarity=0.321  Sum_probs=34.9

Q ss_pred             HHHHHHHHHHHHHhccceEEEEeecCeEEEEEeecCCCccEEEEE
Q 042031           29 VETCRRTLKDIAIALHFDLRIVKSDRSRFIAKCSKEGCPWRVHVA   73 (565)
Q Consensus        29 ~~e~~~a~~~ya~~~gf~~~~~kS~~~r~~~~C~~~~C~~~v~~~   73 (565)
                      .+++..++....++.|+.+-+.-.+..+|.++=..+||-|||.+.
T Consensus         6 GDite~~ii~~ll~~GY~V~~P~gDn~~YDLV~d~eg~L~RIQvK   50 (149)
T PF11645_consen    6 GDITEAKIINRLLEKGYSVSIPFGDNLKYDLVFDKEGILWRIQVK   50 (149)
T ss_dssp             HHHHHHHHHHHHHHTT-EEEEESSTTSS-SEEEEETTEEEEEEEE
T ss_pred             chHHHHHHHHHHHHcCcEEEeecCCCCCcCEEEecCCcEEEEEEe
Confidence            356677888889999999999988887766555578999999875


No 68 
>PF01316 Arg_repressor:  Arginine repressor, DNA binding domain;  InterPro: IPR020900 The arginine dihydrolase (AD) pathway is found in many prokaryotes and some primitive eukaryotes, an example of the latter being Giardia lamblia (Giardia intestinalis) []. The three-enzyme anaerobic pathway breaks down L-arginine to form 1 mol of ATP, carbon dioxide and ammonia. In simpler bacteria, the first enzyme, arginine deiminase, can account for up to 10% of total cell protein []. Most prokaryotic arginine deiminase pathways are under the control of a repressor gene, termed ArgR []. This is a negative regulator, and will only release the arginine deiminase operon for expression in the presence of arginine []. The crystal structure of apo-ArgR from Bacillus stearothermophilus has been determined to 2.5A by means of X-ray crystallography []. The protein exists as a hexamer of identical subunits, and is shown to have six DNA-binding domains, clustered around a central oligomeric core when bound to arginine. It predominantly interacts with A.T residues in ARG boxes. This hexameric protein binds DNA at its N terminus to repress arginine biosynthesis or activate arginine catabolism. Some species have several ArgR paralogs. In a neighbour-joining tree, some of these paralogous sequences show long branches and differ significantly from the well-conserved C-terminal region. ; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0006355 regulation of transcription, DNA-dependent, 0006525 arginine metabolic process; PDB: 1AOY_A 3V4G_A 3LAJ_D 3FHZ_A 3LAP_B 3ERE_D 2P5L_C 1F9N_D 2P5K_A 1B4A_A ....
Probab=20.52  E-value=1.8e+02  Score=21.93  Aligned_cols=36  Identities=14%  Similarity=0.288  Sum_probs=24.1

Q ss_pred             HHHHHHhcCCCCCHHHHHHHHHHHcCcccCHHHHHHH
Q 042031          111 SVEARIRDNPQYKPKEILQDIRDQHGVAVSYMQAWRG  147 (565)
Q Consensus       111 ~~~~~l~~~~~~~~~~i~~~~~~~~g~~~s~~~~~r~  147 (565)
                      .+...|..+.-.+-.+|++.|.+. |+.+|..++.|-
T Consensus         9 ~I~~li~~~~i~sQ~eL~~~L~~~-Gi~vTQaTiSRD   44 (70)
T PF01316_consen    9 LIKELISEHEISSQEELVELLEEE-GIEVTQATISRD   44 (70)
T ss_dssp             HHHHHHHHS---SHHHHHHHHHHT-T-T--HHHHHHH
T ss_pred             HHHHHHHHCCcCCHHHHHHHHHHc-CCCcchhHHHHH
Confidence            466777777777889999999775 999999998875


No 69 
>KOG4602 consensus Nanos and related proteins [General function prediction only]
Probab=20.22  E-value=44  Score=31.72  Aligned_cols=20  Identities=35%  Similarity=0.718  Sum_probs=16.1

Q ss_pred             ceEEcCccCCCC---CCCCCCCC
Q 042031          544 RIVQCGRCHLLG---HSQKKCTM  563 (565)
Q Consensus       544 ~~~~C~~C~~~g---Hn~~tC~~  563 (565)
                      |.+.|..||..|   |+++-||.
T Consensus       267 R~YVCPiCGATgDnAHTiKyCPl  289 (318)
T KOG4602|consen  267 RSYVCPICGATGDNAHTIKYCPL  289 (318)
T ss_pred             hhhcCccccccCCcccceecccc
Confidence            578999999865   77777885


Done!