Query         037162
Match_columns 689
No_of_seqs    421 out of 1354
Neff          7.7 
Searched_HMMs 46136
Date          Fri Mar 29 09:18:47 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/037162.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/037162hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PLN03097 FHY3 Protein FAR-RED  100.0 1.3E-45 2.8E-50  425.3  26.4  311    8-324    67-436 (846)
  2 PF00872 Transposase_mut:  Tran  99.8 5.8E-21 1.3E-25  208.0   1.5  232  125-387    94-352 (381)
  3 PF03101 FAR1:  FAR1 DNA-bindin  99.8 3.4E-19 7.4E-24  154.6   8.0   90   29-119     1-91  (91)
  4 PF10551 MULE:  MULE transposas  99.7 7.2E-17 1.6E-21  140.5   6.1   78  192-271    16-93  (93)
  5 PF08731 AFT:  Transcription fa  99.6 1.3E-14 2.8E-19  125.6  10.8   91   21-117     1-111 (111)
  6 PF02338 OTU:  OTU-like cystein  99.5 5.4E-15 1.2E-19  135.2   5.5  107  564-687     1-121 (121)
  7 COG3328 Transposase and inacti  99.4 4.6E-12 9.9E-17  135.9  15.7  227  128-387    83-331 (379)
  8 KOG2606 OTU (ovarian tumor)-li  99.1 8.7E-11 1.9E-15  118.1   6.5  124  548-683   149-283 (302)
  9 PF03108 DBD_Tnp_Mut:  MuDR fam  98.3   2E-06 4.3E-11   69.9   7.0   63   16-106     5-67  (67)
 10 KOG3288 OTU-like cysteine prot  96.7  0.0051 1.1E-07   61.2   7.3  115  560-687   112-227 (307)
 11 PF10275 Peptidase_C65:  Peptid  96.7  0.0025 5.4E-08   65.7   5.5   27  551-577    34-60  (244)
 12 KOG2605 OTU (ovarian tumor)-li  96.5  0.0015 3.3E-08   70.2   2.4  123  553-687   213-338 (371)
 13 KOG3991 Uncharacterized conser  95.5   0.051 1.1E-06   53.6   7.9   20  559-578    65-84  (256)
 14 PF05412 Peptidase_C33:  Equine  94.9   0.043 9.3E-07   47.6   4.7   32  625-657    34-65  (108)
 15 PF03106 WRKY:  WRKY DNA -bindi  94.6    0.15 3.2E-06   40.4   6.8   56   39-116     4-59  (60)
 16 PF06782 UPF0236:  Uncharacteri  94.4     1.2 2.6E-05   50.3  16.2  201  123-333   111-354 (470)
 17 PF04684 BAF1_ABF1:  BAF1 / ABF  90.2       2 4.4E-05   47.0  10.2   42   18-64     25-66  (496)
 18 COG5539 Predicted cysteine pro  90.1    0.14   3E-06   52.4   1.5  110  539-655   152-271 (306)
 19 PF01610 DDE_Tnp_ISL3:  Transpo  90.1     0.2 4.3E-06   51.5   2.6   67  208-276    31-98  (249)
 20 smart00774 WRKY DNA binding do  89.7    0.67 1.5E-05   36.4   4.6   28   88-115    32-59  (59)
 21 PF04500 FLYWCH:  FLYWCH zinc f  86.8     1.6 3.5E-05   34.0   5.3   25   87-115    38-62  (62)
 22 PF15299 ALS2CR8:  Amyotrophic   84.8     4.9 0.00011   40.8   9.0   99   78-178    69-222 (225)
 23 PF03050 DDE_Tnp_IS66:  Transpo  76.4       3 6.6E-05   43.4   4.3  132  131-276     5-156 (271)
 24 PF08069 Ribosomal_S13_N:  Ribo  71.2     5.8 0.00012   31.3   3.5   31  130-160    28-59  (60)
 25 PF13610 DDE_Tnp_IS240:  DDE do  67.1    0.97 2.1E-05   42.2  -1.8   59  195-257    23-81  (140)
 26 COG5539 Predicted cysteine pro  63.9      20 0.00043   37.3   6.6  107  565-687   119-226 (306)
 27 PF13936 HTH_38:  Helix-turn-he  61.9     5.8 0.00013   29.1   1.9   30  128-157     3-32  (44)
 28 PRK08561 rps15p 30S ribosomal   57.6      39 0.00084   31.8   6.9   32  129-160    27-59  (151)
 29 KOG4345 NF-kappa B regulator A  44.3      10 0.00023   43.7   1.1   49  638-688   225-287 (774)
 30 PTZ00072 40S ribosomal protein  40.4      61  0.0013   30.3   5.2   31  130-160    25-56  (148)
 31 KOG0400 40S ribosomal protein   30.7      38 0.00083   30.9   2.3   29  132-160    31-59  (151)
 32 PF03462 PCRF:  PCRF domain;  I  29.5   1E+02  0.0022   27.7   4.9   42   25-66     66-107 (115)
 33 PRK09784 hypothetical protein;  29.3      51  0.0011   33.2   3.1   36  554-590   196-231 (417)
 34 PF02796 HTH_7:  Helix-turn-hel  29.0      49  0.0011   24.2   2.3   39  128-173     4-42  (45)
 35 PF04800 ETC_C1_NDUFA4:  ETC co  28.4      45 0.00096   29.4   2.3   27   17-47     51-77  (101)
 36 PF11427 HTH_Tnp_Tc3_1:  Tc3 tr  28.0      74  0.0016   24.2   3.1   30  129-158     4-33  (50)
 37 PF00665 rve:  Integrase core d  27.1      71  0.0015   27.9   3.5   50  199-249    33-82  (120)
 38 PF03461 TRCF:  TRCF domain;  I  27.0 1.1E+02  0.0024   26.8   4.6   40  286-325    18-57  (101)
 39 TIGR03147 cyt_nit_nrfF cytochr  26.7      81  0.0018   28.9   3.7   34  129-162    57-90  (126)
 40 KOG4825 Component of synaptic   26.4 1.4E+02  0.0031   33.1   6.0   29  479-507   284-312 (666)
 41 PLN03097 FHY3 Protein FAR-RED   26.1 4.1E+02  0.0089   32.6  10.5   28  398-425   595-623 (846)
 42 COG3316 Transposase and inacti  22.8 1.9E+02  0.0041   29.1   5.8  108  143-259    23-151 (215)
 43 PF07506 RepB:  RepB plasmid pa  22.7 2.4E+02  0.0052   27.5   6.6   62  559-621    53-114 (185)
 44 PRK10144 formate-dependent nit  20.8 1.3E+02  0.0027   27.7   3.7   34  129-162    57-90  (126)
 45 TIGR03277 methan_mark_9 putati  20.4      82  0.0018   27.8   2.3   30  569-598    78-108 (109)

No 1  
>PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional
Probab=100.00  E-value=1.3e-45  Score=425.32  Aligned_cols=311  Identities=16%  Similarity=0.257  Sum_probs=225.3

Q ss_pred             ccccCcceecccccCChhhHHHHHHHHHhhcCeEEEEeecccCC-CCCccEEEEEEecCCcCCCCCCC--C---------
Q 037162            8 TEESGKEKVVNVALMEREDMPREELQTELRNGLVIVIEKSDVAA-NGRKPRIIFTCERSGVYRDRSPQ--G---------   75 (689)
Q Consensus         8 ~~~~~~~~~~~~~F~S~eea~~~~~~yA~~~GF~v~i~rS~~~~-~g~~~~~~~~C~r~G~~r~~~~~--~---------   75 (689)
                      .+..+.+|.+||+|+|.|||++||+.||+..||+|||.+|++.+ +|.+..+.|+|+|+|..+...+.  .         
T Consensus        67 ~~~~~~~P~vGMeF~S~eeA~~FYn~YA~~~GFsVRi~~srrsk~~~~ii~r~fvCsreG~~~~~~~~~~~~~~~~~k~~  146 (846)
T PLN03097         67 KEDTNLEPLSGMEFESHGEAYSFYQEYARSMGFNTAIQNSRRSKTSREFIDAKFACSRYGTKREYDKSFNRPRARQTKQD  146 (846)
T ss_pred             cCCCCccCcCCCeECCHHHHHHHHHHHHhhcCceEEeeceeccCCCCcEEEEEEEEcCCCCCcccccccccccccccccC
Confidence            45566789999999999999999999999999999998887764 56788999999999964321110  0         


Q ss_pred             -CCCCCcCCceeeCCceEEEEEEeecCCCeEEEEEeCcccCCCCcccccccccccCCHHHHHHHH---------------
Q 037162           76 -PKPIKATGIQKCKCPFKLKGQKMANNDDWALIVICGFHNHPATQYLEGHSFAGRLSKEESNLLV---------------  139 (689)
Q Consensus        76 -~~~rr~~~s~ktgCpa~i~~~~~~~~~~W~V~~~~~~HNH~l~~~~~~h~~~RrLs~e~k~~I~---------------  139 (689)
                       ...+++++.+||||+|+|+++.. .+|+|.|+.+..+|||++.++.......|++-......+.               
T Consensus       147 ~~~~~~rR~~tRtGC~A~m~Vk~~-~~gkW~V~~fv~eHNH~L~p~~~~~~~~r~~~~~~~~~~~~~~~v~~~~~d~~~~  225 (846)
T PLN03097        147 PENGTGRRSCAKTDCKASMHVKRR-PDGKWVIHSFVKEHNHELLPAQAVSEQTRKMYAAMARQFAEYKNVVGLKNDSKSS  225 (846)
T ss_pred             cccccccccccCCCCceEEEEEEc-CCCeEEEEEEecCCCCCCCCccccchhhhhhHHHHHhhhhccccccccchhhcch
Confidence             00112355789999999999885 4579999999999999999764322111222111111000               


Q ss_pred             --HhhhCCC---ChHHHHHH---HHhcCCCccc--------hhhHHHHHHHHhHHhhh-cc------------hhHHHHH
Q 037162          140 --DMSKNNV---KPKDILHV---LKKRDMHNAT--------TIRAIYNARRKCKVREQ-AG------------RSQMQLL  190 (689)
Q Consensus       140 --~L~~sgv---~pr~Il~~---L~~~~g~~~~--------t~kDIyN~~~k~r~~~l-~g------------~t~~~~L  190 (689)
                        ......+   -...++.+   ++..+|+.++        .++.|+++..+.+.... -|            .+. .+|
T Consensus       226 ~~~~r~~~~~~gD~~~ll~yf~~~q~~nP~Ffy~~qlDe~~~l~niFWaD~~sr~~Y~~FGDvV~fDTTY~tN~y~-~Pf  304 (846)
T PLN03097        226 FDKGRNLGLEAGDTKILLDFFTQMQNMNSNFFYAVDLGEDQRLKNLFWVDAKSRHDYGNFSDVVSFDTTYVRNKYK-MPL  304 (846)
T ss_pred             hhHHHhhhcccchHHHHHHHHHHHHhhCCCceEEEEEccCCCeeeEEeccHHHHHHHHhcCCEEEEeceeeccccC-cEE
Confidence              0000001   11222222   2233443322        23344555444444321 11            111 268


Q ss_pred             hhhcccCCCCceeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHHHHhhCcccccccccchHHHHH
Q 037162          191 MKIVGVTSTDLTFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSATTLLCRWHISRNV  270 (689)
Q Consensus       191 l~~vGvd~~~~~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~~~lC~wHi~kNv  270 (689)
                      +.|||||+|++++++||||+.+|+.|+|.|+|++|+.+|+ ++.|++||||+|.||.+||++|||++.|++|.|||++|+
T Consensus       305 a~FvGvNhH~qtvlfGcaLl~dEt~eSf~WLf~tfl~aM~-gk~P~tIiTDqd~am~~AI~~VfP~t~Hr~C~wHI~~~~  383 (846)
T PLN03097        305 ALFVGVNQHYQFMLLGCALISDESAATYSWLMQTWLRAMG-GQAPKVIITDQDKAMKSVISEVFPNAHHCFFLWHILGKV  383 (846)
T ss_pred             EEEEEecCCCCeEEEEEEEcccCchhhHHHHHHHHHHHhC-CCCCceEEecCCHHHHHHHHHHCCCceehhhHHHHHHHH
Confidence            8999999999999999999999999999999999999995 799999999999999999999999999999999999999


Q ss_pred             HHHhhhhccchhhHHHHHHHHHH-HHhcCCHHHHHHHHHHHHHhhc-CChhHHhhh
Q 037162          271 LANCKKLFETNEIWETFISSWNL-LVLAASEEEFAQRLKSMESDFS-KYPTALTYI  324 (689)
Q Consensus       271 ~~~~~~~~~~~~~~~~~~~~w~~-l~~a~t~~ef~~~~~~l~~~~~-~~p~~~~Yl  324 (689)
                      .++++..+...   +.|...|.. |..+.|+++|+..|..|.++|. .-.+++..|
T Consensus       384 ~e~L~~~~~~~---~~f~~~f~~cv~~s~t~eEFE~~W~~mi~ky~L~~n~WL~~L  436 (846)
T PLN03097        384 SENLGQVIKQH---ENFMAKFEKCIYRSWTEEEFGKRWWKILDRFELKEDEWMQSL  436 (846)
T ss_pred             HHHhhHHhhhh---hHHHHHHHHHHhCCCCHHHHHHHHHHHHHhhcccccHHHHHH
Confidence            99999877643   467777765 4568999999999999999996 334444444


No 2  
>PF00872 Transposase_mut:  Transposase, Mutator family;  InterPro: IPR001207 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. The mutator family of transposases consists of a number of elements that include mutator from maize, IsT2 from Thiobacillus ferrooxidans, Is256 from Staphylococcus aureus, Is1201 from Lactobacillus helveticus, Is1081 from Mycobacterium bovis, IsRm3 from Rhizobium meliloti and others.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=99.80  E-value=5.8e-21  Score=208.03  Aligned_cols=232  Identities=21%  Similarity=0.277  Sum_probs=190.5

Q ss_pred             cccccCCHHHHHHHHHhhhCCCChHHHHHHHHhcCCCc-c--chh----hHHHHHHHHhHHhhhcch-hHH---------
Q 037162          125 SFAGRLSKEESNLLVDMSKNNVKPKDILHVLKKRDMHN-A--TTI----RAIYNARRKCKVREQAGR-SQM---------  187 (689)
Q Consensus       125 ~~~RrLs~e~k~~I~~L~~sgv~pr~Il~~L~~~~g~~-~--~t~----kDIyN~~~k~r~~~l~g~-t~~---------  187 (689)
                      +.+++.+++....|..|+..|+++++|...|+..+|.. +  .++    +.+......|+.+.+++. +++         
T Consensus        94 ~~y~r~~~~l~~~i~~ly~~G~Str~i~~~l~~l~g~~~~S~s~vSri~~~~~~~~~~w~~R~L~~~~y~~l~iD~~~~k  173 (381)
T PF00872_consen   94 PKYQRREDSLEELIISLYLKGVSTRDIEEALEELYGEVAVSKSTVSRITKQLDEEVEAWRNRPLESEPYPYLWIDGTYFK  173 (381)
T ss_pred             chhhhhhhhhhhhhhhhhccccccccccchhhhhhcccccCchhhhhhhhhhhhhHHHHhhhccccccccceeeeeeecc
Confidence            34455577778888999999999999999999999832 2  222    334445677777777665 442         


Q ss_pred             ---------HHHhhhcccCCCCceeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHHHHhhCcccc
Q 037162          188 ---------QLLMKIVGVTSTDLTFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSAT  258 (689)
Q Consensus       188 ---------~~Ll~~vGvd~~~~~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~  258 (689)
                               .+++.++|||.+|+..++|+.....|+.++|.-+|+.|++.  |...|.+||+|..+||..||+++||++.
T Consensus       174 vr~~~~~~~~~~~v~iGi~~dG~r~vLg~~~~~~Es~~~W~~~l~~L~~R--Gl~~~~lvv~Dg~~gl~~ai~~~fp~a~  251 (381)
T PF00872_consen  174 VREDGRVVKKAVYVAIGIDEDGRREVLGFWVGDRESAASWREFLQDLKER--GLKDILLVVSDGHKGLKEAIREVFPGAK  251 (381)
T ss_pred             cccccccccchhhhhhhhhcccccceeeeecccCCccCEeeecchhhhhc--cccccceeeccccccccccccccccchh
Confidence                     25789999999999999999999999999999999999886  7888999999999999999999999999


Q ss_pred             cccccchHHHHHHHHhhhhccchhhHHHHHHHHHHHHhcCCHHHHHHHHHHHHHhhc-CChhHHhhhhhcchhhhhhHHH
Q 037162          259 TLLCRWHISRNVLANCKKLFETNEIWETFISSWNLLVLAASEEEFAQRLKSMESDFS-KYPTALTYIRNSSWTKVHTLLE  337 (689)
Q Consensus       259 ~~lC~wHi~kNv~~~~~~~~~~~~~~~~~~~~w~~l~~a~t~~ef~~~~~~l~~~~~-~~p~~~~Yl~~~~W~~i~~~~e  337 (689)
                      +|+|++|+.+|+.+++++     .+.+++..+++.|+.+.+.+++.+.++.|.++|. +||++++++++ .|+.+     
T Consensus       252 ~QrC~vH~~RNv~~~v~~-----k~~~~v~~~Lk~I~~a~~~e~a~~~l~~f~~~~~~kyp~~~~~l~~-~~~~~-----  320 (381)
T PF00872_consen  252 WQRCVVHLMRNVLRKVPK-----KDRKEVKADLKAIYQAPDKEEAREALEEFAEKWEKKYPKAAKSLEE-NWDEL-----  320 (381)
T ss_pred             hhhheechhhhhcccccc-----ccchhhhhhccccccccccchhhhhhhhcccccccccchhhhhhhh-ccccc-----
Confidence            999999999999999876     3446889999999999999999999999999997 99999999996 66432     


Q ss_pred             hhHHHHHHHHHhhheeeeccccchhhhhhhhhhcHHHHHHHHHHHHHhcc
Q 037162          338 LQLVEIKASLERSLTMVQHDFKPSIFKELREFVAMNALTMILDESRRADS  387 (689)
Q Consensus       338 ~qh~~Ik~s~~~s~~~v~~~~~~~l~~~l~g~iS~~AL~~i~~q~~r~~~  387 (689)
                                   .+..  .|++.++..+.   |+|+|+.++.+++++.+
T Consensus       321 -------------~tf~--~fP~~~~~~i~---TTN~iEsln~~irrr~~  352 (381)
T PF00872_consen  321 -------------LTFL--DFPPEHRRSIR---TTNAIESLNKEIRRRTK  352 (381)
T ss_pred             -------------ccee--eecchhccccc---hhhhccccccchhhhcc
Confidence                         2222  25555555555   88888888888888643


No 3  
>PF03101 FAR1:  FAR1 DNA-binding domain;  InterPro: IPR004330 Phytochrome A is the primary photoreceptor for mediating various far-red light-induced responses in higher plants. It has been found that the proteins governing this response, which include FAR-RED ELONGATED HYPOCOTYL3 (FHY3) and FAR-RED-IMPAIRED RESPONSE1 (FAR1), are a pair of homologous proteins sharing significant sequence homology to mutator-like transposases. These proteins appear to be novel transcription factors, which are essential for activating the expression of FHY1 and FHL (for FHY1-like) and related genes, whose products are required for light-induced phytochrome A nuclear accumulation and subsequent light responses in plants. The FRS (FAR1 Related Sequences) family of proteins share a similar domain structure to mutator-like transposases, including an N-terminal C2H2 zinc finger domain, a central putative core transposase domain, and a C-terminal SWIM motif (named after SWI2/SNF and MuDR transposases). It seems plausible that the FRS family represent transcription factors derived from mutator-like transposases. This entry represents a domain found in FAR1 and FRS proteins. It contains a WRKY-like fold and is therefore most likely a zinc-binding DNA-binding domain.
Probab=99.78  E-value=3.4e-19  Score=154.59  Aligned_cols=90  Identities=21%  Similarity=0.382  Sum_probs=79.4

Q ss_pred             HHHHHHHhhcCeEEEEeecccC-CCCCccEEEEEEecCCcCCCCCCCCCCCCCcCCceeeCCceEEEEEEeecCCCeEEE
Q 037162           29 REELQTELRNGLVIVIEKSDVA-ANGRKPRIIFTCERSGVYRDRSPQGPKPIKATGIQKCKCPFKLKGQKMANNDDWALI  107 (689)
Q Consensus        29 ~~~~~yA~~~GF~v~i~rS~~~-~~g~~~~~~~~C~r~G~~r~~~~~~~~~rr~~~s~ktgCpa~i~~~~~~~~~~W~V~  107 (689)
                      +||+.||..+||+|++.+|.+. .+|.+.++.|+|+++|.++.........++++.|.||||||+|.++... ++.|.|.
T Consensus         1 ~fy~~yA~~~GF~vr~~~s~~~~~~~~~~~~~~~C~r~G~~~~~~~~~~~~~r~~~s~ktgC~a~i~v~~~~-~~~w~v~   79 (91)
T PF03101_consen    1 DFYNSYARRHGFSVRKSSSRKSKKNGEIKRVTFVCSRGGKYKSKKKNEEKRRRNRPSKKTGCKARINVKRRK-DGKWRVT   79 (91)
T ss_pred             CHHHHhcCcCCeEEEEeeeEeCCCCceEEEEEEEECCcccccccccccccccccccccccCCCEEEEEEEcc-CCEEEEE
Confidence            5999999999999999998876 5788999999999999877655432456788999999999999999987 8899999


Q ss_pred             EEeCcccCCCCc
Q 037162          108 VICGFHNHPATQ  119 (689)
Q Consensus       108 ~~~~~HNH~l~~  119 (689)
                      .+..+|||+|.|
T Consensus        80 ~~~~~HNH~L~P   91 (91)
T PF03101_consen   80 SFVLEHNHPLCP   91 (91)
T ss_pred             ECcCCcCCCCCC
Confidence            999999999975


No 4  
>PF10551 MULE:  MULE transposase domain;  InterPro: IPR018289 This entry represents a domain found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif. This domain is also found in a transposase for the insertion sequence element IS256 in transposon Tn4001.
Probab=99.67  E-value=7.2e-17  Score=140.49  Aligned_cols=78  Identities=41%  Similarity=0.669  Sum_probs=74.2

Q ss_pred             hhcccCCCCceeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHHHHhhCcccccccccchHHHHHH
Q 037162          192 KIVGVTSTDLTFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSATTLLCRWHISRNVL  271 (689)
Q Consensus       192 ~~vGvd~~~~~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~~~lC~wHi~kNv~  271 (689)
                      .++|+|++|+.+++||+++.+|+.++|.|+|+.+++.+.. . |.+||||++.|+++||+++||++.|++|.||+.||++
T Consensus        16 ~~~~~d~~~~~~~v~~~l~~~e~~~~~~~~l~~~~~~~~~-~-p~~ii~D~~~~~~~Ai~~vfP~~~~~~C~~H~~~n~k   93 (93)
T PF10551_consen   16 IAVGIDGNGRGFPVAFALVSSESEESYEWFLEKLKEAMPQ-K-PKVIISDFDKALINAIKEVFPDARHQLCLFHILRNIK   93 (93)
T ss_pred             eEEEEcCCCCEEEEEEEEEcCCChhhhHHHHHHhhhcccc-C-ceeeeccccHHHHHHHHHHCCCceEehhHHHHHHhhC
Confidence            4899999999999999999999999999999999999853 5 9999999999999999999999999999999999974


No 5  
>PF08731 AFT:  Transcription factor AFT;  InterPro: IPR014842 AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This entry includes the paralogous pair of transcription factors AFT1 and AFT2. 
Probab=99.58  E-value=1.3e-14  Score=125.64  Aligned_cols=91  Identities=30%  Similarity=0.578  Sum_probs=79.6

Q ss_pred             cCChhhHHHHHHHHHhhcCeEEEEeecccCCCCCccEEEEEEecCCcCCCCCCC--------------------CCCCCC
Q 037162           21 LMEREDMPREELQTELRNGLVIVIEKSDVAANGRKPRIIFTCERSGVYRDRSPQ--------------------GPKPIK   80 (689)
Q Consensus        21 F~S~eea~~~~~~yA~~~GF~v~i~rS~~~~~g~~~~~~~~C~r~G~~r~~~~~--------------------~~~~rr   80 (689)
                      |.+.+|...|++.+++.+||+|+|.||+.      ..++|.|--+|.++.....                    ...+.+
T Consensus         1 F~~k~~ikpwlq~~~~~~Gi~iVIerSd~------~ki~FkCk~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~k   74 (111)
T PF08731_consen    1 FDDKDEIKPWLQKIFYPQGIGIVIERSDK------KKIVFKCKNGKRYRHKKKKKGQAQAQQKESTSGNKNKSSKKKKKK   74 (111)
T ss_pred             CCchHHHHHHHHHHhhhcCceEEEEecCC------ceEEEEEecCCCcccccccccccccccccccccccccccccccCC
Confidence            89999999999999999999999999995      4799999999987764331                    123346


Q ss_pred             cCCceeeCCceEEEEEEeecCCCeEEEEEeCcccCCC
Q 037162           81 ATGIQKCKCPFKLKGQKMANNDDWALIVICGFHNHPA  117 (689)
Q Consensus        81 ~~~s~ktgCpa~i~~~~~~~~~~W~V~~~~~~HNH~l  117 (689)
                      .+.+++++|||+|++......+.|.|.++++.|||++
T Consensus        75 ~t~srk~~CPFriRA~yS~k~k~W~lvvvnn~HnH~l  111 (111)
T PF08731_consen   75 RTKSRKNTCPFRIRANYSKKNKKWTLVVVNNEHNHPL  111 (111)
T ss_pred             cccccccCCCeEEEEEEEecCCeEEEEEecCCcCCCC
Confidence            6889999999999999999999999999999999996


No 6  
>PF02338 OTU:  OTU-like cysteine protease;  InterPro: IPR003323 This is a group of proteins found primarily in viruses, eukaryotes and in the pathogenic bacterium Chlamydia pneumoniae. In viruses they are annotated as replicase or RNA-dependent RNA polymerase. The eukaryotic sequences are related to the Ovarian Tumour (OTU) gene in Drosophila, cezanne deubiquitinating peptidase and tumor necrosis factor, alpha-induced protein 3 (MEROPS peptidase family C64) and otubain 1 and otubain 2 (MEROPS peptidase family C65).  None of these proteins has a known biochemical function but low sequence similarity with the polyprotein regions of arteriviruses, and conserved cysteine and histidine, and possibly the aspartate, residues suggests that those not yet recognised as peptidases could possess cysteine protease activity.; PDB: 2VFJ_C 3DKB_F 3PHW_A 3PHU_B 3PHX_A 3BY4_A 3C0R_C 3PRM_C 3PRP_C 3ZRH_A ....
Probab=99.54  E-value=5.4e-15  Score=135.21  Aligned_cols=107  Identities=25%  Similarity=0.323  Sum_probs=84.9

Q ss_pred             cCCCCcchHHHHHHhc----CCCccHHHHHHHHHHHHH-hhhhhhhhhccCchhHHHHHhhhcCCCCCCcccccccccch
Q 037162          564 AADGNCGFRTVADLIG----IGEDNWARVRRDLVDELQ-CHYNEYTLLLGYAGRYQELLHLLSNFEPNPSYDHWMIMPNT  638 (689)
Q Consensus       564 ~~dg~Cgfraia~~l~----~~~~~~~~vr~~l~~el~-~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~Wl~~~~~  638 (689)
                      +|||||+|||||.+|+    .+++.|..||+.++++|+ .+++.|...+.++        .+      .....|.+.+++
T Consensus         1 pgDGnClF~Avs~~l~~~~~~~~~~~~~lR~~~~~~l~~~~~~~~~~~~~~~--------~~------~~~~~Wg~~~el   66 (121)
T PF02338_consen    1 PGDGNCLFRAVSDQLYGDGGGSEDNHQELRKAVVDYLRDKNRDKFEEFLEGD--------KM------SKPGTWGGEIEL   66 (121)
T ss_dssp             -SSTTHHHHHHHHHHCTT-SSSTTTHHHHHHHHHHHHHTHTTTHHHHHHHHH--------HH------TSTTSHEEHHHH
T ss_pred             CCCccHHHHHHHHHHHHhcCCCHHHHHHHHHHHHHHHHHhccchhhhhhhhh--------hh------ccccccCcHHHH
Confidence            5999999999999999    999999999999999999 9999988876433        34      457799999988


Q ss_pred             hhhhccccceeEEEEccCc--eeeccCCCC--CCCCCCCCCcEEEEEec-----CCCc
Q 037162          639 GYLIAFKYNVIGLLISMQQ--CLTFLPLRS--IPGPRSSHKIIAIGYIY-----GCHF  687 (689)
Q Consensus       639 g~~iA~~y~~pv~~~s~~~--s~t~~P~~~--~p~~~~~~~~i~l~~~~-----~nHf  687 (689)
                       +++|+.|+|+|++|+...  ...+.+..+  +|..  ..++|+|+|..     +|||
T Consensus        67 -~a~a~~~~~~I~v~~~~~~~~~~~~~~~~~~~~~~--~~~~i~l~~~~~l~~~~~Hy  121 (121)
T PF02338_consen   67 -QALANVLNRPIIVYSSSDGDNVVFIKFTGKYPPLE--SPPPICLCYHGHLYYTGNHY  121 (121)
T ss_dssp             -HHHHHHHTSEEEEECETTTBEEEEEEESCEESTTT--TTTSEEEEEETEEEEETTEE
T ss_pred             -HHHHHHhCCeEEEEEcCCCCccceeeecCccccCC--CCCeEEEEEcCCccCCCCCC
Confidence             599999999999987532  233333322  2222  25789999998     8998


No 7  
>COG3328 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=99.39  E-value=4.6e-12  Score=135.94  Aligned_cols=227  Identities=20%  Similarity=0.215  Sum_probs=177.0

Q ss_pred             ccCCHHHHHHHHHhhhCCCChHHHHHHHHhcCCCcc------chhhHHHHHHHHhHHhhhcchhH---------------
Q 037162          128 GRLSKEESNLLVDMSKNNVKPKDILHVLKKRDMHNA------TTIRAIYNARRKCKVREQAGRSQ---------------  186 (689)
Q Consensus       128 RrLs~e~k~~I~~L~~sgv~pr~Il~~L~~~~g~~~------~t~kDIyN~~~k~r~~~l~g~t~---------------  186 (689)
                      ++........|..|...|+++++|-..++..++..+      ...+.+.+.+..+..+.+ |+++               
T Consensus        83 ~r~~~~~~~~v~~~y~~gv~Tr~i~~~~~~~~~~~~s~~~iS~~~~~~~e~v~~~~~r~l-~~~~~v~~D~~~~k~r~v~  161 (379)
T COG3328          83 QRRERALDLPVLSMYAKGVTTREIEALLEELYGHKVSPSVISVVTDRLDEKVKAWQNRPL-GDYPYVYLDAKYVKVRSVR  161 (379)
T ss_pred             HhhhhhHHHHHHHHHHcCCcHHHHHHHHHHhhCcccCHHHhhhHHHHHHHHHHHHHhccc-cCceEEEEecceeehhhhh
Confidence            344455566788899999999999999999876421      123455566777887777 4332               


Q ss_pred             HHHHhhhcccCCCCceeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHHHHhhCcccccccccchH
Q 037162          187 MQLLMKIVGVTSTDLTFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSATTLLCRWHI  266 (689)
Q Consensus       187 ~~~Ll~~vGvd~~~~~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~~~lC~wHi  266 (689)
                      -++++.++||+.+|+..++|+..-..|+ ..|.-+|..|+..  |......+|+|...++-+||..+||.+.+|+|..|+
T Consensus       162 ~~~~~ia~Gv~~eG~reilg~~~~~~e~-~~w~~~l~~l~~r--gl~~v~l~v~Dg~~gl~~aI~~v~p~a~~Q~C~vH~  238 (379)
T COG3328         162 NKAVYIAIGVTEEGRREILGIWVGVRES-KFWLSFLLDLKNR--GLSDVLLVVVDGLKGLPEAISAVFPQAAVQRCIVHL  238 (379)
T ss_pred             hheeeeeeccCcccchhhhceeeecccc-hhHHHHHHHHHhc--cccceeEEecchhhhhHHHHHHhccHhhhhhhhhHH
Confidence            1356789999999999999999999998 7888677777765  677788888999999999999999999999999999


Q ss_pred             HHHHHHHhhhhccchhhHHHHHHHHHHHHhcCCHHHHHHHHHHHHHhhc-CChhHHhhhhhcchhhhhhHHHhhHHHHHH
Q 037162          267 SRNVLANCKKLFETNEIWETFISSWNLLVLAASEEEFAQRLKSMESDFS-KYPTALTYIRNSSWTKVHTLLELQLVEIKA  345 (689)
Q Consensus       267 ~kNv~~~~~~~~~~~~~~~~~~~~w~~l~~a~t~~ef~~~~~~l~~~~~-~~p~~~~Yl~~~~W~~i~~~~e~qh~~Ik~  345 (689)
                      .+|+.++.+.     .+.+.+....+.|+.+++.++....|..+.+.|. .||+++..+.+ .|..+....         
T Consensus       239 ~Rnll~~v~~-----k~~d~i~~~~~~I~~a~~~e~~~~~~~~~~~~w~~~yP~i~~~~~~-~~~~~~~F~---------  303 (379)
T COG3328         239 VRNLLDKVPR-----KDQDAVLSDLRSIYIAPDAEEALLALLAFSELWGKRYPAILKSWRN-ALEELLPFF---------  303 (379)
T ss_pred             Hhhhhhhhhh-----hhhHHHHhhhhhhhccCCcHHHHHHHHHHHHhhhhhcchHHHHHHH-HHHHhcccc---------
Confidence            9999999887     4456788889999999999999999999999997 89999999886 554322110         


Q ss_pred             HHHhhheeeeccccchhhhhhhhhhcHHHHHHHHHHHHHhcc
Q 037162          346 SLERSLTMVQHDFKPSIFKELREFVAMNALTMILDESRRADS  387 (689)
Q Consensus       346 s~~~s~~~v~~~~~~~l~~~l~g~iS~~AL~~i~~q~~r~~~  387 (689)
                                 .|+.....-+.   |++||+.++.+++++.+
T Consensus       304 -----------~fp~~~r~~i~---ttN~IE~~n~~ir~~~~  331 (379)
T COG3328         304 -----------AFPSEIRKIIY---TTNAIESLNKLIRRRTK  331 (379)
T ss_pred             -----------cCcHHHHhHhh---cchHHHHHHHHHHHHHh
Confidence                       12222221222   89999999999887654


No 8  
>KOG2606 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=99.11  E-value=8.7e-11  Score=118.14  Aligned_cols=124  Identities=11%  Similarity=0.096  Sum_probs=105.9

Q ss_pred             cccccccccccccccccCCCCcchHHHHHHhcCCC---ccHHHHHHHHHHHHHhhhhhhhhhcc--------CchhHHHH
Q 037162          548 LFPSGIRPYIRGAKDVAADGNCGFRTVADLIGIGE---DNWARVRRDLVDELQCHYNEYTLLLG--------YAGRYQEL  616 (689)
Q Consensus       548 ~~~~~~~~~i~~i~~v~~dg~Cgfraia~~l~~~~---~~~~~vr~~l~~el~~~~~~y~~~~~--------~~~~~~~~  616 (689)
                      .+...+.+-...++|+++||||.|+||+.+|-+..   -+-..+|+...+++..|.+++.+++-        +..+|+.+
T Consensus       149 k~~~il~~~~l~~~~Ip~DG~ClY~aI~hQL~~~~~~~~~v~kLR~~~a~Ymr~H~~df~pf~~~eet~d~~~~~~f~~Y  228 (302)
T KOG2606|consen  149 KLAQILEERGLKMFDIPADGHCLYAAISHQLKLRSGKLLSVQKLREETADYMREHVEDFLPFLLDEETGDSLGPEDFDKY  228 (302)
T ss_pred             HHHHHHHhccCccccCCCCchhhHHHHHHHHHhccCCCCcHHHHHHHHHHHHHHHHHHhhhHhcCccccccCCHHHHHHH
Confidence            46667788888999999999999999999998863   56789999999999999999998763        34579999


Q ss_pred             HhhhcCCCCCCcccccccccchhhhhccccceeEEEEccCceeeccCCCCCCCCCCCCCcEEEEEec
Q 037162          617 LHLLSNFEPNPSYDHWMIMPNTGYLIAFKYNVIGLLISMQQCLTFLPLRSIPGPRSSHKIIAIGYIY  683 (689)
Q Consensus       617 ~~~l~~~~~~~~~~~Wl~~~~~g~~iA~~y~~pv~~~s~~~s~t~~P~~~~p~~~~~~~~i~l~~~~  683 (689)
                      ++.+      +.+..|++-++.+ +|++.+.+||.+|..+++    |+..++...+ .+||+|+|..
T Consensus       229 c~eI------~~t~~WGgelEL~-AlShvL~~PI~Vy~~~~p----~~~~geey~k-d~pL~lvY~r  283 (302)
T KOG2606|consen  229 CREI------RNTAAWGGELELK-ALSHVLQVPIEVYQADGP----ILEYGEEYGK-DKPLILVYHR  283 (302)
T ss_pred             HHHh------hhhccccchHHHH-HHHHhhccCeEEeecCCC----ceeechhhCC-CCCeeeehHH
Confidence            9999      5679999999997 999999999999998766    6666776553 4899999874


No 9  
>PF03108 DBD_Tnp_Mut:  MuDR family transposase;  InterPro: IPR004332 The plant MuDR transposase domain is present in plant proteins that are presumed to be the transposases for Mutator transposable elements. The function of these proteins is unknown.
Probab=98.30  E-value=2e-06  Score=69.92  Aligned_cols=63  Identities=21%  Similarity=0.458  Sum_probs=55.7

Q ss_pred             ecccccCChhhHHHHHHHHHhhcCeEEEEeecccCCCCCccEEEEEEecCCcCCCCCCCCCCCCCcCCceeeCCceEEEE
Q 037162           16 VVNVALMEREDMPREELQTELRNGLVIVIEKSDVAANGRKPRIIFTCERSGVYRDRSPQGPKPIKATGIQKCKCPFKLKG   95 (689)
Q Consensus        16 ~~~~~F~S~eea~~~~~~yA~~~GF~v~i~rS~~~~~g~~~~~~~~C~r~G~~r~~~~~~~~~rr~~~s~ktgCpa~i~~   95 (689)
                      -+|+.|+|.+|+..++..||..+||.+++.+|+.      .++..+|.  +                    .||||+|++
T Consensus         5 ~~G~~F~~~~e~k~av~~yai~~~~~~~v~ksd~------~r~~~~C~--~--------------------~~C~Wrv~a   56 (67)
T PF03108_consen    5 EVGQTFPSKEEFKEAVREYAIKNGFEFKVKKSDK------KRYRAKCK--D--------------------KGCPWRVRA   56 (67)
T ss_pred             ccCCEECCHHHHHHHHHHHHHhcCcEEEEeccCC------EEEEEEEc--C--------------------CCCCEEEEE
Confidence            3689999999999999999999999999999984      58999996  1                    169999999


Q ss_pred             EEeecCCCeEE
Q 037162           96 QKMANNDDWAL  106 (689)
Q Consensus        96 ~~~~~~~~W~V  106 (689)
                      ...+..+.|.|
T Consensus        57 s~~~~~~~~~I   67 (67)
T PF03108_consen   57 SKRKRSDTFQI   67 (67)
T ss_pred             EEcCCCCEEEC
Confidence            99998888875


No 10 
>KOG3288 consensus OTU-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=96.66  E-value=0.0051  Score=61.19  Aligned_cols=115  Identities=17%  Similarity=0.170  Sum_probs=77.8

Q ss_pred             cccccCCCCcchHHHHHHhcCCCccH-HHHHHHHHHHHHhhhhhhhhhccCchhHHHHHhhhcCCCCCCcccccccccch
Q 037162          560 AKDVAADGNCGFRTVADLIGIGEDNW-ARVRRDLVDELQCHYNEYTLLLGYAGRYQELLHLLSNFEPNPSYDHWMIMPNT  638 (689)
Q Consensus       560 i~~v~~dg~Cgfraia~~l~~~~~~~-~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~Wl~~~~~  638 (689)
                      ..=|+.|--|.|+||+--+......- .++|+-+.+|+-++++.|..-+-|... ++++.=+      -.++.|.+-.+.
T Consensus       112 ~~vvp~DNSCLF~ai~yv~~k~~~~~~~elR~iiA~~Vasnp~~yn~AiLgK~n-~eYc~WI------~k~dsWGGaIEl  184 (307)
T KOG3288|consen  112 RRVVPDDNSCLFTAIAYVIFKQVSNRPYELREIIAQEVASNPDKYNDAILGKPN-KEYCAWI------LKMDSWGGAIEL  184 (307)
T ss_pred             EEeccCCcchhhhhhhhhhcCccCCCcHHHHHHHHHHHhcChhhhhHHHhCCCc-HHHHHHH------ccccccCceEEe
Confidence            34478899999999987776543222 589999999999999999864322211 2333333      246889998888


Q ss_pred             hhhhccccceeEEEEccCceeeccCCCCCCCCCCCCCcEEEEEecCCCc
Q 037162          639 GYLIAFKYNVIGLLISMQQCLTFLPLRSIPGPRSSHKIIAIGYIYGCHF  687 (689)
Q Consensus       639 g~~iA~~y~~pv~~~s~~~s~t~~P~~~~p~~~~~~~~i~l~~~~~nHf  687 (689)
                      + ||++.|++-|++++.+..-.  -.+ ++..+- ..-++|.|. |-||
T Consensus       185 s-ILS~~ygveI~vvDiqt~ri--d~f-ged~~~-~~rv~llyd-GIHY  227 (307)
T KOG3288|consen  185 S-ILSDYYGVEICVVDIQTVRI--DRF-GEDKNF-DNRVLLLYD-GIHY  227 (307)
T ss_pred             e-eehhhhceeEEEEecceeee--hhc-CCCCCC-CceEEEEec-cccc
Confidence            8 99999999999998643211  112 221111 234777777 7887


No 11 
>PF10275 Peptidase_C65:  Peptidase C65 Otubain;  InterPro: IPR019400 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].   This family of proteins is a highly specific ubiquitin iso-peptidase that removes ubiquitin from proteins. The modification of cellular proteins by ubiquitin (Ub) is an important event that underlies protein stability and function in eukaryotes, as it is a dynamic and reversible process. Otubain carries several key conserved domains: (i) the OTU (ovarian tumour) domain, in which there is an active cysteine protease triad, (ii) a nuclear localisation signal, (iii) a Ub interaction motif (UIM)-like motif phi-xx-A-xxxs-xx-Ac (where phi indicates an aromatic amino acid, x indicates any amino acid and Ac indicates an acidic amino acid), (iv) a Ub-associated (UBA)-like domain and (v) the LxxLL motif.; PDB: 4DDG_C 3VON_O 2ZFY_A 4DHZ_A 4DDI_C 1TFF_A 4DHJ_I 4DHI_B.
Probab=96.66  E-value=0.0025  Score=65.67  Aligned_cols=27  Identities=26%  Similarity=0.442  Sum_probs=19.4

Q ss_pred             ccccccccccccccCCCCcchHHHHHH
Q 037162          551 SGIRPYIRGAKDVAADGNCGFRTVADL  577 (689)
Q Consensus       551 ~~~~~~i~~i~~v~~dg~Cgfraia~~  577 (689)
                      +.|......+..|.|||||+|||++-+
T Consensus        34 ~~L~~~y~~~R~vRGDGNCFYRAf~F~   60 (244)
T PF10275_consen   34 KKLSQKYSGIRRVRGDGNCFYRAFGFS   60 (244)
T ss_dssp             HHHHHHEEEEE-B-SSSTHHHHHHHHH
T ss_pred             HHHHhhhhheEeecCCccHHHHHHHHH
Confidence            344445677888999999999999865


No 12 
>KOG2605 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=96.45  E-value=0.0015  Score=70.23  Aligned_cols=123  Identities=17%  Similarity=0.121  Sum_probs=82.4

Q ss_pred             ccccccccccccCCCCcchHHHHHHhcCCCccHHHHHHHHHHHHHhhhhhhhhhccCchhHHHHHhhhcCCCCCCccccc
Q 037162          553 IRPYIRGAKDVAADGNCGFRTVADLIGIGEDNWARVRRDLVDELQCHYNEYTLLLGYAGRYQELLHLLSNFEPNPSYDHW  632 (689)
Q Consensus       553 ~~~~i~~i~~v~~dg~Cgfraia~~l~~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~W  632 (689)
                      +..|+..+.-|.+||+|.||++|+++.++.|-|..||++..++++..++.|.....  ..|..+++.-      .-.+-|
T Consensus       213 ~~~~g~e~~Kv~edGsC~fra~aDQvy~d~e~~~~~~~~~~dq~~~e~~~~~~~vt--~~~~~y~k~k------r~~~~~  284 (371)
T KOG2605|consen  213 KKHFGFEYKKVVEDGSCLFRALADQVYGDDEQHDHNRRECVDQLKKERDFYEDYVT--EDFTSYIKRK------RADGEP  284 (371)
T ss_pred             HHHhhhhhhhcccCCchhhhccHHHhhcCHHHHHHHHHHHHHHHhhcccccccccc--cchhhccccc------ccCCCC
Confidence            36678888999999999999999999999999999999999999998888776542  2345544444      334677


Q ss_pred             ccccchhhhhc---cccceeEEEEccCceeeccCCCCCCCCCCCCCcEEEEEecCCCc
Q 037162          633 MIMPNTGYLIA---FKYNVIGLLISMQQCLTFLPLRSIPGPRSSHKIIAIGYIYGCHF  687 (689)
Q Consensus       633 l~~~~~g~~iA---~~y~~pv~~~s~~~s~t~~P~~~~p~~~~~~~~i~l~~~~~nHf  687 (689)
                      ++-..+. .+|   --+.+|++..+.+  .|-+-...++...+ ..-+++.++...||
T Consensus       285 gnhie~Q-a~a~~~~~~~~~~~~~~~~--~t~~~~~~~~~~~~-~~~~~~n~~~~~h~  338 (371)
T KOG2605|consen  285 GNHIEQQ-AAADIYEEIEKPLNITSFK--DTCYIQTPPAIEES-VKMEKYNFWVEVHY  338 (371)
T ss_pred             cchHHHh-hhhhhhhhccccceeeccc--ccceeccCcccccc-hhhhhhcccchhhh
Confidence            7776664 666   4555666666532  11111222222111 33466666555665


No 13 
>KOG3991 consensus Uncharacterized conserved protein [Function unknown]
Probab=95.53  E-value=0.051  Score=53.62  Aligned_cols=20  Identities=30%  Similarity=0.463  Sum_probs=16.7

Q ss_pred             ccccccCCCCcchHHHHHHh
Q 037162          559 GAKDVAADGNCGFRTVADLI  578 (689)
Q Consensus       559 ~i~~v~~dg~Cgfraia~~l  578 (689)
                      .|....|||||.|||+|.++
T Consensus        65 ~iR~trgDGNCfyra~~~s~   84 (256)
T KOG3991|consen   65 VIRKTRGDGNCFYRAFAYSY   84 (256)
T ss_pred             hhheecCCCceehHHHHHHH
Confidence            46778999999999998654


No 14 
>PF05412 Peptidase_C33:  Equine arterivirus Nsp2-type cysteine proteinase;  InterPro: IPR008743 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This group of cysteine peptidases corresponds to MEROPS peptidase family C33 (clan CA). The type example is equine arteritis virus Nsp2-type cysteine proteinase, which is involved in viral polyprotein processing [].; GO: 0016032 viral reproduction, 0019082 viral protein processing
Probab=94.89  E-value=0.043  Score=47.58  Aligned_cols=32  Identities=16%  Similarity=0.034  Sum_probs=22.5

Q ss_pred             CCCcccccccccchhhhhccccceeEEEEccCc
Q 037162          625 PNPSYDHWMIMPNTGYLIAFKYNVIGLLISMQQ  657 (689)
Q Consensus       625 ~~~~~~~Wl~~~~~g~~iA~~y~~pv~~~s~~~  657 (689)
                      .+.+.+.|++--+++++|-.. +-|+-+--++.
T Consensus        34 ~~r~~d~W~~dedl~~~iq~l-~lPat~~~~~~   65 (108)
T PF05412_consen   34 RNRPSDDWADDEDLYQVIQSL-RLPATLDRNGA   65 (108)
T ss_pred             cCCChHHccChHHHHHHHHHc-cCceeccCCCC
Confidence            345789999999999988755 66655544333


No 15 
>PF03106 WRKY:  WRKY DNA-binding domain;  InterPro: IPR003657 The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding []. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene. Structural studies indicate that this domain is a four-stranded beta-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure []. The WRKYGQK residues correspond to the most N-terminal beta-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the beta-sheet.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 2AYD_A 1WJ2_A 2LEX_A.
Probab=94.64  E-value=0.15  Score=40.37  Aligned_cols=56  Identities=25%  Similarity=0.424  Sum_probs=37.8

Q ss_pred             CeEEEEeecccCCCCCccEEEEEEecCCcCCCCCCCCCCCCCcCCceeeCCceEEEEEEeecCCCeEEEEEeCcccCC
Q 037162           39 GLVIVIEKSDVAANGRKPRIIFTCERSGVYRDRSPQGPKPIKATGIQKCKCPFKLKGQKMANNDDWALIVICGFHNHP  116 (689)
Q Consensus        39 GF~v~i~rS~~~~~g~~~~~~~~C~r~G~~r~~~~~~~~~rr~~~s~ktgCpa~i~~~~~~~~~~W~V~~~~~~HNH~  116 (689)
                      ||..|+--.+..++....|.+|.|+..                      ||||+=.+.+..+++.-.++.+.++|||+
T Consensus         4 gy~WRKYGqK~i~g~~~pRsYYrCt~~----------------------~C~akK~Vqr~~~d~~~~~vtY~G~H~h~   59 (60)
T PF03106_consen    4 GYRWRKYGQKNIKGSPYPRSYYRCTHP----------------------GCPAKKQVQRSADDPNIVIVTYEGEHNHP   59 (60)
T ss_dssp             SS-EEEEEEEEETTTTCEEEEEEEECT----------------------TEEEEEEEEEETTCCCEEEEEEES--SS-
T ss_pred             CCchhhccCcccCCCceeeEeeecccc----------------------ChhheeeEEEecCCCCEEEEEEeeeeCCC
Confidence            677765333333333456788999631                      79999999988777888889999999997


No 16 
>PF06782 UPF0236:  Uncharacterised protein family (UPF0236);  InterPro: IPR009620 This is a group of proteins of unknown function.
Probab=94.38  E-value=1.2  Score=50.34  Aligned_cols=201  Identities=15%  Similarity=0.177  Sum_probs=119.5

Q ss_pred             cccccccCCHHHHHHHHHhhhCCCChHHHHHHHHhcCCCccchhhHHHHHHHHhHHhhh-----------------cch-
Q 037162          123 GHSFAGRLSKEESNLLVDMSKNNVKPKDILHVLKKRDMHNATTIRAIYNARRKCKVREQ-----------------AGR-  184 (689)
Q Consensus       123 ~h~~~RrLs~e~k~~I~~L~~sgv~pr~Il~~L~~~~g~~~~t~kDIyN~~~k~r~~~l-----------------~g~-  184 (689)
                      +...+.|+|++.+..|..+... ++-++..+.|....+....+...|.|.+...-....                 ||. 
T Consensus       111 Gl~~~~R~S~~~~~~i~~~a~~-~sYr~aa~~l~~~~~~~~iS~~tV~~~v~~~g~~~~~~~~~~k~~~~~LyIEaDg~~  189 (470)
T PF06782_consen  111 GLKKYQRISPELKEKIVELATE-MSYRKAAEILEELLGNVSISKQTVWNIVKEAGFEEIKEEEKEKKKVPVLYIEADGVH  189 (470)
T ss_pred             CCCcccchhHHHHHHHHHHHhh-cCHHHHHHHHhhccCCCccCHHHHHHHHHhccchhhhccccccCCCCeEEEecCcce
Confidence            4455689999999888887644 888999999987777666777788887655532110                 110 


Q ss_pred             hHHH----------HHhhhcc---cCCC-CceeEEEE-EEec---CCccchHHHHHHHHHHHHhcCCC-CeEEEechhHH
Q 037162          185 SQMQ----------LLMKIVG---VTST-DLTFSVCC-VYLE---SERENNYIWALERLKGVMEENML-PSVIVIDRELA  245 (689)
Q Consensus       185 t~~~----------~Ll~~vG---vd~~-~~~~~~gf-~~~~---~E~~e~~~w~l~~lk~~~~~~~~-P~viiTD~~~a  245 (689)
                      -+.|          .++.-.|   .... ++...+.- .|+.   ....+-|.-+.+.+-..+.-... --++.+|....
T Consensus       190 v~~qg~~~~~~e~k~~~vheG~~~~~~~~~R~~L~n~~~f~~~~~~~~~~~~~~v~~~i~~~Y~~~~~~~iiingDGa~W  269 (470)
T PF06782_consen  190 VKLQGKKKKKKEVKLFVVHEGWEKEKPGGKRNKLKNKRHFVSGVGESAEEFWEEVLDYIYNHYDLDKTTKIIINGDGASW  269 (470)
T ss_pred             ecccccccccceeeEEEEEeeeeeeeccCCcceeecchheecccccchHHHHHHHHHHHHHhcCcccceEEEEeCCCcHH
Confidence            0011          1122244   2222 22222222 3333   33345566666766666631222 24667888888


Q ss_pred             HHHHHHhhCcccccccccchHHHHHHHHhhhhccchhhHHHHHHHHHHHHhcCCHHHHHHHHHHHHHhhc------CChh
Q 037162          246 LMKAIKKKFPSATTLLCRWHISRNVLANCKKLFETNEIWETFISSWNLLVLAASEEEFAQRLKSMESDFS------KYPT  319 (689)
Q Consensus       246 l~~Ai~~vFP~a~~~lC~wHi~kNv~~~~~~~~~~~~~~~~~~~~w~~l~~a~t~~ef~~~~~~l~~~~~------~~p~  319 (689)
                      +.+++. .||++.+.|..||+++.+...++..-   +   .-...|+.| ...+...++..++.+...-.      ....
T Consensus       270 Ik~~~~-~~~~~~~~LD~FHl~k~i~~~~~~~~---~---~~~~~~~al-~~~d~~~l~~~L~~~~~~~~~~~~~~~i~~  341 (470)
T PF06782_consen  270 IKEGAE-FFPKAEYFLDRFHLNKKIKQALSHDP---E---LKEKIRKAL-KKGDKKKLETVLDTAESCAKDEEERKKIRK  341 (470)
T ss_pred             HHHHHH-hhcCceEEecHHHHHHHHHHHhhhCh---H---HHHHHHHHH-HhcCHHHHHHHHHHHHHhhhchHHHHHHHH
Confidence            887766 99999999999999999998776421   1   111233333 34455666666665554332      1235


Q ss_pred             HHhhhhhcchhhhh
Q 037162          320 ALTYIRNSSWTKVH  333 (689)
Q Consensus       320 ~~~Yl~~~~W~~i~  333 (689)
                      +..||.+ +|..++
T Consensus       342 ~~~Yl~~-n~~~i~  354 (470)
T PF06782_consen  342 LRKYLLN-NWDGIK  354 (470)
T ss_pred             HHHHHHH-CHHHhh
Confidence            6788885 887663


No 17 
>PF04684 BAF1_ABF1:  BAF1 / ABF1 chromatin reorganising factor;  InterPro: IPR006774 ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture []. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 in Saccharomyces cerevisiae (P14164 from SWISSPROT)) []. The N-terminal two thirds of the protein are necessary for DNA binding, and the N terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilise the protein structure [].; GO: 0003677 DNA binding, 0006338 chromatin remodeling, 0005634 nucleus
Probab=90.16  E-value=2  Score=47.02  Aligned_cols=42  Identities=14%  Similarity=0.162  Sum_probs=35.0

Q ss_pred             ccccCChhhHHHHHHHHHhhcCeEEEEeecccCCCCCccEEEEEEec
Q 037162           18 NVALMEREDMPREELQTELRNGLVIVIEKSDVAANGRKPRIIFTCER   64 (689)
Q Consensus        18 ~~~F~S~eea~~~~~~yA~~~GF~v~i~rS~~~~~g~~~~~~~~C~r   64 (689)
                      +..|+|.|+-|..++.|....-.-|+.+.|-+     .+.++|.|.+
T Consensus        25 ~~~f~tl~~wy~v~ndyefq~rcpiilknsh~-----nkhftfachl   66 (496)
T PF04684_consen   25 ARKFPTLEAWYNVINDYEFQSRCPIILKNSHR-----NKHFTFACHL   66 (496)
T ss_pred             ccCCCcHHHHHHHHhhhhhhhcCceeeccccc-----ccceEEEeec
Confidence            46899999999999999998888888777654     3578898864


No 18 
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=90.15  E-value=0.14  Score=52.43  Aligned_cols=110  Identities=11%  Similarity=-0.097  Sum_probs=63.2

Q ss_pred             CCCCCCccccccccccccccccccccCCCCcchHHHHHHhcCCCc--c---HHHHHHHHHHHHHhhhhhhhhh-c----c
Q 037162          539 PLKPVPFITLFPSGIRPYIRGAKDVAADGNCGFRTVADLIGIGED--N---WARVRRDLVDELQCHYNEYTLL-L----G  608 (689)
Q Consensus       539 ~~~~~p~~~~~~~~~~~~i~~i~~v~~dg~Cgfraia~~l~~~~~--~---~~~vr~~l~~el~~~~~~y~~~-~----~  608 (689)
                      |..-.|++-.+|....---..-+|..|||||.|.+|+++|++.-.  +   =...|..=......+...|..+ |    +
T Consensus       152 PDl~n~~i~~~~~i~y~~~i~k~d~~~dG~ieia~iS~~l~v~i~~Vdv~~~~~dr~~~~~~~q~~~i~f~g~hfD~~t~  231 (306)
T COG5539         152 PDLYNPAILEIDVIAYATWIVKPDSQGDGCIEIAIISDQLPVRIHVVDVDKDSEDRYNSHPYVQRISILFTGIHFDEETL  231 (306)
T ss_pred             ccccchhhcCcchHHHHHhhhccccCCCceEEEeEeccccceeeeeeecchhHHhhccCChhhhhhhhhhcccccchhhh
Confidence            333344444444433332233367789999999999999998411  0   0111111112222233334432 1    1


Q ss_pred             CchhHHHHHhhhcCCCCCCcccccccccchhhhhccccceeEEEEcc
Q 037162          609 YAGRYQELLHLLSNFEPNPSYDHWMIMPNTGYLIAFKYNVIGLLISM  655 (689)
Q Consensus       609 ~~~~~~~~~~~l~~~~~~~~~~~Wl~~~~~g~~iA~~y~~pv~~~s~  655 (689)
                      ....|+.+.+.+      -..+.|...+... .||+.+..|+-++..
T Consensus       232 ~m~~~dt~~ne~------~~~a~~g~~~ei~-qLas~lk~~~~~~nT  271 (306)
T COG5539         232 AMVLWDTYVNEV------LFDASDGITIEIQ-QLASLLKNPHYYTNT  271 (306)
T ss_pred             hcchHHHHHhhh------cccccccchHHHH-HHHHHhcCceEEeec
Confidence            124567777776      4467898666654 799999999988864


No 19 
>PF01610 DDE_Tnp_ISL3:  Transposase;  InterPro: IPR002560 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families [, , ]. This family includes the IS204 [], IS1001 [], IS1096 [] and IS1165 [] transposases. More information about these proteins can be found at Protein of the Month: Transposase [].; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=90.13  E-value=0.2  Score=51.53  Aligned_cols=67  Identities=19%  Similarity=0.196  Sum_probs=53.0

Q ss_pred             EEecCCccchHHHHHHHH-HHHHhcCCCCeEEEechhHHHHHHHHhhCcccccccccchHHHHHHHHhhh
Q 037162          208 VYLESERENNYIWALERL-KGVMEENMLPSVIVIDRELALMKAIKKKFPSATTLLCRWHISRNVLANCKK  276 (689)
Q Consensus       208 ~~~~~E~~e~~~w~l~~l-k~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~~~lC~wHi~kNv~~~~~~  276 (689)
                      .++.+-+.+++.-+|..+ -..  ....+++|++|...+...|+++.||+|....-.|||++++-+.+..
T Consensus        31 ~i~~~r~~~~l~~~~~~~~~~~--~~~~v~~V~~Dm~~~y~~~~~~~~P~A~iv~DrFHvvk~~~~al~~   98 (249)
T PF01610_consen   31 DILPGRDKETLKDFFRSLYPEE--ERKNVKVVSMDMSPPYRSAIREYFPNAQIVADRFHVVKLANRALDK   98 (249)
T ss_pred             EEcCCccHHHHHHHHHHhCccc--cccceEEEEcCCCccccccccccccccccccccchhhhhhhhcchh
Confidence            356666777766555544 222  2467899999999999999999999999999999999999886554


No 20 
>smart00774 WRKY DNA binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.
Probab=89.71  E-value=0.67  Score=36.43  Aligned_cols=28  Identities=25%  Similarity=0.385  Sum_probs=23.3

Q ss_pred             CCceEEEEEEeecCCCeEEEEEeCcccC
Q 037162           88 KCPFKLKGQKMANNDDWALIVICGFHNH  115 (689)
Q Consensus        88 gCpa~i~~~~~~~~~~W~V~~~~~~HNH  115 (689)
                      ||||+=.+.+..+++.-.+..+.++|||
T Consensus        32 ~C~a~K~Vq~~~~d~~~~~vtY~g~H~h   59 (59)
T smart00774       32 GCPAKKQVQRSDDDPSVVEVTYEGEHTH   59 (59)
T ss_pred             CCCCcccEEEECCCCCEEEEEEeeEeCC
Confidence            7998877777766677888889999998


No 21 
>PF04500 FLYWCH:  FLYWCH zinc finger domain;  InterPro: IPR007588 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.  C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. 
C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2 []. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes []. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA []. C2H2 Znf's can also bind to RNA and protein targets []. This entry represents a potential FLYWCH Zn-finger domain found in a number of eukaryotic proteins. FLYWCH is a C2H2-type zinc finger characterised by five conserved hydrophobic residues, containing the conserved sequence motif:  F/Y-X(n)-L-X(n)-F/Y-X(n)-WXCX(6-12)CX(17-22)HXH  where X indicates any amino acid. This domain was first characterised in Drosophila Modifier of mdg4 proteins, Mod(mdg4), putative chromatin modulators involved in higher order chromatin domains. Mod(mdg4) proteins share a common N-terminal BTB/POZ domain, but differ in their C-terminal region, most containing C-terminal FLYWCH zinc finger motifs []. The FLYWCH domain in Mod(mdg4) proteins has a putative role in protein-protein interactions; for example, Mod(mdg4)-67.2 interacts with DNA-binding protein Su(Hw) via its FLYWCH domain. FLYWCH domains have been described in other proteins as well, including suppressor of killer of prune, Su(Kpn), which contains 4 terminal FLYWCH zinc finger motifs in a tandem array and a C-terminal glutathione S-transferase (GST) domain []. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; PDB: 2RPR_A.
Probab=86.75  E-value=1.6  Score=33.95  Aligned_cols=25  Identities=28%  Similarity=0.391  Sum_probs=11.4

Q ss_pred             eCCceEEEEEEeecCCCeEEEEEeCcccC
Q 037162           87 CKCPFKLKGQKMANNDDWALIVICGFHNH  115 (689)
Q Consensus        87 tgCpa~i~~~~~~~~~~W~V~~~~~~HNH  115 (689)
                      .+|+|+|...    .+.-.+.....+|||
T Consensus        38 ~~C~a~~~~~----~~~~~~~~~~~~HnH   62 (62)
T PF04500_consen   38 HGCRARLITD----AGDGRVVRTNGEHNH   62 (62)
T ss_dssp             S----EEEEE------TTEEEE-S---SS
T ss_pred             CCCeEEEEEE----CCCCEEEECCCccCC
Confidence            5899999998    234566667789999


No 22 
>PF15299 ALS2CR8:  Amyotrophic lateral sclerosis 2 chromosomal region candidate gene 8
Probab=84.83  E-value=4.9  Score=40.82  Aligned_cols=99  Identities=17%  Similarity=0.183  Sum_probs=62.0

Q ss_pred             CCCcCCceeeCCceEEEEEEeec-----------------------------------CCCeEEEE-E--eCcc-cCCCC
Q 037162           78 PIKATGIQKCKCPFKLKGQKMAN-----------------------------------NDDWALIV-I--CGFH-NHPAT  118 (689)
Q Consensus        78 ~rr~~~s~ktgCpa~i~~~~~~~-----------------------------------~~~W~V~~-~--~~~H-NH~l~  118 (689)
                      .++...|.|.||||+|+++....                                   .+.+.+.+ +  ..+| +|+..
T Consensus        69 ~~~~~~skK~~CPA~I~Ik~I~~FPdykv~~~~~~~~~~~r~~~~~~lk~~l~~~~~~~~~~r~yv~lP~~~~H~~H~~~  148 (225)
T PF15299_consen   69 RRRSKPSKKRDCPARIYIKEIIKFPDYKVPTNSQKDTRRERRKASKKLKKALLSGKSIEGERRFYVQLPSPEEHSGHPIG  148 (225)
T ss_pred             ccccccccCCCCCeEEEEEEEEEcCCcccccchhhhhHHHHHHHHHHHHHHHhcCCCCCceEEEEEECCChHhcCCCccc
Confidence            44567899999999999976321                                   01222222 1  3567 78877


Q ss_pred             cccccccccccCCHHHHHHHHHhhhCCCCh-HHHHHHHHhc-----CC----------CccchhhHHHHHHHHhHH
Q 037162          119 QYLEGHSFAGRLSKEESNLLVDMSKNNVKP-KDILHVLKKR-----DM----------HNATTIRAIYNARRKCKV  178 (689)
Q Consensus       119 ~~~~~h~~~RrLs~e~k~~I~~L~~sgv~p-r~Il~~L~~~-----~g----------~~~~t~kDIyN~~~k~r~  178 (689)
                      ....  -....+.+...+.|.+|...|+.. .+|...|+..     +.          ..+.|.+||.|.......
T Consensus       149 ~~~~--~~~q~~~~~v~~ki~eLv~~gv~~v~e~k~~l~~fV~~~lf~~~~~p~~~n~~y~Pt~~di~n~~~~~~~  222 (225)
T PF15299_consen  149 QEAA--GLKQPLDPRVVEKIHELVAQGVTSVPEMKRHLKKFVEEELFKDQEPPPPTNRRYFPTDKDIRNHMYSAKK  222 (225)
T ss_pred             cccc--cccccCCHHHHHHHHHHHHcccccHHHHHHHHHHHhhhhccCCCCCCCCCccccCCchHHHHHHHHHHHh
Confidence            5321  123567788888999999999754 6666666432     21          123578899998765543


No 23 
>PF03050 DDE_Tnp_IS66:  Transposase IS66 family;  InterPro: IPR004291 Transposase proteins are necessary for efficient DNA transposition. This family includes the bacterial insertion sequence (IS) element, IS66, from Agrobacterium tumefaciens []. IS66 may cause genetic and structural variations of the T region and the vir region of the octopine Ti plasmids []. More information about these proteins can be found at Protein of the Month: Transposase [].
Probab=76.38  E-value=3  Score=43.39  Aligned_cols=132  Identities=15%  Similarity=0.194  Sum_probs=68.0

Q ss_pred             CHHHHHHHHHh-hhCCCChHHHHHHHHhcCCCccchhhHHHHHHHHhHHhhhcchhHHHH-Hh--hhcccCCCCceeEEE
Q 037162          131 SKEESNLLVDM-SKNNVKPKDILHVLKKRDMHNATTIRAIYNARRKCKVREQAGRSQMQL-LM--KIVGVTSTDLTFSVC  206 (689)
Q Consensus       131 s~e~k~~I~~L-~~sgv~pr~Il~~L~~~~g~~~~t~kDIyN~~~k~r~~~l~g~t~~~~-Ll--~~vGvd~~~~~~~~g  206 (689)
                      ++.....|.-+ ...+++-..|...+... | ..++...|.|...+.......-...+.. +.  .++.+|+|+-.+.- 
T Consensus         5 g~~~~a~i~~l~~~~~lp~~r~~~~~~~~-G-~~is~~ti~~~~~~~~~~l~~~~~~l~~~~~~~~~~~~DET~~~vl~-   81 (271)
T PF03050_consen    5 GPSLLALIAYLKYVYHLPLYRIQQMLEDL-G-ITISRGTIANWIKRVAEALKPLYEALKEELRSSPVVHADETGWRVLD-   81 (271)
T ss_pred             CHHHHHHHHHHHhcCCCCHHHHhhhhhcc-c-eeeccchhHhHhhhhhhhhhhhhhhhhhhccccceeccCCceEEEec-
Confidence            34444444433 35678888888888877 4 4445555555544433221000000000 11  34555555444111 


Q ss_pred             EEEecCCccchHHHHH----------------HHHHHHHhcCCCCeEEEechhHHHHHHHHhhCcccccccccchHHHHH
Q 037162          207 CVYLESERENNYIWAL----------------ERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSATTLLCRWHISRNV  270 (689)
Q Consensus       207 f~~~~~E~~e~~~w~l----------------~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~~~lC~wHi~kNv  270 (689)
                          .......|.|++                +.++..++ + ...+++||+-.+-..     |.+..|+.|+-|+.|.+
T Consensus        82 ----~~~g~~~~~Wv~~~~~~v~f~~~~sR~~~~~~~~L~-~-~~GilvsD~y~~Y~~-----~~~~~hq~C~AH~~R~~  150 (271)
T PF03050_consen   82 ----KGKGKKGYLWVFVSPEVVLFFYAPSRSSKVIKEFLG-D-FSGILVSDGYSAYNK-----LAGITHQLCWAHLRRDF  150 (271)
T ss_pred             ----cccccceEEEeeeccceeeeeecccccccchhhhhc-c-cceeeeccccccccc-----ccccccccccccccccc
Confidence                111111122211                12233332 2 457999999877644     33889999999999988


Q ss_pred             HHHhhh
Q 037162          271 LANCKK  276 (689)
Q Consensus       271 ~~~~~~  276 (689)
                      ..-...
T Consensus       151 ~~~~~~  156 (271)
T PF03050_consen  151 QDAAES  156 (271)
T ss_pred             cccccc
Confidence            766554


No 24 
>PF08069 Ribosomal_S13_N:  Ribosomal S13/S15 N-terminal domain;  InterPro: IPR012606 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [, ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.  Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [, ]. This domain is found at the N terminus of ribosomal S13 and S15 proteins. This domain is also identified as NUC021 [].; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005840 ribosome; PDB: 3U5C_N 3O30_G 3IZB_O 3O2Z_G 3U5G_N 2XZN_O 2XZM_O 3IZ6_O.
Probab=71.17  E-value=5.8  Score=31.30  Aligned_cols=31  Identities=26%  Similarity=0.418  Sum_probs=25.8

Q ss_pred             CC-HHHHHHHHHhhhCCCChHHHHHHHHhcCC
Q 037162          130 LS-KEESNLLVDMSKNNVKPKDILHVLKKRDM  160 (689)
Q Consensus       130 Ls-~e~k~~I~~L~~sgv~pr~Il~~L~~~~g  160 (689)
                      ++ ++..+.|..|.+.|++|++|--.|++++|
T Consensus        28 ~~~~eVe~~I~klakkG~tpSqIG~iLRD~~G   59 (60)
T PF08069_consen   28 YSPEEVEELIVKLAKKGLTPSQIGVILRDQYG   59 (60)
T ss_dssp             S-HHHHHHHHHHHCCTTHCHHHHHHHHHHSCT
T ss_pred             CCHHHHHHHHHHHHHcCCCHHHhhhhhhhccC
Confidence            44 45566788999999999999999999986


No 25 
>PF13610 DDE_Tnp_IS240:  DDE domain
Probab=67.09  E-value=0.97  Score=42.21  Aligned_cols=59  Identities=20%  Similarity=0.094  Sum_probs=40.5

Q ss_pred             ccCCCCceeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHHHHhhCccc
Q 037162          195 GVTSTDLTFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSA  257 (689)
Q Consensus       195 Gvd~~~~~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a  257 (689)
                      .||..+.  ++++-+-..-+...-..||..+.+..  ...|.+|+||+..+...|+++++++.
T Consensus        23 aiD~~~~--~l~~~ls~~Rd~~aA~~Fl~~~l~~~--~~~p~~ivtDk~~aY~~A~~~l~~~~   81 (140)
T PF13610_consen   23 AIDAEGN--ILDFYLSKRRDTAAAKRFLKRALKRH--RGEPRVIVTDKLPAYPAAIKELNPEG   81 (140)
T ss_pred             eeccccc--chhhhhhhhcccccceeeccccceee--ccccceeecccCCccchhhhhccccc
Confidence            3455555  45555554455555455555555542  27899999999999999999999985


No 26 
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=63.90  E-value=20  Score=37.27  Aligned_cols=107  Identities=15%  Similarity=0.053  Sum_probs=67.9

Q ss_pred             CCCCcchHHHHHHhcCCCccHHHHHHHHHHHHHhhhhhhhhhccCchhHHHHHhhhcCCCCCCcccccc-cccchhhhhc
Q 037162          565 ADGNCGFRTVADLIGIGEDNWARVRRDLVDELQCHYNEYTLLLGYAGRYQELLHLLSNFEPNPSYDHWM-IMPNTGYLIA  643 (689)
Q Consensus       565 ~dg~Cgfraia~~l~~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~Wl-~~~~~g~~iA  643 (689)
                      .|--|.|+|.+-.++--  +=..+|+....|..++++.|..-.-+-+ --.++..|      +.++-|. +--..+ +|.
T Consensus       119 ~d~srl~q~~~~~l~~a--sv~~lrE~vs~Ev~snPDl~n~~i~~~~-~i~y~~~i------~k~d~~~dG~ieia-~iS  188 (306)
T COG5539         119 DDNSRLFQAERYSLRDA--SVAKLREVVSLEVLSNPDLYNPAILEID-VIAYATWI------VKPDSQGDGCIEIA-IIS  188 (306)
T ss_pred             CchHHHHHHHHhhhhhh--hHHHHHHHHHHHHhhCccccchhhcCcc-hHHHHHhh------hccccCCCceEEEe-Eec
Confidence            45779999998887653  6778999999999999999987542222 12223333      4456666 333444 788


Q ss_pred             cccceeEEEEccCceeeccCCCCCCCCCCCCCcEEEEEecCCCc
Q 037162          644 FKYNVIGLLISMQQCLTFLPLRSIPGPRSSHKIIAIGYIYGCHF  687 (689)
Q Consensus       644 ~~y~~pv~~~s~~~s~t~~P~~~~p~~~~~~~~i~l~~~~~nHf  687 (689)
                      +.+++-|.+.......   -++-.+-+.  ..-|++.|. |-||
T Consensus       189 ~~l~v~i~~Vdv~~~~---~dr~~~~~~--~q~~~i~f~-g~hf  226 (306)
T COG5539         189 DQLPVRIHVVDVDKDS---EDRYNSHPY--VQRISILFT-GIHF  226 (306)
T ss_pred             cccceeeeeeecchhH---HhhccCChh--hhhhhhhhc-cccc
Confidence            8998888877754221   122222221  123777887 6777


No 27 
>PF13936 HTH_38:  Helix-turn-helix domain; PDB: 2W48_A.
Probab=61.88  E-value=5.8  Score=29.11  Aligned_cols=30  Identities=20%  Similarity=0.320  Sum_probs=15.6

Q ss_pred             ccCCHHHHHHHHHhhhCCCChHHHHHHHHh
Q 037162          128 GRLSKEESNLLVDMSKNNVKPKDILHVLKK  157 (689)
Q Consensus       128 RrLs~e~k~~I~~L~~sgv~pr~Il~~L~~  157 (689)
                      ++||.+++..|..|...|.+.++|...|..
T Consensus         3 ~~Lt~~eR~~I~~l~~~G~s~~~IA~~lg~   32 (44)
T PF13936_consen    3 KHLTPEERNQIEALLEQGMSIREIAKRLGR   32 (44)
T ss_dssp             ---------HHHHHHCS---HHHHHHHTT-
T ss_pred             cchhhhHHHHHHHHHHcCCCHHHHHHHHCc
Confidence            579999999999999999999999988743


No 28 
>PRK08561 rps15p 30S ribosomal protein S15P; Reviewed
Probab=57.57  E-value=39  Score=31.83  Aligned_cols=32  Identities=25%  Similarity=0.314  Sum_probs=26.7

Q ss_pred             cCCH-HHHHHHHHhhhCCCChHHHHHHHHhcCC
Q 037162          129 RLSK-EESNLLVDMSKNNVKPKDILHVLKKRDM  160 (689)
Q Consensus       129 rLs~-e~k~~I~~L~~sgv~pr~Il~~L~~~~g  160 (689)
                      .+++ +..+.|.+|.+.|++|++|--.|++++|
T Consensus        27 ~~~~eeve~~I~~lakkG~~pSqIG~~LRD~~g   59 (151)
T PRK08561         27 DYSPEEIEELVVELAKQGYSPSMIGIILRDQYG   59 (151)
T ss_pred             cCCHHHHHHHHHHHHHCCCCHHHhhhhHhhccC
Confidence            3444 4566788999999999999999999986


No 29 
>KOG4345 consensus NF-kappa B regulator AP20/Cezanne [Signal transduction mechanisms]
Probab=44.34  E-value=10  Score=43.66  Aligned_cols=49  Identities=16%  Similarity=0.179  Sum_probs=36.9

Q ss_pred             hhhhhccccceeEEEEcc-----C---------ceeeccCCCCCCCCCCCCCcEEEEEecCCCcc
Q 037162          638 TGYLIAFKYNVIGLLISM-----Q---------QCLTFLPLRSIPGPRSSHKIIAIGYIYGCHFI  688 (689)
Q Consensus       638 ~g~~iA~~y~~pv~~~s~-----~---------~s~t~~P~~~~p~~~~~~~~i~l~~~~~nHfv  688 (689)
                      |-+++|+...||||+++.     .         .-..|+||-.|+.-+.. -||.|+|. .-||+
T Consensus       225 hifvl~~ilRrpivvvsd~mlR~s~~~sfap~~~ggiylpLe~p~~~c~r-~pLvl~yd-~~hf~  287 (774)
T KOG4345|consen  225 HIFVLAHILRRPIVVVSDTMLRDSGGESFAPIPVGGIYLPLEVPAQECHR-SPLVLAYD-QAHFS  287 (774)
T ss_pred             HHHHHHHHhhCCeeEecccccccCCCcccccCccCceEEeccCchhhccc-chhhhhhH-hhhhh
Confidence            678899999999999962     1         23567888888866543 47889988 47775


No 30 
>PTZ00072 40S ribosomal protein S13; Provisional
Probab=40.35  E-value=61  Score=30.29  Aligned_cols=31  Identities=23%  Similarity=0.347  Sum_probs=26.0

Q ss_pred             CCH-HHHHHHHHhhhCCCChHHHHHHHHhcCC
Q 037162          130 LSK-EESNLLVDMSKNNVKPKDILHVLKKRDM  160 (689)
Q Consensus       130 Ls~-e~k~~I~~L~~sgv~pr~Il~~L~~~~g  160 (689)
                      .++ +..+.|..|.+.|++|++|--.|++++|
T Consensus        25 ~~~eeVe~~I~klaKkG~~pSqIG~iLRD~~g   56 (148)
T PTZ00072         25 LSSSEVEDQICKLAKKGLTPSQIGVILRDSMG   56 (148)
T ss_pred             CCHHHHHHHHHHHHHCCCCHhHhhhhhhhccC
Confidence            444 4566788999999999999999999994


No 31 
>KOG0400 consensus 40S ribosomal protein S13 [Translation, ribosomal structure and biogenesis]
Probab=30.74  E-value=38  Score=30.88  Aligned_cols=29  Identities=14%  Similarity=0.304  Sum_probs=26.1

Q ss_pred             HHHHHHHHHhhhCCCChHHHHHHHHhcCC
Q 037162          132 KEESNLLVDMSKNNVKPKDILHVLKKRDM  160 (689)
Q Consensus       132 ~e~k~~I~~L~~sgv~pr~Il~~L~~~~g  160 (689)
                      ++.++.|..|.+-|++|.+|--.|++.+|
T Consensus        31 ddvkeqI~K~akKGltpsqIGviLRDshG   59 (151)
T KOG0400|consen   31 DDVKEQIYKLAKKGLTPSQIGVILRDSHG   59 (151)
T ss_pred             HHHHHHHHHHHHcCCChhHceeeeecccC
Confidence            56788899999999999999999998887


No 32 
>PF03462 PCRF:  PCRF domain;  InterPro: IPR005139 This domain is found in peptide chain release factors. Peptide chain release factors are important for protein synthesis since they direct the termination of translation in response to the peptide chain termination codons UAG and UAA. These are structurally distinct but both contain the PCRF domain.; GO: 0016149 translation release factor activity, codon specific, 0006415 translational termination, 0005737 cytoplasm; PDB: 3D5A_X 3D5C_X 3MR8_V 3MS0_V 3F1G_X 3F1E_X 1ZBT_A 2IHR_1 2X9R_Y 2X9T_Y ....
Probab=29.49  E-value=1e+02  Score=27.70  Aligned_cols=42  Identities=14%  Similarity=0.145  Sum_probs=29.3

Q ss_pred             hhHHHHHHHHHhhcCeEEEEeecccCCCCCccEEEEEEecCC
Q 037162           25 EDMPREELQTELRNGLVIVIEKSDVAANGRKPRIIFTCERSG   66 (689)
Q Consensus        25 eea~~~~~~yA~~~GF~v~i~rS~~~~~g~~~~~~~~C~r~G   66 (689)
                      .++.+.|..||...||.+.+.....+.-|.++..++.=+-.|
T Consensus        66 ~~L~~MY~~~a~~~gw~~~~l~~~~~~~~G~k~a~~~I~G~~  107 (115)
T PF03462_consen   66 EELFRMYQRYAERRGWKVEVLDYSPGEEGGIKSATLEISGEG  107 (115)
T ss_dssp             HHHHHHHHHHHHHTT-EEEEEEEEE-SSSSEEEEEEEEESTT
T ss_pred             HHHHHHHHHHHHHcCCEEEEEecCCCCccceeEEEEEEEcCC
Confidence            578899999999999999987665544455666666544333


No 33 
>PRK09784 hypothetical protein; Provisional
Probab=29.35  E-value=51  Score=33.21  Aligned_cols=36  Identities=22%  Similarity=0.345  Sum_probs=26.0

Q ss_pred             cccccccccccCCCCcchHHHHHHhcCCCccHHHHHH
Q 037162          554 RPYIRGAKDVAADGNCGFRTVADLIGIGEDNWARVRR  590 (689)
Q Consensus       554 ~~~i~~i~~v~~dg~Cgfraia~~l~~~~~~~~~vr~  590 (689)
                      +.|..+---|+|||-|..|||-. |...+-+|..+-.
T Consensus       196 ~~~glkyapvdgdgycllrailv-lk~h~yswal~s~  231 (417)
T PRK09784        196 KTYGLKYAPVDGDGYCLLRAILV-LKQHDYSWALGSH  231 (417)
T ss_pred             hhhCceecccCCCchhHHHHHHH-hhhcccchhhccc
Confidence            45666667799999999999974 3444567776543


No 34 
>PF02796 HTH_7:  Helix-turn-helix domain of resolvase;  InterPro: IPR006120 Site-specific recombination plays an important role in DNA rearrangement in prokaryotic organisms. Two types of site-specific recombination are known to occur: recombination between inverted repeats resulting in the reversal of a DNA segment; and recombination between repeat sequences on two DNA molecules resulting in their cointegration, or between repeats on one DNA molecule resulting in the excision of a DNA fragment.  Site-specific recombination is characterised by a strand exchange mechanism that requires no DNA synthesis or high energy cofactor; the phosphodiester bond energy is conserved in a phospho-protein linkage during strand cleavage and re-ligation. Two unrelated families of recombinases are currently known. The first, called the 'phage integrase' family, groups a number of bacterial phage and yeast plasmid enzymes. The second, called the 'resolvase' family, groups enzymes which share the following structural characteristics: an N-terminal catalytic and dimerization domain that contains a conserved serine residue involved in the transient covalent attachment to DNA (InterPro: IPR006119), and a C-terminal helix-turn-helix DNA-binding domain.; GO: 0000150 recombinase activity, 0003677 DNA binding, 0006310 DNA recombination; PDB: 1ZR2_A 2GM4_B 1RES_A 1ZR4_A 1RET_A 1GDT_B 2R0Q_C 1JKP_C 1IJW_C 1JJ6_C ....
Probab=28.98  E-value=49  Score=24.18  Aligned_cols=39  Identities=15%  Similarity=0.243  Sum_probs=26.8

Q ss_pred             ccCCHHHHHHHHHhhhCCCChHHHHHHHHhcCCCccchhhHHHHHH
Q 037162          128 GRLSKEESNLLVDMSKNNVKPKDILHVLKKRDMHNATTIRAIYNAR  173 (689)
Q Consensus       128 RrLs~e~k~~I~~L~~sgv~pr~Il~~L~~~~g~~~~t~kDIyN~~  173 (689)
                      +.+++++.+.|..|...|++..+|...+.       .....||.++
T Consensus         4 ~~~~~~~~~~i~~l~~~G~si~~IA~~~g-------vsr~TvyR~l   42 (45)
T PF02796_consen    4 PKLSKEQIEEIKELYAEGMSIAEIAKQFG-------VSRSTVYRYL   42 (45)
T ss_dssp             SSSSHCCHHHHHHHHHTT--HHHHHHHTT-------S-HHHHHHHH
T ss_pred             CCCCHHHHHHHHHHHHCCCCHHHHHHHHC-------cCHHHHHHHH
Confidence            45677788899999999999999988652       3455666554


No 35 
>PF04800 ETC_C1_NDUFA4:  ETC complex I subunit conserved region;  InterPro: IPR006885 This entry represents prokaryotic NADH-ubiquinone oxidoreductase subunits (1.6.5.3 from EC, 1.6.99.3 from EC) from complex I of the electron transport chain initially identified in Neurospora crassa as a 21 kDa protein.; GO: 0016651 oxidoreductase activity, acting on NADH or NADPH, 0022900 electron transport chain, 0005743 mitochondrial inner membrane; PDB: 2JYA_A 2LJU_A.
Probab=28.37  E-value=45  Score=29.39  Aligned_cols=27  Identities=26%  Similarity=0.297  Sum_probs=19.5

Q ss_pred             cccccCChhhHHHHHHHHHhhcCeEEEEeec
Q 037162           17 VNVALMEREDMPREELQTELRNGLVIVIEKS   47 (689)
Q Consensus        17 ~~~~F~S~eea~~~~~~yA~~~GF~v~i~rS   47 (689)
                      +.+.|+|.|+|..    ||.++|....|.--
T Consensus        51 v~l~F~skE~Ai~----yaer~G~~Y~V~~p   77 (101)
T PF04800_consen   51 VRLKFDSKEDAIA----YAERNGWDYEVEEP   77 (101)
T ss_dssp             CEEEESSHHHHHH----HHHHCT-EEEEE-S
T ss_pred             eEeeeCCHHHHHH----HHHHcCCeEEEeCC
Confidence            4569999999986    57788888777533


No 36 
>PF11427 HTH_Tnp_Tc3_1:  Tc3 transposase; PDB: 1U78_A 1TC3_C.
Probab=28.05  E-value=74  Score=24.21  Aligned_cols=30  Identities=13%  Similarity=0.193  Sum_probs=21.6

Q ss_pred             cCCHHHHHHHHHhhhCCCChHHHHHHHHhc
Q 037162          129 RLSKEESNLLVDMSKNNVKPKDILHVLKKR  158 (689)
Q Consensus       129 rLs~e~k~~I~~L~~sgv~pr~Il~~L~~~  158 (689)
                      .|++.++..|..|.+.|++..+|...|...
T Consensus         4 ~Lt~~Eqaqid~m~qlG~s~~~isr~i~RS   33 (50)
T PF11427_consen    4 TLTDAEQAQIDVMHQLGMSLREISRRIGRS   33 (50)
T ss_dssp             ---HHHHHHHHHHHHTT--HHHHHHHHT--
T ss_pred             cCCHHHHHHHHHHHHhchhHHHHHHHhCcc
Confidence            478999999999999999999999888654


No 37 
>PF00665 rve:  Integrase core domain;  InterPro: IPR001584 Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus (ASV) have been studied most carefully with respect to the structural basis of catalysis. Although the active site of ASV integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis.  Retroviral integrase is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group.  HIV integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration. To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity.; GO: 0015074 DNA integration; PDB: 3AO3_A 3OVN_A 3AO5_A 3AO4_A 3AO1_A 1C6V_D 3HPG_A 3HPH_A 3OYD_A 3OYF_B ....
Probab=27.15  E-value=71  Score=27.92  Aligned_cols=50  Identities=18%  Similarity=0.085  Sum_probs=29.6

Q ss_pred             CCceeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHH
Q 037162          199 TDLTFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKA  249 (689)
Q Consensus       199 ~~~~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~A  249 (689)
                      ....+.+++.+...++.+.+.-+|........ ...|.+|+||+..+..+.
T Consensus        33 ~~S~~~~~~~~~~~~~~~~~~~~l~~~~~~~~-~~~p~~i~tD~g~~f~~~   82 (120)
T PF00665_consen   33 DYSRFIYAFPVSSKETAEAALRALKRAIEKRG-GRPPRVIRTDNGSEFTSH   82 (120)
T ss_dssp             TTTTEEEEEEESSSSHHHHHHHHHHHHHHHHS--SE-SEEEEESCHHHHSH
T ss_pred             CCCCcEEEEEeecccccccccccccccccccc-cccceecccccccccccc
Confidence            34456667777666555555555554333321 222999999999888643


No 38 
>PF03461 TRCF:  TRCF domain;  InterPro: IPR005118  This domain is found in proteins necessary for strand-specific repair in DNA such as TRCF in Escherichia coli. A lesion in the template strand blocks the RNA polymerase complex (RNAP). The RNAP-DNA-RNA complex is specifically recognised by the transcription-repair-coupling factor (TRCF) which releases RNAP and the truncated transcript.; GO: 0003684 damaged DNA binding, 0004386 helicase activity, 0005524 ATP binding, 0006281 DNA repair; PDB: 2QSR_A 2EYQ_A.
Probab=27.02  E-value=1.1e+02  Score=26.75  Aligned_cols=40  Identities=20%  Similarity=0.273  Sum_probs=29.0

Q ss_pred             HHHHHHHHHHhcCCHHHHHHHHHHHHHhhcCChhHHhhhh
Q 037162          286 TFISSWNLLVLAASEEEFAQRLKSMESDFSKYPTALTYIR  325 (689)
Q Consensus       286 ~~~~~w~~l~~a~t~~ef~~~~~~l~~~~~~~p~~~~Yl~  325 (689)
                      +=+..++.+..+.|.++.++...+|.+.|+..|.-++.|-
T Consensus        18 ~Rl~~Yrrl~~~~~~~el~~l~~El~DRFG~~P~ev~~L~   57 (101)
T PF03461_consen   18 ERLELYRRLASAESEEELEDLREELIDRFGPLPEEVENLL   57 (101)
T ss_dssp             HHHHHHHHHHC--SHHHHHHHHHHHHHHH-S--HHHHHHH
T ss_pred             HHHHHHHHHhhCCCHHHHHHHHHHHHHHcCCCcHHHHHHH
Confidence            3355677889999999999999999999998887776664


No 39 
>TIGR03147 cyt_nit_nrfF cytochrome c nitrite reductase, accessory protein NrfF.
Probab=26.69  E-value=81  Score=28.93  Aligned_cols=34  Identities=9%  Similarity=0.093  Sum_probs=30.2

Q ss_pred             cCCHHHHHHHHHhhhCCCChHHHHHHHHhcCCCc
Q 037162          129 RLSKEESNLLVDMSKNNVKPKDILHVLKKRDMHN  162 (689)
Q Consensus       129 rLs~e~k~~I~~L~~sgv~pr~Il~~L~~~~g~~  162 (689)
                      .+..+.+.+|.++...|.+..+|.+.|.++||+.
T Consensus        57 ~iA~dmR~~Vr~~i~~G~Sd~eI~~~~v~RYG~~   90 (126)
T TIGR03147        57 PIAYDLRHEVYSMVNEGKSNQQIIDFMTARFGDF   90 (126)
T ss_pred             HHHHHHHHHHHHHHHcCCCHHHHHHHHHHhcCCe
Confidence            3556788999999999999999999999999974


No 40 
>KOG4825 consensus Component of synaptic membrane glycine-, glutamate- and thienylcyclohexylpiperidine-binding glycoprotein (43kDa) [Signal transduction mechanisms]
Probab=26.44  E-value=1.4e+02  Score=33.09  Aligned_cols=29  Identities=21%  Similarity=0.217  Sum_probs=22.6

Q ss_pred             ccCCCCCCCCccccCcccCCCCccccccc
Q 037162          479 VKGKTRGRPSLKAYTSARRNPSKFEYVLS  507 (689)
Q Consensus       479 ~k~~tkg~p~~~~~~st~r~ps~~e~~~~  507 (689)
                      .+.-+.|||.--...++.|.||.||.--+
T Consensus       284 pqleepgrenqfaepflqekpsswelpIr  312 (666)
T KOG4825|consen  284 PQLEEPGRENQFAEPFLQEKPSSWELPIR  312 (666)
T ss_pred             ccccCCCCccccccchhhcCCCcceeecc
Confidence            34557788877777899999999997643


No 41 
>PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional
Probab=26.09  E-value=4.1e+02  Score=32.56  Aligned_cols=28  Identities=11%  Similarity=-0.007  Sum_probs=19.2

Q ss_pred             hhhhhhh-ccCCCcCCCCCcccccccccc
Q 037162          398 PEIAEYK-REGRPIPLSSLHSHRKKLDLL  425 (689)
Q Consensus       398 h~i~~~l-~~~~~l~~~~~H~~W~~l~~~  425 (689)
                      |.|.-+. .+=..||..-|=.+|.+.--.
T Consensus       595 HaLkVL~~~~v~~IP~~YILkRWTKdAK~  623 (846)
T PLN03097        595 HALVVLQMCQLSAIPSQYILKRWTKDAKS  623 (846)
T ss_pred             hHHHHHhhcCcccCchhhhhhhchhhhhh
Confidence            6666544 344569999999999965443


No 42 
>COG3316 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=22.81  E-value=1.9e+02  Score=29.09  Aligned_cols=108  Identities=16%  Similarity=0.124  Sum_probs=60.6

Q ss_pred             hCCCChHHHHHHHHhcCCCccchhhHHHHHHHHhHHhhh--------c-ch------hH------HHHHhhhcccCCCCc
Q 037162          143 KNNVKPKDILHVLKKRDMHNATTIRAIYNARRKCKVREQ--------A-GR------SQ------MQLLMKIVGVTSTDL  201 (689)
Q Consensus       143 ~sgv~pr~Il~~L~~~~g~~~~t~kDIyN~~~k~r~~~l--------~-g~------t~------~~~Ll~~vGvd~~~~  201 (689)
                      ..+++-+.+.+.|.+..  ....-..|+...+++-....        . ++      +-      -..|..+  ||.+|.
T Consensus        23 ~~~Ls~r~v~e~l~~rg--i~v~h~Ti~rwv~k~~~~~~~~~~~r~~~~~~~w~vDEt~ikv~gkw~ylyrA--id~~g~   98 (215)
T COG3316          23 RYGLSLRDVEEMLAERG--IEVDHETIHRWVQKYGPLLARRLKRRKRKAGDSWRVDETYIKVNGKWHYLYRA--IDADGL   98 (215)
T ss_pred             hcchhhccHHHHHHHcC--cchhHHHHHHHHHHHhHHHHHHhhhhccccccceeeeeeEEeeccEeeehhhh--hccCCC
Confidence            44788888888777664  33344555555444322111        0 00      00      0012222  344444


Q ss_pred             eeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHHHHhhCccccc
Q 037162          202 TFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSATT  259 (689)
Q Consensus       202 ~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~~  259 (689)
                      +  +.+-+...-+...-.-||..+++.   ...|.+|+||+.+....|+.++-+...|
T Consensus        99 ~--Ld~~L~~rRn~~aAk~Fl~kllk~---~g~p~v~vtDka~s~~~A~~~l~~~~eh  151 (215)
T COG3316          99 T--LDVWLSKRRNALAAKAFLKKLLKK---HGEPRVFVTDKAPSYTAALRKLGSEVEH  151 (215)
T ss_pred             e--EEEEEEcccCcHHHHHHHHHHHHh---cCCCceEEecCccchHHHHHhcCcchhe
Confidence            3  445454444444444555555554   3789999999999999999999885543


No 43 
>PF07506 RepB:  RepB plasmid partitioning protein;  InterPro: IPR011111 This family includes proteins with sequence similarity to the RepB partitioning protein of the large Ti (tumour-inducing) plasmids of Agrobacterium tumefaciens.
Probab=22.71  E-value=2.4e+02  Score=27.48  Aligned_cols=62  Identities=18%  Similarity=0.274  Sum_probs=50.7

Q ss_pred             ccccccCCCCcchHHHHHHhcCCCccHHHHHHHHHHHHHhhhhhhhhhccCchhHHHHHhhhc
Q 037162          559 GAKDVAADGNCGFRTVADLIGIGEDNWARVRRDLVDELQCHYNEYTLLLGYAGRYQELLHLLS  621 (689)
Q Consensus       559 ~i~~v~~dg~Cgfraia~~l~~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~  621 (689)
                      +.++.-.|+.||..+|..+-+++.+.|..+.+.|+.. ......|......+.+|+.+++.+.
T Consensus        53 e~~~ll~~~~~~~~~ig~A~~igr~Rw~ela~l~~~~-~~~~~~~~~~~~s~~rf~~l~~~l~  114 (185)
T PF07506_consen   53 EEVRLLADVEVGIIAIGPAPKIGRPRWIELAELMIAA-NNFTSAYFRALLSDTRFEALVRALR  114 (185)
T ss_pred             HHHHHHhhccccHHHHHHHHHCCCcCHHHHHHHHHHH-HhhhHHHHHhcccCCcHHHHHHHHh
Confidence            5577889999999999999999999999999999332 3333456666668899999999993


No 44 
>PRK10144 formate-dependent nitrite reductase complex subunit NrfF; Provisional
Probab=20.79  E-value=1.3e+02  Score=27.70  Aligned_cols=34  Identities=9%  Similarity=0.050  Sum_probs=30.1

Q ss_pred             cCCHHHHHHHHHhhhCCCChHHHHHHHHhcCCCc
Q 037162          129 RLSKEESNLLVDMSKNNVKPKDILHVLKKRDMHN  162 (689)
Q Consensus       129 rLs~e~k~~I~~L~~sgv~pr~Il~~L~~~~g~~  162 (689)
                      .+..+.+.+|.++...|.+..+|.+.|.++||+.
T Consensus        57 ~iA~dmR~~Vr~~i~~G~sd~eI~~~~v~RYG~~   90 (126)
T PRK10144         57 PVAVSMRHQVYSMVAEGKSEVEIIGWMTERYGDF   90 (126)
T ss_pred             HHHHHHHHHHHHHHHcCCCHHHHHHHHHHhcCCe
Confidence            3456788899999999999999999999999974


No 45 
>TIGR03277 methan_mark_9 putative methanogenesis marker domain 9. A gene for a protein that contains a copy of this domain, to date, is found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. A 69-amino acid core region of this 110-amino acid domain contains eight invariant Cys residues, including two copies of a motif [WFY]CCxxKPC. These motifs could be consistent with predicted metal-binding transcription factor as was suggested for the COG4008 family. Some members of this family have an additional N-terminal domain of about 250 amino acids from the nifR3 family of predicted TIM-barrel proteins.
Probab=20.40  E-value=82  Score=27.75  Aligned_cols=30  Identities=23%  Similarity=0.598  Sum_probs=27.1

Q ss_pred             cchH-HHHHHhcCCCccHHHHHHHHHHHHHh
Q 037162          569 CGFR-TVADLIGIGEDNWARVRRDLVDELQC  598 (689)
Q Consensus       569 Cgfr-aia~~l~~~~~~~~~vr~~l~~el~~  598 (689)
                      |-|| ..-..+|++.+.+..+.+++.+||..
T Consensus        78 Cplrd~aL~~igls~~EYm~lKkelae~i~~  108 (109)
T TIGR03277        78 CPLRDSALQRIGMSPEEYMELKKKLAEELLK  108 (109)
T ss_pred             CcCchHHHHHcCCCHHHHHHHHHHHHHHHhc
Confidence            8888 57789999999999999999999875


Done!