Query         003149
Match_columns 844
No_of_seqs    451 out of 2097
Neff          8.8 
Searched_HMMs 46136
Date          Thu Mar 28 17:57:57 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/003149.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/003149hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PLN03097 FHY3 Protein FAR-RED  100.0 1.2E-79 2.6E-84  712.8  47.2  509   31-569    69-627 (846)
  2 PF10551 MULE:  MULE transposas  99.8 3.8E-20 8.3E-25  162.6   8.7   90  260-352     1-93  (93)
  3 PF00872 Transposase_mut:  Tran  99.7 3.8E-18 8.1E-23  188.3   5.0  248  152-448    98-352 (381)
  4 PF03101 FAR1:  FAR1 DNA-bindin  99.6 3.6E-16 7.8E-21  136.6   7.7   90   50-142     1-91  (91)
  5 PF02338 OTU:  OTU-like cystein  99.6 9.2E-16   2E-20  141.9   6.0  107  706-829     1-121 (121)
  6 PF08731 AFT:  Transcription fa  99.4 7.2E-13 1.6E-17  113.8  10.6   93   42-140     1-111 (111)
  7 KOG2606 OTU (ovarian tumor)-li  99.3 2.2E-12 4.7E-17  129.2   6.5  133  690-834   149-297 (302)
  8 COG3328 Transposase and inacti  99.2 3.6E-10 7.9E-15  122.2  16.5  244  154-449    86-332 (379)
  9 PF03108 DBD_Tnp_Mut:  MuDR fam  98.9 7.1E-09 1.5E-13   84.4   7.4   66   34-129     2-67  (67)
 10 smart00575 ZnF_PMZ plant mutat  98.4 8.7E-08 1.9E-12   62.5   1.8   26  522-547     1-26  (28)
 11 KOG2605 OTU (ovarian tumor)-li  98.0 2.1E-06 4.7E-11   92.2   0.8  132  692-834   210-343 (371)
 12 KOG3288 OTU-like cysteine prot  97.7  0.0001 2.3E-09   72.5   7.7  121  703-838   113-236 (307)
 13 PF04434 SWIM:  SWIM zinc finge  97.5 4.9E-05 1.1E-09   54.7   2.0   30  517-546    10-39  (40)
 14 PF10275 Peptidase_C65:  Peptid  97.5 0.00018   4E-09   74.9   6.7   52  782-834   190-244 (244)
 15 KOG3991 Uncharacterized conser  97.0  0.0021 4.5E-08   62.8   7.1   88  736-835   165-256 (256)
 16 PF06782 UPF0236:  Uncharacteri  96.4    0.32 6.9E-06   55.7  20.9  245  150-449   115-381 (470)
 17 PF03106 WRKY:  WRKY DNA -bindi  95.4   0.049 1.1E-06   42.7   6.1   57   59-139     3-59  (60)
 18 PF05412 Peptidase_C33:  Equine  94.6   0.089 1.9E-06   45.3   5.8   88  704-839     3-90  (108)
 19 PF01610 DDE_Tnp_ISL3:  Transpo  94.5   0.046   1E-06   57.2   5.1   68  288-357    30-98  (249)
 20 PF04684 BAF1_ABF1:  BAF1 / ABF  93.2    0.62 1.3E-05   51.0  10.4   46   35-85     21-66  (496)
 21 smart00774 WRKY DNA binding do  92.3    0.27 5.9E-06   38.2   4.6   56   60-138     4-59  (59)
 22 COG5539 Predicted cysteine pro  92.2   0.044 9.6E-07   55.9   0.3  109  684-799   155-273 (306)
 23 PF13610 DDE_Tnp_IS240:  DDE do  90.7    0.17 3.6E-06   47.8   2.4   81  253-338     1-81  (140)
 24 PF04500 FLYWCH:  FLYWCH zinc f  89.4       1 2.2E-05   35.4   5.7   49   59-138    14-62  (62)
 25 PF00665 rve:  Integrase core d  79.4     8.7 0.00019   34.5   7.8   76  253-330     6-82  (120)
 26 PF15299 ALS2CR8:  Amyotrophic   78.1      14 0.00031   37.8   9.6   98  101-200    69-221 (225)
 27 PHA02517 putative transposase   76.4      22 0.00047   37.7  10.9   69  253-326   110-180 (277)
 28 COG5539 Predicted cysteine pro  61.1     8.6 0.00019   39.8   3.5  112  707-834   119-231 (306)
 29 COG4279 Uncharacterized conser  59.2     4.2   9E-05   41.2   0.9   30  521-553   124-153 (266)
 30 PF04937 DUF659:  Protein of un  58.8      83  0.0018   30.0   9.7  106  250-358    30-139 (153)
 31 KOG4345 NF-kappa B regulator A  56.8     7.3 0.00016   45.0   2.4   52  780-833   225-290 (774)
 32 COG3316 Transposase and inacti  53.2      32 0.00069   34.6   5.9   81  253-340    70-151 (215)
 33 COG3464 Transposase and inacti  49.6      68  0.0015   36.1   8.6   56  289-349   183-238 (402)
 34 PF08069 Ribosomal_S13_N:  Ribo  48.4      28 0.00061   27.3   3.7   29  155-183    31-59  (60)
 35 PF13936 HTH_38:  Helix-turn-he  47.8      25 0.00053   25.6   3.2   30  151-180     3-32  (44)
 36 PRK09784 hypothetical protein;  45.1      13 0.00029   37.0   1.9   36  697-733   197-232 (417)
 37 PF03050 DDE_Tnp_IS66:  Transpo  44.5      16 0.00036   38.5   2.7   85  253-357    67-156 (271)
 38 PRK13907 rnhA ribonuclease H;   44.5 1.8E+02  0.0039   26.4   9.4   78  255-335     3-81  (128)
 39 COG5431 Uncharacterized metal-  34.8      15 0.00032   31.8   0.4   25  521-545    49-78  (117)
 40 KOG4540 Putative lipase essent  32.8 1.3E+02  0.0027   31.6   6.6   57  732-797   246-306 (425)
 41 COG5153 CVT17 Putative lipase   32.8 1.3E+02  0.0027   31.6   6.6   57  732-797   246-306 (425)
 42 PF09921 DUF2153:  Uncharacteri  28.4      38 0.00083   30.5   1.8   48  725-778     3-56  (126)
 43 PF09607 BrkDBD:  Brinker DNA-b  26.7      47   0.001   25.8   1.8   18  707-724    20-39  (58)
 44 PF03412 Peptidase_C39:  Peptid  25.7 3.2E+02   0.007   24.6   7.8   83  708-834    10-92  (131)
 45 KOG4825 Component of synaptic   25.5   1E+02  0.0022   34.2   4.7   36  616-651   279-314 (666)
 46 KOG0030 Myosin essential light  25.5      28 0.00061   32.2   0.5   86  699-786    16-116 (152)
 47 PRK08561 rps15p 30S ribosomal   23.3   3E+02  0.0064   26.1   6.7   31  154-184    30-60  (151)
 48 TIGR03277 methan_mark_9 putati  21.6      69  0.0015   28.1   2.1   32  709-740    76-108 (109)
 49 PF13082 DUF3931:  Protein of u  21.3 1.8E+02  0.0039   21.8   3.8   42  253-294     8-62  (66)
 50 PRK14702 insertion element IS2  20.7 3.2E+02   0.007   28.6   7.5   72  254-326    88-163 (262)
 51 KOG0400 40S ribosomal protein   20.5 5.1E+02   0.011   23.8   7.2  102  153-263    29-140 (151)

No 1  
>PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional
Probab=100.00  E-value=1.2e-79  Score=712.75  Aligned_cols=509  Identities=19%  Similarity=0.272  Sum_probs=410.9

Q ss_pred             CCCCCCCCCcccCCHHHHHHHHHHHhhhcCeEEEEEeecCCC-CCCCceEEEEEecCCccCCCCCCCCCC---------C
Q 003149           31 DFSSAFTTDMVFNSREELVEWIRDTGKRNGLVIVIKKSDVGG-DGRRPRITFACERSGAYRRKYTEGQTP---------K  100 (844)
Q Consensus        31 ~~~~~~~~g~~F~S~de~~~~~~~ya~~~GF~v~~~~S~~~~-~g~~~~~~~vC~r~G~~r~~~~~~~~~---------~  100 (844)
                      +....|.+||+|+|.|||++||+.||++.||+||+.+|.+++ ++.++.++|+|+|+|+.+.+.+.....         .
T Consensus        69 ~~~~~P~vGMeF~S~eeA~~FYn~YA~~~GFsVRi~~srrsk~~~~ii~r~fvCsreG~~~~~~~~~~~~~~~~~k~~~~  148 (846)
T PLN03097         69 DTNLEPLSGMEFESHGEAYSFYQEYARSMGFNTAIQNSRRSKTSREFIDAKFACSRYGTKREYDKSFNRPRARQTKQDPE  148 (846)
T ss_pred             CCCccCcCCCeECCHHHHHHHHHHHHhhcCceEEeeceeccCCCCcEEEEEEEEcCCCCCcccccccccccccccccCcc
Confidence            444579999999999999999999999999999999888876 667788999999999765322100000         0


Q ss_pred             -CCCCCCceeeCCccEEEEEeeCCCCCeEEEEEccccCCCCCCCcccCccccCCCHHHHHHHHHHhhCCCChHHHHHHHH
Q 003149          101 -RPKTTGTKKCGCPFLLKGHKLDTDDDWILKVVCGVHNHPVTQHVEGHSYAGRLTEQEANILVDLSRSNISPKEILQTLK  179 (844)
Q Consensus       101 -~rr~~~s~~~gCp~~i~~~~~~~~~~w~v~~~~~~HNH~~~~~~~~~~~~rrl~~~~~~~i~~l~~~~~~p~~I~~~l~  179 (844)
                       .+++++.+|+||+|+|+++. ..+|+|.|+.+..+|||++.++...       +...+.....+..          .+.
T Consensus       149 ~~~~rR~~tRtGC~A~m~Vk~-~~~gkW~V~~fv~eHNH~L~p~~~~-------~~~~r~~~~~~~~----------~~~  210 (846)
T PLN03097        149 NGTGRRSCAKTDCKASMHVKR-RPDGKWVIHSFVKEHNHELLPAQAV-------SEQTRKMYAAMAR----------QFA  210 (846)
T ss_pred             cccccccccCCCCceEEEEEE-cCCCeEEEEEEecCCCCCCCCcccc-------chhhhhhHHHHHh----------hhh
Confidence             01234467899999999965 4679999999999999999976431       1121221111110          000


Q ss_pred             hcCCCcc-chHHHHHHHHHHhhhccccCcchHHHHHHHHH----hcCcEEEEEeeccCCceeeeEecChHHHHHHHhCCc
Q 003149          180 QRDMHNV-STIKAIYNARHKYRVGEQVGQLHMHQLLDKLR----KHGYIEWHRYNEETDCFKDLFWAHPFAVGLLRAFPS  254 (844)
Q Consensus       180 ~~~~~~~-~t~kdi~n~~~~~r~~~~~g~~~~~~ll~~l~----e~~~~~~~~~~d~~~~~~~if~~~~~~~~~~~~~~~  254 (844)
                      . + .++ .+.++..|...+.+.... ...+++.++++|+    +||.|+|++++|+++++.+|||+++.++..|.+|||
T Consensus       211 ~-~-~~v~~~~~d~~~~~~~~r~~~~-~~gD~~~ll~yf~~~q~~nP~Ffy~~qlDe~~~l~niFWaD~~sr~~Y~~FGD  287 (846)
T PLN03097        211 E-Y-KNVVGLKNDSKSSFDKGRNLGL-EAGDTKILLDFFTQMQNMNSNFFYAVDLGEDQRLKNLFWVDAKSRHDYGNFSD  287 (846)
T ss_pred             c-c-ccccccchhhcchhhHHHhhhc-ccchHHHHHHHHHHHHhhCCCceEEEEEccCCCeeeEEeccHHHHHHHHhcCC
Confidence            0 0 111 233444444444443333 2348999999985    599999999999999999999999999999999999


Q ss_pred             EEEEeccccCCCCCCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHHHHHh
Q 003149          255 VVMIDCTYKTSMYPFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNSIRAV  334 (844)
Q Consensus       255 vl~iD~T~~tn~~~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~Ai~~v  334 (844)
                      ||++|+||.||+|++||..|+|+|+|++++++|+||+.+|+.++|.|+|++|+++| ++..|++||||+|.+|.+||.+|
T Consensus       288 vV~fDTTY~tN~y~~Pfa~FvGvNhH~qtvlfGcaLl~dEt~eSf~WLf~tfl~aM-~gk~P~tIiTDqd~am~~AI~~V  366 (846)
T PLN03097        288 VVSFDTTYVRNKYKMPLALFVGVNQHYQFMLLGCALISDESAATYSWLMQTWLRAM-GGQAPKVIITDQDKAMKSVISEV  366 (846)
T ss_pred             EEEEeceeeccccCcEEEEEEEecCCCCeEEEEEEEcccCchhhHHHHHHHHHHHh-CCCCCceEEecCCHHHHHHHHHH
Confidence            99999999999999999999999999999999999999999999999999999999 88999999999999999999999


Q ss_pred             CCCcccchhhhhHHHhHHHhhhhhhhhhhHHHHHHHHhhhc-ccCcCHHHHHHHHHHHHhhccCchhHHHHHHHhhhcch
Q 003149          335 FPRATNLLCRWHISKNISVNCKKLFETKERWEAFICSWNVL-VLSVTEQEYMQHLGAMESDFSRYPQAIDYVKQTWLANY  413 (844)
Q Consensus       335 fP~~~~~lC~~Hi~kn~~~~~~~~~~~~~~~~~~~~~~~~l-~~s~t~~~f~~~~~~l~~~~~~~~~~~~y~~~~Wl~~~  413 (844)
                      ||++.|++|.|||++|+.+++...+..   .+.|...|..+ ..+.++++|+..|..|.+.+.-..  -+++..-|  ..
T Consensus       367 fP~t~Hr~C~wHI~~~~~e~L~~~~~~---~~~f~~~f~~cv~~s~t~eEFE~~W~~mi~ky~L~~--n~WL~~LY--~~  439 (846)
T PLN03097        367 FPNAHHCFFLWHILGKVSENLGQVIKQ---HENFMAKFEKCIYRSWTEEEFGKRWWKILDRFELKE--DEWMQSLY--ED  439 (846)
T ss_pred             CCCceehhhHHHHHHHHHHHhhHHhhh---hhHHHHHHHHHHhCCCCHHHHHHHHHHHHHhhcccc--cHHHHHHH--Hh
Confidence            999999999999999999999877643   34677777665 458999999999999988763211  13333333  47


Q ss_pred             hhhhHhhhhcccccCCCcccccccchhhHHHHhhcCCCCChhhHHHHHHHHHHH-HHHHHHHHhhhccceeccccchhHH
Q 003149          414 KEKFVAAWTDLAMHFGNVTMNRGETTHTKLKRLLAVPQGNFETSWAKVHSLLEQ-QHYEIKASFERSSTIVQHNFKVPIF  492 (844)
Q Consensus       414 ke~w~~~~~~~~~~~g~~T~n~~Es~n~~LK~~l~~~~~~l~~~~~~i~~~i~~-~~~e~~~~~~~~~~~~~~~~~~~~~  492 (844)
                      |++||++|+++.|.+|+.||+++||+|+.||+++.. ..+|..|+.+++.+++. +.+|..+++......+......|+.
T Consensus       440 RekWapaY~k~~F~agm~sTqRSES~Ns~fk~yv~~-~tsL~~Fv~qye~~l~~~~ekE~~aD~~s~~~~P~l~t~~piE  518 (846)
T PLN03097        440 RKQWVPTYMRDAFLAGMSTVQRSESINAFFDKYVHK-KTTVQEFVKQYETILQDRYEEEAKADSDTWNKQPALKSPSPLE  518 (846)
T ss_pred             HhhhhHHHhcccccCCcccccccccHHHHHHHHhCc-CCCHHHHHHHHHHHHHHHHHHHHHhhhhcccCCcccccccHHH
Confidence            999999999999999999999999999999999975 67899999999999985 5577888888766655555667899


Q ss_pred             HHhhhhhcHHHHHHhhcccccc---------------------------cccC----CcCCCCCccccccCCccchhHhh
Q 003149          493 EELRGFVSLNAMNIILGESERA---------------------------DSVG----LNASACGCVFRRTHGLPCAHEIA  541 (844)
Q Consensus       493 ~~l~~~is~~a~~~~~~e~~~~---------------------------~~V~----~~~~~CsC~~~~~~GlPC~H~la  541 (844)
                      +++.+.+|+.+|..++.|+...                           ..|.    ....+|+|++|+..||||+|+|+
T Consensus       519 kQAs~iYT~~iF~kFQ~El~~~~~~~~~~~~~dg~~~~y~V~~~~~~~~~~V~~d~~~~~v~CsC~kFE~~GILCrHaLk  598 (846)
T PLN03097        519 KSVSGVYTHAVFKKFQVEVLGAVACHPKMESQDETSITFRVQDFEKNQDFTVTWNQTKLEVSCICRLFEYKGYLCRHALV  598 (846)
T ss_pred             HHHHHHhHHHHHHHHHHHHHHhhheEEeeeccCCceEEEEEEEecCCCcEEEEEecCCCeEEeeccCeecCccchhhHHH
Confidence            9999999999999998875421                           0111    13569999999999999999999


Q ss_pred             HHhhcC-CCCCcccccccccccccCCCcc
Q 003149          542 EYKHER-RSIPLLAVDRHWKKLDFVPVTQ  569 (844)
Q Consensus       542 v~~~~~-~~i~~~~i~~rW~~~~~~~~~~  569 (844)
                      |+.+.+ ..||..||++|||+.+......
T Consensus       599 VL~~~~v~~IP~~YILkRWTKdAK~~~~~  627 (846)
T PLN03097        599 VLQMCQLSAIPSQYILKRWTKDAKSRHLL  627 (846)
T ss_pred             HHhhcCcccCchhhhhhhchhhhhhcccC
Confidence            999999 8999999999999998876543


No 2  
>PF10551 MULE:  MULE transposase domain;  InterPro: IPR018289 This entry represents a domain found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif. This domain is also found in a transposase for the insertion sequence element IS256 in transposon Tn4001.
Probab=99.81  E-value=3.8e-20  Score=162.65  Aligned_cols=90  Identities=43%  Similarity=0.744  Sum_probs=85.3

Q ss_pred             ccccCCCCCCeeEE---EEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHHHHHhCC
Q 003149          260 CTYKTSMYPFSFLE---IVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNSIRAVFP  336 (844)
Q Consensus       260 ~T~~tn~~~~~l~~---~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~Ai~~vfP  336 (844)
                      +||+||+| ++++.   ++|+|++|+.+|+||+++.+|+.++|.|+|+.+++.+. .. |.+||+|++.++.+|++++||
T Consensus         1 ~T~~tn~~-~~l~~~~~~~~~d~~~~~~~v~~~l~~~e~~~~~~~~l~~~~~~~~-~~-p~~ii~D~~~~~~~Ai~~vfP   77 (93)
T PF10551_consen    1 GTYKTNKY-GPLLYLMIAVGIDGNGRGFPVAFALVSSESEESYEWFLEKLKEAMP-QK-PKVIISDFDKALINAIKEVFP   77 (93)
T ss_pred             Cccccccc-cccceeceEEEEcCCCCEEEEEEEEEcCCChhhhHHHHHHhhhccc-cC-ceeeeccccHHHHHHHHHHCC
Confidence            69999999 88885   99999999999999999999999999999999999983 35 999999999999999999999


Q ss_pred             CcccchhhhhHHHhHH
Q 003149          337 RATNLLCRWHISKNIS  352 (844)
Q Consensus       337 ~~~~~lC~~Hi~kn~~  352 (844)
                      ++.|++|.||+.||++
T Consensus        78 ~~~~~~C~~H~~~n~k   93 (93)
T PF10551_consen   78 DARHQLCLFHILRNIK   93 (93)
T ss_pred             CceEehhHHHHHHhhC
Confidence            9999999999999974


No 3  
>PF00872 Transposase_mut:  Transposase, Mutator family;  InterPro: IPR001207 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. The mutator family of transposases consists of a number of elements that include mutator from maize, IsT2 from Thiobacillus ferrooxidans, Is256 from Staphylococcus aureus, Is1201 from Lactobacillus helveticus, Is1081 from Mycobacterium bovis, IsRm3 from Rhizobium meliloti and others. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=99.71  E-value=3.8e-18  Score=188.35  Aligned_cols=248  Identities=16%  Similarity=0.172  Sum_probs=192.1

Q ss_pred             CCCHHHHHHHHHHhhCCCChHHHHHHHHhcCCCccchHHHHHHHHHHhhhccccCcchHHHHHHHHHhcCcEEEEEeecc
Q 003149          152 RLTEQEANILVDLSRSNISPKEILQTLKQRDMHNVSTIKAIYNARHKYRVGEQVGQLHMHQLLDKLRKHGYIEWHRYNEE  231 (844)
Q Consensus       152 rl~~~~~~~i~~l~~~~~~p~~I~~~l~~~~~~~~~t~kdi~n~~~~~r~~~~~g~~~~~~ll~~l~e~~~~~~~~~~d~  231 (844)
                      +..+.....|..|.-.|++.++|...+..-++..-.+...|.+....+...           +..++.            
T Consensus        98 r~~~~l~~~i~~ly~~G~Str~i~~~l~~l~g~~~~S~s~vSri~~~~~~~-----------~~~w~~------------  154 (381)
T PF00872_consen   98 RREDSLEELIISLYLKGVSTRDIEEALEELYGEVAVSKSTVSRITKQLDEE-----------VEAWRN------------  154 (381)
T ss_pred             hhhhhhhhhhhhhhccccccccccchhhhhhcccccCchhhhhhhhhhhhh-----------HHHHhh------------
Confidence            446666777888889999999999999888763323444554444333221           111111            


Q ss_pred             CCceeeeEecChHHHHHHHhC-CcEEEEeccccCCCC-----CCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHH
Q 003149          232 TDCFKDLFWAHPFAVGLLRAF-PSVVMIDCTYKTSMY-----PFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLER  305 (844)
Q Consensus       232 ~~~~~~if~~~~~~~~~~~~~-~~vl~iD~T~~tn~~-----~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~  305 (844)
                                     .-+... -.+|++|++|-.-+.     +..+++++|++.+|+-.++|+.+...|+..+|.-+|+.
T Consensus       155 ---------------R~L~~~~y~~l~iD~~~~kvr~~~~~~~~~~~v~iGi~~dG~r~vLg~~~~~~Es~~~W~~~l~~  219 (381)
T PF00872_consen  155 ---------------RPLESEPYPYLWIDGTYFKVREDGRVVKKAVYVAIGIDEDGRREVLGFWVGDRESAASWREFLQD  219 (381)
T ss_pred             ---------------hccccccccceeeeeeecccccccccccchhhhhhhhhcccccceeeeecccCCccCEeeecchh
Confidence                           111122 257899999986552     46789999999999999999999999999999999999


Q ss_pred             HHHhhcCCCCCeEEEeeccHHHHHHHHHhCCCcccchhhhhHHHhHHHhhhhhhhhhhHHHHHHHHhhhcccCcCHHHHH
Q 003149          306 LRSMMEDDALPRVIVTDKDLALMNSIRAVFPRATNLLCRWHISKNISVNCKKLFETKERWEAFICSWNVLVLSVTEQEYM  385 (844)
Q Consensus       306 lk~~~~~~~~P~~iitD~~~al~~Ai~~vfP~~~~~lC~~Hi~kn~~~~~~~~~~~~~~~~~~~~~~~~l~~s~t~~~f~  385 (844)
                      |++.  |...|..||+|..+++.+||.++||++.++.|++|+++|+.+++.+.     ..+.+...++.+..+++.++..
T Consensus       220 L~~R--Gl~~~~lvv~Dg~~gl~~ai~~~fp~a~~QrC~vH~~RNv~~~v~~k-----~~~~v~~~Lk~I~~a~~~e~a~  292 (381)
T PF00872_consen  220 LKER--GLKDILLVVSDGHKGLKEAIREVFPGAKWQRCVVHLMRNVLRKVPKK-----DRKEVKADLKAIYQAPDKEEAR  292 (381)
T ss_pred             hhhc--cccccceeeccccccccccccccccchhhhhheechhhhhccccccc-----cchhhhhhccccccccccchhh
Confidence            9998  67789999999999999999999999999999999999999987542     3467888899999999999999


Q ss_pred             HHHHHHHhhc-cCchhHHHHHHHhhhcchhhhhHhhhhcccccCCCcccccccchhhHHHHhhc
Q 003149          386 QHLGAMESDF-SRYPQAIDYVKQTWLANYKEKFVAAWTDLAMHFGNVTMNRGETTHTKLKRLLA  448 (844)
Q Consensus       386 ~~~~~l~~~~-~~~~~~~~y~~~~Wl~~~ke~w~~~~~~~~~~~g~~T~n~~Es~n~~LK~~l~  448 (844)
                      +.++.+.+.| ..+|.+..++...|-.    .|.-.-++...+--++|||.+|++|+.+|+...
T Consensus       293 ~~l~~f~~~~~~kyp~~~~~l~~~~~~----~~tf~~fP~~~~~~i~TTN~iEsln~~irrr~~  352 (381)
T PF00872_consen  293 EALEEFAEKWEKKYPKAAKSLEENWDE----LLTFLDFPPEHRRSIRTTNAIESLNKEIRRRTK  352 (381)
T ss_pred             hhhhhcccccccccchhhhhhhhcccc----ccceeeecchhccccchhhhccccccchhhhcc
Confidence            9999999887 5689999998887741    222111233333456899999999999998665


No 4  
>PF03101 FAR1:  FAR1 DNA-binding domain;  InterPro: IPR004330 Phytochrome A is the primary photoreceptor for mediating various far-red light-induced responses in higher plants. It has been found that the proteins governing this response, which include FAR-RED ELONGATED HYPOCOTYL3 (FHY3) and FAR-RED-IMPAIRED RESPONSE1 (FAR1), are a pair of homologous proteins sharing significant sequence homology to mutator-like transposases. These proteins appear to be novel transcription factors, which are essential for activating the expression of FHY1 and FHL (for FHY1-like) and related genes, whose products are required for light-induced phytochrome A nuclear accumulation and subsequent light responses in plants. The FRS (FAR1 Related Sequences) family of proteins share a similar domain structure to mutator-like transposases, including an N-terminal C2H2 zinc finger domain, a central putative core transposase domain, and a C-terminal SWIM motif (named after SWI2/SNF and MuDR transposases). It seems plausible that the FRS family represent transcription factors derived from mutator-like transposases.   This entry represents a domain found in FAR1 and FRS proteins. It contains a WRKY-like fold and is therefore most likely a zinc-binding DNA-binding domain.
Probab=99.64  E-value=3.6e-16  Score=136.56  Aligned_cols=90  Identities=27%  Similarity=0.539  Sum_probs=77.7

Q ss_pred             HHHHHHhhhcCeEEEEEeecCC-CCCCCceEEEEEecCCccCCCCCCCCCCCCCCCCCceeeCCccEEEEEeeCCCCCeE
Q 003149           50 EWIRDTGKRNGLVIVIKKSDVG-GDGRRPRITFACERSGAYRRKYTEGQTPKRPKTTGTKKCGCPFLLKGHKLDTDDDWI  128 (844)
Q Consensus        50 ~~~~~ya~~~GF~v~~~~S~~~-~~g~~~~~~~vC~r~G~~r~~~~~~~~~~~rr~~~s~~~gCp~~i~~~~~~~~~~w~  128 (844)
                      +||+.||..+||+|++.+|.+. ++|...+++|+|+++|.++.+...  ....++.+.+.+|||||+|.++... ++.|.
T Consensus         1 ~fy~~yA~~~GF~vr~~~s~~~~~~~~~~~~~~~C~r~G~~~~~~~~--~~~~~r~~~s~ktgC~a~i~v~~~~-~~~w~   77 (91)
T PF03101_consen    1 DFYNSYARRHGFSVRKSSSRKSKKNGEIKRVTFVCSRGGKYKSKKKN--EEKRRRNRPSKKTGCKARINVKRRK-DGKWR   77 (91)
T ss_pred             CHHHHhcCcCCeEEEEeeeEeCCCCceEEEEEEEECCcccccccccc--cccccccccccccCCCEEEEEEEcc-CCEEE
Confidence            4899999999999999999887 477888999999999998865422  2345677889999999999998776 99999


Q ss_pred             EEEEccccCCCCCC
Q 003149          129 LKVVCGVHNHPVTQ  142 (844)
Q Consensus       129 v~~~~~~HNH~~~~  142 (844)
                      |+.+..+|||++.+
T Consensus        78 v~~~~~~HNH~L~P   91 (91)
T PF03101_consen   78 VTSFVLEHNHPLCP   91 (91)
T ss_pred             EEECcCCcCCCCCC
Confidence            99999999999865


No 5  
>PF02338 OTU:  OTU-like cysteine protease;  InterPro: IPR003323 This is a group of proteins found primarily in viruses, eukaryotes and in the pathogenic bacterium Chlamydia pneumoniae. In viruses they are annotated as replicase or RNA-dependent RNA polymerase. The eukaryotic sequences are related to the Ovarian Tumour (OTU) gene in Drosophila, cezanne deubiquitinating peptidase and tumor necrosis factor, alpha-induced protein 3 (MEROPS peptidase family C64) and otubain 1 and otubain 2 (MEROPS peptidase family C65).  None of these proteins has a known biochemical function but low sequence similarity with the polyprotein regions of arteriviruses, and conserved cysteine and histidine, and possibly the aspartate, residues suggest that those not yet recognised as peptidases could possess cysteine protease activity.; PDB: 2VFJ_C 3DKB_F 3PHW_A 3PHU_B 3PHX_A 3BY4_A 3C0R_C 3PRM_C 3PRP_C 3ZRH_A ....
Probab=99.60  E-value=9.2e-16  Score=141.93  Aligned_cols=107  Identities=20%  Similarity=0.293  Sum_probs=86.1

Q ss_pred             CCCCCchHHHHHHhhc----CCCccHHHHHHHHHHHHH-hhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccccCCCc
Q 003149          706 IADGHCGFRVVAELMD----IGEDNWAQVRRDLVDELQ-SHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRMIMPDT  780 (844)
Q Consensus       706 ~~dg~Cgfraia~~lg----~~~~~~~~vr~~l~~el~-~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~~~  780 (844)
                      +|||||+|||||.+|+    .+++.|..||++++++|+ .+++.|.+++.+.        .+.      ..+.|.+.+++
T Consensus         1 pgDGnClF~Avs~~l~~~~~~~~~~~~~lR~~~~~~l~~~~~~~~~~~~~~~--------~~~------~~~~Wg~~~el   66 (121)
T PF02338_consen    1 PGDGNCLFRAVSDQLYGDGGGSEDNHQELRKAVVDYLRDKNRDKFEEFLEGD--------KMS------KPGTWGGEIEL   66 (121)
T ss_dssp             -SSTTHHHHHHHHHHCTT-SSSTTTHHHHHHHHHHHHHTHTTTHHHHHHHHH--------HHT------STTSHEEHHHH
T ss_pred             CCCccHHHHHHHHHHHHhcCCCHHHHHHHHHHHHHHHHHhccchhhhhhhhh--------hhc------cccccCcHHHH
Confidence            6999999999999999    999999999999999999 9999999887433        343      56899999988


Q ss_pred             hhhhhcccCeeEEEEcCCccc--cccCCCC--CCCCCCCCCcEEEEEeC-----CCcE
Q 003149          781 GHLIASRYNIVLMHLSQQQCF--TFLPLRS--VPLPRTSRKIVTIGFVN-----ECQF  829 (844)
Q Consensus       781 ~~~~a~~~~~~v~~~~~~~~~--~~~p~~~--~p~~~~~~~~i~l~~~~-----~~H~  829 (844)
                       +++|+.|+|+|++|+.....  .+++..+  +|...  .++|+|+|..     ++||
T Consensus        67 -~a~a~~~~~~I~v~~~~~~~~~~~~~~~~~~~~~~~--~~~i~l~~~~~l~~~~~Hy  121 (121)
T PF02338_consen   67 -QALANVLNRPIIVYSSSDGDNVVFIKFTGKYPPLES--PPPICLCYHGHLYYTGNHY  121 (121)
T ss_dssp             -HHHHHHHTSEEEEECETTTBEEEEEEESCEESTTTT--TTSEEEEEETEEEEETTEE
T ss_pred             -HHHHHHhCCeEEEEEcCCCCccceeeecCccccCCC--CCeEEEEEcCCccCCCCCC
Confidence             69999999999998874333  3333332  23333  6889999997     8998


No 6  
>PF08731 AFT:  Transcription factor AFT;  InterPro: IPR014842 AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This entry includes the paralogous pair of transcription factors AFT1 and AFT2. 
Probab=99.44  E-value=7.2e-13  Score=113.75  Aligned_cols=93  Identities=32%  Similarity=0.661  Sum_probs=79.0

Q ss_pred             cCCHHHHHHHHHHHhhhcCeEEEEEeecCCCCCCCceEEEEEecCCccCCCCCCC------------------CCCCCCC
Q 003149           42 FNSREELVEWIRDTGKRNGLVIVIKKSDVGGDGRRPRITFACERSGAYRRKYTEG------------------QTPKRPK  103 (844)
Q Consensus        42 F~S~de~~~~~~~ya~~~GF~v~~~~S~~~~~g~~~~~~~vC~r~G~~r~~~~~~------------------~~~~~rr  103 (844)
                      |.+.+|++.|++..+..+||.|+|.||+..      .++|.|.-+|.++......                  ......+
T Consensus         1 F~~k~~ikpwlq~~~~~~Gi~iVIerSd~~------ki~FkCk~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~k   74 (111)
T PF08731_consen    1 FDDKDEIKPWLQKIFYPQGIGIVIERSDKK------KIVFKCKNGKRYRHKKKKKGQAQAQQKESTSGNKNKSSKKKKKK   74 (111)
T ss_pred             CCchHHHHHHHHHHhhhcCceEEEEecCCc------eEEEEEecCCCcccccccccccccccccccccccccccccccCC
Confidence            899999999999999999999999999987      7999999998877544210                  1122235


Q ss_pred             CCCceeeCCccEEEEEeeCCCCCeEEEEEccccCCCC
Q 003149          104 TTGTKKCGCPFLLKGHKLDTDDDWILKVVCGVHNHPV  140 (844)
Q Consensus       104 ~~~s~~~gCp~~i~~~~~~~~~~w~v~~~~~~HNH~~  140 (844)
                      .+.++++.|||+|++......+.|.|..+++.|||++
T Consensus        75 ~t~srk~~CPFriRA~yS~k~k~W~lvvvnn~HnH~l  111 (111)
T PF08731_consen   75 RTKSRKNTCPFRIRANYSKKNKKWTLVVVNNEHNHPL  111 (111)
T ss_pred             cccccccCCCeEEEEEEEecCCeEEEEEecCCcCCCC
Confidence            6678899999999999999999999999999999986


No 7  
>KOG2606 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=99.31  E-value=2.2e-12  Score=129.16  Aligned_cols=133  Identities=11%  Similarity=0.116  Sum_probs=111.7

Q ss_pred             ccccccccccccccccCCCCCchHHHHHHhhcCCC---ccHHHHHHHHHHHHHhhhhhHHHhhC--------CHHHHHHH
Q 003149          690 SFPAGLRPYIHDVQDVIADGHCGFRVVAELMDIGE---DNWAQVRRDLVDELQSHYDDYIQLYG--------DAEIAREL  758 (844)
Q Consensus       690 ~~~~~~~~~~~~~~~v~~dg~Cgfraia~~lg~~~---~~~~~vr~~l~~el~~~~~~y~~~~~--------~~~~~~~~  758 (844)
                      .+-+.|.+-.+.++|+++||||.|+||+.+|...-   -+-..+|++..++|++|.+++.+++-        ++.+|+.+
T Consensus       149 k~~~il~~~~l~~~~Ip~DG~ClY~aI~hQL~~~~~~~~~v~kLR~~~a~Ymr~H~~df~pf~~~eet~d~~~~~~f~~Y  228 (302)
T KOG2606|consen  149 KLAQILEERGLKMFDIPADGHCLYAAISHQLKLRSGKLLSVQKLREETADYMREHVEDFLPFLLDEETGDSLGPEDFDKY  228 (302)
T ss_pred             HHHHHHHhccCccccCCCCchhhHHHHHHHHHhccCCCCcHHHHHHHHHHHHHHHHHHhhhHhcCccccccCCHHHHHHH
Confidence            56778889999999999999999999999998864   56889999999999999999999872        24479999


Q ss_pred             HhhcCCCCCCCCccccccCCCchhhhhcccCeeEEEEcCCccccccCCCCCCCCCCCCCcEEEEEeC-----CCcEEEEE
Q 003149          759 LHSLSYSESNPGIEHRMIMPDTGHLIASRYNIVLMHLSQQQCFTFLPLRSVPLPRTSRKIVTIGFVN-----ECQFVKVL  833 (844)
Q Consensus       759 ~~~l~~~~~~~~~~~w~~~~~~~~~~a~~~~~~v~~~~~~~~~~~~p~~~~p~~~~~~~~i~l~~~~-----~~H~~~~~  833 (844)
                      ++.+.      .+..|++-+..+ +|++.|.+||.+|..+++    ++..++...+ .+||+|+|+-     +-||+++.
T Consensus       229 c~eI~------~t~~WGgelEL~-AlShvL~~PI~Vy~~~~p----~~~~geey~k-d~pL~lvY~rH~y~LGeHYNS~~  296 (302)
T KOG2606|consen  229 CREIR------NTAAWGGELELK-ALSHVLQVPIEVYQADGP----ILEYGEEYGK-DKPLILVYHRHAYGLGEHYNSVT  296 (302)
T ss_pred             HHHhh------hhccccchHHHH-HHHHhhccCeEEeecCCC----ceeechhhCC-CCCeeeehHHhHHHHHhhhcccc
Confidence            99996      679999999986 999999999999999866    5555544443 3789999963     26888876


Q ss_pred             e
Q 003149          834 D  834 (844)
Q Consensus       834 ~  834 (844)
                      +
T Consensus       297 ~  297 (302)
T KOG2606|consen  297 P  297 (302)
T ss_pred             c
Confidence            5


No 8  
>COG3328 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=99.19  E-value=3.6e-10  Score=122.15  Aligned_cols=244  Identities=15%  Similarity=0.126  Sum_probs=176.5

Q ss_pred             CHHHHHHHHHHhhCCCChHHHHHHHHhcCCCccchHHHHHHHHHHhhhccccCcchHHHHHHHHHhcCcEEEEEeeccCC
Q 003149          154 TEQEANILVDLSRSNISPKEILQTLKQRDMHNVSTIKAIYNARHKYRVGEQVGQLHMHQLLDKLRKHGYIEWHRYNEETD  233 (844)
Q Consensus       154 ~~~~~~~i~~l~~~~~~p~~I~~~l~~~~~~~~~t~kdi~n~~~~~r~~~~~g~~~~~~ll~~l~e~~~~~~~~~~d~~~  233 (844)
                      ....-..|..+...|+++++|-..+++.+...+  ..+.++....          .+..-+..++.-+.           
T Consensus        86 ~~~~~~~v~~~y~~gv~Tr~i~~~~~~~~~~~~--s~~~iS~~~~----------~~~e~v~~~~~r~l-----------  142 (379)
T COG3328          86 ERALDLPVLSMYAKGVTTREIEALLEELYGHKV--SPSVISVVTD----------RLDEKVKAWQNRPL-----------  142 (379)
T ss_pred             hhhHHHHHHHHHHcCCcHHHHHHHHHHhhCccc--CHHHhhhHHH----------HHHHHHHHHHhccc-----------
Confidence            334455677788899999999999998875422  1111111100          11111111111111           


Q ss_pred             ceeeeEecChHHHHHHHhCCcEEEEeccccCCC--CCCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhc
Q 003149          234 CFKDLFWAHPFAVGLLRAFPSVVMIDCTYKTSM--YPFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMME  311 (844)
Q Consensus       234 ~~~~if~~~~~~~~~~~~~~~vl~iD~T~~tn~--~~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~  311 (844)
                                       .--.++++|++|..-+  -+..++.++|++.+|+-..+|+.+-..|+ ..|.-+|..|+..  
T Consensus       143 -----------------~~~~~v~~D~~~~k~r~v~~~~~~ia~Gv~~eG~reilg~~~~~~e~-~~w~~~l~~l~~r--  202 (379)
T COG3328         143 -----------------GDYPYVYLDAKYVKVRSVRNKAVYIAIGVTEEGRREILGIWVGVRES-KFWLSFLLDLKNR--  202 (379)
T ss_pred             -----------------cCceEEEEecceeehhhhhhheeeeeeccCcccchhhhceeeecccc-hhHHHHHHHHHhc--
Confidence                             1125789999998876  46789999999999999999999999999 8999888888887  


Q ss_pred             CCCCCeEEEeeccHHHHHHHHHhCCCcccchhhhhHHHhHHHhhhhhhhhhhHHHHHHHHhhhcccCcCHHHHHHHHHHH
Q 003149          312 DDALPRVIVTDKDLALMNSIRAVFPRATNLLCRWHISKNISVNCKKLFETKERWEAFICSWNVLVLSVTEQEYMQHLGAM  391 (844)
Q Consensus       312 ~~~~P~~iitD~~~al~~Ai~~vfP~~~~~lC~~Hi~kn~~~~~~~~~~~~~~~~~~~~~~~~l~~s~t~~~f~~~~~~l  391 (844)
                      +......+++|...++.+||.++||.+.++.|..|+.+|+..+....     ..+.+...++.+..+++.++....+..+
T Consensus       203 gl~~v~l~v~Dg~~gl~~aI~~v~p~a~~Q~C~vH~~Rnll~~v~~k-----~~d~i~~~~~~I~~a~~~e~~~~~~~~~  277 (379)
T COG3328         203 GLSDVLLVVVDGLKGLPEAISAVFPQAAVQRCIVHLVRNLLDKVPRK-----DQDAVLSDLRSIYIAPDAEEALLALLAF  277 (379)
T ss_pred             cccceeEEecchhhhhHHHHHHhccHhhhhhhhhHHHhhhhhhhhhh-----hhHHHHhhhhhhhccCCcHHHHHHHHHH
Confidence            56666778889999999999999999999999999999998886543     4567778888889999999999999998


Q ss_pred             Hhhcc-CchhHHHHHHHhhhcchhhhhHhhhhcccccCCCcccccccchhhHHHHhhcC
Q 003149          392 ESDFS-RYPQAIDYVKQTWLANYKEKFVAAWTDLAMHFGNVTMNRGETTHTKLKRLLAV  449 (844)
Q Consensus       392 ~~~~~-~~~~~~~y~~~~Wl~~~ke~w~~~~~~~~~~~g~~T~n~~Es~n~~LK~~l~~  449 (844)
                      .+.|. .+|.......++|..    .|.-.-+.....--..|||.+|++|..++.....
T Consensus       278 ~~~w~~~yP~i~~~~~~~~~~----~~~F~~fp~~~r~~i~ttN~IE~~n~~ir~~~~~  332 (379)
T COG3328         278 SELWGKRYPAILKSWRNALEE----LLPFFAFPSEIRKIIYTTNAIESLNKLIRRRTKV  332 (379)
T ss_pred             HHhhhhhcchHHHHHHHHHHH----hcccccCcHHHHhHhhcchHHHHHHHHHHHHHhh
Confidence            88764 578777777777642    2111001111112358999999999988866553


No 9  
>PF03108 DBD_Tnp_Mut:  MuDR family transposase;  InterPro: IPR004332 The plant MuDR transposase domain is present in plant proteins that are presumed to be the transposases for Mutator transposable elements. The function of these proteins is unknown. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=98.86  E-value=7.1e-09  Score=84.39  Aligned_cols=66  Identities=29%  Similarity=0.543  Sum_probs=59.1

Q ss_pred             CCCCCCcccCCHHHHHHHHHHHhhhcCeEEEEEeecCCCCCCCceEEEEEecCCccCCCCCCCCCCCCCCCCCceeeCCc
Q 003149           34 SAFTTDMVFNSREELVEWIRDTGKRNGLVIVIKKSDVGGDGRRPRITFACERSGAYRRKYTEGQTPKRPKTTGTKKCGCP  113 (844)
Q Consensus        34 ~~~~~g~~F~S~de~~~~~~~ya~~~GF~v~~~~S~~~~~g~~~~~~~vC~r~G~~r~~~~~~~~~~~rr~~~s~~~gCp  113 (844)
                      +.+.+||+|+|.+|++.++..||..+||.++..+|++.      ++.++|.                        ..|||
T Consensus         2 ~~l~~G~~F~~~~e~k~av~~yai~~~~~~~v~ksd~~------r~~~~C~------------------------~~~C~   51 (67)
T PF03108_consen    2 PELEVGQTFPSKEEFKEAVREYAIKNGFEFKVKKSDKK------RYRAKCK------------------------DKGCP   51 (67)
T ss_pred             CccccCCEECCHHHHHHHHHHHHHhcCcEEEEeccCCE------EEEEEEc------------------------CCCCC
Confidence            46899999999999999999999999999999999866      8899993                        24799


Q ss_pred             cEEEEEeeCCCCCeEE
Q 003149          114 FLLKGHKLDTDDDWIL  129 (844)
Q Consensus       114 ~~i~~~~~~~~~~w~v  129 (844)
                      |+|++.....++.|.|
T Consensus        52 Wrv~as~~~~~~~~~I   67 (67)
T PF03108_consen   52 WRVRASKRKRSDTFQI   67 (67)
T ss_pred             EEEEEEEcCCCCEEEC
Confidence            9999998888888875


No 10 
>smart00575 ZnF_PMZ plant mutator transposase zinc finger.
Probab=98.45  E-value=8.7e-08  Score=62.52  Aligned_cols=26  Identities=31%  Similarity=0.676  Sum_probs=23.9

Q ss_pred             CCCCccccccCCccchhHhhHHhhcC
Q 003149          522 SACGCVFRRTHGLPCAHEIAEYKHER  547 (844)
Q Consensus       522 ~~CsC~~~~~~GlPC~H~lav~~~~~  547 (844)
                      .+|||+.|+..||||+|+|+|+...+
T Consensus         1 ~~CsC~~~~~~gipC~H~i~v~~~~~   26 (28)
T smart00575        1 KTCSCRKFQLSGIPCRHALAAAIHIG   26 (28)
T ss_pred             CcccCCCcccCCccHHHHHHHHHHhC
Confidence            37999999999999999999998865


No 11 
>KOG2605 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=97.95  E-value=2.1e-06  Score=92.16  Aligned_cols=132  Identities=16%  Similarity=0.183  Sum_probs=98.4

Q ss_pred             ccccccccccccccCCCCCchHHHHHHhhcCCCccHHHHHHHHHHHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCc
Q 003149          692 PAGLRPYIHDVQDVIADGHCGFRVVAELMDIGEDNWAQVRRDLVDELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGI  771 (844)
Q Consensus       692 ~~~~~~~~~~~~~v~~dg~Cgfraia~~lg~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~  771 (844)
                      .+.+..|+.-+.-|.+||+|.|||+|+++.++.|-|..||++..++++..+++|..+.  +..|..+++...      -.
T Consensus       210 ~~~~~~~g~e~~Kv~edGsC~fra~aDQvy~d~e~~~~~~~~~~dq~~~e~~~~~~~v--t~~~~~y~k~kr------~~  281 (371)
T KOG2605|consen  210 AKRKKHFGFEYKKVVEDGSCLFRALADQVYGDDEQHDHNRRECVDQLKKERDFYEDYV--TEDFTSYIKRKR------AD  281 (371)
T ss_pred             HHHHHHhhhhhhhcccCCchhhhccHHHhhcCHHHHHHHHHHHHHHHhhccccccccc--ccchhhcccccc------cC
Confidence            3445788889999999999999999999999999999999999999999999888876  446777777765      56


Q ss_pred             cccccCCCchhhhhc--ccCeeEEEEcCCccccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEEe
Q 003149          772 EHRMIMPDTGHLIAS--RYNIVLMHLSQQQCFTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVLD  834 (844)
Q Consensus       772 ~~w~~~~~~~~~~a~--~~~~~v~~~~~~~~~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~~  834 (844)
                      +.|++-..+ |++|.  -+-.+.+.+++.....|--.  +|....+...+++.++...||..+..
T Consensus       282 ~~~gnhie~-Qa~a~~~~~~~~~~~~~~~~~t~~~~~--~~~~~~~~~~~~~n~~~~~h~~~~~~  343 (371)
T KOG2605|consen  282 GEPGNHIEQ-QAAADIYEEIEKPLNITSFKDTCYIQT--PPAIEESVKMEKYNFWVEVHYNTARH  343 (371)
T ss_pred             CCCcchHHH-hhhhhhhhhccccceeecccccceecc--Ccccccchhhhhhcccchhhhhhccc
Confidence            778887766 58885  33334444554333333222  23344445668888888899987765


No 12 
>KOG3288 consensus OTU-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=97.70  E-value=0.0001  Score=72.51  Aligned_cols=121  Identities=12%  Similarity=0.093  Sum_probs=87.5

Q ss_pred             cccCCCCCchHHHHHHhhcCCCccH-HHHHHHHHHHHHhhhhhHHHhh-CCH-HHHHHHHhhcCCCCCCCCccccccCCC
Q 003149          703 QDVIADGHCGFRVVAELMDIGEDNW-AQVRRDLVDELQSHYDDYIQLY-GDA-EIARELLHSLSYSESNPGIEHRMIMPD  779 (844)
Q Consensus       703 ~~v~~dg~Cgfraia~~lg~~~~~~-~~vr~~l~~el~~~~~~y~~~~-~~~-~~~~~~~~~l~~~~~~~~~~~w~~~~~  779 (844)
                      .=|+.|--|+|+||+.-+....+.- .++|+-+.+|+-++++.|..-+ |.+ .+|..++.         .++.|.+-.+
T Consensus       113 ~vvp~DNSCLF~ai~yv~~k~~~~~~~elR~iiA~~Vasnp~~yn~AiLgK~n~eYc~WI~---------k~dsWGGaIE  183 (307)
T KOG3288|consen  113 RVVPDDNSCLFTAIAYVIFKQVSNRPYELREIIAQEVASNPDKYNDAILGKPNKEYCAWIL---------KMDSWGGAIE  183 (307)
T ss_pred             EeccCCcchhhhhhhhhhcCccCCCcHHHHHHHHHHHhcChhhhhHHHhCCCcHHHHHHHc---------cccccCceEE
Confidence            3489999999999988886654222 6899999999999999997644 433 34555554         4589999999


Q ss_pred             chhhhhcccCeeEEEEcCCccccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEEecCCC
Q 003149          780 TGHLIASRYNIVLMHLSQQQCFTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVLDFSQT  838 (844)
Q Consensus       780 ~~~~~a~~~~~~v~~~~~~~~~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~~~~~~  838 (844)
                      .+ ||++.|++-|++++.+....  -.+++ ..+- ..-++|-| ++-||-+|.+..-.
T Consensus       184 ls-ILS~~ygveI~vvDiqt~ri--d~fge-d~~~-~~rv~lly-dGIHYD~l~m~~~~  236 (307)
T KOG3288|consen  184 LS-ILSDYYGVEICVVDIQTVRI--DRFGE-DKNF-DNRVLLLY-DGIHYDPLAMNEFK  236 (307)
T ss_pred             ee-eehhhhceeEEEEecceeee--hhcCC-CCCC-CceEEEEe-cccccChhhhccCC
Confidence            97 99999999999999753210  11222 1111 23478888 58999999986533


No 13 
>PF04434 SWIM:  SWIM zinc finger;  InterPro: IPR007527 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.  This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1).
It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination. SWIM domains are also found in the homologous recombination protein Sws1, as well as in several hypothetical proteins. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding
Probab=97.52  E-value=4.9e-05  Score=54.67  Aligned_cols=30  Identities=27%  Similarity=0.612  Sum_probs=26.2

Q ss_pred             cCCcCCCCCccccccCCccchhHhhHHhhc
Q 003149          517 VGLNASACGCVFRRTHGLPCAHEIAEYKHE  546 (844)
Q Consensus       517 V~~~~~~CsC~~~~~~GlPC~H~lav~~~~  546 (844)
                      +++...+|||..|+..|.||+|+++++...
T Consensus        10 ~~~~~~~CsC~~~~~~~~~CkHi~av~~~~   39 (40)
T PF04434_consen   10 VSIEQASCSCPYFQFRGGPCKHIVAVLLAL   39 (40)
T ss_pred             ccccccEeeCCCccccCCcchhHHHHHHhh
Confidence            345688999999999999999999998754


No 14 
>PF10275 Peptidase_C65:  Peptidase C65 Otubain;  InterPro: IPR019400 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.   Proteins in this family are highly specific ubiquitin isopeptidases that remove ubiquitin from proteins. The modification of cellular proteins by ubiquitin (Ub) is an important event that underlies protein stability and function in eukaryotes, as it is a dynamic and reversible process. Otubain carries several key conserved domains: (i) the OTU (ovarian tumour) domain, in which there is an active cysteine protease triad, (ii) a nuclear localisation signal, (iii) a Ub interaction motif (UIM)-like motif phi-xx-A-xxxs-xx-Ac (where phi indicates an aromatic amino acid, x indicates any amino acid and Ac indicates an acidic amino acid), (iv) a Ub-associated (UBA)-like domain and (v) the LxxLL motif.; PDB: 4DDG_C 3VON_O 2ZFY_A 4DHZ_A 4DDI_C 1TFF_A 4DHJ_I 4DHI_B.
Probab=97.49  E-value=0.00018  Score=74.92  Aligned_cols=52  Identities=6%  Similarity=-0.038  Sum_probs=29.9

Q ss_pred             hhhhcccCeeEEEEcCCcc---ccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEEe
Q 003149          782 HLIASRYNIVLMHLSQQQC---FTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVLD  834 (844)
Q Consensus       782 ~~~a~~~~~~v~~~~~~~~---~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~~  834 (844)
                      .+||+++++||-++-.+.+   ..+=....+|.+....|.|.|.|- +.||.-|+.
T Consensus       190 ~ALa~aL~v~i~v~yld~~~~~~~~~~~~~~~~~~~~~~~i~LLyr-pgHYdIly~  244 (244)
T PF10275_consen  190 IALAQALGVPIRVEYLDRSVEGDEVNRHEFPPDNESQEPQITLLYR-PGHYDILYP  244 (244)
T ss_dssp             HHHHHHHT--EEEEESSSSGCSTTSEEEEES-SSTTSS-SEEEEEE-TBEEEEEEE
T ss_pred             HHHHHHhCCeEEEEEecCCCCCCccccccCCCccCCCCCEEEEEEc-CCccccccC
Confidence            4799999999888765433   111112222222233688999997 569998873


No 15 
>KOG3991 consensus Uncharacterized conserved protein [Function unknown]
Probab=96.96  E-value=0.0021  Score=62.76  Aligned_cols=88  Identities=10%  Similarity=0.026  Sum_probs=50.3

Q ss_pred             HHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccccCCCchhh--hhcccCe--eEEEEcCCccccccCCCCCCC
Q 003149          736 DELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRMIMPDTGHL--IASRYNI--VLMHLSQQQCFTFLPLRSVPL  811 (844)
Q Consensus       736 ~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~~~~~~--~a~~~~~--~v~~~~~~~~~~~~p~~~~p~  811 (844)
                      -+|+++.++|.++..+...++.++..--        +-...-.+|-+|  |+++.+.  -|.+++-+...++=+..-|  
T Consensus       165 ~~ik~~adfy~pFI~e~~tV~~fC~~eV--------EPm~kesdhi~I~ALs~Al~i~irVey~dr~~~~~~~hH~fp--  234 (256)
T KOG3991|consen  165 GFIKSNADFYQPFIDEGMTVKAFCTQEV--------EPMYKESDHIHITALSQALGIRIRVEYVDRGSGDTVNHHDFP--  234 (256)
T ss_pred             HHHhhChhhhhccCCCCCcHHHHHHhhc--------chhhhccCceeHHHHHhhhCceEEEEEecCCCCCCCCCCcCc--
Confidence            3556666666666655556666655431        111122334444  6777775  5667776655343333322  


Q ss_pred             CCCCCCcEEEEEeCCCcEEEEEec
Q 003149          812 PRTSRKIVTIGFVNECQFVKVLDF  835 (844)
Q Consensus       812 ~~~~~~~i~l~~~~~~H~~~~~~~  835 (844)
                       .-+.|-|.|.|- +-||..|+.+
T Consensus       235 -e~s~P~I~LLYr-pGHYdilY~~  256 (256)
T KOG3991|consen  235 -EASAPEIYLLYR-PGHYDILYKK  256 (256)
T ss_pred             -cccCceEEEEec-CCccccccCC
Confidence             223577999997 5799988753


No 16 
>PF06782 UPF0236:  Uncharacterised protein family (UPF0236);  InterPro: IPR009620 This is a group of proteins of unknown function.
Probab=96.41  E-value=0.32  Score=55.68  Aligned_cols=245  Identities=13%  Similarity=0.132  Sum_probs=135.0

Q ss_pred             ccCCCHHHHHHHHHHhhCCCChHHHHHHHHhcCCCccchHHHHHHHHHHhhhccccCcchHHHHHHHHHhcCcEEEEEee
Q 003149          150 AGRLTEQEANILVDLSRSNISPKEILQTLKQRDMHNVSTIKAIYNARHKYRVGEQVGQLHMHQLLDKLRKHGYIEWHRYN  229 (844)
Q Consensus       150 ~rrl~~~~~~~i~~l~~~~~~p~~I~~~l~~~~~~~~~t~kdi~n~~~~~r~~~~~g~~~~~~ll~~l~e~~~~~~~~~~  229 (844)
                      ..|+++..+..+..++.. ++-+...+.+....+....+...|.|..+........            .+         .
T Consensus       115 ~~R~S~~~~~~i~~~a~~-~sYr~aa~~l~~~~~~~~iS~~tV~~~v~~~g~~~~~------------~~---------~  172 (470)
T PF06782_consen  115 YQRISPELKEKIVELATE-MSYRKAAEILEELLGNVSISKQTVWNIVKEAGFEEIK------------EE---------E  172 (470)
T ss_pred             ccchhHHHHHHHHHHHhh-cCHHHHHHHHhhccCCCccCHHHHHHHHHhccchhhh------------cc---------c
Confidence            468999999888887644 8888888888776654445777788877665421100            00         0


Q ss_pred             ccCCceeeeEecChHHHHHHHhCCcEEEEeccccC----CCC--CCee-EEEEE---e-cCCCCceeeEE-Eeec---cC
Q 003149          230 EETDCFKDLFWAHPFAVGLLRAFPSVVMIDCTYKT----SMY--PFSF-LEIVG---A-TSTELTFSIAF-AYLE---SE  294 (844)
Q Consensus       230 d~~~~~~~if~~~~~~~~~~~~~~~vl~iD~T~~t----n~~--~~~l-~~~~g---~-~~~~~~~~~a~-al~~---~E  294 (844)
                      .+......+|                |-.|++|-.    ++.  ...+ ++-.|   . .+.++.....- .++.   ..
T Consensus       173 ~~k~~~~~Ly----------------IEaDg~~v~~qg~~~~~~e~k~~~vheG~~~~~~~~~R~~L~n~~~f~~~~~~~  236 (470)
T PF06782_consen  173 KEKKKVPVLY----------------IEADGVHVKLQGKKKKKKEVKLFVVHEGWEKEKPGGKRNKLKNKRHFVSGVGES  236 (470)
T ss_pred             cccCCCCeEE----------------EecCcceecccccccccceeeEEEEEeeeeeeeccCCcceeecchheecccccc
Confidence            0011111111                112333322    111  1222 22224   1 11122222222 2222   44


Q ss_pred             ccchHHHHHHHHHHhhcCCCC-CeEEEeeccHHHHHHHHHhCCCcccchhhhhHHHhHHHhhhhhhhhhhHHHHHHHHhh
Q 003149          295 RDDNYIWTLERLRSMMEDDAL-PRVIVTDKDLALMNSIRAVFPRATNLLCRWHISKNISVNCKKLFETKERWEAFICSWN  373 (844)
Q Consensus       295 ~~es~~w~l~~lk~~~~~~~~-P~~iitD~~~al~~Ai~~vfP~~~~~lC~~Hi~kn~~~~~~~~~~~~~~~~~~~~~~~  373 (844)
                      ..+-|..+.+.+-+....... -.++..|...-+.+++. .||.+.+.+..||+.+.+.+.+...   ++..+.+...+ 
T Consensus       237 ~~~~~~~v~~~i~~~Y~~~~~~~iiingDGa~WIk~~~~-~~~~~~~~LD~FHl~k~i~~~~~~~---~~~~~~~~~al-  311 (470)
T PF06782_consen  237 AEEFWEEVLDYIYNHYDLDKTTKIIINGDGASWIKEGAE-FFPKAEYFLDRFHLNKKIKQALSHD---PELKEKIRKAL-  311 (470)
T ss_pred             hHHHHHHHHHHHHHhcCcccceEEEEeCCCcHHHHHHHH-hhcCceEEecHHHHHHHHHHHhhhC---hHHHHHHHHHH-
Confidence            456677777777766622222 23566788888887776 9999999999999999999887542   12222233222 


Q ss_pred             hcccCcCHHHHHHHHHHHHhhcc------CchhHHHHHHHhhhcchhhhhHhhhhcccccCCCcccccccchhhHHHHhh
Q 003149          374 VLVLSVTEQEYMQHLGAMESDFS------RYPQAIDYVKQTWLANYKEKFVAAWTDLAMHFGNVTMNRGETTHTKLKRLL  447 (844)
Q Consensus       374 ~l~~s~t~~~f~~~~~~l~~~~~------~~~~~~~y~~~~Wl~~~ke~w~~~~~~~~~~~g~~T~n~~Es~n~~LK~~l  447 (844)
                         ...+...+...++.+.....      ....+..|+.++|=. .     ..|...   .|.......|+.|..+..-+
T Consensus       312 ---~~~d~~~l~~~L~~~~~~~~~~~~~~~i~~~~~Yl~~n~~~-i-----~~y~~~---~~~~g~g~ee~~~~~~s~Rm  379 (470)
T PF06782_consen  312 ---KKGDKKKLETVLDTAESCAKDEEERKKIRKLRKYLLNNWDG-I-----KPYRER---EGLRGIGAEESVSHVLSYRM  379 (470)
T ss_pred             ---HhcCHHHHHHHHHHHHHhhhchHHHHHHHHHHHHHHHCHHH-h-----hhhhhc---cCCCccchhhhhhhHHHHHh
Confidence               24456666666666654432      234567899998831 1     112111   34444455788998887766


Q ss_pred             cC
Q 003149          448 AV  449 (844)
Q Consensus       448 ~~  449 (844)
                      +.
T Consensus       380 K~  381 (470)
T PF06782_consen  380 KS  381 (470)
T ss_pred             cC
Confidence            64


No 17 
>PF03106 WRKY:  WRKY DNA -binding domain;  InterPro: IPR003657 The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene. Structural studies indicate that this domain is a four-stranded beta-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure. The WRKYGQK residues correspond to the most N-terminal beta-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the beta-sheet.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 2AYD_A 1WJ2_A 2LEX_A.
Probab=95.43  E-value=0.049  Score=42.74  Aligned_cols=57  Identities=23%  Similarity=0.363  Sum_probs=41.6

Q ss_pred             cCeEEEEEeecCCCCCCCceEEEEEecCCccCCCCCCCCCCCCCCCCCceeeCCccEEEEEeeCCCCCeEEEEEccccCC
Q 003149           59 NGLVIVIKKSDVGGDGRRPRITFACERSGAYRRKYTEGQTPKRPKTTGTKKCGCPFLLKGHKLDTDDDWILKVVCGVHNH  138 (844)
Q Consensus        59 ~GF~v~~~~S~~~~~g~~~~~~~vC~r~G~~r~~~~~~~~~~~rr~~~s~~~gCp~~i~~~~~~~~~~w~v~~~~~~HNH  138 (844)
                      =||..|+--.+.-++....+.+|.|+.                        .+||++=.+.+...++...++.+.++|||
T Consensus         3 Dgy~WRKYGqK~i~g~~~pRsYYrCt~------------------------~~C~akK~Vqr~~~d~~~~~vtY~G~H~h   58 (60)
T PF03106_consen    3 DGYRWRKYGQKNIKGSPYPRSYYRCTH------------------------PGCPAKKQVQRSADDPNIVIVTYEGEHNH   58 (60)
T ss_dssp             SSS-EEEEEEEEETTTTCEEEEEEEEC------------------------TTEEEEEEEEEETTCCCEEEEEEES--SS
T ss_pred             CCCchhhccCcccCCCceeeEeeeccc------------------------cChhheeeEEEecCCCCEEEEEEeeeeCC
Confidence            377888766666555556678899944                        38999988888877888999999999999


Q ss_pred             C
Q 003149          139 P  139 (844)
Q Consensus       139 ~  139 (844)
                      +
T Consensus        59 ~   59 (60)
T PF03106_consen   59 P   59 (60)
T ss_dssp             -
T ss_pred             C
Confidence            6


No 18 
>PF05412 Peptidase_C33:  Equine arterivirus Nsp2-type cysteine proteinase;  InterPro: IPR008743 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  This group of cysteine peptidases corresponds to MEROPS peptidase family C33 (clan CA). The type example is equine arteritis virus Nsp2-type cysteine proteinase, which is involved in viral polyprotein processing.; GO: 0016032 viral reproduction, 0019082 viral protein processing
Probab=94.60  E-value=0.089  Score=45.25  Aligned_cols=88  Identities=16%  Similarity=0.241  Sum_probs=53.5

Q ss_pred             ccCCCCCchHHHHHHhhcCCCccHHHHHHHHHHHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccccCCCchhh
Q 003149          704 DVIADGHCGFRVVAELMDIGEDNWAQVRRDLVDELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRMIMPDTGHL  783 (844)
Q Consensus       704 ~v~~dg~Cgfraia~~lg~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~~~~~~  783 (844)
                      .+++||+||+|+||.-+..                         ++++.  |.   ..  .+.-+-+.+.|++.-+++++
T Consensus         3 sPP~DG~CG~H~i~aI~n~-------------------------m~~~~--~t---~~--l~~~~r~~d~W~~dedl~~~   50 (108)
T PF05412_consen    3 SPPGDGSCGWHCIAAIMNH-------------------------MMGGE--FT---TP--LPQRNRPSDDWADDEDLYQV   50 (108)
T ss_pred             CCCCCCchHHHHHHHHHHH-------------------------hhccC--CC---cc--ccccCCChHHccChHHHHHH
Confidence            4899999999999876532                         12210  00   00  12234578999999999999


Q ss_pred             hhcccCeeEEEEcCCccccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEEecCCCC
Q 003149          784 IASRYNIVLMHLSQQQCFTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVLDFSQTP  839 (844)
Q Consensus       784 ~a~~~~~~v~~~~~~~~~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~~~~~~~  839 (844)
                      |-.. +.|+-+--.+.|         |      .-.++-=.++.||..-.-++.-|
T Consensus        51 iq~l-~lPat~~~~~~C---------p------~ArYv~~l~~qHW~V~~~~g~~~   90 (108)
T PF05412_consen   51 IQSL-RLPATLDRNGAC---------P------HARYVLKLDGQHWEVSVRKGRAP   90 (108)
T ss_pred             HHHc-cCceeccCCCCC---------C------CCEEEEEecCceEEEEEcCCCCc
Confidence            8754 555544333333         2      22444444678998766666544


No 19 
>PF01610 DDE_Tnp_ISL3:  Transposase;  InterPro: IPR002560 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. This family includes the IS204, IS1001, IS1096 and IS1165 transposases. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=94.54  E-value=0.046  Score=57.20  Aligned_cols=68  Identities=19%  Similarity=0.226  Sum_probs=57.2

Q ss_pred             EEeeccCccchHHHHHHHH-HHhhcCCCCCeEEEeeccHHHHHHHHHhCCCcccchhhhhHHHhHHHhhhh
Q 003149          288 FAYLESERDDNYIWTLERL-RSMMEDDALPRVIVTDKDLALMNSIRAVFPRATNLLCRWHISKNISVNCKK  357 (844)
Q Consensus       288 ~al~~~E~~es~~w~l~~l-k~~~~~~~~P~~iitD~~~al~~Ai~~vfP~~~~~lC~~Hi~kn~~~~~~~  357 (844)
                      +.++++-+.+++.-+|..+ -..  .....++|++|...+...|+++.||+|.+..-.|||++++.+.+.+
T Consensus        30 l~i~~~r~~~~l~~~~~~~~~~~--~~~~v~~V~~Dm~~~y~~~~~~~~P~A~iv~DrFHvvk~~~~al~~   98 (249)
T PF01610_consen   30 LDILPGRDKETLKDFFRSLYPEE--ERKNVKVVSMDMSPPYRSAIREYFPNAQIVADRFHVVKLANRALDK   98 (249)
T ss_pred             EEEcCCccHHHHHHHHHHhCccc--cccceEEEEcCCCccccccccccccccccccccchhhhhhhhcchh
Confidence            3478888888888888876 333  3567789999999999999999999999999999999998875543


No 20 
>PF04684 BAF1_ABF1:  BAF1 / ABF1 chromatin reorganising factor;  InterPro: IPR006774 ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 in Saccharomyces cerevisiae (P14164 from SWISSPROT)). The N-terminal two thirds of the protein are necessary for DNA binding, and the N terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilise the protein structure.; GO: 0003677 DNA binding, 0006338 chromatin remodeling, 0005634 nucleus
Probab=93.24  E-value=0.62  Score=51.01  Aligned_cols=46  Identities=24%  Similarity=0.273  Sum_probs=39.1

Q ss_pred             CCCCCcccCCHHHHHHHHHHHhhhcCeEEEEEeecCCCCCCCceEEEEEec
Q 003149           35 AFTTDMVFNSREELVEWIRDTGKRNGLVIVIKKSDVGGDGRRPRITFACER   85 (844)
Q Consensus        35 ~~~~g~~F~S~de~~~~~~~ya~~~GF~v~~~~S~~~~~g~~~~~~~vC~r   85 (844)
                      .-..+..|+++++-|..+++|.-+...-|+...|.+.+     .++|.|.+
T Consensus        21 ~~~~~~~f~tl~~wy~v~ndyefq~rcpiilknsh~nk-----hftfachl   66 (496)
T PF04684_consen   21 QSAQARKFPTLEAWYNVINDYEFQSRCPIILKNSHRNK-----HFTFACHL   66 (496)
T ss_pred             ccccccCCCcHHHHHHHHhhhhhhhcCceeeccccccc-----ceEEEeec
Confidence            33467899999999999999999999999988887764     68898854


No 21 
>smart00774 WRKY DNA binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.
Probab=92.26  E-value=0.27  Score=38.24  Aligned_cols=56  Identities=23%  Similarity=0.292  Sum_probs=39.3

Q ss_pred             CeEEEEEeecCCCCCCCceEEEEEecCCccCCCCCCCCCCCCCCCCCceeeCCccEEEEEeeCCCCCeEEEEEccccCC
Q 003149           60 GLVIVIKKSDVGGDGRRPRITFACERSGAYRRKYTEGQTPKRPKTTGTKKCGCPFLLKGHKLDTDDDWILKVVCGVHNH  138 (844)
Q Consensus        60 GF~v~~~~S~~~~~g~~~~~~~vC~r~G~~r~~~~~~~~~~~rr~~~s~~~gCp~~i~~~~~~~~~~w~v~~~~~~HNH  138 (844)
                      ||..|+-..+..++....|.+|.|+.                       ..|||++=.+.+...++...++.+.++|||
T Consensus         4 Gy~WRKYGQK~ikgs~~pRsYYrCt~-----------------------~~~C~a~K~Vq~~~~d~~~~~vtY~g~H~h   59 (59)
T smart00774        4 GYQWRKYGQKVIKGSPFPRSYYRCTY-----------------------SQGCPAKKQVQRSDDDPSVVEVTYEGEHTH   59 (59)
T ss_pred             cccccccCcEecCCCcCcceEEeccc-----------------------cCCCCCcccEEEECCCCCEEEEEEeeEeCC
Confidence            56666654444444455577888844                       147999766666666788888899999998


No 22 
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=92.25  E-value=0.044  Score=55.91  Aligned_cols=109  Identities=9%  Similarity=-0.097  Sum_probs=71.2

Q ss_pred             CcccccccccccccccccccccCCCCCchHHHHHHhhcCCCc-----cHHHHHHHHHHHHHhhhhhHHHhhCCH-----H
Q 003149          684 PLCFIDSFPAGLRPYIHDVQDVIADGHCGFRVVAELMDIGED-----NWAQVRRDLVDELQSHYDDYIQLYGDA-----E  753 (844)
Q Consensus       684 ~~~~~~~~~~~~~~~~~~~~~v~~dg~Cgfraia~~lg~~~~-----~~~~vr~~l~~el~~~~~~y~~~~~~~-----~  753 (844)
                      -.+++-.+|.....--..-.|..|||||.|-+|+.+|+..-.     -=...|..=....+.+...|.++.-++     .
T Consensus       155 ~n~~i~~~~~i~y~~~i~k~d~~~dG~ieia~iS~~l~v~i~~Vdv~~~~~dr~~~~~~~q~~~i~f~g~hfD~~t~~m~  234 (306)
T COG5539         155 YNPAILEIDVIAYATWIVKPDSQGDGCIEIAIISDQLPVRIHVVDVDKDSEDRYNSHPYVQRISILFTGIHFDEETLAMV  234 (306)
T ss_pred             cchhhcCcchHHHHHhhhccccCCCceEEEeEeccccceeeeeeecchhHHhhccCChhhhhhhhhhcccccchhhhhcc
Confidence            345566666655555555567899999999999999988421     012222222233344555565544222     2


Q ss_pred             HHHHHHhhcCCCCCCCCccccccCCCchhhhhcccCeeEEEEcCCc
Q 003149          754 IARELLHSLSYSESNPGIEHRMIMPDTGHLIASRYNIVLMHLSQQQ  799 (844)
Q Consensus       754 ~~~~~~~~l~~~~~~~~~~~w~~~~~~~~~~a~~~~~~v~~~~~~~  799 (844)
                      .|+.+.+.+.      ....|...+.. +.||+.+..|+-++....
T Consensus       235 ~~dt~~ne~~------~~a~~g~~~ei-~qLas~lk~~~~~~nT~~  273 (306)
T COG5539         235 LWDTYVNEVL------FDASDGITIEI-QQLASLLKNPHYYTNTAS  273 (306)
T ss_pred             hHHHHHhhhc------ccccccchHHH-HHHHHHhcCceEEeecCC
Confidence            5788777776      66889877766 589999999998887653


No 23 
>PF13610 DDE_Tnp_IS240:  DDE domain
Probab=90.70  E-value=0.17  Score=47.81  Aligned_cols=81  Identities=21%  Similarity=0.103  Sum_probs=66.3

Q ss_pred             CcEEEEeccccCCCCCCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHHHH
Q 003149          253 PSVVMIDCTYKTSMYPFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNSIR  332 (844)
Q Consensus       253 ~~vl~iD~T~~tn~~~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~Ai~  332 (844)
                      |+.|.+|-||-.-+ +--.+....+|..|+  ++.+-|....+...=..||..+.+..  ...|.+|+||+..+...|++
T Consensus         1 ~~~w~~DEt~iki~-G~~~yl~~aiD~~~~--~l~~~ls~~Rd~~aA~~Fl~~~l~~~--~~~p~~ivtDk~~aY~~A~~   75 (140)
T PF13610_consen    1 GDSWHVDETYIKIK-GKWHYLWRAIDAEGN--ILDFYLSKRRDTAAAKRFLKRALKRH--RGEPRVIVTDKLPAYPAAIK   75 (140)
T ss_pred             CCEEEEeeEEEEEC-CEEEEEEEeeccccc--chhhhhhhhcccccceeeccccceee--ccccceeecccCCccchhhh
Confidence            57899999997543 234556778899888  78888888888888778887777764  37899999999999999999


Q ss_pred             HhCCCc
Q 003149          333 AVFPRA  338 (844)
Q Consensus       333 ~vfP~~  338 (844)
                      +++++.
T Consensus        76 ~l~~~~   81 (140)
T PF13610_consen   76 ELNPEG   81 (140)
T ss_pred             hccccc
Confidence            999974


No 24 
>PF04500 FLYWCH:  FLYWCH zinc finger domain;  InterPro: IPR007588 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.  C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array.
C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets. This entry represents a potential FLYWCH Zn-finger domain found in a number of eukaryotic proteins. FLYWCH is a C2H2-type zinc finger characterised by five conserved hydrophobic residues, containing the conserved sequence motif:  F/Y-X(n)-L-X(n)-F/Y-X(n)-WXCX(6-12)CX(17-22)HXH  where X indicates any amino acid. This domain was first characterised in Drosophila Modifier of mdg4 proteins, Mod(mdg4), putative chromatin modulators involved in higher order chromatin domains. Mod(mdg4) proteins share a common N-terminal BTB/POZ domain, but differ in their C-terminal region, most containing C-terminal FLYWCH zinc finger motifs. The FLYWCH domain in Mod(mdg4) proteins has a putative role in protein-protein interactions; for example, Mod(mdg4)-67.2 interacts with DNA-binding protein Su(Hw) via its FLYWCH domain. FLYWCH domains have been described in other proteins as well, including suppressor of killer of prune, Su(Kpn), which contains 4 terminal FLYWCH zinc finger motifs in a tandem array and a C-terminal glutathione SH-transferase (GST) domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; PDB: 2RPR_A.
Probab=89.40  E-value=1  Score=35.37  Aligned_cols=49  Identities=27%  Similarity=0.421  Sum_probs=23.9

Q ss_pred             cCeEEEEEeecCCCCCCCceEEEEEecCCccCCCCCCCCCCCCCCCCCceeeCCccEEEEEeeCCCCCeEEEEEccccCC
Q 003149           59 NGLVIVIKKSDVGGDGRRPRITFACERSGAYRRKYTEGQTPKRPKTTGTKKCGCPFLLKGHKLDTDDDWILKVVCGVHNH  138 (844)
Q Consensus        59 ~GF~v~~~~S~~~~~g~~~~~~~vC~r~G~~r~~~~~~~~~~~rr~~~s~~~gCp~~i~~~~~~~~~~w~v~~~~~~HNH  138 (844)
                      .||.+...+....      ...+.|.+..                     ..+|+|++...    .+.-.+.....+|||
T Consensus        14 ~Gy~y~~~~~~~~------~~~WrC~~~~---------------------~~~C~a~~~~~----~~~~~~~~~~~~HnH   62 (62)
T PF04500_consen   14 DGYRYYFNKRNDG------KTYWRCSRRR---------------------SHGCRARLITD----AGDGRVVRTNGEHNH   62 (62)
T ss_dssp             TTEEEEEEEE-SS-------EEEEEGGGT---------------------TS----EEEEE------TTEEEE-S---SS
T ss_pred             CCeEEECcCCCCC------cEEEEeCCCC---------------------CCCCeEEEEEE----CCCCEEEECCCccCC
Confidence            5777776665522      5788895521                     25899999864    344566666788999


No 25 
>PF00665 rve:  Integrase core domain;  InterPro: IPR001584 Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus (ASV) have been studied most carefully with respect to the structural basis of catalysis. Although the active site of ASV integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis.  Retroviral integrase is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group.  HIV integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration.
To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity.; GO: 0015074 DNA integration; PDB: 3AO3_A 3OVN_A 3AO5_A 3AO4_A 3AO1_A 1C6V_D 3HPG_A 3HPH_A 3OYD_A 3OYF_B ....
Probab=79.40  E-value=8.7  Score=34.50  Aligned_cols=76  Identities=21%  Similarity=0.109  Sum_probs=49.2

Q ss_pred             CcEEEEeccccC-CCCCCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHH
Q 003149          253 PSVVMIDCTYKT-SMYPFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNS  330 (844)
Q Consensus       253 ~~vl~iD~T~~t-n~~~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~A  330 (844)
                      +.++.+|.+... ...+...+.++.+|..-. +.+++.+...++.+.+..+|....... +...|.+|++|+..+..+.
T Consensus         6 ~~~~~~D~~~~~~~~~~~~~~~~~~iD~~S~-~~~~~~~~~~~~~~~~~~~l~~~~~~~-~~~~p~~i~tD~g~~f~~~   82 (120)
T PF00665_consen    6 GERWQIDFTPMPIPDKGGRVYLLVFIDDYSR-FIYAFPVSSKETAEAALRALKRAIEKR-GGRPPRVIRTDNGSEFTSH   82 (120)
T ss_dssp             TTEEEEEEEEETGGCTT-CEEEEEEEETTTT-EEEEEEESSSSHHHHHHHHHHHHHHHH-S-SE-SEEEEESCHHHHSH
T ss_pred             CCEEEEeeEEEecCCCCccEEEEEEEECCCC-cEEEEEeeccccccccccccccccccc-ccccceecccccccccccc
Confidence            478899988544 334447777788876554 555666766666666666666544443 2333999999999987643


No 26 
>PF15299 ALS2CR8:  Amyotrophic lateral sclerosis 2 chromosomal region candidate gene 8
Probab=78.06  E-value=14  Score=37.75  Aligned_cols=98  Identities=19%  Similarity=0.240  Sum_probs=59.2

Q ss_pred             CCCCCCceeeCCccEEEEEeeCCC-----------------------------------C--CeEEEEE-cccc-CCCCC
Q 003149          101 RPKTTGTKKCGCPFLLKGHKLDTD-----------------------------------D--DWILKVV-CGVH-NHPVT  141 (844)
Q Consensus       101 ~rr~~~s~~~gCp~~i~~~~~~~~-----------------------------------~--~w~v~~~-~~~H-NH~~~  141 (844)
                      .++...+.+.+|||+|.++....-                                   +  .|.|..- ..+| +|+..
T Consensus        69 ~~~~~~skK~~CPA~I~Ik~I~~FPdykv~~~~~~~~~~~r~~~~~~lk~~l~~~~~~~~~~r~yv~lP~~~~H~~H~~~  148 (225)
T PF15299_consen   69 RRRSKPSKKRDCPARIYIKEIIKFPDYKVPTNSQKDTRRERRKASKKLKKALLSGKSIEGERRFYVQLPSPEEHSGHPIG  148 (225)
T ss_pred             ccccccccCCCCCeEEEEEEEEEcCCcccccchhhhhHHHHHHHHHHHHHHHhcCCCCCceEEEEEECCChHhcCCCccc
Confidence            455667889999999998643111                                   1  1222221 2356 78876


Q ss_pred             CCcccCccccCCCHHHHHHHHHHhhCCCCh-HHHHHHHHhc-----CC----------CccchHHHHHHHHHHhh
Q 003149          142 QHVEGHSYAGRLTEQEANILVDLSRSNISP-KEILQTLKQR-----DM----------HNVSTIKAIYNARHKYR  200 (844)
Q Consensus       142 ~~~~~~~~~rrl~~~~~~~i~~l~~~~~~p-~~I~~~l~~~-----~~----------~~~~t~kdi~n~~~~~r  200 (844)
                      .....  ...++.++....|..|...|+.. .+|...|+..     +.          .-.+|.+||.|......
T Consensus       149 ~~~~~--~~q~~~~~v~~ki~eLv~~gv~~v~e~k~~l~~fV~~~lf~~~~~p~~~n~~y~Pt~~di~n~~~~~~  221 (225)
T PF15299_consen  149 QEAAG--LKQPLDPRVVEKIHELVAQGVTSVPEMKRHLKKFVEEELFKDQEPPPPTNRRYFPTDKDIRNHMYSAK  221 (225)
T ss_pred             ccccc--ccccCCHHHHHHHHHHHHcccccHHHHHHHHHHHhhhhccCCCCCCCCCccccCCchHHHHHHHHHHH
Confidence            53321  22467888889999999999765 6666666321     10          11357788888766543


No 27 
>PHA02517 putative transposase OrfB; Reviewed
Probab=76.35  E-value=22  Score=37.69  Aligned_cols=69  Identities=13%  Similarity=-0.012  Sum_probs=41.9

Q ss_pred             CcEEEEeccccCCCCCCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhc--CCCCCeEEEeeccHH
Q 003149          253 PSVVMIDCTYKTSMYPFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMME--DDALPRVIVTDKDLA  326 (844)
Q Consensus       253 ~~vl~iD~T~~tn~~~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~--~~~~P~~iitD~~~a  326 (844)
                      ..++..|.|+..... +-.++++.+|...+ .++|+.+....+.+.   +++.|..++.  +...+.+|.||....
T Consensus       110 n~~w~~D~t~~~~~~-g~~yl~~iiD~~sr-~i~~~~~~~~~~~~~---~~~~l~~a~~~~~~~~~~i~~sD~G~~  180 (277)
T PHA02517        110 NQLWVADFTYVSTWQ-GWVYVAFIIDVFAR-RIVGWRVSSSMDTDF---VLDALEQALWARGRPGGLIHHSDKGSQ  180 (277)
T ss_pred             CCeEEeceeEEEeCC-CCEEEEEecccCCC-eeeecccCCCCChHH---HHHHHHHHHHhcCCCcCcEeecccccc
Confidence            478999999965443 34566666665544 567788877666664   4455555541  222234667888764


No 28 
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=61.06  E-value=8.6  Score=39.84  Aligned_cols=112  Identities=13%  Similarity=0.039  Sum_probs=72.6

Q ss_pred             CCCCchHHHHHHhhcCCCccHHHHHHHHHHHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccc-cCCCchhhhh
Q 003149          707 ADGHCGFRVVAELMDIGEDNWAQVRRDLVDELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRM-IMPDTGHLIA  785 (844)
Q Consensus       707 ~dg~Cgfraia~~lg~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~-~~~~~~~~~a  785 (844)
                      .|--|.|+|.+..++--  +=..+|+.+..|+.++++.|.+...+-+... ++..|.      .++-|. +--..+ +|.
T Consensus       119 ~d~srl~q~~~~~l~~a--sv~~lrE~vs~Ev~snPDl~n~~i~~~~~i~-y~~~i~------k~d~~~dG~ieia-~iS  188 (306)
T COG5539         119 DDNSRLFQAERYSLRDA--SVAKLREVVSLEVLSNPDLYNPAILEIDVIA-YATWIV------KPDSQGDGCIEIA-IIS  188 (306)
T ss_pred             CchHHHHHHHHhhhhhh--hHHHHHHHHHHHHhhCccccchhhcCcchHH-HHHhhh------ccccCCCceEEEe-Eec
Confidence            45789999999888654  7788999999999999999987653322222 222222      345565 333344 799


Q ss_pred             cccCeeEEEEcCCccccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEEe
Q 003149          786 SRYNIVLMHLSQQQCFTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVLD  834 (844)
Q Consensus       786 ~~~~~~v~~~~~~~~~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~~  834 (844)
                      +.+++-|.+.......   -++-.+-+.  -.-|++-|. +-||-++.+
T Consensus       189 ~~l~v~i~~Vdv~~~~---~dr~~~~~~--~q~~~i~f~-g~hfD~~t~  231 (306)
T COG5539         189 DQLPVRIHVVDVDKDS---EDRYNSHPY--VQRISILFT-GIHFDEETL  231 (306)
T ss_pred             cccceeeeeeecchhH---HhhccCChh--hhhhhhhhc-ccccchhhh
Confidence            9999998888765321   222222122  123788885 689987764


No 29 
>COG4279 Uncharacterized conserved protein [Function unknown]
Probab=59.19  E-value=4.2  Score=41.18  Aligned_cols=30  Identities=23%  Similarity=0.348  Sum_probs=23.6

Q ss_pred             CCCCCccccccCCccchhHhhHHhhcCCCCCcc
Q 003149          521 ASACGCVFRRTHGLPCAHEIAEYKHERRSIPLL  553 (844)
Q Consensus       521 ~~~CsC~~~~~~GlPC~H~lav~~~~~~~i~~~  553 (844)
                      ...|||..+   ..||+|+-||+.+.+..+..+
T Consensus       124 ~~dCSCPD~---anPCKHi~AvyY~lae~f~~d  153 (266)
T COG4279         124 STDCSCPDY---ANPCKHIAAVYYLLAEKFDED  153 (266)
T ss_pred             ccccCCCCc---ccchHHHHHHHHHHHHHhccC
Confidence            457999875   579999999999887555444


No 30 
>PF04937 DUF659:  Protein of unknown function (DUF 659);  InterPro: IPR007021 These are transposase-like proteins with no known function.
Probab=58.82  E-value=83  Score=30.04  Aligned_cols=106  Identities=13%  Similarity=0.078  Sum_probs=66.1

Q ss_pred             HhCCcEEEEeccccCCCCCCeeEEEEEecCCCCceeeEEEeec-cCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHH
Q 003149          250 RAFPSVVMIDCTYKTSMYPFSFLEIVGATSTELTFSIAFAYLE-SERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALM  328 (844)
Q Consensus       250 ~~~~~vl~iD~T~~tn~~~~~l~~~~g~~~~~~~~~~a~al~~-~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~  328 (844)
                      ...|=.|..|+=  ++..+.+++.|+.....|..|.-..-.-. ..+.+.+-.+|+...+.+ +......||||-...+.
T Consensus        30 ~~~Gcsi~~DgW--td~~~~~lInf~v~~~~g~~Flksvd~s~~~~~a~~l~~ll~~vIeeV-G~~nVvqVVTDn~~~~~  106 (153)
T PF04937_consen   30 KRTGCSIMSDGW--TDRKGRSLINFMVYCPEGTVFLKSVDASSIIKTAEYLFELLDEVIEEV-GEENVVQVVTDNASNMK  106 (153)
T ss_pred             HhcCEEEEEecC--cCCCCCeEEEEEEEcccccEEEEEEecccccccHHHHHHHHHHHHHHh-hhhhhhHHhccCchhHH
Confidence            344445666654  34455677777777776665543332211 134444555555555444 45556668999999988


Q ss_pred             HHHH---HhCCCcccchhhhhHHHhHHHhhhhh
Q 003149          329 NSIR---AVFPRATNLLCRWHISKNISVNCKKL  358 (844)
Q Consensus       329 ~Ai~---~vfP~~~~~lC~~Hi~kn~~~~~~~~  358 (844)
                      +|-+   +-+|..-...|.-|-+.-+.+.+.++
T Consensus       107 ~a~~~L~~k~p~ifw~~CaaH~inLmledi~k~  139 (153)
T PF04937_consen  107 KAGKLLMEKYPHIFWTPCAAHCINLMLEDIGKL  139 (153)
T ss_pred             HHHHHHHhcCCCEEEechHHHHHHHHHHHHhcC
Confidence            8844   44788777889999888777766543


No 31 
>KOG4345 consensus NF-kappa B regulator AP20/Cezanne [Signal transduction mechanisms]
Probab=56.80  E-value=7.3  Score=45.05  Aligned_cols=52  Identities=13%  Similarity=0.193  Sum_probs=38.3

Q ss_pred             chhhhhcccCeeEEEEcC-----C---------ccccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEE
Q 003149          780 TGHLIASRYNIVLMHLSQ-----Q---------QCFTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVL  833 (844)
Q Consensus       780 ~~~~~a~~~~~~v~~~~~-----~---------~~~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~  833 (844)
                      +-.++|+...|||++++-     .         .-..|+||-.|+...+ .-||+|+|- ..||+.++
T Consensus       225 hifvl~~ilRrpivvvsd~mlR~s~~~sfap~~~ggiylpLe~p~~~c~-r~pLvl~yd-~~hf~~lv  290 (774)
T KOG4345|consen  225 HIFVLAHILRRPIVVVSDTMLRDSGGESFAPIPVGGIYLPLEVPAQECH-RSPLVLAYD-QAHFSALV  290 (774)
T ss_pred             HHHHHHHHhhCCeeEecccccccCCCcccccCccCceEEeccCchhhcc-cchhhhhhH-hhhhhhhh
Confidence            667899999999999973     1         2345777777754443 458999996 47999874


No 32 
>COG3316 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=53.25  E-value=32  Score=34.57  Aligned_cols=81  Identities=16%  Similarity=0.100  Sum_probs=57.7

Q ss_pred             CcEEEEeccccCCCCC-CeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHHH
Q 003149          253 PSVVMIDCTYKTSMYP-FSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNSI  331 (844)
Q Consensus       253 ~~vl~iD~T~~tn~~~-~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~Ai  331 (844)
                      .+++.+|-||-+-+-+ .-|+  -.+|..  ..++.+-|...-+...=..||..+++..   ..|.+|+||+......|+
T Consensus        70 ~~~w~vDEt~ikv~gkw~yly--rAid~~--g~~Ld~~L~~rRn~~aAk~Fl~kllk~~---g~p~v~vtDka~s~~~A~  142 (215)
T COG3316          70 GDSWRVDETYIKVNGKWHYLY--RAIDAD--GLTLDVWLSKRRNALAAKAFLKKLLKKH---GEPRVFVTDKAPSYTAAL  142 (215)
T ss_pred             ccceeeeeeEEeeccEeeehh--hhhccC--CCeEEEEEEcccCcHHHHHHHHHHHHhc---CCCceEEecCccchHHHH
Confidence            4678889888653322 2233  334555  3566777777777777777777777764   789999999999999999


Q ss_pred             HHhCCCccc
Q 003149          332 RAVFPRATN  340 (844)
Q Consensus       332 ~~vfP~~~~  340 (844)
                      .++-++..|
T Consensus       143 ~~l~~~~eh  151 (215)
T COG3316         143 RKLGSEVEH  151 (215)
T ss_pred             HhcCcchhe
Confidence            999875443


No 33 
>COG3464 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=49.57  E-value=68  Score=36.07  Aligned_cols=56  Identities=18%  Similarity=0.214  Sum_probs=43.0

Q ss_pred             EeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHHHHHhCCCcccchhhhhHHH
Q 003149          289 AYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNSIRAVFPRATNLLCRWHISK  349 (844)
Q Consensus       289 al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~Ai~~vfP~~~~~lC~~Hi~k  349 (844)
                      .++++-+.+++...|..+     +....+.|..|......+++++.||++.+.+=.||+.+
T Consensus       183 ~i~~~r~~~ti~~~l~~~-----g~~~v~~V~~D~~~~y~~~v~e~~pna~i~~d~fh~~~  238 (402)
T COG3464         183 DILEGRSVRTLRRYLRRG-----GSEQVKSVSMDMFGPYASAVQELFPNALIIADRFHVVQ  238 (402)
T ss_pred             eecCCccHHHHHHHHHhC-----CCcceeEEEccccHHHHHHHHHhCCChheeeeeeeeee
Confidence            355666666555554433     12267889999999999999999999999999999877


No 34 
>PF08069 Ribosomal_S13_N:  Ribosomal S13/S15 N-terminal domain;  InterPro: IPR012606 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.  Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. This domain is found at the N terminus of ribosomal S13 and S15 proteins. This domain is also identified as NUC021.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005840 ribosome; PDB: 3U5C_N 3O30_G 3IZB_O 3O2Z_G 3U5G_N 2XZN_O 2XZM_O 3IZ6_O.
Probab=48.45  E-value=28  Score=27.29  Aligned_cols=29  Identities=21%  Similarity=0.416  Sum_probs=25.7

Q ss_pred             HHHHHHHHHHhhCCCChHHHHHHHHhcCC
Q 003149          155 EQEANILVDLSRSNISPKEILQTLKQRDM  183 (844)
Q Consensus       155 ~~~~~~i~~l~~~~~~p~~I~~~l~~~~~  183 (844)
                      ++..+.|.+|.+.|++|.+|-..|+++++
T Consensus        31 ~eVe~~I~klakkG~tpSqIG~iLRD~~G   59 (60)
T PF08069_consen   31 EEVEELIVKLAKKGLTPSQIGVILRDQYG   59 (60)
T ss_dssp             HHHHHHHHHHCCTTHCHHHHHHHHHHSCT
T ss_pred             HHHHHHHHHHHHcCCCHHHhhhhhhhccC
Confidence            56678889999999999999999999874


No 35 
>PF13936 HTH_38:  Helix-turn-helix domain; PDB: 2W48_A.
Probab=47.78  E-value=25  Score=25.64  Aligned_cols=30  Identities=30%  Similarity=0.362  Sum_probs=15.4

Q ss_pred             cCCCHHHHHHHHHHhhCCCChHHHHHHHHh
Q 003149          151 GRLTEQEANILVDLSRSNISPKEILQTLKQ  180 (844)
Q Consensus       151 rrl~~~~~~~i~~l~~~~~~p~~I~~~l~~  180 (844)
                      ++|+.+++..|..+.+.|.+.++|...|..
T Consensus         3 ~~Lt~~eR~~I~~l~~~G~s~~~IA~~lg~   32 (44)
T PF13936_consen    3 KHLTPEERNQIEALLEQGMSIREIAKRLGR   32 (44)
T ss_dssp             ---------HHHHHHCS---HHHHHHHTT-
T ss_pred             cchhhhHHHHHHHHHHcCCCHHHHHHHHCc
Confidence            468999999999999999999999987744


No 36 
>PRK09784 hypothetical protein; Provisional
Probab=45.11  E-value=13  Score=37.01  Aligned_cols=36  Identities=25%  Similarity=0.350  Sum_probs=25.7

Q ss_pred             cccccccccCCCCCchHHHHHHhhcCCCccHHHHHHH
Q 003149          697 PYIHDVQDVIADGHCGFRVVAELMDIGEDNWAQVRRD  733 (844)
Q Consensus       697 ~~~~~~~~v~~dg~Cgfraia~~lg~~~~~~~~vr~~  733 (844)
                      .|.++---|+|||-|..|||-.. ...+-+|..+-..
T Consensus       197 ~~glkyapvdgdgycllrailvl-k~h~yswal~s~k  232 (417)
T PRK09784        197 TYGLKYAPVDGDGYCLLRAILVL-KQHDYSWALGSHK  232 (417)
T ss_pred             hhCceecccCCCchhHHHHHHHh-hhcccchhhccch
Confidence            45556666999999999999763 3455677766443


No 37 
>PF03050 DDE_Tnp_IS66:  Transposase IS66 family;  InterPro: IPR004291 Transposase proteins are necessary for efficient DNA transposition. This family includes the bacterial insertion sequence (IS) element, IS66, from Agrobacterium tumefaciens. IS66 may cause genetic and structural variations of the T region and the vir region of the octopine Ti plasmids. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=44.51  E-value=16  Score=38.49  Aligned_cols=85  Identities=19%  Similarity=0.150  Sum_probs=53.4

Q ss_pred             CcEEEEeccccC-----CCCCCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHH
Q 003149          253 PSVVMIDCTYKT-----SMYPFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLAL  327 (844)
Q Consensus       253 ~~vl~iD~T~~t-----n~~~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al  327 (844)
                      .+++.+|-|.-.     +.-+.-+.+++.-+      .+.|.+.++-..+...-+|..         ...++++|+-.+.
T Consensus        67 ~~~~~~DET~~~vl~~~~g~~~~~Wv~~~~~------~v~f~~~~sR~~~~~~~~L~~---------~~GilvsD~y~~Y  131 (271)
T PF03050_consen   67 SPVVHADETGWRVLDKGKGKKGYLWVFVSPE------VVLFFYAPSRSSKVIKEFLGD---------FSGILVSDGYSAY  131 (271)
T ss_pred             cceeccCCceEEEeccccccceEEEeeeccc------eeeeeecccccccchhhhhcc---------cceeeeccccccc
Confidence            456666766544     22233344444333      556666666666555444332         3369999999876


Q ss_pred             HHHHHHhCCCcccchhhhhHHHhHHHhhhh
Q 003149          328 MNSIRAVFPRATNLLCRWHISKNISVNCKK  357 (844)
Q Consensus       328 ~~Ai~~vfP~~~~~lC~~Hi~kn~~~~~~~  357 (844)
                      ..     +..+.|+.|+-|+.+.+..-...
T Consensus       132 ~~-----~~~~~hq~C~AH~~R~~~~~~~~  156 (271)
T PF03050_consen  132 NK-----LAGITHQLCWAHLRRDFQDAAES  156 (271)
T ss_pred             cc-----ccccccccccccccccccccccc
Confidence            54     22889999999999998876543


No 38 
>PRK13907 rnhA ribonuclease H; Provisional
Probab=44.50  E-value=1.8e+02  Score=26.45  Aligned_cols=78  Identities=15%  Similarity=0.080  Sum_probs=43.6

Q ss_pred             EEEEeccccCCCCCCeeEEEEEecCCCCceeeEE-EeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHHHHH
Q 003149          255 VVMIDCTYKTSMYPFSFLEIVGATSTELTFSIAF-AYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNSIRA  333 (844)
Q Consensus       255 vl~iD~T~~tn~~~~~l~~~~g~~~~~~~~~~a~-al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~Ai~~  333 (844)
                      .|.+|+.++.|.-..-...++ .+..+.. ...+ .-..+....-|.-++..|+.+...+..+..|.+|. ..+.+++..
T Consensus         3 ~iy~DGa~~~~~g~~G~G~vi-~~~~~~~-~~~~~~~~~tn~~AE~~All~aL~~a~~~g~~~v~i~sDS-~~vi~~~~~   79 (128)
T PRK13907          3 EVYIDGASKGNPGPSGAGVFI-KGVQPAV-QLSLPLGTMSNHEAEYHALLAALKYCTEHNYNIVSFRTDS-QLVERAVEK   79 (128)
T ss_pred             EEEEeeCCCCCCCccEEEEEE-EECCeeE-EEEecccccCCcHHHHHHHHHHHHHHHhCCCCEEEEEech-HHHHHHHhH
Confidence            378999998876443333333 3444432 2222 11223344456677777777764455566777876 556666665


Q ss_pred             hC
Q 003149          334 VF  335 (844)
Q Consensus       334 vf  335 (844)
                      .+
T Consensus        80 ~~   81 (128)
T PRK13907         80 EY   81 (128)
T ss_pred             HH
Confidence            54


No 39 
>COG5431 Uncharacterized metal-binding protein [Function unknown]
Probab=34.84  E-value=15  Score=31.83  Aligned_cols=25  Identities=28%  Similarity=0.371  Sum_probs=18.0

Q ss_pred             CCCCCccccc-----cCCccchhHhhHHhh
Q 003149          521 ASACGCVFRR-----THGLPCAHEIAEYKH  545 (844)
Q Consensus       521 ~~~CsC~~~~-----~~GlPC~H~lav~~~  545 (844)
                      .--|||.++-     .-.-||.|++.+-..
T Consensus        49 ~gfCSCp~~~~svvl~Gk~~C~Hi~glk~A   78 (117)
T COG5431          49 GGFCSCPDFLGSVVLKGKSPCAHIIGLKVA   78 (117)
T ss_pred             cCcccCHHHHhHhhhcCcccchhhhheeee
Confidence            3489999887     235579999986433


No 40 
>KOG4540 consensus Putative lipase essential for disintegration of autophagic bodies inside the vacuole [Intracellular trafficking, secretion, and vesicular transport; Lipid transport and metabolism]
Probab=32.79  E-value=1.3e+02  Score=31.59  Aligned_cols=57  Identities=16%  Similarity=0.204  Sum_probs=35.6

Q ss_pred             HHHHHHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccccCCCchhhhhc----ccCeeEEEEcC
Q 003149          732 RDLVDELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRMIMPDTGHLIAS----RYNIVLMHLSQ  797 (844)
Q Consensus       732 ~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~~~~~~~a~----~~~~~v~~~~~  797 (844)
                      +=+-+|++.....|...+    ++-.-+.++ ++    .-.-|++-.+.|.+||+    +||.|+|.+++
T Consensus       246 ~ClE~eir~~dryySa~l----dI~~~v~~~-Yp----da~iwlTGHSLGGa~AsLlG~~fglP~VaFes  306 (425)
T KOG4540|consen  246 ECLEEEIREFDRYYSAAL----DILGAVRRI-YP----DARIWLTGHSLGGAIASLLGIRFGLPVVAFES  306 (425)
T ss_pred             HHHHHHHHhhcchhHHHH----HHHHHHHHh-CC----CceEEEeccccchHHHHHhccccCCceEEecC
Confidence            345556665544444322    222223333 33    34789999999999997    56679999986


No 41 
>COG5153 CVT17 Putative lipase essential for disintegration of autophagic bodies inside the vacuole [Intracellular trafficking and secretion / Lipid metabolism]
Probab=32.79  E-value=1.3e+02  Score=31.59  Aligned_cols=57  Identities=16%  Similarity=0.204  Sum_probs=35.6

Q ss_pred             HHHHHHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccccCCCchhhhhc----ccCeeEEEEcC
Q 003149          732 RDLVDELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRMIMPDTGHLIAS----RYNIVLMHLSQ  797 (844)
Q Consensus       732 ~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~~~~~~~a~----~~~~~v~~~~~  797 (844)
                      +=+-+|++.....|...+    ++-.-+.++ ++    .-.-|++-.+.|.+||+    +||.|+|.+++
T Consensus       246 ~ClE~eir~~dryySa~l----dI~~~v~~~-Yp----da~iwlTGHSLGGa~AsLlG~~fglP~VaFes  306 (425)
T COG5153         246 ECLEEEIREFDRYYSAAL----DILGAVRRI-YP----DARIWLTGHSLGGAIASLLGIRFGLPVVAFES  306 (425)
T ss_pred             HHHHHHHHhhcchhHHHH----HHHHHHHHh-CC----CceEEEeccccchHHHHHhccccCCceEEecC
Confidence            345556665544444322    222223333 33    34789999999999997    56679999986


No 42 
>PF09921 DUF2153:  Uncharacterized protein conserved in archaea (DUF2153);  InterPro: IPR014450 There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.
Probab=28.38  E-value=38  Score=30.52  Aligned_cols=48  Identities=13%  Similarity=0.448  Sum_probs=35.7

Q ss_pred             ccHHHHHHHHHHHHHhhhhhHHH------hhCCHHHHHHHHhhcCCCCCCCCccccccCC
Q 003149          725 DNWAQVRRDLVDELQSHYDDYIQ------LYGDAEIARELLHSLSYSESNPGIEHRMIMP  778 (844)
Q Consensus       725 ~~~~~vr~~l~~el~~~~~~y~~------~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~  778 (844)
                      |.|+...+.++++++++.+.|..      ++.+-..|..++..|.      .-|.||..|
T Consensus         3 d~WVk~Qk~~l~~~~~~e~~~~~~DRL~LIl~sr~afqhm~RTlK------aFd~WLqdP   56 (126)
T PF09921_consen    3 DEWVKMQKRLLETFKKHEKNVESADRLDLILSSRAAFQHMMRTLK------AFDQWLQDP   56 (126)
T ss_pred             HHHHHHHHHHHHHHHHHHHHhhhhhHHHHHHHHHHHHHHHHHHHH------HHHHHHcCc
Confidence            78999999999999999876653      2233335666666675      568888887


No 43 
>PF09607 BrkDBD:  Brinker DNA-binding domain;  InterPro: IPR018586 This DNA-binding domain comprises approximately the first 100 residues at the N-terminal end of Brinker. The structure of this domain in complex with DNA consists of four alpha-helices that contain a helix-turn-helix DNA recognition motif specific for GC-rich DNA. The Brinker nuclear repressor is a major element of the Drosophila Decapentaplegic morphogen signalling pathway.; PDB: 2GLO_A.
Probab=26.70  E-value=47  Score=25.84  Aligned_cols=18  Identities=22%  Similarity=0.501  Sum_probs=14.6

Q ss_pred             CCCCch--HHHHHHhhcCCC
Q 003149          707 ADGHCG--FRVVAELMDIGE  724 (844)
Q Consensus       707 ~dg~Cg--fraia~~lg~~~  724 (844)
                      -|+||-  +||.|+..|.++
T Consensus        20 ~~~nc~~~~RAaarkf~V~r   39 (58)
T PF09607_consen   20 KDNNCKGNQRAAARKFNVSR   39 (58)
T ss_dssp             H-TTTTT-HHHHHHHTTS-H
T ss_pred             HccchhhhHHHHHHHhCccH
Confidence            688998  999999999975


No 44 
>PF03412 Peptidase_C39:  Peptidase C39 family. This is family C39 in the peptidase classification.;  InterPro: IPR005074 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retained its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures.
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  This group of sequences defined by this cysteine peptidase domain belongs to the MEROPS peptidase family C39 (clan CA). It is found in a wide range of ABC transporters, which are maturation proteases for peptide bacteriocins, the proteolytic domain residing in the N-terminal region of the protein. A number of the proteins are classified as non-peptidase homologues as they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity. Lantibiotic and non-lantibiotic bacteriocins are synthesised as precursor peptides containing N-terminal extensions (leader peptides) which are cleaved off during maturation. Most non-lantibiotics and also some lantibiotics have leader peptides of the so-called double-glycine type. These leader peptides share consensus sequences and also a common processing site with two conserved glycine residues in positions -1 and -2. The double-glycine-type leader peptides are unrelated to the N-terminal signal sequences which direct proteins across the cytoplasmic membrane via the sec pathway. Their processing sites are also different from typical signal peptidase cleavage sites, suggesting that a different processing enzyme is involved.  ; GO: 0005524 ATP binding, 0008233 peptidase activity, 0006508 proteolysis, 0016021 integral to membrane; PDB: 3K8U_A 3B79_A.
Probab=25.69  E-value=3.2e+02  Score=24.60  Aligned_cols=83  Identities=20%  Similarity=0.285  Sum_probs=44.0

Q ss_pred             CCCchHHHHHHhhcCCCccHHHHHHHHHHHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccccCCCchhhhhcc
Q 003149          708 DGHCGFRVVAELMDIGEDNWAQVRRDLVDELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRMIMPDTGHLIASR  787 (844)
Q Consensus       708 dg~Cgfraia~~lg~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~~~~~~~a~~  787 (844)
                      +-.||..|+|..+..                          ++-.-..+++...+.      ....-+++.++. .+|..
T Consensus        10 ~~dcg~acl~~l~~~--------------------------~g~~~s~~~l~~~~~------~~~~g~s~~~L~-~~~~~   56 (131)
T PF03412_consen   10 SNDCGLACLAMLLKY--------------------------YGIPVSEEELRRQLG------TSEEGTSLADLK-RAARK   56 (131)
T ss_dssp             TT-HHHHHHHHHHHH--------------------------TT----HHHHHCCTT-------BTTB--CCCHH-HHHHH
T ss_pred             CCCHHHHHHHHHHHH--------------------------hCCCchHHHHHHHhc------CCccCCCHHHHH-HHHHh
Confidence            457999999887732                          233334556666663      223345666665 67889


Q ss_pred             cCeeEEEEcCCccccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEEe
Q 003149          788 YNIVLMHLSQQQCFTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVLD  834 (844)
Q Consensus       788 ~~~~v~~~~~~~~~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~~  834 (844)
                      |+...-.+.....  -|  .    ..  +-| +|++++.+||+.|.-
T Consensus        57 ~gl~~~~~~~~~~--~l--~----~~--~~P-~I~~~~~~h~vVi~~   92 (131)
T PF03412_consen   57 YGLKAKAVKLNFE--KL--K----RL--PLP-AIAHLKDGHFVVIYK   92 (131)
T ss_dssp             TTEEEEEEE--GG--GC--T----CG--GSS-EEEEECCCEEEEEEE
T ss_pred             cccceeeeecchh--hh--h----hc--ccc-EEEEecCcceEEEEe
Confidence            9987777765433  11  1    11  122 334446689998774


No 45 
>KOG4825 consensus Component of synaptic membrane glycine-, glutamate- and thienylcyclohexylpiperidine-binding glycoprotein (43kDa) [Signal transduction mechanisms]
Probab=25.52  E-value=1e+02  Score=34.24  Aligned_cols=36  Identities=22%  Similarity=0.133  Sum_probs=30.1

Q ss_pred             CCCCcccccCCCCCCcCCCCCcccCCCccceeeecc
Q 003149          616 ASTSLVELEVDGFPLSKLGTSTYQDPSELQYVLSVQ  651 (844)
Q Consensus       616 ~~~~~~~~~~Kg~p~~~~~~st~r~ps~~e~~~~~~  651 (844)
                      .++.+.+..+.|+|..-..-++++.||.||-.+++|
T Consensus       279 pmpSlpqleepgrenqfaepflqekpsswelpIrPq  314 (666)
T KOG4825|consen  279 PMPSLPQLEEPGRENQFAEPFLQEKPSSWELPIRPQ  314 (666)
T ss_pred             CCCccccccCCCCccccccchhhcCCCcceeecccc
Confidence            344457788899998878889999999999888887


No 46 
>KOG0030 consensus Myosin essential light chain, EF-Hand protein superfamily [Cytoskeleton]
Probab=25.51  E-value=28  Score=32.23  Aligned_cols=86  Identities=12%  Similarity=0.168  Sum_probs=48.5

Q ss_pred             cccccccCCCCCchHHH---HHHhhcCCCcc---HHHHHHHHHHHHHhhhh---hHHHhh------CCHHHHHHHHhhcC
Q 003149          699 IHDVQDVIADGHCGFRV---VAELMDIGEDN---WAQVRRDLVDELQSHYD---DYIQLY------GDAEIARELLHSLS  763 (844)
Q Consensus       699 ~~~~~~v~~dg~Cgfra---ia~~lg~~~~~---~~~vr~~l~~el~~~~~---~y~~~~------~~~~~~~~~~~~l~  763 (844)
                      |...+|..|||-=+++-   +.++||.++-.   +..+++---.|+...+-   .+.+++      .....|++++.+|.
T Consensus        16 ~F~lfD~~gD~ki~~~q~gdvlRalG~nPT~aeV~k~l~~~~~~~~~~~rl~FE~fLpm~q~vaknk~q~t~edfvegLr   95 (152)
T KOG0030|consen   16 AFLLFDRTGDGKISGSQVGDVLRALGQNPTNAEVLKVLGQPKRREMNVKRLDFEEFLPMYQQVAKNKDQGTYEDFVEGLR   95 (152)
T ss_pred             HHHHHhccCcccccHHHHHHHHHHhcCCCcHHHHHHHHcCcccchhhhhhhhHHHHHHHHHHHHhccccCcHHHHHHHHH
Confidence            34567888998655554   45677887632   23333333333333443   445554      12235999999998


Q ss_pred             CCCCCCCccccccCCCchhhhhc
Q 003149          764 YSESNPGIEHRMIMPDTGHLIAS  786 (844)
Q Consensus       764 ~~~~~~~~~~w~~~~~~~~~~a~  786 (844)
                      +.++  ....|+.-...-|+|++
T Consensus        96 vFDk--eg~G~i~~aeLRhvLtt  116 (152)
T KOG0030|consen   96 VFDK--EGNGTIMGAELRHVLTT  116 (152)
T ss_pred             hhcc--cCCcceeHHHHHHHHHH
Confidence            7664  23345555555566654


No 47 
>PRK08561 rps15p 30S ribosomal protein S15P; Reviewed
Probab=23.34  E-value=3e+02  Score=26.11  Aligned_cols=31  Identities=23%  Similarity=0.298  Sum_probs=26.7

Q ss_pred             CHHHHHHHHHHhhCCCChHHHHHHHHhcCCC
Q 003149          154 TEQEANILVDLSRSNISPKEILQTLKQRDMH  184 (844)
Q Consensus       154 ~~~~~~~i~~l~~~~~~p~~I~~~l~~~~~~  184 (844)
                      .++..+.|..|.+.|.+|.+|-..|+++++.
T Consensus        30 ~eeve~~I~~lakkG~~pSqIG~~LRD~~gi   60 (151)
T PRK08561         30 PEEIEELVVELAKQGYSPSMIGIILRDQYGI   60 (151)
T ss_pred             HHHHHHHHHHHHHCCCCHHHhhhhHhhccCC
Confidence            4566788889999999999999999999853


No 48 
>TIGR03277 methan_mark_9 putative methanogenesis marker domain 9. A gene for a protein that contains a copy of this domain, to date, is found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. A 69-amino acid core region of this 110-amino acid domain contains eight invariant Cys residues, including two copies of a motif [WFY]CCxxKPC. These motifs could be consistent with predicted metal-binding transcription factor as was suggested for the COG4008 family. Some members of this family have an additional N-terminal domain of about 250 amino acids from the nifR3 family of predicted TIM-barrel proteins.
Probab=21.59  E-value=69  Score=28.06  Aligned_cols=32  Identities=16%  Similarity=0.481  Sum_probs=28.1

Q ss_pred             CCchHH-HHHHhhcCCCccHHHHHHHHHHHHHh
Q 003149          709 GHCGFR-VVAELMDIGEDNWAQVRRDLVDELQS  740 (844)
Q Consensus       709 g~Cgfr-aia~~lg~~~~~~~~vr~~l~~el~~  740 (844)
                      -.|-|| ..-..+|++.+.+..+.+++.+||..
T Consensus        76 KPCplrd~aL~~igls~~EYm~lKkelae~i~~  108 (109)
T TIGR03277        76 KPCPLRDSALQRIGMSPEEYMELKKKLAEELLK  108 (109)
T ss_pred             CCCcCchHHHHHcCCCHHHHHHHHHHHHHHHhc
Confidence            358889 68889999999999999999999864


No 49 
>PF13082 DUF3931:  Protein of unknown function (DUF3931)
Probab=21.26  E-value=1.8e+02  Score=21.75  Aligned_cols=42  Identities=24%  Similarity=0.307  Sum_probs=25.5

Q ss_pred             CcEEEEeccccC-CCC------------CCeeEEEEEecCCCCceeeEEEeeccC
Q 003149          253 PSVVMIDCTYKT-SMY------------PFSFLEIVGATSTELTFSIAFAYLESE  294 (844)
Q Consensus       253 ~~vl~iD~T~~t-n~~------------~~~l~~~~g~~~~~~~~~~a~al~~~E  294 (844)
                      .+|+.||+--+. ..|            .+.-+++.|-+..|+...+...+..+|
T Consensus         8 cnvisidgkkkksdtysypklvvenktyefssfvlcgetpdgrrlvlthmistde   62 (66)
T PF13082_consen    8 CNVISIDGKKKKSDTYSYPKLVVENKTYEFSSFVLCGETPDGRRLVLTHMISTDE   62 (66)
T ss_pred             ccEEEeccccccCCcccCceEEEeCceEEEEEEEEEccCCCCcEEEEEEEecchh
Confidence            467788875543 223            344455667777777777776665544


No 50 
>PRK14702 insertion element IS2 transposase InsD; Provisional
Probab=20.74  E-value=3.2e+02  Score=28.62  Aligned_cols=72  Identities=7%  Similarity=-0.121  Sum_probs=47.2

Q ss_pred             cEEEEeccccCCCCCCeeEEEEEecCCCCceeeEEEeecc-CccchHHHHHHH-HHHhhc--CCCCCeEEEeeccHH
Q 003149          254 SVVMIDCTYKTSMYPFSFLEIVGATSTELTFSIAFAYLES-ERDDNYIWTLER-LRSMME--DDALPRVIVTDKDLA  326 (844)
Q Consensus       254 ~vl~iD~T~~tn~~~~~l~~~~g~~~~~~~~~~a~al~~~-E~~es~~w~l~~-lk~~~~--~~~~P~~iitD~~~a  326 (844)
                      .++..|-||-....+.-++..+.+|...+ .++||.+... .+.+....+|+. +.....  ....|.+|.||+...
T Consensus        88 ~~W~~DiT~~~~~~g~~~Yl~~viD~~sR-~ivg~~is~~~~~~~~v~~~l~~A~~~~~~~~~~~~~~iihSD~Gsq  163 (262)
T PRK14702         88 QRWCSDGFEFCCDNGERLRVTFALDCCDR-EALHWAVTTGGFNSETVQDVMLGAVERRFGNDLPSSPVEWLTDNGSC  163 (262)
T ss_pred             CEEEeeeEEEEecCCcEEEEEEEEecccc-eeeeEEeccCcCCHHHHHHHHHHHHHHHhcccCCCCCeEEEcCCCcc
Confidence            78999988865544446777778887665 7789988764 455555555543 333221  133577899998853


No 51 
>KOG0400 consensus 40S ribosomal protein S13 [Translation, ribosomal structure and biogenesis]
Probab=20.49  E-value=5.1e+02  Score=23.78  Aligned_cols=102  Identities=14%  Similarity=0.208  Sum_probs=59.9

Q ss_pred             CCHHHHHHHHHHhhCCCChHHHHHHHHhcCCCc---cchHHHHHHHHHHhhhccccCcchHHHHHHH-------HHhcCc
Q 003149          153 LTEQEANILVDLSRSNISPKEILQTLKQRDMHN---VSTIKAIYNARHKYRVGEQVGQLHMHQLLDK-------LRKHGY  222 (844)
Q Consensus       153 l~~~~~~~i~~l~~~~~~p~~I~~~l~~~~~~~---~~t~kdi~n~~~~~r~~~~~g~~~~~~ll~~-------l~e~~~  222 (844)
                      ..++.++.|..+.+.|++|.+|--.|++.++..   ..+-..|.+..++--.... =-.|+..|+..       |+.+- 
T Consensus        29 ~~ddvkeqI~K~akKGltpsqIGviLRDshGi~q~r~v~G~kI~Rilk~~Gl~Pe-iPeDLy~likkAv~iRkHLer~R-  106 (151)
T KOG0400|consen   29 TADDVKEQIYKLAKKGLTPSQIGVILRDSHGIGQVRFVTGNKILRILKSNGLAPE-IPEDLYHLIKKAVAIRKHLERNR-  106 (151)
T ss_pred             CHHHHHHHHHHHHHcCCChhHceeeeecccCcchhheechhHHHHHHHHcCCCCC-CcHHHHHHHHHHHHHHHHHHHhc-
Confidence            456778999999999999999999999877532   2344555555444322111 01144444443       22111 


Q ss_pred             EEEEEeeccCCceeeeEecChHHHHHHHhCCcEEEEecccc
Q 003149          223 IEWHRYNEETDCFKDLFWAHPFAVGLLRAFPSVVMIDCTYK  263 (844)
Q Consensus       223 ~~~~~~~d~~~~~~~if~~~~~~~~~~~~~~~vl~iD~T~~  263 (844)
                            .|.+..| ++++....+..+.++|..+..+-.+++
T Consensus       107 ------KD~d~K~-RLILveSRihRlARYYk~~~~lPp~WK  140 (151)
T KOG0400|consen  107 ------KDKDAKF-RLILVESRIHRLARYYKTKMVLPPNWK  140 (151)
T ss_pred             ------cccccce-EEEeehHHHHHHHHHHHhcccCCCCCC
Confidence                  2333333 455566778888888776655554443


Done!