Query 003149
Match_columns 844
No_of_seqs 451 out of 2097
Neff 8.8
Searched_HMMs 46136
Date Thu Mar 28 17:57:57 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/003149.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/003149hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PLN03097 FHY3 Protein FAR-RED 100.0 1.2E-79 2.6E-84 712.8 47.2 509 31-569 69-627 (846)
2 PF10551 MULE: MULE transposas 99.8 3.8E-20 8.3E-25 162.6 8.7 90 260-352 1-93 (93)
3 PF00872 Transposase_mut: Tran 99.7 3.8E-18 8.1E-23 188.3 5.0 248 152-448 98-352 (381)
4 PF03101 FAR1: FAR1 DNA-bindin 99.6 3.6E-16 7.8E-21 136.6 7.7 90 50-142 1-91 (91)
5 PF02338 OTU: OTU-like cystein 99.6 9.2E-16 2E-20 141.9 6.0 107 706-829 1-121 (121)
6 PF08731 AFT: Transcription fa 99.4 7.2E-13 1.6E-17 113.8 10.6 93 42-140 1-111 (111)
7 KOG2606 OTU (ovarian tumor)-li 99.3 2.2E-12 4.7E-17 129.2 6.5 133 690-834 149-297 (302)
8 COG3328 Transposase and inacti 99.2 3.6E-10 7.9E-15 122.2 16.5 244 154-449 86-332 (379)
9 PF03108 DBD_Tnp_Mut: MuDR fam 98.9 7.1E-09 1.5E-13 84.4 7.4 66 34-129 2-67 (67)
10 smart00575 ZnF_PMZ plant mutat 98.4 8.7E-08 1.9E-12 62.5 1.8 26 522-547 1-26 (28)
11 KOG2605 OTU (ovarian tumor)-li 98.0 2.1E-06 4.7E-11 92.2 0.8 132 692-834 210-343 (371)
12 KOG3288 OTU-like cysteine prot 97.7 0.0001 2.3E-09 72.5 7.7 121 703-838 113-236 (307)
13 PF04434 SWIM: SWIM zinc finge 97.5 4.9E-05 1.1E-09 54.7 2.0 30 517-546 10-39 (40)
14 PF10275 Peptidase_C65: Peptid 97.5 0.00018 4E-09 74.9 6.7 52 782-834 190-244 (244)
15 KOG3991 Uncharacterized conser 97.0 0.0021 4.5E-08 62.8 7.1 88 736-835 165-256 (256)
16 PF06782 UPF0236: Uncharacteri 96.4 0.32 6.9E-06 55.7 20.9 245 150-449 115-381 (470)
17 PF03106 WRKY: WRKY DNA -bindi 95.4 0.049 1.1E-06 42.7 6.1 57 59-139 3-59 (60)
18 PF05412 Peptidase_C33: Equine 94.6 0.089 1.9E-06 45.3 5.8 88 704-839 3-90 (108)
19 PF01610 DDE_Tnp_ISL3: Transpo 94.5 0.046 1E-06 57.2 5.1 68 288-357 30-98 (249)
20 PF04684 BAF1_ABF1: BAF1 / ABF 93.2 0.62 1.3E-05 51.0 10.4 46 35-85 21-66 (496)
21 smart00774 WRKY DNA binding do 92.3 0.27 5.9E-06 38.2 4.6 56 60-138 4-59 (59)
22 COG5539 Predicted cysteine pro 92.2 0.044 9.6E-07 55.9 0.3 109 684-799 155-273 (306)
23 PF13610 DDE_Tnp_IS240: DDE do 90.7 0.17 3.6E-06 47.8 2.4 81 253-338 1-81 (140)
24 PF04500 FLYWCH: FLYWCH zinc f 89.4 1 2.2E-05 35.4 5.7 49 59-138 14-62 (62)
25 PF00665 rve: Integrase core d 79.4 8.7 0.00019 34.5 7.8 76 253-330 6-82 (120)
26 PF15299 ALS2CR8: Amyotrophic 78.1 14 0.00031 37.8 9.6 98 101-200 69-221 (225)
27 PHA02517 putative transposase 76.4 22 0.00047 37.7 10.9 69 253-326 110-180 (277)
28 COG5539 Predicted cysteine pro 61.1 8.6 0.00019 39.8 3.5 112 707-834 119-231 (306)
29 COG4279 Uncharacterized conser 59.2 4.2 9E-05 41.2 0.9 30 521-553 124-153 (266)
30 PF04937 DUF659: Protein of un 58.8 83 0.0018 30.0 9.7 106 250-358 30-139 (153)
31 KOG4345 NF-kappa B regulator A 56.8 7.3 0.00016 45.0 2.4 52 780-833 225-290 (774)
32 COG3316 Transposase and inacti 53.2 32 0.00069 34.6 5.9 81 253-340 70-151 (215)
33 COG3464 Transposase and inacti 49.6 68 0.0015 36.1 8.6 56 289-349 183-238 (402)
34 PF08069 Ribosomal_S13_N: Ribo 48.4 28 0.00061 27.3 3.7 29 155-183 31-59 (60)
35 PF13936 HTH_38: Helix-turn-he 47.8 25 0.00053 25.6 3.2 30 151-180 3-32 (44)
36 PRK09784 hypothetical protein; 45.1 13 0.00029 37.0 1.9 36 697-733 197-232 (417)
37 PF03050 DDE_Tnp_IS66: Transpo 44.5 16 0.00036 38.5 2.7 85 253-357 67-156 (271)
38 PRK13907 rnhA ribonuclease H; 44.5 1.8E+02 0.0039 26.4 9.4 78 255-335 3-81 (128)
39 COG5431 Uncharacterized metal- 34.8 15 0.00032 31.8 0.4 25 521-545 49-78 (117)
40 KOG4540 Putative lipase essent 32.8 1.3E+02 0.0027 31.6 6.6 57 732-797 246-306 (425)
41 COG5153 CVT17 Putative lipase 32.8 1.3E+02 0.0027 31.6 6.6 57 732-797 246-306 (425)
42 PF09921 DUF2153: Uncharacteri 28.4 38 0.00083 30.5 1.8 48 725-778 3-56 (126)
43 PF09607 BrkDBD: Brinker DNA-b 26.7 47 0.001 25.8 1.8 18 707-724 20-39 (58)
44 PF03412 Peptidase_C39: Peptid 25.7 3.2E+02 0.007 24.6 7.8 83 708-834 10-92 (131)
45 KOG4825 Component of synaptic 25.5 1E+02 0.0022 34.2 4.7 36 616-651 279-314 (666)
46 KOG0030 Myosin essential light 25.5 28 0.00061 32.2 0.5 86 699-786 16-116 (152)
47 PRK08561 rps15p 30S ribosomal 23.3 3E+02 0.0064 26.1 6.7 31 154-184 30-60 (151)
48 TIGR03277 methan_mark_9 putati 21.6 69 0.0015 28.1 2.1 32 709-740 76-108 (109)
49 PF13082 DUF3931: Protein of u 21.3 1.8E+02 0.0039 21.8 3.8 42 253-294 8-62 (66)
50 PRK14702 insertion element IS2 20.7 3.2E+02 0.007 28.6 7.5 72 254-326 88-163 (262)
51 KOG0400 40S ribosomal protein 20.5 5.1E+02 0.011 23.8 7.2 102 153-263 29-140 (151)
No 1
>PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional
Probab=100.00 E-value=1.2e-79 Score=712.75 Aligned_cols=509 Identities=19% Similarity=0.272 Sum_probs=410.9
Q ss_pred CCCCCCCCCcccCCHHHHHHHHHHHhhhcCeEEEEEeecCCC-CCCCceEEEEEecCCccCCCCCCCCCC---------C
Q 003149 31 DFSSAFTTDMVFNSREELVEWIRDTGKRNGLVIVIKKSDVGG-DGRRPRITFACERSGAYRRKYTEGQTP---------K 100 (844)
Q Consensus 31 ~~~~~~~~g~~F~S~de~~~~~~~ya~~~GF~v~~~~S~~~~-~g~~~~~~~vC~r~G~~r~~~~~~~~~---------~ 100 (844)
+....|.+||+|+|.|||++||+.||++.||+||+.+|.+++ ++.++.++|+|+|+|+.+.+.+..... .
T Consensus 69 ~~~~~P~vGMeF~S~eeA~~FYn~YA~~~GFsVRi~~srrsk~~~~ii~r~fvCsreG~~~~~~~~~~~~~~~~~k~~~~ 148 (846)
T PLN03097 69 DTNLEPLSGMEFESHGEAYSFYQEYARSMGFNTAIQNSRRSKTSREFIDAKFACSRYGTKREYDKSFNRPRARQTKQDPE 148 (846)
T ss_pred CCCccCcCCCeECCHHHHHHHHHHHHhhcCceEEeeceeccCCCCcEEEEEEEEcCCCCCcccccccccccccccccCcc
Confidence 444579999999999999999999999999999999888876 667788999999999765322100000 0
Q ss_pred -CCCCCCceeeCCccEEEEEeeCCCCCeEEEEEccccCCCCCCCcccCccccCCCHHHHHHHHHHhhCCCChHHHHHHHH
Q 003149 101 -RPKTTGTKKCGCPFLLKGHKLDTDDDWILKVVCGVHNHPVTQHVEGHSYAGRLTEQEANILVDLSRSNISPKEILQTLK 179 (844)
Q Consensus 101 -~rr~~~s~~~gCp~~i~~~~~~~~~~w~v~~~~~~HNH~~~~~~~~~~~~rrl~~~~~~~i~~l~~~~~~p~~I~~~l~ 179 (844)
.+++++.+|+||+|+|+++. ..+|+|.|+.+..+|||++.++... +...+.....+.. .+.
T Consensus 149 ~~~~rR~~tRtGC~A~m~Vk~-~~~gkW~V~~fv~eHNH~L~p~~~~-------~~~~r~~~~~~~~----------~~~ 210 (846)
T PLN03097 149 NGTGRRSCAKTDCKASMHVKR-RPDGKWVIHSFVKEHNHELLPAQAV-------SEQTRKMYAAMAR----------QFA 210 (846)
T ss_pred cccccccccCCCCceEEEEEE-cCCCeEEEEEEecCCCCCCCCcccc-------chhhhhhHHHHHh----------hhh
Confidence 01234467899999999965 4679999999999999999976431 1121221111110 000
Q ss_pred hcCCCcc-chHHHHHHHHHHhhhccccCcchHHHHHHHHH----hcCcEEEEEeeccCCceeeeEecChHHHHHHHhCCc
Q 003149 180 QRDMHNV-STIKAIYNARHKYRVGEQVGQLHMHQLLDKLR----KHGYIEWHRYNEETDCFKDLFWAHPFAVGLLRAFPS 254 (844)
Q Consensus 180 ~~~~~~~-~t~kdi~n~~~~~r~~~~~g~~~~~~ll~~l~----e~~~~~~~~~~d~~~~~~~if~~~~~~~~~~~~~~~ 254 (844)
. + .++ .+.++..|...+.+.... ...+++.++++|+ +||.|+|++++|+++++.+|||+++.++..|.+|||
T Consensus 211 ~-~-~~v~~~~~d~~~~~~~~r~~~~-~~gD~~~ll~yf~~~q~~nP~Ffy~~qlDe~~~l~niFWaD~~sr~~Y~~FGD 287 (846)
T PLN03097 211 E-Y-KNVVGLKNDSKSSFDKGRNLGL-EAGDTKILLDFFTQMQNMNSNFFYAVDLGEDQRLKNLFWVDAKSRHDYGNFSD 287 (846)
T ss_pred c-c-ccccccchhhcchhhHHHhhhc-ccchHHHHHHHHHHHHhhCCCceEEEEEccCCCeeeEEeccHHHHHHHHhcCC
Confidence 0 0 111 233444444444443333 2348999999985 599999999999999999999999999999999999
Q ss_pred EEEEeccccCCCCCCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHHHHHh
Q 003149 255 VVMIDCTYKTSMYPFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNSIRAV 334 (844)
Q Consensus 255 vl~iD~T~~tn~~~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~Ai~~v 334 (844)
||++|+||.||+|++||..|+|+|+|++++++|+||+.+|+.++|.|+|++|+++| ++..|++||||+|.+|.+||.+|
T Consensus 288 vV~fDTTY~tN~y~~Pfa~FvGvNhH~qtvlfGcaLl~dEt~eSf~WLf~tfl~aM-~gk~P~tIiTDqd~am~~AI~~V 366 (846)
T PLN03097 288 VVSFDTTYVRNKYKMPLALFVGVNQHYQFMLLGCALISDESAATYSWLMQTWLRAM-GGQAPKVIITDQDKAMKSVISEV 366 (846)
T ss_pred EEEEeceeeccccCcEEEEEEEecCCCCeEEEEEEEcccCchhhHHHHHHHHHHHh-CCCCCceEEecCCHHHHHHHHHH
Confidence 99999999999999999999999999999999999999999999999999999999 88999999999999999999999
Q ss_pred CCCcccchhhhhHHHhHHHhhhhhhhhhhHHHHHHHHhhhc-ccCcCHHHHHHHHHHHHhhccCchhHHHHHHHhhhcch
Q 003149 335 FPRATNLLCRWHISKNISVNCKKLFETKERWEAFICSWNVL-VLSVTEQEYMQHLGAMESDFSRYPQAIDYVKQTWLANY 413 (844)
Q Consensus 335 fP~~~~~lC~~Hi~kn~~~~~~~~~~~~~~~~~~~~~~~~l-~~s~t~~~f~~~~~~l~~~~~~~~~~~~y~~~~Wl~~~ 413 (844)
||++.|++|.|||++|+.+++...+.. .+.|...|..+ ..+.++++|+..|..|.+.+.-.. -+++..-| ..
T Consensus 367 fP~t~Hr~C~wHI~~~~~e~L~~~~~~---~~~f~~~f~~cv~~s~t~eEFE~~W~~mi~ky~L~~--n~WL~~LY--~~ 439 (846)
T PLN03097 367 FPNAHHCFFLWHILGKVSENLGQVIKQ---HENFMAKFEKCIYRSWTEEEFGKRWWKILDRFELKE--DEWMQSLY--ED 439 (846)
T ss_pred CCCceehhhHHHHHHHHHHHhhHHhhh---hhHHHHHHHHHHhCCCCHHHHHHHHHHHHHhhcccc--cHHHHHHH--Hh
Confidence 999999999999999999999877643 34677777665 458999999999999988763211 13333333 47
Q ss_pred hhhhHhhhhcccccCCCcccccccchhhHHHHhhcCCCCChhhHHHHHHHHHHH-HHHHHHHHhhhccceeccccchhHH
Q 003149 414 KEKFVAAWTDLAMHFGNVTMNRGETTHTKLKRLLAVPQGNFETSWAKVHSLLEQ-QHYEIKASFERSSTIVQHNFKVPIF 492 (844)
Q Consensus 414 ke~w~~~~~~~~~~~g~~T~n~~Es~n~~LK~~l~~~~~~l~~~~~~i~~~i~~-~~~e~~~~~~~~~~~~~~~~~~~~~ 492 (844)
|++||++|+++.|.+|+.||+++||+|+.||+++.. ..+|..|+.+++.+++. +.+|..+++......+......|+.
T Consensus 440 RekWapaY~k~~F~agm~sTqRSES~Ns~fk~yv~~-~tsL~~Fv~qye~~l~~~~ekE~~aD~~s~~~~P~l~t~~piE 518 (846)
T PLN03097 440 RKQWVPTYMRDAFLAGMSTVQRSESINAFFDKYVHK-KTTVQEFVKQYETILQDRYEEEAKADSDTWNKQPALKSPSPLE 518 (846)
T ss_pred HhhhhHHHhcccccCCcccccccccHHHHHHHHhCc-CCCHHHHHHHHHHHHHHHHHHHHHhhhhcccCCcccccccHHH
Confidence 999999999999999999999999999999999975 67899999999999985 5577888888766655555667899
Q ss_pred HHhhhhhcHHHHHHhhcccccc---------------------------cccC----CcCCCCCccccccCCccchhHhh
Q 003149 493 EELRGFVSLNAMNIILGESERA---------------------------DSVG----LNASACGCVFRRTHGLPCAHEIA 541 (844)
Q Consensus 493 ~~l~~~is~~a~~~~~~e~~~~---------------------------~~V~----~~~~~CsC~~~~~~GlPC~H~la 541 (844)
+++.+.+|+.+|..++.|+... ..|. ....+|+|++|+..||||+|+|+
T Consensus 519 kQAs~iYT~~iF~kFQ~El~~~~~~~~~~~~~dg~~~~y~V~~~~~~~~~~V~~d~~~~~v~CsC~kFE~~GILCrHaLk 598 (846)
T PLN03097 519 KSVSGVYTHAVFKKFQVEVLGAVACHPKMESQDETSITFRVQDFEKNQDFTVTWNQTKLEVSCICRLFEYKGYLCRHALV 598 (846)
T ss_pred HHHHHHhHHHHHHHHHHHHHHhhheEEeeeccCCceEEEEEEEecCCCcEEEEEecCCCeEEeeccCeecCccchhhHHH
Confidence 9999999999999998875421 0111 13569999999999999999999
Q ss_pred HHhhcC-CCCCcccccccccccccCCCcc
Q 003149 542 EYKHER-RSIPLLAVDRHWKKLDFVPVTQ 569 (844)
Q Consensus 542 v~~~~~-~~i~~~~i~~rW~~~~~~~~~~ 569 (844)
|+.+.+ ..||..||++|||+.+......
T Consensus 599 VL~~~~v~~IP~~YILkRWTKdAK~~~~~ 627 (846)
T PLN03097 599 VLQMCQLSAIPSQYILKRWTKDAKSRHLL 627 (846)
T ss_pred HHhhcCcccCchhhhhhhchhhhhhcccC
Confidence 999999 8999999999999998876543
No 2
>PF10551 MULE: MULE transposase domain; InterPro: IPR018289 This entry represents a domain found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif. This domain is also found in a transposase for the insertion sequence element IS256 in transposon Tn4001.
Probab=99.81 E-value=3.8e-20 Score=162.65 Aligned_cols=90 Identities=43% Similarity=0.744 Sum_probs=85.3
Q ss_pred ccccCCCCCCeeEE---EEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHHHHHhCC
Q 003149 260 CTYKTSMYPFSFLE---IVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNSIRAVFP 336 (844)
Q Consensus 260 ~T~~tn~~~~~l~~---~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~Ai~~vfP 336 (844)
+||+||+| ++++. ++|+|++|+.+|+||+++.+|+.++|.|+|+.+++.+. .. |.+||+|++.++.+|++++||
T Consensus 1 ~T~~tn~~-~~l~~~~~~~~~d~~~~~~~v~~~l~~~e~~~~~~~~l~~~~~~~~-~~-p~~ii~D~~~~~~~Ai~~vfP 77 (93)
T PF10551_consen 1 GTYKTNKY-GPLLYLMIAVGIDGNGRGFPVAFALVSSESEESYEWFLEKLKEAMP-QK-PKVIISDFDKALINAIKEVFP 77 (93)
T ss_pred Cccccccc-cccceeceEEEEcCCCCEEEEEEEEEcCCChhhhHHHHHHhhhccc-cC-ceeeeccccHHHHHHHHHHCC
Confidence 69999999 88885 99999999999999999999999999999999999983 35 999999999999999999999
Q ss_pred CcccchhhhhHHHhHH
Q 003149 337 RATNLLCRWHISKNIS 352 (844)
Q Consensus 337 ~~~~~lC~~Hi~kn~~ 352 (844)
++.|++|.||+.||++
T Consensus 78 ~~~~~~C~~H~~~n~k 93 (93)
T PF10551_consen 78 DARHQLCLFHILRNIK 93 (93)
T ss_pred CceEehhHHHHHHhhC
Confidence 9999999999999974
No 3
>PF00872 Transposase_mut: Transposase, Mutator family; InterPro: IPR001207 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. The mutator family of transposases consists of a number of elements that include mutator from maize, IsT2 from Thiobacillus ferrooxidans, Is256 from Staphylococcus aureus, Is1201 from Lactobacillus helveticus, Is1081 from Mycobacterium bovis, IsRm3 from Rhizobium meliloti and others. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=99.71 E-value=3.8e-18 Score=188.35 Aligned_cols=248 Identities=16% Similarity=0.172 Sum_probs=192.1
Q ss_pred CCCHHHHHHHHHHhhCCCChHHHHHHHHhcCCCccchHHHHHHHHHHhhhccccCcchHHHHHHHHHhcCcEEEEEeecc
Q 003149 152 RLTEQEANILVDLSRSNISPKEILQTLKQRDMHNVSTIKAIYNARHKYRVGEQVGQLHMHQLLDKLRKHGYIEWHRYNEE 231 (844)
Q Consensus 152 rl~~~~~~~i~~l~~~~~~p~~I~~~l~~~~~~~~~t~kdi~n~~~~~r~~~~~g~~~~~~ll~~l~e~~~~~~~~~~d~ 231 (844)
+..+.....|..|.-.|++.++|...+..-++..-.+...|.+....+... +..++.
T Consensus 98 r~~~~l~~~i~~ly~~G~Str~i~~~l~~l~g~~~~S~s~vSri~~~~~~~-----------~~~w~~------------ 154 (381)
T PF00872_consen 98 RREDSLEELIISLYLKGVSTRDIEEALEELYGEVAVSKSTVSRITKQLDEE-----------VEAWRN------------ 154 (381)
T ss_pred hhhhhhhhhhhhhhccccccccccchhhhhhcccccCchhhhhhhhhhhhh-----------HHHHhh------------
Confidence 446666777888889999999999999888763323444554444333221 111111
Q ss_pred CCceeeeEecChHHHHHHHhC-CcEEEEeccccCCCC-----CCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHH
Q 003149 232 TDCFKDLFWAHPFAVGLLRAF-PSVVMIDCTYKTSMY-----PFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLER 305 (844)
Q Consensus 232 ~~~~~~if~~~~~~~~~~~~~-~~vl~iD~T~~tn~~-----~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~ 305 (844)
.-+... -.+|++|++|-.-+. +..+++++|++.+|+-.++|+.+...|+..+|.-+|+.
T Consensus 155 ---------------R~L~~~~y~~l~iD~~~~kvr~~~~~~~~~~~v~iGi~~dG~r~vLg~~~~~~Es~~~W~~~l~~ 219 (381)
T PF00872_consen 155 ---------------RPLESEPYPYLWIDGTYFKVREDGRVVKKAVYVAIGIDEDGRREVLGFWVGDRESAASWREFLQD 219 (381)
T ss_pred ---------------hccccccccceeeeeeecccccccccccchhhhhhhhhcccccceeeeecccCCccCEeeecchh
Confidence 111122 257899999986552 46789999999999999999999999999999999999
Q ss_pred HHHhhcCCCCCeEEEeeccHHHHHHHHHhCCCcccchhhhhHHHhHHHhhhhhhhhhhHHHHHHHHhhhcccCcCHHHHH
Q 003149 306 LRSMMEDDALPRVIVTDKDLALMNSIRAVFPRATNLLCRWHISKNISVNCKKLFETKERWEAFICSWNVLVLSVTEQEYM 385 (844)
Q Consensus 306 lk~~~~~~~~P~~iitD~~~al~~Ai~~vfP~~~~~lC~~Hi~kn~~~~~~~~~~~~~~~~~~~~~~~~l~~s~t~~~f~ 385 (844)
|++. |...|..||+|..+++.+||.++||++.++.|++|+++|+.+++.+. ..+.+...++.+..+++.++..
T Consensus 220 L~~R--Gl~~~~lvv~Dg~~gl~~ai~~~fp~a~~QrC~vH~~RNv~~~v~~k-----~~~~v~~~Lk~I~~a~~~e~a~ 292 (381)
T PF00872_consen 220 LKER--GLKDILLVVSDGHKGLKEAIREVFPGAKWQRCVVHLMRNVLRKVPKK-----DRKEVKADLKAIYQAPDKEEAR 292 (381)
T ss_pred hhhc--cccccceeeccccccccccccccccchhhhhheechhhhhccccccc-----cchhhhhhccccccccccchhh
Confidence 9998 67789999999999999999999999999999999999999987542 3467888899999999999999
Q ss_pred HHHHHHHhhc-cCchhHHHHHHHhhhcchhhhhHhhhhcccccCCCcccccccchhhHHHHhhc
Q 003149 386 QHLGAMESDF-SRYPQAIDYVKQTWLANYKEKFVAAWTDLAMHFGNVTMNRGETTHTKLKRLLA 448 (844)
Q Consensus 386 ~~~~~l~~~~-~~~~~~~~y~~~~Wl~~~ke~w~~~~~~~~~~~g~~T~n~~Es~n~~LK~~l~ 448 (844)
+.++.+.+.| ..+|.+..++...|-. .|.-.-++...+--++|||.+|++|+.+|+...
T Consensus 293 ~~l~~f~~~~~~kyp~~~~~l~~~~~~----~~tf~~fP~~~~~~i~TTN~iEsln~~irrr~~ 352 (381)
T PF00872_consen 293 EALEEFAEKWEKKYPKAAKSLEENWDE----LLTFLDFPPEHRRSIRTTNAIESLNKEIRRRTK 352 (381)
T ss_pred hhhhhcccccccccchhhhhhhhcccc----ccceeeecchhccccchhhhccccccchhhhcc
Confidence 9999999887 5689999998887741 222111233333456899999999999998665
No 4
>PF03101 FAR1: FAR1 DNA-binding domain; InterPro: IPR004330 Phytochrome A is the primary photoreceptor for mediating various far-red light-induced responses in higher plants. It has been found that the proteins governing this response, which include FAR-RED ELONGATED HYPOCOTYL3 (FHY3) and FAR-RED-IMPAIRED RESPONSE1 (FAR1), are a pair of homologous proteins sharing significant sequence homology to mutator-like transposases. These proteins appear to be novel transcription factors, which are essential for activating the expression of FHY1 and FHL (for FHY1-like) and related genes, whose products are required for light-induced phytochrome A nuclear accumulation and subsequent light responses in plants. The FRS (FAR1 Related Sequences) family of proteins shares a similar domain structure to mutator-like transposases, including an N-terminal C2H2 zinc finger domain, a central putative core transposase domain, and a C-terminal SWIM motif (named after SWI2/SNF and MuDR transposases). It seems plausible that the FRS family represents transcription factors derived from mutator-like transposases. This entry represents a domain found in FAR1 and FRS proteins. It contains a WRKY-like fold and is therefore most likely a zinc-binding DNA-binding domain.
Probab=99.64 E-value=3.6e-16 Score=136.56 Aligned_cols=90 Identities=27% Similarity=0.539 Sum_probs=77.7
Q ss_pred HHHHHHhhhcCeEEEEEeecCC-CCCCCceEEEEEecCCccCCCCCCCCCCCCCCCCCceeeCCccEEEEEeeCCCCCeE
Q 003149 50 EWIRDTGKRNGLVIVIKKSDVG-GDGRRPRITFACERSGAYRRKYTEGQTPKRPKTTGTKKCGCPFLLKGHKLDTDDDWI 128 (844)
Q Consensus 50 ~~~~~ya~~~GF~v~~~~S~~~-~~g~~~~~~~vC~r~G~~r~~~~~~~~~~~rr~~~s~~~gCp~~i~~~~~~~~~~w~ 128 (844)
+||+.||..+||+|++.+|.+. ++|...+++|+|+++|.++.+... ....++.+.+.+|||||+|.++... ++.|.
T Consensus 1 ~fy~~yA~~~GF~vr~~~s~~~~~~~~~~~~~~~C~r~G~~~~~~~~--~~~~~r~~~s~ktgC~a~i~v~~~~-~~~w~ 77 (91)
T PF03101_consen 1 DFYNSYARRHGFSVRKSSSRKSKKNGEIKRVTFVCSRGGKYKSKKKN--EEKRRRNRPSKKTGCKARINVKRRK-DGKWR 77 (91)
T ss_pred CHHHHhcCcCCeEEEEeeeEeCCCCceEEEEEEEECCcccccccccc--cccccccccccccCCCEEEEEEEcc-CCEEE
Confidence 4899999999999999999887 477888999999999998865422 2345677889999999999998776 99999
Q ss_pred EEEEccccCCCCCC
Q 003149 129 LKVVCGVHNHPVTQ 142 (844)
Q Consensus 129 v~~~~~~HNH~~~~ 142 (844)
|+.+..+|||++.+
T Consensus 78 v~~~~~~HNH~L~P 91 (91)
T PF03101_consen 78 VTSFVLEHNHPLCP 91 (91)
T ss_pred EEECcCCcCCCCCC
Confidence 99999999999865
No 5
>PF02338 OTU: OTU-like cysteine protease; InterPro: IPR003323 This is a group of proteins found primarily in viruses, eukaryotes and in the pathogenic bacterium Chlamydia pneumoniae. In viruses they are annotated as replicase or RNA-dependent RNA polymerase. The eukaryotic sequences are related to the Ovarian Tumour (OTU) gene in Drosophila, cezanne deubiquitinating peptidase and tumor necrosis factor, alpha-induced protein 3 (MEROPS peptidase family C64) and otubain 1 and otubain 2 (MEROPS peptidase family C65). None of these proteins has a known biochemical function but low sequence similarity with the polyprotein regions of arteriviruses, and conserved cysteine and histidine, and possibly the aspartate, residues suggests that those not yet recognised as peptidases could possess cysteine protease activity.; PDB: 2VFJ_C 3DKB_F 3PHW_A 3PHU_B 3PHX_A 3BY4_A 3C0R_C 3PRM_C 3PRP_C 3ZRH_A ....
Probab=99.60 E-value=9.2e-16 Score=141.93 Aligned_cols=107 Identities=20% Similarity=0.293 Sum_probs=86.1
Q ss_pred CCCCCchHHHHHHhhc----CCCccHHHHHHHHHHHHH-hhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccccCCCc
Q 003149 706 IADGHCGFRVVAELMD----IGEDNWAQVRRDLVDELQ-SHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRMIMPDT 780 (844)
Q Consensus 706 ~~dg~Cgfraia~~lg----~~~~~~~~vr~~l~~el~-~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~~~ 780 (844)
+|||||+|||||.+|+ .+++.|..||++++++|+ .+++.|.+++.+. .+. ..+.|.+.+++
T Consensus 1 pgDGnClF~Avs~~l~~~~~~~~~~~~~lR~~~~~~l~~~~~~~~~~~~~~~--------~~~------~~~~Wg~~~el 66 (121)
T PF02338_consen 1 PGDGNCLFRAVSDQLYGDGGGSEDNHQELRKAVVDYLRDKNRDKFEEFLEGD--------KMS------KPGTWGGEIEL 66 (121)
T ss_dssp -SSTTHHHHHHHHHHCTT-SSSTTTHHHHHHHHHHHHHTHTTTHHHHHHHHH--------HHT------STTSHEEHHHH
T ss_pred CCCccHHHHHHHHHHHHhcCCCHHHHHHHHHHHHHHHHHhccchhhhhhhhh--------hhc------cccccCcHHHH
Confidence 6999999999999999 999999999999999999 9999999887433 343 56899999988
Q ss_pred hhhhhcccCeeEEEEcCCccc--cccCCCC--CCCCCCCCCcEEEEEeC-----CCcE
Q 003149 781 GHLIASRYNIVLMHLSQQQCF--TFLPLRS--VPLPRTSRKIVTIGFVN-----ECQF 829 (844)
Q Consensus 781 ~~~~a~~~~~~v~~~~~~~~~--~~~p~~~--~p~~~~~~~~i~l~~~~-----~~H~ 829 (844)
+++|+.|+|+|++|+..... .+++..+ +|... .++|+|+|.. ++||
T Consensus 67 -~a~a~~~~~~I~v~~~~~~~~~~~~~~~~~~~~~~~--~~~i~l~~~~~l~~~~~Hy 121 (121)
T PF02338_consen 67 -QALANVLNRPIIVYSSSDGDNVVFIKFTGKYPPLES--PPPICLCYHGHLYYTGNHY 121 (121)
T ss_dssp -HHHHHHHTSEEEEECETTTBEEEEEEESCEESTTTT--TTSEEEEEETEEEEETTEE
T ss_pred -HHHHHHhCCeEEEEEcCCCCccceeeecCccccCCC--CCeEEEEEcCCccCCCCCC
Confidence 69999999999998874333 3333332 23333 6889999997 8998
No 6
>PF08731 AFT: Transcription factor AFT; InterPro: IPR014842 AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This entry includes the paralogous pair of transcription factors AFT1 and AFT2.
Probab=99.44 E-value=7.2e-13 Score=113.75 Aligned_cols=93 Identities=32% Similarity=0.661 Sum_probs=79.0
Q ss_pred cCCHHHHHHHHHHHhhhcCeEEEEEeecCCCCCCCceEEEEEecCCccCCCCCCC------------------CCCCCCC
Q 003149 42 FNSREELVEWIRDTGKRNGLVIVIKKSDVGGDGRRPRITFACERSGAYRRKYTEG------------------QTPKRPK 103 (844)
Q Consensus 42 F~S~de~~~~~~~ya~~~GF~v~~~~S~~~~~g~~~~~~~vC~r~G~~r~~~~~~------------------~~~~~rr 103 (844)
|.+.+|++.|++..+..+||.|+|.||+.. .++|.|.-+|.++...... ......+
T Consensus 1 F~~k~~ikpwlq~~~~~~Gi~iVIerSd~~------ki~FkCk~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~k 74 (111)
T PF08731_consen 1 FDDKDEIKPWLQKIFYPQGIGIVIERSDKK------KIVFKCKNGKRYRHKKKKKGQAQAQQKESTSGNKNKSSKKKKKK 74 (111)
T ss_pred CCchHHHHHHHHHHhhhcCceEEEEecCCc------eEEEEEecCCCcccccccccccccccccccccccccccccccCC
Confidence 899999999999999999999999999987 7999999998877544210 1122235
Q ss_pred CCCceeeCCccEEEEEeeCCCCCeEEEEEccccCCCC
Q 003149 104 TTGTKKCGCPFLLKGHKLDTDDDWILKVVCGVHNHPV 140 (844)
Q Consensus 104 ~~~s~~~gCp~~i~~~~~~~~~~w~v~~~~~~HNH~~ 140 (844)
.+.++++.|||+|++......+.|.|..+++.|||++
T Consensus 75 ~t~srk~~CPFriRA~yS~k~k~W~lvvvnn~HnH~l 111 (111)
T PF08731_consen 75 RTKSRKNTCPFRIRANYSKKNKKWTLVVVNNEHNHPL 111 (111)
T ss_pred cccccccCCCeEEEEEEEecCCeEEEEEecCCcCCCC
Confidence 6678899999999999999999999999999999986
No 7
>KOG2606 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=99.31 E-value=2.2e-12 Score=129.16 Aligned_cols=133 Identities=11% Similarity=0.116 Sum_probs=111.7
Q ss_pred ccccccccccccccccCCCCCchHHHHHHhhcCCC---ccHHHHHHHHHHHHHhhhhhHHHhhC--------CHHHHHHH
Q 003149 690 SFPAGLRPYIHDVQDVIADGHCGFRVVAELMDIGE---DNWAQVRRDLVDELQSHYDDYIQLYG--------DAEIAREL 758 (844)
Q Consensus 690 ~~~~~~~~~~~~~~~v~~dg~Cgfraia~~lg~~~---~~~~~vr~~l~~el~~~~~~y~~~~~--------~~~~~~~~ 758 (844)
.+-+.|.+-.+.++|+++||||.|+||+.+|...- -+-..+|++..++|++|.+++.+++- ++.+|+.+
T Consensus 149 k~~~il~~~~l~~~~Ip~DG~ClY~aI~hQL~~~~~~~~~v~kLR~~~a~Ymr~H~~df~pf~~~eet~d~~~~~~f~~Y 228 (302)
T KOG2606|consen 149 KLAQILEERGLKMFDIPADGHCLYAAISHQLKLRSGKLLSVQKLREETADYMREHVEDFLPFLLDEETGDSLGPEDFDKY 228 (302)
T ss_pred HHHHHHHhccCccccCCCCchhhHHHHHHHHHhccCCCCcHHHHHHHHHHHHHHHHHHhhhHhcCccccccCCHHHHHHH
Confidence 56778889999999999999999999999998864 56889999999999999999999872 24479999
Q ss_pred HhhcCCCCCCCCccccccCCCchhhhhcccCeeEEEEcCCccccccCCCCCCCCCCCCCcEEEEEeC-----CCcEEEEE
Q 003149 759 LHSLSYSESNPGIEHRMIMPDTGHLIASRYNIVLMHLSQQQCFTFLPLRSVPLPRTSRKIVTIGFVN-----ECQFVKVL 833 (844)
Q Consensus 759 ~~~l~~~~~~~~~~~w~~~~~~~~~~a~~~~~~v~~~~~~~~~~~~p~~~~p~~~~~~~~i~l~~~~-----~~H~~~~~ 833 (844)
++.+. .+..|++-+..+ +|++.|.+||.+|..+++ ++..++...+ .+||+|+|+- +-||+++.
T Consensus 229 c~eI~------~t~~WGgelEL~-AlShvL~~PI~Vy~~~~p----~~~~geey~k-d~pL~lvY~rH~y~LGeHYNS~~ 296 (302)
T KOG2606|consen 229 CREIR------NTAAWGGELELK-ALSHVLQVPIEVYQADGP----ILEYGEEYGK-DKPLILVYHRHAYGLGEHYNSVT 296 (302)
T ss_pred HHHhh------hhccccchHHHH-HHHHhhccCeEEeecCCC----ceeechhhCC-CCCeeeehHHhHHHHHhhhcccc
Confidence 99996 679999999986 999999999999999866 5555544443 3789999963 26888876
Q ss_pred e
Q 003149 834 D 834 (844)
Q Consensus 834 ~ 834 (844)
+
T Consensus 297 ~ 297 (302)
T KOG2606|consen 297 P 297 (302)
T ss_pred c
Confidence 5
No 8
>COG3328 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=99.19 E-value=3.6e-10 Score=122.15 Aligned_cols=244 Identities=15% Similarity=0.126 Sum_probs=176.5
Q ss_pred CHHHHHHHHHHhhCCCChHHHHHHHHhcCCCccchHHHHHHHHHHhhhccccCcchHHHHHHHHHhcCcEEEEEeeccCC
Q 003149 154 TEQEANILVDLSRSNISPKEILQTLKQRDMHNVSTIKAIYNARHKYRVGEQVGQLHMHQLLDKLRKHGYIEWHRYNEETD 233 (844)
Q Consensus 154 ~~~~~~~i~~l~~~~~~p~~I~~~l~~~~~~~~~t~kdi~n~~~~~r~~~~~g~~~~~~ll~~l~e~~~~~~~~~~d~~~ 233 (844)
....-..|..+...|+++++|-..+++.+...+ ..+.++.... .+..-+..++.-+.
T Consensus 86 ~~~~~~~v~~~y~~gv~Tr~i~~~~~~~~~~~~--s~~~iS~~~~----------~~~e~v~~~~~r~l----------- 142 (379)
T COG3328 86 ERALDLPVLSMYAKGVTTREIEALLEELYGHKV--SPSVISVVTD----------RLDEKVKAWQNRPL----------- 142 (379)
T ss_pred hhhHHHHHHHHHHcCCcHHHHHHHHHHhhCccc--CHHHhhhHHH----------HHHHHHHHHHhccc-----------
Confidence 334455677788899999999999998875422 1111111100 11111111111111
Q ss_pred ceeeeEecChHHHHHHHhCCcEEEEeccccCCC--CCCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhc
Q 003149 234 CFKDLFWAHPFAVGLLRAFPSVVMIDCTYKTSM--YPFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMME 311 (844)
Q Consensus 234 ~~~~if~~~~~~~~~~~~~~~vl~iD~T~~tn~--~~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~ 311 (844)
.--.++++|++|..-+ -+..++.++|++.+|+-..+|+.+-..|+ ..|.-+|..|+..
T Consensus 143 -----------------~~~~~v~~D~~~~k~r~v~~~~~~ia~Gv~~eG~reilg~~~~~~e~-~~w~~~l~~l~~r-- 202 (379)
T COG3328 143 -----------------GDYPYVYLDAKYVKVRSVRNKAVYIAIGVTEEGRREILGIWVGVRES-KFWLSFLLDLKNR-- 202 (379)
T ss_pred -----------------cCceEEEEecceeehhhhhhheeeeeeccCcccchhhhceeeecccc-hhHHHHHHHHHhc--
Confidence 1125789999998876 46789999999999999999999999999 8999888888887
Q ss_pred CCCCCeEEEeeccHHHHHHHHHhCCCcccchhhhhHHHhHHHhhhhhhhhhhHHHHHHHHhhhcccCcCHHHHHHHHHHH
Q 003149 312 DDALPRVIVTDKDLALMNSIRAVFPRATNLLCRWHISKNISVNCKKLFETKERWEAFICSWNVLVLSVTEQEYMQHLGAM 391 (844)
Q Consensus 312 ~~~~P~~iitD~~~al~~Ai~~vfP~~~~~lC~~Hi~kn~~~~~~~~~~~~~~~~~~~~~~~~l~~s~t~~~f~~~~~~l 391 (844)
+......+++|...++.+||.++||.+.++.|..|+.+|+..+.... ..+.+...++.+..+++.++....+..+
T Consensus 203 gl~~v~l~v~Dg~~gl~~aI~~v~p~a~~Q~C~vH~~Rnll~~v~~k-----~~d~i~~~~~~I~~a~~~e~~~~~~~~~ 277 (379)
T COG3328 203 GLSDVLLVVVDGLKGLPEAISAVFPQAAVQRCIVHLVRNLLDKVPRK-----DQDAVLSDLRSIYIAPDAEEALLALLAF 277 (379)
T ss_pred cccceeEEecchhhhhHHHHHHhccHhhhhhhhhHHHhhhhhhhhhh-----hhHHHHhhhhhhhccCCcHHHHHHHHHH
Confidence 56666778889999999999999999999999999999998886543 4567778888889999999999999998
Q ss_pred Hhhcc-CchhHHHHHHHhhhcchhhhhHhhhhcccccCCCcccccccchhhHHHHhhcC
Q 003149 392 ESDFS-RYPQAIDYVKQTWLANYKEKFVAAWTDLAMHFGNVTMNRGETTHTKLKRLLAV 449 (844)
Q Consensus 392 ~~~~~-~~~~~~~y~~~~Wl~~~ke~w~~~~~~~~~~~g~~T~n~~Es~n~~LK~~l~~ 449 (844)
.+.|. .+|.......++|.. .|.-.-+.....--..|||.+|++|..++.....
T Consensus 278 ~~~w~~~yP~i~~~~~~~~~~----~~~F~~fp~~~r~~i~ttN~IE~~n~~ir~~~~~ 332 (379)
T COG3328 278 SELWGKRYPAILKSWRNALEE----LLPFFAFPSEIRKIIYTTNAIESLNKLIRRRTKV 332 (379)
T ss_pred HHhhhhhcchHHHHHHHHHHH----hcccccCcHHHHhHhhcchHHHHHHHHHHHHHhh
Confidence 88764 578777777777642 2111001111112358999999999988866553
No 9
>PF03108 DBD_Tnp_Mut: MuDR family transposase; InterPro: IPR004332 The plant MuDR transposase domain is present in plant proteins that are presumed to be the transposases for Mutator transposable elements. The function of these proteins is unknown. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=98.86 E-value=7.1e-09 Score=84.39 Aligned_cols=66 Identities=29% Similarity=0.543 Sum_probs=59.1
Q ss_pred CCCCCCcccCCHHHHHHHHHHHhhhcCeEEEEEeecCCCCCCCceEEEEEecCCccCCCCCCCCCCCCCCCCCceeeCCc
Q 003149 34 SAFTTDMVFNSREELVEWIRDTGKRNGLVIVIKKSDVGGDGRRPRITFACERSGAYRRKYTEGQTPKRPKTTGTKKCGCP 113 (844)
Q Consensus 34 ~~~~~g~~F~S~de~~~~~~~ya~~~GF~v~~~~S~~~~~g~~~~~~~vC~r~G~~r~~~~~~~~~~~rr~~~s~~~gCp 113 (844)
+.+.+||+|+|.+|++.++..||..+||.++..+|++. ++.++|. ..|||
T Consensus 2 ~~l~~G~~F~~~~e~k~av~~yai~~~~~~~v~ksd~~------r~~~~C~------------------------~~~C~ 51 (67)
T PF03108_consen 2 PELEVGQTFPSKEEFKEAVREYAIKNGFEFKVKKSDKK------RYRAKCK------------------------DKGCP 51 (67)
T ss_pred CccccCCEECCHHHHHHHHHHHHHhcCcEEEEeccCCE------EEEEEEc------------------------CCCCC
Confidence 46899999999999999999999999999999999866 8899993 24799
Q ss_pred cEEEEEeeCCCCCeEE
Q 003149 114 FLLKGHKLDTDDDWIL 129 (844)
Q Consensus 114 ~~i~~~~~~~~~~w~v 129 (844)
|+|++.....++.|.|
T Consensus 52 Wrv~as~~~~~~~~~I 67 (67)
T PF03108_consen 52 WRVRASKRKRSDTFQI 67 (67)
T ss_pred EEEEEEEcCCCCEEEC
Confidence 9999998888888875
No 10
>smart00575 ZnF_PMZ plant mutator transposase zinc finger.
Probab=98.45 E-value=8.7e-08 Score=62.52 Aligned_cols=26 Identities=31% Similarity=0.676 Sum_probs=23.9
Q ss_pred CCCCccccccCCccchhHhhHHhhcC
Q 003149 522 SACGCVFRRTHGLPCAHEIAEYKHER 547 (844)
Q Consensus 522 ~~CsC~~~~~~GlPC~H~lav~~~~~ 547 (844)
.+|||+.|+..||||+|+|+|+...+
T Consensus 1 ~~CsC~~~~~~gipC~H~i~v~~~~~ 26 (28)
T smart00575 1 KTCSCRKFQLSGIPCRHALAAAIHIG 26 (28)
T ss_pred CcccCCCcccCCccHHHHHHHHHHhC
Confidence 37999999999999999999998865
No 11
>KOG2605 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=97.95 E-value=2.1e-06 Score=92.16 Aligned_cols=132 Identities=16% Similarity=0.183 Sum_probs=98.4
Q ss_pred ccccccccccccccCCCCCchHHHHHHhhcCCCccHHHHHHHHHHHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCc
Q 003149 692 PAGLRPYIHDVQDVIADGHCGFRVVAELMDIGEDNWAQVRRDLVDELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGI 771 (844)
Q Consensus 692 ~~~~~~~~~~~~~v~~dg~Cgfraia~~lg~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~ 771 (844)
.+.+..|+.-+.-|.+||+|.|||+|+++.++.|-|..||++..++++..+++|..+. +..|..+++... -.
T Consensus 210 ~~~~~~~g~e~~Kv~edGsC~fra~aDQvy~d~e~~~~~~~~~~dq~~~e~~~~~~~v--t~~~~~y~k~kr------~~ 281 (371)
T KOG2605|consen 210 AKRKKHFGFEYKKVVEDGSCLFRALADQVYGDDEQHDHNRRECVDQLKKERDFYEDYV--TEDFTSYIKRKR------AD 281 (371)
T ss_pred HHHHHHhhhhhhhcccCCchhhhccHHHhhcCHHHHHHHHHHHHHHHhhccccccccc--ccchhhcccccc------cC
Confidence 3445788889999999999999999999999999999999999999999999888876 446777777765 56
Q ss_pred cccccCCCchhhhhc--ccCeeEEEEcCCccccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEEe
Q 003149 772 EHRMIMPDTGHLIAS--RYNIVLMHLSQQQCFTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVLD 834 (844)
Q Consensus 772 ~~w~~~~~~~~~~a~--~~~~~v~~~~~~~~~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~~ 834 (844)
+.|++-..+ |++|. -+-.+.+.+++.....|--. +|....+...+++.++...||..+..
T Consensus 282 ~~~gnhie~-Qa~a~~~~~~~~~~~~~~~~~t~~~~~--~~~~~~~~~~~~~n~~~~~h~~~~~~ 343 (371)
T KOG2605|consen 282 GEPGNHIEQ-QAAADIYEEIEKPLNITSFKDTCYIQT--PPAIEESVKMEKYNFWVEVHYNTARH 343 (371)
T ss_pred CCCcchHHH-hhhhhhhhhccccceeecccccceecc--Ccccccchhhhhhcccchhhhhhccc
Confidence 778887766 58885 33334444554333333222 23344445668888888899987765
No 12
>KOG3288 consensus OTU-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=97.70 E-value=0.0001 Score=72.51 Aligned_cols=121 Identities=12% Similarity=0.093 Sum_probs=87.5
Q ss_pred cccCCCCCchHHHHHHhhcCCCccH-HHHHHHHHHHHHhhhhhHHHhh-CCH-HHHHHHHhhcCCCCCCCCccccccCCC
Q 003149 703 QDVIADGHCGFRVVAELMDIGEDNW-AQVRRDLVDELQSHYDDYIQLY-GDA-EIARELLHSLSYSESNPGIEHRMIMPD 779 (844)
Q Consensus 703 ~~v~~dg~Cgfraia~~lg~~~~~~-~~vr~~l~~el~~~~~~y~~~~-~~~-~~~~~~~~~l~~~~~~~~~~~w~~~~~ 779 (844)
.=|+.|--|+|+||+.-+....+.- .++|+-+.+|+-++++.|..-+ |.+ .+|..++. .++.|.+-.+
T Consensus 113 ~vvp~DNSCLF~ai~yv~~k~~~~~~~elR~iiA~~Vasnp~~yn~AiLgK~n~eYc~WI~---------k~dsWGGaIE 183 (307)
T KOG3288|consen 113 RVVPDDNSCLFTAIAYVIFKQVSNRPYELREIIAQEVASNPDKYNDAILGKPNKEYCAWIL---------KMDSWGGAIE 183 (307)
T ss_pred EeccCCcchhhhhhhhhhcCccCCCcHHHHHHHHHHHhcChhhhhHHHhCCCcHHHHHHHc---------cccccCceEE
Confidence 3489999999999988886654222 6899999999999999997644 433 34555554 4589999999
Q ss_pred chhhhhcccCeeEEEEcCCccccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEEecCCC
Q 003149 780 TGHLIASRYNIVLMHLSQQQCFTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVLDFSQT 838 (844)
Q Consensus 780 ~~~~~a~~~~~~v~~~~~~~~~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~~~~~~ 838 (844)
.+ ||++.|++-|++++.+.... -.+++ ..+- ..-++|-| ++-||-+|.+..-.
T Consensus 184 ls-ILS~~ygveI~vvDiqt~ri--d~fge-d~~~-~~rv~lly-dGIHYD~l~m~~~~ 236 (307)
T KOG3288|consen 184 LS-ILSDYYGVEICVVDIQTVRI--DRFGE-DKNF-DNRVLLLY-DGIHYDPLAMNEFK 236 (307)
T ss_pred ee-eehhhhceeEEEEecceeee--hhcCC-CCCC-CceEEEEe-cccccChhhhccCC
Confidence 97 99999999999999753210 11222 1111 23478888 58999999986533
No 13
>PF04434 SWIM: SWIM zinc finger; InterPro: IPR007527 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1). 
It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination []. SWIM domains are also found in the homologous recombination protein Sws1 [], as well as in several hypothetical proteins. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0008270 zinc ion binding
Probab=97.52 E-value=4.9e-05 Score=54.67 Aligned_cols=30 Identities=27% Similarity=0.612 Sum_probs=26.2
Q ss_pred cCCcCCCCCccccccCCccchhHhhHHhhc
Q 003149 517 VGLNASACGCVFRRTHGLPCAHEIAEYKHE 546 (844)
Q Consensus 517 V~~~~~~CsC~~~~~~GlPC~H~lav~~~~ 546 (844)
+++...+|||..|+..|.||+|+++++...
T Consensus 10 ~~~~~~~CsC~~~~~~~~~CkHi~av~~~~ 39 (40)
T PF04434_consen 10 VSIEQASCSCPYFQFRGGPCKHIVAVLLAL 39 (40)
T ss_pred ccccccEeeCCCccccCCcchhHHHHHHhh
Confidence 345688999999999999999999998754
No 14
>PF10275 Peptidase_C65: Peptidase C65 Otubain; InterPro: IPR019400 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This family of proteins is a highly specific ubiquitin iso-peptidase that removes ubiquitin from proteins. The modification of cellular proteins by ubiquitin (Ub) is an important event that underlies protein stability and function in eukaryotes, as it is a dynamic and reversible process. Otubain carries several key conserved domains: (i) the OTU (ovarian tumour) domain, in which there is an active cysteine protease triad, (ii) a nuclear localisation signal, (iii) a Ub interaction motif (UIM)-like motif phi-xx-A-xxxs-xx-Ac (where phi indicates an aromatic amino acid, x indicates any amino acid and Ac indicates an acidic amino acid), (iv) a Ub-associated (UBA)-like domain and (v) the LxxLL motif.; PDB: 4DDG_C 3VON_O 2ZFY_A 4DHZ_A 4DDI_C 1TFF_A 4DHJ_I 4DHI_B.
Probab=97.49 E-value=0.00018 Score=74.92 Aligned_cols=52 Identities=6% Similarity=-0.038 Sum_probs=29.9
Q ss_pred hhhhcccCeeEEEEcCCcc---ccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEEe
Q 003149 782 HLIASRYNIVLMHLSQQQC---FTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVLD 834 (844)
Q Consensus 782 ~~~a~~~~~~v~~~~~~~~---~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~~ 834 (844)
.+||+++++||-++-.+.+ ..+=....+|.+....|.|.|.|- +.||.-|+.
T Consensus 190 ~ALa~aL~v~i~v~yld~~~~~~~~~~~~~~~~~~~~~~~i~LLyr-pgHYdIly~ 244 (244)
T PF10275_consen 190 IALAQALGVPIRVEYLDRSVEGDEVNRHEFPPDNESQEPQITLLYR-PGHYDILYP 244 (244)
T ss_dssp HHHHHHHT--EEEEESSSSGCSTTSEEEEES-SSTTSS-SEEEEEE-TBEEEEEEE
T ss_pred HHHHHHhCCeEEEEEecCCCCCCccccccCCCccCCCCCEEEEEEc-CCccccccC
Confidence 4799999999888765433 111112222222233688999997 569998873
No 15
>KOG3991 consensus Uncharacterized conserved protein [Function unknown]
Probab=96.96 E-value=0.0021 Score=62.76 Aligned_cols=88 Identities=10% Similarity=0.026 Sum_probs=50.3
Q ss_pred HHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccccCCCchhh--hhcccCe--eEEEEcCCccccccCCCCCCC
Q 003149 736 DELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRMIMPDTGHL--IASRYNI--VLMHLSQQQCFTFLPLRSVPL 811 (844)
Q Consensus 736 ~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~~~~~~--~a~~~~~--~v~~~~~~~~~~~~p~~~~p~ 811 (844)
-+|+++.++|.++..+...++.++..-- +-...-.+|-+| |+++.+. -|.+++-+...++=+..-|
T Consensus 165 ~~ik~~adfy~pFI~e~~tV~~fC~~eV--------EPm~kesdhi~I~ALs~Al~i~irVey~dr~~~~~~~hH~fp-- 234 (256)
T KOG3991|consen 165 GFIKSNADFYQPFIDEGMTVKAFCTQEV--------EPMYKESDHIHITALSQALGIRIRVEYVDRGSGDTVNHHDFP-- 234 (256)
T ss_pred HHHhhChhhhhccCCCCCcHHHHHHhhc--------chhhhccCceeHHHHHhhhCceEEEEEecCCCCCCCCCCcCc--
Confidence 3556666666666655556666655431 111122334444 6777775 5667776655343333322
Q ss_pred CCCCCCcEEEEEeCCCcEEEEEec
Q 003149 812 PRTSRKIVTIGFVNECQFVKVLDF 835 (844)
Q Consensus 812 ~~~~~~~i~l~~~~~~H~~~~~~~ 835 (844)
.-+.|-|.|.|- +-||..|+.+
T Consensus 235 -e~s~P~I~LLYr-pGHYdilY~~ 256 (256)
T KOG3991|consen 235 -EASAPEIYLLYR-PGHYDILYKK 256 (256)
T ss_pred -cccCceEEEEec-CCccccccCC
Confidence 223577999997 5799988753
No 16
>PF06782 UPF0236: Uncharacterised protein family (UPF0236); InterPro: IPR009620 This is a group of proteins of unknown function.
Probab=96.41 E-value=0.32 Score=55.68 Aligned_cols=245 Identities=13% Similarity=0.132 Sum_probs=135.0
Q ss_pred ccCCCHHHHHHHHHHhhCCCChHHHHHHHHhcCCCccchHHHHHHHHHHhhhccccCcchHHHHHHHHHhcCcEEEEEee
Q 003149 150 AGRLTEQEANILVDLSRSNISPKEILQTLKQRDMHNVSTIKAIYNARHKYRVGEQVGQLHMHQLLDKLRKHGYIEWHRYN 229 (844)
Q Consensus 150 ~rrl~~~~~~~i~~l~~~~~~p~~I~~~l~~~~~~~~~t~kdi~n~~~~~r~~~~~g~~~~~~ll~~l~e~~~~~~~~~~ 229 (844)
..|+++..+..+..++.. ++-+...+.+....+....+...|.|..+........ .+ .
T Consensus 115 ~~R~S~~~~~~i~~~a~~-~sYr~aa~~l~~~~~~~~iS~~tV~~~v~~~g~~~~~------------~~---------~ 172 (470)
T PF06782_consen 115 YQRISPELKEKIVELATE-MSYRKAAEILEELLGNVSISKQTVWNIVKEAGFEEIK------------EE---------E 172 (470)
T ss_pred ccchhHHHHHHHHHHHhh-cCHHHHHHHHhhccCCCccCHHHHHHHHHhccchhhh------------cc---------c
Confidence 468999999888887644 8888888888776654445777788877665421100 00 0
Q ss_pred ccCCceeeeEecChHHHHHHHhCCcEEEEeccccC----CCC--CCee-EEEEE---e-cCCCCceeeEE-Eeec---cC
Q 003149 230 EETDCFKDLFWAHPFAVGLLRAFPSVVMIDCTYKT----SMY--PFSF-LEIVG---A-TSTELTFSIAF-AYLE---SE 294 (844)
Q Consensus 230 d~~~~~~~if~~~~~~~~~~~~~~~vl~iD~T~~t----n~~--~~~l-~~~~g---~-~~~~~~~~~a~-al~~---~E 294 (844)
.+......+| |-.|++|-. ++. ...+ ++-.| . .+.++.....- .++. ..
T Consensus 173 ~~k~~~~~Ly----------------IEaDg~~v~~qg~~~~~~e~k~~~vheG~~~~~~~~~R~~L~n~~~f~~~~~~~ 236 (470)
T PF06782_consen 173 KEKKKVPVLY----------------IEADGVHVKLQGKKKKKKEVKLFVVHEGWEKEKPGGKRNKLKNKRHFVSGVGES 236 (470)
T ss_pred cccCCCCeEE----------------EecCcceecccccccccceeeEEEEEeeeeeeeccCCcceeecchheecccccc
Confidence 0011111111 112333322 111 1222 22224 1 11122222222 2222 44
Q ss_pred ccchHHHHHHHHHHhhcCCCC-CeEEEeeccHHHHHHHHHhCCCcccchhhhhHHHhHHHhhhhhhhhhhHHHHHHHHhh
Q 003149 295 RDDNYIWTLERLRSMMEDDAL-PRVIVTDKDLALMNSIRAVFPRATNLLCRWHISKNISVNCKKLFETKERWEAFICSWN 373 (844)
Q Consensus 295 ~~es~~w~l~~lk~~~~~~~~-P~~iitD~~~al~~Ai~~vfP~~~~~lC~~Hi~kn~~~~~~~~~~~~~~~~~~~~~~~ 373 (844)
..+-|..+.+.+-+....... -.++..|...-+.+++. .||.+.+.+..||+.+.+.+.+... ++..+.+...+
T Consensus 237 ~~~~~~~v~~~i~~~Y~~~~~~~iiingDGa~WIk~~~~-~~~~~~~~LD~FHl~k~i~~~~~~~---~~~~~~~~~al- 311 (470)
T PF06782_consen 237 AEEFWEEVLDYIYNHYDLDKTTKIIINGDGASWIKEGAE-FFPKAEYFLDRFHLNKKIKQALSHD---PELKEKIRKAL- 311 (470)
T ss_pred hHHHHHHHHHHHHHhcCcccceEEEEeCCCcHHHHHHHH-hhcCceEEecHHHHHHHHHHHhhhC---hHHHHHHHHHH-
Confidence 456677777777766622222 23566788888887776 9999999999999999999887542 12222233222
Q ss_pred hcccCcCHHHHHHHHHHHHhhcc------CchhHHHHHHHhhhcchhhhhHhhhhcccccCCCcccccccchhhHHHHhh
Q 003149 374 VLVLSVTEQEYMQHLGAMESDFS------RYPQAIDYVKQTWLANYKEKFVAAWTDLAMHFGNVTMNRGETTHTKLKRLL 447 (844)
Q Consensus 374 ~l~~s~t~~~f~~~~~~l~~~~~------~~~~~~~y~~~~Wl~~~ke~w~~~~~~~~~~~g~~T~n~~Es~n~~LK~~l 447 (844)
...+...+...++.+..... ....+..|+.++|=. . ..|... .|.......|+.|..+..-+
T Consensus 312 ---~~~d~~~l~~~L~~~~~~~~~~~~~~~i~~~~~Yl~~n~~~-i-----~~y~~~---~~~~g~g~ee~~~~~~s~Rm 379 (470)
T PF06782_consen 312 ---KKGDKKKLETVLDTAESCAKDEEERKKIRKLRKYLLNNWDG-I-----KPYRER---EGLRGIGAEESVSHVLSYRM 379 (470)
T ss_pred ---HhcCHHHHHHHHHHHHHhhhchHHHHHHHHHHHHHHHCHHH-h-----hhhhhc---cCCCccchhhhhhhHHHHHh
Confidence 24456666666666654432 234567899998831 1 112111 34444455788998887766
Q ss_pred cC
Q 003149 448 AV 449 (844)
Q Consensus 448 ~~ 449 (844)
+.
T Consensus 380 K~ 381 (470)
T PF06782_consen 380 KS 381 (470)
T ss_pred cC
Confidence 64
No 17
>PF03106 WRKY: WRKY DNA -binding domain; InterPro: IPR003657 The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding []. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene. Structural studies indicate that this domain is a four-stranded beta-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure []. The WRKYGQK residues correspond to the most N-terminal beta-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the beta-sheet.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 2AYD_A 1WJ2_A 2LEX_A.
Probab=95.43 E-value=0.049 Score=42.74 Aligned_cols=57 Identities=23% Similarity=0.363 Sum_probs=41.6
Q ss_pred cCeEEEEEeecCCCCCCCceEEEEEecCCccCCCCCCCCCCCCCCCCCceeeCCccEEEEEeeCCCCCeEEEEEccccCC
Q 003149 59 NGLVIVIKKSDVGGDGRRPRITFACERSGAYRRKYTEGQTPKRPKTTGTKKCGCPFLLKGHKLDTDDDWILKVVCGVHNH 138 (844)
Q Consensus 59 ~GF~v~~~~S~~~~~g~~~~~~~vC~r~G~~r~~~~~~~~~~~rr~~~s~~~gCp~~i~~~~~~~~~~w~v~~~~~~HNH 138 (844)
=||..|+--.+.-++....+.+|.|+. .+||++=.+.+...++...++.+.++|||
T Consensus 3 Dgy~WRKYGqK~i~g~~~pRsYYrCt~------------------------~~C~akK~Vqr~~~d~~~~~vtY~G~H~h 58 (60)
T PF03106_consen 3 DGYRWRKYGQKNIKGSPYPRSYYRCTH------------------------PGCPAKKQVQRSADDPNIVIVTYEGEHNH 58 (60)
T ss_dssp SSS-EEEEEEEEETTTTCEEEEEEEEC------------------------TTEEEEEEEEEETTCCCEEEEEEES--SS
T ss_pred CCCchhhccCcccCCCceeeEeeeccc------------------------cChhheeeEEEecCCCCEEEEEEeeeeCC
Confidence 377888766666555556678899944 38999988888877888999999999999
Q ss_pred C
Q 003149 139 P 139 (844)
Q Consensus 139 ~ 139 (844)
+
T Consensus 59 ~ 59 (60)
T PF03106_consen 59 P 59 (60)
T ss_dssp -
T ss_pred C
Confidence 6
No 18
>PF05412 Peptidase_C33: Equine arterivirus Nsp2-type cysteine proteinase; InterPro: IPR008743 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases corresponds to MEROPS peptidase family C33 (clan CA). The type example is equine arteritis virus Nsp2-type cysteine proteinase, which is involved in viral polyprotein processing [].; GO: 0016032 viral reproduction, 0019082 viral protein processing
Probab=94.60 E-value=0.089 Score=45.25 Aligned_cols=88 Identities=16% Similarity=0.241 Sum_probs=53.5
Q ss_pred ccCCCCCchHHHHHHhhcCCCccHHHHHHHHHHHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccccCCCchhh
Q 003149 704 DVIADGHCGFRVVAELMDIGEDNWAQVRRDLVDELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRMIMPDTGHL 783 (844)
Q Consensus 704 ~v~~dg~Cgfraia~~lg~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~~~~~~ 783 (844)
.+++||+||+|+||.-+.. ++++. |. .. .+.-+-+.+.|++.-+++++
T Consensus 3 sPP~DG~CG~H~i~aI~n~-------------------------m~~~~--~t---~~--l~~~~r~~d~W~~dedl~~~ 50 (108)
T PF05412_consen 3 SPPGDGSCGWHCIAAIMNH-------------------------MMGGE--FT---TP--LPQRNRPSDDWADDEDLYQV 50 (108)
T ss_pred CCCCCCchHHHHHHHHHHH-------------------------hhccC--CC---cc--ccccCCChHHccChHHHHHH
Confidence 4899999999999876532 12210 00 00 12234578999999999999
Q ss_pred hhcccCeeEEEEcCCccccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEEecCCCC
Q 003149 784 IASRYNIVLMHLSQQQCFTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVLDFSQTP 839 (844)
Q Consensus 784 ~a~~~~~~v~~~~~~~~~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~~~~~~~ 839 (844)
|-.. +.|+-+--.+.| | .-.++-=.++.||..-.-++.-|
T Consensus 51 iq~l-~lPat~~~~~~C---------p------~ArYv~~l~~qHW~V~~~~g~~~ 90 (108)
T PF05412_consen 51 IQSL-RLPATLDRNGAC---------P------HARYVLKLDGQHWEVSVRKGRAP 90 (108)
T ss_pred HHHc-cCceeccCCCCC---------C------CCEEEEEecCceEEEEEcCCCCc
Confidence 8754 555544333333 2 22444444678998766666544
No 19
>PF01610 DDE_Tnp_ISL3: Transposase; InterPro: IPR002560 Autonomous mobile genetic elements such as transposon or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families [, , ]. This family includes the IS204 [], IS1001 [], IS1096 [] and IS1165 [] transposases. More information about these proteins can be found at Protein of the Month: Transposase [].; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=94.54 E-value=0.046 Score=57.20 Aligned_cols=68 Identities=19% Similarity=0.226 Sum_probs=57.2
Q ss_pred EEeeccCccchHHHHHHHH-HHhhcCCCCCeEEEeeccHHHHHHHHHhCCCcccchhhhhHHHhHHHhhhh
Q 003149 288 FAYLESERDDNYIWTLERL-RSMMEDDALPRVIVTDKDLALMNSIRAVFPRATNLLCRWHISKNISVNCKK 357 (844)
Q Consensus 288 ~al~~~E~~es~~w~l~~l-k~~~~~~~~P~~iitD~~~al~~Ai~~vfP~~~~~lC~~Hi~kn~~~~~~~ 357 (844)
+.++++-+.+++.-+|..+ -.. .....++|++|...+...|+++.||+|.+..-.|||++++.+.+.+
T Consensus 30 l~i~~~r~~~~l~~~~~~~~~~~--~~~~v~~V~~Dm~~~y~~~~~~~~P~A~iv~DrFHvvk~~~~al~~ 98 (249)
T PF01610_consen 30 LDILPGRDKETLKDFFRSLYPEE--ERKNVKVVSMDMSPPYRSAIREYFPNAQIVADRFHVVKLANRALDK 98 (249)
T ss_pred EEEcCCccHHHHHHHHHHhCccc--cccceEEEEcCCCccccccccccccccccccccchhhhhhhhcchh
Confidence 3478888888888888876 333 3567789999999999999999999999999999999998875543
No 20
>PF04684 BAF1_ABF1: BAF1 / ABF1 chromatin reorganising factor; InterPro: IPR006774 ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture []. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 Saccharomyces cerevisiae (P14164 from SWISSPROT)) []. The N-terminal two thirds of the protein are necessary for DNA binding, and the N terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilise the protein structure [].; GO: 0003677 DNA binding, 0006338 chromatin remodeling, 0005634 nucleus
Probab=93.24 E-value=0.62 Score=51.01 Aligned_cols=46 Identities=24% Similarity=0.273 Sum_probs=39.1
Q ss_pred CCCCCcccCCHHHHHHHHHHHhhhcCeEEEEEeecCCCCCCCceEEEEEec
Q 003149 35 AFTTDMVFNSREELVEWIRDTGKRNGLVIVIKKSDVGGDGRRPRITFACER 85 (844)
Q Consensus 35 ~~~~g~~F~S~de~~~~~~~ya~~~GF~v~~~~S~~~~~g~~~~~~~vC~r 85 (844)
.-..+..|+++++-|..+++|.-+...-|+...|.+.+ .++|.|.+
T Consensus 21 ~~~~~~~f~tl~~wy~v~ndyefq~rcpiilknsh~nk-----hftfachl 66 (496)
T PF04684_consen 21 QSAQARKFPTLEAWYNVINDYEFQSRCPIILKNSHRNK-----HFTFACHL 66 (496)
T ss_pred ccccccCCCcHHHHHHHHhhhhhhhcCceeeccccccc-----ceEEEeec
Confidence 33467899999999999999999999999988887764 68898854
No 21
>smart00774 WRKY DNA binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.
Probab=92.26 E-value=0.27 Score=38.24 Aligned_cols=56 Identities=23% Similarity=0.292 Sum_probs=39.3
Q ss_pred CeEEEEEeecCCCCCCCceEEEEEecCCccCCCCCCCCCCCCCCCCCceeeCCccEEEEEeeCCCCCeEEEEEccccCC
Q 003149 60 GLVIVIKKSDVGGDGRRPRITFACERSGAYRRKYTEGQTPKRPKTTGTKKCGCPFLLKGHKLDTDDDWILKVVCGVHNH 138 (844)
Q Consensus 60 GF~v~~~~S~~~~~g~~~~~~~vC~r~G~~r~~~~~~~~~~~rr~~~s~~~gCp~~i~~~~~~~~~~w~v~~~~~~HNH 138 (844)
||..|+-..+..++....|.+|.|+. ..|||++=.+.+...++...++.+.++|||
T Consensus 4 Gy~WRKYGQK~ikgs~~pRsYYrCt~-----------------------~~~C~a~K~Vq~~~~d~~~~~vtY~g~H~h 59 (59)
T smart00774 4 GYQWRKYGQKVIKGSPFPRSYYRCTY-----------------------SQGCPAKKQVQRSDDDPSVVEVTYEGEHTH 59 (59)
T ss_pred cccccccCcEecCCCcCcceEEeccc-----------------------cCCCCCcccEEEECCCCCEEEEEEeeEeCC
Confidence 56666654444444455577888844 147999766666666788888899999998
No 22
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=92.25 E-value=0.044 Score=55.91 Aligned_cols=109 Identities=9% Similarity=-0.097 Sum_probs=71.2
Q ss_pred CcccccccccccccccccccccCCCCCchHHHHHHhhcCCCc-----cHHHHHHHHHHHHHhhhhhHHHhhCCH-----H
Q 003149 684 PLCFIDSFPAGLRPYIHDVQDVIADGHCGFRVVAELMDIGED-----NWAQVRRDLVDELQSHYDDYIQLYGDA-----E 753 (844)
Q Consensus 684 ~~~~~~~~~~~~~~~~~~~~~v~~dg~Cgfraia~~lg~~~~-----~~~~vr~~l~~el~~~~~~y~~~~~~~-----~ 753 (844)
-.+++-.+|.....--..-.|..|||||.|-+|+.+|+..-. -=...|..=....+.+...|.++.-++ .
T Consensus 155 ~n~~i~~~~~i~y~~~i~k~d~~~dG~ieia~iS~~l~v~i~~Vdv~~~~~dr~~~~~~~q~~~i~f~g~hfD~~t~~m~ 234 (306)
T COG5539 155 YNPAILEIDVIAYATWIVKPDSQGDGCIEIAIISDQLPVRIHVVDVDKDSEDRYNSHPYVQRISILFTGIHFDEETLAMV 234 (306)
T ss_pred cchhhcCcchHHHHHhhhccccCCCceEEEeEeccccceeeeeeecchhHHhhccCChhhhhhhhhhcccccchhhhhcc
Confidence 345566666655555555567899999999999999988421 012222222233344555565544222 2
Q ss_pred HHHHHHhhcCCCCCCCCccccccCCCchhhhhcccCeeEEEEcCCc
Q 003149 754 IARELLHSLSYSESNPGIEHRMIMPDTGHLIASRYNIVLMHLSQQQ 799 (844)
Q Consensus 754 ~~~~~~~~l~~~~~~~~~~~w~~~~~~~~~~a~~~~~~v~~~~~~~ 799 (844)
.|+.+.+.+. ....|...+.. +.||+.+..|+-++....
T Consensus 235 ~~dt~~ne~~------~~a~~g~~~ei-~qLas~lk~~~~~~nT~~ 273 (306)
T COG5539 235 LWDTYVNEVL------FDASDGITIEI-QQLASLLKNPHYYTNTAS 273 (306)
T ss_pred hHHHHHhhhc------ccccccchHHH-HHHHHHhcCceEEeecCC
Confidence 5788777776 66889877766 589999999998887653
No 23
>PF13610 DDE_Tnp_IS240: DDE domain
Probab=90.70 E-value=0.17 Score=47.81 Aligned_cols=81 Identities=21% Similarity=0.103 Sum_probs=66.3
Q ss_pred CcEEEEeccccCCCCCCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHHHH
Q 003149 253 PSVVMIDCTYKTSMYPFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNSIR 332 (844)
Q Consensus 253 ~~vl~iD~T~~tn~~~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~Ai~ 332 (844)
|+.|.+|-||-.-+ +--.+....+|..|+ ++.+-|....+...=..||..+.+.. ...|.+|+||+..+...|++
T Consensus 1 ~~~w~~DEt~iki~-G~~~yl~~aiD~~~~--~l~~~ls~~Rd~~aA~~Fl~~~l~~~--~~~p~~ivtDk~~aY~~A~~ 75 (140)
T PF13610_consen 1 GDSWHVDETYIKIK-GKWHYLWRAIDAEGN--ILDFYLSKRRDTAAAKRFLKRALKRH--RGEPRVIVTDKLPAYPAAIK 75 (140)
T ss_pred CCEEEEeeEEEEEC-CEEEEEEEeeccccc--chhhhhhhhcccccceeeccccceee--ccccceeecccCCccchhhh
Confidence 57899999997543 234556778899888 78888888888888778887777764 37899999999999999999
Q ss_pred HhCCCc
Q 003149 333 AVFPRA 338 (844)
Q Consensus 333 ~vfP~~ 338 (844)
+++++.
T Consensus 76 ~l~~~~ 81 (140)
T PF13610_consen 76 ELNPEG 81 (140)
T ss_pred hccccc
Confidence 999974
No 24
>PF04500 FLYWCH: FLYWCH zinc finger domain; InterPro: IPR007588 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. 
C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2 []. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes []. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA []. C2H2 Znf's can also bind to RNA and protein targets []. This entry represents a potential FLYWCH Zn-finger domain found in a number of eukaryotic proteins. FLYWCH is a C2H2-type zinc finger characterised by five conserved hydrophobic residues, containing the conserved sequence motif: F/Y-X(n)-L-X(n)-F/Y-X(n)-WXCX(6-12)CX(17-22)HXH where X indicates any amino acid. This domain was first characterised in Drosophila Modifier of mdg4 proteins, Mod(mdg4), putative chromatin modulators involved in higher order chromatin domains. Mod(mdg4) proteins share a common N-terminal BTB/POZ domain, but differ in their C-terminal region, most containing C-terminal FLYWCH zinc finger motifs []. The FLYWCH domain in Mod(mdg4) proteins has a putative role in protein-protein interactions; for example, Mod(mdg4)-67.2 interacts with DNA-binding protein Su(Hw) via its FLYWCH domain. FLYWCH domains have been described in other proteins as well, including suppressor of killer of prune, Su(Kpn), which contains 4 terminal FLYWCH zinc finger motifs in a tandem array and a C-terminal glutathione S-transferase (GST) domain []. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; PDB: 2RPR_A.
Probab=89.40 E-value=1 Score=35.37 Aligned_cols=49 Identities=27% Similarity=0.421 Sum_probs=23.9
Q ss_pred cCeEEEEEeecCCCCCCCceEEEEEecCCccCCCCCCCCCCCCCCCCCceeeCCccEEEEEeeCCCCCeEEEEEccccCC
Q 003149 59 NGLVIVIKKSDVGGDGRRPRITFACERSGAYRRKYTEGQTPKRPKTTGTKKCGCPFLLKGHKLDTDDDWILKVVCGVHNH 138 (844)
Q Consensus 59 ~GF~v~~~~S~~~~~g~~~~~~~vC~r~G~~r~~~~~~~~~~~rr~~~s~~~gCp~~i~~~~~~~~~~w~v~~~~~~HNH 138 (844)
.||.+...+.... ...+.|.+.. ..+|+|++... .+.-.+.....+|||
T Consensus 14 ~Gy~y~~~~~~~~------~~~WrC~~~~---------------------~~~C~a~~~~~----~~~~~~~~~~~~HnH 62 (62)
T PF04500_consen 14 DGYRYYFNKRNDG------KTYWRCSRRR---------------------SHGCRARLITD----AGDGRVVRTNGEHNH 62 (62)
T ss_dssp TTEEEEEEEE-SS-------EEEEEGGGT---------------------TS----EEEEE------TTEEEE-S---SS
T ss_pred CCeEEECcCCCCC------cEEEEeCCCC---------------------CCCCeEEEEEE----CCCCEEEECCCccCC
Confidence 5777776665522 5788895521 25899999864 344566666788999
No 25
>PF00665 rve: Integrase core domain; InterPro: IPR001584 Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus (ASV) have been studied most carefully with respect to the structural basis of catalysis. Although the active site of ASV integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis []. Retroviral integrase is synthesised as part of the POL polyprotein that contains; an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group. HIV integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS []. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration. 
To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity [].; GO: 0015074 DNA integration; PDB: 3AO3_A 3OVN_A 3AO5_A 3AO4_A 3AO1_A 1C6V_D 3HPG_A 3HPH_A 3OYD_A 3OYF_B ....
Probab=79.40 E-value=8.7 Score=34.50 Aligned_cols=76 Identities=21% Similarity=0.109 Sum_probs=49.2
Q ss_pred CcEEEEeccccC-CCCCCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHH
Q 003149 253 PSVVMIDCTYKT-SMYPFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNS 330 (844)
Q Consensus 253 ~~vl~iD~T~~t-n~~~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~A 330 (844)
+.++.+|.+... ...+...+.++.+|..-. +.+++.+...++.+.+..+|....... +...|.+|++|+..+..+.
T Consensus 6 ~~~~~~D~~~~~~~~~~~~~~~~~~iD~~S~-~~~~~~~~~~~~~~~~~~~l~~~~~~~-~~~~p~~i~tD~g~~f~~~ 82 (120)
T PF00665_consen 6 GERWQIDFTPMPIPDKGGRVYLLVFIDDYSR-FIYAFPVSSKETAEAALRALKRAIEKR-GGRPPRVIRTDNGSEFTSH 82 (120)
T ss_dssp TTEEEEEEEEETGGCTT-CEEEEEEEETTTT-EEEEEEESSSSHHHHHHHHHHHHHHHH-S-SE-SEEEEESCHHHHSH
T ss_pred CCEEEEeeEEEecCCCCccEEEEEEEECCCC-cEEEEEeeccccccccccccccccccc-ccccceecccccccccccc
Confidence 478899988544 334447777788876554 555666766666666666666544443 2333999999999987643
No 26
>PF15299 ALS2CR8: Amyotrophic lateral sclerosis 2 chromosomal region candidate gene 8
Probab=78.06 E-value=14 Score=37.75 Aligned_cols=98 Identities=19% Similarity=0.240 Sum_probs=59.2
Q ss_pred CCCCCCceeeCCccEEEEEeeCCC-----------------------------------C--CeEEEEE-cccc-CCCCC
Q 003149 101 RPKTTGTKKCGCPFLLKGHKLDTD-----------------------------------D--DWILKVV-CGVH-NHPVT 141 (844)
Q Consensus 101 ~rr~~~s~~~gCp~~i~~~~~~~~-----------------------------------~--~w~v~~~-~~~H-NH~~~ 141 (844)
.++...+.+.+|||+|.++....- + .|.|..- ..+| +|+..
T Consensus 69 ~~~~~~skK~~CPA~I~Ik~I~~FPdykv~~~~~~~~~~~r~~~~~~lk~~l~~~~~~~~~~r~yv~lP~~~~H~~H~~~ 148 (225)
T PF15299_consen 69 RRRSKPSKKRDCPARIYIKEIIKFPDYKVPTNSQKDTRRERRKASKKLKKALLSGKSIEGERRFYVQLPSPEEHSGHPIG 148 (225)
T ss_pred ccccccccCCCCCeEEEEEEEEEcCCcccccchhhhhHHHHHHHHHHHHHHHhcCCCCCceEEEEEECCChHhcCCCccc
Confidence 455667889999999998643111 1 1222221 2356 78876
Q ss_pred CCcccCccccCCCHHHHHHHHHHhhCCCCh-HHHHHHHHhc-----CC----------CccchHHHHHHHHHHhh
Q 003149 142 QHVEGHSYAGRLTEQEANILVDLSRSNISP-KEILQTLKQR-----DM----------HNVSTIKAIYNARHKYR 200 (844)
Q Consensus 142 ~~~~~~~~~rrl~~~~~~~i~~l~~~~~~p-~~I~~~l~~~-----~~----------~~~~t~kdi~n~~~~~r 200 (844)
..... ...++.++....|..|...|+.. .+|...|+.. +. .-.+|.+||.|......
T Consensus 149 ~~~~~--~~q~~~~~v~~ki~eLv~~gv~~v~e~k~~l~~fV~~~lf~~~~~p~~~n~~y~Pt~~di~n~~~~~~ 221 (225)
T PF15299_consen 149 QEAAG--LKQPLDPRVVEKIHELVAQGVTSVPEMKRHLKKFVEEELFKDQEPPPPTNRRYFPTDKDIRNHMYSAK 221 (225)
T ss_pred ccccc--ccccCCHHHHHHHHHHHHcccccHHHHHHHHHHHhhhhccCCCCCCCCCccccCCchHHHHHHHHHHH
Confidence 53321 22467888889999999999765 6666666321 10 11357788888766543
No 27
>PHA02517 putative transposase OrfB; Reviewed
Probab=76.35 E-value=22 Score=37.69 Aligned_cols=69 Identities=13% Similarity=-0.012 Sum_probs=41.9
Q ss_pred CcEEEEeccccCCCCCCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhc--CCCCCeEEEeeccHH
Q 003149 253 PSVVMIDCTYKTSMYPFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMME--DDALPRVIVTDKDLA 326 (844)
Q Consensus 253 ~~vl~iD~T~~tn~~~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~--~~~~P~~iitD~~~a 326 (844)
..++..|.|+..... +-.++++.+|...+ .++|+.+....+.+. +++.|..++. +...+.+|.||....
T Consensus 110 n~~w~~D~t~~~~~~-g~~yl~~iiD~~sr-~i~~~~~~~~~~~~~---~~~~l~~a~~~~~~~~~~i~~sD~G~~ 180 (277)
T PHA02517 110 NQLWVADFTYVSTWQ-GWVYVAFIIDVFAR-RIVGWRVSSSMDTDF---VLDALEQALWARGRPGGLIHHSDKGSQ 180 (277)
T ss_pred CCeEEeceeEEEeCC-CCEEEEEecccCCC-eeeecccCCCCChHH---HHHHHHHHHHhcCCCcCcEeecccccc
Confidence 478999999965443 34566666665544 567788877666664 4455555541 222234667888764
No 28
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=61.06 E-value=8.6 Score=39.84 Aligned_cols=112 Identities=13% Similarity=0.039 Sum_probs=72.6
Q ss_pred CCCCchHHHHHHhhcCCCccHHHHHHHHHHHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccc-cCCCchhhhh
Q 003149 707 ADGHCGFRVVAELMDIGEDNWAQVRRDLVDELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRM-IMPDTGHLIA 785 (844)
Q Consensus 707 ~dg~Cgfraia~~lg~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~-~~~~~~~~~a 785 (844)
.|--|.|+|.+..++-- +=..+|+.+..|+.++++.|.+...+-+... ++..|. .++-|. +--..+ +|.
T Consensus 119 ~d~srl~q~~~~~l~~a--sv~~lrE~vs~Ev~snPDl~n~~i~~~~~i~-y~~~i~------k~d~~~dG~ieia-~iS 188 (306)
T COG5539 119 DDNSRLFQAERYSLRDA--SVAKLREVVSLEVLSNPDLYNPAILEIDVIA-YATWIV------KPDSQGDGCIEIA-IIS 188 (306)
T ss_pred CchHHHHHHHHhhhhhh--hHHHHHHHHHHHHhhCccccchhhcCcchHH-HHHhhh------ccccCCCceEEEe-Eec
Confidence 45789999999888654 7788999999999999999987653322222 222222 345565 333344 799
Q ss_pred cccCeeEEEEcCCccccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEEe
Q 003149 786 SRYNIVLMHLSQQQCFTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVLD 834 (844)
Q Consensus 786 ~~~~~~v~~~~~~~~~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~~ 834 (844)
+.+++-|.+....... -++-.+-+. -.-|++-|. +-||-++.+
T Consensus 189 ~~l~v~i~~Vdv~~~~---~dr~~~~~~--~q~~~i~f~-g~hfD~~t~ 231 (306)
T COG5539 189 DQLPVRIHVVDVDKDS---EDRYNSHPY--VQRISILFT-GIHFDEETL 231 (306)
T ss_pred cccceeeeeeecchhH---HhhccCChh--hhhhhhhhc-ccccchhhh
Confidence 9999998888765321 222222122 123788885 689987764
No 29
>COG4279 Uncharacterized conserved protein [Function unknown]
Probab=59.19 E-value=4.2 Score=41.18 Aligned_cols=30 Identities=23% Similarity=0.348 Sum_probs=23.6
Q ss_pred CCCCCccccccCCccchhHhhHHhhcCCCCCcc
Q 003149 521 ASACGCVFRRTHGLPCAHEIAEYKHERRSIPLL 553 (844)
Q Consensus 521 ~~~CsC~~~~~~GlPC~H~lav~~~~~~~i~~~ 553 (844)
...|||..+ ..||+|+-||+.+.+..+..+
T Consensus 124 ~~dCSCPD~---anPCKHi~AvyY~lae~f~~d 153 (266)
T COG4279 124 STDCSCPDY---ANPCKHIAAVYYLLAEKFDED 153 (266)
T ss_pred ccccCCCCc---ccchHHHHHHHHHHHHHhccC
Confidence 457999875 579999999999887555444
No 30
>PF04937 DUF659: Protein of unknown function (DUF 659); InterPro: IPR007021 These are transposase-like proteins with no known function.
Probab=58.82 E-value=83 Score=30.04 Aligned_cols=106 Identities=13% Similarity=0.078 Sum_probs=66.1
Q ss_pred HhCCcEEEEeccccCCCCCCeeEEEEEecCCCCceeeEEEeec-cCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHH
Q 003149 250 RAFPSVVMIDCTYKTSMYPFSFLEIVGATSTELTFSIAFAYLE-SERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALM 328 (844)
Q Consensus 250 ~~~~~vl~iD~T~~tn~~~~~l~~~~g~~~~~~~~~~a~al~~-~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~ 328 (844)
...|=.|..|+= ++..+.+++.|+.....|..|.-..-.-. ..+.+.+-.+|+...+.+ +......||||-...+.
T Consensus 30 ~~~Gcsi~~DgW--td~~~~~lInf~v~~~~g~~Flksvd~s~~~~~a~~l~~ll~~vIeeV-G~~nVvqVVTDn~~~~~ 106 (153)
T PF04937_consen 30 KRTGCSIMSDGW--TDRKGRSLINFMVYCPEGTVFLKSVDASSIIKTAEYLFELLDEVIEEV-GEENVVQVVTDNASNMK 106 (153)
T ss_pred HhcCEEEEEecC--cCCCCCeEEEEEEEcccccEEEEEEecccccccHHHHHHHHHHHHHHh-hhhhhhHHhccCchhHH
Confidence 344445666654 34455677777777776665543332211 134444555555555444 45556668999999988
Q ss_pred HHHH---HhCCCcccchhhhhHHHhHHHhhhhh
Q 003149 329 NSIR---AVFPRATNLLCRWHISKNISVNCKKL 358 (844)
Q Consensus 329 ~Ai~---~vfP~~~~~lC~~Hi~kn~~~~~~~~ 358 (844)
+|-+ +-+|..-...|.-|-+.-+.+.+.++
T Consensus 107 ~a~~~L~~k~p~ifw~~CaaH~inLmledi~k~ 139 (153)
T PF04937_consen 107 KAGKLLMEKYPHIFWTPCAAHCINLMLEDIGKL 139 (153)
T ss_pred HHHHHHHhcCCCEEEechHHHHHHHHHHHHhcC
Confidence 8844 44788777889999888777766543
No 31
>KOG4345 consensus NF-kappa B regulator AP20/Cezanne [Signal transduction mechanisms]
Probab=56.80 E-value=7.3 Score=45.05 Aligned_cols=52 Identities=13% Similarity=0.193 Sum_probs=38.3
Q ss_pred chhhhhcccCeeEEEEcC-----C---------ccccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEE
Q 003149 780 TGHLIASRYNIVLMHLSQ-----Q---------QCFTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVL 833 (844)
Q Consensus 780 ~~~~~a~~~~~~v~~~~~-----~---------~~~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~ 833 (844)
+-.++|+...|||++++- . .-..|+||-.|+...+ .-||+|+|- ..||+.++
T Consensus 225 hifvl~~ilRrpivvvsd~mlR~s~~~sfap~~~ggiylpLe~p~~~c~-r~pLvl~yd-~~hf~~lv 290 (774)
T KOG4345|consen 225 HIFVLAHILRRPIVVVSDTMLRDSGGESFAPIPVGGIYLPLEVPAQECH-RSPLVLAYD-QAHFSALV 290 (774)
T ss_pred HHHHHHHHhhCCeeEecccccccCCCcccccCccCceEEeccCchhhcc-cchhhhhhH-hhhhhhhh
Confidence 667899999999999973 1 2345777777754443 458999996 47999874
No 32
>COG3316 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=53.25 E-value=32 Score=34.57 Aligned_cols=81 Identities=16% Similarity=0.100 Sum_probs=57.7
Q ss_pred CcEEEEeccccCCCCC-CeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHHH
Q 003149 253 PSVVMIDCTYKTSMYP-FSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNSI 331 (844)
Q Consensus 253 ~~vl~iD~T~~tn~~~-~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~Ai 331 (844)
.+++.+|-||-+-+-+ .-|+ -.+|.. ..++.+-|...-+...=..||..+++.. ..|.+|+||+......|+
T Consensus 70 ~~~w~vDEt~ikv~gkw~yly--rAid~~--g~~Ld~~L~~rRn~~aAk~Fl~kllk~~---g~p~v~vtDka~s~~~A~ 142 (215)
T COG3316 70 GDSWRVDETYIKVNGKWHYLY--RAIDAD--GLTLDVWLSKRRNALAAKAFLKKLLKKH---GEPRVFVTDKAPSYTAAL 142 (215)
T ss_pred ccceeeeeeEEeeccEeeehh--hhhccC--CCeEEEEEEcccCcHHHHHHHHHHHHhc---CCCceEEecCccchHHHH
Confidence 4678889888653322 2233 334555 3566777777777777777777777764 789999999999999999
Q ss_pred HHhCCCccc
Q 003149 332 RAVFPRATN 340 (844)
Q Consensus 332 ~~vfP~~~~ 340 (844)
.++-++..|
T Consensus 143 ~~l~~~~eh 151 (215)
T COG3316 143 RKLGSEVEH 151 (215)
T ss_pred HhcCcchhe
Confidence 999875443
No 33
>COG3464 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=49.57 E-value=68 Score=36.07 Aligned_cols=56 Identities=18% Similarity=0.214 Sum_probs=43.0
Q ss_pred EeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHHHHHhCCCcccchhhhhHHH
Q 003149 289 AYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNSIRAVFPRATNLLCRWHISK 349 (844)
Q Consensus 289 al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~Ai~~vfP~~~~~lC~~Hi~k 349 (844)
.++++-+.+++...|..+ +....+.|..|......+++++.||++.+.+=.||+.+
T Consensus 183 ~i~~~r~~~ti~~~l~~~-----g~~~v~~V~~D~~~~y~~~v~e~~pna~i~~d~fh~~~ 238 (402)
T COG3464 183 DILEGRSVRTLRRYLRRG-----GSEQVKSVSMDMFGPYASAVQELFPNALIIADRFHVVQ 238 (402)
T ss_pred eecCCccHHHHHHHHHhC-----CCcceeEEEccccHHHHHHHHHhCCChheeeeeeeeee
Confidence 355666666555554433 12267889999999999999999999999999999877
No 34
>PF08069 Ribosomal_S13_N: Ribosomal S13/S15 N-terminal domain; InterPro: IPR012606 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. This domain is found at the N terminus of ribosomal S13 and S15 proteins. This domain is also identified as NUC021.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005840 ribosome; PDB: 3U5C_N 3O30_G 3IZB_O 3O2Z_G 3U5G_N 2XZN_O 2XZM_O 3IZ6_O.
Probab=48.45 E-value=28 Score=27.29 Aligned_cols=29 Identities=21% Similarity=0.416 Sum_probs=25.7
Q ss_pred HHHHHHHHHHhhCCCChHHHHHHHHhcCC
Q 003149 155 EQEANILVDLSRSNISPKEILQTLKQRDM 183 (844)
Q Consensus 155 ~~~~~~i~~l~~~~~~p~~I~~~l~~~~~ 183 (844)
++..+.|.+|.+.|++|.+|-..|+++++
T Consensus 31 ~eVe~~I~klakkG~tpSqIG~iLRD~~G 59 (60)
T PF08069_consen 31 EEVEELIVKLAKKGLTPSQIGVILRDQYG 59 (60)
T ss_dssp HHHHHHHHHHCCTTHCHHHHHHHHHHSCT
T ss_pred HHHHHHHHHHHHcCCCHHHhhhhhhhccC
Confidence 56678889999999999999999999874
No 35
>PF13936 HTH_38: Helix-turn-helix domain; PDB: 2W48_A.
Probab=47.78 E-value=25 Score=25.64 Aligned_cols=30 Identities=30% Similarity=0.362 Sum_probs=15.4
Q ss_pred cCCCHHHHHHHHHHhhCCCChHHHHHHHHh
Q 003149 151 GRLTEQEANILVDLSRSNISPKEILQTLKQ 180 (844)
Q Consensus 151 rrl~~~~~~~i~~l~~~~~~p~~I~~~l~~ 180 (844)
++|+.+++..|..+.+.|.+.++|...|..
T Consensus 3 ~~Lt~~eR~~I~~l~~~G~s~~~IA~~lg~ 32 (44)
T PF13936_consen 3 KHLTPEERNQIEALLEQGMSIREIAKRLGR 32 (44)
T ss_dssp ---------HHHHHHCS---HHHHHHHTT-
T ss_pred cchhhhHHHHHHHHHHcCCCHHHHHHHHCc
Confidence 468999999999999999999999987744
No 36
>PRK09784 hypothetical protein; Provisional
Probab=45.11 E-value=13 Score=37.01 Aligned_cols=36 Identities=25% Similarity=0.350 Sum_probs=25.7
Q ss_pred cccccccccCCCCCchHHHHHHhhcCCCccHHHHHHH
Q 003149 697 PYIHDVQDVIADGHCGFRVVAELMDIGEDNWAQVRRD 733 (844)
Q Consensus 697 ~~~~~~~~v~~dg~Cgfraia~~lg~~~~~~~~vr~~ 733 (844)
.|.++---|+|||-|..|||-.. ...+-+|..+-..
T Consensus 197 ~~glkyapvdgdgycllrailvl-k~h~yswal~s~k 232 (417)
T PRK09784 197 TYGLKYAPVDGDGYCLLRAILVL-KQHDYSWALGSHK 232 (417)
T ss_pred hhCceecccCCCchhHHHHHHHh-hhcccchhhccch
Confidence 45556666999999999999763 3455677766443
No 37
>PF03050 DDE_Tnp_IS66: Transposase IS66 family; InterPro: IPR004291 Transposase proteins are necessary for efficient DNA transposition. This family includes the bacterial insertion sequence (IS) element, IS66, from Agrobacterium tumefaciens. IS66 may cause genetic and structural variations of the T region and the vir region of the octopine Ti plasmids.
Probab=44.51 E-value=16 Score=38.49 Aligned_cols=85 Identities=19% Similarity=0.150 Sum_probs=53.4
Q ss_pred CcEEEEeccccC-----CCCCCeeEEEEEecCCCCceeeEEEeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHH
Q 003149 253 PSVVMIDCTYKT-----SMYPFSFLEIVGATSTELTFSIAFAYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLAL 327 (844)
Q Consensus 253 ~~vl~iD~T~~t-----n~~~~~l~~~~g~~~~~~~~~~a~al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al 327 (844)
.+++.+|-|.-. +.-+.-+.+++.-+ .+.|.+.++-..+...-+|.. ...++++|+-.+.
T Consensus 67 ~~~~~~DET~~~vl~~~~g~~~~~Wv~~~~~------~v~f~~~~sR~~~~~~~~L~~---------~~GilvsD~y~~Y 131 (271)
T PF03050_consen 67 SPVVHADETGWRVLDKGKGKKGYLWVFVSPE------VVLFFYAPSRSSKVIKEFLGD---------FSGILVSDGYSAY 131 (271)
T ss_pred cceeccCCceEEEeccccccceEEEeeeccc------eeeeeecccccccchhhhhcc---------cceeeeccccccc
Confidence 456666766544 22233344444333 556666666666555444332 3369999999876
Q ss_pred HHHHHHhCCCcccchhhhhHHHhHHHhhhh
Q 003149 328 MNSIRAVFPRATNLLCRWHISKNISVNCKK 357 (844)
Q Consensus 328 ~~Ai~~vfP~~~~~lC~~Hi~kn~~~~~~~ 357 (844)
.. +..+.|+.|+-|+.+.+..-...
T Consensus 132 ~~-----~~~~~hq~C~AH~~R~~~~~~~~ 156 (271)
T PF03050_consen 132 NK-----LAGITHQLCWAHLRRDFQDAAES 156 (271)
T ss_pred cc-----ccccccccccccccccccccccc
Confidence 54 22889999999999998876543
No 38
>PRK13907 rnhA ribonuclease H; Provisional
Probab=44.50 E-value=1.8e+02 Score=26.45 Aligned_cols=78 Identities=15% Similarity=0.080 Sum_probs=43.6
Q ss_pred EEEEeccccCCCCCCeeEEEEEecCCCCceeeEE-EeeccCccchHHHHHHHHHHhhcCCCCCeEEEeeccHHHHHHHHH
Q 003149 255 VVMIDCTYKTSMYPFSFLEIVGATSTELTFSIAF-AYLESERDDNYIWTLERLRSMMEDDALPRVIVTDKDLALMNSIRA 333 (844)
Q Consensus 255 vl~iD~T~~tn~~~~~l~~~~g~~~~~~~~~~a~-al~~~E~~es~~w~l~~lk~~~~~~~~P~~iitD~~~al~~Ai~~ 333 (844)
.|.+|+.++.|.-..-...++ .+..+.. ...+ .-..+....-|.-++..|+.+...+..+..|.+|. ..+.+++..
T Consensus 3 ~iy~DGa~~~~~g~~G~G~vi-~~~~~~~-~~~~~~~~~tn~~AE~~All~aL~~a~~~g~~~v~i~sDS-~~vi~~~~~ 79 (128)
T PRK13907 3 EVYIDGASKGNPGPSGAGVFI-KGVQPAV-QLSLPLGTMSNHEAEYHALLAALKYCTEHNYNIVSFRTDS-QLVERAVEK 79 (128)
T ss_pred EEEEeeCCCCCCCccEEEEEE-EECCeeE-EEEecccccCCcHHHHHHHHHHHHHHHhCCCCEEEEEech-HHHHHHHhH
Confidence 378999998876443333333 3444432 2222 11223344456677777777764455566777876 556666665
Q ss_pred hC
Q 003149 334 VF 335 (844)
Q Consensus 334 vf 335 (844)
.+
T Consensus 80 ~~ 81 (128)
T PRK13907 80 EY 81 (128)
T ss_pred HH
Confidence 54
No 39
>COG5431 Uncharacterized metal-binding protein [Function unknown]
Probab=34.84 E-value=15 Score=31.83 Aligned_cols=25 Identities=28% Similarity=0.371 Sum_probs=18.0
Q ss_pred CCCCCccccc-----cCCccchhHhhHHhh
Q 003149 521 ASACGCVFRR-----THGLPCAHEIAEYKH 545 (844)
Q Consensus 521 ~~~CsC~~~~-----~~GlPC~H~lav~~~ 545 (844)
.--|||.++- .-.-||.|++.+-..
T Consensus 49 ~gfCSCp~~~~svvl~Gk~~C~Hi~glk~A 78 (117)
T COG5431 49 GGFCSCPDFLGSVVLKGKSPCAHIIGLKVA 78 (117)
T ss_pred cCcccCHHHHhHhhhcCcccchhhhheeee
Confidence 3489999887 235579999986433
No 40
>KOG4540 consensus Putative lipase essential for disintegration of autophagic bodies inside the vacuole [Intracellular trafficking, secretion, and vesicular transport; Lipid transport and metabolism]
Probab=32.79 E-value=1.3e+02 Score=31.59 Aligned_cols=57 Identities=16% Similarity=0.204 Sum_probs=35.6
Q ss_pred HHHHHHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccccCCCchhhhhc----ccCeeEEEEcC
Q 003149 732 RDLVDELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRMIMPDTGHLIAS----RYNIVLMHLSQ 797 (844)
Q Consensus 732 ~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~~~~~~~a~----~~~~~v~~~~~ 797 (844)
+=+-+|++.....|...+ ++-.-+.++ ++ .-.-|++-.+.|.+||+ +||.|+|.+++
T Consensus 246 ~ClE~eir~~dryySa~l----dI~~~v~~~-Yp----da~iwlTGHSLGGa~AsLlG~~fglP~VaFes 306 (425)
T KOG4540|consen 246 ECLEEEIREFDRYYSAAL----DILGAVRRI-YP----DARIWLTGHSLGGAIASLLGIRFGLPVVAFES 306 (425)
T ss_pred HHHHHHHHhhcchhHHHH----HHHHHHHHh-CC----CceEEEeccccchHHHHHhccccCCceEEecC
Confidence 345556665544444322 222223333 33 34789999999999997 56679999986
No 41
>COG5153 CVT17 Putative lipase essential for disintegration of autophagic bodies inside the vacuole [Intracellular trafficking and secretion / Lipid metabolism]
Probab=32.79 E-value=1.3e+02 Score=31.59 Aligned_cols=57 Identities=16% Similarity=0.204 Sum_probs=35.6
Q ss_pred HHHHHHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccccCCCchhhhhc----ccCeeEEEEcC
Q 003149 732 RDLVDELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRMIMPDTGHLIAS----RYNIVLMHLSQ 797 (844)
Q Consensus 732 ~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~~~~~~~a~----~~~~~v~~~~~ 797 (844)
+=+-+|++.....|...+ ++-.-+.++ ++ .-.-|++-.+.|.+||+ +||.|+|.+++
T Consensus 246 ~ClE~eir~~dryySa~l----dI~~~v~~~-Yp----da~iwlTGHSLGGa~AsLlG~~fglP~VaFes 306 (425)
T COG5153 246 ECLEEEIREFDRYYSAAL----DILGAVRRI-YP----DARIWLTGHSLGGAIASLLGIRFGLPVVAFES 306 (425)
T ss_pred HHHHHHHHhhcchhHHHH----HHHHHHHHh-CC----CceEEEeccccchHHHHHhccccCCceEEecC
Confidence 345556665544444322 222223333 33 34789999999999997 56679999986
No 42
>PF09921 DUF2153: Uncharacterized protein conserved in archaea (DUF2153); InterPro: IPR014450 There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.
Probab=28.38 E-value=38 Score=30.52 Aligned_cols=48 Identities=13% Similarity=0.448 Sum_probs=35.7
Q ss_pred ccHHHHHHHHHHHHHhhhhhHHH------hhCCHHHHHHHHhhcCCCCCCCCccccccCC
Q 003149 725 DNWAQVRRDLVDELQSHYDDYIQ------LYGDAEIARELLHSLSYSESNPGIEHRMIMP 778 (844)
Q Consensus 725 ~~~~~vr~~l~~el~~~~~~y~~------~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~ 778 (844)
|.|+...+.++++++++.+.|.. ++.+-..|..++..|. .-|.||..|
T Consensus 3 d~WVk~Qk~~l~~~~~~e~~~~~~DRL~LIl~sr~afqhm~RTlK------aFd~WLqdP 56 (126)
T PF09921_consen 3 DEWVKMQKRLLETFKKHEKNVESADRLDLILSSRAAFQHMMRTLK------AFDQWLQDP 56 (126)
T ss_pred HHHHHHHHHHHHHHHHHHHHhhhhhHHHHHHHHHHHHHHHHHHHH------HHHHHHcCc
Confidence 78999999999999999876653 2233335666666675 568888887
No 43
>PF09607 BrkDBD: Brinker DNA-binding domain; InterPro: IPR018586 This DNA-binding domain is the first approx. 100 residues of the N-terminal end of Brinker. The structure of this domain in complex with DNA consists of four alpha-helices that contain a helix-turn-helix DNA recognition motif specific for GC-rich DNA. The Brinker nuclear repressor is a major element of the Drosophila Decapentaplegic morphogen signalling pathway.; PDB: 2GLO_A.
Probab=26.70 E-value=47 Score=25.84 Aligned_cols=18 Identities=22% Similarity=0.501 Sum_probs=14.6
Q ss_pred CCCCch--HHHHHHhhcCCC
Q 003149 707 ADGHCG--FRVVAELMDIGE 724 (844)
Q Consensus 707 ~dg~Cg--fraia~~lg~~~ 724 (844)
-|+||- +||.|+..|.++
T Consensus 20 ~~~nc~~~~RAaarkf~V~r 39 (58)
T PF09607_consen 20 KDNNCKGNQRAAARKFNVSR 39 (58)
T ss_dssp H-TTTTT-HHHHHHHTTS-H
T ss_pred HccchhhhHHHHHHHhCccH
Confidence 688998 999999999975
No 44
>PF03412 Peptidase_C39: Peptidase C39 family This is family C39 in the peptidase classification.; InterPro: IPR005074 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate; these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures.
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. The group of sequences defined by this cysteine peptidase domain belongs to the MEROPS peptidase family C39 (clan CA). It is found in a wide range of ABC transporters, which are maturation proteases for peptide bacteriocins, the proteolytic domain residing in the N-terminal region of the protein. A number of the proteins are classified as non-peptidase homologues as they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity. Lantibiotic and non-lantibiotic bacteriocins are synthesised as precursor peptides containing N-terminal extensions (leader peptides) which are cleaved off during maturation. Most non-lantibiotics and also some lantibiotics have leader peptides of the so-called double-glycine type. These leader peptides share consensus sequences and also a common processing site with two conserved glycine residues in positions -1 and -2. The double-glycine-type leader peptides are unrelated to the N-terminal signal sequences which direct proteins across the cytoplasmic membrane via the sec pathway. Their processing sites are also different from typical signal peptidase cleavage sites, suggesting that a different processing enzyme is involved.; GO: 0005524 ATP binding, 0008233 peptidase activity, 0006508 proteolysis, 0016021 integral to membrane; PDB: 3K8U_A 3B79_A.
Probab=25.69 E-value=3.2e+02 Score=24.60 Aligned_cols=83 Identities=20% Similarity=0.285 Sum_probs=44.0
Q ss_pred CCCchHHHHHHhhcCCCccHHHHHHHHHHHHHhhhhhHHHhhCCHHHHHHHHhhcCCCCCCCCccccccCCCchhhhhcc
Q 003149 708 DGHCGFRVVAELMDIGEDNWAQVRRDLVDELQSHYDDYIQLYGDAEIARELLHSLSYSESNPGIEHRMIMPDTGHLIASR 787 (844)
Q Consensus 708 dg~Cgfraia~~lg~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~w~~~~~~~~~~a~~ 787 (844)
+-.||..|+|..+.. ++-.-..+++...+. ....-+++.++. .+|..
T Consensus 10 ~~dcg~acl~~l~~~--------------------------~g~~~s~~~l~~~~~------~~~~g~s~~~L~-~~~~~ 56 (131)
T PF03412_consen 10 SNDCGLACLAMLLKY--------------------------YGIPVSEEELRRQLG------TSEEGTSLADLK-RAARK 56 (131)
T ss_dssp TT-HHHHHHHHHHHH--------------------------TT----HHHHHCCTT-------BTTB--CCCHH-HHHHH
T ss_pred CCCHHHHHHHHHHHH--------------------------hCCCchHHHHHHHhc------CCccCCCHHHHH-HHHHh
Confidence 457999999887732 233334556666663 223345666665 67889
Q ss_pred cCeeEEEEcCCccccccCCCCCCCCCCCCCcEEEEEeCCCcEEEEEe
Q 003149 788 YNIVLMHLSQQQCFTFLPLRSVPLPRTSRKIVTIGFVNECQFVKVLD 834 (844)
Q Consensus 788 ~~~~v~~~~~~~~~~~~p~~~~p~~~~~~~~i~l~~~~~~H~~~~~~ 834 (844)
|+...-.+..... -| . .. +-| +|++++.+||+.|.-
T Consensus 57 ~gl~~~~~~~~~~--~l--~----~~--~~P-~I~~~~~~h~vVi~~ 92 (131)
T PF03412_consen 57 YGLKAKAVKLNFE--KL--K----RL--PLP-AIAHLKDGHFVVIYK 92 (131)
T ss_dssp TTEEEEEEE--GG--GC--T----CG--GSS-EEEEECCCEEEEEEE
T ss_pred cccceeeeecchh--hh--h----hc--ccc-EEEEecCcceEEEEe
Confidence 9987777765433 11 1 11 122 334446689998774
No 45
>KOG4825 consensus Component of synaptic membrane glycine-, glutamate- and thienylcyclohexylpiperidine-binding glycoprotein (43kDa) [Signal transduction mechanisms]
Probab=25.52 E-value=1e+02 Score=34.24 Aligned_cols=36 Identities=22% Similarity=0.133 Sum_probs=30.1
Q ss_pred CCCCcccccCCCCCCcCCCCCcccCCCccceeeecc
Q 003149 616 ASTSLVELEVDGFPLSKLGTSTYQDPSELQYVLSVQ 651 (844)
Q Consensus 616 ~~~~~~~~~~Kg~p~~~~~~st~r~ps~~e~~~~~~ 651 (844)
.++.+.+..+.|+|..-..-++++.||.||-.+++|
T Consensus 279 pmpSlpqleepgrenqfaepflqekpsswelpIrPq 314 (666)
T KOG4825|consen 279 PMPSLPQLEEPGRENQFAEPFLQEKPSSWELPIRPQ 314 (666)
T ss_pred CCCccccccCCCCccccccchhhcCCCcceeecccc
Confidence 344457788899998878889999999999888887
No 46
>KOG0030 consensus Myosin essential light chain, EF-Hand protein superfamily [Cytoskeleton]
Probab=25.51 E-value=28 Score=32.23 Aligned_cols=86 Identities=12% Similarity=0.168 Sum_probs=48.5
Q ss_pred cccccccCCCCCchHHH---HHHhhcCCCcc---HHHHHHHHHHHHHhhhh---hHHHhh------CCHHHHHHHHhhcC
Q 003149 699 IHDVQDVIADGHCGFRV---VAELMDIGEDN---WAQVRRDLVDELQSHYD---DYIQLY------GDAEIARELLHSLS 763 (844)
Q Consensus 699 ~~~~~~v~~dg~Cgfra---ia~~lg~~~~~---~~~vr~~l~~el~~~~~---~y~~~~------~~~~~~~~~~~~l~ 763 (844)
|...+|..|||-=+++- +.++||.++-. +..+++---.|+...+- .+.+++ .....|++++.+|.
T Consensus 16 ~F~lfD~~gD~ki~~~q~gdvlRalG~nPT~aeV~k~l~~~~~~~~~~~rl~FE~fLpm~q~vaknk~q~t~edfvegLr 95 (152)
T KOG0030|consen 16 AFLLFDRTGDGKISGSQVGDVLRALGQNPTNAEVLKVLGQPKRREMNVKRLDFEEFLPMYQQVAKNKDQGTYEDFVEGLR 95 (152)
T ss_pred HHHHHhccCcccccHHHHHHHHHHhcCCCcHHHHHHHHcCcccchhhhhhhhHHHHHHHHHHHHhccccCcHHHHHHHHH
Confidence 34567888998655554 45677887632 23333333333333443 445554 12235999999998
Q ss_pred CCCCCCCccccccCCCchhhhhc
Q 003149 764 YSESNPGIEHRMIMPDTGHLIAS 786 (844)
Q Consensus 764 ~~~~~~~~~~w~~~~~~~~~~a~ 786 (844)
+.++ ....|+.-...-|+|++
T Consensus 96 vFDk--eg~G~i~~aeLRhvLtt 116 (152)
T KOG0030|consen 96 VFDK--EGNGTIMGAELRHVLTT 116 (152)
T ss_pred hhcc--cCCcceeHHHHHHHHHH
Confidence 7664 23345555555566654
No 47
>PRK08561 rps15p 30S ribosomal protein S15P; Reviewed
Probab=23.34 E-value=3e+02 Score=26.11 Aligned_cols=31 Identities=23% Similarity=0.298 Sum_probs=26.7
Q ss_pred CHHHHHHHHHHhhCCCChHHHHHHHHhcCCC
Q 003149 154 TEQEANILVDLSRSNISPKEILQTLKQRDMH 184 (844)
Q Consensus 154 ~~~~~~~i~~l~~~~~~p~~I~~~l~~~~~~ 184 (844)
.++..+.|..|.+.|.+|.+|-..|+++++.
T Consensus 30 ~eeve~~I~~lakkG~~pSqIG~~LRD~~gi 60 (151)
T PRK08561 30 PEEIEELVVELAKQGYSPSMIGIILRDQYGI 60 (151)
T ss_pred HHHHHHHHHHHHHCCCCHHHhhhhHhhccCC
Confidence 4566788889999999999999999999853
No 48
>TIGR03277 methan_mark_9 putative methanogenesis marker domain 9. A gene for a protein that contains a copy of this domain, to date, is found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. A 69-amino acid core region of this 110-amino acid domain contains eight invariant Cys residues, including two copies of a motif [WFY]CCxxKPC. These motifs could be consistent with predicted metal-binding transcription factor as was suggested for the COG4008 family. Some members of this family have an additional N-terminal domain of about 250 amino acids from the nifR3 family of predicted TIM-barrel proteins.
Probab=21.59 E-value=69 Score=28.06 Aligned_cols=32 Identities=16% Similarity=0.481 Sum_probs=28.1
Q ss_pred CCchHH-HHHHhhcCCCccHHHHHHHHHHHHHh
Q 003149 709 GHCGFR-VVAELMDIGEDNWAQVRRDLVDELQS 740 (844)
Q Consensus 709 g~Cgfr-aia~~lg~~~~~~~~vr~~l~~el~~ 740 (844)
-.|-|| ..-..+|++.+.+..+.+++.+||..
T Consensus 76 KPCplrd~aL~~igls~~EYm~lKkelae~i~~ 108 (109)
T TIGR03277 76 KPCPLRDSALQRIGMSPEEYMELKKKLAEELLK 108 (109)
T ss_pred CCCcCchHHHHHcCCCHHHHHHHHHHHHHHHhc
Confidence 358889 68889999999999999999999864
No 49
>PF13082 DUF3931: Protein of unknown function (DUF3931)
Probab=21.26 E-value=1.8e+02 Score=21.75 Aligned_cols=42 Identities=24% Similarity=0.307 Sum_probs=25.5
Q ss_pred CcEEEEeccccC-CCC------------CCeeEEEEEecCCCCceeeEEEeeccC
Q 003149 253 PSVVMIDCTYKT-SMY------------PFSFLEIVGATSTELTFSIAFAYLESE 294 (844)
Q Consensus 253 ~~vl~iD~T~~t-n~~------------~~~l~~~~g~~~~~~~~~~a~al~~~E 294 (844)
.+|+.||+--+. ..| .+.-+++.|-+..|+...+...+..+|
T Consensus 8 cnvisidgkkkksdtysypklvvenktyefssfvlcgetpdgrrlvlthmistde 62 (66)
T PF13082_consen 8 CNVISIDGKKKKSDTYSYPKLVVENKTYEFSSFVLCGETPDGRRLVLTHMISTDE 62 (66)
T ss_pred ccEEEeccccccCCcccCceEEEeCceEEEEEEEEEccCCCCcEEEEEEEecchh
Confidence 467788875543 223 344455667777777777776665544
No 50
>PRK14702 insertion element IS2 transposase InsD; Provisional
Probab=20.74 E-value=3.2e+02 Score=28.62 Aligned_cols=72 Identities=7% Similarity=-0.121 Sum_probs=47.2
Q ss_pred cEEEEeccccCCCCCCeeEEEEEecCCCCceeeEEEeecc-CccchHHHHHHH-HHHhhc--CCCCCeEEEeeccHH
Q 003149 254 SVVMIDCTYKTSMYPFSFLEIVGATSTELTFSIAFAYLES-ERDDNYIWTLER-LRSMME--DDALPRVIVTDKDLA 326 (844)
Q Consensus 254 ~vl~iD~T~~tn~~~~~l~~~~g~~~~~~~~~~a~al~~~-E~~es~~w~l~~-lk~~~~--~~~~P~~iitD~~~a 326 (844)
.++..|-||-....+.-++..+.+|...+ .++||.+... .+.+....+|+. +..... ....|.+|.||+...
T Consensus 88 ~~W~~DiT~~~~~~g~~~Yl~~viD~~sR-~ivg~~is~~~~~~~~v~~~l~~A~~~~~~~~~~~~~~iihSD~Gsq 163 (262)
T PRK14702 88 QRWCSDGFEFCCDNGERLRVTFALDCCDR-EALHWAVTTGGFNSETVQDVMLGAVERRFGNDLPSSPVEWLTDNGSC 163 (262)
T ss_pred CEEEeeeEEEEecCCcEEEEEEEEecccc-eeeeEEeccCcCCHHHHHHHHHHHHHHHhcccCCCCCeEEEcCCCcc
Confidence 78999988865544446777778887665 7789988764 455555555543 333221 133577899998853
No 51
>KOG0400 consensus 40S ribosomal protein S13 [Translation, ribosomal structure and biogenesis]
Probab=20.49 E-value=5.1e+02 Score=23.78 Aligned_cols=102 Identities=14% Similarity=0.208 Sum_probs=59.9
Q ss_pred CCHHHHHHHHHHhhCCCChHHHHHHHHhcCCCc---cchHHHHHHHHHHhhhccccCcchHHHHHHH-------HHhcCc
Q 003149 153 LTEQEANILVDLSRSNISPKEILQTLKQRDMHN---VSTIKAIYNARHKYRVGEQVGQLHMHQLLDK-------LRKHGY 222 (844)
Q Consensus 153 l~~~~~~~i~~l~~~~~~p~~I~~~l~~~~~~~---~~t~kdi~n~~~~~r~~~~~g~~~~~~ll~~-------l~e~~~ 222 (844)
..++.++.|..+.+.|++|.+|--.|++.++.. ..+-..|.+..++--.... =-.|+..|+.. |+.+-
T Consensus 29 ~~ddvkeqI~K~akKGltpsqIGviLRDshGi~q~r~v~G~kI~Rilk~~Gl~Pe-iPeDLy~likkAv~iRkHLer~R- 106 (151)
T KOG0400|consen 29 TADDVKEQIYKLAKKGLTPSQIGVILRDSHGIGQVRFVTGNKILRILKSNGLAPE-IPEDLYHLIKKAVAIRKHLERNR- 106 (151)
T ss_pred CHHHHHHHHHHHHHcCCChhHceeeeecccCcchhheechhHHHHHHHHcCCCCC-CcHHHHHHHHHHHHHHHHHHHhc-
Confidence 456778999999999999999999999877532 2344555555444322111 01144444443 22111
Q ss_pred EEEEEeeccCCceeeeEecChHHHHHHHhCCcEEEEecccc
Q 003149 223 IEWHRYNEETDCFKDLFWAHPFAVGLLRAFPSVVMIDCTYK 263 (844)
Q Consensus 223 ~~~~~~~d~~~~~~~if~~~~~~~~~~~~~~~vl~iD~T~~ 263 (844)
.|.+..| ++++....+..+.++|..+..+-.+++
T Consensus 107 ------KD~d~K~-RLILveSRihRlARYYk~~~~lPp~WK 140 (151)
T KOG0400|consen 107 ------KDKDAKF-RLILVESRIHRLARYYKTKMVLPPNWK 140 (151)
T ss_pred ------cccccce-EEEeehHHHHHHHHHHHhcccCCCCCC
Confidence 2333333 455566778888888776655554443
Done!