Query 004420
Match_columns 754
No_of_seqs 318 out of 1394
Neff 8.9
Searched_HMMs 46136
Date Thu Mar 28 23:08:52 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/004420.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/004420hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PLN03097 FHY3 Protein FAR-RED 100.0 5E-124 1E-128 1065.1 62.7 686 22-746 54-765 (846)
2 PF10551 MULE: MULE transposas 99.9 6.5E-22 1.4E-26 171.0 8.4 90 276-367 1-93 (93)
3 PF03101 FAR1: FAR1 DNA-bindin 99.8 3.3E-19 7.2E-24 153.1 8.8 91 56-150 1-91 (91)
4 PF00872 Transposase_mut: Tran 99.8 5.2E-20 1.1E-24 199.6 2.8 255 155-457 91-352 (381)
5 COG3328 Transposase and inacti 99.4 8.6E-12 1.9E-16 132.4 14.9 261 158-469 80-345 (379)
6 PF08731 AFT: Transcription fa 99.2 4.7E-11 1E-15 101.1 10.1 91 48-148 1-111 (111)
7 smart00575 ZnF_PMZ plant mutat 98.9 8.7E-10 1.9E-14 70.8 1.7 26 560-585 2-27 (28)
8 PF03108 DBD_Tnp_Mut: MuDR fam 98.5 6.5E-07 1.4E-11 71.6 7.9 66 40-137 2-67 (67)
9 PF04434 SWIM: SWIM zinc finge 98.2 1.1E-06 2.5E-11 62.2 3.2 29 555-583 11-39 (40)
10 PF01610 DDE_Tnp_ISL3: Transpo 97.1 0.00038 8.2E-09 71.7 3.9 94 272-371 1-97 (249)
11 PF06782 UPF0236: Uncharacteri 96.5 0.22 4.8E-06 56.0 19.8 248 157-457 112-380 (470)
12 PF13610 DDE_Tnp_IS240: DDE do 95.8 0.0067 1.5E-07 56.3 3.0 81 269-353 1-81 (140)
13 PF03106 WRKY: WRKY DNA -bindi 95.4 0.065 1.4E-06 41.4 6.6 57 65-147 3-59 (60)
14 PF15288 zf-CCHC_6: Zinc knuck 94.7 0.018 3.8E-07 39.8 1.5 22 723-744 2-25 (40)
15 PF00098 zf-CCHC: Zinc knuckle 94.7 0.025 5.5E-07 32.1 1.9 17 724-740 2-18 (18)
16 PF13696 zf-CCHC_2: Zinc knuck 93.2 0.042 9.1E-07 36.0 1.0 24 721-744 7-30 (32)
17 smart00774 WRKY DNA binding do 89.1 0.76 1.6E-05 35.2 4.4 56 66-146 4-59 (59)
18 PF00665 rve: Integrase core d 88.9 1.8 3.9E-05 38.4 7.7 75 269-344 6-81 (120)
19 PRK08561 rps15p 30S ribosomal 82.2 2.7 5.9E-05 38.6 5.1 79 165-243 31-143 (151)
20 COG5179 TAF1 Transcription ini 80.3 1.1 2.3E-05 49.8 2.2 27 720-746 935-963 (968)
21 PF04500 FLYWCH: FLYWCH zinc f 79.4 3.6 7.7E-05 31.6 4.5 25 116-146 38-62 (62)
22 PF14392 zf-CCHC_4: Zinc knuck 78.3 0.84 1.8E-05 33.7 0.5 18 723-740 32-49 (49)
23 PF03050 DDE_Tnp_IS66: Transpo 77.7 7.7 0.00017 40.3 7.7 145 166-372 7-156 (271)
24 smart00343 ZnF_C2HC zinc finge 75.7 1.5 3.2E-05 27.4 1.0 19 724-742 1-19 (26)
25 PF04937 DUF659: Protein of un 75.6 19 0.00041 33.8 8.9 108 262-371 26-137 (153)
26 KOG0400 40S ribosomal protein 70.7 6.6 0.00014 34.8 4.1 109 157-279 22-140 (151)
27 PF13936 HTH_38: Helix-turn-he 70.5 3.8 8.3E-05 29.4 2.3 29 160-188 2-30 (44)
28 PF04684 BAF1_ABF1: BAF1 / ABF 69.4 8.7 0.00019 41.8 5.6 54 45-127 25-78 (496)
29 KOG4602 Nanos and related prot 67.5 19 0.00041 35.8 7.0 31 717-747 263-296 (318)
30 PF08069 Ribosomal_S13_N: Ribo 63.6 5.5 0.00012 30.6 2.0 36 159-194 24-60 (60)
31 PF05741 zf-nanos: Nanos RNA b 60.6 3.1 6.7E-05 31.4 0.2 22 720-741 31-55 (55)
32 PHA02517 putative transposase 60.1 78 0.0017 32.8 10.8 73 269-343 110-182 (277)
33 COG3316 Transposase and inacti 57.6 17 0.00036 35.9 4.7 82 270-356 71-152 (215)
34 COG5431 Uncharacterized metal- 57.6 24 0.00053 30.1 4.9 20 560-579 51-75 (117)
35 KOG3517 Transcription factor P 49.0 27 0.00059 34.6 4.6 69 161-240 19-98 (334)
36 PF11433 DUF3198: Protein of u 47.1 29 0.00063 24.9 3.2 44 384-427 6-49 (51)
37 PRK14702 insertion element IS2 42.9 90 0.002 32.2 7.8 72 269-341 87-163 (262)
38 PF02796 HTH_7: Helix-turn-hel 41.9 23 0.0005 25.4 2.3 40 161-211 4-43 (45)
39 PF00292 PAX: 'Paired box' dom 41.9 28 0.0006 31.3 3.2 70 160-240 15-95 (125)
40 PTZ00072 40S ribosomal protein 39.1 46 0.001 30.5 4.2 30 165-194 28-57 (148)
41 PRK09409 IS2 transposase TnpB; 39.0 1.1E+02 0.0025 32.2 8.0 72 269-341 126-202 (301)
42 COG4279 Uncharacterized conser 36.8 17 0.00037 36.3 1.2 24 558-584 124-147 (266)
43 COG4715 Uncharacterized conser 34.9 1.8E+02 0.0039 33.0 8.6 50 531-582 45-94 (587)
44 PRK14892 putative transcriptio 32.3 31 0.00068 29.6 1.9 10 720-729 19-28 (99)
45 COG5222 Uncharacterized conser 31.1 23 0.00049 36.1 1.0 26 722-747 176-201 (427)
46 COG5470 Uncharacterized conser 30.8 24 0.00051 29.8 0.9 42 29-70 41-82 (96)
47 KOG0341 DEAD-box protein abstr 30.3 22 0.00048 37.9 0.8 20 724-743 572-591 (610)
48 PF02178 AT_hook: AT hook moti 28.7 24 0.00052 18.3 0.4 9 700-708 2-10 (13)
49 PF02171 Piwi: Piwi domain; I 28.6 1.8E+02 0.0039 30.5 7.5 98 271-368 79-200 (302)
50 PRK00766 hypothetical protein; 27.5 3.1E+02 0.0068 26.8 8.1 34 330-366 99-135 (194)
51 PF15299 ALS2CR8: Amyotrophic 26.0 59 0.0013 32.7 3.0 20 108-127 70-89 (225)
52 PF13917 zf-CCHC_3: Zinc knuck 24.0 39 0.00086 24.0 0.9 17 724-740 6-22 (42)
53 smart00351 PAX Paired Box doma 22.7 2E+02 0.0043 25.8 5.5 69 161-240 16-95 (125)
54 PF12353 eIF3g: Eukaryotic tra 22.7 51 0.0011 29.9 1.6 23 720-743 104-126 (128)
55 PF04800 ETC_C1_NDUFA4: ETC co 21.9 73 0.0016 27.5 2.3 27 43-73 50-76 (101)
56 COG5082 AIR1 Arginine methyltr 21.1 50 0.0011 31.9 1.3 19 723-741 98-117 (190)
57 PF11427 HTH_Tnp_Tc3_1: Tc3 tr 20.6 94 0.002 23.1 2.4 28 162-189 4-31 (50)
58 PF13384 HTH_23: Homeodomain-l 20.5 87 0.0019 22.6 2.3 40 166-216 5-44 (50)
No 1
>PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional
Probab=100.00 E-value=5.2e-124 Score=1065.09 Aligned_cols=686 Identities=25% Similarity=0.431 Sum_probs=571.9
Q ss_pred CccCCCCcccccccCCCCCCCCCCCccCCHHHHHHHHHHHHhhcCCeEEEcceeccCCCCcceEEEEEeecCCCCCCCCC
Q 004420 22 TIEENPEETILSQQTSVNLVPFIGQRFVSQDAAYEFYCSFAKQCGFSIRRHRTRGKDGVGRGVTRRDFTCHRGGFPQMKP 101 (754)
Q Consensus 22 ~~~~~~~~~~~~~~~~~~~~P~vg~~F~S~eea~~~y~~yA~~~GF~ir~~~s~~~~~~~k~~~~~~~~C~r~G~~~~~~ 101 (754)
.+..+++...+..+.+...+|.+||+|+|+|||++||+.||.+.||+||+.++++++.. +.++.++|+|+|+|+++.+.
T Consensus 54 ~~~~~~~~~~~~~~~~~~~~P~vGMeF~S~eeA~~FYn~YA~~~GFsVRi~~srrsk~~-~~ii~r~fvCsreG~~~~~~ 132 (846)
T PLN03097 54 GDMNSPTGELVEFKEDTNLEPLSGMEFESHGEAYSFYQEYARSMGFNTAIQNSRRSKTS-REFIDAKFACSRYGTKREYD 132 (846)
T ss_pred ccccccccccccccCCCCccCcCCCeECCHHHHHHHHHHHHhhcCceEEeeceeccCCC-CcEEEEEEEEcCCCCCcccc
Confidence 33445556666777888899999999999999999999999999999999999887654 36889999999999875432
Q ss_pred CC------------ccccccCCCcccccccceEEEEEeeeCCCCcEEEEeecCCccCCCCCccccccccccCCCCcchhh
Q 004420 102 SD------------DGKMQRNRKSSRCGCQAYMRIVKRVDFDVPEWHVTGFSNVHNHELLKLNEVRLLPAYCSITPDDKT 169 (754)
Q Consensus 102 ~~------------~~~~~r~~~s~r~gCpa~i~v~~~~~~~~~~w~V~~~~~~HNH~l~~~~~~~~l~s~R~l~~~~k~ 169 (754)
+. ....+++|+.+||||||+|+|++. .+|+|+|+.|+.+|||||.++..+ +
T Consensus 133 ~~~~~~~~~~~k~~~~~~~~rR~~tRtGC~A~m~Vk~~---~~gkW~V~~fv~eHNH~L~p~~~~---------~----- 195 (846)
T PLN03097 133 KSFNRPRARQTKQDPENGTGRRSCAKTDCKASMHVKRR---PDGKWVIHSFVKEHNHELLPAQAV---------S----- 195 (846)
T ss_pred cccccccccccccCcccccccccccCCCCceEEEEEEc---CCCeEEEEEEecCCCCCCCCcccc---------c-----
Confidence 10 001123456789999999999875 458999999999999999876421 1
Q ss_pred HHHHhhhcCCcHHHHHHHHHHhcCccCCCCCcchhhhhhhHHHhhhc-cCcccHHHHHHHHHHhhcCCCCcEEEEEecCC
Q 004420 170 RICMFAKAGMSVRQMLRLMELEKGVKLGCLPFTEIDVRNLLQSFRNV-NRDYDAIDLIAMCKKMKDKNPNFQYDFKMDGH 248 (754)
Q Consensus 170 ~i~~l~~~g~~~~~i~~~l~~~~g~~~~~~~~~~~Di~N~~~~~r~~-~~~~d~~~l~~~~~~~~~~np~~~~~~~~d~~ 248 (754)
+.++.++..+....+. +.++..+..|..|...+.|+. +..+|+++|++||++++.+||+|+|.+++|++
T Consensus 196 ---------~~~r~~~~~~~~~~~~-~~~v~~~~~d~~~~~~~~r~~~~~~gD~~~ll~yf~~~q~~nP~Ffy~~qlDe~ 265 (846)
T PLN03097 196 ---------EQTRKMYAAMARQFAE-YKNVVGLKNDSKSSFDKGRNLGLEAGDTKILLDFFTQMQNMNSNFFYAVDLGED 265 (846)
T ss_pred ---------hhhhhhHHHHHhhhhc-cccccccchhhcchhhHHHhhhcccchHHHHHHHHHHHHhhCCCceEEEEEccC
Confidence 1233444444444442 567777888888887776644 56899999999999999999999999999999
Q ss_pred CceeeEEeccchhHHHHHHcCCEEEEcCccccCcCCceeeEEEEeecCCceeEEeeecccCCchhhHHHHHHHHHHHhCC
Q 004420 249 NRLEHIAWSYASSVQLYEAFGDALVFDTTHRLDSYDMLFGIWVGLDNHGMACFFGCVLLRDENMQSFSWSLKTLLGFMNG 328 (754)
Q Consensus 249 ~~~~~if~~~~~~~~~~~~~~dvl~~D~Ty~~n~y~~~l~~~~g~d~~~~~~~~a~al~~~E~~es~~w~l~~~~~~~~~ 328 (754)
|++++|||+|+.|+.+|.+|||||+||+||+||+|++||++|+|||+|+|+++|||||+.+|+.++|.|+|++|+++|++
T Consensus 266 ~~l~niFWaD~~sr~~Y~~FGDvV~fDTTY~tN~y~~Pfa~FvGvNhH~qtvlfGcaLl~dEt~eSf~WLf~tfl~aM~g 345 (846)
T PLN03097 266 QRLKNLFWVDAKSRHDYGNFSDVVSFDTTYVRNKYKMPLALFVGVNQHYQFMLLGCALISDESAATYSWLMQTWLRAMGG 345 (846)
T ss_pred CCeeeEEeccHHHHHHHHhcCCEEEEeceeeccccCcEEEEEEEecCCCCeEEEEEEEcccCchhhHHHHHHHHHHHhCC
Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999
Q ss_pred CCCeEeeccccHHHHHHHHhhCCCCccccchhhhhhhccccCCcccCCChhhHHHHHHH-HhcCCCHHHHHHHHHHHHHh
Q 004420 329 KAPQTLLTDQNIWLKEAVAVEMPETKHAVYIWHILAKLSDSLPTFLGSSYDDWKAEFYR-LYNLELEEDFEEEWSKMVNK 407 (754)
Q Consensus 329 ~~p~~iitD~~~~l~~Ai~~vfP~a~h~lC~~Hi~~n~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~eFe~~w~~l~~~ 407 (754)
+.|++||||+|.+|.+||++|||++.|++|+|||++|+.+++..++ ...+.|..+|+. ++.+.+++|||..|..|+++
T Consensus 346 k~P~tIiTDqd~am~~AI~~VfP~t~Hr~C~wHI~~~~~e~L~~~~-~~~~~f~~~f~~cv~~s~t~eEFE~~W~~mi~k 424 (846)
T PLN03097 346 QAPKVIITDQDKAMKSVISEVFPNAHHCFFLWHILGKVSENLGQVI-KQHENFMAKFEKCIYRSWTEEEFGKRWWKILDR 424 (846)
T ss_pred CCCceEEecCCHHHHHHHHHHCCCceehhhHHHHHHHHHHHhhHHh-hhhhHHHHHHHHHHhCCCCHHHHHHHHHHHHHh
Confidence 9999999999999999999999999999999999999999998765 346789999987 45689999999999999999
Q ss_pred ccCchhHHHHHHHHhhcccccccccccccccccCCCccchHHHHHHhhhcccccHHHHHHHHHHHHHhHhHHHHHHHHHH
Q 004420 408 YGLREYKHITSLYALRTFWALPFLRHYFFAGLLSPCQSEAINAFIQRILSAQSQLDRFVERVAEIVEFNDRAATKQKMQR 487 (754)
Q Consensus 408 ~~~~~~~~l~~l~~~r~~W~~a~~~~~~~~g~~ttn~~Es~n~~lk~~l~~~~~l~~f~~~~~~~~~~~~~~e~~~~~~~ 487 (754)
|++++++||+.||+.|++||++|+++.|++||.||+|+||+|++|++|+++.++|..|+++|+.++..++++|++++..+
T Consensus 425 y~L~~n~WL~~LY~~RekWapaY~k~~F~agm~sTqRSES~Ns~fk~yv~~~tsL~~Fv~qye~~l~~~~ekE~~aD~~s 504 (846)
T PLN03097 425 FELKEDEWMQSLYEDRKQWVPTYMRDAFLAGMSTVQRSESINAFFDKYVHKKTTVQEFVKQYETILQDRYEEEAKADSDT 504 (846)
T ss_pred hcccccHHHHHHHHhHhhhhHHHhcccccCCcccccccccHHHHHHHHhCcCCCHHHHHHHHHHHHHHHHHHHHHhhhhc
Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999998
Q ss_pred hhcccccccCChHHHHHHhhhhHHHHHHHHHHHHhccCceEEEee-CC---eEEEEEeeeeCCceEEEEecCCCeeEEEc
Q 004420 488 KLQKICLKTGSPIESHAATVLTPYAFGKLQEELLMAPQYASLLVD-EG---CFQVKHHTETDGGCKVIWIPCQEHISCSC 563 (754)
Q Consensus 488 ~~~~~~~~t~~~~e~q~~~~~T~~~f~~~q~e~~~s~~~~v~~~~-~~---~y~V~~~~~~~~~~~V~~~~~~~~~~CsC 563 (754)
....|.+++.+|||+||+.+|||.||++||+|+..+..|.+.... +| +|.|....+ .+.|.|.++.....++|+|
T Consensus 505 ~~~~P~l~t~~piEkQAs~iYT~~iF~kFQ~El~~~~~~~~~~~~~dg~~~~y~V~~~~~-~~~~~V~~d~~~~~v~CsC 583 (846)
T PLN03097 505 WNKQPALKSPSPLEKSVSGVYTHAVFKKFQVEVLGAVACHPKMESQDETSITFRVQDFEK-NQDFTVTWNQTKLEVSCIC 583 (846)
T ss_pred ccCCcccccccHHHHHHHHHhHHHHHHHHHHHHHHhhheEEeeeccCCceEEEEEEEecC-CCcEEEEEecCCCeEEeec
Confidence 888899999999999999999999999999999999988876653 33 688876544 5689999999999999999
Q ss_pred cCccccCcchhhHHHHhhcCCcccCCCCCcccccccccCCCCCCCCCC--CCChhHHHHHHHHHHHHHHHHHhcCHHHHH
Q 004420 564 HQFEFSGILCRHVLRVLSTDNCFQIPDQYLPIRWRNVTSASTNPLRTT--TRDRSEKIQLLESMASALVSESLETEERLD 641 (754)
Q Consensus 564 ~~f~~~GiPC~Hil~vl~~~~i~~ip~~yi~~rWtk~a~~~~~~~~~~--~~~~~~r~~~l~~~~~~~~~~~~~s~e~~~ 641 (754)
++|++.||||+|||+||.++++.+||++||++||||+|+.....+... ..+...||+.|++.+.+++.+|+.|+|.|+
T Consensus 584 ~kFE~~GILCrHaLkVL~~~~v~~IP~~YILkRWTKdAK~~~~~~~~~~~~~~~~~Ryn~L~r~a~kla~~as~S~E~y~ 663 (846)
T PLN03097 584 RLFEYKGYLCRHALVVLQMCQLSAIPSQYILKRWTKDAKSRHLLGEESEQVQSRVQRYNDLCQRALKLSEEASLSQESYN 663 (846)
T ss_pred cCeecCccchhhHHHHHhhcCcccCchhhhhhhchhhhhhcccCccccccccchhhHHHHHHHHHHHHHHHHhCCHHHHH
Confidence 999999999999999999999999999999999999999876554332 234567999999999999999999999999
Q ss_pred HHHHHHHHHHHHhhcCCCCCcCCCCcccCCCCCCccCcccccccccccccccCCC-CcccccCCccccCCCCCccc----
Q 004420 642 VACEQVAMVLNHVKDLPRPIHGMDDIAYACPSHSLILPEVEDTDGIVQSITVGNS-HESFTLGKLKERRPRDGVDV---- 716 (754)
Q Consensus 642 ~~~~~l~~l~~~~~~~p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~P-~~~~~kGrpk~~r~r~~~~~---- 716 (754)
.|++.|++++.++..|..+.....+ +.+....+++..++++... ...| ..++++++.+ +..++.+.
T Consensus 664 ~a~~~L~e~~~~~~~~~n~~~~~~~-~~~~~~~~~~~~~~~~~~~------~~~~~~~~~~~~~~~--~~~~~~~~~~~~ 734 (846)
T PLN03097 664 IAFRALEEAFGNCISMNNSNKSLVE-AGTSPTHGLLCIEDDNQSR------SMTKTNKKKNPTKKR--KVNSEQEVTTVA 734 (846)
T ss_pred HHHHHHHHHHHHHHHhhccCCCccc-cccccccCCcccccccccc------ccCcCCccccccccc--cccCchhhhhhh
Confidence 9999999999999877655433221 1122233333333332221 2222 2223333222 22222221
Q ss_pred ccccccccCCccC-CCCCCCCCCCCcCcCCc
Q 004420 717 SRKRRHCSEPCCR-HFGHDASSCPIMGSDTL 746 (754)
Q Consensus 717 ~~~~r~~~C~~C~-~~gHn~~tCp~~~~~~~ 746 (754)
....++--|..|. ..+|+...||....-.|
T Consensus 735 ~~~~~~~~~~~~~~~~~~d~~y~~q~~~q~~ 765 (846)
T PLN03097 735 AQDSLQQMDKLSSRAVALESYYGTQQSVQGM 765 (846)
T ss_pred hhhhhhhHHhhhcccCCcccccccHHhhhHH
Confidence 1112333377888 68888888887766555
No 2
>PF10551 MULE: MULE transposase domain; InterPro: IPR018289 This entry represents a domain found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif. This domain is also found in a transposase for the insertion sequence element IS256 in transposon Tn4001.
Probab=99.86 E-value=6.5e-22 Score=170.96 Aligned_cols=90 Identities=26% Similarity=0.451 Sum_probs=86.9
Q ss_pred CccccCcCCceeeE---EEEeecCCceeEEeeecccCCchhhHHHHHHHHHHHhCCCCCeEeeccccHHHHHHHHhhCCC
Q 004420 276 TTHRLDSYDMLFGI---WVGLDNHGMACFFGCVLLRDENMQSFSWSLKTLLGFMNGKAPQTLLTDQNIWLKEAVAVEMPE 352 (754)
Q Consensus 276 ~Ty~~n~y~~~l~~---~~g~d~~~~~~~~a~al~~~E~~es~~w~l~~~~~~~~~~~p~~iitD~~~~l~~Ai~~vfP~ 352 (754)
+||+||+| ++++. ++|+|++|+.+++||+++.+|+.++|.|+|+.|++.++.. |.+||||++.++++||+++||+
T Consensus 1 ~T~~tn~~-~~l~~~~~~~~~d~~~~~~~v~~~l~~~e~~~~~~~~l~~~~~~~~~~-p~~ii~D~~~~~~~Ai~~vfP~ 78 (93)
T PF10551_consen 1 GTYKTNKY-GPLLYLMIAVGIDGNGRGFPVAFALVSSESEESYEWFLEKLKEAMPQK-PKVIISDFDKALINAIKEVFPD 78 (93)
T ss_pred Cccccccc-cccceeceEEEEcCCCCEEEEEEEEEcCCChhhhHHHHHHhhhccccC-ceeeeccccHHHHHHHHHHCCC
Confidence 69999999 88886 9999999999999999999999999999999999999887 9999999999999999999999
Q ss_pred Cccccchhhhhhhcc
Q 004420 353 TKHAVYIWHILAKLS 367 (754)
Q Consensus 353 a~h~lC~~Hi~~n~~ 367 (754)
+.|++|.||+.+|++
T Consensus 79 ~~~~~C~~H~~~n~k 93 (93)
T PF10551_consen 79 ARHQLCLFHILRNIK 93 (93)
T ss_pred ceEehhHHHHHHhhC
Confidence 999999999999985
No 3
>PF03101 FAR1: FAR1 DNA-binding domain; InterPro: IPR004330 Phytochrome A is the primary photoreceptor for mediating various far-red light-induced responses in higher plants. It has been found that the proteins governing this response, which include FAR-RED ELONGATED HYPOCOTYL3 (FHY3) and FAR-RED-IMPAIRED RESPONSE1 (FAR1), are a pair of homologous proteins sharing significant sequence homology to mutator-like transposases. These proteins appear to be novel transcription factors, which are essential for activating the expression of FHY1 and FHL (for FHY1-like) and related genes, whose products are required for light-induced phytochrome A nuclear accumulation and subsequent light responses in plants. The FRS (FAR1 Related Sequences) family of proteins share a similar domain structure to mutator-like transposases, including an N-terminal C2H2 zinc finger domain, a central putative core transposase domain, and a C-terminal SWIM motif (named after SWI2/SNF and MuDR transposases). It seems plausible that the FRS family represent transcription factors derived from mutator-like transposases. This entry represents a domain found in FAR1 and FRS proteins. It contains a WRKY-like fold and is therefore most likely a zinc-binding DNA-binding domain.
Probab=99.79 E-value=3.3e-19 Score=153.13 Aligned_cols=91 Identities=36% Similarity=0.673 Sum_probs=78.4
Q ss_pred HHHHHHHhhcCCeEEEcceeccCCCCcceEEEEEeecCCCCCCCCCCCccccccCCCcccccccceEEEEEeeeCCCCcE
Q 004420 56 EFYCSFAKQCGFSIRRHRTRGKDGVGRGVTRRDFTCHRGGFPQMKPSDDGKMQRNRKSSRCGCQAYMRIVKRVDFDVPEW 135 (754)
Q Consensus 56 ~~y~~yA~~~GF~ir~~~s~~~~~~~k~~~~~~~~C~r~G~~~~~~~~~~~~~r~~~s~r~gCpa~i~v~~~~~~~~~~w 135 (754)
+||+.||..+||+|++.++++++.. ..++++.|+|+++|.++.+.......++++++++|||||+|.|++.. .+.|
T Consensus 1 ~fy~~yA~~~GF~vr~~~s~~~~~~-~~~~~~~~~C~r~G~~~~~~~~~~~~~r~~~s~ktgC~a~i~v~~~~---~~~w 76 (91)
T PF03101_consen 1 DFYNSYARRHGFSVRKSSSRKSKKN-GEIKRVTFVCSRGGKYKSKKKNEEKRRRNRPSKKTGCKARINVKRRK---DGKW 76 (91)
T ss_pred CHHHHhcCcCCeEEEEeeeEeCCCC-ceEEEEEEEECCcccccccccccccccccccccccCCCEEEEEEEcc---CCEE
Confidence 5999999999999999998776333 26889999999999998776554556788899999999999999863 6999
Q ss_pred EEEeecCCccCCCCC
Q 004420 136 HVTGFSNVHNHELLK 150 (754)
Q Consensus 136 ~V~~~~~~HNH~l~~ 150 (754)
.|+.+..+|||||.|
T Consensus 77 ~v~~~~~~HNH~L~P 91 (91)
T PF03101_consen 77 RVTSFVLEHNHPLCP 91 (91)
T ss_pred EEEECcCCcCCCCCC
Confidence 999999999999965
No 4
>PF00872 Transposase_mut: Transposase, Mutator family; InterPro: IPR001207 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. The mutator family of transposases consists of a number of elements that include mutator from maize, IsT2 from Thiobacillus ferrooxidans, Is256 from Staphylococcus aureus, Is1201 from Lactobacillus helveticus, Is1081 from Mycobacterium bovis, IsRm3 from Rhizobium meliloti and others. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=99.78 E-value=5.2e-20 Score=199.58 Aligned_cols=255 Identities=18% Similarity=0.161 Sum_probs=192.9
Q ss_pred ccccccCCCCcchhhHHHHhhhcCCcHHHHHHHHHHhcCccCCCCCcchhhhhhhHHHhhhccCcccHHHHHHHHHHhhc
Q 004420 155 RLLPAYCSITPDDKTRICMFAKAGMSVRQMLRLMELEKGVKLGCLPFTEIDVRNLLQSFRNVNRDYDAIDLIAMCKKMKD 234 (754)
Q Consensus 155 ~~l~s~R~l~~~~k~~i~~l~~~g~~~~~i~~~l~~~~g~~~~~~~~~~~Di~N~~~~~r~~~~~~d~~~l~~~~~~~~~ 234 (754)
.+++.+++.++...+.|..|.-.|+++++|...+....|. ..+....|.++...+.. .+..| +....
T Consensus 91 ~ll~~y~r~~~~l~~~i~~ly~~G~Str~i~~~l~~l~g~----~~~S~s~vSri~~~~~~--------~~~~w-~~R~L 157 (381)
T PF00872_consen 91 QLLPKYQRREDSLEELIISLYLKGVSTRDIEEALEELYGE----VAVSKSTVSRITKQLDE--------EVEAW-RNRPL 157 (381)
T ss_pred cccchhhhhhhhhhhhhhhhhccccccccccchhhhhhcc----cccCchhhhhhhhhhhh--------hHHHH-hhhcc
Confidence 3456666667777788888999999999999999887772 12444555554443321 11111 11111
Q ss_pred CCCCcEEEEEecCCCceeeEEeccchhHHHHHHc-CCEEEEcCccccCcC-----CceeeEEEEeecCCceeEEeeeccc
Q 004420 235 KNPNFQYDFKMDGHNRLEHIAWSYASSVQLYEAF-GDALVFDTTHRLDSY-----DMLFGIWVGLDNHGMACFFGCVLLR 308 (754)
Q Consensus 235 ~np~~~~~~~~d~~~~~~~if~~~~~~~~~~~~~-~dvl~~D~Ty~~n~y-----~~~l~~~~g~d~~~~~~~~a~al~~ 308 (754)
. .. =++|++|++|.+-+. +..+++++|+|.+|+..++|+.+..
T Consensus 158 ~-------------------------------~~~y~~l~iD~~~~kvr~~~~~~~~~~~v~iGi~~dG~r~vLg~~~~~ 206 (381)
T PF00872_consen 158 E-------------------------------SEPYPYLWIDGTYFKVREDGRVVKKAVYVAIGIDEDGRREVLGFWVGD 206 (381)
T ss_pred c-------------------------------cccccceeeeeeecccccccccccchhhhhhhhhcccccceeeeeccc
Confidence 1 11 147999999987653 4678999999999999999999999
Q ss_pred CCchhhHHHHHHHHHHHhCCCCCeEeeccccHHHHHHHHhhCCCCccccchhhhhhhccccCCcccCCChhhHHHHHHHH
Q 004420 309 DENMQSFSWSLKTLLGFMNGKAPQTLLTDQNIWLKEAVAVEMPETKHAVYIWHILAKLSDSLPTFLGSSYDDWKAEFYRL 388 (754)
Q Consensus 309 ~E~~es~~w~l~~~~~~~~~~~p~~iitD~~~~l~~Ai~~vfP~a~h~lC~~Hi~~n~~~~~~~~~~~~~~~~~~~~~~~ 388 (754)
.|+.++|.-+|+.|++- |-..|..||+|..+||..||.++||++.++.|.+|+++|+.+++.. ...+.+..+++.+
T Consensus 207 ~Es~~~W~~~l~~L~~R-Gl~~~~lvv~Dg~~gl~~ai~~~fp~a~~QrC~vH~~RNv~~~v~~---k~~~~v~~~Lk~I 282 (381)
T PF00872_consen 207 RESAASWREFLQDLKER-GLKDILLVVSDGHKGLKEAIREVFPGAKWQRCVVHLMRNVLRKVPK---KDRKEVKADLKAI 282 (381)
T ss_pred CCccCEeeecchhhhhc-cccccceeeccccccccccccccccchhhhhheechhhhhcccccc---ccchhhhhhcccc
Confidence 99999999999988763 2346899999999999999999999999999999999999998865 4567889999999
Q ss_pred hcCCCHHHHHHHHHHHHHhccCchhHHHHHHHHh-hcccccccccccccccccCCCccchHHHHHHhhhc
Q 004420 389 YNLELEEDFEEEWSKMVNKYGLREYKHITSLYAL-RTFWALPFLRHYFFAGLLSPCQSEAINAFIQRILS 457 (754)
Q Consensus 389 ~~~~~~~eFe~~w~~l~~~~~~~~~~~l~~l~~~-r~~W~~a~~~~~~~~g~~ttn~~Es~n~~lk~~l~ 457 (754)
+.+.+.+++...++++.+++....+.....|-.. .+.|+..-|+...+--+.|||.+|++|+.||+..+
T Consensus 283 ~~a~~~e~a~~~l~~f~~~~~~kyp~~~~~l~~~~~~~~tf~~fP~~~~~~i~TTN~iEsln~~irrr~~ 352 (381)
T PF00872_consen 283 YQAPDKEEAREALEEFAEKWEKKYPKAAKSLEENWDELLTFLDFPPEHRRSIRTTNAIESLNKEIRRRTK 352 (381)
T ss_pred ccccccchhhhhhhhcccccccccchhhhhhhhccccccceeeecchhccccchhhhccccccchhhhcc
Confidence 9999999999999999988876555444443211 12233333455555678899999999999998643
No 5
>COG3328 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=99.36 E-value=8.6e-12 Score=132.43 Aligned_cols=261 Identities=15% Similarity=0.104 Sum_probs=177.4
Q ss_pred cccCCCCcchhhHHHHhhhcCCcHHHHHHHHHHhcCccCCCCCcchhhhhhhHHHhhhccCcccHHHHHHHHHHhhcCCC
Q 004420 158 PAYCSITPDDKTRICMFAKAGMSVRQMLRLMELEKGVKLGCLPFTEIDVRNLLQSFRNVNRDYDAIDLIAMCKKMKDKNP 237 (754)
Q Consensus 158 ~s~R~l~~~~k~~i~~l~~~g~~~~~i~~~l~~~~g~~~~~~~~~~~Di~N~~~~~r~~~~~~d~~~l~~~~~~~~~~np 237 (754)
..+++.....-..|..+...|+++++|-.+++.+.+. .+...-+..+-. .+.+.+...+..-
T Consensus 80 ~~~~r~~~~~~~~v~~~y~~gv~Tr~i~~~~~~~~~~-----~~s~~~iS~~~~------------~~~e~v~~~~~r~- 141 (379)
T COG3328 80 ERYQRRERALDLPVLSMYAKGVTTREIEALLEELYGH-----KVSPSVISVVTD------------RLDEKVKAWQNRP- 141 (379)
T ss_pred hhhHhhhhhHHHHHHHHHHcCCcHHHHHHHHHHhhCc-----ccCHHHhhhHHH------------HHHHHHHHHHhcc-
Confidence 3344445555667788889999999999999877653 122222222211 1222222222111
Q ss_pred CcEEEEEecCCCceeeEEeccchhHHHHHHcCCEEEEcCccccCc--CCceeeEEEEeecCCceeEEeeecccCCchhhH
Q 004420 238 NFQYDFKMDGHNRLEHIAWSYASSVQLYEAFGDALVFDTTHRLDS--YDMLFGIWVGLDNHGMACFFGCVLLRDENMQSF 315 (754)
Q Consensus 238 ~~~~~~~~d~~~~~~~if~~~~~~~~~~~~~~dvl~~D~Ty~~n~--y~~~l~~~~g~d~~~~~~~~a~al~~~E~~es~ 315 (754)
+ .--.+|++|++|.+-+ -+..+++++|++.+|+-.++|+.+-..|+ ..|
T Consensus 142 --------------------------l--~~~~~v~~D~~~~k~r~v~~~~~~ia~Gv~~eG~reilg~~~~~~e~-~~w 192 (379)
T COG3328 142 --------------------------L--GDYPYVYLDAKYVKVRSVRNKAVYIAIGVTEEGRREILGIWVGVRES-KFW 192 (379)
T ss_pred --------------------------c--cCceEEEEecceeehhhhhhheeeeeeccCcccchhhhceeeecccc-hhH
Confidence 1 1124789999999987 47899999999999999999999999999 888
Q ss_pred HHHHHHHHHHhCCCCCeEeeccccHHHHHHHHhhCCCCccccchhhhhhhccccCCcccCCChhhHHHHHHHHhcCCCHH
Q 004420 316 SWSLKTLLGFMNGKAPQTLLTDQNIWLKEAVAVEMPETKHAVYIWHILAKLSDSLPTFLGSSYDDWKAEFYRLYNLELEE 395 (754)
Q Consensus 316 ~w~l~~~~~~~~~~~p~~iitD~~~~l~~Ai~~vfP~a~h~lC~~Hi~~n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 395 (754)
.-+|..|+.. |-.....+++|..+++.+||..+||.+.++.|..|+.+|+..+... ...+....+++.++.+.+.+
T Consensus 193 ~~~l~~l~~r-gl~~v~l~v~Dg~~gl~~aI~~v~p~a~~Q~C~vH~~Rnll~~v~~---k~~d~i~~~~~~I~~a~~~e 268 (379)
T COG3328 193 LSFLLDLKNR-GLSDVLLVVVDGLKGLPEAISAVFPQAAVQRCIVHLVRNLLDKVPR---KDQDAVLSDLRSIYIAPDAE 268 (379)
T ss_pred HHHHHHHHhc-cccceeEEecchhhhhHHHHHHhccHhhhhhhhhHHHhhhhhhhhh---hhhHHHHhhhhhhhccCCcH
Confidence 7555555543 2234456778999999999999999999999999999999988765 45677888888999999999
Q ss_pred HHHHHHHHHHHhccCchhHHHHHHHH-hhcccccccccccccccccCCCccchHHHHHHhhhcc--cccHHHHHHHH
Q 004420 396 DFEEEWSKMVNKYGLREYKHITSLYA-LRTFWALPFLRHYFFAGLLSPCQSEAINAFIQRILSA--QSQLDRFVERV 469 (754)
Q Consensus 396 eFe~~w~~l~~~~~~~~~~~l~~l~~-~r~~W~~a~~~~~~~~g~~ttn~~Es~n~~lk~~l~~--~~~l~~f~~~~ 469 (754)
+....|..+...+......-+..+.. .-+.|.-.-|.....--+.|||.+|++|+.++...+. ..+-.+++..+
T Consensus 269 ~~~~~~~~~~~~w~~~yP~i~~~~~~~~~~~~~F~~fp~~~r~~i~ttN~IE~~n~~ir~~~~~~~~fpn~~sv~k~ 345 (379)
T COG3328 269 EALLALLAFSELWGKRYPAILKSWRNALEELLPFFAFPSEIRKIIYTTNAIESLNKLIRRRTKVVGIFPNEESVEKL 345 (379)
T ss_pred HHHHHHHHHHHhhhhhcchHHHHHHHHHHHhcccccCcHHHHhHhhcchHHHHHHHHHHHHHhhhccCCCHHHHHHH
Confidence 99999999877665433332332221 1223322112221113467999999999988866543 33344444433
No 6
>PF08731 AFT: Transcription factor AFT; InterPro: IPR014842 AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This entry includes the paralogous pair of transcription factors AFT1 and AFT2.
Probab=99.25 E-value=4.7e-11 Score=101.11 Aligned_cols=91 Identities=23% Similarity=0.330 Sum_probs=72.8
Q ss_pred cCCHHHHHHHHHHHHhhcCCeEEEcceeccCCCCcceEEEEEeecCCCCCCCCCCCc--------------------ccc
Q 004420 48 FVSQDAAYEFYCSFAKQCGFSIRRHRTRGKDGVGRGVTRRDFTCHRGGFPQMKPSDD--------------------GKM 107 (754)
Q Consensus 48 F~S~eea~~~y~~yA~~~GF~ir~~~s~~~~~~~k~~~~~~~~C~r~G~~~~~~~~~--------------------~~~ 107 (754)
|.+.+|+..|++.++..+||+|++.+|..+ .+.|.|--+|.++.+.... ...
T Consensus 1 F~~k~~ikpwlq~~~~~~Gi~iVIerSd~~--------ki~FkCk~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k 72 (111)
T PF08731_consen 1 FDDKDEIKPWLQKIFYPQGIGIVIERSDKK--------KIVFKCKNGKRYRHKKKKKGQAQAQQKESTSGNKNKSSKKKK 72 (111)
T ss_pred CCchHHHHHHHHHHhhhcCceEEEEecCCc--------eEEEEEecCCCccccccccccccccccccccccccccccccc
Confidence 889999999999999999999999998654 3579998888776544310 011
Q ss_pred ccCCCcccccccceEEEEEeeeCCCCcEEEEeecCCccCCC
Q 004420 108 QRNRKSSRCGCQAYMRIVKRVDFDVPEWHVTGFSNVHNHEL 148 (754)
Q Consensus 108 ~r~~~s~r~gCpa~i~v~~~~~~~~~~w~V~~~~~~HNH~l 148 (754)
.....+.++.|||+|++.... ..+.|.|..++..|||||
T Consensus 73 ~k~t~srk~~CPFriRA~yS~--k~k~W~lvvvnn~HnH~l 111 (111)
T PF08731_consen 73 KKRTKSRKNTCPFRIRANYSK--KNKKWTLVVVNNEHNHPL 111 (111)
T ss_pred CCcccccccCCCeEEEEEEEe--cCCeEEEEEecCCcCCCC
Confidence 122235679999999999876 689999999999999996
No 7
>smart00575 ZnF_PMZ plant mutator transposase zinc finger.
Probab=98.87 E-value=8.7e-10 Score=70.79 Aligned_cols=26 Identities=42% Similarity=0.866 Sum_probs=24.5
Q ss_pred EEEccCccccCcchhhHHHHhhcCCc
Q 004420 560 SCSCHQFEFSGILCRHVLRVLSTDNC 585 (754)
Q Consensus 560 ~CsC~~f~~~GiPC~Hil~vl~~~~i 585 (754)
+|+|++||..||||+|+|+|+...++
T Consensus 2 ~CsC~~~~~~gipC~H~i~v~~~~~~ 27 (28)
T smart00575 2 TCSCRKFQLSGIPCRHALAAAIHIGL 27 (28)
T ss_pred cccCCCcccCCccHHHHHHHHHHhCC
Confidence 79999999999999999999998775
No 8
>PF03108 DBD_Tnp_Mut: MuDR family transposase; InterPro: IPR004332 The plant MuDR transposase domain is present in plant proteins that are presumed to be the transposases for Mutator transposable elements. The function of these proteins is unknown. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=98.47 E-value=6.5e-07 Score=71.60 Aligned_cols=66 Identities=23% Similarity=0.353 Sum_probs=55.0
Q ss_pred CCCCCCCccCCHHHHHHHHHHHHhhcCCeEEEcceeccCCCCcceEEEEEeecCCCCCCCCCCCccccccCCCccccccc
Q 004420 40 LVPFIGQRFVSQDAAYEFYCSFAKQCGFSIRRHRTRGKDGVGRGVTRRDFTCHRGGFPQMKPSDDGKMQRNRKSSRCGCQ 119 (754)
Q Consensus 40 ~~P~vg~~F~S~eea~~~y~~yA~~~GF~ir~~~s~~~~~~~k~~~~~~~~C~r~G~~~~~~~~~~~~~r~~~s~r~gCp 119 (754)
....+||.|+|.+|+..++..||..+||.++..+|.+ .++..+|.. .|||
T Consensus 2 ~~l~~G~~F~~~~e~k~av~~yai~~~~~~~v~ksd~--------~r~~~~C~~----------------------~~C~ 51 (67)
T PF03108_consen 2 PELEVGQTFPSKEEFKEAVREYAIKNGFEFKVKKSDK--------KRYRAKCKD----------------------KGCP 51 (67)
T ss_pred CccccCCEECCHHHHHHHHHHHHHhcCcEEEEeccCC--------EEEEEEEcC----------------------CCCC
Confidence 4557899999999999999999999999999988753 256789941 4699
Q ss_pred ceEEEEEeeeCCCCcEEE
Q 004420 120 AYMRIVKRVDFDVPEWHV 137 (754)
Q Consensus 120 a~i~v~~~~~~~~~~w~V 137 (754)
|+|++.+.. ..+.|.|
T Consensus 52 Wrv~as~~~--~~~~~~I 67 (67)
T PF03108_consen 52 WRVRASKRK--RSDTFQI 67 (67)
T ss_pred EEEEEEEcC--CCCEEEC
Confidence 999999875 5677875
No 9
>PF04434 SWIM: SWIM zinc finger; InterPro: IPR007527 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1). It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination. SWIM domains are also found in the homologous recombination protein Sws1, as well as in several hypothetical proteins. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding
Probab=98.21 E-value=1.1e-06 Score=62.19 Aligned_cols=29 Identities=41% Similarity=0.787 Sum_probs=26.3
Q ss_pred CCCeeEEEccCccccCcchhhHHHHhhcC
Q 004420 555 CQEHISCSCHQFEFSGILCRHVLRVLSTD 583 (754)
Q Consensus 555 ~~~~~~CsC~~f~~~GiPC~Hil~vl~~~ 583 (754)
.....+|+|..|+..|.||+|+++|+...
T Consensus 11 ~~~~~~CsC~~~~~~~~~CkHi~av~~~~ 39 (40)
T PF04434_consen 11 SIEQASCSCPYFQFRGGPCKHIVAVLLAL 39 (40)
T ss_pred cccccEeeCCCccccCCcchhHHHHHHhh
Confidence 46789999999999999999999998764
No 10
>PF01610 DDE_Tnp_ISL3: Transposase; InterPro: IPR002560 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. This family includes the IS204, IS1001, IS1096 and IS1165 transposases. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=97.12 E-value=0.00038 Score=71.71 Aligned_cols=94 Identities=13% Similarity=0.065 Sum_probs=65.3
Q ss_pred EEEcCccccCcCCceeeEEEEeec--CCceeEEeeecccCCchhhHHHHHHHH-HHHhCCCCCeEeeccccHHHHHHHHh
Q 004420 272 LVFDTTHRLDSYDMLFGIWVGLDN--HGMACFFGCVLLRDENMQSFSWSLKTL-LGFMNGKAPQTLLTDQNIWLKEAVAV 348 (754)
Q Consensus 272 l~~D~Ty~~n~y~~~l~~~~g~d~--~~~~~~~a~al~~~E~~es~~w~l~~~-~~~~~~~~p~~iitD~~~~l~~Ai~~ 348 (754)
|+||=+........ +..+.+|. ++..++ .++.+-+.+++.-+|..+ -.. .....++|.+|...+...|+++
T Consensus 1 lgiDE~~~~~g~~~--y~t~~~d~~~~~~~il---~i~~~r~~~~l~~~~~~~~~~~-~~~~v~~V~~Dm~~~y~~~~~~ 74 (249)
T PF01610_consen 1 LGIDEFAFRKGHRS--YVTVVVDLDTDTGRIL---DILPGRDKETLKDFFRSLYPEE-ERKNVKVVSMDMSPPYRSAIRE 74 (249)
T ss_pred CeEeeeeeecCCcc--eeEEEEECccCCceEE---EEcCCccHHHHHHHHHHhCccc-cccceEEEEcCCCccccccccc
Confidence 34555544332321 34444554 333332 467888888887666554 222 3456789999999999999999
Q ss_pred hCCCCccccchhhhhhhccccCC
Q 004420 349 EMPETKHAVYIWHILAKLSDSLP 371 (754)
Q Consensus 349 vfP~a~h~lC~~Hi~~n~~~~~~ 371 (754)
.||+|.+.+-.|||++++.+.+.
T Consensus 75 ~~P~A~iv~DrFHvvk~~~~al~ 97 (249)
T PF01610_consen 75 YFPNAQIVADRFHVVKLANRALD 97 (249)
T ss_pred cccccccccccchhhhhhhhcch
Confidence 99999999999999998877664
No 11
>PF06782 UPF0236: Uncharacterised protein family (UPF0236); InterPro: IPR009620 This is a group of proteins of unknown function.
Probab=96.47 E-value=0.22 Score=56.00 Aligned_cols=248 Identities=17% Similarity=0.203 Sum_probs=141.7
Q ss_pred ccccCCCCcchhhHHHHhhhcCCcHHHHHHHHHHhcCccCCCCCcchhhhhhhHHHhhhccCcccHHHHHHHHHHhhcCC
Q 004420 157 LPAYCSITPDDKTRICMFAKAGMSVRQMLRLMELEKGVKLGCLPFTEIDVRNLLQSFRNVNRDYDAIDLIAMCKKMKDKN 236 (754)
Q Consensus 157 l~s~R~l~~~~k~~i~~l~~~g~~~~~i~~~l~~~~g~~~~~~~~~~~Di~N~~~~~r~~~~~~d~~~l~~~~~~~~~~n 236 (754)
+..+.++|+..+..|..+... ++-++..+.+....| .+.++...|.|.+....... .........
T Consensus 112 l~~~~R~S~~~~~~i~~~a~~-~sYr~aa~~l~~~~~----~~~iS~~tV~~~v~~~g~~~----------~~~~~~~k~ 176 (470)
T PF06782_consen 112 LKKYQRISPELKEKIVELATE-MSYRKAAEILEELLG----NVSISKQTVWNIVKEAGFEE----------IKEEEKEKK 176 (470)
T ss_pred CCcccchhHHHHHHHHHHHhh-cCHHHHHHHHhhccC----CCccCHHHHHHHHHhccchh----------hhccccccC
Confidence 467788999999988877644 888888888865443 35567778888776543100 000000000
Q ss_pred CCcEEEEEecCCCceeeEEeccchhHHHHHHcCCEEEEcCccccCc----C--Cce-eeEEEE---ee-cCCceeEEee-
Q 004420 237 PNFQYDFKMDGHNRLEHIAWSYASSVQLYEAFGDALVFDTTHRLDS----Y--DML-FGIWVG---LD-NHGMACFFGC- 304 (754)
Q Consensus 237 p~~~~~~~~d~~~~~~~if~~~~~~~~~~~~~~dvl~~D~Ty~~n~----y--~~~-l~~~~g---~d-~~~~~~~~a~- 304 (754)
....|| |-.|++|.... . ... +++-.| .. +.+...+..-
T Consensus 177 -------------~~~~Ly----------------IEaDg~~v~~qg~~~~~~e~k~~~vheG~~~~~~~~~R~~L~n~~ 227 (470)
T PF06782_consen 177 -------------KVPVLY----------------IEADGVHVKLQGKKKKKKEVKLFVVHEGWEKEKPGGKRNKLKNKR 227 (470)
T ss_pred -------------CCCeEE----------------EecCcceecccccccccceeeEEEEEeeeeeeeccCCcceeecch
Confidence 011111 12344444321 1 122 233335 11 1222333322
Q ss_pred eccc---CCchhhHHHHHHHHHHHhCCCCC--eEeeccccHHHHHHHHhhCCCCccccchhhhhhhccccCCcccCCChh
Q 004420 305 VLLR---DENMQSFSWSLKTLLGFMNGKAP--QTLLTDQNIWLKEAVAVEMPETKHAVYIWHILAKLSDSLPTFLGSSYD 379 (754)
Q Consensus 305 al~~---~E~~es~~w~l~~~~~~~~~~~p--~~iitD~~~~l~~Ai~~vfP~a~h~lC~~Hi~~n~~~~~~~~~~~~~~ 379 (754)
.++. ..+.+-|.-+.+.+-+....... .++.+|+...+.+++. .||++.|.|..||+.+.+.+.++.. +
T Consensus 228 ~f~~~~~~~~~~~~~~v~~~i~~~Y~~~~~~~iiingDGa~WIk~~~~-~~~~~~~~LD~FHl~k~i~~~~~~~-----~ 301 (470)
T PF06782_consen 228 HFVSGVGESAEEFWEEVLDYIYNHYDLDKTTKIIINGDGASWIKEGAE-FFPKAEYFLDRFHLNKKIKQALSHD-----P 301 (470)
T ss_pred heecccccchHHHHHHHHHHHHHhcCcccceEEEEeCCCcHHHHHHHH-hhcCceEEecHHHHHHHHHHHhhhC-----h
Confidence 2232 45566777777777666543322 4566999999988776 9999999999999999999877542 3
Q ss_pred hHHHHHHHHhcCCCHHHHHHHHHHHHHhccCch-hHHHHHHH-Hhhccccc--ccccccccccccCCCccchHHHHHHhh
Q 004420 380 DWKAEFYRLYNLELEEDFEEEWSKMVNKYGLRE-YKHITSLY-ALRTFWAL--PFLRHYFFAGLLSPCQSEAINAFIQRI 455 (754)
Q Consensus 380 ~~~~~~~~~~~~~~~~eFe~~w~~l~~~~~~~~-~~~l~~l~-~~r~~W~~--a~~~~~~~~g~~ttn~~Es~n~~lk~~ 455 (754)
.+...+++.....+...++..++.+........ .+-+..+. .....|-. +|... .|+......|+.+..+...
T Consensus 302 ~~~~~~~~al~~~d~~~l~~~L~~~~~~~~~~~~~~~i~~~~~Yl~~n~~~i~~y~~~---~~~~g~g~ee~~~~~~s~R 378 (470)
T PF06782_consen 302 ELKEKIRKALKKGDKKKLETVLDTAESCAKDEEERKKIRKLRKYLLNNWDGIKPYRER---EGLRGIGAEESVSHVLSYR 378 (470)
T ss_pred HHHHHHHHHHHhcCHHHHHHHHHHHHHhhhchHHHHHHHHHHHHHHHCHHHhhhhhhc---cCCCccchhhhhhhHHHHH
Confidence 456655565566677888888887765443222 12222222 34445532 33221 3444455578888888765
Q ss_pred hc
Q 004420 456 LS 457 (754)
Q Consensus 456 l~ 457 (754)
++
T Consensus 379 mK 380 (470)
T PF06782_consen 379 MK 380 (470)
T ss_pred hc
Confidence 54
No 12
>PF13610 DDE_Tnp_IS240: DDE domain
Probab=95.81 E-value=0.0067 Score=56.30 Aligned_cols=81 Identities=21% Similarity=0.224 Sum_probs=67.3
Q ss_pred CCEEEEcCccccCcCCceeeEEEEeecCCceeEEeeecccCCchhhHHHHHHHHHHHhCCCCCeEeeccccHHHHHHHHh
Q 004420 269 GDALVFDTTHRLDSYDMLFGIWVGLDNHGMACFFGCVLLRDENMQSFSWSLKTLLGFMNGKAPQTLLTDQNIWLKEAVAV 348 (754)
Q Consensus 269 ~dvl~~D~Ty~~n~y~~~l~~~~g~d~~~~~~~~a~al~~~E~~es~~w~l~~~~~~~~~~~p~~iitD~~~~l~~Ai~~ 348 (754)
|+.+.+|=||.+-+ |--.++...+|.+|+ ++++-+...-+...=..||..+++..+ ..|..|+||+.++...|+++
T Consensus 1 ~~~w~~DEt~iki~-G~~~yl~~aiD~~~~--~l~~~ls~~Rd~~aA~~Fl~~~l~~~~-~~p~~ivtDk~~aY~~A~~~ 76 (140)
T PF13610_consen 1 GDSWHVDETYIKIK-GKWHYLWRAIDAEGN--ILDFYLSKRRDTAAAKRFLKRALKRHR-GEPRVIVTDKLPAYPAAIKE 76 (140)
T ss_pred CCEEEEeeEEEEEC-CEEEEEEEeeccccc--chhhhhhhhcccccceeeccccceeec-cccceeecccCCccchhhhh
Confidence 57889999997744 345667888999999 788888888888888888888777665 68999999999999999999
Q ss_pred hCCCC
Q 004420 349 EMPET 353 (754)
Q Consensus 349 vfP~a 353 (754)
++|..
T Consensus 77 l~~~~ 81 (140)
T PF13610_consen 77 LNPEG 81 (140)
T ss_pred ccccc
Confidence 88753
No 13
>PF03106 WRKY: WRKY DNA-binding domain; InterPro: IPR003657 The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene. Structural studies indicate that this domain is a four-stranded beta-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure. The WRKYGQK residues correspond to the most N-terminal beta-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the beta-sheet.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 2AYD_A 1WJ2_A 2LEX_A.
Probab=95.43 E-value=0.065 Score=41.37 Aligned_cols=57 Identities=23% Similarity=0.472 Sum_probs=39.1
Q ss_pred cCCeEEEcceeccCCCCcceEEEEEeecCCCCCCCCCCCccccccCCCcccccccceEEEEEeeeCCCCcEEEEeecCCc
Q 004420 65 CGFSIRRHRTRGKDGVGRGVTRRDFTCHRGGFPQMKPSDDGKMQRNRKSSRCGCQAYMRIVKRVDFDVPEWHVTGFSNVH 144 (754)
Q Consensus 65 ~GF~ir~~~s~~~~~~~k~~~~~~~~C~r~G~~~~~~~~~~~~~r~~~s~r~gCpa~i~v~~~~~~~~~~w~V~~~~~~H 144 (754)
=||.+|+--.+.-++. ..-+.+|.|+. .+|||+=.|.+.. .++...++....+|
T Consensus 3 Dgy~WRKYGqK~i~g~--~~pRsYYrCt~----------------------~~C~akK~Vqr~~--~d~~~~~vtY~G~H 56 (60)
T PF03106_consen 3 DGYRWRKYGQKNIKGS--PYPRSYYRCTH----------------------PGCPAKKQVQRSA--DDPNIVIVTYEGEH 56 (60)
T ss_dssp SSS-EEEEEEEEETTT--TCEEEEEEEEC----------------------TTEEEEEEEEEET--TCCCEEEEEEES--
T ss_pred CCCchhhccCcccCCC--ceeeEeeeccc----------------------cChhheeeEEEec--CCCCEEEEEEeeee
Confidence 4888888665544433 35567899954 2799999998875 46778899999999
Q ss_pred cCC
Q 004420 145 NHE 147 (754)
Q Consensus 145 NH~ 147 (754)
||+
T Consensus 57 ~h~ 59 (60)
T PF03106_consen 57 NHP 59 (60)
T ss_dssp SS-
T ss_pred CCC
Confidence 997
No 14
>PF15288 zf-CCHC_6: Zinc knuckle
Probab=94.74 E-value=0.018 Score=39.79 Aligned_cols=22 Identities=27% Similarity=0.560 Sum_probs=18.6
Q ss_pred ccCCccCCCCCCC--CCCCCcCcC
Q 004420 723 CSEPCCRHFGHDA--SSCPIMGSD 744 (754)
Q Consensus 723 ~~C~~C~~~gHn~--~tCp~~~~~ 744 (754)
++|+.|++.||.+ ++||.+...
T Consensus 2 ~kC~~CG~~GH~~t~k~CP~~~~~ 25 (40)
T PF15288_consen 2 VKCKNCGAFGHMRTNKRCPMYCWS 25 (40)
T ss_pred ccccccccccccccCccCCCCCCC
Confidence 3699999999998 779998754
No 15
>PF00098 zf-CCHC: Zinc knuckle; InterPro: IPR001878 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence: C-X2-C-X4-H-X4-C where X can be any amino acid, and the number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses. They are required for viral genome packaging and for the early infection process.
They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0003676 nucleic acid binding, 0008270 zinc ion binding; PDB: 2L44_A 1A1T_A 1WWG_A 1U6P_A 1WWD_A 1WWE_A 1A6B_B 1F6U_A 1MFS_A 1NCP_C ....
Probab=94.72 E-value=0.025 Score=32.10 Aligned_cols=17 Identities=35% Similarity=0.532 Sum_probs=15.7
Q ss_pred cCCccCCCCCCCCCCCC
Q 004420 724 SEPCCRHFGHDASSCPI 740 (754)
Q Consensus 724 ~C~~C~~~gHn~~tCp~ 740 (754)
+|-+|++.||-++.||+
T Consensus 2 ~C~~C~~~GH~~~~Cp~ 18 (18)
T PF00098_consen 2 KCFNCGEPGHIARDCPK 18 (18)
T ss_dssp BCTTTSCSSSCGCTSSS
T ss_pred cCcCCCCcCcccccCcc
Confidence 48899999999999995
No 16
>PF13696 zf-CCHC_2: Zinc knuckle
Probab=93.19 E-value=0.042 Score=36.03 Aligned_cols=24 Identities=21% Similarity=0.303 Sum_probs=20.0
Q ss_pred ccccCCccCCCCCCCCCCCCcCcC
Q 004420 721 RHCSEPCCRHFGHDASSCPIMGSD 744 (754)
Q Consensus 721 r~~~C~~C~~~gHn~~tCp~~~~~ 744 (754)
..+.|-.|++.||-++.||....+
T Consensus 7 ~~Y~C~~C~~~GH~i~dCP~~~Pk 30 (32)
T PF13696_consen 7 PGYVCHRCGQKGHWIQDCPTNKPK 30 (32)
T ss_pred CCCEeecCCCCCccHhHCCCCCCC
Confidence 356699999999999999996543
No 17
>smart00774 WRKY DNA binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.
Probab=89.11 E-value=0.76 Score=35.23 Aligned_cols=56 Identities=21% Similarity=0.403 Sum_probs=37.6
Q ss_pred CCeEEEcceeccCCCCcceEEEEEeecCCCCCCCCCCCccccccCCCcccccccceEEEEEeeeCCCCcEEEEeecCCcc
Q 004420 66 GFSIRRHRTRGKDGVGRGVTRRDFTCHRGGFPQMKPSDDGKMQRNRKSSRCGCQAYMRIVKRVDFDVPEWHVTGFSNVHN 145 (754)
Q Consensus 66 GF~ir~~~s~~~~~~~k~~~~~~~~C~r~G~~~~~~~~~~~~~r~~~s~r~gCpa~i~v~~~~~~~~~~w~V~~~~~~HN 145 (754)
||.+|+--.+.-++. ..-+.+|.|+. ..||||+=.|.+.. .++...++....+||
T Consensus 4 Gy~WRKYGQK~ikgs--~~pRsYYrCt~---------------------~~~C~a~K~Vq~~~--~d~~~~~vtY~g~H~ 58 (59)
T smart00774 4 GYQWRKYGQKVIKGS--PFPRSYYRCTY---------------------SQGCPAKKQVQRSD--DDPSVVEVTYEGEHT 58 (59)
T ss_pred cccccccCcEecCCC--cCcceEEeccc---------------------cCCCCCcccEEEEC--CCCCEEEEEEeeEeC
Confidence 677766544333322 23456778854 14799988887764 457888889999999
Q ss_pred C
Q 004420 146 H 146 (754)
Q Consensus 146 H 146 (754)
|
T Consensus 59 h 59 (59)
T smart00774 59 H 59 (59)
T ss_pred C
Confidence 8
No 18
>PF00665 rve: Integrase core domain; InterPro: IPR001584 Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus (ASV) have been studied most carefully with respect to the structural basis of catalysis. Although the active site of ASV integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis. Retroviral integrase is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group. HIV integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration.
To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity.; GO: 0015074 DNA integration; PDB: 3AO3_A 3OVN_A 3AO5_A 3AO4_A 3AO1_A 1C6V_D 3HPG_A 3HPH_A 3OYD_A 3OYF_B ....
Probab=88.91 E-value=1.8 Score=38.36 Aligned_cols=75 Identities=16% Similarity=0.136 Sum_probs=54.2
Q ss_pred CCEEEEcCcccc-CcCCceeeEEEEeecCCceeEEeeecccCCchhhHHHHHHHHHHHhCCCCCeEeeccccHHHHH
Q 004420 269 GDALVFDTTHRL-DSYDMLFGIWVGLDNHGMACFFGCVLLRDENMQSFSWSLKTLLGFMNGKAPQTLLTDQNIWLKE 344 (754)
Q Consensus 269 ~dvl~~D~Ty~~-n~y~~~l~~~~g~d~~~~~~~~a~al~~~E~~es~~w~l~~~~~~~~~~~p~~iitD~~~~l~~ 344 (754)
+.++.+|.+... ...++..+.++.+|..-... +++.+-..++.+.+..+|.......+...|.+|+||+..++..
T Consensus 6 ~~~~~~D~~~~~~~~~~~~~~~~~~iD~~S~~~-~~~~~~~~~~~~~~~~~l~~~~~~~~~~~p~~i~tD~g~~f~~ 81 (120)
T PF00665_consen 6 GERWQIDFTPMPIPDKGGRVYLLVFIDDYSRFI-YAFPVSSKETAEAALRALKRAIEKRGGRPPRVIRTDNGSEFTS 81 (120)
T ss_dssp TTEEEEEEEEETGGCTT-CEEEEEEEETTTTEE-EEEEESSSSHHHHHHHHHHHHHHHHS-SE-SEEEEESCHHHHS
T ss_pred CCEEEEeeEEEecCCCCccEEEEEEEECCCCcE-EEEEeecccccccccccccccccccccccceeccccccccccc
Confidence 467888888544 34555888888888776644 4666677778888888888777777666699999999999864
No 19
>PRK08561 rps15p 30S ribosomal protein S15P; Reviewed
Probab=82.21 E-value=2.7 Score=38.59 Aligned_cols=79 Identities=19% Similarity=0.234 Sum_probs=53.1
Q ss_pred cchhhHHHHhhhcCCcHHHHHHHHHHhcCccC----------------CCCCcchhhhhhhHHHhhhc-----c------
Q 004420 165 PDDKTRICMFAKAGMSVRQMLRLMELEKGVKL----------------GCLPFTEIDVRNLLQSFRNV-----N------ 217 (754)
Q Consensus 165 ~~~k~~i~~l~~~g~~~~~i~~~l~~~~g~~~----------------~~~~~~~~Di~N~~~~~r~~-----~------ 217 (754)
++..+.|..|.+.|.+|++|--+|+.++|+.. +..+-.+.|++|+..++... .
T Consensus 31 eeve~~I~~lakkG~~pSqIG~~LRD~~gip~Vk~vtG~ki~~iLk~~gl~p~iPEDL~~L~~ri~~L~~HL~~nkKD~~ 110 (151)
T PRK08561 31 EEIEELVVELAKQGYSPSMIGIILRDQYGIPDVKLITGKKITEILEENGLAPEIPEDLRNLIKKAVNLRKHLEENPKDLH 110 (151)
T ss_pred HHHHHHHHHHHHCCCCHHHhhhhHhhccCCCceeeeccchHHHHHHHcCCCCCCcHHHHHHHHHHHHHHHHHHhCCCcch
Confidence 44567788999999999999999999997621 12233556888877665432 0
Q ss_pred -------CcccHHHHHHHHHHhhcCCCCcEEEE
Q 004420 218 -------RDYDAIDLIAMCKKMKDKNPNFQYDF 243 (754)
Q Consensus 218 -------~~~d~~~l~~~~~~~~~~np~~~~~~ 243 (754)
..+-.-.|++|++....--|+|.|+-
T Consensus 111 skRgL~~~~skrrRLl~Yyk~~~~LP~~WkY~~ 143 (151)
T PRK08561 111 NKRGLQLIESKIRRLVKYYKRTGVLPADWRYSP 143 (151)
T ss_pred hHHHHHHHHHHHHHHHHHHHhcCCCCCCCcCCH
Confidence 12233457778877766677777653
No 20
>COG5179 TAF1 Transcription initiation factor TFIID, subunit TAF1 [Transcription]
Probab=80.32 E-value=1.1 Score=49.84 Aligned_cols=27 Identities=22% Similarity=0.416 Sum_probs=21.6
Q ss_pred cccccCCccCCCCCCC--CCCCCcCcCCc
Q 004420 720 RRHCSEPCCRHFGHDA--SSCPIMGSDTL 746 (754)
Q Consensus 720 ~r~~~C~~C~~~gHn~--~tCp~~~~~~~ 746 (754)
..+.+|+.||+.||-+ +.||+.++++=
T Consensus 935 ~Ttr~C~nCGQvGHmkTNK~CP~f~s~~~ 963 (968)
T COG5179 935 NTTRTCGNCGQVGHMKTNKACPKFSSKDN 963 (968)
T ss_pred CcceecccccccccccccccCccccCCCC
Confidence 3466799999999966 45999998764
No 21
>PF04500 FLYWCH: FLYWCH zinc finger domain; InterPro: IPR007588 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array.
C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets. This entry represents a potential FLYWCH Zn-finger domain found in a number of eukaryotic proteins. FLYWCH is a C2H2-type zinc finger characterised by five conserved hydrophobic residues, containing the conserved sequence motif: F/Y-X(n)-L-X(n)-F/Y-X(n)-WXCX(6-12)CX(17-22)HXH where X indicates any amino acid. This domain was first characterised in Drosophila Modifier of mdg4 proteins, Mod(mdg4), putative chromatin modulators involved in higher order chromatin domains. Mod(mdg4) proteins share a common N-terminal BTB/POZ domain, but differ in their C-terminal region, most containing C-terminal FLYWCH zinc finger motifs. The FLYWCH domain in Mod(mdg4) proteins has a putative role in protein-protein interactions; for example, Mod(mdg4)-67.2 interacts with DNA-binding protein Su(Hw) via its FLYWCH domain. FLYWCH domains have been described in other proteins as well, including suppressor of killer of prune, Su(Kpn), which contains 4 terminal FLYWCH zinc finger motifs in a tandem array and a C-terminal glutathione S-transferase (GST) domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; PDB: 2RPR_A.
Probab=79.40 E-value=3.6 Score=31.58 Aligned_cols=25 Identities=28% Similarity=0.381 Sum_probs=10.2
Q ss_pred ccccceEEEEEeeeCCCCcEEEEeecCCccC
Q 004420 116 CGCQAYMRIVKRVDFDVPEWHVTGFSNVHNH 146 (754)
Q Consensus 116 ~gCpa~i~v~~~~~~~~~~w~V~~~~~~HNH 146 (754)
.+|+|++.+.. +.-.|.....+|||
T Consensus 38 ~~C~a~~~~~~------~~~~~~~~~~~HnH 62 (62)
T PF04500_consen 38 HGCRARLITDA------GDGRVVRTNGEHNH 62 (62)
T ss_dssp S----EEEEE--------TTEEEE-S---SS
T ss_pred CCCeEEEEEEC------CCCEEEECCCccCC
Confidence 68999998862 22355556689999
No 22
>PF14392 zf-CCHC_4: Zinc knuckle
Probab=78.31 E-value=0.84 Score=33.69 Aligned_cols=18 Identities=28% Similarity=0.540 Sum_probs=16.3
Q ss_pred ccCCccCCCCCCCCCCCC
Q 004420 723 CSEPCCRHFGHDASSCPI 740 (754)
Q Consensus 723 ~~C~~C~~~gHn~~tCp~ 740 (754)
..|..|+..||+.+.||+
T Consensus 32 ~~C~~C~~~gH~~~~C~k 49 (49)
T PF14392_consen 32 RFCFHCGRIGHSDKECPK 49 (49)
T ss_pred hhhcCCCCcCcCHhHcCC
Confidence 359999999999999995
No 23
>PF03050 DDE_Tnp_IS66: Transposase IS66 family; InterPro: IPR004291 Transposase proteins are necessary for efficient DNA transposition. This family includes the bacterial insertion sequence (IS) element, IS66, from Agrobacterium tumefaciens. IS66 may cause genetic and structural variations of the T region and the vir region of the octopine Ti plasmids. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=77.67 E-value=7.7 Score=40.26 Aligned_cols=145 Identities=11% Similarity=0.090 Sum_probs=80.8
Q ss_pred chhhHHHHh-hhcCCcHHHHHHHHHHhcCccCCCCCcchhhhhhhHHHhhhccCcccHHHHHHHHHHhhcCCCCcEEEEE
Q 004420 166 DDKTRICMF-AKAGMSVRQMLRLMELEKGVKLGCLPFTEIDVRNLLQSFRNVNRDYDAIDLIAMCKKMKDKNPNFQYDFK 244 (754)
Q Consensus 166 ~~k~~i~~l-~~~g~~~~~i~~~l~~~~g~~~~~~~~~~~Di~N~~~~~r~~~~~~d~~~l~~~~~~~~~~np~~~~~~~ 244 (754)
...+.+..+ ...++|...+.+++... | +.+....|.|+........ ..+.+.+.+..
T Consensus 7 ~~~a~i~~l~~~~~lp~~r~~~~~~~~-G-----~~is~~ti~~~~~~~~~~l-----~~~~~~l~~~~----------- 64 (271)
T PF03050_consen 7 SLLALIAYLKYVYHLPLYRIQQMLEDL-G-----ITISRGTIANWIKRVAEAL-----KPLYEALKEEL----------- 64 (271)
T ss_pred HHHHHHHHHHhcCCCCHHHHhhhhhcc-c-----eeeccchhHhHhhhhhhhh-----hhhhhhhhhhc-----------
Confidence 344444444 35678888888888765 5 4566677777766543221 11222221111
Q ss_pred ecCCCceeeEEeccchhHHHHHHcCCEEEEcCcccc----CcCCceeeEEEEeecCCceeEEeeecccCCchhhHHHHHH
Q 004420 245 MDGHNRLEHIAWSYASSVQLYEAFGDALVFDTTHRL----DSYDMLFGIWVGLDNHGMACFFGCVLLRDENMQSFSWSLK 320 (754)
Q Consensus 245 ~d~~~~~~~if~~~~~~~~~~~~~~dvl~~D~Ty~~----n~y~~~l~~~~g~d~~~~~~~~a~al~~~E~~es~~w~l~ 320 (754)
.-.+||.+|-|.-. .+.. .-+..+-++.. .+.|.+..+-..+...
T Consensus 65 ----------------------~~~~~~~~DET~~~vl~~~~g~-~~~~Wv~~~~~----~v~f~~~~sR~~~~~~---- 113 (271)
T PF03050_consen 65 ----------------------RSSPVVHADETGWRVLDKGKGK-KGYLWVFVSPE----VVLFFYAPSRSSKVIK---- 113 (271)
T ss_pred ----------------------cccceeccCCceEEEecccccc-ceEEEeeeccc----eeeeeecccccccchh----
Confidence 12356777766555 3332 12222222222 3334444444444333
Q ss_pred HHHHHhCCCCCeEeeccccHHHHHHHHhhCCCCccccchhhhhhhccccCCc
Q 004420 321 TLLGFMNGKAPQTLLTDQNIWLKEAVAVEMPETKHAVYIWHILAKLSDSLPT 372 (754)
Q Consensus 321 ~~~~~~~~~~p~~iitD~~~~l~~Ai~~vfP~a~h~lC~~Hi~~n~~~~~~~ 372 (754)
+.+++ ..-+++||+-.+-.. +.++.|+.|..|+.+.+.+-...
T Consensus 114 ---~~L~~-~~GilvsD~y~~Y~~-----~~~~~hq~C~AH~~R~~~~~~~~ 156 (271)
T PF03050_consen 114 ---EFLGD-FSGILVSDGYSAYNK-----LAGITHQLCWAHLRRDFQDAAES 156 (271)
T ss_pred ---hhhcc-cceeeeccccccccc-----ccccccccccccccccccccccc
Confidence 33444 346999999988755 33889999999999999866543
No 24
>smart00343 ZnF_C2HC zinc finger.
Probab=75.75 E-value=1.5 Score=27.45 Aligned_cols=19 Identities=32% Similarity=0.436 Sum_probs=16.1
Q ss_pred cCCccCCCCCCCCCCCCcC
Q 004420 724 SEPCCRHFGHDASSCPIMG 742 (754)
Q Consensus 724 ~C~~C~~~gHn~~tCp~~~ 742 (754)
.|.+|++.||..+.||...
T Consensus 1 ~C~~CG~~GH~~~~C~~~~ 19 (26)
T smart00343 1 KCYNCGKEGHIARDCPKXX 19 (26)
T ss_pred CCccCCCCCcchhhCCccc
Confidence 3889999999999999443
No 25
>PF04937 DUF659: Protein of unknown function (DUF659); InterPro: IPR007021 These are transposase-like proteins with no known function.
Probab=75.64 E-value=19 Score=33.84 Aligned_cols=108 Identities=10% Similarity=0.014 Sum_probs=73.3
Q ss_pred HHHHHHcCCEEEEcCccccCcCCceeeEEEEeecCCceeEEeeecc-cCCchhhHHHHHHHHHHHhCCCCCeEeeccccH
Q 004420 262 VQLYEAFGDALVFDTTHRLDSYDMLFGIWVGLDNHGMACFFGCVLL-RDENMQSFSWSLKTLLGFMNGKAPQTLLTDQNI 340 (754)
Q Consensus 262 ~~~~~~~~dvl~~D~Ty~~n~y~~~l~~~~g~d~~~~~~~~a~al~-~~E~~es~~w~l~~~~~~~~~~~p~~iitD~~~ 340 (754)
+..+..+|=-|..|+= ++..+.+|+.|+..-..|..++-.+-.- ...+.+.+.-+|+...+.+|...-..||||...
T Consensus 26 k~~w~~~Gcsi~~DgW--td~~~~~lInf~v~~~~g~~Flksvd~s~~~~~a~~l~~ll~~vIeeVG~~nVvqVVTDn~~ 103 (153)
T PF04937_consen 26 KKSWKRTGCSIMSDGW--TDRKGRSLINFMVYCPEGTVFLKSVDASSIIKTAEYLFELLDEVIEEVGEENVVQVVTDNAS 103 (153)
T ss_pred HHHHHhcCEEEEEecC--cCCCCCeEEEEEEEcccccEEEEEEecccccccHHHHHHHHHHHHHHhhhhhhhHHhccCch
Confidence 4455666666777776 4566778888887777766665443222 224566666666666666676677788999999
Q ss_pred HHHHHHH---hhCCCCccccchhhhhhhccccCC
Q 004420 341 WLKEAVA---VEMPETKHAVYIWHILAKLSDSLP 371 (754)
Q Consensus 341 ~l~~Ai~---~vfP~a~h~lC~~Hi~~n~~~~~~ 371 (754)
.++.|-+ +-+|.....-|.-|-+.-+.+.+.
T Consensus 104 ~~~~a~~~L~~k~p~ifw~~CaaH~inLmledi~ 137 (153)
T PF04937_consen 104 NMKKAGKLLMEKYPHIFWTPCAAHCINLMLEDIG 137 (153)
T ss_pred hHHHHHHHHHhcCCCEEEechHHHHHHHHHHHHh
Confidence 9988844 457887777899988776655443
No 26
>KOG0400 consensus 40S ribosomal protein S13 [Translation, ribosomal structure and biogenesis]
Probab=70.70 E-value=6.6 Score=34.79 Aligned_cols=109 Identities=17% Similarity=0.242 Sum_probs=61.4
Q ss_pred ccccCCC-CcchhhHHHHhhhcCCcHHHHHHHHHHhcCccCCCCC-cchhhhhhhHHHhhhc-cCcccHHHHHH------
Q 004420 157 LPAYCSI-TPDDKTRICMFAKAGMSVRQMLRLMELEKGVKLGCLP-FTEIDVRNLLQSFRNV-NRDYDAIDLIA------ 227 (754)
Q Consensus 157 l~s~R~l-~~~~k~~i~~l~~~g~~~~~i~~~l~~~~g~~~~~~~-~~~~Di~N~~~~~r~~-~~~~d~~~l~~------ 227 (754)
.|++-++ .++.+++|..|.+.|++|.+|--+|+..+|+. ++. ++-..|-.+++..--. ....|...|+.
T Consensus 22 ~PtWlK~~~ddvkeqI~K~akKGltpsqIGviLRDshGi~--q~r~v~G~kI~Rilk~~Gl~PeiPeDLy~likkAv~iR 99 (151)
T KOG0400|consen 22 VPTWLKLTADDVKEQIYKLAKKGLTPSQIGVILRDSHGIG--QVRFVTGNKILRILKSNGLAPEIPEDLYHLIKKAVAIR 99 (151)
T ss_pred CcHHHhcCHHHHHHHHHHHHHcCCChhHceeeeecccCcc--hhheechhHHHHHHHHcCCCCCCcHHHHHHHHHHHHHH
Confidence 3444344 45678889999999999999999999998863 222 1222222222221111 22445554442
Q ss_pred -HHHHhhcCCCCcEEEEEecCCCceeeEEeccchhHHHHHHcCCEEEEcCccc
Q 004420 228 -MCKKMKDKNPNFQYDFKMDGHNRLEHIAWSYASSVQLYEAFGDALVFDTTHR 279 (754)
Q Consensus 228 -~~~~~~~~np~~~~~~~~d~~~~~~~if~~~~~~~~~~~~~~dvl~~D~Ty~ 279 (754)
.|+.-. -|.+.+|+-|... ..+-.+-++|..+..+-.+++
T Consensus 100 kHLer~R-----------KD~d~K~RLILve-SRihRlARYYk~~~~lPp~WK 140 (151)
T KOG0400|consen 100 KHLERNR-----------KDKDAKFRLILVE-SRIHRLARYYKTKMVLPPNWK 140 (151)
T ss_pred HHHHHhc-----------cccccceEEEeeh-HHHHHHHHHHHhcccCCCCCC
Confidence 333221 2455666555543 456667777776666666554
No 27
>PF13936 HTH_38: Helix-turn-helix domain; PDB: 2W48_A.
Probab=70.52 E-value=3.8 Score=29.38 Aligned_cols=29 Identities=28% Similarity=0.657 Sum_probs=14.3
Q ss_pred cCCCCcchhhHHHHhhhcCCcHHHHHHHH
Q 004420 160 YCSITPDDKTRICMFAKAGMSVRQMLRLM 188 (754)
Q Consensus 160 ~R~l~~~~k~~i~~l~~~g~~~~~i~~~l 188 (754)
+++|+.+++..|..+.+.|.+.++|...|
T Consensus 2 ~~~Lt~~eR~~I~~l~~~G~s~~~IA~~l 30 (44)
T PF13936_consen 2 YKHLTPEERNQIEALLEQGMSIREIAKRL 30 (44)
T ss_dssp ----------HHHHHHCS---HHHHHHHT
T ss_pred ccchhhhHHHHHHHHHHcCCCHHHHHHHH
Confidence 46789999999999999999999998755
No 28
>PF04684 BAF1_ABF1: BAF1 / ABF1 chromatin reorganising factor; InterPro: IPR006774 ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 in Saccharomyces cerevisiae; P14164 from SWISSPROT). The N-terminal two thirds of the protein are necessary for DNA binding, and the N terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilise the protein structure.; GO: 0003677 DNA binding, 0006338 chromatin remodeling, 0005634 nucleus
Probab=69.39 E-value=8.7 Score=41.77 Aligned_cols=54 Identities=17% Similarity=0.289 Sum_probs=45.1
Q ss_pred CCccCCHHHHHHHHHHHHhhcCCeEEEcceeccCCCCcceEEEEEeecCCCCCCCCCCCccccccCCCcccccccceEEE
Q 004420 45 GQRFVSQDAAYEFYCSFAKQCGFSIRRHRTRGKDGVGRGVTRRDFTCHRGGFPQMKPSDDGKMQRNRKSSRCGCQAYMRI 124 (754)
Q Consensus 45 g~~F~S~eea~~~y~~yA~~~GF~ir~~~s~~~~~~~k~~~~~~~~C~r~G~~~~~~~~~~~~~r~~~s~r~gCpa~i~v 124 (754)
+..|+|+++-|..++.|.-...--|....|.+.+ ..+|.|.. -.|||+|-+
T Consensus 25 ~~~f~tl~~wy~v~ndyefq~rcpiilknsh~nk-------hftfachl----------------------k~c~fkill 75 (496)
T PF04684_consen 25 ARKFPTLEAWYNVINDYEFQSRCPIILKNSHRNK-------HFTFACHL----------------------KNCPFKILL 75 (496)
T ss_pred ccCCCcHHHHHHHHhhhhhhhcCceeeccccccc-------ceEEEeec----------------------cCCCceeee
Confidence 5679999999999999999999999988887654 35799965 259999988
Q ss_pred EEe
Q 004420 125 VKR 127 (754)
Q Consensus 125 ~~~ 127 (754)
...
T Consensus 76 sy~ 78 (496)
T PF04684_consen 76 SYC 78 (496)
T ss_pred eec
Confidence 754
No 29
>KOG4602 consensus Nanos and related proteins [General function prediction only]
Probab=67.51 E-value=19 Score=35.84 Aligned_cols=31 Identities=29% Similarity=0.484 Sum_probs=24.4
Q ss_pred ccccccccCCccCCCC---CCCCCCCCcCcCCcc
Q 004420 717 SRKRRHCSEPCCRHFG---HDASSCPIMGSDTLN 747 (754)
Q Consensus 717 ~~~~r~~~C~~C~~~g---Hn~~tCp~~~~~~~~ 747 (754)
=.|-|.|-|+.||..| |+++-||+....++.
T Consensus 263 CPkLR~YVCPiCGATgDnAHTiKyCPl~~~~~~s 296 (318)
T KOG4602|consen 263 CPKLRSYVCPICGATGDNAHTIKYCPLAFGDDTS 296 (318)
T ss_pred chhHhhhcCccccccCCcccceecccccCCCCcc
Confidence 3456788899999655 888889999888773
No 30
>PF08069 Ribosomal_S13_N: Ribosomal S13/S15 N-terminal domain; InterPro: IPR012606 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. This domain is found at the N terminus of ribosomal S13 and S15 proteins. This domain is also identified as NUC021.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005840 ribosome; PDB: 3U5C_N 3O30_G 3IZB_O 3O2Z_G 3U5G_N 2XZN_O 2XZM_O 3IZ6_O.
Probab=63.62 E-value=5.5 Score=30.61 Aligned_cols=36 Identities=17% Similarity=0.294 Sum_probs=27.1
Q ss_pred ccCCCC-cchhhHHHHhhhcCCcHHHHHHHHHHhcCc
Q 004420 159 AYCSIT-PDDKTRICMFAKAGMSVRQMLRLMELEKGV 194 (754)
Q Consensus 159 s~R~l~-~~~k~~i~~l~~~g~~~~~i~~~l~~~~g~ 194 (754)
++-+++ ++..+.|..|.+.|++|.+|--+|+.++|+
T Consensus 24 ~W~~~~~~eVe~~I~klakkG~tpSqIG~iLRD~~GI 60 (60)
T PF08069_consen 24 SWLKYSPEEVEELIVKLAKKGLTPSQIGVILRDQYGI 60 (60)
T ss_dssp TT--S-HHHHHHHHHHHCCTTHCHHHHHHHHHHSCTC
T ss_pred CCcCCCHHHHHHHHHHHHHcCCCHHHhhhhhhhccCC
Confidence 333444 445677889999999999999999998873
No 31
>PF05741 zf-nanos: Nanos RNA binding domain; InterPro: IPR024161 Nanos is a highly conserved RNA-binding protein in higher eukaryotes and functions as a key regulatory protein in translational control using a 3' untranslated region during the development and maintenance of germ cells. Nanos comprises a non-conserved amino-terminus and highly conserved carboxy-terminal regions. The C-terminal region has two conserved Cys-Cys-His-Cys (CCHC)-type zinc-finger motifs that are indispensable for nanos function. The structure of the nanos-type zinc finger is composed of two independent zinc-finger (ZF) lobes, the N-terminal ZF1 and the C-terminal ZF2, which are connected by a linker helix. These lobes create a large cleft. Zinc ions in ZF1 and ZF2 are bound to the CCHC motif by tetrahedral coordination.; PDB: 3ALR_B.
Probab=60.56 E-value=3.1 Score=31.36 Aligned_cols=22 Identities=27% Similarity=0.474 Sum_probs=9.4
Q ss_pred cccccCCccCC---CCCCCCCCCCc
Q 004420 720 RRHCSEPCCRH---FGHDASSCPIM 741 (754)
Q Consensus 720 ~r~~~C~~C~~---~gHn~~tCp~~ 741 (754)
-|.+.|+.|+. ..|+.+-||++
T Consensus 31 Lr~y~Cp~CgAtGd~AHT~~yCP~k 55 (55)
T PF05741_consen 31 LRKYVCPICGATGDNAHTIKYCPKK 55 (55)
T ss_dssp GGG---TTT---GGG---GGG-TT-
T ss_pred HhcCcCCCCcCcCccccccccCcCC
Confidence 36678999996 66888889975
No 32
>PHA02517 putative transposase OrfB; Reviewed
Probab=60.14 E-value=78 Score=32.76 Aligned_cols=73 Identities=11% Similarity=-0.121 Sum_probs=45.4
Q ss_pred CCEEEEcCccccCcCCceeeEEEEeecCCceeEEeeecccCCchhhHHHHHHHHHHHhCCCCCeEeeccccHHHH
Q 004420 269 GDALVFDTTHRLDSYDMLFGIWVGLDNHGMACFFGCVLLRDENMQSFSWSLKTLLGFMNGKAPQTLLTDQNIWLK 343 (754)
Q Consensus 269 ~dvl~~D~Ty~~n~y~~~l~~~~g~d~~~~~~~~a~al~~~E~~es~~w~l~~~~~~~~~~~p~~iitD~~~~l~ 343 (754)
..+++.|.||..... +-.++++.+|...+ .++|+.+...++.+...-+|+......+...+..|.||+.....
T Consensus 110 n~~w~~D~t~~~~~~-g~~yl~~iiD~~sr-~i~~~~~~~~~~~~~~~~~l~~a~~~~~~~~~~i~~sD~G~~y~ 182 (277)
T PHA02517 110 NQLWVADFTYVSTWQ-GWVYVAFIIDVFAR-RIVGWRVSSSMDTDFVLDALEQALWARGRPGGLIHHSDKGSQYV 182 (277)
T ss_pred CCeEEeceeEEEeCC-CCEEEEEecccCCC-eeeecccCCCCChHHHHHHHHHHHHhcCCCcCcEeecccccccc
Confidence 468888999876443 44567777776655 45577777777776554444443333332233466799987654
No 33
>COG3316 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=57.59 E-value=17 Score=35.91 Aligned_cols=82 Identities=21% Similarity=0.182 Sum_probs=54.2
Q ss_pred CEEEEcCccccCcCCceeeEEEEeecCCceeEEeeecccCCchhhHHHHHHHHHHHhCCCCCeEeeccccHHHHHHHHhh
Q 004420 270 DALVFDTTHRLDSYDMLFGIWVGLDNHGMACFFGCVLLRDENMQSFSWSLKTLLGFMNGKAPQTLLTDQNIWLKEAVAVE 349 (754)
Q Consensus 270 dvl~~D~Ty~~n~y~~~l~~~~g~d~~~~~~~~a~al~~~E~~es~~w~l~~~~~~~~~~~p~~iitD~~~~l~~Ai~~v 349 (754)
+++-+|=||.+-+-+. .+....+|..|++ +.+-|...-+...=.-||..+++.. ..|.+|+||+.+....|+.++
T Consensus 71 ~~w~vDEt~ikv~gkw-~ylyrAid~~g~~--Ld~~L~~rRn~~aAk~Fl~kllk~~--g~p~v~vtDka~s~~~A~~~l 145 (215)
T COG3316 71 DSWRVDETYIKVNGKW-HYLYRAIDADGLT--LDVWLSKRRNALAAKAFLKKLLKKH--GEPRVFVTDKAPSYTAALRKL 145 (215)
T ss_pred cceeeeeeEEeeccEe-eehhhhhccCCCe--EEEEEEcccCcHHHHHHHHHHHHhc--CCCceEEecCccchHHHHHhc
Confidence 4566676766532221 1222335555543 4455556656666677777777766 679999999999999999999
Q ss_pred CCCCccc
Q 004420 350 MPETKHA 356 (754)
Q Consensus 350 fP~a~h~ 356 (754)
-+...|+
T Consensus 146 ~~~~ehr 152 (215)
T COG3316 146 GSEVEHR 152 (215)
T ss_pred Ccchhee
Confidence 8866654
No 34
>COG5431 Uncharacterized metal-binding protein [Function unknown]
Probab=57.58 E-value=24 Score=30.08 Aligned_cols=20 Identities=30% Similarity=0.667 Sum_probs=15.1
Q ss_pred EEEccCccc-----cCcchhhHHHH
Q 004420 560 SCSCHQFEF-----SGILCRHVLRV 579 (754)
Q Consensus 560 ~CsC~~f~~-----~GiPC~Hil~v 579 (754)
.|||..|-. -.-||.|++.+
T Consensus 51 fCSCp~~~~svvl~Gk~~C~Hi~gl 75 (117)
T COG5431 51 FCSCPDFLGSVVLKGKSPCAHIIGL 75 (117)
T ss_pred cccCHHHHhHhhhcCcccchhhhhe
Confidence 899998872 23469999864
No 35
>KOG3517 consensus Transcription factor PAX1/9 [Transcription]
Probab=49.00 E-value=27 Score=34.58 Aligned_cols=69 Identities=14% Similarity=0.313 Sum_probs=50.6
Q ss_pred CCCCcchhhHHHHhhhcCCcHHHHHHHHHHhcCccCCCCCcchhhhhhhHHHhhhc--cCc---------ccHHHHHHHH
Q 004420 161 CSITPDDKTRICMFAKAGMSVRQMLRLMELEKGVKLGCLPFTEIDVRNLLQSFRNV--NRD---------YDAIDLIAMC 229 (754)
Q Consensus 161 R~l~~~~k~~i~~l~~~g~~~~~i~~~l~~~~g~~~~~~~~~~~Di~N~~~~~r~~--~~~---------~d~~~l~~~~ 229 (754)
|.|..+.+..|.+|...|++|..|-+.|+..+|- +..++.++... .+. -..-.+.+|.
T Consensus 19 RPLPna~RlrIVELarlGiRPCDISRQLrvSHGC-----------VSKILaRy~EtGsIlPGaIGGSkPRVTTP~VV~~I 87 (334)
T KOG3517|consen 19 RPLPNAIRLRIVELARLGIRPCDISRQLRVSHGC-----------VSKILARYNETGSILPGAIGGSKPRVTTPKVVKYI 87 (334)
T ss_pred ccCcchhhhhHHHHHHcCCCccchhhhhhhccch-----------HHHHHHHhccCCcccccccCCCCCccCChhHHHHH
Confidence 6778888889999999999999999999887762 23334443321 111 1345788999
Q ss_pred HHhhcCCCCcE
Q 004420 230 KKMKDKNPNFQ 240 (754)
Q Consensus 230 ~~~~~~np~~~ 240 (754)
++++..|||.|
T Consensus 88 R~~Kq~DPGIF 98 (334)
T KOG3517|consen 88 RSLKQRDPGIF 98 (334)
T ss_pred HHhhccCCcee
Confidence 99999999964
No 36
>PF11433 DUF3198: Protein of unknown function (DUF3198); InterPro: IPR024504 This domain is found at the C-terminal of a family of archaeal proteins annotated as membrane proteins.; PDB: 1X9B_A.
Probab=47.07 E-value=29 Score=24.91 Aligned_cols=44 Identities=14% Similarity=0.092 Sum_probs=30.6
Q ss_pred HHHHHhcCCCHHHHHHHHHHHHHhccCchhHHHHHHHHhhcccc
Q 004420 384 EFYRLYNLELEEDFEEEWSKMVNKYGLREYKHITSLYALRTFWA 427 (754)
Q Consensus 384 ~~~~~~~~~~~~eFe~~w~~l~~~~~~~~~~~l~~l~~~r~~W~ 427 (754)
.|..++++++..+|-..+.+|-..-.--+..|...+-+-+++|-
T Consensus 6 ~Fe~~InS~SK~~Fv~nL~ELE~is~rlg~~Y~~~LeeaK~kWk 49 (51)
T PF11433_consen 6 KFESYINSESKSVFVRNLTELERISKRLGKSYQIRLEEAKEKWK 49 (51)
T ss_dssp HHHHHHHS--HHHHHHHHHHHHHHHHHH-SHHHHHHHHHHHHH-
T ss_pred HHHHHhCCccHHHHHHhHHHHHHHHHHHchHHHHHHHHHHHhhc
Confidence 67889999999999999988854333234567777777788884
No 37
>PRK14702 insertion element IS2 transposase InsD; Provisional
Probab=42.89 E-value=90 Score=32.17 Aligned_cols=72 Identities=11% Similarity=-0.038 Sum_probs=47.9
Q ss_pred CCEEEEcCccccCcCCceeeEEEEeecCCceeEEeeecccC-CchhhHHHHHHHHHHHh-C---CCCCeEeeccccHH
Q 004420 269 GDALVFDTTHRLDSYDMLFGIWVGLDNHGMACFFGCVLLRD-ENMQSFSWSLKTLLGFM-N---GKAPQTLLTDQNIW 341 (754)
Q Consensus 269 ~dvl~~D~Ty~~n~y~~~l~~~~g~d~~~~~~~~a~al~~~-E~~es~~w~l~~~~~~~-~---~~~p~~iitD~~~~ 341 (754)
..+.+.|-||....-+.-+++.+.+|.+.. .++|+++... .+.+...-+|+..++.. + ...|..|.||+-..
T Consensus 87 n~~W~~DiT~~~~~~g~~~Yl~~viD~~sR-~ivg~~is~~~~~~~~v~~~l~~A~~~~~~~~~~~~~~iihSD~Gsq 163 (262)
T PRK14702 87 NQRWCSDGFEFCCDNGERLRVTFALDCCDR-EALHWAVTTGGFNSETVQDVMLGAVERRFGNDLPSSPVEWLTDNGSC 163 (262)
T ss_pred CCEEEeeeEEEEecCCcEEEEEEEEecccc-eeeeEEeccCcCCHHHHHHHHHHHHHHHhcccCCCCCeEEEcCCCcc
Confidence 367888999876554557888888887776 5678888764 56666555555443332 2 23567888997654
No 38
>PF02796 HTH_7: Helix-turn-helix domain of resolvase; InterPro: IPR006120 Site-specific recombination plays an important role in DNA rearrangement in prokaryotic organisms. Two types of site-specific recombination are known to occur: recombination between inverted repeats, resulting in the reversal of a DNA segment; and recombination between repeat sequences on two DNA molecules, resulting in their cointegration, or between repeats on one DNA molecule, resulting in the excision of a DNA fragment. Site-specific recombination is characterised by a strand exchange mechanism that requires no DNA synthesis or high energy cofactor; the phosphodiester bond energy is conserved in a phospho-protein linkage during strand cleavage and re-ligation. Two unrelated families of recombinases are currently known. The first, called the 'phage integrase' family, groups a number of bacterial phage and yeast plasmid enzymes. The second, called the 'resolvase' family, groups enzymes which share the following structural characteristics: an N-terminal catalytic and dimerization domain that contains a conserved serine residue involved in the transient covalent attachment to DNA (IPR006119 from INTERPRO), and a C-terminal helix-turn-helix DNA-binding domain. ; GO: 0000150 recombinase activity, 0003677 DNA binding, 0006310 DNA recombination; PDB: 1ZR2_A 2GM4_B 1RES_A 1ZR4_A 1RET_A 1GDT_B 2R0Q_C 1JKP_C 1IJW_C 1JJ6_C ....
Probab=41.92 E-value=23 Score=25.39 Aligned_cols=40 Identities=18% Similarity=0.300 Sum_probs=26.1
Q ss_pred CCCCcchhhHHHHhhhcCCcHHHHHHHHHHhcCccCCCCCcchhhhhhhHH
Q 004420 161 CSITPDDKTRICMFAKAGMSVRQMLRLMELEKGVKLGCLPFTEIDVRNLLQ 211 (754)
Q Consensus 161 R~l~~~~k~~i~~l~~~g~~~~~i~~~l~~~~g~~~~~~~~~~~Di~N~~~ 211 (754)
+.+++++.+.+..|...|++..+|...+ | +....||.++.
T Consensus 4 ~~~~~~~~~~i~~l~~~G~si~~IA~~~----g-------vsr~TvyR~l~ 43 (45)
T PF02796_consen 4 PKLSKEQIEEIKELYAEGMSIAEIAKQF----G-------VSRSTVYRYLN 43 (45)
T ss_dssp SSSSHCCHHHHHHHHHTT--HHHHHHHT----T-------S-HHHHHHHHC
T ss_pred CCCCHHHHHHHHHHHHCCCCHHHHHHHH----C-------cCHHHHHHHHh
Confidence 4567777888999999999988887643 3 35556666543
No 39
>PF00292 PAX: 'Paired box' domain; InterPro: IPR001523 The paired box is a conserved 124 amino acid N-terminal domain of unknown function that usually, but not always, precedes a homeobox domain (see IPR001356 from INTERPRO). Paired box genes are expressed in alternate segments of the developing fruit fly, the observed grouping of segments into pairs depending on the position of the segment in the segmental array, and not on the identity of the segment as in the case of homeotic genes. This implies that the genes affect different processes from those altered by homeotic genes.; GO: 0003677 DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 6PAX_A 1K78_E 1MDM_A 2K27_A 1PDN_C.
Probab=41.85 E-value=28 Score=31.33 Aligned_cols=70 Identities=20% Similarity=0.339 Sum_probs=40.9
Q ss_pred cCCCCcchhhHHHHhhhcCCcHHHHHHHHHHhcCccCCCCCcchhhhhhhHHHhhhc--cCc---------ccHHHHHHH
Q 004420 160 YCSITPDDKTRICMFAKAGMSVRQMLRLMELEKGVKLGCLPFTEIDVRNLLQSFRNV--NRD---------YDAIDLIAM 228 (754)
Q Consensus 160 ~R~l~~~~k~~i~~l~~~g~~~~~i~~~l~~~~g~~~~~~~~~~~Di~N~~~~~r~~--~~~---------~d~~~l~~~ 228 (754)
-|.|+.+.+..|.+|...|++|.+|...|.-.+| -+..++.++++. ... --...+..+
T Consensus 15 GrPLp~~~R~rIvela~~G~rp~~Isr~l~Vs~g-----------cVsKIl~Ry~eTGsi~Pg~iGGskprv~tp~v~~~ 83 (125)
T PF00292_consen 15 GRPLPNELRQRIVELAKEGVRPCDISRQLRVSHG-----------CVSKILSRYRETGSIRPGPIGGSKPRVATPEVVEK 83 (125)
T ss_dssp TSSS-HHHHHHHHHHHHTT--HHHHHHHHT--HH-----------HHHHHHHHHHHHS-SS----S----SSS-HCHHHH
T ss_pred CccCcHHHHHHHHHHhhhcCCHHHHHHHHccchh-----------HHHHHHHHHHHhcccCcccccCCCCCCCChHHHHH
Confidence 4678888999999999999999999887765443 234444444422 011 112235667
Q ss_pred HHHhhcCCCCcE
Q 004420 229 CKKMKDKNPNFQ 240 (754)
Q Consensus 229 ~~~~~~~np~~~ 240 (754)
+.+.+.+||..|
T Consensus 84 I~~~k~enP~if 95 (125)
T PF00292_consen 84 IEQYKRENPTIF 95 (125)
T ss_dssp HHHHHHH-TTS-
T ss_pred HHHHHhcCCCcc
Confidence 777788888764
No 40
>PTZ00072 40S ribosomal protein S13; Provisional
Probab=39.14 E-value=46 Score=30.47 Aligned_cols=30 Identities=23% Similarity=0.472 Sum_probs=26.1
Q ss_pred cchhhHHHHhhhcCCcHHHHHHHHHHhcCc
Q 004420 165 PDDKTRICMFAKAGMSVRQMLRLMELEKGV 194 (754)
Q Consensus 165 ~~~k~~i~~l~~~g~~~~~i~~~l~~~~g~ 194 (754)
++..+.|..|.+.|.+|++|--+|+.++|+
T Consensus 28 eeVe~~I~klaKkG~~pSqIG~iLRD~~gi 57 (148)
T PTZ00072 28 SEVEDQICKLAKKGLTPSQIGVILRDSMGI 57 (148)
T ss_pred HHHHHHHHHHHHCCCCHhHhhhhhhhccCc
Confidence 446677889999999999999999999975
No 41
>PRK09409 IS2 transposase TnpB; Reviewed
Probab=39.00 E-value=1.1e+02 Score=32.15 Aligned_cols=72 Identities=11% Similarity=-0.014 Sum_probs=48.4
Q ss_pred CCEEEEcCccccCcCCceeeEEEEeecCCceeEEeeecccC-CchhhHHHHHHHHHHH-hCC---CCCeEeeccccHH
Q 004420 269 GDALVFDTTHRLDSYDMLFGIWVGLDNHGMACFFGCVLLRD-ENMQSFSWSLKTLLGF-MNG---KAPQTLLTDQNIW 341 (754)
Q Consensus 269 ~dvl~~D~Ty~~n~y~~~l~~~~g~d~~~~~~~~a~al~~~-E~~es~~w~l~~~~~~-~~~---~~p~~iitD~~~~ 341 (754)
..+.+.|-||....-+.-+++.+.+|.... .++|+++... .+.+...-+|+..+.. .+. ..|..|.||+-..
T Consensus 126 N~~W~tDiT~~~~~~g~~~Yl~~ViD~~sR-~ivg~~~s~~~~~~~~v~~~l~~a~~~~~~~~~~~~~~iihSDrGsq 202 (301)
T PRK09409 126 NQRWCSDGFEFCCDNGERLRVTFALDCCDR-EALHWAVTTGGFNSETVQDVMLGAVERRFGNDLPSSPVEWLTDNGSC 202 (301)
T ss_pred CCEEEeeeEEEEeCCCCEEEEEEEeecccc-eEEEEEeccCCCCHHHHHHHHHHHHHHHhccCCCCCCcEEecCCCcc
Confidence 468888999976544556888888887776 6778998875 5666666666543333 332 2456778887654
No 42
>COG4279 Uncharacterized conserved protein [Function unknown]
Probab=36.82 E-value=17 Score=36.35 Aligned_cols=24 Identities=25% Similarity=0.585 Sum_probs=19.4
Q ss_pred eeEEEccCccccCcchhhHHHHhhcCC
Q 004420 558 HISCSCHQFEFSGILCRHVLRVLSTDN 584 (754)
Q Consensus 558 ~~~CsC~~f~~~GiPC~Hil~vl~~~~ 584 (754)
...|||..|. .||.||-+|.-++.
T Consensus 124 ~~dCSCPD~a---nPCKHi~AvyY~la 147 (266)
T COG4279 124 STDCSCPDYA---NPCKHIAAVYYLLA 147 (266)
T ss_pred ccccCCCCcc---cchHHHHHHHHHHH
Confidence 4569999864 79999999987654
No 43
>COG4715 Uncharacterized conserved protein [Function unknown]
Probab=34.87 E-value=1.8e+02 Score=33.02 Aligned_cols=50 Identities=20% Similarity=0.288 Sum_probs=30.7
Q ss_pred eeCCeEEEEEeeeeCCceEEEEecCCCeeEEEccCccccCcchhhHHHHhhc
Q 004420 531 VDEGCFQVKHHTETDGGCKVIWIPCQEHISCSCHQFEFSGILCRHVLRVLST 582 (754)
Q Consensus 531 ~~~~~y~V~~~~~~~~~~~V~~~~~~~~~~CsC~~f~~~GiPC~Hil~vl~~ 582 (754)
..++.+.....++..-.+.|+......+..|+|.. ...| -|.|+.+|+..
T Consensus 45 ~~g~~v~A~V~Gs~~y~v~vtL~~~~~ss~CTCP~-~~~g-aCKH~VAvvl~ 94 (587)
T COG4715 45 IRGGTVRAVVEGSRRYRVRVTLEGGALSSICTCPY-GGSG-ACKHVVAVVLE 94 (587)
T ss_pred ecCCeEEEEEeccceeeEEEEeecCCcCceeeCCC-CCCc-chHHHHHHHHH
Confidence 44555544333322223445555556789999987 4443 69999998875
No 44
>PRK14892 putative transcription elongation factor Elf1; Provisional
Probab=32.30 E-value=31 Score=29.63 Aligned_cols=10 Identities=20% Similarity=0.398 Sum_probs=7.3
Q ss_pred cccccCCccC
Q 004420 720 RRHCSEPCCR 729 (754)
Q Consensus 720 ~r~~~C~~C~ 729 (754)
.....|+.|+
T Consensus 19 pt~f~CP~Cg 28 (99)
T PRK14892 19 PKIFECPRCG 28 (99)
T ss_pred CcEeECCCCC
Confidence 3455699998
No 45
>COG5222 Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]
Probab=31.10 E-value=23 Score=36.08 Aligned_cols=26 Identities=19% Similarity=0.318 Sum_probs=22.8
Q ss_pred cccCCccCCCCCCCCCCCCcCcCCcc
Q 004420 722 HCSEPCCRHFGHDASSCPIMGSDTLN 747 (754)
Q Consensus 722 ~~~C~~C~~~gHn~~tCp~~~~~~~~ 747 (754)
-|-|=+|++.||-+..||.+.+.+..
T Consensus 176 gY~CyRCGqkgHwIqnCpTN~Dpnfd 201 (427)
T COG5222 176 GYVCYRCGQKGHWIQNCPTNQDPNFD 201 (427)
T ss_pred ceeEEecCCCCchhhcCCCCCCCCcc
Confidence 35599999999999999999998764
No 46
>COG5470 Uncharacterized conserved protein [Function unknown]
Probab=30.82 E-value=24 Score=29.81 Aligned_cols=42 Identities=21% Similarity=0.135 Sum_probs=28.0
Q ss_pred cccccccCCCCCCCCCCCccCCHHHHHHHHHHHHhhcCCeEE
Q 004420 29 ETILSQQTSVNLVPFIGQRFVSQDAAYEFYCSFAKQCGFSIR 70 (754)
Q Consensus 29 ~~~~~~~~~~~~~P~vg~~F~S~eea~~~y~~yA~~~GF~ir 70 (754)
....+++.+-+..+-+=++|+|++.|.++|++=+...--++|
T Consensus 41 G~v~~lEG~w~ptr~vviEFps~~~ar~~y~SpeYq~a~~~R 82 (96)
T COG5470 41 GEVETLEGEWRPTRNVVIEFPSLEAARDCYNSPEYQAAAAIR 82 (96)
T ss_pred CCeeeccCCCCcccEEEEEcCCHHHHHHHhcCHHHHHHHHHH
Confidence 344455555444556778999999999999976554433433
No 47
>KOG0341 consensus DEAD-box protein abstrakt [RNA processing and modification]
Probab=30.29 E-value=22 Score=37.91 Aligned_cols=20 Identities=25% Similarity=0.698 Sum_probs=17.9
Q ss_pred cCCccCCCCCCCCCCCCcCc
Q 004420 724 SEPCCRHFGHDASSCPIMGS 743 (754)
Q Consensus 724 ~C~~C~~~gHn~~tCp~~~~ 743 (754)
.|.+|++.||-++-||+..-
T Consensus 572 GCayCgGLGHRItdCPKle~ 591 (610)
T KOG0341|consen 572 GCAYCGGLGHRITDCPKLEA 591 (610)
T ss_pred ccccccCCCcccccCchhhh
Confidence 49999999999999998753
No 48
>PF02178 AT_hook: AT hook motif; InterPro: IPR017956 AT hooks are DNA-binding motifs with a preference for A/T rich regions. These motifs are found in a variety of proteins, including the high mobility group (HMG) proteins, in DNA-binding proteins from plants and in hBRG1 protein, a central ATPase of the human switching/sucrose non-fermenting (SWI/SNF) remodeling complex. High mobility group (HMG) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG-I and HMG-Y (HMGA) are proteins of about 100 amino acid residues which are produced by the alternative splicing of a single gene. HMG-I/Y proteins bind preferentially to the minor groove of AT-rich regions in double-stranded DNA in a non-sequence specific manner. It is suggested that these proteins could function in nucleosome phasing and in the 3' end processing of mRNA transcripts. They are also involved in the transcription regulation of genes containing, or in close proximity to, AT-rich regions. ; GO: 0003677 DNA binding; PDB: 2EZE_A 2EZD_A 2EZF_A 2EZG_A.
Probab=28.69 E-value=24 Score=18.33 Aligned_cols=9 Identities=11% Similarity=0.220 Sum_probs=3.1
Q ss_pred cccCCcccc
Q 004420 700 FTLGKLKER 708 (754)
Q Consensus 700 ~~kGrpk~~ 708 (754)
+..|||++.
T Consensus 2 r~RGRP~k~ 10 (13)
T PF02178_consen 2 RKRGRPRKN 10 (13)
T ss_dssp --SS--TT-
T ss_pred CcCCCCccc
Confidence 457999864
No 49
>PF02171 Piwi: Piwi domain; InterPro: IPR003165 This domain is found in the stem cell self-renewal protein Piwi and its relatives in Drosophila melanogaster. It has been found at the C terminus of a number of proteins which also contain the PAZ domain (IPR003100 from INTERPRO) in their central region, for example the Argonaute proteins. Several of these proteins have been implicated in the development and maintenance of stem cells through the RNA-mediated gene-quelling mechanisms associated with the protein DICER. ; GO: 0005515 protein binding; PDB: 4F1N_B 3LUH_B 4EI1_A 3QX8_A 3LUC_C 3LUJ_B 3LUD_B 3QX9_A 3LUG_B 3LUK_B ....
Probab=28.60 E-value=1.8e+02 Score=30.48 Aligned_cols=98 Identities=14% Similarity=0.117 Sum_probs=53.1
Q ss_pred EEEEcCccccCcC-Cce-eeEEEE-eecCCceeEEeeecc--cCCchhh----HHHHHHHHHHHhCCCCCeEeec--cc-
Q 004420 271 ALVFDTTHRLDSY-DML-FGIWVG-LDNHGMACFFGCVLL--RDENMQS----FSWSLKTLLGFMNGKAPQTLLT--DQ- 338 (754)
Q Consensus 271 vl~~D~Ty~~n~y-~~~-l~~~~g-~d~~~~~~~~a~al~--~~E~~es----~~w~l~~~~~~~~~~~p~~iit--D~- 338 (754)
||.+|.++..... ..| ++.+++ +|.++..+.-.+.+. ..|..+. +.+.|+.|.+..+...|..||. |+
T Consensus 79 iIGidv~h~~~~~~~~~sv~g~~~s~~~~~~~~~~~~~~~~~~~e~~~~l~~~~~~~L~~~~~~~~~~~P~~IiiyRdGv 158 (302)
T PF02171_consen 79 IIGIDVSHPSPGSDKNPSVVGFVASFDSDGSKYFSSVRFQDSGQEIIDNLEEIIKEALKEFKKNNGKWLPERIIIYRDGV 158 (302)
T ss_dssp EEEEEEEEESSTCTCSCEEEEEEEEESTTTCEEEEEEEEECTTCCCHHHHHHHHHHHHHHHHHTTTT-TTSEEEEEEES-
T ss_pred EEEEEEEecCcccCCcceeeEEEEeccCccccccceeEEeccchhhhcchhhHHHHHHHHHHHHcCCCCCceEEEEEccc
Confidence 7889999998777 333 444444 344444444333332 3344444 5666666666666657765543 22
Q ss_pred ---c---------HHHHHHHHhhCCCCccccchhhhhhhccc
Q 004420 339 ---N---------IWLKEAVAVEMPETKHAVYIWHILAKLSD 368 (754)
Q Consensus 339 ---~---------~~l~~Ai~~vfP~a~h~lC~~Hi~~n~~~ 368 (754)
+ .++.+|+.+.-++....++...+.++...
T Consensus 159 se~~~~~v~~~Ei~~i~~a~~~~~~~~~p~~~~i~v~K~~~~ 200 (302)
T PF02171_consen 159 SEGQFKKVLEEEIEAIKEAIKELGEDYNPKITYIVVQKRHNT 200 (302)
T ss_dssp -GGGHHHHHHHHHHHHHHHHHHHTHTTCTEEEEEEEESSSS-
T ss_pred CHHhhcccHHHHHHHHHHHHhhcccCCCCcEEEEEeeccccc
Confidence 2 24555555555555555666666665443
No 50
>PRK00766 hypothetical protein; Provisional
Probab=27.55 E-value=3.1e+02 Score=26.81 Aligned_cols=34 Identities=9% Similarity=0.178 Sum_probs=21.6
Q ss_pred CCeEeecccc---HHHHHHHHhhCCCCccccchhhhhhhc
Q 004420 330 APQTLLTDQN---IWLKEAVAVEMPETKHAVYIWHILAKL 366 (754)
Q Consensus 330 ~p~~iitD~~---~~l~~Ai~~vfP~a~h~lC~~Hi~~n~ 366 (754)
.|..+++..- .+|..|+.+.||+...+ |.+++++
T Consensus 99 ~PVI~V~r~~p~~~~ie~AL~k~f~~~~~R---~~~~~~~ 135 (194)
T PRK00766 99 LPVIVVMRKKPDFEAIESALKKHFSDWEER---IKLIKKA 135 (194)
T ss_pred CCEEEEEecCCCHHHHHHHHHHHCCCHHHH---HHHHHhC
Confidence 4555553332 37899999999987664 4444443
No 51
>PF15299 ALS2CR8: Amyotrophic lateral sclerosis 2 chromosomal region candidate gene 8
Probab=26.03 E-value=59 Score=32.67 Aligned_cols=20 Identities=25% Similarity=0.431 Sum_probs=15.2
Q ss_pred ccCCCcccccccceEEEEEe
Q 004420 108 QRNRKSSRCGCQAYMRIVKR 127 (754)
Q Consensus 108 ~r~~~s~r~gCpa~i~v~~~ 127 (754)
.+...+.+.+|||+|+|+..
T Consensus 70 ~~~~~skK~~CPA~I~Ik~I 89 (225)
T PF15299_consen 70 RRSKPSKKRDCPARIYIKEI 89 (225)
T ss_pred cccccccCCCCCeEEEEEEE
Confidence 34455789999999998753
No 52
>PF13917 zf-CCHC_3: Zinc knuckle
Probab=23.95 E-value=39 Score=24.00 Aligned_cols=17 Identities=29% Similarity=0.487 Sum_probs=15.7
Q ss_pred cCCccCCCCCCCCCCCC
Q 004420 724 SEPCCRHFGHDASSCPI 740 (754)
Q Consensus 724 ~C~~C~~~gHn~~tCp~ 740 (754)
.|-+|++.||-..-||.
T Consensus 6 ~CqkC~~~GH~tyeC~~ 22 (42)
T PF13917_consen 6 RCQKCGQKGHWTYECPN 22 (42)
T ss_pred cCcccCCCCcchhhCCC
Confidence 49999999999999995
No 53
>smart00351 PAX Paired Box domain.
Probab=22.66 E-value=2e+02 Score=25.82 Aligned_cols=69 Identities=16% Similarity=0.263 Sum_probs=44.9
Q ss_pred CCCCcchhhHHHHhhhcCCcHHHHHHHHHHhcCccCCCCCcchhhhhhhHHHhhhcc--Cc----c-----cHHHHHHHH
Q 004420 161 CSITPDDKTRICMFAKAGMSVRQMLRLMELEKGVKLGCLPFTEIDVRNLLQSFRNVN--RD----Y-----DAIDLIAMC 229 (754)
Q Consensus 161 R~l~~~~k~~i~~l~~~g~~~~~i~~~l~~~~g~~~~~~~~~~~Di~N~~~~~r~~~--~~----~-----d~~~l~~~~ 229 (754)
+.++.+.+..|..+...|.+.++|...+ | ++...+++++++++... .. + -......++
T Consensus 16 ~~~s~~~R~riv~~~~~G~s~~~iA~~~----g-------vs~~tV~kwi~r~~~~G~~~pk~~gg~rp~~~~~~~~~~I 84 (125)
T smart00351 16 RPLPDEERQRIVELAQNGVRPCDISRQL----C-------VSHGCVSKILGRYYETGSIRPGAIGGSKPKVATPKVVKKI 84 (125)
T ss_pred CCCCHHHHHHHHHHHHcCCCHHHHHHHH----C-------cCHHHHHHHHHHHHHcCCcCCcCCCCCCCCccCHHHHHHH
Confidence 5578889999998888999999886544 3 46677888888776431 00 1 111233455
Q ss_pred HHhhcCCCCcE
Q 004420 230 KKMKDKNPNFQ 240 (754)
Q Consensus 230 ~~~~~~np~~~ 240 (754)
.++..+||...
T Consensus 85 ~~~~~~~p~~t 95 (125)
T smart00351 85 ADYKQENPGIF 95 (125)
T ss_pred HHHHHHCCCCC
Confidence 55666777763
No 54
>PF12353 eIF3g: Eukaryotic translation initiation factor 3 subunit G ; InterPro: IPR024675 At least eleven different protein factors are involved in initiation of protein synthesis in eukaryotes. Binding of initiator tRNA and mRNA to the 40S subunit requires the presence of the translation initiation factors eIF-2 and eIF-3, with eIF-3 being particularly important for 80S ribosome dissociation and mRNA binding. eIF-3 is the most complex translation initiation factor, consisting of about 13 putative subunits and having a molecular weight of between 550 - 700 kDa in mammalian cells. Subunits are designated eIF-3a - eIF-3m; the large number of subunits means that the interactions between the individual subunits that make up the eIF-3 complex are complex and varied. Subunit G is required for eIF3 integrity. This entry represents a domain of approximately 130 amino acids in length found at the N terminus of eukaryotic translation initiation factor 3 subunit G. This domain is commonly found in association with the RNA recognition domain PF00076 from PFAM.
Probab=22.66 E-value=51 Score=29.89 Aligned_cols=23 Identities=22% Similarity=0.430 Sum_probs=19.1
Q ss_pred cccccCCccCCCCCCCCCCCCcCc
Q 004420 720 RRHCSEPCCRHFGHDASSCPIMGS 743 (754)
Q Consensus 720 ~r~~~C~~C~~~gHn~~tCp~~~~ 743 (754)
...+.|..|+ -+|-...||.+..
T Consensus 104 ~~~v~CR~Ck-GdH~T~~CPyKd~ 126 (128)
T PF12353_consen 104 KSKVKCRICK-GDHWTSKCPYKDT 126 (128)
T ss_pred CceEEeCCCC-CCcccccCCcccc
Confidence 4566799997 8999999998753
No 55
>PF04800 ETC_C1_NDUFA4: ETC complex I subunit conserved region; InterPro: IPR006885 This entry represents prokaryotic NADH-ubiquinone oxidoreductase subunits (1.6.5.3 from EC, 1.6.99.3 from EC) from complex I of the electron transport chain initially identified in Neurospora crassa as a 21 kDa protein.; GO: 0016651 oxidoreductase activity, acting on NADH or NADPH, 0022900 electron transport chain, 0005743 mitochondrial inner membrane; PDB: 2JYA_A 2LJU_A.
Probab=21.89 E-value=73 Score=27.54 Aligned_cols=27 Identities=19% Similarity=0.384 Sum_probs=20.8
Q ss_pred CCCCccCCHHHHHHHHHHHHhhcCCeEEEcc
Q 004420 43 FIGQRFVSQDAAYEFYCSFAKQCGFSIRRHR 73 (754)
Q Consensus 43 ~vg~~F~S~eea~~~y~~yA~~~GF~ir~~~ 73 (754)
.+.+.|+|.|+|. +||+++|....+..
T Consensus 50 ~v~l~F~skE~Ai----~yaer~G~~Y~V~~ 76 (101)
T PF04800_consen 50 SVRLKFDSKEDAI----AYAERNGWDYEVEE 76 (101)
T ss_dssp -CEEEESSHHHHH----HHHHHCT-EEEEE-
T ss_pred eeEeeeCCHHHHH----HHHHHcCCeEEEeC
Confidence 3888999999996 57999998887654
No 56
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=21.10 E-value=50 Score=31.89 Aligned_cols=19 Identities=26% Similarity=0.422 Sum_probs=0.0
Q ss_pred ccCCccCCCCCCCCCC-CCc
Q 004420 723 CSEPCCRHFGHDASSC-PIM 741 (754)
Q Consensus 723 ~~C~~C~~~gHn~~tC-p~~ 741 (754)
..|.+|++.||-++-| |.+
T Consensus 98 ~~C~~Cg~~GH~~~dC~P~~ 117 (190)
T COG5082 98 KKCYNCGETGHLSRDCNPSK 117 (190)
T ss_pred cccccccccCccccccCccc
No 57
>PF11427 HTH_Tnp_Tc3_1: Tc3 transposase; PDB: 1U78_A 1TC3_C.
Probab=20.64 E-value=94 Score=23.05 Aligned_cols=28 Identities=25% Similarity=0.495 Sum_probs=20.8
Q ss_pred CCCcchhhHHHHhhhcCCcHHHHHHHHH
Q 004420 162 SITPDDKTRICMFAKAGMSVRQMLRLME 189 (754)
Q Consensus 162 ~l~~~~k~~i~~l~~~g~~~~~i~~~l~ 189 (754)
.|++.++.+|..|...|++..+|-..+.
T Consensus 4 ~Lt~~Eqaqid~m~qlG~s~~~isr~i~ 31 (50)
T PF11427_consen 4 TLTDAEQAQIDVMHQLGMSLREISRRIG 31 (50)
T ss_dssp ---HHHHHHHHHHHHTT--HHHHHHHHT
T ss_pred cCCHHHHHHHHHHHHhchhHHHHHHHhC
Confidence 4788999999999999999999987775
No 58
>PF13384 HTH_23: Homeodomain-like domain; PDB: 2X48_C.
Probab=20.52 E-value=87 Score=22.61 Aligned_cols=40 Identities=15% Similarity=0.395 Sum_probs=20.1
Q ss_pred chhhHHHHhhhcCCcHHHHHHHHHHhcCccCCCCCcchhhhhhhHHHhhhc
Q 004420 166 DDKTRICMFAKAGMSVRQMLRLMELEKGVKLGCLPFTEIDVRNLLQSFRNV 216 (754)
Q Consensus 166 ~~k~~i~~l~~~g~~~~~i~~~l~~~~g~~~~~~~~~~~Di~N~~~~~r~~ 216 (754)
..+..+..+...|.+.++|.+.+. ++...|++++.+++..
T Consensus 5 ~~R~~ii~l~~~G~s~~~ia~~lg-----------vs~~Tv~~w~kr~~~~ 44 (50)
T PF13384_consen 5 ERRAQIIRLLREGWSIREIAKRLG-----------VSRSTVYRWIKRYREE 44 (50)
T ss_dssp -----HHHHHHHT--HHHHHHHHT-----------S-HHHHHHHHT-----
T ss_pred hHHHHHHHHHHCCCCHHHHHHHHC-----------cCHHHHHHHHHHcccc
Confidence 344556666666999999887652 4667888888877643
Done!