Query 007318
Match_columns 608
No_of_seqs 255 out of 1737
Neff 9.4
Searched_HMMs 46136
Date Thu Mar 28 21:56:48 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/007318.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/007318hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PLN03097 FHY3 Protein FAR-RED 100.0 2.2E-76 4.9E-81 644.6 43.0 474 39-535 71-623 (846)
2 PF10551 MULE: MULE transposas 99.9 1.5E-22 3.3E-27 167.0 8.7 90 240-331 1-93 (93)
3 PF00872 Transposase_mut: Tran 99.8 5.8E-22 1.3E-26 205.1 4.9 240 142-430 113-368 (381)
4 PF03108 DBD_Tnp_Mut: MuDR fam 99.7 3.1E-16 6.7E-21 119.8 8.9 67 39-105 1-67 (67)
5 COG3328 Transposase and inacti 99.6 7.6E-14 1.6E-18 141.3 17.5 236 142-428 99-345 (379)
6 PF08731 AFT: Transcription fa 99.0 1.8E-09 3.8E-14 87.3 8.8 69 48-116 1-111 (111)
7 smart00575 ZnF_PMZ plant mutat 98.9 5.7E-10 1.2E-14 68.1 2.2 27 493-519 1-27 (28)
8 PF03101 FAR1: FAR1 DNA-bindin 98.8 1.1E-08 2.4E-13 83.5 6.5 61 56-117 1-90 (91)
9 PF04434 SWIM: SWIM zinc finge 98.1 1.8E-06 3.8E-11 58.2 2.9 30 488-517 10-39 (40)
10 PF00098 zf-CCHC: Zinc knuckle 96.4 0.0023 5.1E-08 34.4 1.6 18 590-607 1-18 (18)
11 PF01610 DDE_Tnp_ISL3: Transpo 95.5 0.021 4.5E-07 56.1 5.5 94 236-335 1-97 (249)
12 PF15288 zf-CCHC_6: Zinc knuck 95.4 0.0069 1.5E-07 39.6 1.1 18 590-607 2-21 (40)
13 PF13610 DDE_Tnp_IS240: DDE do 95.0 0.014 3.1E-07 51.6 2.3 81 233-317 1-81 (140)
14 PHA02517 putative transposase 94.9 1.5 3.4E-05 43.6 16.8 151 129-306 30-181 (277)
15 PF06782 UPF0236: Uncharacteri 94.8 1.2 2.6E-05 47.9 16.5 130 273-415 235-377 (470)
16 PF03106 WRKY: WRKY DNA -bindi 94.7 0.078 1.7E-06 38.9 5.1 40 76-115 20-59 (60)
17 PF04684 BAF1_ABF1: BAF1 / ABF 94.1 0.11 2.4E-06 53.3 6.2 57 43-99 23-80 (496)
18 COG3316 Transposase and inacti 93.7 0.63 1.4E-05 43.5 9.9 120 152-320 33-152 (215)
19 PF13696 zf-CCHC_2: Zinc knuck 93.6 0.038 8.2E-07 34.4 1.2 19 589-607 8-26 (32)
20 PRK14702 insertion element IS2 93.4 8.6 0.00019 37.9 20.1 146 126-305 9-163 (262)
21 PF03050 DDE_Tnp_IS66: Transpo 92.6 0.54 1.2E-05 46.7 8.6 133 142-336 19-156 (271)
22 PF04500 FLYWCH: FLYWCH zinc f 90.4 0.47 1E-05 34.9 4.2 46 65-114 14-62 (62)
23 PRK09409 IS2 transposase TnpB; 90.3 5.8 0.00013 40.0 13.2 143 129-305 51-202 (301)
24 smart00774 WRKY DNA binding do 89.3 0.46 9.9E-06 34.6 3.2 38 77-114 21-59 (59)
25 PF00665 rve: Integrase core d 87.9 1.8 3.9E-05 36.6 6.7 74 233-307 6-80 (120)
26 PF14392 zf-CCHC_4: Zinc knuck 86.5 0.28 6.1E-06 34.4 0.7 19 589-607 31-49 (49)
27 smart00343 ZnF_C2HC zinc finge 86.0 0.37 8E-06 28.6 0.9 16 591-606 1-16 (26)
28 PF13565 HTH_32: Homeodomain-l 84.2 2.7 5.9E-05 32.4 5.4 40 130-169 35-76 (77)
29 PF04937 DUF659: Protein of un 81.7 8.1 0.00018 34.6 8.0 63 274-336 73-138 (153)
30 PF02178 AT_hook: AT hook moti 80.5 0.74 1.6E-05 22.5 0.5 9 570-578 2-10 (13)
31 smart00384 AT_hook DNA binding 72.5 2.7 5.9E-05 24.7 1.5 13 569-581 1-13 (26)
32 COG5431 Uncharacterized metal- 72.0 3.1 6.8E-05 33.6 2.3 29 483-513 42-75 (117)
33 PF05741 zf-nanos: Nanos RNA b 67.3 2.2 4.8E-05 30.5 0.4 20 588-607 32-54 (55)
34 COG5179 TAF1 Transcription ini 62.7 4.3 9.3E-05 43.4 1.7 21 587-607 935-957 (968)
35 COG4279 Uncharacterized conser 62.3 4.1 8.8E-05 38.8 1.3 23 493-518 125-147 (266)
36 PRK12286 rpmF 50S ribosomal pr 59.7 9.5 0.00021 27.7 2.6 42 566-607 4-45 (57)
37 PRK14892 putative transcriptio 54.5 11 0.00024 30.8 2.4 10 588-597 20-29 (99)
38 PRK09335 30S ribosomal protein 53.9 12 0.00026 30.0 2.5 28 566-598 2-29 (95)
39 PF13917 zf-CCHC_3: Zinc knuck 51.7 7.8 0.00017 26.1 1.0 18 589-606 4-21 (42)
40 TIGR01031 rpmF_bact ribosomal 51.5 17 0.00037 26.2 2.7 42 566-607 2-44 (55)
41 COG4715 Uncharacterized conser 49.5 47 0.001 35.8 6.7 44 477-522 51-100 (587)
42 PRK00766 hypothetical protein; 48.6 1.4E+02 0.0031 27.8 9.0 88 233-320 9-128 (194)
43 PF13592 HTH_33: Winged helix- 47.6 33 0.00071 25.0 3.9 30 142-171 3-32 (60)
44 PLN00186 ribosomal protein S26 47.1 17 0.00038 29.8 2.5 28 566-598 2-29 (109)
45 PF14201 DUF4318: Domain of un 46.4 34 0.00074 26.3 3.8 30 47-76 13-42 (74)
46 PTZ00172 40S ribosomal protein 45.6 19 0.00041 29.6 2.4 28 566-598 2-29 (108)
47 COG5082 AIR1 Arginine methyltr 44.6 11 0.00025 34.5 1.2 16 590-605 98-113 (190)
48 PF12762 DDE_Tnp_IS1595: ISXO2 42.3 72 0.0016 28.1 6.1 68 234-306 4-86 (151)
49 COG0333 RpmF Ribosomal protein 40.1 17 0.00037 26.2 1.3 19 589-607 27-45 (57)
50 KOG4602 Nanos and related prot 40.0 13 0.00029 35.3 0.9 21 588-608 267-290 (318)
51 PF13082 DUF3931: Protein of u 38.1 94 0.002 21.6 4.5 28 247-274 35-62 (66)
52 PHA00689 hypothetical protein 37.2 18 0.00039 24.8 0.9 13 587-599 15-27 (62)
53 PF01498 HTH_Tnp_Tc3_2: Transp 36.1 37 0.0008 25.7 2.8 37 134-171 4-40 (72)
54 PF08459 UvrC_HhH_N: UvrC Heli 34.3 1E+02 0.0022 27.6 5.6 46 270-315 49-100 (155)
55 PF08766 DEK_C: DEK C terminal 34.0 85 0.0019 22.3 4.2 36 129-164 4-41 (54)
56 KOG0341 DEAD-box protein abstr 32.6 20 0.00044 36.5 1.0 19 589-607 570-588 (610)
57 COG4830 RPS26B Ribosomal prote 31.4 35 0.00075 27.4 1.8 28 566-598 2-29 (108)
58 PRK13907 rnhA ribonuclease H; 31.0 3.3E+02 0.0071 23.0 8.8 77 235-314 3-81 (128)
59 PRK13130 H/ACA RNA-protein com 29.9 31 0.00067 24.9 1.3 20 588-608 4-23 (56)
60 PF04800 ETC_C1_NDUFA4: ETC co 29.0 90 0.0019 25.7 3.9 29 43-75 50-78 (101)
61 PRK14890 putative Zn-ribbon RN 28.6 24 0.00051 25.6 0.5 19 589-608 36-54 (59)
62 COG5082 AIR1 Arginine methyltr 28.5 27 0.00059 32.1 1.0 18 589-606 60-77 (190)
63 PF01283 Ribosomal_S26e: Ribos 26.9 45 0.00098 27.9 1.9 27 566-597 2-28 (113)
64 PF09713 A_thal_3526: Plant pr 26.7 72 0.0016 22.8 2.6 26 143-168 12-38 (54)
65 PF05634 APO_RNA-bind: APO RNA 26.4 37 0.00081 31.4 1.5 19 589-607 98-121 (204)
66 PF13276 HTH_21: HTH-like doma 25.5 1.7E+02 0.0036 21.1 4.6 42 130-171 6-48 (60)
67 PF14420 Clr5: Clr5 domain 25.1 2.1E+02 0.0046 20.3 4.9 25 141-165 18-42 (54)
68 PF01783 Ribosomal_L32p: Ribos 24.6 17 0.00038 26.2 -0.8 20 588-607 25-44 (56)
69 PF13877 RPAP3_C: Potential Mo 24.5 40 0.00086 27.1 1.2 34 357-390 5-38 (94)
70 PF13248 zf-ribbon_3: zinc-rib 23.5 39 0.00084 20.0 0.7 18 590-607 3-21 (26)
71 PF14787 zf-CCHC_5: GAG-polypr 22.3 57 0.0012 21.0 1.3 17 590-606 3-19 (36)
72 KOG4027 Uncharacterized conser 22.3 1.1E+02 0.0024 27.1 3.4 36 238-273 70-108 (187)
73 PF13358 DDE_3: DDE superfamil 21.7 92 0.002 26.7 3.1 54 250-305 38-91 (146)
74 PF13551 HTH_29: Winged helix- 21.0 2.1E+02 0.0045 23.3 5.0 39 133-171 65-109 (112)
75 PF12353 eIF3g: Eukaryotic tra 20.6 68 0.0015 27.7 1.9 20 587-607 104-123 (128)
76 PRK01110 rpmF 50S ribosomal pr 20.5 77 0.0017 23.3 1.9 37 566-602 4-42 (60)
77 PF10045 DUF2280: Uncharacteri 20.3 1.2E+02 0.0026 24.9 3.0 22 145-166 21-42 (104)
78 cd01388 SOX-TCF_HMG-box SOX-TC 20.1 3.8E+02 0.0083 20.0 7.0 21 138-158 21-41 (72)
No 1
>PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional
Probab=100.00 E-value=2.2e-76 Score=644.56 Aligned_cols=474 Identities=18% Similarity=0.237 Sum_probs=377.9
Q ss_pred CccccccceeCCHHHHHHHHHHHHHhcCceEEEEeeCCe-------EEEEEEec--------------------------
Q 007318 39 NTITGVDQRFSSFSEFREALHKYSIAHGFAYRYKKNDSH-------RVTVKCKC-------------------------- 85 (608)
Q Consensus 39 ~~~~~vG~~F~s~~e~~~~~~~ya~~~gf~~~~~~s~~~-------r~~~~C~~-------------------------- 85 (608)
+..|.+||+|+|.+|+++||+.||...||++++.++.++ ..+++|++
T Consensus 71 ~~~P~vGMeF~S~eeA~~FYn~YA~~~GFsVRi~~srrsk~~~~ii~r~fvCsreG~~~~~~~~~~~~~~~~~k~~~~~~ 150 (846)
T PLN03097 71 NLEPLSGMEFESHGEAYSFYQEYARSMGFNTAIQNSRRSKTSREFIDAKFACSRYGTKREYDKSFNRPRARQTKQDPENG 150 (846)
T ss_pred CccCcCCCeECCHHHHHHHHHHHHhhcCceEEeeceeccCCCCcEEEEEEEEcCCCCCcccccccccccccccccCcccc
Confidence 456899999999999999999999999999998654322 23566653
Q ss_pred --------cCCCeEEEEEEcCCcceEEEEeccCcccccccccccccccchhhHHHHHHHHhhhCCCCCHHHHHHHHHHHh
Q 007318 86 --------QGCPWRIYASRLSTTQLVCIKKMNSKHTCEGASVKAGYRATRGWVGNIIKEKLKASPNYKPKDIADDIKREY 157 (608)
Q Consensus 86 --------~gC~~~v~~~~~~~~~~~~V~~~~~~H~c~~~~~~~~~~~s~~~i~~~~~~~l~~~~~~~~~~i~~~v~~~~ 157 (608)
+||+++|++++..+ +.|.|+.++.+|||++.........+++.... +...+....++.
T Consensus 151 ~~rR~~tRtGC~A~m~Vk~~~~-gkW~V~~fv~eHNH~L~p~~~~~~~~r~~~~~-~~~~~~~~~~v~------------ 216 (846)
T PLN03097 151 TGRRSCAKTDCKASMHVKRRPD-GKWVIHSFVKEHNHELLPAQAVSEQTRKMYAA-MARQFAEYKNVV------------ 216 (846)
T ss_pred cccccccCCCCceEEEEEEcCC-CeEEEEEEecCCCCCCCCccccchhhhhhHHH-HHhhhhcccccc------------
Confidence 47999999988544 68999999999999997432211111111111 000000000000
Q ss_pred CCcc-cHHHHHHHHHHHHHHHhcChHhHhccHHHHHHHHHhhCCCcEEEEEecCCCceeEEEeehhhhHHHHhhcCccEE
Q 007318 158 GIQL-NYSQAWRAKEIAREQLQGSYKDSYTLLPFFCEKIKETNPGSVVTFTTKEDSSFHRLFVSFHASISGFQQGCRPLL 236 (608)
Q Consensus 158 g~~~-s~~~~~~~k~~~~~~~~g~~~~~~~~L~~~~~~~~~~npg~~~~~~~~~~~~~~~if~~~~~~~~~~~~~~~~vi 236 (608)
+... ......+.|.+ .+. .+..+.|..|+.+++..||+|+|.+++|+++++++|||+++.++.+|.. |+|||
T Consensus 217 ~~~~d~~~~~~~~r~~---~~~---~gD~~~ll~yf~~~q~~nP~Ffy~~qlDe~~~l~niFWaD~~sr~~Y~~-FGDvV 289 (846)
T PLN03097 217 GLKNDSKSSFDKGRNL---GLE---AGDTKILLDFFTQMQNMNSNFFYAVDLGEDQRLKNLFWVDAKSRHDYGN-FSDVV 289 (846)
T ss_pred ccchhhcchhhHHHhh---hcc---cchHHHHHHHHHHHHhhCCCceEEEEEccCCCeeeEEeccHHHHHHHHh-cCCEE
Confidence 0000 00011111111 111 2356779999999999999999999999999999999999999999998 99999
Q ss_pred EeecccccCccceeEEEEEEecCCCCeEEEEEEEeecCCcccHHHHHHHHHHhcCCCCcEEEEecCcccHHHHHhhhcCC
Q 007318 237 FLDTTPLNSKYQGTLLTATSADGDDGIFPVAFAVVDAETEDNWHWFLQELKSAVSTSQQITFIADFQNGLNKSLAEVFDN 316 (608)
Q Consensus 237 ~~D~T~~~~~y~~~l~~~~g~d~~~~~~~~a~a~~~~E~~~~~~w~l~~l~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~ 316 (608)
.+|+||.+|+|++||+.++|+|+|++++++|+||+.+|+.++|.|+|++|+++|++..|.+||||++.+|.+||++|||+
T Consensus 290 ~fDTTY~tN~y~~Pfa~FvGvNhH~qtvlfGcaLl~dEt~eSf~WLf~tfl~aM~gk~P~tIiTDqd~am~~AI~~VfP~ 369 (846)
T PLN03097 290 SFDTTYVRNKYKMPLALFVGVNQHYQFMLLGCALISDESAATYSWLMQTWLRAMGGQAPKVIITDQDKAMKSVISEVFPN 369 (846)
T ss_pred EEeceeeccccCcEEEEEEEecCCCCeEEEEEEEcccCchhhHHHHHHHHHHHhCCCCCceEEecCCHHHHHHHHHHCCC
Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999
Q ss_pred ccccchHHHHHHHHHhhhcccccHHHHHHHHHHHHHHhcC-CCcccHHHHHHHH-hccChhHHHHHHhc--CCCCccccc
Q 007318 317 CYHSYCLRHLAEKLNRDIKGQFSHEARRFMINDLYAAAYA-PKFEGFQCSIESI-KGISPDAYDWVTQS--EPEHWANTY 392 (608)
Q Consensus 317 a~~~~C~~Hi~~n~~~~~~~~~~~~~~~~~~~~~~~~~~~-~t~~~f~~~~~~l-~~~~~~~~~~l~~~--~~~~W~~~~ 392 (608)
+.|++|.|||++|+.++++..+.. .+.+...|..+++. .+++||+..|..| .+++++.++||+.+ .|++|+++|
T Consensus 370 t~Hr~C~wHI~~~~~e~L~~~~~~--~~~f~~~f~~cv~~s~t~eEFE~~W~~mi~ky~L~~n~WL~~LY~~RekWapaY 447 (846)
T PLN03097 370 AHHCFFLWHILGKVSENLGQVIKQ--HENFMAKFEKCIYRSWTEEEFGKRWWKILDRFELKEDEWMQSLYEDRKQWVPTY 447 (846)
T ss_pred ceehhhHHHHHHHHHHHhhHHhhh--hhHHHHHHHHHHhCCCCHHHHHHHHHHHHHhhcccccHHHHHHHHhHhhhhHHH
Confidence 999999999999999999876532 35788899998886 7999999999665 56779999999999 799999999
Q ss_pred cCCCccccc-ccchhHHHHHHHHhh--hcCChhhhHHHHHHHHHHHHHHhh-----------------HhhhhcccCCCh
Q 007318 393 FPGARYDHM-TSNFGQQFYSWVSEA--HELPITHMVDVLRGKMMETIYTRR-----------------VESNQWLTKLTP 452 (608)
Q Consensus 393 ~~~~~~~~~-ttn~~Es~n~~lk~~--r~~~i~~l~~~i~~~~~~~~~~r~-----------------~~~~~~~~~~tp 452 (608)
+++.+++.| ||+++||+|+.+++. +..+|..|++.+-..+..+..+.. ...+|.+..|||
T Consensus 448 ~k~~F~agm~sTqRSES~Ns~fk~yv~~~tsL~~Fv~qye~~l~~~~ekE~~aD~~s~~~~P~l~t~~piEkQAs~iYT~ 527 (846)
T PLN03097 448 MRDAFLAGMSTVQRSESINAFFDKYVHKKTTVQEFVKQYETILQDRYEEEAKADSDTWNKQPALKSPSPLEKSVSGVYTH 527 (846)
T ss_pred hcccccCCcccccccccHHHHHHHHhCcCCCHHHHHHHHHHHHHHHHHHHHHhhhhcccCCcccccccHHHHHHHHHhHH
Confidence 998887655 779999999999985 778888888766544433322211 134567889999
Q ss_pred hHHHHHHHHHhhccceEEEEe----cCceEEEEc---CceEEEEe----cceeeecCCcccCCCCchhHHHHHHHhCC--
Q 007318 453 SKEDKLQKETAIARSFQVLHL----QSSTFEVRG---ESADIVDV----DRWDCTCKTWHLTGLPCCHAIAVLEWIGR-- 519 (608)
Q Consensus 453 ~~~~~l~~~~~~~~~~~v~~~----~~~~f~V~~---~~~~~V~l----~~~~CsC~~~~~~giPC~H~lav~~~~~~-- 519 (608)
.+|++||+++..+..+.+... ...+|.|.+ .+.|.|.. ...+|+|++|+..||||+|||.||...++
T Consensus 528 ~iF~kFQ~El~~~~~~~~~~~~~dg~~~~y~V~~~~~~~~~~V~~d~~~~~v~CsC~kFE~~GILCrHaLkVL~~~~v~~ 607 (846)
T PLN03097 528 AVFKKFQVEVLGAVACHPKMESQDETSITFRVQDFEKNQDFTVTWNQTKLEVSCICRLFEYKGYLCRHALVVLQMCQLSA 607 (846)
T ss_pred HHHHHHHHHHHHhhheEEeeeccCCceEEEEEEEecCCCcEEEEEecCCCeEEeeccCeecCccchhhHHHHHhhcCccc
Confidence 999999999999988887642 235688865 34577743 36899999999999999999999999986
Q ss_pred Cccccccccchhhhhh
Q 007318 520 SPYDYCSKYFTTESYR 535 (608)
Q Consensus 520 ~p~~~i~~~~t~~~~~ 535 (608)
+|+.||.+|||+++-.
T Consensus 608 IP~~YILkRWTKdAK~ 623 (846)
T PLN03097 608 IPSQYILKRWTKDAKS 623 (846)
T ss_pred Cchhhhhhhchhhhhh
Confidence 6999999999998864
No 2
>PF10551 MULE: MULE transposase domain; InterPro: IPR018289 This entry represents a domain found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif. This domain is also found in a transposase for the insertion sequence element IS256 in transposon Tn4001.
Probab=99.87 E-value=1.5e-22 Score=167.04 Aligned_cols=90 Identities=39% Similarity=0.716 Sum_probs=86.8
Q ss_pred cccccCccceeEEE---EEEecCCCCeEEEEEEEeecCCcccHHHHHHHHHHhcCCCCcEEEEecCcccHHHHHhhhcCC
Q 007318 240 TTPLNSKYQGTLLT---ATSADGDDGIFPVAFAVVDAETEDNWHWFLQELKSAVSTSQQITFIADFQNGLNKSLAEVFDN 316 (608)
Q Consensus 240 ~T~~~~~y~~~l~~---~~g~d~~~~~~~~a~a~~~~E~~~~~~w~l~~l~~~~~~~~~~~iitD~~~~l~~Ai~~vfP~ 316 (608)
|||++|+| ++++. ++|+|++|+.+|+||+++++|+.++|.|||+.+++.++.. |.+||+|++.|+.+||+++||+
T Consensus 1 ~T~~tn~~-~~l~~~~~~~~~d~~~~~~~v~~~l~~~e~~~~~~~~l~~~~~~~~~~-p~~ii~D~~~~~~~Ai~~vfP~ 78 (93)
T PF10551_consen 1 GTYKTNKY-GPLLYLMIAVGIDGNGRGFPVAFALVSSESEESYEWFLEKLKEAMPQK-PKVIISDFDKALINAIKEVFPD 78 (93)
T ss_pred Cccccccc-cccceeceEEEEcCCCCEEEEEEEEEcCCChhhhHHHHHHhhhccccC-ceeeeccccHHHHHHHHHHCCC
Confidence 79999999 88885 9999999999999999999999999999999999999887 9999999999999999999999
Q ss_pred ccccchHHHHHHHHH
Q 007318 317 CYHSYCLRHLAEKLN 331 (608)
Q Consensus 317 a~~~~C~~Hi~~n~~ 331 (608)
+.|++|.||+.+|++
T Consensus 79 ~~~~~C~~H~~~n~k 93 (93)
T PF10551_consen 79 ARHQLCLFHILRNIK 93 (93)
T ss_pred ceEehhHHHHHHhhC
Confidence 999999999999974
No 3
>PF00872 Transposase_mut: Transposase, Mutator family; InterPro: IPR001207 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. The mutator family of transposases consists of a number of elements that include mutator from maize, IsT2 from Thiobacillus ferrooxidans, Is256 from Staphylococcus aureus, Is1201 from Lactobacillus helveticus, Is1081 from Mycobacterium bovis, IsRm3 from Rhizobium meliloti and others. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=99.84 E-value=5.8e-22 Score=205.05 Aligned_cols=240 Identities=22% Similarity=0.270 Sum_probs=189.3
Q ss_pred CCCCHHHHHHHHHHHhC-CcccHHHHHHHHHHHHHHHhcChHhHhccHHHHHHHHHhhCCCcEEEEEecCCCceeEEEee
Q 007318 142 PNYKPKDIADDIKREYG-IQLNYSQAWRAKEIAREQLQGSYKDSYTLLPFFCEKIKETNPGSVVTFTTKEDSSFHRLFVS 220 (608)
Q Consensus 142 ~~~~~~~i~~~v~~~~g-~~~s~~~~~~~k~~~~~~~~g~~~~~~~~L~~~~~~~~~~npg~~~~~~~~~~~~~~~if~~ 220 (608)
.|++.++|.+.++..+| ..+|.+++.+..+.....+. .|..+-++
T Consensus 113 ~G~Str~i~~~l~~l~g~~~~S~s~vSri~~~~~~~~~-----------~w~~R~L~----------------------- 158 (381)
T PF00872_consen 113 KGVSTRDIEEALEELYGEVAVSKSTVSRITKQLDEEVE-----------AWRNRPLE----------------------- 158 (381)
T ss_pred cccccccccchhhhhhcccccCchhhhhhhhhhhhhHH-----------HHhhhccc-----------------------
Confidence 47899999999999999 77999888886665544322 22211111
Q ss_pred hhhhHHHHhhcC-ccEEEeecccccCccc-----eeEEEEEEecCCCCeEEEEEEEeecCCcccHHHHHHHHHHhcCCCC
Q 007318 221 FHASISGFQQGC-RPLLFLDTTPLNSKYQ-----GTLLTATSADGDDGIFPVAFAVVDAETEDNWHWFLQELKSAVSTSQ 294 (608)
Q Consensus 221 ~~~~~~~~~~~~-~~vi~~D~T~~~~~y~-----~~l~~~~g~d~~~~~~~~a~a~~~~E~~~~~~w~l~~l~~~~~~~~ 294 (608)
.. .++|++||+|.+-+.+ ..+++++|+|.+|+..+||+.+.+.|+.++|.-||+.|++. |...
T Consensus 159 ----------~~~y~~l~iD~~~~kvr~~~~~~~~~~~v~iGi~~dG~r~vLg~~~~~~Es~~~W~~~l~~L~~R-Gl~~ 227 (381)
T PF00872_consen 159 ----------SEPYPYLWIDGTYFKVREDGRVVKKAVYVAIGIDEDGRREVLGFWVGDRESAASWREFLQDLKER-GLKD 227 (381)
T ss_pred ----------cccccceeeeeeecccccccccccchhhhhhhhhcccccceeeeecccCCccCEeeecchhhhhc-cccc
Confidence 14 5789999999976643 46899999999999999999999999999999999999888 4456
Q ss_pred cEEEEecCcccHHHHHhhhcCCccccchHHHHHHHHHhhhcccccHHHHHHHHHHHHHHhcCCCcccHHHHHHHHh----
Q 007318 295 QITFIADFQNGLNKSLAEVFDNCYHSYCLRHLAEKLNRDIKGQFSHEARRFMINDLYAAAYAPKFEGFQCSIESIK---- 370 (608)
Q Consensus 295 ~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~~~~~~~~~~~~~~~~~~~~~~t~~~f~~~~~~l~---- 370 (608)
|..||+|+.+||.+||.++||++.++.|++|+++|+.+++... .++.+...++.+..+.+.++....++.+.
T Consensus 228 ~~lvv~Dg~~gl~~ai~~~fp~a~~QrC~vH~~RNv~~~v~~k----~~~~v~~~Lk~I~~a~~~e~a~~~l~~f~~~~~ 303 (381)
T PF00872_consen 228 ILLVVSDGHKGLKEAIREVFPGAKWQRCVVHLMRNVLRKVPKK----DRKEVKADLKAIYQAPDKEEAREALEEFAEKWE 303 (381)
T ss_pred cceeeccccccccccccccccchhhhhheechhhhhccccccc----cchhhhhhccccccccccchhhhhhhhcccccc
Confidence 9999999999999999999999999999999999999998653 34567777778878888887777776654
Q ss_pred ccChhHHHHHHhcCCCCccccccCCCcc-cccccchhHHHHHHHHhh----hcCChhhhHHHHHH
Q 007318 371 GISPDAYDWVTQSEPEHWANTYFPGARY-DHMTSNFGQQFYSWVSEA----HELPITHMVDVLRG 430 (608)
Q Consensus 371 ~~~~~~~~~l~~~~~~~W~~~~~~~~~~-~~~ttn~~Es~n~~lk~~----r~~~i~~l~~~i~~ 430 (608)
..+|++.++|++...+.|+..-|+...+ -..|||.+|++|+.+|+. ...|-.+.+..+..
T Consensus 304 ~kyp~~~~~l~~~~~~~~tf~~fP~~~~~~i~TTN~iEsln~~irrr~~~~~~Fp~~~s~lr~~~ 368 (381)
T PF00872_consen 304 KKYPKAAKSLEENWDELLTFLDFPPEHRRSIRTTNAIESLNKEIRRRTKVVGIFPNEESALRLVY 368 (381)
T ss_pred cccchhhhhhhhccccccceeeecchhccccchhhhccccccchhhhccccccCCCHHHHHHHHH
Confidence 4568899999888777777655665444 567999999999999974 34555544444433
No 4
>PF03108 DBD_Tnp_Mut: MuDR family transposase; InterPro: IPR004332 The plant MuDR transposase domain is present in plant proteins that are presumed to be the transposases for Mutator transposable elements. The function of these proteins is unknown. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=99.67 E-value=3.1e-16 Score=119.81 Aligned_cols=67 Identities=42% Similarity=0.769 Sum_probs=64.9
Q ss_pred CccccccceeCCHHHHHHHHHHHHHhcCceEEEEeeCCeEEEEEEeccCCCeEEEEEEcCCcceEEE
Q 007318 39 NTITGVDQRFSSFSEFREALHKYSIAHGFAYRYKKNDSHRVTVKCKCQGCPWRIYASRLSTTQLVCI 105 (608)
Q Consensus 39 ~~~~~vG~~F~s~~e~~~~~~~ya~~~gf~~~~~~s~~~r~~~~C~~~gC~~~v~~~~~~~~~~~~V 105 (608)
||.+.+||+|+|++|+++|+..||+.++|.+++.+|++.|++++|...||||+|+|++.++++.|+|
T Consensus 1 n~~l~~G~~F~~~~e~k~av~~yai~~~~~~~v~ksd~~r~~~~C~~~~C~Wrv~as~~~~~~~~~I 67 (67)
T PF03108_consen 1 NPELEVGQTFPSKEEFKEAVREYAIKNGFEFKVKKSDKKRYRAKCKDKGCPWRVRASKRKRSDTFQI 67 (67)
T ss_pred CCccccCCEECCHHHHHHHHHHHHHhcCcEEEEeccCCEEEEEEEcCCCCCEEEEEEEcCCCCEEEC
Confidence 6789999999999999999999999999999999999999999999999999999999999999986
No 5
>COG3328 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=99.57 E-value=7.6e-14 Score=141.27 Aligned_cols=236 Identities=18% Similarity=0.207 Sum_probs=176.4
Q ss_pred CCCCHHHHHHHHHHHhCCcccHHHHHHHHHHHHHHHhcChHhHhccHHHHHHHHHhhCCCcEEEEEecCCCceeEEEeeh
Q 007318 142 PNYKPKDIADDIKREYGIQLNYSQAWRAKEIAREQLQGSYKDSYTLLPFFCEKIKETNPGSVVTFTTKEDSSFHRLFVSF 221 (608)
Q Consensus 142 ~~~~~~~i~~~v~~~~g~~~s~~~~~~~k~~~~~~~~g~~~~~~~~L~~~~~~~~~~npg~~~~~~~~~~~~~~~if~~~ 221 (608)
.+++++++.+.+++.++..++...+.+.-....+.+ .+++..-+
T Consensus 99 ~gv~Tr~i~~~~~~~~~~~~s~~~iS~~~~~~~e~v---------------~~~~~r~l--------------------- 142 (379)
T COG3328 99 KGVTTREIEALLEELYGHKVSPSVISVVTDRLDEKV---------------KAWQNRPL--------------------- 142 (379)
T ss_pred cCCcHHHHHHHHHHhhCcccCHHHhhhHHHHHHHHH---------------HHHHhccc---------------------
Confidence 478999999999999888777776666555444432 22222211
Q ss_pred hhhHHHHhhcCccEEEeecccccCc--cceeEEEEEEecCCCCeEEEEEEEeecCCcccHHHHHHHHHHhcCCCCcEEEE
Q 007318 222 HASISGFQQGCRPLLFLDTTPLNSK--YQGTLLTATSADGDDGIFPVAFAVVDAETEDNWHWFLQELKSAVSTSQQITFI 299 (608)
Q Consensus 222 ~~~~~~~~~~~~~vi~~D~T~~~~~--y~~~l~~~~g~d~~~~~~~~a~a~~~~E~~~~~~w~l~~l~~~~~~~~~~~ii 299 (608)
+..+++++|++|++-+ -+..+++++|++.+|+-.++++.+...|+ ..|.-||..|+.. |......++
T Consensus 143 ---------~~~~~v~~D~~~~k~r~v~~~~~~ia~Gv~~eG~reilg~~~~~~e~-~~w~~~l~~l~~r-gl~~v~l~v 211 (379)
T COG3328 143 ---------GDYPYVYLDAKYVKVRSVRNKAVYIAIGVTEEGRREILGIWVGVRES-KFWLSFLLDLKNR-GLSDVLLVV 211 (379)
T ss_pred ---------cCceEEEEecceeehhhhhhheeeeeeccCcccchhhhceeeecccc-hhHHHHHHHHHhc-cccceeEEe
Confidence 1568999999999876 35689999999999999999999999999 9999888888877 334456677
Q ss_pred ecCcccHHHHHhhhcCCccccchHHHHHHHHHhhhcccccHHHHHHHHHHHHHHhcCCCcccHHHHHHH----HhccChh
Q 007318 300 ADFQNGLNKSLAEVFDNCYHSYCLRHLAEKLNRDIKGQFSHEARRFMINDLYAAAYAPKFEGFQCSIES----IKGISPD 375 (608)
Q Consensus 300 tD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~~~~~~~~~~~~~~~~~~~~~~t~~~f~~~~~~----l~~~~~~ 375 (608)
+|+.+|+.+||.++||.+.++.|..|+.+|+..+... +.++.+...+..+.-+.+.++....|.. +...+|.
T Consensus 212 ~Dg~~gl~~aI~~v~p~a~~Q~C~vH~~Rnll~~v~~----k~~d~i~~~~~~I~~a~~~e~~~~~~~~~~~~w~~~yP~ 287 (379)
T COG3328 212 VDGLKGLPEAISAVFPQAAVQRCIVHLVRNLLDKVPR----KDQDAVLSDLRSIYIAPDAEEALLALLAFSELWGKRYPA 287 (379)
T ss_pred cchhhhhHHHHHHhccHhhhhhhhhHHHhhhhhhhhh----hhhHHHHhhhhhhhccCCcHHHHHHHHHHHHhhhhhcch
Confidence 7999999999999999999999999999999998775 3445566666666666776666666655 4445688
Q ss_pred HHHHHHhcCCCCccccccCC-CcccccccchhHHHHHHHHhh----hcCChhhhHHHH
Q 007318 376 AYDWVTQSEPEHWANTYFPG-ARYDHMTSNFGQQFYSWVSEA----HELPITHMVDVL 428 (608)
Q Consensus 376 ~~~~l~~~~~~~W~~~~~~~-~~~~~~ttn~~Es~n~~lk~~----r~~~i~~l~~~i 428 (608)
...++.+..-+.|...-|+. .+-...|||.+|++|+.++.. ..+|-.+.+..+
T Consensus 288 i~~~~~~~~~~~~~F~~fp~~~r~~i~ttN~IE~~n~~ir~~~~~~~~fpn~~sv~k~ 345 (379)
T COG3328 288 ILKSWRNALEELLPFFAFPSEIRKIIYTTNAIESLNKLIRRRTKVVGIFPNEESVEKL 345 (379)
T ss_pred HHHHHHHHHHHhcccccCcHHHHhHhhcchHHHHHHHHHHHHHhhhccCCCHHHHHHH
Confidence 88888777666665444443 234678999999999988753 344444444443
No 6
>PF08731 AFT: Transcription factor AFT; InterPro: IPR014842 AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This entry includes the paralogous pair of transcription factors AFT1 and AFT2.
Probab=99.01 E-value=1.8e-09 Score=87.34 Aligned_cols=69 Identities=23% Similarity=0.406 Sum_probs=65.7
Q ss_pred eCCHHHHHHHHHHHHHhcCceEEEEeeCCeEEEEEEec------------------------------------------
Q 007318 48 FSSFSEFREALHKYSIAHGFAYRYKKNDSHRVTVKCKC------------------------------------------ 85 (608)
Q Consensus 48 F~s~~e~~~~~~~ya~~~gf~~~~~~s~~~r~~~~C~~------------------------------------------ 85 (608)
|.+++|++.+|..++..+||++.+.||+...+.|+|..
T Consensus 1 F~~k~~ikpwlq~~~~~~Gi~iVIerSd~~ki~FkCk~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~k~t~srk 80 (111)
T PF08731_consen 1 FDDKDEIKPWLQKIFYPQGIGIVIERSDKKKIVFKCKNGKRYRHKKKKKGQAQAQQKESTSGNKNKSSKKKKKKRTKSRK 80 (111)
T ss_pred CCchHHHHHHHHHHhhhcCceEEEEecCCceEEEEEecCCCcccccccccccccccccccccccccccccccCCcccccc
Confidence 88999999999999999999999999999999999982
Q ss_pred cCCCeEEEEEEcCCcceEEEEeccCcccccc
Q 007318 86 QGCPWRIYASRLSTTQLVCIKKMNSKHTCEG 116 (608)
Q Consensus 86 ~gC~~~v~~~~~~~~~~~~V~~~~~~H~c~~ 116 (608)
.+|||+|+|+.+...+.|.|..+++.|+|++
T Consensus 81 ~~CPFriRA~yS~k~k~W~lvvvnn~HnH~l 111 (111)
T PF08731_consen 81 NTCPFRIRANYSKKNKKWTLVVVNNEHNHPL 111 (111)
T ss_pred cCCCeEEEEEEEecCCeEEEEEecCCcCCCC
Confidence 6999999999999999999999999999974
No 7
>smart00575 ZnF_PMZ plant mutator transposase zinc finger.
Probab=98.92 E-value=5.7e-10 Score=68.10 Aligned_cols=27 Identities=41% Similarity=0.960 Sum_probs=24.8
Q ss_pred eeeecCCcccCCCCchhHHHHHHHhCC
Q 007318 493 WDCTCKTWHLTGLPCCHAIAVLEWIGR 519 (608)
Q Consensus 493 ~~CsC~~~~~~giPC~H~lav~~~~~~ 519 (608)
.+|||++|+..||||+|+|+|+...++
T Consensus 1 ~~CsC~~~~~~gipC~H~i~v~~~~~~ 27 (28)
T smart00575 1 KTCSCRKFQLSGIPCRHALAAAIHIGL 27 (28)
T ss_pred CcccCCCcccCCccHHHHHHHHHHhCC
Confidence 479999999999999999999988775
No 8
>PF03101 FAR1: FAR1 DNA-binding domain; InterPro: IPR004330 Phytochrome A is the primary photoreceptor for mediating various far-red light-induced responses in higher plants. It has been found that the proteins governing this response, which include FAR-RED ELONGATED HYPOCOTYL3 (FHY3) and FAR-RED-IMPAIRED RESPONSE1 (FAR1), are a pair of homologous proteins sharing significant sequence homology to mutator-like transposases. These proteins appear to be novel transcription factors, which are essential for activating the expression of FHY1 and FHL (for FHY1-like) and related genes, whose products are required for light-induced phytochrome A nuclear accumulation and subsequent light responses in plants. The FRS (FAR1 Related Sequences) family of proteins share a similar domain structure to mutator-like transposases, including an N-terminal C2H2 zinc finger domain, a central putative core transposase domain, and a C-terminal SWIM motif (named after SWI2/SNF and MuDR transposases). It seems plausible that the FRS family represent transcription factors derived from mutator-like transposases [, ]. This entry represents a domain found in FAR1 and FRS proteins. It contains a WRKY like fold and is therefore most likely a zinc binding DNA-binding domain.
Probab=98.79 E-value=1.1e-08 Score=83.54 Aligned_cols=61 Identities=25% Similarity=0.371 Sum_probs=53.6
Q ss_pred HHHHHHHHhcCceEEEEeeCCe-------EEEEEEec----------------------cCCCeEEEEEEcCCcceEEEE
Q 007318 56 EALHKYSIAHGFAYRYKKNDSH-------RVTVKCKC----------------------QGCPWRIYASRLSTTQLVCIK 106 (608)
Q Consensus 56 ~~~~~ya~~~gf~~~~~~s~~~-------r~~~~C~~----------------------~gC~~~v~~~~~~~~~~~~V~ 106 (608)
+||+.||..+||++++.++.+. ++.++|.+ +||||+|.+.... .+.|.|.
T Consensus 1 ~fy~~yA~~~GF~vr~~~s~~~~~~~~~~~~~~~C~r~G~~~~~~~~~~~~~r~~~s~ktgC~a~i~v~~~~-~~~w~v~ 79 (91)
T PF03101_consen 1 DFYNSYARRHGFSVRKSSSRKSKKNGEIKRVTFVCSRGGKYKSKKKNEEKRRRNRPSKKTGCKARINVKRRK-DGKWRVT 79 (91)
T ss_pred CHHHHhcCcCCeEEEEeeeEeCCCCceEEEEEEEECCcccccccccccccccccccccccCCCEEEEEEEcc-CCEEEEE
Confidence 4899999999999999776443 78899984 7999999999987 7799999
Q ss_pred eccCccccccc
Q 007318 107 KMNSKHTCEGA 117 (608)
Q Consensus 107 ~~~~~H~c~~~ 117 (608)
.+..+|||++.
T Consensus 80 ~~~~~HNH~L~ 90 (91)
T PF03101_consen 80 SFVLEHNHPLC 90 (91)
T ss_pred ECcCCcCCCCC
Confidence 99999999874
No 9
>PF04434 SWIM: SWIM zinc finger; InterPro: IPR007527 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1). 
It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination []. SWIM domains are also found in the homologous recombination protein Sws1 [], as well as in several hypothetical proteins. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0008270 zinc ion binding
Probab=98.14 E-value=1.8e-06 Score=58.24 Aligned_cols=30 Identities=33% Similarity=0.774 Sum_probs=26.8
Q ss_pred EEecceeeecCCcccCCCCchhHHHHHHHh
Q 007318 488 VDVDRWDCTCKTWHLTGLPCCHAIAVLEWI 517 (608)
Q Consensus 488 V~l~~~~CsC~~~~~~giPC~H~lav~~~~ 517 (608)
+++...+|||..|+..|.||+|++|++...
T Consensus 10 ~~~~~~~CsC~~~~~~~~~CkHi~av~~~~ 39 (40)
T PF04434_consen 10 VSIEQASCSCPYFQFRGGPCKHIVAVLLAL 39 (40)
T ss_pred ccccccEeeCCCccccCCcchhHHHHHHhh
Confidence 566789999999999999999999998764
No 10
>PF00098 zf-CCHC: Zinc knuckle; InterPro: IPR001878 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the CysCysHisCys (CCHC) type zinc finger domains, and have the sequence: C-X2-C-X4-H-X4-C where X can be any amino acid, and number indicates the number of residues. These 18 residues CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses. It is required for viral genome packaging and for early infection process [, , ]. 
It is also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding []. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0003676 nucleic acid binding, 0008270 zinc ion binding; PDB: 2L44_A 1A1T_A 1WWG_A 1U6P_A 1WWD_A 1WWE_A 1A6B_B 1F6U_A 1MFS_A 1NCP_C ....
Probab=96.36 E-value=0.0023 Score=34.41 Aligned_cols=18 Identities=28% Similarity=0.733 Sum_probs=16.3
Q ss_pred eecCCcCCCCcCccCCCC
Q 007318 590 LQCSKCKGLGHNKKTCKD 607 (608)
Q Consensus 590 ~~C~~C~~~gHn~~tC~~ 607 (608)
++|-+|++.||..+.||.
T Consensus 1 ~~C~~C~~~GH~~~~Cp~ 18 (18)
T PF00098_consen 1 RKCFNCGEPGHIARDCPK 18 (18)
T ss_dssp SBCTTTSCSSSCGCTSSS
T ss_pred CcCcCCCCcCcccccCcc
Confidence 379999999999999995
No 11
>PF01610 DDE_Tnp_ISL3: Transposase; InterPro: IPR002560 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. This family includes the IS204, IS1001, IS1096 and IS1165 transposases. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=95.53 E-value=0.021 Score=56.08 Aligned_cols=94 Identities=15% Similarity=0.182 Sum_probs=69.0
Q ss_pred EEeecccccCccceeEEEEEEecC--CCCeEEEEEEEeecCCcccHHHHHHHH-HHhcCCCCcEEEEecCcccHHHHHhh
Q 007318 236 LFLDTTPLNSKYQGTLLTATSADG--DDGIFPVAFAVVDAETEDNWHWFLQEL-KSAVSTSQQITFIADFQNGLNKSLAE 312 (608)
Q Consensus 236 i~~D~T~~~~~y~~~l~~~~g~d~--~~~~~~~a~a~~~~E~~~~~~w~l~~l-~~~~~~~~~~~iitD~~~~l~~Ai~~ 312 (608)
|+||-+.....++. +..+.+|. +++. -+.++++-+.++..-||..+ -.. ......+|++|..++...|+++
T Consensus 1 lgiDE~~~~~g~~~--y~t~~~d~~~~~~~---il~i~~~r~~~~l~~~~~~~~~~~-~~~~v~~V~~Dm~~~y~~~~~~ 74 (249)
T PF01610_consen 1 LGIDEFAFRKGHRS--YVTVVVDLDTDTGR---ILDILPGRDKETLKDFFRSLYPEE-ERKNVKVVSMDMSPPYRSAIRE 74 (249)
T ss_pred CeEeeeeeecCCcc--eeEEEEECccCCce---EEEEcCCccHHHHHHHHHHhCccc-cccceEEEEcCCCccccccccc
Confidence 46777766544442 33333443 3322 23578888888888888866 333 3346889999999999999999
Q ss_pred hcCCccccchHHHHHHHHHhhhc
Q 007318 313 VFDNCYHSYCLRHLAEKLNRDIK 335 (608)
Q Consensus 313 vfP~a~~~~C~~Hi~~n~~~~~~ 335 (608)
.||+|.+..-.|||++++.+.+.
T Consensus 75 ~~P~A~iv~DrFHvvk~~~~al~ 97 (249)
T PF01610_consen 75 YFPNAQIVADRFHVVKLANRALD 97 (249)
T ss_pred cccccccccccchhhhhhhhcch
Confidence 99999999999999999988654
No 12
>PF15288 zf-CCHC_6: Zinc knuckle
Probab=95.43 E-value=0.0069 Score=39.65 Aligned_cols=18 Identities=33% Similarity=0.973 Sum_probs=16.1
Q ss_pred eecCCcCCCCcCc--cCCCC
Q 007318 590 LQCSKCKGLGHNK--KTCKD 607 (608)
Q Consensus 590 ~~C~~C~~~gHn~--~tC~~ 607 (608)
++|+.||..||.+ ++||.
T Consensus 2 ~kC~~CG~~GH~~t~k~CP~ 21 (40)
T PF15288_consen 2 VKCKNCGAFGHMRTNKRCPM 21 (40)
T ss_pred ccccccccccccccCccCCC
Confidence 6899999999998 78985
No 13
>PF13610 DDE_Tnp_IS240: DDE domain
Probab=95.03 E-value=0.014 Score=51.60 Aligned_cols=81 Identities=11% Similarity=0.064 Sum_probs=66.5
Q ss_pred ccEEEeecccccCccceeEEEEEEecCCCCeEEEEEEEeecCCcccHHHHHHHHHHhcCCCCcEEEEecCcccHHHHHhh
Q 007318 233 RPLLFLDTTPLNSKYQGTLLTATSADGDDGIFPVAFAVVDAETEDNWHWFLQELKSAVSTSQQITFIADFQNGLNKSLAE 312 (608)
Q Consensus 233 ~~vi~~D~T~~~~~y~~~l~~~~g~d~~~~~~~~a~a~~~~E~~~~~~w~l~~l~~~~~~~~~~~iitD~~~~l~~Ai~~ 312 (608)
++.+.+|-||.+-+ +-..+....+|.+++ +|++-|...-+...=..||..+.+..+ ..|..|+||+.++...|+++
T Consensus 1 ~~~w~~DEt~iki~-G~~~yl~~aiD~~~~--~l~~~ls~~Rd~~aA~~Fl~~~l~~~~-~~p~~ivtDk~~aY~~A~~~ 76 (140)
T PF13610_consen 1 GDSWHVDETYIKIK-GKWHYLWRAIDAEGN--ILDFYLSKRRDTAAAKRFLKRALKRHR-GEPRVIVTDKLPAYPAAIKE 76 (140)
T ss_pred CCEEEEeeEEEEEC-CEEEEEEEeeccccc--chhhhhhhhcccccceeeccccceeec-cccceeecccCCccchhhhh
Confidence 35789999998532 224566788899998 789999888888888888888877765 67999999999999999999
Q ss_pred hcCCc
Q 007318 313 VFDNC 317 (608)
Q Consensus 313 vfP~a 317 (608)
++|.-
T Consensus 77 l~~~~ 81 (140)
T PF13610_consen 77 LNPEG 81 (140)
T ss_pred ccccc
Confidence 99874
No 14
>PHA02517 putative transposase OrfB; Reviewed
Probab=94.88 E-value=1.5 Score=43.57 Aligned_cols=151 Identities=17% Similarity=0.082 Sum_probs=83.6
Q ss_pred hHHHHHHHHhhh-CCCCCHHHHHHHHHHHhCCcccHHHHHHHHHHHHHHHhcChHhHhccHHHHHHHHHhhCCCcEEEEE
Q 007318 129 WVGNIIKEKLKA-SPNYKPKDIADDIKREYGIQLNYSQAWRAKEIAREQLQGSYKDSYTLLPFFCEKIKETNPGSVVTFT 207 (608)
Q Consensus 129 ~i~~~~~~~l~~-~~~~~~~~i~~~v~~~~g~~~s~~~~~~~k~~~~~~~~g~~~~~~~~L~~~~~~~~~~npg~~~~~~ 207 (608)
.+.+.+.+.+.. .+.+..+.|...|.+. |..+|.++++|..+.. |-... .. ..-.....+-. .
T Consensus 30 ~l~~~I~~i~~~~~~~~G~r~I~~~L~~~-g~~vs~~tV~Rim~~~-----gl~~~--~~-----~k~~~~~~~~~---~ 93 (277)
T PHA02517 30 WLKSEILRVYDENHQVYGVRKVWRQLNRE-GIRVARCTVGRLMKEL-----GLAGV--LR-----GKKVRTTISRK---A 93 (277)
T ss_pred HHHHHHHHHHHHhCCCCCHHHHHHHHHhc-CcccCHHHHHHHHHHc-----CCceE--ec-----CCCcCCCCCCC---C
Confidence 455566666554 5788999999998755 9999999888754432 10000 00 00000000000 0
Q ss_pred ecCCCceeEEEeehhhhHHHHhhcCccEEEeecccccCccceeEEEEEEecCCCCeEEEEEEEeecCCcccHHHHHHHHH
Q 007318 208 TKEDSSFHRLFVSFHASISGFQQGCRPLLFLDTTPLNSKYQGTLLTATSADGDDGIFPVAFAVVDAETEDNWHWFLQELK 287 (608)
Q Consensus 208 ~~~~~~~~~if~~~~~~~~~~~~~~~~vi~~D~T~~~~~y~~~l~~~~g~d~~~~~~~~a~a~~~~E~~~~~~w~l~~l~ 287 (608)
....+.+.+-|-+. .-..++..|.||....- +..++.+.+|...+ .++|+.+...++.+...-+|+...
T Consensus 94 ~~~~n~~~r~f~~~---------~pn~~w~~D~t~~~~~~-g~~yl~~iiD~~sr-~i~~~~~~~~~~~~~~~~~l~~a~ 162 (277)
T PHA02517 94 VAAPDRVNRQFVAT---------RPNQLWVADFTYVSTWQ-GWVYVAFIIDVFAR-RIVGWRVSSSMDTDFVLDALEQAL 162 (277)
T ss_pred CCCCCcccCCCCCC---------CCCCeEEeceeEEEeCC-CCEEEEEecccCCC-eeeecccCCCCChHHHHHHHHHHH
Confidence 00011222222111 13568999999975443 45677777776655 467888887777776555555544
Q ss_pred HhcCCCCcEEEEecCcccH
Q 007318 288 SAVSTSQQITFIADFQNGL 306 (608)
Q Consensus 288 ~~~~~~~~~~iitD~~~~l 306 (608)
...+...+..|.||+....
T Consensus 163 ~~~~~~~~~i~~sD~G~~y 181 (277)
T PHA02517 163 WARGRPGGLIHHSDKGSQY 181 (277)
T ss_pred HhcCCCcCcEeeccccccc
Confidence 4444333457779997643
No 15
>PF06782 UPF0236: Uncharacterised protein family (UPF0236); InterPro: IPR009620 This is a group of proteins of unknown function.
Probab=94.79 E-value=1.2 Score=47.89 Aligned_cols=130 Identities=11% Similarity=0.122 Sum_probs=85.8
Q ss_pred cCCcccHHHHHHHHHHhcCCCC--cEEEEecCcccHHHHHhhhcCCccccchHHHHHHHHHhhhcccccHHHHHHHHHHH
Q 007318 273 AETEDNWHWFLQELKSAVSTSQ--QITFIADFQNGLNKSLAEVFDNCYHSYCLRHLAEKLNRDIKGQFSHEARRFMINDL 350 (608)
Q Consensus 273 ~E~~~~~~w~l~~l~~~~~~~~--~~~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~~~~~~~~~~~~~~~ 350 (608)
..+.+-|..+++.+.+...... -+++.+|+.+.+.+++. .+|.+.|.+..||+.+.+.+.++... .+...+
T Consensus 235 ~~~~~~~~~v~~~i~~~Y~~~~~~~iiingDGa~WIk~~~~-~~~~~~~~LD~FHl~k~i~~~~~~~~------~~~~~~ 307 (470)
T PF06782_consen 235 ESAEEFWEEVLDYIYNHYDLDKTTKIIINGDGASWIKEGAE-FFPKAEYFLDRFHLNKKIKQALSHDP------ELKEKI 307 (470)
T ss_pred cchHHHHHHHHHHHHHhcCcccceEEEEeCCCcHHHHHHHH-hhcCceEEecHHHHHHHHHHHhhhCh------HHHHHH
Confidence 4556788888888877765332 57788999999988776 99999999999999999999887531 344555
Q ss_pred HHHhcCCCcccHHHHHHHHhccC---------hhHHHHHHhcCCCCccc--cccCCCcccccccchhHHHHHHHHh
Q 007318 351 YAAAYAPKFEGFQCSIESIKGIS---------PDAYDWVTQSEPEHWAN--TYFPGARYDHMTSNFGQQFYSWVSE 415 (608)
Q Consensus 351 ~~~~~~~t~~~f~~~~~~l~~~~---------~~~~~~l~~~~~~~W~~--~~~~~~~~~~~ttn~~Es~n~~lk~ 415 (608)
+++.......+++..++.+.... .++..||..+ |.. +|.. +-+.......|+.++.+..
T Consensus 308 ~~al~~~d~~~l~~~L~~~~~~~~~~~~~~~i~~~~~Yl~~n----~~~i~~y~~--~~~~~g~g~ee~~~~~~s~ 377 (470)
T PF06782_consen 308 RKALKKGDKKKLETVLDTAESCAKDEEERKKIRKLRKYLLNN----WDGIKPYRE--REGLRGIGAEESVSHVLSY 377 (470)
T ss_pred HHHHHhcCHHHHHHHHHHHHHhhhchHHHHHHHHHHHHHHHC----HHHhhhhhh--ccCCCccchhhhhhhHHHH
Confidence 56666667777777776655422 1345566655 221 1111 0222334457777776643
No 16
>PF03106 WRKY: WRKY DNA-binding domain; InterPro: IPR003657 The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene. Structural studies indicate that this domain is a four-stranded beta-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure. The WRKYGQK residues correspond to the most N-terminal beta-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the beta-sheet.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 2AYD_A 1WJ2_A 2LEX_A.
Probab=94.74 E-value=0.078 Score=38.91 Aligned_cols=40 Identities=20% Similarity=0.375 Sum_probs=33.0
Q ss_pred CeEEEEEEeccCCCeEEEEEEcCCcceEEEEeccCccccc
Q 007318 76 SHRVTVKCKCQGCPWRIYASRLSTTQLVCIKKMNSKHTCE 115 (608)
Q Consensus 76 ~~r~~~~C~~~gC~~~v~~~~~~~~~~~~V~~~~~~H~c~ 115 (608)
..|..++|+..+|+++-.+.+..+.....++++.++|||+
T Consensus 20 ~pRsYYrCt~~~C~akK~Vqr~~~d~~~~~vtY~G~H~h~ 59 (60)
T PF03106_consen 20 YPRSYYRCTHPGCPAKKQVQRSADDPNIVIVTYEGEHNHP 59 (60)
T ss_dssp CEEEEEEEECTTEEEEEEEEEETTCCCEEEEEEES--SS-
T ss_pred eeeEeeeccccChhheeeEEEecCCCCEEEEEEeeeeCCC
Confidence 3577899999999999999998877788999999999996
No 17
>PF04684 BAF1_ABF1: BAF1 / ABF1 chromatin reorganising factor; InterPro: IPR006774 ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 in Saccharomyces cerevisiae (P14164 from SWISSPROT)). The N-terminal two thirds of the protein are necessary for DNA binding, and the N terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilise the protein structure.; GO: 0003677 DNA binding, 0006338 chromatin remodeling, 0005634 nucleus
Probab=94.09 E-value=0.11 Score=53.29 Aligned_cols=57 Identities=16% Similarity=0.442 Sum_probs=51.3
Q ss_pred cccceeCCHHHHHHHHHHHHHhcCceEEEEeeC-CeEEEEEEeccCCCeEEEEEEcCC
Q 007318 43 GVDQRFSSFSEFREALHKYSIAHGFAYRYKKND-SHRVTVKCKCQGCPWRIYASRLST 99 (608)
Q Consensus 43 ~vG~~F~s~~e~~~~~~~ya~~~gf~~~~~~s~-~~r~~~~C~~~gC~~~v~~~~~~~ 99 (608)
..+..|++.++-+.+|+.|..+....|..+.|- .+.++|.|....|||+|.++..+.
T Consensus 23 ~~~~~f~tl~~wy~v~ndyefq~rcpiilknsh~nkhftfachlk~c~fkillsy~g~ 80 (496)
T PF04684_consen 23 AQARKFPTLEAWYNVINDYEFQSRCPIILKNSHRNKHFTFACHLKNCPFKILLSYCGN 80 (496)
T ss_pred ccccCCCcHHHHHHHHhhhhhhhcCceeecccccccceEEEeeccCCCceeeeeeccc
Confidence 356789999999999999999999999998884 567999999999999999998765
No 18
>COG3316 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=93.66 E-value=0.63 Score=43.48 Aligned_cols=120 Identities=13% Similarity=0.134 Sum_probs=81.8
Q ss_pred HHHHHhCCcccHHHHHHHHHHHHHHHhcChHhHhccHHHHHHHHHhhCCCcEEEEEecCCCceeEEEeehhhhHHHHhhc
Q 007318 152 DIKREYGIQLNYSQAWRAKEIAREQLQGSYKDSYTLLPFFCEKIKETNPGSVVTFTTKEDSSFHRLFVSFHASISGFQQG 231 (608)
Q Consensus 152 ~v~~~~g~~~s~~~~~~~k~~~~~~~~g~~~~~~~~L~~~~~~~~~~npg~~~~~~~~~~~~~~~if~~~~~~~~~~~~~ 231 (608)
.+..+.|+.+.+.++++.-++.-. .+.+.+.+.++.
T Consensus 33 e~l~~rgi~v~h~Ti~rwv~k~~~--------------~~~~~~~~r~~~------------------------------ 68 (215)
T COG3316 33 EMLAERGIEVDHETIHRWVQKYGP--------------LLARRLKRRKRK------------------------------ 68 (215)
T ss_pred HHHHHcCcchhHHHHHHHHHHHhH--------------HHHHHhhhhccc------------------------------
Confidence 345578999999999887665432 233455555433
Q ss_pred CccEEEeecccccCccceeEEEEEEecCCCCeEEEEEEEeecCCcccHHHHHHHHHHhcCCCCcEEEEecCcccHHHHHh
Q 007318 232 CRPLLFLDTTPLNSKYQGTLLTATSADGDDGIFPVAFAVVDAETEDNWHWFLQELKSAVSTSQQITFIADFQNGLNKSLA 311 (608)
Q Consensus 232 ~~~vi~~D~T~~~~~y~~~l~~~~g~d~~~~~~~~a~a~~~~E~~~~~~w~l~~l~~~~~~~~~~~iitD~~~~l~~Ai~ 311 (608)
-.+++.+|-||.+-.-+ -.+.-..+|.+| .++.+-|...-+...=.-||..+++.. ..|.+|+||+.+....|+.
T Consensus 69 ~~~~w~vDEt~ikv~gk-w~ylyrAid~~g--~~Ld~~L~~rRn~~aAk~Fl~kllk~~--g~p~v~vtDka~s~~~A~~ 143 (215)
T COG3316 69 AGDSWRVDETYIKVNGK-WHYLYRAIDADG--LTLDVWLSKRRNALAAKAFLKKLLKKH--GEPRVFVTDKAPSYTAALR 143 (215)
T ss_pred cccceeeeeeEEeeccE-eeehhhhhccCC--CeEEEEEEcccCcHHHHHHHHHHHHhc--CCCceEEecCccchHHHHH
Confidence 35677888888743211 122334455554 467777877777777777787777776 6899999999999999999
Q ss_pred hhcCCcccc
Q 007318 312 EVFDNCYHS 320 (608)
Q Consensus 312 ~vfP~a~~~ 320 (608)
++-+.+.|+
T Consensus 144 ~l~~~~ehr 152 (215)
T COG3316 144 KLGSEVEHR 152 (215)
T ss_pred hcCcchhee
Confidence 998865554
No 19
>PF13696 zf-CCHC_2: Zinc knuckle
Probab=93.60 E-value=0.038 Score=34.38 Aligned_cols=19 Identities=26% Similarity=0.693 Sum_probs=17.7
Q ss_pred eeecCCcCCCCcCccCCCC
Q 007318 589 SLQCSKCKGLGHNKKTCKD 607 (608)
Q Consensus 589 ~~~C~~C~~~gHn~~tC~~ 607 (608)
.+.|..|++.||..+.||.
T Consensus 8 ~Y~C~~C~~~GH~i~dCP~ 26 (32)
T PF13696_consen 8 GYVCHRCGQKGHWIQDCPT 26 (32)
T ss_pred CCEeecCCCCCccHhHCCC
Confidence 4899999999999999996
No 20
>PRK14702 insertion element IS2 transposase InsD; Provisional
Probab=93.45 E-value=8.6 Score=37.91 Aligned_cols=146 Identities=14% Similarity=0.084 Sum_probs=86.0
Q ss_pred chhhHHHHHHHHhhhCCCCCHHHHHHHHHHH---hCC-cccHHHHHHHHHHHHHHHhcChHhHhccHHHHHHHHHhhCCC
Q 007318 126 TRGWVGNIIKEKLKASPNYKPKDIADDIKRE---YGI-QLNYSQAWRAKEIAREQLQGSYKDSYTLLPFFCEKIKETNPG 201 (608)
Q Consensus 126 s~~~i~~~~~~~l~~~~~~~~~~i~~~v~~~---~g~-~~s~~~~~~~k~~~~~~~~g~~~~~~~~L~~~~~~~~~~npg 201 (608)
....+...+.+....++.+..+.|...|..+ .|. .++...++|..+..--.. ..+...+.
T Consensus 9 ~~~~l~~~I~~~~~~~~~yG~rri~~~L~~~~~~~g~~~v~~krV~rlmr~~gL~~----------------~~r~~~~~ 72 (262)
T PRK14702 9 DDTDVLLRIHHVIGELPTYGYRRVWALLRRQAELDGMPAINAKRVYRLMRQNALLL----------------ERKPAVPP 72 (262)
T ss_pred chHHHHHHHHHHHHhCcccChHHHHHHHHhhhcccCccccCHHHHHHHHHHhCCcc----------------ccCCCCCC
Confidence 3445666677766677889999999999875 366 488888877644321000 00000000
Q ss_pred cEEEEEecCCCceeEEEeehhhhHHHHhhcCccEEEeecccccCccceeEEEEEEecCCCCeEEEEEEEeec-CCcccHH
Q 007318 202 SVVTFTTKEDSSFHRLFVSFHASISGFQQGCRPLLFLDTTPLNSKYQGTLLTATSADGDDGIFPVAFAVVDA-ETEDNWH 280 (608)
Q Consensus 202 ~~~~~~~~~~~~~~~if~~~~~~~~~~~~~~~~vi~~D~T~~~~~y~~~l~~~~g~d~~~~~~~~a~a~~~~-E~~~~~~ 280 (608)
+. .+... .|. ...-..++..|-||.....++.++.++.+|...+ .+|||++... .+.+.-.
T Consensus 73 ~~-------~~~~~-~~~---------~~~pn~~W~~DiT~~~~~~g~~~Yl~~viD~~sR-~ivg~~is~~~~~~~~v~ 134 (262)
T PRK14702 73 SK-------RAHTG-RVA---------VKESNQRWCSDGFEFCCDNGERLRVTFALDCCDR-EALHWAVTTGGFNSETVQ 134 (262)
T ss_pred CC-------cCCCC-ccc---------cCCCCCEEEeeeEEEEecCCcEEEEEEEEecccc-eeeeEEeccCcCCHHHHH
Confidence 00 00000 010 0113468999999976544556888888887776 6789999874 5655555
Q ss_pred HHHHHH-HHhcC---CCCcEEEEecCccc
Q 007318 281 WFLQEL-KSAVS---TSQQITFIADFQNG 305 (608)
Q Consensus 281 w~l~~l-~~~~~---~~~~~~iitD~~~~ 305 (608)
-+|+.. ....+ ...|..|.||+...
T Consensus 135 ~~l~~A~~~~~~~~~~~~~~iihSD~Gsq 163 (262)
T PRK14702 135 DVMLGAVERRFGNDLPSSPVEWLTDNGSC 163 (262)
T ss_pred HHHHHHHHHHhcccCCCCCeEEEcCCCcc
Confidence 555543 33322 23578899999663
No 21
>PF03050 DDE_Tnp_IS66: Transposase IS66 family; InterPro: IPR004291 Transposase proteins are necessary for efficient DNA transposition. This family includes the bacterial insertion sequence (IS) element, IS66, from Agrobacterium tumefaciens. IS66 may cause genetic and structural variations of the T region and the vir region of the octopine Ti plasmids. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=92.61 E-value=0.54 Score=46.71 Aligned_cols=133 Identities=20% Similarity=0.249 Sum_probs=86.4
Q ss_pred CCCCHHHHHHHHHHHhCCcccHHHHHHHHHHHHHHHhcChHhHhccHHHHHHHHHhhCCCcEEEEEecCCCceeEEEeeh
Q 007318 142 PNYKPKDIADDIKREYGIQLNYSQAWRAKEIAREQLQGSYKDSYTLLPFFCEKIKETNPGSVVTFTTKEDSSFHRLFVSF 221 (608)
Q Consensus 142 ~~~~~~~i~~~v~~~~g~~~s~~~~~~~k~~~~~~~~g~~~~~~~~L~~~~~~~~~~npg~~~~~~~~~~~~~~~if~~~ 221 (608)
-.++...+.+.+.+. |+.+|.+++.+....+.+.+.. .| +.+.+.
T Consensus 19 ~~lp~~r~~~~~~~~-G~~is~~ti~~~~~~~~~~l~~----~~-------~~l~~~----------------------- 63 (271)
T PF03050_consen 19 YHLPLYRIQQMLEDL-GITISRGTIANWIKRVAEALKP----LY-------EALKEE----------------------- 63 (271)
T ss_pred CCCCHHHHhhhhhcc-ceeeccchhHhHhhhhhhhhhh----hh-------hhhhhh-----------------------
Confidence 356677777777777 9999999888876665543321 11 122221
Q ss_pred hhhHHHHhhcCccEEEeeccccc----Ccc-ceeEEEEEEecCCCCeEEEEEEEeecCCcccHHHHHHHHHHhcCCCCcE
Q 007318 222 HASISGFQQGCRPLLFLDTTPLN----SKY-QGTLLTATSADGDDGIFPVAFAVVDAETEDNWHWFLQELKSAVSTSQQI 296 (608)
Q Consensus 222 ~~~~~~~~~~~~~vi~~D~T~~~----~~y-~~~l~~~~g~d~~~~~~~~a~a~~~~E~~~~~~w~l~~l~~~~~~~~~~ 296 (608)
.. -.+|+.+|-|... ++. ++-+-++++.+ .+.|.+.++=..+...-+|.. -.-
T Consensus 64 ------~~--~~~~~~~DET~~~vl~~~~g~~~~~Wv~~~~~------~v~f~~~~sR~~~~~~~~L~~--------~~G 121 (271)
T PF03050_consen 64 ------LR--SSPVVHADETGWRVLDKGKGKKGYLWVFVSPE------VVLFFYAPSRSSKVIKEFLGD--------FSG 121 (271)
T ss_pred ------cc--ccceeccCCceEEEeccccccceEEEeeeccc------eeeeeecccccccchhhhhcc--------cce
Confidence 01 3578888888766 333 33444444444 566666666666666655543 224
Q ss_pred EEEecCcccHHHHHhhhcCCccccchHHHHHHHHHhhhcc
Q 007318 297 TFIADFQNGLNKSLAEVFDNCYHSYCLRHLAEKLNRDIKG 336 (608)
Q Consensus 297 ~iitD~~~~l~~Ai~~vfP~a~~~~C~~Hi~~n~~~~~~~ 336 (608)
+++||+-.+-.. +.++.|+.|+.|+.+.+..-...
T Consensus 122 ilvsD~y~~Y~~-----~~~~~hq~C~AH~~R~~~~~~~~ 156 (271)
T PF03050_consen 122 ILVSDGYSAYNK-----LAGITHQLCWAHLRRDFQDAAES 156 (271)
T ss_pred eeeccccccccc-----ccccccccccccccccccccccc
Confidence 899999887644 23889999999999999887664
No 22
>PF04500 FLYWCH: FLYWCH zinc finger domain; InterPro: IPR007588 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array.
C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets. This entry represents a potential FLYWCH Zn-finger domain found in a number of eukaryotic proteins. FLYWCH is a C2H2-type zinc finger characterised by five conserved hydrophobic residues, containing the conserved sequence motif: F/Y-X(n)-L-X(n)-F/Y-X(n)-WXCX(6-12)CX(17-22)HXH where X indicates any amino acid. This domain was first characterised in Drosophila Modifier of mdg4 proteins, Mod(mdg4), putative chromatin modulators involved in higher order chromatin domains. Mod(mdg4) proteins share a common N-terminal BTB/POZ domain, but differ in their C-terminal region, most containing C-terminal FLYWCH zinc finger motifs. The FLYWCH domain in Mod(mdg4) proteins has a putative role in protein-protein interactions; for example, Mod(mdg4)-67.2 interacts with DNA-binding protein Su(Hw) via its FLYWCH domain. FLYWCH domains have been described in other proteins as well, including suppressor of killer of prune, Su(Kpn), which contains 4 terminal FLYWCH zinc finger motifs in a tandem array and a C-terminal glutathione S-transferase (GST) domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; PDB: 2RPR_A.
Probab=90.37 E-value=0.47 Score=34.89 Aligned_cols=46 Identities=20% Similarity=0.509 Sum_probs=25.6
Q ss_pred cCceEEEEeeCCeEEEEEEecc---CCCeEEEEEEcCCcceEEEEeccCcccc
Q 007318 65 HGFAYRYKKNDSHRVTVKCKCQ---GCPWRIYASRLSTTQLVCIKKMNSKHTC 114 (608)
Q Consensus 65 ~gf~~~~~~s~~~r~~~~C~~~---gC~~~v~~~~~~~~~~~~V~~~~~~H~c 114 (608)
.|+.|...+.......+.|... +|+++|... .+. -.|.....+|||
T Consensus 14 ~Gy~y~~~~~~~~~~~WrC~~~~~~~C~a~~~~~--~~~--~~~~~~~~~HnH 62 (62)
T PF04500_consen 14 DGYRYYFNKRNDGKTYWRCSRRRSHGCRARLITD--AGD--GRVVRTNGEHNH 62 (62)
T ss_dssp TTEEEEEEEE-SS-EEEEEGGGTTS----EEEEE----T--TEEEE-S---SS
T ss_pred CCeEEECcCCCCCcEEEEeCCCCCCCCeEEEEEE--CCC--CEEEECCCccCC
Confidence 6788888777788889999864 899999998 222 345555578887
No 23
>PRK09409 IS2 transposase TnpB; Reviewed
Probab=90.26 E-value=5.8 Score=40.01 Aligned_cols=143 Identities=14% Similarity=0.095 Sum_probs=84.3
Q ss_pred hHHHHHHHHhhhCCCCCHHHHHHHHHHHh---CC-cccHHHHHHHHHHHHHHHhcChHhHhccHHHHHHHHHhhCCCcEE
Q 007318 129 WVGNIIKEKLKASPNYKPKDIADDIKREY---GI-QLNYSQAWRAKEIAREQLQGSYKDSYTLLPFFCEKIKETNPGSVV 204 (608)
Q Consensus 129 ~i~~~~~~~l~~~~~~~~~~i~~~v~~~~---g~-~~s~~~~~~~k~~~~~~~~g~~~~~~~~L~~~~~~~~~~npg~~~ 204 (608)
.+...+.+.....+.+..+.|...|..+. |. .++..+++|..+..-- .+ ..+...+.+.
T Consensus 51 ~l~~~I~~i~~~~~~yG~Rri~~~L~~~g~~~g~~~v~~k~V~RlMr~~Gl--~~--------------~~~~~~~~~~- 113 (301)
T PRK09409 51 DVLLRIHHVIGELPTYGYRRVWALLRRQAELDGMPAINAKRVYRIMRQNAL--LL--------------ERKPAVPPSK- 113 (301)
T ss_pred HHHHHHHHHHHhCccCCHHHHHHHHHhhhcccCccccCHHHHHHHHHHcCC--cc--------------cccCCCCCCC-
Confidence 34555666656678899999999998753 66 5888877775433210 00 0000000000
Q ss_pred EEEecCCCceeEEEeehhhhHHHHhhcCccEEEeecccccCccceeEEEEEEecCCCCeEEEEEEEeec-CCcccHHHHH
Q 007318 205 TFTTKEDSSFHRLFVSFHASISGFQQGCRPLLFLDTTPLNSKYQGTLLTATSADGDDGIFPVAFAVVDA-ETEDNWHWFL 283 (608)
Q Consensus 205 ~~~~~~~~~~~~if~~~~~~~~~~~~~~~~vi~~D~T~~~~~y~~~l~~~~g~d~~~~~~~~a~a~~~~-E~~~~~~w~l 283 (608)
.+.... |. ...-..+++.|-||....-++-++.++.+|...+ .+|||++... .+.+.-..+|
T Consensus 114 ------~~~~~~-~~---------~~~pN~~W~tDiT~~~~~~g~~~Yl~~ViD~~sR-~ivg~~~s~~~~~~~~v~~~l 176 (301)
T PRK09409 114 ------RAHTGR-VA---------VKESNQRWCSDGFEFCCDNGERLRVTFALDCCDR-EALHWAVTTGGFNSETVQDVM 176 (301)
T ss_pred ------CCCCCC-cC---------CCCCCCEEEeeeEEEEeCCCCEEEEEEEeecccc-eEEEEEeccCCCCHHHHHHHH
Confidence 000000 10 1114569999999976544556888888888777 6789999875 5666655566
Q ss_pred HH-HHHhcC---CCCcEEEEecCccc
Q 007318 284 QE-LKSAVS---TSQQITFIADFQNG 305 (608)
Q Consensus 284 ~~-l~~~~~---~~~~~~iitD~~~~ 305 (608)
+. +....+ ...|..|-||+...
T Consensus 177 ~~a~~~~~~~~~~~~~~iihSDrGsq 202 (301)
T PRK09409 177 LGAVERRFGNDLPSSPVEWLTDNGSC 202 (301)
T ss_pred HHHHHHHhccCCCCCCcEEecCCCcc
Confidence 54 444333 22478899999664
No 24
>smart00774 WRKY DNA binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.
Probab=89.33 E-value=0.46 Score=34.61 Aligned_cols=38 Identities=26% Similarity=0.412 Sum_probs=31.8
Q ss_pred eEEEEEEec-cCCCeEEEEEEcCCcceEEEEeccCcccc
Q 007318 77 HRVTVKCKC-QGCPWRIYASRLSTTQLVCIKKMNSKHTC 114 (608)
Q Consensus 77 ~r~~~~C~~-~gC~~~v~~~~~~~~~~~~V~~~~~~H~c 114 (608)
-|..++|+. .||+++=.+.+..+.....++.+.++|||
T Consensus 21 pRsYYrCt~~~~C~a~K~Vq~~~~d~~~~~vtY~g~H~h 59 (59)
T smart00774 21 PRSYYRCTYSQGCPAKKQVQRSDDDPSVVEVTYEGEHTH 59 (59)
T ss_pred cceEEeccccCCCCCcccEEEECCCCCEEEEEEeeEeCC
Confidence 466799998 89999988888765567788899999997
No 25
>PF00665 rve: Integrase core domain; InterPro: IPR001584 Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus (ASV) have been studied most carefully with respect to the structural basis of catalysis. Although the active site of ASV integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis. Retroviral integrase is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group. HIV integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration.
To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity.; GO: 0015074 DNA integration; PDB: 3AO3_A 3OVN_A 3AO5_A 3AO4_A 3AO1_A 1C6V_D 3HPG_A 3HPH_A 3OYD_A 3OYF_B ....
Probab=87.93 E-value=1.8 Score=36.58 Aligned_cols=74 Identities=16% Similarity=0.090 Sum_probs=54.3
Q ss_pred ccEEEeeccccc-CccceeEEEEEEecCCCCeEEEEEEEeecCCcccHHHHHHHHHHhcCCCCcEEEEecCcccHH
Q 007318 233 RPLLFLDTTPLN-SKYQGTLLTATSADGDDGIFPVAFAVVDAETEDNWHWFLQELKSAVSTSQQITFIADFQNGLN 307 (608)
Q Consensus 233 ~~vi~~D~T~~~-~~y~~~l~~~~g~d~~~~~~~~a~a~~~~E~~~~~~w~l~~l~~~~~~~~~~~iitD~~~~l~ 307 (608)
..++.+|.++.. ...++..+..+.+|..-+. .+++.+-..++.+.+..+|.......+...|.+|++|+..+..
T Consensus 6 ~~~~~~D~~~~~~~~~~~~~~~~~~iD~~S~~-~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~p~~i~tD~g~~f~ 80 (120)
T PF00665_consen 6 GERWQIDFTPMPIPDKGGRVYLLVFIDDYSRF-IYAFPVSSKETAEAALRALKRAIEKRGGRPPRVIRTDNGSEFT 80 (120)
T ss_dssp TTEEEEEEEEETGGCTT-CEEEEEEEETTTTE-EEEEEESSSSHHHHHHHHHHHHHHHHS-SE-SEEEEESCHHHH
T ss_pred CCEEEEeeEEEecCCCCccEEEEEEEECCCCc-EEEEEeecccccccccccccccccccccccceecccccccccc
Confidence 457899999664 3455578888888877664 4577777777888888888877777666559999999998765
No 26
>PF14392 zf-CCHC_4: Zinc knuckle
Probab=86.54 E-value=0.28 Score=34.41 Aligned_cols=19 Identities=32% Similarity=0.800 Sum_probs=17.2
Q ss_pred eeecCCcCCCCcCccCCCC
Q 007318 589 SLQCSKCKGLGHNKKTCKD 607 (608)
Q Consensus 589 ~~~C~~C~~~gHn~~tC~~ 607 (608)
...|+.||..||+.+.||+
T Consensus 31 p~~C~~C~~~gH~~~~C~k 49 (49)
T PF14392_consen 31 PRFCFHCGRIGHSDKECPK 49 (49)
T ss_pred ChhhcCCCCcCcCHhHcCC
Confidence 4689999999999999985
No 27
>smart00343 ZnF_C2HC zinc finger.
Probab=86.02 E-value=0.37 Score=28.64 Aligned_cols=16 Identities=31% Similarity=0.862 Sum_probs=14.8
Q ss_pred ecCCcCCCCcCccCCC
Q 007318 591 QCSKCKGLGHNKKTCK 606 (608)
Q Consensus 591 ~C~~C~~~gHn~~tC~ 606 (608)
.|.+|++.||..+.||
T Consensus 1 ~C~~CG~~GH~~~~C~ 16 (26)
T smart00343 1 KCYNCGKEGHIARDCP 16 (26)
T ss_pred CCccCCCCCcchhhCC
Confidence 4889999999999998
No 28
>PF13565 HTH_32: Homeodomain-like domain
Probab=84.18 E-value=2.7 Score=32.43 Aligned_cols=40 Identities=25% Similarity=0.551 Sum_probs=33.8
Q ss_pred HHHHHHHHhhhCCCCCHHHHHHHHHHHhCCcc--cHHHHHHH
Q 007318 130 VGNIIKEKLKASPNYKPKDIADDIKREYGIQL--NYSQAWRA 169 (608)
Q Consensus 130 i~~~~~~~l~~~~~~~~~~i~~~v~~~~g~~~--s~~~~~~~ 169 (608)
+...+.+.+..+|.+++.+|.+.|.+++|+.+ |.+++||.
T Consensus 35 ~~~~i~~~~~~~p~wt~~~i~~~L~~~~g~~~~~S~~tv~R~ 76 (77)
T PF13565_consen 35 QRERIIALIEEHPRWTPREIAEYLEEEFGISVRVSRSTVYRI 76 (77)
T ss_pred HHHHHHHHHHhCCCCCHHHHHHHHHHHhCCCCCccHhHHHHh
Confidence 33566677778899999999999999999876 99999874
No 29
>PF04937 DUF659: Protein of unknown function (DUF 659); InterPro: IPR007021 These are transposase-like proteins with no known function.
Probab=81.73 E-value=8.1 Score=34.56 Aligned_cols=63 Identities=19% Similarity=0.331 Sum_probs=50.7
Q ss_pred CCcccHHHHHHHHHHhcCCCCcEEEEecCcccHHHH---HhhhcCCccccchHHHHHHHHHhhhcc
Q 007318 274 ETEDNWHWFLQELKSAVSTSQQITFIADFQNGLNKS---LAEVFDNCYHSYCLRHLAEKLNRDIKG 336 (608)
Q Consensus 274 E~~~~~~w~l~~l~~~~~~~~~~~iitD~~~~l~~A---i~~vfP~a~~~~C~~Hi~~n~~~~~~~ 336 (608)
.+.+....+|+...+.+|..+..-||||....+.+| +.+-+|......|.-|-+.-+.+.+..
T Consensus 73 ~~a~~l~~ll~~vIeeVG~~nVvqVVTDn~~~~~~a~~~L~~k~p~ifw~~CaaH~inLmledi~k 138 (153)
T PF04937_consen 73 KTAEYLFELLDEVIEEVGEENVVQVVTDNASNMKKAGKLLMEKYPHIFWTPCAAHCINLMLEDIGK 138 (153)
T ss_pred ccHHHHHHHHHHHHHHhhhhhhhHHhccCchhHHHHHHHHHhcCCCEEEechHHHHHHHHHHHHhc
Confidence 456666677777777778778888999999998888 566789999999999998888777664
No 30
>PF02178 AT_hook: AT hook motif; InterPro: IPR017956 AT hooks are DNA-binding motifs with a preference for A/T rich regions. These motifs are found in a variety of proteins, including the high mobility group (HMG) proteins, in DNA-binding proteins from plants and in hBRG1 protein, a central ATPase of the human switching/sucrose non-fermenting (SWI/SNF) remodeling complex. High mobility group (HMG) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG-I and HMG-Y (HMGA) are proteins of about 100 amino acid residues which are produced by the alternative splicing of a single gene. HMG-I/Y proteins bind preferentially to the minor groove of AT-rich regions in double-stranded DNA in a non-sequence specific manner. It is suggested that these proteins could function in nucleosome phasing and in the 3' end processing of mRNA transcripts. They are also involved in the transcription regulation of genes containing, or in close proximity to, AT-rich regions.; GO: 0003677 DNA binding; PDB: 2EZE_A 2EZD_A 2EZF_A 2EZG_A.
Probab=80.51 E-value=0.74 Score=22.49 Aligned_cols=9 Identities=44% Similarity=0.896 Sum_probs=3.5
Q ss_pred CCCCCCCCC
Q 007318 570 RPPGRPKMK 578 (608)
Q Consensus 570 r~~GRPk~~ 578 (608)
+++|||++.
T Consensus 2 r~RGRP~k~ 10 (13)
T PF02178_consen 2 RKRGRPRKN 10 (13)
T ss_dssp --SS--TT-
T ss_pred CcCCCCccc
Confidence 679999875
No 31
>smart00384 AT_hook DNA binding domain with preference for A/T rich regions. Small DNA-binding motif first described in the high mobility group non-histone chromosomal protein HMG-I(Y).
Probab=72.52 E-value=2.7 Score=24.66 Aligned_cols=13 Identities=31% Similarity=0.708 Sum_probs=9.9
Q ss_pred CCCCCCCCCCCCC
Q 007318 569 RRPPGRPKMKQPE 581 (608)
Q Consensus 569 ~r~~GRPk~~r~~ 581 (608)
.|++|||+|....
T Consensus 1 kRkRGRPrK~~~~ 13 (26)
T smart00384 1 KRKRGRPRKAPKD 13 (26)
T ss_pred CCCCCCCCCCCCc
Confidence 3789999987643
No 32
>COG5431 Uncharacterized metal-binding protein [Function unknown]
Probab=72.02 E-value=3.1 Score=33.56 Aligned_cols=29 Identities=28% Similarity=0.565 Sum_probs=22.0
Q ss_pred CceEEEEecceeeecCCccc-----CCCCchhHHHH
Q 007318 483 ESADIVDVDRWDCTCKTWHL-----TGLPCCHAIAV 513 (608)
Q Consensus 483 ~~~~~V~l~~~~CsC~~~~~-----~giPC~H~lav 513 (608)
++.|+++.+ -|||..|-. -.-||.|++..
T Consensus 42 ~rdYIl~~g--fCSCp~~~~svvl~Gk~~C~Hi~gl 75 (117)
T COG5431 42 ERDYILEGG--FCSCPDFLGSVVLKGKSPCAHIIGL 75 (117)
T ss_pred ccceEEEcC--cccCHHHHhHhhhcCcccchhhhhe
Confidence 557888776 999998872 23579999875
No 33
>PF05741 zf-nanos: Nanos RNA binding domain; InterPro: IPR024161 Nanos is a highly conserved RNA-binding protein in higher eukaryotes and functions as a key regulatory protein in translational control using a 3' untranslated region during the development and maintenance of germ cells. Nanos comprises a non-conserved amino-terminus and highly conserved carboxy-terminal regions. The C-terminal region has two conserved Cys-Cys-His-Cys (CCHC)-type zinc-finger motifs that are indispensable for nanos function [, , ]. The structure of the nanos-type zinc finger is composed of two independent zinc-finger (ZF) lobes, the N-terminal ZF1 and the C-terminal ZF2, which are connected by a linker helix []. These lobes create a large cleft. Zinc ions in ZF1 and ZF2 are bound to the CCHC motif by tetrahedral coordination.; PDB: 3ALR_B.
Probab=67.28 E-value=2.2 Score=30.52 Aligned_cols=20 Identities=30% Similarity=0.672 Sum_probs=8.5
Q ss_pred ceeecCCcCCC---CcCccCCCC
Q 007318 588 RSLQCSKCKGL---GHNKKTCKD 607 (608)
Q Consensus 588 r~~~C~~C~~~---gHn~~tC~~ 607 (608)
|.+.|..||.. +|+.+-||+
T Consensus 32 r~y~Cp~CgAtGd~AHT~~yCP~ 54 (55)
T PF05741_consen 32 RKYVCPICGATGDNAHTIKYCPK 54 (55)
T ss_dssp GG---TTT---GGG---GGG-TT
T ss_pred hcCcCCCCcCcCccccccccCcC
Confidence 56899999886 599999996
No 34
>COG5179 TAF1 Transcription initiation factor TFIID, subunit TAF1 [Transcription]
Probab=62.65 E-value=4.3 Score=43.44 Aligned_cols=21 Identities=33% Similarity=0.835 Sum_probs=17.0
Q ss_pred cceeecCCcCCCCcCc--cCCCC
Q 007318 587 KRSLQCSKCKGLGHNK--KTCKD 607 (608)
Q Consensus 587 kr~~~C~~C~~~gHn~--~tC~~ 607 (608)
..+++|+.||+.||-+ +.||+
T Consensus 935 ~Ttr~C~nCGQvGHmkTNK~CP~ 957 (968)
T COG5179 935 NTTRTCGNCGQVGHMKTNKACPK 957 (968)
T ss_pred CcceecccccccccccccccCcc
Confidence 3479999999999966 46875
No 35
>COG4279 Uncharacterized conserved protein [Function unknown]
Probab=62.29 E-value=4.1 Score=38.77 Aligned_cols=23 Identities=35% Similarity=0.751 Sum_probs=19.2
Q ss_pred eeeecCCcccCCCCchhHHHHHHHhC
Q 007318 493 WDCTCKTWHLTGLPCCHAIAVLEWIG 518 (608)
Q Consensus 493 ~~CsC~~~~~~giPC~H~lav~~~~~ 518 (608)
..|||..+. .||.|+-||....+
T Consensus 125 ~dCSCPD~a---nPCKHi~AvyY~la 147 (266)
T COG4279 125 TDCSCPDYA---NPCKHIAAVYYLLA 147 (266)
T ss_pred cccCCCCcc---cchHHHHHHHHHHH
Confidence 469999866 59999999988765
No 36
>PRK12286 rpmF 50S ribosomal protein L32; Reviewed
Probab=59.74 E-value=9.5 Score=27.66 Aligned_cols=42 Identities=12% Similarity=0.331 Sum_probs=23.1
Q ss_pred CCCCCCCCCCCCCCCCccccccceeecCCcCCCCcCccCCCC
Q 007318 566 PPTRRPPGRPKMKQPESAEIIKRSLQCSKCKGLGHNKKTCKD 607 (608)
Q Consensus 566 p~~~r~~GRPk~~r~~~~~~~kr~~~C~~C~~~gHn~~tC~~ 607 (608)
|+.+.++.|..++|..-.-.......|+.||..-=.=+-||.
T Consensus 4 PKrk~S~srr~~RRsh~~l~~~~l~~C~~CG~~~~~H~vC~~ 45 (57)
T PRK12286 4 PKRKTSKSRKRKRRAHFKLKAPGLVECPNCGEPKLPHRVCPS 45 (57)
T ss_pred CcCcCChhhcchhcccccccCCcceECCCCCCccCCeEECCC
Confidence 555566666666654322223346789999886322233443
No 37
>PRK14892 putative transcription elongation factor Elf1; Provisional
Probab=54.50 E-value=11 Score=30.82 Aligned_cols=10 Identities=20% Similarity=0.949 Sum_probs=7.4
Q ss_pred ceeecCCcCC
Q 007318 588 RSLQCSKCKG 597 (608)
Q Consensus 588 r~~~C~~C~~ 597 (608)
....|.+|+.
T Consensus 20 t~f~CP~Cge 29 (99)
T PRK14892 20 KIFECPRCGK 29 (99)
T ss_pred cEeECCCCCC
Confidence 5678888883
No 38
>PRK09335 30S ribosomal protein S26e; Provisional
Probab=53.93 E-value=12 Score=29.95 Aligned_cols=28 Identities=29% Similarity=0.491 Sum_probs=19.5
Q ss_pred CCCCCCCCCCCCCCCCccccccceeecCCcCCC
Q 007318 566 PPTRRPPGRPKMKQPESAEIIKRSLQCSKCKGL 598 (608)
Q Consensus 566 p~~~r~~GRPk~~r~~~~~~~kr~~~C~~C~~~ 598 (608)
|.+++..||-|+-|-. -+..+|.+||..
T Consensus 2 ~kKRrn~GR~K~~rGh-----v~~V~C~nCgr~ 29 (95)
T PRK09335 2 PKKRENRGRRKGDKGH-----VGYVQCDNCGRR 29 (95)
T ss_pred CcccccCCCCCCCCCC-----CccEEeCCCCCc
Confidence 5678888988765421 245899999863
No 39
>PF13917 zf-CCHC_3: Zinc knuckle
Probab=51.68 E-value=7.8 Score=26.08 Aligned_cols=18 Identities=33% Similarity=0.798 Sum_probs=16.4
Q ss_pred eeecCCcCCCCcCccCCC
Q 007318 589 SLQCSKCKGLGHNKKTCK 606 (608)
Q Consensus 589 ~~~C~~C~~~gHn~~tC~ 606 (608)
...|.+|++.||-..-||
T Consensus 4 ~~~CqkC~~~GH~tyeC~ 21 (42)
T PF13917_consen 4 RVRCQKCGQKGHWTYECP 21 (42)
T ss_pred CCcCcccCCCCcchhhCC
Confidence 368999999999999998
No 40
>TIGR01031 rpmF_bact ribosomal protein L32. This protein describes bacterial ribosomal protein L32. The noise cutoff is set low enough to include the equivalent protein from mitochondria and chloroplasts. No related proteins from the Archaea nor from the eukaryotic cytosol are detected by this model. This model is a fragment model; the putative L32 of some species shows similarity only toward the N-terminus.
Probab=51.51 E-value=17 Score=26.16 Aligned_cols=42 Identities=14% Similarity=0.407 Sum_probs=23.9
Q ss_pred CCCCCCCCCCCCCCCCcc-ccccceeecCCcCCCCcCccCCCC
Q 007318 566 PPTRRPPGRPKMKQPESA-EIIKRSLQCSKCKGLGHNKKTCKD 607 (608)
Q Consensus 566 p~~~r~~GRPk~~r~~~~-~~~kr~~~C~~C~~~gHn~~tC~~ 607 (608)
|+.+.++-|..++|.... ........|+.||+.--.=+-||.
T Consensus 2 PKrk~Sksr~~~RRah~~kl~~p~l~~C~~cG~~~~~H~vc~~ 44 (55)
T TIGR01031 2 PKRKTSKSRKRKRRSHDAKLTAPTLVVCPNCGEFKLPHRVCPS 44 (55)
T ss_pred CCCcCCcccccchhcCcccccCCcceECCCCCCcccCeeECCc
Confidence 445555566555554322 233456789999986544445554
No 41
>COG4715 Uncharacterized conserved protein [Function unknown]
Probab=49.46 E-value=47 Score=35.80 Aligned_cols=44 Identities=27% Similarity=0.290 Sum_probs=30.9
Q ss_pred eEEEEcCceE--EEEe----cceeeecCCcccCCCCchhHHHHHHHhCCCcc
Q 007318 477 TFEVRGESAD--IVDV----DRWDCTCKTWHLTGLPCCHAIAVLEWIGRSPY 522 (608)
Q Consensus 477 ~f~V~~~~~~--~V~l----~~~~CsC~~~~~~giPC~H~lav~~~~~~~p~ 522 (608)
..+|.+++.| .|.+ -..+|||.. ...| =|.|+.||+......|+
T Consensus 51 ~A~V~Gs~~y~v~vtL~~~~~ss~CTCP~-~~~g-aCKH~VAvvl~~~~~p~ 100 (587)
T COG4715 51 RAVVEGSRRYRVRVTLEGGALSSICTCPY-GGSG-ACKHVVAVVLEYLDDPE 100 (587)
T ss_pred EEEEeccceeeEEEEeecCCcCceeeCCC-CCCc-chHHHHHHHHHHhhccc
Confidence 3467776665 4455 257999997 5555 59999999887655443
No 42
>PRK00766 hypothetical protein; Provisional
Probab=48.56 E-value=1.4e+02 Score=27.79 Aligned_cols=88 Identities=15% Similarity=0.125 Sum_probs=51.8
Q ss_pred ccEEEee-cccccCccceeEEEEEEecCCCCeEEEEEEEeecCCcccHHHHHHHHHHhcC--------------------
Q 007318 233 RPLLFLD-TTPLNSKYQGTLLTATSADGDDGIFPVAFAVVDAETEDNWHWFLQELKSAVS-------------------- 291 (608)
Q Consensus 233 ~~vi~~D-~T~~~~~y~~~l~~~~g~d~~~~~~~~a~a~~~~E~~~~~~w~l~~l~~~~~-------------------- 291 (608)
-.|++|| +.|..+.-+-..++-+..-++.-..-++|..+.-...|.=..+.+.++....
T Consensus 9 irvlGidds~f~~~~~~~~~lvGvv~r~~~~idGv~~~~itvdG~DaT~~i~~mv~~~~~r~~i~~V~L~Git~agFNvv 88 (194)
T PRK00766 9 IRVLGIDDGTFLFKSSEKVILVGVVMRGGDWVDGVLSRWITVDGLDATEAIIEMVNSSRHKGQLRVIMLDGITYGGFNVV 88 (194)
T ss_pred ceEEEEecCccccCCCCCEEEEEEEEECCeEEeeEEEEEEEECCccHHHHHHHHHHhcccccceEEEEECCEeeeeeEEe
Confidence 3588888 4444332233444444445555555677777777666666666666554110
Q ss_pred --------CCCcEEEEecCcc---cHHHHHhhhcCCcccc
Q 007318 292 --------TSQQITFIADFQN---GLNKSLAEVFDNCYHS 320 (608)
Q Consensus 292 --------~~~~~~iitD~~~---~l~~Ai~~vfP~a~~~ 320 (608)
-..|+.+++..-+ +|.+||++.||+...+
T Consensus 89 D~~~l~~~tg~PVI~V~r~~p~~~~ie~AL~k~f~~~~~R 128 (194)
T PRK00766 89 DIEELYRETGLPVIVVMRKKPDFEAIESALKKHFSDWEER 128 (194)
T ss_pred cHHHHHHHHCCCEEEEEecCCCHHHHHHHHHHHCCCHHHH
Confidence 0136666644443 7888998899987654
No 43
>PF13592 HTH_33: Winged helix-turn helix
Probab=47.60 E-value=33 Score=25.03 Aligned_cols=30 Identities=27% Similarity=0.458 Sum_probs=25.3
Q ss_pred CCCCHHHHHHHHHHHhCCcccHHHHHHHHH
Q 007318 142 PNYKPKDIADDIKREYGIQLNYSQAWRAKE 171 (608)
Q Consensus 142 ~~~~~~~i~~~v~~~~g~~~s~~~~~~~k~ 171 (608)
..++..+|.+.|.+.||+.+|.+.+++.-.
T Consensus 3 ~~wt~~~i~~~I~~~fgv~ys~~~v~~lL~ 32 (60)
T PF13592_consen 3 GRWTLKEIAAYIEEEFGVKYSPSGVYRLLK 32 (60)
T ss_pred CcccHHHHHHHHHHHHCCEEcHHHHHHHHH
Confidence 456889999999999999999988877544
No 44
>PLN00186 ribosomal protein S26; Provisional
Probab=47.05 E-value=17 Score=29.77 Aligned_cols=28 Identities=29% Similarity=0.541 Sum_probs=19.4
Q ss_pred CCCCCCCCCCCCCCCCccccccceeecCCcCCC
Q 007318 566 PPTRRPPGRPKMKQPESAEIIKRSLQCSKCKGL 598 (608)
Q Consensus 566 p~~~r~~GRPk~~r~~~~~~~kr~~~C~~C~~~ 598 (608)
|.++|..||-|+.|-. -+..+|++|+..
T Consensus 2 ~kKRrN~GR~K~~rGh-----v~~V~C~nCgr~ 29 (109)
T PLN00186 2 TKKRRNGGRNKHGRGH-----VKRIRCSNCGKC 29 (109)
T ss_pred CcccccCCCCCCCCCC-----CcceeeCCCccc
Confidence 5677888888765421 235899999873
No 45
>PF14201 DUF4318: Domain of unknown function (DUF4318)
Probab=46.40 E-value=34 Score=26.28 Aligned_cols=30 Identities=30% Similarity=0.525 Sum_probs=26.7
Q ss_pred eeCCHHHHHHHHHHHHHhcCceEEEEeeCC
Q 007318 47 RFSSFSEFREALHKYSIAHGFAYRYKKNDS 76 (608)
Q Consensus 47 ~F~s~~e~~~~~~~ya~~~gf~~~~~~s~~ 76 (608)
.+||.+++..+|..|+.++|-.+.+.+.+.
T Consensus 13 ~yPs~e~i~~aIE~YC~~~~~~l~Fisr~~ 42 (74)
T PF14201_consen 13 KYPSKEEICEAIEKYCIKNGESLEFISRDK 42 (74)
T ss_pred CCCCHHHHHHHHHHHHHHcCCceEEEecCC
Confidence 689999999999999999999999876543
No 46
>PTZ00172 40S ribosomal protein S26; Provisional
Probab=45.61 E-value=19 Score=29.58 Aligned_cols=28 Identities=29% Similarity=0.573 Sum_probs=19.5
Q ss_pred CCCCCCCCCCCCCCCCccccccceeecCCcCCC
Q 007318 566 PPTRRPPGRPKMKQPESAEIIKRSLQCSKCKGL 598 (608)
Q Consensus 566 p~~~r~~GRPk~~r~~~~~~~kr~~~C~~C~~~ 598 (608)
|.++|..||-|+.|-. -+..+|.+|+..
T Consensus 2 ~kKRrN~GR~K~~rGh-----v~~V~C~nCgr~ 29 (108)
T PTZ00172 2 TSKRRNNGRSKHGRGH-----VKPVRCSNCGRC 29 (108)
T ss_pred CcccccCCCCCCCCCC-----CccEEeCCcccc
Confidence 5677888988765421 245899999873
No 47
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=44.55 E-value=11 Score=34.46 Aligned_cols=16 Identities=31% Similarity=0.875 Sum_probs=13.8
Q ss_pred eecCCcCCCCcCccCC
Q 007318 590 LQCSKCKGLGHNKKTC 605 (608)
Q Consensus 590 ~~C~~C~~~gHn~~tC 605 (608)
.+|..||+.||-++-|
T Consensus 98 ~~C~~Cg~~GH~~~dC 113 (190)
T COG5082 98 KKCYNCGETGHLSRDC 113 (190)
T ss_pred cccccccccCcccccc
Confidence 5788899999998888
No 48
>PF12762 DDE_Tnp_IS1595: ISXO2-like transposase domain; InterPro: IPR024445 This domain probably functions as an integrase that is found in a wide variety of transposases, including ISXO2.
Probab=42.29 E-value=72 Score=28.15 Aligned_cols=68 Identities=16% Similarity=0.138 Sum_probs=38.2
Q ss_pred cEEEeecccccCcc--------------ceeEEEEEEecCC-CCeEEEEEEEeecCCcccHHHHHHHHHHhcCCCCcEEE
Q 007318 234 PLLFLDTTPLNSKY--------------QGTLLTATSADGD-DGIFPVAFAVVDAETEDNWHWFLQELKSAVSTSQQITF 298 (608)
Q Consensus 234 ~vi~~D~T~~~~~y--------------~~~l~~~~g~d~~-~~~~~~a~a~~~~E~~~~~~w~l~~l~~~~~~~~~~~i 298 (608)
.+|-+|.||..++- .....++++++-+ +..--+-..++++.+.++..-+++...+ +..+|
T Consensus 4 G~VEiDEty~~~~~~~~~~~~~~~gr~~~~k~~V~~~ver~~~~~~~~~~~~v~~~~~~tl~~~i~~~i~-----~gs~i 78 (151)
T PF12762_consen 4 GIVEIDETYFGGRKNKKPRRKGKRGRGSKNKVPVFGAVERNDGGTGRVFMFVVPDRSAETLKPIIQEHIE-----PGSTI 78 (151)
T ss_pred CEEEeCcCEECCcccccccCCCCCCCcCCCCcEEEEEEeecccCCceEEEEeecccccchhHHHHHHhhh-----cccee
Confidence 46777877764322 1223344444443 3333344455577888887666643322 34678
Q ss_pred EecCcccH
Q 007318 299 IADFQNGL 306 (608)
Q Consensus 299 itD~~~~l 306 (608)
+||...+-
T Consensus 79 ~TD~~~aY 86 (151)
T PF12762_consen 79 ITDGWRAY 86 (151)
T ss_pred eecchhhc
Confidence 99997765
No 49
>COG0333 RpmF Ribosomal protein L32 [Translation, ribosomal structure and biogenesis]
Probab=40.14 E-value=17 Score=26.25 Aligned_cols=19 Identities=16% Similarity=0.455 Sum_probs=13.6
Q ss_pred eeecCCcCCCCcCccCCCC
Q 007318 589 SLQCSKCKGLGHNKKTCKD 607 (608)
Q Consensus 589 ~~~C~~C~~~gHn~~tC~~ 607 (608)
...|+.||+..---+.|+.
T Consensus 27 ~~~c~~cG~~~l~Hrvc~~ 45 (57)
T COG0333 27 LSVCPNCGEYKLPHRVCLK 45 (57)
T ss_pred ceeccCCCCcccCceEcCC
Confidence 6889999987555555654
No 50
>KOG4602 consensus Nanos and related proteins [General function prediction only]
Probab=39.97 E-value=13 Score=35.27 Aligned_cols=21 Identities=38% Similarity=0.716 Sum_probs=17.7
Q ss_pred ceeecCCcCCCC---cCccCCCCC
Q 007318 588 RSLQCSKCKGLG---HNKKTCKDS 608 (608)
Q Consensus 588 r~~~C~~C~~~g---Hn~~tC~~~ 608 (608)
|.+.|..||..| |+++-||+.
T Consensus 267 R~YVCPiCGATgDnAHTiKyCPl~ 290 (318)
T KOG4602|consen 267 RSYVCPICGATGDNAHTIKYCPLA 290 (318)
T ss_pred hhhcCccccccCCcccceeccccc
Confidence 679999998876 999999963
No 51
>PF13082 DUF3931: Protein of unknown function (DUF3931)
Probab=38.15 E-value=94 Score=21.62 Aligned_cols=28 Identities=11% Similarity=0.138 Sum_probs=18.6
Q ss_pred cceeEEEEEEecCCCCeEEEEEEEeecC
Q 007318 247 YQGTLLTATSADGDDGIFPVAFAVVDAE 274 (608)
Q Consensus 247 y~~~l~~~~g~d~~~~~~~~a~a~~~~E 274 (608)
|...-++++|...+|+..++...+..+|
T Consensus 35 yefssfvlcgetpdgrrlvlthmistde 62 (66)
T PF13082_consen 35 YEFSSFVLCGETPDGRRLVLTHMISTDE 62 (66)
T ss_pred EEEEEEEEEccCCCCcEEEEEEEecchh
Confidence 3344566778888888877776665544
No 52
>PHA00689 hypothetical protein
Probab=37.21 E-value=18 Score=24.77 Aligned_cols=13 Identities=31% Similarity=0.920 Sum_probs=10.5
Q ss_pred cceeecCCcCCCC
Q 007318 587 KRSLQCSKCKGLG 599 (608)
Q Consensus 587 kr~~~C~~C~~~g 599 (608)
.|..+|.+||+.|
T Consensus 15 pravtckrcgktg 27 (62)
T PHA00689 15 PRAVTCKRCGKTG 27 (62)
T ss_pred cceeehhhccccC
Confidence 4668999999876
No 53
>PF01498 HTH_Tnp_Tc3_2: Transposase; InterPro: IPR002492 Transposase proteins are necessary for efficient DNA transposition. This family includes the amino-terminal region of Tc1, Tc1A, Tc1B and Tc2B transposases of Caenorhabditis elegans. The region encompasses the specific DNA binding and second DNA recognition domains as well as an amino-terminal region of the catalytic domain of Tc3 as described in []. Tc3 is a member of the Tc1/mariner family of transposable elements. This entry also includes histone-lysine N-methyltransferase SETMAR, which is a SET domain and mariner transposase fusion gene-containing protein. This histone methyltransferase has sequence-specific DNA-binding activity and recognises the 19-mer core of the 5'-terminal inverted repeats (TIRs) of the Hsmar1 element. This protein has DNA nicking activity, and has in vivo end joining activity and may mediate genomic integration of foreign DNA [, , , ]. More information about these proteins can be found at Protein of the Month: Transposase [].; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated, 0015074 DNA integration; PDB: 3K9K_B 3F2K_B 3K9J_B 1U78_A.
Probab=36.12 E-value=37 Score=25.66 Aligned_cols=37 Identities=22% Similarity=0.428 Sum_probs=16.2
Q ss_pred HHHHhhhCCCCCHHHHHHHHHHHhCCcccHHHHHHHHH
Q 007318 134 IKEKLKASPNYKPKDIADDIKREYGIQLNYSQAWRAKE 171 (608)
Q Consensus 134 ~~~~l~~~~~~~~~~i~~~v~~~~g~~~s~~~~~~~k~ 171 (608)
+...+..+|.++..+|...+.+. |..+|..++++.-.
T Consensus 4 I~~~v~~~p~~s~~~i~~~l~~~-~~~vS~~TI~r~L~ 40 (72)
T PF01498_consen 4 IVRMVRRNPRISAREIAQELQEA-GISVSKSTIRRRLR 40 (72)
T ss_dssp ------------HHHHHHHT----T--S-HHHHHHHHH
T ss_pred HHHHHHHCCCCCHHHHHHHHHHc-cCCcCHHHHHHHHH
Confidence 34456678999999999999888 99999999888644
No 54
>PF08459 UvrC_HhH_N: UvrC Helix-hairpin-helix N-terminal; InterPro: IPR001162 During the process of Escherichia coli nucleotide excision repair, DNA damage recognition and processing are achieved by the action of the uvrA, uvrB, and uvrC gene products []. The UvrC proteins contain 4 conserved regions: a central region which interacts with UvrB (Uvr domain), a Helix hairpin Helix (HhH) domain important for 5' incision of damaged DNA and the homology regions 1 and 2 of unknown function. UvrC homology region 2 is specific for UvrC proteins, whereas UvrC homology region 1 is also shared by a few other nucleases. Proteins that contain the UvrC homology region 1, IPR000305 from INTERPRO, are listed below: Prokaryotic UvrC proteins. Bacteriophage T4 END2 protein. Small subunit of ribonucleotide reductase enzyme. T4 TEV1 protein. Endonuclease specific to the thymidylate synthase (td) gene splice junction. Found in putative intron-homing endonucleases encoded by group I introns of fungi and phage. Mycobacterium hypothetical protein Y002. Exonuclease by similarity. Bacillus subtilis hypothetical protein YURQ. ; GO: 0003677 DNA binding, 0004518 nuclease activity, 0006289 nucleotide-excision repair; PDB: 3C65_A 2NRZ_A 2NRR_A 2NRX_A 2NRV_A 2NRT_A 2NRW_A.
Probab=34.34 E-value=1e+02 Score=27.56 Aligned_cols=46 Identities=13% Similarity=0.140 Sum_probs=32.1
Q ss_pred EeecCCcccHHHHHHHHHHhcCC------CCcEEEEecCcccHHHHHhhhcC
Q 007318 270 VVDAETEDNWHWFLQELKSAVST------SQQITFIADFQNGLNKSLAEVFD 315 (608)
Q Consensus 270 ~~~~E~~~~~~w~l~~l~~~~~~------~~~~~iitD~~~~l~~Ai~~vfP 315 (608)
+-..+..+.|..+-+.+.+.+.. .-|-.|+.|+.++..+|..+++-
T Consensus 49 i~~~~~~dDy~~M~Evl~RR~~~~~~~~~~lPDLilIDGG~gQl~aa~~~l~ 100 (155)
T PF08459_consen 49 IKTVDGGDDYAAMREVLTRRFKRLKEEKEPLPDLILIDGGKGQLNAAKEVLK 100 (155)
T ss_dssp EE--STT-HHHHHHHHHHHHHCCCHHHT----SEEEESSSHHHHHHHHHHHH
T ss_pred cCCCCCCcHHHHHHHHHHHHHhcccccCCCCCCEEEEcCCHHHHHHHHHHHH
Confidence 33345568899888888888753 24889999999999999887753
No 55
>PF08766 DEK_C: DEK C terminal domain; InterPro: IPR014876 DEK is a chromatin-associated protein that is linked with cancers and autoimmune disease. This domain is found at the C terminus of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients []. The structure of this domain shows it to be homologous to the E2F/DP transcription factor family []. This domain is also found in chitin synthase proteins like Q8TF96 from SWISSPROT, and in protein phosphatases such as Q6NN85 from SWISSPROT. ; PDB: 1Q1V_A.
Probab=33.97 E-value=85 Score=22.28 Aligned_cols=36 Identities=17% Similarity=0.394 Sum_probs=23.1
Q ss_pred hHHHHHHHHhhh--CCCCCHHHHHHHHHHHhCCcccHH
Q 007318 129 WVGNIIKEKLKA--SPNYKPKDIADDIKREYGIQLNYS 164 (608)
Q Consensus 129 ~i~~~~~~~l~~--~~~~~~~~i~~~v~~~~g~~~s~~ 164 (608)
.+...+.+.++. -.+++.++|...+++.+|..++..
T Consensus 4 ~i~~~i~~iL~~~dl~~vT~k~vr~~Le~~~~~dL~~~ 41 (54)
T PF08766_consen 4 EIREAIREILREADLDTVTKKQVREQLEERFGVDLSSR 41 (54)
T ss_dssp HHHHHHHHHHTTS-GGG--HHHHHHHHHHH-SS--SHH
T ss_pred HHHHHHHHHHHhCCHhHhhHHHHHHHHHHHHCCCcHHH
Confidence 355566777764 246799999999999999998843
No 56
>KOG0341 consensus DEAD-box protein abstrakt [RNA processing and modification]
Probab=32.59 E-value=20 Score=36.54 Aligned_cols=19 Identities=37% Similarity=0.892 Sum_probs=17.2
Q ss_pred eeecCCcCCCCcCccCCCC
Q 007318 589 SLQCSKCKGLGHNKKTCKD 607 (608)
Q Consensus 589 ~~~C~~C~~~gHn~~tC~~ 607 (608)
..-|.+||+.||..+.||+
T Consensus 570 ~kGCayCgGLGHRItdCPK 588 (610)
T KOG0341|consen 570 EKGCAYCGGLGHRITDCPK 588 (610)
T ss_pred ccccccccCCCcccccCch
Confidence 4679999999999999996
No 57
>COG4830 RPS26B Ribosomal protein S26 [Translation, ribosomal structure and biogenesis]
Probab=31.38 E-value=35 Score=27.39 Aligned_cols=28 Identities=29% Similarity=0.589 Sum_probs=19.3
Q ss_pred CCCCCCCCCCCCCCCCccccccceeecCCcCCC
Q 007318 566 PPTRRPPGRPKMKQPESAEIIKRSLQCSKCKGL 598 (608)
Q Consensus 566 p~~~r~~GRPk~~r~~~~~~~kr~~~C~~C~~~ 598 (608)
|+.|+..||-|+-|-. -.-.+|-+||..
T Consensus 2 pkkR~N~GR~K~~rGh-----v~~v~CdnCg~~ 29 (108)
T COG4830 2 PKKRRNRGRNKKGRGH-----VKYVRCDNCGKA 29 (108)
T ss_pred cchhhhcCCCCCCCCC-----ccceeecccccc
Confidence 6778889998775521 124789999864
No 58
>PRK13907 rnhA ribonuclease H; Provisional
Probab=30.96 E-value=3.3e+02 Score=23.03 Aligned_cols=77 Identities=10% Similarity=0.175 Sum_probs=40.8
Q ss_pred EEEeecccccCccceeEEEEEEecCCCCeEEEEEE-EeecCCcccHHHHHHHHHHhcCCC-CcEEEEecCcccHHHHHhh
Q 007318 235 LLFLDTTPLNSKYQGTLLTATSADGDDGIFPVAFA-VVDAETEDNWHWFLQELKSAVSTS-QQITFIADFQNGLNKSLAE 312 (608)
Q Consensus 235 vi~~D~T~~~~~y~~~l~~~~g~d~~~~~~~~a~a-~~~~E~~~~~~w~l~~l~~~~~~~-~~~~iitD~~~~l~~Ai~~ 312 (608)
.|.+||.+..+.-.+-...++ .+..+.. .+.+. -..+.+..-|.-++..|+.+.... .++.|.+|- ..+.+++..
T Consensus 3 ~iy~DGa~~~~~g~~G~G~vi-~~~~~~~-~~~~~~~~~tn~~AE~~All~aL~~a~~~g~~~v~i~sDS-~~vi~~~~~ 79 (128)
T PRK13907 3 EVYIDGASKGNPGPSGAGVFI-KGVQPAV-QLSLPLGTMSNHEAEYHALLAALKYCTEHNYNIVSFRTDS-QLVERAVEK 79 (128)
T ss_pred EEEEeeCCCCCCCccEEEEEE-EECCeeE-EEEecccccCCcHHHHHHHHHHHHHHHhCCCCEEEEEech-HHHHHHHhH
Confidence 478999998653222222222 4444433 23321 122344455677777777665432 467777876 455566655
Q ss_pred hc
Q 007318 313 VF 314 (608)
Q Consensus 313 vf 314 (608)
.+
T Consensus 80 ~~ 81 (128)
T PRK13907 80 EY 81 (128)
T ss_pred HH
Confidence 44
No 59
>PRK13130 H/ACA RNA-protein complex component Nop10p; Reviewed
Probab=29.88 E-value=31 Score=24.89 Aligned_cols=20 Identities=25% Similarity=0.522 Sum_probs=14.5
Q ss_pred ceeecCCcCCCCcCccCCCCC
Q 007318 588 RSLQCSKCKGLGHNKKTCKDS 608 (608)
Q Consensus 588 r~~~C~~C~~~gHn~~tC~~~ 608 (608)
+.++|+.||...= +..||.|
T Consensus 4 ~mr~C~~CgvYTL-k~~CP~C 23 (56)
T PRK13130 4 KIRKCPKCGVYTL-KEICPVC 23 (56)
T ss_pred cceECCCCCCEEc-cccCcCC
Confidence 5678888888665 7777754
No 60
>PF04800 ETC_C1_NDUFA4: ETC complex I subunit conserved region; InterPro: IPR006885 This entry represents prokaryotic NADH-ubiquinone oxidoreductase subunits (1.6.5.3 from EC, 1.6.99.3 from EC) from complex I of the electron transport chain initially identified in Neurospora crassa as a 21 kDa protein [].; GO: 0016651 oxidoreductase activity, acting on NADH or NADPH, 0022900 electron transport chain, 0005743 mitochondrial inner membrane; PDB: 2JYA_A 2LJU_A.
Probab=28.96 E-value=90 Score=25.67 Aligned_cols=29 Identities=21% Similarity=0.399 Sum_probs=21.9
Q ss_pred cccceeCCHHHHHHHHHHHHHhcCceEEEEeeC
Q 007318 43 GVDQRFSSFSEFREALHKYSIAHGFAYRYKKND 75 (608)
Q Consensus 43 ~vG~~F~s~~e~~~~~~~ya~~~gf~~~~~~s~ 75 (608)
.+.+.|+|+|++. .||.++|..|.+..-.
T Consensus 50 ~v~l~F~skE~Ai----~yaer~G~~Y~V~~p~ 78 (101)
T PF04800_consen 50 SVRLKFDSKEDAI----AYAERNGWDYEVEEPK 78 (101)
T ss_dssp -CEEEESSHHHHH----HHHHHCT-EEEEE-ST
T ss_pred eeEeeeCCHHHHH----HHHHHcCCeEEEeCCC
Confidence 3888999988875 5899999999986543
No 61
>PRK14890 putative Zn-ribbon RNA-binding protein; Provisional
Probab=28.59 E-value=24 Score=25.64 Aligned_cols=19 Identities=26% Similarity=0.684 Sum_probs=12.4
Q ss_pred eeecCCcCCCCcCccCCCCC
Q 007318 589 SLQCSKCKGLGHNKKTCKDS 608 (608)
Q Consensus 589 ~~~C~~C~~~gHn~~tC~~~ 608 (608)
..||.+|++.|+ .-+||+|
T Consensus 36 I~RC~~CRk~~~-~Y~CP~C 54 (59)
T PRK14890 36 IYRCEKCRKQSN-PYTCPKC 54 (59)
T ss_pred EeechhHHhcCC-ceECCCC
Confidence 567777777773 4456654
No 62
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=28.48 E-value=27 Score=32.06 Aligned_cols=18 Identities=28% Similarity=0.720 Sum_probs=16.6
Q ss_pred eeecCCcCCCCcCccCCC
Q 007318 589 SLQCSKCKGLGHNKKTCK 606 (608)
Q Consensus 589 ~~~C~~C~~~gHn~~tC~ 606 (608)
...|-+||+.||-++-||
T Consensus 60 ~~~C~nCg~~GH~~~DCP 77 (190)
T COG5082 60 NPVCFNCGQNGHLRRDCP 77 (190)
T ss_pred ccccchhcccCcccccCC
Confidence 478999999999999999
No 63
>PF01283 Ribosomal_S26e: Ribosomal protein S26e; InterPro: IPR000892 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [, ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [, ]. A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. One of these families, the S26E family, includes mammalian S26 []; Octopus S26 []; Drosophila S26 (DS31) []; plant cytoplasmic S26; and fungal S26 []. These proteins have 114 to 127 amino acids.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005622 intracellular, 0005840 ribosome; PDB: 3U5G_a 3U5C_a 2XZM_5 2XZN_5.
Probab=26.88 E-value=45 Score=27.85 Aligned_cols=27 Identities=26% Similarity=0.564 Sum_probs=12.7
Q ss_pred CCCCCCCCCCCCCCCCccccccceeecCCcCC
Q 007318 566 PPTRRPPGRPKMKQPESAEIIKRSLQCSKCKG 597 (608)
Q Consensus 566 p~~~r~~GRPk~~r~~~~~~~kr~~~C~~C~~ 597 (608)
|.++|..||-|+-|-. -+..+|.+|+.
T Consensus 2 ~~KRrN~Gr~KkgrGh-----v~~V~C~nCgr 28 (113)
T PF01283_consen 2 TKKRRNNGRSKKGRGH-----VQPVRCDNCGR 28 (113)
T ss_dssp ----TTTTSS-SSSS--------EEE-TTTB-
T ss_pred CcccccCCCCCCCCCC-----CcCEeeCcccc
Confidence 5667778887765421 24589999986
No 64
>PF09713 A_thal_3526: Plant protein 1589 of unknown function (A_thal_3526); InterPro: IPR006476 This plant-specific family of proteins is defined by an uncharacterised region 57 residues in length. It is found toward the N terminus of most proteins that contain it. Examples include at least several proteins from Arabidopsis thaliana (Mouse-ear cress) and Oryza sativa (Rice). The function of these proteins is unknown.
Probab=26.65 E-value=72 Score=22.84 Aligned_cols=26 Identities=12% Similarity=0.404 Sum_probs=19.0
Q ss_pred CCCHHHHHHHHHHHhCCcccH-HHHHH
Q 007318 143 NYKPKDIADDIKREYGIQLNY-SQAWR 168 (608)
Q Consensus 143 ~~~~~~i~~~v~~~~g~~~s~-~~~~~ 168 (608)
.++..+++..|.++.|+.... ..+|+
T Consensus 12 yMsk~E~v~~L~~~a~I~P~~T~~VW~ 38 (54)
T PF09713_consen 12 YMSKEECVRALQKQANIEPVFTSTVWQ 38 (54)
T ss_pred cCCHHHHHHHHHHHcCCChHHHHHHHH
Confidence 467889999999988887654 34554
No 65
>PF05634 APO_RNA-bind: APO RNA-binding; InterPro: IPR008512 This family consists of plant APO (accumulation of photosystem 1) proteins.
Probab=26.38 E-value=37 Score=31.42 Aligned_cols=19 Identities=32% Similarity=0.796 Sum_probs=15.8
Q ss_pred eeecCCcCC-----CCcCccCCCC
Q 007318 589 SLQCSKCKG-----LGHNKKTCKD 607 (608)
Q Consensus 589 ~~~C~~C~~-----~gHn~~tC~~ 607 (608)
.+.|++|.+ .||..+||..
T Consensus 98 V~~C~~C~EVHVG~~GH~irtC~g 121 (204)
T PF05634_consen 98 VKACGYCPEVHVGPVGHKIRTCGG 121 (204)
T ss_pred eeecCCCCCeEECCCcccccccCC
Confidence 589999954 6999999964
No 66
>PF13276 HTH_21: HTH-like domain
Probab=25.51 E-value=1.7e+02 Score=21.08 Aligned_cols=42 Identities=29% Similarity=0.483 Sum_probs=32.7
Q ss_pred HHHHHHHHhhh-CCCCCHHHHHHHHHHHhCCcccHHHHHHHHH
Q 007318 130 VGNIIKEKLKA-SPNYKPKDIADDIKREYGIQLNYSQAWRAKE 171 (608)
Q Consensus 130 i~~~~~~~l~~-~~~~~~~~i~~~v~~~~g~~~s~~~~~~~k~ 171 (608)
+...+.+.+.. .+.+....|...|..+.|+.+|...+++..+
T Consensus 6 l~~~I~~i~~~~~~~yG~rri~~~L~~~~~~~v~~krV~RlM~ 48 (60)
T PF13276_consen 6 LRELIKEIFKESKPTYGYRRIWAELRREGGIRVSRKRVRRLMR 48 (60)
T ss_pred HHHHHHHHHHHcCCCeehhHHHHHHhccCcccccHHHHHHHHH
Confidence 44556666665 4788999999999999889999988877543
No 67
>PF14420 Clr5: Clr5 domain
Probab=25.11 E-value=2.1e+02 Score=20.31 Aligned_cols=25 Identities=20% Similarity=0.282 Sum_probs=21.5
Q ss_pred CCCCCHHHHHHHHHHHhCCcccHHH
Q 007318 141 SPNYKPKDIADDIKREYGIQLNYSQ 165 (608)
Q Consensus 141 ~~~~~~~~i~~~v~~~~g~~~s~~~ 165 (608)
+.+.+..+|++.++..+|..+|..+
T Consensus 18 ~e~~tl~~v~~~M~~~~~F~at~rq 42 (54)
T PF14420_consen 18 DENKTLEEVMEIMKEEHGFKATKRQ 42 (54)
T ss_pred hCCCcHHHHHHHHHHHhCCCcCHHH
Confidence 4578999999999999999998654
No 68
>PF01783 Ribosomal_L32p: Ribosomal L32p protein family; InterPro: IPR002677 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [, ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [, ]. Ribosomal protein L32p is part of the 50S ribosomal subunit. This family is found in both prokaryotes and eukaryotes. Ribosomal protein L32 of yeast binds to and regulates the splicing and the translation of the transcript of its own gene [].; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0015934 large ribosomal subunit; PDB: 3PYT_2 3F1F_5 3PYV_2 3D5B_5 3MRZ_2 3D5D_5 3F1H_5 1VSP_Y 3PYR_2 3MS1_2 ....
Probab=24.62 E-value=17 Score=26.18 Aligned_cols=20 Identities=15% Similarity=0.539 Sum_probs=13.2
Q ss_pred ceeecCCcCCCCcCccCCCC
Q 007318 588 RSLQCSKCKGLGHNKKTCKD 607 (608)
Q Consensus 588 r~~~C~~C~~~gHn~~tC~~ 607 (608)
....|+.||..--.-+-||.
T Consensus 25 ~l~~c~~cg~~~~~H~vc~~ 44 (56)
T PF01783_consen 25 NLVKCPNCGEPKLPHRVCPS 44 (56)
T ss_dssp SEEESSSSSSEESTTSBCTT
T ss_pred ceeeeccCCCEecccEeeCC
Confidence 56889999986333355554
No 69
>PF13877 RPAP3_C: Potential Monad-binding region of RPAP3
Probab=24.50 E-value=40 Score=27.15 Aligned_cols=34 Identities=12% Similarity=0.269 Sum_probs=27.3
Q ss_pred CCcccHHHHHHHHhccChhHHHHHHhcCCCCccc
Q 007318 357 PKFEGFQCSIESIKGISPDAYDWVTQSEPEHWAN 390 (608)
Q Consensus 357 ~t~~~f~~~~~~l~~~~~~~~~~l~~~~~~~W~~ 390 (608)
.|..||++.|..+.......++||..+.++....
T Consensus 5 ~~~~eF~~~w~~~~~~~~~~~~yL~~i~p~~l~~ 38 (94)
T PF13877_consen 5 KNSYEFERDWRRLKKDPEERYEYLKSIPPDSLPK 38 (94)
T ss_pred CCHHHHHHHHHHHcCCHHHHHHHHHhCChHHHHH
Confidence 4677999999999877678899999987766653
No 70
>PF13248 zf-ribbon_3: zinc-ribbon domain
Probab=23.51 E-value=39 Score=19.96 Aligned_cols=18 Identities=22% Similarity=0.665 Sum_probs=10.5
Q ss_pred eecCCcCCC-CcCccCCCC
Q 007318 590 LQCSKCKGL-GHNKKTCKD 607 (608)
Q Consensus 590 ~~C~~C~~~-gHn~~tC~~ 607 (608)
+.|..||.. .-..+-||.
T Consensus 3 ~~Cp~Cg~~~~~~~~fC~~ 21 (26)
T PF13248_consen 3 MFCPNCGAEIDPDAKFCPN 21 (26)
T ss_pred CCCcccCCcCCcccccChh
Confidence 467777664 444555554
No 71
>PF14787 zf-CCHC_5: GAG-polyprotein viral zinc-finger; PDB: 1CL4_A 1DSV_A.
Probab=22.30 E-value=57 Score=21.05 Aligned_cols=17 Identities=24% Similarity=0.481 Sum_probs=11.2
Q ss_pred eecCCcCCCCcCccCCC
Q 007318 590 LQCSKCKGLGHNKKTCK 606 (608)
Q Consensus 590 ~~C~~C~~~gHn~~tC~ 606 (608)
..|.+|++-.|=.+.|-
T Consensus 3 ~~CprC~kg~Hwa~~C~ 19 (36)
T PF14787_consen 3 GLCPRCGKGFHWASECR 19 (36)
T ss_dssp -C-TTTSSSCS-TTT--
T ss_pred ccCcccCCCcchhhhhh
Confidence 46999999999999885
No 72
>KOG4027 consensus Uncharacterized conserved protein [Function unknown]
Probab=22.26 E-value=1.1e+02 Score=27.15 Aligned_cols=36 Identities=14% Similarity=0.171 Sum_probs=29.9
Q ss_pred eecccc-cCcccee--EEEEEEecCCCCeEEEEEEEeec
Q 007318 238 LDTTPL-NSKYQGT--LLTATSADGDDGIFPVAFAVVDA 273 (608)
Q Consensus 238 ~D~T~~-~~~y~~~--l~~~~g~d~~~~~~~~a~a~~~~ 273 (608)
||.||+ ++.|+.| ++...|.|+-|+-...||+.+.-
T Consensus 70 ievt~KstsPygWPqivl~vfg~d~~G~d~v~GYg~~hi 108 (187)
T KOG4027|consen 70 IEVTLKSTSPYGWPQIVLNVFGKDHSGKDCVTGYGMLHI 108 (187)
T ss_pred eEEEeccCCCCCCceEEEEEecCCcCCcceeeeeeeEec
Confidence 788887 6788765 67889999999999999997643
No 73
>PF13358 DDE_3: DDE superfamily endonuclease
Probab=21.67 E-value=92 Score=26.68 Aligned_cols=54 Identities=19% Similarity=0.221 Sum_probs=39.8
Q ss_pred eEEEEEEecCCCCeEEEEEEEeecCCcccHHHHHHHHHHhcCCCCcEEEEecCccc
Q 007318 250 TLLTATSADGDDGIFPVAFAVVDAETEDNWHWFLQELKSAVSTSQQITFIADFQNG 305 (608)
Q Consensus 250 ~l~~~~g~d~~~~~~~~a~a~~~~E~~~~~~w~l~~l~~~~~~~~~~~iitD~~~~ 305 (608)
.+.++.+++.++...+ +.....-+.+.|.-||+.+........+.+||.|..+.
T Consensus 38 ~~~~~~ai~~~~~~~~--~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~li~DNa~~ 91 (146)
T PF13358_consen 38 RVSVWGAISYNGGIVL--FVVEGTMNSEDFIEFLEQLLRPYPRKGRIVLIMDNASI 91 (146)
T ss_pred EEEEEEEecccccccc--eeeeeeeccccccccccccccccccceEEEEecccccc
Confidence 5666666777775555 55566778888888999888776655589999999763
No 74
>PF13551 HTH_29: Winged helix-turn helix
Probab=20.96 E-value=2.1e+02 Score=23.29 Aligned_cols=39 Identities=23% Similarity=0.440 Sum_probs=29.5
Q ss_pred HHHHHhhhCC-----CCCHHHHHHHH-HHHhCCcccHHHHHHHHH
Q 007318 133 IIKEKLKASP-----NYKPKDIADDI-KREYGIQLNYSQAWRAKE 171 (608)
Q Consensus 133 ~~~~~l~~~~-----~~~~~~i~~~v-~~~~g~~~s~~~~~~~k~ 171 (608)
.+.+.+..+| .+++..|...+ ++.+|+.+|.+++++.-.
T Consensus 65 ~l~~~~~~~p~~g~~~~t~~~l~~~l~~~~~~~~~s~~ti~r~L~ 109 (112)
T PF13551_consen 65 QLIELLRENPPEGRSRWTLEELAEWLIEEEFGIDVSPSTIRRILK 109 (112)
T ss_pred HHHHHHHHCCCCCCCcccHHHHHHHHHHhccCccCCHHHHHHHHH
Confidence 4555566655 47889999866 888999999999888643
No 75
>PF12353 eIF3g: Eukaryotic translation initiation factor 3 subunit G ; InterPro: IPR024675 At least eleven different protein factors are involved in initiation of protein synthesis in eukaryotes. Binding of initiator tRNA and mRNA to the 40S subunit requires the presence of the translation initiation factors eIF-2 and eIF-3, with eIF-3 being particularly important for 80S ribosome dissociation and mRNA binding. eIF-3 is the most complex translation initiation factor, consisting of about 13 putative subunits and having a molecular weight of between 550 and 700 kDa in mammalian cells. Subunits are designated eIF-3a - eIF-3m; the large number of subunits means that the interactions between the individual subunits that make up the eIF-3 complex are complex and varied. Subunit G is required for eIF3 integrity. This entry represents a domain of approximately 130 amino acids in length found at the N terminus of eukaryotic translation initiation factor 3 subunit G. This domain is commonly found in association with the RNA recognition domain PF00076 from PFAM.
Probab=20.55 E-value=68 Score=27.67 Aligned_cols=20 Identities=30% Similarity=0.614 Sum_probs=17.5
Q ss_pred cceeecCCcCCCCcCccCCCC
Q 007318 587 KRSLQCSKCKGLGHNKKTCKD 607 (608)
Q Consensus 587 kr~~~C~~C~~~gHn~~tC~~ 607 (608)
+....|..|+ -.|-...||.
T Consensus 104 ~~~v~CR~Ck-GdH~T~~CPy 123 (128)
T PF12353_consen 104 KSKVKCRICK-GDHWTSKCPY 123 (128)
T ss_pred CceEEeCCCC-CCcccccCCc
Confidence 4679999996 8899999995
No 76
>PRK01110 rpmF 50S ribosomal protein L32; Validated
Probab=20.52 E-value=77 Score=23.25 Aligned_cols=37 Identities=8% Similarity=0.069 Sum_probs=19.2
Q ss_pred CCCCCCCCCCCCCCCCccccccceeecCCcCCC--CcCc
Q 007318 566 PPTRRPPGRPKMKQPESAEIIKRSLQCSKCKGL--GHNK 602 (608)
Q Consensus 566 p~~~r~~GRPk~~r~~~~~~~kr~~~C~~C~~~--gHn~ 602 (608)
|+.+.++-|..++|..-.-.......|+.||.. -|..
T Consensus 4 PKrK~Sksr~~~RRa~~~~~~~~~~~c~~cg~~~~pH~v 42 (60)
T PRK01110 4 PKRKTSKSKRRMRRSHWKLTAPTLSVDKTTGEYHLPHHV 42 (60)
T ss_pred CcCccchhhchhhhhhhhccCCceeEcCCCCceecccee
Confidence 444444555444443222222346789999885 4543
No 77
>PF10045 DUF2280: Uncharacterized conserved protein (DUF2280); InterPro: IPR018738 This entry is represented by Burkholderia phage Bups phi1, Orf2.36. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches.
Probab=20.31 E-value=1.2e+02 Score=24.88 Aligned_cols=22 Identities=32% Similarity=0.745 Sum_probs=19.6
Q ss_pred CHHHHHHHHHHHhCCcccHHHH
Q 007318 145 KPKDIADDIKREYGIQLNYSQA 166 (608)
Q Consensus 145 ~~~~i~~~v~~~~g~~~s~~~~ 166 (608)
+|.++.+.|+++||+.+|..++
T Consensus 21 TPs~v~~aVk~eFgi~vsrQqv 42 (104)
T PF10045_consen 21 TPSEVAEAVKEEFGIDVSRQQV 42 (104)
T ss_pred CHHHHHHHHHHHhCCccCHHHH
Confidence 7999999999999999997654
No 78
>cd01388 SOX-TCF_HMG-box SOX-TCF_HMG-box, class I member of the HMG-box superfamily of DNA-binding proteins. These proteins contain a single HMG box, and bind the minor groove of DNA in a highly sequence-specific manner. Members include SRY and its homologs in insects and vertebrates, and transcription factor-like proteins, TCF-1, -3, -4, and LEF-1. They appear to bind the minor groove of the A/T C A A A G/C-motif.
Probab=20.11 E-value=3.8e+02 Score=20.03 Aligned_cols=21 Identities=14% Similarity=0.246 Sum_probs=17.4
Q ss_pred hhhCCCCCHHHHHHHHHHHhC
Q 007318 138 LKASPNYKPKDIADDIKREYG 158 (608)
Q Consensus 138 l~~~~~~~~~~i~~~v~~~~g 158 (608)
...+|+++..+|...|.+.+.
T Consensus 21 ~~~~p~~~~~eisk~l~~~Wk 41 (72)
T cd01388 21 LQEYPLKENRAISKILGDRWK 41 (72)
T ss_pred HHHCCCCCHHHHHHHHHHHHH
Confidence 346899999999999988774
Done!