Query 037162
Match_columns 689
No_of_seqs 421 out of 1354
Neff 7.7
Searched_HMMs 46136
Date Fri Mar 29 09:18:47 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/037162.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/037162hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PLN03097 FHY3 Protein FAR-RED 100.0 1.3E-45 2.8E-50 425.3 26.4 311 8-324 67-436 (846)
2 PF00872 Transposase_mut: Tran 99.8 5.8E-21 1.3E-25 208.0 1.5 232 125-387 94-352 (381)
3 PF03101 FAR1: FAR1 DNA-bindin 99.8 3.4E-19 7.4E-24 154.6 8.0 90 29-119 1-91 (91)
4 PF10551 MULE: MULE transposas 99.7 7.2E-17 1.6E-21 140.5 6.1 78 192-271 16-93 (93)
5 PF08731 AFT: Transcription fa 99.6 1.3E-14 2.8E-19 125.6 10.8 91 21-117 1-111 (111)
6 PF02338 OTU: OTU-like cystein 99.5 5.4E-15 1.2E-19 135.2 5.5 107 564-687 1-121 (121)
7 COG3328 Transposase and inacti 99.4 4.6E-12 9.9E-17 135.9 15.7 227 128-387 83-331 (379)
8 KOG2606 OTU (ovarian tumor)-li 99.1 8.7E-11 1.9E-15 118.1 6.5 124 548-683 149-283 (302)
9 PF03108 DBD_Tnp_Mut: MuDR fam 98.3 2E-06 4.3E-11 69.9 7.0 63 16-106 5-67 (67)
10 KOG3288 OTU-like cysteine prot 96.7 0.0051 1.1E-07 61.2 7.3 115 560-687 112-227 (307)
11 PF10275 Peptidase_C65: Peptid 96.7 0.0025 5.4E-08 65.7 5.5 27 551-577 34-60 (244)
12 KOG2605 OTU (ovarian tumor)-li 96.5 0.0015 3.3E-08 70.2 2.4 123 553-687 213-338 (371)
13 KOG3991 Uncharacterized conser 95.5 0.051 1.1E-06 53.6 7.9 20 559-578 65-84 (256)
14 PF05412 Peptidase_C33: Equine 94.9 0.043 9.3E-07 47.6 4.7 32 625-657 34-65 (108)
15 PF03106 WRKY: WRKY DNA -bindi 94.6 0.15 3.2E-06 40.4 6.8 56 39-116 4-59 (60)
16 PF06782 UPF0236: Uncharacteri 94.4 1.2 2.6E-05 50.3 16.2 201 123-333 111-354 (470)
17 PF04684 BAF1_ABF1: BAF1 / ABF 90.2 2 4.4E-05 47.0 10.2 42 18-64 25-66 (496)
18 COG5539 Predicted cysteine pro 90.1 0.14 3E-06 52.4 1.5 110 539-655 152-271 (306)
19 PF01610 DDE_Tnp_ISL3: Transpo 90.1 0.2 4.3E-06 51.5 2.6 67 208-276 31-98 (249)
20 smart00774 WRKY DNA binding do 89.7 0.67 1.5E-05 36.4 4.6 28 88-115 32-59 (59)
21 PF04500 FLYWCH: FLYWCH zinc f 86.8 1.6 3.5E-05 34.0 5.3 25 87-115 38-62 (62)
22 PF15299 ALS2CR8: Amyotrophic 84.8 4.9 0.00011 40.8 9.0 99 78-178 69-222 (225)
23 PF03050 DDE_Tnp_IS66: Transpo 76.4 3 6.6E-05 43.4 4.3 132 131-276 5-156 (271)
24 PF08069 Ribosomal_S13_N: Ribo 71.2 5.8 0.00012 31.3 3.5 31 130-160 28-59 (60)
25 PF13610 DDE_Tnp_IS240: DDE do 67.1 0.97 2.1E-05 42.2 -1.8 59 195-257 23-81 (140)
26 COG5539 Predicted cysteine pro 63.9 20 0.00043 37.3 6.6 107 565-687 119-226 (306)
27 PF13936 HTH_38: Helix-turn-he 61.9 5.8 0.00013 29.1 1.9 30 128-157 3-32 (44)
28 PRK08561 rps15p 30S ribosomal 57.6 39 0.00084 31.8 6.9 32 129-160 27-59 (151)
29 KOG4345 NF-kappa B regulator A 44.3 10 0.00023 43.7 1.1 49 638-688 225-287 (774)
30 PTZ00072 40S ribosomal protein 40.4 61 0.0013 30.3 5.2 31 130-160 25-56 (148)
31 KOG0400 40S ribosomal protein 30.7 38 0.00083 30.9 2.3 29 132-160 31-59 (151)
32 PF03462 PCRF: PCRF domain; I 29.5 1E+02 0.0022 27.7 4.9 42 25-66 66-107 (115)
33 PRK09784 hypothetical protein; 29.3 51 0.0011 33.2 3.1 36 554-590 196-231 (417)
34 PF02796 HTH_7: Helix-turn-hel 29.0 49 0.0011 24.2 2.3 39 128-173 4-42 (45)
35 PF04800 ETC_C1_NDUFA4: ETC co 28.4 45 0.00096 29.4 2.3 27 17-47 51-77 (101)
36 PF11427 HTH_Tnp_Tc3_1: Tc3 tr 28.0 74 0.0016 24.2 3.1 30 129-158 4-33 (50)
37 PF00665 rve: Integrase core d 27.1 71 0.0015 27.9 3.5 50 199-249 33-82 (120)
38 PF03461 TRCF: TRCF domain; I 27.0 1.1E+02 0.0024 26.8 4.6 40 286-325 18-57 (101)
39 TIGR03147 cyt_nit_nrfF cytochr 26.7 81 0.0018 28.9 3.7 34 129-162 57-90 (126)
40 KOG4825 Component of synaptic 26.4 1.4E+02 0.0031 33.1 6.0 29 479-507 284-312 (666)
41 PLN03097 FHY3 Protein FAR-RED 26.1 4.1E+02 0.0089 32.6 10.5 28 398-425 595-623 (846)
42 COG3316 Transposase and inacti 22.8 1.9E+02 0.0041 29.1 5.8 108 143-259 23-151 (215)
43 PF07506 RepB: RepB plasmid pa 22.7 2.4E+02 0.0052 27.5 6.6 62 559-621 53-114 (185)
44 PRK10144 formate-dependent nit 20.8 1.3E+02 0.0027 27.7 3.7 34 129-162 57-90 (126)
45 TIGR03277 methan_mark_9 putati 20.4 82 0.0018 27.8 2.3 30 569-598 78-108 (109)
No 1
>PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional
Probab=100.00 E-value=1.3e-45 Score=425.32 Aligned_cols=311 Identities=16% Similarity=0.257 Sum_probs=225.3
Q ss_pred ccccCcceecccccCChhhHHHHHHHHHhhcCeEEEEeecccCC-CCCccEEEEEEecCCcCCCCCCC--C---------
Q 037162 8 TEESGKEKVVNVALMEREDMPREELQTELRNGLVIVIEKSDVAA-NGRKPRIIFTCERSGVYRDRSPQ--G--------- 75 (689)
Q Consensus 8 ~~~~~~~~~~~~~F~S~eea~~~~~~yA~~~GF~v~i~rS~~~~-~g~~~~~~~~C~r~G~~r~~~~~--~--------- 75 (689)
.+..+.+|.+||+|+|.|||++||+.||+..||+|||.+|++.+ +|.+..+.|+|+|+|..+...+. .
T Consensus 67 ~~~~~~~P~vGMeF~S~eeA~~FYn~YA~~~GFsVRi~~srrsk~~~~ii~r~fvCsreG~~~~~~~~~~~~~~~~~k~~ 146 (846)
T PLN03097 67 KEDTNLEPLSGMEFESHGEAYSFYQEYARSMGFNTAIQNSRRSKTSREFIDAKFACSRYGTKREYDKSFNRPRARQTKQD 146 (846)
T ss_pred cCCCCccCcCCCeECCHHHHHHHHHHHHhhcCceEEeeceeccCCCCcEEEEEEEEcCCCCCcccccccccccccccccC
Confidence 45566789999999999999999999999999999998887764 56788999999999964321110 0
Q ss_pred -CCCCCcCCceeeCCceEEEEEEeecCCCeEEEEEeCcccCCCCcccccccccccCCHHHHHHHH---------------
Q 037162 76 -PKPIKATGIQKCKCPFKLKGQKMANNDDWALIVICGFHNHPATQYLEGHSFAGRLSKEESNLLV--------------- 139 (689)
Q Consensus 76 -~~~rr~~~s~ktgCpa~i~~~~~~~~~~W~V~~~~~~HNH~l~~~~~~h~~~RrLs~e~k~~I~--------------- 139 (689)
...+++++.+||||+|+|+++.. .+|+|.|+.+..+|||++.++.......|++-......+.
T Consensus 147 ~~~~~~rR~~tRtGC~A~m~Vk~~-~~gkW~V~~fv~eHNH~L~p~~~~~~~~r~~~~~~~~~~~~~~~v~~~~~d~~~~ 225 (846)
T PLN03097 147 PENGTGRRSCAKTDCKASMHVKRR-PDGKWVIHSFVKEHNHELLPAQAVSEQTRKMYAAMARQFAEYKNVVGLKNDSKSS 225 (846)
T ss_pred cccccccccccCCCCceEEEEEEc-CCCeEEEEEEecCCCCCCCCccccchhhhhhHHHHHhhhhccccccccchhhcch
Confidence 00112355789999999999885 4579999999999999999764322111222111111000
Q ss_pred --HhhhCCC---ChHHHHHH---HHhcCCCccc--------hhhHHHHHHHHhHHhhh-cc------------hhHHHHH
Q 037162 140 --DMSKNNV---KPKDILHV---LKKRDMHNAT--------TIRAIYNARRKCKVREQ-AG------------RSQMQLL 190 (689)
Q Consensus 140 --~L~~sgv---~pr~Il~~---L~~~~g~~~~--------t~kDIyN~~~k~r~~~l-~g------------~t~~~~L 190 (689)
......+ -...++.+ ++..+|+.++ .++.|+++..+.+.... -| .+. .+|
T Consensus 226 ~~~~r~~~~~~gD~~~ll~yf~~~q~~nP~Ffy~~qlDe~~~l~niFWaD~~sr~~Y~~FGDvV~fDTTY~tN~y~-~Pf 304 (846)
T PLN03097 226 FDKGRNLGLEAGDTKILLDFFTQMQNMNSNFFYAVDLGEDQRLKNLFWVDAKSRHDYGNFSDVVSFDTTYVRNKYK-MPL 304 (846)
T ss_pred hhHHHhhhcccchHHHHHHHHHHHHhhCCCceEEEEEccCCCeeeEEeccHHHHHHHHhcCCEEEEeceeeccccC-cEE
Confidence 0000001 11222222 2233443322 23344555444444321 11 111 268
Q ss_pred hhhcccCCCCceeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHHHHhhCcccccccccchHHHHH
Q 037162 191 MKIVGVTSTDLTFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSATTLLCRWHISRNV 270 (689)
Q Consensus 191 l~~vGvd~~~~~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~~~lC~wHi~kNv 270 (689)
+.|||||+|++++++||||+.+|+.|+|.|+|++|+.+|+ ++.|++||||+|.||.+||++|||++.|++|.|||++|+
T Consensus 305 a~FvGvNhH~qtvlfGcaLl~dEt~eSf~WLf~tfl~aM~-gk~P~tIiTDqd~am~~AI~~VfP~t~Hr~C~wHI~~~~ 383 (846)
T PLN03097 305 ALFVGVNQHYQFMLLGCALISDESAATYSWLMQTWLRAMG-GQAPKVIITDQDKAMKSVISEVFPNAHHCFFLWHILGKV 383 (846)
T ss_pred EEEEEecCCCCeEEEEEEEcccCchhhHHHHHHHHHHHhC-CCCCceEEecCCHHHHHHHHHHCCCceehhhHHHHHHHH
Confidence 8999999999999999999999999999999999999995 799999999999999999999999999999999999999
Q ss_pred HHHhhhhccchhhHHHHHHHHHH-HHhcCCHHHHHHHHHHHHHhhc-CChhHHhhh
Q 037162 271 LANCKKLFETNEIWETFISSWNL-LVLAASEEEFAQRLKSMESDFS-KYPTALTYI 324 (689)
Q Consensus 271 ~~~~~~~~~~~~~~~~~~~~w~~-l~~a~t~~ef~~~~~~l~~~~~-~~p~~~~Yl 324 (689)
.++++..+... +.|...|.. |..+.|+++|+..|..|.++|. .-.+++..|
T Consensus 384 ~e~L~~~~~~~---~~f~~~f~~cv~~s~t~eEFE~~W~~mi~ky~L~~n~WL~~L 436 (846)
T PLN03097 384 SENLGQVIKQH---ENFMAKFEKCIYRSWTEEEFGKRWWKILDRFELKEDEWMQSL 436 (846)
T ss_pred HHHhhHHhhhh---hHHHHHHHHHHhCCCCHHHHHHHHHHHHHhhcccccHHHHHH
Confidence 99999877643 467777765 4568999999999999999996 334444444
No 2
>PF00872 Transposase_mut: Transposase, Mutator family; InterPro: IPR001207 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. The mutator family of transposases consists of a number of elements that include mutator from maize, IsT2 from Thiobacillus ferrooxidans, Is256 from Staphylococcus aureus, Is1201 from Lactobacillus helveticus, Is1081 from Mycobacterium bovis, IsRm3 from Rhizobium meliloti and others. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=99.80 E-value=5.8e-21 Score=208.03 Aligned_cols=232 Identities=21% Similarity=0.277 Sum_probs=190.5
Q ss_pred cccccCCHHHHHHHHHhhhCCCChHHHHHHHHhcCCCc-c--chh----hHHHHHHHHhHHhhhcch-hHH---------
Q 037162 125 SFAGRLSKEESNLLVDMSKNNVKPKDILHVLKKRDMHN-A--TTI----RAIYNARRKCKVREQAGR-SQM--------- 187 (689)
Q Consensus 125 ~~~RrLs~e~k~~I~~L~~sgv~pr~Il~~L~~~~g~~-~--~t~----kDIyN~~~k~r~~~l~g~-t~~--------- 187 (689)
+.+++.+++....|..|+..|+++++|...|+..+|.. + .++ +.+......|+.+.+++. +++
T Consensus 94 ~~y~r~~~~l~~~i~~ly~~G~Str~i~~~l~~l~g~~~~S~s~vSri~~~~~~~~~~w~~R~L~~~~y~~l~iD~~~~k 173 (381)
T PF00872_consen 94 PKYQRREDSLEELIISLYLKGVSTRDIEEALEELYGEVAVSKSTVSRITKQLDEEVEAWRNRPLESEPYPYLWIDGTYFK 173 (381)
T ss_pred chhhhhhhhhhhhhhhhhccccccccccchhhhhhcccccCchhhhhhhhhhhhhHHHHhhhccccccccceeeeeeecc
Confidence 34455577778888999999999999999999999832 2 222 334445677777777665 442
Q ss_pred ---------HHHhhhcccCCCCceeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHHHHhhCcccc
Q 037162 188 ---------QLLMKIVGVTSTDLTFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSAT 258 (689)
Q Consensus 188 ---------~~Ll~~vGvd~~~~~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~ 258 (689)
.+++.++|||.+|+..++|+.....|+.++|.-+|+.|++. |...|.+||+|..+||..||+++||++.
T Consensus 174 vr~~~~~~~~~~~v~iGi~~dG~r~vLg~~~~~~Es~~~W~~~l~~L~~R--Gl~~~~lvv~Dg~~gl~~ai~~~fp~a~ 251 (381)
T PF00872_consen 174 VREDGRVVKKAVYVAIGIDEDGRREVLGFWVGDRESAASWREFLQDLKER--GLKDILLVVSDGHKGLKEAIREVFPGAK 251 (381)
T ss_pred cccccccccchhhhhhhhhcccccceeeeecccCCccCEeeecchhhhhc--cccccceeeccccccccccccccccchh
Confidence 25789999999999999999999999999999999999886 7888999999999999999999999999
Q ss_pred cccccchHHHHHHHHhhhhccchhhHHHHHHHHHHHHhcCCHHHHHHHHHHHHHhhc-CChhHHhhhhhcchhhhhhHHH
Q 037162 259 TLLCRWHISRNVLANCKKLFETNEIWETFISSWNLLVLAASEEEFAQRLKSMESDFS-KYPTALTYIRNSSWTKVHTLLE 337 (689)
Q Consensus 259 ~~lC~wHi~kNv~~~~~~~~~~~~~~~~~~~~w~~l~~a~t~~ef~~~~~~l~~~~~-~~p~~~~Yl~~~~W~~i~~~~e 337 (689)
+|+|++|+.+|+.+++++ .+.+++..+++.|+.+.+.+++.+.++.|.++|. +||++++++++ .|+.+
T Consensus 252 ~QrC~vH~~RNv~~~v~~-----k~~~~v~~~Lk~I~~a~~~e~a~~~l~~f~~~~~~kyp~~~~~l~~-~~~~~----- 320 (381)
T PF00872_consen 252 WQRCVVHLMRNVLRKVPK-----KDRKEVKADLKAIYQAPDKEEAREALEEFAEKWEKKYPKAAKSLEE-NWDEL----- 320 (381)
T ss_pred hhhheechhhhhcccccc-----ccchhhhhhccccccccccchhhhhhhhcccccccccchhhhhhhh-ccccc-----
Confidence 999999999999999876 3446889999999999999999999999999997 99999999996 66432
Q ss_pred hhHHHHHHHHHhhheeeeccccchhhhhhhhhhcHHHHHHHHHHHHHhcc
Q 037162 338 LQLVEIKASLERSLTMVQHDFKPSIFKELREFVAMNALTMILDESRRADS 387 (689)
Q Consensus 338 ~qh~~Ik~s~~~s~~~v~~~~~~~l~~~l~g~iS~~AL~~i~~q~~r~~~ 387 (689)
.+.. .|++.++..+. |+|+|+.++.+++++.+
T Consensus 321 -------------~tf~--~fP~~~~~~i~---TTN~iEsln~~irrr~~ 352 (381)
T PF00872_consen 321 -------------LTFL--DFPPEHRRSIR---TTNAIESLNKEIRRRTK 352 (381)
T ss_pred -------------ccee--eecchhccccc---hhhhccccccchhhhcc
Confidence 2222 25555555555 88888888888888643
No 3
>PF03101 FAR1: FAR1 DNA-binding domain; InterPro: IPR004330 Phytochrome A is the primary photoreceptor for mediating various far-red light-induced responses in higher plants. It has been found that the proteins governing this response, which include FAR-RED ELONGATED HYPOCOTYL3 (FHY3) and FAR-RED-IMPAIRED RESPONSE1 (FAR1), are a pair of homologous proteins sharing significant sequence homology to mutator-like transposases. These proteins appear to be novel transcription factors, which are essential for activating the expression of FHY1 and FHL (for FHY1-like) and related genes, whose products are required for light-induced phytochrome A nuclear accumulation and subsequent light responses in plants. The FRS (FAR1 Related Sequences) family of proteins share a similar domain structure to mutator-like transposases, including an N-terminal C2H2 zinc finger domain, a central putative core transposase domain, and a C-terminal SWIM motif (named after SWI2/SNF and MuDR transposases). It seems plausible that the FRS family represent transcription factors derived from mutator-like transposases. This entry represents a domain found in FAR1 and FRS proteins. It contains a WRKY-like fold and is therefore most likely a zinc-binding DNA-binding domain.
Probab=99.78 E-value=3.4e-19 Score=154.59 Aligned_cols=90 Identities=21% Similarity=0.382 Sum_probs=79.4
Q ss_pred HHHHHHHhhcCeEEEEeecccC-CCCCccEEEEEEecCCcCCCCCCCCCCCCCcCCceeeCCceEEEEEEeecCCCeEEE
Q 037162 29 REELQTELRNGLVIVIEKSDVA-ANGRKPRIIFTCERSGVYRDRSPQGPKPIKATGIQKCKCPFKLKGQKMANNDDWALI 107 (689)
Q Consensus 29 ~~~~~yA~~~GF~v~i~rS~~~-~~g~~~~~~~~C~r~G~~r~~~~~~~~~rr~~~s~ktgCpa~i~~~~~~~~~~W~V~ 107 (689)
+||+.||..+||+|++.+|.+. .+|.+.++.|+|+++|.++.........++++.|.||||||+|.++... ++.|.|.
T Consensus 1 ~fy~~yA~~~GF~vr~~~s~~~~~~~~~~~~~~~C~r~G~~~~~~~~~~~~~r~~~s~ktgC~a~i~v~~~~-~~~w~v~ 79 (91)
T PF03101_consen 1 DFYNSYARRHGFSVRKSSSRKSKKNGEIKRVTFVCSRGGKYKSKKKNEEKRRRNRPSKKTGCKARINVKRRK-DGKWRVT 79 (91)
T ss_pred CHHHHhcCcCCeEEEEeeeEeCCCCceEEEEEEEECCcccccccccccccccccccccccCCCEEEEEEEcc-CCEEEEE
Confidence 5999999999999999998876 5788999999999999877655432456788999999999999999987 8899999
Q ss_pred EEeCcccCCCCc
Q 037162 108 VICGFHNHPATQ 119 (689)
Q Consensus 108 ~~~~~HNH~l~~ 119 (689)
.+..+|||+|.|
T Consensus 80 ~~~~~HNH~L~P 91 (91)
T PF03101_consen 80 SFVLEHNHPLCP 91 (91)
T ss_pred ECcCCcCCCCCC
Confidence 999999999975
No 4
>PF10551 MULE: MULE transposase domain; InterPro: IPR018289 This entry represents a domain found in Mutator-like elements (MULE)-encoded transposases, some of which also contain a zinc-finger motif. This domain is also found in a transposase for the insertion sequence element IS256 in transposon Tn4001.
Probab=99.67 E-value=7.2e-17 Score=140.49 Aligned_cols=78 Identities=41% Similarity=0.669 Sum_probs=74.2
Q ss_pred hhcccCCCCceeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHHHHhhCcccccccccchHHHHHH
Q 037162 192 KIVGVTSTDLTFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSATTLLCRWHISRNVL 271 (689)
Q Consensus 192 ~~vGvd~~~~~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~~~lC~wHi~kNv~ 271 (689)
.++|+|++|+.+++||+++.+|+.++|.|+|+.+++.+.. . |.+||||++.|+++||+++||++.|++|.||+.||++
T Consensus 16 ~~~~~d~~~~~~~v~~~l~~~e~~~~~~~~l~~~~~~~~~-~-p~~ii~D~~~~~~~Ai~~vfP~~~~~~C~~H~~~n~k 93 (93)
T PF10551_consen 16 IAVGIDGNGRGFPVAFALVSSESEESYEWFLEKLKEAMPQ-K-PKVIISDFDKALINAIKEVFPDARHQLCLFHILRNIK 93 (93)
T ss_pred eEEEEcCCCCEEEEEEEEEcCCChhhhHHHHHHhhhcccc-C-ceeeeccccHHHHHHHHHHCCCceEehhHHHHHHhhC
Confidence 4899999999999999999999999999999999999853 5 9999999999999999999999999999999999974
No 5
>PF08731 AFT: Transcription factor AFT; InterPro: IPR014842 AFT (activator of iron transcription) is an iron regulated transcriptional activator that regulates the expression of genes involved in iron homeostasis. This entry includes the paralogous pair of transcription factors AFT1 and AFT2.
Probab=99.58 E-value=1.3e-14 Score=125.64 Aligned_cols=91 Identities=30% Similarity=0.578 Sum_probs=79.6
Q ss_pred cCChhhHHHHHHHHHhhcCeEEEEeecccCCCCCccEEEEEEecCCcCCCCCCC--------------------CCCCCC
Q 037162 21 LMEREDMPREELQTELRNGLVIVIEKSDVAANGRKPRIIFTCERSGVYRDRSPQ--------------------GPKPIK 80 (689)
Q Consensus 21 F~S~eea~~~~~~yA~~~GF~v~i~rS~~~~~g~~~~~~~~C~r~G~~r~~~~~--------------------~~~~rr 80 (689)
|.+.+|...|++.+++.+||+|+|.||+. ..++|.|--+|.++..... ...+.+
T Consensus 1 F~~k~~ikpwlq~~~~~~Gi~iVIerSd~------~ki~FkCk~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~k~k 74 (111)
T PF08731_consen 1 FDDKDEIKPWLQKIFYPQGIGIVIERSDK------KKIVFKCKNGKRYRHKKKKKGQAQAQQKESTSGNKNKSSKKKKKK 74 (111)
T ss_pred CCchHHHHHHHHHHhhhcCceEEEEecCC------ceEEEEEecCCCcccccccccccccccccccccccccccccccCC
Confidence 89999999999999999999999999995 4799999999987764331 123346
Q ss_pred cCCceeeCCceEEEEEEeecCCCeEEEEEeCcccCCC
Q 037162 81 ATGIQKCKCPFKLKGQKMANNDDWALIVICGFHNHPA 117 (689)
Q Consensus 81 ~~~s~ktgCpa~i~~~~~~~~~~W~V~~~~~~HNH~l 117 (689)
.+.+++++|||+|++......+.|.|.++++.|||++
T Consensus 75 ~t~srk~~CPFriRA~yS~k~k~W~lvvvnn~HnH~l 111 (111)
T PF08731_consen 75 RTKSRKNTCPFRIRANYSKKNKKWTLVVVNNEHNHPL 111 (111)
T ss_pred cccccccCCCeEEEEEEEecCCeEEEEEecCCcCCCC
Confidence 6889999999999999999999999999999999996
No 6
>PF02338 OTU: OTU-like cysteine protease; InterPro: IPR003323 This is a group of proteins found primarily in viruses, eukaryotes and in the pathogenic bacterium Chlamydia pneumoniae. In viruses they are annotated as replicase or RNA-dependent RNA polymerase. The eukaryotic sequences are related to the Ovarian Tumour (OTU) gene in Drosophila, cezanne deubiquitinating peptidase and tumor necrosis factor, alpha-induced protein 3 (MEROPS peptidase family C64) and otubain 1 and otubain 2 (MEROPS peptidase family C65). None of these proteins has a known biochemical function but low sequence similarity with the polyprotein regions of arteriviruses, and conserved cysteine and histidine, and possibly the aspartate, residues suggests that those not yet recognised as peptidases could possess cysteine protease activity.; PDB: 2VFJ_C 3DKB_F 3PHW_A 3PHU_B 3PHX_A 3BY4_A 3C0R_C 3PRM_C 3PRP_C 3ZRH_A ....
Probab=99.54 E-value=5.4e-15 Score=135.21 Aligned_cols=107 Identities=25% Similarity=0.323 Sum_probs=84.9
Q ss_pred cCCCCcchHHHHHHhc----CCCccHHHHHHHHHHHHH-hhhhhhhhhccCchhHHHHHhhhcCCCCCCcccccccccch
Q 037162 564 AADGNCGFRTVADLIG----IGEDNWARVRRDLVDELQ-CHYNEYTLLLGYAGRYQELLHLLSNFEPNPSYDHWMIMPNT 638 (689)
Q Consensus 564 ~~dg~Cgfraia~~l~----~~~~~~~~vr~~l~~el~-~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~Wl~~~~~ 638 (689)
+|||||+|||||.+|+ .+++.|..||+.++++|+ .+++.|...+.++ .+ .....|.+.+++
T Consensus 1 pgDGnClF~Avs~~l~~~~~~~~~~~~~lR~~~~~~l~~~~~~~~~~~~~~~--------~~------~~~~~Wg~~~el 66 (121)
T PF02338_consen 1 PGDGNCLFRAVSDQLYGDGGGSEDNHQELRKAVVDYLRDKNRDKFEEFLEGD--------KM------SKPGTWGGEIEL 66 (121)
T ss_dssp -SSTTHHHHHHHHHHCTT-SSSTTTHHHHHHHHHHHHHTHTTTHHHHHHHHH--------HH------TSTTSHEEHHHH
T ss_pred CCCccHHHHHHHHHHHHhcCCCHHHHHHHHHHHHHHHHHhccchhhhhhhhh--------hh------ccccccCcHHHH
Confidence 5999999999999999 999999999999999999 9999988876433 34 457799999988
Q ss_pred hhhhccccceeEEEEccCc--eeeccCCCC--CCCCCCCCCcEEEEEec-----CCCc
Q 037162 639 GYLIAFKYNVIGLLISMQQ--CLTFLPLRS--IPGPRSSHKIIAIGYIY-----GCHF 687 (689)
Q Consensus 639 g~~iA~~y~~pv~~~s~~~--s~t~~P~~~--~p~~~~~~~~i~l~~~~-----~nHf 687 (689)
+++|+.|+|+|++|+... ...+.+..+ +|.. ..++|+|+|.. +|||
T Consensus 67 -~a~a~~~~~~I~v~~~~~~~~~~~~~~~~~~~~~~--~~~~i~l~~~~~l~~~~~Hy 121 (121)
T PF02338_consen 67 -QALANVLNRPIIVYSSSDGDNVVFIKFTGKYPPLE--SPPPICLCYHGHLYYTGNHY 121 (121)
T ss_dssp -HHHHHHHTSEEEEECETTTBEEEEEEESCEESTTT--TTTSEEEEEETEEEEETTEE
T ss_pred -HHHHHHhCCeEEEEEcCCCCccceeeecCccccCC--CCCeEEEEEcCCccCCCCCC
Confidence 599999999999987532 233333322 2222 25789999998 8998
No 7
>COG3328 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=99.39 E-value=4.6e-12 Score=135.94 Aligned_cols=227 Identities=20% Similarity=0.215 Sum_probs=177.0
Q ss_pred ccCCHHHHHHHHHhhhCCCChHHHHHHHHhcCCCcc------chhhHHHHHHHHhHHhhhcchhH---------------
Q 037162 128 GRLSKEESNLLVDMSKNNVKPKDILHVLKKRDMHNA------TTIRAIYNARRKCKVREQAGRSQ--------------- 186 (689)
Q Consensus 128 RrLs~e~k~~I~~L~~sgv~pr~Il~~L~~~~g~~~------~t~kDIyN~~~k~r~~~l~g~t~--------------- 186 (689)
++........|..|...|+++++|-..++..++..+ ...+.+.+.+..+..+.+ |+++
T Consensus 83 ~r~~~~~~~~v~~~y~~gv~Tr~i~~~~~~~~~~~~s~~~iS~~~~~~~e~v~~~~~r~l-~~~~~v~~D~~~~k~r~v~ 161 (379)
T COG3328 83 QRRERALDLPVLSMYAKGVTTREIEALLEELYGHKVSPSVISVVTDRLDEKVKAWQNRPL-GDYPYVYLDAKYVKVRSVR 161 (379)
T ss_pred HhhhhhHHHHHHHHHHcCCcHHHHHHHHHHhhCcccCHHHhhhHHHHHHHHHHHHHhccc-cCceEEEEecceeehhhhh
Confidence 344455566788899999999999999999876421 123455566777887777 4332
Q ss_pred HHHHhhhcccCCCCceeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHHHHhhCcccccccccchH
Q 037162 187 MQLLMKIVGVTSTDLTFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSATTLLCRWHI 266 (689)
Q Consensus 187 ~~~Ll~~vGvd~~~~~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~~~lC~wHi 266 (689)
-++++.++||+.+|+..++|+..-..|+ ..|.-+|..|+.. |......+|+|...++-+||..+||.+.+|+|..|+
T Consensus 162 ~~~~~ia~Gv~~eG~reilg~~~~~~e~-~~w~~~l~~l~~r--gl~~v~l~v~Dg~~gl~~aI~~v~p~a~~Q~C~vH~ 238 (379)
T COG3328 162 NKAVYIAIGVTEEGRREILGIWVGVRES-KFWLSFLLDLKNR--GLSDVLLVVVDGLKGLPEAISAVFPQAAVQRCIVHL 238 (379)
T ss_pred hheeeeeeccCcccchhhhceeeecccc-hhHHHHHHHHHhc--cccceeEEecchhhhhHHHHHHhccHhhhhhhhhHH
Confidence 1356789999999999999999999998 7888677777765 677788888999999999999999999999999999
Q ss_pred HHHHHHHhhhhccchhhHHHHHHHHHHHHhcCCHHHHHHHHHHHHHhhc-CChhHHhhhhhcchhhhhhHHHhhHHHHHH
Q 037162 267 SRNVLANCKKLFETNEIWETFISSWNLLVLAASEEEFAQRLKSMESDFS-KYPTALTYIRNSSWTKVHTLLELQLVEIKA 345 (689)
Q Consensus 267 ~kNv~~~~~~~~~~~~~~~~~~~~w~~l~~a~t~~ef~~~~~~l~~~~~-~~p~~~~Yl~~~~W~~i~~~~e~qh~~Ik~ 345 (689)
.+|+.++.+. .+.+.+....+.|+.+++.++....|..+.+.|. .||+++..+.+ .|..+....
T Consensus 239 ~Rnll~~v~~-----k~~d~i~~~~~~I~~a~~~e~~~~~~~~~~~~w~~~yP~i~~~~~~-~~~~~~~F~--------- 303 (379)
T COG3328 239 VRNLLDKVPR-----KDQDAVLSDLRSIYIAPDAEEALLALLAFSELWGKRYPAILKSWRN-ALEELLPFF--------- 303 (379)
T ss_pred Hhhhhhhhhh-----hhhHHHHhhhhhhhccCCcHHHHHHHHHHHHhhhhhcchHHHHHHH-HHHHhcccc---------
Confidence 9999999887 4456788889999999999999999999999997 89999999886 554322110
Q ss_pred HHHhhheeeeccccchhhhhhhhhhcHHHHHHHHHHHHHhcc
Q 037162 346 SLERSLTMVQHDFKPSIFKELREFVAMNALTMILDESRRADS 387 (689)
Q Consensus 346 s~~~s~~~v~~~~~~~l~~~l~g~iS~~AL~~i~~q~~r~~~ 387 (689)
.|+.....-+. |++||+.++.+++++.+
T Consensus 304 -----------~fp~~~r~~i~---ttN~IE~~n~~ir~~~~ 331 (379)
T COG3328 304 -----------AFPSEIRKIIY---TTNAIESLNKLIRRRTK 331 (379)
T ss_pred -----------cCcHHHHhHhh---cchHHHHHHHHHHHHHh
Confidence 12222221222 89999999999887654
No 8
>KOG2606 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=99.11 E-value=8.7e-11 Score=118.14 Aligned_cols=124 Identities=11% Similarity=0.096 Sum_probs=105.9
Q ss_pred cccccccccccccccccCCCCcchHHHHHHhcCCC---ccHHHHHHHHHHHHHhhhhhhhhhcc--------CchhHHHH
Q 037162 548 LFPSGIRPYIRGAKDVAADGNCGFRTVADLIGIGE---DNWARVRRDLVDELQCHYNEYTLLLG--------YAGRYQEL 616 (689)
Q Consensus 548 ~~~~~~~~~i~~i~~v~~dg~Cgfraia~~l~~~~---~~~~~vr~~l~~el~~~~~~y~~~~~--------~~~~~~~~ 616 (689)
.+...+.+-...++|+++||||.|+||+.+|-+.. -+-..+|+...+++..|.+++.+++- +..+|+.+
T Consensus 149 k~~~il~~~~l~~~~Ip~DG~ClY~aI~hQL~~~~~~~~~v~kLR~~~a~Ymr~H~~df~pf~~~eet~d~~~~~~f~~Y 228 (302)
T KOG2606|consen 149 KLAQILEERGLKMFDIPADGHCLYAAISHQLKLRSGKLLSVQKLREETADYMREHVEDFLPFLLDEETGDSLGPEDFDKY 228 (302)
T ss_pred HHHHHHHhccCccccCCCCchhhHHHHHHHHHhccCCCCcHHHHHHHHHHHHHHHHHHhhhHhcCccccccCCHHHHHHH
Confidence 46667788888999999999999999999998863 56789999999999999999998763 34579999
Q ss_pred HhhhcCCCCCCcccccccccchhhhhccccceeEEEEccCceeeccCCCCCCCCCCCCCcEEEEEec
Q 037162 617 LHLLSNFEPNPSYDHWMIMPNTGYLIAFKYNVIGLLISMQQCLTFLPLRSIPGPRSSHKIIAIGYIY 683 (689)
Q Consensus 617 ~~~l~~~~~~~~~~~Wl~~~~~g~~iA~~y~~pv~~~s~~~s~t~~P~~~~p~~~~~~~~i~l~~~~ 683 (689)
++.+ +.+..|++-++.+ +|++.+.+||.+|..+++ |+..++...+ .+||+|+|..
T Consensus 229 c~eI------~~t~~WGgelEL~-AlShvL~~PI~Vy~~~~p----~~~~geey~k-d~pL~lvY~r 283 (302)
T KOG2606|consen 229 CREI------RNTAAWGGELELK-ALSHVLQVPIEVYQADGP----ILEYGEEYGK-DKPLILVYHR 283 (302)
T ss_pred HHHh------hhhccccchHHHH-HHHHhhccCeEEeecCCC----ceeechhhCC-CCCeeeehHH
Confidence 9999 5679999999997 999999999999998766 6666776553 4899999874
No 9
>PF03108 DBD_Tnp_Mut: MuDR family transposase; InterPro: IPR004332 The plant MuDR transposase domain is present in plant proteins that are presumed to be the transposases for Mutator transposable elements. The function of these proteins is unknown. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=98.30 E-value=2e-06 Score=69.92 Aligned_cols=63 Identities=21% Similarity=0.458 Sum_probs=55.7
Q ss_pred ecccccCChhhHHHHHHHHHhhcCeEEEEeecccCCCCCccEEEEEEecCCcCCCCCCCCCCCCCcCCceeeCCceEEEE
Q 037162 16 VVNVALMEREDMPREELQTELRNGLVIVIEKSDVAANGRKPRIIFTCERSGVYRDRSPQGPKPIKATGIQKCKCPFKLKG 95 (689)
Q Consensus 16 ~~~~~F~S~eea~~~~~~yA~~~GF~v~i~rS~~~~~g~~~~~~~~C~r~G~~r~~~~~~~~~rr~~~s~ktgCpa~i~~ 95 (689)
-+|+.|+|.+|+..++..||..+||.+++.+|+. .++..+|. + .||||+|++
T Consensus 5 ~~G~~F~~~~e~k~av~~yai~~~~~~~v~ksd~------~r~~~~C~--~--------------------~~C~Wrv~a 56 (67)
T PF03108_consen 5 EVGQTFPSKEEFKEAVREYAIKNGFEFKVKKSDK------KRYRAKCK--D--------------------KGCPWRVRA 56 (67)
T ss_pred ccCCEECCHHHHHHHHHHHHHhcCcEEEEeccCC------EEEEEEEc--C--------------------CCCCEEEEE
Confidence 3689999999999999999999999999999984 58999996 1 169999999
Q ss_pred EEeecCCCeEE
Q 037162 96 QKMANNDDWAL 106 (689)
Q Consensus 96 ~~~~~~~~W~V 106 (689)
...+..+.|.|
T Consensus 57 s~~~~~~~~~I 67 (67)
T PF03108_consen 57 SKRKRSDTFQI 67 (67)
T ss_pred EEcCCCCEEEC
Confidence 99998888875
No 10
>KOG3288 consensus OTU-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=96.66 E-value=0.0051 Score=61.19 Aligned_cols=115 Identities=17% Similarity=0.170 Sum_probs=77.8
Q ss_pred cccccCCCCcchHHHHHHhcCCCccH-HHHHHHHHHHHHhhhhhhhhhccCchhHHHHHhhhcCCCCCCcccccccccch
Q 037162 560 AKDVAADGNCGFRTVADLIGIGEDNW-ARVRRDLVDELQCHYNEYTLLLGYAGRYQELLHLLSNFEPNPSYDHWMIMPNT 638 (689)
Q Consensus 560 i~~v~~dg~Cgfraia~~l~~~~~~~-~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~Wl~~~~~ 638 (689)
..=|+.|--|.|+||+--+......- .++|+-+.+|+-++++.|..-+-|... ++++.=+ -.++.|.+-.+.
T Consensus 112 ~~vvp~DNSCLF~ai~yv~~k~~~~~~~elR~iiA~~Vasnp~~yn~AiLgK~n-~eYc~WI------~k~dsWGGaIEl 184 (307)
T KOG3288|consen 112 RRVVPDDNSCLFTAIAYVIFKQVSNRPYELREIIAQEVASNPDKYNDAILGKPN-KEYCAWI------LKMDSWGGAIEL 184 (307)
T ss_pred EEeccCCcchhhhhhhhhhcCccCCCcHHHHHHHHHHHhcChhhhhHHHhCCCc-HHHHHHH------ccccccCceEEe
Confidence 34478899999999987776543222 589999999999999999864322211 2333333 246889998888
Q ss_pred hhhhccccceeEEEEccCceeeccCCCCCCCCCCCCCcEEEEEecCCCc
Q 037162 639 GYLIAFKYNVIGLLISMQQCLTFLPLRSIPGPRSSHKIIAIGYIYGCHF 687 (689)
Q Consensus 639 g~~iA~~y~~pv~~~s~~~s~t~~P~~~~p~~~~~~~~i~l~~~~~nHf 687 (689)
+ ||++.|++-|++++.+..-. -.+ ++..+- ..-++|.|. |-||
T Consensus 185 s-ILS~~ygveI~vvDiqt~ri--d~f-ged~~~-~~rv~llyd-GIHY 227 (307)
T KOG3288|consen 185 S-ILSDYYGVEICVVDIQTVRI--DRF-GEDKNF-DNRVLLLYD-GIHY 227 (307)
T ss_pred e-eehhhhceeEEEEecceeee--hhc-CCCCCC-CceEEEEec-cccc
Confidence 8 99999999999998643211 112 221111 234777777 7887
No 11
>PF10275 Peptidase_C65: Peptidase C65 Otubain; InterPro: IPR019400 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This family of proteins is a highly specific ubiquitin iso-peptidase that removes ubiquitin from proteins. The modification of cellular proteins by ubiquitin (Ub) is an important event that underlies protein stability and function in eukaryotes, as it is a dynamic and reversible process. Otubain carries several key conserved domains: (i) the OTU (ovarian tumour domain) in which there is an active cysteine protease triad, (ii) a nuclear localisation signal, (iii) a Ub interaction motif (UIM)-like motif phi-xx-A-xxxs-xx-Ac (where phi indicates an aromatic amino acid, x indicates any amino acid and Ac indicates an acidic amino acid), (iv) a Ub-associated (UBA)-like domain and (v) the LxxLL motif.; PDB: 4DDG_C 3VON_O 2ZFY_A 4DHZ_A 4DDI_C 1TFF_A 4DHJ_I 4DHI_B.
Probab=96.66 E-value=0.0025 Score=65.67 Aligned_cols=27 Identities=26% Similarity=0.442 Sum_probs=19.4
Q ss_pred ccccccccccccccCCCCcchHHHHHH
Q 037162 551 SGIRPYIRGAKDVAADGNCGFRTVADL 577 (689)
Q Consensus 551 ~~~~~~i~~i~~v~~dg~Cgfraia~~ 577 (689)
+.|......+..|.|||||+|||++-+
T Consensus 34 ~~L~~~y~~~R~vRGDGNCFYRAf~F~ 60 (244)
T PF10275_consen 34 KKLSQKYSGIRRVRGDGNCFYRAFGFS 60 (244)
T ss_dssp HHHHHHEEEEE-B-SSSTHHHHHHHHH
T ss_pred HHHHhhhhheEeecCCccHHHHHHHHH
Confidence 344445677888999999999999865
No 12
>KOG2605 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=96.45 E-value=0.0015 Score=70.23 Aligned_cols=123 Identities=17% Similarity=0.121 Sum_probs=82.4
Q ss_pred ccccccccccccCCCCcchHHHHHHhcCCCccHHHHHHHHHHHHHhhhhhhhhhccCchhHHHHHhhhcCCCCCCccccc
Q 037162 553 IRPYIRGAKDVAADGNCGFRTVADLIGIGEDNWARVRRDLVDELQCHYNEYTLLLGYAGRYQELLHLLSNFEPNPSYDHW 632 (689)
Q Consensus 553 ~~~~i~~i~~v~~dg~Cgfraia~~l~~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~W 632 (689)
+..|+..+.-|.+||+|.||++|+++.++.|-|..||++..++++..++.|..... ..|..+++.- .-.+-|
T Consensus 213 ~~~~g~e~~Kv~edGsC~fra~aDQvy~d~e~~~~~~~~~~dq~~~e~~~~~~~vt--~~~~~y~k~k------r~~~~~ 284 (371)
T KOG2605|consen 213 KKHFGFEYKKVVEDGSCLFRALADQVYGDDEQHDHNRRECVDQLKKERDFYEDYVT--EDFTSYIKRK------RADGEP 284 (371)
T ss_pred HHHhhhhhhhcccCCchhhhccHHHhhcCHHHHHHHHHHHHHHHhhcccccccccc--cchhhccccc------ccCCCC
Confidence 36678888999999999999999999999999999999999999998888776542 2345544444 334677
Q ss_pred ccccchhhhhc---cccceeEEEEccCceeeccCCCCCCCCCCCCCcEEEEEecCCCc
Q 037162 633 MIMPNTGYLIA---FKYNVIGLLISMQQCLTFLPLRSIPGPRSSHKIIAIGYIYGCHF 687 (689)
Q Consensus 633 l~~~~~g~~iA---~~y~~pv~~~s~~~s~t~~P~~~~p~~~~~~~~i~l~~~~~nHf 687 (689)
++-..+. .+| --+.+|++..+.+ .|-+-...++...+ ..-+++.++...||
T Consensus 285 gnhie~Q-a~a~~~~~~~~~~~~~~~~--~t~~~~~~~~~~~~-~~~~~~n~~~~~h~ 338 (371)
T KOG2605|consen 285 GNHIEQQ-AAADIYEEIEKPLNITSFK--DTCYIQTPPAIEES-VKMEKYNFWVEVHY 338 (371)
T ss_pred cchHHHh-hhhhhhhhccccceeeccc--ccceeccCcccccc-hhhhhhcccchhhh
Confidence 7776664 666 4555666666532 11111222222111 33466666555665
No 13
>KOG3991 consensus Uncharacterized conserved protein [Function unknown]
Probab=95.53 E-value=0.051 Score=53.62 Aligned_cols=20 Identities=30% Similarity=0.463 Sum_probs=16.7
Q ss_pred ccccccCCCCcchHHHHHHh
Q 037162 559 GAKDVAADGNCGFRTVADLI 578 (689)
Q Consensus 559 ~i~~v~~dg~Cgfraia~~l 578 (689)
.|....|||||.|||+|.++
T Consensus 65 ~iR~trgDGNCfyra~~~s~ 84 (256)
T KOG3991|consen 65 VIRKTRGDGNCFYRAFAYSY 84 (256)
T ss_pred hhheecCCCceehHHHHHHH
Confidence 46778999999999998654
No 14
>PF05412 Peptidase_C33: Equine arterivirus Nsp2-type cysteine proteinase; InterPro: IPR008743 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases corresponds to MEROPS peptidase family C33 (clan CA). The type example is equine arteritis virus Nsp2-type cysteine proteinase, which is involved in viral polyprotein processing.; GO: 0016032 viral reproduction, 0019082 viral protein processing
Probab=94.89 E-value=0.043 Score=47.58 Aligned_cols=32 Identities=16% Similarity=0.034 Sum_probs=22.5
Q ss_pred CCCcccccccccchhhhhccccceeEEEEccCc
Q 037162 625 PNPSYDHWMIMPNTGYLIAFKYNVIGLLISMQQ 657 (689)
Q Consensus 625 ~~~~~~~Wl~~~~~g~~iA~~y~~pv~~~s~~~ 657 (689)
.+.+.+.|++--+++++|-.. +-|+-+--++.
T Consensus 34 ~~r~~d~W~~dedl~~~iq~l-~lPat~~~~~~ 65 (108)
T PF05412_consen 34 RNRPSDDWADDEDLYQVIQSL-RLPATLDRNGA 65 (108)
T ss_pred cCCChHHccChHHHHHHHHHc-cCceeccCCCC
Confidence 345789999999999988755 66655544333
No 15
>PF03106 WRKY: WRKY DNA -binding domain; InterPro: IPR003657 The WRKY domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. The WRKY domain is found in one or two copies in a superfamily of plant transcription factors involved in the regulation of various physiological programs that are unique to plants, including pathogen defence, senescence, trichome development and the biosynthesis of secondary metabolites. The WRKY domain binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core of the W box is essential for function and WRKY binding. Some proteins known to contain a WRKY domain include Arabidopsis thaliana ZAP1 (Zinc-dependent Activator Protein-1) and AtWRKY44/TTG2, a protein involved in trichome development and anthocyanin pigmentation; and wild oat ABF1-2, two proteins involved in the gibberellic acid-induced expression of the alpha-Amy2 gene. Structural studies indicate that this domain is a four-stranded beta-sheet with a zinc binding pocket, forming a novel zinc and DNA binding structure. The WRKYGQK residues correspond to the most N-terminal beta-strand, which enables extensive hydrophobic interactions, contributing to the structural stability of the beta-sheet.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0043565 sequence-specific DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 2AYD_A 1WJ2_A 2LEX_A.
Probab=94.64 E-value=0.15 Score=40.37 Aligned_cols=56 Identities=25% Similarity=0.424 Sum_probs=37.8
Q ss_pred CeEEEEeecccCCCCCccEEEEEEecCCcCCCCCCCCCCCCCcCCceeeCCceEEEEEEeecCCCeEEEEEeCcccCC
Q 037162 39 GLVIVIEKSDVAANGRKPRIIFTCERSGVYRDRSPQGPKPIKATGIQKCKCPFKLKGQKMANNDDWALIVICGFHNHP 116 (689)
Q Consensus 39 GF~v~i~rS~~~~~g~~~~~~~~C~r~G~~r~~~~~~~~~rr~~~s~ktgCpa~i~~~~~~~~~~W~V~~~~~~HNH~ 116 (689)
||..|+--.+..++....|.+|.|+.. ||||+=.+.+..+++.-.++.+.++|||+
T Consensus 4 gy~WRKYGqK~i~g~~~pRsYYrCt~~----------------------~C~akK~Vqr~~~d~~~~~vtY~G~H~h~ 59 (60)
T PF03106_consen 4 GYRWRKYGQKNIKGSPYPRSYYRCTHP----------------------GCPAKKQVQRSADDPNIVIVTYEGEHNHP 59 (60)
T ss_dssp SS-EEEEEEEEETTTTCEEEEEEEECT----------------------TEEEEEEEEEETTCCCEEEEEEES--SS-
T ss_pred CCchhhccCcccCCCceeeEeeecccc----------------------ChhheeeEEEecCCCCEEEEEEeeeeCCC
Confidence 677765333333333456788999631 79999999988777888889999999997
No 16
>PF06782 UPF0236: Uncharacterised protein family (UPF0236); InterPro: IPR009620 This is a group of proteins of unknown function.
Probab=94.38 E-value=1.2 Score=50.34 Aligned_cols=201 Identities=15% Similarity=0.177 Sum_probs=119.5
Q ss_pred cccccccCCHHHHHHHHHhhhCCCChHHHHHHHHhcCCCccchhhHHHHHHHHhHHhhh-----------------cch-
Q 037162 123 GHSFAGRLSKEESNLLVDMSKNNVKPKDILHVLKKRDMHNATTIRAIYNARRKCKVREQ-----------------AGR- 184 (689)
Q Consensus 123 ~h~~~RrLs~e~k~~I~~L~~sgv~pr~Il~~L~~~~g~~~~t~kDIyN~~~k~r~~~l-----------------~g~- 184 (689)
+...+.|+|++.+..|..+... ++-++..+.|....+....+...|.|.+...-.... ||.
T Consensus 111 Gl~~~~R~S~~~~~~i~~~a~~-~sYr~aa~~l~~~~~~~~iS~~tV~~~v~~~g~~~~~~~~~~k~~~~~LyIEaDg~~ 189 (470)
T PF06782_consen 111 GLKKYQRISPELKEKIVELATE-MSYRKAAEILEELLGNVSISKQTVWNIVKEAGFEEIKEEEKEKKKVPVLYIEADGVH 189 (470)
T ss_pred CCCcccchhHHHHHHHHHHHhh-cCHHHHHHHHhhccCCCccCHHHHHHHHHhccchhhhccccccCCCCeEEEecCcce
Confidence 4455689999999888887644 888999999987777666777788887655532110 110
Q ss_pred hHHH----------HHhhhcc---cCCC-CceeEEEE-EEec---CCccchHHHHHHHHHHHHhcCCC-CeEEEechhHH
Q 037162 185 SQMQ----------LLMKIVG---VTST-DLTFSVCC-VYLE---SERENNYIWALERLKGVMEENML-PSVIVIDRELA 245 (689)
Q Consensus 185 t~~~----------~Ll~~vG---vd~~-~~~~~~gf-~~~~---~E~~e~~~w~l~~lk~~~~~~~~-P~viiTD~~~a 245 (689)
-+.| .++.-.| .... ++...+.- .|+. ....+-|.-+.+.+-..+.-... --++.+|....
T Consensus 190 v~~qg~~~~~~e~k~~~vheG~~~~~~~~~R~~L~n~~~f~~~~~~~~~~~~~~v~~~i~~~Y~~~~~~~iiingDGa~W 269 (470)
T PF06782_consen 190 VKLQGKKKKKKEVKLFVVHEGWEKEKPGGKRNKLKNKRHFVSGVGESAEEFWEEVLDYIYNHYDLDKTTKIIINGDGASW 269 (470)
T ss_pred ecccccccccceeeEEEEEeeeeeeeccCCcceeecchheecccccchHHHHHHHHHHHHHhcCcccceEEEEeCCCcHH
Confidence 0011 1122244 2222 22222222 3333 33345566666766666631222 24667888888
Q ss_pred HHHHHHhhCcccccccccchHHHHHHHHhhhhccchhhHHHHHHHHHHHHhcCCHHHHHHHHHHHHHhhc------CChh
Q 037162 246 LMKAIKKKFPSATTLLCRWHISRNVLANCKKLFETNEIWETFISSWNLLVLAASEEEFAQRLKSMESDFS------KYPT 319 (689)
Q Consensus 246 l~~Ai~~vFP~a~~~lC~wHi~kNv~~~~~~~~~~~~~~~~~~~~w~~l~~a~t~~ef~~~~~~l~~~~~------~~p~ 319 (689)
+.+++. .||++.+.|..||+++.+...++..- + .-...|+.| ...+...++..++.+...-. ....
T Consensus 270 Ik~~~~-~~~~~~~~LD~FHl~k~i~~~~~~~~---~---~~~~~~~al-~~~d~~~l~~~L~~~~~~~~~~~~~~~i~~ 341 (470)
T PF06782_consen 270 IKEGAE-FFPKAEYFLDRFHLNKKIKQALSHDP---E---LKEKIRKAL-KKGDKKKLETVLDTAESCAKDEEERKKIRK 341 (470)
T ss_pred HHHHHH-hhcCceEEecHHHHHHHHHHHhhhCh---H---HHHHHHHHH-HhcCHHHHHHHHHHHHHhhhchHHHHHHHH
Confidence 887766 99999999999999999998776421 1 111233333 34455666666665554332 1235
Q ss_pred HHhhhhhcchhhhh
Q 037162 320 ALTYIRNSSWTKVH 333 (689)
Q Consensus 320 ~~~Yl~~~~W~~i~ 333 (689)
+..||.+ +|..++
T Consensus 342 ~~~Yl~~-n~~~i~ 354 (470)
T PF06782_consen 342 LRKYLLN-NWDGIK 354 (470)
T ss_pred HHHHHHH-CHHHhh
Confidence 6788885 887663
No 17
>PF04684 BAF1_ABF1: BAF1 / ABF1 chromatin reorganising factor; InterPro: IPR006774 ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 in Saccharomyces cerevisiae (P14164 from SWISSPROT)). The N-terminal two thirds of the protein are necessary for DNA binding, and the N terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilise the protein structure.; GO: 0003677 DNA binding, 0006338 chromatin remodeling, 0005634 nucleus
Probab=90.16 E-value=2 Score=47.02 Aligned_cols=42 Identities=14% Similarity=0.162 Sum_probs=35.0
Q ss_pred ccccCChhhHHHHHHHHHhhcCeEEEEeecccCCCCCccEEEEEEec
Q 037162 18 NVALMEREDMPREELQTELRNGLVIVIEKSDVAANGRKPRIIFTCER 64 (689)
Q Consensus 18 ~~~F~S~eea~~~~~~yA~~~GF~v~i~rS~~~~~g~~~~~~~~C~r 64 (689)
+..|+|.|+-|..++.|....-.-|+.+.|-+ .+.++|.|.+
T Consensus 25 ~~~f~tl~~wy~v~ndyefq~rcpiilknsh~-----nkhftfachl 66 (496)
T PF04684_consen 25 ARKFPTLEAWYNVINDYEFQSRCPIILKNSHR-----NKHFTFACHL 66 (496)
T ss_pred ccCCCcHHHHHHHHhhhhhhhcCceeeccccc-----ccceEEEeec
Confidence 46899999999999999998888888777654 3578898864
No 18
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=90.15 E-value=0.14 Score=52.43 Aligned_cols=110 Identities=11% Similarity=-0.097 Sum_probs=63.2
Q ss_pred CCCCCCccccccccccccccccccccCCCCcchHHHHHHhcCCCc--c---HHHHHHHHHHHHHhhhhhhhhh-c----c
Q 037162 539 PLKPVPFITLFPSGIRPYIRGAKDVAADGNCGFRTVADLIGIGED--N---WARVRRDLVDELQCHYNEYTLL-L----G 608 (689)
Q Consensus 539 ~~~~~p~~~~~~~~~~~~i~~i~~v~~dg~Cgfraia~~l~~~~~--~---~~~vr~~l~~el~~~~~~y~~~-~----~ 608 (689)
|..-.|++-.+|....---..-+|..|||||.|.+|+++|++.-. + =...|..=......+...|..+ | +
T Consensus 152 PDl~n~~i~~~~~i~y~~~i~k~d~~~dG~ieia~iS~~l~v~i~~Vdv~~~~~dr~~~~~~~q~~~i~f~g~hfD~~t~ 231 (306)
T COG5539 152 PDLYNPAILEIDVIAYATWIVKPDSQGDGCIEIAIISDQLPVRIHVVDVDKDSEDRYNSHPYVQRISILFTGIHFDEETL 231 (306)
T ss_pred ccccchhhcCcchHHHHHhhhccccCCCceEEEeEeccccceeeeeeecchhHHhhccCChhhhhhhhhhcccccchhhh
Confidence 333344444444433332233367789999999999999998411 0 0111111112222233334432 1 1
Q ss_pred CchhHHHHHhhhcCCCCCCcccccccccchhhhhccccceeEEEEcc
Q 037162 609 YAGRYQELLHLLSNFEPNPSYDHWMIMPNTGYLIAFKYNVIGLLISM 655 (689)
Q Consensus 609 ~~~~~~~~~~~l~~~~~~~~~~~Wl~~~~~g~~iA~~y~~pv~~~s~ 655 (689)
....|+.+.+.+ -..+.|...+... .||+.+..|+-++..
T Consensus 232 ~m~~~dt~~ne~------~~~a~~g~~~ei~-qLas~lk~~~~~~nT 271 (306)
T COG5539 232 AMVLWDTYVNEV------LFDASDGITIEIQ-QLASLLKNPHYYTNT 271 (306)
T ss_pred hcchHHHHHhhh------cccccccchHHHH-HHHHHhcCceEEeec
Confidence 124567777776 4467898666654 799999999988864
No 19
>PF01610 DDE_Tnp_ISL3: Transposase; InterPro: IPR002560 Autonomous mobile genetic elements such as transposons or insertion sequences (IS) encode an enzyme, transposase, that is required for excising and inserting the mobile element. Transposases have been grouped into various families. This family includes the IS204, IS1001, IS1096 and IS1165 transposases. More information about these proteins can be found at Protein of the Month: Transposase.; GO: 0003677 DNA binding, 0004803 transposase activity, 0006313 transposition, DNA-mediated
Probab=90.13 E-value=0.2 Score=51.53 Aligned_cols=67 Identities=19% Similarity=0.196 Sum_probs=53.0
Q ss_pred EEecCCccchHHHHHHHH-HHHHhcCCCCeEEEechhHHHHHHHHhhCcccccccccchHHHHHHHHhhh
Q 037162 208 VYLESERENNYIWALERL-KGVMEENMLPSVIVIDRELALMKAIKKKFPSATTLLCRWHISRNVLANCKK 276 (689)
Q Consensus 208 ~~~~~E~~e~~~w~l~~l-k~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~~~lC~wHi~kNv~~~~~~ 276 (689)
.++.+-+.+++.-+|..+ -.. ....+++|++|...+...|+++.||+|....-.|||++++-+.+..
T Consensus 31 ~i~~~r~~~~l~~~~~~~~~~~--~~~~v~~V~~Dm~~~y~~~~~~~~P~A~iv~DrFHvvk~~~~al~~ 98 (249)
T PF01610_consen 31 DILPGRDKETLKDFFRSLYPEE--ERKNVKVVSMDMSPPYRSAIREYFPNAQIVADRFHVVKLANRALDK 98 (249)
T ss_pred EEcCCccHHHHHHHHHHhCccc--cccceEEEEcCCCccccccccccccccccccccchhhhhhhhcchh
Confidence 356666777766555544 222 2467899999999999999999999999999999999999886554
No 20
>smart00774 WRKY DNA binding domain. The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding.
Probab=89.71 E-value=0.67 Score=36.43 Aligned_cols=28 Identities=25% Similarity=0.385 Sum_probs=23.3
Q ss_pred CCceEEEEEEeecCCCeEEEEEeCcccC
Q 037162 88 KCPFKLKGQKMANNDDWALIVICGFHNH 115 (689)
Q Consensus 88 gCpa~i~~~~~~~~~~W~V~~~~~~HNH 115 (689)
||||+=.+.+..+++.-.+..+.++|||
T Consensus 32 ~C~a~K~Vq~~~~d~~~~~vtY~g~H~h 59 (59)
T smart00774 32 GCPAKKQVQRSDDDPSVVEVTYEGEHTH 59 (59)
T ss_pred CCCCcccEEEECCCCCEEEEEEeeEeCC
Confidence 7998877777766677888889999998
No 21
>PF04500 FLYWCH: FLYWCH zinc finger domain; InterPro: IPR007588 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. 
C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets. This entry represents a potential FLYWCH Zn-finger domain found in a number of eukaryotic proteins. FLYWCH is a C2H2-type zinc finger characterised by five conserved hydrophobic residues, containing the conserved sequence motif: F/Y-X(n)-L-X(n)-F/Y-X(n)-WXCX(6-12)CX(17-22)HXH where X indicates any amino acid. This domain was first characterised in Drosophila Modifier of mdg4 proteins, Mod(mdg4), putative chromatin modulators involved in higher order chromatin domains. Mod(mdg4) proteins share a common N-terminal BTB/POZ domain, but differ in their C-terminal region, most containing C-terminal FLYWCH zinc finger motifs. The FLYWCH domain in Mod(mdg4) proteins has a putative role in protein-protein interactions; for example, Mod(mdg4)-67.2 interacts with DNA-binding protein Su(Hw) via its FLYWCH domain. FLYWCH domains have been described in other proteins as well, including suppressor of killer of prune, Su(Kpn), which contains 4 terminal FLYWCH zinc finger motifs in a tandem array and a C-terminal glutathione S-transferase (GST) domain. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; PDB: 2RPR_A.
Probab=86.75 E-value=1.6 Score=33.95 Aligned_cols=25 Identities=28% Similarity=0.391 Sum_probs=11.4
Q ss_pred eCCceEEEEEEeecCCCeEEEEEeCcccC
Q 037162 87 CKCPFKLKGQKMANNDDWALIVICGFHNH 115 (689)
Q Consensus 87 tgCpa~i~~~~~~~~~~W~V~~~~~~HNH 115 (689)
.+|+|+|... .+.-.+.....+|||
T Consensus 38 ~~C~a~~~~~----~~~~~~~~~~~~HnH 62 (62)
T PF04500_consen 38 HGCRARLITD----AGDGRVVRTNGEHNH 62 (62)
T ss_dssp S----EEEEE------TTEEEE-S---SS
T ss_pred CCCeEEEEEE----CCCCEEEECCCccCC
Confidence 5899999998 234566667789999
No 22
>PF15299 ALS2CR8: Amyotrophic lateral sclerosis 2 chromosomal region candidate gene 8
Probab=84.83 E-value=4.9 Score=40.82 Aligned_cols=99 Identities=17% Similarity=0.183 Sum_probs=62.0
Q ss_pred CCCcCCceeeCCceEEEEEEeec-----------------------------------CCCeEEEE-E--eCcc-cCCCC
Q 037162 78 PIKATGIQKCKCPFKLKGQKMAN-----------------------------------NDDWALIV-I--CGFH-NHPAT 118 (689)
Q Consensus 78 ~rr~~~s~ktgCpa~i~~~~~~~-----------------------------------~~~W~V~~-~--~~~H-NH~l~ 118 (689)
.++...|.|.||||+|+++.... .+.+.+.+ + ..+| +|+..
T Consensus 69 ~~~~~~skK~~CPA~I~Ik~I~~FPdykv~~~~~~~~~~~r~~~~~~lk~~l~~~~~~~~~~r~yv~lP~~~~H~~H~~~ 148 (225)
T PF15299_consen 69 RRRSKPSKKRDCPARIYIKEIIKFPDYKVPTNSQKDTRRERRKASKKLKKALLSGKSIEGERRFYVQLPSPEEHSGHPIG 148 (225)
T ss_pred ccccccccCCCCCeEEEEEEEEEcCCcccccchhhhhHHHHHHHHHHHHHHHhcCCCCCceEEEEEECCChHhcCCCccc
Confidence 44567899999999999976321 01222222 1 3567 78877
Q ss_pred cccccccccccCCHHHHHHHHHhhhCCCCh-HHHHHHHHhc-----CC----------CccchhhHHHHHHHHhHH
Q 037162 119 QYLEGHSFAGRLSKEESNLLVDMSKNNVKP-KDILHVLKKR-----DM----------HNATTIRAIYNARRKCKV 178 (689)
Q Consensus 119 ~~~~~h~~~RrLs~e~k~~I~~L~~sgv~p-r~Il~~L~~~-----~g----------~~~~t~kDIyN~~~k~r~ 178 (689)
.... -....+.+...+.|.+|...|+.. .+|...|+.. +. ..+.|.+||.|.......
T Consensus 149 ~~~~--~~~q~~~~~v~~ki~eLv~~gv~~v~e~k~~l~~fV~~~lf~~~~~p~~~n~~y~Pt~~di~n~~~~~~~ 222 (225)
T PF15299_consen 149 QEAA--GLKQPLDPRVVEKIHELVAQGVTSVPEMKRHLKKFVEEELFKDQEPPPPTNRRYFPTDKDIRNHMYSAKK 222 (225)
T ss_pred cccc--cccccCCHHHHHHHHHHHHcccccHHHHHHHHHHHhhhhccCCCCCCCCCccccCCchHHHHHHHHHHHh
Confidence 5321 123567788888999999999754 6666666432 21 123578899998765543
No 23
>PF03050 DDE_Tnp_IS66: Transposase IS66 family; InterPro: IPR004291 Transposase proteins are necessary for efficient DNA transposition. This family includes the bacterial insertion sequence (IS) element, IS66, from Agrobacterium tumefaciens. IS66 may cause genetic and structural variations of the T region and the vir region of the octopine Ti plasmids. More information about these proteins can be found at Protein of the Month: Transposase.
Probab=76.38 E-value=3 Score=43.39 Aligned_cols=132 Identities=15% Similarity=0.194 Sum_probs=68.0
Q ss_pred CHHHHHHHHHh-hhCCCChHHHHHHHHhcCCCccchhhHHHHHHHHhHHhhhcchhHHHH-Hh--hhcccCCCCceeEEE
Q 037162 131 SKEESNLLVDM-SKNNVKPKDILHVLKKRDMHNATTIRAIYNARRKCKVREQAGRSQMQL-LM--KIVGVTSTDLTFSVC 206 (689)
Q Consensus 131 s~e~k~~I~~L-~~sgv~pr~Il~~L~~~~g~~~~t~kDIyN~~~k~r~~~l~g~t~~~~-Ll--~~vGvd~~~~~~~~g 206 (689)
++.....|.-+ ...+++-..|...+... | ..++...|.|...+.......-...+.. +. .++.+|+|+-.+.-
T Consensus 5 g~~~~a~i~~l~~~~~lp~~r~~~~~~~~-G-~~is~~ti~~~~~~~~~~l~~~~~~l~~~~~~~~~~~~DET~~~vl~- 81 (271)
T PF03050_consen 5 GPSLLALIAYLKYVYHLPLYRIQQMLEDL-G-ITISRGTIANWIKRVAEALKPLYEALKEELRSSPVVHADETGWRVLD- 81 (271)
T ss_pred CHHHHHHHHHHHhcCCCCHHHHhhhhhcc-c-eeeccchhHhHhhhhhhhhhhhhhhhhhhccccceeccCCceEEEec-
Confidence 34444444433 35678888888888877 4 4445555555544433221000000000 11 34555555444111
Q ss_pred EEEecCCccchHHHHH----------------HHHHHHHhcCCCCeEEEechhHHHHHHHHhhCcccccccccchHHHHH
Q 037162 207 CVYLESERENNYIWAL----------------ERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSATTLLCRWHISRNV 270 (689)
Q Consensus 207 f~~~~~E~~e~~~w~l----------------~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~~~lC~wHi~kNv 270 (689)
.......|.|++ +.++..++ + ...+++||+-.+-.. |.+..|+.|+-|+.|.+
T Consensus 82 ----~~~g~~~~~Wv~~~~~~v~f~~~~sR~~~~~~~~L~-~-~~GilvsD~y~~Y~~-----~~~~~hq~C~AH~~R~~ 150 (271)
T PF03050_consen 82 ----KGKGKKGYLWVFVSPEVVLFFYAPSRSSKVIKEFLG-D-FSGILVSDGYSAYNK-----LAGITHQLCWAHLRRDF 150 (271)
T ss_pred ----cccccceEEEeeeccceeeeeecccccccchhhhhc-c-cceeeeccccccccc-----ccccccccccccccccc
Confidence 111111122211 12233332 2 457999999877644 33889999999999988
Q ss_pred HHHhhh
Q 037162 271 LANCKK 276 (689)
Q Consensus 271 ~~~~~~ 276 (689)
..-...
T Consensus 151 ~~~~~~ 156 (271)
T PF03050_consen 151 QDAAES 156 (271)
T ss_pred cccccc
Confidence 766554
No 24
>PF08069 Ribosomal_S13_N: Ribosomal S13/S15 N-terminal domain; InterPro: IPR012606 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. This domain is found at the N terminus of ribosomal S13 and S15 proteins. This domain is also identified as NUC021.; GO: 0003735 structural constituent of ribosome, 0006412 translation, 0005840 ribosome; PDB: 3U5C_N 3O30_G 3IZB_O 3O2Z_G 3U5G_N 2XZN_O 2XZM_O 3IZ6_O.
Probab=71.17 E-value=5.8 Score=31.30 Aligned_cols=31 Identities=26% Similarity=0.418 Sum_probs=25.8
Q ss_pred CC-HHHHHHHHHhhhCCCChHHHHHHHHhcCC
Q 037162 130 LS-KEESNLLVDMSKNNVKPKDILHVLKKRDM 160 (689)
Q Consensus 130 Ls-~e~k~~I~~L~~sgv~pr~Il~~L~~~~g 160 (689)
++ ++..+.|..|.+.|++|++|--.|++++|
T Consensus 28 ~~~~eVe~~I~klakkG~tpSqIG~iLRD~~G 59 (60)
T PF08069_consen 28 YSPEEVEELIVKLAKKGLTPSQIGVILRDQYG 59 (60)
T ss_dssp S-HHHHHHHHHHHCCTTHCHHHHHHHHHHSCT
T ss_pred CCHHHHHHHHHHHHHcCCCHHHhhhhhhhccC
Confidence 44 45566788999999999999999999986
No 25
>PF13610 DDE_Tnp_IS240: DDE domain
Probab=67.09 E-value=0.97 Score=42.21 Aligned_cols=59 Identities=20% Similarity=0.094 Sum_probs=40.5
Q ss_pred ccCCCCceeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHHHHhhCccc
Q 037162 195 GVTSTDLTFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSA 257 (689)
Q Consensus 195 Gvd~~~~~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a 257 (689)
.||..+. ++++-+-..-+...-..||..+.+.. ...|.+|+||+..+...|+++++++.
T Consensus 23 aiD~~~~--~l~~~ls~~Rd~~aA~~Fl~~~l~~~--~~~p~~ivtDk~~aY~~A~~~l~~~~ 81 (140)
T PF13610_consen 23 AIDAEGN--ILDFYLSKRRDTAAAKRFLKRALKRH--RGEPRVIVTDKLPAYPAAIKELNPEG 81 (140)
T ss_pred eeccccc--chhhhhhhhcccccceeeccccceee--ccccceeecccCCccchhhhhccccc
Confidence 3455555 45555554455555455555555542 27899999999999999999999985
No 26
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=63.90 E-value=20 Score=37.27 Aligned_cols=107 Identities=15% Similarity=0.053 Sum_probs=67.9
Q ss_pred CCCCcchHHHHHHhcCCCccHHHHHHHHHHHHHhhhhhhhhhccCchhHHHHHhhhcCCCCCCcccccc-cccchhhhhc
Q 037162 565 ADGNCGFRTVADLIGIGEDNWARVRRDLVDELQCHYNEYTLLLGYAGRYQELLHLLSNFEPNPSYDHWM-IMPNTGYLIA 643 (689)
Q Consensus 565 ~dg~Cgfraia~~l~~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~~~~~~~~~~~Wl-~~~~~g~~iA 643 (689)
.|--|.|+|.+-.++-- +=..+|+....|..++++.|..-.-+-+ --.++..| +.++-|. +--..+ +|.
T Consensus 119 ~d~srl~q~~~~~l~~a--sv~~lrE~vs~Ev~snPDl~n~~i~~~~-~i~y~~~i------~k~d~~~dG~ieia-~iS 188 (306)
T COG5539 119 DDNSRLFQAERYSLRDA--SVAKLREVVSLEVLSNPDLYNPAILEID-VIAYATWI------VKPDSQGDGCIEIA-IIS 188 (306)
T ss_pred CchHHHHHHHHhhhhhh--hHHHHHHHHHHHHhhCccccchhhcCcc-hHHHHHhh------hccccCCCceEEEe-Eec
Confidence 45779999998887653 6778999999999999999987542222 12223333 4456666 333444 788
Q ss_pred cccceeEEEEccCceeeccCCCCCCCCCCCCCcEEEEEecCCCc
Q 037162 644 FKYNVIGLLISMQQCLTFLPLRSIPGPRSSHKIIAIGYIYGCHF 687 (689)
Q Consensus 644 ~~y~~pv~~~s~~~s~t~~P~~~~p~~~~~~~~i~l~~~~~nHf 687 (689)
+.+++-|.+....... -++-.+-+. ..-|++.|. |-||
T Consensus 189 ~~l~v~i~~Vdv~~~~---~dr~~~~~~--~q~~~i~f~-g~hf 226 (306)
T COG5539 189 DQLPVRIHVVDVDKDS---EDRYNSHPY--VQRISILFT-GIHF 226 (306)
T ss_pred cccceeeeeeecchhH---HhhccCChh--hhhhhhhhc-cccc
Confidence 8998888877754221 122222221 123777887 6777
No 27
>PF13936 HTH_38: Helix-turn-helix domain; PDB: 2W48_A.
Probab=61.88 E-value=5.8 Score=29.11 Aligned_cols=30 Identities=20% Similarity=0.320 Sum_probs=15.6
Q ss_pred ccCCHHHHHHHHHhhhCCCChHHHHHHHHh
Q 037162 128 GRLSKEESNLLVDMSKNNVKPKDILHVLKK 157 (689)
Q Consensus 128 RrLs~e~k~~I~~L~~sgv~pr~Il~~L~~ 157 (689)
++||.+++..|..|...|.+.++|...|..
T Consensus 3 ~~Lt~~eR~~I~~l~~~G~s~~~IA~~lg~ 32 (44)
T PF13936_consen 3 KHLTPEERNQIEALLEQGMSIREIAKRLGR 32 (44)
T ss_dssp ---------HHHHHHCS---HHHHHHHTT-
T ss_pred cchhhhHHHHHHHHHHcCCCHHHHHHHHCc
Confidence 579999999999999999999999988743
No 28
>PRK08561 rps15p 30S ribosomal protein S15P; Reviewed
Probab=57.57 E-value=39 Score=31.83 Aligned_cols=32 Identities=25% Similarity=0.314 Sum_probs=26.7
Q ss_pred cCCH-HHHHHHHHhhhCCCChHHHHHHHHhcCC
Q 037162 129 RLSK-EESNLLVDMSKNNVKPKDILHVLKKRDM 160 (689)
Q Consensus 129 rLs~-e~k~~I~~L~~sgv~pr~Il~~L~~~~g 160 (689)
.+++ +..+.|.+|.+.|++|++|--.|++++|
T Consensus 27 ~~~~eeve~~I~~lakkG~~pSqIG~~LRD~~g 59 (151)
T PRK08561 27 DYSPEEIEELVVELAKQGYSPSMIGIILRDQYG 59 (151)
T ss_pred cCCHHHHHHHHHHHHHCCCCHHHhhhhHhhccC
Confidence 3444 4566788999999999999999999986
No 29
>KOG4345 consensus NF-kappa B regulator AP20/Cezanne [Signal transduction mechanisms]
Probab=44.34 E-value=10 Score=43.66 Aligned_cols=49 Identities=16% Similarity=0.179 Sum_probs=36.9
Q ss_pred hhhhhccccceeEEEEcc-----C---------ceeeccCCCCCCCCCCCCCcEEEEEecCCCcc
Q 037162 638 TGYLIAFKYNVIGLLISM-----Q---------QCLTFLPLRSIPGPRSSHKIIAIGYIYGCHFI 688 (689)
Q Consensus 638 ~g~~iA~~y~~pv~~~s~-----~---------~s~t~~P~~~~p~~~~~~~~i~l~~~~~nHfv 688 (689)
|-+++|+...||||+++. . .-..|+||-.|+.-+.. -||.|+|. .-||+
T Consensus 225 hifvl~~ilRrpivvvsd~mlR~s~~~sfap~~~ggiylpLe~p~~~c~r-~pLvl~yd-~~hf~ 287 (774)
T KOG4345|consen 225 HIFVLAHILRRPIVVVSDTMLRDSGGESFAPIPVGGIYLPLEVPAQECHR-SPLVLAYD-QAHFS 287 (774)
T ss_pred HHHHHHHHhhCCeeEecccccccCCCcccccCccCceEEeccCchhhccc-chhhhhhH-hhhhh
Confidence 678899999999999962 1 23567888888866543 47889988 47775
No 30
>PTZ00072 40S ribosomal protein S13; Provisional
Probab=40.35 E-value=61 Score=30.29 Aligned_cols=31 Identities=23% Similarity=0.347 Sum_probs=26.0
Q ss_pred CCH-HHHHHHHHhhhCCCChHHHHHHHHhcCC
Q 037162 130 LSK-EESNLLVDMSKNNVKPKDILHVLKKRDM 160 (689)
Q Consensus 130 Ls~-e~k~~I~~L~~sgv~pr~Il~~L~~~~g 160 (689)
.++ +..+.|..|.+.|++|++|--.|++++|
T Consensus 25 ~~~eeVe~~I~klaKkG~~pSqIG~iLRD~~g 56 (148)
T PTZ00072 25 LSSSEVEDQICKLAKKGLTPSQIGVILRDSMG 56 (148)
T ss_pred CCHHHHHHHHHHHHHCCCCHhHhhhhhhhccC
Confidence 444 4566788999999999999999999994
No 31
>KOG0400 consensus 40S ribosomal protein S13 [Translation, ribosomal structure and biogenesis]
Probab=30.74 E-value=38 Score=30.88 Aligned_cols=29 Identities=14% Similarity=0.304 Sum_probs=26.1
Q ss_pred HHHHHHHHHhhhCCCChHHHHHHHHhcCC
Q 037162 132 KEESNLLVDMSKNNVKPKDILHVLKKRDM 160 (689)
Q Consensus 132 ~e~k~~I~~L~~sgv~pr~Il~~L~~~~g 160 (689)
++.++.|..|.+-|++|.+|--.|++.+|
T Consensus 31 ddvkeqI~K~akKGltpsqIGviLRDshG 59 (151)
T KOG0400|consen 31 DDVKEQIYKLAKKGLTPSQIGVILRDSHG 59 (151)
T ss_pred HHHHHHHHHHHHcCCChhHceeeeecccC
Confidence 56788899999999999999999998887
No 32
>PF03462 PCRF: PCRF domain; InterPro: IPR005139 This domain is found in peptide chain release factors. Peptide chain release factors are important for protein synthesis since they direct the termination of translation in response to the peptide chain termination codons UAG and UAA. These are structurally distinct but both contain the PCRF domain.; GO: 0016149 translation release factor activity, codon specific, 0006415 translational termination, 0005737 cytoplasm; PDB: 3D5A_X 3D5C_X 3MR8_V 3MS0_V 3F1G_X 3F1E_X 1ZBT_A 2IHR_1 2X9R_Y 2X9T_Y ....
Probab=29.49 E-value=1e+02 Score=27.70 Aligned_cols=42 Identities=14% Similarity=0.145 Sum_probs=29.3
Q ss_pred hhHHHHHHHHHhhcCeEEEEeecccCCCCCccEEEEEEecCC
Q 037162 25 EDMPREELQTELRNGLVIVIEKSDVAANGRKPRIIFTCERSG 66 (689)
Q Consensus 25 eea~~~~~~yA~~~GF~v~i~rS~~~~~g~~~~~~~~C~r~G 66 (689)
.++.+.|..||...||.+.+.....+.-|.++..++.=+-.|
T Consensus 66 ~~L~~MY~~~a~~~gw~~~~l~~~~~~~~G~k~a~~~I~G~~ 107 (115)
T PF03462_consen 66 EELFRMYQRYAERRGWKVEVLDYSPGEEGGIKSATLEISGEG 107 (115)
T ss_dssp HHHHHHHHHHHHHTT-EEEEEEEEE-SSSSEEEEEEEEESTT
T ss_pred HHHHHHHHHHHHHcCCEEEEEecCCCCccceeEEEEEEEcCC
Confidence 578899999999999999987665544455666666544333
No 33
>PRK09784 hypothetical protein; Provisional
Probab=29.35 E-value=51 Score=33.21 Aligned_cols=36 Identities=22% Similarity=0.345 Sum_probs=26.0
Q ss_pred cccccccccccCCCCcchHHHHHHhcCCCccHHHHHH
Q 037162 554 RPYIRGAKDVAADGNCGFRTVADLIGIGEDNWARVRR 590 (689)
Q Consensus 554 ~~~i~~i~~v~~dg~Cgfraia~~l~~~~~~~~~vr~ 590 (689)
+.|..+---|+|||-|..|||-. |...+-+|..+-.
T Consensus 196 ~~~glkyapvdgdgycllrailv-lk~h~yswal~s~ 231 (417)
T PRK09784 196 KTYGLKYAPVDGDGYCLLRAILV-LKQHDYSWALGSH 231 (417)
T ss_pred hhhCceecccCCCchhHHHHHHH-hhhcccchhhccc
Confidence 45666667799999999999974 3444567776543
No 34
>PF02796 HTH_7: Helix-turn-helix domain of resolvase; InterPro: IPR006120 Site-specific recombination plays an important role in DNA rearrangement in prokaryotic organisms. Two types of site-specific recombination are known to occur: Recombination between inverted repeats resulting in the reversal of a DNA segment. Recombination between repeat sequences on two DNA molecules resulting in their cointegration, or between repeats on one DNA molecule resulting in the excision of a DNA fragment. Site-specific recombination is characterised by a strand exchange mechanism that requires no DNA synthesis or high energy cofactor; the phosphodiester bond energy is conserved in a phospho-protein linkage during strand cleavage and re-ligation. Two unrelated families of recombinases are currently known. The first, called the 'phage integrase' family, groups a number of bacterial phage and yeast plasmid enzymes. The second, called the 'resolvase' family, groups enzymes which share the following structural characteristics: an N-terminal catalytic and dimerization domain that contains a conserved serine residue involved in the transient covalent attachment to DNA IPR006119 from INTERPRO, and a C-terminal helix-turn-helix DNA-binding domain. ; GO: 0000150 recombinase activity, 0003677 DNA binding, 0006310 DNA recombination; PDB: 1ZR2_A 2GM4_B 1RES_A 1ZR4_A 1RET_A 1GDT_B 2R0Q_C 1JKP_C 1IJW_C 1JJ6_C ....
Probab=28.98 E-value=49 Score=24.18 Aligned_cols=39 Identities=15% Similarity=0.243 Sum_probs=26.8
Q ss_pred ccCCHHHHHHHHHhhhCCCChHHHHHHHHhcCCCccchhhHHHHHH
Q 037162 128 GRLSKEESNLLVDMSKNNVKPKDILHVLKKRDMHNATTIRAIYNAR 173 (689)
Q Consensus 128 RrLs~e~k~~I~~L~~sgv~pr~Il~~L~~~~g~~~~t~kDIyN~~ 173 (689)
+.+++++.+.|..|...|++..+|...+. .....||.++
T Consensus 4 ~~~~~~~~~~i~~l~~~G~si~~IA~~~g-------vsr~TvyR~l 42 (45)
T PF02796_consen 4 PKLSKEQIEEIKELYAEGMSIAEIAKQFG-------VSRSTVYRYL 42 (45)
T ss_dssp SSSSHCCHHHHHHHHHTT--HHHHHHHTT-------S-HHHHHHHH
T ss_pred CCCCHHHHHHHHHHHHCCCCHHHHHHHHC-------cCHHHHHHHH
Confidence 45677788899999999999999988652 3455666554
No 35
>PF04800 ETC_C1_NDUFA4: ETC complex I subunit conserved region; InterPro: IPR006885 This entry represents prokaryotic NADH-ubiquinone oxidoreductase subunits (1.6.5.3 from EC, 1.6.99.3 from EC) from complex I of the electron transport chain initially identified in Neurospora crassa as a 21 kDa protein.; GO: 0016651 oxidoreductase activity, acting on NADH or NADPH, 0022900 electron transport chain, 0005743 mitochondrial inner membrane; PDB: 2JYA_A 2LJU_A.
Probab=28.37 E-value=45 Score=29.39 Aligned_cols=27 Identities=26% Similarity=0.297 Sum_probs=19.5
Q ss_pred cccccCChhhHHHHHHHHHhhcCeEEEEeec
Q 037162 17 VNVALMEREDMPREELQTELRNGLVIVIEKS 47 (689)
Q Consensus 17 ~~~~F~S~eea~~~~~~yA~~~GF~v~i~rS 47 (689)
+.+.|+|.|+|.. ||.++|....|.--
T Consensus 51 v~l~F~skE~Ai~----yaer~G~~Y~V~~p 77 (101)
T PF04800_consen 51 VRLKFDSKEDAIA----YAERNGWDYEVEEP 77 (101)
T ss_dssp CEEEESSHHHHHH----HHHHCT-EEEEE-S
T ss_pred eEeeeCCHHHHHH----HHHHcCCeEEEeCC
Confidence 4569999999986 57788888777533
No 36
>PF11427 HTH_Tnp_Tc3_1: Tc3 transposase; PDB: 1U78_A 1TC3_C.
Probab=28.05 E-value=74 Score=24.21 Aligned_cols=30 Identities=13% Similarity=0.193 Sum_probs=21.6
Q ss_pred cCCHHHHHHHHHhhhCCCChHHHHHHHHhc
Q 037162 129 RLSKEESNLLVDMSKNNVKPKDILHVLKKR 158 (689)
Q Consensus 129 rLs~e~k~~I~~L~~sgv~pr~Il~~L~~~ 158 (689)
.|++.++..|..|.+.|++..+|...|...
T Consensus 4 ~Lt~~Eqaqid~m~qlG~s~~~isr~i~RS 33 (50)
T PF11427_consen 4 TLTDAEQAQIDVMHQLGMSLREISRRIGRS 33 (50)
T ss_dssp ---HHHHHHHHHHHHTT--HHHHHHHHT--
T ss_pred cCCHHHHHHHHHHHHhchhHHHHHHHhCcc
Confidence 478999999999999999999999888654
No 37
>PF00665 rve: Integrase core domain; InterPro: IPR001584 Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus (ASV) have been studied most carefully with respect to the structural basis of catalysis. Although the active site of ASV integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis. Retroviral integrase is synthesised as part of the POL polyprotein that contains; an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group. HIV integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration. To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity.; GO: 0015074 DNA integration; PDB: 3AO3_A 3OVN_A 3AO5_A 3AO4_A 3AO1_A 1C6V_D 3HPG_A 3HPH_A 3OYD_A 3OYF_B ....
Probab=27.15 E-value=71 Score=27.92 Aligned_cols=50 Identities=18% Similarity=0.085 Sum_probs=29.6
Q ss_pred CCceeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHH
Q 037162 199 TDLTFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKA 249 (689)
Q Consensus 199 ~~~~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~A 249 (689)
....+.+++.+...++.+.+.-+|........ ...|.+|+||+..+..+.
T Consensus 33 ~~S~~~~~~~~~~~~~~~~~~~~l~~~~~~~~-~~~p~~i~tD~g~~f~~~ 82 (120)
T PF00665_consen 33 DYSRFIYAFPVSSKETAEAALRALKRAIEKRG-GRPPRVIRTDNGSEFTSH 82 (120)
T ss_dssp TTTTEEEEEEESSSSHHHHHHHHHHHHHHHHS--SE-SEEEEESCHHHHSH
T ss_pred CCCCcEEEEEeecccccccccccccccccccc-cccceecccccccccccc
Confidence 34456667777666555555555554333321 222999999999888643
No 38
>PF03461 TRCF: TRCF domain; InterPro: IPR005118 This domain is found in proteins necessary for strand-specific repair in DNA such as TRCF in Escherichia coli. A lesion in the template strand blocks the RNA polymerase complex (RNAP). The RNAP-DNA-RNA complex is specifically recognised by the transcription-repair-coupling factor (TRCF) which releases RNAP and the truncated transcript.; GO: 0003684 damaged DNA binding, 0004386 helicase activity, 0005524 ATP binding, 0006281 DNA repair; PDB: 2QSR_A 2EYQ_A.
Probab=27.02 E-value=1.1e+02 Score=26.75 Aligned_cols=40 Identities=20% Similarity=0.273 Sum_probs=29.0
Q ss_pred HHHHHHHHHHhcCCHHHHHHHHHHHHHhhcCChhHHhhhh
Q 037162 286 TFISSWNLLVLAASEEEFAQRLKSMESDFSKYPTALTYIR 325 (689)
Q Consensus 286 ~~~~~w~~l~~a~t~~ef~~~~~~l~~~~~~~p~~~~Yl~ 325 (689)
+=+..++.+..+.|.++.++...+|.+.|+..|.-++.|-
T Consensus 18 ~Rl~~Yrrl~~~~~~~el~~l~~El~DRFG~~P~ev~~L~ 57 (101)
T PF03461_consen 18 ERLELYRRLASAESEEELEDLREELIDRFGPLPEEVENLL 57 (101)
T ss_dssp HHHHHHHHHHC--SHHHHHHHHHHHHHHH-S--HHHHHHH
T ss_pred HHHHHHHHHhhCCCHHHHHHHHHHHHHHcCCCcHHHHHHH
Confidence 3355677889999999999999999999998887776664
No 39
>TIGR03147 cyt_nit_nrfF cytochrome c nitrite reductase, accessory protein NrfF.
Probab=26.69 E-value=81 Score=28.93 Aligned_cols=34 Identities=9% Similarity=0.093 Sum_probs=30.2
Q ss_pred cCCHHHHHHHHHhhhCCCChHHHHHHHHhcCCCc
Q 037162 129 RLSKEESNLLVDMSKNNVKPKDILHVLKKRDMHN 162 (689)
Q Consensus 129 rLs~e~k~~I~~L~~sgv~pr~Il~~L~~~~g~~ 162 (689)
.+..+.+.+|.++...|.+..+|.+.|.++||+.
T Consensus 57 ~iA~dmR~~Vr~~i~~G~Sd~eI~~~~v~RYG~~ 90 (126)
T TIGR03147 57 PIAYDLRHEVYSMVNEGKSNQQIIDFMTARFGDF 90 (126)
T ss_pred HHHHHHHHHHHHHHHcCCCHHHHHHHHHHhcCCe
Confidence 3556788999999999999999999999999974
No 40
>KOG4825 consensus Component of synaptic membrane glycine-, glutamate- and thienylcyclohexylpiperidine-binding glycoprotein (43kDa) [Signal transduction mechanisms]
Probab=26.44 E-value=1.4e+02 Score=33.09 Aligned_cols=29 Identities=21% Similarity=0.217 Sum_probs=22.6
Q ss_pred ccCCCCCCCCccccCcccCCCCccccccc
Q 037162 479 VKGKTRGRPSLKAYTSARRNPSKFEYVLS 507 (689)
Q Consensus 479 ~k~~tkg~p~~~~~~st~r~ps~~e~~~~ 507 (689)
.+.-+.|||.--...++.|.||.||.--+
T Consensus 284 pqleepgrenqfaepflqekpsswelpIr 312 (666)
T KOG4825|consen 284 PQLEEPGRENQFAEPFLQEKPSSWELPIR 312 (666)
T ss_pred ccccCCCCccccccchhhcCCCcceeecc
Confidence 34557788877777899999999997643
No 41
>PLN03097 FHY3 Protein FAR-RED ELONGATED HYPOCOTYL 3; Provisional
Probab=26.09 E-value=4.1e+02 Score=32.56 Aligned_cols=28 Identities=11% Similarity=-0.007 Sum_probs=19.2
Q ss_pred hhhhhhh-ccCCCcCCCCCcccccccccc
Q 037162 398 PEIAEYK-REGRPIPLSSLHSHRKKLDLL 425 (689)
Q Consensus 398 h~i~~~l-~~~~~l~~~~~H~~W~~l~~~ 425 (689)
|.|.-+. .+=..||..-|=.+|.+.--.
T Consensus 595 HaLkVL~~~~v~~IP~~YILkRWTKdAK~ 623 (846)
T PLN03097 595 HALVVLQMCQLSAIPSQYILKRWTKDAKS 623 (846)
T ss_pred hHHHHHhhcCcccCchhhhhhhchhhhhh
Confidence 6666544 344569999999999965443
No 42
>COG3316 Transposase and inactivated derivatives [DNA replication, recombination, and repair]
Probab=22.81 E-value=1.9e+02 Score=29.09 Aligned_cols=108 Identities=16% Similarity=0.124 Sum_probs=60.6
Q ss_pred hCCCChHHHHHHHHhcCCCccchhhHHHHHHHHhHHhhh--------c-ch------hH------HHHHhhhcccCCCCc
Q 037162 143 KNNVKPKDILHVLKKRDMHNATTIRAIYNARRKCKVREQ--------A-GR------SQ------MQLLMKIVGVTSTDL 201 (689)
Q Consensus 143 ~sgv~pr~Il~~L~~~~g~~~~t~kDIyN~~~k~r~~~l--------~-g~------t~------~~~Ll~~vGvd~~~~ 201 (689)
..+++-+.+.+.|.+.. ....-..|+...+++-.... . ++ +- -..|..+ ||.+|.
T Consensus 23 ~~~Ls~r~v~e~l~~rg--i~v~h~Ti~rwv~k~~~~~~~~~~~r~~~~~~~w~vDEt~ikv~gkw~ylyrA--id~~g~ 98 (215)
T COG3316 23 RYGLSLRDVEEMLAERG--IEVDHETIHRWVQKYGPLLARRLKRRKRKAGDSWRVDETYIKVNGKWHYLYRA--IDADGL 98 (215)
T ss_pred hcchhhccHHHHHHHcC--cchhHHHHHHHHHHHhHHHHHHhhhhccccccceeeeeeEEeeccEeeehhhh--hccCCC
Confidence 44788888888777664 33344555555444322111 0 00 00 0012222 344444
Q ss_pred eeEEEEEEecCCccchHHHHHHHHHHHHhcCCCCeEEEechhHHHHHHHHhhCccccc
Q 037162 202 TFSVCCVYLESERENNYIWALERLKGVMEENMLPSVIVIDRELALMKAIKKKFPSATT 259 (689)
Q Consensus 202 ~~~~gf~~~~~E~~e~~~w~l~~lk~~~~~~~~P~viiTD~~~al~~Ai~~vFP~a~~ 259 (689)
+ +.+-+...-+...-.-||..+++. ...|.+|+||+.+....|+.++-+...|
T Consensus 99 ~--Ld~~L~~rRn~~aAk~Fl~kllk~---~g~p~v~vtDka~s~~~A~~~l~~~~eh 151 (215)
T COG3316 99 T--LDVWLSKRRNALAAKAFLKKLLKK---HGEPRVFVTDKAPSYTAALRKLGSEVEH 151 (215)
T ss_pred e--EEEEEEcccCcHHHHHHHHHHHHh---cCCCceEEecCccchHHHHHhcCcchhe
Confidence 3 445454444444444555555554 3789999999999999999999885543
No 43
>PF07506 RepB: RepB plasmid partitioning protein; InterPro: IPR011111 This family includes proteins with sequence similarity to the RepB partitioning protein of the large Ti (tumour-inducing) plasmids of Agrobacterium tumefaciens.
Probab=22.71 E-value=2.4e+02 Score=27.48 Aligned_cols=62 Identities=18% Similarity=0.274 Sum_probs=50.7
Q ss_pred ccccccCCCCcchHHHHHHhcCCCccHHHHHHHHHHHHHhhhhhhhhhccCchhHHHHHhhhc
Q 037162 559 GAKDVAADGNCGFRTVADLIGIGEDNWARVRRDLVDELQCHYNEYTLLLGYAGRYQELLHLLS 621 (689)
Q Consensus 559 ~i~~v~~dg~Cgfraia~~l~~~~~~~~~vr~~l~~el~~~~~~y~~~~~~~~~~~~~~~~l~ 621 (689)
+.++.-.|+.||..+|..+-+++.+.|..+.+.|+.. ......|......+.+|+.+++.+.
T Consensus 53 e~~~ll~~~~~~~~~ig~A~~igr~Rw~ela~l~~~~-~~~~~~~~~~~~s~~rf~~l~~~l~ 114 (185)
T PF07506_consen 53 EEVRLLADVEVGIIAIGPAPKIGRPRWIELAELMIAA-NNFTSAYFRALLSDTRFEALVRALR 114 (185)
T ss_pred HHHHHHhhccccHHHHHHHHHCCCcCHHHHHHHHHHH-HhhhHHHHHhcccCCcHHHHHHHHh
Confidence 5577889999999999999999999999999999332 3333456666668899999999993
No 44
>PRK10144 formate-dependent nitrite reductase complex subunit NrfF; Provisional
Probab=20.79 E-value=1.3e+02 Score=27.70 Aligned_cols=34 Identities=9% Similarity=0.050 Sum_probs=30.1
Q ss_pred cCCHHHHHHHHHhhhCCCChHHHHHHHHhcCCCc
Q 037162 129 RLSKEESNLLVDMSKNNVKPKDILHVLKKRDMHN 162 (689)
Q Consensus 129 rLs~e~k~~I~~L~~sgv~pr~Il~~L~~~~g~~ 162 (689)
.+..+.+.+|.++...|.+..+|.+.|.++||+.
T Consensus 57 ~iA~dmR~~Vr~~i~~G~sd~eI~~~~v~RYG~~ 90 (126)
T PRK10144 57 PVAVSMRHQVYSMVAEGKSEVEIIGWMTERYGDF 90 (126)
T ss_pred HHHHHHHHHHHHHHHcCCCHHHHHHHHHHhcCCe
Confidence 3456788899999999999999999999999974
No 45
>TIGR03277 methan_mark_9 putative methanogenesis marker domain 9. A gene for a protein that contains a copy of this domain, to date, is found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it. A 69-amino acid core region of this 110-amino acid domain contains eight invariant Cys residues, including two copies of a motif [WFY]CCxxKPC. These motifs could be consistent with predicted metal-binding transcription factor as was suggested for the COG4008 family. Some members of this family have an additional N-terminal domain of about 250 amino acids from the nifR3 family of predicted TIM-barrel proteins.
Probab=20.40 E-value=82 Score=27.75 Aligned_cols=30 Identities=23% Similarity=0.598 Sum_probs=27.1
Q ss_pred cchH-HHHHHhcCCCccHHHHHHHHHHHHHh
Q 037162 569 CGFR-TVADLIGIGEDNWARVRRDLVDELQC 598 (689)
Q Consensus 569 Cgfr-aia~~l~~~~~~~~~vr~~l~~el~~ 598 (689)
|-|| ..-..+|++.+.+..+.+++.+||..
T Consensus 78 Cplrd~aL~~igls~~EYm~lKkelae~i~~ 108 (109)
T TIGR03277 78 CPLRDSALQRIGMSPEEYMELKKKLAEELLK 108 (109)
T ss_pred CcCchHHHHHcCCCHHHHHHHHHHHHHHHhc
Confidence 8888 57789999999999999999999875
Done!