Query 003292
Match_columns 833
No_of_seqs 235 out of 463
Neff 5.3
Searched_HMMs 46136
Date Thu Mar 28 20:44:06 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/003292.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/003292hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 TIGR00605 rad4 DNA repair prot 100.0 5E-122 1E-126 1070.7 40.6 544 68-747 137-713 (713)
2 KOG2179 Nucleotide excision re 100.0 6E-100 1E-104 862.6 32.4 578 70-747 82-669 (669)
3 COG5535 RAD4 DNA repair protei 100.0 3.4E-85 7.3E-90 726.3 16.7 484 71-748 127-650 (650)
4 PF10405 BHD_3: Rad4 beta-hair 100.0 1.6E-32 3.4E-37 240.1 7.6 75 631-705 1-76 (76)
5 PF03835 Rad4: Rad4 transgluta 99.9 2E-23 4.4E-28 204.0 5.1 109 378-504 29-145 (145)
6 PF10403 BHD_1: Rad4 beta-hair 99.8 5.5E-20 1.2E-24 152.8 0.8 51 509-559 3-57 (57)
7 KOG0909 Peptide:N-glycanase [P 99.7 5.6E-18 1.2E-22 185.6 8.4 83 385-473 250-334 (500)
8 PF10404 BHD_2: Rad4 beta-hair 99.6 2.8E-17 6.2E-22 139.9 1.1 64 561-624 1-64 (64)
9 TIGR00605 rad4 DNA repair prot 99.2 5.5E-13 1.2E-17 157.9 -5.8 150 48-202 13-178 (713)
10 PF01841 Transglut_core: Trans 97.0 0.00061 1.3E-08 62.6 4.2 65 128-194 16-80 (113)
11 KOG2179 Nucleotide excision re 96.0 0.0041 8.9E-08 73.7 2.9 114 87-203 1-120 (669)
12 smart00460 TGc Transglutaminas 95.6 0.0079 1.7E-07 50.6 2.3 30 161-190 2-31 (68)
13 TIGR00598 rad14 DNA repair pro 94.9 0.17 3.7E-06 51.6 9.8 35 797-833 137-172 (172)
14 COG1305 Transglutaminase-like 93.0 0.097 2.1E-06 55.9 4.1 85 95-190 133-219 (319)
15 COG5145 RAD14 DNA excision rep 87.9 1.2 2.5E-05 47.0 6.3 35 798-833 258-292 (292)
16 KOG4017 DNA excision repair pr 86.5 2 4.4E-05 46.0 7.2 34 798-833 241-274 (274)
17 smart00460 TGc Transglutaminas 84.7 0.9 1.9E-05 38.0 3.0 21 382-407 47-67 (68)
18 PF01841 Transglut_core: Trans 84.4 0.82 1.8E-05 41.8 2.9 20 383-406 94-113 (113)
19 PF13369 Transglut_core2: Tran 74.2 4 8.6E-05 40.6 4.2 36 155-190 54-89 (152)
20 COG5216 Uncharacterized conser 59.2 3.9 8.5E-05 34.8 0.6 28 806-833 9-38 (67)
21 COG5535 RAD4 DNA repair protei 56.1 3.1 6.8E-05 49.4 -0.6 161 21-199 42-228 (650)
22 PF12677 DUF3797: Domain of un 52.4 6 0.00013 32.5 0.6 14 813-828 35-48 (49)
23 PRK10941 hypothetical protein; 38.5 36 0.00079 37.4 4.2 62 129-190 58-120 (269)
24 COG1305 Transglutaminase-like 34.2 27 0.00058 37.2 2.3 25 383-411 239-263 (319)
25 PF05207 zf-CSL: CSL zinc fing 27.6 37 0.00081 28.5 1.6 25 807-831 6-30 (55)
26 PF00797 Acetyltransf_2: N-ace 25.5 67 0.0014 33.8 3.4 32 159-190 39-71 (240)
27 PF11702 DUF3295: Protein of u 24.3 34 0.00074 40.6 1.0 11 21-31 307-317 (507)
28 PF09082 DUF1922: Domain of un 23.2 41 0.00089 29.7 1.1 18 816-833 17-34 (68)
29 PF14402 7TM_transglut: 7 tran 22.1 50 0.0011 37.0 1.7 27 383-414 30-56 (313)
30 PRK15047 N-hydroxyarylamine O- 21.2 1.1E+02 0.0024 33.8 4.2 32 158-190 58-91 (281)
No 1
>TIGR00605 rad4 DNA repair protein rad4. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).
Probab=100.00 E-value=4.9e-122 Score=1070.74 Aligned_cols=544 Identities=29% Similarity=0.446 Sum_probs=423.0
Q ss_pred ccChHHHHHHHHHHHHHHHHHHhhhHHHHhhcCc-HHHHHHhhcccccccccccCcc---------cccccccchhhhhh
Q 003292 68 RASAEDKELAELVHKVHLLCLLARGRLIDSVCDD-PLIQASLLSLLPSYLLKISEVS---------KLTANALSPIVSWF 137 (833)
Q Consensus 68 R~t~~eKe~~~~~HKvHLLCLLAhg~~rN~~Cnd-~~lqa~llSllp~~~~~~~~~~---------~~~~~~L~~lv~Wf 137 (833)
-++.++|+.|+++|++||||||.|+.+||.|||| +++|+.|..++|.++....++. ....++|..|+.-|
T Consensus 137 ~~~~~eR~~R~~iH~~~ll~ll~h~~~RN~w~n~~~~~~~~L~~~~p~k~~~~l~p~~~~~~~~~s~s~~~~~~~l~~~~ 216 (713)
T TIGR00605 137 VCSNEARKDRKYIHILYLLCLMVHLFTRNEWSLSAPLKSAKLSNLIPEKVRLLLHPSVRKSEELPSRSLRGLRKPLVEKL 216 (713)
T ss_pred hccHHHHHHHHHHHHHHHHHHhhhhHhhhhhhCChHHHhHHHhhhCCHHHHHhcCccccccccccchhhhhhhHHHHHhh
Confidence 3789999999999999999999999999999999 7999999999999987654432 23467888999999
Q ss_pred ccceeecccCccc---c----------ch--hhHHHHHHHHhcCCHHHHHHHHHHHHHhcCCceEEEEeeccccCCcccc
Q 003292 138 HDNFHVRSSVSTR---R----------SF--HSDLAHALESREGTPEEIAALSVALFRALKLTTRFVSILDVASLKPEAD 202 (833)
Q Consensus 138 ~~~F~v~~~~~~~---~----------~~--~~~l~~~l~~~~Gs~de~a~LF~aLlRaLgl~~RLV~sLqp~pLkp~~~ 202 (833)
+..|.++..-... . .+ ..++..+..+++|||+.++||||||||++|+.+||||||||+|+.....
T Consensus 217 kk~~~it~~g~~~~~~~~~~~~~~~~~~~~~~~ef~~~a~~~~gsrd~~aql~~allr~~~~~~rlv~slqpl~~~~~~~ 296 (713)
T TIGR00605 217 KKCMETWQKGLRKTTKGLLKLLNGGRYSRSKWEEIEKSSNRKLGGRKYRTLKRGSILENLNVPTRLVFSDFLLSVSKGHN 296 (713)
T ss_pred hhcchhcccccccCcccccccchhhHHhhhhHHHHHHhhhccccccchhhhHHHHHHhhhcccccccccccccCcccCCC
Confidence 9999998643100 0 00 2345667778899999999999999999999999999999998865311
Q ss_pred cccCCCCCCCCcCCCccCCCCccccCccccccCCCccccccccccccccCCCCCCCCCCCCCCCCCCCCCCCCCcccccC
Q 003292 203 KNVSSNQDSSRVGGGIFNAPTLMVAKPEEVLASPVKSFSCDKKENVCETSSKGSPECKYSSPKSNNTQSKKSPVSCELSS 282 (833)
Q Consensus 203 k~~~~~~~~~~~~~~~~~~~~p~~~k~~~~~~~~~k~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~s~~~~~~~~~~~ 282 (833)
... +..+. ..+ +... + + + +
T Consensus 297 ~~~---------------------~~~~~--~~~-~~~~-------~---------------------~----~----~- 315 (713)
T TIGR00605 297 DPE---------------------ISSEG--FVP-KLSA-------C---------------------N----A----N- 315 (713)
T ss_pred Ccc---------------------ccccc--ccc-cccc-------c---------------------c----c----c-
Confidence 000 00000 000 0000 0 0 0 0
Q ss_pred CCCCCCCCcccCCCcccCCCccchhhhccCCChHHHHHHHHHhhccccccCCCCccccccccCCCCCCcccccccccccc
Q 003292 283 GNLDPSSSMACSDISEACHPKEKSQALKRKGDLEFEMQLEMALSATNVATSKSNICSDVKDLNSNSSTVLPVKRLKKIES 362 (833)
Q Consensus 283 ~~~~~~~~~~~~~~~~~~~~~~~~~~~kr~g~~~~~~q~~~a~~~t~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~k~~~ 362 (833)
+ .+..+ ..++.........++.+.|.|. +++.+.+..+.. .
T Consensus 316 ----q---~~~~s-~~S~~~tsR~~l~~~l~~P~fs------------------------------~~~~~~k~~~~~-~ 356 (713)
T TIGR00605 316 ----Q---RLIMS-CESADRTSRFRMKKDPSLPGFS------------------------------AYSDMDKSPIFT-C 356 (713)
T ss_pred ----c---ccccc-cCCCCccccccccccCCCCCcc------------------------------ccccCCCCCccc-h
Confidence 0 00000 0000000000012233332210 000000100000 0
Q ss_pred CCCCCccCCcccccCCccCCCCceEEEEEecCCCCCCceEEEeccc-cccccch-hhhHhHhhcCCCeeEEEEEcCCC-c
Q 003292 363 GESSTSCLGISTAVGSRKVGAPLYWAEVYCSGENLTGKWVHVDAAN-AIIDGEQ-KVEAAAAACKTSLRYIVAFAGCG-A 439 (833)
Q Consensus 363 ~~s~~~~~~~s~~~~~~~~~~P~fWvEV~~~~~~~~~kWI~VDPv~-~~Vd~~~-~~Ep~~~~~~~~msYVVAfd~dG-a 439 (833)
++ .....+.+++||+||+|||++. +++||||||++ ++++++. .+|| +.++|+|||||++|| |
T Consensus 357 ~~--------~~~~~~~~~~~p~~W~Ev~~~~---~~rWI~VD~~~~~~~~~~~~~~e~----~~~~m~YVvAf~~d~~~ 421 (713)
T TIGR00605 357 EE--------GDKFIDRWITYVDFWVEVFIEQ---EEKWVCVDAVHSGVVPKGVTCFEP----ATLMMTYVFAYDRDGYV 421 (713)
T ss_pred hc--------ccccccccCCCCeeEEEEeecc---cceeEEeccccccccCCchhhccC----CCCceEEEEEEcCCCce
Confidence 00 0112346788999999999985 49999999999 8888875 3443 569999999999997 9
Q ss_pred ccchhhhHhhHhh-hcccccCHHHHHHH-HhhhhhcccCCCCCCcccccccCcccccCCchHHHHHHHHhccCCCCcChH
Q 003292 440 KDVTRRYCMKWYR-IASKRVNSAWWDAV-LAPLRELESGATGDLNVESSAKDSFVADRNSLEDMELETRALTEPLPTNQQ 517 (833)
Q Consensus 440 kDVTrRY~~~~~~-~~rkRv~~~Ww~~~-L~~~~~~~~~~~g~~~i~a~~~~~~~~~rd~~Ed~El~~~~~~e~mP~si~ 517 (833)
+|||+||+++|++ +++.|++..||..+ |++|.+... +.. ..+|..||.||.+++++||||+|++
T Consensus 422 kDVT~RY~~~~~~k~r~~Rv~~~w~~~~w~~~~~~~~~-------------~r~-~~~d~~Ed~el~~~~~~e~~P~si~ 487 (713)
T TIGR00605 422 KDVTRRYCDQWSTKVRKRRVEKADFGETWFRPIFGALH-------------KRK-RTIDDIEDQEFLRRHESEGIPKSIQ 487 (713)
T ss_pred eechhhHhhhhhhhhheeeecccchHHHHHHHHhhhhc-------------cCc-cchhhhhhhHhhhhhcccCCChhHH
Confidence 9999999999996 66789998888877 776653210 111 1267899999999999999999999
Q ss_pred hhhcCCchhhhhhhccccccCCC--CCcceeeccee-eeecCCccccccHHHHHHhcccccCCCcccceeccCCCCCCCC
Q 003292 518 AYKNHQLYVIERWLNKYQILYPK--GPILGFCSGHA-VYPRSCVQTLKTKERWLREALQVKANEVPVKVIKNSSKSKKGQ 594 (833)
Q Consensus 518 ~fKnHP~YvLerhL~k~EvI~P~--a~~~G~~~gE~-VY~R~dV~~LkS~e~W~r~GR~VK~gE~P~K~vk~~~~~~k~~ 594 (833)
+|||||+|||||||++||+|||+ ++++|+++|++ ||+|+||++|||+++||++||+||+||+|+|+|+.+++..+.
T Consensus 488 ~fKnHP~YvLer~L~~~Evi~P~~~~~~~g~~~g~~~VY~Rs~V~~lkS~~~W~~~GR~VK~ge~P~K~vk~r~r~~~~- 566 (713)
T TIGR00605 488 DLKNHPLYVLERHLKKTQALKPGKKACTLGFVNGKAGVYSRKDVHDLKSAEQWYKKGRVIKLGEQPYKVVKARARTVRL- 566 (713)
T ss_pred HhhcCceEEehhhcccceeeccCCCCCceeccCCCCCccchhHhhhhhhHHHHHHcCCccCCCCccceEeccccccccc-
Confidence 99999999999999999999995 46789999998 999999999999999999999999999999999987442211
Q ss_pred CCCCCCccccccccccccccccccccCCCCCCCCCCccCCCCCceEeecCCCCCCCeeeecCccHHHHHHHcCCCeeece
Q 003292 595 DFEPEDYDEVDARGNIELYGKWQLEPLRLPSAVNGIVPRNERGQVDVWSEKCLPPGTVHLRLPRVYSVAKRLEIDSAPAM 674 (833)
Q Consensus 595 ~~~~~~~~~~~~~~~~~LYg~wQTe~y~pPp~vdG~VPkN~yGNVdlf~p~MlP~G~vHi~~~~~~rvAkkLgIDyA~AV 674 (833)
..++. ..+.++|||+||||+|+|||++||+||||+|||||||+|+|||+|||||++++|.+||++||||||+||
T Consensus 567 --~~~~~----~~~~~~LY~~~QTe~y~Pppv~dG~VPkN~yGNidv~~p~MiP~G~vhi~~~~~~rvak~LgIDyA~AV 640 (713)
T TIGR00605 567 --PKGEA----EEEDLGLYSYEQTELYIPPPAVDGIVPKNAYGNIDLFVPSMIPKGAVHLRLPGAIKAAKKLNIDYAPAV 640 (713)
T ss_pred --ccccc----cccccccCCHhhCcCccCCCccCCccccCCCCCEEecCCCCCCCCcEEecCccHHHHHHHhCCCeeeee
Confidence 11111 114789999999999999999999999999999999999999999999999999999999999999999
Q ss_pred eceeecCCeeeeeEceEEEccccHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcC
Q 003292 675 VGFEFRNGRSTPVFDGIVVCAEFKDTILEAYAEEEEKREAEEKKRREAQATSRWYQLLSSIVTRQRLNNCYGN 747 (833)
Q Consensus 675 tGFeF~~g~a~PvidGIVV~ee~~e~l~~a~~~~~~~~~~~e~~k~e~~aL~~Wk~Ll~~L~IreRL~~~Yg~ 747 (833)
|||+|++|+++|||+|||||+||+++|++||.++++.++++++++++++||++|++||++|||++||+.+||.
T Consensus 641 tGFeF~~g~~~Pv~~GvVV~~e~~~~v~~a~~~~~~~~~~~e~~k~e~~aL~~Wk~ll~~LrIr~Rl~~~Yg~ 713 (713)
T TIGR00605 641 TGFDFHRGYSKPVLDGIIVCEEFREAIETAWEEIEQIQEEKEQEKHRKRALGNWKTLLKGLRIRERLKETYGK 713 (713)
T ss_pred eceeecCCceeEeeceEEEehhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCc
Confidence 9999999999999999999999999999999999999999999999999999999999999999999999995
No 2
>KOG2179 consensus Nucleotide excision repair complex XPC-HR23B, subunit XPC/DPB11 [Replication, recombination and repair]
Probab=100.00 E-value=6.4e-100 Score=862.63 Aligned_cols=578 Identities=35% Similarity=0.488 Sum_probs=414.5
Q ss_pred ChHHHHHHHHHHHHHHHHHHhhhHHHHhhcCcHHHHHHhhcccccccccccCcccccccccchhhhhhccceeecccCcc
Q 003292 70 SAEDKELAELVHKVHLLCLLARGRLIDSVCDDPLIQASLLSLLPSYLLKISEVSKLTANALSPIVSWFHDNFHVRSSVST 149 (833)
Q Consensus 70 t~~eKe~~~~~HKvHLLCLLAhg~~rN~~Cnd~~lqa~llSllp~~~~~~~~~~~~~~~~L~~lv~Wf~~~F~v~~~~~~ 149 (833)
...+.++....|-.|.+|.+-+...+|.||.+-.+-+ ++-..| .+++++....
T Consensus 82 ~~~d~~~~~~~~~l~~~~~~~~~~~~~~~~~~~r~~~-l~~~~p-------------------------~~~~~s~~p~- 134 (669)
T KOG2179|consen 82 ARDDQDLEYQFHLLDRLFMLFLLKTRNLWPDPVRLNA-LVRSKP-------------------------KKIRKSFKPS- 134 (669)
T ss_pred cccHHHHHHHHHHHhhhhHHHHHHHhcccCCcchhhH-hhhccC-------------------------cccccCCCcc-
Confidence 4578888899999999999999999999988766644 333333 3333332222
Q ss_pred ccchhhHHHHHHHHhcCCHHHHHHHHHHHHHhcCCceEEEEeeccccCCcccccccCCCCCCCCcCCCccCCCCccccCc
Q 003292 150 RRSFHSDLAHALESREGTPEEIAALSVALFRALKLTTRFVSILDVASLKPEADKNVSSNQDSSRVGGGIFNAPTLMVAKP 229 (833)
Q Consensus 150 ~~~~~~~l~~~l~~~~Gs~de~a~LF~aLlRaLgl~~RLV~sLqp~pLkp~~~k~~~~~~~~~~~~~~~~~~~~p~~~k~ 229 (833)
+...+.....+++.++-..+++.|++..+ ++.+|.+.. .-+.+|+...-..+..+.......+.+.+.+......
T Consensus 135 --s~~~s~a~~~~s~r~~~~~l~~~~~~~~g--~irt~~~~~-~~~~~k~~~~~sEse~~~~~k~~e~~~~~~~~l~~~~ 209 (669)
T KOG2179|consen 135 --SSRKSQAFKNKSRRKTLHGLVLVCLSKYG--KIRTNFLRK-NYADLKNENLISESELKKVAKNQELFSGSRPLLLKGV 209 (669)
T ss_pred --ccccchHhHhhhhhhhHHHHHHHHHHHhc--ccccchhhh-hhhhcccccCCcchhccchhhhhhhhccCchHhhhhh
Confidence 12234456667777888889999988888 899999853 5555555321111100000000001111111101100
Q ss_pred cccccCCCccccccccccccccCCCCCCCCCCCCCCCCCCCCCCCCCcccccCCCCCCCCCcccCCCcccCCCccchhhh
Q 003292 230 EEVLASPVKSFSCDKKENVCETSSKGSPECKYSSPKSNNTQSKKSPVSCELSSGNLDPSSSMACSDISEACHPKEKSQAL 309 (833)
Q Consensus 230 ~~~~~~~~k~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 309 (833)
+.. .-+++..+. . ...+.+ .. +..+. -+.++++ .++ ... -+.+ +-.+...+
T Consensus 210 ~~~-s~~k~~~~~-~-------k~~~~~---~~--~~s~~---~~~~~d~--------~~~-~~~--~~~~-p~~~a~i~ 260 (669)
T KOG2179|consen 210 EDA-SIRKKWKSK-M-------KNVSSG---TE--KLSKE---LSDGADE--------ASK-PYL--LEAV-PAHRADIR 260 (669)
T ss_pred hhh-hhhhhhcCC-c-------cccCcc---hh--hhccc---ccCCCcc--------ccc-hhh--hhcC-cHHHhhhc
Confidence 000 000000000 0 000000 00 00000 0000000 000 000 0001 10122334
Q ss_pred ccCCChHHHHHHHHHhhccccccCCCCccccccccCCCCCCccccccccccccCCCCCccCCcccccCCccCCCCceEEE
Q 003292 310 KRKGDLEFEMQLEMALSATNVATSKSNICSDVKDLNSNSSTVLPVKRLKKIESGESSTSCLGISTAVGSRKVGAPLYWAE 389 (833)
Q Consensus 310 kr~g~~~~~~q~~~a~~~t~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~k~~~~~s~~~~~~~s~~~~~~~~~~P~fWvE 389 (833)
..+|+.++..|.++|+..+..........++..+++. +. .+.|+++..++|....+ ...|+||+|
T Consensus 261 ~~~g~~d~~~q~~~~l~~~~n~~r~~~~l~p~~~~~~------~~-------~~~s~~~~~~~s~~~~~--~~~p~~W~e 325 (669)
T KOG2179|consen 261 PNKGDADVSSQIIHALLRTPNNARLAPSLQPPVFSNL------SV-------KDLSDTSLYGNSLENID--GAGPVFWLE 325 (669)
T ss_pred cCCCCcchHHHHHHHHhhccchhhcccccCCcchhhc------cc-------cccccccccccchhhcC--CcccchhHH
Confidence 4489999999999998877643333333332222221 11 12233344444443333 335999999
Q ss_pred EEecCCCCCCceEEEec--cccccccchhhhHhHhhcCCCeeEEEEEcCCC-cccchhhhHhhHhhhcc----cccCHHH
Q 003292 390 VYCSGENLTGKWVHVDA--ANAIIDGEQKVEAAAAACKTSLRYIVAFAGCG-AKDVTRRYCMKWYRIAS----KRVNSAW 462 (833)
Q Consensus 390 V~~~~~~~~~kWI~VDP--v~~~Vd~~~~~Ep~~~~~~~~msYVVAfd~dG-akDVTrRY~~~~~~~~r----kRv~~~W 462 (833)
||+..+ ++|||||| +.+.++..+.+...+..+.+.|+|||||+.+| ++|||+||+..|+++.+ .|++..|
T Consensus 326 v~~~~e---~kwV~vd~~~v~~~~~~~~~~~~~a~~~~~~~~yVva~da~~~~kDVT~RY~~~~~s~~~~~~k~~~~~~w 402 (669)
T KOG2179|consen 326 VLDKFE---KKWVCVDPPSVIGKYHLFQPIGAVAEINGRHLAYVVAYDADGYVKDVTRRYCESWSSILRKRSKVRFSKKW 402 (669)
T ss_pred HHHhhc---ceEEEecchhhcceeccccccchhhhhccccceEEEEecCCCccchhHHHHhhhhhhhhhccccccHHHHH
Confidence 999754 89999994 45556666665555556677999999999999 99999999999998766 4667899
Q ss_pred HHHHHhhhhhcccCCCCCCcccccccCcccccCCchHHHHHHHHhccCCCCcChHhhhcCCchhhhhhhccccccCC-CC
Q 003292 463 WDAVLAPLRELESGATGDLNVESSAKDSFVADRNSLEDMELETRALTEPLPTNQQAYKNHQLYVIERWLNKYQILYP-KG 541 (833)
Q Consensus 463 w~~~L~~~~~~~~~~~g~~~i~a~~~~~~~~~rd~~Ed~El~~~~~~e~mP~si~~fKnHP~YvLerhL~k~EvI~P-~a 541 (833)
|..++.+|..+ ..+++.+||+++..+..+++||+|+++|||||+|||||||++||+||| ++
T Consensus 403 ~~~~l~~~~~~------------------~~~~e~~ed~~~~~~~~~~~lP~sv~~~K~Hp~fvler~Lkk~q~l~P~k~ 464 (669)
T KOG2179|consen 403 FDKVLAPLGKL------------------RKDREDTEDIELLRRHTSEGLPTSVQDLKNHPLFVLERHLKKNQALKPCKK 464 (669)
T ss_pred hhhhHhhhccc------------------cchHHHHHHHHHHHHhccCCCCchHHHhccCchhhhHHHHhhccccccccc
Confidence 99999999753 346788999999999999999999999999999999999999999999 56
Q ss_pred Ccceeecc--eeeeecCCccccccHHHHHHhcccccCCCcccceeccCCCCCCCCCCCCCCccccccccccccccccccc
Q 003292 542 PILGFCSG--HAVYPRSCVQTLKTKERWLREALQVKANEVPVKVIKNSSKSKKGQDFEPEDYDEVDARGNIELYGKWQLE 619 (833)
Q Consensus 542 ~~~G~~~g--E~VY~R~dV~~LkS~e~W~r~GR~VK~gE~P~K~vk~~~~~~k~~~~~~~~~~~~~~~~~~~LYg~wQTe 619 (833)
++.||++| |+||+|.||++|||+++||+.||+||+||||+|+||+++++.+.....+.+..+ ...++|||+|||+
T Consensus 465 p~~g~~kG~~E~VY~R~~V~~LkS~e~W~r~GRvIk~geqP~K~vK~~~~r~r~~r~~e~~~~~---~~~~~Lys~wqte 541 (669)
T KOG2179|consen 465 PTLGFTKGDVEAVYLRRDVVTLKSREQWYRKGRVIKPGEQPYKIVKRRPKRERMKRELEKDVRE---EYEQELYSPWQTE 541 (669)
T ss_pred ceeeeecCCceeeeehhhHHhhccHHHHHHhcccccCCCcchHHHhcCcchhhhhhhhhhhhhh---hhhhhccCccccc
Confidence 88999999 999999999999999999999999999999999999998764432221111111 1457899999999
Q ss_pred cCCCCCCCCCCccCCCCCceEeecCCCCCCCeeeecCccHHHHHHHcCCCeeeceeceeecCCeeeeeEceEEEccccHH
Q 003292 620 PLRLPSAVNGIVPRNERGQVDVWSEKCLPPGTVHLRLPRVYSVAKRLEIDSAPAMVGFEFRNGRSTPVFDGIVVCAEFKD 699 (833)
Q Consensus 620 ~y~pPp~vdG~VPkN~yGNVdlf~p~MlP~G~vHi~~~~~~rvAkkLgIDyA~AVtGFeF~~g~a~PvidGIVV~ee~~e 699 (833)
+|+|||+++|+||||+|||||||+|+|||.|||||++|++.+|||+||||||+|||||+|+.|+++|+++|||||+|+++
T Consensus 542 ~Y~pp~a~~givpkN~yGNielf~p~miP~g~vhl~~p~~~~vAk~L~id~a~av~gF~f~~~~~~P~~~Givv~~e~k~ 621 (669)
T KOG2179|consen 542 LYCPPPAVEGIVPKNEYGNIELFSPSMIPKGCVHLRLPNAVDVAKKLGIDYAPAVTGFDFRRGYAVPVFEGIVVCKEFKE 621 (669)
T ss_pred ccCCCccccCccccccccceeeeccccCCCCeEEecCchHHHHHHHhCCcccccccceeeccCcceecccceEeehhHHH
Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999
Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcC
Q 003292 700 TILEAYAEEEEKREAEEKKRREAQATSRWYQLLSSIVTRQRLNNCYGN 747 (833)
Q Consensus 700 ~l~~a~~~~~~~~~~~e~~k~e~~aL~~Wk~Ll~~L~IreRL~~~Yg~ 747 (833)
+|..||+++++.++++|+++.+++||.+|+.||++||||+||+.+||.
T Consensus 622 ~i~~a~ee~~~~~e~ker~~~~~~~l~~Wk~Ll~~Lrir~Rl~~~Yg~ 669 (669)
T KOG2179|consen 622 VILLAWEEDQKIQEEKERRKKRKRALGRWKILLRGLRIRERLKKEYGN 669 (669)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhHHHHHHHHhhcC
Confidence 999999999999999999999999999999999999999999999995
No 3
>COG5535 RAD4 DNA repair protein RAD4 [DNA replication, recombination, and repair]
Probab=100.00 E-value=3.4e-85 Score=726.25 Aligned_cols=484 Identities=26% Similarity=0.363 Sum_probs=382.2
Q ss_pred hHHHHHHHHHHHHHHHHHHhhhHHHHhhcCcHHHHHHhhcccccccccccCcc-------cccccccchhhhhhccceee
Q 003292 71 AEDKELAELVHKVHLLCLLARGRLIDSVCDDPLIQASLLSLLPSYLLKISEVS-------KLTANALSPIVSWFHDNFHV 143 (833)
Q Consensus 71 ~~eKe~~~~~HKvHLLCLLAhg~~rN~~Cnd~~lqa~llSllp~~~~~~~~~~-------~~~~~~L~~lv~Wf~~~F~v 143 (833)
..++.+|...|++|++|||.||++||.|-++..+-..|+-+++.+........ .....-|.++-.||..-|..
T Consensus 127 ~~d~s~rks~him~~tcll~~g~irn~W~rsk~lsngLr~~~~ekq~~~l~~q~~ss~~~~~~~KlL~glr~y~nk~fk~ 206 (650)
T COG5535 127 FKDWSVRKSAHIMDSTCLLLLGFIRNLWFRSKMLSNGLRFNRLEKQIKYLDNQNESSISESTYKKLLEGLRFYGNKPFKN 206 (650)
T ss_pred ccCcchhhhHHHHHHHHHHHHHHHHHHHHHhhhhhcccchhhHHHhHHhhccccccccchhHHHHHHHhHHHHhhhhhHH
Confidence 37899999999999999999999999999999988888888877665443211 12245566777999999974
Q ss_pred cccCcccc--------c----hh-------hHHHHHHHHhcCCHHHHHHHHHHHHHhcCCceEEEEeeccccCCcccccc
Q 003292 144 RSSVSTRR--------S----FH-------SDLAHALESREGTPEEIAALSVALFRALKLTTRFVSILDVASLKPEADKN 204 (833)
Q Consensus 144 ~~~~~~~~--------~----~~-------~~l~~~l~~~~Gs~de~a~LF~aLlRaLgl~~RLV~sLqp~pLkp~~~k~ 204 (833)
......+. . +. +.....+.+..|.+|.-++-|.|++|++.+.+||..+|||..+
T Consensus 207 i~~~dnrkl~~rt~kq~~~s~f~~~i~en~s~~~~~~~~~~~~~D~~vrgf~a~~r~~~v~~Rli~~l~~P~F------- 279 (650)
T COG5535 207 IGVEDNRKLAKRTMKQMESSDFWEEIFENYSLEVVPLKSADGRRDADVRGFEAEHRILNVFARLIASLIQPVF------- 279 (650)
T ss_pred hhhcccHHHHHHHHHHHHhccchHHHHhhcchHHhhHhhccCCCcchhHHHHHHHHHhccchhhhccccCccc-------
Confidence 32110000 0 00 0123344566899999999999999999999999975554211
Q ss_pred cCCCCCCCCcCCCccCCCCccccCccccccCCCccccccccccccccCCCCCCCCCCCCCCCCCCCCCCCCCcccccCCC
Q 003292 205 VSSNQDSSRVGGGIFNAPTLMVAKPEEVLASPVKSFSCDKKENVCETSSKGSPECKYSSPKSNNTQSKKSPVSCELSSGN 284 (833)
Q Consensus 205 ~~~~~~~~~~~~~~~~~~~p~~~k~~~~~~~~~k~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~s~~~~~~~~~~~~~ 284 (833)
.+. +.
T Consensus 280 ---------------s~~------------------------------------------------~~------------ 284 (650)
T COG5535 280 ---------------SNN------------------------------------------------SD------------ 284 (650)
T ss_pred ---------------ccc------------------------------------------------cc------------
Confidence 100 00
Q ss_pred CCCCCCcccCCCcccCCCccchhhhccCCChHHHHHHHHHhhccccccCCCCccccccccCCCCCCccccccccccccCC
Q 003292 285 LDPSSSMACSDISEACHPKEKSQALKRKGDLEFEMQLEMALSATNVATSKSNICSDVKDLNSNSSTVLPVKRLKKIESGE 364 (833)
Q Consensus 285 ~~~~~~~~~~~~~~~~~~~~~~~~~kr~g~~~~~~q~~~a~~~t~~~~~~~~~~~~~~~~~~~~~~~~~~k~~~k~~~~~ 364 (833)
++.+ ++ |. .. +
T Consensus 285 ~~~~--------~e--------------~~------------------------------------------~~-~---- 295 (650)
T COG5535 285 LDVL--------SE--------------GL------------------------------------------LE-Y---- 295 (650)
T ss_pred cccC--------cc--------------cc------------------------------------------ce-e----
Confidence 0000 00 00 00 0
Q ss_pred CCCccCCcccccCCccCCCCceEEEEEecCCCCCCceEEEecccc--ccc-cchhhhHhHhhcCCCeeEEEEEcCCC-cc
Q 003292 365 SSTSCLGISTAVGSRKVGAPLYWAEVYCSGENLTGKWVHVDAANA--IID-GEQKVEAAAAACKTSLRYIVAFAGCG-AK 440 (833)
Q Consensus 365 s~~~~~~~s~~~~~~~~~~P~fWvEV~~~~~~~~~kWI~VDPv~~--~Vd-~~~~~Ep~~~~~~~~msYVVAfd~dG-ak 440 (833)
...+.+|.||+|||+- +.++||+|||++- ++. -...|||.+..-.+.|.||+|++.++ ++
T Consensus 296 -------------iD~l~~p~fw~ev~~~---~~~kwv~vdp~~l~~v~~~l~~kfepa~~~~~n~~~~V~ayd~~~y~~ 359 (650)
T COG5535 296 -------------IDSLEYPGFWGEVVDK---FEKKWVFVDPVRLYIVYSELKCKFEPAASIHLNIMEYVGAYDACVYVK 359 (650)
T ss_pred -------------ccchhcchHHHHHHHH---HHhceEecccchhhhhhhhhhheechhHHHHHHHHHHhhhhccCccch
Confidence 0124589999999985 5699999999963 222 45678885545567899999999987 99
Q ss_pred cchhhhHhhHhhhccccc-----CHHHHHHHHhhhhhcccCCCCCCcccccccCcccccCCchHHHHHHHHhccCCCCcC
Q 003292 441 DVTRRYCMKWYRIASKRV-----NSAWWDAVLAPLRELESGATGDLNVESSAKDSFVADRNSLEDMELETRALTEPLPTN 515 (833)
Q Consensus 441 DVTrRY~~~~~~~~rkRv-----~~~Ww~~~L~~~~~~~~~~~g~~~i~a~~~~~~~~~rd~~Ed~El~~~~~~e~mP~s 515 (833)
|||+||+..++...+ |+ ...|+...+..+.... .+ .+-+.+++.++-+....+++|+|
T Consensus 360 DVt~RY~d~~~s~~k-ritk~~fs~qy~~r~~~~l~~~k---------------~~-~~~e~i~~~~~L~~~~~~~iPkS 422 (650)
T COG5535 360 DVTLRYRDQSYSFLK-RITKHLFSVQYFVRQFPGLGKCK---------------EA-SDEEAIEDFDDLDERRSEGIPKS 422 (650)
T ss_pred hHHHHHHHHHhhhhh-hhhccchHHHHHHHHhcccCccc---------------cc-ccHHHHHhHHHHhhcccccCCcc
Confidence 999999975554433 33 4689999998876542 11 33355666665555567899999
Q ss_pred hHhhhcCCchhhhhhhccccccCCCCCc-ceeecc----eeeeecCCccccccHHHHHHhcccccCCCcccceeccCCCC
Q 003292 516 QQAYKNHQLYVIERWLNKYQILYPKGPI-LGFCSG----HAVYPRSCVQTLKTKERWLREALQVKANEVPVKVIKNSSKS 590 (833)
Q Consensus 516 i~~fKnHP~YvLerhL~k~EvI~P~a~~-~G~~~g----E~VY~R~dV~~LkS~e~W~r~GR~VK~gE~P~K~vk~~~~~ 590 (833)
++||||||+|||||||+++|+|+|++.+ .++++| |+||+|.||..|+|+++||++||+||+|+||+|+||+- .
T Consensus 423 vqdlK~HP~FVle~~Lk~~q~ikp~ak~~~~~tkGk~~vE~VY~RrdVv~lkS~e~wy~~GRvIkpgaqP~K~vK~~--~ 500 (650)
T COG5535 423 VQDLKRHPKFVLESHLKWNQAIKPGAKPGFTLTKGKNSVEAVYLRRDVVRLKSAEQWYRMGRVIKPGAQPLKIVKRM--R 500 (650)
T ss_pred HHHhccCCceeeHhhhhhhhhhccCCccceeeecCCCccchhhhhhhHHhhcCHHHHHhcCcccCCCCchHHHHHHH--h
Confidence 9999999999999999999999999754 356678 99999999999999999999999999999999999982 1
Q ss_pred CCCCCCCCCCccccccccccccccccccccCCCCCCCCCCccCCCCCceEeecCCCCCCCeeeecCccHHHHHHHcCCCe
Q 003292 591 KKGQDFEPEDYDEVDARGNIELYGKWQLEPLRLPSAVNGIVPRNERGQVDVWSEKCLPPGTVHLRLPRVYSVAKRLEIDS 670 (833)
Q Consensus 591 ~k~~~~~~~~~~~~~~~~~~~LYg~wQTe~y~pPp~vdG~VPkN~yGNVdlf~p~MlP~G~vHi~~~~~~rvAkkLgIDy 670 (833)
.+.+ +.++ ....+||++|||+.|.|||+++|+||||.|||||+|+|+|||.||+||+.+++.+||+.|||||
T Consensus 501 ~rv~--~~~d------~vi~~LYs~eqT~ly~pp~vv~~~i~KN~yGNid~~~psmiP~g~~~i~~~~a~~iAr~L~I~y 572 (650)
T COG5535 501 ERVR--NLDD------KVIRELYSPEQTELYGPPLVVAGIIPKNMYGNIDYYVPSMIPRGCVLIPNRNARDIARLLGIDY 572 (650)
T ss_pred hhcc--cccc------hHHHhhcCHHHHHhhcCCccccccccccccCCeeeecccccCCCeEeccCchHHHHHHHhCCch
Confidence 2222 1222 2556799999999999999999999999999999999999999999999999999999999999
Q ss_pred eeceeceeecCCeeeeeEceEEEccccHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcCC
Q 003292 671 APAMVGFEFRNGRSTPVFDGIVVCAEFKDTILEAYAEEEEKREAEEKKRREAQATSRWYQLLSSIVTRQRLNNCYGNN 748 (833)
Q Consensus 671 A~AVtGFeF~~g~a~PvidGIVV~ee~~e~l~~a~~~~~~~~~~~e~~k~e~~aL~~Wk~Ll~~L~IreRL~~~Yg~~ 748 (833)
|+|||||+|+..++.||..||||.+++.++|..+..+++..++++++.+..+-+|..|+.||++||||.||+.+||..
T Consensus 573 a~aVtGFdF~r~~~kPv~~Givv~K~~~eai~~~~~e~e~iq~~ke~~e~r~~~L~~Wk~Ll~~LRir~Ri~~eYG~~ 650 (650)
T COG5535 573 ADAVTGFDFGRSTVKPVLRGIVVPKKNLEAISNFLAEYERIQEEKERSEVRLGGLKRWKILLRKLRIRLRINEEYGLK 650 (650)
T ss_pred hhhhcccccccccccccccceecchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhHHHHHHHHHhccC
Confidence 999999999998999999999999999999999999999888888888888899999999999999999999999963
No 4
>PF10405 BHD_3: Rad4 beta-hairpin domain 3; InterPro: IPR018328 Mutations in the nucleotide excision repair (NER) pathway can cause the xeroderma pigmentosum skin cancer predisposition syndrome. NER lesions are limited to one DNA strand, but otherwise they are chemically and structurally diverse, being caused by a wide variety of genotoxic chemicals and ultraviolet radiation. The xeroderma pigmentosum C (XPC) protein has a central role in initiating global-genome NER by recognising the lesion and recruiting downstream factors. In NER in eukaryotes, DNA is incised on both sides of the lesion, resulting in the removal of a fragment ~25-30 nucleotides long. This is followed by repair synthesis and ligation. This reaction, in yeast, requires the damage binding factors Rad14, RPA, and the Rad4-Rad23 complex, the transcription factor TFIIH which contains the two DNA helicases Rad3 and Rad25, essential for creating a bubble structure, and the two endonucleases, the Rad1-Rad10 complex and Rad2, which incise the damaged DNA strand on the 5'- and 3'-side of the lesion, respectively []. The crystal structure of the yeast XPC orthologue Rad4 bound to DNA containing a cyclobutane pyrimidine dimer lesion has been determined. The structure shows that Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix. The expelled nucleotides of the undamaged strand are recognised by Rad4, whereas the two cyclobutane pyrimidine dimer-linked nucleotides become disordered. This indicates that the lesions recognised by Rad4/XPC thermodynamically destabilise the double helix in a manner that facilitates the flipping-out of two base pairs []. Homologues of all the above mentioned yeast genes, except for RAD7, RAD16, and MMS19, have been identified in humans, and mutations in these human genes affect NER in a similar fashion as they do in yeast, with the exception of XPC, the human counterpart of yeast RAD4. 
Deletion of RAD4 causes the same high level of UV sensitivity as do mutations in the other class 1 genes, and rad4 mutants are completely defective in incision. By contrast, XPC is required for the repair of nontranscribed regions of the genome but not for the repair of the transcribed DNA strand. This entry represents the DNA-binding domain of Rad4, which has a beta-hairpin structure. Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix. GO: 0003684 damaged DNA binding, 0006289 nucleotide-excision repair, 0005634 nucleus; PDB: 2QSG_A 2QSF_A 2QSH_A.
Probab=99.97 E-value=1.6e-32 Score=240.09 Aligned_cols=75 Identities=48% Similarity=0.918 Sum_probs=62.5
Q ss_pred ccCCCCCceEeecCCCCCCCeeeecCccHHHHHHHcCCCeeeceeceee-cCCeeeeeEceEEEccccHHHHHHHH
Q 003292 631 VPRNERGQVDVWSEKCLPPGTVHLRLPRVYSVAKRLEIDSAPAMVGFEF-RNGRSTPVFDGIVVCAEFKDTILEAY 705 (833)
Q Consensus 631 VPkN~yGNVdlf~p~MlP~G~vHi~~~~~~rvAkkLgIDyA~AVtGFeF-~~g~a~PvidGIVV~ee~~e~l~~a~ 705 (833)
||||+|||||||+|+|+|.|||||+++++.++||+||||||+|||||+| ++|+++|+++|||||+||+++|++||
T Consensus 1 vPkN~~GNiei~~~~m~P~G~vhi~~~~~~~~a~~l~Idya~AV~GF~f~~~g~~~Pv~~GiVV~~e~~~~v~~a~ 76 (76)
T PF10405_consen 1 VPKNEYGNIEIFVPSMLPEGCVHIKLPGIEKVAKKLGIDYAPAVVGFDFQKGGRAVPVIDGIVVAEEDEEAVQDAW 76 (76)
T ss_dssp ----TTS-EE-SSGGGS-TTEEEEE-TTHHHHHHHTT---EEEEEEEEE-STT-EEEEEEEEEEEGGGHHHHHHHH
T ss_pred CCCCCCCCEEEeCCCCCCCceEEEecccHHHHHHHcCCcEEeeecceeEccCCCCeEEECeEEEEhhHHHHHHhhC
Confidence 7999999999999999999999999999999999999999999999999 99999999999999999999999998
No 5
>PF03835 Rad4: Rad4 transglutaminase-like domain; InterPro: IPR018325 RAD4/Xp-C proteins contain an ancient transglutaminase fold that is also found in peptide-N-glycanases (PNGases), which remove glycans from glycoproteins during their degradation. The PNGases retain the catalytic triad that is typical of this fold and are predicted to have a reaction mechanism similar to that involved in transglutamination. In contrast, the RAD4/Xp-C proteins are predicted to be inactive and are likely to only possess the interaction function in DNA repair []. ; GO: 0003684 damaged DNA binding, 0006289 nucleotide-excision repair, 0005634 nucleus; PDB: 2QSG_A 2QSF_A 2QSH_A 1X3W_A 1X3Z_A 3ESW_A.
Probab=99.88 E-value=2e-23 Score=203.96 Aligned_cols=109 Identities=33% Similarity=0.616 Sum_probs=79.3
Q ss_pred CccCCCCceEEEEEecCCCCCCceEEEeccccccccchhhhHhHhhcCCCeeEEEEEcCCC-cccchhhhHhh-Hhh-hc
Q 003292 378 SRKVGAPLYWAEVYCSGENLTGKWVHVDAANAIIDGEQKVEAAAAACKTSLRYIVAFAGCG-AKDVTRRYCMK-WYR-IA 454 (833)
Q Consensus 378 ~~~~~~P~fWvEV~~~~~~~~~kWI~VDPv~~~Vd~~~~~Ep~~~~~~~~msYVVAfd~dG-akDVTrRY~~~-~~~-~~ 454 (833)
+.++.+|+||+|||++. .++||||||+++.+.....+||....+.++|+|||||+++| |+|||+||+++ |+. +.
T Consensus 29 ~~~~~~~~~W~EV~~~~---~~rWI~VDp~~~~~~~~~~~ep~~~~~~~~~~YViA~d~~~~~kDVT~RY~~~~~~~~~~ 105 (145)
T PF03835_consen 29 DKDLPYPNFWVEVYSPE---EKRWIHVDPVVGKIIKVSCDEPLEENANNPMSYVIAFDNDGYAKDVTRRYASNYWNSKTR 105 (145)
T ss_dssp HHHTTTTCEEEEEEETT---TTEEEEEETTTS-EESTBTTSTCCCCCS--B-EEEEE-CTTEEEE-HHHH-T-TCCCCCG
T ss_pred cccCCCCeEEEEEEecC---CCeEEEeeeeccccccccccCchhhccCCceEEEEEEeCCCCEEEchHhhcccccccccc
Confidence 35678999999999975 58999999999855567778888778899999999999888 99999999998 553 56
Q ss_pred ccccC-----HHHHHHHHhhhhhcccCCCCCCcccccccCcccccCCchHHHHHH
Q 003292 455 SKRVN-----SAWWDAVLAPLRELESGATGDLNVESSAKDSFVADRNSLEDMELE 504 (833)
Q Consensus 455 rkRv~-----~~Ww~~~L~~~~~~~~~~~g~~~i~a~~~~~~~~~rd~~Ed~El~ 504 (833)
+.|+. ..||+.+|++|++... ....+|.+||.||+
T Consensus 106 r~R~~~~~~~~~W~~~~l~~~~~~~~---------------~~~~~d~~Ed~el~ 145 (145)
T PF03835_consen 106 RLRVDRSYEEEDWWEKVLRPYNRPRR---------------DRTIRDKKEDEELH 145 (145)
T ss_dssp GGSGGGSHHHHHHHHHHHHHH--S------------------H--HHHHHHHHH-
T ss_pred cccCCccccHHHHHHHHHHHHhcccc---------------cccchHHHHHhhcC
Confidence 78888 8999999999986421 11246888999885
No 6
>PF10403 BHD_1: Rad4 beta-hairpin domain 1; InterPro: IPR018326 Mutations in the nucleotide excision repair (NER) pathway can cause the xeroderma pigmentosum skin cancer predisposition syndrome. NER lesions are limited to one DNA strand, but otherwise they are chemically and structurally diverse, being caused by a wide variety of genotoxic chemicals and ultraviolet radiation. The xeroderma pigmentosum C (XPC) protein has a central role in initiating global-genome NER by recognising the lesion and recruiting downstream factors. In NER in eukaryotes, DNA is incised on both sides of the lesion, resulting in the removal of a fragment ~25-30 nucleotides long. This is followed by repair synthesis and ligation. This reaction, in yeast, requires the damage binding factors Rad14, RPA, and the Rad4-Rad23 complex, the transcription factor TFIIH which contains the two DNA helicases Rad3 and Rad25, essential for creating a bubble structure, and the two endonucleases, the Rad1-Rad10 complex and Rad2, which incise the damaged DNA strand on the 5'- and 3'-side of the lesion, respectively []. The crystal structure of the yeast XPC orthologue Rad4 bound to DNA containing a cyclobutane pyrimidine dimer lesion has been determined. The structure shows that Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix. The expelled nucleotides of the undamaged strand are recognised by Rad4, whereas the two cyclobutane pyrimidine dimer-linked nucleotides become disordered. This indicates that the lesions recognised by Rad4/XPC thermodynamically destabilise the double helix in a manner that facilitates the flipping-out of two base pairs []. Homologues of all the above mentioned yeast genes, except for RAD7, RAD16, and MMS19, have been identified in humans, and mutations in these human genes affect NER in a similar fashion as they do in yeast, with the exception of XPC, the human counterpart of yeast RAD4. 
Deletion of RAD4 causes the same high level of UV sensitivity as do mutations in the other class 1 genes, and rad4 mutants are completely defective in incision. By contrast, XPC is required for the repair of nontranscribed regions of the genome but not for the repair of the transcribed DNA strand. This entry represents the DNA-binding domain of Rad4, which has a beta-hairpin structure. Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix. GO: 0003684 damaged DNA binding, 0006289 nucleotide-excision repair, 0005634 nucleus; PDB: 2QSG_A 2QSF_A 2QSH_A.
Probab=99.77 E-value=5.5e-20 Score=152.83 Aligned_cols=51 Identities=43% Similarity=0.741 Sum_probs=40.2
Q ss_pred cCCCCcChHhhhcCCchhhhhhhccccccCCCCCcceeecc----eeeeecCCcc
Q 003292 509 TEPLPTNQQAYKNHQLYVIERWLNKYQILYPKGPILGFCSG----HAVYPRSCVQ 559 (833)
Q Consensus 509 ~e~mP~si~~fKnHP~YvLerhL~k~EvI~P~a~~~G~~~g----E~VY~R~dV~ 559 (833)
+|+||+|+++|||||+|||||||++||+|+|+++++|+|+| |+||+|+||+
T Consensus 3 ~e~~P~s~~~~K~hP~yvLe~~L~~~E~i~P~a~~vg~~~~~~~~e~VY~R~~V~ 57 (57)
T PF10403_consen 3 NEPLPKSIQDFKNHPNYVLERHLKRNEVIYPGAKPVGTFKGKGKKEPVYLRSDVI 57 (57)
T ss_dssp HH-S-SSCGGGTT-SSEEEGGGS-TTEEE-TT---SEEEE-TSTEEEEEEGGGE-
T ss_pred cCCCCccHHHHhCCChhhhhhhcCcceeECCCCceeEEEeCCCcceeeEeHhhCC
Confidence 68999999999999999999999999999999999999999 9999999996
No 7
>KOG0909 consensus Peptide:N-glycanase [Posttranslational modification, protein turnover, chaperones]
Probab=99.72 E-value=5.6e-18 Score=185.62 Aligned_cols=83 Identities=27% Similarity=0.440 Sum_probs=72.3
Q ss_pred ceEEEEEecCCCCCCceEEEeccccccccchhhhHhHhhcCCCeeEEEEEcCCCcccchhhhHhhHhhh--cccccCHHH
Q 003292 385 LYWAEVYCSGENLTGKWVHVDAANAIIDGEQKVEAAAAACKTSLRYIVAFAGCGAKDVTRRYCMKWYRI--ASKRVNSAW 462 (833)
Q Consensus 385 ~fWvEV~~~~~~~~~kWI~VDPv~~~Vd~~~~~Ep~~~~~~~~msYVVAfd~dGakDVTrRY~~~~~~~--~rkRv~~~W 462 (833)
++|+|||+.. .++|+|||||.+++|+|..||- .|++.|+|||||+.||+.|||.||+.+|... +|.++.+.=
T Consensus 250 HVWtEvYS~~---qqRW~HvDpcE~v~D~PllYe~---GW~KklsY~iafgkD~VvDVT~RYi~~h~e~~~~R~~~~E~~ 323 (500)
T KOG0909|consen 250 HVWTEVYSNA---QQRWVHVDPCENVFDKPLLYEI---GWGKKLSYCIAFGKDGVVDVTWRYILDHKENLLPRDLCKESV 323 (500)
T ss_pred chhHHhhhhh---hheeEeecccccccccceeeec---ccCcccceEEEeccCceEeeehhhhccchhhccchhhcchHH
Confidence 4999999975 5999999999999999999875 5999999999999999999999999988865 456677788
Q ss_pred HHHHHhhhhhc
Q 003292 463 WDAVLAPLREL 473 (833)
Q Consensus 463 w~~~L~~~~~~ 473 (833)
+..+|..++..
T Consensus 324 l~~~l~~in~~ 334 (500)
T KOG0909|consen 324 LQQTLQFINKR 334 (500)
T ss_pred HHHHHHHHHHH
Confidence 88888877643
No 8
>PF10404 BHD_2: Rad4 beta-hairpin domain 2; InterPro: IPR018327 Mutations in the nucleotide excision repair (NER) pathway can cause the xeroderma pigmentosum skin cancer predisposition syndrome. NER lesions are limited to one DNA strand, but otherwise they are chemically and structurally diverse, being caused by a wide variety of genotoxic chemicals and ultraviolet radiation. The xeroderma pigmentosum C (XPC) protein has a central role in initiating global-genome NER by recognising the lesion and recruiting downstream factors. In NER in eukaryotes, DNA is incised on both sides of the lesion, resulting in the removal of a fragment ~25-30 nucleotides long. This is followed by repair synthesis and ligation. This reaction, in yeast, requires the damage binding factors Rad14, RPA, and the Rad4-Rad23 complex, the transcription factor TFIIH which contains the two DNA helicases Rad3 and Rad25, essential for creating a bubble structure, and the two endonucleases, the Rad1-Rad10 complex and Rad2, which incise the damaged DNA strand on the 5'- and 3'-side of the lesion, respectively []. The crystal structure of the yeast XPC orthologue Rad4 bound to DNA containing a cyclobutane pyrimidine dimer lesion has been determined. The structure shows that Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix. The expelled nucleotides of the undamaged strand are recognised by Rad4, whereas the two cyclobutane pyrimidine dimer-linked nucleotides become disordered. This indicates that the lesions recognised by Rad4/XPC thermodynamically destabilise the double helix in a manner that facilitates the flipping-out of two base pairs []. Homologues of all the above mentioned yeast genes, except for RAD7, RAD16, and MMS19, have been identified in humans, and mutations in these human genes affect NER in a similar fashion as they do in yeast, with the exception of XPC, the human counterpart of yeast RAD4. 
Deletion of RAD4 causes the same high level of UV sensitivity as do mutations in the other class 1 genes, and rad4 mutants are completely defective in incision. By contrast, XPC is required for the repair of nontranscribed regions of the genome but not for the repair of the transcribed DNA strand. This entry represents the DNA-binding domain of Rad4, which has a beta-hairpin structure []. Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix. ; GO: 0003684 damaged DNA binding, 0006289 nucleotide-excision repair, 0005634 nucleus; PDB: 2QSG_A 2QSF_A 2QSH_A.
Probab=99.64 E-value=2.8e-17 Score=139.86 Aligned_cols=64 Identities=36% Similarity=0.637 Sum_probs=30.5
Q ss_pred cccHHHHHHhcccccCCCcccceeccCCCCCCCCCCCCCCccccccccccccccccccccCCCC
Q 003292 561 LKTKERWLREALQVKANEVPVKVIKNSSKSKKGQDFEPEDYDEVDARGNIELYGKWQLEPLRLP 624 (833)
Q Consensus 561 LkS~e~W~r~GR~VK~gE~P~K~vk~~~~~~k~~~~~~~~~~~~~~~~~~~LYg~wQTe~y~pP 624 (833)
|||+++|+++||+||+||+|+|+|+++++..+.......+..+.+....++|||+||||+|+||
T Consensus 1 LkS~e~W~r~GR~Vk~gE~P~K~vk~r~~~~~~~~~~~~~~~~~~~~~~~~LYg~wQTe~y~PP 64 (64)
T PF10404_consen 1 LKSAEKWYREGRVVKPGEQPYKVVKSRARTINRKREDEADENEDGEDETVPLYGEWQTEPYIPP 64 (64)
T ss_dssp -BEHHHHHTTTEEE-TT---SEEEE-----------------------EEEEB-GGGEEE----
T ss_pred CCCHHHHHHcCCccCCCCceeeEEecccccccccccccccccccccccCccCCCHHHCccccCC
Confidence 7999999999999999999999999987632211111111111122367999999999999998
No 9
>TIGR00605 rad4 DNA repair protein rad4. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).
Probab=99.19 E-value=5.5e-13 Score=157.91 Aligned_cols=150 Identities=23% Similarity=0.151 Sum_probs=119.8
Q ss_pred CcceEEEEecCccccccc----------c-cccChHHHHHHHHHHHHHHHHHHhhhHHHHhhcCcHHHHHHhhccccccc
Q 003292 48 IKGVTIEFDAADSVTKKP----------V-RRASAEDKELAELVHKVHLLCLLARGRLIDSVCDDPLIQASLLSLLPSYL 116 (833)
Q Consensus 48 ~~~v~Ie~~~~~~~kkk~----------~-RR~t~~eKe~~~~~HKvHLLCLLAhg~~rN~~Cnd~~lqa~llSllp~~~ 116 (833)
..+|+||++++++.++++ + |||.++.|++++++|||||||+|+++...|+.|+++++.+.. +++|+.+
T Consensus 13 ~~~~~~e~~~~~~~~~r~~~~~~~~~~~~~~~~~r~~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~d~~~~~-~~~~~~~ 91 (713)
T TIGR00605 13 IRKVENEKEAEKQPKSRRRKVRRENEPSLRRRKKRFKTGLNELPHEVVLMCNLDSTHSDDRVVSVPDSLSVS-EEIPSRE 91 (713)
T ss_pred HHHHHhhhhcchhhhhhcchhhhhhhhhHhhhhhhhhccccccCcceEEEEEeccccccccccccccccccc-ccCCccc
Confidence 567999999987644443 2 446689999999999999999999999999999999888777 9999998
Q ss_pred ccccCcccccccccchhhhhhccceeecccCccccchhhHHHHHHHHhcCCH-----HHHHHHHHHHHHhcCCceEEEEe
Q 003292 117 LKISEVSKLTANALSPIVSWFHDNFHVRSSVSTRRSFHSDLAHALESREGTP-----EEIAALSVALFRALKLTTRFVSI 191 (833)
Q Consensus 117 ~~~~~~~~~~~~~L~~lv~Wf~~~F~v~~~~~~~~~~~~~l~~~l~~~~Gs~-----de~a~LF~aLlRaLgl~~RLV~s 191 (833)
.... ...+...+|+++++||.++|.+++..+.. ...++...++++..+. ...+|+|..||+.+++.+|++.+
T Consensus 92 ~~~~-~~~~e~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~r~~~~~~eR~~R~~iH~~~ll~ll~h~~~RN~w~ 168 (713)
T TIGR00605 92 EDYD-SREFEDVYLSNLVAEFETISVEIKPSSKA--ESDDDAETLSRNVCSNEARKDRKYIHILYLLCLMVHLFTRNEWS 168 (713)
T ss_pred cccc-hhhhhhhhhcccccccCcccccccccchh--hhhhHHHHHHHhhccHHHHHHHHHHHHHHHHHHhhhhHhhhhhh
Confidence 7654 47888999999999999999997544322 2233445565555533 45999999999999999999999
Q ss_pred eccccCCcccc
Q 003292 192 LDVASLKPEAD 202 (833)
Q Consensus 192 Lqp~pLkp~~~ 202 (833)
+|| ++.+...
T Consensus 169 n~~-~~~~~~L 178 (713)
T TIGR00605 169 LSA-PLKSAKL 178 (713)
T ss_pred CCh-HHHhHHH
Confidence 998 6655433
No 10
>PF01841 Transglut_core: Transglutaminase-like superfamily; InterPro: IPR002931 This domain is found in many proteins known to have transglutaminase activity, i.e. which cross-link proteins through an acyl-transfer reaction between the gamma-carboxamide group of peptide-bound glutamine and the epsilon-amino group of peptide-bound lysine, resulting in an epsilon-(gamma-glutamyl)lysine isopeptide bond. Transglutaminases have been found in a diverse range of species, from bacteria through to mammals. The enzymes require calcium binding and their activity leads to post-translational modification of proteins through acyl-transfer reactions, involving peptidyl glutamine residues as acyl donors and a variety of primary amines as acyl acceptors, with the generation of proteinase-resistant isopeptide bonds []. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterised transglutaminase, the human blood clotting factor XIIIa' []. On the basis of the experimentally demonstrated activity of the Methanobacterium phage psiM2 pseudomurein endoisopeptidase [], it is proposed that many, if not all, microbial homologs of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease []. A subunit of plasma Factor XIII revealed that each Factor XIIIA subunit is composed of four domains (termed N-terminal beta-sandwich, core domain (containing the catalytic and the regulatory sites), and C-terminal beta-barrels 1 and 2) and that two monomers assemble into the native dimer through the surfaces in domains 1 and 2, in opposite orientation. This organisation in four domains is highly conserved during evolution among transglutaminase isoforms [].; PDB: 2F4M_A 2F4O_A 1GGY_B 1FIE_B 1GGU_B 1GGT_B 1F13_A 1QRK_B 1EVU_A 1EX0_B ....
Probab=97.05 E-value=0.00061 Score=62.61 Aligned_cols=65 Identities=29% Similarity=0.334 Sum_probs=47.0
Q ss_pred cccchhhhhhccceeecccCccccchhhHHHHHHHHhcCCHHHHHHHHHHHHHhcCCceEEEEeecc
Q 003292 128 NALSPIVSWFHDNFHVRSSVSTRRSFHSDLAHALESREGTPEEIAALSVALFRALKLTTRFVSILDV 194 (833)
Q Consensus 128 ~~L~~lv~Wf~~~F~v~~~~~~~~~~~~~l~~~l~~~~Gs~de~a~LF~aLlRaLgl~~RLV~sLqp 194 (833)
..+..+..|++++|...+..... .......+|++..|+-.+.+.||++|||++||+||+|.....
T Consensus 16 ~~~~~i~~~v~~~~~y~~~~~~~--~~~~~~~~l~~~~G~C~~~a~l~~allr~~Gipar~v~g~~~ 80 (113)
T PF01841_consen 16 EKAKAIYDWVRSNIRYDDPNYSP--GPRDASEVLRSGRGDCEDYASLFVALLRALGIPARVVSGYVK 80 (113)
T ss_dssp CCCCCCCCCCCCCCCEC-TCCCC--CCTTHHHHHHCEEESHHHHHHHHHHHHHHHT--EEEEEEEEE
T ss_pred HHHHHHHHHHHhCcEEeCCCCCC--CCCCHHHHHHcCCCccHHHHHHHHHHHhhCCCceEEEEEEcC
Confidence 55667899999999887211111 112356788888999999999999999999999999974443
No 11
>KOG2179 consensus Nucleotide excision repair complex XPC-HR23B, subunit XPC/DPB11 [Replication, recombination and repair]
Probab=95.98 E-value=0.0041 Score=73.75 Aligned_cols=114 Identities=18% Similarity=0.103 Sum_probs=90.0
Q ss_pred HHHhhhHHHHhhcCcHH-HHHHhhcccccccccccCcccccccccchhhhhhccceeecccCccccchhhHHHHHHHHh-
Q 003292 87 CLLARGRLIDSVCDDPL-IQASLLSLLPSYLLKISEVSKLTANALSPIVSWFHDNFHVRSSVSTRRSFHSDLAHALESR- 164 (833)
Q Consensus 87 CLLAhg~~rN~~Cnd~~-lqa~llSllp~~~~~~~~~~~~~~~~L~~lv~Wf~~~F~v~~~~~~~~~~~~~l~~~l~~~- 164 (833)
|.+|+....|+.|.-.+ ...+.+.++|....... .+.....++...++||...+++....+... +......++.+
T Consensus 1 c~~a~~d~~n~~~s~~~~~~~~~l~l~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~--~~~~~~s~~~~~ 77 (669)
T KOG2179|consen 1 CESANNDKANRKNSKIDKEVSLYLELLPDRSFRSV-VRDEEDDLLSSDVKWFSLDSELPVENVNDV--RDTILVSLEKRK 77 (669)
T ss_pred CccchhhhhccccchhhHhHhcccccCCCcccccc-cccccccchhhccCcccccccccccccchh--hhHhhhhhhhhh
Confidence 78999999999999888 88889999998765332 245667889999999999999976543321 12333444333
Q ss_pred ----cCCHHHHHHHHHHHHHhcCCceEEEEeeccccCCccccc
Q 003292 165 ----EGTPEEIAALSVALFRALKLTTRFVSILDVASLKPEADK 203 (833)
Q Consensus 165 ----~Gs~de~a~LF~aLlRaLgl~~RLV~sLqp~pLkp~~~k 203 (833)
.++.+.+..+|.+|+|.+++.+++|.+|+|.|+++-+.+
T Consensus 78 ~~k~~~~d~~~~~~~~~l~~~~~~~~~~~~~~~~~~~r~~~l~ 120 (669)
T KOG2179|consen 78 ANKEARDDQDLEYQFHLLDRLFMLFLLKTRNLWPDPVRLNALV 120 (669)
T ss_pred hhcccccHHHHHHHHHHHhhhhHHHHHHHhcccCCcchhhHhh
Confidence 458889999999999999999999999999999997644
No 12
>smart00460 TGc Transglutaminase/protease-like homologues. Transglutaminases are enzymes that establish covalent links between proteins. A subset of transglutaminase homologues appear to catalyse the reverse reaction, the hydrolysis of peptide bonds. Proteins with this domain are both extracellular and intracellular, and it is likely that the eukaryotic intracellular proteins are involved in signalling events.
Probab=95.58 E-value=0.0079 Score=50.57 Aligned_cols=30 Identities=50% Similarity=0.633 Sum_probs=27.0
Q ss_pred HHHhcCCHHHHHHHHHHHHHhcCCceEEEE
Q 003292 161 LESREGTPEEIAALSVALFRALKLTTRFVS 190 (833)
Q Consensus 161 l~~~~Gs~de~a~LF~aLlRaLgl~~RLV~ 190 (833)
|+.+.|+-.+.+.||++|||++||+||+|.
T Consensus 2 ~~~~~G~C~~~a~l~~~llr~~GIpar~v~ 31 (68)
T smart00460 2 LKTKYGTCGEFAALFVALLRSLGIPARVVS 31 (68)
T ss_pred CcccceeeHHHHHHHHHHHHHCCCCeEEEe
Confidence 455678888999999999999999999996
No 13
>TIGR00598 rad14 DNA repair protein. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).
Probab=94.92 E-value=0.17 Score=51.56 Aligned_cols=35 Identities=26% Similarity=0.489 Sum_probs=30.0
Q ss_pred CCCccccccccccCcccccceeecccC-CCceEEEeeC
Q 003292 797 SEEHEHVYLIEDQSFDEENSVTTKRCH-CGFTIQVEEL 833 (833)
Q Consensus 797 ~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~ 833 (833)
...|+|.|-.+ .+|+|+.+-+|+|- |||.|..|+|
T Consensus 137 ~~~H~H~f~~~--~~~~e~~~~~k~C~~Cg~e~~~e~m 172 (172)
T TIGR00598 137 GRVHEHEFGPE--TNGVEEDTYRRTCTTCGLEETYEKM 172 (172)
T ss_pred CCcccccCCcc--cccccCCceeeecCCCCceEEEEeC
Confidence 36799999665 46789999999997 9999999986
No 14
>COG1305 Transglutaminase-like enzymes, putative cysteine proteases [Amino acid transport and metabolism]
Probab=92.96 E-value=0.097 Score=55.90 Aligned_cols=85 Identities=16% Similarity=0.230 Sum_probs=56.3
Q ss_pred HHhhcC--cHHHHHHhhcccccccccccCcccccccccchhhhhhccceeecccCccccchhhHHHHHHHHhcCCHHHHH
Q 003292 95 IDSVCD--DPLIQASLLSLLPSYLLKISEVSKLTANALSPIVSWFHDNFHVRSSVSTRRSFHSDLAHALESREGTPEEIA 172 (833)
Q Consensus 95 rN~~Cn--d~~lqa~llSllp~~~~~~~~~~~~~~~~L~~lv~Wf~~~F~v~~~~~~~~~~~~~l~~~l~~~~Gs~de~a 172 (833)
....|+ ++.+++++..+.-.. .........+..|+...|.-+..... ....-..+|+...|+=...+
T Consensus 133 ~~~~~~~~~~~~~~la~~~~~~~--------~~~~~~~~~~~~~~~~~~~y~~~~~~---~~~~~~~~l~~~~G~C~d~a 201 (319)
T COG1305 133 VSPDTPIKKPRVAELAARETGGA--------TTPREKAAALFDYVNSKIRYSPGPTP---VTGSASDALRLGRGVCRDFA 201 (319)
T ss_pred CCCCCCcccHHHHHHHHHhhccc--------CCHHHHHHHHHHHHhhcceeecCCCC---CCCCHHHHHHhCCcccccHH
Confidence 344454 577777776633321 11224445678888877776544311 11233467777789888999
Q ss_pred HHHHHHHHhcCCceEEEE
Q 003292 173 ALSVALFRALKLTTRFVS 190 (833)
Q Consensus 173 ~LF~aLlRaLgl~~RLV~ 190 (833)
+||++|||++||+||+|.
T Consensus 202 ~l~val~Ra~GIpAR~V~ 219 (319)
T COG1305 202 HLLVALLRAAGIPARYVS 219 (319)
T ss_pred HHHHHHHHHcCCcceeee
Confidence 999999999999999995
No 15
>COG5145 RAD14 DNA excision repair protein [DNA replication, recombination, and repair]
Probab=87.95 E-value=1.2 Score=47.04 Aligned_cols=35 Identities=34% Similarity=0.643 Sum_probs=26.1
Q ss_pred CCccccccccccCcccccceeecccCCCceEEEeeC
Q 003292 798 EEHEHVYLIEDQSFDEENSVTTKRCHCGFTIQVEEL 833 (833)
Q Consensus 798 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 833 (833)
+.|.|+|-++-+. --|..|-..||.||..|.-+|+
T Consensus 258 ~kHvH~f~e~vdg-~~e~g~~iqRC~CGlevEq~ei 292 (292)
T COG5145 258 EKHVHVFDEFVDG-PNEPGVIIQRCSCGLEVEQEEI 292 (292)
T ss_pred hcceeeccccccC-CCCCCeEEEecccccchhhccC
Confidence 4599999766444 2278899999999998765553
No 16
>KOG4017 consensus DNA excision repair protein XPA/XPAC/RAD14 [Replication, recombination and repair]
Probab=86.54 E-value=2 Score=46.03 Aligned_cols=34 Identities=32% Similarity=0.481 Sum_probs=26.5
Q ss_pred CCccccccccccCcccccceeecccCCCceEEEeeC
Q 003292 798 EEHEHVYLIEDQSFDEENSVTTKRCHCGFTIQVEEL 833 (833)
Q Consensus 798 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 833 (833)
.-|+|+|-.|- --+|..++..+|.||+++.-|+|
T Consensus 241 ~~H~Hef~~e~--~~eEd~y~~tc~~Cg~e~e~ekl 274 (274)
T KOG4017|consen 241 EKHVHEFGPET--GIEEDGYRITCCTCGLEEEQEKL 274 (274)
T ss_pred cccceecCCCC--CCCCCcceeEeecccchhhhhcC
Confidence 56999999885 44555666669999999887765
No 17
>smart00460 TGc Transglutaminase/protease-like homologues. Transglutaminases are enzymes that establish covalent links between proteins. A subset of transglutaminase homologues appear to catalyse the reverse reaction, the hydrolysis of peptide bonds. Proteins with this domain are both extracellular and intracellular, and it is likely that the eukaryotic intracellular proteins are involved in signalling events.
Probab=84.70 E-value=0.9 Score=38.01 Aligned_cols=21 Identities=48% Similarity=0.846 Sum_probs=18.0
Q ss_pred CCCceEEEEEecCCCCCCceEEEecc
Q 003292 382 GAPLYWAEVYCSGENLTGKWVHVDAA 407 (833)
Q Consensus 382 ~~P~fWvEV~~~~~~~~~kWI~VDPv 407 (833)
..++.|+|||.. ++|+.+||.
T Consensus 47 ~~~H~W~ev~~~-----~~W~~~D~~ 67 (68)
T smart00460 47 WEAHAWAEVYLE-----GGWVPVDPT 67 (68)
T ss_pred CCcEEEEEEEEC-----CCeEEEeCC
Confidence 368999999984 699999995
No 18
>PF01841 Transglut_core: Transglutaminase-like superfamily; InterPro: IPR002931 This domain is found in many proteins known to have transglutaminase activity, i.e. which cross-link proteins through an acyl-transfer reaction between the gamma-carboxamide group of peptide-bound glutamine and the epsilon-amino group of peptide-bound lysine, resulting in an epsilon-(gamma-glutamyl)lysine isopeptide bond. Transglutaminases have been found in a diverse range of species, from bacteria through to mammals. The enzymes require calcium binding and their activity leads to post-translational modification of proteins through acyl-transfer reactions, involving peptidyl glutamine residues as acyl donors and a variety of primary amines as acyl acceptors, with the generation of proteinase-resistant isopeptide bonds []. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterised transglutaminase, the human blood clotting factor XIIIa' []. On the basis of the experimentally demonstrated activity of the Methanobacterium phage psiM2 pseudomurein endoisopeptidase [], it is proposed that many, if not all, microbial homologs of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease []. A subunit of plasma Factor XIII revealed that each Factor XIIIA subunit is composed of four domains (termed N-terminal beta-sandwich, core domain (containing the catalytic and the regulatory sites), and C-terminal beta-barrels 1 and 2) and that two monomers assemble into the native dimer through the surfaces in domains 1 and 2, in opposite orientation. This organisation in four domains is highly conserved during evolution among transglutaminase isoforms [].; PDB: 2F4M_A 2F4O_A 1GGY_B 1FIE_B 1GGU_B 1GGT_B 1F13_A 1QRK_B 1EVU_A 1EX0_B ....
Probab=84.37 E-value=0.82 Score=41.81 Aligned_cols=20 Identities=40% Similarity=0.963 Sum_probs=17.0
Q ss_pred CCceEEEEEecCCCCCCceEEEec
Q 003292 383 APLYWAEVYCSGENLTGKWVHVDA 406 (833)
Q Consensus 383 ~P~fWvEV~~~~~~~~~kWI~VDP 406 (833)
..+.|+|||.++ ++||++||
T Consensus 94 ~~H~w~ev~~~~----~~W~~~Dp 113 (113)
T PF01841_consen 94 DNHAWVEVYLPG----GGWIPLDP 113 (113)
T ss_dssp EEEEEEEEEETT----TEEEEEET
T ss_pred CCEEEEEEEEcC----CcEEEcCC
Confidence 358999999953 79999998
No 19
>PF13369 Transglut_core2: Transglutaminase-like superfamily
Probab=74.24 E-value=4 Score=40.58 Aligned_cols=36 Identities=36% Similarity=0.423 Sum_probs=33.2
Q ss_pred hHHHHHHHHhcCCHHHHHHHHHHHHHhcCCceEEEE
Q 003292 155 SDLAHALESREGTPEEIAALSVALFRALKLTTRFVS 190 (833)
Q Consensus 155 ~~l~~~l~~~~Gs~de~a~LF~aLlRaLgl~~RLV~ 190 (833)
..+..+|+++.|.+-.++.||++++|.|||++..|.
T Consensus 54 ~~l~~vL~~r~G~Pi~L~ily~~va~rlGl~~~~v~ 89 (152)
T PF13369_consen 54 SFLHKVLERRRGIPISLAILYLEVARRLGLPAEPVN 89 (152)
T ss_pred hhHHHHHhcCCCCcHHHHHHHHHHHHHcCCeEEEEe
Confidence 346789999999999999999999999999999995
No 20
>COG5216 Uncharacterized conserved protein [Function unknown]
Probab=59.22 E-value=3.9 Score=34.84 Aligned_cols=28 Identities=46% Similarity=0.865 Sum_probs=22.4
Q ss_pred ccccCcccccceeecccCCC--ceEEEeeC
Q 003292 806 IEDQSFDEENSVTTKRCHCG--FTIQVEEL 833 (833)
Q Consensus 806 ~~~~~~~~~~~~~~~~~~~~--~~~~~~~~ 833 (833)
.||-.|+.|+-.-|--|+|| |.|.+|.|
T Consensus 9 iedftf~~e~~~ftyPCPCGDRFeIsLeDl 38 (67)
T COG5216 9 IEDFTFSREEKTFTYPCPCGDRFEISLEDL 38 (67)
T ss_pred eeeeEEcCCCceEEecCCCCCEeEEEHHHh
Confidence 35678999999999999999 66665543
No 21
>COG5535 RAD4 DNA repair protein RAD4 [DNA replication, recombination, and repair]
Probab=56.08 E-value=3.1 Score=49.41 Aligned_cols=161 Identities=12% Similarity=0.017 Sum_probs=101.6
Q ss_pred CCCCCCCCCccCCCCCCCCcCCCCCCCCcceEEEEecCcccccccccccChHHHHHHHHHHHHHHHHHHhh-hHHHHhhc
Q 003292 21 GEEMYDSDWEDGSIPVACSKENHPESDIKGVTIEFDAADSVTKKPVRRASAEDKELAELVHKVHLLCLLAR-GRLIDSVC 99 (833)
Q Consensus 21 ~~ee~e~dWEeV~~~~~~~~~~~~~~~~~~v~Ie~~~~~~~kkk~~RR~t~~eKe~~~~~HKvHLLCLLAh-g~~rN~~C 99 (833)
++++.+++||.|+.+ .+++++++... ..+.++ ..+.++....+..|..||+|+=-+ ..-||.||
T Consensus 42 ~~~ek~i~e~~~el~-------------gd~~vtvn~~~-rdrs~v-~k~sdd~neklqssq~hl~~~~f~~l~s~nk~~ 106 (650)
T COG5535 42 QDEEKDIDEEPVELD-------------GDLTVTVNNIR-RDRSKV-SKYSDDHNEKLQSSQLHLIMIPFMLLKSRNKWI 106 (650)
T ss_pred hhhccccccCCccCC-------------Ccceeeecccc-cccccc-ccccchhhHHhccchhhhhcchhhhhcCcCeec
Confidence 556666899877765 33666776641 111111 123455556667999999999888 55599999
Q ss_pred CcHHHHHHhhcccccccccccCccc-------ccccccc--h--hhhhhccceeecc---c-----------Cccccchh
Q 003292 100 DDPLIQASLLSLLPSYLLKISEVSK-------LTANALS--P--IVSWFHDNFHVRS---S-----------VSTRRSFH 154 (833)
Q Consensus 100 nd~~lqa~llSllp~~~~~~~~~~~-------~~~~~L~--~--lv~Wf~~~F~v~~---~-----------~~~~~~~~ 154 (833)
+|.+|.-.+....|........... ++.-.|- . +..||+.+...+. . .+.. .
T Consensus 107 dderLn~~~k~s~pk~~~~s~~d~s~rks~him~~tcll~~g~irn~W~rsk~lsngLr~~~~ekq~~~l~~q~~s---s 183 (650)
T COG5535 107 DDERLNRRLKRSVPKLGGKSFKDWSVRKSAHIMDSTCLLLLGFIRNLWFRSKMLSNGLRFNRLEKQIKYLDNQNES---S 183 (650)
T ss_pred cchhhceeeeccCcccccccccCcchhhhHHHHHHHHHHHHHHHHHHHHHhhhhhcccchhhHHHhHHhhcccccc---c
Confidence 9999998888888854332221111 1111111 1 4679998443221 0 0000 0
Q ss_pred hHHHHHHHHhcCCHHHHHHHHHHHHHhcCCceEEEEeeccccCCc
Q 003292 155 SDLAHALESREGTPEEIAALSVALFRALKLTTRFVSILDVASLKP 199 (833)
Q Consensus 155 ~~l~~~l~~~~Gs~de~a~LF~aLlRaLgl~~RLV~sLqp~pLkp 199 (833)
..+....+-+.|.++-+.+.|..+.+-.....+++..+|..+.-.
T Consensus 184 ~~~~~~~KlL~glr~y~nk~fk~i~~~dnrkl~~rt~kq~~~s~f 228 (650)
T COG5535 184 ISESTYKKLLEGLRFYGNKPFKNIGVEDNRKLAKRTMKQMESSDF 228 (650)
T ss_pred cchhHHHHHHHhHHHHhhhhhHHhhhcccHHHHHHHHHHHHhccc
Confidence 123445556689999999999999999999999998777765544
No 22
>PF12677 DUF3797: Domain of unknown function (DUF3797); InterPro: IPR024256 This presumed domain is functionally uncharacterised. This domain family is found in bacteria and viruses, and is approximately 50 amino acids in length. There is a conserved CGN sequence motif.
Probab=52.41 E-value=6 Score=32.52 Aligned_cols=14 Identities=50% Similarity=1.101 Sum_probs=9.6
Q ss_pred cccceeecccCCCceE
Q 003292 813 EENSVTTKRCHCGFTI 828 (833)
Q Consensus 813 ~~~~~~~~~~~~~~~~ 828 (833)
|.|+-|| |.|||.|
T Consensus 35 edtfkRt--CkCGfni 48 (49)
T PF12677_consen 35 EDTFKRT--CKCGFNI 48 (49)
T ss_pred ccceeee--ecccccc
Confidence 3455554 9999987
No 23
>PRK10941 hypothetical protein; Provisional
Probab=38.50 E-value=36 Score=37.36 Aligned_cols=62 Identities=19% Similarity=0.231 Sum_probs=43.2
Q ss_pred ccchhhhhhccceeecccCcccc-chhhHHHHHHHHhcCCHHHHHHHHHHHHHhcCCceEEEE
Q 003292 129 ALSPIVSWFHDNFHVRSSVSTRR-SFHSDLAHALESREGTPEEIAALSVALFRALKLTTRFVS 190 (833)
Q Consensus 129 ~L~~lv~Wf~~~F~v~~~~~~~~-~~~~~l~~~l~~~~Gs~de~a~LF~aLlRaLgl~~RLV~ 190 (833)
.|..|..||-+..--..+...=. +-.+-|-.+|++|.|.+-.++.||+.++|.|||++.-|.
T Consensus 58 ~l~~L~~~fy~~lgF~Gn~~~Y~~p~ns~L~~VL~~R~G~PisL~il~l~iA~~lglp~~gV~ 120 (269)
T PRK10941 58 QLEKLIALFYGEWGFGGASGVYRLSDALWLDKVLKTRQGSAVSLGAILLWIANRLDLPLMPVI 120 (269)
T ss_pred HHHHHHHHHHHHhCCCCCccccCCchhhHHHHHHHccCCCcHHHHHHHHHHHHHcCCCeeeee
Confidence 35566666655533322111111 112347789999999999999999999999999999984
No 24
>COG1305 Transglutaminase-like enzymes, putative cysteine proteases [Amino acid transport and metabolism]
Probab=34.19 E-value=27 Score=37.25 Aligned_cols=25 Identities=40% Similarity=0.860 Sum_probs=20.1
Q ss_pred CCceEEEEEecCCCCCCceEEEecccccc
Q 003292 383 APLYWAEVYCSGENLTGKWVHVDAANAII 411 (833)
Q Consensus 383 ~P~fWvEV~~~~~~~~~kWI~VDPv~~~V 411 (833)
..+.|+|||.++ ..|+++||-.+..
T Consensus 239 ~~Haw~ev~~~~----~gW~~~Dpt~~~~ 263 (319)
T COG1305 239 DAHAWAEVYLPG----RGWVPLDPTNGLL 263 (319)
T ss_pred ccceeeeeecCC----CccEeecCCCCCc
Confidence 468999999964 2699999997643
No 25
>PF05207 zf-CSL: CSL zinc finger; InterPro: IPR007872 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a probable zinc binding motif that contains four cysteines and may chelate zinc, known as the DPH-type after the diphthamide (DPH) biosynthesis protein in which it was first characterised, including the proteins DPH3 and DPH4. This domain is also found associated with N-terminal domain of heat shock protein DnaJ IPR001623 from INTERPRO domain. 
Diphthamide is a unique post-translationally modified histidine residue found only in translation elongation factor 2 (eEF-2). It is conserved from archaea to humans and serves as the target for diphtheria toxin and Pseudomonas exotoxin A. These two toxins catalyse the transfer of ADP-ribose to diphthamide on eEF-2, thus inactivating eEF-2, halting cellular protein synthesis, and causing cell death []. The biosynthesis of diphthamide is dependent on at least five proteins, DPH1 to -5, and a still unidentified amidating enzyme. DPH3 and DPH4 share a conserved region, which encodes a putative zinc finger, the DPH-type or CSL-type (after the conserved motif of the final cysteine) zinc finger [, ]. The function of this motif is unknown. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; PDB: 2L6L_A 1WGE_A 2JR7_A 1YOP_A 1YWS_A.
Probab=27.56 E-value=37 Score=28.46 Aligned_cols=25 Identities=36% Similarity=0.769 Sum_probs=20.2
Q ss_pred cccCcccccceeecccCCCceEEEe
Q 003292 807 EDQSFDEENSVTTKRCHCGFTIQVE 831 (833)
Q Consensus 807 ~~~~~~~~~~~~~~~~~~~~~~~~~ 831 (833)
++=.||++..+-+..|+||-...|.
T Consensus 6 ~d~~~~~~~~~~~y~CRCG~~f~i~ 30 (55)
T PF05207_consen 6 DDMEFDEEEGVYSYPCRCGGEFEIS 30 (55)
T ss_dssp TTSEEETTTTEEEEEETTSSEEEEE
T ss_pred hhceecCCCCEEEEcCCCCCEEEEc
Confidence 4566788888999999999877664
No 26
>PF00797 Acetyltransf_2: N-acetyltransferase; InterPro: IPR001447 Arylamine N-acetyltransferase (NAT) is a cytosolic enzyme of approximately 30 kDa. It facilitates the transfer of an acetyl group from acetyl coenzyme A on to a wide range of arylamines, N-hydroxyarylamines and hydrazines. Acetylation of these compounds generally results in inactivation. NAT is found in many species from Mycobacteria (Mycobacterium tuberculosis, Mycobacterium smegmatis etc) to Homo sapiens (Human). It was the first enzyme to be observed to have polymorphic activity amongst human individuals. NAT is responsible for the inactivation of Isoniazid (a drug used to treat tuberculosis) in humans. The NAT protein has also been shown to be involved in the breakdown of folic acid. NAT catalyses the reaction: Acetyl-coA + arylamine = coA + N-acetylarylamine. NAT is the target of a common genetic polymorphism of clinical relevance in humans. The N-acetylation polymorphism is determined by low or high NAT activity in liver. NAT has been implicated in the action and toxicity of amine-containing drugs, and in the susceptibility to cancer and systemic lupus erythematosus. Two highly similar human genes for NAT, termed NAT1 and NAT2, encode genetically invariant and variant NAT proteins, respectively. ; GO: 0016407 acetyltransferase activity, 0008152 metabolic process; PDB: 1W6F_A 1W5R_A 1GX3_D 2PQT_A 2IJA_A 1W4T_A 2BSZ_B 3D9W_B 3LTW_A 3LNB_A ....
Probab=25.47 E-value=67 Score=33.80 Aligned_cols=32 Identities=25% Similarity=0.262 Sum_probs=22.2
Q ss_pred HHHHHhcC-CHHHHHHHHHHHHHhcCCceEEEE
Q 003292 159 HALESREG-TPEEIAALSVALFRALKLTTRFVS 190 (833)
Q Consensus 159 ~~l~~~~G-s~de~a~LF~aLlRaLgl~~RLV~ 190 (833)
+.+.++.| -=-++..||..||++||+.+++|.
T Consensus 39 kiv~~~rGG~C~elN~lf~~lL~~lGf~v~~~~ 71 (240)
T PF00797_consen 39 KIVRRGRGGYCFELNGLFYWLLRELGFDVTLVS 71 (240)
T ss_dssp HHTTTT--B-HHHHHHHHHHHHHHCT-EEEEEE
T ss_pred HHHhcCCCeEhHHHHHHHHHHHHHCCCeEEEEE
Confidence 33333334 345899999999999999999995
No 27
>PF11702 DUF3295: Protein of unknown function (DUF3295); InterPro: IPR021711 This family is conserved in fungi but the function is not known.
Probab=24.30 E-value=34 Score=40.65 Aligned_cols=11 Identities=45% Similarity=0.897 Sum_probs=9.1
Q ss_pred CCCCCCCCCcc
Q 003292 21 GEEMYDSDWED 31 (833)
Q Consensus 21 ~~ee~e~dWEe 31 (833)
|||+||+||||
T Consensus 307 dDDDDssDWED 317 (507)
T PF11702_consen 307 DDDDDSSDWED 317 (507)
T ss_pred cCCccchhhhh
Confidence 78889999964
No 28
>PF09082 DUF1922: Domain of unknown function (DUF1922); InterPro: IPR015166 Members of this family consist of a beta-sheet region followed by an alpha-helix and an unstructured C terminus. The beta-sheet region contains a CXCX...XCXC sequence with Cys residues located in two proximal loops and pointing towards each other. The precise function of this set of bacterial proteins is, as yet, unknown []. ; PDB: 1GH9_A.
Probab=23.20 E-value=41 Score=29.69 Aligned_cols=18 Identities=44% Similarity=0.853 Sum_probs=12.9
Q ss_pred ceeecccCCCceEEEeeC
Q 003292 816 SVTTKRCHCGFTIQVEEL 833 (833)
Q Consensus 816 ~~~~~~~~~~~~~~~~~~ 833 (833)
...||.|.||..|-|.+.
T Consensus 17 ~~kTkkC~CG~~l~vk~~ 34 (68)
T PF09082_consen 17 GAKTKKCVCGKTLKVKER 34 (68)
T ss_dssp T-SEEEETTTEEEE--SS
T ss_pred CcceeEecCCCeeeeeeE
Confidence 467999999999998763
No 29
>PF14402 7TM_transglut: 7 transmembrane helices usually fused to an inactive transglutaminase
Probab=22.09 E-value=50 Score=36.99 Aligned_cols=27 Identities=26% Similarity=0.442 Sum_probs=21.8
Q ss_pred CCceEEEEEecCCCCCCceEEEeccccccccc
Q 003292 383 APLYWAEVYCSGENLTGKWVHVDAANAIIDGE 414 (833)
Q Consensus 383 ~P~fWvEV~~~~~~~~~kWI~VDPv~~~Vd~~ 414 (833)
.+..|+|||+. ++|+.+||..+....|
T Consensus 30 ~l~~~lev~~~-----~~W~~f~p~tg~~g~p 56 (313)
T PF14402_consen 30 SLEPWLEVFNG-----GKWVLFNPRTGEQGLP 56 (313)
T ss_pred CcHhHHheeeC-----CeEEEECCCCCCcCCC
Confidence 57899999985 6899999998755444
No 30
>PRK15047 N-hydroxyarylamine O-acetyltransferase; Provisional
Probab=21.20 E-value=1.1e+02 Score=33.79 Aligned_cols=32 Identities=22% Similarity=0.242 Sum_probs=25.4
Q ss_pred HHHHHHhcC--CHHHHHHHHHHHHHhcCCceEEEE
Q 003292 158 AHALESREG--TPEEIAALSVALFRALKLTTRFVS 190 (833)
Q Consensus 158 ~~~l~~~~G--s~de~a~LF~aLlRaLgl~~RLV~ 190 (833)
.+.+.++.| |. |+.-||.++||+||..++++.
T Consensus 58 ~KlV~~~RGGyCf-E~N~Lf~~~L~~LGF~v~~~~ 91 (281)
T PRK15047 58 EKLVIARRGGYCF-EQNGLFERVLRELGFNVRSLL 91 (281)
T ss_pred HHHhcCCCCEEcH-hHHHHHHHHHHHcCCcEEEEE
Confidence 445555555 55 899999999999999999884
Done!