Query         T0558 ZP_01960044.1, Bacteroides caccae, 294 residues
Match_columns 294
No_of_seqs    145 out of 754
Neff          10.3
Searched_HMMs 11830
Date          Tue May 25 16:00:21 2010
Command       /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/bin/hhsearch -i /home/syshi_3/CASP9/HHsearch4Targetseq/seq/T0558.hhm -d /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/database/pfamA_24_hhmdb -o /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/T0558.hhr

 No Hit                           Prob E-value P-value Score   SS Cols  Query HMM Template HMM
  1 PF10282 Muc_lac_enz: 3-carbox 99.5 3.1E-09 2.6E-13  80.8 31.7  180  33-212    13-222 (345)
  2 PF05935 Arylsulfotrans: Aryls 99.4 4.7E-10 3.9E-14  86.8 22.8  256  21-279    113-460 (471)
  3 PF02239 Cytochrom_D1: Cytochr 99.4 1.1E-09 9.4E-14  84.1 24.5  195  24-221    7-211 (369)
  4 PF10282 Muc_lac_enz: 3-carbox 99.3 3.7E-09 3.1E-13  80.3 24.3  242  14-273    39-328 (345)
  5 PF02239 Cytochrom_D1: Cytochr 99.2 1.1E-08 9.5E-13  76.7 21.2  257  17-281    42-338 (369)
  6 PF07433 DUF1513: Protein of u 99.2 2.2E-07 1.9E-11  67.3 27.5  220  20-252    15-278 (305)
  7 PF08450 SGL: SMP-30/Gluconola 99.0   3E-07 2.5E-11  66.3 22.5  189  22-223    11-223 (246)
  8 PF08450 SGL: SMP-30/Gluconola 99.0 1.1E-06 9.2E-11  62.2 23.9  200  59-278    4-221 (246)
  9 PF08662 eIF2A: Eukaryotic tra 98.9 4.1E-07 3.5E-11  65.3 20.5  136  73-218    38-184 (194)
 10 PF08662 eIF2A: Eukaryotic tra 98.9 2.5E-07 2.1E-11  66.8 18.3  137  32-180    38-187 (194)
 11 PF05935 Arylsulfotrans: Aryls 98.5 2.2E-06 1.8E-10  60.0 13.9  211  63-281    112-403 (471)
 12 PF07433 DUF1513: Protein of u 98.4 9.5E-05   8E-09  48.1 21.2  192  17-213    56-286 (305)
 13 PF08553 VID27: VID27 cytoplas 98.3 2.3E-05 1.9E-09  52.6 15.3  140  67-211    488-640 (788)
 14 PF06433 Me-amine-dh_H: Methyl 98.3 0.00013 1.1E-08  47.2 21.9  138  32-173    66-214 (342)
 15 PF04762 IKI3: IKI3 family; I  98.2 0.00023 1.9E-08  45.3 22.1  155  55-211    76-286 (928)
 16 PF04053 Coatomer_WDAD: Coatom 97.8 0.00092 7.8E-08  40.9 24.9  200  56-277    34-262 (444)
 17 PF04841 Vps16_N: Vps16, N-ter 97.8 0.00092 7.8E-08  40.9 26.3  250  18-276    45-312 (410)
 18 PF04762 IKI3: IKI3 family; I  97.8 0.00097 8.2E-08  40.7 27.3  153  16-173    125-332 (928)
 19 PF05096 Glu_cyclase_2: Glutam 97.7  0.0013 1.1E-07  39.8 15.2  199  43-251    33-252 (264)
 20 PF00930 DPPIV_N: Dipeptidyl p 97.7  0.0014 1.2E-07  39.5 23.6  200  20-222    1-277 (353)
 21 PF00930 DPPIV_N: Dipeptidyl p 97.6   0.002 1.7E-07  38.4 20.1  228  15-253    46-350 (353)
 22 PF05096 Glu_cyclase_2: Glutam 97.5  0.0029 2.5E-07  37.2 13.3  151  92-252    42-195 (264)
 23 PF08553 VID27: VID27 cytoplas 96.9  0.0055 4.7E-07  35.2 10.4  141  22-170    486-639 (788)
 24 PF06433 Me-amine-dh_H: Methyl 96.9   0.011 8.9E-07  33.2 22.2  175  32-218    16-219 (342)
 25 PF07250 Glyoxal_oxid_N: Glyox 96.8   0.013 1.1E-06  32.6 14.7  182  31-222    7-207 (243)
 26 PF04841 Vps16_N: Vps16, N-ter 96.8   0.013 1.1E-06  32.6 21.4   50  76-128    63-112 (410)
 27 PF07250 Glyoxal_oxid_N: Glyox 96.6   0.017 1.4E-06  31.7 14.0  139  34-175    47-200 (243)
 28 PF03178 CPSF_A: CPSF A subuni 96.6   0.018 1.5E-06  31.4 14.1  170  33-212    2-202 (321)
 29 PF03022 MRJP: Major royal jel 96.5   0.019 1.6E-06  31.4 16.1   60  33-92     34-106 (287)
 30 PF00400 WD40: WD domain, G-be 96.1   0.005 4.2E-07  35.5  5.9   38  127-170   2-39 (39)
 31 PF04053 Coatomer_WDAD: Coatom 95.8   0.043 3.6E-06  28.7 21.4  172  14-210    35-223 (444)
 32 PF01011 PQQ: PQQ enzyme repea 95.8  0.0073 6.1E-07  34.3  5.4   32  25-56     2-33 (38)
 33 PF02897 Peptidase_S9_N: Proly 95.4   0.057 4.8E-06  27.8 12.6   54  59-113    129-189 (415)
 34 PF00400 WD40: WD domain, G-be 94.3   0.039 3.3E-06  29.0  5.7   36  87-122    4-39 (39)
 35 PF05694 SBP56: 56kDa selenium 91.5    0.26 2.2E-05  23.0 13.8  145  23-171    88-276 (461)
 36 PF10168 Nup88: Nuclear pore c 89.2     0.4 3.4E-05  21.6 15.3  107  24-132    34-187 (717)
 37 PF12234 Rav1p_C: RAVE protein 88.6    0.44 3.7E-05  21.3 10.2   46  75-120    52-100 (630)
 38 PF07569 Hira: TUP1-like enhan 84.0    0.73 6.2E-05  19.7  8.5   26  103-128   19-44 (220)
 39 PF11768 DUF3312: Protein of u 83.2    0.78 6.6E-05  19.5 11.6   70  145-214   261-330 (544)
 40 PF10647 Gmad1: Lipoprotein Lp 76.3     1.3 0.00011  17.9 19.5  144  55-203    24-186 (253)
 41 PF05787 DUF839: Bacterial pro 75.5     1.3 0.00011  17.8  8.6   13  235-247   440-452 (524)
 42 PF02333 Phytase: Phytase; In  75.4     1.3 0.00011  17.8 15.8  191  19-220    63-299 (381)
 43 PF03088 Str_synth: Strictosid 74.3     1.4 0.00012  17.6  7.9   35  162-196   35-70 (89)
 44 PF08194 DIM: DIM protein; In  71.1    0.91 7.7E-05  19.0  2.2   30  1-30      1-31 (36)
 45 PF00780 CNH: CNH domain; Int  66.3     2.1 0.00018  16.4 23.1  227  19-287    3-264 (275)
 46 PF00879 Defensin_propep: Defe 64.5     1.4 0.00011  17.8  2.0   24  1-24      1-24 (52)
 47 PF09792 But2: Ubiquitin 3 bin 63.9     2.3 0.00019  16.1  8.6   83  1-91      1-86 (446)
 48 PF10566 Glyco_hydro_97: Glyco 63.8     2.3 0.00019  16.1  3.6   51  1-52      1-53 (643)
 49 PF01436 NHL: NHL repeat; Int  59.0     2.8 0.00023  15.5  4.0   22  185-206   4-25 (28)
 50 PF05777 Acp26Ab: Drosophila a 55.2     3.1 0.00026  15.2  2.5   28  1-29      1-28 (90)
 51 PF02261 Asp_decarbox: Asparta 54.9     2.1 0.00018  16.4  1.6   72  203-274   41-115 (116)
 52 PF01453 B_lectin: D-mannose b 53.2     3.4 0.00029  14.8  6.5   19  161-179   26-44 (114)
 53 PF06649 DUF1161: Protein of u 50.9       3 0.00026  15.2  1.9   17  1-17      1-17 (75)
 54 PF05399 EVI2A: Ectropic viral 50.7     2.4  0.0002  15.9  1.3   11  31-41     32-42 (227)
 55 PF10956 DUF2756: Protein of u 50.2     2.6 0.00022  15.7  1.4   19  1-19      1-19 (104)
 56 PF05264 CfAFP: Choristoneura  47.5     1.6 0.00013  17.2  0.0   11  1-11      1-11 (137)
 57 PF10614 Tafi-CsgF: Curli prod 43.7     4.7  0.0004  13.8  2.0   26  1-27      1-26 (142)
 58 PF10279 Latarcin: Latarcin pr 43.0     2.1 0.00017  16.4  0.0   12  1-12      1-12 (105)
 59 PF03646 FlaG: FlaG protein;   39.4     5.4 0.00046  13.4  2.7   16  35-50     70-85 (107)
 60 PF01939 DUF91: Protein of unk 39.1     5.4 0.00046  13.3  5.4   14  60-73     30-43 (228)
 61 PF07676 PD40: WD40-like Beta  36.8     5.9  0.0005  13.1  4.4   15  99-113    13-27 (39)
 62 PF07202 Tcp10_C: T-complex pr 34.1     6.5 0.00055  12.8 19.4   98  151-276   80-177 (179)
 63 PF04202 Mfp-3: Foot protein 3 32.4     6.9 0.00058  12.6  1.6   23  1-23      1-25 (71)
 64 PF01731 Arylesterase: Arylest 31.9       7 0.00059  12.5  3.6   28  145-172   56-84 (86)
 65 PF00993 MHC_II_alpha: Class I 31.3     7.1  0.0006  12.5  2.5   20  201-220   23-42 (82)
 66 PF06422 PDR_CDR: CDR ABC tran 30.7     5.3 0.00045  13.4  0.5   10  2-11      50-59 (103)
 67 PF07403 DUF1505: Protein of u 28.0     8.1 0.00068  12.1  1.5   24  1-24      1-24 (114)
 68 PF10907 DUF2749: Protein of u 26.9     8.4 0.00071  12.0  1.5   28  1-31      1-28 (66)
 69 PF11714 Inhibitor_I53: Thromb 26.8     8.5 0.00071  11.9  1.1   18  1-18      1-18 (78)
 70 PF03548 LolA: Outer membrane  25.6     8.9 0.00075  11.8  4.6   41  32-86     26-66 (165)
 71 PF07172 GRP: Glycine rich pro 25.2       9 0.00076  11.7  1.2   12  2-13      4-15 (95)
 72 PF03527 RHS: RHS protein; In  22.8     9.9 0.00084  11.4  1.7   14  266-279   15-28 (41)
 73 PF05436 MF_alpha_N: Mating fa 22.8      10 0.00084  11.4  2.1   23  1-23      1-23 (86)
 74 PF12273 RCR: Chitin synthesis 21.3      11  0.0009  11.2  1.3   14  4-17      5-18 (130)
 75 PF08801 Nucleoporin_N: Nup133 21.1      11 0.00091  11.2 11.1   34  145-178   194-229 (424)
 76 PF10793 Gloverin: Gloverin-li 20.6      11 0.00093  11.1  1.2   20  4-23      5-24 (175)
 77 PF10395 Utp8: Utp8 family     20.5      11 0.00093  11.1 17.0   88  35-123    53-158 (670)

No 1
>PF10282 Muc_lac_enz: 3-carboxy-cis,cis-muconate lactonizing enzyme; PDB: 3hfq_B 3fgb_B 1jof_F 1ri6_A 1l0q_B 3bws_B.
Probab=99.46 E-value=3.1e-09 Score=80.79 Aligned_cols=180 Identities=16% Similarity=0.123 Sum_probs=73.5

Q ss_pred             CEEEEEEC--CCCEEEEEEECCCCCCCCEEEEECCCCEEEEE------CCEEEEEECCCC--CE--EEEECCCCCCEEEE
Q ss_conf             86999988--78829999944998731164790798399841------882899752677--03--67723777631487
Q T0558            33 NKIAIINK--DTKEIVWEYPLEKGWECNSVAATKAGEILFSY------SKGAKMITRDGR--EL--WNIAAPAGCEMQTA  100 (294)
Q Consensus        33 ~~i~~~d~--~tg~~~w~~~~~~~~~~~~~~~~pdG~~l~s~------~~~v~~~~~~~~--~~--~~~~~~~~~~v~~~  100 (294)
+.|+.++. ++|++...........+..++++||+++|++. .+.+..|..+.. 
.+ .............+ T Consensus 13 ~gI~~~~~d~~~g~l~~~~~~~~~~~ps~l~~s~~~~~LY~~~~~~~~~g~v~~~~i~~~~g~l~~~~~~~~~g~~p~~~ 92 (345) T PF10282_consen 13 GGIYVYRFDDETGTLTLVSTVAEGGNPSYLALSPDGKRLYAVNESGTESGGVSSFRIDPDTGSLTLLNTVPSGGSSPCHL 92 (345) T ss_dssp TEEEEEEEETTTTEEEEEEEEETTS--SSEEE-TTSSEEEEEECCCCTTTEEEEEEEES----EEEEEEEES--EEECEE T ss_pred CCEEEEEECCCCCCCEEEEEECCCCCCCEEEEECCCCEEEEEEECCCCCCCEEEEEECCCCCEEEEEEEECCCCCCCEEE T ss_conf 97899998688886128112036899847999768987999860578886189998679753138810205789985699 Q ss_pred EECCCCCEEEEEECCC-CEEEEEC-CCCCEEEEEECCCCC-------CCCCCCCCEEEECCCCCEEEEEE-CCCEEEEEE Q ss_conf 8737875899970589-7999985-689588999706776-------77667400899969989999974-698899993 Q T0558 101 RILPDGNALVAWCGHP-STILEVN-MKGEVLSKTEFETGI-------ERPHAQFRQINKNKKGNYLVPLF-ATSEVREIA 170 (294) Q Consensus 101 ~~~~dg~~l~~~s~~~-~~~~~~~-~~G~~~~~~~~~~~~-------~~~~~~~~~~~~s~dG~~i~~g~-~d~~i~~~d 170 (294) +.+|+|++++++.... .+.++.. .+|.+.......... .......-.+.++|||+++++.. ....|.+++ T Consensus 93 ~~~~~g~~l~vany~~g~v~v~~l~~~G~~~~~~~~~~~~g~g~~~~rq~~~h~H~v~~sPdg~~l~v~dlG~D~i~~~~ 172 (345) T PF10282_consen 93 AVDPDGRYLFVANYGGGSVSVYPLDDDGSLGEQVQVIQHEGSGPNPDRQEGPHPHQVVFSPDGKYLFVPDLGADRIYVYD 172 (345) T ss_dssp EEETTTTEEEEEEST---EEEEEE-----EEEEEEEE--------TTT-TT--EEEEEE-TTSSEEEEEECCCTEEEEEE T ss_pred EECCCCCEEEEEECCCCEEEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCEEEEEECCCCEEEEEE T ss_conf 97699999999984699899999277556442101002578887656678994308898899999999978999799999 Q ss_pred CCCC---EEE----EEECCCEEEEEEEECCCCEE-EECCCCCEEEEEECC Q ss_conf 7885---889----85169734898873489689-973589879999878 Q T0558 171 PNGQ---LLN----SVKLSGTPFSSAFLDNGDCL-VACGDAHCFVQLNLE 212 (294) Q Consensus 171 ~~g~---~~~----~~~~~~~~~~~~~~~~g~~~-v~~~~~~~i~~~d~~ 212 (294) .+.. ... ....+..|..+.++|+|+++ +++..++.+.+++.. 
T Consensus 173 ~~~~~~~l~~~~~~~~~~g~gPRh~~f~pdg~~~YV~~E~s~~V~v~~~~ 222 (345) T PF10282_consen 173 FDPGTGKLTPVPSIKVPPGSGPRHLAFSPDGRYAYVVNELSNTVSVYDYD 222 (345) T ss_dssp E-TT--TEEEEEEEECE----EEEEEE-TTTTEEEEEETTTTEEEEEEET T ss_pred EECCCCEEEECCEECCCCCCCCCEEEECCCCCEEEEEECCCCEEEEEEEC T ss_conf 70787323242202157789984699958999899974678848999951 No 2 >PF05935 Arylsulfotrans: Arylsulfotransferase (ASST); InterPro: IPR010262 This family consists of several bacterial arylsulphotransferase proteins. Arylsulphotransferase (ASST) transfers a sulphate group from phenolic sulphate esters to a phenolic acceptor substrate .; PDB: 3ett_B 3elq_B 3ets_A. Probab=99.38 E-value=4.7e-10 Score=86.81 Aligned_cols=256 Identities=17% Similarity=0.201 Sum_probs=173.0 Q ss_pred CCCCEEEEE---CCCCEEEEEECCCCEEEEEEECCCCCCCCEEEEECCCCEEEEECCEEEEEECCCCCEEEEECCCCC-- Q ss_conf 888589997---479869999887882999994499873116479079839984188289975267703677237776-- Q T0558 21 SPQHLLVGG---SGWNKIAIINKDTKEIVWEYPLEKGWECNSVAATKAGEILFSYSKGAKMITRDGRELWNIAAPAGC-- 95 (294) Q Consensus 21 ~~~~~l~~g---s~~~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~pdG~~l~s~~~~v~~~~~~~~~~~~~~~~~~~-- 95 (294) .+.-+++.+ ......+++|. .|+++|.++....... .+...+||.+++.....+..+|..|+.+|++..+... 
T Consensus 113 ~~gLy~~~~~~~~~~~~~~i~D~-~G~vrW~~~~~~~~~~-~~~~~~nG~l~~~~~~~~~e~D~~G~vi~~~~l~~~~~~ 190 (471) T PF05935_consen 113 EDGLYFVNPNDWDYQPGPYIYDN-NGNVRWYLPSDSGRDN-RFKRLDNGHLLFGSGGRYYEYDWLGKVIWQYDLPNGYYD 190 (471) T ss_dssp TT-EEEEEETT---EEEEEEEE-----EEEEE-GGGT-----EEE-----EEEE---EEEEE-----EEEEEE------- T ss_pred CCCEEEEECCCCCCCCEEEEECC-CCCEEEEECCCCCCCC-EEEECCCCCEEEEECCEEEEECCCCCEEEEEECCCCCCC T ss_conf 78689996777777760699959-9869999527778761-478738986999978848998889978999987887776 Q ss_pred CEEEEEECCCCCEEEEEEC-------------CCCEEEEECCCCCEEEEEECCCCCCCCC-------------------- Q ss_conf 3148787378758999705-------------8979999856895889997067767766-------------------- Q T0558 96 EMQTARILPDGNALVAWCG-------------HPSTILEVNMKGEVLSKTEFETGIERPH-------------------- 142 (294) Q Consensus 96 ~v~~~~~~~dg~~l~~~s~-------------~~~~~~~~~~~G~~~~~~~~~~~~~~~~-------------------- 142 (294) .-+.+...|+|++|+.+.. ...+ +..+.+|+++|++.......... T Consensus 191 ~HHd~~~~~nGn~Li~~~~~~~~~~~~~~~~~~D~i-iEid~tG~vv~~W~~~dhld~~~~~~~~~~~~~~~~~~~~~~~ 269 (471) T PF05935_consen 191 FHHDFQELPNGNILILAYERRYADEGKDGWTVEDVI-IEIDETGEVVWEWDASDHLDPYRDTNLKDLNDPFGDNPGSGGG 269 (471) T ss_dssp --S-EEE-----EEEEE--TTEE----EE---S-EE-EEE-----EEEEEEGGGTS-TT--S---B-T------------ T ss_pred CCEEEEECCCCCEEEEEEECCCCCCCCCCCEEECEE-EEECCCCCEEEEECHHHCCCCCCCCHHHCCCCCCCCCCCCCCC T ss_conf 430118969997999997311124677886882689-9998999699998766768800065000255566756688998 Q ss_pred ---CCCCEEEECC-CCCEEEEEECCCEEEEEE-CCCCEEEEEECCC------------E------------EEEEEEECC Q ss_conf ---7400899969-989999974698899993-7885889851697------------3------------489887348 Q T0558 143 ---AQFRQINKNK-KGNYLVPLFATSEVREIA-PNGQLLNSVKLSG------------T------------PFSSAFLDN 193 (294) Q Consensus 143 ---~~~~~~~~s~-dG~~i~~g~~d~~i~~~d-~~g~~~~~~~~~~------------~------------~~~~~~~~~ 193 (294) ..++++.+.+ ||.+|++.+....|...| .+++.+|...... 
+ -+...+.++ T Consensus 270 ~Dw~HiNsv~yd~~d~~iliS~R~~s~V~~Id~~tg~I~W~lG~~~~~~~~~~~~~l~p~~~~~~~~~~~~QH~a~~~~~ 349 (471) T PF05935_consen 270 WDWFHINSVDYDPDDDSILISSRHQSTVIKIDYRTGEIKWILGGKGGWSKDYQDYLLTPVDGDGDFDWFWGQHDARFIPD 349 (471) T ss_dssp --S--EEEEEEETTTTEEEEEETT----EEE----S---EE-S------TTTGGGB--BB-SSSS----SS-EEEEE--- T ss_pred CCCCCCCCCEECCCCCCEEEECCCCEEEEEEECCCCCEEEEECCCCCCCCCCHHHCCCCCCCCCCCCEEECCCCCEECCC T ss_conf 88737356077789993999767755899995699868999479877673101200343566877650003542078189 Q ss_pred C---CEEEECC------------------CCCEEEEEECCCC--EEEEEECCCCCCCEEECCCCCEEECCC-CCEEEEEC Q ss_conf 9---6899735------------------8987999987898--599984488764114113453489389-98999804 Q T0558 194 G---DCLVACG------------------DAHCFVQLNLESN--RIVRRVNANDIEGVQLFFVAQLFPLQN-GGLYICNW 249 (294) Q Consensus 194 g---~~~v~~~------------------~~~~i~~~d~~~g--~~~~~~~~~~~~~~~~~~~~~~~~~~~-G~i~i~~~ 249 (294) + .+++... +....+.+|.+.+ +++|++.............++.+.++| |++++... T Consensus 350 ~~~~~i~~FDNg~~~~~~~~~~~~~~~~~Sr~~~y~ID~~~~Tv~~v~~y~~~~g~~~yS~~~s~~q~L~n~gn~li~~G 429 (471) T PF05935_consen 350 GPQGNILVFDNGNGRGYSQPNLVWMKDNYSRGVEYKIDENNMTVEQVWEYGKPRGNDFYSPSQSSAQYLPNTGNTLIGSG 429 (471) T ss_dssp -----EEEEE-----TTS--SSCCG-----EEEEEEE-S----EEEEEEE------TT--SS----EEETTTTEEE---- T ss_pred CCEEEEEEEECCCCCCCCCCCCCCCCCCCCEEEEEEECCCCCEEEEEEEECCCCCCCCCCCCCCCEEECCCCCCEEEECC T ss_conf 97279999958986666776655677753436999985899869999997589887641575122058369998999746 Q ss_pred CCCEEECCCCCCCEEEEECCCC-CEEEEEEC Q ss_conf 6771430247775699990899-89999835 Q T0558 250 QGHDREAGKGKHPQLVEIDSEG-KVVWQLND 279 (294) Q Consensus 250 ~~~~~~~~~~~~~~~~~i~~~G-~~vW~~~~ 279 (294) ..............+.+++..+ +++.++.- T Consensus 430 ~~g~~~~~~~~~~~i~E~~~~~~~v~~e~~~ 460 (471) T PF05935_consen 430 SAGLFSNGKPTTGVITEIDYETKEVVFEIKV 460 (471) T ss_dssp -BTTT----B-B--EEEEETTT--EEEEEEE T ss_pred CCCCCCCCCCCCCEEEEEECCCCEEEEEEEE T ss_conf 5653256888875169996699769999996 No 3 >PF02239 Cytochrom_D1: Cytochrome D1 heme domain; InterPro: IPR003143 
Cytochrome cd1 (nitrite reductase) catalyses the conversion of nitrite to nitric oxide in the nitrogen cycle. This family represents the d1 haem-binding domain of cytochrome cd1, in which His/Tyr side chains ligate the d1 haem iron of the active site in the oxidized state .; GO: 0005506 iron ion binding, 0009055 electron carrier activity, 0020037 heme binding, 0050421 nitrite reductase (NO-forming) activity, 0006118 electron transport, 0046209 nitric oxide metabolic process, 0042597 periplasmic space; PDB: 1aof_B 1hj5_B 1hj3_A 1aom_B 1h9x_B 1dy7_B 1hcm_B 1e2r_B 1qks_B 1aoq_B .... Probab=99.37 E-value=1.1e-09 Score=84.06 Aligned_cols=195 Identities=12% Similarity=0.141 Sum_probs=139.5 Q ss_pred CEEEEECCCCEEEEEECCCCEEEEEEECCCCCCCCEEEEECCCCEEEEE--CCEEEEEECCCCCEEE-EECCCCCCEEEE Q ss_conf 5899974798699998878829999944998731164790798399841--8828997526770367-723777631487 Q T0558 24 HLLVGGSGWNKIAIINKDTKEIVWEYPLEKGWECNSVAATKAGEILFSY--SKGAKMITRDGRELWN-IAAPAGCEMQTA 100 (294) Q Consensus 24 ~~l~~gs~~~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~pdG~~l~s~--~~~v~~~~~~~~~~~~-~~~~~~~~v~~~ 100 (294) -.+++-.++++|.++|.+|.+++.+.+.+.. ....+.++|||++++.. |+.+.++|.....+.+ ...+. 
...++ T Consensus 7 l~~V~~~~~~~V~viD~~t~~v~~~i~~~~~-~h~~~~~s~Dgr~~yv~~rdg~vsviD~~~~~vv~~v~~G~--~~~gi 83 (369) T PF02239_consen 7 LFVVVERGDGSVSVIDGATNKVLATIPTGGA-PHAGVKFSPDGRYLYVASRDGWVSVIDLWTGKVVATVKVGS--NPRGI 83 (369) T ss_dssp -EEEEBTT---EEEE-----SEEEEE--------EEEE------EEEEE---SEEEEEETTSSSEEEEEE-----B---- T ss_pred EEEEEECCCCEEEEEECCCCEEEEEEECCCC-CEEEEEECCCCCEEEEECCCCEEEEEECCCCEEEEEEECCC--CCCEE T ss_conf 8999965899699998999869999727987-51587656888689997799729999898557999995487--86326 Q ss_pred EECCCCCEEEEEEC-CCCEEEEECCCCCEEEEEECCCCCC-CCCCCCCEEEECCCCC-EEEEEECCCEEEEEECCC---C Q ss_conf 87378758999705-8979999856895889997067767-7667400899969989-999974698899993788---5 Q T0558 101 RILPDGNALVAWCG-HPSTILEVNMKGEVLSKTEFETGIE-RPHAQFRQINKNKKGN-YLVPLFATSEVREIAPNG---Q 174 (294) Q Consensus 101 ~~~~dg~~l~~~s~-~~~~~~~~~~~G~~~~~~~~~~~~~-~~~~~~~~~~~s~dG~-~i~~g~~d~~i~~~d~~g---~ 174 (294) ++++||+++++++. .+++.+++..+.+.+.......... ....++.++..++... ++++....+.|.+.|.+. . T Consensus 84 a~S~DGk~v~v~n~~~~~v~viD~~tle~v~~i~~~~~~~~~~~sRv~aIv~s~~~~~fVv~lkd~~~I~iVdy~d~~~~ 163 (369) T PF02239_consen 84 AVSPDGKYVYVANYWPGTVVVIDAETLEPVKTIPTGGYTGDGPESRVAAIVASPYRPEFVVNLKDTGEIWIVDYSDPKNL 163 (369) T ss_dssp EE--TTT-EEE--TBTTEEEEE------EEEEEE-----TTTS---EEEEEE-SSSSEEEEEES----EEEEETTTSSCE T ss_pred EECCCCCEEEEEECCCCCEEEEECCCCCCEEEEEECCCCCCCCCCCEEEEEECCCCCEEEEEECCCCEEEEEEECCCCCC T ss_conf 98789989999955899259980677640356530234465668732556717899779999803887999970467754 Q ss_pred EEEEEECCCEEEEEEEECCCCEEEE-CCCCCEEEEEECCCCEEEEEEC Q ss_conf 8898516973489887348968997-3589879999878985999844 Q T0558 175 LLNSVKLSGTPFSSAFLDNGDCLVA-CGDAHCFVQLNLESNRIVRRVN 221 (294) Q Consensus 175 ~~~~~~~~~~~~~~~~~~~g~~~v~-~~~~~~i~~~d~~~g~~~~~~~ 221 (294) .......+..++...+.++++++++ ....+.+.++|.++++++..+. 
T Consensus 164 ~~~~i~~g~~l~Dg~~d~~gRy~lva~~~~n~i~vvd~~~~~~v~~i~ 211 (369) T PF02239_consen 164 KVRTIKVGRFLHDGGFDPDGRYFLVAANGSNKIAVVDTKTGKLVALID 211 (369) T ss_dssp EEEEE---TTB------TTSSEEEEEEGGGTEEEEEETTTTEEEEEEE T ss_pred CEECCCCCCCCCCCCCCCCCEEEEEHHCCCCCEEEEECCCCEEEEEEC T ss_conf 200024543433566786301777000457821256236623899851 No 4 >PF10282 Muc_lac_enz: 3-carboxy-cis,cis-muconate lactonizing enzyme; PDB: 3hfq_B 3fgb_B 1jof_F 1ri6_A 1l0q_B 3bws_B. Probab=99.31 E-value=3.7e-09 Score=80.26 Aligned_cols=242 Identities=14% Similarity=0.138 Sum_probs=144.0 Q ss_pred HHHCCCCCCCCEEEEECC----CCEEEEEECCC--CEE--EEEEECCCCCCCCEEEEECCCCEEEEE---CCEEEEEECC Q ss_conf 320014788858999747----98699998878--829--999944998731164790798399841---8828997526 Q T0558 14 APFAQGSSPQHLLVGGSG----WNKIAIINKDT--KEI--VWEYPLEKGWECNSVAATKAGEILFSY---SKGAKMITRD 82 (294) Q Consensus 14 ~~~~~~s~~~~~l~~gs~----~~~i~~~d~~t--g~~--~w~~~~~~~~~~~~~~~~pdG~~l~s~---~~~v~~~~~~ 82 (294) +.|...++++++|.+.+. ++.|..|+.+. |++ +-+.+. .+..+..++++|||++++.. ++.+.++..+ T Consensus 39 ps~l~~s~~~~~LY~~~~~~~~~g~v~~~~i~~~~g~l~~~~~~~~-~g~~p~~~~~~~~g~~l~vany~~g~v~v~~l~ 117 (345) T PF10282_consen 39 PSYLALSPDGKRLYAVNESGTESGGVSSFRIDPDTGSLTLLNTVPS-GGSSPCHLAVDPDGRYLFVANYGGGSVSVYPLD 117 (345) T ss_dssp -SSEEE-TTSSEEEEEECCCCTTTEEEEEEEES----EEEEEEEES---EEECEEEEETTTTEEEEEEST---EEEEEE- T ss_pred CCEEEEECCCCEEEEEEECCCCCCCEEEEEECCCCCEEEEEEEECC-CCCCCEEEEECCCCCEEEEEECCCCEEEEEEEC T ss_conf 8479997689879998605788861899986797531388102057-899856999769999999998469989999927 Q ss_pred C--CCEEEE--E----------CCCCCCEEEEEECCCCCEEEEEEC-CCCEEEEEC--CCCCEEEEEECCCCCCCCCCCC Q ss_conf 7--703677--2----------377763148787378758999705-897999985--6895889997067767766740 Q T0558 83 G--RELWNI--A----------APAGCEMQTARILPDGNALVAWCG-HPSTILEVN--MKGEVLSKTEFETGIERPHAQF 145 (294) Q Consensus 83 ~--~~~~~~--~----------~~~~~~v~~~~~~~dg~~l~~~s~-~~~~~~~~~--~~G~~~~~~~~~~~~~~~~~~~ 145 (294) . ...... . ......++.+.++|||++++++.. .+.+.+++. .++++...... ........ 
T Consensus 118 ~~G~~~~~~~~~~~~g~g~~~~rq~~~h~H~v~~sPdg~~l~v~dlG~D~i~~~~~~~~~~~l~~~~~~---~~~~g~gP 194 (345) T PF10282_consen 118 DDGSLGEQVQVIQHEGSGPNPDRQEGPHPHQVVFSPDGKYLFVPDLGADRIYVYDFDPGTGKLTPVPSI---KVPPGSGP 194 (345) T ss_dssp ----EEEEEEEE--------TTT-TT--EEEEEE-TTSSEEEEEECCCTEEEEEEE-TT--TEEEEEEE---ECE----E T ss_pred CCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEEECCCCCEEEEEECCCCEEEEEEEECCCCEEEECCEE---CCCCCCCC T ss_conf 755644210100257888765667899430889889999999997899979999970787323242202---15778998 Q ss_pred CEEEECCCCCEEEEEEC-CCEEEEEECC--CCEEE---EEEC-------CCEEEEEEEECCCCEEEE-CCCCCEEEEEEC Q ss_conf 08999699899999746-9889999378--85889---8516-------973489887348968997-358987999987 Q T0558 146 RQINKNKKGNYLVPLFA-TSEVREIAPN--GQLLN---SVKL-------SGTPFSSAFLDNGDCLVA-CGDAHCFVQLNL 211 (294) Q Consensus 146 ~~~~~s~dG~~i~~g~~-d~~i~~~d~~--g~~~~---~~~~-------~~~~~~~~~~~~g~~~v~-~~~~~~i~~~d~ 211 (294) +++.|+|||+++++... +++|.+++.. ...+. .... ...+..+.++|||+++++ ....+.|.+++. T Consensus 195 Rh~~f~pdg~~~YV~~E~s~~V~v~~~~~~~g~~~~~~~~~~~p~~~~~~~~~~~I~~spdG~~lyvsnr~~~sI~vf~i 274 (345) T PF10282_consen 195 RHLAFSPDGRYAYVVNELSNTVSVYDYDSSDGTLTIVQTVSTLPEGPTGANSPAGIALSPDGKFLYVSNRGSNSISVFDI 274 (345) T ss_dssp EEEEE-TTTTEEEEEETTTTEEEEEEETTTT-EEEEEEEEESSETTCESBCCEEEEEE-----EEEEEECCTTEEEEEEE T ss_pred CEEEECCCCCEEEEEECCCCEEEEEEECCCCCEEEEEEEEECCCCCCCCCCCCEEEEECCCCCEEEEECCCCCEEEEEEE T ss_conf 46999589998999746788489999516764159989995125777788740389998988989997478998999999 Q ss_pred --CCCEEE--EEECCCCCCCEEECCCCCEEECCCCCEE-EEECCCCEEECCCCCCCEEEEECC-CCCE Q ss_conf --898599--9844887641141134534893899899-980467714302477756999908-9989 Q T0558 212 --ESNRIV--RRVNANDIEGVQLFFVAQLFPLQNGGLY-ICNWQGHDREAGKGKHPQLVEIDS-EGKV 273 (294) Q Consensus 212 --~~g~~~--~~~~~~~~~~~~~~~~~~~~~~~~G~i~-i~~~~~~~~~~~~~~~~~~~~i~~-~G~~ 273 (294) ++|++. ..+... -.++.++.+.++|+.+ +.+..+. ...++.+|. 
+|.+ T Consensus 275 d~~~g~l~~v~~~~~~------G~~Pr~~~~spdG~~L~van~~s~--------~V~vf~~d~~~G~l 328 (345) T PF10282_consen 275 DPATGKLTLVQTIPTG------GKFPRGFAFSPDGKYLYVANQDSN--------TVSVFDRDAETGKL 328 (345) T ss_dssp CCEETTEEEEEEEE-S------SS-B-EEEE-TTSSEEEEEETTTT--------EEEEEEE-S----E T ss_pred ECCCCCEEEEEEEECC------CCCCCEEEECCCCCEEEEEECCCC--------EEEEEEEECCCCCE T ss_conf 6689838997779489------999887899689999999977999--------39999998999978 No 5 >PF02239 Cytochrom_D1: Cytochrome D1 heme domain; InterPro: IPR003143 Cytochrome cd1 (nitrite reductase) catalyses the conversion of nitrite to nitric oxide in the nitrogen cycle. This family represents the d1 haem-binding domain of cytochrome cd1, in which His/Tyr side chains ligate the d1 haem iron of the active site in the oxidized state .; GO: 0005506 iron ion binding, 0009055 electron carrier activity, 0020037 heme binding, 0050421 nitrite reductase (NO-forming) activity, 0006118 electron transport, 0046209 nitric oxide metabolic process, 0042597 periplasmic space; PDB: 1aof_B 1hj5_B 1hj3_A 1aom_B 1h9x_B 1dy7_B 1hcm_B 1e2r_B 1qks_B 1aoq_B .... Probab=99.18 E-value=1.1e-08 Score=76.74 Aligned_cols=257 Identities=12% Similarity=0.077 Sum_probs=147.4 Q ss_pred CCCCCCCCEEEEECCCCEEEEEECCCCEEEEEEECCCCCCCCEEEEECCCCEEEEE---CCEEEEEECCCCCEEEEE-CC Q ss_conf 01478885899974798699998878829999944998731164790798399841---882899752677036772-37 Q T0558 17 AQGSSPQHLLVGGSGWNKIAIINKDTKEIVWEYPLEKGWECNSVAATKAGEILFSY---SKGAKMITRDGRELWNIA-AP 92 (294) Q Consensus 17 ~~~s~~~~~l~~gs~~~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~pdG~~l~s~---~~~v~~~~~~~~~~~~~~-~~ 92 (294) ...+++++++.+.+.|+.|.++|..+++++.+.+.+ .....++++|||++++.+ .+.+.++|...-+..+.. .. 
T Consensus 42 ~~~s~Dgr~~yv~~rdg~vsviD~~~~~vv~~v~~G--~~~~gia~S~DGk~v~v~n~~~~~v~viD~~tle~v~~i~~~ 119 (369) T PF02239_consen 42 VKFSPDGRYLYVASRDGWVSVIDLWTGKVVATVKVG--SNPRGIAVSPDGKYVYVANYWPGTVVVIDAETLEPVKTIPTG 119 (369) T ss_dssp EE------EEEEE---SEEEEEETTSSSEEEEEE-----B----EE--TTT-EEE--TBTTEEEEE------EEEEEE-- T ss_pred EEECCCCCEEEEECCCCEEEEEECCCCEEEEEEECC--CCCCEEEECCCCCEEEEEECCCCCEEEEECCCCCCEEEEEEC T ss_conf 765688868999779972999989855799999548--786326987899899999558992599806776403565302 Q ss_pred ------CCCCEEEEEECCCCCEEEEEECCC-CEEEEECCCCCEEEEEECCCCCCCCCCCCCEEEECCCCCEEEEEE-CCC Q ss_conf ------776314878737875899970589-799998568958899970677677667400899969989999974-698 Q T0558 93 ------AGCEMQTARILPDGNALVAWCGHP-STILEVNMKGEVLSKTEFETGIERPHAQFRQINKNKKGNYLVPLF-ATS 164 (294) Q Consensus 93 ------~~~~v~~~~~~~dg~~l~~~s~~~-~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~s~dG~~i~~g~-~d~ 164 (294) ....+.++..++....+++.-.+. .+.+.+..+.+.+.......+... ....++|+|+|++++. ... T Consensus 120 ~~~~~~~~sRv~aIv~s~~~~~fVv~lkd~~~I~iVdy~d~~~~~~~~i~~g~~l-----~Dg~~d~~gRy~lva~~~~n 194 (369) T PF02239_consen 120 GYTGDGPESRVAAIVASPYRPEFVVNLKDTGEIWIVDYSDPKNLKVRTIKVGRFL-----HDGGFDPDGRYFLVAANGSN 194 (369) T ss_dssp ---TTTS---EEEEEE-SSSSEEEEEES----EEEEETTTSSCEEEEEE---TTB-----------TTSSEEEEEEGGGT T ss_pred CCCCCCCCCCEEEEEECCCCCEEEEEECCCCEEEEEEECCCCCCCEECCCCCCCC-----CCCCCCCCCEEEEEHHCCCC T ss_conf 3446566873255671789977999980388799997046775420002454343-----35667863017770004578 Q ss_pred EEEEEECC-CCEEEEEECCCEEEEE----EEECC----------CCEEEECCCCCEEEEEECCCCEEEEEECCCCCCCEE Q ss_conf 89999378-8588985169734898----87348----------968997358987999987898599984488764114 Q T0558 165 EVREIAPN-GQLLNSVKLSGTPFSS----AFLDN----------GDCLVACGDAHCFVQLNLESNRIVRRVNANDIEGVQ 229 (294) Q Consensus 165 ~i~~~d~~-g~~~~~~~~~~~~~~~----~~~~~----------g~~~v~~~~~~~i~~~d~~~g~~~~~~~~~~~~~~~ 229 (294) .+.++|.+ ++.......+..+... ..++. +...++.-..+.+.+++..+-+++.++.....+ .. 
T Consensus 195 ~i~vvd~~~~~~v~~i~~g~~~~~~~~~~~pH~~~g~vW~t~~~~~~~v~~ig~~~v~v~~~~~wkvV~~i~~~G~g-lF 273 (369) T PF02239_consen 195 KIAVVDTKTGKLVALIDTGKKPHPGPGANMPHPGFGPVWATSHLGVFAVPLIGTDPVSVHDDYAWKVVKEIPTPGGG-LF 273 (369) T ss_dssp EEEEEETTTTEEEEEEE----BE-----EE--S----EEEEEB----EEEEE---TTT--TTTTTS--EEEE-------- T ss_pred CEEEEECCCCEEEEEECCCCCCCCCCCCCCCCCCCCEEEEECCCCCEEECCCCCCCCCCCCCCCCEEEEEEECCCCC-CE T ss_conf 21256236623899851365666665210006774415665168730410147884456521287598999688987-62 Q ss_pred ECCCCC--EEE------CCCCCEEEEECCCCEEECC--C--CCCCEEEEECCCCCEEEE-EECCC Q ss_conf 113453--489------3899899980467714302--4--777569999089989999-83588 Q T0558 230 LFFVAQ--LFP------LQNGGLYICNWQGHDREAG--K--GKHPQLVEIDSEGKVVWQ-LNDKV 281 (294) Q Consensus 230 ~~~~~~--~~~------~~~G~i~i~~~~~~~~~~~--~--~~~~~~~~i~~~G~~vW~-~~~~~ 281 (294) +...++ ... ..+..+.+.+.....+... . +...--++++++|+.+|- .-+.+ T Consensus 274 vkthP~s~~lwvd~~~~~~~~~v~viD~~tl~~v~~i~~~~~~~~~h~ef~~dG~~vwvS~w~~~ 338 (369) T PF02239_consen 274 VKTHPDSPYLWVDTFLNPDNDSVQVIDKQTLEVVKTIAPGPGKRVVHPEFNKDGDEVWVSVWDGN 338 (369) T ss_dssp EE--TT-SEEEEE-TT-SSHT-EEEEECCGTEEEE-HHHHHT--EEEEEE-----EEEEEEE--- T ss_pred EECCCCCCEEEEECCCCCCCCEEEEEECCCCCEEEEEECCCCCCEECCEECCCCCEEEEEEECCC T ss_conf 68689984089805779988779999887772778871068985655369999999999985799 No 6 >PF07433 DUF1513: Protein of unknown function (DUF1513); InterPro: IPR008311 There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. 
Probab=99.17 E-value=2.2e-07 Score=67.31 Aligned_cols=220 Identities=15% Similarity=0.175 Sum_probs=146.0 Q ss_pred CCCCCEEEEE-CCCCEEEEEECCCCEEEEEEECCCCCCCC-EEEEECCCCEEEEE-------CCEEEEEECCCCC-EEEE Q ss_conf 7888589997-47986999988788299999449987311-64790798399841-------8828997526770-3677 Q T0558 20 SSPQHLLVGG-SGWNKIAIINKDTKEIVWEYPLEKGWECN-SVAATKAGEILFSY-------SKGAKMITRDGRE-LWNI 89 (294) Q Consensus 20 s~~~~~l~~g-s~~~~i~~~d~~tg~~~w~~~~~~~~~~~-~~~~~pdG~~l~s~-------~~~v~~~~~~~~~-~~~~ 89 (294) +.+. .++.+ --.....++|..+|+.+.......+...+ +.+|+|||++|++. .+-+-+|+...+. .... T Consensus 15 ~~~~-avafARRPG~~~~v~D~~~g~~~~~l~a~~~RHFyGHg~fs~DG~~LytTEnd~~~g~G~IGV~d~~~~~~ri~E 93 (305) T PF07433_consen 15 TRPE-AVAFARRPGTFAVVFDCRTGQVLQRLAAPPGRHFYGHGVFSPDGRLLYTTENDYETGRGVIGVYDAADGYRRIGE 93 (305) T ss_pred CCCE-EEEEEECCCCEEEEEECCCCCEEEEECCCCCCEEECCEEECCCCCEEEECCCCCCCCCEEEEEEECCCCCEEEEE T ss_conf 9973-999997997689999858895668863898856515676849989898605566789569999987679289877 Q ss_pred ECCCCCCEEEEEECCCCCEEEEEEC------------------CCCEEEEECCCCCEEEEEECCCCCCCCCCCCCEEEEC Q ss_conf 2377763148787378758999705------------------8979999856895889997067767766740089996 Q T0558 90 AAPAGCEMQTARILPDGNALVAWCG------------------HPSTILEVNMKGEVLSKTEFETGIERPHAQFRQINKN 151 (294) Q Consensus 90 ~~~~~~~v~~~~~~~dg~~l~~~s~------------------~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~s 151 (294) ...+....+.+.+.|||..|+++.+ ..++.+.+..+|+++.+..+.. ..+...++++++. 
T Consensus 94 f~s~GIGPHel~~~pdg~tLvVANGGI~Thpd~GR~kLNLdtM~psL~~ld~~~G~ll~q~~L~~--~~~~lSiRHLav~ 171 (305) T PF07433_consen 94 FPSGGIGPHELLLMPDGETLVVANGGIETHPDSGRAKLNLDTMQPSLVYLDARSGALLEQWELPP--DLHQLSIRHLAVD 171 (305) T ss_pred ECCCCCCHHHEEECCCCCEEEEECCCCCCCCCCCCCCCCHHHCCCCEEEEECCCCCEEEEECCCH--HHHHCCEEEEEEC T ss_conf 53899583538986999989997589816887686145832348615898427875221320683--4622115678774 Q ss_pred CCCCEEEEEECCCE-------EEEEECCCCEEEEEE--------CCCEEEEEEEECCCCEE-EECCCCCEEEEEECCCCE Q ss_conf 99899999746988-------999937885889851--------69734898873489689-973589879999878985 Q T0558 152 KKGNYLVPLFATSE-------VREIAPNGQLLNSVK--------LSGTPFSSAFLDNGDCL-VACGDAHCFVQLNLESNR 215 (294) Q Consensus 152 ~dG~~i~~g~~d~~-------i~~~d~~g~~~~~~~--------~~~~~~~~~~~~~g~~~-v~~~~~~~i~~~d~~~g~ 215 (294) +||..+......+. +.+++.. ..+.... ..+.+-|+++..+|..+ +++=.++++.+||..+|+ T Consensus 172 ~~G~v~~a~Q~qG~~~~~~PLla~~~~g-~~~~~~~~p~~~~~~l~~Y~GSVA~~~~g~~iavtsPrGg~~~~~d~~tg~ 250 (305) T PF07433_consen 172 GDGTVWFAMQYQGDPGDAPPLLALHRRG-EALQLLPAPEEQWRRLNGYIGSVAASRDGRLIAVTSPRGGRVQVWDAATGR 250 (305) T ss_pred CCCCEEEEEEECCCCCCCCCEEEEECCC-CCCEECCCCHHHHHHHCCCEEEEEECCCCCEEEEECCCCCEEEEEECCCCC T ss_conf 9983899886138866678758996189-863123798679887479279999869999999988989889999999888 Q ss_pred EEEEECCCCCCCEEECCCCCEEECCCCCEEEEECCCC Q ss_conf 9998448876411411345348938998999804677 Q T0558 216 IVRRVNANDIEGVQLFFVAQLFPLQNGGLYICNWQGH 252 (294) Q Consensus 216 ~~~~~~~~~~~~~~~~~~~~~~~~~~G~i~i~~~~~~ 252 (294) ++......+ ++++...++| +++.+..+. T Consensus 251 ~~~~~~l~D--------~cGva~~~~g-f~~ssG~G~ 278 (305) T PF07433_consen 251 LLGSVPLPD--------ACGVAALAGG-FLASSGQGR 278 (305) T ss_pred EEECCCCCC--------EEEEEECCCC-EEEECCCCC T ss_conf 762427252--------5788676997-699679985 No 7 >PF08450 SGL: SMP-30/Gluconolaconase/LRE-like region; InterPro: IPR013658 This family describes a region that is found in proteins expressed by a variety of eukaryotic and prokaryotic species. 
These proteins include various enzymes, such as senescence marker protein 30 (SMP-30, Q15493 from SWISSPROT), gluconolactonase (Q01578 from SWISSPROT) and luciferin-regenerating enzyme (LRE, Q86DU5 from SWISSPROT). SMP-30 is known to hydrolyse diisopropyl phosphorofluoridate in the liver, and has been noted as having sequence similarity, in the region described in this family, with PON1 (P52430 from SWISSPROT) and LRE. ; PDB: 2ghs_A 3e5z_A 3dr2_A 2dg0_J 2dg1_C 2dso_F 2iax_A 2ias_A 2iaw_A 3byc_A .... Probab=99.02 E-value=3e-07 Score=66.34 Aligned_cols=189 Identities=16% Similarity=0.111 Sum_probs=128.3 Q ss_pred CCCEEEEECCCCEEEEEECCCCEEEEEEECCCCCCCCEEEEE-CCCCEEEEECCEEEEEECCCCCEEEEECC-----CCC Q ss_conf 885899974798699998878829999944998731164790-79839984188289975267703677237-----776 Q T0558 22 PQHLLVGGSGWNKIAIINKDTKEIVWEYPLEKGWECNSVAAT-KAGEILFSYSKGAKMITRDGRELWNIAAP-----AGC 95 (294) Q Consensus 22 ~~~~l~~gs~~~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~-pdG~~l~s~~~~v~~~~~~~~~~~~~~~~-----~~~ 95 (294) .+.++.+....++|+.++.++++.. .+.... ...+++. +||+++++....+.+++...+.+...... ... T Consensus 11 ~g~ly~~Di~~~~i~~~~~~~~~~~-~~~~~~---~~G~~~~~~~g~l~v~~~~~~~~~d~~~g~~~~l~~~~~~~~~~~ 86 (246) T PF08450_consen 11 DGNLYWVDIPNGRIYRVDPDGGKAT-VFDLPG---PNGAAFDTPDGRLYVADSDGIAILDPDTGEVETLADIPDGGAPFN 86 (246) T ss_dssp GTEEEEEECCCTEEEEEESTT-EEE-EEESSS---BSEEEEECCTCEEEEEE----EEEECE---EEEEECCSTTSSB-S T ss_pred CCEEEEEECCCCEEEEEECCCCEEE-EEECCC---CCEEEEECCCCEEEEEECCCEEEEECCCCCEEEEEECCCCCCCCC T ss_conf 9989999998698999989999189-995899---858988867998999965855999579993899764257877756 Q ss_pred CEEEEEECCCCCEEEEEECCC--------CEEEEECCCCCEEEEEECCCCCCCCCCCCCEEEECCCCCEEE-EEECCCEE Q ss_conf 314878737875899970589--------799998568958899970677677667400899969989999-97469889 Q T0558 96 EMQTARILPDGNALVAWCGHP--------STILEVNMKGEVLSKTEFETGIERPHAQFRQINKNKKGNYLV-PLFATSEV 166 (294) Q Consensus 96 ~v~~~~~~~dg~~l~~~s~~~--------~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~s~dG~~i~-~g~~d~~i 166 (294) .+..+.+.|+|+..++..... .++.++.. |+....... 
....+.++++|||+.|+ +-+..+.| T Consensus 87 ~~ND~~vd~~G~l~~t~~~~~~~~~~~~g~l~~~~~~-~~~~~~~~~-------~~~pNGi~~s~dg~~lyv~ds~~~~I 158 (246) T PF08450_consen 87 RPNDGAVDPDGNLYFTDMGDSGVGPSDPGALYRIDPD-GKVTVVIDG-------LSIPNGIAFSPDGKTLYVSDSFTGRI 158 (246) T ss_dssp BEEEEEE-----EEEEEB---TTTTT--EEEEEE-TT---EEEEEEC-------ESSBE--EEETTSSEEEEEETTTTEE T ss_pred CCCEEEECCCCCEEEECCCCCCCCCCCCCEEEEECCC-CEEEEEECC-------CCCCCCCEECCCCCEEEEEECCCCEE T ss_conf 7763899899999994899774543355279999899-879998536-------41125548978898999996577759 Q ss_pred EEEECC--CCEEE---EE---ECC-CEEEEEEEECCCCEEEECCCCCEEEEEECCCCEEEEEECCC Q ss_conf 999378--85889---85---169-73489887348968997358987999987898599984488 Q T0558 167 REIAPN--GQLLN---SV---KLS-GTPFSSAFLDNGDCLVACGDAHCFVQLNLESNRIVRRVNAN 223 (294) Q Consensus 167 ~~~d~~--g~~~~---~~---~~~-~~~~~~~~~~~g~~~v~~~~~~~i~~~d~~~g~~~~~~~~~ 223 (294) +.++.+ +..+- .+ ... ..|-...+..+|+++++...+++|.+++.. |+++..+... T Consensus 159 ~~~d~~~~~~~~~~~~~~~~~~~~~~~pDG~~vD~~G~l~va~~~~~~V~~~~p~-G~~~~~i~~p 223 (246) T PF08450_consen 159 YRYDLDADGGELSNKRVFADFPGGDGYPDGLAVDADGNLWVADWGGGRVQRFDPD-GKLLGEIELP 223 (246) T ss_dssp EEEEE-TT---EEEEEEEEE----------EEE-TT--EEEEE----EEEEE------BCEEEE-S T ss_pred EEEEECCCCCEECCCEEEEECCCCCCCCCCCEECCCCCEEEEEECCCEEEEECCC-CCEEEEEECC T ss_conf 9998327984702743899878997478860298999899999559999998999-9699999999 No 8 >PF08450 SGL: SMP-30/Gluconolaconase/LRE-like region; InterPro: IPR013658 This family describes a region that is found in proteins expressed by a variety of eukaryotic and prokaryotic species. These proteins include various enzymes, such as senescence marker protein 30 (SMP-30, Q15493 from SWISSPROT), gluconolactonase (Q01578 from SWISSPROT) and luciferin-regenerating enzyme (LRE, Q86DU5 from SWISSPROT). SMP-30 is known to hydrolyse diisopropyl phosphorofluoridate in the liver, and has been noted as having sequence similarity, in the region described in this family, with PON1 (P52430 from SWISSPROT) and LRE. 
; PDB: 2ghs_A 3e5z_A 3dr2_A 2dg0_J 2dg1_C 2dso_F 2iax_A 2ias_A 2iaw_A 3byc_A .... Probab=98.97 E-value=1.1e-06 Score=62.24 Aligned_cols=200 Identities=16% Similarity=0.228 Sum_probs=129.3 Q ss_pred EEEEEC-CCCEEEEE--CCEEEEEECCCCCEEEEECCCCCCEEEEEEC-CCCCEEEEEECCCCEEEEECCCCCEEEEEEC Q ss_conf 647907-98399841--8828997526770367723777631487873-7875899970589799998568958899970 Q T0558 59 SVAATK-AGEILFSY--SKGAKMITRDGRELWNIAAPAGCEMQTARIL-PDGNALVAWCGHPSTILEVNMKGEVLSKTEF 134 (294) Q Consensus 59 ~~~~~p-dG~~l~s~--~~~v~~~~~~~~~~~~~~~~~~~~v~~~~~~-~dg~~l~~~s~~~~~~~~~~~~G~~~~~~~~ 134 (294) .+.+.| +|.+++++ .+.++.++..++.......+. ..++.+. ++|.++++ ... .+.+++..+|+....... T Consensus 4 gp~~d~~~g~ly~~Di~~~~i~~~~~~~~~~~~~~~~~---~~G~~~~~~~g~l~v~-~~~-~~~~~d~~~g~~~~l~~~ 78 (246) T PF08450_consen 4 GPVWDPEDGNLYWVDIPNGRIYRVDPDGGKATVFDLPG---PNGAAFDTPDGRLYVA-DSD-GIAILDPDTGEVETLADI 78 (246) T ss_dssp -EEEEGGGTEEEEEECCCTEEEEEESTT-EEEEEESSS---BSEEEEECCTCEEEEE-E-----EEEECE---EEEEECC T ss_pred CCEEECCCCEEEEEECCCCEEEEEECCCCEEEEEECCC---CCEEEEECCCCEEEEE-ECC-CEEEEECCCCCEEEEEEC T ss_conf 31898999989999998698999989999189995899---8589888679989999-658-559995799938997642 Q ss_pred CCCCCCCCCCCCEEEECCCCCEEEEEECC--------CEEEEEECCCCEEEEEECCCEEEEEEEECCCCE-EEECCCCCE Q ss_conf 67767766740089996998999997469--------889999378858898516973489887348968-997358987 Q T0558 135 ETGIERPHAQFRQINKNKKGNYLVPLFAT--------SEVREIAPNGQLLNSVKLSGTPFSSAFLDNGDC-LVACGDAHC 205 (294) Q Consensus 135 ~~~~~~~~~~~~~~~~s~dG~~i~~g~~d--------~~i~~~d~~g~~~~~~~~~~~~~~~~~~~~g~~-~v~~~~~~~ 205 (294) .. ........+.+.+.|+|++.++.... +.++.++.+++.......-..++.+.++++|+. 
+++.+..++ T Consensus 79 ~~-~~~~~~~~ND~~vd~~G~l~~t~~~~~~~~~~~~g~l~~~~~~~~~~~~~~~~~~pNGi~~s~dg~~lyv~ds~~~~ 157 (246) T PF08450_consen 79 PD-GGAPFNRPNDGAVDPDGNLYFTDMGDSGVGPSDPGALYRIDPDGKVTVVIDGLSIPNGIAFSPDGKTLYVSDSFTGR 157 (246) T ss_dssp ST-TSSB-SBEEEEEE-----EEEEEB---TTTTT--EEEEEE-TT--EEEEEECESSBE--EEETTSSEEEEEETTTTE T ss_pred CC-CCCCCCCCCEEEECCCCCEEEECCCCCCCCCCCCCEEEEECCCCEEEEEECCCCCCCCCEECCCCCEEEEEECCCCE T ss_conf 57-87775677638998999999948997745433552799998998799985364112554897889899999657775 Q ss_pred EEEEECC--CCEEEEE---ECCCCCCCEEECCCCCEEECCCCCEEEEECCCCEEECCCCCCCEEEEECCCCCEEEEEE Q ss_conf 9999878--9859998---44887641141134534893899899980467714302477756999908998999983 Q T0558 206 FVQLNLE--SNRIVRR---VNANDIEGVQLFFVAQLFPLQNGGLYICNWQGHDREAGKGKHPQLVEIDSEGKVVWQLN 278 (294) Q Consensus 206 i~~~d~~--~g~~~~~---~~~~~~~~~~~~~~~~~~~~~~G~i~i~~~~~~~~~~~~~~~~~~~~i~~~G~~vW~~~ 278 (294) |+.++.. ++++... ....... ..+-++...++|+++++.+. ..++..++++|+++=.+. T Consensus 158 I~~~d~~~~~~~~~~~~~~~~~~~~~----~~pDG~~vD~~G~l~va~~~----------~~~V~~~~p~G~~~~~i~ 221 (246) T PF08450_consen 158 IYRYDLDADGGELSNKRVFADFPGGD----GYPDGLAVDADGNLWVADWG----------GGRVQRFDPDGKLLGEIE 221 (246) T ss_dssp EEEEEE-TT---EEEEEEEEE--------------EEE-TT--EEEEE--------------EEEEE-----BCEEEE T ss_pred EEEEEECCCCCEECCCEEEEECCCCC----CCCCCCEECCCCCEEEEEEC----------CCEEEEECCCCCEEEEEE T ss_conf 99998327984702743899878997----47886029899989999955----------999999899996999999 No 9 >PF08662 eIF2A: Eukaryotic translation initiation factor eIF2A; InterPro: IPR013979 This entry contains eukaryotic translation initiation factors. 
Probab=98.93 E-value=4.1e-07 Score=65.32 Aligned_cols=136 Identities=10% Similarity=0.049 Sum_probs=84.8 Q ss_pred CCEEEEEECCCCCEEEEECCCCCCEEEEEECCCCCEEEEEEC--CCCEEEEECCCCCEEEEEECCCCCCCCCCCCCEEEE Q ss_conf 882899752677036772377763148787378758999705--897999985689588999706776776674008999 Q T0558 73 SKGAKMITRDGRELWNIAAPAGCEMQTARILPDGNALVAWCG--HPSTILEVNMKGEVLSKTEFETGIERPHAQFRQINK 150 (294) Q Consensus 73 ~~~v~~~~~~~~~~~~~~~~~~~~v~~~~~~~dg~~l~~~s~--~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~ 150 (294) ...++.++..+.............+.+++|+|+|+.+++..+ ...+.+++. +++.+..+.. ...+.+.+ T Consensus 38 ~~~l~~~~~~~~~~~~v~l~~~g~V~~~~WsP~~~~Favi~g~~p~~i~lyd~-~~~~v~~~~~--------~~~N~i~w 108 (194) T PF08662_consen 38 ELELFRLNEKGIPVEVVELKKEGPVHDFAWSPNGDEFAVISGNMPAKITLYDV-KGKKVFSFGS--------GPRNTIFW 108 (194) T ss_pred CEEEEEEECCCCCCEEEEECCCCCEEEEEECCCCCEEEEEEECCCCEEEEEEC-CCCEEEECCC--------CCCCEEEE T ss_conf 36999996799853256605797334789887998899999168752999959-8728886588--------87227999 Q ss_pred CCCCCEEEEEEC---CCEEEEEECCCCEEEEEECCCEEEEEEEECCCCEEEECCC------CCEEEEEECCCCEEEE Q ss_conf 699899999746---9889999378858898516973489887348968997358------9879999878985999 Q T0558 151 NKKGNYLVPLFA---TSEVREIAPNGQLLNSVKLSGTPFSSAFLDNGDCLVACGD------AHCFVQLNLESNRIVR 218 (294) Q Consensus 151 s~dG~~i~~g~~---d~~i~~~d~~g~~~~~~~~~~~~~~~~~~~~g~~~v~~~~------~~~i~~~d~~~g~~~~ 218 (294) +|+|++++.++. .|.+.+||.+....-....+.....+..+|+|+++++.+. ++.+.+|+. .|+++. T Consensus 109 sP~G~~lv~ag~gn~~G~l~fwd~~~~~~i~~~~~~~~~~~~WsP~Gr~~~ta~t~~r~~~dng~~i~~~-~G~~l~ 184 (194) T PF08662_consen 109 SPNGRFLVLAGFGNLSGDLEFWDVRKMKKIATFEHPCSTDVEWSPDGRYFATATTSPRLRVDNGFKIWSF-QGRLLY 184 (194) T ss_pred CCCCCEEEEEECCCCCEEEEEEECCCCEEEEECCCCCEEEEEECCCCCEEEEEEECCCEECCCEEEEEEE-CCEEEE T ss_conf 8999999996726786089999767627874035775027899999899999983353324762999997-792967 No 10 >PF08662 eIF2A: Eukaryotic translation initiation factor eIF2A; InterPro: IPR013979 This entry contains eukaryotic translation initiation factors. 
Probab=98.89 E-value=2.5e-07 Score=66.84 Aligned_cols=137 Identities=15% Similarity=0.089 Sum_probs=73.2 Q ss_pred CCEEEEEECCCCEEEEEEECCCCCCCCEEEEECCCCEEEE--E--CCEEEEEECCCCCEEEEECCCCCCEEEEEECCCCC Q ss_conf 9869999887882999994499873116479079839984--1--88289975267703677237776314878737875 Q T0558 32 WNKIAIINKDTKEIVWEYPLEKGWECNSVAATKAGEILFS--Y--SKGAKMITRDGRELWNIAAPAGCEMQTARILPDGN 107 (294) Q Consensus 32 ~~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~pdG~~l~s--~--~~~v~~~~~~~~~~~~~~~~~~~~v~~~~~~~dg~ 107 (294) +..|+.++..... +-..++.....+++++++|+|+.++. + ...+.+|+..+..+..+.. .....+.|+|+|+ T Consensus 38 ~~~l~~~~~~~~~-~~~v~l~~~g~V~~~~WsP~~~~Favi~g~~p~~i~lyd~~~~~v~~~~~---~~~N~i~wsP~G~ 113 (194) T PF08662_consen 38 ELELFRLNEKGIP-VEVVELKKEGPVHDFAWSPNGDEFAVISGNMPAKITLYDVKGKKVFSFGS---GPRNTIFWSPNGR 113 (194) T ss_pred CEEEEEEECCCCC-CEEEEECCCCCEEEEEECCCCCEEEEEEECCCCEEEEEECCCCEEEECCC---CCCCEEEECCCCC T ss_conf 3699999679985-32566057973347898879988999991687529999598728886588---8722799989999 Q ss_pred EEEEEECC---CCEEEEECCCCCEEEEEECCCCCCCCCCCCCEEEECCCCCEEEEEEC------CCEEEEEECCCCEEEE Q ss_conf 89997058---97999985689588999706776776674008999699899999746------9889999378858898 Q T0558 108 ALVAWCGH---PSTILEVNMKGEVLSKTEFETGIERPHAQFRQINKNKKGNYLVPLFA------TSEVREIAPNGQLLNS 178 (294) Q Consensus 108 ~l~~~s~~---~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~s~dG~~i~~g~~------d~~i~~~d~~g~~~~~ 178 (294) ++++++.. +.+.+|+..+.+.+.+.. +.....+.++|||+++++... |+.+++|+.+|+++.. T Consensus 114 ~lv~ag~gn~~G~l~fwd~~~~~~i~~~~--------~~~~~~~~WsP~Gr~~~ta~t~~r~~~dng~~i~~~~G~~l~~ 185 (194) T PF08662_consen 114 FLVLAGFGNLSGDLEFWDVRKMKKIATFE--------HPCSTDVEWSPDGRYFATATTSPRLRVDNGFKIWSFQGRLLYE 185 (194) T ss_pred EEEEEECCCCCEEEEEEECCCCEEEEECC--------CCCEEEEEECCCCCEEEEEEECCCEECCCEEEEEEECCEEEEE T ss_conf 99996726786089999767627874035--------7750278999998999999833533247629999977929673 Q ss_pred EE Q ss_conf 51 Q T0558 179 VK 180 (294) Q Consensus 179 ~~ 180 (294) .. 
T Consensus 186 ~~ 187 (194) T PF08662_consen 186 EK 187 (194) T ss_pred CC T ss_conf 22 No 11 >PF05935 Arylsulfotrans: Arylsulfotransferase (ASST); InterPro: IPR010262 This family consists of several bacterial arylsulphotransferase proteins. Arylsulphotransferase (ASST) transfers a sulphate group from phenolic sulphate esters to a phenolic acceptor substrate .; PDB: 3ett_B 3elq_B 3ets_A. Probab=98.53 E-value=2.2e-06 Score=60.04 Aligned_cols=211 Identities=14% Similarity=0.222 Sum_probs=131.6 Q ss_pred ECCCCEEEEE-----CCEEEEEECCCCCEEEEECCCCCCEEEEEECCCCCEEEEEECCCCEEEEECCCCCEEEEEECCCC Q ss_conf 0798399841-----88289975267703677237776314878737875899970589799998568958899970677 Q T0558 63 TKAGEILFSY-----SKGAKMITRDGRELWNIAAPAGCEMQTARILPDGNALVAWCGHPSTILEVNMKGEVLSKTEFETG 137 (294) Q Consensus 63 ~pdG~~l~s~-----~~~v~~~~~~~~~~~~~~~~~~~~v~~~~~~~dg~~l~~~s~~~~~~~~~~~~G~~~~~~~~~~~ 137 (294) ..+|-+++.. ....+++|.+|...|.......... .+...+||.++.... ..+...+..|+++|++.+..+ T Consensus 112 ~~~gLy~~~~~~~~~~~~~~i~D~~G~vrW~~~~~~~~~~-~~~~~~nG~l~~~~~---~~~~e~D~~G~vi~~~~l~~~ 187 (471) T PF05935_consen 112 MEDGLYFVNPNDWDYQPGPYIYDNNGNVRWYLPSDSGRDN-RFKRLDNGHLLFGSG---GRYYEYDWLGKVIWQYDLPNG 187 (471) T ss_dssp -TT-EEEEEETT---EEEEEEEE----EEEEE-GGGT-----EEE-----EEEE------EEEEE-----EEEEEE---- T ss_pred CCCCEEEEECCCCCCCCEEEEECCCCCEEEEECCCCCCCC-EEEECCCCCEEEEEC---CEEEEECCCCCEEEEEECCCC T ss_conf 5786899967777777606999599869999527778761-478738986999978---848998889978999987887 Q ss_pred C-CCCCCCCCEEEECCCCCEEEEEEC-------------CCEEEEEECCCCEEEEEECC-------C------------- Q ss_conf 6-776674008999699899999746-------------98899993788588985169-------7------------- Q T0558 138 I-ERPHAQFRQINKNKKGNYLVPLFA-------------TSEVREIAPNGQLLNSVKLS-------G------------- 183 (294) Q Consensus 138 ~-~~~~~~~~~~~~s~dG~~i~~g~~-------------d~~i~~~d~~g~~~~~~~~~-------~------------- 183 (294) . ..|+ .+...|+|++|+.+.. .-.|...|.+|+.+|.+... . 
T Consensus 188 ~~~~HH----d~~~~~nGn~Li~~~~~~~~~~~~~~~~~~D~iiEid~tG~vv~~W~~~dhld~~~~~~~~~~~~~~~~~ 263 (471) T PF05935_consen 188 YYDFHH----DFQELPNGNILILAYERRYADEGKDGWTVEDVIIEIDETGEVVWEWDASDHLDPYRDTNLKDLNDPFGDN 263 (471) T ss_dssp -----S-----EEE-----EEEEE--TTEE----EE---S-EEEEE-----EEEEEEGGGTS-TT--S---B-T------ T ss_pred CCCCCE----EEEECCCCCEEEEEEECCCCCCCCCCCEEECEEEEECCCCCEEEEECHHHCCCCCCCCHHHCCCCCCCCC T ss_conf 776430----1189699979999973111246778868826899998999699998766768800065000255566756 Q ss_pred ----------EEEEEEEEC-CCCEEEECCCCCEEEEEECCCCEEEEEECCCCCCCEE---------------ECCCC--C Q ss_conf ----------348988734-8968997358987999987898599984488764114---------------11345--3 Q T0558 184 ----------TPFSSAFLD-NGDCLVACGDAHCFVQLNLESNRIVRRVNANDIEGVQ---------------LFFVA--Q 235 (294) Q Consensus 184 ----------~~~~~~~~~-~g~~~v~~~~~~~i~~~d~~~g~~~~~~~~~~~~~~~---------------~~~~~--~ 235 (294) -..++...+ +|.++++.-.-+.|..+|.++|+++|.+.++...... ..+-. . T Consensus 264 ~~~~~~~Dw~HiNsv~yd~~d~~iliS~R~~s~V~~Id~~tg~I~W~lG~~~~~~~~~~~~~l~p~~~~~~~~~~~~QH~ 343 (471) T PF05935_consen 264 PGSGGGWDWFHINSVDYDPDDDSILISSRHQSTVIKIDYRTGEIKWILGGKGGWSKDYQDYLLTPVDGDGDFDWFWGQHD 343 (471) T ss_dssp --------S--EEEEEEETTTTEEEEEETT----EEE----S---EE-S------TTTGGGB--BB-SSSS----SS-EE T ss_pred CCCCCCCCCCCCCCCEECCCCCCEEEECCCCEEEEEEECCCCCEEEEECCCCCCCCCCHHHCCCCCCCCCCCCEEECCCC T ss_conf 68899888737356077789993999767755899995699868999479877673101200343566877650003542 Q ss_pred EEECCCC---CEEEEECCC---CEEEC---CC--CCCCEEEEECCC-CC--EEEEEECCC Q ss_conf 4893899---899980467---71430---24--777569999089-98--999983588 Q T0558 236 LFPLQNG---GLYICNWQG---HDREA---GK--GKHPQLVEIDSE-GK--VVWQLNDKV 281 (294) Q Consensus 236 ~~~~~~G---~i~i~~~~~---~~~~~---~~--~~~~~~~~i~~~-G~--~vW~~~~~~ 281 (294) ..+.++| ++++.+=.. ..... .. .+..-.+.+|.+ +. .+|++.... 
T Consensus 344 a~~~~~~~~~~i~~FDNg~~~~~~~~~~~~~~~~~Sr~~~y~ID~~~~Tv~~v~~y~~~~ 403 (471) T PF05935_consen 344 ARFIPDGPQGNILVFDNGNGRGYSQPNLVWMKDNYSRGVEYKIDENNMTVEQVWEYGKPR 403 (471) T ss_dssp EEE--------EEEEE-----TTS--SSCCG-----EEEEEEE-S----EEEEEEE---- T ss_pred CEECCCCCEEEEEEEECCCCCCCCCCCCCCCCCCCCEEEEEEECCCCCEEEEEEEECCCC T ss_conf 078189972799999589866667766556777534369999858998699999975898 No 12 >PF07433 DUF1513: Protein of unknown function (DUF1513); InterPro: IPR008311 There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. Probab=98.37 E-value=9.5e-05 Score=48.07 Aligned_cols=192 Identities=19% Similarity=0.216 Sum_probs=113.1 Q ss_pred CCCCCCCCEEEEE-----CCCCEEEEEECC-CCEEEEEEECCCCCCCCEEEEECCCCEEEEEC----------------- Q ss_conf 0147888589997-----479869999887-88299999449987311647907983998418----------------- Q T0558 17 AQGSSPQHLLVGG-----SGWNKIAIINKD-TKEIVWEYPLEKGWECNSVAATKAGEILFSYS----------------- 73 (294) Q Consensus 17 ~~~s~~~~~l~~g-----s~~~~i~~~d~~-tg~~~w~~~~~~~~~~~~~~~~pdG~~l~s~~----------------- 73 (294) ..+|++.++|++- ++.+.|-+||.. +-+.+-||+ .++..++.+.+.|||..|+... T Consensus 56 g~fs~DG~~LytTEnd~~~g~G~IGV~d~~~~~~ri~Ef~-s~GIGPHel~~~pdg~tLvVANGGI~Thpd~GR~kLNLd 134 (305) T PF07433_consen 56 GVFSPDGRLLYTTENDYETGRGVIGVYDAADGYRRIGEFP-SGGIGPHELLLMPDGETLVVANGGIETHPDSGRAKLNLD 134 (305) T ss_pred EEECCCCCEEEECCCCCCCCCEEEEEEECCCCCEEEEEEC-CCCCCHHHEEECCCCCEEEEECCCCCCCCCCCCCCCCHH T ss_conf 7684998989860556678956999998767928987753-899583538986999989997589816887686145832 Q ss_pred ---CEEEEEE-CCCCCEEEEEC---CCCCCEEEEEECCCCCEEEEEECCCC------EEEEECCCCCEEEEEECCCCCC- Q ss_conf ---8289975-26770367723---77763148787378758999705897------9999856895889997067767- Q T0558 74 ---KGAKMIT-RDGRELWNIAA---PAGCEMQTARILPDGNALVAWCGHPS------TILEVNMKGEVLSKTEFETGIE- 139 (294) Q Consensus 74 ---~~v~~~~-~~~~~~~~~~~---~~~~~v~~~~~~~dg~~l~~~s~~~~------~~~~~~~~G~~~~~~~~~~~~~- 139 (294) ..+...| .+|..+-+... 
-+...+--+++.++|..++.. +..+ -.+.....|+.+..+....... T Consensus 135 tM~psL~~ld~~~G~ll~q~~L~~~~~~lSiRHLav~~~G~v~~a~-Q~qG~~~~~~PLla~~~~g~~~~~~~~p~~~~~ 213 (305) T PF07433_consen 135 TMQPSLVYLDARSGALLEQWELPPDLHQLSIRHLAVDGDGTVWFAM-QYQGDPGDAPPLLALHRRGEALQLLPAPEEQWR 213 (305) T ss_pred HCCCCEEEEECCCCCEEEEECCCHHHHHCCEEEEEECCCCCEEEEE-EECCCCCCCCCEEEEECCCCCCEECCCCHHHHH T ss_conf 3486158984278752213206834622115678774998389988-613886667875899618986312379867988 Q ss_pred CCCCCCCEEEECCCCCEEEEEE-CCCEEEEEECC-CCEEEEEECCCEEEEEEEECCCCEEEECCCCCEEEEEECCC Q ss_conf 7667400899969989999974-69889999378-85889851697348988734896899735898799998789 Q T0558 140 RPHAQFRQINKNKKGNYLVPLF-ATSEVREIAPN-GQLLNSVKLSGTPFSSAFLDNGDCLVACGDAHCFVQLNLES 213 (294) Q Consensus 140 ~~~~~~~~~~~s~dG~~i~~g~-~d~~i~~~d~~-g~~~~~~~~~~~~~~~~~~~~g~~~v~~~~~~~i~~~d~~~ 213 (294) ....-+-++++++||.++++.+ ..+.+.+||.. +..+...... ....++..+++ .+++.+.+ .+....... T Consensus 214 ~l~~Y~GSVA~~~~g~~iavtsPrGg~~~~~d~~tg~~~~~~~l~-D~cGva~~~~g-f~~ssG~G-~~~~~~~~~ 286 (305) T PF07433_consen 214 RLNGYIGSVAASRDGRLIAVTSPRGGRVQVWDAATGRLLGSVPLP-DACGVAALAGG-FLASSGQG-RLIRLSPPD 286 (305) T ss_pred HHCCCEEEEEECCCCCEEEEECCCCCEEEEEECCCCCEEECCCCC-CEEEEEECCCC-EEEECCCC-CEEECCCCC T ss_conf 747927999986999999998898988999999988876242725-25788676997-69967998-568647654 No 13 >PF08553 VID27: VID27 cytoplasmic protein; InterPro: IPR013863 This entry represents fungal and plant proteins and contains many hypothetical proteins. Vid27p is a cytoplasmic protein of unknown function, possibly regulates import of fructose-1,6-bisphosphatase into Vacuolar Import and Degradation (Vid) vesicles and is not essential for proteasome-dependent degradation of fructose-1,6-bisphosphatase (FBPase) , . 
Probab=98.34 E-value=2.3e-05 Score=52.63 Aligned_cols=140 Identities=8% Similarity=0.052 Sum_probs=85.2 Q ss_pred CEEEEE---CCEEEEEECCCCCEEE-EECCCCCCEEEEEECC-----CCCEEEEEECCCCEEEEECCCC--CEEEEEECC Q ss_conf 399841---8828997526770367-7237776314878737-----8758999705897999985689--588999706 Q T0558 67 EILFSY---SKGAKMITRDGRELWN-IAAPAGCEMQTARILP-----DGNALVAWCGHPSTILEVNMKG--EVLSKTEFE 135 (294) Q Consensus 67 ~~l~s~---~~~v~~~~~~~~~~~~-~~~~~~~~v~~~~~~~-----dg~~l~~~s~~~~~~~~~~~~G--~~~~~~~~~ 135 (294) ++|... ...++..|...+.+.. +.......+..+.... ...-.+.+-.++.++.||.+-- ++++.. .. T Consensus 488 ~Mll~~~~~~~~ly~mDLe~GKIV~eW~~~~d~~v~~~~~~sK~aqlt~e~tflGls~n~lfriDpRl~~~klv~~~-~k 566 (788) T PF08553_consen 488 NMLLLSPENPNKLYQMDLERGKIVEEWKVEDDIPVVDIAPDSKFAQLTSEQTFLGLSDNSLFRIDPRLSGNKLVQSQ-SK 566 (788) T ss_pred CEEEECCCCCCCEEEEECCCCCEEEEEECCCCCCCEEECCCCCCCCCCCCCEEEEECCCCEEEECCCCCCCCEEECC-CC T ss_conf 26974698878338875478868888525887660374256553436777518998778349874566887145113-42 Q ss_pred CCCCCCCCCCCEEEECCCCCEEEEEECCCEEEEEECCCCEEEEEE--CCCEEEEEEEECCCCEEEECCCCCEEEEEEC Q ss_conf 776776674008999699899999746988999937885889851--6973489887348968997358987999987 Q T0558 136 TGIERPHAQFRQINKNKKGNYLVPLFATSEVREIAPNGQLLNSVK--LSGTPFSSAFLDNGDCLVACGDAHCFVQLNL 211 (294) Q Consensus 136 ~~~~~~~~~~~~~~~s~dG~~i~~g~~d~~i~~~d~~g~~~~~~~--~~~~~~~~~~~~~g~~~v~~~~~~~i~~~d~ 211 (294) .. .......+++-+.+| +||+|+.+|.|++||.-|+...+.. .+.++..+.++.||+++++++.. ++.+++. 
T Consensus 567 ~Y--~~~~~Fs~~aTT~~G-~iavgS~~G~IRLyd~~gk~AKT~lp~lG~PIi~iDVT~DGkWiLaTc~t-yLlLi~t 640 (788) T PF08553_consen 567 QY--ASKNNFSCAATTEDG-YIAVGSNKGDIRLYDRLGKRAKTALPGLGDPIIGIDVTADGKWILATCDT-YLLLIDT 640 (788) T ss_pred CC--CCCCCCEEEEECCCC-EEEEECCCCCEEECCCCCCHHHHCCCCCCCCEEEEEECCCCCEEEEEECC-EEEEEEE T ss_conf 24--468870489856996-59996089857860666713322278789973677965788489997053-5999985 No 14 >PF06433 Me-amine-dh_H: Methylamine dehydrogenase heavy chain (MADH); InterPro: IPR009451 Methylamine dehydrogenase (1.4.99.3 from EC) is a periplasmic quinoprotein found in several methyltrophic bacteria . It is induced when grown on methylamine as a carbon source MADH and catalyses the oxidative deamination of amines to their corresponding aldehydes. The redox cofactor of this enzyme is tryptophan tryptophylquinone (TTQ). Electrons derived from the oxidation of methylamine are passed to an electron acceptor, which is usually the blue-copper protein amicyanin (IPR002386 from INTERPRO).RCH2NH2 + H2O + acceptor = RCHO + NH3 + reduced acceptor MADH is a hetero-tetramer, comprised of two heavy subunits and two light subunits. The heavy subunit forms a seven-bladed beta-propeller like structure .; GO: 0030058 amine dehydrogenase activity, 0030416 methylamine metabolic process, 0042597 periplasmic space; PDB: 3c75_H 1mg2_A 2j57_J 2mta_H 2gc4_I 2gc7_I 2j55_J 1mg3_I 2bbk_H 2j56_J .... Probab=98.31 E-value=0.00013 Score=47.16 Aligned_cols=138 Identities=12% Similarity=0.095 Sum_probs=54.0 Q ss_pred CCEEEEEECCCCEEEEEEECCCCC------CCCEEEEECCCCEEEEE----CCEEEEEECCCCCEEEEECCCCCCEEEEE Q ss_conf 986999988788299999449987------31164790798399841----88289975267703677237776314878 Q T0558 32 WNKIAIINKDTKEIVWEYPLEKGW------ECNSVAATKAGEILFSY----SKGAKMITRDGRELWNIAAPAGCEMQTAR 101 (294) Q Consensus 32 ~~~i~~~d~~tg~~~w~~~~~~~~------~~~~~~~~pdG~~l~s~----~~~v~~~~~~~~~~~~~~~~~~~~v~~~~ 101 (294) ..-|.+||.+|-...+|..+.... .....++++||++++.. ...|-+.|...+....... 
...++.+- T Consensus 66 tDvV~v~D~~tL~p~~EI~iP~k~R~~~~~~~~~~~ls~d~k~l~v~N~TPa~SVtVVDl~~~k~v~eid--~PGC~~iy 143 (342) T PF06433_consen 66 TDVVTVYDTQTLSPTAEIVIPPKPRFQAGPYKGMFALSADGKFLLVFNFTPAQSVTVVDLAAKKFVREID--TPGCALIY 143 (342) T ss_dssp EEEEEEEETTTTEEEEEEEETTT-B------GGGEEE-TTSSEEEE-BESSSEEE-EEES---EEEEEEE--G-SEEEEE T ss_pred EEEEEEEECCCCCCCCCEECCCCCCEEECCCCCCEEECCCCEEEEEECCCCCCEEEEEECCCCCEEEEEC--CCCEEEEE T ss_conf 2689997066676223376698763351234652365559839999836887656899854462144505--79879996 Q ss_pred ECCCCCEEEEEECCCCEEEE-ECCCCCEEEEEECCCCCCCCCCCCCEEEECCCCCEEEEEECCCEEEEEECCC Q ss_conf 73787589997058979999-8568958899970677677667400899969989999974698899993788 Q T0558 102 ILPDGNALVAWCGHPSTILE-VNMKGEVLSKTEFETGIERPHAQFRQINKNKKGNYLVPLFATSEVREIAPNG 173 (294) Q Consensus 102 ~~~dg~~l~~~s~~~~~~~~-~~~~G~~~~~~~~~~~~~~~~~~~~~~~~s~dG~~i~~g~~d~~i~~~d~~g 173 (294) +.++.. +..-|+|+..... .+..|+...... ..........+..-+++.....++.-+..|.|+..+..+ T Consensus 144 P~~~~~-F~~lC~DGsl~~v~lD~~G~~~~~~t-~~F~~~~dplf~~~a~~~~~~~~~F~sy~G~v~~~~l~~ 214 (342) T PF06433_consen 144 PTGNRG-FSMLCGDGSLLSVTLDDDGKETQRRT-EVFFPDDDPLFEHPAYSRKTGRLVFPSYEGNVYQADLSG 214 (342) T ss_dssp EEETTE-EEEEE----EEEEE-------EEEE----SSTTTS-B-S--EE-STEEEEEEEB----EEEEE--- T ss_pred ECCCCC-EEEEECCCCEEEEEECCCCCEEEECC-CCCCCCCCCCCCCCCEECCCCEEEEEECCCEEEEEECCC T ss_conf 669985-17885698238999789997876215-776667742123553277898599984377799873168 No 15 >PF04762 IKI3: IKI3 family; InterPro: IPR006849 Members of this family are components of the elongator multi-subunit component of a novel RNA polymerase II holoenzyme for transcriptional elongation . 
Probab=98.19 E-value=0.00023 Score=45.30 Aligned_cols=155 Identities=15% Similarity=0.131 Sum_probs=67.7 Q ss_pred CCCCEEEEECCCCEE--EEECCEEEEE----ECCCCCEEEEECCCCCCEEEEEECCCCCEEEEEECCCCEEEEECCCCCE Q ss_conf 731164790798399--8418828997----5267703677237776314878737875899970589799998568958 Q T0558 55 WECNSVAATKAGEIL--FSYSKGAKMI----TRDGRELWNIAAPAGCEMQTARILPDGNALVAWCGHPSTILEVNMKGEV 128 (294) Q Consensus 55 ~~~~~~~~~pdG~~l--~s~~~~v~~~----~~~~~~~~~~~~~~~~~v~~~~~~~dg~~l~~~s~~~~~~~~~~~~G~~ 128 (294) ..+.++.+-+|...+ +..++.+.++ +.....+.-. ..-...+.+++|+||...++...+..++++- +++-+. T Consensus 76 ~~ivs~~~l~d~~~~c~~~~~Gdil~~~~~~~~~~~~~Eiv-g~~~~gi~a~~Wspd~e~l~~~t~~~~~l~M-t~~fe~ 153 (928) T PF04762_consen 76 DRIVSLEHLADSNQLCLVLAGGDILLVREDPDPDTDEIEIV-GSVDSGITAAAWSPDEELLALVTGENTLLLM-TRDFEP 153 (928) T ss_pred CEEEEEEEECCCCCEEEEECCCEEEEEECCCCCCCCCEEEE-EEECCCEEEEEECCCCCEEEEEECCCEEEEE-ECCCEE T ss_conf 70899882147772799987960999972699887625997-7884756898877985479999679769999-245408 Q ss_pred EEEEE------------------------CCCCCC----------------CCCCCCCEEEECCCCCEEEEEEC----C- Q ss_conf 89997------------------------067767----------------76674008999699899999746----9- Q T0558 129 LSKTE------------------------FETGIE----------------RPHAQFRQINKNKKGNYLVPLFA----T- 163 (294) Q Consensus 129 ~~~~~------------------------~~~~~~----------------~~~~~~~~~~~s~dG~~i~~g~~----d- 163 (294) +.+.. ...+.. ...+.-..+.+-.||.|+|+.+. 
+ T Consensus 154 i~E~~l~~dd~~~~~~VsVGWGkkETQF~G~~gK~a~lr~pt~~~~~~~~ls~Dd~~~~ISWRGDG~yfAVs~~~~~~~~ 233 (928) T PF04762_consen 154 IAEQPLDSDDFGESKHVSVGWGKKETQFHGSGGKAAALRDPTVPKVDEGKLSWDDGRVSISWRGDGAYFAVSSVEPESGQ 233 (928) T ss_pred EEEEECCCCCCCCCCEEECCCCCCCEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEECCCCCEEEEEEEECCCCC T ss_conf 77752370104767603447886000615754432212488876556576545789637998688878999988716887 Q ss_pred -CEEEEEECCCCEEEEEE-CCCEEEEEEEECCCCEEEECC---CCCEEEEEEC Q ss_conf -88999937885889851-697348988734896899735---8987999987 Q T0558 164 -SEVREIAPNGQLLNSVK-LSGTPFSSAFLDNGDCLVACG---DAHCFVQLNL 211 (294) Q Consensus 164 -~~i~~~d~~g~~~~~~~-~~~~~~~~~~~~~g~~~v~~~---~~~~i~~~d~ 211 (294) ..+++|+++|...-... ..+-......-|.|+.+++.- +...|.++.. T Consensus 234 ~R~iRVy~ReG~L~s~SE~v~gLe~~LsWrPsGnLIAs~Qr~~~~~dVVFFER 286 (928) T PF04762_consen 234 RRVIRVYSREGALDSTSEPVDGLEHALSWRPSGNLIASIQRKPDRHDVVFFER 286 (928) T ss_pred EEEEEEECCCCCEEECCCCCCCCCCCCCCCCCCCEEEEEEECCCCCEEEEEEC T ss_conf 13799988887377524666676567648888878999996599845999915 No 16 >PF04053 Coatomer_WDAD: Coatomer WD associated region ; InterPro: IPR006692 Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer . While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins . 
For example, the coatomer COP1 (coat protein complex 1) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes. Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerize, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated with ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits. This entry represents the WD-associated region found in coatomer subunits alpha, beta and beta'. The alpha-subunit (RET1P) of the coatomer complex in Saccharomyces cerevisiae (Baker's yeast) participates in membrane transport between the endoplasmic reticulum and Golgi apparatus. The protein contains six WD-40 repeat motifs in its N-terminal region.
More information about these proteins can be found at Protein of the Month: Clathrin .; GO: 0005198 structural molecule activity, 0005515 protein binding, 0008565 protein transporter activity, 0006461 protein complex assembly, 0006886 intracellular protein transport, 0016192 vesicle-mediated transport, 0030117 membrane coat Probab=97.84 E-value=0.00092 Score=40.86 Aligned_cols=200 Identities=16% Similarity=0.077 Sum_probs=110.7 Q ss_pred CCCEEEEECCCCEEE-EECCEEEEEECCCCCEEEEECCCCCCEEEEEECCCCCEEEEEECCCCEEEEECCCCCEEEEEEC Q ss_conf 311647907983998-4188289975267703677237776314878737875899970589799998568958899970 Q T0558 56 ECNSVAATKAGEILF-SYSKGAKMITRDGRELWNIAAPAGCEMQTARILPDGNALVAWCGHPSTILEVNMKGEVLSKTEF 134 (294) Q Consensus 56 ~~~~~~~~pdG~~l~-s~~~~v~~~~~~~~~~~~~~~~~~~~v~~~~~~~dg~~l~~~s~~~~~~~~~~~~G~~~~~~~~ 134 (294) .+..+...|+|+.++ ++++...++....-.- .......+..|.+++.+.+... +..+.+..+.+.+..+.++. T Consensus 34 ~p~~ls~nPngr~v~V~g~gey~iyt~~~~r~-----k~~g~g~~~vw~~~n~yAv~~~-~~~I~i~knf~~~~~k~i~~ 107 (444) T PF04053_consen 34 YPQSLSHNPNGRFVLVCGDGEYIIYTALAWRN-----KAFGSGLSFVWSSRNRYAVLEK-NSTIKIFKNFKEETTKSIKL 107 (444) T ss_pred CCEEEEECCCCCEEEEECCCEEEEEECCCCCC-----CCCCCCEEEEEECCCCEEEEEC-CCEEEEEECCCCCCCEEECC T ss_conf 87148999988889996699899998246665-----5567631799977711899976-97599997578752307858 Q ss_pred CCCCCCCCCCCCEEEECCCCCEEEEEECCCEEEEEECC-CCEEEEEECCCEEEEEEEECCCCEEEECCCCCEEEEEECCC Q ss_conf 67767766740089996998999997469889999378-85889851697348988734896899735898799998789 Q T0558 135 ETGIERPHAQFRQINKNKKGNYLVPLFATSEVREIAPN-GQLLNSVKLSGTPFSSAFLDNGDCLVACGDAHCFVQLNLES 213 (294) Q Consensus 135 ~~~~~~~~~~~~~~~~s~dG~~i~~g~~d~~i~~~d~~-g~~~~~~~~~~~~~~~~~~~~g~~~v~~~~~~~i~~~d~~~ 213 (294) +.. +..+ |. |.+|... .++.|.+||.. ++.++...... 
+..+..+++|++++..+.+ .+++++.+- T Consensus 108 ~~~-------~~~I-f~--G~LL~~~-~~~~i~~yDw~~~~~i~~I~v~~-vk~V~Ws~~g~~Val~~~~-~i~il~~~~ 174 (444) T PF04053_consen 108 PFS-------VEKI-FG--GNLLGVK-SSDFICFYDWEQGKLIRRIDVSP-VKNVIWSDDGELVALLTKD-SIYILKYNL 174 (444) T ss_pred CCC-------CCEE-EC--CEEEEEE-CCCCEEEEEHHHCCEEEEEECCC-CCEEEEECCCCEEEEEECC-EEEEEEECC T ss_conf 977-------1727-73--6199997-79868998847854777996489-8579997886679998777-699998215 Q ss_pred C---------------EEEEEECCCCCCCEEECCCC---------CEEECCCCCEEEEECCCCEEECCC--CCCCEEEEE Q ss_conf 8---------------59998448876411411345---------348938998999804677143024--777569999 Q T0558 214 N---------------RIVRRVNANDIEGVQLFFVA---------QLFPLQNGGLYICNWQGHDREAGK--GKHPQLVEI 267 (294) Q Consensus 214 g---------------~~~~~~~~~~~~~~~~~~~~---------~~~~~~~G~i~i~~~~~~~~~~~~--~~~~~~~~i 267 (294) . +.+..++ .......|.. .+.++.+|...+............ ..+.+++.+ T Consensus 175 ~~~~~~~~~~g~e~~f~~~~E~~---~~IkSg~W~~d~fiYtT~~hLkYlv~G~~giI~~ld~~~Yllgy~~~~~~ly~~ 251 (444) T PF04053_consen 175 EAVEAEDDEEGIEDAFEVVHEIN---ERIKSGAWDGDVFIYTTSNHLKYLVNGESGIIAHLDKPLYLLGYLPKENRLYLL 251 (444) T ss_pred CCCCCCCCCCCCHHHEEEEEEEE---EEEEEEEEECCEEEEECCCEEEEEECCCCEEEEECCCCEEEEEEECCCCEEEEE T ss_conf 33445665557033038887742---005567997679999736327999489322789758637999997579889999 Q ss_pred CCCCCEE-EEE Q ss_conf 0899899-998 Q T0558 268 DSEGKVV-WQL 277 (294) Q Consensus 268 ~~~G~~v-W~~ 277 (294) |++++++ +.+ T Consensus 252 Dr~~~v~s~~l 262 (444) T PF04053_consen 252 DRDGNVISYEL 262 (444) T ss_pred ECCCCEEEEEE T ss_conf 78998899998 No 17 >PF04841 Vps16_N: Vps16, N-terminal region; InterPro: IPR006926 This protein forms part of the Class C vacuolar protein sorting (Vps) complex. Vps16 is essential for vacuolar protein sorting, which is essential for viability in plants, but not yeast . The Class C Vps complex is required for SNARE-mediated membrane fusion at the lysosome-like yeast vacuole. 
It is thought to play essential roles in membrane docking and fusion at the Golgi-to-endosome and endosome-to-vacuole stages of transport . The role of VPS16 in this complex is not known.; GO: 0006886 intracellular protein transport, 0005737 cytoplasm Probab=97.84 E-value=0.00092 Score=40.86 Aligned_cols=250 Identities=10% Similarity=0.016 Sum_probs=129.0 Q ss_pred CCCCCCCEEEEECCCC-EEEEEECCCCEEEEEEECCCCCCCCEEEEECCCCEEE-EECCEEEEEECCCCCEEEEECCCCC Q ss_conf 1478885899974798-6999988788299999449987311647907983998-4188289975267703677237776 Q T0558 18 QGSSPQHLLVGGSGWN-KIAIINKDTKEIVWEYPLEKGWECNSVAATKAGEILF-SYSKGAKMITRDGRELWNIAAPAGC 95 (294) Q Consensus 18 ~~s~~~~~l~~gs~~~-~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~pdG~~l~-s~~~~v~~~~~~~~~~~~~~~~~~~ 95 (294) ....+.+++..++... .|.+++. +|+++++.+..+ ..+..+.|+.+...++ ..++.+++++..|+. ++..+... T Consensus 45 l~rd~~k~~~~~~~~p~~I~Iys~-sG~ll~si~w~~-~~iv~l~wt~~e~LivV~~dG~v~~y~~~G~~--~fsl~~~~ 120 (410) T PF04841_consen 45 LIRDESKLVPLGSARPNSIRIYSS-SGKLLSSIPWDS-GRIVGLGWTDDEELIVVFEDGTVRVYDLFGEF--QFSLGEEA 120 (410) T ss_pred EEECCCCCCCCCCCCCCEEEEECC-CCCEEEEEEECC-CCEEEEEECCCCCEEEEECCCEEEEEECCCCE--EECCCCCC T ss_conf 997676422255788767999978-897968988179-88789998778879999868989998067756--21445220 Q ss_pred C--------EEEEEECCCCCEEEEEECCCCEEEEECCCCC-EEEEEECCCCCCC--CCCCCC--EEEECCCCCEEEEEEC Q ss_conf 3--------1487873787589997058979999856895-8899970677677--667400--8999699899999746 Q T0558 96 E--------MQTARILPDGNALVAWCGHPSTILEVNMKGE-VLSKTEFETGIER--PHAQFR--QINKNKKGNYLVPLFA 162 (294) Q Consensus 96 ~--------v~~~~~~~dg~~l~~~s~~~~~~~~~~~~G~-~~~~~~~~~~~~~--~~~~~~--~~~~s~dG~~i~~g~~ 162 (294) . +....+..+|-.+ -..+..++.....+.. ..+.+...+.... +..... ...++.+....+.... 
T Consensus 121 ~~~~v~~~~i~~~~f~~~Gvvv--Lt~~~~i~~v~~~~~~~~~~~~~~~p~~~~~~~~~~~~~~i~~l~~~~~~~Vll~~ 198 (410) T PF04841_consen 121 EETGVVDCRIFAIWFWDNGVVV--LTSNNRIYVVNNFSEPVRPRNLPEIPTLITKNHWWPSWTEIPLLSVSRSVEVLLAN 198 (410) T ss_pred CCCCCCCCCCCCCEECCCCEEE--EECCCEEEEEECCCCCCCCCCCCCCCCCCCCCCCCCCCCCCEEEECCCCEEEEEEC T ss_conf 0046644431010225776899--92698399992566653212346688754455656566752248267768999811 Q ss_pred CCEEEEEECCCCEEEEEECCCEEEEEEEECCCCEEEECCCCCEEEEEECCCCEEEEEECCC-CCCCEEECCCCC--EEEC Q ss_conf 9889999378858898516973489887348968997358987999987898599984488-764114113453--4893 Q T0558 163 TSEVREIAPNGQLLNSVKLSGTPFSSAFLDNGDCLVACGDAHCFVQLNLESNRIVRRVNAN-DIEGVQLFFVAQ--LFPL 239 (294) Q Consensus 163 d~~i~~~d~~g~~~~~~~~~~~~~~~~~~~~g~~~v~~~~~~~i~~~d~~~g~~~~~~~~~-~~~~~~~~~~~~--~~~~ 239 (294) +..++..+.... ......+++....++|+|+.++.-..++++.+....=.+....++.. ...-..+.+.+. +... T Consensus 199 g~~l~~i~~~~~--~~i~~~g~~~~msvSpng~~lAl~t~~g~l~v~ssdf~~~~~e~~~~~~~~p~~m~WCG~daV~l~ 276 (410) T PF04841_consen 199 GETLYLIDEHSF--KQIDSRGPITHMSVSPNGKFLALFTDDGKLWVVSSDFSKKLCEYDTDSKSPPKQMEWCGNDAVVLS 276 (410) T ss_pred CCEEEEEECCCE--EECCCCCCEEEEEECCCCCEEEEEECCCCEEEEECCHHHCEEEECCCCCCCCCEEEEECCCCEEEE T ss_conf 993799983540--460468974899998999879999779859999883221004412576789756589579858898 Q ss_pred CCCCEEEEECCCCEEECCCCCCCEEEEECCCCCEEEE Q ss_conf 8998999804677143024777569999089989999 Q T0558 240 QNGGLYICNWQGHDREAGKGKHPQLVEIDSEGKVVWQ 276 (294) Q Consensus 240 ~~G~i~i~~~~~~~~~~~~~~~~~~~~i~~~G~~vW~ 276 (294) -...+++.+-.+......-... ..+.-..||-.|-. T Consensus 277 ~~~~l~lvg~~~~~~~~~~~~~-~~l~~E~DGvrI~t 312 (410) T PF04841_consen 277 WEDELLLVGPDGDSIDFYYDGP-PILVPEIDGVRIIT 312 (410) T ss_pred ECCEEEEECCCCCCEEEEECCC-EEEEECCCEEEEEE T ss_conf 3558999889987558752782-19984388679975 No 18 >PF04762 IKI3: IKI3 family; InterPro: IPR006849 Members of this family are components of the elongator multi-subunit component of a novel RNA polymerase II holoenzyme for transcriptional elongation . 
Probab=97.83 E-value=0.00097 Score=40.72 Aligned_cols=153 Identities=16% Similarity=0.157 Sum_probs=98.8 Q ss_pred HCCCCCCCCEEEEECCCCEEEEEECCCCEEEEEEECC-----------------------CCC----------------- Q ss_conf 0014788858999747986999988788299999449-----------------------987----------------- Q T0558 16 FAQGSSPQHLLVGGSGWNKIAIINKDTKEIVWEYPLE-----------------------KGW----------------- 55 (294) Q Consensus 16 ~~~~s~~~~~l~~gs~~~~i~~~d~~tg~~~w~~~~~-----------------------~~~----------------- 55 (294) .++=|+++++|+...+++++.++.. +-+.+-|.++. .+. T Consensus 125 a~~Wspd~e~l~~~t~~~~~l~Mt~-~fe~i~E~~l~~dd~~~~~~VsVGWGkkETQF~G~~gK~a~lr~pt~~~~~~~~ 203 (928) T PF04762_consen 125 AAAWSPDEELLALVTGENTLLLMTR-DFEPIAEQPLDSDDFGESKHVSVGWGKKETQFHGSGGKAAALRDPTVPKVDEGK 203 (928) T ss_pred EEEECCCCCEEEEEECCCEEEEEEC-CCEEEEEEECCCCCCCCCCEEECCCCCCCEEECCCCCCCCCCCCCCCCCCCCCC T ss_conf 9887798547999967976999924-540877752370104767603447886000615754432212488876556576 Q ss_pred -----CCCEEEEECCCCEEEEE------C--CEEEEEECCCCCEEEEECCCCCCEEEEEECCCCCEEEEEEC--CCCEEE Q ss_conf -----31164790798399841------8--82899752677036772377763148787378758999705--897999 Q T0558 56 -----ECNSVAATKAGEILFSY------S--KGAKMITRDGRELWNIAAPAGCEMQTARILPDGNALVAWCG--HPSTIL 120 (294) Q Consensus 56 -----~~~~~~~~pdG~~l~s~------~--~~v~~~~~~~~~~~~~~~~~~~~v~~~~~~~dg~~l~~~s~--~~~~~~ 120 (294) .-..+++..||.+++.+ + ..+++|+.+|. 
+-....+-..--..++|-|.|+++++..+ +..-++ T Consensus 204 ls~Dd~~~~ISWRGDG~yfAVs~~~~~~~~~R~iRVy~ReG~-L~s~SE~v~gLe~~LsWrPsGnLIAs~Qr~~~~~dVV 282 (928) T PF04762_consen 204 LSWDDGRVSISWRGDGAYFAVSSVEPESGQRRVIRVYSREGA-LDSTSEPVDGLEHALSWRPSGNLIASIQRKPDRHDVV 282 (928) T ss_pred CCCCCCCCEEEECCCCCEEEEEEEECCCCCEEEEEEECCCCC-EEECCCCCCCCCCCCCCCCCCCEEEEEEECCCCCEEE T ss_conf 545789637998688878999988716887137999888873-7752466667656764888887899999659984599 Q ss_pred EECCCCCEEEEEECCCCCCCCCCCCCEEEECCCCCEEEEEECCCEEEEEECCC Q ss_conf 98568958899970677677667400899969989999974698899993788 Q T0558 121 EVNMKGEVLSKTEFETGIERPHAQFRQINKNKKGNYLVPLFATSEVREIAPNG 173 (294) Q Consensus 121 ~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~s~dG~~i~~g~~d~~i~~~d~~g 173 (294) ++-++|-.-.+|.+.... ....+..+.++.|+..|++-..|. |.+|.... T Consensus 283 FFERNGLRHgeF~L~~~~--~~~~v~~L~WNsDS~iLAV~~~d~-VQLWT~gN 332 (928) T PF04762_consen 283 FFERNGLRHGEFTLRFDP--EEEKVISLAWNSDSDILAVWLEDR-VQLWTTGN 332 (928) T ss_pred EEECCCCEECEEEECCCC--CCCCEEEEEECCCCCEEEEEECCE-EEEEEECC T ss_conf 991598272527504567--665045759989798899998581-99998058 No 19 >PF05096 Glu_cyclase_2: Glutamine cyclotransferase; InterPro: IPR007788 This family of enzymes 2.3.2.5 from EC catalyse the cyclization of free L-glutamine and N-terminal glutaminyl residues in proteins to pyroglutamate (5-oxoproline) and pyroglutamyl residues respectively . This family includes plant and bacterial enzymes and seems unrelated to the mammalian enzymes.; PDB: 2faw_A 2iwa_A. Probab=97.75 E-value=0.0013 Score=39.83 Aligned_cols=199 Identities=11% Similarity=0.067 Sum_probs=124.1 Q ss_pred CEEEEEEECCCCCCCCEEEEECCCCEEEEE----CCEEEEEECCCCCE-EEEECCCCCCEEEEEECCCCCEEEEEECCCC Q ss_conf 829999944998731164790798399841----88289975267703-6772377763148787378758999705897 Q T0558 43 KEIVWEYPLEKGWECNSVAATKAGEILFSY----SKGAKMITRDGREL-WNIAAPAGCEMQTARILPDGNALVAWCGHPS 117 (294) Q Consensus 43 g~~~w~~~~~~~~~~~~~~~~pdG~~l~s~----~~~v~~~~~~~~~~-~~~~~~~~~~v~~~~~~~dg~~l~~~s~~~~ 117 (294) -+++.++|-+...---.+.|.+||..+-|. 
...++.++..++++ .+...+......++....|. ....+=.++. T Consensus 33 ~~Vv~~yPHd~~aFTQGL~~~~~~~LyESTG~yG~S~lr~~dl~tg~v~~~~~L~~~~FgEGit~~~d~-i~qLTWk~~~ 111 (264) T PF05096_consen 33 YEVVNTYPHDPNAFTQGLEFLHDGTLYESTGLYGQSSLRKVDLETGKVLQSTDLPSRYFGEGITIVGDK-IYQLTWKEGV 111 (264) T ss_dssp EEEEEEEE--TT------EE-STTEEEEEE------EEEEEES----EEEEEE--TT------EEETTE-EE---TT--- T ss_pred EEEEEECCCCCCCCCCCEEECCCCEEEEECCCCCCEEEEEEECCCCCEEEEEECCCCCEEEEEEEECCE-EEEEEECCCE T ss_conf 599997789986657747973799899947987667799997787869999988845536448998999-9999966882 Q ss_pred EEEEECCCCCEEEEEECCCCCCCCCCCCCEEEECCCCCEEEEEECCCEEEEEECC-CCEEEEEEC--CCEEEE---EEEE Q ss_conf 9999856895889997067767766740089996998999997469889999378-858898516--973489---8873 Q T0558 118 TILEVNMKGEVLSKTEFETGIERPHAQFRQINKNKKGNYLVPLFATSEVREIAPN-GQLLNSVKL--SGTPFS---SAFL 191 (294) Q Consensus 118 ~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~s~dG~~i~~g~~d~~i~~~d~~-g~~~~~~~~--~~~~~~---~~~~ 191 (294) .++++..+-+.+.++....+..+ ...||+.|+.......++..|++ .+....... .+.|.. -... T Consensus 112 ~fvyD~~tl~~~~~~~y~~EGWG---------Lt~d~~~L~~SDGS~~L~~ldP~tf~~~~~i~V~~~g~pv~~lNELE~ 182 (264) T PF05096_consen 112 GFVYDANTLEELGTFPYPGEGWG---------LTNDGQRLIMSDGSNKLYFLDPETFAETGSIQVTDNGRPVRRLNELEY 182 (264) T ss_dssp --EEETTT--EEE--------------------EE----EE------EEEEE-TTT--EEEEEE-B----B---EEEEEE T ss_pred EEEECCCHHEEEEEEECCCCCEE---------EEECCCEEEEECCCCCEEEECHHHCEEEEEEEEEECCEECCCCEEEEE T ss_conf 79980613106789932896167---------965899999989967559988178169999999989988776155799 Q ss_pred CCCCEEEECCCCCEEEEEECCCCEEEEEECCCCCCC---------EEECCCCCEEECC-CCCEEEEECCC Q ss_conf 489689973589879999878985999844887641---------1411345348938-99899980467 Q T0558 192 DNGDCLVACGDAHCFVQLNLESNRIVRRVNANDIEG---------VQLFFVAQLFPLQ-NGGLYICNWQG 251 (294) Q Consensus 192 ~~g~~~v~~~~~~~i~~~d~~~g~~~~~~~~~~~~~---------~~~~~~~~~~~~~-~G~i~i~~~~~ 251 (294) .+|.+++-.=..+.|..+|+++|++..-+......- ....-.+++++.+ ++++++.+..+ T Consensus 183 v~G~IyANVw~td~I~~Idp~tG~V~~~idls~L~~~~~~~~~~~~~~dVLNGIA~d~~~~~l~VTGK~W 252 
(264) T PF05096_consen 183 VDGEIYANVWQTDRIVRIDPETGKVTGWIDLSGLLPPAGRDGSRHPDNDVLNGIAYDPETDRLFVTGKNW 252 (264) T ss_dssp ----EEEEETTSSEEEEE-S----B---EE-HHHHHHHHH---T--T-------EEETTTTEEEE----- T ss_pred ECCEEEEEECCCCEEEEEECCCCEEEEEEECHHCCCCCCCCCCCCCCCCEEEEEEECCCCCEEEEECCCC T ss_conf 9999999977887499993898829999994324221023434344677057477828999899973788 No 20 >PF00930 DPPIV_N: Dipeptidyl peptidase IV (DPP IV) N-terminal region; InterPro: IPR002469 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes . They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence . Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases . Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base . The geometric orientations of the catalytic residues are similar between families, despite different protein folds . The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) , . Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. 
Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This domain defines serine peptidases belonging to MEROPS peptidase family S9 (clan SC), subfamily S9B (dipeptidyl-peptidase IV). The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. This domain is an alignment of the region to the N-terminal side of the active site, which is found in IPR001375 from INTERPRO. CD26 (3.4.14.5 from EC) is also called adenosine deaminase-binding protein (ADA-binding protein) or dipeptidylpeptidase IV (DPP IV ectoenzyme). The exopeptidase cleaves off N-terminal X-Pro or X-Ala dipeptides from polypeptides (dipeptidyl peptidase IV activity). CD26 serves as the costimulatory molecule in T cell activation and is an associated marker of autoimmune diseases, adenosine deaminase-deficiency and HIV pathogenesis. 
Dipeptidyl peptidase IV (DPP IV) is responsible for the removal of N-terminal dipeptides sequentially from polypeptides having unsubstituted N termini, provided that the penultimate residue is proline. The enzyme catalyses the reaction:Dipeptidyl-Polypeptide + H_(2)O = Dipeptide + Polypeptide It is a type II membrane protein that forms a homodimer. CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/). ; GO: 0004274 dipeptidyl-peptidase IV activity, 0006508 proteolysis, 0016020 membrane; PDB: 1z68_B 2oae_A 2gbi_A 2i3z_B 2gbg_A 2gbf_B 2gbc_B 3d4l_B 3ccb_B 2i03_A .... Probab=97.72 E-value=0.0014 Score=39.51 Aligned_cols=200 Identities=13% Similarity=0.095 Sum_probs=102.6 Q ss_pred CCCCCEEEEE---------CCCCEEEEEECCCCEEEEEEECCCCCCCCEEEEECCCCEEE-EECCEEEEEECCCCCEEEE Q ss_conf 7888589997---------47986999988788299999449987311647907983998-4188289975267703677 Q T0558 20 SSPQHLLVGG---------SGWNKIAIINKDTKEIVWEYPLEKGWECNSVAATKAGEILF-SYSKGAKMITRDGRELWNI 89 (294) Q Consensus 20 s~~~~~l~~g---------s~~~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~pdG~~l~-s~~~~v~~~~~~~~~~~~~ 89 (294) |++.++++.. |..+.+.++|.+++++. +.. ..........++|||+.++ ..+.++++.+...+...+. 
T Consensus 1 S~d~~~~l~~~~~~~~~R~s~~~~y~~~di~~~~~~-~l~-~~~~~~~~~~~SPdg~~~afv~~~nly~~~~~~~~~~~l 78 (353) T PF00930_consen 1 SPDGKYVLLATNVKKQWRHSSFGDYYLYDISTGKIK-PLT-PSPGKLQDPKWSPDGNWIAFVRDNNLYVRDLATGQETQL 78 (353) T ss_dssp -TTSSEEEEEEEEEEESSSEEEEEEEEEETTTTEEE-ESS--EETTBSEEEE------EEEEETTEEEEESSTTS-EEEC T ss_pred CCCCCEEEEEECCEEEEEECCCCEEEEEECCCCCEE-ECC-CCCCCCCCCEECCCCCEEEEEECCEEEEEECCCCCEEEE T ss_conf 997486999987568435514224999989889789-877-997644364699999979999798589998899964888 Q ss_pred ECCCC-----------------CCEEEEEECCCCCEEEEEECCCCE-----EEEECCCCC---EEEEEECC--------- Q ss_conf 23777-----------------631487873787589997058979-----999856895---88999706--------- Q T0558 90 AAPAG-----------------CEMQTARILPDGNALVAWCGHPST-----ILEVNMKGE---VLSKTEFE--------- 135 (294) Q Consensus 90 ~~~~~-----------------~~v~~~~~~~dg~~l~~~s~~~~~-----~~~~~~~G~---~~~~~~~~--------- 135 (294) +.... ..-.++-|+||+++|+...-|.+. +......+. .+.+++.+ T Consensus 79 T~dg~~~i~~G~~dwvyeEEi~~~~~~~wWSPD~~~la~~~~D~s~V~~~~~~~~~~~~~~yp~~~~~~YPk~G~~np~v 158 (353) T PF00930_consen 79 TTDGENDIFNGVPDWVYEEEIFGRNSALWWSPDSKYLAFARFDESKVPEYHLPDYSPPDSQYPELHSIRYPKAGDPNPKV 158 (353) T ss_dssp E---TTTEE-----HHHHHHTTSSS--EEE-TTSSEEEEEEEE-TTS-EEEEEEE-STTESS-EEEEEE--B------EE T ss_pred CCCCCCCEECCCCCCCCCCHHCCCCCCEEECCCCCEEEEEEECCCCCEEEEEEEECCCCCCCCCCCCCCCCCCCCCCCEE T ss_conf 07986435537764212410026566569999999899999789887399988406864468643435788994949867 Q ss_pred --------CCC----------CCCCCCCCEEEECCCCC-EEEE-EECC---CEEEEEECC-CCEEEEEEC--CCE---EE Q ss_conf --------776----------77667400899969989-9999-7469---889999378-858898516--973---48 Q T0558 136 --------TGI----------ERPHAQFRQINKNKKGN-YLVP-LFAT---SEVREIAPN-GQLLNSVKL--SGT---PF 186 (294) Q Consensus 136 --------~~~----------~~~~~~~~~~~~s~dG~-~i~~-g~~d---~~i~~~d~~-g~~~~~~~~--~~~---~~ 186 (294) .+. .....-+..+.+.+|++ +++. ...+ ..+...|.. |...+.... ... .. 
T Consensus 159 ~l~v~~~~~~~~~~v~~~~~~~~~~~yl~~v~W~~d~~~l~~~~~nR~q~~~~l~~~d~~tg~~~~~~~e~~~~Wv~~~~ 238 (353) T PF00930_consen 159 RLGVVDLASGKTTEVDPPDALNPRDYYLTRVGWSPDGNKLLVQWLNRDQNRLDLLLCDPETGQTRVILEETSDGWVDPHN 238 (353) T ss_dssp EEEEEECCCTTTTEE---HHHHCSSEEEEEEEEEETTEEEEEEEEETTSTEEEEEEEEECTTTEEEEEEEE-SS---SSS T ss_pred EEEEEECCCCCEEEECCCCCCCCCCEEEEEEEECCCCCEEEEEEECCCCCEEEEEEEECCCCEEEEEEECCCCCCEECCC T ss_conf 99999888997887247422577771888869878995699998036898899999989888488899724797467335 Q ss_pred EEEEE-CCCCE-EEECCCCC--EEEEEECCCCEEEEEECC Q ss_conf 98873-48968-99735898--799998789859998448 Q T0558 187 SSAFL-DNGDC-LVACGDAH--CFVQLNLESNRIVRRVNA 222 (294) Q Consensus 187 ~~~~~-~~g~~-~v~~~~~~--~i~~~d~~~g~~~~~~~~ 222 (294) ...+. +++.. +...-.++ .+++++..++++. .++. T Consensus 239 ~~~~~~~~~~~~l~~ser~G~~hLy~~~~~~~~~~-~lT~ 277 (353) T PF00930_consen 239 PPPFLLPDGSRFLWISERDGYRHLYLYDLDGGKPR-QLTS 277 (353) T ss_dssp -EEE--TTSSEEEEEEE---EEEEEEEESTCSEEE-ES-- T ss_pred CCEEEECCCCEEEEEEECCCCCEEEEEECCCCCEE-ECCC T ss_conf 64058579987999998089638999958999346-0565 No 21 >PF00930 DPPIV_N: Dipeptidyl peptidase IV (DPP IV) N-terminal region; InterPro: IPR002469 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes . They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence . Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases . Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. 
Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This domain defines serine peptidases belonging to MEROPS peptidase family S9 (clan SC), subfamily S9B (dipeptidyl-peptidase IV).
The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. This domain is an alignment of the region to the N-terminal side of the active site, which is found in IPR001375 from INTERPRO. CD26 (3.4.14.5 from EC) is also called adenosine deaminase-binding protein (ADA-binding protein) or dipeptidylpeptidase IV (DPP IV ectoenzyme). The exopeptidase cleaves off N-terminal X-Pro or X-Ala dipeptides from polypeptides (dipeptidyl peptidase IV activity). CD26 serves as the costimulatory molecule in T cell activation and is an associated marker of autoimmune diseases, adenosine deaminase-deficiency and HIV pathogenesis. Dipeptidyl peptidase IV (DPP IV) is responsible for the removal of N-terminal dipeptides sequentially from polypeptides having unsubstituted N termini, provided that the penultimate residue is proline. The enzyme catalyses the reaction:Dipeptidyl-Polypeptide + H_(2)O = Dipeptide + Polypeptide It is a type II membrane protein that forms a homodimer. CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/). ; GO: 0004274 dipeptidyl-peptidase IV activity, 0006508 proteolysis, 0016020 membrane; PDB: 1z68_B 2oae_A 2gbi_A 2i3z_B 2gbg_A 2gbf_B 2gbc_B 3d4l_B 3ccb_B 2i03_A .... Probab=97.61 E-value=0.002 Score=38.45 Aligned_cols=228 Identities=10% Similarity=0.029 Sum_probs=123.9 Q ss_pred HHCCCCCCCCEEEEECCCCEEEEEECCCCEEEEEEECCCC-----------------CCCCEEEEECCCCEEEEE---CC Q ss_conf 2001478885899974798699998878829999944998-----------------731164790798399841---88 Q T0558 15 PFAQGSSPQHLLVGGSGWNKIAIINKDTKEIVWEYPLEKG-----------------WECNSVAATKAGEILFSY---SK 74 (294) Q Consensus 15 ~~~~~s~~~~~l~~gs~~~~i~~~d~~tg~~~w~~~~~~~-----------------~~~~~~~~~pdG~~l~s~---~~ 74 (294) ..+..|++++.++--.++ .|++.+..+++... ...++. ..-..+.++|||+.|+.. +. 
T Consensus 46 ~~~~~SPdg~~~afv~~~-nly~~~~~~~~~~~-lT~dg~~~i~~G~~dwvyeEEi~~~~~~~wWSPD~~~la~~~~D~s 123 (353) T PF00930_consen 46 QDPKWSPDGNWIAFVRDN-NLYVRDLATGQETQ-LTTDGENDIFNGVPDWVYEEEIFGRNSALWWSPDSKYLAFARFDES 123 (353) T ss_dssp SEEEE------EEEEETT-EEEEESSTTS-EEE-CE---TTTEE-----HHHHHHTTSSS--EEE-TTSSEEEEEEEE-T T ss_pred CCCEECCCCCEEEEEECC-EEEEEECCCCCEEE-ECCCCCCCEECCCCCCCCCCHHCCCCCCEEECCCCCEEEEEEECCC T ss_conf 364699999979999798-58999889996488-8079864355377642124100265665699999998999997898 Q ss_pred -----------------------------------EEEEEECCCCCEEEEE-----CCCCCCEEEEEECCCCCEEE--EE Q ss_conf -----------------------------------2899752677036772-----37776314878737875899--97 Q T0558 75 -----------------------------------GAKMITRDGRELWNIA-----APAGCEMQTARILPDGNALV--AW 112 (294) Q Consensus 75 -----------------------------------~v~~~~~~~~~~~~~~-----~~~~~~v~~~~~~~dg~~l~--~~ 112 (294) .+.+++.+++...+.. ......+..+.|.+|++.++ .. T Consensus 124 ~V~~~~~~~~~~~~~~yp~~~~~~YPk~G~~np~v~l~v~~~~~~~~~~v~~~~~~~~~~~yl~~v~W~~d~~~l~~~~~ 203 (353) T PF00930_consen 124 KVPEYHLPDYSPPDSQYPELHSIRYPKAGDPNPKVRLGVVDLASGKTTEVDPPDALNPRDYYLTRVGWSPDGNKLLVQWL 203 (353) T ss_dssp TS-EEEEEEE-STTESS-EEEEEE--B------EEEEEEEECCCTTTTEE---HHHHCSSEEEEEEEEEETTEEEEEEEE T ss_pred CCEEEEEEEECCCCCCCCCCCCCCCCCCCCCCCEEEEEEEECCCCCEEEECCCCCCCCCCEEEEEEEECCCCCEEEEEEE T ss_conf 87399988406864468643435788994949867999998889978872474225777718888698789956999980 Q ss_pred ECCCC---EEEEECCCCCEEEEEECCCCCCCCCCCCCEEEEC-CCCC-EEEEEECCC--EEEEEECCCCEEEEEECCC-E Q ss_conf 05897---9999856895889997067767766740089996-9989-999974698--8999937885889851697-3 Q T0558 113 CGHPS---TILEVNMKGEVLSKTEFETGIERPHAQFRQINKN-KKGN-YLVPLFATS--EVREIAPNGQLLNSVKLSG-T 184 (294) Q Consensus 113 s~~~~---~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~s-~dG~-~i~~g~~d~--~i~~~d~~g~~~~~~~~~~-~ 184 (294) +++.. +...+..+|++.+......... -.......+. +++. ++.....+| .+++++.+++.......+. . 
T Consensus 204 nR~q~~~~l~~~d~~tg~~~~~~~e~~~~W--v~~~~~~~~~~~~~~~~l~~ser~G~~hLy~~~~~~~~~~~lT~G~~~ 281 (353) T PF00930_consen 204 NRDQNRLDLLLCDPETGQTRVILEETSDGW--VDPHNPPPFLLPDGSRFLWISERDGYRHLYLYDLDGGKPRQLTSGNWE 281 (353) T ss_dssp ETTSTEEEEEEEEECTTTEEEEEEEE-SS-----SSS-EEE--TTSSEEEEEEE---EEEEEEEESTCSEEEES------ T ss_pred CCCCCEEEEEEEECCCCEEEEEEECCCCCC--EECCCCCEEEECCCCEEEEEEECCCCCEEEEEECCCCCEEECCCCCEE T ss_conf 368988999999898884888997247974--673356405857998799999808963899995899934605656537 Q ss_pred EEE-EEEECCCCE-EEECCCC----CEEEEEECC-CCEEEEEECCCCCCCEEECCCCCEEECCCCCEEEEECCCCE Q ss_conf 489-887348968-9973589----879999878-98599984488764114113453489389989998046771 Q T0558 185 PFS-SAFLDNGDC-LVACGDA----HCFVQLNLE-SNRIVRRVNANDIEGVQLFFVAQLFPLQNGGLYICNWQGHD 253 (294) Q Consensus 185 ~~~-~~~~~~g~~-~v~~~~~----~~i~~~d~~-~g~~~~~~~~~~~~~~~~~~~~~~~~~~~G~i~i~~~~~~~ 253 (294) +.+ ..+..+++. ++.+..+ ..++.+++. ++++..-....... .+..+.++|++++..+.+.+ T Consensus 282 V~~i~~~d~~~~~vYf~a~~~~p~~~hlY~v~l~~~~~~~~LT~~~~~~-------~s~~~S~~~~y~v~~ys~~~ 350 (353) T PF00930_consen 282 VTSILGVDEDGNLVYFTANEDDPYERHLYRVSLDGGGKPTRLTPEDGDH-------YSASFSPDGKYYVDTYSGPD 350 (353) T ss_dssp EEEEEEEEEESSEEEEEESSC-TT-BEEEEEETTETTEEEESSTTTSTT-------EEEEE-TTSSEEEEEEEETT T ss_pred ECCCEEEECCCCEEEEEEECCCCCCEEEEEEECCCCCCEEECCCCCCCC-------EEEEECCCCCEEEEEECCCC T ss_conf 5452149488999999995999984799999888899818898999874-------68999999999999886899 No 22 >PF05096 Glu_cyclase_2: Glutamine cyclotransferase; InterPro: IPR007788 This family of enzymes 2.3.2.5 from EC catalyse the cyclization of free L-glutamine and N-terminal glutaminyl residues in proteins to pyroglutamate (5-oxoproline) and pyroglutamyl residues respectively . This family includes plant and bacterial enzymes and seems unrelated to the mammalian enzymes.; PDB: 2faw_A 2iwa_A. 
Probab=97.46 E-value=0.0029 Score=37.23 Aligned_cols=151 Identities=16% Similarity=0.127 Sum_probs=98.9 Q ss_pred CCCCCEEEEEECCCCCEEEEEECC--CCEEEEECCCCCEEEEEECCCCCCCCCCCCCEEEECCCCCEEEEEECCCEEEEE Q ss_conf 777631487873787589997058--979999856895889997067767766740089996998999997469889999 Q T0558 92 PAGCEMQTARILPDGNALVAWCGH--PSTILEVNMKGEVLSKTEFETGIERPHAQFRQINKNKKGNYLVPLFATSEVREI 169 (294) Q Consensus 92 ~~~~~v~~~~~~~dg~~l~~~s~~--~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~s~dG~~i~~g~~d~~i~~~ 169 (294) ........+.+..+|..+-+++.. +.+...+..+|++..+..++...-+. .+... +++++.-.-.++...+| T Consensus 42 d~~aFTQGL~~~~~~~LyESTG~yG~S~lr~~dl~tg~v~~~~~L~~~~FgE-----Git~~-~d~i~qLTWk~~~~fvy 115 (264) T PF05096_consen 42 DPNAFTQGLEFLHDGTLYESTGLYGQSSLRKVDLETGKVLQSTDLPSRYFGE-----GITIV-GDKIYQLTWKEGVGFVY 115 (264) T ss_dssp -TT------EE-STTEEEEEE------EEEEEES----EEEEEE--TT-----------EEE-TTEEE---TT-----EE T ss_pred CCCCCCCCEEECCCCEEEEECCCCCCEEEEEEECCCCCEEEEEECCCCCEEE-----EEEEE-CCEEEEEEECCCEEEEE T ss_conf 9866577479737998999479876677999977878699999888455364-----48998-99999999668827998 Q ss_pred ECC-CCEEEEEECCCEEEEEEEECCCCEEEECCCCCEEEEEECCCCEEEEEECCCCCCCEEECCCCCEEECCCCCEEEEE Q ss_conf 378-8588985169734898873489689973589879999878985999844887641141134534893899899980 Q T0558 170 APN-GQLLNSVKLSGTPFSSAFLDNGDCLVACGDAHCFVQLNLESNRIVRRVNANDIEGVQLFFVAQLFPLQNGGLYICN 248 (294) Q Consensus 170 d~~-g~~~~~~~~~~~~~~~~~~~~g~~~v~~~~~~~i~~~d~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~G~i~i~~ 248 (294) |.+ -+.+.++.....-+... .+|+.++.+..++.++++|+++-++..+++-.+ .+......+...+. 
+|.++..- T Consensus 116 D~~tl~~~~~~~y~~EGWGLt--~d~~~L~~SDGS~~L~~ldP~tf~~~~~i~V~~-~g~pv~~lNELE~v-~G~IyANV 191 (264) T PF05096_consen 116 DANTLEELGTFPYPGEGWGLT--NDGQRLIMSDGSNKLYFLDPETFAETGSIQVTD-NGRPVRRLNELEYV-DGEIYANV 191 (264) T ss_dssp ETTT--EEE-----------E--E----EE------EEEEE-TTT--EEEEEE-B-----B---EEEEEE-----EEEEE T ss_pred CCCHHEEEEEEECCCCCEEEE--ECCCEEEEECCCCCEEEECHHHCEEEEEEEEEE-CCEECCCCEEEEEE-CCEEEEEE T ss_conf 061310678993289616796--589999998996755998817816999999998-99887761557999-99999997 Q ss_pred CCCC Q ss_conf 4677 Q T0558 249 WQGH 252 (294) Q Consensus 249 ~~~~ 252 (294) |+.. T Consensus 192 w~td 195 (264) T PF05096_consen 192 WQTD 195 (264) T ss_dssp TTSS T ss_pred CCCC T ss_conf 7887 No 23 >PF08553 VID27: VID27 cytoplasmic protein; InterPro: IPR013863 This entry represents fungal and plant proteins and contains many hypothetical proteins. Vid27p is a cytoplasmic protein of unknown function, possibly regulates import of fructose-1,6-bisphosphatase into Vacuolar Import and Degradation (Vid) vesicles and is not essential for proteasome-dependent degradation of fructose-1,6-bisphosphatase (FBPase) , . Probab=96.94 E-value=0.0055 Score=35.20 Aligned_cols=141 Identities=15% Similarity=0.113 Sum_probs=87.1 Q ss_pred CCCEEEE-ECCCCEEEEEECCCCEEEEEEECCCCCCCCEEEE-------ECCCCEEEEECCEEEEEECC--C-CCEEEEE Q ss_conf 8858999-7479869999887882999994499873116479-------07983998418828997526--7-7036772 Q T0558 22 PQHLLVG-GSGWNKIAIINKDTKEIVWEYPLEKGWECNSVAA-------TKAGEILFSYSKGAKMITRD--G-RELWNIA 90 (294) Q Consensus 22 ~~~~l~~-gs~~~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~-------~pdG~~l~s~~~~v~~~~~~--~-~~~~~~~ 90 (294) +.+.|+- ....++|+-+|.++||++-++..+.-..+..++. ++...++..++..++.||.- + +.++... 
T Consensus 486 d~~Mll~~~~~~~~ly~mDLe~GKIV~eW~~~~d~~v~~~~~~sK~aqlt~e~tflGls~n~lfriDpRl~~~klv~~~~ 565 (788) T PF08553_consen 486 DRNMLLLSPENPNKLYQMDLERGKIVEEWKVEDDIPVVDIAPDSKFAQLTSEQTFLGLSDNSLFRIDPRLSGNKLVQSQS 565 (788) T ss_pred CCCEEEECCCCCCCEEEEECCCCCEEEEEECCCCCCCEEECCCCCCCCCCCCCEEEEECCCCEEEECCCCCCCCEEECCC T ss_conf 66269746988783388754788688885258876603742565534367775189987783498745668871451134 Q ss_pred CC-CC-CCEEEEEECCCCCEEEEEECCCCEEEEECCCCCEEEEEECCCCCCCCCCCCCEEEECCCCCEEEEEECCCEEEE Q ss_conf 37-77-63148787378758999705897999985689588999706776776674008999699899999746988999 Q T0558 91 AP-AG-CEMQTARILPDGNALVAWCGHPSTILEVNMKGEVLSKTEFETGIERPHAQFRQINKNKKGNYLVPLFATSEVRE 168 (294) Q Consensus 91 ~~-~~-~~v~~~~~~~dg~~l~~~s~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~s~dG~~i~~g~~d~~i~~ 168 (294) .. .. ....+++-+.+ .++++++.++.+++++ ..|+. ......+...++.++..+.||+++++.+. ..+.+ T Consensus 566 k~Y~~~~~Fs~~aTT~~-G~iavgS~~G~IRLyd-~~gk~-----AKT~lp~lG~PIi~iDVT~DGkWiLaTc~-tyLlL 637 (788) T PF08553_consen 566 KQYASKNNFSCAATTED-GYIAVGSNKGDIRLYD-RLGKR-----AKTALPGLGDPIIGIDVTADGKWILATCD-TYLLL 637 (788) T ss_pred CCCCCCCCCEEEEECCC-CEEEEECCCCCEEECC-CCCCH-----HHHCCCCCCCCEEEEEECCCCCEEEEEEC-CEEEE T ss_conf 22446887048985699-6599960898578606-66713-----32227878997367796578848999705-35999 Q ss_pred EE Q ss_conf 93 Q T0558 169 IA 170 (294) Q Consensus 169 ~d 170 (294) .+ T Consensus 638 i~ 639 (788) T PF08553_consen 638 ID 639 (788) T ss_pred EE T ss_conf 98 No 24 >PF06433 Me-amine-dh_H: Methylamine dehydrogenase heavy chain (MADH); InterPro: IPR009451 Methylamine dehydrogenase (1.4.99.3 from EC) is a periplasmic quinoprotein found in several methyltrophic bacteria . It is induced when grown on methylamine as a carbon source MADH and catalyses the oxidative deamination of amines to their corresponding aldehydes. The redox cofactor of this enzyme is tryptophan tryptophylquinone (TTQ). 
Electrons derived from the oxidation of methylamine are passed to an electron acceptor, which is usually the blue-copper protein amicyanin (IPR002386 from INTERPRO).RCH2NH2 + H2O + acceptor = RCHO + NH3 + reduced acceptor MADH is a hetero-tetramer, comprised of two heavy subunits and two light subunits. The heavy subunit forms a seven-bladed beta-propeller like structure .; GO: 0030058 amine dehydrogenase activity, 0030416 methylamine metabolic process, 0042597 periplasmic space; PDB: 3c75_H 1mg2_A 2j57_J 2mta_H 2gc4_I 2gc7_I 2j55_J 1mg3_I 2bbk_H 2j56_J .... Probab=96.91 E-value=0.011 Score=33.15 Aligned_cols=175 Identities=10% Similarity=0.062 Sum_probs=100.1 Q ss_pred CCEEEEEECCCCEEEEEEECCCCCCCCEEEEECCCCEEEEE------------CCEEEEEECCCCCEEE-EECCCC---- Q ss_conf 98699998878829999944998731164790798399841------------8828997526770367-723777---- Q T0558 32 WNKIAIINKDTKEIVWEYPLEKGWECNSVAATKAGEILFSY------------SKGAKMITRDGRELWN-IAAPAG---- 94 (294) Q Consensus 32 ~~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~pdG~~l~s~------------~~~v~~~~~~~~~~~~-~~~~~~---- 94 (294) .++++++|.++++++=....+ . ...+.++|||+.++.. ..-+.+||...-.... ...+.. 
T Consensus 16 ~~rv~viD~d~~~~lGmi~~g--~-~~~~~~s~dg~~~y~a~T~ysR~~rG~RtDvV~v~D~~tL~p~~EI~iP~k~R~~ 92 (342) T PF06433_consen 16 TGRVYVIDGDSGKVLGMIDTG--F-FGNFVLSPDGKFIYVAETFYSRGTRGERTDVVTVYDTQTLSPTAEIVIPPKPRFQ 92 (342) T ss_dssp SEEEEEEETT---B---BEEE--B---EEEE-----EEEEEEEEEEETTE-EEEEEEEEEETTTTEEEEEEEETTT-B-- T ss_pred CCEEEEEECCCCCEEEEEECC--C-CCCEEECCCCCEEEEEEEEECCCCCCCEEEEEEEEECCCCCCCCCEECCCCCCEE T ss_conf 013999979888377766336--4-6651588999889998788705565530268999706667622337669876335 Q ss_pred --CCEEEEEECCCCCEEEEEEC--CCCEEEEECCCCCEEEEEECCCCCCCCCCCCCEEEECCCCCEEEEEECCCEEEEE- Q ss_conf --63148787378758999705--8979999856895889997067767766740089996998999997469889999- Q T0558 95 --CEMQTARILPDGNALVAWCG--HPSTILEVNMKGEVLSKTEFETGIERPHAQFRQINKNKKGNYLVPLFATSEVREI- 169 (294) Q Consensus 95 --~~v~~~~~~~dg~~l~~~s~--~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~s~dG~~i~~g~~d~~i~~~- 169 (294) ......++++|++++++..- ...+-+.|....+.+.+...++ +.. +-.+++ +-+...+.||.+... T Consensus 93 ~~~~~~~~~ls~d~k~l~v~N~TPa~SVtVVDl~~~k~v~eid~PG-------C~~-iyP~~~-~~F~~lC~DGsl~~v~ 163 (342) T PF06433_consen 93 AGPYKGMFALSADGKFLLVFNFTPAQSVTVVDLAAKKFVREIDTPG-------CAL-IYPTGN-RGFSMLCGDGSLLSVT 163 (342) T ss_dssp ----GGGEEE-TTSSEEEE-BESSSEEE-EEES---EEEEEEEG-S-------EEE-EEEEET-TEEEEEE----EEEEE T ss_pred ECCCCCCEEECCCCEEEEEECCCCCCEEEEEECCCCCEEEEECCCC-------EEE-EEECCC-CCEEEEECCCCEEEEE T ss_conf 1234652365559839999836887656899854462144505798-------799-966699-8517885698238999 Q ss_pred -ECCCCEEEEEE-----CCCEEEEEE-EECCCCEEEECCCCCEEEEEECCCCEEEE Q ss_conf -37885889851-----697348988-73489689973589879999878985999 Q T0558 170 -APNGQLLNSVK-----LSGTPFSSA-FLDNGDCLVACGDAHCFVQLNLESNRIVR 218 (294) Q Consensus 170 -d~~g~~~~~~~-----~~~~~~~~~-~~~~g~~~v~~~~~~~i~~~d~~~g~~~~ 218 (294) |.+|+...... 
..+.+..-+ .....+.++-.+..+.++..+.......+ T Consensus 164 lD~~G~~~~~~t~~F~~~~dplf~~~a~~~~~~~~~F~sy~G~v~~~~l~~~~~~~ 219 (342) T PF06433_consen 164 LDDDGKETQRRTEVFFPDDDPLFEHPAYSRKTGRLVFPSYEGNVYQADLSGDKAKF 219 (342) T ss_dssp -------EEEE---SSTTTS-B-S--EE-STEEEEEEEB----EEEEE------EE T ss_pred ECCCCCEEEECCCCCCCCCCCCCCCCCEECCCCEEEEEECCCEEEEEECCCCCCCC T ss_conf 78999787621577666774212355327789859998437779987316886542 No 25 >PF07250 Glyoxal_oxid_N: Glyoxal oxidase N-terminus; InterPro: IPR009880 This entry represents the N terminus (approximately 300 residues) of a number of plant and fungal glyoxal oxidase enzymes. Glyoxal oxidase catalyses the oxidation of aldehydes to carboxylic acids, coupled with reduction of dioxygen to hydrogen peroxide. It is an essential component of the extracellular lignin degradation pathways of the wood-rot fungus Phanerochaete chrysosporium . Probab=96.80 E-value=0.013 Score=32.55 Aligned_cols=182 Identities=13% Similarity=0.097 Sum_probs=103.0 Q ss_pred CCCEEEEEECCC-CEEEEEEECCCCCCCCEEEEECCCCEEEEE-CCEEEEEECCCCCEEEEECCCCCCEEEEEECCCCCE Q ss_conf 798699998878-829999944998731164790798399841-882899752677036772377763148787378758 Q T0558 31 GWNKIAIINKDT-KEIVWEYPLEKGWECNSVAATKAGEILFSY-SKGAKMITRDGRELWNIAAPAGCEMQTARILPDGNA 108 (294) Q Consensus 31 ~~~~i~~~d~~t-g~~~w~~~~~~~~~~~~~~~~pdG~~l~s~-~~~v~~~~~~~~~~~~~~~~~~~~v~~~~~~~dg~~ 108 (294) .+++|.++|... |.-.-+ +..+. |. 
..|.....-.+ --.-.+||................+.+-.+.+||.+ T Consensus 7 ~~~~v~~~d~t~~g~s~~~--l~~~~-cr---~~~~~~~~~~d~~a~s~~~D~~tn~~rpl~v~td~FCSgg~~L~dG~l 80 (243) T PF07250_consen 7 HNNKVIMFDRTNFGPSNIS--LPDGR-CR---NNPEDNALKIDCTAHSVEYDPATNTFRPLPVQTDTFCSGGAFLPDGRL 80 (243) T ss_pred CCCEEEEEECCCCCCCCCC--CCCCC-CC---CCCCCHHHCCCCCEEEEEEECCCCCEEECCCCCCCCCCCCCCCCCCCE T ss_conf 3997999968676645310--69982-65---675001013684088998836889887035734775667579989999 Q ss_pred EEEEECCCC---EEEEECC--CCCEEEEEECCCCCCCCCCCCCEEEECCCCCEEEEEECCCEEEEEECC---CCEE--EE Q ss_conf 999705897---9999856--895889997067767766740089996998999997469889999378---8588--98 Q T0558 109 LVAWCGHPS---TILEVNM--KGEVLSKTEFETGIERPHAQFRQINKNKKGNYLVPLFATSEVREIAPN---GQLL--NS 178 (294) Q Consensus 109 l~~~s~~~~---~~~~~~~--~G~~~~~~~~~~~~~~~~~~~~~~~~s~dG~~i~~g~~d~~i~~~d~~---g~~~--~~ 178 (294) +.++....+ +.++... .+.+-|...... ....-..-....-+||+.|+.|....--+.+-+. +... +. T Consensus 81 l~tGG~~~G~~~iR~~~P~~~~~~~dW~e~~~~--L~~~RWYpT~~~L~DG~vlIiGG~~~~~~E~~P~~~~~~~~~~~~ 158 (243) T PF07250_consen 81 LQTGGDFDGNKGIRIFDPCGSDGTCDWVESPNQ--LQAGRWYPTAQTLPDGRVLIIGGRRNPTYEFFPPIGPGPGPVNLP 158 (243) T ss_pred EEECCCCCCCEEEEEEECCCCCCCCCEEECCCC--CCCCCCCCCCEECCCCCEEEEECCCCCCEEECCCCCCCCCCEECC T ss_conf 983786777600699607888887771765774--457865566249899989999488888310778876788725210 Q ss_pred EEC------CCE-EEEEEEECCCCEEEECCCCCEEEEEECCCCEEEEEECC Q ss_conf 516------973-48988734896899735898799998789859998448 Q T0558 179 VKL------SGT-PFSSAFLDNGDCLVACGDAHCFVQLNLESNRIVRRVNA 222 (294) Q Consensus 179 ~~~------~~~-~~~~~~~~~g~~~v~~~~~~~i~~~d~~~g~~~~~~~~ 222 (294) +.. ... --.+...|+|++++.+..+. .++|..++++++.+.. 
T Consensus 159 ~L~~t~~~~~~nlYPf~~LlPdG~lFi~an~~s--~i~D~~t~~vv~~lP~ 207 (243) T PF07250_consen 159 FLADTNDTQPNNLYPFVHLLPDGNLFIFANNRS--IILDYNTNTVVRDLPN 207 (243) T ss_pred HHHHHCCCCCCCCCCEEEECCCCCEEEEECCCC--EEECCCCCEEEEECCC T ss_conf 202231367666572699858998999984774--8871889959764788 No 26 >PF04841 Vps16_N: Vps16, N-terminal region; InterPro: IPR006926 This protein forms part of the Class C vacuolar protein sorting (Vps) complex. Vps16 is essential for vacuolar protein sorting, which is essential for viability in plants, but not yeast . The Class C Vps complex is required for SNARE-mediated membrane fusion at the lysosome-like yeast vacuole. It is thought to play essential roles in membrane docking and fusion at the Golgi-to-endosome and endosome-to-vacuole stages of transport . The role of VPS16 in this complex is not known.; GO: 0006886 intracellular protein transport, 0005737 cytoplasm Probab=96.80 E-value=0.013 Score=32.55 Aligned_cols=50 Identities=12% Similarity=0.124 Sum_probs=20.3 Q ss_pred EEEEECCCCCEEEEECCCCCCEEEEEECCCCCEEEEEECCCCEEEEECCCCCE Q ss_conf 89975267703677237776314878737875899970589799998568958 Q T0558 76 AKMITRDGRELWNIAAPAGCEMQTARILPDGNALVAWCGHPSTILEVNMKGEV 128 (294) Q Consensus 76 v~~~~~~~~~~~~~~~~~~~~v~~~~~~~dg~~l~~~s~~~~~~~~~~~~G~~ 128 (294) +.+|+..|..+++..-.+ ..+..+.|+.+.+.+ +...|+.+++. +..|+. T Consensus 63 I~Iys~sG~ll~si~w~~-~~iv~l~wt~~e~Li-vV~~dG~v~~y-~~~G~~ 112 (410) T PF04841_consen 63 IRIYSSSGKLLSSIPWDS-GRIVGLGWTDDEELI-VVFEDGTVRVY-DLFGEF 112 (410) T ss_pred EEEECCCCCEEEEEEECC-CCEEEEEECCCCCEE-EEECCCEEEEE-ECCCCE T ss_conf 999978897968988179-887899987788799-99868989998-067756 No 27 >PF07250 Glyoxal_oxid_N: Glyoxal oxidase N-terminus; InterPro: IPR009880 This entry represents the N terminus (approximately 300 residues) of a number of plant and fungal glyoxal oxidase enzymes. Glyoxal oxidase catalyses the oxidation of aldehydes to carboxylic acids, coupled with reduction of dioxygen to hydrogen peroxide. 
It is an essential component of the extracellular lignin degradation pathways of the wood-rot fungus Phanerochaete chrysosporium . Probab=96.62 E-value=0.017 Score=31.67 Aligned_cols=139 Identities=19% Similarity=0.186 Sum_probs=58.1 Q ss_pred EEEEEECCCCEEEEEEECCCCCCCCEEEEECCCCEEEEEC-----CEEEEEECCC---CCEEEEECC---CCCCEEEEEE Q ss_conf 6999988788299999449987311647907983998418-----8289975267---703677237---7763148787 Q T0558 34 KIAIINKDTKEIVWEYPLEKGWECNSVAATKAGEILFSYS-----KGAKMITRDG---RELWNIAAP---AGCEMQTARI 102 (294) Q Consensus 34 ~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~pdG~~l~s~~-----~~v~~~~~~~---~~~~~~~~~---~~~~v~~~~~ 102 (294) .=.+||..|+++. ...+....-|++-++.|||+++.++. ..+++++..+ ..-|..... ..-+=-+... T Consensus 47 ~s~~~D~~tn~~r-pl~v~td~FCSgg~~L~dG~ll~tGG~~~G~~~iR~~~P~~~~~~~dW~e~~~~L~~~RWYpT~~~ 125 (243) T PF07250_consen 47 HSVEYDPATNTFR-PLPVQTDTFCSGGAFLPDGRLLQTGGDFDGNKGIRIFDPCGSDGTCDWVESPNQLQAGRWYPTAQT 125 (243) T ss_pred EEEEEECCCCCEE-ECCCCCCCCCCCCCCCCCCCEEEECCCCCCCEEEEEEECCCCCCCCCEEECCCCCCCCCCCCCCEE T ss_conf 8998836889887-035734775667579989999983786777600699607888887771765774457865566249 Q ss_pred CCCCCEEEEEECCCCEEEEECCC--CCEEEEEECCCCC--CCCCCCCCEEEECCCCCEEEEEECCCEEEEEECCCCE Q ss_conf 37875899970589799998568--9588999706776--7766740089996998999997469889999378858 Q T0558 103 LPDGNALVAWCGHPSTILEVNMK--GEVLSKTEFETGI--ERPHAQFRQINKNKKGNYLVPLFATSEVREIAPNGQL 175 (294) Q Consensus 103 ~~dg~~l~~~s~~~~~~~~~~~~--G~~~~~~~~~~~~--~~~~~~~~~~~~s~dG~~i~~g~~d~~i~~~d~~g~~ 175 (294) .|||+.++.+......+-+.... +............ .....-.--+...|||++++.+..++ .++|..+.. 
T Consensus 126 L~DG~vlIiGG~~~~~~E~~P~~~~~~~~~~~~~L~~t~~~~~~nlYPf~~LlPdG~lFi~an~~s--~i~D~~t~~ 200 (243) T PF07250_consen 126 LPDGRVLIIGGRRNPTYEFFPPIGPGPGPVNLPFLADTNDTQPNNLYPFVHLLPDGNLFIFANNRS--IILDYNTNT 200 (243) T ss_pred CCCCCEEEEECCCCCCEEECCCCCCCCCCEECCHHHHHCCCCCCCCCCEEEECCCCCEEEEECCCC--EEECCCCCE T ss_conf 899989999488888310778876788725210202231367666572699858998999984774--887188995 No 28 >PF03178 CPSF_A: CPSF A subunit region; InterPro: IPR004871 This family includes a region that lies towards the C-terminus of the cleavage and polyadenylation specificity factor (CPSF) A (160 kDa) subunit. CPSF is involved in mRNA polyadenylation and binds the AAUAAA conserved sequence in pre-mRNA. CPSF has also been found to be necessary for splicing of single-intron pre-mRNAs . The function of the aligned region is unknown but may be involved in RNA/DNA binding.; GO: 0003676 nucleic acid binding, 0005634 nucleus; PDB: 3ei3_A 2b5m_A 3e0c_A 3ei4_A 2b5l_A 3ei1_A 2hye_A 3ei2_A. Probab=96.56 E-value=0.018 Score=31.43 Aligned_cols=170 Identities=14% Similarity=0.099 Sum_probs=88.5 Q ss_pred CEEEEEECCCCEEEEEEECCCCCCCCEEEEEC-------CCCEEEEE-----------C-CEEEEEECCCC-----CEE- Q ss_conf 86999988788299999449987311647907-------98399841-----------8-82899752677-----036- Q T0558 33 NKIAIINKDTKEIVWEYPLEKGWECNSVAATK-------AGEILFSY-----------S-KGAKMITRDGR-----ELW- 87 (294) Q Consensus 33 ~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~p-------dG~~l~s~-----------~-~~v~~~~~~~~-----~~~- 87 (294) ..|.++|..+.+.+++++++....+.++.... .-.+|+.+ . +.++++..... ++. 
T Consensus 2 s~i~lvd~~~~~~i~~~~l~~~E~~~s~~~~~l~~~~~~~~~~lvVGT~~~~~~~~~~~~Gri~v~~i~~~~~~~~~L~~ 81 (321) T PF03178_consen 2 SCIRLVDPTTFEVIDSFELDENEHVTSMCSVQLDSSSTGKRPYLVVGTAFNYGEDPESRSGRIYVFEIIESPSTNRKLKL 81 (321) T ss_dssp -EEEEEETTTSSEEEEEEEETTEE----EEEEE------SS-EEEE--EE--TTSSS-----EEEEEE---------EEE T ss_pred CEEEEEECCCCEEEEEEECCCCCCEEEEEEEEECCCCCCCCCEEEEEECCCCCCCCCCCCEEEEEEEEECCCCCCCEEEE T ss_conf 19999928972599999989995455999999646666656799999422467887767569999999424445318999 Q ss_pred EEECCCCCCEEEEEECCCCCEEEEEECCCCEEEEECCCCC-EEEEEECCCCCCCCCCCCCEEEECCCCCEEEEEECCCEE Q ss_conf 7723777631487873787589997058979999856895-889997067767766740089996998999997469889 Q T0558 88 NIAAPAGCEMQTARILPDGNALVAWCGHPSTILEVNMKGE-VLSKTEFETGIERPHAQFRQINKNKKGNYLVPLFATSEV 166 (294) Q Consensus 88 ~~~~~~~~~v~~~~~~~dg~~l~~~s~~~~~~~~~~~~G~-~~~~~~~~~~~~~~~~~~~~~~~s~dG~~i~~g~~d~~i 166 (294) .........+.++... .| +++++.+ ..+.++.....+ ++..-..... . .+.++.. -+++|++|.....+ T Consensus 82 i~~~~~~g~v~al~~~-~g-~ll~~~g-~~l~v~~l~~~~~l~~~~~~~~~----~-~i~~l~~--~~~~I~vgD~~~Sv 151 (321) T PF03178_consen 82 IHETEVKGPVYALCSF-NG-RLLAAVG-NKLRVYDLGNKKELLRKAFYDLP----F-YITSLSV--FGNYILVGDAMKSV 151 (321) T ss_dssp EEEE-----EEEEEE------EEEEES-SEEEEEEEETTSEEEEEEEE-BS----S---SS-EE--E--EEEE--TSB-E T ss_pred EEEEECCCCCEEEHHC-CC-EEEEEEC-CEEEEEEECCCCCEEEEEEECCC----C-EEEEEEE--ECCEEEEEEHHHCE T ss_conf 9999728766993262-99-8999989-99999993695311020334587----2-8999998--89999999925387 Q ss_pred EEE--ECCCCEEEEE--E-CCCEEEEEEEECCCCEEEECCCCCEEEEEECC Q ss_conf 999--3788588985--1-69734898873489689973589879999878 Q T0558 167 REI--APNGQLLNSV--K-LSGTPFSSAFLDNGDCLVACGDAHCFVQLNLE 212 (294) Q Consensus 167 ~~~--d~~g~~~~~~--~-~~~~~~~~~~~~~g~~~v~~~~~~~i~~~d~~ 212 (294) .++ +.+++.+-.. . ..-.+.++.+..+++.++++-..+++.++... 
T Consensus 152 ~~~~y~~~~~~l~~iarD~~~~~vta~~~l~d~~~ii~~D~~gnl~~l~~~ 202 (321) T PF03178_consen 152 SFLRYDEENNKLILIARDFQPRWVTAVEFLVDEDTIIVSDKFGNLFVLRYN 202 (321) T ss_dssp EEEEE-S----EEEEEEESS-B-EEEEEEE-ETTEEE-EETTSEE--EEE- T ss_pred EEEEEECCCCEEEEEEECCCCCCEEEEEEECCCCEEEEECCCCCEEEEEEC T ss_conf 999994589759999834987546999873278779998699949999738 No 29 >PF03022 MRJP: Major royal jelly protein; InterPro: IPR003534 The major royal jelly proteins (MRJPs) comprise 12.5% of the mass, and 82-90% of the protein content , of honeybee (Apis mellifera) royal jelly. Royal jelly is a substance secreted by the cephalic glands of nurse bees and it is used to trigger development of a queen bee from a bee larva. The biological function of the MRJPs is unknown, but they are believed to play a major role in nutrition due to their high essential amino acid content . Two royal jelly proteins, MRJP3 and MRJP5, contain a tandem repeat that results from a high genetic variablility. This polymorphism may be useful for genotyping individual bees .; PDB: 2qe8_B. Probab=96.55 E-value=0.019 Score=31.35 Aligned_cols=60 Identities=15% Similarity=0.199 Sum_probs=43.9 Q ss_pred CEEEEEECCCCEEEEEEECCCC-----CCCCEEEEEC-C-----CCEEEEE--CCEEEEEECCCCCEEEEECC Q ss_conf 8699998878829999944998-----7311647907-9-----8399841--88289975267703677237 Q T0558 33 NKIAIINKDTKEIVWEYPLEKG-----WECNSVAATK-A-----GEILFSY--SKGAKMITRDGRELWNIAAP 92 (294) Q Consensus 33 ~~i~~~d~~tg~~~w~~~~~~~-----~~~~~~~~~p-d-----G~~l~s~--~~~v~~~~~~~~~~~~~~~~ 92 (294) -+|.+||+.|++++.++.++.. .....+.+.. 
+ +..++++ ...+.++|...+..|+...+ T Consensus 34 pKLv~~Dl~t~~~v~~~~lp~~v~~~~S~l~dl~VD~~~~~~~~~~aYItD~~~~glIV~Dl~~g~swRv~~~ 106 (287) T PF03022_consen 34 PKLVAFDLKTNKVVRRYDLPADVAPPDSYLNDLRVDVRDGDCDEGFAYITDSGGPGLIVYDLATGRSWRVLHG 106 (287) T ss_dssp -EEEEEETTTTEEEEEEE--TTTS-TT----EEEEETTTT-----EEEEEE--SGEEEEEES----EEEE--- T ss_pred CEEEEEECCCCCEEEEEECCCHHCCCCCCEEEEEEECCCCCCEEEEEEEECCCCCCEEEEECCCCCEEEEECC T ss_conf 7699999999978999989920135776325489968899732789999779888699998889958999178 No 30 >PF00400 WD40: WD domain, G-beta repeat; InterPro: IPR001680 WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD-containing proteins have 4 to 16 repeating units, all of which are thought to form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. The underlying common function of all WD-repeat proteins is coordinating multi-protein complex assemblies, where the repeating units serve as a rigid scaffold for protein interactions. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase , .; PDB: 3fm0_A 2hes_X 1vyh_S 1p22_A 3dm0_A 3frx_C 1nex_D 2ovr_B 2ovp_B 2ovq_B .... Probab=96.14 E-value=0.005 Score=35.51 Aligned_cols=38 Identities=8% Similarity=0.133 Sum_probs=27.9 Q ss_pred CEEEEEECCCCCCCCCCCCCEEEECCCCCEEEEEECCCEEEEEE Q ss_conf 58899970677677667400899969989999974698899993 Q T0558 127 EVLSKTEFETGIERPHAQFRQINKNKKGNYLVPLFATSEVREIA 170 (294) Q Consensus 127 ~~~~~~~~~~~~~~~~~~~~~~~~s~dG~~i~~g~~d~~i~~~d 170 (294) +++..+. 
.|...++++.++|++++|++++.|+.|++|| T Consensus 2 ~~~~~~~------~h~~~i~~v~~~~~~~~l~s~~~d~~i~iwd 39 (39) T PF00400_consen 2 KCVQTFK------GHTSPITSVAFSPDGKFLASGSDDGTIRIWD 39 (39) T ss_dssp EEEEEEE------SSSSSEEEEEEESSSSEEEEEETTSEEEEEE T ss_pred EEEEEEC------CCCCCEEEEEECCCCEEEEEECCCCEEEEEC T ss_conf 1999987------8688409967324121126466899899989 No 31 >PF04053 Coatomer_WDAD: Coatomer WD associated region ; InterPro: IPR006692 Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer . While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins . For example, the coatomer COP1 (coat protein complex 1) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi . Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and for budding from Golgi membranes . Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerize, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated to ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. 
Coatomer complexes are hetero-oligomers composed of at least an alpha, beta, beta', gamma, delta, epsilon and zeta subunits. This entry represents the WD-associated region found in coatomer subunits alpha, beta and beta' subunits. The alpha-subunit (RET1P) of the coatomer complex in Saccharomyces cerevisiae (Baker's yeast), participates in membrane transport between the endoplasmic reticulum and Golgi apparatus. The protein contains six WD-40 repeat motifs in its N-terminal region . More information about these proteins can be found at Protein of the Month: Clathrin .; GO: 0005198 structural molecule activity, 0005515 protein binding, 0008565 protein transporter activity, 0006461 protein complex assembly, 0006886 intracellular protein transport, 0016192 vesicle-mediated transport, 0030117 membrane coat Probab=95.78 E-value=0.043 Score=28.69 Aligned_cols=172 Identities=10% Similarity=-0.001 Sum_probs=95.8 Q ss_pred HHHCCCCCCCCEEEEECCCCEEEEEECCCCEEEEEEECCCCCCCCEEEEECCCCEEEEE-CCEEEE-EECCCCCEEEEEC Q ss_conf 32001478885899974798699998878829999944998731164790798399841-882899-7526770367723 Q T0558 14 APFAQGSSPQHLLVGGSGWNKIAIINKDTKEIVWEYPLEKGWECNSVAATKAGEILFSY-SKGAKM-ITRDGRELWNIAA 91 (294) Q Consensus 14 ~~~~~~s~~~~~l~~gs~~~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~pdG~~l~s~-~~~v~~-~~~~~~~~~~~~~ 91 (294) +...+.++.++.++++ +++...++........ ....+.+.+|.+++++.+.. +..+.+ .+......++... 
T Consensus 35 p~~ls~nPngr~v~V~-g~gey~iyt~~~~r~k------~~g~g~~~vw~~~n~yAv~~~~~~I~i~knf~~~~~k~i~~ 107 (444) T PF04053_consen 35 PQSLSHNPNGRFVLVC-GDGEYIIYTALAWRNK------AFGSGLSFVWSSRNRYAVLEKNSTIKIFKNFKEETTKSIKL 107 (444) T ss_pred CEEEEECCCCCEEEEE-CCCEEEEEECCCCCCC------CCCCCEEEEEECCCCEEEEECCCEEEEEECCCCCCCEEECC T ss_conf 7148999988889996-6998999982466655------56763179997771189997697599997578752307858 Q ss_pred CC-CCCEEEEEECCCCCEEEEEECCCCEEEEECCCCCEEEEEECCCCCCCCCCCCCEEEECCCCCEEEEEECCCEEEEEE Q ss_conf 77-76314878737875899970589799998568958899970677677667400899969989999974698899993 Q T0558 92 PA-GCEMQTARILPDGNALVAWCGHPSTILEVNMKGEVLSKTEFETGIERPHAQFRQINKNKKGNYLVPLFATSEVREIA 170 (294) Q Consensus 92 ~~-~~~v~~~~~~~dg~~l~~~s~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~s~dG~~i~~g~~d~~i~~~d 170 (294) +. ...++. |..| ....+..+.+++..+++.+.+.... ++..+.++++|++++..+ +.++++++ T Consensus 108 ~~~~~~If~------G~LL-~~~~~~~i~~yDw~~~~~i~~I~v~--------~vk~V~Ws~~g~~Val~~-~~~i~il~ 171 (444) T PF04053_consen 108 PFSVEKIFG------GNLL-GVKSSDFICFYDWEQGKLIRRIDVS--------PVKNVIWSDDGELVALLT-KDSIYILK 171 (444) T ss_pred CCCCCEEEC------CEEE-EEECCCCEEEEEHHHCCEEEEEECC--------CCCEEEEECCCCEEEEEE-CCEEEEEE T ss_conf 977172773------6199-9977986899884785477799648--------985799978866799987-77699998 Q ss_pred CCCC-------------EEEEEEC-CCEEEEEEEECCCCEEEECCCCCEEEEEE Q ss_conf 7885-------------8898516-97348988734896899735898799998 Q T0558 171 PNGQ-------------LLNSVKL-SGTPFSSAFLDNGDCLVACGDAHCFVQLN 210 (294) Q Consensus 171 ~~g~-------------~~~~~~~-~~~~~~~~~~~~g~~~v~~~~~~~i~~~d 210 (294) .+-. ....... 
...+.+.....+ .++.++..+--++++ T Consensus 172 ~~~~~~~~~~~~~g~e~~f~~~~E~~~~IkSg~W~~d--~fiYtT~~hLkYlv~ 223 (444) T PF04053_consen 172 YNLEAVEAEDDEEGIEDAFEVVHEINERIKSGAWDGD--VFIYTTSNHLKYLVN 223 (444) T ss_pred ECCCCCCCCCCCCCCHHHEEEEEEEEEEEEEEEEECC--EEEEECCCEEEEEEC T ss_conf 2153344566555703303888774200556799767--999973632799948 No 32 >PF01011 PQQ: PQQ enzyme repeat family.; InterPro: IPR002372 Pyrrolo-quinoline quinone (PQQ) is a redox coenzyme, which serves as a cofactor for a number of enzymes (quinoproteins) and particularly for some bacterial dehydrogenases , . A number of bacterial quinoproteins belong to this family. Enzymes in this group have repeats of a beta propeller.; GO: 0006118 electron transport; PDB: 1kb0_A 1flg_B 1g72_C 4aah_C 2ad6_C 2ad7_A 2ad8_C 2d0v_I 1h4j_C 1h4i_A .... Probab=95.78 E-value=0.0073 Score=34.32 Aligned_cols=32 Identities=16% Similarity=0.243 Sum_probs=26.2 Q ss_pred EEEEECCCCEEEEEECCCCEEEEEEECCCCCC Q ss_conf 89997479869999887882999994499873 Q T0558 25 LLVGGSGWNKIAIINKDTKEIVWEYPLEKGWE 56 (294) Q Consensus 25 ~l~~gs~~~~i~~~d~~tg~~~w~~~~~~~~~ 56 (294) .++.++.++.++++|.+|||++|+++.+.... T Consensus 2 ~v~~~~~~g~l~AlDa~TG~~~W~~~~~~~~~ 33 (38) T PF01011_consen 2 RVYVGSADGHLYALDAETGKILWRFDTGGPVW 33 (38) T ss_dssp EEEEESTTTEEEEEESTE-EEEEEEESSTTTC T ss_pred EEEEECCCCEEEEEECCCCCEEEEEECCCCCC T ss_conf 89996889999999889998999778899975 No 33 >PF02897 Peptidase_S9_N: Prolyl oligopeptidase, N-terminal beta-propeller domain; InterPro: IPR004106 Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes . They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence . 
Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Peptidases are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown.
The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. This entry represents the beta-propeller domain found at the N-terminus of prolyl oligopeptidase, including acylamino-acid-releasing enzyme (also known as acylaminoacyl peptidase), which belong to the MEROPS peptidase family S9 (clan SC), subfamily S9A. The prolyl oligopeptidase family consists of a number of evolutionarily related peptidases whose catalytic activity seems to be provided by a charge relay system similar to that of the trypsin family of serine proteases, but which evolved by independent convergent evolution. The N-terminal domain of prolyl oligopeptidases forms an unusual 7-bladed beta-propeller consisting of seven 4-stranded beta-sheet motifs. Prolyl oligopeptidase is a large cytosolic enzyme involved in the maturation and degradation of peptide hormones and neuropeptides, which relate to the induction of amnesia. The enzyme contains a peptidase domain, where its catalytic triad (Ser554, His680, Asp641) is covered by the central tunnel of the N-terminal beta-propeller domain. In this way, large structured peptides are excluded from the active site, thereby protecting larger peptides and proteins from proteolysis in the cytosol. The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. Mammalian acylaminoacyl peptidase is an exopeptidase that is a member of the same prolyl oligopeptidase family of serine peptidases. This enzyme removes acylated amino acid residues from the N terminus of oligopeptides.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 2bkl_B 1yr2_A 3ddu_A 1qfm_A 1e5t_A 1vz3_A 1qfs_A 1vz2_A 1h2z_A 1o6f_A ....
Probab=95.42 E-value=0.057 Score=27.79 Aligned_cols=54 Identities=11% Similarity=0.102 Sum_probs=21.0 Q ss_pred EEEEECCCCEEEEE---C----CEEEEEECCCCCEEEEECCCCCCEEEEEECCCCCEEEEEE Q ss_conf 64790798399841---8----8289975267703677237776314878737875899970 Q T0558 59 SVAATKAGEILFSY---S----KGAKMITRDGRELWNIAAPAGCEMQTARILPDGNALVAWC 113 (294) Q Consensus 59 ~~~~~pdG~~l~s~---~----~~v~~~~~~~~~~~~~~~~~~~~v~~~~~~~dg~~l~~~s 113 (294) ...++|||++++.+ + ..+++.+..+++........ .....+.|.+|++.++... T Consensus 129 ~~~~Spdg~~~a~~~~~~G~e~~~l~i~dl~tg~~~~d~i~~-~~~~~i~W~~d~~~~~Y~~ 189 (415) T PF02897_consen 129 GFSVSPDGKRLAYSLDPGGSEWYTLRIFDLETGEFLPDVIEG-PKFSSIAWSPDGKGFFYTR 189 (415) T ss_dssp EEEE-----EEEEEEEETT-SEEEEEEEEC---EEECEEEEE-EESEEEEE-TT--EEEEEE T ss_pred EEEECCCCCEEEEEECCCCCCEEEEEEEECCCCCCCCCCCCC-CCCCEEEEEECCCEEEEEE T ss_conf 367999889899998799982599999999999898864336-6432589980898999998 No 34 >PF00400 WD40: WD domain, G-beta repeat; InterPro: IPR001680 WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD-containing proteins have 4 to 16 repeating units, all of which are thought to form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. The underlying common function of all WD-repeat proteins is coordinating multi-protein complex assemblies, where the repeating units serve as a rigid scaffold for protein interactions. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase , .; PDB: 3fm0_A 2hes_X 1vyh_S 1p22_A 3dm0_A 3frx_C 1nex_D 2ovr_B 2ovp_B 2ovq_B .... 
Probab=94.33 E-value=0.039 Score=28.99 Aligned_cols=36 Identities=11% Similarity=0.049 Sum_probs=21.1 Q ss_pred EEEECCCCCCEEEEEECCCCCEEEEEECCCCEEEEE Q ss_conf 677237776314878737875899970589799998 Q T0558 87 WNIAAPAGCEMQTARILPDGNALVAWCGHPSTILEV 122 (294) Q Consensus 87 ~~~~~~~~~~v~~~~~~~dg~~l~~~s~~~~~~~~~ 122 (294) .+...+|...+.++.++|+++++++++.|+.+.+|+ T Consensus 4 ~~~~~~h~~~i~~v~~~~~~~~l~s~~~d~~i~iwd 39 (39) T PF00400_consen 4 VQTFKGHTSPITSVAFSPDGKFLASGSDDGTIRIWD 39 (39) T ss_dssp EEEEESSSSSEEEEEEESSSSEEEEEETTSEEEEEE T ss_pred EEEECCCCCCEEEEEECCCCEEEEEECCCCEEEEEC T ss_conf 999878688409967324121126466899899989 No 35 >PF05694 SBP56: 56kDa selenium binding protein (SBP56); InterPro: IPR008826 This family consists of several eukaryotic selenium binding proteins as well as three sequences from archaea. The exact function of this protein is unknown although it is thought that SBP56 participates in late stages of intra-Golgi protein transport . The Lotus japonicus homologue of SBP56, LjSBP is thought to have more than one physiological role and can be implicated in controlling the oxidation/reduction status of target proteins in vesicular Golgi transport .; GO: 0008430 selenium binding; PDB: 2ece_A. Probab=91.53 E-value=0.26 Score=22.99 Aligned_cols=145 Identities=14% Similarity=0.068 Sum_probs=73.1 Q ss_pred CCEEEEECCCCEEEEEECCC----CEEEEEEEC----C--CCCCCCEEEEECCCCEEEEE--------CCEEEEEECCCC Q ss_conf 85899974798699998878----829999944----9--98731164790798399841--------882899752677 Q T0558 23 QHLLVGGSGWNKIAIINKDT----KEIVWEYPL----E--KGWECNSVAATKAGEILFSY--------SKGAKMITRDGR 84 (294) Q Consensus 23 ~~~l~~gs~~~~i~~~d~~t----g~~~w~~~~----~--~~~~~~~~~~~pdG~~l~s~--------~~~v~~~~~~~~ 84 (294) ..+++.|-..++|+++|..+ -++.....- . .-...+.+.-.|+|.+++|. 
-+++.++|.++- T Consensus 88 r~Li~PgL~SsrIyiiDt~~dPr~P~l~KvIep~ev~~k~g~s~PHT~hclp~G~i~IS~LGd~~G~g~Gg~~llD~~tF 167 (461) T PF05694_consen 88 RYLIVPGLRSSRIYIIDTKTDPRKPKLHKVIEPEEVFKKTGYSRPHTVHCLPDGNIMISALGDADGNGPGGFVLLDHDTF 167 (461) T ss_dssp -EEEE----S--EEEEE--S-TTS-EEEEEE-HHHHHHH----SEEEEE------EEEEE------------EEE-TTT- T ss_pred CEEEEECCCCCCEEEEECCCCCCCCCEEEECCHHHHHHCCCCCCCCHHCCCCCCCEEEEECCCCCCCCCCCEEEECCCCC T ss_conf 63983024678389997989988985265427889754059777740014798648999445788988874899728752 Q ss_pred CE---EEEECCCCCCEEEEEECCCCCEEEEEE--------------------CCCCEEEEECCCCCEEEEEECCCCCCCC Q ss_conf 03---677237776314878737875899970--------------------5897999985689588999706776776 Q T0558 85 EL---WNIAAPAGCEMQTARILPDGNALVAWC--------------------GHPSTILEVNMKGEVLSKTEFETGIERP 141 (294) Q Consensus 85 ~~---~~~~~~~~~~v~~~~~~~dg~~l~~~s--------------------~~~~~~~~~~~~G~~~~~~~~~~~~~~~ 141 (294) ++ |....+........-+.|.-+++++.. ....+.+|+..+.+.++++.+..... . T Consensus 168 ev~G~We~~~~~~~~gYDfw~qpr~nvMiSSeWg~P~~~~~G~~~~~l~~~~YG~~lh~Wd~~~r~~~QtiDLg~~g~-~ 246 (461) T PF05694_consen 168 EVKGRWEKDRGPQPFGYDFWYQPRHNVMISSEWGAPNMFEDGFNPEDLEAGKYGHRLHVWDWSTRKHIQTIDLGEEGQ-M 246 (461) T ss_dssp -B-----SB---------EEEETTTTEEEE-B---HHHH-----TTTHHHH----EEEEEETTTTEEEEEEE----EE-E T ss_pred EECCCCCCCCCCCCCCCCEEEECCCCEEEEECCCCHHHHHCCCCHHHHHCCCCCCEEEEEECCCCCEEEEEECCCCCC-C T ss_conf 365534668888778987776168885997045784675357886786446566368998888884447873486885-1 Q ss_pred CCCCCEEEECCCC--CE-EEEEECCCEEEEEEC Q ss_conf 6740089996998--99-999746988999937 Q T0558 142 HAQFRQINKNKKG--NY-LVPLFATSEVREIAP 171 (294) Q Consensus 142 ~~~~~~~~~s~dG--~~-i~~g~~d~~i~~~d~ 171 (294) + --+.|..|- .+ ++.+.-.+.|..|-. 
T Consensus 247 p---LEvRflHdP~~~~gFvg~aLsssIw~~~~ 276 (461) T PF05694_consen 247 P---LEVRFLHDPDKNYGFVGCALSSSIWRWYK 276 (461) T ss_dssp E---EEE---SSTT-----EEEE--EEEEEEEE T ss_pred E---EEEEECCCCCCCCCEEEEECCEEEEEEEE T ss_conf 4---88872689976631785641105999997 No 36 >PF10168 Nup88: Nuclear pore component Probab=89.18 E-value=0.4 Score=21.61 Aligned_cols=107 Identities=12% Similarity=0.082 Sum_probs=61.7 Q ss_pred CEEEEECCCCEEEEEECCCCEEEEEEEC----------------------CCCCCCCEEEEECCCCEEEE-ECCEEEEEE Q ss_conf 5899974798699998878829999944----------------------99873116479079839984-188289975 Q T0558 24 HLLVGGSGWNKIAIINKDTKEIVWEYPL----------------------EKGWECNSVAATKAGEILFS-YSKGAKMIT 80 (294) Q Consensus 24 ~~l~~gs~~~~i~~~d~~tg~~~w~~~~----------------------~~~~~~~~~~~~pdG~~l~s-~~~~v~~~~ 80 (294) +-|+.+. |+.+++||....-+. .... ....++..+.++|+|..++. |..++.+.. T Consensus 34 ~nL~~~~-d~~L~~W~~~e~~l~-~~n~r~~~~~~~~~~~~~~q~Ll~s~~~~feV~~I~vs~tG~~lAL~G~~gv~Il~ 111 (717) T PF10168_consen 34 RNLLDCK-DGDLFAWDSSESCLL-VVNLRDSESEATKPAKVKYQTLLPSNPPLFEVDRISVSPTGSLLALAGPRGVCILE 111 (717) T ss_pred CCEEEEE-CCEEEEEECCCCEEE-EEECCCCCCCCCCCCCCCEEEEECCCCCCCEEEEEEECCCCCEEEEECCCCEEEEE T ss_conf 3117975-898999946566899-98703456555776667516882478988478899988988779997488469999 Q ss_pred CC-----------CC---CEEEE-------ECCCCCCEEEEEECCC---CCEEEEEECCCCEEEEECCCCCEEEEE Q ss_conf 26-----------77---03677-------2377763148787378---758999705897999985689588999 Q T0558 81 RD-----------GR---ELWNI-------AAPAGCEMQTARILPD---GNALVAWCGHPSTILEVNMKGEVLSKT 132 (294) Q Consensus 81 ~~-----------~~---~~~~~-------~~~~~~~v~~~~~~~d---g~~l~~~s~~~~~~~~~~~~G~~~~~~ 132 (294) .. ++ .+..+ .......+..+.|+|. +..|++-..|+.+..++..+..-.|+. 
T Consensus 112 LPrr~g~~g~~e~g~~~i~Crt~~v~~~lf~~~~~l~v~Qv~WHP~s~~ds~LvVLtsdn~iR~Yd~~~~~~~~qV 187 (717) T PF10168_consen 112 LPRRWGKDGYFEDGKDEINCRTYPVDSRLFTSNPSLEVLQVRWHPWSPSDSHLVVLTSDNTIREYDLSKPRHPWQV 187 (717) T ss_pred ECCCCCCCCCCCCCCCCEEEEEEECHHHHHCCCCCCEEEEEEECCCCCCCCEEEEEECCCEEEEEECCCCCCCCEE T ss_conf 0543487653146886113677971286615799725898875578889974999936977998714887767265 No 37 >PF12234 Rav1p_C: RAVE protein 1 C terminal Probab=88.58 E-value=0.44 Score=21.33 Aligned_cols=46 Identities=15% Similarity=0.107 Sum_probs=18.5 Q ss_pred EEEEEECCCCCEEEEE-CCCCCCEEEEEE--CCCCCEEEEEECCCCEEE Q ss_conf 2899752677036772-377763148787--378758999705897999 Q T0558 75 GAKMITRDGRELWNIA-APAGCEMQTARI--LPDGNALVAWCGHPSTIL 120 (294) Q Consensus 75 ~v~~~~~~~~~~~~~~-~~~~~~v~~~~~--~~dg~~l~~~s~~~~~~~ 120 (294) ...+||..+..+.... -.....+..+.| +|+++.+++.+....+.+ T Consensus 52 ~LtIwd~~~~~lE~~e~f~~~~~I~dLDWtst~~~qsiLaVGf~~~V~L 100 (630) T PF12234_consen 52 RLTIWDTRGSVLEYEESFSEDDPIRDLDWTSTPDGQSILAVGFPHHVLL 100 (630) T ss_pred EEEEEECCCCEEEEHHHCCCCCCEECCCCCCCCCCCEEEEEECCCEEEE T ss_conf 7999973751121165505788532352332799877999972757899 No 38 >PF07569 Hira: TUP1-like enhancer of split; InterPro: IPR011494 The Hira proteins are found in a range of eukaryotes and are implicated in the assembly of repressive chromatin. 
These proteins also contain IPR001680 from INTERPRO.; GO: 0030528 transcription regulator activity, 0045449 regulation of transcription, 0005634 nucleus Probab=84.02 E-value=0.73 Score=19.70 Aligned_cols=26 Identities=12% Similarity=-0.070 Sum_probs=12.4 Q ss_pred CCCCCEEEEEECCCCEEEEECCCCCE Q ss_conf 37875899970589799998568958 Q T0558 103 LPDGNALVAWCGHPSTILEVNMKGEV 128 (294) Q Consensus 103 ~~dg~~l~~~s~~~~~~~~~~~~G~~ 128 (294) ..++.++++.+.++..++|+..++++ T Consensus 19 ~~~~~~Ll~lT~~G~l~vWnl~~~k~ 44 (220) T PF07569_consen 19 ECNGSYLLALTSSGLLYVWNLKTKKA 44 (220) T ss_pred EECCCEEEEEECCCEEEEEECCCCEE T ss_conf 96897899992687799998888813 No 39 >PF11768 DUF3312: Protein of unknown function (DUF3312) Probab=83.25 E-value=0.78 Score=19.48 Aligned_cols=70 Identities=7% Similarity=0.036 Sum_probs=39.1 Q ss_pred CCEEEECCCCCEEEEEECCCEEEEEECCCCEEEEEECCCEEEEEEEECCCCEEEECCCCCEEEEEECCCC Q ss_conf 0089996998999997469889999378858898516973489887348968997358987999987898 Q T0558 145 FRQINKNKKGNYLVPLFATSEVREIAPNGQLLNSVKLSGTPFSSAFLDNGDCLVACGDAHCFVQLNLESN 214 (294) Q Consensus 145 ~~~~~~s~dG~~i~~g~~d~~i~~~d~~g~~~~~~~~~~~~~~~~~~~~g~~~v~~~~~~~i~~~d~~~g 214 (294) +.+-+++|+-.-++-|+.|+++.+||..-+...-.+..-.|.-+..+|+|..++.++..+.+-.+|..=+ T Consensus 261 v~~~a~np~edKL~LGc~D~slvLyD~~r~vT~l~qa~~~P~~i~WHp~~ai~~Van~~gelQ~FDiALs 330 (544) T PF11768_consen 261 VCCCARNPDEDKLILGCEDGSLVLYDEHRGVTLLTQAAVPPTLIAWHPDGAIFLVANERGELQCFDIALS 330 (544) T ss_pred EEEECCCCCHHHEEEECCCCCEEEEECCCCCEEEEECCCCCCCEEECCCCCEEEEECCCCEEEEEEEEHH T ss_conf 0351589515643761367868988546660464204447653257789858999668743898752412 No 40 >PF10647 Gmad1: Lipoprotein LpqB beta-propeller domain Probab=76.32 E-value=1.3 Score=17.95 Aligned_cols=144 Identities=11% Similarity=0.070 Sum_probs=75.8 Q ss_pred CCCCEEEEECCCCEEEE-E--CCEEEEEECCC-CCEEEEECCCCCCEEEEEECCCCCEEEEEECCCCEEEE-ECCCCCEE Q ss_conf 73116479079839984-1--88289975267-70367723777631487873787589997058979999-85689588 Q T0558 55 
WECNSVAATKAGEILFS-Y--SKGAKMITRDG-RELWNIAAPAGCEMQTARILPDGNALVAWCGHPSTILE-VNMKGEVL 129 (294) Q Consensus 55 ~~~~~~~~~pdG~~l~s-~--~~~v~~~~~~~-~~~~~~~~~~~~~v~~~~~~~dg~~l~~~s~~~~~~~~-~~~~G~~~ 129 (294) ..+.++++++||..++. . ++...+|.... +.......+ ......+|.+++....+...+....+. +..+|+.. T Consensus 24 ~~~~s~avS~~g~~~A~v~~~~~~~~L~v~~~g~~~~~~~~g--~~lt~PS~~~~~~~W~v~~~~~~~~~~~~~~~g~~~ 101 (253) T PF10647_consen 24 YTVTSAAVSRDGQRVAAVSEPDGRQSLYVGPPGGPARQVLTG--GSLTRPSWDRDGWVWTVDDGDDVVRVIRDDADGTGS 101 (253) T ss_pred CCCCCEEECCCCCEEEEEEECCCCCEEEEECCCCCCEEEECC--CCCCCCCCCCCCCEEEEECCCCCEEEEEECCCCCCE T ss_conf 653213887899759999954898589996289853051047--742464076899889996599725888744788610 Q ss_pred EEEECCCCCCCCCCCCCEEEECCCCCEEEEEE---CCCEEEEEE----CCC-CEEEEE------ECCCEEEEEEEECCCC Q ss_conf 99970677677667400899969989999974---698899993----788-588985------1697348988734896 Q T0558 130 SKTEFETGIERPHAQFRQINKNKKGNYLVPLF---ATSEVREIA----PNG-QLLNSV------KLSGTPFSSAFLDNGD 195 (294) Q Consensus 130 ~~~~~~~~~~~~~~~~~~~~~s~dG~~i~~g~---~d~~i~~~d----~~g-~~~~~~------~~~~~~~~~~~~~~g~ 195 (294) .. ........ ..+..+..++||..++.-. ..+.+++-- ..| ...... .....+.++....++. T Consensus 102 ~~-~v~~~~~~--~~I~~lrvS~DG~RvAvv~~~~~~~~v~va~V~r~~~g~~~~l~~~~~~~~~~~~~v~~l~W~~~~~ 178 (253) T PF10647_consen 102 PV-EVDWPALR--GRITSLRVSPDGVRVAVVVERGGGGQVYVAGVVRDGDGVPRRLTAPRRLGPGLGEDVTSLAWSSDST 178 (253) T ss_pred EE-EECCCCCC--CCEEEEEECCCCCEEEEEEEECCCCEEEEEEEEECCCCCCCEECCCEEECCCCCCCCEEEEEECCCE T ss_conf 17-80255567--5057898579985899999878998699999970899865153133662356567402556726987 Q ss_pred EEEECCCC Q ss_conf 89973589 Q T0558 196 CLVACGDA 203 (294) Q Consensus 196 ~~v~~~~~ 203 (294) +++.+... T Consensus 179 L~V~~~~~ 186 (253) T PF10647_consen 179 LVVLTSSP 186 (253) T ss_pred EEEEECCC T ss_conf 99994589 No 41 >PF05787 DUF839: Bacterial protein of unknown function (DUF839); InterPro: IPR008557 This family consists of bacterial proteins of unknown function. 
Probab=75.45 E-value=1.3 Score=17.79 Aligned_cols=13 Identities=31% Similarity=0.342 Sum_probs=6.6 Q ss_pred CEEECCCCCEEEE Q ss_conf 3489389989998 Q T0558 235 QLFPLQNGGLYIC 247 (294) Q Consensus 235 ~~~~~~~G~i~i~ 247 (294) ++.+.++|++.|. T Consensus 440 Nl~fd~~G~LwI~ 452 (524) T PF05787_consen 440 NLAFDPDGNLWIQ 452 (524) T ss_pred CCEECCCCCEEEE T ss_conf 5149799999999 No 42 >PF02333 Phytase: Phytase; InterPro: IPR003431 Phytase (3.1.3.8 from EC) (phytate 3-phosphatase) is a secreted enzyme which hydrolyses phytate to release inorganic phosphate. This family appears to represent a novel enzyme that shows phytase activity () and has been shown to have a six- bladed propeller folding architecture ().; PDB: 2poo_A 1cvm_A 1qlg_A 1poo_A 1h6l_A. Probab=75.44 E-value=1.3 Score=17.79 Aligned_cols=191 Identities=15% Similarity=0.198 Sum_probs=97.1 Q ss_pred CCCCCC-EEEEECCCCEEEEEECCCCEEEEEEECCCCCCCCEEEE----ECCCCE---EEEEC-----CEEEEEECCC-- Q ss_conf 478885-89997479869999887882999994499873116479----079839---98418-----8289975267-- Q T0558 19 GSSPQH-LLVGGSGWNKIAIINKDTKEIVWEYPLEKGWECNSVAA----TKAGEI---LFSYS-----KGAKMITRDG-- 83 (294) Q Consensus 19 ~s~~~~-~l~~gs~~~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~----~pdG~~---l~s~~-----~~v~~~~~~~-- 83 (294) +..|.+ +++...+.+-+.++|+ .|+.+..++.+. .+.+.+ .-+|+. .+.++ ..+.+|..+. 
T Consensus 63 ~~~p~~S~ii~T~K~~GL~vydl-~G~~~~~~~~g~---~nnVDvrygf~L~g~~vDlAvaS~R~~g~ntL~~f~id~~~ 138 (381) T PF02333_consen 63 PTDPSKSLIIGTDKKGGLYVYDL-DGKQLQFLPAGR---LNNVDVRYGFPLGGRTVDLAVASDRSDGRNTLRLFRIDPDN 138 (381) T ss_dssp SS-GGG-EEEEEETTE-EEEEE-----EEEE--------EEEEEEEEEEE----EEEEEEEEE------EEEEEEE---- T ss_pred CCCCCCCEEEEECCCCCEEEECC-CCCEEEECCCCC---CCEEEEECCEECCCEEEEEEEEECCCCCCCEEEEEEECCCC T ss_conf 99956056999717788699878-986967626787---31556561601288068699993466778648999646866 Q ss_pred CCEEEEECC------CCCCEEEEEE--CC-CCCEEEEEECCCCEE----EEECCCC----CEEEEEECCCCCCCCCCCCC Q ss_conf 703677237------7763148787--37-875899970589799----9985689----58899970677677667400 Q T0558 84 RELWNIAAP------AGCEMQTARI--LP-DGNALVAWCGHPSTI----LEVNMKG----EVLSKTEFETGIERPHAQFR 146 (294) Q Consensus 84 ~~~~~~~~~------~~~~v~~~~~--~~-dg~~l~~~s~~~~~~----~~~~~~G----~~~~~~~~~~~~~~~~~~~~ 146 (294) +.+.....+ ....+..++. +| +|.+.+..++..+.+ +.+..+| +++++|...++.++ T Consensus 139 g~L~~i~~~~~p~~t~~~e~YGlclY~s~~~g~~yafv~~k~G~~~Qy~L~~~~~G~i~~~lVR~f~~~sQ~EG------ 212 (381) T PF02333_consen 139 GELTDIGDPNQPIATDLREPYGLCLYRSPSTGKLYAFVNRKDGEVEQYELSDDGNGTITATLVREFKLGSQPEG------ 212 (381) T ss_dssp --EEE-C-SSS-EE-SSSS----EEEE-S----EEEEE------EEEEEEEE-----EE-EEEEEE--SS-B-------- T ss_pred CCCEECCCCCCCCCCCCCCCEEEEEEECCCCCCEEEEEECCCCEEEEEEEECCCCCCCCCEEEEEECCCCCCEE------ T ss_conf 65104467775567886643489986638889789999888745999999748898583578799537986028------ Q ss_pred EEEECCCCCEEEEEECCCEEEEEECC--C----CEEEEEE---CCCEEEEEEEE--CCC-CEEEECCCC-CEEEEEECCC Q ss_conf 89996998999997469889999378--8----5889851---69734898873--489-689973589-8799998789 Q T0558 147 QINKNKKGNYLVPLFATSEVREIAPN--G----QLLNSVK---LSGTPFSSAFL--DNG-DCLVACGDA-HCFVQLNLES 213 (294) Q Consensus 147 ~~~~s~dG~~i~~g~~d~~i~~~d~~--g----~~~~~~~---~~~~~~~~~~~--~~g-~~~v~~~~~-~~i~~~d~~~ 213 (294) +.....-.+|+.+..+.-|..|+.+ + ..+.... ....+-.+.+. .++ .++++++.+ +...+++.+. 
T Consensus 213 -CVVDde~g~LYvgEEd~GIW~y~AeP~~~~~~~~i~~~~g~~l~aDvEGlaiy~~~~g~gYLivSsQG~~sf~VY~r~~ 291 (381) T PF02333_consen 213 -CVVDDETGYLYVGEEDVGIWRYDAEPEGGSTGTLIDAADGDGLVADVEGLAIYYGGDGTGYLIVSSQGNNSFAVYDREG 291 (381) T ss_dssp --EE-S----EEEEETTTEEEEE-SS--------EEEE-------S-B---EE---------EEEEE----EEEEE---- T ss_pred -EEEECCCCCEEEECCCCEEEEEECCCCCCCCCEEEEECCCCCCCCCCCCEEEEECCCCCEEEEEECCCCCEEEEEECCC T ss_conf -9995677978884377669998768889987448762146776457630388863799718999768897489986678 Q ss_pred C-EEEEEE Q ss_conf 8-599984 Q T0558 214 N-RIVRRV 220 (294) Q Consensus 214 g-~~~~~~ 220 (294) . +.+-++ T Consensus 292 ~~~~~g~F 299 (381) T PF02333_consen 292 PNAYVGSF 299 (381) T ss_dssp ---EEEEE T ss_pred CCCCCCEE T ss_conf 97653438 No 43 >PF03088 Str_synth: Strictosidine synthase; InterPro: IPR004141 Strictosidine synthase is a key enzyme in alkaloid biosynthesis. It catalyses the condensation of tryptamine with secologanin to form strictosidine.; GO: 0016844 strictosidine synthase activity, 0009058 biosynthetic process; PDB: 2vaq_A 2fp8_B 2v91_A 2fpc_B 2fp9_A 2fpb_B. Probab=74.30 E-value=1.4 Score=17.59 Aligned_cols=35 Identities=14% Similarity=0.091 Sum_probs=13.9 Q ss_pred CCCEEEEEECCCCEEEEEE-CCCEEEEEEEECCCCE Q ss_conf 6988999937885889851-6973489887348968 Q T0558 162 ATSEVREIAPNGQLLNSVK-LSGTPFSSAFLDNGDC 196 (294) Q Consensus 162 ~d~~i~~~d~~g~~~~~~~-~~~~~~~~~~~~~g~~ 196 (294) ..|++..||++++...... .-..|..+++++|+.. T Consensus 35 ~tGRLl~YDp~t~~~~VLl~gL~fpNGvals~D~~~ 70 (89) T PF03088_consen 35 PTGRLLKYDPRTKETTVLLDGLYFPNGVALSKDGSF 70 (89) T ss_dssp ----EEEEETTTTEEEEEE-S-S-----EE-TTSSE T ss_pred CCCCEEEEECCCCEEEEEECCCCCCCEEEECCCCCE T ss_conf 963189981899918996008865764799899999 No 44 >PF08194 DIM: DIM protein; InterPro: IPR013172 Drosophila immune-induced molecules (DIMs) are short proteins induced during the immune response of Drosophila. This family includes DIMs 1 to 4 that have masses below 5 kDa . 
Probab=71.10 E-value=0.91 Score=19.00 Aligned_cols=30 Identities=33% Similarity=0.458 Sum_probs=19.1 Q ss_pred CHHHHHHHHHHHHH-HHCCCCCCCCEEEEEC Q ss_conf 90345568887663-2001478885899974 Q T0558 1 MKNFILLVALFLVA-PFAQGSSPQHLLVGGS 30 (294) Q Consensus 1 ~~~~~~~~~~~~~~-~~~~~s~~~~~l~~gs 30 (294) ||++.+..++++.+ ..++..+|+++++.|. T Consensus 1 MK~lsla~~l~lLal~~a~~~~pG~ViING~ 31 (36) T PF08194_consen 1 MKCLSLAFALGLLALAAAVPATPGNVIINGD 31 (36) T ss_pred CCEEHHHHHHHHHHHHHCCCCCCCEEEECCE T ss_conf 9240999999999998506689986899856 No 45 >PF00780 CNH: CNH domain; InterPro: IPR001180 Based on sequence similarities a domain of homology has been identified in the following proteins : Citron and Citron kinase. These two proteins interact with the GTP-bound forms of the small GTPases Rho and Rac but not with Cdc42. Myotonic dystrophy kinase-related Cdc42-binding kinase (MRCKalpha). This serine/threonine kinase interacts with the GTP-bound form of the small GTPase Cdc42 and to a lesser extent with that of Rac. NCK Interacting Kinase (NIK), a serine/threonine protein kinase. ROM-1 and ROM-2, from yeast. These proteins are GDP/GTP exchange proteins (GEPs) for the small GTP binding protein Rho1. This domain, called the citron homology domain, is often found after cysteine rich and pleckstrin homology (PH) domains at the C-terminal end of the proteins . It acts as a regulatory domain and could be involved in macromolecular interactions , .; GO: 0005083 small GTPase regulator activity Probab=66.31 E-value=2.1 Score=16.40 Aligned_cols=227 Identities=11% Similarity=0.039 Sum_probs=106.2 Q ss_pred CCCCCCEEEEECCCCEEEEEECCCCEEEEEEECCCCCCCCEEEEECCCCEE-EEECCEEEEEECCC---CC-EEE----- Q ss_conf 478885899974798699998878829999944998731164790798399-84188289975267---70-367----- Q T0558 19 GSSPQHLLVGGSGWNKIAIINKDTKEIVWEYPLEKGWECNSVAATKAGEIL-FSYSKGAKMITRDG---RE-LWN----- 88 (294) Q Consensus 19 ~s~~~~~l~~gs~~~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~pdG~~l-~s~~~~v~~~~~~~---~~-~~~----- 88 (294) +...++.|+.|+.+| |++++. 
.....|..- .+-..+..+...++-+.+ +-.|+.++.++... .. .+. T Consensus 3 ~~~~~~~llvGt~~G-l~~~~~-~~~~~~~~~-~~~~~V~qi~vi~~~~~llvLsd~~L~~~~L~~l~~~~~~~~~~~~~ 79 (275) T PF00780_consen 3 PDTGGQKLLVGTEEG-LYLYDI-SDPNRPRKI-LKLFSVTQIEVIEELNLLLVLSDKQLYVYDLSSLEPRSLSSPLSKSK 79 (275) T ss_pred CCCCCCEEEEEECCC-EEEEEE-CCCCCCCEE-CCCCCEEEEEEECCCCEEEEEECCCEEEEECHHHCCCCCCCCCCCCC T ss_conf 244899999998999-899995-365662103-35442899999400699999919928999908954433565432100 Q ss_pred -----EECCCCCCEEEEE--ECCCCCEEEEEECCCCEEEE--ECCCC---CEEEEEECCCCCCCCCCCCCEEEECCCCCE Q ss_conf -----7237776314878--73787589997058979999--85689---588999706776776674008999699899 Q T0558 89 -----IAAPAGCEMQTAR--ILPDGNALVAWCGHPSTILE--VNMKG---EVLSKTEFETGIERPHAQFRQINKNKKGNY 156 (294) Q Consensus 89 -----~~~~~~~~v~~~~--~~~dg~~l~~~s~~~~~~~~--~~~~G---~~~~~~~~~~~~~~~~~~~~~~~~s~dG~~ 156 (294) ........+...+ -..++...++.+...++.+. ....+ +...++..+ +.+..+.+. +.. T Consensus 80 ~~~~~~~i~~~k~~~~f~~~~~~~~~~~L~va~kk~i~i~~~~~~~~~~~~~~kei~~~-------~~~~~i~~~--~~~ 150 (275) T PF00780_consen 80 SDNQPQKIPKTKGCSFFAVVGGHSGSRYLCVAVKKKILIYEWKDPLNKFVKLFKEISLP-------DPPKSIAWF--NNS 150 (275) T ss_pred CCCCCCCCCCCCCEEEEEECCCCCCCEEEEEEECCEEEEEEEECCCCCCCEEEEEEEEC-------CCCEEEEEE--CCE T ss_conf 12222123455661799842676883699999999999999957877411041499947-------722798998--999 Q ss_pred EEEEECCCEEEEEECC-CCEEEEEECC------------CEEEEEEEECCCCEEEECCCCCEEEEEECCCCEEEEEECCC Q ss_conf 9997469889999378-8588985169------------73489887348968997358987999987898599984488 Q T0558 157 LVPLFATSEVREIAPN-GQLLNSVKLS------------GTPFSSAFLDNGDCLVACGDAHCFVQLNLESNRIVRRVNAN 223 (294) Q Consensus 157 i~~g~~d~~i~~~d~~-g~~~~~~~~~------------~~~~~~~~~~~g~~~v~~~~~~~i~~~d~~~g~~~~~~~~~ 223 (294) +..|.. ....++|.. +...+-.... ..|..+...+++.+++.. +....++|. 
.|+..+ T Consensus 151 icvg~~-~~f~~v~l~~~~~~~l~~~~~~~~~~~~~~~~~~p~~~~~l~~~e~Ll~~--~~~g~fvn~-~G~~~r----- 221 (275) T PF00780_consen 151 ICVGTS-KGFEIVDLDTGSPSSLLDLDDSSFSFFSPSESLKPVGIFQLSDDEFLLCY--DNFGVFVNS-NGKPSR----- 221 (275) T ss_pred EEEEEC-CEEEEEECCCCCCCEEECCCCCCCCHHCCCCCCCCCEEEEECCCCEEEEE--CCEEEEECC-CCCCCC----- T ss_conf 999989-84899989989840430667754210013557897379998999799996--744999918-998146----- Q ss_pred CCCCEEECCCCCEEECCCCCEEEEECCCCEEECCCCCCCEEEEECCCCCEEEEEECCCCEEEEE Q ss_conf 7641141134534893899899980467714302477756999908998999983588537888 Q T0558 224 DIEGVQLFFVAQLFPLQNGGLYICNWQGHDREAGKGKHPQLVEIDSEGKVVWQLNDKVKFGMIS 287 (294) Q Consensus 224 ~~~~~~~~~~~~~~~~~~G~i~i~~~~~~~~~~~~~~~~~~~~i~~~G~~vW~~~~~~~~~~i~ 287 (294) ...+.|. . ......-..+.++.+.+++-.||++.....++.+. T Consensus 222 ---~~~i~w~-----------------~-~p~~v~~~~pyl~~~~~~~ieV~~i~~~~lvQ~i~ 264 (275) T PF00780_consen 222 ---KSTIKWS-----------------G-PPQSVAYSYPYLLAFHPNGIEVRSIETGELVQTIN 264 (275) T ss_pred ---CEEEECC-----------------C-HHCEEEEECCEEEEEECCCEEEEECCCCCEEEEEE T ss_conf ---6189888-----------------8-10299998999999969939999987993899998 No 46 >PF00879 Defensin_propep: Defensin propeptide The pattern for this Prosite entry doesn't match the propeptide.; InterPro: IPR002366 Defensins are 2-6 kDa, cationic, microbicidal peptides active against many Gram-negative and Gram-positive bacteria, fungi, and enveloped viruses , containing three pairs of intramolecular disulphide bonds . On the basis of their size and pattern of disulphide bonding, mammalian defensins are classified into alpha, beta and theta categories. Alpha-defensins, which have been identified in humans, monkeys and several rodent species, are particularly abundant in neutrophils, certain macrophage populations and Paneth cells of the small intestine. Every mammalian species explored thus far has beta-defensins. In cows, as many as 13 beta-defensins exist in neutrophils. 
However, in other species, beta-defensins are more often produced by epithelial cells lining various organs (e.g. the epidermis, bronchial tree and genitourinary tract). Theta-defensins are cyclic and have so far only been identified in primate phagocytes. Defensins are produced constitutively and/or in response to microbial products or proinflammatory cytokines. Some defensins are also called corticostatins (CS) because they inhibit corticotropin-stimulated corticosteroid production. The mechanism(s) by which microorganisms are killed and/or inactivated by defensins is not understood completely. However, it is generally believed that killing is a consequence of disruption of the microbial membrane. The polar topology of defensins, with spatially separated charged and hydrophobic regions, allows them to insert themselves into the phospholipid membranes so that their hydrophobic regions are buried within the lipid membrane interior and their charged (mostly cationic) regions interact with anionic phospholipid head groups and water. Subsequently, some defensins can aggregate to form 'channel-like' pores; others might bind to and cover the microbial membrane in a 'carpet-like' manner. The net outcome is the disruption of membrane integrity and function, which ultimately leads to the lysis of microorganisms. Some defensins are synthesized as propeptides which may be relevant to this process - in neutrophils only the mature peptides have been identified but in Paneth cells, the propeptide is stored in vesicles and appears to be cleaved by trypsin on activation. ; GO: 0006952 defense response Probab=64.47 E-value=1.4 Score=17.75 Aligned_cols=24 Identities=42% Similarity=0.548 Sum_probs=21.7 Q ss_pred CHHHHHHHHHHHHHHHCCCCCCCC Q ss_conf 903455688876632001478885 Q T0558 1 MKNFILLVALFLVAPFAQGSSPQH 24 (294) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~s~~~~ 24 (294) ||-+.|+.+++|+|..+|..+.+. 
T Consensus 1 MrTl~LLaAlLllALqaQAe~~q~ 24 (52) T PF00879_consen 1 MRTLVLLAALLLLALQAQAEPLQE 24 (52) T ss_pred CCHHHHHHHHHHHHHHHHCCCCCC T ss_conf 905999999999999975267654 No 47 >PF09792 But2: Ubiquitin 3 binding protein But2 Probab=63.86 E-value=2.3 Score=16.08 Aligned_cols=83 Identities=20% Similarity=0.187 Sum_probs=34.3 Q ss_pred CHHHHHHHHHHHHHHHCCCCC---CCCEEEEECCCCEEEEEECCCCEEEEEEECCCCCCCCEEEEECCCCEEEEECCEEE Q ss_conf 903455688876632001478---88589997479869999887882999994499873116479079839984188289 Q T0558 1 MKNFILLVALFLVAPFAQGSS---PQHLLVGGSGWNKIAIINKDTKEIVWEYPLEKGWECNSVAATKAGEILFSYSKGAK 77 (294) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~s~---~~~~l~~gs~~~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~pdG~~l~s~~~~v~ 77 (294) ||||.-+.++-..|..+.... --++.++|...+.|.-+|. |+. .++++.......+..||.+- ++.+.. T Consensus 1 mk~~~~~~a~a~~a~al~~r~~~~~f~l~asGg~sg~i~~l~~--~~~----rv~g~~~~~~F~i~~dG~lt--d~~g~~ 72 (446) T PF09792_consen 1 MKYFASLAALAAGANALVKRDDNCCFHLTASGGDSGTIGQLDD--GQN----RVGGSLPNGTFCINSDGSLT--DSNGRG 72 (446) T ss_pred CCHHHHHHHHHHHCCHHHHCCCCCEEEEEECCCCCCCEEECCC--CCC----CCCCCCCCEEEEECCCCCEE--ECCCCE T ss_conf 9117889987641313211588756988843787764664158--764----32587786389982898578--689776 Q ss_pred EEECCCCCEEEEEC Q ss_conf 97526770367723 Q T0558 78 MITRDGRELWNIAA 91 (294) Q Consensus 78 ~~~~~~~~~~~~~~ 91 (294) .+...+..+.+... T Consensus 73 ~i~~~~t~qfq~D~ 86 (446) T PF09792_consen 73 CILTPGTTQFQCDA 86 (446) T ss_pred EEECCCCEEEEECC T ss_conf 99679965887368 No 48 >PF10566 Glyco_hydro_97: Glycoside hydrolase 97 ; PDB: 2jkp_A 2zq0_A 2jke_B 2d73_B 2jka_B. Probab=63.78 E-value=2.3 Score=16.07 Aligned_cols=51 Identities=20% Similarity=0.302 Sum_probs=27.8 Q ss_pred CHHHHH-HHHHHHHHHHCCCCCCCCEEEEECCCCEEEE-EECCCCEEEEEEECC Q ss_conf 903455-6888766320014788858999747986999-988788299999449 Q T0558 1 MKNFIL-LVALFLVAPFAQGSSPQHLLVGGSGWNKIAI-INKDTKEIVWEYPLE 52 (294) Q Consensus 1 ~~~~~~-~~~~~~~~~~~~~s~~~~~l~~gs~~~~i~~-~d~~tg~~~w~~~~~ 52 (294) ||...| ++++++.+..++..+ .+.....|-|++|.+ +....|++.++.... 
T Consensus 1 MKk~~i~~l~~~l~~~~~~~~~-~~~~~v~SPdG~l~v~v~~~~g~~~Y~v~~~ 53 (643) T PF10566_consen 1 MKKLIIILLALLLLLSASSSAA-AKNYTVSSPDGKLKVTVSLDDGQPTYSVSYN 53 (643) T ss_dssp --------------------------EEEE-TTSSEEEEEEE----EEEEEEET T ss_pred CCHHHHHHHHHHHHHHHHHHHC-CCCEEEECCCCCEEEEEEECCCCEEEEEEEC T ss_conf 9437999999999987411201-4733868979888999996899289999999 No 49 >PF01436 NHL: NHL repeat; InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found largely in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C-terminus alpha-amidation of biological peptides . In many it occurs in tandem arrays, for example in the ringfinger beta-box, coiled-coil (RBCC) eukaryotic growth regulators . The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller , . The NHL repeats are also found in serine/threonine protein kinase (STPK) in diverse range of pathogenic bacteria. These STPK are transmembrane receptors with a intracellular N-terminal kinase domain and extracellular C-terminal sensor domain. In the STPK, PknD, from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed b-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; PDB: 1rwl_A 1rwi_B 1q7f_A. 
Probab=58.95 E-value=2.8 Score=15.49 Aligned_cols=22 Identities=27% Similarity=0.312 Sum_probs=8.8 Q ss_pred EEEEEEECCCCEEEECCCCCEE Q ss_conf 4898873489689973589879 Q T0558 185 PFSSAFLDNGDCLVACGDAHCF 206 (294) Q Consensus 185 ~~~~~~~~~g~~~v~~~~~~~i 206 (294) |.+++..++|+.+++....++| T Consensus 4 P~giav~~~g~i~VaD~~n~rV 25 (28) T PF01436_consen 4 PHGIAVDPDGNIYVADSGNHRV 25 (28) T ss_dssp BEEEEE-TTSEEEEEETTTTEE T ss_pred CCEEEECCCCCEEEEECCCCEE T ss_conf 6599995999899998999999 No 50 >PF05777 Acp26Ab: Drosophila accessory gland-specific peptide 26Ab (Acp26Ab); InterPro: IPR008392 This family consists of accessory gland-specific 26Ab peptides or male accessory gland secretory protein 355B from different Drosophila species. Drosophila males, like males of most other insects, transfer a group of specific proteins (Acp26Ab and Acp26Aa in Drosophila) to the females during mating. These proteins are produced primarily in the accessory gland and are likely to influence the female's reproduction .; GO: 0007617 mating behavior, 0005576 extracellular region Probab=55.20 E-value=3.1 Score=15.17 Aligned_cols=28 Identities=14% Similarity=0.144 Sum_probs=19.0 Q ss_pred CHHHHHHHHHHHHHHHCCCCCCCCEEEEE Q ss_conf 90345568887663200147888589997 Q T0558 1 MKNFILLVALFLVAPFAQGSSPQHLLVGG 29 (294) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~s~~~~~l~~g 29 (294) |.||+++|||+-||..- .|.+.-++-+- T Consensus 1 mnyf~~lcif~cicl~~-~sdAaPyisVq 28 (90) T PF05777_consen 1 MNYFVPLCIFSCICLWQ-LSDAAPYISVQ 28 (90) T ss_pred CCEEEHHHHHHHHHHHH-HCCCCCEEEEE T ss_conf 96130259989889986-04678637886 No 51 >PF02261 Asp_decarbox: Aspartate decarboxylase; InterPro: IPR003190 Decarboxylation of aspartate is the major route of alanine production in bacteria, and is catalysed by the enzyme aspartate decarboxylase. The enzyme is translated as an inactive proenzyme of two chains, A and B. 
This family contains both chains of aspartate decarboxylase.; GO: 0004068 aspartate 1-decarboxylase activity, 0006523 alanine biosynthetic process; PDB: 1vc3_A 2eeo_B 2c45_C 1uhd_A 1uhe_A 1aw8_D 1pqf_B 1pyq_B 1pt0_A 1ppy_B .... Probab=54.90 E-value=2.1 Score=16.37 Aligned_cols=72 Identities=13% Similarity=0.170 Sum_probs=39.9 Q ss_pred CCEEEEEECCCCEEEEEECCCCCCCEEECCCCCEE---ECCCCCEEEEECCCCEEECCCCCCCEEEEECCCCCEE Q ss_conf 98799998789859998448876411411345348---9389989998046771430247775699990899899 Q T0558 203 AHCFVQLNLESNRIVRRVNANDIEGVQLFFVAQLF---PLQNGGLYICNWQGHDREAGKGKHPQLVEIDSEGKVV 274 (294) Q Consensus 203 ~~~i~~~d~~~g~~~~~~~~~~~~~~~~~~~~~~~---~~~~G~i~i~~~~~~~~~~~~~~~~~~~~i~~~G~~v 274 (294) ...+.++|.++|+-+.++-.....++.....++.. ..+.-.+.+..|...+........++++-+|++.++. T Consensus 41 ~E~V~I~Nv~NG~Rf~TYvI~g~~GSg~i~lNGAAAr~~~~GD~vII~ay~~~~~~e~~~~~P~vv~vd~~N~i~ 115 (116) T PF02261_consen 41 YEQVQIVNVNNGERFETYVIPGERGSGVICLNGAAARLVQPGDRVIIMAYAQMDEEEAKNHKPKVVFVDEKNRIK 115 (116) T ss_dssp TBEEEEEEST---EEEEEEEEE------EEEE--GGGTS----EEEEEEEEEEEHHHHHH---EEEEEETTSEEE T ss_pred CCEEEEEECCCCCEEEEEEEECCCCCCEEEECCHHHHCCCCCCEEEEEECCCCCHHHHHCCCCEEEEECCCCCCC T ss_conf 988999999999378999987368987798778898347999999999884579899833888699999999895 No 52 >PF01453 B_lectin: D-mannose binding lectin; InterPro: IPR001480 Members of this domain are plant lectins. Curculin is a sweet-tasting and taste-modifying protein from the fruits of Curculigo latifolia (Lumbah). The three mannose-binding sites are devoid of mannose-binding activity . Other members of this domain are mannose specific and have diverse functions. The lectin of the saffron crocus (Crocus sativus) (Saffron) specifically interacts with a yeast mannan and is a major corm protein specifically expressed in this organ . The actin-binding and vesicle-associated protein comitin exhibits a mannose-specific lectin activity and may have a role in cell motility. 
It binds to vesicle membranes via mannose residues and, by way of its interaction with actin, links these membranes to the cytoskeleton. ; GO: 0005529 sugar binding; PDB: 1npl_A 1niv_A 1msa_D 1jpc_A 1xd5_B 1xd6_A 1bwu_Q 1kj1_D 1b2p_A 1dlp_F .... Probab=53.19 E-value=3.4 Score=14.84 Aligned_cols=19 Identities=16% Similarity=0.108 Sum_probs=6.7 Q ss_pred ECCCEEEEEECCCCEEEEE Q ss_conf 4698899993788588985 Q T0558 161 FATSEVREIAPNGQLLNSV 179 (294) Q Consensus 161 ~~d~~i~~~d~~g~~~~~~ 179 (294) ..||.+.++|..+..+|.. T Consensus 26 ~~dGnLvL~~~~~~~vWss 44 (114) T PF01453_consen 26 QSDGNLVLYDGNGSVVWSS 44 (114) T ss_dssp ETTSEEEEEETT-EEEEE- T ss_pred CCCCEEEEECCCCCEEEEE T ss_conf 9898399987998899982 No 53 >PF06649 DUF1161: Protein of unknown function (DUF1161); InterPro: IPR010595 This family consists of several short, hypothetical bacterial proteins of unknown function. Probab=50.92 E-value=3 Score=15.19 Aligned_cols=17 Identities=59% Similarity=0.780 Sum_probs=13.1 Q ss_pred CHHHHHHHHHHHHHHHC Q ss_conf 90345568887663200 Q T0558 1 MKNFILLVALFLVAPFA 17 (294) Q Consensus 1 ~~~~~~~~~~~~~~~~~ 17 (294) ||.|+|.+++++.+..+ T Consensus 1 Mkk~~l~~~l~~la~~a 17 (75) T PF06649_consen 1 MKKFLLAVALLLLAAPA 17 (75) T ss_pred CCHHHHHHHHHHHHHHH T ss_conf 93569999999985645 No 54 >PF05399 EVI2A: Ectropic viral integration site 2A protein (EVI2A); InterPro: IPR008608 This family contains several mammalian ectropic viral integration site 2A (EVI2A) proteins. The function of this protein is unknown although it is thought to be a membrane protein and may function as an oncogene in retrovirus induced myeloid tumours , .; GO: 0016021 integral to membrane Probab=50.66 E-value=2.4 Score=15.93 Aligned_cols=11 Identities=9% Similarity=-0.299 Sum_probs=4.7 Q ss_pred CCCEEEEEECC Q ss_conf 79869999887 Q T0558 31 GWNKIAIINKD 41 (294) Q Consensus 31 ~~~~i~~~d~~ 41 (294) ..+.+.+|+.. 
T Consensus 32 w~~s~~~~~~~ 42 (227) T PF05399_consen 32 WANSNTAWDSI 42 (227) T ss_pred CCCCCEEECCC T ss_conf 55563420320 No 55 >PF10956 DUF2756: Protein of unknown function (DUF2756) Probab=50.23 E-value=2.6 Score=15.71 Aligned_cols=19 Identities=47% Similarity=0.614 Sum_probs=15.4 Q ss_pred CHHHHHHHHHHHHHHHCCC Q ss_conf 9034556888766320014 Q T0558 1 MKNFILLVALFLVAPFAQG 19 (294) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~ 19 (294) ||++.|+++++-.+.||++ T Consensus 1 mkr~l~l~allpf~~~Aqp 19 (104) T PF10956_consen 1 MKRLLLLTALLPFAALAQP 19 (104) T ss_pred CHHHHHHHHHHHHHHHHHH T ss_conf 9158999998688998616 No 56 >PF05264 CfAFP: Choristoneura fumiferana antifreeze protein (CfAFP); InterPro: IPR007928 Antifreeze proteins (AFPs) are a class of proteins that are able to bind to and inhibit the growth of macromolecular ice, thereby permitting an organism to survive subzero temperatures by decreasing the probability of ice nucleation in their bodies . These proteins have been characterised from a variety of organisms, including fish, plants, bacteria, fungi and arthropods. This entry represents insect AFPs of the type found in spruce budworm, Choristoneura fumiferana. The structure of these AFPs consists of a left-handed beta-helix with 15 residues per coil . The beta-helices of insect AFPs present a highly rigid array of threonine residues and bound water molecules that can effectively mimic the ice lattice. As such, beta-helical AFPs provide a more effective coverage of the ice surface compared to the alpha-helical fish AFPs. A second insect antifreeze from Tenebrio molitor (IPR003460 from INTERPRO) also consists of beta-helices, however in these proteins the helices form a right-handed twist; these proteins show no sequence homology to the current entry, but may act by a similar mechanism. The beta-helix motif may be used as an AFP structural motif in non-homologous proteins from other (non-fish) organisms as well. ; PDB: 1l0s_B 1eww_A 1n4i_A 1m8n_B 1z2f_A. 
Probab=47.53 E-value=1.6 Score=17.25 Aligned_cols=11 Identities=55% Similarity=0.839 Sum_probs=9.4 Q ss_pred CHHHHHHHHHH Q ss_conf 90345568887 Q T0558 1 MKNFILLVALF 11 (294) Q Consensus 1 ~~~~~~~~~~~ 11 (294) ||+||||.++- T Consensus 1 mk~~~lim~la 11 (137) T PF05264_consen 1 mk~~~lim~la 11 (137) T ss_dssp ----------- T ss_pred CCEEHHHHHHH T ss_conf 90201206321 No 57 >PF10614 Tafi-CsgF: Curli production assembly/transport component CsgF Probab=43.68 E-value=4.7 Score=13.82 Aligned_cols=26 Identities=38% Similarity=0.390 Sum_probs=13.3 Q ss_pred CHHHHHHHHHHHHHHHCCCCCCCCEEE Q ss_conf 903455688876632001478885899 Q T0558 1 MKNFILLVALFLVAPFAQGSSPQHLLV 27 (294) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~s~~~~~l~ 27 (294) ||+++++..+++++. +..+.+..++. T Consensus 1 mk~~~l~a~l~~~~~-a~~a~A~~LVY 26 (142) T PF10614_consen 1 MKYRGLLAVLLLLAA-AGPAQAQELVY 26 (142) T ss_pred CCCHHHHHHHHHHHH-CCCCCHHHEEE T ss_conf 916399999999982-55000422255 No 58 >PF10279 Latarcin: Latarcin precursor; PDB: 2g9p_A 2pco_A. Probab=43.04 E-value=2.1 Score=16.43 Aligned_cols=12 Identities=42% Similarity=0.681 Sum_probs=10.1 Q ss_pred CHHHHHHHHHHH Q ss_conf 903455688876 Q T0558 1 MKNFILLVALFL 12 (294) Q Consensus 1 ~~~~~~~~~~~~ 12 (294) ||||+++.++.+ T Consensus 1 MKyfvVaLaL~v 12 (105) T PF10279_consen 1 MKYFVVALALAV 12 (105) T ss_dssp ------------ T ss_pred CCCHHHHHHHHH T ss_conf 902599999999 No 59 >PF03646 FlaG: FlaG protein; InterPro: IPR005186 Although these proteins are known to be important for flagella, their exact function is unknown.; PDB: 2hc5_A.
Probab=39.44 E-value=5.4 Score=13.37 Aligned_cols=16 Identities=31% Similarity=0.752 Sum_probs=6.0 Q ss_pred EEEEECCCCEEEEEEE Q ss_conf 9999887882999994 Q T0558 35 IAIINKDTKEIVWEYP 50 (294) Q Consensus 35 i~~~d~~tg~~~w~~~ 50 (294) |.++|.+||+++.+.| T Consensus 70 VkViD~~T~eVIRqIP 85 (107) T PF03646_consen 70 VKVIDKETGEVIRQIP 85 (107) T ss_dssp EEEEETTT-SEEEEE- T ss_pred EEEEECCCCCEEEECC T ss_conf 9999899883424488 No 60 >PF01939 DUF91: Protein of unknown function DUF91; InterPro: IPR002793 The function of these prokaryotic proteins is unknown. Computational analysis suggests that they may form a restriction endonuclease-like fold, similar to that found in a variety of endonucleases and DNA repair enzymes .; PDB: 2vld_B. Probab=39.10 E-value=5.4 Score=13.34 Aligned_cols=14 Identities=14% Similarity=0.304 Sum_probs=5.5 Q ss_pred EEEECCCCEEEEEC Q ss_conf 47907983998418 Q T0558 60 VAATKAGEILFSYS 73 (294) Q Consensus 60 ~~~~pdG~~l~s~~ 73 (294) +.+.|||.+++=++ T Consensus 30 livKpDGsvlVH~~ 43 (228) T PF01939_consen 30 LIVKPDGSVLVHSD 43 (228) T ss_dssp EEE-----EEEE-S T ss_pred EEEECCCCEEEECC T ss_conf 99906981899478 No 61 >PF07676 PD40: WD40-like Beta Propeller Repeat; InterPro: IPR011659 This region appears to be related to the IPR001680 from INTERPRO repeat. This model is likely to miss copies within a sequence.; PDB: 2w8b_D 1crz_A 2hqs_B 1c5k_A 2ivz_D 1k32_A 1n6e_C 1n6f_A 1n6d_F 2ojh_A .... 
Probab=36.84 E-value=5.9 Score=13.09 Aligned_cols=15 Identities=27% Similarity=0.355 Sum_probs=5.5 Q ss_pred EEEECCCCCEEEEEE Q ss_conf 878737875899970 Q T0558 99 TARILPDGNALVAWC 113 (294) Q Consensus 99 ~~~~~~dg~~l~~~s 113 (294) ...|+|||++++..+ T Consensus 13 ~p~~SpDG~~i~f~s 27 (39) T PF07676_consen 13 ~p~~SpDG~~i~f~s 27 (39) T ss_pred CEEECCCCCEEEEEE T ss_conf 879867999999984 No 62 >PF07202 Tcp10_C: T-complex protein 10 C-terminus; InterPro: IPR009852 Proteins in this entry include T-complex 10, involved in spermatogenesis in mice, and centromere protein J, which not only inhibits microtubule nucleation from the centrosome, but also depolymerizes taxol-stabilized microtubules. These proteins share an approximately 180 residue C-terminal region which contains unusual G repeats. Probab=34.08 E-value=6.5 Score=12.79 Aligned_cols=98 Identities=12% Similarity=0.206 Sum_probs=45.3 Q ss_pred CCCCCEEEEEECCCEEEEEECCCCEEEEEECCCEEEEEEEECCCCEEEECCCCCEEEEEECCCCEEEEEECCCCCCCEEE Q ss_conf 69989999974698899993788588985169734898873489689973589879999878985999844887641141 Q T0558 151 NKKGNYLVPLFATSEVREIAPNGQLLNSVKLSGTPFSSAFLDNGDCLVACGDAHCFVQLNLESNRIVRRVNANDIEGVQL 230 (294) Q Consensus 151 s~dG~~i~~g~~d~~i~~~d~~g~~~~~~~~~~~~~~~~~~~~g~~~v~~~~~~~i~~~d~~~g~~~~~~~~~~~~~~~~ 230 (294) -|||.-.+ ..-||+++...++|...+. .|||........+. ..+...+|+..-...
T Consensus 80 ~pdG~KeI-~FPDGt~k~i~~dG~ee~~------------~pDGt~~~v~~ng~--k~I~~pNG~~~ih~~--------- 135 (179) T PF07202_consen 80 YPDGSKEI-LFPDGTIKVIHPDGEEETI------------FPDGTVVTVNPNGD--KTIEFPNGQQEIHTA--------- 135 (179) T ss_pred ECCCCEEE-EECCCCEEEEECCCCEEEE------------ECCCEEEEEECCCC--EEEECCCCCEEEEEC--------- T ss_conf 58998999-9799749999189958999------------18963999915873--899858985999846--------- Q ss_pred CCCCCEEECCCCCEEEEECCCCEEECCCCCCCEEEEECCCCCEEEE Q ss_conf 1345348938998999804677143024777569999089989999 Q T0558 231 FFVAQLFPLQNGGLYICNWQGHDREAGKGKHPQLVEIDSEGKVVWQ 276 (294) Q Consensus 231 ~~~~~~~~~~~G~i~i~~~~~~~~~~~~~~~~~~~~i~~~G~~vW~ 276 (294) .......|||++...--+|..... ..++++-.-|++|++|-+ T Consensus 136 --~~~~~~yPdGt~k~~~~~G~q~~~--y~~gr~~~kd~~g~~~~~ 177 (179) T PF07202_consen 136 --DYKRREYPDGTVKTVYPDGRQETR--YSNGRVRVKDKDGNVIMD 177 (179) T ss_pred --CCEEEECCCCCEEEEECCCCEEEE--ECCCEEEEECCCCCEEEE T ss_conf --738998899829999659968987--258679997478879840 No 63 >PF04202 Mfp-3: Foot protein 3; InterPro: IPR007328 Mytilus foot protein-3 (Mfp-3) is a highly polymorphic protein family located in the byssal adhesive plaques of blue mussels. Probab=32.43 E-value=6.9 Score=12.61 Aligned_cols=23 Identities=48% Similarity=0.677 Sum_probs=14.4 Q ss_pred CHHHHH--HHHHHHHHHHCCCCCCC Q ss_conf 903455--68887663200147888 Q T0558 1 MKNFIL--LVALFLVAPFAQGSSPQ 23 (294) Q Consensus 1 ~~~~~~--~~~~~~~~~~~~~s~~~ 23 (294) |++|-+ +++++||..|+.-|-+. T Consensus 1 Mn~~Sv~VLvaLVLiGsFAVqSDA~ 25 (71) T PF04202_consen 1 MNNFSVSVLVALVLIGSFAVQSDAG 25 (71) T ss_pred CCCCHHHHHHHHHHHHHHEEECCCC T ss_conf 9740146999999862311421553 No 64 >PF01731 Arylesterase: Arylesterase; InterPro: IPR002640 The serum paraoxonases/arylesterases are enzymes that catalyse the hydrolysis of the toxic metabolites of a variety of organophosphorus insecticides. 
The enzymes hydrolyse a broad spectrum of organophosphate substrates, including paraoxon and a number of aromatic carboxylic acid esters (e.g., phenyl acetate), and hence confer resistance to organophosphate toxicity. Mammals have 3 distinct paraoxonase types, termed PON1-3. In mice and humans, the PON genes are found on the same chromosome in close proximity. PON activity has been found in a variety of tissues, with highest levels in liver and serum - the source of serum PON is thought to be the liver. Unlike mammals, fish and avian species lack paraoxonase activity. Human and rabbit PONs appear to have two distinct Ca2+ binding sites, one required for stability and one required for catalytic activity. The Ca2+ dependency of PONs suggests a mechanism of hydrolysis where Ca2+ acts as the electrophilic catalyst, like that proposed for phospholipase A2. The paraoxonase enzymes, PON1 and PON3, are high density lipoprotein (HDL)-associated proteins capable of preventing oxidative modification of low density lipoproteins (LDL). Although PON2 has oxidative properties, the enzyme does not associate with HDL. Within a given species, PON1, PON2 and PON3 share ~60% amino acid sequence identity, whereas between mammalian species particular PONs (1, 2 or 3) share 79-90% identity at the amino acid level. Human PON1 and PON3 share numerous conserved phosphorylation and N-glycosylation sites; however, it is not known whether the PON proteins are modified at these sites, or whether modification at these sites is required for activity in vivo. This family consists of arylesterases (also known as serum paraoxonases) 3.1.1.2 from EC. These enzymes hydrolyse organophosphorus esters such as paraoxon and are found in the liver and blood. They confer resistance to organophosphate toxicity. Human arylesterase (PON1) P27169 from SWISSPROT is associated with HDL and may protect against LDL oxidation.; GO: 0004064 arylesterase activity; PDB: 1v04_A.
Probab=31.87 E-value=7 Score=12.54 Aligned_cols=28 Identities=18% Similarity=0.199 Sum_probs=17.2 Q ss_pred CCEEEECCCCCEEEEEEC-CCEEEEEECC Q ss_conf 008999699899999746-9889999378 Q T0558 145 FRQINKNKKGNYLVPLFA-TSEVREIAPN 172 (294) Q Consensus 145 ~~~~~~s~dG~~i~~g~~-d~~i~~~d~~ 172 (294) .+.+..+||++++++++. ...|+++.++ T Consensus 56 aNGI~~s~d~k~vyVa~~~~~~v~v~~~~ 84 (86) T PF01731_consen 56 ANGINISPDGKYVYVASSMAHSVHVYKRH 84 (86) T ss_dssp B---EE-----EEEEEETTTTEEEEEEE- T ss_pred CCCEEECCCCCEEEEECCCCCEEEEEEEC T ss_conf 77658889988999941654308999971 No 65 >PF00993 MHC_II_alpha: Class II histocompatibility antigen, alpha domain; InterPro: IPR001003 Major Histocompatibility Complex (MHC) glycoproteins are heterodimeric cell surface receptors that function to present antigen peptide fragments to T cells responsible for cell-mediated immune responses. MHC molecules can be subdivided into two groups on the basis of structure and function: class I molecules present intracellular antigen peptide fragments (~10 amino acids) on the surface of the host cells to cytotoxic T cells; class II molecules present exogenously derived antigenic peptides (~15 amino acids) to helper T cells. MHC class I and II molecules are assembled and loaded with their peptide ligands via different mechanisms. However, both present peptide fragments rather than entire proteins to T cells, and are required to mount an immune response. Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. 
MHC class II receptors display antigens for recognition by helper T cells (which stimulate development of B cell clones) and inflammatory T cells (which cause the release of lymphokines that attract other cells to the site of infection). MHC class II molecules are composed of two membrane-spanning chains, alpha and beta, of similar size. Both chains consist of two globular domains (N- and C-terminal), and a transmembrane segment to anchor them to the membrane. A groove in the structure acts as the peptide-binding site. This entry represents the N-terminal domain (also called alpha-1 domain) of the alpha chain.; GO: 0006955 immune response, 0019882 antigen processing and presentation, 0016020 membrane, 0042613 MHC class II protein complex; PDB: 1r5v_A 1fng_C 1i3r_E 1fne_A 1r5w_A 1kt2_C 1ktd_A 1ieb_C 1iea_C 1zgl_A .... Probab=31.34 E-value=7.1 Score=12.48 Aligned_cols=20 Identities=15% Similarity=0.120 Sum_probs=8.7 Q ss_pred CCCCEEEEEECCCCEEEEEE Q ss_conf 58987999987898599984 Q T0558 201 GDAHCFVQLNLESNRIVRRV 220 (294) Q Consensus 201 ~~~~~i~~~d~~~g~~~~~~ 220 (294) -++..+.-.|...++.+|++ T Consensus 23 fDgeElfy~Df~kke~V~~l 42 (82) T PF00993_consen 23 fDgeElfy~Df~kke~V~~l 42 (82) T ss_dssp ETTEEEEEEETTTTEEEESS T ss_pred ECCCEEEEEECCCCCEEEEC T ss_conf 07766899864678068857 No 66 >PF06422 PDR_CDR: CDR ABC transporter; InterPro: IPR010929 ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energize diverse biological systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. 
ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyze ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarize the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. 
The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It seems to be required only for the structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis. Proteins known to belong to this family are classified in several functional subfamilies depending on the substrate used (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). In yeast, the PDR and CDR ABC transporters display extensive sequence homology, and confer resistance to several anti-fungal compounds by actively transporting their substrates out of the cell. These transporters have two homologous halves, each with an N-terminal intracellular hydrophilic region that contains an ATP-binding site, followed by a C-terminal membrane-associated region containing six transmembrane segments. 
This entry represents a domain of the PDR/CDR ABC transporter comprising extracellular loop 3, transmembrane segment 6 and a linker region.; GO: 0005524 ATP binding, 0042626 ATPase activity, coupled to transmembrane movement of substances, 0006810 transport, 0016021 integral to membrane Probab=30.73 E-value=5.3 Score=13.44 Aligned_cols=10 Identities=40% Similarity=0.810 Sum_probs=4.6 Q ss_pred HHHHHHHHHH Q ss_conf 0345568887 Q T0558 2 KNFILLVALF 11 (294) Q Consensus 2 ~~~~~~~~~~ 11 (294) |||.|+++++ T Consensus 50 RN~GIl~aF~ 59 (103) T PF06422_consen 50 RNFGILIAFI 59 (103) T ss_pred HHHHHHHHHH T ss_conf 3379999999 No 67 >PF07403 DUF1505: Protein of unknown function (DUF1505); InterPro: IPR009981 This family consists of several uncharacterised Caenorhabditis elegans proteins of around 115 residues in length. Members of this family contain 6 highly conserved cysteine residues. The function of this family is unknown. Probab=27.98 E-value=8.1 Score=12.09 Aligned_cols=24 Identities=33% Similarity=0.397 Sum_probs=14.7 Q ss_pred CHHHHHHHHHHHHHHHCCCCCCCC Q ss_conf 903455688876632001478885 Q T0558 1 MKNFILLVALFLVAPFAQGSSPQH 24 (294) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~s~~~~ 24 (294) |+-|++++.++.+.......++.. T Consensus 1 Mn~~~~tvl~lsv~iA~~~~~~S~ 24 (114) T PF07403_consen 1 MNFFILTVLFLSVTIAGVSGSPSS 24 (114) T ss_pred CCCCHHHHHHHHHHHHHCCCCCCC T ss_conf 973002057678777635677544 No 68 >PF10907 DUF2749: Protein of unknown function (DUF2749) Probab=26.86 E-value=8.4 Score=11.95 Aligned_cols=28 Identities=18% Similarity=0.211 Sum_probs=15.9 Q ss_pred CHHHHHHHHHHHHHHHCCCCCCCCEEEEECC Q ss_conf 9034556888766320014788858999747 Q T0558 1 MKNFILLVALFLVAPFAQGSSPQHLLVGGSG 31 (294) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~s~~~~~l~~gs~ 31 (294) |+.++||+-+++ .+.++.....|++.+. 
T Consensus 1 ms~~vlIal~va---vAa~a~~at~liV~p~ 28 (66) T PF10907_consen 1 MSPRVLIALLVA---VAAAAGAATWLIVQPR 28 (66) T ss_pred CCCCHHHHHHHH---HHHHCCCEEEEEECCC T ss_conf 970049999999---9862362489997787 No 69 >PF11714 Inhibitor_I53: Thrombin inhibitor Madanin Probab=26.81 E-value=8.5 Score=11.95 Aligned_cols=18 Identities=22% Similarity=0.464 Sum_probs=11.8 Q ss_pred CHHHHHHHHHHHHHHHCC Q ss_conf 903455688876632001 Q T0558 1 MKNFILLVALFLVAPFAQ 18 (294) Q Consensus 1 ~~~~~~~~~~~~~~~~~~ 18 (294) ||-|.|++...++..++- T Consensus 1 mkhfaililavvasavvm 18 (78) T PF11714_consen 1 MKHFAILILAVVASAVVM 18 (78) T ss_pred CCCEEHHHHHHHHHHHHH T ss_conf 951421489998878862 No 70 >PF03548 LolA: Outer membrane lipoprotein carrier protein LolA; InterPro: IPR004564 This protein, LolA, is known so far only in the gamma subdivision of the Proteobacteria. In Escherichia coli, lipoproteins are anchored to the periplasmic side of either the inner or outer membrane through N-terminal lipids, depending on the lipoprotein-sorting signal present at position 2. Five Lol proteins are involved in the sorting and outer membrane localization of lipoproteins. LolCDE, an ATP binding cassette (ABC) transporter in the inner membrane, releases outer membrane-directed lipoproteins from the inner membrane in an ATP-dependent manner, leading to the formation of a water-soluble complex between the lipoprotein and the molecular chaperone, LolA. The LolA-lipoprotein complex crosses the periplasm and then interacts with the outer membrane receptor LolB, which is essential for the anchoring of lipoproteins to the outer membrane. E. coli lipoproteins are anchored to the inner or outer membrane depending on the residue at position 2. Aspartate at this position makes lipoproteins specific to the inner membrane, whereas other residues cause the release of lipoproteins from the inner membrane.; GO: 0015031 protein transport, 0030288 outer membrane-bounded periplasmic space; PDB: 1ua8_A 2zpc_A 1iwl_A 2zpd_A. 
Probab=25.56 E-value=8.9 Score=11.79 Aligned_cols=41 Identities=17% Similarity=0.246 Sum_probs=22.8 Q ss_pred CCEEEEEECCCCEEEEEEECCCCCCCCEEEEECCCCEEEEECCEEEEEECCCCCE Q ss_conf 9869999887882999994499873116479079839984188289975267703 Q T0558 32 WNKIAIINKDTKEIVWEYPLEKGWECNSVAATKAGEILFSYSKGAKMITRDGREL 86 (294) Q Consensus 32 ~~~i~~~d~~tg~~~w~~~~~~~~~~~~~~~~pdG~~l~s~~~~v~~~~~~~~~~ 86 (294) .|++.+-. -+++.|++. .|+...+++.+..+.+++.+.+++ T Consensus 26 ~G~~~~~k--p~~~rw~~~------------~P~~~~iv~~g~~l~~y~~~~~qv 66 (165) T PF03548_consen 26 SGKFYFKK--PGKFRWEYE------------KPDEQTIVSDGKTLWIYDPDLKQV 66 (165) T ss_dssp ---EEEET--TTEEEEEE-------------SSS-EEEEE---EEEEEECCCTEE T ss_pred EEEEEEEC--CCEEEEEEC------------CCCCEEEEEECCEEEEEECCCCEE T ss_conf 99999978--991999996------------988579999799999991888788 No 71 >PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family of proteins includes several glycine rich proteins as well as two nodulins 16 and 24. The family also contains proteins that are induced in response to various stresses. Probab=25.22 E-value=9 Score=11.75 Aligned_cols=12 Identities=42% Similarity=0.587 Sum_probs=7.4 Q ss_pred HHHHHHHHHHHH Q ss_conf 034556888766 Q T0558 2 KNFILLVALFLV 13 (294) Q Consensus 2 ~~~~~~~~~~~~ 13 (294) |.|+|+..+|++ T Consensus 4 k~~llL~lllA~ 15 (95) T PF07172_consen 4 KAFLLLGLLLAA 15 (95) T ss_pred HHHHHHHHHHHH T ss_conf 799999999999 No 72 >PF03527 RHS: RHS protein; InterPro: IPR001826 RHS elements are proteins of non-essential function believed to play an important role in the natural ecology of the cell. The protein sequences comprise highly conserved 141 kDa domain containing multiple tandem 22-residue repeats, followed by divergent C-terminal domains , . The 22 residue repeats contain a YD dipeptide which is the most strongly conserved motif of the repeat. 
Probab=22.84 E-value=9.9 Score=11.43 Aligned_cols=14 Identities=29% Similarity=0.696 Sum_probs=9.0 Q ss_pred EECCCCCEEEEEEC Q ss_conf 99089989999835 Q T0558 266 EIDSEGKVVWQLND 279 (294) Q Consensus 266 ~i~~~G~~vW~~~~ 279 (294) .+|.+|+++|+-.. T Consensus 15 ltd~~G~ivW~a~Y 28 (41) T PF03527_consen 15 LTDEDGEIVWSARY 28 (41) T ss_pred HCCCCCCEEEEEEH T ss_conf 70889859999872 No 73 >PF05436 MF_alpha_N: Mating factor alpha precursor N-terminus; InterPro: IPR008675 This entry contains the N-terminal regions of the Saccharomyces mating factor alpha precursor protein. All proteins in this family contain one or more copies of IPR006742 from INTERPRO further toward their C terminus.; GO: 0007618 mating, 0005576 extracellular region Probab=22.76 E-value=10 Score=11.42 Aligned_cols=23 Identities=26% Similarity=0.205 Sum_probs=18.7 Q ss_pred CHHHHHHHHHHHHHHHCCCCCCC Q ss_conf 90345568887663200147888 Q T0558 1 MKNFILLVALFLVAPFAQGSSPQ 23 (294) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~s~~~ 23 (294) ||-+.++.++.+++..+++++++ T Consensus 1 Mkf~~ilsa~~la~~av~aa~~~ 23 (86) T PF05436_consen 1 MKFSSILSAVALAATAVSAAPVE 23 (86) T ss_pred CCHHHHHHHHHHHHHHHCCCCCC T ss_conf 92578999999999871268876 No 74 >PF12273 RCR: Chitin synthesis regulation, resistance to Congo red Probab=21.28 E-value=11 Score=11.22 Aligned_cols=14 Identities=21% Similarity=0.489 Sum_probs=6.1 Q ss_pred HHHHHHHHHHHHHC Q ss_conf 45568887663200 Q T0558 4 FILLVALFLVAPFA 17 (294) Q Consensus 4 ~~~~~~~~~~~~~~ 17 (294) |.|+|+++++..|+ T Consensus 5 ~~iiv~~i~i~~~~ 18 (130) T PF12273_consen 5 FAIIVIAIFIIFFL 18 (130) T ss_pred HHHHHHHHHHHHHH T ss_conf 79999999999999 No 75 >PF08801 Nucleoporin_N: Nup133 N terminal like; InterPro: IPR014908 Nup133 is a nucleoporin that is crucial for nuclear pore complex (NPC) biogenesis. The N-terminal region forms a seven-bladed beta propeller structure.; PDB: 1xks_A. 
Probab=21.06 E-value=11 Score=11.19 Aligned_cols=34 Identities=15% Similarity=0.150 Sum_probs=25.9 Q ss_pred CCEEEECCCCCEEEEEECCCEEEEEECC--CCEEEE Q ss_conf 0089996998999997469889999378--858898 Q T0558 145 FRQINKNKKGNYLVPLFATSEVREIAPN--GQLLNS 178 (294) Q Consensus 145 ~~~~~~s~dG~~i~~g~~d~~i~~~d~~--g~~~~~ 178 (294) +..+...+.-+++++...++.|.+|+.. +..... T Consensus 194 I~~i~~d~~r~~ly~lts~g~i~~w~l~~~~~~~~~ 229 (424) T PF08801_consen 194 IVSIKVDPSRRLLYTLTSKGTIQVWDLSWGGSSLVR 229 (424) T ss_dssp EEEEEEETTTTEEEEEESSE-EEEEEE-SS-EEEEE T ss_pred EEEEEECCCCEEEEEEECCCCEEEEEECCCCCHHHH T ss_conf 699996686309999968997799994489721210 No 76 >PF10793 Gloverin: Gloverin-like protein Probab=20.57 E-value=11 Score=11.12 Aligned_cols=20 Identities=30% Similarity=0.348 Sum_probs=15.6 Q ss_pred HHHHHHHHHHHHHCCCCCCC Q ss_conf 45568887663200147888 Q T0558 4 FILLVALFLVAPFAQGSSPQ 23 (294) Q Consensus 4 ~~~~~~~~~~~~~~~~s~~~ 23 (294) ..++.+.+|+|.++|-+-|- T Consensus 5 l~~~~a~~~~c~~aqV~~pp 24 (175) T PF10793_consen 5 LLIIFAAVLACVNAQVSLPP 24 (175) T ss_pred EEHHHHHHHHHHHEEEECCC T ss_conf 41378999876301463375 No 77 >PF10395 Utp8: Utp8 family Probab=20.47 E-value=11 Score=11.10 Aligned_cols=88 Identities=19% Similarity=0.251 Sum_probs=50.2 Q ss_pred EEEEECCCCEEEEEEECCCCCCCCEEEEE--CCCCEEE-EE--C-CEEEEE--ECC----------CCCEEEEECCCCCC Q ss_conf 99998878829999944998731164790--7983998-41--8-828997--526----------77036772377763 Q T0558 35 IAIINKDTKEIVWEYPLEKGWECNSVAAT--KAGEILF-SY--S-KGAKMI--TRD----------GRELWNIAAPAGCE 96 (294) Q Consensus 35 i~~~d~~tg~~~w~~~~~~~~~~~~~~~~--pdG~~l~-s~--~-~~v~~~--~~~----------~~~~~~~~~~~~~~ 96 (294) -+++++ |-|++|.+++...--+.++... .|+.-++ .+ + +.-+.. ... ++..-.+...-... 
T Consensus 53 sYiikP-TPKLvws~pL~pTtiV~~~dV~~~~~~~k~~~vGlt~rkk~~lll~~~~~~~~~~~~~~~e~~~~~e~kl~~k 131 (670) T PF10395_consen 53 SYIIKP-TPKLVWSYPLSPTTIVEAMDVLEKSDGKKYYCVGLTERKKHKLLLIERKRSDTADGNSNGETTNEFELKLDDK 131 (670) T ss_pred HEECCC-CCCEEEECCCCCCCEEEEEEEEECCCCCEEEEEEEEECCEEEEEEEEECCCCCCCCCCCCCCCHHEEEECCCC T ss_conf 403579-9600570457855547765678637894699999832771489999822455556776766212136771572 Q ss_pred EEEEEECCCCCEEEEEECCCCEEEEEC Q ss_conf 148787378758999705897999985 Q T0558 97 MQTARILPDGNALVAWCGHPSTILEVN 123 (294) Q Consensus 97 v~~~~~~~dg~~l~~~s~~~~~~~~~~ 123 (294) +.++.+.++++.+++.-.++.+-..+. T Consensus 132 vv~Vk~~~~~~~I~vv~~nG~i~~~~~ 158 (670) T PF10395_consen 132 VVGVKFSSDGKIIVVVLENGLIQLYDF 158 (670) T ss_pred EEEEEECCCCCEEEEEEECCCEEEEEE T ss_conf 789998178978999991796899962 Done!
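Each hit in the listing above carries a statistics line of whitespace-separated `key=value` tokens (`Probab=... E-value=... Score=... Aligned_cols=... Identities=... Similarity=... Sum_probs=...`). The following is a minimal Python sketch for pulling those values out of such a line. It assumes only the token layout visible in this file; the function name `parse_hit_stats` is hypothetical and is not part of HHsearch.

```python
import re

# Matches "key=value" tokens such as "Probab=31.87", "E-value=7" or
# "Identities=18%". Keys may contain letters, underscores and hyphens.
_STAT_RE = re.compile(r"([A-Za-z_-]+)=(\S+)")


def parse_hit_stats(line):
    """Return the key=value statistics found in one line as a dict of floats.

    A trailing percent sign is stripped, so "Identities=18%" yields 18.0.
    Scientific notation such as "3.1E-09" is handled by float().
    """
    stats = {}
    for key, raw in _STAT_RE.findall(line):
        stats[key] = float(raw.rstrip("%"))
    return stats


# Statistics line copied from hit 64 in the listing above.
example = ("Probab=31.87 E-value=7 Score=12.54 Aligned_cols=28 "
           "Identities=18% Similarity=0.199 Sum_probs=17.2")
stats = parse_hit_stats(example)
print(stats)
```

With values parsed this way, the low-probability hits in this part of the listing (Probab below roughly 32) can be filtered out programmatically instead of by eye.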