Query psy7363
Match_columns 552
No_of_seqs 132 out of 324
Neff 4.4
Searched_HMMs 46136
Date Sat Aug 17 00:49:17 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy7363.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/7363hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF00927 Transglut_C: Transglu 99.6 1.1E-15 2.5E-20 133.2 10.6 105 373-479 1-107 (107)
2 smart00460 TGc Transglutaminas 99.2 1.8E-12 3.8E-17 102.9 1.6 65 144-254 4-68 (68)
3 PF01841 Transglut_core: Trans 99.1 1.7E-11 3.6E-16 105.3 1.1 71 134-252 43-113 (113)
4 COG1305 Transglutaminase-like 98.2 7.2E-07 1.6E-11 88.0 2.3 76 146-263 194-269 (319)
5 PF07705 CARDB: CARDB; InterP 93.6 1 2.2E-05 37.5 10.4 69 371-449 2-72 (101)
6 PF06030 DUF916: Bacterial pro 91.2 0.82 1.8E-05 41.9 7.3 63 384-447 24-102 (121)
7 PF06280 DUF1034: Fn3-like dom 89.8 10 0.00022 33.5 12.8 68 385-453 6-86 (112)
8 PF11614 FixG_C: IG-like fold 84.3 14 0.00031 32.8 10.6 84 391-487 35-118 (118)
9 PF14874 PapD-like: Flagellar- 82.1 11 0.00024 32.2 8.8 58 382-447 15-72 (102)
10 KOG3865|consensus 75.1 18 0.00038 39.0 9.2 79 370-448 192-276 (402)
11 PF10633 NPCBM_assoc: NPCBM-as 73.4 13 0.00028 30.8 6.4 59 384-448 2-60 (78)
12 PF09641 DUF2026: Protein of u 71.0 3 6.4E-05 41.8 2.4 17 128-144 32-49 (204)
13 PF07919 Gryzun: Gryzun, putat 69.1 22 0.00047 39.3 8.8 71 370-448 468-539 (554)
14 PF09624 DUF2393: Protein of u 68.4 18 0.00039 33.6 6.9 85 370-454 45-139 (149)
15 KOG4134|consensus 60.1 10 0.00023 38.9 3.9 54 243-320 72-127 (253)
16 COG1470 Predicted membrane pro 59.6 59 0.0013 36.8 9.7 80 385-468 282-361 (513)
17 PF00207 A2M: Alpha-2-macroglo 58.2 20 0.00044 30.6 4.9 40 370-410 53-92 (92)
18 PF00927 Transglut_C: Transglu 57.4 7.5 0.00016 33.9 2.1 24 487-510 1-24 (107)
19 TIGR02745 ccoG_rdxA_fixG cytoc 56.8 1.1E+02 0.0024 34.1 11.4 74 370-451 323-402 (434)
20 PF12690 BsuPI: Intracellular 54.9 74 0.0016 27.3 7.7 63 389-452 2-74 (82)
21 KOG0909|consensus 54.5 2.3 4.9E-05 47.1 -1.9 24 147-172 221-244 (500)
22 PF03896 TRAP_alpha: Transloco 52.6 2.3E+02 0.0049 30.1 12.3 99 371-473 82-183 (285)
23 PF06159 DUF974: Protein of un 51.5 2.7E+02 0.0059 28.5 12.5 65 381-447 8-76 (249)
24 PF14016 DUF4232: Protein of u 46.5 1.2E+02 0.0027 27.4 8.3 74 371-447 4-81 (131)
25 PF09674 DUF2400: Protein of u 44.4 7.7 0.00017 39.6 0.1 21 232-253 153-173 (232)
26 PF03168 LEA_2: Late embryogen 41.2 2.1E+02 0.0045 23.7 8.8 52 392-447 1-52 (101)
27 smart00769 WHy Water Stress an 40.9 1E+02 0.0022 26.7 6.6 64 381-448 9-72 (100)
28 PF05506 DUF756: Domain of unk 38.6 1.5E+02 0.0032 25.2 7.1 45 390-446 21-65 (89)
29 PF06510 DUF1102: Protein of u 38.1 1.6E+02 0.0034 28.6 7.7 75 386-466 67-141 (146)
30 PF12735 Trs65: TRAPP traffick 37.2 64 0.0014 34.0 5.6 43 369-412 154-196 (306)
31 PF01455 HupF_HypC: HupF/HypC 37.2 9.6 0.00021 31.9 -0.4 9 164-172 2-10 (68)
32 TIGR00074 hypC_hupF hydrogenas 37.1 14 0.0003 31.8 0.5 9 164-172 2-10 (76)
33 PRK10409 hydrogenase assembly 37.1 14 0.0003 32.8 0.6 9 164-172 2-10 (90)
34 PRK10413 hydrogenase 2 accesso 35.9 15 0.00032 32.0 0.5 9 164-172 2-10 (82)
35 PLN03080 Probable beta-xylosid 34.7 1.3E+02 0.0029 35.8 8.2 61 388-450 685-747 (779)
36 PRK15098 beta-D-glucoside gluc 34.1 73 0.0016 37.7 6.0 66 385-452 665-731 (765)
37 PF07919 Gryzun: Gryzun, putat 33.5 5E+02 0.011 28.8 12.0 95 369-469 171-283 (554)
38 COG0242 Def N-formylmethionyl- 32.8 43 0.00093 32.9 3.2 54 153-217 45-125 (168)
39 PF07703 A2M_N_2: Alpha-2-macr 28.6 4.2E+02 0.009 23.5 11.0 108 380-510 7-119 (136)
40 PF06205 GT36_AF: Glycosyltran 25.8 1.1E+02 0.0024 26.6 4.2 32 417-448 53-84 (90)
41 PF13584 BatD: Oxygen toleranc 25.2 9.3E+02 0.02 26.4 12.4 134 373-514 13-156 (484)
42 TIGR03079 CH4_NH3mon_ox_B meth 25.2 2.7E+02 0.0058 30.9 7.8 83 369-452 264-358 (399)
43 PF13199 Glyco_hydro_66: Glyco 25.1 1.1E+02 0.0024 35.3 5.2 66 385-457 9-76 (559)
44 PF07760 DUF1616: Protein of u 25.1 3.1E+02 0.0066 28.6 8.1 67 381-448 185-254 (287)
45 PF02752 Arrestin_C: Arrestin 25.0 4.5E+02 0.0097 22.6 11.4 48 372-420 4-52 (136)
46 PF07070 Spo0M: SpoOM protein; 24.5 6.5E+02 0.014 25.7 10.0 82 373-456 15-104 (218)
47 COG1470 Predicted membrane pro 24.4 1.1E+03 0.025 27.1 13.8 73 369-448 333-407 (513)
48 PF05753 TRAP_beta: Translocon 24.3 4.5E+02 0.0098 25.9 8.6 79 379-460 30-109 (181)
49 PF11931 DUF3449: Domain of un 23.4 27 0.00059 35.1 0.0 9 161-169 130-138 (196)
50 PF03423 CBM_25: Carbohydrate 22.9 47 0.001 28.7 1.4 23 203-228 3-25 (87)
51 PF14310 Fn3-like: Fibronectin 21.7 1.2E+02 0.0026 24.7 3.5 29 428-456 23-51 (71)
52 PF04744 Monooxygenase_B: Mono 21.1 1.8E+02 0.0039 32.1 5.6 109 369-478 245-370 (381)
53 cd00487 Pep_deformylase Polype 20.7 89 0.0019 29.3 2.9 48 164-217 46-119 (141)
No 1
>PF00927 Transglut_C: Transglutaminase family, C-terminal ig like domain; InterPro: IPR008958 Synonym(s): Protein-glutamine gamma-glutamyltransferase, Fibrinoligase, TGase Transglutaminases catalyse the post-translational modification of proteins at glutamine residues, with formation of isopeptide bonds. Members of the transglutaminase family usually have three domains: N-terminal (IPR001102 from INTERPRO), middle (IPR013808 from INTERPRO) and C-terminal. The middle domain is usually well conserved, but family members can display major differences in their N- and C-terminal domains, although their overall structure is conserved []. This entry represents the C-terminal domain found in transglutaminases, which consists of an immunoglobulin-like beta-sandwich consisting of seven strands in two sheets with a Greek key topology. The best known transglutaminase is blood coagulation factor XIII, a plasma tetrameric protein composed of two catalytic A subunits and two non-catalytic B subunits. Factor XIII is responsible for cross-linking fibrin chains, thus stabilising the fibrin clot. Protein-glutamine gamma-glutamyltransferases (2.3.2.13 from EC) are calcium-dependent enzymes that catalyse the cross-linking of proteins by promoting the formation of isopeptide bonds between the gamma-carboxyl group of a glutamine in one polypeptide chain and the epsilon-amino group of a lysine in a second polypeptide chain. TGases also catalyse the conjugation of polyamines to proteins [, ].; GO: 0003810 protein-glutamine gamma-glutamyltransferase activity, 0018149 peptide cross-linking; PDB: 2XZZ_A 1GGY_B 1FIE_B 1GGU_B 1GGT_B 1F13_A 1QRK_B 1EVU_A 1EX0_B 1L9N_B ....
Probab=99.64 E-value=1.1e-15 Score=133.23 Aligned_cols=105 Identities=31% Similarity=0.432 Sum_probs=95.0
Q ss_pred eEEEEEecCCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEEEEEehhhh-
Q psy7363 373 IKFDFELRDDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHVSYEEY- 451 (552)
Q Consensus 373 v~~~i~~~~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y~eY- 451 (552)
++|+|++.+++.+|+||++.++++|.+++.-++|++++++.+++|||..++++++....++|+|++++++.+.|.|++|
T Consensus 1 p~~~i~~~~~~~vG~d~~v~v~~~N~~~~~l~~v~~~l~~~~v~ytG~~~~~~~~~~~~~~l~p~~~~~~~~~i~p~~yG 80 (107)
T PF00927_consen 1 PEIKIKLPGDPVVGQDFTVSVSFTNPSSEPLRNVSLNLCAFTVEYTGLTRDQFKKEKFEVTLKPGETKSVEVTITPSQYG 80 (107)
T ss_dssp EEEEEEEESEEBTTSEEEEEEEEEE-SSS-EECEEEEEEEEEEECTTTEEEEEEEEEEEEEE-TTEEEEEEEEE-HHSHE
T ss_pred CeEEEEECCCccCCCCEEEEEEEEeCCcCccccceeEEEEEEEEECCcccccEeEEEcceeeCCCCEEEEEEEEEceeEe
Confidence 3678888899999999999999999999887899999999999999999999999999999999999999999999999
Q ss_pred -hccCCCCCcEEEEEEEEEccCCceEEEE
Q psy7363 452 -YKRLVDQADFNIACLATVHDTNFEYFAQ 479 (552)
Q Consensus 452 -~~~L~d~~~i~v~a~a~v~et~~~~~a~ 479 (552)
.++|+| +|++.++++++++.+.+++|
T Consensus 81 ~~~~l~~--~~~~~~l~~V~g~~~v~v~q 107 (107)
T PF00927_consen 81 PKQLLVD--LFSSDALADVKGTKQVYVTQ 107 (107)
T ss_dssp EECCEEE--EEEESSEEEEEEEEEEEEEE
T ss_pred cchhcch--hcchhhhcCeeccEEEEEeC
Confidence 678888 99999999999999877765
No 2
>smart00460 TGc Transglutaminase/protease-like homologues. Transglutaminases are enzymes that establish covalent links between proteins. A subset of transglutaminase homologues appear to catalyse the reverse reaction, the hydrolysis of peptide bonds. Proteins with this domain are both extracellular and intracellular, and it is likely that the eukaryotic intracellular proteins are involved in signalling events.
Probab=99.25 E-value=1.8e-12 Score=102.94 Aligned_cols=65 Identities=38% Similarity=0.793 Sum_probs=52.9
Q ss_pred ccccceeeeeeeccccceeeccCCCcceeeecccCccccccccccccccccccCCCCCceeeeeecccCccccCCCCCce
Q psy7363 144 VKYGQCWVFSGVLTTGETCRTLEAPNKDIIIIFVNPESLEKNLVVGVCYAAAHDTQSSLTVDYFVDEDGRVMEELNSDSI 223 (552)
Q Consensus 144 VkYGQCWVFAgV~~T~~vlR~LGIP~R~V~~~~~~~~~~~~~~~~~TNF~SAHDt~~nLtiD~y~de~G~~l~~~~~DSI 223 (552)
.++|+|.-||.+++. +||++|||||+| ..|....+...... ..
T Consensus 4 ~~~G~C~~~a~l~~~--llr~~GIpar~v-----------------~g~~~~~~~~~~~~------------------~~ 46 (68)
T smart00460 4 TKYGTCGEFAALFVA--LLRSLGIPARVV-----------------SGYLKAPDTIGGLR------------------SI 46 (68)
T ss_pred ccceeeHHHHHHHHH--HHHHCCCCeEEE-----------------eeeecCCCCCcccc------------------cC
Confidence 478999999999999 999999999999 77765554432211 23
Q ss_pred eeeeeeeeeeecccCCCCCCCCCeEEEcCCC
Q psy7363 224 WNFHMWNEVWMTRRDLGTTDFNGWQVIDATP 254 (552)
Q Consensus 224 WNFHVWnE~WM~RpDL~~~gy~GWQvlDaTP 254 (552)
.+.|+|+|+|+. ++|+.+||||
T Consensus 47 ~~~H~W~ev~~~---------~~W~~~D~~~ 68 (68)
T smart00460 47 WEAHAWAEVYLE---------GGWVPVDPTP 68 (68)
T ss_pred CCcEEEEEEEEC---------CCeEEEeCCC
Confidence 479999999983 6999999998
No 3
>PF01841 Transglut_core: Transglutaminase-like superfamily; InterPro: IPR002931 This domain is found in many proteins known to have transglutaminase activity, i.e. which cross-link proteins through an acyl-transfer reaction between the gamma-carboxamide group of peptide-bound glutamine and the epsilon-amino group of peptide-bound lysine, resulting in an epsilon-(gamma-glutamyl)lysine isopeptide bond. Transglutaminases have been found in a diverse range of species, from bacteria through to mammals. The enzymes require calcium binding and their activity leads to post-translational modification of proteins through acyl-transfer reactions, involving peptidyl glutamine residues as acyl donors and a variety of primary amines as acyl acceptors, with the generation of proteinase-resistant isopeptide bonds []. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterised transglutaminase, the human blood clotting factor XIIIa' []. On the basis of the experimentally demonstrated activity of the Methanobacterium phage psiM2 pseudomurein endoisopeptidase [], it is proposed that many, if not all, microbial homologs of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease []. A subunit of plasma Factor XIII revealed that each Factor XIIIA subunit is composed of four domains (termed N-terminal beta-sandwich, core domain (containing the catalytic and the regulatory sites), and C-terminal beta-barrels 1 and 2) and that two monomers assemble into the native dimer through the surfaces in domains 1 and 2, in opposite orientation. This organisation in four domains is highly conserved during evolution among transglutaminase isoforms [].; PDB: 2F4M_A 2F4O_A 1GGY_B 1FIE_B 1GGU_B 1GGT_B 1F13_A 1QRK_B 1EVU_A 1EX0_B ....
Probab=99.11 E-value=1.7e-11 Score=105.35 Aligned_cols=71 Identities=21% Similarity=0.402 Sum_probs=55.5
Q ss_pred HHHHHhCCCCccccceeeeeeeccccceeeccCCCcceeeecccCccccccccccccccccccCCCCCceeeeeecccCc
Q psy7363 134 LQQFYTKKKPVKYGQCWVFSGVLTTGETCRTLEAPNKDIIIIFVNPESLEKNLVVGVCYAAAHDTQSSLTVDYFVDEDGR 213 (552)
Q Consensus 134 L~q~~~t~~PVkYGQCWVFAgV~~T~~vlR~LGIP~R~V~~~~~~~~~~~~~~~~~TNF~SAHDt~~nLtiD~y~de~G~ 213 (552)
..+.+++| +|.|.-||.+++. +||++|||||+| +.+...++++.+
T Consensus 43 ~~~~l~~~----~G~C~~~a~l~~a--llr~~Gipar~v-----------------~g~~~~~~~~~~------------ 87 (113)
T PF01841_consen 43 ASEVLRSG----RGDCEDYASLFVA--LLRALGIPARVV-----------------SGYVKGPDPDGD------------ 87 (113)
T ss_dssp HHHHHHCE----EESHHHHHHHHHH--HHHHHT--EEEE-----------------EEEEEECSSTTC------------
T ss_pred HHHHHHcC----CCccHHHHHHHHH--HHhhCCCceEEE-----------------EEEcCCcccccc------------
Confidence 34444455 4999999999999 999999999999 888887776655
Q ss_pred cccCCCCCceeeeeeeeeeeecccCCCCCCCCCeEEEcC
Q psy7363 214 VMEELNSDSIWNFHMWNEVWMTRRDLGTTDFNGWQVIDA 252 (552)
Q Consensus 214 ~l~~~~~DSIWNFHVWnE~WM~RpDL~~~gy~GWQvlDa 252 (552)
.....++.|+|||+|+ + .+||..+||
T Consensus 88 -----~~~~~~~~H~w~ev~~-----~---~~~W~~~Dp 113 (113)
T PF01841_consen 88 -----YSVDGNDNHAWVEVYL-----P---GGGWIPLDP 113 (113)
T ss_dssp -----TSTSSEEEEEEEEEEE-----T---TTEEEEEET
T ss_pred -----ccCCCCCCEEEEEEEE-----c---CCcEEEcCC
Confidence 2344567899999999 3 389999997
No 4
>COG1305 Transglutaminase-like enzymes, putative cysteine proteases [Amino acid transport and metabolism]
Probab=98.18 E-value=7.2e-07 Score=87.96 Aligned_cols=76 Identities=25% Similarity=0.432 Sum_probs=57.0
Q ss_pred ccceeeeeeeccccceeeccCCCcceeeecccCccccccccccccccccccCCCCCceeeeeecccCccccCCCCCceee
Q psy7363 146 YGQCWVFSGVLTTGETCRTLEAPNKDIIIIFVNPESLEKNLVVGVCYAAAHDTQSSLTVDYFVDEDGRVMEELNSDSIWN 225 (552)
Q Consensus 146 YGQCWVFAgV~~T~~vlR~LGIP~R~V~~~~~~~~~~~~~~~~~TNF~SAHDt~~nLtiD~y~de~G~~l~~~~~DSIWN 225 (552)
.|.|.=||.+|.. +||++|||||.| .+|..+ ..+.+.......-.+
T Consensus 194 ~G~C~d~a~l~va--l~Ra~GIpAR~V-----------------~Gy~~~---------------~~~~~~~~~~~~~~~ 239 (319)
T COG1305 194 RGVCRDFAHLLVA--LLRAAGIPARYV-----------------SGYLGA---------------EVEPLSGRPLVRNDD 239 (319)
T ss_pred CcccccHHHHHHH--HHHHcCCcceee-----------------eccccC---------------CCCcccccccccCcc
Confidence 7999999999999 999999999999 664332 222221110122336
Q ss_pred eeeeeeeeecccCCCCCCCCCeEEEcCCCCCcCCCeee
Q psy7363 226 FHMWNEVWMTRRDLGTTDFNGWQVIDATPQELSGRKYQ 263 (552)
Q Consensus 226 FHVWnE~WM~RpDL~~~gy~GWQvlDaTPQE~S~G~y~ 263 (552)
.|+|.|+|+. ++ ||-.+|||+.....+.+.
T Consensus 240 ~Haw~ev~~~-------~~-gW~~~Dpt~~~~~~~~~~ 269 (319)
T COG1305 240 AHAWAEVYLP-------GR-GWVPLDPTNGLLAGGRYS 269 (319)
T ss_pred cceeeeeecC-------CC-ccEeecCCCCCccCcccc
Confidence 8999999974 34 999999999999888764
No 5
>PF07705 CARDB: CARDB; InterPro: IPR011635 The APHP (acidic peptide-dependent hydrolases/peptidase) domain is found in a variety of different proteins.; PDB: 2KUT_A 2L0D_A 3IDU_A 2KL6_A.
Probab=93.61 E-value=1 Score=37.54 Aligned_cols=69 Identities=20% Similarity=0.304 Sum_probs=43.7
Q ss_pred CceEEE-EEecCCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEeeEE-EEcCCCEEEEEEEEeh
Q psy7363 371 NDIKFD-FELRDDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTKEDV-VVKRGKSEEIVLHVSY 448 (552)
Q Consensus 371 ~~v~~~-i~~~~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~~~v-~L~p~e~~~v~l~I~y 448 (552)
+||.+. ......+..|+++.+.++|+|........+.+.+ +-.|... ....+ .|.|++...+.+.+..
T Consensus 2 pDL~v~~~~~~~~~~~g~~~~i~~~V~N~G~~~~~~~~v~~-----~~~~~~~-----~~~~i~~L~~g~~~~v~~~~~~ 71 (101)
T PF07705_consen 2 PDLTVSITVSPSNVVPGEPVTITVTVKNNGTADAENVTVRL-----YLDGNSV-----STVTIPSLAPGESETVTFTWTP 71 (101)
T ss_dssp --EEE-EEEC-SEEETTSEEEEEEEEEE-SSS-BEEEEEEE-----EETTEEE-----EEEEESEB-TTEEEEEEEEEE-
T ss_pred CCEEEEEeeCCCcccCCCEEEEEEEEEECCCCCCCCEEEEE-----EECCcee-----ccEEECCcCCCcEEEEEEEEEe
Confidence 566663 3344666789999999999999776535555553 2333333 44555 8899999999999987
Q ss_pred h
Q psy7363 449 E 449 (552)
Q Consensus 449 ~ 449 (552)
.
T Consensus 72 ~ 72 (101)
T PF07705_consen 72 P 72 (101)
T ss_dssp S
T ss_pred C
Confidence 7
No 6
>PF06030 DUF916: Bacterial protein of unknown function (DUF916); InterPro: IPR010317 This family consists of putative cell surface proteins, from Firmicutes, of unknown function.
Probab=91.18 E-value=0.82 Score=41.92 Aligned_cols=63 Identities=17% Similarity=0.302 Sum_probs=47.0
Q ss_pred CCCCCEEEEEEEEeCCCCCceEEEEEEEEEE------EeEcCcccc----------ceeeEeeEEEEcCCCEEEEEEEEe
Q psy7363 384 VIGSPFSVVVKMNNKSRDQDYTVTVILRVDA------VTYTGKVGD----------SVKKTKEDVVVKRGKSEEIVLHVS 447 (552)
Q Consensus 384 ~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~------v~YtG~~~~----------~~~~~~~~v~L~p~e~~~v~l~I~ 447 (552)
.-|+.-+|.+.|+|.|++. .++.+.+.-.. +.|+..... ++-+....|+|+|++++.+++.|.
T Consensus 24 ~P~q~~~l~v~i~N~s~~~-~tv~v~~~~A~Tn~nG~I~Y~~~~~~~d~sl~~~~~~~v~~~~~Vtl~~~~sk~V~~~i~ 102 (121)
T PF06030_consen 24 KPGQKQTLEVRITNNSDKE-ITVKVSANTATTNDNGVIDYSQNNPKKDKSLKYPFSDLVKIPKEVTLPPNESKTVTFTIK 102 (121)
T ss_pred CCCCEEEEEEEEEeCCCCC-EEEEEEEeeeEecCCEEEEECCCCcccCcccCcchHHhccCCcEEEECCCCEEEEEEEEE
Confidence 4699999999999999988 88888776554 455444332 233334459999999999998885
No 7
>PF06280 DUF1034: Fn3-like domain (DUF1034); InterPro: IPR010435 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain of unknown function is present in bacterial and plant peptidases belonging to MEROPS peptidase family S8 (subfamily S8A subtilisin, clan SB). It is C-terminal to and adjacent to the S8 peptidase domain and can be found in conjunction with the PA (Protease associated) domain (IPR003137 from INTERPRO) and additionally in Gram-positive bacteria with the surface protein anchor domain (IPR001899 from INTERPRO).; GO: 0004252 serine-type endopeptidase activity, 0005618 cell wall, 0016020 membrane; PDB: 3EIF_A 1XF1_B.
Probab=89.80 E-value=10 Score=33.48 Aligned_cols=68 Identities=19% Similarity=0.187 Sum_probs=42.3
Q ss_pred CCCCEEEEEEEEeCCCCCceEEEEEEE-EEEEe-E--cCcccc--------ceeeEeeEEEEcCCCEEEEEEEEeh-hhh
Q psy7363 385 IGSPFSVVVKMNNKSRDQDYTVTVILR-VDAVT-Y--TGKVGD--------SVKKTKEDVVVKRGKSEEIVLHVSY-EEY 451 (552)
Q Consensus 385 iG~Df~l~v~l~N~s~~~~r~v~~~l~-a~~v~-Y--tG~~~~--------~~~~~~~~v~L~p~e~~~v~l~I~y-~eY 451 (552)
+|..+...++|+|.+++. .+.++... +.+.. + .|.... ........|+|+||+++++.+.|.. +..
T Consensus 6 ~~~~~~~~itl~N~~~~~-~ty~~~~~~~~t~~~~~~~~~~~~~~~~~~~~~~~~~~~~vTV~ag~s~~v~vti~~p~~~ 84 (112)
T PF06280_consen 6 TGNKFSFTITLHNYGDKP-VTYTLSHVPVLTDKTDTEEGYSILVPPVPSISTVSFSPDTVTVPAGQSKTVTVTITPPSGL 84 (112)
T ss_dssp E-SEEEEEEEEEE-SSS--EEEEEEEE-EEEEEE--ETTEEEEEEEE----EEE---EEEEE-TTEEEEEEEEEE--GGG
T ss_pred cCCceEEEEEEEECCCCC-EEEEEeeEEEEeeEeeccCCcccccccccceeeEEeCCCeEEECCCCEEEEEEEEEehhcC
Confidence 377799999999999988 88777666 33322 1 122111 3455678999999999999999998 545
Q ss_pred hc
Q psy7363 452 YK 453 (552)
Q Consensus 452 ~~ 453 (552)
.+
T Consensus 85 ~~ 86 (112)
T PF06280_consen 85 DA 86 (112)
T ss_dssp HH
T ss_pred Cc
Confidence 54
No 8
>PF11614 FixG_C: IG-like fold at C-terminal of FixG, putative oxidoreductase; PDB: 2R39_A.
Probab=84.28 E-value=14 Score=32.80 Aligned_cols=84 Identities=13% Similarity=0.144 Sum_probs=45.0
Q ss_pred EEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEEEEEehhhhhccCCCCCcEEEEEEEEEc
Q psy7363 391 VVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHVSYEEYYKRLVDQADFNIACLATVH 470 (552)
Q Consensus 391 l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y~eY~~~L~d~~~i~v~a~a~v~ 470 (552)
-.++|.|++.+. +++++.+.. ....++-.....++|+|++..++++.|....-.-. ....-|.|.+. ..
T Consensus 35 Y~lkl~Nkt~~~-~~~~i~~~g-------~~~~~l~~~~~~i~v~~g~~~~~~v~v~~p~~~~~-~~~~~i~f~v~--~~ 103 (118)
T PF11614_consen 35 YTLKLTNKTNQP-RTYTISVEG-------LPGAELQGPENTITVPPGETREVPVFVTAPPDALK-SGSTPITFTVT--DD 103 (118)
T ss_dssp EEEEEEE-SSS--EEEEEEEES--------SS-EE-ES--EEEE-TT-EEEEEEEEEE-GGG-S-SSEEEEEEEEE--EG
T ss_pred EEEEEEECCCCC-EEEEEEEec-------CCCeEEECCCcceEECCCCEEEEEEEEEECHHHcc-CCCeeEEEEEE--EC
Confidence 578899999999 888777763 22444433568899999999999998876554422 12224555554 32
Q ss_pred cCCceEEEEEeEEEeCC
Q psy7363 471 DTNFEYFAQDDFRVRKP 487 (552)
Q Consensus 471 et~~~~~a~~d~~L~~P 487 (552)
+.....+.+-.+..|
T Consensus 104 --~~~~~~~~~s~F~~P 118 (118)
T PF11614_consen 104 --DGGEIITYKSTFIGP 118 (118)
T ss_dssp --GGTEEEEEEEEEE--
T ss_pred --CCCEEEEEEEEEEcC
Confidence 223445555554443
No 9
>PF14874 PapD-like: Flagellar-associated PapD-like
Probab=82.11 E-value=11 Score=32.20 Aligned_cols=58 Identities=17% Similarity=0.214 Sum_probs=42.4
Q ss_pred CCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEEEEEe
Q psy7363 382 DIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHVS 447 (552)
Q Consensus 382 ~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~ 447 (552)
.+.+|+.-...+.|+|.|... ...++... . .....|.-....-.|.||++.++.+.+.
T Consensus 15 ~v~~g~~~~~~v~l~N~s~~p-~~f~v~~~--~-----~~~~~~~v~~~~g~l~PG~~~~~~V~~~ 72 (102)
T PF14874_consen 15 NVFVGQTYSRTVTLTNTSSIP-ARFRVRQP--E-----SLSSFFSVEPPSGFLAPGESVELEVTFS 72 (102)
T ss_pred EEccCCEEEEEEEEEECCCCC-EEEEEEeC--C-----cCCCCEEEECCCCEECCCCEEEEEEEEE
Confidence 567999999999999999887 55555422 1 1223444455566799999999888887
No 10
>KOG3865|consensus
Probab=75.11 E-value=18 Score=39.03 Aligned_cols=79 Identities=20% Similarity=0.297 Sum_probs=59.2
Q ss_pred CCceEEEEEecCCCCC-CCCEEEEEEEEeCCCCCceEEEEEE--EEEEEeE-cCccccceeeEeeEE--EEcCCCEEEEE
Q psy7363 370 FNDIKFDFELRDDIVI-GSPFSVVVKMNNKSRDQDYTVTVIL--RVDAVTY-TGKVGDSVKKTKEDV--VVKRGKSEEIV 443 (552)
Q Consensus 370 ~~~v~~~i~~~~~~~i-G~Df~l~v~l~N~s~~~~r~v~~~l--~a~~v~Y-tG~~~~~~~~~~~~v--~L~p~e~~~v~ 443 (552)
...+++...+...+-. |+++.+.|.++|+|...-+.+++.+ .|..+.| |+.....+.....+= .+.|+.+-.-.
T Consensus 192 ~~~lhLevsLDkEiYyHGE~isvnV~V~NNsnKtVKkIK~~V~Q~adi~Lfs~aqy~~~VA~~E~~eGc~v~Pgstl~Kv 271 (402)
T KOG3865|consen 192 DGPLHLEVSLDKEIYYHGEPISVNVHVTNNSNKTVKKIKISVRQVADICLFSTAQYKKPVAMEETDEGCPVAPGSTLSKV 271 (402)
T ss_pred CCceEEEEEecchheecCCceeEEEEEecCCcceeeeeEEEeEeeceEEEEecccccceeeeeecccCCccCCCCeeeee
Confidence 4678888888666644 9999999999999987655555544 4788888 787787876665554 88899887666
Q ss_pred EEEeh
Q psy7363 444 LHVSY 448 (552)
Q Consensus 444 l~I~y 448 (552)
+.+.+
T Consensus 272 f~l~P 276 (402)
T KOG3865|consen 272 FTLTP 276 (402)
T ss_pred EEech
Confidence 66643
No 11
>PF10633 NPCBM_assoc: NPCBM-associated, NEW3 domain of alpha-galactosidase; InterPro: IPR018905 This domain has been named NEW3, but its function is not known. It is found on proteins which are bacterial galactosidases [].; PDB: 1EUT_A 2BZD_A 1WCQ_C 2BER_A 1W8O_A 1EUU_A 1W8N_A.
Probab=73.36 E-value=13 Score=30.81 Aligned_cols=59 Identities=17% Similarity=0.212 Sum_probs=32.3
Q ss_pred CCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEEEEEeh
Q psy7363 384 VIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHVSY 448 (552)
Q Consensus 384 ~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y 448 (552)
.-|+.+.+.++++|.....-..+++.+.+- .|=. ......... .|+||+..++.+.|..
T Consensus 2 ~~G~~~~~~~tv~N~g~~~~~~v~~~l~~P----~GW~-~~~~~~~~~-~l~pG~s~~~~~~V~v 60 (78)
T PF10633_consen 2 TPGETVTVTLTVTNTGTAPLTNVSLSLSLP----EGWT-VSASPASVP-SLPPGESVTVTFTVTV 60 (78)
T ss_dssp -TTEEEEEEEEEE--SSS-BSS-EEEEE------TTSE----EEEEE---B-TTSEEEEEEEEEE
T ss_pred CCCCEEEEEEEEEECCCCceeeEEEEEeCC----CCcc-ccCCccccc-cCCCCCEEEEEEEEEC
Confidence 359999999999999866534566666541 2322 122222222 8999999998888864
No 12
>PF09641 DUF2026: Protein of unknown function (DUF2026); InterPro: IPR018599 This protein of approx. 100 residues is found in bacteria. It contains up to five alpha helices and up to seven beta strands and is probably monomeric. Its function is unknown. ; PDB: 2HLY_A.
Probab=71.00 E-value=3 Score=41.77 Aligned_cols=17 Identities=29% Similarity=0.346 Sum_probs=9.9
Q ss_pred cccHHHHHHHHhCC-CCc
Q psy7363 128 VGSMKILQQFYTKK-KPV 144 (552)
Q Consensus 128 ~GSv~IL~q~~~t~-~PV 144 (552)
+..+-||++.|+-. .|+
T Consensus 32 ~~Ga~IL~~hYk~~A~~~ 49 (204)
T PF09641_consen 32 TFGAFILRDHYKIEAKPK 49 (204)
T ss_dssp HHHHHHHHHHH---EEEE
T ss_pred HHHHHHHHHHhcccceec
Confidence 35678999999887 444
No 13
>PF07919 Gryzun: Gryzun, putative trafficking through Golgi; InterPro: IPR012880 The proteins featured in this family are all hypothetical eukaryotic proteins of unknown function. The region in question is approximately 150 residues long.
Probab=69.14 E-value=22 Score=39.31 Aligned_cols=71 Identities=8% Similarity=0.180 Sum_probs=56.2
Q ss_pred CCceEEEEEecCCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEE-EEeEcCccccceeeEeeEEEEcCCCEEEEEEEEeh
Q psy7363 370 FNDIKFDFELRDDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVD-AVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHVSY 448 (552)
Q Consensus 370 ~~~v~~~i~~~~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~-~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y 448 (552)
..++.+.++......+|.+|.+.++|+|.|... .++.+.|... .-.+.|.. +..+.|.|.+.+.+...+-+
T Consensus 468 ~~~~~v~~~~p~~~~~~~~~~l~~~I~N~T~~~-~~~~~~me~s~~F~fsG~k-------~~~~~llP~s~~~~~y~l~p 539 (554)
T PF07919_consen 468 SSPLRVLASVPPSAIVGEPFTLSYTIENPTNHF-QTFELSMEPSDDFMFSGPK-------QTTFSLLPFSRHTVRYNLLP 539 (554)
T ss_pred CCCcEEEEecCCccccCcEEEEEEEEECCCCcc-EEEEEEEccCCCEEEECCC-------cCceEECCCCcEEEEEEEEE
Confidence 456677777777788999999999999999988 8888888633 34566655 57888999999999887743
No 14
>PF09624 DUF2393: Protein of unknown function (DUF2393); InterPro: IPR013417 The function of this protein is unknown. It is always found as part of a two-gene operon with IPR013416 from INTERPRO, a protein that appears to span the membrane seven times. It has so far been found in the bacteria Anabaena sp. (strain PCC 7120), Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus.
Probab=68.36 E-value=18 Score=33.57 Aligned_cols=85 Identities=13% Similarity=0.199 Sum_probs=54.0
Q ss_pred CCceEEEEEecCCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEe-EcCccccceeeEeeEE---------EEcCCCE
Q psy7363 370 FNDIKFDFELRDDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVT-YTGKVGDSVKKTKEDV---------VVKRGKS 439 (552)
Q Consensus 370 ~~~v~~~i~~~~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~-YtG~~~~~~~~~~~~v---------~L~p~e~ 439 (552)
.....+.+........++-+-|..+++|.+...-+.|.+.+..-.-. =.+....+.......+ .|.|+++
T Consensus 45 ~~~~~~~~~~~~~l~~~~~~~v~g~V~N~g~~~i~~c~i~~~l~~~~~~~~n~~~~~~~~~~~f~~~~~~i~~~L~~~e~ 124 (149)
T PF09624_consen 45 LKKIELTLTSQKRLQYSESFYVDGTVTNTGKFTIKKCKITVKLYNDKQVSGNKFKEIFYQQIPFVKKSIPIADNLKPGES 124 (149)
T ss_pred cCCceEEEeeeeeeeeccEEEEEEEEEECCCCEeeEEEEEEEEEeCCCccCchhhhhhccccchhccceeHHhhcCcccc
Confidence 34556666656667889999999999999998745555554432200 1222222222222112 2999999
Q ss_pred EEEEEEEehhhhhcc
Q psy7363 440 EEIVLHVSYEEYYKR 454 (552)
Q Consensus 440 ~~v~l~I~y~eY~~~ 454 (552)
++..+.+++..|.+.
T Consensus 125 ~~f~~~~~~~p~~~~ 139 (149)
T PF09624_consen 125 KEFRFIFPYPPYFGN 139 (149)
T ss_pred eeEEEEecCCccCCC
Confidence 999999997777543
No 15
>KOG4134|consensus
Probab=60.12 E-value=10 Score=38.86 Aligned_cols=54 Identities=26% Similarity=0.334 Sum_probs=44.0
Q ss_pred CCCCeEEEcCCCCCcCCCeeeeccccceeeeeccccccCCCceEEEEEcCceEEEEEeCCCCcee--EEeecccccccee
Q psy7363 243 DFNGWQVIDATPQELSGRKYQCGPTSVTAVKRGEVKIAYDGGFVFSEVNADKVFWRYNGPTQPLK--LLRKDINGIGLAL 320 (552)
Q Consensus 243 gy~GWQvlDaTPQE~S~G~y~CGPapV~AIKeG~v~~~YD~~FVFAEVNAD~v~W~~~~~g~~~~--~~~~dt~~VG~~I 320 (552)
||++=.+|+.| ..+-||.||+|.=+|||.+.|+- +.|..+. .-.+...+||..|
T Consensus 72 gydnIKvLg~~-----------------------aki~~D~pf~hlwi~adfyVf~P-k~Gd~LeG~Vn~vS~sHIglLI 127 (253)
T KOG4134|consen 72 GYDNIKVLGQT-----------------------AKIRADDPFMHLWINADFYVFRP-KAGDILEGVVNHVSRSHIGLLI 127 (253)
T ss_pred eecceEeeccc-----------------------cceecCCCceEEEEeeeEEEECC-CCCCeeeeeeeecchhhhceee
Confidence 89999999988 57889999999999999999976 4444322 3456788999988
No 16
>COG1470 Predicted membrane protein [Function unknown]
Probab=59.60 E-value=59 Score=36.83 Aligned_cols=80 Identities=14% Similarity=0.167 Sum_probs=56.6
Q ss_pred CCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEEEEEehhhhhccCCCCCcEEEE
Q psy7363 385 IGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHVSYEEYYKRLVDQADFNIA 464 (552)
Q Consensus 385 iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y~eY~~~L~d~~~i~v~ 464 (552)
.|..+.+.|++.|+.... .+..|.++.-.-.||-..... .-.--.+.|.|++++.+.+.|..++ ....+.--+.|+
T Consensus 282 ~~~t~sf~V~IeN~g~~~-d~y~Le~~g~pe~w~~~Fteg-~~~vt~vkL~~gE~kdvtleV~ps~--na~pG~Ynv~I~ 357 (513)
T COG1470 282 PSTTASFTVSIENRGKQD-DEYALELSGLPEGWTAEFTEG-ELRVTSVKLKPGEEKDVTLEVYPSL--NATPGTYNVTIT 357 (513)
T ss_pred cCCceEEEEEEccCCCCC-ceeEEEeccCCCCcceEEeeC-ceEEEEEEecCCCceEEEEEEecCC--CCCCCceeEEEE
Confidence 588899999999999999 888888885444444333311 1223578899999999999997764 344445556666
Q ss_pred EEEE
Q psy7363 465 CLAT 468 (552)
Q Consensus 465 a~a~ 468 (552)
|..+
T Consensus 358 A~s~ 361 (513)
T COG1470 358 ASSS 361 (513)
T ss_pred Eecc
Confidence 6654
No 17
>PF00207 A2M: Alpha-2-macroglobulin family; InterPro: IPR001599 This entry contains serum complement C3 and C4 precursors and alpha-macroglobulins. The alpha-macroglobulin (aM) family of proteins includes protease inhibitors [], typified by the human tetrameric a2-macroglobulin (a2M); they belong to the MEROPS proteinase inhibitor family I39, clan IL. These protease inhibitors share several defining properties, which include (i) the ability to inhibit proteases from all catalytic classes, (ii) the presence of a 'bait region' and a thiol ester, (iii) a similar protease inhibitory mechanism and (iv) the inactivation of the inhibitory capacity by reaction of the thiol ester with small primary amines. aM protease inhibitors inhibit by steric hindrance []. The mechanism involves protease cleavage of the bait region, a segment of the aM that is particularly susceptible to proteolytic cleavage, which initiates a conformational change such that the aM collapses about the protease. In the resulting aM-protease complex, the active site of the protease is sterically shielded, thus substantially decreasing access to protein substrates. Two additional events occur as a consequence of bait region cleavage, namely (i) the beta-cysteinyl-gamma-glutamyl thiol ester becomes highly reactive and (ii) a major conformational change exposes a conserved COOH-terminal receptor binding domain [] (RBD). RBD exposure allows the aM protease complex to bind to clearance receptors and be removed from circulation []. Tetrameric, dimeric, and, more recently, monomeric aM protease inhibitors have been identified [, ].; GO: 0004866 endopeptidase inhibitor activity; PDB: 3KLS_B 3PRX_C 3KM9_B 3PVM_C 3CU7_A 4E0S_A 4A5W_A 2PN5_A 3FRP_G 3HRZ_B ....
Probab=58.24 E-value=20 Score=30.58 Aligned_cols=40 Identities=23% Similarity=0.317 Sum_probs=31.0
Q ss_pred CCceEEEEEecCCCCCCCCEEEEEEEEeCCCCCceEEEEEE
Q psy7363 370 FNDIKFDFELRDDIVIGSPFSVVVKMNNKSRDQDYTVTVIL 410 (552)
Q Consensus 370 ~~~v~~~i~~~~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l 410 (552)
..++.++..+...+..|+-+.+.+++.|..+.. .++++.|
T Consensus 53 ~~p~~i~~~lP~~l~~GD~~~i~v~v~N~~~~~-~~v~V~l 92 (92)
T PF00207_consen 53 FKPFFIQLNLPRSLRRGDQIQIPVTVFNYTDKD-QEVTVTL 92 (92)
T ss_dssp B-SEEEEEE--SEEETTSEEEEEEEEEE-SSS--EEEEEEE
T ss_pred EeeEEEEcCCCcEEecCCEEEEEEEEEeCCCCC-EEEEEEC
Confidence 468888888888899999999999999999988 8887765
No 18
>PF00927 Transglut_C: Transglutaminase family, C-terminal ig like domain; InterPro: IPR008958 Synonym(s): Protein-glutamine gamma-glutamyltransferase, Fibrinoligase, TGase Transglutaminases catalyse the post-translational modification of proteins at glutamine residues, with formation of isopeptide bonds. Members of the transglutaminase family usually have three domains: N-terminal (IPR001102 from INTERPRO), middle (IPR013808 from INTERPRO) and C-terminal. The middle domain is usually well conserved, but family members can display major differences in their N- and C-terminal domains, although their overall structure is conserved []. This entry represents the C-terminal domain found in transglutaminases, which consists of an immunoglobulin-like beta-sandwich consisting of seven strands in two sheets with a Greek key topology. The best known transglutaminase is blood coagulation factor XIII, a plasma tetrameric protein composed of two catalytic A subunits and two non-catalytic B subunits. Factor XIII is responsible for cross-linking fibrin chains, thus stabilising the fibrin clot. Protein-glutamine gamma-glutamyltransferases (2.3.2.13 from EC) are calcium-dependent enzymes that catalyse the cross-linking of proteins by promoting the formation of isopeptide bonds between the gamma-carboxyl group of a glutamine in one polypeptide chain and the epsilon-amino group of a lysine in a second polypeptide chain. TGases also catalyse the conjugation of polyamines to proteins [, ].; GO: 0003810 protein-glutamine gamma-glutamyltransferase activity, 0018149 peptide cross-linking; PDB: 2XZZ_A 1GGY_B 1FIE_B 1GGU_B 1GGT_B 1F13_A 1QRK_B 1EVU_A 1EX0_B 1L9N_B ....
Probab=57.37 E-value=7.5 Score=33.92 Aligned_cols=24 Identities=13% Similarity=0.104 Sum_probs=22.3
Q ss_pred CCeEEEeCCcccccceeeEEEeec
Q psy7363 487 PDIRLKQLMKRFDRGNCNGGLITP 510 (552)
Q Consensus 487 P~l~I~v~g~~~V~k~~~~~~~~~ 510 (552)
|+|+|++++.+.+|+++.+.++|-
T Consensus 1 p~~~i~~~~~~~vG~d~~v~v~~~ 24 (107)
T PF00927_consen 1 PEIKIKLPGDPVVGQDFTVSVSFT 24 (107)
T ss_dssp EEEEEEEESEEBTTSEEEEEEEEE
T ss_pred CeEEEEECCCccCCCCEEEEEEEE
Confidence 789999999999999999999883
No 19
>TIGR02745 ccoG_rdxA_fixG cytochrome c oxidase accessory protein FixG. Members of this ferredoxin-like protein family are found exclusively in species with an operon encoding the cbb3 type of cytochrome c oxidase (cco-cbb3), and near the cco-cbb3 operon in about half the cases. The cco-cbb3 is found in a variety of proteobacteria and almost nowhere else, and is associated with oxygen use under microaerobic conditions. Some (but not all) of these proteobacteria are also nitrogen-fixing, hence the gene symbol fixG. FixG was shown to be essential for functional cco-cbb3 expression in Bradyrhizobium japonicum.
Probab=56.81 E-value=1.1e+02 Score=34.08 Aligned_cols=74 Identities=11% Similarity=0.096 Sum_probs=47.9
Q ss_pred CCceEEEEEecCCCC--CCCC----EEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEE
Q psy7363 370 FNDIKFDFELRDDIV--IGSP----FSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIV 443 (552)
Q Consensus 370 ~~~v~~~i~~~~~~~--iG~D----f~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~ 443 (552)
.+++++++--...+. ..+| =.-++++.|++.++ ++.++.+.. ....++......++|+|++..+++
T Consensus 323 r~~~~~~v~r~r~~l~~~~~~g~i~N~Y~~~i~Nk~~~~-~~~~l~v~g-------~~~~~~~~~~~~i~v~~g~~~~~~ 394 (434)
T TIGR02745 323 REPMDLNVLRDRNLLYVRNSDGVVENTYTLKILNKTEQP-HEYYLSVLG-------LPGIKIEGPGAPIHVKAGEKVKLP 394 (434)
T ss_pred CCceEEEEEecCCcceEECCCCcEEEEEEEEEEECCCCC-EEEEEEEec-------CCCcEEEcCCceEEECCCCEEEEE
Confidence 356666665433322 1222 24578899999998 888887653 333333333238999999999999
Q ss_pred EEEehhhh
Q psy7363 444 LHVSYEEY 451 (552)
Q Consensus 444 l~I~y~eY 451 (552)
+.|.....
T Consensus 395 v~v~~~~~ 402 (434)
T TIGR02745 395 VFLRTPPD 402 (434)
T ss_pred EEEEechh
Confidence 98877643
No 20
>PF12690 BsuPI: Intracellular proteinase inhibitor; InterPro: IPR020481 BsuPI is an intracellular proteinase inhibitor that directly regulates the major intracellular proteinase (ISP-1) activity in vivo. It inhibits ISP-1 in the early stages of sporulation and then may be inactivated by a membrane-bound proteinase [].; PDB: 3ISY_A.
Probab=54.87 E-value=74 Score=27.31 Aligned_cols=63 Identities=14% Similarity=0.130 Sum_probs=32.1
Q ss_pred EEEEEEEEeCCCCCceEEEEEE--EEEEEeE--------cCccccceeeEeeEEEEcCCCEEEEEEEEehhhhh
Q psy7363 389 FSVVVKMNNKSRDQDYTVTVIL--RVDAVTY--------TGKVGDSVKKTKEDVVVKRGKSEEIVLHVSYEEYY 452 (552)
Q Consensus 389 f~l~v~l~N~s~~~~r~v~~~l--~a~~v~Y--------tG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y~eY~ 452 (552)
+.+.++++|.+++. .+++..- .+.-+.+ .=.-..-|-..-.+.+|.||++......++..++.
T Consensus 2 v~~~l~v~N~s~~~-v~l~f~sgq~~D~~v~d~~g~~vwrwS~~~~FtQal~~~~l~pGe~~~~~~~~~~~~~~ 74 (82)
T PF12690_consen 2 VEFTLTVTNNSDEP-VTLQFPSGQRYDFVVKDKEGKEVWRWSDGKMFTQALQEETLEPGESLTYEETWDLKDLS 74 (82)
T ss_dssp EEEEEEEEE-SSS--EEEEESSS--EEEEEE-TT--EEEETTTT-------EEEEE-TT-EEEEEEEESS----
T ss_pred EEEEEEEEeCCCCe-EEEEeCCCCEEEEEEECCCCCEEEEecCCchhhheeeEEEECCCCEEEEEEEECCCCCC
Confidence 56888899998887 6666421 2223333 22233444555678999999999999998876653
No 21
>KOG0909|consensus
Probab=54.53 E-value=2.3 Score=47.07 Aligned_cols=24 Identities=29% Similarity=0.589 Sum_probs=22.3
Q ss_pred cceeeeeeeccccceeeccCCCccee
Q psy7363 147 GQCWVFSGVLTTGETCRTLEAPNKDI 172 (552)
Q Consensus 147 GQCWVFAgV~~T~~vlR~LGIP~R~V 172 (552)
|-|-=.|-.+|- .|||||.-+|.|
T Consensus 221 GRCGEWANCFTl--lcralg~daR~i 244 (500)
T KOG0909|consen 221 GRCGEWANCFTL--LCRALGLDARYI 244 (500)
T ss_pred CccchHHHHHHH--HHHHcCCcceEE
Confidence 888888888899 999999999999
No 22
>PF03896 TRAP_alpha: Translocon-associated protein (TRAP), alpha subunit; InterPro: IPR005595 The alpha-subunit of the TRAP complex (TRAP alpha) is a single-spanning membrane protein of the endoplasmic reticulum (ER) which is found in proximity of nascent polypeptide chains translocating across the membrane [].; GO: 0005783 endoplasmic reticulum
Probab=52.60 E-value=2.3e+02 Score=30.09 Aligned_cols=99 Identities=16% Similarity=0.171 Sum_probs=63.0
Q ss_pred CceEEEEEec-CCCCCCCCEEEEEEEEeCCCCCceEEEEE-EEEEEEeEcCccccceeeEeeEEEEcCCCEEEEEEEEeh
Q psy7363 371 NDIKFDFELR-DDIVIGSPFSVVVKMNNKSRDQDYTVTVI-LRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHVSY 448 (552)
Q Consensus 371 ~~v~~~i~~~-~~~~iG~Df~l~v~l~N~s~~~~r~v~~~-l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y 448 (552)
.++.+-|... ...+-|++.++-|.++|+..+. -+|... -+.+.-..-...+..|........|+|+++.+++-....
T Consensus 82 adt~~~F~~~~~~l~aG~~~~~LvgftN~g~~~-~~V~~i~aSl~~p~d~~~~iqNfTa~~y~~~V~pg~~aT~~YsF~~ 160 (285)
T PF03896_consen 82 ADTTILFPKPTKKLPAGEPVKFLVGFTNKGSEP-FTVESIEASLRYPQDYSYYIQNFTAVRYNREVPPGEEATFPYSFTP 160 (285)
T ss_pred ceEEEEeccccccccCCCeEEEEEEEEeCCCCC-EEEEEEeeeecCccccceEEEeecccccCcccCCCCeEEEEEEEec
Confidence 4555555433 5677899999999999998876 554433 222222233444556667778899999999998888877
Q ss_pred hhhhccCCCCC-cEEEEEEEEEccCC
Q psy7363 449 EEYYKRLVDQA-DFNIACLATVHDTN 473 (552)
Q Consensus 449 ~eY~~~L~d~~-~i~v~a~a~v~et~ 473 (552)
++++. .+. -+.+.+..+..++.
T Consensus 161 ~~~l~---pr~f~L~i~l~y~d~~g~ 183 (285)
T PF03896_consen 161 SEELA---PRPFGLVINLIYEDSDGN 183 (285)
T ss_pred chhcC---CcceEEEEEEEEEeCCCC
Confidence 66553 222 34444554444433
No 23
>PF06159 DUF974: Protein of unknown function (DUF974); InterPro: IPR010378 This is a family of uncharacterised eukaryotic proteins.
Probab=51.50 E-value=2.7e+02 Score=28.55 Aligned_cols=65 Identities=18% Similarity=0.274 Sum_probs=38.4
Q ss_pred CCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEe----eEEEEcCCCEEEEEEEEe
Q psy7363 381 DDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTK----EDVVVKRGKSEEIVLHVS 447 (552)
Q Consensus 381 ~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~----~~v~L~p~e~~~v~l~I~ 447 (552)
..+-+||.|...|.+.|.+...-+ .+.+.|...+=+....-.+.... ....|.|++.....++-.
T Consensus 8 G~iylGEtF~~~l~~~N~s~~~v~--~v~ikvemqT~s~~~r~~L~~~~~~~~~~~~L~p~~~l~~iv~~~ 76 (249)
T PF06159_consen 8 GSIYLGETFSCYLSVNNDSNKPVR--NVRIKVEMQTPSQSLRLPLSDNENSDSPVASLAPGESLDFIVSHE 76 (249)
T ss_pred CCEeecCCEEEEEEeecCCCCceE--EeEEEEEEeCCCCCccccCCCCccccccccccCCCCeEeEEEEEE
Confidence 456689999999999998776633 44444444443431122222211 234688888766655544
No 24
>PF14016 DUF4232: Protein of unknown function (DUF4232)
Probab=46.54 E-value=1.2e+02 Score=27.43 Aligned_cols=74 Identities=19% Similarity=0.109 Sum_probs=46.3
Q ss_pred CceEEEEEecCCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccc-eee---EeeEEEEcCCCEEEEEEEE
Q psy7363 371 NDIKFDFELRDDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDS-VKK---TKEDVVVKRGKSEEIVLHV 446 (552)
Q Consensus 371 ~~v~~~i~~~~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~-~~~---~~~~v~L~p~e~~~v~l~I 446 (552)
.+|.+++...+ .-.|+. .+.|+++|.+... .++...=..+...=.|..... ..+ ....++|.|++.-...|..
T Consensus 4 ~~L~~~~~~~~-~~~g~~-~~~l~~tN~s~~~-C~l~G~P~v~~~~~~g~~~~~~~~~~~~~~~~vtL~PG~sA~a~l~~ 80 (131)
T PF14016_consen 4 ADLSVTVGPVD-AGAGQR-HATLTFTNTSDTP-CTLYGYPGVALVDADGAPLGVPAVREGPPPRPVTLAPGGSAYAGLRW 80 (131)
T ss_pred ccEEEEEeccc-CCCCcc-EEEEEEEECCCCc-EEeccCCcEEEECCCCCcCCccccccCCCCCcEEECCCCEEEEEEEE
Confidence 56777876433 356666 9999999998876 555555333333233331111 111 3467999999988877766
Q ss_pred e
Q psy7363 447 S 447 (552)
Q Consensus 447 ~ 447 (552)
.
T Consensus 81 ~ 81 (131)
T PF14016_consen 81 S 81 (131)
T ss_pred e
Confidence 4
No 25
>PF09674 DUF2400: Protein of unknown function (DUF2400); InterPro: IPR014127 Members of this uncharacterised protein family are found sporadically, so far only among spirochetes, epsilon and delta proteobacteria, and Bacteroides. The function is unknown and its gene neighbourhoods show little conservation.
Probab=44.37 E-value=7.7 Score=39.62 Aligned_cols=21 Identities=43% Similarity=0.896 Sum_probs=18.6
Q ss_pred eeecccCCCCCCCCCeEEEcCC
Q psy7363 232 VWMTRRDLGTTDFNGWQVIDAT 253 (552)
Q Consensus 232 ~WM~RpDL~~~gy~GWQvlDaT 253 (552)
=||.|+|=|. ..|+|+.||+.
T Consensus 153 RWMVR~d~~V-D~GlW~~i~ps 173 (232)
T PF09674_consen 153 RWMVRKDSPV-DFGLWSSIDPS 173 (232)
T ss_pred HhhccCCCCC-CCcCCCCCCHH
Confidence 3999999988 99999998864
No 26
>PF03168 LEA_2: Late embryogenesis abundant protein; InterPro: IPR004864 Different types of LEA proteins are expressed at different stages of late embryogenesis in higher plant seed embryos and under conditions of dehydration stress [, ]. The function of these proteins is unknown.; PDB: 3BUT_A 1XO8_A 1YYC_A.
Probab=41.18 E-value=2.1e+02 Score=23.71 Aligned_cols=52 Identities=13% Similarity=0.088 Sum_probs=37.3
Q ss_pred EEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEEEEEe
Q psy7363 392 VVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHVS 447 (552)
Q Consensus 392 ~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~ 447 (552)
++.++|... - .+.+.=..-.++|.|...+. ......+.++|+++..+.+.+.
T Consensus 1 ~l~v~NPN~-~--~i~~~~~~~~v~~~g~~v~~-~~~~~~~~i~~~~~~~v~~~v~ 52 (101)
T PF03168_consen 1 TLSVRNPNS-F--GIRYDSIEYDVYYNGQRVGT-GGSLPPFTIPARSSTTVPVPVS 52 (101)
T ss_dssp EEEEEESSS-S---EEEEEEEEEEEESSSEEEE-EEECE-EEESSSCEEEEEEEEE
T ss_pred CEEEECCCc-e--eEEEeCEEEEEEECCEEEEC-ccccCCeEECCCCcEEEEEEEE
Confidence 477889877 4 56666666778899988774 4555788999999887776654
No 27
>smart00769 WHy Water Stress and Hypersensitive response.
Probab=40.89 E-value=1e+02 Score=26.70 Aligned_cols=64 Identities=11% Similarity=-0.051 Sum_probs=44.0
Q ss_pred CCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEEEEEeh
Q psy7363 381 DDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHVSY 448 (552)
Q Consensus 381 ~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y 448 (552)
.--.....|.+.+.++|..+-+ +.+.=..-.++|+|....+-.. ....+++++.+..+.+.+.-
T Consensus 9 ~~~~~~~~~~l~l~v~NPN~~~---l~~~~~~y~l~~~g~~v~~g~~-~~~~~ipa~~~~~v~v~~~~ 72 (100)
T smart00769 9 PVSGLEIEIVLKVKVQNPNPFP---IPVNGLSYDLYLNGVELGSGEI-PDSGTLPGNGRTVLDVPVTV 72 (100)
T ss_pred cccceEEEEEEEEEEECCCCCc---cccccEEEEEEECCEEEEEEEc-CCCcEECCCCcEEEEEEEEe
Confidence 3334567888999999997765 5555556678999987766533 24588888887776555544
No 28
>PF05506 DUF756: Domain of unknown function (DUF756); InterPro: IPR008475 This domain is found, normally as a tandem repeat, at the C terminus of bacterial phospholipase C proteins.; GO: 0004629 phospholipase C activity, 0016042 lipid catabolic process
Probab=38.58 E-value=1.5e+02 Score=25.23 Aligned_cols=45 Identities=16% Similarity=0.212 Sum_probs=30.9
Q ss_pred EEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEEEEE
Q psy7363 390 SVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHV 446 (552)
Q Consensus 390 ~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I 446 (552)
.|.|+|.|..... +.+.+.. ..|.+ .....++|.|++..++.+.+
T Consensus 21 ~l~l~l~N~g~~~---~~~~v~~--~~y~~-------~~~~~~~v~ag~~~~~~w~l 65 (89)
T PF05506_consen 21 NLRLTLSNPGSAA---VTFTVYD--NAYGG-------GGPWTYTVAAGQTVSLTWPL 65 (89)
T ss_pred EEEEEEEeCCCCc---EEEEEEe--CCcCC-------CCCEEEEECCCCEEEEEEee
Confidence 8999999986655 3333333 22321 22378899999999988877
No 29
>PF06510 DUF1102: Protein of unknown function (DUF1102); InterPro: IPR009482 This family consists of several hypothetical archaeal proteins of unknown function.
Probab=38.14 E-value=1.6e+02 Score=28.57 Aligned_cols=75 Identities=12% Similarity=0.092 Sum_probs=43.3
Q ss_pred CCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEEEEEehhhhhccCCCCCcEEEEE
Q psy7363 386 GSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHVSYEEYYKRLVDQADFNIAC 465 (552)
Q Consensus 386 G~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y~eY~~~L~d~~~i~v~a 465 (552)
|.++.+.|+++ |+.. .|.+.-..-..+.+|.- ..-+++...|+|.|++...|-|.+.......- .-+.-|.|.|
T Consensus 67 ~~~~~IcV~I~--s~~~--~i~fy~~~~~~~~~~~~-sd~a~~~i~ftv~~ge~v~VGm~~~~tg~~lG-~~~~~~tI~A 140 (146)
T PF06510_consen 67 GADVPICVTIS--SSSD--SIEFYTGDYDSYITGPG-SDSARQSICFTVEPGESVKVGMIFDSTGDSLG-DYDGQITIKA 140 (146)
T ss_pred cCCceEEEEEe--cCCC--cEEEEecCCCccccCCc-cccccceEEEEecCCCeeEEEEEEecCCCCCc-ceeeEEEEEE
Confidence 56677777776 3332 44444333333333322 33356889999999999999999886644322 2233444444
Q ss_pred E
Q psy7363 466 L 466 (552)
Q Consensus 466 ~ 466 (552)
.
T Consensus 141 ~ 141 (146)
T PF06510_consen 141 Y 141 (146)
T ss_pred E
Confidence 3
No 30
>PF12735 Trs65: TRAPP trafficking subunit Trs65; InterPro: IPR024662 This family is one of the subunits of the TRAPP Golgi trafficking complex []. TRAPP subunits are found in two different sized complexes, TRAPP I and TRAPP II. While both complexes contain the same seven subunits, Bet3p, Bet5p, Trs20p, Trs23p, Trs31p, Trs33p and Trs85p, with TRAPPC human equivalents, TRAPP II has the additional three subunits, Trs65p, Trs120p and Trs130p []. While it has been implicated in cell wall biogenesis and stress response, the role of Trs65 in TRAPP II is supported by the findings that the protein co-localises with Trs130p, and deletion of TRS65 in yeast leads to a conditional lethal phenotype if either one of the other TRAPP II-specific subunits is modified []. Furthermore, the trs65 mutant has reduced Ypt31/32p guanine nucleotide exchange (GEF) activity []. Trs65 is also known as killer toxin-resistance protein 11.
Probab=37.19 E-value=64 Score=33.98 Aligned_cols=43 Identities=21% Similarity=0.394 Sum_probs=35.6
Q ss_pred CCCceEEEEEecCCCCCCCCEEEEEEEEeCCCCCceEEEEEEEE
Q psy7363 369 EFNDIKFDFELRDDIVIGSPFSVVVKMNNKSRDQDYTVTVILRV 412 (552)
Q Consensus 369 ~~~~v~~~i~~~~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a 412 (552)
...+|+|+|.....+..|+.|.+.|-+-|.|+.. +.+.+.+.-
T Consensus 154 ~~~gv~~sF~gp~~V~~Ge~F~w~v~ivN~S~~~-r~L~l~~~~ 196 (306)
T PF12735_consen 154 SLSGVTFSFSGPSSVKVGEPFSWKVFIVNRSSSP-RKLALYVPP 196 (306)
T ss_pred cccCeEEEEeCCceEecCCeEEEEEEEEECCCCC-eeEEEEecC
Confidence 3458999998777889999999999999999988 776665544
No 31
>PF01455 HupF_HypC: HupF/HypC family; InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process []. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation [, ].; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=37.18 E-value=9.6 Score=31.88 Aligned_cols=9 Identities=22% Similarity=0.505 Sum_probs=8.7
Q ss_pred ccCCCccee
Q psy7363 164 TLEAPNKDI 172 (552)
Q Consensus 164 ~LGIP~R~V 172 (552)
|||||+|+|
T Consensus 2 Cl~iP~~Vv 10 (68)
T PF01455_consen 2 CLAIPGRVV 10 (68)
T ss_dssp CECEEEEEE
T ss_pred cccccEEEE
Confidence 899999999
No 32
>TIGR00074 hypC_hupF hydrogenase assembly chaperone HypC/HupF. An additional proposed function is to shuttle the iron atom that has been liganded at the HypC/HypD complex to the precursor of the large hydrogenase (HycE) subunit. PubMed:12441107.
Probab=37.15 E-value=14 Score=31.75 Aligned_cols=9 Identities=11% Similarity=0.416 Sum_probs=8.5
Q ss_pred ccCCCccee
Q psy7363 164 TLEAPNKDI 172 (552)
Q Consensus 164 ~LGIP~R~V 172 (552)
|||||+|++
T Consensus 2 CLaiP~~V~ 10 (76)
T TIGR00074 2 CIAIPGQVV 10 (76)
T ss_pred ccccceEEE
Confidence 899999999
No 33
>PRK10409 hydrogenase assembly chaperone; Provisional
Probab=37.06 E-value=14 Score=32.78 Aligned_cols=9 Identities=11% Similarity=0.183 Sum_probs=8.6
Q ss_pred ccCCCccee
Q psy7363 164 TLEAPNKDI 172 (552)
Q Consensus 164 ~LGIP~R~V 172 (552)
|||||+|++
T Consensus 2 CLgiP~kVv 10 (90)
T PRK10409 2 CIGVPGQIR 10 (90)
T ss_pred ccccceEEE
Confidence 899999999
No 34
>PRK10413 hydrogenase 2 accessory protein HypG; Provisional
Probab=35.88 E-value=15 Score=32.03 Aligned_cols=9 Identities=11% Similarity=0.409 Sum_probs=8.5
Q ss_pred ccCCCccee
Q psy7363 164 TLEAPNKDI 172 (552)
Q Consensus 164 ~LGIP~R~V 172 (552)
|||||+|++
T Consensus 2 CLaiP~kVi 10 (82)
T PRK10413 2 CIGVPGQVL 10 (82)
T ss_pred ccccceEEE
Confidence 899999999
No 35
>PLN03080 Probable beta-xylosidase; Provisional
Probab=34.65 E-value=1.3e+02 Score=35.79 Aligned_cols=61 Identities=18% Similarity=0.190 Sum_probs=41.2
Q ss_pred CEEEEEEEEeCCCCCce-EEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEEEEEeh-hh
Q psy7363 388 PFSVVVKMNNKSRDQDY-TVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHVSY-EE 450 (552)
Q Consensus 388 Df~l~v~l~N~s~~~~r-~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y-~e 450 (552)
.+.|.++|+|......+ .|.+.+... ..-.+.+..+++- -..|.|+|+|+++|.++|+. ++
T Consensus 685 ~~~v~v~VtNtG~~~G~evvQlYv~~p-~~~~~~P~k~L~g-F~kv~L~~Ges~~V~~~l~~~~~ 747 (779)
T PLN03080 685 RFNVHISVSNVGEMDGSHVVMLFSRSP-PVVPGVPEKQLVG-FDRVHTASGRSTETEIVVDPCKH 747 (779)
T ss_pred eEEEEEEEEECCcccCcEEEEEEEecC-ccCCCCcchhccC-cEeEeeCCCCEEEEEEEeCchHH
Confidence 49999999999887633 344444332 1112456666633 34677999999999999986 44
No 36
>PRK15098 beta-D-glucoside glucohydrolase; Provisional
Probab=34.14 E-value=73 Score=37.74 Aligned_cols=66 Identities=14% Similarity=0.236 Sum_probs=43.6
Q ss_pred CCCCEEEEEEEEeCCCCCce-EEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEEEEEehhhhh
Q psy7363 385 IGSPFSVVVKMNNKSRDQDY-TVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHVSYEEYY 452 (552)
Q Consensus 385 iG~Df~l~v~l~N~s~~~~r-~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y~eY~ 452 (552)
-|+.++++++|+|..+...+ .|.+.+....-. ...+..+++ .-..+.|+|+++++|.++|+.++..
T Consensus 665 ~~~~i~v~v~V~NtG~~~G~EVvQlYv~~~~~~-~~~P~k~L~-gF~Kv~L~pGes~~V~~~l~~~~L~ 731 (765)
T PRK15098 665 RDGKVTASVTVTNTGKREGATVVQLYLQDVTAS-MSRPVKELK-GFEKIMLKPGETQTVSFPIDIEALK 731 (765)
T ss_pred CCCeEEEEEEEEECCCCCccEEEEEeccCCCCC-CCCHHHhcc-CceeEeECCCCeEEEEEeecHHHhc
Confidence 36789999999999987633 345554332111 123444443 3345689999999999999987643
No 37
>PF07919 Gryzun: Gryzun, putative trafficking through Golgi; InterPro: IPR012880 The proteins featured in this family are all hypothetical eukaryotic proteins of unknown function. The region in question is approximately 150 residues long.
Probab=33.51 E-value=5e+02 Score=28.79 Aligned_cols=95 Identities=12% Similarity=0.145 Sum_probs=56.1
Q ss_pred CCCceEEEE-EecCCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEE-------EeEcCcc----ccceeeEee------
Q psy7363 369 EFNDIKFDF-ELRDDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDA-------VTYTGKV----GDSVKKTKE------ 430 (552)
Q Consensus 369 ~~~~v~~~i-~~~~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~-------v~YtG~~----~~~~~~~~~------ 430 (552)
.++.+.+++ .......+||.+.+.+++.|..++. ..+.+.+.... ..=++.. ....+++..
T Consensus 171 ~pp~v~I~~~~~~~~~l~gE~~~i~i~I~n~e~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 249 (554)
T PF07919_consen 171 RPPKVSIKLPNHKPPALTGEFYPIPITISNNEDEE-ASGVLEVRLLHPSQLGVSSEETEDLSQVNWDSDKDDEPLFLGIP 249 (554)
T ss_pred CCCCeEEEeCCCCCCeEcCCEEEEEEEEEcCCCcc-ceeEEEEEEecccccccccccCccceecccccccccchhccCcc
Confidence 346788888 6667778899999999999999887 66555444330 0011111 011111111
Q ss_pred EEEEcCCCEEEEEEEEehhhhhccCCCCCcEEEEEEEEE
Q psy7363 431 DVVVKRGKSEEIVLHVSYEEYYKRLVDQADFNIACLATV 469 (552)
Q Consensus 431 ~v~L~p~e~~~v~l~I~y~eY~~~L~d~~~i~v~a~a~v 469 (552)
--.|.+++..++.+.|. .....+..|.+.+..+.
T Consensus 250 lg~l~~~~s~~~~l~i~-----~~~~~~~~L~i~~~Y~l 283 (554)
T PF07919_consen 250 LGELAPGSSITVTLYIR-----TSRPGEYELSISVSYHL 283 (554)
T ss_pred cccCCCCCcEEEEEEEE-----eCCceeEEEEEEEEEEE
Confidence 12356777777777777 34444455566666654
No 38
>COG0242 Def N-formylmethionyl-tRNA deformylase [Translation, ribosomal structure and biogenesis]
Probab=32.81 E-value=43 Score=32.85 Aligned_cols=54 Identities=20% Similarity=0.434 Sum_probs=33.6
Q ss_pred eeeccccceeeccCCCcceeee-------------cccCccccccccccccccccccCCCCCceee--------------
Q psy7363 153 SGVLTTGETCRTLEAPNKDIII-------------IFVNPESLEKNLVVGVCYAAAHDTQSSLTVD-------------- 205 (552)
Q Consensus 153 AgV~~T~~vlR~LGIP~R~V~~-------------~~~~~~~~~~~~~~~TNF~SAHDt~~nLtiD-------------- 205 (552)
.|++.+ =.||+-|++++ +|+||+=++. +=.---+.+|-|+|+
T Consensus 45 VGLAAp-----QIGi~kri~vi~~~~~~~~~~~~~vlINP~I~~~------~~~~~~~~EGCLSvP~~~~~V~R~~~I~V 113 (168)
T COG0242 45 VGLAAP-----QIGISKRIFVIDVEEDGRPKEPPLVLINPEIISK------SEETLTGEEGCLSVPGVRGEVERPERITV 113 (168)
T ss_pred eeeeeh-----hcCceeeEEEEEccCccCcCcCceEEECCEEeec------CCcccccCcceEeecCceeeeecccEEEE
Confidence 466666 57888888862 4556654333 111222567888888
Q ss_pred eeecccCccccC
Q psy7363 206 YFVDEDGRVMEE 217 (552)
Q Consensus 206 ~y~de~G~~l~~ 217 (552)
.|+|.+|++...
T Consensus 114 ~~~D~~G~~~~~ 125 (168)
T COG0242 114 KYLDRNGKPQEL 125 (168)
T ss_pred EEEcCCCCEEEE
Confidence 466888887653
No 39
>PF07703 A2M_N_2: Alpha-2-macroglobulin family N-terminal region; InterPro: IPR011625 This is a domain of the alpha-2-macroglobulin family. The alpha-macroglobulin (aM) family of proteins includes protease inhibitors [], typified by the human tetrameric a2-macroglobulin (a2M); they belong to the MEROPS proteinase inhibitor family I39, clan IL. These protease inhibitors share several defining properties, which include (i) the ability to inhibit proteases from all catalytic classes, (ii) the presence of a 'bait region' and a thiol ester, (iii) a similar protease inhibitory mechanism and (iv) the inactivation of the inhibitory capacity by reaction of the thiol ester with small primary amines. aM protease inhibitors inhibit by steric hindrance []. The mechanism involves protease cleavage of the bait region, a segment of the aM that is particularly susceptible to proteolytic cleavage, which initiates a conformational change such that the aM collapses about the protease. In the resulting aM-protease complex, the active site of the protease is sterically shielded, thus substantially decreasing access to protein substrates. Two additional events occur as a consequence of bait region cleavage, namely (i) the h-cysteinyl-g-glutamyl thiol ester becomes highly reactive and (ii) a major conformational change exposes a conserved COOH-terminal receptor binding domain [] (RBD). RBD exposure allows the aM protease complex to bind to clearance receptors and be removed from circulation []. Tetrameric, dimeric, and, more recently, monomeric aM protease inhibitors have been identified [, ].; PDB: 2QKI_D 3L3O_D 3NMS_A 2ICF_A 2A73_A 2ICE_D 2HR0_A 2A74_A 2XWJ_G 3OHX_A ....
Probab=28.60 E-value=4.2e+02 Score=23.47 Aligned_cols=108 Identities=14% Similarity=0.089 Sum_probs=63.6
Q ss_pred cCCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEEEEEehhhhhccCCCCC
Q psy7363 380 RDDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIVLHVSYEEYYKRLVDQA 459 (552)
Q Consensus 380 ~~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y~eY~~~L~d~~ 459 (552)
.+....|+.+.+.+...-. . ..+.+.+. .+..+... ..+.+..+ ...+.++++.+ + . .
T Consensus 7 ~~~~~~Ge~~~v~v~~~~~---~-~~~~~~v~---------s~g~I~~~-~~~~~~~~-~~~~~~~v~~~-~----~--P 64 (136)
T PF07703_consen 7 KDSYKPGETAKVTVQSPFP---N-GTFLYLVE---------SRGKIVST-GSVELKNG-STTFEFPVTPD-M----A--P 64 (136)
T ss_dssp SSSB-TTSEEEEEEEEESC---E-SEEEEEEE---------ETTEEEEE-EEEECTTT-SSEEEEEE-GG-G----T--S
T ss_pred CCCcCCCCEEEEEEEcCCC---c-cEEEEEEE---------ECCeEEEE-EEEEecCC-cEEEEEecchh-c----C--C
Confidence 3566789998888877665 1 22222222 12233221 23444443 33667777654 2 1 2
Q ss_pred cEEEEEEEEEcc-CCceEEEEEeEEEeC---CCeEEEe-CCcccccceeeEEEeec
Q psy7363 460 DFNIACLATVHD-TNFEYFAQDDFRVRK---PDIRLKQ-LMKRFDRGNCNGGLITP 510 (552)
Q Consensus 460 ~i~v~a~a~v~e-t~~~~~a~~d~~L~~---P~l~I~v-~g~~~V~k~~~~~~~~~ 510 (552)
.+++.|..- .+ .++.+.....|.+.. -+++|+. +.+.+-|+++.++|+.|
T Consensus 65 ~~~v~~~~v-~~~~g~~~~~s~~i~V~~~~~~~v~l~~~~~~~~Pg~~~~~~i~~~ 119 (136)
T PF07703_consen 65 NFYVLAYYV-RPADGEVVADSVWIEVEPCFELKVELTASPDEYKPGEEVTLRIKAP 119 (136)
T ss_dssp EEEEEEEEE-TTCTCEEEEEEEEEEBGCSGSSSEEEEESSSSBTTTSEEEEEEEES
T ss_pred cEEEEEEEE-cCCCCeEEEEEEEEEecccccceEEEEEecceeCCCCEEEEEEEeC
Confidence 455555543 33 566788888887776 3466776 77889999999999883
No 40
>PF06205 GT36_AF: Glycosyltransferase 36 associated family ; InterPro: IPR010403 This domain is found in the NvdB protein (P20471 from SWISSPROT), which is involved in the production of beta-(1-->2)-glucan.; PDB: 1V7V_A 1V7W_A 1V7X_A 3ACT_B 2CQT_A 3QFY_B 3QFZ_A 2CQS_A 3QG0_B 3AFJ_A ....
Probab=25.77 E-value=1.1e+02 Score=26.56 Aligned_cols=32 Identities=19% Similarity=0.249 Sum_probs=23.5
Q ss_pred EcCccccceeeEeeEEEEcCCCEEEEEEEEeh
Q psy7363 417 YTGKVGDSVKKTKEDVVVKRGKSEEIVLHVSY 448 (552)
Q Consensus 417 YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y 448 (552)
..|...++|...+..++|+|++++++.+-+-.
T Consensus 53 ~~g~~~Dpc~al~~~v~L~PGe~~~v~f~lG~ 84 (90)
T PF06205_consen 53 SEGAGLDPCAALQVRVTLEPGEEKEVVFLLGA 84 (90)
T ss_dssp B--BSS-EEEEEEEEEEE-TT-EEEEEEEEEE
T ss_pred CccCCcCeEEEEEEEEEECCCCEEEEEEEEEE
Confidence 45888899999999999999999999887654
No 41
>PF13584 BatD: Oxygen tolerance
Probab=25.25 E-value=9.3e+02 Score=26.43 Aligned_cols=134 Identities=12% Similarity=0.103 Sum_probs=66.8
Q ss_pred eEEEEEec-CCCCCCCCEEEEEEEEeCCCCCc----eEEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEEEE---E
Q psy7363 373 IKFDFELR-DDIVIGSPFSVVVKMNNKSRDQD----YTVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEEIV---L 444 (552)
Q Consensus 373 v~~~i~~~-~~~~iG~Df~l~v~l~N~s~~~~----r~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~v~---l 444 (552)
+.+...++ ..+.+|+.|.|++++.+.....+ ....+.-..++..++-..+..-......++|.|.+.-++. +
T Consensus 13 ~~v~a~vd~~~v~~ge~~~l~i~~~~~~~~~~~p~l~~f~v~~~~~s~~~~~inG~~~~~~~~~~~l~p~~~G~~~IP~~ 92 (484)
T PF13584_consen 13 VSVTASVDRNEVGLGETFQLTITINGDGDDPDLPELDGFEVLGPSQSSSTSIINGKVSSSTTYTYTLQPKKTGTFTIPPF 92 (484)
T ss_pred eEEEEEECCcEEcCCCEEEEEEEEecCcccCCCCCCCCeEEcceEEEEEEEEecCceEEEEEEEEEEEecccceEEEceE
Confidence 55555443 56788999999999988443210 1233322222222333333333667788888898755543 3
Q ss_pred EEehhhhhccCCCCCcEEEEEEEEEccCCce-EEEEEeEEEeCCCeEEEe-CCcccccceeeEEEeeccccc
Q psy7363 445 HVSYEEYYKRLVDQADFNIACLATVHDTNFE-YFAQDDFRVRKPDIRLKQ-LMKRFDRGNCNGGLITPTSKR 514 (552)
Q Consensus 445 ~I~y~eY~~~L~d~~~i~v~a~a~v~et~~~-~~a~~d~~L~~P~l~I~v-~g~~~V~k~~~~~~~~~~~~~ 514 (552)
+|.+ ..+...-.-|+|.+.......... ....+++. |.+++ ..++-+|+++.+++.+-+...
T Consensus 93 ~v~v---~Gk~~~S~pi~i~V~~~~~~~~~~~~~~~~~~~-----l~~~v~~~~~Yvge~v~lt~~ly~~~~ 156 (484)
T PF13584_consen 93 TVEV---DGKTYKSQPITIEVSKASQSPSQPPSNADDDVF-----LEAEVSKKSVYVGEPVILTLRLYTRNN 156 (484)
T ss_pred EEEE---CCEEEeecCEEEEEEecccCCccccccccccEE-----EEEEeCCCceecCCcEEEEEEEEEecC
Confidence 3332 222211233444444333221100 11111111 22222 256788899998888877654
No 42
>TIGR03079 CH4_NH3mon_ox_B methane monooxygenase/ammonia monooxygenase, subunit B. Both ammonia oxidizers such as Nitrosomonas europaea and methanotrophs (obligate methane oxidizers) such as Methylococcus capsulatus each can grow only on their own characteristic substrate. However, both groups have the ability to oxidize both substrates, and so the relevant enzymes must be named here according to their ability to oxidize both. The protein family represented here reflects subunit B of both the particulate methane monooxygenase of methylotrophs and the ammonia monooxygenase of nitrifying bacteria.
Probab=25.18 E-value=2.7e+02 Score=30.89 Aligned_cols=83 Identities=8% Similarity=0.108 Sum_probs=54.0
Q ss_pred CCCceEEEEEecCCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccce--------eeE----eeEEEEcC
Q psy7363 369 EFNDIKFDFELRDDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSV--------KKT----KEDVVVKR 436 (552)
Q Consensus 369 ~~~~v~~~i~~~~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~--------~~~----~~~v~L~p 436 (552)
+.+.|+.++....=-+-|...++.++++|++++. -.+--.-+|..-+-|-.-..+. .-+ +.+-.+.|
T Consensus 264 ~~~~V~~kv~~a~Y~VPGR~l~~~~~VTN~g~~~-vrlgEF~TA~vRFlN~~~v~~~~~~yP~~lla~GL~v~d~~pI~P 342 (399)
T TIGR03079 264 APNPVSINVTKANYDVPGRALRVTMEITNNGDQV-ISIGEFTTAGIRFMNANGVRVLDPDYPRELLAEGLEVDDQSAIAP 342 (399)
T ss_pred CCCceEEEEeccEEecCCcEEEEEEEEEcCCCCc-eEEEeEeecceEeeCcccccccCCCChHHHhhccceeCCCCCcCC
Confidence 4567888887666567799999999999999988 5554444444433332111111 111 22335889
Q ss_pred CCEEEEEEEEehhhhh
Q psy7363 437 GKSEEIVLHVSYEEYY 452 (552)
Q Consensus 437 ~e~~~v~l~I~y~eY~ 452 (552)
||++++.+...-....
T Consensus 343 GETr~v~v~aqdA~WE 358 (399)
T TIGR03079 343 GETVEVKMEAKDALWE 358 (399)
T ss_pred CcceEEEEEEehhhhH
Confidence 9999999888765543
No 43
>PF13199 Glyco_hydro_66: Glycosyl hydrolase family 66; PDB: 3VMO_A 3VMN_A 3VMP_A.
Probab=25.12 E-value=1.1e+02 Score=35.32 Aligned_cols=66 Identities=21% Similarity=0.293 Sum_probs=41.2
Q ss_pred CCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEeeEEEEcCCCEEE--EEEEEehhhhhccCCC
Q psy7363 385 IGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTKEDVVVKRGKSEE--IVLHVSYEEYYKRLVD 457 (552)
Q Consensus 385 iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~~~v~L~p~e~~~--v~l~I~y~eY~~~L~d 457 (552)
-|+.+.|.+.++|.+.+. .+.++.++.. +=| ..+.....++.|.+++... +.+.++.++|..+|++
T Consensus 9 PGe~V~l~~~~~~~~~~~-~~g~~~v~~~---~l~---~~V~~~~~~~~~~~~~~~~~~~~w~~P~~df~GYlv~ 76 (559)
T PF13199_consen 9 PGEKVTLTASLKNTTGSD-FSGTVKVRYY---HLN---EVVGETKQSVDLTSGASWTLTIEWTAPADDFTGYLVE 76 (559)
T ss_dssp TTS-EEEE-EEE--SSS--EEEEEEEEEE---ETT---EEEEEEEEEEEE-TT-EEE-TTSEEE-TTSSEEEEEE
T ss_pred CCCeEEEEEEeccCcccc-ccceEEEEEE---EcC---eEeeeeeeEEeecCCCcceeeEEEECCcccCeeEEEE
Confidence 499999999999998887 5655554432 222 4555667888888887754 5688999999888764
No 44
>PF07760 DUF1616: Protein of unknown function (DUF1616); InterPro: IPR011674 This is a group of sequences from hypothetical archaeal proteins. The region in question is approximately 330 amino acid residues long.
Probab=25.07 E-value=3.1e+02 Score=28.56 Aligned_cols=67 Identities=19% Similarity=0.243 Sum_probs=46.7
Q ss_pred CCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCc---cccceeeEeeEEEEcCCCEEEEEEEEeh
Q psy7363 381 DDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGK---VGDSVKKTKEDVVVKRGKSEEIVLHVSY 448 (552)
Q Consensus 381 ~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~---~~~~~~~~~~~v~L~p~e~~~v~l~I~y 448 (552)
.....|++.++.+-+.|...+. -+-++.+..+...+++. ......-+...++|.+|++.+.++++.+
T Consensus 185 t~l~~ge~~~v~vgI~NhE~~~-~~Ytv~v~l~~~~~~~~~~~~~~~~~l~~~~~~L~~n~t~~~~~~~~~ 254 (287)
T PF07760_consen 185 TNLTSGEPGTVIVGIENHEGRP-ENYTVVVVLQNVTWNPNNYNVMESTVLDRPIVTLADNETWEQPYKFTP 254 (287)
T ss_pred eeEEcCCcEEEEEEEEcCCCCc-EEEEEEEEEeccccccccccccchhcccceEEEeCCCCeEEEEEEEEE
Confidence 4456899999999999998766 56666666565555522 2222223344558999999999999987
No 45
>PF02752 Arrestin_C: Arrestin (or S-antigen), C-terminal domain; InterPro: IPR011022 G protein-coupled receptors are a large family of signalling molecules that respond to a wide variety of extracellular stimuli. The receptors relay the information encoded by the ligand through the activation of heterotrimeric G proteins and intracellular effector molecules. To ensure the appropriate regulation of the signalling cascade, it is vital to properly inactivate the receptor. This inactivation is achieved, in part, by the binding of a soluble protein, arrestin, which uncouples the receptor from the downstream G protein after the receptors are phosphorylated by G protein-coupled receptor kinases. In addition to the inactivation of G protein-coupled receptors, arrestins have also been implicated in the endocytosis of receptors and cross talk with other signalling pathways. Arrestin (retinal S-antigen) is a major protein of the retinal rod outer segments. It interacts with photo-activated phosphorylated rhodopsin, inhibiting or 'arresting' its ability to interact with transducin. The protein binds calcium, and shows similarity in its C terminus to alpha-transducin and other purine nucleotide-binding proteins. In mammals, arrestin is associated with autoimmune uveitis. Arrestins comprise a family of closely-related proteins that includes beta-arrestin-1 and -2, which regulate the function of beta-adrenergic receptors by binding to their phosphorylated forms, impairing their capacity to activate G(S) proteins; cone photoreceptors C-arrestin (arrestin-X), which could bind to phosphorylated red/green opsins; and Drosophila phosrestins I and II, which undergo light-induced phosphorylation, and probably play a role in photoreceptor transduction. The crystal structure of bovine retinal arrestin comprises two domains of antiparallel beta-sheets connected through a hinge region and one short alpha-helix on the back of the amino-terminal fold. The binding region for phosphorylated light-activated rhodopsin is located at the N-terminal domain, as indicated by the docking of the photoreceptor to the three-dimensional structure of arrestin. The C-terminal domain consists of an immunoglobulin-like beta-sandwich structure. This entry represents proteins with immunoglobulin-like domains that are similar to those found in arrestin.; PDB: 1SUJ_A 3UGX_A 1CF1_B 1AYR_A 3UGU_A 3P2D_B 1ZSH_A 2WTR_B 3GC3_A 1G4R_A ....
Probab=24.96 E-value=4.5e+02 Score=22.61 Aligned_cols=48 Identities=25% Similarity=0.370 Sum_probs=28.8
Q ss_pred ceEEEEEecCC-CCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCc
Q psy7363 372 DIKFDFELRDD-IVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGK 420 (552)
Q Consensus 372 ~v~~~i~~~~~-~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~ 420 (552)
.+.+++..... ...|+.+.+.+.+.|.+...-+.+++.|.=.. .|...
T Consensus 4 ~i~~~~~i~~~~~~~Ge~i~v~v~i~n~s~~~i~~I~v~L~~~~-~~~~~ 52 (136)
T PF02752_consen 4 KISLSISIPRTAYVPGETIPVNVEIDNQSKKKIKKIKVSLVERI-TYKAK 52 (136)
T ss_dssp EEEEEEEES-SEEETT--EEEEEEEEE-SSSEEEEEEEEEEEEE-EE-SS
T ss_pred EEEEEEEECCCEECCCCEEEEEEEEEECCCCEEEEEEEEEEEEE-EEEEe
Confidence 35666665544 36899999999999988865455666665444 44444
No 46
>PF07070 Spo0M: Spo0M protein; InterPro: IPR009776 This family consists of several bacterial Spo0M proteins which are thought to control sporulation in Bacillus subtilis. Spo0M exerts certain negative effects on sporulation and its gene expression is controlled by sigmaH.
Probab=24.54 E-value=6.5e+02 Score=25.70 Aligned_cols=82 Identities=12% Similarity=0.071 Sum_probs=53.0
Q ss_pred eEEEEEecCCCCCCCCEEEEEEEEeCCCCCce--EEEEEEEEEEEeEcCc----cccce--eeEeeEEEEcCCCEEEEEE
Q psy7363 373 IKFDFELRDDIVIGSPFSVVVKMNNKSRDQDY--TVTVILRVDAVTYTGK----VGDSV--KKTKEDVVVKRGKSEEIVL 444 (552)
Q Consensus 373 v~~~i~~~~~~~iG~Df~l~v~l~N~s~~~~r--~v~~~l~a~~v~YtG~----~~~~~--~~~~~~v~L~p~e~~~v~l 444 (552)
|...+. ......|+.++=.|.++-.+.+. + .|.+.|.+....-.+- ....+ ++-...++|+|++++++|+
T Consensus 15 VDT~L~-~~~~~pGe~v~G~V~i~GG~v~Q-~I~~I~l~L~t~~~~e~~d~~~~~~~~~~~~~v~~~f~I~~ge~~~iPF 92 (218)
T PF07070_consen 15 VDTVLE-KPSVRPGETVRGEVHIKGGSVDQ-EIDRIYLELVTRYEVESDDKEYTQEVELARVRVSGPFTIEPGEEKEIPF 92 (218)
T ss_pred EEEEEC-CCCccCCCEEEEEEEEEeCCcce-EEeEEEEEEEEEEEEecCCCeEEEEEEEEEEEeCCCEEECCCCEEEEeE
Confidence 444443 35667899999999999887766 3 4555566555443331 12223 2445779999999999999
Q ss_pred EEehhhhhccCC
Q psy7363 445 HVSYEEYYKRLV 456 (552)
Q Consensus 445 ~I~y~eY~~~L~ 456 (552)
.++--...+-..
T Consensus 93 ~~~lP~etPiT~ 104 (218)
T PF07070_consen 93 SFPLPWETPITE 104 (218)
T ss_pred EEECCCCCCccC
Confidence 887544444333
No 47
>COG1470 Predicted membrane protein [Function unknown]
Probab=24.37 E-value=1.1e+03 Score=27.09 Aligned_cols=73 Identities=15% Similarity=0.261 Sum_probs=48.3
Q ss_pred CCCceEEEEEecCCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCccccceeeEe--eEEEEcCCCEEEEEEEE
Q psy7363 369 EFNDIKFDFELRDDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGKVGDSVKKTK--EDVVVKRGKSEEIVLHV 446 (552)
Q Consensus 369 ~~~~v~~~i~~~~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~~~~~~~~~~--~~v~L~p~e~~~v~l~I 446 (552)
+..|+.+++.......-| +-++.|++.-.|.-. +.+-+.+. -+|.....+.-+. ..+++.||+++.+++.|
T Consensus 333 E~kdvtleV~ps~na~pG-~Ynv~I~A~s~s~v~-~e~~lki~-----~~g~~~~~v~l~~g~~~lt~taGee~~i~i~I 405 (513)
T COG1470 333 EEKDVTLEVYPSLNATPG-TYNVTITASSSSGVT-RELPLKIK-----NTGSYNELVKLDNGPYRLTITAGEEKTIRISI 405 (513)
T ss_pred CceEEEEEEecCCCCCCC-ceeEEEEEeccccce-eeeeEEEE-----eccccceeEEccCCcEEEEecCCccceEEEEE
Confidence 456788888766665554 566776666655333 33333322 3566666676555 89999999999999988
Q ss_pred eh
Q psy7363 447 SY 448 (552)
Q Consensus 447 ~y 448 (552)
.=
T Consensus 406 ~N 407 (513)
T COG1470 406 EN 407 (513)
T ss_pred Ee
Confidence 63
No 48
>PF05753 TRAP_beta: Translocon-associated protein beta (TRAPB); InterPro: IPR008856 This family consists of several eukaryotic translocon-associated protein beta (TRAPB) or signal sequence receptor beta subunit (SSR-beta) proteins. The normal translocation of nascent polypeptides into the lumen of the endoplasmic reticulum (ER) is thought to be aided in part by a translocon-associated protein (TRAP) complex consisting of 4 protein subunits. The association of mature proteins with the ER and Golgi, or other intracellular locales, such as lysosomes, depends on the initial targeting of the nascent polypeptide to the ER membrane. A similar scenario must also exist for proteins destined for secretion.; GO: 0005783 endoplasmic reticulum, 0016021 integral to membrane
Probab=24.27 E-value=4.5e+02 Score=25.90 Aligned_cols=79 Identities=14% Similarity=0.004 Sum_probs=46.6
Q ss_pred ecCCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEe-EcCccccceeeEeeEEEEcCCCEEEEEEEEehhhhhccCCC
Q psy7363 379 LRDDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVT-YTGKVGDSVKKTKEDVVVKRGKSEEIVLHVSYEEYYKRLVD 457 (552)
Q Consensus 379 ~~~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~-YtG~~~~~~~~~~~~v~L~p~e~~~v~l~I~y~eY~~~L~d 457 (552)
+.+.++-|+++.|.+++.|..+.. -.++.+..+... -.-.........+. =+|+|++..+..+.|.+...+..-..
T Consensus 30 l~~~~v~g~~v~V~~~iyN~G~~~--A~dV~l~D~~fp~~~F~lvsG~~s~~~-~~i~pg~~vsh~~vv~p~~~G~f~~~ 106 (181)
T PF05753_consen 30 LNKYLVEGEDVTVTYTIYNVGSSA--AYDVKLTDDSFPPEDFELVSGSLSASW-ERIPPGENVSHSYVVRPKKSGYFNFT 106 (181)
T ss_pred ccccccCCcEEEEEEEEEECCCCe--EEEEEEECCCCCccccEeccCceEEEE-EEECCCCeEEEEEEEeeeeeEEEEcc
Confidence 345556699999999999999887 334444332111 00011112222222 26889888888888877766665554
Q ss_pred CCc
Q psy7363 458 QAD 460 (552)
Q Consensus 458 ~~~ 460 (552)
...
T Consensus 107 ~a~ 109 (181)
T PF05753_consen 107 PAV 109 (181)
T ss_pred CEE
Confidence 433
No 49
>PF11931 DUF3449: Domain of unknown function (DUF3449); InterPro: IPR024598 This presumed domain is functionally uncharacterised. It has two conserved sequence motifs: PIP and CEICG and contains a zinc-finger of the C2H2-type.; PDB: 4DGW_A.
Probab=23.36 E-value=27 Score=35.06 Aligned_cols=9 Identities=44% Similarity=0.571 Sum_probs=0.0
Q ss_pred eeeccCCCc
Q psy7363 161 TCRTLEAPN 169 (552)
Q Consensus 161 vlR~LGIP~ 169 (552)
=|||||||-
T Consensus 130 GlrcLGI~n 138 (196)
T PF11931_consen 130 GLRCLGIPN 138 (196)
T ss_dssp ---------
T ss_pred cChhcCCCC
Confidence 699999993
No 50
>PF03423 CBM_25: Carbohydrate binding domain (family 25); InterPro: IPR005085 A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. For a detailed review on the structure and binding modes of CBMs see []. This entry represents CBM25 from CAZY which has a starch-binding function, as has been demonstrated in one case.; PDB: 2LAB_A 2C3X_B 2C3V_A 2C3W_C 2LAA_A.
Probab=22.90 E-value=47 Score=28.65 Aligned_cols=23 Identities=13% Similarity=0.262 Sum_probs=11.2
Q ss_pred eeeeeecccCccccCCCCCceeeeee
Q psy7363 203 TVDYFVDEDGRVMEELNSDSIWNFHM 228 (552)
Q Consensus 203 tiD~y~de~G~~l~~~~~DSIWNFHV 228 (552)
+|.+||+.....+. ..+.|| +|+
T Consensus 3 ~vtVyYn~~~~~l~--g~~~v~-~~~ 25 (87)
T PF03423_consen 3 TVTVYYNPSLTALS--GAPNVH-LHG 25 (87)
T ss_dssp EEEEEE---E-SSS---S-EEE-EEE
T ss_pred EEEEEEEeCCCCCC--CCCcEE-EEe
Confidence 57789987655553 367776 555
No 51
>PF14310 Fn3-like: Fibronectin type III-like domain; PDB: 3ABZ_D 3AC0_D 2X40_A 2X41_A 2X42_A.
Probab=21.68 E-value=1.2e+02 Score=24.72 Aligned_cols=29 Identities=17% Similarity=0.121 Sum_probs=20.8
Q ss_pred EeeEEEEcCCCEEEEEEEEehhhhhccCC
Q psy7363 428 TKEDVVVKRGKSEEIVLHVSYEEYYKRLV 456 (552)
Q Consensus 428 ~~~~v~L~p~e~~~v~l~I~y~eY~~~L~ 456 (552)
.-..+.|+||+++++.+.|+.++..-.-.
T Consensus 23 gF~rv~l~pGes~~v~~~l~~~~l~~~d~ 51 (71)
T PF14310_consen 23 GFERVSLAPGESKTVSFTLPPEDLAYWDE 51 (71)
T ss_dssp EEEEEEE-TT-EEEEEEEEEHHHHEEEET
T ss_pred ceEEEEECCCCEEEEEEEECHHHEeeEcC
Confidence 33567799999999999999987654443
No 52
>PF04744 Monooxygenase_B: Monooxygenase subunit B protein; InterPro: IPR006833 Ammonia monooxygenase and the particulate methane monooxygenase are both integral membrane proteins, occurring in ammonia oxidisers and methanotrophs respectively, which are thought to be evolutionarily related. These enzymes have a relatively wide substrate specificity and can catalyse the oxidation of a range of substrates including ammonia, methane, halogenated hydrocarbons and aromatic molecules. These enzymes are composed of 3 subunits - A (IPR003393 from INTERPRO), B (IPR006833 from INTERPRO) and C (IPR006980 from INTERPRO) - and contain various metal centres, including copper. Particulate methane monooxygenase from Methylococcus capsulatus str. Bath is an ABC homotrimer, which contains mononuclear and dinuclear copper metal centres, and a third metal centre containing a metal ion whose identity in vivo is not certain. The soluble regions of these enzymes derive primarily from the B subunit. This subunit forms two antiparallel beta-barrel-like structures and contains the mono- and di- nuclear copper metal centres.; PDB: 3CHX_E 3RFR_A 3RGB_A 1YEW_A.
Probab=21.13 E-value=1.8e+02 Score=32.10 Aligned_cols=109 Identities=10% Similarity=0.160 Sum_probs=61.8
Q ss_pred CCCceEEEEEecCCCCCCCCEEEEEEEEeCCCCCceEEEEEEEEEEEeEcCc--------cccceee----EeeEEEEcC
Q psy7363 369 EFNDIKFDFELRDDIVIGSPFSVVVKMNNKSRDQDYTVTVILRVDAVTYTGK--------VGDSVKK----TKEDVVVKR 436 (552)
Q Consensus 369 ~~~~v~~~i~~~~~~~iG~Df~l~v~l~N~s~~~~r~v~~~l~a~~v~YtG~--------~~~~~~~----~~~~v~L~p 436 (552)
+...|+.+++...=-+-|..+.++++++|+++++ -.+.-..+|..-+-|-. +...+.. -+-+-.+.|
T Consensus 245 ~~~~V~~~v~~A~Y~vpgR~l~~~l~VtN~g~~p-v~LgeF~tA~vrFln~~v~~~~~~~P~~l~A~~gL~vs~~~pI~P 323 (381)
T PF04744_consen 245 PPNSVKVKVTDATYRVPGRTLTMTLTVTNNGDSP-VRLGEFNTANVRFLNPDVPTDDPDYPDELLAERGLSVSDNSPIAP 323 (381)
T ss_dssp S-SSEEEEEEEEEEESSSSEEEEEEEEEEESSS--BEEEEEESSS-EEE-TTT-SS-S---TTTEETT-EEES--S-B-T
T ss_pred CCCceEEEEeccEEecCCcEEEEEEEEEcCCCCc-eEeeeEEeccEEEeCcccccCCCCCchhhhccCcceeCCCCCcCC
Confidence 3345777777655556799999999999999998 55554444433333222 2222222 233446789
Q ss_pred CCEEEEEEEEehhhhh----ccCCCCCcEEEEEEEEEcc-CCceEEE
Q psy7363 437 GKSEEIVLHVSYEEYY----KRLVDQADFNIACLATVHD-TNFEYFA 478 (552)
Q Consensus 437 ~e~~~v~l~I~y~eY~----~~L~d~~~i~v~a~a~v~e-t~~~~~a 478 (552)
||++++.+.+.-+... ..|.-+.-.|+..+.--.+ +++.++.
T Consensus 324 GETrtl~V~a~dA~WeveRL~~l~~D~dsrfgGLLff~d~~G~r~i~ 370 (381)
T PF04744_consen 324 GETRTLTVEAQDAAWEVERLSDLIYDPDSRFGGLLFFFDASGNRYIS 370 (381)
T ss_dssp T-EEEEEEEEE-HHHHHTTGGGGGGSSS-EEEEEEEEEETTS-EEEE
T ss_pred CceEEEEEEeehhHHHHhhhhhhhcCcccceeEEEEEEcCCCCEEEE
Confidence 9999999999655443 2344678889998876653 3334443
No 53
>cd00487 Pep_deformylase Polypeptide or peptide deformylase; a family of metalloenzymes that catalyzes the removal of the N-terminal formyl group in a growing polypeptide chain following translation initiation during protein synthesis in prokaryotes. These enzymes utilize Fe(II) as the catalytic metal ion, which can be replaced with a nickel or cobalt ion with no loss of activity. There are two types of peptide deformylases, types I and II, which differ in structure only in the outer surface of the domain. Because these enzymes are essential only in prokaryotes (although eukaryotic gene sequences have been found), they are a target for a new class of antibacterial agents.
Probab=20.74 E-value=89 Score=29.32 Aligned_cols=48 Identities=23% Similarity=0.476 Sum_probs=27.2
Q ss_pred ccCCCcceee------------ecccCccccccccccccccccccCCCCCceee--------------eeecccCccccC
Q psy7363 164 TLEAPNKDII------------IIFVNPESLEKNLVVGVCYAAAHDTQSSLTVD--------------YFVDEDGRVMEE 217 (552)
Q Consensus 164 ~LGIP~R~V~------------~~~~~~~~~~~~~~~~TNF~SAHDt~~nLtiD--------------~y~de~G~~l~~ 217 (552)
=+|+|-|+++ ++|+||+=+.+ .=.-..+.++-|++. .|+|.+|+..+.
T Consensus 46 QIG~~~ri~vv~~~~~~~~~~~~v~INP~I~~~------s~~~~~~~EgCLS~pg~~~~V~R~~~I~v~~~d~~G~~~~~ 119 (141)
T cd00487 46 QIGVSKRIFVIDVPDEENKEPPLVLINPEIIES------SGETEYGEEGCLSVPGYRGEVERPKKVTVRYLDEDGNPIEL 119 (141)
T ss_pred hcCCceeEEEEEcccccccccceEEECCeEecc------CCCEeeCCcCCcCcCCcceEecCcCEEEEEEECCCCCEEEE
Confidence 4678888775 34556644321 111223356777766 467888877653
Done!