Query psy1396
Match_columns 66
No_of_seqs 81 out of 111
Neff 3.8
Searched_HMMs 46136
Date Fri Aug 16 15:57:48 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy1396.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/1396hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF07718 Coatamer_beta_C: Coat 100.0 8.4E-35 1.8E-39 198.1 8.1 62 5-66 74-135 (140)
2 KOG1058|consensus 99.9 2.6E-28 5.7E-33 199.5 5.8 62 5-66 739-800 (948)
3 COG5096 Vesicle coat complex, 98.5 1.9E-08 4.1E-13 82.7 0.6 61 5-66 587-647 (757)
4 PF11614 FixG_C: IG-like fold 95.6 0.092 2E-06 33.0 6.6 53 5-58 36-88 (118)
5 PF06030 DUF916: Bacterial pro 94.9 0.036 7.8E-07 36.3 3.3 36 31-66 80-116 (121)
6 PF10633 NPCBM_assoc: NPCBM-as 94.5 0.17 3.6E-06 29.8 5.3 56 5-60 10-67 (78)
7 PF06280 DUF1034: Fn3-like dom 93.9 0.11 2.5E-06 32.3 3.8 37 30-66 56-96 (112)
8 PF00927 Transglut_C: Transglu 91.9 0.82 1.8E-05 28.1 5.6 49 5-53 20-75 (107)
9 smart00809 Alpha_adaptinC2 Ada 91.2 1.4 3.1E-05 26.4 6.1 49 5-55 23-72 (104)
10 PF14796 AP3B1_C: Clathrin-ada 88.8 1.3 2.9E-05 30.4 5.0 50 5-54 90-140 (145)
11 COG1470 Predicted membrane pro 87.6 1.3 2.8E-05 36.1 5.0 50 5-55 402-453 (513)
12 PF02883 Alpha_adaptinC2: Adap 86.8 4 8.6E-05 25.1 5.9 50 5-54 29-79 (115)
13 PF07705 CARDB: CARDB; InterP 86.6 3.9 8.4E-05 23.5 5.5 48 6-55 25-72 (101)
14 TIGR02745 ccoG_rdxA_fixG cytoc 85.6 4.8 0.0001 31.6 7.1 50 5-55 351-400 (434)
15 PF09478 CBM49: Carbohydrate b 67.4 14 0.00031 22.0 4.0 46 3-48 20-75 (80)
16 COG1837 Predicted RNA-binding 66.8 4.9 0.00011 25.0 1.9 34 31-64 12-46 (76)
17 PF08752 COP-gamma_platf: Coat 66.3 34 0.00074 23.5 6.2 49 6-54 54-104 (151)
18 PRK00468 hypothetical protein; 63.9 8.7 0.00019 23.5 2.6 34 31-65 12-47 (75)
19 PF03168 LEA_2: Late embryogen 60.5 30 0.00066 19.8 5.8 45 13-57 11-56 (101)
20 PRK01064 hypothetical protein; 60.3 9.1 0.0002 23.7 2.3 34 31-65 12-47 (78)
21 smart00769 WHy Water Stress an 56.1 43 0.00094 20.3 5.7 43 12-55 29-73 (100)
22 PRK02821 hypothetical protein; 51.9 9.4 0.0002 23.6 1.3 33 31-64 13-47 (77)
23 PF00345 PapD_N: Pili and flag 51.9 23 0.00049 21.9 3.1 50 5-56 19-75 (122)
24 TIGR00158 L9 ribosomal protein 48.8 17 0.00036 24.7 2.2 19 48-66 75-93 (148)
25 PHA01707 dut 2'-deoxyuridine 5 48.5 82 0.0018 21.2 7.1 61 4-64 15-83 (158)
26 PF03948 Ribosomal_L9_C: Ribos 45.5 7.1 0.00015 24.1 0.0 18 49-66 15-32 (87)
27 cd07557 trimeric_dUTPase Trime 45.1 63 0.0014 18.9 4.5 29 36-64 12-42 (92)
28 PF10435 BetaGal_dom2: Beta-ga 44.9 99 0.0021 21.5 5.7 51 5-55 33-87 (183)
29 cd07706 IgV_TCR_delta Immunogl 44.1 50 0.0011 19.9 3.7 26 31-56 2-27 (116)
30 PF14315 DUF4380: Domain of un 42.1 49 0.0011 23.8 3.9 34 17-50 234-268 (274)
31 CHL00160 rpl9 ribosomal protei 41.3 24 0.00051 24.2 2.1 17 50-66 83-99 (153)
32 cd09020 D-hex-6-P-epi_like D-h 40.6 46 0.001 23.6 3.6 27 19-53 243-269 (269)
33 PF09624 DUF2393: Protein of u 40.4 85 0.0018 20.3 4.6 47 6-52 68-131 (149)
34 PF14221 DUF4330: Domain of un 39.3 1.1E+02 0.0025 20.8 5.2 19 47-65 123-141 (168)
35 cd05754 Ig3_Perlecan_like Thir 38.8 50 0.0011 18.7 3.0 23 31-53 4-26 (85)
36 cd07693 Ig1_Robo First immunog 38.0 52 0.0011 18.4 2.9 22 31-52 4-25 (100)
37 PF05906 DUF865: Herpesvirus-7 36.2 22 0.00047 19.4 1.0 25 32-64 8-32 (35)
38 PF14016 DUF4232: Protein of u 34.9 56 0.0012 20.7 3.0 18 34-51 62-79 (131)
39 PF05506 DUF756: Domain of unk 34.7 1E+02 0.0022 18.3 4.9 20 33-52 46-65 (89)
40 cd05747 Ig5_Titin_like M5, fif 34.4 61 0.0013 18.7 2.9 23 30-52 5-27 (92)
41 TIGR00576 dut deoxyuridine 5'- 32.5 90 0.0019 20.5 3.8 30 35-64 28-59 (141)
42 PHA03126 dUTPase; Provisional 32.0 67 0.0015 25.1 3.5 30 35-64 185-217 (326)
43 PLN02808 alpha-galactosidase 31.3 57 0.0012 25.5 3.0 45 6-51 324-384 (386)
44 PF00635 Motile_Sperm: MSP (Ma 31.2 1.2E+02 0.0025 17.9 5.7 47 5-54 23-69 (109)
45 TIGR03000 plancto_dom_1 Planct 30.7 74 0.0016 19.9 2.9 28 19-48 46-73 (75)
46 PRK00137 rplI 50S ribosomal pr 29.6 48 0.001 22.3 2.1 18 49-66 76-93 (147)
47 cd02980 TRX_Fd_family Thioredo 28.9 1E+02 0.0022 17.2 3.1 25 20-44 35-59 (77)
48 PF07610 DUF1573: Protein of u 28.2 75 0.0016 17.0 2.4 43 6-51 2-44 (45)
49 PF13598 DUF4139: Domain of un 27.8 55 0.0012 23.3 2.3 23 5-27 32-54 (317)
50 PRK15308 putative fimbrial pro 27.6 72 0.0016 23.3 2.8 46 5-51 36-97 (234)
51 PF12790 T6SS-SciN: Type VI se 27.5 80 0.0017 20.4 2.8 30 36-65 81-110 (142)
52 TIGR02274 dCTP_deam deoxycytid 27.2 2E+02 0.0044 19.4 5.8 62 3-64 13-99 (179)
53 PF14874 PapD-like: Flagellar- 26.1 1.5E+02 0.0032 17.4 6.6 56 5-62 25-80 (102)
54 PF00692 dUTPase: dUTPase; In 25.7 1.2E+02 0.0026 19.0 3.3 25 34-58 20-44 (129)
55 cd04968 Ig3_Contactin_like Thi 24.6 1.3E+02 0.0029 16.9 3.2 23 30-52 3-25 (88)
56 PF07679 I-set: Immunoglobulin 24.5 1.4E+02 0.003 16.5 3.6 24 30-53 2-25 (90)
57 PF01002 Flavi_NS2B: Flaviviru 23.7 60 0.0013 21.8 1.7 21 17-37 70-90 (128)
58 PF10648 Gmad2: Immunoglobulin 23.5 1.9E+02 0.0041 17.8 4.8 46 18-63 30-77 (88)
59 PHA02703 ORF007 dUTPase; Provi 23.4 1.5E+02 0.0032 20.4 3.7 30 35-64 41-72 (165)
60 PF14310 Fn3-like: Fibronectin 23.0 1.3E+02 0.0029 17.1 2.9 18 37-54 26-43 (71)
61 COG4454 Uncharacterized copper 22.9 73 0.0016 22.5 2.1 24 27-51 106-129 (158)
62 PLN02547 dUTP pyrophosphatase 22.8 1.9E+02 0.0041 19.6 4.1 22 36-57 45-66 (157)
63 cd04981 IgV_H Immunoglobulin ( 22.7 1.7E+02 0.0037 18.1 3.6 23 34-56 3-25 (117)
64 PF11906 DUF3426: Protein of u 22.5 2.2E+02 0.0047 18.1 5.3 50 6-55 74-137 (149)
65 PRK13956 dut deoxyuridine 5'-t 22.5 1.3E+02 0.0028 20.4 3.2 27 28-57 30-56 (147)
66 PHA03094 dUTPase; Provisional 21.6 1.2E+02 0.0025 20.2 2.8 23 35-57 33-55 (144)
67 cd04972 Ig_TrkABC_d4 Fourth do 20.8 1.7E+02 0.0037 16.7 3.1 22 32-53 4-25 (90)
68 PTZ00143 deoxyuridine 5'-triph 20.5 1.7E+02 0.0036 20.1 3.4 22 35-56 34-55 (155)
69 PF14168 YjzC: YjzC-like prote 20.4 1.2E+02 0.0025 18.0 2.3 26 18-45 17-42 (57)
No 1
>PF07718 Coatamer_beta_C: Coatomer beta C-terminal region; InterPro: IPR011710 Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer []. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins []. For example, the coatomer COP1 (coat protein complex 1) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi []. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and for budding from Golgi membranes []. Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated to ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least an alpha, beta, beta', gamma, delta, epsilon and zeta subunits. This entry represents the C-terminal domain of the beta subunit from coatomer proteins (Beta-coat proteins). The C-terminal domain probably adapts the function of the N-terminal IPR002553 from INTERPRO domain. 
Coatomer protein complex I (COPI)-coated vesicles are involved in transport between the endoplasmic reticulum and the Golgi but also participate in transport from early to late endosomes within the endocytic pathway []. More information about these proteins can be found at Protein of the Month: Clathrin [].; GO: 0005198 structural molecule activity, 0006886 intracellular protein transport, 0016192 vesicle-mediated transport, 0030126 COPI vesicle coat
Probab=100.00 E-value=8.4e-35 Score=198.08 Aligned_cols=62 Identities=66% Similarity=1.009 Sum_probs=61.2
Q ss_pred eEEEEecCcccccceEEEEEecCCeEEecCCCceEeCCCCeEEEEEEEEEeceeceEEEEeC
Q psy1396 5 NYKVLSVTGDTLQNCTLELATLGDLKLVERPQPVVLAPHDFCNIKANVKVASTENGIIFGNI 66 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat~GdLklverpq~~tL~P~~~~~i~a~iKVsStetGvIfG~I 66 (66)
|+|++|||++|||||+|||+|+||||+|||||++||+||+|+++||+|||||||+|+|||||
T Consensus 74 DvllvNqT~~tLqNl~vElat~gdLklve~p~~~tL~P~~~~~i~~~iKVsStetGvIfG~I 135 (140)
T PF07718_consen 74 DVLLVNQTNETLQNLTVELATLGDLKLVERPQPITLAPHGFARIKATIKVSSTETGVIFGNI 135 (140)
T ss_pred EEEEEeCChhhhhcEEEEEEecCCcEEccCCCceeeCCCcEEEEEEEEEEEeccCCEEEEEE
Confidence 78999999999999999999999999999999999999999999999999999999999997
No 2
>KOG1058|consensus
Probab=99.95 E-value=2.6e-28 Score=199.53 Aligned_cols=62 Identities=71% Similarity=1.031 Sum_probs=61.2
Q ss_pred eEEEEecCcccccceEEEEEecCCeEEecCCCceEeCCCCeEEEEEEEEEeceeceEEEEeC
Q psy1396 5 NYKVLSVTGDTLQNCTLELATLGDLKLVERPQPVVLAPHDFCNIKANVKVASTENGIIFGNI 66 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat~GdLklverpq~~tL~P~~~~~i~a~iKVsStetGvIfG~I 66 (66)
|+|+||||++||||+++||+|+||||+||||++++|+||+|+++||+|||+|||+|+|||||
T Consensus 739 DvL~VNqT~~tLQNl~lelATlgdLKlve~p~p~~Laph~f~~ikatvKVsStenGvIfGnI 800 (948)
T KOG1058|consen 739 DVLLVNQTKETLQNLSLELATLGDLKLVERPTPFSLAPHDFVNIKATVKVSSTENGVIFGNI 800 (948)
T ss_pred EEEEecCChHHHhhheeeeeeccCceeeecCCCcccCcccceeEEEEEEEeeccCcEEEEEE
Confidence 78999999999999999999999999999999999999999999999999999999999997
No 3
>COG5096 Vesicle coat complex, various subunits [Intracellular trafficking and secretion]
Probab=98.53 E-value=1.9e-08 Score=82.72 Aligned_cols=61 Identities=28% Similarity=0.305 Sum_probs=59.0
Q ss_pred eEEEEecCcccccceEEEEEecCCeEEecCCCceEeCCCCeEEEEEEEEEeceeceEEEEeC
Q psy1396 5 NYKVLSVTGDTLQNCTLELATLGDLKLVERPQPVVLAPHDFCNIKANVKVASTENGIIFGNI 66 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat~GdLklverpq~~tL~P~~~~~i~a~iKVsStetGvIfG~I 66 (66)
++++.|+|.++|+|+.+.+ |+|+++.+..+++.++.||+++.-+.++|+++++.|.||||+
T Consensus 587 ~~~~~~~t~~~l~nl~~~~-t~~~l~~~~~~~~~~l~~~~~~~~~~~v~~~~~~~~~i~gn~ 647 (757)
T COG5096 587 SALLTNQTPELLENLRLDF-TLGTLSTIPLKPIFNLRKGAVVLQQVTVKKPNAELGFITGNI 647 (757)
T ss_pred hhhccccCHHHHHhhhccc-cccceeccCCCCcccCCCCceeeeeeeeeccchhhhhhccCc
Confidence 4678999999999999999 999999999999999999999999999999999999999986
No 4
>PF11614 FixG_C: IG-like fold at C-terminal of FixG, putative oxidoreductase; PDB: 2R39_A.
Probab=95.59 E-value=0.092 Score=32.96 Aligned_cols=53 Identities=13% Similarity=0.183 Sum_probs=34.8
Q ss_pred eEEEEecCcccccceEEEEEecCCeEEecCCCceEeCCCCeEEEEEEEEEecee
Q psy1396 5 NYKVLSVTGDTLQNCTLELATLGDLKLVERPQPVVLAPHDFCNIKANVKVASTE 58 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat~GdLklverpq~~tL~P~~~~~i~a~iKVsSte 58 (66)
.+-|.|+|++..+ ++|++...-++++......+.|+|++...+...|++...+
T Consensus 36 ~lkl~Nkt~~~~~-~~i~~~g~~~~~l~~~~~~i~v~~g~~~~~~v~v~~p~~~ 88 (118)
T PF11614_consen 36 TLKLTNKTNQPRT-YTISVEGLPGAELQGPENTITVPPGETREVPVFVTAPPDA 88 (118)
T ss_dssp EEEEEE-SSS-EE-EEEEEES-SS-EE-ES--EEEE-TT-EEEEEEEEEE-GGG
T ss_pred EEEEEECCCCCEE-EEEEEecCCCeEEECCCcceEECCCCEEEEEEEEEECHHH
Confidence 4678898888765 7777776669999666689999999999999999986544
No 5
>PF06030 DUF916: Bacterial protein of unknown function (DUF916); InterPro: IPR010317 This family consists of putative cell surface proteins, from Firmicutes, of unknown function.
Probab=94.91 E-value=0.036 Score=36.29 Aligned_cols=36 Identities=33% Similarity=0.599 Sum_probs=31.3
Q ss_pred EecCCCceEeCCCCeEEEEEEEEEece-eceEEEEeC
Q psy1396 31 LVERPQPVVLAPHDFCNIKANVKVAST-ENGIIFGNI 66 (66)
Q Consensus 31 lverpq~~tL~P~~~~~i~a~iKVsSt-etGvIfG~I 66 (66)
++..|..++|+|++++.+..+||+-.. -.|+|.|-|
T Consensus 80 ~v~~~~~Vtl~~~~sk~V~~~i~~P~~~f~G~ilGGi 116 (121)
T PF06030_consen 80 LVKIPKEVTLPPNESKTVTFTIKMPKKAFDGIILGGI 116 (121)
T ss_pred hccCCcEEEECCCCEEEEEEEEEcCCCCcCCEEEeeE
Confidence 577788899999999999999999665 789999864
No 6
>PF10633 NPCBM_assoc: NPCBM-associated, NEW3 domain of alpha-galactosidase; InterPro: IPR018905 This domain has been named NEW3, but its function is not known. It is found on proteins which are bacterial galactosidases [].; PDB: 1EUT_A 2BZD_A 1WCQ_C 2BER_A 1W8O_A 1EUU_A 1W8N_A.
Probab=94.54 E-value=0.17 Score=29.80 Aligned_cols=56 Identities=21% Similarity=0.315 Sum_probs=33.5
Q ss_pred eEEEEecCcccccceEEEEEecCCeEEecCCCce-EeCCCCeEEEEEEEEEec-eece
Q psy1396 5 NYKVLSVTGDTLQNCTLELATLGDLKLVERPQPV-VLAPHDFCNIKANVKVAS-TENG 60 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat~GdLklverpq~~-tL~P~~~~~i~a~iKVsS-tetG 60 (66)
.+-+.|.....+.|+.++|..--+..+--.|..+ .|.|+++..+...|++.+ ++.|
T Consensus 10 ~~tv~N~g~~~~~~v~~~l~~P~GW~~~~~~~~~~~l~pG~s~~~~~~V~vp~~a~~G 67 (78)
T PF10633_consen 10 TLTVTNTGTAPLTNVSLSLSLPEGWTVSASPASVPSLPPGESVTVTFTVTVPADAAPG 67 (78)
T ss_dssp EEEEE--SSS-BSS-EEEEE--TTSE---EEEEE--B-TTSEEEEEEEEEE-TT--SE
T ss_pred EEEEEECCCCceeeEEEEEeCCCCccccCCccccccCCCCCEEEEEEEEECCCCCCCc
Confidence 4678899999999999999766666644455544 699999999999999963 3444
No 7
>PF06280 DUF1034: Fn3-like domain (DUF1034); InterPro: IPR010435 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain of unknown function is present in bacterial and plant peptidases belonging to MEROPS peptidase family S8 (subfamily S8A subtilisin, clan SB). It is C-terminal to and adjacent to the S8 peptidase domain and can be found in conjunction with the PA (Protease associated) domain (IPR003137 from INTERPRO) and additionally in Gram-positive bacteria with the surface protein anchor domain (IPR001899 from INTERPRO).; GO: 0004252 serine-type endopeptidase activity, 0005618 cell wall, 0016020 membrane; PDB: 3EIF_A 1XF1_B.
Probab=93.87 E-value=0.11 Score=32.34 Aligned_cols=37 Identities=14% Similarity=0.159 Sum_probs=24.7
Q ss_pred EEecCCCceEeCCCCeEEEEEEEEEec-ee---ceEEEEeC
Q psy1396 30 KLVERPQPVVLAPHDFCNIKANVKVAS-TE---NGIIFGNI 66 (66)
Q Consensus 30 klverpq~~tL~P~~~~~i~a~iKVsS-te---tGvIfG~I 66 (66)
.+...|..+|++|++++.+..+|.+.+ .+ ..++-|+|
T Consensus 56 ~~~~~~~~vTV~ag~s~~v~vti~~p~~~~~~~~~~~eG~I 96 (112)
T PF06280_consen 56 TVSFSPDTVTVPAGQSKTVTVTITPPSGLDASNGPFYEGFI 96 (112)
T ss_dssp EEE---EEEEE-TTEEEEEEEEEE--GGGHHTT-EEEEEEE
T ss_pred eEEeCCCeEEECCCCEEEEEEEEEehhcCCcccCCEEEEEE
Confidence 456677799999999999999999955 44 47777775
No 8
>PF00927 Transglut_C: Transglutaminase family, C-terminal ig like domain; InterPro: IPR008958 Synonym(s): Protein-glutamine gamma-glutamyltransferase, Fibrinoligase, TGase Transglutaminases catalyse the post-translational modification of proteins at glutamine residues, with formation of isopeptide bonds. Members of the transglutaminase family usually have three domains: N-terminal (IPR001102 from INTERPRO), middle (IPR013808 from INTERPRO) and C-terminal. The middle domain is usually well conserved, but family members can display major differences in their N- and C-terminal domains, although their overall structure is conserved []. This entry represents the C-terminal domain found in transglutaminases, which consists of an immunoglobulin-like beta-sandwich consisting of seven strands in two sheets with a Greek key topology. The best known transglutaminase is blood coagulation factor XIII, a plasma tetrameric protein composed of two catalytic A subunits and two non-catalytic B subunits. Factor XIII is responsible for cross-linking fibrin chains, thus stabilising the fibrin clot. Protein-glutamine gamma-glutamyltransferases (2.3.2.13 from EC) are calcium-dependent enzymes that catalyse the cross-linking of proteins by promoting the formation of isopeptide bonds between the gamma-carboxyl group of a glutamine in one polypeptide chain and the epsilon-amino group of a lysine in a second polypeptide chain. TGases also catalyse the conjugation of polyamines to proteins [, ].; GO: 0003810 protein-glutamine gamma-glutamyltransferase activity, 0018149 peptide cross-linking; PDB: 2XZZ_A 1GGY_B 1FIE_B 1GGU_B 1GGT_B 1F13_A 1QRK_B 1EVU_A 1EX0_B 1L9N_B ....
Probab=91.92 E-value=0.82 Score=28.07 Aligned_cols=49 Identities=16% Similarity=0.281 Sum_probs=35.0
Q ss_pred eEEEEecCcccccceEEEE-----EecCCeE--EecCCCceEeCCCCeEEEEEEEE
Q psy1396 5 NYKVLSVTGDTLQNCTLEL-----ATLGDLK--LVERPQPVVLAPHDFCNIKANVK 53 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vEl-----at~GdLk--lverpq~~tL~P~~~~~i~a~iK 53 (66)
.+.+.|.+++.|+|+++-| .-.|-++ .-.+....+|+|++...++..|.
T Consensus 20 ~v~~~N~~~~~l~~v~~~l~~~~v~ytG~~~~~~~~~~~~~~l~p~~~~~~~~~i~ 75 (107)
T PF00927_consen 20 SVSFTNPSSEPLRNVSLNLCAFTVEYTGLTRDQFKKEKFEVTLKPGETKSVEVTIT 75 (107)
T ss_dssp EEEEEE-SSS-EECEEEEEEEEEEECTTTEEEEEEEEEEEEEE-TTEEEEEEEEE-
T ss_pred EEEEEeCCcCccccceeEEEEEEEEECCcccccEeEEEcceeeCCCCEEEEEEEEE
Confidence 3578999999999988888 4445553 66677788999999998877664
No 9
>smart00809 Alpha_adaptinC2 Adaptin C-terminal domain. Adaptins are components of the adaptor complexes which link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. Gamma-adaptin is a subunit of the Golgi adaptor. Alpha adaptin is a heterotetramer that regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This Ig-fold domain is found in alpha, beta and gamma adaptins and consists of a beta-sandwich containing 7 strands in 2 beta-sheets in a Greek-key topology PUBMED:10430869, PUBMED:12176391. The adaptor appendage contains an additional N-terminal strand.
Probab=91.24 E-value=1.4 Score=26.44 Aligned_cols=49 Identities=22% Similarity=0.285 Sum_probs=37.6
Q ss_pred eEEEEecCcccccceEEEEEecCCeEEecCCCce-EeCCCCeEEEEEEEEEe
Q psy1396 5 NYKVLSVTGDTLQNCTLELATLGDLKLVERPQPV-VLAPHDFCNIKANVKVA 55 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat~GdLklverpq~~-tL~P~~~~~i~a~iKVs 55 (66)
.+++.|.+...+.|+.++++.--.+++--.|+.- +|+|++. ++-.+++.
T Consensus 23 ~~~~~N~s~~~it~f~~~~avpk~~~l~l~~~s~~~l~p~~~--i~q~~~i~ 72 (104)
T smart00809 23 TLTFTNKSPSPITNFSFQAAVPKSLKLQLQPPSSPTLPPGGQ--ITQVLKVE 72 (104)
T ss_pred EEEEEeCCCCeeeeEEEEEEcccceEEEEcCCCCCccCCCCC--EEEEEEEE
Confidence 4678899999999999999988788887777744 6999875 44444443
No 10
>PF14796 AP3B1_C: Clathrin-adaptor complex-3 beta-1 subunit C-terminal
Probab=88.78 E-value=1.3 Score=30.37 Aligned_cols=50 Identities=10% Similarity=0.207 Sum_probs=42.4
Q ss_pred eEEEEecCcccccceEEEEEe-cCCeEEecCCCceEeCCCCeEEEEEEEEE
Q psy1396 5 NYKVLSVTGDTLQNCTLELAT-LGDLKLVERPQPVVLAPHDFCNIKANVKV 54 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat-~GdLklverpq~~tL~P~~~~~i~a~iKV 54 (66)
++.+.|.+.+.+.||+|.=-. .+.+++-|=|+.=.|+|+++.....-|-+
T Consensus 90 ql~ftN~s~~~i~~I~i~~k~l~~g~~i~~F~~I~~L~pg~s~t~~lgIDF 140 (145)
T PF14796_consen 90 QLTFTNNSDEPIKNIHIGEKKLPAGMRIHEFPEIESLEPGASVTVSLGIDF 140 (145)
T ss_pred EEEEEecCCCeecceEECCCCCCCCcEeeccCcccccCCCCeEEEEEEEec
Confidence 678999999999999998776 66899999999989999998776555444
No 11
>COG1470 Predicted membrane protein [Function unknown]
Probab=87.65 E-value=1.3 Score=36.14 Aligned_cols=50 Identities=12% Similarity=0.222 Sum_probs=40.7
Q ss_pred eEEEEecCcccccceEEEE-EecCCeEEecCCCceE-eCCCCeEEEEEEEEEe
Q psy1396 5 NYKVLSVTGDTLQNCTLEL-ATLGDLKLVERPQPVV-LAPHDFCNIKANVKVA 55 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vEl-at~GdLklverpq~~t-L~P~~~~~i~a~iKVs 55 (66)
.+.|.|.-+.+|.||.+++ .++| ..+=-.|..+- |.|+++..+.++|||-
T Consensus 402 ~i~I~NsGna~LtdIkl~v~~Pqg-Wei~Vd~~~I~sL~pge~~tV~ltI~vP 453 (513)
T COG1470 402 RISIENSGNAPLTDIKLTVNGPQG-WEIEVDESTIPSLEPGESKTVSLTITVP 453 (513)
T ss_pred EEEEEecCCCccceeeEEecCCcc-ceEEECcccccccCCCCcceEEEEEEcC
Confidence 4789999999999999999 7777 54433444444 9999999999999994
No 12
>PF02883 Alpha_adaptinC2: Adaptin C-terminal domain; InterPro: IPR008152 Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport []. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors [, ]. AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes []. AP2 associates with the plasma membrane and is responsible for endocytosis []. AP3 is responsible for protein trafficking to lysosomes and other related organelles []. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. 
Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins []. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface []. GGAs (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) are a family of monomeric clathrin adaptor proteins that are conserved from yeasts to humans. GGAs regulate the clathrin-mediated transport of proteins (such as mannose 6-phosphate receptors) from the TGN to endosomes and lysosomes through interactions with TGN-sorting receptors, sometimes in conjunction with AP-1 [, ]. GGAs bind cargo, membranes, clathrin and accessory factors. GGA1, GGA2 and GGA3 all contain a domain homologous to the ear domain of gamma-adaptin. GGAs are composed of a single polypeptide with four domains: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The VHS domain is responsible for endocytosis and signal transduction, recognising transmembrane cargo through the ACLL sequence in the cytoplasmic domains of sorting receptors []. The GAT domain (also found in Tom1 proteins) interacts with ARF (ADP-ribosylation factor) to regulate membrane trafficking [], and with ubiquitin for receptor sorting []. The hinge region contains a clathrin box for recognition and binding to clathrin, similar to that found in AP adaptins. The GAE domain is similar to the AP gamma-adaptin ear domain, and is responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis [].
This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of alpha-, beta- and gamma-adaptin from AP clathrin adaptor complexes, and the GAE (gamma-adaptin ear) domain of GGA adaptor proteins. These domains have an immunoglobulin-like beta-sandwich fold containing 7 or 8 strands in 2 beta-sheets in a Greek key topology [, ]. Although these domains share a similar fold, there is little sequence identity between the alpha/beta-adaptins and gamma-adaptin/GAE. More information about these proteins can be found at Protein of the Month: Clathrin [].; GO: 0006886 intracellular protein transport, 0016192 vesicle-mediated transport, 0030131 clathrin adaptor complex; PDB: 3MNM_B 3ZY7_B 1GYU_A 1GYW_B 2A7B_A 1GYV_A 2E9G_A 1E42_B 2G30_A 2IV9_B ....
Probab=86.79 E-value=4 Score=25.05 Aligned_cols=50 Identities=14% Similarity=0.182 Sum_probs=35.7
Q ss_pred eEEEEecCcccccceEEEEEecCCeEEecCCC-ceEeCCCCeEEEEEEEEE
Q psy1396 5 NYKVLSVTGDTLQNCTLELATLGDLKLVERPQ-PVVLAPHDFCNIKANVKV 54 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat~GdLklverpq-~~tL~P~~~~~i~a~iKV 54 (66)
.+.+.|++...+.|++++++.--.+++-=.|+ .-+|+|++...-...|..
T Consensus 29 ~~~f~N~s~~~it~f~~q~avpk~~~l~l~~~s~~~i~p~~~i~Q~~~v~~ 79 (115)
T PF02883_consen 29 KLTFGNKSSQPITNFSFQAAVPKSFKLQLQPPSSSTIPPGQQITQVIKVEN 79 (115)
T ss_dssp EEEEEE-SSS-BEEEEEEEEEBTTSEEEEEESS-SSB-TTTEEEEEEEEEE
T ss_pred EEEEEECCCCCcceEEEEEEeccccEEEEeCCCCCeeCCCCeEEEEEEEEE
Confidence 47889999999999999998888888776666 678999666555544444
No 13
>PF07705 CARDB: CARDB; InterPro: IPR011635 The APHP (acidic peptide-dependent hydrolases/peptidase) domain is found in a variety of different proteins.; PDB: 2KUT_A 2L0D_A 3IDU_A 2KL6_A.
Probab=86.58 E-value=3.9 Score=23.54 Aligned_cols=48 Identities=17% Similarity=0.052 Sum_probs=34.8
Q ss_pred EEEEecCcccccceEEEEEecCCeEEecCCCceEeCCCCeEEEEEEEEEe
Q psy1396 6 YKVLSVTGDTLQNCTLELATLGDLKLVERPQPVVLAPHDFCNIKANVKVA 55 (66)
Q Consensus 6 ~llvNqT~~tLQNl~vElat~GdLklverpq~~tL~P~~~~~i~a~iKVs 55 (66)
+.|.|+-.....++.++|+--|... .....-.|+|++...+..+++..
T Consensus 25 ~~V~N~G~~~~~~~~v~~~~~~~~~--~~~~i~~L~~g~~~~v~~~~~~~ 72 (101)
T PF07705_consen 25 VTVKNNGTADAENVTVRLYLDGNSV--STVTIPSLAPGESETVTFTWTPP 72 (101)
T ss_dssp EEEEE-SSS-BEEEEEEEEETTEEE--EEEEESEB-TTEEEEEEEEEE-S
T ss_pred EEEEECCCCCCCCEEEEEEECCcee--ccEEECCcCCCcEEEEEEEEEeC
Confidence 5688998888999999998877665 22223479999999998888887
No 14
>TIGR02745 ccoG_rdxA_fixG cytochrome c oxidase accessory protein FixG. Members of this ferredoxin-like protein family are found exclusively in species with an operon encoding the cbb3 type of cytochrome c oxidase (cco-cbb3), and near the cco-cbb3 operon in about half the cases. The cco-cbb3 is found in a variety of proteobacteria and almost nowhere else, and is associated with oxygen use under microaerobic conditions. Some (but not all) of these proteobacteria are also nitrogen-fixing, hence the gene symbol fixG. FixG was shown to be essential for functional cco-cbb3 expression in Bradyrhizobium japonicum.
Probab=85.61 E-value=4.8 Score=31.62 Aligned_cols=50 Identities=14% Similarity=0.236 Sum_probs=35.9
Q ss_pred eEEEEecCcccccceEEEEEecCCeEEecCCCceEeCCCCeEEEEEEEEEe
Q psy1396 5 NYKVLSVTGDTLQNCTLELATLGDLKLVERPQPVVLAPHDFCNIKANVKVA 55 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat~GdLklverpq~~tL~P~~~~~i~a~iKVs 55 (66)
.+.|.|+|++. +-.++++..+-++++.-.++++.++|++...+...+.+.
T Consensus 351 ~~~i~Nk~~~~-~~~~l~v~g~~~~~~~~~~~~i~v~~g~~~~~~v~v~~~ 400 (434)
T TIGR02745 351 TLKILNKTEQP-HEYYLSVLGLPGIKIEGPGAPIHVKAGEKVKLPVFLRTP 400 (434)
T ss_pred EEEEEECCCCC-EEEEEEEecCCCcEEEcCCceEEECCCCEEEEEEEEEec
Confidence 45678888876 556666666667776433348999999998887777664
No 15
>PF09478 CBM49: Carbohydrate binding domain CBM49; InterPro: IPR019028 A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose [, ]. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classifications of cellulose-binding domains were based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. For a detailed review on the structure and binding modes of CBMs see []. This domain is found at the C-terminus of cellulases and in vitro binding studies have shown it to bind to crystalline cellulose [].; GO: 0030246 carbohydrate binding, 0005576 extracellular region
Probab=67.35 E-value=14 Score=21.97 Aligned_cols=46 Identities=11% Similarity=0.266 Sum_probs=33.4
Q ss_pred cceEEEEecCcccccceEEEEEec-CCeEEecC--------CCce-EeCCCCeEEE
Q psy1396 3 QNNYKVLSVTGDTLQNCTLELATL-GDLKLVER--------PQPV-VLAPHDFCNI 48 (66)
Q Consensus 3 ~~~~llvNqT~~tLQNl~vElat~-GdLklver--------pq~~-tL~P~~~~~i 48 (66)
|-++.|.|..+.++.++.+....+ +++==+++ |.-. .|+|+++..+
T Consensus 20 qy~v~I~N~~~~~I~~~~i~~~~l~~~iW~l~~~~~~~y~lPs~~~~i~pg~s~~F 75 (80)
T PF09478_consen 20 QYDVTITNNGSKPIKSLKISIDNLYGSIWGLDKVSGNTYTLPSYQPTIKPGQSFTF 75 (80)
T ss_pred EEEEEEEECCCCeEEEEEEEECccchhheeEEeccCCEEECCccccccCCCCEEEE
Confidence 447889999999999999998754 66655666 5533 5777776653
No 16
>COG1837 Predicted RNA-binding protein (contains KH domain) [General function prediction only]
Probab=66.81 E-value=4.9 Score=25.05 Aligned_cols=34 Identities=24% Similarity=0.463 Sum_probs=27.0
Q ss_pred EecCCCceEeCCCC-eEEEEEEEEEeceeceEEEE
Q psy1396 31 LVERPQPVVLAPHD-FCNIKANVKVASTENGIIFG 64 (66)
Q Consensus 31 lverpq~~tL~P~~-~~~i~a~iKVsStetGvIfG 64 (66)
+||+|..+.+..-+ -..+...+.|++.|.|-++|
T Consensus 12 lVd~Pd~v~V~~~~~~~~~~~~l~v~~~D~GkvIG 46 (76)
T COG1837 12 LVDNPDDVRVDEEEGEKTVTIELRVAPEDMGKVIG 46 (76)
T ss_pred hcCCccceEEEEEecCCeEEEEEEECcccccceec
Confidence 68899977755444 35677889999999999988
No 17
>PF08752 COP-gamma_platf: Coatomer gamma subunit appendage platform subdomain; InterPro: IPR014863 Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer []. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins []. For example, the coatomer COPI (coat protein complex I) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi []. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and for budding from Golgi membranes []. Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated with ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits. This entry represents the C-terminal appendage domain of the gamma subunit of coatomer complexes. 
The appendage domain of the gamma coatomer subunit has a similar overall structural fold to the appendage domain of clathrin adaptors, and can also share the same motif-based cargo recognition and accessory factor recruitment mechanisms. The coatomer gamma subunit appendage domain contains a protein-protein interaction site and a second proposed binding site that interacts with the alpha, beta, epsilon COPI subcomplex []. More information about these proteins can be found at Protein of the Month: Clathrin [].; GO: 0005198 structural molecule activity, 0006886 intracellular protein transport, 0016192 vesicle-mediated transport, 0005798 Golgi-associated vesicle; PDB: 1PZD_A 1R4X_A.
Probab=66.33 E-value=34 Score=23.52 Aligned_cols=49 Identities=14% Similarity=0.074 Sum_probs=33.4
Q ss_pred EEEEecC-cccccceEEEEEecCC-eEEecCCCceEeCCCCeEEEEEEEEE
Q psy1396 6 YKVLSVT-GDTLQNCTLELATLGD-LKLVERPQPVVLAPHDFCNIKANVKV 54 (66)
Q Consensus 6 ~llvNqT-~~tLQNl~vElat~Gd-Lklverpq~~tL~P~~~~~i~a~iKV 54 (66)
+-+.|.= ...|.|++|++.+..+ ++....-+.=.|.|++...+.+.+|-
T Consensus 54 F~v~NTL~dq~LenV~V~~~~~~~~~~~~~~ipi~~L~~~~~~~~yV~l~~ 104 (151)
T PF08752_consen 54 FNVTNTLNDQVLENVSVVLEPSEEEFEEVFIIPIPSLPYNEPGSCYVVLKR 104 (151)
T ss_dssp EEEEE--TTEEEEEEEEEEEESSS--EEEEEE-EEEE-CT--EEEEEEEE-
T ss_pred EEEeeccCceeeeeEEEEEecCCceEEEEEEEEhhhCCCCCCeeEEEEEEe
Confidence 3455554 4579999999988876 88888777778999999999998887
No 18
>PRK00468 hypothetical protein; Provisional
Probab=63.87 E-value=8.7 Score=23.51 Aligned_cols=34 Identities=26% Similarity=0.493 Sum_probs=25.8
Q ss_pred EecCCCceEeC--CCCeEEEEEEEEEeceeceEEEEe
Q psy1396 31 LVERPQPVVLA--PHDFCNIKANVKVASTENGIIFGN 65 (66)
Q Consensus 31 lverpq~~tL~--P~~~~~i~a~iKVsStetGvIfG~ 65 (66)
++|.|..+.+. +++ ..+...++|++.|.|-|+|-
T Consensus 12 LVd~Pe~v~V~~~~~~-~~~~~~l~v~~~D~GrVIGk 47 (75)
T PRK00468 12 LVDNPDAVQVNEIEGE-QSVILELKVAPEDMGKVIGK 47 (75)
T ss_pred hcCCCCeEEEEEEeCC-CeEEEEEEEChhhCcceecC
Confidence 67888866643 333 34788999999999999983
No 19
>PF03168 LEA_2: Late embryogenesis abundant protein; InterPro: IPR004864 Different types of LEA proteins are expressed at different stages of late embryogenesis in higher plant seed embryos and under conditions of dehydration stress [, ]. The function of these proteins is unknown. ; PDB: 3BUT_A 1XO8_A 1YYC_A.
Probab=60.46 E-value=30 Score=19.83 Aligned_cols=45 Identities=11% Similarity=0.079 Sum_probs=33.7
Q ss_pred cccccceEEEEEecCCeEE-ecCCCceEeCCCCeEEEEEEEEEece
Q psy1396 13 GDTLQNCTLELATLGDLKL-VERPQPVVLAPHDFCNIKANVKVAST 57 (66)
Q Consensus 13 ~~tLQNl~vElat~GdLkl-verpq~~tL~P~~~~~i~a~iKVsSt 57 (66)
.-.+..+..+++-.|..=- -..++.+.+.|+++..+...+.++..
T Consensus 11 ~i~~~~~~~~v~~~g~~v~~~~~~~~~~i~~~~~~~v~~~v~~~~~ 56 (101)
T PF03168_consen 11 GIRYDSIEYDVYYNGQRVGTGGSLPPFTIPARSSTTVPVPVSVDYS 56 (101)
T ss_dssp -EEEEEEEEEEEESSSEEEEEEECE-EEESSSCEEEEEEEEEEEHH
T ss_pred eEEEeCEEEEEEECCEEEECccccCCeEECCCCcEEEEEEEEEcHH
Confidence 4456788888887664333 67888999999999999998887654
No 20
>PRK01064 hypothetical protein; Provisional
Probab=60.28 E-value=9.1 Score=23.67 Aligned_cols=34 Identities=32% Similarity=0.472 Sum_probs=25.5
Q ss_pred EecCCCceEeC--CCCeEEEEEEEEEeceeceEEEEe
Q psy1396 31 LVERPQPVVLA--PHDFCNIKANVKVASTENGIIFGN 65 (66)
Q Consensus 31 lverpq~~tL~--P~~~~~i~a~iKVsStetGvIfG~ 65 (66)
+++.|..+.+. ++ -..+...+.|.+.+.|.+.|-
T Consensus 12 LVd~Pe~V~V~~~~~-~~~~~~~l~v~~~D~g~vIGk 47 (78)
T PRK01064 12 LVDRPEEVHIKEVQG-THTIIYELTVAKPDIGKIIGK 47 (78)
T ss_pred hcCCCCeEEEEEEeC-CCEEEEEEEECcccceEEECC
Confidence 67888866643 33 245778899999999999983
No 21
>smart00769 WHy Water Stress and Hypersensitive response.
Probab=56.14 E-value=43 Score=20.27 Aligned_cols=43 Identities=19% Similarity=0.156 Sum_probs=33.6
Q ss_pred CcccccceEEEEEecCCeEE--ecCCCceEeCCCCeEEEEEEEEEe
Q psy1396 12 TGDTLQNCTLELATLGDLKL--VERPQPVVLAPHDFCNIKANVKVA 55 (66)
Q Consensus 12 T~~tLQNl~vElat~GdLkl--verpq~~tL~P~~~~~i~a~iKVs 55 (66)
-.-.+.++.-+++-.| .++ .+.++..++.|++...+...+.++
T Consensus 29 ~~l~~~~~~y~l~~~g-~~v~~g~~~~~~~ipa~~~~~v~v~~~~~ 73 (100)
T smart00769 29 FPIPVNGLSYDLYLNG-VELGSGEIPDSGTLPGNGRTVLDVPVTVN 73 (100)
T ss_pred CccccccEEEEEEECC-EEEEEEEcCCCcEECCCCcEEEEEEEEee
Confidence 3446788887787766 344 456778999999999999999984
No 22
>PRK02821 hypothetical protein; Provisional
Probab=51.89 E-value=9.4 Score=23.60 Aligned_cols=33 Identities=21% Similarity=0.400 Sum_probs=25.0
Q ss_pred EecCCCceEeC--CCCeEEEEEEEEEeceeceEEEE
Q psy1396 31 LVERPQPVVLA--PHDFCNIKANVKVASTENGIIFG 64 (66)
Q Consensus 31 lverpq~~tL~--P~~~~~i~a~iKVsStetGvIfG 64 (66)
+|+.|..+.+. +.+. .....|+|++.|.|-|.|
T Consensus 13 LVd~Pe~V~V~~~~~~~-~~~i~l~v~~~D~GrVIG 47 (77)
T PRK02821 13 IVDNPDDVRVDSHTNRR-GRTLEVRVHPDDLGKVIG 47 (77)
T ss_pred hCCCCCeEEEEEEECCC-cEEEEEEEChhhCcceeC
Confidence 67888876653 3332 367899999999999988
No 23
>PF00345 PapD_N: Pili and flagellar-assembly chaperone, PapD N-terminal domain; InterPro: IPR016147 Most Gram-negative bacteria possess a supramolecular structure - the pili - on their surface, which mediates attachment to specific receptors. Many interactive subunits are required to assemble pili, but their assembly only takes place after translocation across the cytoplasmic membrane. Periplasmic chaperones assist pili assembly by binding to the subunits, thereby preventing premature aggregation [, ]. Pili chaperones are structurally, and possibly evolutionarily, related to the immunoglobulin superfamily [, ]: they contain two globular domains, with a topology identical to an immunoglobulin fold. This entry represents the N-terminal domain of pili assembly chaperone, and has a beta-sandwich fold consisting of seven strands in two sheets with a Greek key topology.; GO: 0007047 cellular cell wall organization, 0030288 outer membrane-bounded periplasmic space; PDB: 2CO6_B 2CO7_B 1L4I_B 3GFU_A 3F65_F 3F6L_A 3F6I_A 3GEW_B 3DSN_D 2OS7_B ....
Probab=51.88 E-value=23 Score=21.92 Aligned_cols=50 Identities=10% Similarity=0.102 Sum_probs=32.8
Q ss_pred eEEEEecCcccccceEEEEEec----CC---eEEecCCCceEeCCCCeEEEEEEEEEec
Q psy1396 5 NYKVLSVTGDTLQNCTLELATL----GD---LKLVERPQPVVLAPHDFCNIKANVKVAS 56 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat~----Gd---Lklverpq~~tL~P~~~~~i~a~iKVsS 56 (66)
.+-|.|.+++.+ =+++.+.-. ++ -.++=-|+.+.|+|++...+|. ++-+.
T Consensus 19 ~i~v~N~~~~~~-~vq~~v~~~~~~~~~~~~~~~~vsPp~~~L~pg~~q~vRv-~~~~~ 75 (122)
T PF00345_consen 19 SITVTNNSDQPY-LVQVWVYDQDDEDEDEPTDPFIVSPPIFRLEPGESQTVRV-YRGSK 75 (122)
T ss_dssp EEEEEESSSSEE-EEEEEEEETTSTTSSSSSSSEEEESSEEEEETTEEEEEEE-EECSG
T ss_pred EEEEEcCCCCcE-EEEEEEEcCCCcccccccccEEEeCCceEeCCCCcEEEEE-EecCC
Confidence 466778777332 234444431 11 1467789999999999999999 87444
No 24
>TIGR00158 L9 ribosomal protein L9. Ribosomal protein L9 appears to be universal in, but restricted to, eubacteria and chloroplast.
Probab=48.79 E-value=17 Score=24.68 Aligned_cols=19 Identities=21% Similarity=0.410 Sum_probs=15.6
Q ss_pred EEEEEEEeceeceEEEEeC
Q psy1396 48 IKANVKVASTENGIIFGNI 66 (66)
Q Consensus 48 i~a~iKVsStetGvIfG~I 66 (66)
+...|++.+-|.|-+||.|
T Consensus 75 ~~~~i~~k~ge~gklfGSV 93 (148)
T TIGR00158 75 GTLTISKKVGDEGKLFGSI 93 (148)
T ss_pred cEEEEEEEeCCCCeEEEeE
Confidence 3467888889999999986
No 25
>PHA01707 dut 2'-deoxyuridine 5'-triphosphatase
Probab=48.52 E-value=82 Score=21.23 Aligned_cols=61 Identities=10% Similarity=0.141 Sum_probs=42.5
Q ss_pred ceEEEEecCcccccceEEEEEecCCeEEe------cCCCceEeCCCCeEEEEEEEEEecee--ceEEEE
Q psy1396 4 NNYKVLSVTGDTLQNCTLELATLGDLKLV------ERPQPVVLAPHDFCNIKANVKVASTE--NGIIFG 64 (66)
Q Consensus 4 ~~~llvNqT~~tLQNl~vElat~GdLklv------erpq~~tL~P~~~~~i~a~iKVsSte--tGvIfG 64 (66)
.++.|.+-..+.+|--.++|.-...+.+. .....+.|.|+++.-+.+...|.-.+ .|.++|
T Consensus 15 g~i~I~Pf~~~~v~p~s~DlrLg~~~~~~~~~~~~~~~~~~~l~Pg~~~l~~T~E~i~lP~~~~~~i~~ 83 (158)
T PHA01707 15 GWLVIEPLSEDTIRENGVDLKIGNEIVRIKENMEKEVGDEFIIYPHEHVLLTTKEYIKLPNDIIAFCNL 83 (158)
T ss_pred CCeEEcCCCHHHcCCceEEEEecCeEEEEecccccccCCcEEECCCCEEEEEEeEEEECCCCEEEEEEC
Confidence 44556666666777777777665566554 34568999999999999998887443 344443
No 26
>PF03948 Ribosomal_L9_C: Ribosomal protein L9, C-terminal domain; InterPro: IPR020069 Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [, ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [, ]. Ribosomal protein L9 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities [, ]. The crystal structure of Bacillus stearothermophilus L9 shows the 149-residue protein comprises two globular domains connected by a rigid linker []. Each domain contains an rRNA binding site, and the protein functions as a structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an alpha-helix and a three-stranded mixed parallel, anti-parallel beta-sheet packed against the central alpha-helix. The long central alpha-helix is exposed to solvent in the middle and participates in the hydrophobic cores of the two domains at both ends. ; PDB: 3D5B_I 3PYV_H 3F1H_I 3PYR_H 3MRZ_H 1VSP_G 3MS1_H 1VSA_G 3PYT_H 2WH4_I ....
Probab=45.46 E-value=7.1 Score=24.08 Aligned_cols=18 Identities=33% Similarity=0.660 Sum_probs=14.9
Q ss_pred EEEEEEeceeceEEEEeC
Q psy1396 49 KANVKVASTENGIIFGNI 66 (66)
Q Consensus 49 ~a~iKVsStetGvIfG~I 66 (66)
..+|+....|.|-+||.|
T Consensus 15 ~l~i~~k~g~~gklfGSV 32 (87)
T PF03948_consen 15 TLTIKRKAGENGKLFGSV 32 (87)
T ss_dssp EEEEEECBSSCSSBSSEB
T ss_pred EEEEEEEecCCcceecCc
Confidence 367788888999999986
No 27
>cd07557 trimeric_dUTPase Trimeric dUTP diphosphatases. Trimeric dUTP diphosphatases, or dUTPases, are the most common family of dUTPase, found in bacteria, eukaryotes, and archaea. They catalyze the hydrolysis of the dUTP-Mg complex (dUTP-Mg) into dUMP and pyrophosphate. This reaction is crucial for the preservation of chromosomal integrity as it removes dUTP and therefore reduces the cellular dUTP/dTTP ratio, and prevents dUTP from being incorporated into DNA. It also provides dUMP as the precursor for dTTP synthesis via the thymidylate synthase pathway. dUTPases are homotrimeric, except some monomeric viral dUTPases, which have been shown to mimic a trimer. Active sites are located at the subunit interface.
Probab=45.08 E-value=63 Score=18.90 Aligned_cols=29 Identities=17% Similarity=0.345 Sum_probs=21.9
Q ss_pred CceEeCCCCeEEEEEEEEEece--eceEEEE
Q psy1396 36 QPVVLAPHDFCNIKANVKVAST--ENGIIFG 64 (66)
Q Consensus 36 q~~tL~P~~~~~i~a~iKVsSt--etGvIfG 64 (66)
..+.|.|+++..+.+.+++.-. -.|.|++
T Consensus 12 ~~~~i~P~~~~~v~t~~~i~~p~~~~~~i~~ 42 (92)
T cd07557 12 EGIVLPPGETVLVPTGEAIELPEGYVGLVFP 42 (92)
T ss_pred CCEEEcCCCEEEEEEeEEEEcCCCeEEEEEc
Confidence 3489999999999999888654 4555554
No 28
>PF10435 BetaGal_dom2: Beta-galactosidase, domain 2; InterPro: IPR018954 This is the second domain of the five-domain beta-galactosidase enzyme that altogether catalyses the hydrolysis of beta(1-3) and beta(1-4) galactosyl bonds in oligosaccharides as well as the inverse reaction of enzymatic condensation and trans-glycosylation. This domain is made up of 16 antiparallel beta-strands and an alpha-helix at its C terminus. The fold of this domain appears to be unique. In addition, the last seven strands of the domain form a subdomain with an immunoglobulin-like (I-type Ig) fold in which the first strand is divided between the two beta-sheets. In Penicillium spp. this strand is interrupted by a 12-residue insertion which forms an additional edge-strand to the second beta-sheet of the sub-domain. The remainder of the second domain forms a series of beta-hairpins at its N terminus, four strands of which are contiguous with part of the Ig-like sub-domain, forming in total a seven-stranded antiparallel beta-sheet. This domain is associated with IPR001944 from INTERPRO, which is N-terminal to it, but itself has no metazoan members. ; GO: 0004565 beta-galactosidase activity; PDB: 3OGS_A 3OGV_A 3OGR_A 3OG2_A 1TG7_A 1XC6_A.
Probab=44.88 E-value=99 Score=21.51 Aligned_cols=51 Identities=14% Similarity=0.180 Sum_probs=30.7
Q ss_pred eEEEEecCc-ccccc--eEEEEE-ecCCeEEecCCCceEeCCCCeEEEEEEEEEe
Q psy1396 5 NYKVLSVTG-DTLQN--CTLELA-TLGDLKLVERPQPVVLAPHDFCNIKANVKVA 55 (66)
Q Consensus 5 ~~llvNqT~-~tLQN--l~vEla-t~GdLklverpq~~tL~P~~~~~i~a~iKVs 55 (66)
.+|++.+.+ .++.+ -++++. ..|.+.|.+....++|..++++-+-+..++-
T Consensus 33 ~Fyvvrh~~~~s~~~~~f~l~v~Ts~G~~tiPq~~g~ltL~GrdSKIlvtDy~~G 87 (183)
T PF10435_consen 33 GFYVVRHNDSTSTASTSFTLNVNTSDGTLTIPQLGGSLTLNGRDSKILVTDYDFG 87 (183)
T ss_dssp EEEEEEESSTT--S-EEE-EEEEETTEEEEE-TTSS-EEE-TT-EEEEEEEEEET
T ss_pred EEEEEEccCCCCCCceEEEEEeecCCeeEEecccCCcEEECCcceeEEEeecccC
Confidence 577777732 23333 344543 3578888888889999999999888877764
No 29
>cd07706 IgV_TCR_delta Immunoglobulin (Ig) variable (V) domain of T-cell receptor (TCR) delta chain. IgV_TCR_delta: immunoglobulin (Ig) variable (V) domain of the delta chain of gamma/delta T-cell receptors (TCRs). TCRs mediate antigen recognition by T lymphocytes, and are heterodimers consisting of alpha and beta chains or gamma and delta chains. Each chain contains a variable (V) and a constant (C) region. The majority of T cells contain alpha/beta TCRs but a small subset contain gamma/delta TCRs. Alpha/beta TCRs recognize antigen as peptide fragments presented by major histocompatibility complex (MHC) molecules. Gamma/delta TCRs recognize intact protein antigens; they recognize protein antigens directly and without antigen processing, and MHC independently of the bound peptide. Gamma/delta T cells can also be stimulated by non-peptide antigens such as small phosphate- or amine-containing compounds. The variable domain of gamma/delta TCRs is responsible for antigen recognition and is
Probab=44.06 E-value=50 Score=19.91 Aligned_cols=26 Identities=4% Similarity=0.056 Sum_probs=21.1
Q ss_pred EecCCCceEeCCCCeEEEEEEEEEec
Q psy1396 31 LVERPQPVVLAPHDFCNIKANVKVAS 56 (66)
Q Consensus 31 lverpq~~tL~P~~~~~i~a~iKVsS 56 (66)
+...|+.+...+|+.+.+.++++.++
T Consensus 2 v~q~~~~v~~~~G~~v~L~C~~~~~~ 27 (116)
T cd07706 2 VTQAQPDVSVQVGEEVTLNCRYETSW 27 (116)
T ss_pred cEEeCCceEEcCCCCEEEEEEEeCCC
Confidence 45668888999999999999887644
No 30
>PF14315 DUF4380: Domain of unknown function (DUF4380)
Probab=42.08 E-value=49 Score=23.77 Aligned_cols=34 Identities=24% Similarity=0.297 Sum_probs=28.5
Q ss_pred cceEEEEEecCCeEEecCCCceE-eCCCCeEEEEE
Q psy1396 17 QNCTLELATLGDLKLVERPQPVV-LAPHDFCNIKA 50 (66)
Q Consensus 17 QNl~vElat~GdLklverpq~~t-L~P~~~~~i~a 50 (66)
++.++|+++.+.+=-+|--.++. |+|+++..-+-
T Consensus 234 ~g~~~E~y~~~~~lElE~~sP~~~L~PGe~~~~~e 268 (274)
T PF14315_consen 234 FGCSVEVYTNDPYLELETLSPLRTLAPGESLEHTE 268 (274)
T ss_pred CCceEEEEcCCCEEEEEEcCcccccCCCCEEEEEE
Confidence 57889999999999899888766 99999876554
No 31
>CHL00160 rpl9 ribosomal protein L9; Provisional
Probab=41.32 E-value=24 Score=24.18 Aligned_cols=17 Identities=41% Similarity=0.559 Sum_probs=14.3
Q ss_pred EEEEEeceeceEEEEeC
Q psy1396 50 ANVKVASTENGIIFGNI 66 (66)
Q Consensus 50 a~iKVsStetGvIfG~I 66 (66)
.+|++..-|.|-+||.|
T Consensus 83 ~~i~~k~ge~gklfGSV 99 (153)
T CHL00160 83 FSVKKKVGENNQIFGSV 99 (153)
T ss_pred EEEEEEeCCCCeEEccc
Confidence 56777788999999986
No 32
>cd09020 D-hex-6-P-epi_like D-hexose-6-phosphate epimerase-like. D-Hexose-6-phosphate epimerase Ymr099c from Saccharomyces cerevisiae belongs to the large superfamily of aldose-1-epimerases. Its active site is very similar to the catalytic site of galactose mutarotase, the best studied member of the superfamily. It also contains the conserved glutamate and histidine residues that have been shown in galactose mutarotase to be critical for catalysis, the glutamate serving as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate, and the histidine as the active site acid to protonate the C-5 ring oxygen. In addition Ymr099c contains 2 conserved arginine residues which are involved in phosphate binding, and exhibits hexose-6-phosphate mutarotase activity on glucose-6-P, galactose-6-P and mannose-6-P.
Probab=40.61 E-value=46 Score=23.62 Aligned_cols=27 Identities=26% Similarity=0.481 Sum_probs=20.8
Q ss_pred eEEEEEecCCeEEecCCCceEeCCCCeEEEEEEEE
Q psy1396 19 CTLELATLGDLKLVERPQPVVLAPHDFCNIKANVK 53 (66)
Q Consensus 19 l~vElat~GdLklverpq~~tL~P~~~~~i~a~iK 53 (66)
||||=+..+ ..+.|+|+++.+....|+
T Consensus 243 vCvEp~~~~--------~~~~L~pG~~~~~~~~i~ 269 (269)
T cd09020 243 VCVEAANVA--------DPVTLAPGESHTLSQTIS 269 (269)
T ss_pred EEECeeecC--------CCEEECCCCCEEEEEEEC
Confidence 778876643 567899999998887763
No 33
>PF09624 DUF2393: Protein of unknown function (DUF2393); InterPro: IPR013417 The function of this protein is unknown. It is always found as part of a two-gene operon with IPR013416 from INTERPRO, a protein that appears to span the membrane seven times. It has so far been found in the bacteria Anabaena sp. (strain PCC 7120), Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus.
Probab=40.44 E-value=85 Score=20.28 Aligned_cols=47 Identities=15% Similarity=0.227 Sum_probs=34.5
Q ss_pred EEEEecCcccccceEEEEEe------cCC--eEEecCCCce---------EeCCCCeEEEEEEE
Q psy1396 6 YKVLSVTGDTLQNCTLELAT------LGD--LKLVERPQPV---------VLAPHDFCNIKANV 52 (66)
Q Consensus 6 ~llvNqT~~tLQNl~vElat------~Gd--Lklverpq~~---------tL~P~~~~~i~a~i 52 (66)
.-|.|.++.++.++.|++.- .++ ....+++.++ +|.|++.+..+..+
T Consensus 68 g~V~N~g~~~i~~c~i~~~l~~~~~~~~n~~~~~~~~~~~f~~~~~~i~~~L~~~e~~~f~~~~ 131 (149)
T PF09624_consen 68 GTVTNTGKFTIKKCKITVKLYNDKQVSGNKFKEIFYQQIPFVKKSIPIADNLKPGESKEFRFIF 131 (149)
T ss_pred EEEEECCCCEeeEEEEEEEEEeCCCccCchhhhhhccccchhccceeHHhhcCcccceeEEEEe
Confidence 45789999999999887733 233 4566666555 29999999998774
No 34
>PF14221 DUF4330: Domain of unknown function (DUF4330)
Probab=39.25 E-value=1.1e+02 Score=20.83 Aligned_cols=19 Identities=16% Similarity=0.468 Sum_probs=13.6
Q ss_pred EEEEEEEEeceeceEEEEe
Q psy1396 47 NIKANVKVASTENGIIFGN 65 (66)
Q Consensus 47 ~i~a~iKVsStetGvIfG~ 65 (66)
.+...-+..-+++|+.||+
T Consensus 123 ~~tl~g~~~~t~~g~vlgg 141 (168)
T PF14221_consen 123 RFTLEGQAQITDDGVVLGG 141 (168)
T ss_pred EEEEEEEEEEcCCeEEECC
Confidence 3444444778899999986
No 35
>cd05754 Ig3_Perlecan_like Third immunoglobulin (Ig)-like domain found in Perlecan and similar proteins. Ig3_Perlecan_like: domain similar to the third immunoglobulin (Ig)-like domain found in Perlecan. Perlecan is a large multi-domain heparan sulfate proteoglycan, important in tissue development and organogenesis. Perlecan can be represented as 5 major portions; its fourth major portion (domain IV) is a tandem repeat of immunoglobulin-like domains (Ig2-Ig15), which can vary in size due to alternative splicing. Perlecan binds many cellular and extracellular ligands. Its domain IV region has many binding sites. Some of these have been mapped at the level of individual Ig-like domains, including a site restricted to the Ig5 domain for heparin/sulfatide, a site restricted to the Ig3 domain for nidogen-1 and nidogen-2, a site restricted to Ig4-5 for fibronectin, and sites restricted to Ig2 and to Ig13-15 for fibulin-2.
Probab=38.77 E-value=50 Score=18.67 Aligned_cols=23 Identities=22% Similarity=0.273 Sum_probs=19.3
Q ss_pred EecCCCceEeCCCCeEEEEEEEE
Q psy1396 31 LVERPQPVVLAPHDFCNIKANVK 53 (66)
Q Consensus 31 lverpq~~tL~P~~~~~i~a~iK 53 (66)
.+++|+...+.+++...+.+...
T Consensus 4 ~~~~p~~~~v~~G~~v~L~C~~~ 26 (85)
T cd05754 4 TVEEPRSQEVRPGADVSFICRAK 26 (85)
T ss_pred EecCCCceEEcCCCCEEEEEEcC
Confidence 46788899999999999998764
No 36
>cd07693 Ig1_Robo First immunoglobulin (Ig)-like domain in Robo (roundabout) receptors and similar proteins. Ig1_Robo: domain similar to the first immunoglobulin (Ig)-like domain in Robo (roundabout) receptors. Robo receptors play a role in the development of the central nervous system (CNS), and are receptors of Slit protein. Slit is a repellant secreted by the neural cells in the midline. Slit acts through Robo to prevent most neurons from crossing the midline from either side. Three mammalian Robo homologs (robo1, -2, and -3), and three mammalian Slit homologs (Slit-1,-2, -3), have been identified. Commissural axons, which cross the midline, express low levels of Robo; longitudinal axons, which avoid the midline, express high levels of Robo. robo1, -2, and -3 are expressed by commissural neurons in the vertebrate spinal cord and Slits 1, -2, -3 are expressed at the ventral midline. Robo-3 is a divergent member of the Robo family which instead of being a positive regulator of slit res
Probab=37.95 E-value=52 Score=18.37 Aligned_cols=22 Identities=18% Similarity=0.458 Sum_probs=15.4
Q ss_pred EecCCCceEeCCCCeEEEEEEE
Q psy1396 31 LVERPQPVVLAPHDFCNIKANV 52 (66)
Q Consensus 31 lverpq~~tL~P~~~~~i~a~i 52 (66)
+.+.|+..+..+++...+.+.+
T Consensus 4 i~~~p~~~~v~~G~~~~l~C~~ 25 (100)
T cd07693 4 IVEHPSDLIVSKGDPATLNCKA 25 (100)
T ss_pred EEecCceeEEcCCCeEEEEeeC
Confidence 5566777777777777777644
No 37
>PF05906 DUF865: Herpesvirus-7 repeat of unknown function (DUF865)
Probab=36.17 E-value=22 Score=19.41 Aligned_cols=25 Identities=40% Similarity=0.715 Sum_probs=14.3
Q ss_pred ecCCCceEeCCCCeEEEEEEEEEeceeceEEEE
Q psy1396 32 VERPQPVVLAPHDFCNIKANVKVASTENGIIFG 64 (66)
Q Consensus 32 verpq~~tL~P~~~~~i~a~iKVsStetGvIfG 64 (66)
-||||+++ |-.|.-++ .|-+.++|.
T Consensus 8 qerpqphn--pltfkpvk------ttgtavvfs 32 (35)
T PF05906_consen 8 QERPQPHN--PLTFKPVK------TTGTAVVFS 32 (35)
T ss_pred ccCCCCCC--ccceeeee------ccceEEEee
Confidence 37888765 44455443 345556653
No 38
>PF14016 DUF4232: Protein of unknown function (DUF4232)
Probab=34.86 E-value=56 Score=20.68 Aligned_cols=18 Identities=33% Similarity=0.471 Sum_probs=13.9
Q ss_pred CCCceEeCCCCeEEEEEE
Q psy1396 34 RPQPVVLAPHDFCNIKAN 51 (66)
Q Consensus 34 rpq~~tL~P~~~~~i~a~ 51 (66)
.|.+++|+|++++.....
T Consensus 62 ~~~~vtL~PG~sA~a~l~ 79 (131)
T PF14016_consen 62 PPRPVTLAPGGSAYAGLR 79 (131)
T ss_pred CCCcEEECCCCEEEEEEE
Confidence 466899999999875544
No 39
>PF05506 DUF756: Domain of unknown function (DUF756); InterPro: IPR008475 This domain is found, normally as a tandem repeat, at the C terminus of bacterial phospholipase C proteins.; GO: 0004629 phospholipase C activity, 0016042 lipid catabolic process
Probab=34.68 E-value=1e+02 Score=18.31 Aligned_cols=20 Identities=10% Similarity=0.086 Sum_probs=16.5
Q ss_pred cCCCceEeCCCCeEEEEEEE
Q psy1396 33 ERPQPVVLAPHDFCNIKANV 52 (66)
Q Consensus 33 erpq~~tL~P~~~~~i~a~i 52 (66)
..|+.++|+|++...+.-.+
T Consensus 46 ~~~~~~~v~ag~~~~~~w~l 65 (89)
T PF05506_consen 46 GGPWTYTVAAGQTVSLTWPL 65 (89)
T ss_pred CCCEEEEECCCCEEEEEEee
Confidence 46889999999998877655
No 40
>cd05747 Ig5_Titin_like M5, fifth immunoglobulin (Ig)-like domain of human titin C terminus and similar proteins. Ig5_Titin_like: domain similar to the M5, fifth immunoglobulin (Ig)-like domain from the human titin C terminus. Titin (also called connectin) is a fibrous sarcomeric protein specifically found in vertebrate striated muscle. Titin is gigantic; depending on isoform composition it ranges from 2970 to 3700 kDa, and is of a length that spans half a sarcomere. Titin largely consists of multiple repeats of Ig-like and fibronectin type 3 (FN-III)-like domains. Titin connects the ends of myosin thick filaments to Z disks and extends along the thick filament to the H zone, and appears to function similar to an elastic band, keeping the myosin filaments centered in the sarcomere during muscle contraction or stretching.
Probab=34.41 E-value=61 Score=18.71 Aligned_cols=23 Identities=9% Similarity=0.400 Sum_probs=13.4
Q ss_pred EEecCCCceEeCCCCeEEEEEEE
Q psy1396 30 KLVERPQPVVLAPHDFCNIKANV 52 (66)
Q Consensus 30 klverpq~~tL~P~~~~~i~a~i 52 (66)
+++.+|+......|+..++.+.+
T Consensus 5 ~~~~~p~~~~~~~G~~~~L~C~~ 27 (92)
T cd05747 5 TILTKPRSLTVSEGESARFSCDV 27 (92)
T ss_pred cccccCccEEEeCCCcEEEEEEE
Confidence 34555666666666666665543
No 41
>TIGR00576 dut deoxyuridine 5'-triphosphate nucleotidohydrolase (dut). Changed role from 132 to 123. RTD
Probab=32.51 E-value=90 Score=20.50 Aligned_cols=30 Identities=10% Similarity=0.257 Sum_probs=21.5
Q ss_pred CCceEeCCCCeEEEEEEEEEecee--ceEEEE
Q psy1396 35 PQPVVLAPHDFCNIKANVKVASTE--NGIIFG 64 (66)
Q Consensus 35 pq~~tL~P~~~~~i~a~iKVsSte--tGvIfG 64 (66)
|..+.|.|+++..+.+.+++.-.+ .|.|++
T Consensus 28 ~~d~~i~P~~~~lv~tg~~v~ip~g~~~~i~~ 59 (141)
T TIGR00576 28 AEDVTIPPGERALVPTGIAIELPDGYYGRVAP 59 (141)
T ss_pred CCCeEECCCCEEEEEeCcEEecCCCEEEEEEe
Confidence 445789999999999998886443 344543
No 42
>PHA03126 dUTPase; Provisional
Probab=31.99 E-value=67 Score=25.05 Aligned_cols=30 Identities=23% Similarity=0.400 Sum_probs=24.5
Q ss_pred CCceEeCCCCeEEEEEEEEEece---eceEEEE
Q psy1396 35 PQPVVLAPHDFCNIKANVKVAST---ENGIIFG 64 (66)
Q Consensus 35 pq~~tL~P~~~~~i~a~iKVsSt---etGvIfG 64 (66)
|..++|.|++.+.|.+-|++.-. ..|.|||
T Consensus 185 ~edvvI~Pge~~lV~TGIai~ip~~g~~~~I~p 217 (326)
T PHA03126 185 PTDATIEPDESHFVDLPIVFASSNPAVTPCIFG 217 (326)
T ss_pred CCCcEECCCCEEEEEcCeEEEcCCCCeEEEEeC
Confidence 35689999999999999999843 4577776
No 43
>PLN02808 alpha-galactosidase
Probab=31.30 E-value=57 Score=25.51 Aligned_cols=45 Identities=9% Similarity=0.055 Sum_probs=28.6
Q ss_pred EEEEecCcccccceEEEEEecC-----CeEE-----------ecCCCceEeCCCCeEEEEEE
Q psy1396 6 YKVLSVTGDTLQNCTLELATLG-----DLKL-----------VERPQPVVLAPHDFCNIKAN 51 (66)
Q Consensus 6 ~llvNqT~~tLQNl~vElat~G-----dLkl-----------verpq~~tL~P~~~~~i~a~ 51 (66)
+.+.|+.++ -+.+++.|+.+| ..++ ....-.++|+||+.+-+|.+
T Consensus 324 Val~N~~~~-~~~~~~~~~~lgl~~~~~~~vrDlWs~~~~g~~~~~~~~~v~pHg~~~~rlt 384 (386)
T PLN02808 324 VVLWNRGSS-RATITARWSDIGLNSSAVVNARDLWAHSTQSSVKGQLSALVESHACKMYVLT 384 (386)
T ss_pred EEEEECCCC-CEEEEEEHHHhCCCCCCceEEEECCCCCccCcccceEEEEECCceEEEEEEe
Confidence 567898775 467888876665 1122 22223567899988877754
No 44
>PF00635 Motile_Sperm: MSP (Major sperm protein) domain; InterPro: IPR000535 Major sperm proteins (MSP) are central components in molecular interactions underlying sperm motility in Caenorhabditis elegans, whose sperm employ an amoebae-like crawling motion using a MSP-containing lamellipod, rather than the flagellar-based swimming motion associated with other sperm. These proteins oligomerise to form an extensive filament system that extends from sperm villipoda, along the leading edge of the pseudopod. About 30 MSP isoforms may exist in C. elegans. MSPs form a fibrous network, whereby MSP dimers form helical subfilaments that coil around one another to produce filaments, which in turn form supercoils to produce bundles. The crystal structure of MSP from C. elegans reveals an immunoglobulin (Ig)-like seven-stranded beta sandwich fold []. ; GO: 0005198 structural molecule activity; PDB: 1MSP_A 3MSP_B 2BVU_B 2MSP_C 1Z9O_F 1Z9L_A 3IKK_A 1WIC_A 2CRI_A 2RR3_A ....
Probab=31.18 E-value=1.2e+02 Score=17.88 Aligned_cols=47 Identities=9% Similarity=0.174 Sum_probs=30.1
Q ss_pred eEEEEecCcccccceEEEEEecCCeEEecCCCceEeCCCCeEEEEEEEEE
Q psy1396 5 NYKVLSVTGDTLQNCTLELATLGDLKLVERPQPVVLAPHDFCNIKANVKV 54 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat~GdLklverpq~~tL~P~~~~~i~a~iKV 54 (66)
.+.|.|.+.. .+..-+-|...-...-+|..--|.|+++..|..+.+=
T Consensus 23 ~l~l~N~s~~---~i~fKiktt~~~~y~v~P~~G~i~p~~~~~i~I~~~~ 69 (109)
T PF00635_consen 23 ELTLTNPSDK---PIAFKIKTTNPNRYRVKPSYGIIEPGESVEITITFQP 69 (109)
T ss_dssp EEEEEE-SSS---EEEEEEEES-TTTEEEESSEEEE-TTEEEEEEEEE-S
T ss_pred EEEEECCCCC---cEEEEEEcCCCceEEecCCCEEECCCCEEEEEEEEEe
Confidence 4556666544 5666666665545556788888999999999887764
No 45
>TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.
Probab=30.72 E-value=74 Score=19.89 Aligned_cols=28 Identities=25% Similarity=0.401 Sum_probs=23.5
Q ss_pred eEEEEEecCCeEEecCCCceEeCCCCeEEE
Q psy1396 19 CTLELATLGDLKLVERPQPVVLAPHDFCNI 48 (66)
Q Consensus 19 l~vElat~GdLklverpq~~tL~P~~~~~i 48 (66)
+.+|+..-| +.+.+...+.+.+|+.+.+
T Consensus 46 v~a~~~~dG--~~~t~~~~V~vrAGd~~~v 73 (75)
T TIGR03000 46 VTAEYDRDG--RILTRTRTVVVRAGDTVTV 73 (75)
T ss_pred EEEEEecCC--cEEEEEEEEEEcCCceEEe
Confidence 567777777 8999999999999998765
No 46
>PRK00137 rplI 50S ribosomal protein L9; Reviewed
Probab=29.64 E-value=48 Score=22.26 Aligned_cols=18 Identities=28% Similarity=0.693 Sum_probs=13.2
Q ss_pred EEEEEEeceeceEEEEeC
Q psy1396 49 KANVKVASTENGIIFGNI 66 (66)
Q Consensus 49 ~a~iKVsStetGvIfG~I 66 (66)
...|+...-+.|-+||.|
T Consensus 76 ~l~i~~k~g~~gklfGsV 93 (147)
T PRK00137 76 TVTIKAKAGEDGKLFGSV 93 (147)
T ss_pred EEEEEEEcCCCCeEEeee
Confidence 345555566999999986
No 47
>cd02980 TRX_Fd_family Thioredoxin (TRX)-like [2Fe-2S] Ferredoxin (Fd) family; composed of [2Fe-2S] Fds with a TRX fold (TRX-like Fds) and proteins containing domains similar to TRX-like Fd including formate dehydrogenases, NAD-reducing hydrogenases and the subunit E of NADH:ubiquinone oxidoreductase (NuoE). TRX-like Fds are soluble low-potential electron carriers containing a single [2Fe-2S] cluster. The exact role of TRX-like Fd is still unclear. It has been suggested that it may be involved in nitrogen fixation. Its homologous domains in large redox enzymes (such as Nuo and hydrogenases) function as electron carriers.
Probab=28.87 E-value=1e+02 Score=17.22 Aligned_cols=25 Identities=20% Similarity=0.266 Sum_probs=22.1
Q ss_pred EEEEEecCCeEEecCCCceEeCCCC
Q psy1396 20 TLELATLGDLKLVERPQPVVLAPHD 44 (66)
Q Consensus 20 ~vElat~GdLklverpq~~tL~P~~ 44 (66)
.+++...|-|..+.+.+.+-+.|.+
T Consensus 35 ~v~v~~~~Clg~C~~~P~v~i~~~~ 59 (77)
T cd02980 35 RVTVERVGCLGACGLAPVVVVYPDG 59 (77)
T ss_pred eEEEEEcCCcCcccCCCEEEEeCCC
Confidence 6889999999999999999999844
No 48
>PF07610 DUF1573: Protein of unknown function (DUF1573); InterPro: IPR011467 These hypothetical proteins from bacteria, such as Rhodopirellula baltica, Bacteroides thetaiotaomicron and Porphyromonas gingivalis, share a region of conserved sequence towards their N termini.
Probab=28.19 E-value=75 Score=17.02 Aligned_cols=43 Identities=14% Similarity=0.154 Sum_probs=24.7
Q ss_pred EEEEecCcccccceEEEEEecCCeEEecCCCceEeCCCCeEEEEEE
Q psy1396 6 YKVLSVTGDTLQNCTLELATLGDLKLVERPQPVVLAPHDFCNIKAN 51 (66)
Q Consensus 6 ~llvNqT~~tLQNl~vElat~GdLklverpq~~tL~P~~~~~i~a~ 51 (66)
+-+.|..++.|.=-.|+-+ -+=-.++-+ .-.|+||++..|+++
T Consensus 2 F~~~N~g~~~L~I~~v~ts--CgCt~~~~~-~~~i~PGes~~i~v~ 44 (45)
T PF07610_consen 2 FEFTNTGDSPLVITDVQTS--CGCTTAEYS-KKPIAPGESGKIKVT 44 (45)
T ss_pred EEEEECCCCcEEEEEeeEc--cCCEEeeCC-cceECCCCEEEEEEE
Confidence 4566777766643333322 222233333 356999999988875
No 49
>PF13598 DUF4139: Domain of unknown function (DUF4139)
Probab=27.79 E-value=55 Score=23.32 Aligned_cols=23 Identities=30% Similarity=0.286 Sum_probs=19.8
Q ss_pred eEEEEecCcccccceEEEEEecC
Q psy1396 5 NYKVLSVTGDTLQNCTLELATLG 27 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat~G 27 (66)
.-.|.|+|.+..+|+.+.|+|.-
T Consensus 32 ~a~V~q~TGeDW~~v~LtLsT~~ 54 (317)
T PF13598_consen 32 WAEVRQNTGEDWNNVKLTLSTAR 54 (317)
T ss_pred EEEEEcCCCCCccCceEEEEeCC
Confidence 34689999999999999999854
No 50
>PRK15308 putative fimbrial protein TcfA; Provisional
Probab=27.57 E-value=72 Score=23.27 Aligned_cols=46 Identities=17% Similarity=0.222 Sum_probs=33.3
Q ss_pred eEEEEecCcccccceEEEEEecC--------Ce--------EEecCCCceEeCCCCeEEEEEE
Q psy1396 5 NYKVLSVTGDTLQNCTLELATLG--------DL--------KLVERPQPVVLAPHDFCNIKAN 51 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat~G--------dL--------klverpq~~tL~P~~~~~i~a~ 51 (66)
.+-|.|+++++ +-++|+++-.- +. .|+-.|+.++|+|+++..||..
T Consensus 36 ~v~V~N~g~~~-~~vqV~v~r~~~PG~~~e~~~~~~~~~~~eLiaSP~~l~L~pg~~q~IRli 97 (234)
T PRK15308 36 SLFVYSKSDHT-QYVRTRIKRIEHPATPQEKEVPAGNDIETGLVVSPEKFALPAGTTRTVRVI 97 (234)
T ss_pred EEEEEeCCCCc-EEEEEEEEEEcCCCCCCCcccccccCCCCcEEEcCceeEECCCCeEEEEEE
Confidence 57788887765 45666664332 11 4778899999999999999853
No 51
>PF12790 T6SS-SciN: Type VI secretion lipoprotein; InterPro: IPR017734 This entry represents a family of lipoproteins associated with IAHP-related loci, thought to be type VI secretion system protein []. ; PDB: 3RX9_A.
Probab=27.48 E-value=80 Score=20.45 Aligned_cols=30 Identities=17% Similarity=0.244 Sum_probs=16.8
Q ss_pred CceEeCCCCeEEEEEEEEEeceeceEEEEe
Q psy1396 36 QPVVLAPHDFCNIKANVKVASTENGIIFGN 65 (66)
Q Consensus 36 q~~tL~P~~~~~i~a~iKVsStetGvIfG~ 65 (66)
..+.|.|++...+.....=...--||+-||
T Consensus 81 ~e~~l~Pg~~~~~~~~~~~~aryigvvA~f 110 (142)
T PF12790_consen 81 DEFVLQPGESRTLTLDRDPDARYIGVVAGF 110 (142)
T ss_dssp EEEEE-TT-EEEEEEE--TT--EEEEEE--
T ss_pred eEEEECCCCcEeeeEccCCCCcEEEEEEEE
Confidence 356799999999887666666667777664
No 52
>TIGR02274 dCTP_deam deoxycytidine triphosphate deaminase. Members of this family include the Escherichia coli monofunctional deoxycytidine triphosphate deaminase (dCTP deaminase) and a Methanocaldococcus jannaschii bifunctional dCTP deaminase (3.5.4.13)/dUTP diphosphatase (EC 3.6.1.23), which has the EC number 3.5.4.30 for the overall operation.
Probab=27.23 E-value=2e+02 Score=19.42 Aligned_cols=62 Identities=18% Similarity=0.272 Sum_probs=38.5
Q ss_pred cceEEEEecCcccccceEEEEEecCCeEEe-----------------------cCCCceEeCCCCeEEEEEEEEEece--
Q psy1396 3 QNNYKVLSVTGDTLQNCTLELATLGDLKLV-----------------------ERPQPVVLAPHDFCNIKANVKVAST-- 57 (66)
Q Consensus 3 ~~~~llvNqT~~tLQNl~vElat~GdLklv-----------------------erpq~~tL~P~~~~~i~a~iKVsSt-- 57 (66)
.++++|-+-..+.+|-..++|.-....++. +....+.|.|+++.-+.+..++.-.
T Consensus 13 ~g~i~I~Pf~~~~v~p~s~DLrlg~~~~~~~~~~~~~id~~~~~~~~~~~~~~~~~~~~~l~Pg~~~lv~t~e~i~lP~~ 92 (179)
T TIGR02274 13 EGLLKIEPLDEEQLQPAGVDLRLGNEFRVFRNHTGAVIDPENPKEAVSYLFEVEEGEEFVIPPGEFALATTLEYVKLPDD 92 (179)
T ss_pred cCCEEEcCCCccccCCceEEEecCCEEEEEeCCCCcccCcccccccceeeeeeccCCcEEECCCCEEEEEeceEEEcCCC
Confidence 345566666667777777776543333321 2234689999999988888777633
Q ss_pred eceEEEE
Q psy1396 58 ENGIIFG 64 (66)
Q Consensus 58 etGvIfG 64 (66)
-.|.|++
T Consensus 93 ~~~~i~~ 99 (179)
T TIGR02274 93 VVGFLEG 99 (179)
T ss_pred eEEEEEe
Confidence 3444444
No 53
>PF14874 PapD-like: Flagellar-associated PapD-like
Probab=26.06 E-value=1.5e+02 Score=17.43 Aligned_cols=56 Identities=9% Similarity=0.052 Sum_probs=30.2
Q ss_pred eEEEEecCcccccceEEEEEecCCeEEecCCCceEeCCCCeEEEEEEEEEeceeceEE
Q psy1396 5 NYKVLSVTGDTLQNCTLELATLGDLKLVERPQPVVLAPHDFCNIKANVKVASTENGII 62 (66)
Q Consensus 5 ~~llvNqT~~tLQNl~vElat~GdLklverpq~~tL~P~~~~~i~a~iKVsStetGvI 62 (66)
.+.|.|.+... ....+......+-.+--.|+.-.|+|+++..++..+.= +.+.|.+
T Consensus 25 ~v~l~N~s~~p-~~f~v~~~~~~~~~~~v~~~~g~l~PG~~~~~~V~~~~-~~~~g~~ 80 (102)
T PF14874_consen 25 TVTLTNTSSIP-ARFRVRQPESLSSFFSVEPPSGFLAPGESVELEVTFSP-TKPLGDY 80 (102)
T ss_pred EEEEEECCCCC-EEEEEEeCCcCCCCEEEECCCCEECCCCEEEEEEEEEe-CCCCceE
Confidence 35677777665 33444443311112222456678999998777665541 3344543
No 54
>PF00692 dUTPase: dUTPase; InterPro: IPR008180 Synonym(s): dUTP diphosphatase, Deoxyuridine-triphosphatase The essential enzyme dUTP pyrophosphatase (3.6.1.23 from EC) is specific for dUTP and is critical for the fidelity of DNA replication and repair. dUTPase hydrolyzes dUTP to dUMP and pyrophosphate, simultaneously reducing dUTP levels and providing the dUMP for dTTP biosynthesis. dUTPase decreases the intracellular concentration of dUTP so that uracil cannot be incorporated into DNA []. The crystal structure of human dUTPase reveals that each subunit of the dUTPase trimer folds into an eight-stranded jelly-roll beta barrel, with the C-terminal beta strands interchanged among the subunits. The structure is similar to that of the Escherichia coli enzyme, despite low sequence homology between the two enzymes []. Other enzymes like deoxycytidine triphosphate deaminase (dCTP) (3.5.4.13 from EC) that specifically bind uridine also belong to this group suggesting that the signature may recognise a putative uridine-binding motif. Some retroviruses encode dUTPases. Retroviral dUTPase is synthesised as part of POL polyprotein that contains an aspartyl protease, a reverse transcriptase, dUTPase and RNase H. ; GO: 0016787 hydrolase activity, 0046080 dUTP metabolic process; PDB: 4DHK_A 1DUC_A 1DUN_A 3LQW_A 2BT1_A 2WE1_A 2WE0_A 2WE2_A 2BSY_A 2WE3_A ....
Probab=25.71 E-value=1.2e+02 Score=18.97 Aligned_cols=25 Identities=12% Similarity=0.291 Sum_probs=19.8
Q ss_pred CCCceEeCCCCeEEEEEEEEEecee
Q psy1396 34 RPQPVVLAPHDFCNIKANVKVASTE 58 (66)
Q Consensus 34 rpq~~tL~P~~~~~i~a~iKVsSte 58 (66)
.|..+.|.|+++..+.+.+++.-.+
T Consensus 20 ~~~~~~i~p~~~~~v~t~~~~~~p~ 44 (129)
T PF00692_consen 20 APEDFVIPPGETVLVPTGEEINIPP 44 (129)
T ss_dssp -SSSEEEETTEEEEEEEEEEEE-ST
T ss_pred cCCCEEECCCCEEEEEeCeEEECCC
Confidence 4567899999999999999987544
No 55
>cd04968 Ig3_Contactin_like Third Ig domain of contactin. Ig3_Contactin_like: Third Ig domain of contactins. Contactins are neural cell adhesion molecules and are comprised of six Ig domains followed by four fibronectin type III(FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. The first four Ig domains form the intermolecular binding fragment, which arranges as a compact U-shaped module via contacts between Ig domains 1 and 4, and between Ig domains 2 and 3. Contactin-2 (TAG-1, axonin-1) may play a part in the neuronal processes of neurite outgrowth, axon guidance and fasciculation, and neuronal migration. This group also includes contactin-1 and contactin-5. The different contactins show different expression patterns in the central nervous system. During development and in adulthood, contactin-2 is transiently expressed in subsets of central and peripheral neurons. Contactin-5 is expressed specifically in the rat postnatal nervous system, peaking at about 3 week
Probab=24.57 E-value=1.3e+02 Score=16.94 Aligned_cols=23 Identities=9% Similarity=0.027 Sum_probs=16.5
Q ss_pred EEecCCCceEeCCCCeEEEEEEE
Q psy1396 30 KLVERPQPVVLAPHDFCNIKANV 52 (66)
Q Consensus 30 klverpq~~tL~P~~~~~i~a~i 52 (66)
.++..|...+..+++.+.+.+.+
T Consensus 3 ~~~~~p~~~~~~~g~~v~l~C~~ 25 (88)
T cd04968 3 IIVVFPKDTYALKGQNVTLECFA 25 (88)
T ss_pred cEEECCCceEEeCCCcEEEEEEe
Confidence 35667777777788888777754
No 56
>PF07679 I-set: Immunoglobulin I-set domain; InterPro: IPR013098 The basic structure of immunoglobulin (Ig) molecules is a tetramer of two light chains and two heavy chains linked by disulphide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains: alpha, delta, epsilon, gamma and mu, all consisting of a variable domain (VH) and three (in alpha, delta and gamma) or four (in epsilon and mu) constant domains (CH1 to CH4). Ig molecules are highly modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. The domains in Ig and Ig-like molecules are grouped into four types: V-set (variable; IPR013106 from INTERPRO), C1-set (constant-1; IPR003597 from INTERPRO), C2-set (constant-2; IPR008424 from INTERPRO) and I-set (intermediate; IPR013098 from INTERPRO) []. Structural studies have shown that these domains share a common core Greek-key beta-sandwich structure, with the types differing in the number of strands in the beta-sheets as well as in their sequence patterns [, ]. Immunoglobulin-like domains that are related in both sequence and structure can be found in several diverse protein families. Ig-like domains are involved in a variety of functions, including cell-cell recognition, cell-surface receptors, muscle structure and the immune system []. This entry represents I-set domains, which are found in several cell adhesion molecules, including vascular (VCAM), intercellular (ICAM), neural (NCAM) and mucosal addressin (MADCAM) cell adhesion molecules, as well as junction adhesion molecules (JAM). I-set domains are also present in several other diverse protein families, including several tyrosine-protein kinase receptors, the hemolymph protein hemolin, the muscle proteins titin, telokin, and twitchin, the neuronal adhesion molecule axonin-1 [], and the signalling molecule semaphorin 4D that is involved in axonal guidance, immune function and angiogenesis [].; PDB: 3MTR_A 2EDK_A 3DMK_B 1KOA_A 3NCM_A 2NCM_A 2V9Q_A 2CR3_A 3QQN_A 3QR2_A ....
Probab=24.47 E-value=1.4e+02 Score=16.49 Aligned_cols=24 Identities=13% Similarity=0.346 Sum_probs=19.0
Q ss_pred EEecCCCceEeCCCCeEEEEEEEE
Q psy1396 30 KLVERPQPVVLAPHDFCNIKANVK 53 (66)
Q Consensus 30 klverpq~~tL~P~~~~~i~a~iK 53 (66)
++..+|..++...|+...+.+.+.
T Consensus 2 ~~~~~~~~~~v~~G~~~~l~c~~~ 25 (90)
T PF07679_consen 2 VFTKKPKDVTVKEGESVTLECEVS 25 (90)
T ss_dssp EEEEESSEEEEETTSEEEEEEEEE
T ss_pred EEEEecCCEEEeCCCEEEEEEEEE
Confidence 456778888888888888888775
No 57
>PF01002 Flavi_NS2B: Flavivirus non-structural protein NS2B; InterPro: IPR000487 Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. All, but two, are cleaved by the NS2B-NS3 protease complex [, ].; GO: 0004252 serine-type endopeptidase activity, 0019012 virion; PDB: 2WV9_A 2FOM_A 2VBC_B 3U1I_C 3U1J_A 3LKW_A 3L6P_A 2GGV_A 3E90_C 2IJO_A ....
Probab=23.70 E-value=60 Score=21.83 Aligned_cols=21 Identities=19% Similarity=0.325 Sum_probs=16.3
Q ss_pred cceEEEEEecCCeEEecCCCc
Q psy1396 17 QNCTLELATLGDLKLVERPQP 37 (66)
Q Consensus 17 QNl~vElat~GdLklverpq~ 37 (66)
.++.|++-..||+|+++.|.+
T Consensus 70 ~rldV~~d~~G~f~l~~~~~~ 90 (128)
T PF01002_consen 70 VRLDVKLDDDGNFKLINEEGE 90 (128)
T ss_dssp EEEEEEE-TTS-EEETTSTTT
T ss_pred eEEEEEECCCCCEEeccCCCc
Confidence 468899999999999999874
No 58
>PF10648 Gmad2: Immunoglobulin-like domain of bacterial spore germination; InterPro: IPR018911 This domain is found linked to IPR019606 from INTERPRO in some bacterial proteins. It is predicted to contain an immunoglobulin-like all-beta fold.
Probab=23.52 E-value=1.9e+02 Score=17.80 Aligned_cols=46 Identities=9% Similarity=0.044 Sum_probs=35.6
Q ss_pred ceEEEEEecCCeEEecCCCceEeCCCCeEEEEEEEEEec--eeceEEE
Q psy1396 18 NCTLELATLGDLKLVERPQPVVLAPHDFCNIKANVKVAS--TENGIIF 63 (66)
Q Consensus 18 Nl~vElat~GdLklverpq~~tL~P~~~~~i~a~iKVsS--tetGvIf 63 (66)
++.+++..-.+.-+.|.+....=+...+-.++.+|.++. .+.|.+.
T Consensus 30 tv~~rv~D~~g~vl~e~~~~a~~g~~~~g~F~~tv~~~~~~~~~g~l~ 77 (88)
T PF10648_consen 30 TVNIRVRDGHGEVLAEGFVTATGGAPSWGPFEGTVSFPPPPPGKGTLE 77 (88)
T ss_pred EEEEEEEcCCCcEEEEeeEEeccCCCcccceEEEEEeCCCCCCceEEE
Confidence 567788777777778888887788899999999999886 4555543
No 59
>PHA02703 ORF007 dUTPase; Provisional
Probab=23.38 E-value=1.5e+02 Score=20.36 Aligned_cols=30 Identities=10% Similarity=0.097 Sum_probs=20.2
Q ss_pred CCceEeCCCCeEEEEEEEEEe--ceeceEEEE
Q psy1396 35 PQPVVLAPHDFCNIKANVKVA--STENGIIFG 64 (66)
Q Consensus 35 pq~~tL~P~~~~~i~a~iKVs--StetGvIfG 64 (66)
|..++|.|++...+.+-+++. --..|.||+
T Consensus 41 ~~d~vi~P~~~~lv~TGi~i~iP~g~~g~i~~ 72 (165)
T PHA02703 41 ACDCIVPAGCRCVVFTDLLIKLPDGCYGRIAP 72 (165)
T ss_pred CCCeEECCCCEEEEeCCeEEEcCCCeEEEEEC
Confidence 346899999999999766554 333444543
No 60
>PF14310 Fn3-like: Fibronectin type III-like domain; PDB: 3ABZ_D 3AC0_D 2X40_A 2X41_A 2X42_A.
Probab=23.01 E-value=1.3e+02 Score=17.08 Aligned_cols=18 Identities=22% Similarity=0.233 Sum_probs=12.4
Q ss_pred ceEeCCCCeEEEEEEEEE
Q psy1396 37 PVVLAPHDFCNIKANVKV 54 (66)
Q Consensus 37 ~~tL~P~~~~~i~a~iKV 54 (66)
.+.|+||++.++.-.|..
T Consensus 26 rv~l~pGes~~v~~~l~~ 43 (71)
T PF14310_consen 26 RVSLAPGESKTVSFTLPP 43 (71)
T ss_dssp EEEE-TT-EEEEEEEEEH
T ss_pred EEEECCCCEEEEEEEECH
Confidence 356999999999877764
No 61
>COG4454 Uncharacterized copper-binding protein [Inorganic ion transport and metabolism]
Probab=22.85 E-value=73 Score=22.51 Aligned_cols=24 Identities=25% Similarity=0.408 Sum_probs=19.1
Q ss_pred CCeEEecCCCceEeCCCCeEEEEEE
Q psy1396 27 GDLKLVERPQPVVLAPHDFCNIKAN 51 (66)
Q Consensus 27 GdLklverpq~~tL~P~~~~~i~a~ 51 (66)
+|++. +.|..++|+|+++..+-..
T Consensus 106 ~Dme~-d~~~~v~L~PG~s~elvv~ 129 (158)
T COG4454 106 DDMEH-DDPNTVTLAPGKSGELVVV 129 (158)
T ss_pred Ccccc-CCcceeEeCCCCcEEEEEE
Confidence 36665 9999999999999877543
No 62
>PLN02547 dUTP pyrophosphatase
Probab=22.82 E-value=1.9e+02 Score=19.61 Aligned_cols=22 Identities=9% Similarity=0.143 Sum_probs=18.4
Q ss_pred CceEeCCCCeEEEEEEEEEece
Q psy1396 36 QPVVLAPHDFCNIKANVKVAST 57 (66)
Q Consensus 36 q~~tL~P~~~~~i~a~iKVsSt 57 (66)
..++|.|+++..+.+-+++.--
T Consensus 45 ~d~~i~P~~~~li~tgi~v~iP 66 (157)
T PLN02547 45 YDTVVPARGKALVPTDLSIAIP 66 (157)
T ss_pred CCeEECCCCEEEEEeceEEEcC
Confidence 4678999999999999998643
No 63
>cd04981 IgV_H Immunoglobulin (Ig) heavy chain (H), variable (V) domain. IgV_H: Immunoglobulin (Ig) heavy chain (H), variable (V) domain. The basic structure of Ig molecules is a tetramer of two light chains and two heavy chains linked by disulfide bonds. In Ig, each chain is composed of one variable domain (IgV) and one or more constant domains (IgC); these names reflect the fact that the variability in sequences is higher in the variable domain than in the constant domain. There are five types of heavy chains (alpha, gamma, delta, epsilon, and mu), which determine the type of immunoglobulin: IgA, IgG, IgD, IgE, and IgM, respectively. In higher vertebrates, there are two types of light chain, designated kappa and lambda, which can associate with any of the heavy chains. This family includes alpha, gamma, delta, epsilon, and mu heavy chains.
Probab=22.66 E-value=1.7e+02 Score=18.07 Aligned_cols=23 Identities=9% Similarity=0.164 Sum_probs=19.1
Q ss_pred CCCceEeCCCCeEEEEEEEEEec
Q psy1396 34 RPQPVVLAPHDFCNIKANVKVAS 56 (66)
Q Consensus 34 rpq~~tL~P~~~~~i~a~iKVsS 56 (66)
.|+.....+++.+.++++++-++
T Consensus 3 ~~~~~~v~~G~~vtL~C~~~~~~ 25 (117)
T cd04981 3 ESGPGLVKPGQSLKLSCKASGFT 25 (117)
T ss_pred cCCCeEEcCCCCEEEEEEEeCCC
Confidence 58888999999999999876544
No 64
>PF11906 DUF3426: Protein of unknown function (DUF3426); InterPro: IPR021834 This family of proteins are functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 262 to 463 amino acids in length.
Probab=22.52 E-value=2.2e+02 Score=18.12 Aligned_cols=50 Identities=14% Similarity=0.116 Sum_probs=32.7
Q ss_pred EEEEecCccccc--ceEEEEEecCCeEEecC---C---------CceEeCCCCeEEEEEEEEEe
Q psy1396 6 YKVLSVTGDTLQ--NCTLELATLGDLKLVER---P---------QPVVLAPHDFCNIKANVKVA 55 (66)
Q Consensus 6 ~llvNqT~~tLQ--Nl~vElat~GdLklver---p---------q~~tL~P~~~~~i~a~iKVs 55 (66)
.-|.|++..... .|.++|.-..+--+.+| | ..-.|+|++...++..+-..
T Consensus 74 g~i~N~~~~~~~~P~l~l~L~D~~g~~l~~r~~~P~~yl~~~~~~~~~l~pg~~~~~~~~~~~p 137 (149)
T PF11906_consen 74 GTIRNRADFPQALPALELSLLDAQGQPLARRVFTPADYLPPGLAAQAGLPPGESVPFRLRLEDP 137 (149)
T ss_pred EEEEeCCCCcccCceEEEEEECCCCCEEEEEEEChHHhcccccccccccCCCCeEEEEEEeeCC
Confidence 457788877654 66667775555433332 3 14569999999999887643
No 65
>PRK13956 dut deoxyuridine 5'-triphosphate nucleotidohydrolase; Provisional
Probab=22.50 E-value=1.3e+02 Score=20.38 Aligned_cols=27 Identities=30% Similarity=0.420 Sum_probs=21.4
Q ss_pred CeEEecCCCceEeCCCCeEEEEEEEEEece
Q psy1396 28 DLKLVERPQPVVLAPHDFCNIKANVKVAST 57 (66)
Q Consensus 28 dLklverpq~~tL~P~~~~~i~a~iKVsSt 57 (66)
||... ..++|.|++...+.+-+++.--
T Consensus 30 DL~a~---~~~~i~p~~~~lv~TGi~i~lP 56 (147)
T PRK13956 30 DLKVA---ERTVIAPGEIKLVPTGVKAYMQ 56 (147)
T ss_pred ccccC---CCeEECCCCEEEEECCeEEECC
Confidence 55443 4789999999999999998654
No 66
>PHA03094 dUTPase; Provisional
Probab=21.59 E-value=1.2e+02 Score=20.17 Aligned_cols=23 Identities=9% Similarity=0.234 Sum_probs=19.1
Q ss_pred CCceEeCCCCeEEEEEEEEEece
Q psy1396 35 PQPVVLAPHDFCNIKANVKVAST 57 (66)
Q Consensus 35 pq~~tL~P~~~~~i~a~iKVsSt 57 (66)
|..+.|.|+++..+.+-+++.-.
T Consensus 33 ~~~~~i~P~~~~lv~Tg~~i~ip 55 (144)
T PHA03094 33 AYDYTVPPKERILVKTDISLSIP 55 (144)
T ss_pred CCCeEECCCCEEEEEcCeEEEcC
Confidence 45689999999999999888643
No 67
>cd04972 Ig_TrkABC_d4 Fourth domain (immunoglobulin-like) of Trk receptors TrkA, TrkB and TrkC. TrkABC_d4: the fourth domain of Trk receptors TrkA, TrkB and TrkC, this is an immunoglobulin (Ig)-like domain which binds to neurotrophin. The Trk family of receptors are tyrosine kinase receptors. They are activated by dimerization, leading to autophosphorylation of intracellular tyrosine residues, and triggering the signal transduction pathway. TrkA, TrkB, and TrkC share significant sequence homology and domain organization. The first three domains are leucine-rich domains. The fourth and fifth domains are Ig-like domains playing a part in ligand binding. TrkA, B and C mediate the trophic effects of the neurotrophin Nerve growth factor (NGF) family. TrkA is recognized by NGF. TrkB is recognized by brain-derived neurotrophic factor (BDNF) and neurotrophin (NT)-4. TrkC is recognized by NT-3. NT-3 is promiscuous as in some cell systems it activates TrkA and TrkB receptors. TrkA is a receptor fo
Probab=20.77 E-value=1.7e+02 Score=16.66 Aligned_cols=22 Identities=18% Similarity=0.345 Sum_probs=14.0
Q ss_pred ecCCCceEeCCCCeEEEEEEEE
Q psy1396 32 VERPQPVVLAPHDFCNIKANVK 53 (66)
Q Consensus 32 verpq~~tL~P~~~~~i~a~iK 53 (66)
++.|+.....+|+...+.+.+.
T Consensus 4 v~~p~~~~v~~G~~v~l~C~~~ 25 (90)
T cd04972 4 VDGPNATVVYEGGTATIRCTAE 25 (90)
T ss_pred ccCCcCEEEcCCCcEEEEEEEE
Confidence 4556666677777777766554
No 68
>PTZ00143 deoxyuridine 5'-triphosphate nucleotidohydrolase; Provisional
Probab=20.49 E-value=1.7e+02 Score=20.05 Aligned_cols=22 Identities=23% Similarity=0.309 Sum_probs=19.0
Q ss_pred CCceEeCCCCeEEEEEEEEEec
Q psy1396 35 PQPVVLAPHDFCNIKANVKVAS 56 (66)
Q Consensus 35 pq~~tL~P~~~~~i~a~iKVsS 56 (66)
+..++|.|++...|.+-|++.-
T Consensus 34 ~~~~~i~Pg~~~~V~tGi~i~~ 55 (155)
T PTZ00143 34 VKDQTIKPGETAFIKLGIKAAA 55 (155)
T ss_pred CCCeEECCCCEEEEECCeEEEc
Confidence 3568999999999999999884
No 69
>PF14168 YjzC: YjzC-like protein
Probab=20.45 E-value=1.2e+02 Score=18.03 Aligned_cols=26 Identities=27% Similarity=0.337 Sum_probs=22.0
Q ss_pred ceEEEEEecCCeEEecCCCceEeCCCCe
Q psy1396 18 NCTLELATLGDLKLVERPQPVVLAPHDF 45 (66)
Q Consensus 18 Nl~vElat~GdLklverpq~~tL~P~~~ 45 (66)
.+++|.-.-|+. |+.|..++|..++.
T Consensus 17 G~Y~EvG~~G~~--v~~p~~v~l~~Gd~ 42 (57)
T PF14168_consen 17 GTYVEVGERGGH--VNNPKEVKLKKGDR 42 (57)
T ss_pred ceEEEECCCCCc--cCCCcEEEecCCCc
Confidence 467888888887 88999999998875
Done!