Query psy8623
Match_columns 68
No_of_seqs 117 out of 1228
Neff 9.7
Searched_HMMs 46136
Date Fri Aug 16 23:32:38 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy8623.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/8623hhsearch_cdd -cpu 12 -v 0
 No Hit                             Prob  E-value  P-value  Score    SS Cols  Query HMM Template HMM
  1 PF00028 Cadherin: Cadherin do   99.8    2E-18  4.3E-23   87.1   8.3   66  1-67      15-81     (93)
  2 smart00112 CA Cadherin repeats  99.7  5.6E-16  1.2E-20   76.0   7.8   60  7-67      1-60      (79)
  3 cd00031 CA Cadherin repeat dom  99.6  5.7E-14  1.2E-18   78.2   9.4   66  1-67      16-81     (199)
  4 cd00031 CA Cadherin repeat dom  99.5  1.3E-12  2.9E-17   72.7   9.2   65  1-66      121-185   (199)
  5 KOG4289|consensus               99.5  1.6E-13  3.6E-18   92.9   6.2   67  1-68      288-354   (2531)
  6 KOG4289|consensus               99.4  5.5E-13  1.2E-17   90.5   7.8   65  1-67      393-457   (2531)
  7 KOG1219|consensus               99.4  5.6E-13  1.2E-17   93.0   7.5   64  1-67      2699-2762 (4289)
  8 KOG1219|consensus               99.3  1.7E-11  3.6E-16   86.0   7.7   66  1-67      972-1037  (4289)
  9 KOG1834|consensus               98.3  5.1E-06  1.1E-10   54.3   6.7   61  1-65      167-229   (952)
 10 PF08266 Cadherin_2: Cadherin-   97.1   0.0023  5.1E-08   31.9   4.7   31  19-50     36-66     (84)
 11 KOG1834|consensus               96.0     0.07  1.5E-06   35.9   7.1   62  4-65      54-120    (952)
 12 smart00736 CADG Dystroglycan-t  95.2     0.17  3.8E-06   25.5   8.2   55  6-65      24-80     (97)
 13 PF05345 He_PIG: Putative Ig d   94.7     0.16  3.6E-06   22.7   5.9   35  29-65     14-49     (49)
 14 PF08758 Cadherin_pro: Cadheri   87.6      2.4  5.2E-05   21.4   6.1   35  29-65     46-80     (90)
 15 PF03413 PepSY: Peptidase prop   72.6      7.5  0.00016   17.3   2.9   29  14-42     29-62     (64)
 16 PF07495 Y_Y_Y: Y_Y_Y domain;    70.1      9.3   0.0002   17.3   5.5   47  13-65     6-52      (66)
 17 PF14157 YmzC: YmzC-like prote   64.9       13  0.00027   17.7   2.6   27  19-45     32-58     (63)
 18 cd02899 PLAT_SR Scavenger rece  62.7       21  0.00045   18.7   4.1   19  7-25      11-29     (109)
 19 PF12971 NAGLU_N: Alpha-N-acet   48.9       34  0.00073   16.9   3.0   29  16-44     19-48     (86)
 20 COG3212 Predicted membrane pro  45.2       41   0.0009   18.5   3.0   36  5-41      101-138   (144)
 21 PF12245 Big_3_2: Bacterial Ig   44.2       34  0.00075   15.6   2.7   14  51-64     22-35     (60)
 22 PF00635 Motile_Sperm: MSP (Ma   42.7       45  0.00097   16.5   5.5   25  14-40     31-55     (109)
 23 COG5448 Uncharacterized conser  42.5       65   0.0014   18.3   4.9   37  13-49     106-142   (184)
 24 PF13754 Big_3_4: Bacterial Ig   41.2       37   0.0008   15.1   2.6   15  50-64     22-36     (54)
 25 TIGR01965 VCBS_repeat VCBS rep  31.0       84   0.0018   16.2   6.1   35  2-39      2-36      (99)
 26 PF13860 FlgD_ig: FlgD Ig-like   31.0       71   0.0015   15.3   2.5   14  51-64     68-81     (81)
 27 PRK13744 conjugal transfer pro  30.1       45  0.00098   15.9   1.4   20  2-21      28-47     (83)
 28 COG2706 3-carboxymuconate cycl  29.7  1.6E+02   0.0034   18.9   4.2   15  29-43     269-283   (346)
 29 COG1770 PtrB Protease II [Amin  29.5  1.9E+02   0.0041   20.4   4.6   30  15-44     375-404   (682)
 30 COG3470 Tpd Uncharacterized pr  28.3       70   0.0015   18.2   2.1   27  14-40     134-160   (179)
 31 PF14302 DUF4377: Domain of un   28.0       85   0.0018   15.3   3.6   24  44-67     40-63     (80)
 32 PF01011 PQQ: PQQ enzyme repea   27.9       56   0.0012   13.2   1.9   17  29-45     11-27     (38)
 33 COG5584 Predicted small secret  27.8       55   0.0012   17.0   1.6   13  29-41     86-98     (103)
 34 PF03646 FlaG: FlaG protein;     27.7       46    0.001   16.9   1.4   27  14-40     54-80     (107)
 35 PF12134 PRP8_domainIV: PRP8 d   27.3       69   0.0015   19.2   2.1   15  29-43     47-61     (231)
 36 PRK08577 hypothetical protein;  26.1  1.1E+02   0.0025   16.2   4.8   34  30-64     34-67     (136)
 37 PF12461 DUF3688: Protein of u   25.9    1E+02   0.0022   15.6   2.7   14  29-42     73-86     (91)
 38 PF13750 Big_3_3: Bacterial Ig   25.5  1.3E+02   0.0028   16.6   8.4   17  48-64     119-135   (158)
 39 cd01756 PLAT_repeat PLAT/LH2 d  24.9  1.1E+02   0.0025   15.8   5.8   18  8-25      12-29     (120)
 40 PF12216 m04gp34like: Immune e   24.8  1.7E+02   0.0038   18.0   3.6   20  29-48     138-157   (272)
 41 PF02897 Peptidase_S9_N: Proly   24.5  1.7E+02   0.0037   18.3   3.6   30  15-44     383-412   (414)
 42 PF11102 Cap_synth_GfcB: Group   24.4  1.5E+02   0.0033   16.9   4.0   16  29-44     170-185   (200)
 43 PF01751 Toprim: Toprim domain   24.3       41  0.00089   16.7   0.8    9  3-11      63-71     (100)
 44 PF14564 Membrane_bind: Membra   23.7  1.1E+02   0.0024   16.0   2.3   17  29-45     71-87     (110)
 45 COG3049 Penicillin V acylase a  23.6  2.1E+02   0.0046   18.4   4.4   37  3-42      150-186   (353)
 46 PF07861 WND: WisP family N-Te   23.1  1.4E+02   0.0031   17.6   2.8   28  13-43     201-228   (263)
 47 cd00146 PKD polycystic kidney   22.3    1E+02   0.0022   14.3   3.0   18  47-64     52-69     (81)
No 1
>PF00028 Cadherin: Cadherin domain; InterPro: IPR002126 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism and that modulate a wide variety of processes, including cell polarisation and migration. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats and the fifth of which contains 4 conserved cysteines; a single transmembrane domain; and a C-terminal cytoplasmic domain. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats.
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and are essential for cadherin adhesive function and for protection against protease digestion.; GO: 0005509 calcium ion binding, 0007156 homophilic cell adhesion, 0016020 membrane; PDB: 2A4E_A 2A4C_B 2O72_A 2QVI_A 1NCJ_A 3Q2W_A 3Q2N_A 3LNH_B 3LNI_A 3Q2L_A ....
Probab=99.79 E-value=2e-18 Score=87.10 Aligned_cols=66 Identities=35% Similarity=0.570 Sum_probs=60.4
Q ss_pred CEEEEEEeCCCCCCeEEEEEeccCCCCCCcEEEeCCccEEEEeecCCCCCCceEEEEEEEEEC-CcCC
Q psy8623 1 MLRVSASDPDCGVNAMVNYTLGESPSRTNHFYMKSVSGEICIAQDLDFESRSSYEFPVVATDR-GKET 67 (68)
Q Consensus 1 v~~v~A~D~D~g~~~~i~y~i~~~~~~~~~f~id~~tG~i~~~~~ld~e~~~~~~~~v~a~d~-g~~~ 67 (68)
|+++.|.|+|.+.|+.+.|+|..+.. ..+|.|++.+|.|.+.+.||||....|++.|.|+|. |.|+
T Consensus 15 v~~v~a~D~D~~~n~~i~y~i~~~~~-~~~F~I~~~tg~i~~~~~LD~E~~~~y~l~v~a~D~~~~~~ 81 (93)
T PF00028_consen 15 VGQVTATDPDSGPNSQITYSILGGNP-DGLFSIDPNTGEISLKKPLDRETQSSYQLTVRATDSGGSPP 81 (93)
T ss_dssp EEEEEEEESSTSTTSSEEEEEEETTS-TTSEEEETTTTEEEESSSSCTTTTSEEEEEEEEEETTTSSE
T ss_pred EEEEEEEeCCCCCCceEEEEEecCcc-cCceEEeeeeeccccceecCcccCCEEEEEEEEEECCCCCC
Confidence 46899999999999999999998875 389999999999999999999999999999999999 6664
No 2
>smart00112 CA Cadherin repeats. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium.
Probab=99.68 E-value=5.6e-16 Score=76.05 Aligned_cols=60 Identities=32% Similarity=0.488 Sum_probs=54.2
Q ss_pred EeCCCCCCeEEEEEeccCCCCCCcEEEeCCccEEEEeecCCCCCCceEEEEEEEEECCcCC
Q psy8623 7 SDPDCGVNAMVNYTLGESPSRTNHFYMKSVSGEICIAQDLDFESRSSYEFPVVATDRGKET 67 (68)
Q Consensus 7 ~D~D~g~~~~i~y~i~~~~~~~~~f~id~~tG~i~~~~~ld~e~~~~~~~~v~a~d~g~~~ 67 (68)
+|+|.|.|+.++|+|..+... .+|.|++.+|.|.+.+.||+|....|.+.|.|.|.|.|+
T Consensus 1 ~D~D~g~n~~i~Y~i~~~~~~-~~F~i~~~tg~i~~~~~LD~e~~~~y~l~v~a~D~~~~~ 60 (79)
T smart00112 1 TDADSGENGKVTYSILSGNED-GLFSIDPETGEITTTKPLDREEQPEYTLTVEATDGGGPP 60 (79)
T ss_pred CCCCCCcCcEEEEEEecCCCC-CEEEEeCCccEEEeCCccCeeCCCeEEEEEEEEECCCCC
Confidence 489999999999999877654 799999999999888999999999999999999998764
No 3
>cd00031 CA Cadherin repeat domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion; these domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium; play a role in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherins, desmocollin, and desmoglein; exist as monomers or dimers (hetero- and homo-); two copies of the repeat are present here
Probab=99.57 E-value=5.7e-14 Score=78.24 Aligned_cols=66 Identities=35% Similarity=0.500 Sum_probs=59.3
Q ss_pred CEEEEEEeCCCCCCeEEEEEeccCCCCCCcEEEeCCccEEEEeecCCCCCCceEEEEEEEEECCcCC
Q psy8623 1 MLRVSASDPDCGVNAMVNYTLGESPSRTNHFYMKSVSGEICIAQDLDFESRSSYEFPVVATDRGKET 67 (68)
Q Consensus 1 v~~v~A~D~D~g~~~~i~y~i~~~~~~~~~f~id~~tG~i~~~~~ld~e~~~~~~~~v~a~d~g~~~ 67 (68)
|+++.|.|+|.+.++.+.|+|...... .+|.|++.+|.|++.+.||||....|.+.|.|.|.|.|+
T Consensus 16 v~~~~a~D~D~~~~~~~~y~i~~~~~~-~~F~i~~~tG~l~~~~~lD~e~~~~~~l~v~a~D~g~~~ 81 (199)
T cd00031 16 VGTVSATDPDSGENGRVTYSILGGNED-GLFSIDPNTGVITTTKPLDREEQSEYTLTVVASDGGGPP 81 (199)
T ss_pred EEEEEEECCCCCCCceEEEEEeCCCCc-ccEEEeCCCCEEEECCCCCCcCCceEEEEEEEEECCcCc
Confidence 467889999998889999999887643 689999999999999999999999999999999987764
No 4
>cd00031 CA Cadherin repeat domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion; these domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium; play a role in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherins, desmocollin, and desmoglein; exist as monomers or dimers (hetero- and homo-); two copies of the repeat are present here
Probab=99.47 E-value=1.3e-12 Score=72.75 Aligned_cols=65 Identities=38% Similarity=0.544 Sum_probs=58.5
Q ss_pred CEEEEEEeCCCCCCeEEEEEeccCCCCCCcEEEeCCccEEEEeecCCCCCCceEEEEEEEEECCcC
Q psy8623 1 MLRVSASDPDCGVNAMVNYTLGESPSRTNHFYMKSVSGEICIAQDLDFESRSSYEFPVVATDRGKE 66 (68)
Q Consensus 1 v~~v~A~D~D~g~~~~i~y~i~~~~~~~~~f~id~~tG~i~~~~~ld~e~~~~~~~~v~a~d~g~~ 66 (68)
++++.|+|+|.+.++.++|++..... ..+|.|++.+|.|.+.+.||+|....|++.|.|.|.+.|
T Consensus 121 i~~~~a~D~D~~~~~~~~y~l~~~~~-~~~f~i~~~~G~i~~~~~ld~e~~~~~~l~v~a~D~~~~ 185 (199)
T cd00031 121 VGTVTATDADSGENAKLTYSILSGND-KELFSIDPNTGIITLAKPLDREEKSSYELTVVATDGGGP 185 (199)
T ss_pred EEEEEEEcCCCCCCccEEEEEeCCCC-CCEEEEeCCceEEEeCCccCCccCceEEEEEEEEECCCC
Confidence 46889999999989999999988764 378999999999999999999999999999999998753
No 5
>KOG4289|consensus
Probab=99.46 E-value=1.6e-13 Score=92.89 Aligned_cols=67 Identities=36% Similarity=0.474 Sum_probs=62.0
Q ss_pred CEEEEEEeCCCCCCeEEEEEeccCCCCCCcEEEeCCccEEEEeecCCCCCCceEEEEEEEEECCcCCC
Q psy8623 1 MLRVSASDPDCGVNAMVNYTLGESPSRTNHFYMKSVSGEICIAQDLDFESRSSYEFPVVATDRGKETQ 68 (68)
Q Consensus 1 v~~v~A~D~D~g~~~~i~y~i~~~~~~~~~f~id~~tG~i~~~~~ld~e~~~~~~~~v~a~d~g~~~~ 68 (68)
|++|.|+|.|++.|+.|+|++..+... ..|.|++++|.|.+..++|||....|++.|.|.|+|.|++
T Consensus 288 vLtvrAtD~Dsp~Nani~Yrl~eg~~~-~~f~in~rSGvI~T~a~lDRE~~~~y~L~VeAsDqG~~pg 354 (2531)
T KOG4289|consen 288 VLTVRATDGDSPPNANIRYRLLEGNAK-NVFEINPRSGVISTRAPLDREELESYQLDVEASDQGRPPG 354 (2531)
T ss_pred EEEEEeccCCCCCCCceEEEecCCCcc-ceeEEcCccceeeccCccCHHhhhheEEEEEeccCCCCCC
Confidence 578999999999999999999988543 7899999999999999999999999999999999998764
No 6
>KOG4289|consensus
Probab=99.45 E-value=5.5e-13 Score=90.46 Aligned_cols=65 Identities=37% Similarity=0.720 Sum_probs=60.5
Q ss_pred CEEEEEEeCCCCCCeEEEEEeccCCCCCCcEEEeCCccEEEEeecCCCCCCceEEEEEEEEECCcCC
Q psy8623 1 MLRVSASDPDCGVNAMVNYTLGESPSRTNHFYMKSVSGEICIAQDLDFESRSSYEFPVVATDRGKET 67 (68)
Q Consensus 1 v~~v~A~D~D~g~~~~i~y~i~~~~~~~~~f~id~~tG~i~~~~~ld~e~~~~~~~~v~a~d~g~~~ 67 (68)
|++|.|+|.|.|.|+.|.|+|.+++.. +.|.||..||+|.+..+||+|.. +|.+.|+|.|+|-|+
T Consensus 393 vlrV~AtDrD~g~Ng~VHYsi~Sgn~~-G~f~id~~tGel~vv~plD~e~~-~ytl~IrAqDggrPp 457 (2531)
T KOG4289|consen 393 VLRVTATDRDKGTNGKVHYSIASGNGR-GQFYIDSLTGELDVVEPLDFENS-EYTLRIRAQDGGRPP 457 (2531)
T ss_pred EEEEEecccCCCcCceEEEEeeccCcc-ccEEEecccceEEEeccccccCC-eeEEEEEcccCCCCC
Confidence 578999999999999999999988764 88999999999999999999988 999999999999886
No 7
>KOG1219|consensus
Probab=99.44 E-value=5.6e-13 Score=92.97 Aligned_cols=64 Identities=31% Similarity=0.491 Sum_probs=59.7
Q ss_pred CEEEEEEeCCCCCCeEEEEEeccCCCCCCcEEEeCCccEEEEeecCCCCCCceEEEEEEEEECCcCC
Q psy8623 1 MLRVSASDPDCGVNAMVNYTLGESPSRTNHFYMKSVSGEICIAQDLDFESRSSYEFPVVATDRGKET 67 (68)
Q Consensus 1 v~~v~A~D~D~g~~~~i~y~i~~~~~~~~~f~id~~tG~i~~~~~ld~e~~~~~~~~v~a~d~g~~~ 67 (68)
|.+++|.|.|.|.||+|+|++.+.. .+|.|++.||.|++...||+|+...|.+.+.|+|.|.+.
T Consensus 2699 V~qf~AsD~Ds~~nGqirysl~~~v---~yF~In~etGwlTt~~eld~ek~d~y~lkv~AtDhG~~s 2762 (4289)
T KOG1219|consen 2699 VIQFHASDMDSGNNGQIRYSLTSPV---PYFAINPETGWLTTLFELDLEKQDLYSLKVVATDHGVPS 2762 (4289)
T ss_pred EEEEEeeccCCCCCceEEEEEcCCc---ceEEEcCCCCeeeehhhhccccCCceEEEEEEecCCccc
Confidence 5689999999999999999998774 489999999999999999999999999999999999874
No 8
>KOG1219|consensus
Probab=99.29 E-value=1.7e-11 Score=86.03 Aligned_cols=66 Identities=30% Similarity=0.488 Sum_probs=60.7
Q ss_pred CEEEEEEeCCCCCCeEEEEEeccCCCCCCcEEEeCCccEEEEeecCCCCCCceEEEEEEEEECCcCC
Q psy8623 1 MLRVSASDPDCGVNAMVNYTLGESPSRTNHFYMKSVSGEICIAQDLDFESRSSYEFPVVATDRGKET 67 (68)
Q Consensus 1 v~~v~A~D~D~g~~~~i~y~i~~~~~~~~~f~id~~tG~i~~~~~ld~e~~~~~~~~v~a~d~g~~~ 67 (68)
|+.|.|.|.|.|..+.++|+|..+.. -+.|+|++.+|.|++.+.||||+...|-+++.|.|.|.++
T Consensus 972 vi~i~A~dedsgldg~l~Y~I~~gdg-~g~FsId~~tG~irTl~~lDrE~ks~YwltveA~D~gt~~ 1037 (4289)
T KOG1219|consen 972 VIRIQARDEDSGLDGELSYKIRTGDG-DGIFSIDSTTGSIRTLKALDREKKSSYWLTVEAKDLGTVP 1037 (4289)
T ss_pred EEEEEEecCCCCccceEEEEEEcCCc-ceeEEecCCcceEeechhhchhhcceEEEEEEEEecCCCc
Confidence 46899999999999999999987753 3789999999999999999999999999999999999876
No 9
>KOG1834|consensus
Probab=98.28 E-value=5.1e-06 Score=54.33 Aligned_cols=61 Identities=30% Similarity=0.533 Sum_probs=52.0
Q ss_pred CEEEEEEeCCCC-CCeEE-EEEeccCCCCCCcEEEeCCccEEEEeecCCCCCCceEEEEEEEEECCc
Q psy8623 1 MLRVSASDPDCG-VNAMV-NYTLGESPSRTNHFYMKSVSGEICIAQDLDFESRSSYEFPVVATDRGK 65 (68)
Q Consensus 1 v~~v~A~D~D~g-~~~~i-~y~i~~~~~~~~~f~id~~tG~i~~~~~ld~e~~~~~~~~v~a~d~g~ 65 (68)
|+.|.|.|.|.+ .++.| .|.|.... -.|.|| +.|.|+....|.+.+..+|.++|+|.|.|.
T Consensus 167 il~veAiD~DCspq~sqIC~YEI~t~d---~PFaId-n~G~irnTekLny~ke~~Y~ltVtAyDCg~ 229 (952)
T KOG1834|consen 167 ILRVEAIDKDCSPQYSQICEYEITTPD---VPFAID-NDGNIRNTEKLNYTKEHQYKLTVTAYDCGK 229 (952)
T ss_pred eEEEEeecCCCCCcccceeEEEecCCC---CceEEc-CCCccccccccccccceeEEEEEEEEeccc
Confidence 578999999986 56677 68888754 349998 569999999999999999999999999874
No 10
>PF08266 Cadherin_2: Cadherin-like; InterPro: IPR013164 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism and that modulate a wide variety of processes, including cell polarisation and migration. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats and the fifth of which contains 4 conserved cysteines; a single transmembrane domain; and a C-terminal cytoplasmic domain. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats.
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and are essential for cadherin adhesive function and for protection against protease digestion. This entry represents a cadherin domain that is usually found at the N terminus of cadherin proteins.; PDB: 1WUZ_A 1WYJ_A.
Probab=97.10 E-value=0.0023 Score=31.92 Aligned_cols=31 Identities=13% Similarity=0.389 Sum_probs=20.5
Q ss_pred EEeccCCCCCCcEEEeCCccEEEEeecCCCCC
Q psy8623 19 YTLGESPSRTNHFYMKSVSGEICIAQDLDFES 50 (68)
Q Consensus 19 y~i~~~~~~~~~f~id~~tG~i~~~~~ld~e~ 50 (68)
+.+.... ...+|.++..+|.|.+...+|||.
T Consensus 36 ~ri~s~~-~~~~~~v~~~tG~L~v~~rIDRE~ 66 (84)
T PF08266_consen 36 FRIVSEG-NSQYFRVNEKTGDLFVSERIDREE 66 (84)
T ss_dssp BEEE-SS-SS-SEEE-TTTSEEEESS--SCCC
T ss_pred eEEeecC-CcceeEecCCceeEEeCCccCHHH
Confidence 4444433 237999999999999999999997
No 11
>KOG1834|consensus
Probab=95.99 E-value=0.07 Score=35.92 Aligned_cols=62 Identities=23% Similarity=0.368 Sum_probs=42.0
Q ss_pred EEEEeCCCC--CCeEE-EEEeccCCCCCCcEEEeCCcc--EEEEeecCCCCCCceEEEEEEEEECCc
Q psy8623 4 VSASDPDCG--VNAMV-NYTLGESPSRTNHFYMKSVSG--EICIAQDLDFESRSSYEFPVVATDRGK 65 (68)
Q Consensus 4 v~A~D~D~g--~~~~i-~y~i~~~~~~~~~f~id~~tG--~i~~~~~ld~e~~~~~~~~v~a~d~g~ 65 (68)
+.|-|.|.. -.+.| -|+|-+.+-+-..-.+|..|| .|+.+.+||=|-+.+|+++|+|.|.|.
T Consensus 54 l~aLdkdaplr~ageiC~fklhgq~vPFdavVvdK~TGegvlRaK~~lDCelqkeytf~iQAydCg~ 120 (952)
T KOG1834|consen 54 LAALDKDAPLRYAGEICGFKLHGQPVPFDAVVVDKYTGEGVLRAKEPLDCELQKEYTFTIQAYDCGN 120 (952)
T ss_pred eeeecCCCCcccccccceeEecCCCCCceEEEEeccCCceEEeecCcccccccccceEEEEEEecCC
Confidence 345566653 22344 466766543323345677887 567778899999999999999999874
No 12
>smart00736 CADG Dystroglycan-type cadherin-like domains. Cadherin-homologous domains present in metazoan dystroglycans and alpha/epsilon sarcoglycans, yeast Axl2p and in a very large protein from magnetotactic bacteria. Likely to bind calcium ions.
Probab=95.19 E-value=0.17 Score=25.46 Aligned_cols=55 Identities=20% Similarity=0.239 Sum_probs=37.3
Q ss_pred EEeCCCCCCeEEEEEeccCC--CCCCcEEEeCCccEEEEeecCCCCCCceEEEEEEEEECCc
Q psy8623 6 ASDPDCGVNAMVNYTLGESP--SRTNHFYMKSVSGEICIAQDLDFESRSSYEFPVVATDRGK 65 (68)
Q Consensus 6 A~D~D~g~~~~i~y~i~~~~--~~~~~f~id~~tG~i~~~~~ld~e~~~~~~~~v~a~d~g~ 65 (68)
..|+| ...+.|++.... ....|+..|+.++.+.-. +.. +....|.+.|.|+|..+
T Consensus 24 F~d~d---~~~lty~~~~~~~~~lP~Wl~fd~~~~~~~Gt-P~~-~~~g~~~i~v~a~D~~g 80 (97)
T smart00736 24 FTDAD---GDTLTYSATLSDGSALPSWLSFDSDTGTLSGT-PTN-SDVGSLSLKVTATDSSG 80 (97)
T ss_pred eECCC---CCeEEEEEEeCCCCCCCCeEEEeCCCCEEEEE-CCC-CCCcEEEEEEEEEECCC
Confidence 35666 346888875332 223799999999887653 333 23466999999999764
No 13
>PF05345 He_PIG: Putative Ig domain; InterPro: IPR008009 This alignment represents the conserved core region of a ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to Hyalin (IPR003410 from INTERPRO) and the PKD domain (IPR000601 from INTERPRO) suggest an Ig-like fold so this family may be similar in function to the (IPR003791 from INTERPRO) and (IPR003790 from INTERPRO) protein families.
Probab=94.70 E-value=0.16 Score=22.68 Aligned_cols=35 Identities=23% Similarity=0.335 Sum_probs=26.3
Q ss_pred CcEEEeCCccEEEEeecCCCC-CCceEEEEEEEEECCc
Q psy8623 29 NHFYMKSVSGEICIAQDLDFE-SRSSYEFPVVATDRGK 65 (68)
Q Consensus 29 ~~f~id~~tG~i~~~~~ld~e-~~~~~~~~v~a~d~g~ 65 (68)
.++.+|+.+|.|.-.- +.. ....|.+.|.|+|..+
T Consensus 14 ~gLs~d~~tG~isGtp--~~~~~~G~y~~~vtatd~~G 49 (49)
T PF05345_consen 14 SGLSLDPSTGTISGTP--TSSVQPGTYTFTVTATDGSG 49 (49)
T ss_pred CcEEEeCCCCEEEeec--CCCccccEEEEEEEEEcCCC
Confidence 7899999999986552 223 2358999999999753
No 14
>PF08758 Cadherin_pro: Cadherin prodomain like; InterPro: IPR014868 Cadherins are a group of proteins that mediate calcium-dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This entry corresponds to the folded region of the prosequence, which is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions.; GO: 0007155 cell adhesion, 0016021 integral to membrane; PDB: 1OP4_A.
Probab=87.60 E-value=2.4 Score=21.37 Aligned_cols=35 Identities=20% Similarity=0.295 Sum_probs=23.6
Q ss_pred CcEEEeCCccEEEEeecCCCCCCceEEEEEEEEECCc
Q psy8623 29 NHFYMKSVSGEICIAQDLDFESRSSYEFPVVATDRGK 65 (68)
Q Consensus 29 ~~f~id~~tG~i~~~~~ld~e~~~~~~~~v~a~d~g~ 65 (68)
..|.|.. .|.|++.+++.... ..-.+.|.|.|..+
T Consensus 46 pdF~V~~-DGsVy~~r~v~l~~-~~~~F~V~a~D~~~ 80 (90)
T PF08758_consen 46 PDFRVLE-DGSVYAKRPVQLSS-EQRSFTVHAWDSQT 80 (90)
T ss_dssp SEEEEET-TTEEEEES--S-SS-S-EEEEEEEEETTT
T ss_pred CCEEEcC-CCeEEEeeeEecCC-CceEEEEEEECCCC
Confidence 3599975 59999999987643 33478999998754
No 15
>PF03413 PepSY: Peptidase propeptide and YPEB domain. This Prosite motif covers only the active site. This is family M4 in the peptidase classification.; InterPro: IPR005075 This signature, PepSY, is found in the propeptide of members of the MEROPS peptidase family M4 (clan MA(E)), which contains the thermostable thermolysins (3.4.24.27 from EC), and related thermolabile neutral proteases (bacillolysins) (3.4.24.28 from EC) from various species of Bacillus. It is also found in many non-peptidase proteins, including Bacillus subtilis YpeB protein - a regulator of SleB spore cortex lytic enzyme - and a large number of eubacterial and archaeal cell wall-associated and secreted proteins which are mostly annotated as 'hypothetical protein'. Many extracellular bacterial proteases are produced as proenzymes. The propeptides usually have a dual function, i.e. they function as an intramolecular chaperone required for the folding of the polypeptide and as an inhibitor preventing premature activation of the enzyme. Analysis of the propeptide region of the M4 family of peptidases reveals two regions of conservation, the PepSY domain and a second domain, proximate to the N terminus, the FTP domain (IPR011096 from INTERPRO), which is also found in isolation in the propeptide of eukaryotic peptidases belonging to MEROPS peptidase family M36. In propeptide domain swapping experiments, swapping the propeptide domain of PA protease with that of vibrolysin (both propeptides contain the FTP and PepSY domains) allows the PA protease domain to fold correctly and inhibits the C-terminal autoprocessing activity. However, swapping the propeptide of PA protease for the thermolysin propeptide does not facilitate the correct folding or the processing of the chimaeric protein into an active peptidase.
Mutational analysis of the Pseudomonas aeruginosa elastase gene revealed two mutations in the propeptide which resulted in the loss of inhibitory activity but not chaperone activity: A-15V and T-153I (where +1 is defined as the first residue of the mature peptidase). Both mutations resulted in peptidase activity, the T-153I mutation being much less effective than the A-15V mutation in activating peptidase activity. The T-153I mutation lies N-terminal to the FTP domain while the A-15V mutation is C-terminal to the PepSY domain. Given the diverse range of other proteins in which both domains occur in isolation, the exact function of each is still unclear, though it has been proposed that the PepSY domain primarily has inhibitory activity and acts in conjunction with the FTP domain in chaperone activity.; GO: 0008237 metallopeptidase activity, 0008270 zinc ion binding, 0006508 proteolysis, 0005576 extracellular region; PDB: 2GU3_A 3NQZ_A 3NQY_A 2KGY_A.
Probab=72.65 E-value=7.5 Score=17.32 Aligned_cols=29 Identities=21% Similarity=0.353 Sum_probs=14.9
Q ss_pred CeEEEEEeccCC---CCCC--cEEEeCCccEEEE
Q psy8623 14 NAMVNYTLGESP---SRTN--HFYMKSVSGEICI 42 (68)
Q Consensus 14 ~~~i~y~i~~~~---~~~~--~f~id~~tG~i~~ 42 (68)
++...|.+.-.. +... .+.||..||+|.-
T Consensus 29 ~~~~~Y~v~~~~~~~~~~~~~~v~VDa~tG~Il~ 62 (64)
T PF03413_consen 29 NGRLVYEVEVVSDDDPDGGEYEVYVDAYTGEILS 62 (64)
T ss_dssp TCEEEEEEEEEBTTSTTTEEEEEEEETTT--EEE
T ss_pred CCcEEEEEEEEEEecCCCCEEEEEEECCCCeEEE
Confidence 455667664221 2123 3449999999853
No 16
>PF07495 Y_Y_Y: Y_Y_Y domain; InterPro: IPR011123 This region is mostly found at the end of the beta propellers (IPR011110 from INTERPRO) in a family of two component regulators. However they are also found tandemly repeated in Q891H4 from SWISSPROT without other signal conduction domains being present. It is named after the conserved tyrosines found in the alignment. The exact function is not known.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=70.06 E-value=9.3 Score=17.31 Aligned_cols=47 Identities=21% Similarity=0.181 Sum_probs=26.4
Q ss_pred CCeEEEEEeccCCCCCCcEEEeCCccEEEEeecCCCCCCceEEEEEEEEECCc
Q psy8623 13 VNAMVNYTLGESPSRTNHFYMKSVSGEICIAQDLDFESRSSYEFPVVATDRGK 65 (68)
Q Consensus 13 ~~~~i~y~i~~~~~~~~~f~id~~tG~i~~~~~ld~e~~~~~~~~v~a~d~g~ 65 (68)
.+-..+|.+.+-.. .|..+...+-.+.. . ......|+|.|.|.|..+
T Consensus 6 ~~~~Y~Y~l~g~d~--~W~~~~~~~~~~~~-~---~L~~G~Y~l~V~a~~~~~ 52 (66)
T PF07495_consen 6 ENIRYRYRLEGFDD--EWITLGSYSNSISY-T---NLPPGKYTLEVRAKDNNG 52 (66)
T ss_dssp TTEEEEEEEETTES--SEEEESSTS-EEEE-E---S--SEEEEEEEEEEETTS
T ss_pred CceEEEEEEECCCC--eEEECCCCcEEEEE-E---eCCCEEEEEEEEEECCCC
Confidence 34456677765543 56666443323221 2 235678999999998654
No 17
>PF14157 YmzC: YmzC-like protein; PDB: 3KVP_E.
Probab=64.94 E-value=13 Score=17.68 Aligned_cols=27 Identities=11% Similarity=0.303 Sum_probs=16.9
Q ss_pred EEeccCCCCCCcEEEeCCccEEEEeec
Q psy8623 19 YTLGESPSRTNHFYMKSVSGEICIAQD 45 (68)
Q Consensus 19 y~i~~~~~~~~~f~id~~tG~i~~~~~ 45 (68)
|.+.........|.-|+.+++|.+.+.
T Consensus 32 Fav~~e~~~iKIfkyd~~tNei~L~KE 58 (63)
T PF14157_consen 32 FAVVDEDGQIKIFKYDEDTNEITLKKE 58 (63)
T ss_dssp EEEE-ETTEEEEEEEETTTTEEEEEEE
T ss_pred EEEEecCCeEEEEEeCCCCCeEEEEEe
Confidence 444433333366777899998887764
No 18
>cd02899 PLAT_SR Scavenger receptor protein. A subfamily of PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight-stranded beta-barrel. The domain can be found in various domain architectures, in the case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates. This subfamily contains Toxoplasma gondii Scavenger protein TgSR1.
Probab=62.67 E-value=21 Score=18.66 Aligned_cols=19 Identities=32% Similarity=0.438 Sum_probs=12.8
Q ss_pred EeCCCCCCeEEEEEeccCC
Q psy8623 7 SDPDCGVNAMVNYTLGESP 25 (68)
Q Consensus 7 ~D~D~g~~~~i~y~i~~~~ 25 (68)
....+|..+.|...|.+..
T Consensus 11 ~~~~AGT~~~V~i~L~G~~ 29 (109)
T cd02899 11 KDKEAGTNGTIEITLLGSS 29 (109)
T ss_pred CCCCCCccceEEEEEEECC
Confidence 4556677788877776544
No 19
>PF12971 NAGLU_N: Alpha-N-acetylglucosaminidase (NAGLU) N-terminal domain; InterPro: IPR024240 Alpha-N-acetylglucosaminidase is a lysosomal enzyme required for the stepwise degradation of heparan sulphate. Mutations in the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB; or Sanfilippo syndrome type B), characterised by neurological dysfunction but relatively mild somatic manifestations. The structure shows that the enzyme is composed of three domains. This entry represents the N-terminal domain of Alpha-N-acetylglucosaminidase, which has an alpha-beta fold.; PDB: 4A4A_A 2VC9_A 2VCC_A 2VCB_A 2VCA_A.
Probab=48.89 E-value=34 Score=16.89 Aligned_cols=29 Identities=24% Similarity=0.323 Sum_probs=17.7
Q ss_pred EEEEEeccCCCCCCcEEEeC-CccEEEEee
Q psy8623 16 MVNYTLGESPSRTNHFYMKS-VSGEICIAQ 44 (68)
Q Consensus 16 ~i~y~i~~~~~~~~~f~id~-~tG~i~~~~ 44 (68)
.+.+++.......+.|.|.. ..|.|.+..
T Consensus 19 ~f~~~~~~~~~~~d~F~l~~~~~gki~I~G 48 (86)
T PF12971_consen 19 QFTFELIPSSNGKDVFELSSADNGKIVIRG 48 (86)
T ss_dssp GEEEEE---BTTBEEEEEEE-SSS-EEEEE
T ss_pred eEEEEEecCCCCCCEEEEEeCCCCeEEEEe
Confidence 36777766553447899987 888887753
No 20
>COG3212 Predicted membrane protein [Function unknown]
Probab=45.18 E-value=41 Score=18.48 Aligned_cols=36 Identities=11% Similarity=0.186 Sum_probs=22.4
Q ss_pred EEEeCCCCCCeEEEEEec--cCCCCCCcEEEeCCccEEE
Q psy8623 5 SASDPDCGVNAMVNYTLG--ESPSRTNHFYMKSVSGEIC 41 (68)
Q Consensus 5 ~A~D~D~g~~~~i~y~i~--~~~~~~~~f~id~~tG~i~ 41 (68)
...+.+. .+++..|.+. .+.....-|.||.+||.|.
T Consensus 101 ~dieLe~-~~g~~vYevei~~~d~~e~ev~iDA~TG~Il 138 (144)
T COG3212 101 DDIELEE-DNGRLVYEVEIVKDDGQEYEVEIDAKTGKIL 138 (144)
T ss_pred eEEEEec-cCCEEEEEEEEEeCCCcEEEEEEecCCCCcc
Confidence 3334443 4678888763 3223335689999999874
No 21
>PF12245 Big_3_2: Bacterial Ig-like domain (group 3); InterPro: IPR022038 This family of proteins is found in bacteria. They have two conserved sequence motifs: AGN and GMT.
Probab=44.17 E-value=34 Score=15.61 Aligned_cols=14 Identities=36% Similarity=0.627 Sum_probs=11.0
Q ss_pred CceEEEEEEEEECC
Q psy8623 51 RSSYEFPVVATDRG 64 (68)
Q Consensus 51 ~~~~~~~v~a~d~g 64 (68)
...|.+.+.+.|..
T Consensus 22 dg~yt~~v~a~D~A 35 (60)
T PF12245_consen 22 DGEYTLTVTATDKA 35 (60)
T ss_pred CccEEEEEEEEECC
Confidence 56788888888864
No 22
>PF00635 Motile_Sperm: MSP (Major sperm protein) domain; InterPro: IPR000535 Major sperm proteins (MSP) are central components in molecular interactions underlying sperm motility in Caenorhabditis elegans, whose sperm employ an amoeba-like crawling motion using a MSP-containing lamellipod, rather than the flagellar-based swimming motion associated with other sperm. These proteins oligomerise to form an extensive filament system that extends from sperm villipoda, along the leading edge of the pseudopod. About 30 MSP isoforms may exist in C. elegans. MSPs form a fibrous network, whereby MSP dimers form helical subfilaments that coil around one another to produce filaments, which in turn form supercoils to produce bundles. The crystal structure of MSP from C. elegans reveals an immunoglobulin (Ig)-like seven-stranded beta sandwich fold.; GO: 0005198 structural molecule activity; PDB: 1MSP_A 3MSP_B 2BVU_B 2MSP_C 1Z9O_F 1Z9L_A 3IKK_A 1WIC_A 2CRI_A 2RR3_A ....
Probab=42.67 E-value=45 Score=16.53 Aligned_cols=25 Identities=16% Similarity=0.373 Sum_probs=16.9
Q ss_pred CeEEEEEeccCCCCCCcEEEeCCccEE
Q psy8623 14 NAMVNYTLGESPSRTNHFYMKSVSGEI 40 (68)
Q Consensus 14 ~~~i~y~i~~~~~~~~~f~id~~tG~i 40 (68)
+..+.|+|....+ ..|.+.+..|.|
T Consensus 31 ~~~i~fKiktt~~--~~y~v~P~~G~i 55 (109)
T PF00635_consen 31 DKPIAFKIKTTNP--NRYRVKPSYGII 55 (109)
T ss_dssp SSEEEEEEEES-T--TTEEEESSEEEE
T ss_pred CCcEEEEEEcCCC--ceEEecCCCEEE
Confidence 3468888866554 568888888754
No 23
>COG5448 Uncharacterized conserved protein [Function unknown]
Probab=42.46 E-value=65 Score=18.31 Aligned_cols=37 Identities=14% Similarity=0.158 Sum_probs=23.1
Q ss_pred CCeEEEEEeccCCCCCCcEEEeCCccEEEEeecCCCC
Q psy8623 13 VNAMVNYTLGESPSRTNHFYMKSVSGEICIAQDLDFE 49 (68)
Q Consensus 13 ~~~~i~y~i~~~~~~~~~f~id~~tG~i~~~~~ld~e 49 (68)
.++.+...+.+......-|.+|..||.+.+...+...
T Consensus 106 V~gSV~v~V~gVk~~~~af~VD~~tGiV~l~~~p~~~ 142 (184)
T COG5448 106 VNGSVMVYVNGVKTAPGAFIVDYNTGIVTLPSAPPQD 142 (184)
T ss_pred cCCeEEEEEccEEcCCcceEeeccCCeEEeCCCCCCC
Confidence 3445555554433223679999999999877554443
No 24
>PF13754 Big_3_4: Bacterial Ig-like domain (group 3)
Probab=41.15 E-value=37 Score=15.11 Aligned_cols=15 Identities=33% Similarity=0.543 Sum_probs=12.3
Q ss_pred CCceEEEEEEEEECC
Q psy8623 50 SRSSYEFPVVATDRG 64 (68)
Q Consensus 50 ~~~~~~~~v~a~d~g 64 (68)
....|.+.+.|.|..
T Consensus 22 ~dG~y~itv~a~D~A 36 (54)
T PF13754_consen 22 ADGTYTITVTATDAA 36 (54)
T ss_pred CCccEEEEEEEEeCC
Confidence 467899999999964
No 25
>TIGR01965 VCBS_repeat VCBS repeat. This domain of about 100 residues is found in multiple (up to 35) copies in long proteins from several species of Vibrio, Colwellia, Bradyrhizobium, and Shewanella (hence the name VCBS) and in smaller copy numbers in proteins from several other bacteria. The large protein size and repeat copy numbers, species distribution, and suggested activities of several member proteins suggest a role for this domain in adhesion.
Probab=31.04 E-value=84 Score=16.21 Aligned_cols=35 Identities=17% Similarity=0.309 Sum_probs=21.3
Q ss_pred EEEEEEeCCCCCCeEEEEEeccCCCCCCcEEEeCCccE
Q psy8623 2 LRVSASDPDCGVNAMVNYTLGESPSRTNHFYMKSVSGE 39 (68)
Q Consensus 2 ~~v~A~D~D~g~~~~i~y~i~~~~~~~~~f~id~~tG~ 39 (68)
+++.++|+|.+..- .++........+.|.|+. .|.
T Consensus 2 G~Lt~sD~D~gd~~--~~s~~~~~g~yGtlti~~-~G~ 36 (99)
T TIGR01965 2 GQLTISDADAGQAH--FIAQTDAAGQYGTFSIDA-DGQ 36 (99)
T ss_pred CceEEeCCCCCCce--EEecccccCCcEEEEECC-CCc
Confidence 47889999987543 344432232236688876 563
No 26
>PF13860 FlgD_ig: FlgD Ig-like domain; PDB: 3C12_A 3OSV_A.
Probab=30.96 E-value=71 Score=15.33 Aligned_cols=14 Identities=50% Similarity=0.809 Sum_probs=10.9
Q ss_pred CceEEEEEEEEECC
Q psy8623 51 RSSYEFPVVATDRG 64 (68)
Q Consensus 51 ~~~~~~~v~a~d~g 64 (68)
...|.+.|.|.+.|
T Consensus 68 ~G~Y~~~v~a~~~g 81 (81)
T PF13860_consen 68 DGTYTFRVTATDGG 81 (81)
T ss_dssp SEEEEEEEEEEET-
T ss_pred CCCEEEEEEEEeCC
Confidence 46799999999865
No 27
>PRK13744 conjugal transfer protein TrbG; Provisional
Probab=30.11 E-value=45 Score=15.92 Aligned_cols=20 Identities=40% Similarity=0.662 Sum_probs=14.6
Q ss_pred EEEEEEeCCCCCCeEEEEEe
Q psy8623 2 LRVSASDPDCGVNAMVNYTL 21 (68)
Q Consensus 2 ~~v~A~D~D~g~~~~i~y~i 21 (68)
..++|..||.|...-+.|-.
T Consensus 28 cevsapepdaggkrivayvy 47 (83)
T PRK13744 28 CEVSAPEPDAGGKRIVAYVY 47 (83)
T ss_pred ccccCCCCCCCCcEEEEEEe
Confidence 45788899998666666654
No 28
>COG2706 3-carboxymuconate cyclase [Carbohydrate transport and metabolism]
Probab=29.72 E-value=1.6e+02 Score=18.91 Aligned_cols=15 Identities=13% Similarity=0.297 Sum_probs=9.5
Q ss_pred CcEEEeCCccEEEEe
Q psy8623 29 NHFYMKSVSGEICIA 43 (68)
Q Consensus 29 ~~f~id~~tG~i~~~ 43 (68)
..|.|++.+|.|.+.
T Consensus 269 ~~f~V~~~~g~L~~~ 283 (346)
T COG2706 269 AVFSVDPDGGKLELV 283 (346)
T ss_pred EEEEEcCCCCEEEEE
Confidence 456677776666554
No 29
>COG1770 PtrB Protease II [Amino acid transport and metabolism]
Probab=29.55 E-value=1.9e+02 Score=20.36 Aligned_cols=30 Identities=13% Similarity=0.264 Sum_probs=23.1
Q ss_pred eEEEEEeccCCCCCCcEEEeCCccEEEEee
Q psy8623 15 AMVNYTLGESPSRTNHFYMKSVSGEICIAQ 44 (68)
Q Consensus 15 ~~i~y~i~~~~~~~~~f~id~~tG~i~~~~ 44 (68)
..++|+..+-..+..+|..|-.+|+.++.+
T Consensus 375 ~~lR~~ysS~ttP~~~~~~dm~t~er~~Lk 404 (682)
T COG1770 375 DRLRYSYSSMTTPATLFDYDMATGERTLLK 404 (682)
T ss_pred ccEEEEeecccccceeEEeeccCCcEEEEE
Confidence 467787777665568999999999887753
No 30
>COG3470 Tpd Uncharacterized protein probably involved in high-affinity Fe2+ transport [Inorganic ion transport and metabolism]
Probab=28.31 E-value=70 Score=18.20 Aligned_cols=27 Identities=7% Similarity=0.049 Sum_probs=18.2
Q ss_pred CeEEEEEeccCCCCCCcEEEeCCccEE
Q psy8623 14 NAMVNYTLGESPSRTNHFYMKSVSGEI 40 (68)
Q Consensus 14 ~~~i~y~i~~~~~~~~~f~id~~tG~i 40 (68)
|.+++|.|...+...-..++|+.||.=
T Consensus 134 ~YKl~~~Is~Ps~~~~~rHvDeeTGVG 160 (179)
T COG3470 134 NYKLTFEISAPSKAGYGRHVDEETGVG 160 (179)
T ss_pred cEEEEEEecCCCccccceecccccCcc
Confidence 456788887665433456788888873
No 31
>PF14302 DUF4377: Domain of unknown function (DUF4377)
Probab=27.96 E-value=85 Score=15.30 Aligned_cols=24 Identities=17% Similarity=0.303 Sum_probs=18.5
Q ss_pred ecCCCCCCceEEEEEEEEECCcCC
Q psy8623 44 QDLDFESRSSYEFPVVATDRGKET 67 (68)
Q Consensus 44 ~~ld~e~~~~~~~~v~a~d~g~~~ 67 (68)
..|++|..-.|.+.|.......||
T Consensus 40 eGF~yE~Gy~Y~L~Vk~~~~~npp 63 (80)
T PF14302_consen 40 EGFEYEPGYEYVLRVKRTPVANPP 63 (80)
T ss_pred cCcCcCCCcEEEEEEEEEECCCCC
Confidence 667888888888888887766554
No 32
>PF01011 PQQ: PQQ enzyme repeat family.; InterPro: IPR002372 Pyrrolo-quinoline quinone (PQQ) is a redox coenzyme, which serves as a cofactor for a number of enzymes (quinoproteins) and particularly for some bacterial dehydrogenases [, ]. A number of bacterial quinoproteins belong to this family. Enzymes in this group have repeats of a beta propeller.; PDB: 1H4I_C 1H4J_E 1W6S_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A 1G72_A ....
Probab=27.88 E-value=56 Score=13.16 Aligned_cols=17 Identities=6% Similarity=0.184 Sum_probs=13.1
Q ss_pred CcEEEeCCccEEEEeec
Q psy8623 29 NHFYMKSVSGEICIAQD 45 (68)
Q Consensus 29 ~~f~id~~tG~i~~~~~ 45 (68)
..+.+|..||++.....
T Consensus 11 ~l~AlD~~TG~~~W~~~ 27 (38)
T PF01011_consen 11 YLYALDAKTGKVLWKFQ 27 (38)
T ss_dssp EEEEEETTTTSEEEEEE
T ss_pred EEEEEECCCCCEEEeee
Confidence 57888999998876644
No 33
>COG5584 Predicted small secreted protein [Function unknown]
Probab=27.81 E-value=55 Score=17.00 Aligned_cols=13 Identities=31% Similarity=0.672 Sum_probs=10.4
Q ss_pred CcEEEeCCccEEE
Q psy8623 29 NHFYMKSVSGEIC 41 (68)
Q Consensus 29 ~~f~id~~tG~i~ 41 (68)
--|.+|.+||+|.
T Consensus 86 yef~aDA~TGevi 98 (103)
T COG5584 86 YEFYADANTGEVI 98 (103)
T ss_pred EEEEEecCCceEE
Confidence 3588899999874
No 34
>PF03646 FlaG: FlaG protein; InterPro: IPR005186 Although these proteins are known to be important for flagella, their exact function is unknown.; PDB: 2HC5_A.
Probab=27.68 E-value=46 Score=16.89 Aligned_cols=27 Identities=19% Similarity=0.375 Sum_probs=10.6
Q ss_pred CeEEEEEeccCCCCCCcEEEeCCccEE
Q psy8623 14 NAMVNYTLGESPSRTNHFYMKSVSGEI 40 (68)
Q Consensus 14 ~~~i~y~i~~~~~~~~~f~id~~tG~i 40 (68)
+..+.|++.......-.-.+|..||++
T Consensus 54 ~~~l~F~vde~~~~~vVkViD~~T~eV 80 (107)
T PF03646_consen 54 NTSLRFSVDEESGRVVVKVIDKETGEV 80 (107)
T ss_dssp S--EEEEEEEETTEEEEEEEETTT-SE
T ss_pred CCceEEEEecCCCcEEEEEEECCCCcE
Confidence 445666665544211112235666654
No 35
>PF12134 PRP8_domainIV: PRP8 domain IV core; InterPro: IPR021983 This domain is found in eukaryotes, and is about 20 amino acids in length. It is found associated with PF10597 from PFAM, PF10596 from PFAM, PF10598 from PFAM, PF08083 from PFAM, PF08082 from PFAM, PF01398 from PFAM, PF08084 from PFAM. There is a conserved LILR sequence motif. The domain is a selenomethionine domain in a subunit of the spliceosome. The function of PRP8 domain IV is believed to be interaction with the spliceosomal core. ; PDB: 3E66_A 3E9P_A 3SBT_A 3E9O_A 3SBG_A 3LRU_A 3E9L_A 3ENB_A.
Probab=27.35 E-value=69 Score=19.17 Aligned_cols=15 Identities=13% Similarity=0.419 Sum_probs=10.7
Q ss_pred CcEEEeCCccEEEEe
Q psy8623 29 NHFYMKSVSGEICIA 43 (68)
Q Consensus 29 ~~f~id~~tG~i~~~ 43 (68)
..|.++|.||.+.++
T Consensus 47 ~ifIfnP~TGqLflK 61 (231)
T PF12134_consen 47 AIFIFNPRTGQLFLK 61 (231)
T ss_dssp EEEEE-TTT-EEEEE
T ss_pred eEEEEeCCCCcEEEE
Confidence 468889999999876
No 36
>PRK08577 hypothetical protein; Provisional
Probab=26.13 E-value=1.1e+02 Score=16.17 Aligned_cols=34 Identities=18% Similarity=0.276 Sum_probs=23.0
Q ss_pred cEEEeCCccEEEEeecCCCCCCceEEEEEEEEECC
Q psy8623 30 HFYMKSVSGEICIAQDLDFESRSSYEFPVVATDRG 64 (68)
Q Consensus 30 ~f~id~~tG~i~~~~~ld~e~~~~~~~~v~a~d~g 64 (68)
.|.++...|+|.+ .++..+....|.+.+.+.|..
T Consensus 34 ~~~~~~~~~~~~~-~~~~~~~k~~~~I~V~~~Dr~ 67 (136)
T PRK08577 34 LLIADTDKKEIHL-EPIALPGKKLVEIELVVEDRP 67 (136)
T ss_pred EEEEECCCCEEEE-EEcCCCCccEEEEEEEEcCCC
Confidence 3556777788866 444545566788888887754
No 37
>PF12461 DUF3688: Protein of unknown function (DUF3688) ; InterPro: IPR022160 This entry is represented by Spiroplasma phage 1-C74, Orf1. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. This domain family is found in bacteria and viruses, and is typically between 79 and 104 amino acids in length. There is a conserved YRW sequence motif. There is a single completely conserved residue Y that may be functionally important.
Probab=25.91 E-value=1e+02 Score=15.56 Aligned_cols=14 Identities=14% Similarity=0.105 Sum_probs=10.8
Q ss_pred CcEEEeCCccEEEE
Q psy8623 29 NHFYMKSVSGEICI 42 (68)
Q Consensus 29 ~~f~id~~tG~i~~ 42 (68)
....||.+||+|+-
T Consensus 73 q~P~ID~ntG~Itd 86 (91)
T PF12461_consen 73 QTPTIDKNTGNITD 86 (91)
T ss_pred cCceEcCCCCeEeE
Confidence 45679999999853
No 38
>PF13750 Big_3_3: Bacterial Ig-like domain (group 3)
Probab=25.51 E-value=1.3e+02 Score=16.64 Aligned_cols=17 Identities=41% Similarity=0.669 Sum_probs=14.4
Q ss_pred CCCCceEEEEEEEEECC
Q psy8623 48 FESRSSYEFPVVATDRG 64 (68)
Q Consensus 48 ~e~~~~~~~~v~a~d~g 64 (68)
.|....|+|.|.|.|..
T Consensus 119 le~~~~YtLtV~a~D~a 135 (158)
T PF13750_consen 119 LEADDSYTLTVSATDKA 135 (158)
T ss_pred cCCCCeEEEEEEEEecC
Confidence 36788999999999964
No 39
>cd01756 PLAT_repeat PLAT/LH2 domain repeats of family of proteins with unknown function. In general, PLAT/LH2 consists of an eight stranded beta-barrel and its proposed function is to mediate interaction with lipids or membrane bound proteins.
Probab=24.93 E-value=1.1e+02 Score=15.77 Aligned_cols=18 Identities=28% Similarity=0.126 Sum_probs=10.5
Q ss_pred eCCCCCCeEEEEEeccCC
Q psy8623 8 DPDCGVNAMVNYTLGESP 25 (68)
Q Consensus 8 D~D~g~~~~i~y~i~~~~ 25 (68)
...+|..+.|...|....
T Consensus 12 ~~~AGT~a~V~i~L~G~~ 29 (120)
T cd01756 12 VKGAGTDANVFITLYGEN 29 (120)
T ss_pred CcCCCCCcEEEEEEEeCC
Confidence 344566666766665543
No 40
>PF12216 m04gp34like: Immune evasion protein; InterPro: IPR022022 The proteins in this family are related to the m04 encoded protein gp34 of pathogenic microorganisms such as Murid herpesvirus 1. m06 and m152 genes are expressed earlier in the intracellular replication phases of these microorganisms' life cycles. They function to inhibit MHC-1 loading and export. gp34 is theorized to prevent immune reactions from NK cells which would ordinarily recognise and attack cells lacking MHC.
Probab=24.82 E-value=1.7e+02 Score=17.96 Aligned_cols=20 Identities=20% Similarity=0.373 Sum_probs=15.8
Q ss_pred CcEEEeCCccEEEEeecCCC
Q psy8623 29 NHFYMKSVSGEICIAQDLDF 48 (68)
Q Consensus 29 ~~f~id~~tG~i~~~~~ld~ 48 (68)
.-|.||+.+|.|.+...-.+
T Consensus 138 ~Gf~Vd~~sG~L~i~snat~ 157 (272)
T PF12216_consen 138 DGFKVDPSSGNLYISSNATR 157 (272)
T ss_pred CCeEEcCCCceEEEecCccc
Confidence 55999999999988755554
No 41
>PF02897 Peptidase_S9_N: Prolyl oligopeptidase, N-terminal beta-propeller domain; InterPro: IPR004106 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents the beta-propeller domain found at the N-terminal of prolyl oligopeptidase, including acylamino-acid-releasing enzyme (also known as acylaminoacyl peptidase), which belong to the MEROPS peptidase family S9 (clan SC), subfamily S9A. The prolyl oligopeptidase family consists of a number of evolutionarily related peptidases whose catalytic activity seems to be provided by a charge relay system similar to that of the trypsin family of serine proteases, but which evolved by independent convergent evolution. The N-terminal domain of prolyl oligopeptidases forms an unusual 7-bladed beta-propeller consisting of seven 4-stranded beta-sheet motifs. 
Prolyl oligopeptidase is a large cytosolic enzyme involved in the maturation and degradation of peptide hormones and neuropeptides, which relate to the induction of amnesia. The enzyme contains a peptidase domain, where its catalytic triad (Ser554, His680, Asp641) is covered by the central tunnel of the N-terminal beta-propeller domain. In this way, large structured peptides are excluded from the active site, thereby protecting larger peptides and proteins from proteolysis in the cytosol []. The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. Mammalian acylaminoacyl peptidase is an exopeptidase that is a member of the same prolyl oligopeptidase family of serine peptidases. This enzyme removes acylated amino acid residues from the N terminus of oligopeptides [].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 2BKL_B 3DDU_A 1YR2_A 2XE4_A 1VZ3_A 3EQ9_A 1O6F_A 3EQ7_A 4AN0_A 1UOP_A ....
Probab=24.54 E-value=1.7e+02 Score=18.28 Aligned_cols=30 Identities=7% Similarity=0.248 Sum_probs=17.9
Q ss_pred eEEEEEeccCCCCCCcEEEeCCccEEEEee
Q psy8623 15 AMVNYTLGESPSRTNHFYMKSVSGEICIAQ 44 (68)
Q Consensus 15 ~~i~y~i~~~~~~~~~f~id~~tG~i~~~~ 44 (68)
..+.|.+.+-..+...|.+|..+|++.+.+
T Consensus 383 ~~~~~~~ss~~~P~~~y~~d~~t~~~~~~k 412 (414)
T PF02897_consen 383 DELRFSYSSFTTPPTVYRYDLATGELTLLK 412 (414)
T ss_dssp SEEEEEEEETTEEEEEEEEETTTTCEEEEE
T ss_pred CEEEEEEeCCCCCCEEEEEECCCCCEEEEE
Confidence 345666555443346777787777765543
No 42
>PF11102 Cap_synth_GfcB: Group 4 capsule polysaccharide formation lipoprotein gfcB; InterPro: IPR021308 Some members in this bacterial family of proteins are annotated as YjbF; however, the function is unknown. ; PDB: 2IN5_B.
Probab=24.40 E-value=1.5e+02 Score=16.95 Aligned_cols=16 Identities=19% Similarity=0.632 Sum_probs=11.0
Q ss_pred CcEEEeCCccEEEEee
Q psy8623 29 NHFYMKSVSGEICIAQ 44 (68)
Q Consensus 29 ~~f~id~~tG~i~~~~ 44 (68)
+.|.+|+.+|.|.-++
T Consensus 170 N~yWvd~~sG~V~kS~ 185 (200)
T PF11102_consen 170 NRYWVDPASGQVVKSR 185 (200)
T ss_dssp EEEEEETTT--EEEEE
T ss_pred EEEEEECCCCcEEEEE
Confidence 6789999999986653
No 43
>PF01751 Toprim: Toprim domain; InterPro: IPR006171 This is a conserved region from DNA primase. This corresponds to the Toprim (topoisomerase-primase) domain common to DnaG primases, topoisomerases, OLD family nucleases and RecR/M DNA repair proteins []. Both DnaG motifs IV and V are present in the alignment, the DxD (V) motif may be involved in Mg2+ binding and mutations to the conserved glutamate (IV) completely abolish DnaG type primase activity. DNA primase 2.7.7.6 from EC is a nucleotidyltransferase; it synthesizes the oligoribonucleotide primers required for DNA replication on the lagging strand of the replication fork; it can also prime the leading strand and has been implicated in cell division []. This family also includes the atypical archaeal A subunit from type II DNA topoisomerases []. Type II DNA topoisomerases catalyse the relaxation of DNA supercoiling by causing transient double strand breaks.; PDB: 2ZJT_A 3IG0_A 3M4I_A 3NUH_B 1GKU_B 1GL9_C 3PWT_A 1CY4_A 1ECL_A 1CY7_A ....
Probab=24.33 E-value=41 Score=16.71 Aligned_cols=9 Identities=44% Similarity=0.652 Sum_probs=6.1
Q ss_pred EEEEEeCCC
Q psy8623 3 RVSASDPDC 11 (68)
Q Consensus 3 ~v~A~D~D~ 11 (68)
-+.|+|+|.
T Consensus 63 iiiatD~D~ 71 (100)
T PF01751_consen 63 IIIATDPDR 71 (100)
T ss_dssp EEEEC-SSH
T ss_pred eeecCCCCh
Confidence 367899986
No 44
>PF14564 Membrane_bind: Membrane binding; PDB: 1YHP_A 2B1O_A.
Probab=23.67 E-value=1.1e+02 Score=16.03 Aligned_cols=17 Identities=24% Similarity=0.399 Sum_probs=13.3
Q ss_pred CcEEEeCCccEEEEeec
Q psy8623 29 NHFYMKSVSGEICIAQD 45 (68)
Q Consensus 29 ~~f~id~~tG~i~~~~~ 45 (68)
-+|..++.+|.|.....
T Consensus 71 vYFkY~~s~g~V~~~~~ 87 (110)
T PF14564_consen 71 VYFKYNPSTGEVSIRKT 87 (110)
T ss_dssp EEEEEETTTTEEEEE-T
T ss_pred EEEEECCCCCeEEEeec
Confidence 47999999999987653
No 45
>COG3049 Penicillin V acylase and related amidases [Cell envelope biogenesis, outer membrane]
Probab=23.63 E-value=2.1e+02 Score=18.40 Aligned_cols=37 Identities=24% Similarity=0.525 Sum_probs=23.0
Q ss_pred EEEEEeCCCCCCeEEEEEeccCCCCCCcEEEeCCccEEEE
Q psy8623 3 RVSASDPDCGVNAMVNYTLGESPSRTNHFYMKSVSGEICI 42 (68)
Q Consensus 3 ~v~A~D~D~g~~~~i~y~i~~~~~~~~~f~id~~tG~i~~ 42 (68)
.+...|+..+ -..+.|++...+. ....|.+..|.+.+
T Consensus 150 ~v~~~~~~~~-~~plH~s~sDasG--~S~iiE~~~GklvI 186 (353)
T COG3049 150 IVALNDPGEG-VAPLHYSLSDASG--DSAIIEPIDGKLVI 186 (353)
T ss_pred EEeccCCCCC-CCceeEEEEcCCC--CeEEEEEeCCEEEE
Confidence 4555666665 4568899876554 44566666665544
No 46
>PF07861 WND: WisP family N-Terminal Region; InterPro: IPR012503 This family is found at the N terminus of the Tropheryma whipplei WisP family proteins [].
Probab=23.14 E-value=1.4e+02 Score=17.64 Aligned_cols=28 Identities=11% Similarity=0.196 Sum_probs=19.2
Q ss_pred CCeEEEEEeccCCCCCCcEEEeCCccEEEEe
Q psy8623 13 VNAMVNYTLGESPSRTNHFYMKSVSGEICIA 43 (68)
Q Consensus 13 ~~~~i~y~i~~~~~~~~~f~id~~tG~i~~~ 43 (68)
.+.+..|++.... .-..||..||.|...
T Consensus 201 R~S~~T~SLs~P~---~~v~lD~~TG~l~~S 228 (263)
T PF07861_consen 201 RGSPFTYSLSTPV---AGVRLDANTGALSGS 228 (263)
T ss_pred cCCcceEEeccCC---CceEEecccceeeee
Confidence 3456677775443 347899999998765
No 47
>cd00146 PKD polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases.
Probab=22.30 E-value=1e+02 Score=14.28 Aligned_cols=18 Identities=11% Similarity=0.311 Sum_probs=14.3
Q ss_pred CCCCCceEEEEEEEEECC
Q psy8623 47 DFESRSSYEFPVVATDRG 64 (68)
Q Consensus 47 d~e~~~~~~~~v~a~d~g 64 (68)
.+.....|.+.+.++|..
T Consensus 52 ~y~~~G~y~v~l~v~d~~ 69 (81)
T cd00146 52 TYTKPGTYTVTLTVTNAV 69 (81)
T ss_pred EcCCCcEEEEEEEEEeCC
Confidence 456678899999999874
Done!