Query psy6837
Match_columns 122
No_of_seqs 223 out of 1009
Neff 9.0
Searched_HMMs 46136
Date Fri Aug 16 19:41:37 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy6837.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/6837hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd00031 CA Cadherin repeat dom 99.9 3.2E-21 7E-26 131.4 17.6 96 17-112 87-185 (199)
2 KOG4289|consensus 99.9 4.4E-22 9.6E-27 161.6 15.2 116 4-121 342-470 (2531)
3 KOG1219|consensus 99.9 4.7E-22 1E-26 165.8 15.4 119 3-122 2649-2774(4289)
4 PF00028 Cadherin: Cadherin do 99.9 1.7E-20 3.6E-25 114.4 13.6 79 37-115 1-83 (93)
5 KOG4289|consensus 99.9 5.7E-21 1.2E-25 155.3 13.3 114 9-122 755-882 (2531)
6 KOG1219|consensus 99.8 1.3E-20 2.9E-25 157.3 12.7 106 16-121 2162-2273(4289)
7 cd00031 CA Cadherin repeat dom 99.7 4.5E-16 9.8E-21 105.9 13.3 86 36-121 1-94 (199)
8 smart00112 CA Cadherin repeats 99.5 9.1E-14 2E-18 82.1 9.1 65 58-122 2-74 (79)
9 KOG1834|consensus 99.4 7.8E-12 1.7E-16 97.1 13.3 94 17-113 134-231 (952)
10 PF08266 Cadherin_2: Cadherin- 98.5 5.5E-07 1.2E-11 53.9 5.3 60 36-96 2-66 (84)
11 PF08758 Cadherin_pro: Cadheri 98.4 8.7E-06 1.9E-10 49.4 9.5 86 29-119 3-88 (90)
12 KOG1834|consensus 97.9 0.00016 3.6E-09 57.3 9.8 92 19-112 20-122 (952)
13 smart00736 CADG Dystroglycan-t 96.9 0.036 7.8E-07 33.7 9.6 53 56-111 24-80 (97)
14 smart00112 CA Cadherin repeats 96.8 0.003 6.5E-08 36.6 4.3 28 3-30 48-79 (79)
15 PF05345 He_PIG: Putative Ig d 93.3 0.67 1.5E-05 24.6 5.8 35 74-110 13-48 (49)
16 TIGR01965 VCBS_repeat VCBS rep 92.8 1.2 2.6E-05 27.4 7.0 67 52-120 2-78 (99)
17 KOG3597|consensus 87.1 4.3 9.4E-05 31.6 7.3 56 17-72 27-83 (442)
18 PF13750 Big_3_3: Bacterial Ig 84.8 9.3 0.0002 25.4 12.6 67 52-118 69-144 (158)
19 TIGR00845 caca sodium/calcium 84.0 21 0.00045 30.6 10.2 56 14-72 513-569 (928)
20 TIGR01965 VCBS_repeat VCBS rep 82.0 6.9 0.00015 24.1 5.3 39 3-45 60-98 (99)
21 PF07495 Y_Y_Y: Y_Y_Y domain; 70.4 15 0.00033 19.9 6.6 41 64-108 9-49 (66)
22 PF03160 Calx-beta: Calx-beta 66.9 24 0.00053 20.9 9.3 81 19-110 2-87 (100)
23 cd00146 PKD polycystic kidney 57.0 34 0.00073 19.3 4.4 28 92-119 51-79 (81)
24 smart00089 PKD Repeats in poly 54.4 37 0.00081 19.0 4.3 29 91-119 48-76 (79)
25 PF15418 DUF4625: Domain of un 53.5 41 0.00089 21.7 4.5 24 38-61 97-120 (132)
26 PF10365 DUF2436: Domain of un 50.6 71 0.0015 21.1 7.1 84 26-110 66-156 (161)
27 PF12245 Big_3_2: Bacterial Ig 49.8 42 0.00092 18.3 4.0 19 96-114 21-40 (60)
28 PF13754 Big_3_4: Bacterial Ig 43.8 51 0.0011 17.5 5.0 17 96-112 22-39 (54)
29 PF00635 Motile_Sperm: MSP (Ma 39.1 64 0.0014 19.2 3.7 34 52-86 22-55 (109)
30 PF07145 PAM2: Ataxin-2 C-term 35.1 24 0.00052 14.7 0.9 11 25-35 6-16 (18)
31 PF14979 TMEM52: Transmembrane 32.5 45 0.00097 22.1 2.2 25 94-118 58-82 (154)
32 PF04234 CopC: CopC domain; I 31.6 1E+02 0.0023 18.3 3.7 35 35-69 59-96 (97)
33 PF05688 DUF824: Salmonella re 31.4 77 0.0017 16.7 2.6 19 11-29 10-28 (47)
34 KOG4680|consensus 31.2 1.3E+02 0.0028 19.9 4.1 28 38-66 109-136 (153)
35 KOG4221|consensus 31.1 2.4E+02 0.0052 25.5 6.7 54 62-118 548-607 (1381)
36 TIGR00845 caca sodium/calcium 30.8 3.6E+02 0.0079 23.5 9.4 46 25-72 395-442 (928)
37 smart00634 BID_1 Bacterial Ig- 30.8 81 0.0018 18.5 3.1 31 10-42 14-44 (92)
38 PF14157 YmzC: YmzC-like prote 27.3 1.1E+02 0.0024 17.2 2.9 26 66-91 31-58 (63)
39 cd02848 Chitinase_N_term Chiti 27.2 1.6E+02 0.0035 18.3 6.0 31 91-121 73-106 (106)
40 PF02369 Big_1: Bacterial Ig-l 27.2 89 0.0019 18.8 2.8 64 10-79 19-84 (100)
41 PF00337 Gal-bind_lectin: Gala 26.7 1.7E+02 0.0036 18.2 7.8 72 36-108 1-93 (133)
42 PRK08577 hypothetical protein; 24.7 1.6E+02 0.0034 18.7 3.8 32 76-108 34-65 (136)
43 PF03413 PepSY: Peptidase prop 24.4 1.2E+02 0.0026 15.8 3.0 12 77-88 51-62 (64)
44 PF07861 WND: WisP family N-Te 24.1 2.1E+02 0.0046 20.0 4.4 27 62-89 202-228 (263)
45 PF14550 Peptidase_U35_2: Puta 23.7 68 0.0015 20.5 1.9 19 43-61 82-100 (122)
46 PF09865 DUF2092: Predicted pe 21.9 2.5E+02 0.0055 19.7 4.6 38 6-45 170-207 (214)
47 PF08329 ChitinaseA_N: Chitina 21.8 1.1E+02 0.0023 19.9 2.6 31 91-121 76-109 (133)
48 PF10633 NPCBM_assoc: NPCBM-as 21.8 1.4E+02 0.0031 16.7 2.9 7 4-10 9-15 (78)
49 PF00801 PKD: PKD domain; Int 21.7 1.2E+02 0.0026 16.4 2.5 51 53-111 15-65 (69)
50 PF13860 FlgD_ig: FlgD Ig-like 21.3 1.2E+02 0.0025 17.4 2.5 14 97-110 68-81 (81)
51 PF14730 DUF4468: Domain of un 21.0 1.6E+02 0.0035 17.3 3.1 14 97-110 65-78 (91)
52 PF03671 Ufm1: Ubiquitin fold 20.9 1.8E+02 0.0039 16.9 3.0 25 18-46 3-27 (76)
No 1
>cd00031 CA Cadherin repeat domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion; these domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium; plays a role in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, exists as monomers or dimers (hetero- and homo-); two copies of the repeat are present here
Probab=99.89 E-value=3.2e-21 Score=131.40 Aligned_cols=96 Identities=35% Similarity=0.641 Sum_probs=90.6
Q ss_pred EEEEEEEecCCCCCcccCCceEEEEeCCCCCCcEEEEEEEECCCC--CCcEEEEEEcCCC-CcEEEECCccEEEEcccCC
Q psy6837 17 CVQLRILDRNDSPPSFADFHTEYTISEDVPVGTLVAKVKAIDPDL--PGTIAYSLVAGDE-GKFVLESSSGNLKLKDTLD 93 (122)
Q Consensus 17 ~v~i~V~dvNdn~P~f~~~~~~~~v~E~~~~g~~v~~~~a~D~D~--~~~i~y~l~~~~~-~~f~i~~~~G~i~~~~~Ld 93 (122)
.+.|.|.|+|||+|.|....|.+.++|+.++|+.++++.|.|+|. ++.++|+|.++.. ++|.|++.+|.|.+.+.||
T Consensus 87 ~v~I~V~d~Nd~~P~~~~~~~~~~v~e~~~~~~~i~~~~a~D~D~~~~~~~~y~l~~~~~~~~f~i~~~~G~i~~~~~ld 166 (199)
T cd00031 87 TVTVTVLDVNDNPPVFEQSSYEASVPENAPPGTVVGTVTATDADSGENAKLTYSILSGNDKELFSIDPNTGIITLAKPLD 166 (199)
T ss_pred EEEEEEccCCCCCCcccccceEEEEeCCCCCCCEEEEEEEEcCCCCCCccEEEEEeCCCCCCEEEEeCCceEEEeCCccC
Confidence 899999999999999998999999999999999999999999998 6999999998765 8999999999999999999
Q ss_pred ccCCCEEEEEEEEEeCCCC
Q psy6837 94 REQKDKYRVQVRASDGVQS 112 (122)
Q Consensus 94 ~e~~~~~~l~v~a~D~g~~ 112 (122)
+|....|.+.|.|.|.+.|
T Consensus 167 ~e~~~~~~l~v~a~D~~~~ 185 (199)
T cd00031 167 REEKSSYELTVVATDGGGP 185 (199)
T ss_pred CccCceEEEEEEEEECCCC
Confidence 9999999999999997643
No 2
>KOG4289|consensus
Probab=99.89 E-value=4.4e-22 Score=161.65 Aligned_cols=116 Identities=31% Similarity=0.546 Sum_probs=103.4
Q ss_pred EEEEEEcCCCcee-----EEEEEEEecCCCCCcccCCceEEEEeCCCCCCcEEEEEEEECCCC--CCcEEEEEEcCCC-C
Q psy6837 4 LTLNLAHKFGVIG-----CVQLRILDRNDSPPSFADFHTEYTISEDVPVGTLVAKVKAIDPDL--PGTIAYSLVAGDE-G 75 (122)
Q Consensus 4 l~~~~~~~~g~~~-----~v~i~V~dvNdn~P~f~~~~~~~~v~E~~~~g~~v~~~~a~D~D~--~~~i~y~l~~~~~-~ 75 (122)
|.+. |.|.|.++ +|.|+|.|+|||+|+|....|.+.|+|+..+++.|.+++|+|.|. |+.++|+|.+++. +
T Consensus 342 L~Ve-AsDqG~~pgp~Ta~V~itV~D~NDNaPqFse~~Yvvqv~Edvt~~avvlrV~AtDrD~g~Ng~VHYsi~Sgn~~G 420 (2531)
T KOG4289|consen 342 LDVE-ASDQGRPPGPRTAMVEITVEDENDNAPQFSEKRYVVQVREDVTPPAVVLRVTATDRDKGTNGKVHYSIASGNGRG 420 (2531)
T ss_pred EEEE-eccCCCCCCCceEEEEEEEEecCCCCccccccceEEEecccCCCCceEEEEEecccCCCcCceEEEEeeccCccc
Confidence 3443 34445544 899999999999999999999999999999999999999999999 9999999999864 8
Q ss_pred cEEEECCccEEEEcccCCccCCCEEEEEEEEEeCCCCcee-----EEEEEc
Q psy6837 76 KFVLESSSGNLKLKDTLDREQKDKYRVQVRASDGVQSADV-----VLTILN 121 (122)
Q Consensus 76 ~f~i~~~~G~i~~~~~Ld~e~~~~~~l~v~a~D~g~~~~~-----~v~I~d 121 (122)
.|.||..+|+|.+..+||+|.. +|++.|+|+|||.|+.+ +|+|.|
T Consensus 421 ~f~id~~tGel~vv~plD~e~~-~ytl~IrAqDggrPpLsn~sgl~iqVlD 470 (2531)
T KOG4289|consen 421 QFYIDSLTGELDVVEPLDFENS-EYTLRIRAQDGGRPPLSNTSGLVIQVLD 470 (2531)
T ss_pred cEEEecccceEEEeccccccCC-eeEEEEEcccCCCCCccCCCceEEEEEe
Confidence 9999999999999999999998 99999999999999886 566655
No 3
>KOG1219|consensus
Probab=99.89 E-value=4.7e-22 Score=165.79 Aligned_cols=119 Identities=28% Similarity=0.377 Sum_probs=103.3
Q ss_pred eEEEEEEcCCCcee--EEEEEEEecCCCCCcccCCceEEEEeCCCCCCcEEEEEEEECCCC--CCcEEEEEEcCCCCcEE
Q psy6837 3 DLTLNLAHKFGVIG--CVQLRILDRNDSPPSFADFHTEYTISEDVPVGTLVAKVKAIDPDL--PGTIAYSLVAGDEGKFV 78 (122)
Q Consensus 3 ~l~~~~~~~~g~~~--~v~i~V~dvNdn~P~f~~~~~~~~v~E~~~~g~~v~~~~a~D~D~--~~~i~y~l~~~~~~~f~ 78 (122)
++++.+....+.-+ .+.|.|.|+|||+|.|..++|.+.+.||+|+|+.|++++|.|.|. +++++|+|.+... +|.
T Consensus 2649 qi~v~a~~~~~vva~tsv~vqVkDvNDNaPvFe~d~y~f~i~En~pvGtsV~qf~AsD~Ds~~nGqirysl~~~v~-yF~ 2727 (4289)
T KOG1219|consen 2649 QIKVKATCGQWVVAETSVFVQVKDVNDNAPVFEKDPYLFIIEENSPVGTSVIQFHASDMDSGNNGQIRYSLTSPVP-YFA 2727 (4289)
T ss_pred EEEEEeecCCceEEEEEEEEEeecccCCCccccCCceeEEEeccCCCCceEEEEEeeccCCCCCceEEEEEcCCcc-eEE
Confidence 34444444444233 899999999999999999999999999999999999999999999 9999999997644 999
Q ss_pred EECCccEEEEcccCCccCCCEEEEEEEEEeCCCCcee---EEEEEcC
Q psy6837 79 LESSSGNLKLKDTLDREQKDKYRVQVRASDGVQSADV---VLTILNC 122 (122)
Q Consensus 79 i~~~~G~i~~~~~Ld~e~~~~~~l~v~a~D~g~~~~~---~v~I~d~ 122 (122)
|++++|.|++.+.||+|+++.|.|.|.|+|.|.|+.. .++|.|+
T Consensus 2728 In~etGwlTt~~eld~ek~d~y~lkv~AtDhG~~ssq~~v~v~vtDv 2774 (4289)
T KOG1219|consen 2728 INPETGWLTTLFELDLEKQDLYSLKVVATDHGVPSSQATVLVHVTDV 2774 (4289)
T ss_pred EcCCCCeeeehhhhccccCCceEEEEEEecCCcccccceEEEEEEec
Confidence 9999999999999999999999999999998877554 6677764
No 4
>PF00028 Cadherin: Cadherin domain; InterPro: IPR002126 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism which modulate a wide variety of processes including cell polarisation and migration. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; a single transmembrane domain and five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats, and the fifth contains 4 conserved cysteines and an N-terminal cytoplasmic domain. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats. 
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and calcium binding is essential for cadherin adhesive function and for protection against protease digestion.; GO: 0005509 calcium ion binding, 0007156 homophilic cell adhesion, 0016020 membrane; PDB: 2A4E_A 2A4C_B 2O72_A 2QVI_A 1NCJ_A 3Q2W_A 3Q2N_A 3LNH_B 3LNI_A 3Q2L_A ....
Probab=99.87 E-value=1.7e-20 Score=114.45 Aligned_cols=79 Identities=35% Similarity=0.693 Sum_probs=74.8
Q ss_pred eEEEEeCCCCCCcEEEEEEEECCCC--CCcEEEEEEcCC-CCcEEEECCccEEEEcccCCccCCCEEEEEEEEEeC-CCC
Q psy6837 37 TEYTISEDVPVGTLVAKVKAIDPDL--PGTIAYSLVAGD-EGKFVLESSSGNLKLKDTLDREQKDKYRVQVRASDG-VQS 112 (122)
Q Consensus 37 ~~~~v~E~~~~g~~v~~~~a~D~D~--~~~i~y~l~~~~-~~~f~i~~~~G~i~~~~~Ld~e~~~~~~l~v~a~D~-g~~ 112 (122)
|.+.|+|+.++|+.++++.|.|+|. ++.+.|+|.++. .++|.|++.+|.|++.++||||....|.|.|.|+|+ |.|
T Consensus 1 Y~~~v~E~~~~g~~v~~v~a~D~D~~~n~~i~y~i~~~~~~~~F~I~~~tg~i~~~~~LD~E~~~~y~l~v~a~D~~~~~ 80 (93)
T PF00028_consen 1 YSFSVPENAPPGTVVGQVTATDPDSGPNSQITYSILGGNPDGLFSIDPNTGEISLKKPLDRETQSSYQLTVRATDSGGSP 80 (93)
T ss_dssp EEEEEETTGSTSSEEEEEEEEESSTSTTSSEEEEEEETTSTTSEEEETTTTEEEESSSSCTTTTSEEEEEEEEEETTTSS
T ss_pred CEEEEECCCCCCCEEEEEEEEeCCCCCCceEEEEEecCcccCceEEeeeeeccccceecCcccCCEEEEEEEEEECCCCC
Confidence 7899999999999999999999996 999999999886 689999999999999999999999999999999998 788
Q ss_pred cee
Q psy6837 113 ADV 115 (122)
Q Consensus 113 ~~~ 115 (122)
+++
T Consensus 81 ~~~ 83 (93)
T PF00028_consen 81 PLS 83 (93)
T ss_dssp EEE
T ss_pred CCE
Confidence 765
No 5
>KOG4289|consensus
Probab=99.86 E-value=5.7e-21 Score=155.33 Aligned_cols=114 Identities=32% Similarity=0.436 Sum_probs=104.0
Q ss_pred EcCCCcee-----EEEEEEEecCCCCCcccCCceEEEEeCCCCCCcEEEEEEEECCCC--CCcEEEEEEcCC--CCcEEE
Q psy6837 9 AHKFGVIG-----CVQLRILDRNDSPPSFADFHTEYTISEDVPVGTLVAKVKAIDPDL--PGTIAYSLVAGD--EGKFVL 79 (122)
Q Consensus 9 ~~~~g~~~-----~v~i~V~dvNdn~P~f~~~~~~~~v~E~~~~g~~v~~~~a~D~D~--~~~i~y~l~~~~--~~~f~i 79 (122)
+.|+|.+. ++.|.|.|+|||+|+|..+.|.++|.|++|+++.+++++|+|+|. |+.+.|.+.++. ++.|.|
T Consensus 755 A~D~~~pq~adtttveV~v~diNDnaPqf~assyt~sV~Ed~Pv~TsvlQVSatDaD~g~Ng~v~y~~qg~~d~p~~F~I 834 (2531)
T KOG4289|consen 755 ARDNGIPQKADTTTVEVLVNDINDNAPQFLASSYTGSVFEDAPVFTSVLQVSATDADSGPNGRVYYTFQGGDDGPGDFYI 834 (2531)
T ss_pred ecCCCCCCcCccEEEEEEeecccccCcccchhhceeEeecCCCCcceEEEEEEeccCCCCCceEEEEecCCCCCCCceEE
Confidence 46777776 899999999999999999999999999999999999999999999 899999888763 478999
Q ss_pred ECCccEEEEcccCCccCCCEEEEEEEEEeCCCCcee-----EEEEEcC
Q psy6837 80 ESSSGNLKLKDTLDREQKDKYRVQVRASDGVQSADV-----VLTILNC 122 (122)
Q Consensus 80 ~~~~G~i~~~~~Ld~e~~~~~~l~v~a~D~g~~~~~-----~v~I~d~ 122 (122)
+|.+|.|++...||||....|.|.+.|.|.|.|+.+ +|+|+|+
T Consensus 835 EptSGviRtl~rLdRE~~avy~L~a~avDrg~p~ls~~~eItvtvldv 882 (2531)
T KOG4289|consen 835 EPTSGVIRTLRRLDRENVAVYVLAAYAVDRGNPPLSAPVEITVTVLDV 882 (2531)
T ss_pred ccCcceeehhhhhcchheeEEEEEEEEeeCCCCCcCCceEEEEEEEec
Confidence 999999999999999999999999999998888765 7777775
No 6
>KOG1219|consensus
Probab=99.85 E-value=1.3e-20 Score=157.33 Aligned_cols=106 Identities=30% Similarity=0.473 Sum_probs=96.8
Q ss_pred eEEEEEEEecCCCCCcccCCceEEEEeCCCCCCcEEEEEEEECCCCCCcEEEEEEcCC--CCcEEEECCccEEEEcccCC
Q psy6837 16 GCVQLRILDRNDSPPSFADFHTEYTISEDVPVGTLVAKVKAIDPDLPGTIAYSLVAGD--EGKFVLESSSGNLKLKDTLD 93 (122)
Q Consensus 16 ~~v~i~V~dvNdn~P~f~~~~~~~~v~E~~~~g~~v~~~~a~D~D~~~~i~y~l~~~~--~~~f~i~~~~G~i~~~~~Ld 93 (122)
+.+-|.|.|+|||||.|.+..|.++++|+++.|+.+.++.|+|.|.|+.+.|+|.++. ...|+|++.||.|++.+.||
T Consensus 2162 a~VeIiV~dIndn~PvFeqlsYt~sisE~s~igt~viqilATdsDsn~~isYsl~g~s~~sk~f~In~sTG~it~~g~ld 2241 (4289)
T KOG1219|consen 2162 AKVEIIVGDINDNPPVFEQLSYTISISENSKIGTKVIQILATDSDSNREISYSLEGNSEISKPFRINVSTGWITVAGKLD 2241 (4289)
T ss_pred eEEEEEecccCCCCchhheeeEEEEccCCCccCceEEEEEeccCCCCCceEEEeecCCccccceEEecccceEEEeeecC
Confidence 3899999999999999999999999999999999999999999999999999999753 37899999999999999999
Q ss_pred ccCCCEEEEEEEEEeCCCCcee----EEEEEc
Q psy6837 94 REQKDKYRVQVRASDGVQSADV----VLTILN 121 (122)
Q Consensus 94 ~e~~~~~~l~v~a~D~g~~~~~----~v~I~d 121 (122)
||+.++|.+.|+|+|+|.|-.+ .|.|.|
T Consensus 2242 yE~~q~f~~fvratdggk~lSseviv~V~VeD 2273 (4289)
T KOG1219|consen 2242 YEENQEFRFFVRATDGGKPLSSEVIVEVHVED 2273 (4289)
T ss_pred hhhcceEEEEEEEccCCCcccccEEEEEEehh
Confidence 9999999999999999999222 555554
No 7
>cd00031 CA Cadherin repeat domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion; these domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium; plays a role in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, exists as monomers or dimers (hetero- and homo-); two copies of the repeat are present here
Probab=99.71 E-value=4.5e-16 Score=105.95 Aligned_cols=86 Identities=35% Similarity=0.638 Sum_probs=75.8
Q ss_pred ceEEEEeCCCCCCcEEEEEEEECCCC--CCcEEEEEEcCCC-CcEEEECCccEEEEcccCCccCCCEEEEEEEEEeCCCC
Q psy6837 36 HTEYTISEDVPVGTLVAKVKAIDPDL--PGTIAYSLVAGDE-GKFVLESSSGNLKLKDTLDREQKDKYRVQVRASDGVQS 112 (122)
Q Consensus 36 ~~~~~v~E~~~~g~~v~~~~a~D~D~--~~~i~y~l~~~~~-~~f~i~~~~G~i~~~~~Ld~e~~~~~~l~v~a~D~g~~ 112 (122)
.|.+.|+|+++.|+.++++.|.|+|. ++.++|+|.++.. ++|.|++.+|.|++.+.||||....|.|.|.|+|.|.|
T Consensus 1 ~~~~~i~En~~~g~~v~~~~a~D~D~~~~~~~~y~i~~~~~~~~F~i~~~tG~l~~~~~lD~e~~~~~~l~v~a~D~g~~ 80 (199)
T cd00031 1 SYSVSVPENAPPGTVVGTVSATDPDSGENGRVTYSILGGNEDGLFSIDPNTGVITTTKPLDREEQSEYTLTVVASDGGGP 80 (199)
T ss_pred CeEEEEeCCCCCCCEEEEEEEECCCCCCCceEEEEEeCCCCcccEEEeCCCCEEEECCCCCCcCCceEEEEEEEEECCcC
Confidence 37899999999999999999999998 4789999998765 79999999999999999999999999999999997666
Q ss_pred ce-e----EEEEEc
Q psy6837 113 AD-V----VLTILN 121 (122)
Q Consensus 113 ~~-~----~v~I~d 121 (122)
.+ + +|.|.|
T Consensus 81 ~~~~~~~v~I~V~d 94 (199)
T cd00031 81 PLSSTATVTVTVLD 94 (199)
T ss_pred cceeEEEEEEEEcc
Confidence 53 2 555554
No 8
>smart00112 CA Cadherin repeats. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium.
Probab=99.54 E-value=9.1e-14 Score=82.09 Aligned_cols=65 Identities=32% Similarity=0.653 Sum_probs=54.7
Q ss_pred CCCC--CCcEEEEEEcCCC-CcEEEECCccEEEEcccCCccCCCEEEEEEEEEeCCCCcee-----EEEEEcC
Q psy6837 58 DPDL--PGTIAYSLVAGDE-GKFVLESSSGNLKLKDTLDREQKDKYRVQVRASDGVQSADV-----VLTILNC 122 (122)
Q Consensus 58 D~D~--~~~i~y~l~~~~~-~~f~i~~~~G~i~~~~~Ld~e~~~~~~l~v~a~D~g~~~~~-----~v~I~d~ 122 (122)
|+|. ++.++|+|.++.. ++|.|++.+|.|++.++||||....|.|.|.|.|.|.|+.+ +|+|.|+
T Consensus 2 D~D~g~n~~i~Y~i~~~~~~~~F~i~~~tg~i~~~~~LD~e~~~~y~l~v~a~D~~~~~~~~~~~v~I~V~D~ 74 (79)
T smart00112 2 DADSGENGKVTYSILSGNEDGLFSIDPETGEITTTKPLDREEQPEYTLTVEATDGGGPPLSSTATVTVTVLDV 74 (79)
T ss_pred CCCCCcCcEEEEEEecCCCCCEEEEeCCccEEEeCCccCeeCCCeEEEEEEEEECCCCCcccEEEEEEEEEEC
Confidence 6676 7889999998765 89999999999999999999999999999999997765333 5566654
No 9
>KOG1834|consensus
Probab=99.41 E-value=7.8e-12 Score=97.08 Aligned_cols=94 Identities=23% Similarity=0.409 Sum_probs=83.6
Q ss_pred EEEEEEEecCCCCCcccCCceEEEEeCCCCCCcEEEEEEEECCCC---CCc-EEEEEEcCCCCcEEEECCccEEEEcccC
Q psy6837 17 CVQLRILDRNDSPPSFADFHTEYTISEDVPVGTLVAKVKAIDPDL---PGT-IAYSLVAGDEGKFVLESSSGNLKLKDTL 92 (122)
Q Consensus 17 ~v~i~V~dvNdn~P~f~~~~~~~~v~E~~~~g~~v~~~~a~D~D~---~~~-i~y~l~~~~~~~f~i~~~~G~i~~~~~L 92 (122)
++.|+|.|+|+++|.|....|.+.|.|.- .-..++++.|.|.|- +++ ..|.|.. ++-+|.|| +.|.|+.+.+|
T Consensus 134 tvhIrVkDvNe~AP~f~ep~Yka~V~EGK-~yd~il~veAiD~DCspq~sqIC~YEI~t-~d~PFaId-n~G~irnTekL 210 (952)
T KOG1834|consen 134 TVHIRVKDVNEFAPVFKEPWYKAHVTEGK-VYDSILRVEAIDKDCSPQYSQICEYEITT-PDVPFAID-NDGNIRNTEKL 210 (952)
T ss_pred EEEEEeccccccCchhcccceeeEEecce-eeeeeEEEEeecCCCCCcccceeEEEecC-CCCceEEc-CCCcccccccc
Confidence 89999999999999999999999999984 677889999999998 555 5688885 56689998 89999999999
Q ss_pred CccCCCEEEEEEEEEeCCCCc
Q psy6837 93 DREQKDKYRVQVRASDGVQSA 113 (122)
Q Consensus 93 d~e~~~~~~l~v~a~D~g~~~ 113 (122)
.|.....|.|+|.|.|-|...
T Consensus 211 ny~ke~~Y~ltVtAyDCg~kr 231 (952)
T KOG1834|consen 211 NYTKEHQYKLTVTAYDCGKKR 231 (952)
T ss_pred ccccceeEEEEEEEEeccccc
Confidence 999999999999999955443
No 10
>PF08266 Cadherin_2: Cadherin-like; InterPro: IPR013164 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism which modulate a wide variety of processes including cell polarisation and migration. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; a single transmembrane domain and five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats, and the fifth contains 4 conserved cysteines and an N-terminal cytoplasmic domain. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats. 
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and calcium binding is essential for cadherin adhesive function and for protection against protease digestion. This entry represents a cadherin domain that is usually found at the N terminus of cadherin proteins.; PDB: 1WUZ_A 1WYJ_A.
Probab=98.45 E-value=5.5e-07 Score=53.93 Aligned_cols=60 Identities=25% Similarity=0.563 Sum_probs=40.0
Q ss_pred ceEEEEeCCCCCCcEEEEEEEECCCC----CCcEEEEEEcC-CCCcEEEECCccEEEEcccCCccC
Q psy6837 36 HTEYTISEDVPVGTLVAKVKAIDPDL----PGTIAYSLVAG-DEGKFVLESSSGNLKLKDTLDREQ 96 (122)
Q Consensus 36 ~~~~~v~E~~~~g~~v~~~~a~D~D~----~~~i~y~l~~~-~~~~f~i~~~~G~i~~~~~Ld~e~ 96 (122)
+..++|+|..++|+.|+.+ |.|... .....|.+.+. ...+|.+++.+|.|.++..+|||.
T Consensus 2 qi~YsV~EE~~~Gt~IGni-a~dL~l~~~~l~~~~~ri~s~~~~~~~~v~~~tG~L~v~~rIDRE~ 66 (84)
T PF08266_consen 2 QIRYSVPEEMPPGTVIGNI-AKDLGLDPQSLSSRNFRIVSEGNSQYFRVNEKTGDLFVSERIDREE 66 (84)
T ss_dssp EEEEEEESS--TT-EEEEC-CCCCT--HHHHCCTTBEEE-SSSS-SEEE-TTTSEEEESS--SCCC
T ss_pred CeEEEeecCCCCCCEEEEh-HHhhCCCcccccccceEEeecCCcceeEecCCceeEEeCCccCHHH
Confidence 3578999999999999998 556544 11235776654 468999999999999999999997
No 11
>PF08758 Cadherin_pro: Cadherin prodomain like; InterPro: IPR014868 Cadherins are a group of proteins that mediate calcium-dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This protein corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions.; GO: 0007155 cell adhesion, 0016021 integral to membrane; PDB: 1OP4_A.
Probab=98.39 E-value=8.7e-06 Score=49.35 Aligned_cols=86 Identities=15% Similarity=0.240 Sum_probs=46.8
Q ss_pred CCcccCCceEEEEeCCCCCCcEEEEEEEECCCCCCcEEEEEEcCCCCcEEEECCccEEEEcccCCccCCCEEEEEEEEEe
Q psy6837 29 PPSFADFHTEYTISEDVPVGTLVAKVKAIDPDLPGTIAYSLVAGDEGKFVLESSSGNLKLKDTLDREQKDKYRVQVRASD 108 (122)
Q Consensus 29 ~P~f~~~~~~~~v~E~~~~g~~v~~~~a~D~D~~~~i~y~l~~~~~~~f~i~~~~G~i~~~~~Ld~e~~~~~~l~v~a~D 108 (122)
.|-|.+..|.+.||.+...|..++++.-.|-..+..+.|.-. ++ .|.|. ..|.|+++.++..... .-.|.|.|.|
T Consensus 3 ~pGF~~~~~~~~Vp~~l~~g~~lg~V~f~dC~~~~~~~~~ss--Dp-dF~V~-~DGsVy~~r~v~l~~~-~~~F~V~a~D 77 (90)
T PF08758_consen 3 RPGFSQKKYTFEVPSNLEAGQPLGKVNFEDCTGRRRVIFESS--DP-DFRVL-EDGSVYAKRPVQLSSE-QRSFTVHAWD 77 (90)
T ss_dssp --B--S-EEEE----SS-SS--EEE---B--SS---EEEE-----S-EEEEE-TTTEEEEES--S-SSS--EEEEEEEEE
T ss_pred cCCcccceEEEEcCchhhCCcEEEEEEeccCCCCCceEEecC--CC-CEEEc-CCCeEEEeeeEecCCC-ceEEEEEEEC
Confidence 588999999999999999999999999998865666777655 22 79997 7899999999877543 3579999999
Q ss_pred CCCCceeEEEE
Q psy6837 109 GVQSADVVLTI 119 (122)
Q Consensus 109 ~g~~~~~~v~I 119 (122)
........+.|
T Consensus 78 ~~~~~~~~v~V 88 (90)
T PF08758_consen 78 SQTQEQKEVKV 88 (90)
T ss_dssp TTTTEEEEEEE
T ss_pred CCCCeEEEEEE
Confidence 65555444444
No 12
>KOG1834|consensus
Probab=97.89 E-value=0.00016 Score=57.27 Aligned_cols=92 Identities=21% Similarity=0.264 Sum_probs=66.5
Q ss_pred EEEEEecCCCCCcccCCceEEEEeCCCCCCcEEEEEEEECCCC----C-CcEEEEEEcCCCCcEE---EECCcc--EEEE
Q psy6837 19 QLRILDRNDSPPSFADFHTEYTISEDVPVGTLVAKVKAIDPDL----P-GTIAYSLVAGDEGKFV---LESSSG--NLKL 88 (122)
Q Consensus 19 ~i~V~dvNdn~P~f~~~~~~~~v~E~~~~g~~v~~~~a~D~D~----~-~~i~y~l~~~~~~~f~---i~~~~G--~i~~ 88 (122)
....--+|-+-|. ....|..-|.||...=.....+.|.|.|. . ...-|.|.+.+ -+|. +|+.+| .|+.
T Consensus 20 ~~~aarankhkpw-ie~ey~gvV~Endntvll~Ppl~aLdkdaplr~ageiC~fklhgq~-vPFdavVvdK~TGegvlRa 97 (952)
T KOG1834|consen 20 HHHAARANKHKPW-IEEEYHGVVTENDNTVLLDPPLAALDKDAPLRYAGEICGFKLHGQP-VPFDAVVVDKYTGEGVLRA 97 (952)
T ss_pred ccccccccccCcc-cccceeEEEEeCCceEEeCCCeeeecCCCCcccccccceeEecCCC-CCceEEEEeccCCceEEee
Confidence 4445667887784 45569999999986444444688888886 2 33458888543 3454 577666 6888
Q ss_pred cccCCccCCCEEEEEEEEEe-CCCC
Q psy6837 89 KDTLDREQKDKYRVQVRASD-GVQS 112 (122)
Q Consensus 89 ~~~Ld~e~~~~~~l~v~a~D-~g~~ 112 (122)
+.+||.|.++.|+|+|.|.| |..|
T Consensus 98 K~~lDCelqkeytf~iQAydCg~gp 122 (952)
T KOG1834|consen 98 KEPLDCELQKEYTFTIQAYDCGNGP 122 (952)
T ss_pred cCcccccccccceEEEEEEecCCCC
Confidence 89999999999999999999 4444
No 13
>smart00736 CADG Dystroglycan-type cadherin-like domains. Cadherin-homologous domains present in metazoan dystroglycans and alpha/epsilon sarcoglycans, yeast Axl2p and in a very large protein from magnetotactic bacteria. Likely to bind calcium ions.
Probab=96.86 E-value=0.036 Score=33.70 Aligned_cols=53 Identities=23% Similarity=0.330 Sum_probs=39.5
Q ss_pred EECCCCCCcEEEEEEcCC----CCcEEEECCccEEEEcccCCccCCCEEEEEEEEEeCCC
Q psy6837 56 AIDPDLPGTIAYSLVAGD----EGKFVLESSSGNLKLKDTLDREQKDKYRVQVRASDGVQ 111 (122)
Q Consensus 56 a~D~D~~~~i~y~l~~~~----~~~f~i~~~~G~i~~~~~Ld~e~~~~~~l~v~a~D~g~ 111 (122)
..|+| +..++|++.... +.|.++++.++.+.- .+...+ ...|.+.|.|+|+..
T Consensus 24 F~d~d-~~~lty~~~~~~~~~lP~Wl~fd~~~~~~~G-tP~~~~-~g~~~i~v~a~D~~g 80 (97)
T smart00736 24 FTDAD-GDTLTYSATLSDGSALPSWLSFDSDTGTLSG-TPTNSD-VGSLSLKVTATDSSG 80 (97)
T ss_pred eECCC-CCeEEEEEEeCCCCCCCCeEEEeCCCCEEEE-ECCCCC-CcEEEEEEEEEECCC
Confidence 46777 788999986432 479999999888776 344433 456999999999653
No 14
>smart00112 CA Cadherin repeats. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium.
Probab=96.79 E-value=0.003 Score=36.58 Aligned_cols=28 Identities=32% Similarity=0.475 Sum_probs=19.1
Q ss_pred eEEEEEEcCCC----ceeEEEEEEEecCCCCC
Q psy6837 3 DLTLNLAHKFG----VIGCVQLRILDRNDSPP 30 (122)
Q Consensus 3 ~l~~~~~~~~g----~~~~v~i~V~dvNdn~P 30 (122)
.|.+.+.+..+ ....+.|.|.|+|||+|
T Consensus 48 ~l~v~a~D~~~~~~~~~~~v~I~V~D~Nd~~P 79 (79)
T smart00112 48 TLTVEATDGGGPPLSSTATVTVTVLDVNDNAP 79 (79)
T ss_pred EEEEEEEECCCCCcccEEEEEEEEEECCCCCC
Confidence 44555554322 22389999999999998
No 15
>PF05345 He_PIG: Putative Ig domain; InterPro: IPR008009 This alignment represents the conserved core region of a ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to Hyalin (IPR003410 from INTERPRO) and the PKD domain (IPR000601 from INTERPRO) suggest an Ig-like fold so this family may be similar in function to the (IPR003791 from INTERPRO) and (IPR003790 from INTERPRO) protein families.
Probab=93.26 E-value=0.67 Score=24.63 Aligned_cols=35 Identities=26% Similarity=0.396 Sum_probs=26.4
Q ss_pred CCcEEEECCccEEEEcccCCcc-CCCEEEEEEEEEeCC
Q psy6837 74 EGKFVLESSSGNLKLKDTLDRE-QKDKYRVQVRASDGV 110 (122)
Q Consensus 74 ~~~f~i~~~~G~i~~~~~Ld~e-~~~~~~l~v~a~D~g 110 (122)
+.+..+|+.+|.|.=. .+.. ....|.+.|.|+|+.
T Consensus 13 P~gLs~d~~tG~isGt--p~~~~~~G~y~~~vtatd~~ 48 (49)
T PF05345_consen 13 PSGLSLDPSTGTISGT--PTSSVQPGTYTFTVTATDGS 48 (49)
T ss_pred CCcEEEeCCCCEEEee--cCCCccccEEEEEEEEEcCC
Confidence 5789999999998765 2333 235899999999953
No 16
>TIGR01965 VCBS_repeat VCBS repeat. This domain of about 100 residues is found in multiple (up to 35) copies in long proteins from several species of Vibrio, Colwellia, Bradyrhizobium, and Shewanella (hence the name VCBS) and in smaller copy numbers in proteins from several other bacteria. The large protein size and repeat copy numbers, species distribution, and suggested activities of several member proteins suggest a role for this domain in adhesion.
Probab=92.76 E-value=1.2 Score=27.42 Aligned_cols=67 Identities=19% Similarity=0.291 Sum_probs=38.9
Q ss_pred EEEEEECCCCCCcEEEEEEcC--CCCcEEEECCccEEEEc--------ccCCccCCCEEEEEEEEEeCCCCceeEEEEE
Q psy6837 52 AKVKAIDPDLPGTIAYSLVAG--DEGKFVLESSSGNLKLK--------DTLDREQKDKYRVQVRASDGVQSADVVLTIL 120 (122)
Q Consensus 52 ~~~~a~D~D~~~~i~y~l~~~--~~~~f~i~~~~G~i~~~--------~~Ld~e~~~~~~l~v~a~D~g~~~~~~v~I~ 120 (122)
+++.++|+|.+....+++... .-+.|.|++ +|...-. +.|..-..-.-.|.+.+.||. +...+|+|.
T Consensus 2 G~Lt~sD~D~gd~~~~s~~~~~g~yGtlti~~-~G~wtYtl~n~~~avq~L~~Ge~~tdsFtvtv~DGt-t~~vtItI~ 78 (99)
T TIGR01965 2 GQLTISDADAGQAHFIAQTDAAGQYGTFSIDA-DGQWTYQADNSQTAVQALKAGETLTDTFTVTSADGT-SQTVTITIT 78 (99)
T ss_pred CceEEeCCCCCCceEEecccccCCcEEEEECC-CCcEEEEeCCCcHHHHhhcCCCEEEEEEEEEEeCCC-eEEEEEEEE
Confidence 467889999766677776422 346788875 6643321 123222233456778888863 433466654
No 17
>KOG3597|consensus
Probab=87.07 E-value=4.3 Score=31.59 Aligned_cols=56 Identities=18% Similarity=0.193 Sum_probs=43.5
Q ss_pred EEEEEEEecCCCCCcccCCceEEEEeCCCCCCcEEEEEEEECCCC-CCcEEEEEEcC
Q psy6837 17 CVQLRILDRNDSPPSFADFHTEYTISEDVPVGTLVAKVKAIDPDL-PGTIAYSLVAG 72 (122)
Q Consensus 17 ~v~i~V~dvNdn~P~f~~~~~~~~v~E~~~~g~~v~~~~a~D~D~-~~~i~y~l~~~ 72 (122)
.+.|.|..+||.+..+....+.+.+.|+...-.-...+.+.|+|. -..+.|.+.+.
T Consensus 27 ~~~i~v~pvndpp~~~~~~~~~l~~~~~~~k~l~~~~l~~~d~d~~~~~l~f~v~~t 83 (442)
T KOG3597|consen 27 VLRIHVNPVNDPPSLIFPSGSLLVILEGGQKVLDPELLTAADPDSAPLPLEFQVLGT 83 (442)
T ss_pred eecccccccCCCcceeecccceEEeecCCceeccceEeeccCCCCCccceEEEEccC
Confidence 688999999997776777777788888876544455688999997 66788888754
No 18
>PF13750 Big_3_3: Bacterial Ig-like domain (group 3)
Probab=84.76 E-value=9.3 Score=25.40 Aligned_cols=67 Identities=18% Similarity=0.232 Sum_probs=36.8
Q ss_pred EEEEEECCCCC-CcEEEEEEcCCC-CcEE--EE-CCccEEEEccc---CCccCCCEEEEEEEEEe-CCCCceeEEE
Q psy6837 52 AKVKAIDPDLP-GTIAYSLVAGDE-GKFV--LE-SSSGNLKLKDT---LDREQKDKYRVQVRASD-GVQSADVVLT 118 (122)
Q Consensus 52 ~~~~a~D~D~~-~~i~y~l~~~~~-~~f~--i~-~~~G~i~~~~~---Ld~e~~~~~~l~v~a~D-~g~~~~~~v~ 118 (122)
..+.++|.... .-...+|.+++. ..-. .. ...|.-.+.-+ -..|....|+|+|.|.| .|.....+++
T Consensus 69 i~i~~tD~~~~~~i~sv~l~Gg~~~d~v~ls~~~~~~~~~~~~yp~~fpsle~~~~YtLtV~a~D~aGN~~~~si~ 144 (158)
T PF13750_consen 69 ISINVTDNSDDSKITSVSLTGGPASDSVSLSWTNKGNGVYTLEYPRIFPSLEADDSYTLTVSATDKAGNQSTKSIS 144 (158)
T ss_pred eEEEEEeCCCCceEEEEEEECCcccceEEEeeEeccCceEEeecccccCCcCCCCeEEEEEEEEecCCCEEEEEEE
Confidence 45777776542 234577776642 2222 21 13343333211 13467889999999999 5554444443
No 19
>TIGR00845 caca sodium/calcium exchanger 1. This model is specific for the eukaryotic sodium ion/calcium ion exchangers of the Caca family
Probab=83.97 E-value=21 Score=30.58 Aligned_cols=56 Identities=23% Similarity=0.335 Sum_probs=34.7
Q ss_pred ceeEEEEEEEecCCCCCcccCCceEEEEeCCCCCCcEEEEE-EEECCCCCCcEEEEEEcC
Q psy6837 14 VIGCVQLRILDRNDSPPSFADFHTEYTISEDVPVGTLVAKV-KAIDPDLPGTIAYSLVAG 72 (122)
Q Consensus 14 ~~~~v~i~V~dvNdn~P~f~~~~~~~~v~E~~~~g~~v~~~-~a~D~D~~~~i~y~l~~~ 72 (122)
.+...+|++.| ||++|.|.-..-...|.|+. |+.-.++ +..+.+..-.+.|.-..+
T Consensus 513 ~ps~ATVTIlD-DD~aGIfsFe~~~~sV~Es~--G~vtvtV~RtsGa~G~VtV~Y~T~dG 569 (928)
T TIGR00845 513 SPNTATVTILD-DDHAGIFTFEEDVFHVSESI--GIMEVKVLRTSGARGTVIVPYRTVEG 569 (928)
T ss_pred CCceEEEEEec-CcccCcccccCceEEEEcCC--CEEEEEEEEcCCCCeeEEEEEEeecC
Confidence 33467788887 78899877666678889984 5544443 222332233466776655
No 20
>TIGR01965 VCBS_repeat VCBS repeat. This domain of about 100 residues is found in multiple (up to 35) copies in long proteins from several species of Vibrio, Colwellia, Bradyrhizobium, and Shewanella (hence the name VCBS) and in smaller copy numbers in proteins from several other bacteria. The large protein size and repeat copy numbers, species distribution, and suggested activities of several member proteins suggest a role for this domain in adhesion.
Probab=81.97 E-value=6.9 Score=24.08 Aligned_cols=39 Identities=23% Similarity=0.307 Sum_probs=27.1
Q ss_pred eEEEEEEcCCCceeEEEEEEEecCCCCCcccCCceEEEEeCCC
Q psy6837 3 DLTLNLAHKFGVIGCVQLRILDRNDSPPSFADFHTEYTISEDV 45 (122)
Q Consensus 3 ~l~~~~~~~~g~~~~v~i~V~dvNdn~P~f~~~~~~~~v~E~~ 45 (122)
++++.+.+ |.+..|.|.|.-.|| +|+.... -...+.|+.
T Consensus 60 sFtvtv~D--Gtt~~vtItI~GtND-apvi~~~-~~g~v~ED~ 98 (99)
T TIGR01965 60 TFTVTSAD--GTSQTVTITITGAND-AAVIGGA-DTGSVTEDS 98 (99)
T ss_pred EEEEEEeC--CCeEEEEEEEEccCC-CCEEecc-cceeEecCC
Confidence 45555544 567899999999999 8866543 246666653
No 21
>PF07495 Y_Y_Y: Y_Y_Y domain; InterPro: IPR011123 This region is mostly found at the end of the beta propellers (IPR011110 from INTERPRO) in a family of two component regulators. However they are also found tandemly repeated in Q891H4 from SWISSPROT without other signal conduction domains being present. It is named after the conserved tyrosines found in the alignment. The exact function is not known.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=70.38 E-value=15 Score=19.90 Aligned_cols=41 Identities=32% Similarity=0.416 Sum_probs=24.1
Q ss_pred cEEEEEEcCCCCcEEEECCccEEEEcccCCccCCCEEEEEEEEEe
Q psy6837 64 TIAYSLVAGDEGKFVLESSSGNLKLKDTLDREQKDKYRVQVRASD 108 (122)
Q Consensus 64 ~i~y~l~~~~~~~f~i~~~~G~i~~~~~Ld~e~~~~~~l~v~a~D 108 (122)
...|.|.+-+..|..+...+-.+... .| ....|+|.|.|.|
T Consensus 9 ~Y~Y~l~g~d~~W~~~~~~~~~~~~~-~L---~~G~Y~l~V~a~~ 49 (66)
T PF07495_consen 9 RYRYRLEGFDDEWITLGSYSNSISYT-NL---PPGKYTLEVRAKD 49 (66)
T ss_dssp EEEEEEETTESSEEEESSTS-EEEEE-S-----SEEEEEEEEEEE
T ss_pred EEEEEEECCCCeEEECCCCcEEEEEE-eC---CCEEEEEEEEEEC
Confidence 34566675556666664332233332 22 4578999999999
No 22
>PF03160 Calx-beta: Calx-beta domain; InterPro: IPR003644 The calx-beta motif is present as a tandem repeat in the cytoplasmic domains of Calx Na-Ca exchangers, which are used to expel calcium from cells. This motif overlaps domains used for calcium binding and regulation. The calx-beta motif is also present in the cytoplasmic tail of mammalian integrin-beta4, which mediates the bi-directional transfer of signals across the plasma membrane, as well as in some cyanobacterial proteins. This motif contains a series of beta-strands and turns that form a self-contained beta-sheet [, ].; GO: 0007154 cell communication, 0016021 integral to membrane; PDB: 3H6A_B 3FSO_A 3FQ4_B 2DPK_A 2QVM_A 3GIN_B 2QVK_A 2FWU_A 2FWS_A 3E9U_A ....
Probab=66.93 E-value=24 Score=20.92 Aligned_cols=81 Identities=28% Similarity=0.453 Sum_probs=40.4
Q ss_pred EEEEEecCCCCCcccCCceEEEEeCCCCCCcEEEEEEEECCCC--CCcEEEEEEcCC---CCcEEEECCccEEEEcccCC
Q psy6837 19 QLRILDRNDSPPSFADFHTEYTISEDVPVGTLVAKVKAIDPDL--PGTIAYSLVAGD---EGKFVLESSSGNLKLKDTLD 93 (122)
Q Consensus 19 ~i~V~dvNdn~P~f~~~~~~~~v~E~~~~g~~v~~~~a~D~D~--~~~i~y~l~~~~---~~~f~i~~~~G~i~~~~~Ld 93 (122)
+|.+.| ||.+ .+.-..-...+.|+. |..-..+.-...+. .-.+.|...++. ..-| .+.+|.|.....
T Consensus 2 tvtI~d-~d~~-~v~f~~~~~~v~E~~--~~~~v~V~~~~~~~~~~v~v~~~~~~gtA~~~~Dy--~~~~~~v~f~~g-- 73 (100)
T PF03160_consen 2 TVTILD-DDDP-TVSFSSPSYTVSEGD--GTVTVTVTRSGGSLDGPVTVNYSTVDGTATAGSDY--SPTSGTVTFPPG-- 73 (100)
T ss_dssp EEEEE--TTSE-EEEESSSEEEEETTS--SEEEEEEEEESS-TSSEEEEEEEEEESSSETTTSB--E--EEEEEE-TT--
T ss_pred EEEEEC-CCCC-EEEEeCCEEEEEeCC--CEEEEEEEEcccCCCcceEEEEEEeCCcccccccc--ccceeEEEECCC--
Confidence 567777 6644 766555577889986 44555555554442 444667776653 2345 235666655433
Q ss_pred ccCCCEEEEEEEEEeCC
Q psy6837 94 REQKDKYRVQVRASDGV 110 (122)
Q Consensus 94 ~e~~~~~~l~v~a~D~g 110 (122)
+. ...+.|...|..
T Consensus 74 -~t--~~~i~i~i~dD~ 87 (100)
T PF03160_consen 74 -ET--SKTINITIIDDD 87 (100)
T ss_dssp --S--EEEEEEEB---S
T ss_pred -Ce--EEEEEEEEeCCC
Confidence 22 334444445533
No 23
>cd00146 PKD polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases.
Probab=57.02 E-value=34 Score=19.26 Aligned_cols=28 Identities=18% Similarity=0.284 Sum_probs=19.1
Q ss_pred CCccCCCEEEEEEEEEeCC-CCceeEEEE
Q psy6837 92 LDREQKDKYRVQVRASDGV-QSADVVLTI 119 (122)
Q Consensus 92 Ld~e~~~~~~l~v~a~D~g-~~~~~~v~I 119 (122)
..|.....|++++.++|.. .....++.|
T Consensus 51 ~~y~~~G~y~v~l~v~d~~g~~~~~~~~V 79 (81)
T cd00146 51 HTYTKPGTYTVTLTVTNAVGSSSTKTTTV 79 (81)
T ss_pred EEcCCCcEEEEEEEEEeCCCCEEEEEEEE
Confidence 4467888999999999964 333324444
No 24
>smart00089 PKD Repeats in polycystic kidney disease 1 (PKD1) and other proteins. Polycystic kidney disease 1 protein contains 14 repeats, present elsewhere such as in microbial collagenases.
Probab=54.37 E-value=37 Score=18.97 Aligned_cols=29 Identities=24% Similarity=0.366 Sum_probs=20.0
Q ss_pred cCCccCCCEEEEEEEEEeCCCCceeEEEE
Q psy6837 91 TLDREQKDKYRVQVRASDGVQSADVVLTI 119 (122)
Q Consensus 91 ~Ld~e~~~~~~l~v~a~D~g~~~~~~v~I 119 (122)
...|+....|.+++.+.|.......+++|
T Consensus 48 ~~~y~~~G~y~v~l~v~n~~g~~~~~~~i 76 (79)
T smart00089 48 THTYTKPGTYTVTLTVTNAVGSASATVTV 76 (79)
T ss_pred EEEeCCCcEEEEEEEEEcCCCcEEEEEEE
Confidence 34567788999999999954444444444
No 25
>PF15418 DUF4625: Domain of unknown function (DUF4625)
Probab=53.49 E-value=41 Score=21.71 Aligned_cols=24 Identities=21% Similarity=0.286 Sum_probs=21.6
Q ss_pred EEEEeCCCCCCcEEEEEEEECCCC
Q psy6837 38 EYTISEDVPVGTLVAKVKAIDPDL 61 (122)
Q Consensus 38 ~~~v~E~~~~g~~v~~~~a~D~D~ 61 (122)
.+.||+++++|..-+.+.++|...
T Consensus 97 ~i~IPa~a~~G~YH~~i~VtD~~G 120 (132)
T PF15418_consen 97 HIDIPADAPAGDYHFMITVTDAAG 120 (132)
T ss_pred eeeCCCCCCCcceEEEEEEEECCC
Confidence 578999999999999999999974
No 26
>PF10365 DUF2436: Domain of unknown function (DUF2436); InterPro: IPR018832 Gingipains R and K are endopeptidases with specificity for arginyl and lysyl bonds, respectively. Like other cysteine peptidases, they require reducing conditions for activity. They are maximally active at approximately neutral pH. Gingipains R and K are secreted by the bacterium Porphyromonas gingivalis (Bacteroides gingivalis). The bacterium is a major pathogen in periodontal disease, and the many ways in which the activities of the gingipains may contribute to the disease processes have been reviewed []. These enzymes are also involved in the hemagglutinating activity of the organisms. This entry represents a central region found in gingipain K peptidases, active on lysyl bonds; they belong to the MEROPS peptidase family C25 (gingipain family, clan CD).
Probab=50.56 E-value=71 Score=21.09 Aligned_cols=84 Identities=21% Similarity=0.273 Sum_probs=44.9
Q ss_pred CCCCCcccCCceEEEEeCCCCCCcEEEEEEEECCCC-----CCcEEEEEEcCC-CCcEEEECCccEEEEc-ccCCccCCC
Q psy6837 26 NDSPPSFADFHTEYTISEDVPVGTLVAKVKAIDPDL-----PGTIAYSLVAGD-EGKFVLESSSGNLKLK-DTLDREQKD 98 (122)
Q Consensus 26 Ndn~P~f~~~~~~~~v~E~~~~g~~v~~~~a~D~D~-----~~~i~y~l~~~~-~~~f~i~~~~G~i~~~-~~Ld~e~~~ 98 (122)
|+|+|.=....++..||+|+-+-.. .+-...|... .+..-|-|.... .+...|--..|.=..+ .-.-+|...
T Consensus 66 n~~~pa~ly~~FEYkiP~NADps~t-pq~mv~dG~~~i~IPaG~YDy~I~~P~~~~kiwIaGd~g~~~tr~dDy~fEAGK 144 (161)
T PF10365_consen 66 NCNVPANLYDPFEYKIPANADPSTT-PQNMVVDGEASIDIPAGTYDYCIAAPQPGGKIWIAGDGGDGPTRGDDYVFEAGK 144 (161)
T ss_pred CCCCChhhcccceEeccCCCCCccC-cceEEecCceEEEecCceeEEEEecCCCCCeEEEecCCCCCCccccceEEecCC
Confidence 5667765556788889988764322 2222222211 444556665442 2444443222211111 224458899
Q ss_pred EEEEEEEEEeCC
Q psy6837 99 KYRVQVRASDGV 110 (122)
Q Consensus 99 ~~~l~v~a~D~g 110 (122)
.|++++.....|
T Consensus 145 tY~ftm~~~g~g 156 (161)
T PF10365_consen 145 TYRFTMKRVGSG 156 (161)
T ss_pred EEEEEEEeccCC
Confidence 999999887643
No 27
>PF12245 Big_3_2: Bacterial Ig-like domain (group 3); InterPro: IPR022038 This family of proteins is found in bacteria. They have two conserved sequence motifs: AGN and GMT.
Probab=49.76 E-value=42 Score=18.28 Aligned_cols=19 Identities=21% Similarity=0.419 Sum_probs=13.8
Q ss_pred CCCEEEEEEEEEe-CCCCce
Q psy6837 96 QKDKYRVQVRASD-GVQSAD 114 (122)
Q Consensus 96 ~~~~~~l~v~a~D-~g~~~~ 114 (122)
....|++.+.|.| .|....
T Consensus 21 ~dg~yt~~v~a~D~AGN~~~ 40 (60)
T PF12245_consen 21 ADGEYTLTVTATDKAGNTSS 40 (60)
T ss_pred CCccEEEEEEEEECCCCEEE
Confidence 3568999999999 444433
No 28
>PF13754 Big_3_4: Bacterial Ig-like domain (group 3)
Probab=43.78 E-value=51 Score=17.49 Aligned_cols=17 Identities=24% Similarity=0.440 Sum_probs=12.9
Q ss_pred CCCEEEEEEEEEe-CCCC
Q psy6837 96 QKDKYRVQVRASD-GVQS 112 (122)
Q Consensus 96 ~~~~~~l~v~a~D-~g~~ 112 (122)
....|.+.+.|+| .|..
T Consensus 22 ~dG~y~itv~a~D~AGN~ 39 (54)
T PF13754_consen 22 ADGTYTITVTATDAAGNT 39 (54)
T ss_pred CCccEEEEEEEEeCCCCC
Confidence 3568999999999 4443
No 29
>PF00635 Motile_Sperm: MSP (Major sperm protein) domain; InterPro: IPR000535 Major sperm proteins (MSP) are central components in molecular interactions underlying sperm motility in Caenorhabditis elegans, whose sperm employ an amoebae-like crawling motion using a MSP-containing lamellipod, rather than the flagellar-based swimming motion associated with other sperm. These proteins oligomerise to form an extensive filament system that extends from sperm villipoda, along the leading edge of the pseudopod. About 30 MSP isoforms may exist in C. elegans. MSPs form a fibrous network, whereby MSP dimers form helical subfilaments that coil around one another to produce filaments, which in turn form supercoils to produce bundles. The crystal structure of MSP from C. elegans reveals an immunoglobulin (Ig)-like seven-stranded beta sandwich fold []. ; GO: 0005198 structural molecule activity; PDB: 1MSP_A 3MSP_B 2BVU_B 2MSP_C 1Z9O_F 1Z9L_A 3IKK_A 1WIC_A 2CRI_A 2RR3_A ....
Probab=39.07 E-value=64 Score=19.20 Aligned_cols=34 Identities=15% Similarity=0.401 Sum_probs=22.4
Q ss_pred EEEEEECCCCCCcEEEEEEcCCCCcEEEECCccEE
Q psy6837 52 AKVKAIDPDLPGTIAYSLVAGDEGKFVLESSSGNL 86 (122)
Q Consensus 52 ~~~~a~D~D~~~~i~y~l~~~~~~~f~i~~~~G~i 86 (122)
..+...-.- +..+.|.+.......|.+.|..|.|
T Consensus 22 ~~l~l~N~s-~~~i~fKiktt~~~~y~v~P~~G~i 55 (109)
T PF00635_consen 22 CELTLTNPS-DKPIAFKIKTTNPNRYRVKPSYGII 55 (109)
T ss_dssp EEEEEEE-S-SSEEEEEEEES-TTTEEEESSEEEE
T ss_pred EEEEEECCC-CCcEEEEEEcCCCceEEecCCCEEE
Confidence 334443332 5678899987777789999988854
No 30
>PF07145 PAM2: Ataxin-2 C-terminal region; InterPro: IPR009818 This entry represents a conserved region approximately 250 residues long located towards the C terminus of eukaryotic ataxin-2. Ataxin-2 is a protein of unknown function, within which expansion of a polyglutamine tract (due to expansion of unstable CAG repeats in the coding region of the SCA2 gene) causes spinocerebellar ataxia type 2 (SCA2), a late-onset neurodegenerative disorder []. The expanded polyglutamine repeat in ataxin-2 causes disruption of the normal morphology of the Golgi complex and increased incidence of cell death []. Ataxin-2 is predicted to consist of mostly non-globular domains [].; PDB: 3NTW_B 1JH4_B 3KTR_B 3KUJ_B 3KUT_D 3KUS_D 1JGN_B 2RQG_A 2RQH_A.
Probab=35.07 E-value=24 Score=14.71 Aligned_cols=11 Identities=27% Similarity=0.410 Sum_probs=7.8
Q ss_pred cCCCCCcccCC
Q psy6837 25 RNDSPPSFADF 35 (122)
Q Consensus 25 vNdn~P~f~~~ 35 (122)
.|-|+|.|.+.
T Consensus 6 LNp~A~eFvP~ 16 (18)
T PF07145_consen 6 LNPNAPEFVPS 16 (18)
T ss_dssp SSTTSSSS-TT
T ss_pred cCCCCccccCC
Confidence 57889998765
No 31
>PF14979 TMEM52: Transmembrane 52
Probab=32.45 E-value=45 Score=22.11 Aligned_cols=25 Identities=24% Similarity=0.248 Sum_probs=19.3
Q ss_pred ccCCCEEEEEEEEEeCCCCceeEEE
Q psy6837 94 REQKDKYRVQVRASDGVQSADVVLT 118 (122)
Q Consensus 94 ~e~~~~~~l~v~a~D~g~~~~~~v~ 118 (122)
....+.|+++|.+.|...+-.+||+
T Consensus 58 ~~~~~P~~~TVia~D~DSt~hsTvT 82 (154)
T PF14979_consen 58 PPAPQPYEVTVIAVDSDSTLHSTVT 82 (154)
T ss_pred CCCCCCceEEEEeccCCccccchhh
Confidence 3456789999999998887766654
No 32
>PF04234 CopC: CopC domain; InterPro: IPR007348 CopC is a bacterial blue copper protein that binds 1 atom of copper per protein molecule. Along with CopA, CopC mediates copper resistance by sequestration of copper in the periplasm [].; GO: 0005507 copper ion binding, 0046688 response to copper ion, 0042597 periplasmic space; PDB: 1IX2_B 1LYQ_A 2C9P_C 2C9R_A 2C9Q_A 1M42_A 1OT4_A 1NM4_A.
Probab=31.62 E-value=1e+02 Score=18.31 Aligned_cols=35 Identities=14% Similarity=0.347 Sum_probs=19.8
Q ss_pred CceEEEEeCCCCCCcEEEEEEEECCCC---CCcEEEEE
Q psy6837 35 FHTEYTISEDVPVGTLVAKVKAIDPDL---PGTIAYSL 69 (122)
Q Consensus 35 ~~~~~~v~E~~~~g~~v~~~~a~D~D~---~~~i~y~l 69 (122)
..+.+.+++..+.|......++.-.|. .+.+.|++
T Consensus 59 ~~~~~~l~~~l~~G~YtV~wrvvs~DGH~~~G~~~F~V 96 (97)
T PF04234_consen 59 KTLTVPLPPPLPPGTYTVSWRVVSADGHPVSGSFSFTV 96 (97)
T ss_dssp TEEEEEESS---SEEEEEEEEEEETTSCEEEEEEEEEE
T ss_pred eEEEEECCCCCCCceEEEEEEEEecCCCCcCCEEEEEE
Confidence 456677777777777776666655564 45555543
No 33
>PF05688 DUF824: Salmonella repeat of unknown function (DUF824); InterPro: IPR008542 This family consists of a series of repeated sequences (of around 180 residues) which are found in Salmonella typhimurium, Salmonella typhi and Escherichia coli. These repeats are almost always found with this entry. The repeats are associated with RatA and RatB, the coding sequences of which are found in the pathogenicity island of Salmonella. The sequences may be determinants of pathogenicity [, ].
Probab=31.42 E-value=77 Score=16.66 Aligned_cols=19 Identities=26% Similarity=0.364 Sum_probs=13.8
Q ss_pred CCCceeEEEEEEEecCCCC
Q psy6837 11 KFGVIGCVQLRILDRNDSP 29 (122)
Q Consensus 11 ~~g~~~~v~i~V~dvNdn~ 29 (122)
.-|..-+++|++.|.|.||
T Consensus 10 K~Ge~I~ltVt~kda~G~p 28 (47)
T PF05688_consen 10 KVGETIPLTVTVKDANGNP 28 (47)
T ss_pred ecCCeEEEEEEEECCCCCC
Confidence 4455668999999997743
No 34
>KOG4680|consensus
Probab=31.21 E-value=1.3e+02 Score=19.87 Aligned_cols=28 Identities=21% Similarity=0.322 Sum_probs=23.2
Q ss_pred EEEEeCCCCCCcEEEEEEEECCCCCCcEE
Q psy6837 38 EYTISEDVPVGTLVAKVKAIDPDLPGTIA 66 (122)
Q Consensus 38 ~~~v~E~~~~g~~v~~~~a~D~D~~~~i~ 66 (122)
...+|--.|+|+.+...+|.|.+ +.+++
T Consensus 109 sq~LPg~tPPG~Y~lkm~~~d~~-~~~LT 136 (153)
T KOG4680|consen 109 SQVLPGYTPPGSYVLKMTAYDAK-GKELT 136 (153)
T ss_pred eEeccCcCCCceEEEEEEeecCC-CCEEE
Confidence 46789999999999999999998 55544
No 35
>KOG4221|consensus
Probab=31.13 E-value=2.4e+02 Score=25.50 Aligned_cols=54 Identities=20% Similarity=0.356 Sum_probs=33.5
Q ss_pred CCcE-EEEEEc---CCCCcEEEECCccEEEEcccCCccCCCEEEEEEEEEe--CCCCceeEEE
Q psy6837 62 PGTI-AYSLVA---GDEGKFVLESSSGNLKLKDTLDREQKDKYRVQVRASD--GVQSADVVLT 118 (122)
Q Consensus 62 ~~~i-~y~l~~---~~~~~f~i~~~~G~i~~~~~Ld~e~~~~~~l~v~a~D--~g~~~~~~v~ 118 (122)
|+.+ .|++.. +...++.++.++-+..+. +.+....|.+.|.|.. |-.++.+.++
T Consensus 548 n~~I~~yk~~ys~~~~~~~~~~~~n~~e~ti~---gL~k~TeY~~~vvA~N~~G~g~sS~~i~ 607 (1381)
T KOG4221|consen 548 NGPITGYKLFYSEDDTGKELRVENNATEYTIN---GLEKYTEYSIRVVAYNSAGSGVSSADIT 607 (1381)
T ss_pred CCCceEEEEEEEcCCCCceEEEecCccEEEee---cCCCccceEEEEEEecCCCCCCCCCceE
Confidence 4443 455533 245678888666666655 3467789999999998 3333333443
No 36
>TIGR00845 caca sodium/calcium exchanger 1. This model is specific for the eukaryotic sodium ion/calcium ion exchangers of the Caca family
Probab=30.81 E-value=3.6e+02 Score=23.52 Aligned_cols=46 Identities=26% Similarity=0.245 Sum_probs=28.9
Q ss_pred cCCCCCcccCCceEEEEeCCCCCCcEEEEEEEECCCC--CCcEEEEEEcC
Q psy6837 25 RNDSPPSFADFHTEYTISEDVPVGTLVAKVKAIDPDL--PGTIAYSLVAG 72 (122)
Q Consensus 25 vNdn~P~f~~~~~~~~v~E~~~~g~~v~~~~a~D~D~--~~~i~y~l~~~ 72 (122)
.||..+.|.-.+-...|.|+. |+.-..+.-...|. .-.+.|+..++
T Consensus 395 ~dd~~s~i~Fe~~~Y~V~En~--GtV~VtV~R~GGdl~~tVsVdY~T~DG 442 (928)
T TIGR00845 395 ENDPVSKIFFEPGHYTCLENC--GTVALTVVRRGGDLTNTVYVDYRTEDG 442 (928)
T ss_pred ccCCcceEEecCCeEEEeecC--cEEEEEEEEccCCCCceEEEEEEccCC
Confidence 456556655555577888985 66666665544343 34577887766
No 37
>smart00634 BID_1 Bacterial Ig-like domain (group 1).
Probab=30.77 E-value=81 Score=18.48 Aligned_cols=31 Identities=13% Similarity=0.074 Sum_probs=19.0
Q ss_pred cCCCceeEEEEEEEecCCCCCcccCCceEEEEe
Q psy6837 10 HKFGVIGCVQLRILDRNDSPPSFADFHTEYTIS 42 (122)
Q Consensus 10 ~~~g~~~~v~i~V~dvNdn~P~f~~~~~~~~v~ 42 (122)
.+......+.++|.|.|.||= -. ....|.+.
T Consensus 14 Adg~d~~~i~v~v~D~~Gnpv-~~-~~V~f~~~ 44 (92)
T smart00634 14 ANGSDAITLTATVTDANGNPV-AG-QEVTFTTP 44 (92)
T ss_pred EcCcccEEEEEEEECCCCCCc-CC-CEEEEEEC
Confidence 444555689999999988543 22 22445544
No 38
>PF14157 YmzC: YmzC-like protein; PDB: 3KVP_E.
Probab=27.31 E-value=1.1e+02 Score=17.18 Aligned_cols=26 Identities=19% Similarity=0.414 Sum_probs=17.2
Q ss_pred EEEEEcCCC--CcEEEECCccEEEEccc
Q psy6837 66 AYSLVAGDE--GKFVLESSSGNLKLKDT 91 (122)
Q Consensus 66 ~y~l~~~~~--~~f~i~~~~G~i~~~~~ 91 (122)
+|.+.+... ..|..|+.+++|.+.|.
T Consensus 31 ~Fav~~e~~~iKIfkyd~~tNei~L~KE 58 (63)
T PF14157_consen 31 HFAVVDEDGQIKIFKYDEDTNEITLKKE 58 (63)
T ss_dssp EEEEE-ETTEEEEEEEETTTTEEEEEEE
T ss_pred EEEEEecCCeEEEEEeCCCCCeEEEEEe
Confidence 455552222 46888999999988765
No 39
>cd02848 Chitinase_N_term Chitinase N-terminus domain. Chitinases hydrolyze the abundant natural biopolymer chitin, producing smaller chito-oligosaccharides. Chitin consists of multiple N-acetyl-D-glucosamine (NAG) residues connected via beta-1,4-glycosidic linkages and is an important structural element of fungal cell wall and arthropod exoskeletons. On the basis of the mode of chitin hydrolysis, chitinases are classified as random, endo-, and exo-chitinases and based on sequence criteria, chitinases belong to families 18 and 19 of glycosyl hydrolases. The N-terminus of chitinase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitob
Probab=27.21 E-value=1.6e+02 Score=18.33 Aligned_cols=31 Identities=16% Similarity=0.333 Sum_probs=21.7
Q ss_pred cCCccCCCEEEEEEEEEeCCCCcee---EEEEEc
Q psy6837 91 TLDREQKDKYRVQVRASDGVQSADV---VLTILN 121 (122)
Q Consensus 91 ~Ld~e~~~~~~l~v~a~D~g~~~~~---~v~I~d 121 (122)
.+++.+...|.+.|+++|+...+.+ .|.|-|
T Consensus 73 t~~v~kgG~y~m~V~lCn~dGCS~S~~~~I~VAD 106 (106)
T cd02848 73 TFKVGKGGRYQMQVALCNGDGCSTSAAKEIVVAD 106 (106)
T ss_pred EEEeCCCCeEEEEEEEECCCCccCcCCEEEEecC
Confidence 4567778899999999995544444 455543
No 40
>PF02369 Big_1: Bacterial Ig-like domain (group 1); InterPro: IPR003344 Proteins that contain this domain are found in a variety of bacterial and phage surface proteins such as intimins. Intimin is a bacterial cell-adhesion molecule that mediates the intimate bacterial host-cell interaction. It contains three domains; two immunoglobulin-like domains and a C-type lectin-like module implying that carbohydrate recognition may be important in intimin-mediated cell adhesion [].; PDB: 1CWV_A 4E9L_A 1F02_I 1F00_I.
Probab=27.17 E-value=89 Score=18.77 Aligned_cols=64 Identities=19% Similarity=0.244 Sum_probs=28.7
Q ss_pred cCCCceeEEEEEEEecCCCCCcccCCceEEEEeCCCCCCcEEEE--EEEECCCCCCcEEEEEEcCCCCcEEE
Q psy6837 10 HKFGVIGCVQLRILDRNDSPPSFADFHTEYTISEDVPVGTLVAK--VKAIDPDLPGTIAYSLVAGDEGKFVL 79 (122)
Q Consensus 10 ~~~g~~~~v~i~V~dvNdn~P~f~~~~~~~~v~E~~~~g~~v~~--~~a~D~D~~~~i~y~l~~~~~~~f~i 79 (122)
.+.+...++.++|.|.|.||= ... .+.+......|..... ...+|. ++.....|.+...+...|
T Consensus 19 a~g~~~~tltatV~D~~gnpv--~g~--~V~f~~~~~~~~l~~~~~~~~Td~--~G~a~~tltst~aG~~~V 84 (100)
T PF02369_consen 19 ADGSDTNTLTATVTDANGNPV--PGQ--PVTFSSSSSGGTLSPTNTSATTDS--NGIATVTLTSTKAGTYTV 84 (100)
T ss_dssp SSSSS-EEEEEEEEETTSEB---TS---EEEE--EESSSEES-CEE-EEE-T--TSEEEEEEE-SS-EEEEE
T ss_pred eCCcCcEEEEEEEEcCCCCCC--CCC--EEEEEEcCCCcEEecCccccEECC--CEEEEEEEEecCceEEEE
Confidence 455555699999999998543 222 3333112222222222 234555 466666666443333333
No 41
>PF00337 Gal-bind_lectin: Galactoside-binding lectin; InterPro: IPR001079 Galectins (also known as galaptins or S-lectin) are a family of proteins defined by having at least one characteristic carbohydrate recognition domain (CRD) with an affinity for beta-galactosides and sharing certain sequence elements. Members of the galectin family are found in mammals, birds, amphibians, fish, nematodes, sponges, and some fungi. Galectins are known to carry out intra- and extracellular functions through glycoconjugate-mediated recognition. From the cytosol they may be secreted by non-classical pathways, but they may also be targeted to the nucleus or specific sub-cytosolic sites. Within the same peptide chain some galectins have a CRD with only a few additional amino acids, whereas others have two CRDs joined by a link peptide, and one (galectin-3) has one CRD joined to a different type of domain [, ]. The galectin carbohydrate recognition domain (CRD) is a beta-sandwich of about 135 amino acids. The two sheets are slightly bent with 6 strands forming the concave side and 5 strands forming the convex side. The concave side forms a groove in which carbohydrate is bound, and which is long enough to hold roughly a linear tetrasaccharide [, ].; GO: 0005529 sugar binding; PDB: 2WSU_B 2WT0_A 2WT1_A 2WT2_B 2WSV_A 1HLC_A 2ZGQ_A 3M3Q_B 1WW5_C 3M3E_A ....
Probab=26.69 E-value=1.7e+02 Score=18.25 Aligned_cols=72 Identities=10% Similarity=0.217 Sum_probs=41.6
Q ss_pred ceEEEEeCCCCCCcEEEEEEEECCCCCCcEEEEEEcC-----CCCcEEEE--CCc-cEEEEcc-------------cCCc
Q psy6837 36 HTEYTISEDVPVGTLVAKVKAIDPDLPGTIAYSLVAG-----DEGKFVLE--SSS-GNLKLKD-------------TLDR 94 (122)
Q Consensus 36 ~~~~~v~E~~~~g~~v~~~~a~D~D~~~~i~y~l~~~-----~~~~f~i~--~~~-G~i~~~~-------------~Ld~ 94 (122)
+|...+++...+|..+.----...+ ..++...|..+ .+-.|.++ ... +.|..+. ...+
T Consensus 1 pf~~~l~~~l~~G~~i~i~G~~~~~-~~~f~inl~~~~~~~~~~i~lH~~~rf~~~~~iv~Ns~~~g~Wg~Ee~~~~~pf 79 (133)
T PF00337_consen 1 PFTARLPGGLSPGDSIIIRGTVPPD-AKRFSINLQTGPNDPDDDIALHFNPRFDEQNVIVRNSRINGKWGQEERESPFPF 79 (133)
T ss_dssp SEEEEETTEEETTEEEEEEEEEBTT-SSBEEEEEEES-STTTTEEEEEEEEECTTEEEEEEEEEETTEE-SEEEESSTSS
T ss_pred CceEEcCCCCCCCcEEEEEEEECCC-CCEEEEEecCCCcCCCCCEEEEEEEEeCCCceEEEeceECCEeccceeeeeeee
Confidence 4677888888888777432222233 66777777766 22234444 344 5454431 1223
Q ss_pred cCCCEEEEEEEEEe
Q psy6837 95 EQKDKYRVQVRASD 108 (122)
Q Consensus 95 e~~~~~~l~v~a~D 108 (122)
.....|++.|.+.+
T Consensus 80 ~~g~~F~i~I~~~~ 93 (133)
T PF00337_consen 80 QPGQPFEIRIRVEE 93 (133)
T ss_dssp TTTSEEEEEEEEES
T ss_pred cCCceEEEEEEEec
Confidence 33567899988876
No 42
>PRK08577 hypothetical protein; Provisional
Probab=24.65 E-value=1.6e+02 Score=18.72 Aligned_cols=32 Identities=9% Similarity=0.271 Sum_probs=20.6
Q ss_pred cEEEECCccEEEEcccCCccCCCEEEEEEEEEe
Q psy6837 76 KFVLESSSGNLKLKDTLDREQKDKYRVQVRASD 108 (122)
Q Consensus 76 ~f~i~~~~G~i~~~~~Ld~e~~~~~~l~v~a~D 108 (122)
.|.++...|+|.+ .++-.+....|.+.+.+.|
T Consensus 34 ~~~~~~~~~~~~~-~~~~~~~k~~~~I~V~~~D 65 (136)
T PRK08577 34 LLIADTDKKEIHL-EPIALPGKKLVEIELVVED 65 (136)
T ss_pred EEEEECCCCEEEE-EEcCCCCccEEEEEEEEcC
Confidence 4556666677777 3444445567778887777
No 43
>PF03413 PepSY: Peptidase propeptide and YPEB domain This Prosite motif covers only the active site. This is family M4 in the peptidase classification. ; InterPro: IPR005075 This signature, PepSY, is found in the propeptide of members of the MEROPS peptidase family M4 (clan MA(E)), which contains the thermostable thermolysins (3.4.24.27 from EC), and related thermolabile neutral proteases (bacillolysins) (3.4.24.28 from EC) from various species of Bacillus. It is also in many non-peptidase proteins, including Bacillus subtilis YpeB protein - a regulator of SleB spore cortex lytic enzyme - and a large number of eubacterial and archaeal cell wall-associated and secreted proteins which are mostly annotated as 'hypothetical protein'. Many extracellular bacterial proteases are produced as proenzymes. The propeptides usually have a dual function, i.e. they function as an intramolecular chaperone required for the folding of the polypeptide and as an inhibitor preventing premature activation of the enzyme. Analysis of the propeptide region of the M4 family of peptidases reveals two regions of conservation, the PepSY domain and a second domain, proximate to the N terminus, the FTP domain (IPR011096 from INTERPRO), which is also found in isolation in the propeptide of eukaryotic peptidases belonging to MEROPS peptidase family M36. In propeptide domain swapping experiments, swapping the propeptide domain of PA protease with that of vibrolysin (both propeptides contain the FTP and PepSY domains) allows the PA protease domain to fold correctly and inhibits the C-terminal autoprocessing activity. However, swapping the propeptide of PA protease for the thermolysin propeptide facilitates neither the correct folding nor the processing of the chimaeric protein into an active peptidase []. 
Mutational analysis of the Pseudomonas aeruginosa elastase gene revealed two mutations in the propeptide which resulted in the loss of inhibitory activity but not chaperone activity: A-15V and T-153I (where +1 is defined as the first residue of the mature peptidase). Both mutations resulted in peptidase activity, the T-153V mutation being much less effective than the A-15I mutation [] in activating peptidase activity. The T-153V mutation lies N-terminal to the FTP domain while the A-15I mutation is C-terminal to the PepSY domain. Given the diverse range of other proteins in which both domains occur in isolation, the exact function of each is still unclear, though it has been proposed that the PepSY domain primarily has inhibitory activity and acts in conjunction with the FTP domain in chaperone activity. ; GO: 0008237 metallopeptidase activity, 0008270 zinc ion binding, 0006508 proteolysis, 0005576 extracellular region; PDB: 2GU3_A 3NQZ_A 3NQY_A 2KGY_A.
Probab=24.39 E-value=1.2e+02 Score=15.82 Aligned_cols=12 Identities=8% Similarity=0.288 Sum_probs=7.5
Q ss_pred EEEECCccEEEE
Q psy6837 77 FVLESSSGNLKL 88 (122)
Q Consensus 77 f~i~~~~G~i~~ 88 (122)
+.||+.||.|.-
T Consensus 51 v~VDa~tG~Il~ 62 (64)
T PF03413_consen 51 VYVDAYTGEILS 62 (64)
T ss_dssp EEEETTT--EEE
T ss_pred EEEECCCCeEEE
Confidence 449999998753
No 44
>PF07861 WND: WisP family N-Terminal Region; InterPro: IPR012503 This family is found at the N terminus of the Tropheryma whipplei WisP family proteins [].
Probab=24.05 E-value=2.1e+02 Score=20.01 Aligned_cols=27 Identities=22% Similarity=0.290 Sum_probs=21.9
Q ss_pred CCcEEEEEEcCCCCcEEEECCccEEEEc
Q psy6837 62 PGTIAYSLVAGDEGKFVLESSSGNLKLK 89 (122)
Q Consensus 62 ~~~i~y~l~~~~~~~f~i~~~~G~i~~~ 89 (122)
|.+.+|+|.... .-..||..+|.|...
T Consensus 202 ~S~~T~SLs~P~-~~v~lD~~TG~l~~S 228 (263)
T PF07861_consen 202 GSPFTYSLSTPV-AGVRLDANTGALSGS 228 (263)
T ss_pred CCcceEEeccCC-CceEEecccceeeee
Confidence 888999998544 458999999998765
No 45
>PF14550 Peptidase_U35_2: Putative phage protease XkdF
Probab=23.71 E-value=68 Score=20.50 Aligned_cols=19 Identities=47% Similarity=0.690 Sum_probs=15.6
Q ss_pred CCCCCCcEEEEEEEECCCC
Q psy6837 43 EDVPVGTLVAKVKAIDPDL 61 (122)
Q Consensus 43 E~~~~g~~v~~~~a~D~D~ 61 (122)
|..|.|+.+..+++.|.|.
T Consensus 82 ~~i~~GtWv~~~k~~ddel 100 (122)
T PF14550_consen 82 ETIPKGTWVVGVKITDDEL 100 (122)
T ss_pred eeecceEEEEEEEecCHHH
Confidence 5567899999999999776
No 46
>PF09865 DUF2092: Predicted periplasmic protein (DUF2092); InterPro: IPR019207 This entry represents various hypothetical prokaryotic proteins of unknown function.
Probab=21.93 E-value=2.5e+02 Score=19.65 Aligned_cols=38 Identities=13% Similarity=0.068 Sum_probs=19.2
Q ss_pred EEEEcCCCceeEEEEEEEecCCCCCcccCCceEEEEeCCC
Q psy6837 6 LNLAHKFGVIGCVQLRILDRNDSPPSFADFHTEYTISEDV 45 (122)
Q Consensus 6 ~~~~~~~g~~~~v~i~V~dvNdn~P~f~~~~~~~~v~E~~ 45 (122)
|+...+.|.| +..+.+.|.|- .|.+....|.+.-|+++
T Consensus 170 IT~k~~~~~P-Qy~~~~~~W~~-~p~~~~~~F~F~pP~gA 207 (214)
T PF09865_consen 170 ITYKTDPGSP-QYSAEFSDWNL-DPKLPADTFTFTPPAGA 207 (214)
T ss_pred EEECCCCCCc-eEEEEEecccC-CCCCCcceeEEcCCCCC
Confidence 3333344433 34455555555 34556666666665554
No 47
>PF08329 ChitinaseA_N: Chitinase A, N-terminal domain; InterPro: IPR013540 This domain is found in a number of bacterial chitinases and similar viral proteins. It is organised into a fibronectin III module domain-like fold, comprising only beta strands. Its function is not known, but it may be involved in interaction with the enzyme substrate, chitin [, ]. It is separated by a hinge region from the catalytic domain (IPR001223 from INTERPRO); this hinge region is probably mobile, allowing the N-terminal domain to have different relative positions in solution []. ; GO: 0004568 chitinase activity; PDB: 2WLY_A 1EDQ_A 2WM0_A 1X6N_A 1NH6_A 2WK2_A 1EHN_A 2WLZ_A 1EIB_A 1FFR_A ....
Probab=21.83 E-value=1.1e+02 Score=19.91 Aligned_cols=31 Identities=13% Similarity=0.269 Sum_probs=18.1
Q ss_pred cCCccCCCEEEEEEEEEeCCCCcee---EEEEEc
Q psy6837 91 TLDREQKDKYRVQVRASDGVQSADV---VLTILN 121 (122)
Q Consensus 91 ~Ld~e~~~~~~l~v~a~D~g~~~~~---~v~I~d 121 (122)
.+.......|++.|+++|....+.+ +|+|.|
T Consensus 76 ~~~~~~gG~y~~~VeLCN~~GCS~S~~~~V~VaD 109 (133)
T PF08329_consen 76 TFTVTKGGRYQMQVELCNADGCSTSAPVEVVVAD 109 (133)
T ss_dssp EEEE-S-EEEEEEEEEEETTEEEE---EEEEEE-
T ss_pred EEEecCCCEEEEEEEEECCCCcccCCCEEEEEeC
Confidence 3555567889999999994444333 555544
No 48
>PF10633 NPCBM_assoc: NPCBM-associated, NEW3 domain of alpha-galactosidase; InterPro: IPR018905 This domain has been named NEW3, but its function is not known. It is found on proteins which are bacterial galactosidases [].; PDB: 1EUT_A 2BZD_A 1WCQ_C 2BER_A 1W8O_A 1EUU_A 1W8N_A.
Probab=21.78 E-value=1.4e+02 Score=16.73 Aligned_cols=7 Identities=29% Similarity=0.591 Sum_probs=2.4
Q ss_pred EEEEEEc
Q psy6837 4 LTLNLAH 10 (122)
Q Consensus 4 l~~~~~~ 10 (122)
+++++..
T Consensus 9 ~~~tv~N 15 (78)
T PF10633_consen 9 VTLTVTN 15 (78)
T ss_dssp EEEEEE-
T ss_pred EEEEEEE
Confidence 3344443
No 49
>PF00801 PKD: PKD domain; InterPro: IPR000601 The PKD (Polycystic Kidney Disease) domain was first identified in the Polycystic Kidney Disease protein, polycystin-1 (PKD1 gene), and contains an Ig-like fold consisting of a beta-sandwich of seven strands in two sheets with a Greek key topology, although some members have additional strands []. Polycystin-1 is a large cell-surface glycoprotein involved in adhesive protein-protein and protein-carbohydrate interactions; however it is not clear if the PKD domain mediates any of these interactions. PKD domains are also found in other proteins, usually in the extracellular parts of proteins involved in interactions with other proteins. For example, domains with a PKD-type fold are found in archaeal surface layer proteins that protect the cell from extreme environments [], and in the human VPS10 domain-containing receptor SorCS2 [].; PDB: 1B4R_A 2KZW_A 2C4X_A 2C26_A 2Y72_B 3JQU_A 3JS7_B 1WGO_A 1L0Q_A.
Probab=21.72 E-value=1.2e+02 Score=16.43 Aligned_cols=51 Identities=22% Similarity=0.316 Sum_probs=27.4
Q ss_pred EEEEECCCCCCcEEEEEEcCCCCcEEEECCccEEEEcccCCccCCCEEEEEEEEEeCCC
Q psy6837 53 KVKAIDPDLPGTIAYSLVAGDEGKFVLESSSGNLKLKDTLDREQKDKYRVQVRASDGVQ 111 (122)
Q Consensus 53 ~~~a~D~D~~~~i~y~l~~~~~~~f~i~~~~G~i~~~~~Ld~e~~~~~~l~v~a~D~g~ 111 (122)
.+.+...+ +..++|...-++ ..+... ...+. .-|.....|.++|.|+|+..
T Consensus 15 ~f~~~~~~-g~~~~y~W~fgd-~~~~~~--~~~~t----~ty~~~G~y~V~ltv~n~~g 65 (69)
T PF00801_consen 15 TFTASSSD-GSPVTYSWDFGD-NGTVST--GSSVT----HTYSSPGTYTVTLTVTNGVG 65 (69)
T ss_dssp EEEETTTT-SSECEEEEE-SS-ESEEEC--SSEEE----EEESSSEEEEEEEEEEETTS
T ss_pred EEEEEccC-CCCeEEEEEECC-CCcccc--CCCEE----EEcCCCeEEEEEEEEEECCC
Confidence 45555533 666777766444 111111 11111 22445789999999999543
No 50
>PF13860 FlgD_ig: FlgD Ig-like domain; PDB: 3C12_A 3OSV_A.
Probab=21.27 E-value=1.2e+02 Score=17.42 Aligned_cols=14 Identities=36% Similarity=0.596 Sum_probs=11.0
Q ss_pred CCEEEEEEEEEeCC
Q psy6837 97 KDKYRVQVRASDGV 110 (122)
Q Consensus 97 ~~~~~l~v~a~D~g 110 (122)
...|.+.|.|+|+|
T Consensus 68 ~G~Y~~~v~a~~~g 81 (81)
T PF13860_consen 68 DGTYTFRVTATDGG 81 (81)
T ss_dssp SEEEEEEEEEEET-
T ss_pred CCCEEEEEEEEeCC
Confidence 45799999999865
No 51
>PF14730 DUF4468: Domain of unknown function (DUF4468) with TBP-like fold
Probab=20.95 E-value=1.6e+02 Score=17.27 Aligned_cols=14 Identities=21% Similarity=0.482 Sum_probs=10.3
Q ss_pred CCEEEEEEEEEeCC
Q psy6837 97 KDKYRVQVRASDGV 110 (122)
Q Consensus 97 ~~~~~l~v~a~D~g 110 (122)
.-.|+|.+.+.||.
T Consensus 65 ~i~y~l~i~~kDgk 78 (91)
T PF14730_consen 65 RINYTLIIDCKDGK 78 (91)
T ss_pred EEEEEEEEEEECCE
Confidence 34688888888864
No 52
>PF03671 Ufm1: Ubiquitin fold modifier 1 protein; InterPro: IPR005375 Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1, IPR000011 from INTERPRO), a ubiquitin-conjugating enzyme (E2, IPR000608 from INTERPRO), and a ubiquitin ligase (E3, IPR000569 from INTERPRO, IPR003613 from INTERPRO), which work sequentially in a cascade. There are many different E3 ligases, which are responsible for the type of ubiquitin chain formed, the specificity of the target protein, and the regulation of the ubiquitinylation process. Ubiquitinylation is an important regulatory tool that controls the concentration of key signalling proteins, such as those involved in cell cycle control, as well as removing misfolded, damaged or mutant proteins that could be harmful to the cell. Several ubiquitin-like molecules have been discovered, such as Ufm1 (IPR005375 from INTERPRO), SUMO1 (IPR003653 from INTERPRO), NEDD8, Rad23 (IPR004806 from INTERPRO), Elongin B and Parkin (IPR003977 from INTERPRO), the latter being involved in Parkinson's disease. Ubiquitin-like molecules (UBLs) can be divided into two subclasses: type-1 UBLs, which ligate to target proteins in a manner similar, but not identical, to the ubiquitylation pathway, such as SUMO, NEDD8, and UCRP/ISG15, and type-2 UBLs (also called UDPs, ubiquitin-domain proteins), which contain a ubiquitin-like structure embedded in a variety of different classes of large proteins with apparently distinct functions, such as Rad23, Elongin B, Scythe, Parkin, and HOIL-1. This entry represents Ufm1 (ubiquitin-fold modifier), which is a ubiquitin-like protein with structural similarities to ubiquitin. Ufm1 is one of a number of ubiquitin-like modifiers that conjugate to target proteins in cells through Uba5 (E1) and Ufc1 (E2). The Ufm1 system is conserved in metazoa and plants, suggesting it has a potential role in multicellular organisms.
Human Ufm1 is synthesized as a precursor consisting of 85 amino-acid residues. Prior to activation by Uba5, the extra amino acids at the C-terminal region of Ufm1 are removed to expose Gly, which is necessary for conjugation to target molecule(s). C-terminal processing of Ufm1 requires two specific cysteine peptidases (IPR012462 from INTERPRO): UfSP1 and UfSP2; both peptidases are also able to release Ufm1 from Ufm1-conjugated cellular proteins. UfSP2 is present in most, if not all, multicellular organisms, including plants, nematodes, flies, and mammals, whereas UfSP1 is absent from plants and nematodes.; PDB: 1J0G_A 1WXS_A 1L7Y_A.
Probab=20.87 E-value=1.8e+02 Score=16.89 Aligned_cols=25 Identities=24% Similarity=0.487 Sum_probs=13.5
Q ss_pred EEEEEEecCCCCCcccCCceEEEEeCCCC
Q psy6837 18 VQLRILDRNDSPPSFADFHTEYTISEDVP 46 (122)
Q Consensus 18 v~i~V~dvNdn~P~f~~~~~~~~v~E~~~ 46 (122)
+++++.--.| |..+ .-.++|||++|
T Consensus 3 vtfKI~ltsD--p~~p--~kv~sVPE~ap 27 (76)
T PF03671_consen 3 VTFKITLTSD--PKLP--YKVISVPEEAP 27 (76)
T ss_dssp EEEEEEESTS--STS---EEEEEEETTSB
T ss_pred EEEEEEEccC--CCCc--ceEEecCCCCc
Confidence 4444444445 4332 22578888875
Done!