Query 003218
Match_columns 838
No_of_seqs 127 out of 152
Neff 3.8
Searched_HMMs 46136
Date Thu Mar 28 19:18:25 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/003218.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/003218hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF12036 DUF3522: Protein of u 100.0 1.4E-47 3E-52 381.5 16.7 179 578-785 2-186 (186)
2 PF05875 Ceramidase: Ceramidas 97.9 0.00018 3.9E-09 75.5 13.1 52 587-646 29-88 (262)
3 PF04080 Per1: Per1-like ; In 96.9 0.023 4.9E-07 61.4 14.9 44 606-657 89-132 (267)
4 TIGR01065 hlyIII channel prote 96.4 0.073 1.6E-06 54.7 13.9 51 606-659 35-89 (204)
5 PF12955 DUF3844: Domain of un 96.3 0.0041 8.9E-08 58.6 3.8 66 530-601 4-85 (103)
6 PF07974 EGF_2: EGF-like domai 96.0 0.0054 1.2E-07 46.5 2.8 26 537-570 7-32 (32)
7 PF03006 HlyIII: Haemolysin-II 96.0 0.11 2.4E-06 52.2 12.6 43 612-657 50-94 (222)
8 PRK15087 hemolysin; Provisiona 95.5 0.26 5.7E-06 51.5 13.8 46 610-659 54-103 (219)
9 KOG2970 Predicted membrane pro 94.1 0.36 7.9E-06 53.1 10.5 141 606-786 141-294 (319)
10 KOG2329 Alkaline ceramidase [L 93.4 0.29 6.2E-06 53.3 8.3 42 587-628 37-86 (276)
11 PF13965 SID-1_RNA_chan: dsRNA 92.2 5.3 0.00011 47.8 17.2 19 775-793 529-547 (570)
12 COG1272 Predicted membrane pro 91.9 1.4 3E-05 46.9 10.8 39 754-793 176-214 (226)
13 cd00053 EGF Epidermal growth f 89.4 0.38 8.3E-06 34.2 2.9 30 536-571 6-36 (36)
14 PF00008 EGF: EGF-like domain 88.1 0.41 8.8E-06 36.0 2.3 29 535-568 3-31 (32)
15 KOG1225 Teneurin-1 and related 87.4 0.34 7.4E-06 56.9 2.3 32 532-573 312-343 (525)
16 PHA02887 EGF-like protein; Pro 87.0 0.4 8.6E-06 46.7 2.1 45 523-572 75-123 (126)
17 cd00054 EGF_CA Calcium-binding 86.3 0.69 1.5E-05 33.5 2.7 34 532-571 3-38 (38)
18 PF04863 EGF_alliinase: Alliin 85.7 0.3 6.6E-06 41.8 0.6 35 536-573 17-52 (56)
19 KOG4289 Cadherin EGF LAG seven 83.8 0.75 1.6E-05 58.9 2.9 36 532-573 1240-1276(2531)
20 smart00179 EGF_CA Calcium-bind 83.4 1.2 2.6E-05 32.9 2.9 34 532-571 3-39 (39)
21 PF12661 hEGF: Human growth fa 83.4 0.51 1.1E-05 29.7 0.7 13 558-570 1-13 (13)
22 KOG1225 Teneurin-1 and related 82.5 0.91 2E-05 53.5 2.8 58 503-572 217-280 (525)
23 PF04151 PPC: Bacterial pre-pe 81.7 9.6 0.00021 32.5 8.1 66 431-516 4-69 (70)
24 PF12036 DUF3522: Protein of u 80.8 11 0.00024 38.8 9.6 34 634-668 59-92 (186)
25 smart00181 EGF Epidermal growt 78.6 2 4.4E-05 31.4 2.6 28 536-570 6-34 (35)
26 PHA03099 epidermal growth fact 76.0 2.1 4.6E-05 42.4 2.6 37 531-572 42-82 (139)
27 KOG1226 Integrin beta subunit 75.7 1.8 3.9E-05 52.8 2.5 34 529-572 544-581 (783)
28 KOG3607 Meltrins, fertilins an 74.9 1.7 3.7E-05 53.0 2.1 35 530-573 624-658 (716)
29 COG5237 PER1 Predicted membran 70.7 6.9 0.00015 42.8 5.1 51 605-663 135-190 (319)
30 KOG4243 Macrophage maturation- 68.0 17 0.00037 39.6 7.3 24 769-792 255-278 (298)
31 KOG4260 Uncharacterized conser 67.0 3.2 6.9E-05 45.7 1.8 39 532-573 142-184 (350)
32 KOG3879 Predicted membrane pro 58.6 1.4E+02 0.0029 32.9 11.8 24 812-835 212-235 (267)
33 PF07645 EGF_CA: Calcium-bindi 57.9 8 0.00017 30.4 2.1 25 536-566 10-34 (42)
34 PF00954 S_locus_glycop: S-loc 53.9 28 0.00061 32.3 5.4 34 526-566 72-107 (110)
35 smart00051 DSL delta serrate l 53.3 9.4 0.0002 33.2 2.0 26 536-570 38-63 (63)
36 PF00053 Laminin_EGF: Laminin 50.0 10 0.00022 30.5 1.6 28 537-572 2-33 (49)
37 cd00055 EGF_Lam Laminin-type e 48.2 16 0.00034 29.9 2.4 28 537-572 3-34 (50)
38 PF12947 EGF_3: EGF domain; I 47.1 10 0.00022 29.6 1.2 28 536-569 6-33 (36)
39 KOG1219 Uncharacterized conser 32.4 34 0.00073 47.2 3.0 41 527-573 3899-3940(4289)
40 KOG1219 Uncharacterized conser 31.0 38 0.00081 46.8 3.1 38 531-574 3942-3980(4289)
41 PF12662 cEGF: Complement Clr- 30.3 33 0.0007 25.1 1.4 15 558-572 3-21 (24)
42 PF12658 Ten1: Telomere cappin 29.8 57 0.0012 32.0 3.5 48 127-174 36-97 (124)
43 PRK05420 aquaporin Z; Provisio 25.0 3.5E+02 0.0077 28.8 8.6 62 692-768 161-223 (231)
44 PF04151 PPC: Bacterial pre-pe 23.6 3.4E+02 0.0073 23.1 6.7 64 258-336 3-68 (70)
45 PLN00184 aquaporin NIP1; Provi 21.9 3.3E+02 0.0072 30.4 7.9 62 691-768 207-268 (296)
46 PRK09292 Na(+)-translocating N 20.3 3E+02 0.0065 29.6 6.9 36 632-667 121-157 (209)
No 1
>PF12036 DUF3522: Protein of unknown function (DUF3522); InterPro: IPR021910 This family of proteins is functionally uncharacterised. This protein is found in eukaryotes. Proteins in this family are typically between 220 and 787 amino acids in length.
Probab=100.00 E-value=1.4e-47 Score=381.47 Aligned_cols=179 Identities=31% Similarity=0.426 Sum_probs=163.7
Q ss_pred hHHHHHHHHHHHhhhhhHHHHHHHHHHhhHHHHHHHHHHHHHhhhhhccccc----ceeeccchhHhHhHhHHHHHHHHH
Q 003218 578 RGHVQQSVALIASNAAALLPAYQALRQKAFAEWVLFTASGISSGLYHACDVG----TWCALSFNVLQFMDFWLSFMAVVS 653 (838)
Q Consensus 578 ~~~~~q~lLLtLSNLaFlP~I~vA~kRr~~~Ea~Vy~fTMffS~fYHACD~g----~~Cim~ydvLQf~DF~gSimSiwv 653 (838)
.+...|+++||+||++|+|+|++|+|||+++|++||+|||++|+||||||++ .+|++++++||++||+++++++|+
T Consensus 2 ~~~~~~~l~l~lSnl~~lP~i~~a~rr~~~~Ea~v~~~tm~~S~~YHacd~~~~~~~lc~~~~~~L~~~~~~~s~~~~~v 81 (186)
T PF12036_consen 2 FEQLLQFLLLTLSNLAFLPTIYVAVRRRYHFEAFVYTFTMFFSTFYHACDSGPGEIFLCIMDWHRLQNIDFIGSFLSIWV 81 (186)
T ss_pred hhhHHHHHHHHHHHHHHHHHHHHHHHHhhHHHHHHHHHHHHHHHhcccccCCCCceEEeechHHHHHHHHHHHHHHHHHH
Confidence 4578899999999999999999999999999999999999999999999964 499999999999999999999999
Q ss_pred HHHhhccchhHHHhhhhhhhHHHHHHHHHhhccCC--ccchhhHHHHHHHHHHHHHhhhcccccceeeeccccccccchh
Q 003218 654 TFIYLTTIDEALKRTIHTVVAILTAMMAITKATRS--SNIILVISIGAAGLLIGLLVELSTKFRSFSLRFGFCMNMVDRQ 731 (838)
Q Consensus 654 T~I~MA~~~e~lk~~~~~~~~IL~Al~~~~q~~R~--wnil~PI~i~~l~ili~Wl~~~~t~~R~~~~s~~~~~~yP~~~ 731 (838)
|+++||++++++|+.+++++++++++. .|.||+ ||+++|+++++++++++|++|++ +|+.+ ||+++
T Consensus 82 tl~~~a~~~~~~~~~l~~~~~~~~ai~--~~~~~~~~~~~~~Pi~~~~~i~~~~w~~r~~--~~~~~--------~~~~~ 149 (186)
T PF12036_consen 82 TLCAMARLDEPLKSVLHYFGALVIAIF--QQKDRWSLWNTIGPILIGLLILLVSWLYRCR--RRRRC--------YPPSW 149 (186)
T ss_pred HHHHhccCCHHHHHHHHHHHHHHHHHH--HhhCcccchhhHHHHHHHHHHHHHHHheecc--cCCcc--------CChHH
Confidence 999999999999999999999998877 444544 69999999999999999999865 34434 78876
Q ss_pred HHHHHHHHHhHHhhhhcccchhhhHHHHHHHHHHhhhhcccCcceeEehhHHHH
Q 003218 732 QTIMEWLRNFMKTILRRFRWGFVLVGFAALAMAAISWKLETSQSYWIWHSIWHV 785 (838)
Q Consensus 732 ~~i~~w~~~~~~~l~rrfRw~f~L~Ggi~la~~aI~~flET~dnY~y~HSiWHi 785 (838)
+ ||++++.||+++++.|+.+|+||+|||||+||+||+
T Consensus 150 ~-----------------~~~~~l~~g~~~~~~Gl~~f~et~dnY~~~HSlWHi 186 (186)
T PF12036_consen 150 R-----------------RWLFYLLPGIIFFILGLDLFLETNDNYRIVHSLWHI 186 (186)
T ss_pred H-----------------HHHHHHHHHHHHHHHHHhHhhcCCCcEEEEeeeeeC
Confidence 5 899999999999999999999999999999999996
No 2
>PF05875 Ceramidase: Ceramidase; InterPro: IPR008901 This entry consists of several ceramidases. Ceramidases are enzymes involved in regulating cellular levels of ceramides, sphingoid bases, and their phosphates.; GO: 0016811 hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds, in linear amides, 0006672 ceramide metabolic process, 0016021 integral to membrane
Probab=97.87 E-value=0.00018 Score=75.50 Aligned_cols=52 Identities=29% Similarity=0.281 Sum_probs=35.2
Q ss_pred HHHhhhhhHHHHHHHHH---H-----hhHHHHHHHHHHHHHhhhhhcccccceeeccchhHhHhHhHH
Q 003218 587 LIASNAAALLPAYQALR---Q-----KAFAEWVLFTASGISSGLYHACDVGTWCALSFNVLQFMDFWL 646 (838)
Q Consensus 587 LtLSNLaFlP~I~vA~k---R-----r~~~Ea~Vy~fTMffS~fYHACD~g~~Cim~ydvLQf~DF~g 646 (838)
=|+||++|+......++ | ++..-.+...+-++.|+.||+= +++ ..|.+|=+-
T Consensus 29 NtlSNl~fi~~al~gl~~~~~~~~~~~~~l~~~~l~~VGiGS~~FHaT-------l~~-~~ql~DelP 88 (262)
T PF05875_consen 29 NTLSNLAFIVAALYGLYLARRRGLERRFALLYLGLALVGIGSFLFHAT-------LSY-WTQLLDELP 88 (262)
T ss_pred HHHHHHHHHHHHHHHHHHHhhccccchhHHHHHHHHHHHHhHHHHHhC-------hhh-hHHHhhhhh
Confidence 37999998877654333 2 2444555566778999999984 454 367788654
No 3
>PF04080 Per1: Per1-like ; InterPro: IPR007217 A member of this family has been implicated in protein processing in the endoplasmic reticulum.
Probab=96.90 E-value=0.023 Score=61.36 Aligned_cols=44 Identities=14% Similarity=0.119 Sum_probs=35.1
Q ss_pred hHHHHHHHHHHHHHhhhhhcccccceeeccchhHhHhHhHHHHHHHHHHHHh
Q 003218 606 AFAEWVLFTASGISSGLYHACDVGTWCALSFNVLQFMDFWLSFMAVVSTFIY 657 (838)
Q Consensus 606 ~~~Ea~Vy~fTMffS~fYHACD~g~~Cim~ydvLQf~DF~gSimSiwvT~I~ 657 (838)
+..-+++...+=++|+.+|+.|.. .=+.+|-+++.+.|...+.+
T Consensus 89 ~~~~~~v~~naW~wStvFH~RD~~--------~TE~lDYf~A~a~vl~~l~~ 132 (267)
T PF04080_consen 89 YIIYAIVSMNAWIWSTVFHTRDTP--------LTEKLDYFSAGATVLFGLYA 132 (267)
T ss_pred eehHHHHHHHHHHHHHHHHHhccc--------HhhHhHHhhhHHHHHHHHHH
Confidence 567889999999999999999974 12368999988777776654
No 4
>TIGR01065 hlyIII channel protein, hemolysin III family. This family includes proteins from pathogenic and non-pathogenic bacteria, Homo sapiens and Drosophila. In Bacillus cereus, a pathogen, it has been shown to function as a channel-forming cytolysin. The human protein is expressed preferentially in mature macrophages, consistent with a cytolytic role.
Probab=96.38 E-value=0.073 Score=54.66 Aligned_cols=51 Identities=24% Similarity=0.342 Sum_probs=35.3
Q ss_pred hHHHHHHHHHHHH----HhhhhhcccccceeeccchhHhHhHhHHHHHHHHHHHHhhc
Q 003218 606 AFAEWVLFTASGI----SSGLYHACDVGTWCALSFNVLQFMDFWLSFMAVVSTFIYLT 659 (838)
Q Consensus 606 ~~~Ea~Vy~fTMf----fS~fYHACD~g~~Cim~ydvLQf~DF~gSimSiwvT~I~MA 659 (838)
......+|.+++. .|++||.=.... -..+.|+++|-.+=.+.|+.|++-..
T Consensus 35 ~~~~~~vy~~~~~~~~~~St~yH~~~~s~---~~~~~~~rlD~~gI~~lIaGsytP~~ 89 (204)
T TIGR01065 35 AVLGFSIYGISLILLFLVSTLYHSIPKGS---KAKNWLRKIDHSMIYVLIAGTYTPFL 89 (204)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHCCcCch---hHHHHHHHccHHHHHHHHHHhhHHHH
Confidence 3455667766654 599999765211 24568999999998888888765543
No 5
>PF12955 DUF3844: Domain of unknown function (DUF3844); InterPro: IPR024382 This presumed domain is found in fungal species. It contains 8 largely conserved cysteine residues. This domain is found in proteins thought to be located in the endoplasmic reticulum.
Probab=96.26 E-value=0.0041 Score=58.62 Aligned_cols=66 Identities=26% Similarity=0.432 Sum_probs=42.1
Q ss_pred Eecccc---cCCCCCceeeeeeccCCceEEeeeeeCC-------------CCCCcCCCccccchhHHHHHHHHHHHhhhh
Q 003218 530 SLERCP---KRCSSHGQCRNAFDASGLTLYSFCACDR-------------DHGGFDCSVELVSHRGHVQQSVALIASNAA 593 (838)
Q Consensus 530 sls~C~---~~Cg~~G~C~ll~~~sG~~~ys~C~C~~-------------Gy~GwdCtd~svs~~~~~~q~lLLtLSNLa 593 (838)
+.+.|. ++|++||+|......+++ .-=.|.|.+ .|+|.+|...-++ .+..|++.+-++
T Consensus 4 S~~aC~~~Tn~CsgHG~C~~~~~~~~~-~C~~C~C~~T~~~~~~~~~ktt~W~G~aCqKkDvS-----~~F~L~~~~ti~ 77 (103)
T PF12955_consen 4 SNDACENATNNCSGHGSCVKKYGSGGG-DCFACKCKPTVVKTGSGKGKTTHWGGPACQKKDVS-----VPFWLFAGFTIA 77 (103)
T ss_pred CHHHHHHhccCCCCCceEeeccCCCcc-ceEEEEeeccccccccccCceeeeccccccccccc-----chhhHHHHHHHH
Confidence 446675 799999999987543321 222799999 7999999864323 344455555555
Q ss_pred hHHHHHHH
Q 003218 594 ALLPAYQA 601 (838)
Q Consensus 594 FlP~I~vA 601 (838)
++..+..+
T Consensus 78 lv~~~~~~ 85 (103)
T PF12955_consen 78 LVVLVAGA 85 (103)
T ss_pred HHHHHHHH
Confidence 44444333
No 6
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=96.04 E-value=0.0054 Score=46.53 Aligned_cols=26 Identities=46% Similarity=1.049 Sum_probs=22.7
Q ss_pred CCCCCceeeeeeccCCceEEeeeeeCCCCCCcCC
Q 003218 537 RCSSHGQCRNAFDASGLTLYSFCACDRDHGGFDC 570 (838)
Q Consensus 537 ~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~GwdC 570 (838)
.|++||+|... .| .|.|++||.|.+|
T Consensus 7 ~C~~~G~C~~~---~g-----~C~C~~g~~G~~C 32 (32)
T PF07974_consen 7 ICSGHGTCVSP---CG-----RCVCDSGYTGPDC 32 (32)
T ss_pred ccCCCCEEeCC---CC-----EEECCCCCcCCCC
Confidence 59999999964 23 8999999999988
No 7
>PF03006 HlyIII: Haemolysin-III related; InterPro: IPR004254 Members of this family are integral membrane proteins. This family includes proteins that are hemolysin-III homologs.; GO: 0016021 integral to membrane
Probab=95.96 E-value=0.11 Score=52.22 Aligned_cols=43 Identities=16% Similarity=0.298 Sum_probs=29.9
Q ss_pred HHHHHHHHhhhhhc--ccccceeeccchhHhHhHhHHHHHHHHHHHHh
Q 003218 612 LFTASGISSGLYHA--CDVGTWCALSFNVLQFMDFWLSFMAVVSTFIY 657 (838)
Q Consensus 612 Vy~fTMffS~fYHA--CD~g~~Cim~ydvLQf~DF~gSimSiwvT~I~ 657 (838)
-....+++|++||. |-+... .+..|+++|-.|-.+.+..+.+.
T Consensus 50 ~~~~~~~~St~yH~f~~~s~~~---~~~~~~~lD~~gI~l~i~gs~~p 94 (222)
T PF03006_consen 50 SAILCFLCSTLYHLFSCHSEGK---VYHIFLRLDYAGIFLLIAGSYTP 94 (222)
T ss_pred HHHHHHHhHHHhhCCCcCCcHH---HHHHHHhcchhhhhHhHhhhhhh
Confidence 34455778999999 533211 57899999999976666665443
No 8
>PRK15087 hemolysin; Provisional
Probab=95.55 E-value=0.26 Score=51.47 Aligned_cols=46 Identities=22% Similarity=0.303 Sum_probs=32.8
Q ss_pred HHHHHHH----HHHhhhhhcccccceeeccchhHhHhHhHHHHHHHHHHHHhhc
Q 003218 610 WVLFTAS----GISSGLYHACDVGTWCALSFNVLQFMDFWLSFMAVVSTFIYLT 659 (838)
Q Consensus 610 a~Vy~fT----MffS~fYHACD~g~~Cim~ydvLQf~DF~gSimSiwvT~I~MA 659 (838)
..+|..+ +.+|++||.-... -..+.|+++|=.+=.+.|..|+.-++
T Consensus 54 ~~vy~~s~~~l~~~StlYH~~~~~----~~~~~~~rlDh~~I~llIaGsytP~~ 103 (219)
T PRK15087 54 YSLYGGSMILLFLASTLYHAIPHQ----RAKRWLKKFDHCAIYLLIAGTYTPFL 103 (219)
T ss_pred HHHHHHHHHHHHHHHHHHHCCCch----HHHHHHHHccHHHHHHHHHHhhHHHH
Confidence 3455554 4579999987632 23569999999998888888776543
No 9
>KOG2970 consensus Predicted membrane protein [Function unknown]
Probab=94.06 E-value=0.36 Score=53.14 Aligned_cols=141 Identities=22% Similarity=0.265 Sum_probs=74.8
Q ss_pred hHHHHHHHHHHHHHhhhhhcccccceeeccchhHhHhHhHHHHH----HHHHHHHhhccchhH-HHhhhhhhhHHHHHHH
Q 003218 606 AFAEWVLFTASGISSGLYHACDVGTWCALSFNVLQFMDFWLSFM----AVVSTFIYLTTIDEA-LKRTIHTVVAILTAMM 680 (838)
Q Consensus 606 ~~~Ea~Vy~fTMffS~fYHACD~g~~Cim~ydvLQf~DF~gSim----SiwvT~I~MA~~~e~-lk~~~~~~~~IL~Al~ 680 (838)
.+.-|.+...+-+.|+.+|.=|.. .=+.||-.++.+ +.-++++-|-+++.. ..+- |+.++..|..
T Consensus 141 ~~I~a~i~mnawiwSsvFH~rD~~--------lTEklDYf~A~~~vlf~ly~a~ir~~~i~~~~~~~~--~ita~fla~y 210 (319)
T KOG2970|consen 141 WLIYAYIGMNAWIWSSVFHIRDVP--------LTEKLDYFSAYLTVLFGLYVALIRMLSIQSLPALRG--MITAIFLAFY 210 (319)
T ss_pred hhhHHHHHHHHHHHHHhhhhcCCc--------hHhhhhHHHHHHHHHHHHHHHHHHHHHHhcchhhhH--HHHHHHHHHH
Confidence 567788888999999999999873 123567666543 333444444444433 2222 2233333333
Q ss_pred HH--hh-----ccCCccchhhHHHHHHHHHHHHHhhhcccccceeeeccccccccchhHHHHHHHHHhHHhhhhcccchh
Q 003218 681 AI--TK-----ATRSSNIILVISIGAAGLLIGLLVELSTKFRSFSLRFGFCMNMVDRQQTIMEWLRNFMKTILRRFRWGF 753 (838)
Q Consensus 681 ~~--~q-----~~R~wnil~PI~i~~l~ili~Wl~~~~t~~R~~~~s~~~~~~yP~~~~~i~~w~~~~~~~l~rrfRw~f 753 (838)
+. .+ -|=.+|+..=+++|.+- ++.|++-. .|+|+ .|..|+ +|.+
T Consensus 211 a~Hi~yls~~~fdYgyNm~~~v~~g~iq-~vlw~~~~-~~~~~----------~~s~~~-----------------i~~~ 261 (319)
T KOG2970|consen 211 ANHILYLSFYNFDYGYNMIVCVAIGVIQ-LVLWLVWS-FKKRN----------LPSFWR-----------------IWPI 261 (319)
T ss_pred HHHHHHHhheecccccceeeehhhHHHH-HHHHHHHH-HHhhc----------Ccchhh-----------------hhHH
Confidence 21 11 23335776655666555 45554432 23444 344332 6777
Q ss_pred hhHHHHHHHHHHhhhhc-ccCcceeEehhHHHHH
Q 003218 754 VLVGFAALAMAAISWKL-ETSQSYWIWHSIWHVS 786 (838)
Q Consensus 754 ~L~Ggi~la~~aI~~fl-ET~dnY~y~HSiWHi~ 786 (838)
++.....++++ +=.++ ..=..|.=-|++||..
T Consensus 262 ~i~~~~~LA~s-LEi~DFpPy~~~iDAHALWHla 294 (319)
T KOG2970|consen 262 LIVIFFFLAMS-LEIFDFPPYAWLIDAHALWHLA 294 (319)
T ss_pred HHHHHHHHHHH-HHhhcCCchhhhcchHHHHHhh
Confidence 77765544442 22233 2233444469999975
No 10
>KOG2329 consensus Alkaline ceramidase [Lipid transport and metabolism]
Probab=93.40 E-value=0.29 Score=53.33 Aligned_cols=42 Identities=38% Similarity=0.412 Sum_probs=31.6
Q ss_pred HHHhhhhhHHHH----HHHHHH----hhHHHHHHHHHHHHHhhhhhcccc
Q 003218 587 LIASNAAALLPA----YQALRQ----KAFAEWVLFTASGISSGLYHACDV 628 (838)
Q Consensus 587 LtLSNLaFlP~I----~vA~kR----r~~~Ea~Vy~fTMffS~fYHACD~ 628 (838)
=|.||+.|+.++ +-++|+ |++.-.+.+++-+++|..|||-=+
T Consensus 37 NT~sN~~fil~~~~~l~~~y~~~~e~~~~l~~v~~~ivgl~S~~fH~TL~ 86 (276)
T KOG2329|consen 37 NTESNSPFILLAFIGLHCAYRQKLEKRAYLICVLFTIVGLGSMYFHMTLV 86 (276)
T ss_pred HHhhcchHHHHHHHHHHHHHHHHhhhhHHHHHHHHHHHHHHHhhhhhhHH
Confidence 477888877443 344443 478899999999999999999753
No 11
>PF13965 SID-1_RNA_chan: dsRNA-gated channel SID-1
Probab=92.24 E-value=5.3 Score=47.75 Aligned_cols=19 Identities=37% Similarity=0.992 Sum_probs=17.2
Q ss_pred ceeEehhHHHHHHHhheeE
Q 003218 775 SYWIWHSIWHVSIYTSSFF 793 (838)
Q Consensus 775 nY~y~HSiWHi~Ia~S~~F 793 (838)
+++=+|-+||++-|++.||
T Consensus 529 ~f~D~HDiwH~~SA~alff 547 (570)
T PF13965_consen 529 GFFDWHDIWHFLSAIALFF 547 (570)
T ss_pred CccccHHHHHHHHHHHHHH
Confidence 6788999999999999887
No 12
>COG1272 Predicted membrane protein, hemolysin III homolog [General function prediction only]
Probab=91.92 E-value=1.4 Score=46.93 Aligned_cols=39 Identities=21% Similarity=0.312 Sum_probs=26.9
Q ss_pred hhHHHHHHHHHHhhhhcccCcceeEehhHHHHHHHhheeE
Q 003218 754 VLVGFAALAMAAISWKLETSQSYWIWHSIWHVSIYTSSFF 793 (838)
Q Consensus 754 ~L~Ggi~la~~aI~~flET~dnY~y~HSiWHi~Ia~S~~F 793 (838)
+..||++..++++++-.+- |-..+.|-+||+++-+++++
T Consensus 176 l~~GGv~YsvG~ifY~~~~-~~~~~~H~iwH~fVv~ga~~ 214 (226)
T COG1272 176 LALGGVLYSVGAIFYVLRI-DRIPYSHAIWHLFVVGGAAC 214 (226)
T ss_pred HHHHhHHheeeeEEEEEee-ccCCchHHHHHHHHHHHHHH
Confidence 4556666666666543332 77889999999998776653
No 13
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=89.43 E-value=0.38 Score=34.22 Aligned_cols=30 Identities=33% Similarity=0.687 Sum_probs=24.0
Q ss_pred cCCCCCceeeeeeccCCceEEeeeeeCCCCCCc-CCC
Q 003218 536 KRCSSHGQCRNAFDASGLTLYSFCACDRDHGGF-DCS 571 (838)
Q Consensus 536 ~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~Gw-dCt 571 (838)
..|.++|+|.... |++ .|.|..||.|+ .|.
T Consensus 6 ~~C~~~~~C~~~~---~~~---~C~C~~g~~g~~~C~ 36 (36)
T cd00053 6 NPCSNGGTCVNTP---GSY---RCVCPPGYTGDRSCE 36 (36)
T ss_pred CCCCCCCEEecCC---CCe---EeECCCCCcccCCcC
Confidence 6788899999753 323 79999999999 773
No 14
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=88.12 E-value=0.41 Score=35.99 Aligned_cols=29 Identities=24% Similarity=0.511 Sum_probs=23.3
Q ss_pred ccCCCCCceeeeeeccCCceEEeeeeeCCCCCCc
Q 003218 535 PKRCSSHGQCRNAFDASGLTLYSFCACDRDHGGF 568 (838)
Q Consensus 535 ~~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~Gw 568 (838)
++.|.++|.|.... .+.| .|.|.+||.|.
T Consensus 3 ~~~C~n~g~C~~~~--~~~y---~C~C~~G~~G~ 31 (32)
T PF00008_consen 3 SNPCQNGGTCIDLP--GGGY---TCECPPGYTGK 31 (32)
T ss_dssp TTSSTTTEEEEEES--TSEE---EEEEBTTEEST
T ss_pred CCcCCCCeEEEeCC--CCCE---EeECCCCCccC
Confidence 36899999999976 2334 99999999984
No 15
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=87.35 E-value=0.34 Score=56.89 Aligned_cols=32 Identities=44% Similarity=1.006 Sum_probs=27.7
Q ss_pred cccccCCCCCceeeeeeccCCceEEeeeeeCCCCCCcCCCcc
Q 003218 532 ERCPKRCSSHGQCRNAFDASGLTLYSFCACDRDHGGFDCSVE 573 (838)
Q Consensus 532 s~C~~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~GwdCtd~ 573 (838)
..||.||.+||+|.. | .|.|++||.|.+|+..
T Consensus 312 ~~cpadC~g~G~Ci~-----G-----~C~C~~Gy~G~~C~~~ 343 (525)
T KOG1225|consen 312 RRCPADCSGHGKCID-----G-----ECLCDEGYTGELCIQR 343 (525)
T ss_pred ccCCccCCCCCcccC-----C-----ceEeCCCCcCCccccc
Confidence 349999999999993 4 8999999999999873
No 16
>PHA02887 EGF-like protein; Provisional
Probab=87.04 E-value=0.4 Score=46.68 Aligned_cols=45 Identities=29% Similarity=0.700 Sum_probs=35.2
Q ss_pred ceEEEEEEecccc----cCCCCCceeeeeeccCCceEEeeeeeCCCCCCcCCCc
Q 003218 523 SETVMSVSLERCP----KRCSSHGQCRNAFDASGLTLYSFCACDRDHGGFDCSV 572 (838)
Q Consensus 523 ~~v~~svsls~C~----~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~GwdCtd 572 (838)
.+...+..-.||+ +=|= ||+|.++.+.. -.+|.|..||.|.-|..
T Consensus 75 ~~rk~~~hf~pC~~eyk~YCi-HG~C~yI~dL~----epsCrC~~GYtG~RCE~ 123 (126)
T PHA02887 75 FKRKNSMFFEKCKNDFNDFCI-NGECMNIIDLD----EKFCICNKGYTGIRCDE 123 (126)
T ss_pred hhhccccCccccChHhhCEee-CCEEEccccCC----CceeECCCCcccCCCCc
Confidence 3445566678998 4684 99999998665 35999999999999964
No 17
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=86.32 E-value=0.69 Score=33.48 Aligned_cols=34 Identities=29% Similarity=0.810 Sum_probs=25.5
Q ss_pred cccc--cCCCCCceeeeeeccCCceEEeeeeeCCCCCCcCCC
Q 003218 532 ERCP--KRCSSHGQCRNAFDASGLTLYSFCACDRDHGGFDCS 571 (838)
Q Consensus 532 s~C~--~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~GwdCt 571 (838)
..|. ..|..+|.|.... |.| .|.|..||.|..|.
T Consensus 3 ~~C~~~~~C~~~~~C~~~~---~~~---~C~C~~g~~g~~C~ 38 (38)
T cd00054 3 DECASGNPCQNGGTCVNTV---GSY---RCSCPPGYTGRNCE 38 (38)
T ss_pred ccCCCCCCcCCCCEeECCC---CCe---EeECCCCCcCCcCC
Confidence 4565 4788889998642 333 79999999998883
No 18
>PF04863 EGF_alliinase: Alliinase EGF-like domain; InterPro: IPR006947 Allicin is a thiosulphinate that gives rise to dithiines, allyl sulphides and ajoenes, the three groups of active compounds in Allium species. Allicin is synthesised from sulphoxide cysteine derivatives by alliinase, whose C-S lyase activity cleaves C(beta)-S(gamma) bonds. It is thought that this enzyme forms part of a primitive plant defence system.; GO: 0016846 carbon-sulfur lyase activity; PDB: 1LK9_B 2HOX_C 2HOR_A.
Probab=85.73 E-value=0.3 Score=41.80 Aligned_cols=35 Identities=34% Similarity=0.574 Sum_probs=19.0
Q ss_pred cCCCCCceeeeeecc-CCceEEeeeeeCCCCCCcCCCcc
Q 003218 536 KRCSSHGQCRNAFDA-SGLTLYSFCACDRDHGGFDCSVE 573 (838)
Q Consensus 536 ~~Cg~~G~C~ll~~~-sG~~~ys~C~C~~Gy~GwdCtd~ 573 (838)
-.|++||+..+-.-. .| ...|.|...|+|.||+.-
T Consensus 17 i~CSGHGr~flDg~~~dG---~p~CECn~Cy~GpdCS~~ 52 (56)
T PF04863_consen 17 ISCSGHGRAFLDGLIADG---SPVCECNSCYGGPDCSTL 52 (56)
T ss_dssp S--TTSEE--TTS-EETT---EE--EE-TTEESTTS-EE
T ss_pred CCcCCCCeeeeccccccC---CccccccCCcCCCCcccC
Confidence 379999999863211 22 278999999999999853
No 19
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=83.80 E-value=0.75 Score=58.89 Aligned_cols=36 Identities=31% Similarity=0.798 Sum_probs=29.1
Q ss_pred ccc-ccCCCCCceeeeeeccCCceEEeeeeeCCCCCCcCCCcc
Q 003218 532 ERC-PKRCSSHGQCRNAFDASGLTLYSFCACDRDHGGFDCSVE 573 (838)
Q Consensus 532 s~C-~~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~GwdCtd~ 573 (838)
.-| -+.||+||.|..- + |+| +|.|++||.|.+|-.+
T Consensus 1240 DlCYs~pC~nng~C~sr--E-ggY---tCeCrpg~tGehCEvs 1276 (2531)
T KOG4289|consen 1240 DLCYSGPCGNNGRCRSR--E-GGY---TCECRPGFTGEHCEVS 1276 (2531)
T ss_pred HhhhcCCCCCCCceEEe--c-Cce---eEEecCCccccceeee
Confidence 446 3799999999974 3 557 9999999999999643
No 20
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=83.42 E-value=1.2 Score=32.89 Aligned_cols=34 Identities=29% Similarity=0.788 Sum_probs=25.7
Q ss_pred cccc--cCCCCCceeeeeeccCCceEEeeeeeCCCCC-CcCCC
Q 003218 532 ERCP--KRCSSHGQCRNAFDASGLTLYSFCACDRDHG-GFDCS 571 (838)
Q Consensus 532 s~C~--~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~-GwdCt 571 (838)
..|. +.|..+|.|... .|.| .|.|..||. |..|.
T Consensus 3 ~~C~~~~~C~~~~~C~~~---~g~~---~C~C~~g~~~g~~C~ 39 (39)
T smart00179 3 DECASGNPCQNGGTCVNT---VGSY---RCECPPGYTDGRNCE 39 (39)
T ss_pred ccCcCCCCcCCCCEeECC---CCCe---EeECCCCCccCCcCC
Confidence 4565 479888999864 3444 699999999 98883
No 21
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=83.37 E-value=0.51 Score=29.68 Aligned_cols=13 Identities=31% Similarity=0.864 Sum_probs=10.9
Q ss_pred eeeeCCCCCCcCC
Q 003218 558 FCACDRDHGGFDC 570 (838)
Q Consensus 558 ~C~C~~Gy~GwdC 570 (838)
.|.|.+||.|..|
T Consensus 1 ~C~C~~G~~G~~C 13 (13)
T PF12661_consen 1 TCQCPPGWTGPNC 13 (13)
T ss_dssp EEEE-TTEETTTT
T ss_pred CccCcCCCcCCCC
Confidence 4999999999988
No 22
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=82.45 E-value=0.91 Score=53.49 Aligned_cols=58 Identities=28% Similarity=0.371 Sum_probs=40.4
Q ss_pred EeeccCCcEEEEEEeeeCCC------ceEEEEEEecccccCCCCCceeeeeeccCCceEEeeeeeCCCCCCcCCCc
Q 003218 503 ILYVREGTWGFGIRHVNTSK------SETVMSVSLERCPKRCSSHGQCRNAFDASGLTLYSFCACDRDHGGFDCSV 572 (838)
Q Consensus 503 IpYPqtGtWYLsL~~~n~~~------~~v~~svsls~C~~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~GwdCtd 572 (838)
--|-++|.|+... ..... ...-...+...||.+|.++|+|.. | .|.|+.||.|.||+.
T Consensus 217 ~~r~~~~~~~~~~--~~~~~ic~c~~~~~g~~c~~~~C~~~c~~~g~c~~-----G-----~CIC~~Gf~G~dC~e 280 (525)
T KOG1225|consen 217 TGRCREGRCFCTA--GFFDGICECPEGYFGPLCSTIYCPGGCTGRGQCVE-----G-----RCICPPGFTGDDCDE 280 (525)
T ss_pred ccccccCcccccc--cccCceeecCCceeCCccccccCCCCCcccceEeC-----C-----eEeCCCCCcCCCCCc
Confidence 3456778888754 22111 111222335689999999999997 5 899999999999986
No 23
>PF04151 PPC: Bacterial pre-peptidase C-terminal domain; InterPro: IPR007280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. 
This domain is normally found at the C terminus of secreted archaeal and bacterial peptidases, the majority of which belong to MEROPS peptidase families M4 (vibriolysin, IPR001570 from INTERPRO), M9A and M9B (microbial collagenase, IPR002169 from INTERPRO), M28 (aminopeptidase Ap1, IPR007484 from INTERPRO) and S8 (subtilisin family peptidases, IPR000209 from INTERPRO).; GO: 0008233 peptidase activity, 0006508 proteolysis; PDB: 4DY5_B 4DXZ_A 4DY3_B 3JQW_A 3JQX_C 1NQJ_B 1NQD_A 2O8O_A 1WMF_A 1WME_A ....
Probab=81.65 E-value=9.6 Score=32.49 Aligned_cols=66 Identities=23% Similarity=0.425 Sum_probs=38.2
Q ss_pred EEEeecCCCCCCceEEEEEeecceeeEEEEEeeCCCCCCCcccccccccccccccccccccccCCccceeEEEeeccCCc
Q 003218 431 YFLLDIPRGAAGGSIHIQLTSDTKIKHEIYAKSGGLPSLQSWDYYYANRTNNSVGSMFFKLYNSSEEKVDFYILYVREGT 510 (838)
Q Consensus 431 ~f~l~Lp~gdSGG~L~v~L~~nks~~~~Vyar~g~~Ptlt~~D~~~~~~ts~s~~s~f~~~~nsS~~~a~L~IpYPqtGt 510 (838)
+|.+++| +|+.|+|+|..... +..++.....-+++.++|. .+ .....+..+.+.-|++|+
T Consensus 4 ~y~f~v~---ag~~l~i~l~~~~~-d~dl~l~~~~g~~~~~~d~-----~~-----------~~~~~~~~i~~~~~~~Gt 63 (70)
T PF04151_consen 4 YYSFTVP---AGGTLTIDLSGGSG-DADLYLYDSNGNSLASYDD-----SS-----------QSGGNDESITFTAPAAGT 63 (70)
T ss_dssp EEEEEES---TTEEEEEEECETTS-SEEEEEEETTSSSCEECCC-----CT-----------CETTSEEEEEEEESSSEE
T ss_pred EEEEEEc---CCCEEEEEEcCCCC-CeEEEEEcCCCCchhhhee-----cC-----------CCCCCccEEEEEcCCCEE
Confidence 6777777 67789999865552 3334433332355544431 00 001223445566799999
Q ss_pred EEEEEE
Q 003218 511 WGFGIR 516 (838)
Q Consensus 511 WYLsL~ 516 (838)
||+.++
T Consensus 64 Yyi~V~ 69 (70)
T PF04151_consen 64 YYIRVY 69 (70)
T ss_dssp EEEEEE
T ss_pred EEEEEE
Confidence 999874
No 24
>PF12036 DUF3522: Protein of unknown function (DUF3522); InterPro: IPR021910 This family of proteins is functionally uncharacterised. This protein is found in eukaryotes. Proteins in this family are typically between 220 and 787 amino acids in length.
Probab=80.81 E-value=11 Score=38.82 Aligned_cols=34 Identities=12% Similarity=0.080 Sum_probs=29.2
Q ss_pred ccchhHhHhHhHHHHHHHHHHHHhhccchhHHHhh
Q 003218 634 LSFNVLQFMDFWLSFMAVVSTFIYLTTIDEALKRT 668 (838)
Q Consensus 634 m~ydvLQf~DF~gSimSiwvT~I~MA~~~e~lk~~ 668 (838)
|....||++||...+.++....+.|+++.+ +++.
T Consensus 59 lc~~~~~~L~~~~~~~s~~~~~vtl~~~a~-~~~~ 92 (186)
T PF12036_consen 59 LCIMDWHRLQNIDFIGSFLSIWVTLCAMAR-LDEP 92 (186)
T ss_pred EeechHHHHHHHHHHHHHHHHHHHHHHhcc-CCHH
Confidence 789999999999999999999999988775 4443
No 25
>smart00181 EGF Epidermal growth factor-like domain.
Probab=78.56 E-value=2 Score=31.42 Aligned_cols=28 Identities=32% Similarity=0.656 Sum_probs=21.4
Q ss_pred cCCCCCceeeeeeccCCceEEeeeeeCCCCCC-cCC
Q 003218 536 KRCSSHGQCRNAFDASGLTLYSFCACDRDHGG-FDC 570 (838)
Q Consensus 536 ~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~G-wdC 570 (838)
+.|..+ .|.... |.+ .|.|..||.| ..|
T Consensus 6 ~~C~~~-~C~~~~---~~~---~C~C~~g~~g~~~C 34 (35)
T smart00181 6 GPCSNG-TCINTP---GSY---TCSCPPGYTGDKRC 34 (35)
T ss_pred CCCCCC-EEECCC---CCe---EeECCCCCccCCcc
Confidence 368777 898652 333 8999999999 887
No 26
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=76.00 E-value=2.1 Score=42.44 Aligned_cols=37 Identities=32% Similarity=0.832 Sum_probs=29.6
Q ss_pred ecccc----cCCCCCceeeeeeccCCceEEeeeeeCCCCCCcCCCc
Q 003218 531 LERCP----KRCSSHGQCRNAFDASGLTLYSFCACDRDHGGFDCSV 572 (838)
Q Consensus 531 ls~C~----~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~GwdCtd 572 (838)
.++|+ +=| =||+|..+.+.+. .+|.|..||.|.-|-.
T Consensus 42 i~~Cp~ey~~YC-lHG~C~yI~dl~~----~~CrC~~GYtGeRCEh 82 (139)
T PHA03099 42 IRLCGPEGDGYC-LHGDCIHARDIDG----MYCRCSHGYTGIRCQH 82 (139)
T ss_pred cccCChhhCCEe-ECCEEEeeccCCC----ceeECCCCcccccccc
Confidence 46887 346 5899999987653 4899999999999963
No 27
>KOG1226 consensus Integrin beta subunit (N-terminal portion of extracellular region) [Signal transduction mechanisms; Extracellular structures]
Probab=75.66 E-value=1.8 Score=52.78 Aligned_cols=34 Identities=29% Similarity=0.844 Sum_probs=28.2
Q ss_pred EEecccccC----CCCCceeeeeeccCCceEEeeeeeCCCCCCcCCCc
Q 003218 529 VSLERCPKR----CSSHGQCRNAFDASGLTLYSFCACDRDHGGFDCSV 572 (838)
Q Consensus 529 vsls~C~~~----Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~GwdCtd 572 (838)
...-.|+.. ||+||+|.- | .|.|++||.|-.|.=
T Consensus 544 CDnfsC~r~~g~lC~g~G~C~C-----G-----~CvC~~GwtG~~C~C 581 (783)
T KOG1226|consen 544 CDNFSCERHKGVLCGGHGRCEC-----G-----RCVCNPGWTGSACNC 581 (783)
T ss_pred ccCcccccccCcccCCCCeEeC-----C-----cEEcCCCCccCCCCC
Confidence 344578877 999999997 4 899999999999863
No 28
>KOG3607 consensus Meltrins, fertilins and related Zn-dependent metalloproteinases of the ADAMs family [Posttranslational modification, protein turnover, chaperones]
Probab=74.86 E-value=1.7 Score=53.01 Aligned_cols=35 Identities=29% Similarity=0.663 Sum_probs=29.7
Q ss_pred EecccccCCCCCceeeeeeccCCceEEeeeeeCCCCCCcCCCcc
Q 003218 530 SLERCPKRCSSHGQCRNAFDASGLTLYSFCACDRDHGGFDCSVE 573 (838)
Q Consensus 530 sls~C~~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~GwdCtd~ 573 (838)
..+-||.+|++||.|-.- + .|+|.+||.+.+|...
T Consensus 624 ~~~~~~~~C~g~GVCnn~----~-----~ChC~~gwapp~C~~~ 658 (716)
T KOG3607|consen 624 NSSCCPTTCNGHGVCNNE----L-----NCHCEPGWAPPFCFIF 658 (716)
T ss_pred cccccccccCCCcccCCC----c-----ceeeCCCCCCCccccc
Confidence 346789999999999863 2 8999999999999864
No 29
>COG5237 PER1 Predicted membrane protein [Function unknown]
Probab=70.75 E-value=6.9 Score=42.83 Aligned_cols=51 Identities=18% Similarity=0.333 Sum_probs=30.6
Q ss_pred hhHHHH-HHHHHHHHHhhhhhcccccceeeccchhHhHhHhHHHHH----HHHHHHHhhccchh
Q 003218 605 KAFAEW-VLFTASGISSGLYHACDVGTWCALSFNVLQFMDFWLSFM----AVVSTFIYLTTIDE 663 (838)
Q Consensus 605 r~~~Ea-~Vy~fTMffS~fYHACD~g~~Cim~ydvLQf~DF~gSim----SiwvT~I~MA~~~e 663 (838)
++...+ ++.-.+-..|+.+|.=|.- .=+-||-+.+.+ .+-++++-|-.+..
T Consensus 135 ~~~l~wv~igmlAwi~SsvFHird~~--------iTeklDYF~AgltVLfGfy~~lvrm~~~~~ 190 (319)
T COG5237 135 LYYLQWVYIGMLAWISSSVFHIRDNT--------ITEKLDYFLAGLTVLFGFYMALVRMILIVS 190 (319)
T ss_pred eEEeeHHHHHHHHHHHHhheeeeccc--------hhhhHHHHHhhHHHHHHHHHHHHHHHHhhc
Confidence 355566 6777778899999999862 111355555443 34445555555543
No 30
>KOG4243 consensus Macrophage maturation-associated protein [Defense mechanisms]
Probab=68.03 E-value=17 Score=39.55 Aligned_cols=24 Identities=17% Similarity=0.247 Sum_probs=18.0
Q ss_pred hcccCcceeEehhHHHHHHHhhee
Q 003218 769 KLETSQSYWIWHSIWHVSIYTSSF 792 (838)
Q Consensus 769 flET~dnY~y~HSiWHi~Ia~S~~ 792 (838)
|+..+.---+-|-|||.++++++.
T Consensus 255 FFK~DG~ipfAHAIWHLFV~l~A~ 278 (298)
T KOG4243|consen 255 FFKSDGIIPFAHAIWHLFVALAAG 278 (298)
T ss_pred EEecCCceehHHHHHHHHHHHHcc
Confidence 445555666789999999988763
No 31
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=66.98 E-value=3.2 Score=45.72 Aligned_cols=39 Identities=26% Similarity=0.629 Sum_probs=29.4
Q ss_pred cccc----cCCCCCceeeeeeccCCceEEeeeeeCCCCCCcCCCcc
Q 003218 532 ERCP----KRCSSHGQCRNAFDASGLTLYSFCACDRDHGGFDCSVE 573 (838)
Q Consensus 532 s~C~----~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~GwdCtd~ 573 (838)
.+|| +.|+++|+|+=-.+..| ...|.|.+||+|.-|.+=
T Consensus 142 l~Cpggser~C~GnG~C~GdGsR~G---sGkCkC~~GY~Gp~C~~C 184 (350)
T KOG4260|consen 142 LQCPGGSERPCFGNGSCHGDGSREG---SGKCKCETGYTGPLCRYC 184 (350)
T ss_pred ccCCCCCcCCcCCCCcccCCCCCCC---CCcccccCCCCCcccccc
Confidence 3575 68999999985433322 239999999999999864
No 32
>KOG3879 consensus Predicted membrane protein [Function unknown]
Probab=58.59 E-value=1.4e+02 Score=32.86 Aligned_cols=24 Identities=25% Similarity=0.536 Sum_probs=17.9
Q ss_pred CCeeeEeecCCCCCCCCCCCCCcc
Q 003218 812 GTYELTRQDSMPRGDSEGRERPEV 835 (838)
Q Consensus 812 ~~y~~t~~d~~~r~~~~~~~~~~~ 835 (838)
=+|...|.|.-+-.|.|-+|.|.+
T Consensus 212 isy~th~~d~e~~ee~~~~~~~~i 235 (267)
T KOG3879|consen 212 ISYDTHHEDNEPEEETEVPEEPKI 235 (267)
T ss_pred cceecccccCCCCcccCCCCCcch
Confidence 357777888888888777777765
No 33
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=57.92 E-value=8 Score=30.45 Aligned_cols=25 Identities=28% Similarity=0.670 Sum_probs=21.3
Q ss_pred cCCCCCceeeeeeccCCceEEeeeeeCCCCC
Q 003218 536 KRCSSHGQCRNAFDASGLTLYSFCACDRDHG 566 (838)
Q Consensus 536 ~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~ 566 (838)
+.|..++.|.-.. |.| .|.|++||.
T Consensus 10 ~~C~~~~~C~N~~---Gsy---~C~C~~Gy~ 34 (42)
T PF07645_consen 10 HNCPENGTCVNTE---GSY---SCSCPPGYE 34 (42)
T ss_dssp SSSSTTSEEEEET---TEE---EEEESTTEE
T ss_pred CcCCCCCEEEcCC---CCE---EeeCCCCcE
Confidence 5798899999864 656 899999998
No 34
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles []. Most of the proteins within this family contain apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=53.87 E-value=28 Score=32.27 Aligned_cols=34 Identities=21% Similarity=0.533 Sum_probs=25.0
Q ss_pred EEEEEecccc--cCCCCCceeeeeeccCCceEEeeeeeCCCCC
Q 003218 526 VMSVSLERCP--KRCSSHGQCRNAFDASGLTLYSFCACDRDHG 566 (838)
Q Consensus 526 ~~svsls~C~--~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~ 566 (838)
..+.-.+.|- +.||.+|.|... . -..|.|-+||.
T Consensus 72 ~~~~p~d~Cd~y~~CG~~g~C~~~--~-----~~~C~Cl~GF~ 107 (110)
T PF00954_consen 72 FWSAPKDQCDVYGFCGPNGICNSN--N-----SPKCSCLPGFE 107 (110)
T ss_pred EEEecccCCCCccccCCccEeCCC--C-----CCceECCCCcC
Confidence 4455557895 899999999642 1 23699999985
No 35
>smart00051 DSL delta serrate ligand.
Probab=53.29 E-value=9.4 Score=33.22 Aligned_cols=26 Identities=23% Similarity=0.365 Sum_probs=21.2
Q ss_pred cCCCCCceeeeeeccCCceEEeeeeeCCCCCCcCC
Q 003218 536 KRCSSHGQCRNAFDASGLTLYSFCACDRDHGGFDC 570 (838)
Q Consensus 536 ~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~GwdC 570 (838)
+++.+|..|.. .| .|.|.+||.|..|
T Consensus 38 ~d~~~~~~Cd~----~G-----~~~C~~Gw~G~~C 63 (63)
T smart00051 38 DDFFGHYTCDE----NG-----NKGCLEGWMGPYC 63 (63)
T ss_pred ccccCCccCCc----CC-----CEecCCCCcCCCC
Confidence 56778889964 24 7999999999988
No 36
>PF00053 Laminin_EGF: Laminin EGF-like (Domains III and V); InterPro: IPR002049 Laminins [] are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consist of a long arm and three short globular arms. The long arm consist of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Beside different types of globular domains each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines []. The tertiary structure [, ] of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as a 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic representation of the topology of the four disulphide bonds in the LE domain is shown below. +-------------------+ +-|-----------+ | +--------+ +-----------------+ | | | | | | | | xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx sssssssssssssssssssssssssssssssssss 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved) 's': region similar to the EGF-like domain In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen []. The binding-sites are located on the surface within the loops C1-C3 and C5-C6 [, ]. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility [], which determine the spacing in the formation of laminin networks of basement membranes [].; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=49.96 E-value=10 Score=30.52 Aligned_cols=28 Identities=32% Similarity=0.909 Sum_probs=21.3
Q ss_pred CCCCCc----eeeeeeccCCceEEeeeeeCCCCCCcCCCc
Q 003218 537 RCSSHG----QCRNAFDASGLTLYSFCACDRDHGGFDCSV 572 (838)
Q Consensus 537 ~Cg~~G----~C~ll~~~sG~~~ys~C~C~~Gy~GwdCtd 572 (838)
+|.++| .|.. ..| .|.|++||.|..|..
T Consensus 2 ~C~~~~~~~~~C~~---~~G-----~C~C~~~~~G~~C~~ 33 (49)
T PF00053_consen 2 DCNPHGSSSQTCDP---STG-----QCVCKPGTTGPRCDQ 33 (49)
T ss_dssp SSTTCCBCCSSEEE---TCE-----EESBSTTEESTTS-E
T ss_pred cCcCCCCCCCcccC---CCC-----EEeccccccCCcCcC
Confidence 466666 8887 233 899999999999974
No 37
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=48.16 E-value=16 Score=29.87 Aligned_cols=28 Identities=32% Similarity=0.929 Sum_probs=21.2
Q ss_pred CCCCCce----eeeeeccCCceEEeeeeeCCCCCCcCCCc
Q 003218 537 RCSSHGQ----CRNAFDASGLTLYSFCACDRDHGGFDCSV 572 (838)
Q Consensus 537 ~Cg~~G~----C~ll~~~sG~~~ys~C~C~~Gy~GwdCtd 572 (838)
+|.++|. |.. .+| .|.|+.|+.|..|..
T Consensus 3 ~C~~~g~~~~~C~~---~~G-----~C~C~~~~~G~~C~~ 34 (50)
T cd00055 3 DCNGHGSLSGQCDP---GTG-----QCECKPNTTGRRCDR 34 (50)
T ss_pred cCcCCCCCCccccC---CCC-----EEeCCCcCCCCCCCC
Confidence 4666665 765 234 899999999999963
No 38
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=47.07 E-value=10 Score=29.63 Aligned_cols=28 Identities=21% Similarity=0.456 Sum_probs=20.0
Q ss_pred cCCCCCceeeeeeccCCceEEeeeeeCCCCCCcC
Q 003218 536 KRCSSHGQCRNAFDASGLTLYSFCACDRDHGGFD 569 (838)
Q Consensus 536 ~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~Gwd 569 (838)
..|+.+-+|..... .+ .|.|++||.|-+
T Consensus 6 ~~C~~nA~C~~~~~---~~---~C~C~~Gy~GdG 33 (36)
T PF12947_consen 6 GGCHPNATCTNTGG---SY---TCTCKPGYEGDG 33 (36)
T ss_dssp GGS-TTCEEEE-TT---SE---EEEE-CEEECCS
T ss_pred CCCCCCcEeecCCC---CE---EeECCCCCccCC
Confidence 58999999998643 23 999999999864
No 39
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=32.43 E-value=34 Score=47.22 Aligned_cols=41 Identities=27% Similarity=0.673 Sum_probs=32.6
Q ss_pred EEEEeccc-ccCCCCCceeeeeeccCCceEEeeeeeCCCCCCcCCCcc
Q 003218 527 MSVSLERC-PKRCSSHGQCRNAFDASGLTLYSFCACDRDHGGFDCSVE 573 (838)
Q Consensus 527 ~svsls~C-~~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~GwdCtd~ 573 (838)
-++.++|| ++.|-.-|+|... ++| | -|.|+.||.|--|-.+
T Consensus 3899 CEi~~epC~snPC~~GgtCip~--~n~-f---~CnC~~gyTG~~Ce~~ 3940 (4289)
T KOG1219|consen 3899 CEIDLEPCASNPCLTGGTCIPF--YNG-F---LCNCPNGYTGKRCEAR 3940 (4289)
T ss_pred cccccccccCCCCCCCCEEEec--CCC-e---eEeCCCCccCceeecc
Confidence 45677899 5899999999985 333 4 7999999999999643
No 40
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=31.02 E-value=38 Score=46.84 Aligned_cols=38 Identities=29% Similarity=0.654 Sum_probs=30.7
Q ss_pred ecccc-cCCCCCceeeeeeccCCceEEeeeeeCCCCCCcCCCccc
Q 003218 531 LERCP-KRCSSHGQCRNAFDASGLTLYSFCACDRDHGGFDCSVEL 574 (838)
Q Consensus 531 ls~C~-~~Cg~~G~C~ll~~~sG~~~ys~C~C~~Gy~GwdCtd~s 574 (838)
.+.|- |-|+.-|+|.....+ | .|.|.+||.|-.|-++.
T Consensus 3942 i~eCs~n~C~~gg~C~n~~gs---f---~CncT~g~~gr~c~~~~ 3980 (4289)
T KOG1219|consen 3942 ISECSKNVCGTGGQCINIPGS---F---HCNCTPGILGRTCCAEK 3980 (4289)
T ss_pred ccccccccccCCceeeccCCc---e---EeccChhHhcccCcccc
Confidence 45576 789999999987543 3 89999999999997654
No 41
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=30.31 E-value=33 Score=25.10 Aligned_cols=15 Identities=27% Similarity=0.665 Sum_probs=12.5
Q ss_pred eeeeCCCCC----CcCCCc
Q 003218 558 FCACDRDHG----GFDCSV 572 (838)
Q Consensus 558 ~C~C~~Gy~----GwdCtd 572 (838)
.|.|.+||. |-.|.|
T Consensus 3 ~C~C~~Gy~l~~d~~~C~D 21 (24)
T PF12662_consen 3 TCSCPPGYQLSPDGRSCED 21 (24)
T ss_pred EeeCCCCCcCCCCCCcccc
Confidence 799999997 667775
No 42
>PF12658 Ten1: Telomere capping, CST complex subunit; InterPro: IPR024222 Stn1 and Ten1 are DNA-binding proteins with specificity for telomeric DNA substrates and both protect chromosome termini from unregulated resection and regulate telomere length. Stn1 complexes with Ten1 and Cdc13 to function as a telomere-specific replication protein A (RPA)-like complex []. These three interacting proteins associate with the telomeric overhang in budding yeast, whereas a single protein known as Pot1 (protection of telomeres-1) performs this function in fission yeast, and a two-subunit complex consisting of POT1 and TPP1 associates with telomeric ssDNA in humans. S.pombe has Stn1- and Ten1-like proteins that are essential for chromosome end protection. Stn1 orthologues exist in all species that have Pot1, whereas Ten1-like proteins can be found in all fungi. Fission yeast Stn1 and Ten1 localise at telomeres in a manner that correlates with the length of the ssDNA overhang, suggesting that they specifically associate with the telomeric ssDNA. Two separate protein complexes are required for chromosome end protection in fission yeast. Protection of telomeres by multiple proteins with OB-fold domains is conserved in eukaryotic evolution [].; PDB: 3KF8_D 3KF6_B 3K0X_A.
Probab=29.82 E-value=57 Score=31.98 Aligned_cols=48 Identities=23% Similarity=0.477 Sum_probs=26.5
Q ss_pred CCccccccccccccccc---cceee----ee-----eeccccCCcceE--EEeecCCCccce
Q 003218 127 SSNELEDIQNEEQCYPM---QKNIS----VK-----LTNEQISPGAWY--LGFFNGVGAIRT 174 (838)
Q Consensus 127 ~~~~~~~~~~~~qc~p~---~~~~~----~~-----l~~~qi~~g~wy--~g~f~~~~~~r~ 174 (838)
++....+....+.+||- ..+.+ +. ++.+++..|.|+ +|+++|-.+..+
T Consensus 36 ~Y~~~~~~L~l~h~~p~~~~~~~~~v~VdI~~vL~tv~~~~~rvG~WvNV~Gy~~~~~~~~~ 97 (124)
T PF12658_consen 36 SYDTSTGTLTLEHNYPRENDSQPSSVSVDINLVLETVSSEELRVGEWVNVVGYIRGEKPSQT 97 (124)
T ss_dssp EEECCCTEEEEEETCCC---S----EEEE-TTTTTTS-GGGGSTT-EEEEEEEEECTT----
T ss_pred EEecCccEEEEeecCCCCcCCCCceEEEEHHHHhhhcCccceecceEEEEEEEecccccccc
Confidence 34445556666677777 22211 22 266789999998 899999997763
No 43
>PRK05420 aquaporin Z; Provisional
Probab=24.96 E-value=3.5e+02 Score=28.84 Aligned_cols=62 Identities=18% Similarity=0.124 Sum_probs=34.7
Q ss_pred hhhHHHHHHHHHHHHHhhhcccccceeeeccccccccchhHHHHHHHHHhHHh-hhhcccchhhhHHHHHHHHHHhhh
Q 003218 692 ILVISIGAAGLLIGLLVELSTKFRSFSLRFGFCMNMVDRQQTIMEWLRNFMKT-ILRRFRWGFVLVGFAALAMAAISW 768 (838)
Q Consensus 692 l~PI~i~~l~ili~Wl~~~~t~~R~~~~s~~~~~~yP~~~~~i~~w~~~~~~~-l~rrfRw~f~L~Ggi~la~~aI~~ 768 (838)
..|+.+++++.++.+.-.. -.|.++| |.|- +-+.+... +...+.|+|.+.|++-..++++.+
T Consensus 161 ~~p~~iGl~v~~~~~~~~~---------~TG~s~N-PAR~-----~gpal~~g~~~~~~~wvy~vgP~~Ga~laa~~y 223 (231)
T PRK05420 161 FAPIAIGLALTLIHLISIP---------VTNTSVN-PARS-----TGVALFVGGWALEQLWLFWVAPIVGAIIGGLIY 223 (231)
T ss_pred chHHHHHHHHHHHHHHhhc---------cCCCccC-cHHH-----HHHHHHhCCCCccceEEeehHHHHHHHHHHHHH
Confidence 5677777766544432211 1255555 5542 22333221 111358999999999877777765
No 44
>PF04151 PPC: Bacterial pre-peptidase C-terminal domain; InterPro: IPR007280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. 
This domain is normally found at the C terminus of secreted archaeal and bacterial peptidases, the majority of which belong to MEROPS peptidase families M4 (vibriolysin, IPR001570 from INTERPRO), M9A and M9B (microbial collagenase, IPR002169 from INTERPRO), M28 (aminopeptidase Ap1, IPR007484 from INTERPRO) and S8 (subtilisin family peptidases, IPR000209 from INTERPRO).; GO: 0008233 peptidase activity, 0006508 proteolysis; PDB: 4DY5_B 4DXZ_A 4DY3_B 3JQW_A 3JQX_C 1NQJ_B 1NQD_A 2O8O_A 1WMF_A 1WME_A ....
Probab=23.57 E-value=3.4e+02 Score=23.10 Aligned_cols=64 Identities=20% Similarity=0.302 Sum_probs=41.7
Q ss_pred eEEEEeccCchheeeEEeceeeeeccccCCcCCCCCceEEEEeecCCCCCccccccCC--CCCcceeeecCCCCcceEEE
Q 003218 258 KVFFLDVLGIAEQLIIMAMNVTFSMTQSNNTLNAGGANIVCFARHGAMPSEILHDYSG--DISNGPLIVDSPKVGRWYIT 335 (838)
Q Consensus 258 ~~y~ldV~~~a~~l~i~a~n~~~~~~~s~~~~~~~~~~l~~~~r~~a~P~~~~~d~sg--~~~~c~L~l~sPpwgrW~~v 335 (838)
.+|++++|.-.. ++|++.+-. .+.-|-++...| +.....|+++ ....-.+.+..|.-|+||+.
T Consensus 3 D~y~f~v~ag~~-l~i~l~~~~------------~d~dl~l~~~~g--~~~~~~d~~~~~~~~~~~i~~~~~~~GtYyi~ 67 (70)
T PF04151_consen 3 DYYSFTVPAGGT-LTIDLSGGS------------GDADLYLYDSNG--NSLASYDDSSQSGGNDESITFTAPAAGTYYIR 67 (70)
T ss_dssp EEEEEEESTTEE-EEEEECETT------------SSEEEEEEETTS--SSCEECCCCTCETTSEEEEEEEESSSEEEEEE
T ss_pred EEEEEEEcCCCE-EEEEEcCCC------------CCeEEEEEcCCC--CchhhheecCCCCCCccEEEEEcCCCEEEEEE
Confidence 579999998776 888874432 145566666665 4444445444 23446677788999999887
Q ss_pred E
Q 003218 336 I 336 (838)
Q Consensus 336 i 336 (838)
+
T Consensus 68 V 68 (70)
T PF04151_consen 68 V 68 (70)
T ss_dssp E
T ss_pred E
Confidence 4
No 45
>PLN00184 aquaporin NIP1; Provisional
Probab=21.91 E-value=3.3e+02 Score=30.42 Aligned_cols=62 Identities=16% Similarity=0.132 Sum_probs=34.6
Q ss_pred chhhHHHHHHHHHHHHHhhhcccccceeeeccccccccchhHHHHHHHHHhHHhhhhcccchhhhHHHHHHHHHHhhh
Q 003218 691 IILVISIGAAGLLIGLLVELSTKFRSFSLRFGFCMNMVDRQQTIMEWLRNFMKTILRRFRWGFVLVGFAALAMAAISW 768 (838)
Q Consensus 691 il~PI~i~~l~ili~Wl~~~~t~~R~~~~s~~~~~~yP~~~~~i~~w~~~~~~~l~rrfRw~f~L~Ggi~la~~aI~~ 768 (838)
-+.|+++|+++.++....- .-.|..+| |.|- +-+.++...+ ++.|+|.+.|++-.+++++.+
T Consensus 207 ~~~~l~IG~~v~~~~~~~g---------~~TG~smN-PAR~-----~GPal~~~~~-~~~WVy~vgPilGa~laal~y 268 (296)
T PLN00184 207 ELAGLAIGSTVLLNVLIAA---------PVSSASMN-PGRS-----LGPAMVYGCY-KGIWIYIVAPTLGAIAGAWVY 268 (296)
T ss_pred cchHHHHHHHHHHHHHHhc---------ccCccccC-chhh-----HHHHHHhhcc-cccchHHhHHHHHHHHHHHHH
Confidence 3557777776644332211 12355555 5542 2233333222 348999999999877777655
No 46
>PRK09292 Na(+)-translocating NADH-quinone reductase subunit D; Validated
Probab=20.33 E-value=3e+02 Score=29.58 Aligned_cols=36 Identities=11% Similarity=0.073 Sum_probs=31.6
Q ss_pred eeccchhHh-HhHhHHHHHHHHHHHHhhccchhHHHh
Q 003218 632 CALSFNVLQ-FMDFWLSFMAVVSTFIYLTTIDEALKR 667 (838)
Q Consensus 632 Cim~ydvLQ-f~DF~gSimSiwvT~I~MA~~~e~lk~ 667 (838)
...+++.++ ..|=+++.+.++..++.|+.++|.+-+
T Consensus 121 ~a~~~~~~~s~~dglg~GlGftlaL~lla~iRE~Lg~ 157 (209)
T PRK09292 121 FAMKNPPIPSFLDGIGNGLGYGAILLIVAFFRELFGS 157 (209)
T ss_pred HHhhCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 345777887 899999999999999999999998887
Done!