Query 028785
Match_columns 204
No_of_seqs 38 out of 40
Neff 3.5
Searched_HMMs 46136
Date Fri Mar 29 02:30:07 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/028785.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/028785hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF10342 GPI-anchored: Ser-Thr 98.3 9E-06 1.9E-10 58.6 10.2 88 37-151 1-89 (93)
2 PRK10301 hypothetical protein; 96.8 0.024 5.3E-07 45.2 10.6 86 34-145 30-115 (124)
3 PF04234 CopC: CopC domain; I 96.1 0.072 1.6E-06 40.0 9.1 85 34-145 4-88 (97)
4 PF15418 DUF4625: Domain of un 95.3 0.17 3.8E-06 41.0 9.2 93 32-147 19-125 (132)
5 PF09608 Alph_Pro_TM: Putative 91.5 0.18 3.9E-06 44.6 3.2 40 117-165 158-197 (236)
6 PF00041 fn3: Fibronectin type 90.9 2.4 5.1E-05 28.5 7.7 71 46-142 10-80 (85)
7 TIGR02186 alph_Pro_TM conserve 87.7 1.1 2.5E-05 40.5 5.4 44 117-170 183-226 (261)
8 PF04734 Ceramidase_alk: Neutr 87.0 4 8.6E-05 41.2 9.2 101 50-163 571-674 (674)
9 COG2372 CopC Uncharacterized p 85.7 15 0.00032 30.4 10.3 90 34-151 31-121 (127)
10 PF05345 He_PIG: Putative Ig d 80.5 4.1 8.8E-05 27.7 4.4 33 110-142 17-49 (49)
11 PF12276 DUF3617: Protein of u 79.5 21 0.00045 28.4 8.8 81 42-129 24-116 (162)
12 PF00028 Cadherin: Cadherin do 78.9 4.6 0.0001 28.7 4.5 48 119-168 2-52 (93)
13 smart00737 ML Domain involved 75.2 33 0.00071 25.9 9.7 96 35-146 14-112 (118)
14 cd00917 PG-PI_TP The phosphati 73.8 40 0.00087 26.2 9.8 36 110-145 78-115 (122)
15 PF07495 Y_Y_Y: Y_Y_Y domain; 71.3 4.6 0.0001 27.2 2.7 24 117-143 30-53 (66)
16 PTZ00487 ceramidase; Provision 70.9 27 0.00058 35.9 9.1 84 45-138 608-692 (715)
17 PF03443 Glyco_hydro_61: Glyco 70.2 5.9 0.00013 34.4 3.8 37 115-151 135-176 (218)
18 PRK12634 flgD flagellar basal 70.2 9.7 0.00021 33.3 5.1 34 113-146 146-183 (221)
19 PRK12633 flgD flagellar basal 66.8 12 0.00026 32.9 5.0 34 113-146 153-190 (230)
20 PF14610 DUF4448: Protein of u 66.7 60 0.0013 27.1 9.0 46 37-88 25-70 (189)
21 PF13754 Big_3_4: Bacterial Ig 65.5 13 0.00029 25.3 4.1 29 114-143 10-38 (54)
22 cd00258 GM2-AP GM2 activator p 62.5 73 0.0016 27.3 8.8 24 123-147 129-152 (162)
23 PF02221 E1_DerP2_DerF2: ML do 62.5 19 0.00042 27.2 4.9 36 110-145 87-125 (134)
24 PF01835 A2M_N: MG2 domain; I 60.9 11 0.00023 27.5 3.1 23 116-138 64-86 (99)
25 PF14524 Wzt_C: Wzt C-terminal 57.6 19 0.00042 26.8 4.1 46 113-164 83-130 (142)
26 PRK15221 Saf-pilin pilus forma 56.1 13 0.00028 32.0 3.2 43 3-49 4-59 (165)
27 PF08481 GBS_Bsp-like: GBS Bsp 52.8 18 0.00039 27.5 3.3 55 88-151 33-92 (95)
28 TIGR03065 srtB_sig_QVPTGV sort 52.7 2.1 4.5E-05 27.7 -1.6 30 169-198 3-32 (32)
29 KOG4680 Uncharacterized conser 52.1 27 0.00059 29.7 4.5 29 117-145 107-135 (153)
30 cd00031 CA Cadherin repeat dom 51.7 50 0.0011 25.8 5.8 48 118-167 2-52 (199)
31 PF03404 Mo-co_dimer: Mo-co ox 46.8 1.5E+02 0.0033 23.7 7.9 90 37-143 14-105 (131)
32 PF04151 PPC: Bacterial pre-pe 44.5 17 0.00036 25.2 1.8 12 126-137 59-70 (70)
33 PRK02710 plastocyanin; Provisi 42.8 59 0.0013 25.2 4.8 26 30-58 28-61 (119)
34 PRK12812 flgD flagellar basal 42.3 51 0.0011 29.8 5.0 35 113-147 165-203 (259)
35 PF14363 AAA_assoc: Domain ass 41.6 56 0.0012 24.7 4.4 51 11-61 47-97 (98)
36 PF13860 FlgD_ig: FlgD Ig-like 41.3 39 0.00085 24.4 3.4 27 114-140 50-80 (81)
37 PF10633 NPCBM_assoc: NPCBM-as 40.7 43 0.00092 23.7 3.5 22 116-137 54-75 (78)
38 PF06247 Plasmod_Pvs28: Plasmo 40.2 16 0.00034 32.3 1.3 20 174-193 177-196 (197)
39 PF12866 DUF3823: Protein of u 36.7 1.1E+02 0.0023 27.0 6.0 50 31-86 105-158 (222)
40 cd00031 CA Cadherin repeat dom 36.3 1.2E+02 0.0027 23.6 5.8 50 116-167 105-157 (199)
41 PF10648 Gmad2: Immunoglobulin 35.7 59 0.0013 24.6 3.7 75 37-143 6-86 (88)
42 PF00207 A2M: Alpha-2-macroglo 34.5 1.3E+02 0.0028 22.0 5.3 44 116-164 16-60 (92)
43 PRK12813 flgD flagellar basal 33.1 77 0.0017 28.0 4.5 31 112-142 144-178 (223)
44 smart00060 FN3 Fibronectin typ 32.9 1.2E+02 0.0025 18.5 9.0 26 117-142 56-81 (83)
45 PF01108 Tissue_fac: Tissue fa 31.6 2.2E+02 0.0047 21.2 7.3 85 25-141 18-102 (107)
46 cd00912 ML The ML (MD-2-relate 28.8 1.2E+02 0.0026 23.3 4.5 28 119-146 93-121 (127)
47 COG3110 Uncharacterized protei 28.3 2E+02 0.0044 25.9 6.3 37 112-148 86-127 (216)
48 PRK06655 flgD flagellar basal 26.4 1.3E+02 0.0029 26.3 4.9 33 113-146 150-186 (225)
49 PF09912 DUF2141: Uncharacteri 25.9 69 0.0015 24.9 2.7 25 116-142 41-65 (112)
50 PF13750 Big_3_3: Bacterial Ig 24.8 1.5E+02 0.0032 24.5 4.6 27 117-144 4-31 (158)
51 PRK15296 putative fimbrial pro 22.7 2.9E+02 0.0063 22.5 5.9 47 32-78 25-78 (181)
52 COG5301 Phage-related tail fib 22.6 1.2E+02 0.0026 30.7 4.1 80 83-179 50-131 (587)
53 COG1464 NlpA ABC-type metal io 22.2 59 0.0013 29.9 1.9 16 31-46 29-45 (268)
54 PF12245 Big_3_2: Bacterial Ig 21.2 1.4E+02 0.003 20.7 3.2 29 116-144 10-38 (60)
55 PF01002 Flavi_NS2B: Flaviviru 20.6 50 0.0011 27.0 1.0 73 116-198 55-127 (128)
56 PF12571 DUF3751: Phage tail-c 20.5 4.6E+02 0.0099 21.4 6.7 37 117-155 75-112 (159)
57 PF00801 PKD: PKD domain; Int 20.2 2.9E+02 0.0063 18.7 4.9 28 113-143 39-66 (69)
58 PF03381 CDC50: LEM3 (ligand-e 20.0 1E+02 0.0022 27.7 3.0 65 123-197 203-270 (278)
No 1
>PF10342 GPI-anchored: Ser-Thr-rich glycosyl-phosphatidyl-inositol-anchored membrane family; InterPro: IPR018466 This entry represents glycoproteins involved in cell wall (1-->6)-beta-glucan assembly. In yeast a null mutation leads to severe growth defects, aberrant multi-budded morphology, and mating defects. The entry includes DRMIP and Hesp-379, which are involved in fruiting body formation and in host attack, respectively. Hesp-379 is a haustorially expressed secreted protein; the haustorium is the small sucker that penetrates host tissue.
Probab=98.33 E-value=9e-06 Score=58.59 Aligned_cols=88 Identities=25% Similarity=0.452 Sum_probs=64.0
Q ss_pred EecCCCceeeeCCceEEEEeeccccCcCCCccccceeEEEEeeccccccCccccccccCcCcCCCcceeeeccccccCCC
Q 028785 37 TTTKRGQVLKAGEDKVTITWGLNQSLAAGTDSAYKTMKLQLCFAPVSQKDRAWRKTEDHLNKDKTCSFKIVEKPYNKSLQ 116 (204)
Q Consensus 37 as~~~g~vl~aG~D~itvtw~ln~t~~ag~d~~~ktvkvkLCYap~SQvdR~WRK~~d~l~kdK~Cq~ki~~k~y~~~~g 116 (204)
.+|..|.++++| +.++|+|.-..+.| ..++|.||-.+.+. ..+...| ....+.+.|
T Consensus 1 TsP~~~~~~~~g-~~~~I~W~~~~~~~-------~~~~I~L~~g~~~~---------------~~~~~~i-a~~v~~~~g 56 (93)
T PF10342_consen 1 TSPTAGTVWTAG-QPITITWTSDGTDP-------GNVTIYLCNGNNTN---------------LNFVQTI-ASNVSNSDG 56 (93)
T ss_pred CcCCCCCEEECC-CcEEEEEeCCCCCC-------cEEEEEEEcCCCCC---------------cceeEEE-EecccCCCC
Confidence 378999999999 77999999986554 58999999766521 2332333 334455569
Q ss_pred eEEEEecCCCCC-ceeEEEEEEEeCCCceEeecccC
Q 028785 117 TLDWIIESDVPT-ATYFVRAYALNAERHEVAYGQST 151 (204)
Q Consensus 117 s~~~tl~~dvp~-atYfvraya~da~g~~vaYGqs~ 151 (204)
+++|+++.|+|. ..||||... .+++ -.|.+|.
T Consensus 57 s~~~~~p~~l~~~~~Y~i~~~~-~~~~--~~~~~S~ 89 (93)
T PF10342_consen 57 SYTWTIPSDLPSGGDYFIQIVN-SSNN--TIYAYSP 89 (93)
T ss_pred EEEEEcCCCCCCCCcEEEEEEE-CCCC--ceEEECc
Confidence 999999999996 789999993 3444 4566665
No 2
>PRK10301 hypothetical protein; Provisional
Probab=96.76 E-value=0.024 Score=45.17 Aligned_cols=86 Identities=10% Similarity=0.182 Sum_probs=54.5
Q ss_pred EEEEecCCCceeeeCCceEEEEeeccccCcCCCccccceeEEEEeeccccccCccccccccCcCcCCCcceeeecccccc
Q 028785 34 QVTTTTKRGQVLKAGEDKVTITWGLNQSLAAGTDSAYKTMKLQLCFAPVSQKDRAWRKTEDHLNKDKTCSFKIVEKPYNK 113 (204)
Q Consensus 34 ~Vtas~~~g~vl~aG~D~itvtw~ln~t~~ag~d~~~ktvkvkLCYap~SQvdR~WRK~~d~l~kdK~Cq~ki~~k~y~~ 113 (204)
.+..+|.+|++|..-.+.|++++.- | .+..+..|+|. .+. .+.|.-...... ..
T Consensus 30 l~~s~Pa~ga~v~~~P~~V~L~F~e----~--v~~~~s~i~v~---~~~---g~~v~~~~~~~~--------------~~ 83 (124)
T PRK10301 30 LTHQYPAANAQVTAAPQALTLNFSE----G--IEPGFSGATIT---GPK---QENIKTLPAKRN--------------EQ 83 (124)
T ss_pred ccccCCCCCCccccCCCEEEEEcCC----C--ccccccEEEEE---cCC---CCEeccCCcccc--------------CC
Confidence 4678899999999999999999532 2 34445667663 111 111111100000 11
Q ss_pred CCCeEEEEecCCCCCceeEEEEEEEeCCCceE
Q 028785 114 SLQTLDWIIESDVPTATYFVRAYALNAERHEV 145 (204)
Q Consensus 114 ~~gs~~~tl~~dvp~atYfvraya~da~g~~v 145 (204)
.+..+...++.+.++|+|.|+==.+-+|||.+
T Consensus 84 ~~~~~~v~l~~~L~~G~YtV~Wrvvs~DGH~~ 115 (124)
T PRK10301 84 DQKQLIVPLADSLKPGTYTVDWHVVSVDGHKT 115 (124)
T ss_pred CCcEEEEECCCCCCCccEEEEEEEEecCCCcc
Confidence 12346666667799999999999999999955
No 3
>PF04234 CopC: CopC domain; InterPro: IPR007348 CopC is a bacterial blue copper protein that binds 1 atom of copper per protein molecule. Along with CopA, CopC mediates copper resistance by sequestration of copper in the periplasm.; GO: 0005507 copper ion binding, 0046688 response to copper ion, 0042597 periplasmic space; PDB: 1IX2_B 1LYQ_A 2C9P_C 2C9R_A 2C9Q_A 1M42_A 1OT4_A 1NM4_A.
Probab=96.09 E-value=0.072 Score=40.05 Aligned_cols=85 Identities=20% Similarity=0.304 Sum_probs=50.9
Q ss_pred EEEEecCCCceeeeCCceEEEEeeccccCcCCCccccceeEEEEeeccccccCccccccccCcCcCCCcceeeecccccc
Q 028785 34 QVTTTTKRGQVLKAGEDKVTITWGLNQSLAAGTDSAYKTMKLQLCFAPVSQKDRAWRKTEDHLNKDKTCSFKIVEKPYNK 113 (204)
Q Consensus 34 ~Vtas~~~g~vl~aG~D~itvtw~ln~t~~ag~d~~~ktvkvkLCYap~SQvdR~WRK~~d~l~kdK~Cq~ki~~k~y~~ 113 (204)
.+..+|.+|++|....+.|+++++-+ .+..+.+|+|. .+.. +.|... + ...+.
T Consensus 4 L~~s~Pa~ga~l~~~P~~v~L~F~e~------v~~~~s~v~v~---~~~g---~~v~~~---------~------~~~~~ 56 (97)
T PF04234_consen 4 LVSSSPADGATLAAAPEEVTLTFSEP------VEPGFSSVTVT---DPDG---KRVDLG---------E------PTVDG 56 (97)
T ss_dssp EEEEES-TTBEE-S--SSEEEEESS---------CCC-EEEEE---EEEE---TTSCTC---------E------EEEEE
T ss_pred ccccCCCCCCEeecCCCEEEEEeCCC------CccCccEEEEE---cCCC---ceeecC---------c------ceecC
Confidence 46789999999999999999996544 23344577665 1111 111111 1 12222
Q ss_pred CCCeEEEEecCCCCCceeEEEEEEEeCCCceE
Q 028785 114 SLQTLDWIIESDVPTATYFVRAYALNAERHEV 145 (204)
Q Consensus 114 ~~gs~~~tl~~dvp~atYfvraya~da~g~~v 145 (204)
....+...++...|+|+|.|+=-++-+|||.+
T Consensus 57 ~~~~~~~~l~~~l~~G~YtV~wrvvs~DGH~~ 88 (97)
T PF04234_consen 57 DGKTLTVPLPPPLPPGTYTVSWRVVSADGHPV 88 (97)
T ss_dssp STTEEEEEESS---SEEEEEEEEEEETTSCEE
T ss_pred CceEEEEECCCCCCCceEEEEEEEEecCCCCc
Confidence 34689999999999999999999999999965
No 4
>PF15418 DUF4625: Domain of unknown function (DUF4625)
Probab=95.34 E-value=0.17 Score=41.04 Aligned_cols=93 Identities=16% Similarity=0.287 Sum_probs=63.3
Q ss_pred eeEEEEecCCCceeeeCCceEEEEeeccccCcCCCccccceeEEEE-----eecccccc---CccccccccCcCcCCCcc
Q 028785 32 TLQVTTTTKRGQVLKAGEDKVTITWGLNQSLAAGTDSAYKTMKLQL-----CFAPVSQK---DRAWRKTEDHLNKDKTCS 103 (204)
Q Consensus 32 tL~Vtas~~~g~vl~aG~D~itvtw~ln~t~~ag~d~~~ktvkvkL-----CYap~SQv---dR~WRK~~d~l~kdK~Cq 103 (204)
.+.++..|..++++..|.| |++.-.+.. +.+-+.++|.+ +.+-..+. ..+|.
T Consensus 19 ~~~~~~~p~~~~~~~~G~~-ihfe~~i~d------~~~i~si~VeIH~nfd~H~h~~~~~~~~~~~~------------- 78 (132)
T PF15418_consen 19 LNEIGAFPENCKVATRGDD-IHFEADISD------NSAIKSIKVEIHNNFDHHTHSTEAGECEKPWV------------- 78 (132)
T ss_pred eeecccCCCCCeEEecCCc-EEEEEEEEc------ccceeEEEEEEecCcCcccccccccccccCcE-------------
Confidence 4456678899999999999 777755542 33346888888 44433333 23332
Q ss_pred eeeeccccccC------CCeEEEEecCCCCCceeEEEEEEEeCCCceEee
Q 028785 104 FKIVEKPYNKS------LQTLDWIIESDVPTATYFVRAYALNAERHEVAY 147 (204)
Q Consensus 104 ~ki~~k~y~~~------~gs~~~tl~~dvp~atYfvraya~da~g~~vaY 147 (204)
..+.|+.. .-...++||.|+|+|.|.+-....|+.|.+..+
T Consensus 79 ---~~~~~~~~~g~~~~~~h~~i~IPa~a~~G~YH~~i~VtD~~Gn~~~~ 125 (132)
T PF15418_consen 79 ---FEQDYDIYGGKKNYDFHEHIDIPADAPAGDYHFMITVTDAAGNQTEE 125 (132)
T ss_pred ---EEEEEcccCCcccEeEEEeeeCCCCCCCcceEEEEEEEECCCCEEEE
Confidence 12222211 236778999999999999999999999987654
No 5
>PF09608 Alph_Pro_TM: Putative transmembrane protein (Alph_Pro_TM); InterPro: IPR019088 This entry consists of predicted transmembrane proteins of about 270 amino acids. They are found predominantly, though not exclusively, in alphaproteobacteria, generally only once in each genome.
Probab=91.53 E-value=0.18 Score=44.55 Aligned_cols=40 Identities=20% Similarity=0.410 Sum_probs=34.3
Q ss_pred eEEEEecCCCCCceeEEEEEEEeCCCceEeecccCCCCcccceEEEEee
Q 028785 117 TLDWIIESDVPTATYFVRAYALNAERHEVAYGQSTNDQKTTNLFDIQAI 165 (204)
Q Consensus 117 s~~~tl~~dvp~atYfvraya~da~g~~vaYGqs~~~~~ttn~F~V~~i 165 (204)
+.+..+|-|+|+|.|.+|+|.+ .+|..++...+. ++|+.+
T Consensus 158 ra~i~LPanvp~G~Y~v~v~l~-rdG~vv~~~~~~--------l~V~Kv 197 (236)
T PF09608_consen 158 RARIPLPANVPPGDYTVRVYLF-RDGQVVASQETP--------LRVRKV 197 (236)
T ss_pred EEEeEcCCCCCcceEEEEEEEE-ECCEEEEEEeeE--------EEEEEc
Confidence 5788999999999999999997 899888776666 888873
No 6
>PF00041 fn3: Fibronectin type III domain; InterPro: IPR003961 Fibronectins are multi-domain glycoproteins found in a soluble form in plasma, and in an insoluble form in loose connective tissue and basement membranes []. They contain multiple copies of 3 repeat regions (types I, II and III), which bind to a variety of substances including heparin, collagen, DNA, actin, fibrin and fibronectin receptors on cell surfaces. The wide variety of these substances means that fibronectins are involved in a number of important functions: e.g., wound healing; cell adhesion; blood coagulation; cell differentiation and migration; maintenance of the cellular cytoskeleton; and tumour metastasis []. The role of fibronectin in cell differentiation is demonstrated by the marked reduction in the expression of its gene when neoplastic transformation occurs. Cell attachment has been found to be mediated by the binding of the tetrapeptide RGDS to integrins on the cell surface [], although related sequences can also display cell adhesion activity. Plasma fibronectin occurs as a dimer of 2 different subunits, linked together by 2 disulphide bonds near the C terminus. The difference in the 2 chains occurs in the type III repeat region and is caused by alternative splicing of the mRNA from one gene []. The observation that, in a given protein, an individual repeat of one of the 3 types (e.g., the first FnIII repeat) shows much less similarity to its subsequent tandem repeats within that protein than to its equivalent repeat between fibronectins from other species, has suggested that the repeating structure of fibronectin arose at an early stage of evolution. It also seems to suggest that the structure is subject to high selective pressure []. The fibronectin type III repeat region is an approximately 100 amino acid domain, different tandem repeats of which contain binding sites for DNA, heparin and the cell surface []. 
The superfamily of sequences believed to contain FnIII repeats represents 45 different families, the majority of which are involved in cell surface binding in some manner, or are receptor protein tyrosine kinases, or cytokine receptors.; GO: 0005515 protein binding; PDB: 1UEM_A 1TDQ_A 1X5I_A 2IC2_B 2IBG_C 2IBB_A 3R8Q_A 2FNB_A 1FNH_A 2EDB_A ....
Probab=90.94 E-value=2.4 Score=28.51 Aligned_cols=71 Identities=18% Similarity=0.349 Sum_probs=43.5
Q ss_pred eeCCceEEEEeeccccCcCCCccccceeEEEEeeccccccCccccccccCcCcCCCcceeeeccccccCCCeEEEEecCC
Q 028785 46 KAGEDKVTITWGLNQSLAAGTDSAYKTMKLQLCFAPVSQKDRAWRKTEDHLNKDKTCSFKIVEKPYNKSLQTLDWIIESD 125 (204)
Q Consensus 46 ~aG~D~itvtw~ln~t~~ag~d~~~ktvkvkLCYap~SQvdR~WRK~~d~l~kdK~Cq~ki~~k~y~~~~gs~~~tl~~d 125 (204)
..+.++|.|+|.... ..-+ + -...+|.|.+..... .|+. ..... ....+++..=
T Consensus 10 ~~~~~sv~v~W~~~~-~~~~-~----~~~y~v~~~~~~~~~-~~~~-----------------~~~~~--~~~~~~i~~L 63 (85)
T PF00041_consen 10 NISPTSVTVSWKPPS-SGNG-P----ITGYRVEYRSVNSTS-DWQE-----------------VTVPG--NETSYTITGL 63 (85)
T ss_dssp EECSSEEEEEEEESS-STSS-S----ESEEEEEEEETTSSS-EEEE-----------------EEEET--TSSEEEEESC
T ss_pred ECCCCEEEEEEECCC-CCCC-C----eeEEEEEEEecccce-eeee-----------------eeeee--eeeeeeeccC
Confidence 348899999999986 2212 2 335566664433322 1111 11111 2336677766
Q ss_pred CCCceeEEEEEEEeCCC
Q 028785 126 VPTATYFVRAYALNAER 142 (204)
Q Consensus 126 vp~atYfvraya~da~g 142 (204)
.|.-+|.+|++++++.|
T Consensus 64 ~p~t~Y~~~v~a~~~~g 80 (85)
T PF00041_consen 64 QPGTTYEFRVRAVNSDG 80 (85)
T ss_dssp CTTSEEEEEEEEEETTE
T ss_pred CCCCEEEEEEEEEeCCc
Confidence 89999999999998877
No 7
>TIGR02186 alph_Pro_TM conserved hypothetical protein. This family consists of predicted transmembrane proteins of about 270 amino acids. Members are found, so far, only among the Alphaproteobacteria and only once in each genome.
Probab=87.71 E-value=1.1 Score=40.48 Aligned_cols=44 Identities=23% Similarity=0.258 Sum_probs=36.5
Q ss_pred eEEEEecCCCCCceeEEEEEEEeCCCceEeecccCCCCcccceEEEEeeccccc
Q 028785 117 TLDWIIESDVPTATYFVRAYALNAERHEVAYGQSTNDQKTTNLFDIQAITGRHA 170 (204)
Q Consensus 117 s~~~tl~~dvp~atYfvraya~da~g~~vaYGqs~~~~~ttn~F~V~~itg~~~ 170 (204)
+.+..+|-|+|+|+|.+|+|.+ .+|..++-..+. ++|+.+ |...
T Consensus 183 ra~i~LPAnvp~G~Y~v~v~L~-r~G~vv~~~~t~--------l~V~Kv-G~E~ 226 (261)
T TIGR02186 183 RATLRLPANVPNGTHEVRAYLF-RGGVFIARTELA--------LEIVKT-GLEQ 226 (261)
T ss_pred EEeeecCCCCCCceEEEEEEEE-eCCEEEEEEEeE--------EEEEEe-cHHH
Confidence 5678899999999999999997 999999877776 888883 4433
No 8
>PF04734 Ceramidase_alk: Neutral/alkaline non-lysosomal ceramidase; InterPro: IPR006823 This family represents a group of neutral/alkaline ceramidases found in both bacteria and eukaryotes. They hydrolyse the sphingolipid ceramide into sphingosine and free fatty acid.; PDB: 2ZXC_A 2ZWS_A.
Probab=87.03 E-value=4 Score=41.20 Aligned_cols=101 Identities=18% Similarity=0.290 Sum_probs=47.1
Q ss_pred ceEEEEee-ccccCcCCCccccceeEEEEeeccccccCccccccccCcCcCCCcceeeeccccccCCCeEEEEecCCCCC
Q 028785 50 DKVTITWG-LNQSLAAGTDSAYKTMKLQLCFAPVSQKDRAWRKTEDHLNKDKTCSFKIVEKPYNKSLQTLDWIIESDVPT 128 (204)
Q Consensus 50 D~itvtw~-ln~t~~ag~d~~~ktvkvkLCYap~SQvdR~WRK~~d~l~kdK~Cq~ki~~k~y~~~~gs~~~tl~~dvp~ 128 (204)
|.+++++. .|+.-.--.+..|-+|.=+- -+-.|.---||=+-+-.=.|+-..-....+.=.++|++|.|+|+
T Consensus 571 ~~v~~~F~~a~Prn~l~~~~tfl~Ver~~-------~~~~W~~v~~D~dw~t~f~W~r~~~~~~~S~~ti~W~ip~~~~~ 643 (674)
T PF04734_consen 571 DTVSATFVGANPRNNLRLEGTFLTVERLE-------SGGSWQTVADDADWSTRFRWKRTGSLLGTSEVTIEWEIPPDTPP 643 (674)
T ss_dssp -EEEEEEEE--GGG---TTS-SEEEEEEE-------S-S--EEEEETTSTTEEEEEEEETTT--EEEEEEEEE--TT--S
T ss_pred CeEEEEEeeeCCCCccCCCCCeEEEEEec-------CCCCeEEEEeCCCccEEEEEEecCCccccEEEEEEEECCCCCCC
Confidence 66777754 33322222445555554332 35678776666333322334332222112234899999999999
Q ss_pred ceeEEEEEEE--eCCCceEeecccCCCCcccceEEEE
Q 028785 129 ATYFVRAYAL--NAERHEVAYGQSTNDQKTTNLFDIQ 163 (204)
Q Consensus 129 atYfvraya~--da~g~~vaYGqs~~~~~ttn~F~V~ 163 (204)
++|-||.+.- ...|....|--++ +.|+|+
T Consensus 644 G~YRi~~~G~~k~~~g~i~~f~G~S------~~F~V~ 674 (674)
T PF04734_consen 644 GTYRIRHFGDAKSLFGGITPFEGTS------REFTVT 674 (674)
T ss_dssp EEEEEEEEEEEE-TTT-EEEEEEE---------EEE-
T ss_pred CCEEEEEEeeccCCCCCeEeEEEEC------CceEeC
Confidence 9999999984 5567677774444 238774
No 9
>COG2372 CopC Uncharacterized protein, homolog of Cu resistance protein CopC [General function prediction only]
Probab=85.69 E-value=15 Score=30.39 Aligned_cols=90 Identities=17% Similarity=0.281 Sum_probs=56.3
Q ss_pred EEEEecCCCceeeeCCceEEEEeeccccCcCCCccccceeEEEEeeccccccCccccccccCcCcCCCcceeeecccccc
Q 028785 34 QVTTTTKRGQVLKAGEDKVTITWGLNQSLAAGTDSAYKTMKLQLCFAPVSQKDRAWRKTEDHLNKDKTCSFKIVEKPYNK 113 (204)
Q Consensus 34 ~Vtas~~~g~vl~aG~D~itvtw~ln~t~~ag~d~~~ktvkvkLCYap~SQvdR~WRK~~d~l~kdK~Cq~ki~~k~y~~ 113 (204)
.|.-.|.+|.++.+=++.|+..++-. ....|-.++|. .| +...-++-..+. +.
T Consensus 31 l~~s~Pad~s~v~aaP~~i~L~Fse~------ve~~fs~~~l~---~~-------------d~~~v~t~~~~~-----~~ 83 (127)
T COG2372 31 LVSSNPADNSVVTAAPAAITLEFSEG------VEPGFSGAKLT---GP-------------DGEEVATAGTKL-----DE 83 (127)
T ss_pred eecCCCCCcchhhcCceeEEEecCCc------cCCCcceeEEE---CC-------------CCCccccCcccc-----cc
Confidence 45566788999999999999774432 34455666664 11 111111111101 11
Q ss_pred CC-CeEEEEecCCCCCceeEEEEEEEeCCCceEeecccC
Q 028785 114 SL-QTLDWIIESDVPTATYFVRAYALNAERHEVAYGQST 151 (204)
Q Consensus 114 ~~-gs~~~tl~~dvp~atYfvraya~da~g~~vaYGqs~ 151 (204)
.+ ...+..++.+.+.++|.+.=-.+.+||| +-=|+.+
T Consensus 84 ~~~~~l~v~l~~~L~aG~Y~v~WrvvS~DGH-~v~G~~s 121 (127)
T COG2372 84 QNHTQLEVPLPQPLKAGVYTVDWRVVSSDGH-VVKGSIS 121 (127)
T ss_pred cCCcEEEecCcccCCCCcEEEEEEEEecCCc-EeccEEE
Confidence 11 2478888899999999998888889999 5556665
No 10
>PF05345 He_PIG: Putative Ig domain; InterPro: IPR008009 This alignment represents the conserved core region of a ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to Hyalin (IPR003410 from INTERPRO) and the PKD domain (IPR000601 from INTERPRO) suggest an Ig-like fold so this family may be similar in function to the (IPR003791 from INTERPRO) and (IPR003790 from INTERPRO) protein families.
Probab=80.46 E-value=4.1 Score=27.69 Aligned_cols=33 Identities=21% Similarity=0.165 Sum_probs=29.1
Q ss_pred ccccCCCeEEEEecCCCCCceeEEEEEEEeCCC
Q 028785 110 PYNKSLQTLDWIIESDVPTATYFVRAYALNAER 142 (204)
Q Consensus 110 ~y~~~~gs~~~tl~~dvp~atYfvraya~da~g 142 (204)
-++...|.+.|++..++.++.|-+.+-+-|++|
T Consensus 17 s~d~~tG~isGtp~~~~~~G~y~~~vtatd~~G 49 (49)
T PF05345_consen 17 SLDPSTGTISGTPTSSVQPGTYTFTVTATDGSG 49 (49)
T ss_pred EEeCCCCEEEeecCCCccccEEEEEEEEEcCCC
Confidence 357778999999999999999999999988876
No 11
>PF12276 DUF3617: Protein of unknown function (DUF3617); InterPro: IPR022061 This family of proteins is found in bacteria. Proteins in this family are typically between 155 and 179 amino acids in length. There is a single completely conserved residue C that may be functionally important.
Probab=79.48 E-value=21 Score=28.38 Aligned_cols=81 Identities=17% Similarity=0.261 Sum_probs=49.0
Q ss_pred CceeeeCCceEEEEeeccc-----------cCcCCC-ccccceeEEEEeeccccccCccccccccCcCcCCCcceeeecc
Q 028785 42 GQVLKAGEDKVTITWGLNQ-----------SLAAGT-DSAYKTMKLQLCFAPVSQKDRAWRKTEDHLNKDKTCSFKIVEK 109 (204)
Q Consensus 42 g~vl~aG~D~itvtw~ln~-----------t~~ag~-d~~~ktvkvkLCYap~SQvdR~WRK~~d~l~kdK~Cq~ki~~k 109 (204)
...++.|+=.+|++..++. -.+.+. +....+..++.|..+..-.+-...- ....++.|.. .
T Consensus 24 ~~~~kpGlWe~t~~~~~p~~~~~~~~~~~~~~~~~~~~~~~~~~t~~~Cit~~~~~~~~~~~---~~~~~~~C~~----~ 96 (162)
T PF12276_consen 24 APDIKPGLWEVTTTTEMPGMQPEMMEQMMAMMGAGMGGMPPQTSTVRQCITPEEAAKPDKSF---FPQENQDCTY----T 96 (162)
T ss_pred cCCCCCcccEEEEEeccCchhhhhhhhhhcccccccccccCCCCccCccCChhHhccccccc---ccCCCCCCCE----e
Confidence 3455777767777755111 111111 1223467889999998765432221 5566689976 3
Q ss_pred ccccCCCeEEEEecCCCCCc
Q 028785 110 PYNKSLQTLDWIIESDVPTA 129 (204)
Q Consensus 110 ~y~~~~gs~~~tl~~dvp~a 129 (204)
.+..+++.++|++.=+.|.+
T Consensus 97 ~~~~~~~~~~~~~~C~~~~~ 116 (162)
T PF12276_consen 97 DVSRSGGTVTFTMSCTGPGG 116 (162)
T ss_pred eEEEeCCEEEEEEEeCCCCC
Confidence 45556789999998777775
No 12
>PF00028 Cadherin: Cadherin domain; InterPro: IPR002126 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism which modulate a wide variety of processes including cell polarisation and migration [, ,]. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionary related to the desmogleins which are component of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; a single transmembrane domain and five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats, and the fifth contains 4 conserved cysteines and a N-terminal cytoplasmic domain []. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats. 
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding, to confer rigidity upon the extracellular domain and is essential for cadherin adhesive function and for protection against protease digestion.; GO: 0005509 calcium ion binding, 0007156 homophilic cell adhesion, 0016020 membrane; PDB: 2A4E_A 2A4C_B 2O72_A 2QVI_A 1NCJ_A 3Q2W_A 3Q2N_A 3LNH_B 3LNI_A 3Q2L_A ....
Probab=78.94 E-value=4.6 Score=28.67 Aligned_cols=48 Identities=19% Similarity=0.380 Sum_probs=38.7
Q ss_pred EEEecCCCCCceeEEEEEEEeCC---CceEeecccCCCCcccceEEEEeeccc
Q 028785 119 DWIIESDVPTATYFVRAYALNAE---RHEVAYGQSTNDQKTTNLFDIQAITGR 168 (204)
Q Consensus 119 ~~tl~~dvp~atYfvraya~da~---g~~vaYGqs~~~~~ttn~F~V~~itg~ 168 (204)
+..|+.|.|.++...++-+.|+| ...+.|.-..... .+.|+|.+.+|.
T Consensus 2 ~~~v~E~~~~g~~v~~v~a~D~D~~~n~~i~y~i~~~~~--~~~F~I~~~tg~ 52 (93)
T PF00028_consen 2 SFSVPENAPPGTVVGQVTATDPDSGPNSQITYSILGGNP--DGLFSIDPNTGE 52 (93)
T ss_dssp EEEEETTGSTSSEEEEEEEEESSTSTTSSEEEEEEETTS--TTSEEEETTTTE
T ss_pred EEEEECCCCCCCEEEEEEEEeCCCCCCceEEEEEecCcc--cCceEEeeeeec
Confidence 46789999999999999999887 4568887777432 577999998876
No 13
>smart00737 ML Domain involved in innate immunity and lipid metabolism. ML (MD-2-related lipid-recognition) is a novel domain identified in MD-1, MD-2, GM2A, Npc2 and multiple proteins of unknown function in plants, animals and fungi. These single-domain proteins were predicted to form a beta-rich fold containing multiple strands, and to mediate diverse biological functions through interacting with specific lipids.
Probab=75.21 E-value=33 Score=25.85 Aligned_cols=96 Identities=15% Similarity=0.199 Sum_probs=51.0
Q ss_pred EEEecCCCceeeeCCceEEEEeeccccCcCCCccccceeEEEEeeccccccCccccccccCcCcCCCcceeeeccccccC
Q 028785 35 VTTTTKRGQVLKAGEDKVTITWGLNQSLAAGTDSAYKTMKLQLCFAPVSQKDRAWRKTEDHLNKDKTCSFKIVEKPYNKS 114 (204)
Q Consensus 35 Vtas~~~g~vl~aG~D~itvtw~ln~t~~ag~d~~~ktvkvkLCYap~SQvdR~WRK~~d~l~kdK~Cq~ki~~k~y~~~ 114 (204)
|+++|-. ...++.=+|++.+..+.... .. +++|.+-+.- +..++-..+.++ |+.--..=|..++
T Consensus 14 v~v~Pc~--~~~g~~~~i~i~f~~~~~~~----~~--~~~v~~~~~g---~~ip~~~~~~d~-----C~~~~~~CPl~~G 77 (118)
T smart00737 14 VSISPCP--PVRGKTLTISISFTLNEDIS----KL--KVVVHVKIGG---IEVPIPGETYDL-----CKLLGSKCPIEKG 77 (118)
T ss_pred EEecCCC--CCCCCEEEEEEEEEEcccce----EE--EEEEEEEECC---EEEeccCCCCCc-----cccCCCCCCCCCC
Confidence 4555532 23344446888876664443 22 4555555541 222222222222 3221112245444
Q ss_pred C---CeEEEEecCCCCCceeEEEEEEEeCCCceEe
Q 028785 115 L---QTLDWIIESDVPTATYFVRAYALNAERHEVA 146 (204)
Q Consensus 115 ~---gs~~~tl~~dvp~atYfvraya~da~g~~va 146 (204)
. -+....|+...|.++|.+++-..|.+|..++
T Consensus 78 ~~~~~~~~~~v~~~~P~~~~~v~~~l~d~~~~~i~ 112 (118)
T smart00737 78 ETVNYTNSLTVPGIFPPGKYTVKWELTDEDGEELA 112 (118)
T ss_pred eeEEEEEeeEccccCCCeEEEEEEEEEcCCCCEEE
Confidence 2 1233467789999999999999998887654
No 14
>cd00917 PG-PI_TP The phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP) has been shown to bind phosphatidylglycerol and phosphatidylinositol, but the biological significance of this is still obscure. These proteins belong to the ML domain family.
Probab=73.85 E-value=40 Score=26.21 Aligned_cols=36 Identities=22% Similarity=0.310 Sum_probs=26.2
Q ss_pred ccccCCC--eEEEEecCCCCCceeEEEEEEEeCCCceE
Q 028785 110 PYNKSLQ--TLDWIIESDVPTATYFVRAYALNAERHEV 145 (204)
Q Consensus 110 ~y~~~~g--s~~~tl~~dvp~atYfvraya~da~g~~v 145 (204)
|..++.- ..+..|+..+|+++|.|++-..|.++.++
T Consensus 78 Pi~~G~~~~~~~~~ip~~~P~g~y~v~~~l~d~~~~~i 115 (122)
T cd00917 78 PIEPGDKFLTKLVDLPGEIPPGKYTVSARAYTKDDEEI 115 (122)
T ss_pred CcCCCcEEEEEEeeCCCCCCCceEEEEEEEECCCCCEE
Confidence 4444432 33457778899999999999988888754
No 15
>PF07495 Y_Y_Y: Y_Y_Y domain; InterPro: IPR011123 This region is mostly found at the end of the beta propellers (IPR011110 from INTERPRO) in a family of two component regulators. However they are also found tandemly repeated in Q891H4 from SWISSPROT without other signal conduction domains being present. It is named after the conserved tyrosines found in the alignment. The exact function is not known.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=71.26 E-value=4.6 Score=27.17 Aligned_cols=24 Identities=13% Similarity=0.409 Sum_probs=17.9
Q ss_pred eEEEEecCCCCCceeEEEEEEEeCCCc
Q 028785 117 TLDWIIESDVPTATYFVRAYALNAERH 143 (204)
Q Consensus 117 s~~~tl~~dvp~atYfvraya~da~g~ 143 (204)
++.|+ +.|+|+|.+++.+.|.+|.
T Consensus 30 ~~~~~---~L~~G~Y~l~V~a~~~~~~ 53 (66)
T PF07495_consen 30 SISYT---NLPPGKYTLEVRAKDNNGK 53 (66)
T ss_dssp EEEEE---S--SEEEEEEEEEEETTS-
T ss_pred EEEEE---eCCCEEEEEEEEEECCCCC
Confidence 55554 6899999999999999886
No 16
>PTZ00487 ceramidase; Provisional
Probab=70.86 E-value=27 Score=35.93 Aligned_cols=84 Identities=20% Similarity=0.241 Sum_probs=48.5
Q ss_pred eeeCCceEEEE-eeccccCcCCCccccceeEEEEeeccccccCccccccccCcCcCCCcceeeeccccccCCCeEEEEec
Q 028785 45 LKAGEDKVTIT-WGLNQSLAAGTDSAYKTMKLQLCFAPVSQKDRAWRKTEDHLNKDKTCSFKIVEKPYNKSLQTLDWIIE 123 (204)
Q Consensus 45 l~aG~D~itvt-w~ln~t~~ag~d~~~ktvkvkLCYap~SQvdR~WRK~~d~l~kdK~Cq~ki~~k~y~~~~gs~~~tl~ 123 (204)
...| |.++++ |+.|+--.--.+..|-+|.-.- .+..|..--||-+=+-.=+|+-. ....+.-+++|+++
T Consensus 608 y~~g-~~v~~~F~~a~Prn~l~~~~tf~~Ve~~~-------~~~~W~~v~~D~dw~t~~~W~r~--~~~~S~~ti~W~i~ 677 (715)
T PTZ00487 608 YSNN-DTVSAEFYGGNPRNNFMTESSFLTVDKLN-------EKNQWTTILVDGDWDTKWHWKMH--DLGFSLITIIWSIG 677 (715)
T ss_pred cCCC-CEEEEEEEecCCCCccccCcceEEEEEec-------CCCceeEeccCCCcceEEEEecc--CCCceeEEEEEECC
Confidence 4445 566655 4445432222445555554221 33458877766444433344332 11123348999999
Q ss_pred CCCCCceeEEEEEEE
Q 028785 124 SDVPTATYFVRAYAL 138 (204)
Q Consensus 124 ~dvp~atYfvraya~ 138 (204)
.|.|+++|.||-+.-
T Consensus 678 ~~~~~G~YRi~~~G~ 692 (715)
T PTZ00487 678 PTTEPGTYRITHSGY 692 (715)
T ss_pred CCCCCeeeEEEEeec
Confidence 999999999999984
No 17
>PF03443 Glyco_hydro_61: Glycosyl hydrolase family 61; InterPro: IPR005103 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The only known activity within this family is that of endoglucanase (3.2.1.4 from EC) GH61 from CAZY ; PDB: 4EIS_B 2VTC_A 4EIR_B 3EJA_D 3EII_A.
Probab=70.17 E-value=5.9 Score=34.43 Aligned_cols=37 Identities=16% Similarity=0.309 Sum_probs=24.4
Q ss_pred CCeEEEEecCCCCCceeEEEEEEE---eC--CCceEeecccC
Q 028785 115 LQTLDWIIESDVPTATYFVRAYAL---NA--ERHEVAYGQST 151 (204)
Q Consensus 115 ~gs~~~tl~~dvp~atYfvraya~---da--~g~~vaYGqs~ 151 (204)
+++++.+||.++|++.|-+|.=.+ .| .|..--|=+|.
T Consensus 135 ~~~~~~~IP~~l~~G~YLlR~E~IaLH~a~~~~gaQfY~~Ca 176 (218)
T PF03443_consen 135 NGSWTFTIPKNLPPGQYLLRHEIIALHSAGQPGGAQFYPSCA 176 (218)
T ss_dssp TCEEEEE--TTBBSEEEEEEEEEEE-TTTTSTT--EEEEEEE
T ss_pred CCceEEEeCCCCCCCCceEEecceeeccCccCCCCEEhhhCE
Confidence 579999999999999999998665 22 23334555554
No 18
>PRK12634 flgD flagellar basal body rod modification protein; Reviewed
Probab=70.16 E-value=9.7 Score=33.35 Aligned_cols=34 Identities=12% Similarity=0.141 Sum_probs=26.9
Q ss_pred cCCCeEEEEecCC----CCCceeEEEEEEEeCCCceEe
Q 028785 113 KSLQTLDWIIESD----VPTATYFVRAYALNAERHEVA 146 (204)
Q Consensus 113 ~~~gs~~~tl~~d----vp~atYfvraya~da~g~~va 146 (204)
++...|+|.-..+ +|+|.|.+++.+.|++|..+.
T Consensus 146 aG~~~f~WDG~d~~G~~~~~G~Yt~~v~a~~~~G~~~~ 183 (221)
T PRK12634 146 AGEVSFAWDGTDANGNRMAAGKYGVTATQTDTAGSKSK 183 (221)
T ss_pred CCceeEEECCCCCCCCcCCCeeeEEEEEEEeCCCcEEe
Confidence 4445788987644 999999999999999997553
No 19
>PRK12633 flgD flagellar basal body rod modification protein; Provisional
Probab=66.80 E-value=12 Score=32.85 Aligned_cols=34 Identities=24% Similarity=0.415 Sum_probs=27.5
Q ss_pred cCCCeEEEEecCC----CCCceeEEEEEEEeCCCceEe
Q 028785 113 KSLQTLDWIIESD----VPTATYFVRAYALNAERHEVA 146 (204)
Q Consensus 113 ~~~gs~~~tl~~d----vp~atYfvraya~da~g~~va 146 (204)
++...|+|+--.+ +|++.|.+++-+.|++|..+.
T Consensus 153 aG~~~f~WDG~d~~G~~~~~G~Y~~~V~a~~~~G~~~~ 190 (230)
T PRK12633 153 TGVHTLQWDGNNDGGQPLADGKYSITVSASDADAKPVK 190 (230)
T ss_pred CCceeEEECCCCCCCCcCCCcceEEEEEEEeCCCcEEe
Confidence 4346899988643 899999999999999998665
No 20
>PF14610 DUF4448: Protein of unknown function (DUF4448)
Probab=66.68 E-value=60 Score=27.06 Aligned_cols=46 Identities=26% Similarity=0.484 Sum_probs=34.4
Q ss_pred EecCCCceeeeCCceEEEEeeccccCcCCCccccceeEEEEeeccccccCcc
Q 028785 37 TTTKRGQVLKAGEDKVTITWGLNQSLAAGTDSAYKTMKLQLCFAPVSQKDRA 88 (204)
Q Consensus 37 as~~~g~vl~aG~D~itvtw~ln~t~~ag~d~~~ktvkvkLCYap~SQvdR~ 88 (204)
-+|++|+++..|+ +-.|||--. -. +....+|+|.|=|.+.++.+-+
T Consensus 25 C~P~~~s~l~~g~-tY~ITWd~~-~f----~~~~~~V~I~l~y~~~~~~~~~ 70 (189)
T PF14610_consen 25 CTPKDGSELYVGK-TYYITWDPS-FF----DPSNSTVRIHLSYVNESSNEKG 70 (189)
T ss_pred ccCCCCCEEecCC-CEEEEEChh-hc----cCCCcEEEEEEEeccCCccccc
Confidence 4678999999997 567999632 11 3333689999999999987765
No 21
>PF13754 Big_3_4: Bacterial Ig-like domain (group 3)
Probab=65.49 E-value=13 Score=25.29 Aligned_cols=29 Identities=17% Similarity=0.187 Sum_probs=25.4
Q ss_pred CCCeEEEEecCCCCCceeEEEEEEEeCCCc
Q 028785 114 SLQTLDWIIESDVPTATYFVRAYALNAERH 143 (204)
Q Consensus 114 ~~gs~~~tl~~dvp~atYfvraya~da~g~ 143 (204)
++|+..++++.. +.+.|-+++-+.|+.|.
T Consensus 10 ~~G~Ws~t~~~~-~dG~y~itv~a~D~AGN 38 (54)
T PF13754_consen 10 SDGNWSFTVPAL-ADGTYTITVTATDAAGN 38 (54)
T ss_pred CCCcEEEeCCCC-CCccEEEEEEEEeCCCC
Confidence 357888888877 99999999999999997
No 22
>cd00258 GM2-AP GM2 activator protein (GM2-AP) is a non-enzymatic lysosomal protein that acts as cofactor in the sequential degradation of gangliosides. GM2A is an essential cofactor for beta-hexosaminidase A (Hex A) in the enzymatic hydrolysis of GM2 ganglioside to GM3. Mutation of the gene results in the AB variant of Tay-Sachs disease. GM2-AP and similar proteins belong to the ML domain family.
Probab=62.50 E-value=73 Score=27.31 Aligned_cols=24 Identities=17% Similarity=0.195 Sum_probs=19.2
Q ss_pred cCCCCCceeEEEEEEEeCCCceEee
Q 028785 123 ESDVPTATYFVRAYALNAERHEVAY 147 (204)
Q Consensus 123 ~~dvp~atYfvraya~da~g~~vaY 147 (204)
|.-.+.|.|.+++.. +++|.+.+=
T Consensus 129 Ps~l~~G~Y~i~~~l-~~~g~~l~C 152 (162)
T cd00258 129 PSWLTNGNYRITGIL-MADGKELGC 152 (162)
T ss_pred CCccCCCcEEEEEEE-CCCCCEEEE
Confidence 556679999999966 899988763
No 23
>PF02221 E1_DerP2_DerF2: ML domain; InterPro: IPR003172 The MD-2-related lipid-recognition (ML) domain is implicated in lipid recognition, particularly in the recognition of pathogen related products. It has an immunoglobulin-like beta-sandwich fold similar to that of E-set Ig domains. This domain is present in the following proteins: Epididymal secretory protein E1 (also known as Niemann-Pick C2 protein), which is known to bind cholesterol. Niemann-Pick disease type C2 is a fatal hereditary disease characterised by accumulation of low-density lipoprotein-derived cholesterol in lysosomes []. House-dust mite allergen proteins such as Der f 2 from Dermatophagoides farinae and Der p 2 from Dermatophagoides pteronyssinus []. ; PDB: 2AG9_B 1G13_B 2AG2_B 2AG4_A 1TJJ_C 1PU5_C 1PUB_A 2AF9_A 3T6Q_D 3M7O_B ....
Probab=62.48 E-value=19 Score=27.19 Aligned_cols=36 Identities=17% Similarity=0.197 Sum_probs=28.2
Q ss_pred ccccCCC---eEEEEecCCCCCceeEEEEEEEeCCCceE
Q 028785 110 PYNKSLQ---TLDWIIESDVPTATYFVRAYALNAERHEV 145 (204)
Q Consensus 110 ~y~~~~g---s~~~tl~~dvp~atYfvraya~da~g~~v 145 (204)
|..++.. +.+..++...|+++|.+++-..|.+|.++
T Consensus 87 Pi~~G~~~~~~~~~~i~~~~p~~~~~i~~~l~d~~~~~i 125 (134)
T PF02221_consen 87 PIKAGEYYTYTYTIPIPKIYPPGKYTIQWKLTDQDGEEI 125 (134)
T ss_dssp TBTTTEEEEEEEEEEESTTSSSEEEEEEEEEEETTTEEE
T ss_pred ccCCCcEEEEEEEEEcccceeeEEEEEEEEEEeCCCCEE
Confidence 6665532 45667789999999999999999987655
No 24
>PF01835 A2M_N: MG2 domain; InterPro: IPR002890 The proteinase-binding alpha-macroglobulins (A2M) [] are large glycoproteins found in the plasma of vertebrates, in the hemolymph of some invertebrates and in reptilian and avian egg white. A2M-like proteins are able to inhibit all four classes of proteinases by a 'trapping' mechanism. They have a peptide stretch, called the 'bait region', which contains specific cleavage sites for different proteinases. When a proteinase cleaves the bait region, a conformational change is induced in the protein, thus trapping the proteinase. The entrapped enzyme remains active against low molecular weight substrates, whilst its activity toward larger substrates is greatly reduced, due to steric hindrance. Following cleavage in the bait region, a thiol ester bond, formed between the side chains of a cysteine and a glutamine, is cleaved and mediates the covalent binding of the A2M-like protein to the proteinase. This family includes the N-terminal region of the alpha-2-macroglobulin family. The inhibitor domains belong to MEROPS inhibitor family I39.; GO: 0004866 endopeptidase inhibitor activity; PDB: 2B39_B 3KLS_B 3PRX_C 3KM9_B 3PVM_C 3CU7_A 4E0S_A 4A5W_A 4ACQ_C 2P9R_B ....
Probab=60.92 E-value=11 Score=27.47 Aligned_cols=23 Identities=22% Similarity=0.315 Sum_probs=15.9
Q ss_pred CeEEEEecCCCCCceeEEEEEEE
Q 028785 116 QTLDWIIESDVPTATYFVRAYAL 138 (204)
Q Consensus 116 gs~~~tl~~dvp~atYfvraya~ 138 (204)
-+++|.||.+.+.|.|.|++..-
T Consensus 64 ~~~~~~lp~~~~~G~y~i~~~~~ 86 (99)
T PF01835_consen 64 FSGSFQLPDDAPLGTYTIRVKTD 86 (99)
T ss_dssp EEEEEE--SS---EEEEEEEEET
T ss_pred EEEEEECCCCCCCEeEEEEEEEc
Confidence 47889999999999999999984
No 25
>PF14524 Wzt_C: Wzt C-terminal domain; PDB: 2R5O_B.
Probab=57.59 E-value=19 Score=26.79 Aligned_cols=46 Identities=7% Similarity=0.124 Sum_probs=28.0
Q ss_pred cCCCeEEEEecCCCCCceeEEEEEEE--eCCCceEeecccCCCCcccceEEEEe
Q 028785 113 KSLQTLDWIIESDVPTATYFVRAYAL--NAERHEVAYGQSTNDQKTTNLFDIQA 164 (204)
Q Consensus 113 ~~~gs~~~tl~~dvp~atYfvraya~--da~g~~vaYGqs~~~~~ttn~F~V~~ 164 (204)
.+.++++++++.+..+|.|+|.+... ...+....|-+.. . .|+|..
T Consensus 83 ~g~~~~~~~i~~~L~~G~Y~i~v~l~~~~~~~~~~d~~~~~--~----~f~V~~ 130 (142)
T PF14524_consen 83 GGTYEVTFTIPKPLNPGEYSISVGLGDDSSGGEVLDWIEDA--L----SFEVED 130 (142)
T ss_dssp T-EEEEEEEEE--B-SEEEEEEEEEEETTTEEEEEEEEEEE--E----EEEEE-
T ss_pred CCEEEEEEEEcCccCCCeEEEEEEEEecCCCCEEEEEECCE--E----EEEEEC
Confidence 33478999999889999999999983 3333334443333 3 388877
No 26
>PRK15221 Saf-pilin pilus formation protein SafA; Provisional
Probab=56.10 E-value=13 Score=32.04 Aligned_cols=43 Identities=33% Similarity=0.571 Sum_probs=27.5
Q ss_pred hhhhHHHHHHHHhhhcccccce-------------eeccccceeEEEEecCCCceeeeCC
Q 028785 3 ARGLLLASIYLSLLVHECYGVT-------------LFSSLQKTLQVTTTTKRGQVLKAGE 49 (204)
Q Consensus 3 ~~~l~~a~ll~a~~~~~~~~~~-------------~~SsL~ktL~Vtas~~~g~vl~aG~ 49 (204)
+-.|+||+.| ++.+.+||+.- -|+ .|..|.|+..|-+| |.||.
T Consensus 4 ikkliiasal-smmaascyags~~pnt~~~~SvDv~Fa-~p~~ltvtltpV~g--L~AG~ 59 (165)
T PRK15221 4 IKKLIIASAL-SMMAASCYAGSFLPNTEQQKSVDINFA-SPQQLTVSLDPVSG--LKAGK 59 (165)
T ss_pred HHHHHHHHHH-HHHHHHhhhcccccCCccceeEeEEEc-CCCccEEEEeecCc--cccCC
Confidence 3468898888 77777887752 232 34566677666555 55555
No 27
>PF08481 GBS_Bsp-like: GBS Bsp-like repeat; InterPro: IPR013688 This repeat is found in a number of Streptococcus proteins including some hypothetical proteins and Bsp. Bsp is a protein of group B Streptococcus (GBS) which might control cell morphology [].
Probab=52.82 E-value=18 Score=27.50 Aligned_cols=55 Identities=24% Similarity=0.411 Sum_probs=35.3
Q ss_pred cccccc--cCcCcCCCcceeeeccccccCCCeEEEEec---CCCCCceeEEEEEEEeCCCceEeecccC
Q 028785 88 AWRKTE--DHLNKDKTCSFKIVEKPYNKSLQTLDWIIE---SDVPTATYFVRAYALNAERHEVAYGQST 151 (204)
Q Consensus 88 ~WRK~~--d~l~kdK~Cq~ki~~k~y~~~~gs~~~tl~---~dvp~atYfvraya~da~g~~vaYGqs~ 151 (204)
-|.+.| |||. |--+++ .++|+..-++. -+--.|+|+|-+|..+.+|..+.-+.++
T Consensus 33 VWSe~nGQdDL~------WY~a~k---~~dg~y~~~i~~~nH~~~~G~Y~vhvY~~~~~G~~~~l~~t~ 92 (95)
T PF08481_consen 33 VWSEENGQDDLK------WYTATK---QSDGSYSVTIDLSNHKNETGTYHVHVYITDADGKMIGLNATT 92 (95)
T ss_pred EEcCCCCCCccE------EEEeee---cCCCcEEEEEeHHHCCCCccEEEEEEEEEcCCCcEEEEeeeE
Confidence 388887 5663 422322 22344444444 2334589999999999999887777666
No 28
>TIGR03065 srtB_sig_QVPTGV sortase B signal domain, QVPTGV class. This model represents a boutique (unusual) sorting signal, recognized by a member of the sortase SrtB family rather than by the housekeeping sortase, SrtA.
Probab=52.72 E-value=2.1 Score=27.71 Aligned_cols=30 Identities=23% Similarity=0.327 Sum_probs=24.5
Q ss_pred cccceeeeeeeeeehhhhhhhhhhHHHhHh
Q 028785 169 HASLDIASVCFSVFSIVALFGFFFHEKRKA 198 (204)
Q Consensus 169 ~~sL~ia~~~fS~FSvv~L~~ff~~Ekrk~ 198 (204)
|.+......=|.+.|+++..+..++-|||+
T Consensus 3 ptgv~gtlapf~al~iva~gg~~y~tk~kk 32 (32)
T TIGR03065 3 PTGVAGTLAPFAALGIVAIGGAIYFTKKKK 32 (32)
T ss_pred ccceeeeeccceeEEEEEeccEEEEEEccC
Confidence 556666777899999999999888888874
No 29
>KOG4680 consensus Uncharacterized conserved protein, contains ML domain [General function prediction only]
Probab=52.11 E-value=27 Score=29.74 Aligned_cols=29 Identities=17% Similarity=0.371 Sum_probs=25.8
Q ss_pred eEEEEecCCCCCceeEEEEEEEeCCCceE
Q 028785 117 TLDWIIESDVPTATYFVRAYALNAERHEV 145 (204)
Q Consensus 117 s~~~tl~~dvp~atYfvraya~da~g~~v 145 (204)
.+...||--+|||+|.+.+-++|++|.+.
T Consensus 107 ~hsq~LPg~tPPG~Y~lkm~~~d~~~~~L 135 (153)
T KOG4680|consen 107 AHSQVLPGYTPPGSYVLKMTAYDAKGKEL 135 (153)
T ss_pred eeeEeccCcCCCceEEEEEEeecCCCCEE
Confidence 46778999999999999999999999864
No 30
>cd00031 CA Cadherin repeat domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion; these domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium; plays a role in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-,CNR-,proto-,and FAT-family cadherin, desmocollin, and desmoglein, exists as monomers or dimers (hetero- and homo-); two copies of the repeat are present here
Probab=51.71 E-value=50 Score=25.83 Aligned_cols=48 Identities=21% Similarity=0.317 Sum_probs=27.1
Q ss_pred EEEEecCCCCCceeEEEEEEEeCCCc---eEeecccCCCCcccceEEEEeecc
Q 028785 118 LDWIIESDVPTATYFVRAYALNAERH---EVAYGQSTNDQKTTNLFDIQAITG 167 (204)
Q Consensus 118 ~~~tl~~dvp~atYfvraya~da~g~---~vaYGqs~~~~~ttn~F~V~~itg 167 (204)
..+.++.|.|.++...+.-+.|+|.. .+.|.-...+. .+.|+|.+.+|
T Consensus 2 ~~~~i~En~~~g~~v~~~~a~D~D~~~~~~~~y~i~~~~~--~~~F~i~~~tG 52 (199)
T cd00031 2 YSVSVPENAPPGTVVGTVSATDPDSGENGRVTYSILGGNE--DGLFSIDPNTG 52 (199)
T ss_pred eEEEEeCCCCCCCEEEEEEEECCCCCCCceEEEEEeCCCC--cccEEEeCCCC
Confidence 34566677777666666666666653 46665544321 14566666544
No 31
>PF03404 Mo-co_dimer: Mo-co oxidoreductase dimerisation domain; InterPro: IPR005066 The majority of molybdenum-containing enzymes utilise a molybdenum cofactor (MoCF or Moco) consisting of a Mo atom coordinated via a cis-dithiolene moiety to molybdopterin (MPT). MoCF is ubiquitous in nature, and the pathway for MoCF biosynthesis is conserved in all three domains of life. MoCF-containing enzymes function as oxidoreductases in carbon, nitrogen, and sulphur metabolism [, ]. In Escherichia coli, biosynthesis of MoCF is a three stage process. It begins with the MoaA and MoaC conversion of GTP to the meta-stable pterin intermediate precursor Z. The second stage involves MPT synthase (MoaD and MoaE), which converts precursor Z to MPT; MoeB is involved in the recycling of MPT synthase. The final step in MoCF synthesis is the attachment of mononuclear Mo to MPT, a process that requires MoeA and which is enhanced by MogA in an Mg2 ATP-dependent manner []. MoCF is the active co-factor in eukaryotic and some prokaryotic molybdo-enzymes, but the majority of bacterial enzymes requiring MoCF need a modification of MPT for it to be active; MobA is involved in the attachment of a nucleotide monophosphate to MPT resulting in the MGD co-factor, the active co-factor for most prokaryotic molybdo-enzymes. Bacterial two-hybrid studies have revealed the close interactions between MoeA, MogA, and MobA in the synthesis of MoCF []. Moreover the close functional association of MoeA and MogA in the synthesis of MoCF is supported by the fact that the known eukaryotic homologues to MoeA and MogA exist as fusion proteins: CNX1 (Q39054 from SWISSPROT) of Arabidopsis thaliana (Mouse-ear cress), mammalian Gephyrin (e.g. Q9NQX3 from SWISSPROT) and Drosophila melanogaster (Fruit fly) Cinnamon (P39205 from SWISSPROT) [].
This domain is found in molybdopterin cofactor oxidoreductases, such as in the C-terminal of Mo-containing sulphite oxidase, which catalyses the conversion of sulphite to sulphate, the terminal step in the oxidative degradation of cysteine and methionine []. This domain is involved in dimer formation, and has an Ig-fold structure [].; GO: 0016491 oxidoreductase activity, 0030151 molybdenum ion binding, 0055114 oxidation-reduction process; PDB: 2C9X_A 2CA3_A 2BLF_A 2CA4_A 2BPB_A 2XTS_C 2BII_A 2BIH_A 1OGP_A 2A9A_B ....
Probab=46.75 E-value=1.5e+02 Score=23.74 Aligned_cols=90 Identities=17% Similarity=0.251 Sum_probs=46.6
Q ss_pred EecCCCceeeeCCceEEEEeeccccCcCCCccccceeEEEEeeccccccCccccccccCcCcCCCcceeeeccccccCCC
Q 028785 37 TTTKRGQVLKAGEDKVTITWGLNQSLAAGTDSAYKTMKLQLCFAPVSQKDRAWRKTEDHLNKDKTCSFKIVEKPYNKSLQ 116 (204)
Q Consensus 37 as~~~g~vl~aG~D~itvtw~ln~t~~ag~d~~~ktvkvkLCYap~SQvdR~WRK~~d~l~kdK~Cq~ki~~k~y~~~~g 116 (204)
++|.+|+++..|...++|. +-+-.-.| ..=..|.|.+. -.+-|+...-+-..+..= + -+.+| .
T Consensus 14 ~~P~~~~~v~~~~~~v~i~--G~A~~g~g--~~I~rVEVS~D------gG~tW~~A~l~~~~~~~~-~--g~~~~----a 76 (131)
T PF03404_consen 14 TSPSDGETVKAGDGTVTIR--GYAWSGGG--RGIARVEVSTD------GGKTWQEATLDGPESPPR-Y--GEARW----A 76 (131)
T ss_dssp EESBTTEEEESESEEEEEE--EEEE-STT----EEEEEEESS------TTSSEEE-EEESTSCCCH-H--TS-TT----S
T ss_pred EecCCCCEEccCCcEEEEE--EEEEeCCC--cceEEEEEEeC------CCCCcEEeEeccCCCccc-c--cccCc----c
Confidence 4689999999997777765 33222111 11124444321 255687766544433100 0 00011 1
Q ss_pred eEEEEecCCCC--CceeEEEEEEEeCCCc
Q 028785 117 TLDWIIESDVP--TATYFVRAYALNAERH 143 (204)
Q Consensus 117 s~~~tl~~dvp--~atYfvraya~da~g~ 143 (204)
=..|.+.-+.| ++.|.|.+=+.|.+|.
T Consensus 77 W~~W~~~~~~~~~~G~~~i~~RA~D~~G~ 105 (131)
T PF03404_consen 77 WRLWEYDWPPPSLPGEYTIMVRATDESGN 105 (131)
T ss_dssp -EEEEEEEEECSHCCEEEEEEEEEETTS-
T ss_pred cceeeeccCcCccccceEEEEEEeecccc
Confidence 23344444444 4999999999999996
No 32
>PF04151 PPC: Bacterial pre-peptidase C-terminal domain; InterPro: IPR007280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. 
This domain is normally found at the C terminus of secreted archaeal and bacterial peptidases, the majority of which belong to MEROPS peptidase families M4 (vibriolysin, IPR001570 from INTERPRO), M9A and M9B (microbial collagenase, IPR002169 from INTERPRO), M28 (aminopeptidase Ap1, IPR007484 from INTERPRO) and S8 (subtilisin family peptidases, IPR000209 from INTERPRO).; GO: 0008233 peptidase activity, 0006508 proteolysis; PDB: 4DY5_B 4DXZ_A 4DY3_B 3JQW_A 3JQX_C 1NQJ_B 1NQD_A 2O8O_A 1WMF_A 1WME_A ....
Probab=44.51 E-value=17 Score=25.21 Aligned_cols=12 Identities=33% Similarity=0.866 Sum_probs=10.1
Q ss_pred CCCceeEEEEEE
Q 028785 126 VPTATYFVRAYA 137 (204)
Q Consensus 126 vp~atYfvraya 137 (204)
.++||||||++.
T Consensus 59 ~~~GtYyi~V~~ 70 (70)
T PF04151_consen 59 PAAGTYYIRVYG 70 (70)
T ss_dssp SSSEEEEEEEE-
T ss_pred CCCEEEEEEEEC
Confidence 789999999974
No 33
>PRK02710 plastocyanin; Provisional
Probab=42.79 E-value=59 Score=25.21 Aligned_cols=26 Identities=31% Similarity=0.604 Sum_probs=13.6
Q ss_pred cceeEEEEecCCCc--------eeeeCCceEEEEeec
Q 028785 30 QKTLQVTTTTKRGQ--------VLKAGEDKVTITWGL 58 (204)
Q Consensus 30 ~ktL~Vtas~~~g~--------vl~aG~D~itvtw~l 58 (204)
.++.+|+...++|+ .++.|. +|+|..
T Consensus 28 a~~~~V~~~~~~~~~~F~P~~i~v~~Gd---~V~~~N 61 (119)
T PRK02710 28 AETVEVKMGSDAGMLAFEPSTLTIKAGD---TVKWVN 61 (119)
T ss_pred cceEEEEEccCCCeeEEeCCEEEEcCCC---EEEEEE
Confidence 45666665544443 455553 356753
No 34
>PRK12812 flgD flagellar basal body rod modification protein; Reviewed
Probab=42.34 E-value=51 Score=29.81 Aligned_cols=35 Identities=17% Similarity=0.317 Sum_probs=27.7
Q ss_pred cCCCeEEEEecCC----CCCceeEEEEEEEeCCCceEee
Q 028785 113 KSLQTLDWIIESD----VPTATYFVRAYALNAERHEVAY 147 (204)
Q Consensus 113 ~~~gs~~~tl~~d----vp~atYfvraya~da~g~~vaY 147 (204)
++...|+|.=..+ +|++.|.+++-+.|++|..+..
T Consensus 165 aG~~~f~WDG~d~~G~~~~~G~Yt~~v~A~~~~G~~~~~ 203 (259)
T PRK12812 165 QGLFTMEWDGRDNDGVYAGDGEYTIKAVYNNKNGEKITA 203 (259)
T ss_pred CcceeEEECCCCCCCCcCCCeeeEEEEEEEcCCCcEEee
Confidence 3345788887544 8999999999999999987753
No 35
>PF14363 AAA_assoc: Domain associated at C-terminal with AAA
Probab=41.56 E-value=56 Score=24.70 Aligned_cols=51 Identities=14% Similarity=0.115 Sum_probs=33.7
Q ss_pred HHHHhhhcccccceeeccccceeEEEEecCCCceeeeCCceEEEEeecccc
Q 028785 11 IYLSLLVHECYGVTLFSSLQKTLQVTTTTKRGQVLKAGEDKVTITWGLNQS 61 (204)
Q Consensus 11 ll~a~~~~~~~~~~~~SsL~ktL~Vtas~~~g~vl~aG~D~itvtw~ln~t 61 (204)
+.|+..+++.......+..++.=.++.++++||++.-==+-++|.|.+..+
T Consensus 47 ~YL~s~~s~~a~rL~~~~~~~~~~~~l~l~~~e~V~D~F~Gv~v~W~~~~~ 97 (98)
T PF14363_consen 47 AYLSSKISPSARRLKASKSKNSKNLVLSLDDGEEVVDVFEGVKVWWSSVCT 97 (98)
T ss_pred HHHhhccCcccceeeecccCCCCceEEecCCCCEEEEEECCEEEEEEEEcc
Confidence 445544444433345666665555777788888777666789999998754
No 36
>PF13860 FlgD_ig: FlgD Ig-like domain; PDB: 3C12_A 3OSV_A.
Probab=41.27 E-value=39 Score=24.39 Aligned_cols=27 Identities=26% Similarity=0.446 Sum_probs=17.6
Q ss_pred CCCeEEEEec----CCCCCceeEEEEEEEeC
Q 028785 114 SLQTLDWIIE----SDVPTATYFVRAYALNA 140 (204)
Q Consensus 114 ~~gs~~~tl~----~dvp~atYfvraya~da 140 (204)
+..+|.|.=- .-+|++.|++++.+.|+
T Consensus 50 G~~~~~WdG~d~~G~~~~~G~Y~~~v~a~~~ 80 (81)
T PF13860_consen 50 GEHSFTWDGKDDDGNPVPDGTYTFRVTATDG 80 (81)
T ss_dssp EEEEEEE-SB-TTS-B--SEEEEEEEEEEET
T ss_pred ceEEEEECCCCCCcCCCCCCCEEEEEEEEeC
Confidence 3468888833 45999999999998754
No 37
>PF10633 NPCBM_assoc: NPCBM-associated, NEW3 domain of alpha-galactosidase; InterPro: IPR018905 This domain has been named NEW3, but its function is not known. It is found on proteins which are bacterial galactosidases [].; PDB: 1EUT_A 2BZD_A 1WCQ_C 2BER_A 1W8O_A 1EUU_A 1W8N_A.
Probab=40.72 E-value=43 Score=23.71 Aligned_cols=22 Identities=27% Similarity=0.321 Sum_probs=15.6
Q ss_pred CeEEEEecCCCCCceeEEEEEE
Q 028785 116 QTLDWIIESDVPTATYFVRAYA 137 (204)
Q Consensus 116 gs~~~tl~~dvp~atYfvraya 137 (204)
-.++.++|.|.++++|.|++-+
T Consensus 54 ~~~~V~vp~~a~~G~y~v~~~a 75 (78)
T PF10633_consen 54 VTFTVTVPADAAPGTYTVTVTA 75 (78)
T ss_dssp EEEEEEE-TT--SEEEEEEEEE
T ss_pred EEEEEECCCCCCCceEEEEEEE
Confidence 4677778899999999999865
No 38
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface proteins (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals [].; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=40.19 E-value=16 Score=32.33 Aligned_cols=20 Identities=35% Similarity=0.368 Sum_probs=2.3
Q ss_pred eeeeeeeeehhhhhhhhhhH
Q 028785 174 IASVCFSVFSIVALFGFFFH 193 (204)
Q Consensus 174 ia~~~fS~FSvv~L~~ff~~ 193 (204)
-.+--+|+|||+.|+++|++
T Consensus 177 s~~n~~sif~Il~l~~ifi~ 196 (197)
T PF06247_consen 177 SFMNGSSIFSILNLFVIFII 196 (197)
T ss_dssp TEEE----------------
T ss_pred hHHhHHHHHHHHHHHHheee
Confidence 45667899999999998875
No 39
>PF12866 DUF3823: Protein of unknown function (DUF3823); InterPro: IPR024278 This is a family of uncharacterised proteins from Bacteroidetes. These proteins have characteristic DN and DR sequence-motifs but their function is not known.; PDB: 3HN5_B 4EIU_A.
Probab=36.75 E-value=1.1e+02 Score=26.95 Aligned_cols=50 Identities=18% Similarity=0.250 Sum_probs=33.7
Q ss_pred ceeEEEEec----CCCceeeeCCceEEEEeeccccCcCCCccccceeEEEEeeccccccC
Q 028785 31 KTLQVTTTT----KRGQVLKAGEDKVTITWGLNQSLAAGTDSAYKTMKLQLCFAPVSQKD 86 (204)
Q Consensus 31 ktL~Vtas~----~~g~vl~aG~D~itvtw~ln~t~~ag~d~~~ktvkvkLCYap~SQvd 86 (204)
.++++++.| +..+... .-++|+++..++.... ++ +--.|.|.....+.||
T Consensus 105 t~~d~eVtPY~~I~~~~~~~-~g~~v~asf~v~~~~~---~~--~i~~v~l~~~~t~~v~ 158 (222)
T PF12866_consen 105 TTQDFEVTPYLRIKNAKISL-NGNKVTASFKVEQIIT---NA--NIEEVQLYVSKTQFVG 158 (222)
T ss_dssp EEEEEEE-BSEEEEECEEEE-ETTEEEEEEEEEESS----HH---EEEEEEEEESSTT-S
T ss_pred eEEeEEeeeeEEEeccceee-cCCEEEEEEEEEeccC---CC--ceeEEEEEEecccccC
Confidence 466777777 4554444 4478999999997663 32 6778999999999999
No 40
>cd00031 CA Cadherin repeat domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion; these domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium; plays a role in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-,CNR-,proto-,and FAT-family cadherin, desmocollin, and desmoglein, exists as monomers or dimers (hetero- and homo-); two copies of the repeat are present here
Probab=36.33 E-value=1.2e+02 Score=23.63 Aligned_cols=50 Identities=20% Similarity=0.353 Sum_probs=38.8
Q ss_pred CeEEEEecCCCCCceeEEEEEEEeCCC---ceEeecccCCCCcccceEEEEeecc
Q 028785 116 QTLDWIIESDVPTATYFVRAYALNAER---HEVAYGQSTNDQKTTNLFDIQAITG 167 (204)
Q Consensus 116 gs~~~tl~~dvp~atYfvraya~da~g---~~vaYGqs~~~~~ttn~F~V~~itg 167 (204)
......+..+.|.++...++-+.|.|+ ..+.|--..... .+.|+|.+-+|
T Consensus 105 ~~~~~~v~e~~~~~~~i~~~~a~D~D~~~~~~~~y~l~~~~~--~~~f~i~~~~G 157 (199)
T cd00031 105 SSYEASVPENAPPGTVVGTVTATDADSGENAKLTYSILSGND--KELFSIDPNTG 157 (199)
T ss_pred cceEEEEeCCCCCCCEEEEEEEEcCCCCCCccEEEEEeCCCC--CCEEEEeCCce
Confidence 467788889999999999999998886 678886555332 46799988665
No 41
>PF10648 Gmad2: Immunoglobulin-like domain of bacterial spore germination; InterPro: IPR018911 This domain is found linked to IPR019606 from INTERPRO in some bacterial proteins. It is predicted to contain an immunoglobulin-like all-beta fold.
Probab=35.68 E-value=59 Score=24.57 Aligned_cols=75 Identities=11% Similarity=0.171 Sum_probs=45.6
Q ss_pred EecCCCceeeeCCceEEEEeeccccCcCCCccccceeEEEEeeccccccCccccccccCcCcCCCcceeeeccccccCC-
Q 028785 37 TTTKRGQVLKAGEDKVTITWGLNQSLAAGTDSAYKTMKLQLCFAPVSQKDRAWRKTEDHLNKDKTCSFKIVEKPYNKSL- 115 (204)
Q Consensus 37 as~~~g~vl~aG~D~itvtw~ln~t~~ag~d~~~ktvkvkLCYap~SQvdR~WRK~~d~l~kdK~Cq~ki~~k~y~~~~- 115 (204)
.+|.+|+++.. .++|+ +.+..- + .++.++| |-.++.+ +.+.+-.+++
T Consensus 6 ~~P~pg~~V~s---p~~V~--G~A~~F---E---gtv~~rv------------~D~~g~v---------l~e~~~~a~~g 53 (88)
T PF10648_consen 6 TAPAPGDTVSS---PVKVS--GKARVF---E---GTVNIRV------------RDGHGEV---------LAEGFVTATGG 53 (88)
T ss_pred cCCCCcCCcCC---CEEEE--EEEEEe---e---eEEEEEE------------EcCCCcE---------EEEeeEEeccC
Confidence 36788888776 66666 665554 2 2777776 2222222 2233333322
Q ss_pred ----CeEEEEecCCCC-CceeEEEEEEEeCCCc
Q 028785 116 ----QTLDWIIESDVP-TATYFVRAYALNAERH 143 (204)
Q Consensus 116 ----gs~~~tl~~dvp-~atYfvraya~da~g~ 143 (204)
|.|+-+|.-.-| +.+|.+++|..|+.|.
T Consensus 54 ~~~~g~F~~tv~~~~~~~~~g~l~v~~~s~~dG 86 (88)
T PF10648_consen 54 APSWGPFEGTVSFPPPPPGKGTLEVFEDSAKDG 86 (88)
T ss_pred CCcccceEEEEEeCCCCCCceEEEEEEeCCCCC
Confidence 667777774433 8899999999988764
No 42
>PF00207 A2M: Alpha-2-macroglobulin family; InterPro: IPR001599 This entry contains serum complement C3 and C4 precursors and alpha-macroglobulins. The alpha-macroglobulin (aM) family of proteins includes protease inhibitors [], typified by the human tetrameric a2-macroglobulin (a2M); they belong to the MEROPS proteinase inhibitor family I39, clan IL. These protease inhibitors share several defining properties, which include (i) the ability to inhibit proteases from all catalytic classes, (ii) the presence of a 'bait region' and a thiol ester, (iii) a similar protease inhibitory mechanism and (iv) the inactivation of the inhibitory capacity by reaction of the thiol ester with small primary amines. aM protease inhibitors inhibit by steric hindrance []. The mechanism involves protease cleavage of the bait region, a segment of the aM that is particularly susceptible to proteolytic cleavage, which initiates a conformational change such that the aM collapses about the protease. In the resulting aM-protease complex, the active site of the protease is sterically shielded, thus substantially decreasing access to protein substrates. Two additional events occur as a consequence of bait region cleavage, namely (i) the beta-cysteinyl-gamma-glutamyl thiol ester becomes highly reactive and (ii) a major conformational change exposes a conserved COOH-terminal receptor binding domain [] (RBD). RBD exposure allows the aM protease complex to bind to clearance receptors and be removed from circulation []. Tetrameric, dimeric, and, more recently, monomeric aM protease inhibitors have been identified [, ].; GO: 0004866 endopeptidase inhibitor activity; PDB: 3KLS_B 3PRX_C 3KM9_B 3PVM_C 3CU7_A 4E0S_A 4A5W_A 2PN5_A 3FRP_G 3HRZ_B ....
Probab=34.53 E-value=1.3e+02 Score=21.99 Aligned_cols=44 Identities=18% Similarity=0.360 Sum_probs=27.4
Q ss_pred CeEEEEecCCCCCceeEEEEEEEeCCCceEeecccCC-CCcccceEEEEe
Q 028785 116 QTLDWIIESDVPTATYFVRAYALNAERHEVAYGQSTN-DQKTTNLFDIQA 164 (204)
Q Consensus 116 gs~~~tl~~dvp~atYfvraya~da~g~~vaYGqs~~-~~~ttn~F~V~~ 164 (204)
.+++.++|+++. +|.+.|++++.++ .+|.... .-.+..-|.|++
T Consensus 16 ~~~~~~lPd~it--~w~v~a~a~s~~~---~~g~~~~~~~~v~~p~~i~~ 60 (92)
T PF00207_consen 16 ATFSFTLPDSIT--SWRVTAFAVSPTG---GFGIAEPPEITVFKPFFIQL 60 (92)
T ss_dssp EEEEEE-SSSSS--EEEEEEEEEETTT---EEEEECCEEEEEB-SEEEEE
T ss_pred EEEEEECCCCcc--EEEEEEEEECCCC---cceEecceEEEEEeeEEEEc
Confidence 467777777776 8999999998876 3555553 323333355554
No 43
>PRK12813 flgD flagellar basal body rod modification protein; Reviewed
Probab=33.13 E-value=77 Score=28.01 Aligned_cols=31 Identities=13% Similarity=0.287 Sum_probs=22.8
Q ss_pred ccCCCeEEEEecCC----CCCceeEEEEEEEeCCC
Q 028785 112 NKSLQTLDWIIESD----VPTATYFVRAYALNAER 142 (204)
Q Consensus 112 ~~~~gs~~~tl~~d----vp~atYfvraya~da~g 142 (204)
..+.+.|+|.=-.+ +|++.|.+++-+.++++
T Consensus 144 ~~G~~~f~WDG~d~~G~~l~~G~Yt~~V~A~~~g~ 178 (223)
T PRK12813 144 PVGAGPVEWAGEDADGNPLPNGAYSFVVESYSGGE 178 (223)
T ss_pred CCCceeEEeCCcCCCCCcCCCccEEEEEEEEeCCc
Confidence 34557899984433 99999999999875433
No 44
>smart00060 FN3 Fibronectin type 3 domain. One of three types of internal repeat within the plasma protein, fibronectin. The tenth fibronectin type III repeat contains a RGD cell recognition sequence in a flexible loop between 2 strands. Type III modules are present in both extracellular and intracellular proteins.
Probab=32.85 E-value=1.2e+02 Score=18.46 Aligned_cols=26 Identities=19% Similarity=0.298 Sum_probs=18.8
Q ss_pred eEEEEecCCCCCceeEEEEEEEeCCC
Q 028785 117 TLDWIIESDVPTATYFVRAYALNAER 142 (204)
Q Consensus 117 s~~~tl~~dvp~atYfvraya~da~g 142 (204)
...+.++.=.|..+|.+|+.+++..|
T Consensus 56 ~~~~~i~~L~~~~~Y~v~v~a~~~~g 81 (83)
T smart00060 56 STSYTLTGLKPGTEYEFRVRAVNGAG 81 (83)
T ss_pred ccEEEEeCcCCCCEEEEEEEEEcccC
Confidence 34566666666679999999986554
No 45
>PF01108 Tissue_fac: Tissue factor; PDB: 3OG4_B 3OG6_B 1FYH_E 1FG9_D 1JRH_I 3DGC_R 3DLQ_R 1LQS_R 1Y6M_R 1J7V_R ....
Probab=31.56 E-value=2.2e+02 Score=21.19 Aligned_cols=85 Identities=21% Similarity=0.349 Sum_probs=42.9
Q ss_pred eeccccceeEEEEecCCCceeeeCCceEEEEeeccccCcCCCccccceeEEEEeeccccccCccccccccCcCcCCCcce
Q 028785 25 LFSSLQKTLQVTTTTKRGQVLKAGEDKVTITWGLNQSLAAGTDSAYKTMKLQLCFAPVSQKDRAWRKTEDHLNKDKTCSF 104 (204)
Q Consensus 25 ~~SsL~ktL~Vtas~~~g~vl~aG~D~itvtw~ln~t~~ag~d~~~ktvkvkLCYap~SQvdR~WRK~~d~l~kdK~Cq~ 104 (204)
....||+--.|+..-.+ ....++|.-....| .+.-| +|+.+ ...+..|+ .-..|+.
T Consensus 18 ~~~~lp~P~nv~~~s~n--------f~~iL~W~~~~~~~--~~~~y-tVq~~------~~~~~~W~-------~v~~C~~ 73 (107)
T PF01108_consen 18 ASASLPAPQNVTVDSVN--------FKHILRWDPGPGSP--PNVTY-TVQYK------KYGSSSWK-------DVPGCQN 73 (107)
T ss_dssp --SSGSSCEEEEEEEET--------TEEEEEEEESTTSS--STEEE-EEEEE------ESSTSCEE-------EECCEEE
T ss_pred ccccCCCCCeeEEEEEC--------CceEEEeCCCCCCC--CCeEE-EEEEE------ecCCccee-------eccceec
Confidence 44567655555533333 34678999855554 33322 45554 11222333 3357844
Q ss_pred eeeccccccCCCeEEEEecCCCCCceeEEEEEEEeCC
Q 028785 105 KIVEKPYNKSLQTLDWIIESDVPTATYFVRAYALNAE 141 (204)
Q Consensus 105 ki~~k~y~~~~gs~~~tl~~dvp~atYfvraya~da~ 141 (204)
|.+.- -+.+-...-+...|++|+-+..++
T Consensus 74 -i~~~~-------Cdlt~~~~~~~~~Y~~rV~A~~~~ 102 (107)
T PF01108_consen 74 -ITETS-------CDLTDETSDPSESYYARVRAEVGN 102 (107)
T ss_dssp -ESSSE-------EECTTCCTTTTSEEEEEEEEEETT
T ss_pred -ccccc-------eeCcchhhcCcCCEEEEEEEEeCC
Confidence 43322 222222222889999999997443
No 46
>cd00912 ML The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids.
Probab=28.82 E-value=1.2e+02 Score=23.30 Aligned_cols=28 Identities=18% Similarity=0.204 Sum_probs=21.1
Q ss_pred EEEecC-CCCCceeEEEEEEEeCCCceEe
Q 028785 119 DWIIES-DVPTATYFVRAYALNAERHEVA 146 (204)
Q Consensus 119 ~~tl~~-dvp~atYfvraya~da~g~~va 146 (204)
+..|+. .+|+..|+++...+|.+|..++
T Consensus 93 ~~~v~~~~~P~~~~~v~~~l~~~~~~~v~ 121 (127)
T cd00912 93 TVNVPEFTIPTIEYQVVLEDVTDKGEVLA 121 (127)
T ss_pred EEecCcccCCCeeEEEEEEEEcCCCCEEE
Confidence 344555 7899999999998887776554
No 47
>COG3110 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=28.34 E-value=2e+02 Score=25.85 Aligned_cols=37 Identities=16% Similarity=0.247 Sum_probs=20.0
Q ss_pred ccCCCeEEEEec--CCCCCceeEEEEEE---EeCCCceEeec
Q 028785 112 NKSLQTLDWIIE--SDVPTATYFVRAYA---LNAERHEVAYG 148 (204)
Q Consensus 112 ~~~~gs~~~tl~--~dvp~atYfvraya---~da~g~~vaYG 148 (204)
++.+.+..+.+| ||.+.|.=|++.-. +|.+|++++-=
T Consensus 86 ~~~n~~lv~~~P~~rn~rea~~f~k~p~~ql~d~~G~~i~~k 127 (216)
T COG3110 86 NAQNIDLVFFLPRLRNEREANKFIKTPRWQLLDGDGTPIAVK 127 (216)
T ss_pred cccchhheeeccccccchhhhhhhcCccEEEecCCCcEeehh
Confidence 333333444455 66666666655433 36777766543
No 48
>PRK06655 flgD flagellar basal body rod modification protein; Reviewed
Probab=26.40 E-value=1.3e+02 Score=26.33 Aligned_cols=33 Identities=15% Similarity=0.357 Sum_probs=23.9
Q ss_pred cCCCeEEEEecCC----CCCceeEEEEEEEeCCCceEe
Q 028785 113 KSLQTLDWIIESD----VPTATYFVRAYALNAERHEVA 146 (204)
Q Consensus 113 ~~~gs~~~tl~~d----vp~atYfvraya~da~g~~va 146 (204)
++...|+|+=-.+ +|++.|.+++-+. .+|..+.
T Consensus 150 aG~~~f~WDG~d~~G~~lp~G~Yt~~V~A~-~~g~~~~ 186 (225)
T PRK06655 150 AGVVSFTWDGTDTDGNALPDGNYTIKASAS-VGGKQLV 186 (225)
T ss_pred CCceeEEECCCCCCCCcCCCeeEEEEEEEE-eCCceee
Confidence 3335788876544 8999999999887 6666443
No 49
>PF09912 DUF2141: Uncharacterized protein conserved in bacteria (DUF2141); InterPro: IPR018673 This family of conserved hypothetical proteins has no known function.
Probab=25.88 E-value=69 Score=24.92 Aligned_cols=25 Identities=20% Similarity=0.452 Sum_probs=18.7
Q ss_pred CeEEEEecCCCCCceeEEEEEEEeCCC
Q 028785 116 QTLDWIIESDVPTATYFVRAYALNAER 142 (204)
Q Consensus 116 gs~~~tl~~dvp~atYfvraya~da~g 142 (204)
++.+.++ .|+|+++|-|++|. |.+|
T Consensus 41 ~~~~~~f-~~lp~G~YAi~v~h-D~N~ 65 (112)
T PF09912_consen 41 GTVTITF-EDLPPGTYAIAVFH-DENG 65 (112)
T ss_pred CcEEEEE-CCCCCccEEEEEEE-eCCC
Confidence 5555555 48999999999998 4443
No 50
>PF13750 Big_3_3: Bacterial Ig-like domain (group 3)
Probab=24.81 E-value=1.5e+02 Score=24.48 Aligned_cols=27 Identities=22% Similarity=0.380 Sum_probs=22.1
Q ss_pred eEEEEecCCCCCceeEEEEE-EEeCCCce
Q 028785 117 TLDWIIESDVPTATYFVRAY-ALNAERHE 144 (204)
Q Consensus 117 s~~~tl~~dvp~atYfvray-a~da~g~~ 144 (204)
.++|.+ .++|.|.|-+.++ +.|..|..
T Consensus 4 ~~~fd~-~~l~dG~Y~l~~~~a~D~agN~ 31 (158)
T PF13750_consen 4 TYTFDL-STLPDGSYTLTVVTATDAAGNT 31 (158)
T ss_pred EEEEEe-CcCCCccEEEEEEEEEecCCCE
Confidence 467777 4999999999997 88888864
No 51
>PRK15296 putative fimbrial protein SthA; Provisional
Probab=22.65 E-value=2.9e+02 Score=22.53 Aligned_cols=47 Identities=19% Similarity=0.356 Sum_probs=24.4
Q ss_pred eeEEEEecCCCc-eeee-CCceEEEEeeccc---cCcCCCccccceeEEEE--e
Q 028785 32 TLQVTTTTKRGQ-VLKA-GEDKVTITWGLNQ---SLAAGTDSAYKTMKLQL--C 78 (204)
Q Consensus 32 tL~Vtas~~~g~-vl~a-G~D~itvtw~ln~---t~~ag~d~~~ktvkvkL--C 78 (204)
+|.++....+.- .+.. |-...+|+++--+ =..+|..++.+...|+| |
T Consensus 25 ~I~f~G~I~~~tC~v~~~~s~~~~V~lg~v~~~~l~~~g~~~~~~~F~I~L~~C 78 (181)
T PRK15296 25 TITFNGKIYDQACTVQVNGSTDTTIDLGNYSKERIAEKGATTDYVPFTVSLVSC 78 (181)
T ss_pred eEEEEEEEecCccEEEcCCCcccEEECcccchhhhccCcccCCCEeEEEEecCC
Confidence 566665543322 3444 4445577753221 11345445667778877 7
No 52
>COG5301 Phage-related tail fibre protein [General function prediction only]
Probab=22.60 E-value=1.2e+02 Score=30.70 Aligned_cols=80 Identities=19% Similarity=0.275 Sum_probs=53.6
Q ss_pred cccCccccccccCcCcC-CCcceeeeccccccCCCeEEEEecCCCCCceeEEEEEEE-eCCCceEeecccCCCCcccceE
Q 028785 83 SQKDRAWRKTEDHLNKD-KTCSFKIVEKPYNKSLQTLDWIIESDVPTATYFVRAYAL-NAERHEVAYGQSTNDQKTTNLF 160 (204)
Q Consensus 83 SQvdR~WRK~~d~l~kd-K~Cq~ki~~k~y~~~~gs~~~tl~~dvp~atYfvraya~-da~g~~vaYGqs~~~~~ttn~F 160 (204)
+.++.-||---..|--| |---|-|+|+ -||.+| +-++||-..+ |++|.-++||.|-.+.| =
T Consensus 50 aLvnE~~RaqlN~L~vdp~NpnqlIAEl-----------VlPetv--GGwwiREvGlfDadG~liavgncPeSYK----p 112 (587)
T COG5301 50 ALVNERHRAQLNRLFVDPKNPNQLIAEL-----------VLPETV--GGWWIREVGLFDADGKLIAVGNCPESYK----P 112 (587)
T ss_pred HHHHHHHHHhhhheEeCCCCccceeEEE-----------eccccc--cceEEEEeeeecCCCCEEEEccCCcccc----c
Confidence 34677788776666555 3222224443 344443 4588998886 99999999999997766 4
Q ss_pred EEEeeccccccceeeeeee
Q 028785 161 DIQAITGRHASLDIASVCF 179 (204)
Q Consensus 161 ~V~~itg~~~sL~ia~~~f 179 (204)
+.+.=+||++-++.-+..=
T Consensus 113 qm~eGsgRtqtiRmvi~~S 131 (587)
T COG5301 113 QMEEGSGRTQTIRMVIALS 131 (587)
T ss_pred cccCCCCceEEEEEEEEec
Confidence 5555578988877655443
No 53
>COG1464 NlpA ABC-type metal ion transport system, periplasmic component/surface antigen [Inorganic ion transport and metabolism]
Probab=22.25 E-value=59 Score=29.89 Aligned_cols=16 Identities=31% Similarity=0.542 Sum_probs=13.2
Q ss_pred ceeEEEEec-CCCceee
Q 028785 31 KTLQVTTTT-KRGQVLK 46 (204)
Q Consensus 31 ktL~Vtas~-~~g~vl~ 46 (204)
++|.|.++| .++++++
T Consensus 29 ~~I~vg~~~~p~a~ile 45 (268)
T COG1464 29 KTIKVGATPGPHAEILE 45 (268)
T ss_pred CcEEEeecCCchHHHHH
Confidence 899999998 5677665
No 54
>PF12245 Big_3_2: Bacterial Ig-like domain (group 3); InterPro: IPR022038 This family of proteins is found in bacteria. They have two conserved sequence motifs: AGN and GMT.
Probab=21.21 E-value=1.4e+02 Score=20.73 Aligned_cols=29 Identities=10% Similarity=0.123 Sum_probs=24.0
Q ss_pred CeEEEEecCCCCCceeEEEEEEEeCCCce
Q 028785 116 QTLDWIIESDVPTATYFVRAYALNAERHE 144 (204)
Q Consensus 116 gs~~~tl~~dvp~atYfvraya~da~g~~ 144 (204)
+.-...+|.+..++.|.+++.+.|..|-.
T Consensus 10 ~~~~~~~P~~~~dg~yt~~v~a~D~AGN~ 38 (60)
T PF12245_consen 10 GVWSTVIPENDADGEYTLTVTATDKAGNT 38 (60)
T ss_pred cceeccccCccCCccEEEEEEEEECCCCE
Confidence 45556778888889999999999999974
No 55
>PF01002 Flavi_NS2B: Flavivirus non-structural protein NS2B; InterPro: IPR000487 Flaviviruses encode a single polyprotein. This is cleaved into three structural and seven non-structural proteins. All, but two, are cleaved by the NS2B-NS3 protease complex [, ].; GO: 0004252 serine-type endopeptidase activity, 0019012 virion; PDB: 2WV9_A 2FOM_A 2VBC_B 3U1I_C 3U1J_A 3LKW_A 3L6P_A 2GGV_A 3E90_C 2IJO_A ....
Probab=20.60 E-value=50 Score=27.02 Aligned_cols=73 Identities=16% Similarity=0.225 Sum_probs=16.5
Q ss_pred CeEEEEecCCCCCceeEEEEEEEeCCCceEeecccCCCCcccceEEEEeeccccccceeeeeeeeeehhhhhhhhhhHHH
Q 028785 116 QTLDWIIESDVPTATYFVRAYALNAERHEVAYGQSTNDQKTTNLFDIQAITGRHASLDIASVCFSVFSIVALFGFFFHEK 195 (204)
Q Consensus 116 gs~~~tl~~dvp~atYfvraya~da~g~~vaYGqs~~~~~ttn~F~V~~itg~~~sL~ia~~~fS~FSvv~L~~ff~~Ek 195 (204)
|..+|.-+-++..+..-+++-. |++|.-.== +..+..-....|. ..+.++++++=..=|++++++++.||
T Consensus 55 g~i~W~~ea~~sG~s~rldV~~-d~~G~f~l~-~~~~~~~~~~~~~--------~~~l~~sa~~p~~Ip~~~~~w~~~~k 124 (128)
T PF01002_consen 55 GDISWEEEAEISGGSVRLDVKL-DDDGNFKLI-NEEGEPWAMVLFL--------TALLVASAFHPIAIPVVAAGWWLWEK 124 (128)
T ss_dssp E-S---TTHEEHSEEEEEEEEE--TTS-EEET-TSTTTTS----------------------------------------
T ss_pred eccccCccchhcCCceEEEEEE-CCCCCEEec-cCCCccHHHHHHH--------HHHHHHHhhhhHHHHHHHHHHHheec
Confidence 5566666666777777776655 778762211 1111111111121 22233333333444566777888885
Q ss_pred hHh
Q 028785 196 RKA 198 (204)
Q Consensus 196 rk~ 198 (204)
.++
T Consensus 125 ~~r 127 (128)
T PF01002_consen 125 SKR 127 (128)
T ss_dssp ---
T ss_pred ccc
Confidence 443
No 56
>PF12571 DUF3751: Phage tail-collar fibre protein; InterPro: IPR022225 This entry is represented by Bacteriophage HP1, Orf31. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. This family is found in bacteria and viruses, and is approximately 160 amino acids in length, some annotation suggests that it may be a tail fibre protein. There are two completely conserved residues (K and W) that may be functionally important.
Probab=20.52 E-value=4.6e+02 Score=21.41 Aligned_cols=37 Identities=22% Similarity=0.293 Sum_probs=28.1
Q ss_pred eEEEEecCCCCCceeEEEEEEE-eCCCceEeecccCCCCc
Q 028785 117 TLDWIIESDVPTATYFVRAYAL-NAERHEVAYGQSTNDQK 155 (204)
Q Consensus 117 s~~~tl~~dvp~atYfvraya~-da~g~~vaYGqs~~~~~ 155 (204)
..+..|+.|+ +-|++|-.++ |++|.-++|+-..+..|
T Consensus 75 ~~~~~i~~~~--ggf~irEiGL~d~~G~Liai~~~~~~~K 112 (159)
T PF12571_consen 75 VYSAVIPSDV--GGFTIREIGLFDEDGTLIAIANFPPTYK 112 (159)
T ss_pred EEEEEECCcc--CCcEEEEEEEEccCCCEEEEEecCCccc
Confidence 3555666664 6788888886 89999999998876543
No 57
>PF00801 PKD: PKD domain; InterPro: IPR000601 The PKD (Polycystic Kidney Disease) domain was first identified in the Polycystic Kidney Disease protein, polycystin-1 (PKD1 gene), and contains an Ig-like fold consisting of a beta-sandwich of seven strands in two sheets with a Greek key topology, although some members have additional strands []. Polycystin-1 is a large cell-surface glycoprotein involved in adhesive protein-protein and protein-carbohydrate interactions; however it is not clear if the PKD domain mediates any of these interactions. PKD domains are also found in other proteins, usually in the extracellular parts of proteins involved in interactions with other proteins. For example, domains with a PKD-type fold are found in archaeal surface layer proteins that protect the cell from extreme environments [], and in the human VPS10 domain-containing receptor SorCS2 [].; PDB: 1B4R_A 2KZW_A 2C4X_A 2C26_A 2Y72_B 3JQU_A 3JS7_B 1WGO_A 1L0Q_A.
Probab=20.22 E-value=2.9e+02 Score=18.67 Aligned_cols=28 Identities=18% Similarity=0.119 Sum_probs=21.4
Q ss_pred cCCCeEEEEecCCCCCceeEEEEEEEeCCCc
Q 028785 113 KSLQTLDWIIESDVPTATYFVRAYALNAERH 143 (204)
Q Consensus 113 ~~~gs~~~tl~~dvp~atYfvraya~da~g~ 143 (204)
..+.+++++-.. +|.|.|++-+-|+.|.
T Consensus 39 ~~~~~~t~ty~~---~G~y~V~ltv~n~~g~ 66 (69)
T PF00801_consen 39 STGSSVTHTYSS---PGTYTVTLTVTNGVGS 66 (69)
T ss_dssp ECSSEEEEEESS---SEEEEEEEEEEETTSE
T ss_pred ccCCCEEEEcCC---CeEEEEEEEEEECCCC
Confidence 334566666665 8999999999998886
No 58
>PF03381 CDC50: LEM3 (ligand-effect modulator 3) family / CDC50 family; InterPro: IPR005045 Members of this family have no known function. They have predicted transmembrane helices.; GO: 0016020 membrane
Probab=20.00 E-value=1e+02 Score=27.66 Aligned_cols=65 Identities=18% Similarity=0.233 Sum_probs=43.6
Q ss_pred cCCCCCceeEEEEEEEeCCCceEeecccCCCCcccceEEEEeec---cccccceeeeeeeeeehhhhhhhhhhHHHhH
Q 028785 123 ESDVPTATYFVRAYALNAERHEVAYGQSTNDQKTTNLFDIQAIT---GRHASLDIASVCFSVFSIVALFGFFFHEKRK 197 (204)
Q Consensus 123 ~~dvp~atYfvraya~da~g~~vaYGqs~~~~~ttn~F~V~~it---g~~~sL~ia~~~fS~FSvv~L~~ff~~Ekrk 197 (204)
..|+|.++|.|.+-.- |--+. -+.+..+-+...+ |+-.-|-++-.+.++++.+..++|+++-..+
T Consensus 203 ~~~L~~G~y~i~I~nn--------ypv~~--f~G~K~ivlst~s~~Ggkn~~Lgi~ylvvg~i~~v~~i~~~~~~~~~ 270 (278)
T PF03381_consen 203 NDDLPAGNYTIDITNN--------YPVSS--FGGKKSIVLSTTSWFGGKNYFLGIAYLVVGGICLVLAIIFLIIHYFK 270 (278)
T ss_pred cCCCCCceEEEEEEEe--------ecccc--cCcEEEEEEEeccccCccccHHHHHHHHHHHHHHHHHHHHHHHHHhC
Confidence 5889999999987543 21111 1223357777766 8888888888888888877777777665554
Done!