Query 040039
Match_columns 274
No_of_seqs 142 out of 1282
Neff 7.9
Searched_HMMs 46136
Date Fri Mar 29 04:29:13 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/040039.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/040039hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF01453 B_lectin: D-mannose b 100.0 1E-31 2.2E-36 209.5 5.8 102 36-137 1-114 (114)
2 cd00028 B_lectin Bulb-type man 99.9 2.3E-25 4.9E-30 174.3 13.2 99 1-111 17-116 (116)
3 smart00108 B_lectin Bulb-type 99.9 2.4E-24 5.1E-29 168.0 12.6 97 1-110 17-114 (114)
4 PF00954 S_locus_glycop: S-loc 99.9 2.2E-23 4.7E-28 161.6 9.0 103 164-273 1-107 (110)
5 smart00108 B_lectin Bulb-type 98.9 7.8E-09 1.7E-13 80.3 9.4 86 54-166 23-111 (114)
6 cd00028 B_lectin Bulb-type man 98.9 8.1E-09 1.8E-13 80.5 9.2 87 54-167 23-113 (116)
7 PF01453 B_lectin: D-mannose b 98.3 6.4E-06 1.4E-10 64.0 8.8 74 37-112 38-114 (114)
8 PF07974 EGF_2: EGF-like domai 94.9 0.021 4.6E-07 33.9 2.0 22 253-274 6-29 (32)
9 PF12661 hEGF: Human growth fa 94.0 0.0082 1.8E-07 28.2 -0.9 10 265-274 1-10 (13)
10 cd00053 EGF Epidermal growth f 92.4 0.097 2.1E-06 30.6 1.9 26 249-274 2-31 (36)
11 PF01683 EB: EB module; Inter 92.0 0.13 2.7E-06 33.8 2.2 30 244-273 17-46 (52)
12 cd00054 EGF_CA Calcium-binding 89.6 0.25 5.4E-06 29.2 1.8 28 247-274 3-34 (38)
13 smart00179 EGF_CA Calcium-bind 89.5 0.27 5.8E-06 29.5 1.9 27 247-273 3-33 (39)
14 PF07645 EGF_CA: Calcium-bindi 88.8 0.11 2.5E-06 32.5 -0.2 27 247-273 3-34 (42)
15 PF00008 EGF: EGF-like domain 88.4 0.082 1.8E-06 31.2 -1.0 21 254-274 5-30 (32)
16 PF12947 EGF_3: EGF domain; I 85.8 0.12 2.6E-06 31.5 -1.3 22 253-274 6-31 (36)
17 PF12662 cEGF: Complement Clr- 80.3 0.72 1.6E-05 25.4 0.5 9 265-273 3-11 (24)
18 smart00181 EGF Epidermal growt 78.6 1.5 3.2E-05 25.7 1.6 21 253-274 6-30 (35)
19 PHA02887 EGF-like protein; Pro 77.7 1.1 2.5E-05 34.5 1.1 27 247-274 84-118 (126)
20 cd05845 Ig2_L1-CAM_like Second 70.9 7.9 0.00017 28.9 4.2 34 34-67 31-64 (95)
21 PF09064 Tme5_EGF_like: Thromb 67.0 4 8.7E-05 24.4 1.4 10 264-273 18-27 (34)
22 KOG4289 Cadherin EGF LAG seven 66.4 2.4 5.3E-05 45.7 0.7 23 252-274 1244-1270(2531)
23 PF01436 NHL: NHL repeat; Int 66.3 12 0.00026 21.0 3.4 22 53-74 5-26 (28)
24 PF13360 PQQ_2: PQQ-like domai 63.0 21 0.00046 30.0 5.9 48 60-107 2-62 (238)
25 PF07354 Sp38: Zona-pellucida- 61.4 13 0.00028 33.1 4.2 34 34-67 10-43 (271)
26 PRK11138 outer membrane biogen 61.1 31 0.00068 32.0 7.1 19 89-107 342-361 (394)
27 PF13360 PQQ_2: PQQ-like domai 58.4 1.1E+02 0.0023 25.5 11.9 73 35-107 11-102 (238)
28 PRK11138 outer membrane biogen 57.7 34 0.00074 31.8 6.8 55 53-107 121-186 (394)
29 TIGR03300 assembly_YfgL outer 56.8 53 0.0012 30.1 7.9 70 37-106 40-130 (377)
30 KOG4649 PQQ (pyrrolo-quinoline 56.7 24 0.00053 31.6 5.1 45 36-80 168-218 (354)
31 PF11403 Yeast_MT: Yeast metal 56.4 8.6 0.00019 22.9 1.5 19 253-272 12-30 (40)
32 PHA03099 epidermal growth fact 56.3 5.5 0.00012 31.4 0.9 24 250-274 48-77 (139)
33 TIGR03066 Gem_osc_para_1 Gemma 54.5 42 0.0009 25.9 5.5 54 50-103 33-104 (111)
34 smart00051 DSL delta serrate l 53.5 11 0.00025 25.8 2.1 18 257-274 42-60 (63)
35 cd05852 Ig5_Contactin-1 Fifth 52.9 64 0.0014 22.2 6.0 34 34-68 13-46 (73)
36 KOG1214 Nidogen and related ba 48.7 9.8 0.00021 39.1 1.5 27 247-274 828-858 (1289)
37 PF12690 BsuPI: Intracellular 46.9 24 0.00052 25.4 3.0 16 62-77 27-42 (82)
38 cd00055 EGF_Lam Laminin-type e 42.2 18 0.00039 23.2 1.6 10 264-273 19-28 (50)
39 PF12946 EGF_MSP1_1: MSP1 EGF 39.1 3.7 8.1E-05 25.1 -1.9 22 253-274 5-31 (37)
40 PF06006 DUF905: Bacterial pro 38.3 35 0.00075 23.9 2.5 19 93-111 34-52 (70)
41 KOG0640 mRNA cleavage stimulat 37.2 1.7E+02 0.0037 27.0 7.4 68 39-106 250-333 (430)
42 PF13570 PQQ_3: PQQ-like domai 36.4 47 0.001 19.9 2.8 11 70-80 1-11 (40)
43 KOG1214 Nidogen and related ba 36.4 18 0.0004 37.2 1.3 57 216-274 749-819 (1289)
44 PF14670 FXa_inhibition: Coagu 36.1 8.7 0.00019 23.3 -0.6 10 264-273 19-28 (36)
45 cd00216 PQQ_DH Dehydrogenases 36.1 1.3E+02 0.0028 29.1 7.2 73 35-107 37-136 (488)
46 PLN00033 photosystem II stabil 33.5 1.7E+02 0.0036 27.8 7.2 46 61-106 259-306 (398)
47 PF05935 Arylsulfotrans: Aryls 32.7 48 0.001 32.1 3.6 52 60-112 127-186 (477)
48 TIGR03300 assembly_YfgL outer 31.9 1.6E+02 0.0035 26.9 6.9 19 89-107 327-346 (377)
49 smart00180 EGF_Lam Laminin-typ 31.4 32 0.00069 21.7 1.4 10 264-273 18-27 (46)
50 KOG4260 Uncharacterized conser 30.4 33 0.00072 30.7 1.8 21 254-274 151-178 (350)
51 KOG3881 Uncharacterized conser 29.1 2.6E+02 0.0057 26.4 7.4 76 36-113 199-279 (412)
52 PF10282 Lactonase: Lactonase, 27.7 3.7E+02 0.0081 24.4 8.4 65 17-97 249-331 (345)
53 PF05294 Toxin_5: Scorpion sho 27.7 15 0.00033 21.5 -0.5 15 254-268 18-32 (32)
54 smart00286 PTI Plant trypsin i 26.8 42 0.00091 19.2 1.2 18 255-273 10-28 (29)
55 smart00564 PQQ beta-propeller 26.5 1.2E+02 0.0027 16.7 3.5 17 59-75 14-31 (33)
56 PF06247 Plasmod_Pvs28: Plasmo 25.9 8.9 0.00019 32.3 -2.4 27 247-273 40-79 (197)
57 PF01011 PQQ: PQQ enzyme repea 25.8 95 0.0021 18.4 2.8 22 58-79 7-29 (38)
58 KOG1225 Teneurin-1 and related 25.8 46 0.001 32.7 2.1 20 255-274 287-306 (525)
59 cd00150 PlantTI Plant trypsin 25.5 44 0.00095 18.8 1.1 8 254-261 19-26 (27)
60 PF00053 Laminin_EGF: Laminin 24.7 21 0.00046 22.7 -0.3 10 264-273 18-27 (49)
61 KOG0291 WD40-repeat-containing 24.6 3.5E+02 0.0077 27.9 7.9 53 53-105 354-419 (893)
62 PF02237 BPL_C: Biotin protein 24.0 45 0.00098 21.2 1.2 15 87-101 21-35 (48)
63 cd05764 Ig_2 Subgroup of the i 23.7 1.3E+02 0.0029 20.1 3.7 34 34-67 13-46 (74)
64 KOG2106 Uncharacterized conser 23.0 3E+02 0.0064 27.1 6.8 68 53-120 250-329 (626)
65 KOG0278 Serine/threonine kinas 22.3 5.2E+02 0.011 23.2 7.6 53 53-106 157-211 (334)
66 PF05833 FbpA: Fibronectin-bin 22.2 79 0.0017 30.1 2.9 38 85-122 116-159 (455)
67 PF14870 PSII_BNR: Photosynthe 21.8 2.9E+02 0.0062 25.1 6.3 52 55-107 159-212 (302)
68 COG1520 FOG: WD40-like repeat 21.7 4.7E+02 0.01 23.9 8.0 73 36-108 130-226 (370)
69 KOG4234 TPR repeat-containing 21.0 40 0.00086 29.2 0.5 16 132-148 250-265 (271)
70 KOG4792 Crk family adapters [S 20.7 1.3E+02 0.0028 26.3 3.6 48 105-158 10-64 (293)
71 KOG1225 Teneurin-1 and related 20.7 60 0.0013 31.9 1.8 16 259-274 260-275 (525)
72 COG4787 FlgF Flagellar basal b 20.3 4.5E+02 0.0097 22.9 6.7 24 54-77 79-102 (251)
No 1
>PF01453 B_lectin: D-mannose binding lectin; InterPro: IPR001480 A bulb lectin super-family (Amaryllidaceae, Orchidaceae and Alliaceae) contains a ~115-residue-long domain whose overall three dimensional fold is very similar to that of [, ]: Dictyostelium discoideum comitin, an actin binding protein, and Curculigo latifolia curculin, a sweet tasting and taste-modifying protein. This domain generally binds mannose, but in at least one protein, curculin, it is apparently devoid of mannose-binding activity. Each bulb-type lectin domain consists of three sequential beta-sheet subdomains (I, II, III) that are inter-related by pseudo three-fold symmetry. The three subdomains are flat four-stranded, antiparallel beta-sheets. Together they form a 12-stranded beta-barrel in which the barrel axis coincides with the pseudo 3-fold axis.; GO: 0005529 sugar binding; PDB: 3M7H_A 3M7J_B 3MEZ_D 1DLP_A 1BWU_D 1KJ1_A 1B2P_A 1XD6_A 2DPF_C 2D04_B ....
Probab=99.97 E-value=1e-31 Score=209.45 Aligned_cols=102 Identities=54% Similarity=0.820 Sum_probs=75.2
Q ss_pred CCcEEEEcCCCCCCCC---CcEEEEecCCcEEEEcCCCCEEEee-cCCCCc--eeEEEEeeCCCeeEecCCCceEEeecC
Q 040039 36 FPQVVWSANRNNPVRI---NATLELTSDGNLVLQDADGAIAWST-NTSGKS--VVGLNLTDMGNLVLFDKNNAAVWQSFD 109 (274)
Q Consensus 36 ~~~vVWvANr~~Pv~~---~~~l~~~~~G~L~l~~~~~~~~Wss-~~~~~~--~~~~~l~d~GNlvl~~~~~~~lWqSFd 109 (274)
+++|||+|||++|+.. .++|.|+.||+|+|++..++++|++ ++.+.. ...|+|+|+|||||+|..+.+||||||
T Consensus 1 ~~tvvW~an~~~p~~~~s~~~~L~l~~dGnLvl~~~~~~~iWss~~t~~~~~~~~~~~L~~~GNlvl~d~~~~~lW~Sf~ 80 (114)
T PF01453_consen 1 PRTVVWVANRNSPLTSSSGNYTLILQSDGNLVLYDSNGSVIWSSNNTSGRGNSGCYLVLQDDGNLVLYDSSGNVLWQSFD 80 (114)
T ss_dssp ---------TTEEEEECETTEEEEEETTSEEEEEETTTEEEEE--S-TTSS-SSEEEEEETTSEEEEEETTSEEEEESTT
T ss_pred CcccccccccccccccccccccceECCCCeEEEEcCCCCEEEEecccCCccccCeEEEEeCCCCEEEEeecceEEEeecC
Confidence 3689999999999942 3899999999999999988899999 655543 688999999999999988999999999
Q ss_pred CCCCcccCCCcccCC------ceEeeecCCCCCC
Q 040039 110 HPTDSLVPGQKLLEG------KKLTASVSTTNWT 137 (274)
Q Consensus 110 ~PtDTlLpgq~l~~~------~~L~Sw~s~~dps 137 (274)
|||||+||+|+|+.+ ..|+||++.+|||
T Consensus 81 ~ptdt~L~~q~l~~~~~~~~~~~~~sw~s~~dps 114 (114)
T PF01453_consen 81 YPTDTLLPGQKLGDGNVTGKNDSLTSWSSNTDPS 114 (114)
T ss_dssp SSS-EEEEEET--TSEEEEESTSSEEEESS----
T ss_pred CCccEEEeccCcccCCCccccceEEeECCCCCCC
Confidence 999999999999863 3599999999986
No 2
>cd00028 B_lectin Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses.
Probab=99.93 E-value=2.3e-25 Score=174.30 Aligned_cols=99 Identities=41% Similarity=0.650 Sum_probs=87.0
Q ss_pred CeeccccCCCCCceEEEEEEeeeccccccccccCCCCcEEEEcCCCCCCCCCcEEEEecCCcEEEEcCCCCEEEeecCCC
Q 040039 1 YACGFFCNGTCDSYLFAVFIVQAYNASLIDYQHIEFPQVVWSANRNNPVRINATLELTSDGNLVLQDADGAIAWSTNTSG 80 (274)
Q Consensus 1 F~lGFf~~~~~~~~~l~iw~~~~~~~~~~~~~~~~~~~vVWvANr~~Pv~~~~~l~~~~~G~L~l~~~~~~~~Wss~~~~ 80 (274)
|++|||++......+++|||.. .+ .++||+|||+.|....+.|.|+.||+|+|+|.++.++|++++.+
T Consensus 17 f~~G~~~~~~q~~dgnlv~~~~-----------~~-~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S~~~~ 84 (116)
T cd00028 17 FELGFFKLIMQSRDYNLILYKG-----------SS-RTVVWVANRDNPSGSSCTLTLQSDGNLVIYDGSGTVVWSSNTTR 84 (116)
T ss_pred EEEecccCCCCCCeEEEEEEeC-----------CC-CeEEEECCCCCCCCCCEEEEEecCCCeEEEcCCCcEEEEecccC
Confidence 7899999875323999999974 33 68999999999965568999999999999999999999999876
Q ss_pred -CceeEEEEeeCCCeeEecCCCceEEeecCCC
Q 040039 81 -KSVVGLNLTDMGNLVLFDKNNAAVWQSFDHP 111 (274)
Q Consensus 81 -~~~~~~~l~d~GNlvl~~~~~~~lWqSFd~P 111 (274)
.....++|+|+|||||++.++.+||||||||
T Consensus 85 ~~~~~~~~L~ddGnlvl~~~~~~~~W~Sf~~P 116 (116)
T cd00028 85 VNGNYVLVLLDDGNLVLYDSDGNFLWQSFDYP 116 (116)
T ss_pred CCCceEEEEeCCCCEEEECCCCCEEEcCCCCC
Confidence 5567889999999999999999999999998
No 3
>smart00108 B_lectin Bulb-type mannose-specific lectin.
Probab=99.92 E-value=2.4e-24 Score=168.01 Aligned_cols=97 Identities=43% Similarity=0.689 Sum_probs=86.7
Q ss_pred CeeccccCCCCCceEEEEEEeeeccccccccccCCCCcEEEEcCCCCCCCCCcEEEEecCCcEEEEcCCCCEEEeecCC-
Q 040039 1 YACGFFCNGTCDSYLFAVFIVQAYNASLIDYQHIEFPQVVWSANRNNPVRINATLELTSDGNLVLQDADGAIAWSTNTS- 79 (274)
Q Consensus 1 F~lGFf~~~~~~~~~l~iw~~~~~~~~~~~~~~~~~~~vVWvANr~~Pv~~~~~l~~~~~G~L~l~~~~~~~~Wss~~~- 79 (274)
|++|||++.. ...+++|||.. .+ .++||+|||+.|+..++.|.|++||+|+|++.++.++|++++.
T Consensus 17 f~~G~~~~~~-q~dgnlV~~~~-----------~~-~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S~t~~ 83 (114)
T smart00108 17 FELGFFTLIM-QNDYNLILYKS-----------SS-RTVVWVANRDNPVSDSCTLTLQSDGNLVLYDGDGRVVWSSNTTG 83 (114)
T ss_pred EeeeccccCC-CCCEEEEEEEC-----------CC-CcEEEECCCCCCCCCCEEEEEeCCCCEEEEeCCCCEEEEecccC
Confidence 6899999865 57999999974 34 7899999999998877899999999999999989999999986
Q ss_pred CCceeEEEEeeCCCeeEecCCCceEEeecCC
Q 040039 80 GKSVVGLNLTDMGNLVLFDKNNAAVWQSFDH 110 (274)
Q Consensus 80 ~~~~~~~~l~d~GNlvl~~~~~~~lWqSFd~ 110 (274)
+.+...++|+|+|||||++..+.+|||||||
T Consensus 84 ~~~~~~~~L~ddGnlvl~~~~~~~~W~Sf~~ 114 (114)
T smart00108 84 ANGNYVLVLLDDGNLVIYDSDGNFLWQSFDY 114 (114)
T ss_pred CCCceEEEEeCCCCEEEECCCCCEEeCCCCC
Confidence 5566788999999999999999999999997
No 4
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles []. Most of the proteins within this family contain apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=99.89 E-value=2.2e-23 Score=161.60 Aligned_cols=103 Identities=20% Similarity=0.311 Sum_probs=79.3
Q ss_pred EecccCCCCcCCCcceeeeecCceEEEeecCCCCCceEEEEecCCCCCCeEEEEEcCCCCeEEEEEc-CCCCeEEeeecc
Q 040039 164 YYALVKATKTSKEPSHARYLNGSLAFFINSSEPREPDGAVPVPPASSSPGQYMRLWPDGHLRVYEWQ-ASIGWTEVADLL 242 (274)
Q Consensus 164 w~sg~~~~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~rl~Ld~dG~lr~y~w~-~~~~W~~~~~~~ 242 (274)
||+|+|++..+.+.+.+.. ..+..+....+..+.+++|.+.+.+. ++|++||++|++++|.|+ +.++|.++ |
T Consensus 1 wrsG~WnG~~f~g~p~~~~--~~~~~~~fv~~~~e~~~t~~~~~~s~--~~r~~ld~~G~l~~~~w~~~~~~W~~~---~ 73 (110)
T PF00954_consen 1 WRSGPWNGQRFSGIPEMSS--NSLYNYSFVSNNEEVYYTYSLSNSSV--LSRLVLDSDGQLQRYIWNESTQSWSVF---W 73 (110)
T ss_pred CCccccCCeEECCcccccc--cceeEEEEEECCCeEEEEEecCCCce--EEEEEEeeeeEEEEEEEecCCCcEEEE---E
Confidence 8999999976544344331 11222222224556677777665554 999999999999999999 89999997 7
Q ss_pred cccCCCCCCCcCCCCCCccCC---CCccCCCCCC
Q 040039 243 TGYLGECGYPLVCGKYGICSQ---GQCSCPATYF 273 (274)
Q Consensus 243 ~~p~d~C~~y~~CG~~giC~~---~~C~Cl~gf~ 273 (274)
.+|.|+||+|++||+||+|+. ++|+|||||.
T Consensus 74 ~~p~d~Cd~y~~CG~~g~C~~~~~~~C~Cl~GF~ 107 (110)
T PF00954_consen 74 SAPKDQCDVYGFCGPNGICNSNNSPKCSCLPGFE 107 (110)
T ss_pred EecccCCCCccccCCccEeCCCCCCceECCCCcC
Confidence 789999999999999999983 7899999996
No 5
>smart00108 B_lectin Bulb-type mannose-specific lectin.
Probab=98.93 E-value=7.8e-09 Score=80.32 Aligned_cols=86 Identities=27% Similarity=0.505 Sum_probs=65.3
Q ss_pred EEEEecCCcEEEEcCC-CCEEEeecCCCC--ceeEEEEeeCCCeeEecCCCceEEeecCCCCCcccCCCcccCCceEeee
Q 040039 54 TLELTSDGNLVLQDAD-GAIAWSTNTSGK--SVVGLNLTDMGNLVLFDKNNAAVWQSFDHPTDSLVPGQKLLEGKKLTAS 130 (274)
Q Consensus 54 ~l~~~~~G~L~l~~~~-~~~~Wss~~~~~--~~~~~~l~d~GNlvl~~~~~~~lWqSFd~PtDTlLpgq~l~~~~~L~Sw 130 (274)
++.++.||+||+++.. ..++|++++... ....+.|+++|||||++.++.++|+|-.. .
T Consensus 23 ~~~~q~dgnlV~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S~t~------~------------- 83 (114)
T smart00108 23 TLIMQNDYNLILYKSSSRTVVWVANRDNPVSDSCTLTLQSDGNLVLYDGDGRVVWSSNTT------G------------- 83 (114)
T ss_pred ccCCCCCEEEEEEECCCCcEEEECCCCCCCCCCEEEEEeCCCCEEEEeCCCCEEEEeccc------C-------------
Confidence 5567789999999864 479999998532 22678999999999999889999998211 1
Q ss_pred cCCCCCCCCcceEEEecCCCceEEEecCCCeEEEec
Q 040039 131 VSTTNWTDGGLFSLSVTNEGLFAFIESNNTSIRYYA 166 (274)
Q Consensus 131 ~s~~dps~~G~ysl~~d~~g~~~~~~~~~~~~Yw~s 166 (274)
.. |.|.+.|+.+|...++.. ..++.|.+
T Consensus 84 ------~~-~~~~~~L~ddGnlvl~~~-~~~~~W~S 111 (114)
T smart00108 84 ------AN-GNYVLVLLDDGNLVIYDS-DGNFLWQS 111 (114)
T ss_pred ------CC-CceEEEEeCCCCEEEECC-CCCEEeCC
Confidence 13 788999999998877643 34677865
No 6
>cd00028 B_lectin Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses.
Probab=98.91 E-value=8.1e-09 Score=80.50 Aligned_cols=87 Identities=26% Similarity=0.509 Sum_probs=66.2
Q ss_pred EEEEec-CCcEEEEcCC-CCEEEeecCCC--CceeEEEEeeCCCeeEecCCCceEEeecCCCCCcccCCCcccCCceEee
Q 040039 54 TLELTS-DGNLVLQDAD-GAIAWSTNTSG--KSVVGLNLTDMGNLVLFDKNNAAVWQSFDHPTDSLVPGQKLLEGKKLTA 129 (274)
Q Consensus 54 ~l~~~~-~G~L~l~~~~-~~~~Wss~~~~--~~~~~~~l~d~GNlvl~~~~~~~lWqSFd~PtDTlLpgq~l~~~~~L~S 129 (274)
.+.++. ||+|++++.. ..++|++++.. ...+.+.|+++|||||+|.++.++|+|-..
T Consensus 23 ~~~~q~~dgnlv~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S~~~------------------- 83 (116)
T cd00028 23 KLIMQSRDYNLILYKGSSRTVVWVANRDNPSGSSCTLTLQSDGNLVIYDGSGTVVWSSNTT------------------- 83 (116)
T ss_pred cCCCCCCeEEEEEEeCCCCeEEEECCCCCCCCCCEEEEEecCCCeEEEcCCCcEEEEeccc-------------------
Confidence 455676 9999999754 47999999864 345678999999999999999999998311
Q ss_pred ecCCCCCCCCcceEEEecCCCceEEEecCCCeEEEecc
Q 040039 130 SVSTTNWTDGGLFSLSVTNEGLFAFIESNNTSIRYYAL 167 (274)
Q Consensus 130 w~s~~dps~~G~ysl~~d~~g~~~~~~~~~~~~Yw~sg 167 (274)
. .. +.+.+.|+.+|...++..+ .++.|.+.
T Consensus 84 -----~-~~-~~~~~~L~ddGnlvl~~~~-~~~~W~Sf 113 (116)
T cd00028 84 -----R-VN-GNYVLVLLDDGNLVLYDSD-GNFLWQSF 113 (116)
T ss_pred -----C-CC-CceEEEEeCCCCEEEECCC-CCEEEcCC
Confidence 0 13 7889999999988776433 46778764
No 7
>PF01453 B_lectin: D-mannose binding lectin; InterPro: IPR001480 A bulb lectin super-family (Amaryllidaceae, Orchidaceae and Alliaceae) contains a ~115-residue-long domain whose overall three dimensional fold is very similar to that of [, ]: Dictyostelium discoideum comitin, an actin binding protein, and Curculigo latifolia curculin, a sweet tasting and taste-modifying protein. This domain generally binds mannose, but in at least one protein, curculin, it is apparently devoid of mannose-binding activity. Each bulb-type lectin domain consists of three sequential beta-sheet subdomains (I, II, III) that are inter-related by pseudo three-fold symmetry. The three subdomains are flat four-stranded, antiparallel beta-sheets. Together they form a 12-stranded beta-barrel in which the barrel axis coincides with the pseudo 3-fold axis.; GO: 0005529 sugar binding; PDB: 3M7H_A 3M7J_B 3MEZ_D 1DLP_A 1BWU_D 1KJ1_A 1B2P_A 1XD6_A 2DPF_C 2D04_B ....
Probab=98.26 E-value=6.4e-06 Score=64.03 Aligned_cols=74 Identities=28% Similarity=0.443 Sum_probs=51.4
Q ss_pred CcEEEEc-CCCCCCCCCcEEEEecCCcEEEEcCCCCEEEeecCCCCceeEEEEee--CCCeeEecCCCceEEeecCCCC
Q 040039 37 PQVVWSA-NRNNPVRINATLELTSDGNLVLQDADGAIAWSTNTSGKSVVGLNLTD--MGNLVLFDKNNAAVWQSFDHPT 112 (274)
Q Consensus 37 ~~vVWvA-Nr~~Pv~~~~~l~~~~~G~L~l~~~~~~~~Wss~~~~~~~~~~~l~d--~GNlvl~~~~~~~lWqSFd~Pt 112 (274)
..+||.. +........+.+.|+++|||||+|..+.++|++.. ......+.+++ .||++ +.....++|.|-+.|+
T Consensus 38 ~~~iWss~~t~~~~~~~~~~~L~~~GNlvl~d~~~~~lW~Sf~-~ptdt~L~~q~l~~~~~~-~~~~~~~sw~s~~dps 114 (114)
T PF01453_consen 38 GSVIWSSNNTSGRGNSGCYLVLQDDGNLVLYDSSGNVLWQSFD-YPTDTLLPGQKLGDGNVT-GKNDSLTSWSSNTDPS 114 (114)
T ss_dssp TEEEEE--S-TTSS-SSEEEEEETTSEEEEEETTSEEEEESTT-SSS-EEEEEET--TSEEE-EESTSSEEEESS----
T ss_pred CCEEEEecccCCccccCeEEEEeCCCCEEEEeecceEEEeecC-CCccEEEeccCcccCCCc-cccceEEeECCCCCCC
Confidence 5679999 43433334589999999999999998999999943 33445566777 88888 7666789999977763
No 8
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown [, , , , ] to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins
Probab=94.86 E-value=0.021 Score=33.86 Aligned_cols=22 Identities=32% Similarity=0.859 Sum_probs=19.4
Q ss_pred cCCCCCCccC--CCCccCCCCCCC
Q 040039 253 LVCGKYGICS--QGQCSCPATYFK 274 (274)
Q Consensus 253 ~~CG~~giC~--~~~C~Cl~gf~~ 274 (274)
.+|..+|+|+ ..+|.|.+||+|
T Consensus 6 ~~C~~~G~C~~~~g~C~C~~g~~G 29 (32)
T PF07974_consen 6 NICSGHGTCVSPCGRCVCDSGYTG 29 (32)
T ss_pred CccCCCCEEeCCCCEEECCCCCcC
Confidence 5799999998 379999999986
No 9
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=93.98 E-value=0.0082 Score=28.23 Aligned_cols=10 Identities=30% Similarity=1.026 Sum_probs=7.8
Q ss_pred CccCCCCCCC
Q 040039 265 QCSCPATYFK 274 (274)
Q Consensus 265 ~C~Cl~gf~~ 274 (274)
+|.|++||+|
T Consensus 1 ~C~C~~G~~G 10 (13)
T PF12661_consen 1 TCQCPPGWTG 10 (13)
T ss_dssp EEEE-TTEET
T ss_pred CccCcCCCcC
Confidence 4999999986
No 10
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; Subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=92.40 E-value=0.097 Score=30.55 Aligned_cols=26 Identities=31% Similarity=0.796 Sum_probs=20.1
Q ss_pred CCCCcCCCCCCccCC----CCccCCCCCCC
Q 040039 249 CGYPLVCGKYGICSQ----GQCSCPATYFK 274 (274)
Q Consensus 249 C~~y~~CG~~giC~~----~~C~Cl~gf~~ 274 (274)
|.....|..++.|.. ..|.|++||.+
T Consensus 2 C~~~~~C~~~~~C~~~~~~~~C~C~~g~~g 31 (36)
T cd00053 2 CAASNPCSNGGTCVNTPGSYRCVCPPGYTG 31 (36)
T ss_pred CCCCCCCCCCCEEecCCCCeEeECCCCCcc
Confidence 443567888899973 67999999975
No 11
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=92.01 E-value=0.13 Score=33.83 Aligned_cols=30 Identities=27% Similarity=0.637 Sum_probs=26.4
Q ss_pred ccCCCCCCCcCCCCCCccCCCCccCCCCCC
Q 040039 244 GYLGECGYPLVCGKYGICSQGQCSCPATYF 273 (274)
Q Consensus 244 ~p~d~C~~y~~CG~~giC~~~~C~Cl~gf~ 273 (274)
.+.+.|....-|-.++.|....|.|++||+
T Consensus 17 ~~g~~C~~~~qC~~~s~C~~g~C~C~~g~~ 46 (52)
T PF01683_consen 17 QPGESCESDEQCIGGSVCVNGRCQCPPGYV 46 (52)
T ss_pred CCCCCCCCcCCCCCcCEEcCCEeECCCCCE
Confidence 355789999999999999889999999985
No 12
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=89.65 E-value=0.25 Score=29.21 Aligned_cols=28 Identities=36% Similarity=0.784 Sum_probs=21.5
Q ss_pred CCCCCCcCCCCCCccCC----CCccCCCCCCC
Q 040039 247 GECGYPLVCGKYGICSQ----GQCSCPATYFK 274 (274)
Q Consensus 247 d~C~~y~~CG~~giC~~----~~C~Cl~gf~~ 274 (274)
++|.....|...+.|.. ..|.|++||.|
T Consensus 3 ~~C~~~~~C~~~~~C~~~~~~~~C~C~~g~~g 34 (38)
T cd00054 3 DECASGNPCQNGGTCVNTVGSYRCSCPPGYTG 34 (38)
T ss_pred ccCCCCCCcCCCCEeECCCCCeEeECCCCCcC
Confidence 56765457888889973 46999999975
No 13
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=89.50 E-value=0.27 Score=29.52 Aligned_cols=27 Identities=33% Similarity=0.802 Sum_probs=21.2
Q ss_pred CCCCCCcCCCCCCccCC----CCccCCCCCC
Q 040039 247 GECGYPLVCGKYGICSQ----GQCSCPATYF 273 (274)
Q Consensus 247 d~C~~y~~CG~~giC~~----~~C~Cl~gf~ 273 (274)
++|.....|...+.|.. -.|.|++||.
T Consensus 3 ~~C~~~~~C~~~~~C~~~~g~~~C~C~~g~~ 33 (39)
T smart00179 3 DECASGNPCQNGGTCVNTVGSYRCECPPGYT 33 (39)
T ss_pred ccCcCCCCcCCCCEeECCCCCeEeECCCCCc
Confidence 56765567888889973 4699999997
No 14
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=88.82 E-value=0.11 Score=32.51 Aligned_cols=27 Identities=37% Similarity=0.818 Sum_probs=22.3
Q ss_pred CCCCCC-cCCCCCCccCC----CCccCCCCCC
Q 040039 247 GECGYP-LVCGKYGICSQ----GQCSCPATYF 273 (274)
Q Consensus 247 d~C~~y-~~CG~~giC~~----~~C~Cl~gf~ 273 (274)
|+|... ..|..++.|.+ -.|.|++||.
T Consensus 3 dEC~~~~~~C~~~~~C~N~~Gsy~C~C~~Gy~ 34 (42)
T PF07645_consen 3 DECAEGPHNCPENGTCVNTEGSYSCSCPPGYE 34 (42)
T ss_dssp STTTTTSSSSSTTSEEEEETTEEEEEESTTEE
T ss_pred cccCCCCCcCCCCCEEEcCCCCEEeeCCCCcE
Confidence 678875 48999999973 5799999996
No 15
>PF00008 EGF: EGF-like domain This is a sub-family of the Pfam entry; InterPro: IPR006209 A sequence of about thirty to forty amino-acid residues long found in the sequence of epidermal growth factor (EGF) has been shown [, , , , ] to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.; GO: 0005515 protein binding; PDB: 1WHE_A 1CCF_A 1APO_A 1WHF_A 2VJ3_A 1TOZ_A 4D90_B 3CFW_A 1EDM_B 1IXA_A ....
Probab=88.40 E-value=0.082 Score=31.22 Aligned_cols=21 Identities=33% Similarity=0.835 Sum_probs=16.8
Q ss_pred CCCCCCccCC-----CCccCCCCCCC
Q 040039 254 VCGKYGICSQ-----GQCSCPATYFK 274 (274)
Q Consensus 254 ~CG~~giC~~-----~~C~Cl~gf~~ 274 (274)
.|...|.|.. .+|.|++||+|
T Consensus 5 ~C~n~g~C~~~~~~~y~C~C~~G~~G 30 (32)
T PF00008_consen 5 PCQNGGTCIDLPGGGYTCECPPGYTG 30 (32)
T ss_dssp SSTTTEEEEEESTSEEEEEEBTTEES
T ss_pred cCCCCeEEEeCCCCCEEeECCCCCcc
Confidence 6777888862 57999999986
No 16
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=85.79 E-value=0.12 Score=31.46 Aligned_cols=22 Identities=23% Similarity=0.545 Sum_probs=16.2
Q ss_pred cCCCCCCccCC----CCccCCCCCCC
Q 040039 253 LVCGKYGICSQ----GQCSCPATYFK 274 (274)
Q Consensus 253 ~~CG~~giC~~----~~C~Cl~gf~~ 274 (274)
+-|-++..|.. -.|+|.+||+|
T Consensus 6 ~~C~~nA~C~~~~~~~~C~C~~Gy~G 31 (36)
T PF12947_consen 6 GGCHPNATCTNTGGSYTCTCKPGYEG 31 (36)
T ss_dssp GGS-TTCEEEE-TTSEEEEE-CEEEC
T ss_pred CCCCCCcEeecCCCCEEeECCCCCcc
Confidence 56889999972 56999999986
No 17
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=80.32 E-value=0.72 Score=25.44 Aligned_cols=9 Identities=56% Similarity=1.376 Sum_probs=8.1
Q ss_pred CccCCCCCC
Q 040039 265 QCSCPATYF 273 (274)
Q Consensus 265 ~C~Cl~gf~ 273 (274)
.|+|++||.
T Consensus 3 ~C~C~~Gy~ 11 (24)
T PF12662_consen 3 TCSCPPGYQ 11 (24)
T ss_pred EeeCCCCCc
Confidence 599999996
No 18
>smart00181 EGF Epidermal growth factor-like domain.
Probab=78.59 E-value=1.5 Score=25.68 Aligned_cols=21 Identities=33% Similarity=0.711 Sum_probs=16.0
Q ss_pred cCCCCCCccCC----CCccCCCCCCC
Q 040039 253 LVCGKYGICSQ----GQCSCPATYFK 274 (274)
Q Consensus 253 ~~CG~~giC~~----~~C~Cl~gf~~ 274 (274)
..|... .|.. ..|.|++||.+
T Consensus 6 ~~C~~~-~C~~~~~~~~C~C~~g~~g 30 (35)
T smart00181 6 GPCSNG-TCINTPGSYTCSCPPGYTG 30 (35)
T ss_pred CCCCCC-EEECCCCCeEeECCCCCcc
Confidence 456776 7862 67999999975
No 19
>PHA02887 EGF-like protein; Provisional
Probab=77.68 E-value=1.1 Score=34.51 Aligned_cols=27 Identities=26% Similarity=0.562 Sum_probs=22.0
Q ss_pred CCCCC--CcCCCCCCccC------CCCccCCCCCCC
Q 040039 247 GECGY--PLVCGKYGICS------QGQCSCPATYFK 274 (274)
Q Consensus 247 d~C~~--y~~CG~~giC~------~~~C~Cl~gf~~ 274 (274)
++|.- .+.|= +|.|. .+.|.|.+||+|
T Consensus 84 ~pC~~eyk~YCi-HG~C~yI~dL~epsCrC~~GYtG 118 (126)
T PHA02887 84 EKCKNDFNDFCI-NGECMNIIDLDEKFCICNKGYTG 118 (126)
T ss_pred cccChHhhCEee-CCEEEccccCCCceeECCCCccc
Confidence 67864 57888 79997 188999999997
No 20
>cd05845 Ig2_L1-CAM_like Second immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM) and similar proteins. Ig2_L1-CAM_like: domain similar to the second immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). L1 belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains, five fibronectin type III domains, a transmembrane region and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, or spastic paraplegia type 1, that involves abnormalities of axonal growth.
Probab=70.90 E-value=7.9 Score=28.87 Aligned_cols=34 Identities=12% Similarity=0.267 Sum_probs=23.8
Q ss_pred CCCCcEEEEcCCCCCCCCCcEEEEecCCcEEEEc
Q 040039 34 IEFPQVVWSANRNNPVRINATLELTSDGNLVLQD 67 (274)
Q Consensus 34 ~~~~~vVWvANr~~Pv~~~~~l~~~~~G~L~l~~ 67 (274)
.|..++.|+-+....+.....++++.+|+|.+.+
T Consensus 31 ~P~P~i~W~~~~~~~i~~~~Ri~~~~~GnL~fs~ 64 (95)
T cd05845 31 AVPLRIYWMNSDLLHITQDERVSMGQNGNLYFAN 64 (95)
T ss_pred CCCCEEEEECCCCccccccccEEECCCceEEEEE
Confidence 5677888995544445545677888888888764
No 21
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C []. ; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=66.99 E-value=4 Score=24.40 Aligned_cols=10 Identities=60% Similarity=1.491 Sum_probs=8.6
Q ss_pred CCccCCCCCC
Q 040039 264 GQCSCPATYF 273 (274)
Q Consensus 264 ~~C~Cl~gf~ 273 (274)
.+|.||.||-
T Consensus 18 ~~C~CPeGyI 27 (34)
T PF09064_consen 18 GQCFCPEGYI 27 (34)
T ss_pred CceeCCCceE
Confidence 5899999984
No 22
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=66.39 E-value=2.4 Score=45.69 Aligned_cols=23 Identities=26% Similarity=0.677 Sum_probs=19.1
Q ss_pred CcCCCCCCccCC----CCccCCCCCCC
Q 040039 252 PLVCGKYGICSQ----GQCSCPATYFK 274 (274)
Q Consensus 252 y~~CG~~giC~~----~~C~Cl~gf~~ 274 (274)
-+.||++|-|.. -+|+|-|||+|
T Consensus 1244 s~pC~nng~C~srEggYtCeCrpg~tG 1270 (2531)
T KOG4289|consen 1244 SGPCGNNGRCRSREGGYTCECRPGFTG 1270 (2531)
T ss_pred cCCCCCCCceEEecCceeEEecCCccc
Confidence 478999999973 57999999986
No 23
>PF01436 NHL: NHL repeat; InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C-terminal alpha-amidation of biological peptides []. In many it occurs in tandem arrays, for example in the ringfinger beta-box, coiled-coil (RBCC) eukaryotic growth regulators []. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller [, ]. The NHL repeats are also found in serine/threonine protein kinase (STPK) in a diverse range of pathogenic bacteria. These STPK are transmembrane receptors with an intracellular N-terminal kinase domain and an extracellular C-terminal sensor domain. In the STPK, PknD, from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; GO: 0005515 protein binding; PDB: 3FVZ_A 3FW0_A 1RWL_A 1RWI_A 1Q7F_A.
Probab=66.25 E-value=12 Score=20.98 Aligned_cols=22 Identities=23% Similarity=0.347 Sum_probs=15.5
Q ss_pred cEEEEecCCcEEEEcCCCCEEE
Q 040039 53 ATLELTSDGNLVLQDADGAIAW 74 (274)
Q Consensus 53 ~~l~~~~~G~L~l~~~~~~~~W 74 (274)
.-+.++.+|+|++.|..+.-||
T Consensus 5 ~gvav~~~g~i~VaD~~n~rV~ 26 (28)
T PF01436_consen 5 HGVAVDSDGNIYVADSGNHRVQ 26 (28)
T ss_dssp EEEEEETTSEEEEEECCCTEEE
T ss_pred cEEEEeCCCCEEEEECCCCEEE
Confidence 3466778888888886665554
No 24
>PF13360 PQQ_2: PQQ-like domain; PDB: 3HXJ_B 1YIQ_A 1KV9_A 3Q54_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A ....
Probab=62.95 E-value=21 Score=29.98 Aligned_cols=48 Identities=29% Similarity=0.542 Sum_probs=25.4
Q ss_pred CCcEEEEcC-CCCEEEeecCC---CCce--e-----EEEE-eeCCCeeEecC-CCceEEee
Q 040039 60 DGNLVLQDA-DGAIAWSTNTS---GKSV--V-----GLNL-TDMGNLVLFDK-NNAAVWQS 107 (274)
Q Consensus 60 ~G~L~l~~~-~~~~~Wss~~~---~~~~--~-----~~~l-~d~GNlvl~~~-~~~~lWqS 107 (274)
+|.|..+|. +|+.+|+.... ...+ + .+.+ ..+|+|+..|. +|+++|+-
T Consensus 2 ~g~l~~~d~~tG~~~W~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~~d~~tG~~~W~~ 62 (238)
T PF13360_consen 2 DGTLSALDPRTGKELWSYDLGPGIGGPVATAVPDGGRVYVASGDGNLYALDAKTGKVLWRF 62 (238)
T ss_dssp TSEEEEEETTTTEEEEEEECSSSCSSEEETEEEETTEEEEEETTSEEEEEETTTSEEEEEE
T ss_pred CCEEEEEECCCCCEEEEEECCCCCCCccceEEEeCCEEEEEcCCCEEEEEECCCCCEEEEe
Confidence 566777775 67777777541 1111 0 0111 25555556663 56777764
No 25
>PF07354 Sp38: Zona-pellucida-binding protein (Sp38); InterPro: IPR010857 This family contains a number of zona-pellucida-binding proteins that seem to be restricted to mammals. These are sperm proteins that bind to the 90 kDa family of zona pellucida glycoproteins in a calcium-dependent manner []. These represent some of the specific molecules that mediate the first steps of gamete interaction, allowing fertilisation to occur [].; GO: 0007339 binding of sperm to zona pellucida, 0005576 extracellular region
Probab=61.35 E-value=13 Score=33.07 Aligned_cols=34 Identities=21% Similarity=0.543 Sum_probs=30.6
Q ss_pred CCCCcEEEEcCCCCCCCCCcEEEEecCCcEEEEc
Q 040039 34 IEFPQVVWSANRNNPVRINATLELTSDGNLVLQD 67 (274)
Q Consensus 34 ~~~~~vVWvANr~~Pv~~~~~l~~~~~G~L~l~~ 67 (274)
+-.++..|+--.++++++++.+.|++.|.|++.|
T Consensus 10 ~iDP~y~W~GP~g~~l~gn~~~nIT~TG~L~~~~ 43 (271)
T PF07354_consen 10 LIDPTYLWTGPNGKPLSGNSYVNITETGKLMFKN 43 (271)
T ss_pred cCCCceEEECCCCcccCCCCeEEEccCceEEeec
Confidence 4567889999999999999999999999999986
No 26
>PRK11138 outer membrane biogenesis protein BamB; Provisional
Probab=61.08 E-value=31 Score=32.04 Aligned_cols=19 Identities=21% Similarity=0.235 Sum_probs=10.9
Q ss_pred eeCCCeeEecC-CCceEEee
Q 040039 89 TDMGNLVLFDK-NNAAVWQS 107 (274)
Q Consensus 89 ~d~GNlvl~~~-~~~~lWqS 107 (274)
.++|.|...|. +++++|+-
T Consensus 342 ~~~G~l~~ld~~tG~~~~~~ 361 (394)
T PRK11138 342 DSEGYLHWINREDGRFVAQQ 361 (394)
T ss_pred eCCCEEEEEECCCCCEEEEE
Confidence 45566665553 36666664
No 27
>PF13360 PQQ_2: PQQ-like domain; PDB: 3HXJ_B 1YIQ_A 1KV9_A 3Q54_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A ....
Probab=58.44 E-value=1.1e+02 Score=25.54 Aligned_cols=73 Identities=26% Similarity=0.481 Sum_probs=44.3
Q ss_pred CCCcEEEEcCC----CCCC----CCCcEEEE-ecCCcEEEEcC-CCCEEEeecCCCC---c-e---eEEEEe-eCCCeeE
Q 040039 35 EFPQVVWSANR----NNPV----RINATLEL-TSDGNLVLQDA-DGAIAWSTNTSGK---S-V---VGLNLT-DMGNLVL 96 (274)
Q Consensus 35 ~~~~vVWvANr----~~Pv----~~~~~l~~-~~~G~L~l~~~-~~~~~Wss~~~~~---~-~---~~~~l~-d~GNlvl 96 (274)
.....+|..+- ..++ .+...+.+ +.+|.|+.+|. +|+.+|+...... . . ..+.+. .+|.|+.
T Consensus 11 ~tG~~~W~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~~d~~tG~~~W~~~~~~~~~~~~~~~~~~v~v~~~~~~l~~ 90 (238)
T PF13360_consen 11 RTGKELWSYDLGPGIGGPVATAVPDGGRVYVASGDGNLYALDAKTGKVLWRFDLPGPISGAPVVDGGRVYVGTSDGSLYA 90 (238)
T ss_dssp TTTEEEEEEECSSSCSSEEETEEEETTEEEEEETTSEEEEEETTTSEEEEEEECSSCGGSGEEEETTEEEEEETTSEEEE
T ss_pred CCCCEEEEEECCCCCCCccceEEEeCCEEEEEcCCCEEEEEECCCCCEEEEeeccccccceeeecccccccccceeeeEe
Confidence 35567888753 2222 12333434 58899999996 8999999886332 1 1 112222 3445666
Q ss_pred ec-CCCceEEee
Q 040039 97 FD-KNNAAVWQS 107 (274)
Q Consensus 97 ~~-~~~~~lWqS 107 (274)
.| .+++++|+.
T Consensus 91 ~d~~tG~~~W~~ 102 (238)
T PF13360_consen 91 LDAKTGKVLWSI 102 (238)
T ss_dssp EETTTSCEEEEE
T ss_pred cccCCcceeeee
Confidence 67 678999995
No 28
>PRK11138 outer membrane biogenesis protein BamB; Provisional
Probab=57.74 E-value=34 Score=31.78 Aligned_cols=55 Identities=27% Similarity=0.522 Sum_probs=34.8
Q ss_pred cEEEEe-cCCcEEEEcC-CCCEEEeecCCCC----ce----eEEEEeeCCCeeEecC-CCceEEee
Q 040039 53 ATLELT-SDGNLVLQDA-DGAIAWSTNTSGK----SV----VGLNLTDMGNLVLFDK-NNAAVWQS 107 (274)
Q Consensus 53 ~~l~~~-~~G~L~l~~~-~~~~~Wss~~~~~----~~----~~~~l~d~GNlvl~~~-~~~~lWqS 107 (274)
..+.+. .+|.|+-+|. +|+++|+....+. ++ ....-..+|.|+-.|. +|+++|+-
T Consensus 121 ~~v~v~~~~g~l~ald~~tG~~~W~~~~~~~~~ssP~v~~~~v~v~~~~g~l~ald~~tG~~~W~~ 186 (394)
T PRK11138 121 GKVYIGSEKGQVYALNAEDGEVAWQTKVAGEALSRPVVSDGLVLVHTSNGMLQALNESDGAVKWTV 186 (394)
T ss_pred CEEEEEcCCCEEEEEECCCCCCcccccCCCceecCCEEECCEEEEECCCCEEEEEEccCCCEeeee
Confidence 344444 5688888885 6899999876432 11 1112234667777775 58899985
No 29
>TIGR03300 assembly_YfgL outer membrane assembly lipoprotein YfgL. Members of this protein family are YfgL, a lipoprotein component of a complex that acts in protein insertion into the bacterial outer membrane. Other members of this complex are NlpB, YfiO, and YaeT. This protein contains multiple copies of a repeat that, in other contexts, is associated with binding of the coenzyme PQQ.
Probab=56.77 E-value=53 Score=30.09 Aligned_cols=70 Identities=23% Similarity=0.467 Sum_probs=39.7
Q ss_pred CcEEEEcCCCCCC----------CCCcEEEE-ecCCcEEEEc-CCCCEEEeecCCCC---ce-----eEEEEeeCCCeeE
Q 040039 37 PQVVWSANRNNPV----------RINATLEL-TSDGNLVLQD-ADGAIAWSTNTSGK---SV-----VGLNLTDMGNLVL 96 (274)
Q Consensus 37 ~~vVWvANr~~Pv----------~~~~~l~~-~~~G~L~l~~-~~~~~~Wss~~~~~---~~-----~~~~l~d~GNlvl 96 (274)
..++|..+-..++ -....+.+ +.+|.|.-+| .+|+++|+.+.... .. ....-..+|+|+.
T Consensus 40 ~~~~W~~~~~~~~~~~~~~~~p~v~~~~v~v~~~~g~v~a~d~~tG~~~W~~~~~~~~~~~p~v~~~~v~v~~~~g~l~a 119 (377)
T TIGR03300 40 VDQVWSASVGDGVGHYYLRLQPAVAGGKVYAADADGTVVALDAETGKRLWRVDLDERLSGGVGADGGLVFVGTEKGEVIA 119 (377)
T ss_pred ceeeeEEEcCCCcCccccccceEEECCEEEEECCCCeEEEEEccCCcEeeeecCCCCcccceEEcCCEEEEEcCCCEEEE
Confidence 3467887644433 22233333 3457887777 47889998765321 11 1111234667776
Q ss_pred ecC-CCceEEe
Q 040039 97 FDK-NNAAVWQ 106 (274)
Q Consensus 97 ~~~-~~~~lWq 106 (274)
.|. +++++|+
T Consensus 120 ld~~tG~~~W~ 130 (377)
T TIGR03300 120 LDAEDGKELWR 130 (377)
T ss_pred EECCCCcEeee
Confidence 775 5888996
No 30
>KOG4649 consensus PQQ (pyrrolo-quinoline quinone) repeat protein [Secondary metabolites biosynthesis, transport and catabolism]
Probab=56.66 E-value=24 Score=31.58 Aligned_cols=45 Identities=29% Similarity=0.488 Sum_probs=34.9
Q ss_pred CCcEEEEcCCCCCCCCC------cEEEEecCCcEEEEcCCCCEEEeecCCC
Q 040039 36 FPQVVWSANRNNPVRIN------ATLELTSDGNLVLQDADGAIAWSTNTSG 80 (274)
Q Consensus 36 ~~~vVWvANr~~Pv~~~------~~l~~~~~G~L~l~~~~~~~~Wss~~~~ 80 (274)
+.+..|-|.|..||-.+ ++..-+-||+|.-.++.|+.||+..+.+
T Consensus 168 ~~~~~w~~~~~~PiF~splcv~~sv~i~~VdG~l~~f~~sG~qvwr~~t~G 218 (354)
T KOG4649|consen 168 SSTEFWAATRFGPIFASPLCVGSSVIITTVDGVLTSFDESGRQVWRPATKG 218 (354)
T ss_pred CcceehhhhcCCccccCceeccceEEEEEeccEEEEEcCCCcEEEeecCCC
Confidence 45789999999998754 2333356899999999999999877644
No 31
>PF11403 Yeast_MT: Yeast metallothionein; InterPro: IPR022710 Metallothioneins are characterised by an abundance of cysteine residues and a lack of generic secondary structure motifs. This protein functions in primary metal storage, transport and detoxification []. For the first 40 residues in the protein the polypeptide wraps around the metal by forming two large parallel loops separated by a deep cleft containing the metal cluster []. ; PDB: 1AQS_A 1AQR_A 1RJU_V 1FMY_A 1AOO_A 1AQQ_A.
Probab=56.38 E-value=8.6 Score=22.86 Aligned_cols=19 Identities=37% Similarity=0.821 Sum_probs=11.1
Q ss_pred cCCCCCCccCCCCccCCCCC
Q 040039 253 LVCGKYGICSQGQCSCPATY 272 (274)
Q Consensus 253 ~~CG~~giC~~~~C~Cl~gf 272 (274)
|.|-.+.-| ...|+||.|-
T Consensus 12 gscknneqc-qkscscptgc 30 (40)
T PF11403_consen 12 GSCKNNEQC-QKSCSCPTGC 30 (40)
T ss_dssp STTTT-TTS-TTS-SS-TTT
T ss_pred CCccChHHH-hhcCCCCCCC
Confidence 566677777 5679998773
No 32
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=56.30 E-value=5.5 Score=31.37 Aligned_cols=24 Identities=25% Similarity=0.401 Sum_probs=17.5
Q ss_pred CCCcCCCCCCccC------CCCccCCCCCCC
Q 040039 250 GYPLVCGKYGICS------QGQCSCPATYFK 274 (274)
Q Consensus 250 ~~y~~CG~~giC~------~~~C~Cl~gf~~ 274 (274)
+..+.|=. |.|. .+.|.|..||+|
T Consensus 48 ey~~YClH-G~C~yI~dl~~~~CrC~~GYtG 77 (139)
T PHA03099 48 EGDGYCLH-GDCIHARDIDGMYCRCSHGYTG 77 (139)
T ss_pred hhCCEeEC-CEEEeeccCCCceeECCCCccc
Confidence 34456654 6886 278999999997
No 33
>TIGR03066 Gem_osc_para_1 Gemmata obscuriglobus paralogous family TIGR03066. This model represents an uncharacterized paralogous family in Gemmata obscuriglobus UQM 2246, a member of the Planctomycetes. This family shows sequence similarity to TIGR03067, which is also found in Gemmata obscuriglobus as well as in a few other species.
Probab=54.52 E-value=42 Score=25.86 Aligned_cols=54 Identities=19% Similarity=0.298 Sum_probs=33.1
Q ss_pred CCCcEEEEecCCcEEEEcCCCCE-E-----Eeec---------CCCC---ceeEEEEeeCCCeeEecCCCce
Q 040039 50 RINATLELTSDGNLVLQDADGAI-A-----WSTN---------TSGK---SVVGLNLTDMGNLVLFDKNNAA 103 (274)
Q Consensus 50 ~~~~~l~~~~~G~L~l~~~~~~~-~-----Wss~---------~~~~---~~~~~~l~d~GNlvl~~~~~~~ 103 (274)
...+.|+|..||.|+|..+++.- + |+-. ..+. +-....-+++|-|||.|.+++.
T Consensus 33 ~~~~~leF~~dGKL~v~~gnng~~~~~~Gty~L~G~kLtL~~~p~g~t~k~~Vtv~~l~~~~Lvl~d~dg~~ 104 (111)
T TIGR03066 33 KDDVVIEFAKDGKLVVTIGEKGKEVKADGTYKLDGNKLTLTLKAGGKEKKETLTVKKLTDDELVGKDPDGKK 104 (111)
T ss_pred CCceEEEEcCCCeEEEecCCCCcEeccCceEEEECCEEEEEEcCCCccccceEEEEEecCCeEEEEcCCCCE
Confidence 35578999999999988765442 1 3321 1111 1011223688999999988763
No 34
>smart00051 DSL delta serrate ligand.
Probab=53.51 E-value=11 Score=25.77 Aligned_cols=18 Identities=17% Similarity=0.434 Sum_probs=13.4
Q ss_pred CCCccCC-CCccCCCCCCC
Q 040039 257 KYGICSQ-GQCSCPATYFK 274 (274)
Q Consensus 257 ~~giC~~-~~C~Cl~gf~~ 274 (274)
....|+. ..|.|+||++|
T Consensus 42 ~~~~Cd~~G~~~C~~Gw~G 60 (63)
T smart00051 42 GHYTCDENGNKGCLEGWMG 60 (63)
T ss_pred CCccCCcCCCEecCCCCcC
Confidence 3455763 67999999986
No 35
>cd05852 Ig5_Contactin-1 Fifth Ig domain of contactin-1. Ig5_Contactin-1: fifth Ig domain of the neural cell adhesion molecule contactin-1. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma.
Probab=52.91 E-value=64 Score=22.17 Aligned_cols=34 Identities=21% Similarity=0.391 Sum_probs=24.1
Q ss_pred CCCCcEEEEcCCCCCCCCCcEEEEecCCcEEEEcC
Q 040039 34 IEFPQVVWSANRNNPVRINATLELTSDGNLVLQDA 68 (274)
Q Consensus 34 ~~~~~vVWvANr~~Pv~~~~~l~~~~~G~L~l~~~ 68 (274)
.|.+++.|.=|.. ++.....+.+..+|.|+|.+.
T Consensus 13 ~P~p~v~W~k~~~-~l~~~~r~~~~~~g~L~I~~v 46 (73)
T cd05852 13 APKPKFSWSKGTE-LLVNNSRISIWDDGSLEILNI 46 (73)
T ss_pred eCCCEEEEEeCCE-ecccCCCEEEcCCCEEEECcC
Confidence 4667889987643 555556677777899988764
No 36
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=48.66 E-value=9.8 Score=39.07 Aligned_cols=27 Identities=30% Similarity=0.805 Sum_probs=21.2
Q ss_pred CCCCCCcCCCCCCccCC----CCccCCCCCCC
Q 040039 247 GECGYPLVCGKYGICSQ----GQCSCPATYFK 274 (274)
Q Consensus 247 d~C~~y~~CG~~giC~~----~~C~Cl~gf~~ 274 (274)
|+|. +..|-++..|-. ..|.|.|||+|
T Consensus 828 DeC~-psrChp~A~CyntpgsfsC~C~pGy~G 858 (1289)
T KOG1214|consen 828 DECS-PSRCHPAATCYNTPGSFSCRCQPGYYG 858 (1289)
T ss_pred cccC-ccccCCCceEecCCCcceeecccCccC
Confidence 6675 788888888862 56999999976
No 37
>PF12690 BsuPI: Intracellular proteinase inhibitor; InterPro: IPR020481 BsuPI is an intracellular proteinase inhibitor that directly regulates the major intracellular proteinase (ISP-1) activity in vivo. It inhibits ISP-1 in the early stages of sporulation and then may be inactivated by a membrane-bound proteinase [].; PDB: 3ISY_A.
Probab=46.86 E-value=24 Score=25.43 Aligned_cols=16 Identities=25% Similarity=0.686 Sum_probs=6.8
Q ss_pred cEEEEcCCCCEEEeec
Q 040039 62 NLVLQDADGAIAWSTN 77 (274)
Q Consensus 62 ~L~l~~~~~~~~Wss~ 77 (274)
+|+|.|.+|..||.-.
T Consensus 27 D~~v~d~~g~~vwrwS 42 (82)
T PF12690_consen 27 DFVVKDKEGKEVWRWS 42 (82)
T ss_dssp EEEEE-TT--EEEETT
T ss_pred EEEEECCCCCEEEEec
Confidence 4555555555665443
No 38
>cd00055 EGF_Lam Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies
Probab=42.16 E-value=18 Score=23.23 Aligned_cols=10 Identities=40% Similarity=0.953 Sum_probs=4.9
Q ss_pred CCccCCCCCC
Q 040039 264 GQCSCPATYF 273 (274)
Q Consensus 264 ~~C~Cl~gf~ 273 (274)
.+|.|.++++
T Consensus 19 G~C~C~~~~~ 28 (50)
T cd00055 19 GQCECKPNTT 28 (50)
T ss_pred CEEeCCCcCC
Confidence 4455555544
No 39
>PF12946 EGF_MSP1_1: MSP1 EGF domain 1; InterPro: IPR024730 This EGF-like domain is found at the C terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite [].; PDB: 1N1I_C 2FLG_A 1CEJ_A 2NPR_A 1B9W_A 1OB1_F.
Probab=39.07 E-value=3.7 Score=25.09 Aligned_cols=22 Identities=27% Similarity=0.578 Sum_probs=13.9
Q ss_pred cCCCCCCccCC-----CCccCCCCCCC
Q 040039 253 LVCGKYGICSQ-----GQCSCPATYFK 274 (274)
Q Consensus 253 ~~CG~~giC~~-----~~C~Cl~gf~~ 274 (274)
..|=.|+-|-. ..|.|++||+|
T Consensus 5 ~~cP~NA~C~~~~dG~eecrCllgyk~ 31 (37)
T PF12946_consen 5 TKCPANAGCFRYDDGSEECRCLLGYKK 31 (37)
T ss_dssp S---TTEEEEEETTSEEEEEE-TTEEE
T ss_pred ccCCCCcccEEcCCCCEEEEeeCCccc
Confidence 45777888851 57999999975
No 40
>PF06006 DUF905: Bacterial protein of unknown function (DUF905); InterPro: IPR009253 This family consists of several short hypothetical proteobacterial proteins of unknown function.; PDB: 2HJJ_A.
Probab=38.33 E-value=35 Score=23.88 Aligned_cols=19 Identities=26% Similarity=0.648 Sum_probs=12.0
Q ss_pred CeeEecCCCceEEeecCCC
Q 040039 93 NLVLFDKNNAAVWQSFDHP 111 (274)
Q Consensus 93 Nlvl~~~~~~~lWqSFd~P 111 (274)
-|||||.++..+|..+.+-
T Consensus 34 RlvvRd~~g~mvWRaWNFE 52 (70)
T PF06006_consen 34 RLVVRDTEGQMVWRAWNFE 52 (70)
T ss_dssp EEEEE-SS--EEEEEESSS
T ss_pred EEEEEcCCCcEEEEeeccC
Confidence 4788888888888887653
No 41
>KOG0640 consensus mRNA cleavage stimulating factor complex; subunit 1 [RNA processing and modification]
Probab=37.25 E-value=1.7e+02 Score=26.99 Aligned_cols=68 Identities=25% Similarity=0.462 Sum_probs=48.2
Q ss_pred EEEEcCCCCCCCCC-cEEEEecCCcEEEEcC-CCCE-EEeecC-----------CCCceeEEEEeeCCCeeEecCCCc--
Q 040039 39 VVWSANRNNPVRIN-ATLELTSDGNLVLQDA-DGAI-AWSTNT-----------SGKSVVGLNLTDMGNLVLFDKNNA-- 102 (274)
Q Consensus 39 vVWvANr~~Pv~~~-~~l~~~~~G~L~l~~~-~~~~-~Wss~~-----------~~~~~~~~~l~d~GNlvl~~~~~~-- 102 (274)
.-=.||.+..+.+. ..+.-+..|+|.+... +|.+ +|..-. .+..+.+|++..+|..+|......
T Consensus 250 cfvsanPd~qht~ai~~V~Ys~t~~lYvTaSkDG~IklwDGVS~rCv~t~~~AH~gsevcSa~Ftkn~kyiLsSG~DS~v 329 (430)
T KOG0640|consen 250 CFVSANPDDQHTGAITQVRYSSTGSLYVTASKDGAIKLWDGVSNRCVRTIGNAHGGSEVCSAVFTKNGKYILSSGKDSTV 329 (430)
T ss_pred EeeecCcccccccceeEEEecCCccEEEEeccCCcEEeeccccHHHHHHHHhhcCCceeeeEEEccCCeEEeecCCccee
Confidence 34568888888777 6788888899988854 3443 785321 134567899999999999864433
Q ss_pred eEEe
Q 040039 103 AVWQ 106 (274)
Q Consensus 103 ~lWq 106 (274)
-|||
T Consensus 330 kLWE 333 (430)
T KOG0640|consen 330 KLWE 333 (430)
T ss_pred eeee
Confidence 4898
No 42
>PF13570 PQQ_3: PQQ-like domain; PDB: 3HXJ_B 3Q54_A.
Probab=36.44 E-value=47 Score=19.86 Aligned_cols=11 Identities=45% Similarity=1.039 Sum_probs=5.1
Q ss_pred CCEEEeecCCC
Q 040039 70 GAIAWSTNTSG 80 (274)
Q Consensus 70 ~~~~Wss~~~~ 80 (274)
|+++|+....+
T Consensus 1 G~~~W~~~~~~ 11 (40)
T PF13570_consen 1 GKVLWSYDTGG 11 (40)
T ss_dssp S-EEEEEE-SS
T ss_pred CceeEEEECCC
Confidence 34667666543
No 43
>KOG1214 consensus Nidogen and related basement membrane protein proteins [Cell wall/membrane/envelope biogenesis; Extracellular structures]
Probab=36.36 E-value=18 Score=37.22 Aligned_cols=57 Identities=14% Similarity=0.209 Sum_probs=36.3
Q ss_pred EEEcCCCCeEEE-----EEc-CCCCeEEeeecccccCCCCCCC-cCCCCCCccC----C---CCccCCCCCCC
Q 040039 216 MRLWPDGHLRVY-----EWQ-ASIGWTEVADLLTGYLGECGYP-LVCGKYGICS----Q---GQCSCPATYFK 274 (274)
Q Consensus 216 l~Ld~dG~lr~y-----~w~-~~~~W~~~~~~~~~p~d~C~~y-~~CG~~giC~----~---~~C~Cl~gf~~ 274 (274)
+-+..+|+.|.- .+. +...-+.+. -.+|.+.|+-- .-|-..|-|. . -+|.|||||-|
T Consensus 749 ~Cin~pg~~rceC~~gy~F~dd~~tCV~i~--~pap~n~Ce~g~h~C~i~g~a~c~~hGgs~y~C~CLPGfsG 819 (1289)
T KOG1214|consen 749 VCINLPGSYRCECRSGYEFADDRHTCVLIT--PPAPANPCEDGSHTCAIAGQARCVHHGGSTYSCACLPGFSG 819 (1289)
T ss_pred eeecCCCceeEEEeecceeccCCcceEEec--CCCCCCccccCccccCcCCceEEEecCCceEEEeecCCccC
Confidence 445567887653 344 445455431 12356888776 5688888775 1 47999999975
No 44
>PF14670 FXa_inhibition: Coagulation Factor Xa inhibitory site; PDB: 3Q3K_B 1NFY_B 1LQD_A 1G2L_B 1IQF_L 2UWP_B 2VH6_B 3KQC_L 2P93_L 2BQW_A ....
Probab=36.15 E-value=8.7 Score=23.26 Aligned_cols=10 Identities=50% Similarity=1.132 Sum_probs=7.6
Q ss_pred CCccCCCCCC
Q 040039 264 GQCSCPATYF 273 (274)
Q Consensus 264 ~~C~Cl~gf~ 273 (274)
..|+|++||+
T Consensus 19 ~~C~C~~Gy~ 28 (36)
T PF14670_consen 19 YRCSCPPGYK 28 (36)
T ss_dssp EEEE-STTEE
T ss_pred eEeECCCCCE
Confidence 5799999985
No 45
>cd00216 PQQ_DH Dehydrogenases with pyrrolo-quinoline quinone (PQQ) as cofactor, like ethanol, methanol, and membrane bound glucose dehydrogenases. The alignment model contains an 8-bladed beta-propeller.
Probab=36.15 E-value=1.3e+02 Score=29.06 Aligned_cols=73 Identities=25% Similarity=0.413 Sum_probs=43.4
Q ss_pred CCCcEEEEcCCC-------CCCCCCcEEEEe-cCCcEEEEcC-CCCEEEeecCCCC-----------cee-----EEE-E
Q 040039 35 EFPQVVWSANRN-------NPVRINATLELT-SDGNLVLQDA-DGAIAWSTNTSGK-----------SVV-----GLN-L 88 (274)
Q Consensus 35 ~~~~vVWvANr~-------~Pv~~~~~l~~~-~~G~L~l~~~-~~~~~Wss~~~~~-----------~~~-----~~~-l 88 (274)
....++|..+-. .|+-....+.+. .+|.|+-+|. .|+++|+.+.... +++ .+. -
T Consensus 37 ~~~~~~W~~~~~~~~~~~~sPvv~~g~vy~~~~~g~l~AlD~~tG~~~W~~~~~~~~~~~~~~~~~~g~~~~~~~~V~v~ 116 (488)
T cd00216 37 KKLKVAWTFSTGDERGQEGTPLVVDGDMYFTTSHSALFALDAATGKVLWRYDPKLPADRGCCDVVNRGVAYWDPRKVFFG 116 (488)
T ss_pred hcceeeEEEECCCCCCcccCCEEECCEEEEeCCCCcEEEEECCCChhhceeCCCCCccccccccccCCcEEccCCeEEEe
Confidence 345578987644 344444444444 5799988885 6889998764321 000 011 1
Q ss_pred eeCCCeeEecC-CCceEEee
Q 040039 89 TDMGNLVLFDK-NNAAVWQS 107 (274)
Q Consensus 89 ~d~GNlvl~~~-~~~~lWqS 107 (274)
..+|.++-.|. +++++|+-
T Consensus 117 ~~~g~v~AlD~~TG~~~W~~ 136 (488)
T cd00216 117 TFDGRLVALDAETGKQVWKF 136 (488)
T ss_pred cCCCeEEEEECCCCCEeeee
Confidence 23567776775 58899994
No 46
>PLN00033 photosystem II stability/assembly factor; Provisional
Probab=33.48 E-value=1.7e+02 Score=27.75 Aligned_cols=46 Identities=17% Similarity=0.325 Sum_probs=26.3
Q ss_pred CcEEEEcCCCCEEEeecCC--CCceeEEEEeeCCCeeEecCCCceEEe
Q 040039 61 GNLVLQDADGAIAWSTNTS--GKSVVGLNLTDMGNLVLFDKNNAAVWQ 106 (274)
Q Consensus 61 G~L~l~~~~~~~~Wss~~~--~~~~~~~~l~d~GNlvl~~~~~~~lWq 106 (274)
|++++.+.+|...|..... ......+...++|.++|....+.++|.
T Consensus 259 G~~~~s~d~G~~~W~~~~~~~~~~l~~v~~~~dg~l~l~g~~G~l~~S 306 (398)
T PLN00033 259 GNFYLTWEPGQPYWQPHNRASARRIQNMGWRADGGLWLLTRGGGLYVS 306 (398)
T ss_pred ccEEEecCCCCcceEEecCCCccceeeeeEcCCCCEEEEeCCceEEEe
Confidence 4444444344445654332 223345566789999998777766554
No 47
>PF05935 Arylsulfotrans: Arylsulfotransferase (ASST); InterPro: IPR010262 This family consists of several bacterial arylsulphotransferase proteins. Arylsulphotransferase (ASST) transfers a sulphate group from phenolic sulphate esters to a phenolic acceptor substrate [].; PDB: 3ETT_B 3ELQ_A 3ETS_A.
Probab=32.72 E-value=48 Score=32.06 Aligned_cols=52 Identities=23% Similarity=0.415 Sum_probs=30.2
Q ss_pred CCcEEEEcCCCCEEEeecCCCCceeEEEEeeCCCeeEe--------cCCCceEEeecCCCC
Q 040039 60 DGNLVLQDADGAIAWSTNTSGKSVVGLNLTDMGNLVLF--------DKNNAAVWQSFDHPT 112 (274)
Q Consensus 60 ~G~L~l~~~~~~~~Wss~~~~~~~~~~~l~d~GNlvl~--------~~~~~~lWqSFd~Pt 112 (274)
.+..+++|.+|.++|.-.........+..+++|+|... |-.|+++|+ ++.|.
T Consensus 127 ~~~~~~iD~~G~Vrw~~~~~~~~~~~~~~l~nG~ll~~~~~~~~e~D~~G~v~~~-~~l~~ 186 (477)
T PF05935_consen 127 SSYTYLIDNNGDVRWYLPLDSGSDNSFKQLPNGNLLIGSGNRLYEIDLLGKVIWE-YDLPG 186 (477)
T ss_dssp EEEEEEEETTS-EEEEE-GGGT--SSEEE-TTS-EEEEEBTEEEEE-TT--EEEE-EE--T
T ss_pred CceEEEECCCccEEEEEccCccccceeeEcCCCCEEEecCCceEEEcCCCCEEEe-eecCC
Confidence 46788999999999987753322222678899999864 345788998 66665
No 48
>TIGR03300 assembly_YfgL outer membrane assembly lipoprotein YfgL. Members of this protein family are YfgL, a lipoprotein component of a complex that acts in protein insertion into the bacterial outer membrane. Other members of this complex are NlpB, YfiO, and YaeT. This protein contains multiple copies of a repeat that, in other contexts, is associated with binding of the coenzyme PQQ.
Probab=31.88 E-value=1.6e+02 Score=26.85 Aligned_cols=19 Identities=16% Similarity=0.081 Sum_probs=12.5
Q ss_pred eeCCCeeEecCC-CceEEee
Q 040039 89 TDMGNLVLFDKN-NAAVWQS 107 (274)
Q Consensus 89 ~d~GNlvl~~~~-~~~lWqS 107 (274)
..+|.|.+.|.. ++++|+-
T Consensus 327 ~~~G~l~~~d~~tG~~~~~~ 346 (377)
T TIGR03300 327 DFEGYLHWLSREDGSFVARL 346 (377)
T ss_pred eCCCEEEEEECCCCCEEEEE
Confidence 457777777653 6777753
No 49
>smart00180 EGF_Lam Laminin-type epidermal growth factor-like domai.
Probab=31.37 E-value=32 Score=21.74 Aligned_cols=10 Identities=40% Similarity=0.979 Sum_probs=4.4
Q ss_pred CCccCCCCCC
Q 040039 264 GQCSCPATYF 273 (274)
Q Consensus 264 ~~C~Cl~gf~ 273 (274)
.+|.|.++++
T Consensus 18 G~C~C~~~~~ 27 (46)
T smart00180 18 GQCECKPNVT 27 (46)
T ss_pred CEEECCCCCC
Confidence 3444444443
No 50
>KOG4260 consensus Uncharacterized conserved protein [Function unknown]
Probab=30.43 E-value=33 Score=30.70 Aligned_cols=21 Identities=33% Similarity=0.841 Sum_probs=18.1
Q ss_pred CCCCCCccC-------CCCccCCCCCCC
Q 040039 254 VCGKYGICS-------QGQCSCPATYFK 274 (274)
Q Consensus 254 ~CG~~giC~-------~~~C~Cl~gf~~ 274 (274)
.|+-||-|. +.+|.|-+||+|
T Consensus 151 ~C~GnG~C~GdGsR~GsGkCkC~~GY~G 178 (350)
T KOG4260|consen 151 PCFGNGSCHGDGSREGSGKCKCETGYTG 178 (350)
T ss_pred CcCCCCcccCCCCCCCCCcccccCCCCC
Confidence 599999997 278999999986
No 51
>KOG3881 consensus Uncharacterized conserved protein [Function unknown]
Probab=29.06 E-value=2.6e+02 Score=26.37 Aligned_cols=76 Identities=17% Similarity=0.149 Sum_probs=49.2
Q ss_pred CCcEEEEcC-CCCCCCCC-cEEEEecCCcEEEEcCC--CCEEEeecCCCCceeEEEEeeCCCeeEecC-CCceEEeecCC
Q 040039 36 FPQVVWSAN-RNNPVRIN-ATLELTSDGNLVLQDAD--GAIAWSTNTSGKSVVGLNLTDMGNLVLFDK-NNAAVWQSFDH 110 (274)
Q Consensus 36 ~~~vVWvAN-r~~Pv~~~-~~l~~~~~G~L~l~~~~--~~~~Wss~~~~~~~~~~~l~d~GNlvl~~~-~~~~lWqSFd~ 110 (274)
.+++||... |--|=... --++++..|.|.++|.. .+||=+-.-...+...+.|.-+||+|+... .+.+ -+||+
T Consensus 199 LrVPvW~tdi~Fl~g~~~~~fat~T~~hqvR~YDt~~qRRPV~~fd~~E~~is~~~l~p~gn~Iy~gn~~g~l--~~FD~ 276 (412)
T KOG3881|consen 199 LRVPVWITDIRFLEGSPNYKFATITRYHQVRLYDTRHQRRPVAQFDFLENPISSTGLTPSGNFIYTGNTKGQL--AKFDL 276 (412)
T ss_pred ceeeeeeccceecCCCCCceEEEEecceeEEEecCcccCcceeEeccccCcceeeeecCCCcEEEEecccchh--heecc
Confidence 345666644 22221113 56888999999999963 457766554445667788999999988743 3443 56887
Q ss_pred CCC
Q 040039 111 PTD 113 (274)
Q Consensus 111 PtD 113 (274)
-+-
T Consensus 277 r~~ 279 (412)
T KOG3881|consen 277 RGG 279 (412)
T ss_pred cCc
Confidence 653
No 52
>PF10282 Lactonase: Lactonase, 7-bladed beta-propeller; InterPro: IPR019405 6-phosphogluconolactonase (6PGL) 3.1.1.31 from EC, which hydrolyses 6-phosphogluconolactone to 6-phosphogluconate, is one of the enzymes in the pentose phosphate pathway. Two families of structurally dissimilar 6PGLs are known to exist: the Escherichia coli (strain K12) YbhE IPR022528 from INTERPRO [] and the Pseudomonas aeruginosa DevB IPR005900 from INTERPRO [] types. This entry contains bacterial 6-phosphogluconolactonases (6PGL) YbhE-type 3.1.1.31 from EC which hydrolyse 6-phosphogluconolactone to 6-phosphogluconate. The entry also contains the fungal muconate lactonizing enzyme carboxy-cis,cis-muconate cyclase 5.5.1.5 from EC and muconate cycloisomerase 5.5.1.1 from EC, which convert cis,cis-muconates to muconolactones and vice versa as part of the microbial beta-ketoadipate pathway. Structures have been reported for the E. coli 6-phosphogluconolactonase and Neurospora crassa muconate cycloisomerase. Structures of proteins in this family have revealed a 7-bladed beta-propeller fold [].; PDB: 3SCY_A 1L0Q_A 3HFQ_B 3FGB_A 1RI6_A 3U4Y_A 3BWS_A 1JOF_H.
Probab=27.71 E-value=3.7e+02 Score=24.36 Aligned_cols=65 Identities=20% Similarity=0.358 Sum_probs=0.0
Q ss_pred EEEEeeeccccccccccCCCCcEEEEcCCCCCCCCC-cEEEE-ecCCcEE---------------EEcCCCCEEEeecCC
Q 040039 17 AVFIVQAYNASLIDYQHIEFPQVVWSANRNNPVRIN-ATLEL-TSDGNLV---------------LQDADGAIAWSTNTS 79 (274)
Q Consensus 17 ~iw~~~~~~~~~~~~~~~~~~~vVWvANr~~Pv~~~-~~l~~-~~~G~L~---------------l~~~~~~~~Wss~~~ 79 (274)
+|.+. +....++|+||. .+. +.+.+ ..+|.|. ..+++|+.++-++..
T Consensus 249 ~i~is-------------pdg~~lyvsnr~---~~sI~vf~~d~~~g~l~~~~~~~~~G~~Pr~~~~s~~g~~l~Va~~~ 312 (345)
T PF10282_consen 249 EIAIS-------------PDGRFLYVSNRG---SNSISVFDLDPATGTLTLVQTVPTGGKFPRHFAFSPDGRYLYVANQD 312 (345)
T ss_dssp EEEE--------------TTSSEEEEEECT---TTEEEEEEECTTTTTEEEEEEEEESSSSEEEEEE-TTSSEEEEEETT
T ss_pred eEEEe-------------cCCCEEEEEecc---CCEEEEEEEecCCCceEEEEEEeCCCCCccEEEEeCCCCEEEEEecC
Q ss_pred CCceeEEEEe-eCCCeeEe
Q 040039 80 GKSVVGLNLT-DMGNLVLF 97 (274)
Q Consensus 80 ~~~~~~~~l~-d~GNlvl~ 97 (274)
...+....+. ++|.|...
T Consensus 313 s~~v~vf~~d~~tG~l~~~ 331 (345)
T PF10282_consen 313 SNTVSVFDIDPDTGKLTPV 331 (345)
T ss_dssp TTEEEEEEEETTTTEEEEE
T ss_pred CCeEEEEEEeCCCCcEEEe
No 53
>PF05294 Toxin_5: Scorpion short toxin; InterPro: IPR007958 This family contains various secreted scorpion short toxins which seem to be unrelated to those described in IPR001947 from INTERPRO.; GO: 0009405 pathogenesis, 0005576 extracellular region; PDB: 1SIS_A 1CHL_A.
Probab=27.69 E-value=15 Score=21.46 Aligned_cols=15 Identities=47% Similarity=1.096 Sum_probs=12.2
Q ss_pred CCCCCCccCCCCccC
Q 040039 254 VCGKYGICSQGQCSC 268 (274)
Q Consensus 254 ~CG~~giC~~~~C~C 268 (274)
-||..|.|-.++|-|
T Consensus 18 CCgg~GkC~GpqClC 32 (32)
T PF05294_consen 18 CCGGRGKCFGPQCLC 32 (32)
T ss_dssp HCTTSEEEETTEEEE
T ss_pred HhCCCCeEcCCcccC
Confidence 388889997788876
No 54
>smart00286 PTI Plant trypsin inhibitors.
Probab=26.76 E-value=42 Score=19.21 Aligned_cols=18 Identities=33% Similarity=0.866 Sum_probs=9.1
Q ss_pred CCCCCccCCCCccCCC-CCC
Q 040039 255 CGKYGICSQGQCSCPA-TYF 273 (274)
Q Consensus 255 CG~~giC~~~~C~Cl~-gf~ 273 (274)
|-.-+-| .+.|.|++ ||-
T Consensus 10 Ck~DsDC-l~~CiC~~~G~C 28 (29)
T smart00286 10 CKRDSDC-MAECICLANGYC 28 (29)
T ss_pred cccccCc-ccCCEEcccccc
Confidence 4444444 35566665 553
No 55
>smart00564 PQQ beta-propeller repeat. Beta-propeller repeat occurring in enzymes with pyrrolo-quinoline quinone (PQQ) as cofactor, in Ire1p-like Ser/Thr kinases, and in prokaryotic dehydrogenases.
Probab=26.50 E-value=1.2e+02 Score=16.71 Aligned_cols=17 Identities=47% Similarity=0.894 Sum_probs=9.6
Q ss_pred cCCcEEEEcC-CCCEEEe
Q 040039 59 SDGNLVLQDA-DGAIAWS 75 (274)
Q Consensus 59 ~~G~L~l~~~-~~~~~Ws 75 (274)
.+|.|+-+|. +|..+|+
T Consensus 14 ~~g~l~a~d~~~G~~~W~ 31 (33)
T smart00564 14 TDGTLYALDAKTGEILWT 31 (33)
T ss_pred CCCEEEEEEcccCcEEEE
Confidence 3466665554 4566665
No 56
>PF06247 Plasmod_Pvs28: Plasmodium ookinete surface protein Pvs28; InterPro: IPR010423 This family consists of several ookinete surface protein (Pvs28) from several species of Plasmodium. Pvs25 and Pvs28 are expressed on the surface of ookinetes. These proteins are potential candidates for vaccine and induce antibodies that block the infectivity of Plasmodium vivax in immunised animals [].; GO: 0009986 cell surface, 0016020 membrane; PDB: 1Z3G_B 1Z1Y_B 1Z27_A.
Probab=25.88 E-value=8.9 Score=32.29 Aligned_cols=27 Identities=33% Similarity=0.815 Sum_probs=17.7
Q ss_pred CCCCC----CcCCCCCCccCC---------CCccCCCCCC
Q 040039 247 GECGY----PLVCGKYGICSQ---------GQCSCPATYF 273 (274)
Q Consensus 247 d~C~~----y~~CG~~giC~~---------~~C~Cl~gf~ 273 (274)
-.|+. .-.||.|+.|.. -.|.|++||.
T Consensus 40 v~C~~~e~~~K~Cgdya~C~~~~~~~~~~~~~C~C~~gY~ 79 (197)
T PF06247_consen 40 VECDKLENVNKPCGDYAKCINQANKGEERAYKCDCINGYI 79 (197)
T ss_dssp ---SG-GGTTSEEETTEEEEE-SSTTSSTSEEEEE-TTEE
T ss_pred eecCcccccCccccchhhhhcCCCcccceeEEEecccCce
Confidence 45654 457999999971 3599999985
No 57
>PF01011 PQQ: PQQ enzyme repeat family.; InterPro: IPR002372 Pyrrolo-quinoline quinone (PQQ) is a redox coenzyme, which serves as a cofactor for a number of enzymes (quinoproteins) and particularly for some bacterial dehydrogenases [, ]. A number of bacterial quinoproteins belong to this family. Enzymes in this group have repeats of a beta propeller.; PDB: 1H4I_C 1H4J_E 1W6S_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A 1G72_A ....
Probab=25.80 E-value=95 Score=18.38 Aligned_cols=22 Identities=41% Similarity=0.664 Sum_probs=17.0
Q ss_pred ecCCcEEEEcC-CCCEEEeecCC
Q 040039 58 TSDGNLVLQDA-DGAIAWSTNTS 79 (274)
Q Consensus 58 ~~~G~L~l~~~-~~~~~Wss~~~ 79 (274)
+.+|.|+-+|. .|+.+|+-+..
T Consensus 7 ~~~g~l~AlD~~TG~~~W~~~~~ 29 (38)
T PF01011_consen 7 TPDGYLYALDAKTGKVLWKFQTG 29 (38)
T ss_dssp TTTSEEEEEETTTTSEEEEEESS
T ss_pred CCCCEEEEEECCCCCEEEeeeCC
Confidence 56788888875 68899988764
No 58
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=25.77 E-value=46 Score=32.66 Aligned_cols=20 Identities=35% Similarity=0.992 Sum_probs=9.5
Q ss_pred CCCCCccCCCCccCCCCCCC
Q 040039 255 CGKYGICSQGQCSCPATYFK 274 (274)
Q Consensus 255 CG~~giC~~~~C~Cl~gf~~ 274 (274)
|...+.|...+|.|.+||+|
T Consensus 287 cs~~g~~~~g~CiC~~g~~G 306 (525)
T KOG1225|consen 287 CSGGGVCVDGECICNPGYSG 306 (525)
T ss_pred cCCCceecCCEeecCCCccc
Confidence 44444444344555555543
No 59
>cd00150 PlantTI Plant trypsin inhibitors such as squash trypsin inhibitor. Plant proteinase inhibitors play important roles in natural plant defense. Proteinase inhibitors from squash seeds form a uniform family of small proteins cross-linked with three disulfide bridges.
Probab=25.48 E-value=44 Score=18.80 Aligned_cols=8 Identities=38% Similarity=1.140 Sum_probs=3.5
Q ss_pred CCCCCCcc
Q 040039 254 VCGKYGIC 261 (274)
Q Consensus 254 ~CG~~giC 261 (274)
+|..+|+|
T Consensus 19 iC~~~G~C 26 (27)
T cd00150 19 ICLENGYC 26 (27)
T ss_pred EEcccccc
Confidence 44444444
No 60
>PF00053 Laminin_EGF: Laminin EGF-like (Domains III and V); InterPro: IPR002049 Laminins [] are the major noncollagenous components of basement membranes that mediate cell adhesion, growth, migration, and differentiation. They are composed of distinct but related alpha, beta and gamma chains. The three chains form a cross-shaped molecule that consists of a long arm and three short globular arms. The long arm consists of a coiled coil structure contributed by all three chains and cross-linked by interchain disulphide bonds. Besides different types of globular domains each subunit contains, in its first half, consecutive repeats of about 60 amino acids in length that include eight conserved cysteines []. The tertiary structure [, ] of this domain is remotely similar in its N-terminal to that of the EGF-like module (see PDOC00021 from PROSITEDOC). It is known as an 'LE' or 'laminin-type EGF-like' domain. The number of copies of the LE domain in the different forms of laminins is highly variable; from 3 up to 22 copies have been found. A schematic representation of the topology of the four disulphide bonds in the LE domain is shown below. +-------------------+ +-|-----------+ | +--------+ +-----------------+ | | | | | | | | xxCxCxxxxxxxxxxxCxxxxxxxCxxCxxxxxGxxCxxCxxgaagxxxxxxxxxxxCxx sssssssssssssssssssssssssssssssssss 'C': conserved cysteine involved in a disulphide bond 'a': conserved aromatic residue 'G': conserved glycine (lower case = less conserved) 's': region similar to the EGF-like domain In mouse laminin gamma-1 chain, the seventh LE domain has been shown to be the only one that binds with a high affinity to nidogen []. The binding-sites are located on the surface within the loops C1-C3 and C5-C6 [, ]. Long consecutive arrays of LE domains in laminins form rod-like elements of limited flexibility [], which determine the spacing in the formation of laminin networks of basement membranes [].; PDB: 3TBD_A 3ZYG_B 3ZYI_B 2Y38_A 1KLO_A 1NPE_B 3ZYJ_B 1TLE_A.
Probab=24.72 E-value=21 Score=22.67 Aligned_cols=10 Identities=40% Similarity=0.860 Sum_probs=4.7
Q ss_pred CCccCCCCCC
Q 040039 264 GQCSCPATYF 273 (274)
Q Consensus 264 ~~C~Cl~gf~ 273 (274)
.+|.|.++|+
T Consensus 18 G~C~C~~~~~ 27 (49)
T PF00053_consen 18 GQCVCKPGTT 27 (49)
T ss_dssp EEESBSTTEE
T ss_pred CEEecccccc
Confidence 3455555443
No 61
>KOG0291 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
Probab=24.64 E-value=3.5e+02 Score=27.94 Aligned_cols=53 Identities=28% Similarity=0.657 Sum_probs=39.2
Q ss_pred cEEEEecCCcEEEEcC-CCCE-EEeecCC---------CCceeEEEEeeCCCeeEecC-CCce-EE
Q 040039 53 ATLELTSDGNLVLQDA-DGAI-AWSTNTS---------GKSVVGLNLTDMGNLVLFDK-NNAA-VW 105 (274)
Q Consensus 53 ~~l~~~~~G~L~l~~~-~~~~-~Wss~~~---------~~~~~~~~l~d~GNlvl~~~-~~~~-lW 105 (274)
.++....||.++...+ ++++ ||.+..+ ..+++.++..-+||.+|..+ +|.| .|
T Consensus 354 ~~l~YSpDgq~iaTG~eDgKVKvWn~~SgfC~vTFteHts~Vt~v~f~~~g~~llssSLDGtVRAw 419 (893)
T KOG0291|consen 354 TSLAYSPDGQLIATGAEDGKVKVWNTQSGFCFVTFTEHTSGVTAVQFTARGNVLLSSSLDGTVRAW 419 (893)
T ss_pred eeEEECCCCcEEEeccCCCcEEEEeccCceEEEEeccCCCceEEEEEEecCCEEEEeecCCeEEee
Confidence 6899999999999865 4555 8987641 24567788899999999854 4553 45
No 62
>PF02237 BPL_C: Biotin protein ligase C terminal domain; InterPro: IPR003142 This C-terminal domain has an SH3-like barrel fold, the function of which is unknown. It is found associated with prokaryotic bifunctional transcriptional repressors [] and eukaryotic enzymes involved in biotin utilization [, ]. In Escherichia coli the biotin operon repressor (BirA) is a bifunctional protein. BirA acts both as the acetyl-coA carboxylase biotin holoenzyme synthetase (6.3.4.15 from EC) and as the biotin operon repressor. DNA sequence analysis of mutations indicates that the helix-turn-helix DNA binding region is located at the N terminus while mutations affecting enzyme function, although mapping over a large region, are found mainly in the central part of the protein's primary sequence [].; GO: 0006464 protein modification process; PDB: 3RUX_A 2CGH_A 3L1A_B 3L2Z_A 1HXD_A 1BIB_A 2EWN_B 1BIA_A 2EJ9_A 3FJP_A ....
Probab=24.03 E-value=45 Score=21.19 Aligned_cols=15 Identities=20% Similarity=0.432 Sum_probs=8.6
Q ss_pred EEeeCCCeeEecCCC
Q 040039 87 NLTDMGNLVLFDKNN 101 (274)
Q Consensus 87 ~l~d~GNlvl~~~~~ 101 (274)
-+.++|.|+|+..++
T Consensus 21 gId~~G~L~v~~~~g 35 (48)
T PF02237_consen 21 GIDDDGALLVRTEDG 35 (48)
T ss_dssp EEETTSEEEEEETTE
T ss_pred EECCCCEEEEEECCC
Confidence 455666666665554
No 63
>cd05764 Ig_2 Subgroup of the immunoglobulin (Ig) superfamily. Ig_2: subgroup of the immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of the Ig superfamily are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond.
Probab=23.74 E-value=1.3e+02 Score=20.08 Aligned_cols=34 Identities=12% Similarity=0.174 Sum_probs=21.6
Q ss_pred CCCCcEEEEcCCCCCCCCCcEEEEecCCcEEEEc
Q 040039 34 IEFPQVVWSANRNNPVRINATLELTSDGNLVLQD 67 (274)
Q Consensus 34 ~~~~~vVWvANr~~Pv~~~~~l~~~~~G~L~l~~ 67 (274)
.|.+++.|.-+.+.++.......+..+|.|.|..
T Consensus 13 ~P~p~v~W~~~~~~~~~~~~~~~~~~~~~L~i~~ 46 (74)
T cd05764 13 DPEPAIHWISPDGKLISNSSRTLVYDNGTLDILI 46 (74)
T ss_pred cCCCEEEEEeCCCEEecCCCeEEEecCCEEEEEE
Confidence 3667889996655566544444455667777663
No 64
>KOG2106 consensus Uncharacterized conserved protein, contains HELP and WD40 domains [Function unknown]
Probab=23.01 E-value=3e+02 Score=27.10 Aligned_cols=68 Identities=21% Similarity=0.455 Sum_probs=45.2
Q ss_pred cEEEEecCCcEEEEcCCCC-EEEeecCC---------CCceeEEEEeeCCCeeEecCCCc-eEEee-cCCCCCcccCCCc
Q 040039 53 ATLELTSDGNLVLQDADGA-IAWSTNTS---------GKSVVGLNLTDMGNLVLFDKNNA-AVWQS-FDHPTDSLVPGQK 120 (274)
Q Consensus 53 ~~l~~~~~G~L~l~~~~~~-~~Wss~~~---------~~~~~~~~l~d~GNlvl~~~~~~-~lWqS-Fd~PtDTlLpgq~ 120 (274)
-.++|.++|..+--|++|. .||+..+. ..++-.+.|+.+|-|+=-..+-. ++|.. ..---+|-||.|.
T Consensus 250 l~v~F~engdviTgDS~G~i~Iw~~~~~~~~k~~~aH~ggv~~L~~lr~GtllSGgKDRki~~Wd~~y~k~r~~elPe~~ 329 (626)
T KOG2106|consen 250 LCVTFLENGDVITGDSGGNILIWSKGTNRISKQVHAHDGGVFSLCMLRDGTLLSGGKDRKIILWDDNYRKLRETELPEQF 329 (626)
T ss_pred EEEEEcCCCCEEeecCCceEEEEeCCCceEEeEeeecCCceEEEEEecCccEeecCccceEEeccccccccccccCchhc
Confidence 3677888998888888776 58987642 12455688899999886323322 68983 1123567778775
No 65
>KOG0278 consensus Serine/threonine kinase receptor-associated protein [Lipid transport and metabolism]
Probab=22.33 E-value=5.2e+02 Score=23.23 Aligned_cols=53 Identities=19% Similarity=0.317 Sum_probs=30.7
Q ss_pred cEEEEecCCcEEEEcCC-CCEEEeecCCCCceeEEEEeeCCCeeEec-CCCceEEe
Q 040039 53 ATLELTSDGNLVLQDAD-GAIAWSTNTSGKSVVGLNLTDMGNLVLFD-KNNAAVWQ 106 (274)
Q Consensus 53 ~~l~~~~~G~L~l~~~~-~~~~Wss~~~~~~~~~~~l~d~GNlvl~~-~~~~~lWq 106 (274)
+.|.-+.|+.+.|-|.. +..|=+-. ...++.+|++.-+|..+... .++-..|.
T Consensus 157 ~iLSSadd~tVRLWD~rTgt~v~sL~-~~s~VtSlEvs~dG~ilTia~gssV~Fwd 211 (334)
T KOG0278|consen 157 CILSSADDKTVRLWDHRTGTEVQSLE-FNSPVTSLEVSQDGRILTIAYGSSVKFWD 211 (334)
T ss_pred eEEeeccCCceEEEEeccCcEEEEEe-cCCCCcceeeccCCCEEEEecCceeEEec
Confidence 45555666777776643 33332222 23567889999899877654 33444564
No 66
>PF05833 FbpA: Fibronectin-binding protein A N-terminus (FbpA); InterPro: IPR008616 This family consists of the N-terminal region of the prokaryotic fibronectin-binding protein, the C-terminal region is IPR008532 from INTERPRO. Fibronectin binding is considered to be an important virulence factor in streptococcal infections. Fibronectin is a dimeric glycoprotein that is present in a soluble form in plasma and extracellular fluids; it is also present in a fibrillar form on cell surfaces. Both the soluble and cellular forms of fibronectin may be incorporated into the extracellular tissue matrix. While fibronectin has critical roles in eukaryotic cellular processes, such as adhesion, migration and differentiation, it is also a substrate for the attachment of bacteria. The binding of pathogenic Streptococcus pyogenes and Staphylococcus aureus to epithelial cells via fibronectin facilitates their internalisation and systemic spread within the host [].; PDB: 3DOA_A 2ZBK_F 2HKJ_A 1Z5B_A 1Z5C_B 1MX0_F 1Z5A_A 1MU5_A 1Z59_A.
Probab=22.16 E-value=79 Score=30.11 Aligned_cols=38 Identities=16% Similarity=0.270 Sum_probs=22.3
Q ss_pred EEEEeeC-CCeeEecCCCceEEeecCCCCC-----cccCCCccc
Q 040039 85 GLNLTDM-GNLVLFDKNNAAVWQSFDHPTD-----SLVPGQKLL 122 (274)
Q Consensus 85 ~~~l~d~-GNlvl~~~~~~~lWqSFd~PtD-----TlLpgq~l~ 122 (274)
.++|... ||++|.|.++.||+---..+.+ +++||+...
T Consensus 116 i~El~g~~~NiiL~d~~~~Il~a~~~~~~~~~~~R~i~~G~~Y~ 159 (455)
T PF05833_consen 116 IIELMGRHSNIILTDEDGKILDALRRVSFSQSRDREILPGEPYI 159 (455)
T ss_dssp EEE--GGG-EEEEEETT-BEEEESS-B---------BSTTSB--
T ss_pred EEEEcCCcccEEEEcCCCeEEeehhhcCcccccceeeccCcccc
Confidence 4677877 9999999999888876555554 889998765
No 67
>PF14870 PSII_BNR: Photosynthesis system II assembly factor YCF48; PDB: 2XBG_A.
Probab=21.82 E-value=2.9e+02 Score=25.12 Aligned_cols=52 Identities=19% Similarity=0.403 Sum_probs=23.9
Q ss_pred EEEecCCcEEEEcCCCCEEEeecC--CCCceeEEEEeeCCCeeEecCCCceEEee
Q 040039 55 LELTSDGNLVLQDADGAIAWSTNT--SGKSVVGLNLTDMGNLVLFDKNNAAVWQS 107 (274)
Q Consensus 55 l~~~~~G~L~l~~~~~~~~Wss~~--~~~~~~~~~l~d~GNlvl~~~~~~~lWqS 107 (274)
+.+...|++++.-..+...|.... +.+.+..|-...+|+|.+... +..|..|
T Consensus 159 vavs~~G~~~~s~~~G~~~w~~~~r~~~~riq~~gf~~~~~lw~~~~-Gg~~~~s 212 (302)
T PF14870_consen 159 VAVSSRGNFYSSWDPGQTTWQPHNRNSSRRIQSMGFSPDGNLWMLAR-GGQIQFS 212 (302)
T ss_dssp EEEETTSSEEEEE-TT-SS-EEEE--SSS-EEEEEE-TTS-EEEEET-TTEEEEE
T ss_pred EEEECcccEEEEecCCCccceEEccCccceehhceecCCCCEEEEeC-CcEEEEc
Confidence 334444444444323344455432 234456677778888888763 3344554
No 68
>COG1520 FOG: WD40-like repeat [Function unknown]
Probab=21.73 E-value=4.7e+02 Score=23.88 Aligned_cols=73 Identities=26% Similarity=0.456 Sum_probs=43.6
Q ss_pred CCcEEEEcCCCC-------CCCCCcEEEEe-cCCcEEEEcCC-CCEEEeecCC---CC-----ce---eEEEE-ee--CC
Q 040039 36 FPQVVWSANRNN-------PVRINATLELT-SDGNLVLQDAD-GAIAWSTNTS---GK-----SV---VGLNL-TD--MG 92 (274)
Q Consensus 36 ~~~vVWvANr~~-------Pv~~~~~l~~~-~~G~L~l~~~~-~~~~Wss~~~---~~-----~~---~~~~l-~d--~G 92 (274)
.-+.+|..+... |+.....+-+. .+|.|+-++.+ |..+|..... .. .. ..+.+ .+ +|
T Consensus 130 ~G~~~W~~~~~~~~~~~~~~v~~~~~v~~~s~~g~~~al~~~tG~~~W~~~~~~~~~~~~~~~~~~~~~~vy~~~~~~~~ 209 (370)
T COG1520 130 TGTLVWSRNVGGSPYYASPPVVGDGTVYVGTDDGHLYALNADTGTLKWTYETPAPLSLSIYGSPAIASGTVYVGSDGYDG 209 (370)
T ss_pred CCcEEEEEecCCCeEEecCcEEcCcEEEEecCCCeEEEEEccCCcEEEEEecCCccccccccCceeecceEEEecCCCcc
Confidence 466889887666 23333555566 57999988876 8899985442 11 11 01111 22 44
Q ss_pred CeeEecC-CCceEEeec
Q 040039 93 NLVLFDK-NNAAVWQSF 108 (274)
Q Consensus 93 Nlvl~~~-~~~~lWqSF 108 (274)
+|+=.|. +|..+|+.+
T Consensus 210 ~~~a~~~~~G~~~w~~~ 226 (370)
T COG1520 210 ILYALNAEDGTLKWSQK 226 (370)
T ss_pred eEEEEEccCCcEeeeee
Confidence 5665565 678889854
No 69
>KOG4234 consensus TPR repeat-containing protein [General function prediction only]
Probab=20.98 E-value=40 Score=29.19 Aligned_cols=16 Identities=13% Similarity=0.260 Sum_probs=12.6
Q ss_pred CCCCCCCCcceEEEecC
Q 040039 132 STTNWTDGGLFSLSVTN 148 (274)
Q Consensus 132 s~~dps~~G~ysl~~d~ 148 (274)
--.||.+ |.|++.+..
T Consensus 250 mvqd~nT-GsySi~fk~ 265 (271)
T KOG4234|consen 250 MVQDPNT-GSYSINFKG 265 (271)
T ss_pred eeeCCCC-CceeEEecC
Confidence 3458889 999999864
No 70
>KOG4792 consensus Crk family adapters [Signal transduction mechanisms]
Probab=20.69 E-value=1.3e+02 Score=26.31 Aligned_cols=48 Identities=25% Similarity=0.342 Sum_probs=30.2
Q ss_pred EeecCCC------CCcccCCCcccCCceEeeecCCCCCCCCcceEEEec-CCCceEEEecC
Q 040039 105 WQSFDHP------TDSLVPGQKLLEGKKLTASVSTTNWTDGGLFSLSVT-NEGLFAFIESN 158 (274)
Q Consensus 105 WqSFd~P------tDTlLpgq~l~~~~~L~Sw~s~~dps~~G~ysl~~d-~~g~~~~~~~~ 158 (274)
|.||=.| .-+||.||+ .|..|+-- ++-++ |.|.|.+- .+...+++|..
T Consensus 10 r~swYfg~mSRqeA~~lL~~~r--~G~FLvRD---Sst~p-GdYvLsV~E~srVshYiIn~ 64 (293)
T KOG4792|consen 10 RSSWYFGPMSRQEAVALLQGQR--HGVFLVRD---SSTSP-GDYVLSVSENSRVSHYIINS 64 (293)
T ss_pred ccceecCcccHHHHHHHhcCcc--eeeEEEec---CCCCC-CceEEEEecCcceeeeeecC
Confidence 5665544 346788887 56666532 22235 99999994 45666777654
No 71
>KOG1225 consensus Teneurin-1 and related extracellular matrix proteins, contain EGF-like repeats [Signal transduction mechanisms; Extracellular structures]
Probab=20.69 E-value=60 Score=31.91 Aligned_cols=16 Identities=38% Similarity=1.140 Sum_probs=12.0
Q ss_pred CccCCCCccCCCCCCC
Q 040039 259 GICSQGQCSCPATYFK 274 (274)
Q Consensus 259 giC~~~~C~Cl~gf~~ 274 (274)
|.|...+|-|++||+|
T Consensus 260 g~c~~G~CIC~~Gf~G 275 (525)
T KOG1225|consen 260 GQCVEGRCICPPGFTG 275 (525)
T ss_pred ceEeCCeEeCCCCCcC
Confidence 5566678888888875
No 72
>COG4787 FlgF Flagellar basal body rod protein [Cell motility and secretion]
Probab=20.31 E-value=4.5e+02 Score=22.92 Aligned_cols=24 Identities=42% Similarity=0.691 Sum_probs=18.0
Q ss_pred EEEEecCCcEEEEcCCCCEEEeec
Q 040039 54 TLELTSDGNLVLQDADGAIAWSTN 77 (274)
Q Consensus 54 ~l~~~~~G~L~l~~~~~~~~Wss~ 77 (274)
-+.|+.||=|.+.+.+|+....-+
T Consensus 79 Dvaiq~DGwlaVq~~dG~EaYTRn 102 (251)
T COG4787 79 DVAIQGDGWLAVQDADGSEAYTRN 102 (251)
T ss_pred eEEEccCceEEEEcCCCcchheec
Confidence 478888998888888887655443
Done!