Query 013272
Match_columns 446
No_of_seqs 246 out of 1606
Neff 7.5
Searched_HMMs 46136
Date Fri Mar 29 02:10:08 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/013272.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/013272hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF01453 B_lectin: D-mannose b 99.9 5.1E-28 1.1E-32 206.1 6.2 97 68-164 2-108 (114)
2 cd00028 B_lectin Bulb-type man 99.9 1.6E-21 3.5E-26 166.8 12.7 112 10-144 3-116 (116)
3 PF00954 S_locus_glycop: S-loc 99.8 2E-20 4.4E-25 158.4 11.6 102 194-297 1-109 (110)
4 smart00108 B_lectin Bulb-type 99.8 2.7E-20 5.9E-25 158.7 12.3 110 11-143 4-114 (114)
5 cd01098 PAN_AP_plant Plant PAN 99.5 1.2E-13 2.7E-18 110.5 7.9 79 312-392 3-84 (84)
6 PF08276 PAN_2: PAN-like domai 99.5 7.5E-14 1.6E-18 107.1 5.8 59 318-376 4-66 (66)
7 cd00129 PAN_APPLE PAN/APPLE-li 99.2 2.9E-11 6.4E-16 96.0 5.5 66 319-390 9-79 (80)
8 smart00108 B_lectin Bulb-type 98.6 3.3E-07 7.1E-12 77.9 9.0 85 88-197 24-112 (114)
9 cd00028 B_lectin Bulb-type man 98.5 5.4E-07 1.2E-11 76.9 8.1 80 93-197 30-113 (116)
10 smart00473 PAN_AP divergent su 98.4 9.7E-07 2.1E-11 68.9 6.9 71 319-390 4-77 (78)
11 PF01453 B_lectin: D-mannose b 97.9 0.00012 2.5E-09 62.4 10.1 74 68-145 38-114 (114)
12 cd01100 APPLE_Factor_XI_like S 97.8 3.1E-05 6.7E-10 60.4 4.2 50 324-373 9-58 (73)
13 smart00223 APPLE APPLE domain. 94.2 0.062 1.3E-06 42.6 3.8 48 325-372 7-57 (79)
14 PF00024 PAN_1: PAN domain Thi 93.7 0.085 1.8E-06 40.7 3.7 51 321-371 4-55 (79)
15 PF08693 SKG6: Transmembrane a 93.3 0.066 1.4E-06 36.4 2.1 8 430-437 33-40 (40)
16 PF14295 PAN_4: PAN domain; PD 92.9 0.09 1.9E-06 37.2 2.6 33 338-370 14-51 (51)
17 PF08277 PAN_3: PAN-like domai 92.7 0.66 1.4E-05 35.3 7.4 43 337-381 17-61 (71)
18 smart00605 CW CW domain. 92.6 0.97 2.1E-05 36.7 8.7 55 338-393 20-76 (94)
19 PF04478 Mid2: Mid2 like cell 91.6 0.26 5.6E-06 43.6 4.3 12 426-437 68-79 (154)
20 PF15102 TMEM154: TMEM154 prot 84.8 1.3 2.9E-05 39.0 4.1 8 408-415 59-66 (146)
21 cd01099 PAN_AP_HGF Subfamily o 83.8 2.2 4.8E-05 33.5 4.7 32 339-370 24-57 (80)
22 PTZ00382 Variant-specific surf 75.8 2.3 5E-05 34.9 2.5 12 425-436 84-95 (96)
23 PF01102 Glycophorin_A: Glycop 73.6 0.46 1E-05 40.7 -2.2 29 408-439 67-95 (122)
24 PF13360 PQQ_2: PQQ-like domai 73.2 12 0.00027 34.7 7.2 74 68-141 13-103 (238)
25 PF01683 EB: EB module; Inter 69.1 3.9 8.4E-05 29.2 2.1 33 264-297 16-48 (52)
26 PF07645 EGF_CA: Calcium-bindi 67.5 2.4 5.3E-05 28.9 0.7 29 268-296 3-35 (42)
27 PF15102 TMEM154: TMEM154 prot 67.0 4.1 8.9E-05 35.9 2.2 12 409-420 57-68 (146)
28 PF06024 DUF912: Nucleopolyhed 63.3 7.2 0.00016 32.3 2.9 11 428-438 82-92 (101)
29 PF13360 PQQ_2: PQQ-like domai 62.8 40 0.00086 31.2 8.3 73 68-140 56-148 (238)
30 KOG0291 WD40-repeat-containing 61.7 2.3E+02 0.0051 31.7 14.4 85 85-183 353-451 (893)
31 cd00053 EGF Epidermal growth f 61.1 8.1 0.00017 24.1 2.3 27 270-296 2-31 (36)
32 PRK11138 outer membrane biogen 59.3 40 0.00087 34.5 8.3 24 88-111 339-363 (394)
33 PF07974 EGF_2: EGF-like domai 59.1 8 0.00017 25.0 1.9 22 274-295 6-28 (32)
34 PF01034 Syndecan: Syndecan do 55.5 4.1 8.8E-05 30.7 0.1 8 431-438 32-39 (64)
35 smart00179 EGF_CA Calcium-bind 55.1 11 0.00023 24.4 2.2 28 268-295 3-33 (39)
36 KOG4649 PQQ (pyrrolo-quinoline 54.4 37 0.0008 33.2 6.3 44 69-112 170-217 (354)
37 PF02439 Adeno_E3_CR2: Adenovi 53.5 1.3 2.9E-05 29.7 -2.4 13 408-420 6-18 (38)
38 PF06365 CD34_antigen: CD34/Po 52.7 7.2 0.00016 36.5 1.3 23 418-440 111-133 (202)
39 PF01436 NHL: NHL repeat; Int 51.1 24 0.00053 21.7 3.1 20 88-107 7-26 (28)
40 PF01034 Syndecan: Syndecan do 49.4 5.1 0.00011 30.2 -0.2 28 409-436 13-40 (64)
41 PF02009 Rifin_STEVOR: Rifin/s 49.3 2 4.3E-05 42.7 -3.1 12 428-439 275-287 (299)
42 PF12877 DUF3827: Domain of un 49.0 6.9 0.00015 42.4 0.6 31 408-438 269-299 (684)
43 TIGR01478 STEVOR variant surfa 47.1 4.2 9.2E-05 39.7 -1.2 7 286-292 143-149 (295)
44 PTZ00370 STEVOR; Provisional 45.5 4.6 9.9E-05 39.6 -1.2 7 286-292 143-149 (296)
45 cd00054 EGF_CA Calcium-binding 45.3 19 0.0004 22.8 2.1 28 268-295 3-33 (38)
46 TIGR03300 assembly_YfgL outer 44.3 85 0.0018 31.7 7.8 54 86-139 66-130 (377)
47 KOG1219 Uncharacterized conser 44.1 19 0.00042 44.8 3.2 24 274-297 3870-3897(4289)
48 PF14610 DUF4448: Protein of u 43.3 26 0.00057 32.2 3.5 28 408-438 160-187 (189)
49 PTZ00208 65 kDa invariant surf 42.5 12 0.00026 38.3 1.2 30 410-439 388-417 (436)
50 PRK11138 outer membrane biogen 41.7 78 0.0017 32.3 7.1 61 78-139 251-319 (394)
51 PTZ00046 rifin; Provisional 41.7 5.4 0.00012 40.5 -1.5 13 428-440 334-347 (358)
52 PF13908 Shisa: Wnt and FGF in 41.5 23 0.0005 32.3 2.8 13 408-420 78-90 (179)
53 PF12662 cEGF: Complement Clr- 40.9 21 0.00045 21.5 1.6 11 287-297 3-13 (24)
54 PF01299 Lamp: Lysosome-associ 40.7 14 0.0003 36.9 1.2 28 408-438 273-300 (306)
55 TIGR01477 RIFIN variant surfac 39.9 4.8 0.0001 40.7 -2.1 13 428-440 329-342 (353)
56 PF12661 hEGF: Human growth fa 39.7 11 0.00025 19.2 0.3 9 287-295 1-9 (13)
57 TIGR03066 Gem_osc_para_1 Gemma 39.5 67 0.0015 27.1 5.0 52 84-135 34-103 (111)
58 PF14269 Arylsulfotran_2: Aryl 39.3 2.6E+02 0.0056 27.8 10.1 46 88-133 149-220 (299)
59 PF12768 Rax2: Cortical protei 38.5 35 0.00077 33.7 3.7 33 408-440 230-263 (281)
60 PF03302 VSP: Giardia variant- 38.5 19 0.00041 37.4 1.9 16 421-436 381-396 (397)
61 smart00564 PQQ beta-propeller 38.1 59 0.0013 20.1 3.6 18 91-108 13-31 (33)
62 PF02480 Herpes_gE: Alphaherpe 35.2 13 0.00027 39.2 0.0 27 302-330 239-265 (439)
63 PF12947 EGF_3: EGF domain; I 34.7 9.8 0.00021 25.2 -0.6 23 273-295 5-30 (36)
64 PF05935 Arylsulfotrans: Aryls 33.4 1.7E+02 0.0036 31.1 8.2 62 67-132 136-208 (477)
65 PF05454 DAG1: Dystroglycan (D 30.6 17 0.00036 36.1 0.0 7 432-438 169-175 (290)
66 PF05935 Arylsulfotrans: Aryls 29.4 89 0.0019 33.2 5.3 53 93-146 127-187 (477)
67 TIGR02513 type_III_yscB type I 28.2 1E+02 0.0022 26.8 4.3 53 78-130 14-93 (139)
68 PF09064 Tme5_EGF_like: Thromb 27.0 44 0.00095 22.0 1.5 10 286-295 18-27 (34)
69 PF13570 PQQ_3: PQQ-like domai 26.8 99 0.0021 20.3 3.3 8 104-111 2-9 (40)
70 PF15065 NCU-G1: Lysosomal tra 26.3 22 0.00049 36.2 0.1 14 137-151 73-86 (350)
71 COG1520 FOG: WD40-like repeat 26.2 3.9E+02 0.0085 26.9 9.2 73 68-141 131-226 (370)
72 TIGR03300 assembly_YfgL outer 26.0 1.9E+02 0.0041 29.1 6.8 62 78-140 236-305 (377)
73 PF06697 DUF1191: Protein of u 25.7 28 0.00061 34.2 0.6 17 410-426 215-231 (278)
74 PF08374 Protocadherin: Protoc 24.5 14 0.0003 34.8 -1.7 6 410-415 43-48 (221)
75 PF06006 DUF905: Bacterial pro 24.2 80 0.0017 24.2 2.6 17 127-143 35-51 (70)
76 PHA03099 epidermal growth fact 24.0 41 0.00089 29.1 1.2 9 430-438 123-131 (139)
77 PF01011 PQQ: PQQ enzyme repea 23.9 95 0.0021 20.3 2.8 22 91-112 7-29 (38)
78 KOG3637 Vitronectin receptor, 23.1 40 0.00088 39.4 1.3 32 406-437 977-1010(1030)
79 PF12545 DUF3739: Filamentous 22.4 4.2E+02 0.0092 22.4 6.8 10 68-77 3-12 (112)
80 KOG4289 Cadherin EGF LAG seven 22.4 59 0.0013 38.9 2.4 36 272-307 1243-1285(2531)
81 PF05568 ASFV_J13L: African sw 22.0 21 0.00045 31.4 -1.0 12 428-439 49-60 (189)
82 cd00216 PQQ_DH Dehydrogenases 21.1 2.8E+02 0.0061 29.4 7.2 70 69-139 40-135 (488)
83 COG1520 FOG: WD40-like repeat 21.1 2.4E+02 0.0051 28.5 6.4 60 79-139 106-178 (370)
84 smart00181 EGF Epidermal growt 20.9 78 0.0017 19.8 1.9 22 274-296 6-30 (35)
No 1
>PF01453 B_lectin: D-mannose binding lectin; InterPro: IPR001480 A bulb lectin super-family (Amaryllidaceae, Orchidaceae and Alliaceae) contains a ~115-residue-long domain whose overall three dimensional fold is very similar to that of Dictyostelium discoideum comitin, an actin binding protein, and Curculigo latifolia curculin, a sweet tasting and taste-modifying protein. This domain generally binds mannose, but in at least one protein, curculin, it is apparently devoid of mannose-binding activity. Each bulb-type lectin domain consists of three sequential beta-sheet subdomains (I, II, III) that are inter-related by pseudo three-fold symmetry. The three subdomains are flat four-stranded, antiparallel beta-sheets. Together they form a 12-stranded beta-barrel in which the barrel axis coincides with the pseudo 3-fold axis.; GO: 0005529 sugar binding; PDB: 3M7H_A 3M7J_B 3MEZ_D 1DLP_A 1BWU_D 1KJ1_A 1B2P_A 1XD6_A 2DPF_C 2D04_B ....
Probab=99.94 E-value=5.1e-28 Score=206.08 Aligned_cols=97 Identities=40% Similarity=0.570 Sum_probs=72.9
Q ss_pred CEEEEcCCCCCCcccC-CceEEEEecCCcEEEEcCCCceEEee-cCCCCc--cceEEEecCCCeEEEeecCeEEEEeecc
Q 013272 68 DVKVWNSGHYSRFYVS-EKCVLELTKDGDLRLKGPNDRVGWLS-GTSRQG--VERLQILRTGNLVLVDVVNRVKWQSFNF 143 (446)
Q Consensus 68 ~~vVW~ANrd~P~~~~-~~~~L~lt~dG~LvL~d~~g~~vWst-~~~~~~--~~~a~LlDsGNLVL~d~~~~~lWQSFD~ 143 (446)
+++||+|||+.|+... ...+|.|+.||+|+|.|..++++|++ ++.+.. ...|+|+|+|||||+|..+.+|||||||
T Consensus 2 ~tvvW~an~~~p~~~~s~~~~L~l~~dGnLvl~~~~~~~iWss~~t~~~~~~~~~~~L~~~GNlvl~d~~~~~lW~Sf~~ 81 (114)
T PF01453_consen 2 RTVVWVANRNSPLTSSSGNYTLILQSDGNLVLYDSNGSVIWSSNNTSGRGNSGCYLVLQDDGNLVLYDSSGNVLWQSFDY 81 (114)
T ss_dssp --------TTEEEEECETTEEEEEETTSEEEEEETTTEEEEE--S-TTSS-SSEEEEEETTSEEEEEETTSEEEEESTTS
T ss_pred cccccccccccccccccccccceECCCCeEEEEcCCCCEEEEecccCCccccCeEEEEeCCCCEEEEeecceEEEeecCC
Confidence 4699999999998431 35899999999999999999999999 555443 6899999999999999889999999999
Q ss_pred CccccccccccccC------cEEEeCC
Q 013272 144 PTDVMLWGQRLNVA------TRLTSFP 164 (446)
Q Consensus 144 PTDTlLPGq~L~~~------~~L~S~~ 164 (446)
||||+||||+|+.+ ..|+||+
T Consensus 82 ptdt~L~~q~l~~~~~~~~~~~~~sw~ 108 (114)
T PF01453_consen 82 PTDTLLPGQKLGDGNVTGKNDSLTSWS 108 (114)
T ss_dssp SS-EEEEEET--TSEEEEESTSSEEEE
T ss_pred CccEEEeccCcccCCCccccceEEeEC
Confidence 99999999999873 2478887
No 2
>cd00028 B_lectin Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses.
Probab=99.87 E-value=1.6e-21 Score=166.75 Aligned_cols=112 Identities=29% Similarity=0.479 Sum_probs=89.0
Q ss_pred CCCCCceEeeccCCcCCCCceEEEEEeecCCCCCc-eEEEEEEeecCCceeeeeEEeeCCEEEEcCCCCCCcccCCceEE
Q 013272 10 DIQKGYKLTLAVPAEYSLGFIGRAFLIETDQIAPN-FRAAVSVEAVNGKFSCSLEVLLGDVKVWNSGHYSRFYVSEKCVL 88 (446)
Q Consensus 10 ~i~~g~~l~~~~~~~~a~G~~~~~f~~~~~~~~~~-f~~gi~~~~~~~~~~~~~~~~~~~~vVW~ANrd~P~~~~~~~~L 88 (446)
.+..|+.|...+. .++.|||. .. ... +..+||+.+.+ + ++||+|||+.|. ...++|
T Consensus 3 ~l~~~~~l~s~~~-~f~~G~~~-----~~---~q~~dgnlv~~~~~~-~-----------~~vW~snt~~~~--~~~~~l 59 (116)
T cd00028 3 PLSSGQTLVSSGS-LFELGFFK-----LI---MQSRDYNLILYKGSS-R-----------TVVWVANRDNPS--GSSCTL 59 (116)
T ss_pred CcCCCCEEEeCCC-cEEEeccc-----CC---CCCCeEEEEEEeCCC-C-----------eEEEECCCCCCC--CCCEEE
Confidence 3556777755443 23466665 22 133 88899987654 3 489999999873 357899
Q ss_pred EEecCCcEEEEcCCCceEEeecCCC-CccceEEEecCCCeEEEeecCeEEEEeeccC
Q 013272 89 ELTKDGDLRLKGPNDRVGWLSGTSR-QGVERLQILRTGNLVLVDVVNRVKWQSFNFP 144 (446)
Q Consensus 89 ~lt~dG~LvL~d~~g~~vWst~~~~-~~~~~a~LlDsGNLVL~d~~~~~lWQSFD~P 144 (446)
.|+.||+|+|.|.+|.++|++++.+ ....+|+|+|+|||||++.++.+||||||||
T Consensus 60 ~l~~dGnLvl~~~~g~~vW~S~~~~~~~~~~~~L~ddGnlvl~~~~~~~~W~Sf~~P 116 (116)
T cd00028 60 TLQSDGNLVIYDGSGTVVWSSNTTRVNGNYVLVLLDDGNLVLYDSDGNFLWQSFDYP 116 (116)
T ss_pred EEecCCCeEEEcCCCcEEEEecccCCCCceEEEEeCCCCEEEECCCCCEEEcCCCCC
Confidence 9999999999999999999999875 4567899999999999999999999999999
No 3
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles []. Most of the proteins within this family contain apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=99.84 E-value=2e-20 Score=158.44 Aligned_cols=102 Identities=23% Similarity=0.422 Sum_probs=80.6
Q ss_pred eeccCCCCceeeEEe-eC---cceeEEEecCCeEEEEEEe-cCCCeEEEEEEEccCCeEEEEEeecCCCCeeEEeeeccC
Q 013272 194 WEFKPSKNRNISFIA-LG---SNGLGLFNDKGKKIAQIYS-QRLQPLRFLSLGNRTGNLALYHYSANDRNFQASFQAINK 268 (446)
Q Consensus 194 w~~~~~~~~~~~~~~-~~---~~g~~~~~~~~~~~~~~~~-~~~~~~~~~~l~d~dG~l~~y~~~~~~~~W~~~w~~p~d 268 (446)
|++|+|++..|+.+. +. ...+.+. .+.++++..+. .+.+.++|++|+ ++|++++|.|.+..++|...|.+|.|
T Consensus 1 wrsG~WnG~~f~g~p~~~~~~~~~~~fv-~~~~e~~~t~~~~~~s~~~r~~ld-~~G~l~~~~w~~~~~~W~~~~~~p~d 78 (110)
T PF00954_consen 1 WRSGPWNGQRFSGIPEMSSNSLYNYSFV-SNNEEVYYTYSLSNSSVLSRLVLD-SDGQLQRYIWNESTQSWSVFWSAPKD 78 (110)
T ss_pred CCccccCCeEECCcccccccceeEEEEE-ECCCeEEEEEecCCCceEEEEEEe-eeeEEEEEEEecCCCcEEEEEEeccc
Confidence 677888777765532 22 1223333 34555666554 456778899999 99999999999999999999999999
Q ss_pred CCCCCCCCCCCCcCCCC--CceeecCCCCCC
Q 013272 269 TCDLPLGCKPCEICTFT--NSCSCIGLLTKK 297 (446)
Q Consensus 269 ~C~~~~~CG~~g~C~~~--~~C~Cl~gf~~~ 297 (446)
+||+|+.||+||+|+.+ +.|+||+||+|+
T Consensus 79 ~Cd~y~~CG~~g~C~~~~~~~C~Cl~GF~P~ 109 (110)
T PF00954_consen 79 QCDVYGFCGPNGICNSNNSPKCSCLPGFEPK 109 (110)
T ss_pred CCCCccccCCccEeCCCCCCceECCCCcCCC
Confidence 99999999999999864 689999999986
No 4
>smart00108 B_lectin Bulb-type mannose-specific lectin.
Probab=99.84 E-value=2.7e-20 Score=158.66 Aligned_cols=110 Identities=29% Similarity=0.495 Sum_probs=87.1
Q ss_pred CCCCceEeeccCCcCCCCceEEEEEeecCCCCCceEEEEEEeecCCceeeeeEEeeCCEEEEcCCCCCCcccCCceEEEE
Q 013272 11 IQKGYKLTLAVPAEYSLGFIGRAFLIETDQIAPNFRAAVSVEAVNGKFSCSLEVLLGDVKVWNSGHYSRFYVSEKCVLEL 90 (446)
Q Consensus 11 i~~g~~l~~~~~~~~a~G~~~~~f~~~~~~~~~~f~~gi~~~~~~~~~~~~~~~~~~~~vVW~ANrd~P~~~~~~~~L~l 90 (446)
+..|+.|...+.. +++|||. .. ...+..+||+...+ + ++||+|||+.|+. ..+.|.|
T Consensus 4 l~~~~~l~s~~~~-f~~G~~~-----~~---~q~dgnlV~~~~~~-~-----------~~vW~snt~~~~~--~~~~l~l 60 (114)
T smart00108 4 LSSGQTLVSGNSL-FELGFFT-----LI---MQNDYNLILYKSSS-R-----------TVVWVANRDNPVS--DSCTLTL 60 (114)
T ss_pred cCCCCEEecCCCc-Eeeeccc-----cC---CCCCEEEEEEECCC-C-----------cEEEECCCCCCCC--CCEEEEE
Confidence 4456677554442 3677766 22 24678889987654 3 4899999998864 3589999
Q ss_pred ecCCcEEEEcCCCceEEeecCC-CCccceEEEecCCCeEEEeecCeEEEEeecc
Q 013272 91 TKDGDLRLKGPNDRVGWLSGTS-RQGVERLQILRTGNLVLVDVVNRVKWQSFNF 143 (446)
Q Consensus 91 t~dG~LvL~d~~g~~vWst~~~-~~~~~~a~LlDsGNLVL~d~~~~~lWQSFD~ 143 (446)
++||+|+|.|.+|.++|++++. +.....|+|+|+|||||++..+++|||||||
T Consensus 61 ~~dGnLvl~~~~g~~vW~S~t~~~~~~~~~~L~ddGnlvl~~~~~~~~W~Sf~~ 114 (114)
T smart00108 61 QSDGNLVLYDGDGRVVWSSNTTGANGNYVLVLLDDGNLVIYDSDGNFLWQSFDY 114 (114)
T ss_pred eCCCCEEEEeCCCCEEEEecccCCCCceEEEEeCCCCEEEECCCCCEEeCCCCC
Confidence 9999999999999999999986 4456789999999999999989999999997
No 5
>cd01098 PAN_AP_plant Plant PAN/APPLE-like domain; present in plant S-receptor protein kinases and secreted glycoproteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. S-receptor protein kinases and S-locus glycoproteins are involved in sporophytic self-incompatibility response in Brassica, one of probably many molecular mechanisms, by which hermaphrodite flowering plants avoid self-fertilization.
Probab=99.47 E-value=1.2e-13 Score=110.46 Aligned_cols=79 Identities=22% Similarity=0.473 Sum_probs=59.8
Q ss_pred eccCCc--ceeeEEecCeeecCCCCCccccCCHHHHHHHhhcCCCeEeEEecCCCCCeEEcc-eecceEEeccCCCeEEE
Q 013272 312 GLCGRN--RVEMLELEGVGSVLRDGPKMVNVSKEECASMCTSDCKCVGVLYSSAELECFFYG-VVMGVKQVEKRSGLIYM 388 (446)
Q Consensus 312 ~~C~~~--~~~f~~l~~~~~~~~~~~~~~~~~~~~C~~~CL~nCsC~A~~y~~~~~~C~~~~-~l~~~~~~~~~~~~~~~ 388 (446)
+.|... .++|++++++++|...... ...++++||++||+||+|+||+|.+++++|++|. .+.+.+.... .+..+|
T Consensus 3 ~~C~~~~~~~~f~~~~~~~~~~~~~~~-~~~s~~~C~~~Cl~nCsC~a~~~~~~~~~C~~~~~~~~~~~~~~~-~~~~~y 80 (84)
T cd01098 3 LNCGGDGSTDGFLKLPDVKLPDNASAI-TAISLEECREACLSNCSCTAYAYNNGSGGCLLWNGLLNNLRSLSS-GGGTLY 80 (84)
T ss_pred cccCCCCCCCEEEEeCCeeCCCchhhh-ccCCHHHHHHHHhcCCCcceeeecCCCCeEEEEeceecceEeecC-CCcEEE
Confidence 346543 3789999999987543333 6689999999999999999999987678999994 5555554332 247899
Q ss_pred EEEe
Q 013272 389 VKVA 392 (446)
Q Consensus 389 iKv~ 392 (446)
|||+
T Consensus 81 iKv~ 84 (84)
T cd01098 81 LRLA 84 (84)
T ss_pred EEeC
Confidence 9985
No 6
>PF08276 PAN_2: PAN-like domain; InterPro: IPR013227 PAN domains have significant functional versatility fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. These domains contain a hair-pin loop like structure, similar to knottins, but the pattern of disulphide bonds differs.
Probab=99.46 E-value=7.5e-14 Score=107.11 Aligned_cols=59 Identities=27% Similarity=0.493 Sum_probs=48.6
Q ss_pred ceeeEEecCeeecCCCCCc-cccCCHHHHHHHhhcCCCeEeEEecC--CCCCeEEc-ceecce
Q 013272 318 RVEMLELEGVGSVLRDGPK-MVNVSKEECASMCTSDCKCVGVLYSS--AELECFFY-GVVMGV 376 (446)
Q Consensus 318 ~~~f~~l~~~~~~~~~~~~-~~~~~~~~C~~~CL~nCsC~A~~y~~--~~~~C~~~-~~l~~~ 376 (446)
.++|++|++|++|...... ...+++++||++||+||||+||+|.+ ++++|++| ++|+|+
T Consensus 4 ~d~F~~l~~~~~p~~~~~~~~~~~s~~~C~~~Cl~nCsC~Ayay~~~~~~~~C~lW~~~L~d~ 66 (66)
T PF08276_consen 4 GDGFLKLPNMKLPDFDNAIVDSSVSLEECEKACLSNCSCTAYAYSNLSGGGGCLLWYGDLVDL 66 (66)
T ss_pred CCEEEEECCeeCCCCcceeeecCCCHHHHHhhcCCCCCEeeEEeeccCCCCEEEEEcCEeecC
Confidence 5899999999987642222 25689999999999999999999985 56899999 788764
No 7
>cd00129 PAN_APPLE PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions.
Probab=99.18 E-value=2.9e-11 Score=96.02 Aligned_cols=66 Identities=17% Similarity=0.308 Sum_probs=53.5
Q ss_pred eeeEEecCeeecCCCCCccccCCHHHHHHHhhc---CCCeEeEEecCCCCCeEEc-cee-cceEEeccCCCeEEEEE
Q 013272 319 VEMLELEGVGSVLRDGPKMVNVSKEECASMCTS---DCKCVGVLYSSAELECFFY-GVV-MGVKQVEKRSGLIYMVK 390 (446)
Q Consensus 319 ~~f~~l~~~~~~~~~~~~~~~~~~~~C~~~CL~---nCsC~A~~y~~~~~~C~~~-~~l-~~~~~~~~~~~~~~~iK 390 (446)
..|+++.+++.|. ....+++||++.|++ ||||.||+|.+.+++|++| +++ +++++..++ +.++|||
T Consensus 9 g~fl~~~~~klpd-----~~~~s~~eC~~~Cl~~~~nCsC~Aya~~~~~~gC~~W~~~l~~d~~~~~~~-g~~Ly~r 79 (80)
T cd00129 9 GTTLIKIALKIKT-----TKANTADECANRCEKNGLPFSCKAFVFAKARKQCLWFPFNSMSGVRKEFSH-GFDLYEN 79 (80)
T ss_pred CeEEEeecccCCc-----ccccCHHHHHHHHhcCCCCCCceeeeccCCCCCeEEecCcchhhHHhccCC-CceeEeE
Confidence 5688888888653 223789999999999 9999999997655789999 788 888877654 5789998
No 8
>smart00108 B_lectin Bulb-type mannose-specific lectin.
Probab=98.56 E-value=3.3e-07 Score=77.92 Aligned_cols=85 Identities=25% Similarity=0.373 Sum_probs=59.5
Q ss_pred EEEecCCcEEEEcCC-CceEEeecCCCC--ccceEEEecCCCeEEEeecCeEEEEeeccCccccccccccccCcEEEeCC
Q 013272 88 LELTKDGDLRLKGPN-DRVGWLSGTSRQ--GVERLQILRTGNLVLVDVVNRVKWQSFNFPTDVMLWGQRLNVATRLTSFP 164 (446)
Q Consensus 88 L~lt~dG~LvL~d~~-g~~vWst~~~~~--~~~~a~LlDsGNLVL~d~~~~~lWQSFD~PTDTlLPGq~L~~~~~L~S~~ 164 (446)
+.+..||+||+.+.. +.++|++++... ....+.|.++|||||+|.++.++|+| ++.
T Consensus 24 ~~~q~dgnlV~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S--~t~------------------- 82 (114)
T smart00108 24 LIMQNDYNLILYKSSSRTVVWVANRDNPVSDSCTLTLQSDGNLVLYDGDGRVVWSS--NTT------------------- 82 (114)
T ss_pred cCCCCCEEEEEEECCCCcEEEECCCCCCCCCCEEEEEeCCCCEEEEeCCCCEEEEe--ccc-------------------
Confidence 344578999998764 579999987533 22578999999999999988999998 110
Q ss_pred CCCCcceEEEEecc-eeEEEEecCCccceeeecc
Q 013272 165 GNSTEFYSFEIQRY-RIALFLHSGKLNYSYWEFK 197 (446)
Q Consensus 165 ~~s~G~y~l~~~~~-~~~l~~~~~~~~~~Yw~~~ 197 (446)
...+.|.+.|+++ ++++|-..+ ...|.+.
T Consensus 83 -~~~~~~~~~L~ddGnlvl~~~~~---~~~W~Sf 112 (114)
T smart00108 83 -GANGNYVLVLLDDGNLVIYDSDG---NFLWQSF 112 (114)
T ss_pred -CCCCceEEEEeCCCCEEEECCCC---CEEeCCC
Confidence 1235678888854 567763222 2678653
No 9
>cd00028 B_lectin Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses.
Probab=98.48 E-value=5.4e-07 Score=76.87 Aligned_cols=80 Identities=23% Similarity=0.324 Sum_probs=57.6
Q ss_pred CCcEEEEcCC-CceEEeecCCCC--ccceEEEecCCCeEEEeecCeEEEEeeccCccccccccccccCcEEEeCCCCCCc
Q 013272 93 DGDLRLKGPN-DRVGWLSGTSRQ--GVERLQILRTGNLVLVDVVNRVKWQSFNFPTDVMLWGQRLNVATRLTSFPGNSTE 169 (446)
Q Consensus 93 dG~LvL~d~~-g~~vWst~~~~~--~~~~a~LlDsGNLVL~d~~~~~lWQSFD~PTDTlLPGq~L~~~~~L~S~~~~s~G 169 (446)
||+||+.+.. ++++|++++... ....+.|.++|||||+|.++.++|+|=-. ...+
T Consensus 30 dgnlv~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S~~~----------------------~~~~ 87 (116)
T cd00028 30 DYNLILYKGSSRTVVWVANRDNPSGSSCTLTLQSDGNLVIYDGSGTVVWSSNTT----------------------RVNG 87 (116)
T ss_pred eEEEEEEeCCCCeEEEECCCCCCCCCCEEEEEecCCCeEEEcCCCcEEEEeccc----------------------CCCC
Confidence 8899998754 579999997642 34678999999999999988999987311 0235
Q ss_pred ceEEEEec-ceeEEEEecCCccceeeecc
Q 013272 170 FYSFEIQR-YRIALFLHSGKLNYSYWEFK 197 (446)
Q Consensus 170 ~y~l~~~~-~~~~l~~~~~~~~~~Yw~~~ 197 (446)
.+.+.|++ +++++|-..+ ...|.+.
T Consensus 88 ~~~~~L~ddGnlvl~~~~~---~~~W~Sf 113 (116)
T cd00028 88 NYVLVLLDDGNLVLYDSDG---NFLWQSF 113 (116)
T ss_pred ceEEEEeCCCCEEEECCCC---CEEEcCC
Confidence 67888875 4567763322 3678764
No 10
>smart00473 PAN_AP divergent subfamily of APPLE domains. Apple-like domains present in Plasminogen, C. elegans hypothetical ORFs and the extracellular portion of plant receptor-like protein kinases. Predicted to possess protein- and/or carbohydrate-binding functions.
Probab=98.39 E-value=9.7e-07 Score=68.86 Aligned_cols=71 Identities=20% Similarity=0.333 Sum_probs=49.2
Q ss_pred eeeEEecCeeecCCCCCccccCCHHHHHHHhhc-CCCeEeEEecCCCCCeEEcc-e-ecceEEeccCCCeEEEEE
Q 013272 319 VEMLELEGVGSVLRDGPKMVNVSKEECASMCTS-DCKCVGVLYSSAELECFFYG-V-VMGVKQVEKRSGLIYMVK 390 (446)
Q Consensus 319 ~~f~~l~~~~~~~~~~~~~~~~~~~~C~~~CL~-nCsC~A~~y~~~~~~C~~~~-~-l~~~~~~~~~~~~~~~iK 390 (446)
..|+.++++.++..........++++|++.|++ +|+|.|+.|...++.|.+|. . +.+..... ..+.++|.|
T Consensus 4 ~~f~~~~~~~l~~~~~~~~~~~s~~~C~~~C~~~~~~C~s~~y~~~~~~C~l~~~~~~~~~~~~~-~~~~~~y~~ 77 (78)
T smart00473 4 DCFVRLPNTKLPGFSRIVISVASLEECASKCLNSNCSCRSFTYNNGTKGCLLWSESSLGDARLFP-SGGVDLYEK 77 (78)
T ss_pred ceeEEecCccCCCCcceeEcCCCHHHHHHHhCCCCCceEEEEEcCCCCEEEEeeCCccccceecc-cCCceeEEe
Confidence 568899999876322222345799999999999 99999999976568999986 3 33333222 223566665
No 11
>PF01453 B_lectin: D-mannose binding lectin; InterPro: IPR001480 A bulb lectin super-family (Amaryllidaceae, Orchidaceae and Alliaceae) contains a ~115-residue-long domain whose overall three dimensional fold is very similar to that of Dictyostelium discoideum comitin, an actin binding protein, and Curculigo latifolia curculin, a sweet tasting and taste-modifying protein. This domain generally binds mannose, but in at least one protein, curculin, it is apparently devoid of mannose-binding activity. Each bulb-type lectin domain consists of three sequential beta-sheet subdomains (I, II, III) that are inter-related by pseudo three-fold symmetry. The three subdomains are flat four-stranded, antiparallel beta-sheets. Together they form a 12-stranded beta-barrel in which the barrel axis coincides with the pseudo 3-fold axis.; GO: 0005529 sugar binding; PDB: 3M7H_A 3M7J_B 3MEZ_D 1DLP_A 1BWU_D 1KJ1_A 1B2P_A 1XD6_A 2DPF_C 2D04_B ....
Probab=97.88 E-value=0.00012 Score=62.36 Aligned_cols=74 Identities=26% Similarity=0.327 Sum_probs=50.4
Q ss_pred CEEEEcC-CCCCCcccCCceEEEEecCCcEEEEcCCCceEEeecCCCCccceEEEec--CCCeEEEeecCeEEEEeeccC
Q 013272 68 DVKVWNS-GHYSRFYVSEKCVLELTKDGDLRLKGPNDRVGWLSGTSRQGVERLQILR--TGNLVLVDVVNRVKWQSFNFP 144 (446)
Q Consensus 68 ~~vVW~A-Nrd~P~~~~~~~~L~lt~dG~LvL~d~~g~~vWst~~~~~~~~~a~LlD--sGNLVL~d~~~~~lWQSFD~P 144 (446)
+++||.. +...+.. ....+.|.+||||||.|..+.++|++.. ....+.+.+++ .||++ ......+.|.|=+.|
T Consensus 38 ~~~iWss~~t~~~~~--~~~~~~L~~~GNlvl~d~~~~~lW~Sf~-~ptdt~L~~q~l~~~~~~-~~~~~~~sw~s~~dp 113 (114)
T PF01453_consen 38 GSVIWSSNNTSGRGN--SGCYLVLQDDGNLVLYDSSGNVLWQSFD-YPTDTLLPGQKLGDGNVT-GKNDSLTSWSSNTDP 113 (114)
T ss_dssp TEEEEE--S-TTSS---SSEEEEEETTSEEEEEETTSEEEEESTT-SSS-EEEEEET--TSEEE-EESTSSEEEESS---
T ss_pred CCEEEEecccCCccc--cCeEEEEeCCCCEEEEeecceEEEeecC-CCccEEEeccCcccCCCc-cccceEEeECCCCCC
Confidence 3689999 4333321 3688999999999999999999999843 23445666777 88888 665567899887766
Q ss_pred c
Q 013272 145 T 145 (446)
Q Consensus 145 T 145 (446)
+
T Consensus 114 s 114 (114)
T PF01453_consen 114 S 114 (114)
T ss_dssp -
T ss_pred C
Confidence 3
No 12
>cd01100 APPLE_Factor_XI_like Subfamily of PAN/APPLE-like domains; present in plasma prekallikrein/coagulation factor XI, microneme antigen proteins, and a few prokaryotic proteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions.
Probab=97.76 E-value=3.1e-05 Score=60.36 Aligned_cols=50 Identities=24% Similarity=0.419 Sum_probs=37.9
Q ss_pred ecCeeecCCCCCccccCCHHHHHHHhhcCCCeEeEEecCCCCCeEEccee
Q 013272 324 LEGVGSVLRDGPKMVNVSKEECASMCTSDCKCVGVLYSSAELECFFYGVV 373 (446)
Q Consensus 324 l~~~~~~~~~~~~~~~~~~~~C~~~CL~nCsC~A~~y~~~~~~C~~~~~l 373 (446)
+++++++..+.......+.++|++.|+.+|+|.|+.|....+.|+++...
T Consensus 9 ~~~~~~~g~d~~~~~~~s~~~Cq~~C~~~~~C~afT~~~~~~~C~lk~~~ 58 (73)
T cd01100 9 GSNVDFRGGDLSTVFASSAEQCQAACTADPGCLAFTYNTKSKKCFLKSSE 58 (73)
T ss_pred cCCCccccCCcceeecCCHHHHHHHcCCCCCceEEEEECCCCeEEcccCC
Confidence 35676654333333456899999999999999999998667899998543
No 13
>smart00223 APPLE APPLE domain. Four-fold repeat in plasma kallikrein and coagulation factor XI. Factor XI apple 3 mediates binding to platelets. Factor XI apple 1 binds high-molecular-mass kininogen. Apple 4 in factor XI mediates dimer formation and binds to factor XIIa. Mutations in apple 4 cause factor XI deficiency, an inherited bleeding disorder.
Probab=94.20 E-value=0.062 Score=42.57 Aligned_cols=48 Identities=23% Similarity=0.345 Sum_probs=37.5
Q ss_pred cCeeecCCCCCccccCCHHHHHHHhhcCCCeEeEEecCCCC---CeEEcce
Q 013272 325 EGVGSVLRDGPKMVNVSKEECASMCTSDCKCVGVLYSSAEL---ECFFYGV 372 (446)
Q Consensus 325 ~~~~~~~~~~~~~~~~~~~~C~~~CL~nCsC~A~~y~~~~~---~C~~~~~ 372 (446)
++++++..+.......+.++|++.|..+=.|.++.|..... .|+++..
T Consensus 7 ~~~df~G~Dl~~~~~~~~~~Cq~~Ct~~~~C~~FTf~~~~~~~~~C~LK~s 57 (79)
T smart00223 7 KNVDFRGSDINTVYVPSAQVCQKRCTSHPRCLFFTFSTNEPPEEKCLLKDS 57 (79)
T ss_pred cCccccCceeeeeecCCHHHHHHhhcCCCCccEEEeeCCCCCCCEeEeCcC
Confidence 56666544555556678999999999999999999976555 8998854
No 14
>PF00024 PAN_1: PAN domain This Prosite entry concerns apple domains, a subset of PAN domains; InterPro: IPR003014 PAN domains have significant functional versatility fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. These domains contain a hair-pin loop like structure, similar to knottins, but the pattern of disulphide bonds differs. It has been shown that the N-terminal N domains of members of the plasminogen/hepatocyte growth factor family, the apple domains of the plasma prekallikrein/coagulation factor XI family, and domains of various nematode proteins belong to the same module superfamily, the PAN module. PAN contains a conserved core of three disulphide bridges. In some members of the family there is an additional fourth disulphide bridge that links the N and C termini of the domain.; PDB: 1GP9_C 2QJ2_B 1GMO_H 1NK1_B 3MKP_B 1BHT_B 3HN4_A 1GMN_A 3HMS_A 3HMT_B ....
Probab=93.66 E-value=0.085 Score=40.72 Aligned_cols=51 Identities=20% Similarity=0.405 Sum_probs=37.2
Q ss_pred eEEecCeeecCCCCCccccCCHHHHHHHhhcCCC-eEeEEecCCCCCeEEcc
Q 013272 321 MLELEGVGSVLRDGPKMVNVSKEECASMCTSDCK-CVGVLYSSAELECFFYG 371 (446)
Q Consensus 321 f~~l~~~~~~~~~~~~~~~~~~~~C~~~CL~nCs-C~A~~y~~~~~~C~~~~ 371 (446)
|..+++..+...........++++|.+.|+.+=. |.++.|....+.|.++.
T Consensus 4 f~~~~~~~l~~~~~~~~~v~s~~~C~~~C~~~~~~C~s~~y~~~~~~C~L~~ 55 (79)
T PF00024_consen 4 FERIPGYRLSGHSIKEINVPSLEECAQLCLNEPRRCKSFNYDPSSKTCYLSS 55 (79)
T ss_dssp EEEEEEEEEESCEEEEEEESSHHHHHHHHHHSTT-ESEEEEETTTTEEEEEC
T ss_pred eEEECCEEEeCCcceEEcCCCHHHHHhhcCcCcccCCeEEEECCCCEEEEcC
Confidence 5666666654322222333489999999999999 99999987778999863
No 15
>PF08693 SKG6: Transmembrane alpha-helix domain; InterPro: IPR014805 SKG6 and AXL2 are membrane proteins that show polarised intracellular localisation [, ]. This entry represents the highly conserved transmembrane alpha-helical domain found in these proteins [, ]. The full-length AXL2 protein has a negative regulatory function in cytokinesis [].
Probab=93.27 E-value=0.066 Score=36.39 Aligned_cols=8 Identities=38% Similarity=0.783 Sum_probs=4.1
Q ss_pred EEEeeccC
Q 013272 430 YLIRRRRK 437 (446)
Q Consensus 430 ~~~~rr~~ 437 (446)
++++||+|
T Consensus 33 ~~~~rR~k 40 (40)
T PF08693_consen 33 FFWYRRKK 40 (40)
T ss_pred heEEeccC
Confidence 34555554
No 16
>PF14295 PAN_4: PAN domain; PDB: 2YIL_E 2YIP_C 2YIO_A.
Probab=92.93 E-value=0.09 Score=37.16 Aligned_cols=33 Identities=24% Similarity=0.631 Sum_probs=17.9
Q ss_pred ccCCHHHHHHHhhcCCCeEeEEecC-----CCCCeEEc
Q 013272 338 VNVSKEECASMCTSDCKCVGVLYSS-----AELECFFY 370 (446)
Q Consensus 338 ~~~~~~~C~~~CL~nCsC~A~~y~~-----~~~~C~~~ 370 (446)
...+.++|.++|..+=.|.++.|.. ..+.|+++
T Consensus 14 ~~~s~~~C~~~C~~~~~C~~~~~~~~~~~~~~~~C~LK 51 (51)
T PF14295_consen 14 TASSPEECQAACAADPGCQAFTFNPPGCPSSSGRCYLK 51 (51)
T ss_dssp ----HHHHHHHHHTSTT--EEEEETTEE----------
T ss_pred cCCCHHHHHHHccCCCCCCEEEEECCCcccccccccCC
Confidence 4568999999999999999999975 34668764
No 17
>PF08277 PAN_3: PAN-like domain; InterPro: IPR006583 PAN domains have significant functional versatility fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. These domains contain a hair-pin loop like structure, similar to knottins, but the pattern of disulphide bonds differs. The PAN-3 or CW is a domain associated with a number of Caenorhabditis elegans hypothetical proteins.
Probab=92.67 E-value=0.66 Score=35.34 Aligned_cols=43 Identities=26% Similarity=0.665 Sum_probs=33.1
Q ss_pred cccCCHHHHHHHhhcCCCeEeEEecCCCCCeEEc--ceecceEEecc
Q 013272 337 MVNVSKEECASMCTSDCKCVGVLYSSAELECFFY--GVVMGVKQVEK 381 (446)
Q Consensus 337 ~~~~~~~~C~~~CL~nCsC~A~~y~~~~~~C~~~--~~l~~~~~~~~ 381 (446)
....+.++|-+.|..+=+|.++.+. .+.|.++ +.+..+++...
T Consensus 17 ~~~~sw~~Cv~~C~~~~~C~la~~~--~~~C~~y~~~~i~~v~~~~~ 61 (71)
T PF08277_consen 17 TTNTSWDDCVQKCYNDENCVLAYFD--SGKCYLYNYGSISTVQKTDS 61 (71)
T ss_pred ccCCCHHHHhHHhCCCCEEEEEEeC--CCCEEEEEcCCEEEEEEeec
Confidence 3456889999999999999998886 5789986 45555555443
No 18
>smart00605 CW CW domain.
Probab=92.65 E-value=0.97 Score=36.72 Aligned_cols=55 Identities=31% Similarity=0.579 Sum_probs=39.0
Q ss_pred ccCCHHHHHHHhhcCCCeEeEEecCCCCCeEEc--ceecceEEeccCCCeEEEEEEeC
Q 013272 338 VNVSKEECASMCTSDCKCVGVLYSSAELECFFY--GVVMGVKQVEKRSGLIYMVKVAK 393 (446)
Q Consensus 338 ~~~~~~~C~~~CL~nCsC~A~~y~~~~~~C~~~--~~l~~~~~~~~~~~~~~~iKv~~ 393 (446)
...+.++|...|..+..|..+.... ...|.++ +.++.+++.....+..+=||+..
T Consensus 20 ~~~sw~~Ci~~C~~~~~Cvlay~~~-~~~C~~f~~~~~~~v~~~~~~~~~~VAfK~~~ 76 (94)
T smart00605 20 ATLSWDECIQKCYEDSNCVLAYGNS-SETCYLFSYGTVLTVKKLSSSSGKKVAFKVST 76 (94)
T ss_pred cCCCHHHHHHHHhCCCceEEEecCC-CCceEEEEcCCeEEEEEccCCCCcEEEEEEeC
Confidence 4568899999999999999876542 4789885 55666776654333334468764
No 19
>PF04478 Mid2: Mid2 like cell wall stress sensor; InterPro: IPR007567 This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway [].
Probab=91.57 E-value=0.26 Score=43.62 Aligned_cols=12 Identities=33% Similarity=0.634 Sum_probs=5.0
Q ss_pred heeeEEEeeccC
Q 013272 426 GLAYYLIRRRRK 437 (446)
Q Consensus 426 ~~~~~~~~rr~~ 437 (446)
+++|++++|+||
T Consensus 68 ~lvf~~c~r~kk 79 (154)
T PF04478_consen 68 ALVFIFCIRRKK 79 (154)
T ss_pred HhheeEEEeccc
Confidence 334444444443
No 20
>PF15102 TMEM154: TMEM154 protein family
Probab=84.79 E-value=1.3 Score=38.97 Aligned_cols=8 Identities=38% Similarity=0.804 Sum_probs=3.2
Q ss_pred EEEEehhh
Q 013272 408 VLILVGVV 415 (446)
Q Consensus 408 i~i~~~~~ 415 (446)
++|++.++
T Consensus 59 LmIlIP~V 66 (146)
T PF15102_consen 59 LMILIPLV 66 (146)
T ss_pred EEEeHHHH
Confidence 33444433
No 21
>cd01099 PAN_AP_HGF Subfamily of PAN/APPLE-like domains; present in N-terminal (N) domains of plasminogen/hepatocyte growth factor proteins, and various proteins found in Bilateria, such as leech anti-platelet proteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions.
Probab=83.75 E-value=2.2 Score=33.53 Aligned_cols=32 Identities=28% Similarity=0.563 Sum_probs=28.3
Q ss_pred cCCHHHHHHHhhc--CCCeEeEEecCCCCCeEEc
Q 013272 339 NVSKEECASMCTS--DCKCVGVLYSSAELECFFY 370 (446)
Q Consensus 339 ~~~~~~C~~~CL~--nCsC~A~~y~~~~~~C~~~ 370 (446)
..++++|.+.|++ +=.|.++.|......|.+-
T Consensus 24 ~~s~~~C~~~C~~~~~f~CrSf~y~~~~~~C~L~ 57 (80)
T cd01099 24 VASLEECLRKCLEETEFTCRSFNYNYKSKECILS 57 (80)
T ss_pred cCCHHHHHHHhCCCCCceEeEEEEEcCCCEEEEe
Confidence 4789999999999 8899999997767889985
No 22
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=75.79 E-value=2.3 Score=34.94 Aligned_cols=12 Identities=25% Similarity=0.650 Sum_probs=5.3
Q ss_pred hheeeEEEeecc
Q 013272 425 GGLAYYLIRRRR 436 (446)
Q Consensus 425 ~~~~~~~~~rr~ 436 (446)
+++.||+++|||
T Consensus 84 ~~l~w~f~~r~k 95 (96)
T PTZ00382 84 GFLCWWFVCRGK 95 (96)
T ss_pred HHHhheeEEeec
Confidence 333444455443
No 23
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane []. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=73.59 E-value=0.46 Score=40.75 Aligned_cols=29 Identities=45% Similarity=0.722 Sum_probs=13.0
Q ss_pred EEEEehhhHHHHHHHHHhheeeEEEeeccCCc
Q 013272 408 VLILVGVVDGLIIVLVFGGLAYYLIRRRRKKS 439 (446)
Q Consensus 408 i~i~~~~~~~~~~l~~~~~~~~~~~~rr~~~~ 439 (446)
++|+++++.|++++++ ++. |+++|+||+.
T Consensus 67 ~~Ii~gv~aGvIg~Il--li~-y~irR~~Kk~ 95 (122)
T PF01102_consen 67 IGIIFGVMAGVIGIIL--LIS-YCIRRLRKKS 95 (122)
T ss_dssp HHHHHHHHHHHHHHHH--HHH-HHHHHHS---
T ss_pred eehhHHHHHHHHHHHH--HHH-HHHHHHhccC
Confidence 4466666666644332 223 3455555554
No 24
>PF13360 PQQ_2: PQQ-like domain; PDB: 3HXJ_B 1YIQ_A 1KV9_A 3Q54_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A ....
Probab=73.15 E-value=12 Score=34.66 Aligned_cols=74 Identities=18% Similarity=0.309 Sum_probs=44.2
Q ss_pred CEEEEcCC----CCCCc--ccCCceEEEE-ecCCcEEEEcC-CCceEEeecCCCCc--c-----ceEEE-ecCCCeEEEe
Q 013272 68 DVKVWNSG----HYSRF--YVSEKCVLEL-TKDGDLRLKGP-NDRVGWLSGTSRQG--V-----ERLQI-LRTGNLVLVD 131 (446)
Q Consensus 68 ~~vVW~AN----rd~P~--~~~~~~~L~l-t~dG~LvL~d~-~g~~vWst~~~~~~--~-----~~a~L-lDsGNLVL~d 131 (446)
+..+|..+ .+.++ ....+..|-+ +.+|.|+..|. +|.++|+....... . ..+.+ ..+|-|+..|
T Consensus 13 G~~~W~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~~d~~tG~~~W~~~~~~~~~~~~~~~~~~v~v~~~~~~l~~~d 92 (238)
T PF13360_consen 13 GKELWSYDLGPGIGGPVATAVPDGGRVYVASGDGNLYALDAKTGKVLWRFDLPGPISGAPVVDGGRVYVGTSDGSLYALD 92 (238)
T ss_dssp TEEEEEEECSSSCSSEEETEEEETTEEEEEETTSEEEEEETTTSEEEEEEECSSCGGSGEEEETTEEEEEETTSEEEEEE
T ss_pred CCEEEEEECCCCCCCccceEEEeCCEEEEEcCCCEEEEEECCCCCEEEEeeccccccceeeecccccccccceeeeEecc
Confidence 46788764 23333 2212334444 58899999996 89999998753221 1 11222 2344566677
Q ss_pred -ecCeEEEEee
Q 013272 132 -VVNRVKWQSF 141 (446)
Q Consensus 132 -~~~~~lWQSF 141 (446)
.+++++|+..
T Consensus 93 ~~tG~~~W~~~ 103 (238)
T PF13360_consen 93 AKTGKVLWSIY 103 (238)
T ss_dssp TTTSCEEEEEE
T ss_pred cCCcceeeeec
Confidence 5688999953
No 25
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=69.07 E-value=3.9 Score=29.20 Aligned_cols=33 Identities=15% Similarity=0.423 Sum_probs=27.3
Q ss_pred eeccCCCCCCCCCCCCCcCCCCCceeecCCCCCC
Q 013272 264 QAINKTCDLPLGCKPCEICTFTNSCSCIGLLTKK 297 (446)
Q Consensus 264 ~~p~d~C~~~~~CG~~g~C~~~~~C~Cl~gf~~~ 297 (446)
..|.+.|....-|-.+++|. +..|.|++++...
T Consensus 16 ~~~g~~C~~~~qC~~~s~C~-~g~C~C~~g~~~~ 48 (52)
T PF01683_consen 16 VQPGESCESDEQCIGGSVCV-NGRCQCPPGYVEV 48 (52)
T ss_pred CCCCCCCCCcCCCCCcCEEc-CCEeECCCCCEec
Confidence 34667899999999999994 6899999987654
No 26
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown [, , , , , ] to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains []. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown [] that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site) []. A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes []. Consensus pattern: nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx, where 'n': negatively charged or polar residue [DEQN]; 'b': possibly beta-hydroxylated residue [DN]; 'a': aromatic amino acid; 'C': cysteine, involved in disulphide bond; 'x': any amino acid.; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=67.49 E-value=2.4 Score=28.94 Aligned_cols=29 Identities=28% Similarity=0.449 Sum_probs=22.4
Q ss_pred CCCCCC-CCCCCCCcCCCC---CceeecCCCCC
Q 013272 268 KTCDLP-LGCKPCEICTFT---NSCSCIGLLTK 296 (446)
Q Consensus 268 d~C~~~-~~CG~~g~C~~~---~~C~Cl~gf~~ 296 (446)
|+|... ..|..++.|... -.|.|++||..
T Consensus 3 dEC~~~~~~C~~~~~C~N~~Gsy~C~C~~Gy~~ 35 (42)
T PF07645_consen 3 DECAEGPHNCPENGTCVNTEGSYSCSCPPGYEL 35 (42)
T ss_dssp STTTTTSSSSSTTSEEEEETTEEEEEESTTEEE
T ss_pred cccCCCCCcCCCCCEEEcCCCCEEeeCCCCcEE
Confidence 678764 479999999753 27999999874
No 27
>PF15102 TMEM154: TMEM154 protein family
Probab=66.98 E-value=4.1 Score=35.93 Aligned_cols=12 Identities=17% Similarity=0.398 Sum_probs=5.0
Q ss_pred EEEehhhHHHHH
Q 013272 409 LILVGVVDGLII 420 (446)
Q Consensus 409 ~i~~~~~~~~~~ 420 (446)
.+++.++-.+++
T Consensus 57 fiLmIlIP~VLL 68 (146)
T PF15102_consen 57 FILMILIPLVLL 68 (146)
T ss_pred eEEEEeHHHHHH
Confidence 344444433444
No 28
>PF06024 DUF912: Nucleopolyhedrovirus protein of unknown function (DUF912); InterPro: IPR009261 This entry is represented by Autographa californica nuclear polyhedrosis virus (AcMNPV), Orf78; it is a family of uncharacterised viral proteins.
Probab=63.33 E-value=7.2 Score=32.30 Aligned_cols=11 Identities=45% Similarity=0.872 Sum_probs=5.1
Q ss_pred eeEEEeeccCC
Q 013272 428 AYYLIRRRRKK 438 (446)
Q Consensus 428 ~~~~~~rr~~~ 438 (446)
.|+++.|.|++
T Consensus 82 yYFVILRer~~ 92 (101)
T PF06024_consen 82 YYFVILRERQK 92 (101)
T ss_pred eEEEEEecccc
Confidence 34455554443
No 29
>PF13360 PQQ_2: PQQ-like domain; PDB: 3HXJ_B 1YIQ_A 1KV9_A 3Q54_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A ....
Probab=62.79 E-value=40 Score=31.19 Aligned_cols=73 Identities=22% Similarity=0.251 Sum_probs=44.8
Q ss_pred CEEEEcCCCCCCcc----cCCceEEEEecCCcEEEEc-CCCceEEee-cCC----C---Ccc-----c-eEEEecCCCeE
Q 013272 68 DVKVWNSGHYSRFY----VSEKCVLELTKDGDLRLKG-PNDRVGWLS-GTS----R---QGV-----E-RLQILRTGNLV 128 (446)
Q Consensus 68 ~~vVW~ANrd~P~~----~~~~~~L~lt~dG~LvL~d-~~g~~vWst-~~~----~---~~~-----~-~a~LlDsGNLV 128 (446)
++++|...-+.++. ...+..+..+.+|.|...| .+|.++|.. ... . ... . ......+|.|+
T Consensus 56 G~~~W~~~~~~~~~~~~~~~~~~v~v~~~~~~l~~~d~~tG~~~W~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~l~ 135 (238)
T PF13360_consen 56 GKVLWRFDLPGPISGAPVVDGGRVYVGTSDGSLYALDAKTGKVLWSIYLTSSPPAGVRSSSSPAVDGDRLYVGTSSGKLV 135 (238)
T ss_dssp SEEEEEEECSSCGGSGEEEETTEEEEEETTSEEEEEETTTSCEEEEEEE-SSCTCSTB--SEEEEETTEEEEEETCSEEE
T ss_pred CCEEEEeeccccccceeeecccccccccceeeeEecccCCcceeeeeccccccccccccccCceEecCEEEEEeccCcEE
Confidence 57899877554322 2233444556688898888 689999984 321 1 000 1 12233478888
Q ss_pred EEe-ecCeEEEEe
Q 013272 129 LVD-VVNRVKWQS 140 (446)
Q Consensus 129 L~d-~~~~~lWQS 140 (446)
..| .+++.+|+-
T Consensus 136 ~~d~~tG~~~w~~ 148 (238)
T PF13360_consen 136 ALDPKTGKLLWKY 148 (238)
T ss_dssp EEETTTTEEEEEE
T ss_pred EEecCCCcEEEEe
Confidence 888 457889964
No 30
>KOG0291 consensus WD40-repeat-containing subunit of the 18S rRNA processing complex [RNA processing and modification]
Probab=61.67 E-value=2.3e+02 Score=31.74 Aligned_cols=85 Identities=19% Similarity=0.344 Sum_probs=53.4
Q ss_pred ceEEEEecCCcEEEEcC-CCc-eEEeecCC---------CCccceEEEecCCCeEEEee-cCeE-EEEeeccCccccccc
Q 013272 85 KCVLELTKDGDLRLKGP-NDR-VGWLSGTS---------RQGVERLQILRTGNLVLVDV-VNRV-KWQSFNFPTDVMLWG 151 (446)
Q Consensus 85 ~~~L~lt~dG~LvL~d~-~g~-~vWst~~~---------~~~~~~a~LlDsGNLVL~d~-~~~~-lWQSFD~PTDTlLPG 151 (446)
-..|+.+.||.++.+.+ +|. -||.+..+ .++++..+..-+||.+|... +|++ .|.
T Consensus 353 i~~l~YSpDgq~iaTG~eDgKVKvWn~~SgfC~vTFteHts~Vt~v~f~~~g~~llssSLDGtVRAwD------------ 420 (893)
T KOG0291|consen 353 ITSLAYSPDGQLIATGAEDGKVKVWNTQSGFCFVTFTEHTSGVTAVQFTARGNVLLSSSLDGTVRAWD------------ 420 (893)
T ss_pred eeeEEECCCCcEEEeccCCCcEEEEeccCceEEEEeccCCCceEEEEEEecCCEEEEeecCCeEEeee------------
Confidence 46789999999998865 555 48987642 23467788999999999754 3443 562
Q ss_pred cccccCcEEEeCCCCCCcce-EEEEecceeEEE
Q 013272 152 QRLNVATRLTSFPGNSTEFY-SFEIQRYRIALF 183 (446)
Q Consensus 152 q~L~~~~~L~S~~~~s~G~y-~l~~~~~~~~l~ 183 (446)
|...+...+...|.+-.| .+.+|+.|....
T Consensus 421 --lkRYrNfRTft~P~p~QfscvavD~sGelV~ 451 (893)
T KOG0291|consen 421 --LKRYRNFRTFTSPEPIQFSCVAVDPSGELVC 451 (893)
T ss_pred --ecccceeeeecCCCceeeeEEEEcCCCCEEE
Confidence 222333333334555555 466676665443
No 31
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=61.07 E-value=8.1 Score=24.13 Aligned_cols=27 Identities=26% Similarity=0.460 Sum_probs=19.2
Q ss_pred CCCCCCCCCCCcCCCC---CceeecCCCCC
Q 013272 270 CDLPLGCKPCEICTFT---NSCSCIGLLTK 296 (446)
Q Consensus 270 C~~~~~CG~~g~C~~~---~~C~Cl~gf~~ 296 (446)
|.....|..++.|... ..|.|+++|..
T Consensus 2 C~~~~~C~~~~~C~~~~~~~~C~C~~g~~g 31 (36)
T cd00053 2 CAASNPCSNGGTCVNTPGSYRCVCPPGYTG 31 (36)
T ss_pred CCCCCCCCCCCEEecCCCCeEeECCCCCcc
Confidence 4435678888899753 47999988743
No 32
>PRK11138 outer membrane biogenesis protein BamB; Provisional
Probab=59.33 E-value=40 Score=34.48 Aligned_cols=24 Identities=13% Similarity=0.018 Sum_probs=13.2
Q ss_pred EEEecCCcEEEEcC-CCceEEeecC
Q 013272 88 LELTKDGDLRLKGP-NDRVGWLSGT 111 (446)
Q Consensus 88 L~lt~dG~LvL~d~-~g~~vWst~~ 111 (446)
+..+.+|.|..+|. +|+++|+.+.
T Consensus 339 ~v~~~~G~l~~ld~~tG~~~~~~~~ 363 (394)
T PRK11138 339 VVGDSEGYLHWINREDGRFVAQQKV 363 (394)
T ss_pred EEEeCCCEEEEEECCCCCEEEEEEc
Confidence 33445566665554 5666666543
No 33
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence about thirty to forty amino-acid residues long, found in the sequence of epidermal growth factor (EGF), has been shown [, , , , ] to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=59.14 E-value=8 Score=24.97 Aligned_cols=22 Identities=23% Similarity=0.388 Sum_probs=18.0
Q ss_pred CCCCCCCcCCCC-CceeecCCCC
Q 013272 274 LGCKPCEICTFT-NSCSCIGLLT 295 (446)
Q Consensus 274 ~~CG~~g~C~~~-~~C~Cl~gf~ 295 (446)
..|..+|.|... ..|.|.++|.
T Consensus 6 ~~C~~~G~C~~~~g~C~C~~g~~ 28 (32)
T PF07974_consen 6 NICSGHGTCVSPCGRCVCDSGYT 28 (32)
T ss_pred CccCCCCEEeCCCCEEECCCCCc
Confidence 479999999865 7999998764
No 34
>PF01034 Syndecan: Syndecan domain; InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions [, ]. Structurally, these proteins consist of four separate domains: A signal sequence; An extracellular domain (ectodomain) of variable length whose sequence is not evolutionarily conserved in the various forms of syndecans. The ectodomain contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; A transmembrane region; A highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: Syndecan 1. Syndecan 2 or fibroglycan. Syndecan 3 or neuroglycan or N-syndecan. Syndecan 4 or amphiglycan or ryudocan. Drosophila syndecan. Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, whose overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of the dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts strongly not only with fatty acyl groups but also with the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion [, ].; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=55.47 E-value=4.1 Score=30.73 Aligned_cols=8 Identities=75% Similarity=0.845 Sum_probs=0.4
Q ss_pred EEeeccCC
Q 013272 431 LIRRRRKK 438 (446)
Q Consensus 431 ~~~rr~~~ 438 (446)
+++|.||+
T Consensus 32 ~iyR~rkk 39 (64)
T PF01034_consen 32 LIYRMRKK 39 (64)
T ss_dssp -----S--
T ss_pred HHHHHHhc
Confidence 34444443
No 35
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=55.13 E-value=11 Score=24.41 Aligned_cols=28 Identities=25% Similarity=0.458 Sum_probs=20.3
Q ss_pred CCCCCCCCCCCCCcCCCC---CceeecCCCC
Q 013272 268 KTCDLPLGCKPCEICTFT---NSCSCIGLLT 295 (446)
Q Consensus 268 d~C~~~~~CG~~g~C~~~---~~C~Cl~gf~ 295 (446)
|+|.....|...+.|... -.|.|+++|.
T Consensus 3 ~~C~~~~~C~~~~~C~~~~g~~~C~C~~g~~ 33 (39)
T smart00179 3 DECASGNPCQNGGTCVNTVGSYRCECPPGYT 33 (39)
T ss_pred ccCcCCCCcCCCCEeECCCCCeEeECCCCCc
Confidence 567654678888899743 2699998875
No 36
>KOG4649 consensus PQQ (pyrrolo-quinoline quinone) repeat protein [Secondary metabolites biosynthesis, transport and catabolism]
Probab=54.45 E-value=37 Score=33.21 Aligned_cols=44 Identities=23% Similarity=0.410 Sum_probs=33.0
Q ss_pred EEEEcCCCCCCcccC----CceEEEEecCCcEEEEcCCCceEEeecCC
Q 013272 69 VKVWNSGHYSRFYVS----EKCVLELTKDGDLRLKGPNDRVGWLSGTS 112 (446)
Q Consensus 69 ~vVW~ANrd~P~~~~----~~~~L~lt~dG~LvL~d~~g~~vWst~~~ 112 (446)
++.|.|.|..|+-.+ ..++..=+-||+|.-.|+.|+.||...+.
T Consensus 170 ~~~w~~~~~~PiF~splcv~~sv~i~~VdG~l~~f~~sG~qvwr~~t~ 217 (354)
T KOG4649|consen 170 TEFWAATRFGPIFASPLCVGSSVIITTVDGVLTSFDESGRQVWRPATK 217 (354)
T ss_pred ceehhhhcCCccccCceeccceEEEEEeccEEEEEcCCCcEEEeecCC
Confidence 789999999987643 12344446799999899999999965543
No 37
>PF02439 Adeno_E3_CR2: Adenovirus E3 region protein CR2; InterPro: IPR003470 Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host []. This region called CR1 (conserved region 1) [] is found three times in Human adenovirus 19 (a subgroup D adenovirus) 49 kDa protein in the E3 region. CR1 is also found in the 20.1 Kd protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain.
Probab=53.47 E-value=1.3 Score=29.65 Aligned_cols=13 Identities=46% Similarity=0.746 Sum_probs=6.7
Q ss_pred EEEEehhhHHHHH
Q 013272 408 VLILVGVVDGLII 420 (446)
Q Consensus 408 i~i~~~~~~~~~~ 420 (446)
+.+++++++++++
T Consensus 6 IaIIv~V~vg~~i 18 (38)
T PF02439_consen 6 IAIIVAVVVGMAI 18 (38)
T ss_pred hhHHHHHHHHHHH
Confidence 3455566555543
No 38
>PF06365 CD34_antigen: CD34/Podocalyxin family; InterPro: IPR013836 This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation [, ]. This family contains several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighbouring foot processes in the glomerular epithelium by charge repulsion [].
Probab=52.68 E-value=7.2 Score=36.46 Aligned_cols=23 Identities=17% Similarity=0.345 Sum_probs=11.7
Q ss_pred HHHHHHHhheeeEEEeeccCCcc
Q 013272 418 LIIVLVFGGLAYYLIRRRRKKSL 440 (446)
Q Consensus 418 ~~~l~~~~~~~~~~~~rr~~~~~ 440 (446)
+++++++...+|+++.||..+.+
T Consensus 111 ~lLla~~~~~~Y~~~~Rrs~~~~ 133 (202)
T PF06365_consen 111 FLLLAILLGAGYCCHQRRSWSKK 133 (202)
T ss_pred HHHHHHHHHHHHHhhhhccCCcc
Confidence 33344334445666666654443
No 39
>PF01436 NHL: NHL repeat; InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C terminus alpha-amidation of biological peptides []. In many it occurs in tandem arrays, for example in the ringfinger beta-box, coiled-coil (RBCC) eukaryotic growth regulators []. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller [, ]. The NHL repeats are also found in serine/threonine protein kinases (STPK) in a diverse range of pathogenic bacteria. These STPK are transmembrane receptors with an intracellular N-terminal kinase domain and an extracellular C-terminal sensor domain. In the STPK, PknD, from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; GO: 0005515 protein binding; PDB: 3FVZ_A 3FW0_A 1RWL_A 1RWI_A 1Q7F_A.
Probab=51.12 E-value=24 Score=21.67 Aligned_cols=20 Identities=10% Similarity=0.257 Sum_probs=14.4
Q ss_pred EEEecCCcEEEEcCCCceEE
Q 013272 88 LELTKDGDLRLKGPNDRVGW 107 (446)
Q Consensus 88 L~lt~dG~LvL~d~~g~~vW 107 (446)
|.++++|+|++.|..+.-||
T Consensus 7 vav~~~g~i~VaD~~n~rV~ 26 (28)
T PF01436_consen 7 VAVDSDGNIYVADSGNHRVQ 26 (28)
T ss_dssp EEEETTSEEEEEECCCTEEE
T ss_pred EEEeCCCCEEEEECCCCEEE
Confidence 56778888888887655554
No 40
>PF01034 Syndecan: Syndecan domain; InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions [, ]. Structurally, these proteins consist of four separate domains: A signal sequence; An extracellular domain (ectodomain) of variable length whose sequence is not evolutionarily conserved in the various forms of syndecans. The ectodomain contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; A transmembrane region; A highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: Syndecan 1. Syndecan 2 or fibroglycan. Syndecan 3 or neuroglycan or N-syndecan. Syndecan 4 or amphiglycan or ryudocan. Drosophila syndecan. Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, whose overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of the dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts strongly not only with fatty acyl groups but also with the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion [, ].; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=49.43 E-value=5.1 Score=30.22 Aligned_cols=28 Identities=21% Similarity=0.463 Sum_probs=0.7
Q ss_pred EEEehhhHHHHHHHHHhheeeEEEeecc
Q 013272 409 LILVGVVDGLIIVLVFGGLAYYLIRRRR 436 (446)
Q Consensus 409 ~i~~~~~~~~~~l~~~~~~~~~~~~rr~ 436 (446)
.+++++++++++++++.+++.+.++||.
T Consensus 13 avIaG~Vvgll~ailLIlf~iyR~rkkd 40 (64)
T PF01034_consen 13 AVIAGGVVGLLFAILLILFLIYRMRKKD 40 (64)
T ss_dssp ------------------------S---
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHhcC
Confidence 3455544454443333344445567764
No 41
>PF02009 Rifin_STEVOR: Rifin/stevor family; InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cell erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see []. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens [].
Probab=49.28 E-value=2 Score=42.70 Aligned_cols=12 Identities=50% Similarity=0.664 Sum_probs=6.1
Q ss_pred eeE-EEeeccCCc
Q 013272 428 AYY-LIRRRRKKS 439 (446)
Q Consensus 428 ~~~-~~~rr~~~~ 439 (446)
+|+ +.|||+|++
T Consensus 275 IYLILRYRRKKKm 287 (299)
T PF02009_consen 275 IYLILRYRRKKKM 287 (299)
T ss_pred HHHHHHHHHHhhh
Confidence 444 446665554
No 42
>PF12877 DUF3827: Domain of unknown function (DUF3827); InterPro: IPR024606 The function of the proteins in this entry is not currently known, but one of the human proteins (Q9HCM3 from SWISSPROT) has been implicated in pilocytic astrocytomas [, , ]. In the majority of cases of pilocytic astrocytomas a tandem duplication produces an in-frame fusion of the gene encoding this protein and the BRAF oncogene. The resulting fusion protein has constitutive BRAF kinase activity and is capable of transforming cells.
Probab=48.96 E-value=6.9 Score=42.44 Aligned_cols=31 Identities=26% Similarity=0.485 Sum_probs=16.9
Q ss_pred EEEEehhhHHHHHHHHHhheeeEEEeeccCC
Q 013272 408 VLILVGVVDGLIIVLVFGGLAYYLIRRRRKK 438 (446)
Q Consensus 408 i~i~~~~~~~~~~l~~~~~~~~~~~~rr~~~ 438 (446)
++|+++|++.++++++++++++|..+|++|.
T Consensus 269 lWII~gVlvPv~vV~~Iiiil~~~LCRk~K~ 299 (684)
T PF12877_consen 269 LWIIAGVLVPVLVVLLIIIILYWKLCRKNKL 299 (684)
T ss_pred eEEEehHhHHHHHHHHHHHHHHHHHhccccc
Confidence 4455565555555555555556655666543
No 43
>TIGR01478 STEVOR variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits.
Probab=47.12 E-value=4.2 Score=39.71 Aligned_cols=7 Identities=43% Similarity=1.236 Sum_probs=5.0
Q ss_pred CceeecC
Q 013272 286 NSCSCIG 292 (446)
Q Consensus 286 ~~C~Cl~ 292 (446)
..|+|-.
T Consensus 143 s~cectd 149 (295)
T TIGR01478 143 KSCECTN 149 (295)
T ss_pred Cceeeec
Confidence 5688865
No 44
>PTZ00370 STEVOR; Provisional
Probab=45.52 E-value=4.6 Score=39.55 Aligned_cols=7 Identities=29% Similarity=1.194 Sum_probs=4.9
Q ss_pred CceeecC
Q 013272 286 NSCSCIG 292 (446)
Q Consensus 286 ~~C~Cl~ 292 (446)
..|.|-+
T Consensus 143 s~cectd 149 (296)
T PTZ00370 143 STCECTD 149 (296)
T ss_pred Cceeeee
Confidence 4788865
No 45
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=45.30 E-value=19 Score=22.79 Aligned_cols=28 Identities=29% Similarity=0.482 Sum_probs=19.4
Q ss_pred CCCCCCCCCCCCCcCCCC---CceeecCCCC
Q 013272 268 KTCDLPLGCKPCEICTFT---NSCSCIGLLT 295 (446)
Q Consensus 268 d~C~~~~~CG~~g~C~~~---~~C~Cl~gf~ 295 (446)
++|.....|...+.|... ..|.|+++|.
T Consensus 3 ~~C~~~~~C~~~~~C~~~~~~~~C~C~~g~~ 33 (38)
T cd00054 3 DECASGNPCQNGGTCVNTVGSYRCSCPPGYT 33 (38)
T ss_pred ccCCCCCCcCCCCEeECCCCCeEeECCCCCc
Confidence 467654568778889743 2699998764
No 46
>TIGR03300 assembly_YfgL outer membrane assembly lipoprotein YfgL. Members of this protein family are YfgL, a lipoprotein component of a complex that acts in protein insertion into the bacterial outer membrane. Other members of this complex are NlpB, YfiO, and YaeT. This protein contains multiple copies of a repeat that, in other contexts, is associated with binding of the coenzyme PQQ.
Probab=44.25 E-value=85 Score=31.66 Aligned_cols=54 Identities=11% Similarity=0.142 Sum_probs=33.0
Q ss_pred eEEEE-ecCCcEEEEc-CCCceEEeecCCCCc--c------ceEEEecCCCeEEEee-cCeEEEE
Q 013272 86 CVLEL-TKDGDLRLKG-PNDRVGWLSGTSRQG--V------ERLQILRTGNLVLVDV-VNRVKWQ 139 (446)
Q Consensus 86 ~~L~l-t~dG~LvL~d-~~g~~vWst~~~~~~--~------~~a~LlDsGNLVL~d~-~~~~lWQ 139 (446)
..|-+ +.+|.|.-.| .+|.++|..+..... . ....-..+|+|+-+|. +++++|+
T Consensus 66 ~~v~v~~~~g~v~a~d~~tG~~~W~~~~~~~~~~~p~v~~~~v~v~~~~g~l~ald~~tG~~~W~ 130 (377)
T TIGR03300 66 GKVYAADADGTVVALDAETGKRLWRVDLDERLSGGVGADGGLVFVGTEKGEVIALDAEDGKELWR 130 (377)
T ss_pred CEEEEECCCCeEEEEEccCCcEeeeecCCCCcccceEEcCCEEEEEcCCCEEEEEECCCCcEeee
Confidence 34444 4568888888 578999987643211 1 1111234677777775 5788996
No 47
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=44.14 E-value=19 Score=44.79 Aligned_cols=24 Identities=17% Similarity=0.441 Sum_probs=17.4
Q ss_pred CCCCCCCcCCCCC----ceeecCCCCCC
Q 013272 274 LGCKPCEICTFTN----SCSCIGLLTKK 297 (446)
Q Consensus 274 ~~CG~~g~C~~~~----~C~Cl~gf~~~ 297 (446)
..|---|.|+..+ .|.||+-|.-+
T Consensus 3870 npCqhgG~C~~~~~ggy~CkCpsqysG~ 3897 (4289)
T KOG1219|consen 3870 NPCQHGGTCISQPKGGYKCKCPSQYSGN 3897 (4289)
T ss_pred CcccCCCEecCCCCCceEEeCcccccCc
Confidence 5677778888643 79999866544
No 48
>PF14610 DUF4448: Protein of unknown function (DUF4448)
Probab=43.26 E-value=26 Score=32.24 Aligned_cols=28 Identities=14% Similarity=0.427 Sum_probs=12.2
Q ss_pred EEEEehhhHHHHHHHHHhheeeEEEeeccCC
Q 013272 408 VLILVGVVDGLIIVLVFGGLAYYLIRRRRKK 438 (446)
Q Consensus 408 i~i~~~~~~~~~~l~~~~~~~~~~~~rr~~~ 438 (446)
++|++.+++++++++ +++++++.||+|+
T Consensus 160 laI~lPvvv~~~~~~---~~~~~~~~R~~Rr 187 (189)
T PF14610_consen 160 LAIALPVVVVVLALI---MYGFFFWNRKKRR 187 (189)
T ss_pred EEEEccHHHHHHHHH---HHhhheeecccee
Confidence 445555554443322 2233444555544
No 49
>PTZ00208 65 kDa invariant surface glycoprotein; Provisional
Probab=42.52 E-value=12 Score=38.31 Aligned_cols=30 Identities=27% Similarity=0.503 Sum_probs=15.3
Q ss_pred EEehhhHHHHHHHHHhheeeEEEeeccCCc
Q 013272 410 ILVGVVDGLIIVLVFGGLAYYLIRRRRKKS 439 (446)
Q Consensus 410 i~~~~~~~~~~l~~~~~~~~~~~~rr~~~~ 439 (446)
|+++|++.+++|++++++.|++.+|||.+.
T Consensus 388 i~~avl~p~~il~~~~~~~~~~v~rrr~~~ 417 (436)
T PTZ00208 388 IILAVLVPAIILAIIAVAFFIMVKRRRNSS 417 (436)
T ss_pred HHHHHHHHHHHHHHHHHHhheeeeeccCCc
Confidence 445555555555555444444556666543
No 50
>PRK11138 outer membrane biogenesis protein BamB; Provisional
Probab=41.70 E-value=78 Score=32.34 Aligned_cols=61 Identities=13% Similarity=0.182 Sum_probs=35.2
Q ss_pred CCcccCCceEEEEecCCcEEEEcC-CCceEEeecCCCCcc------ceEEEecCCCeEEEeec-CeEEEE
Q 013272 78 SRFYVSEKCVLELTKDGDLRLKGP-NDRVGWLSGTSRQGV------ERLQILRTGNLVLVDVV-NRVKWQ 139 (446)
Q Consensus 78 ~P~~~~~~~~L~lt~dG~LvL~d~-~g~~vWst~~~~~~~------~~a~LlDsGNLVL~d~~-~~~lWQ 139 (446)
.|+.. .+-....+.+|.|+-+|. +|+++|......... ..-...++|.++..|.. ++.+|+
T Consensus 251 sP~v~-~~~vy~~~~~g~l~ald~~tG~~~W~~~~~~~~~~~~~~~~vy~~~~~g~l~ald~~tG~~~W~ 319 (394)
T PRK11138 251 TPVVV-GGVVYALAYNGNLVALDLRSGQIVWKREYGSVNDFAVDGGRIYLVDQNDRVYALDTRGGVELWS 319 (394)
T ss_pred CcEEE-CCEEEEEEcCCeEEEEECCCCCEEEeecCCCccCcEEECCEEEEEcCCCeEEEEECCCCcEEEc
Confidence 45542 233334466788887775 678899765432110 11123456777777754 578895
No 51
>PTZ00046 rifin; Provisional
Probab=41.65 E-value=5.4 Score=40.48 Aligned_cols=13 Identities=46% Similarity=0.560 Sum_probs=8.0
Q ss_pred eeE-EEeeccCCcc
Q 013272 428 AYY-LIRRRRKKSL 440 (446)
Q Consensus 428 ~~~-~~~rr~~~~~ 440 (446)
+|+ ++|||+|+++
T Consensus 334 IYLILRYRRKKKMk 347 (358)
T PTZ00046 334 IYLILRYRRKKKMK 347 (358)
T ss_pred HHHHHHhhhcchhH
Confidence 444 5677777654
No 52
>PF13908 Shisa: Wnt and FGF inhibitory regulator
Probab=41.45 E-value=23 Score=32.26 Aligned_cols=13 Identities=46% Similarity=0.700 Sum_probs=6.2
Q ss_pred EEEEehhhHHHHH
Q 013272 408 VLILVGVVDGLII 420 (446)
Q Consensus 408 i~i~~~~~~~~~~ 420 (446)
++|++++++++++
T Consensus 78 ~~iivgvi~~Vi~ 90 (179)
T PF13908_consen 78 TGIIVGVICGVIA 90 (179)
T ss_pred eeeeeehhhHHHH
Confidence 3455555544433
No 53
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=40.88 E-value=21 Score=21.54 Aligned_cols=11 Identities=27% Similarity=0.489 Sum_probs=9.2
Q ss_pred ceeecCCCCCC
Q 013272 287 SCSCIGLLTKK 297 (446)
Q Consensus 287 ~C~Cl~gf~~~ 297 (446)
.|+|++||...
T Consensus 3 ~C~C~~Gy~l~ 13 (24)
T PF12662_consen 3 TCSCPPGYQLS 13 (24)
T ss_pred EeeCCCCCcCC
Confidence 59999999765
No 54
>PF01299 Lamp: Lysosome-associated membrane glycoprotein (Lamp); InterPro: IPR002000 Lysosome-associated membrane glycoproteins (lamp) are integral membrane proteins, specific to lysosomes, whose exact biological function is not yet clear. Structurally, the lamp proteins consist of two internally homologous lysosome-luminal domains separated by a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region (TM) followed by a very short cytoplasmic tail (C). In each of the duplicated domains, there are two conserved disulphide bonds. Schematically: [luminal domain, two disulphide bonds]-[Hinge]-[luminal domain, two disulphide bonds]-[TM]-[C]. In mammals, there are two closely related types of lamp: lamp-1 and lamp-2, which form major components of the lysosome membrane. In chicken, lamp-1 is known as LEP100. Also included in this entry is the macrophage protein CD68 (or macrosialin), a heavily glycosylated integral membrane protein whose structure consists of a mucin-like domain followed by a proline-rich hinge, a single lamp-like domain, a transmembrane region and a short cytoplasmic tail. Similar to CD68, mammalian lamp-3, which is expressed in lymphoid organs, dendritic cells and in lung, contains all the C-terminal regions but lacks the N-terminal lamp-like region. In a lamp-family protein from nematodes only the part C-terminal to the hinge is conserved.; GO: 0016020 membrane
Probab=40.65 E-value=14 Score=36.89 Aligned_cols=28 Identities=36% Similarity=0.500 Sum_probs=11.8
Q ss_pred EEEEehhhHHHHHHHHHhheeeEEEeeccCC
Q 013272 408 VLILVGVVDGLIIVLVFGGLAYYLIRRRRKK 438 (446)
Q Consensus 408 i~i~~~~~~~~~~l~~~~~~~~~~~~rr~~~ 438 (446)
+.|++|++++.++++ ++++|+ |.|||.+
T Consensus 273 vPIaVG~~La~lvli--vLiaYl-i~Rrr~~ 300 (306)
T PF01299_consen 273 VPIAVGAALAGLVLI--VLIAYL-IGRRRSR 300 (306)
T ss_pred HHHHHHHHHHHHHHH--HHHhhe-eEecccc
Confidence 345555444333222 333454 4444433
No 55
>TIGR01477 RIFIN variant surface antigen, rifin family. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits.
Probab=39.85 E-value=4.8 Score=40.70 Aligned_cols=13 Identities=46% Similarity=0.560 Sum_probs=8.0
Q ss_pred eeE-EEeeccCCcc
Q 013272 428 AYY-LIRRRRKKSL 440 (446)
Q Consensus 428 ~~~-~~~rr~~~~~ 440 (446)
+|+ +.|||+|+++
T Consensus 329 IYLILRYRRKKKMk 342 (353)
T TIGR01477 329 IYLILRYRRKKKMK 342 (353)
T ss_pred HHHHHHhhhcchhH
Confidence 444 5677777654
No 56
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=39.73 E-value=11 Score=19.18 Aligned_cols=9 Identities=33% Similarity=0.663 Sum_probs=6.0
Q ss_pred ceeecCCCC
Q 013272 287 SCSCIGLLT 295 (446)
Q Consensus 287 ~C~Cl~gf~ 295 (446)
.|.|++||.
T Consensus 1 ~C~C~~G~~ 9 (13)
T PF12661_consen 1 TCQCPPGWT 9 (13)
T ss_dssp EEEE-TTEE
T ss_pred CccCcCCCc
Confidence 489998864
No 57
>TIGR03066 Gem_osc_para_1 Gemmata obscuriglobus paralogous family TIGR03066. This model represents an uncharacterized paralogous family in Gemmata obscuriglobus UQM 2246, a member of the Planctomycetes. This family shows sequence similarity to TIGR03067, which is also found in Gemmata obscuriglobus as well as in a few other species.
Probab=39.48 E-value=67 Score=27.12 Aligned_cols=52 Identities=21% Similarity=0.250 Sum_probs=31.8
Q ss_pred CceEEEEecCCcEEEEcCCCce------EEeec---------CCCCc-cc--eEEEecCCCeEEEeecCe
Q 013272 84 EKCVLELTKDGDLRLKGPNDRV------GWLSG---------TSRQG-VE--RLQILRTGNLVLVDVVNR 135 (446)
Q Consensus 84 ~~~~L~lt~dG~LvL~d~~g~~------vWst~---------~~~~~-~~--~a~LlDsGNLVL~d~~~~ 135 (446)
+.+.|+|..||.|+|+.+++.- -|+-. ..++. .. ...=++.|-|||.|++++
T Consensus 34 ~~~~leF~~dGKL~v~~gnng~~~~~~Gty~L~G~kLtL~~~p~g~t~k~~Vtv~~l~~~~Lvl~d~dg~ 103 (111)
T TIGR03066 34 DDVVIEFAKDGKLVVTIGEKGKEVKADGTYKLDGNKLTLTLKAGGKEKKETLTVKKLTDDELVGKDPDGK 103 (111)
T ss_pred CceEEEEcCCCeEEEecCCCCcEeccCceEEEECCEEEEEEcCCCccccceEEEEEecCCeEEEEcCCCC
Confidence 4578999999999988765321 14321 11110 11 112368899999998875
No 58
>PF14269 Arylsulfotran_2: Arylsulfotransferase (ASST)
Probab=39.34 E-value=2.6e+02 Score=27.77 Aligned_cols=46 Identities=20% Similarity=0.371 Sum_probs=29.4
Q ss_pred EEEecCCcEEE----------Ec-CCCceEEeecCC-C----------CccceEEEe----cCCCeEEEeec
Q 013272 88 LELTKDGDLRL----------KG-PNDRVGWLSGTS-R----------QGVERLQIL----RTGNLVLVDVV 133 (446)
Q Consensus 88 L~lt~dG~LvL----------~d-~~g~~vWst~~~-~----------~~~~~a~Ll----DsGNLVL~d~~ 133 (446)
+....+|+++| .| .+|.++|.-... + ...--|.++ +.|++-|+|+.
T Consensus 149 V~~~~~G~yLiS~R~~~~i~~I~~~tG~I~W~lgG~~~~df~~~~~~f~~QHdar~~~~~~~~~~IslFDN~ 220 (299)
T PF14269_consen 149 VDKDDDGDYLISSRNTSTIYKIDPSTGKIIWRLGGKRNSDFTLPATNFSWQHDARFLNESNDDGTISLFDNA 220 (299)
T ss_pred eeecCCccEEEEecccCEEEEEECCCCcEEEEeCCCCCCcccccCCcEeeccCCEEeccCCCCCEEEEEcCC
Confidence 46667777654 35 468899986533 1 011246778 88888888873
No 59
>PF12768 Rax2: Cortical protein marker for cell polarity
Probab=38.52 E-value=35 Score=33.66 Aligned_cols=33 Identities=33% Similarity=0.534 Sum_probs=15.9
Q ss_pred EEEEehhhHHHHHHHHHhheee-EEEeeccCCcc
Q 013272 408 VLILVGVVDGLIIVLVFGGLAY-YLIRRRRKKSL 440 (446)
Q Consensus 408 i~i~~~~~~~~~~l~~~~~~~~-~~~~rr~~~~~ 440 (446)
++|.+++.+++++|++++-+++ ++.+||++..+
T Consensus 230 VlIslAiALG~v~ll~l~Gii~~~~~r~~~~~~~ 263 (281)
T PF12768_consen 230 VLISLAIALGTVFLLVLIGIILAYIRRRRQGYVP 263 (281)
T ss_pred EEEehHHHHHHHHHHHHHHHHHHHHHhhhccCcC
Confidence 4555556666655554433333 33344444444
No 60
>PF03302 VSP: Giardia variant-specific surface protein; InterPro: IPR005127 During infection, the intestinal protozoan parasite Giardia lamblia undergoes continuous antigenic variation, which is determined by diversification of the parasite's major surface antigen, named VSP (variant surface protein).
Probab=38.46 E-value=19 Score=37.35 Aligned_cols=16 Identities=25% Similarity=0.387 Sum_probs=9.8
Q ss_pred HHHHhheeeEEEeecc
Q 013272 421 VLVFGGLAYYLIRRRR 436 (446)
Q Consensus 421 l~~~~~~~~~~~~rr~ 436 (446)
-.|++|+-||++.|.|
T Consensus 381 gglvGfLcWwf~crgk 396 (397)
T PF03302_consen 381 GGLVGFLCWWFICRGK 396 (397)
T ss_pred HHHHHHHhhheeeccc
Confidence 3445666677776655
No 61
>smart00564 PQQ beta-propeller repeat. Beta-propeller repeat occurring in enzymes with pyrrolo-quinoline quinone (PQQ) as cofactor, in Ire1p-like Ser/Thr kinases, and in prokaryotic dehydrogenases.
Probab=38.07 E-value=59 Score=20.07 Aligned_cols=18 Identities=22% Similarity=0.460 Sum_probs=11.2
Q ss_pred ecCCcEEEEcC-CCceEEe
Q 013272 91 TKDGDLRLKGP-NDRVGWL 108 (446)
Q Consensus 91 t~dG~LvL~d~-~g~~vWs 108 (446)
+.+|.|+-.|. +|.++|.
T Consensus 13 ~~~g~l~a~d~~~G~~~W~ 31 (33)
T smart00564 13 STDGTLYALDAKTGEILWT 31 (33)
T ss_pred cCCCEEEEEEcccCcEEEE
Confidence 44566666665 5677775
No 62
>PF02480 Herpes_gE: Alphaherpesvirus glycoprotein E; InterPro: IPR003404 Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation.; GO: 0016020 membrane; PDB: 2GJ7_F 2GIY_B.
Probab=35.15 E-value=13 Score=39.21 Aligned_cols=27 Identities=11% Similarity=0.162 Sum_probs=10.9
Q ss_pred CCCCCCceeeeccCCcceeeEEecCeeec
Q 013272 302 SDCGCGEIAVGLCGRNRVEMLELEGVGSV 330 (446)
Q Consensus 302 ~~C~~~~~~~~~C~~~~~~f~~l~~~~~~ 330 (446)
.+|++. ..+..|.. ...+.+.+++++.
T Consensus 239 ~~C~~~-~~~~~C~~-~~~~~~~~~~~~~ 265 (439)
T PF02480_consen 239 ANCSPS-GWPRRCPS-TSHIEPVPGLRWA 265 (439)
T ss_dssp EEEBTT-C-TTTTEE-EEEE---TTEEE-
T ss_pred cCCCCC-CCcCCCCc-hhccCcCcccccc
Confidence 567764 13345742 2334445555543
No 63
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=34.69 E-value=9.8 Score=25.23 Aligned_cols=23 Identities=35% Similarity=0.702 Sum_probs=15.0
Q ss_pred CCCCCCCCcCCCC---CceeecCCCC
Q 013272 273 PLGCKPCEICTFT---NSCSCIGLLT 295 (446)
Q Consensus 273 ~~~CG~~g~C~~~---~~C~Cl~gf~ 295 (446)
.+-|.++..|... -.|.|.+||.
T Consensus 5 ~~~C~~nA~C~~~~~~~~C~C~~Gy~ 30 (36)
T PF12947_consen 5 NGGCHPNATCTNTGGSYTCTCKPGYE 30 (36)
T ss_dssp GGGS-TTCEEEE-TTSEEEEE-CEEE
T ss_pred CCCCCCCcEeecCCCCEEeECCCCCc
Confidence 3568889999753 2799998764
No 64
>PF05935 Arylsulfotrans: Arylsulfotransferase (ASST); InterPro: IPR010262 This family consists of several bacterial arylsulphotransferase proteins. Arylsulphotransferase (ASST) transfers a sulphate group from phenolic sulphate esters to a phenolic acceptor substrate.; PDB: 3ETT_B 3ELQ_A 3ETS_A.
Probab=33.41 E-value=1.7e+02 Score=31.12 Aligned_cols=62 Identities=16% Similarity=0.110 Sum_probs=29.5
Q ss_pred CCEEEEcCCCCCCcccCCceEEEEecCCcE--------EEEcCCCceEEeecCCCCc---cceEEEecCCCeEEEee
Q 013272 67 GDVKVWNSGHYSRFYVSEKCVLELTKDGDL--------RLKGPNDRVGWLSGTSRQG---VERLQILRTGNLVLVDV 132 (446)
Q Consensus 67 ~~~vVW~ANrd~P~~~~~~~~L~lt~dG~L--------vL~d~~g~~vWst~~~~~~---~~~a~LlDsGNLVL~d~ 132 (446)
.+.|+|.-..+..... .+.+..+|+| ...|-.|.++|.-...+.. +-....+++||++++..
T Consensus 136 ~G~Vrw~~~~~~~~~~----~~~~l~nG~ll~~~~~~~~e~D~~G~v~~~~~l~~~~~~~HHD~~~l~nGn~L~l~~ 208 (477)
T PF05935_consen 136 NGDVRWYLPLDSGSDN----SFKQLPNGNLLIGSGNRLYEIDLLGKVIWEYDLPGGYYDFHHDIDELPNGNLLILAS 208 (477)
T ss_dssp TS-EEEEE-GGGT--S----SEEE-TTS-EEEEEBTEEEEE-TT--EEEEEE--TTEE-B-S-EEE-TTS-EEEEEE
T ss_pred CccEEEEEccCccccc----eeeEcCCCCEEEecCCceEEEcCCCCEEEeeecCCcccccccccEECCCCCEEEEEe
Confidence 3568897665532111 1444455554 4467788999986554322 45678899999998754
No 65
>PF05454 DAG1: Dystroglycan (Dystrophin-associated glycoprotein 1); InterPro: IPR008465 Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in Homo sapiens. The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton. [alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. There is also evidence that dystroglycan functions as part of a signal transduction pathway, because Grb2, a mediator of the Ras-related signal pathway, has been shown to interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex is shown to act as a receptor for the basement membrane. Dystroglycan expression in Mus musculus brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear.; PDB: 1EG4_P.
Probab=30.64 E-value=17 Score=36.07 Aligned_cols=7 Identities=43% Similarity=0.719 Sum_probs=0.0
Q ss_pred EeeccCC
Q 013272 432 IRRRRKK 438 (446)
Q Consensus 432 ~~rr~~~ 438 (446)
.+||||+
T Consensus 169 cyrrkR~ 175 (290)
T PF05454_consen 169 CYRRKRK 175 (290)
T ss_dssp -------
T ss_pred hhhhhhc
Confidence 3444443
No 66
>PF05935 Arylsulfotrans: Arylsulfotransferase (ASST); InterPro: IPR010262 This family consists of several bacterial arylsulphotransferase proteins. Arylsulphotransferase (ASST) transfers a sulphate group from phenolic sulphate esters to a phenolic acceptor substrate.; PDB: 3ETT_B 3ELQ_A 3ETS_A.
Probab=29.42 E-value=89 Score=33.21 Aligned_cols=53 Identities=21% Similarity=0.409 Sum_probs=31.5
Q ss_pred CCcEEEEcCCCceEEeecCCCCccceEEEecCCCeEEEee--------cCeEEEEeeccCcc
Q 013272 93 DGDLRLKGPNDRVGWLSGTSRQGVERLQILRTGNLVLVDV--------VNRVKWQSFNFPTD 146 (446)
Q Consensus 93 dG~LvL~d~~g~~vWst~~~~~~~~~a~LlDsGNLVL~d~--------~~~~lWQSFD~PTD 146 (446)
.+..++.|.+|.++|-.............+++|+|..... .|+++|+ ++.|..
T Consensus 127 ~~~~~~iD~~G~Vrw~~~~~~~~~~~~~~l~nG~ll~~~~~~~~e~D~~G~v~~~-~~l~~~ 187 (477)
T PF05935_consen 127 SSYTYLIDNNGDVRWYLPLDSGSDNSFKQLPNGNLLIGSGNRLYEIDLLGKVIWE-YDLPGG 187 (477)
T ss_dssp EEEEEEEETTS-EEEEE-GGGT--SSEEE-TTS-EEEEEBTEEEEE-TT--EEEE-EE--TT
T ss_pred CceEEEECCCccEEEEEccCccccceeeEcCCCCEEEecCCceEEEcCCCCEEEe-eecCCc
Confidence 4668889999999998765432222367899999997643 4678998 777763
No 67
>TIGR02513 type_III_yscB type III secretion system chaperone, YscB family. Members of this family include YscB of Yersinia and functionally equivalent (but differently named) proteins from type III secretion systems of other pathogens that affect animal cells. YscB acts, along with SycN (TIGR02503), as a chaperone for YopN, a key part of a complex that regulates type III secretion so it responds to contact with the eukaryotic target cell.
Probab=28.15 E-value=1e+02 Score=26.83 Aligned_cols=53 Identities=19% Similarity=0.191 Sum_probs=34.5
Q ss_pred CCcccCCceEEEEecCCcEEEEcC-CCceEEeecCCCC--------------------------ccceEEEecCCCeEEE
Q 013272 78 SRFYVSEKCVLELTKDGDLRLKGP-NDRVGWLSGTSRQ--------------------------GVERLQILRTGNLVLV 130 (446)
Q Consensus 78 ~P~~~~~~~~L~lt~dG~LvL~d~-~g~~vWst~~~~~--------------------------~~~~a~LlDsGNLVL~ 130 (446)
.|+..+..+.-.|.-||-++.+-. .+.++|+|.-... ...+.++-|+|||+|.
T Consensus 14 gpFVAd~qG~Yhl~iD~~~l~l~q~~sellletpL~~~~~~~~d~q~~~lLk~lmQq~l~w~R~~p~aLvld~~~qLiLe 93 (139)
T TIGR02513 14 GPFVADRQGVYHLTIDQHLVMLAQHGSELVLETPLDARMLRPGDNQNVTLLRSLMQQVLAWARRYPQALVLDADGQLILE 93 (139)
T ss_pred CCcccCCCCceEEEEcCcEEEeeccCceEEEeccccchhhCccccccHHHHHHHHHHHHHHHhcCCceEEEcCccchhHH
Confidence 355555567777888886655444 4568999874210 1135788899999985
No 68
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C.; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=26.96 E-value=44 Score=21.96 Aligned_cols=10 Identities=20% Similarity=0.414 Sum_probs=8.2
Q ss_pred CceeecCCCC
Q 013272 286 NSCSCIGLLT 295 (446)
Q Consensus 286 ~~C~Cl~gf~ 295 (446)
.+|.||.||-
T Consensus 18 ~~C~CPeGyI 27 (34)
T PF09064_consen 18 GQCFCPEGYI 27 (34)
T ss_pred CceeCCCceE
Confidence 4899999874
No 69
>PF13570 PQQ_3: PQQ-like domain; PDB: 3HXJ_B 3Q54_A.
Probab=26.78 E-value=99 Score=20.30 Aligned_cols=8 Identities=38% Similarity=0.605 Sum_probs=3.1
Q ss_pred ceEEeecC
Q 013272 104 RVGWLSGT 111 (446)
Q Consensus 104 ~~vWst~~ 111 (446)
+++|+...
T Consensus 2 ~~~W~~~~ 9 (40)
T PF13570_consen 2 KVLWSYDT 9 (40)
T ss_dssp -EEEEEE-
T ss_pred ceeEEEEC
Confidence 44555544
No 70
>PF15065 NCU-G1: Lysosomal transcription factor, NCU-G1
Probab=26.32 E-value=22 Score=36.16 Aligned_cols=14 Identities=21% Similarity=0.060 Sum_probs=9.3
Q ss_pred EEEeeccCccccccc
Q 013272 137 KWQSFNFPTDVMLWG 151 (446)
Q Consensus 137 lWQSFD~PTDTlLPG 151 (446)
||| ||.+.||-++.
T Consensus 73 L~E-fnD~ndta~~~ 86 (350)
T PF15065_consen 73 LIE-FNDVNDTANIS 86 (350)
T ss_pred hee-eeCCCCccccc
Confidence 555 77777776655
No 71
>COG1520 FOG: WD40-like repeat [Function unknown]
Probab=26.25 E-value=3.9e+02 Score=26.90 Aligned_cols=73 Identities=16% Similarity=0.219 Sum_probs=42.6
Q ss_pred CEEEEcCCCCC------CcccCCceEEEEe-cCCcEEEEcCC-CceEEeecCCC---Cc-----c---ceEE-Eec--CC
Q 013272 68 DVKVWNSGHYS------RFYVSEKCVLELT-KDGDLRLKGPN-DRVGWLSGTSR---QG-----V---ERLQ-ILR--TG 125 (446)
Q Consensus 68 ~~vVW~ANrd~------P~~~~~~~~L~lt-~dG~LvL~d~~-g~~vWst~~~~---~~-----~---~~a~-LlD--sG 125 (446)
.+.+|..+... |+.+ ...++-+. .+|.|+-+|++ |+.+|...... .. . ..+- ..+ +|
T Consensus 131 G~~~W~~~~~~~~~~~~~~v~-~~~~v~~~s~~g~~~al~~~tG~~~W~~~~~~~~~~~~~~~~~~~~~~vy~~~~~~~~ 209 (370)
T COG1520 131 GTLVWSRNVGGSPYYASPPVV-GDGTVYVGTDDGHLYALNADTGTLKWTYETPAPLSLSIYGSPAIASGTVYVGSDGYDG 209 (370)
T ss_pred CcEEEEEecCCCeEEecCcEE-cCcEEEEecCCCeEEEEEccCCcEEEEEecCCccccccccCceeecceEEEecCCCcc
Confidence 57899877665 2222 24555555 68999988876 99999854321 10 0 0111 122 44
Q ss_pred CeEEEee-cCeEEEEee
Q 013272 126 NLVLVDV-VNRVKWQSF 141 (446)
Q Consensus 126 NLVL~d~-~~~~lWQSF 141 (446)
+|+=.|. ++..+|+.+
T Consensus 210 ~~~a~~~~~G~~~w~~~ 226 (370)
T COG1520 210 ILYALNAEDGTLKWSQK 226 (370)
T ss_pred eEEEEEccCCcEeeeee
Confidence 5666665 567888753
No 72
>TIGR03300 assembly_YfgL outer membrane assembly lipoprotein YfgL. Members of this protein family are YfgL, a lipoprotein component of a complex that acts in protein insertion into the bacterial outer membrane. Other members of this complex are NlpB, YfiO, and YaeT. This protein contains multiple copies of a repeat that, in other contexts, is associated with binding of the coenzyme PQQ.
Probab=25.96 E-value=1.9e+02 Score=29.12 Aligned_cols=62 Identities=16% Similarity=0.161 Sum_probs=38.6
Q ss_pred CCcccCCceEEEEecCCcEEEEcC-CCceEEeecCCCCc------cceEEEecCCCeEEEee-cCeEEEEe
Q 013272 78 SRFYVSEKCVLELTKDGDLRLKGP-NDRVGWLSGTSRQG------VERLQILRTGNLVLVDV-VNRVKWQS 140 (446)
Q Consensus 78 ~P~~~~~~~~L~lt~dG~LvL~d~-~g~~vWst~~~~~~------~~~a~LlDsGNLVL~d~-~~~~lWQS 140 (446)
.|+.. .+..+..+.+|.|+..|. +|.++|..+..... .......++|.|+..|. +++.+|+.
T Consensus 236 ~p~~~-~~~vy~~~~~g~l~a~d~~tG~~~W~~~~~~~~~p~~~~~~vyv~~~~G~l~~~d~~tG~~~W~~ 305 (377)
T TIGR03300 236 DPVVD-GGQVYAVSYQGRVAALDLRSGRVLWKRDASSYQGPAVDDNRLYVTDADGVVVALDRRSGSELWKN 305 (377)
T ss_pred ccEEE-CCEEEEEEcCCEEEEEECCCCcEEEeeccCCccCceEeCCEEEEECCCCeEEEEECCCCcEEEcc
Confidence 45432 234444566899998886 78899987642211 11223345788888886 46789974
No 73
>PF06697 DUF1191: Protein of unknown function (DUF1191); InterPro: IPR010605 This family contains hypothetical plant proteins of unknown function.
Probab=25.68 E-value=28 Score=34.18 Aligned_cols=17 Identities=35% Similarity=0.671 Sum_probs=8.9
Q ss_pred EEehhhHHHHHHHHHhh
Q 013272 410 ILVGVVDGLIIVLVFGG 426 (446)
Q Consensus 410 i~~~~~~~~~~l~~~~~ 426 (446)
+++++++|++++.++++
T Consensus 215 iv~g~~~G~~~L~ll~~ 231 (278)
T PF06697_consen 215 IVVGVVGGVVLLGLLSL 231 (278)
T ss_pred EEEEehHHHHHHHHHHH
Confidence 45555556655554433
No 74
>PF08374 Protocadherin: Protocadherin; InterPro: IPR013585 The structure of protocadherins is similar to that of classic cadherins (IPR002126 from INTERPRO), but they also have some unique features associated with the cytoplasmic domains. They are expressed in a variety of organisms and are found in high concentrations in the brain where they seem to be localised mainly at cell-cell contact sites. Their expression seems to be developmentally regulated.
Probab=24.54 E-value=14 Score=34.78 Aligned_cols=6 Identities=33% Similarity=0.955 Sum_probs=2.4
Q ss_pred EEehhh
Q 013272 410 ILVGVV 415 (446)
Q Consensus 410 i~~~~~ 415 (446)
|+.|++
T Consensus 43 iVAG~~ 48 (221)
T PF08374_consen 43 IVAGIM 48 (221)
T ss_pred eecchh
Confidence 444433
No 75
>PF06006 DUF905: Bacterial protein of unknown function (DUF905); InterPro: IPR009253 This family consists of several short hypothetical proteobacterial proteins of unknown function.; PDB: 2HJJ_A.
Probab=24.16 E-value=80 Score=24.24 Aligned_cols=17 Identities=35% Similarity=0.843 Sum_probs=8.0
Q ss_pred eEEEeecCeEEEEeecc
Q 013272 127 LVLVDVVNRVKWQSFNF 143 (446)
Q Consensus 127 LVL~d~~~~~lWQSFD~ 143 (446)
||++|.++..+|..|.+
T Consensus 35 lvvRd~~g~mvWRaWNF 51 (70)
T PF06006_consen 35 LVVRDTEGQMVWRAWNF 51 (70)
T ss_dssp EEEE-SS--EEEEEESS
T ss_pred EEEEcCCCcEEEEeecc
Confidence 55555555666665554
No 76
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=23.96 E-value=41 Score=29.05 Aligned_cols=9 Identities=44% Similarity=0.619 Sum_probs=4.6
Q ss_pred EEEeeccCC
Q 013272 430 YLIRRRRKK 438 (446)
Q Consensus 430 ~~~~rr~~~ 438 (446)
+.++||+|.
T Consensus 123 yr~~r~~~~ 131 (139)
T PHA03099 123 YRFTRRTKL 131 (139)
T ss_pred heeeecccC
Confidence 345555553
No 77
>PF01011 PQQ: PQQ enzyme repeat family.; InterPro: IPR002372 Pyrrolo-quinoline quinone (PQQ) is a redox coenzyme, which serves as a cofactor for a number of enzymes (quinoproteins) and particularly for some bacterial dehydrogenases. A number of bacterial quinoproteins belong to this family. Enzymes in this group have repeats of a beta propeller.; PDB: 1H4I_C 1H4J_E 1W6S_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A 1G72_A ....
Probab=23.90 E-value=95 Score=20.30 Aligned_cols=22 Identities=32% Similarity=0.353 Sum_probs=17.7
Q ss_pred ecCCcEEEEcC-CCceEEeecCC
Q 013272 91 TKDGDLRLKGP-NDRVGWLSGTS 112 (446)
Q Consensus 91 t~dG~LvL~d~-~g~~vWst~~~ 112 (446)
+.+|.|+-+|. .|.++|.-+..
T Consensus 7 ~~~g~l~AlD~~TG~~~W~~~~~ 29 (38)
T PF01011_consen 7 TPDGYLYALDAKTGKVLWKFQTG 29 (38)
T ss_dssp TTTSEEEEEETTTTSEEEEEESS
T ss_pred CCCCEEEEEECCCCCEEEeeeCC
Confidence 67899988886 68999987764
No 78
>KOG3637 consensus Vitronectin receptor, alpha subunit [Extracellular structures]
Probab=23.07 E-value=40 Score=39.40 Aligned_cols=32 Identities=25% Similarity=0.516 Sum_probs=18.7
Q ss_pred cEEEEEehhhHHHHHHHHHhheeeE--EEeeccC
Q 013272 406 KWVLILVGVVDGLIIVLVFGGLAYY--LIRRRRK 437 (446)
Q Consensus 406 ~~i~i~~~~~~~~~~l~~~~~~~~~--~~~rr~~ 437 (446)
.+.+|++++++|+++|++|.+++|- +++|.|+
T Consensus 977 p~wiIi~svl~GLLlL~llv~~LwK~GFFKR~r~ 1010 (1030)
T KOG3637|consen 977 PLWIIILSVLGGLLLLALLVLLLWKCGFFKRNRK 1010 (1030)
T ss_pred ceeeehHHHHHHHHHHHHHHHHHHhcCccccCCC
Confidence 4445666677777776666655442 3455553
No 79
>PF12545 DUF3739: Filamentous haemagglutinin family outer membrane protein; InterPro: IPR021026 This entry represents a conserved sequence domain found in a number of bacterial filamentous haemagglutinins, usually in association with PF05860 from PFAM.
Probab=22.43 E-value=4.2e+02 Score=22.37 Aligned_cols=10 Identities=30% Similarity=0.743 Sum_probs=7.4
Q ss_pred CEEEEcCCCC
Q 013272 68 DVKVWNSGHY 77 (446)
Q Consensus 68 ~~vVW~ANrd 77 (446)
+-.+|++|.|
T Consensus 3 Di~iWSs~Gd 12 (112)
T PF12545_consen 3 DILIWSSNGD 12 (112)
T ss_pred CEEEEeccCc
Confidence 3468888877
No 80
>KOG4289 consensus Cadherin EGF LAG seven-pass G-type receptor [Signal transduction mechanisms]
Probab=22.37 E-value=59 Score=38.91 Aligned_cols=36 Identities=19% Similarity=0.351 Sum_probs=25.3
Q ss_pred CCCCCCCCCcCCCC---CceeecCCCCCCC----CCCCCCCCC
Q 013272 272 LPLGCKPCEICTFT---NSCSCIGLLTKKE----KDKSDCGCG 307 (446)
Q Consensus 272 ~~~~CG~~g~C~~~---~~C~Cl~gf~~~~----~~~~~C~~~ 307 (446)
+.+.||++|-|... -+|.|-|+|.-.. .+.+-|.++
T Consensus 1243 Ys~pC~nng~C~srEggYtCeCrpg~tGehCEvs~~agrCvpG 1285 (2531)
T KOG4289|consen 1243 YSGPCGNNGRCRSREGGYTCECRPGFTGEHCEVSARAGRCVPG 1285 (2531)
T ss_pred hcCCCCCCCceEEecCceeEEecCCccccceeeecccCccccc
Confidence 47899999999742 3899999875442 234556555
No 81
>PF05568 ASFV_J13L: African swine fever virus J13L protein; InterPro: IPR008385 This family consists of several African swine fever virus (ASFV) j13L proteins.
Probab=21.96 E-value=21 Score=31.41 Aligned_cols=12 Identities=25% Similarity=0.487 Sum_probs=5.6
Q ss_pred eeEEEeeccCCc
Q 013272 428 AYYLIRRRRKKS 439 (446)
Q Consensus 428 ~~~~~~rr~~~~ 439 (446)
++|+-+||||.+
T Consensus 49 i~lcssRKkKaa 60 (189)
T PF05568_consen 49 IYLCSSRKKKAA 60 (189)
T ss_pred HHHHhhhhHHHH
Confidence 444444444444
No 82
>cd00216 PQQ_DH Dehydrogenases with pyrrolo-quinoline quinone (PQQ) as cofactor, like ethanol, methanol, and membrane bound glucose dehydrogenases. The alignment model contains an 8-bladed beta-propeller.
Probab=21.15 E-value=2.8e+02 Score=29.40 Aligned_cols=70 Identities=16% Similarity=0.177 Sum_probs=0.0
Q ss_pred EEEEcCCCC-------CCcccCCceEEEEecCCcEEEEcC-CCceEEeecCCCC----------------c-cceEEEec
Q 013272 69 VKVWNSGHY-------SRFYVSEKCVLELTKDGDLRLKGP-NDRVGWLSGTSRQ----------------G-VERLQILR 123 (446)
Q Consensus 69 ~vVW~ANrd-------~P~~~~~~~~L~lt~dG~LvL~d~-~g~~vWst~~~~~----------------~-~~~a~LlD 123 (446)
+++|..+-. .|+.. .+.....+.+|.|+-+|. .|.++|+.+.... . ..+..-..
T Consensus 40 ~~~W~~~~~~~~~~~~sPvv~-~g~vy~~~~~g~l~AlD~~tG~~~W~~~~~~~~~~~~~~~~~~g~~~~~~~~V~v~~~ 118 (488)
T cd00216 40 KVAWTFSTGDERGQEGTPLVV-DGDMYFTTSHSALFALDAATGKVLWRYDPKLPADRGCCDVVNRGVAYWDPRKVFFGTF 118 (488)
T ss_pred eeeEEEECCCCCCcccCCEEE-CCEEEEeCCCCcEEEEECCCChhhceeCCCCCccccccccccCCcEEccCCeEEEecC
Q ss_pred CCCeEEEeec-CeEEEE
Q 013272 124 TGNLVLVDVV-NRVKWQ 139 (446)
Q Consensus 124 sGNLVL~d~~-~~~lWQ 139 (446)
+|.++-+|.. ++.+|+
T Consensus 119 ~g~v~AlD~~TG~~~W~ 135 (488)
T cd00216 119 DGRLVALDAETGKQVWK 135 (488)
T ss_pred CCeEEEEECCCCCEeee
No 83
>COG1520 FOG: WD40-like repeat [Function unknown]
Probab=21.13 E-value=2.4e+02 Score=28.53 Aligned_cols=60 Identities=15% Similarity=0.217 Sum_probs=38.8
Q ss_pred CcccCCceEEEEec-CCcEEEEcC-CCceEEeecCCC-----Ccc---ceEEE-e-cCCCeEEEeec-CeEEEE
Q 013272 79 RFYVSEKCVLELTK-DGDLRLKGP-NDRVGWLSGTSR-----QGV---ERLQI-L-RTGNLVLVDVV-NRVKWQ 139 (446)
Q Consensus 79 P~~~~~~~~L~lt~-dG~LvL~d~-~g~~vWst~~~~-----~~~---~~a~L-l-DsGNLVL~d~~-~~~lWQ 139 (446)
|+... .++|-+.. +|.|.-+|. +|+.+|+.+... ... ..... . ++|.++-.|.+ +..+|+
T Consensus 106 ~~~~~-~G~i~~g~~~g~~y~ld~~~G~~~W~~~~~~~~~~~~~~v~~~~~v~~~s~~g~~~al~~~tG~~~W~ 178 (370)
T COG1520 106 PILGS-DGKIYVGSWDGKLYALDASTGTLVWSRNVGGSPYYASPPVVGDGTVYVGTDDGHLYALNADTGTLKWT 178 (370)
T ss_pred ceEEe-CCeEEEecccceEEEEECCCCcEEEEEecCCCeEEecCcEEcCcEEEEecCCCeEEEEEccCCcEEEE
Confidence 44443 45566644 677877887 899999987654 111 11122 2 67999888877 788996
No 84
>smart00181 EGF Epidermal growth factor-like domain.
Probab=20.93 E-value=78 Score=19.79 Aligned_cols=22 Identities=32% Similarity=0.463 Sum_probs=14.9
Q ss_pred CCCCCCCcCCCC---CceeecCCCCC
Q 013272 274 LGCKPCEICTFT---NSCSCIGLLTK 296 (446)
Q Consensus 274 ~~CG~~g~C~~~---~~C~Cl~gf~~ 296 (446)
..|... .|... ..|.|++||..
T Consensus 6 ~~C~~~-~C~~~~~~~~C~C~~g~~g 30 (35)
T smart00181 6 GPCSNG-TCINTPGSYTCSCPPGYTG 30 (35)
T ss_pred CCCCCC-EEECCCCCeEeECCCCCcc
Confidence 456666 77642 47999998754
Done!