Query 010261
Match_columns 514
No_of_seqs 314 out of 1594
Neff 7.4
Searched_HMMs 46136
Date Thu Mar 28 22:27:30 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/010261.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/010261hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF01453 B_lectin: D-mannose b 99.9 3.9E-25 8.5E-30 192.9 5.9 87 106-194 1-95 (114)
2 cd00028 B_lectin Bulb-type man 99.9 2.2E-23 4.7E-28 182.6 13.6 105 75-181 7-116 (116)
3 smart00108 B_lectin Bulb-type 99.9 2.1E-22 4.5E-27 175.9 13.0 104 75-180 7-114 (114)
4 PF00954 S_locus_glycop: S-loc 99.8 9.4E-19 2E-23 151.9 9.7 102 228-339 1-108 (110)
5 PF08276 PAN_2: PAN-like domai 99.5 1E-14 2.2E-19 114.6 6.2 63 359-423 4-66 (66)
6 cd01098 PAN_AP_plant Plant PAN 99.4 3.1E-13 6.8E-18 110.8 8.4 80 353-438 3-84 (84)
7 cd00129 PAN_APPLE PAN/APPLE-li 99.2 1.8E-11 4E-16 99.6 6.1 67 360-436 9-79 (80)
8 smart00473 PAN_AP divergent su 98.4 8.5E-07 1.9E-11 70.9 8.1 69 365-436 7-77 (78)
9 smart00108 B_lectin Bulb-type 98.4 1.6E-06 3.6E-11 75.4 9.0 81 126-230 25-111 (114)
10 cd00028 B_lectin Bulb-type man 98.0 2.8E-05 6E-10 67.9 8.6 81 127-231 26-113 (116)
11 PF01453 B_lectin: D-mannose b 97.6 0.00052 1.1E-08 59.9 10.4 96 78-181 14-113 (114)
12 cd01100 APPLE_Factor_XI_like S 97.6 6.2E-05 1.3E-09 60.1 4.1 34 383-418 24-57 (73)
13 PF14295 PAN_4: PAN domain; PD 95.0 0.018 4E-07 41.9 2.5 35 382-416 14-51 (51)
14 PF00024 PAN_1: PAN domain Thi 94.3 0.076 1.6E-06 42.1 4.7 36 383-420 22-58 (79)
15 smart00223 APPLE APPLE domain. 93.6 0.11 2.4E-06 42.2 4.3 48 368-418 7-57 (79)
16 PF08693 SKG6: Transmembrane a 89.8 0.2 4.4E-06 35.0 1.6 7 473-479 33-39 (40)
17 PF04478 Mid2: Mid2 like cell 86.8 0.47 1E-05 43.1 2.4 8 449-456 51-58 (154)
18 cd01099 PAN_AP_HGF Subfamily o 86.0 1.6 3.5E-05 35.2 5.0 36 383-420 24-61 (80)
19 PF08277 PAN_3: PAN-like domai 83.4 3.6 7.8E-05 32.0 5.8 33 382-418 18-50 (71)
20 PTZ00382 Variant-specific surf 83.2 1 2.3E-05 37.9 2.8 6 472-477 88-93 (96)
21 PF01102 Glycophorin_A: Glycop 83.0 0.11 2.5E-06 45.6 -3.2 29 453-481 66-94 (122)
22 PF15102 TMEM154: TMEM154 prot 81.7 1.7 3.8E-05 39.2 3.7 12 471-482 77-88 (146)
23 PF01034 Syndecan: Syndecan do 80.7 0.5 1.1E-05 36.5 -0.1 15 466-480 24-38 (64)
24 smart00605 CW CW domain. 78.5 7.8 0.00017 32.2 6.5 55 382-439 20-76 (94)
25 PF01299 Lamp: Lysosome-associ 68.8 2.4 5.1E-05 43.4 1.3 27 456-482 275-301 (306)
26 PF01683 EB: EB module; Inter 65.2 5.4 0.00012 29.2 2.2 32 308-339 16-47 (52)
27 TIGR01478 STEVOR variant surfa 59.2 1.8 3.8E-05 43.3 -1.7 11 344-356 170-180 (295)
28 PTZ00370 STEVOR; Provisional 57.5 2 4.2E-05 43.1 -1.7 11 344-356 170-180 (296)
29 PF13360 PQQ_2: PQQ-like domai 56.5 87 0.0019 29.6 9.7 78 99-177 48-148 (238)
30 cd00053 EGF Epidermal growth f 55.8 9.9 0.00021 24.4 2.1 26 314-339 2-31 (36)
31 PF06024 DUF912: Nucleopolyhed 54.8 9.8 0.00021 32.3 2.3 14 469-482 80-93 (101)
32 PF13908 Shisa: Wnt and FGF in 54.4 10 0.00022 35.5 2.7 12 452-463 80-91 (179)
33 PF04478 Mid2: Mid2 like cell 54.3 15 0.00032 33.6 3.5 25 451-475 49-73 (154)
34 PF07645 EGF_CA: Calcium-bindi 53.5 4 8.7E-05 28.6 -0.2 27 312-338 3-34 (42)
35 PF06365 CD34_antigen: CD34/Po 51.7 7.4 0.00016 37.3 1.2 29 452-480 101-129 (202)
36 PF02009 Rifin_STEVOR: Rifin/s 50.9 2.6 5.5E-05 43.0 -2.1 10 471-480 274-283 (299)
37 PF03302 VSP: Giardia variant- 50.2 8.9 0.00019 40.8 1.7 30 451-480 367-396 (397)
38 KOG4649 PQQ (pyrrolo-quinoline 49.1 52 0.0011 33.0 6.6 44 106-149 168-217 (354)
39 smart00179 EGF_CA Calcium-bind 47.7 15 0.00032 24.4 2.0 27 312-338 3-33 (39)
40 PF13360 PQQ_2: PQQ-like domai 47.5 1.1E+02 0.0024 28.9 8.8 73 104-177 10-102 (238)
41 PHA03265 envelope glycoprotein 47.3 8.1 0.00018 39.7 0.8 30 449-482 349-378 (402)
42 PF15102 TMEM154: TMEM154 prot 47.1 16 0.00034 33.2 2.5 29 453-481 62-90 (146)
43 PF07974 EGF_2: EGF-like domai 46.4 13 0.00029 24.6 1.5 21 318-338 6-28 (32)
44 PRK11138 outer membrane biogen 44.1 98 0.0021 32.4 8.5 48 129-177 128-186 (394)
45 TIGR03300 assembly_YfgL outer 43.5 67 0.0014 33.3 7.0 47 129-176 73-130 (377)
46 cd05845 Ig2_L1-CAM_like Second 41.7 34 0.00073 28.7 3.6 34 104-137 31-64 (95)
47 PF01436 NHL: NHL repeat; Int 39.4 42 0.00091 21.2 3.0 20 125-144 7-26 (28)
48 TIGR03300 assembly_YfgL outer 38.3 1.8E+02 0.0039 30.1 9.3 71 105-176 83-170 (377)
49 cd00054 EGF_CA Calcium-binding 37.9 26 0.00055 22.7 1.9 27 312-338 3-33 (38)
50 PF12877 DUF3827: Domain of un 37.0 15 0.00032 41.0 0.8 30 450-480 269-298 (684)
51 PRK11138 outer membrane biogen 33.5 2.3E+02 0.005 29.6 9.3 18 130-147 265-283 (394)
52 PF02009 Rifin_STEVOR: Rifin/s 31.5 5.8 0.00012 40.5 -3.1 26 456-481 262-287 (299)
53 PF00157 Pou: Pou domain - N-t 31.2 10 0.00023 30.4 -1.0 26 6-31 41-66 (75)
54 PF12458 DUF3686: ATPase invol 27.6 1.3E+02 0.0027 32.3 5.7 58 74-139 309-368 (448)
55 smart00765 MANEC The MANEC dom 27.1 89 0.0019 26.2 3.7 35 383-417 37-73 (93)
56 PF07354 Sp38: Zona-pellucida- 27.0 74 0.0016 31.8 3.8 33 106-138 12-44 (271)
57 PF01034 Syndecan: Syndecan do 25.6 18 0.00039 28.0 -0.6 30 451-481 13-42 (64)
58 PF12662 cEGF: Complement Clr- 24.8 50 0.0011 20.5 1.4 9 331-339 4-12 (24)
59 TIGR01477 RIFIN variant surfac 24.0 16 0.00035 38.0 -1.5 13 462-474 322-334 (353)
60 PTZ00046 rifin; Provisional 22.6 18 0.0004 37.7 -1.4 13 462-474 327-339 (358)
61 PF07172 GRP: Glycine rich pro 21.9 70 0.0015 26.9 2.2 7 28-34 3-9 (95)
62 PF12947 EGF_3: EGF domain; I 21.8 17 0.00038 24.7 -1.2 22 317-338 5-30 (36)
63 PF09064 Tme5_EGF_like: Thromb 21.5 52 0.0011 22.2 1.0 8 330-337 19-26 (34)
64 PF08374 Protocadherin: Protoc 21.1 25 0.00054 33.9 -0.7 8 451-458 42-49 (221)
65 PF12661 hEGF: Human growth fa 20.8 21 0.00045 18.7 -0.7 8 331-338 2-9 (13)
66 PHA03099 epidermal growth fact 20.5 63 0.0014 28.7 1.7 14 467-480 117-130 (139)
67 KOG1219 Uncharacterized conser 20.3 1E+02 0.0023 39.9 3.9 26 312-338 3865-3895(4289)
68 cd05852 Ig5_Contactin-1 Fifth 20.2 1.2E+02 0.0025 23.6 3.1 34 104-138 13-46 (73)
No 1
>PF01453 B_lectin: D-mannose binding lectin; InterPro: IPR001480 A bulb lectin super-family (Amaryllidaceae, Orchidaceae and Alliaceae) contains a ~115-residue-long domain whose overall three-dimensional fold is very similar to that of Dictyostelium discoideum comitin, an actin-binding protein, and Curculigo latifolia curculin, a sweet-tasting and taste-modifying protein. This domain generally binds mannose, but in at least one protein, curculin, it is apparently devoid of mannose-binding activity. Each bulb-type lectin domain consists of three sequential beta-sheet subdomains (I, II, III) that are inter-related by pseudo three-fold symmetry. The three subdomains are flat, four-stranded, antiparallel beta-sheets. Together they form a 12-stranded beta-barrel in which the barrel axis coincides with the pseudo 3-fold axis.; GO: 0005529 sugar binding; PDB: 3M7H_A 3M7J_B 3MEZ_D 1DLP_A 1BWU_D 1KJ1_A 1B2P_A 1XD6_A 2DPF_C 2D04_B ....
Probab=99.91 E-value=3.9e-25 Score=192.90 Aligned_cols=87 Identities=30% Similarity=0.547 Sum_probs=64.0
Q ss_pred CCceEEeecCCCCCC---CCCcEEEeecCcEEEecCCCceEEee-cCCC----CcEEEeecCCCEEEEecCCCCeeeeee
Q 010261 106 SSKPLWLANSTQLAP---WSDRIELSFNGSLVISGPHSRVFWST-TRAE----GQRVVILNTSNLQIQKLDDPLSVVWQS 177 (514)
Q Consensus 106 ~~tvVWvANR~~Pv~---~~~~l~L~~~G~LvL~d~~g~~vWst-~~s~----~~~a~LldsGNLVL~~~~~~~~~lWQS 177 (514)
++||||+|||++|+. ...+|.|+.||+|+|.|..++++|++ ++.. +..|+|+|+|||||++. .+ .+||||
T Consensus 1 ~~tvvW~an~~~p~~~~s~~~~L~l~~dGnLvl~~~~~~~iWss~~t~~~~~~~~~~~L~~~GNlvl~d~-~~-~~lW~S 78 (114)
T PF01453_consen 1 PRTVVWVANRNSPLTSSSGNYTLILQSDGNLVLYDSNGSVIWSSNNTSGRGNSGCYLVLQDDGNLVLYDS-SG-NVLWQS 78 (114)
T ss_dssp ---------TTEEEEECETTEEEEEETTSEEEEEETTTEEEEE--S-TTSS-SSEEEEEETTSEEEEEET-TS-EEEEES
T ss_pred CcccccccccccccccccccccceECCCCeEEEEcCCCCEEEEecccCCccccCeEEEEeCCCCEEEEee-cc-eEEEee
Confidence 369999999999994 23679999999999999999999999 5442 36899999999999994 65 999999
Q ss_pred cCCCCCcccCCCcccCC
Q 010261 178 FDFPTDTLVENQNFTST 194 (514)
Q Consensus 178 FD~PTDTLLPgq~L~~~ 194 (514)
|||||||+||+|+|+.+
T Consensus 79 f~~ptdt~L~~q~l~~~ 95 (114)
T PF01453_consen 79 FDYPTDTLLPGQKLGDG 95 (114)
T ss_dssp TTSSS-EEEEEET--TS
T ss_pred cCCCccEEEeccCcccC
Confidence 99999999999999853
No 2
>cd00028 B_lectin Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses.
Probab=99.90 E-value=2.2e-23 Score=182.64 Aligned_cols=105 Identities=28% Similarity=0.464 Sum_probs=90.7
Q ss_pred CcEEEeCCCeEEEeeeecCCCC-eEEEEEEcCC-CceEEeecCCCCCCCCCcEEEeecCcEEEecCCCceEEeecCCC--
Q 010261 75 QSLLNDTTDTFSLGFLRVNSNQ-LALAVIHLPS-SKPLWLANSTQLAPWSDRIELSFNGSLVISGPHSRVFWSTTRAE-- 150 (514)
Q Consensus 75 ~~~LvS~~g~F~lGFf~~~~s~-~~i~i~~~~~-~tvVWvANR~~Pv~~~~~l~L~~~G~LvL~d~~g~~vWst~~s~-- 150 (514)
+++|+|+++.|++|||.+.... .+..|++... .++||+|||+.|......+.|+.||+|+|.|.+|.+||++++..
T Consensus 7 ~~~l~s~~~~f~~G~~~~~~q~~dgnlv~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S~~~~~~ 86 (116)
T cd00028 7 GQTLVSSGSLFELGFFKLIMQSRDYNLILYKGSSRTVVWVANRDNPSGSSCTLTLQSDGNLVIYDGSGTVVWSSNTTRVN 86 (116)
T ss_pred CCEEEeCCCcEEEecccCCCCCCeEEEEEEeCCCCeEEEECCCCCCCCCCEEEEEecCCCeEEEcCCCcEEEEecccCCC
Confidence 3789999999999999988765 7776665432 68999999999966667899999999999999999999999864
Q ss_pred -CcEEEeecCCCEEEEecCCCCeeeeeecCCC
Q 010261 151 -GQRVVILNTSNLQIQKLDDPLSVVWQSFDFP 181 (514)
Q Consensus 151 -~~~a~LldsGNLVL~~~~~~~~~lWQSFD~P 181 (514)
..+++|+|+|||||++. ++ .+||||||||
T Consensus 87 ~~~~~~L~ddGnlvl~~~-~~-~~~W~Sf~~P 116 (116)
T cd00028 87 GNYVLVLLDDGNLVLYDS-DG-NFLWQSFDYP 116 (116)
T ss_pred CceEEEEeCCCCEEEECC-CC-CEEEcCCCCC
Confidence 26789999999999995 66 8999999999
No 3
>smart00108 B_lectin Bulb-type mannose-specific lectin.
Probab=99.88 E-value=2.1e-22 Score=175.87 Aligned_cols=104 Identities=29% Similarity=0.473 Sum_probs=89.0
Q ss_pred CcEEEeCCCeEEEeeeecCCCCeEEEEEEcCC-CceEEeecCCCCCCCCCcEEEeecCcEEEecCCCceEEeecCC-C-C
Q 010261 75 QSLLNDTTDTFSLGFLRVNSNQLALAVIHLPS-SKPLWLANSTQLAPWSDRIELSFNGSLVISGPHSRVFWSTTRA-E-G 151 (514)
Q Consensus 75 ~~~LvS~~g~F~lGFf~~~~s~~~i~i~~~~~-~tvVWvANR~~Pv~~~~~l~L~~~G~LvL~d~~g~~vWst~~s-~-~ 151 (514)
++.|+|+++.|++|||.+.....+..|++... .++||+|||+.|+..+..+.|+.||+|+|.|.+|.+||++++. . +
T Consensus 7 ~~~l~s~~~~f~~G~~~~~~q~dgnlV~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S~t~~~~~ 86 (114)
T smart00108 7 GQTLVSGNSLFELGFFTLIMQNDYNLILYKSSSRTVVWVANRDNPVSDSCTLTLQSDGNLVLYDGDGRVVWSSNTTGANG 86 (114)
T ss_pred CCEEecCCCcEeeeccccCCCCCEEEEEEECCCCcEEEECCCCCCCCCCEEEEEeCCCCEEEEeCCCCEEEEecccCCCC
Confidence 37899999999999999876666666665332 6899999999998776789999999999999999999999986 2 2
Q ss_pred -cEEEeecCCCEEEEecCCCCeeeeeecCC
Q 010261 152 -QRVVILNTSNLQIQKLDDPLSVVWQSFDF 180 (514)
Q Consensus 152 -~~a~LldsGNLVL~~~~~~~~~lWQSFD~ 180 (514)
.+++|+|+|||||++. ++ .+|||||||
T Consensus 87 ~~~~~L~ddGnlvl~~~-~~-~~~W~Sf~~ 114 (114)
T smart00108 87 NYVLVLLDDGNLVIYDS-DG-NFLWQSFDY 114 (114)
T ss_pred ceEEEEeCCCCEEEECC-CC-CEEeCCCCC
Confidence 5799999999999984 66 899999997
No 4
>PF00954 S_locus_glycop: S-locus glycoprotein family; InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=99.77 E-value=9.4e-19 Score=151.86 Aligned_cols=102 Identities=25% Similarity=0.504 Sum_probs=76.6
Q ss_pred EEecccCCceeeeeccccEEEEECCCcceEEEeeCCCcceeeEEEEEe-eeCCCcEEEEEEeeCCceEEEEe--eCCceE
Q 010261 228 WRHRALEAKADIVEGKGPIYVRVNSDGFLGTYQVGNNVPVDVEAFNNF-QRNSSGLLTLRLEQDGNLKGHYW--DGTNWV 304 (514)
Q Consensus 228 W~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~rl~Ld~dG~lr~y~w--~~~~W~ 304 (514)
||+|+|++. .++ +.|.+... . -+...+.. ++.+.+++| ..+.+.++|++||+||++++|.| +.++|.
T Consensus 1 wrsG~WnG~--~f~-g~p~~~~~-~-~~~~~fv~-----~~~e~~~t~~~~~~s~~~r~~ld~~G~l~~~~w~~~~~~W~ 70 (110)
T PF00954_consen 1 WRSGPWNGQ--RFS-GIPEMSSN-S-LYNYSFVS-----NNEEVYYTYSLSNSSVLSRLVLDSDGQLQRYIWNESTQSWS 70 (110)
T ss_pred CCccccCCe--EEC-Cccccccc-c-eeEEEEEE-----CCCeEEEEEecCCCceEEEEEEeeeeEEEEEEEecCCCcEE
Confidence 788999853 354 35543211 1 02122222 245677777 34567899999999999999999 468999
Q ss_pred EEeeeccCCCCCCCCCCCCCcCCCCC---cccCCCCCC
Q 010261 305 LNYQAISDACQLPSPCGSYSLCKQSG---CSCLDNRTD 339 (514)
Q Consensus 305 ~~~~~p~~~Cd~~g~CG~~giC~~~~---C~Cl~g~~~ 339 (514)
+.|++|.|+||+|+.||+||+|+.+. |+||+||++
T Consensus 71 ~~~~~p~d~Cd~y~~CG~~g~C~~~~~~~C~Cl~GF~P 108 (110)
T PF00954_consen 71 VFWSAPKDQCDVYGFCGPNGICNSNNSPKCSCLPGFEP 108 (110)
T ss_pred EEEEecccCCCCccccCCccEeCCCCCCceECCCCcCC
Confidence 99999999999999999999998743 999999975
No 5
>PF08276 PAN_2: PAN-like domain; InterPro: IPR013227 PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. These domains contain a hairpin-loop-like structure, similar to knottins, but the pattern of disulphide bonds differs.
Probab=99.54 E-value=1e-14 Score=114.61 Aligned_cols=63 Identities=32% Similarity=0.540 Sum_probs=51.0
Q ss_pred ccceEEEEEecccCCCcceeeeccCCCHHHHHHHhhccCCeEEEEecCCCCcceEEEeeccccce
Q 010261 359 KSRFRVLRRKGVELPFKELIRYEMTSYLEQCEDLCQNNCSCWGALYNNASGSGFCYMLDYPIQTL 423 (514)
Q Consensus 359 ~d~F~~~~~~~v~~P~~~~~~~~~~~sl~~C~~~CL~nCSC~Ay~y~~~~gsG~C~l~~~~L~~~ 423 (514)
+|+| +++++|++|+++...++.+.++++||++||+||||+||+|.+..+++.|++|.++|.++
T Consensus 4 ~d~F--~~l~~~~~p~~~~~~~~~~~s~~~C~~~Cl~nCsC~Ayay~~~~~~~~C~lW~~~L~d~ 66 (66)
T PF08276_consen 4 GDGF--LKLPNMKLPDFDNAIVDSSVSLEECEKACLSNCSCTAYAYSNLSGGGGCLLWYGDLVDL 66 (66)
T ss_pred CCEE--EEECCeeCCCCcceeeecCCCHHHHHhhcCCCCCEeeEEeeccCCCCEEEEEcCEeecC
Confidence 4778 56789999999666555679999999999999999999998532245699999888763
No 6
>cd01098 PAN_AP_plant Plant PAN/APPLE-like domain; present in plant S-receptor protein kinases and secreted glycoproteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. S-receptor protein kinases and S-locus glycoproteins are involved in sporophytic self-incompatibility response in Brassica, one of probably many molecular mechanisms, by which hermaphrodite flowering plants avoid self-fertilization.
Probab=99.44 E-value=3.1e-13 Score=110.75 Aligned_cols=80 Identities=30% Similarity=0.503 Sum_probs=60.3
Q ss_pred CCCCCCc--cceEEEEEecccCCCcceeeeccCCCHHHHHHHhhccCCeEEEEecCCCCcceEEEeeccccceeecCCCC
Q 010261 353 DFCSEDK--SRFRVLRRKGVELPFKELIRYEMTSYLEQCEDLCQNNCSCWGALYNNASGSGFCYMLDYPIQTLLGAGDVS 430 (514)
Q Consensus 353 ~~C~~~~--d~F~~~~~~~v~~P~~~~~~~~~~~sl~~C~~~CL~nCSC~Ay~y~~~~gsG~C~l~~~~L~~~~~~~~~~ 430 (514)
+.|+.+. +.| +++.++++|+.... . .+.++++|+++||+||+|+||+|.++ +|.|++|..++.+.+.....+
T Consensus 3 ~~C~~~~~~~~f--~~~~~~~~~~~~~~-~-~~~s~~~C~~~Cl~nCsC~a~~~~~~--~~~C~~~~~~~~~~~~~~~~~ 76 (84)
T cd01098 3 LNCGGDGSTDGF--LKLPDVKLPDNASA-I-TAISLEECREACLSNCSCTAYAYNNG--SGGCLLWNGLLNNLRSLSSGG 76 (84)
T ss_pred cccCCCCCCCEE--EEeCCeeCCCchhh-h-ccCCHHHHHHHHhcCCCcceeeecCC--CCeEEEEeceecceEeecCCC
Confidence 3575432 355 56679999987543 2 67899999999999999999999875 567999998888765433344
Q ss_pred eeEEEEEe
Q 010261 431 KLGYFKLR 438 (514)
Q Consensus 431 ~~~yIKv~ 438 (514)
..+||||+
T Consensus 77 ~~~yiKv~ 84 (84)
T cd01098 77 GTLYLRLA 84 (84)
T ss_pred cEEEEEeC
Confidence 67899985
No 7
>cd00129 PAN_APPLE PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions.
Probab=99.22 E-value=1.8e-11 Score=99.60 Aligned_cols=67 Identities=12% Similarity=0.224 Sum_probs=53.2
Q ss_pred cceEEEEEecccCCCcceeeeccCCCHHHHHHHhhc---cCCeEEEEecCCCCcceEEEeeccc-cceeecCCCCeeEEE
Q 010261 360 SRFRVLRRKGVELPFKELIRYEMTSYLEQCEDLCQN---NCSCWGALYNNASGSGFCYMLDYPI-QTLLGAGDVSKLGYF 435 (514)
Q Consensus 360 d~F~~~~~~~v~~P~~~~~~~~~~~sl~~C~~~CL~---nCSC~Ay~y~~~~gsG~C~l~~~~L-~~~~~~~~~~~~~yI 435 (514)
..| +++.+|++|++.. .+++||+++|++ ||||+||+|.+. +.| |++|.+++ .+++...+++.++|+
T Consensus 9 g~f--l~~~~~klpd~~~------~s~~eC~~~Cl~~~~nCsC~Aya~~~~-~~g-C~~W~~~l~~d~~~~~~~g~~Ly~ 78 (80)
T cd00129 9 GTT--LIKIALKIKTTKA------NTADECANRCEKNGLPFSCKAFVFAKA-RKQ-CLWFPFNSMSGVRKEFSHGFDLYE 78 (80)
T ss_pred CeE--EEeecccCCcccc------cCHHHHHHHHhcCCCCCCceeeeccCC-CCC-eEEecCcchhhHHhccCCCceeEe
Confidence 456 4567899998743 689999999999 999999999754 134 99999999 777655555677899
Q ss_pred E
Q 010261 436 K 436 (514)
Q Consensus 436 K 436 (514)
|
T Consensus 79 r 79 (80)
T cd00129 79 N 79 (80)
T ss_pred E
Confidence 8
No 8
>smart00473 PAN_AP divergent subfamily of APPLE domains. Apple-like domains present in Plasminogen, C. elegans hypothetical ORFs and the extracellular portion of plant receptor-like protein kinases. Predicted to possess protein- and/or carbohydrate-binding functions.
Probab=98.45 E-value=8.5e-07 Score=70.88 Aligned_cols=69 Identities=28% Similarity=0.335 Sum_probs=48.6
Q ss_pred EEEecccCCCcceeeeccCCCHHHHHHHhhc-cCCeEEEEecCCCCcceEEEee-ccccceeecCCCCeeEEEE
Q 010261 365 LRRKGVELPFKELIRYEMTSYLEQCEDLCQN-NCSCWGALYNNASGSGFCYMLD-YPIQTLLGAGDVSKLGYFK 436 (514)
Q Consensus 365 ~~~~~v~~P~~~~~~~~~~~sl~~C~~~CL~-nCSC~Ay~y~~~~gsG~C~l~~-~~L~~~~~~~~~~~~~yIK 436 (514)
+++.++++|+..... ....++++|++.|++ +|+|.||.|..+ ++.|++|. .++.+.......+...|.|
T Consensus 7 ~~~~~~~l~~~~~~~-~~~~s~~~C~~~C~~~~~~C~s~~y~~~--~~~C~l~~~~~~~~~~~~~~~~~~~y~~ 77 (78)
T smart00473 7 VRLPNTKLPGFSRIV-ISVASLEECASKCLNSNCSCRSFTYNNG--TKGCLLWSESSLGDARLFPSGGVDLYEK 77 (78)
T ss_pred EEecCccCCCCccee-EcCCCHHHHHHHhCCCCCceEEEEEcCC--CCEEEEeeCCccccceecccCCceeEEe
Confidence 456788888553322 346799999999999 999999999863 45799998 6666654223333445554
No 9
>smart00108 B_lectin Bulb-type mannose-specific lectin.
Probab=98.38 E-value=1.6e-06 Score=75.44 Aligned_cols=81 Identities=22% Similarity=0.402 Sum_probs=58.7
Q ss_pred EEeecCcEEEecCC-CceEEeecCCC----CcEEEeecCCCEEEEecCCCCeeeeeecCCCCCcccCCCcccCCceEEee
Q 010261 126 ELSFNGSLVISGPH-SRVFWSTTRAE----GQRVVILNTSNLQIQKLDDPLSVVWQSFDFPTDTLVENQNFTSTMSLVSS 200 (514)
Q Consensus 126 ~L~~~G~LvL~d~~-g~~vWst~~s~----~~~a~LldsGNLVL~~~~~~~~~lWQSFD~PTDTLLPgq~L~~~~~L~Ss 200 (514)
....||+||+.+.. +.+||++++.. +..+.|.++|||||++. ++ .++|+|= |+ ++
T Consensus 25 ~~q~dgnlV~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~-~g-~~vW~S~---t~---~~------------ 84 (114)
T smart00108 25 IMQNDYNLILYKSSSRTVVWVANRDNPVSDSCTLTLQSDGNLVLYDG-DG-RVVWSSN---TT---GA------------ 84 (114)
T ss_pred CCCCCEEEEEEECCCCcEEEECCCCCCCCCCEEEEEeCCCCEEEEeC-CC-CEEEEec---cc---CC------------
Confidence 34568999998765 57999999853 24788999999999984 56 8999982 11 22
Q ss_pred cceeEEEecCC-ceeeEEEecCCccceEEEe
Q 010261 201 NGLYSMRLGSN-FIGLYAKFNDKSEQIYWRH 230 (514)
Q Consensus 201 ~G~y~l~~~~~-~~~l~~~~~~~~~~~YW~~ 230 (514)
.+.|.+.++++ +++++- ...++.|.+
T Consensus 85 ~~~~~~~L~ddGnlvl~~----~~~~~~W~S 111 (114)
T smart00108 85 NGNYVLVLLDDGNLVIYD----SDGNFLWQS 111 (114)
T ss_pred CCceEEEEeCCCCEEEEC----CCCCEEeCC
Confidence 34578889888 888862 113577865
No 10
>cd00028 B_lectin Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses.
Probab=98.01 E-value=2.8e-05 Score=67.93 Aligned_cols=81 Identities=23% Similarity=0.387 Sum_probs=58.3
Q ss_pred Eee-cCcEEEecCC-CceEEeecCCC----CcEEEeecCCCEEEEecCCCCeeeeeecCCCCCcccCCCcccCCceEEee
Q 010261 127 LSF-NGSLVISGPH-SRVFWSTTRAE----GQRVVILNTSNLQIQKLDDPLSVVWQSFDFPTDTLVENQNFTSTMSLVSS 200 (514)
Q Consensus 127 L~~-~G~LvL~d~~-g~~vWst~~s~----~~~a~LldsGNLVL~~~~~~~~~lWQSFD~PTDTLLPgq~L~~~~~L~Ss 200 (514)
... ||+||+.+.. +.+||++|+.. ...+.|.++|||||.+. ++ .++|||=-.. .
T Consensus 26 ~q~~dgnlv~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~-~g-~~vW~S~~~~------------------~ 85 (116)
T cd00028 26 MQSRDYNLILYKGSSRTVVWVANRDNPSGSSCTLTLQSDGNLVIYDG-SG-TVVWSSNTTR------------------V 85 (116)
T ss_pred CCCCeEEEEEEeCCCCeEEEECCCCCCCCCCEEEEEecCCCeEEEcC-CC-cEEEEecccC------------------C
Confidence 344 8999998754 57999999753 25788999999999984 66 8999975321 0
Q ss_pred cceeEEEecCC-ceeeEEEecCCccceEEEec
Q 010261 201 NGLYSMRLGSN-FIGLYAKFNDKSEQIYWRHR 231 (514)
Q Consensus 201 ~G~y~l~~~~~-~~~l~~~~~~~~~~~YW~~~ 231 (514)
.+.+.+.++++ +++++-. + ..+.|.+.
T Consensus 86 ~~~~~~~L~ddGnlvl~~~--~--~~~~W~Sf 113 (116)
T cd00028 86 NGNYVLVLLDDGNLVLYDS--D--GNFLWQSF 113 (116)
T ss_pred CCceEEEEeCCCCEEEECC--C--CCEEEcCC
Confidence 24577888888 8888632 1 35678763
No 11
>PF01453 B_lectin: D-mannose binding lectin; InterPro: IPR001480 A bulb lectin super-family (Amaryllidaceae, Orchidaceae and Alliaceae) contains a ~115-residue-long domain whose overall three-dimensional fold is very similar to that of Dictyostelium discoideum comitin, an actin-binding protein, and Curculigo latifolia curculin, a sweet-tasting and taste-modifying protein. This domain generally binds mannose, but in at least one protein, curculin, it is apparently devoid of mannose-binding activity. Each bulb-type lectin domain consists of three sequential beta-sheet subdomains (I, II, III) that are inter-related by pseudo three-fold symmetry. The three subdomains are flat, four-stranded, antiparallel beta-sheets. Together they form a 12-stranded beta-barrel in which the barrel axis coincides with the pseudo 3-fold axis.; GO: 0005529 sugar binding; PDB: 3M7H_A 3M7J_B 3MEZ_D 1DLP_A 1BWU_D 1KJ1_A 1B2P_A 1XD6_A 2DPF_C 2D04_B ....
Probab=97.63 E-value=0.00052 Score=59.86 Aligned_cols=96 Identities=18% Similarity=0.283 Sum_probs=61.0
Q ss_pred EEeCCCeEEEeeeecCCCCeEEEEEEcCCCceEEee-cCCCCCCCCCcEEEeecCcEEEecCCCceEEeecCCCC-cEEE
Q 010261 78 LNDTTDTFSLGFLRVNSNQLALAVIHLPSSKPLWLA-NSTQLAPWSDRIELSFNGSLVISGPHSRVFWSTTRAEG-QRVV 155 (514)
Q Consensus 78 LvS~~g~F~lGFf~~~~s~~~i~i~~~~~~tvVWvA-NR~~Pv~~~~~l~L~~~G~LvL~d~~g~~vWst~~s~~-~~a~ 155 (514)
+.+.+|.+.|-|-..++ |.+.. ...++||.. +...+......+.|..||||||.|..+.++|++..... ..+.
T Consensus 14 ~~~~s~~~~L~l~~dGn----Lvl~~-~~~~~iWss~~t~~~~~~~~~~~L~~~GNlvl~d~~~~~lW~Sf~~ptdt~L~ 88 (114)
T PF01453_consen 14 LTSSSGNYTLILQSDGN----LVLYD-SNGSVIWSSNNTSGRGNSGCYLVLQDDGNLVLYDSSGNVLWQSFDYPTDTLLP 88 (114)
T ss_dssp EEECETTEEEEEETTSE----EEEEE-TTTEEEEE--S-TTSS-SSEEEEEETTSEEEEEETTSEEEEESTTSSS-EEEE
T ss_pred cccccccccceECCCCe----EEEEc-CCCCEEEEecccCCccccCeEEEEeCCCCEEEEeecceEEEeecCCCccEEEe
Confidence 44444777777655443 33332 335679999 43433323457899999999999999999999954322 4566
Q ss_pred eec--CCCEEEEecCCCCeeeeeecCCC
Q 010261 156 ILN--TSNLQIQKLDDPLSVVWQSFDFP 181 (514)
Q Consensus 156 Lld--sGNLVL~~~~~~~~~lWQSFD~P 181 (514)
+++ .||++ ... .. .+.|.|=+.|
T Consensus 89 ~q~l~~~~~~-~~~-~~-~~sw~s~~dp 113 (114)
T PF01453_consen 89 GQKLGDGNVT-GKN-DS-LTSWSSNTDP 113 (114)
T ss_dssp EET--TSEEE-EES-TS-SEEEESS---
T ss_pred ccCcccCCCc-ccc-ce-EEeECCCCCC
Confidence 777 88988 542 33 7889887666
No 12
>cd01100 APPLE_Factor_XI_like Subfamily of PAN/APPLE-like domains; present in plasma prekallikrein/coagulation factor XI, microneme antigen proteins, and a few prokaryotic proteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions.
Probab=97.62 E-value=6.2e-05 Score=60.12 Aligned_cols=34 Identities=29% Similarity=0.603 Sum_probs=30.0
Q ss_pred CCCHHHHHHHhhccCCeEEEEecCCCCcceEEEeec
Q 010261 383 TSYLEQCEDLCQNNCSCWGALYNNASGSGFCYMLDY 418 (514)
Q Consensus 383 ~~sl~~C~~~CL~nCSC~Ay~y~~~~gsG~C~l~~~ 418 (514)
..+.++|++.|+.+|+|.||.|... .+.|++|..
T Consensus 24 ~~s~~~Cq~~C~~~~~C~afT~~~~--~~~C~lk~~ 57 (73)
T cd01100 24 ASSAEQCQAACTADPGCLAFTYNTK--SKKCFLKSS 57 (73)
T ss_pred cCCHHHHHHHcCCCCCceEEEEECC--CCeEEcccC
Confidence 4689999999999999999999865 578999875
No 13
>PF14295 PAN_4: PAN domain; PDB: 2YIL_E 2YIP_C 2YIO_A.
Probab=95.03 E-value=0.018 Score=41.86 Aligned_cols=35 Identities=31% Similarity=0.645 Sum_probs=18.3
Q ss_pred cCCCHHHHHHHhhccCCeEEEEecCC---CCcceEEEe
Q 010261 382 MTSYLEQCEDLCQNNCSCWGALYNNA---SGSGFCYML 416 (514)
Q Consensus 382 ~~~sl~~C~~~CL~nCSC~Ay~y~~~---~gsG~C~l~ 416 (514)
...+.++|.++|.++=.|.++.|... ++.+.|+||
T Consensus 14 ~~~s~~~C~~~C~~~~~C~~~~~~~~~~~~~~~~C~LK 51 (51)
T PF14295_consen 14 TASSPEECQAACAADPGCQAFTFNPPGCPSSSGRCYLK 51 (51)
T ss_dssp ----HHHHHHHHHTSTT--EEEEETTEE----------
T ss_pred cCCCHHHHHHHccCCCCCCEEEEECCCcccccccccCC
Confidence 35689999999999999999999762 235679886
No 14
>PF00024 PAN_1: PAN domain. This Prosite entry concerns apple domains, a subset of PAN domains; InterPro: IPR003014 PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. These domains contain a hairpin-loop-like structure, similar to knottins, but the pattern of disulphide bonds differs. It has been shown that the N-terminal N domains of members of the plasminogen/hepatocyte growth factor family, the apple domains of the plasma prekallikrein/coagulation factor XI family, and domains of various nematode proteins belong to the same module superfamily, the PAN module. PAN contains a conserved core of three disulphide bridges. In some members of the family there is an additional fourth disulphide bridge that links the N and C termini of the domain.; PDB: 1GP9_C 2QJ2_B 1GMO_H 1NK1_B 3MKP_B 1BHT_B 3HN4_A 1GMN_A 3HMS_A 3HMT_B ....
Probab=94.32 E-value=0.076 Score=42.06 Aligned_cols=36 Identities=31% Similarity=0.586 Sum_probs=30.7
Q ss_pred CCCHHHHHHHhhccCC-eEEEEecCCCCcceEEEeeccc
Q 010261 383 TSYLEQCEDLCQNNCS-CWGALYNNASGSGFCYMLDYPI 420 (514)
Q Consensus 383 ~~sl~~C~~~CL~nCS-C~Ay~y~~~~gsG~C~l~~~~L 420 (514)
..++++|.+.|+.+=. |.+|.|... ++.|+|+...-
T Consensus 22 v~s~~~C~~~C~~~~~~C~s~~y~~~--~~~C~L~~~~~ 58 (79)
T PF00024_consen 22 VPSLEECAQLCLNEPRRCKSFNYDPS--SKTCYLSSSDR 58 (79)
T ss_dssp ESSHHHHHHHHHHSTT-ESEEEEETT--TTEEEEECSSS
T ss_pred CCCHHHHHhhcCcCcccCCeEEEECC--CCEEEEcCCCC
Confidence 3599999999999999 999999876 66899987533
No 15
>smart00223 APPLE APPLE domain. Four-fold repeat in plasma kallikrein and coagulation factor XI. Factor XI apple 3 mediates binding to platelets. Factor XI apple 1 binds high-molecular-mass kininogen. Apple 4 in factor XI mediates dimer formation and binds to factor XIIa. Mutations in apple 4 cause factor XI deficiency, an inherited bleeding disorder.
Probab=93.57 E-value=0.11 Score=42.18 Aligned_cols=48 Identities=15% Similarity=0.294 Sum_probs=36.1
Q ss_pred ecccCCCcceeeeccCCCHHHHHHHhhccCCeEEEEecCCCCcc---eEEEeec
Q 010261 368 KGVELPFKELIRYEMTSYLEQCEDLCQNNCSCWGALYNNASGSG---FCYMLDY 418 (514)
Q Consensus 368 ~~v~~P~~~~~~~~~~~sl~~C~~~CL~nCSC~Ay~y~~~~gsG---~C~l~~~ 418 (514)
.+++++..+.. .....+.++|++.|..+=.|.+|.|... .+ .|+||..
T Consensus 7 ~~~df~G~Dl~-~~~~~~~~~Cq~~Ct~~~~C~~FTf~~~--~~~~~~C~LK~s 57 (79)
T smart00223 7 KNVDFRGSDIN-TVYVPSAQVCQKRCTSHPRCLFFTFSTN--EPPEEKCLLKDS 57 (79)
T ss_pred cCccccCceee-eeecCCHHHHHHhhcCCCCccEEEeeCC--CCCCCEeEeCcC
Confidence 45666665443 2345789999999999999999999765 33 7999864
No 16
>PF08693 SKG6: Transmembrane alpha-helix domain; InterPro: IPR014805 SKG6 and AXL2 are membrane proteins that show polarised intracellular localisation. This entry represents the highly conserved transmembrane alpha-helical domain found in these proteins. The full-length AXL2 protein has a negative regulatory function in cytokinesis.
Probab=89.80 E-value=0.2 Score=34.96 Aligned_cols=7 Identities=0% Similarity=0.306 Sum_probs=2.7
Q ss_pred eEEEeee
Q 010261 473 YKIWTSR 479 (514)
Q Consensus 473 ~~i~~rr 479 (514)
|+.|||+
T Consensus 33 ~~~~rR~ 39 (40)
T PF08693_consen 33 FFWYRRK 39 (40)
T ss_pred heEEecc
Confidence 3334443
No 17
>PF04478 Mid2: Mid2 like cell wall stress sensor; InterPro: IPR007567 This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway.
Probab=86.80 E-value=0.47 Score=43.06 Aligned_cols=8 Identities=25% Similarity=0.451 Sum_probs=3.3
Q ss_pred cceeEEee
Q 010261 449 GIAAGIGI 456 (514)
Q Consensus 449 ~~~~~i~~ 456 (514)
++++++|+
T Consensus 51 VIGvVVGV 58 (154)
T PF04478_consen 51 VIGVVVGV 58 (154)
T ss_pred EEEEEecc
Confidence 34444443
No 18
>cd01099 PAN_AP_HGF Subfamily of PAN/APPLE-like domains; present in N-terminal (N) domains of plasminogen/hepatocyte growth factor proteins, and various proteins found in Bilateria, such as leech anti-platelet proteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions.
Probab=86.01 E-value=1.6 Score=35.21 Aligned_cols=36 Identities=28% Similarity=0.460 Sum_probs=30.0
Q ss_pred CCCHHHHHHHhhc--cCCeEEEEecCCCCcceEEEeeccc
Q 010261 383 TSYLEQCEDLCQN--NCSCWGALYNNASGSGFCYMLDYPI 420 (514)
Q Consensus 383 ~~sl~~C~~~CL~--nCSC~Ay~y~~~~gsG~C~l~~~~L 420 (514)
..++++|.++|++ +=.|.++.|... ++.|.|-..+-
T Consensus 24 ~~s~~~C~~~C~~~~~f~CrSf~y~~~--~~~C~L~~~~~ 61 (80)
T cd01099 24 VASLEECLRKCLEETEFTCRSFNYNYK--SKECILSDEDR 61 (80)
T ss_pred cCCHHHHHHHhCCCCCceEeEEEEEcC--CCEEEEeCCCc
Confidence 4799999999999 899999999766 56799866433
No 19
>PF08277 PAN_3: PAN-like domain; InterPro: IPR006583 PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. These domains contain a hairpin-loop-like structure, similar to knottins, but the pattern of disulphide bonds differs. The PAN-3 or CW domain is associated with a number of Caenorhabditis elegans hypothetical proteins.
Probab=83.42 E-value=3.6 Score=32.01 Aligned_cols=33 Identities=27% Similarity=0.759 Sum_probs=27.5
Q ss_pred cCCCHHHHHHHhhccCCeEEEEecCCCCcceEEEeec
Q 010261 382 MTSYLEQCEDLCQNNCSCWGALYNNASGSGFCYMLDY 418 (514)
Q Consensus 382 ~~~sl~~C~~~CL~nCSC~Ay~y~~~~gsG~C~l~~~ 418 (514)
...+.++|-+.|..+=.|.++.++ . +.|++...
T Consensus 18 ~~~sw~~Cv~~C~~~~~C~la~~~-~---~~C~~y~~ 50 (71)
T PF08277_consen 18 TNTSWDDCVQKCYNDENCVLAYFD-S---GKCYLYNY 50 (71)
T ss_pred cCCCHHHHhHHhCCCCEEEEEEeC-C---CCEEEEEc
Confidence 457889999999999999999887 3 25998763
No 20
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=83.23 E-value=1 Score=37.94 Aligned_cols=6 Identities=0% Similarity=-0.070 Sum_probs=2.4
Q ss_pred eeEEEe
Q 010261 472 GYKIWT 477 (514)
Q Consensus 472 ~~~i~~ 477 (514)
.|+++|
T Consensus 88 w~f~~r 93 (96)
T PTZ00382 88 WWFVCR 93 (96)
T ss_pred heeEEe
Confidence 344443
No 21
>PF01102 Glycophorin_A: Glycophorin A; InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=82.96 E-value=0.11 Score=45.56 Aligned_cols=29 Identities=38% Similarity=0.518 Sum_probs=14.5
Q ss_pred EEeehhHHHHHHHHHHhheeeEEEeeeec
Q 010261 453 GIGILGGALLILIGVILFGGYKIWTSRRA 481 (514)
Q Consensus 453 ~i~~~~~~~~~li~~~~~~~~~i~~rrk~ 481 (514)
++++++++++.+|++++++.|+|.||||+
T Consensus 66 i~~Ii~gv~aGvIg~Illi~y~irR~~Kk 94 (122)
T PF01102_consen 66 IIGIIFGVMAGVIGIILLISYCIRRLRKK 94 (122)
T ss_dssp HHHHHHHHHHHHHHHHHHHHHHHHHHS--
T ss_pred eeehhHHHHHHHHHHHHHHHHHHHHHhcc
Confidence 34444444444555556667776544443
No 22
>PF15102 TMEM154: TMEM154 protein family
Probab=81.66 E-value=1.7 Score=39.22 Aligned_cols=12 Identities=8% Similarity=0.072 Sum_probs=5.7
Q ss_pred eeeEEEeeeecc
Q 010261 471 GGYKIWTSRRAN 482 (514)
Q Consensus 471 ~~~~i~~rrk~~ 482 (514)
+++++.||||.|
T Consensus 77 ~lv~~~kRkr~K 88 (146)
T PF15102_consen 77 CLVIYYKRKRTK 88 (146)
T ss_pred HheeEEeecccC
Confidence 333444565544
No 23
>PF01034 Syndecan: Syndecan domain; InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions. Structurally, these proteins consist of four separate domains: a signal sequence; an extracellular domain (ectodomain) of variable length whose sequence is not evolutionarily conserved in the various forms of syndecans, and which contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; a transmembrane region; and a highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: Syndecan 1; Syndecan 2 or fibroglycan; Syndecan 3 or neuroglycan or N-syndecan; Syndecan 4 or amphiglycan or ryudocan; Drosophila syndecan; Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of the phospholipid activator PIP2, and that its overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of the dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts strongly not only with fatty acyl groups but also with the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion.; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=80.73 E-value=0.5 Score=36.46 Aligned_cols=15 Identities=27% Similarity=0.288 Sum_probs=0.4
Q ss_pred HHHhheeeEEEeeee
Q 010261 466 GVILFGGYKIWTSRR 480 (514)
Q Consensus 466 ~~~~~~~~~i~~rrk 480 (514)
++++++.|+++|.||
T Consensus 24 ~ailLIlf~iyR~rk 38 (64)
T PF01034_consen 24 FAILLILFLIYRMRK 38 (64)
T ss_dssp -------------S-
T ss_pred HHHHHHHHHHHHHHh
Confidence 333344455555443
No 24
>smart00605 CW CW domain.
Probab=78.46 E-value=7.8 Score=32.17 Aligned_cols=55 Identities=25% Similarity=0.433 Sum_probs=35.1
Q ss_pred cCCCHHHHHHHhhccCCeEEEEecCCCCcceEEEeec-cccceeecCC-CCeeEEEEEec
Q 010261 382 MTSYLEQCEDLCQNNCSCWGALYNNASGSGFCYMLDY-PIQTLLGAGD-VSKLGYFKLRE 439 (514)
Q Consensus 382 ~~~sl~~C~~~CL~nCSC~Ay~y~~~~gsG~C~l~~~-~L~~~~~~~~-~~~~~yIKv~~ 439 (514)
...+.++|.+.|..+..|..+.... +..|++... .+..+++... .+..+=+|+..
T Consensus 20 ~~~sw~~Ci~~C~~~~~Cvlay~~~---~~~C~~f~~~~~~~v~~~~~~~~~~VAfK~~~ 76 (94)
T smart00605 20 ATLSWDECIQKCYEDSNCVLAYGNS---SETCYLFSYGTVLTVKKLSSSSGKKVAFKVST 76 (94)
T ss_pred cCCCHHHHHHHHhCCCceEEEecCC---CCceEEEEcCCeEEEEEccCCCCcEEEEEEeC
Confidence 3578999999999999999875543 236998764 2333433322 22334566654
No 25
>PF01299 Lamp: Lysosome-associated membrane glycoprotein (Lamp); InterPro: IPR002000 Lysosome-associated membrane glycoproteins (lamp) are integral membrane proteins, specific to lysosomes, whose exact biological function is not yet clear. Structurally, the lamp proteins consist of two internally homologous lysosome-luminal domains separated by a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region (TM) followed by a very short cytoplasmic tail (C). In each of the duplicated domains, there are two conserved disulphide bonds. This structure is schematically represented in the figure below. +-----+ +-----+ +-----+ +-----+ | | | | | | | | xCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxxxCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxx +--------------------------++Hinge++--------------------------++TM++C+ In mammals, there are two closely related types of lamp: lamp-1 and lamp-2, which form major components of the lysosome membrane. In chicken, lamp-1 is known as LEP100. Also included in this entry is the macrophage protein CD68 (or macrosialin), a heavily glycosylated integral membrane protein whose structure consists of a mucin-like domain followed by a proline-rich hinge, a single lamp-like domain, a transmembrane region and a short cytoplasmic tail. Similar to CD68, mammalian lamp-3, which is expressed in lymphoid organs, dendritic cells and in lung, contains all the C-terminal regions but lacks the N-terminal lamp-like region. In a lamp-family protein from nematodes only the part C-terminal to the hinge is conserved. ; GO: 0016020 membrane
Probab=68.81 E-value=2.4 Score=43.45 Aligned_cols=27 Identities=30% Similarity=0.250 Sum_probs=15.0
Q ss_pred ehhHHHHHHHHHHhheeeEEEeeeecc
Q 010261 456 ILGGALLILIGVILFGGYKIWTSRRAN 482 (514)
Q Consensus 456 ~~~~~~~~li~~~~~~~~~i~~rrk~~ 482 (514)
|+|++.++.+++++++.|+|.|||+.+
T Consensus 275 IaVG~~La~lvlivLiaYli~Rrr~~~ 301 (306)
T PF01299_consen 275 IAVGAALAGLVLIVLIAYLIGRRRSRA 301 (306)
T ss_pred HHHHHHHHHHHHHHHHhheeEeccccc
Confidence 334444444445556678887666543
No 26
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=65.17 E-value=5.4 Score=29.21 Aligned_cols=32 Identities=19% Similarity=0.512 Sum_probs=26.9
Q ss_pred eeccCCCCCCCCCCCCCcCCCCCcccCCCCCC
Q 010261 308 QAISDACQLPSPCGSYSLCKQSGCSCLDNRTD 339 (514)
Q Consensus 308 ~~p~~~Cd~~g~CG~~giC~~~~C~Cl~g~~~ 339 (514)
..|.+.|+...-|-.+++|..+.|.|++|+..
T Consensus 16 ~~~g~~C~~~~qC~~~s~C~~g~C~C~~g~~~ 47 (52)
T PF01683_consen 16 VQPGESCESDEQCIGGSVCVNGRCQCPPGYVE 47 (52)
T ss_pred CCCCCCCCCcCCCCCcCEEcCCEeECCCCCEe
Confidence 45668899999999999997777999998754
No 27
>TIGR01478 STEVOR variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits.
Probab=59.23 E-value=1.8 Score=43.32 Aligned_cols=11 Identities=27% Similarity=0.528 Sum_probs=6.1
Q ss_pred CCCccCCCCCCCC
Q 010261 344 GECFASTSGDFCS 356 (514)
Q Consensus 344 ~GC~r~~~~~~C~ 356 (514)
++|.|. --.|.
T Consensus 170 ~rC~~g--i~~Cs 180 (295)
T TIGR01478 170 KGCTAG--VGTCA 180 (295)
T ss_pred ccCCCe--eEeec
Confidence 577753 33464
No 28
>PTZ00370 STEVOR; Provisional
Probab=57.46 E-value=2 Score=43.09 Aligned_cols=11 Identities=27% Similarity=0.377 Sum_probs=6.3
Q ss_pred CCCccCCCCCCCC
Q 010261 344 GECFASTSGDFCS 356 (514)
Q Consensus 344 ~GC~r~~~~~~C~ 356 (514)
.+|.|. --.|.
T Consensus 170 ~rC~~g--i~~Cs 180 (296)
T PTZ00370 170 HRCTGG--ICSCS 180 (296)
T ss_pred ccCCCe--eEeec
Confidence 578763 33465
No 29
>PF13360 PQQ_2: PQQ-like domain; PDB: 3HXJ_B 1YIQ_A 1KV9_A 3Q54_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A ....
Probab=56.49 E-value=87 Score=29.61 Aligned_cols=78 Identities=19% Similarity=0.334 Sum_probs=49.4
Q ss_pred EEEEEcCCCceEEeecCCCCCC-----CCCcEE-EeecCcEEEec-CCCceEEee-cCC----C----------CcEEEe
Q 010261 99 LAVIHLPSSKPLWLANSTQLAP-----WSDRIE-LSFNGSLVISG-PHSRVFWST-TRA----E----------GQRVVI 156 (514)
Q Consensus 99 i~i~~~~~~tvVWvANR~~Pv~-----~~~~l~-L~~~G~LvL~d-~~g~~vWst-~~s----~----------~~~a~L 156 (514)
+..+.+...+++|...-+.++. ..+.+. .+.+|.|...| .+|.++|.. ... . +..+.+
T Consensus 48 l~~~d~~tG~~~W~~~~~~~~~~~~~~~~~~v~v~~~~~~l~~~d~~tG~~~W~~~~~~~~~~~~~~~~~~~~~~~~~~~ 127 (238)
T PF13360_consen 48 LYALDAKTGKVLWRFDLPGPISGAPVVDGGRVYVGTSDGSLYALDAKTGKVLWSIYLTSSPPAGVRSSSSPAVDGDRLYV 127 (238)
T ss_dssp EEEEETTTSEEEEEEECSSCGGSGEEEETTEEEEEETTSEEEEEETTTSCEEEEEEE-SSCTCSTB--SEEEEETTEEEE
T ss_pred EEEEECCCCCEEEEeeccccccceeeecccccccccceeeeEecccCCcceeeeeccccccccccccccCceEecCEEEE
Confidence 3444555678999988766532 233343 34467788888 889999994 321 1 112233
Q ss_pred -ecCCCEEEEecCCCCeeeeee
Q 010261 157 -LNTSNLQIQKLDDPLSVVWQS 177 (514)
Q Consensus 157 -ldsGNLVL~~~~~~~~~lWQS 177 (514)
..+|.|+..|..+| +.+|+=
T Consensus 128 ~~~~g~l~~~d~~tG-~~~w~~ 148 (238)
T PF13360_consen 128 GTSSGKLVALDPKTG-KLLWKY 148 (238)
T ss_dssp EETCSEEEEEETTTT-EEEEEE
T ss_pred EeccCcEEEEecCCC-cEEEEe
Confidence 33889999986577 899974
No 30
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=55.76 E-value=9.9 Score=24.37 Aligned_cols=26 Identities=27% Similarity=0.763 Sum_probs=19.0
Q ss_pred CCCCCCCCCCCcCCCC--C--cccCCCCCC
Q 010261 314 CQLPSPCGSYSLCKQS--G--CSCLDNRTD 339 (514)
Q Consensus 314 Cd~~g~CG~~giC~~~--~--C~Cl~g~~~ 339 (514)
|.....|..++.|... . |.|++||..
T Consensus 2 C~~~~~C~~~~~C~~~~~~~~C~C~~g~~g 31 (36)
T cd00053 2 CAASNPCSNGGTCVNTPGSYRCVCPPGYTG 31 (36)
T ss_pred CCCCCCCCCCCEEecCCCCeEeECCCCCcc
Confidence 4445678888999762 2 999988754
No 31
>PF06024 DUF912: Nucleopolyhedrovirus protein of unknown function (DUF912); InterPro: IPR009261 This entry is represented by Autographa californica nuclear polyhedrosis virus (AcMNPV), Orf78; it is a family of uncharacterised viral proteins.
Probab=54.80 E-value=9.8 Score=32.34 Aligned_cols=14 Identities=14% Similarity=0.126 Sum_probs=7.2
Q ss_pred hheeeEEEeeeecc
Q 010261 469 LFGGYKIWTSRRAN 482 (514)
Q Consensus 469 ~~~~~~i~~rrk~~ 482 (514)
++.+|.|.|.|+.+
T Consensus 80 ~IyYFVILRer~~~ 93 (101)
T PF06024_consen 80 AIYYFVILRERQKS 93 (101)
T ss_pred hheEEEEEeccccc
Confidence 44455665555444
No 32
>PF13908 Shisa: Wnt and FGF inhibitory regulator
Probab=54.43 E-value=10 Score=35.48 Aligned_cols=12 Identities=25% Similarity=0.573 Sum_probs=5.2
Q ss_pred eEEeehhHHHHH
Q 010261 452 AGIGILGGALLI 463 (514)
Q Consensus 452 ~~i~~~~~~~~~ 463 (514)
++++++++++++
T Consensus 80 iivgvi~~Vi~I 91 (179)
T PF13908_consen 80 IIVGVICGVIAI 91 (179)
T ss_pred eeeehhhHHHHH
Confidence 444454444333
No 33
>PF04478 Mid2: Mid2 like cell wall stress sensor; InterPro: IPR007567 This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway.
Probab=54.34 E-value=15 Score=33.55 Aligned_cols=25 Identities=20% Similarity=0.042 Sum_probs=15.3
Q ss_pred eeEEeehhHHHHHHHHHHhheeeEE
Q 010261 451 AAGIGILGGALLILIGVILFGGYKI 475 (514)
Q Consensus 451 ~~~i~~~~~~~~~li~~~~~~~~~i 475 (514)
.++||+++++-+++|++++.++|++
T Consensus 49 nIVIGvVVGVGg~ill~il~lvf~~ 73 (154)
T PF04478_consen 49 NIVIGVVVGVGGPILLGILALVFIF 73 (154)
T ss_pred cEEEEEEecccHHHHHHHHHhheeE
Confidence 5899999998554444433333433
No 34
>PF07645 EGF_CA: Calcium-binding EGF domain; InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. +------------------+ +---------+ | | | | nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx | | +------------------+ 'n': negatively charged or polar residue [DEQN] 'b': possibly beta-hydroxylated residue [DN] 'a': aromatic amino acid 'C': cysteine, involved in disulphide bond 'x': any amino acid. ; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=53.48 E-value=4 Score=28.59 Aligned_cols=27 Identities=26% Similarity=0.621 Sum_probs=21.6
Q ss_pred CCCCCC-CCCCCCCcCCC--CC--cccCCCCC
Q 010261 312 DACQLP-SPCGSYSLCKQ--SG--CSCLDNRT 338 (514)
Q Consensus 312 ~~Cd~~-g~CG~~giC~~--~~--C~Cl~g~~ 338 (514)
|+|... ..|.+++.|.. +. |.|++||.
T Consensus 3 dEC~~~~~~C~~~~~C~N~~Gsy~C~C~~Gy~ 34 (42)
T PF07645_consen 3 DECAEGPHNCPENGTCVNTEGSYSCSCPPGYE 34 (42)
T ss_dssp STTTTTSSSSSTTSEEEEETTEEEEEESTTEE
T ss_pred cccCCCCCcCCCCCEEEcCCCCEEeeCCCCcE
Confidence 678774 48999999975 33 99999986
No 35
>PF06365 CD34_antigen: CD34/Podocalyxin family; InterPro: IPR013836 This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation. This family contains several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighbouring foot processes in the glomerular epithelium by charge repulsion.
Probab=51.69 E-value=7.4 Score=37.32 Aligned_cols=29 Identities=21% Similarity=0.395 Sum_probs=17.8
Q ss_pred eEEeehhHHHHHHHHHHhheeeEEEeeee
Q 010261 452 AGIGILGGALLILIGVILFGGYKIWTSRR 480 (514)
Q Consensus 452 ~~i~~~~~~~~~li~~~~~~~~~i~~rrk 480 (514)
++|++++...++|++++++.+|++|+||.
T Consensus 101 ~lI~lv~~g~~lLla~~~~~~Y~~~~Rrs 129 (202)
T PF06365_consen 101 TLIALVTSGSFLLLAILLGAGYCCHQRRS 129 (202)
T ss_pred EEEehHHhhHHHHHHHHHHHHHHhhhhcc
Confidence 55555554434555556666777787775
No 36
>PF02009 Rifin_STEVOR: Rifin/stevor family; InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens.
Probab=50.90 E-value=2.6 Score=43.02 Aligned_cols=10 Identities=40% Similarity=0.245 Sum_probs=5.1
Q ss_pred eeeEEEeeee
Q 010261 471 GGYKIWTSRR 480 (514)
Q Consensus 471 ~~~~i~~rrk 480 (514)
++|+|||+||
T Consensus 274 IIYLILRYRR 283 (299)
T PF02009_consen 274 IIYLILRYRR 283 (299)
T ss_pred HHHHHHHHHH
Confidence 3455555444
No 37
>PF03302 VSP: Giardia variant-specific surface protein; InterPro: IPR005127 During infection, the intestinal protozoan parasite Giardia lamblia undergoes continuous antigenic variation which is determined by diversification of the parasite's major surface antigen, named VSP (variant surface protein).
Probab=50.23 E-value=8.9 Score=40.81 Aligned_cols=30 Identities=17% Similarity=0.179 Sum_probs=18.5
Q ss_pred eeEEeehhHHHHHHHHHHhheeeEEEeeee
Q 010261 451 AAGIGILGGALLILIGVILFGGYKIWTSRR 480 (514)
Q Consensus 451 ~~~i~~~~~~~~~li~~~~~~~~~i~~rrk 480 (514)
++|+||.|+++++|-+++.|+.||+.-|+|
T Consensus 367 gaIaGIsvavvvvVgglvGfLcWwf~crgk 396 (397)
T PF03302_consen 367 GAIAGISVAVVVVVGGLVGFLCWWFICRGK 396 (397)
T ss_pred cceeeeeehhHHHHHHHHHHHhhheeeccc
Confidence 466777776666555555666665555554
No 38
>KOG4649 consensus PQQ (pyrrolo-quinoline quinone) repeat protein [Secondary metabolites biosynthesis, transport and catabolism]
Probab=49.11 E-value=52 Score=32.99 Aligned_cols=44 Identities=20% Similarity=0.302 Sum_probs=33.3
Q ss_pred CCceEEeecCCCCCCCC-----CcEEE-eecCcEEEecCCCceEEeecCC
Q 010261 106 SSKPLWLANSTQLAPWS-----DRIEL-SFNGSLVISGPHSRVFWSTTRA 149 (514)
Q Consensus 106 ~~tvVWvANR~~Pv~~~-----~~l~L-~~~G~LvL~d~~g~~vWst~~s 149 (514)
+.++.|-|-|..|+-.+ ....+ +-||+|.-.|..|+.||.-.+.
T Consensus 168 ~~~~~w~~~~~~PiF~splcv~~sv~i~~VdG~l~~f~~sG~qvwr~~t~ 217 (354)
T KOG4649|consen 168 SSTEFWAATRFGPIFASPLCVGSSVIITTVDGVLTSFDESGRQVWRPATK 217 (354)
T ss_pred CcceehhhhcCCccccCceeccceEEEEEeccEEEEEcCCCcEEEeecCC
Confidence 34899999999997532 23444 3599999899999999976654
No 39
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=47.72 E-value=15 Score=24.36 Aligned_cols=27 Identities=30% Similarity=0.753 Sum_probs=20.0
Q ss_pred CCCCCCCCCCCCCcCCCC--C--cccCCCCC
Q 010261 312 DACQLPSPCGSYSLCKQS--G--CSCLDNRT 338 (514)
Q Consensus 312 ~~Cd~~g~CG~~giC~~~--~--C~Cl~g~~ 338 (514)
++|.....|...+.|... . |.|++|+.
T Consensus 3 ~~C~~~~~C~~~~~C~~~~g~~~C~C~~g~~ 33 (39)
T smart00179 3 DECASGNPCQNGGTCVNTVGSYRCECPPGYT 33 (39)
T ss_pred ccCcCCCCcCCCCEeECCCCCeEeECCCCCc
Confidence 567655678888899752 2 99999875
No 40
>PF13360 PQQ_2: PQQ-like domain; PDB: 3HXJ_B 1YIQ_A 1KV9_A 3Q54_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A ....
Probab=47.50 E-value=1.1e+02 Score=28.87 Aligned_cols=73 Identities=21% Similarity=0.304 Sum_probs=43.1
Q ss_pred cCCCceEEeecC----CCCC----CCCCcEEE-eecCcEEEecC-CCceEEeecCCCC---------cEEEeec-CCCEE
Q 010261 104 LPSSKPLWLANS----TQLA----PWSDRIEL-SFNGSLVISGP-HSRVFWSTTRAEG---------QRVVILN-TSNLQ 163 (514)
Q Consensus 104 ~~~~tvVWvANR----~~Pv----~~~~~l~L-~~~G~LvL~d~-~g~~vWst~~s~~---------~~a~Lld-sGNLV 163 (514)
......+|..+- +.++ ..+..+-+ +.+|.|+..|. +|.++|+...... ..+.+.. +|-|.
T Consensus 10 ~~tG~~~W~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~~d~~tG~~~W~~~~~~~~~~~~~~~~~~v~v~~~~~~l~ 89 (238)
T PF13360_consen 10 PRTGKELWSYDLGPGIGGPVATAVPDGGRVYVASGDGNLYALDAKTGKVLWRFDLPGPISGAPVVDGGRVYVGTSDGSLY 89 (238)
T ss_dssp TTTTEEEEEEECSSSCSSEEETEEEETTEEEEEETTSEEEEEETTTSEEEEEEECSSCGGSGEEEETTEEEEEETTSEEE
T ss_pred CCCCCEEEEEECCCCCCCccceEEEeCCEEEEEcCCCEEEEEECCCCCEEEEeeccccccceeeecccccccccceeeeE
Confidence 345567887753 2222 12233333 36788999996 8999999875321 1223333 44466
Q ss_pred EEecCCCCeeeeee
Q 010261 164 IQKLDDPLSVVWQS 177 (514)
Q Consensus 164 L~~~~~~~~~lWQS 177 (514)
..|..++ +++|+.
T Consensus 90 ~~d~~tG-~~~W~~ 102 (238)
T PF13360_consen 90 ALDAKTG-KVLWSI 102 (238)
T ss_dssp EEETTTS-CEEEEE
T ss_pred ecccCCc-ceeeee
Confidence 6664466 899995
No 41
>PHA03265 envelope glycoprotein D; Provisional
Probab=47.32 E-value=8.1 Score=39.75 Aligned_cols=30 Identities=20% Similarity=0.323 Sum_probs=16.3
Q ss_pred cceeEEeehhHHHHHHHHHHhheeeEEEeeeecc
Q 010261 449 GIAAGIGILGGALLILIGVILFGGYKIWTSRRAN 482 (514)
Q Consensus 449 ~~~~~i~~~~~~~~~li~~~~~~~~~i~~rrk~~ 482 (514)
.++++||+.++.++++ +++ .|++|||||..
T Consensus 349 ~~g~~ig~~i~glv~v-g~i---l~~~~rr~k~~ 378 (402)
T PHA03265 349 FVGISVGLGIAGLVLV-GVI---LYVCLRRKKEL 378 (402)
T ss_pred ccceEEccchhhhhhh-hHH---HHHHhhhhhhh
Confidence 3456666665543322 333 35567787743
No 42
>PF15102 TMEM154: TMEM154 protein family
Probab=47.08 E-value=16 Score=33.16 Aligned_cols=29 Identities=28% Similarity=0.141 Sum_probs=19.1
Q ss_pred EEeehhHHHHHHHHHHhheeeEEEeeeec
Q 010261 453 GIGILGGALLILIGVILFGGYKIWTSRRA 481 (514)
Q Consensus 453 ~i~~~~~~~~~li~~~~~~~~~i~~rrk~ 481 (514)
+|.++++++++|++++++..++.||.|+.
T Consensus 62 lIP~VLLvlLLl~vV~lv~~~kRkr~K~~ 90 (146)
T PF15102_consen 62 LIPLVLLVLLLLSVVCLVIYYKRKRTKQE 90 (146)
T ss_pred eHHHHHHHHHHHHHHHheeEEeecccCCC
Confidence 33334444555666778888899998873
No 43
>PF07974 EGF_2: EGF-like domain; InterPro: IPR013111 A sequence about thirty to forty amino-acid residues long, found in epidermal growth factor (EGF), has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=46.39 E-value=13 Score=24.59 Aligned_cols=21 Identities=24% Similarity=0.580 Sum_probs=16.9
Q ss_pred CCCCCCCcCCCC--CcccCCCCC
Q 010261 318 SPCGSYSLCKQS--GCSCLDNRT 338 (514)
Q Consensus 318 g~CG~~giC~~~--~C~Cl~g~~ 338 (514)
.+|..+|.|... .|.|.+||.
T Consensus 6 ~~C~~~G~C~~~~g~C~C~~g~~ 28 (32)
T PF07974_consen 6 NICSGHGTCVSPCGRCVCDSGYT 28 (32)
T ss_pred CccCCCCEEeCCCCEEECCCCCc
Confidence 579999999854 499998864
No 44
>PRK11138 outer membrane biogenesis protein BamB; Provisional
Probab=44.10 E-value=98 Score=32.44 Aligned_cols=48 Identities=23% Similarity=0.282 Sum_probs=32.1
Q ss_pred ecCcEEEecC-CCceEEeecCCCC---------cE-EEeecCCCEEEEecCCCCeeeeee
Q 010261 129 FNGSLVISGP-HSRVFWSTTRAEG---------QR-VVILNTSNLQIQKLDDPLSVVWQS 177 (514)
Q Consensus 129 ~~G~LvL~d~-~g~~vWst~~s~~---------~~-a~LldsGNLVL~~~~~~~~~lWQS 177 (514)
.+|.|+-+|. +|.++|+.+.... .. .....+|.|+-.|..+| +.+|+-
T Consensus 128 ~~g~l~ald~~tG~~~W~~~~~~~~~ssP~v~~~~v~v~~~~g~l~ald~~tG-~~~W~~ 186 (394)
T PRK11138 128 EKGQVYALNAEDGEVAWQTKVAGEALSRPVVSDGLVLVHTSNGMLQALNESDG-AVKWTV 186 (394)
T ss_pred CCCEEEEEECCCCCCcccccCCCceecCCEEECCEEEEECCCCEEEEEEccCC-CEeeee
Confidence 4678887875 6899999875421 11 12234677888886567 899974
No 45
>TIGR03300 assembly_YfgL outer membrane assembly lipoprotein YfgL. Members of this protein family are YfgL, a lipoprotein component of a complex that acts in protein insertion into the bacterial outer membrane. Other members of this complex are NlpB, YfiO, and YaeT. This protein contains multiple copies of a repeat that, in other contexts, is associated with binding of the coenzyme PQQ.
Probab=43.52 E-value=67 Score=33.30 Aligned_cols=47 Identities=17% Similarity=0.322 Sum_probs=26.3
Q ss_pred ecCcEEEec-CCCceEEeecCCCC---------cEE-EeecCCCEEEEecCCCCeeeee
Q 010261 129 FNGSLVISG-PHSRVFWSTTRAEG---------QRV-VILNTSNLQIQKLDDPLSVVWQ 176 (514)
Q Consensus 129 ~~G~LvL~d-~~g~~vWst~~s~~---------~~a-~LldsGNLVL~~~~~~~~~lWQ 176 (514)
.+|.|.-.| .+|.++|+.+.... ..+ .--.+|+|+-.|..++ +++|+
T Consensus 73 ~~g~v~a~d~~tG~~~W~~~~~~~~~~~p~v~~~~v~v~~~~g~l~ald~~tG-~~~W~ 130 (377)
T TIGR03300 73 ADGTVVALDAETGKRLWRVDLDERLSGGVGADGGLVFVGTEKGEVIALDAEDG-KELWR 130 (377)
T ss_pred CCCeEEEEEccCCcEeeeecCCCCcccceEEcCCEEEEEcCCCEEEEEECCCC-cEeee
Confidence 356676666 56778887654321 111 1223566666664355 77885
No 46
>cd05845 Ig2_L1-CAM_like Second immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM) and similar proteins. Ig2_L1-CAM_like: domain similar to the second immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). L1 belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains, five fibronectin type III domains, a transmembrane region and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, or spastic paraplegia type 1, that involves abnormalities of axonal growth.
Probab=41.72 E-value=34 Score=28.70 Aligned_cols=34 Identities=15% Similarity=0.333 Sum_probs=22.2
Q ss_pred cCCCceEEeecCCCCCCCCCcEEEeecCcEEEec
Q 010261 104 LPSSKPLWLANSTQLAPWSDRIELSFNGSLVISG 137 (514)
Q Consensus 104 ~~~~tvVWvANR~~Pv~~~~~l~L~~~G~LvL~d 137 (514)
.|..++.|+-+....+......+++.+|+|.+.+
T Consensus 31 ~P~P~i~W~~~~~~~i~~~~Ri~~~~~GnL~fs~ 64 (95)
T cd05845 31 AVPLRIYWMNSDLLHITQDERVSMGQNGNLYFAN 64 (95)
T ss_pred CCCCEEEEECCCCccccccccEEECCCceEEEEE
Confidence 4667888984443445545567777778887654
No 47
>PF01436 NHL: NHL repeat; InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C-terminal alpha-amidation of biological peptides. In many proteins it occurs in tandem arrays, for example in the RING finger, B-box, coiled-coil (RBCC) eukaryotic growth regulators. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller. The NHL repeats are also found in serine/threonine protein kinases (STPK) in a diverse range of pathogenic bacteria. These STPK are transmembrane receptors with an intracellular N-terminal kinase domain and an extracellular C-terminal sensor domain. In the STPK PknD from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; GO: 0005515 protein binding; PDB: 3FVZ_A 3FW0_A 1RWL_A 1RWI_A 1Q7F_A.
Probab=39.41 E-value=42 Score=21.18 Aligned_cols=20 Identities=5% Similarity=0.238 Sum_probs=13.7
Q ss_pred EEEeecCcEEEecCCCceEE
Q 010261 125 IELSFNGSLVISGPHSRVFW 144 (514)
Q Consensus 125 l~L~~~G~LvL~d~~g~~vW 144 (514)
+.++.+|++++.|.++.-||
T Consensus 7 vav~~~g~i~VaD~~n~rV~ 26 (28)
T PF01436_consen 7 VAVDSDGNIYVADSGNHRVQ 26 (28)
T ss_dssp EEEETTSEEEEEECCCTEEE
T ss_pred EEEeCCCCEEEEECCCCEEE
Confidence 56667777777776666555
No 48
>TIGR03300 assembly_YfgL outer membrane assembly lipoprotein YfgL. Members of this protein family are YfgL, a lipoprotein component of a complex that acts in protein insertion into the bacterial outer membrane. Other members of this complex are NlpB, YfiO, and YaeT. This protein contains multiple copies of a repeat that, in other contexts, is associated with binding of the coenzyme PQQ.
Probab=38.29 E-value=1.8e+02 Score=30.07 Aligned_cols=71 Identities=14% Similarity=0.257 Sum_probs=43.5
Q ss_pred CCCceEEeecCCCCC-----CCCCcEEE-eecCcEEEecC-CCceEEeecCCCC---------cE-EEeecCCCEEEEec
Q 010261 105 PSSKPLWLANSTQLA-----PWSDRIEL-SFNGSLVISGP-HSRVFWSTTRAEG---------QR-VVILNTSNLQIQKL 167 (514)
Q Consensus 105 ~~~tvVWvANR~~Pv-----~~~~~l~L-~~~G~LvL~d~-~g~~vWst~~s~~---------~~-a~LldsGNLVL~~~ 167 (514)
....++|.-+-..++ -....+-+ +.+|.|+-+|. +|.++|+...... .. ..-..+|.|+..|.
T Consensus 83 ~tG~~~W~~~~~~~~~~~p~v~~~~v~v~~~~g~l~ald~~tG~~~W~~~~~~~~~~~p~v~~~~v~v~~~~g~l~a~d~ 162 (377)
T TIGR03300 83 ETGKRLWRVDLDERLSGGVGADGGLVFVGTEKGEVIALDAEDGKELWRAKLSSEVLSPPLVANGLVVVRTNDGRLTALDA 162 (377)
T ss_pred cCCcEeeeecCCCCcccceEEcCCEEEEEcCCCEEEEEECCCCcEeeeeccCceeecCCEEECCEEEEECCCCeEEEEEc
Confidence 445788875543332 22333433 35788888886 6899998765321 11 12234677888886
Q ss_pred CCCCeeeee
Q 010261 168 DDPLSVVWQ 176 (514)
Q Consensus 168 ~~~~~~lWQ 176 (514)
.++ +++|+
T Consensus 163 ~tG-~~~W~ 170 (377)
T TIGR03300 163 ATG-ERLWT 170 (377)
T ss_pred CCC-ceeeE
Confidence 566 89997
No 49
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=37.95 E-value=26 Score=22.74 Aligned_cols=27 Identities=33% Similarity=0.778 Sum_probs=19.0
Q ss_pred CCCCCCCCCCCCCcCCCC--C--cccCCCCC
Q 010261 312 DACQLPSPCGSYSLCKQS--G--CSCLDNRT 338 (514)
Q Consensus 312 ~~Cd~~g~CG~~giC~~~--~--C~Cl~g~~ 338 (514)
++|.....|...+.|... . |.|++|+.
T Consensus 3 ~~C~~~~~C~~~~~C~~~~~~~~C~C~~g~~ 33 (38)
T cd00054 3 DECASGNPCQNGGTCVNTVGSYRCSCPPGYT 33 (38)
T ss_pred ccCCCCCCcCCCCEeECCCCCeEeECCCCCc
Confidence 567654578878889752 2 99998764
No 50
>PF12877 DUF3827: Domain of unknown function (DUF3827); InterPro: IPR024606 The function of the proteins in this entry is not currently known, but one of the human proteins (Q9HCM3 from SWISSPROT) has been implicated in pilocytic astrocytomas. In the majority of cases of pilocytic astrocytomas a tandem duplication produces an in-frame fusion of the gene encoding this protein and the BRAF oncogene. The resulting fusion protein has constitutive BRAF kinase activity and is capable of transforming cells.
Probab=37.03 E-value=15 Score=40.97 Aligned_cols=30 Identities=20% Similarity=0.359 Sum_probs=14.7
Q ss_pred ceeEEeehhHHHHHHHHHHhheeeEEEeeee
Q 010261 450 IAAGIGILGGALLILIGVILFGGYKIWTSRR 480 (514)
Q Consensus 450 ~~~~i~~~~~~~~~li~~~~~~~~~i~~rrk 480 (514)
.++++|+++.++++++ +++++.+.++|++|
T Consensus 269 lWII~gVlvPv~vV~~-Iiiil~~~LCRk~K 298 (684)
T PF12877_consen 269 LWIIAGVLVPVLVVLL-IIIILYWKLCRKNK 298 (684)
T ss_pred eEEEehHhHHHHHHHH-HHHHHHHHHhcccc
Confidence 4566677666554443 33334444444443
No 51
>PRK11138 outer membrane biogenesis protein BamB; Provisional
Probab=33.50 E-value=2.3e+02 Score=29.61 Aligned_cols=18 Identities=28% Similarity=0.602 Sum_probs=8.0
Q ss_pred cCcEEEecC-CCceEEeec
Q 010261 130 NGSLVISGP-HSRVFWSTT 147 (514)
Q Consensus 130 ~G~LvL~d~-~g~~vWst~ 147 (514)
+|.|.-.|. +|.++|+..
T Consensus 265 ~g~l~ald~~tG~~~W~~~ 283 (394)
T PRK11138 265 NGNLVALDLRSGQIVWKRE 283 (394)
T ss_pred CCeEEEEECCCCCEEEeec
Confidence 444444442 344555443
No 52
>PF02009 Rifin_STEVOR: Rifin/stevor family; InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cell erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. For more information see []. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens [].
Probab=31.51 E-value=5.8 Score=40.49 Aligned_cols=26 Identities=19% Similarity=0.337 Sum_probs=15.8
Q ss_pred ehhHHHHHHHHHHhheeeEEEeeeec
Q 010261 456 ILGGALLILIGVILFGGYKIWTSRRA 481 (514)
Q Consensus 456 ~~~~~~~~li~~~~~~~~~i~~rrk~ 481 (514)
++++++++||.+++++++|+.|+||-
T Consensus 262 iiaIliIVLIMvIIYLILRYRRKKKm 287 (299)
T PF02009_consen 262 IIAILIIVLIMVIIYLILRYRRKKKM 287 (299)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhh
Confidence 33333445555667777777787663
No 53
>PF00157 Pou: Pou domain - N-terminal to homeobox domain; InterPro: IPR000327 POU proteins are eukaryotic transcription factors containing a bipartite DNA binding domain referred to as the POU domain. The acronym POU (pronounced 'pow') is derived from the names of three mammalian transcription factors, the pituitary-specific Pit-1, the octamer-binding proteins Oct-1 and Oct-2, and the neural Unc-86 from Caenorhabditis elegans. POU domain genes have been identified in diverse organisms including nematodes, flies, amphibians, fish and mammals but have not yet been identified in plants and fungi. The various members of the POU family have a wide variety of functions, all of which are related to the function of the neuroendocrine system [] and the development of an organism []. Some other genes are also regulated, including those for immunoglobulin light and heavy chains (Oct-2) [, ], and trophic hormone genes, such as those for prolactin and growth hormone (Pit-1). The POU domain is a bipartite domain composed of two subunits separated by a non-conserved region of 15-55 aa. The N-terminal subunit is known as the POU-specific (POUs) domain (IPR000327 from INTERPRO), while the C-terminal subunit is a homeobox domain (IPR007103 from INTERPRO). 3D structures of complexes including both POU subdomains bound to DNA are available. Both subdomains contain the structural motif 'helix-turn-helix', which directly associates with the two components of bipartite DNA binding sites, and both are required for high affinity sequence-specific DNA-binding. The domain may also be involved in protein-protein interactions []. The subdomains are connected by a flexible linker [, , ]. In proteins a POU-specific domain is always accompanied by a homeodomain. Despite the lack of sequence homology, the 3D structure of POUs is similar to that of bacteriophage lambda repressor and other members of the HTH_3 family [, ].
This entry represents the POU-specific subunit of the POU domain.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0006355 regulation of transcription, DNA-dependent; PDB: 3D1N_O 1AU7_A 3L1P_A 2XSD_C 1O4X_A 1HF0_B 1GT0_C 1POU_A 1CQT_B 1E3O_C ....
Probab=31.23 E-value=10 Score=30.37 Aligned_cols=26 Identities=27% Similarity=0.428 Sum_probs=20.9
Q ss_pred eeccceeeeccchhhhhhhhhhhhhH
Q 010261 6 NFSQSTISLFSNMKKSANSATRTHAI 31 (514)
Q Consensus 6 ~~~~~~~~~~~~~~~~~~~~~~~~~~ 31 (514)
.|||+||+-||+++-+..+|....++
T Consensus 41 ~~SQttI~RFE~L~LS~kn~~klkP~ 66 (75)
T PF00157_consen 41 EFSQTTICRFEALQLSFKNMCKLKPL 66 (75)
T ss_dssp GGSHHHHHHHHTTTSCHHHHHHHHHH
T ss_pred cccchhhhhhHhcccCHHHHHHHHHH
Confidence 58999999999999887777655443
No 54
>PF12458 DUF3686: ATPase involved in DNA repair ; InterPro: IPR020958 This entry represents an N-terminal domain associated with ATPases and some uncharacterised proteins; it is approximately 450 amino acids in length and contains two conserved sequence motifs: DVF and SPNGED.
Probab=27.61 E-value=1.3e+02 Score=32.31 Aligned_cols=58 Identities=12% Similarity=0.072 Sum_probs=37.6
Q ss_pred CCcEEEeCCC-eEEEeeeecCCCCe-EEEEEEcCCCceEEeecCCCCCCCCCcEEEeecCcEEEecCC
Q 010261 74 FQSLLNDTTD-TFSLGFLRVNSNQL-ALAVIHLPSSKPLWLANSTQLAPWSDRIELSFNGSLVISGPH 139 (514)
Q Consensus 74 ~~~~LvS~~g-~F~lGFf~~~~s~~-~i~i~~~~~~tvVWvANR~~Pv~~~~~l~L~~~G~LvL~d~~ 139 (514)
+.+.+.|||| ++-.-||.+....+ .+.|+-|... | ++|+...+ -+|-.||.|++..+.
T Consensus 309 F~r~vrSPNGEDvLYvF~~~~~g~~~Ll~YN~I~k~-v------~tPi~chG-~alf~DG~l~~fra~ 368 (448)
T PF12458_consen 309 FERKVRSPNGEDVLYVFYAREEGRYLLLPYNLIRKE-V------ATPIICHG-YALFEDGRLVYFRAE 368 (448)
T ss_pred EEEEecCCCCceEEEEEEECCCCcEEEEechhhhhh-h------cCCeeccc-eeEecCCEEEEEecC
Confidence 4456789987 67777888876553 3456545432 1 24776543 567788998887655
No 55
>smart00765 MANEC The MANEC domain was formerly called MANSC. This domain, comprising 8 conserved cysteines, is found in the N terminus of higher multicellular animal membrane and extracellular proteins. It is postulated that this domain may play a role in the formation of protein complexes involving various protease activators and inhibitors. It is possible that some of the cysteine residues in the MANSC domain form structurally important disulfide bridges. All of the MANSC-containing proteins contain predicted transmembrane regions and signal peptides. It has been proposed that the MANSC domain in HAI-1 might function through binding with hepatocyte growth factor activator and matriptase.
Probab=27.14 E-value=89 Score=26.20 Aligned_cols=35 Identities=23% Similarity=0.581 Sum_probs=26.5
Q ss_pred CCCHHHHHHHhhccCCeEEEEecC--CCCcceEEEee
Q 010261 383 TSYLEQCEDLCQNNCSCWGALYNN--ASGSGFCYMLD 417 (514)
Q Consensus 383 ~~sl~~C~~~CL~nCSC~Ay~y~~--~~gsG~C~l~~ 417 (514)
..+.++|..+|=+.=+|..+.+.. ..+.+.|||.+
T Consensus 37 ~~s~edC~~aCC~~~~CnlAv~e~~~~~~~~~CyLf~ 73 (93)
T smart00765 37 VNTWEDCVRACCSTPNCNLAVFELRREDAEGNCYLFN 73 (93)
T ss_pred cCCHHHHHHHHcCCCCCcEEEEeccCCCCCCceEEEE
Confidence 357899999999888888887752 22356799875
No 56
>PF07354 Sp38: Zona-pellucida-binding protein (Sp38); InterPro: IPR010857 This family contains a number of zona-pellucida-binding proteins that seem to be restricted to mammals. These are sperm proteins that bind to the 90 kDa family of zona pellucida glycoproteins in a calcium-dependent manner []. These represent some of the specific molecules that mediate the first steps of gamete interaction, allowing fertilisation to occur [].; GO: 0007339 binding of sperm to zona pellucida, 0005576 extracellular region
Probab=27.02 E-value=74 Score=31.80 Aligned_cols=33 Identities=12% Similarity=0.235 Sum_probs=23.1
Q ss_pred CCceEEeecCCCCCCCCCcEEEeecCcEEEecC
Q 010261 106 SSKPLWLANSTQLAPWSDRIELSFNGSLVISGP 138 (514)
Q Consensus 106 ~~tvVWvANR~~Pv~~~~~l~L~~~G~LvL~d~ 138 (514)
+.+..|+--.+.++.+++.+.|+..|.|++.|-
T Consensus 12 DP~y~W~GP~g~~l~gn~~~nIT~TG~L~~~~F 44 (271)
T PF07354_consen 12 DPTYLWTGPNGKPLSGNSYVNITETGKLMFKNF 44 (271)
T ss_pred CCceEEECCCCcccCCCCeEEEccCceEEeecc
Confidence 456677777777777666677777777777654
No 57
>PF01034 Syndecan: Syndecan domain; InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions [, ]. Structurally, these proteins consist of four separate domains: A signal sequence; An extracellular domain (ectodomain) of variable length whose sequence is not evolutionarily conserved in the various forms of syndecans. The ectodomain contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; A transmembrane region; A highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: Syndecan 1. Syndecan 2 or fibroglycan. Syndecan 3 or neuroglycan or N-syndecan. Syndecan 4 or amphiglycan or ryudocan. Drosophila syndecan. Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, and that its overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of the dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts strongly not only with fatty acyl groups but also with the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion [, ].; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=25.59 E-value=18 Score=28.04 Aligned_cols=30 Identities=27% Similarity=0.300 Sum_probs=0.9
Q ss_pred eeEEeehhHHHHHHHHHHhheeeEEEeeeec
Q 010261 451 AAGIGILGGALLILIGVILFGGYKIWTSRRA 481 (514)
Q Consensus 451 ~~~i~~~~~~~~~li~~~~~~~~~i~~rrk~ 481 (514)
+++.|+++++++++ ++++|+.|++.+|...
T Consensus 13 avIaG~Vvgll~ai-lLIlf~iyR~rkkdEG 42 (64)
T PF01034_consen 13 AVIAGGVVGLLFAI-LLILFLIYRMRKKDEG 42 (64)
T ss_dssp -------------------------S-----
T ss_pred HHHHHHHHHHHHHH-HHHHHHHHHHHhcCCC
Confidence 34445555444433 4567888998898763
No 58
>PF12662 cEGF: Complement Clr-like EGF-like
Probab=24.75 E-value=50 Score=20.46 Aligned_cols=9 Identities=33% Similarity=0.623 Sum_probs=7.8
Q ss_pred cccCCCCCC
Q 010261 331 CSCLDNRTD 339 (514)
Q Consensus 331 C~Cl~g~~~ 339 (514)
|+|++||..
T Consensus 4 C~C~~Gy~l 12 (24)
T PF12662_consen 4 CSCPPGYQL 12 (24)
T ss_pred eeCCCCCcC
Confidence 999999874
No 59
>TIGR01477 RIFIN variant surface antigen, rifin family. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits.
Probab=24.01 E-value=16 Score=37.97 Aligned_cols=13 Identities=31% Similarity=0.501 Sum_probs=5.1
Q ss_pred HHHHHHHhheeeE
Q 010261 462 LILIGVILFGGYK 474 (514)
Q Consensus 462 ~~li~~~~~~~~~ 474 (514)
++|+.+++++++|
T Consensus 322 IVLIMvIIYLILR 334 (353)
T TIGR01477 322 IVLIMVIIYLILR 334 (353)
T ss_pred HHHHHHHHHHHHH
Confidence 3333333444444
No 60
>PTZ00046 rifin; Provisional
Probab=22.59 E-value=18 Score=37.65 Aligned_cols=13 Identities=31% Similarity=0.501 Sum_probs=5.1
Q ss_pred HHHHHHHhheeeE
Q 010261 462 LILIGVILFGGYK 474 (514)
Q Consensus 462 ~~li~~~~~~~~~ 474 (514)
++||.+++++++|
T Consensus 327 IVLIMvIIYLILR 339 (358)
T PTZ00046 327 IVLIMVIIYLILR 339 (358)
T ss_pred HHHHHHHHHHHHH
Confidence 3333333444444
No 61
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress [].
Probab=21.94 E-value=70 Score=26.91 Aligned_cols=7 Identities=29% Similarity=0.472 Sum_probs=2.8
Q ss_pred hhhHHHH
Q 010261 28 THAIQFL 34 (514)
Q Consensus 28 ~~~~~~~ 34 (514)
+..+++|
T Consensus 3 SK~~llL 9 (95)
T PF07172_consen 3 SKAFLLL 9 (95)
T ss_pred hhHHHHH
Confidence 3334343
No 62
>PF12947 EGF_3: EGF domain; InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1 [], as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=21.80 E-value=17 Score=24.70 Aligned_cols=22 Identities=18% Similarity=0.608 Sum_probs=14.9
Q ss_pred CCCCCCCCcCCC--CC--cccCCCCC
Q 010261 317 PSPCGSYSLCKQ--SG--CSCLDNRT 338 (514)
Q Consensus 317 ~g~CG~~giC~~--~~--C~Cl~g~~ 338 (514)
.+-|.++..|.. +. |.|.+||.
T Consensus 5 ~~~C~~nA~C~~~~~~~~C~C~~Gy~ 30 (36)
T PF12947_consen 5 NGGCHPNATCTNTGGSYTCTCKPGYE 30 (36)
T ss_dssp GGGS-TTCEEEE-TTSEEEEE-CEEE
T ss_pred CCCCCCCcEeecCCCCEEeECCCCCc
Confidence 357889999976 23 99998864
No 63
>PF09064 Tme5_EGF_like: Thrombomodulin like fifth domain, EGF-like; InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C []. ; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=21.48 E-value=52 Score=22.24 Aligned_cols=8 Identities=25% Similarity=0.758 Sum_probs=6.6
Q ss_pred CcccCCCC
Q 010261 330 GCSCLDNR 337 (514)
Q Consensus 330 ~C~Cl~g~ 337 (514)
+|.||+||
T Consensus 19 ~C~CPeGy 26 (34)
T PF09064_consen 19 QCFCPEGY 26 (34)
T ss_pred ceeCCCce
Confidence 39999987
No 64
>PF08374 Protocadherin: Protocadherin; InterPro: IPR013585 The structure of protocadherins is similar to that of classic cadherins (IPR002126 from INTERPRO), but they also have some unique features associated with the cytoplasmic domains. They are expressed in a variety of organisms and are found in high concentrations in the brain where they seem to be localised mainly at cell-cell contact sites. Their expression seems to be developmentally regulated [].
Probab=21.08 E-value=25 Score=33.94 Aligned_cols=8 Identities=38% Similarity=0.410 Sum_probs=3.0
Q ss_pred eeEEeehh
Q 010261 451 AAGIGILG 458 (514)
Q Consensus 451 ~~~i~~~~ 458 (514)
+++.|+++
T Consensus 42 aiVAG~~t 49 (221)
T PF08374_consen 42 AIVAGIMT 49 (221)
T ss_pred eeecchhh
Confidence 33333333
No 65
>PF12661 hEGF: Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=20.82 E-value=21 Score=18.75 Aligned_cols=8 Identities=38% Similarity=0.895 Sum_probs=5.0
Q ss_pred cccCCCCC
Q 010261 331 CSCLDNRT 338 (514)
Q Consensus 331 C~Cl~g~~ 338 (514)
|.|++||+
T Consensus 2 C~C~~G~~ 9 (13)
T PF12661_consen 2 CQCPPGWT 9 (13)
T ss_dssp EEE-TTEE
T ss_pred ccCcCCCc
Confidence 77888764
No 66
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=20.54 E-value=63 Score=28.65 Aligned_cols=14 Identities=14% Similarity=0.290 Sum_probs=7.1
Q ss_pred HHhheeeEEEeeee
Q 010261 467 VILFGGYKIWTSRR 480 (514)
Q Consensus 467 ~~~~~~~~i~~rrk 480 (514)
+.++.+++..||||
T Consensus 117 ~~~~~~yr~~r~~~ 130 (139)
T PHA03099 117 CCLLSVYRFTRRTK 130 (139)
T ss_pred HHHHhhheeeeccc
Confidence 33455566555554
No 67
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=20.33 E-value=1e+02 Score=39.90 Aligned_cols=26 Identities=27% Similarity=0.782 Sum_probs=18.3
Q ss_pred CCCCCCCCCCCCCcCCC-CC----cccCCCCC
Q 010261 312 DACQLPSPCGSYSLCKQ-SG----CSCLDNRT 338 (514)
Q Consensus 312 ~~Cd~~g~CG~~giC~~-~~----C~Cl~g~~ 338 (514)
++|. -..|---|.|+. +. |.||+-|.
T Consensus 3865 d~C~-~npCqhgG~C~~~~~ggy~CkCpsqys 3895 (4289)
T KOG1219|consen 3865 DPCN-DNPCQHGGTCISQPKGGYKCKCPSQYS 3895 (4289)
T ss_pred cccc-cCcccCCCEecCCCCCceEEeCccccc
Confidence 6663 467777788876 22 99997764
No 68
>cd05852 Ig5_Contactin-1 Fifth Ig domain of contactin-1. Ig5_Contactin-1: fifth Ig domain of the neural cell adhesion molecule contactin-1. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma.
Probab=20.17 E-value=1.2e+02 Score=23.62 Aligned_cols=34 Identities=29% Similarity=0.327 Sum_probs=22.1
Q ss_pred cCCCceEEeecCCCCCCCCCcEEEeecCcEEEecC
Q 010261 104 LPSSKPLWLANSTQLAPWSDRIELSFNGSLVISGP 138 (514)
Q Consensus 104 ~~~~tvVWvANR~~Pv~~~~~l~L~~~G~LvL~d~ 138 (514)
.|..++.|.=|.. ++..+....+..+|.|.|.+.
T Consensus 13 ~P~p~v~W~k~~~-~l~~~~r~~~~~~g~L~I~~v 46 (73)
T cd05852 13 APKPKFSWSKGTE-LLVNNSRISIWDDGSLEILNI 46 (73)
T ss_pred eCCCEEEEEeCCE-ecccCCCEEEcCCCEEEECcC
Confidence 4566889986543 554445666666788877653
Done!