Query         010261
Match_columns 514
No_of_seqs    314 out of 1594
Neff          7.4 
Searched_HMMs 46136
Date          Thu Mar 28 22:27:30 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/010261.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/010261hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF01453 B_lectin:  D-mannose b  99.9 3.9E-25 8.5E-30  192.9   5.9   87  106-194     1-95  (114)
  2 cd00028 B_lectin Bulb-type man  99.9 2.2E-23 4.7E-28  182.6  13.6  105   75-181     7-116 (116)
  3 smart00108 B_lectin Bulb-type   99.9 2.1E-22 4.5E-27  175.9  13.0  104   75-180     7-114 (114)
  4 PF00954 S_locus_glycop:  S-loc  99.8 9.4E-19   2E-23  151.9   9.7  102  228-339     1-108 (110)
  5 PF08276 PAN_2:  PAN-like domai  99.5   1E-14 2.2E-19  114.6   6.2   63  359-423     4-66  (66)
  6 cd01098 PAN_AP_plant Plant PAN  99.4 3.1E-13 6.8E-18  110.8   8.4   80  353-438     3-84  (84)
  7 cd00129 PAN_APPLE PAN/APPLE-li  99.2 1.8E-11   4E-16   99.6   6.1   67  360-436     9-79  (80)
  8 smart00473 PAN_AP divergent su  98.4 8.5E-07 1.9E-11   70.9   8.1   69  365-436     7-77  (78)
  9 smart00108 B_lectin Bulb-type   98.4 1.6E-06 3.6E-11   75.4   9.0   81  126-230    25-111 (114)
 10 cd00028 B_lectin Bulb-type man  98.0 2.8E-05   6E-10   67.9   8.6   81  127-231    26-113 (116)
 11 PF01453 B_lectin:  D-mannose b  97.6 0.00052 1.1E-08   59.9  10.4   96   78-181    14-113 (114)
 12 cd01100 APPLE_Factor_XI_like S  97.6 6.2E-05 1.3E-09   60.1   4.1   34  383-418    24-57  (73)
 13 PF14295 PAN_4:  PAN domain; PD  95.0   0.018   4E-07   41.9   2.5   35  382-416    14-51  (51)
 14 PF00024 PAN_1:  PAN domain Thi  94.3   0.076 1.6E-06   42.1   4.7   36  383-420    22-58  (79)
 15 smart00223 APPLE APPLE domain.  93.6    0.11 2.4E-06   42.2   4.3   48  368-418     7-57  (79)
 16 PF08693 SKG6:  Transmembrane a  89.8     0.2 4.4E-06   35.0   1.6    7  473-479    33-39  (40)
 17 PF04478 Mid2:  Mid2 like cell   86.8    0.47   1E-05   43.1   2.4    8  449-456    51-58  (154)
 18 cd01099 PAN_AP_HGF Subfamily o  86.0     1.6 3.5E-05   35.2   5.0   36  383-420    24-61  (80)
 19 PF08277 PAN_3:  PAN-like domai  83.4     3.6 7.8E-05   32.0   5.8   33  382-418    18-50  (71)
 20 PTZ00382 Variant-specific surf  83.2       1 2.3E-05   37.9   2.8    6  472-477    88-93  (96)
 21 PF01102 Glycophorin_A:  Glycop  83.0    0.11 2.5E-06   45.6  -3.2   29  453-481    66-94  (122)
 22 PF15102 TMEM154:  TMEM154 prot  81.7     1.7 3.8E-05   39.2   3.7   12  471-482    77-88  (146)
 23 PF01034 Syndecan:  Syndecan do  80.7     0.5 1.1E-05   36.5  -0.1   15  466-480    24-38  (64)
 24 smart00605 CW CW domain.        78.5     7.8 0.00017   32.2   6.5   55  382-439    20-76  (94)
 25 PF01299 Lamp:  Lysosome-associ  68.8     2.4 5.1E-05   43.4   1.3   27  456-482   275-301 (306)
 26 PF01683 EB:  EB module;  Inter  65.2     5.4 0.00012   29.2   2.2   32  308-339    16-47  (52)
 27 TIGR01478 STEVOR variant surfa  59.2     1.8 3.8E-05   43.3  -1.7   11  344-356   170-180 (295)
 28 PTZ00370 STEVOR; Provisional    57.5       2 4.2E-05   43.1  -1.7   11  344-356   170-180 (296)
 29 PF13360 PQQ_2:  PQQ-like domai  56.5      87  0.0019   29.6   9.7   78   99-177    48-148 (238)
 30 cd00053 EGF Epidermal growth f  55.8     9.9 0.00021   24.4   2.1   26  314-339     2-31  (36)
 31 PF06024 DUF912:  Nucleopolyhed  54.8     9.8 0.00021   32.3   2.3   14  469-482    80-93  (101)
 32 PF13908 Shisa:  Wnt and FGF in  54.4      10 0.00022   35.5   2.7   12  452-463    80-91  (179)
 33 PF04478 Mid2:  Mid2 like cell   54.3      15 0.00032   33.6   3.5   25  451-475    49-73  (154)
 34 PF07645 EGF_CA:  Calcium-bindi  53.5       4 8.7E-05   28.6  -0.2   27  312-338     3-34  (42)
 35 PF06365 CD34_antigen:  CD34/Po  51.7     7.4 0.00016   37.3   1.2   29  452-480   101-129 (202)
 36 PF02009 Rifin_STEVOR:  Rifin/s  50.9     2.6 5.5E-05   43.0  -2.1   10  471-480   274-283 (299)
 37 PF03302 VSP:  Giardia variant-  50.2     8.9 0.00019   40.8   1.7   30  451-480   367-396 (397)
 38 KOG4649 PQQ (pyrrolo-quinoline  49.1      52  0.0011   33.0   6.6   44  106-149   168-217 (354)
 39 smart00179 EGF_CA Calcium-bind  47.7      15 0.00032   24.4   2.0   27  312-338     3-33  (39)
 40 PF13360 PQQ_2:  PQQ-like domai  47.5 1.1E+02  0.0024   28.9   8.8   73  104-177    10-102 (238)
 41 PHA03265 envelope glycoprotein  47.3     8.1 0.00018   39.7   0.8   30  449-482   349-378 (402)
 42 PF15102 TMEM154:  TMEM154 prot  47.1      16 0.00034   33.2   2.5   29  453-481    62-90  (146)
 43 PF07974 EGF_2:  EGF-like domai  46.4      13 0.00029   24.6   1.5   21  318-338     6-28  (32)
 44 PRK11138 outer membrane biogen  44.1      98  0.0021   32.4   8.5   48  129-177   128-186 (394)
 45 TIGR03300 assembly_YfgL outer   43.5      67  0.0014   33.3   7.0   47  129-176    73-130 (377)
 46 cd05845 Ig2_L1-CAM_like Second  41.7      34 0.00073   28.7   3.6   34  104-137    31-64  (95)
 47 PF01436 NHL:  NHL repeat;  Int  39.4      42 0.00091   21.2   3.0   20  125-144     7-26  (28)
 48 TIGR03300 assembly_YfgL outer   38.3 1.8E+02  0.0039   30.1   9.3   71  105-176    83-170 (377)
 49 cd00054 EGF_CA Calcium-binding  37.9      26 0.00055   22.7   1.9   27  312-338     3-33  (38)
 50 PF12877 DUF3827:  Domain of un  37.0      15 0.00032   41.0   0.8   30  450-480   269-298 (684)
 51 PRK11138 outer membrane biogen  33.5 2.3E+02   0.005   29.6   9.3   18  130-147   265-283 (394)
 52 PF02009 Rifin_STEVOR:  Rifin/s  31.5     5.8 0.00012   40.5  -3.1   26  456-481   262-287 (299)
 53 PF00157 Pou:  Pou domain - N-t  31.2      10 0.00023   30.4  -1.0   26    6-31     41-66  (75)
 54 PF12458 DUF3686:  ATPase invol  27.6 1.3E+02  0.0027   32.3   5.7   58   74-139   309-368 (448)
 55 smart00765 MANEC The MANEC dom  27.1      89  0.0019   26.2   3.7   35  383-417    37-73  (93)
 56 PF07354 Sp38:  Zona-pellucida-  27.0      74  0.0016   31.8   3.8   33  106-138    12-44  (271)
 57 PF01034 Syndecan:  Syndecan do  25.6      18 0.00039   28.0  -0.6   30  451-481    13-42  (64)
 58 PF12662 cEGF:  Complement Clr-  24.8      50  0.0011   20.5   1.4    9  331-339     4-12  (24)
 59 TIGR01477 RIFIN variant surfac  24.0      16 0.00035   38.0  -1.5   13  462-474   322-334 (353)
 60 PTZ00046 rifin; Provisional     22.6      18  0.0004   37.7  -1.4   13  462-474   327-339 (358)
 61 PF07172 GRP:  Glycine rich pro  21.9      70  0.0015   26.9   2.2    7   28-34      3-9   (95)
 62 PF12947 EGF_3:  EGF domain;  I  21.8      17 0.00038   24.7  -1.2   22  317-338     5-30  (36)
 63 PF09064 Tme5_EGF_like:  Thromb  21.5      52  0.0011   22.2   1.0    8  330-337    19-26  (34)
 64 PF08374 Protocadherin:  Protoc  21.1      25 0.00054   33.9  -0.7    8  451-458    42-49  (221)
 65 PF12661 hEGF:  Human growth fa  20.8      21 0.00045   18.7  -0.7    8  331-338     2-9   (13)
 66 PHA03099 epidermal growth fact  20.5      63  0.0014   28.7   1.7   14  467-480   117-130 (139)
 67 KOG1219 Uncharacterized conser  20.3   1E+02  0.0023   39.9   3.9   26  312-338  3865-3895(4289)
 68 cd05852 Ig5_Contactin-1 Fifth   20.2 1.2E+02  0.0025   23.6   3.1   34  104-138    13-46  (73)

No 1  
>PF01453 B_lectin:  D-mannose binding lectin;  InterPro: IPR001480 A bulb lectin super-family (Amaryllidaceae, Orchidaceae and Alliaceae) contains a ~115-residue-long domain whose overall three-dimensional fold is very similar to that of Dictyostelium discoideum comitin, an actin-binding protein, and Curculigo latifolia curculin, a sweet-tasting and taste-modifying protein. This domain generally binds mannose, but in at least one protein, curculin, it is apparently devoid of mannose-binding activity. Each bulb-type lectin domain consists of three sequential beta-sheet subdomains (I, II, III) that are inter-related by pseudo three-fold symmetry. The three subdomains are flat four-stranded, antiparallel beta-sheets. Together they form a 12-stranded beta-barrel in which the barrel axis coincides with the pseudo 3-fold axis.; GO: 0005529 sugar binding; PDB: 3M7H_A 3M7J_B 3MEZ_D 1DLP_A 1BWU_D 1KJ1_A 1B2P_A 1XD6_A 2DPF_C 2D04_B ....
Probab=99.91  E-value=3.9e-25  Score=192.90  Aligned_cols=87  Identities=30%  Similarity=0.547  Sum_probs=64.0

Q ss_pred             CCceEEeecCCCCCC---CCCcEEEeecCcEEEecCCCceEEee-cCCC----CcEEEeecCCCEEEEecCCCCeeeeee
Q 010261          106 SSKPLWLANSTQLAP---WSDRIELSFNGSLVISGPHSRVFWST-TRAE----GQRVVILNTSNLQIQKLDDPLSVVWQS  177 (514)
Q Consensus       106 ~~tvVWvANR~~Pv~---~~~~l~L~~~G~LvL~d~~g~~vWst-~~s~----~~~a~LldsGNLVL~~~~~~~~~lWQS  177 (514)
                      ++||||+|||++|+.   ...+|.|+.||+|+|.|..++++|++ ++..    +..|+|+|+|||||++. .+ .+||||
T Consensus         1 ~~tvvW~an~~~p~~~~s~~~~L~l~~dGnLvl~~~~~~~iWss~~t~~~~~~~~~~~L~~~GNlvl~d~-~~-~~lW~S   78 (114)
T PF01453_consen    1 PRTVVWVANRNSPLTSSSGNYTLILQSDGNLVLYDSNGSVIWSSNNTSGRGNSGCYLVLQDDGNLVLYDS-SG-NVLWQS   78 (114)
T ss_dssp             ---------TTEEEEECETTEEEEEETTSEEEEEETTTEEEEE--S-TTSS-SSEEEEEETTSEEEEEET-TS-EEEEES
T ss_pred             CcccccccccccccccccccccceECCCCeEEEEcCCCCEEEEecccCCccccCeEEEEeCCCCEEEEee-cc-eEEEee
Confidence            369999999999994   23679999999999999999999999 5442    36899999999999994 65 999999


Q ss_pred             cCCCCCcccCCCcccCC
Q 010261          178 FDFPTDTLVENQNFTST  194 (514)
Q Consensus       178 FD~PTDTLLPgq~L~~~  194 (514)
                      |||||||+||+|+|+.+
T Consensus        79 f~~ptdt~L~~q~l~~~   95 (114)
T PF01453_consen   79 FDYPTDTLLPGQKLGDG   95 (114)
T ss_dssp             TTSSS-EEEEEET--TS
T ss_pred             cCCCccEEEeccCcccC
Confidence            99999999999999853


No 2  
>cd00028 B_lectin Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses.
Probab=99.90  E-value=2.2e-23  Score=182.64  Aligned_cols=105  Identities=28%  Similarity=0.464  Sum_probs=90.7

Q ss_pred             CcEEEeCCCeEEEeeeecCCCC-eEEEEEEcCC-CceEEeecCCCCCCCCCcEEEeecCcEEEecCCCceEEeecCCC--
Q 010261           75 QSLLNDTTDTFSLGFLRVNSNQ-LALAVIHLPS-SKPLWLANSTQLAPWSDRIELSFNGSLVISGPHSRVFWSTTRAE--  150 (514)
Q Consensus        75 ~~~LvS~~g~F~lGFf~~~~s~-~~i~i~~~~~-~tvVWvANR~~Pv~~~~~l~L~~~G~LvL~d~~g~~vWst~~s~--  150 (514)
                      +++|+|+++.|++|||.+.... .+..|++... .++||+|||+.|......+.|+.||+|+|.|.+|.+||++++..  
T Consensus         7 ~~~l~s~~~~f~~G~~~~~~q~~dgnlv~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S~~~~~~   86 (116)
T cd00028           7 GQTLVSSGSLFELGFFKLIMQSRDYNLILYKGSSRTVVWVANRDNPSGSSCTLTLQSDGNLVIYDGSGTVVWSSNTTRVN   86 (116)
T ss_pred             CCEEEeCCCcEEEecccCCCCCCeEEEEEEeCCCCeEEEECCCCCCCCCCEEEEEecCCCeEEEcCCCcEEEEecccCCC
Confidence            3789999999999999988765 7776665432 68999999999966667899999999999999999999999864  


Q ss_pred             -CcEEEeecCCCEEEEecCCCCeeeeeecCCC
Q 010261          151 -GQRVVILNTSNLQIQKLDDPLSVVWQSFDFP  181 (514)
Q Consensus       151 -~~~a~LldsGNLVL~~~~~~~~~lWQSFD~P  181 (514)
                       ..+++|+|+|||||++. ++ .+||||||||
T Consensus        87 ~~~~~~L~ddGnlvl~~~-~~-~~~W~Sf~~P  116 (116)
T cd00028          87 GNYVLVLLDDGNLVLYDS-DG-NFLWQSFDYP  116 (116)
T ss_pred             CceEEEEeCCCCEEEECC-CC-CEEEcCCCCC
Confidence             26789999999999995 66 8999999999


No 3  
>smart00108 B_lectin Bulb-type mannose-specific lectin.
Probab=99.88  E-value=2.1e-22  Score=175.87  Aligned_cols=104  Identities=29%  Similarity=0.473  Sum_probs=89.0

Q ss_pred             CcEEEeCCCeEEEeeeecCCCCeEEEEEEcCC-CceEEeecCCCCCCCCCcEEEeecCcEEEecCCCceEEeecCC-C-C
Q 010261           75 QSLLNDTTDTFSLGFLRVNSNQLALAVIHLPS-SKPLWLANSTQLAPWSDRIELSFNGSLVISGPHSRVFWSTTRA-E-G  151 (514)
Q Consensus        75 ~~~LvS~~g~F~lGFf~~~~s~~~i~i~~~~~-~tvVWvANR~~Pv~~~~~l~L~~~G~LvL~d~~g~~vWst~~s-~-~  151 (514)
                      ++.|+|+++.|++|||.+.....+..|++... .++||+|||+.|+..+..+.|+.||+|+|.|.+|.+||++++. . +
T Consensus         7 ~~~l~s~~~~f~~G~~~~~~q~dgnlV~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~~g~~vW~S~t~~~~~   86 (114)
T smart00108        7 GQTLVSGNSLFELGFFTLIMQNDYNLILYKSSSRTVVWVANRDNPVSDSCTLTLQSDGNLVLYDGDGRVVWSSNTTGANG   86 (114)
T ss_pred             CCEEecCCCcEeeeccccCCCCCEEEEEEECCCCcEEEECCCCCCCCCCEEEEEeCCCCEEEEeCCCCEEEEecccCCCC
Confidence            37899999999999999876666666665332 6899999999998776789999999999999999999999986 2 2


Q ss_pred             -cEEEeecCCCEEEEecCCCCeeeeeecCC
Q 010261          152 -QRVVILNTSNLQIQKLDDPLSVVWQSFDF  180 (514)
Q Consensus       152 -~~a~LldsGNLVL~~~~~~~~~lWQSFD~  180 (514)
                       .+++|+|+|||||++. ++ .+|||||||
T Consensus        87 ~~~~~L~ddGnlvl~~~-~~-~~~W~Sf~~  114 (114)
T smart00108       87 NYVLVLLDDGNLVIYDS-DG-NFLWQSFDY  114 (114)
T ss_pred             ceEEEEeCCCCEEEECC-CC-CEEeCCCCC
Confidence             5799999999999984 66 899999997


No 4  
>PF00954 S_locus_glycop:  S-locus glycoprotein family;  InterPro: IPR000858 In Brassicaceae, self-incompatible plants have a self/non-self recognition system, which involves the inability of flowering plants to achieve self-fertilisation. This is sporophytically controlled by multiple alleles at a single locus (S). There are a total of 50 different S alleles in Brassica oleracea. S-locus glycoproteins, as well as S-receptor kinases, are in linkage with the S-alleles. Most of the proteins within this family contain an apple-like domain (IPR003609 from INTERPRO), which is predicted to possess protein- and/or carbohydrate-binding functions.; GO: 0048544 recognition of pollen
Probab=99.77  E-value=9.4e-19  Score=151.86  Aligned_cols=102  Identities=25%  Similarity=0.504  Sum_probs=76.6

Q ss_pred             EEecccCCceeeeeccccEEEEECCCcceEEEeeCCCcceeeEEEEEe-eeCCCcEEEEEEeeCCceEEEEe--eCCceE
Q 010261          228 WRHRALEAKADIVEGKGPIYVRVNSDGFLGTYQVGNNVPVDVEAFNNF-QRNSSGLLTLRLEQDGNLKGHYW--DGTNWV  304 (514)
Q Consensus       228 W~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~rl~Ld~dG~lr~y~w--~~~~W~  304 (514)
                      ||+|+|++.  .++ +.|.+... . -+...+..     ++.+.+++| ..+.+.++|++||+||++++|.|  +.++|.
T Consensus         1 wrsG~WnG~--~f~-g~p~~~~~-~-~~~~~fv~-----~~~e~~~t~~~~~~s~~~r~~ld~~G~l~~~~w~~~~~~W~   70 (110)
T PF00954_consen    1 WRSGPWNGQ--RFS-GIPEMSSN-S-LYNYSFVS-----NNEEVYYTYSLSNSSVLSRLVLDSDGQLQRYIWNESTQSWS   70 (110)
T ss_pred             CCccccCCe--EEC-Cccccccc-c-eeEEEEEE-----CCCeEEEEEecCCCceEEEEEEeeeeEEEEEEEecCCCcEE
Confidence            788999853  354 35543211 1 02122222     245677777 34567899999999999999999  468999


Q ss_pred             EEeeeccCCCCCCCCCCCCCcCCCCC---cccCCCCCC
Q 010261          305 LNYQAISDACQLPSPCGSYSLCKQSG---CSCLDNRTD  339 (514)
Q Consensus       305 ~~~~~p~~~Cd~~g~CG~~giC~~~~---C~Cl~g~~~  339 (514)
                      +.|++|.|+||+|+.||+||+|+.+.   |+||+||++
T Consensus        71 ~~~~~p~d~Cd~y~~CG~~g~C~~~~~~~C~Cl~GF~P  108 (110)
T PF00954_consen   71 VFWSAPKDQCDVYGFCGPNGICNSNNSPKCSCLPGFEP  108 (110)
T ss_pred             EEEEecccCCCCccccCCccEeCCCCCCceECCCCcCC
Confidence            99999999999999999999998743   999999975


No 5  
>PF08276 PAN_2:  PAN-like domain;  InterPro: IPR013227 PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. These domains contain a hairpin-loop-like structure, similar to knottins, but the pattern of disulphide bonds differs.
Probab=99.54  E-value=1e-14  Score=114.61  Aligned_cols=63  Identities=32%  Similarity=0.540  Sum_probs=51.0

Q ss_pred             ccceEEEEEecccCCCcceeeeccCCCHHHHHHHhhccCCeEEEEecCCCCcceEEEeeccccce
Q 010261          359 KSRFRVLRRKGVELPFKELIRYEMTSYLEQCEDLCQNNCSCWGALYNNASGSGFCYMLDYPIQTL  423 (514)
Q Consensus       359 ~d~F~~~~~~~v~~P~~~~~~~~~~~sl~~C~~~CL~nCSC~Ay~y~~~~gsG~C~l~~~~L~~~  423 (514)
                      +|+|  +++++|++|+++...++.+.++++||++||+||||+||+|.+..+++.|++|.++|.++
T Consensus         4 ~d~F--~~l~~~~~p~~~~~~~~~~~s~~~C~~~Cl~nCsC~Ayay~~~~~~~~C~lW~~~L~d~   66 (66)
T PF08276_consen    4 GDGF--LKLPNMKLPDFDNAIVDSSVSLEECEKACLSNCSCTAYAYSNLSGGGGCLLWYGDLVDL   66 (66)
T ss_pred             CCEE--EEECCeeCCCCcceeeecCCCHHHHHhhcCCCCCEeeEEeeccCCCCEEEEEcCEeecC
Confidence            4778  56789999999666555679999999999999999999998532245699999888763


No 6  
>cd01098 PAN_AP_plant Plant PAN/APPLE-like domain; present in plant S-receptor protein kinases and secreted glycoproteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. S-receptor protein kinases and S-locus glycoproteins are involved in sporophytic self-incompatibility response in Brassica, one of probably many molecular mechanisms, by which hermaphrodite flowering plants avoid self-fertilization.
Probab=99.44  E-value=3.1e-13  Score=110.75  Aligned_cols=80  Identities=30%  Similarity=0.503  Sum_probs=60.3

Q ss_pred             CCCCCCc--cceEEEEEecccCCCcceeeeccCCCHHHHHHHhhccCCeEEEEecCCCCcceEEEeeccccceeecCCCC
Q 010261          353 DFCSEDK--SRFRVLRRKGVELPFKELIRYEMTSYLEQCEDLCQNNCSCWGALYNNASGSGFCYMLDYPIQTLLGAGDVS  430 (514)
Q Consensus       353 ~~C~~~~--d~F~~~~~~~v~~P~~~~~~~~~~~sl~~C~~~CL~nCSC~Ay~y~~~~gsG~C~l~~~~L~~~~~~~~~~  430 (514)
                      +.|+.+.  +.|  +++.++++|+.... . .+.++++|+++||+||+|+||+|.++  +|.|++|..++.+.+.....+
T Consensus         3 ~~C~~~~~~~~f--~~~~~~~~~~~~~~-~-~~~s~~~C~~~Cl~nCsC~a~~~~~~--~~~C~~~~~~~~~~~~~~~~~   76 (84)
T cd01098           3 LNCGGDGSTDGF--LKLPDVKLPDNASA-I-TAISLEECREACLSNCSCTAYAYNNG--SGGCLLWNGLLNNLRSLSSGG   76 (84)
T ss_pred             cccCCCCCCCEE--EEeCCeeCCCchhh-h-ccCCHHHHHHHHhcCCCcceeeecCC--CCeEEEEeceecceEeecCCC
Confidence            3575432  355  56679999987543 2 67899999999999999999999875  567999998888765433344


Q ss_pred             eeEEEEEe
Q 010261          431 KLGYFKLR  438 (514)
Q Consensus       431 ~~~yIKv~  438 (514)
                      ..+||||+
T Consensus        77 ~~~yiKv~   84 (84)
T cd01098          77 GTLYLRLA   84 (84)
T ss_pred             cEEEEEeC
Confidence            67899985


No 7  
>cd00129 PAN_APPLE PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins,  plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions.
Probab=99.22  E-value=1.8e-11  Score=99.60  Aligned_cols=67  Identities=12%  Similarity=0.224  Sum_probs=53.2

Q ss_pred             cceEEEEEecccCCCcceeeeccCCCHHHHHHHhhc---cCCeEEEEecCCCCcceEEEeeccc-cceeecCCCCeeEEE
Q 010261          360 SRFRVLRRKGVELPFKELIRYEMTSYLEQCEDLCQN---NCSCWGALYNNASGSGFCYMLDYPI-QTLLGAGDVSKLGYF  435 (514)
Q Consensus       360 d~F~~~~~~~v~~P~~~~~~~~~~~sl~~C~~~CL~---nCSC~Ay~y~~~~gsG~C~l~~~~L-~~~~~~~~~~~~~yI  435 (514)
                      ..|  +++.+|++|++..      .+++||+++|++   ||||+||+|.+. +.| |++|.+++ .+++...+++.++|+
T Consensus         9 g~f--l~~~~~klpd~~~------~s~~eC~~~Cl~~~~nCsC~Aya~~~~-~~g-C~~W~~~l~~d~~~~~~~g~~Ly~   78 (80)
T cd00129           9 GTT--LIKIALKIKTTKA------NTADECANRCEKNGLPFSCKAFVFAKA-RKQ-CLWFPFNSMSGVRKEFSHGFDLYE   78 (80)
T ss_pred             CeE--EEeecccCCcccc------cCHHHHHHHHhcCCCCCCceeeeccCC-CCC-eEEecCcchhhHHhccCCCceeEe
Confidence            456  4567899998743      689999999999   999999999754 134 99999999 777655555677899


Q ss_pred             E
Q 010261          436 K  436 (514)
Q Consensus       436 K  436 (514)
                      |
T Consensus        79 r   79 (80)
T cd00129          79 N   79 (80)
T ss_pred             E
Confidence            8


No 8  
>smart00473 PAN_AP divergent subfamily of APPLE domains. Apple-like domains present in Plasminogen, C. elegans hypothetical ORFs and the extracellular portion of plant receptor-like protein kinases. Predicted to possess protein- and/or carbohydrate-binding functions.
Probab=98.45  E-value=8.5e-07  Score=70.88  Aligned_cols=69  Identities=28%  Similarity=0.335  Sum_probs=48.6

Q ss_pred             EEEecccCCCcceeeeccCCCHHHHHHHhhc-cCCeEEEEecCCCCcceEEEee-ccccceeecCCCCeeEEEE
Q 010261          365 LRRKGVELPFKELIRYEMTSYLEQCEDLCQN-NCSCWGALYNNASGSGFCYMLD-YPIQTLLGAGDVSKLGYFK  436 (514)
Q Consensus       365 ~~~~~v~~P~~~~~~~~~~~sl~~C~~~CL~-nCSC~Ay~y~~~~gsG~C~l~~-~~L~~~~~~~~~~~~~yIK  436 (514)
                      +++.++++|+..... ....++++|++.|++ +|+|.||.|..+  ++.|++|. .++.+.......+...|.|
T Consensus         7 ~~~~~~~l~~~~~~~-~~~~s~~~C~~~C~~~~~~C~s~~y~~~--~~~C~l~~~~~~~~~~~~~~~~~~~y~~   77 (78)
T smart00473        7 VRLPNTKLPGFSRIV-ISVASLEECASKCLNSNCSCRSFTYNNG--TKGCLLWSESSLGDARLFPSGGVDLYEK   77 (78)
T ss_pred             EEecCccCCCCccee-EcCCCHHHHHHHhCCCCCceEEEEEcCC--CCEEEEeeCCccccceecccCCceeEEe
Confidence            456788888553322 346799999999999 999999999863  45799998 6666654223333445554


No 9  
>smart00108 B_lectin Bulb-type mannose-specific lectin.
Probab=98.38  E-value=1.6e-06  Score=75.44  Aligned_cols=81  Identities=22%  Similarity=0.402  Sum_probs=58.7

Q ss_pred             EEeecCcEEEecCC-CceEEeecCCC----CcEEEeecCCCEEEEecCCCCeeeeeecCCCCCcccCCCcccCCceEEee
Q 010261          126 ELSFNGSLVISGPH-SRVFWSTTRAE----GQRVVILNTSNLQIQKLDDPLSVVWQSFDFPTDTLVENQNFTSTMSLVSS  200 (514)
Q Consensus       126 ~L~~~G~LvL~d~~-g~~vWst~~s~----~~~a~LldsGNLVL~~~~~~~~~lWQSFD~PTDTLLPgq~L~~~~~L~Ss  200 (514)
                      ....||+||+.+.. +.+||++++..    +..+.|.++|||||++. ++ .++|+|=   |+   ++            
T Consensus        25 ~~q~dgnlV~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~-~g-~~vW~S~---t~---~~------------   84 (114)
T smart00108       25 IMQNDYNLILYKSSSRTVVWVANRDNPVSDSCTLTLQSDGNLVLYDG-DG-RVVWSSN---TT---GA------------   84 (114)
T ss_pred             CCCCCEEEEEEECCCCcEEEECCCCCCCCCCEEEEEeCCCCEEEEeC-CC-CEEEEec---cc---CC------------
Confidence            34568999998765 57999999853    24788999999999984 56 8999982   11   22            


Q ss_pred             cceeEEEecCC-ceeeEEEecCCccceEEEe
Q 010261          201 NGLYSMRLGSN-FIGLYAKFNDKSEQIYWRH  230 (514)
Q Consensus       201 ~G~y~l~~~~~-~~~l~~~~~~~~~~~YW~~  230 (514)
                      .+.|.+.++++ +++++-    ...++.|.+
T Consensus        85 ~~~~~~~L~ddGnlvl~~----~~~~~~W~S  111 (114)
T smart00108       85 NGNYVLVLLDDGNLVIYD----SDGNFLWQS  111 (114)
T ss_pred             CCceEEEEeCCCCEEEEC----CCCCEEeCC
Confidence            34578889888 888862    113577865


No 10 
>cd00028 B_lectin Bulb-type mannose-specific lectin. The domain contains a three-fold internal repeat (beta-prism architecture). The consensus sequence motif QXDXNXVXY is involved in alpha-D-mannose recognition. Lectins are carbohydrate-binding proteins which specifically recognize diverse carbohydrates and mediate a wide variety of biological processes, such as cell-cell and host-pathogen interactions, serum glycoprotein turnover, and innate immune responses.
Probab=98.01  E-value=2.8e-05  Score=67.93  Aligned_cols=81  Identities=23%  Similarity=0.387  Sum_probs=58.3

Q ss_pred             Eee-cCcEEEecCC-CceEEeecCCC----CcEEEeecCCCEEEEecCCCCeeeeeecCCCCCcccCCCcccCCceEEee
Q 010261          127 LSF-NGSLVISGPH-SRVFWSTTRAE----GQRVVILNTSNLQIQKLDDPLSVVWQSFDFPTDTLVENQNFTSTMSLVSS  200 (514)
Q Consensus       127 L~~-~G~LvL~d~~-g~~vWst~~s~----~~~a~LldsGNLVL~~~~~~~~~lWQSFD~PTDTLLPgq~L~~~~~L~Ss  200 (514)
                      ... ||+||+.+.. +.+||++|+..    ...+.|.++|||||.+. ++ .++|||=-..                  .
T Consensus        26 ~q~~dgnlv~~~~~~~~~vW~snt~~~~~~~~~l~l~~dGnLvl~~~-~g-~~vW~S~~~~------------------~   85 (116)
T cd00028          26 MQSRDYNLILYKGSSRTVVWVANRDNPSGSSCTLTLQSDGNLVIYDG-SG-TVVWSSNTTR------------------V   85 (116)
T ss_pred             CCCCeEEEEEEeCCCCeEEEECCCCCCCCCCEEEEEecCCCeEEEcC-CC-cEEEEecccC------------------C
Confidence            344 8999998754 57999999753    25788999999999984 66 8999975321                  0


Q ss_pred             cceeEEEecCC-ceeeEEEecCCccceEEEec
Q 010261          201 NGLYSMRLGSN-FIGLYAKFNDKSEQIYWRHR  231 (514)
Q Consensus       201 ~G~y~l~~~~~-~~~l~~~~~~~~~~~YW~~~  231 (514)
                      .+.+.+.++++ +++++-.  +  ..+.|.+.
T Consensus        86 ~~~~~~~L~ddGnlvl~~~--~--~~~~W~Sf  113 (116)
T cd00028          86 NGNYVLVLLDDGNLVLYDS--D--GNFLWQSF  113 (116)
T ss_pred             CCceEEEEeCCCCEEEECC--C--CCEEEcCC
Confidence            24577888888 8888632  1  35678763


No 11 
>PF01453 B_lectin:  D-mannose binding lectin;  InterPro: IPR001480 A bulb lectin super-family (Amaryllidaceae, Orchidaceae and Alliaceae) contains a ~115-residue-long domain whose overall three-dimensional fold is very similar to that of Dictyostelium discoideum comitin, an actin-binding protein, and Curculigo latifolia curculin, a sweet-tasting and taste-modifying protein. This domain generally binds mannose, but in at least one protein, curculin, it is apparently devoid of mannose-binding activity. Each bulb-type lectin domain consists of three sequential beta-sheet subdomains (I, II, III) that are inter-related by pseudo three-fold symmetry. The three subdomains are flat four-stranded, antiparallel beta-sheets. Together they form a 12-stranded beta-barrel in which the barrel axis coincides with the pseudo 3-fold axis.; GO: 0005529 sugar binding; PDB: 3M7H_A 3M7J_B 3MEZ_D 1DLP_A 1BWU_D 1KJ1_A 1B2P_A 1XD6_A 2DPF_C 2D04_B ....
Probab=97.63  E-value=0.00052  Score=59.86  Aligned_cols=96  Identities=18%  Similarity=0.283  Sum_probs=61.0

Q ss_pred             EEeCCCeEEEeeeecCCCCeEEEEEEcCCCceEEee-cCCCCCCCCCcEEEeecCcEEEecCCCceEEeecCCCC-cEEE
Q 010261           78 LNDTTDTFSLGFLRVNSNQLALAVIHLPSSKPLWLA-NSTQLAPWSDRIELSFNGSLVISGPHSRVFWSTTRAEG-QRVV  155 (514)
Q Consensus        78 LvS~~g~F~lGFf~~~~s~~~i~i~~~~~~tvVWvA-NR~~Pv~~~~~l~L~~~G~LvL~d~~g~~vWst~~s~~-~~a~  155 (514)
                      +.+.+|.+.|-|-..++    |.+.. ...++||.. +...+......+.|..||||||.|..+.++|++..... ..+.
T Consensus        14 ~~~~s~~~~L~l~~dGn----Lvl~~-~~~~~iWss~~t~~~~~~~~~~~L~~~GNlvl~d~~~~~lW~Sf~~ptdt~L~   88 (114)
T PF01453_consen   14 LTSSSGNYTLILQSDGN----LVLYD-SNGSVIWSSNNTSGRGNSGCYLVLQDDGNLVLYDSSGNVLWQSFDYPTDTLLP   88 (114)
T ss_dssp             EEECETTEEEEEETTSE----EEEEE-TTTEEEEE--S-TTSS-SSEEEEEETTSEEEEEETTSEEEEESTTSSS-EEEE
T ss_pred             cccccccccceECCCCe----EEEEc-CCCCEEEEecccCCccccCeEEEEeCCCCEEEEeecceEEEeecCCCccEEEe
Confidence            44444777777655443    33332 335679999 43433323457899999999999999999999954322 4566


Q ss_pred             eec--CCCEEEEecCCCCeeeeeecCCC
Q 010261          156 ILN--TSNLQIQKLDDPLSVVWQSFDFP  181 (514)
Q Consensus       156 Lld--sGNLVL~~~~~~~~~lWQSFD~P  181 (514)
                      +++  .||++ ... .. .+.|.|=+.|
T Consensus        89 ~q~l~~~~~~-~~~-~~-~~sw~s~~dp  113 (114)
T PF01453_consen   89 GQKLGDGNVT-GKN-DS-LTSWSSNTDP  113 (114)
T ss_dssp             EET--TSEEE-EES-TS-SEEEESS---
T ss_pred             ccCcccCCCc-ccc-ce-EEeECCCCCC
Confidence            777  88988 542 33 7889887666


No 12 
>cd01100 APPLE_Factor_XI_like Subfamily of PAN/APPLE-like domains; present in plasma prekallikrein/coagulation factor XI, microneme antigen proteins, and a few prokaryotic proteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions.
Probab=97.62  E-value=6.2e-05  Score=60.12  Aligned_cols=34  Identities=29%  Similarity=0.603  Sum_probs=30.0

Q ss_pred             CCCHHHHHHHhhccCCeEEEEecCCCCcceEEEeec
Q 010261          383 TSYLEQCEDLCQNNCSCWGALYNNASGSGFCYMLDY  418 (514)
Q Consensus       383 ~~sl~~C~~~CL~nCSC~Ay~y~~~~gsG~C~l~~~  418 (514)
                      ..+.++|++.|+.+|+|.||.|...  .+.|++|..
T Consensus        24 ~~s~~~Cq~~C~~~~~C~afT~~~~--~~~C~lk~~   57 (73)
T cd01100          24 ASSAEQCQAACTADPGCLAFTYNTK--SKKCFLKSS   57 (73)
T ss_pred             cCCHHHHHHHcCCCCCceEEEEECC--CCeEEcccC
Confidence            4689999999999999999999865  578999875


No 13 
>PF14295 PAN_4:  PAN domain; PDB: 2YIL_E 2YIP_C 2YIO_A.
Probab=95.03  E-value=0.018  Score=41.86  Aligned_cols=35  Identities=31%  Similarity=0.645  Sum_probs=18.3

Q ss_pred             cCCCHHHHHHHhhccCCeEEEEecCC---CCcceEEEe
Q 010261          382 MTSYLEQCEDLCQNNCSCWGALYNNA---SGSGFCYML  416 (514)
Q Consensus       382 ~~~sl~~C~~~CL~nCSC~Ay~y~~~---~gsG~C~l~  416 (514)
                      ...+.++|.++|.++=.|.++.|...   ++.+.|+||
T Consensus        14 ~~~s~~~C~~~C~~~~~C~~~~~~~~~~~~~~~~C~LK   51 (51)
T PF14295_consen   14 TASSPEECQAACAADPGCQAFTFNPPGCPSSSGRCYLK   51 (51)
T ss_dssp             ----HHHHHHHHHTSTT--EEEEETTEE----------
T ss_pred             cCCCHHHHHHHccCCCCCCEEEEECCCcccccccccCC
Confidence            35689999999999999999999762   235679886


No 14 
>PF00024 PAN_1:  PAN domain. This Prosite entry concerns apple domains, a subset of PAN domains;  InterPro: IPR003014 PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. These domains contain a hairpin-loop-like structure, similar to knottins, but the pattern of disulphide bonds differs. It has been shown that the N-terminal N domains of members of the plasminogen/hepatocyte growth factor family, the apple domains of the plasma prekallikrein/coagulation factor XI family, and domains of various nematode proteins belong to the same module superfamily, the PAN module. PAN contains a conserved core of three disulphide bridges. In some members of the family there is an additional fourth disulphide bridge that links the N and C termini of the domain.; PDB: 1GP9_C 2QJ2_B 1GMO_H 1NK1_B 3MKP_B 1BHT_B 3HN4_A 1GMN_A 3HMS_A 3HMT_B ....
Probab=94.32  E-value=0.076  Score=42.06  Aligned_cols=36  Identities=31%  Similarity=0.586  Sum_probs=30.7

Q ss_pred             CCCHHHHHHHhhccCC-eEEEEecCCCCcceEEEeeccc
Q 010261          383 TSYLEQCEDLCQNNCS-CWGALYNNASGSGFCYMLDYPI  420 (514)
Q Consensus       383 ~~sl~~C~~~CL~nCS-C~Ay~y~~~~gsG~C~l~~~~L  420 (514)
                      ..++++|.+.|+.+=. |.+|.|...  ++.|+|+...-
T Consensus        22 v~s~~~C~~~C~~~~~~C~s~~y~~~--~~~C~L~~~~~   58 (79)
T PF00024_consen   22 VPSLEECAQLCLNEPRRCKSFNYDPS--SKTCYLSSSDR   58 (79)
T ss_dssp             ESSHHHHHHHHHHSTT-ESEEEEETT--TTEEEEECSSS
T ss_pred             CCCHHHHHhhcCcCcccCCeEEEECC--CCEEEEcCCCC
Confidence            3599999999999999 999999876  66899987533


No 15 
>smart00223 APPLE APPLE domain. Four-fold repeat in plasma kallikrein and coagulation factor XI. Factor XI apple 3 mediates binding to platelets. Factor XI apple 1 binds high-molecular-mass kininogen. Apple 4 in factor XI mediates dimer formation and binds to factor XIIa. Mutations in apple 4 cause factor XI deficiency, an inherited bleeding disorder.
Probab=93.57  E-value=0.11  Score=42.18  Aligned_cols=48  Identities=15%  Similarity=0.294  Sum_probs=36.1

Q ss_pred             ecccCCCcceeeeccCCCHHHHHHHhhccCCeEEEEecCCCCcc---eEEEeec
Q 010261          368 KGVELPFKELIRYEMTSYLEQCEDLCQNNCSCWGALYNNASGSG---FCYMLDY  418 (514)
Q Consensus       368 ~~v~~P~~~~~~~~~~~sl~~C~~~CL~nCSC~Ay~y~~~~gsG---~C~l~~~  418 (514)
                      .+++++..+.. .....+.++|++.|..+=.|.+|.|...  .+   .|+||..
T Consensus         7 ~~~df~G~Dl~-~~~~~~~~~Cq~~Ct~~~~C~~FTf~~~--~~~~~~C~LK~s   57 (79)
T smart00223        7 KNVDFRGSDIN-TVYVPSAQVCQKRCTSHPRCLFFTFSTN--EPPEEKCLLKDS   57 (79)
T ss_pred             cCccccCceee-eeecCCHHHHHHhhcCCCCccEEEeeCC--CCCCCEeEeCcC
Confidence            45666665443 2345789999999999999999999765  33   7999864


No 16 
>PF08693 SKG6:  Transmembrane alpha-helix domain;  InterPro: IPR014805 SKG6 and AXL2 are membrane proteins that show polarised intracellular localisation. This entry represents the highly conserved transmembrane alpha-helical domain found in these proteins. The full-length AXL2 protein has a negative regulatory function in cytokinesis.
Probab=89.80  E-value=0.2  Score=34.96  Aligned_cols=7  Identities=0%  Similarity=0.306  Sum_probs=2.7

Q ss_pred             eEEEeee
Q 010261          473 YKIWTSR  479 (514)
Q Consensus       473 ~~i~~rr  479 (514)
                      |+.|||+
T Consensus        33 ~~~~rR~   39 (40)
T PF08693_consen   33 FFWYRRK   39 (40)
T ss_pred             heEEecc
Confidence            3334443


No 17 
>PF04478 Mid2:  Mid2 like cell wall stress sensor;  InterPro: IPR007567 This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway.
Probab=86.80  E-value=0.47  Score=43.06  Aligned_cols=8  Identities=25%  Similarity=0.451  Sum_probs=3.3

Q ss_pred             cceeEEee
Q 010261          449 GIAAGIGI  456 (514)
Q Consensus       449 ~~~~~i~~  456 (514)
                      ++++++|+
T Consensus        51 VIGvVVGV   58 (154)
T PF04478_consen   51 VIGVVVGV   58 (154)
T ss_pred             EEEEEecc
Confidence            34444443


No 18 
>cd01099 PAN_AP_HGF Subfamily of PAN/APPLE-like domains; present in N-terminal (N) domains of plasminogen/hepatocyte growth factor proteins, and various proteins found in Bilateria, such as leech anti-platelet proteins. PAN/APPLE domains fulfill diverse biological functions by mediating protein-protein or protein-carbohydrate interactions.
Probab=86.01  E-value=1.6  Score=35.21  Aligned_cols=36  Identities=28%  Similarity=0.460  Sum_probs=30.0

Q ss_pred             CCCHHHHHHHhhc--cCCeEEEEecCCCCcceEEEeeccc
Q 010261          383 TSYLEQCEDLCQN--NCSCWGALYNNASGSGFCYMLDYPI  420 (514)
Q Consensus       383 ~~sl~~C~~~CL~--nCSC~Ay~y~~~~gsG~C~l~~~~L  420 (514)
                      ..++++|.++|++  +=.|.++.|...  ++.|.|-..+-
T Consensus        24 ~~s~~~C~~~C~~~~~f~CrSf~y~~~--~~~C~L~~~~~   61 (80)
T cd01099          24 VASLEECLRKCLEETEFTCRSFNYNYK--SKECILSDEDR   61 (80)
T ss_pred             cCCHHHHHHHhCCCCCceEeEEEEEcC--CCEEEEeCCCc
Confidence            4799999999999  899999999766  56799866433


No 19 
>PF08277 PAN_3:  PAN-like domain;  InterPro: IPR006583 PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions. These domains contain a hairpin-loop-like structure, similar to knottins, but the pattern of disulphide bonds differs. The PAN-3 or CW domain is associated with a number of Caenorhabditis elegans hypothetical proteins.
Probab=83.42  E-value=3.6  Score=32.01  Aligned_cols=33  Identities=27%  Similarity=0.759  Sum_probs=27.5

Q ss_pred             cCCCHHHHHHHhhccCCeEEEEecCCCCcceEEEeec
Q 010261          382 MTSYLEQCEDLCQNNCSCWGALYNNASGSGFCYMLDY  418 (514)
Q Consensus       382 ~~~sl~~C~~~CL~nCSC~Ay~y~~~~gsG~C~l~~~  418 (514)
                      ...+.++|-+.|..+=.|.++.++ .   +.|++...
T Consensus        18 ~~~sw~~Cv~~C~~~~~C~la~~~-~---~~C~~y~~   50 (71)
T PF08277_consen   18 TNTSWDDCVQKCYNDENCVLAYFD-S---GKCYLYNY   50 (71)
T ss_pred             cCCCHHHHhHHhCCCCEEEEEEeC-C---CCEEEEEc
Confidence            457889999999999999999887 3   25998763


No 20 
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=83.23  E-value=1  Score=37.94  Aligned_cols=6  Identities=0%  Similarity=-0.070  Sum_probs=2.4

Q ss_pred             eeEEEe
Q 010261          472 GYKIWT  477 (514)
Q Consensus       472 ~~~i~~  477 (514)
                      .|+++|
T Consensus        88 w~f~~r   93 (96)
T PTZ00382         88 WWFVCR   93 (96)
T ss_pred             heeEEe
Confidence            344443


No 21 
>PF01102 Glycophorin_A:  Glycophorin A;  InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=82.96  E-value=0.11  Score=45.56  Aligned_cols=29  Identities=38%  Similarity=0.518  Sum_probs=14.5

Q ss_pred             EEeehhHHHHHHHHHHhheeeEEEeeeec
Q 010261          453 GIGILGGALLILIGVILFGGYKIWTSRRA  481 (514)
Q Consensus       453 ~i~~~~~~~~~li~~~~~~~~~i~~rrk~  481 (514)
                      ++++++++++.+|++++++.|+|.||||+
T Consensus        66 i~~Ii~gv~aGvIg~Illi~y~irR~~Kk   94 (122)
T PF01102_consen   66 IIGIIFGVMAGVIGIILLISYCIRRLRKK   94 (122)
T ss_dssp             HHHHHHHHHHHHHHHHHHHHHHHHHHS--
T ss_pred             eeehhHHHHHHHHHHHHHHHHHHHHHhcc
Confidence            34444444444555556667776544443


No 22 
>PF15102 TMEM154:  TMEM154 protein family
Probab=81.66  E-value=1.7  Score=39.22  Aligned_cols=12  Identities=8%  Similarity=0.072  Sum_probs=5.7

Q ss_pred             eeeEEEeeeecc
Q 010261          471 GGYKIWTSRRAN  482 (514)
Q Consensus       471 ~~~~i~~rrk~~  482 (514)
                      +++++.||||.|
T Consensus        77 ~lv~~~kRkr~K   88 (146)
T PF15102_consen   77 CLVIYYKRKRTK   88 (146)
T ss_pred             HheeEEeecccC
Confidence            333444565544


No 23 
>PF01034 Syndecan:  Syndecan domain;  InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions. Structurally, these proteins consist of four separate domains: a signal sequence, an extracellular domain (ectodomain) of variable length whose sequence is not evolutionarily conserved in the various forms of syndecans and which contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains, a transmembrane region, and a highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: syndecan 1, syndecan 2 (fibroglycan), syndecan 3 (neuroglycan or N-syndecan), syndecan 4 (amphiglycan or ryudocan), Drosophila syndecan, and Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of the phospholipid activator PIP2, and that the overall structure of this dimer in solution exhibits a twisted clamp shape with a cavity in the centre of the dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts strongly not only with the fatty acyl groups but also with the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion.; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=80.73  E-value=0.5  Score=36.46  Aligned_cols=15  Identities=27%  Similarity=0.288  Sum_probs=0.4

Q ss_pred             HHHhheeeEEEeeee
Q 010261          466 GVILFGGYKIWTSRR  480 (514)
Q Consensus       466 ~~~~~~~~~i~~rrk  480 (514)
                      ++++++.|+++|.||
T Consensus        24 ~ailLIlf~iyR~rk   38 (64)
T PF01034_consen   24 FAILLILFLIYRMRK   38 (64)
T ss_dssp             -------------S-
T ss_pred             HHHHHHHHHHHHHHh
Confidence            333344455555443


No 24 
>smart00605 CW CW domain.
Probab=78.46  E-value=7.8  Score=32.17  Aligned_cols=55  Identities=25%  Similarity=0.433  Sum_probs=35.1

Q ss_pred             cCCCHHHHHHHhhccCCeEEEEecCCCCcceEEEeec-cccceeecCC-CCeeEEEEEec
Q 010261          382 MTSYLEQCEDLCQNNCSCWGALYNNASGSGFCYMLDY-PIQTLLGAGD-VSKLGYFKLRE  439 (514)
Q Consensus       382 ~~~sl~~C~~~CL~nCSC~Ay~y~~~~gsG~C~l~~~-~L~~~~~~~~-~~~~~yIKv~~  439 (514)
                      ...+.++|.+.|..+..|..+....   +..|++... .+..+++... .+..+=+|+..
T Consensus        20 ~~~sw~~Ci~~C~~~~~Cvlay~~~---~~~C~~f~~~~~~~v~~~~~~~~~~VAfK~~~   76 (94)
T smart00605       20 ATLSWDECIQKCYEDSNCVLAYGNS---SETCYLFSYGTVLTVKKLSSSSGKKVAFKVST   76 (94)
T ss_pred             cCCCHHHHHHHHhCCCceEEEecCC---CCceEEEEcCCeEEEEEccCCCCcEEEEEEeC
Confidence            3578999999999999999875543   236998764 2333433322 22334566654


No 25 
>PF01299 Lamp:  Lysosome-associated membrane glycoprotein (Lamp);  InterPro: IPR002000 Lysosome-associated membrane glycoproteins (lamp) are integral membrane proteins, specific to lysosomes, whose exact biological function is not yet clear. Structurally, the lamp proteins consist of two internally homologous lysosome-luminal domains separated by a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region (TM) followed by a very short cytoplasmic tail (C). In each of the duplicated domains, there are two conserved disulphide bonds. Schematically, the order is: luminal domain 1 (two disulphide bonds), hinge, luminal domain 2 (two disulphide bonds), TM, C. In mammals, there are two closely related types of lamp: lamp-1 and lamp-2, which form major components of the lysosome membrane. In chicken, lamp-1 is known as LEP100. Also included in this entry is the macrophage protein CD68 (or macrosialin), a heavily glycosylated integral membrane protein whose structure consists of a mucin-like domain followed by a proline-rich hinge, a single lamp-like domain, a transmembrane region and a short cytoplasmic tail. Similar to CD68, mammalian lamp-3, which is expressed in lymphoid organs, dendritic cells and in lung, contains all the C-terminal regions but lacks the N-terminal lamp-like region. In a lamp-family protein from nematodes only the part C-terminal to the hinge is conserved.; GO: 0016020 membrane
Probab=68.81  E-value=2.4  Score=43.45  Aligned_cols=27  Identities=30%  Similarity=0.250  Sum_probs=15.0

Q ss_pred             ehhHHHHHHHHHHhheeeEEEeeeecc
Q 010261          456 ILGGALLILIGVILFGGYKIWTSRRAN  482 (514)
Q Consensus       456 ~~~~~~~~li~~~~~~~~~i~~rrk~~  482 (514)
                      |+|++.++.+++++++.|+|.|||+.+
T Consensus       275 IaVG~~La~lvlivLiaYli~Rrr~~~  301 (306)
T PF01299_consen  275 IAVGAALAGLVLIVLIAYLIGRRRSRA  301 (306)
T ss_pred             HHHHHHHHHHHHHHHHhheeEeccccc
Confidence            334444444445556678887666543


No 26 
>PF01683 EB:  EB module;  InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with Kunitz domains (InterPro: IPR002223).
Probab=65.17  E-value=5.4  Score=29.21  Aligned_cols=32  Identities=19%  Similarity=0.512  Sum_probs=26.9

Q ss_pred             eeccCCCCCCCCCCCCCcCCCCCcccCCCCCC
Q 010261          308 QAISDACQLPSPCGSYSLCKQSGCSCLDNRTD  339 (514)
Q Consensus       308 ~~p~~~Cd~~g~CG~~giC~~~~C~Cl~g~~~  339 (514)
                      ..|.+.|+...-|-.+++|..+.|.|++|+..
T Consensus        16 ~~~g~~C~~~~qC~~~s~C~~g~C~C~~g~~~   47 (52)
T PF01683_consen   16 VQPGESCESDEQCIGGSVCVNGRCQCPPGYVE   47 (52)
T ss_pred             CCCCCCCCCcCCCCCcCEEcCCEeECCCCCEe
Confidence            45668899999999999997777999998754


No 27 
>TIGR01478 STEVOR variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits.
Probab=59.23  E-value=1.8  Score=43.32  Aligned_cols=11  Identities=27%  Similarity=0.528  Sum_probs=6.1

Q ss_pred             CCCccCCCCCCCC
Q 010261          344 GECFASTSGDFCS  356 (514)
Q Consensus       344 ~GC~r~~~~~~C~  356 (514)
                      ++|.|.  --.|.
T Consensus       170 ~rC~~g--i~~Cs  180 (295)
T TIGR01478       170 KGCTAG--VGTCA  180 (295)
T ss_pred             ccCCCe--eEeec
Confidence            577753  33464


No 28 
>PTZ00370 STEVOR; Provisional
Probab=57.46  E-value=2  Score=43.09  Aligned_cols=11  Identities=27%  Similarity=0.377  Sum_probs=6.3

Q ss_pred             CCCccCCCCCCCC
Q 010261          344 GECFASTSGDFCS  356 (514)
Q Consensus       344 ~GC~r~~~~~~C~  356 (514)
                      .+|.|.  --.|.
T Consensus       170 ~rC~~g--i~~Cs  180 (296)
T PTZ00370        170 HRCTGG--ICSCS  180 (296)
T ss_pred             ccCCCe--eEeec
Confidence            578763  33465


No 29 
>PF13360 PQQ_2:  PQQ-like domain; PDB: 3HXJ_B 1YIQ_A 1KV9_A 3Q54_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A ....
Probab=56.49  E-value=87  Score=29.61  Aligned_cols=78  Identities=19%  Similarity=0.334  Sum_probs=49.4

Q ss_pred             EEEEEcCCCceEEeecCCCCCC-----CCCcEE-EeecCcEEEec-CCCceEEee-cCC----C----------CcEEEe
Q 010261           99 LAVIHLPSSKPLWLANSTQLAP-----WSDRIE-LSFNGSLVISG-PHSRVFWST-TRA----E----------GQRVVI  156 (514)
Q Consensus        99 i~i~~~~~~tvVWvANR~~Pv~-----~~~~l~-L~~~G~LvL~d-~~g~~vWst-~~s----~----------~~~a~L  156 (514)
                      +..+.+...+++|...-+.++.     ..+.+. .+.+|.|...| .+|.++|.. ...    .          +..+.+
T Consensus        48 l~~~d~~tG~~~W~~~~~~~~~~~~~~~~~~v~v~~~~~~l~~~d~~tG~~~W~~~~~~~~~~~~~~~~~~~~~~~~~~~  127 (238)
T PF13360_consen   48 LYALDAKTGKVLWRFDLPGPISGAPVVDGGRVYVGTSDGSLYALDAKTGKVLWSIYLTSSPPAGVRSSSSPAVDGDRLYV  127 (238)
T ss_dssp             EEEEETTTSEEEEEEECSSCGGSGEEEETTEEEEEETTSEEEEEETTTSCEEEEEEE-SSCTCSTB--SEEEEETTEEEE
T ss_pred             EEEEECCCCCEEEEeeccccccceeeecccccccccceeeeEecccCCcceeeeeccccccccccccccCceEecCEEEE
Confidence            3444555678999988766532     233343 34467788888 889999994 321    1          112233


Q ss_pred             -ecCCCEEEEecCCCCeeeeee
Q 010261          157 -LNTSNLQIQKLDDPLSVVWQS  177 (514)
Q Consensus       157 -ldsGNLVL~~~~~~~~~lWQS  177 (514)
                       ..+|.|+..|..+| +.+|+=
T Consensus       128 ~~~~g~l~~~d~~tG-~~~w~~  148 (238)
T PF13360_consen  128 GTSSGKLVALDPKTG-KLLWKY  148 (238)
T ss_dssp             EETCSEEEEEETTTT-EEEEEE
T ss_pred             EeccCcEEEEecCCC-cEEEEe
Confidence             33889999986577 899974


No 30 
>cd00053 EGF Epidermal growth factor domain, found in epidermal growth factor (EGF), present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Probab=55.76  E-value=9.9  Score=24.37  Aligned_cols=26  Identities=27%  Similarity=0.763  Sum_probs=19.0

Q ss_pred             CCCCCCCCCCCcCCCC--C--cccCCCCCC
Q 010261          314 CQLPSPCGSYSLCKQS--G--CSCLDNRTD  339 (514)
Q Consensus       314 Cd~~g~CG~~giC~~~--~--C~Cl~g~~~  339 (514)
                      |.....|..++.|...  .  |.|++||..
T Consensus         2 C~~~~~C~~~~~C~~~~~~~~C~C~~g~~g   31 (36)
T cd00053           2 CAASNPCSNGGTCVNTPGSYRCVCPPGYTG   31 (36)
T ss_pred             CCCCCCCCCCCEEecCCCCeEeECCCCCcc
Confidence            4445678888999762  2  999988754


No 31 
>PF06024 DUF912:  Nucleopolyhedrovirus protein of unknown function (DUF912);  InterPro: IPR009261 This entry is represented by Autographa californica nuclear polyhedrosis virus (AcMNPV), Orf78; it is a family of uncharacterised viral proteins.
Probab=54.80  E-value=9.8  Score=32.34  Aligned_cols=14  Identities=14%  Similarity=0.126  Sum_probs=7.2

Q ss_pred             hheeeEEEeeeecc
Q 010261          469 LFGGYKIWTSRRAN  482 (514)
Q Consensus       469 ~~~~~~i~~rrk~~  482 (514)
                      ++.+|.|.|.|+.+
T Consensus        80 ~IyYFVILRer~~~   93 (101)
T PF06024_consen   80 AIYYFVILRERQKS   93 (101)
T ss_pred             hheEEEEEeccccc
Confidence            44455665555444


No 32 
>PF13908 Shisa:  Wnt and FGF inhibitory regulator
Probab=54.43  E-value=10  Score=35.48  Aligned_cols=12  Identities=25%  Similarity=0.573  Sum_probs=5.2

Q ss_pred             eEEeehhHHHHH
Q 010261          452 AGIGILGGALLI  463 (514)
Q Consensus       452 ~~i~~~~~~~~~  463 (514)
                      ++++++++++++
T Consensus        80 iivgvi~~Vi~I   91 (179)
T PF13908_consen   80 IIVGVICGVIAI   91 (179)
T ss_pred             eeeehhhHHHHH
Confidence            444454444333


No 33 
>PF04478 Mid2:  Mid2 like cell wall stress sensor;  InterPro: IPR007567 This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway.
Probab=54.34  E-value=15  Score=33.55  Aligned_cols=25  Identities=20%  Similarity=0.042  Sum_probs=15.3

Q ss_pred             eeEEeehhHHHHHHHHHHhheeeEE
Q 010261          451 AAGIGILGGALLILIGVILFGGYKI  475 (514)
Q Consensus       451 ~~~i~~~~~~~~~li~~~~~~~~~i  475 (514)
                      .++||+++++-+++|++++.++|++
T Consensus        49 nIVIGvVVGVGg~ill~il~lvf~~   73 (154)
T PF04478_consen   49 NIVIGVVVGVGGPILLGILALVFIF   73 (154)
T ss_pred             cEEEEEEecccHHHHHHHHHhheeE
Confidence            5899999998554444433333433


No 34 
>PF07645 EGF_CA:  Calcium-binding EGF domain;  InterPro: IPR001881 A sequence of about forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present in a large number of membrane-bound and extracellular, mostly animal, proteins. Many of these proteins require calcium for their biological function and a calcium-binding site has been found at the N terminus of some EGF-like domains. Calcium-binding may be crucial for numerous protein-protein interactions. For human coagulation factor IX it has been shown that the calcium-ligands form a pentagonal bipyramid. The first, third and fourth conserved negatively charged or polar residues are side chain ligands. The latter is possibly hydroxylated (see aspartic acid and asparagine hydroxylation site). A conserved aromatic residue, as well as the second conserved negative residue, are thought to be involved in stabilising the calcium-binding site. As in non-calcium binding EGF-like domains, there are six conserved cysteines and the structure of both types is very similar as calcium-binding induces only strictly local structural changes. The consensus pattern is nxnnC-x(3,14)-C-x(3,7)-CxxbxxxxaxC-x(1,6)-C-x(8,13)-Cx, where 'n' is a negatively charged or polar residue [DEQN], 'b' a possibly beta-hydroxylated residue [DN], 'a' an aromatic amino acid, 'C' a cysteine involved in a disulphide bond, and 'x' any amino acid.; GO: 0005509 calcium ion binding; PDB: 2VJ3_A 1TOZ_A 1LMJ_A 1UZQ_A 1UZK_A 1UZJ_B 1UZP_A 1EMO_A 1EMN_A 2RR0_A ....
Probab=53.48  E-value=4  Score=28.59  Aligned_cols=27  Identities=26%  Similarity=0.621  Sum_probs=21.6

Q ss_pred             CCCCCC-CCCCCCCcCCC--CC--cccCCCCC
Q 010261          312 DACQLP-SPCGSYSLCKQ--SG--CSCLDNRT  338 (514)
Q Consensus       312 ~~Cd~~-g~CG~~giC~~--~~--C~Cl~g~~  338 (514)
                      |+|... ..|.+++.|..  +.  |.|++||.
T Consensus         3 dEC~~~~~~C~~~~~C~N~~Gsy~C~C~~Gy~   34 (42)
T PF07645_consen    3 DECAEGPHNCPENGTCVNTEGSYSCSCPPGYE   34 (42)
T ss_dssp             STTTTTSSSSSTTSEEEEETTEEEEEESTTEE
T ss_pred             cccCCCCCcCCCCCEEEcCCCCEEeeCCCCcE
Confidence            678774 48999999975  33  99999986


No 35 
>PF06365 CD34_antigen:  CD34/Podocalyxin family;  InterPro: IPR013836 This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation. This family contains several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighbouring foot processes in the glomerular epithelium by charge repulsion.
Probab=51.69  E-value=7.4  Score=37.32  Aligned_cols=29  Identities=21%  Similarity=0.395  Sum_probs=17.8

Q ss_pred             eEEeehhHHHHHHHHHHhheeeEEEeeee
Q 010261          452 AGIGILGGALLILIGVILFGGYKIWTSRR  480 (514)
Q Consensus       452 ~~i~~~~~~~~~li~~~~~~~~~i~~rrk  480 (514)
                      ++|++++...++|++++++.+|++|+||.
T Consensus       101 ~lI~lv~~g~~lLla~~~~~~Y~~~~Rrs  129 (202)
T PF06365_consen  101 TLIALVTSGSFLLLAILLGAGYCCHQRRS  129 (202)
T ss_pred             EEEehHHhhHHHHHHHHHHHHHHhhhhcc
Confidence            55555554434555556666777787775


No 36 
>PF02009 Rifin_STEVOR:  Rifin/stevor family;  InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cell erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens.
Probab=50.90  E-value=2.6  Score=43.02  Aligned_cols=10  Identities=40%  Similarity=0.245  Sum_probs=5.1

Q ss_pred             eeeEEEeeee
Q 010261          471 GGYKIWTSRR  480 (514)
Q Consensus       471 ~~~~i~~rrk  480 (514)
                      ++|+|||+||
T Consensus       274 IIYLILRYRR  283 (299)
T PF02009_consen  274 IIYLILRYRR  283 (299)
T ss_pred             HHHHHHHHHH
Confidence            3455555444


No 37 
>PF03302 VSP:  Giardia variant-specific surface protein;  InterPro: IPR005127 During infection, the intestinal protozoan parasite Giardia lamblia undergoes continuous antigenic variation which is determined by diversification of the parasite's major surface antigen, named VSP (variant surface protein).
Probab=50.23  E-value=8.9  Score=40.81  Aligned_cols=30  Identities=17%  Similarity=0.179  Sum_probs=18.5

Q ss_pred             eeEEeehhHHHHHHHHHHhheeeEEEeeee
Q 010261          451 AAGIGILGGALLILIGVILFGGYKIWTSRR  480 (514)
Q Consensus       451 ~~~i~~~~~~~~~li~~~~~~~~~i~~rrk  480 (514)
                      ++|+||.|+++++|-+++.|+.||+.-|+|
T Consensus       367 gaIaGIsvavvvvVgglvGfLcWwf~crgk  396 (397)
T PF03302_consen  367 GAIAGISVAVVVVVGGLVGFLCWWFICRGK  396 (397)
T ss_pred             cceeeeeehhHHHHHHHHHHHhhheeeccc
Confidence            466777776666555555666665555554


No 38 
>KOG4649 consensus PQQ (pyrrolo-quinoline quinone) repeat protein [Secondary metabolites biosynthesis, transport and catabolism]
Probab=49.11  E-value=52  Score=32.99  Aligned_cols=44  Identities=20%  Similarity=0.302  Sum_probs=33.3

Q ss_pred             CCceEEeecCCCCCCCC-----CcEEE-eecCcEEEecCCCceEEeecCC
Q 010261          106 SSKPLWLANSTQLAPWS-----DRIEL-SFNGSLVISGPHSRVFWSTTRA  149 (514)
Q Consensus       106 ~~tvVWvANR~~Pv~~~-----~~l~L-~~~G~LvL~d~~g~~vWst~~s  149 (514)
                      +.++.|-|-|..|+-.+     ....+ +-||+|.-.|..|+.||.-.+.
T Consensus       168 ~~~~~w~~~~~~PiF~splcv~~sv~i~~VdG~l~~f~~sG~qvwr~~t~  217 (354)
T KOG4649|consen  168 SSTEFWAATRFGPIFASPLCVGSSVIITTVDGVLTSFDESGRQVWRPATK  217 (354)
T ss_pred             CcceehhhhcCCccccCceeccceEEEEEeccEEEEEcCCCcEEEeecCC
Confidence            34899999999997532     23444 3599999899999999976654


No 39 
>smart00179 EGF_CA Calcium-binding EGF-like domain.
Probab=47.72  E-value=15  Score=24.36  Aligned_cols=27  Identities=30%  Similarity=0.753  Sum_probs=20.0

Q ss_pred             CCCCCCCCCCCCCcCCCC--C--cccCCCCC
Q 010261          312 DACQLPSPCGSYSLCKQS--G--CSCLDNRT  338 (514)
Q Consensus       312 ~~Cd~~g~CG~~giC~~~--~--C~Cl~g~~  338 (514)
                      ++|.....|...+.|...  .  |.|++|+.
T Consensus         3 ~~C~~~~~C~~~~~C~~~~g~~~C~C~~g~~   33 (39)
T smart00179        3 DECASGNPCQNGGTCVNTVGSYRCECPPGYT   33 (39)
T ss_pred             ccCcCCCCcCCCCEeECCCCCeEeECCCCCc
Confidence            567655678888899752  2  99999875


No 40 
>PF13360 PQQ_2:  PQQ-like domain; PDB: 3HXJ_B 1YIQ_A 1KV9_A 3Q54_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A ....
Probab=47.50  E-value=1.1e+02  Score=28.87  Aligned_cols=73  Identities=21%  Similarity=0.304  Sum_probs=43.1

Q ss_pred             cCCCceEEeecC----CCCC----CCCCcEEE-eecCcEEEecC-CCceEEeecCCCC---------cEEEeec-CCCEE
Q 010261          104 LPSSKPLWLANS----TQLA----PWSDRIEL-SFNGSLVISGP-HSRVFWSTTRAEG---------QRVVILN-TSNLQ  163 (514)
Q Consensus       104 ~~~~tvVWvANR----~~Pv----~~~~~l~L-~~~G~LvL~d~-~g~~vWst~~s~~---------~~a~Lld-sGNLV  163 (514)
                      ......+|..+-    +.++    ..+..+-+ +.+|.|+..|. +|.++|+......         ..+.+.. +|-|.
T Consensus        10 ~~tG~~~W~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~~d~~tG~~~W~~~~~~~~~~~~~~~~~~v~v~~~~~~l~   89 (238)
T PF13360_consen   10 PRTGKELWSYDLGPGIGGPVATAVPDGGRVYVASGDGNLYALDAKTGKVLWRFDLPGPISGAPVVDGGRVYVGTSDGSLY   89 (238)
T ss_dssp             TTTTEEEEEEECSSSCSSEEETEEEETTEEEEEETTSEEEEEETTTSEEEEEEECSSCGGSGEEEETTEEEEEETTSEEE
T ss_pred             CCCCCEEEEEECCCCCCCccceEEEeCCEEEEEcCCCEEEEEECCCCCEEEEeeccccccceeeecccccccccceeeeE
Confidence            345567887753    2222    12233333 36788999996 8999999875321         1223333 44466


Q ss_pred             EEecCCCCeeeeee
Q 010261          164 IQKLDDPLSVVWQS  177 (514)
Q Consensus       164 L~~~~~~~~~lWQS  177 (514)
                      ..|..++ +++|+.
T Consensus        90 ~~d~~tG-~~~W~~  102 (238)
T PF13360_consen   90 ALDAKTG-KVLWSI  102 (238)
T ss_dssp             EEETTTS-CEEEEE
T ss_pred             ecccCCc-ceeeee
Confidence            6664466 899995


No 41 
>PHA03265 envelope glycoprotein D; Provisional
Probab=47.32  E-value=8.1  Score=39.75  Aligned_cols=30  Identities=20%  Similarity=0.323  Sum_probs=16.3

Q ss_pred             cceeEEeehhHHHHHHHHHHhheeeEEEeeeecc
Q 010261          449 GIAAGIGILGGALLILIGVILFGGYKIWTSRRAN  482 (514)
Q Consensus       449 ~~~~~i~~~~~~~~~li~~~~~~~~~i~~rrk~~  482 (514)
                      .++++||+.++.++++ +++   .|++|||||..
T Consensus       349 ~~g~~ig~~i~glv~v-g~i---l~~~~rr~k~~  378 (402)
T PHA03265        349 FVGISVGLGIAGLVLV-GVI---LYVCLRRKKEL  378 (402)
T ss_pred             ccceEEccchhhhhhh-hHH---HHHHhhhhhhh
Confidence            3456666665543322 333   35567787743


No 42 
>PF15102 TMEM154:  TMEM154 protein family
Probab=47.08  E-value=16  Score=33.16  Aligned_cols=29  Identities=28%  Similarity=0.141  Sum_probs=19.1

Q ss_pred             EEeehhHHHHHHHHHHhheeeEEEeeeec
Q 010261          453 GIGILGGALLILIGVILFGGYKIWTSRRA  481 (514)
Q Consensus       453 ~i~~~~~~~~~li~~~~~~~~~i~~rrk~  481 (514)
                      +|.++++++++|++++++..++.||.|+.
T Consensus        62 lIP~VLLvlLLl~vV~lv~~~kRkr~K~~   90 (146)
T PF15102_consen   62 LIPLVLLVLLLLSVVCLVIYYKRKRTKQE   90 (146)
T ss_pred             eHHHHHHHHHHHHHHHheeEEeecccCCC
Confidence            33334444555666778888899998873


No 43 
>PF07974 EGF_2:  EGF-like domain;  InterPro: IPR013111 A sequence about thirty to forty amino-acid residues long, found in epidermal growth factor (EGF), has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry contains EGF domains found in a variety of extracellular and membrane proteins.
Probab=46.39  E-value=13  Score=24.59  Aligned_cols=21  Identities=24%  Similarity=0.580  Sum_probs=16.9

Q ss_pred             CCCCCCCcCCCC--CcccCCCCC
Q 010261          318 SPCGSYSLCKQS--GCSCLDNRT  338 (514)
Q Consensus       318 g~CG~~giC~~~--~C~Cl~g~~  338 (514)
                      .+|..+|.|...  .|.|.+||.
T Consensus         6 ~~C~~~G~C~~~~g~C~C~~g~~   28 (32)
T PF07974_consen    6 NICSGHGTCVSPCGRCVCDSGYT   28 (32)
T ss_pred             CccCCCCEEeCCCCEEECCCCCc
Confidence            579999999854  499998864


No 44 
>PRK11138 outer membrane biogenesis protein BamB; Provisional
Probab=44.10  E-value=98  Score=32.44  Aligned_cols=48  Identities=23%  Similarity=0.282  Sum_probs=32.1

Q ss_pred             ecCcEEEecC-CCceEEeecCCCC---------cE-EEeecCCCEEEEecCCCCeeeeee
Q 010261          129 FNGSLVISGP-HSRVFWSTTRAEG---------QR-VVILNTSNLQIQKLDDPLSVVWQS  177 (514)
Q Consensus       129 ~~G~LvL~d~-~g~~vWst~~s~~---------~~-a~LldsGNLVL~~~~~~~~~lWQS  177 (514)
                      .+|.|+-+|. +|.++|+.+....         .. .....+|.|+-.|..+| +.+|+-
T Consensus       128 ~~g~l~ald~~tG~~~W~~~~~~~~~ssP~v~~~~v~v~~~~g~l~ald~~tG-~~~W~~  186 (394)
T PRK11138        128 EKGQVYALNAEDGEVAWQTKVAGEALSRPVVSDGLVLVHTSNGMLQALNESDG-AVKWTV  186 (394)
T ss_pred             CCCEEEEEECCCCCCcccccCCCceecCCEEECCEEEEECCCCEEEEEEccCC-CEeeee
Confidence            4678887875 6899999875421         11 12234677888886567 899974


No 45 
>TIGR03300 assembly_YfgL outer membrane assembly lipoprotein YfgL. Members of this protein family are YfgL, a lipoprotein component of a complex that acts in protein insertion into the bacterial outer membrane. Other members of this complex are NlpB, YfiO, and YaeT. This protein contains multiple copies of a repeat that, in other contexts, is associated with binding of the coenzyme PQQ.
Probab=43.52  E-value=67  Score=33.30  Aligned_cols=47  Identities=17%  Similarity=0.322  Sum_probs=26.3

Q ss_pred             ecCcEEEec-CCCceEEeecCCCC---------cEE-EeecCCCEEEEecCCCCeeeee
Q 010261          129 FNGSLVISG-PHSRVFWSTTRAEG---------QRV-VILNTSNLQIQKLDDPLSVVWQ  176 (514)
Q Consensus       129 ~~G~LvL~d-~~g~~vWst~~s~~---------~~a-~LldsGNLVL~~~~~~~~~lWQ  176 (514)
                      .+|.|.-.| .+|.++|+.+....         ..+ .--.+|+|+-.|..++ +++|+
T Consensus        73 ~~g~v~a~d~~tG~~~W~~~~~~~~~~~p~v~~~~v~v~~~~g~l~ald~~tG-~~~W~  130 (377)
T TIGR03300        73 ADGTVVALDAETGKRLWRVDLDERLSGGVGADGGLVFVGTEKGEVIALDAEDG-KELWR  130 (377)
T ss_pred             CCCeEEEEEccCCcEeeeecCCCCcccceEEcCCEEEEEcCCCEEEEEECCCC-cEeee
Confidence            356676666 56778887654321         111 1223566666664355 77885


No 46 
>cd05845 Ig2_L1-CAM_like Second immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM) and similar proteins. Ig2_L1-CAM_like: domain similar to the second immunoglobulin (Ig)-like domain of the L1 cell adhesion molecule (CAM). L1 belongs to the L1 subfamily of cell adhesion molecules (CAMs) and is comprised of an extracellular region having six Ig-like domains, five fibronectin type III domains, a transmembrane region and an intracellular domain. L1 is primarily expressed in the nervous system and is involved in its development and function. L1 is associated with an X-linked recessive disorder, X-linked hydrocephalus, MASA syndrome, or spastic paraplegia type 1, that involves abnormalities of axonal growth.
Probab=41.72  E-value=34  Score=28.70  Aligned_cols=34  Identities=15%  Similarity=0.333  Sum_probs=22.2

Q ss_pred             cCCCceEEeecCCCCCCCCCcEEEeecCcEEEec
Q 010261          104 LPSSKPLWLANSTQLAPWSDRIELSFNGSLVISG  137 (514)
Q Consensus       104 ~~~~tvVWvANR~~Pv~~~~~l~L~~~G~LvL~d  137 (514)
                      .|..++.|+-+....+......+++.+|+|.+.+
T Consensus        31 ~P~P~i~W~~~~~~~i~~~~Ri~~~~~GnL~fs~   64 (95)
T cd05845          31 AVPLRIYWMNSDLLHITQDERVSMGQNGNLYFAN   64 (95)
T ss_pred             CCCCEEEEECCCCccccccccEEECCCceEEEEE
Confidence            4667888984443445545567777778887654


No 47 
>PF01436 NHL:  NHL repeat;  InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C-terminal alpha-amidation of biological peptides. In many it occurs in tandem arrays, for example in the ringfinger beta-box, coiled-coil (RBCC) eukaryotic growth regulators. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller.  The NHL repeats are also found in serine/threonine protein kinases (STPKs) in a diverse range of pathogenic bacteria. These STPKs are transmembrane receptors with an intracellular N-terminal kinase domain and an extracellular C-terminal sensor domain. In the STPK PknD from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; GO: 0005515 protein binding; PDB: 3FVZ_A 3FW0_A 1RWL_A 1RWI_A 1Q7F_A.
Probab=39.41  E-value=42  Score=21.18  Aligned_cols=20  Identities=5%  Similarity=0.238  Sum_probs=13.7

Q ss_pred             EEEeecCcEEEecCCCceEE
Q 010261          125 IELSFNGSLVISGPHSRVFW  144 (514)
Q Consensus       125 l~L~~~G~LvL~d~~g~~vW  144 (514)
                      +.++.+|++++.|.++.-||
T Consensus         7 vav~~~g~i~VaD~~n~rV~   26 (28)
T PF01436_consen    7 VAVDSDGNIYVADSGNHRVQ   26 (28)
T ss_dssp             EEEETTSEEEEEECCCTEEE
T ss_pred             EEEeCCCCEEEEECCCCEEE
Confidence            56667777777776666555


No 48 
>TIGR03300 assembly_YfgL outer membrane assembly lipoprotein YfgL. Members of this protein family are YfgL, a lipoprotein component of a complex that acts in protein insertion into the bacterial outer membrane. Other members of this complex are NlpB, YfiO, and YaeT. This protein contains multiple copies of a repeat that, in other contexts, is associated with binding of the coenzyme PQQ.
Probab=38.29  E-value=1.8e+02  Score=30.07  Aligned_cols=71  Identities=14%  Similarity=0.257  Sum_probs=43.5

Q ss_pred             CCCceEEeecCCCCC-----CCCCcEEE-eecCcEEEecC-CCceEEeecCCCC---------cE-EEeecCCCEEEEec
Q 010261          105 PSSKPLWLANSTQLA-----PWSDRIEL-SFNGSLVISGP-HSRVFWSTTRAEG---------QR-VVILNTSNLQIQKL  167 (514)
Q Consensus       105 ~~~tvVWvANR~~Pv-----~~~~~l~L-~~~G~LvL~d~-~g~~vWst~~s~~---------~~-a~LldsGNLVL~~~  167 (514)
                      ....++|.-+-..++     -....+-+ +.+|.|+-+|. +|.++|+......         .. ..-..+|.|+..|.
T Consensus        83 ~tG~~~W~~~~~~~~~~~p~v~~~~v~v~~~~g~l~ald~~tG~~~W~~~~~~~~~~~p~v~~~~v~v~~~~g~l~a~d~  162 (377)
T TIGR03300        83 ETGKRLWRVDLDERLSGGVGADGGLVFVGTEKGEVIALDAEDGKELWRAKLSSEVLSPPLVANGLVVVRTNDGRLTALDA  162 (377)
T ss_pred             cCCcEeeeecCCCCcccceEEcCCEEEEEcCCCEEEEEECCCCcEeeeeccCceeecCCEEECCEEEEECCCCeEEEEEc
Confidence            445788875543332     22333433 35788888886 6899998765321         11 12234677888886


Q ss_pred             CCCCeeeee
Q 010261          168 DDPLSVVWQ  176 (514)
Q Consensus       168 ~~~~~~lWQ  176 (514)
                      .++ +++|+
T Consensus       163 ~tG-~~~W~  170 (377)
T TIGR03300       163 ATG-ERLWT  170 (377)
T ss_pred             CCC-ceeeE
Confidence            566 89997


No 49 
>cd00054 EGF_CA Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Probab=37.95  E-value=26  Score=22.74  Aligned_cols=27  Identities=33%  Similarity=0.778  Sum_probs=19.0

Q ss_pred             CCCCCCCCCCCCCcCCCC--C--cccCCCCC
Q 010261          312 DACQLPSPCGSYSLCKQS--G--CSCLDNRT  338 (514)
Q Consensus       312 ~~Cd~~g~CG~~giC~~~--~--C~Cl~g~~  338 (514)
                      ++|.....|...+.|...  .  |.|++|+.
T Consensus         3 ~~C~~~~~C~~~~~C~~~~~~~~C~C~~g~~   33 (38)
T cd00054           3 DECASGNPCQNGGTCVNTVGSYRCSCPPGYT   33 (38)
T ss_pred             ccCCCCCCcCCCCEeECCCCCeEeECCCCCc
Confidence            567654578878889752  2  99998764


No 50 
>PF12877 DUF3827:  Domain of unknown function (DUF3827);  InterPro: IPR024606 The function of the proteins in this entry is not currently known, but one of the human proteins (Q9HCM3 from SWISSPROT) has been implicated in pilocytic astrocytomas. In the majority of cases of pilocytic astrocytomas a tandem duplication produces an in-frame fusion of the gene encoding this protein and the BRAF oncogene. The resulting fusion protein has constitutive BRAF kinase activity and is capable of transforming cells.
Probab=37.03  E-value=15  Score=40.97  Aligned_cols=30  Identities=20%  Similarity=0.359  Sum_probs=14.7

Q ss_pred             ceeEEeehhHHHHHHHHHHhheeeEEEeeee
Q 010261          450 IAAGIGILGGALLILIGVILFGGYKIWTSRR  480 (514)
Q Consensus       450 ~~~~i~~~~~~~~~li~~~~~~~~~i~~rrk  480 (514)
                      .++++|+++.++++++ +++++.+.++|++|
T Consensus       269 lWII~gVlvPv~vV~~-Iiiil~~~LCRk~K  298 (684)
T PF12877_consen  269 LWIIAGVLVPVLVVLL-IIIILYWKLCRKNK  298 (684)
T ss_pred             eEEEehHhHHHHHHHH-HHHHHHHHHhcccc
Confidence            4566677666554443 33334444444443


No 51 
>PRK11138 outer membrane biogenesis protein BamB; Provisional
Probab=33.50  E-value=2.3e+02  Score=29.61  Aligned_cols=18  Identities=28%  Similarity=0.602  Sum_probs=8.0

Q ss_pred             cCcEEEecC-CCceEEeec
Q 010261          130 NGSLVISGP-HSRVFWSTT  147 (514)
Q Consensus       130 ~G~LvL~d~-~g~~vWst~  147 (514)
                      +|.|.-.|. +|.++|+..
T Consensus       265 ~g~l~ald~~tG~~~W~~~  283 (394)
T PRK11138        265 NGNLVALDLRSGQIVWKRE  283 (394)
T ss_pred             CCeEEEEECCCCCEEEeec
Confidence            444444442 344555443


No 52 
>PF02009 Rifin_STEVOR:  Rifin/stevor family;  InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within red blood cells (erythrocytes). The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and to non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that codes for variant surface antigens.
Probab=31.51  E-value=5.8  Score=40.49  Aligned_cols=26  Identities=19%  Similarity=0.337  Sum_probs=15.8

Q ss_pred             ehhHHHHHHHHHHhheeeEEEeeeec
Q 010261          456 ILGGALLILIGVILFGGYKIWTSRRA  481 (514)
Q Consensus       456 ~~~~~~~~li~~~~~~~~~i~~rrk~  481 (514)
                      ++++++++||.+++++++|+.|+||-
T Consensus       262 iiaIliIVLIMvIIYLILRYRRKKKm  287 (299)
T PF02009_consen  262 IIAILIIVLIMVIIYLILRYRRKKKM  287 (299)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHhhh
Confidence            33333445555667777777787663


No 53 
>PF00157 Pou:  Pou domain - N-terminal to homeobox domain;  InterPro: IPR000327 POU proteins are eukaryotic transcription factors containing a bipartite DNA binding domain referred to as the POU domain. The acronym POU (pronounced 'pow') is derived from the names of three mammalian transcription factors, the pituitary-specific Pit-1, the octamer-binding proteins Oct-1 and Oct-2, and the neural Unc-86 from Caenorhabditis elegans. POU domain genes have been identified in diverse organisms including nematodes, flies, amphibians, fish and mammals but have not yet been identified in plants and fungi. The various members of the POU family have a wide variety of functions, all of which are related to the function of the neuroendocrine system and the development of an organism. Some other genes are also regulated, including those for immunoglobulin light and heavy chains (Oct-2), and trophic hormone genes, such as those for prolactin and growth hormone (Pit-1).  The POU domain is a bipartite domain composed of two subunits separated by a non-conserved region of 15-55 aa. The N-terminal subunit is known as the POU-specific (POUs) domain (IPR000327 from INTERPRO), while the C-terminal subunit is a homeobox domain (IPR007103 from INTERPRO). 3D structures of complexes including both POU subdomains bound to DNA are available. Both subdomains contain the structural motif 'helix-turn-helix', which directly associates with the two components of bipartite DNA binding sites, and both are required for high affinity sequence-specific DNA-binding. The domain may also be involved in protein-protein interactions. The subdomains are connected by a flexible linker. In proteins a POU-specific domain is always accompanied by a homeodomain. Despite the lack of sequence homology, the 3D structure of POUs is similar to that of the bacteriophage lambda repressor and other members of the HTH_3 family.
This entry represents the POU-specific subunit of the POU domain.; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0006355 regulation of transcription, DNA-dependent; PDB: 3D1N_O 1AU7_A 3L1P_A 2XSD_C 1O4X_A 1HF0_B 1GT0_C 1POU_A 1CQT_B 1E3O_C ....
Probab=31.23  E-value=10  Score=30.37  Aligned_cols=26  Identities=27%  Similarity=0.428  Sum_probs=20.9

Q ss_pred             eeccceeeeccchhhhhhhhhhhhhH
Q 010261            6 NFSQSTISLFSNMKKSANSATRTHAI   31 (514)
Q Consensus         6 ~~~~~~~~~~~~~~~~~~~~~~~~~~   31 (514)
                      .|||+||+-||+++-+..+|....++
T Consensus        41 ~~SQttI~RFE~L~LS~kn~~klkP~   66 (75)
T PF00157_consen   41 EFSQTTICRFEALQLSFKNMCKLKPL   66 (75)
T ss_dssp             GGSHHHHHHHHTTTSCHHHHHHHHHH
T ss_pred             cccchhhhhhHhcccCHHHHHHHHHH
Confidence            58999999999999887777655443


No 54 
>PF12458 DUF3686:  ATPase involved in DNA repair ;  InterPro: IPR020958  This entry represents an N-terminal domain associated with ATPases and some uncharacterised proteins; it is approximately 450 amino acids in length and contains two conserved sequence motifs: DVF and SPNGED. 
Probab=27.61  E-value=1.3e+02  Score=32.31  Aligned_cols=58  Identities=12%  Similarity=0.072  Sum_probs=37.6

Q ss_pred             CCcEEEeCCC-eEEEeeeecCCCCe-EEEEEEcCCCceEEeecCCCCCCCCCcEEEeecCcEEEecCC
Q 010261           74 FQSLLNDTTD-TFSLGFLRVNSNQL-ALAVIHLPSSKPLWLANSTQLAPWSDRIELSFNGSLVISGPH  139 (514)
Q Consensus        74 ~~~~LvS~~g-~F~lGFf~~~~s~~-~i~i~~~~~~tvVWvANR~~Pv~~~~~l~L~~~G~LvL~d~~  139 (514)
                      +.+.+.|||| ++-.-||.+....+ .+.|+-|... |      ++|+...+ -+|-.||.|++..+.
T Consensus       309 F~r~vrSPNGEDvLYvF~~~~~g~~~Ll~YN~I~k~-v------~tPi~chG-~alf~DG~l~~fra~  368 (448)
T PF12458_consen  309 FERKVRSPNGEDVLYVFYAREEGRYLLLPYNLIRKE-V------ATPIICHG-YALFEDGRLVYFRAE  368 (448)
T ss_pred             EEEEecCCCCceEEEEEEECCCCcEEEEechhhhhh-h------cCCeeccc-eeEecCCEEEEEecC
Confidence            4456789987 67777888876553 3456545432 1      24776543 567788998887655


No 55 
>smart00765 MANEC The MANEC domain was formerly called MANSC. This domain, comprising 8 conserved cysteines, is found in the N terminus of higher multicellular animal membrane and extracellular proteins. It is postulated that this domain may play a role in the formation of protein complexes involving various protease activators and inhibitors. It is possible that some of the cysteine residues in the MANSC domain form structurally important disulfide bridges. All of the MANSC-containing proteins contain predicted transmembrane regions and signal peptides. It has been proposed that the MANSC domain in HAI-1 might function through binding with hepatocyte growth factor activator and matriptase.
Probab=27.14  E-value=89  Score=26.20  Aligned_cols=35  Identities=23%  Similarity=0.581  Sum_probs=26.5

Q ss_pred             CCCHHHHHHHhhccCCeEEEEecC--CCCcceEEEee
Q 010261          383 TSYLEQCEDLCQNNCSCWGALYNN--ASGSGFCYMLD  417 (514)
Q Consensus       383 ~~sl~~C~~~CL~nCSC~Ay~y~~--~~gsG~C~l~~  417 (514)
                      ..+.++|..+|=+.=+|..+.+..  ..+.+.|||.+
T Consensus        37 ~~s~edC~~aCC~~~~CnlAv~e~~~~~~~~~CyLf~   73 (93)
T smart00765       37 VNTWEDCVRACCSTPNCNLAVFELRREDAEGNCYLFN   73 (93)
T ss_pred             cCCHHHHHHHHcCCCCCcEEEEeccCCCCCCceEEEE
Confidence            357899999999888888887752  22356799875


No 56 
>PF07354 Sp38:  Zona-pellucida-binding protein (Sp38);  InterPro: IPR010857 This family contains a number of zona-pellucida-binding proteins that seem to be restricted to mammals. These are sperm proteins that bind to the 90 kDa family of zona pellucida glycoproteins in a calcium-dependent manner. These represent some of the specific molecules that mediate the first steps of gamete interaction, allowing fertilisation to occur.; GO: 0007339 binding of sperm to zona pellucida, 0005576 extracellular region
Probab=27.02  E-value=74  Score=31.80  Aligned_cols=33  Identities=12%  Similarity=0.235  Sum_probs=23.1

Q ss_pred             CCceEEeecCCCCCCCCCcEEEeecCcEEEecC
Q 010261          106 SSKPLWLANSTQLAPWSDRIELSFNGSLVISGP  138 (514)
Q Consensus       106 ~~tvVWvANR~~Pv~~~~~l~L~~~G~LvL~d~  138 (514)
                      +.+..|+--.+.++.+++.+.|+..|.|++.|-
T Consensus        12 DP~y~W~GP~g~~l~gn~~~nIT~TG~L~~~~F   44 (271)
T PF07354_consen   12 DPTYLWTGPNGKPLSGNSYVNITETGKLMFKNF   44 (271)
T ss_pred             CCceEEECCCCcccCCCCeEEEccCceEEeecc
Confidence            456677777777777666677777777777654


No 57 
>PF01034 Syndecan:  Syndecan domain;  InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions. Structurally, these proteins consist of four separate domains: a signal sequence; an extracellular domain (ectodomain) of variable length whose sequence is not evolutionarily conserved in the various forms of syndecans, and which contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; a transmembrane region; and a highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: Syndecan 1; Syndecan 2 or fibroglycan; Syndecan 3 or neuroglycan or N-syndecan; Syndecan 4 or amphiglycan or ryudocan; Drosophila syndecan; and Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of the phospholipid activator PIP2, and that its overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of the dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts strongly not only with fatty acyl groups but also with the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion.; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=25.59  E-value=18  Score=28.04  Aligned_cols=30  Identities=27%  Similarity=0.300  Sum_probs=0.9

Q ss_pred             eeEEeehhHHHHHHHHHHhheeeEEEeeeec
Q 010261          451 AAGIGILGGALLILIGVILFGGYKIWTSRRA  481 (514)
Q Consensus       451 ~~~i~~~~~~~~~li~~~~~~~~~i~~rrk~  481 (514)
                      +++.|+++++++++ ++++|+.|++.+|...
T Consensus        13 avIaG~Vvgll~ai-lLIlf~iyR~rkkdEG   42 (64)
T PF01034_consen   13 AVIAGGVVGLLFAI-LLILFLIYRMRKKDEG   42 (64)
T ss_dssp             -------------------------S-----
T ss_pred             HHHHHHHHHHHHHH-HHHHHHHHHHHhcCCC
Confidence            34445555444433 4567888998898763


No 58 
>PF12662 cEGF:  Complement Clr-like EGF-like
Probab=24.75  E-value=50  Score=20.46  Aligned_cols=9  Identities=33%  Similarity=0.623  Sum_probs=7.8

Q ss_pred             cccCCCCCC
Q 010261          331 CSCLDNRTD  339 (514)
Q Consensus       331 C~Cl~g~~~  339 (514)
                      |+|++||..
T Consensus         4 C~C~~Gy~l   12 (24)
T PF12662_consen    4 CSCPPGYQL   12 (24)
T ss_pred             eeCCCCCcC
Confidence            999999874


No 59 
>TIGR01477 RIFIN variant surface antigen, rifin family. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits.
Probab=24.01  E-value=16  Score=37.97  Aligned_cols=13  Identities=31%  Similarity=0.501  Sum_probs=5.1

Q ss_pred             HHHHHHHhheeeE
Q 010261          462 LILIGVILFGGYK  474 (514)
Q Consensus       462 ~~li~~~~~~~~~  474 (514)
                      ++|+.+++++++|
T Consensus       322 IVLIMvIIYLILR  334 (353)
T TIGR01477       322 IVLIMVIIYLILR  334 (353)
T ss_pred             HHHHHHHHHHHHH
Confidence            3333333444444


No 60 
>PTZ00046 rifin; Provisional
Probab=22.59  E-value=18  Score=37.65  Aligned_cols=13  Identities=31%  Similarity=0.501  Sum_probs=5.1

Q ss_pred             HHHHHHHhheeeE
Q 010261          462 LILIGVILFGGYK  474 (514)
Q Consensus       462 ~~li~~~~~~~~~  474 (514)
                      ++||.+++++++|
T Consensus       327 IVLIMvIIYLILR  339 (358)
T PTZ00046        327 IVLIMVIIYLILR  339 (358)
T ss_pred             HHHHHHHHHHHHH
Confidence            3333333444444


No 61 
>PF07172 GRP:  Glycine rich protein family;  InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress.
Probab=21.94  E-value=70  Score=26.91  Aligned_cols=7  Identities=29%  Similarity=0.472  Sum_probs=2.8

Q ss_pred             hhhHHHH
Q 010261           28 THAIQFL   34 (514)
Q Consensus        28 ~~~~~~~   34 (514)
                      +..+++|
T Consensus         3 SK~~llL    9 (95)
T PF07172_consen    3 SKAFLLL    9 (95)
T ss_pred             hhHHHHH
Confidence            3334343


No 62 
>PF12947 EGF_3:  EGF domain;  InterPro: IPR024731 This entry represents an EGF domain found in the C terminus of malarial parasite merozoite surface protein 1, as well as other proteins.; PDB: 2NPR_A 1N1I_C 1B9W_A 1YO8_A 2RHP_A.
Probab=21.80  E-value=17  Score=24.70  Aligned_cols=22  Identities=18%  Similarity=0.608  Sum_probs=14.9

Q ss_pred             CCCCCCCCcCCC--CC--cccCCCCC
Q 010261          317 PSPCGSYSLCKQ--SG--CSCLDNRT  338 (514)
Q Consensus       317 ~g~CG~~giC~~--~~--C~Cl~g~~  338 (514)
                      .+-|.++..|..  +.  |.|.+||.
T Consensus         5 ~~~C~~nA~C~~~~~~~~C~C~~Gy~   30 (36)
T PF12947_consen    5 NGGCHPNATCTNTGGSYTCTCKPGYE   30 (36)
T ss_dssp             GGGS-TTCEEEE-TTSEEEEE-CEEE
T ss_pred             CCCCCCCcEeecCCCCEEeECCCCCc
Confidence            357889999976  23  99998864


No 63 
>PF09064 Tme5_EGF_like:  Thrombomodulin like fifth domain, EGF-like;  InterPro: IPR015149 This domain adopts a fold similar to other EGF domains, with a flat major and a twisted minor beta sheet. Disulphide pairing, however, is not of the usual 1-3, 2-4, 5-6 type; rather 1-2, 3-4, 5-6 pairing is found. Its extended major sheet (strands beta-2 and beta-3 and the connecting loop) projects into thrombin's active site groove. This domain is required for interaction of thrombomodulin with thrombin, and subsequent activation of protein-C.; GO: 0004888 transmembrane signaling receptor activity, 0016021 integral to membrane
Probab=21.48  E-value=52  Score=22.24  Aligned_cols=8  Identities=25%  Similarity=0.758  Sum_probs=6.6

Q ss_pred             CcccCCCC
Q 010261          330 GCSCLDNR  337 (514)
Q Consensus       330 ~C~Cl~g~  337 (514)
                      +|.||+||
T Consensus        19 ~C~CPeGy   26 (34)
T PF09064_consen   19 QCFCPEGY   26 (34)
T ss_pred             ceeCCCce
Confidence            39999987


No 64 
>PF08374 Protocadherin:  Protocadherin;  InterPro: IPR013585 The structure of protocadherins is similar to that of classic cadherins (IPR002126 from INTERPRO), but they also have some unique features associated with the cytoplasmic domains. They are expressed in a variety of organisms and are found in high concentrations in the brain where they seem to be localised mainly at cell-cell contact sites. Their expression seems to be developmentally regulated.
Probab=21.08  E-value=25  Score=33.94  Aligned_cols=8  Identities=38%  Similarity=0.410  Sum_probs=3.0

Q ss_pred             eeEEeehh
Q 010261          451 AAGIGILG  458 (514)
Q Consensus       451 ~~~i~~~~  458 (514)
                      +++.|+++
T Consensus        42 aiVAG~~t   49 (221)
T PF08374_consen   42 AIVAGIMT   49 (221)
T ss_pred             eeecchhh
Confidence            33333333


No 65 
>PF12661 hEGF:  Human growth factor-like EGF; PDB: 2YGQ_A 2E26_A 3A7Q_A 2YGP_A 2YGO_A 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=20.82  E-value=21  Score=18.75  Aligned_cols=8  Identities=38%  Similarity=0.895  Sum_probs=5.0

Q ss_pred             cccCCCCC
Q 010261          331 CSCLDNRT  338 (514)
Q Consensus       331 C~Cl~g~~  338 (514)
                      |.|++||+
T Consensus         2 C~C~~G~~    9 (13)
T PF12661_consen    2 CQCPPGWT    9 (13)
T ss_dssp             EEE-TTEE
T ss_pred             ccCcCCCc
Confidence            77888764


No 66 
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=20.54  E-value=63  Score=28.65  Aligned_cols=14  Identities=14%  Similarity=0.290  Sum_probs=7.1

Q ss_pred             HHhheeeEEEeeee
Q 010261          467 VILFGGYKIWTSRR  480 (514)
Q Consensus       467 ~~~~~~~~i~~rrk  480 (514)
                      +.++.+++..||||
T Consensus       117 ~~~~~~yr~~r~~~  130 (139)
T PHA03099        117 CCLLSVYRFTRRTK  130 (139)
T ss_pred             HHHHhhheeeeccc
Confidence            33455566555554


No 67 
>KOG1219 consensus Uncharacterized conserved protein, contains laminin, cadherin and EGF domains [Signal transduction mechanisms]
Probab=20.33  E-value=1e+02  Score=39.90  Aligned_cols=26  Identities=27%  Similarity=0.782  Sum_probs=18.3

Q ss_pred             CCCCCCCCCCCCCcCCC-CC----cccCCCCC
Q 010261          312 DACQLPSPCGSYSLCKQ-SG----CSCLDNRT  338 (514)
Q Consensus       312 ~~Cd~~g~CG~~giC~~-~~----C~Cl~g~~  338 (514)
                      ++|. -..|---|.|+. +.    |.||+-|.
T Consensus      3865 d~C~-~npCqhgG~C~~~~~ggy~CkCpsqys 3895 (4289)
T KOG1219|consen 3865 DPCN-DNPCQHGGTCISQPKGGYKCKCPSQYS 3895 (4289)
T ss_pred             cccc-cCcccCCCEecCCCCCceEEeCccccc
Confidence            6663 467777788876 22    99997764


No 68 
>cd05852 Ig5_Contactin-1 Fifth Ig domain of contactin-1. Ig5_Contactin-1: fifth Ig domain of the neural cell adhesion molecule contactin-1. Contactins are comprised of six Ig domains followed by four fibronectin type III (FnIII) domains anchored to the membrane by glycosylphosphatidylinositol. Contactin-1 is differentially expressed in tumor tissues and may, through a RhoA mechanism, facilitate invasion and metastasis of human lung adenocarcinoma.
Probab=20.17  E-value=1.2e+02  Score=23.62  Aligned_cols=34  Identities=29%  Similarity=0.327  Sum_probs=22.1

Q ss_pred             cCCCceEEeecCCCCCCCCCcEEEeecCcEEEecC
Q 010261          104 LPSSKPLWLANSTQLAPWSDRIELSFNGSLVISGP  138 (514)
Q Consensus       104 ~~~~tvVWvANR~~Pv~~~~~l~L~~~G~LvL~d~  138 (514)
                      .|..++.|.=|.. ++..+....+..+|.|.|.+.
T Consensus        13 ~P~p~v~W~k~~~-~l~~~~r~~~~~~g~L~I~~v   46 (73)
T cd05852          13 APKPKFSWSKGTE-LLVNNSRISIWDDGSLEILNI   46 (73)
T ss_pred             eCCCEEEEEeCCE-ecccCCCEEEcCCCEEEECcC
Confidence            4566889986543 554445666666788877653


Done!