Query         psy4697
Match_columns 383
No_of_seqs    305 out of 2376
Neff          7.9 
Searched_HMMs 46136
Date          Fri Aug 16 23:32:21 2013
Command       hhsearch -i /work/01045/syshi/Psyhhblits/psy4697.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/4697hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 KOG4289|consensus              100.0 2.3E-39 4.9E-44  337.8  22.3  241    3-248   227-479 (2531)
  2 KOG4289|consensus              100.0 2.1E-36 4.6E-41  315.7  25.0  243    3-253   332-586 (2531)
  3 KOG1219|consensus              100.0 3.2E-33 6.9E-38  300.0  27.4  241    4-249   910-1165(4289)
  4 KOG1219|consensus              100.0 5.2E-33 1.1E-37  298.4  27.2  236    4-248  2535-2782(4289)
  5 cd00031 CA Cadherin repeat dom 100.0 1.6E-27 3.4E-32  214.7  26.9  185   49-237     1-198 (199)
  6 cd00031 CA Cadherin repeat dom  99.8 9.7E-20 2.1E-24  163.9  19.2  132    3-134    60-199 (199)
  7 PF00028 Cadherin:  Cadherin do  99.6 7.1E-15 1.5E-19  116.5  13.0   84   50-133     1-93  (93)
  8 KOG1834|consensus               99.6 4.2E-14 9.1E-19  141.3  18.9  196   33-236    21-242 (952)
  9 PF00028 Cadherin:  Cadherin do  99.5 1.3E-12 2.8E-17  103.5  13.2   87  147-237     1-93  (93)
 10 smart00112 CA Cadherin repeats  99.4 7.7E-13 1.7E-17  101.5   9.4   70   71-140     2-79  (79)
 11 KOG1834|consensus               99.3 1.2E-11 2.6E-16  124.0  14.0  128    4-133   102-243 (952)
 12 smart00112 CA Cadherin repeats  98.9 5.6E-09 1.2E-13   79.9   9.0   74  171-245     1-79  (79)
 13 PF08758 Cadherin_pro:  Cadheri  97.8 0.00016 3.5E-09   56.8   8.8   86   41-130     2-88  (90)
 14 PF08758 Cadherin_pro:  Cadheri  97.4  0.0013 2.8E-08   51.7   8.8   80  139-225     3-82  (90)
 15 PF08266 Cadherin_2:  Cadherin-  97.4 0.00059 1.3E-08   52.9   6.4   57   50-107     3-66  (84)
 16 TIGR01965 VCBS_repeat VCBS rep  95.9   0.063 1.4E-06   42.8   8.3   77   65-145     2-89  (99)
 17 smart00736 CADG Dystroglycan-t  95.8    0.13 2.8E-06   40.8   9.8   67   69-137    24-96  (97)
 18 PF15102 TMEM154:  TMEM154 prot  94.5   0.053 1.2E-06   46.0   4.2   34  254-287    57-90  (146)
 19 KOG0196|consensus               94.2     2.3 5.1E-05   45.7  16.4  110  103-225   403-524 (996)
 20 smart00736 CADG Dystroglycan-t  94.2    0.77 1.7E-05   36.3  10.3   65  169-237    23-92  (97)
 21 TIGR01965 VCBS_repeat VCBS rep  92.3     1.2 2.5E-05   35.6   8.3   72  166-244     2-84  (99)
 22 PF08374 Protocadherin:  Protoc  91.7    0.13 2.8E-06   46.4   2.5   37  250-286    34-70  (221)
 23 KOG3597|consensus               90.1      13 0.00028   37.8  15.2  144   27-184    24-193 (442)
 24 PF01102 Glycophorin_A:  Glycop  89.6    0.17 3.7E-06   41.9   1.3   19  254-272    65-83  (122)
 25 PF01034 Syndecan:  Syndecan do  89.0    0.18 3.8E-06   36.6   0.8   13  278-290    33-45  (64)
 26 PF08266 Cadherin_2:  Cadherin-  88.5     1.2 2.6E-05   34.4   5.3   55  148-207     4-65  (84)
 27 PF07495 Y_Y_Y:  Y_Y_Y domain;   88.1     4.5 9.8E-05   29.0   8.0   56  178-236     9-65  (66)
 28 TIGR00845 caca sodium/calcium   86.5      59  0.0013   36.2  20.8  138   37-184   394-567 (928)
 29 PF02439 Adeno_E3_CR2:  Adenovi  85.8    0.76 1.6E-05   29.7   2.3    9  257-265     6-14  (38)
 30 PF12877 DUF3827:  Domain of un  81.0       2 4.3E-05   44.9   4.4   35  253-287   268-302 (684)
 31 PF05345 He_PIG:  Putative Ig d  80.5     7.4 0.00016   26.7   5.8   37  186-222    11-48  (49)
 32 PF02439 Adeno_E3_CR2:  Adenovi  80.4     1.9 4.1E-05   27.9   2.5   25  252-276     4-28  (38)
 33 KOG1094|consensus               80.4     2.9 6.3E-05   43.7   5.2   23  251-273   388-410 (807)
 34 KOG4221|consensus               79.7 1.2E+02  0.0026   34.7  21.2   47  179-225   959-1008(1381)
 35 PF10577 UPF0560:  Uncharacteri  79.0     1.6 3.4E-05   46.9   2.9   31  254-284   273-303 (807)
 36 TIGR03660 T1SS_rpt_143 T1SS-14  78.4      41 0.00088   28.5  12.3   53   96-152    69-125 (137)
 37 PF15347 PAG:  Phosphoprotein a  77.9     2.9 6.4E-05   40.8   4.2   36  251-286    12-47  (428)
 38 PF02009 Rifin_STEVOR:  Rifin/s  76.2    0.99 2.1E-05   43.4   0.5   30  254-284   256-286 (299)
 39 PF04478 Mid2:  Mid2 like cell   74.3     2.3 5.1E-05   36.4   2.2   11  275-285    70-80  (154)
 40 PF12273 RCR:  Chitin synthesis  73.4     3.1 6.8E-05   34.8   2.8   11  277-287    19-29  (130)
 41 PF14575 EphA2_TM:  Ephrin type  72.1     2.2 4.7E-05   32.2   1.4   27  258-284     2-28  (75)
 42 PF13750 Big_3_3:  Bacterial Ig  72.0      66  0.0014   27.9  16.1  121    9-133    15-148 (158)
 43 PF05393 Hum_adeno_E3A:  Human   70.6     3.3 7.2E-05   31.9   2.0   28  259-287    36-63  (94)
 44 PF06024 DUF912:  Nucleopolyhed  68.9       5 0.00011   32.1   2.9   30  256-285    64-93  (101)
 45 PF01299 Lamp:  Lysosome-associ  68.8     3.6 7.7E-05   39.8   2.4   20  253-272   270-289 (306)
 46 PF13753 SWM_repeat:  Putative   67.8 1.2E+02  0.0026   29.2  18.5  202    8-223    11-228 (317)
 47 PF15298 AJAP1_PANP_C:  AJAP1/P  66.7      14 0.00031   33.0   5.5   89  253-350    99-192 (205)
 48 PF05083 LST1:  LST-1 protein;   65.9     2.7 5.9E-05   30.9   0.7   23  278-300    19-41  (74)
 49 PTZ00382 Variant-specific surf  64.8     4.3 9.3E-05   32.2   1.7   24  258-281    71-94  (96)
 50 PF06365 CD34_antigen:  CD34/Po  64.5      21 0.00046   32.3   6.3    7  327-333   158-164 (202)
 51 TIGR01478 STEVOR variant surfa  62.3     5.4 0.00012   37.7   2.2   14   46-59     20-33  (295)
 52 PF15330 SIT:  SHP2-interacting  62.0     7.5 0.00016   31.5   2.7   31  260-290     3-33  (107)
 53 PF02480 Herpes_gE:  Alphaherpe  61.8     2.6 5.7E-05   42.8   0.0   16  125-140   182-197 (439)
 54 PTZ00370 STEVOR; Provisional    61.6     5.7 0.00012   37.7   2.2   11  102-112    65-75  (296)
 55 PF13750 Big_3_3:  Bacterial Ig  59.9 1.2E+02  0.0025   26.3  15.3  121  108-236    14-147 (158)
 56 KOG3597|consensus               59.9      94   0.002   31.7  10.6   59  124-186    24-83  (442)
 57 PF12768 Rax2:  Cortical protei  59.0      15 0.00032   35.1   4.6   11  255-265   229-239 (281)
 58 PF15102 TMEM154:  TMEM154 prot  58.7      12 0.00027   31.9   3.5   35  252-286    58-92  (146)
 59 PTZ00046 rifin; Provisional     57.2     4.9 0.00011   39.4   1.0   30  254-284   315-345 (358)
 60 PF05568 ASFV_J13L:  African sw  57.0     6.4 0.00014   33.3   1.5   29  257-286    32-60  (189)
 61 TIGR01477 RIFIN variant surfac  56.9     5.3 0.00011   39.1   1.1   30  254-284   310-340 (353)
 62 PF07495 Y_Y_Y:  Y_Y_Y domain;   55.7      45 0.00099   23.5   5.8   55   75-133     8-66  (66)
 63 PF07204 Orthoreo_P10:  Orthore  54.3     5.2 0.00011   31.3   0.5    8  277-284    62-69  (98)
 64 PF02038 ATP1G1_PLM_MAT8:  ATP1  54.0      11 0.00025   25.9   2.1    9  261-269    22-30  (50)
 65 KOG4482|consensus               52.1      17 0.00036   35.8   3.6   40  251-290   292-332 (449)
 66 PF11980 DUF3481:  Domain of un  50.1      12 0.00027   28.6   1.9   30  253-282    15-44  (87)
 67 PF12191 stn_TNFRSF12A:  Tumour  49.5     7.1 0.00015   32.3   0.6   13  272-284    97-109 (129)
 68 PF13753 SWM_repeat:  Putative   47.4 2.3E+02  0.0049   27.2  10.9  108  108-223    11-124 (317)
 69 PF02158 Neuregulin:  Neureguli  47.2     6.3 0.00014   38.8   0.0   29  257-285     9-38  (404)
 70 PF15069 FAM163:  FAM163 family  45.9      58  0.0013   27.7   5.5    7  350-356    93-99  (143)
 71 PF06697 DUF1191:  Protein of u  45.4      34 0.00073   32.5   4.5   11   48-58     33-43  (278)
 72 PF01034 Syndecan:  Syndecan do  45.2       7 0.00015   28.4  -0.0   18  269-286    27-44  (64)
 73 PF13908 Shisa:  Wnt and FGF in  44.7      33 0.00071   30.2   4.2   19  254-272    79-97  (179)
 74 cd00146 PKD polycystic kidney   43.8 1.3E+02  0.0028   22.1   7.7   62  167-235    18-80  (81)
 75 PF15050 SCIMP:  SCIMP protein   43.3      14 0.00031   30.2   1.5   11  255-265     9-19  (133)
 76 KOG3513|consensus               41.7 5.8E+02   0.013   29.1  19.9  132   86-236   470-613 (1051)
 77 PF01102 Glycophorin_A:  Glycop  40.3      16 0.00035   30.3   1.4   27  247-273    61-87  (122)
 78 PF15234 LAT:  Linker for activ  39.9 1.3E+02  0.0028   26.8   6.8    9  305-313    53-61  (230)
 79 PF05454 DAG1:  Dystroglycan (D  39.4     9.8 0.00021   36.4   0.0   11  254-264   144-154 (290)
 80 PF04906 Tweety:  Tweety;  Inte  38.9      34 0.00074   34.5   3.8   31  255-285    20-52  (406)
 81 PF05895 DUF859:  Siphovirus pr  38.8 3.4E+02  0.0075   29.0  11.2  117   10-130   299-433 (624)
 82 PF14610 DUF4448:  Protein of u  38.5      94   0.002   27.6   6.2   17  255-271   159-175 (189)
 83 PF05510 Sarcoglycan_2:  Sarcog  38.3      22 0.00048   35.4   2.3   20  271-290   301-320 (386)
 84 KOG4433|consensus               36.5      22 0.00049   36.1   2.0   31  253-283    42-74  (526)
 85 PHA03290 envelope glycoprotein  36.4      46 0.00099   32.3   3.9   45   94-138   127-171 (357)
 86 cd05774 Ig_CEACAM_D1 First imm  36.2 1.1E+02  0.0024   24.4   5.6   34  188-221    61-94  (105)
 87 PF11857 DUF3377:  Domain of un  35.8      37 0.00081   25.5   2.5   20  254-273    30-49  (74)
 88 PF12245 Big_3_2:  Bacterial Ig  35.3 1.2E+02  0.0025   21.5   5.1   30  108-137    22-52  (60)
 89 KOG1226|consensus               35.2      68  0.0015   34.7   5.3   13  255-267   713-725 (783)
 90 PF14991 MLANA:  Protein melan-  34.8      10 0.00023   30.8  -0.5   10  274-283    42-51  (118)
 91 PF15048 OSTbeta:  Organic solu  32.9      54  0.0012   27.2   3.3   20  254-273    36-55  (125)
 92 cd05741 Ig_CEACAM_D1_like Firs  32.8 1.1E+02  0.0024   22.8   5.0   34  187-220    47-80  (92)
 93 PF03302 VSP:  Giardia variant-  32.2      38 0.00082   34.0   2.9   27  256-282   370-396 (397)
 94 PHA03283 envelope glycoprotein  31.9      34 0.00073   35.3   2.4   31  255-285   401-431 (542)
 95 PF13754 Big_3_4:  Bacterial Ig  31.0 1.8E+02  0.0039   20.0   6.0   27  208-234    22-49  (54)
 96 TIGR00845 caca sodium/calcium   30.8 4.8E+02    0.01   29.4  11.0   51   30-83    516-568 (928)
 97 KOG3488|consensus               30.1      53  0.0012   24.3   2.5   31  255-285    49-79  (81)
 98 PF10365 DUF2436:  Domain of un  29.9 2.5E+02  0.0054   23.9   6.7   82   39-121    66-156 (161)
 99 PHA03286 envelope glycoprotein  29.2      53  0.0011   33.3   3.2   11  207-217   317-327 (492)
100 PF07213 DAP10:  DAP10 membrane  29.1      71  0.0015   24.3   3.1   22  264-285    45-66  (79)
101 cd00146 PKD polycystic kidney   28.0 1.1E+02  0.0025   22.4   4.3   29  103-131    51-80  (81)
102 TIGR00864 PCC polycystin catio  27.8   1E+03   0.022   30.5  13.6  110  105-236  1480-1591(2740)
103 PF07204 Orthoreo_P10:  Orthore  26.9      26 0.00057   27.5   0.5   25  257-281    45-69  (98)
104 PF14979 TMEM52:  Transmembrane  26.5      74  0.0016   27.2   3.1   10  274-283    40-50  (154)
105 PHA03281 envelope glycoprotein  25.7      72  0.0016   33.1   3.5   24  110-134   312-335 (642)
106 TIGR03778 VPDSG_CTERM VPDSG-CT  25.1      81  0.0018   18.7   2.2   11  266-276    10-20  (26)
107 PF13965 SID-1_RNA_chan:  dsRNA  24.8 3.5E+02  0.0075   28.7   8.5   24  193-217    58-81  (570)
108 PRK14081 triple tyrosine motif  24.8 9.1E+02    0.02   26.2  21.1  189    8-224    63-270 (667)
109 smart00089 PKD Repeats in poly  24.5 1.8E+02  0.0038   21.2   4.8   28  206-236    51-78  (79)
110 cd05762 Ig8_MLCK Eighth immuno  24.1 3.4E+02  0.0073   20.9  11.1   37   99-137    59-95  (98)
111 COG4288 Uncharacterized protei  23.5 1.2E+02  0.0026   24.5   3.7   47    3-59     52-98  (124)
112 PHA03283 envelope glycoprotein  23.4 4.2E+02  0.0091   27.6   8.3   38  251-288   394-431 (542)
113 PF00558 Vpu:  Vpu protein;  In  23.3      78  0.0017   24.3   2.5    6  278-283    29-34  (81)
114 KOG3637|consensus               23.0   1E+02  0.0022   35.1   4.4   23  257-279   980-1002(1030)
115 PF11395 DUF2873:  Protein of u  22.9      80  0.0017   20.3   2.0    9  264-272    17-25  (43)
116 TIGR03660 T1SS_rpt_143 T1SS-14  22.9 2.7E+02  0.0058   23.6   5.9   44   10-59     86-129 (137)
117 cd05775 Ig_SLAM-CD84_like_N N-  22.7 1.9E+02  0.0041   22.3   4.8   31  190-220    53-84  (97)
118 smart00089 PKD Repeats in poly  22.6 1.9E+02  0.0042   21.0   4.6   31  102-132    48-78  (79)
119 PF13584 BatD:  Oxygen toleranc  22.0 8.5E+02   0.018   24.8  14.3   16  178-193   339-354 (484)
120 PHA03265 envelope glycoprotein  21.7      41  0.0009   32.9   0.9   51   40-91     42-93  (402)
121 PHA03099 epidermal growth fact  21.5   1E+02  0.0022   25.8   2.9    6  278-283   122-127 (139)
122 PF05399 EVI2A:  Ectropic viral  21.1      41 0.00088   30.5   0.7   14  262-275   142-155 (227)
123 PF08391 Ly49:  Ly49-like prote  20.8      33 0.00071   28.4   0.0   22  255-276     6-27  (119)
124 PLN03150 hypothetical protein;  20.5      88  0.0019   33.4   3.2   12  256-267   547-558 (623)
125 PF15065 NCU-G1:  Lysosomal tra  20.4      51  0.0011   32.5   1.2   12  108-119   128-139 (350)
126 PF02124 Marek_A:  Marek's dise  20.3 5.3E+02   0.012   23.5   7.6   20  211-230   152-171 (211)

No 1  
>KOG4289|consensus
Probab=100.00  E-value=2.3e-39  Score=337.76  Aligned_cols=241  Identities=29%  Similarity=0.374  Sum_probs=225.3

Q ss_pred             CCcCCCCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC---CeeEE
Q psy4697           3 NEDDFLQPITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD---LRVKY   79 (383)
Q Consensus         3 ~DrE~~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~---~~v~Y   79 (383)
                      .|||.++.+.|.|+|.|++.|.++++++|+|.|.|.|||.|+|+|.+|.-++.||.+.|+.|++|+|+|.|.   +++.|
T Consensus       227 lDREt~e~HvlrVtA~d~~~P~~SAtttv~V~V~D~nDhsPvFEq~~Y~e~lREn~evGy~vLtvrAtD~Dsp~Nani~Y  306 (2531)
T KOG4289|consen  227 LDRETKETHVLRVTAQDHGDPRRSATTTVTVLVLDTNDHSPVFEQDEYREELRENLEVGYEVLTVRATDGDSPPNANIRY  306 (2531)
T ss_pred             hhhhhhheeEEEEEeeecCCCcccceeEEEEEEeecCCCCcccchhHHHHHHhhccccCceEEEEEeccCCCCCCCceEE
Confidence            489999999999999999999999999999999999999999999999999999999999999999999996   89999


Q ss_pred             EEec-CCCCCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC---ceeEEEEEEEEeecCCCCCcccCCceeEEEecc
Q psy4697          80 WLSN-DYGERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL---MTTSATVNISVVNVNDWDPRFRYPQYELFLPHI  154 (383)
Q Consensus        80 si~~-~~~~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~---~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~  154 (383)
                      ++.+ +..+.|.||+ +|.|.+..+||||+...|++.|+|.|.|   ...++.|.|+|+|+|||+|+|....|.+.|.  
T Consensus       307 rl~eg~~~~~f~in~rSGvI~T~a~lDRE~~~~y~L~VeAsDqG~~pgp~Ta~V~itV~D~NDNaPqFse~~Yvvqv~--  384 (2531)
T KOG4289|consen  307 RLLEGNAKNVFEINPRSGVISTRAPLDREELESYQLDVEASDQGRPPGPRTAMVEITVEDENDNAPQFSEKRYVVQVR--  384 (2531)
T ss_pred             EecCCCccceeEEcCccceeeccCccCHHhhhheEEEEEeccCCCCCCCceEEEEEEEEecCCCCccccccceEEEec--
Confidence            9998 4778999997 9999999999999999999999999976   3459999999999999999999999999999  


Q ss_pred             CCCCCCCCceEEEEEeeeCCCCC--eEEEEEe-CCCCCCEEEcC-CCcEEEeccCCCCcceEEEEEEEeeCCCCCceeEE
Q psy4697         155 PLADLTPGSVIGKVEAADGDKGD--RVTLSLR-GPYEKMFSIND-SGHISIVDLSALNTSTIQLVVVATDTGNPPRQASV  230 (383)
Q Consensus       155 ~~e~~~~g~~v~~v~A~D~D~g~--~i~ysi~-~~~~~~F~i~~-tG~i~l~~~~~~~~~~y~L~V~a~D~g~p~~sst~  230 (383)
                        |+..+++.|.+|+|+|.|.|.  .+.|+|. |+..+.|.|+. +|++.+..+.+++...|.|.|.|+|+|.|++++++
T Consensus       385 --Edvt~~avvlrV~AtDrD~g~Ng~VHYsi~Sgn~~G~f~id~~tGel~vv~plD~e~~~ytl~IrAqDggrPpLsn~s  462 (2531)
T KOG4289|consen  385 --EDVTPPAVVLRVTATDRDKGTNGKVHYSIASGNGRGQFYIDSLTGELDVVEPLDFENSEYTLRIRAQDGGRPPLSNTS  462 (2531)
T ss_pred             --ccCCCCceEEEEEecccCCCcCceEEEEeeccCccccEEEecccceEEEeccccccCCeeEEEEEcccCCCCCccCCC
Confidence              999999999999999999986  6999997 77889999997 99999887655555599999999999999999999


Q ss_pred             EEEEEEeCCcccCccccc
Q psy4697         231 PAIMHFPEAIVQQASSKL  248 (383)
Q Consensus       231 tv~I~v~~~~~~~~p~~~  248 (383)
                      -|.|+| -+.|+++|.|.
T Consensus       463 gl~iqV-lDINDhaPifv  479 (2531)
T KOG4289|consen  463 GLVIQV-LDINDHAPIFV  479 (2531)
T ss_pred             ceEEEE-EecCCCCceeE
Confidence            999999 77888888764


No 2  
>KOG4289|consensus
Probab=100.00  E-value=2.1e-36  Score=315.72  Aligned_cols=243  Identities=23%  Similarity=0.353  Sum_probs=222.5

Q ss_pred             CCcCCCCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC---CeeEE
Q psy4697           3 NEDDFLQPITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD---LRVKY   79 (383)
Q Consensus         3 ~DrE~~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~---~~v~Y   79 (383)
                      .|||..+.|+|.|.|.|.|.++.-.++.|.|.|.|+|||+|+|....|.+.|.||..+++.|++|+|+|.|.   +.|.|
T Consensus       332 lDRE~~~~y~L~VeAsDqG~~pgp~Ta~V~itV~D~NDNaPqFse~~Yvvqv~Edvt~~avvlrV~AtDrD~g~Ng~VHY  411 (2531)
T KOG4289|consen  332 LDREELESYQLDVEASDQGRPPGPRTAMVEITVEDENDNAPQFSEKRYVVQVREDVTPPAVVLRVTATDRDKGTNGKVHY  411 (2531)
T ss_pred             cCHHhhhheEEEEEeccCCCCCCCceEEEEEEEEecCCCCccccccceEEEecccCCCCceEEEEEecccCCCcCceEEE
Confidence            489999999999999999999887899999999999999999999999999999999999999999999995   89999


Q ss_pred             EEec-CCCCCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC---ceeEEEEEEEEeecCCCCCcccCCceeEEEecc
Q psy4697          80 WLSN-DYGERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL---MTTSATVNISVVNVNDWDPRFRYPQYELFLPHI  154 (383)
Q Consensus        80 si~~-~~~~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~---~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~  154 (383)
                      +|.+ +..+.|.||. +|+|.+..+||+|.. .|.+.|.|.|+|   ++.+.-+.|+|+|+|||+|.|...++..++.  
T Consensus       412 si~Sgn~~G~f~id~~tGel~vv~plD~e~~-~ytl~IrAqDggrPpLsn~sgl~iqVlDINDhaPifvstpfq~tvl--  488 (2531)
T KOG4289|consen  412 SIASGNGRGQFYIDSLTGELDVVEPLDFENS-EYTLRIRAQDGGRPPLSNTSGLVIQVLDINDHAPIFVSTPFQATVL--  488 (2531)
T ss_pred             EeeccCccccEEEecccceEEEeccccccCC-eeEEEEEcccCCCCCccCCCceEEEEEecCCCCceeEechhhhhhh--
Confidence            9987 6677899996 999999999999998 999999999987   6777777899999999999999999989999  


Q ss_pred             CCCCCCCCceEEEEEeeeCCCCC--eEEEEEeCCCCCCEEEcC-CCcEEEec-cCCCCcceEEEEEEEeeCCCCCceeEE
Q psy4697         155 PLADLTPGSVIGKVEAADGDKGD--RVTLSLRGPYEKMFSIND-SGHISIVD-LSALNTSTIQLVVVATDTGNPPRQASV  230 (383)
Q Consensus       155 ~~e~~~~g~~v~~v~A~D~D~g~--~i~ysi~~~~~~~F~i~~-tG~i~l~~-~~~~~~~~y~L~V~a~D~g~p~~sst~  230 (383)
                        |+.+.|..+..+.|.|.|+|+  .+.|++.|-  +.|.|+. +|.|...+ ++++....|.|.|.|+|+|.|++++.+
T Consensus       489 --Env~lg~~v~~vqaidadsg~na~l~y~laG~--~pf~I~~~SG~Itvtk~ldrEt~~~ysl~V~ard~gtp~l~tst  564 (2531)
T KOG4289|consen  489 --ENVPLGYLVCHVQAIDADSGENARLHYSLAGV--GPFQINNGSGWITVTKELDRETVEHYSLGVEARDHGTPPLSTST  564 (2531)
T ss_pred             --hcccccceEEEEecccCCCCcccceeeeeccC--CCeeEecCCceEEEeecccccccceEEEEEEEcCCCCCcccccc
Confidence              999999999999999999997  589998754  3899997 99999766 488889999999999999999999999


Q ss_pred             EEEEEEeCCcccCcccccCCCCc
Q psy4697         231 PAIMHFPEAIVQQASSKLNSGTS  253 (383)
Q Consensus       231 tv~I~v~~~~~~~~p~~~~~~~~  253 (383)
                      .|.|.+ .++|++.|.|....+.
T Consensus       565 sI~Vtv-~dvndndP~Ft~~eyt  586 (2531)
T KOG4289|consen  565 SISVTV-LDVNDNDPTFTQKEYT  586 (2531)
T ss_pred             eEEEEe-cccCCCCCccccCceE
Confidence            999999 7888887777544333


No 3  
>KOG1219|consensus
Probab=100.00  E-value=3.2e-33  Score=300.03  Aligned_cols=241  Identities=22%  Similarity=0.289  Sum_probs=224.3

Q ss_pred             CcCCCCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCC--CCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC---CeeE
Q psy4697           4 EDDFLQPITLVVRAIQYDNQDRYALATLIVSKAGTSLR--ELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD---LRVK   78 (383)
Q Consensus         4 DrE~~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn--~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~---~~v~   78 (383)
                      |.|+.+-|+|.|.|.|+|.|.+++..++.|.+.|+|+|  ||.|..-.-+..|.||+|.|+.++.+.|.|.|.   +.++
T Consensus       910 Df~k~~fynLsv~a~d~g~p~lss~chl~Vevldv~enlhpp~F~~~v~e~~V~EnapiGT~vi~i~A~dedsgldg~l~  989 (4289)
T KOG1219|consen  910 DFEKSDFYNLSVTAVDRGTPILSSICHLEVEVLDVNENLHPPEFISFVTEGHVLENAPIGTIVIRIQARDEDSGLDGELS  989 (4289)
T ss_pred             ccccccceEEEEEEecCCCcceeeeEEEEEEEeccCCCCCCcchheeeeeeeEeecCCcceEEEEEEEecCCCCccceEE
Confidence            56999999999999999999999999999999999888  999998888999999999999999999999996   8999


Q ss_pred             EEEec-CCCCCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC---ceeEEEEEEEEeecCCCCCcccCCceeEEEec
Q psy4697          79 YWLSN-DYGERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL---MTTSATVNISVVNVNDWDPRFRYPQYELFLPH  153 (383)
Q Consensus        79 Ysi~~-~~~~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~---~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~  153 (383)
                      |+|.. +..+.|+||+ +|.|++.+.||||....|.|+|.|+|.|   +++.+.|.|.|+|+|||+|+|..+.|..+|. 
T Consensus       990 Y~I~~gdg~g~FsId~~tG~irTl~~lDrE~ks~YwltveA~D~gt~~~ssv~~vyI~ieDvNDn~Pq~s~pvy~asI~- 1068 (4289)
T KOG1219|consen  990 YKIRTGDGDGIFSIDSTTGSIRTLKALDREKKSSYWLTVEAKDLGTVPLSSVCEVYIEIEDVNDNVPQFSSPVYYASIS- 1068 (4289)
T ss_pred             EEEEcCCcceeEEecCCcceEeechhhchhhcceEEEEEEEEecCCCccccceeEEEEEEecCCCCcccCCceEeeeec-
Confidence            99987 6677899995 9999999999999999999999999987   7889999999999999999999999999999 


Q ss_pred             cCCCCCCCCceEEEEEeeeCCCC--CeEEEEEe-CCCCCCEEEcC-CCcEEE-eccCCCCcceEEEEEEEeeCCCCCcee
Q psy4697         154 IPLADLTPGSVIGKVEAADGDKG--DRVTLSLR-GPYEKMFSIND-SGHISI-VDLSALNTSTIQLVVVATDTGNPPRQA  228 (383)
Q Consensus       154 ~~~e~~~~g~~v~~v~A~D~D~g--~~i~ysi~-~~~~~~F~i~~-tG~i~l-~~~~~~~~~~y~L~V~a~D~g~p~~ss  228 (383)
                         |+++.+..|.++.|.|+|+.  .++.|.|. |+..++|.|++ +|-|.+ ++++++.+.++.|.|.++|.|.|++.+
T Consensus      1069 ---enSp~~vsivq~ea~D~Dsssn~kLmykI~sGnyq~FF~Id~~TG~iTt~r~LDRE~qdEHiLeVTi~D~gep~l~s 1145 (4289)
T KOG1219|consen 1069 ---ENSPETVSIVQAEANDPDSSSNQKLMYKITSGNYQGFFQIDPETGLITTIRRLDREKQDEHILEVTIQDNGEPWLCS 1145 (4289)
T ss_pred             ---cCCCCceEEEEeccCCCCcccCcceEEEEccCCccceEEEccccceeeeehhhcccccccceEEEEEecCCCCcccc
Confidence               99999999999999999954  38999997 88999999998 999984 556999999999999999999999999


Q ss_pred             EEEEEEEEeCCcccCcccccC
Q psy4697         229 SVPAIMHFPEAIVQQASSKLN  249 (383)
Q Consensus       229 t~tv~I~v~~~~~~~~p~~~~  249 (383)
                      .+.|.|.| .+.|++.|.|..
T Consensus      1146 ~~rviV~I-ldvNdnsp~Flq 1165 (4289)
T KOG1219|consen 1146 NQRVIVSI-LDVNDNSPRFLQ 1165 (4289)
T ss_pred             ceEEEEEE-eeccCCchhhhh
Confidence            99999999 778888777653


No 4  
>KOG1219|consensus
Probab=100.00  E-value=5.2e-33  Score=298.41  Aligned_cols=236  Identities=26%  Similarity=0.364  Sum_probs=208.6

Q ss_pred             CcCCCCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCCCeeEEEEe-
Q psy4697           4 EDDFLQPITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRDLRVKYWLS-   82 (383)
Q Consensus         4 DrE~~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~~~v~Ysi~-   82 (383)
                      |++....|.|.|+|+|.|.|++.+.++|.|+|.+..++.|+|+.+.|.|+|+|+.+.|+.|++|+|.|.|. .+-|++. 
T Consensus      2535 ~~~en~tl~l~vkA~D~g~P~~~s~ttV~v~vl~e~v~lPrFSep~y~fsvpEDv~vG~~Ig~v~a~~a~~-~~i~~~v~ 2613 (4289)
T KOG1219|consen 2535 DGLENSTLHLFVKAIDDGKPRRRSNTTVIVTVLPEDVNLPRFSEPIYTFSVPEDVPVGEEIGQVSASDADE-HVIYSLVL 2613 (4289)
T ss_pred             hcccCcEEEEEEEeccCCCCCcccceEEEEEecCcccCcccccCceEEEeccccCCCCCeeeEEeecccCC-ceEEEEEe
Confidence            67888999999999999999999999999999999999999999999999999999999999999999885 4455553 


Q ss_pred             c-----CCCCCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC-ceeEEEEEEEEeecCCCCCcccCCceeEEEeccC
Q psy4697          83 N-----DYGERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL-MTTSATVNISVVNVNDWDPRFRYPQYELFLPHIP  155 (383)
Q Consensus        83 ~-----~~~~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~-~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~  155 (383)
                      +     +....|++|. +|.|.+.++||+|..++|++.|.|++++ ..+.++|.|.|.|+|||+|+|..+.|.+.+.   
T Consensus      2614 ~gt~Esn~d~~Fsvdr~TG~i~v~ksLD~E~kk~yqi~v~a~~~~~vva~tsv~vqVkDvNDNaPvFe~d~y~f~i~--- 2690 (4289)
T KOG1219|consen 2614 GGTPESNPDLPFSVDRNTGMIKVNKSLDHEKKKSYQIKVKATCGQWVVAETSVFVQVKDVNDNAPVFEKDPYLFIIE--- 2690 (4289)
T ss_pred             CCCCCCCCCCceEEcCCCceEEeccccchhhhceEEEEEEeecCCceEEEEEEEEEeecccCCCccccCCceeEEEe---
Confidence            3     3445699995 9999999999999999999999999987 4889999999999999999999999999999   


Q ss_pred             CCCCCCCceEEEEEeeeCCCCC--eEEEEEeCCCCCCEEEcC-CCcEEEec-cCCCCcceEEEEEEEeeCCCCCceeEEE
Q psy4697         156 LADLTPGSVIGKVEAADGDKGD--RVTLSLRGPYEKMFSIND-SGHISIVD-LSALNTSTIQLVVVATDTGNPPRQASVP  231 (383)
Q Consensus       156 ~e~~~~g~~v~~v~A~D~D~g~--~i~ysi~~~~~~~F~i~~-tG~i~l~~-~~~~~~~~y~L~V~a~D~g~p~~sst~t  231 (383)
                       |+.+.|+.|.+++|.|.|+|.  +++|++... ..+|.|++ +|+|.+.. ++.+.+..|.|.|.|+|+|.|+.  .++
T Consensus      2691 -En~pvGtsV~qf~AsD~Ds~~nGqirysl~~~-v~yF~In~etGwlTt~~eld~ek~d~y~lkv~AtDhG~~ss--q~~ 2766 (4289)
T KOG1219|consen 2691 -ENSPVGTSVIQFHASDMDSGNNGQIRYSLTSP-VPYFAINPETGWLTTLFELDLEKQDLYSLKVVATDHGVPSS--QAT 2766 (4289)
T ss_pred             -ccCCCCceEEEEEeeccCCCCCceEEEEEcCC-cceEEEcCCCCeeeehhhhccccCCceEEEEEEecCCcccc--cce
Confidence             999999999999999999986  799999854 44999997 99998654 56677999999999999999854  455


Q ss_pred             EEEEEeCCcccCccccc
Q psy4697         232 AIMHFPEAIVQQASSKL  248 (383)
Q Consensus       232 v~I~v~~~~~~~~p~~~  248 (383)
                      +.|.| .+.|+.+|.|.
T Consensus      2767 v~v~v-tDvndspprf~ 2782 (4289)
T KOG1219|consen 2767 VLVHV-TDVNDSPPRFQ 2782 (4289)
T ss_pred             EEEEE-EecCCCcchhh
Confidence            55555 56777766654


No 5  
>cd00031 CA Cadherin repeat domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion; these domains occur as repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium; cadherins play a role in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherins, desmocollin, and desmoglein; they exist as monomers or dimers (hetero- and homo-); two copies of the repeat are present here
Probab=99.96  E-value=1.6e-27  Score=214.73  Aligned_cols=185  Identities=36%  Similarity=0.537  Sum_probs=167.2

Q ss_pred             cEEEEEECCCCCCcEEEEEEeEeCCC---CeeEEEEecCCC-CCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC-c
Q psy4697          49 EYSVSALENLPVNYVLLTVTTNKPRD---LRVKYWLSNDYG-ERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL-M  122 (383)
Q Consensus        49 ~y~~~V~En~~~gt~v~~v~A~D~D~---~~v~Ysi~~~~~-~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~-~  122 (383)
                      .|.+.|.||.++|+.++++.|.|+|.   +.++|+|.+... .+|.|++ +|.|++.+.||||....|.|.|+|.|.| .
T Consensus         1 ~~~~~i~En~~~g~~v~~~~a~D~D~~~~~~~~y~i~~~~~~~~F~i~~~tG~l~~~~~lD~e~~~~~~l~v~a~D~g~~   80 (199)
T cd00031           1 SYSVSVPENAPPGTVVGTVSATDPDSGENGRVTYSILGGNEDGLFSIDPNTGVITTTKPLDREEQSEYTLTVVASDGGGP   80 (199)
T ss_pred             CeEEEEeCCCCCCCEEEEEEEECCCCCCCceEEEEEeCCCCcccEEEeCCCCEEEECCCCCCcCCceEEEEEEEEECCcC
Confidence            47899999999999999999999997   579999998554 7999997 8999999999999999999999999954 3


Q ss_pred             --eeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCCCceEEEEEeeeCCCC--CeEEEEEeCCCC-CCEEEcC-C
Q psy4697         123 --TTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTPGSVIGKVEAADGDKG--DRVTLSLRGPYE-KMFSIND-S  196 (383)
Q Consensus       123 --~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~g~~v~~v~A~D~D~g--~~i~ysi~~~~~-~~F~i~~-t  196 (383)
                        ++...++|.|.|+|||+|.|..+.|.+.+.    |+.++|+.++++.|+|+|.+  ..++|+|.+... .+|.|++ +
T Consensus        81 ~~~~~~~v~I~V~d~Nd~~P~~~~~~~~~~v~----e~~~~~~~i~~~~a~D~D~~~~~~~~y~l~~~~~~~~f~i~~~~  156 (199)
T cd00031          81 PLSSTATVTVTVLDVNDNPPVFEQSSYEASVP----ENAPPGTVVGTVTATDADSGENAKLTYSILSGNDKELFSIDPNT  156 (199)
T ss_pred             cceeEEEEEEEEccCCCCCCcccccceEEEEe----CCCCCCCEEEEEEEEcCCCCCCccEEEEEeCCCCCCEEEEeCCc
Confidence              388999999999999999999888999999    99999999999999999986  589999986544 7999998 9


Q ss_pred             CcEEEec-cCCCCcceEEEEEEEeeCCCCCceeEEEEEEEEe
Q psy4697         197 GHISIVD-LSALNTSTIQLVVVATDTGNPPRQASVPAIMHFP  237 (383)
Q Consensus       197 G~i~l~~-~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v~  237 (383)
                      |.|.+.. .+++....|.|.|.|+|.+.|.++++++++|.+.
T Consensus       157 G~i~~~~~ld~e~~~~~~l~v~a~D~~~~~~~~~~~i~i~v~  198 (199)
T cd00031         157 GIITLAKPLDREEKSSYELTVVATDGGGPPLSSTATVTVTVL  198 (199)
T ss_pred             eEEEeCCccCCccCceEEEEEEEEECCCCCceeEEEEEEEEE
Confidence            9998875 4677777999999999999999999999999884


No 6  
>cd00031 CA Cadherin repeat domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion; these domains occur as repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium; cadherins play a role in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherins, desmocollin, and desmoglein; they exist as monomers or dimers (hetero- and homo-); two copies of the repeat are present here
Probab=99.85  E-value=9.7e-20  Score=163.90  Aligned_cols=132  Identities=30%  Similarity=0.332  Sum_probs=123.1

Q ss_pred             CCcCCCCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC---CeeEE
Q psy4697           3 NEDDFLQPITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD---LRVKY   79 (383)
Q Consensus         3 ~DrE~~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~---~~v~Y   79 (383)
                      .|||..+.|.|.|+|.|.|.|.+++...|.|.|.|+|||+|.|....|.+.|.|+.++|+.++++.|+|+|.   +.++|
T Consensus        60 lD~e~~~~~~l~v~a~D~g~~~~~~~~~v~I~V~d~Nd~~P~~~~~~~~~~v~e~~~~~~~i~~~~a~D~D~~~~~~~~y  139 (199)
T cd00031          60 LDREEQSEYTLTVVASDGGGPPLSSTATVTVTVLDVNDNPPVFEQSSYEASVPENAPPGTVVGTVTATDADSGENAKLTY  139 (199)
T ss_pred             CCCcCCceEEEEEEEEECCcCcceeEEEEEEEEccCCCCCCcccccceEEEEeCCCCCCCEEEEEEEEcCCCCCCccEEE
Confidence            488999999999999999888888999999999999999999999999999999999999999999999996   89999


Q ss_pred             EEecCCC-CCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC---ceeEEEEEEEEee
Q psy4697          80 WLSNDYG-ERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL---MTTSATVNISVVN  134 (383)
Q Consensus        80 si~~~~~-~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~---~~s~~tV~I~V~D  134 (383)
                      +|.+... .+|.|+. +|.|++.+.||||....|.+.|.|+|.+   +++.++++|.|.|
T Consensus       140 ~l~~~~~~~~f~i~~~~G~i~~~~~ld~e~~~~~~l~v~a~D~~~~~~~~~~~i~i~v~d  199 (199)
T cd00031         140 SILSGNDKELFSIDPNTGIITLAKPLDREEKSSYELTVVATDGGGPPLSSTATVTVTVLD  199 (199)
T ss_pred             EEeCCCCCCEEEEeCCceEEEeCCccCCccCceEEEEEEEEECCCCCceeEEEEEEEEEC
Confidence            9998654 7999997 9999999999999999999999999974   7888889988875


No 7  
>PF00028 Cadherin:  Cadherin domain;  InterPro: IPR002126 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism and that modulate a wide variety of processes including cell polarisation and migration. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; a single transmembrane domain and five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats, and the fifth contains 4 conserved cysteines and an N-terminal cytoplasmic domain. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats.
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding, to confer rigidity upon the extracellular domain and is essential for cadherin adhesive function and for protection against protease digestion.; GO: 0005509 calcium ion binding, 0007156 homophilic cell adhesion, 0016020 membrane; PDB: 2A4E_A 2A4C_B 2O72_A 2QVI_A 1NCJ_A 3Q2W_A 3Q2N_A 3LNH_B 3LNI_A 3Q2L_A ....
Probab=99.63  E-value=7.1e-15  Score=116.52  Aligned_cols=84  Identities=40%  Similarity=0.539  Sum_probs=77.6

Q ss_pred             EEEEEECCCCCCcEEEEEEeEeCCC---CeeEEEEecCC-CCCEEEcC-cccEEEcccCCcccCcEEEEEEEEecC-C--
Q psy4697          50 YSVSALENLPVNYVLLTVTTNKPRD---LRVKYWLSNDY-GERFSISR-QGDISLMQCLDYETEDSYRFTVYATDT-L--  121 (383)
Q Consensus        50 y~~~V~En~~~gt~v~~v~A~D~D~---~~v~Ysi~~~~-~~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~-~--  121 (383)
                      |+++|+||.++|+.++++.|.|+|.   +.+.|+|.+.. ..+|.|++ +|.|++.+.||||..+.|.|.|.|+|. +  
T Consensus         1 Y~~~v~E~~~~g~~v~~v~a~D~D~~~n~~i~y~i~~~~~~~~F~I~~~tg~i~~~~~LD~E~~~~y~l~v~a~D~~~~~   80 (93)
T PF00028_consen    1 YSFSVPENAPPGTVVGQVTATDPDSGPNSQITYSILGGNPDGLFSIDPNTGEISLKKPLDRETQSSYQLTVRATDSGGSP   80 (93)
T ss_dssp             EEEEEETTGSTSSEEEEEEEEESSTSTTSSEEEEEEETTSTTSEEEETTTTEEEESSSSCTTTTSEEEEEEEEEETTTSS
T ss_pred             CEEEEECCCCCCCEEEEEEEEeCCCCCCceEEEEEecCcccCceEEeeeeeccccceecCcccCCEEEEEEEEEECCCCC
Confidence            8999999999999999999999994   89999999854 78999997 999999999999999999999999998 4  


Q ss_pred             -ceeEEEEEEEEe
Q psy4697         122 -MTTSATVNISVV  133 (383)
Q Consensus       122 -~~s~~tV~I~V~  133 (383)
                       ++++++|.|+|+
T Consensus        81 ~~~~~~~V~I~V~   93 (93)
T PF00028_consen   81 PLSSTATVTINVL   93 (93)
T ss_dssp             EEEEEEEEEEEEE
T ss_pred             CCEEEEEEEEEEC
Confidence             778888888874


No 8  
>KOG1834|consensus
Probab=99.60  E-value=4.2e-14  Score=141.33  Aligned_cols=196  Identities=22%  Similarity=0.265  Sum_probs=151.9

Q ss_pred             EEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC-----Ce-eEEEEecCCCCCEEE---cC---cccEEEc
Q psy4697          33 VSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD-----LR-VKYWLSNDYGERFSI---SR---QGDISLM  100 (383)
Q Consensus        33 V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~-----~~-v~Ysi~~~~~~~F~I---d~---tG~I~~~  100 (383)
                      .....+|-+.|. -...|+.-|.||.-.-...--+.|-|.|.     |+ .-|.|-+.+ -.|.+   |.   .|.|+++
T Consensus        21 ~~aarankhkpw-ie~ey~gvV~Endntvll~Ppl~aLdkdaplr~ageiC~fklhgq~-vPFdavVvdK~TGegvlRaK   98 (952)
T KOG1834|consen   21 HHAARANKHKPW-IEEEYHGVVTENDNTVLLDPPLAALDKDAPLRYAGEICGFKLHGQP-VPFDAVVVDKYTGEGVLRAK   98 (952)
T ss_pred             cccccccccCcc-cccceeEEEEeCCceEEeCCCeeeecCCCCcccccccceeEecCCC-CCceEEEEeccCCceEEeec
Confidence            334466777774 56789999999964323333567778774     33 346665532 24654   53   5679999


Q ss_pred             ccCCcccCcEEEEEEEEecCC---------ceeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCCCceEEEEEee
Q psy4697         101 QCLDYETEDSYRFTVYATDTL---------MTTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTPGSVIGKVEAA  171 (383)
Q Consensus       101 ~~LD~E~~~~y~~~V~A~D~~---------~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~g~~v~~v~A~  171 (383)
                      .+||.|.++.|+|+|+|-|-|         .+..++|+|+|.|+|+++|+|..+.|.+.|.    |. +.-..|+++.|.
T Consensus        99 ~~lDCelqkeytf~iQAydCg~gpdgtn~kKShkatvhIrVkDvNe~AP~f~ep~Yka~V~----EG-K~yd~il~veAi  173 (952)
T KOG1834|consen   99 EPLDCELQKEYTFTIQAYDCGNGPDGTNTKKSHKATVHIRVKDVNEFAPVFKEPWYKAHVT----EG-KVYDSILRVEAI  173 (952)
T ss_pred             CcccccccccceEEEEEEecCCCCCccccccccceEEEEEeccccccCchhcccceeeEEe----cc-eeeeeeEEEEee
Confidence            999999999999999999832         4668999999999999999999999999998    54 345678899999


Q ss_pred             eCCCCC----eEEEEEeCCCCCCEEEcCCCcEEEec-cCCCCcceEEEEEEEeeCCCCCceeEEEEEEEE
Q psy4697         172 DGDKGD----RVTLSLRGPYEKMFSINDSGHISIVD-LSALNTSTIQLVVVATDTGNPPRQASVPAIMHF  236 (383)
Q Consensus       172 D~D~g~----~i~ysi~~~~~~~F~i~~tG~i~l~~-~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v  236 (383)
                      |.|-+.    -..|.|. +.+-.|.||..|.|+... +.+-....|.|+|+|.|.|......-+.|+|.|
T Consensus       174 D~DCspq~sqIC~YEI~-t~d~PFaIdn~G~irnTekLny~ke~~Y~ltVtAyDCg~kraa~d~lV~v~V  242 (952)
T KOG1834|consen  174 DKDCSPQYSQICEYEIT-TPDVPFAIDNDGNIRNTEKLNYTKEHQYKLTVTAYDCGKKRAASDSLVTVHV  242 (952)
T ss_pred             cCCCCCcccceeEEEec-CCCCceEEcCCCccccccccccccceeEEEEEEEEecccccccCcceEEEEe
Confidence            999764    4789997 466689999999998655 455567899999999999987766667888888


No 9  
>PF00028 Cadherin:  Cadherin domain;  InterPro: IPR002126 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism and modulate a wide variety of processes, including cell polarisation and migration. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats and the fifth of which contains 4 conserved cysteines; a single transmembrane domain; and a C-terminal cytoplasmic domain. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats. 
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and are essential for cadherin adhesive function and for protection against protease digestion.; GO: 0005509 calcium ion binding, 0007156 homophilic cell adhesion, 0016020 membrane; PDB: 2A4E_A 2A4C_B 2O72_A 2QVI_A 1NCJ_A 3Q2W_A 3Q2N_A 3LNH_B 3LNI_A 3Q2L_A ....
Probab=99.47  E-value=1.3e-12  Score=103.51  Aligned_cols=87  Identities=36%  Similarity=0.630  Sum_probs=78.3

Q ss_pred             eeEEEeccCCCCCCCCceEEEEEeeeCCCCC--eEEEEEeC-CCCCCEEEcC-CCcEEEec-cCCCCcceEEEEEEEeeC
Q psy4697         147 YELFLPHIPLADLTPGSVIGKVEAADGDKGD--RVTLSLRG-PYEKMFSIND-SGHISIVD-LSALNTSTIQLVVVATDT  221 (383)
Q Consensus       147 ~~~~v~~~~~e~~~~g~~v~~v~A~D~D~g~--~i~ysi~~-~~~~~F~i~~-tG~i~l~~-~~~~~~~~y~L~V~a~D~  221 (383)
                      |.+.++    |+.++|+.++++.|.|+|.+.  .+.|+|.+ +..++|.|++ +|.|.+.+ ++++....|.|.|.|+|.
T Consensus         1 Y~~~v~----E~~~~g~~v~~v~a~D~D~~~n~~i~y~i~~~~~~~~F~I~~~tg~i~~~~~LD~E~~~~y~l~v~a~D~   76 (93)
T PF00028_consen    1 YSFSVP----ENAPPGTVVGQVTATDPDSGPNSQITYSILGGNPDGLFSIDPNTGEISLKKPLDRETQSSYQLTVRATDS   76 (93)
T ss_dssp             EEEEEE----TTGSTSSEEEEEEEEESSTSTTSSEEEEEEETTSTTSEEEETTTTEEEESSSSCTTTTSEEEEEEEEEET
T ss_pred             CEEEEE----CCCCCCCEEEEEEEEeCCCCCCceEEEEEecCcccCceEEeeeeeccccceecCcccCCEEEEEEEEEEC
Confidence            678899    999999999999999999765  69999984 4478999998 99999876 488889999999999999


Q ss_pred             -CCCCceeEEEEEEEEe
Q psy4697         222 -GNPPRQASVPAIMHFP  237 (383)
Q Consensus       222 -g~p~~sst~tv~I~v~  237 (383)
                       |.|+++++++|+|+|+
T Consensus        77 ~~~~~~~~~~~V~I~V~   93 (93)
T PF00028_consen   77 GGSPPLSSTATVTINVL   93 (93)
T ss_dssp             TTSSEEEEEEEEEEEEE
T ss_pred             CCCCCCEEEEEEEEEEC
Confidence             8999999999999984


No 10 
>smart00112 CA Cadherin repeats. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium.
Probab=99.43  E-value=7.7e-13  Score=101.49  Aligned_cols=70  Identities=34%  Similarity=0.478  Sum_probs=62.9

Q ss_pred             eCCC---CeeEEEEecCCC-CCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC---ceeEEEEEEEEeecCCCCC
Q psy4697          71 KPRD---LRVKYWLSNDYG-ERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL---MTTSATVNISVVNVNDWDP  140 (383)
Q Consensus        71 D~D~---~~v~Ysi~~~~~-~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~---~~s~~tV~I~V~DvNDn~P  140 (383)
                      |+|.   +.++|+|..... .+|.|++ +|.|++.++||||....|.|.|.|.|.+   +++.++|.|+|.|+|||+|
T Consensus         2 D~D~g~n~~i~Y~i~~~~~~~~F~i~~~tg~i~~~~~LD~e~~~~y~l~v~a~D~~~~~~~~~~~v~I~V~D~Nd~~P   79 (79)
T smart00112        2 DADSGENGKVTYSILSGNEDGLFSIDPETGEITTTKPLDREEQPEYTLTVEATDGGGPPLSSTATVTVTVLDVNDNAP   79 (79)
T ss_pred             CCCCCcCcEEEEEEecCCCCCEEEEeCCccEEEeCCccCeeCCCeEEEEEEEEECCCCCcccEEEEEEEEEECCCCCC
Confidence            5554   789999987554 8999996 9999999999999999999999999976   7899999999999999998


No 11 
>KOG1834|consensus
Probab=99.35  E-value=1.2e-11  Score=124.02  Aligned_cols=128  Identities=20%  Similarity=0.239  Sum_probs=108.2

Q ss_pred             CcCCCCeEEEEEEEEECCCCC------ceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC---
Q psy4697           4 EDDFLQPITLVVRAIQYDNQD------RYALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD---   74 (383)
Q Consensus         4 DrE~~~~y~l~V~a~D~g~~~------~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~---   74 (383)
                      |=|.+..|+++|+|.|-|..+      .+.-++|.|+|.|+|+.+|+|..+.|.+.|.|.- .-..|++|.|.|.|=   
T Consensus       102 DCelqkeytf~iQAydCg~gpdgtn~kKShkatvhIrVkDvNe~AP~f~ep~Yka~V~EGK-~yd~il~veAiD~DCspq  180 (952)
T KOG1834|consen  102 DCELQKEYTFTIQAYDCGNGPDGTNTKKSHKATVHIRVKDVNEFAPVFKEPWYKAHVTEGK-VYDSILRVEAIDKDCSPQ  180 (952)
T ss_pred             cccccccceEEEEEEecCCCCCccccccccceEEEEEeccccccCchhcccceeeEEecce-eeeeeEEEEeecCCCCCc
Confidence            568899999999999977654      4556889999999999999999999999999984 467899999999993   


Q ss_pred             --CeeEEEEecCCCCCEEEcCcccEEEcccCCcccCcEEEEEEEEecCC---ceeEEEEEEEEe
Q psy4697          75 --LRVKYWLSNDYGERFSISRQGDISLMQCLDYETEDSYRFTVYATDTL---MTTSATVNISVV  133 (383)
Q Consensus        75 --~~v~Ysi~~~~~~~F~Id~tG~I~~~~~LD~E~~~~y~~~V~A~D~~---~~s~~tV~I~V~  133 (383)
                        .-..|.|.. +.-.|.||+.|.|+.+.+|.|.....|.|+|.|-|-|   ..+.+.|+|.|.
T Consensus       181 ~sqIC~YEI~t-~d~PFaIdn~G~irnTekLny~ke~~Y~ltVtAyDCg~kraa~d~lV~v~Vk  243 (952)
T KOG1834|consen  181 YSQICEYEITT-PDVPFAIDNDGNIRNTEKLNYTKEHQYKLTVTAYDCGKKRAASDSLVTVHVK  243 (952)
T ss_pred             ccceeEEEecC-CCCceEEcCCCccccccccccccceeEEEEEEEEecccccccCcceEEEEec
Confidence              446788886 5557999999999999999999999999999999965   223356676664


No 12 
>smart00112 CA Cadherin repeats. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium.
Probab=98.94  E-value=5.6e-09  Score=79.86  Aligned_cols=74  Identities=30%  Similarity=0.471  Sum_probs=61.2

Q ss_pred             eeCCCCC--eEEEEEeCCCC-CCEEEcC-CCcEEEec-cCCCCcceEEEEEEEeeCCCCCceeEEEEEEEEeCCcccCcc
Q psy4697         171 ADGDKGD--RVTLSLRGPYE-KMFSIND-SGHISIVD-LSALNTSTIQLVVVATDTGNPPRQASVPAIMHFPEAIVQQAS  245 (383)
Q Consensus       171 ~D~D~g~--~i~ysi~~~~~-~~F~i~~-tG~i~l~~-~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v~~~~~~~~p  245 (383)
                      +|+|.|.  .+.|+|.++.. .+|.|++ +|.+.+.+ ++++....|.|.|.|+|.|.|+++++++|.|.| .+.|+++|
T Consensus         1 ~D~D~g~n~~i~Y~i~~~~~~~~F~i~~~tg~i~~~~~LD~e~~~~y~l~v~a~D~~~~~~~~~~~v~I~V-~D~Nd~~P   79 (79)
T smart00112        1 TDADSGENGKVTYSILSGNEDGLFSIDPETGEITTTKPLDREEQPEYTLTVEATDGGGPPLSSTATVTVTV-LDVNDNAP   79 (79)
T ss_pred             CCCCCCcCcEEEEEEecCCCCCEEEEeCCccEEEeCCccCeeCCCeEEEEEEEEECCCCCcccEEEEEEEE-EECCCCCC
Confidence            4788874  69999985443 8999997 89887664 577778999999999999999999999999999 66666554


No 13 
>PF08758 Cadherin_pro:  Cadherin prodomain like;  InterPro: IPR014868 Cadherins are a group of proteins that mediate calcium-dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This protein corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions.; GO: 0007155 cell adhesion, 0016021 integral to membrane; PDB: 1OP4_A.
Probab=97.80  E-value=0.00016  Score=56.77  Aligned_cols=86  Identities=23%  Similarity=0.256  Sum_probs=46.8

Q ss_pred             CCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC-CeeEEEEecCCCCCEEEcCcccEEEcccCCcccCcEEEEEEEEec
Q psy4697          41 RELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD-LRVKYWLSNDYGERFSISRQGDISLMQCLDYETEDSYRFTVYATD  119 (383)
Q Consensus        41 n~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~-~~v~Ysi~~~~~~~F~Id~tG~I~~~~~LD~E~~~~y~~~V~A~D  119 (383)
                      +-|-|.+..|.+.|+.+...|..|++|.-.|-.. ..+.|.-. ++  .|.|.++|.|++++++.....+ -.|.|.|.|
T Consensus         2 C~pGF~~~~~~~~Vp~~l~~g~~lg~V~f~dC~~~~~~~~~ss-Dp--dF~V~~DGsVy~~r~v~l~~~~-~~F~V~a~D   77 (90)
T PF08758_consen    2 CRPGFSQKKYTFEVPSNLEAGQPLGKVNFEDCTGRRRVIFESS-DP--DFRVLEDGSVYAKRPVQLSSEQ-RSFTVHAWD   77 (90)
T ss_dssp             ---B--S-EEEE----SS-SS--EEE---B--SS---EEEE----S--EEEEETTTEEEEES--S-SSS--EEEEEEEEE
T ss_pred             CcCCcccceEEEEcCchhhCCcEEEEEEeccCCCCCceEEecC-CC--CEEEcCCCeEEEeeeEecCCCc-eEEEEEEEC
Confidence            4588999999999999999999999999988865 56888643 22  6999999999999999886543 479999999


Q ss_pred             CCceeEEEEEE
Q psy4697         120 TLMTTSATVNI  130 (383)
Q Consensus       120 ~~~~s~~tV~I  130 (383)
                      .......++.|
T Consensus        78 ~~~~~~~~v~V   88 (90)
T PF08758_consen   78 SQTQEQKEVKV   88 (90)
T ss_dssp             TTTTEEEEEEE
T ss_pred             CCCCeEEEEEE
Confidence            75333344443


No 14 
>PF08758 Cadherin_pro:  Cadherin prodomain like;  InterPro: IPR014868 Cadherins are a group of proteins that mediate calcium-dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This protein corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions.; GO: 0007155 cell adhesion, 0016021 integral to membrane; PDB: 1OP4_A.
Probab=97.40  E-value=0.0013  Score=51.69  Aligned_cols=80  Identities=23%  Similarity=0.353  Sum_probs=45.5

Q ss_pred             CCcccCCceeEEEeccCCCCCCCCceEEEEEeeeCCCCCeEEEEEeCCCCCCEEEcCCCcEEEeccCCCCcceEEEEEEE
Q psy4697         139 DPRFRYPQYELFLPHIPLADLTPGSVIGKVEAADGDKGDRVTLSLRGPYEKMFSINDSGHISIVDLSALNTSTIQLVVVA  218 (383)
Q Consensus       139 ~P~f~~~~~~~~v~~~~~e~~~~g~~v~~v~A~D~D~g~~i~ysi~~~~~~~F~i~~tG~i~l~~~~~~~~~~y~L~V~a  218 (383)
                      .|=|....|.+.|+    .+...|+.|++|.-.|-.....+.|.-.   +..|.|.++|.+++.+.-.+....-.+.|.|
T Consensus         3 ~pGF~~~~~~~~Vp----~~l~~g~~lg~V~f~dC~~~~~~~~~ss---DpdF~V~~DGsVy~~r~v~l~~~~~~F~V~a   75 (90)
T PF08758_consen    3 RPGFSQKKYTFEVP----SNLEAGQPLGKVNFEDCTGRRRVIFESS---DPDFRVLEDGSVYAKRPVQLSSEQRSFTVHA   75 (90)
T ss_dssp             --B--S-EEEE--------SS-SS--EEE---B--SS---EEEE------SEEEEETTTEEEEES--S-SSS-EEEEEEE
T ss_pred             cCCcccceEEEEcC----chhhCCcEEEEEEeccCCCCCceEEecC---CCCEEEcCCCeEEEeeeEecCCCceEEEEEE
Confidence            36788888999999    8899999999999999965567888754   3389999999999888756666667899999


Q ss_pred             eeCCCCC
Q psy4697         219 TDTGNPP  225 (383)
Q Consensus       219 ~D~g~p~  225 (383)
                      +|.....
T Consensus        76 ~D~~~~~   82 (90)
T PF08758_consen   76 WDSQTQE   82 (90)
T ss_dssp             EETTTTE
T ss_pred             ECCCCCe
Confidence            9987644


No 15 
>PF08266 Cadherin_2:  Cadherin-like;  InterPro: IPR013164 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism and modulate a wide variety of processes, including cell polarisation and migration. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats and the fifth of which contains 4 conserved cysteines; a single transmembrane domain; and a C-terminal cytoplasmic domain. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats. 
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and are essential for cadherin adhesive function and for protection against protease digestion. This entry represents a cadherin domain that is usually found at the N terminus of cadherin proteins.; PDB: 1WUZ_A 1WYJ_A.
Probab=97.36  E-value=0.00059  Score=52.90  Aligned_cols=57  Identities=16%  Similarity=0.253  Sum_probs=37.9

Q ss_pred             EEEEEECCCCCCcEEEEEEeEeCCC-----CeeEEEEec-CCCCCEEEcC-cccEEEcccCCccc
Q psy4697          50 YSVSALENLPVNYVLLTVTTNKPRD-----LRVKYWLSN-DYGERFSISR-QGDISLMQCLDYET  107 (383)
Q Consensus        50 y~~~V~En~~~gt~v~~v~A~D~D~-----~~v~Ysi~~-~~~~~F~Id~-tG~I~~~~~LD~E~  107 (383)
                      ..++|+|..++|+.||.+ |.|...     ....|.+.. ....+|.++. +|.|++...+|||.
T Consensus         3 i~YsV~EE~~~Gt~IGni-a~dL~l~~~~l~~~~~ri~s~~~~~~~~v~~~tG~L~v~~rIDRE~   66 (84)
T PF08266_consen    3 IRYSVPEEMPPGTVIGNI-AKDLGLDPQSLSSRNFRIVSEGNSQYFRVNEKTGDLFVSERIDREE   66 (84)
T ss_dssp             EEEEEESS--TT-EEEEC-CCCCT--HHHHCCTTBEEE-SSSS-SEEE-TTTSEEEESS--SCCC
T ss_pred             eEEEeecCCCCCCEEEEh-HHhhCCCcccccccceEEeecCCcceeEecCCceeEEeCCccCHHH
Confidence            357899999999999999 445432     334566555 4567999996 99999999999998


No 16 
>TIGR01965 VCBS_repeat VCBS repeat. This domain of about 100 residues is found in multiple (up to 35) copies in long proteins from several species of Vibrio, Colwellia, Bradyrhizobium, and Shewanella (hence the name VCBS) and in smaller copy numbers in proteins from several other bacteria. The large protein size and repeat copy numbers, species distribution, and suggested activities of several member proteins suggest a role for this domain in adhesion.
Probab=95.86  E-value=0.063  Score=42.84  Aligned_cols=77  Identities=23%  Similarity=0.185  Sum_probs=51.7

Q ss_pred             EEEEeEeCCC-CeeEEEEec--CCCCCEEEcCcccEEEc--------ccCCcccCcEEEEEEEEecCCceeEEEEEEEEe
Q psy4697          65 LTVTTNKPRD-LRVKYWLSN--DYGERFSISRQGDISLM--------QCLDYETEDSYRFTVYATDTLMTTSATVNISVV  133 (383)
Q Consensus        65 ~~v~A~D~D~-~~v~Ysi~~--~~~~~F~Id~tG~I~~~--------~~LD~E~~~~y~~~V~A~D~~~~s~~tV~I~V~  133 (383)
                      |++.++|+|. ....++...  ...+.|.|+.+|.....        +.|.--+...-.|+|.+.|+   ...+|.|.|.
T Consensus         2 G~Lt~sD~D~gd~~~~s~~~~~g~yGtlti~~~G~wtYtl~n~~~avq~L~~Ge~~tdsFtvtv~DG---tt~~vtItI~   78 (99)
T TIGR01965         2 GQLTISDADAGQAHFIAQTDAAGQYGTFSIDADGQWTYQADNSQTAVQALKAGETLTDTFTVTSADG---TSQTVTITIT   78 (99)
T ss_pred             CceEEeCCCCCCceEEecccccCCcEEEEECCCCcEEEEeCCCcHHHHhhcCCCEEEEEEEEEEeCC---CeEEEEEEEE
Confidence            4678888887 345555533  23345888777765432        23333344567888889997   3889999999


Q ss_pred             ecCCCCCcccCC
Q psy4697         134 NVNDWDPRFRYP  145 (383)
Q Consensus       134 DvNDn~P~f~~~  145 (383)
                      ..|| +|++...
T Consensus        79 GtND-apvi~~~   89 (99)
T TIGR01965        79 GAND-AAVIGGA   89 (99)
T ss_pred             ccCC-CCEEecc
Confidence            9999 8877543


No 17 
>smart00736 CADG Dystroglycan-type cadherin-like domains. Cadherin-homologous domains present in metazoan dystroglycans and alpha/epsilon sarcoglycans, yeast Axl2p and in a very large protein from magnetotactic bacteria. Likely to bind calcium ions.
Probab=95.75  E-value=0.13  Score=40.82  Aligned_cols=67  Identities=25%  Similarity=0.251  Sum_probs=51.3

Q ss_pred             eEeCCCCeeEEEEec----CCCCCEEEcC-cccEEEcccCCcccCcEEEEEEEEecCC-ceeEEEEEEEEeecCC
Q psy4697          69 TNKPRDLRVKYWLSN----DYGERFSISR-QGDISLMQCLDYETEDSYRFTVYATDTL-MTTSATVNISVVNVND  137 (383)
Q Consensus        69 A~D~D~~~v~Ysi~~----~~~~~F~Id~-tG~I~~~~~LD~E~~~~y~~~V~A~D~~-~~s~~tV~I~V~DvND  137 (383)
                      ..|+|...++|++..    .-..|...|+ ++.++= .+.+.+ ...|.+.|.|+|+. .++...+.|.|.+.||
T Consensus        24 F~d~d~~~lty~~~~~~~~~lP~Wl~fd~~~~~~~G-tP~~~~-~g~~~i~v~a~D~~g~~~~~~f~i~V~~~~~   96 (97)
T smart00736       24 FTDADGDTLTYSATLSDGSALPSWLSFDSDTGTLSG-TPTNSD-VGSLSLKVTATDSSGASASDTFTITVVNTND   96 (97)
T ss_pred             eECCCCCeEEEEEEeCCCCCCCCeEEEeCCCCEEEE-ECCCCC-CcEEEEEEEEEECCCCEEEEEEEEEEeCCCC
Confidence            467777889999864    2256899986 777665 344433 46799999999975 8888899999999987


No 18 
>PF15102 TMEM154:  TMEM154 protein family
Probab=94.47  E-value=0.053  Score=46.00  Aligned_cols=34  Identities=26%  Similarity=0.444  Sum_probs=16.6

Q ss_pred             eeeehhhHHHHHHHHHHHHHHhhheeecCCCCCC
Q psy4697         254 STVLIILGVVLIVLGFVIILLILYIHKNKHTKNN  287 (383)
Q Consensus       254 ~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~~  287 (383)
                      +++++++..++++||++++++.++++||+|.|..
T Consensus        57 fiLmIlIP~VLLvlLLl~vV~lv~~~kRkr~K~~   90 (146)
T PF15102_consen   57 FILMILIPLVLLVLLLLSVVCLVIYYKRKRTKQE   90 (146)
T ss_pred             eEEEEeHHHHHHHHHHHHHHHheeEEeecccCCC
Confidence            3556666655554443333334444555544443


No 19 
>KOG0196|consensus
Probab=94.23  E-value=2.3  Score=45.71  Aligned_cols=110  Identities=18%  Similarity=0.140  Sum_probs=55.5

Q ss_pred             CCcccCcEEEEEEEEecCC------ceeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCCCceEEEEEeeeCCCC
Q psy4697         103 LDYETEDSYRFTVYATDTL------MTTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTPGSVIGKVEAADGDKG  176 (383)
Q Consensus       103 LD~E~~~~y~~~V~A~D~~------~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~g~~v~~v~A~D~D~g  176 (383)
                      -|.+....|+|.|+|.++-      ....+.|.|..   |.-+|.-   ...+.+.     ......+-..-.--|++.|
T Consensus       403 ~~L~ah~~YTFeV~AvNgVS~lsp~~~~~a~vnItt---~qa~ps~---V~~~r~~-----~~~~~sitlsW~~p~~png  471 (996)
T KOG0196|consen  403 SDLLAHTNYTFEVEAVNGVSDLSPFPRQFASVNITT---NQAAPSP---VSVLRQV-----SRTSDSITLSWSEPDQPNG  471 (996)
T ss_pred             eccccccccEEEEEEeecccccCCCCCcceeEEeec---cccCCCc---cceEEEe-----eeccCceEEecCCCCCCCC
Confidence            3556677899999999862      22344555544   3333321   1112111     1111222122233344445


Q ss_pred             CeEEEEEe----C-CCCCCEEEcC-CCcEEEeccCCCCcceEEEEEEEeeCCCCC
Q psy4697         177 DRVTLSLR----G-PYEKMFSIND-SGHISIVDLSALNTSTIQLVVVATDTGNPP  225 (383)
Q Consensus       177 ~~i~ysi~----~-~~~~~F~i~~-tG~i~l~~~~~~~~~~y~L~V~a~D~g~p~  225 (383)
                      ..+.|.+.    . +...+..+.. .-...+..+  .....|-+.|.|.+..+..
T Consensus       472 ~ildYEvky~ek~~~e~~~~~~~t~~~~~ti~gL--~p~t~YvfqVRarT~aG~G  524 (996)
T KOG0196|consen  472 VILDYEVKYYEKDEDERSYSTLKTKTTTATITGL--KPGTVYVFQVRARTAAGYG  524 (996)
T ss_pred             cceeEEEEEeeccccccceeEEecccceEEeecc--CCCcEEEEEEEEecccCCC
Confidence            55677764    1 3344444543 323334443  3357899999999875543


No 20 
>smart00736 CADG Dystroglycan-type cadherin-like domains. Cadherin-homologous domains present in metazoan dystroglycans and alpha/epsilon sarcoglycans, yeast Axl2p and in a very large protein from magnetotactic bacteria. Likely to bind calcium ions.
Probab=94.21  E-value=0.77  Score=36.27  Aligned_cols=65  Identities=25%  Similarity=0.286  Sum_probs=46.3

Q ss_pred             EeeeCCCCCeEEEEEeC----CCCCCEEEcC-CCcEEEeccCCCCcceEEEEEEEeeCCCCCceeEEEEEEEEe
Q psy4697         169 EAADGDKGDRVTLSLRG----PYEKMFSIND-SGHISIVDLSALNTSTIQLVVVATDTGNPPRQASVPAIMHFP  237 (383)
Q Consensus       169 ~A~D~D~g~~i~ysi~~----~~~~~F~i~~-tG~i~l~~~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v~  237 (383)
                      ...|.| +..++|++..    ....|...++ ++.+.-. +...+.+.|.+.|.|+|..+  .++...+.|.|.
T Consensus        23 tF~d~d-~~~lty~~~~~~~~~lP~Wl~fd~~~~~~~Gt-P~~~~~g~~~i~v~a~D~~g--~~~~~~f~i~V~   92 (97)
T smart00736       23 TFTDAD-GDTLTYSATLSDGSALPSWLSFDSDTGTLSGT-PTNSDVGSLSLKVTATDSSG--ASASDTFTITVV   92 (97)
T ss_pred             ceECCC-CCeEEEEEEeCCCCCCCCeEEEeCCCCEEEEE-CCCCCCcEEEEEEEEEECCC--CEEEEEEEEEEe
Confidence            356888 7789999862    2345777886 7777543 33344678999999999876  567777888773


No 21 
>TIGR01965 VCBS_repeat VCBS repeat. This domain of about 100 residues is found in multiple (up to 35) copies in long proteins from several species of Vibrio, Colwellia, Bradyrhizobium, and Shewanella (hence the name VCBS) and in smaller copy numbers in proteins from several other bacteria. The large protein size and repeat copy numbers, species distribution, and suggested activities of several member proteins suggest a role for this domain in adhesion.
Probab=92.25  E-value=1.2  Score=35.63  Aligned_cols=72  Identities=22%  Similarity=0.264  Sum_probs=46.6

Q ss_pred             EEEEeeeCCCCCeEEEEEe--CCCCCCEEEcCCCcEEEec-c-----CCC---CcceEEEEEEEeeCCCCCceeEEEEEE
Q psy4697         166 GKVEAADGDKGDRVTLSLR--GPYEKMFSINDSGHISIVD-L-----SAL---NTSTIQLVVVATDTGNPPRQASVPAIM  234 (383)
Q Consensus       166 ~~v~A~D~D~g~~i~ysi~--~~~~~~F~i~~tG~i~l~~-~-----~~~---~~~~y~L~V~a~D~g~p~~sst~tv~I  234 (383)
                      +++.++|+|.|+...++..  ....+.|.|+.+|.-.-.. .     ..+   +...-.++|.+.|+      .+.+|.|
T Consensus         2 G~Lt~sD~D~gd~~~~s~~~~~g~yGtlti~~~G~wtYtl~n~~~avq~L~~Ge~~tdsFtvtv~DG------tt~~vtI   75 (99)
T TIGR01965         2 GQLTISDADAGQAHFIAQTDAAGQYGTFSIDADGQWTYQADNSQTAVQALKAGETLTDTFTVTSADG------TSQTVTI   75 (99)
T ss_pred             CceEEeCCCCCCceEEecccccCCcEEEEECCCCcEEEEeCCCcHHHHhhcCCCEEEEEEEEEEeCC------CeEEEEE
Confidence            3578999999988888874  2345678998887543211 1     112   23345788889995      2677788


Q ss_pred             EEeCCcccCc
Q psy4697         235 HFPEAIVQQA  244 (383)
Q Consensus       235 ~v~~~~~~~~  244 (383)
                      +| .+.|+.+
T Consensus        76 tI-~GtNDap   84 (99)
T TIGR01965        76 TI-TGANDAA   84 (99)
T ss_pred             EE-EccCCCC
Confidence            77 5555543


No 22 
>PF08374 Protocadherin:  Protocadherin;  InterPro: IPR013585 The structure of protocadherins is similar to that of classic cadherins (IPR002126 from INTERPRO), but they also have some unique features associated with the cytoplasmic domains. They are expressed in a variety of organisms and are found in high concentrations in the brain where they seem to be localised mainly at cell-cell contact sites. Their expression seems to be developmentally regulated.
Probab=91.71  E-value=0.13  Score=46.39  Aligned_cols=37  Identities=11%  Similarity=0.265  Sum_probs=23.8

Q ss_pred             CCCceeeehhhHHHHHHHHHHHHHHhhheeecCCCCC
Q psy4697         250 SGTSSTVLIILGVVLIVLGFVIILLILYIHKNKHTKN  286 (383)
Q Consensus       250 ~~~~~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~  286 (383)
                      ..+..++++++|+++.+.|+|++..++++||++..+.
T Consensus        34 ~d~~~I~iaiVAG~~tVILVI~i~v~vR~CRq~~~k~   70 (221)
T PF08374_consen   34 KDYVKIMIAIVAGIMTVILVIFIVVLVRYCRQSPHKK   70 (221)
T ss_pred             ccceeeeeeeecchhhhHHHHHHHHHHHHHhhccccc
Confidence            3566777888888877666666655565577554443


No 23 
>KOG3597|consensus
Probab=90.13  E-value=13  Score=37.78  Aligned_cols=144  Identities=14%  Similarity=0.130  Sum_probs=85.0

Q ss_pred             EEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC--CeeEEEEecCCCC---------------CE
Q psy4697          27 ALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD--LRVKYWLSNDYGE---------------RF   89 (383)
Q Consensus        27 s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~--~~v~Ysi~~~~~~---------------~F   89 (383)
                      .+....|.|..+||+|..+-...+.+-+.|+...-.....+.+.|+|.  ..+.|++....+.               -|
T Consensus        24 ~~~~~~i~v~pvndpp~~~~~~~~~l~~~~~~~k~l~~~~l~~~d~d~~~~~l~f~v~~t~~~~~~~~~~~~~g~~~~~F  103 (442)
T KOG3597|consen   24 QTDVLRIHVNPVNDPPSLIFPSGSLLVILEGGQKVLDPELLTAADPDSAPLPLEFQVLGTSSVPLPVLKFDVPGAPATEF  103 (442)
T ss_pred             EEeeecccccccCCCcceeecccceEEeecCCceeccceEeeccCCCCCccceEEEEccCCCCCCccceeeccCCcccce
Confidence            344567889999999888888888888898866555556788899987  7788888773322               23


Q ss_pred             EEc--CcccEEEcccCCccc--CcEEEEEEEEecCCceeEEEEEEEEeecCCCCCcccCC-ceeEEEeccCCCCCCCCce
Q psy4697          90 SIS--RQGDISLMQCLDYET--EDSYRFTVYATDTLMTTSATVNISVVNVNDWDPRFRYP-QYELFLPHIPLADLTPGSV  164 (383)
Q Consensus        90 ~Id--~tG~I~~~~~LD~E~--~~~y~~~V~A~D~~~~s~~tV~I~V~DvNDn~P~f~~~-~~~~~v~~~~~e~~~~g~~  164 (383)
                      +-.  ..|.+.    +++..  .....++..++|+-..+. .   .+.-.-...|.+... ...+.+.      ......
T Consensus       104 s~~~v~~g~~~----yvh~g~el~~~~~~~~~SDg~~~S~-~---~i~~~~~~~~~~~~~~~~gL~v~------~gS~~~  169 (442)
T KOG3597|consen  104 SYEEVEDGSLS----YVHSGTELRESELQLRVSDGLLVSE-R---AILKVEATGPAPHLARNTGLKVL------QGSTAP  169 (442)
T ss_pred             EehHhhcCcee----EEecCcccccceEEEEeecceEeee-e---EEecccCCCcceeeecccceEEc------cCcccc
Confidence            222  133332    23333  567788888999875555 1   111122223332221 1122222      112222


Q ss_pred             E--EEEEeeeCCCC-C-eEEEEEe
Q psy4697         165 I--GKVEAADGDKG-D-RVTLSLR  184 (383)
Q Consensus       165 v--~~v~A~D~D~g-~-~i~ysi~  184 (383)
                      |  ..+.+.|.|++ + .+.|.|.
T Consensus       170 IT~~~L~ved~d~~~d~~v~~~i~  193 (442)
T KOG3597|consen  170 ITPSNLSVEDNDSSPDDEVRYDIT  193 (442)
T ss_pred             ccHhHceeecCCCCCCcEEEEEec
Confidence            3  24788888854 3 6888885


No 24 
>PF01102 Glycophorin_A:  Glycophorin A;  InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=89.62  E-value=0.17  Score=41.91  Aligned_cols=19  Identities=32%  Similarity=0.660  Sum_probs=12.1

Q ss_pred             eeeehhhHHHHHHHHHHHH
Q psy4697         254 STVLIILGVVLIVLGFVII  272 (383)
Q Consensus       254 ~~li~~l~~i~~lL~l~~~  272 (383)
                      .++.|++|+++++++++++
T Consensus        65 ~i~~Ii~gv~aGvIg~Ill   83 (122)
T PF01102_consen   65 AIIGIIFGVMAGVIGIILL   83 (122)
T ss_dssp             CHHHHHHHHHHHHHHHHHH
T ss_pred             ceeehhHHHHHHHHHHHHH
Confidence            4556777777777665443


No 25 
>PF01034 Syndecan:  Syndecan domain;  InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions. Structurally, these proteins consist of four separate domains: a signal sequence; an extracellular domain (ectodomain) of variable length, whose sequence is not evolutionarily conserved among the various forms of syndecans and which contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; a transmembrane region; and a highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins. The proteins known to belong to this family are: syndecan 1; syndecan 2 (fibroglycan); syndecan 3 (neuroglycan or N-syndecan); syndecan 4 (amphiglycan or ryudocan); Drosophila syndecan; and Caenorhabditis elegans probable syndecan (F57C7.3). Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of the phospholipid activator PIP2; its overall structure in solution is a twisted clamp with a cavity in the centre of the dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts strongly not only with the fatty acyl groups but also with the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion.; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=88.96  E-value=0.18  Score=36.59  Aligned_cols=13  Identities=8%  Similarity=0.107  Sum_probs=1.6

Q ss_pred             eeecCCCCCCCCC
Q psy4697         278 IHKNKHTKNNGPP  290 (383)
Q Consensus       278 ~~r~~~~~~~~~~  290 (383)
                      ..|.+||+..+|.
T Consensus        33 iyR~rkkdEGSY~   45 (64)
T PF01034_consen   33 IYRMRKKDEGSYD   45 (64)
T ss_dssp             ----S------SS
T ss_pred             HHHHHhcCCCCcc
Confidence            4555556655554


No 26 
>PF08266 Cadherin_2:  Cadherin-like;  InterPro: IPR013164 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism and modulate a wide variety of processes including cell polarisation and migration. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; a single transmembrane domain and five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats, and the fifth contains 4 conserved cysteines and an N-terminal cytoplasmic domain. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats.
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and are essential for cadherin adhesive function and for protection against protease digestion. This entry represents a cadherin domain that is usually found at the N terminus of cadherin proteins.; PDB: 1WUZ_A 1WYJ_A.
Probab=88.55  E-value=1.2  Score=34.43  Aligned_cols=55  Identities=20%  Similarity=0.420  Sum_probs=32.8

Q ss_pred             eEEEeccCCCCCCCCceEEEEEeeeCCCCC----eEEEEEe-CCCCCCEEEcC-CCcEEEecc-CCC
Q psy4697         148 ELFLPHIPLADLTPGSVIGKVEAADGDKGD----RVTLSLR-GPYEKMFSIND-SGHISIVDL-SAL  207 (383)
Q Consensus       148 ~~~v~~~~~e~~~~g~~v~~v~A~D~D~g~----~i~ysi~-~~~~~~F~i~~-tG~i~l~~~-~~~  207 (383)
                      ..+|+    |..++|+.|+.| |.|.-...    ...|.+. .....+|.++. +|.+++... +++
T Consensus         4 ~YsV~----EE~~~Gt~IGni-a~dL~l~~~~l~~~~~ri~s~~~~~~~~v~~~tG~L~v~~rIDRE   65 (84)
T PF08266_consen    4 RYSVP----EEMPPGTVIGNI-AKDLGLDPQSLSSRNFRIVSEGNSQYFRVNEKTGDLFVSERIDRE   65 (84)
T ss_dssp             EEEEE----SS--TT-EEEEC-CCCCT--HHHHCCTTBEEE-SSSS-SEEE-TTTSEEEESS--SCC
T ss_pred             EEEee----cCCCCCCEEEEh-HHhhCCCcccccccceEEeecCCcceeEecCCceeEEeCCccCHH
Confidence            46788    899999999998 55553211    2345544 34567999997 999998754 444


No 27 
>PF07495 Y_Y_Y:  Y_Y_Y domain;  InterPro: IPR011123 This region is mostly found at the end of the beta propellers (IPR011110 from INTERPRO) in a family of two component regulators. However they are also found tandemly repeated in Q891H4 from SWISSPROT without other signal conduction domains being present. It is named after the conserved tyrosines found in the alignment. The exact function is not known.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=88.09  E-value=4.5  Score=28.98  Aligned_cols=56  Identities=18%  Similarity=0.193  Sum_probs=33.1

Q ss_pred             eEEEEEeCCCCCCEEEcCCC-cEEEeccCCCCcceEEEEEEEeeCCCCCceeEEEEEEEE
Q psy4697         178 RVTLSLRGPYEKMFSINDSG-HISIVDLSALNTSTIQLVVVATDTGNPPRQASVPAIMHF  236 (383)
Q Consensus       178 ~i~ysi~~~~~~~F~i~~tG-~i~l~~~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v  236 (383)
                      ...|.|.|....+..+.... .+....   +..+.|.|.|.|.|..........++.|.|
T Consensus         9 ~Y~Y~l~g~d~~W~~~~~~~~~~~~~~---L~~G~Y~l~V~a~~~~~~~~~~~~~l~i~I   65 (66)
T PF07495_consen    9 RYRYRLEGFDDEWITLGSYSNSISYTN---LPPGKYTLEVRAKDNNGKWSSDEKSLTITI   65 (66)
T ss_dssp             EEEEEEETTESSEEEESSTS-EEEEES-----SEEEEEEEEEEETTS-B-SS-EEEEEEE
T ss_pred             EEEEEEECCCCeEEECCCCcEEEEEEe---CCCEEEEEEEEEECCCCCcCcccEEEEEEE
Confidence            45666776555455555544 554433   567999999999997665444335666655


No 28 
>TIGR00845 caca sodium/calcium exchanger 1. This model is specific for the eukaryotic sodium ion/calcium ion exchangers of the Caca family
Probab=86.52  E-value=59  Score=36.23  Aligned_cols=138  Identities=15%  Similarity=0.227  Sum_probs=71.6

Q ss_pred             eCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC---CeeEEEEecC---CCCCEEEcCcccEEEc----------
Q psy4697          37 GTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD---LRVKYWLSND---YGERFSISRQGDISLM----------  100 (383)
Q Consensus        37 DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~---~~v~Ysi~~~---~~~~F~Id~tG~I~~~----------  100 (383)
                      +.||.++.|....-+..|.||.  |+.-.+|.=...|.   -.|.|+..+.   .+.-|.- .+|.|.-.          
T Consensus       394 ~~dd~~s~i~Fe~~~Y~V~En~--GtV~VtV~R~GGdl~~tVsVdY~T~DGTA~AG~DY~~-~sGTLtF~PGEt~KtItV  470 (928)
T TIGR00845       394 EENDPVSKIFFEPGHYTCLENC--GTVALTVVRRGGDLTNTVYVDYRTEDGTANAGSDYEF-TEGTLVFKPGETQKEFRI  470 (928)
T ss_pred             cccCCcceEEecCCeEEEeecC--cEEEEEEEEccCCCCceEEEEEEccCCccCCCCCccc-cCceEEECCCceEEEEEE
Confidence            3577777776666677888985  77666665443332   5578887651   1111111 13333221          


Q ss_pred             ccC---CcccCcEEEEEEEEecCC----------------ceeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCC
Q psy4697         101 QCL---DYETEDSYRFTVYATDTL----------------MTTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTP  161 (383)
Q Consensus       101 ~~L---D~E~~~~y~~~V~A~D~~----------------~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~  161 (383)
                      ..+   -+|....|.+.+.--..+                +....+.+|+|.| ||++|.|....-...+.    |+.  
T Consensus       471 ~IIDDdi~E~DE~F~V~LSNp~~g~~~G~~~~~~~~~~A~Lg~ps~ATVTIlD-DD~aGIfsFe~~~~sV~----Es~--  543 (928)
T TIGR00845       471 GIIDDDIFEEDEHFYVRLSNLRVGSEDGILEANHVSAVAQLASPNTATVTILD-DDHAGIFTFEEDVFHVS----ESI--  543 (928)
T ss_pred             EEccCCCCCCCceEEEEEeCCCCCCcccccccccccccceecCCceEEEEEec-CcccCcccccCceEEEE----cCC--
Confidence            112   234455555555332111                1223456677777 77899877665566777    654  


Q ss_pred             CceEEEEEeeeCCCCC-eEEEEEe
Q psy4697         162 GSVIGKVEAADGDKGD-RVTLSLR  184 (383)
Q Consensus       162 g~~v~~v~A~D~D~g~-~i~ysi~  184 (383)
                      |..-.+|.-+-.-.|. .+.|.-.
T Consensus       544 G~vtvtV~RtsGa~G~VtV~Y~T~  567 (928)
T TIGR00845       544 GIMEVKVLRTSGARGTVIVPYRTV  567 (928)
T ss_pred             CEEEEEEEEcCCCCeeEEEEEEee
Confidence            4443343333222233 5667654


No 29 
>PF02439 Adeno_E3_CR2:  Adenovirus E3 region protein CR2;  InterPro: IPR003470 Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host. This region called CR1 (conserved region 1) is found three times in Human adenovirus 19 (a subgroup D adenovirus) 49 kDa protein in the E3 region. CR1 is also found in the 20.1 kDa protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain.
Probab=85.78  E-value=0.76  Score=29.69  Aligned_cols=9  Identities=22%  Similarity=0.516  Sum_probs=3.6

Q ss_pred             ehhhHHHHH
Q psy4697         257 LIILGVVLI  265 (383)
Q Consensus       257 i~~l~~i~~  265 (383)
                      ++++++++.
T Consensus         6 IaIIv~V~v   14 (38)
T PF02439_consen    6 IAIIVAVVV   14 (38)
T ss_pred             hhHHHHHHH
Confidence            334444443


No 30 
>PF12877 DUF3827:  Domain of unknown function (DUF3827);  InterPro: IPR024606 The function of the proteins in this entry is not currently known, but one of the human proteins (Q9HCM3 from SWISSPROT) has been implicated in pilocytic astrocytomas. In the majority of cases of pilocytic astrocytomas a tandem duplication produces an in-frame fusion of the gene encoding this protein and the BRAF oncogene. The resulting fusion protein has constitutive BRAF kinase activity and is capable of transforming cells.
Probab=81.02  E-value=2  Score=44.92  Aligned_cols=35  Identities=14%  Similarity=0.198  Sum_probs=19.7

Q ss_pred             ceeeehhhHHHHHHHHHHHHHHhhheeecCCCCCC
Q psy4697         253 SSTVLIILGVVLIVLGFVIILLILYIHKNKHTKNN  287 (383)
Q Consensus       253 ~~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~~  287 (383)
                      ..+||+++.+-++++++|+++|++.+||++|-+..
T Consensus       268 NlWII~gVlvPv~vV~~Iiiil~~~LCRk~K~eFq  302 (684)
T PF12877_consen  268 NLWIIAGVLVPVLVVLLIIIILYWKLCRKNKLEFQ  302 (684)
T ss_pred             CeEEEehHhHHHHHHHHHHHHHHHHHhcccccCCC
Confidence            34444444444444444555567778888877643


No 31 
>PF05345 He_PIG:  Putative Ig domain;  InterPro: IPR008009 This alignment represents the conserved core region of a ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to Hyalin (IPR003410 from INTERPRO) and the PKD domain (IPR000601 from INTERPRO) suggest an Ig-like fold so this family may be similar in function to the (IPR003791 from INTERPRO) and (IPR003790 from INTERPRO) protein families.
Probab=80.46  E-value=7.4  Score=26.69  Aligned_cols=37  Identities=27%  Similarity=0.298  Sum_probs=27.2

Q ss_pred             CCCCCEEEcC-CCcEEEeccCCCCcceEEEEEEEeeCC
Q psy4697         186 PYEKMFSIND-SGHISIVDLSALNTSTIQLVVVATDTG  222 (383)
Q Consensus       186 ~~~~~F~i~~-tG~i~l~~~~~~~~~~y~L~V~a~D~g  222 (383)
                      ....+..+|+ +|.|.-........+.|.+.|.|+|..
T Consensus        11 ~LP~gLs~d~~tG~isGtp~~~~~~G~y~~~vtatd~~   48 (49)
T PF05345_consen   11 GLPSGLSLDPSTGTISGTPTSSVQPGTYTFTVTATDGS   48 (49)
T ss_pred             CCCCcEEEeCCCCEEEeecCCCccccEEEEEEEEEcCC
Confidence            3455788987 999975544333457999999999964


No 32 
>PF02439 Adeno_E3_CR2:  Adenovirus E3 region protein CR2;  InterPro: IPR003470 Early region 3 (E3) of human adenoviruses (Ads) codes for proteins that appear to control viral interactions with the host. This region called CR1 (conserved region 1) is found three times in Human adenovirus 19 (a subgroup D adenovirus) 49 kDa protein in the E3 region. CR1 is also found in the 20.1 kDa protein of subgroup B adenoviruses. The function of this 80 amino acid region is unknown. This region is probably a divergent immunoglobulin domain.
Probab=80.43  E-value=1.9  Score=27.91  Aligned_cols=25  Identities=4%  Similarity=0.191  Sum_probs=14.7

Q ss_pred             CceeeehhhHHHHHHHHHHHHHHhh
Q psy4697         252 TSSTVLIILGVVLIVLGFVIILLIL  276 (383)
Q Consensus       252 ~~~~li~~l~~i~~lL~l~~~~l~~  276 (383)
                      .+..++.++.+.+.++.+.++..+.
T Consensus         4 s~IaIIv~V~vg~~iiii~~~~YaC   28 (38)
T PF02439_consen    4 STIAIIVAVVVGMAIIIICMFYYAC   28 (38)
T ss_pred             chhhHHHHHHHHHHHHHHHHHHHHH
Confidence            3456666666666666665554333


No 33 
>KOG1094|consensus
Probab=80.41  E-value=2.9  Score=43.65  Aligned_cols=23  Identities=17%  Similarity=0.573  Sum_probs=12.6

Q ss_pred             CCceeeehhhHHHHHHHHHHHHH
Q psy4697         251 GTSSTVLIILGVVLIVLGFVIIL  273 (383)
Q Consensus       251 ~~~~~li~~l~~i~~lL~l~~~~  273 (383)
                      +.+.++++++.+|.+++++++++
T Consensus       388 ~~t~~~~~~f~~if~iva~ii~~  410 (807)
T KOG1094|consen  388 SPTAILIIIFVAIFLIVALIIAL  410 (807)
T ss_pred             CCceehHHHHHHHHHHHHHHHHH
Confidence            34455666666666555554443


No 34 
>KOG4221|consensus
Probab=79.73  E-value=1.2e+02  Score=34.74  Aligned_cols=47  Identities=15%  Similarity=0.144  Sum_probs=28.4

Q ss_pred             EEEEEeCC-CCCCEEEcC-CCcEEEecc-CCCCcceEEEEEEEeeCCCCC
Q psy4697         179 VTLSLRGP-YEKMFSIND-SGHISIVDL-SALNTSTIQLVVVATDTGNPP  225 (383)
Q Consensus       179 i~ysi~~~-~~~~F~i~~-tG~i~l~~~-~~~~~~~y~L~V~a~D~g~p~  225 (383)
                      +-|+..++ ...-|++.. .|....... .......|.+.|.|+-..+|.
T Consensus       959 i~Ys~~~n~~~~dWt~~t~~g~~L~~~v~~l~p~t~yffkiQAr~~kG~g 1008 (1381)
T KOG4221|consen  959 IYYSTDGNTPEHDWTIETTAGAELSHQVPNLDPDTGYFFKIQARNEKGPG 1008 (1381)
T ss_pred             EEEecCCCCchhhceeeecccchhhhccCCCCCCCceEEEEEeeccCCCC
Confidence            44555433 445688876 565543332 333466799999998876554


No 35 
>PF10577 UPF0560:  Uncharacterised protein family UPF0560;  InterPro: IPR018890  This family of proteins has no known function. 
Probab=79.00  E-value=1.6  Score=46.88  Aligned_cols=31  Identities=19%  Similarity=0.356  Sum_probs=18.0

Q ss_pred             eeeehhhHHHHHHHHHHHHHHhhheeecCCC
Q psy4697         254 STVLIILGVVLIVLGFVIILLILYIHKNKHT  284 (383)
Q Consensus       254 ~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~  284 (383)
                      ..+.++||+.++++++++.+|.++|+|++.|
T Consensus       273 ~fLl~ILG~~~livl~lL~vLl~yCrrkc~~  303 (807)
T PF10577_consen  273 VFLLAILGGTALIVLILLCVLLCYCRRKCLK  303 (807)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHhhhcccCC
Confidence            5567778877776665555554444444433


No 36 
>TIGR03660 T1SS_rpt_143 T1SS-143 repeat domain. This model represents a domain of about 143 amino acids that may occur singly or in up to 23 tandem repeats in very large proteins in the genus Vibrio, and in related species such as Legionella pneumophila, Photobacterium profundum, Rhodopseudomonas palustris, Shewanella pealeana, and Aeromonas hydrophila. Proteins with these domains represent a subset of a broader set of proteins with a particular signal for type 1 secretion, consisting of several glycine-rich repeats modeled by pfam00353, followed by a C-terminal domain modeled by TIGR03661. Proteins with this domain tend to share several properties with the RtxA (Repeats in Toxin) protein of Vibrio cholerae, including a large size often containing tandemly repeated domains and a C-terminal signal for type 1 secretion.
Probab=78.40  E-value=41  Score=28.53  Aligned_cols=53  Identities=28%  Similarity=0.331  Sum_probs=36.1

Q ss_pred             cEEEcccCCccc---CcEEEEEEEEecC-CceeEEEEEEEEeecCCCCCcccCCceeEEEe
Q psy4697          96 DISLMQCLDYET---EDSYRFTVYATDT-LMTTSATVNISVVNVNDWDPRFRYPQYELFLP  152 (383)
Q Consensus        96 ~I~~~~~LD~E~---~~~y~~~V~A~D~-~~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~  152 (383)
                      ...+.++||+..   .-...|.|.|+|. |-.++.++.|.|.|  | .|+..... .+.|.
T Consensus        69 tftL~~~lDH~~g~d~l~l~~~v~a~D~DGD~s~~~l~VtI~D--D-~P~~~~~~-~~~V~  125 (137)
T TIGR03660        69 EFTLEGPLDHAAGSDELTLNFPIIATDFDGDTSSITLPVTIVD--D-VPTITDVD-ALTVD  125 (137)
T ss_pred             EEEEcccccCCCCCceEEEeeeEEEEeCCCCccccEEEEEEEC--C-CCeecccc-ceEEe
Confidence            455678888843   4467889999985 44455688888887  6 57765543 35666


No 37 
>PF15347 PAG:  Phosphoprotein associated with glycosphingolipid-enriched
Probab=77.94  E-value=2.9  Score=40.84  Aligned_cols=36  Identities=8%  Similarity=0.056  Sum_probs=27.1

Q ss_pred             CCceeeehhhHHHHHHHHHHHHHHhhheeecCCCCC
Q psy4697         251 GTSSTVLIILGVVLIVLGFVIILLILYIHKNKHTKN  286 (383)
Q Consensus       251 ~~~~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~  286 (383)
                      ...+++++.|+++..||++.+|+|.+.-|.|.||..
T Consensus        12 q~qivlwgsLaav~~f~lis~LifLCsSC~reKK~~   47 (428)
T PF15347_consen   12 QVQIVLWGSLAAVTTFLLISFLIFLCSSCDREKKPK   47 (428)
T ss_pred             ceeEEeehHHHHHHHHHHHHHHHHHhhcccccccCC
Confidence            445788899999998888777766666777776654


No 38 
>PF02009 Rifin_STEVOR:  Rifin/stevor family;  InterPro: IPR002858 Malaria is still a major cause of mortality in many areas of the world. Plasmodium falciparum causes the most severe human form of the disease and is responsible for most fatalities. Severe cases of malaria can occur when the parasite invades and then proliferates within erythrocytes. The parasite produces many variant antigenic proteins, encoded by multigene families, which are present on the surface of the infected erythrocyte and play important roles in virulence. A crucial survival mechanism for the malaria parasite is its ability to evade the immune response by switching these variant surface antigens. The high virulence of P. falciparum relative to other malarial parasites is in large part due to the fact that in this organism many of these surface antigens mediate the binding of infected erythrocytes to the vascular endothelium (cytoadherence) and non-infected erythrocytes (rosetting). This can lead to the accumulation of infected cells in the vasculature of a variety of organs, blocking the blood flow and reducing the oxygen supply. Clinical symptoms of severe infection can include fever, progressive anaemia, multi-organ dysfunction and coma. Several multicopy gene families have been described in Plasmodium falciparum, including the stevor family of subtelomeric open reading frames and the rif interspersed repetitive elements. Both families contain three predicted transmembrane segments. It has been proposed that stevor and rif are members of a larger superfamily that code for variant surface antigens.
Probab=76.16  E-value=0.99  Score=43.40  Aligned_cols=30  Identities=37%  Similarity=0.502  Sum_probs=13.5

Q ss_pred             eeeehhhHHHHH-HHHHHHHHHhhheeecCCC
Q psy4697         254 STVLIILGVVLI-VLGFVIILLILYIHKNKHT  284 (383)
Q Consensus       254 ~~li~~l~~i~~-lL~l~~~~l~~~~~r~~~~  284 (383)
                      .++++++.+|++ +|+++++.|++ |+||+||
T Consensus       256 t~I~aSiiaIliIVLIMvIIYLIL-RYRRKKK  286 (299)
T PF02009_consen  256 TAIIASIIAILIIVLIMVIIYLIL-RYRRKKK  286 (299)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHH-HHHHHhh
Confidence            334444444443 44444444444 4455443


No 39 
>PF04478 Mid2:  Mid2 like cell wall stress sensor;  InterPro: IPR007567 This family represents a region near the C terminus of Mid2, which contains a transmembrane region. The remainder of the protein sequence is serine-rich and of low complexity, and is therefore impossible to align accurately. Mid2 is thought to act as a mechanosensor of cell wall stress. The C-terminal cytoplasmic region of Mid2 is known to interact with Rom2, a guanine nucleotide exchange factor (GEF) for Rho1, which is part of the cell wall integrity signalling pathway.
Probab=74.30  E-value=2.3  Score=36.44  Aligned_cols=11  Identities=0%  Similarity=0.178  Sum_probs=4.7

Q ss_pred             hhheeecCCCC
Q psy4697         275 ILYIHKNKHTK  285 (383)
Q Consensus       275 ~~~~~r~~~~~  285 (383)
                      ++++|+|+||.
T Consensus        70 vf~~c~r~kkt   80 (154)
T PF04478_consen   70 VFIFCIRRKKT   80 (154)
T ss_pred             heeEEEecccC
Confidence            33344444443


No 40 
>PF12273 RCR:  Chitin synthesis regulation, resistance to Congo red;  InterPro: IPR020999  RCR proteins are ER membrane proteins that regulate chitin deposition in fungal cell walls. Although chitin, a linear polymer of beta-1,4-linked N-acetylglucosamine, constitutes only 2% of the cell wall, it plays a vital role in the overall protection of the cell wall against stress, noxious chemicals and osmotic pressure changes. Congo red is a cell wall-disrupting benzidine-type dye extensively used in many cell wall mutant studies that specifically targets chitin in yeast cells and inhibits growth. RCR proteins render the yeasts resistant to Congo red by diminishing the content of chitin in the cell wall. RCR proteins probably regulate chitin synthase III and interact directly with ubiquitin ligase Rsp5; the VPEY motif is necessary for this interaction, which occurs via the WW domains of Rsp5.
Probab=73.37  E-value=3.1  Score=34.78  Aligned_cols=11  Identities=9%  Similarity=0.422  Sum_probs=5.6

Q ss_pred             heeecCCCCCC
Q psy4697         277 YIHKNKHTKNN  287 (383)
Q Consensus       277 ~~~r~~~~~~~  287 (383)
                      ++|+.+||+++
T Consensus        19 ~~~~~rRR~r~   29 (130)
T PF12273_consen   19 FYCHNRRRRRR   29 (130)
T ss_pred             HHHHHHHHhhc
Confidence            34555555544


No 41 
>PF14575 EphA2_TM:  Ephrin type-A receptor 2 transmembrane domain; PDB: 3KUL_A 2XVD_A 2VX1_A 2VWV_A 2VX0_A 2VWY_A 2VWZ_A 2VWW_A 2VWU_A 2VWX_A ....
Probab=72.08  E-value=2.2  Score=32.25  Aligned_cols=27  Identities=19%  Similarity=0.413  Sum_probs=14.7

Q ss_pred             hhhHHHHHHHHHHHHHHhhheeecCCC
Q psy4697         258 IILGVVLIVLGFVIILLILYIHKNKHT  284 (383)
Q Consensus       258 ~~l~~i~~lL~l~~~~l~~~~~r~~~~  284 (383)
                      ++.+++.+++++++++++..+|+|+++
T Consensus         2 ii~~~~~g~~~ll~~v~~~~~~~rr~~   28 (75)
T PF14575_consen    2 IIASIIVGVLLLLVLVIIVIVCFRRCK   28 (75)
T ss_dssp             HHHHHHHHHHHHHHHHHHHHCCCTT--
T ss_pred             EEehHHHHHHHHHHhheeEEEEEeeEc
Confidence            344555567777666555555555544


No 42 
>PF13750 Big_3_3:  Bacterial Ig-like domain (group 3)
Probab=71.98  E-value=66  Score=27.87  Aligned_cols=121  Identities=19%  Similarity=0.199  Sum_probs=63.3

Q ss_pred             CeEEEEE-EEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCC-CcEEEEEEeEeCCC--CeeEEEEecC
Q psy4697           9 QPITLVV-RAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLPV-NYVLLTVTTNKPRD--LRVKYWLSND   84 (383)
Q Consensus         9 ~~y~l~V-~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~-gt~v~~v~A~D~D~--~~v~Ysi~~~   84 (383)
                      -.|.+++ .|.|..+..........+.+   ...||...- .....+..+... |..=..+.++|...  .-.+.++.+.
T Consensus        15 G~Y~l~~~~a~D~agN~~~~~~~~~~~i---D~T~Ptisi-~~~~~~~~g~~v~~~~~i~i~~tD~~~~~~i~sv~l~Gg   90 (158)
T PF13750_consen   15 GSYTLTVVTATDAAGNTSTSTVSETFTI---DNTPPTISI-SDGASVANGSTVYGLVNISINVTDNSDDSKITSVSLTGG   90 (158)
T ss_pred             ccEEEEEEEEEecCCCEEEEEEeeEEEE---cCCCCEEEE-ecCCccCCCccccceeeeEEEEEeCCCCceEEEEEEECC
Confidence            5699999 79996544333333223433   344777644 111122222221 22223466666543  2334555542


Q ss_pred             C-CCCEEE--cC--cccEEE--cccC-CcccCcEEEEEEEEecC-CceeEEEEEEEEe
Q psy4697          85 Y-GERFSI--SR--QGDISL--MQCL-DYETEDSYRFTVYATDT-LMTTSATVNISVV  133 (383)
Q Consensus        85 ~-~~~F~I--d~--tG~I~~--~~~L-D~E~~~~y~~~V~A~D~-~~~s~~tV~I~V~  133 (383)
                      + .....+  ..  .|...+  .+.| ..|....|+++|.|.|. |..++.++.....
T Consensus        91 ~~~d~v~ls~~~~~~~~~~~~yp~~fpsle~~~~YtLtV~a~D~aGN~~~~si~F~y~  148 (158)
T PF13750_consen   91 PASDSVSLSWTNKGNGVYTLEYPRIFPSLEADDSYTLTVSATDKAGNQSTKSISFSYM  148 (158)
T ss_pred             cccceEEEeeEeccCceEEeecccccCCcCCCCeEEEEEEEEecCCCEEEEEEEEEEe
Confidence            2 222222  22  343322  1222 34778899999999995 6777777776654


No 43 
>PF05393 Hum_adeno_E3A:  Human adenovirus early E3A glycoprotein;  InterPro: IPR008652 This family consists of several early glycoproteins (E3A), from human adenovirus type 2.; GO: 0016021 integral to membrane
Probab=70.57  E-value=3.3  Score=31.94  Aligned_cols=28  Identities=14%  Similarity=0.196  Sum_probs=13.4

Q ss_pred             hhHHHHHHHHHHHHHHhhheeecCCCCCC
Q psy4697         259 ILGVVLIVLGFVIILLILYIHKNKHTKNN  287 (383)
Q Consensus       259 ~l~~i~~lL~l~~~~l~~~~~r~~~~~~~  287 (383)
                      ....++++++++++ +.+.||+.|||.++
T Consensus        36 ~~lvI~~iFil~Vi-lwfvCC~kRkrsRr   63 (94)
T PF05393_consen   36 WFLVICGIFILLVI-LWFVCCKKRKRSRR   63 (94)
T ss_pred             hHHHHHHHHHHHHH-HHHHHHHHhhhccC
Confidence            35555555544333 34445555554443


No 44 
>PF06024 DUF912:  Nucleopolyhedrovirus protein of unknown function (DUF912);  InterPro: IPR009261 This entry is represented by Autographa californica nuclear polyhedrosis virus (AcMNPV), Orf78; it is a family of uncharacterised viral proteins.
Probab=68.90  E-value=5  Score=32.11  Aligned_cols=30  Identities=17%  Similarity=0.297  Sum_probs=11.9

Q ss_pred             eehhhHHHHHHHHHHHHHHhhheeecCCCC
Q psy4697         256 VLIILGVVLIVLGFVIILLILYIHKNKHTK  285 (383)
Q Consensus       256 li~~l~~i~~lL~l~~~~l~~~~~r~~~~~  285 (383)
                      +++++.+++++|+++.++....+.|.+++.
T Consensus        64 ili~lls~v~IlVily~IyYFVILRer~~~   93 (101)
T PF06024_consen   64 ILISLLSFVCILVILYAIYYFVILRERQKS   93 (101)
T ss_pred             hHHHHHHHHHHHHHHhhheEEEEEeccccc
Confidence            333333333333333333333345544443


No 45 
>PF01299 Lamp:  Lysosome-associated membrane glycoprotein (Lamp);  InterPro: IPR002000 Lysosome-associated membrane glycoproteins (lamp) are integral membrane proteins, specific to lysosomes, whose exact biological function is not yet clear. Structurally, the lamp proteins consist of two internally homologous lysosome-luminal domains separated by a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region (TM) followed by a very short cytoplasmic tail (C). In each of the duplicated domains, there are two conserved disulphide bonds. Schematically, the order is: luminal domain (two disulphide bonds), hinge, luminal domain (two disulphide bonds), TM, C. In mammals, there are two closely related types of lamp: lamp-1 and lamp-2, which form major components of the lysosome membrane. In chicken, lamp-1 is known as LEP100. Also included in this entry is the macrophage protein CD68 (or macrosialin), a heavily glycosylated integral membrane protein whose structure consists of a mucin-like domain followed by a proline-rich hinge, a single lamp-like domain, a transmembrane region and a short cytoplasmic tail. Similar to CD68, mammalian lamp-3, which is expressed in lymphoid organs, dendritic cells and in lung, contains all the C-terminal regions but lacks the N-terminal lamp-like region. In a lamp-family protein from nematodes only the part C-terminal to the hinge is conserved.; GO: 0016020 membrane
Probab=68.82  E-value=3.6  Score=39.75  Aligned_cols=20  Identities=30%  Similarity=0.433  Sum_probs=12.0

Q ss_pred             ceeeehhhHHHHHHHHHHHH
Q psy4697         253 SSTVLIILGVVLIVLGFVII  272 (383)
Q Consensus       253 ~~~li~~l~~i~~lL~l~~~  272 (383)
                      ...+-|++|++|+.|++++|
T Consensus       270 ~~~vPIaVG~~La~lvlivL  289 (306)
T PF01299_consen  270 SDLVPIAVGAALAGLVLIVL  289 (306)
T ss_pred             cchHHHHHHHHHHHHHHHHH
Confidence            45556667777766655443


No 46 
>PF13753 SWM_repeat:  Putative flagellar system-associated repeat
Probab=67.78  E-value=1.2e+02  Score=29.16  Aligned_cols=202  Identities=18%  Similarity=0.140  Sum_probs=88.9

Q ss_pred             CCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCC------CCcEEEEEEeEeCCC-CeeEEE
Q psy4697           8 LQPITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLP------VNYVLLTVTTNKPRD-LRVKYW   80 (383)
Q Consensus         8 ~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~------~gt~v~~v~A~D~D~-~~v~Ys   80 (383)
                      -..|.+.+.++|..+....  .+..+.|.-.   +|...-.    .+.++..      ......+..+++.+. ..+.+.
T Consensus        11 d~~~~v~vt~tD~aGN~~~--~t~~~~vDt~---~P~v~i~----~~~~~~~~~~~~~~~~~t~s~tvs~~~~g~~v~v~   81 (317)
T PF13753_consen   11 DGTYTVSVTVTDAAGNTST--ATQSITVDTT---APTVTIT----SIADDDIINGDEATNTVTFSGTVSGAEPGSTVTVT   81 (317)
T ss_pred             CCcEEEEEEEEeCCCCeee--eeEEEEEecC---CCceeee----cccCCCccccceeeeeeEEEEEecCCCCCCEEEEE
Confidence            4679999999996644333  3344443322   5533221    1111111      122233444433333 446555


Q ss_pred             EecCCCCCEEEcCcccEEEcc-cCCcccCcEEEEEEE-EecC-CceeEE-EEEEEEeecCCCCCcccCCceeE-EEeccC
Q psy4697          81 LSNDYGERFSISRQGDISLMQ-CLDYETEDSYRFTVY-ATDT-LMTTSA-TVNISVVNVNDWDPRFRYPQYEL-FLPHIP  155 (383)
Q Consensus        81 i~~~~~~~F~Id~tG~I~~~~-~LD~E~~~~y~~~V~-A~D~-~~~s~~-tV~I~V~DvNDn~P~f~~~~~~~-~v~~~~  155 (383)
                      +... ..-+..+.+|...+.- +-+.-....|.+.+. ++|. |..+.+ ...+.|-..--.+|.+.-....- .+.   
T Consensus        82 ~~g~-~~t~~~~~~G~ws~t~~~~~~l~~g~~ti~v~~~tD~aGN~~t~~s~~~~vDt~~~~~p~vti~~~~~~~~~---  157 (317)
T PF13753_consen   82 INGT-TGTLTADADGNWSVTVTPSDDLPDGDYTITVTTVTDAAGNTSTAASQTFTVDTTAPTAPTVTITGISDDNII---  157 (317)
T ss_pred             ECCE-EEEEEEecCCcEEEeeccccccccCcceeEEEEEEccCCccccccccccccccccccccccceecccCCcee---
Confidence            5221 1123334466533321 111223458888998 9995 454444 45553333311245543321000 011   


Q ss_pred             CCCCCCCceEEEEEeeeCCCCCeEEEEEeCCCCCCEEEcCCCcEE--Eec--cCCCCcceEEEEEEEeeCCC
Q psy4697         156 LADLTPGSVIGKVEAADGDKGDRVTLSLRGPYEKMFSINDSGHIS--IVD--LSALNTSTIQLVVVATDTGN  223 (383)
Q Consensus       156 ~e~~~~g~~v~~v~A~D~D~g~~i~ysi~~~~~~~F~i~~tG~i~--l~~--~~~~~~~~y~L~V~a~D~g~  223 (383)
                      ..........+.-...+.+.++.+...+.|... .+.....|...  ...  ......+.|.+.+.++|..+
T Consensus       158 ~~~~~~~t~t~sg~v~~~~~~d~v~vt~~G~~~-~~~~~~~g~~t~~~~~~~~~~~~d~~~~v~v~~tD~AG  228 (317)
T PF13753_consen  158 NGAESTVTVTFSGTVTGFDAGDTVTVTINGTTY-TTTVGADGTWTVTVTPSDLAGLADGTYTVTVTVTDAAG  228 (317)
T ss_pred             eccceeecccccccceeeeeceeEEEeeccccc-ceeecCCCcccccccccccccccCceEEEEEEeeeccc
Confidence            000001111122222345555556666643332 44454444222  111  12244568999999999743


No 47 
>PF15298 AJAP1_PANP_C:  AJAP1/PANP C-terminus
Probab=66.73  E-value=14  Score=32.99  Aligned_cols=89  Identities=11%  Similarity=0.243  Sum_probs=42.1

Q ss_pred             ceeeehhhHHHHHHHHHHHHHHhhheeecCCCCCCCCCCC-CCCCCCCcccCc-cCCCCceeecccCceEeeccccCccc
Q psy4697         253 SSTVLIILGVVLIVLGFVIILLILYIHKNKHTKNNGPPGS-SHSKNDSFLSNV-ILPEKHVNVVAIPKIQENPVFNGSQE  330 (383)
Q Consensus       253 ~~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~~~~~~~-~~~~~~~~~~~~-~~p~~~~~~~~~~~i~~~p~~~~~~~  330 (383)
                      ..++-|.+..|+.|..|+.-+++..||-+..+.++.+... ..+.|.+.++-- -.|.|.-++        ..+|--|++
T Consensus        99 h~~iTITvSlImViaAliTtlvlK~C~~~s~~~r~~s~qr~~~qqeeS~Q~Ltd~~p~~~ps~--------~diftayn~  170 (205)
T PF15298_consen   99 HQIITITVSLIMVIAALITTLVLKNCCAQSQNRRRNSHQRKINQQEESCQNLTDFTPARVPSS--------VDIFTAYND  170 (205)
T ss_pred             eEEEEEeeehhHHHHHhhhhhhhhhhhhhhcccCCCccccccccchhhccccccCCcccCccc--------eeEecccCC
Confidence            3444455555554444444445556666665555444322 222222222211 122222222        446788888


Q ss_pred             ccc---cccCCCCCCcccCCeec
Q psy4697         331 ELQ---TQRGGTNSSIYTATVKK  350 (383)
Q Consensus       331 ~~~---~~~~~~~s~i~~~~~~~  350 (383)
                      .+|   ++-. +--++|...+++
T Consensus       171 sl~cshecvr-~~~~~y~~e~~~  192 (205)
T PF15298_consen  171 SLQCSHECVR-TSVPVYTDETLH  192 (205)
T ss_pred             CCCCCccccc-CCCCcccccccC
Confidence            888   5544 455666544444


No 48 
>PF05083 LST1:  LST-1 protein;  InterPro: IPR007775 B144/LST1 is a gene encoded in the human major histocompatibility complex that produces multiple forms of alternatively spliced mRNA and encodes peptides fewer than 100 amino acids in length. B144/LST1 is strongly expressed in dendritic cells. Transfection of B144/LST1 into a variety of cells induces morphologic changes including the production of long, thin filopodia. It may have a role in modulating immune responses. It induces morphological changes, including production of filopodia and microspikes, when overexpressed in a variety of cell types and may be involved in dendritic cell maturation. Isoform 1 and isoform 2 have an inhibitory effect on lymphocyte proliferation. ; GO: 0000902 cell morphogenesis, 0006955 immune response, 0016020 membrane
Probab=65.94  E-value=2.7  Score=30.90  Aligned_cols=23  Identities=4%  Similarity=0.013  Sum_probs=8.9

Q ss_pred             eeecCCCCCCCCCCCCCCCCCCc
Q psy4697         278 IHKNKHTKNNGPPGSSHSKNDSF  300 (383)
Q Consensus       278 ~~r~~~~~~~~~~~~~~~~~~~~  300 (383)
                      ..||.++-.+......++.+..|
T Consensus        19 lsrRvkrLErs~~~~~~eQE~hy   41 (74)
T PF05083_consen   19 LSRRVKRLERSWEQLSSEQELHY   41 (74)
T ss_pred             HHhhhhhcccchhccccccchHH
Confidence            34444433333332233344444


No 49 
>PTZ00382 Variant-specific surface protein (VSP); Provisional
Probab=64.83  E-value=4.3  Score=32.21  Aligned_cols=24  Identities=29%  Similarity=0.427  Sum_probs=10.5

Q ss_pred             hhhHHHHHHHHHHHHHHhhheeec
Q psy4697         258 IILGVVLIVLGFVIILLILYIHKN  281 (383)
Q Consensus       258 ~~l~~i~~lL~l~~~~l~~~~~r~  281 (383)
                      |++++++++.+|+.+++.++++|+
T Consensus        71 i~vg~~~~v~~lv~~l~w~f~~r~   94 (96)
T PTZ00382         71 ISVAVVAVVGGLVGFLCWWFVCRG   94 (96)
T ss_pred             EEeehhhHHHHHHHHHhheeEEee
Confidence            444444444444444334444443


No 50 
>PF06365 CD34_antigen:  CD34/Podocalyxin family;  InterPro: IPR013836 This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation. This family contains several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighbouring foot processes in the glomerular epithelium by charge repulsion.
Probab=64.47  E-value=21  Score=32.28  Aligned_cols=7  Identities=57%  Similarity=0.605  Sum_probs=2.8

Q ss_pred             Ccccccc
Q psy4697         327 GSQEELQ  333 (383)
Q Consensus       327 ~~~~~~~  333 (383)
                      +.+.++|
T Consensus       158 ~~~~E~q  164 (202)
T PF06365_consen  158 ESQPEMQ  164 (202)
T ss_pred             CCCcccc
Confidence            3334444


No 51 
>TIGR01478 STEVOR variant surface antigen, stevor family. This model represents the stevor branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of stevor sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 8 bits.
Probab=62.30  E-value=5.4  Score=37.72  Aligned_cols=14  Identities=29%  Similarity=0.479  Sum_probs=7.3

Q ss_pred             ecCcEEEEEECCCC
Q psy4697          46 SKNEYSVSALENLP   59 (383)
Q Consensus        46 ~~~~y~~~V~En~~   59 (383)
                      ....|++++-.|..
T Consensus        20 ~n~~yn~~li~n~t   33 (295)
T TIGR01478        20 HNKKYNVSYIQNNT   33 (295)
T ss_pred             hccccceecccCcc
Confidence            44556665555443


No 52 
>PF15330 SIT:  SHP2-interacting transmembrane adaptor protein, SIT
Probab=62.04  E-value=7.5  Score=31.51  Aligned_cols=31  Identities=16%  Similarity=0.147  Sum_probs=19.6

Q ss_pred             hHHHHHHHHHHHHHHhhheeecCCCCCCCCC
Q psy4697         260 LGVVLIVLGFVIILLILYIHKNKHTKNNGPP  290 (383)
Q Consensus       260 l~~i~~lL~l~~~~l~~~~~r~~~~~~~~~~  290 (383)
                      |-+++++|+++++++-+..||.+|++.+.+.
T Consensus         3 Ll~il~llLll~l~asl~~wr~~~rq~k~~~   33 (107)
T PF15330_consen    3 LLGILALLLLLSLAASLLAWRMKQRQKKAGQ   33 (107)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHhhhccccC
Confidence            3445555555555566788888877765555


No 53 
>PF02480 Herpes_gE:  Alphaherpesvirus glycoprotein E;  InterPro: IPR003404 Glycoprotein E (gE) of Alphaherpesvirus forms a complex with glycoprotein I (gI), functioning as an immunoglobulin G (IgG) Fc binding protein. gE is involved in virus spread but is not essential for propagation.; GO: 0016020 membrane; PDB: 2GJ7_F 2GIY_B.
Probab=61.76  E-value=2.6  Score=42.84  Aligned_cols=16  Identities=6%  Similarity=-0.052  Sum_probs=5.8

Q ss_pred             EEEEEEEEeecCCCCC
Q psy4697         125 SATVNISVVNVNDWDP  140 (383)
Q Consensus       125 ~~tV~I~V~DvNDn~P  140 (383)
                      ++++.+.-.++|...+
T Consensus       182 ~~~i~W~~~~~~~~C~  197 (439)
T PF02480_consen  182 SLEIDWYYMPTDPSCA  197 (439)
T ss_dssp             EEEEEEEEE---TT-S
T ss_pred             eEEEEEEEecCCCCCc
Confidence            3444555555554444


No 54 
>PTZ00370 STEVOR; Provisional
Probab=61.56  E-value=5.7  Score=37.65  Aligned_cols=11  Identities=18%  Similarity=0.332  Sum_probs=5.2

Q ss_pred             cCCcccCcEEE
Q psy4697         102 CLDYETEDSYR  112 (383)
Q Consensus       102 ~LD~E~~~~y~  112 (383)
                      .||+|..+.|+
T Consensus        65 ~~n~eaikkyq   75 (296)
T PTZ00370         65 KMNEEAIKKYQ   75 (296)
T ss_pred             HHhHHHhhhhh
Confidence            35555544443


No 55 
>PF13750 Big_3_3:  Bacterial Ig-like domain (group 3)
Probab=59.94  E-value=1.2e+02  Score=26.33  Aligned_cols=121  Identities=22%  Similarity=0.278  Sum_probs=63.1

Q ss_pred             CcEEEEEE-EEecC-CceeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCC-CceEEEEEeeeCCCCCe-EEEEE
Q psy4697         108 EDSYRFTV-YATDT-LMTTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTP-GSVIGKVEAADGDKGDR-VTLSL  183 (383)
Q Consensus       108 ~~~y~~~V-~A~D~-~~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~-g~~v~~v~A~D~D~g~~-i~ysi  183 (383)
                      ...|.+.+ .|.|. |.....++...+. ++..+|.+.- .....+.    ..... |..=..+.++|...+.. -..++
T Consensus        14 dG~Y~l~~~~a~D~agN~~~~~~~~~~~-iD~T~Ptisi-~~~~~~~----~g~~v~~~~~i~i~~tD~~~~~~i~sv~l   87 (158)
T PF13750_consen   14 DGSYTLTVVTATDAAGNTSTSTVSETFT-IDNTPPTISI-SDGASVA----NGSTVYGLVNISINVTDNSDDSKITSVSL   87 (158)
T ss_pred             CccEEEEEEEEEecCCCEEEEEEeeEEE-EcCCCCEEEE-ecCCccC----CCccccceeeeEEEEEeCCCCceEEEEEE
Confidence            45799999 79995 4555555543333 2444776643 0001111    11111 11224577777765543 45667


Q ss_pred             eCCC-CCCEEE--cC--CCcEEEe--c--cCCCCcceEEEEEEEeeCCCCCceeEEEEEEEE
Q psy4697         184 RGPY-EKMFSI--ND--SGHISIV--D--LSALNTSTIQLVVVATDTGNPPRQASVPAIMHF  236 (383)
Q Consensus       184 ~~~~-~~~F~i--~~--tG~i~l~--~--~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v  236 (383)
                      .|+. +..-.+  ..  .|...+.  .  +.-+....|.|+|.|.|..+  ..++..+....
T Consensus        88 ~Gg~~~d~v~ls~~~~~~~~~~~~yp~~fpsle~~~~YtLtV~a~D~aG--N~~~~si~F~y  147 (158)
T PF13750_consen   88 TGGPASDSVSLSWTNKGNGVYTLEYPRIFPSLEADDSYTLTVSATDKAG--NQSTKSISFSY  147 (158)
T ss_pred             ECCcccceEEEeeEeccCceEEeecccccCCcCCCCeEEEEEEEEecCC--CEEEEEEEEEE
Confidence            6432 222222  22  3333222  1  12245789999999999855  35555555544


No 56 
>KOG3597|consensus
Probab=59.93  E-value=94  Score=31.67  Aligned_cols=59  Identities=19%  Similarity=0.062  Sum_probs=39.7

Q ss_pred             eEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCCCceEEEEEeeeCCCCC-eEEEEEeCC
Q psy4697         124 TSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTPGSVIGKVEAADGDKGD-RVTLSLRGP  186 (383)
Q Consensus       124 s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~g~~v~~v~A~D~D~g~-~i~ysi~~~  186 (383)
                      -+...+|.|.-+||++..+....+.+-+.    +....-..-..+.+.|+|++. .+.|++.+.
T Consensus        24 ~~~~~~i~v~pvndpp~~~~~~~~~l~~~----~~~~k~l~~~~l~~~d~d~~~~~l~f~v~~t   83 (442)
T KOG3597|consen   24 QTDVLRIHVNPVNDPPSLIFPSGSLLVIL----EGGQKVLDPELLTAADPDSAPLPLEFQVLGT   83 (442)
T ss_pred             EEeeecccccccCCCcceeecccceEEee----cCCceeccceEeeccCCCCCccceEEEEccC
Confidence            55677899999999655555444455555    444333333568999999875 788888643


No 57 
>PF12768 Rax2:  Cortical protein marker for cell polarity
Probab=59.01  E-value=15  Score=35.12  Aligned_cols=11  Identities=45%  Similarity=0.510  Sum_probs=5.1

Q ss_pred             eeehhhHHHHH
Q psy4697         255 TVLIILGVVLI  265 (383)
Q Consensus       255 ~li~~l~~i~~  265 (383)
                      .+.|.||+.++
T Consensus       229 VVlIslAiALG  239 (281)
T PF12768_consen  229 VVLISLAIALG  239 (281)
T ss_pred             EEEEehHHHHH
Confidence            33445555444


No 58 
>PF15102 TMEM154:  TMEM154 protein family
Probab=58.73  E-value=12  Score=31.89  Aligned_cols=35  Identities=14%  Similarity=0.275  Sum_probs=21.7

Q ss_pred             CceeeehhhHHHHHHHHHHHHHHhhheeecCCCCC
Q psy4697         252 TSSTVLIILGVVLIVLGFVIILLILYIHKNKHTKN  286 (383)
Q Consensus       252 ~~~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~  286 (383)
                      ..+++|-.+..+++||++++++...+|+|.|+...
T Consensus        58 iLmIlIP~VLLvlLLl~vV~lv~~~kRkr~K~~~s   92 (146)
T PF15102_consen   58 ILMILIPLVLLVLLLLSVVCLVIYYKRKRTKQEPS   92 (146)
T ss_pred             EEEEeHHHHHHHHHHHHHHHheeEEeecccCCCCc
Confidence            56666664445555554456666777888888643


No 59 
>PTZ00046 rifin; Provisional
Probab=57.20  E-value=4.9  Score=39.44  Aligned_cols=30  Identities=33%  Similarity=0.491  Sum_probs=13.7

Q ss_pred             eeeehhhHHHH-HHHHHHHHHHhhheeecCCC
Q psy4697         254 STVLIILGVVL-IVLGFVIILLILYIHKNKHT  284 (383)
Q Consensus       254 ~~li~~l~~i~-~lL~l~~~~l~~~~~r~~~~  284 (383)
                      -++++++.+|+ ++|+.+++.|++ |+||+||
T Consensus       315 taIiaSiiAIvVIVLIMvIIYLIL-RYRRKKK  345 (358)
T PTZ00046        315 TAIIASIVAIVVIVLIMVIIYLIL-RYRRKKK  345 (358)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHH-Hhhhcch
Confidence            34444443333 344445554555 4555544


No 60 
>PF05568 ASFV_J13L:  African swine fever virus J13L protein;  InterPro: IPR008385 This family consists of several African swine fever virus (ASFV) j13L proteins.
Probab=57.01  E-value=6.4  Score=33.34  Aligned_cols=29  Identities=24%  Similarity=0.395  Sum_probs=12.3

Q ss_pred             ehhhHHHHHHHHHHHHHHhhheeecCCCCC
Q psy4697         257 LIILGVVLIVLGFVIILLILYIHKNKHTKN  286 (383)
Q Consensus       257 i~~l~~i~~lL~l~~~~l~~~~~r~~~~~~  286 (383)
                      +++|.+|.++.+ ++++++.+|.+||||..
T Consensus        32 ~tILiaIvVlii-iiivli~lcssRKkKaa   60 (189)
T PF05568_consen   32 YTILIAIVVLII-IIIVLIYLCSSRKKKAA   60 (189)
T ss_pred             HHHHHHHHHHHH-HHHHHHHHHhhhhHHHH
Confidence            334443333332 33334444555555543


No 61 
>TIGR01477 RIFIN variant surface antigen, rifin family. This model represents the rifin branch of the rifin/stevor family (pfam02009) of predicted variant surface antigens as found in Plasmodium falciparum. This model is based on a set of rifin sequences kindly provided by Matt Berriman from the Sanger Center. This is a global model and assesses a penalty for incomplete sequence. Additional fragmentary sequences may be found with the fragment model and a cutoff of 20 bits.
Probab=56.86  E-value=5.3  Score=39.12  Aligned_cols=30  Identities=37%  Similarity=0.496  Sum_probs=13.8

Q ss_pred             eeeehhhHHHH-HHHHHHHHHHhhheeecCCC
Q psy4697         254 STVLIILGVVL-IVLGFVIILLILYIHKNKHT  284 (383)
Q Consensus       254 ~~li~~l~~i~-~lL~l~~~~l~~~~~r~~~~  284 (383)
                      -++++++.+|+ ++|+.+++.|++ |.||+||
T Consensus       310 t~IiaSiIAIvvIVLIMvIIYLIL-RYRRKKK  340 (353)
T TIGR01477       310 TPIIASIIAILIIVLIMVIIYLIL-RYRRKKK  340 (353)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHH-Hhhhcch
Confidence            33444433333 345555555555 4555543


No 62 
>PF07495 Y_Y_Y:  Y_Y_Y domain;  InterPro: IPR011123 This region is mostly found at the end of the beta propellers (IPR011110 from INTERPRO) in a family of two component regulators. However, it is also found tandemly repeated in Q891H4 from SWISSPROT without other signal conduction domains being present. It is named after the conserved tyrosines found in the alignment. The exact function is not known.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=55.72  E-value=45  Score=23.55  Aligned_cols=55  Identities=20%  Similarity=0.310  Sum_probs=31.9

Q ss_pred             CeeEEEEecCCCCCEEEcCcc-cEEEcccCCcccCcEEEEEEEEecCC--c-eeEEEEEEEEe
Q psy4697          75 LRVKYWLSNDYGERFSISRQG-DISLMQCLDYETEDSYRFTVYATDTL--M-TTSATVNISVV  133 (383)
Q Consensus        75 ~~v~Ysi~~~~~~~F~Id~tG-~I~~~~~LD~E~~~~y~~~V~A~D~~--~-~s~~tV~I~V~  133 (383)
                      -...|.|.+-...|..+.... .+...    .-....|+|.|.|.|..  . ....++.|.|+
T Consensus         8 ~~Y~Y~l~g~d~~W~~~~~~~~~~~~~----~L~~G~Y~l~V~a~~~~~~~~~~~~~l~i~I~   66 (66)
T PF07495_consen    8 IRYRYRLEGFDDEWITLGSYSNSISYT----NLPPGKYTLEVRAKDNNGKWSSDEKSLTITIL   66 (66)
T ss_dssp             EEEEEEEETTESSEEEESSTS-EEEEE----S--SEEEEEEEEEEETTS-B-SS-EEEEEEEE
T ss_pred             eEEEEEEECCCCeEEECCCCcEEEEEE----eCCCEEEEEEEEEECCCCCcCcccEEEEEEEC
Confidence            345667776556677766433 33221    12357999999999953  2 22266776663


No 63 
>PF07204 Orthoreo_P10:  Orthoreovirus membrane fusion protein p10;  InterPro: IPR009854 This family consists of several Orthoreovirus membrane fusion protein p10 sequences. p10 is thought to be a multifunctional protein that plays a key role in virus-host interaction.
Probab=54.30  E-value=5.2  Score=31.28  Aligned_cols=8  Identities=25%  Similarity=0.368  Sum_probs=3.8

Q ss_pred             heeecCCC
Q psy4697         277 YIHKNKHT  284 (383)
Q Consensus       277 ~~~r~~~~  284 (383)
                      ++||.|+|
T Consensus        62 ~CC~~K~K   69 (98)
T PF07204_consen   62 CCCRAKHK   69 (98)
T ss_pred             HHhhhhhh
Confidence            34555544


No 64 
>PF02038 ATP1G1_PLM_MAT8:  ATP1G1/PLM/MAT8 family;  InterPro: IPR000272  The FXYD protein family contains at least seven members in mammals. Two other family members that are not obvious orthologs of any identified mammalian FXYD protein exist in zebrafish. All these proteins share a signature sequence of six conserved amino acids comprising the FXYD motif in the NH2-terminus, and two glycines and one serine residue in the transmembrane domain. FXYD proteins are widely distributed in mammalian tissues with prominent expression in tissues that perform fluid and solute transport or that are electrically excitable.  Initial functional characterisation suggested that FXYD proteins act as channels or as modulators of ion channels; however, studies have revealed that most FXYD proteins have another specific function and act as tissue-specific regulatory subunits of the Na,K-ATPase. Each of these auxiliary subunits produces a distinct functional effect on the transport characteristics of the Na,K-ATPase that is adjusted to the specific functional demands of the tissue in which the FXYD protein is expressed. FXYD proteins appear to preferentially associate with Na,K-ATPase alpha1-beta isozymes, and affect their function in a way that renders them operationally complementary or supplementary to coexisting isozymes.; GO: 0005216 ion channel activity, 0006811 ion transport, 0016020 membrane; PDB: 2JO1_A 2JP3_A 2ZXE_G 3A3Y_G 3N23_E 3B8E_H 3KDP_G 3N2F_E.
Probab=54.01  E-value=11  Score=25.95  Aligned_cols=9  Identities=44%  Similarity=0.752  Sum_probs=3.6

Q ss_pred             HHHHHHHHH
Q psy4697         261 GVVLIVLGF  269 (383)
Q Consensus       261 ~~i~~lL~l  269 (383)
                      |++++++.+
T Consensus        22 A~vlfi~Gi   30 (50)
T PF02038_consen   22 AGVLFILGI   30 (50)
T ss_dssp             HHHHHHHHH
T ss_pred             HHHHHHHHH
Confidence            334444443


No 65 
>KOG4482|consensus
Probab=52.11  E-value=17  Score=35.85  Aligned_cols=40  Identities=18%  Similarity=0.168  Sum_probs=22.1

Q ss_pred             CCceeeehhhHHHHHH-HHHHHHHHhhheeecCCCCCCCCC
Q psy4697         251 GTSSTVLIILGVVLIV-LGFVIILLILYIHKNKHTKNNGPP  290 (383)
Q Consensus       251 ~~~~~li~~l~~i~~l-L~l~~~~l~~~~~r~~~~~~~~~~  290 (383)
                      .+...+..++|..+++ +++++++.+++||||++.+.+-..
T Consensus       292 dyy~df~~tfaIpl~Valll~~~La~imc~rrEg~~~rd~~  332 (449)
T KOG4482|consen  292 DYYGDFLHTFAIPLGVALLLVLALAYIMCCRREGQKKRDDK  332 (449)
T ss_pred             hHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhcccccccc
Confidence            4445555566665553 233344445668888876654433


No 66 
>PF11980 DUF3481:  Domain of unknown function (DUF3481);  InterPro: IPR022579  This domain of unknown function is located in the C terminus of the eukaryotic neuropilin receptor family of proteins. It is found in association with PF00754 from PFAM, PF00431 from PFAM and PF00629 from PFAM. There are two completely conserved residues (Y and E) that may be functionally important.
Probab=50.07  E-value=12  Score=28.62  Aligned_cols=30  Identities=27%  Similarity=0.373  Sum_probs=17.2

Q ss_pred             ceeeehhhHHHHHHHHHHHHHHhhheeecC
Q psy4697         253 SSTVLIILGVVLIVLGFVIILLILYIHKNK  282 (383)
Q Consensus       253 ~~~li~~l~~i~~lL~l~~~~l~~~~~r~~  282 (383)
                      .++-|++.++.+++|+.+.+.+++.++|.+
T Consensus        15 ~~yyiiA~gga~llL~~v~l~vvL~C~r~~   44 (87)
T PF11980_consen   15 YWYYIIAMGGALLLLVAVCLGVVLYCHRFH   44 (87)
T ss_pred             eeeHHHhhccHHHHHHHHHHHHHHhhhhhc
Confidence            445566666666666666655555444443


No 67 
>PF12191 stn_TNFRSF12A:  Tumour necrosis factor receptor stn_TNFRSF12A_TNFR domain;  InterPro: IPR022316 The tumour necrosis factor (TNF) receptor (TNFR) superfamily comprises more than 20 type-I transmembrane proteins. Family members are defined based on similarity in their extracellular domain - a region that contains many cysteine residues arranged in a specific repetitive pattern. The cysteines allow formation of an extended rod-like structure, responsible for ligand binding. Upon receptor activation, different intracellular signalling complexes are assembled for different members of the TNFR superfamily, depending on their intracellular domains and sequences. Activation of TNFRs can therefore induce a range of disparate effects, including cell proliferation, differentiation, survival, or apoptotic cell death, depending upon the receptor involved. TNFRs are widely distributed and play important roles in many crucial biological processes, such as lymphoid and neuronal development, innate and adaptive immunity, and maintenance of cellular homeostasis. Drugs that manipulate their signalling have potential roles in the prevention and treatment of many diseases, such as viral infections, coronary heart disease, transplant rejection, and immune disease. TNF receptor 12 (also known as TWEAK receptor, and fibroblast growth factor-inducible-14 (Fn14)) has been implicated in endothelial cell growth and migration. The receptor may also play a role in cell-matrix interactions.; PDB: 2KN0_A 2RPJ_A 2KMZ_A 2EQP_A.
Probab=49.47  E-value=7.1  Score=32.33  Aligned_cols=13  Identities=8%  Similarity=0.120  Sum_probs=0.0

Q ss_pred             HHHhhheeecCCC
Q psy4697         272 ILLILYIHKNKHT  284 (383)
Q Consensus       272 ~~l~~~~~r~~~~  284 (383)
                      .++++++|||++|
T Consensus        97 g~lv~rrcrrr~~  109 (129)
T PF12191_consen   97 GFLVWRRCRRREK  109 (129)
T ss_dssp             -------------
T ss_pred             HHHHHhhhhcccc
Confidence            3444555555543


No 68 
>PF13753 SWM_repeat:  Putative flagellar system-associated repeat
Probab=47.40  E-value=2.3e+02  Score=27.17  Aligned_cols=108  Identities=20%  Similarity=0.302  Sum_probs=57.0

Q ss_pred             CcEEEEEEEEecC-CceeEEEEEEEEeecCCCCCcccCCce--eEEEeccCCCCCCCCceEEEEEeeeCCCCCeEEEEEe
Q psy4697         108 EDSYRFTVYATDT-LMTTSATVNISVVNVNDWDPRFRYPQY--ELFLPHIPLADLTPGSVIGKVEAADGDKGDRVTLSLR  184 (383)
Q Consensus       108 ~~~y~~~V~A~D~-~~~s~~tV~I~V~DvNDn~P~f~~~~~--~~~v~~~~~e~~~~g~~v~~v~A~D~D~g~~i~ysi~  184 (383)
                      ...|.+.+.++|. |..+..+..+.|--.   +|.......  ...+.    .............+.+.+.|..+.+.+.
T Consensus        11 d~~~~v~vt~tD~aGN~~~~t~~~~vDt~---~P~v~i~~~~~~~~~~----~~~~~~~~t~s~tvs~~~~g~~v~v~~~   83 (317)
T PF13753_consen   11 DGTYTVSVTVTDAAGNTSTATQSITVDTT---APTVTITSIADDDIIN----GDEATNTVTFSGTVSGAEPGSTVTVTIN   83 (317)
T ss_pred             CCcEEEEEEEEeCCCCeeeeeEEEEEecC---CCceeeecccCCCccc----cceeeeeeEEEEEecCCCCCCEEEEEEC
Confidence            4678999999995 555555555543222   664332210  00000    0111222345566667777777777763


Q ss_pred             CCCCCCEEEcCCCcEEEe-cc-CCCCcceEEEEEE-EeeCCC
Q psy4697         185 GPYEKMFSINDSGHISIV-DL-SALNTSTIQLVVV-ATDTGN  223 (383)
Q Consensus       185 ~~~~~~F~i~~tG~i~l~-~~-~~~~~~~y~L~V~-a~D~g~  223 (383)
                      + ....+..+.+|.-... .. ..+..+.|.+.+. ++|..+
T Consensus        84 g-~~~t~~~~~~G~ws~t~~~~~~l~~g~~ti~v~~~tD~aG  124 (317)
T PF13753_consen   84 G-TTGTLTADADGNWSVTVTPSDDLPDGDYTITVTTVTDAAG  124 (317)
T ss_pred             C-EEEEEEEecCCcEEEeeccccccccCcceeEEEEEEccCC
Confidence            2 2223344445642111 11 2466778999999 999754


No 69 
>PF02158 Neuregulin:  Neuregulin family;  InterPro: IPR002154 Neuregulins are a sub-family of EGF-like molecules that have been shown to play multiple essential roles in vertebrate embryogenesis including: cardiac development, Schwann cell and oligodendrocyte differentiation, some aspects of neuronal development, as well as the formation of neuromuscular synapses. Included in the family are heregulin; neu differentiation factor; acetylcholine receptor synthesis stimulator; glial growth factor; and sensory and motor-neuron derived factor. Multiple family members are generated by alternate splicing or by use of several cell type-specific transcription initiation sites. In general, they bind to and activate the erbB family of receptor tyrosine kinases (erbB2 (HER2), erbB3 (HER3), and erbB4 (HER4)), functioning both as heterodimers and homodimers.  The transmembrane forms of neuregulin 1 (NRG1) are present within synaptic vesicles, including those containing glutamate. After exocytosis, NRG1 is in the presynaptic membrane, where the ectodomain of NRG1 may be cleaved off. The ectodomain then migrates across the synaptic cleft and binds to and activates a member of the EGF-receptor family on the postsynaptic membrane. This has been shown to increase the expression of certain glutamate-receptor subunits. NRG1 appears to signal for glutamate-receptor subunit expression, localisation, and/or phosphorylation facilitating subsequent glutamate transmission.  The NRG1 gene has been identified as a potential gene determining susceptibility to schizophrenia by a combination of genetic linkage and association approaches. ; GO: 0005102 receptor binding, 0009790 embryo development; PDB: 1HRE_A 1HAE_A 1HAF_A 1HRF_A.
Probab=47.24  E-value=6.3  Score=38.80  Aligned_cols=29  Identities=17%  Similarity=0.243  Sum_probs=0.0

Q ss_pred             ehhhHHHHHHHHHHHHHHhh-heeecCCCC
Q psy4697         257 LIILGVVLIVLGFVIILLIL-YIHKNKHTK  285 (383)
Q Consensus       257 i~~l~~i~~lL~l~~~~l~~-~~~r~~~~~  285 (383)
                      |..+.|||+-||++.+++++ +.||.||.+
T Consensus         9 VLTITgIcvaLlVVGi~Cvv~aYCKTKKQR   38 (404)
T PF02158_consen    9 VLTITGICVALLVVGIVCVVDAYCKTKKQR   38 (404)
T ss_dssp             ------------------------------
T ss_pred             hhhhhhhhHHHHHHHHHHHHHHHHHhHHHH
Confidence            55667777766666666566 667666653


No 70 
>PF15069 FAM163:  FAM163 family
Probab=45.86  E-value=58  Score=27.71  Aligned_cols=7  Identities=14%  Similarity=0.112  Sum_probs=3.0

Q ss_pred             cccCCCC
Q psy4697         350 KTLSGKP  356 (383)
Q Consensus       350 ~~~s~~~  356 (383)
                      .|.||.+
T Consensus        93 ~CptCS~   99 (143)
T PF15069_consen   93 YCPTCSP   99 (143)
T ss_pred             cCCCCCC
Confidence            3444433


No 71 
>PF06697 DUF1191:  Protein of unknown function (DUF1191);  InterPro: IPR010605 This family contains hypothetical plant proteins of unknown function.
Probab=45.43  E-value=34  Score=32.55  Aligned_cols=11  Identities=45%  Similarity=0.401  Sum_probs=6.2

Q ss_pred             CcEEEEEECCC
Q psy4697          48 NEYSVSALENL   58 (383)
Q Consensus        48 ~~y~~~V~En~   58 (383)
                      ..|.+.++.|.
T Consensus        33 ~~y~~~LP~nl   43 (278)
T PF06697_consen   33 ILYNVSLPSNL   43 (278)
T ss_pred             ceeeeecCCcc
Confidence            34666666554


No 72 
>PF01034 Syndecan:  Syndecan domain;  InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions. Structurally, these proteins consist of four separate domains: a signal sequence; an extracellular domain (ectodomain) of variable length whose sequence is not evolutionarily conserved in the various forms of syndecans, and which contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains; a transmembrane region; and a highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins.  The proteins known to belong to this family are: Syndecan 1; Syndecan 2 or fibroglycan; Syndecan 3 or neuroglycan or N-syndecan; Syndecan 4 or amphiglycan or ryudocan; Drosophila syndecan; and Caenorhabditis elegans probable syndecan (F57C7.3).  Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, whose overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of the dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts strongly not only with fatty acyl groups but also with the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion.; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=45.20  E-value=7  Score=28.43  Aligned_cols=18  Identities=22%  Similarity=0.453  Sum_probs=1.1

Q ss_pred             HHHHHHhhheeecCCCCC
Q psy4697         269 FVIILLILYIHKNKHTKN  286 (383)
Q Consensus       269 l~~~~l~~~~~r~~~~~~  286 (383)
                      +++++++.+++|+....-
T Consensus        27 lLIlf~iyR~rkkdEGSY   44 (64)
T PF01034_consen   27 LLILFLIYRMRKKDEGSY   44 (64)
T ss_dssp             ----------S------S
T ss_pred             HHHHHHHHHHHhcCCCCc
Confidence            334455667666665543


No 73 
>PF13908 Shisa:  Wnt and FGF inhibitory regulator
Probab=44.71  E-value=33  Score=30.21  Aligned_cols=19  Identities=5%  Similarity=0.263  Sum_probs=8.3

Q ss_pred             eeeehhhHHHHHHHHHHHH
Q psy4697         254 STVLIILGVVLIVLGFVII  272 (383)
Q Consensus       254 ~~li~~l~~i~~lL~l~~~  272 (383)
                      ++++.++.++++++++|++
T Consensus        79 ~iivgvi~~Vi~Iv~~Iv~   97 (179)
T PF13908_consen   79 GIIVGVICGVIAIVVLIVC   97 (179)
T ss_pred             eeeeehhhHHHHHHHhHhh
Confidence            3444444444444444333


No 74 
>cd00146 PKD polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases.
Probab=43.82  E-value=1.3e+02  Score=22.08  Aligned_cols=62  Identities=18%  Similarity=0.246  Sum_probs=33.0

Q ss_pred             EEEeeeCCCCCeEEEEEe-CCCCCCEEEcCCCcEEEeccCCCCcceEEEEEEEeeCCCCCceeEEEEEEE
Q psy4697         167 KVEAADGDKGDRVTLSLR-GPYEKMFSINDSGHISIVDLSALNTSTIQLVVVATDTGNPPRQASVPAIMH  235 (383)
Q Consensus       167 ~v~A~D~D~g~~i~ysi~-~~~~~~F~i~~tG~i~l~~~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~  235 (383)
                      ++.+.+.+.+....|.+. ++..    ....+. ..........+.|.+++.|+|...  .+.+.++.|.
T Consensus        18 ~~~~~~~~~~~~~~~~W~fgdg~----~~~~~~-~~~~~~y~~~G~y~v~l~v~d~~g--~~~~~~~~V~   80 (81)
T cd00146          18 TFSASDSSGGSIVSYKWDFGDGE----VSSSGE-PTVTHTYTKPGTYTVTLTVTNAVG--SSSTKTTTVV   80 (81)
T ss_pred             EEEEEeCCCCCEEEEEEEeCCCC----ccccCC-CceEEEcCCCcEEEEEEEEEeCCC--CEEEEEEEEE
Confidence            556666655556677664 3320    111110 111123456789999999999754  3334344443


No 75 
>PF15050 SCIMP:  SCIMP protein
Probab=43.33  E-value=14  Score=30.24  Aligned_cols=11  Identities=9%  Similarity=0.607  Sum_probs=5.0

Q ss_pred             eeehhhHHHHH
Q psy4697         255 TVLIILGVVLI  265 (383)
Q Consensus       255 ~li~~l~~i~~  265 (383)
                      .+|.+++.|+.
T Consensus         9 WiiLAVaII~v   19 (133)
T PF15050_consen    9 WIILAVAIILV   19 (133)
T ss_pred             HHHHHHHHHHH
Confidence            34444554443


No 76 
>KOG3513|consensus
Probab=41.69  E-value=5.8e+02  Score=29.09  Aligned_cols=132  Identities=20%  Similarity=0.263  Sum_probs=73.0

Q ss_pred             CCCEEEcCcccEEEcccCCcccCcEEEEEEEEecCCceeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCCCceE
Q psy4697          86 GERFSISRQGDISLMQCLDYETEDSYRFTVYATDTLMTTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTPGSVI  165 (383)
Q Consensus        86 ~~~F~Id~tG~I~~~~~LD~E~~~~y~~~V~A~D~~~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~g~~v  165 (383)
                      .+.+.|.++|.|..... -++....|...+  .+.--.+..+..+.|.|    ++.+..+....        ....|..+
T Consensus       470 ~~r~~i~edGtL~I~n~-t~~DaG~YtC~A--~N~~G~a~~~~~L~Vkd----~tri~~~P~~~--------~v~~g~~v  534 (1051)
T KOG3513|consen  470 SGRIRILEDGTLEISNV-TRSDAGKYTCVA--ENKLGKAESTGNLIVKD----ATRITLAPSNT--------DVKVGESV  534 (1051)
T ss_pred             CceEEECCCCcEEeccc-CcccCcEEEEEE--EcccCccceEEEEEEec----CceEEeccchh--------hhccCceE
Confidence            44577777888766443 355667777664  44333455566666665    67776543322        33345544


Q ss_pred             -EEEEeeeCCCCC--eEEEEEeC------CCCCCEEEc-C--CCcEEEeccCCCCcceEEEEEEEeeCCCCCceeEEEEE
Q psy4697         166 -GKVEAADGDKGD--RVTLSLRG------PYEKMFSIN-D--SGHISIVDLSALNTSTIQLVVVATDTGNPPRQASVPAI  233 (383)
Q Consensus       166 -~~v~A~D~D~g~--~i~ysi~~------~~~~~F~i~-~--tG~i~l~~~~~~~~~~y~L~V~a~D~g~p~~sst~tv~  233 (383)
                       ++..+. .|.-.  .|.+++.|      ....+|.++ .  +|.+.++.....+.+.|...+...   --+.++.+.+.
T Consensus       535 ~l~Ce~s-hD~~ld~~f~W~~nG~~id~~~~~~~~~~~~~~~~g~L~i~nv~l~~~G~Y~C~aqT~---~Ds~s~~A~l~  610 (1051)
T KOG3513|consen  535 TLTCEAS-HDPSLDITFTWKKNGRPIDFNPDGDHFEINDGSDSGRLTIANVSLEDSGKYTCVAQTA---LDSASARADLL  610 (1051)
T ss_pred             EEEeecc-cCCCcceEEEEEECCEEhhccCCCCceEEeCCcCccceEEEeeccccCceEEEEEEEe---ecchhcccceE
Confidence             333333 13333  45444433      233456654 2  578888777778889998776542   12345555555


Q ss_pred             EEE
Q psy4697         234 MHF  236 (383)
Q Consensus       234 I~v  236 (383)
                      |.-
T Consensus       611 V~g  613 (1051)
T KOG3513|consen  611 VRG  613 (1051)
T ss_pred             Eec
Confidence            544


No 77 
>PF01102 Glycophorin_A:  Glycophorin A;  InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane []. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=40.34  E-value=16  Score=30.33  Aligned_cols=27  Identities=4%  Similarity=0.025  Sum_probs=15.7

Q ss_pred             ccCCCCceeeehhhHHHHHHHHHHHHH
Q psy4697         247 KLNSGTSSTVLIILGVVLIVLGFVIIL  273 (383)
Q Consensus       247 ~~~~~~~~~li~~l~~i~~lL~l~~~~  273 (383)
                      |......++++.++++++++.+++..+
T Consensus        61 fs~~~i~~Ii~gv~aGvIg~Illi~y~   87 (122)
T PF01102_consen   61 FSEPAIIGIIFGVMAGVIGIILLISYC   87 (122)
T ss_dssp             SS-TCHHHHHHHHHHHHHHHHHHHHHH
T ss_pred             ccccceeehhHHHHHHHHHHHHHHHHH
Confidence            444456666676666666655555554


No 78 
>PF15234 LAT:  Linker for activation of T-cells
Probab=39.85  E-value=1.3e+02  Score=26.81  Aligned_cols=9  Identities=11%  Similarity=-0.005  Sum_probs=4.2

Q ss_pred             cCCCCceee
Q psy4697         305 ILPEKHVNV  313 (383)
Q Consensus       305 ~~p~~~~~~  313 (383)
                      |+|+.-+.+
T Consensus        53 k~p~t~~pw   61 (230)
T PF15234_consen   53 KRPQTLAPW   61 (230)
T ss_pred             cCCCCCCCC
Confidence            555544333


No 79 
>PF05454 DAG1:  Dystroglycan (Dystrophin-associated glycoprotein 1);  InterPro: IPR008465 Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in Homo sapiens. The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton. [alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. There is also evidence which suggests the function of dystroglycan as a part of the signal transduction pathway because it is shown that Grb2, a mediator of the Ras-related signal pathway, can interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex is shown to act as a receptor for the basement membrane. Dystroglycan expression in Mus musculus brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear [].; PDB: 1EG4_P.
Probab=39.42  E-value=9.8  Score=36.43  Aligned_cols=11  Identities=9%  Similarity=0.166  Sum_probs=0.0

Q ss_pred             eeeehhhHHHH
Q psy4697         254 STVLIILGVVL  264 (383)
Q Consensus       254 ~~li~~l~~i~  264 (383)
                      .++.-.+-+++
T Consensus       144 ~yL~T~IpaVV  154 (290)
T PF05454_consen  144 DYLHTFIPAVV  154 (290)
T ss_dssp             -----------
T ss_pred             chHHHHHHHHH
Confidence            33444443333


No 80 
>PF04906 Tweety:  Tweety;  InterPro: IPR006990 None of the members of the tweety (tty) family have been functionally characterised. However, they are considered to be transmembrane proteins with five potential membrane-spanning regions. A number of potential functions have been suggested on the basis of homology to the yeast FTR1 and FTH1 iron transporter proteins and the mammalian neurotensin receptors 1 and 2 in that they have similar hydrophobicity profiles, although there is no detectable sequence homology to the tweety-related proteins. It has been proposed that the tweety-related proteins could be involved in transport of iron or other divalent cations or alternatively that they may be membrane-bound receptors [].
Probab=38.92  E-value=34  Score=34.47  Aligned_cols=31  Identities=16%  Similarity=0.265  Sum_probs=13.5

Q ss_pred             eeehhhHHHHHHHHH--HHHHHhhheeecCCCC
Q psy4697         255 TVLIILGVVLIVLGF--VIILLILYIHKNKHTK  285 (383)
Q Consensus       255 ~li~~l~~i~~lL~l--~~~~l~~~~~r~~~~~  285 (383)
                      .++++++++++.|.+  +++.++.++|+|++++
T Consensus        20 ~~la~v~~~~l~l~Ll~ll~yl~~~CC~r~~~~   52 (406)
T PF04906_consen   20 LILASVAAACLALSLLFLLIYLICRCCCRRPRE   52 (406)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHhhCCCCCc
Confidence            344444444443322  2333444556655444


No 81 
>PF05895 DUF859:  Siphovirus protein of unknown function (DUF859);  InterPro: IPR008577 This entry is represented by Streptococcus phage 7201, Orf39. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. This family consists of several uncharacterised proteins from a number of the Siphoviruses as well as some bacterial proteins from Streptococcus species. Some of the members of this family are described as putative minor structural proteins.
Probab=38.85  E-value=3.4e+02  Score=29.04  Aligned_cols=117  Identities=14%  Similarity=0.111  Sum_probs=0.0

Q ss_pred             eEEEEEEEEECCCCCceE-EEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC-----CeeEEEEec
Q psy4697          10 PITLVVRAIQYDNQDRYA-LATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD-----LRVKYWLSN   83 (383)
Q Consensus        10 ~y~l~V~a~D~g~~~~~s-~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~-----~~v~Ysi~~   83 (383)
                      .+++++.++|  +.++.+ ..+..|.|.+-  ++|.+....+...-.++...-..-+.+...--.+     -.++|+...
T Consensus       299 ~~Ti~atVtD--SRGr~S~~~~~tItVl~Y--~~P~lsfsv~R~~~~~~~~~v~~~a~Iapl~v~g~qKN~~~lt~~~a~  374 (624)
T PF05895_consen  299 SATIRATVTD--SRGRTSDPKTKTITVLEY--SPPTLSFSVYRCGSSGNTLTVTRNAKIAPLTVNGVQKNTMTLTFKVAP  374 (624)
T ss_pred             eEEEEEEEEE--CCCccCCceEEEEEEEEc--CCCcEEEEEEEeCCCCcEEEEEEEEEEeEEEEcccccceEEEEEEEEE


Q ss_pred             CCCCCEEEc--Ccc--------cEEEcccC--CcccCcEEEEEEEEecCCceeEEEEEE
Q psy4697          84 DYGERFSIS--RQG--------DISLMQCL--DYETEDSYRFTVYATDTLMTTSATVNI  130 (383)
Q Consensus        84 ~~~~~F~Id--~tG--------~I~~~~~L--D~E~~~~y~~~V~A~D~~~~s~~tV~I  130 (383)
                      -....|.+|  ..+        .......|  .|...+.|.+.+..+|.-.+...+..|
T Consensus       375 ~gt~~~t~d~~~a~~~~s~~s~~~~~~~~L~g~y~~~kSy~V~~~l~D~F~s~t~~~~V  433 (624)
T PF05895_consen  375 LGTGTFTTDNGSASGTWSSISELTNSSANLGGTYDAEKSYDVRGTLSDKFTSTTFTVTV  433 (624)
T ss_pred             cCcceEEEEccccccceeeeeeecccceeeccccCCCceEEEEEEEEEEeeeEEEEEEc


No 82 
>PF14610 DUF4448:  Protein of unknown function (DUF4448)
Probab=38.45  E-value=94  Score=27.59  Aligned_cols=17  Identities=35%  Similarity=0.667  Sum_probs=7.9

Q ss_pred             eeehhhHHHHHHHHHHH
Q psy4697         255 TVLIILGVVLIVLGFVI  271 (383)
Q Consensus       255 ~li~~l~~i~~lL~l~~  271 (383)
                      ++.|+|-.+++++++++
T Consensus       159 ~laI~lPvvv~~~~~~~  175 (189)
T PF14610_consen  159 ALAIALPVVVVVLALIM  175 (189)
T ss_pred             eEEEEccHHHHHHHHHH
Confidence            44555554444444433


No 83 
>PF05510 Sarcoglycan_2:  Sarcoglycan alpha/epsilon;  InterPro: IPR008908 Sarcoglycans are a subcomplex of transmembrane proteins which are part of the dystrophin-glycoprotein complex. They are expressed in the skeletal, cardiac and smooth muscle. Although numerous studies have been conducted on the sarcoglycan subcomplex in skeletal and cardiac muscle, the manner of the distribution and localisation of these proteins along the nonjunctional sarcolemma is not clear []. This family contains alpha and epsilon members.; GO: 0016012 sarcoglycan complex
Probab=38.35  E-value=22  Score=35.39  Aligned_cols=20  Identities=10%  Similarity=0.235  Sum_probs=11.9

Q ss_pred             HHHHhhheeecCCCCCCCCC
Q psy4697         271 IILLILYIHKNKHTKNNGPP  290 (383)
Q Consensus       271 ~~~l~~~~~r~~~~~~~~~~  290 (383)
                      +++.+++||||++.+.+-.+
T Consensus       301 llLs~Imc~rREG~~~rd~~  320 (386)
T PF05510_consen  301 LLLSYIMCCRREGVKKRDSK  320 (386)
T ss_pred             HHHHHHheechHHhhcchhc
Confidence            33345668888766555444


No 84 
>KOG4433|consensus
Probab=36.50  E-value=22  Score=36.13  Aligned_cols=31  Identities=19%  Similarity=0.254  Sum_probs=17.1

Q ss_pred             ceeeehhhHHHHHHHHHHHHH--HhhheeecCC
Q psy4697         253 SSTVLIILGVVLIVLGFVIIL--LILYIHKNKH  283 (383)
Q Consensus       253 ~~~li~~l~~i~~lL~l~~~~--l~~~~~r~~~  283 (383)
                      ...+++++++.+++|.++.++  +++++|+|++
T Consensus        42 aL~lla~l~aa~l~l~Ll~ll~yli~~cC~Rr~   74 (526)
T KOG4433|consen   42 ALLLLAALAAACLGLSLLFLLFYLICRCCCRRE   74 (526)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHcCCC
Confidence            356677777777755544443  3333444443


No 85 
>PHA03290 envelope glycoprotein I; Provisional
Probab=36.37  E-value=46  Score=32.34  Aligned_cols=45  Identities=13%  Similarity=0.065  Sum_probs=28.4

Q ss_pred             cccEEEcccCCcccCcEEEEEEEEecCCceeEEEEEEEEeecCCC
Q psy4697          94 QGDISLMQCLDYETEDSYRFTVYATDTLMTTSATVNISVVNVNDW  138 (383)
Q Consensus        94 tG~I~~~~~LD~E~~~~y~~~V~A~D~~~~s~~tV~I~V~DvNDn  138 (383)
                      +|.+...+.--.|....|.+.|..-+....+--.+.+.|.+..+|
T Consensus       127 ~~vLL~I~~P~~~DSGiY~LRV~Ldga~~sDvF~lsv~Vyp~g~~  171 (357)
T PHA03290        127 AEIIFKINKPGIEDAGIYLLLVQLDHSRLFDGFFLGLNVYPAGDH  171 (357)
T ss_pred             cceEEEeCCCCcccCeeEEEEEEeCCCcccceEEEEEEEecCCCC
Confidence            565555555556667788888888665555555556666555443


No 86 
>cd05774 Ig_CEACAM_D1 First immunoglobulin (Ig)-like domain of carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM). IG_CEACAM_D1: immunoglobulin (Ig)-like domain 1 in carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) protein subfamily. The CEA family is a group of anchored or secreted glycoproteins, expressed by epithelial cells, leukocytes, endothelial cells and placenta. The CEA family is divided into the CEACAM and pregnancy-specific glycoprotein (PSG) subfamilies. This group represents the CEACAM subfamily. CEACAM1 has many important cellular functions, it is a cell adhesion molecule, and a signaling molecule that regulates the growth of tumor cells, it is an angiogenic factor, and is a receptor for bacterial and viral pathogens, including mouse hepatitis virus (MHV). In mice, four isoforms of CEACAM1 generated by alternative splicing have either two [D1, D4] or four [D1-D4] Ig-like domains on the cell surface. This family corresponds to the D
Probab=36.24  E-value=1.1e+02  Score=24.44  Aligned_cols=34  Identities=18%  Similarity=0.213  Sum_probs=27.3

Q ss_pred             CCCEEEcCCCcEEEeccCCCCcceEEEEEEEeeC
Q psy4697         188 EKMFSINDSGHISIVDLSALNTSTIQLVVVATDT  221 (383)
Q Consensus       188 ~~~F~i~~tG~i~l~~~~~~~~~~y~L~V~a~D~  221 (383)
                      .+...+.++|.|.+......+.+.|.+.+...+.
T Consensus        61 ~gR~~~~~ngSL~I~~v~~~D~G~Y~~~v~~~~~   94 (105)
T cd05774          61 SGRETIYPNGSLLIQNVTQKDTGFYTLQTITTNF   94 (105)
T ss_pred             CCcEEEeCCCcEEEecCCcccCEEEEEEEEeCCc
Confidence            4456677789999998888999999998876653


No 87 
>PF11857 DUF3377:  Domain of unknown function (DUF3377);  InterPro: IPR021805  This domain is functionally uncharacterised and found at the C terminus of peptidases belonging to MEROPS peptidase family M10A, membrane-type matrix metallopeptidases (clan MA). ; GO: 0004222 metalloendopeptidase activity
Probab=35.85  E-value=37  Score=25.46  Aligned_cols=20  Identities=25%  Similarity=0.519  Sum_probs=10.0

Q ss_pred             eeeehhhHHHHHHHHHHHHH
Q psy4697         254 STVLIILGVVLIVLGFVIIL  273 (383)
Q Consensus       254 ~~li~~l~~i~~lL~l~~~~  273 (383)
                      .++++.+-+++++.+++++.
T Consensus        30 ~avaVviPl~L~LCiLvl~y   49 (74)
T PF11857_consen   30 NAVAVVIPLVLLLCILVLIY   49 (74)
T ss_pred             eEEEEeHHHHHHHHHHHHHH
Confidence            34445555555555554443


No 88 
>PF12245 Big_3_2:  Bacterial Ig-like domain (group 3);  InterPro: IPR022038  This family of proteins is found in bacteria. They have two conserved sequence motifs: AGN and GMT. 
Probab=35.32  E-value=1.2e+02  Score=21.54  Aligned_cols=30  Identities=37%  Similarity=0.444  Sum_probs=22.8

Q ss_pred             CcEEEEEEEEecC-CceeEEEEEEEEeecCC
Q psy4697         108 EDSYRFTVYATDT-LMTTSATVNISVVNVND  137 (383)
Q Consensus       108 ~~~y~~~V~A~D~-~~~s~~tV~I~V~DvND  137 (383)
                      ...|.+.+.|.|. |..+.......+.|..-
T Consensus        22 dg~yt~~v~a~D~AGN~~~~~~~~~i~d~~~   52 (60)
T PF12245_consen   22 DGEYTLTVTATDKAGNTSSSTTQIVIVDNTA   52 (60)
T ss_pred             CccEEEEEEEEECCCCEEEeeeEEEEEcCCC
Confidence            5689999999995 67777777777776553


No 89 
>KOG1226|consensus
Probab=35.17  E-value=68  Score=34.69  Aligned_cols=13  Identities=38%  Similarity=0.608  Sum_probs=5.9

Q ss_pred             eeehhhHHHHHHH
Q psy4697         255 TVLIILGVVLIVL  267 (383)
Q Consensus       255 ~li~~l~~i~~lL  267 (383)
                      ++.|.|+.+++++
T Consensus       713 ~~~i~lgvv~~iv  725 (783)
T KOG1226|consen  713 ILAIVLGVVAGIV  725 (783)
T ss_pred             EeeehHHHHHHHH
Confidence            4444554444433


No 90 
>PF14991 MLANA:  Protein melan-A; PDB: 2GTZ_F 2GT9_F 3MRO_P 2GUO_C 3MRQ_P 2GTW_C 3L6F_C 3MRP_P.
Probab=34.79  E-value=10  Score=30.77  Aligned_cols=10  Identities=10%  Similarity=0.165  Sum_probs=0.0

Q ss_pred             HhhheeecCC
Q psy4697         274 LILYIHKNKH  283 (383)
Q Consensus       274 l~~~~~r~~~  283 (383)
                      +.++.|||+.
T Consensus        42 iGCWYckRRS   51 (118)
T PF14991_consen   42 IGCWYCKRRS   51 (118)
T ss_dssp             ----------
T ss_pred             Hhheeeeecc
Confidence            4555666654


No 91 
>PF15048 OSTbeta:  Organic solute transporter subunit beta protein
Probab=32.85  E-value=54  Score=27.19  Aligned_cols=20  Identities=20%  Similarity=0.504  Sum_probs=10.8

Q ss_pred             eeeehhhHHHHHHHHHHHHH
Q psy4697         254 STVLIILGVVLIVLGFVIIL  273 (383)
Q Consensus       254 ~~li~~l~~i~~lL~l~~~~  273 (383)
                      -+-+.+|+++++++.++++.
T Consensus        36 NysiL~Ls~vvlvi~~~LLg   55 (125)
T PF15048_consen   36 NYSILALSFVVLVISFFLLG   55 (125)
T ss_pred             chHHHHHHHHHHHHHHHHHH
Confidence            33455666666655554443


No 92 
>cd05741 Ig_CEACAM_D1_like First immunoglobulin (Ig)-like domain of carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) and similar proteins. Ig_CEACAM_D1_like : immunoglobulin (IG)-like domain 1 in carcinoembryonic antigen (CEA) related cell adhesion molecule (CEACAM) protein subfamily-like. The CEA family is a group of anchored or secreted glycoproteins, expressed by epithelial cells, leukocytes, endothelial cells and placenta. The CEA family is divided into the CEACAM and pregnancy-specific glycoprotein (PSG) subfamilies. This group represents the CEACAM subfamily. CEACAM1 has many important cellular functions, it is a cell adhesion molecule, and a signaling molecule that regulates the growth of tumor cells, it is an angiogenic factor, and is a receptor for bacterial and viral pathogens, including mouse hepatitis virus (MHV). In mice, four isoforms of CEACAM1 generated by alternative splicing have either two [D1, D4] or four [D1-D4] Ig-like domains on the cell surf
Probab=32.80  E-value=1.1e+02  Score=22.85  Aligned_cols=34  Identities=24%  Similarity=0.394  Sum_probs=27.4

Q ss_pred             CCCCEEEcCCCcEEEeccCCCCcceEEEEEEEee
Q psy4697         187 YEKMFSINDSGHISIVDLSALNTSTIQLVVVATD  220 (383)
Q Consensus       187 ~~~~F~i~~tG~i~l~~~~~~~~~~y~L~V~a~D  220 (383)
                      ..+.+.++.+|.|.+......+.+.|.+.|.-..
T Consensus        47 ~~~R~~~~~~~sL~I~~l~~~DsG~Y~c~v~~~~   80 (92)
T cd05741          47 YSGRETIYPNGSLLIQNLTKEDSGTYTLQIISTN   80 (92)
T ss_pred             cCCeEEEcCCceEEEccCCchhcEEEEEEEEcCC
Confidence            3456777777999998888899999999887665


No 93 
>PF03302 VSP:  Giardia variant-specific surface protein;  InterPro: IPR005127 During infection, the intestinal protozoan parasite Giardia lamblia undergoes continuous antigenic variation which is determined by diversification of the parasite's major surface antigen, named VSP (variant surface protein).
Probab=32.16  E-value=38  Score=34.03  Aligned_cols=27  Identities=33%  Similarity=0.415  Sum_probs=18.6

Q ss_pred             eehhhHHHHHHHHHHHHHHhhheeecC
Q psy4697         256 VLIILGVVLIVLGFVIILLILYIHKNK  282 (383)
Q Consensus       256 li~~l~~i~~lL~l~~~~l~~~~~r~~  282 (383)
                      .-|++|+|+++..||-+|.+|++||.|
T Consensus       370 aGIsvavvvvVgglvGfLcWwf~crgk  396 (397)
T PF03302_consen  370 AGISVAVVVVVGGLVGFLCWWFICRGK  396 (397)
T ss_pred             eeeeehhHHHHHHHHHHHhhheeeccc
Confidence            346666677777777777777777765


No 94 
>PHA03283 envelope glycoprotein E; Provisional
Probab=31.93  E-value=34  Score=35.29  Aligned_cols=31  Identities=10%  Similarity=0.118  Sum_probs=14.7

Q ss_pred             eeehhhHHHHHHHHHHHHHHhhheeecCCCC
Q psy4697         255 TVLIILGVVLIVLGFVIILLILYIHKNKHTK  285 (383)
Q Consensus       255 ~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~  285 (383)
                      .+++.++|.++++++.+++-++.+||+.+++
T Consensus       401 ~~~~~~~~~~~~~~~~l~vw~c~~~r~~~~~  431 (542)
T PHA03283        401 AFLLAIICTCAALLVALVVWGCILYRRSNRK  431 (542)
T ss_pred             hhHHHHHHHHHHHHHHHhhhheeeehhhcCC
Confidence            3455555555554444433334444444443


No 95 
>PF13754 Big_3_4:  Bacterial Ig-like domain (group 3)
Probab=31.03  E-value=1.8e+02  Score=20.04  Aligned_cols=27  Identities=30%  Similarity=0.365  Sum_probs=17.7

Q ss_pred             CcceEEEEEEEeeCC-CCCceeEEEEEE
Q psy4697         208 NTSTIQLVVVATDTG-NPPRQASVPAIM  234 (383)
Q Consensus       208 ~~~~y~L~V~a~D~g-~p~~sst~tv~I  234 (383)
                      ..+.|.+.+.|+|.. +....+...+.|
T Consensus        22 ~dG~y~itv~a~D~AGN~s~~~~~~~ti   49 (54)
T PF13754_consen   22 ADGTYTITVTATDAAGNTSTSSSVTFTI   49 (54)
T ss_pred             CCccEEEEEEEEeCCCCCCCccceeEEE
Confidence            468899999999974 433333334444


No 96 
>TIGR00845 caca sodium/calcium exchanger 1. This model is specific for the eukaryotic sodium ion/calcium ion exchangers of the Caca family
Probab=30.85  E-value=4.8e+02  Score=29.36  Aligned_cols=51  Identities=18%  Similarity=0.123  Sum_probs=30.7

Q ss_pred             EEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcEEEEEEe-EeCCC-CeeEEEEec
Q psy4697          30 TLIVSKAGTSLRELQFSKNEYSVSALENLPVNYVLLTVTT-NKPRD-LRVKYWLSN   83 (383)
Q Consensus        30 ~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~v~~v~A-~D~D~-~~v~Ysi~~   83 (383)
                      +.+|.|.| ||++|.|....-..+|.|+.  |..-.+|.- .+.+. -.+.|...+
T Consensus       516 ~ATVTIlD-DD~aGIfsFe~~~~sV~Es~--G~vtvtV~RtsGa~G~VtV~Y~T~d  568 (928)
T TIGR00845       516 TATVTILD-DDHAGIFTFEEDVFHVSESI--GIMEVKVLRTSGARGTVIVPYRTVE  568 (928)
T ss_pred             eEEEEEec-CcccCcccccCceEEEEcCC--CEEEEEEEEcCCCCeeEEEEEEeec
Confidence            44566677 78899887766677889984  554444432 22222 335576554


No 97 
>KOG3488|consensus
Probab=30.10  E-value=53  Score=24.31  Aligned_cols=31  Identities=23%  Similarity=0.318  Sum_probs=15.5

Q ss_pred             eeehhhHHHHHHHHHHHHHHhhheeecCCCC
Q psy4697         255 TVLIILGVVLIVLGFVIILLILYIHKNKHTK  285 (383)
Q Consensus       255 ~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~  285 (383)
                      ++.+-+++.+++|.++..++.....|.+||+
T Consensus        49 Ai~iPvaagl~ll~lig~Fis~vMlKskkKK   79 (81)
T KOG3488|consen   49 AITIPVAAGLFLLCLIGTFISLVMLKSKKKK   79 (81)
T ss_pred             HhhhHHHHHHHHHHHHHHHHHHHhhhccccc
Confidence            3445556656555555555444444444443


No 98 
>PF10365 DUF2436:  Domain of unknown function (DUF2436);  InterPro: IPR018832  Gingipains R and K are endopeptidases with specificity for arginyl and lysyl bonds, respectively. Like other cysteine peptidases, they require reducing conditions for activity. They are maximally active at approximately neutral pH. Gingipains R and K are secreted by the bacterium Porphyromonas gingivalis (Bacteroides gingivalis). The bacterium is a major pathogen in periodontal disease, and the many ways in which the activities of the gingipains may contribute to the disease processes have been reviewed []. These enzymes are also involved in the hemagglutinating activity of the organisms.  This entry represents a central region found in gingipain K peptidases, active on lysyl bonds; they belong to the MEROPS peptidase family C25 (gingipain family, clan CD).  
Probab=29.86  E-value=2.5e+02  Score=23.90  Aligned_cols=82  Identities=12%  Similarity=0.073  Sum_probs=39.8

Q ss_pred             CCCCCeeecCcEEEEEECCCCCCcEEEEEEeEeC------CCCeeEEEEecC-CCCCEEEcCcccEEEcc--cCCcccCc
Q psy4697          39 SLRELQFSKNEYSVSALENLPVNYVLLTVTTNKP------RDLRVKYWLSND-YGERFSISRQGDISLMQ--CLDYETED  109 (383)
Q Consensus        39 Ndn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~------D~~~v~Ysi~~~-~~~~F~Id~tG~I~~~~--~LD~E~~~  109 (383)
                      |+++|.=.-..|+-.|++|+-+...- +-...|.      ..+..-|-|... +....+|-..|-=-..|  -+-+|.-+
T Consensus        66 n~~~pa~ly~~FEYkiP~NADps~tp-q~mv~dG~~~i~IPaG~YDy~I~~P~~~~kiwIaGd~g~~~tr~dDy~fEAGK  144 (161)
T PF10365_consen   66 NCNVPANLYDPFEYKIPANADPSTTP-QNMVVDGEASIDIPAGTYDYCIAAPQPGGKIWIAGDGGDGPTRGDDYVFEAGK  144 (161)
T ss_pred             CCCCChhhcccceEeccCCCCCccCc-ceEEecCceEEEecCceeEEEEecCCCCCeEEEecCCCCCCccccceEEecCC
Confidence            34455433355677788876543211 1111111      114445555552 33445553211000122  23457789


Q ss_pred             EEEEEEEEecCC
Q psy4697         110 SYRFTVYATDTL  121 (383)
Q Consensus       110 ~y~~~V~A~D~~  121 (383)
                      .|.|++.+...+
T Consensus       145 tY~ftm~~~g~g  156 (161)
T PF10365_consen  145 TYRFTMKRVGSG  156 (161)
T ss_pred             EEEEEEEeccCC
Confidence            999999987654


No 99 
>PHA03286 envelope glycoprotein E; Provisional
Probab=29.18  E-value=53  Score=33.29  Aligned_cols=11  Identities=9%  Similarity=0.169  Sum_probs=5.1

Q ss_pred             CCcceEEEEEE
Q psy4697         207 LNTSTIQLVVV  217 (383)
Q Consensus       207 ~~~~~y~L~V~  217 (383)
                      ...+.|-+.+.
T Consensus       317 s~SGLYVfVl~  327 (492)
T PHA03286        317 TDAGLYVVVAL  327 (492)
T ss_pred             ccCceEEEEEE
Confidence            34555544443


No 100
>PF07213 DAP10:  DAP10 membrane protein;  InterPro: IPR009861 This family consists of several mammalian DAP10 membrane proteins. In activated mouse natural killer (NK) cells, the NKG2D receptor associates with two intracellular adaptors, DAP10 and DAP12, which trigger phosphatidyl inositol 3 kinase (PI3K) and Syk family protein tyrosine kinases, respectively. It has been suggested that the DAP10-PI3K pathway is sufficient to initiate NKG2D-mediated killing of target cells [].
Probab=29.11  E-value=71  Score=24.32  Aligned_cols=22  Identities=14%  Similarity=0.259  Sum_probs=12.5

Q ss_pred             HHHHHHHHHHHhhheeecCCCC
Q psy4697         264 LIVLGFVIILLILYIHKNKHTK  285 (383)
Q Consensus       264 ~~lL~l~~~~l~~~~~r~~~~~  285 (383)
                      ++.|++++....+.+-|+++++
T Consensus        45 vlTLLIv~~vy~car~r~r~~~   66 (79)
T PF07213_consen   45 VLTLLIVLVVYYCARPRRRPTQ   66 (79)
T ss_pred             HHHHHHHHHHHhhcccccCCcc
Confidence            3445555666666666666544


No 101
>cd00146 PKD polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases.
Probab=27.99  E-value=1.1e+02  Score=22.37  Aligned_cols=29  Identities=17%  Similarity=0.299  Sum_probs=21.3

Q ss_pred             CCcccCcEEEEEEEEecC-CceeEEEEEEE
Q psy4697         103 LDYETEDSYRFTVYATDT-LMTTSATVNIS  131 (383)
Q Consensus       103 LD~E~~~~y~~~V~A~D~-~~~s~~tV~I~  131 (383)
                      ..|.....|.+++.++|. +.+...++.|.
T Consensus        51 ~~y~~~G~y~v~l~v~d~~g~~~~~~~~V~   80 (81)
T cd00146          51 HTYTKPGTYTVTLTVTNAVGSSSTKTTTVV   80 (81)
T ss_pred             EEcCCCcEEEEEEEEEeCCCCEEEEEEEEE
Confidence            456778899999999997 45555465554


No 102
>TIGR00864 PCC polycystin cation channel protein. Note: this model has been restricted to the amino half for technical reasons.
Probab=27.84  E-value=1e+03  Score=30.47  Aligned_cols=110  Identities=18%  Similarity=0.211  Sum_probs=0.0

Q ss_pred             cccCcEEEEEEEEecCCceeEEEEEEEEeecCCCCCcccCCceeEEEeccCCCCCCCCceEEEEEeee-CCCCCeEEEEE
Q psy4697         105 YETEDSYRFTVYATDTLMTTSATVNISVVNVNDWDPRFRYPQYELFLPHIPLADLTPGSVIGKVEAAD-GDKGDRVTLSL  183 (383)
Q Consensus       105 ~E~~~~y~~~V~A~D~~~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~~~~e~~~~g~~v~~v~A~D-~D~g~~i~ysi  183 (383)
                      |.....|.+++.|+|..-.++.+..|.|          ..+.-.+.+.    .....-.+-..+..+| .+.|...+|++
T Consensus      1480 Y~~~GtYtVtLTvtN~~Gsst~T~~VtV----------~~pV~~~tin----as~~~vpl~~sV~Fta~~s~Gs~v~ysW 1545 (2740)
T TIGR00864      1480 FNSPGDFNIRLAAANEVGKNEATLNVAV----------KARVRGLTIN----ASLTNVPLNGSVHFEAHLDAGDDVRFSW 1545 (2740)
T ss_pred             cCCCceEEEEEEEECCCCceEEEEEEEE----------eccccceEEc----CCCccccccceEEEEEEccCCCceeEEE


Q ss_pred             e-CCCCCCEEEcCCCcEEEeccCCCCcceEEEEEEEeeCCCCCceeEEEEEEEE
Q psy4697         184 R-GPYEKMFSINDSGHISIVDLSALNTSTIQLVVVATDTGNPPRQASVPAIMHF  236 (383)
Q Consensus       184 ~-~~~~~~F~i~~tG~i~l~~~~~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v  236 (383)
                      . ++......-+++-.-.-...     +.|.+.+.|.+..+   +..+++.|.|
T Consensus      1546 dFGDg~ts~~~npt~~yTY~sp-----GtYtVtLTvtN~~G---s~~~T~~i~V 1591 (2740)
T TIGR00864      1546 ILCDHCTPIFGGNTIFYTFRSV-----GTFNIIVTAENDVG---AAQASIFLFV 1591 (2740)
T ss_pred             EeCCCCccccCCCceEEeecCC-----ceEEEEEEEecCCC---ccceeEEEEE


No 103
>PF07204 Orthoreo_P10:  Orthoreovirus membrane fusion protein p10;  InterPro: IPR009854 This family consists of several Orthoreovirus membrane fusion protein p10 sequences. p10 is thought to be a multifunctional protein that plays a key role in virus-host interaction [].
Probab=26.86  E-value=26  Score=27.48  Aligned_cols=25  Identities=12%  Similarity=0.195  Sum_probs=12.5

Q ss_pred             ehhhHHHHHHHHHHHHHHhhheeec
Q psy4697         257 LIILGVVLIVLGFVIILLILYIHKN  281 (383)
Q Consensus       257 i~~l~~i~~lL~l~~~~l~~~~~r~  281 (383)
                      +++-|+++++++++.+++..+.+++
T Consensus        45 LA~GGG~iLilIii~Lv~CC~~K~K   69 (98)
T PF07204_consen   45 LAAGGGLILILIIIALVCCCRAKHK   69 (98)
T ss_pred             hhccchhhhHHHHHHHHHHhhhhhh
Confidence            3333555555544555555555444


No 104
>PF14979 TMEM52:  Transmembrane 52
Probab=26.48  E-value=74  Score=27.20  Aligned_cols=10  Identities=10%  Similarity=-0.097  Sum_probs=4.4

Q ss_pred             Hhhh-eeecCC
Q psy4697         274 LILY-IHKNKH  283 (383)
Q Consensus       274 l~~~-~~r~~~  283 (383)
                      +.++ +|.||+
T Consensus        40 ~C~rfCClrk~   50 (154)
T PF14979_consen   40 SCVRFCCLRKQ   50 (154)
T ss_pred             HHHHHHHhccc
Confidence            3344 444444


No 105
>PHA03281 envelope glycoprotein E; Provisional
Probab=25.71  E-value=72  Score=33.12  Aligned_cols=24  Identities=13%  Similarity=0.076  Sum_probs=10.7

Q ss_pred             EEEEEEEEecCCceeEEEEEEEEee
Q psy4697         110 SYRFTVYATDTLMTTSATVNISVVN  134 (383)
Q Consensus       110 ~y~~~V~A~D~~~~s~~tV~I~V~D  134 (383)
                      .|.+-.. -+.|.+..+.+.|.|.+
T Consensus       312 VYtly~r-g~~G~s~~svfLVtVkg  335 (642)
T PHA03281        312 VYIWNLQ-GSDGENMYATFLVKLKG  335 (642)
T ss_pred             eEEEEec-CCCCcceeEEEEEEecC
Confidence            4444444 22233334455566654


No 106
>TIGR03778 VPDSG_CTERM VPDSG-CTERM exosortase interaction domain. Through in silico analysis, we previously described the PEP-CTERM/exosortase system (PubMed:16930487). This model describes a PEP-CTERM-like variant C-terminal protein sorting signal, as found at the C-terminus of twenty otherwise unrelated proteins in Verrucomicrobiae bacterium DG1235. The variant motif, VPDSG, seems an intermediate between the VPEP motif (TIGR02595) of typical exosortase systems and the classical LPXTG of sortase in Gram-positive bacteria.
Probab=25.06  E-value=81  Score=18.71  Aligned_cols=11  Identities=27%  Similarity=0.446  Sum_probs=3.9

Q ss_pred             HHHHHHHHHhh
Q psy4697         266 VLGFVIILLIL  276 (383)
Q Consensus       266 lL~l~~~~l~~  276 (383)
                      +|.+.+..++.
T Consensus        10 Ll~~~l~~l~~   20 (26)
T TIGR03778        10 LLGLGLLGLLG   20 (26)
T ss_pred             HHHHHHHHHHH
Confidence            33333333333


No 107
>PF13965 SID-1_RNA_chan:  dsRNA-gated channel SID-1
Probab=24.84  E-value=3.5e+02  Score=28.69  Aligned_cols=24  Identities=17%  Similarity=0.331  Sum_probs=10.0

Q ss_pred             EcCCCcEEEeccCCCCcceEEEEEE
Q psy4697         193 INDSGHISIVDLSALNTSTIQLVVV  217 (383)
Q Consensus       193 i~~tG~i~l~~~~~~~~~~y~L~V~  217 (383)
                      +...|.|.+.+.+ ...+.+.+.+.
T Consensus        58 ~T~~a~itv~r~~-f~~~~F~Vvvv   81 (570)
T PF13965_consen   58 MTKKAGITVQRKD-FPSGSFYVVVV   81 (570)
T ss_pred             EeccccEEEEhhh-CCCCeEEEEEE
Confidence            3344555544332 22334444444


No 108
>PRK14081 triple tyrosine motif-containing protein; Provisional
Probab=24.84  E-value=9.1e+02  Score=26.15  Aligned_cols=189  Identities=13%  Similarity=0.140  Sum_probs=89.1

Q ss_pred             CCeEEEEEEEEECCCCC-ceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCCCCcE-EEEEEeEeCCCCeeEEEEecCC
Q psy4697           8 LQPITLVVRAIQYDNQD-RYALATLIVSKAGTSLRELQFSKNEYSVSALENLPVNYV-LLTVTTNKPRDLRVKYWLSNDY   85 (383)
Q Consensus         8 ~~~y~l~V~a~D~g~~~-~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~~gt~-v~~v~A~D~D~~~v~Ysi~~~~   85 (383)
                      .-.|.+.|+|.+..+.. .--.+.+...|.+.+.   ..- ...... .+...+|.. +..|.+..   ..+-|.     
T Consensus        63 ~GkY~imVq~K~~~S~~~fD~~~~~~~~v~~~~~---~~I-~~~~~~-~~~l~vGe~l~~~V~~~~---e~~LYK-----  129 (667)
T PRK14081         63 EGEYTIMVQAKKEDSNKPFDYVSKEDYVIGKAEE---KLI-KNIYLD-KDTLNVGEKIEIKVDSNK---EPLMYR-----  129 (667)
T ss_pred             CccEEEEEEEecCCCCCCcceeEEEEEEEcccch---hhh-eeeEec-CccccCCCEEEEEEEecc---CcEEEE-----
Confidence            35688888888866542 2233444444444333   111 111111 222334543 23333322   224453     


Q ss_pred             CCCEEEcCcccEEEcc------cCCcc--cCcEEEEEEEEecCC----ceeEEEEEEEEeecCCCCCcccCCceeEEEec
Q psy4697          86 GERFSISRQGDISLMQ------CLDYE--TEDSYRFTVYATDTL----MTTSATVNISVVNVNDWDPRFRYPQYELFLPH  153 (383)
Q Consensus        86 ~~~F~Id~tG~I~~~~------~LD~E--~~~~y~~~V~A~D~~----~~s~~tV~I~V~DvNDn~P~f~~~~~~~~v~~  153 (383)
                         |.|+.+|.....+      .|.|-  ....|.+.+.+.|..    .-..+.+...|....+  +.+..  +. ... 
T Consensus       130 ---F~I~~~~~w~~iqDYst~n~lsyt~~~~G~Y~ll~~~Kd~~S~~~fDD~~~v~y~Vk~~~~--v~I~~--F~-~ln-  200 (667)
T PRK14081        130 ---YWIKEDNNWKLIKDYSTENSLSYTANKPGKYELLVECKRIDSTKDFDDFKKVKFKVKEIDK--VEITD--FK-CLN-  200 (667)
T ss_pred             ---EEEcCCCcEEEEEecCCcceEEEEecCCCcEEEEEEEecCCCccccCcceEEEEEcccCcc--eEEEe--cc-ccC-
Confidence               3344444433332      22221  246899999999964    4456677776665543  22211  00 111 


Q ss_pred             cCCCCCCCC-ceEEEEEeeeCCCCCeEEEEEe-CCCCCCEEEcC---CCcEEEeccCCCCcceEEEEEEEeeCCCC
Q psy4697         154 IPLADLTPG-SVIGKVEAADGDKGDRVTLSLR-GPYEKMFSIND---SGHISIVDLSALNTSTIQLVVVATDTGNP  224 (383)
Q Consensus       154 ~~~e~~~~g-~~v~~v~A~D~D~g~~i~ysi~-~~~~~~F~i~~---tG~i~l~~~~~~~~~~y~L~V~a~D~g~p  224 (383)
                         ...-.| .+.+.+.|... .|..+.|.+. -+..+.+.-+.   +-.++...  ....+.|.|.+.|.|....
T Consensus       201 ---s~~i~~~eI~f~~~a~~~-~g~~~LYKF~~i~~~G~~~~~qdYst~n~~~y~--~~~~G~Y~i~~~VKD~~S~  270 (667)
T PRK14081        201 ---KELICDEELVFEVESVYE-EDRTILYKFVKIDSDGKQTCIQDYSTKNIVSYK--EKKSGDYKLLCLVKDMYSN  270 (667)
T ss_pred             ---cceecCcEEEEEEEEEeC-CCceEEEEEEEECCCCCEEEecCccccceEEEE--eCCCccEEEEEEEeccCcc
Confidence               111122 33455556554 3556666643 12233444332   11222222  2457889999999998654


No 109
>smart00089 PKD Repeats in polycystic kidney disease 1 (PKD1) and other proteins. Polycystic kidney disease 1 protein contains 14 repeats, present elsewhere such as in microbial collagenases.
Probab=24.48  E-value=1.8e+02  Score=21.17  Aligned_cols=28  Identities=14%  Similarity=0.120  Sum_probs=20.8

Q ss_pred             CCCcceEEEEEEEeeCCCCCceeEEEEEEEE
Q psy4697         206 ALNTSTIQLVVVATDTGNPPRQASVPAIMHF  236 (383)
Q Consensus       206 ~~~~~~y~L~V~a~D~g~p~~sst~tv~I~v  236 (383)
                      ....+.|.+.+.+.|..+   ++++++.|.|
T Consensus        51 y~~~G~y~v~l~v~n~~g---~~~~~~~i~v   78 (79)
T smart00089       51 YTKPGTYTVTLTVTNAVG---SASATVTVVV   78 (79)
T ss_pred             eCCCcEEEEEEEEEcCCC---cEEEEEEEEE
Confidence            455789999999999866   5565666654


No 110
>cd05762 Ig8_MLCK Eighth immunoglobulin (Ig)-like domain of human myosin light-chain kinase (MLCK). Ig8_MLCK: the eighth immunoglobulin (Ig)-like domain of human myosin light-chain kinase (MLCK). MLCK is a key regulator of different forms of cell motility involving actin and myosin II.  Agonist stimulation of smooth muscle cells increases cytosolic Ca2+, which binds calmodulin.  This Ca2+-calmodulin complex in turn binds to and activates MLCK. Activated MLCK leads to the phosphorylation of the 20 kDa myosin regulatory light chain (RLC) of myosin II and the stimulation of actin-activated myosin MgATPase activity. MLCK is widely present in vertebrate tissues; it phosphorylates the 20 kDa RLC of both smooth and nonmuscle myosin II. Phosphorylation leads to the activation of the myosin motor domain and altered structural properties of myosin II. In smooth muscle MLCK it is involved in initiating contraction. In nonmuscle cells, MLCK may participate in cell division and cell motility; it has
Probab=24.06  E-value=3.4e+02  Score=20.90  Aligned_cols=37  Identities=27%  Similarity=0.322  Sum_probs=24.7

Q ss_pred             EcccCCcccCcEEEEEEEEecCCceeEEEEEEEEeecCC
Q psy4697          99 LMQCLDYETEDSYRFTVYATDTLMTTSATVNISVVNVND  137 (383)
Q Consensus        99 ~~~~LD~E~~~~y~~~V~A~D~~~~s~~tV~I~V~DvND  137 (383)
                      +.+...++....|.+.+  .+..-...+++.|.|.|.-+
T Consensus        59 ~I~~~~~~D~G~Ytc~a--~N~~G~~~~~~~l~V~~~P~   95 (98)
T cd05762          59 TITEGQQEHCGCYTLEV--ENKLGSRQAQVNLTVVDKPD   95 (98)
T ss_pred             EECCCChhhCEEEEEEE--EcCCCceeEEEEEEEecCCC
Confidence            34556666677777665  44444566788888888776


No 111
>COG4288 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=23.47  E-value=1.2e+02  Score=24.49  Aligned_cols=47  Identities=19%  Similarity=0.162  Sum_probs=29.1

Q ss_pred             CCcCCCCeEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCC
Q psy4697           3 NEDDFLQPITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLP   59 (383)
Q Consensus         3 ~DrE~~~~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~   59 (383)
                      .|||....|.++|.|.          .++.+...|.+|..+.-....|...+.-|.|
T Consensus        52 sDr~pvgpyevevaar----------rt~hlRfndL~dpe~iP~d~~yasviesnvP   98 (124)
T COG4288          52 SDREPVGPYEVEVAAR----------RTLHLRFNDLGDPEAIPKDTPYASVIESNVP   98 (124)
T ss_pred             ccCCCCCceEEEeecc----------eeEEEEecccCCcccCCCCCchhhheecCCc
Confidence            4555555555555443          3567888899987766555666655555554


No 112
>PHA03283 envelope glycoprotein E; Provisional
Probab=23.36  E-value=4.2e+02  Score=27.61  Aligned_cols=38  Identities=11%  Similarity=0.110  Sum_probs=29.2

Q ss_pred             CCceeeehhhHHHHHHHHHHHHHHhhheeecCCCCCCC
Q psy4697         251 GTSSTVLIILGVVLIVLGFVIILLILYIHKNKHTKNNG  288 (383)
Q Consensus       251 ~~~~~li~~l~~i~~lL~l~~~~l~~~~~r~~~~~~~~  288 (383)
                      .|.-..+..++++..+++++++.|+++.|-+.++.++.
T Consensus       394 ~~~~~~l~~~~~~~~~~~~~~~~l~vw~c~~~r~~~~~  431 (542)
T PHA03283        394 AWTRHYLAFLLAIICTCAALLVALVVWGCILYRRSNRK  431 (542)
T ss_pred             ccccccchhHHHHHHHHHHHHHHHhhhheeeehhhcCC
Confidence            44566677788888888888888889889998766554


No 113
>PF00558 Vpu:  Vpu protein;  InterPro: IPR008187 The Human immunodeficiency virus 1 (HIV-1) Vpu protein acts in the degradation of CD4 in the endoplasmic reticulum and in the enhancement of virion release from the plasma membrane of infected cells.; GO: 0019076 release of virus from host; PDB: 2JPX_A 1PI8_A 2GOH_A 2GOF_A 1PI7_A 1PJE_A 1VPU_A 2K7Y_A.
Probab=23.28  E-value=78  Score=24.27  Aligned_cols=6  Identities=0%  Similarity=-0.158  Sum_probs=0.0

Q ss_pred             eeecCC
Q psy4697         278 IHKNKH  283 (383)
Q Consensus       278 ~~r~~~  283 (383)
                      .||+.|
T Consensus        29 eYrk~~   34 (81)
T PF00558_consen   29 EYRKIK   34 (81)
T ss_dssp             ------
T ss_pred             HHHHHH
Confidence            344333


No 114
>KOG3637|consensus
Probab=23.03  E-value=1e+02  Score=35.06  Aligned_cols=23  Identities=30%  Similarity=0.492  Sum_probs=13.6

Q ss_pred             ehhhHHHHHHHHHHHHHHhhhee
Q psy4697         257 LIILGVVLIVLGFVIILLILYIH  279 (383)
Q Consensus       257 i~~l~~i~~lL~l~~~~l~~~~~  279 (383)
                      +|+++.+.+||+|++|++++++|
T Consensus       980 iIi~svl~GLLlL~llv~~LwK~ 1002 (1030)
T KOG3637|consen  980 IIILSVLGGLLLLALLVLLLWKC 1002 (1030)
T ss_pred             eehHHHHHHHHHHHHHHHHHHhc
Confidence            45555555566666666666554


No 115
>PF11395 DUF2873:  Protein of unknown function (DUF2873);  InterPro: IPR021532 This entry is represented by the human SARS coronavirus, Orf7b; it is a family of uncharacterised viral proteins.
Probab=22.89  E-value=80  Score=20.30  Aligned_cols=9  Identities=22%  Similarity=0.630  Sum_probs=3.6

Q ss_pred             HHHHHHHHH
Q psy4697         264 LIVLGFVII  272 (383)
Q Consensus       264 ~~lL~l~~~  272 (383)
                      +++|+++++
T Consensus        17 llflv~iml   25 (43)
T PF11395_consen   17 LLFLVIIML   25 (43)
T ss_pred             HHHHHHHHH
Confidence            333444443


No 116
>TIGR03660 T1SS_rpt_143 T1SS-143 repeat domain. This model represents a domain of about 143 amino acids that may occur singly or in up to 23 tandem repeats in very large proteins in the genus Vibrio, and in related species such as Legionella pneumophila, Photobacterium profundum, Rhodopseudomonas palustris, Shewanella pealeana, and Aeromonas hydrophila. Proteins with these domains represent a subset of a broader set of proteins with a particular signal for type 1 secretion, consisting of several glycine-rich repeats modeled by pfam00353, followed by a C-terminal domain modeled by TIGR03661. Proteins with this domain tend to share several properties with the RtxA (Repeats in Toxin) protein of Vibrio cholerae, including a large size often containing tandemly repeated domains and a C-terminal signal for type 1 secretion.
Probab=22.87  E-value=2.7e+02  Score=23.56  Aligned_cols=44  Identities=14%  Similarity=0.142  Sum_probs=27.5

Q ss_pred             eEEEEEEEEECCCCCceEEEEEEEEEeeCCCCCCeeecCcEEEEEECCCC
Q psy4697          10 PITLVVRAIQYDNQDRYALATLIVSKAGTSLRELQFSKNEYSVSALENLP   59 (383)
Q Consensus        10 ~y~l~V~a~D~g~~~~~s~~~v~V~V~DvNdn~P~F~~~~y~~~V~En~~   59 (383)
                      ...|.|.|+|..+..  +...+.|.|.|  | .|.-.... .+.|.|+..
T Consensus        86 ~l~~~v~a~D~DGD~--s~~~l~VtI~D--D-~P~~~~~~-~~~V~E~~L  129 (137)
T TIGR03660        86 TLNFPIIATDFDGDT--SSITLPVTIVD--D-VPTITDVD-ALTVDEDDL  129 (137)
T ss_pred             EEeeeEEEEeCCCCc--cccEEEEEEEC--C-CCeecccc-ceEEecccc
Confidence            467788899865333  23477788877  5 46654433 378888543


No 117
>cd05775 Ig_SLAM-CD84_like_N N-terminal immunoglobulin (Ig)-like domain of the signaling lymphocyte activation molecule (SLAM) family, CD84_like. Ig_SLAM-CD84_like_N: The N-terminal immunoglobulin (Ig)-like domain of the signaling lymphocyte activation molecule (SLAM) family, CD84_like. The SLAM family is a group of immune-cell specific receptors that can regulate both adaptive and innate immune responses. Members of this group include proteins such as CD84, SLAM (CD150), Ly-9 (CD229), NTB-A (ly-108, SLAM6), 19A (CRACC), and SLAMF9. The genes coding for the SLAM family are nested on chromosome 1, in humans at 1q23, and in mice at 1H2. The SLAM family is a subset of the CD2 family, which also includes CD2 and CD58 located on chromosome 1 at 1p13 in humans. In mice, CD2 is located on chromosome 3, and there is no CD58 homolog. The SLAM family proteins are organized as an extracellular domain with either two or four Ig-like domains, a single transmembrane segment, and a cytoplasmic region 
Probab=22.68  E-value=1.9e+02  Score=22.27  Aligned_cols=31  Identities=6%  Similarity=0.146  Sum_probs=25.1

Q ss_pred             CEEEcC-CCcEEEeccCCCCcceEEEEEEEee
Q psy4697         190 MFSIND-SGHISIVDLSALNTSTIQLVVVATD  220 (383)
Q Consensus       190 ~F~i~~-tG~i~l~~~~~~~~~~y~L~V~a~D  220 (383)
                      .+.++. ++.|.+......+.+.|.+.|...+
T Consensus        53 R~~~~~~~~sL~I~~~~~~DsG~Y~c~v~~~~   84 (97)
T cd05775          53 RVNFSQNDYSLQISNLKMEDAGSYRAEINTKN   84 (97)
T ss_pred             eEEecCCceeEEECCCchHHCEEEEEEEEcCC
Confidence            455665 6888888888888999999998776


No 118
>smart00089 PKD Repeats in polycystic kidney disease 1 (PKD1) and other proteins. Polycystic kidney disease 1 protein contains 14 repeats, present elsewhere such as in microbial collagenases.
Probab=22.60  E-value=1.9e+02  Score=20.95  Aligned_cols=31  Identities=29%  Similarity=0.419  Sum_probs=23.0

Q ss_pred             cCCcccCcEEEEEEEEecCCceeEEEEEEEE
Q psy4697         102 CLDYETEDSYRFTVYATDTLMTTSATVNISV  132 (383)
Q Consensus       102 ~LD~E~~~~y~~~V~A~D~~~~s~~tV~I~V  132 (383)
                      ..-|+....|.+++.+.|..-++++++.|.|
T Consensus        48 ~~~y~~~G~y~v~l~v~n~~g~~~~~~~i~v   78 (79)
T smart00089       48 THTYTKPGTYTVTLTVTNAVGSASATVTVVV   78 (79)
T ss_pred             EEEeCCCcEEEEEEEEEcCCCcEEEEEEEEE
Confidence            4456778899999999986446666666665


No 119
>PF13584 BatD:  Oxygen tolerance
Probab=21.99  E-value=8.5e+02  Score=24.76  Aligned_cols=16  Identities=13%  Similarity=0.117  Sum_probs=9.1

Q ss_pred             eEEEEEeCCCCCCEEE
Q psy4697         178 RVTLSLRGPYEKMFSI  193 (383)
Q Consensus       178 ~i~ysi~~~~~~~F~i  193 (383)
                      .++|.+.....+.|.|
T Consensus       339 ~~~~~~ip~~~G~~~l  354 (484)
T PF13584_consen  339 TFKYTLIPKKPGDFTL  354 (484)
T ss_pred             EEEEEEEeCCCCeEEc
Confidence            4666666555555555


No 120
>PHA03265 envelope glycoprotein D; Provisional
Probab=21.69  E-value=41  Score=32.89  Aligned_cols=51  Identities=14%  Similarity=0.148  Sum_probs=25.0

Q ss_pred             CCCCeeecCcEEEEEECCCCCCcEEEEEEeEeCCC-CeeEEEEecCCCCCEEE
Q psy4697          40 LRELQFSKNEYSVSALENLPVNYVLLTVTTNKPRD-LRVKYWLSNDYGERFSI   91 (383)
Q Consensus        40 dn~P~F~~~~y~~~V~En~~~gt~v~~v~A~D~D~-~~v~Ysi~~~~~~~F~I   91 (383)
                      |.||.|+.+.|+..+..-.. +..|.+-.+.+.+. ..++|-+..++-+...+
T Consensus        42 ~~PP~~PPPRYNyt~~w~~~-~~~IPSPF~d~~~~~veVr~Vtst~pCgmvAL   93 (402)
T PHA03265         42 DRPKEFPPPRYNYTILTRYN-ATALASPFINDQVKNVDLRIVTATRPCEMIAL   93 (402)
T ss_pred             CCCCCCCCCCCCceEEEeec-CCCCCCcccCCCCCceeeeeeeccCCcceEEE
Confidence            44788988888776553321 11222222233333 45555554444444444


No 121
>PHA03099 epidermal growth factor-like protein (EGF-like protein); Provisional
Probab=21.47  E-value=1e+02  Score=25.82  Aligned_cols=6  Identities=0%  Similarity=0.069  Sum_probs=2.4

Q ss_pred             eeecCC
Q psy4697         278 IHKNKH  283 (383)
Q Consensus       278 ~~r~~~  283 (383)
                      +||+-|
T Consensus       122 ~yr~~r  127 (139)
T PHA03099        122 VYRFTR  127 (139)
T ss_pred             hheeee
Confidence            344433


No 122
>PF05399 EVI2A:  Ectropic viral integration site 2A protein (EVI2A);  InterPro: IPR008608 This family contains several mammalian ectropic viral integration site 2A (EVI2A) proteins. The function of this protein is unknown although it is thought to be a membrane protein and may function as an oncogene in retrovirus induced myeloid tumours.; GO: 0016021 integral to membrane
Probab=21.12  E-value=41  Score=30.53  Aligned_cols=14  Identities=14%  Similarity=0.285  Sum_probs=5.1

Q ss_pred             HHHHHHHHHHHHHh
Q psy4697         262 VVLIVLGFVIILLI  275 (383)
Q Consensus       262 ~i~~lL~l~~~~l~  275 (383)
                      .||.+|+|-.++||
T Consensus       142 LICT~LfLSTVVLA  155 (227)
T PF05399_consen  142 LICTLLFLSTVVLA  155 (227)
T ss_pred             HHHHHHHHHHHHHH
Confidence            33333333333333


No 123
>PF08391 Ly49:  Ly49-like protein, N-terminal region;  InterPro: IPR013600 The sequences making up this entry are annotated as, or are similar to, Ly49 receptors (e.g. P20937 from SWISSPROT). These are type II transmembrane receptors expressed by mouse natural killer (NK) cells. They are classified as being activating (e.g.Ly49D and H) or inhibitory (e.g. Ly49A and G), depending on their effect on NK cell function. They are members of the C-type lectin receptor superfamily, and in fact in many family members this region is found immediately N-terminal to a lectin C-type domain (IPR001304 from INTERPRO). ; PDB: 1QO3_D 3C8J_D 1P4L_D 3C8K_D 3G8K_B 1JA3_B 3CAD_A 3G8L_A.
Probab=20.80  E-value=33  Score=28.36  Aligned_cols=22  Identities=18%  Similarity=0.357  Sum_probs=0.0

Q ss_pred             eeehhhHHHHHHHHHHHHHHhh
Q psy4697         255 TVLIILGVVLIVLGFVIILLIL  276 (383)
Q Consensus       255 ~li~~l~~i~~lL~l~~~~l~~  276 (383)
                      .++++||.+|++|++.++.|+.
T Consensus         6 liav~LGILCllLLvtv~vL~t   27 (119)
T PF08391_consen    6 LIAVALGILCLLLLVTVAVLGT   27 (119)
T ss_dssp             ----------------------
T ss_pred             HHHHHHHHHHHHHHHHHHHHHH
Confidence            4577888888876665555554


No 124
>PLN03150 hypothetical protein; Provisional
Probab=20.51  E-value=88  Score=33.39  Aligned_cols=12  Identities=33%  Similarity=0.501  Sum_probs=4.9

Q ss_pred             eehhhHHHHHHH
Q psy4697         256 VLIILGVVLIVL  267 (383)
Q Consensus       256 li~~l~~i~~lL  267 (383)
                      +.++++++++++
T Consensus       547 i~~~~~~~~~~l  558 (623)
T PLN03150        547 IGIAFGVSVAFL  558 (623)
T ss_pred             EEEEhHHHHHHH
Confidence            334444444333


No 125
>PF15065 NCU-G1:  Lysosomal transcription factor, NCU-G1
Probab=20.36  E-value=51  Score=32.54  Aligned_cols=12  Identities=25%  Similarity=0.274  Sum_probs=7.3

Q ss_pred             CcEEEEEEEEec
Q psy4697         108 EDSYRFTVYATD  119 (383)
Q Consensus       108 ~~~y~~~V~A~D  119 (383)
                      .....|+++|.+
T Consensus       128 ngsi~~~~~af~  139 (350)
T PF15065_consen  128 NGSIAFKLQAFS  139 (350)
T ss_pred             CCeEEEEEEEec
Confidence            556666666655


No 126
>PF02124 Marek_A:  Marek's disease glycoprotein A;  InterPro: IPR001038  Equid herpesvirus 1 (Equine herpesvirus 1, EHV-1) glycoprotein 13 (EHV-1 gp13) has the characteristic features of a membrane-spanning protein: an N-terminal signal sequence; a hydrophobic membrane anchor region; a charged C-terminal cytoplasmic tail; and an exterior domain with nine potential N-glycosylation sites. EHV-1 gp13 is the structural homologue of the gC-like glycoproteins of the Human herpesvirus 1 (HHV-1) and Human herpesvirus 2 (HHV-2) (gC-1 and gC-2 respectively), Pseudorabies virus (strain Indiana-Funkhauser/Becker) (PRV) (gIII) and Human herpesvirus 3 (HHV-3) (gp66).  Secretory glycoprotein GP57-65 precursor (glycoprotein A - GA) is similar to Herpesvirus glycoprotein C, and belongs to the immunoglobulin gene superfamily. GA is thought to play an immunoevasive role in the pathogenesis of Marek's disease. It is a candidate for causing the early-stage immunosuppression that occurs after MDHV infection.
Probab=20.30  E-value=5.3e+02  Score=23.53  Aligned_cols=20  Identities=5%  Similarity=-0.054  Sum_probs=12.4

Q ss_pred             eEEEEEEEeeCCCCCceeEE
Q psy4697         211 TIQLVVVATDTGNPPRQASV  230 (383)
Q Consensus       211 ~y~L~V~a~D~g~p~~sst~  230 (383)
                      .|.-.+.=.-.+-|..+.+.
T Consensus       152 ~YtC~l~GYP~~~p~f~~~~  171 (211)
T PF02124_consen  152 EYTCRLIGYPDILPVFEDTA  171 (211)
T ss_pred             eEEEEEeeCCCCCCcccceE
Confidence            67777765555556655543


Done!