Query         psy9968
Match_columns 118
No_of_seqs    156 out of 1155
Neff          8.1 
Searched_HMMs 46136
Date          Sat Aug 17 00:19:23 2013
Command       hhsearch -i /work/01045/syshi/Psyhhblits/psy9968.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/9968hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 smart00112 CA Cadherin repeats  99.8 6.3E-20 1.4E-24  111.9  10.5   79    6-111     1-79  (79)
  2 cd00031 CA Cadherin repeat dom  99.8 1.3E-17 2.8E-22  116.4  12.1   90    2-118    18-107 (199)
  3 KOG4289|consensus               99.7 7.9E-18 1.7E-22  142.1  10.0   92    1-118   803-894 (2531)
  4 KOG4289|consensus               99.7 2.1E-17 4.6E-22  139.6   8.9   91    1-118   289-379 (2531)
  5 PF00028 Cadherin:  Cadherin do  99.7 4.6E-16 9.9E-21   97.5  11.5   77    1-104    16-93  (93)
  6 KOG1219|consensus               99.6 5.1E-16 1.1E-20  135.1   8.0   91    1-118  1078-1168(4289)
  7 KOG1219|consensus               99.6 1.6E-15 3.4E-20  132.1  10.2   91    1-118   973-1063(4289)
  8 cd00031 CA Cadherin repeat dom  99.3 5.5E-11 1.2E-15   82.8  11.3   77    2-105   123-199 (199)
  9 KOG1834|consensus               98.7 7.2E-08 1.6E-12   78.2   9.5   91    3-118    54-154 (952)
 10 KOG1834|consensus               97.9 6.7E-05 1.5E-09   61.4   8.3   76    1-105   168-244 (952)
 11 smart00736 CADG Dystroglycan-t  97.2   0.021 4.5E-07   35.8  11.4   71    5-108    24-96  (97)
 12 PF08266 Cadherin_2:  Cadherin-  96.1   0.021 4.5E-07   35.3   5.0   32   17-49     34-65  (84)
 13 TIGR01965 VCBS_repeat VCBS rep  94.7    0.65 1.4E-05   29.6   9.3   82    2-112     3-85  (99)
 14 PF05345 He_PIG:  Putative Ig d  83.4     5.6 0.00012   21.8   6.4   36   28-89     13-48  (49)
 15 PF07495 Y_Y_Y:  Y_Y_Y domain;   69.1      18 0.00039   20.2   8.5   27   77-103    39-65  (66)
 16 PF08758 Cadherin_pro:  Cadheri  64.4      32 0.00068   21.3   7.4   45   15-91     37-81  (90)
 17 TIGR00845 caca sodium/calcium   54.6      74  0.0016   28.2   7.4   19   95-114   513-531 (928)
 18 PF03413 PepSY:  Peptidase prop  52.2      33 0.00071   18.7   3.6   29   14-42     29-62  (64)
 19 PF13754 Big_3_4:  Bacterial Ig  49.7      41  0.0009   18.5   3.7   14   77-90     24-37  (54)
 20 PF02494 HYR:  HYR domain;  Int  49.6      52  0.0011   19.3   4.4   25   77-103    57-81  (81)
 21 PF12245 Big_3_2:  Bacterial Ig  48.2      50  0.0011   18.6   5.4   33   77-111    23-55  (60)
 22 TIGR03660 T1SS_rpt_143 T1SS-14  46.9      86  0.0019   21.0   6.3   32   77-113    85-116 (137)
 23 PF14157 YmzC:  YmzC-like prote  42.7      59  0.0013   19.0   3.6   28   18-45     31-58  (63)
 24 smart00089 PKD Repeats in poly  39.3      76  0.0016   18.1   4.5   26   75-103    53-78  (79)
 25 PF09100 Qn_am_d_aIV:  Quinohem  38.6      98  0.0021   20.7   4.6   31   80-111   101-132 (133)
 26 cd00146 PKD polycystic kidney   34.7      91   0.002   17.9   3.8   26   75-102    55-80  (81)
 27 PF12971 NAGLU_N:  Alpha-N-acet  30.1   1E+02  0.0022   18.7   3.5   31   15-45     18-49  (86)
 28 PF00635 Motile_Sperm:  MSP (Ma  29.3      97  0.0021   18.9   3.4   27   15-43     32-58  (109)
 29 PF07145 PAM2:  Ataxin-2 C-term  28.4      42  0.0009   14.6   1.1   12  106-117     6-17  (18)
 30 cd06891 PX_Vps17p The phosphoi  27.3 1.7E+02  0.0036   19.8   4.4   39   76-118    25-63  (140)
 31 COG3212 Predicted membrane pro  24.9      97  0.0021   20.9   2.9   29   13-41    108-138 (144)
 32 PF13750 Big_3_3:  Bacterial Ig  24.1 2.4E+02  0.0052   19.2   5.3   27   76-104   122-148 (158)
 33 PF07861 WND:  WisP family N-Te  22.0 1.3E+02  0.0029   21.7   3.2   29   14-45    202-230 (263)
 34 PF01011 PQQ:  PQQ enzyme repea  21.2 1.3E+02  0.0027   14.9   3.6   18   28-45     10-27  (38)
 35 PF05688 DUF824:  Salmonella re  21.0 1.4E+02  0.0031   16.2   2.6   14   96-109    14-27  (47)
 36 PF15418 DUF4625:  Domain of un  20.4 1.8E+02  0.0038   19.3   3.5   15   76-90    106-120 (132)

No 1  
>smart00112 CA Cadherin repeats. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium.
Probab=99.84  E-value=6.3e-20  Score=111.91  Aligned_cols=79  Identities=43%  Similarity=0.663  Sum_probs=70.4

Q ss_pred             EeCCCCCCCCcEEEEEEeCCCCCccEEEeCCCCEEEEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcceEEEEEEE
Q psy9968           6 TDVDPPSNGGTIQYRIIKAPGERAKFSIDKETGIVKTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKEIYITVIA   85 (118)
Q Consensus         6 ~D~D~~~~n~~i~y~i~~~~~~~~~F~id~~tG~i~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~~~l~v~a   85 (118)
                      +|+|.| .|+.++|+|..+... .+|.|++.+|.|++.+.                      ||||   ....|.|.|.|
T Consensus         1 ~D~D~g-~n~~i~Y~i~~~~~~-~~F~i~~~tg~i~~~~~----------------------LD~e---~~~~y~l~v~a   53 (79)
T smart00112        1 TDADSG-ENGKVTYSILSGNED-GLFSIDPETGEITTTKP----------------------LDRE---EQPEYTLTVEA   53 (79)
T ss_pred             CCCCCC-cCcEEEEEEecCCCC-CEEEEeCCccEEEeCCc----------------------cCee---CCCeEEEEEEE
Confidence            488998 689999999965442 69999999999988888                      6887   55799999999


Q ss_pred             EECCCCCCeeEEEEEEEEEeCCCCCC
Q psy9968          86 EDNGTPQLSDACTMKITVEDINDNEP  111 (118)
Q Consensus        86 ~D~g~p~~s~~~~v~I~v~DvNDn~P  111 (118)
                      .|.|.|+++++++|.|+|.|+|||+|
T Consensus        54 ~D~~~~~~~~~~~v~I~V~D~Nd~~P   79 (79)
T smart00112       54 TDGGGPPLSSTATVTVTVLDVNDNAP   79 (79)
T ss_pred             EECCCCCcccEEEEEEEEEECCCCCC
Confidence            99999999999999999999999998


No 2  
>cd00031 CA Cadherin repeat domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion; these domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium; plays a role in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-,CNR-,proto-,and FAT-family cadherin, desmocollin, and desmoglein, exists as monomers or dimers (hetero- and homo-); two copies of the repeat are present here
Probab=99.76  E-value=1.3e-17  Score=116.37  Aligned_cols=90  Identities=41%  Similarity=0.698  Sum_probs=78.5

Q ss_pred             EEEEEeCCCCCCCCcEEEEEEeCCCCCccEEEeCCCCEEEEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcceEEE
Q psy9968           2 QVHATDVDPPSNGGTIQYRIIKAPGERAKFSIDKETGIVKTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKEIYI   81 (118)
Q Consensus         2 ~v~A~D~D~~~~n~~i~y~i~~~~~~~~~F~id~~tG~i~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~~~l   81 (118)
                      ++.|.|+|.+ .++.+.|+|...... .+|.|++.+|.|++.+.                      ||||   ....|.+
T Consensus        18 ~~~a~D~D~~-~~~~~~y~i~~~~~~-~~F~i~~~tG~l~~~~~----------------------lD~e---~~~~~~l   70 (199)
T cd00031          18 TVSATDPDSG-ENGRVTYSILGGNED-GLFSIDPNTGVITTTKP----------------------LDRE---EQSEYTL   70 (199)
T ss_pred             EEEEECCCCC-CCceEEEEEeCCCCc-ccEEEeCCCCEEEECCC----------------------CCCc---CCceEEE
Confidence            5789999998 489999999965442 69999999999999998                      6887   5579999


Q ss_pred             EEEEEECCCCCCeeEEEEEEEEEeCCCCCCeeCCCCC
Q psy9968          82 TVIAEDNGTPQLSDACTMKITVEDINDNEPMFDRVQY  118 (118)
Q Consensus        82 ~v~a~D~g~p~~s~~~~v~I~v~DvNDn~P~F~~~~Y  118 (118)
                      .|.|.|.|.|.++++..+.|.|.|+|||+|.|.+..|
T Consensus        71 ~v~a~D~g~~~~~~~~~v~I~V~d~Nd~~P~~~~~~~  107 (199)
T cd00031          71 TVVASDGGGPPLSSTATVTVTVLDVNDNPPVFEQSSY  107 (199)
T ss_pred             EEEEEECCcCcceeEEEEEEEEccCCCCCCcccccce
Confidence            9999999888888999999999999999999985443


No 3  
>KOG4289|consensus
Probab=99.74  E-value=7.9e-18  Score=142.15  Aligned_cols=92  Identities=33%  Similarity=0.557  Sum_probs=84.2

Q ss_pred             CEEEEEeCCCCCCCCcEEEEEEeCCCCCccEEEeCCCCEEEEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcceEE
Q psy9968           1 MQVHATDVDPPSNGGTIQYRIIKAPGERAKFSIDKETGIVKTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKEIY   80 (118)
Q Consensus         1 ~~v~A~D~D~~~~n~~i~y~i~~~~~~~~~F~id~~tG~i~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~~~   80 (118)
                      +||+|+|+|.| .|+++.|-+..+....+.|.|++.+|.|++.+.                      ||||   ....|.
T Consensus       803 lQVSatDaD~g-~Ng~v~y~~qg~~d~p~~F~IEptSGviRtl~r----------------------LdRE---~~avy~  856 (2531)
T KOG4289|consen  803 LQVSATDADSG-PNGRVYYTFQGGDDGPGDFYIEPTSGVIRTLRR----------------------LDRE---NVAVYV  856 (2531)
T ss_pred             EEEEEeccCCC-CCceEEEEecCCCCCCCceEEccCcceeehhhh----------------------hcch---heeEEE
Confidence            58999999999 799999998765555578999999999999999                      7998   678999


Q ss_pred             EEEEEEECCCCCCeeEEEEEEEEEeCCCCCCeeCCCCC
Q psy9968          81 ITVIAEDNGTPQLSDACTMKITVEDINDNEPMFDRVQY  118 (118)
Q Consensus        81 l~v~a~D~g~p~~s~~~~v~I~v~DvNDn~P~F~~~~Y  118 (118)
                      |.+.|.|.|.|++++++.|+|+|.|+|||||+|++.+|
T Consensus       857 L~a~avDrg~p~ls~~~eItvtvldvNDnaPvfe~~e~  894 (2531)
T KOG4289|consen  857 LAAYAVDRGNPPLSAPVEITVTVLDVNDNAPVFEQDEL  894 (2531)
T ss_pred             EEEEEeeCCCCCcCCceEEEEEEEecCCCCCCCCCcce
Confidence            99999999999999999999999999999999999875


No 4  
>KOG4289|consensus
Probab=99.72  E-value=2.1e-17  Score=139.64  Aligned_cols=91  Identities=35%  Similarity=0.607  Sum_probs=82.1

Q ss_pred             CEEEEEeCCCCCCCCcEEEEEEeCCCCCccEEEeCCCCEEEEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcceEE
Q psy9968           1 MQVHATDVDPPSNGGTIQYRIIKAPGERAKFSIDKETGIVKTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKEIY   80 (118)
Q Consensus         1 ~~v~A~D~D~~~~n~~i~y~i~~~~~~~~~F~id~~tG~i~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~~~   80 (118)
                      ++|.|+|.|.+ .|+.|+|++..+ +....|.||+.+|.|.+..+                      ||||   ....|.
T Consensus       289 LtvrAtD~Dsp-~Nani~Yrl~eg-~~~~~f~in~rSGvI~T~a~----------------------lDRE---~~~~y~  341 (2531)
T KOG4289|consen  289 LTVRATDGDSP-PNANIRYRLLEG-NAKNVFEINPRSGVISTRAP----------------------LDRE---ELESYQ  341 (2531)
T ss_pred             EEEEeccCCCC-CCCceEEEecCC-CccceeEEcCccceeeccCc----------------------cCHH---hhhheE
Confidence            47999999999 799999999954 44579999999999999999                      7887   556899


Q ss_pred             EEEEEEECCCCCCeeEEEEEEEEEeCCCCCCeeCCCCC
Q psy9968          81 ITVIAEDNGTPQLSDACTMKITVEDINDNEPMFDRVQY  118 (118)
Q Consensus        81 l~v~a~D~g~p~~s~~~~v~I~v~DvNDn~P~F~~~~Y  118 (118)
                      |.|+|+|.|.|+...++.|.|+|.|+|||+|+|....|
T Consensus       342 L~VeAsDqG~~pgp~Ta~V~itV~D~NDNaPqFse~~Y  379 (2531)
T KOG4289|consen  342 LDVEASDQGRPPGPRTAMVEITVEDENDNAPQFSEKRY  379 (2531)
T ss_pred             EEEEeccCCCCCCCceEEEEEEEEecCCCCccccccce
Confidence            99999999998877899999999999999999998776


No 5  
>PF00028 Cadherin:  Cadherin domain;  InterPro: IPR002126 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism which modulate a wide variety of processes including cell polarisation and migration [, ,]. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionary related to the desmogleins which are component of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; a single transmembrane domain and five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats, and the fifth contains 4 conserved cysteines and a N-terminal cytoplasmic domain []. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats. 
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding, to confer rigidity upon the extracellular domain and is essential for cadherin adhesive function and for protection against protease digestion.; GO: 0005509 calcium ion binding, 0007156 homophilic cell adhesion, 0016020 membrane; PDB: 2A4E_A 2A4C_B 2O72_A 2QVI_A 1NCJ_A 3Q2W_A 3Q2N_A 3LNH_B 3LNI_A 3Q2L_A ....
Probab=99.70  E-value=4.6e-16  Score=97.45  Aligned_cols=77  Identities=38%  Similarity=0.485  Sum_probs=68.5

Q ss_pred             CEEEEEeCCCCCCCCcEEEEEEeCCCCCccEEEeCCCCEEEEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcceEE
Q psy9968           1 MQVHATDVDPPSNGGTIQYRIIKAPGERAKFSIDKETGIVKTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKEIY   80 (118)
Q Consensus         1 ~~v~A~D~D~~~~n~~i~y~i~~~~~~~~~F~id~~tG~i~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~~~   80 (118)
                      +++.|.|+|.+ .|+.+.|+|..+.. ..+|.|++.+|.|++.+.                      ||||   ..+.|.
T Consensus        16 ~~v~a~D~D~~-~n~~i~y~i~~~~~-~~~F~I~~~tg~i~~~~~----------------------LD~E---~~~~y~   68 (93)
T PF00028_consen   16 GQVTATDPDSG-PNSQITYSILGGNP-DGLFSIDPNTGEISLKKP----------------------LDRE---TQSSYQ   68 (93)
T ss_dssp             EEEEEEESSTS-TTSSEEEEEEETTS-TTSEEEETTTTEEEESSS----------------------SCTT---TTSEEE
T ss_pred             EEEEEEeCCCC-CCceEEEEEecCcc-cCceEEeeeeecccccee----------------------cCcc---cCCEEE
Confidence            36899999988 79999999997553 469999999999999999                      6887   567999


Q ss_pred             EEEEEEEC-CCCCCeeEEEEEEEEE
Q psy9968          81 ITVIAEDN-GTPQLSDACTMKITVE  104 (118)
Q Consensus        81 l~v~a~D~-g~p~~s~~~~v~I~v~  104 (118)
                      |.|.|+|. |.|+++++++|.|+|.
T Consensus        69 l~v~a~D~~~~~~~~~~~~V~I~V~   93 (93)
T PF00028_consen   69 LTVRATDSGGSPPLSSTATVTINVL   93 (93)
T ss_dssp             EEEEEEETTTSSEEEEEEEEEEEEE
T ss_pred             EEEEEEECCCCCCCEEEEEEEEEEC
Confidence            99999999 8899999999999985


No 6  
>KOG1219|consensus
Probab=99.64  E-value=5.1e-16  Score=135.07  Aligned_cols=91  Identities=32%  Similarity=0.538  Sum_probs=82.8

Q ss_pred             CEEEEEeCCCCCCCCcEEEEEEeCCCCCccEEEeCCCCEEEEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcceEE
Q psy9968           1 MQVHATDVDPPSNGGTIQYRIIKAPGERAKFSIDKETGIVKTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKEIY   80 (118)
Q Consensus         1 ~~v~A~D~D~~~~n~~i~y~i~~~~~~~~~F~id~~tG~i~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~~~   80 (118)
                      +|+.|+|+|.. .|+++.|.|.+ ++...+|+||+.||.|++.+.                      ||||   +++++.
T Consensus      1078 vq~ea~D~Dss-sn~kLmykI~s-Gnyq~FF~Id~~TG~iTt~r~----------------------LDRE---~qdEHi 1130 (4289)
T KOG1219|consen 1078 VQAEANDPDSS-SNQKLMYKITS-GNYQGFFQIDPETGLITTIRR----------------------LDRE---KQDEHI 1130 (4289)
T ss_pred             EEeccCCCCcc-cCcceEEEEcc-CCccceEEEccccceeeeehh----------------------hccc---ccccce
Confidence            47889999977 68999999994 555689999999999999999                      7998   889999


Q ss_pred             EEEEEEECCCCCCeeEEEEEEEEEeCCCCCCeeCCCCC
Q psy9968          81 ITVIAEDNGTPQLSDACTMKITVEDINDNEPMFDRVQY  118 (118)
Q Consensus        81 l~v~a~D~g~p~~s~~~~v~I~v~DvNDn~P~F~~~~Y  118 (118)
                      |.|.+.|.|.|++.+..-|.|.|+|+|||+|+|.|..|
T Consensus      1131 LeVTi~D~gep~l~s~~rviV~IldvNdnsp~Flqk~~ 1168 (4289)
T KOG1219|consen 1131 LEVTIQDNGEPWLCSNQRVIVSILDVNDNSPRFLQKKT 1168 (4289)
T ss_pred             EEEEEecCCCCccccceEEEEEEeeccCCchhhhhhee
Confidence            99999999999999999999999999999999998754


No 7  
>KOG1219|consensus
Probab=99.63  E-value=1.6e-15  Score=132.15  Aligned_cols=91  Identities=40%  Similarity=0.646  Sum_probs=83.4

Q ss_pred             CEEEEEeCCCCCCCCcEEEEEEeCCCCCccEEEeCCCCEEEEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcceEE
Q psy9968           1 MQVHATDVDPPSNGGTIQYRIIKAPGERAKFSIDKETGIVKTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKEIY   80 (118)
Q Consensus         1 ~~v~A~D~D~~~~n~~i~y~i~~~~~~~~~F~id~~tG~i~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~~~   80 (118)
                      ++|.|.|.|.| ..+.++|+|.. +...+.|+||..+|.|++.+.                      ||||   ....|.
T Consensus       973 i~i~A~dedsg-ldg~l~Y~I~~-gdg~g~FsId~~tG~irTl~~----------------------lDrE---~ks~Yw 1025 (4289)
T KOG1219|consen  973 IRIQARDEDSG-LDGELSYKIRT-GDGDGIFSIDSTTGSIRTLKA----------------------LDRE---KKSSYW 1025 (4289)
T ss_pred             EEEEEecCCCC-ccceEEEEEEc-CCcceeEEecCCcceEeechh----------------------hchh---hcceEE
Confidence            46899999999 78999999995 344578999999999999999                      7888   778999


Q ss_pred             EEEEEEECCCCCCeeEEEEEEEEEeCCCCCCeeCCCCC
Q psy9968          81 ITVIAEDNGTPQLSDACTMKITVEDINDNEPMFDRVQY  118 (118)
Q Consensus        81 l~v~a~D~g~p~~s~~~~v~I~v~DvNDn~P~F~~~~Y  118 (118)
                      |++.|.|.|.+++++.+.+.|.|+|+|||+|+|.++.|
T Consensus      1026 ltveA~D~gt~~~ssv~~vyI~ieDvNDn~Pq~s~pvy 1063 (4289)
T KOG1219|consen 1026 LTVEAKDLGTVPLSSVCEVYIEIEDVNDNVPQFSSPVY 1063 (4289)
T ss_pred             EEEEEEecCCCccccceeEEEEEEecCCCCcccCCceE
Confidence            99999999999999999999999999999999999877


No 8  
>cd00031 CA Cadherin repeat domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion; these domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium; plays a role in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-,CNR-,proto-,and FAT-family cadherin, desmocollin, and desmoglein, exists as monomers or dimers (hetero- and homo-); two copies of the repeat are present here
Probab=99.30  E-value=5.5e-11  Score=82.80  Aligned_cols=77  Identities=38%  Similarity=0.588  Sum_probs=67.0

Q ss_pred             EEEEEeCCCCCCCCcEEEEEEeCCCCCccEEEeCCCCEEEEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcceEEE
Q psy9968           2 QVHATDVDPPSNGGTIQYRIIKAPGERAKFSIDKETGIVKTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKEIYI   81 (118)
Q Consensus         2 ~v~A~D~D~~~~n~~i~y~i~~~~~~~~~F~id~~tG~i~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~~~l   81 (118)
                      ++.|+|+|.+ .++.++|+|..... ..+|.|++.+|.|++.+.                      ||+|   ....|.+
T Consensus       123 ~~~a~D~D~~-~~~~~~y~l~~~~~-~~~f~i~~~~G~i~~~~~----------------------ld~e---~~~~~~l  175 (199)
T cd00031         123 TVTATDADSG-ENAKLTYSILSGND-KELFSIDPNTGIITLAKP----------------------LDRE---EKSSYEL  175 (199)
T ss_pred             EEEEEcCCCC-CCccEEEEEeCCCC-CCEEEEeCCceEEEeCCc----------------------cCCc---cCceEEE
Confidence            6889999998 68999999996544 359999999999999988                      6776   4568999


Q ss_pred             EEEEEECCCCCCeeEEEEEEEEEe
Q psy9968          82 TVIAEDNGTPQLSDACTMKITVED  105 (118)
Q Consensus        82 ~v~a~D~g~p~~s~~~~v~I~v~D  105 (118)
                      .|.|.|.+.+++++++.+.|.|.|
T Consensus       176 ~v~a~D~~~~~~~~~~~i~i~v~d  199 (199)
T cd00031         176 TVVATDGGGPPLSSTATVTVTVLD  199 (199)
T ss_pred             EEEEEECCCCCceeEEEEEEEEEC
Confidence            999999998889999999999876


No 9  
>KOG1834|consensus
Probab=98.75  E-value=7.2e-08  Score=78.16  Aligned_cols=91  Identities=31%  Similarity=0.455  Sum_probs=67.9

Q ss_pred             EEEEeCCCCC-CCCcE-EEEEEeCCCCCccEEEeCCCC--EEEEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcce
Q psy9968           3 VHATDVDPPS-NGGTI-QYRIIKAPGERAKFSIDKETG--IVKTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKE   78 (118)
Q Consensus         3 v~A~D~D~~~-~n~~i-~y~i~~~~~~~~~F~id~~tG--~i~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~   78 (118)
                      +.|-|.|.+- ..|.| -|.|.+++-.-....+|..||  .|+.+.+                      ||.|   .++.
T Consensus        54 l~aLdkdaplr~ageiC~fklhgq~vPFdavVvdK~TGegvlRaK~~----------------------lDCe---lqke  108 (952)
T KOG1834|consen   54 LAALDKDAPLRYAGEICGFKLHGQPVPFDAVVVDKYTGEGVLRAKEP----------------------LDCE---LQKE  108 (952)
T ss_pred             eeeecCCCCcccccccceeEecCCCCCceEEEEeccCCceEEeecCc----------------------cccc---cccc
Confidence            3466777652 23555 677775443223455799998  5666666                      7887   6789


Q ss_pred             EEEEEEEEECCCC------CCeeEEEEEEEEEeCCCCCCeeCCCCC
Q psy9968          79 IYITVIAEDNGTP------QLSDACTMKITVEDINDNEPMFDRVQY  118 (118)
Q Consensus        79 ~~l~v~a~D~g~p------~~s~~~~v~I~v~DvNDn~P~F~~~~Y  118 (118)
                      |+++|+|.|+|+.      ..|..++|.|+|.|+|.++|+|..+.|
T Consensus       109 ytf~iQAydCg~gpdgtn~kKShkatvhIrVkDvNe~AP~f~ep~Y  154 (952)
T KOG1834|consen  109 YTFTIQAYDCGNGPDGTNTKKSHKATVHIRVKDVNEFAPVFKEPWY  154 (952)
T ss_pred             ceEEEEEEecCCCCCccccccccceEEEEEeccccccCchhcccce
Confidence            9999999999763      467778999999999999999998877


No 10 
>KOG1834|consensus
Probab=97.91  E-value=6.7e-05  Score=61.42  Aligned_cols=76  Identities=26%  Similarity=0.423  Sum_probs=60.2

Q ss_pred             CEEEEEeCCCCCCCCcE-EEEEEeCCCCCccEEEeCCCCEEEEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcceE
Q psy9968           1 MQVHATDVDPPSNGGTI-QYRIIKAPGERAKFSIDKETGIVKTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKEI   79 (118)
Q Consensus         1 ~~v~A~D~D~~~~n~~i-~y~i~~~~~~~~~F~id~~tG~i~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~~   79 (118)
                      ++|.|.|.|-+..+++| .|.|.. ++  -.|.||. .|.|+....                      |.+.   ++.+|
T Consensus       168 l~veAiD~DCspq~sqIC~YEI~t-~d--~PFaIdn-~G~irnTek----------------------Lny~---ke~~Y  218 (952)
T KOG1834|consen  168 LRVEAIDKDCSPQYSQICEYEITT-PD--VPFAIDN-DGNIRNTEK----------------------LNYT---KEHQY  218 (952)
T ss_pred             EEEEeecCCCCCcccceeEEEecC-CC--CceEEcC-CCccccccc----------------------cccc---cceeE
Confidence            47899999988667777 788884 43  3799986 599999998                      5665   56799


Q ss_pred             EEEEEEEECCCCCCeeEEEEEEEEEe
Q psy9968          80 YITVIAEDNGTPQLSDACTMKITVED  105 (118)
Q Consensus        80 ~l~v~a~D~g~p~~s~~~~v~I~v~D  105 (118)
                      .|+|.|.|+|..+..+-..|+|+|..
T Consensus       219 ~ltVtAyDCg~kraa~d~lV~v~Vkp  244 (952)
T KOG1834|consen  219 KLTVTAYDCGKKRAASDSLVTVHVKP  244 (952)
T ss_pred             EEEEEEEecccccccCcceEEEEecC
Confidence            99999999998665555778888753


No 11 
>smart00736 CADG Dystroglycan-type cadherin-like domains. Cadherin-homologous domains present in metazoan dystroglycans and alpha/epsilon sarcoglycans, yeast Axl2p and in a very large protein from magnetotactic bacteria. Likely to bind calcium ions.
Probab=97.21  E-value=0.021  Score=35.84  Aligned_cols=71  Identities=28%  Similarity=0.333  Sum_probs=51.5

Q ss_pred             EEeCCCCCCCCcEEEEEEeCC--CCCccEEEeCCCCEEEEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcceEEEE
Q psy9968           5 ATDVDPPSNGGTIQYRIIKAP--GERAKFSIDKETGIVKTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKEIYIT   82 (118)
Q Consensus         5 A~D~D~~~~n~~i~y~i~~~~--~~~~~F~id~~tG~i~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~~~l~   82 (118)
                      ..|+| +   ..+.|++....  +...|.+.|+.++.+.-. +                      ...+    ...|.+.
T Consensus        24 F~d~d-~---~~lty~~~~~~~~~lP~Wl~fd~~~~~~~Gt-P----------------------~~~~----~g~~~i~   72 (97)
T smart00736       24 FTDAD-G---DTLTYSATLSDGSALPSWLSFDSDTGTLSGT-P----------------------TNSD----VGSLSLK   72 (97)
T ss_pred             eECCC-C---CeEEEEEEeCCCCCCCCeEEEeCCCCEEEEE-C----------------------CCCC----CcEEEEE
Confidence            45666 3   46899987432  224599999999988774 4                      1221    2469999


Q ss_pred             EEEEECCCCCCeeEEEEEEEEEeCCC
Q psy9968          83 VIAEDNGTPQLSDACTMKITVEDIND  108 (118)
Q Consensus        83 v~a~D~g~p~~s~~~~v~I~v~DvND  108 (118)
                      |.|+|..+  .+....+.|.|.+.|+
T Consensus        73 v~a~D~~g--~~~~~~f~i~V~~~~~   96 (97)
T smart00736       73 VTATDSSG--ASASDTFTITVVNTND   96 (97)
T ss_pred             EEEEECCC--CEEEEEEEEEEeCCCC
Confidence            99999875  5677889999999887


No 12 
>PF08266 Cadherin_2:  Cadherin-like;  InterPro: IPR013164 Cadherins are a family of adhesion molecules that mediate Ca2+-dependent cell-cell adhesion in all solid tissues of the organism which modulate a wide variety of processes including cell polarisation and migration [, ,]. Cadherin-mediated cell-cell junctions are formed as a result of interaction between extracellular domains of identical cadherins, which are located on the membranes of the neighbouring cells. The stability of these adhesive junctions is ensured by binding of the intracellular cadherin domain with the actin cytoskeleton. There are a number of different isoforms distributed in a tissue-specific manner in a wide variety of organisms. Cells containing different cadherins tend to segregate in vitro, while those that contain the same cadherins tend to preferentially aggregate together. This observation is linked to the finding that cadherin expression causes morphological changes involving the positional segregation of cells into layers, suggesting they may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionary related to the desmogleins which are component of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; a single transmembrane domain and five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats, and the fifth contains 4 conserved cysteines and a N-terminal cytoplasmic domain []. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats. 
A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding, to confer rigidity upon the extracellular domain and is essential for cadherin adhesive function and for protection against protease digestion. This entry represents a cadherin domain that is usually found at the N terminus of cadherin proteins.; PDB: 1WUZ_A 1WYJ_A.
Probab=96.06  E-value=0.021  Score=35.31  Aligned_cols=32  Identities=22%  Similarity=0.417  Sum_probs=20.6

Q ss_pred             EEEEEEeCCCCCccEEEeCCCCEEEEeeeecCC
Q psy9968          17 IQYRIIKAPGERAKFSIDKETGIVKTLYALDRD   49 (118)
Q Consensus        17 i~y~i~~~~~~~~~F~id~~tG~i~~~~~ld~e   49 (118)
                      ..|+|...+ ...+|.+++.||.|++...+|+|
T Consensus        34 ~~~ri~s~~-~~~~~~v~~~tG~L~v~~rIDRE   65 (84)
T PF08266_consen   34 RNFRIVSEG-NSQYFRVNEKTGDLFVSERIDRE   65 (84)
T ss_dssp             TTBEEE-SS-SS-SEEE-TTTSEEEESS--SCC
T ss_pred             cceEEeecC-CcceeEecCCceeEEeCCccCHH
Confidence            357776543 34799999999999999995444


No 13 
>TIGR01965 VCBS_repeat VCBS repeat. This domain of about 100 residues is found multiple (up to 35) copies in long proteins from several species of Vibrio, Colwellia, Bradyrhizobium, and Shewanella (hence the name VCBS) and in smaller copy numbers in proteins from several other bacteria. The large protein size and repeat copy numbers, species distribution, and suggested activities of several member proteins suggests a role for this domain in adhesion.
Probab=94.70  E-value=0.65  Score=29.59  Aligned_cols=82  Identities=27%  Similarity=0.280  Sum_probs=48.7

Q ss_pred             EEEEEeCCCCCCCCcEEEEEEeCCCCCccEEEeCCCCEE-EEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcceEE
Q psy9968           2 QVHATDVDPPSNGGTIQYRIIKAPGERAKFSIDKETGIV-KTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKEIY   80 (118)
Q Consensus         2 ~v~A~D~D~~~~n~~i~y~i~~~~~~~~~F~id~~tG~i-~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~~~   80 (118)
                      ++.++|+|.+.   ...++.......-..|.|++ .|.- +....   .            ..++|.|...   ....-.
T Consensus         3 ~Lt~sD~D~gd---~~~~s~~~~~g~yGtlti~~-~G~wtYtl~n---~------------~~avq~L~~G---e~~tds   60 (99)
T TIGR01965         3 QLTISDADAGQ---AHFIAQTDAAGQYGTFSIDA-DGQWTYQADN---S------------QTAVQALKAG---ETLTDT   60 (99)
T ss_pred             ceEEeCCCCCC---ceEEecccccCCcEEEEECC-CCcEEEEeCC---C------------cHHHHhhcCC---CEEEEE
Confidence            68899999872   34566643222335688887 5642 22211   0            0134445543   222345


Q ss_pred             EEEEEEECCCCCCeeEEEEEEEEEeCCCCCCe
Q psy9968          81 ITVIAEDNGTPQLSDACTMKITVEDINDNEPM  112 (118)
Q Consensus        81 l~v~a~D~g~p~~s~~~~v~I~v~DvNDn~P~  112 (118)
                      +.+.+.|+      .+.+|.|+|.-.|| +|+
T Consensus        61 Ftvtv~DG------tt~~vtItI~GtND-apv   85 (99)
T TIGR01965        61 FTVTSADG------TSQTVTITITGAND-AAV   85 (99)
T ss_pred             EEEEEeCC------CeEEEEEEEEccCC-CCE
Confidence            77778884      28889999999999 553


No 14 
>PF05345 He_PIG:  Putative Ig domain;  InterPro: IPR008009 This alignment represents the conserved core region of a ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to Hyalin (IPR003410 from INTERPRO) and the PKD domain (IPR000601 from INTERPRO) suggest an Ig-like fold so this family may be similar in function to the (IPR003791 from INTERPRO) and (IPR003790 from INTERPRO) protein families.
Probab=83.36  E-value=5.6  Score=21.77  Aligned_cols=36  Identities=22%  Similarity=0.232  Sum_probs=26.5

Q ss_pred             CccEEEeCCCCEEEEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcceEEEEEEEEECC
Q psy9968          28 RAKFSIDKETGIVKTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKEIYITVIAEDNG   89 (118)
Q Consensus        28 ~~~F~id~~tG~i~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~~~l~v~a~D~g   89 (118)
                      .....+|+.+|.|.-.-.-                      .-    ....|.+.|.|+|..
T Consensus        13 P~gLs~d~~tG~isGtp~~----------------------~~----~~G~y~~~vtatd~~   48 (49)
T PF05345_consen   13 PSGLSLDPSTGTISGTPTS----------------------SV----QPGTYTFTVTATDGS   48 (49)
T ss_pred             CCcEEEeCCCCEEEeecCC----------------------Cc----cccEEEEEEEEEcCC
Confidence            4589999999999887441                      10    124799999999874


No 15 
>PF07495 Y_Y_Y:  Y_Y_Y domain;  InterPro: IPR011123 This region is mostly found at the end of the beta propellers (IPR011110 from INTERPRO) in a family of two component regulators. However they are also found tandemly repeated in Q891H4 from SWISSPROT without other signal conduction domains being present. It is named after the conserved tyrosines found in the alignment. The exact function is not known.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=69.06  E-value=18  Score=20.20  Aligned_cols=27  Identities=30%  Similarity=0.314  Sum_probs=18.0

Q ss_pred             ceEEEEEEEEECCCCCCeeEEEEEEEE
Q psy9968          77 KEIYITVIAEDNGTPQLSDACTMKITV  103 (118)
Q Consensus        77 ~~~~l~v~a~D~g~p~~s~~~~v~I~v  103 (118)
                      ..|.|.|.|.|..+........+.|+|
T Consensus        39 G~Y~l~V~a~~~~~~~~~~~~~l~i~I   65 (66)
T PF07495_consen   39 GKYTLEVRAKDNNGKWSSDEKSLTITI   65 (66)
T ss_dssp             EEEEEEEEEEETTS-B-SS-EEEEEEE
T ss_pred             EEEEEEEEEECCCCCcCcccEEEEEEE
Confidence            589999999998664333336666665


No 16 
>PF08758 Cadherin_pro:  Cadherin prodomain like;  InterPro: IPR014868 Cadherins are a group of proteins that mediate calcium-dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This protein corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions.; GO: 0007155 cell adhesion, 0016021 integral to membrane; PDB: 1OP4_A.
Probab=64.41  E-value=32  Score=21.34  Aligned_cols=45  Identities=18%  Similarity=0.186  Sum_probs=26.2

Q ss_pred             CcEEEEEEeCCCCCccEEEeCCCCEEEEeeeecCCCCcccceeeeeeeeecccCCCCCCCCcceEEEEEEEEECCCC
Q psy9968          15 GTIQYRIIKAPGERAKFSIDKETGIVKTLYALDRDDPEREKEIYITTDIRVQALDRDDPEREKEIYITVIAEDNGTP   91 (118)
Q Consensus        15 ~~i~y~i~~~~~~~~~F~id~~tG~i~~~~~ld~e~~~~~~~~~~~~~~~~~~lDre~~~~~~~~~l~v~a~D~g~p   91 (118)
                      ..+.|.-. +    ..|.|.++ |.|++++.+                      ...    ...-.+.|.|.|..+.
T Consensus        37 ~~~~~~ss-D----pdF~V~~D-GsVy~~r~v----------------------~l~----~~~~~F~V~a~D~~~~   81 (90)
T PF08758_consen   37 RRVIFESS-D----PDFRVLED-GSVYAKRPV----------------------QLS----SEQRSFTVHAWDSQTQ   81 (90)
T ss_dssp             --EEEE--------SEEEEETT-TEEEEES------------------------S-S----SS-EEEEEEEEETTTT
T ss_pred             CceEEecC-C----CCEEEcCC-CeEEEeeeE----------------------ecC----CCceEEEEEEECCCCC
Confidence            44565553 1    26999864 999999984                      332    1234799999998763


No 17 
>TIGR00845 caca sodium/calcium exchanger 1. This model is specific for the eukaryotic sodium ion/calcium ion exchangers of the Caca family
Probab=54.57  E-value=74  Score=28.16  Aligned_cols=19  Identities=26%  Similarity=0.483  Sum_probs=13.3

Q ss_pred             eEEEEEEEEEeCCCCCCeeC
Q psy9968          95 DACTMKITVEDINDNEPMFD  114 (118)
Q Consensus        95 ~~~~v~I~v~DvNDn~P~F~  114 (118)
                      ...+.+|+|.| ||++|.|.
T Consensus       513 ~ps~ATVTIlD-DD~aGIfs  531 (928)
T TIGR00845       513 SPNTATVTILD-DDHAGIFT  531 (928)
T ss_pred             CCceEEEEEec-CcccCccc
Confidence            33466777787 88898754


No 18 
>PF03413 PepSY:  Peptidase propeptide and YPEB domain This Prosite motif covers only the active site. This is family M4 in the peptidase classification. ;  InterPro: IPR005075  This signature, PepSY, is found in the propeptide of members of the MEROPS peptidase family M4 (clan MA(E)), which contains the thermostable thermolysins (3.4.24.27 from EC), and related thermolabile neutral proteases (bacillolysins) (3.4.24.28 from EC) from various species of Bacillus. It is also found in many non-peptidase proteins, including Bacillus subtilis YpeB protein - a regulator of SleB spore cortex lytic enzyme - and a large number of eubacterial and archaeal cell wall-associated and secreted proteins which are mostly annotated as 'hypothetical protein'. Many extracellular bacterial proteases are produced as proenzymes. The propeptides usually have a dual function, i.e. they function as an intramolecular chaperone required for the folding of the polypeptide and as an inhibitor preventing premature activation of the enzyme. Analysis of the propeptide region of the M4 family of peptidases reveals two regions of conservation, the PepSY domain and a second domain, proximate to the N terminus, the FTP domain (IPR011096 from INTERPRO), which is also found in isolation in the propeptide of eukaryotic peptidases belonging to MEROPS peptidase family M36.  In propeptide domain swapping experiments, replacing the propeptide domain of PA protease with that of vibrolysin (both propeptides contain the FTP and PepSY domains) allows the PA protease domain to fold correctly and inhibits the C-terminal autoprocessing activity. However, swapping the propeptide of PA protease for the thermolysin propeptide does not facilitate the correct folding or the processing of the chimaeric protein into an active peptidase.
Mutational analysis of the Pseudomonas aeruginosa elastase gene revealed two mutations in the propeptide which resulted in the loss of inhibitory activity but not chaperone activity: A-15V and T-153I (where +1 is defined as the first residue of the mature peptidase). Both mutations resulted in peptidase activity, the T-153V mutation being much less effective than the A-15I mutation in activating peptidase activity. The T-153V mutation lies N-terminal to the FTP domain while the A-15I mutation is C-terminal to the PepSY domain.  Given the diverse range of other proteins in which both domains occur in isolation, the exact function of each is still unclear, though it has been proposed that the PepSY domain primarily has inhibitory activity and acts in conjunction with the FTP domain in chaperone activity.; GO: 0008237 metallopeptidase activity, 0008270 zinc ion binding, 0006508 proteolysis, 0005576 extracellular region; PDB: 2GU3_A 3NQZ_A 3NQY_A 2KGY_A.
Probab=52.23  E-value=33  Score=18.72  Aligned_cols=29  Identities=17%  Similarity=0.336  Sum_probs=16.3

Q ss_pred             CCcEEEEEEeCC---CCCc--cEEEeCCCCEEEE
Q psy9968          14 GGTIQYRIIKAP---GERA--KFSIDKETGIVKT   42 (118)
Q Consensus        14 n~~i~y~i~~~~---~~~~--~F~id~~tG~i~~   42 (118)
                      ++...|.+.-..   +...  .+.||+.||.|.-
T Consensus        29 ~~~~~Y~v~~~~~~~~~~~~~~v~VDa~tG~Il~   62 (64)
T PF03413_consen   29 NGRLVYEVEVVSDDDPDGGEYEVYVDAYTGEILS   62 (64)
T ss_dssp             TCEEEEEEEEEBTTSTTTEEEEEEEETTT--EEE
T ss_pred             CCcEEEEEEEEEEecCCCCEEEEEEECCCCeEEE
Confidence            466778876431   2223  3559999998753


No 19 
>PF13754 Big_3_4:  Bacterial Ig-like domain (group 3)
Probab=49.73  E-value=41  Score=18.47  Aligned_cols=14  Identities=36%  Similarity=0.278  Sum_probs=11.8

Q ss_pred             ceEEEEEEEEECCC
Q psy9968          77 KEIYITVIAEDNGT   90 (118)
Q Consensus        77 ~~~~l~v~a~D~g~   90 (118)
                      ..|.+.+.|+|..+
T Consensus        24 G~y~itv~a~D~AG   37 (54)
T PF13754_consen   24 GTYTITVTATDAAG   37 (54)
T ss_pred             ccEEEEEEEEeCCC
Confidence            47999999999844


No 20 
>PF02494 HYR:  HYR domain;  InterPro: IPR003410 This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain; it may be involved in cell adhesion. In the Sushi repeat-containing protein (SrpX), this domain is found between two sushi repeats.
Probab=49.60  E-value=52  Score=19.33  Aligned_cols=25  Identities=28%  Similarity=0.395  Sum_probs=19.2

Q ss_pred             ceEEEEEEEEECCCCCCeeEEEEEEEE
Q psy9968          77 KEIYITVIAEDNGTPQLSDACTMKITV  103 (118)
Q Consensus        77 ~~~~l~v~a~D~g~p~~s~~~~v~I~v  103 (118)
                      ..+.+...|.|..+  ..+++.+.|+|
T Consensus        57 G~t~V~ytA~D~~G--N~a~C~f~V~V   81 (81)
T PF02494_consen   57 GTTTVTYTATDAAG--NSATCSFTVTV   81 (81)
T ss_pred             ceEEEEEEEEECCC--CEEEEEEEEEC
Confidence            47889999999754  46788877764


No 21 
>PF12245 Big_3_2:  Bacterial Ig-like domain (group 3);  InterPro: IPR022038  This family of proteins is found in bacteria. They have two conserved sequence motifs: AGN and GMT. 
Probab=48.24  E-value=50  Score=18.61  Aligned_cols=33  Identities=27%  Similarity=0.333  Sum_probs=20.7

Q ss_pred             ceEEEEEEEEECCCCCCeeEEEEEEEEEeCCCCCC
Q psy9968          77 KEIYITVIAEDNGTPQLSDACTMKITVEDINDNEP  111 (118)
Q Consensus        77 ~~~~l~v~a~D~g~p~~s~~~~v~I~v~DvNDn~P  111 (118)
                      ..|.+.+.+.|..+  ..+.....+.+.|..-..|
T Consensus        23 g~yt~~v~a~D~AG--N~~~~~~~~~i~d~~~p~p   55 (60)
T PF12245_consen   23 GEYTLTVTATDKAG--NTSSSTTQIVIVDNTAPAP   55 (60)
T ss_pred             ccEEEEEEEEECCC--CEEEeeeEEEEEcCCCCCc
Confidence            47999999999854  2344445555555543333


No 22 
>TIGR03660 T1SS_rpt_143 T1SS-143 repeat domain. This model represents a domain of about 143 amino acids that may occur singly or in up to 23 tandem repeats in very large proteins in the genus Vibrio, and in related species such as Legionella pneumophila, Photobacterium profundum, Rhodopseudomonas palustris, Shewanella pealeana, and Aeromonas hydrophila. Proteins with these domains represent a subset of a broader set of proteins with a particular signal for type 1 secretion, consisting of several glycine-rich repeats modeled by pfam00353, followed by a C-terminal domain modeled by TIGR03661. Proteins with this domain tend to share several properties with the RtxA (Repeats in Toxin) protein of Vibrio cholerae, including a large size often containing tandemly repeated domains and a C-terminal signal for type 1 secretion.
Probab=46.93  E-value=86  Score=21.00  Aligned_cols=32  Identities=25%  Similarity=0.361  Sum_probs=21.8

Q ss_pred             ceEEEEEEEEECCCCCCeeEEEEEEEEEeCCCCCCee
Q psy9968          77 KEIYITVIAEDNGTPQLSDACTMKITVEDINDNEPMF  113 (118)
Q Consensus        77 ~~~~l~v~a~D~g~p~~s~~~~v~I~v~DvNDn~P~F  113 (118)
                      -...|.|.|+|..+...  +..+.|+|.|  | .|..
T Consensus        85 l~l~~~v~a~D~DGD~s--~~~l~VtI~D--D-~P~~  116 (137)
T TIGR03660        85 LTLNFPIIATDFDGDTS--SITLPVTIVD--D-VPTI  116 (137)
T ss_pred             EEEeeeEEEEeCCCCcc--ccEEEEEEEC--C-CCee
Confidence            35678899999876433  3477888877  4 3553


No 23 
>PF14157 YmzC:  YmzC-like protein; PDB: 3KVP_E.
Probab=42.73  E-value=59  Score=18.95  Aligned_cols=28  Identities=14%  Similarity=0.335  Sum_probs=18.5

Q ss_pred             EEEEEeCCCCCccEEEeCCCCEEEEeee
Q psy9968          18 QYRIIKAPGERAKFSIDKETGIVKTLYA   45 (118)
Q Consensus        18 ~y~i~~~~~~~~~F~id~~tG~i~~~~~   45 (118)
                      +|.+.........|..|+++++|++.+.
T Consensus        31 ~Fav~~e~~~iKIfkyd~~tNei~L~KE   58 (63)
T PF14157_consen   31 HFAVVDEDGQIKIFKYDEDTNEITLKKE   58 (63)
T ss_dssp             EEEEE-ETTEEEEEEEETTTTEEEEEEE
T ss_pred             EEEEEecCCeEEEEEeCCCCCeEEEEEe
Confidence            5555533333356888999999888876


No 24 
>smart00089 PKD Repeats in polycystic kidney disease 1 (PKD1) and other proteins. Polycystic kidney disease 1 protein contains 14 repeats, present elsewhere such as in microbial collagenases.
Probab=39.25  E-value=76  Score=18.14  Aligned_cols=26  Identities=15%  Similarity=0.219  Sum_probs=18.7

Q ss_pred             CcceEEEEEEEEECCCCCCeeEEEEEEEE
Q psy9968          75 REKEIYITVIAEDNGTPQLSDACTMKITV  103 (118)
Q Consensus        75 ~~~~~~l~v~a~D~g~p~~s~~~~v~I~v  103 (118)
                      ....|.+.+.+.|..+   +.++++.|.|
T Consensus        53 ~~G~y~v~l~v~n~~g---~~~~~~~i~v   78 (79)
T smart00089       53 KPGTYTVTLTVTNAVG---SASATVTVVV   78 (79)
T ss_pred             CCcEEEEEEEEEcCCC---cEEEEEEEEE
Confidence            3458999999999765   5566666655


No 25 
>PF09100 Qn_am_d_aIV:  Quinohemoprotein amine dehydrogenase, alpha subunit domain IV;  InterPro: IPR015184 This domain is predominantly found in the prokaryotic protein quinohemoprotein amine dehydrogenase, adopting an immunoglobulin-like beta-sandwich fold, with seven strands arranged into two beta sheets; the fold is possibly related to the immunoglobulin and/or fibronectin type III superfamilies. The precise function of this domain has not, as yet, been defined.; PDB: 1JMZ_A 1JMX_A 1PBY_A 1JJU_A.
Probab=38.59  E-value=98  Score=20.72  Aligned_cols=31  Identities=26%  Similarity=0.443  Sum_probs=17.2

Q ss_pred             EEEEEEEEC-CCCCCeeEEEEEEEEEeCCCCCC
Q psy9968          80 YITVIAEDN-GTPQLSDACTMKITVEDINDNEP  111 (118)
Q Consensus        80 ~l~v~a~D~-g~p~~s~~~~v~I~v~DvNDn~P  111 (118)
                      .|.|.|+=. +..+++..+.+.|+|.+-|+ +|
T Consensus       101 nl~VvAtv~d~~~~l~~e~~liVtVqr~~~-pp  132 (133)
T PF09100_consen  101 NLKVVATVKDGGKPLTGEAHLIVTVQRWNN-PP  132 (133)
T ss_dssp             EEEEEEEETTTT---EEEEEEEEE---S----S
T ss_pred             cEEEEEEEccCCcccceeEeEEEEeecccC-CC
Confidence            577777765 44589999999999988876 44


No 26 
>cd00146 PKD polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases.
Probab=34.71  E-value=91  Score=17.89  Aligned_cols=26  Identities=12%  Similarity=0.082  Sum_probs=17.1

Q ss_pred             CcceEEEEEEEEECCCCCCeeEEEEEEE
Q psy9968          75 REKEIYITVIAEDNGTPQLSDACTMKIT  102 (118)
Q Consensus        75 ~~~~~~l~v~a~D~g~p~~s~~~~v~I~  102 (118)
                      ....|.+++.+.|..+  .+.+..+.|.
T Consensus        55 ~~G~y~v~l~v~d~~g--~~~~~~~~V~   80 (81)
T cd00146          55 KPGTYTVTLTVTNAVG--SSSTKTTTVV   80 (81)
T ss_pred             CCcEEEEEEEEEeCCC--CEEEEEEEEE
Confidence            3468999999999853  3444444443


No 27 
>PF12971 NAGLU_N:  Alpha-N-acetylglucosaminidase (NAGLU) N-terminal domain;  InterPro: IPR024240 Alpha-N-acetylglucosaminidase, is a lysosomal enzyme required for the stepwise degradation of heparan sulphate []. Mutations on the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB; or Sanfilippo syndrome type B) characterised by neurological dysfunction but relatively mild somatic manifestations []. The structure shows that the enzyme is composed of three domains. This entry represents the N-terminal domain of Alpha-N-acetylglucosaminidase which has an alpha-beta fold [].; PDB: 4A4A_A 2VC9_A 2VCC_A 2VCB_A 2VCA_A.
Probab=30.14  E-value=1e+02  Score=18.67  Aligned_cols=31  Identities=10%  Similarity=0.297  Sum_probs=20.8

Q ss_pred             CcEEEEEEeCCCCCccEEEeC-CCCEEEEeee
Q psy9968          15 GTIQYRIIKAPGERAKFSIDK-ETGIVKTLYA   45 (118)
Q Consensus        15 ~~i~y~i~~~~~~~~~F~id~-~tG~i~~~~~   45 (118)
                      +.+.+++.........|.|.. ..|.|.+...
T Consensus        18 ~~f~~~~~~~~~~~d~F~l~~~~~gki~I~G~   49 (86)
T PF12971_consen   18 SQFTFELIPSSNGKDVFELSSADNGKIVIRGN   49 (86)
T ss_dssp             GGEEEEE---BTTBEEEEEEE-SSS-EEEEES
T ss_pred             ceEEEEEecCCCCCCEEEEEeCCCCeEEEEeC
Confidence            458888886554567999997 8899888765


No 28 
>PF00635 Motile_Sperm:  MSP (Major sperm protein) domain;  InterPro: IPR000535 Major sperm proteins (MSP) are central components in molecular interactions underlying sperm motility in Caenorhabditis elegans, whose sperm employ an amoebae-like crawling motion using a MSP-containing lamellipod, rather than the flagellar-based swimming motion associated with other sperm. These proteins oligomerise to form an extensive filament system that extends from sperm villipoda, along the leading edge of the pseudopod. About 30 MSP isoforms may exist in C. elegans. MSPs form a fibrous network, whereby MSP dimers form helical subfilaments that coil around one another to produce filaments, which in turn form supercoils to produce bundles. The crystal structure of MSP from C. elegans reveals an immunoglobulin (Ig)-like seven-stranded beta sandwich fold []. ; GO: 0005198 structural molecule activity; PDB: 1MSP_A 3MSP_B 2BVU_B 2MSP_C 1Z9O_F 1Z9L_A 3IKK_A 1WIC_A 2CRI_A 2RR3_A ....
Probab=29.32  E-value=97  Score=18.86  Aligned_cols=27  Identities=15%  Similarity=0.385  Sum_probs=18.9

Q ss_pred             CcEEEEEEeCCCCCccEEEeCCCCEEEEe
Q psy9968          15 GTIQYRIIKAPGERAKFSIDKETGIVKTL   43 (118)
Q Consensus        15 ~~i~y~i~~~~~~~~~F~id~~tG~i~~~   43 (118)
                      ..+.|.|....+.  .|.+.|..|.|.-.
T Consensus        32 ~~i~fKiktt~~~--~y~v~P~~G~i~p~   58 (109)
T PF00635_consen   32 KPIAFKIKTTNPN--RYRVKPSYGIIEPG   58 (109)
T ss_dssp             SEEEEEEEES-TT--TEEEESSEEEE-TT
T ss_pred             CcEEEEEEcCCCc--eEEecCCCEEECCC
Confidence            5688888865544  79999999876443


No 29 
>PF07145 PAM2:  Ataxin-2 C-terminal region;  InterPro: IPR009818 This entry represents a conserved region approximately 250 residues long located towards the C terminus of eukaryotic ataxin-2. Ataxin-2 is a protein of unknown function, within which expansion of a polyglutamine tract (due to expansion of unstable CAG repeats in the coding region of the SCA2 gene) causes spinocerebellar ataxia type 2 (SCA2), a late-onset neurodegenerative disorder []. The expanded polyglutamine repeat in ataxin-2 causes disruption of the normal morphology of the Golgi complex and increased incidence of cell death []. Ataxin-2 is predicted to consist of mostly non-globular domains [].; PDB: 3NTW_B 1JH4_B 3KTR_B 3KUJ_B 3KUT_D 3KUS_D 1JGN_B 2RQG_A 2RQH_A.
Probab=28.40  E-value=42  Score=14.60  Aligned_cols=12  Identities=33%  Similarity=0.493  Sum_probs=7.8

Q ss_pred             CCCCCCeeCCCC
Q psy9968         106 INDNEPMFDRVQ  117 (118)
Q Consensus       106 vNDn~P~F~~~~  117 (118)
                      .|-|+|.|-.+.
T Consensus         6 LNp~A~eFvP~~   17 (18)
T PF07145_consen    6 LNPNAPEFVPSS   17 (18)
T ss_dssp             SSTTSSSS-TTT
T ss_pred             cCCCCccccCCC
Confidence            477888887654


No 30 
>cd06891 PX_Vps17p The phosphoinositide binding Phox Homology domain of yeast sorting nexin Vps17p. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway. Vps17p forms a dimer with Vps5p, the yeast counterpart of human SNX1, and is part of the retromer complex that mediates the transport of the carboxypeptidase Y receptor Vps10p from endosomes to Golgi. Similar to Vps5p and SNX1, Vps17p harbors a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvatur
Probab=27.30  E-value=1.7e+02  Score=19.78  Aligned_cols=39  Identities=15%  Similarity=0.279  Sum_probs=24.6

Q ss_pred             cceEEEEEEEEECCCCCCeeEEEEEEEEEeCCCCCCeeCCCCC
Q psy9968          76 EKEIYITVIAEDNGTPQLSDACTMKITVEDINDNEPMFDRVQY  118 (118)
Q Consensus        76 ~~~~~l~v~a~D~g~p~~s~~~~v~I~v~DvNDn~P~F~~~~Y  118 (118)
                      .+.|.+.|.++|....+- ....+.+.+   .-|.|.|.+..|
T Consensus        25 ~~~~~l~i~Vtd~ek~G~-~~~~~~~~~---~Tnlp~Fr~~~~   63 (140)
T cd06891          25 KPKYFLRVRVTGIERNKS-KDPIIRFDV---TTNLPTFRSSTY   63 (140)
T ss_pred             CCCceEEEEEeCceecCC-CCeEEEEEE---eeCCcccCCCCC
Confidence            457889999988643221 333444444   477999987654


No 31 
>COG3212 Predicted membrane protein [Function unknown]
Probab=24.91  E-value=97  Score=20.94  Aligned_cols=29  Identities=24%  Similarity=0.475  Sum_probs=20.9

Q ss_pred             CCCcEEEEEEe-CC-CCCccEEEeCCCCEEE
Q psy9968          13 NGGTIQYRIIK-AP-GERAKFSIDKETGIVK   41 (118)
Q Consensus        13 ~n~~i~y~i~~-~~-~~~~~F~id~~tG~i~   41 (118)
                      .+++..|.+.= .+ +...-|.||..||.|-
T Consensus       108 ~~g~~vYevei~~~d~~e~ev~iDA~TG~Il  138 (144)
T COG3212         108 DNGRLVYEVEIVKDDGQEYEVEIDAKTGKIL  138 (144)
T ss_pred             cCCEEEEEEEEEeCCCcEEEEEEecCCCCcc
Confidence            46889999864 32 3345699999999764


No 32 
>PF13750 Big_3_3:  Bacterial Ig-like domain (group 3)
Probab=24.13  E-value=2.4e+02  Score=19.16  Aligned_cols=27  Identities=15%  Similarity=0.174  Sum_probs=19.0

Q ss_pred             cceEEEEEEEEECCCCCCeeEEEEEEEEE
Q psy9968          76 EKEIYITVIAEDNGTPQLSDACTMKITVE  104 (118)
Q Consensus        76 ~~~~~l~v~a~D~g~p~~s~~~~v~I~v~  104 (118)
                      ...|.|+|.|.|..+  -.++..+.+...
T Consensus       122 ~~~YtLtV~a~D~aG--N~~~~si~F~y~  148 (158)
T PF13750_consen  122 DDSYTLTVSATDKAG--NQSTKSISFSYM  148 (158)
T ss_pred             CCeEEEEEEEEecCC--CEEEEEEEEEEe
Confidence            458999999999854  355666655544


No 33 
>PF07861 WND:  WisP family N-Terminal Region;  InterPro: IPR012503 This family is found at the N terminus of the Tropheryma whipplei WisP family proteins.
Probab=21.98  E-value=1.3e+02  Score=21.70  Aligned_cols=29  Identities=17%  Similarity=0.324  Sum_probs=21.7

Q ss_pred             CCcEEEEEEeCCCCCccEEEeCCCCEEEEeee
Q psy9968          14 GGTIQYRIIKAPGERAKFSIDKETGIVKTLYA   45 (118)
Q Consensus        14 n~~i~y~i~~~~~~~~~F~id~~tG~i~~~~~   45 (118)
                      |++..|++.....   -..||..||.|...-.
T Consensus       202 ~S~~T~SLs~P~~---~v~lD~~TG~l~~Svk  230 (263)
T PF07861_consen  202 GSPFTYSLSTPVA---GVRLDANTGALSGSVK  230 (263)
T ss_pred             CCcceEEeccCCC---ceEEecccceeeeeee
Confidence            5788999974322   5899999999877643


No 34 
>PF01011 PQQ:  PQQ enzyme repeat family.;  InterPro: IPR002372 Pyrrolo-quinoline quinone (PQQ) is a redox coenzyme, which serves as a cofactor for a number of enzymes (quinoproteins) and particularly for some bacterial dehydrogenases. A number of bacterial quinoproteins belong to this family. Enzymes in this group have repeats of a beta propeller.; PDB: 1H4I_C 1H4J_E 1W6S_A 2YH3_A 3PRW_A 3P1L_A 3Q7M_A 3Q7O_A 3Q7N_A 1G72_A ....
Probab=21.23  E-value=1.3e+02  Score=14.92  Aligned_cols=18  Identities=22%  Similarity=0.313  Sum_probs=14.6

Q ss_pred             CccEEEeCCCCEEEEeee
Q psy9968          28 RAKFSIDKETGIVKTLYA   45 (118)
Q Consensus        28 ~~~F~id~~tG~i~~~~~   45 (118)
                      ...+.+|..||++.....
T Consensus        10 g~l~AlD~~TG~~~W~~~   27 (38)
T PF01011_consen   10 GYLYALDAKTGKVLWKFQ   27 (38)
T ss_dssp             SEEEEEETTTTSEEEEEE
T ss_pred             CEEEEEECCCCCEEEeee
Confidence            358999999999887765


No 35 
>PF05688 DUF824:  Salmonella repeat of unknown function (DUF824);  InterPro: IPR008542 This family consists of a series of repeated sequences (of around 180 residues) which are found in Salmonella typhimurium, Salmonella typhi and Escherichia coli. These repeats are almost always found with this entry. The repeats are associated with RatA and RatB, the coding sequences of which are found in the pathogenicity island of Salmonella. The sequences may be determinants of pathogenicity.
Probab=20.95  E-value=1.4e+02  Score=16.24  Aligned_cols=14  Identities=36%  Similarity=0.579  Sum_probs=7.7

Q ss_pred             EEEEEEEEEeCCCC
Q psy9968          96 ACTMKITVEDINDN  109 (118)
Q Consensus        96 ~~~v~I~v~DvNDn  109 (118)
                      +..++|++.|.|.|
T Consensus        14 ~I~ltVt~kda~G~   27 (47)
T PF05688_consen   14 TIPLTVTVKDANGN   27 (47)
T ss_pred             eEEEEEEEECCCCC
Confidence            34556666666543


No 36 
>PF15418 DUF4625:  Domain of unknown function (DUF4625)
Probab=20.38  E-value=1.8e+02  Score=19.31  Aligned_cols=15  Identities=7%  Similarity=0.122  Sum_probs=12.1

Q ss_pred             cceEEEEEEEEECCC
Q psy9968          76 EKEIYITVIAEDNGT   90 (118)
Q Consensus        76 ~~~~~l~v~a~D~g~   90 (118)
                      ...|.+.|.++|..+
T Consensus       106 ~G~YH~~i~VtD~~G  120 (132)
T PF15418_consen  106 AGDYHFMITVTDAAG  120 (132)
T ss_pred             CcceEEEEEEEECCC
Confidence            358999999999865


Done!