Query         005351
Match_columns 701
No_of_seqs    92 out of 94
Neff          5.1 
Searched_HMMs 46136
Date          Thu Mar 28 22:05:42 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/005351.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/005351hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF10699 HAP2-GCS1:  Male gamet  99.9 3.4E-24 7.3E-29  169.4   2.2   49  275-323     1-49  (49)
  2 PF07705 CARDB:  CARDB;  InterP  87.8     7.6 0.00017   33.6  10.6   71  401-476    18-90  (101)
  3 PF10633 NPCBM_assoc:  NPCBM-as  79.8      13 0.00029   31.7   8.5   64  404-467     7-74  (78)
  4 PF11614 FixG_C:  IG-like fold   73.3      48   0.001   30.5  10.9   79  406-484    35-116 (118)
  5 COG1470 Predicted membrane pro  69.9 1.7E+02  0.0037   34.1  16.0   64  405-469   400-468 (513)
  6 PF14874 PapD-like:  Flagellar-  69.2      82  0.0018   27.8  14.0   66  405-471    23-89  (102)
  7 PF00869 Flavi_glycoprot:  Flav  64.4      83  0.0018   34.4  11.8  102  147-257    88-201 (293)
  8 COG2928 Uncharacterized conser  61.4     7.1 0.00015   40.7   3.0   40  554-595    55-94  (222)
  9 PF06280 DUF1034:  Fn3-like dom  61.1      80  0.0017   28.8   9.6   72  403-474     9-104 (112)
 10 PF03896 TRAP_alpha:  Transloco  61.1 1.5E+02  0.0032   32.4  13.0   82  405-490   102-197 (285)
 11 TIGR03066 Gem_osc_para_1 Gemma  58.4      59  0.0013   30.7   8.2   66  385-476    34-105 (111)
 12 PF14884 EFF-AFF:  Type I membr  55.7 2.3E+02   0.005   33.7  13.9   76  183-259   199-283 (589)
 13 PHA02819 hypothetical protein;  55.6      20 0.00042   31.2   4.2   49  505-574    19-67  (71)
 14 PF00703 Glyco_hydro_2:  Glycos  49.1 1.8E+02  0.0038   25.2  11.5   94  389-484     3-108 (110)
 15 PRK09918 putative fimbrial cha  47.6 1.5E+02  0.0033   30.9  10.3   97   13-126     5-103 (230)
 16 PHA02844 putative transmembran  47.5      29 0.00063   30.5   4.0   19  556-574    51-69  (75)
 17 COG3190 FliO Flagellar biogene  45.8      28  0.0006   34.0   4.1   32  554-586    24-55  (137)
 18 PF09451 ATG27:  Autophagy-rela  42.8 1.6E+02  0.0034   31.5   9.7   27  405-431   118-146 (268)
 19 TIGR00869 sec62 protein transl  42.5      29 0.00063   36.6   4.0   53  537-589   135-207 (232)
 20 PF01102 Glycophorin_A:  Glycop  40.2      34 0.00074   32.7   3.7   22  555-576    68-89  (122)
 21 PF05506 DUF756:  Domain of unk  39.4 1.2E+02  0.0026   26.7   6.9   48  402-452    18-65  (89)
 22 TIGR02745 ccoG_rdxA_fixG cytoc  34.6 7.2E+02   0.016   28.7  13.8   68  405-472   349-419 (434)
 23 PHA02692 hypothetical protein;  33.3      64  0.0014   28.1   3.9   15  560-574    53-67  (70)
 24 PLN03080 Probable beta-xylosid  32.1      90   0.002   38.5   6.5   50  405-454   687-744 (779)
 25 PRK15098 beta-D-glucoside gluc  30.2      92   0.002   38.3   6.1   51  405-455   670-728 (765)
 26 PRK10856 cytoskeletal protein   30.1      41 0.00089   37.2   2.9   21  553-573   111-131 (331)
 27 PRK10019 nickel/cobalt efflux   29.9 1.1E+02  0.0024   33.2   6.1    7  584-590    86-92  (279)
 28 PF03839 Sec62:  Translocation   28.5      70  0.0015   33.7   4.1   52  537-588   127-200 (224)
 29 PHA02650 hypothetical protein;  25.9      90   0.002   27.9   3.6   21  554-574    50-70  (81)
 30 PHA03054 IMV membrane protein;  25.8      94   0.002   27.2   3.6   21  554-574    49-69  (72)
 31 PF05297 Herpes_LMP1:  Herpesvi  25.8      23  0.0005   38.5   0.0   90  538-635   134-242 (381)
 32 PF12575 DUF3753:  Protein of u  24.7      93   0.002   27.3   3.5   21  554-574    49-69  (72)
 33 PF14155 DUF4307:  Domain of un  24.5 2.4E+02  0.0053   26.3   6.6   51  431-483    32-84  (112)
 34 PHA02975 hypothetical protein;  24.3   1E+02  0.0023   26.8   3.6   21  554-574    45-65  (69)
 35 PF12273 RCR:  Chitin synthesis  22.8      68  0.0015   30.4   2.6    8  649-656    73-80  (130)
 36 PF06030 DUF916:  Bacterial pro  22.8 2.3E+02  0.0049   26.8   6.1   20  438-457    87-106 (121)
 37 PRK13211 N-acetylglucosamine-b  22.3 8.2E+02   0.018   28.7  11.5   40  448-487   367-406 (478)
 38 PF13198 DUF4014:  Protein of u  21.3      82  0.0018   27.6   2.5   30  565-596    29-59  (72)
 39 PF02018 CBM_4_9:  Carbohydrate  20.3 6.1E+02   0.013   22.4   8.9   41  417-457    33-75  (131)

No 1  
>PF10699 HAP2-GCS1:  Male gamete fusion factor;  InterPro: IPR018928  The gene encoding Arabidopsis HAP2 is allelic with GCS1 (Generative cell-specific protein 1). HAP2 is expressed only in the haploid sperm and is required for efficient guidance of the pollen tube to the ovules. In Arabidopsis the protein is a predicted membrane protein with an N-terminal secretion signal, a single transmembrane domain and a C-terminal histidine-rich domain []. HAP2-GCS1 is found from plants to lower eukaryotes and is necessary for the fusion of the gametes in fertilisation. It is involved in a novel mechanism for gamete fusion where a first species-specific protein binds male and female gamete membranes together after which a second, broadly conserved protein, either directly or indirectly, causes fusion of the two membranes together. The broadly conserved protein is represented by this HAP2-GCS1 domain, conserved from plants to lower eukaryotes []. In Plasmodium berghei the protein is expressed only in male gametocytes and gametes, having a male-specific function during the interaction with female gametes, and being indispensable for parasite fertilisation. The gene in plants and eukaryotes might well have originated from acquisition of plastids from red algae []. 
Probab=99.89  E-value=3.4e-24  Score=169.40  Aligned_cols=49  Identities=63%  Similarity=1.094  Sum_probs=48.1

Q ss_pred             EEEecceeccCCCccccccccchhhcCCCCCCCCCCCChhhHHHHHHHH
Q 005351          275 MLLERTRFTLDGLECNKIGVSYEAFNGQPSFCSSPFWSCLHNQLWNYRE  323 (701)
Q Consensus       275 MlLdk~~VdlsG~eCnKIGVSy~aF~~Q~n~C~~~~GSCL~NQL~dy~~  323 (701)
                      |||||++||++|.|||||||||+||++|+++|++|+||||+|||+|||+
T Consensus         1 miv~k~~v~~~G~eCnKIGvs~~~f~~q~~~C~~~~gsCL~nQl~~f~~   49 (49)
T PF10699_consen    1 MIVDKSLVDLDGLECNKIGVSYEAFRNQPNFCSSPPGSCLKNQLADFYE   49 (49)
T ss_pred             CccchhhccCCCCccCcceeCHHHHHhcCCccCCCccchHHHhHHHHhC
Confidence            8999999999999999999999999999999999999999999999985


No 2  
>PF07705 CARDB:  CARDB;  InterPro: IPR011635 The APHP (acidic peptide-dependent hydrolases/peptidase) domain is found in a variety of different proteins.; PDB: 2KUT_A 2L0D_A 3IDU_A 2KL6_A.
Probab=87.80  E-value=7.6  Score=33.60  Aligned_cols=71  Identities=17%  Similarity=0.251  Sum_probs=45.0

Q ss_pred             cceeEEEEEEEecCcee-eeEEEEEEcCCCCeeeceeEE-EeCCCceEEEEEEEeeccccccceEEEEEEEcCCCCce
Q 005351          401 TQFGVATITTQNTGEVE-ASYSLTFDCSTGVTLMEEQYF-IIKPKETSIRSFKIYPTTNQAAKYTCSAILKDSDFSEV  476 (701)
Q Consensus       401 S~~G~L~V~V~N~G~~~-A~Y~v~vnCS~~I~pI~~q~~-~I~p~~~~~~~F~I~~~s~~~~~~~C~v~L~ds~~~~l  476 (701)
                      -+...+++.|+|.|... ..+.+.+.-.+...  ..+.+ .|.|+++..+.|.+...  ....+.-.|.+ |....+-
T Consensus        18 g~~~~i~~~V~N~G~~~~~~~~v~~~~~~~~~--~~~~i~~L~~g~~~~v~~~~~~~--~~G~~~i~~~i-D~~n~i~   90 (101)
T PF07705_consen   18 GEPVTITVTVKNNGTADAENVTVRLYLDGNSV--STVTIPSLAPGESETVTFTWTPP--SPGSYTIRVVI-DPDNDID   90 (101)
T ss_dssp             TSEEEEEEEEEE-SSS-BEEEEEEEEETTEEE--EEEEESEB-TTEEEEEEEEEE-S--S-CEEEEEEEE-STTTSS-
T ss_pred             CCEEEEEEEEEECCCCCCCCEEEEEEECCcee--ccEEECCcCCCcEEEEEEEEEeC--CCCeEEEEEEE-eeCCccc
Confidence            45567999999999775 56777765443322  45555 88999999998888776  55677755554 6665443


No 3  
>PF10633 NPCBM_assoc:  NPCBM-associated, NEW3 domain of alpha-galactosidase;  InterPro: IPR018905 This domain has been named NEW3, but its function is not known. It is found on proteins which are bacterial galactosidases [].; PDB: 1EUT_A 2BZD_A 1WCQ_C 2BER_A 1W8O_A 1EUU_A 1W8N_A.
Probab=79.83  E-value=13  Score=31.74  Aligned_cols=64  Identities=23%  Similarity=0.306  Sum_probs=37.4

Q ss_pred             eEEEEEEEecCcee-eeEEEEEEcCCCCe--eeceeEEEeCCCceEEEEEEEeeccccc-cceEEEEE
Q 005351          404 GVATITTQNTGEVE-ASYSLTFDCSTGVT--LMEEQYFIIKPKETSIRSFKIYPTTNQA-AKYTCSAI  467 (701)
Q Consensus       404 G~L~V~V~N~G~~~-A~Y~v~vnCS~~I~--pI~~q~~~I~p~~~~~~~F~I~~~s~~~-~~~~C~v~  467 (701)
                      ..++++|+|.|... ....++++-..|-.  .-..+...|+|+++..+.|.|.+..+-. ..|.=.+.
T Consensus         7 ~~~~~tv~N~g~~~~~~v~~~l~~P~GW~~~~~~~~~~~l~pG~s~~~~~~V~vp~~a~~G~y~v~~~   74 (78)
T PF10633_consen    7 VTVTLTVTNTGTAPLTNVSLSLSLPEGWTVSASPASVPSLPPGESVTVTFTVTVPADAAPGTYTVTVT   74 (78)
T ss_dssp             EEEEEEEE--SSS-BSS-EEEEE--TTSE---EEEEE--B-TTSEEEEEEEEEE-TT--SEEEEEEEE
T ss_pred             EEEEEEEEECCCCceeeEEEEEeCCCCccccCCccccccCCCCCEEEEEEEEECCCCCCCceEEEEEE
Confidence            46899999999765 45778887788765  3344455899999999999999976633 45544443


No 4  
>PF11614 FixG_C:  IG-like fold at C-terminal of FixG, putative oxidoreductase; PDB: 2R39_A.
Probab=73.30  E-value=48  Score=30.51  Aligned_cols=79  Identities=16%  Similarity=0.198  Sum_probs=51.2

Q ss_pred             EEEEEEecCceeeeEEEEEEcCCCCee-eceeEEEeCCCceEEEEEEEeeccccc--cceEEEEEEEcCCCCceeeeEEE
Q 005351          406 ATITTQNTGEVEASYSLTFDCSTGVTL-MEEQYFIIKPKETSIRSFKIYPTTNQA--AKYTCSAILKDSDFSEVDRAECQ  482 (701)
Q Consensus       406 L~V~V~N~G~~~A~Y~v~vnCS~~I~p-I~~q~~~I~p~~~~~~~F~I~~~s~~~--~~~~C~v~L~ds~~~~lD~~~~~  482 (701)
                      -++.+.|.+.-+-.|.+++.=..++.. .....+.|.|++...+.|.|....+..  ....=.+.+.|..+...++..-.
T Consensus        35 Y~lkl~Nkt~~~~~~~i~~~g~~~~~l~~~~~~i~v~~g~~~~~~v~v~~p~~~~~~~~~~i~f~v~~~~~~~~~~~~s~  114 (118)
T PF11614_consen   35 YTLKLTNKTNQPRTYTISVEGLPGAELQGPENTITVPPGETREVPVFVTAPPDALKSGSTPITFTVTDDDGGEIITYKST  114 (118)
T ss_dssp             EEEEEEE-SSS-EEEEEEEES-SS-EE-ES--EEEE-TT-EEEEEEEEEE-GGG-SSSEEEEEEEEEEGGGTEEEEEEEE
T ss_pred             EEEEEEECCCCCEEEEEEEecCCCeEEECCCcceEECCCCEEEEEEEEEECHHHccCCCeeEEEEEEECCCCEEEEEEEE
Confidence            468899999999999999866667776 566899999999998888887765532  34455555656667777776666


Q ss_pred             EE
Q 005351          483 FS  484 (701)
Q Consensus       483 F~  484 (701)
                      |-
T Consensus       115 F~  116 (118)
T PF11614_consen  115 FI  116 (118)
T ss_dssp             EE
T ss_pred             EE
Confidence            53


No 5  
>COG1470 Predicted membrane protein [Function unknown]
Probab=69.90  E-value=1.7e+02  Score=34.11  Aligned_cols=64  Identities=22%  Similarity=0.298  Sum_probs=45.0

Q ss_pred             EEEEEEEecCcee-eeEEEEEEcCCCCeee---ceeEEEeCCCceEEEEEEEeeccc-cccceEEEEEEE
Q 005351          405 VATITTQNTGEVE-ASYSLTFDCSTGVTLM---EEQYFIIKPKETSIRSFKIYPTTN-QAAKYTCSAILK  469 (701)
Q Consensus       405 ~L~V~V~N~G~~~-A~Y~v~vnCS~~I~pI---~~q~~~I~p~~~~~~~F~I~~~s~-~~~~~~C~v~L~  469 (701)
                      .+.+.|.|.|... .+-.+.++-..| ..+   +.+.-.++|++..+.+..|++..+ .++.|.-++.-+
T Consensus       400 ~i~i~I~NsGna~LtdIkl~v~~Pqg-Wei~Vd~~~I~sL~pge~~tV~ltI~vP~~a~aGdY~i~i~~k  468 (513)
T COG1470         400 TIRISIENSGNAPLTDIKLTVNGPQG-WEIEVDESTIPSLEPGESKTVSLTITVPEDAGAGDYRITITAK  468 (513)
T ss_pred             eEEEEEEecCCCccceeeEEecCCcc-ceEEECcccccccCCCCcceEEEEEEcCCCCCCCcEEEEEEEe
Confidence            6889999999654 455677777765 322   345667899999999999988765 445665555444


No 6  
>PF14874 PapD-like:  Flagellar-associated PapD-like
Probab=69.16  E-value=82  Score=27.84  Aligned_cols=66  Identities=18%  Similarity=0.214  Sum_probs=45.0

Q ss_pred             EEEEEEEecCceeeeEEEEEEc-CCCCeeeceeEEEeCCCceEEEEEEEeeccccccceEEEEEEEcC
Q 005351          405 VATITTQNTGEVEASYSLTFDC-STGVTLMEEQYFIIKPKETSIRSFKIYPTTNQAAKYTCSAILKDS  471 (701)
Q Consensus       405 ~L~V~V~N~G~~~A~Y~v~vnC-S~~I~pI~~q~~~I~p~~~~~~~F~I~~~s~~~~~~~C~v~L~ds  471 (701)
                      ...|.++|+|.+.+.|.+...- .....-++...-.|.|+.+......+.. ......+.+.+.+.-.
T Consensus        23 ~~~v~l~N~s~~p~~f~v~~~~~~~~~~~v~~~~g~l~PG~~~~~~V~~~~-~~~~g~~~~~l~i~~e   89 (102)
T PF14874_consen   23 SRTVTLTNTSSIPARFRVRQPESLSSFFSVEPPSGFLAPGESVELEVTFSP-TKPLGDYEGSLVITTE   89 (102)
T ss_pred             EEEEEEEECCCCCEEEEEEeCCcCCCCEEEECCCCEECCCCEEEEEEEEEe-CCCCceEEEEEEEEEC
Confidence            5889999999999999998632 1234444445557999998777555553 2344567777766433


No 7  
>PF00869 Flavi_glycoprot:  Flavivirus glycoprotein, central and dimerisation domains;  InterPro: IPR011999  Flaviviruses are small, enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Yellow fever virus (YFV), West Nile virus (WNV), Tick-borne encephalitis virus, Japanese encephalitis virus and Dengue virus 2 viruses []. Flaviviruses consist of three structural proteins: the core nucleocapsid protein C (IPR001122 from INTERPRO), and the envelope glycoproteins M (IPR000069 from INTERPRO) and E. Glycoprotein E is a class II viral fusion protein that mediates both receptor binding and fusion. Class II viral fusion proteins are found in flaviviruses and alphaviruses, and are structurally distinct from class I fusion proteins from influenza virus and HIV. Glycoprotein E is comprised of three domains: domain I (dimerisation domain) is an 8-stranded beta barrel, domain II (central domain) is an elongated domain composed of twelve beta strands and two alpha helices, and domain III (immunoglobulin-like domain) is an IgC-like module with ten beta strands. This entry represents domains I and II, which are intertwined []. The glycoprotein E dimers on the viral surface re-cluster irreversibly into fusion-competent trimers upon exposure to low pH, as found in the acidic environment of the endosome. The formation of trimers results in a conformational change in the hinge region of domain II, a key structural element that opens a ligand-binding hydrophobic pocket at the interface between domains I and II. The conformational change results in the exposure of a fusion peptide loop at the tip of domain II, which is required in the fusion step to drive the cellular and viral membranes together by inserting into the membrane [].; GO: 0016021 integral to membrane, 0019031 viral envelope; PDB: 3P54_A 1OK8_A 1OAN_A 1OKE_B 3C5X_A 3C6E_A 2JSF_A 1URZ_B 3IYW_A 2JV6_A ....
Probab=64.36  E-value=83  Score=34.43  Aligned_cols=102  Identities=24%  Similarity=0.372  Sum_probs=62.6

Q ss_pred             CceeeeCCCCCc-ccCcccccccccccCCcceeeeecccCC-eeEEEeeccceeEEEEEEEEEEcc---------eEeEE
Q 005351          147 QPICCPCGPQRR-IPSSCGNVFDKLLKGKANTAHCLRFPGD-WFHVFGIGQRSIGFSVRIEVKTGS---------KVSEV  215 (701)
Q Consensus       147 QGfCC~C~~~~r-~~~~Cg~~~~~l~~g~~~SAHCLrf~~~-WY~vy~I~~~~i~y~I~VtI~~gs---------s~~~v  215 (701)
                      ..|=|.=+..+| +..-||-|      |+-+-.-|.+|.=. -..+|.|++..|.|+|.|+|-.+.         ...++
T Consensus        88 ~~~vCKr~~sDRGWGNGCgLF------GKGSIvtCaKftC~k~~~g~~id~enI~Y~V~v~vH~g~~~~~~~~~~~~~~~  161 (293)
T PF00869_consen   88 KNYVCKRGYSDRGWGNGCGLF------GKGSIVTCAKFTCSKKATGYVIDRENIEYTVKVEVHGGTKSAANGTSKHRKTA  161 (293)
T ss_dssp             TTEEEEEEEEEESGGGT-SS-------EEEEEEEEEEEEEEEEEEEEE--GGGEEEEEEEEEE-SBCTTTTSHTTTEEEE
T ss_pred             cccEeeccccccccccccEEE------eCCceEEEEEEEcCCcceEEEEEeeeEEEEEEEeccCCccccccccccceeEE
Confidence            445565555555 57788644      56667778887764 688999999999999999998752         35688


Q ss_pred             EEcCCCcceeecCCcE-EEEEeecccCCCCCCCccccEEEeeC
Q 005351          216 TVGPENKTATSADNFL-KVNLIGDFVGYTNIPSFEEFYLVIPR  257 (701)
Q Consensus       216 ~Lgp~~~~~~s~d~~l-~~~LiGDf~~~~~~p~l~~~yLliPs  257 (701)
                      ++.|..+...-+-.+- .+.|-=   ..++-.||++.||+.=.
T Consensus       162 ~fTp~s~~~~~~lgdYG~vtl~C---~~~sg~D~~~~yl~~~~  201 (293)
T PF00869_consen  162 TFTPQSPKQTVELGDYGTVTLEC---RVRSGLDFSQYYLVEMG  201 (293)
T ss_dssp             EEBTTS-EEEEEEGGGBEEEEEE---ECGGSS-TTSEEEEEET
T ss_pred             EEEeCCCcEEEEcCCCcEEEEEE---EeccCcChhhEEEEEEC
Confidence            9999887554431111 111111   12456788998888654


No 8  
>COG2928 Uncharacterized conserved protein [Function unknown]
Probab=61.38  E-value=7.1  Score=40.74  Aligned_cols=40  Identities=23%  Similarity=0.259  Sum_probs=30.3

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHHhcCCcccchhhhhhhccCc
Q 005351          554 WLVLFGLVLAIFPTVLVLLWLLHQKGLFDPLYDWWDDHFQSD  595 (701)
Q Consensus       554 ~~~~~~l~l~~~~~~l~~l~ll~k~g~~~~~~~~~~~~~~~~  595 (701)
                      ++.-++++++++  +++++-+|.+++++.+|..|||.+++.=
T Consensus        55 ~i~~lg~il~ii--li~l~G~l~~~~ig~~l~~~~d~~L~Ri   94 (222)
T COG2928          55 NIPGLGVILAII--LIFLLGFLARNMIGRSLLSLGDSLLRRI   94 (222)
T ss_pred             hhHHHHHHHHHH--HHHHHHHHHHHHhhhHHHHHHHHHHccC
Confidence            444455555544  4448899999999999999999998753


No 9  
>PF06280 DUF1034:  Fn3-like domain (DUF1034);  InterPro: IPR010435 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain of unknown function is present in bacterial and plant peptidases belonging to MEROPS peptidase family S8 (subfamily S8A subtilisin, clan SB). It is C-terminal to and adjacent to the S8 peptidase domain and can be found in conjunction with the PA (Protease associated) domain (IPR003137 from INTERPRO) and additionally in Gram-positive bacteria with the surface protein anchor domain (IPR001899 from INTERPRO).; GO: 0004252 serine-type endopeptidase activity, 0005618 cell wall, 0016020 membrane; PDB: 3EIF_A 1XF1_B.
Probab=61.10  E-value=80  Score=28.81  Aligned_cols=72  Identities=19%  Similarity=0.210  Sum_probs=46.2

Q ss_pred             eeEEEEEEEecCceeeeEEEEEE-cC-------CC------------CeeeceeEEEeCCCceEEEEEEEeeccc-c---
Q 005351          403 FGVATITTQNTGEVEASYSLTFD-CS-------TG------------VTLMEEQYFIIKPKETSIRSFKIYPTTN-Q---  458 (701)
Q Consensus       403 ~G~L~V~V~N~G~~~A~Y~v~vn-CS-------~~------------I~pI~~q~~~I~p~~~~~~~F~I~~~s~-~---  458 (701)
                      ....+|++.|.|.-+..|.+... ..       .+            ..-.+...++|+|+++..+.+.|.+... .   
T Consensus         9 ~~~~~itl~N~~~~~~ty~~~~~~~~t~~~~~~~~~~~~~~~~~~~~~~~~~~~~vTV~ag~s~~v~vti~~p~~~~~~~   88 (112)
T PF06280_consen    9 KFSFTITLHNYGDKPVTYTLSHVPVLTDKTDTEEGYSILVPPVPSISTVSFSPDTVTVPAGQSKTVTVTITPPSGLDASN   88 (112)
T ss_dssp             EEEEEEEEEE-SSS-EEEEEEEE-EEEEEE--ETTEEEEEEEE----EEE---EEEEE-TTEEEEEEEEEE--GGGHHTT
T ss_pred             ceEEEEEEEECCCCCEEEEEeeEEEEeeEeeccCCcccccccccceeeEEeCCCeEEECCCCEEEEEEEEEehhcCCccc
Confidence            35788999999999999998754 11       01            2233567899999999999999988542 2   


Q ss_pred             ccceEEEEEEEcCCCC
Q 005351          459 AAKYTCSAILKDSDFS  474 (701)
Q Consensus       459 ~~~~~C~v~L~ds~~~  474 (701)
                      ..-+.=.|.|.++++.
T Consensus        89 ~~~~eG~I~~~~~~~~  104 (112)
T PF06280_consen   89 GPFYEGFITFKSSDGE  104 (112)
T ss_dssp             -EEEEEEEEEESSTTS
T ss_pred             CCEEEEEEEEEcCCCC
Confidence            3466777888887776


No 10 
>PF03896 TRAP_alpha:  Translocon-associated protein (TRAP), alpha subunit;  InterPro: IPR005595  The alpha-subunit of the TRAP complex (TRAP alpha) is a single-spanning membrane protein of the endoplasmic reticulum (ER) which is found in proximity of nascent polypeptide chains translocating across the membrane [].; GO: 0005783 endoplasmic reticulum
Probab=61.08  E-value=1.5e+02  Score=32.37  Aligned_cols=82  Identities=15%  Similarity=0.194  Sum_probs=48.4

Q ss_pred             EEEEEEEecCceeeeEEEEE-----EcCC----CCeeecee--EEEeCCCceEEEEEEEeecccccc---ceEEEEEEEc
Q 005351          405 VATITTQNTGEVEASYSLTF-----DCST----GVTLMEEQ--YFIIKPKETSIRSFKIYPTTNQAA---KYTCSAILKD  470 (701)
Q Consensus       405 ~L~V~V~N~G~~~A~Y~v~v-----nCS~----~I~pI~~q--~~~I~p~~~~~~~F~I~~~s~~~~---~~~C~v~L~d  470 (701)
                      .+-|.++|.|.  ..|+|..     ....    .|....++  ...|+|+++.++.+.+.+....+.   .-.=.+...|
T Consensus       102 ~~LvgftN~g~--~~~~V~~i~aSl~~p~d~~~~iqNfTa~~y~~~V~pg~~aT~~YsF~~~~~l~pr~f~L~i~l~y~d  179 (285)
T PF03896_consen  102 KFLVGFTNKGS--EPFTVESIEASLRYPQDYSYYIQNFTAVRYNREVPPGEEATFPYSFTPSEELAPRPFGLVINLIYED  179 (285)
T ss_pred             EEEEEEEeCCC--CCEEEEEEeeeecCccccceEEEeecccccCcccCCCCeEEEEEEEecchhcCCcceEEEEEEEEEe
Confidence            47788999997  5777752     2222    23333333  557899999999999887644332   2333444458


Q ss_pred             CCCCceeeeEEEEEeeeeee
Q 005351          471 SDFSEVDRAECQFSTMATVL  490 (701)
Q Consensus       471 s~~~~lD~~~~~F~TtaT~~  490 (701)
                      ++|...-..  -|+-|-++.
T Consensus       180 ~~g~~y~~~--~fN~TV~Iv  197 (285)
T PF03896_consen  180 SDGNQYQVT--VFNGTVTIV  197 (285)
T ss_pred             CCCCEEEEE--EecceEEEe
Confidence            888654222  244444444


No 11 
>TIGR03066 Gem_osc_para_1 Gemmata obscuriglobus paralogous family TIGR03066. This model represents an uncharacterized paralogous family in Gemmata obscuriglobus UQM 2246, a member of the Planctomycetes. This family shows sequence similarity to TIGR03067, which is also found in Gemmata obscuriglobus as well as in a few other species.
Probab=58.37  E-value=59  Score=30.72  Aligned_cols=66  Identities=18%  Similarity=0.235  Sum_probs=41.6

Q ss_pred             CCeEEEEEEccccccccceeEEEEEEEecCc---eeeeEEEEEEcCCCCeeeceeEEEe--CCCceE-EEEEEEeecccc
Q 005351          385 SPGKIISVIIPTFEALTQFGVATITTQNTGE---VEASYSLTFDCSTGVTLMEEQYFII--KPKETS-IRSFKIYPTTNQ  458 (701)
Q Consensus       385 SpGkI~sv~v~~FEA~S~~G~L~V~V~N~G~---~~A~Y~v~vnCS~~I~pI~~q~~~I--~p~~~~-~~~F~I~~~s~~  458 (701)
                      ||..|      .|+   .+|.|.|.+.|.|.   +.+.|++.           .+.+++  .|+.+. --...+.-.++ 
T Consensus        34 ~~~~l------eF~---~dGKL~v~~gnng~~~~~~Gty~L~-----------G~kLtL~~~p~g~t~k~~Vtv~~l~~-   92 (111)
T TIGR03066        34 DDVVI------EFA---KDGKLVVTIGEKGKEVKADGTYKLD-----------GNKLTLTLKAGGKEKKETLTVKKLTD-   92 (111)
T ss_pred             CceEE------EEc---CCCeEEEecCCCCcEeccCceEEEE-----------CCEEEEEEcCCCccccceEEEEEecC-
Confidence            56666      665   89999999999996   57888887           445555  454442 22222221111 


Q ss_pred             ccceEEEEEEEcCCCCce
Q 005351          459 AAKYTCSAILKDSDFSEV  476 (701)
Q Consensus       459 ~~~~~C~v~L~ds~~~~l  476 (701)
                           =...|+|++|+.+
T Consensus        93 -----~~Lvl~d~dg~~~  105 (111)
T TIGR03066        93 -----DELVGKDPDGKKD  105 (111)
T ss_pred             -----CeEEEEcCCCCEe
Confidence                 2467888888765


No 12 
>PF14884 EFF-AFF:  Type I membrane glycoproteins cell-cell fusogen
Probab=55.66  E-value=2.3e+02  Score=33.75  Aligned_cols=76  Identities=7%  Similarity=0.006  Sum_probs=41.7

Q ss_pred             ccCCeeEEEeeccceeEEEEEEEEEEcc--eE------eEEEEcCCCcceeecCCcEEEEE-eecccCCCCCCCccccEE
Q 005351          183 FPGDWFHVFGIGQRSIGFSVRIEVKTGS--KV------SEVTVGPENKTATSADNFLKVNL-IGDFVGYTNIPSFEEFYL  253 (701)
Q Consensus       183 f~~~WY~vy~I~~~~i~y~I~VtI~~gs--s~------~~v~Lgp~~~~~~s~d~~l~~~L-iGDf~~~~~~p~l~~~yL  253 (701)
                      +++.-|-+-.+++|...=.+.=.++...  +.      .++..-++-. ....+.+-+.++ ++..++...+-.|++-|.
T Consensus       199 ~~n~~y~Avkl~QP~t~aif~y~~y~~~~g~~w~e~~~e~ir~~~~~g-~~~~~l~~~~r~sl~vta~g~~~~QL~~GMY  277 (589)
T PF14884_consen  199 YDNRTYVAVKLEQPTTDAIFTYSIYDKVAGGQWIEKDKEEIRSQLDKG-SQQNELDHKRRISLRVTAGGRPSHQLETGMY  277 (589)
T ss_pred             cCCcEEEEEecCCCcEEEEEEeeeeeccccceeEeccCceEEEecCCc-hhhcccccccEEEEEEeecCCccccccCccE
Confidence            3556799999999866544444444321  11      2233322221 111122223333 366666667788888899


Q ss_pred             EeeCCC
Q 005351          254 VIPRQG  259 (701)
Q Consensus       254 liPs~~  259 (701)
                      +.|+.+
T Consensus       278 yf~~~n  283 (589)
T PF14884_consen  278 YFASEN  283 (589)
T ss_pred             EEEcCC
Confidence            999864


No 13 
>PHA02819 hypothetical protein; Provisional
Probab=55.57  E-value=20  Score=31.22  Aligned_cols=49  Identities=22%  Similarity=0.509  Sum_probs=26.4

Q ss_pred             CCCchhhhHHHhhhhhccccccccccccccccccccceeeeeehhhhHHHHHHHHHHHHHHHHHHHHHHH
Q 005351          505 SINDFFESIESIGKKLWEGLRDFITGKACRRKCSSFFDFSCHIQYICLSWLVLFGLVLAIFPTVLVLLWL  574 (701)
Q Consensus       505 ~~~gf~~~i~~~~~~~~~~~~~f~~g~~C~~~C~~f~d~~C~i~~~C~~~~~~~~l~l~~~~~~l~~l~l  574 (701)
                      -.+.|-|-+++       .|.|  .+-.|..+|+..            +|++.++++++++.+.++++||
T Consensus        19 DFnnFI~VVks-------VLtd--~s~~~~~~~~~~------------~~~~ii~l~~~~~~~~~~flYL   67 (71)
T PHA02819         19 DFNNFINVVKS-------VLNN--ENYNKKTKKSFL------------RYYLIIGLVTIVFVIIFIIFYL   67 (71)
T ss_pred             HHHHHHHHHHH-------HHcC--CCCcccccCChh------------HHHHHHHHHHHHHHHHHHHHHH
Confidence            34555666665       3444  334455555443            3444455666666666666665


No 14 
>PF00703 Glyco_hydro_2:  Glycosyl hydrolases family 2;  InterPro: IPR006102 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 2 GH2 from CAZY comprises enzymes with several known activities: beta-galactosidase (3.2.1.23 from EC); beta-mannosidase (3.2.1.25 from EC); beta-glucuronidase (3.2.1.31 from EC). These enzymes contain a conserved glutamic acid residue which has been shown [], in Escherichia coli lacZ (P00722 from SWISSPROT), to be the general acid/base catalyst in the active site of the enzyme.  This entry describes the immunoglobulin-like beta-sandwich domain [].; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 3FN9_C 3DEC_A 3OB8_A 3OBA_A 3CMG_A 3GM8_A 3HN3_E 1BHG_A 2VZU_A 2X09_A ....
Probab=49.10  E-value=1.8e+02  Score=25.18  Aligned_cols=94  Identities=13%  Similarity=0.134  Sum_probs=56.6

Q ss_pred             EEEEEcc-ccccccceeEEEEEE--EecCceeeeEEEEEEcCCC---CeeeceeEEEeCCCceEEEEEEEeeccc-----
Q 005351          389 IISVIIP-TFEALTQFGVATITT--QNTGEVEASYSLTFDCSTG---VTLMEEQYFIIKPKETSIRSFKIYPTTN-----  457 (701)
Q Consensus       389 I~sv~v~-~FEA~S~~G~L~V~V--~N~G~~~A~Y~v~vnCS~~---I~pI~~q~~~I~p~~~~~~~F~I~~~s~-----  457 (701)
                      |..+.+- .++.. ..+.+.|.+  .|.+.....+.|++...+.   ............++......+.+.+...     
T Consensus         3 I~dv~v~~~~~~~-~~~~v~v~~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~lW~p   81 (110)
T PF00703_consen    3 IEDVFVTPDLDDD-DSAKVSVEVEVRNESNKPLDVTVRVRLFDPEGKKVVTQSPVVSLSAPGQARITLTIEIPNPKLWSP   81 (110)
T ss_dssp             EEEEEEEEEEETT-SEEEEEEEEEEEEESSSSCEEEEEEEEEETTSEEEEEEEEEEEECCCCEEEEEEEEEEESS-BBES
T ss_pred             EEEEEEEEEEcCC-CEEEEEEEEEEEeCCCCcEEEEEEEEEECCCCCEEEEeeeEEEecCCceeEEEEEEEcCCCCCcCC
Confidence            3444432 34432 566655555  8888888888888755442   2222344455555555444344444321     


Q ss_pred             -cccceEEEEEEEcCCCCceeeeEEEEE
Q 005351          458 -QAAKYTCSAILKDSDFSEVDRAECQFS  484 (701)
Q Consensus       458 -~~~~~~C~v~L~ds~~~~lD~~~~~F~  484 (701)
                       ...-|...+.| +.+|..+|.....|-
T Consensus        82 ~~P~LY~l~v~l-~~~g~~~d~~~~~~G  108 (110)
T PF00703_consen   82 EDPYLYTLEVEL-DDDGEVLDSIETRFG  108 (110)
T ss_dssp             SSBSEEEEEEEE-EETTEEEEEEEEEEE
T ss_pred             CCceEEEEEEEE-EeCCEEEEEEEeEee
Confidence             12478999999 888899999988774


No 15 
>PRK09918 putative fimbrial chaperone protein; Provisional
Probab=47.60  E-value=1.5e+02  Score=30.92  Aligned_cols=97  Identities=16%  Similarity=0.156  Sum_probs=50.6

Q ss_pred             hHHHHHHHHHHhCCC-ccccEEEEEeeeeccccccccccccCcceeEEEEEEecCCCCCCceEEEEEEEEEeeccccc-c
Q 005351           13 FLLILFCILNLLSPR-CVVGVQILSKSKLEKCEKRTDSDNLNCTTKIVLNMAVPSGSSGGEASIVAEVVEVEENSTQK-M   90 (701)
Q Consensus        13 ~~~~~~l~~~~~~~~-~~~~a~iissS~ie~C~~~s~~~~~~C~kKlVvtL~V~ng~~~~e~s~v~ev~~v~e~~t~k-~   90 (701)
                      |+.+|.+++++++.. .+..+..+...++..-..+.           -++++|.|.+.  + .+..++ .|.++..+. .
T Consensus         5 ~~~~~~~~~l~~~~~~~a~a~v~l~~tRvi~~~~~~-----------~~si~v~N~~~--~-p~lvQ~-wv~~~~~~~~~   69 (230)
T PRK09918          5 LFFLFTALVLLSSSSAVHAAGMVPETSVVIVEESDG-----------EGSINVKNTDS--N-PILLYT-TLVDLPEDKSK   69 (230)
T ss_pred             hHHHHHHHHHHHhhhHhhEeeEEEccEEEEEECCCC-----------eEEEEEEcCCC--C-cEEEEE-EEecCCCCCCC
Confidence            444455444444332 23344457778887776432           25788888542  2 233222 444322212 2


Q ss_pred             eeEeeCcEEEEEeceeEEEeeeeEeecCCCCceeEE
Q 005351           91 RTVRIPPVLTVNKTASYAVYELTYIRDVPYKPQEFY  126 (701)
Q Consensus        91 ~~L~~p~~Iti~KS~V~~~YpL~Y~~~vn~kp~E~~  126 (701)
                      .=+..||...+.-...+.. .+.|....|.+ +|..
T Consensus        70 ~fivtPPl~rl~pg~~q~v-Rii~~~~lp~d-rEs~  103 (230)
T PRK09918         70 LLLVTPPVARVEPGQSQQV-RFILKSGSPLN-TEHL  103 (230)
T ss_pred             CEEEcCCeEEECCCCceEE-EEEECCCCCCC-eeEE
Confidence            3457789888887766533 44455554443 6654


No 16 
>PHA02844 putative transmembrane protein; Provisional
Probab=47.52  E-value=29  Score=30.49  Aligned_cols=19  Identities=26%  Similarity=0.705  Sum_probs=10.7

Q ss_pred             HHHHHHHHHHHHHHHHHHH
Q 005351          556 VLFGLVLAIFPTVLVLLWL  574 (701)
Q Consensus       556 ~~~~l~l~~~~~~l~~l~l  574 (701)
                      +.++++++++.+.++++||
T Consensus        51 ~ii~i~~v~~~~~~~flYL   69 (75)
T PHA02844         51 WILTIIFVVFATFLTFLYL   69 (75)
T ss_pred             HHHHHHHHHHHHHHHHHHH
Confidence            3344555555566666665


No 17 
>COG3190 FliO Flagellar biogenesis protein [Cell motility and secretion]
Probab=45.83  E-value=28  Score=34.01  Aligned_cols=32  Identities=31%  Similarity=0.725  Sum_probs=26.2

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHHhcCCcccchh
Q 005351          554 WLVLFGLVLAIFPTVLVLLWLLHQKGLFDPLYD  586 (701)
Q Consensus       554 ~~~~~~l~l~~~~~~l~~l~ll~k~g~~~~~~~  586 (701)
                      ++-+|+-++++++++|++.|++.|.+ +.|+..
T Consensus        24 ~~~~~gsL~~iL~lil~~~wl~kr~~-~~~~~~   55 (137)
T COG3190          24 LAQMFGSLILILALILFLAWLVKRLG-RAPLFK   55 (137)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHh-hcccCC
Confidence            56677888888999999999999999 556543


No 18 
>PF09451 ATG27:  Autophagy-related protein 27;  InterPro: IPR018939 Autophagy is a degradative transport pathway that delivers cytosolic proteins to the lysosome (vacuole) and is induced by starvation. Cytosolic proteins appear inside the vacuole enclosed in autophagic vesicles. Autophagy significantly differs from other transport pathways by using double membrane layered transport intermediates, called autophagosomes. The breakdown of vesicular transport intermediates is a unique feature of autophagy. Autophagy can also function in the elimination of invading bacteria and antigens. There are more than 25 AuTophaGy-related (ATG) genes that are essential for autophagy, although it is still not known how the autophagosome is made. Atg9 is a potential membrane carrier to deliver lipids that are used to form the vesicle. Atg27 is another transmembrane protein, and is a cycling protein. It acts as an effector of VPS34 phosphatidylinositol 3-phosphate kinase signalling and regulates the cytoplasm to vacuole transport (Cvt) vesicle formation. It is also required for autophagy-dependent cycling of ATG9.
Probab=42.83  E-value=1.6e+02  Score=31.49  Aligned_cols=27  Identities=15%  Similarity=0.302  Sum_probs=17.6

Q ss_pred             EEEEEEEe--cCceeeeEEEEEEcCCCCe
Q 005351          405 VATITTQN--TGEVEASYSLTFDCSTGVT  431 (701)
Q Consensus       405 ~L~V~V~N--~G~~~A~Y~v~vnCS~~I~  431 (701)
                      -|+|..++  .|...-.-.|.+.|..+..
T Consensus       118 Gl~l~l~G~~~~~~~~~a~i~f~Cd~~~~  146 (268)
T PF09451_consen  118 GLRLKLKGGKWGSNNQSAVIEFQCDKNAS  146 (268)
T ss_pred             CEEEEEeCCCCCCceEEEEEEEEcCCCCC
Confidence            46666665  4566666777788877654


No 19 
>TIGR00869 sec62 protein translocation protein, Sec62 family. These proteins are believed to be associated with the Sec61 and Sec63 constituents of the general protein secretory systems of yeast microsomes. They are also the non-selective cation (NS) channels of the mammalian cytoplasmic membrane. The yeast Sec62 protein has been shown to be essential for cell growth. The mammalian NS channel proteins have been implicated in platelet-derived growth factor (PDGF)-dependent single channel current in fibroblasts. These channels are essentially closed in serum-deprived tissue-culture cells and are specifically opened by exposure to PDGF. These channels are reported to exhibit equal selectivity for Na+, K+ and Cs+ with low permeability to Ca2+, and no permeability to anions.
Probab=42.48  E-value=29  Score=36.64  Aligned_cols=53  Identities=23%  Similarity=0.449  Sum_probs=30.9

Q ss_pred             ccccceeeee--ehhhhHHHHHHHHHHHHHHHH---HHHHHHHHHhcCC---------------cccchhhhh
Q 005351          537 CSSFFDFSCH--IQYICLSWLVLFGLVLAIFPT---VLVLLWLLHQKGL---------------FDPLYDWWD  589 (701)
Q Consensus       537 C~~f~d~~C~--i~~~C~~~~~~~~l~l~~~~~---~l~~l~ll~k~g~---------------~~~~~~~~~  589 (701)
                      |=..|-..|-  +-|.|++-+..++++++++.+   +.+++|++...|+               |.|||.|=|
T Consensus       135 lFPlWP~~~r~gv~YlS~~~lgll~~~~~laivRlilF~i~~~~~g~~fWlfPNLfeD~Gf~eSF~Ply~~~~  207 (232)
T TIGR00869       135 LFPLWPRFMRRGSWYLSLGALGIIGGFFAVAILRLILFVLTLIVVKPGIWIFPNLFADVGFLDSFKPLWGWHE  207 (232)
T ss_pred             hcccChHHHhHhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCCCeeeecchhcccCcceeeccceeccc
Confidence            3344444443  367888877666555544433   5555666655553               568888753


No 20 
>PF01102 Glycophorin_A:  Glycophorin A;  InterPro: IPR001195 Proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Glycophorin A (PAS-2) and glycophorin B (PAS-3) belong to the MNS blood group system and are associated with antigens that include M/N, S/s, U, He, Mi(a), M(c), Vw, Mur, M(g), Vr, M(e), Mt(a), St(a), Ri(a), Cl(a), Ny(a), Hut, Hil, M(v), Far, Mit, Dantu, Hop, Nob, En(a), ENKT, amongst others. Glycophorin A is the major sialoglycoprotein of the erythrocyte membrane. Structurally, glycophorin A consists of an N-terminal extracellular domain, heavily glycosylated on serine and threonine residues, followed by a transmembrane region and a C-terminal cytoplasmic domain. Other glycophorins in this entry such as Glycophorin B and Glycophorin E represent minor sialoglycoproteins in the erythrocyte membrane.; GO: 0016021 integral to membrane; PDB: 2KPF_B 1AFO_B 2KPE_A.
Probab=40.20  E-value=34  Score=32.75  Aligned_cols=22  Identities=14%  Similarity=0.535  Sum_probs=16.0

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHH
Q 005351          555 LVLFGLVLAIFPTVLVLLWLLH  576 (701)
Q Consensus       555 ~~~~~l~l~~~~~~l~~l~ll~  576 (701)
                      ++.|+.++++++++|+++|++.
T Consensus        68 ~Ii~gv~aGvIg~Illi~y~ir   89 (122)
T PF01102_consen   68 GIIFGVMAGVIGIILLISYCIR   89 (122)
T ss_dssp             HHHHHHHHHHHHHHHHHHHHHH
T ss_pred             ehhHHHHHHHHHHHHHHHHHHH
Confidence            3466778888888887777763


No 21 
>PF05506 DUF756:  Domain of unknown function (DUF756);  InterPro: IPR008475 This domain is found, normally as a tandem repeat, at the C terminus of bacterial phospholipase C proteins.; GO: 0004629 phospholipase C activity, 0016042 lipid catabolic process
Probab=39.43  E-value=1.2e+02  Score=26.66  Aligned_cols=48  Identities=10%  Similarity=0.127  Sum_probs=37.2

Q ss_pred             ceeEEEEEEEecCceeeeEEEEEEcCCCCeeeceeEEEeCCCceEEEEEEE
Q 005351          402 QFGVATITTQNTGEVEASYSLTFDCSTGVTLMEEQYFIIKPKETSIRSFKI  452 (701)
Q Consensus       402 ~~G~L~V~V~N~G~~~A~Y~v~vnCS~~I~pI~~q~~~I~p~~~~~~~F~I  452 (701)
                      ..|.|.+++.|.|.-...|+|.-+=.   ..-..+.++|+|+++....+++
T Consensus        18 ~~g~l~l~l~N~g~~~~~~~v~~~~y---~~~~~~~~~v~ag~~~~~~w~l   65 (89)
T PF05506_consen   18 ATGNLRLTLSNPGSAAVTFTVYDNAY---GGGGPWTYTVAAGQTVSLTWPL   65 (89)
T ss_pred             CCCEEEEEEEeCCCCcEEEEEEeCCc---CCCCCEEEEECCCCEEEEEEee
Confidence            45789999999999999888875211   1134589999999998888877


No 22 
>TIGR02745 ccoG_rdxA_fixG cytochrome c oxidase accessory protein FixG. Members of this ferredoxin-like protein family are found exclusively in species with an operon encoding the cbb3 type of cytochrome c oxidase (cco-cbb3), and near the cco-cbb3 operon in about half the cases. The cco-cbb3 is found in a variety of proteobacteria and almost nowhere else, and is associated with oxygen use under microaerobic conditions. Some (but not all) of these proteobacteria are also nitrogen-fixing, hence the gene symbol fixG. FixG was shown to be essential for functional cco-cbb3 expression in Bradyrhizobium japonicum.
Probab=34.63  E-value=7.2e+02  Score=28.72  Aligned_cols=68  Identities=13%  Similarity=0.078  Sum_probs=44.6

Q ss_pred             EEEEEEEecCceeeeEEEEEEcCCCCeeece-eEEEeCCCceEEEEEEEeeccc--cccceEEEEEEEcCC
Q 005351          405 VATITTQNTGEVEASYSLTFDCSTGVTLMEE-QYFIIKPKETSIRSFKIYPTTN--QAAKYTCSAILKDSD  472 (701)
Q Consensus       405 ~L~V~V~N~G~~~A~Y~v~vnCS~~I~pI~~-q~~~I~p~~~~~~~F~I~~~s~--~~~~~~C~v~L~ds~  472 (701)
                      .-++.+.|....+-.|++++.=-.++.-.-. ..+.++|++.....+.+.+...  ...++.=.+.+.+.+
T Consensus       349 ~Y~~~i~Nk~~~~~~~~l~v~g~~~~~~~~~~~~i~v~~g~~~~~~v~v~~~~~~~~~~~~~i~~~v~~~~  419 (434)
T TIGR02745       349 TYTLKILNKTEQPHEYYLSVLGLPGIKIEGPGAPIHVKAGEKVKLPVFLRTPPDALKSGITSIEIRAYAED  419 (434)
T ss_pred             EEEEEEEECCCCCEEEEEEEecCCCcEEEcCCceEEECCCCEEEEEEEEEechhhccCCceeEEEEEEECC
Confidence            3478889999999999999865555433323 3789999999888777766532  123444444455433


No 23 
>PHA02692 hypothetical protein; Provisional
Probab=33.34  E-value=64  Score=28.14  Aligned_cols=15  Identities=27%  Similarity=0.465  Sum_probs=8.3

Q ss_pred             HHHHHHHHHHHHHHH
Q 005351          560 LVLAIFPTVLVLLWL  574 (701)
Q Consensus       560 l~l~~~~~~l~~l~l  574 (701)
                      ++++++.+.++++||
T Consensus        53 ~~~~~~~vll~flYL   67 (70)
T PHA02692         53 LIAAAIGVLLCFHYL   67 (70)
T ss_pred             HHHHHHHHHHHHHHH
Confidence            455555555556655


No 24 
>PLN03080 Probable beta-xylosidase; Provisional
Probab=32.12  E-value=90  Score=38.48  Aligned_cols=50  Identities=12%  Similarity=0.076  Sum_probs=37.6

Q ss_pred             EEEEEEEecCceeeeEEEEEEcC----CCCeeec----eeEEEeCCCceEEEEEEEee
Q 005351          405 VATITTQNTGEVEASYSLTFDCS----TGVTLME----EQYFIIKPKETSIRSFKIYP  454 (701)
Q Consensus       405 ~L~V~V~N~G~~~A~Y~v~vnCS----~~I~pI~----~q~~~I~p~~~~~~~F~I~~  454 (701)
                      .+.|+|+|+|.+.+...+++.=+    ..-.|+-    =+.+.++|+++.+.+|.|..
T Consensus       687 ~v~v~VtNtG~~~G~evvQlYv~~p~~~~~~P~k~L~gF~kv~L~~Ges~~V~~~l~~  744 (779)
T PLN03080        687 NVHISVSNVGEMDGSHVVMLFSRSPPVVPGVPEKQLVGFDRVHTASGRSTETEIVVDP  744 (779)
T ss_pred             EEEEEEEECCcccCcEEEEEEEecCccCCCCcchhccCcEeEeeCCCCEEEEEEEeCc
Confidence            48899999999999999886322    2223442    23678999999999999965


No 25 
>PRK15098 beta-D-glucoside glucohydrolase; Provisional
Probab=30.20  E-value=92  Score=38.26  Aligned_cols=51  Identities=25%  Similarity=0.309  Sum_probs=37.7

Q ss_pred             EEEEEEEecCceeeeEEEEEEcC---CCC-eee----ceeEEEeCCCceEEEEEEEeec
Q 005351          405 VATITTQNTGEVEASYSLTFDCS---TGV-TLM----EEQYFIIKPKETSIRSFKIYPT  455 (701)
Q Consensus       405 ~L~V~V~N~G~~~A~Y~v~vnCS---~~I-~pI----~~q~~~I~p~~~~~~~F~I~~~  455 (701)
                      .++|.|+|+|...+.-.+++.=+   ..+ .|.    .=+.+.++|+++.+.+|.|...
T Consensus       670 ~v~v~V~NtG~~~G~EVvQlYv~~~~~~~~~P~k~L~gF~Kv~L~pGes~~V~~~l~~~  728 (765)
T PRK15098        670 TASVTVTNTGKREGATVVQLYLQDVTASMSRPVKELKGFEKIMLKPGETQTVSFPIDIE  728 (765)
T ss_pred             EEEEEEEECCCCCccEEEEEeccCCCCCCCCHHHhccCceeEeECCCCeEEEEEeecHH
Confidence            57789999999999999986322   222 232    2236789999999999999764


No 26 
>PRK10856 cytoskeletal protein RodZ; Provisional
Probab=30.06  E-value=41  Score=37.20  Aligned_cols=21  Identities=24%  Similarity=0.548  Sum_probs=14.1

Q ss_pred             HHHHHHHHHHHHHHHHHHHHH
Q 005351          553 SWLVLFGLVLAIFPTVLVLLW  573 (701)
Q Consensus       553 ~~~~~~~l~l~~~~~~l~~l~  573 (701)
                      +|++.+..+++++.++|+.+|
T Consensus       111 ~~~~~~~~lv~~vvl~l~~~w  131 (331)
T PRK10856        111 GWLMTFTWLVLFVVIGLTGAW  131 (331)
T ss_pred             CchHHHHHHHHHHHHHHHHHH
Confidence            366666666666667777767


No 27 
>PRK10019 nickel/cobalt efflux protein RcnA; Provisional
Probab=29.93  E-value=1.1e+02  Score=33.20  Aligned_cols=7  Identities=14%  Similarity=0.278  Sum_probs=3.3

Q ss_pred             chhhhhh
Q 005351          584 LYDWWDD  590 (701)
Q Consensus       584 ~~~~~~~  590 (701)
                      ++.|-|-
T Consensus        86 ~~~~le~   92 (279)
T PRK10019         86 AEPWLQL   92 (279)
T ss_pred             HHHHHHH
Confidence            4455444


No 28 
>PF03839 Sec62:  Translocation protein Sec62;  InterPro: IPR004728 Members of the NSCC2 family have been sequenced from various yeast, fungal and animal species including Saccharomyces cerevisiae, Drosophila melanogaster and Homo sapiens. These proteins are the Sec62 proteins, believed to be associated with the Sec61 and Sec63 constituents of the general protein secretory systems of yeast microsomes. They are also the non-selective cation (NS) channels of the mammalian cytoplasmic membrane. The yeast Sec62 protein has been shown to be essential for cell growth. The mammalian NS channel proteins have been implicated in platelet-derived growth factor (PDGF)-dependent single channel current in fibroblasts. These channels are essentially closed in serum-deprived tissue-culture cells and are specifically opened by exposure to PDGF. These channels are reported to exhibit equal selectivity for Na+, K+ and Cs+ with low permeability to Ca2+, and no permeability to anions.; GO: 0008565 protein transporter activity, 0015031 protein transport, 0016021 integral to membrane
Probab=28.48  E-value=70  Score=33.69  Aligned_cols=52  Identities=25%  Similarity=0.624  Sum_probs=28.3

Q ss_pred             ccccceeeee--ehhhhHHHHHHHHHHHHHH---HHHHHHHHHHH--hcCC---------------cccchhhh
Q 005351          537 CSSFFDFSCH--IQYICLSWLVLFGLVLAIF---PTVLVLLWLLH--QKGL---------------FDPLYDWW  588 (701)
Q Consensus       537 C~~f~d~~C~--i~~~C~~~~~~~~l~l~~~---~~~l~~l~ll~--k~g~---------------~~~~~~~~  588 (701)
                      |=..|=.++-  +-|.|++-+..++++++|+   +|+.+++|++.  +.|+               |.|||.|=
T Consensus       127 lFPlWP~~~r~gv~YlS~~~lgll~~~~~laivRlilf~i~w~~~~g~~~fWlfPNLfeD~Gf~eSF~Ply~~~  200 (224)
T PF03839_consen  127 LFPLWPRWMRQGVYYLSVGALGLLGLFFALAIVRLILFLITWFFTGGKHGFWLFPNLFEDVGFFESFKPLYSWE  200 (224)
T ss_pred             hhhcChHHHhheeehhHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCCCEEeCCccccccchhhheeeccccc
Confidence            3344444443  3567776555555444433   33555666665  4443               67888874


No 29 
>PHA02650 hypothetical protein; Provisional
Probab=25.92  E-value=90  Score=27.87  Aligned_cols=21  Identities=10%  Similarity=0.151  Sum_probs=11.8

Q ss_pred             HHHHHHHHHHHHHHHHHHHHH
Q 005351          554 WLVLFGLVLAIFPTVLVLLWL  574 (701)
Q Consensus       554 ~~~~~~l~l~~~~~~l~~l~l  574 (701)
                      |++.++++++++.+.++++||
T Consensus        50 ~~~ii~i~~v~i~~l~~flYL   70 (81)
T PHA02650         50 QNFIFLIFSLIIVALFSFFVF   70 (81)
T ss_pred             HHHHHHHHHHHHHHHHHHHHH
Confidence            344445555566666666665


No 30 
>PHA03054 IMV membrane protein; Provisional
Probab=25.80  E-value=94  Score=27.21  Aligned_cols=21  Identities=14%  Similarity=0.488  Sum_probs=11.9

Q ss_pred             HHHHHHHHHHHHHHHHHHHHH
Q 005351          554 WLVLFGLVLAIFPTVLVLLWL  574 (701)
Q Consensus       554 ~~~~~~l~l~~~~~~l~~l~l  574 (701)
                      |++.++++++++.+.++++||
T Consensus        49 ~~~ii~l~~v~~~~l~~flYL   69 (72)
T PHA03054         49 YWLIIIFFIVLILLLLIYLYL   69 (72)
T ss_pred             HHHHHHHHHHHHHHHHHHHHH
Confidence            344444566666666666665


No 31 
>PF05297 Herpes_LMP1:  Herpesvirus latent membrane protein 1 (LMP1);  InterPro: IPR007961 This family consists of several latent membrane protein 1 or LMP1s mostly from Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4). LMP1 of HHV-4 is a 62-65 kDa plasma membrane protein possessing six membrane spanning regions, a short cytoplasmic N terminus and a long cytoplasmic carboxy tail of 200 amino acids. HHV-4 virus latent membrane protein 1 (LMP1) is essential for HHV-4 mediated transformation and has been associated with several cases of malignancies. HHV-4-like viruses in Macaca fascicularis (Cynomolgus monkeys) have been associated with high lymphoma rates in immunosuppressed monkeys.; GO: 0019087 transformation of host cell by virus, 0016021 integral to membrane; PDB: 1CZY_E 1ZMS_B.
Probab=25.79  E-value=23  Score=38.51  Aligned_cols=90  Identities=17%  Similarity=0.259  Sum_probs=0.0

Q ss_pred             cccceeeeeehhhhHHHHHHH---------------HHHHHHHHHHHHHHHHHHhcCCcccchhhhhhhccCcccccccc
Q 005351          538 SSFFDFSCHIQYICLSWLVLF---------------GLVLAIFPTVLVLLWLLHQKGLFDPLYDWWDDHFQSDNQRIRDF  602 (701)
Q Consensus       538 ~~f~d~~C~i~~~C~~~~~~~---------------~l~l~~~~~~l~~l~ll~k~g~~~~~~~~~~~~~~~~~~~~~~~  602 (701)
                      ++||-+.-|.-.+|+.-++..               .+-|.++.+.|.-+|+=-+.+.        |..-+.|.-.....
T Consensus       134 As~WtiLaFcLAF~LaivlLIIAv~L~qaWfT~L~dL~WL~LFlaiLIWlY~H~~~~~--------~e~~~dd~~~h~~~  205 (381)
T PF05297_consen  134 ASFWTILAFCLAFLLAIVLLIIAVLLHQAWFTILVDLYWLLLFLAILIWLYVHDQRHA--------EEHNHDDGHGHPQQ  205 (381)
T ss_dssp             ------------------------------------------------------------------------------EE
T ss_pred             hHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCCCC--------cccccccCCCCCCc


Q ss_pred             cc----cccCCCCCccccccccccccccccchhcccc
Q 005351          603 RS----RRIDVDHPHVHVRKHHKQEGRHHKLEARRRR  635 (701)
Q Consensus       603 ~~----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  635 (701)
                      -+    |.+..+|+|+||++++........+..|+-+
T Consensus       206 ~~~d~~hd~~~N~Nh~~HH~~vsg~GDgpP~~SQn~G  242 (381)
T PF05297_consen  206 ATDDSGHDKESNHNHHHHHHLVSGAGDGPPYVSQNGG  242 (381)
T ss_dssp             -S-----------------------------------
T ss_pred             CCcccCCCCCCCCCCCCCceeeccCCCCCCcccccCC


No 32 
>PF12575 DUF3753:  Protein of unknown function (DUF3753);  InterPro: IPR009175 This group represents an uncharacterised conserved protein belonging to poxvirus family I2.
Probab=24.74  E-value=93  Score=27.33  Aligned_cols=21  Identities=14%  Similarity=0.573  Sum_probs=12.9

Q ss_pred             HHHHHHHHHHHHHHHHHHHHH
Q 005351          554 WLVLFGLVLAIFPTVLVLLWL  574 (701)
Q Consensus       554 ~~~~~~l~l~~~~~~l~~l~l  574 (701)
                      |++.++++++++.+.|+++||
T Consensus        49 ~~~ii~ii~v~ii~~l~flYL   69 (72)
T PF12575_consen   49 IILIISIIFVLIIVLLTFLYL   69 (72)
T ss_pred             HHHHHHHHHHHHHHHHHHHHh
Confidence            345555666666666667665


No 33 
>PF14155 DUF4307:  Domain of unknown function (DUF4307)
Probab=24.47  E-value=2.4e+02  Score=26.32  Aligned_cols=51  Identities=20%  Similarity=0.177  Sum_probs=39.3

Q ss_pred             eeecee--EEEeCCCceEEEEEEEeeccccccceEEEEEEEcCCCCceeeeEEEE
Q 005351          431 TLMEEQ--YFIIKPKETSIRSFKIYPTTNQAAKYTCSAILKDSDFSEVDRAECQF  483 (701)
Q Consensus       431 ~pI~~q--~~~I~p~~~~~~~F~I~~~s~~~~~~~C~v~L~ds~~~~lD~~~~~F  483 (701)
                      .||+.+  -+.+....+...+|++.-.  ......|.+.-+|.++.++=++++.+
T Consensus        32 ~~v~~~~~gf~vv~d~~v~v~f~Vtr~--~~~~a~C~VrA~~~d~aeVGrreV~v   84 (112)
T PF14155_consen   32 PPVSAEVIGFEVVDDSTVEVTFDVTRD--PGRPAVCIVRALDYDGAEVGRREVLV   84 (112)
T ss_pred             CCceEEEEEEEECCCCEEEEEEEEEEC--CCCCEEEEEEEEeCCCCEEEEEEEEE
Confidence            455555  4455666677777777554  66799999999999999999999888


No 34 
>PHA02975 hypothetical protein; Provisional
Probab=24.32  E-value=1e+02  Score=26.77  Aligned_cols=21  Identities=14%  Similarity=0.391  Sum_probs=12.0

Q ss_pred             HHHHHHHHHHHHHHHHHHHHH
Q 005351          554 WLVLFGLVLAIFPTVLVLLWL  574 (701)
Q Consensus       554 ~~~~~~l~l~~~~~~l~~l~l  574 (701)
                      +++.++++++++.+++.++||
T Consensus        45 ~~~ii~i~~v~~~~~~~flYL   65 (69)
T PHA02975         45 IILIIFIIFITCIAVFTFLYL   65 (69)
T ss_pred             HHHHHHHHHHHHHHHHHHHHH
Confidence            334444566666666666665


No 35 
>PF12273 RCR:  Chitin synthesis regulation, resistance to Congo red;  InterPro: IPR020999 RCR proteins are ER membrane proteins that regulate chitin deposition in fungal cell walls. Although chitin, a linear polymer of beta-1,4-linked N-acetylglucosamine, constitutes only 2% of the cell wall, it plays a vital role in the overall protection of the cell wall against stress, noxious chemicals and osmotic pressure changes. Congo red is a cell wall-disrupting benzidine-type dye extensively used in many cell wall mutant studies that specifically targets chitin in yeast cells and inhibits growth. RCR proteins render the yeasts resistant to Congo red by diminishing the content of chitin in the cell wall. RCR proteins probably regulate chitin synthase III and interact directly with ubiquitin ligase Rsp5; the VPEY motif is necessary for this, via interaction with the WW domains of Rsp5.
Probab=22.83  E-value=68  Score=30.41  Aligned_cols=8  Identities=38%  Similarity=0.679  Sum_probs=4.0

Q ss_pred             CCCccccc
Q 005351          649 RDTDYYYY  656 (701)
Q Consensus       649 ~~~~~~~~  656 (701)
                      .+..||+.
T Consensus        73 ~~~g~Yd~   80 (130)
T PF12273_consen   73 NDPGYYDQ   80 (130)
T ss_pred             CCCCCCCC
Confidence            34555554


No 36 
>PF06030 DUF916:  Bacterial protein of unknown function (DUF916);  InterPro: IPR010317 This family consists of putative cell surface proteins, from Firmicutes, of unknown function. 
Probab=22.79  E-value=2.3e+02  Score=26.85  Aligned_cols=20  Identities=20%  Similarity=0.365  Sum_probs=17.9

Q ss_pred             EEeCCCceEEEEEEEeeccc
Q 005351          438 FIIKPKETSIRSFKIYPTTN  457 (701)
Q Consensus       438 ~~I~p~~~~~~~F~I~~~s~  457 (701)
                      ++|+|+++....|.|.+...
T Consensus        87 Vtl~~~~sk~V~~~i~~P~~  106 (121)
T PF06030_consen   87 VTLPPNESKTVTFTIKMPKK  106 (121)
T ss_pred             EEECCCCEEEEEEEEEcCCC
Confidence            99999999999999988643


No 37 
>PRK13211 N-acetylglucosamine-binding protein A; Reviewed
Probab=22.35  E-value=8.2e+02  Score=28.75  Aligned_cols=40  Identities=15%  Similarity=0.045  Sum_probs=33.6

Q ss_pred             EEEEEeeccccccceEEEEEEEcCCCCceeeeEEEEEeee
Q 005351          448 RSFKIYPTTNQAAKYTCSAILKDSDFSEVDRAECQFSTMA  487 (701)
Q Consensus       448 ~~F~I~~~s~~~~~~~C~v~L~ds~~~~lD~~~~~F~Tta  487 (701)
                      ..|.|-++...+..|.=.|..+|++|...++..+.|.+++
T Consensus       367 ~~vtL~Ls~~~AG~y~Lvv~~t~~dG~~~~q~~~~~~v~~  406 (478)
T PRK13211        367 QSVSLDLSKLKAGHHMLVVKAKPKDGELIKQQTLDFMLEA  406 (478)
T ss_pred             eeEEEecccCCCceEEEEEEEEeCCCceeeeeeEEEEEEe
Confidence            4456666666788999999999999999999999999963


No 38 
>PF13198 DUF4014:  Protein of unknown function (DUF4014)
Probab=21.32  E-value=82  Score=27.57  Aligned_cols=30  Identities=20%  Similarity=0.726  Sum_probs=20.4

Q ss_pred             HHHHH-HHHHHHHhcCCcccchhhhhhhccCcc
Q 005351          565 FPTVL-VLLWLLHQKGLFDPLYDWWDDHFQSDN  596 (701)
Q Consensus       565 ~~~~l-~~l~ll~k~g~~~~~~~~~~~~~~~~~  596 (701)
                      ++|.. +++|+.-+  .+.|+.+|..|.+++--
T Consensus        29 ipI~pll~~~~i~~--~~E~l~e~Y~~~~w~~F   59 (72)
T PF13198_consen   29 IPISPLLFVWIIGK--IIEPLFELYKDWFWNPF   59 (72)
T ss_pred             HHHHHHHHHHHHHH--HHHHHHHHHHHHHHHhH
Confidence            55544 44444444  89999999999888743


No 39 
>PF02018 CBM_4_9:  Carbohydrate binding domain;  InterPro: IPR003305 The 1,4-beta-glucanase CenC from Cellulomonas fimi contains two cellulose-binding domains, CBD(N1) and CBD(N2), arranged in tandem at its N terminus. These homologous CBDs are distinct in their selectivity for binding amorphous and not crystalline cellulose. Multidimensional heteronuclear nuclear magnetic resonance (NMR) spectroscopy was used to determine the tertiary structure of the 152 amino acid N-terminal cellulose-binding domain from C. fimi 1,4-beta-glucanase CenC (CBDN1). The tertiary structure of CBDN1 is strikingly similar to that of the bacterial 1,3-1,4-beta-glucanases, as well as other sugar-binding proteins with jelly-roll folds.; GO: 0016798 hydrolase activity, acting on glycosyl bonds; PDB: 3OEA_B 2ZEX_B 3OEB_A 2ZEY_A 2ZEW_A 1GUI_A 2W5F_A 2WZE_A 2WYS_A 2ZEZ_B ....
Probab=20.26  E-value=6.1e+02  Score=22.43  Aligned_cols=41  Identities=22%  Similarity=0.296  Sum_probs=26.1

Q ss_pred             eeeEEEEEEcCCCC--eeeceeEEEeCCCceEEEEEEEeeccc
Q 005351          417 EASYSLTFDCSTGV--TLMEEQYFIIKPKETSIRSFKIYPTTN  457 (701)
Q Consensus       417 ~A~Y~v~vnCS~~I--~pI~~q~~~I~p~~~~~~~F~I~~~s~  457 (701)
                      ...|.+.+.-....  .....+.+.|+|+++|.++|.++....
T Consensus        33 ~g~~~l~v~~~~~~~~~~~~~~~~~l~~G~~Y~~s~~vk~~~~   75 (131)
T PF02018_consen   33 SGNYSLKVSNRSATWDGQSQQQTISLKPGKTYTVSFWVKADSG   75 (131)
T ss_dssp             SSSEEEEEECCSSGCGEEEEEEEEEE-TTSEEEEEEEEEESSS
T ss_pred             CCeEEEEEECCCCCccccceecceEecCCCEEEEEEEEEeCCC
Confidence            55566666433321  122344579999999999999988755


Done!