Query         023943
Match_columns 275
No_of_seqs    342 out of 1986
Neff          7.0 
Searched_HMMs 46136
Date          Fri Mar 29 07:46:33 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/023943.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/023943hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PRK09525 lacZ beta-D-galactosi 100.0 1.2E-58 2.5E-63  482.3  23.4  242    8-273     4-246 (1027)
  2 PRK10340 ebgA cryptic beta-D-g 100.0 1.8E-57 3.8E-62  473.9  24.0  233   18-272     1-233 (1021)
  3 PRK10150 beta-D-glucuronidase; 100.0 1.4E-34   3E-39  288.9  20.2  183   71-273     9-207 (604)
  4 PF02837 Glyco_hydro_2_N:  Glyc 100.0 9.2E-35   2E-39  244.6  14.5  165   72-246     2-167 (167)
  5 COG3250 LacZ Beta-galactosidas 100.0 7.1E-30 1.5E-34  259.2  13.0  196   71-272     9-206 (808)
  6 KOG2024 Beta-Glucuronidase GUS  99.7 4.1E-16   9E-21  138.1   9.6  176   70-266    28-209 (297)
  7 KOG2230 Predicted beta-mannosi  99.2 1.4E-10   3E-15  112.4  11.5  169   73-263    21-227 (867)
  8 PF13364 BetaGal_dom4_5:  Beta-  97.9 4.9E-05 1.1E-09   60.2   8.3   71  142-218    33-107 (111)
  9 COG3250 LacZ Beta-galactosidas  97.9 1.6E-05 3.4E-10   82.3   6.5   70  145-218    64-133 (808)
 10 PLN03059 beta-galactosidase; P  97.8 0.00015 3.3E-09   74.8  10.3   94  143-244   469-571 (840)
 11 PF08531 Bac_rhamnosid_N:  Alph  97.6 0.00012 2.5E-09   62.4   5.9   53  161-218     4-65  (172)
 12 PLN03059 beta-galactosidase; P  96.1   0.011 2.3E-07   61.5   6.0   66  143-216   618-712 (840)
 13 KOG0496 Beta-galactosidase [Ca  95.9   0.026 5.6E-07   56.8   7.5   75  162-245   434-514 (649)
 14 KOG0496 Beta-galactosidase [Ca  94.6   0.058 1.3E-06   54.3   5.5   68  143-218   556-625 (649)
 15 PF07691 PA14:  PA14 domain;  I  92.5     1.2 2.7E-05   35.5   9.1   71  143-220    45-122 (145)
 16 PF14683 CBM-like:  Polysacchar  91.8    0.37 7.9E-06   40.9   5.3   69  146-219    63-153 (167)
 17 PF03170 BcsB:  Bacterial cellu  88.9     2.5 5.3E-05   42.8   9.3   71  146-220    29-112 (605)
 18 PF08308 PEGA:  PEGA domain;  I  88.4       1 2.2E-05   32.0   4.6   43  165-219     4-46  (71)
 19 smart00758 PA14 domain in bact  88.1     3.8 8.3E-05   32.6   8.3   70  143-219    43-113 (136)
 20 PF03170 BcsB:  Bacterial cellu  86.7     2.2 4.8E-05   43.1   7.5   67  149-219   327-410 (605)
 21 PRK11114 cellulose synthase re  85.9     3.5 7.7E-05   43.0   8.7   70  148-221    83-166 (756)
 22 PF06832 BiPBP_C:  Penicillin-B  77.4      11 0.00024   28.0   6.4   48  159-214    30-77  (89)
 23 PF12733 Cadherin-like:  Cadher  71.3      19  0.0004   26.4   6.3   56  151-219    17-73  (88)
 24 PF04566 RNA_pol_Rpb2_4:  RNA p  67.2       5 0.00011   28.4   2.3   13  176-188     1-13  (63)
 25 PF12222 PNGaseA:  Peptide N-ac  65.6      14  0.0003   36.1   5.7   51  169-219   218-291 (427)
 26 PF11008 DUF2846:  Protein of u  59.9      30 0.00065   27.1   5.8   34  171-214    40-74  (117)
 27 PF14324 PINIT:  PINIT domain;   56.2      13 0.00028   30.5   3.3   51  162-218    74-131 (144)
 28 PF11824 DUF3344:  Protein of u  56.2      27 0.00058   31.9   5.6   66  148-217    40-132 (271)
 29 PF14814 UB2H:  Bifunctional tr  54.7      22 0.00047   26.5   4.0   42  143-185    38-82  (85)
 30 PF07550 DUF1533:  Protein of u  46.2      29 0.00062   24.4   3.3   43  173-219     8-58  (65)
 31 PF09113 N-glycanase_C:  Peptid  42.4      87  0.0019   25.8   6.0   67  147-218    11-118 (141)
 32 smart00560 LamGL LamG-like jel  41.6      34 0.00073   27.2   3.5   27  161-187    64-90  (133)
 33 PRK11114 cellulose synthase re  39.4      57  0.0012   34.2   5.6   39  148-186   378-427 (756)
 34 PF07908 D-aminoacyl_C:  D-amin  38.0      26 0.00056   23.3   1.9   13  173-185    20-32  (48)
 35 PF10262 Rdx:  Rdx family;  Int  37.0      48   0.001   23.9   3.4   23  161-184    33-55  (76)
 36 PF11824 DUF3344:  Protein of u  35.1      56  0.0012   29.8   4.2   48  165-216   204-254 (271)
 37 TIGR02148 Fibro_Slime fibro-sl  34.4 2.1E+02  0.0045   21.8   6.7   53  163-219    20-76  (90)
 38 PF00337 Gal-bind_lectin:  Gala  33.1      69  0.0015   25.3   4.0   28  159-186    81-108 (133)
 39 PF09829 DUF2057:  Uncharacteri  32.2      98  0.0021   26.3   5.1   39  172-219     8-46  (189)
 40 PF05775 AfaD:  Enterobacteria   32.0      97  0.0021   24.6   4.5   45  164-218    18-62  (111)
 41 PF13385 Laminin_G_3:  Concanav  30.5      58  0.0013   25.1   3.2   24  162-187    89-112 (157)
 42 KOG4342 Alpha-mannosidase [Car  30.2 1.7E+02  0.0037   30.4   6.9   67  145-218   103-172 (1078)
 43 cd00070 GLECT Galectin/galacto  29.3      83  0.0018   24.8   3.9   28  159-186    76-103 (127)
 44 COG0278 Glutaredoxin-related p  28.8      33 0.00073   26.8   1.4   16  171-186    70-85  (105)
 45 PF15625 CC2D2AN-C2:  CC2D2A N-  28.7      94   0.002   26.0   4.3   40  172-220    39-78  (168)
 46 PF06439 DUF1080:  Domain of Un  27.6 1.1E+02  0.0023   25.2   4.4   21  171-191   138-158 (185)
 47 TIGR02412 pepN_strep_liv amino  26.6 7.9E+02   0.017   26.1  11.8   63  145-218    35-98  (831)
 48 PRK10824 glutaredoxin-4; Provi  26.0      48   0.001   26.3   1.9   23  164-186    62-84  (115)
 49 smart00276 GLECT Galectin. Gal  25.6   1E+02  0.0023   24.3   3.9   28  159-186    75-102 (128)
 50 PF13464 DUF4115:  Domain of un  25.6 1.1E+02  0.0024   21.9   3.7   25  160-185    37-61  (77)
 51 COG3148 Uncharacterized conser  25.6      46 0.00099   29.5   1.8   51   22-82     90-140 (231)
 52 smart00776 NPCBM This novel pu  25.5 3.8E+02  0.0082   21.9   7.8   42  172-222    85-131 (145)
 53 smart00561 MBT Present in Dros  24.7      59  0.0013   24.9   2.1   21  159-179    54-74  (96)
 54 PF11324 DUF3126:  Protein of u  23.9      71  0.0015   22.7   2.2   18  169-186    25-42  (63)
 55 PF03422 CBM_6:  Carbohydrate b  23.3 3.3E+02  0.0072   20.7   6.3   40  173-218    61-110 (125)
 56 PRK01904 hypothetical protein;  22.4 1.6E+02  0.0034   26.0   4.6   41  172-220    29-69  (219)
 57 KOG1752 Glutaredoxin and relat  21.9      65  0.0014   25.1   1.9   25  163-187    58-82  (104)
 58 PRK15222 putative pilin struct  20.7 1.8E+02   0.004   24.4   4.4   44  165-218    60-103 (156)
 59 cd02848 Chitinase_N_term Chiti  20.2 2.5E+02  0.0054   22.1   4.8   45  168-219    45-91  (106)
 60 PF01589 Alpha_E1_glycop:  Alph  20.1 1.5E+02  0.0032   29.2   4.2   28  161-188   195-222 (502)
 61 PRK06789 flagellar motor switc  20.0 1.1E+02  0.0024   22.4   2.6   27  160-187    31-57  (74)

No 1  
>PRK09525 lacZ beta-D-galactosidase; Reviewed
Probab=100.00  E-value=1.2e-58  Score=482.29  Aligned_cols=242  Identities=43%  Similarity=0.804  Sum_probs=221.3

Q ss_pred             ccccccccCCCCCCCCcccccccCCCCCCCccccCChhhhcccCCchhhHHHhhhcccccCCCCCcEEecCccceEEecC
Q 023943            8 LPFALENANGYKVWEDPSFIKWRKRDPHVTLRCHDSVEVSNSAVWDDDAVHEALTSAAFWTNGLPFVKSLSGHWKFFLAS   87 (275)
Q Consensus         8 ~~~~~~~~~~~~~w~~p~~~~~n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~LnG~W~F~~~~   87 (275)
                      +|..|.+....++||||+|+++|||||||+|+||++.++|+.         .         ..++..++|||.|+|++.+
T Consensus         4 ~~~~~~~~~~~~~wenp~v~~~nr~~~~a~~~~~~~~~~a~~---------~---------~~s~~~~sLnG~W~F~~~~   65 (1027)
T PRK09525          4 IMDSLAQILARRDWENPGVTQLNRLPAHPPFASWRNSEAARD---------D---------RPSQQRQSLNGEWRFSYFP   65 (1027)
T ss_pred             chhHHHhhhccCCccCccccCCCCCCCCCCcCCcCCHHHHhh---------c---------cCCcceEecCCCcceeECC
Confidence            345555544457999999999999999999999999985432         1         1245789999999999999


Q ss_pred             CCCCCCccccCCCCCCCCCeEeccCcccccccCCCCcccceeccCCCCCCCCcccCCcccEEEEEEcCCCCCCc-eEEEE
Q 023943           88 SPPDVPLNFHKSSFQDSKWEAIPVPSNWQMHGFDRPIYTNVVYPFPLDPPNVPAENPTGCYRTYFHIPKEWQGR-RILLH  166 (275)
Q Consensus        88 ~~~~~p~~~~~~~~d~~~W~~i~VP~~w~~~g~~~p~y~n~~yp~~~~pp~vp~~n~~g~Yrr~F~lp~~~~~~-~i~L~  166 (275)
                      .+.+.|++|+..++++  |++|+||++|+++|++.++|+|+.|||+.+||++|.+|++|||||+|++|++|+++ +++|+
T Consensus        66 ~~~~~~~~~~~~~~~~--w~~I~VP~~w~~~G~~~~~y~n~~ypf~~~~p~vp~~n~~gwYrr~F~vp~~w~~~~rv~L~  143 (1027)
T PRK09525         66 APEAVPESWLECDLPD--ADTIPVPSNWQLHGYDAPIYTNVTYPIPVNPPFVPEENPTGCYSLTFTVDESWLQSGQTRII  143 (1027)
T ss_pred             ChhhCcccccccCCCC--CcEeCCCCcHHhcCCCCCccccccCCCCCCCCCCCCcCCeEEEEEEEEeChhhcCCCeEEEE
Confidence            9988999999988865  99999999999999999999999999999999999889999999999999999887 99999


Q ss_pred             eCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecCCCCcccCCCCCccccccceeEEEEeC
Q 023943          167 FEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWSDGSYLEDQDHWWLSGIHRDVLLLAKP  246 (275)
Q Consensus       167 f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~dgs~ledqd~w~~~GI~RdV~L~~~p  246 (275)
                      |+||++.++|||||++||+|+|+|+||+||||++|++   | +|+|+|+|.+|++|+|+++||+|+++||||+|+|+++|
T Consensus       144 FeGV~~~a~VwvNG~~VG~~~g~~~pfefDIT~~l~~---G-~N~L~V~V~~~sdgs~~e~qd~w~~sGI~R~V~L~~~p  219 (1027)
T PRK09525        144 FDGVNSAFHLWCNGRWVGYSQDSRLPAEFDLSPFLRA---G-ENRLAVMVLRWSDGSYLEDQDMWRMSGIFRDVSLLHKP  219 (1027)
T ss_pred             ECeeccEEEEEECCEEEEeecCCCceEEEEChhhhcC---C-ccEEEEEEEecCCCCccccCCceeeccccceEEEEEcC
Confidence            9999999999999999999999999999999999999   9 59999999999999999999999999999999999999


Q ss_pred             CceEEeEEEEEeecCCeeEEEEEEEEe
Q 023943          247 QVFIADYFFKSNLAEDFSLADIQVNTC  273 (275)
Q Consensus       247 ~~~I~D~~v~t~ld~~~~~~~l~v~~~  273 (275)
                      ++||+|++|+++++.++++|+|+|++.
T Consensus       220 ~~~I~d~~v~t~l~~~~~~a~v~v~v~  246 (1027)
T PRK09525        220 TTQLSDFHITTELDDDFRRAVLEVEAQ  246 (1027)
T ss_pred             CcEEeeeEEEeeccCccceEEEEEEEE
Confidence            999999999999998887888877653


No 2  
>PRK10340 ebgA cryptic beta-D-galactosidase subunit alpha; Reviewed
Probab=100.00  E-value=1.8e-57  Score=473.94  Aligned_cols=233  Identities=37%  Similarity=0.743  Sum_probs=216.4

Q ss_pred             CCCCCCcccccccCCCCCCCccccCChhhhcccCCchhhHHHhhhcccccCCCCCcEEecCccceEEecCCCCCCCcccc
Q 023943           18 YKVWEDPSFIKWRKRDPHVTLRCHDSVEVSNSAVWDDDAVHEALTSAAFWTNGLPFVKSLSGHWKFFLASSPPDVPLNFH   97 (275)
Q Consensus        18 ~~~w~~p~~~~~n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~LnG~W~F~~~~~~~~~p~~~~   97 (275)
                      +++||||+++++|||||||+|+||.+.++|+.         ++++       .++.+++|||.|+|++.+.+...|++|+
T Consensus         1 ~~~wen~~~~~~nr~~~~a~~~~~~~~~~a~~---------~~~~-------~~~~~~~LnG~W~F~~~~~~~~~~~~f~   64 (1021)
T PRK10340          1 MNRWENIQLTHENRLAPRAYFFSYDSVAQART---------FARE-------TSSLFLLLSGQWNFHFFDHPLYVPEAFT   64 (1021)
T ss_pred             CCcccCccccCCCCCCCCCCcCCcCCHHHHhh---------cccc-------cCCceeecCcceeEEEeCCccccccccc
Confidence            36899999999999999999999999986542         2111       2468899999999999988888899999


Q ss_pred             CCCCCCCCCeEeccCcccccccCCCCcccceeccCCCCCCCCcccCCcccEEEEEEcCCCCCCceEEEEeCcccceeEEE
Q 023943           98 KSSFQDSKWEAIPVPSNWQMHGFDRPIYTNVVYPFPLDPPNVPAENPTGCYRTYFHIPKEWQGRRILLHFEAVDSAFCAW  177 (275)
Q Consensus        98 ~~~~d~~~W~~i~VP~~w~~~g~~~p~y~n~~yp~~~~pp~vp~~n~~g~Yrr~F~lp~~~~~~~i~L~f~gv~s~~~Vw  177 (275)
                      .+++  ++|++|+|||+|+++|++.|+|+|..|||+..||++|..|++|||||+|++|++|+|++++|+|+||++.++||
T Consensus        65 ~~~~--~~W~~I~VP~~w~~~g~~~~~y~n~~y~~~~~~P~vp~~n~~g~Yrr~F~lp~~~~gkrv~L~FeGV~s~a~Vw  142 (1021)
T PRK10340         65 SELM--SDWGHITVPAMWQMEGHGKLQYTDEGFPFPIDVPFVPSDNPTGAYQRTFTLSDGWQGKQTIIKFDGVETYFEVY  142 (1021)
T ss_pred             cCCC--CCCcEeecCCChhhcCCCCcccccccccCCCCCCCCCCcCCeEEEEEEEEeCcccccCcEEEEECccceEEEEE
Confidence            8887  67999999999999999999999999999999999998899999999999999999999999999999999999


Q ss_pred             EcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecCCCCcccCCCCCccccccceeEEEEeCCceEEeEEEEE
Q 023943          178 INGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWSDGSYLEDQDHWWLSGIHRDVLLLAKPQVFIADYFFKS  257 (275)
Q Consensus       178 vNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~dgs~ledqd~w~~~GI~RdV~L~~~p~~~I~D~~v~t  257 (275)
                      |||++||+|+|+|+||+||||++|+.   |+ |+|+|+|++|++++|+++||+|+++||||+|+|+++|++||+|++|++
T Consensus       143 vNG~~VG~~~g~~~pfefDIT~~l~~---G~-N~LaV~V~~~~d~s~le~qd~w~~sGI~R~V~L~~~p~~~I~d~~v~t  218 (1021)
T PRK10340        143 VNGQYVGFSKGSRLTAEFDISAMVKT---GD-NLLCVRVMQWADSTYLEDQDMWWLAGIFRDVYLVGKPLTHINDFTVRT  218 (1021)
T ss_pred             ECCEEeccccCCCccEEEEcchhhCC---Cc-cEEEEEEEecCCCCccccCCccccccccceEEEEEeCCceEEeeEEEe
Confidence            99999999999999999999999999   95 999999999999999999999999999999999999999999999999


Q ss_pred             eecCCeeEEEEEEEE
Q 023943          258 NLAEDFSLADIQVNT  272 (275)
Q Consensus       258 ~ld~~~~~~~l~v~~  272 (275)
                      +++.++++|+|+|++
T Consensus       219 ~l~~~~~~a~l~v~v  233 (1021)
T PRK10340        219 DFDEDYCDATLSCEV  233 (1021)
T ss_pred             eccCccCceEEEEEE
Confidence            999887778877765


No 3  
>PRK10150 beta-D-glucuronidase; Provisional
Probab=100.00  E-value=1.4e-34  Score=288.92  Aligned_cols=183  Identities=29%  Similarity=0.436  Sum_probs=152.0

Q ss_pred             CCcEEecCccceEEecCCCCCCCccccCCCCCCCCCeEeccCcccccccCCCCcccceeccCCCCCCCCcccCCcccEEE
Q 023943           71 LPFVKSLSGHWKFFLASSPPDVPLNFHKSSFQDSKWEAIPVPSNWQMHGFDRPIYTNVVYPFPLDPPNVPAENPTGCYRT  150 (275)
Q Consensus        71 ~~~~~~LnG~W~F~~~~~~~~~p~~~~~~~~d~~~W~~i~VP~~w~~~g~~~p~y~n~~yp~~~~pp~vp~~n~~g~Yrr  150 (275)
                      ++..++|||.|+|+..+.+.+.+++|+...++.  +..|.||++|+.++.+.+..               ...+.+||||
T Consensus         9 ~r~~~~Lng~W~F~~~~~~~~~~~~w~~~~~~~--~~~i~vP~~~~~~~~~~~~~---------------~~~G~~WYrr   71 (604)
T PRK10150          9 TREIKDLSGLWAFKLDRENCGIDQRWWESALPE--SRAMAVPGSFNDQFADADIR---------------NYVGDVWYQR   71 (604)
T ss_pred             CeeeeecCCccceEECCccccccccccccCCCC--CcEecCCCchhhcccccccc---------------CCcccEEEEE
Confidence            356789999999999887766677787765543  35899999998876433211               1257899999


Q ss_pred             EEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecC------CCCc
Q 023943          151 YFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWS------DGSY  224 (275)
Q Consensus       151 ~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~------dgs~  224 (275)
                      +|++|+.|++++++|+|+||++.++|||||++||+|+|+|+||+||||++|+.   |++|+|+|+|.+..      .|++
T Consensus        72 ~f~lp~~~~gk~v~L~Fegv~~~a~V~lNG~~vg~~~~~~~~f~~DIT~~l~~---G~~n~L~V~v~n~~~~~~~p~g~~  148 (604)
T PRK10150         72 EVFIPKGWAGQRIVLRFGSVTHYAKVWVNGQEVMEHKGGYTPFEADITPYVYA---GKSVRITVCVNNELNWQTLPPGNV  148 (604)
T ss_pred             EEECCcccCCCEEEEEECcccceEEEEECCEEeeeEcCCccceEEeCchhccC---CCceEEEEEEecCCCcccCCCCcc
Confidence            99999999999999999999999999999999999999999999999999999   86569999998642      2444


Q ss_pred             ccC----------CCCCccccccceeEEEEeCCceEEeEEEEEeecCCeeEEEEEEEEe
Q 023943          225 LED----------QDHWWLSGIHRDVLLLAKPQVFIADYFFKSNLAEDFSLADIQVNTC  273 (275)
Q Consensus       225 led----------qd~w~~~GI~RdV~L~~~p~~~I~D~~v~t~ld~~~~~~~l~v~~~  273 (275)
                      .++          +|+|.++||||+|+|+++|++||+|++|+++++.+.+.|+|+|++.
T Consensus       149 ~~~~~~~~k~~~~~d~~~~~GI~r~V~L~~~~~~~i~dv~v~~~~~~~~~~a~v~v~v~  207 (604)
T PRK10150        149 IEDGNGKKKQKYNFDFFNYAGIHRPVMLYTTPKTHIDDITVVTELAQDLNHASVDWSVE  207 (604)
T ss_pred             ccCCccccccccccccccccCCCceEEEEEcCCccCceEEEEeecCCcCceEEEEEEEE
Confidence            432          5677899999999999999999999999999987767777776653


No 4  
>PF02837 Glyco_hydro_2_N:  Glycosyl hydrolases family 2, sugar binding domain;  InterPro: IPR006104 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 2 GH2 from CAZY comprises enzymes with several known activities; beta-galactosidase (3.2.1.23 from EC); beta-mannosidase (3.2.1.25 from EC); beta-glucuronidase (3.2.1.31 from EC). These enzymes contain a conserved glutamic acid residue which has been shown, in Escherichia coli lacZ (P00722 from SWISSPROT), to be the general acid/base catalyst in the active site of the enzyme.  This domain has a jelly-roll fold.; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 3DEC_A 3OB8_A 3OBA_A 3CMG_A 3FN9_C 2VZU_A 2X09_A 2VZO_A 2X05_A 2VZV_B ....
Probab=100.00  E-value=9.2e-35  Score=244.55  Aligned_cols=165  Identities=38%  Similarity=0.663  Sum_probs=135.8

Q ss_pred             CcEEecCccceEEecCCCCCCCcc-ccCCCCCCCCCeEeccCcccccccCCCCcccceeccCCCCCCCCcccCCcccEEE
Q 023943           72 PFVKSLSGHWKFFLASSPPDVPLN-FHKSSFQDSKWEAIPVPSNWQMHGFDRPIYTNVVYPFPLDPPNVPAENPTGCYRT  150 (275)
Q Consensus        72 ~~~~~LnG~W~F~~~~~~~~~p~~-~~~~~~d~~~W~~i~VP~~w~~~g~~~p~y~n~~yp~~~~pp~vp~~n~~g~Yrr  150 (275)
                      +..++|||.|+|+........+.. +....++++.|..|.||++|+..++......+       ..+......+.+||||
T Consensus         2 r~~~~Lng~W~f~~~~~~~~~~~~~~~~~~~~~~~w~~i~VP~~~~~~~~~~~~~~~-------~~~~~~~~~~~~wYr~   74 (167)
T PF02837_consen    2 RQVISLNGQWQFQPDDSPQDRPEGWFSWPDFDDSDWQPISVPGSWEDDLLRAFVPEN-------GDPELWDYSGYAWYRR   74 (167)
T ss_dssp             TCEEESSEEEEEEEESSGGGSCTHHCCSTTCCCTTSEEEEESSEGTCCTSSTBTTST-------TGCCTSTCCSEEEEEE
T ss_pred             CcEEECCccCCEEEeCCcccCccccccccccCcCCCeEEeCCCEeecCccceecccc-------ccccccccCceEEEEE
Confidence            578999999999999887665555 33446778899999999999987543210000       0001112378899999


Q ss_pred             EEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecCCCCcccCCCC
Q 023943          151 YFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWSDGSYLEDQDH  230 (275)
Q Consensus       151 ~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~dgs~ledqd~  230 (275)
                      +|++|++|+++++.|+|+||++.++|||||++||.+.++|+|++||||++|++   |++|+|+|+|.+..++++++.+++
T Consensus        75 ~f~lp~~~~~~~~~L~f~gv~~~a~v~vNG~~vg~~~~~~~~~~~dIt~~l~~---g~~N~l~V~v~~~~~~~~~~~~~~  151 (167)
T PF02837_consen   75 TFTLPADWKGKRVFLRFEGVDYAAEVYVNGKLVGSHEGGYTPFEFDITDYLKP---GEENTLAVRVDNWPDGSTIPGFDY  151 (167)
T ss_dssp             EEEESGGGTTSEEEEEESEEESEEEEEETTEEEEEEESTTS-EEEECGGGSSS---EEEEEEEEEEESSSGGGCGBSSSE
T ss_pred             EEEeCchhcCceEEEEeccceEeeEEEeCCeEEeeeCCCcCCeEEeChhhccC---CCCEEEEEEEeecCCCceeecCcC
Confidence            99999999999999999999999999999999999999999999999999999   855999999999998888888888


Q ss_pred             CccccccceeEEEEeC
Q 023943          231 WWLSGIHRDVLLLAKP  246 (275)
Q Consensus       231 w~~~GI~RdV~L~~~p  246 (275)
                      +.++||||+|+|+++|
T Consensus       152 ~~~~GI~r~V~L~~~p  167 (167)
T PF02837_consen  152 FNYAGIWRPVWLEATP  167 (167)
T ss_dssp             EE--EEESEEEEEEEE
T ss_pred             CccCccccEEEEEEEC
Confidence            8999999999999986


No 5  
>COG3250 LacZ Beta-galactosidase/beta-glucuronidase [Carbohydrate transport and metabolism]
Probab=99.96  E-value=7.1e-30  Score=259.20  Aligned_cols=196  Identities=41%  Similarity=0.640  Sum_probs=180.5

Q ss_pred             CCcEEecCccceEEecCCCCCCCccccCCCCCCCCCeEeccCccccccc-CCCCcccceeccCCCCCCCCcccCCcccEE
Q 023943           71 LPFVKSLSGHWKFFLASSPPDVPLNFHKSSFQDSKWEAIPVPSNWQMHG-FDRPIYTNVVYPFPLDPPNVPAENPTGCYR  149 (275)
Q Consensus        71 ~~~~~~LnG~W~F~~~~~~~~~p~~~~~~~~d~~~W~~i~VP~~w~~~g-~~~p~y~n~~yp~~~~pp~vp~~n~~g~Yr  149 (275)
                      ++..++|||.|.|++.+.+..+|..|.....++..  .|.||++|++++ ++.++|+|..||++..+|.++..++++.|.
T Consensus         9 ~~~~~~L~G~W~f~~~~~~~~~~~~w~~~~~s~~~--~i~VP~~w~~~~~~~~~~~~~~~y~~~~~~~~~~~~~~~~l~f   86 (808)
T COG3250           9 SREIKSLNGLWAFSLDDEPCAVPQRWPESLLSESR--AIAVPGNWQDQGEYDRPIYTNVWYPREVFPPKVPAGNRIGLYF   86 (808)
T ss_pred             ccceeccCCceeEEecCCccccccccchhhhhhcc--CccCCccHhhcCccCcceecceeeeecccCCccccCCceEEEE
Confidence            34678999999999998888899999877665544  899999999999 999999999999999999998889999999


Q ss_pred             EEEEcCCCC-CCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecCCCCcccCC
Q 023943          150 TYFHIPKEW-QGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWSDGSYLEDQ  228 (275)
Q Consensus       150 r~F~lp~~~-~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~dgs~ledq  228 (275)
                      +.|++.++| .+..++|.|+|+.+.++|||||+.||++.+++.++++|||+++++   |. |.+++.|.+|++++++++|
T Consensus        87 ~~~~~~~~v~~ng~~~l~~eg~~~~fev~vng~~v~~~~~~~~~~~~dis~~~~~---~~-~~~~~~v~~~~~~~~~~~~  162 (808)
T COG3250          87 DAVDTLAKVWLNGQEVLEFQGVYTPFEVDVTGPYVGGGKDSRITVEFDISPNLQT---GP-NGLVVTVENWSKGSYYEDQ  162 (808)
T ss_pred             eccccceeEEeCCeEEEEecCceeEEEEeeccceecCCcceEEEEeecccccccc---CC-ccCceEEeccCCCCCcccc
Confidence            999998876 567999999999999999999999999999999999999999999   84 9999999999999999999


Q ss_pred             CCCccccccceeEEEEeCCceEEeEEEEEeecCCeeEEEEEEEE
Q 023943          229 DHWWLSGIHRDVLLLAKPQVFIADYFFKSNLAEDFSLADIQVNT  272 (275)
Q Consensus       229 d~w~~~GI~RdV~L~~~p~~~I~D~~v~t~ld~~~~~~~l~v~~  272 (275)
                      ||||++||+|||+|+.+|.+||.|++|.|+++.....+.+.+++
T Consensus       163 d~~r~aGi~RdV~l~i~p~~~~~di~V~t~~~~~~~~~~~~~~~  206 (808)
T COG3250         163 DFFRYAGIHRDVMLYITPNTHVDDITVVTHLAEDCNHASLDVKI  206 (808)
T ss_pred             CeeecccccceeEEEEccceeEeeeEEEEecchhhhhhheeehe
Confidence            99999999999999999999999999999998888877777433


No 6  
>KOG2024 consensus Beta-Glucuronidase GUSB (glycosylhydrolase superfamily 2) [Carbohydrate transport and metabolism]
Probab=99.65  E-value=4.1e-16  Score=138.08  Aligned_cols=176  Identities=25%  Similarity=0.399  Sum_probs=134.3

Q ss_pred             CCCcEEecCccceEEecCCCCCC---CccccCCCCCCCCCeEeccCcccccccCCCCcccceeccCCCCCCCCcccCCcc
Q 023943           70 GLPFVKSLSGHWKFFLASSPPDV---PLNFHKSSFQDSKWEAIPVPSNWQMHGFDRPIYTNVVYPFPLDPPNVPAENPTG  146 (275)
Q Consensus        70 ~~~~~~~LnG~W~F~~~~~~~~~---p~~~~~~~~d~~~W~~i~VP~~w~~~g~~~p~y~n~~yp~~~~pp~vp~~n~~g  146 (275)
                      +++...+|+|-|.|..+.+....   -+.|+...+  ..-..|+||++++..|.+.+..               +.-+..
T Consensus        28 pire~~~ldgLw~f~r~~~~~~~~g~~~~w~~~~~--~~t~~mpvpss~nDi~~d~~lr---------------dfv~~~   90 (297)
T KOG2024|consen   28 PIREVKSLDGLWSFVRDSNQNRLQGILEQWENKES--GPTQDMPVPSSFNDIGQDWRLR---------------DFVGLV   90 (297)
T ss_pred             cchhhhhhCcchhcccCcccccchhHHhhhccccc--ccccccccccchhccccCCccc---------------cceeee
Confidence            46778899999999987764332   345655332  1225689999998877554221               124567


Q ss_pred             cEEEEEEcCCCC---CCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecCCCC
Q 023943          147 CYRTYFHIPKEW---QGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWSDGS  223 (275)
Q Consensus       147 ~Yrr~F~lp~~~---~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~dgs  223 (275)
                      ||.|++.+|+.|   .++++.|||..+++.|.|||||..+-.|++++.|++-+|...++.   |..|.+--....| .+.
T Consensus        91 wyer~v~vpe~w~~~~~~r~vlr~~s~H~~Aivwvng~~~~~h~gg~lP~~~~is~~~~~---g~~~~~dn~L~~~-t~~  166 (297)
T KOG2024|consen   91 WYERTVTVPESWTQDLGKRVVLRIGSAHSYAIVWVNGVDALEHEGGHLPLEPDISALVFF---GPLPAIDNNLLSW-TGP  166 (297)
T ss_pred             EEEEEEEcchhhhhhcCCeEEEEeecccceeEEEEcceeecccccCccccchhhhhhhhc---cccccccCccccc-ccC
Confidence            999999999998   478999999999999999999999999999999999999998888   7656222122222 122


Q ss_pred             cccCCCCCccccccceeEEEEeCCceEEeEEEEEeecCCeeEE
Q 023943          224 YLEDQDHWWLSGIHRDVLLLAKPQVFIADYFFKSNLAEDFSLA  266 (275)
Q Consensus       224 ~ledqd~w~~~GI~RdV~L~~~p~~~I~D~~v~t~ld~~~~~~  266 (275)
                      -..+.|+++++||.|+|-|+.+|.++|+|+.|.+.+..+...|
T Consensus       167 ~~~~~dffnYag~~~sv~l~t~p~vyi~~~~v~t~l~~~~~~a  209 (297)
T KOG2024|consen  167 NSFCFDFFNYAGEQRSVCLYTTPVVYIEDITVTTGLPHDSGCA  209 (297)
T ss_pred             CcccccCCCchhhheeeeeccCCeEEecCcceeeccccCCcce
Confidence            2346689999999999999999999999999999887765433


No 7  
>KOG2230 consensus Predicted beta-mannosidase [Carbohydrate transport and metabolism]
Probab=99.19  E-value=1.4e-10  Score=112.37  Aligned_cols=169  Identities=20%  Similarity=0.258  Sum_probs=117.8

Q ss_pred             cEEecCccceEEecCCCCCCCccccCCCCCCCCCeEeccCcccccccCCCCcccceeccC-CCCCCCCcccCCcccEEEE
Q 023943           73 FVKSLSGHWKFFLASSPPDVPLNFHKSSFQDSKWEAIPVPSNWQMHGFDRPIYTNVVYPF-PLDPPNVPAENPTGCYRTY  151 (275)
Q Consensus        73 ~~~~LnG~W~F~~~~~~~~~p~~~~~~~~d~~~W~~i~VP~~w~~~g~~~p~y~n~~yp~-~~~pp~vp~~n~~g~Yrr~  151 (275)
                      ...+|.|.|.|.-....-.               .+..|||+.....|..-+..|-.|-+ ..+-.++.  ...+.|.|+
T Consensus        21 ~t~~l~gnw~~~~~n~t~~---------------~~g~vpg~i~s~l~~~gii~~~~~~~n~ln~kwia--~d~wtysr~   83 (867)
T KOG2230|consen   21 NTLVLAGNWEFSSSNKTVN---------------GTGTVPGDIYSDLYASGIIDNPLFGENHLNLKWIA--EDDWTYSRK   83 (867)
T ss_pred             eeEEEecceEEecCCCcee---------------cCCCCCchHhHHHHhcccccCccccccccceeEEe--ccCccceee
Confidence            4567999999997654211               24578998766544332222222222 12333343  234679999


Q ss_pred             EEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecC-------C---
Q 023943          152 FHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWS-------D---  221 (275)
Q Consensus       152 F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~-------d---  221 (275)
                      |.|=+--+-..++|.+||||+.+.||+||+.|+.+.++|.|+.|+||..+.    | +|.|+++.....       +   
T Consensus        84 frl~dl~~~~~~~l~ie~vdtia~v~~n~~~v~~s~n~f~~y~~~vt~ii~----~-~n~i~~~f~ssv~yA~~~~~~~~  158 (867)
T KOG2230|consen   84 FRLIDLDDTVGAFLEIESVDTIATVYVNGQKVLHSRNQFLPYHVNVTDIIA----G-ENDITIKFKSSVKYAEKRADEYK  158 (867)
T ss_pred             eEEEEccccccceEEEeecceeEEEEEccEEEeeccccceeEEEeEEEEec----C-CcceEEEeehhHHHHHHHHHhhh
Confidence            988332234678999999999999999999999999999999999999765    5 599999887531       0   


Q ss_pred             -CC---------c-ccC--------CC--CC------ccccccceeEEEEeCCceEEeEEEEEeecCCe
Q 023943          222 -GS---------Y-LED--------QD--HW------WLSGIHRDVLLLAKPQVFIADYFFKSNLAEDF  263 (275)
Q Consensus       222 -gs---------~-led--------qd--~w------~~~GI~RdV~L~~~p~~~I~D~~v~t~ld~~~  263 (275)
                       -+         | -||        |.  .|      ...||+.+|.|....-.++.|+.+++..+...
T Consensus       159 k~svPPdC~p~iyhGECH~NfiRK~Q~SFsWDWGPsfPt~GI~k~v~i~iY~~~~~~~f~~~~~~~~g~  227 (867)
T KOG2230|consen  159 KHSLPPDCNPDIYHGECHQNFIRKAQYSFAWDWGPSFPTVGIPSTITINIYRGQYFHDFNWKTRFAHGK  227 (867)
T ss_pred             ccCCCCCCCchhhccchHHHHHHHhhcceecccCCCCccCCCCcceEEEEEeeeEEEeeceeeeeecce
Confidence             00         0 011        22  23      25999999999999999999999998877653


No 8  
>PF13364 BetaGal_dom4_5:  Beta-galactosidase jelly roll domain; PDB: 1TG7_A 1XC6_A 3OGS_A 3OGV_A 3OGR_A 3OG2_A.
Probab=97.95  E-value=4.9e-05  Score=60.20  Aligned_cols=71  Identities=18%  Similarity=0.149  Sum_probs=48.2

Q ss_pred             cCCcccEEEEEEcCCCCCCceEE-EEe-CcccceeEEEEcCEEeeeec-CCCCCceecccc-ccccCCCCCceEEEEEEE
Q 023943          142 ENPTGCYRTYFHIPKEWQGRRIL-LHF-EAVDSAFCAWINGVPVGYSQ-DSRLPAEFEISD-YCYPHGSDKKNVLAVQVF  217 (275)
Q Consensus       142 ~n~~g~Yrr~F~lp~~~~~~~i~-L~f-~gv~s~~~VwvNG~~VG~~~-~~~~p~efdIT~-~Lk~~~~G~eN~L~V~V~  217 (275)
                      ..+..|||.+|.....  ...+. |.. .|-...+.|||||+++|... +.-....|.|+. .|+.   + .|.|+|.+.
T Consensus        33 ~~g~~~Yrg~F~~~~~--~~~~~~l~~~~g~~~~~~vwVNG~~~G~~~~~~g~q~tf~~p~~il~~---~-n~v~~vl~~  106 (111)
T PF13364_consen   33 HAGYLWYRGTFTGTGQ--DTSLTPLNIQGGNAFRASVWVNGWFLGSYWPGIGPQTTFSVPAGILKY---G-NNVLVVLWD  106 (111)
T ss_dssp             SSCEEEEEEEEETTTE--EEEEE-EEECSSTTEEEEEEETTEEEEEEETTTECCEEEEE-BTTBTT---C-EEEEEEEEE
T ss_pred             CCCCEEEEEEEeCCCc--ceeEEEEeccCCCceEEEEEECCEEeeeecCCCCccEEEEeCceeecC---C-CEEEEEEEe
Confidence            3678999999964221  13444 444 36677899999999999977 444447788877 6776   6 467777666


Q ss_pred             e
Q 023943          218 R  218 (275)
Q Consensus       218 ~  218 (275)
                      +
T Consensus       107 ~  107 (111)
T PF13364_consen  107 N  107 (111)
T ss_dssp             -
T ss_pred             C
Confidence            4


No 9  
>COG3250 LacZ Beta-galactosidase/beta-glucuronidase [Carbohydrate transport and metabolism]
Probab=97.93  E-value=1.6e-05  Score=82.28  Aligned_cols=70  Identities=27%  Similarity=0.245  Sum_probs=62.9

Q ss_pred             cccEEEEEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEe
Q 023943          145 TGCYRTYFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFR  218 (275)
Q Consensus       145 ~g~Yrr~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~  218 (275)
                      ..+|.++|.+|....++++.|.|++++.-++||+||+.++.++|+|++|+++|+.-+..   | +|.+.+.+..
T Consensus        64 ~~~y~~~~~~~~~~~~~~~~l~f~~~~~~~~v~~ng~~~l~~eg~~~~fev~vng~~v~---~-~~~~~~~~~~  133 (808)
T COG3250          64 NVWYPREVFPPKVPAGNRIGLYFDAVDTLAKVWLNGQEVLEFQGVYTPFEVDVTGPYVG---G-GKDSRITVEF  133 (808)
T ss_pred             ceeeeecccCCccccCCceEEEEeccccceeEEeCCeEEEEecCceeEEEEeeccceec---C-CcceEEEEee
Confidence            46799999999888899999999999999999999999999999999999999975555   5 5888888876


No 10 
>PLN03059 beta-galactosidase; Provisional
Probab=97.76  E-value=0.00015  Score=74.78  Aligned_cols=94  Identities=19%  Similarity=0.154  Sum_probs=72.1

Q ss_pred             CCcccEEEEEEcCCC---C-CCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceecccc--ccccCCCCCceEEEEEE
Q 023943          143 NPTGCYRTYFHIPKE---W-QGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISD--YCYPHGSDKKNVLAVQV  216 (275)
Q Consensus       143 n~~g~Yrr~F~lp~~---~-~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~--~Lk~~~~G~eN~L~V~V  216 (275)
                      .+..|||++|.++.+   | .+....|++..+...+.|||||+++|...+...-..|.+..  -|+.   |. |+|.|.+
T Consensus       469 ~dYlwY~t~i~~~~~~~~~~~~~~~~L~v~~~~d~~~vFVNg~~~Gt~~~~~~~~~~~~~~~v~l~~---g~-n~L~iLs  544 (840)
T PLN03059        469 TDYLWYMTEVHIDPDEGFLKTGQYPVLTIFSAGHALHVFINGQLAGTVYGELSNPKLTFSQNVKLTV---GI-NKISLLS  544 (840)
T ss_pred             CceEEEEEEEeecCCccccccCCCceEEEcccCcEEEEEECCEEEEEEEeecCCcceEEecccccCC---Cc-eEEEEEE
Confidence            667899999998764   2 25667899999999999999999999976655444454443  3667   84 9999999


Q ss_pred             EecC---CCCcccCCCCCccccccceeEEEE
Q 023943          217 FRWS---DGSYLEDQDHWWLSGIHRDVLLLA  244 (275)
Q Consensus       217 ~~~~---dgs~ledqd~w~~~GI~RdV~L~~  244 (275)
                      .+--   -|.++|.+    ..||.++|.|..
T Consensus       545 e~vG~~NyG~~le~~----~kGI~g~V~i~g  571 (840)
T PLN03059        545 VAVGLPNVGLHFETW----NAGVLGPVTLKG  571 (840)
T ss_pred             EeCCCCccCcccccc----cccccccEEEec
Confidence            9753   26667644    499999999965


No 11 
>PF08531 Bac_rhamnosid_N:  Alpha-L-rhamnosidase N-terminal domain;  InterPro: IPR013737 This domain is found in bacterial rhamnosidase A and B enzymes and is probably involved in substrate recognition. ; PDB: 2OKX_B.
Probab=97.62  E-value=0.00012  Score=62.41  Aligned_cols=53  Identities=25%  Similarity=0.346  Sum_probs=35.6

Q ss_pred             ceEEEEeCcccceeEEEEcCEEeeeec---C--CCCC----ceeccccccccCCCCCceEEEEEEEe
Q 023943          161 RRILLHFEAVDSAFCAWINGVPVGYSQ---D--SRLP----AEFEISDYCYPHGSDKKNVLAVQVFR  218 (275)
Q Consensus       161 ~~i~L~f~gv~s~~~VwvNG~~VG~~~---~--~~~p----~efdIT~~Lk~~~~G~eN~L~V~V~~  218 (275)
                      ++..|++-+ +..+++||||+.||...   +  .|.-    -.+|||++|+.   | +|.|+|.|.+
T Consensus         4 ~~A~l~isa-~g~Y~l~vNG~~V~~~~l~P~~t~y~~~~~Y~tyDVt~~L~~---G-~N~iav~lg~   65 (172)
T PF08531_consen    4 RSARLYISA-LGRYELYVNGERVGDGPLAPGWTDYDKRVYYQTYDVTPYLRP---G-ENVIAVWLGN   65 (172)
T ss_dssp             ---EEEEEE-ESEEEEEETTEEEEEE--------BTTEEEEEEEE-TTT--T---T-EEEEEEEEEE
T ss_pred             eEEEEEEEe-CeeEEEEECCEEeeCCccccccccCCCceEEEEEeChHHhCC---C-CCEEEEEEeC
Confidence            356677766 56899999999999754   1  1111    36899999999   9 5999999986


No 12 
>PLN03059 beta-galactosidase; Provisional
Probab=96.07  E-value=0.011  Score=61.48  Aligned_cols=66  Identities=26%  Similarity=0.439  Sum_probs=50.0

Q ss_pred             CCcccEEEEEEcCCCCCCce-EEEEeCcccceeEEEEcCEEeeeecCC---------------C----------CC--ce
Q 023943          143 NPTGCYRTYFHIPKEWQGRR-ILLHFEAVDSAFCAWINGVPVGYSQDS---------------R----------LP--AE  194 (275)
Q Consensus       143 n~~g~Yrr~F~lp~~~~~~~-i~L~f~gv~s~~~VwvNG~~VG~~~~~---------------~----------~p--~e  194 (275)
                      .+..||+.+|++|+   +.. ++|.+.| .....|||||+-||.-...               |          -|  .-
T Consensus       618 ~p~twYK~~Fd~p~---g~Dpv~LDm~g-mGKG~aWVNG~nIGRYW~~~a~~~gC~~c~y~g~~~~~kc~~~cggP~q~l  693 (840)
T PLN03059        618 QPLTWYKTTFDAPG---GNDPLALDMSS-MGKGQIWINGQSIGRHWPAYTAHGSCNGCNYAGTFDDKKCRTNCGEPSQRW  693 (840)
T ss_pred             CCceEEEEEEeCCC---CCCCEEEeccc-CCCeeEEECCcccccccccccccCCCccccccccccchhhhccCCCceeEE
Confidence            44789999999986   454 9999999 6789999999999975411               1          22  23


Q ss_pred             ecccc-ccccCCCCCceEEEEEE
Q 023943          195 FEISD-YCYPHGSDKKNVLAVQV  216 (275)
Q Consensus       195 fdIT~-~Lk~~~~G~eN~L~V~V  216 (275)
                      +.|+. +||+   |+ |+|+|-=
T Consensus       694 YHVPr~~Lk~---g~-N~lViFE  712 (840)
T PLN03059        694 YHVPRSWLKP---SG-NLLIVFE  712 (840)
T ss_pred             EeCcHHHhcc---CC-ceEEEEE
Confidence            56776 9999   85 9887753


No 13 
>KOG0496 consensus Beta-galactosidase [Carbohydrate transport and metabolism]
Probab=95.88  E-value=0.026  Score=56.78  Aligned_cols=75  Identities=20%  Similarity=0.191  Sum_probs=57.7

Q ss_pred             eEEEEeC-cccceeEEEEcCEEeeeecCCCCCceecccc--ccccCCCCCceEEEEEEEecC---CCCcccCCCCCcccc
Q 023943          162 RILLHFE-AVDSAFCAWINGVPVGYSQDSRLPAEFEISD--YCYPHGSDKKNVLAVQVFRWS---DGSYLEDQDHWWLSG  235 (275)
Q Consensus       162 ~i~L~f~-gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~--~Lk~~~~G~eN~L~V~V~~~~---dgs~ledqd~w~~~G  235 (275)
                      ...|.+. ++..+.+|||||+++|...+.+.-..+.+..  -|+.   | +|.|++.+.+--   -| +.|.    +..|
T Consensus       434 ~t~~~i~ls~g~~~hVfvNg~~~G~~~g~~~~~~~~~~~~~~l~~---g-~n~l~iL~~~~G~~n~G-~~e~----~~~G  504 (649)
T KOG0496|consen  434 TTSLKIPLSLGHALHVFVNGEFAGSLHGNNEKIKLNLSQPVGLKA---G-ENKLALLSENVGLPNYG-HFEN----DFKG  504 (649)
T ss_pred             CceEeecccccceEEEEECCEEeeeEeccccceeEEeeccccccc---C-cceEEEEEEecCCCCcC-cccc----cccc
Confidence            4567777 9999999999999999998887666666554  4567   8 599999998752   24 3333    2589


Q ss_pred             ccceeEEEEe
Q 023943          236 IHRDVLLLAK  245 (275)
Q Consensus       236 I~RdV~L~~~  245 (275)
                      |.++|+|...
T Consensus       505 i~g~v~l~g~  514 (649)
T KOG0496|consen  505 ILGPVYLNGL  514 (649)
T ss_pred             cccceEEeee
Confidence            9999999876


No 14 
>KOG0496 consensus Beta-galactosidase [Carbohydrate transport and metabolism]
Probab=94.62  E-value=0.058  Score=54.32  Aligned_cols=68  Identities=24%  Similarity=0.370  Sum_probs=52.6

Q ss_pred             CCcccEEEEEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCCC-ceecccc-ccccCCCCCceEEEEEEEe
Q 023943          143 NPTGCYRTYFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLP-AEFEISD-YCYPHGSDKKNVLAVQVFR  218 (275)
Q Consensus       143 n~~g~Yrr~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p-~efdIT~-~Lk~~~~G~eN~L~V~V~~  218 (275)
                      .|.-||. .|++|+.  ...+.|.+.| -....|||||+-||+..-++-| ..+-|+. +||+   |+ |.|+|-=..
T Consensus       556 ~P~~w~k-~f~~p~g--~~~t~Ldm~g-~GKG~vwVNG~niGRYW~~~G~Q~~yhvPr~~Lk~---~~-N~lvvfEee  625 (649)
T KOG0496|consen  556 QPLTWYK-TFDIPSG--SEPTALDMNG-WGKGQVWVNGQNIGRYWPSFGPQRTYHVPRSWLKP---SG-NLLVVFEEE  625 (649)
T ss_pred             CCeEEEE-EecCCCC--CCCeEEecCC-CcceEEEECCcccccccCCCCCceEEECcHHHhCc---CC-ceEEEEEec
Confidence            4567787 9999986  4479999999 6789999999999987655443 4566776 8999   84 988775443


No 15 
>PF07691 PA14:  PA14 domain;  InterPro: IPR011658 The PA14 domain forms an insert in bacterial beta-glucosidases, other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins and bacterial toxins, including anthrax protective antigen (PA). The domain also occurs in a Dictyostelium pre-spore cell-inducing factor Psi and in fibrocystin, the mammalian protein whose mutation leads to polycystic kidney and hepatic disease. The crystal structure of PA shows that this domain (named PA14 after its location in the PA20 pro-peptide) has a beta-barrel structure. The PA14 domain sequence suggests a binding function, rather than a catalytic role. The PA14 domain distribution is compatible with carbohydrate binding.; PDB: 2XVG_A 2XVK_A 2XVL_A 2XJU_A 2XJT_A 2XJQ_A 2XJS_A 2XJV_A 2XJP_A 2XJR_A ....
Probab=92.49  E-value=1.2  Score=35.55  Aligned_cols=71  Identities=17%  Similarity=0.206  Sum_probs=48.5

Q ss_pred             CCcccEEEEEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCC-------CceeccccccccCCCCCceEEEEE
Q 023943          143 NPTGCYRTYFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRL-------PAEFEISDYCYPHGSDKKNVLAVQ  215 (275)
Q Consensus       143 n~~g~Yrr~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~-------p~efdIT~~Lk~~~~G~eN~L~V~  215 (275)
                      +-...++-.|.+|..   ....+.+.+ +..++|||||+.|..+.+...       +....-+-.|.+   |+...|.|.
T Consensus        45 ~~~~~~~G~~~~~~~---G~y~f~~~~-~d~~~l~idg~~vid~~~~~~~~~~~~~~~~~~~~v~l~~---g~~y~i~i~  117 (145)
T PF07691_consen   45 NFSVRWTGYFKPPET---GTYTFSLTS-DDGARLWIDGKLVIDNWGNQGGGFFNSGPSSTSGTVTLEA---GGKYPIRIE  117 (145)
T ss_dssp             SEEEEEEEEEEESSS---EEEEEEEEE-SSEEEEEETTEEEEECSCTTTSTTTTTSBCCEEEEEEE-T---T-EEEEEEE
T ss_pred             eEEEEEEEEEecccC---ceEEEEEEe-cccEEEEECCEEEEcCCccccccccccccceEEEEEEeeC---CeeEEEEEE
Confidence            445678889998864   567777774 668999999999988765433       233333334555   656899999


Q ss_pred             EEecC
Q 023943          216 VFRWS  220 (275)
Q Consensus       216 V~~~~  220 (275)
                      ..+..
T Consensus       118 y~~~~  122 (145)
T PF07691_consen  118 YFNRG  122 (145)
T ss_dssp             EEECS
T ss_pred             EEECC
Confidence            88864


No 16 
>PF14683 CBM-like:  Polysaccharide lyase family 4, domain III; PDB: 1NKG_A 2XHN_B 3NJX_A 3NJV_A.
Probab=91.79  E-value=0.37  Score=40.90  Aligned_cols=69  Identities=14%  Similarity=0.174  Sum_probs=38.7

Q ss_pred             ccEEEEEEcCCCCCCc--eEEEEeCcc--cceeEEEEcCEEeee-----------------ecCCCCCceecccc-cccc
Q 023943          146 GCYRTYFHIPKEWQGR--RILLHFEAV--DSAFCAWINGVPVGY-----------------SQDSRLPAEFEISD-YCYP  203 (275)
Q Consensus       146 g~Yrr~F~lp~~~~~~--~i~L~f~gv--~s~~~VwvNG~~VG~-----------------~~~~~~p~efdIT~-~Lk~  203 (275)
                      +-.+-+|.+++...++  .+.|.+-+.  .....|.||| .++.                 +.|-+.-++|+|+. .|+.
T Consensus        63 ~~w~I~F~l~~~~~~~~~tL~i~la~a~~~~~~~V~vNg-~~~~~~~~~~~~d~~~~r~g~~~G~~~~~~~~ipa~~L~~  141 (167)
T PF14683_consen   63 GTWTIKFDLDAVQLAGTYTLRIALAGASAGGRLQVSVNG-WSGPFPSAPFGNDNAIYRSGIHRGNYRLYEFDIPASLLKA  141 (167)
T ss_dssp             --EEEEEEE-GGG-S--EEEEEEEEEEETT-EEEEEETT-EE-----------S--GGGT---S---EEEEEE-TTSS-S
T ss_pred             CCEEEEEECCCCccCCcEEEEEEeccccCCCCEEEEEcC-ccCCccccccCCCCceeeCceecccEEEEEEEEcHHHEEe
Confidence            5677788887765333  333334333  3467899999 4432                 22566788999987 8898


Q ss_pred             CCCCCceEEEEEEEec
Q 023943          204 HGSDKKNVLAVQVFRW  219 (275)
Q Consensus       204 ~~~G~eN~L~V~V~~~  219 (275)
                         | +|+|.+.+.+.
T Consensus       142 ---G-~Nti~lt~~~g  153 (167)
T PF14683_consen  142 ---G-ENTITLTVPSG  153 (167)
T ss_dssp             ---E-EEEEEEEEE-S
T ss_pred             ---c-cEEEEEEEccC
Confidence               9 59999999974


No 17 
>PF03170 BcsB:  Bacterial cellulose synthase subunit;  InterPro: IPR018513 An operon encoding 4 proteins required for bacterial cellulose biosynthesis (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation with strains lacking cellulose synthase activity. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinum. The calculated molecular mass of the protein encoded by bcsB is 85.3kDa. BcsB encodes the catalytic subunit of cellulose synthase. The protein polymerises uridine 5'-diphosphate glucose to cellulose: UDP-glucose + (1,4-beta-D-glucosyl)(N) = UDP + (1,4-beta-D-glucosyl)(N+1). The enzyme is specifically activated by the nucleotide cyclic diguanylic acid. Sequence analysis suggests that BcsB contains several transmembrane (TM) domains, and shares a high degree of similarity with Escherichia coli YhjN.; GO: 0006011 UDP-glucose metabolic process, 0016020 membrane
Probab=88.86  E-value=2.5  Score=42.83  Aligned_cols=71  Identities=20%  Similarity=0.302  Sum_probs=51.6

Q ss_pred             ccEEEEEEcCCCCCCceEEEEeCc--------ccceeEEEEcCEEeeeec----CC-CCCceeccccccccCCCCCceEE
Q 023943          146 GCYRTYFHIPKEWQGRRILLHFEA--------VDSAFCAWINGVPVGYSQ----DS-RLPAEFEISDYCYPHGSDKKNVL  212 (275)
Q Consensus       146 g~Yrr~F~lp~~~~~~~i~L~f~g--------v~s~~~VwvNG~~VG~~~----~~-~~p~efdIT~~Lk~~~~G~eN~L  212 (275)
                      +...-.|.+|..|.-+...|+|..        -.+...|+|||+.||.-.    +. ....+++|.+.+..   | .|+|
T Consensus        29 ~~~~~~f~v~~~~~v~~a~L~L~~~~S~~l~~~~S~L~V~lNg~~v~s~~l~~~~~~~~~~~i~Ip~~l~~---g-~N~l  104 (605)
T PF03170_consen   29 ASRTIYFPVPADWVVTKATLNLSYTYSPSLLPERSQLTVSLNGQPVGSIPLDAESAQPQTVTIPIPPALIK---G-FNRL  104 (605)
T ss_pred             CceEEEEEcCCCccccceEEEEEEEECcccCCCcceEEEEECCEEeEEEecCcCCCCceEEEEecChhhcC---C-ceEE
Confidence            455667889888855544444332        136789999999999642    33 56788999988887   8 5999


Q ss_pred             EEEEEecC
Q 023943          213 AVQVFRWS  220 (275)
Q Consensus       213 ~V~V~~~~  220 (275)
                      .|++....
T Consensus       105 ~~~~~~~~  112 (605)
T PF03170_consen  105 TFEFIGHY  112 (605)
T ss_pred             EEEEEecc
Confidence            99999764


No 18 
>PF08308 PEGA:  PEGA domain;  InterPro: IPR013229 This domain is found in both archaea and bacteria and has similarity to S-layer (surface layer) proteins. It is named after the characteristic PEGA sequence motif found in this domain. The secondary structure of this domain is predicted to be beta-strands.
Probab=88.41  E-value=1  Score=32.01  Aligned_cols=43  Identities=16%  Similarity=0.275  Sum_probs=29.1

Q ss_pred             EEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEec
Q 023943          165 LHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRW  219 (275)
Q Consensus       165 L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~  219 (275)
                      |.+...-..+.|||||+++|.     +|.++.   .|..   | .+.|.|+-..+
T Consensus         4 l~V~s~p~gA~V~vdg~~~G~-----tp~~~~---~l~~---G-~~~v~v~~~Gy   46 (71)
T PF08308_consen    4 LRVTSNPSGAEVYVDGKYIGT-----TPLTLK---DLPP---G-EHTVTVEKPGY   46 (71)
T ss_pred             EEEEEECCCCEEEECCEEecc-----Ccceee---ecCC---c-cEEEEEEECCC
Confidence            455555567999999999993     454433   1456   8 48888876543


No 19 
>smart00758 PA14 domain in bacterial beta-glucosidases other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins, and bacterial toxins.
Probab=88.10  E-value=3.8  Score=32.56  Aligned_cols=70  Identities=14%  Similarity=0.193  Sum_probs=45.3

Q ss_pred             CCcccEEEEEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCC-CceeccccccccCCCCCceEEEEEEEec
Q 023943          143 NPTGCYRTYFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRL-PAEFEISDYCYPHGSDKKNVLAVQVFRW  219 (275)
Q Consensus       143 n~~g~Yrr~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~-p~efdIT~~Lk~~~~G~eN~L~V~V~~~  219 (275)
                      +-...++-.|.+|+.   ....+.+.+ +..+.+||||+.|-.+.+... ..+.-.+-.|.+   |+...|.|+....
T Consensus        43 ~f~~~~~g~i~~~~~---G~y~f~~~~-~~~~~l~Idg~~vid~~~~~~~~~~~~~~v~l~~---g~~~~i~v~y~~~  113 (136)
T smart00758       43 NFSVRWTGYLKPPED---GEYTFSITS-DDGARLWIDGKLVIDNWGKHEARPSTSSTLYLLA---GGTYPIRIEYFEA  113 (136)
T ss_pred             cEEEEEEEEEECCCC---ccEEEEEEc-CCcEEEEECCcEEEcCCccCCCccccceeEEEeC---CcEEEEEEEEEeC
Confidence            344668888888764   456777754 678999999999987644322 111122224556   6568888887654


No 20 
>PF03170 BcsB:  Bacterial cellulose synthase subunit;  InterPro: IPR018513 An operon encoding 4 proteins required for bacterial cellulose biosynthesis (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation with strains lacking cellulose synthase activity. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinum. The calculated molecular mass of the protein encoded by bcsB is 85.3kDa. BcsB encodes the catalytic subunit of cellulose synthase. The protein polymerises uridine 5'-diphosphate glucose to cellulose: UDP-glucose + (1,4-beta-D-glucosyl)(N) = UDP + (1,4-beta-D-glucosyl)(N+1). The enzyme is specifically activated by the nucleotide cyclic diguanylic acid. Sequence analysis suggests that BcsB contains several transmembrane (TM) domains, and shares a high degree of similarity with Escherichia coli YhjN.; GO: 0006011 UDP-glucose metabolic process, 0016020 membrane
Probab=86.68  E-value=2.2  Score=43.15  Aligned_cols=67  Identities=22%  Similarity=0.331  Sum_probs=50.5

Q ss_pred             EEEEEcCCC---CCCceEEEEeCcc--------cceeEEEEcCEEeeee------cCCCCCceeccccccccCCCCCceE
Q 023943          149 RTYFHIPKE---WQGRRILLHFEAV--------DSAFCAWINGVPVGYS------QDSRLPAEFEISDYCYPHGSDKKNV  211 (275)
Q Consensus       149 rr~F~lp~~---~~~~~i~L~f~gv--------~s~~~VwvNG~~VG~~------~~~~~p~efdIT~~Lk~~~~G~eN~  211 (275)
                      +-.|.+|.+   |.++.+.|++...        .+...|+|||++|+.-      ......+++.|+.++..   |. |+
T Consensus       327 ~~~f~lP~dl~~~~~~~i~l~L~y~y~~~~~~~~S~l~V~vNg~~i~s~~L~~~~~~~~~~~~v~iP~~~~~---~~-N~  402 (605)
T PF03170_consen  327 SFNFRLPPDLFAWDGSGIPLHLRYRYTPGLDFDGSRLTVYVNGQFIGSLPLTPADGAGFDRYTVSIPRLLLP---GR-NQ  402 (605)
T ss_pred             eeEeeCCccccccCCCceEEEEEEecCCCCCCCCcEEEEEECCEEEEeEECCCCCCCccceeEEecCchhcC---CC-cE
Confidence            446888886   5667776666433        5567899999999864      35666788999988888   84 99


Q ss_pred             EEEEEEec
Q 023943          212 LAVQVFRW  219 (275)
Q Consensus       212 L~V~V~~~  219 (275)
                      |.+++.-.
T Consensus       403 l~~~f~l~  410 (605)
T PF03170_consen  403 LQFEFDLP  410 (605)
T ss_pred             EEEEEEee
Confidence            99999854


No 21 
>PRK11114 cellulose synthase regulator protein; Provisional
Probab=85.93  E-value=3.5  Score=43.02  Aligned_cols=70  Identities=17%  Similarity=0.149  Sum_probs=49.3

Q ss_pred             EEEEEEcCCCCCCceEEEEeC--------cccceeEEEEcCEEeeeec------CCCCCceeccccccccCCCCCceEEE
Q 023943          148 YRTYFHIPKEWQGRRILLHFE--------AVDSAFCAWINGVPVGYSQ------DSRLPAEFEISDYCYPHGSDKKNVLA  213 (275)
Q Consensus       148 Yrr~F~lp~~~~~~~i~L~f~--------gv~s~~~VwvNG~~VG~~~------~~~~p~efdIT~~Lk~~~~G~eN~L~  213 (275)
                      .+-.|.+|.+|.-..+.|++.        .-.|.-.|.|||+.||.-.      +.....+++|...+..   | .|+|.
T Consensus        83 ~~i~f~vp~d~~v~~A~L~L~y~~Sp~l~~~~S~L~V~lNg~~v~s~pL~~~~~~~~~~~~i~IP~~l~~---g-~N~L~  158 (756)
T PRK11114         83 GGIEFGVRSDEVVTKARLNLEYTYSPALLPDLSHLKVYLNGELMGTLPLDKEQLGKKVLAQLPIDPRFIT---D-FNRLR  158 (756)
T ss_pred             ceeEeecCccccccCcEEEEEEEECCCCCCCCCeEEEEECCEEeEEEecCcccCCCcceeEEecCHHHcC---C-CceEE
Confidence            366788888774343344333        1247889999999998642      3345778999987777   8 49999


Q ss_pred             EEEEecCC
Q 023943          214 VQVFRWSD  221 (275)
Q Consensus       214 V~V~~~~d  221 (275)
                      +++.....
T Consensus       159 ~~~~~~~~  166 (756)
T PRK11114        159 LEFIGHYT  166 (756)
T ss_pred             EEEecCCC
Confidence            99886543


No 22 
>PF06832 BiPBP_C:  Penicillin-Binding Protein C-terminus Family;  InterPro: IPR009647 This conserved region of approximately 90 residues is found in a sub-group of bacterial Penicillin-Binding Proteins (PBPs). A variable length loop region separates this region from the transpeptidase unit (IPR001460 from INTERPRO). It is predicted to be a beta fold.
Probab=77.45  E-value=11  Score=27.99  Aligned_cols=48  Identities=17%  Similarity=0.311  Sum_probs=34.3

Q ss_pred             CCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEE
Q 023943          159 QGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAV  214 (275)
Q Consensus       159 ~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V  214 (275)
                      +...+.|...|-....+-||||+++|.....+   ++.+.. ..+   |+ ++|.|
T Consensus        30 ~~~~l~l~a~~~~~~~~W~vdg~~~g~~~~~~---~~~~~~-~~~---G~-h~l~v   77 (89)
T PF06832_consen   30 ERQPLVLKAAGGRGPVYWFVDGEPLGTTQPGH---QLFWQP-DRP---GE-HTLTV   77 (89)
T ss_pred             ccceEEEEEeCCCCcEEEEECCEEcccCCCCC---eEEeCC-CCC---ee-EEEEE
Confidence            46788888888777999999999998765432   233321 246   84 88888


No 23 
>PF12733 Cadherin-like:  Cadherin-like beta sandwich domain
Probab=71.30  E-value=19  Score=26.44  Aligned_cols=56  Identities=20%  Similarity=0.279  Sum_probs=39.1

Q ss_pred             EEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceE-EEEEEEec
Q 023943          151 YFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNV-LAVQVFRW  219 (275)
Q Consensus       151 ~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~-L~V~V~~~  219 (275)
                      +..+|.+  -..+.|.....+..+.|.|||..+...   -.+..+.+    ..   | +|. |.|.|...
T Consensus        17 ~~~V~~~--~~~v~v~a~~~~~~a~v~vng~~~~~~---~~~~~i~L----~~---G-~n~~i~i~Vta~   73 (88)
T PF12733_consen   17 TVTVPND--VDSVTVTATPEDSGATVTVNGVPVNSG---GYSATIPL----NE---G-ENTVITITVTAE   73 (88)
T ss_pred             EEEECCC--ceEEEEEEEECCCCEEEEEcCEEccCC---CcceeeEc----cC---C-CceEEEEEEEcC
Confidence            5566765  356888887778999999999987543   12233444    46   8 498 99999753


No 24 
>PF04566 RNA_pol_Rpb2_4:  RNA polymerase Rpb2, domain 4;  InterPro: IPR007646 RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 4 is also known as the external 2 domain.; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006351 transcription, DNA-dependent; PDB: 3S17_B 1I6H_B 4A3B_B 3K1F_B 4A3I_B 1TWA_B 3S14_B 3S15_B 2NVX_B 3M3Y_B ....
Probab=67.24  E-value=5  Score=28.45  Aligned_cols=13  Identities=38%  Similarity=0.744  Sum_probs=11.8

Q ss_pred             EEEcCEEeeeecC
Q 023943          176 AWINGVPVGYSQD  188 (275)
Q Consensus       176 VwvNG~~VG~~~~  188 (275)
                      |+|||..+|.+++
T Consensus         1 VFlNG~~iG~~~~   13 (63)
T PF04566_consen    1 VFLNGVWIGIHSD   13 (63)
T ss_dssp             EEETTEEEEEESS
T ss_pred             CEECCEEEEEEcC
Confidence            7999999999975


No 25 
>PF12222 PNGaseA:  Peptide N-acetyl-beta-D-glucosaminyl asparaginase amidase A;  InterPro: IPR021102  Peptide-N4-(N-acetyl-beta-glucosaminyl)asparagine amidase A (PNGase A), unlike many other amidases, is capable of hydrolysing glycopeptides with an alpha-1,3-fucosylated asparagine-bound N-acetylglucosamine (GlcNAc). PNGase A is a heterodimer composed of a large and small subunit. This entry represents the PNGase A precursor, which contains both subunits and is activated by proteolytic cleavage.
Probab=65.64  E-value=14  Score=36.08  Aligned_cols=51  Identities=10%  Similarity=0.098  Sum_probs=35.7

Q ss_pred             cccceeEEEEcCEEeeeec-------C----------------CCCCceeccccccccCCCCCceEEEEEEEec
Q 023943          169 AVDSAFCAWINGVPVGYSQ-------D----------------SRLPAEFEISDYCYPHGSDKKNVLAVQVFRW  219 (275)
Q Consensus       169 gv~s~~~VwvNG~~VG~~~-------~----------------~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~  219 (275)
                      |.--...|+|||+.+|...       |                ...++++|||++|-.--+|+.++|.|+|.+-
T Consensus       218 gpfReV~V~iDg~lag~~~PfPvIfTGGI~P~lWrPI~~i~aFdl~~y~iDlTPfLp~L~dg~~h~~~i~V~~~  291 (427)
T PF12222_consen  218 GPFREVQVYIDGQLAGVVWPFPVIFTGGINPFLWRPIVGIGAFDLPSYDIDLTPFLPLLWDGKPHTFEIRVVNA  291 (427)
T ss_pred             CCcEEEEEEECCEEEEEECCCCeEEeCCcCcccccccCCCcccCCCceeEEeccchhcccCCCccEEEEEEEcc
Confidence            3445678999999998541       2                2336899999966221127668999999984


No 26 
>PF11008 DUF2846:  Protein of unknown function (DUF2846);  InterPro: IPR022548  Some members in this group of proteins with unknown function are annotated as lipoproteins. However this cannot be confirmed. 
Probab=59.88  E-value=30  Score=27.10  Aligned_cols=34  Identities=15%  Similarity=0.283  Sum_probs=23.5

Q ss_pred             cceeEEEEcCEEeeeecC-CCCCceeccccccccCCCCCceEEEE
Q 023943          171 DSAFCAWINGVPVGYSQD-SRLPAEFEISDYCYPHGSDKKNVLAV  214 (275)
Q Consensus       171 ~s~~~VwvNG~~VG~~~~-~~~p~efdIT~~Lk~~~~G~eN~L~V  214 (275)
                      .....|||||+.||.... +|  +.++++    +   |+ ++|..
T Consensus        40 ~~~~~v~vdg~~ig~l~~g~y--~~~~v~----p---G~-h~i~~   74 (117)
T PF11008_consen   40 AVKPDVYVDGELIGELKNGGY--FYVEVP----P---GK-HTISA   74 (117)
T ss_pred             cccceEEECCEEEEEeCCCeE--EEEEEC----C---Cc-EEEEE
Confidence            557899999999998653 33  344544    4   74 77666


No 27 
>PF14324 PINIT:  PINIT domain; PDB: 3I2D_A.
Probab=56.22  E-value=13  Score=30.54  Aligned_cols=51  Identities=16%  Similarity=0.185  Sum_probs=24.8

Q ss_pred             eEEEEeCcccceeEEEEcCEEeeee-------cCCCCCceeccccccccCCCCCceEEEEEEEe
Q 023943          162 RILLHFEAVDSAFCAWINGVPVGYS-------QDSRLPAEFEISDYCYPHGSDKKNVLAVQVFR  218 (275)
Q Consensus       162 ~i~L~f~gv~s~~~VwvNG~~VG~~-------~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~  218 (275)
                      ...+.|.   ...+|+|||+.|-..       .|.-.  =+|||++++.. .+..|+|.|.=..
T Consensus        74 ~q~i~FP---~~~evkvN~~~v~~~~~glknKpGt~r--PvdIT~~l~~~-~~~~N~i~v~y~~  131 (144)
T PF14324_consen   74 NQPIEFP---PPCEVKVNGKQVKLNNRGLKNKPGTAR--PVDITPYLRLS-PPQTNRIEVTYAN  131 (144)
T ss_dssp             GB--------SSEEEEETTEE--S--SS-TTS-GGGS---EE-GGG---S--SS-EEEEEEEEE
T ss_pred             ccccccC---CCeEEEEeCEEcccCccCCCCCCCCCC--Ccccchhhccc-CCCCeEEEEEEeC
Confidence            3444554   368999999999532       23333  47999999862 1346998887554


No 28 
>PF11824 DUF3344:  Protein of unknown function (DUF3344);  InterPro: IPR021779  This family of proteins are functionally uncharacterised. This protein is found in bacteria and archaea. Proteins in this family are typically between 367 to 1857 amino acids in length. 
Probab=56.21  E-value=27  Score=31.89  Aligned_cols=66  Identities=14%  Similarity=0.084  Sum_probs=38.1

Q ss_pred             EEEEEEcCCCCCCceEEEEeC--------cccceeEEEEcCEEeee----------e----cCC-----CCCceeccccc
Q 023943          148 YRTYFHIPKEWQGRRILLHFE--------AVDSAFCAWINGVPVGY----------S----QDS-----RLPAEFEISDY  200 (275)
Q Consensus       148 Yrr~F~lp~~~~~~~i~L~f~--------gv~s~~~VwvNG~~VG~----------~----~~~-----~~p~efdIT~~  200 (275)
                      +..+|+||+.-.=+..+|...        +....+.|-+||+.+..          .    .+.     +.-+.+|||++
T Consensus        40 ~~~~~~lP~ga~v~~ArLYv~~w~~~~~~~~~~~~~~~fNg~~~~~~~l~~~~~~Y~d~~~~g~~~~~~yg~~vYDVT~~  119 (271)
T PF11824_consen   40 VSWDFTLPEGATVKWARLYVYVWSGHMTNGYYPSFTVTFNGNTLEEFNLETPEAPYVDQKGHGNYVDYDYGMWVYDVTDL  119 (271)
T ss_pred             ceEEEeCCCCCeEEEEEEEEEEeCCccccCCCceEEEEECCccceeeeccCCCCceEEecCccceeccceEEEEEECccc
Confidence            566677775422223333332        33445677777776631          0    111     13344899999


Q ss_pred             cccCCCCCceEEEEEEE
Q 023943          201 CYPHGSDKKNVLAVQVF  217 (275)
Q Consensus       201 Lk~~~~G~eN~L~V~V~  217 (275)
                      ++.   | +|.+.|.-.
T Consensus       120 i~~---g-~n~~~v~~~  132 (271)
T PF11824_consen  120 IKS---G-ENTVTVTTG  132 (271)
T ss_pred             ccC---C-ceEEEEEeC
Confidence            998   8 499888773


No 29 
>PF14814 UB2H:  Bifunctional transglycosylase second domain; PDB: 3FWL_A 3VMA_A.
Probab=54.73  E-value=22  Score=26.45  Aligned_cols=42  Identities=26%  Similarity=0.358  Sum_probs=27.8

Q ss_pred             CCcccEEEEEEcCCCC-CCceEEEEeCcccceeEEEE--cCEEeee
Q 023943          143 NPTGCYRTYFHIPKEW-QGRRILLHFEAVDSAFCAWI--NGVPVGY  185 (275)
Q Consensus       143 n~~g~Yrr~F~lp~~~-~~~~i~L~f~gv~s~~~Vwv--NG~~VG~  185 (275)
                      +..-.|+|.|..|+.. ..+++.|+|.+ +....|--  ||+.++.
T Consensus        38 ~~i~i~~R~F~F~Dg~e~~~~~~l~f~~-~~V~~i~~~~~g~~l~~   82 (85)
T PF14814_consen   38 NRIEIYTRGFDFPDGQEPARRVRLTFSG-GRVSSIQDLDNGRDLGL   82 (85)
T ss_dssp             TEEEEEE--EEETTCEE--EEEEEEEET-TEEEEEEETTTTEE-SS
T ss_pred             CEEEEEECCCCCCCCCccCEEEEEEECC-CEEEEEEEcCCCCccCe
Confidence            4556899999999765 46799999998 66666655  5776653


No 30 
>PF07550 DUF1533:  Protein of unknown function (DUF1533);  InterPro: IPR011432 This domain is found duplicated in proteins of unknown function. The proteins typically also contain leucine-rich repeats.
Probab=46.23  E-value=29  Score=24.42  Aligned_cols=43  Identities=14%  Similarity=0.164  Sum_probs=24.6

Q ss_pred             eeEEEEcCEEe-----eeecCCC-CCceecccc-cc-ccCCCCCceEEEEEEEec
Q 023943          173 AFCAWINGVPV-----GYSQDSR-LPAEFEISD-YC-YPHGSDKKNVLAVQVFRW  219 (275)
Q Consensus       173 ~~~VwvNG~~V-----G~~~~~~-~p~efdIT~-~L-k~~~~G~eN~L~V~V~~~  219 (275)
                      ...|.|||+..     +..+... ..-.+.|.. .+ +.   | +|+|+|...-+
T Consensus         8 I~~V~VNg~~y~~~~~~~~~y~~~~~~~l~i~~~~f~~~---G-~~~I~I~A~GY   58 (65)
T PF07550_consen    8 ITSVTVNGKEYNKSLKGNDKYSISSKGSLKIKASAFNKD---G-ENTIVIKATGY   58 (65)
T ss_pred             CCEEEECCEEeeccccccccEEeccCCcEEEcHHHcCcC---C-ceEEEEEeCCc
Confidence            45789999998     3222111 111144443 34 55   7 59999987644


No 31 
>PF09113 N-glycanase_C:  Peptide-N-glycosidase F, C terminal;  InterPro: IPR015197 This domain adopts an eight-stranded antiparallel beta jelly roll configuration, with the beta strands arranged into two sheets. It is similar in topology to many viral capsid proteins, as well as lectins and several glucanases. This domain allows the protein to bind sugars and catalyses the complete removal of N-linked oligosaccharide chains from glycoproteins.; PDB: 1PNF_A 1PNG_A 1PGS_A 3KS7_D 3PMS_A.
Probab=42.36  E-value=87  Score=25.80  Aligned_cols=67  Identities=13%  Similarity=0.158  Sum_probs=38.5

Q ss_pred             cEEEEEEcCCCCCCceEEEEeCc-----------ccceeEEEEcCEEeeeec-----------------C----------
Q 023943          147 CYRTYFHIPKEWQGRRILLHFEA-----------VDSAFCAWINGVPVGYSQ-----------------D----------  188 (275)
Q Consensus       147 ~Yrr~F~lp~~~~~~~i~L~f~g-----------v~s~~~VwvNG~~VG~~~-----------------~----------  188 (275)
                      .-..+|++|+..+.-++.+.+-|           +...-.|+|||+++-...                 |          
T Consensus        11 ~~~~~f~lp~~~k~~~L~~iiTGHG~~~~gc~EFc~~~h~~~vnG~~~f~~~~~~~~Ca~~~~~n~~p~G~w~~~Rs~WC   90 (141)
T PF09113_consen   11 RLPVNFTLPANAKNARLRYIITGHGSGNNGCDEFCPKSHHFYVNGKEVFSFAPWRDDCASNRLYNPAPSGTWLYSRSNWC   90 (141)
T ss_dssp             SEEEEEEE-TT-SEEEEEEEEEEEEETTEEEETTS-EEEEEEETTEEEEEEEE-BS-GGGGSGG-TTT-SCESS-BSS--
T ss_pred             ceeEEEECCcccceEEEEEEEecCCCCCCCcceecccccEEEECCeEeeecCCCccchhhccccCccccceEecCCCCCC
Confidence            55679999987544444444433           222347999999992211                 1          


Q ss_pred             --C-CCCceeccccccccCCCCCceEEEEEEEe
Q 023943          189 --S-RLPAEFEISDYCYPHGSDKKNVLAVQVFR  218 (275)
Q Consensus       189 --~-~~p~efdIT~~Lk~~~~G~eN~L~V~V~~  218 (275)
                        + -.|.++||++++.    |+ +++.|.|.-
T Consensus        91 PG~~v~p~~~dl~~~~~----g~-ht~~~~i~~  118 (141)
T PF09113_consen   91 PGMVVDPWRIDLTDAVA----GG-HTFSVDIPY  118 (141)
T ss_dssp             TTEEE--EEEEEE-GGG----TT-SEEEEEEET
T ss_pred             CCCCCCceEeccccccC----CC-ceEEEEecc
Confidence              0 1278899998775    53 888777764


No 32 
>smart00560 LamGL LamG-like jellyroll fold domain.
Probab=41.61  E-value=34  Score=27.18  Aligned_cols=27  Identities=19%  Similarity=0.320  Sum_probs=20.9

Q ss_pred             ceEEEEeCcccceeEEEEcCEEeeeec
Q 023943          161 RRILLHFEAVDSAFCAWINGVPVGYSQ  187 (275)
Q Consensus       161 ~~i~L~f~gv~s~~~VwvNG~~VG~~~  187 (275)
                      .++.+.+++......+||||++++...
T Consensus        64 ~hva~v~d~~~g~~~lYvnG~~~~~~~   90 (133)
T smart00560       64 VHLAGVYDGGAGKLSLYVNGVEVATSE   90 (133)
T ss_pred             EEEEEEEECCCCeEEEEECCEEccccc
Confidence            366777777777889999999997543


No 33 
>PRK11114 cellulose synthase regulator protein; Provisional
Probab=39.41  E-value=57  Score=34.22  Aligned_cols=39  Identities=26%  Similarity=0.427  Sum_probs=24.4

Q ss_pred             EEEEEEcCCC---CCCceEE--EEeC------cccceeEEEEcCEEeeee
Q 023943          148 YRTYFHIPKE---WQGRRIL--LHFE------AVDSAFCAWINGVPVGYS  186 (275)
Q Consensus       148 Yrr~F~lp~~---~~~~~i~--L~f~------gv~s~~~VwvNG~~VG~~  186 (275)
                      -+-.|.+|.+   |.++.+-  |++.      .-+|.-.|+|||++|+.-
T Consensus       378 i~~~~~lPpDl~~~~~~~i~l~L~yryt~~~~~~~S~l~V~vN~~~i~S~  427 (756)
T PRK11114        378 IRVNLRLPPDLFLWRGDGIPLDLNYRYTAPPVRDDSRLNISLNDQFVQSL  427 (756)
T ss_pred             eeEcccCCccccccCCCCCceEEEEeCCCCCCCCCcEEEEEECCEEEeeE
Confidence            3455666765   4555543  4431      123688999999999753


No 34 
>PF07908 D-aminoacyl_C:  D-aminoacylase, C-terminal region;  InterPro: IPR012855 D-aminoacylase (Q9AGH8 from SWISSPROT, 3.5.1.81 from EC) hydrolyses a wide variety of N-acyl derivatives of neutral D-amino acids, in a zinc-dependent manner. The enzyme is composed of a small beta-barrel domain and a larger catalytic alpha/beta-barrel that contains a short alpha/beta insert. The overall structure shares significant similarity to the alpha/beta-barrel amidohydrolase superfamily, in which the beta-strands in both barrels superimpose well. The C-terminal region featured in this entry forms part of the beta-barrel domain, together with a short N-terminal segment. This domain does not seem to contribute to the substrate-binding site or to be involved in the catalytic process.; GO: 0008270 zinc ion binding, 0016811 hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds, in linear amides; PDB: 3GIQ_B 3GIP_B 1V4Y_A 1M7J_A 1RK5_A 1RJP_A 1RJR_A 1RJQ_A 1RK6_A 1V51_A.
Probab=37.96  E-value=26  Score=23.26  Aligned_cols=13  Identities=23%  Similarity=0.123  Sum_probs=10.6

Q ss_pred             eeEEEEcCEEeee
Q 023943          173 AFCAWINGVPVGY  185 (275)
Q Consensus       173 ~~~VwvNG~~VG~  185 (275)
                      .-+|||||+.+-.
T Consensus        20 I~~V~VNG~~vv~   32 (48)
T PF07908_consen   20 IDYVFVNGQIVVE   32 (48)
T ss_dssp             EEEEEETTEEEEC
T ss_pred             EEEEEECCEEEEE
Confidence            4689999999854


No 35 
>PF10262 Rdx:  Rdx family;  InterPro: IPR011893 This entry represents the Rdx family of selenoproteins, which includes mammalian selenoproteins SelW, SelV, SelT and SelH, bacterial SelW-like proteins and cysteine-containing proteins of unknown function in all three domains of life. Mammalian Rdx12 and its fish selenoprotein orthologues are also members of this family. These proteins possess a thioredoxin-like fold and a conserved CXXC or CxxU (U is selenocysteine) motif near the N terminus, suggesting a redox function. Rdx proteins can use catalytic cysteine (or selenocysteine) to form transient mixed disulphides with substrate proteins. Selenium (Se) plays an essential role in cell survival and most of the effects of Se are probably mediated by selenoproteins. Selenoprotein W (SelW) plays an important role in protection of neurons from oxidative stress during neuronal development. Selenoprotein T (SelT) is conserved from plants to humans. SelT is localized to the endoplasmic reticulum through a hydrophobic domain. The protein binds to UDP-glucose:glycoprotein glucosyltransferase (UGTR), the endoplasmic reticulum (ER)-resident protein, which is known to be involved in the quality control of protein folding. The function of SelT is unknown, although it may have a role in PACAP signaling during PC12 cell differentiation. Selenoprotein H (SelH) protects neurons against UVB-induced damage by inhibiting apoptotic cell death pathways, by preventing mitochondrial depolarization, and by promoting cell survival pathways.; GO: 0008430 selenium binding, 0045454 cell redox homeostasis; PDB: 2OJL_B 2FA8_A 2P0G_C 2NPB_A 3DEX_C 2OKA_A 2OBK_G.
Probab=37.01  E-value=48  Score=23.87  Aligned_cols=23  Identities=22%  Similarity=0.150  Sum_probs=16.7

Q ss_pred             ceEEEEeCcccceeEEEEcCEEee
Q 023943          161 RRILLHFEAVDSAFCAWINGVPVG  184 (275)
Q Consensus       161 ~~i~L~f~gv~s~~~VwvNG~~VG  184 (275)
                      ..+.+.. +...+++|+|||+.|-
T Consensus        33 ~~v~~~~-~~~G~FEV~v~g~lI~   55 (76)
T PF10262_consen   33 AEVELSP-GSTGAFEVTVNGELIF   55 (76)
T ss_dssp             SEEEEEE-ESTT-EEEEETTEEEE
T ss_pred             eEEEEEe-ccCCEEEEEEccEEEE
Confidence            3556666 4478899999999884


No 36 
>PF11824 DUF3344:  Protein of unknown function (DUF3344);  InterPro: IPR021779  This family of proteins is functionally uncharacterised. Its members are found in bacteria and archaea. Proteins in this family are typically between 367 and 1857 amino acids in length.
Probab=35.08  E-value=56  Score=29.80  Aligned_cols=48  Identities=17%  Similarity=0.188  Sum_probs=33.2

Q ss_pred             EEeCcccceeEEEEcCEEeeeec--CCCCC-ceeccccccccCCCCCceEEEEEE
Q 023943          165 LHFEAVDSAFCAWINGVPVGYSQ--DSRLP-AEFEISDYCYPHGSDKKNVLAVQV  216 (275)
Q Consensus       165 L~f~gv~s~~~VwvNG~~VG~~~--~~~~p-~efdIT~~Lk~~~~G~eN~L~V~V  216 (275)
                      +...|-+....+.+||+-+....  +++.. ..|||+++|+.   | +|.+.++-
T Consensus       204 ~~~s~~~~~g~~~FNg~~l~~~~~~~~~~~~~~~DVt~~l~~---~-~n~~~~~~  254 (271)
T PF11824_consen  204 VALSGGDGEGNLTFNGTNLWNGTPSGSYFGYDTWDVTDYLKS---G-NNSAFIQS  254 (271)
T ss_pred             EEEeccCCCCEEEECCcccCCCCCCccceeeEeeeccccccC---C-CceEEEEe
Confidence            33445444478999997775432  34333 35999999998   8 49988886


No 37 
>TIGR02148 Fibro_Slime fibro-slime domain. This model represents a conserved region of about 90 amino acids, shared in at least 4 distinct large putative proteins from the slime mold Dictyostelium discoideum and 10 proteins from the rumen bacterium Fibrobacter succinogenes, and in no other species so far. We propose here the name fibro-slime domain
Probab=34.36  E-value=2.1e+02  Score=21.82  Aligned_cols=53  Identities=13%  Similarity=0.100  Sum_probs=33.4

Q ss_pred             EEEEeCcccceeEEEEcCEEeeeecCCCCC--ceecccc-ccccCCCCCceEEEE-EEEec
Q 023943          163 ILLHFEAVDSAFCAWINGVPVGYSQDSRLP--AEFEISD-YCYPHGSDKKNVLAV-QVFRW  219 (275)
Q Consensus       163 i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p--~efdIT~-~Lk~~~~G~eN~L~V-~V~~~  219 (275)
                      -.+.|-| +.-.-|+|||++|..-.|-+.|  ..+|+.. =|.+   |+.-.+.+ .+.|.
T Consensus        20 e~F~F~G-DDDvWVFIn~kLv~DlGG~H~~~~~sV~l~~lgl~~---g~~Y~~d~F~~ERh   76 (90)
T TIGR02148        20 QYFEFRG-DDDVWVFINNKLVVDIGGQHPAVPGAVDLDTLGLKE---GKTYPFDIFYCERH   76 (90)
T ss_pred             cEEEEEc-CCeEEEEECCEEEEEccCcCCCcccEEEhhhcCCcc---CcEeeEEEEEEeec
Confidence            4778888 6779999999999766554443  3456554 2444   64445555 33443


No 38 
>PF00337 Gal-bind_lectin:  Galactoside-binding lectin;  InterPro: IPR001079 Galectins (also known as galaptins or S-lectin) are a family of proteins defined by having at least one characteristic carbohydrate recognition domain (CRD) with an affinity for beta-galactosides and sharing certain sequence elements. Members of the galectin family are found in mammals, birds, amphibians, fish, nematodes, sponges, and some fungi. Galectins are known to carry out intra- and extracellular functions through glycoconjugate-mediated recognition. From the cytosol they may be secreted by non-classical pathways, but they may also be targeted to the nucleus or specific sub-cytosolic sites. Within the same peptide chain some galectins have a CRD with only a few additional amino acids, whereas others have two CRDs joined by a link peptide, and one (galectin-3) has one CRD joined to a different type of domain. The galectin carbohydrate recognition domain (CRD) is a beta-sandwich of about 135 amino acids. The two sheets are slightly bent with 6 strands forming the concave side and 5 strands forming the convex side. The concave side forms a groove in which carbohydrate is bound, and which is long enough to hold approximately a linear tetrasaccharide.; GO: 0005529 sugar binding; PDB: 2WSU_B 2WT0_A 2WT1_A 2WT2_B 2WSV_A 1HLC_A 2ZGQ_A 3M3Q_B 1WW5_C 3M3E_A ....
Probab=33.06  E-value=69  Score=25.27  Aligned_cols=28  Identities=14%  Similarity=0.281  Sum_probs=23.2

Q ss_pred             CCceEEEEeCcccceeEEEEcCEEeeee
Q 023943          159 QGRRILLHFEAVDSAFCAWINGVPVGYS  186 (275)
Q Consensus       159 ~~~~i~L~f~gv~s~~~VwvNG~~VG~~  186 (275)
                      .|+...|.|.--+..+.|+|||+.+..-
T Consensus        81 ~g~~F~i~I~~~~~~f~I~vng~~~~~F  108 (133)
T PF00337_consen   81 PGQPFEIRIRVEEDGFKIYVNGKHFCSF  108 (133)
T ss_dssp             TTSEEEEEEEEESSEEEEEETTEEEEEE
T ss_pred             CCceEEEEEEEecCeeEEEECCeEEEEe
Confidence            5777777777778999999999998753


No 39 
>PF09829 DUF2057:  Uncharacterized protein conserved in bacteria (DUF2057);  InterPro: IPR018635 The proteins in this entry are functionally uncharacterised.
Probab=32.23  E-value=98  Score=26.29  Aligned_cols=39  Identities=21%  Similarity=0.122  Sum_probs=25.2

Q ss_pred             ceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEec
Q 023943          172 SAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRW  219 (275)
Q Consensus       172 s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~  219 (275)
                      ..--+-|||+.++.+.-+... .+.|    .+   | +|+|+|++...
T Consensus         8 ~i~~l~vnG~~v~~~~~~~~~-~l~L----~~---G-~~Qiv~ry~~~   46 (189)
T PF09829_consen    8 EIELLAVNGQEVSGSLFSSKD-SLEL----PP---G-ENQIVFRYSKI   46 (189)
T ss_pred             CEEEEEEcCeeccCccccCCc-eEEe----CC---C-cEEEEEEEeEe
Confidence            345568999999654322111 2444    45   8 59999999974


No 40 
>PF05775 AfaD:  Enterobacteria AfaD invasin protein;  InterPro: IPR008394 This family consists of several AfaD and related proteins from Escherichia coli and Salmonella bacteria. The afa gene clusters encode an afimbrial adhesive sheath produced by E. coli. The adhesive sheath is composed of two proteins, AfaD and AfaE, which are independently exposed at the bacterial cell surface. AfaE is required for bacterial adhesion to HeLa cells and AfaD for the uptake of adherent bacteria into these cells.; GO: 0009289 pilus; PDB: 3UIZ_F 3UIY_A 2AXW_A 2IXQ_A 2FVN_A.
Probab=32.00  E-value=97  Score=24.56  Aligned_cols=45  Identities=18%  Similarity=0.284  Sum_probs=30.7

Q ss_pred             EEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEe
Q 023943          164 LLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFR  218 (275)
Q Consensus       164 ~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~  218 (275)
                      .|...+.++.+.||.|.+.++-     .|-.+-|...=.    . .|+|.|++-.
T Consensus        18 rI~~~~~htGF~Vw~na~~~~g-----~p~~Yil~G~~~----~-~h~LrVRlgg   62 (111)
T PF05775_consen   18 RIICREAHTGFHVWSNARQVGG-----RPGRYILQGKRN----S-QHELRVRLGG   62 (111)
T ss_dssp             EEES-SSSSEEEEEESSEESTT-----STTEEEEEBCSS----S-S-EEEEEEET
T ss_pred             EEEeCCCceEEEEEeechhcCC-----CccEEEEeCCCC----C-CceEEEEeCC
Confidence            4667788999999999998765     345555554211    2 4999999985


No 41 
>PF13385 Laminin_G_3:  Concanavalin A-like lectin/glucanases superfamily; PDB: 4DQA_A 1N1Y_A 1MZ6_A 1MZ5_A 1N1S_A 2A75_A 1WCS_A 1N1T_A 1N1V_A 2FHR_A ....
Probab=30.47  E-value=58  Score=25.11  Aligned_cols=24  Identities=29%  Similarity=0.538  Sum_probs=17.8

Q ss_pred             eEEEEeCcccceeEEEEcCEEeeeec
Q 023943          162 RILLHFEAVDSAFCAWINGVPVGYSQ  187 (275)
Q Consensus       162 ~i~L~f~gv~s~~~VwvNG~~VG~~~  187 (275)
                      ++.+.+.  .....+||||+.++...
T Consensus        89 ~l~~~~~--~~~~~lyvnG~~~~~~~  112 (157)
T PF13385_consen   89 HLALTYD--GSTVTLYVNGELVGSST  112 (157)
T ss_dssp             EEEEEEE--TTEEEEEETTEEETTCT
T ss_pred             EEEEEEE--CCeEEEEECCEEEEeEe
Confidence            5555555  44699999999998754


No 42 
>KOG4342 consensus Alpha-mannosidase [Carbohydrate transport and metabolism]
Probab=30.22  E-value=1.7e+02  Score=30.39  Aligned_cols=67  Identities=21%  Similarity=0.444  Sum_probs=47.8

Q ss_pred             cccEEEEEEcCCCCCC-ceEEEEeCcccceeEEEE-cCEEee-eecCCCCCceeccccccccCCCCCceEEEEEEEe
Q 023943          145 TGCYRTYFHIPKEWQG-RRILLHFEAVDSAFCAWI-NGVPVG-YSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFR  218 (275)
Q Consensus       145 ~g~Yrr~F~lp~~~~~-~~i~L~f~gv~s~~~Vwv-NG~~VG-~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~  218 (275)
                      +.|+|-+++||++|.+ +++.+.-+. +...-||= +|..|- .+.|.++  +|-+++-++.    .+.++-|++..
T Consensus       103 T~WF~V~i~lPe~Wvk~eqv~fqW~c-dnEGlV~~kdg~PvqafsggErT--~yvLpd~~~~----~~~tfYiE~ac  172 (1078)
T KOG4342|consen  103 TCWFRVEITLPEAWVKNEQVHFQWEC-DNEGLVWRKDGEPVQAFSGGERT--SYVLPDRLGE----RSLTFYIEVAC  172 (1078)
T ss_pred             eEEEEEEEECchhhcCceeEEEEEec-CCCeeEEecCCceeeeccCCccc--eeEcccccCC----cceEEEEEeec
Confidence            4689999999999965 788888876 56677777 899885 4444343  4556665543    24777777765


No 43 
>cd00070 GLECT Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation.
Probab=29.28  E-value=83  Score=24.79  Aligned_cols=28  Identities=18%  Similarity=0.199  Sum_probs=23.0

Q ss_pred             CCceEEEEeCcccceeEEEEcCEEeeee
Q 023943          159 QGRRILLHFEAVDSAFCAWINGVPVGYS  186 (275)
Q Consensus       159 ~~~~i~L~f~gv~s~~~VwvNG~~VG~~  186 (275)
                      +|+...|.|.--...+.|+|||+.+..-
T Consensus        76 ~g~~F~l~i~~~~~~f~i~vng~~~~~F  103 (127)
T cd00070          76 PGQPFELTILVEEDKFQIFVNGQHFFSF  103 (127)
T ss_pred             CCCeEEEEEEEcCCEEEEEECCEeEEEe
Confidence            4777788887778999999999988653


No 44 
>COG0278 Glutaredoxin-related protein [Posttranslational modification, protein turnover, chaperones]
Probab=28.81  E-value=33  Score=26.77  Aligned_cols=16  Identities=25%  Similarity=0.191  Sum_probs=13.4

Q ss_pred             cceeEEEEcCEEeeee
Q 023943          171 DSAFCAWINGVPVGYS  186 (275)
Q Consensus       171 ~s~~~VwvNG~~VG~~  186 (275)
                      -+.-.+||||++||-+
T Consensus        70 PT~PQLyi~GEfvGG~   85 (105)
T COG0278          70 PTFPQLYVNGEFVGGC   85 (105)
T ss_pred             CCCceeeECCEEeccH
Confidence            4567899999999976


No 45 
>PF15625 CC2D2AN-C2:  CC2D2A N-terminal C2 domain
Probab=28.67  E-value=94  Score=26.05  Aligned_cols=40  Identities=20%  Similarity=0.380  Sum_probs=24.3

Q ss_pred             ceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecC
Q 023943          172 SAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWS  220 (275)
Q Consensus       172 s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~  220 (275)
                      ....|++||++|+.++-......|-+    ..   |+  .+.|+|.+|.
T Consensus        39 ~~ikl~~N~k~V~~T~~~~l~~dF~v----~f---~~--~f~v~i~~~P   78 (168)
T PF15625_consen   39 YYIKLFFNDKEVSRTRSRPLWSDFRV----HF---NE--IFNVQITRWP   78 (168)
T ss_pred             EEEEEEECCEEEEeeeeEecCCCeEE----ec---cC--EEEEEEecCC
Confidence            34578999999998875444333322    23   42  6666666653


No 46 
>PF06439 DUF1080:  Domain of Unknown Function (DUF1080);  InterPro: IPR010496 This is a family of proteins of unknown function.; PDB: 3IMM_B 3NMB_A 3S5Q_A 3OSD_A 3HBK_A 3H3L_A 3U1X_A.
Probab=27.59  E-value=1.1e+02  Score=25.20  Aligned_cols=21  Identities=29%  Similarity=0.632  Sum_probs=15.8

Q ss_pred             cceeEEEEcCEEeeeecCCCC
Q 023943          171 DSAFCAWINGVPVGYSQDSRL  191 (275)
Q Consensus       171 ~s~~~VwvNG~~VG~~~~~~~  191 (275)
                      .....|||||+.|....+...
T Consensus       138 g~~i~v~vnG~~v~~~~d~~~  158 (185)
T PF06439_consen  138 GNRITVWVNGKPVADFTDPSF  158 (185)
T ss_dssp             TTEEEEEETTEEEEEEETTSH
T ss_pred             CCEEEEEECCEEEEEEEcCCC
Confidence            345889999999988766433


No 47 
>TIGR02412 pepN_strep_liv aminopeptidase N, Streptomyces lividans type. This family is a subset of the members of the zinc metallopeptidase family M1 (pfam01433), with a single member characterized in Streptomyces lividans 66 and designated aminopeptidase N. The spectrum of activity may differ somewhat from the aminopeptidase N clade of E. coli and most other Proteobacteria, well separated phylogenetically within the M1 family. The M1 family also includes leukotriene A-4 hydrolase/aminopeptidase (with a bifunctional active site).
Probab=26.61  E-value=7.9e+02  Score=26.07  Aligned_cols=63  Identities=16%  Similarity=0.174  Sum_probs=40.0

Q ss_pred             cccEEEEEEcCCCCCCceEEEEeCcccceeEEEEcCE-EeeeecCCCCCceeccccccccCCCCCceEEEEEEEe
Q 023943          145 TGCYRTYFHIPKEWQGRRILLHFEAVDSAFCAWINGV-PVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFR  218 (275)
Q Consensus       145 ~g~Yrr~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~-~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~  218 (275)
                      .+.=+-+|++-+.  +.++.|.+.+. ..-.|-|||+ .+....   ....+.+.. |..   | +|+|.|....
T Consensus        35 ~~~~~i~~~~~~~--~~~l~LD~~~l-~I~~v~vng~~~~~~~~---~~~~i~l~~-l~~---g-~~~l~i~~~~   98 (831)
T TIGR02412        35 RCVSTNTVRLSEP--GADTFLDLLAA-QIESVTLNGILDVAPVY---DGSRIPLPG-LLT---G-ENTLRVEATR   98 (831)
T ss_pred             ceEEEEEEEEcCC--CCcEEEEccCC-EEEEEEECCcccCcccc---CCCEEEccC-CCC---C-ceEEEEEEEE
Confidence            3444445555333  67899999885 6778889997 332211   223456655 666   8 5999999754


No 48 
>PRK10824 glutaredoxin-4; Provisional
Probab=26.00  E-value=48  Score=26.30  Aligned_cols=23  Identities=22%  Similarity=0.343  Sum_probs=18.2

Q ss_pred             EEEeCcccceeEEEEcCEEeeee
Q 023943          164 LLHFEAVDSAFCAWINGVPVGYS  186 (275)
Q Consensus       164 ~L~f~gv~s~~~VwvNG~~VG~~  186 (275)
                      ...+.|-.+.-.|||||++||-+
T Consensus        62 l~~~sg~~TVPQIFI~G~~IGG~   84 (115)
T PRK10824         62 LPKYANWPTFPQLWVDGELVGGC   84 (115)
T ss_pred             HHHHhCCCCCCeEEECCEEEcCh
Confidence            33445778889999999999876


No 49 
>smart00276 GLECT Galectin. Galectin - galactose-binding lectin
Probab=25.64  E-value=1e+02  Score=24.30  Aligned_cols=28  Identities=21%  Similarity=0.289  Sum_probs=22.6

Q ss_pred             CCceEEEEeCcccceeEEEEcCEEeeee
Q 023943          159 QGRRILLHFEAVDSAFCAWINGVPVGYS  186 (275)
Q Consensus       159 ~~~~i~L~f~gv~s~~~VwvNG~~VG~~  186 (275)
                      .|+...|.|---...+.|+|||+.+..-
T Consensus        75 ~g~~F~l~i~~~~~~f~i~vng~~~~~f  102 (128)
T smart00276       75 PGQPFDLTIIVQPDHFQIFVNGVHITTF  102 (128)
T ss_pred             CCCEEEEEEEEcCCEEEEEECCEeEEEe
Confidence            4677777777778899999999998753


No 50 
>PF13464 DUF4115:  Domain of unknown function (DUF4115)
Probab=25.63  E-value=1.1e+02  Score=21.89  Aligned_cols=25  Identities=20%  Similarity=0.268  Sum_probs=21.3

Q ss_pred             CceEEEEeCcccceeEEEEcCEEeee
Q 023943          160 GRRILLHFEAVDSAFCAWINGVPVGY  185 (275)
Q Consensus       160 ~~~i~L~f~gv~s~~~VwvNG~~VG~  185 (275)
                      ...+.|+++.. ++.+|.+||+.++.
T Consensus        37 ~~~~~i~iGna-~~v~v~~nG~~~~~   61 (77)
T PF13464_consen   37 KEPFRIRIGNA-GAVEVTVNGKPVDL   61 (77)
T ss_pred             CCCEEEEEeCC-CcEEEEECCEECCC
Confidence            56788999875 58899999999987


No 51 
>COG3148 Uncharacterized conserved protein [Function unknown]
Probab=25.58  E-value=46  Score=29.48  Aligned_cols=51  Identities=12%  Similarity=0.189  Sum_probs=33.1

Q ss_pred             CCcccccccCCCCCCCccccCChhhhcccCCchhhHHHhhhcccccCCCCCcEEecCccce
Q 023943           22 EDPSFIKWRKRDPHVTLRCHDSVEVSNSAVWDDDAVHEALTSAAFWTNGLPFVKSLSGHWK   82 (275)
Q Consensus        22 ~~p~~~~~n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~LnG~W~   82 (275)
                      +||+++..=+.|..-++.-|+++...        +...  -.++....+.+..+.|+|+|+
T Consensus        90 ~~~eLl~ll~~P~~~p~lvfP~e~a~--------e~t~--v~~~~p~~k~plfIllDgTW~  140 (231)
T COG3148          90 PNPELLALLANPDYQPYLVFPAEYAE--------ELTE--VISTAPAEKPPLFILLDGTWR  140 (231)
T ss_pred             CCHHHHHHHhCCCCceEEEcchHHHH--------HHHH--HhhcccccCCceEEEecCccH
Confidence            38888888888888888889986521        1110  011111224568999999997


No 52 
>smart00776 NPCBM This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins.
Probab=25.47  E-value=3.8e+02  Score=21.93  Aligned_cols=42  Identities=19%  Similarity=0.245  Sum_probs=28.9

Q ss_pred             ceeEEEEcCEEeeeecC---CC--CCceeccccccccCCCCCceEEEEEEEecCCC
Q 023943          172 SAFCAWINGVPVGYSQD---SR--LPAEFEISDYCYPHGSDKKNVLAVQVFRWSDG  222 (275)
Q Consensus       172 s~~~VwvNG~~VG~~~~---~~--~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~dg  222 (275)
                      -.+.|+.+|+.+-.+..   ..  .+.++||+        |. ++|.++|....+|
T Consensus        85 V~F~V~~Dg~~l~~s~~~~~~~~~~~~~vdv~--------G~-~~L~L~v~~~g~g  131 (145)
T smart00776       85 VVFEVYADGTKLYNSGVLRGADPAKAVDVDVS--------GA-KELRLVVTDAGDG  131 (145)
T ss_pred             EEEEEEeCCEeEEEcccccCCCCCeEEEEEcC--------CC-eEEEEEEEeCCCC
Confidence            36799999999977742   22  23455553        74 8999999876544


No 53 
>smart00561 MBT Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2. These proteins are involved in transcriptional regulation.
Probab=24.74  E-value=59  Score=24.85  Aligned_cols=21  Identities=38%  Similarity=0.874  Sum_probs=18.2

Q ss_pred             CCceEEEEeCcccceeEEEEc
Q 023943          159 QGRRILLHFEAVDSAFCAWIN  179 (275)
Q Consensus       159 ~~~~i~L~f~gv~s~~~VwvN  179 (275)
                      .|+++.|+|+|-++.+..|++
T Consensus        54 ~g~~l~v~~dg~~~~~D~W~~   74 (96)
T smart00561       54 KGYRLLLHFDGWDDKYDFWCD   74 (96)
T ss_pred             ECCEEEEEEccCCCcCCEEEE
Confidence            378999999999988888875


No 54 
>PF11324 DUF3126:  Protein of unknown function (DUF3126);  InterPro: IPR021473  This family of proteins with unknown function appears to be restricted to Alphaproteobacteria.
Probab=23.88  E-value=71  Score=22.72  Aligned_cols=18  Identities=17%  Similarity=0.163  Sum_probs=14.8

Q ss_pred             cccceeEEEEcCEEeeee
Q 023943          169 AVDSAFCAWINGVPVGYS  186 (275)
Q Consensus       169 gv~s~~~VwvNG~~VG~~  186 (275)
                      .-+..++||+++++||.-
T Consensus        25 k~~dsaEV~~g~EfiGvi   42 (63)
T PF11324_consen   25 KKDDSAEVYIGDEFIGVI   42 (63)
T ss_pred             CCCCceEEEeCCEEEEEE
Confidence            346689999999999963


No 55 
>PF03422 CBM_6:  Carbohydrate binding module (family 6);  InterPro: IPR005084 A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classification of cellulose-binding domains was based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. This entry represents CBM6 from CAZY which was previously known as cellulose-binding domain family VI (CBD VI). CBM6 binds to amorphous cellulose, xylan, mixed beta-(1,3)(1,4)glucan and beta-1,3-glucan. CBM6 adopts a classic lectin-like beta-jelly roll fold, predominantly consisting of five antiparallel beta-strands on one face and four antiparallel beta-strands on the other face. It contains two potential ligand binding sites, named respectively cleft A and B. These clefts include aromatic residues which are probably involved in the substrate binding. Cleft B is located on the concave surface of one beta-sheet, and cleft A on one edge of the protein between the loop that connects the inner and outer beta-sheets of the jellyroll fold. The multiple binding clefts confer the extensive range of specificities displayed by the domain.; GO: 0030246 carbohydrate binding; PDB: 1UY1_A 1UY3_A 1UY4_A 1UY2_A 1UYY_A 1UXZ_B 1UYZ_A 1UY0_B 1UYX_A 1UZ0_A ....
Probab=23.26  E-value=3.3e+02  Score=20.71  Aligned_cols=40  Identities=10%  Similarity=0.061  Sum_probs=23.7

Q ss_pred             eeEEEEcC---EEeeeec----CCCCC---ceeccccccccCCCCCceEEEEEEEe
Q 023943          173 AFCAWING---VPVGYSQ----DSRLP---AEFEISDYCYPHGSDKKNVLAVQVFR  218 (275)
Q Consensus       173 ~~~VwvNG---~~VG~~~----~~~~p---~efdIT~~Lk~~~~G~eN~L~V~V~~  218 (275)
                      ..+|+|||   +.++...    ++...   .+..|  .|..   | .|.|.+....
T Consensus        61 ~~~l~id~~~g~~~~~~~~~~tg~w~~~~~~~~~v--~l~~---G-~h~i~l~~~~  110 (125)
T PF03422_consen   61 TIELRIDGPDGTLIGTVSLPPTGGWDTWQTVSVSV--KLPA---G-KHTIYLVFNG  110 (125)
T ss_dssp             EEEEEETTTTSEEEEEEEEE-ESSTTEEEEEEEEE--EEES---E-EEEEEEEESS
T ss_pred             EEEEEECCCCCcEEEEEEEcCCCCccccEEEEEEE--eeCC---C-eeEEEEEEEC
Confidence            56777777   7776542    33333   22223  3556   8 4888888765


No 56 
>PRK01904 hypothetical protein; Provisional
Probab=22.44  E-value=1.6e+02  Score=25.98  Aligned_cols=41  Identities=17%  Similarity=0.081  Sum_probs=25.8

Q ss_pred             ceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecC
Q 023943          172 SAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWS  220 (275)
Q Consensus       172 s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~  220 (275)
                      ..-.+-|||+.+..+-.. ..-.+.++    .   |++|+|+|++...-
T Consensus        29 ~i~lL~vnG~kv~~s~~~-~~~~l~L~----d---gg~hQIv~ry~~~~   69 (219)
T PRK01904         29 NIDFLAIDGQKASKSLLK-EAKSFNIN----D---TQVHQVVVRVSEIV   69 (219)
T ss_pred             ceEEEEECCEECcccccc-CCcceEeC----C---CCceEEEEEEeecc
Confidence            345678999999643222 22334444    3   53599999999753


No 57 
>KOG1752 consensus Glutaredoxin and related proteins [Posttranslational modification, protein turnover, chaperones]
Probab=21.91  E-value=65  Score=25.08  Aligned_cols=25  Identities=16%  Similarity=0.195  Sum_probs=18.9

Q ss_pred             EEEEeCcccceeEEEEcCEEeeeec
Q 023943          163 ILLHFEAVDSAFCAWINGVPVGYSQ  187 (275)
Q Consensus       163 i~L~f~gv~s~~~VwvNG~~VG~~~  187 (275)
                      ....+.|..+.-.|||||++||...
T Consensus        58 ~l~~~tg~~tvP~vFI~Gk~iGG~~   82 (104)
T KOG1752|consen   58 ALKKLTGQRTVPNVFIGGKFIGGAS   82 (104)
T ss_pred             HHHHhcCCCCCCEEEECCEEEcCHH
Confidence            3344566668899999999998754


No 58 
>PRK15222 putative pilin structural protein SafD; Provisional
Probab=20.70  E-value=1.8e+02  Score=24.36  Aligned_cols=44  Identities=14%  Similarity=0.284  Sum_probs=30.3

Q ss_pred             EEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEe
Q 023943          165 LHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFR  218 (275)
Q Consensus       165 L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~  218 (275)
                      +...|.++.|.||.|++.+|-..     -.+-|..- +.   . .|+|.||+.-
T Consensus        60 I~~~g~htGF~Vwsna~q~gg~p-----~~Yil~G~-~d---s-~h~LrVRl~G  103 (156)
T PRK15222         60 VTYHGSHSGFRVWSDEQKAGNTP-----TVLLLSGQ-QD---P-RHHIQVRLEG  103 (156)
T ss_pred             EEeCCCceeEEEEecccccCCCc-----cEEEEECC-CC---C-cceEEEEecC
Confidence            33888899999999999986543     33333321 22   3 4899999984


No 59 
>cd02848 Chitinase_N_term Chitinase N-terminus domain. Chitinases hydrolyze the abundant natural biopolymer chitin, producing smaller chito-oligosaccharides. Chitin consists of multiple N-acetyl-D-glucosamine (NAG) residues connected via beta-1,4-glycosidic linkages and is an important structural element of fungal cell wall and arthropod exoskeletons. On the basis of the mode of chitin hydrolysis, chitinases are classified as random, endo-, and exo-chitinases and based on sequence criteria, chitinases belong to families 18 and 19 of glycosyl hydrolases.  The N-terminus of chitinase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at  either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitob
Probab=20.24  E-value=2.5e+02  Score=22.09  Aligned_cols=45  Identities=13%  Similarity=0.137  Sum_probs=30.1

Q ss_pred             CcccceeEEEEcCEEeeeec--CCCCCceeccccccccCCCCCceEEEEEEEec
Q 023943          168 EAVDSAFCAWINGVPVGYSQ--DSRLPAEFEISDYCYPHGSDKKNVLAVQVFRW  219 (275)
Q Consensus       168 ~gv~s~~~VwvNG~~VG~~~--~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~  219 (275)
                      +.....++|++||+.|-...  ++-....|+++   +.   | .-.+.|++-+.
T Consensus        45 G~~Gd~a~vl~dg~~V~~G~~~~~~~~at~~v~---kg---G-~y~m~V~lCn~   91 (106)
T cd02848          45 GDPGDTYKVLLDGKEVWSGALTGSSGTATFKVG---KG---G-RYQMQVALCNG   91 (106)
T ss_pred             CCCCcEEEEEECCeEEEcccCCCCccEEEEEeC---CC---C-eEEEEEEEECC
Confidence            55667899999999984432  22234566654   34   6 48999988764


No 60 
>PF01589 Alpha_E1_glycop:  Alphavirus E1 glycoprotein;  InterPro: IPR002548 Alphaviruses are enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Semliki Forest and Sindbis viruses. Alphaviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope proteins P62 and E1 that associate as a heterodimer. The viral membrane-anchored surface glycoproteins are responsible for receptor recognition and entry into target cells through membrane fusion. The proteolytic maturation of P62 into E2 (IPR000936 from INTERPRO) and E3 (IPR002533 from INTERPRO) causes a change in the viral surface. Together the E1, E2, and sometimes E3, glycoprotein "spikes" form an E1/E2 dimer or an E1/E2/E3 trimer, where E2 extends from the centre to the vertices, E1 fills the space between the vertices, and E3, if present, is at the distal end of the spike. Upon exposure of the virus to the acidity of the endosome, E1 dissociates from E2 to form an E1 homotrimer, which is necessary for the fusion step to drive the cellular and viral membranes together. The alphaviral glycoprotein E1 is a class II viral fusion protein, which is structurally different from the class I fusion proteins found in influenza virus and HIV. The structure of the Semliki Forest virus revealed a structure that is similar to that of flaviviral glycoprotein E, with three structural domains in the same primary sequence arrangement. This entry represents all three domains of the alphaviral E1 glycoprotein.; GO: 0004252 serine-type endopeptidase activity, 0019028 viral capsid, 0055036 virion membrane; PDB: 2YEW_L 1LD4_P 1Z8Y_K 3MUU_B 3N44_F 2XFB_F 3N42_F 2XFC_H 3N40_F 3N41_F ....
Probab=20.06  E-value=1.5e+02  Score=29.22  Aligned_cols=28  Identities=18%  Similarity=0.332  Sum_probs=24.3

Q ss_pred             ceEEEEeCcccceeEEEEcCEEeeeecC
Q 023943          161 RRILLHFEAVDSAFCAWINGVPVGYSQD  188 (275)
Q Consensus       161 ~~i~L~f~gv~s~~~VwvNG~~VG~~~~  188 (275)
                      -.+.+.++.+....++||||..-+...|
T Consensus       195 a~l~ityG~~~~~v~~yVNG~t~~~~~~  222 (502)
T PF01589_consen  195 AKLRITYGNVNQTVDVYVNGETPVNSGD  222 (502)
T ss_dssp             EEEEEEESSEEEEEEEESSSSCEEEETT
T ss_pred             eEEEEEEcceEEEEEEEEcCccceeccc
Confidence            4678899999999999999998877765


No 61 
>PRK06789 flagellar motor switch protein; Validated
Probab=20.01  E-value=1.1e+02  Score=22.45  Aligned_cols=27  Identities=11%  Similarity=0.209  Sum_probs=18.5

Q ss_pred             CceEEEEeCcccceeEEEEcCEEeeeec
Q 023943          160 GRRILLHFEAVDSAFCAWINGVPVGYSQ  187 (275)
Q Consensus       160 ~~~i~L~f~gv~s~~~VwvNG~~VG~~~  187 (275)
                      |.-+.|. ..+.....+++||+.+|+.+
T Consensus        31 Gsvi~Ld-k~~~epvdI~vNg~lia~GE   57 (74)
T PRK06789         31 GTLYRLE-NSTKNTVRLMLENEEIGTGK   57 (74)
T ss_pred             CCEEEeC-CcCCCCEEEEECCEEEeEEe
Confidence            4444443 23466789999999999854


Done!