Query 023943
Match_columns 275
No_of_seqs 342 out of 1986
Neff 7.0
Searched_HMMs 46136
Date Fri Mar 29 07:46:33 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/023943.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/023943hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK09525 lacZ beta-D-galactosi 100.0 1.2E-58 2.5E-63 482.3 23.4 242 8-273 4-246 (1027)
2 PRK10340 ebgA cryptic beta-D-g 100.0 1.8E-57 3.8E-62 473.9 24.0 233 18-272 1-233 (1021)
3 PRK10150 beta-D-glucuronidase; 100.0 1.4E-34 3E-39 288.9 20.2 183 71-273 9-207 (604)
4 PF02837 Glyco_hydro_2_N: Glyc 100.0 9.2E-35 2E-39 244.6 14.5 165 72-246 2-167 (167)
5 COG3250 LacZ Beta-galactosidas 100.0 7.1E-30 1.5E-34 259.2 13.0 196 71-272 9-206 (808)
6 KOG2024 Beta-Glucuronidase GUS 99.7 4.1E-16 9E-21 138.1 9.6 176 70-266 28-209 (297)
7 KOG2230 Predicted beta-mannosi 99.2 1.4E-10 3E-15 112.4 11.5 169 73-263 21-227 (867)
8 PF13364 BetaGal_dom4_5: Beta- 97.9 4.9E-05 1.1E-09 60.2 8.3 71 142-218 33-107 (111)
9 COG3250 LacZ Beta-galactosidas 97.9 1.6E-05 3.4E-10 82.3 6.5 70 145-218 64-133 (808)
10 PLN03059 beta-galactosidase; P 97.8 0.00015 3.3E-09 74.8 10.3 94 143-244 469-571 (840)
11 PF08531 Bac_rhamnosid_N: Alph 97.6 0.00012 2.5E-09 62.4 5.9 53 161-218 4-65 (172)
12 PLN03059 beta-galactosidase; P 96.1 0.011 2.3E-07 61.5 6.0 66 143-216 618-712 (840)
13 KOG0496 Beta-galactosidase [Ca 95.9 0.026 5.6E-07 56.8 7.5 75 162-245 434-514 (649)
14 KOG0496 Beta-galactosidase [Ca 94.6 0.058 1.3E-06 54.3 5.5 68 143-218 556-625 (649)
15 PF07691 PA14: PA14 domain; I 92.5 1.2 2.7E-05 35.5 9.1 71 143-220 45-122 (145)
16 PF14683 CBM-like: Polysacchar 91.8 0.37 7.9E-06 40.9 5.3 69 146-219 63-153 (167)
17 PF03170 BcsB: Bacterial cellu 88.9 2.5 5.3E-05 42.8 9.3 71 146-220 29-112 (605)
18 PF08308 PEGA: PEGA domain; I 88.4 1 2.2E-05 32.0 4.6 43 165-219 4-46 (71)
19 smart00758 PA14 domain in bact 88.1 3.8 8.3E-05 32.6 8.3 70 143-219 43-113 (136)
20 PF03170 BcsB: Bacterial cellu 86.7 2.2 4.8E-05 43.1 7.5 67 149-219 327-410 (605)
21 PRK11114 cellulose synthase re 85.9 3.5 7.7E-05 43.0 8.7 70 148-221 83-166 (756)
22 PF06832 BiPBP_C: Penicillin-B 77.4 11 0.00024 28.0 6.4 48 159-214 30-77 (89)
23 PF12733 Cadherin-like: Cadher 71.3 19 0.0004 26.4 6.3 56 151-219 17-73 (88)
24 PF04566 RNA_pol_Rpb2_4: RNA p 67.2 5 0.00011 28.4 2.3 13 176-188 1-13 (63)
25 PF12222 PNGaseA: Peptide N-ac 65.6 14 0.0003 36.1 5.7 51 169-219 218-291 (427)
26 PF11008 DUF2846: Protein of u 59.9 30 0.00065 27.1 5.8 34 171-214 40-74 (117)
27 PF14324 PINIT: PINIT domain; 56.2 13 0.00028 30.5 3.3 51 162-218 74-131 (144)
28 PF11824 DUF3344: Protein of u 56.2 27 0.00058 31.9 5.6 66 148-217 40-132 (271)
29 PF14814 UB2H: Bifunctional tr 54.7 22 0.00047 26.5 4.0 42 143-185 38-82 (85)
30 PF07550 DUF1533: Protein of u 46.2 29 0.00062 24.4 3.3 43 173-219 8-58 (65)
31 PF09113 N-glycanase_C: Peptid 42.4 87 0.0019 25.8 6.0 67 147-218 11-118 (141)
32 smart00560 LamGL LamG-like jel 41.6 34 0.00073 27.2 3.5 27 161-187 64-90 (133)
33 PRK11114 cellulose synthase re 39.4 57 0.0012 34.2 5.6 39 148-186 378-427 (756)
34 PF07908 D-aminoacyl_C: D-amin 38.0 26 0.00056 23.3 1.9 13 173-185 20-32 (48)
35 PF10262 Rdx: Rdx family; Int 37.0 48 0.001 23.9 3.4 23 161-184 33-55 (76)
36 PF11824 DUF3344: Protein of u 35.1 56 0.0012 29.8 4.2 48 165-216 204-254 (271)
37 TIGR02148 Fibro_Slime fibro-sl 34.4 2.1E+02 0.0045 21.8 6.7 53 163-219 20-76 (90)
38 PF00337 Gal-bind_lectin: Gala 33.1 69 0.0015 25.3 4.0 28 159-186 81-108 (133)
39 PF09829 DUF2057: Uncharacteri 32.2 98 0.0021 26.3 5.1 39 172-219 8-46 (189)
40 PF05775 AfaD: Enterobacteria 32.0 97 0.0021 24.6 4.5 45 164-218 18-62 (111)
41 PF13385 Laminin_G_3: Concanav 30.5 58 0.0013 25.1 3.2 24 162-187 89-112 (157)
42 KOG4342 Alpha-mannosidase [Car 30.2 1.7E+02 0.0037 30.4 6.9 67 145-218 103-172 (1078)
43 cd00070 GLECT Galectin/galacto 29.3 83 0.0018 24.8 3.9 28 159-186 76-103 (127)
44 COG0278 Glutaredoxin-related p 28.8 33 0.00073 26.8 1.4 16 171-186 70-85 (105)
45 PF15625 CC2D2AN-C2: CC2D2A N- 28.7 94 0.002 26.0 4.3 40 172-220 39-78 (168)
46 PF06439 DUF1080: Domain of Un 27.6 1.1E+02 0.0023 25.2 4.4 21 171-191 138-158 (185)
47 TIGR02412 pepN_strep_liv amino 26.6 7.9E+02 0.017 26.1 11.8 63 145-218 35-98 (831)
48 PRK10824 glutaredoxin-4; Provi 26.0 48 0.001 26.3 1.9 23 164-186 62-84 (115)
49 smart00276 GLECT Galectin. Gal 25.6 1E+02 0.0023 24.3 3.9 28 159-186 75-102 (128)
50 PF13464 DUF4115: Domain of un 25.6 1.1E+02 0.0024 21.9 3.7 25 160-185 37-61 (77)
51 COG3148 Uncharacterized conser 25.6 46 0.00099 29.5 1.8 51 22-82 90-140 (231)
52 smart00776 NPCBM This novel pu 25.5 3.8E+02 0.0082 21.9 7.8 42 172-222 85-131 (145)
53 smart00561 MBT Present in Dros 24.7 59 0.0013 24.9 2.1 21 159-179 54-74 (96)
54 PF11324 DUF3126: Protein of u 23.9 71 0.0015 22.7 2.2 18 169-186 25-42 (63)
55 PF03422 CBM_6: Carbohydrate b 23.3 3.3E+02 0.0072 20.7 6.3 40 173-218 61-110 (125)
56 PRK01904 hypothetical protein; 22.4 1.6E+02 0.0034 26.0 4.6 41 172-220 29-69 (219)
57 KOG1752 Glutaredoxin and relat 21.9 65 0.0014 25.1 1.9 25 163-187 58-82 (104)
58 PRK15222 putative pilin struct 20.7 1.8E+02 0.004 24.4 4.4 44 165-218 60-103 (156)
59 cd02848 Chitinase_N_term Chiti 20.2 2.5E+02 0.0054 22.1 4.8 45 168-219 45-91 (106)
60 PF01589 Alpha_E1_glycop: Alph 20.1 1.5E+02 0.0032 29.2 4.2 28 161-188 195-222 (502)
61 PRK06789 flagellar motor switc 20.0 1.1E+02 0.0024 22.4 2.6 27 160-187 31-57 (74)
No 1
>PRK09525 lacZ beta-D-galactosidase; Reviewed
Probab=100.00 E-value=1.2e-58 Score=482.29 Aligned_cols=242 Identities=43% Similarity=0.804 Sum_probs=221.3
Q ss_pred ccccccccCCCCCCCCcccccccCCCCCCCccccCChhhhcccCCchhhHHHhhhcccccCCCCCcEEecCccceEEecC
Q 023943 8 LPFALENANGYKVWEDPSFIKWRKRDPHVTLRCHDSVEVSNSAVWDDDAVHEALTSAAFWTNGLPFVKSLSGHWKFFLAS 87 (275)
Q Consensus 8 ~~~~~~~~~~~~~w~~p~~~~~n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~LnG~W~F~~~~ 87 (275)
+|..|.+....++||||+|+++|||||||+|+||++.++|+. . ..++..++|||.|+|++.+
T Consensus 4 ~~~~~~~~~~~~~wenp~v~~~nr~~~~a~~~~~~~~~~a~~---------~---------~~s~~~~sLnG~W~F~~~~ 65 (1027)
T PRK09525 4 IMDSLAQILARRDWENPGVTQLNRLPAHPPFASWRNSEAARD---------D---------RPSQQRQSLNGEWRFSYFP 65 (1027)
T ss_pred chhHHHhhhccCCccCccccCCCCCCCCCCcCCcCCHHHHhh---------c---------cCCcceEecCCCcceeECC
Confidence 345555544457999999999999999999999999985432 1 1245789999999999999
Q ss_pred CCCCCCccccCCCCCCCCCeEeccCcccccccCCCCcccceeccCCCCCCCCcccCCcccEEEEEEcCCCCCCc-eEEEE
Q 023943 88 SPPDVPLNFHKSSFQDSKWEAIPVPSNWQMHGFDRPIYTNVVYPFPLDPPNVPAENPTGCYRTYFHIPKEWQGR-RILLH 166 (275)
Q Consensus 88 ~~~~~p~~~~~~~~d~~~W~~i~VP~~w~~~g~~~p~y~n~~yp~~~~pp~vp~~n~~g~Yrr~F~lp~~~~~~-~i~L~ 166 (275)
.+.+.|++|+..++++ |++|+||++|+++|++.++|+|+.|||+.+||++|.+|++|||||+|++|++|+++ +++|+
T Consensus 66 ~~~~~~~~~~~~~~~~--w~~I~VP~~w~~~G~~~~~y~n~~ypf~~~~p~vp~~n~~gwYrr~F~vp~~w~~~~rv~L~ 143 (1027)
T PRK09525 66 APEAVPESWLECDLPD--ADTIPVPSNWQLHGYDAPIYTNVTYPIPVNPPFVPEENPTGCYSLTFTVDESWLQSGQTRII 143 (1027)
T ss_pred ChhhCcccccccCCCC--CcEeCCCCcHHhcCCCCCccccccCCCCCCCCCCCCcCCeEEEEEEEEeChhhcCCCeEEEE
Confidence 9988999999988865 99999999999999999999999999999999999889999999999999999887 99999
Q ss_pred eCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecCCCCcccCCCCCccccccceeEEEEeC
Q 023943 167 FEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWSDGSYLEDQDHWWLSGIHRDVLLLAKP 246 (275)
Q Consensus 167 f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~dgs~ledqd~w~~~GI~RdV~L~~~p 246 (275)
|+||++.++|||||++||+|+|+|+||+||||++|++ | +|+|+|+|.+|++|+|+++||+|+++||||+|+|+++|
T Consensus 144 FeGV~~~a~VwvNG~~VG~~~g~~~pfefDIT~~l~~---G-~N~L~V~V~~~sdgs~~e~qd~w~~sGI~R~V~L~~~p 219 (1027)
T PRK09525 144 FDGVNSAFHLWCNGRWVGYSQDSRLPAEFDLSPFLRA---G-ENRLAVMVLRWSDGSYLEDQDMWRMSGIFRDVSLLHKP 219 (1027)
T ss_pred ECeeccEEEEEECCEEEEeecCCCceEEEEChhhhcC---C-ccEEEEEEEecCCCCccccCCceeeccccceEEEEEcC
Confidence 9999999999999999999999999999999999999 9 59999999999999999999999999999999999999
Q ss_pred CceEEeEEEEEeecCCeeEEEEEEEEe
Q 023943 247 QVFIADYFFKSNLAEDFSLADIQVNTC 273 (275)
Q Consensus 247 ~~~I~D~~v~t~ld~~~~~~~l~v~~~ 273 (275)
++||+|++|+++++.++++|+|+|++.
T Consensus 220 ~~~I~d~~v~t~l~~~~~~a~v~v~v~ 246 (1027)
T PRK09525 220 TTQLSDFHITTELDDDFRRAVLEVEAQ 246 (1027)
T ss_pred CcEEeeeEEEeeccCccceEEEEEEEE
Confidence 999999999999998887888877653
No 2
>PRK10340 ebgA cryptic beta-D-galactosidase subunit alpha; Reviewed
Probab=100.00 E-value=1.8e-57 Score=473.94 Aligned_cols=233 Identities=37% Similarity=0.743 Sum_probs=216.4
Q ss_pred CCCCCCcccccccCCCCCCCccccCChhhhcccCCchhhHHHhhhcccccCCCCCcEEecCccceEEecCCCCCCCcccc
Q 023943 18 YKVWEDPSFIKWRKRDPHVTLRCHDSVEVSNSAVWDDDAVHEALTSAAFWTNGLPFVKSLSGHWKFFLASSPPDVPLNFH 97 (275)
Q Consensus 18 ~~~w~~p~~~~~n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~LnG~W~F~~~~~~~~~p~~~~ 97 (275)
+++||||+++++|||||||+|+||.+.++|+. ++++ .++.+++|||.|+|++.+.+...|++|+
T Consensus 1 ~~~wen~~~~~~nr~~~~a~~~~~~~~~~a~~---------~~~~-------~~~~~~~LnG~W~F~~~~~~~~~~~~f~ 64 (1021)
T PRK10340 1 MNRWENIQLTHENRLAPRAYFFSYDSVAQART---------FARE-------TSSLFLLLSGQWNFHFFDHPLYVPEAFT 64 (1021)
T ss_pred CCcccCccccCCCCCCCCCCcCCcCCHHHHhh---------cccc-------cCCceeecCcceeEEEeCCccccccccc
Confidence 36899999999999999999999999986542 2111 2468899999999999988888899999
Q ss_pred CCCCCCCCCeEeccCcccccccCCCCcccceeccCCCCCCCCcccCCcccEEEEEEcCCCCCCceEEEEeCcccceeEEE
Q 023943 98 KSSFQDSKWEAIPVPSNWQMHGFDRPIYTNVVYPFPLDPPNVPAENPTGCYRTYFHIPKEWQGRRILLHFEAVDSAFCAW 177 (275)
Q Consensus 98 ~~~~d~~~W~~i~VP~~w~~~g~~~p~y~n~~yp~~~~pp~vp~~n~~g~Yrr~F~lp~~~~~~~i~L~f~gv~s~~~Vw 177 (275)
.+++ ++|++|+|||+|+++|++.|+|+|..|||+..||++|..|++|||||+|++|++|+|++++|+|+||++.++||
T Consensus 65 ~~~~--~~W~~I~VP~~w~~~g~~~~~y~n~~y~~~~~~P~vp~~n~~g~Yrr~F~lp~~~~gkrv~L~FeGV~s~a~Vw 142 (1021)
T PRK10340 65 SELM--SDWGHITVPAMWQMEGHGKLQYTDEGFPFPIDVPFVPSDNPTGAYQRTFTLSDGWQGKQTIIKFDGVETYFEVY 142 (1021)
T ss_pred cCCC--CCCcEeecCCChhhcCCCCcccccccccCCCCCCCCCCcCCeEEEEEEEEeCcccccCcEEEEECccceEEEEE
Confidence 8887 67999999999999999999999999999999999998899999999999999999999999999999999999
Q ss_pred EcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecCCCCcccCCCCCccccccceeEEEEeCCceEEeEEEEE
Q 023943 178 INGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWSDGSYLEDQDHWWLSGIHRDVLLLAKPQVFIADYFFKS 257 (275)
Q Consensus 178 vNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~dgs~ledqd~w~~~GI~RdV~L~~~p~~~I~D~~v~t 257 (275)
|||++||+|+|+|+||+||||++|+. |+ |+|+|+|++|++++|+++||+|+++||||+|+|+++|++||+|++|++
T Consensus 143 vNG~~VG~~~g~~~pfefDIT~~l~~---G~-N~LaV~V~~~~d~s~le~qd~w~~sGI~R~V~L~~~p~~~I~d~~v~t 218 (1021)
T PRK10340 143 VNGQYVGFSKGSRLTAEFDISAMVKT---GD-NLLCVRVMQWADSTYLEDQDMWWLAGIFRDVYLVGKPLTHINDFTVRT 218 (1021)
T ss_pred ECCEEeccccCCCccEEEEcchhhCC---Cc-cEEEEEEEecCCCCccccCCccccccccceEEEEEeCCceEEeeEEEe
Confidence 99999999999999999999999999 95 999999999999999999999999999999999999999999999999
Q ss_pred eecCCeeEEEEEEEE
Q 023943 258 NLAEDFSLADIQVNT 272 (275)
Q Consensus 258 ~ld~~~~~~~l~v~~ 272 (275)
+++.++++|+|+|++
T Consensus 219 ~l~~~~~~a~l~v~v 233 (1021)
T PRK10340 219 DFDEDYCDATLSCEV 233 (1021)
T ss_pred eccCccCceEEEEEE
Confidence 999887778877765
No 3
>PRK10150 beta-D-glucuronidase; Provisional
Probab=100.00 E-value=1.4e-34 Score=288.92 Aligned_cols=183 Identities=29% Similarity=0.436 Sum_probs=152.0
Q ss_pred CCcEEecCccceEEecCCCCCCCccccCCCCCCCCCeEeccCcccccccCCCCcccceeccCCCCCCCCcccCCcccEEE
Q 023943 71 LPFVKSLSGHWKFFLASSPPDVPLNFHKSSFQDSKWEAIPVPSNWQMHGFDRPIYTNVVYPFPLDPPNVPAENPTGCYRT 150 (275)
Q Consensus 71 ~~~~~~LnG~W~F~~~~~~~~~p~~~~~~~~d~~~W~~i~VP~~w~~~g~~~p~y~n~~yp~~~~pp~vp~~n~~g~Yrr 150 (275)
++..++|||.|+|+..+.+.+.+++|+...++. +..|.||++|+.++.+.+.. ...+.+||||
T Consensus 9 ~r~~~~Lng~W~F~~~~~~~~~~~~w~~~~~~~--~~~i~vP~~~~~~~~~~~~~---------------~~~G~~WYrr 71 (604)
T PRK10150 9 TREIKDLSGLWAFKLDRENCGIDQRWWESALPE--SRAMAVPGSFNDQFADADIR---------------NYVGDVWYQR 71 (604)
T ss_pred CeeeeecCCccceEECCccccccccccccCCCC--CcEecCCCchhhcccccccc---------------CCcccEEEEE
Confidence 356789999999999887766677787765543 35899999998876433211 1257899999
Q ss_pred EEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecC------CCCc
Q 023943 151 YFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWS------DGSY 224 (275)
Q Consensus 151 ~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~------dgs~ 224 (275)
+|++|+.|++++++|+|+||++.++|||||++||+|+|+|+||+||||++|+. |++|+|+|+|.+.. .|++
T Consensus 72 ~f~lp~~~~gk~v~L~Fegv~~~a~V~lNG~~vg~~~~~~~~f~~DIT~~l~~---G~~n~L~V~v~n~~~~~~~p~g~~ 148 (604)
T PRK10150 72 EVFIPKGWAGQRIVLRFGSVTHYAKVWVNGQEVMEHKGGYTPFEADITPYVYA---GKSVRITVCVNNELNWQTLPPGNV 148 (604)
T ss_pred EEECCcccCCCEEEEEECcccceEEEEECCEEeeeEcCCccceEEeCchhccC---CCceEEEEEEecCCCcccCCCCcc
Confidence 99999999999999999999999999999999999999999999999999999 86569999998642 2444
Q ss_pred ccC----------CCCCccccccceeEEEEeCCceEEeEEEEEeecCCeeEEEEEEEEe
Q 023943 225 LED----------QDHWWLSGIHRDVLLLAKPQVFIADYFFKSNLAEDFSLADIQVNTC 273 (275)
Q Consensus 225 led----------qd~w~~~GI~RdV~L~~~p~~~I~D~~v~t~ld~~~~~~~l~v~~~ 273 (275)
.++ +|+|.++||||+|+|+++|++||+|++|+++++.+.+.|+|+|++.
T Consensus 149 ~~~~~~~~k~~~~~d~~~~~GI~r~V~L~~~~~~~i~dv~v~~~~~~~~~~a~v~v~v~ 207 (604)
T PRK10150 149 IEDGNGKKKQKYNFDFFNYAGIHRPVMLYTTPKTHIDDITVVTELAQDLNHASVDWSVE 207 (604)
T ss_pred ccCCccccccccccccccccCCCceEEEEEcCCccCceEEEEeecCCcCceEEEEEEEE
Confidence 432 5677899999999999999999999999999987767777776653
No 4
>PF02837 Glyco_hydro_2_N: Glycosyl hydrolases family 2, sugar binding domain; InterPro: IPR006104 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 2 GH2 from CAZY comprises enzymes with several known activities; beta-galactosidase (3.2.1.23 from EC); beta-mannosidase (3.2.1.25 from EC); beta-glucuronidase (3.2.1.31 from EC). These enzymes contain a conserved glutamic acid residue which has been shown [], in Escherichia coli lacZ (P00722 from SWISSPROT), to be the general acid/base catalyst in the active site of the enzyme. This domain has a jelly-roll fold [].; GO: 0004553 hydrolase activity, hydrolyzing O-glycosyl compounds, 0005975 carbohydrate metabolic process; PDB: 3DEC_A 3OB8_A 3OBA_A 3CMG_A 3FN9_C 2VZU_A 2X09_A 2VZO_A 2X05_A 2VZV_B ....
Probab=100.00 E-value=9.2e-35 Score=244.55 Aligned_cols=165 Identities=38% Similarity=0.663 Sum_probs=135.8
Q ss_pred CcEEecCccceEEecCCCCCCCcc-ccCCCCCCCCCeEeccCcccccccCCCCcccceeccCCCCCCCCcccCCcccEEE
Q 023943 72 PFVKSLSGHWKFFLASSPPDVPLN-FHKSSFQDSKWEAIPVPSNWQMHGFDRPIYTNVVYPFPLDPPNVPAENPTGCYRT 150 (275)
Q Consensus 72 ~~~~~LnG~W~F~~~~~~~~~p~~-~~~~~~d~~~W~~i~VP~~w~~~g~~~p~y~n~~yp~~~~pp~vp~~n~~g~Yrr 150 (275)
+..++|||.|+|+........+.. +....++++.|..|.||++|+..++......+ ..+......+.+||||
T Consensus 2 r~~~~Lng~W~f~~~~~~~~~~~~~~~~~~~~~~~w~~i~VP~~~~~~~~~~~~~~~-------~~~~~~~~~~~~wYr~ 74 (167)
T PF02837_consen 2 RQVISLNGQWQFQPDDSPQDRPEGWFSWPDFDDSDWQPISVPGSWEDDLLRAFVPEN-------GDPELWDYSGYAWYRR 74 (167)
T ss_dssp TCEEESSEEEEEEEESSGGGSCTHHCCSTTCCCTTSEEEEESSEGTCCTSSTBTTST-------TGCCTSTCCSEEEEEE
T ss_pred CcEEECCccCCEEEeCCcccCccccccccccCcCCCeEEeCCCEeecCccceecccc-------ccccccccCceEEEEE
Confidence 578999999999999887665555 33446778899999999999987543210000 0001112378899999
Q ss_pred EEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecCCCCcccCCCC
Q 023943 151 YFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWSDGSYLEDQDH 230 (275)
Q Consensus 151 ~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~dgs~ledqd~ 230 (275)
+|++|++|+++++.|+|+||++.++|||||++||.+.++|+|++||||++|++ |++|+|+|+|.+..++++++.+++
T Consensus 75 ~f~lp~~~~~~~~~L~f~gv~~~a~v~vNG~~vg~~~~~~~~~~~dIt~~l~~---g~~N~l~V~v~~~~~~~~~~~~~~ 151 (167)
T PF02837_consen 75 TFTLPADWKGKRVFLRFEGVDYAAEVYVNGKLVGSHEGGYTPFEFDITDYLKP---GEENTLAVRVDNWPDGSTIPGFDY 151 (167)
T ss_dssp EEEESGGGTTSEEEEEESEEESEEEEEETTEEEEEEESTTS-EEEECGGGSSS---EEEEEEEEEEESSSGGGCGBSSSE
T ss_pred EEEeCchhcCceEEEEeccceEeeEEEeCCeEEeeeCCCcCCeEEeChhhccC---CCCEEEEEEEeecCCCceeecCcC
Confidence 99999999999999999999999999999999999999999999999999999 855999999999998888888888
Q ss_pred CccccccceeEEEEeC
Q 023943 231 WWLSGIHRDVLLLAKP 246 (275)
Q Consensus 231 w~~~GI~RdV~L~~~p 246 (275)
+.++||||+|+|+++|
T Consensus 152 ~~~~GI~r~V~L~~~p 167 (167)
T PF02837_consen 152 FNYAGIWRPVWLEATP 167 (167)
T ss_dssp EE--EEESEEEEEEEE
T ss_pred CccCccccEEEEEEEC
Confidence 8999999999999986
No 5
>COG3250 LacZ Beta-galactosidase/beta-glucuronidase [Carbohydrate transport and metabolism]
Probab=99.96 E-value=7.1e-30 Score=259.20 Aligned_cols=196 Identities=41% Similarity=0.640 Sum_probs=180.5
Q ss_pred CCcEEecCccceEEecCCCCCCCccccCCCCCCCCCeEeccCccccccc-CCCCcccceeccCCCCCCCCcccCCcccEE
Q 023943 71 LPFVKSLSGHWKFFLASSPPDVPLNFHKSSFQDSKWEAIPVPSNWQMHG-FDRPIYTNVVYPFPLDPPNVPAENPTGCYR 149 (275)
Q Consensus 71 ~~~~~~LnG~W~F~~~~~~~~~p~~~~~~~~d~~~W~~i~VP~~w~~~g-~~~p~y~n~~yp~~~~pp~vp~~n~~g~Yr 149 (275)
++..++|||.|.|++.+.+..+|..|.....++.. .|.||++|++++ ++.++|+|..||++..+|.++..++++.|.
T Consensus 9 ~~~~~~L~G~W~f~~~~~~~~~~~~w~~~~~s~~~--~i~VP~~w~~~~~~~~~~~~~~~y~~~~~~~~~~~~~~~~l~f 86 (808)
T COG3250 9 SREIKSLNGLWAFSLDDEPCAVPQRWPESLLSESR--AIAVPGNWQDQGEYDRPIYTNVWYPREVFPPKVPAGNRIGLYF 86 (808)
T ss_pred ccceeccCCceeEEecCCccccccccchhhhhhcc--CccCCccHhhcCccCcceecceeeeecccCCccccCCceEEEE
Confidence 34678999999999998888899999877665544 899999999999 999999999999999999998889999999
Q ss_pred EEEEcCCCC-CCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecCCCCcccCC
Q 023943 150 TYFHIPKEW-QGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWSDGSYLEDQ 228 (275)
Q Consensus 150 r~F~lp~~~-~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~dgs~ledq 228 (275)
+.|++.++| .+..++|.|+|+.+.++|||||+.||++.+++.++++|||+++++ |. |.+++.|.+|++++++++|
T Consensus 87 ~~~~~~~~v~~ng~~~l~~eg~~~~fev~vng~~v~~~~~~~~~~~~dis~~~~~---~~-~~~~~~v~~~~~~~~~~~~ 162 (808)
T COG3250 87 DAVDTLAKVWLNGQEVLEFQGVYTPFEVDVTGPYVGGGKDSRITVEFDISPNLQT---GP-NGLVVTVENWSKGSYYEDQ 162 (808)
T ss_pred eccccceeEEeCCeEEEEecCceeEEEEeeccceecCCcceEEEEeecccccccc---CC-ccCceEEeccCCCCCcccc
Confidence 999998876 567999999999999999999999999999999999999999999 84 9999999999999999999
Q ss_pred CCCccccccceeEEEEeCCceEEeEEEEEeecCCeeEEEEEEEE
Q 023943 229 DHWWLSGIHRDVLLLAKPQVFIADYFFKSNLAEDFSLADIQVNT 272 (275)
Q Consensus 229 d~w~~~GI~RdV~L~~~p~~~I~D~~v~t~ld~~~~~~~l~v~~ 272 (275)
||||++||+|||+|+.+|.+||.|++|.|+++.....+.+.+++
T Consensus 163 d~~r~aGi~RdV~l~i~p~~~~~di~V~t~~~~~~~~~~~~~~~ 206 (808)
T COG3250 163 DFFRYAGIHRDVMLYITPNTHVDDITVVTHLAEDCNHASLDVKI 206 (808)
T ss_pred CeeecccccceeEEEEccceeEeeeEEEEecchhhhhhheeehe
Confidence 99999999999999999999999999999998888877777433
No 6
>KOG2024 consensus Beta-Glucuronidase GUSB (glycosylhydrolase superfamily 2) [Carbohydrate transport and metabolism]
Probab=99.65 E-value=4.1e-16 Score=138.08 Aligned_cols=176 Identities=25% Similarity=0.399 Sum_probs=134.3
Q ss_pred CCCcEEecCccceEEecCCCCCC---CccccCCCCCCCCCeEeccCcccccccCCCCcccceeccCCCCCCCCcccCCcc
Q 023943 70 GLPFVKSLSGHWKFFLASSPPDV---PLNFHKSSFQDSKWEAIPVPSNWQMHGFDRPIYTNVVYPFPLDPPNVPAENPTG 146 (275)
Q Consensus 70 ~~~~~~~LnG~W~F~~~~~~~~~---p~~~~~~~~d~~~W~~i~VP~~w~~~g~~~p~y~n~~yp~~~~pp~vp~~n~~g 146 (275)
+++...+|+|-|.|..+.+.... -+.|+...+ ..-..|+||++++..|.+.+.. +.-+..
T Consensus 28 pire~~~ldgLw~f~r~~~~~~~~g~~~~w~~~~~--~~t~~mpvpss~nDi~~d~~lr---------------dfv~~~ 90 (297)
T KOG2024|consen 28 PIREVKSLDGLWSFVRDSNQNRLQGILEQWENKES--GPTQDMPVPSSFNDIGQDWRLR---------------DFVGLV 90 (297)
T ss_pred cchhhhhhCcchhcccCcccccchhHHhhhccccc--ccccccccccchhccccCCccc---------------cceeee
Confidence 46778899999999987764332 345655332 1225689999998877554221 124567
Q ss_pred cEEEEEEcCCCC---CCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecCCCC
Q 023943 147 CYRTYFHIPKEW---QGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWSDGS 223 (275)
Q Consensus 147 ~Yrr~F~lp~~~---~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~dgs 223 (275)
||.|++.+|+.| .++++.|||..+++.|.|||||..+-.|++++.|++-+|...++. |..|.+--....| .+.
T Consensus 91 wyer~v~vpe~w~~~~~~r~vlr~~s~H~~Aivwvng~~~~~h~gg~lP~~~~is~~~~~---g~~~~~dn~L~~~-t~~ 166 (297)
T KOG2024|consen 91 WYERTVTVPESWTQDLGKRVVLRIGSAHSYAIVWVNGVDALEHEGGHLPLEPDISALVFF---GPLPAIDNNLLSW-TGP 166 (297)
T ss_pred EEEEEEEcchhhhhhcCCeEEEEeecccceeEEEEcceeecccccCccccchhhhhhhhc---cccccccCccccc-ccC
Confidence 999999999998 478999999999999999999999999999999999999998888 7656222122222 122
Q ss_pred cccCCCCCccccccceeEEEEeCCceEEeEEEEEeecCCeeEE
Q 023943 224 YLEDQDHWWLSGIHRDVLLLAKPQVFIADYFFKSNLAEDFSLA 266 (275)
Q Consensus 224 ~ledqd~w~~~GI~RdV~L~~~p~~~I~D~~v~t~ld~~~~~~ 266 (275)
-..+.|+++++||.|+|-|+.+|.++|+|+.|.+.+..+...|
T Consensus 167 ~~~~~dffnYag~~~sv~l~t~p~vyi~~~~v~t~l~~~~~~a 209 (297)
T KOG2024|consen 167 NSFCFDFFNYAGEQRSVCLYTTPVVYIEDITVTTGLPHDSGCA 209 (297)
T ss_pred CcccccCCCchhhheeeeeccCCeEEecCcceeeccccCCcce
Confidence 2346689999999999999999999999999999887765433
No 7
>KOG2230 consensus Predicted beta-mannosidase [Carbohydrate transport and metabolism]
Probab=99.19 E-value=1.4e-10 Score=112.37 Aligned_cols=169 Identities=20% Similarity=0.258 Sum_probs=117.8
Q ss_pred cEEecCccceEEecCCCCCCCccccCCCCCCCCCeEeccCcccccccCCCCcccceeccC-CCCCCCCcccCCcccEEEE
Q 023943 73 FVKSLSGHWKFFLASSPPDVPLNFHKSSFQDSKWEAIPVPSNWQMHGFDRPIYTNVVYPF-PLDPPNVPAENPTGCYRTY 151 (275)
Q Consensus 73 ~~~~LnG~W~F~~~~~~~~~p~~~~~~~~d~~~W~~i~VP~~w~~~g~~~p~y~n~~yp~-~~~pp~vp~~n~~g~Yrr~ 151 (275)
...+|.|.|.|.-....-. .+..|||+.....|..-+..|-.|-+ ..+-.++. ...+.|.|+
T Consensus 21 ~t~~l~gnw~~~~~n~t~~---------------~~g~vpg~i~s~l~~~gii~~~~~~~n~ln~kwia--~d~wtysr~ 83 (867)
T KOG2230|consen 21 NTLVLAGNWEFSSSNKTVN---------------GTGTVPGDIYSDLYASGIIDNPLFGENHLNLKWIA--EDDWTYSRK 83 (867)
T ss_pred eeEEEecceEEecCCCcee---------------cCCCCCchHhHHHHhcccccCccccccccceeEEe--ccCccceee
Confidence 4567999999997654211 24578998766544332222222222 12333343 234679999
Q ss_pred EEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecC-------C---
Q 023943 152 FHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWS-------D--- 221 (275)
Q Consensus 152 F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~-------d--- 221 (275)
|.|=+--+-..++|.+||||+.+.||+||+.|+.+.++|.|+.|+||..+. | +|.|+++..... +
T Consensus 84 frl~dl~~~~~~~l~ie~vdtia~v~~n~~~v~~s~n~f~~y~~~vt~ii~----~-~n~i~~~f~ssv~yA~~~~~~~~ 158 (867)
T KOG2230|consen 84 FRLIDLDDTVGAFLEIESVDTIATVYVNGQKVLHSRNQFLPYHVNVTDIIA----G-ENDITIKFKSSVKYAEKRADEYK 158 (867)
T ss_pred eEEEEccccccceEEEeecceeEEEEEccEEEeeccccceeEEEeEEEEec----C-CcceEEEeehhHHHHHHHHHhhh
Confidence 988332234678999999999999999999999999999999999999765 5 599999887531 0
Q ss_pred -CC---------c-ccC--------CC--CC------ccccccceeEEEEeCCceEEeEEEEEeecCCe
Q 023943 222 -GS---------Y-LED--------QD--HW------WLSGIHRDVLLLAKPQVFIADYFFKSNLAEDF 263 (275)
Q Consensus 222 -gs---------~-led--------qd--~w------~~~GI~RdV~L~~~p~~~I~D~~v~t~ld~~~ 263 (275)
-+ | -|| |. .| ...||+.+|.|....-.++.|+.+++..+...
T Consensus 159 k~svPPdC~p~iyhGECH~NfiRK~Q~SFsWDWGPsfPt~GI~k~v~i~iY~~~~~~~f~~~~~~~~g~ 227 (867)
T KOG2230|consen 159 KHSLPPDCNPDIYHGECHQNFIRKAQYSFAWDWGPSFPTVGIPSTITINIYRGQYFHDFNWKTRFAHGK 227 (867)
T ss_pred ccCCCCCCCchhhccchHHHHHHHhhcceecccCCCCccCCCCcceEEEEEeeeEEEeeceeeeeecce
Confidence 00 0 011 22 23 25999999999999999999999998877653
No 8
>PF13364 BetaGal_dom4_5: Beta-galactosidase jelly roll domain; PDB: 1TG7_A 1XC6_A 3OGS_A 3OGV_A 3OGR_A 3OG2_A.
Probab=97.95 E-value=4.9e-05 Score=60.20 Aligned_cols=71 Identities=18% Similarity=0.149 Sum_probs=48.2
Q ss_pred cCCcccEEEEEEcCCCCCCceEE-EEe-CcccceeEEEEcCEEeeeec-CCCCCceecccc-ccccCCCCCceEEEEEEE
Q 023943 142 ENPTGCYRTYFHIPKEWQGRRIL-LHF-EAVDSAFCAWINGVPVGYSQ-DSRLPAEFEISD-YCYPHGSDKKNVLAVQVF 217 (275)
Q Consensus 142 ~n~~g~Yrr~F~lp~~~~~~~i~-L~f-~gv~s~~~VwvNG~~VG~~~-~~~~p~efdIT~-~Lk~~~~G~eN~L~V~V~ 217 (275)
..+..|||.+|..... ...+. |.. .|-...+.|||||+++|... +.-....|.|+. .|+. + .|.|+|.+.
T Consensus 33 ~~g~~~Yrg~F~~~~~--~~~~~~l~~~~g~~~~~~vwVNG~~~G~~~~~~g~q~tf~~p~~il~~---~-n~v~~vl~~ 106 (111)
T PF13364_consen 33 HAGYLWYRGTFTGTGQ--DTSLTPLNIQGGNAFRASVWVNGWFLGSYWPGIGPQTTFSVPAGILKY---G-NNVLVVLWD 106 (111)
T ss_dssp SSCEEEEEEEEETTTE--EEEEE-EEECSSTTEEEEEEETTEEEEEEETTTECCEEEEE-BTTBTT---C-EEEEEEEEE
T ss_pred CCCCEEEEEEEeCCCc--ceeEEEEeccCCCceEEEEEECCEEeeeecCCCCccEEEEeCceeecC---C-CEEEEEEEe
Confidence 3678999999964221 13444 444 36677899999999999977 444447788877 6776 6 467777666
Q ss_pred e
Q 023943 218 R 218 (275)
Q Consensus 218 ~ 218 (275)
+
T Consensus 107 ~ 107 (111)
T PF13364_consen 107 N 107 (111)
T ss_dssp -
T ss_pred C
Confidence 4
No 9
>COG3250 LacZ Beta-galactosidase/beta-glucuronidase [Carbohydrate transport and metabolism]
Probab=97.93 E-value=1.6e-05 Score=82.28 Aligned_cols=70 Identities=27% Similarity=0.245 Sum_probs=62.9
Q ss_pred cccEEEEEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEe
Q 023943 145 TGCYRTYFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFR 218 (275)
Q Consensus 145 ~g~Yrr~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~ 218 (275)
..+|.++|.+|....++++.|.|++++.-++||+||+.++.++|+|++|+++|+.-+.. | +|.+.+.+..
T Consensus 64 ~~~y~~~~~~~~~~~~~~~~l~f~~~~~~~~v~~ng~~~l~~eg~~~~fev~vng~~v~---~-~~~~~~~~~~ 133 (808)
T COG3250 64 NVWYPREVFPPKVPAGNRIGLYFDAVDTLAKVWLNGQEVLEFQGVYTPFEVDVTGPYVG---G-GKDSRITVEF 133 (808)
T ss_pred ceeeeecccCCccccCCceEEEEeccccceeEEeCCeEEEEecCceeEEEEeeccceec---C-CcceEEEEee
Confidence 46799999999888899999999999999999999999999999999999999975555 5 5888888876
No 10
>PLN03059 beta-galactosidase; Provisional
Probab=97.76 E-value=0.00015 Score=74.78 Aligned_cols=94 Identities=19% Similarity=0.154 Sum_probs=72.1
Q ss_pred CCcccEEEEEEcCCC---C-CCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceecccc--ccccCCCCCceEEEEEE
Q 023943 143 NPTGCYRTYFHIPKE---W-QGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISD--YCYPHGSDKKNVLAVQV 216 (275)
Q Consensus 143 n~~g~Yrr~F~lp~~---~-~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~--~Lk~~~~G~eN~L~V~V 216 (275)
.+..|||++|.++.+ | .+....|++..+...+.|||||+++|...+...-..|.+.. -|+. |. |+|.|.+
T Consensus 469 ~dYlwY~t~i~~~~~~~~~~~~~~~~L~v~~~~d~~~vFVNg~~~Gt~~~~~~~~~~~~~~~v~l~~---g~-n~L~iLs 544 (840)
T PLN03059 469 TDYLWYMTEVHIDPDEGFLKTGQYPVLTIFSAGHALHVFINGQLAGTVYGELSNPKLTFSQNVKLTV---GI-NKISLLS 544 (840)
T ss_pred CceEEEEEEEeecCCccccccCCCceEEEcccCcEEEEEECCEEEEEEEeecCCcceEEecccccCC---Cc-eEEEEEE
Confidence 667899999998764 2 25667899999999999999999999976655444454443 3667 84 9999999
Q ss_pred EecC---CCCcccCCCCCccccccceeEEEE
Q 023943 217 FRWS---DGSYLEDQDHWWLSGIHRDVLLLA 244 (275)
Q Consensus 217 ~~~~---dgs~ledqd~w~~~GI~RdV~L~~ 244 (275)
.+-- -|.++|.+ ..||.++|.|..
T Consensus 545 e~vG~~NyG~~le~~----~kGI~g~V~i~g 571 (840)
T PLN03059 545 VAVGLPNVGLHFETW----NAGVLGPVTLKG 571 (840)
T ss_pred EeCCCCccCcccccc----cccccccEEEec
Confidence 9753 26667644 499999999965
No 11
>PF08531 Bac_rhamnosid_N: Alpha-L-rhamnosidase N-terminal domain; InterPro: IPR013737 This domain is found in bacterial rhamnosidase A and B enzymes and is probably involved in substrate recognition. ; PDB: 2OKX_B.
Probab=97.62 E-value=0.00012 Score=62.41 Aligned_cols=53 Identities=25% Similarity=0.346 Sum_probs=35.6
Q ss_pred ceEEEEeCcccceeEEEEcCEEeeeec---C--CCCC----ceeccccccccCCCCCceEEEEEEEe
Q 023943 161 RRILLHFEAVDSAFCAWINGVPVGYSQ---D--SRLP----AEFEISDYCYPHGSDKKNVLAVQVFR 218 (275)
Q Consensus 161 ~~i~L~f~gv~s~~~VwvNG~~VG~~~---~--~~~p----~efdIT~~Lk~~~~G~eN~L~V~V~~ 218 (275)
++..|++-+ +..+++||||+.||... + .|.- -.+|||++|+. | +|.|+|.|.+
T Consensus 4 ~~A~l~isa-~g~Y~l~vNG~~V~~~~l~P~~t~y~~~~~Y~tyDVt~~L~~---G-~N~iav~lg~ 65 (172)
T PF08531_consen 4 RSARLYISA-LGRYELYVNGERVGDGPLAPGWTDYDKRVYYQTYDVTPYLRP---G-ENVIAVWLGN 65 (172)
T ss_dssp ---EEEEEE-ESEEEEEETTEEEEEE--------BTTEEEEEEEE-TTT--T---T-EEEEEEEEEE
T ss_pred eEEEEEEEe-CeeEEEEECCEEeeCCccccccccCCCceEEEEEeChHHhCC---C-CCEEEEEEeC
Confidence 356677766 56899999999999754 1 1111 36899999999 9 5999999986
No 12
>PLN03059 beta-galactosidase; Provisional
Probab=96.07 E-value=0.011 Score=61.48 Aligned_cols=66 Identities=26% Similarity=0.439 Sum_probs=50.0
Q ss_pred CCcccEEEEEEcCCCCCCce-EEEEeCcccceeEEEEcCEEeeeecCC---------------C----------CC--ce
Q 023943 143 NPTGCYRTYFHIPKEWQGRR-ILLHFEAVDSAFCAWINGVPVGYSQDS---------------R----------LP--AE 194 (275)
Q Consensus 143 n~~g~Yrr~F~lp~~~~~~~-i~L~f~gv~s~~~VwvNG~~VG~~~~~---------------~----------~p--~e 194 (275)
.+..||+.+|++|+ +.. ++|.+.| .....|||||+-||.-... | -| .-
T Consensus 618 ~p~twYK~~Fd~p~---g~Dpv~LDm~g-mGKG~aWVNG~nIGRYW~~~a~~~gC~~c~y~g~~~~~kc~~~cggP~q~l 693 (840)
T PLN03059 618 QPLTWYKTTFDAPG---GNDPLALDMSS-MGKGQIWINGQSIGRHWPAYTAHGSCNGCNYAGTFDDKKCRTNCGEPSQRW 693 (840)
T ss_pred CCceEEEEEEeCCC---CCCCEEEeccc-CCCeeEEECCcccccccccccccCCCccccccccccchhhhccCCCceeEE
Confidence 44789999999986 454 9999999 6789999999999975411 1 22 23
Q ss_pred ecccc-ccccCCCCCceEEEEEE
Q 023943 195 FEISD-YCYPHGSDKKNVLAVQV 216 (275)
Q Consensus 195 fdIT~-~Lk~~~~G~eN~L~V~V 216 (275)
+.|+. +||+ |+ |+|+|-=
T Consensus 694 YHVPr~~Lk~---g~-N~lViFE 712 (840)
T PLN03059 694 YHVPRSWLKP---SG-NLLIVFE 712 (840)
T ss_pred EeCcHHHhcc---CC-ceEEEEE
Confidence 56776 9999 85 9887753
No 13
>KOG0496 consensus Beta-galactosidase [Carbohydrate transport and metabolism]
Probab=95.88 E-value=0.026 Score=56.78 Aligned_cols=75 Identities=20% Similarity=0.191 Sum_probs=57.7
Q ss_pred eEEEEeC-cccceeEEEEcCEEeeeecCCCCCceecccc--ccccCCCCCceEEEEEEEecC---CCCcccCCCCCcccc
Q 023943 162 RILLHFE-AVDSAFCAWINGVPVGYSQDSRLPAEFEISD--YCYPHGSDKKNVLAVQVFRWS---DGSYLEDQDHWWLSG 235 (275)
Q Consensus 162 ~i~L~f~-gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~--~Lk~~~~G~eN~L~V~V~~~~---dgs~ledqd~w~~~G 235 (275)
...|.+. ++..+.+|||||+++|...+.+.-..+.+.. -|+. | +|.|++.+.+-- -| +.|. +..|
T Consensus 434 ~t~~~i~ls~g~~~hVfvNg~~~G~~~g~~~~~~~~~~~~~~l~~---g-~n~l~iL~~~~G~~n~G-~~e~----~~~G 504 (649)
T KOG0496|consen 434 TTSLKIPLSLGHALHVFVNGEFAGSLHGNNEKIKLNLSQPVGLKA---G-ENKLALLSENVGLPNYG-HFEN----DFKG 504 (649)
T ss_pred CceEeecccccceEEEEECCEEeeeEeccccceeEEeeccccccc---C-cceEEEEEEecCCCCcC-cccc----cccc
Confidence 4567777 9999999999999999998887666666554 4567 8 599999998752 24 3333 2589
Q ss_pred ccceeEEEEe
Q 023943 236 IHRDVLLLAK 245 (275)
Q Consensus 236 I~RdV~L~~~ 245 (275)
|.++|+|...
T Consensus 505 i~g~v~l~g~ 514 (649)
T KOG0496|consen 505 ILGPVYLNGL 514 (649)
T ss_pred cccceEEeee
Confidence 9999999876
No 14
>KOG0496 consensus Beta-galactosidase [Carbohydrate transport and metabolism]
Probab=94.62 E-value=0.058 Score=54.32 Aligned_cols=68 Identities=24% Similarity=0.370 Sum_probs=52.6
Q ss_pred CCcccEEEEEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCCC-ceecccc-ccccCCCCCceEEEEEEEe
Q 023943 143 NPTGCYRTYFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLP-AEFEISD-YCYPHGSDKKNVLAVQVFR 218 (275)
Q Consensus 143 n~~g~Yrr~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p-~efdIT~-~Lk~~~~G~eN~L~V~V~~ 218 (275)
.|.-||. .|++|+. ...+.|.+.| -....|||||+-||+..-++-| ..+-|+. +||+ |+ |.|+|-=..
T Consensus 556 ~P~~w~k-~f~~p~g--~~~t~Ldm~g-~GKG~vwVNG~niGRYW~~~G~Q~~yhvPr~~Lk~---~~-N~lvvfEee 625 (649)
T KOG0496|consen 556 QPLTWYK-TFDIPSG--SEPTALDMNG-WGKGQVWVNGQNIGRYWPSFGPQRTYHVPRSWLKP---SG-NLLVVFEEE 625 (649)
T ss_pred CCeEEEE-EecCCCC--CCCeEEecCC-CcceEEEECCcccccccCCCCCceEEECcHHHhCc---CC-ceEEEEEec
Confidence 4567787 9999986 4479999999 6789999999999987655443 4566776 8999 84 988775443
No 15
>PF07691 PA14: PA14 domain; InterPro: IPR011658 The PA14 domain forms an insert in bacterial beta-glucosidases, other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins and bacterial toxins, including anthrax protective antigen (PA). The domain also occurs in a Dictyostelium pre-spore cell-inducing factor Psi and in fibrocystin, the mammalian protein whose mutation leads to polycystic kidney and hepatic disease. The crystal structure of PA shows that this domain (named PA14 after its location in the PA20 pro-peptide) has a beta-barrel structure. The PA14 domain sequence suggests a binding function, rather than a catalytic role. The PA14 domain distribution is compatible with carbohydrate binding [].; PDB: 2XVG_A 2XVK_A 2XVL_A 2XJU_A 2XJT_A 2XJQ_A 2XJS_A 2XJV_A 2XJP_A 2XJR_A ....
Probab=92.49 E-value=1.2 Score=35.55 Aligned_cols=71 Identities=17% Similarity=0.206 Sum_probs=48.5
Q ss_pred CCcccEEEEEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCC-------CceeccccccccCCCCCceEEEEE
Q 023943 143 NPTGCYRTYFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRL-------PAEFEISDYCYPHGSDKKNVLAVQ 215 (275)
Q Consensus 143 n~~g~Yrr~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~-------p~efdIT~~Lk~~~~G~eN~L~V~ 215 (275)
+-...++-.|.+|.. ....+.+.+ +..++|||||+.|..+.+... +....-+-.|.+ |+...|.|.
T Consensus 45 ~~~~~~~G~~~~~~~---G~y~f~~~~-~d~~~l~idg~~vid~~~~~~~~~~~~~~~~~~~~v~l~~---g~~y~i~i~ 117 (145)
T PF07691_consen 45 NFSVRWTGYFKPPET---GTYTFSLTS-DDGARLWIDGKLVIDNWGNQGGGFFNSGPSSTSGTVTLEA---GGKYPIRIE 117 (145)
T ss_dssp SEEEEEEEEEEESSS---EEEEEEEEE-SSEEEEEETTEEEEECSCTTTSTTTTTSBCCEEEEEEE-T---T-EEEEEEE
T ss_pred eEEEEEEEEEecccC---ceEEEEEEe-cccEEEEECCEEEEcCCccccccccccccceEEEEEEeeC---CeeEEEEEE
Confidence 445678889998864 567777774 668999999999988765433 233333334555 656899999
Q ss_pred EEecC
Q 023943 216 VFRWS 220 (275)
Q Consensus 216 V~~~~ 220 (275)
..+..
T Consensus 118 y~~~~ 122 (145)
T PF07691_consen 118 YFNRG 122 (145)
T ss_dssp EEECS
T ss_pred EEECC
Confidence 88864
No 16
>PF14683 CBM-like: Polysaccharide lyase family 4, domain III; PDB: 1NKG_A 2XHN_B 3NJX_A 3NJV_A.
Probab=91.79 E-value=0.37 Score=40.90 Aligned_cols=69 Identities=14% Similarity=0.174 Sum_probs=38.7
Q ss_pred ccEEEEEEcCCCCCCc--eEEEEeCcc--cceeEEEEcCEEeee-----------------ecCCCCCceecccc-cccc
Q 023943 146 GCYRTYFHIPKEWQGR--RILLHFEAV--DSAFCAWINGVPVGY-----------------SQDSRLPAEFEISD-YCYP 203 (275)
Q Consensus 146 g~Yrr~F~lp~~~~~~--~i~L~f~gv--~s~~~VwvNG~~VG~-----------------~~~~~~p~efdIT~-~Lk~ 203 (275)
+-.+-+|.+++...++ .+.|.+-+. .....|.||| .++. +.|-+.-++|+|+. .|+.
T Consensus 63 ~~w~I~F~l~~~~~~~~~tL~i~la~a~~~~~~~V~vNg-~~~~~~~~~~~~d~~~~r~g~~~G~~~~~~~~ipa~~L~~ 141 (167)
T PF14683_consen 63 GTWTIKFDLDAVQLAGTYTLRIALAGASAGGRLQVSVNG-WSGPFPSAPFGNDNAIYRSGIHRGNYRLYEFDIPASLLKA 141 (167)
T ss_dssp --EEEEEEE-GGG-S--EEEEEEEEEEETT-EEEEEETT-EE-----------S--GGGT---S---EEEEEE-TTSS-S
T ss_pred CCEEEEEECCCCccCCcEEEEEEeccccCCCCEEEEEcC-ccCCccccccCCCCceeeCceecccEEEEEEEEcHHHEEe
Confidence 5677788887765333 333334333 3467899999 4432 22566788999987 8898
Q ss_pred CCCCCceEEEEEEEec
Q 023943 204 HGSDKKNVLAVQVFRW 219 (275)
Q Consensus 204 ~~~G~eN~L~V~V~~~ 219 (275)
| +|+|.+.+.+.
T Consensus 142 ---G-~Nti~lt~~~g 153 (167)
T PF14683_consen 142 ---G-ENTITLTVPSG 153 (167)
T ss_dssp ---E-EEEEEEEEE-S
T ss_pred ---c-cEEEEEEEccC
Confidence 9 59999999974
No 17
>PF03170 BcsB: Bacterial cellulose synthase subunit; InterPro: IPR018513 An operon encoding 4 proteins required for bacterial cellulose biosynthesis (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation with strains lacking cellulose synthase activity []. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinum. The calculated molecular mass of the protein encoded by bcsB is 85.3kDa []. BcsB encodes the catalytic subunit of cellulose synthase. The protein polymerises uridine 5'-diphosphate glucose to cellulose: UDP-glucose + (1,4-beta-D-glucosyl)(N) = UDP + (1,4-beta-D-glucosyl)(N+1). The enzyme is specifically activated by the nucleotide cyclic diguanylic acid. Sequence analysis suggests that BcsB contains several transmembrane (TM) domains, and shares a high degree of similarity with Escherichia coli YhjN.; GO: 0006011 UDP-glucose metabolic process, 0016020 membrane
Probab=88.86 E-value=2.5 Score=42.83 Aligned_cols=71 Identities=20% Similarity=0.302 Sum_probs=51.6
Q ss_pred ccEEEEEEcCCCCCCceEEEEeCc--------ccceeEEEEcCEEeeeec----CC-CCCceeccccccccCCCCCceEE
Q 023943 146 GCYRTYFHIPKEWQGRRILLHFEA--------VDSAFCAWINGVPVGYSQ----DS-RLPAEFEISDYCYPHGSDKKNVL 212 (275)
Q Consensus 146 g~Yrr~F~lp~~~~~~~i~L~f~g--------v~s~~~VwvNG~~VG~~~----~~-~~p~efdIT~~Lk~~~~G~eN~L 212 (275)
+...-.|.+|..|.-+...|+|.. -.+...|+|||+.||.-. +. ....+++|.+.+.. | .|+|
T Consensus 29 ~~~~~~f~v~~~~~v~~a~L~L~~~~S~~l~~~~S~L~V~lNg~~v~s~~l~~~~~~~~~~~i~Ip~~l~~---g-~N~l 104 (605)
T PF03170_consen 29 ASRTIYFPVPADWVVTKATLNLSYTYSPSLLPERSQLTVSLNGQPVGSIPLDAESAQPQTVTIPIPPALIK---G-FNRL 104 (605)
T ss_pred CceEEEEEcCCCccccceEEEEEEEECcccCCCcceEEEEECCEEeEEEecCcCCCCceEEEEecChhhcC---C-ceEE
Confidence 455667889888855544444332 136789999999999642 33 56788999988887 8 5999
Q ss_pred EEEEEecC
Q 023943 213 AVQVFRWS 220 (275)
Q Consensus 213 ~V~V~~~~ 220 (275)
.|++....
T Consensus 105 ~~~~~~~~ 112 (605)
T PF03170_consen 105 TFEFIGHY 112 (605)
T ss_pred EEEEEecc
Confidence 99999764
No 18
>PF08308 PEGA: PEGA domain; InterPro: IPR013229 This domain is found in both archaea and bacteria and has similarity to S-layer (surface layer) proteins. It is named after the characteristic PEGA sequence motif found in this domain. The secondary structure of this domain is predicted to be beta-strands.
Probab=88.41 E-value=1 Score=32.01 Aligned_cols=43 Identities=16% Similarity=0.275 Sum_probs=29.1
Q ss_pred EEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEec
Q 023943 165 LHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRW 219 (275)
Q Consensus 165 L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~ 219 (275)
|.+...-..+.|||||+++|. +|.++. .|.. | .+.|.|+-..+
T Consensus 4 l~V~s~p~gA~V~vdg~~~G~-----tp~~~~---~l~~---G-~~~v~v~~~Gy 46 (71)
T PF08308_consen 4 LRVTSNPSGAEVYVDGKYIGT-----TPLTLK---DLPP---G-EHTVTVEKPGY 46 (71)
T ss_pred EEEEEECCCCEEEECCEEecc-----Ccceee---ecCC---c-cEEEEEEECCC
Confidence 455555567999999999993 454433 1456 8 48888876543
No 19
>smart00758 PA14 domain in bacterial beta-glucosidases other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins, and bacterial toxins.
Probab=88.10 E-value=3.8 Score=32.56 Aligned_cols=70 Identities=14% Similarity=0.193 Sum_probs=45.3
Q ss_pred CCcccEEEEEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCC-CceeccccccccCCCCCceEEEEEEEec
Q 023943 143 NPTGCYRTYFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRL-PAEFEISDYCYPHGSDKKNVLAVQVFRW 219 (275)
Q Consensus 143 n~~g~Yrr~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~-p~efdIT~~Lk~~~~G~eN~L~V~V~~~ 219 (275)
+-...++-.|.+|+. ....+.+.+ +..+.+||||+.|-.+.+... ..+.-.+-.|.+ |+...|.|+....
T Consensus 43 ~f~~~~~g~i~~~~~---G~y~f~~~~-~~~~~l~Idg~~vid~~~~~~~~~~~~~~v~l~~---g~~~~i~v~y~~~ 113 (136)
T smart00758 43 NFSVRWTGYLKPPED---GEYTFSITS-DDGARLWIDGKLVIDNWGKHEARPSTSSTLYLLA---GGTYPIRIEYFEA 113 (136)
T ss_pred cEEEEEEEEEECCCC---ccEEEEEEc-CCcEEEEECCcEEEcCCccCCCccccceeEEEeC---CcEEEEEEEEEeC
Confidence 344668888888764 456777754 678999999999987644322 111122224556 6568888887654
No 20
>PF03170 BcsB: Bacterial cellulose synthase subunit; InterPro: IPR018513 An operon encoding 4 proteins required for bacterial cellulose biosynthesis (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation with strains lacking cellulose synthase activity []. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinum. The calculated molecular mass of the protein encoded by bcsB is 85.3kDa []. BcsB encodes the catalytic subunit of cellulose synthase. The protein polymerises uridine 5'-diphosphate glucose to cellulose: UDP-glucose + (1,4-beta-D-glucosyl)(N) = UDP + (1,4-beta-D-glucosyl)(N+1). The enzyme is specifically activated by the nucleotide cyclic diguanylic acid. Sequence analysis suggests that BcsB contains several transmembrane (TM) domains, and shares a high degree of similarity with Escherichia coli YhjN.; GO: 0006011 UDP-glucose metabolic process, 0016020 membrane
Probab=86.68 E-value=2.2 Score=43.15 Aligned_cols=67 Identities=22% Similarity=0.331 Sum_probs=50.5
Q ss_pred EEEEEcCCC---CCCceEEEEeCcc--------cceeEEEEcCEEeeee------cCCCCCceeccccccccCCCCCceE
Q 023943 149 RTYFHIPKE---WQGRRILLHFEAV--------DSAFCAWINGVPVGYS------QDSRLPAEFEISDYCYPHGSDKKNV 211 (275)
Q Consensus 149 rr~F~lp~~---~~~~~i~L~f~gv--------~s~~~VwvNG~~VG~~------~~~~~p~efdIT~~Lk~~~~G~eN~ 211 (275)
+-.|.+|.+ |.++.+.|++... .+...|+|||++|+.- ......+++.|+.++.. |. |+
T Consensus 327 ~~~f~lP~dl~~~~~~~i~l~L~y~y~~~~~~~~S~l~V~vNg~~i~s~~L~~~~~~~~~~~~v~iP~~~~~---~~-N~ 402 (605)
T PF03170_consen 327 SFNFRLPPDLFAWDGSGIPLHLRYRYTPGLDFDGSRLTVYVNGQFIGSLPLTPADGAGFDRYTVSIPRLLLP---GR-NQ 402 (605)
T ss_pred eeEeeCCccccccCCCceEEEEEEecCCCCCCCCcEEEEEECCEEEEeEECCCCCCCccceeEEecCchhcC---CC-cE
Confidence 446888886 5667776666433 5567899999999864 35666788999988888 84 99
Q ss_pred EEEEEEec
Q 023943 212 LAVQVFRW 219 (275)
Q Consensus 212 L~V~V~~~ 219 (275)
|.+++.-.
T Consensus 403 l~~~f~l~ 410 (605)
T PF03170_consen 403 LQFEFDLP 410 (605)
T ss_pred EEEEEEee
Confidence 99999854
No 21
>PRK11114 cellulose synthase regulator protein; Provisional
Probab=85.93 E-value=3.5 Score=43.02 Aligned_cols=70 Identities=17% Similarity=0.149 Sum_probs=49.3
Q ss_pred EEEEEEcCCCCCCceEEEEeC--------cccceeEEEEcCEEeeeec------CCCCCceeccccccccCCCCCceEEE
Q 023943 148 YRTYFHIPKEWQGRRILLHFE--------AVDSAFCAWINGVPVGYSQ------DSRLPAEFEISDYCYPHGSDKKNVLA 213 (275)
Q Consensus 148 Yrr~F~lp~~~~~~~i~L~f~--------gv~s~~~VwvNG~~VG~~~------~~~~p~efdIT~~Lk~~~~G~eN~L~ 213 (275)
.+-.|.+|.+|.-..+.|++. .-.|.-.|.|||+.||.-. +.....+++|...+.. | .|+|.
T Consensus 83 ~~i~f~vp~d~~v~~A~L~L~y~~Sp~l~~~~S~L~V~lNg~~v~s~pL~~~~~~~~~~~~i~IP~~l~~---g-~N~L~ 158 (756)
T PRK11114 83 GGIEFGVRSDEVVTKARLNLEYTYSPALLPDLSHLKVYLNGELMGTLPLDKEQLGKKVLAQLPIDPRFIT---D-FNRLR 158 (756)
T ss_pred ceeEeecCccccccCcEEEEEEEECCCCCCCCCeEEEEECCEEeEEEecCcccCCCcceeEEecCHHHcC---C-CceEE
Confidence 366788888774343344333 1247889999999998642 3345778999987777 8 49999
Q ss_pred EEEEecCC
Q 023943 214 VQVFRWSD 221 (275)
Q Consensus 214 V~V~~~~d 221 (275)
+++.....
T Consensus 159 ~~~~~~~~ 166 (756)
T PRK11114 159 LEFIGHYT 166 (756)
T ss_pred EEEecCCC
Confidence 99886543
No 22
>PF06832 BiPBP_C: Penicillin-Binding Protein C-terminus Family; InterPro: IPR009647 This conserved region of approximately 90 residues is found in a sub-group of bacterial Penicillin-Binding Proteins (PBPs). A variable length loop region separates this region from the transpeptidase unit (IPR001460 from INTERPRO). It is predicted to be a beta fold.
Probab=77.45 E-value=11 Score=27.99 Aligned_cols=48 Identities=17% Similarity=0.311 Sum_probs=34.3
Q ss_pred CCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEE
Q 023943 159 QGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAV 214 (275)
Q Consensus 159 ~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V 214 (275)
+...+.|...|-....+-||||+++|.....+ ++.+.. ..+ |+ ++|.|
T Consensus 30 ~~~~l~l~a~~~~~~~~W~vdg~~~g~~~~~~---~~~~~~-~~~---G~-h~l~v 77 (89)
T PF06832_consen 30 ERQPLVLKAAGGRGPVYWFVDGEPLGTTQPGH---QLFWQP-DRP---GE-HTLTV 77 (89)
T ss_pred ccceEEEEEeCCCCcEEEEECCEEcccCCCCC---eEEeCC-CCC---ee-EEEEE
Confidence 46788888888777999999999998765432 233321 246 84 88888
No 23
>PF12733 Cadherin-like: Cadherin-like beta sandwich domain
Probab=71.30 E-value=19 Score=26.44 Aligned_cols=56 Identities=20% Similarity=0.279 Sum_probs=39.1
Q ss_pred EEEcCCCCCCceEEEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceE-EEEEEEec
Q 023943 151 YFHIPKEWQGRRILLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNV-LAVQVFRW 219 (275)
Q Consensus 151 ~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~-L~V~V~~~ 219 (275)
+..+|.+ -..+.|.....+..+.|.|||..+... -.+..+.+ .. | +|. |.|.|...
T Consensus 17 ~~~V~~~--~~~v~v~a~~~~~~a~v~vng~~~~~~---~~~~~i~L----~~---G-~n~~i~i~Vta~ 73 (88)
T PF12733_consen 17 TVTVPND--VDSVTVTATPEDSGATVTVNGVPVNSG---GYSATIPL----NE---G-ENTVITITVTAE 73 (88)
T ss_pred EEEECCC--ceEEEEEEEECCCCEEEEEcCEEccCC---CcceeeEc----cC---C-CceEEEEEEEcC
Confidence 5566765 356888887778999999999987543 12233444 46 8 498 99999753
No 24
>PF04566 RNA_pol_Rpb2_4: RNA polymerase Rpb2, domain 4; InterPro: IPR007646 RNA polymerases catalyse the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 4, is also known as the external 2 domain [].; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006351 transcription, DNA-dependent; PDB: 3S17_B 1I6H_B 4A3B_B 3K1F_B 4A3I_B 1TWA_B 3S14_B 3S15_B 2NVX_B 3M3Y_B ....
Probab=67.24 E-value=5 Score=28.45 Aligned_cols=13 Identities=38% Similarity=0.744 Sum_probs=11.8
Q ss_pred EEEcCEEeeeecC
Q 023943 176 AWINGVPVGYSQD 188 (275)
Q Consensus 176 VwvNG~~VG~~~~ 188 (275)
|+|||..+|.+++
T Consensus 1 VFlNG~~iG~~~~ 13 (63)
T PF04566_consen 1 VFLNGVWIGIHSD 13 (63)
T ss_dssp EEETTEEEEEESS
T ss_pred CEECCEEEEEEcC
Confidence 7999999999975
No 25
>PF12222 PNGaseA: Peptide N-acetyl-beta-D-glucosaminyl asparaginase amidase A; InterPro: IPR021102 Peptide-N4-(N-acetyl-beta-glucosaminyl)asparagine amidase A (PNGase A), unlike many other amidases, is capable of hydrolysing glycopeptides with an alpha-1,3-fucosylated asparagine-bound N-acetylglucosamine (GlcNAc). PNGase A is a heterodimer composed of a large and small subunit []. This entry represents the PNGase A precursor, which contains both subunits and is activated by proteolytic cleavage.
Probab=65.64 E-value=14 Score=36.08 Aligned_cols=51 Identities=10% Similarity=0.098 Sum_probs=35.7
Q ss_pred cccceeEEEEcCEEeeeec-------C----------------CCCCceeccccccccCCCCCceEEEEEEEec
Q 023943 169 AVDSAFCAWINGVPVGYSQ-------D----------------SRLPAEFEISDYCYPHGSDKKNVLAVQVFRW 219 (275)
Q Consensus 169 gv~s~~~VwvNG~~VG~~~-------~----------------~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~ 219 (275)
|.--...|+|||+.+|... | ...++++|||++|-.--+|+.++|.|+|.+-
T Consensus 218 gpfReV~V~iDg~lag~~~PfPvIfTGGI~P~lWrPI~~i~aFdl~~y~iDlTPfLp~L~dg~~h~~~i~V~~~ 291 (427)
T PF12222_consen 218 GPFREVQVYIDGQLAGVVWPFPVIFTGGINPFLWRPIVGIGAFDLPSYDIDLTPFLPLLWDGKPHTFEIRVVNA 291 (427)
T ss_pred CCcEEEEEEECCEEEEEECCCCeEEeCCcCcccccccCCCcccCCCceeEEeccchhcccCCCccEEEEEEEcc
Confidence 3445678999999998541 2 2336899999966221127668999999984
No 26
>PF11008 DUF2846: Protein of unknown function (DUF2846); InterPro: IPR022548 Some members in this group of proteins with unknown function are annotated as lipoproteins. However this cannot be confirmed.
Probab=59.88 E-value=30 Score=27.10 Aligned_cols=34 Identities=15% Similarity=0.283 Sum_probs=23.5
Q ss_pred cceeEEEEcCEEeeeecC-CCCCceeccccccccCCCCCceEEEE
Q 023943 171 DSAFCAWINGVPVGYSQD-SRLPAEFEISDYCYPHGSDKKNVLAV 214 (275)
Q Consensus 171 ~s~~~VwvNG~~VG~~~~-~~~p~efdIT~~Lk~~~~G~eN~L~V 214 (275)
.....|||||+.||.... +| +.++++ + |+ ++|..
T Consensus 40 ~~~~~v~vdg~~ig~l~~g~y--~~~~v~----p---G~-h~i~~ 74 (117)
T PF11008_consen 40 AVKPDVYVDGELIGELKNGGY--FYVEVP----P---GK-HTISA 74 (117)
T ss_pred cccceEEECCEEEEEeCCCeE--EEEEEC----C---Cc-EEEEE
Confidence 557899999999998653 33 344544 4 74 77666
No 27
>PF14324 PINIT: PINIT domain; PDB: 3I2D_A.
Probab=56.22 E-value=13 Score=30.54 Aligned_cols=51 Identities=16% Similarity=0.185 Sum_probs=24.8
Q ss_pred eEEEEeCcccceeEEEEcCEEeeee-------cCCCCCceeccccccccCCCCCceEEEEEEEe
Q 023943 162 RILLHFEAVDSAFCAWINGVPVGYS-------QDSRLPAEFEISDYCYPHGSDKKNVLAVQVFR 218 (275)
Q Consensus 162 ~i~L~f~gv~s~~~VwvNG~~VG~~-------~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~ 218 (275)
...+.|. ...+|+|||+.|-.. .|.-. =+|||++++.. .+..|+|.|.=..
T Consensus 74 ~q~i~FP---~~~evkvN~~~v~~~~~glknKpGt~r--PvdIT~~l~~~-~~~~N~i~v~y~~ 131 (144)
T PF14324_consen 74 NQPIEFP---PPCEVKVNGKQVKLNNRGLKNKPGTAR--PVDITPYLRLS-PPQTNRIEVTYAN 131 (144)
T ss_dssp GB--------SSEEEEETTEE--S--SS-TTS-GGGS---EE-GGG---S--SS-EEEEEEEEE
T ss_pred ccccccC---CCeEEEEeCEEcccCccCCCCCCCCCC--Ccccchhhccc-CCCCeEEEEEEeC
Confidence 3444554 368999999999532 23333 47999999862 1346998887554
No 28
>PF11824 DUF3344: Protein of unknown function (DUF3344); InterPro: IPR021779 This family of proteins are functionally uncharacterised. This protein is found in bacteria and archaea. Proteins in this family are typically between 367 to 1857 amino acids in length.
Probab=56.21 E-value=27 Score=31.89 Aligned_cols=66 Identities=14% Similarity=0.084 Sum_probs=38.1
Q ss_pred EEEEEEcCCCCCCceEEEEeC--------cccceeEEEEcCEEeee----------e----cCC-----CCCceeccccc
Q 023943 148 YRTYFHIPKEWQGRRILLHFE--------AVDSAFCAWINGVPVGY----------S----QDS-----RLPAEFEISDY 200 (275)
Q Consensus 148 Yrr~F~lp~~~~~~~i~L~f~--------gv~s~~~VwvNG~~VG~----------~----~~~-----~~p~efdIT~~ 200 (275)
+..+|+||+.-.=+..+|... +....+.|-+||+.+.. . .+. +.-+.+|||++
T Consensus 40 ~~~~~~lP~ga~v~~ArLYv~~w~~~~~~~~~~~~~~~fNg~~~~~~~l~~~~~~Y~d~~~~g~~~~~~yg~~vYDVT~~ 119 (271)
T PF11824_consen 40 VSWDFTLPEGATVKWARLYVYVWSGHMTNGYYPSFTVTFNGNTLEEFNLETPEAPYVDQKGHGNYVDYDYGMWVYDVTDL 119 (271)
T ss_pred ceEEEeCCCCCeEEEEEEEEEEeCCccccCCCceEEEEECCccceeeeccCCCCceEEecCccceeccceEEEEEECccc
Confidence 566677775422223333332 33445677777776631 0 111 13344899999
Q ss_pred cccCCCCCceEEEEEEE
Q 023943 201 CYPHGSDKKNVLAVQVF 217 (275)
Q Consensus 201 Lk~~~~G~eN~L~V~V~ 217 (275)
++. | +|.+.|.-.
T Consensus 120 i~~---g-~n~~~v~~~ 132 (271)
T PF11824_consen 120 IKS---G-ENTVTVTTG 132 (271)
T ss_pred ccC---C-ceEEEEEeC
Confidence 998 8 499888773
No 29
>PF14814 UB2H: Bifunctional transglycosylase second domain; PDB: 3FWL_A 3VMA_A.
Probab=54.73 E-value=22 Score=26.45 Aligned_cols=42 Identities=26% Similarity=0.358 Sum_probs=27.8
Q ss_pred CCcccEEEEEEcCCCC-CCceEEEEeCcccceeEEEE--cCEEeee
Q 023943 143 NPTGCYRTYFHIPKEW-QGRRILLHFEAVDSAFCAWI--NGVPVGY 185 (275)
Q Consensus 143 n~~g~Yrr~F~lp~~~-~~~~i~L~f~gv~s~~~Vwv--NG~~VG~ 185 (275)
+..-.|+|.|..|+.. ..+++.|+|.+ +....|-- ||+.++.
T Consensus 38 ~~i~i~~R~F~F~Dg~e~~~~~~l~f~~-~~V~~i~~~~~g~~l~~ 82 (85)
T PF14814_consen 38 NRIEIYTRGFDFPDGQEPARRVRLTFSG-GRVSSIQDLDNGRDLGL 82 (85)
T ss_dssp TEEEEEE--EEETTCEE--EEEEEEEET-TEEEEEEETTTTEE-SS
T ss_pred CEEEEEECCCCCCCCCccCEEEEEEECC-CEEEEEEEcCCCCccCe
Confidence 4556899999999765 46799999998 66666655 5776653
No 30
>PF07550 DUF1533: Protein of unknown function (DUF1533); InterPro: IPR011432 This domain is found duplicated in proteins of unknown function. The proteins typically also contain leucine-rich repeats.
Probab=46.23 E-value=29 Score=24.42 Aligned_cols=43 Identities=14% Similarity=0.164 Sum_probs=24.6
Q ss_pred eeEEEEcCEEe-----eeecCCC-CCceecccc-cc-ccCCCCCceEEEEEEEec
Q 023943 173 AFCAWINGVPV-----GYSQDSR-LPAEFEISD-YC-YPHGSDKKNVLAVQVFRW 219 (275)
Q Consensus 173 ~~~VwvNG~~V-----G~~~~~~-~p~efdIT~-~L-k~~~~G~eN~L~V~V~~~ 219 (275)
...|.|||+.. +..+... ..-.+.|.. .+ +. | +|+|+|...-+
T Consensus 8 I~~V~VNg~~y~~~~~~~~~y~~~~~~~l~i~~~~f~~~---G-~~~I~I~A~GY 58 (65)
T PF07550_consen 8 ITSVTVNGKEYNKSLKGNDKYSISSKGSLKIKASAFNKD---G-ENTIVIKATGY 58 (65)
T ss_pred CCEEEECCEEeeccccccccEEeccCCcEEEcHHHcCcC---C-ceEEEEEeCCc
Confidence 45789999998 3222111 111144443 34 55 7 59999987644
No 31
>PF09113 N-glycanase_C: Peptide-N-glycosidase F, C terminal; InterPro: IPR015197 This domain adopts an eight-stranded antiparallel beta jelly roll configuration, with the beta strands arranged into two sheets. It is similar in topology to many viral capsid proteins, as well as lectins and several glucanases. This domain allows the protein to bind sugars and catalyses the complete removal of N-linked oligosaccharide chains from glycoproteins []. ; PDB: 1PNF_A 1PNG_A 1PGS_A 3KS7_D 3PMS_A.
Probab=42.36 E-value=87 Score=25.80 Aligned_cols=67 Identities=13% Similarity=0.158 Sum_probs=38.5
Q ss_pred cEEEEEEcCCCCCCceEEEEeCc-----------ccceeEEEEcCEEeeeec-----------------C----------
Q 023943 147 CYRTYFHIPKEWQGRRILLHFEA-----------VDSAFCAWINGVPVGYSQ-----------------D---------- 188 (275)
Q Consensus 147 ~Yrr~F~lp~~~~~~~i~L~f~g-----------v~s~~~VwvNG~~VG~~~-----------------~---------- 188 (275)
.-..+|++|+..+.-++.+.+-| +...-.|+|||+++-... |
T Consensus 11 ~~~~~f~lp~~~k~~~L~~iiTGHG~~~~gc~EFc~~~h~~~vnG~~~f~~~~~~~~Ca~~~~~n~~p~G~w~~~Rs~WC 90 (141)
T PF09113_consen 11 RLPVNFTLPANAKNARLRYIITGHGSGNNGCDEFCPKSHHFYVNGKEVFSFAPWRDDCASNRLYNPAPSGTWLYSRSNWC 90 (141)
T ss_dssp SEEEEEEE-TT-SEEEEEEEEEEEEETTEEEETTS-EEEEEEETTEEEEEEEE-BS-GGGGSGG-TTT-SCESS-BSS--
T ss_pred ceeEEEECCcccceEEEEEEEecCCCCCCCcceecccccEEEECCeEeeecCCCccchhhccccCccccceEecCCCCCC
Confidence 55679999987544444444433 222347999999992211 1
Q ss_pred --C-CCCceeccccccccCCCCCceEEEEEEEe
Q 023943 189 --S-RLPAEFEISDYCYPHGSDKKNVLAVQVFR 218 (275)
Q Consensus 189 --~-~~p~efdIT~~Lk~~~~G~eN~L~V~V~~ 218 (275)
+ -.|.++||++++. |+ +++.|.|.-
T Consensus 91 PG~~v~p~~~dl~~~~~----g~-ht~~~~i~~ 118 (141)
T PF09113_consen 91 PGMVVDPWRIDLTDAVA----GG-HTFSVDIPY 118 (141)
T ss_dssp TTEEE--EEEEEE-GGG----TT-SEEEEEEET
T ss_pred CCCCCCceEeccccccC----CC-ceEEEEecc
Confidence 0 1278899998775 53 888777764
No 32
>smart00560 LamGL LamG-like jellyroll fold domain.
Probab=41.61 E-value=34 Score=27.18 Aligned_cols=27 Identities=19% Similarity=0.320 Sum_probs=20.9
Q ss_pred ceEEEEeCcccceeEEEEcCEEeeeec
Q 023943 161 RRILLHFEAVDSAFCAWINGVPVGYSQ 187 (275)
Q Consensus 161 ~~i~L~f~gv~s~~~VwvNG~~VG~~~ 187 (275)
.++.+.+++......+||||++++...
T Consensus 64 ~hva~v~d~~~g~~~lYvnG~~~~~~~ 90 (133)
T smart00560 64 VHLAGVYDGGAGKLSLYVNGVEVATSE 90 (133)
T ss_pred EEEEEEEECCCCeEEEEECCEEccccc
Confidence 366777777777889999999997543
No 33
>PRK11114 cellulose synthase regulator protein; Provisional
Probab=39.41 E-value=57 Score=34.22 Aligned_cols=39 Identities=26% Similarity=0.427 Sum_probs=24.4
Q ss_pred EEEEEEcCCC---CCCceEE--EEeC------cccceeEEEEcCEEeeee
Q 023943 148 YRTYFHIPKE---WQGRRIL--LHFE------AVDSAFCAWINGVPVGYS 186 (275)
Q Consensus 148 Yrr~F~lp~~---~~~~~i~--L~f~------gv~s~~~VwvNG~~VG~~ 186 (275)
-+-.|.+|.+ |.++.+- |++. .-+|.-.|+|||++|+.-
T Consensus 378 i~~~~~lPpDl~~~~~~~i~l~L~yryt~~~~~~~S~l~V~vN~~~i~S~ 427 (756)
T PRK11114 378 IRVNLRLPPDLFLWRGDGIPLDLNYRYTAPPVRDDSRLNISLNDQFVQSL 427 (756)
T ss_pred eeEcccCCccccccCCCCCceEEEEeCCCCCCCCCcEEEEEECCEEEeeE
Confidence 3455666765 4555543 4431 123688999999999753
No 34
>PF07908 D-aminoacyl_C: D-aminoacylase, C-terminal region; InterPro: IPR012855 D-aminoacylase (Q9AGH8 from SWISSPROT, 3.5.1.81 from EC) hydrolyses a wide variety of N-acyl derivatives of neutral D-amino acids, in a zinc-dependent manner. The enzyme is composed of a small beta-barrel domain and a larger catalytic alpha/beta-barrel that contains a short alpha/beta insert. The overall structure shares significant similarity to the alpha/beta-barrel amidohydrolase superfamily, in which the beta-strands in both barrels superimpose well []. The C-terminal region featured in this entry forms part of the beta-barrel domain, together with a short N-terminal segment. This domain does not seem to contribute to the substrate-binding site or to be involved in the catalytic process.; GO: 0008270 zinc ion binding, 0016811 hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds, in linear amides; PDB: 3GIQ_B 3GIP_B 1V4Y_A 1M7J_A 1RK5_A 1RJP_A 1RJR_A 1RJQ_A 1RK6_A 1V51_A.
Probab=37.96 E-value=26 Score=23.26 Aligned_cols=13 Identities=23% Similarity=0.123 Sum_probs=10.6
Q ss_pred eeEEEEcCEEeee
Q 023943 173 AFCAWINGVPVGY 185 (275)
Q Consensus 173 ~~~VwvNG~~VG~ 185 (275)
.-+|||||+.+-.
T Consensus 20 I~~V~VNG~~vv~ 32 (48)
T PF07908_consen 20 IDYVFVNGQIVVE 32 (48)
T ss_dssp EEEEEETTEEEEC
T ss_pred EEEEEECCEEEEE
Confidence 4689999999854
No 35
>PF10262 Rdx: Rdx family; InterPro: IPR011893 This entry represents the Rdx family of selenoproteins, which includes mammalian selenoproteins SelW, SelV, SelT and SelH, bacterial SelW-like proteins and cysteine-containing proteins of unknown function in all three domains of life. Mammalian Rdx12 and its fish selenoprotein orthologues are also members of this family []. These proteins possess a thioredoxin-like fold and a conserved CXXC or CxxU (U is selenocysteine) motif near the N terminus, suggesting a redox function. Rdx proteins can use catalytic cysteine (or selenocysteine) to form transient mixed disulphides with substrate proteins. Selenium (Se) plays an essential role in cell survival and most of the effects of Se are probably mediated by selenoproteins. Selenoprotein W (SelW) plays an important role in protection of neurons from oxidative stress during neuronal development [], []. Selenoprotein T (SelT) is conserved from plants to humans. SelT is localized to the endoplasmic reticulum through a hydrophobic domain. The protein binds to UDP-glucose:glycoprotein glucosyltransferase (UGTR), the endoplasmic reticulum (ER)-resident protein, which is known to be involved in the quality control of protein folding [, ]. The function of SelT is unknown, although it may have a role in PACAP signaling during PC12 cell differentiation [, ]. Selenoprotein H (SelH) protects neurons against UVB-induced damage by inhibiting apoptotic cell death pathways, by preventing mitochondrial depolarization, and by promoting cell survival pathways [].; GO: 0008430 selenium binding, 0045454 cell redox homeostasis; PDB: 2OJL_B 2FA8_A 2P0G_C 2NPB_A 3DEX_C 2OKA_A 2OBK_G.
Probab=37.01 E-value=48 Score=23.87 Aligned_cols=23 Identities=22% Similarity=0.150 Sum_probs=16.7
Q ss_pred ceEEEEeCcccceeEEEEcCEEee
Q 023943 161 RRILLHFEAVDSAFCAWINGVPVG 184 (275)
Q Consensus 161 ~~i~L~f~gv~s~~~VwvNG~~VG 184 (275)
..+.+.. +...+++|+|||+.|-
T Consensus 33 ~~v~~~~-~~~G~FEV~v~g~lI~ 55 (76)
T PF10262_consen 33 AEVELSP-GSTGAFEVTVNGELIF 55 (76)
T ss_dssp SEEEEEE-ESTT-EEEEETTEEEE
T ss_pred eEEEEEe-ccCCEEEEEEccEEEE
Confidence 3556666 4478899999999884
No 36
>PF11824 DUF3344: Protein of unknown function (DUF3344); InterPro: IPR021779 This family of proteins are functionally uncharacterised. This protein is found in bacteria and archaea. Proteins in this family are typically between 367 to 1857 amino acids in length.
Probab=35.08 E-value=56 Score=29.80 Aligned_cols=48 Identities=17% Similarity=0.188 Sum_probs=33.2
Q ss_pred EEeCcccceeEEEEcCEEeeeec--CCCCC-ceeccccccccCCCCCceEEEEEE
Q 023943 165 LHFEAVDSAFCAWINGVPVGYSQ--DSRLP-AEFEISDYCYPHGSDKKNVLAVQV 216 (275)
Q Consensus 165 L~f~gv~s~~~VwvNG~~VG~~~--~~~~p-~efdIT~~Lk~~~~G~eN~L~V~V 216 (275)
+...|-+....+.+||+-+.... +++.. ..|||+++|+. | +|.+.++-
T Consensus 204 ~~~s~~~~~g~~~FNg~~l~~~~~~~~~~~~~~~DVt~~l~~---~-~n~~~~~~ 254 (271)
T PF11824_consen 204 VALSGGDGEGNLTFNGTNLWNGTPSGSYFGYDTWDVTDYLKS---G-NNSAFIQS 254 (271)
T ss_pred EEEeccCCCCEEEECCcccCCCCCCccceeeEeeeccccccC---C-CceEEEEe
Confidence 33445444478999997775432 34333 35999999998 8 49988886
No 37
>TIGR02148 Fibro_Slime fibro-slime domain. This model represents a conserved region of about 90 amino acids, shared in at least 4 distinct large putative proteins from the slime mold Dictyostelium discoideum and 10 proteins from the rumen bacterium Fibrobacter succinogenes, and in no other species so far. We propose here the name fibro-slime domain
Probab=34.36 E-value=2.1e+02 Score=21.82 Aligned_cols=53 Identities=13% Similarity=0.100 Sum_probs=33.4
Q ss_pred EEEEeCcccceeEEEEcCEEeeeecCCCCC--ceecccc-ccccCCCCCceEEEE-EEEec
Q 023943 163 ILLHFEAVDSAFCAWINGVPVGYSQDSRLP--AEFEISD-YCYPHGSDKKNVLAV-QVFRW 219 (275)
Q Consensus 163 i~L~f~gv~s~~~VwvNG~~VG~~~~~~~p--~efdIT~-~Lk~~~~G~eN~L~V-~V~~~ 219 (275)
-.+.|-| +.-.-|+|||++|..-.|-+.| ..+|+.. =|.+ |+.-.+.+ .+.|.
T Consensus 20 e~F~F~G-DDDvWVFIn~kLv~DlGG~H~~~~~sV~l~~lgl~~---g~~Y~~d~F~~ERh 76 (90)
T TIGR02148 20 QYFEFRG-DDDVWVFINNKLVVDIGGQHPAVPGAVDLDTLGLKE---GKTYPFDIFYCERH 76 (90)
T ss_pred cEEEEEc-CCeEEEEECCEEEEEccCcCCCcccEEEhhhcCCcc---CcEeeEEEEEEeec
Confidence 4778888 6779999999999766554443 3456554 2444 64445555 33443
No 38
>PF00337 Gal-bind_lectin: Galactoside-binding lectin; InterPro: IPR001079 Galectins (also known as galaptins or S-lectin) are a family of proteins defined by having at least one characteristic carbohydrate recognition domain (CRD) with an affinity for beta-galactosides and sharing certain sequence elements. Members of the galectins family are found in mammals, birds, amphibians, fish, nematodes, sponges, and some fungi. Galectins are known to carry out intra- and extracellular functions through glycoconjugate-mediated recognition. From the cytosol they may be secreted by non-classical pathways, but they may also be targeted to the nucleus or specific sub-cytosolic sites. Within the same peptide chain some galectins have a CRD with only a few additional amino acids, whereas others have two CRDs joined by a link peptide, and one (galectin-3) has one CRD joined to a different type of domain [, ]. The galectin carbohydrate recognition domain (CRD) is a beta-sandwich of about 135 amino acids. The two sheets are slightly bent with 6 strands forming the concave side and 5 strands forming the convex side. The concave side forms a groove in which carbohydrate is bound, and which is long enough to hold about a linear tetrasaccharide [, ].; GO: 0005529 sugar binding; PDB: 2WSU_B 2WT0_A 2WT1_A 2WT2_B 2WSV_A 1HLC_A 2ZGQ_A 3M3Q_B 1WW5_C 3M3E_A ....
Probab=33.06 E-value=69 Score=25.27 Aligned_cols=28 Identities=14% Similarity=0.281 Sum_probs=23.2
Q ss_pred CCceEEEEeCcccceeEEEEcCEEeeee
Q 023943 159 QGRRILLHFEAVDSAFCAWINGVPVGYS 186 (275)
Q Consensus 159 ~~~~i~L~f~gv~s~~~VwvNG~~VG~~ 186 (275)
.|+...|.|.--+..+.|+|||+.+..-
T Consensus 81 ~g~~F~i~I~~~~~~f~I~vng~~~~~F 108 (133)
T PF00337_consen 81 PGQPFEIRIRVEEDGFKIYVNGKHFCSF 108 (133)
T ss_dssp TTSEEEEEEEEESSEEEEEETTEEEEEE
T ss_pred CCceEEEEEEEecCeeEEEECCeEEEEe
Confidence 5777777777778999999999998753
No 39
>PF09829 DUF2057: Uncharacterized protein conserved in bacteria (DUF2057); InterPro: IPR018635 The proteins in this entry are functionally uncharacterised.
Probab=32.23 E-value=98 Score=26.29 Aligned_cols=39 Identities=21% Similarity=0.122 Sum_probs=25.2
Q ss_pred ceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEec
Q 023943 172 SAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRW 219 (275)
Q Consensus 172 s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~ 219 (275)
..--+-|||+.++.+.-+... .+.| .+ | +|+|+|++...
T Consensus 8 ~i~~l~vnG~~v~~~~~~~~~-~l~L----~~---G-~~Qiv~ry~~~ 46 (189)
T PF09829_consen 8 EIELLAVNGQEVSGSLFSSKD-SLEL----PP---G-ENQIVFRYSKI 46 (189)
T ss_pred CEEEEEEcCeeccCccccCCc-eEEe----CC---C-cEEEEEEEeEe
Confidence 345568999999654322111 2444 45 8 59999999974
No 40
>PF05775 AfaD: Enterobacteria AfaD invasin protein; InterPro: IPR008394 This family consists of several AfaD and related proteins from Escherichia coli and Salmonella bacteria. The afa gene clusters encode an afimbrial adhesive sheath produced by E. coli. The adhesive sheath is composed of two proteins, AfaD and AfaE, which are independently exposed at the bacterial cell surface. AfaE is required for bacterial adhesion to HeLa cells and AfaD for the uptake of adherent bacteria into these cells [].; GO: 0009289 pilus; PDB: 3UIZ_F 3UIY_A 2AXW_A 2IXQ_A 2FVN_A.
Probab=32.00 E-value=97 Score=24.56 Aligned_cols=45 Identities=18% Similarity=0.284 Sum_probs=30.7
Q ss_pred EEEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEe
Q 023943 164 LLHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFR 218 (275)
Q Consensus 164 ~L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~ 218 (275)
.|...+.++.+.||.|.+.++- .|-.+-|...=. . .|+|.|++-.
T Consensus 18 rI~~~~~htGF~Vw~na~~~~g-----~p~~Yil~G~~~----~-~h~LrVRlgg 62 (111)
T PF05775_consen 18 RIICREAHTGFHVWSNARQVGG-----RPGRYILQGKRN----S-QHELRVRLGG 62 (111)
T ss_dssp EEES-SSSSEEEEEESSEESTT-----STTEEEEEBCSS----S-S-EEEEEEET
T ss_pred EEEeCCCceEEEEEeechhcCC-----CccEEEEeCCCC----C-CceEEEEeCC
Confidence 4667788999999999998765 345555554211 2 4999999985
No 41
>PF13385 Laminin_G_3: Concanavalin A-like lectin/glucanases superfamily; PDB: 4DQA_A 1N1Y_A 1MZ6_A 1MZ5_A 1N1S_A 2A75_A 1WCS_A 1N1T_A 1N1V_A 2FHR_A ....
Probab=30.47 E-value=58 Score=25.11 Aligned_cols=24 Identities=29% Similarity=0.538 Sum_probs=17.8
Q ss_pred eEEEEeCcccceeEEEEcCEEeeeec
Q 023943 162 RILLHFEAVDSAFCAWINGVPVGYSQ 187 (275)
Q Consensus 162 ~i~L~f~gv~s~~~VwvNG~~VG~~~ 187 (275)
++.+.+. .....+||||+.++...
T Consensus 89 ~l~~~~~--~~~~~lyvnG~~~~~~~ 112 (157)
T PF13385_consen 89 HLALTYD--GSTVTLYVNGELVGSST 112 (157)
T ss_dssp EEEEEEE--TTEEEEEETTEEETTCT
T ss_pred EEEEEEE--CCeEEEEECCEEEEeEe
Confidence 5555555 44699999999998754
No 42
>KOG4342 consensus Alpha-mannosidase [Carbohydrate transport and metabolism]
Probab=30.22 E-value=1.7e+02 Score=30.39 Aligned_cols=67 Identities=21% Similarity=0.444 Sum_probs=47.8
Q ss_pred cccEEEEEEcCCCCCC-ceEEEEeCcccceeEEEE-cCEEee-eecCCCCCceeccccccccCCCCCceEEEEEEEe
Q 023943 145 TGCYRTYFHIPKEWQG-RRILLHFEAVDSAFCAWI-NGVPVG-YSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFR 218 (275)
Q Consensus 145 ~g~Yrr~F~lp~~~~~-~~i~L~f~gv~s~~~Vwv-NG~~VG-~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~ 218 (275)
+.|+|-+++||++|.+ +++.+.-+. +...-||= +|..|- .+.|.++ +|-+++-++. .+.++-|++..
T Consensus 103 T~WF~V~i~lPe~Wvk~eqv~fqW~c-dnEGlV~~kdg~PvqafsggErT--~yvLpd~~~~----~~~tfYiE~ac 172 (1078)
T KOG4342|consen 103 TCWFRVEITLPEAWVKNEQVHFQWEC-DNEGLVWRKDGEPVQAFSGGERT--SYVLPDRLGE----RSLTFYIEVAC 172 (1078)
T ss_pred eEEEEEEEECchhhcCceeEEEEEec-CCCeeEEecCCceeeeccCCccc--eeEcccccCC----cceEEEEEeec
Confidence 4689999999999965 788888876 56677777 899885 4444343 4556665543 24777777765
No 43
>cd00070 GLECT Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation.
Probab=29.28 E-value=83 Score=24.79 Aligned_cols=28 Identities=18% Similarity=0.199 Sum_probs=23.0
Q ss_pred CCceEEEEeCcccceeEEEEcCEEeeee
Q 023943 159 QGRRILLHFEAVDSAFCAWINGVPVGYS 186 (275)
Q Consensus 159 ~~~~i~L~f~gv~s~~~VwvNG~~VG~~ 186 (275)
+|+...|.|.--...+.|+|||+.+..-
T Consensus 76 ~g~~F~l~i~~~~~~f~i~vng~~~~~F 103 (127)
T cd00070 76 PGQPFELTILVEEDKFQIFVNGQHFFSF 103 (127)
T ss_pred CCCeEEEEEEEcCCEEEEEECCEeEEEe
Confidence 4777788887778999999999988653
No 44
>COG0278 Glutaredoxin-related protein [Posttranslational modification, protein turnover, chaperones]
Probab=28.81 E-value=33 Score=26.77 Aligned_cols=16 Identities=25% Similarity=0.191 Sum_probs=13.4
Q ss_pred cceeEEEEcCEEeeee
Q 023943 171 DSAFCAWINGVPVGYS 186 (275)
Q Consensus 171 ~s~~~VwvNG~~VG~~ 186 (275)
-+.-.+||||++||-+
T Consensus 70 PT~PQLyi~GEfvGG~ 85 (105)
T COG0278 70 PTFPQLYVNGEFVGGC 85 (105)
T ss_pred CCCceeeECCEEeccH
Confidence 4567899999999976
No 45
>PF15625 CC2D2AN-C2: CC2D2A N-terminal C2 domain
Probab=28.67 E-value=94 Score=26.05 Aligned_cols=40 Identities=20% Similarity=0.380 Sum_probs=24.3
Q ss_pred ceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecC
Q 023943 172 SAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWS 220 (275)
Q Consensus 172 s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~ 220 (275)
....|++||++|+.++-......|-+ .. |+ .+.|+|.+|.
T Consensus 39 ~~ikl~~N~k~V~~T~~~~l~~dF~v----~f---~~--~f~v~i~~~P 78 (168)
T PF15625_consen 39 YYIKLFFNDKEVSRTRSRPLWSDFRV----HF---NE--IFNVQITRWP 78 (168)
T ss_pred EEEEEEECCEEEEeeeeEecCCCeEE----ec---cC--EEEEEEecCC
Confidence 34578999999998875444333322 23 42 6666666653
No 46
>PF06439 DUF1080: Domain of Unknown Function (DUF1080); InterPro: IPR010496 This is a family of proteins of unknown function.; PDB: 3IMM_B 3NMB_A 3S5Q_A 3OSD_A 3HBK_A 3H3L_A 3U1X_A.
Probab=27.59 E-value=1.1e+02 Score=25.20 Aligned_cols=21 Identities=29% Similarity=0.632 Sum_probs=15.8
Q ss_pred cceeEEEEcCEEeeeecCCCC
Q 023943 171 DSAFCAWINGVPVGYSQDSRL 191 (275)
Q Consensus 171 ~s~~~VwvNG~~VG~~~~~~~ 191 (275)
.....|||||+.|....+...
T Consensus 138 g~~i~v~vnG~~v~~~~d~~~ 158 (185)
T PF06439_consen 138 GNRITVWVNGKPVADFTDPSF 158 (185)
T ss_dssp TTEEEEEETTEEEEEEETTSH
T ss_pred CCEEEEEECCEEEEEEEcCCC
Confidence 345889999999988766433
No 47
>TIGR02412 pepN_strep_liv aminopeptidase N, Streptomyces lividans type. This family is a subset of the members of the zinc metallopeptidase family M1 (pfam01433), with a single member characterized in Streptomyces lividans 66 and designated aminopeptidase N. The spectrum of activity may differ somewhat from the aminopeptidase N clade of E. coli and most other Proteobacteria, well separated phylogenetically within the M1 family. The M1 family also includes leukotriene A-4 hydrolase/aminopeptidase (with a bifunctional active site).
Probab=26.61 E-value=7.9e+02 Score=26.07 Aligned_cols=63 Identities=16% Similarity=0.174 Sum_probs=40.0
Q ss_pred cccEEEEEEcCCCCCCceEEEEeCcccceeEEEEcCE-EeeeecCCCCCceeccccccccCCCCCceEEEEEEEe
Q 023943 145 TGCYRTYFHIPKEWQGRRILLHFEAVDSAFCAWINGV-PVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFR 218 (275)
Q Consensus 145 ~g~Yrr~F~lp~~~~~~~i~L~f~gv~s~~~VwvNG~-~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~ 218 (275)
.+.=+-+|++-+. +.++.|.+.+. ..-.|-|||+ .+.... ....+.+.. |.. | +|+|.|....
T Consensus 35 ~~~~~i~~~~~~~--~~~l~LD~~~l-~I~~v~vng~~~~~~~~---~~~~i~l~~-l~~---g-~~~l~i~~~~ 98 (831)
T TIGR02412 35 RCVSTNTVRLSEP--GADTFLDLLAA-QIESVTLNGILDVAPVY---DGSRIPLPG-LLT---G-ENTLRVEATR 98 (831)
T ss_pred ceEEEEEEEEcCC--CCcEEEEccCC-EEEEEEECCcccCcccc---CCCEEEccC-CCC---C-ceEEEEEEEE
Confidence 3444445555333 67899999885 6778889997 332211 223456655 666 8 5999999754
No 48
>PRK10824 glutaredoxin-4; Provisional
Probab=26.00 E-value=48 Score=26.30 Aligned_cols=23 Identities=22% Similarity=0.343 Sum_probs=18.2
Q ss_pred EEEeCcccceeEEEEcCEEeeee
Q 023943 164 LLHFEAVDSAFCAWINGVPVGYS 186 (275)
Q Consensus 164 ~L~f~gv~s~~~VwvNG~~VG~~ 186 (275)
...+.|-.+.-.|||||++||-+
T Consensus 62 l~~~sg~~TVPQIFI~G~~IGG~ 84 (115)
T PRK10824 62 LPKYANWPTFPQLWVDGELVGGC 84 (115)
T ss_pred HHHHhCCCCCCeEEECCEEEcCh
Confidence 33445778889999999999876
No 49
>smart00276 GLECT Galectin. Galectin - galactose-binding lectin
Probab=25.64 E-value=1e+02 Score=24.30 Aligned_cols=28 Identities=21% Similarity=0.289 Sum_probs=22.6
Q ss_pred CCceEEEEeCcccceeEEEEcCEEeeee
Q 023943 159 QGRRILLHFEAVDSAFCAWINGVPVGYS 186 (275)
Q Consensus 159 ~~~~i~L~f~gv~s~~~VwvNG~~VG~~ 186 (275)
.|+...|.|---...+.|+|||+.+..-
T Consensus 75 ~g~~F~l~i~~~~~~f~i~vng~~~~~f 102 (128)
T smart00276 75 PGQPFDLTIIVQPDHFQIFVNGVHITTF 102 (128)
T ss_pred CCCEEEEEEEEcCCEEEEEECCEeEEEe
Confidence 4677777777778899999999998753
No 50
>PF13464 DUF4115: Domain of unknown function (DUF4115)
Probab=25.63 E-value=1.1e+02 Score=21.89 Aligned_cols=25 Identities=20% Similarity=0.268 Sum_probs=21.3
Q ss_pred CceEEEEeCcccceeEEEEcCEEeee
Q 023943 160 GRRILLHFEAVDSAFCAWINGVPVGY 185 (275)
Q Consensus 160 ~~~i~L~f~gv~s~~~VwvNG~~VG~ 185 (275)
...+.|+++.. ++.+|.+||+.++.
T Consensus 37 ~~~~~i~iGna-~~v~v~~nG~~~~~ 61 (77)
T PF13464_consen 37 KEPFRIRIGNA-GAVEVTVNGKPVDL 61 (77)
T ss_pred CCCEEEEEeCC-CcEEEEECCEECCC
Confidence 56788999875 58899999999987
No 51
>COG3148 Uncharacterized conserved protein [Function unknown]
Probab=25.58 E-value=46 Score=29.48 Aligned_cols=51 Identities=12% Similarity=0.189 Sum_probs=33.1
Q ss_pred CCcccccccCCCCCCCccccCChhhhcccCCchhhHHHhhhcccccCCCCCcEEecCccce
Q 023943 22 EDPSFIKWRKRDPHVTLRCHDSVEVSNSAVWDDDAVHEALTSAAFWTNGLPFVKSLSGHWK 82 (275)
Q Consensus 22 ~~p~~~~~n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~LnG~W~ 82 (275)
+||+++..=+.|..-++.-|+++... +... -.++....+.+..+.|+|+|+
T Consensus 90 ~~~eLl~ll~~P~~~p~lvfP~e~a~--------e~t~--v~~~~p~~k~plfIllDgTW~ 140 (231)
T COG3148 90 PNPELLALLANPDYQPYLVFPAEYAE--------ELTE--VISTAPAEKPPLFILLDGTWR 140 (231)
T ss_pred CCHHHHHHHhCCCCceEEEcchHHHH--------HHHH--HhhcccccCCceEEEecCccH
Confidence 38888888888888888889986521 1110 011111224568999999997
No 52
>smart00776 NPCBM This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins.
Probab=25.47 E-value=3.8e+02 Score=21.93 Aligned_cols=42 Identities=19% Similarity=0.245 Sum_probs=28.9
Q ss_pred ceeEEEEcCEEeeeecC---CC--CCceeccccccccCCCCCceEEEEEEEecCCC
Q 023943 172 SAFCAWINGVPVGYSQD---SR--LPAEFEISDYCYPHGSDKKNVLAVQVFRWSDG 222 (275)
Q Consensus 172 s~~~VwvNG~~VG~~~~---~~--~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~dg 222 (275)
-.+.|+.+|+.+-.+.. .. .+.++||+ |. ++|.++|....+|
T Consensus 85 V~F~V~~Dg~~l~~s~~~~~~~~~~~~~vdv~--------G~-~~L~L~v~~~g~g 131 (145)
T smart00776 85 VVFEVYADGTKLYNSGVLRGADPAKAVDVDVS--------GA-KELRLVVTDAGDG 131 (145)
T ss_pred EEEEEEeCCEeEEEcccccCCCCCeEEEEEcC--------CC-eEEEEEEEeCCCC
Confidence 36799999999977742 22 23455553 74 8999999876544
No 53
>smart00561 MBT Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2. These proteins are involved in transcriptional regulation.
Probab=24.74 E-value=59 Score=24.85 Aligned_cols=21 Identities=38% Similarity=0.874 Sum_probs=18.2
Q ss_pred CCceEEEEeCcccceeEEEEc
Q 023943 159 QGRRILLHFEAVDSAFCAWIN 179 (275)
Q Consensus 159 ~~~~i~L~f~gv~s~~~VwvN 179 (275)
.|+++.|+|+|-++.+..|++
T Consensus 54 ~g~~l~v~~dg~~~~~D~W~~ 74 (96)
T smart00561 54 KGYRLLLHFDGWDDKYDFWCD 74 (96)
T ss_pred ECCEEEEEEccCCCcCCEEEE
Confidence 378999999999988888875
No 54
>PF11324 DUF3126: Protein of unknown function (DUF3126); InterPro: IPR021473 This family of proteins with unknown function appear to be restricted to Alphaproteobacteria.
Probab=23.88 E-value=71 Score=22.72 Aligned_cols=18 Identities=17% Similarity=0.163 Sum_probs=14.8
Q ss_pred cccceeEEEEcCEEeeee
Q 023943 169 AVDSAFCAWINGVPVGYS 186 (275)
Q Consensus 169 gv~s~~~VwvNG~~VG~~ 186 (275)
.-+..++||+++++||.-
T Consensus 25 k~~dsaEV~~g~EfiGvi 42 (63)
T PF11324_consen 25 KKDDSAEVYIGDEFIGVI 42 (63)
T ss_pred CCCCceEEEeCCEEEEEE
Confidence 346689999999999963
No 55
>PF03422 CBM_6: Carbohydrate binding module (family 6); InterPro: IPR005084 A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins. CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose [, ]. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology. Previous classification of cellulose-binding domains was based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with Roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII. For a detailed review on the structure and binding modes of CBMs see []. This entry represents CBM6 from CAZY which was previously known as cellulose-binding domain family VI (CBD VI). CBM6 binds to amorphous cellulose, xylan, mixed beta-(1,3)(1,4)glucan and beta-1,3-glucan[, , ]. CBM6 adopts a classic lectin-like beta-jelly roll fold, predominantly consisting of five antiparallel beta-strands on one face and four antiparallel beta-strands on the other face. It contains two potential ligand binding sites, named respectively cleft A and B. These clefts include aromatic residues which are probably involved in the substrate binding. The cleft B is located on the concave surface of one beta-sheet, and the cleft A on one edge of the protein between the loop that connects the inner and outer beta-sheets of the jellyroll fold []. The multiple binding clefts confer the extensive range of specificities displayed by the domain [, , ].; GO: 0030246 carbohydrate binding; PDB: 1UY1_A 1UY3_A 1UY4_A 1UY2_A 1UYY_A 1UXZ_B 1UYZ_A 1UY0_B 1UYX_A 1UZ0_A ....
Probab=23.26 E-value=3.3e+02 Score=20.71 Aligned_cols=40 Identities=10% Similarity=0.061 Sum_probs=23.7
Q ss_pred eeEEEEcC---EEeeeec----CCCCC---ceeccccccccCCCCCceEEEEEEEe
Q 023943 173 AFCAWING---VPVGYSQ----DSRLP---AEFEISDYCYPHGSDKKNVLAVQVFR 218 (275)
Q Consensus 173 ~~~VwvNG---~~VG~~~----~~~~p---~efdIT~~Lk~~~~G~eN~L~V~V~~ 218 (275)
..+|+||| +.++... ++... .+..| .|.. | .|.|.+....
T Consensus 61 ~~~l~id~~~g~~~~~~~~~~tg~w~~~~~~~~~v--~l~~---G-~h~i~l~~~~ 110 (125)
T PF03422_consen 61 TIELRIDGPDGTLIGTVSLPPTGGWDTWQTVSVSV--KLPA---G-KHTIYLVFNG 110 (125)
T ss_dssp EEEEEETTTTSEEEEEEEEE-ESSTTEEEEEEEEE--EEES---E-EEEEEEEESS
T ss_pred EEEEEECCCCCcEEEEEEEcCCCCccccEEEEEEE--eeCC---C-eeEEEEEEEC
Confidence 56777777 7776542 33333 22223 3556 8 4888888765
No 56
>PRK01904 hypothetical protein; Provisional
Probab=22.44 E-value=1.6e+02 Score=25.98 Aligned_cols=41 Identities=17% Similarity=0.081 Sum_probs=25.8
Q ss_pred ceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEecC
Q 023943 172 SAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFRWS 220 (275)
Q Consensus 172 s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~~ 220 (275)
..-.+-|||+.+..+-.. ..-.+.++ . |++|+|+|++...-
T Consensus 29 ~i~lL~vnG~kv~~s~~~-~~~~l~L~----d---gg~hQIv~ry~~~~ 69 (219)
T PRK01904 29 NIDFLAIDGQKASKSLLK-EAKSFNIN----D---TQVHQVVVRVSEIV 69 (219)
T ss_pred ceEEEEECCEECcccccc-CCcceEeC----C---CCceEEEEEEeecc
Confidence 345678999999643222 22334444 3 53599999999753
No 57
>KOG1752 consensus Glutaredoxin and related proteins [Posttranslational modification, protein turnover, chaperones]
Probab=21.91 E-value=65 Score=25.08 Aligned_cols=25 Identities=16% Similarity=0.195 Sum_probs=18.9
Q ss_pred EEEEeCcccceeEEEEcCEEeeeec
Q 023943 163 ILLHFEAVDSAFCAWINGVPVGYSQ 187 (275)
Q Consensus 163 i~L~f~gv~s~~~VwvNG~~VG~~~ 187 (275)
....+.|..+.-.|||||++||...
T Consensus 58 ~l~~~tg~~tvP~vFI~Gk~iGG~~ 82 (104)
T KOG1752|consen 58 ALKKLTGQRTVPNVFIGGKFIGGAS 82 (104)
T ss_pred HHHHhcCCCCCCEEEECCEEEcCHH
Confidence 3344566668899999999998754
No 58
>PRK15222 putative pilin structural protein SafD; Provisional
Probab=20.70 E-value=1.8e+02 Score=24.36 Aligned_cols=44 Identities=14% Similarity=0.284 Sum_probs=30.3
Q ss_pred EEeCcccceeEEEEcCEEeeeecCCCCCceeccccccccCCCCCceEEEEEEEe
Q 023943 165 LHFEAVDSAFCAWINGVPVGYSQDSRLPAEFEISDYCYPHGSDKKNVLAVQVFR 218 (275)
Q Consensus 165 L~f~gv~s~~~VwvNG~~VG~~~~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~ 218 (275)
+...|.++.|.||.|++.+|-.. -.+-|..- +. . .|+|.||+.-
T Consensus 60 I~~~g~htGF~Vwsna~q~gg~p-----~~Yil~G~-~d---s-~h~LrVRl~G 103 (156)
T PRK15222 60 VTYHGSHSGFRVWSDEQKAGNTP-----TVLLLSGQ-QD---P-RHHIQVRLEG 103 (156)
T ss_pred EEeCCCceeEEEEecccccCCCc-----cEEEEECC-CC---C-cceEEEEecC
Confidence 33888899999999999986543 33333321 22 3 4899999984
No 59
>cd02848 Chitinase_N_term Chitinase N-terminus domain. Chitinases hydrolyze the abundant natural biopolymer chitin, producing smaller chito-oligosaccharides. Chitin consists of multiple N-acetyl-D-glucosamine (NAG) residues connected via beta-1,4-glycosidic linkages and is an important structural element of fungal cell wall and arthropod exoskeletons. On the basis of the mode of chitin hydrolysis, chitinases are classified as random, endo-, and exo-chitinases and based on sequence criteria, chitinases belong to families 18 and 19 of glycosyl hydrolases. The N-terminus of chitinase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitob
Probab=20.24 E-value=2.5e+02 Score=22.09 Aligned_cols=45 Identities=13% Similarity=0.137 Sum_probs=30.1
Q ss_pred CcccceeEEEEcCEEeeeec--CCCCCceeccccccccCCCCCceEEEEEEEec
Q 023943 168 EAVDSAFCAWINGVPVGYSQ--DSRLPAEFEISDYCYPHGSDKKNVLAVQVFRW 219 (275)
Q Consensus 168 ~gv~s~~~VwvNG~~VG~~~--~~~~p~efdIT~~Lk~~~~G~eN~L~V~V~~~ 219 (275)
+.....++|++||+.|-... ++-....|+++ +. | .-.+.|++-+.
T Consensus 45 G~~Gd~a~vl~dg~~V~~G~~~~~~~~at~~v~---kg---G-~y~m~V~lCn~ 91 (106)
T cd02848 45 GDPGDTYKVLLDGKEVWSGALTGSSGTATFKVG---KG---G-RYQMQVALCNG 91 (106)
T ss_pred CCCCcEEEEEECCeEEEcccCCCCccEEEEEeC---CC---C-eEEEEEEEECC
Confidence 55667899999999984432 22234566654 34 6 48999988764
No 60
>PF01589 Alpha_E1_glycop: Alphavirus E1 glycoprotein; InterPro: IPR002548 Alphaviruses are enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Semliki Forest and Sindbis viruses []. Alphaviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope proteins P62 and E1 that associate as a heterodimer. The viral membrane-anchored surface glycoproteins are responsible for receptor recognition and entry into target cells through membrane fusion. The proteolytic maturation of P62 into E2 (IPR000936 from INTERPRO) and E3 (IPR002533 from INTERPRO) causes a change in the viral surface. Together the E1, E2, and sometimes E3, glycoprotein "spikes" form an E1/E2 dimer or an E1/E2/E3 trimer, where E2 extends from the centre to the vertices, E1 fills the space between the vertices, and E3, if present, is at the distal end of the spike []. Upon exposure of the virus to the acidity of the endosome, E1 dissociates from E2 to form an E1 homotrimer, which is necessary for the fusion step to drive the cellular and viral membranes together. The alphaviral glycoprotein E1 is a class II viral fusion protein, which is structurally different from the class I fusion proteins found in influenza virus and HIV. The structure of the Semliki Forest virus revealed a structure that is similar to that of flaviviral glycoprotein E, with three structural domains in the same primary sequence arrangement []. This entry represents all three domains of the alphaviral E1 glycoprotein.; GO: 0004252 serine-type endopeptidase activity, 0019028 viral capsid, 0055036 virion membrane; PDB: 2YEW_L 1LD4_P 1Z8Y_K 3MUU_B 3N44_F 2XFB_F 3N42_F 2XFC_H 3N40_F 3N41_F ....
Probab=20.06 E-value=1.5e+02 Score=29.22 Aligned_cols=28 Identities=18% Similarity=0.332 Sum_probs=24.3
Q ss_pred ceEEEEeCcccceeEEEEcCEEeeeecC
Q 023943 161 RRILLHFEAVDSAFCAWINGVPVGYSQD 188 (275)
Q Consensus 161 ~~i~L~f~gv~s~~~VwvNG~~VG~~~~ 188 (275)
-.+.+.++.+....++||||..-+...|
T Consensus 195 a~l~ityG~~~~~v~~yVNG~t~~~~~~ 222 (502)
T PF01589_consen 195 AKLRITYGNVNQTVDVYVNGETPVNSGD 222 (502)
T ss_dssp EEEEEEESSEEEEEEEESSSSCEEEETT
T ss_pred eEEEEEEcceEEEEEEEEcCccceeccc
Confidence 4678899999999999999998877765
No 61
>PRK06789 flagellar motor switch protein; Validated
Probab=20.01 E-value=1.1e+02 Score=22.45 Aligned_cols=27 Identities=11% Similarity=0.209 Sum_probs=18.5
Q ss_pred CceEEEEeCcccceeEEEEcCEEeeeec
Q 023943 160 GRRILLHFEAVDSAFCAWINGVPVGYSQ 187 (275)
Q Consensus 160 ~~~i~L~f~gv~s~~~VwvNG~~VG~~~ 187 (275)
|.-+.|. ..+.....+++||+.+|+.+
T Consensus 31 Gsvi~Ld-k~~~epvdI~vNg~lia~GE 57 (74)
T PRK06789 31 GTLYRLE-NSTKNTVRLMLENEEIGTGK 57 (74)
T ss_pred CCEEEeC-CcCCCCEEEEECCEEEeEEe
Confidence 4444443 23466789999999999854
Done!