Query 000112
Match_columns 2161
No_of_seqs 285 out of 1332
Neff 3.2
Searched_HMMs 46136
Date Thu Mar 28 18:53:03 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/000112.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/000112hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG0045 Cytosolic Ca2+-depende 100.0 1.8E-72 4E-77 686.4 35.5 442 1693-2160 15-485 (612)
2 smart00230 CysPc Calpain-like 100.0 4.4E-69 9.6E-74 616.0 29.1 302 1694-2014 4-317 (318)
3 cd00044 CysPc Calpains, domain 100.0 1.5E-66 3.2E-71 592.1 27.9 304 1695-2005 3-315 (315)
4 PF00648 Peptidase_C2: Calpain 100.0 4.6E-65 1E-69 573.0 20.8 283 1705-2006 1-297 (298)
5 KOG0045 Cytosolic Ca2+-depende 100.0 5.3E-31 1.2E-35 323.7 -10.7 598 874-1874 13-610 (612)
6 smart00720 calpain_III calpain 99.8 3E-20 6.4E-25 191.3 13.5 135 2013-2159 4-143 (143)
7 cd00214 Calpain_III Calpain, s 99.8 4.5E-20 9.8E-25 192.8 13.6 138 2013-2161 6-150 (150)
8 PF01067 Calpain_III: Calpain 99.8 2.9E-19 6.3E-24 182.4 10.6 136 2013-2159 5-147 (147)
9 cd00152 PTX Pentraxins are pla 97.9 8.2E-05 1.8E-09 82.1 12.1 162 1434-1618 32-195 (201)
10 smart00159 PTX Pentraxin / C-r 97.8 0.00016 3.4E-09 80.4 12.3 161 1434-1619 32-196 (206)
11 PF13385 Laminin_G_3: Concanav 97.3 0.00098 2.1E-08 66.5 8.8 80 1498-1594 78-157 (157)
12 PF00354 Pentaxin: Pentaxin fa 97.3 0.00092 2E-08 74.5 9.4 159 1434-1618 26-188 (195)
13 cd00110 LamG Laminin G domain; 94.2 0.31 6.7E-06 50.0 9.8 110 1432-1559 19-129 (151)
14 smart00210 TSPN Thrombospondin 94.1 0.32 6.9E-06 53.8 10.4 86 1434-1533 53-143 (184)
15 smart00282 LamG Laminin G doma 93.2 0.71 1.5E-05 47.3 10.4 109 1435-1560 3-112 (135)
16 smart00560 LamGL LamG-like jel 90.8 0.89 1.9E-05 47.6 8.0 82 1434-1532 2-88 (133)
17 cd02619 Peptidase_C1 C1 Peptid 81.9 2.3 5E-05 46.4 5.5 49 1924-1998 168-218 (223)
18 KOG1029 Endocytic adaptor prot 77.6 3.7 8E-05 54.2 6.0 33 1011-1043 72-104 (1118)
19 PF02210 Laminin_G_2: Laminin 76.4 5.5 0.00012 39.5 5.7 62 1497-1562 46-107 (128)
20 cd02248 Peptidase_C1A Peptidas 60.0 22 0.00047 39.4 6.6 43 1925-1993 156-198 (210)
21 PF03699 UPF0182: Uncharacteri 58.3 13 0.00028 50.2 5.3 62 865-926 62-154 (774)
22 PF02057 Glyco_hydro_59: Glyco 57.9 29 0.00064 46.3 8.2 91 1431-1534 542-638 (669)
23 KOG4326 Mitochondrial F1F0-ATP 56.9 15 0.00033 36.8 4.2 17 1242-1258 13-29 (81)
24 TIGR00805 oat sodium-independe 47.6 33 0.00072 45.1 6.4 94 919-1014 328-436 (633)
25 PF09323 DUF1980: Domain of un 44.9 57 0.0012 36.8 6.9 57 64-124 4-60 (182)
26 PF00112 Peptidase_C1: Papain 44.8 30 0.00064 38.0 4.7 44 1925-1994 163-206 (219)
27 PF04156 IncA: IncA protein; 44.5 12 0.00025 41.7 1.5 14 1031-1044 63-76 (191)
28 PTZ00334 trans-sialidase; Prov 43.2 33 0.00072 46.6 5.5 77 1504-1598 642-724 (780)
29 COG1390 NtpE Archaeal/vacuolar 40.8 2.5E+02 0.0053 32.8 11.1 113 1255-1384 17-133 (194)
30 PF07946 DUF1682: Protein of u 40.6 40 0.00086 41.2 5.3 10 1332-1341 305-314 (321)
31 PF00054 Laminin_G_1: Laminin 38.5 40 0.00087 35.5 4.3 51 1472-1530 26-76 (131)
32 cd08045 TAF4 TATA Binding Prot 38.2 17 0.00037 41.8 1.7 44 1353-1420 166-209 (212)
33 KOG1029 Endocytic adaptor prot 37.6 43 0.00093 45.1 5.0 43 1290-1332 356-398 (1118)
34 PTZ00266 NIMA-related protein 37.5 53 0.0012 46.0 6.2 10 1377-1386 508-517 (1021)
35 PLN02316 synthase/transferase 36.9 53 0.0012 46.1 6.1 16 1837-1852 688-703 (1036)
36 PF09472 MtrF: Tetrahydrometha 35.4 12 0.00027 36.7 -0.0 47 796-842 17-64 (64)
37 KOG1144 Translation initiation 33.1 1E+02 0.0022 42.1 7.2 17 1521-1537 397-413 (1064)
38 PF09323 DUF1980: Domain of un 32.6 76 0.0016 35.8 5.5 65 956-1020 4-83 (182)
39 PF05297 Herpes_LMP1: Herpesvi 32.2 15 0.00033 44.5 0.0 52 949-1002 107-158 (381)
40 cd02620 Peptidase_C1A_Cathepsi 32.1 85 0.0018 36.5 5.9 27 1927-1953 184-210 (236)
41 PF09586 YfhO: Bacterial membr 30.3 1.1E+02 0.0023 41.4 7.1 24 846-869 214-238 (843)
42 COG0815 Lnt Apolipoprotein N-a 29.3 1.2E+02 0.0026 39.8 7.1 77 99-180 97-185 (518)
43 PF12065 DUF3545: Protein of u 28.0 26 0.00055 34.2 0.7 10 1333-1342 23-32 (59)
44 KOG2341 TATA box binding prote 27.8 61 0.0013 42.7 4.2 27 1055-1081 189-215 (563)
45 PF05875 Ceramidase: Ceramidas 27.8 54 0.0012 38.6 3.5 143 806-969 14-159 (262)
46 cd02698 Peptidase_C1A_Cathepsi 27.2 1.2E+02 0.0027 35.2 6.1 27 1927-1953 178-205 (239)
47 PF04405 ScdA_N: Domain of Unk 27.0 42 0.00092 32.1 2.0 33 511-543 11-47 (56)
48 TIGR00570 cdk7 CDK-activating 26.9 96 0.0021 38.5 5.3 104 1204-1329 57-164 (309)
49 PTZ00266 NIMA-related protein 26.9 89 0.0019 44.0 5.6 16 823-838 227-242 (1021)
50 cd06899 lectin_legume_LecRK_Ar 26.8 2.2E+02 0.0047 33.5 8.0 37 1492-1528 150-186 (236)
51 PF05154 TM2: TM2 domain; Int 26.6 21 0.00046 33.0 -0.0 33 290-326 3-38 (51)
52 PF09991 DUF2232: Predicted me 26.2 61 0.0013 37.6 3.5 87 915-1002 199-288 (290)
53 PF14402 7TM_transglut: 7 tran 25.6 86 0.0019 38.8 4.6 55 942-1001 146-207 (313)
54 PF06439 DUF1080: Domain of Un 24.9 2.6E+02 0.0056 30.4 7.7 102 1415-1529 38-149 (185)
55 PRK11588 hypothetical protein; 24.0 2E+02 0.0043 38.0 7.6 46 890-952 172-217 (506)
56 KOG3011 Ubiquitin-conjugating 23.5 1.7E+02 0.0038 35.6 6.3 117 852-984 83-225 (293)
57 COG4870 Cysteine protease [Pos 22.8 69 0.0015 40.4 3.2 49 1923-1997 260-318 (372)
58 cd01951 lectin_L-type legume l 22.6 3.7E+02 0.0079 30.9 8.7 50 1505-1555 154-203 (223)
59 PRK02509 hypothetical protein; 22.1 1.6E+02 0.0035 41.3 6.6 34 895-928 188-238 (973)
60 TIGR00917 2A060601 Niemann-Pic 21.9 39 0.00084 48.0 1.0 78 956-1034 640-742 (1204)
61 PF04123 DUF373: Domain of unk 21.8 40 0.00086 42.1 1.0 138 923-1074 161-320 (344)
62 PRK10263 DNA translocase FtsK; 21.4 42 0.00091 47.8 1.2 29 773-805 24-52 (1355)
63 PRK15097 cytochrome d terminal 21.3 2.5E+02 0.0053 37.3 7.6 91 948-1074 393-491 (522)
64 PF13801 Metal_resist: Heavy-m 21.2 3.1E+02 0.0068 27.4 6.9 19 1287-1305 43-61 (125)
65 KOG3583 Uncharacterized conser 21.1 1.7E+02 0.0037 35.0 5.6 122 1233-1363 38-185 (279)
66 PF15412 Nse4-Nse3_bdg: Bindin 21.0 69 0.0015 30.5 2.1 28 182-209 18-45 (56)
67 PLN00122 serine/threonine prot 21.0 1E+02 0.0022 35.4 3.8 22 1323-1344 142-163 (170)
68 TIGR02916 PEP_his_kin putative 20.9 57 0.0012 43.0 2.2 36 886-922 58-93 (679)
69 PF02387 IncFII_repA: IncFII R 20.7 1.1E+02 0.0023 37.6 4.2 87 1247-1349 159-251 (281)
70 KOG4661 Hsp27-ERE-TATA-binding 20.2 1.4E+02 0.0029 39.5 5.0 30 1313-1342 626-655 (940)
71 PF02460 Patched: Patched fami 20.2 1.1E+02 0.0023 41.5 4.5 53 955-1007 282-348 (798)
No 1
>KOG0045 consensus Cytosolic Ca2+-dependent cysteine protease (calpain), large subunit (EF-Hand protein superfamily) [Posttranslational modification, protein turnover, chaperones; Signal transduction mechanisms]
Probab=100.00 E-value=1.8e-72 Score=686.37 Aligned_cols=442 Identities=38% Similarity=0.687 Sum_probs=357.5
Q ss_pred HHHHHHHhcCCCceecCCCCCCCCCcccCCCCCCccccCcceeeccccccccCccCCCceeecCCCCCCCcccCCCCCch
Q 000112 1693 AVKEALSARGERQFTDHEFPPDDQSLYVDPGNPPSKLQVVAEWMRPSEIVKESRLDCQPCLFSGAVNPSDVCQGRLGDCW 1772 (2161)
Q Consensus 1693 aIKE~L~arGe~~FeDpEFPPsdsSLy~Dp~~P~sklq~~IqWkRPsEI~~e~~~ds~P~LF~ggIsPsDVkQG~LGDCW 1772 (2161)
.+++.|...+ ..|+|++|||+++|++.+...|..+. ..+.|+||+|++. +|+++.+++++.||+||.+||||
T Consensus 15 ~~~~~cl~~~-~~F~D~~FP~~~~Sl~~~~~~p~~~~-~~i~W~RP~ei~~------~p~~i~~~~~~~di~Qg~lgdCw 86 (612)
T KOG0045|consen 15 RLRRDCLPAK-SLFVDALFPAADSSLFYKLSTPLAQF-SDIVWKRPQEICA------NPRLIVDGPSRFDVKQGLLGDCW 86 (612)
T ss_pred HHHHHHhhcC-CcccccCCCCCCccccccccCCCccc-ccceecCcccccC------CCCeecCCCCcceeEEeeecchH
Confidence 3455555554 58999999999999998765555332 4589999999764 68999999999999999999999
Q ss_pred HHHHHHHHhccccccccccCc----ccCCCCcEEEEEeeCCEEEEEEecccccCCCCCceEEeecCCCCchhHHHHHHHH
Q 000112 1773 FLSAVAVLTEVSQISEVIITP----EYNEEGIYTVRFCIQGEWVPVVVDDWIPCESPGKPAFATSKKGHELWVSILEKAY 1848 (2161)
Q Consensus 1773 FLAALAALAE~PrLle~fItP----eyNe~GiY~VRLyiNGeWreVVVDDrLPc~~nGkPLFArSsd~nELWpSLLEKAY 1848 (2161)
|+||+|+||.++.++.+++++ .+++.|+|+||||++|+|+.|+|||+|||. +|+..|+++..++|+|++||||||
T Consensus 87 ~laA~a~la~~~~ll~~vip~~~~~~~~yaGif~f~~w~~G~W~~VvIDD~LP~~-~~~~~~~~s~~~~efW~aLlEKAy 165 (612)
T KOG0045|consen 87 FLAACAALALRPELLDKVIPQDQSFQENYAGIFHFRFWQNGEWVEVVIDDRLPTS-NGGLLFSHSSGKNEFWAALLEKAY 165 (612)
T ss_pred HHHHHHHhhcCHHHHHhccCCCcccccccceEEEEEEEeCCeEEEEEeeeecceE-cCCEEEEeecCCceeHHHHHHHHH
Confidence 999999999999998888873 367899999999999999999999999997 567889999888999999999999
Q ss_pred HHhcCCcccccCCChHHHHhhccCCcceeecccchhhhhccchhHHHHHHHHHhcCCCEEEeeCCC--CCC---cccccc
Q 000112 1849 AKLHGSYEALEGGLVQDALVDLTGGAGEEIDMRSAQAQIDLASGRLWSQLLRFKQEGFLLGAGSPS--GSD---VHISSS 1923 (2161)
Q Consensus 1849 AKLhGSYeaLeGG~~sEAL~DLTGgP~E~IDL~saeaq~Dl~sdeLWk~LlsalksG~LMgAsTps--gsD---~e~es~ 1923 (2161)
||++|||+++.||...+|+.+|||+.+|.++++..... +.+ +.+|. +.+..++|.+++|++.. ..+ .....+
T Consensus 166 aKl~GsY~~l~gg~~~~a~~~lTG~~~e~~~l~~~~~~-~~~-~l~~~-~~~~~~~~~~l~c~~~~~~~~~~~~~~~~~~ 242 (612)
T KOG0045|consen 166 AKLLGSYEALHGGSTIDALVDLTGGVTEPFDLNKTPKS-FKN-NLVWA-LLKSAHRGSLLLCSIESKDPTEEEEEAKLRN 242 (612)
T ss_pred HHHhCcccCCCCCchhhHHHhccCCccceeEcccCcch-hHH-HHHHH-HHHhhhccCceeeeccccccchhHHHHHhhc
Confidence 99999999999999999999999999999998764311 111 33444 44555556666665432 222 235689
Q ss_pred CcccCceeEEEEEEEEcC----eEEEEEecCCCCCccccCCCCCCCccccHHHHhhhCCC--CCCCCCeeecchhhHhhc
Q 000112 1924 GIVQGHAYSILQVREVDG----HKLVQIRNPWANEVEWNGPWSDSSPEWTDRMKHKLKHV--PQSKDGIFWMSWQDFQIH 1997 (2161)
Q Consensus 1924 GLVsGHAYSVLdV~EVdG----~RLVRLRNPWG~~~EWKG~WSD~S~eWTeeLKkkL~~~--p~sDDGtFWMSfEDFLky 1997 (2161)
||+++|||+|++++++++ ++|+||||||| +.||||+|||++++|....+..+... ...+||+|||+++||+++
T Consensus 243 gL~~~HaYsit~~~~~~~~~~~~~lirlrNPwg-~~~W~G~wsd~~~~W~~v~~~~~~~~~~~~~~dGeFWms~~dF~~~ 321 (612)
T KOG0045|consen 243 GLVKGHAYAITDVREVQGRGGKHRLIRLRNPWG-ESEWNGPWSDGSEEWHLVDKSKLSELGRQPLDDGEFWMSFDDFLRE 321 (612)
T ss_pred CccccccEEEEEEEEeecccccceeEEecCCcC-CceeccccccCCcchhhhCHHHHhhcccccccCCCeeeeHHHHHhh
Confidence 999999999999999999 99999999999 58999999999999998766554422 126899999999999999
Q ss_pred ccceeEEEEcCCCc---------cccccCce--e-cccCCCCccC-cCCCCCCeEEEEeccCCCCCCEEEEEEeeccccc
Q 000112 1998 FRSIYVCRVYPSEM---------RYSVHGQW--R-GYSAGGCQDY-ASWNQNPQFRLRASGSDASFPIHVFITLTQGVSF 2064 (2161)
Q Consensus 1998 FssIyICrl~Pd~~---------RyrVhGeW--r-G~TAGGC~Df-dTF~qNPQY~LsVssSD~sePi~VLISLSQkDqr 2064 (2161)
|+.+++|++.|++. ....+|+| . +.++|||.++ ++|.+||||.+.+..++. ..+.++..+.|+..+
T Consensus 322 F~~~~vC~~~~~~~~~~~~~~~~~~~~~~~w~~~~~~t~ggc~~~~~tF~~npq~~~~~~~~~~-~~~~~v~~~~q~~~~ 400 (612)
T KOG0045|consen 322 FDSLTVCRLRPDWLESRNQLQWVKLSLDGEWELARGVTAGGCRNSVDTFDRNPQYILAVRKPTK-SLCAVVLALFQKTRR 400 (612)
T ss_pred CCeEeecCCCcchhhhhheeeeeeeecCCccceeecccCCCCccCcccccCCceEEEEecCCCc-cceEEEEEeeccccc
Confidence 99999999988754 13467999 3 5789999998 799999999999975432 357788889998643
Q ss_pred cccccccccccccCCCceeEEEEEEEEecCcccccceeeccc-cCCcccccCcceEEEEEEeCCCccEEEEccccCCCCc
Q 000112 2065 SRTVAGFKNYQSSHDSMMFYIGMRILKTRGRRAAHNIYLHES-VGGTDYVNSREISCEMVLDPDPKGYTIVPTTIHPGEE 2143 (2161)
Q Consensus 2065 sR~~~GFrnYq~shDs~LLyIGL~VfKvrGnRs~~nIflhEs-V~sgdYVNSREVS~RLtLEPepG~YVVVPSTyEPGqE 2143 (2161)
+-. ....+...||+++++...++... +..+.+ .....|.+.|+|+.++++|| |.|++||+|++|+++
T Consensus 401 ~~~---------~~~~~~~~ig~~i~~v~~~~~~~-~~~~~~~~~~~~~i~~r~v~~~~~~P~--~~y~~~pst~~~~~~ 468 (612)
T KOG0045|consen 401 GER---------SFGANILDIGFHIYEVPLEGKYF-VLDNAPIASSSSFINNREVSVRFRLPP--GTYVIVPSTFEPGEE 468 (612)
T ss_pred ccc---------cccceeeecceEEEEecCCCCce-EecccchhcccccccceeEEEEecCCC--cceeecccCCCCCCC
Confidence 111 11235688999999998663221 222222 34567999999999999775 899999999999999
Q ss_pred cCcEEEEEeCCCcceee
Q 000112 2144 APFVLSVFTKASIILEA 2160 (2161)
Q Consensus 2144 G~FTLRVFSskpItLEP 2160 (2161)
++|+|+||++.++..++
T Consensus 469 ~~f~lrvfs~~~~~~~~ 485 (612)
T KOG0045|consen 469 GEFLLRVFSNVKVKSEE 485 (612)
T ss_pred ccEEEEEeecccccCcc
Confidence 99999999998877663
No 2
>smart00230 CysPc Calpain-like thiol protease family. Calpain-like thiol protease family (peptidase family C2). Calcium activated neutral protease (large subunit).
Probab=100.00 E-value=4.4e-69 Score=615.99 Aligned_cols=302 Identities=41% Similarity=0.813 Sum_probs=266.6
Q ss_pred HHHHHHhcCCCceecCCCCCCCCCcccCCCCCCccccCcceeeccccccccCccCCCceeecCCCCCCCcccCCCCCchH
Q 000112 1694 VKEALSARGERQFTDHEFPPDDQSLYVDPGNPPSKLQVVAEWMRPSEIVKESRLDCQPCLFSGAVNPSDVCQGRLGDCWF 1773 (2161)
Q Consensus 1694 IKE~L~arGe~~FeDpEFPPsdsSLy~Dp~~P~sklq~~IqWkRPsEI~~e~~~ds~P~LF~ggIsPsDVkQG~LGDCWF 1773 (2161)
|.+.|..++ .+|+|++|||+..||+.++..+ ..++|+||+|+++ +|++|.++++|.||+||.+|||||
T Consensus 4 i~~~c~~~~-~~f~D~~Fpp~~~sl~~~~~~~-----~~~~W~Rp~e~~~------~~~~~~~~i~~~di~QG~lgDC~~ 71 (318)
T smart00230 4 LRQYCKESG-TLFEDPLFPANNGSLFFSQRQR-----KFVVWKRPHEIFE------NPPFIVGGASRTDICQGVLGDCWL 71 (318)
T ss_pred HHHHHHHcC-CCccCCCCCCCcCccccCCCCC-----CCcEEECcHHHcC------CCEEEeCCCChhhccCcccccHHH
Confidence 455565554 6999999999999999765432 2479999999985 478998899999999999999999
Q ss_pred HHHHHHHhccccccccccCc--c--cCCCCcEEEEEeeCCEEEEEEecccccCCCCCceEEeecCCCCchhHHHHHHHHH
Q 000112 1774 LSAVAVLTEVSQISEVIITP--E--YNEEGIYTVRFCIQGEWVPVVVDDWIPCESPGKPAFATSKKGHELWVSILEKAYA 1849 (2161)
Q Consensus 1774 LAALAALAE~PrLle~fItP--e--yNe~GiY~VRLyiNGeWreVVVDDrLPc~~nGkPLFArSsd~nELWpSLLEKAYA 1849 (2161)
+|||++|+++|.+++.++++ + .|+.|+|+||||+||+|+.|+|||+||+.. |.++|+++.+++|+|++|||||||
T Consensus 72 lsal~~la~~~~~i~~if~~~~~~~~~~~G~y~vrl~~~G~w~~V~VDd~lP~~~-~~~~~~~~~~~~e~W~~LLEKAyA 150 (318)
T smart00230 72 LAALASLTLREKLLDRVIPHDQEFSENYAGIFHFRFWRFGKWVDVVIDDRLPTYN-GELVFMHSNSRNEFWSALLEKAYA 150 (318)
T ss_pred HHHHHHHHhCHHHHhheEeCCcccccccCCEEEEEEEECCEEEEEEecCCCeeeC-CceEEEEeCCCCcchhHHHHHHHH
Confidence 99999999999888777652 2 468999999999999999999999999974 569999998899999999999999
Q ss_pred HhcCCcccccCCChHHHHhhccCCcceeecccchhhhhccchhHHHHHHHHHhcCCCEEEeeCCCCC---CccccccCcc
Q 000112 1850 KLHGSYEALEGGLVQDALVDLTGGAGEEIDMRSAQAQIDLASGRLWSQLLRFKQEGFLLGAGSPSGS---DVHISSSGIV 1926 (2161)
Q Consensus 1850 KLhGSYeaLeGG~~sEAL~DLTGgP~E~IDL~saeaq~Dl~sdeLWk~LlsalksG~LMgAsTpsgs---D~e~es~GLV 1926 (2161)
|+||||++|.||.+.+||++|||++++.+++++.. .+.+++|+.|.++.++|++|+|+++..+ +...++.||+
T Consensus 151 K~~GsY~~i~gg~~~~al~~LTG~~~~~i~l~~~~----~~~~~~w~~l~~~~~~g~lv~~~t~~~~~~~~~~~~~~GLv 226 (318)
T smart00230 151 KLNGCYEALKGGSTTEALEDLTGGVAESIDLKEAS----KDPDNLFEDLFKAFERGSLMGCSIGAGTAVEEEEQKDCGLV 226 (318)
T ss_pred HHcCCCcccCCCCHHHHHHHhcCCCeEEEEccccc----CCHHHHHHHHHHHHhCCCeEEEEcCCCCcchhhhhhhcCcc
Confidence 99999999999999999999999999999987642 2467899999999999999999987653 3345679999
Q ss_pred cCceeEEEEEEEEcCeE--EEEEecCCCCCccccCCCCCCCcccc---HHHHhhhCCCCCCCCCeeecchhhHhhcccce
Q 000112 1927 QGHAYSILQVREVDGHK--LVQIRNPWANEVEWNGPWSDSSPEWT---DRMKHKLKHVPQSKDGIFWMSWQDFQIHFRSI 2001 (2161)
Q Consensus 1927 sGHAYSVLdV~EVdG~R--LVRLRNPWG~~~EWKG~WSD~S~eWT---eeLKkkL~~~p~sDDGtFWMSfEDFLkyFssI 2001 (2161)
++|||+|++++++++++ ||+|||||| ..||+|+|||+|++|+ +++++++++. ..+||+|||+|+||++||+++
T Consensus 227 ~~HaYsVl~v~~~~~~~~~Ll~lrNPWg-~~eW~G~wsd~s~~W~~~~~~~~~~l~~~-~~~dG~FWM~~~df~~~F~~~ 304 (318)
T smart00230 227 KGHAYSVTDVREVQGRRQELLRLRNPWG-QVEWNGPWSDDSPEWRSVSASEKKNLGLT-FDDDGEFWMSFEDFLRHFDKV 304 (318)
T ss_pred cCccEEEEEEEEEecCCeEEEEEECCCC-CCCcCCCCCCCCccccccCHHHHHHhCCC-CCCCCEEEEEhHHHHhhCCeE
Confidence 99999999999998866 999999999 5899999999999999 6788888764 469999999999999999999
Q ss_pred eEEEEcCCCcccc
Q 000112 2002 YVCRVYPSEMRYS 2014 (2161)
Q Consensus 2002 yICrl~Pd~~Ryr 2014 (2161)
++|++.|++++|+
T Consensus 305 ~vc~~~~~~~~~r 317 (318)
T smart00230 305 EICNLNPDSLEER 317 (318)
T ss_pred EEeccCCcccccc
Confidence 9999999987664
No 3
>cd00044 CysPc Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction.
Probab=100.00 E-value=1.5e-66 Score=592.08 Aligned_cols=304 Identities=47% Similarity=0.846 Sum_probs=260.7
Q ss_pred HHHHHhcCCCceecCCCCCCCCCcccCCCCCCccccCcceeeccccccccCccCCCceeecCCCCCCCcccCCCCCchHH
Q 000112 1695 KEALSARGERQFTDHEFPPDDQSLYVDPGNPPSKLQVVAEWMRPSEIVKESRLDCQPCLFSGAVNPSDVCQGRLGDCWFL 1774 (2161)
Q Consensus 1695 KE~L~arGe~~FeDpEFPPsdsSLy~Dp~~P~sklq~~IqWkRPsEI~~e~~~ds~P~LF~ggIsPsDVkQG~LGDCWFL 1774 (2161)
.+.|...+ .+|+|++|||+.+|++.++..+..+....++|+||+|+++.... .+|++|.++++|.||+||.+|||||+
T Consensus 3 ~~~c~~~~-~~f~D~~Fpp~~~s~~~~~~~~~~~~~~~~~W~Rp~~~~~~~~~-~~~~~~~~~~~~~dI~QG~lgDC~~l 80 (315)
T cd00044 3 LQICLLSG-VLFEDPDFPPNDSSLGFDDSLSNGQPKKVIEWKRPSEIFADDGN-SNPRLFVNGASPSDVCQGILGDCWFL 80 (315)
T ss_pred HHHHHHcC-CCccCCCCCCCccccccccccccccCcCcceEECcHHHhCcccC-CCCEEEeCCCChhhcccCcccchHHH
Confidence 45565554 69999999999999987643333344556899999999875322 46899999999999999999999999
Q ss_pred HHHHHHhccccccccccCcc-c---CCCCcEEEEEeeCCEEEEEEecccccCCCCCceEEeecCCCCchhHHHHHHHHHH
Q 000112 1775 SAVAVLTEVSQISEVIITPE-Y---NEEGIYTVRFCIQGEWVPVVVDDWIPCESPGKPAFATSKKGHELWVSILEKAYAK 1850 (2161)
Q Consensus 1775 AALAALAE~PrLle~fItPe-y---Ne~GiY~VRLyiNGeWreVVVDDrLPc~~nGkPLFArSsd~nELWpSLLEKAYAK 1850 (2161)
|||++|+++|.+++.++.+. . ++.|+|+||||+||+|+.|+|||+||+..++ |+|+++.+.+|+|++||||||||
T Consensus 81 saL~~la~~~~~i~~lf~~~~~~~~~~~G~y~v~l~~~G~w~~V~VDD~lP~~~~~-~~~~~s~~~~e~W~~LlEKAyAK 159 (315)
T cd00044 81 AALAALAERPELLKRVIPPDQSFEENYAGIYHFRFWKNGEWVEVVIDDRLPTSNGG-LLFMHSRDRNELWVALLEKAYAK 159 (315)
T ss_pred HHHHHHHcCHHHHhheEcCCcccccCcCcEEEEEEEECCEEEEEEecCCCeecCCc-eEEEEECCCCeEcHHHHHHHHHh
Confidence 99999999998777766543 3 6899999999999999999999999997655 99999988899999999999999
Q ss_pred hcCCcccccCCChHHHHhhccCCcceeecccchhhhhccchhHHHHHHHHHhcCCCEEEeeCCCCCCcc-ccccCcccCc
Q 000112 1851 LHGSYEALEGGLVQDALVDLTGGAGEEIDMRSAQAQIDLASGRLWSQLLRFKQEGFLLGAGSPSGSDVH-ISSSGIVQGH 1929 (2161)
Q Consensus 1851 LhGSYeaLeGG~~sEAL~DLTGgP~E~IDL~saeaq~Dl~sdeLWk~LlsalksG~LMgAsTpsgsD~e-~es~GLVsGH 1929 (2161)
+||||++|.||++.+||++|||++++.+++++.... ...+++|+.|.++.+.+++|+|+|+...+.. .++.||+++|
T Consensus 160 ~~GsY~~i~gg~~~~al~~LTG~~~~~i~~~~~~~~--~~~~~~~~~l~~~~~~~~lv~~~t~~~~~~~~~~~~Gl~~~H 237 (315)
T cd00044 160 LHGSYEALVGGNTAEALEDLTGGPTERIDLKSADAS--SGDNDLFALLLSFLQGGSLIGCSTGSRSEEEARTANGLVKGH 237 (315)
T ss_pred hcCCccccCCCCHHHHHHHhhCCCcEEEEccccccc--cCHHHHHHHHHHHhhCCCEEEEEcCCCCcchhhccCCcccCc
Confidence 999999999999999999999999999998764321 2467899999999999999999998765432 5679999999
Q ss_pred eeEEEEEEEEc--CeEEEEEecCCCCCccccCCCCCCCccccH--HHHhhhCCCCCCCCCeeecchhhHhhcccceeEEE
Q 000112 1930 AYSILQVREVD--GHKLVQIRNPWANEVEWNGPWSDSSPEWTD--RMKHKLKHVPQSKDGIFWMSWQDFQIHFRSIYVCR 2005 (2161)
Q Consensus 1930 AYSVLdV~EVd--G~RLVRLRNPWG~~~EWKG~WSD~S~eWTe--eLKkkL~~~p~sDDGtFWMSfEDFLkyFssIyICr 2005 (2161)
||+|+++++++ |+|||+||||||. .||+|+|||+|++|.. ..++.+. ....+||+|||+|+||++||+++++|+
T Consensus 238 aY~Vl~~~~~~~~~~~lv~lrNPWg~-~~w~G~ws~~~~~w~~~~~~~~~~~-~~~~~dG~Fwm~~~df~~~F~~~~vc~ 315 (315)
T cd00044 238 AYSVLDVREVQEEGLRLLRLRNPWGV-GEWWGGWSDDSSEWWVIDAERKKLL-LSGKDDGEFWMSFEDFLRNFDGLYVCN 315 (315)
T ss_pred ceEEeEEEEEccCceEEEEecCCccC-CCccCCCCCCCchhccChHHHHHhc-CCCCCCCEEEEEhHHhheeeCeEEEeC
Confidence 99999999998 8999999999996 7999999999999953 3333333 345799999999999999999999994
No 4
>PF00648 Peptidase_C2: Calpain family cysteine protease This is family C2 in the peptidase classification. ; InterPro: IPR001300 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals: a highly calcium-sensitive (i.e., micro-molar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the milli-molar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only. All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28kDa subunit and an 80kDa subunit that shares 55-65% sequence homology between the two proteases. The crystallographic structure of m-calpain reveals six "domains" in the 80kDa subunit: A 19-amino acid NH2-terminal sequence; Active site domain IIa; Active site domain IIb. Domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related. Domain III; An 18-amino acid extended sequence linking domain III to domain IV; Domain IV, which resembles the penta EF-hand family of polypeptides, binds calcium and regulates activity. Ca2+-binding causes a rearrangement of the protein backbone, the net effect of which is that a Trp side chain, which acts as a wedge between catalytic domains IIa and IIb in the apo state, moves away from the active site cleft allowing for the proper formation of the catalytic triad. 
Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in these organisms is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin (IPR001259 from INTERPRO). The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma. Calpains are a family of cytosolic cysteine proteinases (see PDOC00126 from PROSITEDOC). Members of the calpain family are believed to function in various biological processes, including integrin-mediated cell migration, cytoskeletal remodeling, cell differentiation and apoptosis. The calpain family includes numerous members from C. elegans to mammals, with homologues in yeast and bacteria. The best characterised members are the m- and mu-calpains, both of which are heterodimers composed of a large catalytic subunit and a small regulatory subunit. The large subunit comprises four domains (dI-dIV) while the small subunit has two domains (dV-dVI). Domain dI is a short region cleaved by autolysis, dII is the catalytic core, dIII is a C2-like domain, dIV consists of five calcium binding EF-hand motifs. The crystal structure of calpain has been solved. 
The catalytic region consists of two distinct structural domains (dIIa and dIIb). dIIa contains a central helix flanked on three faces by a cluster of alpha-helices and is entirely unrelated to the corresponding domain in the typical thiol proteinases. The fold of dIIb is similar to the corresponding domain in other cysteine proteinases and contains two three-stranded anti-parallel beta-sheets. The catalytic triad residues (C,H,N) are located in dIIa and dIIb. The activation of the domain is dependent on the binding of two calcium atoms in two non-EF-hand calcium binding sites located in the catalytic core, one close to the Cys active site in dIIa and one at the end of dIIb. Calcium binding induces conformational changes in the catalytic domain that align the active site. The profile covers the whole catalytic domain.; GO: 0004198 calcium-dependent cysteine-type endopeptidase activity, 0006508 proteolysis, 0005622 intracellular; PDB: 2NQA_A 1KFU_L 1KFX_L 1QXP_B 2R9C_A 1TL9_A 2G8E_A 1KXR_B 2G8J_A 2NQG_A ....
Probab=100.00 E-value=4.6e-65 Score=572.98 Aligned_cols=283 Identities=51% Similarity=0.959 Sum_probs=228.0
Q ss_pred ceecCCCCCCCCCcccCCCCCCccccCcceeeccccccccCccCCCceeecCCCCCCCcccCCCCCchHHHHHHHHhccc
Q 000112 1705 QFTDHEFPPDDQSLYVDPGNPPSKLQVVAEWMRPSEIVKESRLDCQPCLFSGAVNPSDVCQGRLGDCWFLSAVAVLTEVS 1784 (2161)
Q Consensus 1705 ~FeDpEFPPsdsSLy~Dp~~P~sklq~~IqWkRPsEI~~e~~~ds~P~LF~ggIsPsDVkQG~LGDCWFLAALAALAE~P 1784 (2161)
.|+||+|||+++||+.++..+ ..++|+||+|+++ +|++|.+++.+.||+||.+|||||+|||++||++|
T Consensus 1 ~f~D~~Fpp~~~Sl~~~~~~~-----~~~~W~R~~e~~~------~~~~~~~~~~~~di~QG~lgDc~llaaL~~la~~~ 69 (298)
T PF00648_consen 1 LFEDPEFPPNDSSLGFDDQKP-----KNVEWKRPSEICE------NPQFFIDGISPSDIRQGSLGDCWLLAALAALAEHP 69 (298)
T ss_dssp ----TTS-SSHHHHTSSTTST-----TT-EEE-HHHHSS------S-BSSSSSSSGGGEBE-SSSSHHHHHHHHHHTTSH
T ss_pred CccCCCCccCccccccCCCCC-----CcceeEechhcCC------CCeEEECCCccccccccccCChhHHHHHHHHHhcc
Confidence 499999999999999765433 3479999999985 47788899999999999999999999999999999
Q ss_pred cccccccC--ccc--CCCCcEEEEEeeCCEEEEEEecccccCCCCCceEEeecCCCCchhHHHHHHHHHHhcCCcccccC
Q 000112 1785 QISEVIIT--PEY--NEEGIYTVRFCIQGEWVPVVVDDWIPCESPGKPAFATSKKGHELWVSILEKAYAKLHGSYEALEG 1860 (2161)
Q Consensus 1785 rLle~fIt--Pey--Ne~GiY~VRLyiNGeWreVVVDDrLPc~~nGkPLFArSsd~nELWpSLLEKAYAKLhGSYeaLeG 1860 (2161)
.+++++++ +.. +..|+|+||||++|+|++|+|||+||| .+|+|+|+++.+++|+|++||||||||+||||++|.|
T Consensus 70 ~~i~~i~~~~~~~~~~~~G~y~v~l~~~G~w~~V~VDd~lP~-~~g~~~f~~s~~~~elW~~LlEKAyAKl~GsY~~l~g 148 (298)
T PF00648_consen 70 DLIKKIFPVNQSFNENYNGIYTVRLFKNGEWREVTVDDRLPC-KNGKPLFARSSDPNELWPSLLEKAYAKLHGSYSALEG 148 (298)
T ss_dssp HHHHHHS-SS--SSTT-SSEEEEEEEETTEEEEEEEES-EEE-ETTEESSSBESSTTB-HHHHHHHHHHHHTTSSGGGSS
T ss_pred cccccccccccccccccCceeeEeeccCCeeeeeccchhhhc-cccceeeeccCCcccchhhhhhchhhhccccccccCC
Confidence 88777763 222 346999999999999999999999999 6899999998889999999999999999999999999
Q ss_pred CChHHHHhhccCCcceeecccchhhhhccchhHHHHHHHHHhcCCCEEEeeCCCC---CCccccccCcccCceeEEEEEE
Q 000112 1861 GLVQDALVDLTGGAGEEIDMRSAQAQIDLASGRLWSQLLRFKQEGFLLGAGSPSG---SDVHISSSGIVQGHAYSILQVR 1937 (2161)
Q Consensus 1861 G~~sEAL~DLTGgP~E~IDL~saeaq~Dl~sdeLWk~LlsalksG~LMgAsTpsg---sD~e~es~GLVsGHAYSVLdV~ 1937 (2161)
|++.++|++|||++++.+++++.. ..+++|+.+.+..+++.++++.+... ........||+++|||+|++++
T Consensus 149 g~~~~al~~LTG~~~~~~~l~~~~-----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gl~~~HaY~Vl~~~ 223 (298)
T PF00648_consen 149 GNPSEALQDLTGGPPESIDLRDDS-----SDDELWELWKKLLKSGSLVGCSTGSSTPFDSEEYEKNGLVPGHAYAVLDVR 223 (298)
T ss_dssp BSHHHHHHHHHSSEEEEEEGGG-------T--THHHHHHHHHHCT-EEEEE--SSSGGGTTSBCTTSBBTTS-EEEEEEE
T ss_pred CChhhhhHhhcCCcceeeeccccc-----hhhhHHHHHHHHHHhccccccccccccccccccccccCcccceeEEEEEEE
Confidence 999999999999999999886542 13468888888899999988876532 1233568999999999999999
Q ss_pred EEcC----eEEEEEecCCCCCccccCCCCCCCcccc---HHHHhhhCCCCCCCCCeeecchhhHhhcccceeEEEE
Q 000112 1938 EVDG----HKLVQIRNPWANEVEWNGPWSDSSPEWT---DRMKHKLKHVPQSKDGIFWMSWQDFQIHFRSIYVCRV 2006 (2161)
Q Consensus 1938 EVdG----~RLVRLRNPWG~~~EWKG~WSD~S~eWT---eeLKkkL~~~p~sDDGtFWMSfEDFLkyFssIyICrl 2006 (2161)
++++ ++||||||||| ..||+|+||++|++|+ +..++.+++. ..+||+|||+|+||++||+.++||++
T Consensus 224 ~~~~~~~~~~lv~LrNPwg-~~~w~G~ws~~s~~W~~~~~~~~~~~~~~-~~~dg~FWM~~~df~~~F~~i~vc~~ 297 (298)
T PF00648_consen 224 EVNGNGEGHRLVKLRNPWG-STEWKGDWSDDSPEWTEIHPSLRKRLNQS-SSDDGTFWMSFEDFLKYFSSIYVCRL 297 (298)
T ss_dssp EEEETTEEEEEEEEE-TTS-S---SSTTSTTSGGGGGS-HHHHHHHTTT-SSSSSEEEEEHHHHHHHSEEEEEEES
T ss_pred eeccccceeEEEEEcCCCc-cccccccccccccccccCCHHHHhhcccc-cccCccHhHhHHHHHhhCCceEEEee
Confidence 9975 89999999999 4899999999999999 5677777753 46899999999999999999999986
No 5
>KOG0045 consensus Cytosolic Ca2+-dependent cysteine protease (calpain), large subunit (EF-Hand protein superfamily) [Posttranslational modification, protein turnover, chaperones; Signal transduction mechanisms]
Probab=99.95 E-value=5.3e-31 Score=323.66 Aligned_cols=598 Identities=23% Similarity=0.179 Sum_probs=481.9
Q ss_pred EEeccCCCCCChhhHHHhhhhhhhHHHHHHhhcccceeecCccccccceeeeehhHHHHHHhhhheeeeeccchhHHHHH
Q 000112 874 VVKSREDQVPTKGDFLAALLPLVCIPALLSLCSGLLKWKDDDWKLSRGVYVFITIGLVLLLGAISAVIVVITPWTIGVAF 953 (2161)
Q Consensus 874 v~~~r~~~~p~~~dfl~a~lpl~~ipa~~~l~~gl~kw~dd~w~~s~~~~~~~~~gl~ll~~a~~~v~~~i~~w~~gvaf 953 (2161)
..+.|++..|++..|..+.+|..+.+.++.+++..-+|++-.|++-.- ..-+||+|..-.
T Consensus 13 ~~~~~~~cl~~~~~F~D~~FP~~~~Sl~~~~~~p~~~~~~i~W~RP~e--------------------i~~~p~~i~~~~ 72 (612)
T KOG0045|consen 13 FERLRRDCLPAKSLFVDALFPAADSSLFYKLSTPLAQFSDIVWKRPQE--------------------ICANPRLIVDGP 72 (612)
T ss_pred HHHHHHHHhhcCCcccccCCCCCCccccccccCCCcccccceecCccc--------------------ccCCCCeecCCC
Confidence 346789999999999999999999999999999998887777777665 236899876655
Q ss_pred HHHHHHHHHHhhhhhcccccceeeehhhHHHHHHHHHHHHHHHHHhhhcCCCCccccchhHHHHHHHhhccceeeeccCC
Q 000112 954 LLLLLLIVLAIGVIHHWASNNFYLTRTQMFFVCFLAFLLGLAAFLVGWFDDKPFVGASVGYFTFLFLLAGRALTVLLSPP 1033 (2161)
Q Consensus 954 ~l~~~~~~~~igv~~~was~~f~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1033 (2161)
..+-+... ..+|.++..-...+.+.-.++.-... +|+-|-..+.|+|.|-|...|+..+|.
T Consensus 73 ~~~di~Qg---------~lgdCw~laA~a~la~~~~ll~~vip------~~~~~~~~yaGif~f~~w~~G~W~~Vv---- 133 (612)
T KOG0045|consen 73 SRFDVKQG---------LLGDCWFLAACAALALRPELLDKVIP------QDQSFQENYAGIFHFRFWQNGEWVEVV---- 133 (612)
T ss_pred CcceeEEe---------eecchHHHHHHHHhhcCHHHHHhccC------CCcccccccceEEEEEEEeCCeEEEEE----
Confidence 44333222 14566555444444444444443333 899999999999999999999988764
Q ss_pred EEEecCceeeEEEeecccccCCCchhhHHHHHHHHhhhccceeEEEEEEcCCCcccchhhhhheeeeccccccccchhhc
Q 000112 1034 IVVYSPRVLPVYVYDAHADCGKNVSVAFLVLYGVALAIEGWGVVASLKIYPPFAGAAVSAITLVVAFGFAVSRPCLTLKT 1113 (2161)
Q Consensus 1034 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1113 (2161)
| --.||+|+++.| .+.|... ..+.+||...+| ...+-.|+++.|..++.+. ++|+.+++.|+...|+
T Consensus 134 --I--DD~LP~~~~~~~----~~~s~~~-~efW~aLlEKAy--aKl~GsY~~l~gg~~~~a~--~~lTG~~~e~~~l~~~ 200 (612)
T KOG0045|consen 134 --I--DDRLPTSNGGLL----FSHSSGK-NEFWAALLEKAY--AKLLGSYEALHGGSTIDAL--VDLTGGVTEPFDLNKT 200 (612)
T ss_pred --e--eeecceEcCCEE----EEeecCC-ceeHHHHHHHHH--HHHhCcccCCCCCchhhHH--HhccCCccceeEcccC
Confidence 2 568999999998 6677777 788999999999 6678899999999887766 9999999999999999
Q ss_pred hHHHhhhcchhhHHHHHhhhccccccccccccccccccccccceeccCCccccccCCCccccchhhhHHHHhhccccccc
Q 000112 1114 MEDAVHFLSKDTVVQAISRSATKTRNALSGTYSAPQRSASSTALLVGDPNATRDKQGNLMLPRDDVVKLRDRLKNEEFVA 1193 (2161)
Q Consensus 1114 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1193 (2161)
+++... ++++++-++++|..+++..+++ ..+.+++ +.+++|+.|++..-.+
T Consensus 201 ~~~~~~-----~l~~~~~~~~~~~~~l~c~~~~----------------~~~~~~~--------~~~~~~~gL~~~HaYs 251 (612)
T KOG0045|consen 201 PKSFKN-----NLVWALLKSAHRGSLLLCSIES----------------KDPTEEE--------EEAKLRNGLVKGHAYA 251 (612)
T ss_pred cchhHH-----HHHHHHHHhhhccCceeeeccc----------------cccchhH--------HHHHhhcCccccccEE
Confidence 998876 7899999999999999988876 1222222 7999999999999999
Q ss_pred ccccccccccccccCCCCCchhhHhhhhhhhhhhhhhhcccceeeeeccchhhhHhhhccchhhhhhhhhhhhhhhhccc
Q 000112 1194 GSFFCRMKYKRFRHELSSDYDYRREMCTHARILALEEAIDTEWVYMWDKFGGYLLLLLGLTAKAERVQDEVRLRLFLDSI 1273 (2161)
Q Consensus 1194 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1273 (2161)
.+-.+.++. |+.|+.|.||..-.. ++||.++|++.+.+...+.....+..++|.
T Consensus 252 it~~~~~~~-------------~~~~~~lirlrNPwg--~~~W~G~wsd~~~~W~~v~~~~~~~~~~~~----------- 305 (612)
T KOG0045|consen 252 ITDVREVQG-------------RGGKHRLIRLRNPWG--ESEWNGPWSDGSEEWHLVDKSKLSELGRQP----------- 305 (612)
T ss_pred EEEEEEeec-------------ccccceeEEecCCcC--CceeccccccCCcchhhhCHHHHhhccccc-----------
Confidence 998888875 999999999999988 999999999999999999988888777765
Q ss_pred CCCcCChhhhhccCchhhhhHHHHHHhhhhhhhhHHHHHHHHHhhhcccHHHHHHHHHHHHhhHHhhhhhhcccCCCCCc
Q 000112 1274 GFSDLSAKKIKKWMPEDRRQFEIIQESYIREKEMEEEILMQRREEEGRGKERRKALLEKEERKWKEIEASLISSIPNAGN 1353 (2161)
Q Consensus 1274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1353 (2161)
++|...||++|..+.+..+..+-+++++.+|..+| +..+.++++++.+
T Consensus 306 ------~~dGeFWms~~dF~~~F~~~~vC~~~~~~~~~~~~--------------------------~~~~~~~~~~~w~ 353 (612)
T KOG0045|consen 306 ------LDDGEFWMSFDDFLREFDSLTVCRLRPDWLESRNQ--------------------------LQWVKLSLDGEWE 353 (612)
T ss_pred ------ccCCCeeeeHHHHHhhCCeEeecCCCcchhhhhhe--------------------------eeeeeeecCCccc
Confidence 67889999999999999999999999999988877 5567788999988
Q ss_pred hHHHHHHHHHHHhcCCccccchhhhHHHHHHHHHHHHHHHHHHHHhcCCcceEEeeCCCCCccCccccccccccccccee
Q 000112 1354 REAAAMAAAVRAVGGDSVLEDSFARERVSSIARRIRTAQLARRALQTGITGAICVLDDEPTTSGRHCGQIDASICQSQKV 1433 (2161)
Q Consensus 1354 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1433 (2161)
.++....||.....++|.+.....|+++....
T Consensus 354 ------~~~~~t~ggc~~~~~tF~~npq~~~~~~~~~~------------------------------------------ 385 (612)
T KOG0045|consen 354 ------LARGVTAGGCRNSVDTFDRNPQYILAVRKPTK------------------------------------------ 385 (612)
T ss_pred ------eeecccCCCCccCcccccCCceEEEEecCCCc------------------------------------------
Confidence 66778899999999999987766665544333
Q ss_pred EEEEEEEeecCCCceeeecccccchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccccceecCCceEEEEEE
Q 000112 1434 SFSIAVMIQPESGPVCLLGTEFQKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSISATSIADGRWHIVTMT 1513 (2161)
Q Consensus 1434 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1513 (2161)
+++.+..+..|++..++|.+|+ .+.+..||+.+++
T Consensus 386 --------------------------------------------~~~~~v~~~~q~~~~~~~~~~~-~~~~ig~~i~~v~ 420 (612)
T KOG0045|consen 386 --------------------------------------------SLCAVVLALFQKTRRGERSFGA-NILDIGFHIYEVP 420 (612)
T ss_pred --------------------------------------------cceEEEEEeecccccccccccc-eeeecceEEEEec
Confidence 8999999999999999999999 9999999999988
Q ss_pred EeccccceeeeecccccccccccccccccccccCCceEEEecCCCCccccccCCCccccccchhhheehhhcccCChHHH
Q 000112 1514 IDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTDMDVFGRSDSEGAESKMHIMDVFLWGRCLTEDEI 1593 (2161)
Q Consensus 1514 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~clte~e~ 1593 (2161)
.+ ++.|. ++++.+.....-||.++..+|.. +||.+...|+ |+.|..|+.+|+|+||-++.|.+|+
T Consensus 421 ~~----------~~~~~-~~~~~~~~~~~~i~~r~v~~~~~-~P~~~y~~~p-st~~~~~~~~f~lrvfs~~~~~~~~-- 485 (612)
T KOG0045|consen 421 LE----------GKYFV-LDNAPIASSSSFINNREVSVRFR-LPPGTYVIVP-STFEPGEEGEFLLRVFSNVKVKSEE-- 485 (612)
T ss_pred CC----------CCceE-ecccchhcccccccceeEEEEec-CCCcceeecc-cCCCCCCCccEEEEEeecccccCcc--
Confidence 76 66777 99999999999999999999999 9999999999 9999999999999999999999998
Q ss_pred HHHhhcccccccccccCCCCCcccCCCCcccccCCCCCcceeecccccccccCcccccccccCCCcceeeccchhhhhcc
Q 000112 1594 ASLYSAICSAELNMNEFPEDNWQWADSPPRVDEWDSDPADVDLYDRDDIDWDGQYSSGRKRRADRDGIVVNVDSFARKFR 1673 (2161)
Q Consensus 1594 ~~~~~~~~~ae~~~~d~~dd~WQ~~dsp~r~~~~~~~~~~~~ly~re~v~~~~q~~sGrk~~~~~d~~~ld~d~f~Rklr 1673 (2161)
+..+..+...|+..
T Consensus 486 ------------------------------------------------------------------~~~i~~~~~~~~~~ 499 (612)
T KOG0045|consen 486 ------------------------------------------------------------------DMEISLDETKRSTN 499 (612)
T ss_pred ------------------------------------------------------------------ceEEeeccccccee
Confidence 01111111111111
Q ss_pred CCCcCCHHHHHHHHHHHHHHHHHHHHhcCCCceecCCCCCCCCCcccCCCCCCccccCcceeeccccccccCccCCCcee
Q 000112 1674 KPRMETQEEIYQRMLSVELAVKEALSARGERQFTDHEFPPDDQSLYVDPGNPPSKLQVVAEWMRPSEIVKESRLDCQPCL 1753 (2161)
Q Consensus 1674 kpr~etkEEI~Qrl~svE~aIKE~L~arGe~~FeDpEFPPsdsSLy~Dp~~P~sklq~~IqWkRPsEI~~e~~~ds~P~L 1753 (2161)
....+
T Consensus 500 ~~~~~--------------------------------------------------------------------------- 504 (612)
T KOG0045|consen 500 IIVMK--------------------------------------------------------------------------- 504 (612)
T ss_pred eeeec---------------------------------------------------------------------------
Confidence 10000
Q ss_pred ecCCCCCCCcccCCCCCchHHHHHHHHhccccccccccCcccCCCCcEEEEEeeCCEEEEEEecccccCCCCCceEEeec
Q 000112 1754 FSGAVNPSDVCQGRLGDCWFLSAVAVLTEVSQISEVIITPEYNEEGIYTVRFCIQGEWVPVVVDDWIPCESPGKPAFATS 1833 (2161)
Q Consensus 1754 F~ggIsPsDVkQG~LGDCWFLAALAALAE~PrLle~fItPeyNe~GiY~VRLyiNGeWreVVVDDrLPc~~nGkPLFArS 1833 (2161)
...++..+|+|.+.......++++....+...+..+. | ..++++..++ |..+++...|...+...
T Consensus 505 --------~~~~~~~~~~~~~~~~~~~~k~s~~~~~~~~~~~~~~--~----~~~~~~~~~~-~~~~~~~~~~~~~~~~~ 569 (612)
T KOG0045|consen 505 --------GFSLGECGDKWKLSSTLVNTKVSRSSEFILTVEVVSP--L----DIEGESTLVV-DIPIAIESKGSGDVAPL 569 (612)
T ss_pred --------ceehhhhchhhhccccccccccchhhceeeeeccccc--E----EEeccccccc-cccceeeccCCcccccc
Confidence 3445556666666555555555443333333333333 2 6788888888 99899887777777776
Q ss_pred CCCCchhHHHHHHHHHHhcCCcccccCCChHHHHhhccCCc
Q 000112 1834 KKGHELWVSILEKAYAKLHGSYEALEGGLVQDALVDLTGGA 1874 (2161)
Q Consensus 1834 sd~nELWpSLLEKAYAKLhGSYeaLeGG~~sEAL~DLTGgP 1874 (2161)
.+..+.|....|++|++.+.++...+++...+.+.++++..
T Consensus 570 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 610 (612)
T KOG0045|consen 570 LNVIRLRIADPEIAYSFDSTSCCATEGPLVLDELFDLSSKK 610 (612)
T ss_pred eeeeeeeccChhheeeccccccccccCcchhhhhhcCCCCC
Confidence 66779999999999999999999999999999999888754
No 6
>smart00720 calpain_III calpain_III.
Probab=99.83 E-value=3e-20 Score=191.33 Aligned_cols=135 Identities=33% Similarity=0.604 Sum_probs=106.4
Q ss_pred ccccCcee-cccCCCCccC-cCCCCCCeEEEEeccCCCCCCEEEEEEeeccccccccccccccccccCCCceeEEEEEEE
Q 000112 2013 YSVHGQWR-GYSAGGCQDY-ASWNQNPQFRLRASGSDASFPIHVFITLTQGVSFSRTVAGFKNYQSSHDSMMFYIGMRIL 2090 (2161)
Q Consensus 2013 yrVhGeWr-G~TAGGC~Df-dTF~qNPQY~LsVssSD~sePi~VLISLSQkDqrsR~~~GFrnYq~shDs~LLyIGL~Vf 2090 (2161)
..++|+|+ +.+||||.++ .+|++||||.|++.+++. ..++|+|.|+|++++.. .. ......+|||+|+
T Consensus 4 ~~~~G~W~~~~tAGG~~~~~~tf~~NPqy~l~v~~~~~-~~~~v~i~L~q~~~r~~--------~~-~~~~~~~iGf~v~ 73 (143)
T smart00720 4 KSVQGSWTRGQTAGGCRNYPATFWTNPQFRITLEEPDD-DDCTVLIALMQKNRRRL--------RR-KGADFLTIGFAVY 73 (143)
T ss_pred EEEeCeEECCCccCCccccccccccCCeEEEEecCCCC-CceEEEEEecccCcccc--------cc-cCCccceEeEEEE
Confidence 45789997 8999999999 899999999999986542 33889999999975311 11 1124578999999
Q ss_pred EecCc-ccccceee--ccccCCcccccCcceEEEEEEeCCCccEEEEccccCCCCccCcEEEEEeCCCccee
Q 000112 2091 KTRGR-RAAHNIYL--HESVGGTDYVNSREISCEMVLDPDPKGYTIVPTTIHPGEEAPFVLSVFTKASIILE 2159 (2161)
Q Consensus 2091 KvrGn-Rs~~nIfl--hEsV~sgdYVNSREVS~RLtLEPepG~YVVVPSTyEPGqEG~FTLRVFSskpItLE 2159 (2161)
+++.. +.....+. .+...+++|.+.|++++++.|+| |.|+|||+|++|+++|+|+|+||++.+++|+
T Consensus 74 ~~~~~~~~~~~~~~~~~~~~~s~~y~~~r~v~~~~~L~~--G~Y~iVPsT~~p~~~g~F~LrV~s~~~~~l~ 143 (143)
T smart00720 74 KVPKELHLRRDFFLSNAPRASSGDYINGREVSERFRLPP--GEYVIVPSTFEPNQEGDFLLRVFSEGPFKLT 143 (143)
T ss_pred EeccccccchhhhhccCccccccccccCeEEEEEEEcCC--CCEEEEEeecCCCCccCEEEEEEecCccccC
Confidence 98765 32222222 22334568999999999999987 8899999999999999999999999999874
No 7
>cd00214 Calpain_III Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases that participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. The catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains.
Probab=99.82 E-value=4.5e-20 Score=192.77 Aligned_cols=138 Identities=34% Similarity=0.608 Sum_probs=107.5
Q ss_pred ccccCceec-ccCCCCccC-cCCCCCCeEEEEeccCCC-CCCEEEEEEeeccccccccccccccccccCCCceeEEEEEE
Q 000112 2013 YSVHGQWRG-YSAGGCQDY-ASWNQNPQFRLRASGSDA-SFPIHVFITLTQGVSFSRTVAGFKNYQSSHDSMMFYIGMRI 2089 (2161)
Q Consensus 2013 yrVhGeWrG-~TAGGC~Df-dTF~qNPQY~LsVssSD~-sePi~VLISLSQkDqrsR~~~GFrnYq~shDs~LLyIGL~V 2089 (2161)
..++|+|+. .+||||.++ .+|++||||.|++.+++. ..+++|+|.|+|++++..+. ...+..+|||+|
T Consensus 6 ~~~~G~W~~g~tAGGc~~~~~tf~~NPQf~l~v~~~~~~~~~~~v~i~L~q~~~r~~~~---------~~~~~~~IGf~v 76 (150)
T cd00214 6 KSFNGEWRRGQTAGGCRNNPDTFWTNPQFRIRVPEPDDDEGKCTVLIALMQKNRRHLRK---------KGLDLLTIGFHV 76 (150)
T ss_pred EEEeCeEeCCcccCCCCCcccccccCceEEEEecCCCCCCCccEEEEEeccCCcchhcc---------cCCCcceEEEEE
Confidence 467899976 999999555 799999999999986531 22389999999997642211 123457899999
Q ss_pred EEecCc-cc-ccceee-ccc-cCCcccccCcceEEEEEEeCCCccEEEEccccCCCCccCcEEEEEeCCCcceeeC
Q 000112 2090 LKTRGR-RA-AHNIYL-HES-VGGTDYVNSREISCEMVLDPDPKGYTIVPTTIHPGEEAPFVLSVFTKASIILEAL 2161 (2161)
Q Consensus 2090 fKvrGn-Rs-~~nIfl-hEs-V~sgdYVNSREVS~RLtLEPepG~YVVVPSTyEPGqEG~FTLRVFSskpItLEPL 2161 (2161)
+++++. +. ....+. +++ +..++|.+.|+|++++.|+| |.|+|||+|++|+++|+|.|+||+++++++++|
T Consensus 77 ~~~~~~~~~~~~~~~~~~~~~~~s~~~~~~rev~~~~~L~p--G~YvIIPsT~~p~~~g~F~LrVfs~~~~~~~~~ 150 (150)
T cd00214 77 YKVPGENRHLRRDFFLHKAPRARSSTFINTREVSLRFRLPP--GEYVIVPSTFEPGEEGEFLLRVFSEKSIKSSEL 150 (150)
T ss_pred EEeCCcCcccChhhhhccCcccccCccccccEEEEEEEcCC--CCEEEEeeecCCCCcccEEEEEEecCCCccccC
Confidence 998652 21 222222 233 34578999999999999987 899999999999999999999999999999886
No 8
>PF01067 Calpain_III: Calpain large subunit, domain III; InterPro: IPR022682 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals: a highly calcium-sensitive (i.e., micro-molar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the milli-molar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only. All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28kDa subunit and an 80kDa subunit that shares 55-65% sequence homology between the two proteases. The crystallographic structure of m-calpain reveals six "domains" in the 80kDa subunit: A 19-amino acid NH2-terminal sequence; Active site domain IIa; Active site domain IIb. Domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related. Domain III; An 18-amino acid extended sequence linking domain III to domain IV; Domain IV, which resembles the penta EF-hand family of polypeptides, binds calcium and regulates activity. Ca2+-binding causes a rearrangement of the protein backbone, the net effect of which is that a Trp side chain, which acts as a wedge between catalytic domains IIa and IIb in the apo state, moves away from the active site cleft allowing for the proper formation of the catalytic triad. 
Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in the cells of these organisms is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin (IPR001259 from INTERPRO). The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma. This entry represents domain III. It is found in association with PF00648 from PFAM. The functions of domains III and I are currently unknown. Domain II is a cysteine protease and domain IV is a calcium binding domain. Calpains are believed to participate in intracellular signaling pathways mediated by calcium ions. ; PDB: 1QXP_B 2QFE_A 1DF0_A 1U5I_A 3DF0_A 3BOW_A 1KFU_L 1KFX_L.
Probab=99.79 E-value=2.9e-19 Score=182.38 Aligned_cols=136 Identities=35% Similarity=0.621 Sum_probs=96.4
Q ss_pred ccccCce-ecccCCCCccCc-CCCCCCeEEEEeccCCC-CCCEEEEEEeeccccccccccccccccccCCCceeEEEEEE
Q 000112 2013 YSVHGQW-RGYSAGGCQDYA-SWNQNPQFRLRASGSDA-SFPIHVFITLTQGVSFSRTVAGFKNYQSSHDSMMFYIGMRI 2089 (2161)
Q Consensus 2013 yrVhGeW-rG~TAGGC~Dfd-TF~qNPQY~LsVssSD~-sePi~VLISLSQkDqrsR~~~GFrnYq~shDs~LLyIGL~V 2089 (2161)
.+++|+| ++.+||||.++. +|++||||.|++..++. +.+++|+|+|+|++.+... ..+....+|||+|
T Consensus 5 ~~~~G~W~~~~taGG~~~~~~s~~~NPQy~l~v~~~~~~~~~~~v~i~L~q~~~~~~~---------~~~~~~~~Ig~~v 75 (147)
T PF01067_consen 5 VTIEGEWVTGNTAGGCPNNPYSWWNNPQYRLTVSEPTEESNKCTVVISLMQKDRRRKR---------DVGEKDLPIGFYV 75 (147)
T ss_dssp EEEEEEE-TTTS---STT-TTTGGGS-EEEEEESSGCCCSSBEEEEEEEEECSGCCGC---------STTTTTSEEEEEE
T ss_pred EEEeCEEeCCCcCCCCcccccccccCcEEEEEEcCCCCCcceeEEEEEEEecCcchhh---------cccccceEEeEEE
Confidence 4689999 899999999998 99999999999986543 2368999999998753211 1123457899999
Q ss_pred EEecC--ccccccee--eccccCCcccccCcceEEEEEEeCCCccEEEEccccCCCCccCcEEEEEeCCCccee
Q 000112 2090 LKTRG--RRAAHNIY--LHESVGGTDYVNSREISCEMVLDPDPKGYTIVPTTIHPGEEAPFVLSVFTKASIILE 2159 (2161)
Q Consensus 2090 fKvrG--nRs~~nIf--lhEsV~sgdYVNSREVS~RLtLEPepG~YVVVPSTyEPGqEG~FTLRVFSskpItLE 2159 (2161)
++... .+...... ..+.+..++|.+.|+++.+++|+| |.|+|||+|++|+++|+|+|+||++.|++|+
T Consensus 76 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~L~~--G~YvIVPsT~~~~~~g~F~L~v~s~~~~~l~ 147 (147)
T PF01067_consen 76 FKVQSQQKRLPRQYFLFNKPVVSSGDYSNSREVSEEFTLPP--GTYVIVPSTYEPGQEGEFTLRVFSDSPFELQ 147 (147)
T ss_dssp EEETTTTSE--HHHHHTS-SSEE-SSEBSSSEEEEEEEE-S--EEEEEEEEESSTT--EEEEEEEEESSSEEE-
T ss_pred EeeecccccCCcceeccccceeeccccccceEEEEEEEcCC--CCEEEEEecCCCCCeeeEEEEEEECCCcccC
Confidence 99822 11111111 123445678999999999999987 8899999999999999999999999999874
No 9
>cd00152 PTX Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers.
Probab=97.91 E-value=8.2e-05 Score=82.12 Aligned_cols=162 Identities=20% Similarity=0.328 Sum_probs=99.2
Q ss_pred EEEEEEEeecC--CCceeeecccccchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccccceecCCceEEEE
Q 000112 1434 SFSIAVMIQPE--SGPVCLLGTEFQKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSISATSIADGRWHIVT 1511 (2161)
Q Consensus 1434 ~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1511 (2161)
+|++++-++.+ +++..+|..--.++ =-|+++.+..+ |+ +.+-..|...++ . ....||+||.|+
T Consensus 32 ~fTv~~Wv~~~~~~~~~~ifSy~~~~~-~~~~~l~~~~~----g~--~~~~i~~~~~~~-------~-~~~~~g~W~hv~ 96 (201)
T cd00152 32 AFTLCLWVYTDLSTREYSLFSYATKGQ-DNELLLYKEKD----GG--YSLYIGGKEVTF-------K-VPESDGAWHHIC 96 (201)
T ss_pred hEEEEEEEEecCCCCCeEEEEEeCCCC-CCeEEEEEcCC----Ce--EEEEEcCEEEEE-------e-ccCCCCCEEEEE
Confidence 57788778776 47777774333211 22777664432 33 333333332221 2 234899999999
Q ss_pred EEEeccccceeeeecccccccccccccccccccccCCceEEEecCCCCccccccCCCccccccchhhheehhhcccCChH
Q 000112 1512 MTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTDMDVFGRSDSEGAESKMHIMDVFLWGRCLTED 1591 (2161)
Q Consensus 1512 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~clte~ 1591 (2161)
+|-|..+|+.+-|+||.-.+.++ + . .+..+..+....+|-++ |..|..-...-.=+=+|=|+-+|.|-||.+
T Consensus 97 ~t~d~~~g~~~lyvnG~~~~~~~-~--~-~~~~~~~~g~l~lG~~q----~~~gg~~~~~~~f~G~I~~v~iw~~~Ls~~ 168 (201)
T cd00152 97 VTWESTSGIAELWVNGKLSVRKS-L--K-KGYTVGPGGSIILGQEQ----DSYGGGFDATQSFVGEISDVNMWDSVLSPE 168 (201)
T ss_pred EEEECCCCcEEEEECCEEecccc-c--c-CCCEECCCCeEEEeecc----cCCCCCCCCCcceEEEEceeEEEcccCCHH
Confidence 99999999999999998776554 1 1 12345556667777654 333322111111233567888999999999
Q ss_pred HHHHHhhcccccccccccCCCCCcccC
Q 000112 1592 EIASLYSAICSAELNMNEFPEDNWQWA 1618 (2161)
Q Consensus 1592 e~~~~~~~~~~ae~~~~d~~dd~WQ~~ 1618 (2161)
||..+++.-+...=++++-.++.|+.+
T Consensus 169 eI~~l~~~~~~~~Gnv~~W~~~~~~~~ 195 (201)
T cd00152 169 EIKNVYSEGGTLSGNILNWRALNYEIN 195 (201)
T ss_pred HHHHHHhcCCCCCCCEEechhhEEEEe
Confidence 999998744444445555555555544
No 10
>smart00159 PTX Pentraxin / C-reactive protein / pentaxin family. This family forms a discoid pentameric structure. Human serum amyloid P demonstrates calcium-mediated ligand-binding.
Probab=97.81 E-value=0.00016 Score=80.42 Aligned_cols=161 Identities=21% Similarity=0.320 Sum_probs=101.0
Q ss_pred EEEEEEEeecCC--Cceeee--cccccchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccccceecCCceEE
Q 000112 1434 SFSIAVMIQPES--GPVCLL--GTEFQKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSISATSIADGRWHI 1509 (2161)
Q Consensus 1434 ~~~~~~~~~~~~--~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1509 (2161)
+|++++-++++. ++-.|| .+..|. -|+++... ++.++.+...|...+ ....+.||+||.
T Consensus 32 ~fTvc~W~k~~~~~~~~~ifSy~~~~~~---ne~~~~~~------~~~~~~l~i~g~~~~--------~~~~~~~g~W~h 94 (206)
T smart00159 32 AFTVCLWFYSDLSPRGYSLFSYATKGQD---NELLLYKE------KQGEYSLYIGGKKVQ--------FPVPESDGKWHH 94 (206)
T ss_pred HEEEEEEEEecCCCCceEEEEEeCCCCC---CeEEEEEc------CCcEEEEEEcCeEEE--------ecccccCCceEE
Confidence 567777777653 444454 665554 37766543 233466666654211 123578999999
Q ss_pred EEEEEeccccceeeeecccccccccccccccccccccCCceEEEecCCCCccccccCCCccccccchhhheehhhcccCC
Q 000112 1510 VTMTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTDMDVFGRSDSEGAESKMHIMDVFLWGRCLT 1589 (2161)
Q Consensus 1510 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~clt 1589 (2161)
|++|-|..+|+++-|+||... .+.++ .. +..+..+-.+.+|-++ |..|-.-.+...=+=.|=|+=||.|-||
T Consensus 95 vc~tw~~~~g~~~lyvnG~~~-~~~~~--~~-g~~i~~~G~lvlGq~q----d~~gg~f~~~~~f~G~i~~v~iw~~~Ls 166 (206)
T smart00159 95 ICTTWESSSGIAELWVDGKPG-VRKGL--AK-GYTVKPGGSIILGQEQ----DSYGGGFDATQSFVGEIGDLNMWDSVLS 166 (206)
T ss_pred EEEEEECCCCcEEEEECCEEc-ccccc--cC-CcEECCCCEEEEEecc----cCCCCCCCCCcceeEEEeeeEEecccCC
Confidence 999999999999999999875 33322 11 2344566678888764 3333221111112335668889999999
Q ss_pred hHHHHHHhhcccccccccccCCCCCcccCC
Q 000112 1590 EDEIASLYSAICSAELNMNEFPEDNWQWAD 1619 (2161)
Q Consensus 1590 e~e~~~~~~~~~~ae~~~~d~~dd~WQ~~d 1619 (2161)
++||..+++.-...+=++.+-.++.|+.+.
T Consensus 167 ~~eI~~l~~~~~~~~Gnv~~W~~~~~~~~g 196 (206)
T smart00159 167 PEEIKSVYKGSTFSIGNILNWRALNYEVHG 196 (206)
T ss_pred HHHHHHHHcCCCCCCCCEEeccccEEEEee
Confidence 999999987433334456666666666653
No 11
>PF13385 Laminin_G_3: Concanavalin A-like lectin/glucanases superfamily; PDB: 4DQA_A 1N1Y_A 1MZ6_A 1MZ5_A 1N1S_A 2A75_A 1WCS_A 1N1T_A 1N1V_A 2FHR_A ....
Probab=97.30 E-value=0.00098 Score=66.47 Aligned_cols=80 Identities=24% Similarity=0.379 Sum_probs=49.7
Q ss_pred ccceecCCceEEEEEEEeccccceeeeecccccccccccccccccccccCCceEEEecCCCCccccccCCCccccccchh
Q 000112 1498 SATSIADGRWHIVTMTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTDMDVFGRSDSEGAESKMH 1577 (2161)
Q Consensus 1498 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1577 (2161)
....+.+++||.+++|+| .++.+.|+||-..+....-.. ..+......-+|-.+ ....--+..
T Consensus 78 ~~~~~~~~~W~~l~~~~~--~~~~~lyvnG~~~~~~~~~~~----~~~~~~~~~~iG~~~-----------~~~~~~~g~ 140 (157)
T PF13385_consen 78 SDSNLPDNKWHHLALTYD--GSTVTLYVNGELVGSSTIPSN----ISLNSNGPLFIGGSG-----------GGSSPFNGY 140 (157)
T ss_dssp -BS---TT-EEEEEEEEE--TTEEEEEETTEEETTCTEESS----SSTTSCCEEEESS-S-----------TT--B-EEE
T ss_pred cCcccCCCCEEEEEEEEE--CCeEEEEECCEEEEeEeccCC----cCCCCcceEEEeecC-----------CCCCceEEE
Confidence 566788999999999999 556999999999986543222 123444455555433 223344678
Q ss_pred hheehhhcccCChHHHH
Q 000112 1578 IMDVFLWGRCLTEDEIA 1594 (2161)
Q Consensus 1578 ~~~~~~~~~clte~e~~ 1594 (2161)
|-|+-+|.|+||++||+
T Consensus 141 i~~~~i~~~aLt~~eI~ 157 (157)
T PF13385_consen 141 IDDLRIYNRALTAEEIQ 157 (157)
T ss_dssp EEEEEEESS---HHHHH
T ss_pred EEEEEEECccCCHHHcC
Confidence 88999999999999996
No 12
>PF00354 Pentaxin: Pentaxin family; InterPro: IPR001759 Pentaxins (or pentraxins) are a family of proteins which show, under electron microscopy, a discoid arrangement of five noncovalently bound subunits. Proteins of the pentaxin family are involved in acute immunological responses. Three of the principal members of the pentaxin family are serum proteins: namely, C-reactive protein (CRP), serum amyloid P component protein (SAP), and female protein (FP). CRP is expressed during acute phase response to tissue injury or inflammation in mammals. The protein resembles antibody and performs several functions associated with host defence: it promotes agglutination, bacterial capsular swelling and phagocytosis, and activates the classical complement pathway through its calcium-dependent binding to phosphocholine. CRPs have also been sequenced in an invertebrate, Limulus polyphemus (Atlantic horseshoe crab), where they are a normal constituent of the hemolymph. SAP is a vertebrate protein that is a precursor of amyloid component P. It is found in all types of amyloid deposits, in glomerular basement membrane and in elastic fibres in blood vessels. SAP binds to various lipoprotein ligands in a calcium-dependent manner, and it has been suggested that, in mammals, this may have important implications in atherosclerosis and amyloidosis. FP is a SAP homologue found in Mesocricetus auratus (Golden hamster). The concentration of this plasma protein is altered by sex steroids and stimuli that elicit an acute phase response. Pentaxin proteins expressed in the nervous system are neural pentaxin I (NPI) and II (NPII). NPI and NPII are homologous and can exist within one species. It is suggested that both proteins mediate the uptake of synaptic macromolecules and play a role in synaptic plasticity. Apexin, a sperm acrosomal protein, is a homologue of NPII found in Cavia porcellus (Guinea pig). 
PTX3 (or TSG-14) protein is a cytokine-induced protein that is homologous to CRPs and SAPs, but its function is not yet known.; PDB: 2A3W_F 3KQR_C 3D5O_D 2A3X_G 1SAC_D 2W08_B 1GYK_B 1LGN_A 2A3Y_A 1B09_D ....
Probab=97.29 E-value=0.00092 Score=74.48 Aligned_cols=159 Identities=29% Similarity=0.496 Sum_probs=94.9
Q ss_pred EEEEEEEeecCC--Cceeee--cccccchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccccceecCCceEE
Q 000112 1434 SFSIAVMIQPES--GPVCLL--GTEFQKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSISATSIADGRWHI 1509 (2161)
Q Consensus 1434 ~~~~~~~~~~~~--~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1509 (2161)
+|++.+-++++. ..-+|| .|+.|. -|+++.+..++ +++|.-.|.... + ...+.||+||.
T Consensus 26 ~fTvC~w~k~~~~~~~~tifSYat~~~~---nell~~~~~~~------~~~l~i~~~~~~-------~-~~~~~~~~Whh 88 (195)
T PF00354_consen 26 AFTVCFWVKTDDSSNDGTIFSYATSSQD---NELLLFGSSSG------SLRLYINGSSVS-------F-SGPIRDGQWHH 88 (195)
T ss_dssp EEEEEEEEEESGSGS-EEEEEEEETTEE---EEEEEEEETTT------EEEEEETTEEEE-------E-EECS-TSS-EE
T ss_pred cEEEEEEEEeccCCCceEEEEEccCCCC---ccEEEEEeCCc------eEEEEECCeEeE-------e-ccccCCCCcEE
Confidence 355555555533 355555 444443 38888765442 566776666221 1 13578999999
Q ss_pred EEEEEeccccceeeeecccccccccccccccccccccCCceEEEecCCCCccccccCCCccccccchhhheehhhcccCC
Q 000112 1510 VTMTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTDMDVFGRSDSEGAESKMHIMDVFLWGRCLT 1589 (2161)
Q Consensus 1510 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~clt 1589 (2161)
+.+|-|..+|+..-|+||-. ....+ +..+..|-..|+ +=+|-+. |.+|-.-.+...=.=.|-|+-||.|-||
T Consensus 89 ~C~tW~s~~G~~~ly~dG~~-~~~~~--~~~g~~i~~gG~-~vlGQeQ----d~~gG~fd~~q~F~G~i~~~~iWd~vLs 160 (195)
T PF00354_consen 89 ICVTWDSSTGRWQLYVDGVR-LSSTG--LATGHSIPGGGT-LVLGQEQ----DSYGGGFDESQAFVGEISDFNIWDRVLS 160 (195)
T ss_dssp EEEEEETTTTEEEEEETTEE-EEEEE--SSTT--B-SSEE-EEESS-B----SBTTBTCSGGGB--EEEEEEEEESS---
T ss_pred EEEEEecCCcEEEEEECCEe-ccccc--ccCCceECCCCE-EEECccc----cccCCCcCCccEeeEEEeceEEEeeeCC
Confidence 99999999999999999993 22333 345556655555 4477654 6666544443333446889999999999
Q ss_pred hHHHHHHhhcccccccccccCCCCCcccC
Q 000112 1590 EDEIASLYSAICSAELNMNEFPEDNWQWA 1618 (2161)
Q Consensus 1590 e~e~~~~~~~~~~ae~~~~d~~dd~WQ~~ 1618 (2161)
++||+.++.. +..+=++++-.+..|+..
T Consensus 161 ~~eI~~l~~~-~~~~Gnvi~W~~~~~~~~ 188 (195)
T PF00354_consen 161 PEEIRALASC-CCYKGNVISWDDLRWSIS 188 (195)
T ss_dssp HHHHHHHHHT--S---SSEEGGGBEEEEE
T ss_pred HHHHHHHHhC-CCCCCCEEccccCeEEee
Confidence 9999999986 665666776666666554
No 13
>cd00110 LamG Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules.
Probab=94.18 E-value=0.31 Score=49.98 Aligned_cols=110 Identities=22% Similarity=0.327 Sum_probs=63.6
Q ss_pred eeEEEEEEEeecCCCceeeecccccchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccccc-eecCCceEEE
Q 000112 1432 KVSFSIAVMIQPESGPVCLLGTEFQKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSISAT-SIADGRWHIV 1510 (2161)
Q Consensus 1432 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~ 1510 (2161)
.-.+++.+.+.|.+..-.||-...+.. -+.+.+ .++.|++-+++-.. .+.. .+... .+.||+||.|
T Consensus 19 ~~~~~i~~~frt~~~~g~l~~~~~~~~--~~~~~l----~l~~g~l~~~~~~g-~~~~------~~~~~~~v~dg~Wh~v 85 (151)
T cd00110 19 RTRLSISFSFRTTSPNGLLLYAGSQNG--GDFLAL----ELEDGRLVLRYDLG-SGSL------VLSSKTPLNDGQWHSV 85 (151)
T ss_pred cceeEEEEEEEeCCCCeEEEEecCCCC--CCEEEE----EEECCEEEEEEcCC-cccE------EEEccCccCCCCEEEE
Confidence 446777788888765555554444321 111211 14566766654433 2222 22222 6999999999
Q ss_pred EEEEeccccceeeeecccccccccccccccccccccCCceEEEecCCCC
Q 000112 1511 TMTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPT 1559 (2161)
Q Consensus 1511 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1559 (2161)
+++.+. ++++-|+||.--. +... +.+.-.=.....+++|-.|..
T Consensus 86 ~i~~~~--~~~~l~VD~~~~~-~~~~--~~~~~~~~~~~~~~iGg~~~~ 129 (151)
T cd00110 86 SVERNG--RSVTLSVDGERVV-ESGS--PGGSALLNLDGPLYLGGLPED 129 (151)
T ss_pred EEEECC--CEEEEEECCccEE-eeeC--CCCceeecCCCCeEEcCCCCc
Confidence 999987 7899999997111 1111 111112246778899988864
No 14
>smart00210 TSPN Thrombospondin N-terminal -like domains. Heparin-binding and cell adhesion domain of thrombospondin
Probab=94.11 E-value=0.32 Score=53.80 Aligned_cols=86 Identities=22% Similarity=0.310 Sum_probs=51.1
Q ss_pred EEEEEEEeecC-CCceeeecccc-cchhhheeeeccccccccccceeEEEEEe---cCCceeeeeeeccccceecCCceE
Q 000112 1434 SFSIAVMIQPE-SGPVCLLGTEF-QKKVCWEILVAGSEQGIEAGQVGLRLITK---GDRQTTVAKDWSISATSIADGRWH 1508 (2161)
Q Consensus 1434 ~~~~~~~~~~~-~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~ 1508 (2161)
.||+.+.++|. ..+--||...- |++.=+++.+-| ++.-+.+.++ |+.++.+- ....++||+||
T Consensus 53 ~fsi~~~~r~~~~~~g~L~si~~~~~~~~l~v~l~g-------~~~~~~~~~~~~~g~~~~~~f-----~~~~l~dg~WH 120 (184)
T smart00210 53 DFSLLTTFRQTPKSRGVLFAIYDAQNVRQFGLEVDG-------RANTLLLRYQGVDGKQHTVSF-----RNLPLADGQWH 120 (184)
T ss_pred CeEEEEEEEeCCCCCeEEEEEEcCCCcEEEEEEEeC-------CccEEEEEECCCCCcEEEEee-----cCCccccCCce
Confidence 46666666665 34444554432 444444444432 2334555542 32232221 12469999999
Q ss_pred EEEEEEeccccceeeeecccccccc
Q 000112 1509 IVTMTIDADIGEATCYLDGGFDGYQ 1533 (2161)
Q Consensus 1509 ~~~~~~~~~~~~~~~~~~~~~~~~~ 1533 (2161)
.++++|+.+ .++-|+|+..-+-+
T Consensus 121 ~lal~V~~~--~v~LyvDC~~~~~~ 143 (184)
T smart00210 121 KLALSVSGS--SATLYVDCNEIDSR 143 (184)
T ss_pred EEEEEEeCC--EEEEEECCccccce
Confidence 999999887 69999999876544
No 15
>smart00282 LamG Laminin G domain.
Probab=93.19 E-value=0.71 Score=47.35 Aligned_cols=109 Identities=20% Similarity=0.215 Sum_probs=64.7
Q ss_pred EEEEEEeecCCCceeeecccc-cchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccccceecCCceEEEEEE
Q 000112 1435 FSIAVMIQPESGPVCLLGTEF-QKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSISATSIADGRWHIVTMT 1513 (2161)
Q Consensus 1435 ~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1513 (2161)
+++.+.++|.+..=.||-+.. +.+. ++.+ .++.|++-++.-..+ +... -......+.||+||.|.++
T Consensus 3 ~~i~~~frt~~~~g~l~~~~~~~~~~---~l~l----~l~~g~l~~~~~~g~-~~~~----~~~~~~~~~dg~WH~v~i~ 70 (135)
T smart00282 3 LSISFSFRTTSPNGLLLYAGSKNGGD---YLAL----ELRDGRLVLRYDLGS-GPAR----LTSDPTPLNDGQWHRVAVE 70 (135)
T ss_pred eEEEEEEEeCCCCEEEEEeCCCCCCC---EEEE----EEECCEEEEEEECCC-CCEE----EEECCeEeCCCCEEEEEEE
Confidence 566777777765445554433 1221 1211 234688777666533 2211 1224478999999999999
Q ss_pred EeccccceeeeecccccccccccccccccccccCCceEEEecCCCCc
Q 000112 1514 IDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTD 1560 (2161)
Q Consensus 1514 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1560 (2161)
.+ .++.+-++||...-... .+.....-+..+.+++|-.|+..
T Consensus 71 ~~--~~~~~l~VD~~~~~~~~---~~~~~~~l~~~~~l~iGG~p~~~ 112 (135)
T smart00282 71 RN--GRRVTLSVDGENPVSGE---SPGGLTILNLDGPLYLGGLPEDL 112 (135)
T ss_pred Ee--CCEEEEEECCCccccEE---CCCCceEEecCCCcEEccCCchh
Confidence 87 46788999996432221 12222344556789999888753
No 16
>smart00560 LamGL LamG-like jellyroll fold domain.
Probab=90.83 E-value=0.89 Score=47.63 Aligned_cols=82 Identities=20% Similarity=0.253 Sum_probs=49.5
Q ss_pred EEEEEEEeecCCCce--eeecccccchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccc---cceecCCceE
Q 000112 1434 SFSIAVMIQPESGPV--CLLGTEFQKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSIS---ATSIADGRWH 1508 (2161)
Q Consensus 1434 ~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~ 1508 (2161)
+|++++.|.|++.|- .+++ .- +.+-. +-..++.++-|-...+.+ |... .+....|+||
T Consensus 2 ~fTv~aWv~~~~~~~~~~~~~---------~~-v~~~~-~~~~~~~~f~l~~~~~~~------w~~~~~~~~~~~~~~W~ 64 (133)
T smart00560 2 SFTLEAWVKLESAGGSQPIIT---------GA-AVAQP-TISEKALTFFLRAKSVQG------WQTARTGATADWIGVWV 64 (133)
T ss_pred cEEEEEEEeecccCcccceee---------eE-EEEcc-CCCCCceEEEEEeeccCC------EEEeccccCCCCCCCEE
Confidence 699999999997642 1110 11 11111 223355665554443222 2221 1222239999
Q ss_pred EEEEEEeccccceeeeeccccccc
Q 000112 1509 IVTMTIDADIGEATCYLDGGFDGY 1532 (2161)
Q Consensus 1509 ~~~~~~~~~~~~~~~~~~~~~~~~ 1532 (2161)
-|+++.|.+.|+.+.|+||-..+-
T Consensus 65 hva~v~d~~~g~~~lYvnG~~~~~ 88 (133)
T smart00560 65 HLAGVYDGGAGKLSLYVNGVEVAT 88 (133)
T ss_pred EEEEEEECCCCeEEEEECCEEccc
Confidence 999999999999999999976653
No 17
>cd02619 Peptidase_C1 C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel str
Probab=81.90 E-value=2.3 Score=46.44 Aligned_cols=49 Identities=24% Similarity=0.493 Sum_probs=39.4
Q ss_pred CcccCceeEEEEEEEEc--CeEEEEEecCCCCCccccCCCCCCCccccHHHHhhhCCCCCCCCCeeecchhhHhhcc
Q 000112 1924 GIVQGHAYSILQVREVD--GHKLVQIRNPWANEVEWNGPWSDSSPEWTDRMKHKLKHVPQSKDGIFWMSWQDFQIHF 1998 (2161)
Q Consensus 1924 GLVsGHAYSVLdV~EVd--G~RLVRLRNPWG~~~EWKG~WSD~S~eWTeeLKkkL~~~p~sDDGtFWMSfEDFLkyF 1998 (2161)
.-..+||-.|++..... +.....+||-||. .| .++|.|||+++++..++
T Consensus 168 ~~~~~Hav~ivGy~~~~~~~~~~~i~~NSwG~--~w------------------------g~~Gy~~i~~~~~~~~~ 218 (223)
T cd02619 168 GDLGGHAVVIVGYDDNYVEGKGAFIVKNSWGT--DW------------------------GDNGYGRISYEDVYEMT 218 (223)
T ss_pred CccCCeEEEEEeecCCCCCCCCEEEEEeCCCC--cc------------------------ccCCEEEEehhhhhhhh
Confidence 44679999999998654 6788999999994 44 24899999999998554
No 18
>KOG1029 consensus Endocytic adaptor protein intersectin [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=77.56 E-value=3.7 Score=54.17 Aligned_cols=33 Identities=27% Similarity=0.427 Sum_probs=19.6
Q ss_pred chhHHHHHHHhhccceeeeccCCEEEecCceee
Q 000112 1011 SVGYFTFLFLLAGRALTVLLSPPIVVYSPRVLP 1043 (2161)
Q Consensus 1011 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1043 (2161)
|+..=....-|.|--+-+.|-|-+.+--||-.|
T Consensus 72 SIAmkLi~lkLqG~~lP~~LPPsll~~~~~~~p 104 (1118)
T KOG1029|consen 72 SIAMKLIKLKLQGIQLPPVLPPSLLKQPPRNAP 104 (1118)
T ss_pred HHHHHHHHHHhcCCcCCCCCChHHhccCCcCCC
Confidence 555555556677777777665546655555444
No 19
>PF02210 Laminin_G_2: Laminin G domain; InterPro: IPR012680 Laminins are large heterotrimeric glycoproteins involved in basement membrane function []. The laminin globular (G) domain can be found in one to several copies in various laminin family members, including a large number of extracellular proteins. The C terminus of the laminin alpha chain contains a tandem repeat of five laminin G domains, which are critical for heparin-binding and cell attachment activity []. Laminin alpha4 is distributed in a variety of tissues including peripheral nerves, dorsal root ganglion, skeletal muscle and capillaries; in the neuromuscular junction, it is required for synaptic specialisation []. The structure of the laminin-G domain has been predicted to resemble that of pentraxin []. Laminin G domains can vary in their function, and a variety of binding functions have been ascribed to different LamG modules. For example, the laminin alpha1 and alpha2 chains each have five C-terminal laminin G domains, where only domains LG4 and LG5 contain binding sites for heparin, sulphatides and the cell surface receptor dystroglycan []. Laminin G-containing proteins appear to have a wide variety of roles in cell adhesion, signalling, migration, assembly and differentiation. This entry represents one subtype of laminin G domains, which is sometimes found in association with thrombospondin-type laminin G domains (IPR012679 from INTERPRO).; PDB: 3POY_A 3QCW_B 3R05_B 3ASI_A 3MW4_B 3MW3_A 1QU0_D 1DYK_A 1OKQ_A 3SH4_A ....
Probab=76.41 E-value=5.5 Score=39.52 Aligned_cols=62 Identities=21% Similarity=0.394 Sum_probs=39.9
Q ss_pred cccceecCCceEEEEEEEeccccceeeeecccccccccccccccccccccCCceEEEecCCCCccc
Q 000112 1497 ISATSIADGRWHIVTMTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTDMD 1562 (2161)
Q Consensus 1497 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1562 (2161)
.....++||+||.|+++.+... ++-++|+.-.-.+.-.... ...=+....+++|-.|+....
T Consensus 46 ~~~~~~~dg~wh~v~i~~~~~~--~~l~Vd~~~~~~~~~~~~~--~~~~~~~~~l~iGg~~~~~~~ 107 (128)
T PF02210_consen 46 FSNSNLNDGQWHKVSISRDGNR--VTLTVDGQSVSSESLPSSS--SDSLDPDGSLYIGGLPESNQP 107 (128)
T ss_dssp ECSSSSTSSSEEEEEEEEETTE--EEEEETTSEEEEEESSSTT--HHCBESEEEEEESSTTTTCTC
T ss_pred ccCccccccceeEEEEEEeeee--EEEEecCccceEEeccccc--eecccCCCCEEEecccCcccc
Confidence 3455699999999999887765 7888887643322111111 013345667999999886543
No 20
>cd02248 Peptidase_C1A Peptidase C1A subfamily (MEROPS database nomenclature); composed of cysteine peptidases (CPs) similar to papain, including the mammalian CPs (cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain is an endopeptidase with specific substrate preferences, primarily for bulky hydrophobic or aromatic residues at the S2 subsite, a hydrophobic pocket in papain that accommodates the P2 sidechain of the substrate (the second residue away from the scissile bond). Most members of the papain subfamily are endopeptidases. Some exceptions to this rule can be explained by specific details of the catalytic domains like the occluding loop in cathepsin B which confers an additional carboxydipeptidyl activity and the mini-chain of cathepsin H resulting in an N-terminal exopeptidase activity. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds. Parasitic CPs act extracellularly to help invade tissues and cells, to h
Probab=59.99 E-value=22 Score=39.35 Aligned_cols=43 Identities=16% Similarity=0.404 Sum_probs=35.3
Q ss_pred cccCceeEEEEEEEEcCeEEEEEecCCCCCccccCCCCCCCccccHHHHhhhCCCCCCCCCeeecchhh
Q 000112 1925 IVQGHAYSILQVREVDGHKLVQIRNPWANEVEWNGPWSDSSPEWTDRMKHKLKHVPQSKDGIFWMSWQD 1993 (2161)
Q Consensus 1925 LVsGHAYSVLdV~EVdG~RLVRLRNPWG~~~EWKG~WSD~S~eWTeeLKkkL~~~p~sDDGtFWMSfED 1993 (2161)
...+|+=.|++..+-.|.+...+||-||. +| .++|.|||+.++
T Consensus 156 ~~~~Hav~iVGy~~~~~~~ywiv~NSWG~--~W------------------------G~~Gy~~i~~~~ 198 (210)
T cd02248 156 TNLNHAVLLVGYGTENGVDYWIVKNSWGT--SW------------------------GEKGYIRIARGS 198 (210)
T ss_pred CcCCEEEEEEEEeecCCceEEEEEcCCCC--cc------------------------ccCcEEEEEcCC
Confidence 45689999999988777889999999994 44 346999999877
No 21
>PF03699 UPF0182: Uncharacterised protein family (UPF0182); InterPro: IPR005372 This family contains uncharacterised integral membrane proteins.; GO: 0016021 integral to membrane
Probab=58.26 E-value=13 Score=50.24 Aligned_cols=62 Identities=23% Similarity=0.379 Sum_probs=37.3
Q ss_pred hhcCceEEEEEeccCCCCCChh-----h----HHHhh-----hhhhhHHHHHHhhcccc---ee--------------ec
Q 000112 865 AFCGASYLEVVKSREDQVPTKG-----D----FLAAL-----LPLVCIPALLSLCSGLL---KW--------------KD 913 (2161)
Q Consensus 865 ~~~~~~~~~v~~~r~~~~p~~~-----d----fl~a~-----lpl~~ipa~~~l~~gl~---kw--------------~d 913 (2161)
.+...+.+-..+.|....|... + +.... +-++.++++++++.|+. .| +|
T Consensus 62 ~~~~~~~~~a~r~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~W~~~L~f~n~~~Fg~~D 141 (774)
T PF03699_consen 62 LFVFLNLWLAYRSRPKFRPPSPEQQRSDPLERYRELIEPRRRWVIIGVSLVLGLFAGLSASSQWETILLFLNGTPFGITD 141 (774)
T ss_pred HHHHHHHHHHHhcccccccccccccccchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhHHHHHHHhCCCCCCCCC
Confidence 4455555556666665444322 1 22222 22456777888877764 35 78
Q ss_pred Cccccccceeeee
Q 000112 914 DDWKLSRGVYVFI 926 (2161)
Q Consensus 914 d~w~~s~~~~~~~ 926 (2161)
--....-|-|+|.
T Consensus 142 P~Fg~Di~FYvF~ 154 (774)
T PF03699_consen 142 PIFGKDISFYVFS 154 (774)
T ss_pred CCCCCCceeeeeh
Confidence 8888888999985
No 22
>PF02057 Glyco_hydro_59: Glycosyl hydrolase family 59; InterPro: IPR001286 O-Glycosyl hydrolases (EC 3.2.1.-) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 59 (GH59 in CAZy) comprises enzymes with only one known activity: galactocerebrosidase (EC 3.2.1.46). Globoid cell leukodystrophy (Krabbe disease) is a severe, autosomal recessive disorder that results from deficiency of galactocerebrosidase (GALC) activity [, , ]. GALC is responsible for the lysosomal catabolism of certain galactolipids, including galactosylceramide and psychosine [].; GO: 0004336 galactosylceramidase activity, 0006683 galactosylceramide catabolic process; PDB: 3ZR6_A 3ZR5_A.
Probab=57.92 E-value=29 Score=46.29 Aligned_cols=91 Identities=29% Similarity=0.418 Sum_probs=49.7
Q ss_pred ceeEEEEEEEee-cCCCceeeecccccchhhheeeeccccccc-----cccceeEEEEEecCCceeeeeeeccccceecC
Q 000112 1431 QKVSFSIAVMIQ-PESGPVCLLGTEFQKKVCWEILVAGSEQGI-----EAGQVGLRLITKGDRQTTVAKDWSISATSIAD 1504 (2161)
Q Consensus 1431 ~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1504 (2161)
+.++.|.-||+. |++|-|+|.|.--+.- | ...+.+|+ +.|.-- ||+.-..+++-+... +.+.-
T Consensus 542 ~NytVs~DV~ie~~~~ggv~lagRv~~~g-~----~~~~~~G~~f~v~~~G~w~---vt~d~~~~~~l~~G~---~~~~~ 610 (669)
T PF02057_consen 542 SNYTVSCDVYIETPDTGGVFLAGRVNKGG-C----DVRSARGYFFWVYANGTWS---VTSDLAGTTTLASGT---ADIGA 610 (669)
T ss_dssp -EEEEEEEEEE-STTT-EEEEEEEE---G-G----GGGG-EEEEEEEETTTEEE---EEEETTS-SEEEEEE----S--T
T ss_pred eEEEEEEEEEeccCCcCcEEEEEeecccc-c----ccCCCCeEEEEEEcCCcEE---EeccCCCcEEEeeee---ecccC
Confidence 346777888887 5899999987654332 1 12223332 222221 333333333434433 45777
Q ss_pred CceEEEEEEEeccccceeeeeccccccccc
Q 000112 1505 GRWHIVTMTIDADIGEATCYLDGGFDGYQT 1534 (2161)
Q Consensus 1505 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1534 (2161)
||||+++++|+-++ +++++||..-++-.
T Consensus 611 ~~WhtltL~~~g~~--~ta~lng~~l~~~~ 638 (669)
T PF02057_consen 611 GKWHTLTLTISGST--ATAMLNGTVLWTDV 638 (669)
T ss_dssp T-EEEEEEEEETTE--EEEEETTEEEEEEE
T ss_pred CeEEEEEEEEECCE--EEEEECCEEeEEec
Confidence 99999999998887 99999998766543
No 23
>KOG4326 consensus Mitochondrial F1F0-ATP synthase, subunit e [Energy production and conversion]
Probab=56.90 E-value=15 Score=36.81 Aligned_cols=17 Identities=47% Similarity=0.597 Sum_probs=14.4
Q ss_pred cchhhhHhhhccchhhh
Q 000112 1242 KFGGYLLLLLGLTAKAE 1258 (2161)
Q Consensus 1242 ~~~~~~~~~~~~~~~~~ 1258 (2161)
|||.|-+|+||.+--|-
T Consensus 13 kfGRysaL~lGvaYGa~ 29 (81)
T KOG4326|consen 13 KFGRYSALSLGVAYGAF 29 (81)
T ss_pred HhhHHHHHHHHHHHhHH
Confidence 89999999999876554
No 24
>TIGR00805 oat sodium-independent organic anion transporter. Proteins of the OAT family catalyze the Na+-independent facilitated transport of organic anions such as bromosulfobromophthalein and prostaglandins as well as conjugated and unconjugated bile acids (taurocholate and cholate, respectively). These transporters have been characterized in mammals, but homologues are present in C. elegans and A. thaliana. Some of the mammalian proteins exhibit a high degree of tissue specificity. For example, the rat OAT is found at high levels in liver and kidney and at lower levels in other tissues. These proteins possess 10-12 putative alpha-helical transmembrane spanners. They may catalyze electrogenic anion uniport or anion exchange.
Probab=47.60 E-value=33 Score=45.11 Aligned_cols=94 Identities=16% Similarity=0.280 Sum_probs=53.7
Q ss_pred ccceeeeehhHHHHHHhhhheeeeeccchh----------HHHHHHHHHHHHHHHhh-hhhcccccceeeehhhHHHHHH
Q 000112 919 SRGVYVFITIGLVLLLGAISAVIVVITPWT----------IGVAFLLLLLLIVLAIG-VIHHWASNNFYLTRTQMFFVCF 987 (2161)
Q Consensus 919 s~~~~~~~~~gl~ll~~a~~~v~~~i~~w~----------~gvaf~l~~~~~~~~ig-v~~~was~~f~~~~~~~~~~~~ 987 (2161)
+...|++..++..+..++..++...+..+. .|..+.+..+. ...+| .+.-|.++.+-+..++++..|+
T Consensus 328 ~n~~f~~~~l~~~~~~~~~~~~~~~lP~yl~~~~g~s~~~ag~l~~~~~i~-~~~vG~~l~G~l~~r~~~~~~~~~~~~~ 406 (633)
T TIGR00805 328 CNPIYMLVILAQVIDSLAFNGYITFLPKYLENQYGISSAEANFLIGVVNLP-AAGLGYLIGGFIMKKFKLNVKKAAYFAI 406 (633)
T ss_pred cCcHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHcCCcHHHHHHHhhhhhhh-HHHHHHhhhhheeeeecccHHHHHHHHH
Confidence 344566666666666555555444333332 23222222221 12233 3566777777777777777666
Q ss_pred HHHHHHHH----HHHhhhcCCCCccccchhH
Q 000112 988 LAFLLGLA----AFLVGWFDDKPFVGASVGY 1014 (2161)
Q Consensus 988 ~~~~~~~~----~~~~~~~~~~~~~~~~~~~ 1014 (2161)
+..+++++ .|++| -++-|+.|..+.|
T Consensus 407 ~~~~~~~~~~~~~~~~~-C~~~~~agv~~~y 436 (633)
T TIGR00805 407 CLSTLSYLLCSPLFLIG-CESAPVAGVNNPS 436 (633)
T ss_pred HHHHHHHHHHHHHHeec-CCCCccceeeccC
Confidence 65555543 45555 5888999999987
No 25
>PF09323 DUF1980: Domain of unknown function (DUF1980); InterPro: IPR015402 Members of this family occur in gene pairs with members of PF03773 from PFAM. The N-terminal region contains several predicted transmembrane helix regions while the few invariant residues (G, CxxD, and W) occur in the C-terminal region. Members of this family are found in a set of prokaryotic hypothetical proteins. Their exact function has not, as yet, been defined.
Probab=44.91 E-value=57 Score=36.77 Aligned_cols=57 Identities=19% Similarity=0.386 Sum_probs=43.9
Q ss_pred HHHHHHHHHHhhHHHHHHHhhhhhheeeccceehhhHHHHHHHHHHHHHHHHHHhhhhccc
Q 000112 64 FLALSAWMVVISPVAVLIMWGSWLIVILGRDIIGLAIIMAGTALLLAFYSIMLWWRTQWQS 124 (2161)
Q Consensus 64 ~l~l~a~~~~~~p~~~~~~wg~~~~~~~~~~~~g~a~~~~g~~~~~~~y~~~~w~~t~w~s 124 (2161)
+|.|++|.+.+ +-+.+-.-+...+.++.+.++++.+.+.++||.+.++.|+|.+=++
T Consensus 4 ~liL~~~~~l~----~~l~~sG~i~~YI~P~~~~~~~~a~i~l~ilai~q~~~~~~~~~~~ 60 (182)
T PF09323_consen 4 FLILLGFGILL----FYLILSGKILLYIHPRYIPLLYFAAILLLILAIVQLWRWFRPKRRK 60 (182)
T ss_pred HHHHHHHHHHH----HHHHHhCcHHHHhCccHHHHHHHHHHHHHHHHHHHHHHHHhccccc
Confidence 34455554432 2344556677788999999999999999999999999999988774
No 26
>PF00112 Peptidase_C1: Papain family cysteine protease This is family C1 in the peptidase classification. ; InterPro: IPR000668 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of proteins belong to the peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologs. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues. The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity []. Members of the papain family are widespread, found in baculovirus [], eubacteria, yeast, and practically all protozoa, plants and mammals []. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals []. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides is their ability to inhibit the activity of their cognate enzymes and that certain propeptides exhibit high selectivity for inhibition of the peptidases from which they originate []. 
The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159. ; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MOR_B 3HHI_B 1S4V_A 3F75_A 1MEG_A 1PCI_C 1PPO_A 3HD3_B 1F29_A 1EWL_A ....
Probab=44.85 E-value=30 Score=38.04 Aligned_cols=44 Identities=25% Similarity=0.572 Sum_probs=35.7
Q ss_pred cccCceeEEEEEEEEcCeEEEEEecCCCCCccccCCCCCCCccccHHHHhhhCCCCCCCCCeeecchhhH
Q 000112 1925 IVQGHAYSILQVREVDGHKLVQIRNPWANEVEWNGPWSDSSPEWTDRMKHKLKHVPQSKDGIFWMSWQDF 1994 (2161)
Q Consensus 1925 LVsGHAYSVLdV~EVdG~RLVRLRNPWG~~~EWKG~WSD~S~eWTeeLKkkL~~~p~sDDGtFWMSfEDF 1994 (2161)
...+|+-.|++..+-.+.....+||-||. .| .++|.|||+.++-
T Consensus 163 ~~~~Hav~iVGy~~~~~~~~wiv~NSWG~--~W------------------------G~~Gy~~i~~~~~ 206 (219)
T PF00112_consen 163 ESGGHAVLIVGYDDENGKGYWIVKNSWGT--DW------------------------GDNGYFRISYDYN 206 (219)
T ss_dssp SSEEEEEEEEEEEEETTEEEEEEE-SBTT--TS------------------------TBTTEEEEESSSS
T ss_pred ccccccccccccccccceeeEeeehhhCC--cc------------------------CCCeEEEEeeCCC
Confidence 56799999999998888899999999994 34 2479999998764
No 27
>PF04156 IncA: IncA protein; InterPro: IPR007285 Chlamydia trachomatis is an obligate intracellular bacterium that develops within a parasitophorous vacuole termed an inclusion. The inclusion is nonfusogenic with lysosomes but intercepts lipids from a host cell exocytic pathway. Initiation of chlamydial development is concurrent with modification of the inclusion membrane by a set of C. trachomatis-encoded proteins collectively designated Incs. One of these Incs, IncA (Inclusion membrane protein A), is functionally associated with the homotypic fusion of inclusions [].
Probab=44.49 E-value=12 Score=41.72 Aligned_cols=14 Identities=14% Similarity=0.337 Sum_probs=9.5
Q ss_pred cCCEEEecCceeeE
Q 000112 1031 SPPIVVYSPRVLPV 1044 (2161)
Q Consensus 1031 ~~~~~~~~~~~~~~ 1044 (2161)
.+|...+.|+.+|.
T Consensus 63 ~~~~~~~~~~~~~~ 76 (191)
T PF04156_consen 63 KRPVQSVRPQQIEE 76 (191)
T ss_pred ccccccchHHHHHh
Confidence 45666677777776
No 28
>PTZ00334 trans-sialidase; Provisional
Probab=43.17 E-value=33 Score=46.57 Aligned_cols=77 Identities=23% Similarity=0.403 Sum_probs=50.6
Q ss_pred CCceEEEEEEEeccccceeeeeccccccc-ccccccccccccccCCceEEEecCCCCccccc--cC-CCcccc--ccchh
Q 000112 1504 DGRWHIVTMTIDADIGEATCYLDGGFDGY-QTGLALSAGNSIWEEGAEVWVGVRPPTDMDVF--GR-SDSEGA--ESKMH 1577 (2161)
Q Consensus 1504 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~-~~~~~~--~~~~~ 1577 (2161)
-|+-|.|.++++-. .+.+.|+||--=|- ++ ++.. +.|.++--| |- ..+.+. ++++-
T Consensus 642 ~~k~yqVal~L~~G-~~gsvYVDG~~vg~~~~--~l~~---------------~~~~~IshFyiGgdg~~~~~~~~~~VT 703 (780)
T PTZ00334 642 PETTHQVAIVLRNG-KQGSAYVDGQRVGDASC--ELKN---------------TDSKGISHFYIGGDGGSAGSKEDVPVT 703 (780)
T ss_pred CCCeEEEEEEEeCC-CeEEEEECCEEecCccc--ccCC---------------CCCcccceEEECCCccccccCCCCCEE
Confidence 36779999999542 26899999976552 22 2221 124444444 11 111111 46788
Q ss_pred hheehhhcccCChHHHHHHhh
Q 000112 1578 IMDVFLWGRCLTEDEIASLYS 1598 (2161)
Q Consensus 1578 ~~~~~~~~~clte~e~~~~~~ 1598 (2161)
...|||.-|+|+++||.+|..
T Consensus 704 V~NVlLYNRpL~~~Ei~~l~~ 724 (780)
T PTZ00334 704 ATNVLLYNRPLDDNEIRVLNA 724 (780)
T ss_pred EeEeEEeCCCCCHHHHHhhhc
Confidence 999999999999999999975
No 29
>COG1390 NtpE Archaeal/vacuolar-type H+-ATPase subunit E [Energy production and conversion]
Probab=40.77 E-value=2.5e+02 Score=32.83 Aligned_cols=113 Identities=25% Similarity=0.262 Sum_probs=73.1
Q ss_pred hhhhhhhhhhhhhhhhcccCCCcCChhhhhccCchhhhhHHHHHHhhhhhhhhHHHHHHHHHhhhcccHHHHHHHHHHHH
Q 000112 1255 AKAERVQDEVRLRLFLDSIGFSDLSAKKIKKWMPEDRRQFEIIQESYIREKEMEEEILMQRREEEGRGKERRKALLEKEE 1334 (2161)
Q Consensus 1255 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1334 (2161)
.||+++.+|.+-+ .++=..|-++.-+-.++.+.+.++-|.+...||=-..-+-.-||+.|-.+||
T Consensus 17 eeak~I~~eA~~e---------------ae~i~~ea~~~~~~~~~~~~~~~~~ea~~~~~~iis~A~le~r~~~Le~~ee 81 (194)
T COG1390 17 EEAEEILEEAREE---------------AEKIKEEAKREAEEAIEEILRKAEKEAERERQRIISSALLEARRKLLEAKEE 81 (194)
T ss_pred HHHHHHHHHHHHH---------------HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 5677777776543 3333456777778888899988887777777665555455555555555444
Q ss_pred --hhHHhhhhhhcccCCCCCchHH--HHHHHHHHHhcCCccccchhhhHHHHHH
Q 000112 1335 --RKWKEIEASLISSIPNAGNREA--AAMAAAVRAVGGDSVLEDSFARERVSSI 1384 (2161)
Q Consensus 1335 --~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1384 (2161)
..|-+..-.-|..+++...-++ +-|.+++....|+.+. -+.+++.+.+
T Consensus 82 ~l~~~~~~~~e~L~~i~~~~~~~~l~~ll~~~~~~~~~~~~i--V~~~e~d~~~ 133 (194)
T COG1390 82 ILESVFEAVEEKLRNIASDPEYESLQELLIEALEKLLGGELV--VYLNEKDKAL 133 (194)
T ss_pred HHHHHHHHHHHHHHcCcCCcchHHHHHHHHHHHHhcCCCCeE--EEeCcccHHH
Confidence 2344455556667777666666 6688888888777766 4555555555
No 30
>PF07946 DUF1682: Protein of unknown function (DUF1682); InterPro: IPR012879 The members of this family are all hypothetical eukaryotic proteins of unknown function. One member (Q920S6 from SWISSPROT) is described as being an adipocyte-specific protein, but no evidence of this was found.
Probab=40.65 E-value=40 Score=41.21 Aligned_cols=10 Identities=50% Similarity=0.826 Sum_probs=6.1
Q ss_pred HHHhhHHhhh
Q 000112 1332 KEERKWKEIE 1341 (2161)
Q Consensus 1332 ~~~~~~~~~~ 1341 (2161)
.|.|||.|-|
T Consensus 305 eeQrK~eeKe 314 (321)
T PF07946_consen 305 EEQRKYEEKE 314 (321)
T ss_pred HHHHHHHHHH
Confidence 5666666655
No 31
>PF00054 Laminin_G_1: Laminin G domain; InterPro: IPR012679 Laminins are large heterotrimeric glycoproteins involved in basement membrane function []. The laminin globular (G) domain can be found in one to several copies in various laminin family members, which includes a large number of extracellular proteins. The C terminus of the laminin alpha chain contains a tandem repeat of five laminin G domains, which are critical for heparin-binding and cell attachment activity []. Laminin alpha4 is distributed in a variety of tissues including peripheral nerves, dorsal root ganglion, skeletal muscle and capillaries; in the neuromuscular junction, it is required for synaptic specialisation []. The structure of the laminin-G domain has been predicted to resemble that of pentraxin []. Laminin G domains can vary in their function, and a variety of binding functions has been ascribed to different LamG modules. For example, the laminin alpha1 and alpha2 chains each have five C-terminal laminin G domains, where only domains LG4 and LG5 contain binding sites for heparin, sulphatides and the cell surface receptor dystroglycan []. Laminin G-containing proteins appear to have a wide variety of roles in cell adhesion, signalling, migration, assembly and differentiation. This entry represents one subtype of laminin G domains, which is sometimes found in association with thrombospondin-type laminin G domains (IPR012680 from INTERPRO).; PDB: 1OKQ_A 1DYK_A 2C5D_A 1H30_A 1LHW_A 1KDK_A 1LHU_A 1KDM_A 1LHO_A 1D2S_A ....
Probab=38.55 E-value=40 Score=35.51 Aligned_cols=51 Identities=22% Similarity=0.473 Sum_probs=33.8
Q ss_pred ccccceeEEEEEecCCceeeeeeeccccceecCCceEEEEEEEeccccceeeeeccccc
Q 000112 1472 IEAGQVGLRLITKGDRQTTVAKDWSISATSIADGRWHIVTMTIDADIGEATCYLDGGFD 1530 (2161)
Q Consensus 1472 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1530 (2161)
|..|++=+|.- -|.+..++ ..+.+ |.||+||.|++..... +++-.+||...
T Consensus 26 L~~G~l~~~~~-~G~~~~~~----~~~~~-i~dg~wh~v~~~r~~~--~~~L~Vd~~~~ 76 (131)
T PF00054_consen 26 LRDGRLEFRYN-LGSGPASL----RSPQK-INDGKWHTVSVSRNGR--NGSLSVDGEEV 76 (131)
T ss_dssp EETTEEEEEEE-SSSEEEEE----EESSE-TTSSSEEEEEEEEETT--EEEEEETTSEE
T ss_pred EECCEEEEEEe-CCCcccee----cCCCc-cCCCcceEEEEEEcCc--EEEEEECCccc
Confidence 66788777763 33333333 12333 9999999999988754 55667888765
No 32
>cd08045 TAF4 TATA Binding Protein (TBP) Associated Factor 4 (TAF4) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The TATA Binding Protein (TBP) Associated Factor 4 (TAF4) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of seven General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Several hypotheses are
Probab=38.23 E-value=17 Score=41.80 Aligned_cols=44 Identities=30% Similarity=0.357 Sum_probs=31.5
Q ss_pred chHHHHHHHHHHHhcCCccccchhhhHHHHHHHHHHHHHHHHHHHHhcCCcceEEeeCCCCCccCccc
Q 000112 1353 NREAAAMAAAVRAVGGDSVLEDSFARERVSSIARRIRTAQLARRALQTGITGAICVLDDEPTTSGRHC 1420 (2161)
Q Consensus 1353 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1420 (2161)
.|--||=++|.-|+||+.-+- ....+...+|+|++||+.+..+.
T Consensus 166 ~r~r~AN~tA~~AiG~~kk~~------------------------~~i~~rD~l~~LE~e~~~~~s~l 209 (212)
T cd08045 166 MRHRAANATALAAIGGRKKKK------------------------RRITMRDVLFVLEREPRYSKSAL 209 (212)
T ss_pred HHHHHHHHHHHHHhCCCCccc------------------------ceeeHHHHHHHHHhCchhhhhhh
Confidence 344566677777899987765 33445677889999998876653
No 33
>KOG1029 consensus Endocytic adaptor protein intersectin [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=37.63 E-value=43 Score=45.15 Aligned_cols=43 Identities=23% Similarity=0.360 Sum_probs=20.5
Q ss_pred hhhhHHHHHHhhhhhhhhHHHHHHHHHhhhcccHHHHHHHHHH
Q 000112 1290 DRRQFEIIQESYIREKEMEEEILMQRREEEGRGKERRKALLEK 1332 (2161)
Q Consensus 1290 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1332 (2161)
|||+=|.-....-++-|.|.++-+||--|.+|-.||||.+.++
T Consensus 356 ekkererqEqErk~qlElekqLerQReiE~qrEEerkkeie~r 398 (1118)
T KOG1029|consen 356 EKKERERQEQERKAQLELEKQLERQREIERQREEERKKEIERR 398 (1118)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 4444443333344444555555555544555555555554444
No 34
>PTZ00266 NIMA-related protein kinase; Provisional
Probab=37.50 E-value=53 Score=45.98 Aligned_cols=10 Identities=50% Similarity=0.807 Sum_probs=4.6
Q ss_pred hhHHHHHHHH
Q 000112 1377 ARERVSSIAR 1386 (2161)
Q Consensus 1377 ~~~~~~~~~~ 1386 (2161)
-|||...+.|
T Consensus 508 e~er~~r~e~ 517 (1021)
T PTZ00266 508 ERERVDRLER 517 (1021)
T ss_pred HHHHHHHHHH
Confidence 3455544444
No 35
>PLN02316 synthase/transferase
Probab=36.90 E-value=53 Score=46.06 Aligned_cols=16 Identities=6% Similarity=-0.068 Sum_probs=10.2
Q ss_pred CchhHHHHHHHHHHhc
Q 000112 1837 HELWVSILEKAYAKLH 1852 (2161)
Q Consensus 1837 nELWpSLLEKAYAKLh 1852 (2161)
+..|..++=||.+.+.
T Consensus 688 d~~RF~~F~~Aale~l 703 (1036)
T PLN02316 688 DGERFGFFCHAALEFL 703 (1036)
T ss_pred HHHHHHHHHHHHHHHH
Confidence 4567777777766643
No 36
>PF09472 MtrF: Tetrahydromethanopterin S-methyltransferase, F subunit (MtrF); InterPro: IPR013347 Many archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This domain is mostly found in MtrF, where it covers the entire length of the protein. This polypeptide is one of eight subunits of the N5-methyltetrahydromethanopterin: coenzyme M methyltransferase complex found in methanogenic archaea. This is a membrane-associated enzyme complex that uses methyl-transfer reactions to drive a sodium-ion pump []. MtrF itself is involved in the transfer of the methyl group from N5-methyltetrahydromethanopterin to coenzyme M. Subsequently, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. In some organisms this domain is found at the C-terminal region of what appears to be a fusion of the MtrA and MtrF proteins [, ]. The function of these proteins is unknown, though it is likely that they are involved in C1 metabolism.; GO: 0030269 tetrahydromethanopterin S-methyltransferase activity, 0015948 methanogenesis, 0016020 membrane
Probab=35.40 E-value=12 Score=36.74 Aligned_cols=47 Identities=21% Similarity=0.360 Sum_probs=39.9
Q ss_pred cccccCCcc-cCCCCccccccccchhhHHHHHhhHHHhhhcccchhhc
Q 000112 796 LEDLGYKGW-TGEPNSFASPYASSVYLGWLMASAIALVVTGVLPIVSW 842 (2161)
Q Consensus 796 ~~~~~~~~~-~~~~~~~~~~y~~~~~~gw~~~~~~~~v~~~~~p~vsw 842 (2161)
.||++||.= -++.+...|--.++-..|.++...+|+|+.++.|+.-|
T Consensus 17 vedi~Yk~qLiaR~~kL~SGv~~~~~~GfaiG~~~AlvLv~ip~~l~~ 64 (64)
T PF09472_consen 17 VEDIRYKAQLIARDQKLESGVMATGIKGFAIGFLFALVLVGIPILLMF 64 (64)
T ss_pred HHHHHHHHHHhhhcchhHHHHhhhhhHHHHHHHHHHHHHHHHHHHHhC
Confidence 489999863 45667788888899999999999999999999888766
No 37
>KOG1144 consensus Translation initiation factor 5B (eIF-5B) [Translation, ribosomal structure and biogenesis]
Probab=33.10 E-value=1e+02 Score=42.08 Aligned_cols=17 Identities=24% Similarity=0.251 Sum_probs=9.5
Q ss_pred eeeeecccccccccccc
Q 000112 1521 ATCYLDGGFDGYQTGLA 1537 (2161)
Q Consensus 1521 ~~~~~~~~~~~~~~~~~ 1537 (2161)
++--+||-+|-+-.-+.
T Consensus 397 ~~~~~~~d~dd~ee~~~ 413 (1064)
T KOG1144|consen 397 VDLAIDGDDDDDEEELQ 413 (1064)
T ss_pred ccccccccccchhhhhc
Confidence 33446666776655443
No 38
>PF09323 DUF1980: Domain of unknown function (DUF1980); InterPro: IPR015402 Members of this family occur in gene pairs with members of PF03773 from Pfam. The N-terminal region contains several predicted transmembrane helix regions, while the few invariant residues (G, CxxD, and W) occur in the C-terminal region. Members of this family are found in a set of prokaryotic hypothetical proteins. Their exact function has not, as yet, been defined.
Probab=32.65 E-value=76 Score=35.83 Aligned_cols=65 Identities=28% Similarity=0.348 Sum_probs=38.0
Q ss_pred HHHHHHHHhhhhhcccccce--eee-hhhHHHHHHHHHHHHHHHH-HhhhcCCCCcc-----------ccchhHHHHHHH
Q 000112 956 LLLLIVLAIGVIHHWASNNF--YLT-RTQMFFVCFLAFLLGLAAF-LVGWFDDKPFV-----------GASVGYFTFLFL 1020 (2161)
Q Consensus 956 ~~~~~~~~igv~~~was~~f--~~~-~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~-----------~~~~~~~~~~~~ 1020 (2161)
+|+|+.+++-.+|.|.+.+. |+. |+.-+.+....+++.||.+ +..|+..+.-. .-..+|+.|++-
T Consensus 4 ~liL~~~~~l~~~l~~sG~i~~YI~P~~~~~~~~a~i~l~ilai~q~~~~~~~~~~~~~~h~h~~~~~~~~~~y~l~~iP 83 (182)
T PF09323_consen 4 FLILLGFGILLFYLILSGKILLYIHPRYIPLLYFAAILLLILAIVQLWRWFRPKRRKEDCHDHGHSKSKKLWSYFLFLIP 83 (182)
T ss_pred HHHHHHHHHHHHHHHHhCcHHHHhCccHHHHHHHHHHHHHHHHHHHHHHHHhcccccccccccccccccccHHHHHHHHH
Confidence 46677778888899998864 554 4444444444444444444 34556555443 345667776663
No 39
>PF05297 Herpes_LMP1: Herpesvirus latent membrane protein 1 (LMP1); InterPro: IPR007961 This family consists of several latent membrane protein 1 (LMP1) proteins, mostly from Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4). LMP1 of HHV-4 is a 62-65 kDa plasma membrane protein possessing six membrane-spanning regions, a short cytoplasmic N terminus and a long cytoplasmic carboxy tail of 200 amino acids. HHV-4 latent membrane protein 1 (LMP1) is essential for HHV-4-mediated transformation and has been associated with several cases of malignancies. HHV-4-like viruses in Macaca fascicularis (Cynomolgus monkeys) have been associated with high lymphoma rates in immunosuppressed monkeys.; GO: 0019087 transformation of host cell by virus, 0016021 integral to membrane; PDB: 1CZY_E 1ZMS_B.
Probab=32.17 E-value=15 Score=44.55 Aligned_cols=52 Identities=21% Similarity=0.425 Sum_probs=0.0
Q ss_pred HHHHHHHHHHHHHHHhhhhhcccccceeeehhhHHHHHHHHHHHHHHHHHhhhc
Q 000112 949 IGVAFLLLLLLIVLAIGVIHHWASNNFYLTRTQMFFVCFLAFLLGLAAFLVGWF 1002 (2161)
Q Consensus 949 ~gvaf~l~~~~~~~~igv~~~was~~f~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1002 (2161)
+|..|+.+.++++++|=.. .|-=.++=-|-.+ ++..++||+||+.-.++..+
T Consensus 107 ~Gi~~l~l~~lLaL~vW~Y-m~lLr~~GAs~Wt-iLaFcLAF~LaivlLIIAv~ 158 (381)
T PF05297_consen 107 VGIVILFLCCLLALGVWFY-MWLLRELGASFWT-ILAFCLAFLLAIVLLIIAVL 158 (381)
T ss_dssp ------------------------------------------------------
T ss_pred HHHHHHHHHHHHHHHHHHH-HHHHHHhhhHHHH-HHHHHHHHHHHHHHHHHHHH
Confidence 4666666666666655322 4433332223333 34445677777766655554
No 40
>cd02620 Peptidase_C1A_CathepsinB Cathepsin B group; composed of cathepsin B and similar proteins, including tubulointerstitial nephritis antigen (TIN-Ag). Cathepsin B is a lysosomal papain-like cysteine peptidase which is expressed in all tissues and functions primarily as an exopeptidase through its carboxydipeptidyl activity. Together with other cathepsins, it is involved in the degradation of proteins, proenzyme activation, Ag processing, metabolism and apoptosis. Cathepsin B has been implicated in a number of human diseases such as cancer, rheumatoid arthritis, osteoporosis and Alzheimer's disease. The unique carboxydipeptidyl activity of cathepsin B is attributed to the presence of an occluding loop in its active site which favors the binding of the C-termini of substrate proteins. Some members of this group do not possess the occluding loop. TIN-Ag is an extracellular matrix basement protein which was originally identified as a target Ag involved in anti-tubular basement membrane
Probab=32.12 E-value=85 Score=36.47 Aligned_cols=27 Identities=26% Similarity=0.389 Sum_probs=23.6
Q ss_pred cCceeEEEEEEEEcCeEEEEEecCCCC
Q 000112 1927 QGHAYSILQVREVDGHKLVQIRNPWAN 1953 (2161)
Q Consensus 1927 sGHAYSVLdV~EVdG~RLVRLRNPWG~ 1953 (2161)
.+||=.|++..+-+|.+...+||-||.
T Consensus 184 ~~HaV~iVGyg~~~g~~YWivrNSWG~ 210 (236)
T cd02620 184 GGHAVKIIGWGVENGVPYWLAANSWGT 210 (236)
T ss_pred CCeEEEEEEEeccCCeeEEEEEeCCCC
Confidence 579999999976678899999999994
No 41
>PF09586 YfhO: Bacterial membrane protein YfhO; InterPro: IPR018580 The yfhO gene is transcribed in Difco sporulation medium and the transcription is affected by the YvrGHb two-component system. Some members of this family have been annotated as putative ABC transporter permease proteins.
Probab=30.26 E-value=1.1e+02 Score=41.43 Aligned_cols=24 Identities=33% Similarity=0.375 Sum_probs=17.1
Q ss_pred cccccchhhHHHHHHHHHh-hhcCc
Q 000112 846 YRFSLSSAICVGIFAAVLV-AFCGA 869 (2161)
Q Consensus 846 yr~~~~sav~~~~~~~v~~-~~~~~ 869 (2161)
.||-.++.+.+|+-+++|+ ++++-
T Consensus 214 ~~~~~~~ilg~~lsa~~llP~~~~~ 238 (843)
T PF09586_consen 214 LRFIGSSILGVGLSAFLLLPTILSL 238 (843)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 5677777777788788777 66543
No 42
>COG0815 Lnt Apolipoprotein N-acyltransferase [Cell envelope biogenesis, outer membrane]
Probab=29.30 E-value=1.2e+02 Score=39.81 Aligned_cols=77 Identities=18% Similarity=0.123 Sum_probs=43.7
Q ss_pred hHHHHHHHHHHHHHHHHHHhhhhccchhHHHHHHHHHHHHHhhcceEEEEEecC-------CCCC-----CCCCCcceeh
Q 000112 99 AIIMAGTALLLAFYSIMLWWRTQWQSSRAVAVLLLLAVALLCAYELSAVYVTAG-------SHAS-----DRYSPSGFFF 166 (2161)
Q Consensus 99 a~~~~g~~~~~~~y~~~~w~~t~w~s~~~~~~~~~~~~~l~~~~~~~~~yvt~~-------~~~~-----~~~sps~~ff 166 (2161)
..++.+.++.+++|-.+.+|-.+ +.+.+..+.. ++--++|..--.+=+| -+.. .++-|-+=-+
T Consensus 97 ~~~~~ll~~~lal~~~l~~~~~~-~~~~~~~~~~----~~w~~~E~lR~~~~tGFpW~~~Gy~q~~~~~l~q~a~i~Gv~ 171 (518)
T COG0815 97 PLLVLLLAAWLALFLLLVAVLTC-RLWFALLVVP----SAWVAAEWLRGWSLTGFPWLLLGYSQWSPSPLLQLASLGGVW 171 (518)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHH-HHhhhhHHHH----HHHHHHHHHHhccCcCCchhhhchhhccCccccceeeccCHH
Confidence 34557788888888888777655 6666665544 3333445333222222 2222 2233333445
Q ss_pred hhhHHHHHhhhhhh
Q 000112 167 GVSAIALAINMLFI 180 (2161)
Q Consensus 167 ~~sai~~~~n~l~i 180 (2161)
++|.+.+++|+++.
T Consensus 172 ~lsflvv~~~~~~a 185 (518)
T COG0815 172 LLSFLVVAVNALLA 185 (518)
T ss_pred HHHHHHHHHHHHHH
Confidence 67888888888753
No 43
>PF12065 DUF3545: Protein of unknown function (DUF3545); InterPro: IPR021932 This family of proteins is functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 60 and 77 amino acids in length. This protein has two completely conserved residues (R and L) that may be functionally important.
Probab=28.03 E-value=26 Score=34.24 Aligned_cols=10 Identities=70% Similarity=1.358 Sum_probs=8.6
Q ss_pred HHhhHHhhhh
Q 000112 1333 EERKWKEIEA 1342 (2161)
Q Consensus 1333 ~~~~~~~~~~ 1342 (2161)
..|||+||||
T Consensus 23 ~KRKWREIEA 32 (59)
T PF12065_consen 23 KKRKWREIEA 32 (59)
T ss_pred cchhHHHHHH
Confidence 4589999998
No 44
>KOG2341 consensus TATA box binding protein (TBP)-associated factor, RNA polymerase II [Transcription]
Probab=27.83 E-value=61 Score=42.70 Aligned_cols=27 Identities=19% Similarity=0.031 Sum_probs=17.3
Q ss_pred CCchhhHHHHHHHHhhhccceeEEEEE
Q 000112 1055 KNVSVAFLVLYGVALAIEGWGVVASLK 1081 (2161)
Q Consensus 1055 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1081 (2161)
|-+++.++-+++-..-.+|=+.=+.+.
T Consensus 189 ~t~~~~~~~~p~s~~~~~g~~~ppq~~ 215 (563)
T KOG2341|consen 189 KTLPALRLAVPPSNTFSEGSDPPPQLV 215 (563)
T ss_pred hcchHhhccCCCcccccCCCCCCcccc
Confidence 556777777777776667665544443
No 45
>PF05875 Ceramidase: Ceramidase; InterPro: IPR008901 This entry consists of several ceramidases. Ceramidases are enzymes involved in regulating cellular levels of ceramides, sphingoid bases, and their phosphates.; GO: 0016811 hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds, in linear amides, 0006672 ceramide metabolic process, 0016021 integral to membrane
Probab=27.78 E-value=54 Score=38.56 Aligned_cols=143 Identities=21% Similarity=0.196 Sum_probs=74.2
Q ss_pred CCCCccccccccchhhHHHHHhhHHHhhhcccchhhceeecccccchhhHHHHHHHHHhhhcCceEEEEEeccCCCCCCh
Q 000112 806 GEPNSFASPYASSVYLGWLMASAIALVVTGVLPIVSWFSTYRFSLSSAICVGIFAAVLVAFCGASYLEVVKSREDQVPTK 885 (2161)
Q Consensus 806 ~~~~~~~~~y~~~~~~gw~~~~~~~~v~~~~~p~vswf~tyr~~~~sav~~~~~~~v~~~~~~~~~~~v~~~r~~~~p~~ 885 (2161)
|++|+..|||-...+= --|-++ ..++++.-|....|-.+.....+....+++|.+++.-|=-- -++..|
T Consensus 14 CE~nY~~s~yiAEf~N---tlSNl~---fi~~al~gl~~~~~~~~~~~~~l~~~~l~~VGiGS~~FHaT-l~~~~q---- 82 (262)
T PF05875_consen 14 CEENYVVSPYIAEFWN---TLSNLA---FIVAALYGLYLARRRGLERRFALLYLGLALVGIGSFLFHAT-LSYWTQ---- 82 (262)
T ss_pred chhccccCcccchHHH---HHHHHH---HHHHHHHHHHHHhhccccchhHHHHHHHHHHHHhHHHHHhC-hhhhHH----
Confidence 6888999999755432 122222 33355666666666666666666666677776655544322 222222
Q ss_pred hhHHHhhhhhhhHHHHHHhhcccceeecCccccccceeeeehhHHHHHHhhhheeeeec--cchhHHHHHHHHHHHHHHH
Q 000112 886 GDFLAALLPLVCIPALLSLCSGLLKWKDDDWKLSRGVYVFITIGLVLLLGAISAVIVVI--TPWTIGVAFLLLLLLIVLA 963 (2161)
Q Consensus 886 ~dfl~a~lpl~~ipa~~~l~~gl~kw~dd~w~~s~~~~~~~~~gl~ll~~a~~~v~~~i--~~w~~gvaf~l~~~~~~~~ 963 (2161)
|.--|| -+...++-+|-|-++.. -+++.-..+++.|.... +++++.... +|..-.++|..+.+++++.
T Consensus 83 ---l~DelP-----Ml~~~~~~~~~~~~~~~-~~~~~~~~~~~~L~~~~-~~~t~~~~~~~~p~~~~~~f~~~~~~~~~~ 152 (262)
T PF05875_consen 83 ---LLDELP-----MLWATLLFLYIVLTRRY-SSPRYRLALPLLLFIYA-VVVTVLYFVLDNPVFHQIAFASLVLLVILR 152 (262)
T ss_pred ---Hhhhhh-----HHHHHHHHHHHHhcccc-cCchhhHHHHHHHHHHH-HHHHHHHhhhccchhhhhhHHHHHHHHHHH
Confidence 222233 33334444444444433 11222223344443333 444444444 7888788887776666655
Q ss_pred hhh-hhc
Q 000112 964 IGV-IHH 969 (2161)
Q Consensus 964 igv-~~~ 969 (2161)
... +++
T Consensus 153 ~~~~~~~ 159 (262)
T PF05875_consen 153 SIYLIRR 159 (262)
T ss_pred HHHHHHH
Confidence 554 444
No 46
>cd02698 Peptidase_C1A_CathepsinX Cathepsin X; the only papain-like lysosomal cysteine peptidase exhibiting carboxymonopeptidase activity. It can also act as a carboxydipeptidase, like cathepsin B, but has been shown to preferentially cleave substrates through a monopeptidyl carboxypeptidase pathway. The propeptide region of cathepsin X, the shortest among papain-like peptidases, is covalently attached to the active site cysteine in the inactive form of the enzyme. Little is known about the biological function of cathepsin X. Some studies point to a role in early tumorigenesis. A more recent study indicates that cathepsin X expression is restricted to immune cells suggesting a role in phagocytosis and the regulation of the immune response.
Probab=27.18 E-value=1.2e+02 Score=35.25 Aligned_cols=27 Identities=22% Similarity=0.363 Sum_probs=23.1
Q ss_pred cCceeEEEEEEEEc-CeEEEEEecCCCC
Q 000112 1927 QGHAYSILQVREVD-GHKLVQIRNPWAN 1953 (2161)
Q Consensus 1927 sGHAYSVLdV~EVd-G~RLVRLRNPWG~ 1953 (2161)
.+|+=.|++.-+.+ |.+.-.+||-||.
T Consensus 178 ~~HaV~IVGyG~~~~g~~YWiikNSWG~ 205 (239)
T cd02698 178 INHIISVAGWGVDENGVEYWIVRNSWGE 205 (239)
T ss_pred CCeEEEEEEEEecCCCCEEEEEEcCCCc
Confidence 47999999987665 8899999999994
No 47
>PF04405 ScdA_N: Domain of unknown function (DUF542); InterPro: IPR007500 This is a domain of unknown function found at the N terminus of genes involved in cell wall development and nitric oxide protection. ScdA is required for normal cell growth and development; mutants have an increased level of peptidoglycan cross-linking and aberrant cellular morphology, suggesting a role for ScdA in cell wall metabolism. NorA1, NorA2, and YtfE are involved in the nitric oxide response. NorA1 and NorA2, which are similar to YtfE, are co-transcribed with the membrane-bound nitric oxide (NO) reductases. The genes appear to be involved in NO protection but their function is unknown.
Probab=27.01 E-value=42 Score=32.07 Aligned_cols=33 Identities=36% Similarity=0.660 Sum_probs=29.3
Q ss_pred cChhhHHHHhhhc----ccCchHhHhhhhhcCCCcch
Q 000112 511 NDPRITSMLKKRA----REGDRELTSLLQDKGLDPNF 543 (2161)
Q Consensus 511 ~~~~~~~~l~~~~----~~~~~~l~~ll~dkgldpnf 543 (2161)
++|+-++.++|-+ -.|++-|..-.+.+|+||+-
T Consensus 11 ~~p~~a~vf~~~gIDfCCgG~~~L~eA~~~~~ld~~~ 47 (56)
T PF04405_consen 11 EDPRAARVFRKYGIDFCCGGNRSLEEACEEKGLDPEE 47 (56)
T ss_pred HChHHHHHHHHcCCcccCCCCchHHHHHHHcCCCHHH
Confidence 6899999999777 67999999999999999974
No 48
>TIGR00570 cdk7 CDK-activating kinase assembly factor MAT1. All proteins in this family for which functions are known are cyclin dependent protein kinases that are components of TFIIH, a complex that is involved in nucleotide excision repair and transcription initiation. Also known as MAT1 (menage a trois 1). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).
Probab=26.92 E-value=96 Score=38.49 Aligned_cols=104 Identities=30% Similarity=0.467 Sum_probs=53.5
Q ss_pred ccccCCCCCchhhHhhhhhhhhhhh----hhhcccceeeeeccchhhhHhhhccchhhhhhhhhhhhhhhhcccCCCcCC
Q 000112 1204 RFRHELSSDYDYRREMCTHARILAL----EEAIDTEWVYMWDKFGGYLLLLLGLTAKAERVQDEVRLRLFLDSIGFSDLS 1279 (2161)
Q Consensus 1204 ~~~~~~~~~~~~~~~~~~~~~~~~~----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1279 (2161)
.|+.-.-.|...-|++-.--||+.. ||-.+|- ..|.-|| |+|.|=| ..| ...|.- .-.
T Consensus 57 ~fr~q~F~D~~vekEV~iRkrv~~i~Nk~e~dF~~l-----~~yNdYL----------E~vEdii-~nL-~~~~d~-~~t 118 (309)
T TIGR00570 57 NFRVQLFEDPTVEKEVDIRKRVLKIYNKREEDFPSL-----REYNDYL----------EEVEDIV-YNL-TNNIDL-ENT 118 (309)
T ss_pred hccccccccHHHHHHHHHHHHHHHHHccchhccCCH-----HHHHHHH----------HHHHHHH-HHh-hcCCcH-HHH
Confidence 3555566777778888887887765 3333321 2344555 2332211 000 001100 113
Q ss_pred hhhhhccCchhhhhHHHHHHhhhhhhhhHHHHHHHHHhhhcccHHHHHHH
Q 000112 1280 AKKIKKWMPEDRRQFEIIQESYIREKEMEEEILMQRREEEGRGKERRKAL 1329 (2161)
Q Consensus 1280 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1329 (2161)
..+|++|--|.+ +.|+++-.|+++ |++.++|+.++|.+-++.|+..
T Consensus 119 e~~l~~y~~~n~---~~I~~n~~~~~~-e~~~~~~~~~~E~~~~~~rr~~ 164 (309)
T TIGR00570 119 KKKIETYQKENK---DVIQKNKEKSTR-EQEELEEALEFEKEEEEQRRLL 164 (309)
T ss_pred HHHHHHHHHHhH---HHHHHHHHHHHh-HHHHHHHHHHHHHHHHHHHHHH
Confidence 456666655544 458888888776 4455555555555555444333
No 49
>PTZ00266 NIMA-related protein kinase; Provisional
Probab=26.88 E-value=89 Score=43.96 Aligned_cols=16 Identities=25% Similarity=0.607 Sum_probs=6.6
Q ss_pred HHHHhhHHHhhhcccc
Q 000112 823 WLMASAIALVVTGVLP 838 (2161)
Q Consensus 823 w~~~~~~~~v~~~~~p 838 (2161)
|.++..+--++||-.|
T Consensus 227 WSLG~ILYELLTGk~P 242 (1021)
T PTZ00266 227 WALGCIIYELCSGKTP 242 (1021)
T ss_pred HHHHHHHHHHHHCCCC
Confidence 3443333334444444
No 50
>cd06899 lectin_legume_LecRK_Arcelin_ConA legume lectins, lectin-like receptor kinases, arcelin, concanavalinA, and alpha-amylase inhibitor. This alignment model includes the legume lectins (also known as agglutinins), the arcelin (also known as phytohemagglutinin-L) family of lectin-like defense proteins, the LecRK family of lectin-like receptor kinases, concanavalinA (ConA), and an alpha-amylase inhibitor. Arcelin is a major seed glycoprotein discovered in kidney beans (Phaseolus vulgaris) that has insecticidal properties and protects the seeds from predation by larvae of various bruchids. Arcelin is devoid of monosaccharide binding properties and lacks a key metal-binding loop that is present in other members of this family. Phytohaemagglutinin (PHA) is a lectin found in plants, especially beans, that affects cell metabolism by inducing mitosis and by altering the permeability of the cell membrane to various proteins. PHA agglutinates most mammalian red blood cell types by bindin
Probab=26.83 E-value=2.2e+02 Score=33.46 Aligned_cols=37 Identities=14% Similarity=0.129 Sum_probs=30.3
Q ss_pred eeeeccccceecCCceEEEEEEEeccccceeeeeccc
Q 000112 1492 AKDWSISATSIADGRWHIVTMTIDADIGEATCYLDGG 1528 (2161)
Q Consensus 1492 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1528 (2161)
+..|......+.||++|.|.|.-|+.+..-+.||+..
T Consensus 150 ~~~~~~~~~~l~~g~~~~v~I~Y~~~~~~L~V~l~~~ 186 (236)
T cd06899 150 AGYWDDDGGKLKSGKPMQAWIDYDSSSKRLSVTLAYS 186 (236)
T ss_pred eeccccccccccCCCeEEEEEEEcCCCCEEEEEEEeC
Confidence 3556555445789999999999999999999999854
No 51
>PF05154 TM2: TM2 domain; InterPro: IPR007829 This domain is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown; however, it occurs in a wide range of protein contexts.
Probab=26.64 E-value=21 Score=32.97 Aligned_cols=33 Identities=36% Similarity=0.609 Sum_probs=22.4
Q ss_pred cchhhHHhhhccceeeeeeecchhhcccch---HHHHHHH
Q 000112 290 QSRVAALFVAGTSRVFLICFGVHYWYLGHC---ISYAVVA 326 (2161)
Q Consensus 290 ~~~~~~~~~a~~~r~~li~fg~~~w~lghc---i~y~~~~ 326 (2161)
||+.++.+.+- |+-.||+|.+|+||= +.|.++.
T Consensus 3 K~~~~a~lL~~----~lG~~G~hrfYlg~~~~g~~~l~~~ 38 (51)
T PF05154_consen 3 KSKWIAYLLSF----FLGWFGLHRFYLGKYGKGILYLLTF 38 (51)
T ss_pred cCHHHHHHHHH----HHhhccccceecCchHHHHHHHHHH
Confidence 56666666542 566899999999985 4444444
No 52
>PF09991 DUF2232: Predicted membrane protein (DUF2232); InterPro: IPR018710 This family of bacterial and eukaryotic proteins has no known function; however, this signature belongs to a Pfam Gx transporter clan.
Probab=26.16 E-value=61 Score=37.56 Aligned_cols=87 Identities=17% Similarity=0.305 Sum_probs=44.7
Q ss_pred ccccccceeeeehhHHHHHHhhhheeeeeccchhHHHHHHHHHHHHHHHhhhhhcccccceeeehhhHHHHHHHHHHHH-
Q 000112 915 DWKLSRGVYVFITIGLVLLLGAISAVIVVITPWTIGVAFLLLLLLIVLAIGVIHHWASNNFYLTRTQMFFVCFLAFLLG- 993 (2161)
Q Consensus 915 ~w~~s~~~~~~~~~gl~ll~~a~~~v~~~i~~w~~gvaf~l~~~~~~~~igv~~~was~~f~~~~~~~~~~~~~~~~~~- 993 (2161)
.|++++..-.+..+++++.+-..........--..-+..++..++++-+++++|+|..+. -++|.=-.+..++.+++.
T Consensus 199 ~~~lP~~~~~~~i~~~~~~l~~~~~~~~~~~~i~~Nl~~v~~~l~~~qGla~~~~~~~~~-~~~~~~~~l~~~~~i~~~~ 277 (290)
T PF09991_consen 199 EWRLPRWLIWLLIVALALSLVGGGFGGSWLQIIGLNLLIVLSFLFFIQGLAVIHFFLKRR-KMSKFLRVLLYILLILFPF 277 (290)
T ss_pred HHhCcHHHHHHHHHHHHHHHHhcccchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHc-CCcHHHHHHHHHHHHHHHH
Confidence 488887654333344333321111111111122234556666777888999999998776 666654333333333332
Q ss_pred --HHHHHhhhc
Q 000112 994 --LAAFLVGWF 1002 (2161)
Q Consensus 994 --~~~~~~~~~ 1002 (2161)
..-.++|.+
T Consensus 278 ~~~~l~~lG~~ 288 (290)
T PF09991_consen 278 LIVILALLGLI 288 (290)
T ss_pred HHHHHHHHHhh
Confidence 334445544
No 53
>PF14402 7TM_transglut: 7 transmembrane helices usually fused to an inactive transglutaminase
Probab=25.59 E-value=86 Score=38.85 Aligned_cols=55 Identities=25% Similarity=0.473 Sum_probs=42.1
Q ss_pred eeccchhHHHHHH-------HHHHHHHHHhhhhhcccccceeeehhhHHHHHHHHHHHHHHHHHhhh
Q 000112 942 VVITPWTIGVAFL-------LLLLLIVLAIGVIHHWASNNFYLTRTQMFFVCFLAFLLGLAAFLVGW 1001 (2161)
Q Consensus 942 ~~i~~w~~gvaf~-------l~~~~~~~~igv~~~was~~f~~~~~~~~~~~~~~~~~~~~~~~~~~ 1001 (2161)
+|.-|-.|.+||. ++++++++++|.+-+ +||+|..+++|-=+|-++....++++.
T Consensus 146 GTFmPVLIAlAF~eT~L~~Gli~FllIV~~GL~iR-----~yLs~LnLLlV~RisaVli~VI~ii~~ 207 (313)
T PF14402_consen 146 GTFMPVLIALAFRETQLLWGLILFLLIVAIGLLIR-----SYLSHLNLLLVPRISAVLIVVILIIAA 207 (313)
T ss_pred cchHHHHHHHHHHHhhhHHHHHHHHHHHHHHHHHH-----HHHHhhhhHHHHHHHHHHHHHHHHHHH
Confidence 4566777777775 778889999999766 699999999998777777666665544
No 54
>PF06439 DUF1080: Domain of Unknown Function (DUF1080); InterPro: IPR010496 This is a family of proteins of unknown function.; PDB: 3IMM_B 3NMB_A 3S5Q_A 3OSD_A 3HBK_A 3H3L_A 3U1X_A.
Probab=24.90 E-value=2.6e+02 Score=30.38 Aligned_cols=102 Identities=17% Similarity=0.248 Sum_probs=52.8
Q ss_pred ccCcccccccccccccceeEEEEEEEeecCCCceeee-cc-----cccchhhheeeecccccc----ccccceeEEEEEe
Q 000112 1415 TSGRHCGQIDASICQSQKVSFSIAVMIQPESGPVCLL-GT-----EFQKKVCWEILVAGSEQG----IEAGQVGLRLITK 1484 (2161)
Q Consensus 1415 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~-----~~~~~~~~~~~~~~~~~~----~~~~~~~~~~~~~ 1484 (2161)
..+.+.|.+=... ......+++-++++| +|-..++ -. +.....|.|+-+.....+ -..|.+=-+
T Consensus 38 ~~~~~~~~l~~~~-~~~df~l~~d~k~~~-~~~sGi~~r~~~~~~~~~~~~gy~~~i~~~~~~~~~~~~~G~~~~~---- 111 (185)
T PF06439_consen 38 SSGSGGGYLYTDK-KFSDFELEVDFKITP-GGNSGIFFRAQSPGDGQDWNNGYEFQIDNSGGGTGLPNSTGSLYDE---- 111 (185)
T ss_dssp GGESSS--EEESS-EBSSEEEEEEEEE-T-T-EEEEEEEESSECCSSGGGTSEEEEEE-TTTCSTTTTSTTSBTTT----
T ss_pred cCCCCcceEEECC-ccccEEEEEEEEECC-CCCeEEEEEeccccCCCCcceEEEEEEECCCCccCCCCccceEEEe----
Confidence 3444555444443 556677888888754 4433332 22 245677888877776555 111111000
Q ss_pred cCCceeeeeeeccccceecCCceEEEEEEEeccccceeeeecccc
Q 000112 1485 GDRQTTVAKDWSISATSIADGRWHIVTMTIDADIGEATCYLDGGF 1529 (2161)
Q Consensus 1485 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1529 (2161)
-.++ .-.-....+..|+||.++|++..+. .++|+||..
T Consensus 112 ~~~~-----~~~~~~~~~~~~~W~~~~I~~~g~~--i~v~vnG~~ 149 (185)
T PF06439_consen 112 PPWQ-----LEPSVNVAIPPGEWNTVRIVVKGNR--ITVWVNGKP 149 (185)
T ss_dssp B-TC-----B-SSS--S--TTSEEEEEEEEETTE--EEEEETTEE
T ss_pred cccc-----ccccccccCCCCceEEEEEEEECCE--EEEEECCEE
Confidence 0000 0122344578899999999998776 889999964
No 55
>PRK11588 hypothetical protein; Provisional
Probab=24.04 E-value=2e+02 Score=38.00 Aligned_cols=46 Identities=15% Similarity=0.249 Sum_probs=36.5
Q ss_pred HhhhhhhhHHHHHHhhcccceeecCccccccceeeeehhHHHHHHhhhheeeeeccchhHHHH
Q 000112 890 AALLPLVCIPALLSLCSGLLKWKDDDWKLSRGVYVFITIGLVLLLGAISAVIVVITPWTIGVA 952 (2161)
Q Consensus 890 ~a~lpl~~ipa~~~l~~gl~kw~dd~w~~s~~~~~~~~~gl~ll~~a~~~v~~~i~~w~~gva 952 (2161)
.++.|+ ++|-+.+|| .=-.+|.+++++-..+.+...++||.++|+|
T Consensus 172 i~f~pi-~v~l~~alG----------------yD~ivg~ai~~lg~~iGf~~s~~NPftvgIA 217 (506)
T PRK11588 172 IAFAII-IAPLMVRLG----------------YDSITTVLVTYVATQIGFATSWMNPFSVAIA 217 (506)
T ss_pred HHHHHH-HHHHHHHhC----------------CcHHHHHHHHHHHhhhhhcccccCccHHHHH
Confidence 366664 567666665 2347899999999999999999999999887
No 56
>KOG3011 consensus Ubiquitin-conjugating enzyme [Posttranslational modification, protein turnover, chaperones]
Probab=23.50 E-value=1.7e+02 Score=35.56 Aligned_cols=117 Identities=21% Similarity=0.306 Sum_probs=63.7
Q ss_pred hhhHHHHHHHHHhhhcCceEEEEEeccCCCCCChhhHHHhhhhhhhHHHHHHhhcccceeecCcc---------------
Q 000112 852 SAICVGIFAAVLVAFCGASYLEVVKSREDQVPTKGDFLAALLPLVCIPALLSLCSGLLKWKDDDW--------------- 916 (2161)
Q Consensus 852 sav~~~~~~~v~~~~~~~~~~~v~~~r~~~~p~~~dfl~a~lpl~~ipa~~~l~~gl~kw~dd~w--------------- 916 (2161)
.|.|.++|+.++...-|+-=.+ +.+..+|--+|=-..-=|++|+|.|--|.|
T Consensus 83 ~~~c~~lf~~~~~~ii~~~~s~-------------~~~~~~La~~aG~i~AD~~SGl~HWaaD~~Gsv~tP~vG~~f~rf 149 (293)
T KOG3011|consen 83 AAGCTTLFVSFAKSIIGGFGSH-------------LWLEPALAAYAGYITADLGSGVYHWAADNYGSVSTPWVGRQFERF 149 (293)
T ss_pred HhhhHHHHHHHHHHHHHhhhhh-------------hhHHHHHHHHHHHHHHhhhcceeEeeccccCccccchhHHHHHHH
Confidence 4568888888777655543211 223333333333334468999999966655
Q ss_pred --------ccccceeeeehhHHHHHHhhhheeeeecc---chhHHHHHHHHHHHHHHHhhhhhcccccceeeehhhHHH
Q 000112 917 --------KLSRGVYVFITIGLVLLLGAISAVIVVIT---PWTIGVAFLLLLLLIVLAIGVIHHWASNNFYLTRTQMFF 984 (2161)
Q Consensus 917 --------~~s~~~~~~~~~gl~ll~~a~~~v~~~i~---~w~~gvaf~l~~~~~~~~igv~~~was~~f~~~~~~~~~ 984 (2161)
.+.|.-++=. +-|+--|+-+.+-+.. -|.+--+|.+.+-+.|+----||.|+---|=|+|.-+++
T Consensus 150 reHH~dP~tITr~~f~~~---~~ll~~a~~f~v~~~d~~~q~~~~h~fV~~~~i~v~~tnQiHkWsHTy~gLP~wVv~L 225 (293)
T KOG3011|consen 150 QEHHKDPWTITRRQFANN---LHLLARAYTFIVLPLDLAFQDPVFHGFVFLFAICVLFTNQIHKWSHTYSGLPPWVVLL 225 (293)
T ss_pred HhccCCcceeeHHHHhhh---hHHHHHhheeEecCHHHHhhcccHHHHHHHHHHHHHHHHHHHHHHhhhccCchHHHHH
Confidence 4444443333 2233324444443321 122223444444444444556999999888899865543
No 57
>COG4870 Cysteine protease [Posttranslational modification, protein turnover, chaperones]
Probab=22.78 E-value=69 Score=40.39 Aligned_cols=49 Identities=31% Similarity=0.577 Sum_probs=35.2
Q ss_pred cCcccCceeEEEEEEEEc----------CeEEEEEecCCCCCccccCCCCCCCccccHHHHhhhCCCCCCCCCeeecchh
Q 000112 1923 SGIVQGHAYSILQVREVD----------GHKLVQIRNPWANEVEWNGPWSDSSPEWTDRMKHKLKHVPQSKDGIFWMSWQ 1992 (2161)
Q Consensus 1923 ~GLVsGHAYSVLdV~EVd----------G~RLVRLRNPWG~~~EWKG~WSD~S~eWTeeLKkkL~~~p~sDDGtFWMSfE 1992 (2161)
.+-..|||=.|++...-- |.--+++||-||. .| .++|-|||+++
T Consensus 260 s~~~~gHAv~iVGyDDs~~~n~~~~~~~g~GAfiikNSWGt--~w------------------------G~~GYfwisY~ 313 (372)
T COG4870 260 SGENWGHAVLIVGYDDSFDINNFKYGPPGDGAFIIKNSWGT--NW------------------------GENGYFWISYY 313 (372)
T ss_pred ccccccceEEEEeccccccccccccCCCCCceEEEECcccc--cc------------------------ccCceEEEEee
Confidence 345679999999876421 2236889999995 33 35799999999
Q ss_pred hHhhc
Q 000112 1993 DFQIH 1997 (2161)
Q Consensus 1993 DFLky 1997 (2161)
+-..-
T Consensus 314 ya~~g 318 (372)
T COG4870 314 YALNG 318 (372)
T ss_pred ecccc
Confidence 86554
No 58
>cd01951 lectin_L-type legume lectins. The L-type (legume-type) lectins are a highly diverse family of carbohydrate binding proteins that generally display no enzymatic activity toward the sugars they bind. This family includes arcelin, concanavalinA, the lectin-like receptor kinases, the ERGIC-53/VIP36/EMP46 type1 transmembrane proteins, and an alpha-amylase inhibitor. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers. Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely.
Probab=22.63 E-value=3.7e+02 Score=30.89 Aligned_cols=50 Identities=22% Similarity=0.286 Sum_probs=32.5
Q ss_pred CceEEEEEEEeccccceeeeecccccccccccccccccccccCCceEEEec
Q 000112 1505 GRWHIVTMTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGV 1555 (2161)
Q Consensus 1505 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1555 (2161)
|+||.|.|+.|+.++.-+.++|+.-.....-+..++.-.- ....+++||+
T Consensus 154 g~~~~v~I~Y~~~~~~L~v~l~~~~~~~~~~l~~~~~l~~-~~~~~~yvGF 203 (223)
T cd01951 154 GNEHTVRITYDPTTNTLTVYLDNGSTLTSLDITIPVDLIQ-LGPTKAYFGF 203 (223)
T ss_pred CCEEEEEEEEeCCCCEEEEEECCCCccccccEEEeeeecc-cCCCcEEEEE
Confidence 9999999999999999999999764312122222222211 1246777765
No 59
>PRK02509 hypothetical protein; Provisional
Probab=22.05 E-value=1.6e+02 Score=41.25 Aligned_cols=34 Identities=32% Similarity=0.527 Sum_probs=24.7
Q ss_pred hhhHHHHHHhhcccc---ee--------------ecCccccccceeeeehh
Q 000112 895 LVCIPALLSLCSGLL---KW--------------KDDDWKLSRGVYVFITI 928 (2161)
Q Consensus 895 l~~ipa~~~l~~gl~---kw--------------~dd~w~~s~~~~~~~~~ 928 (2161)
+..||++++++.|+. .| +|--....-|-|+|.-=
T Consensus 188 ~~~i~~~~sl~~g~~~~~~W~~~l~f~n~~~Fg~~DP~Fg~DisFYvF~LP 238 (973)
T PRK02509 188 LRGIAIILSLAFGLILSGNWARVLQYFHSTPFNETDPLFGRDISFYIFQLP 238 (973)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHhCCCCCCCCCCCCCCCcEEEEEehH
Confidence 556777777777753 34 78888888999998643
No 60
>TIGR00917 2A060601 Niemann-Pick C type protein family. The model describes Niemann-Pick C type protein in eukaryotes. The defective protein has been associated with Niemann-Pick disease, which is described in humans as an autosomal recessive lipidosis. The disease is characterized by the lysosomal accumulation of unesterified cholesterol. The protein is an integral membrane protein, which indicates that it is most likely involved in cholesterol transport or acts as some component of cholesterol homeostasis.
Probab=21.86 E-value=39 Score=47.98 Aligned_cols=78 Identities=23% Similarity=0.292 Sum_probs=42.9
Q ss_pred HHHHHHHHhhh------hhcccccceee-----------ehhh----HH----HHHHHHHHHHHHHHHhhhcCCCCcccc
Q 000112 956 LLLLIVLAIGV------IHHWASNNFYL-----------TRTQ----MF----FVCFLAFLLGLAAFLVGWFDDKPFVGA 1010 (2161)
Q Consensus 956 ~~~~~~~~igv------~~~was~~f~~-----------~~~~----~~----~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1010 (2161)
++-++|+|||| +|.|...+-.- +..| ++ --++++-+.-.+||++|.+-+-|-+-
T Consensus 640 v~PFLvL~IGVD~ifilv~~~~r~~~~~~~~~~~~~~~~~~~~ri~~~l~~~G~sI~ltslt~~~aF~~g~~s~~Pavr- 718 (1204)
T TIGR00917 640 VIPFLVLAVGVDNIFILVQTYQRLERFYREVGVDNEQELTLEQQLGRALGEVGPSITLASLSESLAFFLGALSKMPAVR- 718 (1204)
T ss_pred HHHHHHHHHHhhHHHHHHHHHHHhhhccccccccccccCCHHHHHHHHHHHhhHHHHHHHHHHHHHHHHHhccCChHHH-
Confidence 45577889998 56675433210 2212 11 34667777888899999998777442
Q ss_pred chhHHHHHHHhhccceeeeccCCE
Q 000112 1011 SVGYFTFLFLLAGRALTVLLSPPI 1034 (2161)
Q Consensus 1011 ~~~~~~~~~~~~~~~~~~~~~~~~ 1034 (2161)
..|.++-+.++.-=.+++.+-|++
T Consensus 719 ~F~~~aa~av~~~fll~it~f~al 742 (1204)
T TIGR00917 719 AFSLFAGLAVFIDFLLQITAFVAL 742 (1204)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHH
Confidence 334444333333333333333333
No 61
>PF04123 DUF373: Domain of unknown function (DUF373); InterPro: IPR007254 This archaeal family of unknown function is predicted to be an integral membrane protein with six transmembrane regions.
Probab=21.79 E-value=40 Score=42.08 Aligned_cols=138 Identities=23% Similarity=0.429 Sum_probs=77.5
Q ss_pred eee-ehhHHHHHHhhhheeeeeccchhHHHHHHHHHHHH-HHHhhh---hhccccc---ceeeehhhHHHHHHHHHHHHH
Q 000112 923 YVF-ITIGLVLLLGAISAVIVVITPWTIGVAFLLLLLLI-VLAIGV---IHHWASN---NFYLTRTQMFFVCFLAFLLGL 994 (2161)
Q Consensus 923 ~~~-~~~gl~ll~~a~~~v~~~i~~w~~gvaf~l~~~~~-~~~igv---~~~was~---~f~~~~~~~~~~~~~~~~~~~ 994 (2161)
++| +- |++||+-++.+++.. ..+.+++..+++++.+ .=+.|. +.+|.++ .+|-.|.... .-..|.++.+
T Consensus 161 ~~lGvP-G~~lLiy~i~~l~~~-~~~a~~~i~~~iG~yll~kGfgld~~~~~~~~~~~~~l~~g~it~i-tyvva~~l~i 237 (344)
T PF04123_consen 161 TFLGVP-GLILLIYAILALLGY-PAYALGIILLLIGLYLLYKGFGLDDYLREWLERFRESLYEGRITFI-TYVVALLLII 237 (344)
T ss_pred eeecch-HHHHHHHHHHHHHcc-hHHHHHHHHHHHHHHHHHHhcCcHHHHHHHHHHhccccccceeehH-HHHHHHHHHH
Confidence 455 55 999999999998875 4555555555554444 335555 5566554 4666654333 3344444555
Q ss_pred HHHHhhhcC------CCC------ccccchhHHHH--HHHhhccceeeeccCCEEEecCceeeEEEeecccccCCCchhh
Q 000112 995 AAFLVGWFD------DKP------FVGASVGYFTF--LFLLAGRALTVLLSPPIVVYSPRVLPVYVYDAHADCGKNVSVA 1060 (2161)
Q Consensus 995 ~~~~~~~~~------~~~------~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1060 (2161)
.+...|... ..+ |+=.++.||++ +...+||.+.-.+.--...|+--..|.++ .+.
T Consensus 238 ig~i~g~~~~~~~~~~~~~~~~~~f~~~~v~~~~~a~l~~~~G~iid~~l~~~~~~~~~i~~~~~~-----------~a~ 306 (344)
T PF04123_consen 238 IGIIYGYLTLWSYYSISGLIVPGTFLYGSVPWLALAALIASLGKIIDEYLRRDFRLWRYINAPFFV-----------IAI 306 (344)
T ss_pred HHHHHHHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHccCcchHHHHHHHHHH-----------HHH
Confidence 555555441 111 44455666655 44557887776666555555544444432 455
Q ss_pred HHHHHHHHhhhccc
Q 000112 1061 FLVLYGVALAIEGW 1074 (2161)
Q Consensus 1061 ~~~~~~~~~~~~~~ 1074 (2161)
++++|++..-....
T Consensus 307 ~~v~~~~~~~~l~~ 320 (344)
T PF04123_consen 307 GLVLYGFSAYFLSI 320 (344)
T ss_pred HHHHHHHHHHHHhh
Confidence 56677766554443
No 62
>PRK10263 DNA translocase FtsK; Provisional
Probab=21.37 E-value=42 Score=47.79 Aligned_cols=29 Identities=21% Similarity=0.406 Sum_probs=22.8
Q ss_pred EEeeeeehhhchhccceeeeccccccccCCccc
Q 000112 773 LVICITVFTGSVLALGAIVSAKPLEDLGYKGWT 805 (2161)
Q Consensus 773 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 805 (2161)
-.+.|++++.+++.+.+++|+.|.|- +|+
T Consensus 24 E~~gIlLlllAlfL~lALiSYsPsDP----SwS 52 (1355)
T PRK10263 24 EALLILIVLFAVWLMAALLSFNPSDP----SWS 52 (1355)
T ss_pred HHHHHHHHHHHHHHHHHHHhCCccCC----ccc
Confidence 35567778888888999999999774 665
No 63
>PRK15097 cytochrome d terminal oxidase subunit 1; Provisional
Probab=21.29 E-value=2.5e+02 Score=37.32 Aligned_cols=91 Identities=22% Similarity=0.346 Sum_probs=0.0
Q ss_pred hHHHHHHHHHHHHHHHhhhhhcccccceeeehhhHHHHHHHHHHHHHHHHHhhhcCCCCccccchhHHHHHHHhhcccee
Q 000112 948 TIGVAFLLLLLLIVLAIGVIHHWASNNFYLTRTQMFFVCFLAFLLGLAAFLVGWFDDKPFVGASVGYFTFLFLLAGRALT 1027 (2161)
Q Consensus 948 ~~gvaf~l~~~~~~~~igv~~~was~~f~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1027 (2161)
|||..++++++.++-. -.|..+..| +.+-.|-++.++..|...|...||+-.| .||
T Consensus 393 MVg~G~l~~~l~~~~l----~l~~r~~l~-~~rw~L~~~~~~~plp~iA~~~GWi~tE----------------vGR--- 448 (522)
T PRK15097 393 MVACGFLMLAIIALSF----WSVIRNRIG-EKKWLLRAALYGIPLPWIAVEAGWFVAE----------------YGR--- 448 (522)
T ss_pred HHHHHHHHHHHHHHHH----HHHHcCccc-cCcHHHHHHHHHHHHHHHHHHhhhhhee----------------cCC---
Q ss_pred eeccCCEEEecCceeeEEEeecccccCCCchh--------hHHHHHHHHhhhccc
Q 000112 1028 VLLSPPIVVYSPRVLPVYVYDAHADCGKNVSV--------AFLVLYGVALAIEGW 1074 (2161)
Q Consensus 1028 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--------~~~~~~~~~~~~~~~ 1074 (2161)
-|=+|| .+||+ +|..-||+. .|.++|++.+..+.|
T Consensus 449 ----QPWiVy--g~l~T------~~avS~~s~~~v~~sl~~f~~~Y~~L~~~~~~ 491 (522)
T PRK15097 449 ----QPWAIG--EVLPT------AVANSSLTAGDLLFSMVLICGLYTLFLVAELF 491 (522)
T ss_pred ----CCeEEe--ceeeH------hHhcCCCCHHHHHHHHHHHHHHHHHHHHHHHH
No 64
>PF13801 Metal_resist: Heavy-metal resistance; PDB: 3EPV_C 2Y3D_A 2Y3H_D 2Y3G_B 2Y3B_A 2Y39_A 3LAY_H.
Probab=21.20 E-value=3.1e+02 Score=27.37 Aligned_cols=19 Identities=16% Similarity=0.468 Sum_probs=12.8
Q ss_pred CchhhhhHHHHHHhhhhhh
Q 000112 1287 MPEDRRQFEIIQESYIREK 1305 (2161)
Q Consensus 1287 ~~~~~~~~~~~~~~~~~~~ 1305 (2161)
+||++++++-+.+.|..+-
T Consensus 43 t~eQ~~~l~~~~~~~~~~~ 61 (125)
T PF13801_consen 43 TPEQQAKLRALMDEFRQEM 61 (125)
T ss_dssp THHHHHHHHHHHHHHHHHH
T ss_pred CHHHHHHHHHHHHHHHHHH
Confidence 5777777777766666544
No 65
>KOG3583 consensus Uncharacterized conserved protein [Function unknown]
Probab=21.09 E-value=1.7e+02 Score=34.99 Aligned_cols=122 Identities=22% Similarity=0.325 Sum_probs=64.4
Q ss_pred ccceeeeeccchhhhHhhhccchhhhhhh-----h--hhhhhhhhccc-CCCcCChhhhhccCchhhhhHHHHHHhhhhh
Q 000112 1233 DTEWVYMWDKFGGYLLLLLGLTAKAERVQ-----D--EVRLRLFLDSI-GFSDLSAKKIKKWMPEDRRQFEIIQESYIRE 1304 (2161)
Q Consensus 1233 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-----~--~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1304 (2161)
.|-|--|-|||.-.--.+-||+.--..-| . -|-+|+-.|-= -.-....+..--|.- -|--.|+|-
T Consensus 38 ~~~wp~~le~fs~las~ms~l~~~~~k~~~p~lr~~~~~~~~~~~e~detl~r~TeGRVpvfsH-------~lVPdyLRT 110 (279)
T KOG3583|consen 38 KCPWPLMLEKFSTLASFMSSLQSSVRKSGMPHLRSHVLVTQRLQYEPDETLQRATEGRVPVFSH-------ALVPDYLRT 110 (279)
T ss_pred cCccHHHHHHHHHHHHHHHHHHHHHHHccCCccccchhhhhhhhcCchHHHHHHhcCccccccc-------ccchHhhcc
Confidence 35599999999988777888875322111 0 11122211100 000000111111111 123468987
Q ss_pred h---hhHHHHHHHHHhhhcccHHH---H-----------HHHHHHHHhhHHhhhhhhcccCCCCCch-HHHHHHHHH
Q 000112 1305 K---EMEEEILMQRREEEGRGKER---R-----------KALLEKEERKWKEIEASLISSIPNAGNR-EAAAMAAAV 1363 (2161)
Q Consensus 1305 ~---~~~~~~~~~~~~~~~~~~~~---~-----------~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~ 1363 (2161)
| |||+|+.|---|...++..- . -.-+.|++|.| +|++..--|-..-|+ |.|++.|||
T Consensus 111 kPdPe~E~~e~ql~~~aa~~saDaa~kQI~~yNK~is~ll~~lsk~~re~--tEs~~~~piqQT~n~~dT~~lVaaV 185 (279)
T KOG3583|consen 111 KPDPEMENEEGQLDGEAAAKSADAAVKQIAAYNKNISGLLNHLSKVDREH--TESAIEKPIQQTYNRDDTAKLVAAV 185 (279)
T ss_pred CCChhhHHHHhhhhhHHhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHH--HHhhhcCccccccChhHHHHHHHHH
Confidence 6 89999887665555554321 1 12355788888 888776655555554 456666655
No 66
>PF15412 Nse4-Nse3_bdg: Binding domain of Nse4/EID3 to Nse3-MAGE
Probab=21.02 E-value=69 Score=30.49 Aligned_cols=28 Identities=36% Similarity=0.662 Sum_probs=23.8
Q ss_pred eeeecCCCCCHHHHHHHHhhhccCCCcc
Q 000112 182 RMVFNGNGLDVDEYVRRAYKFAYPDGIE 209 (2161)
Q Consensus 182 ~~~~~g~~~d~~~~~r~~y~~a~~d~~~ 209 (2161)
++-+.|+++|+||||.+..+|.-.+..+
T Consensus 18 ~lk~~~~~fd~deFv~~l~~fm~~~~~~ 45 (56)
T PF15412_consen 18 NLKFGGSGFDVDEFVSKLKTFMGGNRFE 45 (56)
T ss_pred HhccCCCccCHHHHHHHHHHHhCcccCC
Confidence 4567799999999999999998876665
No 67
>PLN00122 serine/threonine protein phosphatase 2A; Provisional
Probab=20.99 E-value=1e+02 Score=35.39 Aligned_cols=22 Identities=32% Similarity=0.605 Sum_probs=16.2
Q ss_pred HHHHHHHHHHHHhhHHhhhhhh
Q 000112 1323 KERRKALLEKEERKWKEIEASL 1344 (2161)
Q Consensus 1323 ~~~~~~~~~~~~~~~~~~~~~~ 1344 (2161)
++++++..+|.|.+|+.||..-
T Consensus 142 ~~~~~~~~~~r~~~W~~le~~A 163 (170)
T PLN00122 142 EAKAKEVEEKREATWKRLEEAA 163 (170)
T ss_pred HHHHHHHHHHHHHHHHHHHHHH
Confidence 3456666688889999998643
No 68
>TIGR02916 PEP_his_kin putative PEP-CTERM system histidine kinase. Members of this protein family have a novel N-terminal domain, a single predicted membrane-spanning helix, and a predicted cytosolic histidine kinase domain. We designate this protein PrsK, and its companion DNA-binding response regulator protein (TIGR02915) PrsR. These predicted signal-transducing proteins appear to enable enhancer-dependent transcriptional activation. The prsK gene is often associated with exopolysaccharide biosynthesis genes.
Probab=20.92 E-value=57 Score=42.96 Aligned_cols=36 Identities=11% Similarity=-0.006 Sum_probs=20.3
Q ss_pred hhHHHhhhhhhhHHHHHHhhcccceeecCccccccce
Q 000112 886 GDFLAALLPLVCIPALLSLCSGLLKWKDDDWKLSRGV 922 (2161)
Q Consensus 886 ~dfl~a~lpl~~ipa~~~l~~gl~kw~dd~w~~s~~~ 922 (2161)
..++..+.|..-++.++.+ .+...+.++++.-++..
T Consensus 58 ~~~~~~l~~~~w~~~l~~~-~~~~~~~~~~~~~~~~~ 93 (679)
T TIGR02916 58 VLVLEVFRDAAWLAFLLTL-LRRPATSGKPFNQRPKL 93 (679)
T ss_pred HHHHHHHHHHHHHHHHHHH-hcccccccCcccchHHH
Confidence 3455555666655555543 34466677777665544
No 69
>PF02387 IncFII_repA: IncFII RepA protein family; InterPro: IPR003446 These proteins are plasmid-encoded and essential for plasmid replication; they are also involved in copy control functions.; GO: 0006276 plasmid maintenance
Probab=20.75 E-value=1.1e+02 Score=37.58 Aligned_cols=87 Identities=24% Similarity=0.387 Sum_probs=52.2
Q ss_pred hHhhhccchhhhhhhhhhhhhhhhc---ccCCCcCChhhhhccCchhhhhHHHHHHhhhhhhhhHHHHHHHHHhhhcccH
Q 000112 1247 LLLLLGLTAKAERVQDEVRLRLFLD---SIGFSDLSAKKIKKWMPEDRRQFEIIQESYIREKEMEEEILMQRREEEGRGK 1323 (2161)
Q Consensus 1247 ~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1323 (2161)
+..++|.+.+.-+-+.+-||+..=+ ..|-..+|..++++..- ++..+..++.|+++..+|+
T Consensus 159 ff~l~gi~~~kl~~~~~~~l~~~~~~~~~~~~~~is~~e~~~r~~----------------~~~~~~~~~~r~~~~~~~~ 222 (281)
T PF02387_consen 159 FFMLLGISEDKLRREQRQRLQWENNGLSKQGEEPISLHEARRRAK----------------EQHRKRALDYRKERRAKGK 222 (281)
T ss_pred HHHHhCCCHHHHHHHHHHHHHHHHHhhhhcccCCCcHHHHHHHHH----------------HHHHHHHHHHHHHhHHHHH
Confidence 3567899888766666666665533 44667777777643222 2335567778888887788
Q ss_pred HHHHH--HHHH-HHhhHHhhhhhhcccCC
Q 000112 1324 ERRKA--LLEK-EERKWKEIEASLISSIP 1349 (2161)
Q Consensus 1324 ~~~~~--~~~~-~~~~~~~~~~~~~~~~~ 1349 (2161)
+|++| +.+. |....++|=.-|+.+.|
T Consensus 223 krk~A~rl~~L~e~~ar~~I~~~Lik~ys 251 (281)
T PF02387_consen 223 KRKRARRLAKLDEDEARQEILRQLIKEYS 251 (281)
T ss_pred HHHHHhhccccCHHHHHHHHHHHHHHHcC
Confidence 77654 2222 22334555555665554
No 70
>KOG4661 consensus Hsp27-ERE-TATA-binding protein/Scaffold attachment factor (SAF-B) [Transcription]
Probab=20.24 E-value=1.4e+02 Score=39.53 Aligned_cols=30 Identities=40% Similarity=0.446 Sum_probs=19.5
Q ss_pred HHHHhhhcccHHHHHHHHHHHHhhHHhhhh
Q 000112 1313 MQRREEEGRGKERRKALLEKEERKWKEIEA 1342 (2161)
Q Consensus 1313 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1342 (2161)
+||-+||.--.||||+..|+||+.+-++|.
T Consensus 626 r~RirE~rerEqR~~a~~ERee~eRl~~er 655 (940)
T KOG4661|consen 626 RQRIREEREREQRRKAAVEREELERLKAER 655 (940)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 344444444557888888888877766653
No 71
>PF02460 Patched: Patched family; InterPro: IPR003392 The transmembrane protein, patched, is a receptor for the morphogen Sonic Hedgehog. In Drosophila melanogaster, this protein associates with the smoothened protein to transduce hedgehog signals, leading to the activation of wingless, decapentaplegic and patched itself. It participates in cell interactions that establish pattern within the segment and imaginal disks during development. The mouse homologue may play a role in epidermal development. The human Niemann-Pick C1 protein, defects in which cause Niemann-Pick type II disease, is also a member of this family. This protein is involved in the intracellular trafficking of cholesterol, and may play a role in vesicular trafficking in glia, a process that may be crucial for maintaining the structural and functional integrity of nerve terminals.; GO: 0008158 hedgehog receptor activity, 0016020 membrane
Probab=20.19 E-value=1.1e+02 Score=41.54 Aligned_cols=53 Identities=28% Similarity=0.419 Sum_probs=36.0
Q ss_pred HHHHHHHHHhhh------hhcccccceeeehhhHH--------HHHHHHHHHHHHHHHhhhcCCCCc
Q 000112 955 LLLLLIVLAIGV------IHHWASNNFYLTRTQMF--------FVCFLAFLLGLAAFLVGWFDDKPF 1007 (2161)
Q Consensus 955 l~~~~~~~~igv------~~~was~~f~~~~~~~~--------~~~~~~~~~~~~~~~~~~~~~~~~ 1007 (2161)
.+.-++++|||| +|.|-...-..+..+-+ --.++.-+--.+||++|.+-.-|=
T Consensus 282 ~v~PFLvlgIGvDd~Fi~~~~~~~~~~~~~~~er~~~~l~~~g~SitiTslT~~~aF~ig~~t~~pa 348 (798)
T PF02460_consen 282 LVIPFLVLGIGVDDMFIMIHAWRRTSPDLSVEERMAETLAEAGPSITITSLTNALAFAIGAITPIPA 348 (798)
T ss_pred HHHHHHHHHHHHhceEEeHHHHhhhchhccHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCcHH
Confidence 456778889999 89998776665543222 223444455567899999887773
Done!