Query         000112
Match_columns 2161
No_of_seqs    285 out of 1332
Neff          3.2 
Searched_HMMs 46136
Date          Thu Mar 28 18:53:03 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/000112.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/000112hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 KOG0045 Cytosolic Ca2+-depende 100.0 1.8E-72   4E-77  686.4  35.5  442 1693-2160   15-485 (612)
  2 smart00230 CysPc Calpain-like  100.0 4.4E-69 9.6E-74  616.0  29.1  302 1694-2014    4-317 (318)
  3 cd00044 CysPc Calpains, domain 100.0 1.5E-66 3.2E-71  592.1  27.9  304 1695-2005    3-315 (315)
  4 PF00648 Peptidase_C2:  Calpain 100.0 4.6E-65   1E-69  573.0  20.8  283 1705-2006    1-297 (298)
  5 KOG0045 Cytosolic Ca2+-depende 100.0 5.3E-31 1.2E-35  323.7 -10.7  598  874-1874   13-610 (612)
  6 smart00720 calpain_III calpain  99.8   3E-20 6.4E-25  191.3  13.5  135 2013-2159    4-143 (143)
  7 cd00214 Calpain_III Calpain, s  99.8 4.5E-20 9.8E-25  192.8  13.6  138 2013-2161    6-150 (150)
  8 PF01067 Calpain_III:  Calpain   99.8 2.9E-19 6.3E-24  182.4  10.6  136 2013-2159    5-147 (147)
  9 cd00152 PTX Pentraxins are pla  97.9 8.2E-05 1.8E-09   82.1  12.1  162 1434-1618   32-195 (201)
 10 smart00159 PTX Pentraxin / C-r  97.8 0.00016 3.4E-09   80.4  12.3  161 1434-1619   32-196 (206)
 11 PF13385 Laminin_G_3:  Concanav  97.3 0.00098 2.1E-08   66.5   8.8   80 1498-1594   78-157 (157)
 12 PF00354 Pentaxin:  Pentaxin fa  97.3 0.00092   2E-08   74.5   9.4  159 1434-1618   26-188 (195)
 13 cd00110 LamG Laminin G domain;  94.2    0.31 6.7E-06   50.0   9.8  110 1432-1559   19-129 (151)
 14 smart00210 TSPN Thrombospondin  94.1    0.32 6.9E-06   53.8  10.4   86 1434-1533   53-143 (184)
 15 smart00282 LamG Laminin G doma  93.2    0.71 1.5E-05   47.3  10.4  109 1435-1560    3-112 (135)
 16 smart00560 LamGL LamG-like jel  90.8    0.89 1.9E-05   47.6   8.0   82 1434-1532    2-88  (133)
 17 cd02619 Peptidase_C1 C1 Peptid  81.9     2.3   5E-05   46.4   5.5   49 1924-1998  168-218 (223)
 18 KOG1029 Endocytic adaptor prot  77.6     3.7   8E-05   54.2   6.0   33 1011-1043   72-104 (1118)
 19 PF02210 Laminin_G_2:  Laminin   76.4     5.5 0.00012   39.5   5.7   62 1497-1562   46-107 (128)
 20 cd02248 Peptidase_C1A Peptidas  60.0      22 0.00047   39.4   6.6   43 1925-1993  156-198 (210)
 21 PF03699 UPF0182:  Uncharacteri  58.3      13 0.00028   50.2   5.3   62  865-926    62-154 (774)
 22 PF02057 Glyco_hydro_59:  Glyco  57.9      29 0.00064   46.3   8.2   91 1431-1534  542-638 (669)
 23 KOG4326 Mitochondrial F1F0-ATP  56.9      15 0.00033   36.8   4.2   17 1242-1258   13-29  (81)
 24 TIGR00805 oat sodium-independe  47.6      33 0.00072   45.1   6.4   94  919-1014  328-436 (633)
 25 PF09323 DUF1980:  Domain of un  44.9      57  0.0012   36.8   6.9   57   64-124     4-60  (182)
 26 PF00112 Peptidase_C1:  Papain   44.8      30 0.00064   38.0   4.7   44 1925-1994  163-206 (219)
 27 PF04156 IncA:  IncA protein;    44.5      12 0.00025   41.7   1.5   14 1031-1044   63-76  (191)
 28 PTZ00334 trans-sialidase; Prov  43.2      33 0.00072   46.6   5.5   77 1504-1598  642-724 (780)
 29 COG1390 NtpE Archaeal/vacuolar  40.8 2.5E+02  0.0053   32.8  11.1  113 1255-1384   17-133 (194)
 30 PF07946 DUF1682:  Protein of u  40.6      40 0.00086   41.2   5.3   10 1332-1341  305-314 (321)
 31 PF00054 Laminin_G_1:  Laminin   38.5      40 0.00087   35.5   4.3   51 1472-1530   26-76  (131)
 32 cd08045 TAF4 TATA Binding Prot  38.2      17 0.00037   41.8   1.7   44 1353-1420  166-209 (212)
 33 KOG1029 Endocytic adaptor prot  37.6      43 0.00093   45.1   5.0   43 1290-1332  356-398 (1118)
 34 PTZ00266 NIMA-related protein   37.5      53  0.0012   46.0   6.2   10 1377-1386  508-517 (1021)
 35 PLN02316 synthase/transferase   36.9      53  0.0012   46.1   6.1   16 1837-1852  688-703 (1036)
 36 PF09472 MtrF:  Tetrahydrometha  35.4      12 0.00027   36.7  -0.0   47  796-842    17-64  (64)
 37 KOG1144 Translation initiation  33.1   1E+02  0.0022   42.1   7.2   17 1521-1537  397-413 (1064)
 38 PF09323 DUF1980:  Domain of un  32.6      76  0.0016   35.8   5.5   65  956-1020    4-83  (182)
 39 PF05297 Herpes_LMP1:  Herpesvi  32.2      15 0.00033   44.5   0.0   52  949-1002  107-158 (381)
 40 cd02620 Peptidase_C1A_Cathepsi  32.1      85  0.0018   36.5   5.9   27 1927-1953  184-210 (236)
 41 PF09586 YfhO:  Bacterial membr  30.3 1.1E+02  0.0023   41.4   7.1   24  846-869   214-238 (843)
 42 COG0815 Lnt Apolipoprotein N-a  29.3 1.2E+02  0.0026   39.8   7.1   77   99-180    97-185 (518)
 43 PF12065 DUF3545:  Protein of u  28.0      26 0.00055   34.2   0.7   10 1333-1342   23-32  (59)
 44 KOG2341 TATA box binding prote  27.8      61  0.0013   42.7   4.2   27 1055-1081  189-215 (563)
 45 PF05875 Ceramidase:  Ceramidas  27.8      54  0.0012   38.6   3.5  143  806-969    14-159 (262)
 46 cd02698 Peptidase_C1A_Cathepsi  27.2 1.2E+02  0.0027   35.2   6.1   27 1927-1953  178-205 (239)
 47 PF04405 ScdA_N:  Domain of Unk  27.0      42 0.00092   32.1   2.0   33  511-543    11-47  (56)
 48 TIGR00570 cdk7 CDK-activating   26.9      96  0.0021   38.5   5.3  104 1204-1329   57-164 (309)
 49 PTZ00266 NIMA-related protein   26.9      89  0.0019   44.0   5.6   16  823-838   227-242 (1021)
 50 cd06899 lectin_legume_LecRK_Ar  26.8 2.2E+02  0.0047   33.5   8.0   37 1492-1528  150-186 (236)
 51 PF05154 TM2:  TM2 domain;  Int  26.6      21 0.00046   33.0  -0.0   33  290-326     3-38  (51)
 52 PF09991 DUF2232:  Predicted me  26.2      61  0.0013   37.6   3.5   87  915-1002  199-288 (290)
 53 PF14402 7TM_transglut:  7 tran  25.6      86  0.0019   38.8   4.6   55  942-1001  146-207 (313)
 54 PF06439 DUF1080:  Domain of Un  24.9 2.6E+02  0.0056   30.4   7.7  102 1415-1529   38-149 (185)
 55 PRK11588 hypothetical protein;  24.0   2E+02  0.0043   38.0   7.6   46  890-952   172-217 (506)
 56 KOG3011 Ubiquitin-conjugating   23.5 1.7E+02  0.0038   35.6   6.3  117  852-984    83-225 (293)
 57 COG4870 Cysteine protease [Pos  22.8      69  0.0015   40.4   3.2   49 1923-1997  260-318 (372)
 58 cd01951 lectin_L-type legume l  22.6 3.7E+02  0.0079   30.9   8.7   50 1505-1555  154-203 (223)
 59 PRK02509 hypothetical protein;  22.1 1.6E+02  0.0035   41.3   6.6   34  895-928   188-238 (973)
 60 TIGR00917 2A060601 Niemann-Pic  21.9      39 0.00084   48.0   1.0   78  956-1034  640-742 (1204)
 61 PF04123 DUF373:  Domain of unk  21.8      40 0.00086   42.1   1.0  138  923-1074  161-320 (344)
 62 PRK10263 DNA translocase FtsK;  21.4      42 0.00091   47.8   1.2   29  773-805    24-52  (1355)
 63 PRK15097 cytochrome d terminal  21.3 2.5E+02  0.0053   37.3   7.6   91  948-1074  393-491 (522)
 64 PF13801 Metal_resist:  Heavy-m  21.2 3.1E+02  0.0068   27.4   6.9   19 1287-1305   43-61  (125)
 65 KOG3583 Uncharacterized conser  21.1 1.7E+02  0.0037   35.0   5.6  122 1233-1363   38-185 (279)
 66 PF15412 Nse4-Nse3_bdg:  Bindin  21.0      69  0.0015   30.5   2.1   28  182-209    18-45  (56)
 67 PLN00122 serine/threonine prot  21.0   1E+02  0.0022   35.4   3.8   22 1323-1344  142-163 (170)
 68 TIGR02916 PEP_his_kin putative  20.9      57  0.0012   43.0   2.2   36  886-922    58-93  (679)
 69 PF02387 IncFII_repA:  IncFII R  20.7 1.1E+02  0.0023   37.6   4.2   87 1247-1349  159-251 (281)
 70 KOG4661 Hsp27-ERE-TATA-binding  20.2 1.4E+02  0.0029   39.5   5.0   30 1313-1342  626-655 (940)
 71 PF02460 Patched:  Patched fami  20.2 1.1E+02  0.0023   41.5   4.5   53  955-1007  282-348 (798)

No 1  
>KOG0045 consensus Cytosolic Ca2+-dependent cysteine protease (calpain), large subunit (EF-Hand protein superfamily) [Posttranslational modification, protein turnover, chaperones; Signal transduction mechanisms]
Probab=100.00  E-value=1.8e-72  Score=686.37  Aligned_cols=442  Identities=38%  Similarity=0.687  Sum_probs=357.5

Q ss_pred             HHHHHHHhcCCCceecCCCCCCCCCcccCCCCCCccccCcceeeccccccccCccCCCceeecCCCCCCCcccCCCCCch
Q 000112         1693 AVKEALSARGERQFTDHEFPPDDQSLYVDPGNPPSKLQVVAEWMRPSEIVKESRLDCQPCLFSGAVNPSDVCQGRLGDCW 1772 (2161)
Q Consensus      1693 aIKE~L~arGe~~FeDpEFPPsdsSLy~Dp~~P~sklq~~IqWkRPsEI~~e~~~ds~P~LF~ggIsPsDVkQG~LGDCW 1772 (2161)
                      .+++.|...+ ..|+|++|||+++|++.+...|..+. ..+.|+||+|++.      +|+++.+++++.||+||.+||||
T Consensus        15 ~~~~~cl~~~-~~F~D~~FP~~~~Sl~~~~~~p~~~~-~~i~W~RP~ei~~------~p~~i~~~~~~~di~Qg~lgdCw   86 (612)
T KOG0045|consen   15 RLRRDCLPAK-SLFVDALFPAADSSLFYKLSTPLAQF-SDIVWKRPQEICA------NPRLIVDGPSRFDVKQGLLGDCW   86 (612)
T ss_pred             HHHHHHhhcC-CcccccCCCCCCccccccccCCCccc-ccceecCcccccC------CCCeecCCCCcceeEEeeecchH
Confidence            3455555554 58999999999999998765555332 4589999999764      68999999999999999999999


Q ss_pred             HHHHHHHHhccccccccccCc----ccCCCCcEEEEEeeCCEEEEEEecccccCCCCCceEEeecCCCCchhHHHHHHHH
Q 000112         1773 FLSAVAVLTEVSQISEVIITP----EYNEEGIYTVRFCIQGEWVPVVVDDWIPCESPGKPAFATSKKGHELWVSILEKAY 1848 (2161)
Q Consensus      1773 FLAALAALAE~PrLle~fItP----eyNe~GiY~VRLyiNGeWreVVVDDrLPc~~nGkPLFArSsd~nELWpSLLEKAY 1848 (2161)
                      |+||+|+||.++.++.+++++    .+++.|+|+||||++|+|+.|+|||+|||. +|+..|+++..++|+|++||||||
T Consensus        87 ~laA~a~la~~~~ll~~vip~~~~~~~~yaGif~f~~w~~G~W~~VvIDD~LP~~-~~~~~~~~s~~~~efW~aLlEKAy  165 (612)
T KOG0045|consen   87 FLAACAALALRPELLDKVIPQDQSFQENYAGIFHFRFWQNGEWVEVVIDDRLPTS-NGGLLFSHSSGKNEFWAALLEKAY  165 (612)
T ss_pred             HHHHHHHhhcCHHHHHhccCCCcccccccceEEEEEEEeCCeEEEEEeeeecceE-cCCEEEEeecCCceeHHHHHHHHH
Confidence            999999999999998888873    367899999999999999999999999997 567889999888999999999999


Q ss_pred             HHhcCCcccccCCChHHHHhhccCCcceeecccchhhhhccchhHHHHHHHHHhcCCCEEEeeCCC--CCC---cccccc
Q 000112         1849 AKLHGSYEALEGGLVQDALVDLTGGAGEEIDMRSAQAQIDLASGRLWSQLLRFKQEGFLLGAGSPS--GSD---VHISSS 1923 (2161)
Q Consensus      1849 AKLhGSYeaLeGG~~sEAL~DLTGgP~E~IDL~saeaq~Dl~sdeLWk~LlsalksG~LMgAsTps--gsD---~e~es~ 1923 (2161)
                      ||++|||+++.||...+|+.+|||+.+|.++++..... +.+ +.+|. +.+..++|.+++|++..  ..+   .....+
T Consensus       166 aKl~GsY~~l~gg~~~~a~~~lTG~~~e~~~l~~~~~~-~~~-~l~~~-~~~~~~~~~~l~c~~~~~~~~~~~~~~~~~~  242 (612)
T KOG0045|consen  166 AKLLGSYEALHGGSTIDALVDLTGGVTEPFDLNKTPKS-FKN-NLVWA-LLKSAHRGSLLLCSIESKDPTEEEEEAKLRN  242 (612)
T ss_pred             HHHhCcccCCCCCchhhHHHhccCCccceeEcccCcch-hHH-HHHHH-HHHhhhccCceeeeccccccchhHHHHHhhc
Confidence            99999999999999999999999999999998764311 111 33444 44555556666665432  222   235689


Q ss_pred             CcccCceeEEEEEEEEcC----eEEEEEecCCCCCccccCCCCCCCccccHHHHhhhCCC--CCCCCCeeecchhhHhhc
Q 000112         1924 GIVQGHAYSILQVREVDG----HKLVQIRNPWANEVEWNGPWSDSSPEWTDRMKHKLKHV--PQSKDGIFWMSWQDFQIH 1997 (2161)
Q Consensus      1924 GLVsGHAYSVLdV~EVdG----~RLVRLRNPWG~~~EWKG~WSD~S~eWTeeLKkkL~~~--p~sDDGtFWMSfEDFLky 1997 (2161)
                      ||+++|||+|++++++++    ++|+||||||| +.||||+|||++++|....+..+...  ...+||+|||+++||+++
T Consensus       243 gL~~~HaYsit~~~~~~~~~~~~~lirlrNPwg-~~~W~G~wsd~~~~W~~v~~~~~~~~~~~~~~dGeFWms~~dF~~~  321 (612)
T KOG0045|consen  243 GLVKGHAYAITDVREVQGRGGKHRLIRLRNPWG-ESEWNGPWSDGSEEWHLVDKSKLSELGRQPLDDGEFWMSFDDFLRE  321 (612)
T ss_pred             CccccccEEEEEEEEeecccccceeEEecCCcC-CceeccccccCCcchhhhCHHHHhhcccccccCCCeeeeHHHHHhh
Confidence            999999999999999999    99999999999 58999999999999998766554422  126899999999999999


Q ss_pred             ccceeEEEEcCCCc---------cccccCce--e-cccCCCCccC-cCCCCCCeEEEEeccCCCCCCEEEEEEeeccccc
Q 000112         1998 FRSIYVCRVYPSEM---------RYSVHGQW--R-GYSAGGCQDY-ASWNQNPQFRLRASGSDASFPIHVFITLTQGVSF 2064 (2161)
Q Consensus      1998 FssIyICrl~Pd~~---------RyrVhGeW--r-G~TAGGC~Df-dTF~qNPQY~LsVssSD~sePi~VLISLSQkDqr 2064 (2161)
                      |+.+++|++.|++.         ....+|+|  . +.++|||.++ ++|.+||||.+.+..++. ..+.++..+.|+..+
T Consensus       322 F~~~~vC~~~~~~~~~~~~~~~~~~~~~~~w~~~~~~t~ggc~~~~~tF~~npq~~~~~~~~~~-~~~~~v~~~~q~~~~  400 (612)
T KOG0045|consen  322 FDSLTVCRLRPDWLESRNQLQWVKLSLDGEWELARGVTAGGCRNSVDTFDRNPQYILAVRKPTK-SLCAVVLALFQKTRR  400 (612)
T ss_pred             CCeEeecCCCcchhhhhheeeeeeeecCCccceeecccCCCCccCcccccCCceEEEEecCCCc-cceEEEEEeeccccc
Confidence            99999999988754         13467999  3 5789999998 799999999999975432 357788889998643


Q ss_pred             cccccccccccccCCCceeEEEEEEEEecCcccccceeeccc-cCCcccccCcceEEEEEEeCCCccEEEEccccCCCCc
Q 000112         2065 SRTVAGFKNYQSSHDSMMFYIGMRILKTRGRRAAHNIYLHES-VGGTDYVNSREISCEMVLDPDPKGYTIVPTTIHPGEE 2143 (2161)
Q Consensus      2065 sR~~~GFrnYq~shDs~LLyIGL~VfKvrGnRs~~nIflhEs-V~sgdYVNSREVS~RLtLEPepG~YVVVPSTyEPGqE 2143 (2161)
                      +-.         ....+...||+++++...++... +..+.+ .....|.+.|+|+.++++||  |.|++||+|++|+++
T Consensus       401 ~~~---------~~~~~~~~ig~~i~~v~~~~~~~-~~~~~~~~~~~~~i~~r~v~~~~~~P~--~~y~~~pst~~~~~~  468 (612)
T KOG0045|consen  401 GER---------SFGANILDIGFHIYEVPLEGKYF-VLDNAPIASSSSFINNREVSVRFRLPP--GTYVIVPSTFEPGEE  468 (612)
T ss_pred             ccc---------cccceeeecceEEEEecCCCCce-EecccchhcccccccceeEEEEecCCC--cceeecccCCCCCCC
Confidence            111         11235688999999998663221 222222 34567999999999999775  899999999999999


Q ss_pred             cCcEEEEEeCCCcceee
Q 000112         2144 APFVLSVFTKASIILEA 2160 (2161)
Q Consensus      2144 G~FTLRVFSskpItLEP 2160 (2161)
                      ++|+|+||++.++..++
T Consensus       469 ~~f~lrvfs~~~~~~~~  485 (612)
T KOG0045|consen  469 GEFLLRVFSNVKVKSEE  485 (612)
T ss_pred             ccEEEEEeecccccCcc
Confidence            99999999998877663


No 2  
>smart00230 CysPc Calpain-like thiol protease family. Calpain-like thiol protease family (peptidase family C2). Calcium activated neutral protease (large subunit).
Probab=100.00  E-value=4.4e-69  Score=615.99  Aligned_cols=302  Identities=41%  Similarity=0.813  Sum_probs=266.6

Q ss_pred             HHHHHHhcCCCceecCCCCCCCCCcccCCCCCCccccCcceeeccccccccCccCCCceeecCCCCCCCcccCCCCCchH
Q 000112         1694 VKEALSARGERQFTDHEFPPDDQSLYVDPGNPPSKLQVVAEWMRPSEIVKESRLDCQPCLFSGAVNPSDVCQGRLGDCWF 1773 (2161)
Q Consensus      1694 IKE~L~arGe~~FeDpEFPPsdsSLy~Dp~~P~sklq~~IqWkRPsEI~~e~~~ds~P~LF~ggIsPsDVkQG~LGDCWF 1773 (2161)
                      |.+.|..++ .+|+|++|||+..||+.++..+     ..++|+||+|+++      +|++|.++++|.||+||.+|||||
T Consensus         4 i~~~c~~~~-~~f~D~~Fpp~~~sl~~~~~~~-----~~~~W~Rp~e~~~------~~~~~~~~i~~~di~QG~lgDC~~   71 (318)
T smart00230        4 LRQYCKESG-TLFEDPLFPANNGSLFFSQRQR-----KFVVWKRPHEIFE------NPPFIVGGASRTDICQGVLGDCWL   71 (318)
T ss_pred             HHHHHHHcC-CCccCCCCCCCcCccccCCCCC-----CCcEEECcHHHcC------CCEEEeCCCChhhccCcccccHHH
Confidence            455565554 6999999999999999765432     2479999999985      478998899999999999999999


Q ss_pred             HHHHHHHhccccccccccCc--c--cCCCCcEEEEEeeCCEEEEEEecccccCCCCCceEEeecCCCCchhHHHHHHHHH
Q 000112         1774 LSAVAVLTEVSQISEVIITP--E--YNEEGIYTVRFCIQGEWVPVVVDDWIPCESPGKPAFATSKKGHELWVSILEKAYA 1849 (2161)
Q Consensus      1774 LAALAALAE~PrLle~fItP--e--yNe~GiY~VRLyiNGeWreVVVDDrLPc~~nGkPLFArSsd~nELWpSLLEKAYA 1849 (2161)
                      +|||++|+++|.+++.++++  +  .|+.|+|+||||+||+|+.|+|||+||+.. |.++|+++.+++|+|++|||||||
T Consensus        72 lsal~~la~~~~~i~~if~~~~~~~~~~~G~y~vrl~~~G~w~~V~VDd~lP~~~-~~~~~~~~~~~~e~W~~LLEKAyA  150 (318)
T smart00230       72 LAALASLTLREKLLDRVIPHDQEFSENYAGIFHFRFWRFGKWVDVVIDDRLPTYN-GELVFMHSNSRNEFWSALLEKAYA  150 (318)
T ss_pred             HHHHHHHHhCHHHHhheEeCCcccccccCCEEEEEEEECCEEEEEEecCCCeeeC-CceEEEEeCCCCcchhHHHHHHHH
Confidence            99999999999888777652  2  468999999999999999999999999974 569999998899999999999999


Q ss_pred             HhcCCcccccCCChHHHHhhccCCcceeecccchhhhhccchhHHHHHHHHHhcCCCEEEeeCCCCC---CccccccCcc
Q 000112         1850 KLHGSYEALEGGLVQDALVDLTGGAGEEIDMRSAQAQIDLASGRLWSQLLRFKQEGFLLGAGSPSGS---DVHISSSGIV 1926 (2161)
Q Consensus      1850 KLhGSYeaLeGG~~sEAL~DLTGgP~E~IDL~saeaq~Dl~sdeLWk~LlsalksG~LMgAsTpsgs---D~e~es~GLV 1926 (2161)
                      |+||||++|.||.+.+||++|||++++.+++++..    .+.+++|+.|.++.++|++|+|+++..+   +...++.||+
T Consensus       151 K~~GsY~~i~gg~~~~al~~LTG~~~~~i~l~~~~----~~~~~~w~~l~~~~~~g~lv~~~t~~~~~~~~~~~~~~GLv  226 (318)
T smart00230      151 KLNGCYEALKGGSTTEALEDLTGGVAESIDLKEAS----KDPDNLFEDLFKAFERGSLMGCSIGAGTAVEEEEQKDCGLV  226 (318)
T ss_pred             HHcCCCcccCCCCHHHHHHHhcCCCeEEEEccccc----CCHHHHHHHHHHHHhCCCeEEEEcCCCCcchhhhhhhcCcc
Confidence            99999999999999999999999999999987642    2467899999999999999999987653   3345679999


Q ss_pred             cCceeEEEEEEEEcCeE--EEEEecCCCCCccccCCCCCCCcccc---HHHHhhhCCCCCCCCCeeecchhhHhhcccce
Q 000112         1927 QGHAYSILQVREVDGHK--LVQIRNPWANEVEWNGPWSDSSPEWT---DRMKHKLKHVPQSKDGIFWMSWQDFQIHFRSI 2001 (2161)
Q Consensus      1927 sGHAYSVLdV~EVdG~R--LVRLRNPWG~~~EWKG~WSD~S~eWT---eeLKkkL~~~p~sDDGtFWMSfEDFLkyFssI 2001 (2161)
                      ++|||+|++++++++++  ||+|||||| ..||+|+|||+|++|+   +++++++++. ..+||+|||+|+||++||+++
T Consensus       227 ~~HaYsVl~v~~~~~~~~~Ll~lrNPWg-~~eW~G~wsd~s~~W~~~~~~~~~~l~~~-~~~dG~FWM~~~df~~~F~~~  304 (318)
T smart00230      227 KGHAYSVTDVREVQGRRQELLRLRNPWG-QVEWNGPWSDDSPEWRSVSASEKKNLGLT-FDDDGEFWMSFEDFLRHFDKV  304 (318)
T ss_pred             cCccEEEEEEEEEecCCeEEEEEECCCC-CCCcCCCCCCCCccccccCHHHHHHhCCC-CCCCCEEEEEhHHHHhhCCeE
Confidence            99999999999998866  999999999 5899999999999999   6788888764 469999999999999999999


Q ss_pred             eEEEEcCCCcccc
Q 000112         2002 YVCRVYPSEMRYS 2014 (2161)
Q Consensus      2002 yICrl~Pd~~Ryr 2014 (2161)
                      ++|++.|++++|+
T Consensus       305 ~vc~~~~~~~~~r  317 (318)
T smart00230      305 EICNLNPDSLEER  317 (318)
T ss_pred             EEeccCCcccccc
Confidence            9999999987664


No 3  
>cd00044 CysPc Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction.
Probab=100.00  E-value=1.5e-66  Score=592.08  Aligned_cols=304  Identities=47%  Similarity=0.846  Sum_probs=260.7

Q ss_pred             HHHHHhcCCCceecCCCCCCCCCcccCCCCCCccccCcceeeccccccccCccCCCceeecCCCCCCCcccCCCCCchHH
Q 000112         1695 KEALSARGERQFTDHEFPPDDQSLYVDPGNPPSKLQVVAEWMRPSEIVKESRLDCQPCLFSGAVNPSDVCQGRLGDCWFL 1774 (2161)
Q Consensus      1695 KE~L~arGe~~FeDpEFPPsdsSLy~Dp~~P~sklq~~IqWkRPsEI~~e~~~ds~P~LF~ggIsPsDVkQG~LGDCWFL 1774 (2161)
                      .+.|...+ .+|+|++|||+.+|++.++..+..+....++|+||+|+++.... .+|++|.++++|.||+||.+|||||+
T Consensus         3 ~~~c~~~~-~~f~D~~Fpp~~~s~~~~~~~~~~~~~~~~~W~Rp~~~~~~~~~-~~~~~~~~~~~~~dI~QG~lgDC~~l   80 (315)
T cd00044           3 LQICLLSG-VLFEDPDFPPNDSSLGFDDSLSNGQPKKVIEWKRPSEIFADDGN-SNPRLFVNGASPSDVCQGILGDCWFL   80 (315)
T ss_pred             HHHHHHcC-CCccCCCCCCCccccccccccccccCcCcceEECcHHHhCcccC-CCCEEEeCCCChhhcccCcccchHHH
Confidence            45565554 69999999999999987643333344556899999999875322 46899999999999999999999999


Q ss_pred             HHHHHHhccccccccccCcc-c---CCCCcEEEEEeeCCEEEEEEecccccCCCCCceEEeecCCCCchhHHHHHHHHHH
Q 000112         1775 SAVAVLTEVSQISEVIITPE-Y---NEEGIYTVRFCIQGEWVPVVVDDWIPCESPGKPAFATSKKGHELWVSILEKAYAK 1850 (2161)
Q Consensus      1775 AALAALAE~PrLle~fItPe-y---Ne~GiY~VRLyiNGeWreVVVDDrLPc~~nGkPLFArSsd~nELWpSLLEKAYAK 1850 (2161)
                      |||++|+++|.+++.++.+. .   ++.|+|+||||+||+|+.|+|||+||+..++ |+|+++.+.+|+|++||||||||
T Consensus        81 saL~~la~~~~~i~~lf~~~~~~~~~~~G~y~v~l~~~G~w~~V~VDD~lP~~~~~-~~~~~s~~~~e~W~~LlEKAyAK  159 (315)
T cd00044          81 AALAALAERPELLKRVIPPDQSFEENYAGIYHFRFWKNGEWVEVVIDDRLPTSNGG-LLFMHSRDRNELWVALLEKAYAK  159 (315)
T ss_pred             HHHHHHHcCHHHHhheEcCCcccccCcCcEEEEEEEECCEEEEEEecCCCeecCCc-eEEEEECCCCeEcHHHHHHHHHh
Confidence            99999999998777766543 3   6899999999999999999999999997655 99999988899999999999999


Q ss_pred             hcCCcccccCCChHHHHhhccCCcceeecccchhhhhccchhHHHHHHHHHhcCCCEEEeeCCCCCCcc-ccccCcccCc
Q 000112         1851 LHGSYEALEGGLVQDALVDLTGGAGEEIDMRSAQAQIDLASGRLWSQLLRFKQEGFLLGAGSPSGSDVH-ISSSGIVQGH 1929 (2161)
Q Consensus      1851 LhGSYeaLeGG~~sEAL~DLTGgP~E~IDL~saeaq~Dl~sdeLWk~LlsalksG~LMgAsTpsgsD~e-~es~GLVsGH 1929 (2161)
                      +||||++|.||++.+||++|||++++.+++++....  ...+++|+.|.++.+.+++|+|+|+...+.. .++.||+++|
T Consensus       160 ~~GsY~~i~gg~~~~al~~LTG~~~~~i~~~~~~~~--~~~~~~~~~l~~~~~~~~lv~~~t~~~~~~~~~~~~Gl~~~H  237 (315)
T cd00044         160 LHGSYEALVGGNTAEALEDLTGGPTERIDLKSADAS--SGDNDLFALLLSFLQGGSLIGCSTGSRSEEEARTANGLVKGH  237 (315)
T ss_pred             hcCCccccCCCCHHHHHHHhhCCCcEEEEccccccc--cCHHHHHHHHHHHhhCCCEEEEEcCCCCcchhhccCCcccCc
Confidence            999999999999999999999999999998764321  2467899999999999999999998765432 5679999999


Q ss_pred             eeEEEEEEEEc--CeEEEEEecCCCCCccccCCCCCCCccccH--HHHhhhCCCCCCCCCeeecchhhHhhcccceeEEE
Q 000112         1930 AYSILQVREVD--GHKLVQIRNPWANEVEWNGPWSDSSPEWTD--RMKHKLKHVPQSKDGIFWMSWQDFQIHFRSIYVCR 2005 (2161)
Q Consensus      1930 AYSVLdV~EVd--G~RLVRLRNPWG~~~EWKG~WSD~S~eWTe--eLKkkL~~~p~sDDGtFWMSfEDFLkyFssIyICr 2005 (2161)
                      ||+|+++++++  |+|||+||||||. .||+|+|||+|++|..  ..++.+. ....+||+|||+|+||++||+++++|+
T Consensus       238 aY~Vl~~~~~~~~~~~lv~lrNPWg~-~~w~G~ws~~~~~w~~~~~~~~~~~-~~~~~dG~Fwm~~~df~~~F~~~~vc~  315 (315)
T cd00044         238 AYSVLDVREVQEEGLRLLRLRNPWGV-GEWWGGWSDDSSEWWVIDAERKKLL-LSGKDDGEFWMSFEDFLRNFDGLYVCN  315 (315)
T ss_pred             ceEEeEEEEEccCceEEEEecCCccC-CCccCCCCCCCchhccChHHHHHhc-CCCCCCCEEEEEhHHhheeeCeEEEeC
Confidence            99999999998  8999999999996 7999999999999953  3333333 345799999999999999999999994


No 4  
>PF00648 Peptidase_C2:  Calpain family cysteine protease This is family C2 in the peptidase classification. ;  InterPro: IPR001300 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures.
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals: a highly calcium-sensitive (i.e., micro-molar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the milli-molar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only. All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28kDa subunit and an 80kDa subunit that shares 55-65% sequence homology between the two proteases. The crystallographic structure of m-calpain reveals six "domains" in the 80kDa subunit: a 19-amino acid NH2-terminal sequence; active site domain IIa; active site domain IIb; domain III; an 18-amino acid extended sequence linking domain III to domain IV; and domain IV, which resembles the penta EF-hand family of polypeptides, binds calcium and regulates activity. Domain II shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related. Ca2+-binding causes a rearrangement of the protein backbone, the net effect of which is that a Trp side chain, which acts as a wedge between catalytic domains IIa and IIb in the apo state, moves away from the active site cleft, allowing for the proper formation of the catalytic triad.
Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in these organisms is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin (IPR001259 from INTERPRO). The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma. Calpains are a family of cytosolic cysteine proteinases (see PDOC00126 from PROSITEDOC). Members of the calpain family are believed to function in various biological processes, including integrin-mediated cell migration, cytoskeletal remodeling, cell differentiation and apoptosis. The calpain family includes numerous members from C. elegans to mammals, with homologues in yeast and bacteria. The best characterised members are the m- and mu-calpains; both proteins are heterodimers composed of a large catalytic subunit and a small regulatory subunit. The large subunit comprises four domains (dI-dIV) while the small subunit has two domains (dV-dVI). Domain dI is a short region cleaved by autolysis, dII is the catalytic core, dIII is a C2-like domain, and dIV consists of five calcium binding EF-hand motifs. The crystal structure of calpain has been solved.
The catalytic region consists of two distinct structural domains (dIIa and dIIb). dIIa contains a central helix flanked on three faces by a cluster of alpha-helices and is entirely unrelated to the corresponding domain in the typical thiol proteinases. The fold of dIIb is similar to the corresponding domain in other cysteine proteinases and contains two three-stranded anti-parallel beta-sheets. The catalytic triad residues (C,H,N) are located in dIIa and dIIb. The activation of the domain is dependent on the binding of two calcium atoms in two non EF-hand calcium binding sites located in the catalytic core, one close to the Cys active site in dIIa and one at the end of dIIb. Calcium binding induces conformational changes in the catalytic domain which align the active site. The profile covers the whole catalytic domain.; GO: 0004198 calcium-dependent cysteine-type endopeptidase activity, 0006508 proteolysis, 0005622 intracellular; PDB: 2NQA_A 1KFU_L 1KFX_L 1QXP_B 2R9C_A 1TL9_A 2G8E_A 1KXR_B 2G8J_A 2NQG_A ....
Probab=100.00  E-value=4.6e-65  Score=572.98  Aligned_cols=283  Identities=51%  Similarity=0.959  Sum_probs=228.0

Q ss_pred             ceecCCCCCCCCCcccCCCCCCccccCcceeeccccccccCccCCCceeecCCCCCCCcccCCCCCchHHHHHHHHhccc
Q 000112         1705 QFTDHEFPPDDQSLYVDPGNPPSKLQVVAEWMRPSEIVKESRLDCQPCLFSGAVNPSDVCQGRLGDCWFLSAVAVLTEVS 1784 (2161)
Q Consensus      1705 ~FeDpEFPPsdsSLy~Dp~~P~sklq~~IqWkRPsEI~~e~~~ds~P~LF~ggIsPsDVkQG~LGDCWFLAALAALAE~P 1784 (2161)
                      .|+||+|||+++||+.++..+     ..++|+||+|+++      +|++|.+++.+.||+||.+|||||+|||++||++|
T Consensus         1 ~f~D~~Fpp~~~Sl~~~~~~~-----~~~~W~R~~e~~~------~~~~~~~~~~~~di~QG~lgDc~llaaL~~la~~~   69 (298)
T PF00648_consen    1 LFEDPEFPPNDSSLGFDDQKP-----KNVEWKRPSEICE------NPQFFIDGISPSDIRQGSLGDCWLLAALAALAEHP   69 (298)
T ss_dssp             ----TTS-SSHHHHTSSTTST-----TT-EEE-HHHHSS------S-BSSSSSSSGGGEBE-SSSSHHHHHHHHHHTTSH
T ss_pred             CccCCCCccCccccccCCCCC-----CcceeEechhcCC------CCeEEECCCccccccccccCChhHHHHHHHHHhcc
Confidence            499999999999999765433     3479999999985      47788899999999999999999999999999999


Q ss_pred             cccccccC--ccc--CCCCcEEEEEeeCCEEEEEEecccccCCCCCceEEeecCCCCchhHHHHHHHHHHhcCCcccccC
Q 000112         1785 QISEVIIT--PEY--NEEGIYTVRFCIQGEWVPVVVDDWIPCESPGKPAFATSKKGHELWVSILEKAYAKLHGSYEALEG 1860 (2161)
Q Consensus      1785 rLle~fIt--Pey--Ne~GiY~VRLyiNGeWreVVVDDrLPc~~nGkPLFArSsd~nELWpSLLEKAYAKLhGSYeaLeG 1860 (2161)
                      .+++++++  +..  +..|+|+||||++|+|++|+|||+||| .+|+|+|+++.+++|+|++||||||||+||||++|.|
T Consensus        70 ~~i~~i~~~~~~~~~~~~G~y~v~l~~~G~w~~V~VDd~lP~-~~g~~~f~~s~~~~elW~~LlEKAyAKl~GsY~~l~g  148 (298)
T PF00648_consen   70 DLIKKIFPVNQSFNENYNGIYTVRLFKNGEWREVTVDDRLPC-KNGKPLFARSSDPNELWPSLLEKAYAKLHGSYSALEG  148 (298)
T ss_dssp             HHHHHHS-SS--SSTT-SSEEEEEEEETTEEEEEEEES-EEE-ETTEESSSBESSTTB-HHHHHHHHHHHHTTSSGGGSS
T ss_pred             cccccccccccccccccCceeeEeeccCCeeeeeccchhhhc-cccceeeeccCCcccchhhhhhchhhhccccccccCC
Confidence            88777763  222  346999999999999999999999999 6899999998889999999999999999999999999


Q ss_pred             CChHHHHhhccCCcceeecccchhhhhccchhHHHHHHHHHhcCCCEEEeeCCCC---CCccccccCcccCceeEEEEEE
Q 000112         1861 GLVQDALVDLTGGAGEEIDMRSAQAQIDLASGRLWSQLLRFKQEGFLLGAGSPSG---SDVHISSSGIVQGHAYSILQVR 1937 (2161)
Q Consensus      1861 G~~sEAL~DLTGgP~E~IDL~saeaq~Dl~sdeLWk~LlsalksG~LMgAsTpsg---sD~e~es~GLVsGHAYSVLdV~ 1937 (2161)
                      |++.++|++|||++++.+++++..     ..+++|+.+.+..+++.++++.+...   ........||+++|||+|++++
T Consensus       149 g~~~~al~~LTG~~~~~~~l~~~~-----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gl~~~HaY~Vl~~~  223 (298)
T PF00648_consen  149 GNPSEALQDLTGGPPESIDLRDDS-----SDDELWELWKKLLKSGSLVGCSTGSSTPFDSEEYEKNGLVPGHAYAVLDVR  223 (298)
T ss_dssp             BSHHHHHHHHHSSEEEEEEGGG-------T--THHHHHHHHHHCT-EEEEE--SSSGGGTTSBCTTSBBTTS-EEEEEEE
T ss_pred             CChhhhhHhhcCCcceeeeccccc-----hhhhHHHHHHHHHHhccccccccccccccccccccccCcccceeEEEEEEE
Confidence            999999999999999999886542     13468888888899999988876532   1233568999999999999999


Q ss_pred             EEcC----eEEEEEecCCCCCccccCCCCCCCcccc---HHHHhhhCCCCCCCCCeeecchhhHhhcccceeEEEE
Q 000112         1938 EVDG----HKLVQIRNPWANEVEWNGPWSDSSPEWT---DRMKHKLKHVPQSKDGIFWMSWQDFQIHFRSIYVCRV 2006 (2161)
Q Consensus      1938 EVdG----~RLVRLRNPWG~~~EWKG~WSD~S~eWT---eeLKkkL~~~p~sDDGtFWMSfEDFLkyFssIyICrl 2006 (2161)
                      ++++    ++||||||||| ..||+|+||++|++|+   +..++.+++. ..+||+|||+|+||++||+.++||++
T Consensus       224 ~~~~~~~~~~lv~LrNPwg-~~~w~G~ws~~s~~W~~~~~~~~~~~~~~-~~~dg~FWM~~~df~~~F~~i~vc~~  297 (298)
T PF00648_consen  224 EVNGNGEGHRLVKLRNPWG-STEWKGDWSDDSPEWTEIHPSLRKRLNQS-SSDDGTFWMSFEDFLKYFSSIYVCRL  297 (298)
T ss_dssp             EEEETTEEEEEEEEE-TTS-S---SSTTSTTSGGGGGS-HHHHHHHTTT-SSSSSEEEEEHHHHHHHSEEEEEEES
T ss_pred             eeccccceeEEEEEcCCCc-cccccccccccccccccCCHHHHhhcccc-cccCccHhHhHHHHHhhCCceEEEee
Confidence            9975    89999999999 4899999999999999   5677777753 46899999999999999999999986


No 5  
>KOG0045 consensus Cytosolic Ca2+-dependent cysteine protease (calpain), large subunit (EF-Hand protein superfamily) [Posttranslational modification, protein turnover, chaperones; Signal transduction mechanisms]
Probab=99.95  E-value=5.3e-31  Score=323.66  Aligned_cols=598  Identities=23%  Similarity=0.179  Sum_probs=481.9

Q ss_pred             EEeccCCCCCChhhHHHhhhhhhhHHHHHHhhcccceeecCccccccceeeeehhHHHHHHhhhheeeeeccchhHHHHH
Q 000112          874 VVKSREDQVPTKGDFLAALLPLVCIPALLSLCSGLLKWKDDDWKLSRGVYVFITIGLVLLLGAISAVIVVITPWTIGVAF  953 (2161)
Q Consensus       874 v~~~r~~~~p~~~dfl~a~lpl~~ipa~~~l~~gl~kw~dd~w~~s~~~~~~~~~gl~ll~~a~~~v~~~i~~w~~gvaf  953 (2161)
                      ..+.|++..|++..|..+.+|..+.+.++.+++..-+|++-.|++-.-                    ..-+||+|..-.
T Consensus        13 ~~~~~~~cl~~~~~F~D~~FP~~~~Sl~~~~~~p~~~~~~i~W~RP~e--------------------i~~~p~~i~~~~   72 (612)
T KOG0045|consen   13 FERLRRDCLPAKSLFVDALFPAADSSLFYKLSTPLAQFSDIVWKRPQE--------------------ICANPRLIVDGP   72 (612)
T ss_pred             HHHHHHHHhhcCCcccccCCCCCCccccccccCCCcccccceecCccc--------------------ccCCCCeecCCC
Confidence            346789999999999999999999999999999998887777777665                    236899876655


Q ss_pred             HHHHHHHHHHhhhhhcccccceeeehhhHHHHHHHHHHHHHHHHHhhhcCCCCccccchhHHHHHHHhhccceeeeccCC
Q 000112          954 LLLLLLIVLAIGVIHHWASNNFYLTRTQMFFVCFLAFLLGLAAFLVGWFDDKPFVGASVGYFTFLFLLAGRALTVLLSPP 1033 (2161)
Q Consensus       954 ~l~~~~~~~~igv~~~was~~f~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1033 (2161)
                      ..+-+...         ..+|.++..-...+.+.-.++.-...      +|+-|-..+.|+|.|-|...|+..+|.    
T Consensus        73 ~~~di~Qg---------~lgdCw~laA~a~la~~~~ll~~vip------~~~~~~~~yaGif~f~~w~~G~W~~Vv----  133 (612)
T KOG0045|consen   73 SRFDVKQG---------LLGDCWFLAACAALALRPELLDKVIP------QDQSFQENYAGIFHFRFWQNGEWVEVV----  133 (612)
T ss_pred             CcceeEEe---------eecchHHHHHHHHhhcCHHHHHhccC------CCcccccccceEEEEEEEeCCeEEEEE----
Confidence            44333222         14566555444444444444443333      899999999999999999999988764    


Q ss_pred             EEEecCceeeEEEeecccccCCCchhhHHHHHHHHhhhccceeEEEEEEcCCCcccchhhhhheeeeccccccccchhhc
Q 000112         1034 IVVYSPRVLPVYVYDAHADCGKNVSVAFLVLYGVALAIEGWGVVASLKIYPPFAGAAVSAITLVVAFGFAVSRPCLTLKT 1113 (2161)
Q Consensus      1034 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1113 (2161)
                        |  --.||+|+++.|    .+.|... ..+.+||...+|  ...+-.|+++.|..++.+.  ++|+.+++.|+...|+
T Consensus       134 --I--DD~LP~~~~~~~----~~~s~~~-~efW~aLlEKAy--aKl~GsY~~l~gg~~~~a~--~~lTG~~~e~~~l~~~  200 (612)
T KOG0045|consen  134 --I--DDRLPTSNGGLL----FSHSSGK-NEFWAALLEKAY--AKLLGSYEALHGGSTIDAL--VDLTGGVTEPFDLNKT  200 (612)
T ss_pred             --e--eeecceEcCCEE----EEeecCC-ceeHHHHHHHHH--HHHhCcccCCCCCchhhHH--HhccCCccceeEcccC
Confidence              2  568999999998    6677777 788999999999  6678899999999887766  9999999999999999


Q ss_pred             hHHHhhhcchhhHHHHHhhhccccccccccccccccccccccceeccCCccccccCCCccccchhhhHHHHhhccccccc
Q 000112         1114 MEDAVHFLSKDTVVQAISRSATKTRNALSGTYSAPQRSASSTALLVGDPNATRDKQGNLMLPRDDVVKLRDRLKNEEFVA 1193 (2161)
Q Consensus      1114 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1193 (2161)
                      +++...     ++++++-++++|..+++..+++                ..+.+++        +.+++|+.|++..-.+
T Consensus       201 ~~~~~~-----~l~~~~~~~~~~~~~l~c~~~~----------------~~~~~~~--------~~~~~~~gL~~~HaYs  251 (612)
T KOG0045|consen  201 PKSFKN-----NLVWALLKSAHRGSLLLCSIES----------------KDPTEEE--------EEAKLRNGLVKGHAYA  251 (612)
T ss_pred             cchhHH-----HHHHHHHHhhhccCceeeeccc----------------cccchhH--------HHHHhhcCccccccEE
Confidence            998876     7899999999999999988876                1222222        7999999999999999


Q ss_pred             ccccccccccccccCCCCCchhhHhhhhhhhhhhhhhhcccceeeeeccchhhhHhhhccchhhhhhhhhhhhhhhhccc
Q 000112         1194 GSFFCRMKYKRFRHELSSDYDYRREMCTHARILALEEAIDTEWVYMWDKFGGYLLLLLGLTAKAERVQDEVRLRLFLDSI 1273 (2161)
Q Consensus      1194 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1273 (2161)
                      .+-.+.++.             |+.|+.|.||..-..  ++||.++|++.+.+...+.....+..++|.           
T Consensus       252 it~~~~~~~-------------~~~~~~lirlrNPwg--~~~W~G~wsd~~~~W~~v~~~~~~~~~~~~-----------  305 (612)
T KOG0045|consen  252 ITDVREVQG-------------RGGKHRLIRLRNPWG--ESEWNGPWSDGSEEWHLVDKSKLSELGRQP-----------  305 (612)
T ss_pred             EEEEEEeec-------------ccccceeEEecCCcC--CceeccccccCCcchhhhCHHHHhhccccc-----------
Confidence            998888875             999999999999988  999999999999999999988888777765           


Q ss_pred             CCCcCChhhhhccCchhhhhHHHHHHhhhhhhhhHHHHHHHHHhhhcccHHHHHHHHHHHHhhHHhhhhhhcccCCCCCc
Q 000112         1274 GFSDLSAKKIKKWMPEDRRQFEIIQESYIREKEMEEEILMQRREEEGRGKERRKALLEKEERKWKEIEASLISSIPNAGN 1353 (2161)
Q Consensus      1274 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1353 (2161)
                            ++|...||++|..+.+..+..+-+++++.+|..+|                          +..+.++++++.+
T Consensus       306 ------~~dGeFWms~~dF~~~F~~~~vC~~~~~~~~~~~~--------------------------~~~~~~~~~~~w~  353 (612)
T KOG0045|consen  306 ------LDDGEFWMSFDDFLREFDSLTVCRLRPDWLESRNQ--------------------------LQWVKLSLDGEWE  353 (612)
T ss_pred             ------ccCCCeeeeHHHHHhhCCeEeecCCCcchhhhhhe--------------------------eeeeeeecCCccc
Confidence                  67889999999999999999999999999988877                          5567788999988


Q ss_pred             hHHHHHHHHHHHhcCCccccchhhhHHHHHHHHHHHHHHHHHHHHhcCCcceEEeeCCCCCccCccccccccccccccee
Q 000112         1354 REAAAMAAAVRAVGGDSVLEDSFARERVSSIARRIRTAQLARRALQTGITGAICVLDDEPTTSGRHCGQIDASICQSQKV 1433 (2161)
Q Consensus      1354 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1433 (2161)
                            .++....||.....++|.+.....|+++....                                          
T Consensus       354 ------~~~~~t~ggc~~~~~tF~~npq~~~~~~~~~~------------------------------------------  385 (612)
T KOG0045|consen  354 ------LARGVTAGGCRNSVDTFDRNPQYILAVRKPTK------------------------------------------  385 (612)
T ss_pred             ------eeecccCCCCccCcccccCCceEEEEecCCCc------------------------------------------
Confidence                  66778899999999999987766665544333                                          


Q ss_pred             EEEEEEEeecCCCceeeecccccchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccccceecCCceEEEEEE
Q 000112         1434 SFSIAVMIQPESGPVCLLGTEFQKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSISATSIADGRWHIVTMT 1513 (2161)
Q Consensus      1434 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1513 (2161)
                                                                  +++.+..+..|++..++|.+|+ .+.+..||+.+++
T Consensus       386 --------------------------------------------~~~~~v~~~~q~~~~~~~~~~~-~~~~ig~~i~~v~  420 (612)
T KOG0045|consen  386 --------------------------------------------SLCAVVLALFQKTRRGERSFGA-NILDIGFHIYEVP  420 (612)
T ss_pred             --------------------------------------------cceEEEEEeecccccccccccc-eeeecceEEEEec
Confidence                                                        8999999999999999999999 9999999999988


Q ss_pred             EeccccceeeeecccccccccccccccccccccCCceEEEecCCCCccccccCCCccccccchhhheehhhcccCChHHH
Q 000112         1514 IDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTDMDVFGRSDSEGAESKMHIMDVFLWGRCLTEDEI 1593 (2161)
Q Consensus      1514 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~clte~e~ 1593 (2161)
                      .+          ++.|. ++++.+.....-||.++..+|.. +||.+...|+ |+.|..|+.+|+|+||-++.|.+|+  
T Consensus       421 ~~----------~~~~~-~~~~~~~~~~~~i~~r~v~~~~~-~P~~~y~~~p-st~~~~~~~~f~lrvfs~~~~~~~~--  485 (612)
T KOG0045|consen  421 LE----------GKYFV-LDNAPIASSSSFINNREVSVRFR-LPPGTYVIVP-STFEPGEEGEFLLRVFSNVKVKSEE--  485 (612)
T ss_pred             CC----------CCceE-ecccchhcccccccceeEEEEec-CCCcceeecc-cCCCCCCCccEEEEEeecccccCcc--
Confidence            76          66777 99999999999999999999999 9999999999 9999999999999999999999998  


Q ss_pred             HHHhhcccccccccccCCCCCcccCCCCcccccCCCCCcceeecccccccccCcccccccccCCCcceeeccchhhhhcc
Q 000112         1594 ASLYSAICSAELNMNEFPEDNWQWADSPPRVDEWDSDPADVDLYDRDDIDWDGQYSSGRKRRADRDGIVVNVDSFARKFR 1673 (2161)
Q Consensus      1594 ~~~~~~~~~ae~~~~d~~dd~WQ~~dsp~r~~~~~~~~~~~~ly~re~v~~~~q~~sGrk~~~~~d~~~ld~d~f~Rklr 1673 (2161)
                                                                                        +..+..+...|+..
T Consensus       486 ------------------------------------------------------------------~~~i~~~~~~~~~~  499 (612)
T KOG0045|consen  486 ------------------------------------------------------------------DMEISLDETKRSTN  499 (612)
T ss_pred             ------------------------------------------------------------------ceEEeeccccccee
Confidence                                                                              01111111111111


Q ss_pred             CCCcCCHHHHHHHHHHHHHHHHHHHHhcCCCceecCCCCCCCCCcccCCCCCCccccCcceeeccccccccCccCCCcee
Q 000112         1674 KPRMETQEEIYQRMLSVELAVKEALSARGERQFTDHEFPPDDQSLYVDPGNPPSKLQVVAEWMRPSEIVKESRLDCQPCL 1753 (2161)
Q Consensus      1674 kpr~etkEEI~Qrl~svE~aIKE~L~arGe~~FeDpEFPPsdsSLy~Dp~~P~sklq~~IqWkRPsEI~~e~~~ds~P~L 1753 (2161)
                      ....+                                                                           
T Consensus       500 ~~~~~---------------------------------------------------------------------------  504 (612)
T KOG0045|consen  500 IIVMK---------------------------------------------------------------------------  504 (612)
T ss_pred             eeeec---------------------------------------------------------------------------
Confidence            10000                                                                           


Q ss_pred             ecCCCCCCCcccCCCCCchHHHHHHHHhccccccccccCcccCCCCcEEEEEeeCCEEEEEEecccccCCCCCceEEeec
Q 000112         1754 FSGAVNPSDVCQGRLGDCWFLSAVAVLTEVSQISEVIITPEYNEEGIYTVRFCIQGEWVPVVVDDWIPCESPGKPAFATS 1833 (2161)
Q Consensus      1754 F~ggIsPsDVkQG~LGDCWFLAALAALAE~PrLle~fItPeyNe~GiY~VRLyiNGeWreVVVDDrLPc~~nGkPLFArS 1833 (2161)
                              ...++..+|+|.+.......++++....+...+..+.  |    ..++++..++ |..+++...|...+...
T Consensus       505 --------~~~~~~~~~~~~~~~~~~~~k~s~~~~~~~~~~~~~~--~----~~~~~~~~~~-~~~~~~~~~~~~~~~~~  569 (612)
T KOG0045|consen  505 --------GFSLGECGDKWKLSSTLVNTKVSRSSEFILTVEVVSP--L----DIEGESTLVV-DIPIAIESKGSGDVAPL  569 (612)
T ss_pred             --------ceehhhhchhhhccccccccccchhhceeeeeccccc--E----EEeccccccc-cccceeeccCCcccccc
Confidence                    3445556666666555555555443333333333333  2    6788888888 99899887777777776


Q ss_pred             CCCCchhHHHHHHHHHHhcCCcccccCCChHHHHhhccCCc
Q 000112         1834 KKGHELWVSILEKAYAKLHGSYEALEGGLVQDALVDLTGGA 1874 (2161)
Q Consensus      1834 sd~nELWpSLLEKAYAKLhGSYeaLeGG~~sEAL~DLTGgP 1874 (2161)
                      .+..+.|....|++|++.+.++...+++...+.+.++++..
T Consensus       570 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  610 (612)
T KOG0045|consen  570 LNVIRLRIADPEIAYSFDSTSCCATEGPLVLDELFDLSSKK  610 (612)
T ss_pred             eeeeeeeccChhheeeccccccccccCcchhhhhhcCCCCC
Confidence            66779999999999999999999999999999999888754


No 6  
>smart00720 calpain_III calpain_III.
Probab=99.83  E-value=3e-20  Score=191.33  Aligned_cols=135  Identities=33%  Similarity=0.604  Sum_probs=106.4

Q ss_pred             ccccCcee-cccCCCCccC-cCCCCCCeEEEEeccCCCCCCEEEEEEeeccccccccccccccccccCCCceeEEEEEEE
Q 000112         2013 YSVHGQWR-GYSAGGCQDY-ASWNQNPQFRLRASGSDASFPIHVFITLTQGVSFSRTVAGFKNYQSSHDSMMFYIGMRIL 2090 (2161)
Q Consensus      2013 yrVhGeWr-G~TAGGC~Df-dTF~qNPQY~LsVssSD~sePi~VLISLSQkDqrsR~~~GFrnYq~shDs~LLyIGL~Vf 2090 (2161)
                      ..++|+|+ +.+||||.++ .+|++||||.|++.+++. ..++|+|.|+|++++..        .. ......+|||+|+
T Consensus         4 ~~~~G~W~~~~tAGG~~~~~~tf~~NPqy~l~v~~~~~-~~~~v~i~L~q~~~r~~--------~~-~~~~~~~iGf~v~   73 (143)
T smart00720        4 KSVQGSWTRGQTAGGCRNYPATFWTNPQFRITLEEPDD-DDCTVLIALMQKNRRRL--------RR-KGADFLTIGFAVY   73 (143)
T ss_pred             EEEeCeEECCCccCCccccccccccCCeEEEEecCCCC-CceEEEEEecccCcccc--------cc-cCCccceEeEEEE
Confidence            45789997 8999999999 899999999999986542 33889999999975311        11 1124578999999


Q ss_pred             EecCc-ccccceee--ccccCCcccccCcceEEEEEEeCCCccEEEEccccCCCCccCcEEEEEeCCCccee
Q 000112         2091 KTRGR-RAAHNIYL--HESVGGTDYVNSREISCEMVLDPDPKGYTIVPTTIHPGEEAPFVLSVFTKASIILE 2159 (2161)
Q Consensus      2091 KvrGn-Rs~~nIfl--hEsV~sgdYVNSREVS~RLtLEPepG~YVVVPSTyEPGqEG~FTLRVFSskpItLE 2159 (2161)
                      +++.. +.....+.  .+...+++|.+.|++++++.|+|  |.|+|||+|++|+++|+|+|+||++.+++|+
T Consensus        74 ~~~~~~~~~~~~~~~~~~~~~s~~y~~~r~v~~~~~L~~--G~Y~iVPsT~~p~~~g~F~LrV~s~~~~~l~  143 (143)
T smart00720       74 KVPKELHLRRDFFLSNAPRASSGDYINGREVSERFRLPP--GEYVIVPSTFEPNQEGDFLLRVFSEGPFKLT  143 (143)
T ss_pred             EeccccccchhhhhccCccccccccccCeEEEEEEEcCC--CCEEEEEeecCCCCccCEEEEEEecCccccC
Confidence            98765 32222222  22334568999999999999987  8899999999999999999999999999874


No 7  
>cd00214 Calpain_III Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases that participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. The catalytic domain and the two calmodulin-like domains are separated by the C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. It is proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. This CD includes subdomain III of typical and atypical calpains.
Probab=99.82  E-value=4.5e-20  Score=192.77  Aligned_cols=138  Identities=34%  Similarity=0.608  Sum_probs=107.5

Q ss_pred             ccccCceec-ccCCCCccC-cCCCCCCeEEEEeccCCC-CCCEEEEEEeeccccccccccccccccccCCCceeEEEEEE
Q 000112         2013 YSVHGQWRG-YSAGGCQDY-ASWNQNPQFRLRASGSDA-SFPIHVFITLTQGVSFSRTVAGFKNYQSSHDSMMFYIGMRI 2089 (2161)
Q Consensus      2013 yrVhGeWrG-~TAGGC~Df-dTF~qNPQY~LsVssSD~-sePi~VLISLSQkDqrsR~~~GFrnYq~shDs~LLyIGL~V 2089 (2161)
                      ..++|+|+. .+||||.++ .+|++||||.|++.+++. ..+++|+|.|+|++++..+.         ...+..+|||+|
T Consensus         6 ~~~~G~W~~g~tAGGc~~~~~tf~~NPQf~l~v~~~~~~~~~~~v~i~L~q~~~r~~~~---------~~~~~~~IGf~v   76 (150)
T cd00214           6 KSFNGEWRRGQTAGGCRNNPDTFWTNPQFRIRVPEPDDDEGKCTVLIALMQKNRRHLRK---------KGLDLLTIGFHV   76 (150)
T ss_pred             EEEeCeEeCCcccCCCCCcccccccCceEEEEecCCCCCCCccEEEEEeccCCcchhcc---------cCCCcceEEEEE
Confidence            467899976 999999555 799999999999986531 22389999999997642211         123457899999


Q ss_pred             EEecCc-cc-ccceee-ccc-cCCcccccCcceEEEEEEeCCCccEEEEccccCCCCccCcEEEEEeCCCcceeeC
Q 000112         2090 LKTRGR-RA-AHNIYL-HES-VGGTDYVNSREISCEMVLDPDPKGYTIVPTTIHPGEEAPFVLSVFTKASIILEAL 2161 (2161)
Q Consensus      2090 fKvrGn-Rs-~~nIfl-hEs-V~sgdYVNSREVS~RLtLEPepG~YVVVPSTyEPGqEG~FTLRVFSskpItLEPL 2161 (2161)
                      +++++. +. ....+. +++ +..++|.+.|+|++++.|+|  |.|+|||+|++|+++|+|.|+||+++++++++|
T Consensus        77 ~~~~~~~~~~~~~~~~~~~~~~~s~~~~~~rev~~~~~L~p--G~YvIIPsT~~p~~~g~F~LrVfs~~~~~~~~~  150 (150)
T cd00214          77 YKVPGENRHLRRDFFLHKAPRARSSTFINTREVSLRFRLPP--GEYVIVPSTFEPGEEGEFLLRVFSEKSIKSSEL  150 (150)
T ss_pred             EEeCCcCcccChhhhhccCcccccCccccccEEEEEEEcCC--CCEEEEeeecCCCCcccEEEEEEecCCCccccC
Confidence            998652 21 222222 233 34578999999999999987  899999999999999999999999999999886


No 8  
>PF01067 Calpain_III:  Calpain large subunit, domain III;  InterPro: IPR022682 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals: a highly calcium-sensitive (i.e., micro-molar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the milli-molar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only. All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28 kDa subunit and an 80 kDa subunit that shares 55-65% sequence homology between the two proteases. The crystallographic structure of m-calpain reveals six "domains" in the 80 kDa subunit: a 19-amino acid NH2-terminal sequence; active site domain IIa; active site domain IIb (domain II shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related); domain III; an 18-amino acid extended sequence linking domain III to domain IV; and domain IV, which resembles the penta EF-hand family of polypeptides, binds calcium and regulates activity. Ca2+-binding causes a rearrangement of the protein backbone, the net effect of which is that a Trp side chain, which acts as a wedge between catalytic domains IIa and IIb in the apo state, moves away from the active site cleft, allowing for the proper formation of the catalytic triad.
Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in these organisms' cells is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin (IPR001259 from INTERPRO). The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma. This entry represents domain III. It is found in association with PF00648 from PFAM. The functions of domains III and I are currently unknown. Domain II is a cysteine protease and domain IV is a calcium binding domain. Calpains are believed to participate in intracellular signaling pathways mediated by calcium ions. ; PDB: 1QXP_B 2QFE_A 1DF0_A 1U5I_A 3DF0_A 3BOW_A 1KFU_L 1KFX_L.
Probab=99.79  E-value=2.9e-19  Score=182.38  Aligned_cols=136  Identities=35%  Similarity=0.621  Sum_probs=96.4

Q ss_pred             ccccCce-ecccCCCCccCc-CCCCCCeEEEEeccCCC-CCCEEEEEEeeccccccccccccccccccCCCceeEEEEEE
Q 000112         2013 YSVHGQW-RGYSAGGCQDYA-SWNQNPQFRLRASGSDA-SFPIHVFITLTQGVSFSRTVAGFKNYQSSHDSMMFYIGMRI 2089 (2161)
Q Consensus      2013 yrVhGeW-rG~TAGGC~Dfd-TF~qNPQY~LsVssSD~-sePi~VLISLSQkDqrsR~~~GFrnYq~shDs~LLyIGL~V 2089 (2161)
                      .+++|+| ++.+||||.++. +|++||||.|++..++. +.+++|+|+|+|++.+...         ..+....+|||+|
T Consensus         5 ~~~~G~W~~~~taGG~~~~~~s~~~NPQy~l~v~~~~~~~~~~~v~i~L~q~~~~~~~---------~~~~~~~~Ig~~v   75 (147)
T PF01067_consen    5 VTIEGEWVTGNTAGGCPNNPYSWWNNPQYRLTVSEPTEESNKCTVVISLMQKDRRRKR---------DVGEKDLPIGFYV   75 (147)
T ss_dssp             EEEEEEE-TTTS---STT-TTTGGGS-EEEEEESSGCCCSSBEEEEEEEEECSGCCGC---------STTTTTSEEEEEE
T ss_pred             EEEeCEEeCCCcCCCCcccccccccCcEEEEEEcCCCCCcceeEEEEEEEecCcchhh---------cccccceEEeEEE
Confidence            4689999 899999999998 99999999999986543 2368999999998753211         1123457899999


Q ss_pred             EEecC--ccccccee--eccccCCcccccCcceEEEEEEeCCCccEEEEccccCCCCccCcEEEEEeCCCccee
Q 000112         2090 LKTRG--RRAAHNIY--LHESVGGTDYVNSREISCEMVLDPDPKGYTIVPTTIHPGEEAPFVLSVFTKASIILE 2159 (2161)
Q Consensus      2090 fKvrG--nRs~~nIf--lhEsV~sgdYVNSREVS~RLtLEPepG~YVVVPSTyEPGqEG~FTLRVFSskpItLE 2159 (2161)
                      ++...  .+......  ..+.+..++|.+.|+++.+++|+|  |.|+|||+|++|+++|+|+|+||++.|++|+
T Consensus        76 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~L~~--G~YvIVPsT~~~~~~g~F~L~v~s~~~~~l~  147 (147)
T PF01067_consen   76 FKVQSQQKRLPRQYFLFNKPVVSSGDYSNSREVSEEFTLPP--GTYVIVPSTYEPGQEGEFTLRVFSDSPFELQ  147 (147)
T ss_dssp             EEETTTTSE--HHHHHTS-SSEE-SSEBSSSEEEEEEEE-S--EEEEEEEEESSTT--EEEEEEEEESSSEEE-
T ss_pred             EeeecccccCCcceeccccceeeccccccceEEEEEEEcCC--CCEEEEEecCCCCCeeeEEEEEEECCCcccC
Confidence            99822  11111111  123445678999999999999987  8899999999999999999999999999874


No 9  
>cd00152 PTX Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers.
Probab=97.91  E-value=8.2e-05  Score=82.12  Aligned_cols=162  Identities=20%  Similarity=0.328  Sum_probs=99.2

Q ss_pred             EEEEEEEeecC--CCceeeecccccchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccccceecCCceEEEE
Q 000112         1434 SFSIAVMIQPE--SGPVCLLGTEFQKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSISATSIADGRWHIVT 1511 (2161)
Q Consensus      1434 ~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1511 (2161)
                      +|++++-++.+  +++..+|..--.++ =-|+++.+..+    |+  +.+-..|...++       . ....||+||.|+
T Consensus        32 ~fTv~~Wv~~~~~~~~~~ifSy~~~~~-~~~~~l~~~~~----g~--~~~~i~~~~~~~-------~-~~~~~g~W~hv~   96 (201)
T cd00152          32 AFTLCLWVYTDLSTREYSLFSYATKGQ-DNELLLYKEKD----GG--YSLYIGGKEVTF-------K-VPESDGAWHHIC   96 (201)
T ss_pred             hEEEEEEEEecCCCCCeEEEEEeCCCC-CCeEEEEEcCC----Ce--EEEEEcCEEEEE-------e-ccCCCCCEEEEE
Confidence            57788778776  47777774333211 22777664432    33  333333332221       2 234899999999


Q ss_pred             EEEeccccceeeeecccccccccccccccccccccCCceEEEecCCCCccccccCCCccccccchhhheehhhcccCChH
Q 000112         1512 MTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTDMDVFGRSDSEGAESKMHIMDVFLWGRCLTED 1591 (2161)
Q Consensus      1512 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~clte~ 1591 (2161)
                      +|-|..+|+.+-|+||.-.+.++ +  . .+..+..+....+|-++    |..|..-...-.=+=+|=|+-+|.|-||.+
T Consensus        97 ~t~d~~~g~~~lyvnG~~~~~~~-~--~-~~~~~~~~g~l~lG~~q----~~~gg~~~~~~~f~G~I~~v~iw~~~Ls~~  168 (201)
T cd00152          97 VTWESTSGIAELWVNGKLSVRKS-L--K-KGYTVGPGGSIILGQEQ----DSYGGGFDATQSFVGEISDVNMWDSVLSPE  168 (201)
T ss_pred             EEEECCCCcEEEEECCEEecccc-c--c-CCCEECCCCeEEEeecc----cCCCCCCCCCcceEEEEceeEEEcccCCHH
Confidence            99999999999999998776554 1  1 12345556667777654    333322111111233567888999999999


Q ss_pred             HHHHHhhcccccccccccCCCCCcccC
Q 000112         1592 EIASLYSAICSAELNMNEFPEDNWQWA 1618 (2161)
Q Consensus      1592 e~~~~~~~~~~ae~~~~d~~dd~WQ~~ 1618 (2161)
                      ||..+++.-+...=++++-.++.|+.+
T Consensus       169 eI~~l~~~~~~~~Gnv~~W~~~~~~~~  195 (201)
T cd00152         169 EIKNVYSEGGTLSGNILNWRALNYEIN  195 (201)
T ss_pred             HHHHHHhcCCCCCCCEEechhhEEEEe
Confidence            999998744444445555555555544


No 10 
>smart00159 PTX Pentraxin / C-reactive protein / pentaxin family. This family forms a discoid pentameric structure. Human serum amyloid P demonstrates calcium-mediated ligand-binding.
Probab=97.81  E-value=0.00016  Score=80.42  Aligned_cols=161  Identities=21%  Similarity=0.320  Sum_probs=101.0

Q ss_pred             EEEEEEEeecCC--Cceeee--cccccchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccccceecCCceEE
Q 000112         1434 SFSIAVMIQPES--GPVCLL--GTEFQKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSISATSIADGRWHI 1509 (2161)
Q Consensus      1434 ~~~~~~~~~~~~--~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1509 (2161)
                      +|++++-++++.  ++-.||  .+..|.   -|+++...      ++.++.+...|...+        ....+.||+||.
T Consensus        32 ~fTvc~W~k~~~~~~~~~ifSy~~~~~~---ne~~~~~~------~~~~~~l~i~g~~~~--------~~~~~~~g~W~h   94 (206)
T smart00159       32 AFTVCLWFYSDLSPRGYSLFSYATKGQD---NELLLYKE------KQGEYSLYIGGKKVQ--------FPVPESDGKWHH   94 (206)
T ss_pred             HEEEEEEEEecCCCCceEEEEEeCCCCC---CeEEEEEc------CCcEEEEEEcCeEEE--------ecccccCCceEE
Confidence            567777777653  444454  665554   37766543      233466666654211        123578999999


Q ss_pred             EEEEEeccccceeeeecccccccccccccccccccccCCceEEEecCCCCccccccCCCccccccchhhheehhhcccCC
Q 000112         1510 VTMTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTDMDVFGRSDSEGAESKMHIMDVFLWGRCLT 1589 (2161)
Q Consensus      1510 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~clt 1589 (2161)
                      |++|-|..+|+++-|+||... .+.++  .. +..+..+-.+.+|-++    |..|-.-.+...=+=.|=|+=||.|-||
T Consensus        95 vc~tw~~~~g~~~lyvnG~~~-~~~~~--~~-g~~i~~~G~lvlGq~q----d~~gg~f~~~~~f~G~i~~v~iw~~~Ls  166 (206)
T smart00159       95 ICTTWESSSGIAELWVDGKPG-VRKGL--AK-GYTVKPGGSIILGQEQ----DSYGGGFDATQSFVGEIGDLNMWDSVLS  166 (206)
T ss_pred             EEEEEECCCCcEEEEECCEEc-ccccc--cC-CcEECCCCEEEEEecc----cCCCCCCCCCcceeEEEeeeEEecccCC
Confidence            999999999999999999875 33322  11 2344566678888764    3333221111112335668889999999


Q ss_pred             hHHHHHHhhcccccccccccCCCCCcccCC
Q 000112         1590 EDEIASLYSAICSAELNMNEFPEDNWQWAD 1619 (2161)
Q Consensus      1590 e~e~~~~~~~~~~ae~~~~d~~dd~WQ~~d 1619 (2161)
                      ++||..+++.-...+=++.+-.++.|+.+.
T Consensus       167 ~~eI~~l~~~~~~~~Gnv~~W~~~~~~~~g  196 (206)
T smart00159      167 PEEIKSVYKGSTFSIGNILNWRALNYEVHG  196 (206)
T ss_pred             HHHHHHHHcCCCCCCCCEEeccccEEEEee
Confidence            999999987433334456666666666653


No 11 
>PF13385 Laminin_G_3:  Concanavalin A-like lectin/glucanases superfamily; PDB: 4DQA_A 1N1Y_A 1MZ6_A 1MZ5_A 1N1S_A 2A75_A 1WCS_A 1N1T_A 1N1V_A 2FHR_A ....
Probab=97.30  E-value=0.00098  Score=66.47  Aligned_cols=80  Identities=24%  Similarity=0.379  Sum_probs=49.7

Q ss_pred             ccceecCCceEEEEEEEeccccceeeeecccccccccccccccccccccCCceEEEecCCCCccccccCCCccccccchh
Q 000112         1498 SATSIADGRWHIVTMTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTDMDVFGRSDSEGAESKMH 1577 (2161)
Q Consensus      1498 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1577 (2161)
                      ....+.+++||.+++|+|  .++.+.|+||-..+....-..    ..+......-+|-.+           ....--+..
T Consensus        78 ~~~~~~~~~W~~l~~~~~--~~~~~lyvnG~~~~~~~~~~~----~~~~~~~~~~iG~~~-----------~~~~~~~g~  140 (157)
T PF13385_consen   78 SDSNLPDNKWHHLALTYD--GSTVTLYVNGELVGSSTIPSN----ISLNSNGPLFIGGSG-----------GGSSPFNGY  140 (157)
T ss_dssp             -BS---TT-EEEEEEEEE--TTEEEEEETTEEETTCTEESS----SSTTSCCEEEESS-S-----------TT--B-EEE
T ss_pred             cCcccCCCCEEEEEEEEE--CCeEEEEECCEEEEeEeccCC----cCCCCcceEEEeecC-----------CCCCceEEE
Confidence            566788999999999999  556999999999986543222    123444455555433           223344678


Q ss_pred             hheehhhcccCChHHHH
Q 000112         1578 IMDVFLWGRCLTEDEIA 1594 (2161)
Q Consensus      1578 ~~~~~~~~~clte~e~~ 1594 (2161)
                      |-|+-+|.|+||++||+
T Consensus       141 i~~~~i~~~aLt~~eI~  157 (157)
T PF13385_consen  141 IDDLRIYNRALTAEEIQ  157 (157)
T ss_dssp             EEEEEEESS---HHHHH
T ss_pred             EEEEEEECccCCHHHcC
Confidence            88999999999999996


No 12 
>PF00354 Pentaxin:  Pentaxin family;  InterPro: IPR001759 Pentaxins (or pentraxins) are a family of proteins which show, under electron microscopy, a discoid arrangement of five noncovalently bound subunits. Proteins of the pentaxin family are involved in acute immunological responses. Three of the principal members of the pentaxin family are serum proteins: namely, C-reactive protein (CRP), serum amyloid P component protein (SAP), and female protein (FP). CRP is expressed during acute phase response to tissue injury or inflammation in mammals. The protein resembles an antibody and performs several functions associated with host defence: it promotes agglutination, bacterial capsular swelling and phagocytosis, and activates the classical complement pathway through its calcium-dependent binding to phosphocholine. CRPs have also been sequenced in an invertebrate, Limulus polyphemus (Atlantic horseshoe crab), where they are a normal constituent of the hemolymph. SAP is a vertebrate protein that is a precursor of amyloid component P. It is found in all types of amyloid deposits, in glomerular basement membrane and in elastic fibres in blood vessels. SAP binds to various lipoprotein ligands in a calcium-dependent manner, and it has been suggested that, in mammals, this may have important implications in atherosclerosis and amyloidosis. FP is a SAP homologue found in Mesocricetus auratus (Golden hamster). The concentration of this plasma protein is altered by sex steroids and stimuli that elicit an acute phase response. Pentaxin proteins expressed in the nervous system are neural pentaxin I (NPI) and II (NPII). NPI and NPII are homologous and can exist within one species. It is suggested that both proteins mediate the uptake of synaptic macromolecules and play a role in synaptic plasticity. Apexin, a sperm acrosomal protein, is a homologue of NPII found in Cavia porcellus (Guinea pig).
PTX3 (or TSG-14) protein is a cytokine-induced protein that is homologous to CRPs and SAPs, but its function is not yet known.; PDB: 2A3W_F 3KQR_C 3D5O_D 2A3X_G 1SAC_D 2W08_B 1GYK_B 1LGN_A 2A3Y_A 1B09_D ....
Probab=97.29  E-value=0.00092  Score=74.48  Aligned_cols=159  Identities=29%  Similarity=0.496  Sum_probs=94.9

Q ss_pred             EEEEEEEeecCC--Cceeee--cccccchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccccceecCCceEE
Q 000112         1434 SFSIAVMIQPES--GPVCLL--GTEFQKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSISATSIADGRWHI 1509 (2161)
Q Consensus      1434 ~~~~~~~~~~~~--~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1509 (2161)
                      +|++.+-++++.  ..-+||  .|+.|.   -|+++.+..++      +++|.-.|....       + ...+.||+||.
T Consensus        26 ~fTvC~w~k~~~~~~~~tifSYat~~~~---nell~~~~~~~------~~~l~i~~~~~~-------~-~~~~~~~~Whh   88 (195)
T PF00354_consen   26 AFTVCFWVKTDDSSNDGTIFSYATSSQD---NELLLFGSSSG------SLRLYINGSSVS-------F-SGPIRDGQWHH   88 (195)
T ss_dssp             EEEEEEEEEESGSGS-EEEEEEEETTEE---EEEEEEEETTT------EEEEEETTEEEE-------E-EECS-TSS-EE
T ss_pred             cEEEEEEEEeccCCCceEEEEEccCCCC---ccEEEEEeCCc------eEEEEECCeEeE-------e-ccccCCCCcEE
Confidence            355555555533  355555  444443   38888765442      566776666221       1 13578999999


Q ss_pred             EEEEEeccccceeeeecccccccccccccccccccccCCceEEEecCCCCccccccCCCccccccchhhheehhhcccCC
Q 000112         1510 VTMTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTDMDVFGRSDSEGAESKMHIMDVFLWGRCLT 1589 (2161)
Q Consensus      1510 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~clt 1589 (2161)
                      +.+|-|..+|+..-|+||-. ....+  +..+..|-..|+ +=+|-+.    |.+|-.-.+...=.=.|-|+-||.|-||
T Consensus        89 ~C~tW~s~~G~~~ly~dG~~-~~~~~--~~~g~~i~~gG~-~vlGQeQ----d~~gG~fd~~q~F~G~i~~~~iWd~vLs  160 (195)
T PF00354_consen   89 ICVTWDSSTGRWQLYVDGVR-LSSTG--LATGHSIPGGGT-LVLGQEQ----DSYGGGFDESQAFVGEISDFNIWDRVLS  160 (195)
T ss_dssp             EEEEEETTTTEEEEEETTEE-EEEEE--SSTT--B-SSEE-EEESS-B----SBTTBTCSGGGB--EEEEEEEEESS---
T ss_pred             EEEEEecCCcEEEEEECCEe-ccccc--ccCCceECCCCE-EEECccc----cccCCCcCCccEeeEEEeceEEEeeeCC
Confidence            99999999999999999993 22333  345556655555 4477654    6666544443333446889999999999


Q ss_pred             hHHHHHHhhcccccccccccCCCCCcccC
Q 000112         1590 EDEIASLYSAICSAELNMNEFPEDNWQWA 1618 (2161)
Q Consensus      1590 e~e~~~~~~~~~~ae~~~~d~~dd~WQ~~ 1618 (2161)
                      ++||+.++.. +..+=++++-.+..|+..
T Consensus       161 ~~eI~~l~~~-~~~~Gnvi~W~~~~~~~~  188 (195)
T PF00354_consen  161 PEEIRALASC-CCYKGNVISWDDLRWSIS  188 (195)
T ss_dssp             HHHHHHHHHT--S---SSEEGGGBEEEEE
T ss_pred             HHHHHHHHhC-CCCCCCEEccccCeEEee
Confidence            9999999986 665666776666666554


No 13 
>cd00110 LamG Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules.
Probab=94.18  E-value=0.31  Score=49.98  Aligned_cols=110  Identities=22%  Similarity=0.327  Sum_probs=63.6

Q ss_pred             eeEEEEEEEeecCCCceeeecccccchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccccc-eecCCceEEE
Q 000112         1432 KVSFSIAVMIQPESGPVCLLGTEFQKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSISAT-SIADGRWHIV 1510 (2161)
Q Consensus      1432 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~ 1510 (2161)
                      .-.+++.+.+.|.+..-.||-...+..  -+.+.+    .++.|++-+++-.. .+..      .+... .+.||+||.|
T Consensus        19 ~~~~~i~~~frt~~~~g~l~~~~~~~~--~~~~~l----~l~~g~l~~~~~~g-~~~~------~~~~~~~v~dg~Wh~v   85 (151)
T cd00110          19 RTRLSISFSFRTTSPNGLLLYAGSQNG--GDFLAL----ELEDGRLVLRYDLG-SGSL------VLSSKTPLNDGQWHSV   85 (151)
T ss_pred             cceeEEEEEEEeCCCCeEEEEecCCCC--CCEEEE----EEECCEEEEEEcCC-cccE------EEEccCccCCCCEEEE
Confidence            446777788888765555554444321  111211    14566766654433 2222      22222 6999999999


Q ss_pred             EEEEeccccceeeeecccccccccccccccccccccCCceEEEecCCCC
Q 000112         1511 TMTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPT 1559 (2161)
Q Consensus      1511 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1559 (2161)
                      +++.+.  ++++-|+||.--. +...  +.+.-.=.....+++|-.|..
T Consensus        86 ~i~~~~--~~~~l~VD~~~~~-~~~~--~~~~~~~~~~~~~~iGg~~~~  129 (151)
T cd00110          86 SVERNG--RSVTLSVDGERVV-ESGS--PGGSALLNLDGPLYLGGLPED  129 (151)
T ss_pred             EEEECC--CEEEEEECCccEE-eeeC--CCCceeecCCCCeEEcCCCCc
Confidence            999987  7899999997111 1111  111112246778899988864


No 14 
>smart00210 TSPN Thrombospondin N-terminal -like domains. Heparin-binding and cell adhesion domain of thrombospondin
Probab=94.11  E-value=0.32  Score=53.80  Aligned_cols=86  Identities=22%  Similarity=0.310  Sum_probs=51.1

Q ss_pred             EEEEEEEeecC-CCceeeecccc-cchhhheeeeccccccccccceeEEEEEe---cCCceeeeeeeccccceecCCceE
Q 000112         1434 SFSIAVMIQPE-SGPVCLLGTEF-QKKVCWEILVAGSEQGIEAGQVGLRLITK---GDRQTTVAKDWSISATSIADGRWH 1508 (2161)
Q Consensus      1434 ~~~~~~~~~~~-~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~ 1508 (2161)
                      .||+.+.++|. ..+--||...- |++.=+++.+-|       ++.-+.+.++   |+.++.+-     ....++||+||
T Consensus        53 ~fsi~~~~r~~~~~~g~L~si~~~~~~~~l~v~l~g-------~~~~~~~~~~~~~g~~~~~~f-----~~~~l~dg~WH  120 (184)
T smart00210       53 DFSLLTTFRQTPKSRGVLFAIYDAQNVRQFGLEVDG-------RANTLLLRYQGVDGKQHTVSF-----RNLPLADGQWH  120 (184)
T ss_pred             CeEEEEEEEeCCCCCeEEEEEEcCCCcEEEEEEEeC-------CccEEEEEECCCCCcEEEEee-----cCCccccCCce
Confidence            46666666665 34444554432 444444444432       2334555542   32232221     12469999999


Q ss_pred             EEEEEEeccccceeeeecccccccc
Q 000112         1509 IVTMTIDADIGEATCYLDGGFDGYQ 1533 (2161)
Q Consensus      1509 ~~~~~~~~~~~~~~~~~~~~~~~~~ 1533 (2161)
                      .++++|+.+  .++-|+|+..-+-+
T Consensus       121 ~lal~V~~~--~v~LyvDC~~~~~~  143 (184)
T smart00210      121 KLALSVSGS--SATLYVDCNEIDSR  143 (184)
T ss_pred             EEEEEEeCC--EEEEEECCccccce
Confidence            999999887  69999999876544


No 15 
>smart00282 LamG Laminin G domain.
Probab=93.19  E-value=0.71  Score=47.35  Aligned_cols=109  Identities=20%  Similarity=0.215  Sum_probs=64.7

Q ss_pred             EEEEEEeecCCCceeeecccc-cchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccccceecCCceEEEEEE
Q 000112         1435 FSIAVMIQPESGPVCLLGTEF-QKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSISATSIADGRWHIVTMT 1513 (2161)
Q Consensus      1435 ~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1513 (2161)
                      +++.+.++|.+..=.||-+.. +.+.   ++.+    .++.|++-++.-..+ +...    -......+.||+||.|.++
T Consensus         3 ~~i~~~frt~~~~g~l~~~~~~~~~~---~l~l----~l~~g~l~~~~~~g~-~~~~----~~~~~~~~~dg~WH~v~i~   70 (135)
T smart00282        3 LSISFSFRTTSPNGLLLYAGSKNGGD---YLAL----ELRDGRLVLRYDLGS-GPAR----LTSDPTPLNDGQWHRVAVE   70 (135)
T ss_pred             eEEEEEEEeCCCCEEEEEeCCCCCCC---EEEE----EEECCEEEEEEECCC-CCEE----EEECCeEeCCCCEEEEEEE
Confidence            566777777765445554433 1221   1211    234688777666533 2211    1224478999999999999


Q ss_pred             EeccccceeeeecccccccccccccccccccccCCceEEEecCCCCc
Q 000112         1514 IDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTD 1560 (2161)
Q Consensus      1514 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1560 (2161)
                      .+  .++.+-++||...-...   .+.....-+..+.+++|-.|+..
T Consensus        71 ~~--~~~~~l~VD~~~~~~~~---~~~~~~~l~~~~~l~iGG~p~~~  112 (135)
T smart00282       71 RN--GRRVTLSVDGENPVSGE---SPGGLTILNLDGPLYLGGLPEDL  112 (135)
T ss_pred             Ee--CCEEEEEECCCccccEE---CCCCceEEecCCCcEEccCCchh
Confidence            87  46788999996432221   12222344556789999888753


No 16 
>smart00560 LamGL LamG-like jellyroll fold domain.
Probab=90.83  E-value=0.89  Score=47.63  Aligned_cols=82  Identities=20%  Similarity=0.253  Sum_probs=49.5

Q ss_pred             EEEEEEEeecCCCce--eeecccccchhhheeeeccccccccccceeEEEEEecCCceeeeeeeccc---cceecCCceE
Q 000112         1434 SFSIAVMIQPESGPV--CLLGTEFQKKVCWEILVAGSEQGIEAGQVGLRLITKGDRQTTVAKDWSIS---ATSIADGRWH 1508 (2161)
Q Consensus      1434 ~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~ 1508 (2161)
                      +|++++.|.|++.|-  .+++         .- +.+-. +-..++.++-|-...+.+      |...   .+....|+||
T Consensus         2 ~fTv~aWv~~~~~~~~~~~~~---------~~-v~~~~-~~~~~~~~f~l~~~~~~~------w~~~~~~~~~~~~~~W~   64 (133)
T smart00560        2 SFTLEAWVKLESAGGSQPIIT---------GA-AVAQP-TISEKALTFFLRAKSVQG------WQTARTGATADWIGVWV   64 (133)
T ss_pred             cEEEEEEEeecccCcccceee---------eE-EEEcc-CCCCCceEEEEEeeccCC------EEEeccccCCCCCCCEE
Confidence            699999999997642  1110         11 11111 223355665554443222      2221   1222239999


Q ss_pred             EEEEEEeccccceeeeeccccccc
Q 000112         1509 IVTMTIDADIGEATCYLDGGFDGY 1532 (2161)
Q Consensus      1509 ~~~~~~~~~~~~~~~~~~~~~~~~ 1532 (2161)
                      -|+++.|.+.|+.+.|+||-..+-
T Consensus        65 hva~v~d~~~g~~~lYvnG~~~~~   88 (133)
T smart00560       65 HLAGVYDGGAGKLSLYVNGVEVAT   88 (133)
T ss_pred             EEEEEEECCCCeEEEEECCEEccc
Confidence            999999999999999999976653


No 17 
>cd02619 Peptidase_C1 C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel str
Probab=81.90  E-value=2.3  Score=46.44  Aligned_cols=49  Identities=24%  Similarity=0.493  Sum_probs=39.4

Q ss_pred             CcccCceeEEEEEEEEc--CeEEEEEecCCCCCccccCCCCCCCccccHHHHhhhCCCCCCCCCeeecchhhHhhcc
Q 000112         1924 GIVQGHAYSILQVREVD--GHKLVQIRNPWANEVEWNGPWSDSSPEWTDRMKHKLKHVPQSKDGIFWMSWQDFQIHF 1998 (2161)
Q Consensus      1924 GLVsGHAYSVLdV~EVd--G~RLVRLRNPWG~~~EWKG~WSD~S~eWTeeLKkkL~~~p~sDDGtFWMSfEDFLkyF 1998 (2161)
                      .-..+||-.|++.....  +.....+||-||.  .|                        .++|.|||+++++..++
T Consensus       168 ~~~~~Hav~ivGy~~~~~~~~~~~i~~NSwG~--~w------------------------g~~Gy~~i~~~~~~~~~  218 (223)
T cd02619         168 GDLGGHAVVIVGYDDNYVEGKGAFIVKNSWGT--DW------------------------GDNGYGRISYEDVYEMT  218 (223)
T ss_pred             CccCCeEEEEEeecCCCCCCCCEEEEEeCCCC--cc------------------------ccCCEEEEehhhhhhhh
Confidence            44679999999998654  6788999999994  44                        24899999999998554


No 18 
>KOG1029 consensus Endocytic adaptor protein intersectin [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=77.56  E-value=3.7  Score=54.17  Aligned_cols=33  Identities=27%  Similarity=0.427  Sum_probs=19.6

Q ss_pred             chhHHHHHHHhhccceeeeccCCEEEecCceee
Q 000112         1011 SVGYFTFLFLLAGRALTVLLSPPIVVYSPRVLP 1043 (2161)
Q Consensus      1011 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1043 (2161)
                      |+..=....-|.|--+-+.|-|-+.+--||-.|
T Consensus        72 SIAmkLi~lkLqG~~lP~~LPPsll~~~~~~~p  104 (1118)
T KOG1029|consen   72 SIAMKLIKLKLQGIQLPPVLPPSLLKQPPRNAP  104 (1118)
T ss_pred             HHHHHHHHHHhcCCcCCCCCChHHhccCCcCCC
Confidence            555555556677777777665546655555444


No 19 
>PF02210 Laminin_G_2:  Laminin G domain;  InterPro: IPR012680 Laminins are large heterotrimeric glycoproteins involved in basement membrane function. The laminin globular (G) domain can be found in one to several copies in various laminin family members, including a large number of extracellular proteins. The C terminus of the laminin alpha chain contains a tandem repeat of five laminin G domains, which are critical for heparin-binding and cell attachment activity. Laminin alpha4 is distributed in a variety of tissues including peripheral nerves, dorsal root ganglion, skeletal muscle and capillaries; in the neuromuscular junction, it is required for synaptic specialisation. The structure of the laminin-G domain has been predicted to resemble that of pentraxin.  Laminin G domains can vary in their function, and a variety of binding functions have been ascribed to different LamG modules. For example, the laminin alpha1 and alpha2 chains each have five C-terminal laminin G domains, where only domains LG4 and LG5 contain binding sites for heparin, sulphatides and the cell surface receptor dystroglycan. Laminin G-containing proteins appear to have a wide variety of roles in cell adhesion, signalling, migration, assembly and differentiation. This entry represents one subtype of laminin G domains, which is sometimes found in association with thrombospondin-type laminin G domains (InterPro: IPR012679).; PDB: 3POY_A 3QCW_B 3R05_B 3ASI_A 3MW4_B 3MW3_A 1QU0_D 1DYK_A 1OKQ_A 3SH4_A ....
Probab=76.41  E-value=5.5  Score=39.52  Aligned_cols=62  Identities=21%  Similarity=0.394  Sum_probs=39.9

Q ss_pred             cccceecCCceEEEEEEEeccccceeeeecccccccccccccccccccccCCceEEEecCCCCccc
Q 000112         1497 ISATSIADGRWHIVTMTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGVRPPTDMD 1562 (2161)
Q Consensus      1497 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1562 (2161)
                      .....++||+||.|+++.+...  ++-++|+.-.-.+.-....  ...=+....+++|-.|+....
T Consensus        46 ~~~~~~~dg~wh~v~i~~~~~~--~~l~Vd~~~~~~~~~~~~~--~~~~~~~~~l~iGg~~~~~~~  107 (128)
T PF02210_consen   46 FSNSNLNDGQWHKVSISRDGNR--VTLTVDGQSVSSESLPSSS--SDSLDPDGSLYIGGLPESNQP  107 (128)
T ss_dssp             ECSSSSTSSSEEEEEEEEETTE--EEEEETTSEEEEEESSSTT--HHCBESEEEEEESSTTTTCTC
T ss_pred             ccCccccccceeEEEEEEeeee--EEEEecCccceEEeccccc--eecccCCCCEEEecccCcccc
Confidence            3455699999999999887765  7888887643322111111  013345667999999886543


No 20 
>cd02248 Peptidase_C1A Peptidase C1A subfamily (MEROPS database nomenclature); composed of cysteine peptidases (CPs) similar to papain, including the mammalian CPs (cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain is an endopeptidase with specific substrate preferences, primarily for bulky hydrophobic or aromatic residues at the S2 subsite, a hydrophobic pocket in papain that accommodates the P2 sidechain of the substrate (the second residue away from the scissile bond). Most members of the papain subfamily are endopeptidases. Some exceptions to this rule can be explained by specific details of the catalytic domains like the occluding loop in cathepsin B which confers an additional carboxydipeptidyl activity and the mini-chain of cathepsin H resulting in an N-terminal exopeptidase activity. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds. Parasitic CPs act extracellularly to help invade tissues and cells, to h
Probab=59.99  E-value=22  Score=39.35  Aligned_cols=43  Identities=16%  Similarity=0.404  Sum_probs=35.3

Q ss_pred             cccCceeEEEEEEEEcCeEEEEEecCCCCCccccCCCCCCCccccHHHHhhhCCCCCCCCCeeecchhh
Q 000112         1925 IVQGHAYSILQVREVDGHKLVQIRNPWANEVEWNGPWSDSSPEWTDRMKHKLKHVPQSKDGIFWMSWQD 1993 (2161)
Q Consensus      1925 LVsGHAYSVLdV~EVdG~RLVRLRNPWG~~~EWKG~WSD~S~eWTeeLKkkL~~~p~sDDGtFWMSfED 1993 (2161)
                      ...+|+=.|++..+-.|.+...+||-||.  +|                        .++|.|||+.++
T Consensus       156 ~~~~Hav~iVGy~~~~~~~ywiv~NSWG~--~W------------------------G~~Gy~~i~~~~  198 (210)
T cd02248         156 TNLNHAVLLVGYGTENGVDYWIVKNSWGT--SW------------------------GEKGYIRIARGS  198 (210)
T ss_pred             CcCCEEEEEEEEeecCCceEEEEEcCCCC--cc------------------------ccCcEEEEEcCC
Confidence            45689999999988777889999999994  44                        346999999877


No 21 
>PF03699 UPF0182:  Uncharacterised protein family (UPF0182);  InterPro: IPR005372 This family contains uncharacterised integral membrane proteins.; GO: 0016021 integral to membrane
Probab=58.26  E-value=13  Score=50.24  Aligned_cols=62  Identities=23%  Similarity=0.379  Sum_probs=37.3

Q ss_pred             hhcCceEEEEEeccCCCCCChh-----h----HHHhh-----hhhhhHHHHHHhhcccc---ee--------------ec
Q 000112          865 AFCGASYLEVVKSREDQVPTKG-----D----FLAAL-----LPLVCIPALLSLCSGLL---KW--------------KD  913 (2161)
Q Consensus       865 ~~~~~~~~~v~~~r~~~~p~~~-----d----fl~a~-----lpl~~ipa~~~l~~gl~---kw--------------~d  913 (2161)
                      .+...+.+-..+.|....|...     +    +....     +-++.++++++++.|+.   .|              +|
T Consensus        62 ~~~~~~~~~a~r~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~W~~~L~f~n~~~Fg~~D  141 (774)
T PF03699_consen   62 LFVFLNLWLAYRSRPKFRPPSPEQQRSDPLERYRELIEPRRRWVIIGVSLVLGLFAGLSASSQWETILLFLNGTPFGITD  141 (774)
T ss_pred             HHHHHHHHHHHhcccccccccccccccchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhHHHHHHHhCCCCCCCCC
Confidence            4455555556666665444322     1    22222     22456777888877764   35              78


Q ss_pred             Cccccccceeeee
Q 000112          914 DDWKLSRGVYVFI  926 (2161)
Q Consensus       914 d~w~~s~~~~~~~  926 (2161)
                      --....-|-|+|.
T Consensus       142 P~Fg~Di~FYvF~  154 (774)
T PF03699_consen  142 PIFGKDISFYVFS  154 (774)
T ss_pred             CCCCCCceeeeeh
Confidence            8888888999985


No 22 
>PF02057 Glyco_hydro_59:  Glycosyl hydrolase family 59;  InterPro: IPR001286 O-Glycosyl hydrolases (EC 3.2.1.) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 59 (GH59 in CAZy) comprises enzymes with only one known activity: galactocerebrosidase (EC 3.2.1.46). Globoid cell leukodystrophy (Krabbe disease) is a severe, autosomal recessive disorder that results from deficiency of galactocerebrosidase (GALC) activity. GALC is responsible for the lysosomal catabolism of certain galactolipids, including galactosylceramide and psychosine.; GO: 0004336 galactosylceramidase activity, 0006683 galactosylceramide catabolic process; PDB: 3ZR6_A 3ZR5_A.
Probab=57.92  E-value=29  Score=46.29  Aligned_cols=91  Identities=29%  Similarity=0.418  Sum_probs=49.7

Q ss_pred             ceeEEEEEEEee-cCCCceeeecccccchhhheeeeccccccc-----cccceeEEEEEecCCceeeeeeeccccceecC
Q 000112         1431 QKVSFSIAVMIQ-PESGPVCLLGTEFQKKVCWEILVAGSEQGI-----EAGQVGLRLITKGDRQTTVAKDWSISATSIAD 1504 (2161)
Q Consensus      1431 ~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1504 (2161)
                      +.++.|.-||+. |++|-|+|.|.--+.- |    ...+.+|+     +.|.--   ||+.-..+++-+...   +.+.-
T Consensus       542 ~NytVs~DV~ie~~~~ggv~lagRv~~~g-~----~~~~~~G~~f~v~~~G~w~---vt~d~~~~~~l~~G~---~~~~~  610 (669)
T PF02057_consen  542 SNYTVSCDVYIETPDTGGVFLAGRVNKGG-C----DVRSARGYFFWVYANGTWS---VTSDLAGTTTLASGT---ADIGA  610 (669)
T ss_dssp             -EEEEEEEEEE-STTT-EEEEEEEE---G-G----GGGG-EEEEEEEETTTEEE---EEEETTS-SEEEEEE----S--T
T ss_pred             eEEEEEEEEEeccCCcCcEEEEEeecccc-c----ccCCCCeEEEEEEcCCcEE---EeccCCCcEEEeeee---ecccC
Confidence            346777888887 5899999987654332 1    12223332     222221   333333333434433   45777


Q ss_pred             CceEEEEEEEeccccceeeeeccccccccc
Q 000112         1505 GRWHIVTMTIDADIGEATCYLDGGFDGYQT 1534 (2161)
Q Consensus      1505 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1534 (2161)
                      ||||+++++|+-++  +++++||..-++-.
T Consensus       611 ~~WhtltL~~~g~~--~ta~lng~~l~~~~  638 (669)
T PF02057_consen  611 GKWHTLTLTISGST--ATAMLNGTVLWTDV  638 (669)
T ss_dssp             T-EEEEEEEEETTE--EEEEETTEEEEEEE
T ss_pred             CeEEEEEEEEECCE--EEEEECCEEeEEec
Confidence            99999999998887  99999998766543


No 23 
>KOG4326 consensus Mitochondrial F1F0-ATP synthase, subunit e [Energy production and conversion]
Probab=56.90  E-value=15  Score=36.81  Aligned_cols=17  Identities=47%  Similarity=0.597  Sum_probs=14.4

Q ss_pred             cchhhhHhhhccchhhh
Q 000112         1242 KFGGYLLLLLGLTAKAE 1258 (2161)
Q Consensus      1242 ~~~~~~~~~~~~~~~~~ 1258 (2161)
                      |||.|-+|+||.+--|-
T Consensus        13 kfGRysaL~lGvaYGa~   29 (81)
T KOG4326|consen   13 KFGRYSALSLGVAYGAF   29 (81)
T ss_pred             HhhHHHHHHHHHHHhHH
Confidence            89999999999876554


No 24 
>TIGR00805 oat sodium-independent organic anion transporter. Proteins of the OAT family catalyze the Na+-independent facilitated transport of organic anions such as bromosulfobromophthalein and prostaglandins as well as conjugated and unconjugated bile acids (taurocholate and cholate, respectively). These transporters have been characterized in mammals, but homologues are present in C. elegans and A. thaliana. Some of the mammalian proteins exhibit a high degree of tissue specificity. For example, the rat OAT is found at high levels in liver and kidney and at lower levels in other tissues. These proteins possess 10-12 putative alpha-helical transmembrane spanners. They may catalyze electrogenic anion uniport or anion exchange.
Probab=47.60  E-value=33  Score=45.11  Aligned_cols=94  Identities=16%  Similarity=0.280  Sum_probs=53.7

Q ss_pred             ccceeeeehhHHHHHHhhhheeeeeccchh----------HHHHHHHHHHHHHHHhh-hhhcccccceeeehhhHHHHHH
Q 000112          919 SRGVYVFITIGLVLLLGAISAVIVVITPWT----------IGVAFLLLLLLIVLAIG-VIHHWASNNFYLTRTQMFFVCF  987 (2161)
Q Consensus       919 s~~~~~~~~~gl~ll~~a~~~v~~~i~~w~----------~gvaf~l~~~~~~~~ig-v~~~was~~f~~~~~~~~~~~~  987 (2161)
                      +...|++..++..+..++..++...+..+.          .|..+.+..+. ...+| .+.-|.++.+-+..++++..|+
T Consensus       328 ~n~~f~~~~l~~~~~~~~~~~~~~~lP~yl~~~~g~s~~~ag~l~~~~~i~-~~~vG~~l~G~l~~r~~~~~~~~~~~~~  406 (633)
T TIGR00805       328 CNPIYMLVILAQVIDSLAFNGYITFLPKYLENQYGISSAEANFLIGVVNLP-AAGLGYLIGGFIMKKFKLNVKKAAYFAI  406 (633)
T ss_pred             cCcHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHcCCcHHHHHHHhhhhhhh-HHHHHHhhhhheeeeecccHHHHHHHHH
Confidence            344566666666666555555444333332          23222222221 12233 3566777777777777777666


Q ss_pred             HHHHHHHH----HHHhhhcCCCCccccchhH
Q 000112          988 LAFLLGLA----AFLVGWFDDKPFVGASVGY 1014 (2161)
Q Consensus       988 ~~~~~~~~----~~~~~~~~~~~~~~~~~~~ 1014 (2161)
                      +..+++++    .|++| -++-|+.|..+.|
T Consensus       407 ~~~~~~~~~~~~~~~~~-C~~~~~agv~~~y  436 (633)
T TIGR00805       407 CLSTLSYLLCSPLFLIG-CESAPVAGVNNPS  436 (633)
T ss_pred             HHHHHHHHHHHHHHeec-CCCCccceeeccC
Confidence            65555543    45555 5888999999987


No 25 
>PF09323 DUF1980:  Domain of unknown function (DUF1980);  InterPro: IPR015402  Members of this family occur in gene pairs with members of Pfam family PF03773. The N-terminal region contains several predicted transmembrane helix regions while the few invariant residues (G, CxxD, and W) occur in the C-terminal region.  Members of this family are found in a set of prokaryotic hypothetical proteins. Their exact function has not, as yet, been defined. 
Probab=44.91  E-value=57  Score=36.77  Aligned_cols=57  Identities=19%  Similarity=0.386  Sum_probs=43.9

Q ss_pred             HHHHHHHHHHhhHHHHHHHhhhhhheeeccceehhhHHHHHHHHHHHHHHHHHHhhhhccc
Q 000112           64 FLALSAWMVVISPVAVLIMWGSWLIVILGRDIIGLAIIMAGTALLLAFYSIMLWWRTQWQS  124 (2161)
Q Consensus        64 ~l~l~a~~~~~~p~~~~~~wg~~~~~~~~~~~~g~a~~~~g~~~~~~~y~~~~w~~t~w~s  124 (2161)
                      +|.|++|.+.+    +-+.+-.-+...+.++.+.++++.+.+.++||.+.++.|+|.+=++
T Consensus         4 ~liL~~~~~l~----~~l~~sG~i~~YI~P~~~~~~~~a~i~l~ilai~q~~~~~~~~~~~   60 (182)
T PF09323_consen    4 FLILLGFGILL----FYLILSGKILLYIHPRYIPLLYFAAILLLILAIVQLWRWFRPKRRK   60 (182)
T ss_pred             HHHHHHHHHHH----HHHHHhCcHHHHhCccHHHHHHHHHHHHHHHHHHHHHHHHhccccc
Confidence            34455554432    2344556677788999999999999999999999999999988774


No 26 
>PF00112 Peptidase_C1:  Papain family cysteine protease This is family C1 in the peptidase classification. ;  InterPro: IPR000668 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  This group of proteins belongs to the peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologs. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues.  The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in baculovirus, eubacteria, yeast, and practically all protozoa, plants and mammals. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides is their ability to inhibit the activity of their cognate enzymes and that certain propeptides exhibit high selectivity for inhibition of the peptidases from which they originate.  
The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159. ; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MOR_B 3HHI_B 1S4V_A 3F75_A 1MEG_A 1PCI_C 1PPO_A 3HD3_B 1F29_A 1EWL_A ....
Probab=44.85  E-value=30  Score=38.04  Aligned_cols=44  Identities=25%  Similarity=0.572  Sum_probs=35.7

Q ss_pred             cccCceeEEEEEEEEcCeEEEEEecCCCCCccccCCCCCCCccccHHHHhhhCCCCCCCCCeeecchhhH
Q 000112         1925 IVQGHAYSILQVREVDGHKLVQIRNPWANEVEWNGPWSDSSPEWTDRMKHKLKHVPQSKDGIFWMSWQDF 1994 (2161)
Q Consensus      1925 LVsGHAYSVLdV~EVdG~RLVRLRNPWG~~~EWKG~WSD~S~eWTeeLKkkL~~~p~sDDGtFWMSfEDF 1994 (2161)
                      ...+|+-.|++..+-.+.....+||-||.  .|                        .++|.|||+.++-
T Consensus       163 ~~~~Hav~iVGy~~~~~~~~wiv~NSWG~--~W------------------------G~~Gy~~i~~~~~  206 (219)
T PF00112_consen  163 ESGGHAVLIVGYDDENGKGYWIVKNSWGT--DW------------------------GDNGYFRISYDYN  206 (219)
T ss_dssp             SSEEEEEEEEEEEEETTEEEEEEE-SBTT--TS------------------------TBTTEEEEESSSS
T ss_pred             ccccccccccccccccceeeEeeehhhCC--cc------------------------CCCeEEEEeeCCC
Confidence            56799999999998888899999999994  34                        2479999998764


No 27 
>PF04156 IncA:  IncA protein;  InterPro: IPR007285 Chlamydia trachomatis is an obligate intracellular bacterium that develops within a parasitophorous vacuole termed an inclusion. The inclusion is nonfusogenic with lysosomes but intercepts lipids from a host cell exocytic pathway. Initiation of chlamydial development is concurrent with modification of the inclusion membrane by a set of C. trachomatis-encoded proteins collectively designated Incs. One of these Incs, IncA (Inclusion membrane protein A), is functionally associated with the homotypic fusion of inclusions [].
Probab=44.49  E-value=12  Score=41.72  Aligned_cols=14  Identities=14%  Similarity=0.337  Sum_probs=9.5

Q ss_pred             cCCEEEecCceeeE
Q 000112         1031 SPPIVVYSPRVLPV 1044 (2161)
Q Consensus      1031 ~~~~~~~~~~~~~~ 1044 (2161)
                      .+|...+.|+.+|.
T Consensus        63 ~~~~~~~~~~~~~~   76 (191)
T PF04156_consen   63 KRPVQSVRPQQIEE   76 (191)
T ss_pred             ccccccchHHHHHh
Confidence            45666677777776


No 28 
>PTZ00334 trans-sialidase; Provisional
Probab=43.17  E-value=33  Score=46.57  Aligned_cols=77  Identities=23%  Similarity=0.403  Sum_probs=50.6

Q ss_pred             CCceEEEEEEEeccccceeeeeccccccc-ccccccccccccccCCceEEEecCCCCccccc--cC-CCcccc--ccchh
Q 000112         1504 DGRWHIVTMTIDADIGEATCYLDGGFDGY-QTGLALSAGNSIWEEGAEVWVGVRPPTDMDVF--GR-SDSEGA--ESKMH 1577 (2161)
Q Consensus      1504 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~-~~~~~~--~~~~~ 1577 (2161)
                      -|+-|.|.++++-. .+.+.|+||--=|- ++  ++..               +.|.++--|  |- ..+.+.  ++++-
T Consensus       642 ~~k~yqVal~L~~G-~~gsvYVDG~~vg~~~~--~l~~---------------~~~~~IshFyiGgdg~~~~~~~~~~VT  703 (780)
T PTZ00334        642 PETTHQVAIVLRNG-KQGSAYVDGQRVGDASC--ELKN---------------TDSKGISHFYIGGDGGSAGSKEDVPVT  703 (780)
T ss_pred             CCCeEEEEEEEeCC-CeEEEEECCEEecCccc--ccCC---------------CCCcccceEEECCCccccccCCCCCEE
Confidence            36779999999542 26899999976552 22  2221               124444444  11 111111  46788


Q ss_pred             hheehhhcccCChHHHHHHhh
Q 000112         1578 IMDVFLWGRCLTEDEIASLYS 1598 (2161)
Q Consensus      1578 ~~~~~~~~~clte~e~~~~~~ 1598 (2161)
                      ...|||.-|+|+++||.+|..
T Consensus       704 V~NVlLYNRpL~~~Ei~~l~~  724 (780)
T PTZ00334        704 ATNVLLYNRPLDDNEIRVLNA  724 (780)
T ss_pred             EeEeEEeCCCCCHHHHHhhhc
Confidence            999999999999999999975


No 29 
>COG1390 NtpE Archaeal/vacuolar-type H+-ATPase subunit E [Energy production and conversion]
Probab=40.77  E-value=2.5e+02  Score=32.83  Aligned_cols=113  Identities=25%  Similarity=0.262  Sum_probs=73.1

Q ss_pred             hhhhhhhhhhhhhhhhcccCCCcCChhhhhccCchhhhhHHHHHHhhhhhhhhHHHHHHHHHhhhcccHHHHHHHHHHHH
Q 000112         1255 AKAERVQDEVRLRLFLDSIGFSDLSAKKIKKWMPEDRRQFEIIQESYIREKEMEEEILMQRREEEGRGKERRKALLEKEE 1334 (2161)
Q Consensus      1255 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1334 (2161)
                      .||+++.+|.+-+               .++=..|-++.-+-.++.+.+.++-|.+...||=-..-+-.-||+.|-.+||
T Consensus        17 eeak~I~~eA~~e---------------ae~i~~ea~~~~~~~~~~~~~~~~~ea~~~~~~iis~A~le~r~~~Le~~ee   81 (194)
T COG1390          17 EEAEEILEEAREE---------------AEKIKEEAKREAEEAIEEILRKAEKEAERERQRIISSALLEARRKLLEAKEE   81 (194)
T ss_pred             HHHHHHHHHHHHH---------------HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            5677777776543               3333456777778888899988887777777665555455555555555444


Q ss_pred             --hhHHhhhhhhcccCCCCCchHH--HHHHHHHHHhcCCccccchhhhHHHHHH
Q 000112         1335 --RKWKEIEASLISSIPNAGNREA--AAMAAAVRAVGGDSVLEDSFARERVSSI 1384 (2161)
Q Consensus      1335 --~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1384 (2161)
                        ..|-+..-.-|..+++...-++  +-|.+++....|+.+.  -+.+++.+.+
T Consensus        82 ~l~~~~~~~~e~L~~i~~~~~~~~l~~ll~~~~~~~~~~~~i--V~~~e~d~~~  133 (194)
T COG1390          82 ILESVFEAVEEKLRNIASDPEYESLQELLIEALEKLLGGELV--VYLNEKDKAL  133 (194)
T ss_pred             HHHHHHHHHHHHHHcCcCCcchHHHHHHHHHHHHhcCCCCeE--EEeCcccHHH
Confidence              2344455556667777666666  6688888888777766  4555555555


No 30 
>PF07946 DUF1682:  Protein of unknown function (DUF1682);  InterPro: IPR012879 The members of this family are all hypothetical eukaryotic proteins of unknown function. One member (Q920S6 from SWISSPROT) is described as being an adipocyte-specific protein, but no evidence of this was found. 
Probab=40.65  E-value=40  Score=41.21  Aligned_cols=10  Identities=50%  Similarity=0.826  Sum_probs=6.1

Q ss_pred             HHHhhHHhhh
Q 000112         1332 KEERKWKEIE 1341 (2161)
Q Consensus      1332 ~~~~~~~~~~ 1341 (2161)
                      .|.|||.|-|
T Consensus       305 eeQrK~eeKe  314 (321)
T PF07946_consen  305 EEQRKYEEKE  314 (321)
T ss_pred             HHHHHHHHHH
Confidence            5666666655


No 31 
>PF00054 Laminin_G_1:  Laminin G domain;  InterPro: IPR012679 Laminins are large heterotrimeric glycoproteins involved in basement membrane function []. The laminin globular (G) domain can be found in one to several copies in various laminin family members, which includes a large number of extracellular proteins. The C terminus of laminin alpha chain contains a tandem repeat of five laminin G domains, which are critical for heparin-binding and cell attachment activity []. Laminin alpha4 is distributed in a variety of tissues including peripheral nerves, dorsal root ganglion, skeletal muscle and capillaries; in the neuromuscular junction, it is required for synaptic specialisation []. The structure of the laminin-G domain has been predicted to resemble that of pentraxin [].  Laminin G domains can vary in their function, and a variety of binding functions has been ascribed to different LamG modules. For example, the laminin alpha1 and alpha2 chains each has five C-terminal laminin G domains, where only domains LG4 and LG5 contain binding sites for heparin, sulphatides and the cell surface receptor dystroglycan []. Laminin G-containing proteins appear to have a wide variety of roles in cell adhesion, signalling, migration, assembly and differentiation. This entry represents one subtype of laminin G domains, which is sometimes found in association with thrombospondin-type laminin G domains (IPR012680 from INTERPRO).; PDB: 1OKQ_A 1DYK_A 2C5D_A 1H30_A 1LHW_A 1KDK_A 1LHU_A 1KDM_A 1LHO_A 1D2S_A ....
Probab=38.55  E-value=40  Score=35.51  Aligned_cols=51  Identities=22%  Similarity=0.473  Sum_probs=33.8

Q ss_pred             ccccceeEEEEEecCCceeeeeeeccccceecCCceEEEEEEEeccccceeeeeccccc
Q 000112         1472 IEAGQVGLRLITKGDRQTTVAKDWSISATSIADGRWHIVTMTIDADIGEATCYLDGGFD 1530 (2161)
Q Consensus      1472 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1530 (2161)
                      |..|++=+|.- -|.+..++    ..+.+ |.||+||.|++.....  +++-.+||...
T Consensus        26 L~~G~l~~~~~-~G~~~~~~----~~~~~-i~dg~wh~v~~~r~~~--~~~L~Vd~~~~   76 (131)
T PF00054_consen   26 LRDGRLEFRYN-LGSGPASL----RSPQK-INDGKWHTVSVSRNGR--NGSLSVDGEEV   76 (131)
T ss_dssp             EETTEEEEEEE-SSSEEEEE----EESSE-TTSSSEEEEEEEEETT--EEEEEETTSEE
T ss_pred             EECCEEEEEEe-CCCcccee----cCCCc-cCCCcceEEEEEEcCc--EEEEEECCccc
Confidence            66788777763 33333333    12333 9999999999988754  55667888765


No 32 
>cd08045 TAF4 TATA Binding Protein (TBP) Associated Factor 4 (TAF4) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex. The TATA Binding Protein (TBP) Associated Factor 4 (TAF4) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of seven General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIID) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryote. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Several hypotheses are
Probab=38.23  E-value=17  Score=41.80  Aligned_cols=44  Identities=30%  Similarity=0.357  Sum_probs=31.5

Q ss_pred             chHHHHHHHHHHHhcCCccccchhhhHHHHHHHHHHHHHHHHHHHHhcCCcceEEeeCCCCCccCccc
Q 000112         1353 NREAAAMAAAVRAVGGDSVLEDSFARERVSSIARRIRTAQLARRALQTGITGAICVLDDEPTTSGRHC 1420 (2161)
Q Consensus      1353 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1420 (2161)
                      .|--||=++|.-|+||+.-+-                        ....+...+|+|++||+.+..+.
T Consensus       166 ~r~r~AN~tA~~AiG~~kk~~------------------------~~i~~rD~l~~LE~e~~~~~s~l  209 (212)
T cd08045         166 MRHRAANATALAAIGGRKKKK------------------------RRITMRDVLFVLEREPRYSKSAL  209 (212)
T ss_pred             HHHHHHHHHHHHHhCCCCccc------------------------ceeeHHHHHHHHHhCchhhhhhh
Confidence            344566677777899987765                        33445677889999998876653


No 33 
>KOG1029 consensus Endocytic adaptor protein intersectin [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=37.63  E-value=43  Score=45.15  Aligned_cols=43  Identities=23%  Similarity=0.360  Sum_probs=20.5

Q ss_pred             hhhhHHHHHHhhhhhhhhHHHHHHHHHhhhcccHHHHHHHHHH
Q 000112         1290 DRRQFEIIQESYIREKEMEEEILMQRREEEGRGKERRKALLEK 1332 (2161)
Q Consensus      1290 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1332 (2161)
                      |||+=|.-....-++-|.|.++-+||--|.+|-.||||.+.++
T Consensus       356 ekkererqEqErk~qlElekqLerQReiE~qrEEerkkeie~r  398 (1118)
T KOG1029|consen  356 EKKERERQEQERKAQLELEKQLERQREIERQREEERKKEIERR  398 (1118)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            4444443333344444555555555544555555555554444


No 34 
>PTZ00266 NIMA-related protein kinase; Provisional
Probab=37.50  E-value=53  Score=45.98  Aligned_cols=10  Identities=50%  Similarity=0.807  Sum_probs=4.6

Q ss_pred             hhHHHHHHHH
Q 000112         1377 ARERVSSIAR 1386 (2161)
Q Consensus      1377 ~~~~~~~~~~ 1386 (2161)
                      -|||...+.|
T Consensus       508 e~er~~r~e~  517 (1021)
T PTZ00266        508 ERERVDRLER  517 (1021)
T ss_pred             HHHHHHHHHH
Confidence            3455544444


No 35 
>PLN02316 synthase/transferase
Probab=36.90  E-value=53  Score=46.06  Aligned_cols=16  Identities=6%  Similarity=-0.068  Sum_probs=10.2

Q ss_pred             CchhHHHHHHHHHHhc
Q 000112         1837 HELWVSILEKAYAKLH 1852 (2161)
Q Consensus      1837 nELWpSLLEKAYAKLh 1852 (2161)
                      +..|..++=||.+.+.
T Consensus       688 d~~RF~~F~~Aale~l  703 (1036)
T PLN02316        688 DGERFGFFCHAALEFL  703 (1036)
T ss_pred             HHHHHHHHHHHHHHHH
Confidence            4567777777766643


No 36 
>PF09472 MtrF:  Tetrahydromethanopterin S-methyltransferase, F subunit (MtrF);  InterPro: IPR013347  Many archaea have evolved energy-yielding pathways marked by one-carbon biochemistry featuring novel cofactors and enzymes. This domain is mostly found in MtrF, where it covers the entire length of the protein. This polypeptide is one of eight subunits of the N5-methyltetrahydromethanopterin: coenzyme M methyltransferase complex found in methanogenic archaea. This is a membrane-associated enzyme complex that uses methyl-transfer reactions to drive a sodium-ion pump []. MtrF itself is involved in the transfer of the methyl group from N5-methyltetrahydromethanopterin to coenzyme M. Subsequently, methane is produced by two-electron reduction of the methyl moiety in methyl-coenzyme M by another enzyme, methyl-coenzyme M reductase. In some organisms this domain is found at the C-terminal region of what appears to be a fusion of the MtrA and MtrF proteins [, ]. The function of these proteins is unknown, though it is likely that they are involved in C1 metabolism.; GO: 0030269 tetrahydromethanopterin S-methyltransferase activity, 0015948 methanogenesis, 0016020 membrane
Probab=35.40  E-value=12  Score=36.74  Aligned_cols=47  Identities=21%  Similarity=0.360  Sum_probs=39.9

Q ss_pred             cccccCCcc-cCCCCccccccccchhhHHHHHhhHHHhhhcccchhhc
Q 000112          796 LEDLGYKGW-TGEPNSFASPYASSVYLGWLMASAIALVVTGVLPIVSW  842 (2161)
Q Consensus       796 ~~~~~~~~~-~~~~~~~~~~y~~~~~~gw~~~~~~~~v~~~~~p~vsw  842 (2161)
                      .||++||.= -++.+...|--.++-..|.++...+|+|+.++.|+.-|
T Consensus        17 vedi~Yk~qLiaR~~kL~SGv~~~~~~GfaiG~~~AlvLv~ip~~l~~   64 (64)
T PF09472_consen   17 VEDIRYKAQLIARDQKLESGVMATGIKGFAIGFLFALVLVGIPILLMF   64 (64)
T ss_pred             HHHHHHHHHHhhhcchhHHHHhhhhhHHHHHHHHHHHHHHHHHHHHhC
Confidence            489999863 45667788888899999999999999999999888766


No 37 
>KOG1144 consensus Translation initiation factor 5B (eIF-5B) [Translation, ribosomal structure and biogenesis]
Probab=33.10  E-value=1e+02  Score=42.08  Aligned_cols=17  Identities=24%  Similarity=0.251  Sum_probs=9.5

Q ss_pred             eeeeecccccccccccc
Q 000112         1521 ATCYLDGGFDGYQTGLA 1537 (2161)
Q Consensus      1521 ~~~~~~~~~~~~~~~~~ 1537 (2161)
                      ++--+||-+|-+-.-+.
T Consensus       397 ~~~~~~~d~dd~ee~~~  413 (1064)
T KOG1144|consen  397 VDLAIDGDDDDDEEELQ  413 (1064)
T ss_pred             ccccccccccchhhhhc
Confidence            33446666776655443


No 38 
>PF09323 DUF1980:  Domain of unknown function (DUF1980);  InterPro: IPR015402  Members of this family occur in gene pairs with members of PF03773 from PFAM. The N-terminal region contains several predicted transmembrane helix regions while the few invariant residues (G, CxxD, and W) occur in the C-terminal region.  Members of this family are found in a set of prokaryotic hypothetical proteins. Their exact function has not, as yet, been defined. 
Probab=32.65  E-value=76  Score=35.83  Aligned_cols=65  Identities=28%  Similarity=0.348  Sum_probs=38.0

Q ss_pred             HHHHHHHHhhhhhcccccce--eee-hhhHHHHHHHHHHHHHHHH-HhhhcCCCCcc-----------ccchhHHHHHHH
Q 000112          956 LLLLIVLAIGVIHHWASNNF--YLT-RTQMFFVCFLAFLLGLAAF-LVGWFDDKPFV-----------GASVGYFTFLFL 1020 (2161)
Q Consensus       956 ~~~~~~~~igv~~~was~~f--~~~-~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~-----------~~~~~~~~~~~~ 1020 (2161)
                      +|+|+.+++-.+|.|.+.+.  |+. |+.-+.+....+++.||.+ +..|+..+.-.           .-..+|+.|++-
T Consensus         4 ~liL~~~~~l~~~l~~sG~i~~YI~P~~~~~~~~a~i~l~ilai~q~~~~~~~~~~~~~~h~h~~~~~~~~~~y~l~~iP   83 (182)
T PF09323_consen    4 FLILLGFGILLFYLILSGKILLYIHPRYIPLLYFAAILLLILAIVQLWRWFRPKRRKEDCHDHGHSKSKKLWSYFLFLIP   83 (182)
T ss_pred             HHHHHHHHHHHHHHHHhCcHHHHhCccHHHHHHHHHHHHHHHHHHHHHHHHhcccccccccccccccccccHHHHHHHHH
Confidence            46677778888899998864  554 4444444444444444444 34556555443           345667776663


No 39 
>PF05297 Herpes_LMP1:  Herpesvirus latent membrane protein 1 (LMP1);  InterPro: IPR007961 This family consists of several latent membrane protein 1 or LMP1s mostly from Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4). LMP1 of HHV-4 is a 62-65 kDa plasma membrane protein possessing six membrane spanning regions, a short cytoplasmic N terminus and a long cytoplasmic carboxy tail of 200 amino acids. HHV-4 virus latent membrane protein 1 (LMP1) is essential for HHV-4 mediated transformation and has been associated with several cases of malignancies. HHV-4-like viruses in Macaca fascicularis (Cynomolgus monkeys) have been associated with high lymphoma rates in immunosuppressed monkeys [].; GO: 0019087 transformation of host cell by virus, 0016021 integral to membrane; PDB: 1CZY_E 1ZMS_B.
Probab=32.17  E-value=15  Score=44.55  Aligned_cols=52  Identities=21%  Similarity=0.425  Sum_probs=0.0

Q ss_pred             HHHHHHHHHHHHHHHhhhhhcccccceeeehhhHHHHHHHHHHHHHHHHHhhhc
Q 000112          949 IGVAFLLLLLLIVLAIGVIHHWASNNFYLTRTQMFFVCFLAFLLGLAAFLVGWF 1002 (2161)
Q Consensus       949 ~gvaf~l~~~~~~~~igv~~~was~~f~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1002 (2161)
                      +|..|+.+.++++++|=.. .|-=.++=-|-.+ ++..++||+||+.-.++..+
T Consensus       107 ~Gi~~l~l~~lLaL~vW~Y-m~lLr~~GAs~Wt-iLaFcLAF~LaivlLIIAv~  158 (381)
T PF05297_consen  107 VGIVILFLCCLLALGVWFY-MWLLRELGASFWT-ILAFCLAFLLAIVLLIIAVL  158 (381)
T ss_dssp             ------------------------------------------------------
T ss_pred             HHHHHHHHHHHHHHHHHHH-HHHHHHhhhHHHH-HHHHHHHHHHHHHHHHHHHH
Confidence            4666666666666655322 4433332223333 34445677777766655554


No 40 
>cd02620 Peptidase_C1A_CathepsinB Cathepsin B group; composed of cathepsin B and similar proteins, including tubulointerstitial nephritis antigen (TIN-Ag). Cathepsin B is a lysosomal papain-like cysteine peptidase which is expressed in all tissues and functions primarily as an exopeptidase through its carboxydipeptidyl activity. Together with other cathepsins, it is involved in the degradation of proteins, proenzyme activation, Ag processing, metabolism and apoptosis. Cathepsin B has been implicated in a number of human diseases such as cancer, rheumatoid arthritis, osteoporosis and Alzheimer's disease. The unique carboxydipeptidyl activity of cathepsin B is attributed to the presence of an occluding loop in its active site which favors the binding of the C-termini of substrate proteins. Some members of this group do not possess the occluding loop. TIN-Ag is an extracellular matrix basement protein which was originally identified as a target Ag involved in anti-tubular basement membrane
Probab=32.12  E-value=85  Score=36.47  Aligned_cols=27  Identities=26%  Similarity=0.389  Sum_probs=23.6

Q ss_pred             cCceeEEEEEEEEcCeEEEEEecCCCC
Q 000112         1927 QGHAYSILQVREVDGHKLVQIRNPWAN 1953 (2161)
Q Consensus      1927 sGHAYSVLdV~EVdG~RLVRLRNPWG~ 1953 (2161)
                      .+||=.|++..+-+|.+...+||-||.
T Consensus       184 ~~HaV~iVGyg~~~g~~YWivrNSWG~  210 (236)
T cd02620         184 GGHAVKIIGWGVENGVPYWLAANSWGT  210 (236)
T ss_pred             CCeEEEEEEEeccCCeeEEEEEeCCCC
Confidence            579999999976678899999999994


No 41 
>PF09586 YfhO:  Bacterial membrane protein YfhO;  InterPro: IPR018580  The yfhO gene is transcribed in Difco sporulation medium and the transcription is affected by the YvrGHb two-component system []. Some members of this family have been annotated as putative ABC transporter permease proteins. 
Probab=30.26  E-value=1.1e+02  Score=41.43  Aligned_cols=24  Identities=33%  Similarity=0.375  Sum_probs=17.1

Q ss_pred             cccccchhhHHHHHHHHHh-hhcCc
Q 000112          846 YRFSLSSAICVGIFAAVLV-AFCGA  869 (2161)
Q Consensus       846 yr~~~~sav~~~~~~~v~~-~~~~~  869 (2161)
                      .||-.++.+.+|+-+++|+ ++++-
T Consensus       214 ~~~~~~~ilg~~lsa~~llP~~~~~  238 (843)
T PF09586_consen  214 LRFIGSSILGVGLSAFLLLPTILSL  238 (843)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            5677777777788788777 66543


No 42 
>COG0815 Lnt Apolipoprotein N-acyltransferase [Cell envelope biogenesis, outer membrane]
Probab=29.30  E-value=1.2e+02  Score=39.81  Aligned_cols=77  Identities=18%  Similarity=0.123  Sum_probs=43.7

Q ss_pred             hHHHHHHHHHHHHHHHHHHhhhhccchhHHHHHHHHHHHHHhhcceEEEEEecC-------CCCC-----CCCCCcceeh
Q 000112           99 AIIMAGTALLLAFYSIMLWWRTQWQSSRAVAVLLLLAVALLCAYELSAVYVTAG-------SHAS-----DRYSPSGFFF  166 (2161)
Q Consensus        99 a~~~~g~~~~~~~y~~~~w~~t~w~s~~~~~~~~~~~~~l~~~~~~~~~yvt~~-------~~~~-----~~~sps~~ff  166 (2161)
                      ..++.+.++.+++|-.+.+|-.+ +.+.+..+..    ++--++|..--.+=+|       -+..     .++-|-+=-+
T Consensus        97 ~~~~~ll~~~lal~~~l~~~~~~-~~~~~~~~~~----~~w~~~E~lR~~~~tGFpW~~~Gy~q~~~~~l~q~a~i~Gv~  171 (518)
T COG0815          97 PLLVLLLAAWLALFLLLVAVLTC-RLWFALLVVP----SAWVAAEWLRGWSLTGFPWLLLGYSQWSPSPLLQLASLGGVW  171 (518)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHH-HHhhhhHHHH----HHHHHHHHHHhccCcCCchhhhchhhccCccccceeeccCHH
Confidence            34557788888888888777655 6666665544    3333445333222222       2222     2233333445


Q ss_pred             hhhHHHHHhhhhhh
Q 000112          167 GVSAIALAINMLFI  180 (2161)
Q Consensus       167 ~~sai~~~~n~l~i  180 (2161)
                      ++|.+.+++|+++.
T Consensus       172 ~lsflvv~~~~~~a  185 (518)
T COG0815         172 LLSFLVVAVNALLA  185 (518)
T ss_pred             HHHHHHHHHHHHHH
Confidence            67888888888753


No 43 
>PF12065 DUF3545:  Protein of unknown function (DUF3545);  InterPro: IPR021932  This family of proteins is functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 60 to 77 amino acids in length. This protein has two completely conserved residues (R and L) that may be functionally important. 
Probab=28.03  E-value=26  Score=34.24  Aligned_cols=10  Identities=70%  Similarity=1.358  Sum_probs=8.6

Q ss_pred             HHhhHHhhhh
Q 000112         1333 EERKWKEIEA 1342 (2161)
Q Consensus      1333 ~~~~~~~~~~ 1342 (2161)
                      ..|||+||||
T Consensus        23 ~KRKWREIEA   32 (59)
T PF12065_consen   23 KKRKWREIEA   32 (59)
T ss_pred             cchhHHHHHH
Confidence            4589999998


No 44 
>KOG2341 consensus TATA box binding protein (TBP)-associated factor, RNA polymerase II [Transcription]
Probab=27.83  E-value=61  Score=42.70  Aligned_cols=27  Identities=19%  Similarity=0.031  Sum_probs=17.3

Q ss_pred             CCchhhHHHHHHHHhhhccceeEEEEE
Q 000112         1055 KNVSVAFLVLYGVALAIEGWGVVASLK 1081 (2161)
Q Consensus      1055 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1081 (2161)
                      |-+++.++-+++-..-.+|=+.=+.+.
T Consensus       189 ~t~~~~~~~~p~s~~~~~g~~~ppq~~  215 (563)
T KOG2341|consen  189 KTLPALRLAVPPSNTFSEGSDPPPQLV  215 (563)
T ss_pred             hcchHhhccCCCcccccCCCCCCcccc
Confidence            556777777777776667665544443


No 45 
>PF05875 Ceramidase:  Ceramidase;  InterPro: IPR008901 This entry consists of several ceramidases. Ceramidases are enzymes involved in regulating cellular levels of ceramides, sphingoid bases, and their phosphates.; GO: 0016811 hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds, in linear amides, 0006672 ceramide metabolic process, 0016021 integral to membrane
Probab=27.78  E-value=54  Score=38.56  Aligned_cols=143  Identities=21%  Similarity=0.196  Sum_probs=74.2

Q ss_pred             CCCCccccccccchhhHHHHHhhHHHhhhcccchhhceeecccccchhhHHHHHHHHHhhhcCceEEEEEeccCCCCCCh
Q 000112          806 GEPNSFASPYASSVYLGWLMASAIALVVTGVLPIVSWFSTYRFSLSSAICVGIFAAVLVAFCGASYLEVVKSREDQVPTK  885 (2161)
Q Consensus       806 ~~~~~~~~~y~~~~~~gw~~~~~~~~v~~~~~p~vswf~tyr~~~~sav~~~~~~~v~~~~~~~~~~~v~~~r~~~~p~~  885 (2161)
                      |++|+..|||-...+=   --|-++   ..++++.-|....|-.+.....+....+++|.+++.-|=-- -++..|    
T Consensus        14 CE~nY~~s~yiAEf~N---tlSNl~---fi~~al~gl~~~~~~~~~~~~~l~~~~l~~VGiGS~~FHaT-l~~~~q----   82 (262)
T PF05875_consen   14 CEENYVVSPYIAEFWN---TLSNLA---FIVAALYGLYLARRRGLERRFALLYLGLALVGIGSFLFHAT-LSYWTQ----   82 (262)
T ss_pred             chhccccCcccchHHH---HHHHHH---HHHHHHHHHHHHhhccccchhHHHHHHHHHHHHhHHHHHhC-hhhhHH----
Confidence            6888999999755432   122222   33355666666666666666666666677776655544322 222222    


Q ss_pred             hhHHHhhhhhhhHHHHHHhhcccceeecCccccccceeeeehhHHHHHHhhhheeeeec--cchhHHHHHHHHHHHHHHH
Q 000112          886 GDFLAALLPLVCIPALLSLCSGLLKWKDDDWKLSRGVYVFITIGLVLLLGAISAVIVVI--TPWTIGVAFLLLLLLIVLA  963 (2161)
Q Consensus       886 ~dfl~a~lpl~~ipa~~~l~~gl~kw~dd~w~~s~~~~~~~~~gl~ll~~a~~~v~~~i--~~w~~gvaf~l~~~~~~~~  963 (2161)
                         |.--||     -+...++-+|-|-++.. -+++.-..+++.|.... +++++....  +|..-.++|..+.+++++.
T Consensus        83 ---l~DelP-----Ml~~~~~~~~~~~~~~~-~~~~~~~~~~~~L~~~~-~~~t~~~~~~~~p~~~~~~f~~~~~~~~~~  152 (262)
T PF05875_consen   83 ---LLDELP-----MLWATLLFLYIVLTRRY-SSPRYRLALPLLLFIYA-VVVTVLYFVLDNPVFHQIAFASLVLLVILR  152 (262)
T ss_pred             ---Hhhhhh-----HHHHHHHHHHHHhcccc-cCchhhHHHHHHHHHHH-HHHHHHHhhhccchhhhhhHHHHHHHHHHH
Confidence               222233     33334444444444433 11222223344443333 444444444  7888788887776666655


Q ss_pred             hhh-hhc
Q 000112          964 IGV-IHH  969 (2161)
Q Consensus       964 igv-~~~  969 (2161)
                      ... +++
T Consensus       153 ~~~~~~~  159 (262)
T PF05875_consen  153 SIYLIRR  159 (262)
T ss_pred             HHHHHHH
Confidence            554 444


No 46 
>cd02698 Peptidase_C1A_CathepsinX Cathepsin X; the only papain-like lysosomal cysteine peptidase exhibiting carboxymonopeptidase activity. It can also act as a carboxydipeptidase, like cathepsin B, but has been shown to preferentially cleave substrates through a monopeptidyl carboxypeptidase pathway. The propeptide region of cathepsin X, the shortest among papain-like peptidases, is covalently attached to the active site cysteine in the inactive form of the enzyme. Little is known about the biological function of cathepsin X. Some studies point to a role in early tumorigenesis. A more recent study indicates that cathepsin X expression is restricted to immune cells suggesting a role in phagocytosis and the regulation of the immune response.
Probab=27.18  E-value=1.2e+02  Score=35.25  Aligned_cols=27  Identities=22%  Similarity=0.363  Sum_probs=23.1

Q ss_pred             cCceeEEEEEEEEc-CeEEEEEecCCCC
Q 000112         1927 QGHAYSILQVREVD-GHKLVQIRNPWAN 1953 (2161)
Q Consensus      1927 sGHAYSVLdV~EVd-G~RLVRLRNPWG~ 1953 (2161)
                      .+|+=.|++.-+.+ |.+.-.+||-||.
T Consensus       178 ~~HaV~IVGyG~~~~g~~YWiikNSWG~  205 (239)
T cd02698         178 INHIISVAGWGVDENGVEYWIVRNSWGE  205 (239)
T ss_pred             CCeEEEEEEEEecCCCCEEEEEEcCCCc
Confidence            47999999987665 8899999999994


No 47 
>PF04405 ScdA_N:  Domain of Unknown function (DUF542)  ;  InterPro: IPR007500 This is a domain of unknown function found at the N terminus of genes involved in cell wall development and nitrous oxide protection. ScdA is required for normal cell growth and development; mutants have an increased level of peptidoglycan cross-linking and aberrant cellular morphology suggesting a role for ScdA in cell wall metabolism []. NorA1, NorA2, and YtfE are involved in the nitrous oxide response. NorA1 and NorA2, which are similar to YtfE, are co-transcribed with the membrane-bound nitrous oxide (NO) reductases. The genes appear to be involved in NO protection but their function is unknown [, ]. 
Probab=27.01  E-value=42  Score=32.07  Aligned_cols=33  Identities=36%  Similarity=0.660  Sum_probs=29.3

Q ss_pred             cChhhHHHHhhhc----ccCchHhHhhhhhcCCCcch
Q 000112          511 NDPRITSMLKKRA----REGDRELTSLLQDKGLDPNF  543 (2161)
Q Consensus       511 ~~~~~~~~l~~~~----~~~~~~l~~ll~dkgldpnf  543 (2161)
                      ++|+-++.++|-+    -.|++-|..-.+.+|+||+-
T Consensus        11 ~~p~~a~vf~~~gIDfCCgG~~~L~eA~~~~~ld~~~   47 (56)
T PF04405_consen   11 EDPRAARVFRKYGIDFCCGGNRSLEEACEEKGLDPEE   47 (56)
T ss_pred             HChHHHHHHHHcCCcccCCCCchHHHHHHHcCCCHHH
Confidence            6899999999777    67999999999999999974


No 48 
>TIGR00570 cdk7 CDK-activating kinase assembly factor MAT1. All proteins in this family for which functions are known are cyclin dependent protein kinases that are components of TFIIH, a complex that is involved in nucleotide excision repair and transcription initiation. Also known as MAT1 (menage a trois 1). This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).
Probab=26.92  E-value=96  Score=38.49  Aligned_cols=104  Identities=30%  Similarity=0.467  Sum_probs=53.5

Q ss_pred             ccccCCCCCchhhHhhhhhhhhhhh----hhhcccceeeeeccchhhhHhhhccchhhhhhhhhhhhhhhhcccCCCcCC
Q 000112         1204 RFRHELSSDYDYRREMCTHARILAL----EEAIDTEWVYMWDKFGGYLLLLLGLTAKAERVQDEVRLRLFLDSIGFSDLS 1279 (2161)
Q Consensus      1204 ~~~~~~~~~~~~~~~~~~~~~~~~~----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1279 (2161)
                      .|+.-.-.|...-|++-.--||+..    ||-.+|-     ..|.-||          |+|.|=| ..| ...|.- .-.
T Consensus        57 ~fr~q~F~D~~vekEV~iRkrv~~i~Nk~e~dF~~l-----~~yNdYL----------E~vEdii-~nL-~~~~d~-~~t  118 (309)
T TIGR00570        57 NFRVQLFEDPTVEKEVDIRKRVLKIYNKREEDFPSL-----REYNDYL----------EEVEDIV-YNL-TNNIDL-ENT  118 (309)
T ss_pred             hccccccccHHHHHHHHHHHHHHHHHccchhccCCH-----HHHHHHH----------HHHHHHH-HHh-hcCCcH-HHH
Confidence            3555566777778888887887765    3333321     2344555          2332211 000 001100 113


Q ss_pred             hhhhhccCchhhhhHHHHHHhhhhhhhhHHHHHHHHHhhhcccHHHHHHH
Q 000112         1280 AKKIKKWMPEDRRQFEIIQESYIREKEMEEEILMQRREEEGRGKERRKAL 1329 (2161)
Q Consensus      1280 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1329 (2161)
                      ..+|++|--|.+   +.|+++-.|+++ |++.++|+.++|.+-++.|+..
T Consensus       119 e~~l~~y~~~n~---~~I~~n~~~~~~-e~~~~~~~~~~E~~~~~~rr~~  164 (309)
T TIGR00570       119 KKKIETYQKENK---DVIQKNKEKSTR-EQEELEEALEFEKEEEEQRRLL  164 (309)
T ss_pred             HHHHHHHHHHhH---HHHHHHHHHHHh-HHHHHHHHHHHHHHHHHHHHHH
Confidence            456666655544   458888888776 4455555555555555444333


No 49 
>PTZ00266 NIMA-related protein kinase; Provisional
Probab=26.88  E-value=89  Score=43.96  Aligned_cols=16  Identities=25%  Similarity=0.607  Sum_probs=6.6

Q ss_pred             HHHHhhHHHhhhcccc
Q 000112          823 WLMASAIALVVTGVLP  838 (2161)
Q Consensus       823 w~~~~~~~~v~~~~~p  838 (2161)
                      |.++..+--++||-.|
T Consensus       227 WSLG~ILYELLTGk~P  242 (1021)
T PTZ00266        227 WALGCIIYELCSGKTP  242 (1021)
T ss_pred             HHHHHHHHHHHHCCCC
Confidence            3443333334444444


No 50 
>cd06899 lectin_legume_LecRK_Arcelin_ConA legume lectins, lectin-like receptor kinases, arcelin, concanavalinA, and alpha-amylase inhibitor. This alignment model includes the legume lectins (also known as agglutinins), the arcelin (also known as phytohemagglutinin-L) family of lectin-like defense proteins, the LecRK family of lectin-like receptor kinases, concanavalinA (ConA), and an alpha-amylase inhibitor.  Arcelin is a major seed glycoprotein discovered in kidney beans (Phaseolus vulgaris) that has insecticidal properties and protects the seeds from predation by larvae of various bruchids.  Arcelin is devoid of monosaccharide binding properties and lacks a key metal-binding loop that is present in other members of this family.  Phytohaemagglutinin (PHA) is a lectin found in plants, especially beans, that affects cell metabolism by inducing mitosis and by altering the permeability of the cell membrane to various proteins.  PHA agglutinates most mammalian red blood cell types by bindin
Probab=26.83  E-value=2.2e+02  Score=33.46  Aligned_cols=37  Identities=14%  Similarity=0.129  Sum_probs=30.3

Q ss_pred             eeeeccccceecCCceEEEEEEEeccccceeeeeccc
Q 000112         1492 AKDWSISATSIADGRWHIVTMTIDADIGEATCYLDGG 1528 (2161)
Q Consensus      1492 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1528 (2161)
                      +..|......+.||++|.|.|.-|+.+..-+.||+..
T Consensus       150 ~~~~~~~~~~l~~g~~~~v~I~Y~~~~~~L~V~l~~~  186 (236)
T cd06899         150 AGYWDDDGGKLKSGKPMQAWIDYDSSSKRLSVTLAYS  186 (236)
T ss_pred             eeccccccccccCCCeEEEEEEEcCCCCEEEEEEEeC
Confidence            3556555445789999999999999999999999854


No 51 
>PF05154 TM2:  TM2 domain;  InterPro: IPR007829 This domain is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown; however, it occurs in a wide range of protein contexts.
Probab=26.64  E-value=21  Score=32.97  Aligned_cols=33  Identities=36%  Similarity=0.609  Sum_probs=22.4

Q ss_pred             cchhhHHhhhccceeeeeeecchhhcccch---HHHHHHH
Q 000112          290 QSRVAALFVAGTSRVFLICFGVHYWYLGHC---ISYAVVA  326 (2161)
Q Consensus       290 ~~~~~~~~~a~~~r~~li~fg~~~w~lghc---i~y~~~~  326 (2161)
                      ||+.++.+.+-    |+-.||+|.+|+||=   +.|.++.
T Consensus         3 K~~~~a~lL~~----~lG~~G~hrfYlg~~~~g~~~l~~~   38 (51)
T PF05154_consen    3 KSKWIAYLLSF----FLGWFGLHRFYLGKYGKGILYLLTF   38 (51)
T ss_pred             cCHHHHHHHHH----HHhhccccceecCchHHHHHHHHHH
Confidence            56666666542    566899999999985   4444444


No 52 
>PF09991 DUF2232:  Predicted membrane protein (DUF2232);  InterPro: IPR018710 This family of bacterial and eukaryotic proteins has no known function; however this signature belongs to a Pfam Gx transporter clan.
Probab=26.16  E-value=61  Score=37.56  Aligned_cols=87  Identities=17%  Similarity=0.305  Sum_probs=44.7

Q ss_pred             ccccccceeeeehhHHHHHHhhhheeeeeccchhHHHHHHHHHHHHHHHhhhhhcccccceeeehhhHHHHHHHHHHHH-
Q 000112          915 DWKLSRGVYVFITIGLVLLLGAISAVIVVITPWTIGVAFLLLLLLIVLAIGVIHHWASNNFYLTRTQMFFVCFLAFLLG-  993 (2161)
Q Consensus       915 ~w~~s~~~~~~~~~gl~ll~~a~~~v~~~i~~w~~gvaf~l~~~~~~~~igv~~~was~~f~~~~~~~~~~~~~~~~~~-  993 (2161)
                      .|++++..-.+..+++++.+-..........--..-+..++..++++-+++++|+|..+. -++|.=-.+..++.+++. 
T Consensus       199 ~~~lP~~~~~~~i~~~~~~l~~~~~~~~~~~~i~~Nl~~v~~~l~~~qGla~~~~~~~~~-~~~~~~~~l~~~~~i~~~~  277 (290)
T PF09991_consen  199 EWRLPRWLIWLLIVALALSLVGGGFGGSWLQIIGLNLLIVLSFLFFIQGLAVIHFFLKRR-KMSKFLRVLLYILLILFPF  277 (290)
T ss_pred             HHhCcHHHHHHHHHHHHHHHHhcccchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHc-CCcHHHHHHHHHHHHHHHH
Confidence            488887654333344333321111111111122234556666777888999999998776 666654333333333332 


Q ss_pred             --HHHHHhhhc
Q 000112          994 --LAAFLVGWF 1002 (2161)
Q Consensus       994 --~~~~~~~~~ 1002 (2161)
                        ..-.++|.+
T Consensus       278 ~~~~l~~lG~~  288 (290)
T PF09991_consen  278 LIVILALLGLI  288 (290)
T ss_pred             HHHHHHHHHhh
Confidence              334445544


No 53 
>PF14402 7TM_transglut:  7 transmembrane helices usually fused to an inactive transglutaminase
Probab=25.59  E-value=86  Score=38.85  Aligned_cols=55  Identities=25%  Similarity=0.473  Sum_probs=42.1

Q ss_pred             eeccchhHHHHHH-------HHHHHHHHHhhhhhcccccceeeehhhHHHHHHHHHHHHHHHHHhhh
Q 000112          942 VVITPWTIGVAFL-------LLLLLIVLAIGVIHHWASNNFYLTRTQMFFVCFLAFLLGLAAFLVGW 1001 (2161)
Q Consensus       942 ~~i~~w~~gvaf~-------l~~~~~~~~igv~~~was~~f~~~~~~~~~~~~~~~~~~~~~~~~~~ 1001 (2161)
                      +|.-|-.|.+||.       ++++++++++|.+-+     +||+|..+++|-=+|-++....++++.
T Consensus       146 GTFmPVLIAlAF~eT~L~~Gli~FllIV~~GL~iR-----~yLs~LnLLlV~RisaVli~VI~ii~~  207 (313)
T PF14402_consen  146 GTFMPVLIALAFRETQLLWGLILFLLIVAIGLLIR-----SYLSHLNLLLVPRISAVLIVVILIIAA  207 (313)
T ss_pred             cchHHHHHHHHHHHhhhHHHHHHHHHHHHHHHHHH-----HHHHhhhhHHHHHHHHHHHHHHHHHHH
Confidence            4566777777775       778889999999766     699999999998777777666665544


No 54 
>PF06439 DUF1080:  Domain of Unknown Function (DUF1080);  InterPro: IPR010496 This is a family of proteins of unknown function.; PDB: 3IMM_B 3NMB_A 3S5Q_A 3OSD_A 3HBK_A 3H3L_A 3U1X_A.
Probab=24.90  E-value=2.6e+02  Score=30.38  Aligned_cols=102  Identities=17%  Similarity=0.248  Sum_probs=52.8

Q ss_pred             ccCcccccccccccccceeEEEEEEEeecCCCceeee-cc-----cccchhhheeeecccccc----ccccceeEEEEEe
Q 000112         1415 TSGRHCGQIDASICQSQKVSFSIAVMIQPESGPVCLL-GT-----EFQKKVCWEILVAGSEQG----IEAGQVGLRLITK 1484 (2161)
Q Consensus      1415 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~-----~~~~~~~~~~~~~~~~~~----~~~~~~~~~~~~~ 1484 (2161)
                      ..+.+.|.+=... ......+++-++++| +|-..++ -.     +.....|.|+-+.....+    -..|.+=-+    
T Consensus        38 ~~~~~~~~l~~~~-~~~df~l~~d~k~~~-~~~sGi~~r~~~~~~~~~~~~gy~~~i~~~~~~~~~~~~~G~~~~~----  111 (185)
T PF06439_consen   38 SSGSGGGYLYTDK-KFSDFELEVDFKITP-GGNSGIFFRAQSPGDGQDWNNGYEFQIDNSGGGTGLPNSTGSLYDE----  111 (185)
T ss_dssp             GGESSS--EEESS-EBSSEEEEEEEEE-T-T-EEEEEEEESSECCSSGGGTSEEEEEE-TTTCSTTTTSTTSBTTT----
T ss_pred             cCCCCcceEEECC-ccccEEEEEEEEECC-CCCeEEEEEeccccCCCCcceEEEEEEECCCCccCCCCccceEEEe----
Confidence            3444555444443 556677888888754 4433332 22     245677888877776555    111111000    


Q ss_pred             cCCceeeeeeeccccceecCCceEEEEEEEeccccceeeeecccc
Q 000112         1485 GDRQTTVAKDWSISATSIADGRWHIVTMTIDADIGEATCYLDGGF 1529 (2161)
Q Consensus      1485 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1529 (2161)
                      -.++     .-.-....+..|+||.++|++..+.  .++|+||..
T Consensus       112 ~~~~-----~~~~~~~~~~~~~W~~~~I~~~g~~--i~v~vnG~~  149 (185)
T PF06439_consen  112 PPWQ-----LEPSVNVAIPPGEWNTVRIVVKGNR--ITVWVNGKP  149 (185)
T ss_dssp             B-TC-----B-SSS--S--TTSEEEEEEEEETTE--EEEEETTEE
T ss_pred             cccc-----ccccccccCCCCceEEEEEEEECCE--EEEEECCEE
Confidence            0000     0122344578899999999998776  889999964


No 55 
>PRK11588 hypothetical protein; Provisional
Probab=24.04  E-value=2e+02  Score=38.00  Aligned_cols=46  Identities=15%  Similarity=0.249  Sum_probs=36.5

Q ss_pred             HhhhhhhhHHHHHHhhcccceeecCccccccceeeeehhHHHHHHhhhheeeeeccchhHHHH
Q 000112          890 AALLPLVCIPALLSLCSGLLKWKDDDWKLSRGVYVFITIGLVLLLGAISAVIVVITPWTIGVA  952 (2161)
Q Consensus       890 ~a~lpl~~ipa~~~l~~gl~kw~dd~w~~s~~~~~~~~~gl~ll~~a~~~v~~~i~~w~~gva  952 (2161)
                      .++.|+ ++|-+.+||                .=-.+|.+++++-..+.+...++||.++|+|
T Consensus       172 i~f~pi-~v~l~~alG----------------yD~ivg~ai~~lg~~iGf~~s~~NPftvgIA  217 (506)
T PRK11588        172 IAFAII-IAPLMVRLG----------------YDSITTVLVTYVATQIGFATSWMNPFSVAIA  217 (506)
T ss_pred             HHHHHH-HHHHHHHhC----------------CcHHHHHHHHHHHhhhhhcccccCccHHHHH
Confidence            366664 567666665                2347899999999999999999999999887


No 56 
>KOG3011 consensus Ubiquitin-conjugating enzyme [Posttranslational modification, protein turnover, chaperones]
Probab=23.50  E-value=1.7e+02  Score=35.56  Aligned_cols=117  Identities=21%  Similarity=0.306  Sum_probs=63.7

Q ss_pred             hhhHHHHHHHHHhhhcCceEEEEEeccCCCCCChhhHHHhhhhhhhHHHHHHhhcccceeecCcc---------------
Q 000112          852 SAICVGIFAAVLVAFCGASYLEVVKSREDQVPTKGDFLAALLPLVCIPALLSLCSGLLKWKDDDW---------------  916 (2161)
Q Consensus       852 sav~~~~~~~v~~~~~~~~~~~v~~~r~~~~p~~~dfl~a~lpl~~ipa~~~l~~gl~kw~dd~w---------------  916 (2161)
                      .|.|.++|+.++...-|+-=.+             +.+..+|--+|=-..-=|++|+|.|--|.|               
T Consensus        83 ~~~c~~lf~~~~~~ii~~~~s~-------------~~~~~~La~~aG~i~AD~~SGl~HWaaD~~Gsv~tP~vG~~f~rf  149 (293)
T KOG3011|consen   83 AAGCTTLFVSFAKSIIGGFGSH-------------LWLEPALAAYAGYITADLGSGVYHWAADNYGSVSTPWVGRQFERF  149 (293)
T ss_pred             HhhhHHHHHHHHHHHHHhhhhh-------------hhHHHHHHHHHHHHHHhhhcceeEeeccccCccccchhHHHHHHH
Confidence            4568888888777655543211             223333333333334468999999966655               


Q ss_pred             --------ccccceeeeehhHHHHHHhhhheeeeecc---chhHHHHHHHHHHHHHHHhhhhhcccccceeeehhhHHH
Q 000112          917 --------KLSRGVYVFITIGLVLLLGAISAVIVVIT---PWTIGVAFLLLLLLIVLAIGVIHHWASNNFYLTRTQMFF  984 (2161)
Q Consensus       917 --------~~s~~~~~~~~~gl~ll~~a~~~v~~~i~---~w~~gvaf~l~~~~~~~~igv~~~was~~f~~~~~~~~~  984 (2161)
                              .+.|.-++=.   +-|+--|+-+.+-+..   -|.+--+|.+.+-+.|+----||.|+---|=|+|.-+++
T Consensus       150 reHH~dP~tITr~~f~~~---~~ll~~a~~f~v~~~d~~~q~~~~h~fV~~~~i~v~~tnQiHkWsHTy~gLP~wVv~L  225 (293)
T KOG3011|consen  150 QEHHKDPWTITRRQFANN---LHLLARAYTFIVLPLDLAFQDPVFHGFVFLFAICVLFTNQIHKWSHTYSGLPPWVVLL  225 (293)
T ss_pred             HhccCCcceeeHHHHhhh---hHHHHHhheeEecCHHHHhhcccHHHHHHHHHHHHHHHHHHHHHHhhhccCchHHHHH
Confidence                    4444443333   2233324444443321   122223444444444444556999999888899865543


No 57 
>COG4870 Cysteine protease [Posttranslational modification, protein turnover, chaperones]
Probab=22.78  E-value=69  Score=40.39  Aligned_cols=49  Identities=31%  Similarity=0.577  Sum_probs=35.2

Q ss_pred             cCcccCceeEEEEEEEEc----------CeEEEEEecCCCCCccccCCCCCCCccccHHHHhhhCCCCCCCCCeeecchh
Q 000112         1923 SGIVQGHAYSILQVREVD----------GHKLVQIRNPWANEVEWNGPWSDSSPEWTDRMKHKLKHVPQSKDGIFWMSWQ 1992 (2161)
Q Consensus      1923 ~GLVsGHAYSVLdV~EVd----------G~RLVRLRNPWG~~~EWKG~WSD~S~eWTeeLKkkL~~~p~sDDGtFWMSfE 1992 (2161)
                      .+-..|||=.|++...--          |.--+++||-||.  .|                        .++|-|||+++
T Consensus       260 s~~~~gHAv~iVGyDDs~~~n~~~~~~~g~GAfiikNSWGt--~w------------------------G~~GYfwisY~  313 (372)
T COG4870         260 SGENWGHAVLIVGYDDSFDINNFKYGPPGDGAFIIKNSWGT--NW------------------------GENGYFWISYY  313 (372)
T ss_pred             ccccccceEEEEeccccccccccccCCCCCceEEEECcccc--cc------------------------ccCceEEEEee
Confidence            345679999999876421          2236889999995  33                        35799999999


Q ss_pred             hHhhc
Q 000112         1993 DFQIH 1997 (2161)
Q Consensus      1993 DFLky 1997 (2161)
                      +-..-
T Consensus       314 ya~~g  318 (372)
T COG4870         314 YALNG  318 (372)
T ss_pred             ecccc
Confidence            86554


No 58 
>cd01951 lectin_L-type legume lectins. The L-type (legume-type) lectins are a highly diverse family of carbohydrate binding proteins that generally display no enzymatic activity toward the sugars they bind.  This family includes arcelin, concanavalinA, the lectin-like receptor kinases, the ERGIC-53/VIP36/EMP46 type1 transmembrane proteins, and an alpha-amylase inhibitor.  L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face".  This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers.  Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely.
Probab=22.63  E-value=3.7e+02  Score=30.89  Aligned_cols=50  Identities=22%  Similarity=0.286  Sum_probs=32.5

Q ss_pred             CceEEEEEEEeccccceeeeecccccccccccccccccccccCCceEEEec
Q 000112         1505 GRWHIVTMTIDADIGEATCYLDGGFDGYQTGLALSAGNSIWEEGAEVWVGV 1555 (2161)
Q Consensus      1505 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1555 (2161)
                      |+||.|.|+.|+.++.-+.++|+.-.....-+..++.-.- ....+++||+
T Consensus       154 g~~~~v~I~Y~~~~~~L~v~l~~~~~~~~~~l~~~~~l~~-~~~~~~yvGF  203 (223)
T cd01951         154 GNEHTVRITYDPTTNTLTVYLDNGSTLTSLDITIPVDLIQ-LGPTKAYFGF  203 (223)
T ss_pred             CCEEEEEEEEeCCCCEEEEEECCCCccccccEEEeeeecc-cCCCcEEEEE
Confidence            9999999999999999999999764312122222222211 1246777765


No 59 
>PRK02509 hypothetical protein; Provisional
Probab=22.05  E-value=1.6e+02  Score=41.25  Aligned_cols=34  Identities=32%  Similarity=0.527  Sum_probs=24.7

Q ss_pred             hhhHHHHHHhhcccc---ee--------------ecCccccccceeeeehh
Q 000112          895 LVCIPALLSLCSGLL---KW--------------KDDDWKLSRGVYVFITI  928 (2161)
Q Consensus       895 l~~ipa~~~l~~gl~---kw--------------~dd~w~~s~~~~~~~~~  928 (2161)
                      +..||++++++.|+.   .|              +|--....-|-|+|.-=
T Consensus       188 ~~~i~~~~sl~~g~~~~~~W~~~l~f~n~~~Fg~~DP~Fg~DisFYvF~LP  238 (973)
T PRK02509        188 LRGIAIILSLAFGLILSGNWARVLQYFHSTPFNETDPLFGRDISFYIFQLP  238 (973)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHhCCCCCCCCCCCCCCCcEEEEEehH
Confidence            556777777777753   34              78888888999998643


No 60 
>TIGR00917 2A060601 Niemann-Pick C type protein family. The model describes Niemann-Pick C type protein in eukaryotes. The defective protein has been associated with Niemann-Pick disease which is described in humans as autosomal recessive lipidosis. It is characterized by the lysosomal accumulation of unesterified cholesterol. It is an integral membrane protein, which indicates that this protein is most likely involved in cholesterol transport or acts as some component of cholesterol homeostasis.
Probab=21.86  E-value=39  Score=47.98  Aligned_cols=78  Identities=23%  Similarity=0.292  Sum_probs=42.9

Q ss_pred             HHHHHHHHhhh------hhcccccceee-----------ehhh----HH----HHHHHHHHHHHHHHHhhhcCCCCcccc
Q 000112          956 LLLLIVLAIGV------IHHWASNNFYL-----------TRTQ----MF----FVCFLAFLLGLAAFLVGWFDDKPFVGA 1010 (2161)
Q Consensus       956 ~~~~~~~~igv------~~~was~~f~~-----------~~~~----~~----~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1010 (2161)
                      ++-++|+||||      +|.|...+-.-           +..|    ++    --++++-+.-.+||++|.+-+-|-+- 
T Consensus       640 v~PFLvL~IGVD~ifilv~~~~r~~~~~~~~~~~~~~~~~~~~ri~~~l~~~G~sI~ltslt~~~aF~~g~~s~~Pavr-  718 (1204)
T TIGR00917       640 VIPFLVLAVGVDNIFILVQTYQRLERFYREVGVDNEQELTLEQQLGRALGEVGPSITLASLSESLAFFLGALSKMPAVR-  718 (1204)
T ss_pred             HHHHHHHHHHhhHHHHHHHHHHHhhhccccccccccccCCHHHHHHHHHHHhhHHHHHHHHHHHHHHHHHhccCChHHH-
Confidence            45577889998      56675433210           2212    11    34667777888899999998777442 


Q ss_pred             chhHHHHHHHhhccceeeeccCCE
Q 000112         1011 SVGYFTFLFLLAGRALTVLLSPPI 1034 (2161)
Q Consensus      1011 ~~~~~~~~~~~~~~~~~~~~~~~~ 1034 (2161)
                      ..|.++-+.++.-=.+++.+-|++
T Consensus       719 ~F~~~aa~av~~~fll~it~f~al  742 (1204)
T TIGR00917       719 AFSLFAGLAVFIDFLLQITAFVAL  742 (1204)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHH
Confidence            334444333333333333333333


No 61 
>PF04123 DUF373:  Domain of unknown function (DUF373);  InterPro: IPR007254 This archaeal family of unknown function is predicted to be an integral membrane protein with six transmembrane regions.
Probab=21.79  E-value=40  Score=42.08  Aligned_cols=138  Identities=23%  Similarity=0.429  Sum_probs=77.5

Q ss_pred             eee-ehhHHHHHHhhhheeeeeccchhHHHHHHHHHHHH-HHHhhh---hhccccc---ceeeehhhHHHHHHHHHHHHH
Q 000112          923 YVF-ITIGLVLLLGAISAVIVVITPWTIGVAFLLLLLLI-VLAIGV---IHHWASN---NFYLTRTQMFFVCFLAFLLGL  994 (2161)
Q Consensus       923 ~~~-~~~gl~ll~~a~~~v~~~i~~w~~gvaf~l~~~~~-~~~igv---~~~was~---~f~~~~~~~~~~~~~~~~~~~  994 (2161)
                      ++| +- |++||+-++.+++.. ..+.+++..+++++.+ .=+.|.   +.+|.++   .+|-.|.... .-..|.++.+
T Consensus       161 ~~lGvP-G~~lLiy~i~~l~~~-~~~a~~~i~~~iG~yll~kGfgld~~~~~~~~~~~~~l~~g~it~i-tyvva~~l~i  237 (344)
T PF04123_consen  161 TFLGVP-GLILLIYAILALLGY-PAYALGIILLLIGLYLLYKGFGLDDYLREWLERFRESLYEGRITFI-TYVVALLLII  237 (344)
T ss_pred             eeecch-HHHHHHHHHHHHHcc-hHHHHHHHHHHHHHHHHHHhcCcHHHHHHHHHHhccccccceeehH-HHHHHHHHHH
Confidence            455 55 999999999998875 4555555555554444 335555   5566554   4666654333 3344444555


Q ss_pred             HHHHhhhcC------CCC------ccccchhHHHH--HHHhhccceeeeccCCEEEecCceeeEEEeecccccCCCchhh
Q 000112          995 AAFLVGWFD------DKP------FVGASVGYFTF--LFLLAGRALTVLLSPPIVVYSPRVLPVYVYDAHADCGKNVSVA 1060 (2161)
Q Consensus       995 ~~~~~~~~~------~~~------~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1060 (2161)
                      .+...|...      ..+      |+=.++.||++  +...+||.+.-.+.--...|+--..|.++           .+.
T Consensus       238 ig~i~g~~~~~~~~~~~~~~~~~~f~~~~v~~~~~a~l~~~~G~iid~~l~~~~~~~~~i~~~~~~-----------~a~  306 (344)
T PF04123_consen  238 IGIIYGYLTLWSYYSISGLIVPGTFLYGSVPWLALAALIASLGKIIDEYLRRDFRLWRYINAPFFV-----------IAI  306 (344)
T ss_pred             HHHHHHHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHccCcchHHHHHHHHHH-----------HHH
Confidence            555555441      111      44455666655  44557887776666555555544444432           455


Q ss_pred             HHHHHHHHhhhccc
Q 000112         1061 FLVLYGVALAIEGW 1074 (2161)
Q Consensus      1061 ~~~~~~~~~~~~~~ 1074 (2161)
                      ++++|++..-....
T Consensus       307 ~~v~~~~~~~~l~~  320 (344)
T PF04123_consen  307 GLVLYGFSAYFLSI  320 (344)
T ss_pred             HHHHHHHHHHHHhh
Confidence            56677766554443


No 62 
>PRK10263 DNA translocase FtsK; Provisional
Probab=21.37  E-value=42  Score=47.79  Aligned_cols=29  Identities=21%  Similarity=0.406  Sum_probs=22.8

Q ss_pred             EEeeeeehhhchhccceeeeccccccccCCccc
Q 000112          773 LVICITVFTGSVLALGAIVSAKPLEDLGYKGWT  805 (2161)
Q Consensus       773 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  805 (2161)
                      -.+.|++++.+++.+.+++|+.|.|-    +|+
T Consensus        24 E~~gIlLlllAlfL~lALiSYsPsDP----SwS   52 (1355)
T PRK10263         24 EALLILIVLFAVWLMAALLSFNPSDP----SWS   52 (1355)
T ss_pred             HHHHHHHHHHHHHHHHHHHhCCccCC----ccc
Confidence            35567778888888999999999774    665


No 63 
>PRK15097 cytochrome d terminal oxidase subunit 1; Provisional
Probab=21.29  E-value=2.5e+02  Score=37.32  Aligned_cols=91  Identities=22%  Similarity=0.346  Sum_probs=0.0

Q ss_pred             hHHHHHHHHHHHHHHHhhhhhcccccceeeehhhHHHHHHHHHHHHHHHHHhhhcCCCCccccchhHHHHHHHhhcccee
Q 000112          948 TIGVAFLLLLLLIVLAIGVIHHWASNNFYLTRTQMFFVCFLAFLLGLAAFLVGWFDDKPFVGASVGYFTFLFLLAGRALT 1027 (2161)
Q Consensus       948 ~~gvaf~l~~~~~~~~igv~~~was~~f~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1027 (2161)
                      |||..++++++.++-.    -.|..+..| +.+-.|-++.++..|...|...||+-.|                .||   
T Consensus       393 MVg~G~l~~~l~~~~l----~l~~r~~l~-~~rw~L~~~~~~~plp~iA~~~GWi~tE----------------vGR---  448 (522)
T PRK15097        393 MVACGFLMLAIIALSF----WSVIRNRIG-EKKWLLRAALYGIPLPWIAVEAGWFVAE----------------YGR---  448 (522)
T ss_pred             HHHHHHHHHHHHHHHH----HHHHcCccc-cCcHHHHHHHHHHHHHHHHHHhhhhhee----------------cCC---


Q ss_pred             eeccCCEEEecCceeeEEEeecccccCCCchh--------hHHHHHHHHhhhccc
Q 000112         1028 VLLSPPIVVYSPRVLPVYVYDAHADCGKNVSV--------AFLVLYGVALAIEGW 1074 (2161)
Q Consensus      1028 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--------~~~~~~~~~~~~~~~ 1074 (2161)
                          -|=+||  .+||+      +|..-||+.        .|.++|++.+..+.|
T Consensus       449 ----QPWiVy--g~l~T------~~avS~~s~~~v~~sl~~f~~~Y~~L~~~~~~  491 (522)
T PRK15097        449 ----QPWAIG--EVLPT------AVANSSLTAGDLLFSMVLICGLYTLFLVAELF  491 (522)
T ss_pred             ----CCeEEe--ceeeH------hHhcCCCCHHHHHHHHHHHHHHHHHHHHHHHH


No 64 
>PF13801 Metal_resist:  Heavy-metal resistance; PDB: 3EPV_C 2Y3D_A 2Y3H_D 2Y3G_B 2Y3B_A 2Y39_A 3LAY_H.
Probab=21.20  E-value=3.1e+02  Score=27.37  Aligned_cols=19  Identities=16%  Similarity=0.468  Sum_probs=12.8

Q ss_pred             CchhhhhHHHHHHhhhhhh
Q 000112         1287 MPEDRRQFEIIQESYIREK 1305 (2161)
Q Consensus      1287 ~~~~~~~~~~~~~~~~~~~ 1305 (2161)
                      +||++++++-+.+.|..+-
T Consensus        43 t~eQ~~~l~~~~~~~~~~~   61 (125)
T PF13801_consen   43 TPEQQAKLRALMDEFRQEM   61 (125)
T ss_dssp             THHHHHHHHHHHHHHHHHH
T ss_pred             CHHHHHHHHHHHHHHHHHH
Confidence            5777777777766666544


No 65 
>KOG3583 consensus Uncharacterized conserved protein [Function unknown]
Probab=21.09  E-value=1.7e+02  Score=34.99  Aligned_cols=122  Identities=22%  Similarity=0.325  Sum_probs=64.4

Q ss_pred             ccceeeeeccchhhhHhhhccchhhhhhh-----h--hhhhhhhhccc-CCCcCChhhhhccCchhhhhHHHHHHhhhhh
Q 000112         1233 DTEWVYMWDKFGGYLLLLLGLTAKAERVQ-----D--EVRLRLFLDSI-GFSDLSAKKIKKWMPEDRRQFEIIQESYIRE 1304 (2161)
Q Consensus      1233 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-----~--~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1304 (2161)
                      .|-|--|-|||.-.--.+-||+.--..-|     .  -|-+|+-.|-= -.-....+..--|.-       -|--.|+|-
T Consensus        38 ~~~wp~~le~fs~las~ms~l~~~~~k~~~p~lr~~~~~~~~~~~e~detl~r~TeGRVpvfsH-------~lVPdyLRT  110 (279)
T KOG3583|consen   38 KCPWPLMLEKFSTLASFMSSLQSSVRKSGMPHLRSHVLVTQRLQYEPDETLQRATEGRVPVFSH-------ALVPDYLRT  110 (279)
T ss_pred             cCccHHHHHHHHHHHHHHHHHHHHHHHccCCccccchhhhhhhhcCchHHHHHHhcCccccccc-------ccchHhhcc
Confidence            35599999999988777888875322111     0  11122211100 000000111111111       123468987


Q ss_pred             h---hhHHHHHHHHHhhhcccHHH---H-----------HHHHHHHHhhHHhhhhhhcccCCCCCch-HHHHHHHHH
Q 000112         1305 K---EMEEEILMQRREEEGRGKER---R-----------KALLEKEERKWKEIEASLISSIPNAGNR-EAAAMAAAV 1363 (2161)
Q Consensus      1305 ~---~~~~~~~~~~~~~~~~~~~~---~-----------~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~ 1363 (2161)
                      |   |||+|+.|---|...++..-   .           -.-+.|++|.|  +|++..--|-..-|+ |.|++.|||
T Consensus       111 kPdPe~E~~e~ql~~~aa~~saDaa~kQI~~yNK~is~ll~~lsk~~re~--tEs~~~~piqQT~n~~dT~~lVaaV  185 (279)
T KOG3583|consen  111 KPDPEMENEEGQLDGEAAAKSADAAVKQIAAYNKNISGLLNHLSKVDREH--TESAIEKPIQQTYNRDDTAKLVAAV  185 (279)
T ss_pred             CCChhhHHHHhhhhhHHhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHH--HHhhhcCccccccChhHHHHHHHHH
Confidence            6   89999887665555554321   1           12355788888  888776655555554 456666655


No 66 
>PF15412 Nse4-Nse3_bdg:  Binding domain of Nse4/EID3 to Nse3-MAGE
Probab=21.02  E-value=69  Score=30.49  Aligned_cols=28  Identities=36%  Similarity=0.662  Sum_probs=23.8

Q ss_pred             eeeecCCCCCHHHHHHHHhhhccCCCcc
Q 000112          182 RMVFNGNGLDVDEYVRRAYKFAYPDGIE  209 (2161)
Q Consensus       182 ~~~~~g~~~d~~~~~r~~y~~a~~d~~~  209 (2161)
                      ++-+.|+++|+||||.+..+|.-.+..+
T Consensus        18 ~lk~~~~~fd~deFv~~l~~fm~~~~~~   45 (56)
T PF15412_consen   18 NLKFGGSGFDVDEFVSKLKTFMGGNRFE   45 (56)
T ss_pred             HhccCCCccCHHHHHHHHHHHhCcccCC
Confidence            4567799999999999999998876665


No 67 
>PLN00122 serine/threonine protein phosphatase 2A; Provisional
Probab=20.99  E-value=1e+02  Score=35.39  Aligned_cols=22  Identities=32%  Similarity=0.605  Sum_probs=16.2

Q ss_pred             HHHHHHHHHHHHhhHHhhhhhh
Q 000112         1323 KERRKALLEKEERKWKEIEASL 1344 (2161)
Q Consensus      1323 ~~~~~~~~~~~~~~~~~~~~~~ 1344 (2161)
                      ++++++..+|.|.+|+.||..-
T Consensus       142 ~~~~~~~~~~r~~~W~~le~~A  163 (170)
T PLN00122        142 EAKAKEVEEKREATWKRLEEAA  163 (170)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHH
Confidence            3456666688889999998643


No 68 
>TIGR02916 PEP_his_kin putative PEP-CTERM system histidine kinase. Members of this protein family have a novel N-terminal domain, a single predicted membrane-spanning helix, and a predicted cytosolic histidine kinase domain. We designate this protein PrsK, and its companion DNA-binding response regulator protein (TIGR02915) PrsR. These predicted signal-transducing proteins appear to enable enhancer-dependent transcriptional activation. The prsK gene is often associated with exopolysaccharide biosynthesis genes.
Probab=20.92  E-value=57  Score=42.96  Aligned_cols=36  Identities=11%  Similarity=-0.006  Sum_probs=20.3

Q ss_pred             hhHHHhhhhhhhHHHHHHhhcccceeecCccccccce
Q 000112          886 GDFLAALLPLVCIPALLSLCSGLLKWKDDDWKLSRGV  922 (2161)
Q Consensus       886 ~dfl~a~lpl~~ipa~~~l~~gl~kw~dd~w~~s~~~  922 (2161)
                      ..++..+.|..-++.++.+ .+...+.++++.-++..
T Consensus        58 ~~~~~~l~~~~w~~~l~~~-~~~~~~~~~~~~~~~~~   93 (679)
T TIGR02916        58 VLVLEVFRDAAWLAFLLTL-LRRPATSGKPFNQRPKL   93 (679)
T ss_pred             HHHHHHHHHHHHHHHHHHH-hcccccccCcccchHHH
Confidence            3455555666655555543 34466677777665544


No 69 
>PF02387 IncFII_repA:  IncFII RepA protein family;  InterPro: IPR003446 These proteins are plasmid encoded and essential for plasmid replication; they are also involved in copy control functions.; GO: 0006276 plasmid maintenance
Probab=20.75  E-value=1.1e+02  Score=37.58  Aligned_cols=87  Identities=24%  Similarity=0.387  Sum_probs=52.2

Q ss_pred             hHhhhccchhhhhhhhhhhhhhhhc---ccCCCcCChhhhhccCchhhhhHHHHHHhhhhhhhhHHHHHHHHHhhhcccH
Q 000112         1247 LLLLLGLTAKAERVQDEVRLRLFLD---SIGFSDLSAKKIKKWMPEDRRQFEIIQESYIREKEMEEEILMQRREEEGRGK 1323 (2161)
Q Consensus      1247 ~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1323 (2161)
                      +..++|.+.+.-+-+.+-||+..=+   ..|-..+|..++++..-                ++..+..++.|+++..+|+
T Consensus       159 ff~l~gi~~~kl~~~~~~~l~~~~~~~~~~~~~~is~~e~~~r~~----------------~~~~~~~~~~r~~~~~~~~  222 (281)
T PF02387_consen  159 FFMLLGISEDKLRREQRQRLQWENNGLSKQGEEPISLHEARRRAK----------------EQHRKRALDYRKERRAKGK  222 (281)
T ss_pred             HHHHhCCCHHHHHHHHHHHHHHHHHhhhhcccCCCcHHHHHHHHH----------------HHHHHHHHHHHHHhHHHHH
Confidence            3567899888766666666665533   44667777777643222                2335567778888887788


Q ss_pred             HHHHH--HHHH-HHhhHHhhhhhhcccCC
Q 000112         1324 ERRKA--LLEK-EERKWKEIEASLISSIP 1349 (2161)
Q Consensus      1324 ~~~~~--~~~~-~~~~~~~~~~~~~~~~~ 1349 (2161)
                      +|++|  +.+. |....++|=.-|+.+.|
T Consensus       223 krk~A~rl~~L~e~~ar~~I~~~Lik~ys  251 (281)
T PF02387_consen  223 KRKRARRLAKLDEDEARQEILRQLIKEYS  251 (281)
T ss_pred             HHHHHhhccccCHHHHHHHHHHHHHHHcC
Confidence            77654  2222 22334555555665554


No 70 
>KOG4661 consensus Hsp27-ERE-TATA-binding protein/Scaffold attachment factor (SAF-B) [Transcription]
Probab=20.24  E-value=1.4e+02  Score=39.53  Aligned_cols=30  Identities=40%  Similarity=0.446  Sum_probs=19.5

Q ss_pred             HHHHhhhcccHHHHHHHHHHHHhhHHhhhh
Q 000112         1313 MQRREEEGRGKERRKALLEKEERKWKEIEA 1342 (2161)
Q Consensus      1313 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1342 (2161)
                      +||-+||.--.||||+..|+||+.+-++|.
T Consensus       626 r~RirE~rerEqR~~a~~ERee~eRl~~er  655 (940)
T KOG4661|consen  626 RQRIREEREREQRRKAAVEREELERLKAER  655 (940)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            344444444557888888888877766653


No 71 
>PF02460 Patched:  Patched family;  InterPro: IPR003392 The transmembrane protein, patched, is a receptor for the morphogen Sonic Hedgehog. In Drosophila melanogaster, this protein associates with the smoothened protein to transduce hedgehog signals, leading to the activation of wingless, decapentaplegic and patched itself. It participates in cell interactions that establish pattern within the segment and imaginal disks during development. The mouse homologue may play a role in epidermal development. The human Niemann-Pick C1 protein, defects in which cause Niemann-Pick type II disease, is also a member of this family. This protein is involved in the intracellular trafficking of cholesterol, and may play a role in vesicular trafficking in glia, a process that may be crucial for maintaining the structural and functional integrity of nerve terminals.; GO: 0008158 hedgehog receptor activity, 0016020 membrane
Probab=20.19  E-value=1.1e+02  Score=41.54  Aligned_cols=53  Identities=28%  Similarity=0.419  Sum_probs=36.0

Q ss_pred             HHHHHHHHHhhh------hhcccccceeeehhhHH--------HHHHHHHHHHHHHHHhhhcCCCCc
Q 000112          955 LLLLLIVLAIGV------IHHWASNNFYLTRTQMF--------FVCFLAFLLGLAAFLVGWFDDKPF 1007 (2161)
Q Consensus       955 l~~~~~~~~igv------~~~was~~f~~~~~~~~--------~~~~~~~~~~~~~~~~~~~~~~~~ 1007 (2161)
                      .+.-++++||||      +|.|-...-..+..+-+        --.++.-+--.+||++|.+-.-|=
T Consensus       282 ~v~PFLvlgIGvDd~Fi~~~~~~~~~~~~~~~er~~~~l~~~g~SitiTslT~~~aF~ig~~t~~pa  348 (798)
T PF02460_consen  282 LVIPFLVLGIGVDDMFIMIHAWRRTSPDLSVEERMAETLAEAGPSITITSLTNALAFAIGAITPIPA  348 (798)
T ss_pred             HHHHHHHHHHHHhceEEeHHHHhhhchhccHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCcHH
Confidence            456778889999      89998776665543222        223444455567899999887773


Done!