Query         045193
Match_columns 82
No_of_seqs    109 out of 697
Neff          6.9 
Searched_HMMs 46136
Date          Fri Mar 29 10:44:30 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/045193.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/045193hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF00031 Cystatin:  Cystatin do  99.8   6E-19 1.3E-23  106.6   8.8   59   18-77      1-61  (94)
  2 smart00043 CY Cystatin-like do  99.8 2.6E-18 5.6E-23  106.2   6.4   61   16-77      2-64  (107)
  3 cd00042 CY Substituted updates  99.7   1E-17 2.3E-22  103.0   8.5   59   18-77      1-60  (105)
  4 PF07430 PP1:  Phloem filament   99.6 5.5E-15 1.2E-19  100.6  10.0   69    8-77    105-175 (202)
  5 PF07430 PP1:  Phloem filament   99.1 2.4E-10 5.2E-15   78.1   5.5   66   11-76      1-68  (202)
  6 TIGR01638 Atha_cystat_rel Arab  98.7 8.6E-08 1.9E-12   58.9   7.0   53   26-78     10-62  (92)
  7 PF06907 Latexin:  Latexin;  In  88.4     5.4 0.00012   28.1   7.8   53   25-77      3-60  (220)
  8 PF05679 CHGN:  Chondroitin N-a  88.2     2.7 5.8E-05   32.5   6.8   50   28-77    160-211 (499)
  9 TIGR01572 A_thl_para_3677 Arab  86.2    0.78 1.7E-05   33.1   2.8   49   31-79     42-90  (265)
 10 PF00666 Cathelicidins:  Cathel  71.7     4.2 9.1E-05   23.5   2.2   30   31-61      4-33  (67)
 11 COG3360 Uncharacterized conser  53.5      44 0.00095   19.5   6.1   44   31-75     19-64  (71)
 12 PF07353 Uroplakin_II:  Uroplak  52.4      33 0.00072   23.4   4.1   20   62-81    109-128 (184)
 13 TIGR02105 III_needle type III   50.0      27 0.00059   20.3   3.0   20   25-44     30-49  (72)
 14 PF02995 DUF229:  Protein of un  47.1      24 0.00053   27.3   3.2   28   17-45    451-478 (497)
 15 PRK15344 type III secretion sy  46.0      36 0.00077   19.9   3.0   21   24-44     28-48  (71)
 16 PF13028 DUF3889:  Protein of u  45.3      72  0.0016   19.7   7.7   51   27-78     19-71  (97)
 17 PF07311 Dodecin:  Dodecin;  In  39.2      76  0.0016   18.2   7.4   47   28-75     13-61  (66)
 18 PF00041 fn3:  Fibronectin type  39.2      32  0.0007   18.5   2.2   18   62-79     63-80  (85)
 19 KOG2650 Zinc carboxypeptidase   36.1 1.8E+02  0.0039   22.4   6.3   62   15-76    316-387 (418)
 20 cd08538 SAM_PNT-ESE-2-like Ste  35.8      30 0.00064   20.5   1.7   19   27-45     11-29  (78)
 21 cd02848 Chitinase_N_term Chiti  34.9      83  0.0018   19.7   3.6   19   62-80     76-94  (106)
 22 KOG2786 Putative glutamate/orn  30.9      35 0.00076   25.7   1.7   29   51-80    176-204 (431)
 23 cd06247 M14_CPO Peptidase M14   30.5      93   0.002   22.6   3.8   71    6-77    186-268 (298)
 24 PF14201 DUF4318:  Domain of un  28.4 1.3E+02  0.0028   17.6   4.7   39   38-77     23-65  (74)
 25 PF14873 BNR_assoc_N:  N-termin  27.4      62  0.0014   20.6   2.3   22   56-77     94-116 (137)
 26 PF07488 Glyco_hydro_67M:  Glyc  27.1      72  0.0016   23.9   2.8   40    5-45    108-147 (328)
 27 PF07849 DUF1641:  Protein of u  26.1      81  0.0018   16.2   2.2   15   26-40     20-34  (42)
 28 PF01448 ELM2:  ELM2 domain;  I  25.6      60  0.0013   17.2   1.7   22   20-41     33-54  (55)
 29 PF07820 TraC:  TraC-like prote  25.2   1E+02  0.0022   18.9   2.8   29   17-46     35-63  (92)
 30 cd03870 M14_CPA Peptidase M14   23.8 2.6E+02  0.0056   20.3   5.1   72    7-78    187-269 (301)
 31 PF08329 ChitinaseA_N:  Chitina  23.2 1.9E+02  0.0042   18.7   4.0   19   62-80     79-97  (133)
 32 PF01819 Levi_coat:  Levivirus   21.7 1.7E+02  0.0037   19.0   3.4   23   46-68     53-78  (129)
 33 PF00413 Peptidase_M10:  Matrix  21.4 1.8E+02   0.004   17.9   3.6   30   25-56     17-46  (154)
 34 cd03147 GATase1_Ydr533c_like T  21.1      75  0.0016   22.0   1.8   26   15-40     99-125 (231)
 35 PF08522 DUF1735:  Domain of un  20.3 1.9E+02   0.004   16.5   3.4   15   38-52     15-29  (86)

No 1  
>PF00031 Cystatin:  Cystatin domain;  InterPro: IPR000010 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.  The cystatins are cysteine proteinase inhibitors belonging to MEROPS inhibitor family I25, clan IH. They mainly inhibit peptidases belonging to peptidase families C1 (papain family) and C13 (legumain family). The cystatin family includes:   The Type 1 cystatins, which are intracellular cystatins that are present in the cytosol of many cell types, but can also appear in body fluids at significant concentrations. They are single-chain polypeptides of about 100 residues, which have neither disulphide bonds nor carbohydrate side chains.  The Type 2 cystatins, which are mainly extracellular secreted polypeptides synthesised with a 19-28 residue signal peptide. They are broadly distributed and found in most body fluids.  The Type 3 cystatins, which are multidomain proteins. The mammalian representatives of this group are the kininogens. There are three different kininogens in mammals: H- (high molecular mass, IPR002395 from INTERPRO) and L- (low molecular mass) kininogen which are found in a number of species, and T-kininogen that is found only in rat.  Unclassified cystatins. These are cystatin-like proteins found in a range of organisms: plant phytocystatins, fetuin in mammals, insect cystatins and a puff adder venom cystatin which inhibits metalloproteases of the MEROPS peptidase family M12 (astacin/adamalysin). A number of the cystatin-like proteins have also been shown to be devoid of inhibitory activity.   All true cystatins inhibit cysteine peptidases of the papain family (MEROPS peptidase family C1), and some also inhibit legumain family enzymes (MEROPS peptidase family C13). These peptidases play key roles in physiological processes, such as intracellular protein degradation (cathepsins B, H and L), are pivotal in the remodelling of bone (cathepsin K), and may be important in the control of antigen presentation (cathepsin S, mammalian legumain). Moreover, the activities of such peptidases are increased in pathophysiological conditions, such as cancer metastasis and inflammation. Additionally, such peptidases are essential for several pathogenic parasites and bacteria. Thus in animals cystatins not only have the capacity to regulate normal body processes and perhaps cause disease when down-regulated, but in other organisms may also participate in defence against biotic and abiotic stress. ; GO: 0004869 cysteine-type endopeptidase inhibitor activity; PDB: 3L0R_B 2W9P_K 2W9Q_A 3S67_A 3QRD_D 1R4C_G 3GAX_A 1TIJ_B 1G96_A 3NX0_A ....
Probab=99.80  E-value=6e-19  Score=106.58  Aligned_cols=59  Identities=29%  Similarity=0.433  Sum_probs=54.7

Q ss_pred             cceEECCCCCCHHHHHHHHHHHHHHHhhcCCcee--EeeEEEEEEEeecceEEEEEEEEeeC
Q 045193           18 GGWKSKEDLSEPHVTEIGRFAVMSTKKRSKNEFK--FKSVEKGKTKVVSSTNYRLILVVKDG   77 (82)
Q Consensus        18 Gg~~~i~~~~d~~vq~~a~fAv~~~n~~s~~~~~--~~~V~~~~~QVVaG~nY~l~v~~~~g   77 (82)
                      |||+++ |++||+++++++||+.+||+++++.+.  +.+|+++++|||+|++|+|+++++++
T Consensus         1 Gg~~~~-~~~dp~v~~~~~~al~~~N~~~~~~~~~~~~~v~~a~~QvV~G~~Y~i~~~~~~t   61 (94)
T PF00031_consen    1 GGPSPV-DPNDPEVQEAAEFALDKFNEQSNSGYKFKLVKVISATTQVVAGINYYIEFEVGET   61 (94)
T ss_dssp             SSEEEE-CTTSHHHHHHHHHHHHHHHHHSTTSEEEEEEEEEEEEEEESSSEEEEEEEEEEEE
T ss_pred             CCCccC-CCCCHHHHHHHHHHHHHHHHhCcccCcceeeeeeEEEEeecCCceEEEEEEEEcc
Confidence            899999 789999999999999999999866554  58999999999999999999999885


No 2  
>smart00043 CY Cystatin-like domain. Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as  kininogen, His-rich glycoprotein and fetuin also contain these domains.
Probab=99.75  E-value=2.6e-18  Score=106.18  Aligned_cols=61  Identities=31%  Similarity=0.465  Sum_probs=57.3

Q ss_pred             cccceEECCCCCCHHHHHHHHHHHHHHHhhcCCcee--EeeEEEEEEEeecceEEEEEEEEeeC
Q 045193           16 LVGGWKSKEDLSEPHVTEIGRFAVMSTKKRSKNEFK--FKSVEKGKTKVVSSTNYRLILVVKDG   77 (82)
Q Consensus        16 ~~Gg~~~i~~~~d~~vq~~a~fAv~~~n~~s~~~~~--~~~V~~~~~QVVaG~nY~l~v~~~~g   77 (82)
                      ++|||+++ +.+||+++++++||+.+||+++++.|.  +++++++++|||+|++|+|++++.++
T Consensus         2 ~~Gg~~~~-~~~d~~~~~~~~~a~~~~N~~~~~~~~~~~~~v~~a~~QvvaG~~y~l~~~v~~t   64 (107)
T smart00043        2 CLGGPSDV-PPNDPEVQEAADFAVAEYNKKSNDKYELRVIKVVSAKSQVVAGTNYYLKVEVGET   64 (107)
T ss_pred             CCCCCccC-CCCCHHHHHHHHHHHHHHHHhcccchhhhhhhhheeeeeeecceEEEEEEEEEec
Confidence            68999999 789999999999999999999998887  58999999999999999999999875


No 3  
>cd00042 CY Substituted updates: Jan 30, 2002
Probab=99.75  E-value=1e-17  Score=103.02  Aligned_cols=59  Identities=32%  Similarity=0.507  Sum_probs=55.7

Q ss_pred             cceEECCCCCCHHHHHHHHHHHHHHHhhcCCc-eeEeeEEEEEEEeecceEEEEEEEEeeC
Q 045193           18 GGWKSKEDLSEPHVTEIGRFAVMSTKKRSKNE-FKFKSVEKGKTKVVSSTNYRLILVVKDG   77 (82)
Q Consensus        18 Gg~~~i~~~~d~~vq~~a~fAv~~~n~~s~~~-~~~~~V~~~~~QVVaG~nY~l~v~~~~g   77 (82)
                      |||.++ +++||+++++++||+.+||+++++. +.+.+|+++++|||+|++|+|++++.+.
T Consensus         1 gg~~~~-~~~d~~~~~~~~~a~~~~N~~~~~~~~~~~~i~~~~~QvvaG~~y~i~~~~~~t   60 (105)
T cd00042           1 GGPSDI-PANDPEVQELADFAVAEYNKKSNDKYLEFFKVLSAKSQVVAGTNYYITVEAGDT   60 (105)
T ss_pred             CCCccC-CCCCHHHHHHHHHHHHHHHhhcCccceeEEEEEEEEEEEEeeeEEEEEEEEecc
Confidence            899998 7999999999999999999999988 7779999999999999999999999875


No 4  
>PF07430 PP1:  Phloem filament protein PP1;  InterPro: IPR009994 This domain represents a conserved region approximately 200 residues long, four copies of which are found within the plant phloem filament protein PP1. This is one of the constituents of the proteinaceous filaments found in the sieve elements of Cucurbita phloem.
Probab=99.62  E-value=5.5e-15  Score=100.64  Aligned_cols=69  Identities=28%  Similarity=0.512  Sum_probs=61.9

Q ss_pred             eeeccCcccccceEECCCCCCHHHHHHHHHHHHHHHhhcCCceeEeeEEEEEEEeec--ceEEEEEEEEeeC
Q 045193            8 AADNRKRVLVGGWKSKEDLSEPHVTEIGRFAVMSTKKRSKNEFKFKSVEKGKTKVVS--STNYRLILVVKDG   77 (82)
Q Consensus         8 ~~~~~~~~~~Gg~~~i~~~~d~~vq~~a~fAv~~~n~~s~~~~~~~~V~~~~~QVVa--G~nY~l~v~~~~g   77 (82)
                      .+.+..++...+|.+++|+++|++|++++|||.||| +.+++++|.+|.+++.|=++  |++|+|+|.+.|+
T Consensus       105 ~v~pv~~p~~~~Wi~I~nin~p~VQeLgkFAV~EhN-K~gd~LkF~KV~eGw~q~l~~d~ikYrLhI~AkDg  175 (202)
T PF07430_consen  105 VVYPVATPQSKKWIPIPNINNPFVQELGKFAVIEHN-KAGDKLKFEKVYEGWYQDLGNDGIKYRLHIVAKDG  175 (202)
T ss_pred             eeccccCcccCCCEECCCCCcHHHHHHHHHHHHHHh-hcCCceEEEEEeeEEEEeccCCCceEEEEEEeecC
Confidence            445666777899999999999999999999999999 57899999999999999664  7999999999998


No 5  
>PF07430 PP1:  Phloem filament protein PP1;  InterPro: IPR009994 This domain represents a conserved region approximately 200 residues long, four copies of which are found within the plant phloem filament protein PP1. This is one of the constituents of the proteinaceous filaments found in the sieve elements of Cucurbita phloem.
Probab=99.08  E-value=2.4e-10  Score=78.05  Aligned_cols=66  Identities=29%  Similarity=0.407  Sum_probs=60.2

Q ss_pred             ccCcccccceEECCCCCCHHHHHHHHHHHHHHHhhcCCceeEeeEEEEE--EEeecceEEEEEEEEee
Q 045193           11 NRKRVLVGGWKSKEDLSEPHVTEIGRFAVMSTKKRSKNEFKFKSVEKGK--TKVVSSTNYRLILVVKD   76 (82)
Q Consensus        11 ~~~~~~~Gg~~~i~~~~d~~vq~~a~fAv~~~n~~s~~~~~~~~V~~~~--~QVVaG~nY~l~v~~~~   76 (82)
                      |.+....+||.+++|+.+|.+|+++.|++.+++.+.++.++|.+|++++  .|...+++|||.+++.|
T Consensus         1 ~~~~~~~~~w~~ip~v~~~~~q~v~~~~veq~k~~~~~~l~~~~v~egwy~el~~~~~~yrlhv~a~d   68 (202)
T PF07430_consen    1 CGQVPFSPKWIKIPDVKEPCLQEVAKFAVEQFKIQYGDSLKFRSVVEGWYFELCPNSLKYRLHVKAID   68 (202)
T ss_pred             CCCcccCcccccCCcccchHHHHHHHHHHHHHhhhcccceeeeeeeeceeecccccceeEEEeehhhh
Confidence            3456678999999999999999999999999999999999999999999  56889999999998765


No 6  
>TIGR01638 Atha_cystat_rel Arabidopsis thaliana cystatin-related protein. This model represents a family similar in sequence and probably homologous to a large family of cysteine proteinase inhibitors, or cystatins, as described by pfam model pfam00031. Cystatins may help plants resist attack by insects.
Probab=98.72  E-value=8.6e-08  Score=58.86  Aligned_cols=53  Identities=6%  Similarity=0.042  Sum_probs=49.1

Q ss_pred             CCCHHHHHHHHHHHHHHHhhcCCceeEeeEEEEEEEeecceEEEEEEEEeeCc
Q 045193           26 LSEPHVTEIGRFAVMSTKKRSKNEFKFKSVEKGKTKVVSSTNYRLILVVKDGK   78 (82)
Q Consensus        26 ~~d~~vq~~a~fAv~~~n~~s~~~~~~~~V~~~~~QVVaG~nY~l~v~~~~g~   78 (82)
                      .+.+.+..++++|+++||...+..+.|++|++|..|..+|..|+|++.+.|..
T Consensus        10 T~rd~~~~la~~al~k~N~~~~t~lEfV~vVrAn~~~~~g~~~yITF~Ard~~   62 (92)
T TIGR01638        10 TNRDLLERLSYVASKKYNDTKFLNLELVEVVRANYRGGAKSKSYITFEARDKP   62 (92)
T ss_pred             CHHHHHHHHHHHHHHHhhhhcCceEEEEEEEEEEeeccceEEEEEEEEEecCC
Confidence            45677899999999999999999999999999999999999999999998864


No 7  
>PF06907 Latexin:  Latexin;  InterPro: IPR009684 This family consists of several animal specific latexin and proteins related to latexin that belong to MEROPS proteinase inhibitor family I47, clan I-.  Latexin, a protein possessing inhibitory activity against rat carboxypeptidase A1 (CPA1) and CPA2 (MEROPS peptidase family M14A), is expressed in a neuronal subset in the cerebral cortex and cells in other neural and non-neural tissues of rat. OCX-32, the 32 kDa eggshell matrix protein, is present at high levels in the uterine fluid during the terminal phase of eggshell formation, and is localised predominantly in the outer eggshell. The timing of OCX-32 secretion into the uterine fluid suggests that it may play a role in the termination of mineral deposition. OCX-32 protein possesses limited identity (32%) to two unrelated proteins: latexin and to a skin protein that is encoded by a retinoic acid receptor-responsive gene, TIG1. Tazarotene Induced Gene 1 (TIG1) is a putative 228 transmembrane protein with a small N-terminal intracellular region, a single membrane-spanning hydrophobic region, and a large C-terminal extracellular region containing a glycosylation signal. TIG1 is up-regulated by retinoic acid receptor but not by retinoid X receptor-specific synthetic retinoids. TIG1 may be a tumour suppressor gene whose diminished expression is involved in the malignant progression of prostate cancer.; PDB: 1WNH_A 2BO9_B.
Probab=88.38  E-value=5.4  Score=28.12  Aligned_cols=53  Identities=15%  Similarity=0.254  Sum_probs=40.0

Q ss_pred             CCCCHHHHHHHHHHHHHHHhhcCCcee---EeeEEEEEEEeec--ceEEEEEEEEeeC
Q 045193           25 DLSEPHVTEIGRFAVMSTKKRSKNEFK---FKSVEKGKTKVVS--STNYRLILVVKDG   77 (82)
Q Consensus        25 ~~~d~~vq~~a~fAv~~~n~~s~~~~~---~~~V~~~~~QVVa--G~nY~l~v~~~~g   77 (82)
                      +++.-..+.+|+-|..=+|-+.++..+   +.+|.+|+..++.  |-+|.|++.+.+-
T Consensus         3 ~p~h~~a~rAA~va~hy~N~~~GSP~~l~~l~~V~~a~~e~ip~~G~Ky~L~FSte~~   60 (220)
T PF06907_consen    3 NPSHRPAQRAARVAQHYINYRAGSPSRLFVLQQVQKARAEDIPGEGCKYDLVFSTEEY   60 (220)
T ss_dssp             -TTSHHHHHHHHHHHHHHHHHH-BTTB-EEEEEEEEEEEEEETTTEEEEEEEEEEEET
T ss_pred             CCcchHHHHHHHHHHHHhccccCCCceeeehhhhhhhhheeccCCCCEEEEEEEhHHh
Confidence            455666788999888888876665443   3799999999764  8999999998774


No 8  
>PF05679 CHGN:  Chondroitin N-acetylgalactosaminyltransferase;  InterPro: IPR008428 This family represents Chondroitin N-acetylgalactosaminyltransferase. Proteins have a type II transmembrane topology. The enzyme is involved in the biosynthetic initiation and elongation of chondroitin sulphate and is the key enzyme responsible for the selective chain assembly of chondroitin/dermatan sulphate on the linkage region tetrasaccharide common to various proteoglycans containing chondroitin/dermatan sulphate or heparin/heparan sulphate chains. ; GO: 0016758 transferase activity, transferring hexosyl groups, 0032580 Golgi cisterna membrane
Probab=88.17  E-value=2.7  Score=32.50  Aligned_cols=50  Identities=16%  Similarity=0.310  Sum_probs=42.8

Q ss_pred             CHHHHHHHHHHHHHHHhhcCCceeEeeEEEEEEEe--ecceEEEEEEEEeeC
Q 045193           28 EPHVTEIGRFAVMSTKKRSKNEFKFKSVEKGKTKV--VSSTNYRLILVVKDG   77 (82)
Q Consensus        28 d~~vq~~a~fAv~~~n~~s~~~~~~~~V~~~~~QV--VaG~nY~l~v~~~~g   77 (82)
                      -.++.++.+.|+...|+.+...+.|.+++.+...+  .-|+.|.|.+.....
T Consensus       160 ~~dl~~vi~~a~~~ln~~~~~~~~~~~l~~GY~R~dp~rG~~Y~Ldl~l~~~  211 (499)
T PF05679_consen  160 REDLDDVIEQAMEELNRKSRRVLEFRDLINGYRRFDPTRGMDYILDLLLKYK  211 (499)
T ss_pred             HHHHHHHHHHHHHHHhccccccEEeeeeeeEEEEecCCCCceEEEEEEEeec
Confidence            36789999999999998888889999999998775  679999999876543


No 9  
>TIGR01572 A_thl_para_3677 Arabidopsis paralogous family TIGR01572. This model describes a paralogous family of hypothetical proteins in Arabidopsis thaliana. No homologs are detected from other species. Length heterogeneity within the family is attributable partly to a 21-residue repeat present in from zero to three tandem copies. The central region of the repeat resembles the pattern [VIF][FY][QK]GX[LM]P[DEK]XXXDDAL.
Probab=86.22  E-value=0.78  Score=33.09  Aligned_cols=49  Identities=14%  Similarity=0.228  Sum_probs=44.5

Q ss_pred             HHHHHHHHHHHHHhhcCCceeEeeEEEEEEEeecceEEEEEEEEeeCcc
Q 045193           31 VTEIGRFAVMSTKKRSKNEFKFKSVEKGKTKVVSSTNYRLILVVKDGKN   79 (82)
Q Consensus        31 vq~~a~fAv~~~n~~s~~~~~~~~V~~~~~QVVaG~nY~l~v~~~~g~~   79 (82)
                      +.-.|+.++-.||...+.+++|..|.+.-++..+=+.|+|++++-|-..
T Consensus        42 vklyAr~GLH~YN~~~GTNlel~~v~K~N~~~~~~~syyITL~A~DP~s   90 (265)
T TIGR01572        42 VKIYARVGLHRYNFLEGTNLELDHVDKFNKRMCALSSYYITLLAVDPDS   90 (265)
T ss_pred             HHHHHHhhhhhhhhccCccceehhhhhhccchhhheeeeEEEEEecCCc
Confidence            5888999999999999999999999999999998999999999988643


No 10 
>PF00666 Cathelicidins:  Cathelicidin;  InterPro: IPR001894 The precursor sequences of a number of antimicrobial peptides secreted by neutrophils (polymorphonuclear leukocytes) upon activation have been found to be evolutionarily related and are collectively known as cathelicidins. Structurally, these proteins consist of three domains: a signal sequence, a conserved region of about 100 residues that contains four cysteines involved in two disulphide bonds, and a highly divergent C-terminal section of variable size. It is in this C-terminal section that the antibacterial peptides are found; they are proteolytically processed from their precursor by enzymes such as elastase. Schematically, the precursor is laid out as: signal sequence | propeptide with four conserved cysteines (C) forming two disulphide bonds | antibacterial peptide. ; GO: 0006952 defense response, 0005576 extracellular region; PDB: 1KWI_A 1PFP_A 1LXE_A 1N5P_A 1N5H_A.
Probab=71.74  E-value=4.2  Score=23.54  Aligned_cols=30  Identities=17%  Similarity=-0.110  Sum_probs=19.6

Q ss_pred             HHHHHHHHHHHHHhhcCCceeEeeEEEEEEE
Q 045193           31 VTEIGRFAVMSTKKRSKNEFKFKSVEKGKTK   61 (82)
Q Consensus        31 vq~~a~fAv~~~n~~s~~~~~~~~V~~~~~Q   61 (82)
                      ++++...|+..||+++.+.. +.+++++.-+
T Consensus         4 Y~eav~~Av~~yN~~s~~~n-lfRLLe~~p~   33 (67)
T PF00666_consen    4 YEEAVLRAVDFYNQGSSGEN-LFRLLELDPP   33 (67)
T ss_dssp             CHHHHHHHHHHHHHCS-SSE-EEEEEEE---
T ss_pred             HHHHHHHHHHHHhcCCCccC-ceeeeeccCC
Confidence            57889999999998876543 3556665544


No 11 
>COG3360 Uncharacterized conserved protein [Function unknown]
Probab=53.54  E-value=44  Score=19.53  Aligned_cols=44  Identities=16%  Similarity=0.248  Sum_probs=31.4

Q ss_pred             HHHHHHHHHHHHHhhcCCceeEeeEEEEEEEeecc--eEEEEEEEEe
Q 045193           31 VTEIGRFAVMSTKKRSKNEFKFKSVEKGKTKVVSS--TNYRLILVVK   75 (82)
Q Consensus        31 vq~~a~fAv~~~n~~s~~~~~~~~V~~~~~QVVaG--~nY~l~v~~~   75 (82)
                      +.++++-|+..-. ++-+.+...+|++-+-+++.|  ..|.++++++
T Consensus        19 ~d~Ai~~Ai~RA~-~t~~~l~wfeV~~~rg~v~~g~v~hyqv~lkVg   64 (71)
T COG3360          19 IDAAIANAIARAA-DTLDNLDWFEVVETRGHVVDGAVAHYQVTLKVG   64 (71)
T ss_pred             HHHHHHHHHHHHH-hhhhcceEEEEEeecccEeecceEEEEEEEEEE
Confidence            4555555555432 245678889999999999987  5688888875


No 12 
>PF07353 Uroplakin_II:  Uroplakin II;  InterPro: IPR009952 This family contains uroplakin II, which is approximately 180 residues long and seems to be restricted to mammals. Uroplakin II is an integral membrane protein, and is one of the components of the apical plaques of mammalian urothelium formed by the asymmetric unit membrane - this is believed to play a role in strengthening the urothelial apical surface to prevent the cells from rupturing during bladder distension.; GO: 0016044 cellular membrane organization, 0030176 integral to endoplasmic reticulum membrane
Probab=52.43  E-value=33  Score=23.39  Aligned_cols=20  Identities=20%  Similarity=0.361  Sum_probs=16.3

Q ss_pred             eecceEEEEEEEEeeCcccC
Q 045193           62 VVSSTNYRLILVVKDGKNCY   81 (82)
Q Consensus        62 VVaG~nY~l~v~~~~g~~~~   81 (82)
                      ++-|++|++...+.+|...|
T Consensus       109 L~pGTkY~isY~VtkgtstE  128 (184)
T PF07353_consen  109 LQPGTKYYISYLVTKGTSTE  128 (184)
T ss_pred             cCCCcEEEEEEEEecCccce
Confidence            46799999999998886544


No 13 
>TIGR02105 III_needle type III secretion apparatus needle protein. Type III secretion systems translocate proteins, usually virulence factors, out across both inner and outer membranes of certain Gram-negative bacteria and further across the plasma membrane and into the cytoplasm of the host cell. This protein, termed YscF in Yersinia, and EscF, PscF, EprI, etc. in other systems, forms the needle of the injection apparatus.
Probab=49.97  E-value=27  Score=20.32  Aligned_cols=20  Identities=15%  Similarity=0.134  Sum_probs=18.1

Q ss_pred             CCCCHHHHHHHHHHHHHHHh
Q 045193           25 DLSEPHVTEIGRFAVMSTKK   44 (82)
Q Consensus        25 ~~~d~~vq~~a~fAv~~~n~   44 (82)
                      +++||+..--.+|++.+||-
T Consensus        30 ~~~nP~~La~~Q~~~~qYs~   49 (72)
T TIGR02105        30 LPNDPELMAELQFALNQYSA   49 (72)
T ss_pred             CCCCHHHHHHHHHHHHHHHH
Confidence            57899999999999999984


No 14 
>PF02995 DUF229:  Protein of unknown function (DUF229);  InterPro: IPR004245 Members of this family are uncharacterised with a long conserved region that may contain several domains.
Probab=47.14  E-value=24  Score=27.26  Aligned_cols=28  Identities=25%  Similarity=0.469  Sum_probs=25.1

Q ss_pred             ccceEECCCCCCHHHHHHHHHHHHHHHhh
Q 045193           17 VGGWKSKEDLSEPHVTEIGRFAVMSTKKR   45 (82)
Q Consensus        17 ~Gg~~~i~~~~d~~vq~~a~fAv~~~n~~   45 (82)
                      |-+|.++ +.+|+.++.+|+++|...|..
T Consensus       451 C~~~~~~-~~~~~~~~~~a~~~v~~iN~~  478 (497)
T PF02995_consen  451 CEGWKTI-PTNDSLVQRIAKFLVDHINEY  478 (497)
T ss_pred             CcCcccc-ccCcHHHHHHHHHHHHHHHHH
Confidence            6889888 688999999999999999965


No 15 
>PRK15344 type III secretion system needle protein SsaG; Provisional
Probab=46.00  E-value=36  Score=19.93  Aligned_cols=21  Identities=24%  Similarity=0.330  Sum_probs=18.3

Q ss_pred             CCCCCHHHHHHHHHHHHHHHh
Q 045193           24 EDLSEPHVTEIGRFAVMSTKK   44 (82)
Q Consensus        24 ~~~~d~~vq~~a~fAv~~~n~   44 (82)
                      .+++||+..--++|++.+|+.
T Consensus        28 ~~~~nP~~ml~lQf~i~QyS~   48 (71)
T PRK15344         28 NDLLNPESMIKAQFALQQYST   48 (71)
T ss_pred             CCCCCHHHHHHHHHHHHHHHH
Confidence            378899999999999999874


No 16 
>PF13028 DUF3889:  Protein of unknown function (DUF3889)
Probab=45.33  E-value=72  Score=19.65  Aligned_cols=51  Identities=29%  Similarity=0.412  Sum_probs=29.3

Q ss_pred             CCHHHHHHHHHHHHHHHhhc-CCceeEeeEEEEEEEeecc-eEEEEEEEEeeCc
Q 045193           27 SEPHVTEIGRFAVMSTKKRS-KNEFKFKSVEKGKTKVVSS-TNYRLILVVKDGK   78 (82)
Q Consensus        27 ~d~~vq~~a~fAv~~~n~~s-~~~~~~~~V~~~~~QVVaG-~nY~l~v~~~~g~   78 (82)
                      ++|.+.+..+.|+.+.-++= +.... .-..-+++|+-++ +.-.+++-+.+|+
T Consensus        19 ~~p~yaKWgrlA~~~~k~~Yp~a~v~-DY~~vGr~~~~~~~t~e~Fkl~l~~~~   71 (97)
T PF13028_consen   19 AQPSYAKWGRLAVQETKEKYPGAEVV-DYLYVGRTKVNDEQTVEKFKLWLREGG   71 (97)
T ss_pred             CCCcHHHHHHHHHHHHHHHCCCCEEe-eeeeecceecCCcceEEEEEEEEEcCC
Confidence            45889999999999865431 21111 1112245556666 6666666665554


No 17 
>PF07311 Dodecin:  Dodecin;  InterPro: IPR009923 This entry represents proteins with a Dodecin-like topology. Dodecin flavoprotein is a small dodecameric flavin-binding protein from Halobacterium salinarium (Halobacterium halobium) that contains two flavins stacked in a single binding pocket between two tryptophan residues to form an aromatic tetrade. Dodecin binds riboflavin, although it appears to have a broad specificity for flavins. Lumichrome, a molecule associated with flavin metabolism, appears to be a ligand of dodecin, which could act as a waste-trapping device. ; PDB: 2VYX_L 2DEG_F 2V18_K 2V19_D 2UX9_B 2CZ8_E 2V21_F 2CC8_A 2CCB_A 2VX9_A ....
Probab=39.17  E-value=76  Score=18.15  Aligned_cols=47  Identities=19%  Similarity=0.239  Sum_probs=35.6

Q ss_pred             CHHHHHHHHHHHHHHHhhcCCceeEeeEEEEEEEeecc--eEEEEEEEEe
Q 045193           28 EPHVTEIGRFAVMSTKKRSKNEFKFKSVEKGKTKVVSS--TNYRLILVVK   75 (82)
Q Consensus        28 d~~vq~~a~fAv~~~n~~s~~~~~~~~V~~~~~QVVaG--~nY~l~v~~~   75 (82)
                      ......+.+-|+.+-.+ +=++++-++|..-+-.|..|  +.|+.+++++
T Consensus        13 ~~S~edAv~~Av~~A~k-Tl~ni~~~eV~e~~~~v~dg~i~~y~v~lkv~   61 (66)
T PF07311_consen   13 PKSWEDAVQNAVARASK-TLRNIRWFEVKEQRGHVEDGKITEYQVNLKVS   61 (66)
T ss_dssp             SSHHHHHHHHHHHHHHH-HSSSEEEEEEEEEEEEEETTCEEEEEEEEEEE
T ss_pred             CCCHHHHHHHHHHHHhh-chhCcEEEEEEEEEEEEeCCcEEEEEEEEEEE
Confidence            34566777777776553 45678889999999999887  7899988875


No 18 
>PF00041 fn3:  Fibronectin type III domain;  InterPro: IPR003961 Fibronectins are multi-domain glycoproteins found in a soluble form in plasma, and in an insoluble form in loose connective tissue and basement membranes. They contain multiple copies of 3 repeat regions (types I, II and III), which bind to a variety of substances including heparin, collagen, DNA, actin, fibrin and fibronectin receptors on cell surfaces. The wide variety of these substances means that fibronectins are involved in a number of important functions: e.g., wound healing; cell adhesion; blood coagulation; cell differentiation and migration; maintenance of the cellular cytoskeleton; and tumour metastasis. The role of fibronectin in cell differentiation is demonstrated by the marked reduction in the expression of its gene when neoplastic transformation occurs. Cell attachment has been found to be mediated by the binding of the tetrapeptide RGDS to integrins on the cell surface, although related sequences can also display cell adhesion activity. Plasma fibronectin occurs as a dimer of 2 different subunits, linked together by 2 disulphide bonds near the C terminus. The difference in the 2 chains occurs in the type III repeat region and is caused by alternative splicing of the mRNA from one gene. The observation that, in a given protein, an individual repeat of one of the 3 types (e.g., the first FnIII repeat) shows much less similarity to its subsequent tandem repeats within that protein than to its equivalent repeat between fibronectins from other species, has suggested that the repeating structure of fibronectin arose at an early stage of evolution. It also seems to suggest that the structure is subject to high selective pressure. The fibronectin type III repeat region is an approximately 100 amino acid domain, different tandem repeats of which contain binding sites for DNA, heparin and the cell surface. The superfamily of sequences believed to contain FnIII repeats represents 45 different families, the majority of which are involved in cell surface binding in some manner, or are receptor protein tyrosine kinases, or cytokine receptors.; GO: 0005515 protein binding; PDB: 1UEM_A 1TDQ_A 1X5I_A 2IC2_B 2IBG_C 2IBB_A 3R8Q_A 2FNB_A 1FNH_A 2EDB_A ....
Probab=39.17  E-value=32  Score=18.55  Aligned_cols=18  Identities=11%  Similarity=0.254  Sum_probs=14.5

Q ss_pred             eecceEEEEEEEEeeCcc
Q 045193           62 VVSSTNYRLILVVKDGKN   79 (82)
Q Consensus        62 VVaG~nY~l~v~~~~g~~   79 (82)
                      +..|+.|.+.|.+.++.+
T Consensus        63 L~p~t~Y~~~v~a~~~~g   80 (85)
T PF00041_consen   63 LQPGTTYEFRVRAVNSDG   80 (85)
T ss_dssp             CCTTSEEEEEEEEEETTE
T ss_pred             CCCCCEEEEEEEEEeCCc
Confidence            568999999999877643


No 19 
>KOG2650 consensus Zinc carboxypeptidase [Function unknown]
Probab=36.08  E-value=1.8e+02  Score=22.45  Aligned_cols=62  Identities=15%  Similarity=0.158  Sum_probs=45.0

Q ss_pred             ccccceEECCCCCCHHHHHHHHHHHHHHHhhcCCceeEe---eEE---EE----EEEeecceEEEEEEEEee
Q 045193           15 VLVGGWKSKEDLSEPHVTEIGRFAVMSTKKRSKNEFKFK---SVE---KG----KTKVVSSTNYRLILVVKD   76 (82)
Q Consensus        15 ~~~Gg~~~i~~~~d~~vq~~a~fAv~~~n~~s~~~~~~~---~V~---~~----~~QVVaG~nY~l~v~~~~   76 (82)
                      ..|=|++.....+-++++++|+.|++..++..+..|++-   .++   ++    +.+=+.|.+|-+++++.|
T Consensus       316 lyPyg~~~~~~~~~~dl~~va~~a~~ai~~~~gt~Y~~G~~~~~~y~asG~S~Dway~~~gi~~~ft~ELrd  387 (418)
T KOG2650|consen  316 LYPYGYTNDLPEDYEDLQEVARAAADALKSVYGTKYTVGSSADTLYPASGGSDDWAYDVLGIPYAFTFELRD  387 (418)
T ss_pred             EecccccCCCCCCHHHHHHHHHHHHHHHHHHhCCEEEeccccceeeccCCchHHHhhhccCCCEEEEEEecc
Confidence            346667665444667789999999999998888888873   221   22    345578999999999874


No 20 
>cd08538 SAM_PNT-ESE-2-like Sterile alpha motif (SAM)/Pointed domain of ESE-2 like ETS transcriptional regulators. SAM Pointed domain of ESE-2-like (Epithelium-Specific ETS) subfamily of ETS transcriptional regulators is a putative protein-protein interaction domain. It can act as a major transactivator by providing a potential docking site for co-activators. ESE-2 factors are involved in regulation of gene expression in a variety of epithelial (glandular and secretory) cells. ESE-2 mRNA was found in skin keratinocytes, salivary gland, mammary gland, stomach, prostate, and kidneys. The DNA binding consensus motif for ESE-2 consists of a GGA core and AT-rich flanks. The expression profiles of these factors are altered in epithelial cancers. Members of this subfamily are potential targets for cancer therapy.
Probab=35.75  E-value=30  Score=20.52  Aligned_cols=19  Identities=26%  Similarity=0.181  Sum_probs=16.8

Q ss_pred             CCHHHHHHHHHHHHHHHhh
Q 045193           27 SEPHVTEIGRFAVMSTKKR   45 (82)
Q Consensus        27 ~d~~vq~~a~fAv~~~n~~   45 (82)
                      +..+|.+-..||+++|+-.
T Consensus        11 s~~~V~~WL~Wav~ef~L~   29 (78)
T cd08538          11 TKRHVWEWLQFCCDQYKLD   29 (78)
T ss_pred             CHHHHHHHHHHHHHHcCCC
Confidence            6688999999999999864


No 21 
>cd02848 Chitinase_N_term Chitinase N-terminus domain. Chitinases hydrolyze the abundant natural biopolymer chitin, producing smaller chito-oligosaccharides. Chitin consists of multiple N-acetyl-D-glucosamine (NAG) residues connected via beta-1,4-glycosidic linkages and is an important structural element of fungal cell wall and arthropod exoskeletons. On the basis of the mode of chitin hydrolysis, chitinases are classified as random, endo-, and exo-chitinases and based on sequence criteria, chitinases belong to families 18 and 19 of glycosyl hydrolases.  The N-terminus of chitinase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at  either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitob
Probab=34.90  E-value=83  Score=19.75  Aligned_cols=19  Identities=21%  Similarity=0.564  Sum_probs=15.9

Q ss_pred             eecceEEEEEEEEeeCccc
Q 045193           62 VVSSTNYRLILVVKDGKNC   80 (82)
Q Consensus        62 VVaG~nY~l~v~~~~g~~~   80 (82)
                      +-.|=.|.|+|++.++++|
T Consensus        76 v~kgG~y~m~V~lCn~dGC   94 (106)
T cd02848          76 VGKGGRYQMQVALCNGDGC   94 (106)
T ss_pred             eCCCCeEEEEEEEECCCCc
Confidence            3458899999999998877


No 22 
>KOG2786 consensus Putative glutamate/ornithine acetyltransferase [Amino acid transport and metabolism]
Probab=30.88  E-value=35  Score=25.75  Aligned_cols=29  Identities=17%  Similarity=0.208  Sum_probs=22.1

Q ss_pred             eEeeEEEEEEEeecceEEEEEEEEeeCccc
Q 045193           51 KFKSVEKGKTKVVSSTNYRLILVVKDGKNC   80 (82)
Q Consensus        51 ~~~~V~~~~~QVVaG~nY~l~v~~~~g~~~   80 (82)
                      .|.|.+..|.|+ +|++|++.=-+...+.+
T Consensus       176 tfpKlV~~e~~v-~G~~yrv~GmAKGaGMI  204 (431)
T KOG2786|consen  176 TFPKLVAVESQV-GGIKYRVGGMAKGAGMI  204 (431)
T ss_pred             ccchhhheeeec-ccEEEEeecccccCccc
Confidence            367999999999 99999997655544443


No 23 
>cd06247 M14_CPO Peptidase M14 carboxypeptidase (CP) O (CPO, also known as metallocarboxypeptidase C; EC 3.4.17.) belongs to the carboxypeptidase A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPO has not been well characterized as yet, and little is known about it. Based on modeling studies, CPO has been suggested to have specificity for acidic residues rather than aliphatic/aromatic residues as in A-like enzymes or basic residues as in B-like enzymes. It remains to be demonstrated that CPO is functional as an MCP.
Probab=30.47  E-value=93  Score=22.59  Aligned_cols=71  Identities=10%  Similarity=0.209  Sum_probs=43.4

Q ss_pred             EeeeeccCccc--ccceEECCCCCCHHHHHHHHHHHHHHHhhcCCceeEe---eE---EEE----EEEeecceEEEEEEE
Q 045193            6 LLAADNRKRVL--VGGWKSKEDLSEPHVTEIGRFAVMSTKKRSKNEFKFK---SV---EKG----KTKVVSSTNYRLILV   73 (82)
Q Consensus         6 ~~~~~~~~~~~--~Gg~~~i~~~~d~~vq~~a~fAv~~~n~~s~~~~~~~---~V---~~~----~~QVVaG~nY~l~v~   73 (82)
                      .+++..-+..+  |=|++.-+.+++.+..++++.+++...+.++..|..-   .+   .++    +.+ ..|..|-++++
T Consensus       186 ~l~~Hsyg~~i~~P~g~~~~~~~n~~~~~~~a~~~~~ai~~~~~~~y~~g~~~~~~y~a~G~s~Dwa~-~~~~~~s~t~E  264 (298)
T cd06247         186 YLTIHSYGQLILLPYGYTKEPSSNHEEMMLVAQKAAAALKEKHGTEYRVGSSALILYSNSGSSRDWAV-DIGIPFSYTFE  264 (298)
T ss_pred             EEEeccCCCeEEeCCcCCCCCCCCHHHHHHHHHHHHHHHHHhcCCCCccCCcccccccCCCChhhhhh-ccCCCEEEEEE
Confidence            34444434333  3334443345778899999999988887766666652   11   111    223 25899999999


Q ss_pred             EeeC
Q 045193           74 VKDG   77 (82)
Q Consensus        74 ~~~g   77 (82)
                      +.+.
T Consensus       265 l~~~  268 (298)
T cd06247         265 LRDN  268 (298)
T ss_pred             eCCC
Confidence            9764


No 24 
>PF14201 DUF4318:  Domain of unknown function (DUF4318)
Probab=28.40  E-value=1.3e+02  Score=17.57  Aligned_cols=39  Identities=15%  Similarity=0.205  Sum_probs=24.3

Q ss_pred             HHHHHHhhcCCceeEeeEEE-EEEEeecceEEEEEE---EEeeC
Q 045193           38 AVMSTKKRSKNEFKFKSVEK-GKTKVVSSTNYRLIL---VVKDG   77 (82)
Q Consensus        38 Av~~~n~~s~~~~~~~~V~~-~~~QVVaG~nY~l~v---~~~~g   77 (82)
                      |+..|=.+++..+.|++=.+ +.- -.+|+.|.+++   .+.+|
T Consensus        23 aIE~YC~~~~~~l~Fisr~~Pi~~-~idg~lYev~i~~~~~~rg   65 (74)
T PF14201_consen   23 AIEKYCIKNGESLEFISRDKPITF-KIDGVLYEVEIDEEYMARG   65 (74)
T ss_pred             HHHHHHHHcCCceEEEecCCcEEE-EECCeEEEEEEEeeecccC
Confidence            57777777888888842221 111 24788888888   44444


No 25 
>PF14873 BNR_assoc_N:  N-terminal domain of BNR-repeat neuraminidase
Probab=27.35  E-value=62  Score=20.61  Aligned_cols=22  Identities=32%  Similarity=0.487  Sum_probs=16.0

Q ss_pred             EEEEEEeecceEE-EEEEEEeeC
Q 045193           56 EKGKTKVVSSTNY-RLILVVKDG   77 (82)
Q Consensus        56 ~~~~~QVVaG~nY-~l~v~~~~g   77 (82)
                      +.+..++..|+|| ++.+++.+.
T Consensus        94 l~~~~~L~~G~nyfWvs~~lk~~  116 (137)
T PF14873_consen   94 LTGNQKLFPGTNYFWVSVDLKDN  116 (137)
T ss_pred             EcCCceeCCCCeEEEEEEEecCC
Confidence            3455668899999 677788764


No 26 
>PF07488 Glyco_hydro_67M:  Glycosyl hydrolase family 67 middle domain;  InterPro: IPR011100 Alpha-glucuronidases, components of an ensemble of enzymes central to the recycling of photosynthetic biomass, remove the alpha-1,2 linked 4-O-methyl glucuronic acid from xylans. This family represents the central catalytic domain of alpha-glucuronidase [].; GO: 0046559 alpha-glucuronidase activity, 0045493 xylan catabolic process, 0005576 extracellular region; PDB: 1MQP_A 1K9E_A 1MQQ_A 1L8N_A 1K9D_A 1MQR_A 1K9F_A 1GQL_A 1GQI_B 1GQJ_B ....
Probab=27.07  E-value=72  Score=23.85  Aligned_cols=40  Identities=15%  Similarity=0.154  Sum_probs=25.2

Q ss_pred             eEeeeeccCcccccceEECCCCCCHHHHHHHHHHHHHHHhh
Q 045193            5 PLLAADNRKRVLVGGWKSKEDLSEPHVTEIGRFAVMSTKKR   45 (82)
Q Consensus         5 l~~~~~~~~~~~~Gg~~~i~~~~d~~vq~~a~fAv~~~n~~   45 (82)
                      +.+|+.-+++...||.... |+-||+|++.=+-..++.-+.
T Consensus       108 v~LSvnFasP~~lggL~Ta-DPld~~V~~WW~~k~~eIY~~  147 (328)
T PF07488_consen  108 VYLSVNFASPIELGGLPTA-DPLDPEVRQWWKDKADEIYSA  147 (328)
T ss_dssp             EEEEE-TTHHHHTTS-S----TTSHHHHHHHHHHHHHHHHH
T ss_pred             EEEEeeccCCcccCCcCcC-CCCCHHHHHHHHHHHHHHHHh
Confidence            4567777777778999888 899999988755555555443


No 27 
>PF07849 DUF1641:  Protein of unknown function (DUF1641);  InterPro: IPR012440 Archaeal and bacterial hypothetical proteins are found in this family, with the region in question being approximately 40 residues long. 
Probab=26.15  E-value=81  Score=16.17  Aligned_cols=15  Identities=27%  Similarity=0.341  Sum_probs=12.9

Q ss_pred             CCCHHHHHHHHHHHH
Q 045193           26 LSEPHVTEIGRFAVM   40 (82)
Q Consensus        26 ~~d~~vq~~a~fAv~   40 (82)
                      ..||++|....|.+.
T Consensus        20 l~DpdvqrgL~~ll~   34 (42)
T PF07849_consen   20 LRDPDVQRGLGFLLA   34 (42)
T ss_pred             HcCHHHHHHHHHHHH
Confidence            689999999998765


No 28 
>PF01448 ELM2:  ELM2 domain;  InterPro: IPR000949 The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex []. The domain is usually found to the N terminus of a myb-like DNA binding domain and a GATA binding domain. ELM2, in some instances, is also found associated with the ARID DNA binding domain IPR001606 from INTERPRO. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain.
Probab=25.65  E-value=60  Score=17.21  Aligned_cols=22  Identities=27%  Similarity=0.438  Sum_probs=16.0

Q ss_pred             eEECCCCCCHHHHHHHHHHHHH
Q 045193           20 WKSKEDLSEPHVTEIGRFAVMS   41 (82)
Q Consensus        20 ~~~i~~~~d~~vq~~a~fAv~~   41 (82)
                      |.|-.+.+|.++.+...+|.+.
T Consensus        33 W~P~~~~~d~~l~~yl~~A~s~   54 (55)
T PF01448_consen   33 WSPNNPLSDRKLEEYLKVAKSS   54 (55)
T ss_pred             ECCCCCCCHHHHHHHHHHHHhc
Confidence            6775456888888888887653


No 29 
>PF07820 TraC:  TraC-like protein;  InterPro: IPR012930 The members of this family are sequences that are similar to TraC (Q84HT8 from SWISSPROT) from Rhizobium etli. The gene encoding this protein is one of a group of genes found on plasmid p42a of Rhizobium etli (strain CFN 42/ATCC 51251) that are thought to be involved in the process of plasmid self-transmission. Mobilisation of plasmid p42a is of importance as it is required for transfer of plasmid p42d, the symbiotic plasmid which carries most of the genes required for nodulation and nitrogen fixation by this symbiotic bacterium. The predicted protein products of p42a are similar to known transfer proteins of Agrobacterium tumefaciens plasmid pTiC58 []. ; GO: 0000746 conjugation
Probab=25.18  E-value=1e+02  Score=18.93  Aligned_cols=29  Identities=10%  Similarity=0.075  Sum_probs=24.2

Q ss_pred             ccceEECCCCCCHHHHHHHHHHHHHHHhhc
Q 045193           17 VGGWKSKEDLSEPHVTEIGRFAVMSTKKRS   46 (82)
Q Consensus        17 ~Gg~~~i~~~~d~~vq~~a~fAv~~~n~~s   46 (82)
                      --|...+ +++|.+++.+.+-....|++..
T Consensus        35 KaGL~ei-eI~d~eL~~~FeeIa~RFrk~~   63 (92)
T PF07820_consen   35 KAGLGEI-EISDAELQAAFEEIAARFRKGK   63 (92)
T ss_pred             Hcccccc-cCCHHHHHHHHHHHHHHHhccc
Confidence            3567788 8999999999999999998763


No 30 
>cd03870 M14_CPA Peptidase M14 Carboxypeptidase (CP) A (CPA) belongs to the A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPA enzymes generally favor hydrophobic residues. A/B subfamily enzymes are normally synthesized as inactive precursors containing preceding signal peptide, followed by a globular N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The procarboxypeptidase A (PCPA) is produced by the exocrine pancreas and stored as a stable zymogen in the pancreatic granules until secretion into the digestive tract occurs. This subfamily includes CPA1, CPA2 and CPA4 forms. Within these A forms, there are slightly different specificities, with CPA1 preferring aliphatic and small aromatic residues, and CPA2 p
Probab=23.77  E-value=2.6e+02  Score=20.27  Aligned_cols=72  Identities=14%  Similarity=0.287  Sum_probs=42.4

Q ss_pred             eeeeccCccc--ccceEECCCCCCHHHHHHHHHHHHHHHhhcCCceeEeeE------EEEEEE---eecceEEEEEEEEe
Q 045193            7 LAADNRKRVL--VGGWKSKEDLSEPHVTEIGRFAVMSTKKRSKNEFKFKSV------EKGKTK---VVSSTNYRLILVVK   75 (82)
Q Consensus         7 ~~~~~~~~~~--~Gg~~~i~~~~d~~vq~~a~fAv~~~n~~s~~~~~~~~V------~~~~~Q---VVaG~nY~l~v~~~   75 (82)
                      +++..-...+  |=|++.-..+++....++++-+++.+.+.++..|+.-..      .++.+.   -..|..|-+++++.
T Consensus       187 l~lHS~g~~i~yP~~~~~~~~~~~~~~~~la~~~~~ai~~~~g~~y~~g~~~~~~y~a~G~s~Dw~y~~~~~~s~t~El~  266 (301)
T cd03870         187 ISIHSYSQLLLYPYGYTTQSIPDKTELNQVAKSAVAALKSLYGTSYKYGSIITTIYQASGGSIDWSYNQGIKYSFTFELR  266 (301)
T ss_pred             EEeccCCceEEecCcCCCCCCCCHHHHHHHHHHHHHHHHHhcCCccccccccceeecCCCChhhhhhcCCCcEEEEEEeC
Confidence            4444433333  334444334567788999998888887766666665221      111111   02488899999998


Q ss_pred             eCc
Q 045193           76 DGK   78 (82)
Q Consensus        76 ~g~   78 (82)
                      +.+
T Consensus       267 ~~g  269 (301)
T cd03870         267 DTG  269 (301)
T ss_pred             CCC
Confidence            753


No 31 
>PF08329 ChitinaseA_N:  Chitinase A, N-terminal domain;  InterPro: IPR013540 This domain is found in a number of bacterial chitinases and similar viral proteins. It is organised into a fibronectin III module domain-like fold, comprising only beta strands. Its function is not known, but it may be involved in interaction with the enzyme substrate, chitin [, ]. It is separated by a hinge region from the catalytic domain (IPR001223 from INTERPRO); this hinge region is probably mobile, allowing the N-terminal domain to have different relative positions in solution []. ; GO: 0004568 chitinase activity; PDB: 2WLY_A 1EDQ_A 2WM0_A 1X6N_A 1NH6_A 2WK2_A 1EHN_A 2WLZ_A 1EIB_A 1FFR_A ....
Probab=23.22  E-value=1.9e+02  Score=18.71  Aligned_cols=19  Identities=16%  Similarity=0.480  Sum_probs=14.7

Q ss_pred             eecceEEEEEEEEeeCccc
Q 045193           62 VVSSTNYRLILVVKDGKNC   80 (82)
Q Consensus        62 VVaG~nY~l~v~~~~g~~~   80 (82)
                      +-.|-.|.+.+++.+.++|
T Consensus        79 ~~~gG~y~~~VeLCN~~GC   97 (133)
T PF08329_consen   79 VTKGGRYQMQVELCNADGC   97 (133)
T ss_dssp             E-S-EEEEEEEEEEETTEE
T ss_pred             ecCCCEEEEEEEEECCCCc
Confidence            4578899999999998876


No 32 
>PF01819 Levi_coat:  Levivirus coat protein;  InterPro: IPR002703 This entry represents the coat proteins of the leviviruses (phage MS2) and alloleviruses (phage Qbeta and phage F1).  The Levivirus coat protein forms the bacteriophage coat that encapsidates the viral RNA. 180 copies of this protein form the virion shell. The Bacteriophage MS2 coat protein controls two distinct processes: sequence-specific RNA encapsidation and repression of replicase translation-by binding to an RNA stem-loop structure of 19 nucleotides containing the initiation codon of the replicase gene. The binding of a coat protein dimer to this hairpin shuts off synthesis of the viral replicase, switching the viral replication cycle to virion assembly rather than continued replication [].; GO: 0005198 structural molecule activity, 0019028 viral capsid; PDB: 2VF9_A 6MSF_A 2MS2_A 1ZDI_B 2BS1_C 1MST_C 2C4Y_A 1ZDK_B 1ZDJ_A 1ZSE_B ....
Probab=21.71  E-value=1.7e+02  Score=19.00  Aligned_cols=23  Identities=30%  Similarity=0.322  Sum_probs=16.7

Q ss_pred             cCCceeE---eeEEEEEEEeecceEE
Q 045193           46 SKNEFKF---KSVEKGKTKVVSSTNY   68 (82)
Q Consensus        46 s~~~~~~---~~V~~~~~QVVaG~nY   68 (82)
                      ++++++|   ++|=+..+|.+.|+.+
T Consensus        53 s~~r~kytvkv~vp~~~tqt~~Gve~   78 (129)
T PF01819_consen   53 SANRRKYTVKVEVPKVTTQTVNGVEL   78 (129)
T ss_dssp             TTTEEEEEEEEEEEEEEEEEETTEEE
T ss_pred             cCCCcceEEEEEcccceeEeeCcEec
Confidence            4555554   6788889999999765


No 33 
>PF00413 Peptidase_M10:  Matrixin This Prosite motif covers only the active site.;  InterPro: IPR001818 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. 
The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belongs to the MEROPS peptidase family M10 (clan MA(M)).  The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA. Sequences having this domain are extracellular metalloproteases, such as collagenase and stromelysin, that degrade the extracellular matrix and are known as matrixins. They are zinc-dependent, calcium-activated proteases synthesised as inactive precursors (zymogens), which are proteolytically cleaved to yield the active enzyme [, ]. All matrixins and related proteins possess 2 domains: an N-terminal domain, and a zinc-binding active site domain. The N-terminal domain peptide, cleaved during the activation step, includes a conserved PRCGVPDV octapeptide, known as the cysteine switch, whose Cys residue chelates the active site zinc atom, rendering the enzyme inactive [, ].
The active enzyme degrades components of the extracellular matrix, playing a role in the initial steps of tissue remodelling during morphogenesis, wound healing, angiogenesis and tumour invasion [, ].; GO: 0004222 metalloendopeptidase activity, 0008270 zinc ion binding, 0006508 proteolysis, 0031012 extracellular matrix; PDB: 1Q3A_C 3V96_B 1HV5_D 1CXV_A 1SRP_A 1FBL_A 1ZVX_A 1JH1_A 1I76_A 2OY4_A ....
Probab=21.37  E-value=1.8e+02  Score=17.85  Aligned_cols=30  Identities=20%  Similarity=0.225  Sum_probs=22.5

Q ss_pred             CCCCHHHHHHHHHHHHHHHhhcCCceeEeeEE
Q 045193           25 DLSEPHVTEIGRFAVMSTKKRSKNEFKFKSVE   56 (82)
Q Consensus        25 ~~~d~~vq~~a~fAv~~~n~~s~~~~~~~~V~   56 (82)
                      +.+.++.+++.+.|...+++..  .+.|.++-
T Consensus        17 ~~~~~~~~~~i~~A~~~W~~~~--~~~F~~~~   46 (154)
T PF00413_consen   17 QLSQSEQRDAIRQAFQAWNDVA--PLNFTEVS   46 (154)
T ss_dssp             TS-HHHHHHHHHHHHHHHHTTS--SEEEEEES
T ss_pred             CCCHHHHHHHHHHHHHHHHhcC--CceEEecc
Confidence            3556789999999999998653  37776665


No 34 
>cd03147 GATase1_Ydr533c_like Type 1 glutamine amidotransferase (GATase1)-like domain found in Saccharomyces cerevisiae Ydr533c protein. Type 1 glutamine amidotransferase (GATase1)-like domain found in Saccharomyces cerevisiae Ydr533c protein.  This group includes proteins similar to S. cerevisiae Ydr533c.  Ydr533c is upregulated in response to various stress conditions along with the heat shock family.  The catalytic triad typical of GATase1 domains is not conserved in this GATase1-like domain. However, in common with a typical GATase1 domain, a reactive Cys residue is found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. This Cys together with a different His and Glu residue form a different catalytic triad from the typical GATase1 domain.  Ydr533c protein is a homodimer.
Probab=21.08  E-value=75  Score=22.05  Aligned_cols=26  Identities=15%  Similarity=0.208  Sum_probs=18.6

Q ss_pred             ccccceEECCC-CCCHHHHHHHHHHHH
Q 045193           15 VLVGGWKSKED-LSEPHVTEIGRFAVM   40 (82)
Q Consensus        15 ~~~Gg~~~i~~-~~d~~vq~~a~fAv~   40 (82)
                      .+|||+.+..| .+|++++++.+....
T Consensus        99 ~iPGG~g~~~dl~~~~~l~~ll~~f~~  125 (231)
T cd03147          99 FVAGGHGTLFDFPHATNLQKIAQQIYA  125 (231)
T ss_pred             EECCCCchhhhcccCHHHHHHHHHHHH
Confidence            46899876544 478999988776543


No 35 
>PF08522 DUF1735:  Domain of unknown function (DUF1735);  InterPro: IPR013728 This domain of unknown function is found in a number of bacterial proteins including acylhydrolases.; PDB: 3POH_A 4DQA_A 3SOT_D 3NQK_A 3N91_A 3P02_A.
Probab=20.32  E-value=1.9e+02  Score=16.52  Aligned_cols=15  Identities=7%  Similarity=0.217  Sum_probs=11.0

Q ss_pred             HHHHHHhhcCCceeE
Q 045193           38 AVMSTKKRSKNEFKF   52 (82)
Q Consensus        38 Av~~~n~~s~~~~~~   52 (82)
                      .+..||+..+..|.+
T Consensus        15 ~l~~YN~~~gt~y~~   29 (86)
T PF08522_consen   15 LLDAYNKANGTNYEL   29 (86)
T ss_dssp             HHHHHHHHHTGBEEE
T ss_pred             HHHHHHHhcCCccEE
Confidence            477899887776664


Done!