Query 033492
Match_columns 118
No_of_seqs 131 out of 1034
Neff 7.3
Searched_HMMs 46136
Date Fri Mar 29 02:54:20 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/033492.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/033492hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd00042 CY Substituted updates 99.9 1.5E-22 3.3E-27 133.6 12.1 84 30-114 1-104 (105)
2 smart00043 CY Cystatin-like do 99.9 1.4E-22 2.9E-27 134.3 8.8 86 28-114 2-105 (107)
3 PF00031 Cystatin: Cystatin do 99.9 1.1E-20 2.4E-25 122.3 10.8 74 30-104 1-94 (94)
4 PF07430 PP1: Phloem filament 99.6 9.1E-15 2E-19 105.5 10.2 86 25-111 110-199 (202)
5 PF07430 PP1: Phloem filament 99.1 2.2E-10 4.8E-15 82.9 6.0 91 25-115 3-98 (202)
6 TIGR01638 Atha_cystat_rel Arab 98.7 7.4E-08 1.6E-12 62.5 7.2 66 37-102 9-77 (92)
7 PF06907 Latexin: Latexin; In 96.3 0.086 1.9E-06 39.3 10.1 66 37-102 3-76 (220)
8 TIGR01572 A_thl_para_3677 Arab 90.9 0.55 1.2E-05 36.0 4.9 60 43-102 42-104 (265)
9 PF07172 GRP: Glycine rich pro 88.7 0.27 5.9E-06 32.1 1.5 7 1-7 1-7 (95)
10 PF05679 CHGN: Chondroitin N-a 78.5 9.9 0.00021 31.6 6.9 50 41-90 161-212 (499)
11 COG3360 Uncharacterized conser 76.0 16 0.00034 22.5 6.5 45 42-87 18-64 (71)
12 PF12276 DUF3617: Protein of u 74.5 4 8.6E-05 28.3 3.1 34 1-34 1-34 (162)
13 PF07311 Dodecin: Dodecin; In 72.5 19 0.00041 21.9 7.8 48 39-87 12-61 (66)
14 PF10731 Anophelin: Thrombin i 71.9 4.1 8.9E-05 24.5 2.2 18 1-18 1-18 (65)
15 PF05984 Cytomega_UL20A: Cytom 66.3 5.9 0.00013 25.5 2.2 21 1-21 1-21 (100)
16 PF00666 Cathelicidins: Cathel 65.7 7.1 0.00015 23.9 2.4 47 42-89 3-53 (67)
17 KOG2650 Zinc carboxypeptidase 49.8 47 0.001 27.3 5.3 63 28-90 317-389 (418)
18 PF01456 Mucin: Mucin-like gly 48.6 12 0.00026 25.4 1.5 22 1-22 2-25 (143)
19 PF03032 Brevenin: Brevenin/es 48.2 15 0.00033 20.7 1.7 18 4-21 4-21 (46)
20 TIGR01601 PYST-C1 Plasmodium y 43.9 14 0.00031 23.3 1.2 16 1-16 1-17 (82)
21 cd06379 PBP1_iGluR_NMDA_NR1 N- 42.6 37 0.00081 26.3 3.7 42 11-58 5-46 (377)
22 PRK10386 curli assembly protei 41.6 1.2E+02 0.0025 21.0 6.9 81 1-90 1-82 (130)
23 PRK15344 type III secretion sy 33.8 67 0.0014 19.9 3.0 22 36-57 28-49 (71)
24 TIGR02105 III_needle type III 33.8 60 0.0013 20.0 2.9 21 37-57 30-50 (72)
25 PF13028 DUF3889: Protein of u 30.2 1.6E+02 0.0035 19.2 10.8 69 39-108 19-89 (97)
26 PF06157 DUF973: Protein of un 29.4 82 0.0018 24.5 3.6 29 72-103 257-285 (285)
27 CHL00132 psaF photosystem I su 27.2 2.1E+02 0.0046 20.9 5.1 26 28-55 25-50 (185)
28 PF12274 DUF3615: Protein of u 27.2 1.7E+02 0.0036 18.5 7.9 56 59-115 9-72 (96)
29 COG4676 Uncharacterized protei 26.2 28 0.00061 26.4 0.5 33 17-50 19-54 (268)
30 PRK09408 ompX outer membrane p 26.2 56 0.0012 23.4 2.1 37 1-38 1-37 (171)
31 cd06247 M14_CPO Peptidase M14 26.2 1E+02 0.0022 24.0 3.7 61 29-90 199-269 (298)
32 PF03823 Neurokinin_B: Neuroki 26.2 81 0.0018 18.7 2.4 34 1-34 1-38 (59)
33 PRK15346 outer membrane secret 24.8 95 0.0021 25.8 3.5 48 1-51 1-48 (499)
34 PF08139 LPAM_1: Prokaryotic m 23.9 53 0.0011 16.2 1.1 12 3-14 8-19 (25)
35 PF03082 MAGSP: Male accessory 23.0 55 0.0012 24.7 1.6 22 1-22 1-23 (264)
36 TIGR03431 PhnD phosphonate ABC 22.0 3.3E+02 0.0072 20.1 5.9 34 1-34 1-35 (288)
37 PF05887 Trypan_PARP: Procycli 21.9 30 0.00065 24.1 0.0 13 1-13 1-13 (143)
38 PF05438 TRH: Thyrotropin-rele 20.7 1.8E+02 0.0039 21.7 3.8 16 6-21 2-17 (212)
39 cd01781 AF6_RA_repeat2 Ubiquit 20.3 1.5E+02 0.0032 19.5 3.0 30 39-68 24-55 (100)
40 PF07353 Uroplakin_II: Uroplak 20.2 2.4E+02 0.0051 20.5 4.2 16 75-90 110-125 (184)
41 PF15281 Consortin_C: Consorti 20.2 76 0.0016 21.3 1.6 15 7-21 56-70 (113)
42 PF07312 DUF1459: Protein of u 20.1 93 0.002 19.8 1.9 16 1-16 1-16 (84)
No 1
>cd00042 CY Substituted updates: Jan 30, 2002
Probab=99.89 E-value=1.5e-22 Score=133.56 Aligned_cols=84 Identities=46% Similarity=0.727 Sum_probs=78.2
Q ss_pred ceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCC-eEEEEEEEEEEEEeceEEEEEEEEEEeCC------------------
Q 033492 30 GGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSA-LKFESVEKGETQVVSGTNYRLILVVKDGP------------------ 90 (118)
Q Consensus 30 GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~-~~~~~V~~a~~QVVaG~nY~l~v~~~~~~------------------ 90 (118)
|||.++ +.+||++++++++|+.+||+++++. |.+.+|+++++|||+|++|++++++.+++
T Consensus 1 gg~~~~-~~~d~~~~~~~~~a~~~~N~~~~~~~~~~~~i~~~~~QvvaG~~y~i~~~~~~t~C~k~~~~~~~~~c~~~~~ 79 (105)
T cd00042 1 GGPSDI-PANDPEVQELADFAVAEYNKKSNDKYLEFFKVLSAKSQVVAGTNYYITVEAGDTNCKKSSVPLDCPDCKLLEE 79 (105)
T ss_pred CCCccC-CCCCHHHHHHHHHHHHHHHhhcCccceeEEEEEEEEEEEEeeeEEEEEEEEecccccccCccccccccccccc
Confidence 899995 8999999999999999999999988 77899999999999999999999999863
Q ss_pred -CceeEEEEEEEecCCCceEEEEee
Q 033492 91 -STKKFEAVVWEKPWEHFKSLTSFK 114 (118)
Q Consensus 91 -~~~~c~~~V~~~PW~~~~~l~sf~ 114 (118)
....|.+.||.+||.+..++++++
T Consensus 80 ~~~~~C~~~V~~~pw~~~~~l~~~~ 104 (105)
T cd00042 80 GKKKFCTAKVWEKPWENFKELLSFK 104 (105)
T ss_pred CCCEEEEEEEEecCCCCceeeeecc
Confidence 578999999999999999999875
No 2
>smart00043 CY Cystatin-like domain. Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains.
Probab=99.88 E-value=1.4e-22 Score=134.34 Aligned_cols=86 Identities=41% Similarity=0.628 Sum_probs=78.5
Q ss_pred ccceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCeE--EEEEEEEEEEEeceEEEEEEEEEEeCC-Cc------------
Q 033492 28 LVGGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSALK--FESVEKGETQVVSGTNYRLILVVKDGP-ST------------ 92 (118)
Q Consensus 28 ~~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~~--~~~V~~a~~QVVaG~nY~l~v~~~~~~-~~------------ 92 (118)
++|||.++ +.+||++++++++|+++||+++++.|. +.+++++++|||+|++|++++++.+++ +.
T Consensus 2 ~~Gg~~~~-~~~d~~~~~~~~~a~~~~N~~~~~~~~~~~~~v~~a~~QvvaG~~y~l~~~v~~t~C~k~~~~~~~C~~~~ 80 (107)
T smart00043 2 CLGGPSDV-PPNDPEVQEAADFAVAEYNKKSNDKYELRVIKVVSAKSQVVAGTNYYLKVEVGETNCKKLSVDLENCPFLD 80 (107)
T ss_pred CCCCCccC-CCCCHHHHHHHHHHHHHHHHhcccchhhhhhhhheeeeeeecceEEEEEEEEEeceeccCCcccccCCCCC
Confidence 68999995 899999999999999999999998887 689999999999999999999999876 11
Q ss_pred ---eeEEEEEEEecCCCceEEEEee
Q 033492 93 ---KKFEAVVWEKPWEHFKSLTSFK 114 (118)
Q Consensus 93 ---~~c~~~V~~~PW~~~~~l~sf~ 114 (118)
..|.++||.+||+++.++++|+
T Consensus 81 ~~~~~C~~~V~~~pw~~~~~~~~~~ 105 (107)
T smart00043 81 QGEKFCTAKVWEKPWENKIKLVEFK 105 (107)
T ss_pred CCccEEEEEEEecCCCCccCcccee
Confidence 4899999999999999998875
No 3
>PF00031 Cystatin: Cystatin domain; InterPro: IPR000010 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. The cystatins are cysteine proteinase inhibitors belonging to MEROPS inhibitor family I25, clan IH [, , ]. They mainly inhibit peptidases belonging to peptidase families C1 (papain family) and C13 (legumain family). The cystatin family includes: The Type 1 cystatins, which are intracellular cystatins that are present in the cytosol of many cell types, but can also appear in body fluids at significant concentrations. They are single-chain polypeptides of about 100 residues, which have neither disulphide bonds nor carbohydrate side chains. The Type 2 cystatins, which are mainly extracellular secreted polypeptides synthesised with a 19-28 residue signal peptide. They are broadly distributed and found in most body fluids. The Type 3 cystatins, which are multidomain proteins. The mammalian representatives of this group are the kininogens. There are three different kininogens in mammals: H- (high molecular mass, IPR002395 from INTERPRO) and L- (low molecular mass) kininogen which are found in a number of species, and T-kininogen that is found only in rat. Unclassified cystatins. 
These are cystatin-like proteins found in a range of organisms: plant phytocystatins, fetuin in mammals, insect cystatins and a puff adder venom cystatin which inhibits metalloproteases of the MEROPS peptidase family M12 (astacin/adamalysin). Also a number of the cystatins-like proteins have been shown to be devoid of inhibitory activity. All true cystatins inhibit cysteine peptidases of the papain family (MEROPS peptidase family C1), and some also inhibit legumain family enzymes (MEROPS peptidase family C13). These peptidases play key roles in physiological processes, such as intracellular protein degradation (cathepsins B, H and L), are pivotal in the remodelling of bone (cathepsin K), and may be important in the control of antigen presentation (cathepsin S, mammalian legumain). Moreover, the activities of such peptidases are increased in pathophysiological conditions, such as cancer metastasis and inflammation. Additionally, such peptidases are essential for several pathogenic parasites and bacteria. Thus in animals cystatins not only have capacity to regulate normal body processes and perhaps cause disease when down-regulated, but in other organisms may also participate in defence against biotic and abiotic stress. ; GO: 0004869 cysteine-type endopeptidase inhibitor activity; PDB: 3L0R_B 2W9P_K 2W9Q_A 3S67_A 3QRD_D 1R4C_G 3GAX_A 1TIJ_B 1G96_A 3NX0_A ....
Probab=99.85 E-value=1.1e-20 Score=122.32 Aligned_cols=74 Identities=41% Similarity=0.687 Sum_probs=67.6
Q ss_pred ceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCe--EEEEEEEEEEEEeceEEEEEEEEEEeCC-----------------
Q 033492 30 GGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSAL--KFESVEKGETQVVSGTNYRLILVVKDGP----------------- 90 (118)
Q Consensus 30 GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~--~~~~V~~a~~QVVaG~nY~l~v~~~~~~----------------- 90 (118)
|||+++ +++||++++++++|+.+||+++++.+ .+.+|++|++|||+|++|+|++++.+++
T Consensus 1 Gg~~~~-~~~dp~v~~~~~~al~~~N~~~~~~~~~~~~~v~~a~~QvV~G~~Y~i~~~~~~t~C~k~~~~~~~C~~~~~~ 79 (94)
T PF00031_consen 1 GGPSPV-DPNDPEVQEAAEFALDKFNEQSNSGYKFKLVKVISATTQVVAGINYYIEFEVGETNCKKSSKDFENCPFQEEQ 79 (94)
T ss_dssp SSEEEE-CTTSHHHHHHHHHHHHHHHHHSTTSEEEEEEEEEEEEEEESSSEEEEEEEEEEEEEEETCEEEEEECEBESTT
T ss_pred CCCccC-CCCCHHHHHHHHHHHHHHHHhCcccCcceeeeeeEEEEeecCCceEEEEEEEEcccccccccccccCCccccC
Confidence 899995 88999999999999999999996665 5699999999999999999999998862
Q ss_pred -CceeEEEEEEEecC
Q 033492 91 -STKKFEAVVWEKPW 104 (118)
Q Consensus 91 -~~~~c~~~V~~~PW 104 (118)
....|.++||.+||
T Consensus 80 ~~~~~C~~~v~~~pW 94 (94)
T PF00031_consen 80 PWTKFCKFTVWERPW 94 (94)
T ss_dssp SSEEEEEEEEEEECG
T ss_pred CceeeEEEEEEECCC
Confidence 56799999999999
No 4
>PF07430 PP1: Phloem filament protein PP1; InterPro: IPR009994 This domain represents a conserved region approximately 200 residues long, four copies of which are found within the plant phloem filament protein PP1. This is one of the constituents of the proteinaceous filaments found in the sieve elements of Cucurbita phloem [].
Probab=99.60 E-value=9.1e-15 Score=105.50 Aligned_cols=86 Identities=38% Similarity=0.566 Sum_probs=76.2
Q ss_pred CCcccceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEec--eEEEEEEEEEEeC-CCceeEEEEEEE
Q 033492 25 KGALVGGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSALKFESVEKGETQVVS--GTNYRLILVVKDG-PSTKKFEAVVWE 101 (118)
Q Consensus 25 ~~~~~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVVa--G~nY~l~v~~~~~-~~~~~c~~~V~~ 101 (118)
..+...+|.+|.|+++|.+|++++||+.++| +.+++++|.+|.+++.|-++ |++|+|.+.+.+. ++...|++.||+
T Consensus 110 ~~p~~~~Wi~I~nin~p~VQeLgkFAV~EhN-K~gd~LkF~KV~eGw~q~l~~d~ikYrLhI~AkDg~G~~~~YeAvV~~ 188 (202)
T PF07430_consen 110 ATPQSKKWIPIPNINNPFVQELGKFAVIEHN-KAGDKLKFEKVYEGWYQDLGNDGIKYRLHIVAKDGLGRLGNYEAVVWE 188 (202)
T ss_pred cCcccCCCEECCCCCcHHHHHHHHHHHHHHh-hcCCceEEEEEeeEEEEeccCCCceEEEEEEeecCCCCcCceEEEEEE
Confidence 4566789999999999999999999999999 67899999999999999885 6999999999998 688999999999
Q ss_pred e-cCCCceEEE
Q 033492 102 K-PWEHFKSLT 111 (118)
Q Consensus 102 ~-PW~~~~~l~ 111 (118)
. +|.++.+++
T Consensus 189 k~~~sk~i~i~ 199 (202)
T PF07430_consen 189 KQFLSKKIKIL 199 (202)
T ss_pred eccCcceEEEE
Confidence 7 577765443
No 5
>PF07430 PP1: Phloem filament protein PP1; InterPro: IPR009994 This domain represents a conserved region approximately 200 residues long, four copies of which are found within the plant phloem filament protein PP1. This is one of the constituents of the proteinaceous filaments found in the sieve elements of Cucurbita phloem [].
Probab=99.09 E-value=2.2e-10 Score=82.92 Aligned_cols=91 Identities=29% Similarity=0.403 Sum_probs=82.7
Q ss_pred CCcccceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCeEEEEEEEEE--EEEeceEEEEEEEEEEeC-CCceeEEEEEEE
Q 033492 25 KGALVGGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSALKFESVEKGE--TQVVSGTNYRLILVVKDG-PSTKKFEAVVWE 101 (118)
Q Consensus 25 ~~~~~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~--~QVVaG~nY~l~v~~~~~-~~~~~c~~~V~~ 101 (118)
+....+||.+|+|+.+|.+|+++++++.+++-+.++.++|.+|.+.+ .|..+|++|++.+++.|- .+...|.+.|++
T Consensus 3 ~~~~~~~w~~ip~v~~~~~q~v~~~~veq~k~~~~~~l~~~~v~egwy~el~~~~~~yrlhv~a~d~l~r~l~~e~ii~e 82 (202)
T PF07430_consen 3 QVPFSPKWIKIPDVKEPCLQEVAKFAVEQFKIQYGDSLKFRSVVEGWYFELCPNSLKYRLHVKAIDFLGRSLKYEAIIIE 82 (202)
T ss_pred CcccCcccccCCcccchHHHHHHHHHHHHHhhhcccceeeeeeeeceeecccccceeEEEeehhhhhhccccceeeeeee
Confidence 44678999999999999999999999999999999999999999999 889999999999999876 478899999999
Q ss_pred ec--CCCceEEEEeee
Q 033492 102 KP--WEHFKSLTSFKP 115 (118)
Q Consensus 102 ~P--W~~~~~l~sf~~ 115 (118)
+- |++.++|.|+-.
T Consensus 83 ~~~~~~~~~kl~s~l~ 98 (202)
T PF07430_consen 83 EKPQLTRIRKLASILA 98 (202)
T ss_pred hhhhhhhhhhhheeeE
Confidence 85 999999988754
No 6
>TIGR01638 Atha_cystat_rel Arabidopsis thaliana cystatin-related protein. This model represents a family similar in sequence and probably homologous to a large family of cysteine proteinase inhibitors, or cystatins, as described by pfam model pfam00031. Cystatins may help plants resist attack by insects.
Probab=98.72 E-value=7.4e-08 Score=62.55 Aligned_cols=66 Identities=14% Similarity=0.134 Sum_probs=56.0
Q ss_pred CCCcHHHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEeceEEEEEEEEEEeCC---CceeEEEEEEEe
Q 033492 37 DPKEKHVMEIGQFAVTEYNKQSKSALKFESVEKGETQVVSGTNYRLILVVKDGP---STKKFEAVVWEK 102 (118)
Q Consensus 37 ~~~d~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVVaG~nY~l~v~~~~~~---~~~~c~~~V~~~ 102 (118)
..+.+-+..+++.|+.+||+..+.++.|++|++|..|..+|..|+|++.+.+.. ..+.+++.||..
T Consensus 9 ~T~rd~~~~la~~al~k~N~~~~t~lEfV~vVrAn~~~~~g~~~yITF~Ard~~d~p~~e~~q~~v~~~ 77 (92)
T TIGR01638 9 ETNRDLLERLSYVASKKYNDTKFLNLELVEVVRANYRGGAKSKSYITFEARDKPDGPLGEYQQAAVVYL 77 (92)
T ss_pred cCHHHHHHHHHHHHHHHhhhhcCceEEEEEEEEEEeeccceEEEEEEEEEecCCCCCHHHhhheeeEec
Confidence 346667889999999999999999999999999999999999999999999764 345556666653
No 7
>PF06907 Latexin: Latexin; InterPro: IPR009684 This family consists of several animal specific latexin and proteins related to latexin that belong to MEROPS proteinase inhibitor family I47, clan I- []. Latexin, a protein possessing inhibitory activity against rat carboxypeptidase A1 (CPA1) and CPA2 (MEROPS peptidase family M14A), is expressed in a neuronal subset in the cerebral cortex and cells in other neural and non-neural tissues of rat [, ]. OCX-32, the 32 kDa eggshell matrix protein, is present at high levels in the uterine fluid during the terminal phase of eggshell formation, and is localised predominantly in the outer eggshell. The timing of OCX-32 secretion into the uterine fluid suggests that it may play a role in the termination of mineral deposition []. OCX-32 protein possesses limited identity (32%) to two unrelated proteins: latexin and to a skin protein that is encoded by a retinoic acid receptor-responsive gene, TIG1. Tazarotene Induced Gene 1 (TIG1) is a putative 228 transmembrane protein with a small N-terminal intracellular region, a single membrane-spanning hydrophobic region, and a large C-terminal extracellular region containing a glycosylation signal. TIG1 is up-regulated by retinoic acid receptor but not by retinoid X receptor-specific synthetic retinoids []. TIG1 may be a tumour suppressor gene whose diminished expression is involved in the malignant progression of prostate cancer [].; PDB: 1WNH_A 2BO9_B.
Probab=96.29 E-value=0.086 Score=39.33 Aligned_cols=66 Identities=18% Similarity=0.264 Sum_probs=52.8
Q ss_pred CCCcHHHHHHHHHHHHHHHHhcCCCe---EEEEEEEEEEEEec--eEEEEEEEEEEeCC---CceeEEEEEEEe
Q 033492 37 DPKEKHVMEIGQFAVTEYNKQSKSAL---KFESVEKGETQVVS--GTNYRLILVVKDGP---STKKFEAVVWEK 102 (118)
Q Consensus 37 ~~~d~~v~~~a~~Av~~~n~~~~~~~---~~~~V~~a~~QVVa--G~nY~l~v~~~~~~---~~~~c~~~V~~~ 102 (118)
+++.-..+++|+-|..-+|-..++.+ .+..|.+|+..+++ |-+|.|.+.+.+-. ....|.++|+..
T Consensus 3 ~p~h~~a~rAA~va~hy~N~~~GSP~~l~~l~~V~~a~~e~ip~~G~Ky~L~FSte~~~~~e~~g~CsA~V~f~ 76 (220)
T PF06907_consen 3 NPSHRPAQRAARVAQHYINYRAGSPSRLFVLQQVQKARAEDIPGEGCKYDLVFSTEEYIEGEHLGNCSAEVFFK 76 (220)
T ss_dssp -TTSHHHHHHHHHHHHHHHHHH-BTTB-EEEEEEEEEEEEEETTTEEEEEEEEEEEETTT---EEEEEEEEEET
T ss_pred CCcchHHHHHHHHHHHHhccccCCCceeeehhhhhhhhheeccCCCCEEEEEEEhHHhhcCCceeEeEEEEEec
Confidence 56677788999999999998877765 34789999999985 78999999998753 678999999983
No 8
>TIGR01572 A_thl_para_3677 Arabidopsis paralogous family TIGR01572. This model describes a paralogous family of hypothetical proteins in Arabidopsis thaliana. No homologs are detected from other species. Length heterogeneity within the family is attributable partly to a 21-residue repeat present in from zero to three tandem copies. The central region of the repeat resembles the pattern [VIF][FY][QK]GX[LM]P[DEK]XXXDDAL.
Probab=90.93 E-value=0.55 Score=36.01 Aligned_cols=60 Identities=20% Similarity=0.288 Sum_probs=53.1
Q ss_pred HHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEeceEEEEEEEEEEeCC---CceeEEEEEEEe
Q 033492 43 VMEIGQFAVTEYNKQSKSALKFESVEKGETQVVSGTNYRLILVVKDGP---STKKFEAVVWEK 102 (118)
Q Consensus 43 v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVVaG~nY~l~v~~~~~~---~~~~c~~~V~~~ 102 (118)
++-.|+.++.-||-..+.+++|..|.+.-++..+=+.|+|++++-+.+ ..+.|+..|.++
T Consensus 42 vklyAr~GLH~YN~~~GTNlel~~v~K~N~~~~~~~syyITL~A~DP~s~~s~qTFQtrV~e~ 104 (265)
T TIGR01572 42 VKIYARVGLHRYNFLEGTNLELDHVDKFNKRMCALSSYYITLLAVDPDSRFLQQTFQVRVDEQ 104 (265)
T ss_pred HHHHHHhhhhhhhhccCccceehhhhhhccchhhheeeeEEEEEecCCccccceEEEEEEEec
Confidence 577899999999999999999999999999999999999999999875 567788877765
No 9
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress [].
Probab=88.70 E-value=0.27 Score=32.09 Aligned_cols=7 Identities=14% Similarity=0.111 Sum_probs=4.8
Q ss_pred CcchHHH
Q 033492 1 MNQRFCC 7 (118)
Q Consensus 1 m~~~~~~ 7 (118)
|++|.++
T Consensus 1 MaSK~~l 7 (95)
T PF07172_consen 1 MASKAFL 7 (95)
T ss_pred CchhHHH
Confidence 8887644
No 10
>PF05679 CHGN: Chondroitin N-acetylgalactosaminyltransferase; InterPro: IPR008428 This family represents Chondroitin N-acetylgalactosaminyltransferase. Proteins have a type II transmembrane topology. The enzyme is involved in the biosynthetic initiation and elongation of chondroitin sulphate and is the key enzyme responsible for the selective chain assembly of chondroitin/dermatan sulphate on the linkage region tetrasaccharide common to various proteoglycans containing chondroitin/dermatan sulphate or heparin/heparan sulphate chains. ; GO: 0016758 transferase activity, transferring hexosyl groups, 0032580 Golgi cisterna membrane
Probab=78.54 E-value=9.9 Score=31.59 Aligned_cols=50 Identities=24% Similarity=0.377 Sum_probs=43.2
Q ss_pred HHHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEe--ceEEEEEEEEEEeCC
Q 033492 41 KHVMEIGQFAVTEYNKQSKSALKFESVEKGETQVV--SGTNYRLILVVKDGP 90 (118)
Q Consensus 41 ~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVV--aG~nY~l~v~~~~~~ 90 (118)
+++.++.+.|+...|++....+.|.+++.+...+- -|+.|.|++.+....
T Consensus 161 ~dl~~vi~~a~~~ln~~~~~~~~~~~l~~GY~R~dp~rG~~Y~Ldl~l~~~~ 212 (499)
T PF05679_consen 161 EDLDDVIEQAMEELNRKSRRVLEFRDLINGYRRFDPTRGMDYILDLLLKYKK 212 (499)
T ss_pred HHHHHHHHHHHHHHhccccccEEeeeeeeEEEEecCCCCceEEEEEEEeecc
Confidence 57889999999999999888889999999987764 599999999886654
No 11
>COG3360 Uncharacterized conserved protein [Function unknown]
Probab=76.05 E-value=16 Score=22.54 Aligned_cols=45 Identities=20% Similarity=0.244 Sum_probs=32.4
Q ss_pred HHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEece--EEEEEEEEEE
Q 033492 42 HVMEIGQFAVTEYNKQSKSALKFESVEKGETQVVSG--TNYRLILVVK 87 (118)
Q Consensus 42 ~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVVaG--~nY~l~v~~~ 87 (118)
.+.++++-|++.-.. +-+.+.+.+|++.+-+++.| ..|.++++++
T Consensus 18 S~d~Ai~~Ai~RA~~-t~~~l~wfeV~~~rg~v~~g~v~hyqv~lkVg 64 (71)
T COG3360 18 SIDAAIANAIARAAD-TLDNLDWFEVVETRGHVVDGAVAHYQVTLKVG 64 (71)
T ss_pred cHHHHHHHHHHHHHh-hhhcceEEEEEeecccEeecceEEEEEEEEEE
Confidence 344556666665433 34667889999999999988 4688888775
No 12
>PF12276 DUF3617: Protein of unknown function (DUF3617); InterPro: IPR022061 This family of proteins is found in bacteria. Proteins in this family are typically between 155 and 179 amino acids in length. There is a single completely conserved residue C that may be functionally important.
Probab=74.45 E-value=4 Score=28.25 Aligned_cols=34 Identities=24% Similarity=0.300 Sum_probs=18.9
Q ss_pred CcchHHHHHHHHHHhhhcccccCCCCcccceeee
Q 033492 1 MNQRFCCLIVLFLSVVPLLAAGDRKGALVGGWKP 34 (118)
Q Consensus 1 m~~~~~~~~~~~~~~~~~~~~~~~~~~~~GG~~~ 34 (118)
|++..++++++++++....+++......+|-|.-
T Consensus 1 M~~~~~~~~~~~~~~~~~~~~a~~~~~kpGlWe~ 34 (162)
T PF12276_consen 1 MKRRLLLALALALLALAAAAAAAAPDIKPGLWEV 34 (162)
T ss_pred CchHHHHHHHHHHHHhhcccccccCCCCCcccEE
Confidence 6666555444444432333444456667899974
No 13
>PF07311 Dodecin: Dodecin; InterPro: IPR009923 This entry represents proteins with a Dodecin-like topology. Dodecin flavoprotein is a small dodecameric flavin-binding protein from Halobacterium salinarium (Halobacterium halobium) that contains two flavins stacked in a single binding pocket between two tryptophan residues to form an aromatic tetrade []. Dodecin binds riboflavin, although it appears to have a broad specificity for flavins. Lumichrome, a molecule associated with flavin metabolism, appears to be a ligand of dodecin, which could act as a waste-trapping device. ; PDB: 2VYX_L 2DEG_F 2V18_K 2V19_D 2UX9_B 2CZ8_E 2V21_F 2CC8_A 2CCB_A 2VX9_A ....
Probab=72.55 E-value=19 Score=21.90 Aligned_cols=48 Identities=25% Similarity=0.273 Sum_probs=37.9
Q ss_pred CcHHHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEece--EEEEEEEEEE
Q 033492 39 KEKHVMEIGQFAVTEYNKQSKSALKFESVEKGETQVVSG--TNYRLILVVK 87 (118)
Q Consensus 39 ~d~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVVaG--~nY~l~v~~~ 87 (118)
+.....++++.|+.+-++. -.++..++|.+.+-.|..| ..|+.+++++
T Consensus 12 S~~S~edAv~~Av~~A~kT-l~ni~~~eV~e~~~~v~dg~i~~y~v~lkv~ 61 (66)
T PF07311_consen 12 SPKSWEDAVQNAVARASKT-LRNIRWFEVKEQRGHVEDGKITEYQVNLKVS 61 (66)
T ss_dssp ESSHHHHHHHHHHHHHHHH-SSSEEEEEEEEEEEEEETTCEEEEEEEEEEE
T ss_pred CCCCHHHHHHHHHHHHhhc-hhCcEEEEEEEEEEEEeCCcEEEEEEEEEEE
Confidence 3456777888888887664 3578889999999999998 6799888875
No 14
>PF10731 Anophelin: Thrombin inhibitor from mosquito; InterPro: IPR018932 Members of this family are all inhibitors of thrombin, the peptidase that is at the end of the blood coagulation cascade and which creates the clot by cleaving fibrinogen. The interaction between thrombin and fibrinogen involves two different areas of contact - via the thrombin active site and via a second substrate-binding site known as an exosite. The inhibitor acts by blocking the exosite, rather than by interacting with the active site. The inhibitors are from mosquitoes that feed on human blood and which, by inhibiting thrombin, prevent the blood from clotting and keep it flowing.
Probab=71.85 E-value=4.1 Score=24.47 Aligned_cols=18 Identities=17% Similarity=0.482 Sum_probs=13.4
Q ss_pred CcchHHHHHHHHHHhhhc
Q 033492 1 MNQRFCCLIVLFLSVVPL 18 (118)
Q Consensus 1 m~~~~~~~~~~~~~~~~~ 18 (118)
|++|++++.|+|+.++.+
T Consensus 1 MA~Kl~vialLC~aLva~ 18 (65)
T PF10731_consen 1 MASKLIVIALLCVALVAI 18 (65)
T ss_pred CcchhhHHHHHHHHHHHH
Confidence 888887777777777654
No 15
>PF05984 Cytomega_UL20A: Cytomegalovirus UL20A protein; InterPro: IPR009245 This family consists of several Cytomegalovirus UL20A proteins. UL20A is thought to be a glycoprotein [].
Probab=66.33 E-value=5.9 Score=25.47 Aligned_cols=21 Identities=33% Similarity=0.425 Sum_probs=13.5
Q ss_pred CcchHHHHHHHHHHhhhcccc
Q 033492 1 MNQRFCCLIVLFLSVVPLLAA 21 (118)
Q Consensus 1 m~~~~~~~~~~~~~~~~~~~~ 21 (118)
|++|+++|-|+++.++...|+
T Consensus 1 MaRRlwiLslLAVtLtVALAA 21 (100)
T PF05984_consen 1 MARRLWILSLLAVTLTVALAA 21 (100)
T ss_pred CchhhHHHHHHHHHHHHHhhc
Confidence 889887766666555544444
No 16
>PF00666 Cathelicidins: Cathelicidin; InterPro: IPR001894 The precursor sequences of a number of antimicrobial peptides secreted by neutrophils (polymorphonuclear leukocytes) upon activation have been found to be evolutionarily related and are collectively known as cathelicidins []. Structurally, these proteins consist of three domains: a signal sequence, a conserved region of about 100 residues that contains four cysteines involved in two disulphide bonds, and a highly divergent C-terminal section of variable size. It is in this C-terminal section that the antibacterial peptides are found; they are proteolytically processed from their precursor by enzymes such as elastase. This structure is shown in the following schematic representation: +---+--------------------------------+--------------------+ |Sig| Propeptide C C C C | Antibacterial pep. | +---+----------------|--|--|--|------+--------------------+ | | | | +--+ +--+ 'C': conserved cysteine involved in a disulphide bond. ; GO: 0006952 defense response, 0005576 extracellular region; PDB: 1KWI_A 1PFP_A 1LXE_A 1N5P_A 1N5H_A.
Probab=65.71 E-value=7.1 Score=23.89 Aligned_cols=47 Identities=19% Similarity=0.051 Sum_probs=28.0
Q ss_pred HHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEe----ceEEEEEEEEEEeC
Q 033492 42 HVMEIGQFAVTEYNKQSKSALKFESVEKGETQVV----SGTNYRLILVVKDG 89 (118)
Q Consensus 42 ~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVV----aG~nY~l~v~~~~~ 89 (118)
.|++++..|+..||+++.+.. +.+++++.-|-- .++.--+.+.+.++
T Consensus 3 sY~eav~~Av~~yN~~s~~~n-lfRLLe~~p~P~~~~~~~~~~pl~FtIkET 53 (67)
T PF00666_consen 3 SYEEAVLRAVDFYNQGSSGEN-LFRLLELDPPPGWDEDPSTPKPLNFTIKET 53 (67)
T ss_dssp CCHHHHHHHHHHHHHCS-SSE-EEEEEEE---SSSSSSSSS-EEEEEEEEEE
T ss_pred CHHHHHHHHHHHHhcCCCccC-ceeeeeccCCCCCCCCcCcceeeEEEEeec
Confidence 367889999999999987653 456676665532 22344555555554
No 17
>KOG2650 consensus Zinc carboxypeptidase [Function unknown]
Probab=49.77 E-value=47 Score=27.31 Aligned_cols=63 Identities=13% Similarity=0.096 Sum_probs=45.5
Q ss_pred ccceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCeEEE---EEEE---E----EEEEeceEEEEEEEEEEeCC
Q 033492 28 LVGGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSALKFE---SVEK---G----ETQVVSGTNYRLILVVKDGP 90 (118)
Q Consensus 28 ~~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~~~~---~V~~---a----~~QVVaG~nY~l~v~~~~~~ 90 (118)
.|=|++.....+-++++++|+.|+..+++..+..|++- .++- . +.+-++|+.|-+++++.++.
T Consensus 317 yPyg~~~~~~~~~~dl~~va~~a~~ai~~~~gt~Y~~G~~~~~~y~asG~S~Dway~~~gi~~~ft~ELrd~g 389 (418)
T KOG2650|consen 317 YPYGYTNDLPEDYEDLQEVARAAADALKSVYGTKYTVGSSADTLYPASGGSDDWAYDVLGIPYAFTFELRDTG 389 (418)
T ss_pred ecccccCCCCCCHHHHHHHHHHHHHHHHHHhCCEEEeccccceeeccCCchHHHhhhccCCCEEEEEEeccCC
Confidence 36677663334677889999999999999988888763 2221 1 13557899999999998654
No 18
>PF01456 Mucin: Mucin-like glycoprotein; InterPro: IPR000458 This family of trypanosomal proteins resembles vertebrate mucins. The protein consists of three regions. The N and C termini are conserved between all members of the family, whereas the central region is not well conserved and contains a large number of threonine residues which can be glycosylated []. Indirect evidence suggested that these genes might encode the core protein of parasite mucins, glycoproteins that were proposed to be involved in the interaction with, and invasion of, mammalian host cells.
Probab=48.62 E-value=12 Score=25.43 Aligned_cols=22 Identities=36% Similarity=0.596 Sum_probs=13.4
Q ss_pred CcchHHH-HHHHHHHhh-hccccc
Q 033492 1 MNQRFCC-LIVLFLSVV-PLLAAG 22 (118)
Q Consensus 1 m~~~~~~-~~~~~~~~~-~~~~~~ 22 (118)
|--|+|+ ||++.|+|+ +.++..
T Consensus 2 mtcRLLCalLvlaLcCCpsvc~t~ 25 (143)
T PF01456_consen 2 MTCRLLCALLVLALCCCPSVCATA 25 (143)
T ss_pred chHHHHHHHHHHHHHcCcchhccc
Confidence 3447766 777777777 444443
No 19
>PF03032 Brevenin: Brevenin/esculentin/gaegurin/rugosin family; InterPro: IPR004275 In addition to the highly specific cell-mediated immune system, vertebrates possess an efficient host-defence mechanism against invading microorganisms which involves the synthesis of highly potent antimicrobial peptides with a large spectrum of activity. This entry represents a number of these defence peptides secreted from the skin of amphibians, including the opiate-like dermorphins and deltorphins, and the antimicrobial dermoseptins and temporins.; GO: 0006952 defense response, 0042742 defense response to bacterium, 0005576 extracellular region
Probab=48.18 E-value=15 Score=20.72 Aligned_cols=18 Identities=22% Similarity=0.458 Sum_probs=12.5
Q ss_pred hHHHHHHHHHHhhhcccc
Q 033492 4 RFCCLIVLFLSVVPLLAA 21 (118)
Q Consensus 4 ~~~~~~~~~~~~~~~~~~ 21 (118)
|..+++|++|.+++++.-
T Consensus 4 KKsllLlfflG~ISlSlC 21 (46)
T PF03032_consen 4 KKSLLLLFFLGTISLSLC 21 (46)
T ss_pred hHHHHHHHHHHHcccchH
Confidence 445788888888876443
No 20
>TIGR01601 PYST-C1 Plasmodium yoelii subtelomeric domain PYST-C1. The C-terminal portions of the genes which contain this domain are divergent and some contain other yoelii-specific paralogous domains such as PYST-C2 (TIGR01604).
Probab=43.93 E-value=14 Score=23.34 Aligned_cols=16 Identities=25% Similarity=0.611 Sum_probs=9.0
Q ss_pred Ccch-HHHHHHHHHHhh
Q 033492 1 MNQR-FCCLIVLFLSVV 16 (118)
Q Consensus 1 m~~~-~~~~~~~~~~~~ 16 (118)
|++| |+|+|.+++.+.
T Consensus 1 MNkrIfslVcivlY~ll 17 (82)
T TIGR01601 1 MNKRIFSLVCIVLYILL 17 (82)
T ss_pred CCceEeehhHHHHHHHH
Confidence 8776 455555554443
No 21
>cd06379 PBP1_iGluR_NMDA_NR1 N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domain of the NR1, an essential channel-forming subunit of the NMDA receptor. N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domain of the NR1, an essential channel-forming subunit of the NMDA receptor. The ionotropic N-methyl-D-aspartate (NMDA) subtype of glutamate receptor serves critical functions in neuronal development, functioning, and degeneration in the mammalian central nervous system. The functional NMDA receptor is a heterotetramer composed of two NR1 and two NR2 (A, B, C, and D) or of NR3 (A and B) subunits. The receptor controls a cation channel that is highly permeable to monovalent ions and calcium and exhibits voltage-dependent inhibition by magnesium. Dual agonists, glutamate and glycine, are required for efficient activation of the NMDA receptor. When co-expressed with NR1, the NR3 subunits form receptors that are activated by glycine alone and therefore
Probab=42.57 E-value=37 Score=26.31 Aligned_cols=42 Identities=26% Similarity=0.181 Sum_probs=27.1
Q ss_pred HHHHhhhcccccCCCCcccceeeeCCCCCcHHHHHHHHHHHHHHHHhc
Q 033492 11 LFLSVVPLLAAGDRKGALVGGWKPIEDPKEKHVMEIGQFAVTEYNKQS 58 (118)
Q Consensus 11 ~~~~~~~~~~~~~~~~~~~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~ 58 (118)
++|.++++ +++.+..-..|+..+ . ...++..++|++.+|...
T Consensus 5 ~~~~~~~~-~~~~~~~i~IG~i~~-~----~~~~~~~~~Ai~~~N~~~ 46 (377)
T cd06379 5 LFLSLCAR-AGCSPKTVNIGAVLS-N----KKHEQEFKEAVNAANVER 46 (377)
T ss_pred HHHHHhcc-cCCCCcEEEEeEEec-c----hhHHHHHHHHHHHHhhhh
Confidence 33444344 233345566788887 2 256789999999999854
No 22
>PRK10386 curli assembly protein CsgE; Provisional
Probab=41.56 E-value=1.2e+02 Score=20.98 Aligned_cols=81 Identities=9% Similarity=-0.024 Sum_probs=41.0
Q ss_pred CcchH-HHHHHHHHHhhhcccccCCCCcccceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCeEEEEEEEEEEEEeceEE
Q 033492 1 MNQRF-CCLIVLFLSVVPLLAAGDRKGALVGGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSALKFESVEKGETQVVSGTN 79 (118)
Q Consensus 1 m~~~~-~~~~~~~~~~~~~~~~~~~~~~~~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a~~QVVaG~n 79 (118)
|.+.. ++++++++++++...++ .+..+.|=..+ ...+-. =.+--.+-.+.++ +.+. ..+.+..++.+|..
T Consensus 1 ~~r~~~~~l~~~~l~~~~~~~a~-~eiEi~GLIiD-~T~Tr~-G~DFY~~Fs~~~~----~~~~--~nltI~E~p~a~~G 71 (130)
T PRK10386 1 MKRYLRWIVAAELLFAAGNLHAA-VEVEVPGLLTD-HTVSSI-GHDFYRAFSDKWE----SDYD--GNLTINERPSARWG 71 (130)
T ss_pred ChhHHHHHHHHHHHHhCcccccc-ccccccceEec-cccccc-cHhHHHHHHHHHh----hhCC--CcEEEEEEEcCCCC
Confidence 55532 44555555554432232 34455555555 222221 1222223334444 1222 46667888888777
Q ss_pred EEEEEEEEeCC
Q 033492 80 YRLILVVKDGP 90 (118)
Q Consensus 80 Y~l~v~~~~~~ 90 (118)
=.|++.+.+.-
T Consensus 72 S~ItV~~n~~v 82 (130)
T PRK10386 72 SWITITVNQDV 82 (130)
T ss_pred cEEEEEECCEE
Confidence 78999887654
No 23
>PRK15344 type III secretion system needle protein SsaG; Provisional
Probab=33.84 E-value=67 Score=19.88 Aligned_cols=22 Identities=27% Similarity=0.345 Sum_probs=18.3
Q ss_pred CCCCcHHHHHHHHHHHHHHHHh
Q 033492 36 EDPKEKHVMEIGQFAVTEYNKQ 57 (118)
Q Consensus 36 ~~~~d~~v~~~a~~Av~~~n~~ 57 (118)
.+++||+..--+.|++.+|+.-
T Consensus 28 ~~~~nP~~ml~lQf~i~QyS~~ 49 (71)
T PRK15344 28 NDLLNPESMIKAQFALQQYSTF 49 (71)
T ss_pred CCCCCHHHHHHHHHHHHHHHHH
Confidence 3678999888899999999753
No 24
>TIGR02105 III_needle type III secretion apparatus needle protein. Type III secretion systems translocate proteins, usually virulence factors, out across both inner and outer membranes of certain Gram-negative bacteria and further across the plasma membrane and into the cytoplasm of the host cell. This protein, termed YscF in Yersinia, and EscF, PscF, EprI, etc. in other systems, forms the needle of the injection apparatus.
Probab=33.76 E-value=60 Score=20.00 Aligned_cols=21 Identities=29% Similarity=0.455 Sum_probs=18.0
Q ss_pred CCCcHHHHHHHHHHHHHHHHh
Q 033492 37 DPKEKHVMEIGQFAVTEYNKQ 57 (118)
Q Consensus 37 ~~~d~~v~~~a~~Av~~~n~~ 57 (118)
+|+||+..--..|++.+||.-
T Consensus 30 ~~~nP~~La~~Q~~~~qYs~~ 50 (72)
T TIGR02105 30 LPNDPELMAELQFALNQYSAY 50 (72)
T ss_pred CCCCHHHHHHHHHHHHHHHHH
Confidence 568999988899999999763
No 25
>PF13028 DUF3889: Protein of unknown function (DUF3889)
Probab=30.17 E-value=1.6e+02 Score=19.21 Aligned_cols=69 Identities=16% Similarity=0.073 Sum_probs=48.8
Q ss_pred CcHHHHHHHHHHHHHHHHhcCCCeEEEEEEEE-EEEEece-EEEEEEEEEEeCCCceeEEEEEEEecCCCce
Q 033492 39 KEKHVMEIGQFAVTEYNKQSKSALKFESVEKG-ETQVVSG-TNYRLILVVKDGPSTKKFEAVVWEKPWEHFK 108 (118)
Q Consensus 39 ~d~~v~~~a~~Av~~~n~~~~~~~~~~~V~~a-~~QVVaG-~nY~l~v~~~~~~~~~~c~~~V~~~PW~~~~ 108 (118)
.+|.+.+-.+.|+++.-++= .+..++.-+.. ++|+-.+ +.-.+++.+.++++.--..+.|+-.|-.+..
T Consensus 19 ~~p~yaKWgrlA~~~~k~~Y-p~a~v~DY~~vGr~~~~~~~t~e~Fkl~l~~~~kefgV~v~V~f~p~T~ki 89 (97)
T PF13028_consen 19 AQPSYAKWGRLAVQETKEKY-PGAEVVDYLYVGRTKVNDEQTVEKFKLWLREGGKEFGVFVTVSFNPKTEKI 89 (97)
T ss_pred CCCcHHHHHHHHHHHHHHHC-CCCEEeeeeeecceecCCcceEEEEEEEEEcCCeEEEEEEEEEEeCCCCcE
Confidence 45888888999999875542 12233343433 4455566 7889999999999888888889888866653
No 26
>PF06157 DUF973: Protein of unknown function (DUF973); InterPro: IPR009321 This family consists of several hypothetical archaeal proteins of unknown function.
Probab=29.35 E-value=82 Score=24.47 Aligned_cols=29 Identities=24% Similarity=0.526 Sum_probs=20.0
Q ss_pred EEEeceEEEEEEEEEEeCCCceeEEEEEEEec
Q 033492 72 TQVVSGTNYRLILVVKDGPSTKKFEAVVWEKP 103 (118)
Q Consensus 72 ~QVVaG~nY~l~v~~~~~~~~~~c~~~V~~~P 103 (118)
.+.+.|.+|.+++.++++. ..++.+-.+|
T Consensus 257 ~~l~~g~~Y~i~l~l~ng~---~v~v~~~y~p 285 (285)
T PF06157_consen 257 LNLVPGNTYTITLTLSNGQ---TVDVNVIYQP 285 (285)
T ss_pred ccCCCCCEEEEEEEEcCCc---EEEEEEEEeC
Confidence 3466788888888888775 5555555554
No 27
>CHL00132 psaF photosystem I subunit III; Validated
Probab=27.20 E-value=2.1e+02 Score=20.91 Aligned_cols=26 Identities=15% Similarity=0.138 Sum_probs=17.2
Q ss_pred ccceeeeCCCCCcHHHHHHHHHHHHHHH
Q 033492 28 LVGGWKPIEDPKEKHVMEIGQFAVTEYN 55 (118)
Q Consensus 28 ~~GG~~~i~~~~d~~v~~~a~~Av~~~n 55 (118)
-.+|.+|= .++|.+++-++.++.+..
T Consensus 25 d~agLtpC--ses~aF~kR~~~~~k~Le 50 (185)
T CHL00132 25 DVAGLTPC--SESPAFQKRLNNSVKKLE 50 (185)
T ss_pred cccCCccC--ccCHHHHHHHHHHHHHHH
Confidence 45666662 478888887777776643
No 28
>PF12274 DUF3615: Protein of unknown function (DUF3615); InterPro: IPR022059 This domain family is found in bacteria and eukaryotes, and is typically between 86 and 97 amino acids in length. There is a conserved FAE sequence motif. There is a single completely conserved residue F that may be functionally important.
Probab=27.17 E-value=1.7e+02 Score=18.45 Aligned_cols=56 Identities=16% Similarity=0.088 Sum_probs=35.0
Q ss_pred CCCeEEEEEEEEEEEEeceE--EEEEEEEEEeCC------CceeEEEEEEEecCCCceEEEEeee
Q 033492 59 KSALKFESVEKGETQVVSGT--NYRLILVVKDGP------STKKFEAVVWEKPWEHFKSLTSFKP 115 (118)
Q Consensus 59 ~~~~~~~~V~~a~~QVVaG~--nY~l~v~~~~~~------~~~~c~~~V~~~PW~~~~~l~sf~~ 115 (118)
+..|.+.+++....=.-.|. =|++.+.+...+ ....+-|++. ..-.....+...-+
T Consensus 9 ~~~yeL~~v~~~~~~~e~~~~~y~HvNF~A~~~~~~~~~~~~~LFFAE~~-~~~~~~~~v~~C~~ 72 (96)
T PF12274_consen 9 GLEYELVDVLHSCFIFERGGWNYYHVNFTAKTKGPDSDDGSPTLFFAEVS-NDCKDEDDVSCCCP 72 (96)
T ss_pred CcCEEEeEEEeeeeeEeCCCcEEEeEEEEEEcCCccCCCCCceEEEEEEe-cCCCCCCEEEEEEE
Confidence 56788888887764444443 368888887654 5677889887 22233344444433
No 29
>COG4676 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=26.22 E-value=28 Score=26.38 Aligned_cols=33 Identities=18% Similarity=0.450 Sum_probs=19.2
Q ss_pred hcccccCCCCcc---cceeeeCCCCCcHHHHHHHHHH
Q 033492 17 PLLAAGDRKGAL---VGGWKPIEDPKEKHVMEIGQFA 50 (118)
Q Consensus 17 ~~~~~~~~~~~~---~GG~~~i~~~~d~~v~~~a~~A 50 (118)
++.|.+.+...+ .+||+. ..-.|..+.+.+++-
T Consensus 19 ~l~A~ae~~v~ld~P~~GWr~-s~g~~~~~~q~v~YP 54 (268)
T COG4676 19 SLVAWAEPEVELDAPLSGWRP-SGGEDASYRQSVNYP 54 (268)
T ss_pred chhhhcCCcccccCccccccc-CCCccccccccccCC
Confidence 555666555443 689988 665555554444443
No 30
>PRK09408 ompX outer membrane protein X; Provisional
Probab=26.19 E-value=56 Score=23.36 Aligned_cols=37 Identities=19% Similarity=0.273 Sum_probs=19.3
Q ss_pred CcchHHHHHHHHHHhhhcccccCCCCcccceeeeCCCC
Q 033492 1 MNQRFCCLIVLFLSVVPLLAAGDRKGALVGGWKPIEDP 38 (118)
Q Consensus 1 m~~~~~~~~~~~~~~~~~~~~~~~~~~~~GG~~~i~~~ 38 (118)
|.+..|+.+++++++++..++......+.+|+.. .++
T Consensus 1 mkk~~~~~~~~~~~~~~~~~~~~~~~t~s~GYaq-~~~ 37 (171)
T PRK09408 1 MKKIACLSALACVLAVTAGTAVAATSTVTGGYAQ-SDA 37 (171)
T ss_pred CceEehHHHHHHHHHHhhhhhhcccceEEEEEEE-eec
Confidence 6665555444333332222122234788899887 454
No 31
>cd06247 M14_CPO Peptidase M14 carboxypeptidase (CP) O (CPO, also known as metallocarboxypeptidase C; EC 3.4.17.) belongs to the carboxypeptidase A/B subfamily of the M14 family of metallocarboxypeptidases (MCPs). The M14 family are zinc-binding CPs which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. CPO has not been well characterized as yet, and little is known about it. Based on modeling studies, CPO has been suggested to have specificity for acidic residues rather than aliphatic/aromatic residues as in A-like enzymes or basic residues as in B-like enzymes. It remains to be demonstrated that CPO is functional as an MCP.
Probab=26.17 E-value=1e+02 Score=23.96 Aligned_cols=61 Identities=13% Similarity=0.190 Sum_probs=39.3
Q ss_pred cceeeeCCCCCcHHHHHHHHHHHHHHHHhcCCCeEEEE---EEE---E----EEEEeceEEEEEEEEEEeCC
Q 033492 29 VGGWKPIEDPKEKHVMEIGQFAVTEYNKQSKSALKFES---VEK---G----ETQVVSGTNYRLILVVKDGP 90 (118)
Q Consensus 29 ~GG~~~i~~~~d~~v~~~a~~Av~~~n~~~~~~~~~~~---V~~---a----~~QVVaG~nY~l~v~~~~~~ 90 (118)
|=|++.-..++++++.++++.++..+.+..+..|..-. ++- . +.+ ..|..|-+++++.+..
T Consensus 199 P~g~~~~~~~n~~~~~~~a~~~~~ai~~~~~~~y~~g~~~~~~y~a~G~s~Dwa~-~~~~~~s~t~El~~~g 269 (298)
T cd06247 199 PYGYTKEPSSNHEEMMLVAQKAAAALKEKHGTEYRVGSSALILYSNSGSSRDWAV-DIGIPFSYTFELRDNG 269 (298)
T ss_pred CCcCCCCCCCCHHHHHHHHHHHHHHHHHhcCCCCccCCcccccccCCCChhhhhh-ccCCCEEEEEEeCCCC
Confidence 33444423457888999999999988887776775421 111 0 112 2588899999997654
No 32
>PF03823 Neurokinin_B: Neurokinin B; InterPro: IPR003635 Tachykinins are a group of biologically active peptides which excite neurons, evoke behavioral responses, are potent vasodilators and contract (directly or indirectly) many smooth muscles. This family includes neurokinins, as well as many other peptides. Like other tachykinins, neurokinins are synthesized as larger protein precursors that are enzymatically converted to their mature forms.; GO: 0007217 tachykinin receptor signaling pathway
Probab=26.16 E-value=81 Score=18.66 Aligned_cols=34 Identities=12% Similarity=0.136 Sum_probs=15.5
Q ss_pred CcchHHHHH-HHHHHhhhcccccC--CCCccc-ceeee
Q 033492 1 MNQRFCCLI-VLFLSVVPLLAAGD--RKGALV-GGWKP 34 (118)
Q Consensus 1 m~~~~~~~~-~~~~~~~~~~~~~~--~~~~~~-GG~~~ 34 (118)
|+.-.++++ |++-.+-+..|.+. ++...+ ||.++
T Consensus 1 MR~~lLf~aiLalsla~s~gavCeesQeQ~~p~gg~sk 38 (59)
T PF03823_consen 1 MRSTLLFAAILALSLARSFGAVCEESQEQVVPGGGHSK 38 (59)
T ss_pred ChhHHHHHHHHHHHHHHHhhhhhhhhhhccCCCCCccc
Confidence 655444433 33333335555555 233345 45555
No 33
>PRK15346 outer membrane secretin SsaC; Provisional
Probab=24.77 E-value=95 Score=25.78 Aligned_cols=48 Identities=13% Similarity=0.056 Sum_probs=23.1
Q ss_pred CcchHHHHHHHHHHhhhcccccCCCCcccceeeeCCCCCcHHHHHHHHHHH
Q 033492 1 MNQRFCCLIVLFLSVVPLLAAGDRKGALVGGWKPIEDPKEKHVMEIGQFAV 51 (118)
Q Consensus 1 m~~~~~~~~~~~~~~~~~~~~~~~~~~~~GG~~~i~~~~d~~v~~~a~~Av 51 (118)
|-++.++.+|++|+.+.-.++ . ...-.|-.-.+ +..|.++.++++..-
T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~-~-~~~w~~~~~~~-~~~~~di~~vl~~~a 48 (499)
T PRK15346 1 MKKLLILIFLFLLNTAKFAAS-K-SIPWQGNPFFI-YSRGMPLAEVLHDLG 48 (499)
T ss_pred CchhHHHHHHHHHhhhhhhcc-C-CCCCCCCCEEE-EECCCcHHHHHHHHH
Confidence 445555555555555433222 1 11122322332 567888877777433
No 34
>PF08139 LPAM_1: Prokaryotic membrane lipoprotein lipid attachment site; InterPro: IPR012640 In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognises a conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached. This lipid attachment site is found in homologues of the VirB proteins of type IV secretion systems (T4SS). Conjugal transfer across the cell envelope of Gram-negative bacteria is mediated by a supramolecular structure termed mating pair formation (Mpf) complex. Collectively, secretion pathways ancestrally related to bacterial conjugation systems are now known as T4SS. T4SS are involved in the delivery of effector molecules to eukaryotic target cells; each of these systems exports distinct DNA or protein substrates to effect a myriad of changes in host cell physiology during infection.
Probab=23.94 E-value=53 Score=16.15 Aligned_cols=12 Identities=0% Similarity=0.185 Sum_probs=5.1
Q ss_pred chHHHHHHHHHH
Q 033492 3 QRFCCLIVLFLS 14 (118)
Q Consensus 3 ~~~~~~~~~~~~ 14 (118)
+|++++++++++
T Consensus 8 Kkil~~l~a~~~ 19 (25)
T PF08139_consen 8 KKILFPLLALFM 19 (25)
T ss_pred HHHHHHHHHHHH
Confidence 444444444333
No 35
>PF03082 MAGSP: Male accessory gland secretory protein; InterPro: IPR004315 The accessory gland of male insects is a genital tissue that secretes many components of the ejaculatory fluid, some of which affect the female's receptivity to courtship and her rate of oviposition. The protein is expressed exclusively in the male accessory glands of adult Drosophila melanogaster. During copulation it is transferred to the female genital tract where it is rapidly altered.; GO: 0007618 mating, 0005576 extracellular region
Probab=23.05 E-value=55 Score=24.74 Aligned_cols=22 Identities=36% Similarity=0.439 Sum_probs=16.2
Q ss_pred CcchH-HHHHHHHHHhhhccccc
Q 033492 1 MNQRF-CCLIVLFLSVVPLLAAG 22 (118)
Q Consensus 1 m~~~~-~~~~~~~~~~~~~~~~~ 22 (118)
|++.. |-.+|+++++|..+.+.
T Consensus 1 MNQILLCS~iLLllfaVAnC~~~ 23 (264)
T PF03082_consen 1 MNQILLCSAILLLLFAVANCDGL 23 (264)
T ss_pred CceeeehHHHHHHHHHHhhcccc
Confidence 88876 44888888887776653
No 36
>TIGR03431 PhnD phosphonate ABC transporter, periplasmic phosphonate binding protein. Note that this model does not identify all phnD-subfamily genes with evident phosphonate context, but all sequences above the trusted context may be inferred to bind phosphonate compounds even in the absence of such context. Furthermore, there is ample evidence to suggest that many other members of the TIGR01098 subfamily have a different primary function.
Probab=21.98 E-value=3.3e+02 Score=20.07 Aligned_cols=34 Identities=21% Similarity=0.156 Sum_probs=17.0
Q ss_pred CcchHHHHHHHHHHhhhcccccCCC-Ccccceeee
Q 033492 1 MNQRFCCLIVLFLSVVPLLAAGDRK-GALVGGWKP 34 (118)
Q Consensus 1 m~~~~~~~~~~~~~~~~~~~~~~~~-~~~~GG~~~ 34 (118)
|.+|+|+..+..+++-++.+..... ..+.-|..+
T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~vg~~~ 35 (288)
T TIGR03431 1 MLRRLILSLVAAFMLISSNAQAEDWPKELNFGIIP 35 (288)
T ss_pred ChhhHHHHHHHHHHHHhcchhhhcCCCeEEEEEcC
Confidence 7778766444444443333332222 345666665
No 37
>PF05887 Trypan_PARP: Procyclic acidic repetitive protein (PARP); InterPro: IPR008882 This family consists of several Trypanosoma brucei procyclic acidic repetitive protein (PARP) like sequences. The procyclic acidic repetitive protein (parp) genes of T. brucei encode a small family of abundant surface proteins whose expression is restricted to the procyclic form of the parasite. They are found at two unlinked loci, parpA and parpB; transcription of both loci is developmentally regulated.; GO: 0016020 membrane; PDB: 2X34_B 2X32_B.
Probab=21.86 E-value=30 Score=24.12 Aligned_cols=13 Identities=38% Similarity=0.869 Sum_probs=0.0
Q ss_pred CcchHHHHHHHHH
Q 033492 1 MNQRFCCLIVLFL 13 (118)
Q Consensus 1 m~~~~~~~~~~~~ 13 (118)
|..|.+|+|.++|
T Consensus 1 m~pr~l~~LavLL 13 (143)
T PF05887_consen 1 MTPRHLCLLAVLL 13 (143)
T ss_dssp -------------
T ss_pred Ccccccccccccc
Confidence 7777655433333
No 38
>PF05438 TRH: Thyrotropin-releasing hormone (TRH); InterPro: IPR008857 This family consists of several thyrotropin-releasing hormone (TRH) proteins. Thyrotropin-Releasing Hormone (TRH; pyroGlu-His-Pro-NH2), originally isolated as a hypothalamic neuropeptide hormone, most likely acts also as a neuromodulator and/or neurotransmitter in the central nervous system (CNS). This interpretation is supported by the identification of a peptidase localised on the surface of neuronal cells which has been termed TRH-degrading ectoenzyme (TRH-DE) since it selectively inactivates TRH. TRH has been used clinically for the treatment of spinocerebellar degeneration and disturbance of consciousness in humans.; GO: 0005184 neuropeptide hormone activity, 0009755 hormone-mediated signaling pathway, 0005576 extracellular region
Probab=20.73 E-value=1.8e+02 Score=21.72 Aligned_cols=16 Identities=19% Similarity=0.187 Sum_probs=8.3
Q ss_pred HHHHHHHHHhhhcccc
Q 033492 6 CCLIVLFLSVVPLLAA 21 (118)
Q Consensus 6 ~~~~~~~~~~~~~~~~ 21 (118)
|+++|++|+++...+.
T Consensus 2 wlllll~L~l~~~~v~ 17 (212)
T PF05438_consen 2 WLLLLLALTLCNTGVP 17 (212)
T ss_pred HHHHHHHHHHhhcccc
Confidence 4556666655444333
No 39
>cd01781 AF6_RA_repeat2 Ubiquitin domain of AF-6, second repeat. The AF-6 protein (also known as afadin and canoe) is a multidomain cell junction protein that contains two N-terminal Ras-associating (RA) domains in addition to FHA (forkhead-associated), DIL (class V myosin homology region), and PDZ domains and a proline-rich region. AF6 acts downstream of the Egfr (Epidermal Growth Factor-receptor)/Ras signalling pathway and provides a link from Egfr to cytoskeletal elements.
Probab=20.33 E-value=1.5e+02 Score=19.52 Aligned_cols=30 Identities=13% Similarity=0.096 Sum_probs=23.2
Q ss_pred CcHHHHHHHHHHHHHHHHhcC--CCeEEEEEE
Q 033492 39 KEKHVMEIGQFAVTEYNKQSK--SALKFESVE 68 (118)
Q Consensus 39 ~d~~v~~~a~~Av~~~n~~~~--~~~~~~~V~ 68 (118)
++....++...|+.+|+-+.. .+|.+++|+
T Consensus 24 ~~~~a~~vV~eALeKygL~~e~p~~Y~LveV~ 55 (100)
T cd01781 24 INDNADRIVGEALEKYGLEKSDPDDYCLVEVS 55 (100)
T ss_pred CCccHHHHHHHHHHHhCCCccCccceEEEEEe
Confidence 566778899999999987654 367877775
No 40
>PF07353 Uroplakin_II: Uroplakin II; InterPro: IPR009952 This family contains uroplakin II, which is approximately 180 residues long and seems to be restricted to mammals. Uroplakin II is an integral membrane protein, and is one of the components of the apical plaques of mammalian urothelium formed by the asymmetric unit membrane - this is believed to play a role in strengthening the urothelial apical surface to prevent the cells from rupturing during bladder distension.; GO: 0016044 cellular membrane organization, 0030176 integral to endoplasmic reticulum membrane
Probab=20.21 E-value=2.4e+02 Score=20.45 Aligned_cols=16 Identities=31% Similarity=0.590 Sum_probs=13.9
Q ss_pred eceEEEEEEEEEEeCC
Q 033492 75 VSGTNYRLILVVKDGP 90 (118)
Q Consensus 75 VaG~nY~l~v~~~~~~ 90 (118)
..|++|++...+.++.
T Consensus 110 ~pGTkY~isY~Vtkgt 125 (184)
T PF07353_consen 110 QPGTKYYISYLVTKGT 125 (184)
T ss_pred CCCcEEEEEEEEecCc
Confidence 4699999999998875
No 41
>PF15281 Consortin_C: Consortin C-terminus
Probab=20.21 E-value=76 Score=21.34 Aligned_cols=15 Identities=33% Similarity=0.388 Sum_probs=9.8
Q ss_pred HHHHHHHHhhhcccc
Q 033492 7 CLIVLFLSVVPLLAA 21 (118)
Q Consensus 7 ~~~~~~~~~~~~~~~ 21 (118)
+|+|+++++|.++..
T Consensus 56 ~L~LlclvTv~lS~g 70 (113)
T PF15281_consen 56 LLLLLCLVTVVLSVG 70 (113)
T ss_pred HHHHHHHHHHHHhcc
Confidence 466777777766555
No 42
>PF07312 DUF1459: Protein of unknown function (DUF1459); InterPro: IPR009924 This family consists of several hypothetical Caenorhabditis elegans proteins of around 85 residues in length. The function of this family is unknown.
Probab=20.12 E-value=93 Score=19.75 Aligned_cols=16 Identities=25% Similarity=0.366 Sum_probs=9.9
Q ss_pred CcchHHHHHHHHHHhh
Q 033492 1 MNQRFCCLIVLFLSVV 16 (118)
Q Consensus 1 m~~~~~~~~~~~~~~~ 16 (118)
|+.|-+.++++++++.
T Consensus 1 MF~Kc~~~l~l~~f~i 16 (84)
T PF07312_consen 1 MFQKCIIVLLLCLFCI 16 (84)
T ss_pred ChHHHHHHHHHHHHHH
Confidence 7777655555555553
Done!