Query 040138
Match_columns 216
No_of_seqs 183 out of 772
Neff 4.5
Searched_HMMs 46136
Date Fri Mar 29 05:30:48 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/040138.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/040138hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PLN03024 Putative EG45-like do 99.8 1.4E-20 3E-25 152.0 11.1 96 98-214 19-124 (125)
2 PLN00193 expansin-A; Provision 99.7 1.7E-16 3.6E-21 141.5 11.8 114 100-215 30-155 (256)
3 PLN00050 expansin A; Provision 99.7 2.6E-16 5.6E-21 139.7 11.2 114 100-215 25-147 (247)
4 PF03330 DPBB_1: Rare lipoprot 99.7 6.8E-17 1.5E-21 118.4 6.2 67 133-213 1-78 (78)
5 smart00837 DPBB_1 Rare lipopro 99.6 1.2E-15 2.5E-20 116.1 7.5 80 133-213 1-87 (87)
6 PLN03023 Expansin-like B1; Pro 99.4 1.5E-12 3.2E-17 115.8 10.3 98 101-214 26-137 (247)
7 PF00967 Barwin: Barwin family 98.8 7.4E-09 1.6E-13 83.2 4.7 57 143-215 56-116 (119)
8 TIGR00413 rlpA rare lipoprotei 98.6 2.3E-07 5E-12 81.1 10.4 85 103-213 1-88 (208)
9 COG0797 RlpA Lipoproteins [Cel 98.6 4.8E-07 1E-11 80.2 11.1 137 50-212 32-171 (233)
10 PRK10672 rare lipoprotein A; P 98.5 8.4E-07 1.8E-11 83.0 11.1 103 84-212 64-167 (361)
11 COG4305 Endoglucanase C-termin 97.9 3.8E-05 8.3E-10 66.5 7.7 73 128-215 53-129 (232)
12 PF07249 Cerato-platanin: Cera 97.8 7.7E-05 1.7E-09 60.3 7.1 65 131-215 43-111 (119)
13 PF07127 Nodulin_late: Late no 94.5 0.062 1.3E-06 37.3 3.9 36 10-45 3-39 (54)
14 PF02977 CarbpepA_inh: Carboxy 86.3 0.27 5.8E-06 33.9 0.5 23 26-48 2-24 (46)
15 PF07473 Toxin_11: Spasmodic p 69.4 3.6 7.8E-05 25.6 1.6 23 31-53 1-23 (28)
16 TIGR02645 ARCH_P_rylase putati 61.8 25 0.00054 34.9 6.7 44 145-202 27-70 (493)
17 PF01683 EB: EB module; Inter 57.9 8.2 0.00018 25.8 1.9 22 33-54 19-40 (52)
18 cd00150 PlantTI Plant trypsin 46.6 14 0.00031 22.9 1.5 21 34-54 6-27 (27)
19 smart00286 PTI Plant trypsin i 43.4 18 0.0004 22.7 1.6 21 34-54 8-29 (29)
20 TIGR03327 AMP_phos AMP phospho 38.0 1E+02 0.0022 30.7 6.7 43 146-202 29-71 (500)
21 PF02532 PsbI: Photosystem II 32.4 36 0.00078 22.5 1.8 17 13-29 11-28 (36)
22 PF00299 Squash: Squash family 32.3 22 0.00047 22.4 0.7 20 34-53 8-28 (29)
23 CHL00024 psbI photosystem II p 31.6 25 0.00054 23.2 0.9 16 13-28 11-27 (36)
24 PRK04350 thymidine phosphoryla 31.4 1.6E+02 0.0034 29.4 6.8 42 148-202 24-65 (490)
25 TIGR03170 flgA_cterm flagella 31.0 72 0.0016 24.5 3.7 23 146-168 94-117 (122)
26 PRK02655 psbI photosystem II r 30.6 26 0.00057 23.3 0.9 17 13-29 11-28 (38)
27 PF13144 SAF_2: SAF-like 28.9 1E+02 0.0022 25.7 4.5 24 145-168 167-191 (196)
28 PF01666 DX: DX module; Inter 26.4 47 0.001 24.8 1.8 24 34-57 48-76 (76)
29 smart00051 DSL delta serrate l 24.2 38 0.00083 24.2 0.9 29 29-65 27-55 (63)
30 PRK12618 flgA flagellar basal 22.8 1.2E+02 0.0026 24.9 3.7 23 146-168 110-133 (141)
31 PRK08515 flgA flagellar basal 22.3 1.1E+02 0.0024 26.7 3.7 24 145-168 193-216 (222)
32 KOG4106 Uncharacterized conser 22.1 88 0.0019 25.6 2.7 24 143-167 24-47 (125)
33 PF02015 Glyco_hydro_45: Glyco 21.5 1.6E+02 0.0036 26.1 4.5 25 145-169 82-110 (201)
No 1
>PLN03024 Putative EG45-like domain containing protein 1; Provisional
Probab=99.84 E-value=1.4e-20 Score=151.97 Aligned_cols=96 Identities=27% Similarity=0.450 Sum_probs=82.7
Q ss_pred cccceeEEEEeecCCCCCCCCCCCCCCCccCCCCcEEEeecccCCCCcCCCceEEEEeC----------CCCEEEEEEEe
Q 040138 98 TSSTQARLTNNDFSEGGDGGGPSECDGQYHDNSKPIAALSTGWYSGGSRCGKMIRITAN----------NGRSVLAQVVD 167 (216)
Q Consensus 98 ~~~t~A~lT~y~f~~g~dgGg~gACG~~~~~d~d~VVALSsg~f~~g~~CGk~I~It~~----------NGkSV~atVVD 167 (216)
...+.+++|||+. +.++||++ .+.+++++|||++.+|++++.||++++|++. |||+|+|+|+|
T Consensus 19 ~~~~~G~AT~Y~~------~~~gAC~~-~~~~g~~iaAls~~lf~~G~~CG~c~~V~C~~~~~~~~~~c~gksV~V~VtD 91 (125)
T PLN03024 19 SYATPGIATFYTS------YTPSACYR-GTSFGVMIAAASDSLWNNGRVCGKMFTVKCKGPRNAVPHPCTGKSVTVKIVD 91 (125)
T ss_pred hcccceEEEEeCC------CCCccccC-CCCCCCEeEEeCHHHcCCCcccCceEEEEECCCCccccccccCCeEEEEEEc
Confidence 4467899999962 34579974 5778999999999999999999999999861 58999999999
Q ss_pred CCCCCCCCCcCCCCCCCCCCCeeecCHHHHHHhccCCCccEEEEEEE
Q 040138 168 ECDSMRGCDEEHAGQPPCDNNIVDGSDAVWSALGLDKEIGIVDVTWS 214 (216)
Q Consensus 168 eC~s~~GCd~~~~~~p~C~~n~lDLS~avF~aLg~~~~~G~i~VtWs 214 (216)
+||+ +|. ++||||+++|++|+ +.+.|+|+|+|.
T Consensus 92 ~CP~------------~C~-~~~DLS~~AF~~iA-~~~aG~v~V~y~ 124 (125)
T PLN03024 92 HCPS------------GCA-STLDLSREAFAQIA-NPVAGIINIDYI 124 (125)
T ss_pred CCCC------------CCC-CceEcCHHHHHHhc-CccCCEEEEEEe
Confidence 9973 465 59999999999999 789999999995
No 2
>PLN00193 expansin-A; Provisional
Probab=99.69 E-value=1.7e-16 Score=141.54 Aligned_cols=114 Identities=25% Similarity=0.306 Sum_probs=91.1
Q ss_pred cceeEEEEeecCCCCCCCCCCCCCCCc---cCCCCcEEEeecccCCCCcCCCceEEEEeC---------CCCEEEEEEEe
Q 040138 100 STQARLTNNDFSEGGDGGGPSECDGQY---HDNSKPIAALSTGWYSGGSRCGKMIRITAN---------NGRSVLAQVVD 167 (216)
Q Consensus 100 ~t~A~lT~y~f~~g~dgGg~gACG~~~---~~d~d~VVALSsg~f~~g~~CGk~I~It~~---------NGkSV~atVVD 167 (216)
=++|++|||.-.++ .|...||||+.. ...+.+++|||+.+|+++..||++++|+++ +|++|+++|+|
T Consensus 30 W~~a~AT~Yg~~d~-~gt~gGACGYg~l~~~~~g~~~AAls~~lf~~G~~CGaCyev~C~~~~~~~~C~~g~sV~Vt~td 108 (256)
T PLN00193 30 WTKAHATFYGGSDA-SGTMGGACGYGNLYSTGYGTRTAALSTALFNDGASCGQCYRIMCDYQADSRWCIKGASVTITATN 108 (256)
T ss_pred ceeeEEEEcCCCCC-CCCCCcccCCCCccccCCCceeeecCHhHccCCccccCeEEEECCCCCCCccccCCCeEEEEEec
Confidence 46799999975333 223458999752 234678999999999999999999999972 47799999999
Q ss_pred CCCCCCCCCcCCCCCCCCCCCeeecCHHHHHHhccCCCccEEEEEEEe
Q 040138 168 ECDSMRGCDEEHAGQPPCDNNIVDGSDAVWSALGLDKEIGIVDVTWSM 215 (216)
Q Consensus 168 eC~s~~GCd~~~~~~p~C~~n~lDLS~avF~aLg~~~~~G~i~VtWs~ 215 (216)
+||...+-+++.+.|..=+..|+|||+.||.+|+ ....|+|+|+|+-
T Consensus 109 ~CP~n~~~~~~~ggwC~~~~~HFDLS~~AF~~iA-~~~~Giv~V~yrR 155 (256)
T PLN00193 109 FCPPNYALPNNNGGWCNPPLQHFDMAQPAWEKIG-IYRGGIVPVLFQR 155 (256)
T ss_pred CCCCcccccccCCCcCCCCCcccccCHHHHHHHh-hhcCCeEeEEEEE
Confidence 9998766666666653335589999999999999 5789999999963
No 3
>PLN00050 expansin A; Provisional
Probab=99.68 E-value=2.6e-16 Score=139.70 Aligned_cols=114 Identities=22% Similarity=0.256 Sum_probs=90.7
Q ss_pred cceeEEEEeecCCCCCCCCCCCCCCCcc---CCCCcEEEeecccCCCCcCCCceEEEEeCCC------CEEEEEEEeCCC
Q 040138 100 STQARLTNNDFSEGGDGGGPSECDGQYH---DNSKPIAALSTGWYSGGSRCGKMIRITANNG------RSVLAQVVDECD 170 (216)
Q Consensus 100 ~t~A~lT~y~f~~g~dgGg~gACG~~~~---~d~d~VVALSsg~f~~g~~CGk~I~It~~NG------kSV~atVVDeC~ 170 (216)
=..|++|||.-.++ .|...||||+... ..+.+++|||+.+|+++..||.+++|+++++ ++|+++|+|+||
T Consensus 25 W~~a~AT~Yg~~dg-~gt~gGACGYg~l~~~~~g~~~AAls~~lf~~G~~CGaCyeV~C~~~~~~C~~gsV~V~itd~CP 103 (247)
T PLN00050 25 WTGAHATFYGGGDA-SGTMGGACGYGNLYSQGYGTNTAALSTALFNNGLSCGACFEIKCVNDNIWCLPGSIIITATNFCP 103 (247)
T ss_pred ccccEEEEcCCCCC-CCCCCcccCCCCccccCCCceeeeccHhHccCCccccceEEEEcCCCCcccCCCcEEEEEecCCC
Confidence 46799999975433 2234589997531 3467999999999999999999999998433 389999999999
Q ss_pred CCCCCCcCCCCCCCCCCCeeecCHHHHHHhccCCCccEEEEEEEe
Q 040138 171 SMRGCDEEHAGQPPCDNNIVDGSDAVWSALGLDKEIGIVDVTWSM 215 (216)
Q Consensus 171 s~~GCd~~~~~~p~C~~n~lDLS~avF~aLg~~~~~G~i~VtWs~ 215 (216)
...+.+.+.+.|..=+..|||||+.||.+|+ ....|+|+|+|+-
T Consensus 104 ~~~~~~~~~~gwC~~~~~hFDLS~~AF~~iA-~~~aGii~V~yRR 147 (247)
T PLN00050 104 PNLALPNNDGGWCNPPQQHFDLSQPVFQKIA-QYKAGIVPVQYRR 147 (247)
T ss_pred CCcCcCccCCCcCCCCCcccccCHHHHHHHh-hhcCCeeeeEEEE
Confidence 8777666555553225589999999999999 6889999999973
No 4
>PF03330 DPBB_1: Rare lipoprotein A (RlpA)-like double-psi beta-barrel; InterPro: IPR009009 Beta barrels are commonly observed in protein structures. They are classified in terms of two integral parameters: the number of strands in the sheet, n, and the shear number, S, a measure of the stagger of the strands in the beta-sheet. These two parameters have been shown to determine the major geometrical features of beta-barrels. Six-stranded beta-barrels with a pseudo-twofold axis are found in several proteins. One involving parallel strands forming two psi structures is known as the double-psi barrel. The first psi structure consists of the loop connecting strands beta1 and beta2 (a 'psi loop') and the strand beta5, whereas the second psi structure consists of the loop connecting strands beta4 and beta5 and the strand beta2. All the psi structures in double-psi barrels have a unique handedness, in that beta1 (beta4), beta2 (beta5) and the loop following beta5 (beta2) form a right-handed helix. The unique handedness may be related to the fact that the twisting angle between the parallel pair of strands is always larger than that between the antiparallel pair [].; PDB: 1N10_B 3D30_A 2BH0_A 2HCZ_X.
Probab=99.68 E-value=6.8e-17 Score=118.44 Aligned_cols=67 Identities=39% Similarity=0.724 Sum_probs=58.2
Q ss_pred EEEeecccCCCCcCCCceEEEEe---C-C-----C--CEEEEEEEeCCCCCCCCCcCCCCCCCCCCCeeecCHHHHHHhc
Q 040138 133 IAALSTGWYSGGSRCGKMIRITA---N-N-----G--RSVLAQVVDECDSMRGCDEEHAGQPPCDNNIVDGSDAVWSALG 201 (216)
Q Consensus 133 VVALSsg~f~~g~~CGk~I~It~---~-N-----G--kSV~atVVDeC~s~~GCd~~~~~~p~C~~n~lDLS~avF~aLg 201 (216)
.||++..||+++..||++++++. . . . |+|+|+|+|+|| .|.+++||||+++|++|+
T Consensus 1 t~a~~~~~y~~g~~cG~~~~~~~~~~a~~~~~~~~~~ksV~v~V~D~Cp-------------~~~~~~lDLS~~aF~~la 67 (78)
T PF03330_consen 1 TAAGSATWYDNGTACGQCYQVTCLTAASATGTCKVGNKSVTVTVVDRCP-------------GCPPNHLDLSPAAFKALA 67 (78)
T ss_dssp EEEE-HHHHGGGTTTT-EEEEEE---SSTT--BESEECEEEEEEEEE-T-------------TSSSSEEEEEHHHHHHTB
T ss_pred CeEEEhhhcCCCCcCCCeeeccccccCCccceEEecCCeEEEEEEccCC-------------CCcCCEEEeCHHHHHHhC
Confidence 48999999999999999999998 1 1 2 999999999995 589999999999999999
Q ss_pred cCCCccEEEEEE
Q 040138 202 LDKEIGIVDVTW 213 (216)
Q Consensus 202 ~~~~~G~i~VtW 213 (216)
+.+.|+++|+|
T Consensus 68 -~~~~G~i~V~w 78 (78)
T PF03330_consen 68 -DPDAGVIPVEW 78 (78)
T ss_dssp -STTCSSEEEEE
T ss_pred -CCCceEEEEEC
Confidence 78999999999
No 5
>smart00837 DPBB_1 Rare lipoprotein A (RlpA)-like double-psi beta-barrel. Rare lipoprotein A (RlpA) contains a conserved region that has the double-psi beta-barrel (DPBB) fold. The function of RlpA is not well understood, but it has been shown to act as a prc mutant suppressor in Escherichia coli. The DPBB fold is often an enzymatic domain. The members of this family are quite diverse, and if catalytic this family may contain several different functions. Another example of this domain is found in the N terminus of pollen allergen.
Probab=99.62 E-value=1.2e-15 Score=116.13 Aligned_cols=80 Identities=24% Similarity=0.375 Sum_probs=67.2
Q ss_pred EEEeecccCCCCcCCCceEEEEeC-------CCCEEEEEEEeCCCCCCCCCcCCCCCCCCCCCeeecCHHHHHHhccCCC
Q 040138 133 IAALSTGWYSGGSRCGKMIRITAN-------NGRSVLAQVVDECDSMRGCDEEHAGQPPCDNNIVDGSDAVWSALGLDKE 205 (216)
Q Consensus 133 VVALSsg~f~~g~~CGk~I~It~~-------NGkSV~atVVDeC~s~~GCd~~~~~~p~C~~n~lDLS~avF~aLg~~~~ 205 (216)
++|||+.+|+++..||++++|++. +|++|+|+|+|+|+...+.+.+.+.|..=+..|||||+.||.+|+ ...
T Consensus 1 taA~s~~lf~~G~~CG~Cy~v~C~~~~~~C~~~~~V~V~vtd~CP~~~~~~~~~~~~C~~~~~hfDLS~~AF~~iA-~~~ 79 (87)
T smart00837 1 TAALSTALFNNGASCGACYEIMCVDSPKWCKPGGSITVTATNFCPPNYALSNDNGGWCNPPRKHFDLSQPAFEKIA-QYK 79 (87)
T ss_pred CcccCHHHccCCccccceEEEEeCCCCCcccCCCeEEEEEeccCCccccccccCCCccCCCCcCeEcCHHHHHHHh-hhc
Confidence 489999999999999999999962 457999999999998766655555442224689999999999999 689
Q ss_pred ccEEEEEE
Q 040138 206 IGIVDVTW 213 (216)
Q Consensus 206 ~G~i~VtW 213 (216)
.|+|+|+|
T Consensus 80 ~Gvi~v~y 87 (87)
T smart00837 80 AGIVPVKY 87 (87)
T ss_pred CCEEeeEC
Confidence 99999987
No 6
>PLN03023 Expansin-like B1; Provisional
Probab=99.40 E-value=1.5e-12 Score=115.82 Aligned_cols=98 Identities=22% Similarity=0.265 Sum_probs=78.0
Q ss_pred ceeEEEEeecCCCCCCCCCCCCCCCc---cCCCCcEEEeecccCCCCcCCCceEEEEeC-----CCCEEEEEEEeCCCCC
Q 040138 101 TQARLTNNDFSEGGDGGGPSECDGQY---HDNSKPIAALSTGWYSGGSRCGKMIRITAN-----NGRSVLAQVVDECDSM 172 (216)
Q Consensus 101 t~A~lT~y~f~~g~dgGg~gACG~~~---~~d~d~VVALSsg~f~~g~~CGk~I~It~~-----NGkSV~atVVDeC~s~ 172 (216)
.+|++|||.-.+| .|...||||+.. ..++-+++|+| .+|++|..||+|++|++. .+++|+++|+|.|+
T Consensus 26 ~~a~AT~Yg~~~g-~gt~gGACGYg~~~~~~~g~~~aa~s-~Lf~~G~~CGaCy~irC~~~~~C~~~~v~V~iTd~~~-- 101 (247)
T PLN03023 26 TYSRATYYGSPDC-LGTPTGACGFGEYGRTVNGGNVAGVS-RLYRNGTGCGACYQVRCKAPNLCSDDGVNVVVTDYGE-- 101 (247)
T ss_pred ccceEEEeCCCCC-CCCCCccccCCccccCCCcceeeeeh-hhhcCCchhcccEEeecCCCCccCCCCeEEEEEeCCC--
Confidence 5689999985433 334568999743 23457899998 899999999999999983 35689999999995
Q ss_pred CCCCcCCCCCCCCCCCeeecCHHHHHHhccC------CCccEEEEEEE
Q 040138 173 RGCDEEHAGQPPCDNNIVDGSDAVWSALGLD------KEIGIVDVTWS 214 (216)
Q Consensus 173 ~GCd~~~~~~p~C~~n~lDLS~avF~aLg~~------~~~G~i~VtWs 214 (216)
..+.|+|||..||.+|+.+ ...|+|+|+++
T Consensus 102 ------------~~~~hFdLS~~AF~~iA~pg~~~~l~~aGiv~v~Yr 137 (247)
T PLN03023 102 ------------GDKTDFILSPRAYARLARPNMAAELFAYGVVDVEYR 137 (247)
T ss_pred ------------CCCCccccCHHHHHHHhCccccchhccCcEEEeEEE
Confidence 2468999999999999953 34699999985
No 7
>PF00967 Barwin: Barwin family; InterPro: IPR001153 Barwin is a basic protein isolated from aqueous extracts of barley seeds. It is 125 amino acids in length, and contains six cysteine residues that combine to form three disulphide bridges [, ]. Comparative analysis shows the sequence to be highly similar to a 122 amino acid stretch in the C-terminal of the products of two wound-induced genes (win1 and win2) from potato, the product of the hevein gene of rubber trees, and pathogenesis-related protein 4 from tobacco. The high levels of similarity to these proteins, and their ability to bind saccharides, suggest that the barwin domain may be involved in a common defence mechanism in plants.; GO: 0042742 defense response to bacterium, 0050832 defense response to fungus; PDB: 1BW3_A 1BW4_A.
Probab=98.78 E-value=7.4e-09 Score=83.17 Aligned_cols=57 Identities=30% Similarity=0.580 Sum_probs=42.6
Q ss_pred CCcCCCceEEEEe-CCCCEEEEEEEeCCCCCCCCCcCCCCCCCCCCCeeecCHHHHHHhccCC---CccEEEEEEEe
Q 040138 143 GGSRCGKMIRITA-NNGRSVLAQVVDECDSMRGCDEEHAGQPPCDNNIVDGSDAVWSALGLDK---EIGIVDVTWSM 215 (216)
Q Consensus 143 ~g~~CGk~I~It~-~NGkSV~atVVDeC~s~~GCd~~~~~~p~C~~n~lDLS~avF~aLg~~~---~~G~i~VtWs~ 215 (216)
+...|||+++||+ .+|.+++|+|||+|. ++.|||.+.||.+|--+- ..|.+.|+|+|
T Consensus 56 gq~~CGkClrVTNt~tga~~~~RIVDqCs----------------nGGLDld~~vF~~iDtdG~G~~~Ghl~V~y~f 116 (119)
T PF00967_consen 56 GQDSCGKCLRVTNTATGAQVTVRIVDQCS----------------NGGLDLDPTVFNQIDTDGQGYAQGHLIVDYEF 116 (119)
T ss_dssp SGGGTT-EEEEE-TTT--EEEEEEEEE-S----------------SSSEES-SSSHHHH-SSSHHHHHTEEEEEEEE
T ss_pred CcccccceEEEEecCCCcEEEEEEEEcCC----------------CCCcccChhHHhhhccCCcccccceEEEEEEE
Confidence 4578999999999 479999999999994 569999999999997432 38999999987
No 8
>TIGR00413 rlpA rare lipoprotein A. This is a family of prokaryotic proteins with unknown function. Lipoprotein annotation based on the presence of consensus lipoprotein signal sequence. Included in this family is the E. coli putative lipoprotein rlpA.
Probab=98.65 E-value=2.3e-07 Score=81.06 Aligned_cols=85 Identities=22% Similarity=0.214 Sum_probs=67.9
Q ss_pred eEEEEee--cCCCCCCCCCCCCCCCccCCCCcEEEeecccCCCCcCCCceEEEEe-CCCCEEEEEEEeCCCCCCCCCcCC
Q 040138 103 ARLTNND--FSEGGDGGGPSECDGQYHDNSKPIAALSTGWYSGGSRCGKMIRITA-NNGRSVLAQVVDECDSMRGCDEEH 179 (216)
Q Consensus 103 A~lT~y~--f~~g~dgGg~gACG~~~~~d~d~VVALSsg~f~~g~~CGk~I~It~-~NGkSV~atVVDeC~s~~GCd~~~ 179 (216)
+.++||. |. |..+|.|.. +....|.+|-.+- ..|..++|++ .|||+|+|+|.|++|...|
T Consensus 1 G~ASwYg~~f~-----G~~TAnGe~-y~~~~~tAAHktL------PlgT~V~VtNl~ngrsviVrVnDRGPf~~g----- 63 (208)
T TIGR00413 1 GLASWYGPKFH-----GRKTANGEV-YNMKALTAAHKTL------PFNTYVKVTNLHNNRSVIVRINDRGPFSDD----- 63 (208)
T ss_pred CEEeEeCCCCC-----CCcCCCCee-cCCCccccccccC------CCCCEEEEEECCCCCEEEEEEeCCCCCCCC-----
Confidence 3577885 33 778999975 4456788887765 4589999998 5999999999999986543
Q ss_pred CCCCCCCCCeeecCHHHHHHhccCCCccEEEEEE
Q 040138 180 AGQPPCDNNIVDGSDAVWSALGLDKEIGIVDVTW 213 (216)
Q Consensus 180 ~~~p~C~~n~lDLS~avF~aLg~~~~~G~i~VtW 213 (216)
.+||||++|+++|+. .+.|+.+|.-
T Consensus 64 --------RiIDLS~aAA~~Lg~-~~~G~a~V~v 88 (208)
T TIGR00413 64 --------RIIDLSHAAAREIGL-ISRGVGQVRI 88 (208)
T ss_pred --------CEEECCHHHHHHcCC-CcCceEEEEE
Confidence 799999999999994 7788877653
No 9
>COG0797 RlpA Lipoproteins [Cell envelope biogenesis, outer membrane]
Probab=98.59 E-value=4.8e-07 Score=80.23 Aligned_cols=137 Identities=22% Similarity=0.194 Sum_probs=90.2
Q ss_pred cCccCCCCCCceeeccCCCCCCCCCCCCCCcee--cCCcccCeeecCCCccccceeEEEEeecCCCCCCCCCCCCCCCcc
Q 040138 50 NGKCNDDPDVGTHICKGGEGGGGGNCQPSGTLT--CQGNSYPTYKCSPPVTSSTQARLTNNDFSEGGDGGGPSECDGQYH 127 (216)
Q Consensus 50 ~g~c~ddp~~~t~ic~~~~~~~~~~c~~~~~~~--c~g~~y~~~~cSPp~~~~t~A~lT~y~f~~g~dgGg~gACG~~~~ 127 (216)
..++.+.+.+..+.......+..+.=++.-... =.|+.|.... -+..-...+.+.||.- +=.|..+|=|+. +
T Consensus 32 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~g~~y~~~~--~~~~~~~~G~ASwYg~---~fhgr~TA~Ge~-~ 105 (233)
T COG0797 32 RSKYNFNEQVYGVKADPRDNPLKGSGRLTANKPYQVKGKSYYPKA--EPASFEQVGYASWYGE---KFHGRKTANGER-Y 105 (233)
T ss_pred cccccCcccceecCCCccccccccccccccccceeeeeeEEEeee--ccccccccceeeeecc---ccCCccccCccc-c
Confidence 445666666655555433333210001111011 1577776665 3334446677888861 223778888864 5
Q ss_pred CCCCcEEEeecccCCCCcCCCceEEEEe-CCCCEEEEEEEeCCCCCCCCCcCCCCCCCCCCCeeecCHHHHHHhccCCCc
Q 040138 128 DNSKPIAALSTGWYSGGSRCGKMIRITA-NNGRSVLAQVVDECDSMRGCDEEHAGQPPCDNNIVDGSDAVWSALGLDKEI 206 (216)
Q Consensus 128 ~d~d~VVALSsg~f~~g~~CGk~I~It~-~NGkSV~atVVDeC~s~~GCd~~~~~~p~C~~n~lDLS~avF~aLg~~~~~ 206 (216)
....+.+|-.+-=| +..++||+ .|||+|+|+|.|+.|...| .+||||.+++++|+. ...
T Consensus 106 n~~~~tAAH~TLP~------~t~v~VtNl~NgrsvvVRINDRGPf~~g-------------RiIDlS~aAA~~l~~-~~~ 165 (233)
T COG0797 106 DMNALTAAHKTLPL------PTYVRVTNLDNGRSVVVRINDRGPFVSG-------------RIIDLSKAAADKLGM-IRS 165 (233)
T ss_pred cccccccccccCCC------CCEEEEEEccCCcEEEEEEeCCCCCCCC-------------cEeEcCHHHHHHhCC-ccC
Confidence 55788888877655 78999999 6999999999999987544 899999999999994 777
Q ss_pred cEEEEE
Q 040138 207 GIVDVT 212 (216)
Q Consensus 207 G~i~Vt 212 (216)
|+.+|.
T Consensus 166 G~a~V~ 171 (233)
T COG0797 166 GVAKVR 171 (233)
T ss_pred ceEEEE
Confidence 777654
No 10
>PRK10672 rare lipoprotein A; Provisional
Probab=98.52 E-value=8.4e-07 Score=83.01 Aligned_cols=103 Identities=21% Similarity=0.211 Sum_probs=78.5
Q ss_pred CCcccCeeecCCCccccceeEEEEeecCCCCCCCCCCCCCCCccCCCCcEEEeecccCCCCcCCCceEEEEe-CCCCEEE
Q 040138 84 QGNSYPTYKCSPPVTSSTQARLTNNDFSEGGDGGGPSECDGQYHDNSKPIAALSTGWYSGGSRCGKMIRITA-NNGRSVL 162 (216)
Q Consensus 84 ~g~~y~~~~cSPp~~~~t~A~lT~y~f~~g~dgGg~gACG~~~~~d~d~VVALSsg~f~~g~~CGk~I~It~-~NGkSV~ 162 (216)
.||.|...+ .| ..=+..+.++||.-. -.|..+|-|.. ++...+.+|..+-= -+..++||+ .|||+|+
T Consensus 64 ~G~~Y~~~~-~~-~~~~~~G~ASwYg~~---f~G~~TA~Ge~-~~~~~~tAAH~tLP------lps~vrVtNl~ngrsvv 131 (361)
T PRK10672 64 NGKSYKIVQ-DP-SNFSQAGLAAIYDAE---AGSNLTASGER-FDPNALTAAHPTLP------IPSYVRVTNLANGRMIV 131 (361)
T ss_pred CCEEEEeCc-cC-CCcceEEEEEEeCCc---cCCCcCcCcee-ecCCcCeeeccCCC------CCCEEEEEECCCCcEEE
Confidence 689998764 22 223456888888632 23678999974 45568888887654 488999998 6999999
Q ss_pred EEEEeCCCCCCCCCcCCCCCCCCCCCeeecCHHHHHHhccCCCccEEEEE
Q 040138 163 AQVVDECDSMRGCDEEHAGQPPCDNNIVDGSDAVWSALGLDKEIGIVDVT 212 (216)
Q Consensus 163 atVVDeC~s~~GCd~~~~~~p~C~~n~lDLS~avF~aLg~~~~~G~i~Vt 212 (216)
|+|.|+.+... ..+||||.+++++|+. ...+.|.|+
T Consensus 132 VrVnDRGP~~~-------------gRiiDLS~aAA~~Lg~-~~~~~V~ve 167 (361)
T PRK10672 132 VRINDRGPYGP-------------GRVIDLSRAAADRLNT-SNNTKVRID 167 (361)
T ss_pred EEEeCCCCCCC-------------CCeeEcCHHHHHHhCC-CCCceEEEE
Confidence 99999998643 3799999999999996 556777766
No 11
>COG4305 Endoglucanase C-terminal domain/subunit and related proteins [Carbohydrate transport and metabolism]
Probab=97.92 E-value=3.8e-05 Score=66.53 Aligned_cols=73 Identities=19% Similarity=0.344 Sum_probs=61.3
Q ss_pred CCCCcEEEeecccCCC----CcCCCceEEEEeCCCCEEEEEEEeCCCCCCCCCcCCCCCCCCCCCeeecCHHHHHHhccC
Q 040138 128 DNSKPIAALSTGWYSG----GSRCGKMIRITANNGRSVLAQVVDECDSMRGCDEEHAGQPPCDNNIVDGSDAVWSALGLD 203 (216)
Q Consensus 128 ~d~d~VVALSsg~f~~----g~~CGk~I~It~~NGkSV~atVVDeC~s~~GCd~~~~~~p~C~~n~lDLS~avF~aLg~~ 203 (216)
..+.-|.||+....|- ..+=|..++|+.+.|++ +|.|+|.-| .=..+.+|||+.+|.++| +
T Consensus 53 ~sd~eITAlNPaqlNlGGipAAmAGaYLrVqGPKG~T-TVYVTDlYP-------------egasGaLDLSpNAFakIG-n 117 (232)
T COG4305 53 PSDMEITALNPAQLNLGGIPAAMAGAYLRVQGPKGKT-TVYVTDLYP-------------EGASGALDLSPNAFAKIG-N 117 (232)
T ss_pred CCcceeeecCHHHcccCCchhhhccceEEEECCCCce-EEEEecccc-------------cccccccccChHHHhhhc-c
Confidence 4466789999888773 35789999999988864 688999975 335689999999999999 8
Q ss_pred CCccEEEEEEEe
Q 040138 204 KEIGIVDVTWSM 215 (216)
Q Consensus 204 ~~~G~i~VtWs~ 215 (216)
..+|+|+|.|+.
T Consensus 118 m~qGrIpvqWrv 129 (232)
T COG4305 118 MKQGRIPVQWRV 129 (232)
T ss_pred hhcCccceeEEE
Confidence 999999999974
No 12
>PF07249 Cerato-platanin: Cerato-platanin; InterPro: IPR010829 Cerato-platanin (CP) is the first member of the cerato-platanin family. It is produced by the Ascomycete Ceratocystis fimbriata f. sp. platani and causes the severe plant disease: canker stain. This protein occurs in the cell wall of the fungus and is involved in the host-plant interaction and induces both cell necrosis and phytoalexin synthesis which is one of the first plant defense-related events. CP, like other fungal surface proteins, is able to self assemble in vitro []. CP is a 120 amino acid protein, containing 40% hydrophobic residues and two S-S bridges. It contains four cysteine residues that form two disulphide bonds []. The N-terminal region of CP is very similar to cerato-ulmin, a phytotoxic protein produced by the Ophiostoma species belonging to the hydrophobin family, which also self-assembles []. This entry also includes other precursor proteins.; PDB: 2KQA_A 3M3G_A.
Probab=97.80 E-value=7.7e-05 Score=60.34 Aligned_cols=65 Identities=28% Similarity=0.450 Sum_probs=48.0
Q ss_pred CcEEEeec--ccCCCCcCCCceEEEEeCCCCEEEEEEEeCCCCCCCCCcCCCCCCCCCCCeeecCHHHHHHhcc--CCCc
Q 040138 131 KPIAALST--GWYSGGSRCGKMIRITANNGRSVLAQVVDECDSMRGCDEEHAGQPPCDNNIVDGSDAVWSALGL--DKEI 206 (216)
Q Consensus 131 d~VVALSs--g~f~~g~~CGk~I~It~~NGkSV~atVVDeC~s~~GCd~~~~~~p~C~~n~lDLS~avF~aLg~--~~~~ 206 (216)
-+|.+... +| +.+.||+|+++++ +||++.+..+|.=. ..+++|++||++|.. ..+.
T Consensus 43 p~IGg~~~V~gW--nS~~CGtC~~lty-~g~si~vlaID~a~-----------------~gfnis~~A~n~LT~g~a~~l 102 (119)
T PF07249_consen 43 PYIGGAPAVAGW--NSPNCGTCWKLTY-NGRSIYVLAIDHAG-----------------GGFNISLDAMNDLTNGQAVEL 102 (119)
T ss_dssp TSEEEETT--ST--T-TTTT-EEEEEE-TTEEEEEEEEEE-S-----------------SSEEE-HHHHHHHHTS-CCCC
T ss_pred CeeccccccccC--CCCCCCCeEEEEE-CCeEEEEEEEecCC-----------------CcccchHHHHHHhcCCcccce
Confidence 45666653 45 4578999999999 99999999999953 579999999999964 3469
Q ss_pred cEEEEEEEe
Q 040138 207 GIVDVTWSM 215 (216)
Q Consensus 207 G~i~VtWs~ 215 (216)
|+|+++|+.
T Consensus 103 G~V~a~~~q 111 (119)
T PF07249_consen 103 GRVDATYTQ 111 (119)
T ss_dssp -EEE-EEEE
T ss_pred eEEEEEEEE
Confidence 999999863
No 13
>PF07127 Nodulin_late: Late nodulin protein; InterPro: IPR009810 This family consists of several plant specific late nodulin sequences which are homologous to the Pisum sativum (Garden pea) ENOD3 protein. ENOD3 is expressed in the late stages of root nodule formation and contains two pairs of cysteine residues toward the proteins C terminus which may be involved in metal-binding [].; GO: 0046872 metal ion binding, 0009878 nodule morphogenesis
Probab=94.48 E-value=0.062 Score=37.29 Aligned_cols=36 Identities=22% Similarity=0.308 Sum_probs=25.1
Q ss_pred cchhHHHHHHHHHhhhhhhhhhC-CCCCCCcCCCCCc
Q 040138 10 SLSLLTFFCTISLPLYSNAISQC-NGPCGTLDDCDGQ 45 (216)
Q Consensus 10 ~~~~~~~~~~~~~~~~~~~~~~c-~~~c~~~~dc~g~ 45 (216)
.+-+.++.+.|||++++.+...= ..+|++..||+..
T Consensus 3 ~ilKFvY~mIiflslflv~~~~~~~~~C~~d~DCp~~ 39 (54)
T PF07127_consen 3 KILKFVYAMIIFLSLFLVVTNVDAIIPCKTDSDCPKD 39 (54)
T ss_pred cchhhHHHHHHHHHHHHhhcccCCCcccCccccCCCC
Confidence 44567777777776444444443 5899999999976
No 14
>PF02977 CarbpepA_inh: Carboxypeptidase A inhibitor; InterPro: IPR004231 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. This family is represented by the well-characterised metallocarboxypeptidase A inhibitor (MCPI) from potatoes, which belongs to the MEROPS inhibitor family I37, clan IE. It inhibits metallopeptidases belonging to MEROPS peptidase family M14, carboxypeptidase A. In Russet Burbank potatoes, it is a mixture of approximately equal amounts of two polypeptide chains containing 38 or 39 amino acid residues. The chains differ in their amino terminal sequence only [] and are resistant to fragmentation by proteases []. The structure of the complex between bovine carboxypeptidase A and the 39-amino-acid carboxypeptidase A inhibitor from potatoes has been determined at 2.5-A resolution []. The potato inhibitor is synthesised as a precursor, having a 29 residue N-terminal signal peptide, a 27 residue pro-peptide, the 39 residue mature inhibitor region and a 7 residue C-terminal extension. The 7 residue C-terminal extension is involved in inhibitor inactivation and may be required for targeting to the vacuole where the mature active inhibitor accumulates []. 
The N-terminal region and the mature inhibitor are weakly related to other solanaceous proteins found in this entry, from potato, tomato and henbane, which have been incorrectly described as metallocarboxypeptidase inhibitors [].; GO: 0008191 metalloendopeptidase inhibitor activity; PDB: 4CPA_I 1H20_A 2HLG_A.
Probab=86.34 E-value=0.27 Score=33.93 Aligned_cols=23 Identities=43% Similarity=0.953 Sum_probs=16.6
Q ss_pred hhhhhhCCCCCCCcCCCCCceEE
Q 040138 26 SNAISQCNGPCGTLDDCDGQLIC 48 (216)
Q Consensus 26 ~~~~~~c~~~c~~~~dc~g~l~c 48 (216)
+|...=||++|.+++||.|.-.|
T Consensus 2 ~~~~~tCn~~C~t~sDC~g~tlC 24 (46)
T PF02977_consen 2 SNILGTCNKYCNTNSDCSGITLC 24 (46)
T ss_dssp --S-TTTT-B-SSSCCCTTSSSS
T ss_pred cccccccCCccccCccccceeeh
Confidence 35566799999999999999988
No 15
>PF07473 Toxin_11: Spasmodic peptide gm9a; InterPro: IPR010012 This family consists of several spasmodic peptide gm9a sequences. Conotoxin gm9a is a putative 27-residue polypeptide encoded by Conus gloriamaris and is known to be a homologue of the 'spasmodic peptide', tx9a, isolated from the venom of the mollusk-hunting cone shell Conus textile []. Upon injection of this venom component, normal mice are converted into behavioural phenocopies of a well-known mutant, the spasmodic mouse [].; GO: 0009405 pathogenesis, 0005576 extracellular region; PDB: 1IXT_A.
Probab=69.44 E-value=3.6 Score=25.64 Aligned_cols=23 Identities=26% Similarity=0.869 Sum_probs=14.8
Q ss_pred hCCCCCCCcCCCCCceEEecCcc
Q 040138 31 QCNGPCGTLDDCDGQLICINGKC 53 (216)
Q Consensus 31 ~c~~~c~~~~dc~g~l~c~~g~c 53 (216)
+||..|+++-+|+..-+|...+|
T Consensus 1 sCnnsCqshs~CashC~C~~~~c 23 (28)
T PF07473_consen 1 SCNNSCQSHSKCASHCFCHPEEC 23 (28)
T ss_dssp ---B--SSSS-SSSSEEEETTEE
T ss_pred CcchhhhhhccCccceEEccccc
Confidence 59999999999999999966655
No 16
>TIGR02645 ARCH_P_rylase putative thymidine phosphorylase. Members of this family are closely related to characterized examples of thymidine phosphorylase (EC 2.4.2.4) and pyrimidine nucleoside phosphorylase (EC 2.4.2.2). Most examples are found in the archaea, but other examples occur in Legionella pneumophila str. Paris and Rhodopseudomonas palustris CGA009.
Probab=61.77 E-value=25 Score=34.86 Aligned_cols=44 Identities=11% Similarity=0.096 Sum_probs=34.5
Q ss_pred cCCCceEEEEeCCCCEEEEEEEeCCCCCCCCCcCCCCCCCCCCCeeecCHHHHHHhcc
Q 040138 145 SRCGKMIRITANNGRSVLAQVVDECDSMRGCDEEHAGQPPCDNNIVDGSDAVWSALGL 202 (216)
Q Consensus 145 ~~CGk~I~It~~NGkSV~atVVDeC~s~~GCd~~~~~~p~C~~n~lDLS~avF~aLg~ 202 (216)
-.=+.+|+|+. ++|+++|.|++.=. . =.++.+-||..+|++|+.
T Consensus 27 ~~~~~rv~v~~-~~~~~~a~~~~~~~-~------------~~~~~~gl~~~~~~~l~~ 70 (493)
T TIGR02645 27 FTPQDRVEVRI-GGKSLIAILVGSDT-L------------VEMGEIGLSVSAVETFMA 70 (493)
T ss_pred CCcCCeEEEEe-CCEEEEEEEecccc-c------------ccCCeeeccHHHHHHcCC
Confidence 34478999999 99999999986311 1 135799999999999985
No 17
>PF01683 EB: EB module; InterPro: IPR006149 The EB domain has no known function. It is found in several Caenorhabditis sp. and Drosophila sp. proteins. The domain contains 8 conserved cysteines that probably form four disulphide bridges and is found associated with kunitz domains IPR002223 from INTERPRO
Probab=57.86 E-value=8.2 Score=25.80 Aligned_cols=22 Identities=32% Similarity=0.999 Sum_probs=19.4
Q ss_pred CCCCCCcCCCCCceEEecCccC
Q 040138 33 NGPCGTLDDCDGQLICINGKCN 54 (216)
Q Consensus 33 ~~~c~~~~dc~g~l~c~~g~c~ 54 (216)
|.+|.....|.+...|++|+|.
T Consensus 19 g~~C~~~~qC~~~s~C~~g~C~ 40 (52)
T PF01683_consen 19 GESCESDEQCIGGSVCVNGRCQ 40 (52)
T ss_pred CCCCCCcCCCCCcCEEcCCEeE
Confidence 6789999999999999999663
No 18
>cd00150 PlantTI Plant trypsin inhibitors such as squash trypsin inhibitor. Plant proteinase inhibitors play important roles in natural plant defense. Proteinase inhibitors from squash seeds form a uniform family of small proteins cross-linked with three disulfide bridges.
Probab=46.63 E-value=14 Score=22.86 Aligned_cols=21 Identities=38% Similarity=0.771 Sum_probs=17.5
Q ss_pred CCCCCcCCCCCceEE-ecCccC
Q 040138 34 GPCGTLDDCDGQLIC-INGKCN 54 (216)
Q Consensus 34 ~~c~~~~dc~g~l~c-~~g~c~ 54 (216)
+.|+.-.||.++-|| -||-|+
T Consensus 6 m~Ck~DsDCl~~CiC~~~G~CG 27 (27)
T cd00150 6 MECKRDSDCLAECICLENGYCG 27 (27)
T ss_pred eeccccccccCCCEEccccccC
Confidence 479999999999999 557774
No 19
>smart00286 PTI Plant trypsin inhibitors.
Probab=43.43 E-value=18 Score=22.75 Aligned_cols=21 Identities=38% Similarity=0.817 Sum_probs=17.6
Q ss_pred CCCCCcCCCCCceEE-ecCccC
Q 040138 34 GPCGTLDDCDGQLIC-INGKCN 54 (216)
Q Consensus 34 ~~c~~~~dc~g~l~c-~~g~c~ 54 (216)
+.|+.-.||.++-|| -||-|+
T Consensus 8 m~Ck~DsDCl~~CiC~~~G~CG 29 (29)
T smart00286 8 MECKRDSDCMAECICLANGYCG 29 (29)
T ss_pred hccccccCcccCCEEccccccC
Confidence 479999999999999 557774
No 20
>TIGR03327 AMP_phos AMP phosphorylase. This enzyme family is found, so far, strictly in the Archaea, and only in those with a type III Rubisco enzyme. Most of the members previously were annotated as thymidine phosphorylase, or DeoA. The AMP metabolized by this enzyme may be produced by ADP-dependent sugar kinases.
Probab=38.03 E-value=1e+02 Score=30.74 Aligned_cols=43 Identities=28% Similarity=0.468 Sum_probs=32.8
Q ss_pred CCCceEEEEeCCCCEEEEEEEeCCCCCCCCCcCCCCCCCCCCCeeecCHHHHHHhcc
Q 040138 146 RCGKMIRITANNGRSVLAQVVDECDSMRGCDEEHAGQPPCDNNIVDGSDAVWSALGL 202 (216)
Q Consensus 146 ~CGk~I~It~~NGkSV~atVVDeC~s~~GCd~~~~~~p~C~~n~lDLS~avF~aLg~ 202 (216)
.=+.+|+|+. ++|+++|+|.=- +.. =.++.+-||..+|++|+.
T Consensus 29 ~~~~rv~v~~-~~~~~~a~~~~~-~~~------------~~~g~~gls~~~~~~l~~ 71 (500)
T TIGR03327 29 HPGDRVRIES-GGKSVVGIVDST-DTL------------VEKGEIGLSHEVLEELGI 71 (500)
T ss_pred CCCCeEEEEe-CCEEEEEEEEcc-ccc------------ccCCeeeccHHHHHHcCC
Confidence 3477899999 999999888432 211 135799999999999995
No 21
>PF02532 PsbI: Photosystem II reaction centre I protein (PSII 4.8 kDa protein); InterPro: IPR003686 Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [, ]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection []. This family represents the low molecular weight transmembrane protein PsbI, which is tightly associated with the D1/D2 heterodimer in PSII. 
The function of PsbI is unknown, but it may be involved in the assembly, dimerisation or stabilisation of PSII dimers [].; GO: 0015979 photosynthesis, 0009523 photosystem II, 0009539 photosystem II reaction center, 0016020 membrane; PDB: 3A0H_i 3ARC_I 3A0B_i 3BZ2_I 3PRQ_I 3KZI_I 3PRR_I 2AXT_i 4FBY_I 1S5L_i ....
Probab=32.43 E-value=36 Score=22.45 Aligned_cols=17 Identities=24% Similarity=0.444 Sum_probs=11.3
Q ss_pred hHHHHHHHHHh-hhhhhh
Q 040138 13 LLTFFCTISLP-LYSNAI 29 (216)
Q Consensus 13 ~~~~~~~~~~~-~~~~~~ 29 (216)
-++|+.++|+| ||+|-.
T Consensus 11 vV~ffv~LFifGflsnDp 28 (36)
T PF02532_consen 11 VVIFFVSLFIFGFLSNDP 28 (36)
T ss_dssp HHHHHHHHHHHHHHTTCT
T ss_pred hHHHHHHHHhccccCCCC
Confidence 35677777777 777643
No 22
>PF00299 Squash: Squash family serine protease inhibitor; InterPro: IPR000737 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. The squash inhibitors form one of a number of serine proteinase inhibitor families. They belong to MEROPS inhibitor family I7, clan IE. They are generally annotated as either trypsin or elastase inhibitors (MEROPS peptidase family S1, IPR001254 from INTERPRO). The proteins, found exclusively in the seeds of the cucurbitaceae, e.g. Citrullus lanatus (watermelon), Cucumis sativus (cucumber), Momordica charantia (balsam pear), are approximately 30 residues in length and contain 6 Cys residues, which form 3 disulphide bonds []. The inhibitors function by being taken up by a serine protease (such as trypsin), which cleaves the peptide bond between Arg/Lys and Ile residues in the N-terminal portion of the protein [, ]. Structural studies have shown that the inhibitor has an ellipsoidal shape, and is largely composed of beta-turns []. The fold and Cys connectivity of the proteins resembles that of potato carboxypeptidase A inhibitor [].; GO: 0004867 serine-type endopeptidase inhibitor activity; PDB: 1MCT_I 2IT8_A 1HA9_A 1IB9_A 2C4B_B 2PO8_A 1F2S_I 1W7Z_H 2LET_A 2ETI_A ....
Probab=32.31 E-value=22 Score=22.42 Aligned_cols=20 Identities=40% Similarity=0.758 Sum_probs=15.7
Q ss_pred CCCCCcCCCCCceEE-ecCcc
Q 040138 34 GPCGTLDDCDGQLIC-INGKC 53 (216)
Q Consensus 34 ~~c~~~~dc~g~l~c-~~g~c 53 (216)
+.|+.-.||..+-+| -+|-|
T Consensus 8 m~Ck~DsDCl~~C~C~~~g~C 28 (29)
T PF00299_consen 8 MECKRDSDCLAGCICLENGYC 28 (29)
T ss_dssp CB-SSGGGSSTTEEEETTSEE
T ss_pred hhcccccCcccCCEEccCccc
Confidence 579999999999999 55566
No 23
>CHL00024 psbI photosystem II protein I
Probab=31.65 E-value=25 Score=23.18 Aligned_cols=16 Identities=25% Similarity=0.526 Sum_probs=11.7
Q ss_pred hHHHHHHHHHh-hhhhh
Q 040138 13 LLTFFCTISLP-LYSNA 28 (216)
Q Consensus 13 ~~~~~~~~~~~-~~~~~ 28 (216)
-++|+.++|+| ||+|-
T Consensus 11 vV~ffvsLFifGFlsnD 27 (36)
T CHL00024 11 VVIFFVSLFIFGFLSND 27 (36)
T ss_pred HHHHHHHHHHccccCCC
Confidence 46778888887 77764
No 24
>PRK04350 thymidine phosphorylase; Provisional
Probab=31.41 E-value=1.6e+02 Score=29.40 Aligned_cols=42 Identities=24% Similarity=0.429 Sum_probs=31.7
Q ss_pred CceEEEEeCCCCEEEEEEEeCCCCCCCCCcCCCCCCCCCCCeeecCHHHHHHhcc
Q 040138 148 GKMIRITANNGRSVLAQVVDECDSMRGCDEEHAGQPPCDNNIVDGSDAVWSALGL 202 (216)
Q Consensus 148 Gk~I~It~~NGkSV~atVVDeC~s~~GCd~~~~~~p~C~~n~lDLS~avF~aLg~ 202 (216)
+.+|+|+. ++++++|++.=.=+.. =.++.+-||..+|++|+.
T Consensus 24 ~~rv~v~~-~~~~~~a~~~~~~~~~------------~~~~~~gl~~~~~~~l~~ 65 (490)
T PRK04350 24 GDRVEVRA-GGRSIIATLNITDDDL------------VGPGEIGLSESAFRRLGV 65 (490)
T ss_pred CCeEEEEc-CCeEEEEEEEeccccc------------cCCCcccccHHHHHHhCC
Confidence 78899999 9999998873221100 135789999999999995
No 25
>TIGR03170 flgA_cterm flagella basal body P-ring formation protein FlgA. This model describes a conserved C-terminal region of the flagellar basal body P-ring formation protein FlgA. This sequence region contains a SAF domain, now described by Pfam model pfam08666.
Probab=31.02 E-value=72 Score=24.55 Aligned_cols=23 Identities=22% Similarity=0.417 Sum_probs=19.8
Q ss_pred CCCceEEEEe-CCCCEEEEEEEeC
Q 040138 146 RCGKMIRITA-NNGRSVLAQVVDE 168 (216)
Q Consensus 146 ~CGk~I~It~-~NGkSV~atVVDe 168 (216)
.=|..|+|++ ..||.+.|+|++.
T Consensus 94 ~~G~~I~V~N~~s~k~i~~~V~~~ 117 (122)
T TIGR03170 94 AVGDQIRVRNLSSGKIISGIVTGP 117 (122)
T ss_pred CCCCEEEEEECCCCCEEEEEEeCC
Confidence 4599999998 5899999999875
No 26
>PRK02655 psbI photosystem II reaction center I protein I; Provisional
Probab=30.60 E-value=26 Score=23.30 Aligned_cols=17 Identities=18% Similarity=0.286 Sum_probs=12.2
Q ss_pred hHHHHHHHHHh-hhhhhh
Q 040138 13 LLTFFCTISLP-LYSNAI 29 (216)
Q Consensus 13 ~~~~~~~~~~~-~~~~~~ 29 (216)
-++|+.++|+| ||+|-.
T Consensus 11 vV~ffvsLFiFGflsnDP 28 (38)
T PRK02655 11 VVFFFVGLFVFGFLSSDP 28 (38)
T ss_pred hHHHHHHHHHcccCCCCC
Confidence 46788888887 777643
No 27
>PF13144 SAF_2: SAF-like
Probab=28.93 E-value=1e+02 Score=25.69 Aligned_cols=24 Identities=25% Similarity=0.441 Sum_probs=20.1
Q ss_pred cCCCceEEEEeC-CCCEEEEEEEeC
Q 040138 145 SRCGKMIRITAN-NGRSVLAQVVDE 168 (216)
Q Consensus 145 ~~CGk~I~It~~-NGkSV~atVVDe 168 (216)
..=|..|+|.+. .||.+.|+|++.
T Consensus 167 G~~G~~I~V~N~~S~k~v~g~V~~~ 191 (196)
T PF13144_consen 167 GALGDTIRVKNLSSGKIVQGRVIGP 191 (196)
T ss_pred CCCCCEEEEEECCCCCEEEEEEecC
Confidence 345999999994 599999999875
No 28
>PF01666 DX: DX module; InterPro: IPR002593 This domain has no known function. It is found in several Caenorhabditis elegans proteins. The domain contains 6 conserved cysteines that probably form three disulphide bridges.
Probab=26.43 E-value=47 Score=24.80 Aligned_cols=24 Identities=29% Similarity=0.887 Sum_probs=19.5
Q ss_pred CCCCCcCCCCCceEEec-----CccCCCC
Q 040138 34 GPCGTLDDCDGQLICIN-----GKCNDDP 57 (216)
Q Consensus 34 ~~c~~~~dc~g~l~c~~-----g~c~ddp 57 (216)
.-|.+..||...++|.. +.|=.||
T Consensus 48 ~~C~~N~DC~~~~VCV~~~~~~~~C~~nP 76 (76)
T PF01666_consen 48 SYCTSNRDCGSGSVCVRENSARGRCYYNP 76 (76)
T ss_pred cccccCcccCCCcEEEEEECCccEEeCCC
Confidence 67999999999999955 4666665
No 29
>smart00051 DSL delta serrate ligand.
Probab=24.18 E-value=38 Score=24.23 Aligned_cols=29 Identities=34% Similarity=0.636 Sum_probs=20.3
Q ss_pred hhhCCCCCCCcCCCCCceEEecCccCCCCCCceeecc
Q 040138 29 ISQCNGPCGTLDDCDGQLICINGKCNDDPDVGTHICK 65 (216)
Q Consensus 29 ~~~c~~~c~~~~dc~g~l~c~~g~c~ddp~~~t~ic~ 65 (216)
-..|.+-|+..+|+.+...| |++ |..+|.
T Consensus 27 G~~C~~~C~~~~d~~~~~~C-------d~~-G~~~C~ 55 (63)
T smart00051 27 GEGCNKFCRPRDDFFGHYTC-------DEN-GNKGCL 55 (63)
T ss_pred CCccCCEeCcCccccCCccC-------CcC-CCEecC
Confidence 45688888888888887776 342 666663
No 30
>PRK12618 flgA flagellar basal body P-ring biosynthesis protein FlgA; Reviewed
Probab=22.76 E-value=1.2e+02 Score=24.93 Aligned_cols=23 Identities=17% Similarity=0.378 Sum_probs=19.6
Q ss_pred CCCceEEEEe-CCCCEEEEEEEeC
Q 040138 146 RCGKMIRITA-NNGRSVLAQVVDE 168 (216)
Q Consensus 146 ~CGk~I~It~-~NGkSV~atVVDe 168 (216)
.=|..|+|++ +.||.|.++|++.
T Consensus 110 ~~Gd~IrV~N~~S~riV~g~V~~~ 133 (141)
T PRK12618 110 GVGDEIRVMNLSSRTTVSGRIAAD 133 (141)
T ss_pred CCCCEEEEEECCCCCEEEEEEecC
Confidence 3489999998 5799999999875
No 31
>PRK08515 flgA flagellar basal body P-ring biosynthesis protein FlgA; Reviewed
Probab=22.28 E-value=1.1e+02 Score=26.72 Aligned_cols=24 Identities=17% Similarity=0.374 Sum_probs=20.2
Q ss_pred cCCCceEEEEeCCCCEEEEEEEeC
Q 040138 145 SRCGKMIRITANNGRSVLAQVVDE 168 (216)
Q Consensus 145 ~~CGk~I~It~~NGkSV~atVVDe 168 (216)
..=|..|+|++..||.|.|+|++.
T Consensus 193 G~~Gd~IrVrN~Sgkii~g~V~~~ 216 (222)
T PRK08515 193 GNLGDIIQAKNKSNKILKAKVLSK 216 (222)
T ss_pred CCCCCEEEEEeCCCCEEEEEEecC
Confidence 456899999987789999999876
No 32
>KOG4106 consensus Uncharacterized conserved protein [Function unknown]
Probab=22.10 E-value=88 Score=25.65 Aligned_cols=24 Identities=29% Similarity=0.387 Sum_probs=20.0
Q ss_pred CCcCCCceEEEEeCCCCEEEEEEEe
Q 040138 143 GGSRCGKMIRITANNGRSVLAQVVD 167 (216)
Q Consensus 143 ~g~~CGk~I~It~~NGkSV~atVVD 167 (216)
++++|-..+-|+. ||||++|.--|
T Consensus 24 ~sp~lve~vavt~-nGRTIvawHP~ 47 (125)
T KOG4106|consen 24 DSPRLVEKVAVTA-NGRTIVAWHPP 47 (125)
T ss_pred CCcceeeeEEEec-CCcEEEEecCC
Confidence 4678999999999 99999987643
No 33
>PF02015 Glyco_hydro_45: Glycosyl hydrolase family 45; InterPro: IPR000334 O-Glycosyl hydrolases 3.2.1. from EC are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [, ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Glycoside hydrolase family 45 GH45 from CAZY comprises enzymes with only one known activity; endoglucanase (3.2.1.4 from EC). The microbial degradation of cellulose and xylans requires several types of enzymes such as endoglucanases, cellobiohydrolases (3.2.1.91 from EC) (exoglucanases), or xylanases (3.2.1.8 from EC) [, ]. Fungi and bacteria produce a spectrum of cellulolytic enzymes (cellulases) and xylanases which, on the basis of sequence similarities, can be classified into families. One of these families is known as the cellulase family K or as the glycosyl hydrolases family 45 []. The best conserved regions in these enzymes is located in the N-terminal section. It contains an aspartic acid residue which has been shown [] to act as a nucleophile in the catalytic mechanism. This also has several cysteines that are involved in forming disulphide bridges.; GO: 0008810 cellulase activity, 0005975 carbohydrate metabolic process; PDB: 1OA7_A 1OA9_A 1L8F_A 1HD5_A 4ENG_A 3ENG_A 2ENG_A.
Probab=21.48 E-value=1.6e+02 Score=26.06 Aligned_cols=25 Identities=24% Similarity=0.440 Sum_probs=18.8
Q ss_pred cCCCceEEEEeC----CCCEEEEEEEeCC
Q 040138 145 SRCGKMIRITAN----NGRSVLAQVVDEC 169 (216)
Q Consensus 145 ~~CGk~I~It~~----NGkSV~atVVDeC 169 (216)
..|+++++++-. .||+.+|+|++.=
T Consensus 82 ~~Cc~Cy~LtFt~g~l~GKkmiVQ~tNtG 110 (201)
T PF02015_consen 82 SWCCACYELTFTSGPLKGKKMIVQVTNTG 110 (201)
T ss_dssp HHTT-EEEEEE-SSTTTT-EEEEEEEEE-
T ss_pred CcccceEEEEEcCCCcCCCEeEEEecccC
Confidence 689999999872 6899999998874
Done!