Query 016745
Match_columns 383
No_of_seqs 47 out of 49
Neff 3.3
Searched_HMMs 46136
Date Fri Mar 29 02:25:50 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/016745.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/016745hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF05708 DUF830: Orthopoxvirus 98.5 4.9E-07 1.1E-11 77.4 8.4 116 212-353 2-118 (158)
2 PRK10030 hypothetical protein; 98.4 1.8E-06 4E-11 79.5 9.2 120 209-354 18-137 (197)
3 PRK11470 hypothetical protein; 98.0 2.7E-05 5.8E-10 73.0 8.9 142 209-374 6-161 (200)
4 PRK11479 hypothetical protein; 97.2 0.0013 2.8E-08 64.5 8.0 100 205-329 58-160 (274)
5 PF05382 Amidase_5: Bacterioph 79.0 5.5 0.00012 36.1 6.0 65 210-298 74-138 (145)
6 PF05257 CHAP: CHAP domain; I 74.3 9 0.0002 32.0 5.7 43 208-264 59-102 (124)
7 PF01436 NHL: NHL repeat; Int 68.9 7.9 0.00017 25.4 3.3 19 246-265 6-24 (28)
8 TIGR02219 phage_NlpC_fam putat 68.7 6.3 0.00014 34.2 3.7 54 206-282 71-124 (134)
9 COG3863 Uncharacterized distan 53.1 34 0.00073 33.4 5.8 109 206-342 73-188 (231)
10 PF07646 Kelch_2: Kelch motif; 48.0 18 0.00038 25.7 2.4 18 240-260 1-18 (49)
11 PRK15231 fimbrial adhesin prot 44.1 24 0.00051 32.7 3.1 92 146-266 10-106 (150)
12 PF07313 DUF1460: Protein of u 38.1 74 0.0016 30.6 5.6 58 209-282 151-208 (216)
13 PF13418 Kelch_4: Galactose ox 36.9 33 0.00072 24.0 2.4 18 240-259 1-18 (49)
14 smart00739 KOW KOW (Kyprides, 33.0 79 0.0017 19.7 3.5 23 212-250 2-24 (28)
15 cd02983 P5_C P5 family, C-term 31.2 60 0.0013 28.2 3.5 85 292-382 24-125 (130)
16 PF12075 KN_motif: KN motif; 31.0 19 0.00041 26.7 0.4 6 321-326 6-11 (39)
17 PF01344 Kelch_1: Kelch motif; 29.8 53 0.0012 22.5 2.4 30 240-272 1-30 (47)
18 PF04583 Baculo_p74: Baculovir 29.2 62 0.0013 32.2 3.6 38 321-369 125-173 (249)
19 COG5008 PilU Tfp pilus assembl 29.0 30 0.00065 35.7 1.4 109 134-250 110-232 (375)
20 PF13964 Kelch_6: Kelch motif 28.3 55 0.0012 23.1 2.4 21 241-264 2-22 (50)
21 TIGR03047 PS_II_psb28 photosys 28.1 70 0.0015 28.4 3.4 41 228-271 33-77 (109)
22 PF07494 Reg_prop: Two compone 26.5 58 0.0013 20.8 2.0 13 248-260 10-22 (24)
23 cd03474 Rieske_T4moC Toluene-4 25.7 1.2E+02 0.0025 24.8 4.1 36 208-262 6-41 (108)
24 PF02362 B3: B3 DNA binding do 25.4 1.3E+02 0.0028 23.6 4.3 48 206-262 4-51 (100)
25 cd03531 Rieske_RO_Alpha_KSH Th 25.3 93 0.002 26.2 3.6 101 208-343 7-115 (115)
26 PF00877 NLPC_P60: NlpC/P60 fa 23.9 46 0.001 26.9 1.4 28 206-249 46-73 (105)
27 cd03477 Rieske_YhfW_C YhfW fam 22.0 1.6E+02 0.0034 24.1 4.2 53 209-280 5-63 (91)
28 PF04970 LRAT: Lecithin retino 21.9 3.9E+02 0.0085 22.5 6.7 92 209-326 4-106 (125)
29 PF09652 Cas_VVA1548: Putative 20.2 65 0.0014 27.8 1.7 32 183-219 8-39 (93)
No 1
>PF05708 DUF830: Orthopoxvirus protein of unknown function (DUF830); PDB: 2IF6_B 3KW0_C.
Probab=98.50 E-value=4.9e-07 Score=77.37 Aligned_cols=116 Identities=21% Similarity=0.306 Sum_probs=79.3
Q ss_pred CCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEec-CCCCcccccceeecchhHHHHHHhccCCC
Q 016745 212 VHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGES-GHENEKGEEIIVVIPWDEWWELALKDDSN 290 (383)
Q Consensus 212 IhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES-~~~~~~~~~~Iq~~pweeW~~~~~kd~a~ 290 (383)
.++||.|...- . +.+...|.=+|++..||.+|.+.+.+++.+|+|+ -.. +++..++++|.. + +
T Consensus 2 l~~GDIil~~~---~-~~~s~~i~~~t~~~~~HvgI~~~~~~~~~~viea~~~~------Gv~~~~l~~~~~----~--~ 65 (158)
T PF05708_consen 2 LQTGDIILTRG---K-SSLSKAIRPVTSSPYSHVGIVIGDEGQEPYVIEATPGD------GVRLEPLSDFLK----R--N 65 (158)
T ss_dssp --TT-EEEEEE-----SCCHHHHHHHHTSS--EEEEEEEETTE-EEEEEEETTT------CEEEEECHHHHH----C--C
T ss_pred CCCeeEEEEEC---C-chHHHHHHHHhCCCCCEEEEEEecCCCceEEEEeccCC------CeEEeeHHHHhc----C--C
Confidence 58999998873 3 7789999999999999999999987788999999 333 699999999964 3 4
Q ss_pred CcEEEeeCChHHHhhcchHHHHHHHHhhcCCcceeeeeeEEEEecCCCCCCCCCChhHHHHHH
Q 016745 291 PQIALLPLHPDVRAKFNSTAAWEYARSMSGKPYGYHNMIFSWIDTMADNYPPPLDAHLVVSVM 353 (383)
Q Consensus 291 ~~ValLPL~~e~RakFN~TAAwef~~~~eG~PYGYhN~iFsWIDT~~dNyPppLd~~~v~~v~ 353 (383)
-+++++.+++. +..=...+|.+++++..|+||++--.+. ++. -.=++||+-++
T Consensus 66 ~~~~V~r~~~~-~~~~~~~~~~~~a~~~~g~~Y~~~~~~~------~~~---~yCSelV~~~y 118 (158)
T PF05708_consen 66 EKIAVYRLKDP-LSEEQRQKAAEFAKSYIGKPYDFNFSLD------DDR---FYCSELVAEAY 118 (158)
T ss_dssp CEEEEEEECCG-TTCHHHHHHHHHHHCCTTS-B-CC-HCC------SSS---B-HHHHHHHHH
T ss_pred ceEEEEEECCC-CCHHHHHHHHHHHHHHcCCCccccccCC------CCC---EEcHHHHHHHH
Confidence 46888888877 3233345678899999999999863333 221 22247776666
No 2
>PRK10030 hypothetical protein; Provisional
Probab=98.36 E-value=1.8e-06 Score=79.52 Aligned_cols=120 Identities=20% Similarity=0.274 Sum_probs=88.5
Q ss_pred CCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhHHHHHHhccC
Q 016745 209 PEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDEWWELALKDD 288 (383)
Q Consensus 209 ~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pweeW~~~~~kd~ 288 (383)
..++++||.|-.+- + +.....|+.+|+|.-.|.+|..+. +|+.+|+|+- ..|+.+|+++|.+ +.
T Consensus 18 ~~~l~~GDlif~~g---~-~~~s~aI~~~T~s~~SHVGIi~~~-~~~~~ViEAv-------~~V~~~pL~~Fl~----~~ 81 (197)
T PRK10030 18 AWQPQTGDIIFQIS---R-SSQSKAIQLATHSDYSHTGMIVKR-NKKPYVFEAV-------GPVKYTPLKQWIA----HG 81 (197)
T ss_pred hcCCCCCCEEEEeC---C-CcHhHHHhHhhCCCCceEEEEEEE-CCcEEEEEec-------CceEEEEHHHHhh----cC
Confidence 34899999998873 2 456889999999999999999985 7999999994 2499999999964 44
Q ss_pred CCCcEEEeeCChHHHhhcchHHHHHHHHhhcCCcceeeeeeEEEEecCCCCCCCCCChhHHHHHHH
Q 016745 289 SNPQIALLPLHPDVRAKFNSTAAWEYARSMSGKPYGYHNMIFSWIDTMADNYPPPLDAHLVVSVMS 354 (383)
Q Consensus 289 a~~~ValLPL~~e~RakFN~TAAwef~~~~eG~PYGYhN~iFsWIDT~~dNyPppLd~~~v~~v~s 354 (383)
.+-++++..++..+.... -.++.+++++..|+||-+. |.| | ++ .-.=++||.-++.
T Consensus 82 ~~~~~~V~Rl~~~lt~~~-~~~li~~A~~~lGkpYD~~---f~~-~--d~---~~YCSELV~~ay~ 137 (197)
T PRK10030 82 EKGKYVVRRLENGLSVEQ-QQKLAQTAKRYLGKPYDFY---FSW-S--DD---RIYCSELVWKVYQ 137 (197)
T ss_pred ccCcEEEEEeCCCCCHHH-HHHHHHHHHHHcCCCCCcc---ccc-C--CC---cEEeHHHHHHHHH
Confidence 456788887765332221 3447889999999999864 655 1 12 2334588887763
No 3
>PRK11470 hypothetical protein; Provisional
Probab=98.01 E-value=2.7e-05 Score=72.98 Aligned_cols=142 Identities=14% Similarity=0.171 Sum_probs=98.5
Q ss_pred CCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhHHHHHHhccC
Q 016745 209 PEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDEWWELALKDD 288 (383)
Q Consensus 209 ~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pweeW~~~~~kd~ 288 (383)
+.++|+||+|-++--. ..+ -.|+-+|||...|..|..+-..++-+|+||--.. ++.+|.++|++ ..
T Consensus 6 ~~~l~~GDLvF~~~~~---~~~-~aI~~aT~s~~sHvGII~~~~~~~~~VlEA~~~~------vr~TpLs~fi~----r~ 71 (200)
T PRK11470 6 PAEYEIGDIVFTCIGA---ALF-GQISAASNCWSNHVGIIIGHNGEDFLVAESRVPL------STVTTLSRFIK----RS 71 (200)
T ss_pred cCCCCCCCEEEEeCCc---chh-HHHHhccCCccceEEEEEEEcCCceEEEEecCCc------eEEeEHHHHHh----cC
Confidence 4689999999998422 223 3488899999999999885435689999994222 89999999974 45
Q ss_pred CCCcEEEeeCChHHHhhcchHHHHHHHHhhcCCcceeeeeeEEEEecCCCCCCCCCChhHHHHHH--------------H
Q 016745 289 SNPQIALLPLHPDVRAKFNSTAAWEYARSMSGKPYGYHNMIFSWIDTMADNYPPPLDAHLVVSVM--------------S 354 (383)
Q Consensus 289 a~~~ValLPL~~e~RakFN~TAAwef~~~~eG~PYGYhN~iFsWIDT~~dNyPppLd~~~v~~v~--------------s 354 (383)
.+-++++-.|+..+++.= ..+|.++++++-|+||.+. |.| | ++.|=| ++||.-++ .
T Consensus 72 ~~g~i~v~Rl~~~l~~~~-~~~~~~~A~~~lGkpYD~~---F~~-~--d~~~YC---SElV~~~y~~a~~i~vg~~~~~~ 141 (200)
T PRK11470 72 ANQRYAIKRLDAGLTEQQ-KQRIVEQVPSRLRKLYHTG---FKY-E--SSRQFC---SKFVFDIYKEALCIPVGEIETFG 141 (200)
T ss_pred cCceEEEEEecCCCCHHH-HHHHHHHHHHHcCCCCCCc---cCC-C--CCceeh---HHHHHHHHHHhhCCcccccccch
Confidence 567899998875554411 3458899999999999985 665 2 344433 46665333 2
Q ss_pred HhhhcchhHHHHHHHHHHhh
Q 016745 355 MWTRVQPAYAANMWNEALNK 374 (383)
Q Consensus 355 ~~~~~~P~~a~~m~neALNK 374 (383)
++-+=+|.....+|.+--.+
T Consensus 142 ~~~~~~p~~~~~~w~~~y~~ 161 (200)
T PRK11470 142 ELLNSNPDAKLTFWKFWFLG 161 (200)
T ss_pred hhccCCccchhHHHHHHhcC
Confidence 33233677777777754433
No 4
>PRK11479 hypothetical protein; Provisional
Probab=97.18 E-value=0.0013 Score=64.49 Aligned_cols=100 Identities=20% Similarity=0.297 Sum_probs=77.2
Q ss_pred ccCCCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhHHHHHH
Q 016745 205 ATINPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDEWWELA 284 (383)
Q Consensus 205 ~~i~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pweeW~~~~ 284 (383)
..|+.+++|+||.|..+.- +.+.-.|++.|+|...|.+|.+- || .++|+-. .+|+..|.++|..
T Consensus 58 ~~Vs~~~LqpGDLVFfst~----t~~S~~Ik~~T~s~~SHVgIylG--dg--~vIEA~g------~GVri~pL~~~~~-- 121 (274)
T PRK11479 58 KEITAPDLKPGDLLFSSSL----GVTSFGIRVFSTSSVSHVAIYLG--EN--NVAEATG------AGVQIVSLKKAIK-- 121 (274)
T ss_pred cccChhhCCCCCEEEEecC----CccccceecccCCCCcEEEEEec--CC--eEEEcCC------CCEEEEechhhhc--
Confidence 3688899999999998631 44677899999999999999985 44 3799832 3599999999963
Q ss_pred hccCCCCcEEEe---eCChHHHhhcchHHHHHHHHhhcCCcceeeeee
Q 016745 285 LKDDSNPQIALL---PLHPDVRAKFNSTAAWEYARSMSGKPYGYHNMI 329 (383)
Q Consensus 285 ~kd~a~~~ValL---PL~~e~RakFN~TAAwef~~~~eG~PYGYhN~i 329 (383)
.+ -.|+.+ .+.+|.+++ +.+|+++..|.||=|-+.+
T Consensus 122 -~~---~~I~a~Rv~~lt~e~~~k-----l~~fa~~~lGy~YN~~gI~ 160 (274)
T PRK11479 122 -HS---DKLFALRVPDLTPQQATK-----ITAFANKIKDSGYNYRGIV 160 (274)
T ss_pred -cc---ceEEEEeCCCCCHHHHHH-----HHHHHHHhcCCCCCHHHHH
Confidence 22 236666 566666654 8899999999999987764
No 5
>PF05382 Amidase_5: Bacteriophage peptidoglycan hydrolase; InterPro: IPR008044 This entry is represented by Bacteriophage SFi21, lysin (Cell wall hydrolase; 3.5.1.28 from EC). At least one of the proteins in this entry, the Pal protein from the pneumococcal bacteriophage Dp-1 (O03979 from SWISSPROT), has been shown to be an N-acetylmuramoyl-L-alanine amidase []. According to the known modular structure of this and other peptidoglycan hydrolases from the pneumococcal system, the active site should reside within this domain while a C-terminal domain binds to the choline residues of the cell wall teichoic acids [, ].
Probab=79.05 E-value=5.5 Score=36.12 Aligned_cols=65 Identities=20% Similarity=0.431 Sum_probs=42.9
Q ss_pred CCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhHHHHHHhccCC
Q 016745 210 EDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDEWWELALKDDS 289 (383)
Q Consensus 210 ~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pweeW~~~~~kd~a 289 (383)
.++|.||.+..++- |. ++...|||.|++. . +. +|.. + ++.++|.+++++..| ..+.
T Consensus 74 ~~~q~GDI~I~g~~-g~-----------S~G~~GHtgif~~-~-~~--iIhc---~-y~~~g~~~~~~~~~~----~~~~ 129 (145)
T PF05382_consen 74 WNLQRGDIFIWGRR-GN-----------SAGAGGHTGIFMD-N-DT--IIHC---N-YGANGIAINNYDWYW----YYNG 129 (145)
T ss_pred ccccCCCEEEEcCC-CC-----------CCCCCCeEEEEeC-C-Cc--EEEe---c-CCCCCeEecCCCeee----ecCC
Confidence 47999999997642 22 5556899999984 2 22 3332 2 278889999988775 4455
Q ss_pred CCcEEEeeC
Q 016745 290 NPQIALLPL 298 (383)
Q Consensus 290 ~~~ValLPL 298 (383)
.+-+-+-+|
T Consensus 130 ~~~~~~yr~ 138 (145)
T PF05382_consen 130 RPPVYVYRL 138 (145)
T ss_pred CCcEEEEEe
Confidence 555555444
No 6
>PF05257 CHAP: CHAP domain; InterPro: IPR007921 The CHAP (cysteine, histidine-dependent amidohydrolases/peptidases) domain is a region between 110 and 140 amino acids that is found in proteins from bacteria, bacteriophages, archaea and eukaryotes of the Trypanosomidae family. Many of these proteins are uncharacterised, but it has been proposed that they may function mainly in peptidoglycan hydrolysis. The CHAP domain is found in a wide range of protein architectures; it is commonly associated with bacterial type SH3 domains and with several families of amidase domains. It has been suggested that CHAP domain containing proteins utilise a catalytic cysteine residue in a nucleophilic-attack mechanism [, ]. The CHAP domain contains two invariant residues, a cysteine and a histidine. These residues form part of the putative active site of CHAP domain containing proteins. Secondary structure predictions show that the CHAP domain belongs to the alpha + beta structural class, with the N-terminal half largely containing predicted alpha helices and the C-terminal half principally composed of predicted beta strands [, ]. Some proteins known to contain a CHAP domain are listed below: Bacterial and trypanosomal glutathionylspermidine amidases. A variety of bacterial autolysins. A Nocardia aerocolonigenes putative esterase. Streptococcus pneumoniae choline-binding protein D. Methanosarcina mazei protein MM2478, a putative chloride channel. Several phage-encoded peptidoglycan hydrolases. Cysteine peptidases belonging to MEROPS peptidase family C51 (D-alanyl-glycyl endopeptidase, clan CA). ; PDB: 2LRJ_A 2VPM_B 2VOB_B 2VPS_A 2K3A_A 2IO9_A 2IO8_A 2IOB_A 2IOA_B 2IO7_B ....
Probab=74.28 E-value=9 Score=31.99 Aligned_cols=43 Identities=21% Similarity=0.201 Sum_probs=32.1
Q ss_pred CCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeec-CCCcEEEEecCCC
Q 016745 208 NPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKD-KEGNLWVGESGHE 264 (383)
Q Consensus 208 ~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd-~dGeL~v~ES~~~ 264 (383)
.....++||.+... ..++...||+++..-. .+|.+.++|....
T Consensus 59 ~~~~P~~Gdivv~~--------------~~~~~~~GHVaIV~~v~~~~~i~v~e~N~~ 102 (124)
T PF05257_consen 59 TGSTPQPGDIVVWD--------------SGSGGGYGHVAIVESVNDGGTITVIEQNWG 102 (124)
T ss_dssp ECS---TTEEEEEE--------------ECTTTTT-EEEEEEEE-TTSEEEEEECSST
T ss_pred cCcccccceEEEec--------------cCCCCCCCeEEEEEEECCCCEEEEEECCcC
Confidence 45677899998875 3567889999999988 7799999999864
No 7
>PF01436 NHL: NHL repeat; InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C-terminal alpha-amidation of biological peptides []. In many it occurs in tandem arrays, for example in the ringfinger beta-box, coiled-coil (RBCC) eukaryotic growth regulators []. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller [, ]. The NHL repeats are also found in serine/threonine protein kinases (STPKs) in a diverse range of pathogenic bacteria. These STPKs are transmembrane receptors with an intracellular N-terminal kinase domain and an extracellular C-terminal sensor domain. In the STPK PknD from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; GO: 0005515 protein binding; PDB: 3FVZ_A 3FW0_A 1RWL_A 1RWI_A 1Q7F_A.
Probab=68.94 E-value=7.9 Score=25.39 Aligned_cols=19 Identities=37% Similarity=0.847 Sum_probs=14.5
Q ss_pred eEEeecCCCcEEEEecCCCC
Q 016745 246 AVCLKDKEGNLWVGESGHEN 265 (383)
Q Consensus 246 av~Lrd~dGeL~v~ES~~~~ 265 (383)
-||+ +++|++||+||+..-
T Consensus 6 gvav-~~~g~i~VaD~~n~r 24 (28)
T PF01436_consen 6 GVAV-DSDGNIYVADSGNHR 24 (28)
T ss_dssp EEEE-ETTSEEEEEECCCTE
T ss_pred EEEE-eCCCCEEEEECCCCE
Confidence 3566 479999999987654
No 8
>TIGR02219 phage_NlpC_fam putative phage cell wall peptidase, NlpC/P60 family. Members of this family show sequence similarity to members of the NlpC/P60 family described by Pfam model pfam00877 and by Anantharaman and Aravind (PubMed:12620121). The NlpC/P60 family includes a number of characterized bacterial cell wall hydrolases. Members of this related family are all found in prophage regions of bacterial genomes.
Probab=68.74 E-value=6.3 Score=34.20 Aligned_cols=54 Identities=20% Similarity=0.342 Sum_probs=33.2
Q ss_pred cCCCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhHHHH
Q 016745 206 TINPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDEWWE 282 (383)
Q Consensus 206 ~i~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pweeW~~ 282 (383)
.++.+++|+||+|... . ..|..++|..+.+ + +|++ +-+-.+ . .+.+...+.||.
T Consensus 71 ~v~~~~~qpGDlvff~-~-------------~~~~~~~HvGIy~-G-~g~~--iHa~~~----~-~v~~~~~~~yw~ 124 (134)
T TIGR02219 71 PVPCDAAQPGDVLVFR-W-------------RPGAAAKHAAIAA-S-PTRF--IHAYDG----A-AVVESALVPWWR 124 (134)
T ss_pred ccchhcCCCCCEEEEe-e-------------CCCCCCcEEEEEe-C-CCcE--EEECCC----C-CEEEeCCcHHHH
Confidence 4566889999999664 2 2355688999887 3 6664 333221 1 233445567764
No 9
>COG3863 Uncharacterized distant relative of cell wall-associated hydrolases [Function unknown]
Probab=53.05 E-value=34 Score=33.42 Aligned_cols=109 Identities=22% Similarity=0.377 Sum_probs=65.9
Q ss_pred cCCCCCCCCCCEEEEEeecccCCchhhHHHhhc------CcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhH
Q 016745 206 TINPEDVHSGDFLAVSKIRGRWGGFETLEKWVT------GAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDE 279 (383)
Q Consensus 206 ~i~~~dIhsGDfL~iski~gr~dG~d~li~W~t------Gs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pwee 279 (383)
+.|..-.+|||-++= ++--| +|. -|.| +.|-||..|-. + -|+ .+||..+. +++.++.-
T Consensus 73 ~~dr~v~~~gd~~~g-dyPTr-~g~----i~~t~~~~~~~~H~gHagmy~-~-a~~--~VEs~psG------Vr~v~~n~ 136 (231)
T COG3863 73 NLDRSVLQPGDILLG-DYPTR-GGA----IWLTDTFGNIVGHWGHAGMYI-G-AGQ--MVESWPSG------VRVVSVNM 136 (231)
T ss_pred hhhhhhcCCcchhhc-cCCCC-cce----EEEEcccccccccccceEEEE-c-CCc--EEeeccCc------eEEecchh
Confidence 555556677776543 22112 121 1322 55778888654 3 344 58987764 88888877
Q ss_pred HHHHHhccCCCCcEEEeeCChHHHhhcchHHHHHHHHhhcCCcceeeeeeEEEEec-CCCCCCC
Q 016745 280 WWELALKDDSNPQIALLPLHPDVRAKFNSTAAWEYARSMSGKPYGYHNMIFSWIDT-MADNYPP 342 (383)
Q Consensus 280 W~~~~~kd~a~~~ValLPL~~e~RakFN~TAAwef~~~~eG~PYGYhN~iFsWIDT-~~dNyPp 342 (383)
|. .+||+= |-..--+.|.. ++|-.|+.+-.|+||-|. .|.-|.| -++-|=|
T Consensus 137 ~~---~~dn~i--V~~vsts~~qk-----~~AadWa~~kVG~PY~~n--f~n~~nt~~dk~y~C 188 (231)
T COG3863 137 AR---NADNVI--VYRVSTSNDQK-----SKAADWALTKVGLPYDYN--FLNYVNTKYDKSYYC 188 (231)
T ss_pred hh---cccceE--EEEEecchhhh-----HHHHHHHHhccCCcccce--eeeecccccCcceeH
Confidence 75 355553 44444455543 688899999999999985 3455666 4444433
No 10
>PF07646 Kelch_2: Kelch motif; InterPro: IPR011498 Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified []. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP [] and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin [, ], and in galactose oxidase from the fungus Dactylium dendroides [, ]. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold []. The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila []. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase []. This entry represents a type of kelch sequence motif that comprises one beta-sheet blade.; GO: 0005515 protein binding
Probab=48.03 E-value=18 Score=25.75 Aligned_cols=18 Identities=39% Similarity=0.615 Sum_probs=14.4
Q ss_pred cCccceeEEeecCCCcEEEEe
Q 016745 240 AFAGHTAVCLKDKEGNLWVGE 260 (383)
Q Consensus 240 s~aGHtav~Lrd~dGeL~v~E 260 (383)
.+.||+++++ ++||||+=
T Consensus 1 ~r~~hs~~~~---~~kiyv~G 18 (49)
T PF07646_consen 1 PRYGHSAVVL---DGKIYVFG 18 (49)
T ss_pred CccceEEEEE---CCEEEEEC
Confidence 3679999866 79999973
No 11
>PRK15231 fimbrial adhesin protein SefD; Provisional
Probab=44.12 E-value=24 Score=32.74 Aligned_cols=92 Identities=14% Similarity=0.235 Sum_probs=66.4
Q ss_pred HHhcceEEEEecccchhhhhhhhhccccccCcchhhhccHHHHHhhcCCceeeccCCccccCCCCCCCCCCEEEEEeecc
Q 016745 146 VKQHGVSVFLMPSGMMGTLLSLIDILPLFSNSHWGQNANLAFLEKHMGATFEKRPQPWHATINPEDVHSGDFLAVSKIRG 225 (383)
Q Consensus 146 ik~~Gv~vFlm~~gm~gtl~sl~~~~plF~nt~wge~~Nl~FL~~~mG~~fe~R~~~~v~~i~~~dIhsGDfL~iski~g 225 (383)
|-+.=++||++-+|...++...-++ .|.++ +.|| .=-+.+++|+.|+-.||--
T Consensus 10 ~~~~~~~~~~~~~~~~Ss~sqA~el-~L~~~-------------~~~~-------------~~~~~l~dg~~laTGri~c 62 (150)
T PRK15231 10 IPKFIVSVFLIVTGFFSSTIKAQEL-KLMIK-------------INEA-------------VFYDRITSNKIIGTGHLFN 62 (150)
T ss_pred cccceeeEeeEeehhhhhhhhceee-EEEee-------------cccc-------------chhhhccCCcEEeeeeEEe
Confidence 4455689999999877666554443 22110 1111 0126789999999999977
Q ss_pred cCCchhhHHHhhc----CcCccceeEE-eecCCCcEEEEecCCCCc
Q 016745 226 RWGGFETLEKWVT----GAFAGHTAVC-LKDKEGNLWVGESGHENE 266 (383)
Q Consensus 226 r~dG~d~li~W~t----Gs~aGHtav~-Lrd~dGeL~v~ES~~~~~ 266 (383)
| +||- +.||.. |..+||-.|- .+|+.-||+|-=.|.+|-
T Consensus 63 r-egfh-iwmns~~~q~gg~P~~YIvqGk~dsqh~LrVRlgGeGWq 106 (150)
T PRK15231 63 R-EGKK-ILISSSLEKIKNTPGAYIIRGQNNSAHKLRIRIGGEDWQ 106 (150)
T ss_pred c-CCeE-EEEecchhhcCCCccEEEEECCCCCcceEEEEecCCCcc
Confidence 7 7998 899988 8899999887 778888999987777763
No 12
>PF07313 DUF1460: Protein of unknown function (DUF1460); InterPro: IPR010846 This family consists of several hypothetical bacterial proteins of around 260 residues in length. The function of this family is unknown.; PDB: 2P1G_B 2IM9_A.
Probab=38.13 E-value=74 Score=30.65 Aligned_cols=58 Identities=17% Similarity=0.401 Sum_probs=39.9
Q ss_pred CCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhHHHH
Q 016745 209 PEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDEWWE 282 (383)
Q Consensus 209 ~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pweeW~~ 282 (383)
.+.||+||.++|..=. +||| ..|+-++.|..|| |+....-... ++..|.-.|..||.+
T Consensus 151 ~~~i~~GDiI~i~t~~---~GLD----------vsH~Giav~~~~~-l~l~hASs~~--~~~~vvd~pl~~Yl~ 208 (216)
T PF07313_consen 151 LSQIKNGDIIAIVTNI---KGLD----------VSHVGIAVWKNDG-LHLRHASSLH--KKVVVVDEPLSEYLK 208 (216)
T ss_dssp HTTS-TT-EEEEEEEC---TTEC----------EEEEEEEEEETTE-EEEEEEETTT--TEEEEECCEHHHHHH
T ss_pred HhcCCCCCEEEEEeCC---CCCc----------eeeEEEEEEECCe-EEEEeCCCCC--CCcEEeccCHHHHHh
Confidence 5889999999998632 4555 6799999998555 8887543322 223577789999975
No 13
>PF13418 Kelch_4: Galactose oxidase, central domain; PDB: 2UVK_B.
Probab=36.94 E-value=33 Score=23.96 Aligned_cols=18 Identities=28% Similarity=0.477 Sum_probs=11.4
Q ss_pred cCccceeEEeecCCCcEEEE
Q 016745 240 AFAGHTAVCLKDKEGNLWVG 259 (383)
Q Consensus 240 s~aGHtav~Lrd~dGeL~v~ 259 (383)
++.||+++.+ .+++|||.
T Consensus 1 pR~~h~~~~~--~~~~i~v~ 18 (49)
T PF13418_consen 1 PRYGHSAVSI--GDNSIYVF 18 (49)
T ss_dssp --BS-EEEEE---TTEEEEE
T ss_pred CcceEEEEEE--eCCeEEEE
Confidence 4789999776 35889885
No 14
>smart00739 KOW KOW (Kyprides, Ouzounis, Woese) motif. Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
Probab=33.04 E-value=79 Score=19.66 Aligned_cols=23 Identities=30% Similarity=0.533 Sum_probs=17.0
Q ss_pred CCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEee
Q 016745 212 VHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLK 250 (383)
Q Consensus 212 IhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lr 250 (383)
++.||.+.|. .|.++|+.+..+.
T Consensus 2 ~~~G~~V~I~----------------~G~~~g~~g~i~~ 24 (28)
T smart00739 2 FEVGDTVRVI----------------AGPFKGKVGKVLE 24 (28)
T ss_pred CCCCCEEEEe----------------ECCCCCcEEEEEE
Confidence 5789998888 4777888775553
No 15
>cd02983 P5_C P5 family, C-terminal redox inactive TRX-like domain; P5 is a protein disulfide isomerase (PDI)-related protein with a domain structure of aa'b (where a and a' are redox active TRX domains and b is a redox inactive TRX-like domain). Like PDI, P5 is located in the endoplasmic reticulum (ER) and displays both isomerase and chaperone activities, which are independent of each other. Compared to PDI, the isomerase and chaperone activities of P5 are lower. The first cysteine in the CXXC motif of both redox active domains in P5 is necessary for isomerase activity. The P5 gene was first isolated as an amplified gene from a hydroxyurea-resistant hamster cell line. The zebrafish P5 homolog has been implicated to play a critical role in establishing left/right asymmetries in the embryonic midline. The C-terminal domain is likely involved in substrate binding, similar to the b and b' domains of PDI.
Probab=31.21 E-value=60 Score=28.17 Aligned_cols=85 Identities=19% Similarity=0.181 Sum_probs=48.5
Q ss_pred cEEEeeC----ChHHHhhcchHHHHHHHHhhcCCcceeeeeeEEEEecCC------------CCCCCCCChhHHHHHHH-
Q 016745 292 QIALLPL----HPDVRAKFNSTAAWEYARSMSGKPYGYHNMIFSWIDTMA------------DNYPPPLDAHLVVSVMS- 354 (383)
Q Consensus 292 ~ValLPL----~~e~RakFN~TAAwef~~~~eG~PYGYhN~iFsWIDT~~------------dNyPppLd~~~v~~v~s- 354 (383)
=|++||= ++|-|++.-+ .=.+-|+++-|+| +.|.|+|.-. ++||...=-+.-..-..
T Consensus 24 ~i~~l~~~~d~~~e~~~~~~~-~l~~vAk~~kgk~-----i~Fv~vd~~~~~~~~~~fgl~~~~~P~v~i~~~~~~KY~~ 97 (130)
T cd02983 24 IIAFLPHILDCQASCRNKYLE-ILKSVAEKFKKKP-----WGWLWTEAGAQLDLEEALNIGGFGYPAMVAINFRKMKFAT 97 (130)
T ss_pred EEEEcCccccCCHHHHHHHHH-HHHHHHHHhcCCc-----EEEEEEeCcccHHHHHHcCCCccCCCEEEEEecccCcccc
Confidence 3777774 2333332211 1224667788988 6799999866 35664210011000111
Q ss_pred HhhhcchhHHHHHHHHHHhhhhCcCCCC
Q 016745 355 MWTRVQPAYAANMWNEALNKRLGTEVLC 382 (383)
Q Consensus 355 ~~~~~~P~~a~~m~neALNKRLgT~gL~ 382 (383)
+-..+..+-...+.++.++-++++..++
T Consensus 98 ~~~~~t~e~i~~Fv~~~l~Gkl~~~~~~ 125 (130)
T cd02983 98 LKGSFSEDGINEFLRELSYGRGPTLPVN 125 (130)
T ss_pred ccCccCHHHHHHHHHHHHcCCcccccCC
Confidence 2344666777889999999999877654
No 16
>PF12075 KN_motif: KN motif; InterPro: IPR021939 This small motif is found at the N terminus of Kank proteins and has been called the KN (for Kank N-terminal) motif. This protein is found in eukaryotes. Proteins in this family are typically between 413 and 1202 amino acids in length. This protein is found associated with PF00023 from PFAM. This protein has two conserved sequence motifs: TPYG and LDLDF. Kank1 was obtained by positional cloning of a tumor suppressor gene in renal cell carcinoma, while the other members were found by homology search. The family is involved in the regulation of actin polymerisation and cell motility through signaling pathways containing PI3K/Akt and/or unidentified modulators/effectors [].
Probab=31.02 E-value=19 Score=26.71 Aligned_cols=6 Identities=83% Similarity=1.990 Sum_probs=5.2
Q ss_pred Ccceee
Q 016745 321 KPYGYH 326 (383)
Q Consensus 321 ~PYGYh 326 (383)
.|||||
T Consensus 6 tPYGyh 11 (39)
T PF12075_consen 6 TPYGYH 11 (39)
T ss_pred CCccee
Confidence 399999
No 17
>PF01344 Kelch_1: Kelch motif; InterPro: IPR006652 Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified []. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP [] and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin [, ], and in galactose oxidase from the fungus Dactylium dendroides [, ]. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold []. The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila []. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase []. This entry represents a type of kelch sequence motif that comprises one beta-sheet blade.; GO: 0005515 protein binding; PDB: 2XN4_A 2WOZ_A 3II7_A 4ASC_A 1U6D_X 1ZGK_A 2FLU_X 2VPJ_A 2DYH_A 1X2R_A ....
Probab=29.80 E-value=53 Score=22.46 Aligned_cols=30 Identities=20% Similarity=0.303 Sum_probs=19.6
Q ss_pred cCccceeEEeecCCCcEEEEecCCCCcccccce
Q 016745 240 AFAGHTAVCLKDKEGNLWVGESGHENEKGEEII 272 (383)
Q Consensus 240 s~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~I 272 (383)
++++|+++.+ ++++||+=-.++.....+.+
T Consensus 1 pR~~~~~~~~---~~~iyv~GG~~~~~~~~~~v 30 (47)
T PF01344_consen 1 PRSGHAAVVV---GNKIYVIGGYDGNNQPTNSV 30 (47)
T ss_dssp -BBSEEEEEE---TTEEEEEEEBESTSSBEEEE
T ss_pred CCccCEEEEE---CCEEEEEeeecccCceeeeE
Confidence 3678888666 78999986666644443333
No 18
>PF04583 Baculo_p74: Baculoviridae p74 conserved region; InterPro: IPR007663 Baculoviruses are distinct from other virus families in that there are two viral phenotypes: budded virus (BV) and occlusion-derived virus (ODV). BVs disseminate viral infection throughout the tissues of the host and ODVs transmit baculovirus between insect hosts. GFP tagging experiments implicate p74 as an ODV envelope protein [, ].; GO: 0019058 viral infectious cycle
Probab=29.20 E-value=62 Score=32.22 Aligned_cols=38 Identities=26% Similarity=0.537 Sum_probs=22.2
Q ss_pred CcceeeeeeEEEEecCCCCCCCCCChhHHHHHHHH----hh-------hcchhHHHHHHH
Q 016745 321 KPYGYHNMIFSWIDTMADNYPPPLDAHLVVSVMSM----WT-------RVQPAYAANMWN 369 (383)
Q Consensus 321 ~PYGYhN~iFsWIDT~~dNyPppLd~~~v~~v~s~----~~-------~~~P~~a~~m~n 369 (383)
-||||.|| |||-...++-.+..+- ++ .++|++-.++..
T Consensus 125 DPfGYnNM-----------FPr~~ldDLs~sfl~A~~esl~~~~Rd~Ief~pe~f~~~v~ 173 (249)
T PF04583_consen 125 DPFGYNNM-----------FPREYLDDLSRSFLSAYYESLGNGSRDIIEFLPEFFDELVE 173 (249)
T ss_pred Cccccccc-----------CCCcchHHHHHHHHHHHHHHhCCCCCCceeecHHHHHHHHH
Confidence 59999999 5665544443333322 22 267777666554
No 19
>COG5008 PilU Tfp pilus assembly protein, ATPase PilU [Cell motility and secretion / Intracellular trafficking and secretion]
Probab=28.97 E-value=30 Score=35.67 Aligned_cols=109 Identities=21% Similarity=0.380 Sum_probs=73.1
Q ss_pred EeecCChhhHH--HHHhcceEEEEecccc--hhhhhhhhhccccccC-cchh----hhccHHHHHhhcCCceeeccCC--
Q 016745 134 FDSWEEPAELE--YVKQHGVSVFLMPSGM--MGTLLSLIDILPLFSN-SHWG----QNANLAFLEKHMGATFEKRPQP-- 202 (383)
Q Consensus 134 ~~~~~~~~e~e--~ik~~Gv~vFlm~~gm--~gtl~sl~~~~plF~n-t~wg----e~~Nl~FL~~~mG~~fe~R~~~-- 202 (383)
|.+++=|+-++ .+++.|+-||.=.+|- .-|+-+.+.- .| +.-| ...=++|+.+|-+--+..|+.-
T Consensus 110 ~eeL~LPevlk~la~~kRGLviiVGaTGSGKSTtmAaMi~y----RN~~s~gHIiTIEDPIEfih~h~~CIvTQREvGvD 185 (375)
T COG5008 110 FEELKLPEVLKDLALAKRGLVIIVGATGSGKSTTMAAMIGY----RNKNSTGHIITIEDPIEFIHKHKRCIVTQREVGVD 185 (375)
T ss_pred HHhcCCcHHHHHhhcccCceEEEECCCCCCchhhHHHHhcc----cccCCCCceEEecChHHHHhcccceeEEeeeeccc
Confidence 34444344444 4678999988754442 3333333221 22 1122 3456899999999999999843
Q ss_pred ---ccccCCCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEee
Q 016745 203 ---WHATINPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLK 250 (383)
Q Consensus 203 ---~v~~i~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lr 250 (383)
|-+.+.-+-=|+-|++.|+.+|-| ||++-=.+=|-.||-.||-.
T Consensus 186 Tesw~~AlkNtlRQapDvI~IGEvRsr----etMeyAi~fAeTGHLcmaTL 232 (375)
T COG5008 186 TESWEVALKNTLRQAPDVILIGEVRSR----ETMEYAIQFAETGHLCMATL 232 (375)
T ss_pred hHHHHHHHHHHHhcCCCeEEEeecccH----hHHHHHHHHHhcCceEEEEe
Confidence 222333456689999999999999 99998888899999888765
No 20
>PF13964 Kelch_6: Kelch motif
Probab=28.35 E-value=55 Score=23.07 Aligned_cols=21 Identities=29% Similarity=0.396 Sum_probs=15.6
Q ss_pred CccceeEEeecCCCcEEEEecCCC
Q 016745 241 FAGHTAVCLKDKEGNLWVGESGHE 264 (383)
Q Consensus 241 ~aGHtav~Lrd~dGeL~v~ES~~~ 264 (383)
+.+|+++++ +|+|||+=-.+.
T Consensus 2 R~~~s~v~~---~~~iyv~GG~~~ 22 (50)
T PF13964_consen 2 RYGHSAVVV---GGKIYVFGGYDN 22 (50)
T ss_pred CccCEEEEE---CCEEEEECCCCC
Confidence 678999665 689999855444
No 21
>TIGR03047 PS_II_psb28 photosystem II reaction center protein Psb28. Members of this protein family are the Psb28 protein of photosystem II. Two different protein families, apparently without homology between them, have been designated PsbW. Cyanobacterial proteins previously designated PsbW are members of the family described here. However, while members of the plant PsbW family are not found (so far) in Cyanobacteria, members of the present family do occur in plants. We therefore support the alternative designation that has emerged for this protein family, Psb28, rather than PsbW.
Probab=28.10 E-value=70 Score=28.35 Aligned_cols=41 Identities=29% Similarity=0.394 Sum_probs=30.7
Q ss_pred CchhhHHH--hhcCcCccceeEEeecCCCcEEEEecCCC--Ccccccc
Q 016745 228 GGFETLEK--WVTGAFAGHTAVCLKDKEGNLWVGESGHE--NEKGEEI 271 (383)
Q Consensus 228 dG~d~li~--W~tGs~aGHtav~Lrd~dGeL~v~ES~~~--~~~~~~~ 271 (383)
+-.+.+.+ -.+|...| |.|.|++|+|-+-|+... |-+|+.+
T Consensus 33 ~~p~al~~~~~~~~~itG---m~LiDeEGei~tr~v~~KFvnGkp~~i 77 (109)
T TIGR03047 33 ENPKALDKFNSDTGEITG---MYLIDEEGEIVTREVKAKFVNGKPKAL 77 (109)
T ss_pred CCchhhhhccccccceee---EEEEccCccEEEEecceEEECCCccEE
Confidence 44566666 55688888 999999999999999888 4444443
No 22
>PF07494 Reg_prop: Two component regulator propeller; InterPro: IPR011110 A large group of two component regulator proteins appear to have the same N-terminal structure of 14 tandem repeats. These repeats show homology to members of IPR002372 from INTERPRO and IPR001680 from INTERPRO indicating that they are likely to form a beta-propeller. This family has been built with artificially high cut-offs in order to avoid overlaps with other beta-propeller families. The fourteen repeats are likely to form two propellers; it is not clear if these structures are likely to recruit other proteins or interact with DNA.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=26.47 E-value=58 Score=20.76 Aligned_cols=13 Identities=46% Similarity=1.173 Sum_probs=9.1
Q ss_pred EeecCCCcEEEEe
Q 016745 248 CLKDKEGNLWVGE 260 (383)
Q Consensus 248 ~Lrd~dGeL~v~E 260 (383)
.+.|.+|.|||+=
T Consensus 10 i~~D~~G~lWigT 22 (24)
T PF07494_consen 10 IYEDSDGNLWIGT 22 (24)
T ss_dssp EEE-TTSCEEEEE
T ss_pred EEEcCCcCEEEEe
Confidence 3458889999973
No 23
>cd03474 Rieske_T4moC Toluene-4-monooxygenase effector protein complex (T4mo), Rieske ferredoxin subunit; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. T4mo is a four-protein complex that catalyzes the NADH- and O2-dependent hydroxylation of toluene to form p-cresol. T4mo consists of an NADH oxidoreductase (T4moF), a diiron hydroxylase (T4moH), a catalytic effector protein (T4moD), and a Rieske ferredoxin (T4moC). T4moC contains a Rieske domain and functions as an obligate electron carrier between T4moF and T4moH. Rieske ferredoxins are found as subunits of membrane oxidase complexes, cis-dihydrodiol-forming aromatic dioxygenases, bacterial assimilatory nitrite reductases, and arsenite oxidase. Rieske ferredoxins are also found as soluble electron carriers in bacterial dioxygenase and monooxygenase complexes.
Probab=25.72 E-value=1.2e+02 Score=24.81 Aligned_cols=36 Identities=17% Similarity=0.291 Sum_probs=26.7
Q ss_pred CCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecC
Q 016745 208 NPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESG 262 (383)
Q Consensus 208 ~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~ 262 (383)
..+||..|+...+. +. |-..+.+++.+|++++++..
T Consensus 6 ~~~~l~~g~~~~~~-~~------------------~~~~~~~~~~~g~~~A~~n~ 41 (108)
T cd03474 6 SLDDVWEGEMELVD-VD------------------GEEVLLVAPEGGEFRAFQGI 41 (108)
T ss_pred ehhccCCCceEEEE-EC------------------CeEEEEEEccCCeEEEEcCc
Confidence 46789999988765 32 22567788889999998864
No 24
>PF02362 B3: B3 DNA binding domain; InterPro: IPR003340 Two DNA binding proteins, RAV1 and RAV2 from Arabidopsis thaliana contain two distinct amino acid sequence domains found only in higher plant species. The N-terminal regions of RAV1 and RAV2 are homologous to the AP2 DNA-binding domain (see IPR001471 from INTERPRO) present in a family of transcription factors, while the C-terminal region exhibits homology to the highly conserved C-terminal domain, designated B3, of VP1/ABI3 transcription factors. The AP2 and B3-like domains of RAV1 bind autonomously to the CAACA and CACCTG motifs, respectively, and together achieve a high affinity and specificity of binding. It has been suggested that the AP2 and B3-like domains of RAV1 are connected by a highly flexible structure enabling the two domains to bind to the CAACA and CACCTG motifs in various spacings and orientations.; GO: 0003677 DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 1WID_A 1YEL_A.
Probab=25.41 E-value=1.3e+02 Score=23.64 Aligned_cols=48 Identities=31% Similarity=0.403 Sum_probs=29.4
Q ss_pred cCCCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecC
Q 016745 206 TINPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESG 262 (383)
Q Consensus 206 ~i~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~ 262 (383)
.+.++++.+.+.|.|.+ +-..+. .|.......|.|+|++|+.|-++--
T Consensus 4 ~l~~s~~~~~~~l~iP~--------~f~~~~-~~~~~~~~~v~l~~~~g~~W~v~~~ 51 (100)
T PF02362_consen 4 VLKPSDVSSSCRLIIPK--------EFAKKH-GGNKRKSREVTLKDPDGRSWPVKLK 51 (100)
T ss_dssp E--TTCCCCTT-EEE-H--------HHHTTT-S--SS--CEEEEEETTTEEEEEEEE
T ss_pred EEEccCcCCCCEEEeCH--------HHHHHh-CCCcCCCeEEEEEeCCCCEEEEEEE
Confidence 45577888889999995 334444 2223344567899999999999983
No 25
>cd03531 Rieske_RO_Alpha_KSH The alignment model represents the N-terminal Rieske iron-sulfur domain of KshA, the oxygenase component of 3-ketosteroid 9-alpha-hydroxylase (KSH). The terminal oxygenase component of KSH is a key enzyme in the microbial steroid degradation pathway, catalyzing the 9 alpha-hydroxylation of 4-androstene-3,17-dione (AD) and 1,4-androstadiene-3,17-dione (ADD). KSH is a two-component class IA monooxygenase, with terminal oxygenase (KshA) and oxygenase reductase (KshB) components. KSH activity has been found in many actino- and proteo- bacterial genera including Rhodococcus, Nocardia, Arthrobacter, Mycobacterium, and Burkholderia.
Probab=25.26 E-value=93 Score=26.21 Aligned_cols=101 Identities=20% Similarity=0.336 Sum_probs=58.2
Q ss_pred CCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCC-------CcccccceeecchhHH
Q 016745 208 NPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHE-------NEKGEEIIVVIPWDEW 280 (383)
Q Consensus 208 ~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~-------~~~~~~~Iq~~pweeW 280 (383)
..+||..|+...+. + .|...+..|+.||++++++.-=. .-..+...-+=||-.|
T Consensus 7 ~~~dl~~g~~~~~~-~------------------~g~~i~l~r~~~g~~~a~~n~CpH~ga~L~~G~~~~~~i~CP~Hg~ 67 (115)
T cd03531 7 LARDFRDGKPHGVE-A------------------FGTKLVVFADSDGALNVLDAYCRHMGGDLSQGTVKGDEIACPFHDW 67 (115)
T ss_pred EHHHCCCCCeEEEE-E------------------CCeEEEEEECCCCCEEEEcCcCCCCCCCCccCcccCCEEECCCCCC
Confidence 35678888888776 3 24677788888999999886322 1123334455589988
Q ss_pred HHHHhccCCCCcEEEeeCChHHHhhcchHHHHH-HHHhhcCCcceeeeeeEEEEecCCCCCCCC
Q 016745 281 WELALKDDSNPQIALLPLHPDVRAKFNSTAAWE-YARSMSGKPYGYHNMIFSWIDTMADNYPPP 343 (383)
Q Consensus 281 ~~~~~kd~a~~~ValLPL~~e~RakFN~TAAwe-f~~~~eG~PYGYhN~iFsWIDT~~dNyPpp 343 (383)
. +.- .+ ....+|-.+ +|...++.. |=-..+ ..+||-|+| ++.|=|||
T Consensus 68 ~---fd~-~G-~~~~~p~~~----~~p~~~~l~~ypv~~~------~g~v~v~~~-~~~~~p~~ 115 (115)
T cd03531 68 R---WGG-DG-RCKAIPYAR----RVPPLARTRAWPTLER------NGQLFVWHD-PEGNPPPP 115 (115)
T ss_pred E---ECC-CC-CEEECCccc----CCCcccccceEeEEEE------CCEEEEECC-CCCCCCCC
Confidence 3 322 22 345555322 222222221 111111 468999998 78887776
No 26
>PF00877 NLPC_P60: NlpC/P60 family; InterPro: IPR000064 The Escherichia coli NLPC/Listeria P60 domain occurs at the C terminus of a number of different bacterial and viral proteins. The viral proteins are either described as tail assembly proteins or Gp19. In bacteria, the proteins are variously described as being putative tail component of prophage, invasin, invasion associated protein, putative lipoprotein, cell wall hydrolase, or putative endopeptidase. The E. coli NLPC/Listeria P60 domain is contained within the boundaries of the cysteine peptidase domain that defines the MEROPS peptidase family C40 (clan C-). A type example being dipeptidyl-peptidase VI from Bacillus sphaericus and gamma-glutamyl-diamino acid-endopeptidase precursor from Lactococcus lactis 3.4.19.11 from EC. This group also contains proteins classified as non-peptidase homologues in that they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of peptidases in the C40 family. ; PDB: 3PVQ_B 3GT2_A 3NPF_B 2K1G_A 3I86_A 3S0Q_A 2XIV_A 3PBC_A 3NE0_A 3M1U_B ....
Probab=23.88 E-value=46 Score=26.87 Aligned_cols=28 Identities=18% Similarity=0.396 Sum_probs=22.6
Q ss_pred cCCCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEe
Q 016745 206 TINPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCL 249 (383)
Q Consensus 206 ~i~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~L 249 (383)
.++.++.++||+|.... +..+.|.+|.+
T Consensus 46 ~~~~~~~~pGDlif~~~----------------~~~~~Hvgiy~ 73 (105)
T PF00877_consen 46 RVPISELQPGDLIFFKG----------------GGGISHVGIYL 73 (105)
T ss_dssp HEEGGG-TTTEEEEEEG----------------TGGEEEEEEEE
T ss_pred ccchhcCCcccEEEEeC----------------CccCCEeEEEE
Confidence 36788999999998872 77889999998
No 27
>cd03477 Rieske_YhfW_C YhfW family, C-terminal Rieske domain; YhfW is a protein of unknown function with an N-terminal DadA-like (glycine/D-amino acid dehydrogenase) domain and a C-terminal Rieske domain. The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. It is commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. YhfW is found in bacteria, some eukaryotes and archaea.
Probab=22.00 E-value=1.6e+02 Score=24.12 Aligned_cols=53 Identities=21% Similarity=0.163 Sum_probs=33.6
Q ss_pred CCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCC--c---cc-ccceeecchhHH
Q 016745 209 PEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHEN--E---KG-EEIIVVIPWDEW 280 (383)
Q Consensus 209 ~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~--~---~~-~~~Iq~~pweeW 280 (383)
.+||.+|+...+. +. |...+..|+.+|++++++..-.- . ++ .+....=||--|
T Consensus 5 ~~dl~~g~~~~~~-~~------------------g~~v~v~r~~~g~~~A~~~~CpH~g~~l~~g~~~~~i~CP~Hg~ 63 (91)
T cd03477 5 IEDLAPGEGGVVN-IG------------------GKRLAVYRDEDGVLHTVSATCTHLGCIVHWNDAEKSWDCPCHGS 63 (91)
T ss_pred hhhcCCCCeEEEE-EC------------------CEEEEEEECCCCCEEEEcCcCCCCCCCCcccCCCCEEECCCCCC
Confidence 5788999988775 32 44555678779999998875431 1 11 123445567666
No 28
>PF04970 LRAT: Lecithin retinol acyltransferase; InterPro: IPR007053 This entry represents a conserved sequence region found in proteins from viruses, bacteria and eukaryotes. It contains a well-conserved NCEHF motif, though its function in these proteins is unknown.; PDB: 2KYT_A 4DOT_A 4FA0_A.
Probab=21.94 E-value=3.9e+02 Score=22.48 Aligned_cols=92 Identities=16% Similarity=0.210 Sum_probs=48.8
Q ss_pred CCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEe-cCC----------CCcccccceeecch
Q 016745 209 PEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGE-SGH----------ENEKGEEIIVVIPW 277 (383)
Q Consensus 209 ~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~E-S~~----------~~~~~~~~Iq~~pw 277 (383)
...+++||.|.+.|. ..=|.++-+= ||+..=.- ++. ..-..+..|++.+.
T Consensus 4 ~~~~~~GD~I~~~r~-----------------~y~H~gIYvG--~~~ViH~~~~~~~~~~~~~~~~~~~~~~~~V~~~~l 64 (125)
T PF04970_consen 4 KKRLKPGDHIEVPRG-----------------LYEHWGIYVG--DGEVIHFSGPGEISVSNRSSICGFSKKKAEVKKDSL 64 (125)
T ss_dssp --S--TT-EEEEEET-----------------TEEEEEEEEE--TTEEEEEE-S-SSS-SSSSGGGGT--S-EEEEEEEH
T ss_pred ccCCCCCCEEEEecC-----------------CccEEEEEec--CCeEEEecccccccccccccccceecCCCEEEEEEh
Confidence 457899999999962 4557777764 45433222 111 12235677889999
Q ss_pred hHHHHHHhccCCCCcEEEeeCChHHHhhcchHHHHHHHHhhcCCcceee
Q 016745 278 DEWWELALKDDSNPQIALLPLHPDVRAKFNSTAAWEYARSMSGKPYGYH 326 (383)
Q Consensus 278 eeW~~~~~kd~a~~~ValLPL~~e~RakFN~TAAwef~~~~eG~PYGYh 326 (383)
+++. ++.. +-+....+.....+.-..+.+-|+++-|+...||
T Consensus 65 ~~~~-----~~~~--~~v~~~~~~~~~~~~~~~iv~rA~~~lg~~~~Y~ 106 (125)
T PF04970_consen 65 EEFA-----QGRK--VRVNNYLDHRYKPFPPEEIVERAESRLGKEFEYN 106 (125)
T ss_dssp HHHH-----TTSE--EEE--GGGGTS--S-HHHHHHHHHHTTT-EESS-
T ss_pred HHhc-----CCCE--EEEEecCCccCCCCCHHHHHHHHHHHHcCCCccC
Confidence 9984 2232 4444443345556777788889999999655665
No 29
>PF09652 Cas_VVA1548: Putative CRISPR-associated protein (Cas_VVA1548); InterPro: IPR013443 Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity. In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny. This entry represents a conserved region of about 95 amino acids found exclusively in species with CRISPR repeats. In all bacterial species that contain this entry, the genes encoding the proteins are in the midst of a cluster of cas genes.
Probab=20.21 E-value=65 Score=27.80 Aligned_cols=32 Identities=16% Similarity=0.501 Sum_probs=23.7
Q ss_pred ccHHHHHhhcCCceeeccCCccccCCCCCCCCCCEEE
Q 016745 183 ANLAFLEKHMGATFEKRPQPWHATINPEDVHSGDFLA 219 (383)
Q Consensus 183 ~Nl~FL~~~mG~~fe~R~~~~v~~i~~~dIhsGDfL~ 219 (383)
..++++++. |+...++ +.-+|+++|++||.+.
T Consensus 8 GAieW~~~q-g~~iD~~----v~Hld~~~i~~GD~Vi 39 (93)
T PF09652_consen 8 GAIEWAKQQ-GIQIDHF----VDHLDPADIQPGDVVI 39 (93)
T ss_pred cHHHHHHHh-CCCccee----eccCCHHHccCCCEEE
Confidence 457888886 6655443 4578999999999875
Done!