Query         016745
Match_columns 383
No_of_seqs    47 out of 49
Neff          3.3 
Searched_HMMs 46136
Date          Fri Mar 29 02:25:50 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/016745.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/016745hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF05708 DUF830:  Orthopoxvirus  98.5 4.9E-07 1.1E-11   77.4   8.4  116  212-353     2-118 (158)
  2 PRK10030 hypothetical protein;  98.4 1.8E-06   4E-11   79.5   9.2  120  209-354    18-137 (197)
  3 PRK11470 hypothetical protein;  98.0 2.7E-05 5.8E-10   73.0   8.9  142  209-374     6-161 (200)
  4 PRK11479 hypothetical protein;  97.2  0.0013 2.8E-08   64.5   8.0  100  205-329    58-160 (274)
  5 PF05382 Amidase_5:  Bacterioph  79.0     5.5 0.00012   36.1   6.0   65  210-298    74-138 (145)
  6 PF05257 CHAP:  CHAP domain;  I  74.3       9  0.0002   32.0   5.7   43  208-264    59-102 (124)
  7 PF01436 NHL:  NHL repeat;  Int  68.9     7.9 0.00017   25.4   3.3   19  246-265     6-24  (28)
  8 TIGR02219 phage_NlpC_fam putat  68.7     6.3 0.00014   34.2   3.7   54  206-282    71-124 (134)
  9 COG3863 Uncharacterized distan  53.1      34 0.00073   33.4   5.8  109  206-342    73-188 (231)
 10 PF07646 Kelch_2:  Kelch motif;  48.0      18 0.00038   25.7   2.4   18  240-260     1-18  (49)
 11 PRK15231 fimbrial adhesin prot  44.1      24 0.00051   32.7   3.1   92  146-266    10-106 (150)
 12 PF07313 DUF1460:  Protein of u  38.1      74  0.0016   30.6   5.6   58  209-282   151-208 (216)
 13 PF13418 Kelch_4:  Galactose ox  36.9      33 0.00072   24.0   2.4   18  240-259     1-18  (49)
 14 smart00739 KOW KOW (Kyprides,   33.0      79  0.0017   19.7   3.5   23  212-250     2-24  (28)
 15 cd02983 P5_C P5 family, C-term  31.2      60  0.0013   28.2   3.5   85  292-382    24-125 (130)
 16 PF12075 KN_motif:  KN motif;    31.0      19 0.00041   26.7   0.4    6  321-326     6-11  (39)
 17 PF01344 Kelch_1:  Kelch motif;  29.8      53  0.0012   22.5   2.4   30  240-272     1-30  (47)
 18 PF04583 Baculo_p74:  Baculovir  29.2      62  0.0013   32.2   3.6   38  321-369   125-173 (249)
 19 COG5008 PilU Tfp pilus assembl  29.0      30 0.00065   35.7   1.4  109  134-250   110-232 (375)
 20 PF13964 Kelch_6:  Kelch motif   28.3      55  0.0012   23.1   2.4   21  241-264     2-22  (50)
 21 TIGR03047 PS_II_psb28 photosys  28.1      70  0.0015   28.4   3.4   41  228-271    33-77  (109)
 22 PF07494 Reg_prop:  Two compone  26.5      58  0.0013   20.8   2.0   13  248-260    10-22  (24)
 23 cd03474 Rieske_T4moC Toluene-4  25.7 1.2E+02  0.0025   24.8   4.1   36  208-262     6-41  (108)
 24 PF02362 B3:  B3 DNA binding do  25.4 1.3E+02  0.0028   23.6   4.3   48  206-262     4-51  (100)
 25 cd03531 Rieske_RO_Alpha_KSH Th  25.3      93   0.002   26.2   3.6  101  208-343     7-115 (115)
 26 PF00877 NLPC_P60:  NlpC/P60 fa  23.9      46   0.001   26.9   1.4   28  206-249    46-73  (105)
 27 cd03477 Rieske_YhfW_C YhfW fam  22.0 1.6E+02  0.0034   24.1   4.2   53  209-280     5-63  (91)
 28 PF04970 LRAT:  Lecithin retino  21.9 3.9E+02  0.0085   22.5   6.7   92  209-326     4-106 (125)
 29 PF09652 Cas_VVA1548:  Putative  20.2      65  0.0014   27.8   1.7   32  183-219     8-39  (93)

No 1  
>PF05708 DUF830:  Orthopoxvirus protein of unknown function (DUF830); PDB: 2IF6_B 3KW0_C.
Probab=98.50  E-value=4.9e-07  Score=77.37  Aligned_cols=116  Identities=21%  Similarity=0.306  Sum_probs=79.3

Q ss_pred             CCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEec-CCCCcccccceeecchhHHHHHHhccCCC
Q 016745          212 VHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGES-GHENEKGEEIIVVIPWDEWWELALKDDSN  290 (383)
Q Consensus       212 IhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES-~~~~~~~~~~Iq~~pweeW~~~~~kd~a~  290 (383)
                      .++||.|...-   . +.+...|.=+|++..||.+|.+.+.+++.+|+|+ -..      +++..++++|..    +  +
T Consensus         2 l~~GDIil~~~---~-~~~s~~i~~~t~~~~~HvgI~~~~~~~~~~viea~~~~------Gv~~~~l~~~~~----~--~   65 (158)
T PF05708_consen    2 LQTGDIILTRG---K-SSLSKAIRPVTSSPYSHVGIVIGDEGQEPYVIEATPGD------GVRLEPLSDFLK----R--N   65 (158)
T ss_dssp             --TT-EEEEEE-----SCCHHHHHHHHTSS--EEEEEEEETTE-EEEEEEETTT------CEEEEECHHHHH----C--C
T ss_pred             CCCeeEEEEEC---C-chHHHHHHHHhCCCCCEEEEEEecCCCceEEEEeccCC------CeEEeeHHHHhc----C--C
Confidence            58999998873   3 7789999999999999999999987788999999 333      699999999964    3  4


Q ss_pred             CcEEEeeCChHHHhhcchHHHHHHHHhhcCCcceeeeeeEEEEecCCCCCCCCCChhHHHHHH
Q 016745          291 PQIALLPLHPDVRAKFNSTAAWEYARSMSGKPYGYHNMIFSWIDTMADNYPPPLDAHLVVSVM  353 (383)
Q Consensus       291 ~~ValLPL~~e~RakFN~TAAwef~~~~eG~PYGYhN~iFsWIDT~~dNyPppLd~~~v~~v~  353 (383)
                      -+++++.+++. +..=...+|.+++++..|+||++--.+.      ++.   -.=++||+-++
T Consensus        66 ~~~~V~r~~~~-~~~~~~~~~~~~a~~~~g~~Y~~~~~~~------~~~---~yCSelV~~~y  118 (158)
T PF05708_consen   66 EKIAVYRLKDP-LSEEQRQKAAEFAKSYIGKPYDFNFSLD------DDR---FYCSELVAEAY  118 (158)
T ss_dssp             CEEEEEEECCG-TTCHHHHHHHHHHHCCTTS-B-CC-HCC------SSS---B-HHHHHHHHH
T ss_pred             ceEEEEEECCC-CCHHHHHHHHHHHHHHcCCCccccccCC------CCC---EEcHHHHHHHH
Confidence            46888888877 3233345678899999999999863333      221   22247776666


No 2  
>PRK10030 hypothetical protein; Provisional
Probab=98.36  E-value=1.8e-06  Score=79.52  Aligned_cols=120  Identities=20%  Similarity=0.274  Sum_probs=88.5

Q ss_pred             CCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhHHHHHHhccC
Q 016745          209 PEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDEWWELALKDD  288 (383)
Q Consensus       209 ~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pweeW~~~~~kd~  288 (383)
                      ..++++||.|-.+-   + +.....|+.+|+|.-.|.+|..+. +|+.+|+|+-       ..|+.+|+++|.+    +.
T Consensus        18 ~~~l~~GDlif~~g---~-~~~s~aI~~~T~s~~SHVGIi~~~-~~~~~ViEAv-------~~V~~~pL~~Fl~----~~   81 (197)
T PRK10030         18 AWQPQTGDIIFQIS---R-SSQSKAIQLATHSDYSHTGMIVKR-NKKPYVFEAV-------GPVKYTPLKQWIA----HG   81 (197)
T ss_pred             hcCCCCCCEEEEeC---C-CcHhHHHhHhhCCCCceEEEEEEE-CCcEEEEEec-------CceEEEEHHHHhh----cC
Confidence            34899999998873   2 456889999999999999999985 7999999994       2499999999964    44


Q ss_pred             CCCcEEEeeCChHHHhhcchHHHHHHHHhhcCCcceeeeeeEEEEecCCCCCCCCCChhHHHHHHH
Q 016745          289 SNPQIALLPLHPDVRAKFNSTAAWEYARSMSGKPYGYHNMIFSWIDTMADNYPPPLDAHLVVSVMS  354 (383)
Q Consensus       289 a~~~ValLPL~~e~RakFN~TAAwef~~~~eG~PYGYhN~iFsWIDT~~dNyPppLd~~~v~~v~s  354 (383)
                      .+-++++..++..+.... -.++.+++++..|+||-+.   |.| |  ++   .-.=++||.-++.
T Consensus        82 ~~~~~~V~Rl~~~lt~~~-~~~li~~A~~~lGkpYD~~---f~~-~--d~---~~YCSELV~~ay~  137 (197)
T PRK10030         82 EKGKYVVRRLENGLSVEQ-QQKLAQTAKRYLGKPYDFY---FSW-S--DD---RIYCSELVWKVYQ  137 (197)
T ss_pred             ccCcEEEEEeCCCCCHHH-HHHHHHHHHHHcCCCCCcc---ccc-C--CC---cEEeHHHHHHHHH
Confidence            456788887765332221 3447889999999999864   655 1  12   2334588887763


No 3  
>PRK11470 hypothetical protein; Provisional
Probab=98.01  E-value=2.7e-05  Score=72.98  Aligned_cols=142  Identities=14%  Similarity=0.171  Sum_probs=98.5

Q ss_pred             CCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhHHHHHHhccC
Q 016745          209 PEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDEWWELALKDD  288 (383)
Q Consensus       209 ~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pweeW~~~~~kd~  288 (383)
                      +.++|+||+|-++--.   ..+ -.|+-+|||...|..|..+-..++-+|+||--..      ++.+|.++|++    ..
T Consensus         6 ~~~l~~GDLvF~~~~~---~~~-~aI~~aT~s~~sHvGII~~~~~~~~~VlEA~~~~------vr~TpLs~fi~----r~   71 (200)
T PRK11470          6 PAEYEIGDIVFTCIGA---ALF-GQISAASNCWSNHVGIIIGHNGEDFLVAESRVPL------STVTTLSRFIK----RS   71 (200)
T ss_pred             cCCCCCCCEEEEeCCc---chh-HHHHhccCCccceEEEEEEEcCCceEEEEecCCc------eEEeEHHHHHh----cC
Confidence            4689999999998422   223 3488899999999999885435689999994222      89999999974    45


Q ss_pred             CCCcEEEeeCChHHHhhcchHHHHHHHHhhcCCcceeeeeeEEEEecCCCCCCCCCChhHHHHHH--------------H
Q 016745          289 SNPQIALLPLHPDVRAKFNSTAAWEYARSMSGKPYGYHNMIFSWIDTMADNYPPPLDAHLVVSVM--------------S  354 (383)
Q Consensus       289 a~~~ValLPL~~e~RakFN~TAAwef~~~~eG~PYGYhN~iFsWIDT~~dNyPppLd~~~v~~v~--------------s  354 (383)
                      .+-++++-.|+..+++.= ..+|.++++++-|+||.+.   |.| |  ++.|=|   ++||.-++              .
T Consensus        72 ~~g~i~v~Rl~~~l~~~~-~~~~~~~A~~~lGkpYD~~---F~~-~--d~~~YC---SElV~~~y~~a~~i~vg~~~~~~  141 (200)
T PRK11470         72 ANQRYAIKRLDAGLTEQQ-KQRIVEQVPSRLRKLYHTG---FKY-E--SSRQFC---SKFVFDIYKEALCIPVGEIETFG  141 (200)
T ss_pred             cCceEEEEEecCCCCHHH-HHHHHHHHHHHcCCCCCCc---cCC-C--CCceeh---HHHHHHHHHHhhCCcccccccch
Confidence            567899998875554411 3458899999999999985   665 2  344433   46665333              2


Q ss_pred             HhhhcchhHHHHHHHHHHhh
Q 016745          355 MWTRVQPAYAANMWNEALNK  374 (383)
Q Consensus       355 ~~~~~~P~~a~~m~neALNK  374 (383)
                      ++-+=+|.....+|.+--.+
T Consensus       142 ~~~~~~p~~~~~~w~~~y~~  161 (200)
T PRK11470        142 ELLNSNPDAKLTFWKFWFLG  161 (200)
T ss_pred             hhccCCccchhHHHHHHhcC
Confidence            33233677777777754433


No 4  
>PRK11479 hypothetical protein; Provisional
Probab=97.18  E-value=0.0013  Score=64.49  Aligned_cols=100  Identities=20%  Similarity=0.297  Sum_probs=77.2

Q ss_pred             ccCCCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhHHHHHH
Q 016745          205 ATINPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDEWWELA  284 (383)
Q Consensus       205 ~~i~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pweeW~~~~  284 (383)
                      ..|+.+++|+||.|..+.-    +.+.-.|++.|+|...|.+|.+-  ||  .++|+-.      .+|+..|.++|..  
T Consensus        58 ~~Vs~~~LqpGDLVFfst~----t~~S~~Ik~~T~s~~SHVgIylG--dg--~vIEA~g------~GVri~pL~~~~~--  121 (274)
T PRK11479         58 KEITAPDLKPGDLLFSSSL----GVTSFGIRVFSTSSVSHVAIYLG--EN--NVAEATG------AGVQIVSLKKAIK--  121 (274)
T ss_pred             cccChhhCCCCCEEEEecC----CccccceecccCCCCcEEEEEec--CC--eEEEcCC------CCEEEEechhhhc--
Confidence            3688899999999998631    44677899999999999999985  44  3799832      3599999999963  


Q ss_pred             hccCCCCcEEEe---eCChHHHhhcchHHHHHHHHhhcCCcceeeeee
Q 016745          285 LKDDSNPQIALL---PLHPDVRAKFNSTAAWEYARSMSGKPYGYHNMI  329 (383)
Q Consensus       285 ~kd~a~~~ValL---PL~~e~RakFN~TAAwef~~~~eG~PYGYhN~i  329 (383)
                       .+   -.|+.+   .+.+|.+++     +.+|+++..|.||=|-+.+
T Consensus       122 -~~---~~I~a~Rv~~lt~e~~~k-----l~~fa~~~lGy~YN~~gI~  160 (274)
T PRK11479        122 -HS---DKLFALRVPDLTPQQATK-----ITAFANKIKDSGYNYRGIV  160 (274)
T ss_pred             -cc---ceEEEEeCCCCCHHHHHH-----HHHHHHHhcCCCCCHHHHH
Confidence             22   236666   566666654     8899999999999987764


No 5  
>PF05382 Amidase_5:  Bacteriophage peptidoglycan hydrolase ;  InterPro: IPR008044 This entry is represented by Bacteriophage SFi21, lysin (Cell wall hydrolase; 3.5.1.28 from EC). At least one of the proteins in this entry, the Pal protein from the pneumococcal bacteriophage Dp-1 (O03979 from SWISSPROT), has been shown to be an N-acetylmuramoyl-L-alanine amidase. According to the known modular structure of this and other peptidoglycan hydrolases from the pneumococcal system, the active site should reside within this domain while a C-terminal domain binds to the choline residues of the cell wall teichoic acids.
Probab=79.05  E-value=5.5  Score=36.12  Aligned_cols=65  Identities=20%  Similarity=0.431  Sum_probs=42.9

Q ss_pred             CCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhHHHHHHhccCC
Q 016745          210 EDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDEWWELALKDDS  289 (383)
Q Consensus       210 ~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pweeW~~~~~kd~a  289 (383)
                      .++|.||.+..++- |.           ++...|||.|++. . +.  +|..   + ++.++|.+++++..|    ..+.
T Consensus        74 ~~~q~GDI~I~g~~-g~-----------S~G~~GHtgif~~-~-~~--iIhc---~-y~~~g~~~~~~~~~~----~~~~  129 (145)
T PF05382_consen   74 WNLQRGDIFIWGRR-GN-----------SAGAGGHTGIFMD-N-DT--IIHC---N-YGANGIAINNYDWYW----YYNG  129 (145)
T ss_pred             ccccCCCEEEEcCC-CC-----------CCCCCCeEEEEeC-C-Cc--EEEe---c-CCCCCeEecCCCeee----ecCC
Confidence            47999999997642 22           5556899999984 2 22  3332   2 278889999988775    4455


Q ss_pred             CCcEEEeeC
Q 016745          290 NPQIALLPL  298 (383)
Q Consensus       290 ~~~ValLPL  298 (383)
                      .+-+-+-+|
T Consensus       130 ~~~~~~yr~  138 (145)
T PF05382_consen  130 RPPVYVYRL  138 (145)
T ss_pred             CCcEEEEEe
Confidence            555555444


No 6  
>PF05257 CHAP:  CHAP domain;  InterPro: IPR007921 The CHAP (cysteine, histidine-dependent amidohydrolases/peptidases) domain is a region of between 110 and 140 amino acids that is found in proteins from bacteria, bacteriophages, archaea and eukaryotes of the Trypanosomidae family. Many of these proteins are uncharacterised, but it has been proposed that they may function mainly in peptidoglycan hydrolysis. The CHAP domain is found in a wide range of protein architectures; it is commonly associated with bacterial type SH3 domains and with several families of amidase domains. It has been suggested that CHAP domain containing proteins utilise a catalytic cysteine residue in a nucleophilic-attack mechanism. The CHAP domain contains two invariant residues, a cysteine and a histidine. These residues form part of the putative active site of CHAP domain containing proteins. Secondary structure predictions show that the CHAP domain belongs to the alpha + beta structural class, with the N-terminal half largely containing predicted alpha helices and the C-terminal half principally composed of predicted beta strands. Proteins known to contain a CHAP domain include: bacterial and trypanosomal glutathionylspermidine amidases; a variety of bacterial autolysins; a Nocardia aerocolonigenes putative esterase; Streptococcus pneumoniae choline-binding protein D; Methanosarcina mazei protein MM2478, a putative chloride channel; several phage-encoded peptidoglycan hydrolases; and cysteine peptidases belonging to MEROPS peptidase family C51 (D-alanyl-glycyl endopeptidase, clan CA).; PDB: 2LRJ_A 2VPM_B 2VOB_B 2VPS_A 2K3A_A 2IO9_A 2IO8_A 2IOB_A 2IOA_B 2IO7_B ....
Probab=74.28  E-value=9  Score=31.99  Aligned_cols=43  Identities=21%  Similarity=0.201  Sum_probs=32.1

Q ss_pred             CCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeec-CCCcEEEEecCCC
Q 016745          208 NPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKD-KEGNLWVGESGHE  264 (383)
Q Consensus       208 ~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd-~dGeL~v~ES~~~  264 (383)
                      .....++||.+...              ..++...||+++..-. .+|.+.++|....
T Consensus        59 ~~~~P~~Gdivv~~--------------~~~~~~~GHVaIV~~v~~~~~i~v~e~N~~  102 (124)
T PF05257_consen   59 TGSTPQPGDIVVWD--------------SGSGGGYGHVAIVESVNDGGTITVIEQNWG  102 (124)
T ss_dssp             ECS---TTEEEEEE--------------ECTTTTT-EEEEEEEE-TTSEEEEEECSST
T ss_pred             cCcccccceEEEec--------------cCCCCCCCeEEEEEEECCCCEEEEEECCcC
Confidence            45677899998875              3567889999999988 7799999999864


No 7  
>PF01436 NHL:  NHL repeat;  InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C-terminal alpha-amidation of biological peptides. In many it occurs in tandem arrays, for example in the ringfinger beta-box, coiled-coil (RBCC) eukaryotic growth regulators. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller.  The NHL repeats are also found in serine/threonine protein kinases (STPKs) in a diverse range of pathogenic bacteria. These STPKs are transmembrane receptors with an intracellular N-terminal kinase domain and an extracellular C-terminal sensor domain. In the STPK PknD from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; GO: 0005515 protein binding; PDB: 3FVZ_A 3FW0_A 1RWL_A 1RWI_A 1Q7F_A.
Probab=68.94  E-value=7.9  Score=25.39  Aligned_cols=19  Identities=37%  Similarity=0.847  Sum_probs=14.5

Q ss_pred             eEEeecCCCcEEEEecCCCC
Q 016745          246 AVCLKDKEGNLWVGESGHEN  265 (383)
Q Consensus       246 av~Lrd~dGeL~v~ES~~~~  265 (383)
                      -||+ +++|++||+||+..-
T Consensus         6 gvav-~~~g~i~VaD~~n~r   24 (28)
T PF01436_consen    6 GVAV-DSDGNIYVADSGNHR   24 (28)
T ss_dssp             EEEE-ETTSEEEEEECCCTE
T ss_pred             EEEE-eCCCCEEEEECCCCE
Confidence            3566 479999999987654


No 8  
>TIGR02219 phage_NlpC_fam putative phage cell wall peptidase, NlpC/P60 family. Members of this family show sequence similarity to members of the NlpC/P60 family described by Pfam model pfam00877 and by Anantharaman and Aravind (PubMed:12620121). The NlpC/P60 family includes a number of characterized bacterial cell wall hydrolases. Members of this related family are all found in prophage regions of bacterial genomes.
Probab=68.74  E-value=6.3  Score=34.20  Aligned_cols=54  Identities=20%  Similarity=0.342  Sum_probs=33.2

Q ss_pred             cCCCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhHHHH
Q 016745          206 TINPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDEWWE  282 (383)
Q Consensus       206 ~i~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pweeW~~  282 (383)
                      .++.+++|+||+|... .             ..|..++|..+.+ + +|++  +-+-.+    . .+.+...+.||.
T Consensus        71 ~v~~~~~qpGDlvff~-~-------------~~~~~~~HvGIy~-G-~g~~--iHa~~~----~-~v~~~~~~~yw~  124 (134)
T TIGR02219        71 PVPCDAAQPGDVLVFR-W-------------RPGAAAKHAAIAA-S-PTRF--IHAYDG----A-AVVESALVPWWR  124 (134)
T ss_pred             ccchhcCCCCCEEEEe-e-------------CCCCCCcEEEEEe-C-CCcE--EEECCC----C-CEEEeCCcHHHH
Confidence            4566889999999664 2             2355688999887 3 6664  333221    1 233445567764


No 9  
>COG3863 Uncharacterized distant relative of cell wall-associated hydrolases [Function unknown]
Probab=53.05  E-value=34  Score=33.42  Aligned_cols=109  Identities=22%  Similarity=0.377  Sum_probs=65.9

Q ss_pred             cCCCCCCCCCCEEEEEeecccCCchhhHHHhhc------CcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhH
Q 016745          206 TINPEDVHSGDFLAVSKIRGRWGGFETLEKWVT------GAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDE  279 (383)
Q Consensus       206 ~i~~~dIhsGDfL~iski~gr~dG~d~li~W~t------Gs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pwee  279 (383)
                      +.|..-.+|||-++= ++--| +|.    -|.|      +.|-||..|-. + -|+  .+||..+.      +++.++.-
T Consensus        73 ~~dr~v~~~gd~~~g-dyPTr-~g~----i~~t~~~~~~~~H~gHagmy~-~-a~~--~VEs~psG------Vr~v~~n~  136 (231)
T COG3863          73 NLDRSVLQPGDILLG-DYPTR-GGA----IWLTDTFGNIVGHWGHAGMYI-G-AGQ--MVESWPSG------VRVVSVNM  136 (231)
T ss_pred             hhhhhhcCCcchhhc-cCCCC-cce----EEEEcccccccccccceEEEE-c-CCc--EEeeccCc------eEEecchh
Confidence            555556677776543 22112 121    1322      55778888654 3 344  58987764      88888877


Q ss_pred             HHHHHhccCCCCcEEEeeCChHHHhhcchHHHHHHHHhhcCCcceeeeeeEEEEec-CCCCCCC
Q 016745          280 WWELALKDDSNPQIALLPLHPDVRAKFNSTAAWEYARSMSGKPYGYHNMIFSWIDT-MADNYPP  342 (383)
Q Consensus       280 W~~~~~kd~a~~~ValLPL~~e~RakFN~TAAwef~~~~eG~PYGYhN~iFsWIDT-~~dNyPp  342 (383)
                      |.   .+||+=  |-..--+.|..     ++|-.|+.+-.|+||-|.  .|.-|.| -++-|=|
T Consensus       137 ~~---~~dn~i--V~~vsts~~qk-----~~AadWa~~kVG~PY~~n--f~n~~nt~~dk~y~C  188 (231)
T COG3863         137 AR---NADNVI--VYRVSTSNDQK-----SKAADWALTKVGLPYDYN--FLNYVNTKYDKSYYC  188 (231)
T ss_pred             hh---cccceE--EEEEecchhhh-----HHHHHHHHhccCCcccce--eeeecccccCcceeH
Confidence            75   355553  44444455543     688899999999999985  3455666 4444433


No 10 
>PF07646 Kelch_2:  Kelch motif;  InterPro: IPR011498 Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin, and in galactose oxidase from the fungus Dactylium dendroides. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold. The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase. This entry represents a type of kelch sequence motif that comprises one beta-sheet blade.; GO: 0005515 protein binding
Probab=48.03  E-value=18  Score=25.75  Aligned_cols=18  Identities=39%  Similarity=0.615  Sum_probs=14.4

Q ss_pred             cCccceeEEeecCCCcEEEEe
Q 016745          240 AFAGHTAVCLKDKEGNLWVGE  260 (383)
Q Consensus       240 s~aGHtav~Lrd~dGeL~v~E  260 (383)
                      .+.||+++++   ++||||+=
T Consensus         1 ~r~~hs~~~~---~~kiyv~G   18 (49)
T PF07646_consen    1 PRYGHSAVVL---DGKIYVFG   18 (49)
T ss_pred             CccceEEEEE---CCEEEEEC
Confidence            3679999866   79999973


No 11 
>PRK15231 fimbrial adhesin protein SefD; Provisional
Probab=44.12  E-value=24  Score=32.74  Aligned_cols=92  Identities=14%  Similarity=0.235  Sum_probs=66.4

Q ss_pred             HHhcceEEEEecccchhhhhhhhhccccccCcchhhhccHHHHHhhcCCceeeccCCccccCCCCCCCCCCEEEEEeecc
Q 016745          146 VKQHGVSVFLMPSGMMGTLLSLIDILPLFSNSHWGQNANLAFLEKHMGATFEKRPQPWHATINPEDVHSGDFLAVSKIRG  225 (383)
Q Consensus       146 ik~~Gv~vFlm~~gm~gtl~sl~~~~plF~nt~wge~~Nl~FL~~~mG~~fe~R~~~~v~~i~~~dIhsGDfL~iski~g  225 (383)
                      |-+.=++||++-+|...++...-++ .|.++             +.||             .=-+.+++|+.|+-.||--
T Consensus        10 ~~~~~~~~~~~~~~~~Ss~sqA~el-~L~~~-------------~~~~-------------~~~~~l~dg~~laTGri~c   62 (150)
T PRK15231         10 IPKFIVSVFLIVTGFFSSTIKAQEL-KLMIK-------------INEA-------------VFYDRITSNKIIGTGHLFN   62 (150)
T ss_pred             cccceeeEeeEeehhhhhhhhceee-EEEee-------------cccc-------------chhhhccCCcEEeeeeEEe
Confidence            4455689999999877666554443 22110             1111             0126789999999999977


Q ss_pred             cCCchhhHHHhhc----CcCccceeEE-eecCCCcEEEEecCCCCc
Q 016745          226 RWGGFETLEKWVT----GAFAGHTAVC-LKDKEGNLWVGESGHENE  266 (383)
Q Consensus       226 r~dG~d~li~W~t----Gs~aGHtav~-Lrd~dGeL~v~ES~~~~~  266 (383)
                      | +||- +.||..    |..+||-.|- .+|+.-||+|-=.|.+|-
T Consensus        63 r-egfh-iwmns~~~q~gg~P~~YIvqGk~dsqh~LrVRlgGeGWq  106 (150)
T PRK15231         63 R-EGKK-ILISSSLEKIKNTPGAYIIRGQNNSAHKLRIRIGGEDWQ  106 (150)
T ss_pred             c-CCeE-EEEecchhhcCCCccEEEEECCCCCcceEEEEecCCCcc
Confidence            7 7998 899988    8899999887 778888999987777763


No 12 
>PF07313 DUF1460:  Protein of unknown function (DUF1460);  InterPro: IPR010846 This family consists of several hypothetical bacterial proteins of around 260 residues in length. The function of this family is unknown.; PDB: 2P1G_B 2IM9_A.
Probab=38.13  E-value=74  Score=30.65  Aligned_cols=58  Identities=17%  Similarity=0.401  Sum_probs=39.9

Q ss_pred             CCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCCcccccceeecchhHHHH
Q 016745          209 PEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHENEKGEEIIVVIPWDEWWE  282 (383)
Q Consensus       209 ~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~Iq~~pweeW~~  282 (383)
                      .+.||+||.++|..=.   +|||          ..|+-++.|..|| |+....-...  ++..|.-.|..||.+
T Consensus       151 ~~~i~~GDiI~i~t~~---~GLD----------vsH~Giav~~~~~-l~l~hASs~~--~~~~vvd~pl~~Yl~  208 (216)
T PF07313_consen  151 LSQIKNGDIIAIVTNI---KGLD----------VSHVGIAVWKNDG-LHLRHASSLH--KKVVVVDEPLSEYLK  208 (216)
T ss_dssp             HTTS-TT-EEEEEEEC---TTEC----------EEEEEEEEEETTE-EEEEEEETTT--TEEEEECCEHHHHHH
T ss_pred             HhcCCCCCEEEEEeCC---CCCc----------eeeEEEEEEECCe-EEEEeCCCCC--CCcEEeccCHHHHHh
Confidence            5889999999998632   4555          6799999998555 8887543322  223577789999975


No 13 
>PF13418 Kelch_4:  Galactose oxidase, central domain; PDB: 2UVK_B.
Probab=36.94  E-value=33  Score=23.96  Aligned_cols=18  Identities=28%  Similarity=0.477  Sum_probs=11.4

Q ss_pred             cCccceeEEeecCCCcEEEE
Q 016745          240 AFAGHTAVCLKDKEGNLWVG  259 (383)
Q Consensus       240 s~aGHtav~Lrd~dGeL~v~  259 (383)
                      ++.||+++.+  .+++|||.
T Consensus         1 pR~~h~~~~~--~~~~i~v~   18 (49)
T PF13418_consen    1 PRYGHSAVSI--GDNSIYVF   18 (49)
T ss_dssp             --BS-EEEEE---TTEEEEE
T ss_pred             CcceEEEEEE--eCCeEEEE
Confidence            4789999776  35889885


No 14 
>smart00739 KOW KOW (Kyprides, Ouzounis, Woese) motif. Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
Probab=33.04  E-value=79  Score=19.66  Aligned_cols=23  Identities=30%  Similarity=0.533  Sum_probs=17.0

Q ss_pred             CCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEee
Q 016745          212 VHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLK  250 (383)
Q Consensus       212 IhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lr  250 (383)
                      ++.||.+.|.                .|.++|+.+..+.
T Consensus         2 ~~~G~~V~I~----------------~G~~~g~~g~i~~   24 (28)
T smart00739        2 FEVGDTVRVI----------------AGPFKGKVGKVLE   24 (28)
T ss_pred             CCCCCEEEEe----------------ECCCCCcEEEEEE
Confidence            5789998888                4777888775553


No 15 
>cd02983 P5_C P5 family, C-terminal redox inactive TRX-like domain; P5 is a protein disulfide isomerase (PDI)-related protein with a domain structure of aa'b (where a and a' are redox active TRX domains and b is a redox inactive TRX-like domain). Like PDI, P5 is located in the endoplasmic reticulum (ER) and displays both isomerase and chaperone activities, which are independent of each other. Compared to PDI, the isomerase and chaperone activities of P5 are lower. The first cysteine in the CXXC motif of both redox active domains in P5 is necessary for isomerase activity. The P5 gene was first isolated as an amplified gene from a hydroxyurea-resistant hamster cell line. The zebrafish P5 homolog has been implicated to play a critical role in establishing left/right asymmetries in the embryonic midline. The C-terminal domain is likely involved in substrate binding, similar to the b and b' domains of PDI.
Probab=31.21  E-value=60  Score=28.17  Aligned_cols=85  Identities=19%  Similarity=0.181  Sum_probs=48.5

Q ss_pred             cEEEeeC----ChHHHhhcchHHHHHHHHhhcCCcceeeeeeEEEEecCC------------CCCCCCCChhHHHHHHH-
Q 016745          292 QIALLPL----HPDVRAKFNSTAAWEYARSMSGKPYGYHNMIFSWIDTMA------------DNYPPPLDAHLVVSVMS-  354 (383)
Q Consensus       292 ~ValLPL----~~e~RakFN~TAAwef~~~~eG~PYGYhN~iFsWIDT~~------------dNyPppLd~~~v~~v~s-  354 (383)
                      =|++||=    ++|-|++.-+ .=.+-|+++-|+|     +.|.|+|.-.            ++||...=-+.-..-.. 
T Consensus        24 ~i~~l~~~~d~~~e~~~~~~~-~l~~vAk~~kgk~-----i~Fv~vd~~~~~~~~~~fgl~~~~~P~v~i~~~~~~KY~~   97 (130)
T cd02983          24 IIAFLPHILDCQASCRNKYLE-ILKSVAEKFKKKP-----WGWLWTEAGAQLDLEEALNIGGFGYPAMVAINFRKMKFAT   97 (130)
T ss_pred             EEEEcCccccCCHHHHHHHHH-HHHHHHHHhcCCc-----EEEEEEeCcccHHHHHHcCCCccCCCEEEEEecccCcccc
Confidence            3777774    2333332211 1224667788988     6799999866            35664210011000111 


Q ss_pred             HhhhcchhHHHHHHHHHHhhhhCcCCCC
Q 016745          355 MWTRVQPAYAANMWNEALNKRLGTEVLC  382 (383)
Q Consensus       355 ~~~~~~P~~a~~m~neALNKRLgT~gL~  382 (383)
                      +-..+..+-...+.++.++-++++..++
T Consensus        98 ~~~~~t~e~i~~Fv~~~l~Gkl~~~~~~  125 (130)
T cd02983          98 LKGSFSEDGINEFLRELSYGRGPTLPVN  125 (130)
T ss_pred             ccCccCHHHHHHHHHHHHcCCcccccCC
Confidence            2344666777889999999999877654


No 16 
>PF12075 KN_motif:  KN motif;  InterPro: IPR021939  This small motif is found at the N terminus of Kank proteins and has been called the KN (for Kank N-terminal) motif. This protein is found in eukaryotes. Proteins in this family are typically between 413 and 1202 amino acids in length. This protein is found associated with PF00023 from PFAM. This protein has two conserved sequence motifs: TPYG and LDLDF. Kank1 was obtained by positional cloning of a tumor suppressor gene in renal cell carcinoma, while the other members were found by homology search. The family is involved in the regulation of actin polymerisation and cell motility through signaling pathways containing PI3K/Akt and/or unidentified modulators/effectors.
Probab=31.02  E-value=19  Score=26.71  Aligned_cols=6  Identities=83%  Similarity=1.990  Sum_probs=5.2

Q ss_pred             Ccceee
Q 016745          321 KPYGYH  326 (383)
Q Consensus       321 ~PYGYh  326 (383)
                      .|||||
T Consensus         6 tPYGyh   11 (39)
T PF12075_consen    6 TPYGYH   11 (39)
T ss_pred             CCccee
Confidence            399999


No 17 
>PF01344 Kelch_1:  Kelch motif;  InterPro: IPR006652 Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin, and in galactose oxidase from the fungus Dactylium dendroides. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold. The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase. This entry represents a type of kelch sequence motif that comprises one beta-sheet blade.; GO: 0005515 protein binding; PDB: 2XN4_A 2WOZ_A 3II7_A 4ASC_A 1U6D_X 1ZGK_A 2FLU_X 2VPJ_A 2DYH_A 1X2R_A ....
Probab=29.80  E-value=53  Score=22.46  Aligned_cols=30  Identities=20%  Similarity=0.303  Sum_probs=19.6

Q ss_pred             cCccceeEEeecCCCcEEEEecCCCCcccccce
Q 016745          240 AFAGHTAVCLKDKEGNLWVGESGHENEKGEEII  272 (383)
Q Consensus       240 s~aGHtav~Lrd~dGeL~v~ES~~~~~~~~~~I  272 (383)
                      ++++|+++.+   ++++||+=-.++.....+.+
T Consensus         1 pR~~~~~~~~---~~~iyv~GG~~~~~~~~~~v   30 (47)
T PF01344_consen    1 PRSGHAAVVV---GNKIYVIGGYDGNNQPTNSV   30 (47)
T ss_dssp             -BBSEEEEEE---TTEEEEEEEBESTSSBEEEE
T ss_pred             CCccCEEEEE---CCEEEEEeeecccCceeeeE
Confidence            3678888666   78999986666644443333


No 18 
>PF04583 Baculo_p74:  Baculoviridae p74 conserved region;  InterPro: IPR007663 Baculoviruses are distinct from other virus families in that there are two viral phenotypes: budded virus (BV) and occlusion-derived virus (ODV). BVs disseminate viral infection throughout the tissues of the host and ODVs transmit baculovirus between insect hosts. GFP tagging experiments implicate p74 as an ODV envelope protein [, ].; GO: 0019058 viral infectious cycle
Probab=29.20  E-value=62  Score=32.22  Aligned_cols=38  Identities=26%  Similarity=0.537  Sum_probs=22.2

Q ss_pred             CcceeeeeeEEEEecCCCCCCCCCChhHHHHHHHH----hh-------hcchhHHHHHHH
Q 016745          321 KPYGYHNMIFSWIDTMADNYPPPLDAHLVVSVMSM----WT-------RVQPAYAANMWN  369 (383)
Q Consensus       321 ~PYGYhN~iFsWIDT~~dNyPppLd~~~v~~v~s~----~~-------~~~P~~a~~m~n  369 (383)
                      -||||.||           |||-...++-.+..+-    ++       .++|++-.++..
T Consensus       125 DPfGYnNM-----------FPr~~ldDLs~sfl~A~~esl~~~~Rd~Ief~pe~f~~~v~  173 (249)
T PF04583_consen  125 DPFGYNNM-----------FPREYLDDLSRSFLSAYYESLGNGSRDIIEFLPEFFDELVE  173 (249)
T ss_pred             Cccccccc-----------CCCcchHHHHHHHHHHHHHHhCCCCCCceeecHHHHHHHHH
Confidence            59999999           5665544443333322    22       267777666554


No 19 
>COG5008 PilU Tfp pilus assembly protein, ATPase PilU [Cell motility and secretion / Intracellular trafficking and secretion]
Probab=28.97  E-value=30  Score=35.67  Aligned_cols=109  Identities=21%  Similarity=0.380  Sum_probs=73.1

Q ss_pred             EeecCChhhHH--HHHhcceEEEEecccc--hhhhhhhhhccccccC-cchh----hhccHHHHHhhcCCceeeccCC--
Q 016745          134 FDSWEEPAELE--YVKQHGVSVFLMPSGM--MGTLLSLIDILPLFSN-SHWG----QNANLAFLEKHMGATFEKRPQP--  202 (383)
Q Consensus       134 ~~~~~~~~e~e--~ik~~Gv~vFlm~~gm--~gtl~sl~~~~plF~n-t~wg----e~~Nl~FL~~~mG~~fe~R~~~--  202 (383)
                      |.+++=|+-++  .+++.|+-||.=.+|-  .-|+-+.+.-    .| +.-|    ...=++|+.+|-+--+..|+.-  
T Consensus       110 ~eeL~LPevlk~la~~kRGLviiVGaTGSGKSTtmAaMi~y----RN~~s~gHIiTIEDPIEfih~h~~CIvTQREvGvD  185 (375)
T COG5008         110 FEELKLPEVLKDLALAKRGLVIIVGATGSGKSTTMAAMIGY----RNKNSTGHIITIEDPIEFIHKHKRCIVTQREVGVD  185 (375)
T ss_pred             HHhcCCcHHHHHhhcccCceEEEECCCCCCchhhHHHHhcc----cccCCCCceEEecChHHHHhcccceeEEeeeeccc
Confidence            34444344444  4678999988754442  3333333221    22 1122    3456899999999999999843  


Q ss_pred             ---ccccCCCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEee
Q 016745          203 ---WHATINPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLK  250 (383)
Q Consensus       203 ---~v~~i~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lr  250 (383)
                         |-+.+.-+-=|+-|++.|+.+|-|    ||++-=.+=|-.||-.||-.
T Consensus       186 Tesw~~AlkNtlRQapDvI~IGEvRsr----etMeyAi~fAeTGHLcmaTL  232 (375)
T COG5008         186 TESWEVALKNTLRQAPDVILIGEVRSR----ETMEYAIQFAETGHLCMATL  232 (375)
T ss_pred             hHHHHHHHHHHHhcCCCeEEEeecccH----hHHHHHHHHHhcCceEEEEe
Confidence               222333456689999999999999    99998888899999888765


No 20 
>PF13964 Kelch_6:  Kelch motif
Probab=28.35  E-value=55  Score=23.07  Aligned_cols=21  Identities=29%  Similarity=0.396  Sum_probs=15.6

Q ss_pred             CccceeEEeecCCCcEEEEecCCC
Q 016745          241 FAGHTAVCLKDKEGNLWVGESGHE  264 (383)
Q Consensus       241 ~aGHtav~Lrd~dGeL~v~ES~~~  264 (383)
                      +.+|+++++   +|+|||+=-.+.
T Consensus         2 R~~~s~v~~---~~~iyv~GG~~~   22 (50)
T PF13964_consen    2 RYGHSAVVV---GGKIYVFGGYDN   22 (50)
T ss_pred             CccCEEEEE---CCEEEEECCCCC
Confidence            678999665   689999855444


No 21 
>TIGR03047 PS_II_psb28 photosystem II reaction center protein Psb28. Members of this protein family are the Psb28 protein of photosystem II. Two different protein families, apparently without homology between them, have been designated PsbW. Cyanobacterial proteins previously designated PsbW are members of the family described here. However, while members of the plant PsbW family are not found (so far) in Cyanobacteria, members of the present family do occur in plants. We therefore support the alternative designation that has emerged for this protein family, Psb28, rather than PsbW.
Probab=28.10  E-value=70  Score=28.35  Aligned_cols=41  Identities=29%  Similarity=0.394  Sum_probs=30.7

Q ss_pred             CchhhHHH--hhcCcCccceeEEeecCCCcEEEEecCCC--Ccccccc
Q 016745          228 GGFETLEK--WVTGAFAGHTAVCLKDKEGNLWVGESGHE--NEKGEEI  271 (383)
Q Consensus       228 dG~d~li~--W~tGs~aGHtav~Lrd~dGeL~v~ES~~~--~~~~~~~  271 (383)
                      +-.+.+.+  -.+|...|   |.|.|++|+|-+-|+...  |-+|+.+
T Consensus        33 ~~p~al~~~~~~~~~itG---m~LiDeEGei~tr~v~~KFvnGkp~~i   77 (109)
T TIGR03047        33 ENPKALDKFNSDTGEITG---MYLIDEEGEIVTREVKAKFVNGKPKAL   77 (109)
T ss_pred             CCchhhhhccccccceee---EEEEccCccEEEEecceEEECCCccEE
Confidence            44566666  55688888   999999999999999888  4444443


No 22 
>PF07494 Reg_prop:  Two component regulator propeller;  InterPro: IPR011110 A large group of two component regulator proteins appear to have the same N-terminal structure of 14 tandem repeats. These repeats show homology to members of IPR002372 from INTERPRO and IPR001680 from INTERPRO indicating that they are likely to form a beta-propeller. This family has been built with artificially high cut-offs in order to avoid overlaps with other beta-propeller families. The fourteen repeats are likely to form two propellers; it is not clear if these structures are likely to recruit other proteins or interact with DNA.; PDB: 3V9F_D 3VA6_B 3OTT_B 4A2M_D 4A2L_B.
Probab=26.47  E-value=58  Score=20.76  Aligned_cols=13  Identities=46%  Similarity=1.173  Sum_probs=9.1

Q ss_pred             EeecCCCcEEEEe
Q 016745          248 CLKDKEGNLWVGE  260 (383)
Q Consensus       248 ~Lrd~dGeL~v~E  260 (383)
                      .+.|.+|.|||+=
T Consensus        10 i~~D~~G~lWigT   22 (24)
T PF07494_consen   10 IYEDSDGNLWIGT   22 (24)
T ss_dssp             EEE-TTSCEEEEE
T ss_pred             EEEcCCcCEEEEe
Confidence            3458889999973


No 23 
>cd03474 Rieske_T4moC Toluene-4-monooxygenase effector protein complex (T4mo), Rieske ferredoxin subunit; The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. T4mo is a four-protein complex that catalyzes the NADH- and O2-dependent hydroxylation of toluene to form p-cresol. T4mo consists of an NADH oxidoreductase (T4moF), a diiron hydroxylase (T4moH), a catalytic effector protein (T4moD), and a Rieske ferredoxin (T4moC). T4moC contains a Rieske domain and functions as an obligate electron carrier between T4moF and T4moH. Rieske ferredoxins are found as subunits of membrane oxidase complexes, cis-dihydrodiol-forming aromatic dioxygenases, bacterial assimilatory nitrite reductases, and arsenite oxidase. Rieske ferredoxins are also found as soluble electron carriers in bacterial dioxygenase and monooxygenase complexes.
Probab=25.72  E-value=1.2e+02  Score=24.81  Aligned_cols=36  Identities=17%  Similarity=0.291  Sum_probs=26.7

Q ss_pred             CCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecC
Q 016745          208 NPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESG  262 (383)
Q Consensus       208 ~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~  262 (383)
                      ..+||..|+...+. +.                  |-..+.+++.+|++++++..
T Consensus         6 ~~~~l~~g~~~~~~-~~------------------~~~~~~~~~~~g~~~A~~n~   41 (108)
T cd03474           6 SLDDVWEGEMELVD-VD------------------GEEVLLVAPEGGEFRAFQGI   41 (108)
T ss_pred             ehhccCCCceEEEE-EC------------------CeEEEEEEccCCeEEEEcCc
Confidence            46789999988765 32                  22567788889999998864


No 24 
>PF02362 B3:  B3 DNA binding domain;  InterPro: IPR003340 Two DNA binding proteins, RAV1 and RAV2 from Arabidopsis thaliana contain two distinct amino acid sequence domains found only in higher plant species. The N-terminal regions of RAV1 and RAV2 are homologous to the AP2 DNA-binding domain (see IPR001471 from INTERPRO) present in a family of transcription factors, while the C-terminal region exhibits homology to the highly conserved C-terminal domain, designated B3, of VP1/ABI3 transcription factors []. The AP2 and B3-like domains of RAV1 bind autonomously to the CAACA and CACCTG motifs, respectively, and together achieve a high affinity and specificity of binding. It has been suggested that the AP2 and B3-like domains of RAV1 are connected by a highly flexible structure enabling the two domains to bind to the CAACA and CACCTG motifs in various spacings and orientations [].; GO: 0003677 DNA binding, 0006355 regulation of transcription, DNA-dependent; PDB: 1WID_A 1YEL_A.
Probab=25.41  E-value=1.3e+02  Score=23.64  Aligned_cols=48  Identities=31%  Similarity=0.403  Sum_probs=29.4

Q ss_pred             cCCCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecC
Q 016745          206 TINPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESG  262 (383)
Q Consensus       206 ~i~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~  262 (383)
                      .+.++++.+.+.|.|.+        +-..+. .|.......|.|+|++|+.|-++--
T Consensus         4 ~l~~s~~~~~~~l~iP~--------~f~~~~-~~~~~~~~~v~l~~~~g~~W~v~~~   51 (100)
T PF02362_consen    4 VLKPSDVSSSCRLIIPK--------EFAKKH-GGNKRKSREVTLKDPDGRSWPVKLK   51 (100)
T ss_dssp             E--TTCCCCTT-EEE-H--------HHHTTT-S--SS--CEEEEEETTTEEEEEEEE
T ss_pred             EEEccCcCCCCEEEeCH--------HHHHHh-CCCcCCCeEEEEEeCCCCEEEEEEE
Confidence            45577888889999995        334444 2223344567899999999999983


No 25 
>cd03531 Rieske_RO_Alpha_KSH The alignment model represents the N-terminal Rieske iron-sulfur domain of KshA, the oxygenase component of 3-ketosteroid 9-alpha-hydroxylase (KSH).  The terminal oxygenase component of KSH is a key enzyme in the microbial steroid degradation pathway, catalyzing the 9 alpha-hydroxylation of 4-androstene-3,17-dione (AD) and 1,4-androstadiene-3,17-dione (ADD). KSH is a two-component class IA monooxygenase, with terminal oxygenase (KshA) and oxygenase reductase (KshB) components.  KSH activity has been found in many actino- and proteo- bacterial genera including Rhodococcus, Nocardia, Arthrobacter, Mycobacterium, and Burkholderia.
Probab=25.26  E-value=93  Score=26.21  Aligned_cols=101  Identities=20%  Similarity=0.336  Sum_probs=58.2

Q ss_pred             CCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCC-------CcccccceeecchhHH
Q 016745          208 NPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHE-------NEKGEEIIVVIPWDEW  280 (383)
Q Consensus       208 ~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~-------~~~~~~~Iq~~pweeW  280 (383)
                      ..+||..|+...+. +                  .|...+..|+.||++++++.-=.       .-..+...-+=||-.|
T Consensus         7 ~~~dl~~g~~~~~~-~------------------~g~~i~l~r~~~g~~~a~~n~CpH~ga~L~~G~~~~~~i~CP~Hg~   67 (115)
T cd03531           7 LARDFRDGKPHGVE-A------------------FGTKLVVFADSDGALNVLDAYCRHMGGDLSQGTVKGDEIACPFHDW   67 (115)
T ss_pred             EHHHCCCCCeEEEE-E------------------CCeEEEEEECCCCCEEEEcCcCCCCCCCCccCcccCCEEECCCCCC
Confidence            35678888888776 3                  24677788888999999886322       1123334455589988


Q ss_pred             HHHHhccCCCCcEEEeeCChHHHhhcchHHHHH-HHHhhcCCcceeeeeeEEEEecCCCCCCCC
Q 016745          281 WELALKDDSNPQIALLPLHPDVRAKFNSTAAWE-YARSMSGKPYGYHNMIFSWIDTMADNYPPP  343 (383)
Q Consensus       281 ~~~~~kd~a~~~ValLPL~~e~RakFN~TAAwe-f~~~~eG~PYGYhN~iFsWIDT~~dNyPpp  343 (383)
                      .   +.- .+ ....+|-.+    +|...++.. |=-..+      ..+||-|+| ++.|=|||
T Consensus        68 ~---fd~-~G-~~~~~p~~~----~~p~~~~l~~ypv~~~------~g~v~v~~~-~~~~~p~~  115 (115)
T cd03531          68 R---WGG-DG-RCKAIPYAR----RVPPLARTRAWPTLER------NGQLFVWHD-PEGNPPPP  115 (115)
T ss_pred             E---ECC-CC-CEEECCccc----CCCcccccceEeEEEE------CCEEEEECC-CCCCCCCC
Confidence            3   322 22 345555322    222222221 111111      468999998 78887776


No 26 
>PF00877 NLPC_P60:  NlpC/P60 family;  InterPro: IPR000064 The Escherichia coli NLPC/Listeria P60 domain occurs at the C terminus of a number of different bacterial and viral proteins. The viral proteins are either described as tail assembly proteins or Gp19. In bacteria, the proteins are variously described as being putative tail component of prophage, invasin, invasion associated protein, putative lipoprotein, cell wall hydrolase, or putative endopeptidase.  The E. coli NLPC/Listeria P60 domain is contained within the boundaries of the cysteine peptidase domain that defines the MEROPS peptidase family C40 (clan C-). A type example being dipeptidyl-peptidase VI from Bacillus sphaericus and gamma-glutamyl-diamino acid-endopeptidase precursor from Lactococcus lactis 3.4.19.11 from EC. This group also contains proteins classified as non-peptidase homologues in that they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of peptidases in the C40 family. ; PDB: 3PVQ_B 3GT2_A 3NPF_B 2K1G_A 3I86_A 3S0Q_A 2XIV_A 3PBC_A 3NE0_A 3M1U_B ....
Probab=23.88  E-value=46  Score=26.87  Aligned_cols=28  Identities=18%  Similarity=0.396  Sum_probs=22.6

Q ss_pred             cCCCCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEe
Q 016745          206 TINPEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCL  249 (383)
Q Consensus       206 ~i~~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~L  249 (383)
                      .++.++.++||+|....                +..+.|.+|.+
T Consensus        46 ~~~~~~~~pGDlif~~~----------------~~~~~Hvgiy~   73 (105)
T PF00877_consen   46 RVPISELQPGDLIFFKG----------------GGGISHVGIYL   73 (105)
T ss_dssp             HEEGGG-TTTEEEEEEG----------------TGGEEEEEEEE
T ss_pred             ccchhcCCcccEEEEeC----------------CccCCEeEEEE
Confidence            36788999999998872                77889999998


No 27 
>cd03477 Rieske_YhfW_C YhfW family, C-terminal Rieske domain; YhfW is a protein of unknown function with an N-terminal DadA-like (glycine/D-amino acid dehydrogenase) domain and a C-terminal Rieske domain. The Rieske domain is a [2Fe-2S] cluster binding domain involved in electron transfer. It is commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. YhfW is found in bacteria, some eukaryotes and archaea.
Probab=22.00  E-value=1.6e+02  Score=24.12  Aligned_cols=53  Identities=21%  Similarity=0.163  Sum_probs=33.6

Q ss_pred             CCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEecCCCC--c---cc-ccceeecchhHH
Q 016745          209 PEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGESGHEN--E---KG-EEIIVVIPWDEW  280 (383)
Q Consensus       209 ~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~ES~~~~--~---~~-~~~Iq~~pweeW  280 (383)
                      .+||.+|+...+. +.                  |...+..|+.+|++++++..-.-  .   ++ .+....=||--|
T Consensus         5 ~~dl~~g~~~~~~-~~------------------g~~v~v~r~~~g~~~A~~~~CpH~g~~l~~g~~~~~i~CP~Hg~   63 (91)
T cd03477           5 IEDLAPGEGGVVN-IG------------------GKRLAVYRDEDGVLHTVSATCTHLGCIVHWNDAEKSWDCPCHGS   63 (91)
T ss_pred             hhhcCCCCeEEEE-EC------------------CEEEEEEECCCCCEEEEcCcCCCCCCCCcccCCCCEEECCCCCC
Confidence            5788999988775 32                  44555678779999998875431  1   11 123445567666


No 28 
>PF04970 LRAT:  Lecithin retinol acyltransferase;  InterPro: IPR007053 This entry represents a conserved sequence region found in proteins from viruses, bacteria and eukaryotes. It contains a well-conserved NCEHF motif, though its function in these proteins is unknown.; PDB: 2KYT_A 4DOT_A 4FA0_A.
Probab=21.94  E-value=3.9e+02  Score=22.48  Aligned_cols=92  Identities=16%  Similarity=0.210  Sum_probs=48.8

Q ss_pred             CCCCCCCCEEEEEeecccCCchhhHHHhhcCcCccceeEEeecCCCcEEEEe-cCC----------CCcccccceeecch
Q 016745          209 PEDVHSGDFLAVSKIRGRWGGFETLEKWVTGAFAGHTAVCLKDKEGNLWVGE-SGH----------ENEKGEEIIVVIPW  277 (383)
Q Consensus       209 ~~dIhsGDfL~iski~gr~dG~d~li~W~tGs~aGHtav~Lrd~dGeL~v~E-S~~----------~~~~~~~~Iq~~pw  277 (383)
                      ...+++||.|.+.|.                 ..=|.++-+=  ||+..=.- ++.          ..-..+..|++.+.
T Consensus         4 ~~~~~~GD~I~~~r~-----------------~y~H~gIYvG--~~~ViH~~~~~~~~~~~~~~~~~~~~~~~~V~~~~l   64 (125)
T PF04970_consen    4 KKRLKPGDHIEVPRG-----------------LYEHWGIYVG--DGEVIHFSGPGEISVSNRSSICGFSKKKAEVKKDSL   64 (125)
T ss_dssp             --S--TT-EEEEEET-----------------TEEEEEEEEE--TTEEEEEE-S-SSS-SSSSGGGGT--S-EEEEEEEH
T ss_pred             ccCCCCCCEEEEecC-----------------CccEEEEEec--CCeEEEecccccccccccccccceecCCCEEEEEEh
Confidence            457899999999962                 4557777764  45433222 111          12235677889999


Q ss_pred             hHHHHHHhccCCCCcEEEeeCChHHHhhcchHHHHHHHHhhcCCcceee
Q 016745          278 DEWWELALKDDSNPQIALLPLHPDVRAKFNSTAAWEYARSMSGKPYGYH  326 (383)
Q Consensus       278 eeW~~~~~kd~a~~~ValLPL~~e~RakFN~TAAwef~~~~eG~PYGYh  326 (383)
                      +++.     ++..  +-+....+.....+.-..+.+-|+++-|+...||
T Consensus        65 ~~~~-----~~~~--~~v~~~~~~~~~~~~~~~iv~rA~~~lg~~~~Y~  106 (125)
T PF04970_consen   65 EEFA-----QGRK--VRVNNYLDHRYKPFPPEEIVERAESRLGKEFEYN  106 (125)
T ss_dssp             HHHH-----TTSE--EEE--GGGGTS--S-HHHHHHHHHHTTT-EESS-
T ss_pred             HHhc-----CCCE--EEEEecCCccCCCCCHHHHHHHHHHHHcCCCccC
Confidence            9984     2232  4444443345556777788889999999655665


No 29 
>PF09652 Cas_VVA1548:  Putative CRISPR-associated protein (Cas_VVA1548);  InterPro: IPR013443 Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes []. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements [, ]. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity. In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci []. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny.   This entry represents a conserved region of about 95 amino acids found exclusively in species with CRISPR repeats. In all bacterial species that contain this entry, the genes encoding the proteins are in the midst of a cluster of cas genes.
Probab=20.21  E-value=65  Score=27.80  Aligned_cols=32  Identities=16%  Similarity=0.501  Sum_probs=23.7

Q ss_pred             ccHHHHHhhcCCceeeccCCccccCCCCCCCCCCEEE
Q 016745          183 ANLAFLEKHMGATFEKRPQPWHATINPEDVHSGDFLA  219 (383)
Q Consensus       183 ~Nl~FL~~~mG~~fe~R~~~~v~~i~~~dIhsGDfL~  219 (383)
                      ..++++++. |+...++    +.-+|+++|++||.+.
T Consensus         8 GAieW~~~q-g~~iD~~----v~Hld~~~i~~GD~Vi   39 (93)
T PF09652_consen    8 GAIEWAKQQ-GIQIDHF----VDHLDPADIQPGDVVI   39 (93)
T ss_pred             cHHHHHHHh-CCCccee----eccCCHHHccCCCEEE
Confidence            457888886 6655443    4578999999999875


Done!