Query         028455
Match_columns 208
No_of_seqs    223 out of 971
Neff          6.7 
Searched_HMMs 46136
Date          Fri Mar 29 11:45:44 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/028455.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/028455hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 KOG3288 OTU-like cysteine prot 100.0 2.9E-68 6.3E-73  448.7  13.8  201    1-207   106-307 (307)
  2 COG5539 Predicted cysteine pro 100.0 1.4E-36 3.1E-41  260.7   5.3  189   10-207   117-306 (306)
  3 PF02338 OTU:  OTU-like cystein  99.9 1.2E-26 2.5E-31  177.5   8.4  104   11-121     1-121 (121)
  4 KOG2606 OTU (ovarian tumor)-li  99.9 1.3E-25 2.9E-30  193.7   8.2  122    4-127   158-298 (302)
  5 PF10275 Peptidase_C65:  Peptid  99.6   6E-15 1.3E-19  126.4  12.4   92   34-126   141-244 (244)
  6 KOG3991 Uncharacterized conser  99.6 4.3E-15 9.2E-20  124.7   6.9   94   33-127   157-256 (256)
  7 KOG2605 OTU (ovarian tumor)-li  99.4 4.1E-14 8.9E-19  127.7   3.5  120    5-127   218-344 (371)
  8 COG5539 Predicted cysteine pro  99.0 1.1E-10 2.3E-15  101.3   2.6  115    5-124   171-304 (306)
  9 PF05415 Peptidase_C36:  Beet n  96.1   0.017 3.8E-07   42.4   5.4   78   10-106     3-84  (104)
 10 PF02148 zf-UBP:  Zn-finger in   87.1    0.28 6.2E-06   33.3   0.9   30  175-204    10-42  (63)
 11 PF12874 zf-met:  Zinc-finger o  84.5    0.41 8.8E-06   26.0   0.6   25  177-201     1-25  (25)
 12 PF12756 zf-C2H2_2:  C2H2 type   83.8     1.1 2.4E-05   31.8   2.8   30  176-205    50-79  (100)
 13 PF00096 zf-C2H2:  Zinc finger,  79.1     1.2 2.7E-05   23.5   1.2   21  178-198     2-22  (23)
 14 smart00290 ZnF_UBP Ubiquitin C  77.6     1.4 3.1E-05   28.0   1.4   29  176-204    11-42  (50)
 15 PF12171 zf-C2H2_jaz:  Zinc-fin  77.2    0.45 9.7E-06   26.7  -1.0   25  177-201     2-26  (27)
 16 PF13894 zf-C2H2_4:  C2H2-type   72.8     2.8 6.1E-05   21.6   1.6   21  178-198     2-22  (24)
 17 KOG0804 Cytoplasmic Zn-finger   71.5     2.3   5E-05   39.9   1.6   25  178-202   242-269 (493)
 18 smart00355 ZnF_C2H2 zinc finge  71.2       3 6.5E-05   21.7   1.5   21  177-197     1-21  (26)
 19 PF05412 Peptidase_C33:  Equine  67.8     3.6 7.8E-05   31.2   1.7   84   10-128     4-87  (108)
 20 PF13912 zf-C2H2_6:  C2H2-type   64.5     5.2 0.00011   21.9   1.6   21  177-197     2-22  (27)
 21 PF05379 Peptidase_C23:  Carlav  60.4      33 0.00072   25.0   5.6   16   14-29      3-18  (89)
 22 smart00451 ZnF_U1 U1-like zinc  59.5     5.4 0.00012   23.1   1.1   25  177-201     4-28  (35)
 23 PHA03082 DNA-dependent RNA pol  57.4     5.8 0.00013   26.8   1.1   19  174-192     2-20  (63)
 24 PF05864 Chordopox_RPO7:  Chord  57.1     6.1 0.00013   26.7   1.2   18  175-192     3-20  (63)
 25 PRK10963 hypothetical protein;  54.0      10 0.00022   32.1   2.3   36   37-78      6-41  (223)
 26 PF13913 zf-C2HC_2:  zinc-finge  53.2      11 0.00025   20.8   1.7   21  176-197     2-22  (25)
 27 COG4049 Uncharacterized protei  53.0     6.9 0.00015   26.5   0.9   26  173-198    14-39  (65)
 28 PHA00616 hypothetical protein   52.7     7.6 0.00016   24.8   1.0   29  177-205     2-31  (44)
 29 PF05381 Peptidase_C21:  Tymovi  43.4 1.4E+02  0.0031   22.5   6.8   89   13-123     2-94  (104)
 30 cd00729 rubredoxin_SM Rubredox  41.0      13 0.00028   22.2   0.7   14  176-189     2-15  (34)
 31 PF09237 GAGA:  GAGA factor;  I  38.8      26 0.00056   23.3   1.9   22  178-199    26-47  (54)
 32 cd02669 Peptidase_C19M A subfa  38.0      16 0.00035   33.9   1.2   32  173-204    25-59  (440)
 33 PF04475 DUF555:  Protein of un  38.0      42  0.0009   25.2   3.1   38  151-188    22-59  (102)
 34 KOG1247 Methionyl-tRNA synthet  38.0      19 0.00041   34.0   1.6   61  116-188    86-148 (567)
 35 PF13465 zf-H2C2_2:  Zinc-finge  36.4      17 0.00037   20.0   0.7   18  170-187     8-25  (26)
 36 COG3426 Butyrate kinase [Energ  36.3      53  0.0011   29.6   4.0   59   18-84    275-341 (358)
 37 PHA02768 hypothetical protein;  36.1      32 0.00068   23.0   2.0   21  178-198     7-27  (55)
 38 PF09082 DUF1922:  Domain of un  35.9      13 0.00027   26.0   0.1   21  166-187    10-30  (68)
 39 PF13909 zf-H2C2_5:  C2H2-type   35.2      41 0.00089   17.7   2.1   21  177-198     1-21  (24)
 40 PF07368 DUF1487:  Protein of u  34.2      64  0.0014   27.5   4.1   66   71-145   144-213 (215)
 41 COG3357 Predicted transcriptio  33.8      49  0.0011   24.5   2.9   37  150-188    34-70  (97)
 42 cd01675 RNR_III Class III ribo  33.1      57  0.0012   31.5   4.1   36  150-189   495-531 (555)
 43 PF04877 Hairpins:  HrpZ;  Inte  33.0      36 0.00078   30.4   2.4   50   31-85    162-211 (308)
 44 TIGR02934 nifT_nitrog probable  31.3     4.3 9.3E-05   28.3  -2.8   44   44-103     6-50  (67)
 45 PRK09784 hypothetical protein;  31.2      25 0.00055   30.9   1.2   20    5-24    200-219 (417)
 46 PRK06266 transcription initiat  31.0      41 0.00088   27.6   2.4   49  138-187    80-128 (178)
 47 PHA00732 hypothetical protein   30.8      44 0.00096   23.7   2.2   10  178-187    29-38  (79)
 48 PF05148 Methyltransf_8:  Hypot  30.2 1.3E+02  0.0028   25.8   5.2   72   16-97     13-104 (219)
 49 COG2051 RPS27A Ribosomal prote  29.0      29 0.00063   24.1   1.0   15  173-187    35-49  (67)
 50 PF13240 zinc_ribbon_2:  zinc-r  29.0      33 0.00072   18.6   1.0   14  170-186    10-23  (23)
 51 PF07967 zf-C3HC:  C3HC zinc fi  28.8      30 0.00066   26.7   1.2   23  167-189    34-56  (133)
 52 cd00350 rubredoxin_like Rubred  28.3      22 0.00048   20.8   0.2   14  177-190     2-15  (33)
 53 PF04959 ARS2:  Arsenite-resist  27.8      45 0.00098   28.3   2.2   28  170-197    71-98  (214)
 54 KOG1790 60s ribosomal protein   26.8      23 0.00049   27.4   0.2   25  167-191    32-56  (121)
 55 smart00238 BIR Baculoviral inh  26.8 1.2E+02  0.0027   20.2   3.9   40  164-204    25-68  (71)
 56 PF13451 zf-trcl:  Probable zin  26.7      54  0.0012   21.4   1.9   27  174-200     2-28  (49)
 57 COG2174 RPL34A Ribosomal prote  26.3      30 0.00065   25.6   0.7   14  177-190    35-48  (93)
 58 COG5134 Uncharacterized conser  25.1      46   0.001   28.5   1.7   32  152-183    14-49  (272)
 59 cd00022 BIR Baculoviral inhibi  25.1 1.3E+02  0.0029   19.9   3.8   40  164-204    23-66  (69)
 60 PRK13731 conjugal transfer sur  24.9   2E+02  0.0042   25.1   5.5   45  137-185    50-104 (243)
 61 TIGR00373 conserved hypothetic  24.1      59  0.0013   26.0   2.1   49  138-187    72-120 (158)
 62 PF05413 Peptidase_C34:  Putati  23.2      67  0.0015   23.3   2.0   89    7-123     2-90  (92)
 63 PRK03922 hypothetical protein;  21.8   1E+02  0.0022   23.6   2.8   38  150-187    21-60  (113)
 64 PF15412 Nse4-Nse3_bdg:  Bindin  21.8      31 0.00068   22.8   0.1   36  146-181     2-37  (56)
 65 PF06107 DUF951:  Bacterial pro  21.6      48   0.001   22.4   0.9   15  173-187    28-42  (57)
 66 PF13717 zinc_ribbon_4:  zinc-r  21.3      58  0.0013   19.5   1.2   14  173-186    22-35  (36)
 67 PLN02748 tRNA dimethylallyltra  21.0      51  0.0011   31.3   1.3   26  177-202   419-445 (468)
 68 PF03884 DUF329:  Domain of unk  21.0      46   0.001   22.4   0.8   13  175-187     1-13  (57)
 69 PF08209 Sgf11:  Sgf11 (transcr  20.9      60  0.0013   19.4   1.2   19  175-193     3-21  (33)
 70 PRK05452 anaerobic nitric oxid  20.6 1.1E+02  0.0023   29.0   3.4   53  139-192   373-441 (479)
 71 PF00653 BIR:  Inhibitor of Apo  20.3 1.2E+02  0.0025   20.5   2.7   39  164-203    25-67  (70)
 72 PF08782 c-SKI_SMAD_bind:  c-SK  20.2      38 0.00083   25.2   0.3   25  167-191    19-43  (96)
 73 PF14300 DUF4375:  Domain of un  20.1      55  0.0012   24.7   1.1   18   33-50    106-123 (123)
 74 PF10571 UPF0547:  Uncharacteri  20.1      51  0.0011   18.5   0.7   12  176-187    14-25  (26)

No 1  
>KOG3288 consensus OTU-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=100.00  E-value=2.9e-68  Score=448.66  Aligned_cols=201  Identities=57%  Similarity=1.023  Sum_probs=191.8

Q ss_pred             CCCcEEEEEeCCCCchhhHHHHHHhhcCCCc-hHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHH
Q 028455            1 MEGIIVRRVIPSDNSCLFNAVGYVMEHDKNK-APELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELS   79 (208)
Q Consensus         1 ~~~~l~~~~ip~DGnCLFrAis~~l~~~~~~-~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~   79 (208)
                      |+|.|.+|+||+||||||+||+|.+.+.... ..+||++||..+.+||+.|+++|||++..+||.||+++.+|||+|||+
T Consensus       106 ~~gvl~~~vvp~DNSCLF~ai~yv~~k~~~~~~~elR~iiA~~Vasnp~~yn~AiLgK~n~eYc~WI~k~dsWGGaIEls  185 (307)
T KOG3288|consen  106 GEGVLSRRVVPDDNSCLFTAIAYVIFKQVSNRPYELREIIAQEVASNPDKYNDAILGKPNKEYCAWILKMDSWGGAIELS  185 (307)
T ss_pred             ccceeEEEeccCCcchhhhhhhhhhcCccCCCcHHHHHHHHHHHhcChhhhhHHHhCCCcHHHHHHHccccccCceEEee
Confidence            5789999999999999999999999987543 469999999999999999999999999999999999999999999999


Q ss_pred             HHHHhhCceEEEEECCCCceeEeCCCCCCCCeEEEEEcCccceeeecCCCCCCCCCCCeeeeeCCCCCcchHHHHHHHHH
Q 028455           80 ILADYYGREIAAYDIQTTRCDLYGQEKKYSERVMLIYDGLHYDALAISPFEGAPEEFDQTIFPVQKGRTIGPAEDLALKL  159 (208)
Q Consensus        80 als~~~~~~I~V~d~~~~~~~~fg~~~~~~~~i~llY~G~HYD~l~~~~~~~~~~~~d~~~f~~~~~~~~~~~~~~a~~l  159 (208)
                      |||++|+++|+|+|+++.++++||+++++..|++|+|+|+|||++++++.  .|.+.|.|+||.+|    +.++.+|++|
T Consensus       186 ILS~~ygveI~vvDiqt~rid~fged~~~~~rv~llydGIHYD~l~m~~~--~~~~~~~tifp~~d----d~v~~~alqL  259 (307)
T KOG3288|consen  186 ILSDYYGVEICVVDIQTVRIDRFGEDKNFDNRVLLLYDGIHYDPLAMNEF--KPTDVDNTIFPVSD----DTVLTQALQL  259 (307)
T ss_pred             eehhhhceeEEEEecceeeehhcCCCCCCCceEEEEecccccChhhhccC--CccCCccccccccc----chHHHHHHHH
Confidence            99999999999999999999999999999999999999999999999976  57778899999999    5678999999


Q ss_pred             HHHHhhCCCccccCCceeeccccCCCcCCHHHHHHHHHhhCCCccccc
Q 028455          160 VKEQQRKKTYTDTANFTLRCGVCQIGVIGQKEAVEHAQATGHVNFQEY  207 (208)
Q Consensus       160 ~~~~~~~~~~t~t~~~~~~C~~c~~~~~g~~~a~~ha~~tgH~~F~e~  207 (208)
                      |++||++||||||++|+|||.+|+..|+||++|++||++|||+||+|+
T Consensus       260 a~~~k~~r~ytdt~~ftlRC~~Cq~glvGq~ea~eHA~~TGH~nFge~  307 (307)
T KOG3288|consen  260 ASELKRTRYYTDTAKFTLRCMVCQMGLVGQKEAAEHAKATGHVNFGEY  307 (307)
T ss_pred             HHHHHhcceeccccceEEEeeecccceeeHHHHHHHHHhcCCCccccC
Confidence            999999999999999999999999999999999999999999999996


No 2  
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=100.00  E-value=1.4e-36  Score=260.68  Aligned_cols=189  Identities=29%  Similarity=0.518  Sum_probs=166.2

Q ss_pred             eCCCCchhhHHHHHHhhcCCCchHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCccc-CHHHHHHHHHhhCce
Q 028455           10 IPSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWG-GAIELSILADYYGRE   88 (208)
Q Consensus        10 ip~DGnCLFrAis~~l~~~~~~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WG-G~iEL~als~~~~~~   88 (208)
                      .-+|++|+|++.++.++..  ...+||.+|+..+.+|||.|+.++++.|.-.|+.||.++..|| |+||+.++|..+++.
T Consensus       117 ~~~d~srl~q~~~~~l~~a--sv~~lrE~vs~Ev~snPDl~n~~i~~~~~i~y~~~i~k~d~~~dG~ieia~iS~~l~v~  194 (306)
T COG5539         117 GQDDNSRLFQAERYSLRDA--SVAKLREVVSLEVLSNPDLYNPAILEIDVIAYATWIVKPDSQGDGCIEIAIISDQLPVR  194 (306)
T ss_pred             CCCchHHHHHHHHhhhhhh--hHHHHHHHHHHHHhhCccccchhhcCcchHHHHHhhhccccCCCceEEEeEecccccee
Confidence            3467999999999999864  6789999999999999999999999999999999999999999 999999999999999


Q ss_pred             EEEEECCCCceeEeCCCCCCCCeEEEEEcCccceeeecCCCCCCCCCCCeeeeeCCCCCcchHHHHHHHHHHHHHhhCCC
Q 028455           89 IAAYDIQTTRCDLYGQEKKYSERVMLIYDGLHYDALAISPFEGAPEEFDQTIFPVQKGRTIGPAEDLALKLVKEQQRKKT  168 (208)
Q Consensus        89 I~V~d~~~~~~~~fg~~~~~~~~i~llY~G~HYD~l~~~~~~~~~~~~d~~~f~~~~~~~~~~~~~~a~~l~~~~~~~~~  168 (208)
                      |+++++.+.+.++|++.. +..++.++|+|+|||.....-.+ ..+..+.-.|+.+|     .+...+++||+-|+..+|
T Consensus       195 i~~Vdv~~~~~dr~~~~~-~~q~~~i~f~g~hfD~~t~~m~~-~dt~~ne~~~~a~~-----g~~~ei~qLas~lk~~~~  267 (306)
T COG5539         195 IHVVDVDKDSEDRYNSHP-YVQRISILFTGIHFDEETLAMVL-WDTYVNEVLFDASD-----GITIEIQQLASLLKNPHY  267 (306)
T ss_pred             eeeeecchhHHhhccCCh-hhhhhhhhhcccccchhhhhcch-HHHHHhhhcccccc-----cchHHHHHHHHHhcCceE
Confidence            999999999999999887 67888899999999999865321 12223344555554     245667789999999999


Q ss_pred             ccccCCceeeccccCCCcCCHHHHHHHHHhhCCCccccc
Q 028455          169 YTDTANFTLRCGVCQIGVIGQKEAVEHAQATGHVNFQEY  207 (208)
Q Consensus       169 ~t~t~~~~~~C~~c~~~~~g~~~a~~ha~~tgH~~F~e~  207 (208)
                      ||||+++++||+.||+.|.|++++-+||..|||+||+|.
T Consensus       268 ~~nT~~~~ik~n~c~~~~~~e~~~~~Ha~a~GH~n~~~d  306 (306)
T COG5539         268 YTNTASPSIKCNICGTGFVGEKDYYAHALATGHYNFGED  306 (306)
T ss_pred             EeecCCceEEeeccccccchhhHHHHHHHhhcCccccCC
Confidence            999999999999999999999999999999999999973


No 3  
>PF02338 OTU:  OTU-like cysteine protease;  InterPro: IPR003323 This is a group of proteins found primarily in viruses, eukaryotes and in the pathogenic bacterium Chlamydia pneumoniae. In viruses they are annotated as replicase or RNA-dependent RNA polymerase. The eukaryotic sequences are related to the Ovarian Tumour (OTU) gene in Drosophila, cezanne deubiquitinating peptidase and tumor necrosis factor, alpha-induced protein 3 (MEROPS peptidase family C64) and otubain 1 and otubain 2 (MEROPS peptidase family C65).  None of these proteins has a known biochemical function but low sequence similarity with the polyprotein regions of arteriviruses, and conserved cysteine and histidine, and possibly the aspartate, residues suggests that those not yet recognised as peptidases could possess cysteine protease activity [].; PDB: 2VFJ_C 3DKB_F 3PHW_A 3PHU_B 3PHX_A 3BY4_A 3C0R_C 3PRM_C 3PRP_C 3ZRH_A ....
Probab=99.94  E-value=1.2e-26  Score=177.50  Aligned_cols=104  Identities=30%  Similarity=0.544  Sum_probs=86.0

Q ss_pred             CCCCchhhHHHHHHhh----cCCCchHHHHHHHHHHHh-cChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHhh
Q 028455           11 PSDNSCLFNAVGYVME----HDKNKAPELRQVIAATVA-SDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYY   85 (208)
Q Consensus        11 p~DGnCLFrAis~~l~----~~~~~~~~lR~~va~~I~-~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~~   85 (208)
                      ||||||||||||++|+    +++..|.+||+.|+++|+ .|++.| +.++...      +|+++++|||++||+|||.+|
T Consensus         1 pgDGnClF~Avs~~l~~~~~~~~~~~~~lR~~~~~~l~~~~~~~~-~~~~~~~------~~~~~~~Wg~~~el~a~a~~~   73 (121)
T PF02338_consen    1 PGDGNCLFRAVSDQLYGDGGGSEDNHQELRKAVVDYLRDKNRDKF-EEFLEGD------KMSKPGTWGGEIELQALANVL   73 (121)
T ss_dssp             -SSTTHHHHHHHHHHCTT-SSSTTTHHHHHHHHHHHHHTHTTTHH-HHHHHHH------HHTSTTSHEEHHHHHHHHHHH
T ss_pred             CCCccHHHHHHHHHHHHhcCCCHHHHHHHHHHHHHHHHHhccchh-hhhhhhh------hhccccccCcHHHHHHHHHHh
Confidence            8999999999999999    999999999999999999 999999 5555433      999999999999999999999


Q ss_pred             CceEEEEECCCCceeE---eCC---CCCCCCeEEEEEc------Cccc
Q 028455           86 GREIAAYDIQTTRCDL---YGQ---EKKYSERVMLIYD------GLHY  121 (208)
Q Consensus        86 ~~~I~V~d~~~~~~~~---fg~---~~~~~~~i~llY~------G~HY  121 (208)
                      +++|+|++...+....   +..   ......++.+.|.      |+||
T Consensus        74 ~~~I~v~~~~~~~~~~~~~~~~~~~~~~~~~~i~l~~~~~l~~~~~Hy  121 (121)
T PF02338_consen   74 NRPIIVYSSSDGDNVVFIKFTGKYPPLESPPPICLCYHGHLYYTGNHY  121 (121)
T ss_dssp             TSEEEEECETTTBEEEEEEESCEESTTTTTTSEEEEEETEEEEETTEE
T ss_pred             CCeEEEEEcCCCCccceeeecCccccCCCCCeEEEEEcCCccCCCCCC
Confidence            9999999886664322   222   2334567777775      6898


No 4  
>KOG2606 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=99.92  E-value=1.3e-25  Score=193.67  Aligned_cols=122  Identities=25%  Similarity=0.428  Sum_probs=105.5

Q ss_pred             cEEEEEeCCCCchhhHHHHHHhhcC---CCchHHHHHHHHHHHhcChhcchhhhcC----------CCHHHHHHHhCCCC
Q 028455            4 IIVRRVIPSDNSCLFNAVGYVMEHD---KNKAPELRQVIAATVASDPVKYSEAFLG----------KSNQEYCSWIQDPE   70 (208)
Q Consensus         4 ~l~~~~ip~DGnCLFrAis~~l~~~---~~~~~~lR~~va~~I~~np~~y~e~~l~----------~~~~eY~~~i~~~~   70 (208)
                      .|....||+||+|||+||++||.-.   ..+...||..+|+||++|.++| .+|+-          .+|+.||+.|.++.
T Consensus       158 ~l~~~~Ip~DG~ClY~aI~hQL~~~~~~~~~v~kLR~~~a~Ymr~H~~df-~pf~~~eet~d~~~~~~f~~Yc~eI~~t~  236 (302)
T KOG2606|consen  158 GLKMFDIPADGHCLYAAISHQLKLRSGKLLSVQKLREETADYMREHVEDF-LPFLLDEETGDSLGPEDFDKYCREIRNTA  236 (302)
T ss_pred             cCccccCCCCchhhHHHHHHHHHhccCCCCcHHHHHHHHHHHHHHHHHHh-hhHhcCccccccCCHHHHHHHHHHhhhhc
Confidence            4788999999999999999999532   3577899999999999999999 56652          14999999999999


Q ss_pred             cccCHHHHHHHHHhhCceEEEEECCCCceeEeCCCCCCCCeEEEEE------cCccceeeecC
Q 028455           71 KWGGAIELSILADYYGREIAAYDIQTTRCDLYGQEKKYSERVMLIY------DGLHYDALAIS  127 (208)
Q Consensus        71 ~WGG~iEL~als~~~~~~I~V~d~~~~~~~~fg~~~~~~~~i~llY------~G~HYD~l~~~  127 (208)
                      .|||+|||.|||..|++||.||..+.+ +..||+..+..+|++|+|      .|.||+++.+.
T Consensus       237 ~WGgelEL~AlShvL~~PI~Vy~~~~p-~~~~geey~kd~pL~lvY~rH~y~LGeHYNS~~~~  298 (302)
T KOG2606|consen  237 AWGGELELKALSHVLQVPIEVYQADGP-ILEYGEEYGKDKPLILVYHRHAYGLGEHYNSVTPL  298 (302)
T ss_pred             cccchHHHHHHHHhhccCeEEeecCCC-ceeechhhCCCCCeeeehHHhHHHHHhhhcccccc
Confidence            999999999999999999999997755 889998876568888887      46799998764


No 5  
>PF10275 Peptidase_C65:  Peptidase C65 Otubain;  InterPro: IPR019400 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].   This family of proteins is a highly specific ubiquitin iso-peptidase that removes ubiquitin from proteins. The modification of cellular proteins by ubiquitin (Ub) is an important event that underlies protein stability and function in eukaryotes, as it is a dynamic and reversible process. Otubain carries several key conserved domains: (i) the OTU (ovarian tumour domain) in which there is an active cysteine protease triad (ii) a nuclear localisation signal, (iii) a Ub interaction motif (UIM)-like motif phi-xx-A-xxxs-xx-Ac (where phi indicates an aromatic amino acid, x indicates any amino acid and Ac indicates an acidic amino acid), (iv) a Ub-associated (UBA)-like domain and (v) the LxxLL motif. ; PDB: 4DDG_C 3VON_O 2ZFY_A 4DHZ_A 4DDI_C 1TFF_A 4DHJ_I 4DHI_B.
Probab=99.61  E-value=6e-15  Score=126.38  Aligned_cols=92  Identities=23%  Similarity=0.350  Sum_probs=69.6

Q ss_pred             HHHHHHHHHHhcChhcchhhhcC----CCHHHHHH-HhCCCCcccCHHHHHHHHHhhCceEEEEECCCC------ceeEe
Q 028455           34 ELRQVIAATVASDPVKYSEAFLG----KSNQEYCS-WIQDPEKWGGAIELSILADYYGREIAAYDIQTT------RCDLY  102 (208)
Q Consensus        34 ~lR~~va~~I~~np~~y~e~~l~----~~~~eY~~-~i~~~~~WGG~iEL~als~~~~~~I~V~d~~~~------~~~~f  102 (208)
                      .||..++.||+.|++.| ++|+.    .++++||+ .+...+.-.+++.|.|||++++++|.|+-++..      ....|
T Consensus       141 flRLlts~~l~~~~d~y-~~fi~~~~~~tve~~C~~~Vep~~~Ead~v~i~ALa~aL~v~i~v~yld~~~~~~~~~~~~~  219 (244)
T PF10275_consen  141 FLRLLTSAYLKSNSDEY-EPFIDGLEYLTVEEFCSQEVEPMGKEADHVQIIALAQALGVPIRVEYLDRSVEGDEVNRHEF  219 (244)
T ss_dssp             HHHHHHHHHHHHTHHHH-GGGSSTT--S-HHHHHHHHTSSTT--B-HHHHHHHHHHHT--EEEEESSSSGCSTTSEEEEE
T ss_pred             HHHHHHHHHHHhhHHHH-hhhhcccccCCHHHHHHhhcccccccchhHHHHHHHHHhCCeEEEEEecCCCCCCccccccC
Confidence            58999999999999999 77875    78999997 577789999999999999999999999876632      23445


Q ss_pred             CCC-CCCCCeEEEEEcCccceeeec
Q 028455          103 GQE-KKYSERVMLIYDGLHYDALAI  126 (208)
Q Consensus       103 g~~-~~~~~~i~llY~G~HYD~l~~  126 (208)
                      .+. .+...+|.|||...|||++++
T Consensus       220 ~~~~~~~~~~i~LLyrpgHYdIly~  244 (244)
T PF10275_consen  220 PPDNESQEPQITLLYRPGHYDILYP  244 (244)
T ss_dssp             S-SSTTSS-SEEEEEETBEEEEEEE
T ss_pred             CCccCCCCCEEEEEEcCCccccccC
Confidence            432 224678899999999999985


No 6  
>KOG3991 consensus Uncharacterized conserved protein [Function unknown]
Probab=99.57  E-value=4.3e-15  Score=124.70  Aligned_cols=94  Identities=21%  Similarity=0.319  Sum_probs=75.6

Q ss_pred             HHHHHHHHHHHhcChhcchhhhcC--CCHHHHHHHhCCC-CcccCHHHHHHHHHhhCceEEEEECCCCceeEeCCC---C
Q 028455           33 PELRQVIAATVASDPVKYSEAFLG--KSNQEYCSWIQDP-EKWGGAIELSILADYYGREIAAYDIQTTRCDLYGQE---K  106 (208)
Q Consensus        33 ~~lR~~va~~I~~np~~y~e~~l~--~~~~eY~~~i~~~-~~WGG~iEL~als~~~~~~I~V~d~~~~~~~~fg~~---~  106 (208)
                      ..||..++.+|++|++.| ++|++  ++.++||..--.| ..-.|+|+|.|||+++++.|.|..++.+.....+.-   .
T Consensus       157 ~ylRLvtS~~ik~~adfy-~pFI~e~~tV~~fC~~eVEPm~kesdhi~I~ALs~Al~i~irVey~dr~~~~~~~hH~fpe  235 (256)
T KOG3991|consen  157 MYLRLVTSGFIKSNADFY-QPFIDEGMTVKAFCTQEVEPMYKESDHIHITALSQALGIRIRVEYVDRGSGDTVNHHDFPE  235 (256)
T ss_pred             HHHHHHHHHHHhhChhhh-hccCCCCCcHHHHHHhhcchhhhccCceeHHHHHhhhCceEEEEEecCCCCCCCCCCcCcc
Confidence            469999999999999999 88884  6999999975444 788999999999999999999988765432222211   2


Q ss_pred             CCCCeEEEEEcCccceeeecC
Q 028455          107 KYSERVMLIYDGLHYDALAIS  127 (208)
Q Consensus       107 ~~~~~i~llY~G~HYD~l~~~  127 (208)
                      +..++|.|||...|||+|+++
T Consensus       236 ~s~P~I~LLYrpGHYdilY~~  256 (256)
T KOG3991|consen  236 ASAPEIYLLYRPGHYDILYKK  256 (256)
T ss_pred             ccCceEEEEecCCccccccCC
Confidence            246789999999999999864


No 7  
>KOG2605 consensus OTU (ovarian tumor)-like cysteine protease [Signal transduction mechanisms; Posttranslational modification, protein turnover, chaperones]
Probab=99.44  E-value=4.1e-14  Score=127.66  Aligned_cols=120  Identities=19%  Similarity=0.222  Sum_probs=95.2

Q ss_pred             EEEEEeCCCCchhhHHHHHHhhcCCCchHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHH-
Q 028455            5 IVRRVIPSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILAD-   83 (208)
Q Consensus         5 l~~~~ip~DGnCLFrAis~~l~~~~~~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~-   83 (208)
                      +..+.|..||||+|||+++|++++.+.|..+|+.+++++..+++.| +-++.+++.+|++.++.++.||.+||++|+|. 
T Consensus       218 ~e~~Kv~edGsC~fra~aDQvy~d~e~~~~~~~~~~dq~~~e~~~~-~~~vt~~~~~y~k~kr~~~~~gnhie~Qa~a~~  296 (371)
T KOG2605|consen  218 FEYKKVVEDGSCLFRALADQVYGDDEQHDHNRRECVDQLKKERDFY-EDYVTEDFTSYIKRKRADGEPGNHIEQQAAADI  296 (371)
T ss_pred             hhhhhcccCCchhhhccHHHhhcCHHHHHHHHHHHHHHHhhccccc-ccccccchhhcccccccCCCCcchHHHhhhhhh
Confidence            4567899999999999999999999999999999999999999999 78889999999999999999999999999995 


Q ss_pred             --hhCceEEEEECCCCceeEeCCCCCCCCeEEEE-E---cCccceeeecC
Q 028455           84 --YYGREIAAYDIQTTRCDLYGQEKKYSERVMLI-Y---DGLHYDALAIS  127 (208)
Q Consensus        84 --~~~~~I~V~d~~~~~~~~fg~~~~~~~~i~ll-Y---~G~HYD~l~~~  127 (208)
                        ....++.+....+..+..-.+..  ..++-.+ |   .-.||+.++..
T Consensus       297 ~~~~~~~~~~~~~~~t~~~~~~~~~--~~~~~~~~~n~~~~~h~~~~~~~  344 (371)
T KOG2605|consen  297 YEEIEKPLNITSFKDTCYIQTPPAI--EESVKMEKYNFWVEVHYNTARHS  344 (371)
T ss_pred             hhhccccceeecccccceeccCccc--ccchhhhhhcccchhhhhhcccc
Confidence              45555555555555333332222  2222222 3   44699998875


No 8  
>COG5539 Predicted cysteine protease (OTU family) [Posttranslational modification, protein turnover, chaperones]
Probab=99.03  E-value=1.1e-10  Score=101.34  Aligned_cols=115  Identities=16%  Similarity=0.047  Sum_probs=88.6

Q ss_pred             EEEEEeCCCCchhhHHHHHHhhcC-----CCchHHHHHHHHHHHhcChhcchhhhc-C------CCHHHHHHHhCCCCcc
Q 028455            5 IVRRVIPSDNSCLFNAVGYVMEHD-----KNKAPELRQVIAATVASDPVKYSEAFL-G------KSNQEYCSWIQDPEKW   72 (208)
Q Consensus         5 l~~~~ip~DGnCLFrAis~~l~~~-----~~~~~~lR~~va~~I~~np~~y~e~~l-~------~~~~eY~~~i~~~~~W   72 (208)
                      ++--.++|||+|+|-+||++|.-.     -+....+|-.=..|...+.+.| ..++ +      .+|++|++.|+.+..|
T Consensus       171 i~k~d~~~dG~ieia~iS~~l~v~i~~Vdv~~~~~dr~~~~~~~q~~~i~f-~g~hfD~~t~~m~~~dt~~ne~~~~a~~  249 (306)
T COG5539         171 IVKPDSQGDGCIEIAIISDQLPVRIHVVDVDKDSEDRYNSHPYVQRISILF-TGIHFDEETLAMVLWDTYVNEVLFDASD  249 (306)
T ss_pred             hhccccCCCceEEEeEeccccceeeeeeecchhHHhhccCChhhhhhhhhh-cccccchhhhhcchHHHHHhhhcccccc
Confidence            455678999999999999999632     2345777877778887777777 4443 1      3899999999999999


Q ss_pred             cCHHHHHHHHHhhCceEEEEECCCCceeEeCCCCCCCCeEEEE--E-----cCccceee
Q 028455           73 GGAIELSILADYYGREIAAYDIQTTRCDLYGQEKKYSERVMLI--Y-----DGLHYDAL  124 (208)
Q Consensus        73 GG~iEL~als~~~~~~I~V~d~~~~~~~~fg~~~~~~~~i~ll--Y-----~G~HYD~l  124 (208)
                      |+.||+++||..|++++++++.... +.+|++-.  ..++.-+  |     .| ||+.+
T Consensus       250 g~~~ei~qLas~lk~~~~~~nT~~~-~ik~n~c~--~~~~~e~~~~~Ha~a~G-H~n~~  304 (306)
T COG5539         250 GITIEIQQLASLLKNPHYYTNTASP-SIKCNICG--TGFVGEKDYYAHALATG-HYNFG  304 (306)
T ss_pred             cchHHHHHHHHHhcCceEEeecCCc-eEEeeccc--cccchhhHHHHHHHhhc-Ccccc
Confidence            9999999999999999999997776 66776643  2233211  2     56 99976


No 9  
>PF05415 Peptidase_C36:  Beet necrotic yellow vein furovirus-type papain-like endopeptidase;  InterPro: IPR008746 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad [].  This group of cysteine peptidases correspond to MEROPS peptidase family C36 (clan CA). The type example is beet necrotic yellow vein furovirus-type papain-like endopeptidase (beet necrotic yellow vein virus), which is involved in processing the viral polyprotein.
Probab=96.05  E-value=0.017  Score=42.37  Aligned_cols=78  Identities=13%  Similarity=0.319  Sum_probs=53.8

Q ss_pred             eCCCCchhhHHHHHHhhcCCCchHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCC--CCcccCHHHHHHHHHhhCc
Q 028455           10 IPSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQD--PEKWGGAIELSILADYYGR   87 (208)
Q Consensus        10 ip~DGnCLFrAis~~l~~~~~~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~--~~~WGG~iEL~als~~~~~   87 (208)
                      +..|||||--|||.+|.-+.+       .+-+-|+.|..         +.+.||.|.++  |.+|-+-   ..+|+.+++
T Consensus         3 ~sR~NNCLVVAis~~L~~T~e-------~l~~~M~An~~---------~i~~y~~W~r~~~~STW~DC---~mFA~~LkV   63 (104)
T PF05415_consen    3 ASRPNNCLVVAISECLGVTLE-------KLDNLMQANVS---------TIKKYHTWLRKKRPSTWDDC---RMFADALKV   63 (104)
T ss_pred             ccCCCCeEeehHHHHhcchHH-------HHHHHHHhhHH---------HHHHHHHHHhcCCCCcHHHH---HHHHHhhee
Confidence            567999999999999985431       22334555433         36789999875  6899775   478999999


Q ss_pred             eEEEEEC-CCC-ceeEeCCCC
Q 028455           88 EIAAYDI-QTT-RCDLYGQEK  106 (208)
Q Consensus        88 ~I~V~d~-~~~-~~~~fg~~~  106 (208)
                      .|.+--. +++ ....|+++.
T Consensus        64 sm~vkV~~~~~~~l~~~~d~~   84 (104)
T PF05415_consen   64 SMQVKVLSDKPYDLLYFVDGA   84 (104)
T ss_pred             EEEEEEcCCCCceeeEeecCc
Confidence            9988543 333 344566554


No 10 
>PF02148 zf-UBP:  Zn-finger in ubiquitin-hydrolases and other protein;  InterPro: IPR001607 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.  This entry represents UBP-type zinc finger domains, which display some similarity with the Zn-binding domain of the insulinase family. 
The UBP-type zinc finger domain is found only in a small subfamily of ubiquitin C-terminal hydrolases (deubiquitinases or UBP). All members of this subfamily are isopeptidase-T, which are known to cleave isopeptide bonds between ubiquitin moieties. Proteins containing a UBP zinc finger include: Homo sapiens (Human) deubiquitinating enzyme 13 (UBPD); Human deubiquitinating enzyme 5 (UBP5); Dictyostelium discoideum (Slime mold) deubiquitinating enzyme A (UBPA); Saccharomyces cerevisiae (Baker's yeast) deubiquitinating enzyme 8 (UBP8); Yeast deubiquitinating enzyme 14 (UBP14). More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding; PDB: 3GV4_A 3PHD_B 3C5K_A 2UZG_A 3IHP_B 2G43_B 2G45_D 2I50_A 3MHH_A 3MHS_A ....
Probab=87.06  E-value=0.28  Score=33.26  Aligned_cols=30  Identities=27%  Similarity=0.313  Sum_probs=22.2

Q ss_pred             ceeeccccCCCcCCH---HHHHHHHHhhCCCcc
Q 028455          175 FTLRCGVCQIGVIGQ---KEAVEHAQATGHVNF  204 (208)
Q Consensus       175 ~~~~C~~c~~~~~g~---~~a~~ha~~tgH~~F  204 (208)
                      -...|+.||+.+-|.   .-|.+|+++|||.=|
T Consensus        10 ~lw~CL~Cg~~~C~~~~~~Ha~~H~~~~~H~l~   42 (63)
T PF02148_consen   10 NLWLCLTCGYVGCGRYSNGHALKHYKETGHPLA   42 (63)
T ss_dssp             SEEEETTTS-EEETTTSTSHHHHHHHHHT--EE
T ss_pred             ceEEeCCCCcccccCCcCcHHHHhhcccCCeEE
Confidence            345799999999985   679999999999633


No 11 
>PF12874 zf-met:  Zinc-finger of C2H2 type; PDB: 1ZU1_A 2KVG_A.
Probab=84.51  E-value=0.41  Score=26.01  Aligned_cols=25  Identities=16%  Similarity=0.504  Sum_probs=21.2

Q ss_pred             eeccccCCCcCCHHHHHHHHHhhCC
Q 028455          177 LRCGVCQIGVIGQKEAVEHAQATGH  201 (208)
Q Consensus       177 ~~C~~c~~~~~g~~~a~~ha~~tgH  201 (208)
                      ..|..|++.+.++...+.|-+...|
T Consensus         1 ~~C~~C~~~f~s~~~~~~H~~s~~H   25 (25)
T PF12874_consen    1 FYCDICNKSFSSENSLRQHLRSKKH   25 (25)
T ss_dssp             EEETTTTEEESSHHHHHHHHTTHHH
T ss_pred             CCCCCCCCCcCCHHHHHHHHCcCCC
Confidence            3699999999999999999876543


No 12 
>PF12756 zf-C2H2_2:  C2H2 type zinc-finger (2 copies); PDB: 2DMI_A.
Probab=83.80  E-value=1.1  Score=31.77  Aligned_cols=30  Identities=20%  Similarity=0.411  Sum_probs=26.5

Q ss_pred             eeeccccCCCcCCHHHHHHHHHhhCCCccc
Q 028455          176 TLRCGVCQIGVIGQKEAVEHAQATGHVNFQ  205 (208)
Q Consensus       176 ~~~C~~c~~~~~g~~~a~~ha~~tgH~~F~  205 (208)
                      ..+|..|++.+.....-+.|-...+|....
T Consensus        50 ~~~C~~C~~~f~s~~~l~~Hm~~~~H~~~~   79 (100)
T PF12756_consen   50 SFRCPYCNKTFRSREALQEHMRSKHHKKRN   79 (100)
T ss_dssp             SEEBSSSS-EESSHHHHHHHHHHTTTTC-S
T ss_pred             CCCCCccCCCCcCHHHHHHHHcCccCCCcc
Confidence            689999999999999999999999998874


No 13 
>PF00096 zf-C2H2:  Zinc finger, C2H2 type;  InterPro: IPR007087 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.  The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger: #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C], where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. 
The final position can be either His or Cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino-terminal part of the helix binds the major groove in DNA-binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG, but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter. This entry represents the classical C2H2 zinc finger domain.  More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding, 0005622 intracellular; PDB: 2D9H_A 2EPC_A 1SP1_A 1VA3_A 2WBT_B 2ELR_A 2YTP_A 2YTT_A 1VA1_A 2ELO_A ....
Probab=79.08  E-value=1.2  Score=23.48  Aligned_cols=21  Identities=14%  Similarity=0.468  Sum_probs=18.8

Q ss_pred             eccccCCCcCCHHHHHHHHHh
Q 028455          178 RCGVCQIGVIGQKEAVEHAQA  198 (208)
Q Consensus       178 ~C~~c~~~~~g~~~a~~ha~~  198 (208)
                      +|.+|++.+.....-..|-+.
T Consensus         2 ~C~~C~~~f~~~~~l~~H~~~   22 (23)
T PF00096_consen    2 KCPICGKSFSSKSNLKRHMRR   22 (23)
T ss_dssp             EETTTTEEESSHHHHHHHHHH
T ss_pred             CCCCCCCccCCHHHHHHHHhH
Confidence            799999999999999998764


No 14 
>smart00290 ZnF_UBP Ubiquitin Carboxyl-terminal Hydrolase-like zinc finger.
Probab=77.62  E-value=1.4  Score=28.00  Aligned_cols=29  Identities=31%  Similarity=0.401  Sum_probs=22.2

Q ss_pred             eeeccccCCCcCCH---HHHHHHHHhhCCCcc
Q 028455          176 TLRCGVCQIGVIGQ---KEAVEHAQATGHVNF  204 (208)
Q Consensus       176 ~~~C~~c~~~~~g~---~~a~~ha~~tgH~~F  204 (208)
                      .-.|+.|++..-|.   .-+..|++.|||.=+
T Consensus        11 l~~CL~C~~~~c~~~~~~h~~~H~~~t~H~~~   42 (50)
T smart00290       11 LWLCLTCGQVGCGRYQLGHALEHFEETGHPLV   42 (50)
T ss_pred             eEEecCCCCcccCCCCCcHHHHHhhhhCCCEE
Confidence            44799998777643   459999999999643


No 15 
>PF12171 zf-C2H2_jaz:  Zinc-finger double-stranded RNA-binding;  InterPro: IPR022755  This zinc finger is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ (dsRNA-binding ZF protein) zinc-fingers. The JAZ proteins are expressed in all tissues tested and localise in the nucleus, particularly the nucleolus. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localisation.  This entry represents the multiple-adjacent-C2H2 zinc finger, JAZ.; PDB: 4DGW_A 1ZR9_A.
Probab=77.23  E-value=0.45  Score=26.66  Aligned_cols=25  Identities=16%  Similarity=0.438  Sum_probs=20.7

Q ss_pred             eeccccCCCcCCHHHHHHHHHhhCC
Q 028455          177 LRCGVCQIGVIGQKEAVEHAQATGH  201 (208)
Q Consensus       177 ~~C~~c~~~~~g~~~a~~ha~~tgH  201 (208)
                      ..|..|++.|.++.....|-++..|
T Consensus         2 ~~C~~C~k~f~~~~~~~~H~~sk~H   26 (27)
T PF12171_consen    2 FYCDACDKYFSSENQLKQHMKSKKH   26 (27)
T ss_dssp             CBBTTTTBBBSSHHHHHCCTTSHHH
T ss_pred             CCcccCCCCcCCHHHHHHHHccCCC
Confidence            3599999999999999988765544


No 16 
>PF13894 zf-C2H2_4:  C2H2-type zinc finger; PDB: 2ELX_A 2EPP_A 2DLK_A 1X6H_A 2EOU_A 2EMB_A 2GQJ_A 2CSH_A 2WBT_B 2ELM_A ....
Probab=72.78  E-value=2.8  Score=21.64  Aligned_cols=21  Identities=19%  Similarity=0.517  Sum_probs=16.5

Q ss_pred             eccccCCCcCCHHHHHHHHHh
Q 028455          178 RCGVCQIGVIGQKEAVEHAQA  198 (208)
Q Consensus       178 ~C~~c~~~~~g~~~a~~ha~~  198 (208)
                      +|..|++.+....+-..|-..
T Consensus         2 ~C~~C~~~~~~~~~l~~H~~~   22 (24)
T PF13894_consen    2 QCPICGKSFRSKSELRQHMRT   22 (24)
T ss_dssp             E-SSTS-EESSHHHHHHHHHH
T ss_pred             CCcCCCCcCCcHHHHHHHHHh
Confidence            699999999999999988653


No 17 
>KOG0804 consensus Cytoplasmic Zn-finger protein BRAP2 (BRCA1 associated protein) [General function prediction only]
Probab=71.46  E-value=2.3  Score=39.87  Aligned_cols=25  Identities=32%  Similarity=0.473  Sum_probs=20.9

Q ss_pred             eccccCCCcCC---HHHHHHHHHhhCCC
Q 028455          178 RCGVCQIGVIG---QKEAVEHAQATGHV  202 (208)
Q Consensus       178 ~C~~c~~~~~g---~~~a~~ha~~tgH~  202 (208)
                      .|.+||.+.-|   +.-|++|++.|||+
T Consensus       242 icliCg~vgcgrY~eghA~rHweet~H~  269 (493)
T KOG0804|consen  242 ICLICGNVGCGRYKEGHARRHWEETGHC  269 (493)
T ss_pred             EEEEccceecccccchhHHHHHHhhcce
Confidence            67778777665   88999999999996


No 18 
>smart00355 ZnF_C2H2 zinc finger.
Probab=71.24  E-value=3  Score=21.71  Aligned_cols=21  Identities=24%  Similarity=0.403  Sum_probs=18.7

Q ss_pred             eeccccCCCcCCHHHHHHHHH
Q 028455          177 LRCGVCQIGVIGQKEAVEHAQ  197 (208)
Q Consensus       177 ~~C~~c~~~~~g~~~a~~ha~  197 (208)
                      .+|..|++.+.+...-..|-.
T Consensus         1 ~~C~~C~~~f~~~~~l~~H~~   21 (26)
T smart00355        1 YRCPECGKVFKSKSALKEHMR   21 (26)
T ss_pred             CCCCCCcchhCCHHHHHHHHH
Confidence            369999999999999999976


No 19 
>PF05412 Peptidase_C33:  Equine arterivirus Nsp2-type cysteine proteinase;  InterPro: IPR008743 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  This group of cysteine peptidases corresponds to MEROPS peptidase family C33 (clan CA). The type example is equine arteritis virus Nsp2-type cysteine proteinase, which is involved in viral polyprotein processing.; GO: 0016032 viral reproduction, 0019082 viral protein processing
Probab=67.77  E-value=3.6  Score=31.18  Aligned_cols=84  Identities=15%  Similarity=0.255  Sum_probs=47.6

Q ss_pred             eCCCCchhhHHHHHHhhcCCCchHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHhhCceE
Q 028455           10 IPSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYYGREI   89 (208)
Q Consensus        10 ip~DGnCLFrAis~~l~~~~~~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~~~~~I   89 (208)
                      =|+||+|-.|.|+..+++-.               .  ..|....        -+.-+.+..|-++-.|.-+=..++.|.
T Consensus         4 PP~DG~CG~H~i~aI~n~m~---------------~--~~~t~~l--------~~~~r~~d~W~~dedl~~~iq~l~lPa   58 (108)
T PF05412_consen    4 PPGDGSCGWHCIAAIMNHMM---------------G--GEFTTPL--------PQRNRPSDDWADDEDLYQVIQSLRLPA   58 (108)
T ss_pred             CCCCCchHHHHHHHHHHHhh---------------c--cCCCccc--------cccCCChHHccChHHHHHHHHHccCce
Confidence            38999999999998877421               1  1131111        112223456777766666656666666


Q ss_pred             EEEECCCCceeEeCCCCCCCCeEEEEEcCccceeeecCC
Q 028455           90 AAYDIQTTRCDLYGQEKKYSERVMLIYDGLHYDALAISP  128 (208)
Q Consensus        90 ~V~d~~~~~~~~fg~~~~~~~~i~llY~G~HYD~l~~~~  128 (208)
                      .+......          ..-+-++.-+|.|+.+-....
T Consensus        59 t~~~~~~C----------p~ArYv~~l~~qHW~V~~~~g   87 (108)
T PF05412_consen   59 TLDRNGAC----------PHARYVLKLDGQHWEVSVRKG   87 (108)
T ss_pred             eccCCCCC----------CCCEEEEEecCceEEEEEcCC
Confidence            55432211          123334447888888766653


No 20 
>PF13912 zf-C2H2_6:  C2H2-type zinc finger; PDB: 1JN7_A 1FU9_A 2L1O_A 1NJQ_A 2EN8_A 2EMM_A 1FV5_A 1Y0J_B 2L6Z_B.
Probab=64.50  E-value=5.2  Score=21.86  Aligned_cols=21  Identities=19%  Similarity=0.384  Sum_probs=18.5

Q ss_pred             eeccccCCCcCCHHHHHHHHH
Q 028455          177 LRCGVCQIGVIGQKEAVEHAQ  197 (208)
Q Consensus       177 ~~C~~c~~~~~g~~~a~~ha~  197 (208)
                      .+|..|++.|.....=.+|-+
T Consensus         2 ~~C~~C~~~F~~~~~l~~H~~   22 (27)
T PF13912_consen    2 FECDECGKTFSSLSALREHKR   22 (27)
T ss_dssp             EEETTTTEEESSHHHHHHHHC
T ss_pred             CCCCccCCccCChhHHHHHhH
Confidence            589999999999999888863


No 21 
>PF05379 Peptidase_C23:  Carlavirus endopeptidase ;  InterPro: IPR008041 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  This group of cysteine peptidases belongs to the MEROPS peptidase family C23 (clan CA). The type example is Carlavirus (apple stem pitting virus) endopeptidase, which is thought to play a role in the post-translational cleavage of the high molecular weight primary translation products of the virus.; GO: 0003968 RNA-directed RNA polymerase activity, 0016817 hydrolase activity, acting on acid anhydrides
Probab=60.42  E-value=33  Score=24.99  Aligned_cols=16  Identities=19%  Similarity=0.536  Sum_probs=14.0

Q ss_pred             CchhhHHHHHHhhcCC
Q 028455           14 NSCLFNAVGYVMEHDK   29 (208)
Q Consensus        14 GnCLFrAis~~l~~~~   29 (208)
                      |.|..||||.+|.+..
T Consensus         3 N~Cvi~AiA~aL~R~~   18 (89)
T PF05379_consen    3 NGCVIRAIAEALGRRE   18 (89)
T ss_pred             ccchhHHHHHHhCCCH
Confidence            7899999999998753


No 22 
>smart00451 ZnF_U1 U1-like zinc finger. Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins.
Probab=59.45  E-value=5.4  Score=23.08  Aligned_cols=25  Identities=16%  Similarity=0.481  Sum_probs=20.7

Q ss_pred             eeccccCCCcCCHHHHHHHHHhhCC
Q 028455          177 LRCGVCQIGVIGQKEAVEHAQATGH  201 (208)
Q Consensus       177 ~~C~~c~~~~~g~~~a~~ha~~tgH  201 (208)
                      ..|..|++.|.+......|-..--|
T Consensus         4 ~~C~~C~~~~~~~~~~~~H~~gk~H   28 (35)
T smart00451        4 FYCKLCNVTFTDEISVEAHLKGKKH   28 (35)
T ss_pred             eEccccCCccCCHHHHHHHHChHHH
Confidence            3599999999999999988766544


No 23 
>PHA03082 DNA-dependent RNA polymerase subunit; Provisional
Probab=57.44  E-value=5.8  Score=26.85  Aligned_cols=19  Identities=21%  Similarity=0.478  Sum_probs=16.0

Q ss_pred             CceeeccccCCCcCCHHHH
Q 028455          174 NFTLRCGVCQIGVIGQKEA  192 (208)
Q Consensus       174 ~~~~~C~~c~~~~~g~~~a  192 (208)
                      .|.+.|+.||..+..++..
T Consensus         2 Vf~lVCsTCGrDlSeeRy~   20 (63)
T PHA03082          2 VFQLVCSTCGRDLSEERYR   20 (63)
T ss_pred             eeeeeecccCcchhHHHHH
Confidence            3778999999999888764


No 24 
>PF05864 Chordopox_RPO7:  Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide (RPO7);  InterPro: IPR008448 DNA-directed RNA polymerases (EC 2.7.7.6; also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel. RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:  RNA polymerase I: located in the nucleoli, synthesises precursors of most ribosomal RNAs. RNA polymerase II: occurs in the nucleoplasm, synthesises mRNA precursors.  RNA polymerase III: also occurs in the nucleoplasm, synthesises the precursors of 5S ribosomal RNA, the tRNAs, and a variety of other small nuclear and cytosolic RNAs.   
Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits. This family consists of several Chordopoxvirus DNA-directed RNA polymerase 7 kDa polypeptide sequences. DNA-dependent RNA polymerase catalyses the transcription of DNA into RNA.; GO: 0003677 DNA binding, 0003899 DNA-directed RNA polymerase activity, 0006351 transcription, DNA-dependent
Probab=57.13  E-value=6.1  Score=26.73  Aligned_cols=18  Identities=22%  Similarity=0.545  Sum_probs=15.7

Q ss_pred             ceeeccccCCCcCCHHHH
Q 028455          175 FTLRCGVCQIGVIGQKEA  192 (208)
Q Consensus       175 ~~~~C~~c~~~~~g~~~a  192 (208)
                      |.+.|+.||..+..++..
T Consensus         3 f~lvCSTCGrDlSeeRy~   20 (63)
T PF05864_consen    3 FQLVCSTCGRDLSEERYR   20 (63)
T ss_pred             eeeeecccCCcchHHHHH
Confidence            778999999999888764


No 25 
>PRK10963 hypothetical protein; Provisional
Probab=54.03  E-value=10  Score=32.07  Aligned_cols=36  Identities=14%  Similarity=0.294  Sum_probs=26.0

Q ss_pred             HHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHH
Q 028455           37 QVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIEL   78 (208)
Q Consensus        37 ~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL   78 (208)
                      +.|++|+++|||.|..      ..+-+..|.-|...||.|-|
T Consensus         6 ~~V~~yL~~~PdFf~~------h~~Ll~~L~lph~~~gaVSL   41 (223)
T PRK10963          6 RAVVDYLLQNPDFFIR------NARLVEQMRVPHPVRGTVSL   41 (223)
T ss_pred             HHHHHHHHHCchHHhh------CHHHHHhccCCCCCCCeecH
Confidence            5799999999998832      34556677777667776544


No 26 
>PF13913 zf-C2HC_2:  zinc-finger of a C2HC-type
Probab=53.18  E-value=11  Score=20.76  Aligned_cols=21  Identities=14%  Similarity=0.420  Sum_probs=16.1

Q ss_pred             eeeccccCCCcCCHHHHHHHHH
Q 028455          176 TLRCGVCQIGVIGQKEAVEHAQ  197 (208)
Q Consensus       176 ~~~C~~c~~~~~g~~~a~~ha~  197 (208)
                      .+.|..||..| ++.....|..
T Consensus         2 l~~C~~CgR~F-~~~~l~~H~~   22 (25)
T PF13913_consen    2 LVPCPICGRKF-NPDRLEKHEK   22 (25)
T ss_pred             CCcCCCCCCEE-CHHHHHHHHH
Confidence            35799999999 6666777754


No 27 
>COG4049 Uncharacterized protein containing archaeal-type C2H2 Zn-finger [General function prediction only]
Probab=53.03  E-value=6.9  Score=26.46  Aligned_cols=26  Identities=23%  Similarity=0.429  Sum_probs=21.9

Q ss_pred             CCceeeccccCCCcCCHHHHHHHHHh
Q 028455          173 ANFTLRCGVCQIGVIGQKEAVEHAQA  198 (208)
Q Consensus       173 ~~~~~~C~~c~~~~~g~~~a~~ha~~  198 (208)
                      ...-++|.-||.+|+.++.-..|-.+
T Consensus        14 GE~~lrCPRC~~~FR~~K~Y~RHVNK   39 (65)
T COG4049          14 GEEFLRCPRCGMVFRRRKDYIRHVNK   39 (65)
T ss_pred             CceeeeCCchhHHHHHhHHHHHHhhH
Confidence            34457999999999999999999754


No 28 
>PHA00616 hypothetical protein
Probab=52.66  E-value=7.6  Score=24.81  Aligned_cols=29  Identities=21%  Similarity=0.272  Sum_probs=23.6

Q ss_pred             eeccccCCCcCCHHHHHHHHH-hhCCCccc
Q 028455          177 LRCGVCQIGVIGQKEAVEHAQ-ATGHVNFQ  205 (208)
Q Consensus       177 ~~C~~c~~~~~g~~~a~~ha~-~tgH~~F~  205 (208)
                      .+|..||++|.--.+-..|-. .|||..|.
T Consensus         2 YqC~~CG~~F~~~s~l~~H~r~~hg~~~~~   31 (44)
T PHA00616          2 YQCLRCGGIFRKKKEVIEHLLSVHKQNKLT   31 (44)
T ss_pred             CccchhhHHHhhHHHHHHHHHHhcCCCccc
Confidence            479999999999999999974 46666654


No 29 
>PF05381 Peptidase_C21:  Tymovirus endopeptidase;  InterPro: IPR008043 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.  This entry is found in cysteine peptidases belonging to the MEROPS peptidase family C21 (tymovirus endopeptidase family, clan CA). The type example is tymovirus endopeptidase (turnip yellow mosaic virus). The noncapsid protein expressed from ORF-206 of turnip yellow mosaic virus (TYMV) is autocatalytically processed by a papain-like protease, producing N-terminal 150 kDa and C-terminal 70 kDa proteins.; GO: 0003968 RNA-directed RNA polymerase activity, 0016032 viral reproduction
Probab=43.41  E-value=1.4e+02  Score=22.53  Aligned_cols=89  Identities=19%  Similarity=0.199  Sum_probs=51.3

Q ss_pred             CCchhhHHHHHHhhcCCC-chHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHhhCceEEE
Q 028455           13 DNSCLFNAVGYVMEHDKN-KAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYYGREIAA   91 (208)
Q Consensus        13 DGnCLFrAis~~l~~~~~-~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~~~~~I~V   91 (208)
                      ..+||--|||.+..-+.+ -...|....=+=+..|++   ..-+|-+-                -.+.|||-.|.....+
T Consensus         2 ~~~CLL~A~s~at~~~~~~LW~~L~~~lPDSlL~n~e---i~~~GLST----------------DhltaLa~~~~~~~~~   62 (104)
T PF05381_consen    2 ALDCLLVAISQATSISPETLWATLCEILPDSLLDNPE---IRTLGLST----------------DHLTALAYRYHFQCTF   62 (104)
T ss_pred             CcceeHHhhhhhhCCCHHHHHHHHHHhCchhhcCchh---hhhcCCcH----------------HHHHHHHHHHheEEEE
Confidence            478999999999875431 122233322233333333   11112211                2467999999999888


Q ss_pred             EECCCCceeEeCCCCCCCCeEEEEE-cC--cccee
Q 028455           92 YDIQTTRCDLYGQEKKYSERVMLIY-DG--LHYDA  123 (208)
Q Consensus        92 ~d~~~~~~~~fg~~~~~~~~i~llY-~G--~HYD~  123 (208)
                      ....  .+..||-.+ ....+.+.+ +|  .||..
T Consensus        63 hs~~--~~~~~Gi~~-as~~~~I~ht~G~p~HFs~   94 (104)
T PF05381_consen   63 HSDH--GVLHYGIKD-ASTVFTITHTPGPPGHFSL   94 (104)
T ss_pred             EcCC--ceEEeecCC-CceEEEEEeCCCCCCcccc
Confidence            8633  367888766 244444445 45  39998


No 30 
>cd00729 rubredoxin_SM Rubredoxin, Small Modular nonheme iron binding domain containing a [Fe(SCys)4] center, present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), and is believed to be involved in electron transfer. Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain. Rubrerythrins and nigerythrins have putative peroxide activity.
Probab=41.03  E-value=13  Score=22.18  Aligned_cols=14  Identities=29%  Similarity=0.465  Sum_probs=11.3

Q ss_pred             eeeccccCCCcCCH
Q 028455          176 TLRCGVCQIGVIGQ  189 (208)
Q Consensus       176 ~~~C~~c~~~~~g~  189 (208)
                      .-+|.+||++..|.
T Consensus         2 ~~~C~~CG~i~~g~   15 (34)
T cd00729           2 VWVCPVCGYIHEGE   15 (34)
T ss_pred             eEECCCCCCEeECC
Confidence            35899999998775


No 31 
>PF09237 GAGA:  GAGA factor;  InterPro: IPR015318 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.  Members of this entry bind to a 5'-GAGAG-3' DNA consensus binding site, and contain a Cys2-His2 zinc finger core as well as an N-terminal extension containing two highly basic regions. The zinc finger core binds in the DNA major groove and recognises the first three GAG bases of the consensus in a manner similar to that seen in other classical zinc finger-DNA complexes. 
The second basic region forms a helix that interacts in the major groove, recognising the last G of the consensus, while the first basic region wraps around the DNA in the minor groove and recognises the A in the fourth position of the consensus sequence.  More information about these proteins can be found at Protein of the Month: Zinc Fingers.; PDB: 1YUI_A 1YUJ_A.
Probab=38.77  E-value=26  Score=23.26  Aligned_cols=22  Identities=14%  Similarity=0.386  Sum_probs=16.9

Q ss_pred             eccccCCCcCCHHHHHHHHHhh
Q 028455          178 RCGVCQIGVIGQKEAVEHAQAT  199 (208)
Q Consensus       178 ~C~~c~~~~~g~~~a~~ha~~t  199 (208)
                      .|.+|+..+.-++.-+.|-+.+
T Consensus        26 tCP~C~a~~~~srnLrRHle~~   47 (54)
T PF09237_consen   26 TCPICGAVIRQSRNLRRHLEIR   47 (54)
T ss_dssp             E-TTT--EESSHHHHHHHHHHH
T ss_pred             CCCcchhhccchhhHHHHHHHH
Confidence            6999999999999999998754


No 32 
>cd02669 Peptidase_C19M A subfamily of Peptidase C19. Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyze bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.
Probab=38.01  E-value=16  Score=33.90  Aligned_cols=32  Identities=22%  Similarity=0.214  Sum_probs=23.2

Q ss_pred             CCceeeccccCCCcC---CHHHHHHHHHhhCCCcc
Q 028455          173 ANFTLRCGVCQIGVI---GQKEAVEHAQATGHVNF  204 (208)
Q Consensus       173 ~~~~~~C~~c~~~~~---g~~~a~~ha~~tgH~~F  204 (208)
                      ..-..-|.+||+.+.   +..-|..|++.|||.=|
T Consensus        25 ~~n~~~CL~cg~~~~g~~~~~ha~~H~~~~~H~~~   59 (440)
T cd02669          25 NLNVYACLVCGKYFQGRGKGSHAYTHSLEDNHHVF   59 (440)
T ss_pred             CCcEEEEcccCCeecCCCCCcHHHHHhhccCCCEE
Confidence            333456999996655   34689999999999633


No 33 
>PF04475 DUF555:  Protein of unknown function (DUF555);  InterPro: IPR007564 This is a family of uncharacterised, hypothetical archaeal proteins.
Probab=38.00  E-value=42  Score=25.22  Aligned_cols=38  Identities=11%  Similarity=0.003  Sum_probs=29.8

Q ss_pred             HHHHHHHHHHHHHhhCCCccccCCceeeccccCCCcCC
Q 028455          151 PAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVIG  188 (208)
Q Consensus       151 ~~~~~a~~l~~~~~~~~~~t~t~~~~~~C~~c~~~~~g  188 (208)
                      ++--+.-|..+.|+..-.|-+...-.+.|..||..|..
T Consensus        22 AI~iAIseaGkrLn~~~~~VeIevG~~~cP~Cge~~~~   59 (102)
T PF04475_consen   22 AIGIAISEAGKRLNPDLDYVEIEVGDTICPKCGEELDS   59 (102)
T ss_pred             HHHHHHHHHHHhhCCCCCeEEEecCcccCCCCCCccCc
Confidence            33344457778889988899999999999999987754


No 34 
>KOG1247 consensus Methionyl-tRNA synthetase [Translation, ribosomal structure and biogenesis]
Probab=37.98  E-value=19  Score=33.96  Aligned_cols=61  Identities=18%  Similarity=0.284  Sum_probs=43.9

Q ss_pred             EcCccceeeecCCCCCCCCCCCeeeee--CCCCCcchHHHHHHHHHHHHHhhCCCccccCCceeeccccCCCcCC
Q 028455          116 YDGLHYDALAISPFEGAPEEFDQTIFP--VQKGRTIGPAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVIG  188 (208)
Q Consensus       116 Y~G~HYD~l~~~~~~~~~~~~d~~~f~--~~~~~~~~~~~~~a~~l~~~~~~~~~~t~t~~~~~~C~~c~~~~~g  188 (208)
                      |+++||++.---..       |...|.  +.+.+.     ..++.+..+|..++|+.-.+-..|.|.+|++-|..
T Consensus        86 yh~ihk~vy~Wf~I-------dfD~fgrtTT~~qT-----~i~Q~iF~kl~~ng~~se~tv~qLyC~vc~~flad  148 (567)
T KOG1247|consen   86 YHGIHKVVYDWFKI-------DFDEFGRTTTKTQT-----EICQDIFSKLYDNGYLSEQTVKQLYCEVCDTFLAD  148 (567)
T ss_pred             cchhHHHHHHhhcc-------cccccCcccCcchh-----HHHHHHhhchhhcCCcccceeeeEEehhhcccccc
Confidence            88999988754321       233554  333333     56778888889999999999999999999887763


No 35 
>PF13465 zf-H2C2_2:  Zinc-finger double domain; PDB: 2EN7_A 1TF6_A 1TF3_A 2ELT_A 2EOS_A 2EN2_A 2DMD_A 2WBS_A 2WBU_A 2EM5_A ....
Probab=36.45  E-value=17  Score=20.03  Aligned_cols=18  Identities=22%  Similarity=0.412  Sum_probs=13.5

Q ss_pred             cccCCceeeccccCCCcC
Q 028455          170 TDTANFTLRCGVCQIGVI  187 (208)
Q Consensus       170 t~t~~~~~~C~~c~~~~~  187 (208)
                      +-+.....+|..|++.|.
T Consensus         8 ~H~~~k~~~C~~C~k~F~   25 (26)
T PF13465_consen    8 THTGEKPYKCPYCGKSFS   25 (26)
T ss_dssp             HHSSSSSEEESSSSEEES
T ss_pred             hcCCCCCCCCCCCcCeeC
Confidence            345666789999998764


No 36 
>COG3426 Butyrate kinase [Energy production and conversion]
Probab=36.29  E-value=53  Score=29.58  Aligned_cols=59  Identities=24%  Similarity=0.432  Sum_probs=38.5

Q ss_pred             hHHHHHHhhcCCCchHHHHHHHHHHHhcChhc--------chhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHh
Q 028455           18 FNAVGYVMEHDKNKAPELRQVIAATVASDPVK--------YSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADY   84 (208)
Q Consensus        18 FrAis~~l~~~~~~~~~lR~~va~~I~~np~~--------y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~   84 (208)
                      |+|++||+.+      ++= .++..+..+||-        |.+.|+. -..+|++||..--.+.|+.||.|||.=
T Consensus       275 ~~AmayQVaK------eIG-~~savL~G~vDaIvLTGGiA~~~~f~~-~I~~~v~~iapv~v~PGE~EleALA~G  341 (358)
T COG3426         275 YEAMAYQVAK------EIG-AMSAVLKGKVDAIVLTGGIAYEKLFVD-AIEDRVSWIAPVIVYPGEDELEALAEG  341 (358)
T ss_pred             HHHHHHHHHH------HHH-hhhhhcCCCCCEEEEecchhhHHHHHH-HHHHHHhhhcceEecCCchHHHHHHhh
Confidence            5677777653      222 234456666662        2222222 467889999888899999999999863


No 37 
>PHA02768 hypothetical protein; Provisional
Probab=36.09  E-value=32  Score=23.02  Aligned_cols=21  Identities=24%  Similarity=0.502  Sum_probs=14.3

Q ss_pred             eccccCCCcCCHHHHHHHHHh
Q 028455          178 RCGVCQIGVIGQKEAVEHAQA  198 (208)
Q Consensus       178 ~C~~c~~~~~g~~~a~~ha~~  198 (208)
                      +|..||+.|.-...-..|-..
T Consensus         7 ~C~~CGK~Fs~~~~L~~H~r~   27 (55)
T PHA02768          7 ECPICGEIYIKRKSMITHLRK   27 (55)
T ss_pred             CcchhCCeeccHHHHHHHHHh
Confidence            677777777766666666554


No 38 
>PF09082 DUF1922:  Domain of unknown function (DUF1922);  InterPro: IPR015166 Members of this family consist of a beta-sheet region followed by an alpha-helix and an unstructured C terminus. The beta-sheet region contains a CXCX...XCXC sequence with Cys residues located in two proximal loops and pointing towards each other. The precise function of this set of bacterial proteins is, as yet, unknown.; PDB: 1GH9_A.
Probab=35.91  E-value=13  Score=26.05  Aligned_cols=21  Identities=24%  Similarity=0.477  Sum_probs=15.1

Q ss_pred             CCCccccCCceeeccccCCCcC
Q 028455          166 KKTYTDTANFTLRCGVCQIGVI  187 (208)
Q Consensus       166 ~~~~t~t~~~~~~C~~c~~~~~  187 (208)
                      +.-|.+-.+.+-+| +||+.++
T Consensus        10 r~lya~e~~kTkkC-~CG~~l~   30 (68)
T PF09082_consen   10 RYLYAKEGAKTKKC-VCGKTLK   30 (68)
T ss_dssp             --EEEETT-SEEEE-TTTEEEE
T ss_pred             CEEEecCCcceeEe-cCCCeee
Confidence            34578888889999 9998765


No 39 
>PF13909 zf-H2C2_5:  C2H2-type zinc-finger domain; PDB: 1X5W_A.
Probab=35.21  E-value=41  Score=17.66  Aligned_cols=21  Identities=14%  Similarity=0.412  Sum_probs=15.4

Q ss_pred             eeccccCCCcCCHHHHHHHHHh
Q 028455          177 LRCGVCQIGVIGQKEAVEHAQA  198 (208)
Q Consensus       177 ~~C~~c~~~~~g~~~a~~ha~~  198 (208)
                      .+|..|.+... ...-..|-+.
T Consensus         1 y~C~~C~y~t~-~~~l~~H~~~   21 (24)
T PF13909_consen    1 YKCPHCSYSTS-KSNLKRHLKR   21 (24)
T ss_dssp             EE-SSSS-EES-HHHHHHHHHH
T ss_pred             CCCCCCCCcCC-HHHHHHHHHh
Confidence            37999999998 8888888653


No 40 
>PF07368 DUF1487:  Protein of unknown function (DUF1487);  InterPro: IPR009961 This family consists of several uncharacterised proteins from Drosophila melanogaster. The function of this family is unknown.
Probab=34.16  E-value=64  Score=27.51  Aligned_cols=66  Identities=18%  Similarity=0.286  Sum_probs=37.2

Q ss_pred             cccCHHHH-HHHHHhhCceEEEEE---CCCCceeEeCCCCCCCCeEEEEEcCccceeeecCCCCCCCCCCCeeeeeCCC
Q 028455           71 KWGGAIEL-SILADYYGREIAAYD---IQTTRCDLYGQEKKYSERVMLIYDGLHYDALAISPFEGAPEEFDQTIFPVQK  145 (208)
Q Consensus        71 ~WGG~iEL-~als~~~~~~I~V~d---~~~~~~~~fg~~~~~~~~i~llY~G~HYD~l~~~~~~~~~~~~d~~~f~~~~  145 (208)
                      .|...++- --++..+++++.-++   +.-..+..+-.   ..+...++.+|.||++|....      ..-+-|||..+
T Consensus       144 iW~ekla~~Yel~~~l~~~~f~iNC~~V~L~PI~~~~~---~~~~~v~i~~gyHYE~l~~~~------~~k~IVFP~~~  213 (215)
T PF07368_consen  144 IWNEKLASAYELAARLPCDTFYINCFNVDLSPIMPFFA---ARKNDVLIANGYHYETLTIGG------KRKIIVFPIGT  213 (215)
T ss_pred             EeCcHHHHHHHHHHhCCCCEEEEEeccCCchhhhhhhh---cCCceEEEECCeeEEEEEECC------eEEEEEEeccc
Confidence            57665542 234455555555544   22222333221   235667778999999998863      23457787654


No 41 
>COG3357 Predicted transcriptional regulator containing an HTH domain fused to a Zn-ribbon [Transcription]
Probab=33.80  E-value=49  Score=24.53  Aligned_cols=37  Identities=19%  Similarity=0.186  Sum_probs=24.4

Q ss_pred             hHHHHHHHHHHHHHhhCCCccccCCceeeccccCCCcCC
Q 028455          150 GPAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVIG  188 (208)
Q Consensus       150 ~~~~~~a~~l~~~~~~~~~~t~t~~~~~~C~~c~~~~~g  188 (208)
                      ..+....+.+|+.|+.+++---..-  -+|..||+-|+.
T Consensus        34 ~~v~~~L~hiak~lkr~g~~Llv~P--a~CkkCGfef~~   70 (97)
T COG3357          34 KEVYDHLEHIAKSLKRKGKRLLVRP--ARCKKCGFEFRD   70 (97)
T ss_pred             HHHHHHHHHHHHHHHhCCceEEecC--hhhcccCccccc
Confidence            3455666677777788876322211  179999999886


No 42 
>cd01675 RNR_III Class III ribonucleotide reductase. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and bacteriophage, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in strict or facultative anaerobic bacteria, bacteriophage, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded alpha-beta barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase). The class III enzyme from phage T4 consists of two subunits, this model covers the larger subunit w
Probab=33.12  E-value=57  Score=31.49  Aligned_cols=36  Identities=22%  Similarity=0.104  Sum_probs=22.0

Q ss_pred             hHHHHHHHHHHHHHhhCCC-ccccCCceeeccccCCCcCCH
Q 028455          150 GPAEDLALKLVKEQQRKKT-YTDTANFTLRCGVCQIGVIGQ  189 (208)
Q Consensus       150 ~~~~~~a~~l~~~~~~~~~-~t~t~~~~~~C~~c~~~~~g~  189 (208)
                      +++++..+..++  +...| +++|..+  +|.+||+...|+
T Consensus       495 ~al~~lv~~a~~--~~~~y~~~~~p~~--~C~~CG~~~~~~  531 (555)
T cd01675         495 EALEALVKKAAK--RGVIYFGINTPID--ICNDCGYIGEGE  531 (555)
T ss_pred             HHHHHHHHHHHH--cCCceEEEecCCc--cCCCCCCCCcCC
Confidence            334444444433  33455 7788777  999999976543


No 43 
>PF04877 Hairpins:  HrpZ;  InterPro: IPR006961  HrpZ (harpin elicitor) from the plant pathogen Pseudomonas syringae binds to lipid bilayers and forms a cation-conducting pore in vivo. This pore-forming activity may allow nutrient release or delivery of virulence factors during bacterial colonisation of host plants.  The entry also represents hairpinN, a virulence determinant that elicits lesion formation in Arabidopsis and tobacco and triggers systemic resistance in Arabidopsis.
Probab=32.98  E-value=36  Score=30.42  Aligned_cols=50  Identities=10%  Similarity=0.156  Sum_probs=30.2

Q ss_pred             chHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHhh
Q 028455           31 KAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYY   85 (208)
Q Consensus        31 ~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~~   85 (208)
                      .-..|=++|++||-.||+.|-.|    +...|.+++. ++..=..-|...+-+.+
T Consensus       162 ~D~~lL~eIaqFMD~nPe~FgkP----d~~sW~~eLk-eD~~L~~~E~~~F~~Al  211 (308)
T PF04877_consen  162 EDMPLLKEIAQFMDQNPEQFGKP----DRKSWADELK-EDNGLDKAETEQFQKAL  211 (308)
T ss_pred             ccHHHHHHHHHHHhcCHhhcCCC----CCchHHHHhh-cCCCCCHHHHHHHHHHH
Confidence            34567889999999999999444    1222444452 44443444555554444


No 44 
>TIGR02934 nifT_nitrog probable nitrogen fixation protein FixT. This largely uncharacterized protein family is assigned a role in nitrogen fixation by two criteria. First, its gene occurs, generally, among genes essential for expression of active nitrogenase. Second, its phylogenetic profile closely matches that of nitrogen-fixing bacteria. However, mutational studies in Klebsiella pneumoniae failed to demonstrate any phenotype for deletion or overexpression of the protein.
Probab=31.33  E-value=4.3  Score=28.28  Aligned_cols=44  Identities=16%  Similarity=0.112  Sum_probs=28.4

Q ss_pred             hcChhc-chhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHhhCceEEEEECCCCceeEeC
Q 028455           44 ASDPVK-YSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYYGREIAAYDIQTTRCDLYG  103 (208)
Q Consensus        44 ~~np~~-y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~~~~~I~V~d~~~~~~~~fg  103 (208)
                      ++|.+- ++..+--++.++=+-.+.+++.|||                ++.+.+|....+.
T Consensus         6 R~~~~g~l~~YvpKKDLEE~Vv~~e~~~~WGG----------------~v~L~NGw~l~lp   50 (67)
T TIGR02934         6 RRNRAGELSAYVPKKDLEEVIVSVEKEELWGG----------------WVTLANGWRLELP   50 (67)
T ss_pred             EeCCCCCEEEEEECCcchhheeeeecCccccC----------------EEEECCccEEEeC
Confidence            444443 4223334678888888889999999                5566677554443


No 45 
>PRK09784 hypothetical protein; Provisional
Probab=31.16  E-value=25  Score=30.86  Aligned_cols=20  Identities=20%  Similarity=0.376  Sum_probs=16.4

Q ss_pred             EEEEEeCCCCchhhHHHHHH
Q 028455            5 IVRRVIPSDNSCLFNAVGYV   24 (208)
Q Consensus         5 l~~~~ip~DGnCLFrAis~~   24 (208)
                      |+--+|.|||-||.|||--.
T Consensus       200 lkyapvdgdgycllrailvl  219 (417)
T PRK09784        200 LKYAPVDGDGYCLLRAILVL  219 (417)
T ss_pred             ceecccCCCchhHHHHHHHh
Confidence            55678999999999999543


No 46 
>PRK06266 transcription initiation factor E subunit alpha; Validated
Probab=31.04  E-value=41  Score=27.59  Aligned_cols=49  Identities=8%  Similarity=0.135  Sum_probs=35.3

Q ss_pred             eeeeeCCCCCcchHHHHHHHHHHHHHhhCCCccccCCceeeccccCCCcC
Q 028455          138 QTIFPVQKGRTIGPAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVI  187 (208)
Q Consensus       138 ~~~f~~~~~~~~~~~~~~a~~l~~~~~~~~~~t~t~~~~~~C~~c~~~~~  187 (208)
                      .-.|..+.+.+.+.++....++.+.|+.+-.+.... .-..|..|+..+.
T Consensus        80 ~y~w~l~~~~i~d~ik~~~~~~~~klk~~l~~e~~~-~~Y~Cp~C~~ryt  128 (178)
T PRK06266         80 TYTWKPELEKLPEIIKKKKMEELKKLKEQLEEEENN-MFFFCPNCHIRFT  128 (178)
T ss_pred             EEEEEeCHHHHHHHHHHHHHHHHHHHHHHhhhccCC-CEEECCCCCcEEe
Confidence            457777777777888888888888888876665444 3367877776654


No 47 
>PHA00732 hypothetical protein
Probab=30.76  E-value=44  Score=23.74  Aligned_cols=10  Identities=30%  Similarity=0.766  Sum_probs=5.3

Q ss_pred             eccccCCCcC
Q 028455          178 RCGVCQIGVI  187 (208)
Q Consensus       178 ~C~~c~~~~~  187 (208)
                      +|..||+.+.
T Consensus        29 ~C~~CgKsF~   38 (79)
T PHA00732         29 KCPVCNKSYR   38 (79)
T ss_pred             ccCCCCCEeC
Confidence            5555555554


No 48 
>PF05148 Methyltransf_8:  Hypothetical methyltransferase;  InterPro: IPR007823 This family consists of uncharacterised eukaryotic proteins which are related to S-adenosyl-L-methionine-dependent methyltransferases.; GO: 0008168 methyltransferase activity; PDB: 2ZFU_B.
Probab=30.23  E-value=1.3e+02  Score=25.79  Aligned_cols=72  Identities=13%  Similarity=0.278  Sum_probs=41.7

Q ss_pred             hhhHHHHHHhhcCCCchHHHHHHHHHHHhcChhcchhhhc-----------CCCHHHHHHHhCC-CCcc------cCHHH
Q 028455           16 CLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFL-----------GKSNQEYCSWIQD-PEKW------GGAIE   77 (208)
Q Consensus        16 CLFrAis~~l~~~~~~~~~lR~~va~~I~~np~~y~e~~l-----------~~~~~eY~~~i~~-~~~W------GG~iE   77 (208)
                      -.||-|-.+||.+..      ....+.++++|+.| +.+.           ..|++.+++|+++ |..|      .|+-.
T Consensus        13 srFR~lNE~LYT~~s------~~A~~lf~~dP~~F-~~YH~Gfr~Qv~~WP~nPvd~iI~~l~~~~~~~viaD~GCGdA~   85 (219)
T PF05148_consen   13 SRFRWLNEQLYTTSS------EEALKLFQEDPELF-DIYHEGFRQQVKKWPVNPVDVIIEWLKKRPKSLVIADFGCGDAK   85 (219)
T ss_dssp             HHHHHHHHHHHHS-H------HHHHHHHHH-HHHH-HHHHHHHHHHHCTSSS-HHHHHHHHHCTS-TTS-EEEES-TT-H
T ss_pred             CchHHHHHhHhcCCH------HHHHHHHHhCHHHH-HHHHHHHHHHHhcCCCCcHHHHHHHHHhcCCCEEEEECCCchHH
Confidence            369999999996542      23456788999988 4332           3589999999985 4556      23333


Q ss_pred             HHHHHHhh--CceEEEEECCCC
Q 028455           78 LSILADYY--GREIAAYDIQTT   97 (208)
Q Consensus        78 L~als~~~--~~~I~V~d~~~~   97 (208)
                         ||+..  +..|+.+|..+.
T Consensus        86 ---la~~~~~~~~V~SfDLva~  104 (219)
T PF05148_consen   86 ---LAKAVPNKHKVHSFDLVAP  104 (219)
T ss_dssp             ---HHHH--S---EEEEESS-S
T ss_pred             ---HHHhcccCceEEEeeccCC
Confidence               33444  345788886553


No 49 
>COG2051 RPS27A Ribosomal protein S27E [Translation, ribosomal structure and biogenesis]
Probab=28.98  E-value=29  Score=24.14  Aligned_cols=15  Identities=20%  Similarity=0.505  Sum_probs=12.0

Q ss_pred             CCceeeccccCCCcC
Q 028455          173 ANFTLRCGVCQIGVI  187 (208)
Q Consensus       173 ~~~~~~C~~c~~~~~  187 (208)
                      +++.++|..||..|.
T Consensus        35 ast~V~C~~CG~~l~   49 (67)
T COG2051          35 ASTVVTCLICGTTLA   49 (67)
T ss_pred             CceEEEecccccEEE
Confidence            456778999999886


No 50 
>PF13240 zinc_ribbon_2:  zinc-ribbon domain
Probab=28.95  E-value=33  Score=18.58  Aligned_cols=14  Identities=36%  Similarity=0.776  Sum_probs=8.6

Q ss_pred             cccCCceeeccccCCCc
Q 028455          170 TDTANFTLRCGVCQIGV  186 (208)
Q Consensus       170 t~t~~~~~~C~~c~~~~  186 (208)
                      .+.+.|   |..||..|
T Consensus        10 ~~~~~f---C~~CG~~l   23 (23)
T PF13240_consen   10 EDDAKF---CPNCGTPL   23 (23)
T ss_pred             CCcCcc---hhhhCCcC
Confidence            355556   77777654


No 51 
>PF07967 zf-C3HC:  C3HC zinc finger-like ;  InterPro: IPR012935 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.  This zinc finger-like domain is distributed throughout the eukaryotic kingdom in NIPA (Nuclear interacting partner of ALK) and other proteins. NIPA is thought to perform an antiapoptotic role in nucleophosmin-anaplastic lymphoma kinase (ALK) mediated signalling events. 
The domain is often repeated, with the second domain usually containing a large insert (approximately 90 residues) after the first three cysteine residues. The Schizosaccharomyces pombe protein containing this domain (O94506 from SWISSPROT) is involved in mRNA export from the nucleus.  More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding, 0005634 nucleus
Probab=28.80  E-value=30  Score=26.65  Aligned_cols=23  Identities=13%  Similarity=0.330  Sum_probs=20.4

Q ss_pred             CCccccCCceeeccccCCCcCCH
Q 028455          167 KTYTDTANFTLRCGVCQIGVIGQ  189 (208)
Q Consensus       167 ~~~t~t~~~~~~C~~c~~~~~g~  189 (208)
                      +.++++...+|+|..||..+.-.
T Consensus        34 ~GW~~~~~d~l~C~~C~~~l~~~   56 (133)
T PF07967_consen   34 RGWICVSKDMLKCESCGARLCVK   56 (133)
T ss_pred             cCCCcCCCCEEEeCCCCCEEEEe
Confidence            88999999999999999887654


No 52 
>cd00350 rubredoxin_like Rubredoxin_like; nonheme iron binding domain containing a [Fe(SCys)4] center. The family includes rubredoxins, small electron transfer proteins, and a slightly smaller modular rubredoxin domain present in rubrerythrin and nigerythrin and detected either N- or C-terminal to such proteins as flavin reductase, NAD(P)H-nitrite reductase, and ferredoxin-thioredoxin reductase. In rubredoxin, the iron atom is coordinated by four cysteine residues (Fe(S-Cys)4), but iron can also be replaced by cobalt, nickel or zinc, and the center is believed to be involved in electron transfer.  Rubrerythrins and nigerythrins are small homodimeric proteins, generally consisting of 2 domains: a rubredoxin domain C-terminal to a non-sulfur, oxo-bridged diiron site in the N-terminal rubrerythrin domain.  Rubrerythrins and nigerythrins have putative peroxidase activity.
Probab=28.25  E-value=22  Score=20.82  Aligned_cols=14  Identities=29%  Similarity=0.506  Sum_probs=10.9

Q ss_pred             eeccccCCCcCCHH
Q 028455          177 LRCGVCQIGVIGQK  190 (208)
Q Consensus       177 ~~C~~c~~~~~g~~  190 (208)
                      -+|.+||++..+..
T Consensus         2 ~~C~~CGy~y~~~~   15 (33)
T cd00350           2 YVCPVCGYIYDGEE   15 (33)
T ss_pred             EECCCCCCEECCCc
Confidence            47999999877653


No 53 
>PF04959 ARS2:  Arsenite-resistance protein 2;  InterPro: IPR007042 This entry represents Arsenite-resistance protein 2 (also known as Serrate RNA effector molecule homolog), which is thought to play a role in arsenite resistance, although it does not directly confer arsenite resistance but rather modulates arsenic sensitivity. Arsenite is a carcinogenic compound which can act as a comutagen by inhibiting DNA repair. It is also involved in cell cycle progression at S phase.; PDB: 3AX1_A.
Probab=27.83  E-value=45  Score=28.34  Aligned_cols=28  Identities=18%  Similarity=0.266  Sum_probs=20.8

Q ss_pred             cccCCceeeccccCCCcCCHHHHHHHHH
Q 028455          170 TDTANFTLRCGVCQIGVIGQKEAVEHAQ  197 (208)
Q Consensus       170 t~t~~~~~~C~~c~~~~~g~~~a~~ha~  197 (208)
                      +....-.-+|..|+|.|+|..-+.+|-.
T Consensus        71 ~e~~~~K~~C~lc~KlFkg~eFV~KHI~   98 (214)
T PF04959_consen   71 KEEDEDKWRCPLCGKLFKGPEFVRKHIF   98 (214)
T ss_dssp             -SSSSEEEEE-SSS-EESSHHHHHHHHH
T ss_pred             HHHcCCEECCCCCCcccCChHHHHHHHh
Confidence            3446667799999999999999999954


No 54 
>KOG1790 consensus 60s ribosomal protein L34 [Translation, ribosomal structure and biogenesis]
Probab=26.84  E-value=23  Score=27.41  Aligned_cols=25  Identities=20%  Similarity=0.436  Sum_probs=19.0

Q ss_pred             CCccccCCceeeccccCCCcCCHHH
Q 028455          167 KTYTDTANFTLRCGVCQIGVIGQKE  191 (208)
Q Consensus       167 ~~~t~t~~~~~~C~~c~~~~~g~~~  191 (208)
                      ++|+...+...+|.+|+..|.|-..
T Consensus        32 ~q~~kK~~~~pkc~~c~~~l~Gi~~   56 (121)
T KOG1790|consen   32 YQYVKKKAKLPKCGDCGMRLQGIPA   56 (121)
T ss_pred             hHhhHhhccCCCCCcCCcccCCCCC
Confidence            4566666677789999999998543


No 55 
>smart00238 BIR Baculoviral inhibition of apoptosis protein repeat. Domain found in inhibitor of apoptosis proteins (IAPs) and other proteins. Acts as a direct inhibitor of caspase enzymes.
Probab=26.76  E-value=1.2e+02  Score=20.18  Aligned_cols=40  Identities=20%  Similarity=0.263  Sum_probs=28.1

Q ss_pred             hhCCCccccCCceeeccccCCCcC----CHHHHHHHHHhhCCCcc
Q 028455          164 QRKKTYTDTANFTLRCGVCQIGVI----GQKEAVEHAQATGHVNF  204 (208)
Q Consensus       164 ~~~~~~t~t~~~~~~C~~c~~~~~----g~~~a~~ha~~tgH~~F  204 (208)
                      +..=|||.+ .-.++|-.|+..+.    ++.-.++|+...-...|
T Consensus        25 ~~Gfyy~~~-~d~v~C~~C~~~l~~w~~~d~p~~~H~~~~p~C~f   68 (71)
T smart00238       25 EAGFYYTGV-GDEVKCFFCGGELDNWEPGDDPWEEHKKWSPNCPF   68 (71)
T ss_pred             HcCCeECCC-CCEEEeCCCCCCcCCCCCCCCHHHHHhHhCcCCcC
Confidence            455677766 44599999999885    45557778776665555


No 56 
>PF13451 zf-trcl:  Probable zinc-binding domain
Probab=26.65  E-value=54  Score=21.38  Aligned_cols=27  Identities=19%  Similarity=0.277  Sum_probs=19.8

Q ss_pred             CceeeccccCCCcCCHHHHHHHHHhhC
Q 028455          174 NFTLRCGVCQIGVIGQKEAVEHAQATG  200 (208)
Q Consensus       174 ~~~~~C~~c~~~~~g~~~a~~ha~~tg  200 (208)
                      ...|+|-+||..+.=....|+...+-|
T Consensus         2 Dk~l~C~dCg~~FvfTa~EQ~fy~eKg   28 (49)
T PF13451_consen    2 DKTLTCKDCGAEFVFTAGEQKFYAEKG   28 (49)
T ss_pred             CeeEEcccCCCeEEEehhHHHHHHhcC
Confidence            357899999999986666666665544


No 57 
>COG2174 RPL34A Ribosomal protein L34E [Translation, ribosomal structure and biogenesis]
Probab=26.35  E-value=30  Score=25.56  Aligned_cols=14  Identities=21%  Similarity=0.558  Sum_probs=11.6

Q ss_pred             eeccccCCCcCCHH
Q 028455          177 LRCGVCQIGVIGQK  190 (208)
Q Consensus       177 ~~C~~c~~~~~g~~  190 (208)
                      -+|.+||..|.|..
T Consensus        35 p~C~~cg~pL~Gi~   48 (93)
T COG2174          35 PKCAICGRPLGGIP   48 (93)
T ss_pred             CcccccCCccCCcc
Confidence            37999999999853


No 58 
>COG5134 Uncharacterized conserved protein [Function unknown]
Probab=25.13  E-value=46  Score=28.53  Aligned_cols=32  Identities=22%  Similarity=0.397  Sum_probs=22.2

Q ss_pred             HHHHHHHHHHHHhhCCCc----cccCCceeeccccC
Q 028455          152 AEDLALKLVKEQQRKKTY----TDTANFTLRCGVCQ  183 (208)
Q Consensus       152 ~~~~a~~l~~~~~~~~~~----t~t~~~~~~C~~c~  183 (208)
                      |..++..-+++|+.++.-    .-.+-|+++|..|+
T Consensus        14 AqpL~~~~~~KlK~arprglSiRL~TPF~~RCL~C~   49 (272)
T COG5134          14 AQPLAKRKFDKLKNARPRGLSIRLETPFPVRCLNCE   49 (272)
T ss_pred             cchhHHHHHHHhcccCcccceEEeccCcceeecchh
Confidence            345666777777777653    34567899999995


No 59 
>cd00022 BIR Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger.
Probab=25.11  E-value=1.3e+02  Score=19.90  Aligned_cols=40  Identities=18%  Similarity=0.285  Sum_probs=27.5

Q ss_pred             hhCCCccccCCceeeccccCCCcCC----HHHHHHHHHhhCCCcc
Q 028455          164 QRKKTYTDTANFTLRCGVCQIGVIG----QKEAVEHAQATGHVNF  204 (208)
Q Consensus       164 ~~~~~~t~t~~~~~~C~~c~~~~~g----~~~a~~ha~~tgH~~F  204 (208)
                      +..=||+.. .-.++|--|+..+.+    +.-.++|....-+..|
T Consensus        23 ~~Gfyy~~~-~d~v~C~~C~~~~~~w~~~d~p~~~H~~~~p~C~f   66 (69)
T cd00022          23 EAGFYYTGR-GDEVKCFFCGLELKNWEPGDDPWEEHKRWSPNCPF   66 (69)
T ss_pred             HcCCeEcCC-CCEEEeCCCCCCccCCCCCCCHHHHHhHhCcCCcC
Confidence            455667665 456999999988864    5556778776655554


No 60 
>PRK13731 conjugal transfer surface exclusion protein TraT; Provisional
Probab=24.92  E-value=2e+02  Score=25.06  Aligned_cols=45  Identities=18%  Similarity=0.228  Sum_probs=30.8

Q ss_pred             Ceeee----eCCCCCcchHHHHHHHHHHHHHhhCCCcc----ccCCceeeccc--cCCC
Q 028455          137 DQTIF----PVQKGRTIGPAEDLALKLVKEQQRKKTYT----DTANFTLRCGV--CQIG  185 (208)
Q Consensus       137 d~~~f----~~~~~~~~~~~~~~a~~l~~~~~~~~~~t----~t~~~~~~C~~--c~~~  185 (208)
                      ++|||    +++|..    +..+..++.+.|+.++|--    +.+.+.|.-++  |+|.
T Consensus        50 ~ktVyv~vrNTSd~~----~~~l~~~i~~~L~~kGY~iv~~P~~A~Y~lQaNVL~~~K~  104 (243)
T PRK13731         50 ERTVFLQIKNTSDKD----MSGLQGKIADAVKAKGYQVVTSPDKAYYWIQANVLKADKM  104 (243)
T ss_pred             CceEEEEEeeCCCcc----hHHHHHHHHHHHHhCCeEEecChhhceeeeeeeehhcccC
Confidence            46777    455633    3346678888889999843    56777788877  7776


No 61 
>TIGR00373 conserved hypothetical protein TIGR00373. This family of proteins is, so far, restricted to archaeal genomes. The family appears to be distantly related to the N-terminal region of the eukaryotic transcription initiation factor IIE alpha chain.
Probab=24.11  E-value=59  Score=26.02  Aligned_cols=49  Identities=6%  Similarity=0.116  Sum_probs=31.4

Q ss_pred             eeeeeCCCCCcchHHHHHHHHHHHHHhhCCCccccCCceeeccccCCCcC
Q 028455          138 QTIFPVQKGRTIGPAEDLALKLVKEQQRKKTYTDTANFTLRCGVCQIGVI  187 (208)
Q Consensus       138 ~~~f~~~~~~~~~~~~~~a~~l~~~~~~~~~~t~t~~~~~~C~~c~~~~~  187 (208)
                      +..|-++.+++.+.++....++.+.|++.-.+.... .-..|..|+..+.
T Consensus        72 ~Y~w~i~~~~i~d~Ik~~~~~~~~~lk~~l~~e~~~-~~Y~Cp~c~~r~t  120 (158)
T TIGR00373        72 EYTWRINYEKALDVLKRKLEETAKKLREKLEFETNN-MFFICPNMCVRFT  120 (158)
T ss_pred             EEEEEeCHHHHHHHHHHHHHHHHHHHHHHHhhccCC-CeEECCCCCcEee
Confidence            456656666666777888788888777765543333 3356877775544


No 62 
>PF05413 Peptidase_C34:  Putative closterovirus papain-like endopeptidase;  InterPro: IPR008744 RNA-directed RNA polymerase (RdRp) (2.7.7.48 from EC) is an essential protein encoded in the genomes of all RNA containing viruses with no DNA stage [, ]. It catalyses synthesis of the RNA strand complementary to a given RNA template, but the precise molecular mechanism remains unclear. The postulated RNA replication process is a two-step mechanism. First, the initiation step of RNA synthesis begins at or near the 3' end of the RNA template by means of a primer-independent (de novo) mechanism. The de novo initiation consists in the addition of a nucleotide tri-phosphate (NTP) to the 3'-OH of the first initiating NTP. During the following so-called elongation phase, this nucleotidyl transfer reaction is repeated with subsequent NTPs to generate the complementary RNA product [].  All the RNA-directed RNA polymerases, and many DNA-directed polymerases, employ a fold whose organisation has been likened to the shape of a right hand with three subdomains termed fingers, palm and thumb []. Only the catalytic palm subdomain, composed of a four-stranded antiparallel beta-sheet with two alpha-helices, is well conserved among all of these enzymes. In RdRp, the palm subdomain comprises three well conserved motifs (A, B and C). Motif A (D-x(4,5)-D) and motif C (GDD) are spatially juxtaposed; the Asp residues of these motifs are implied in the binding of Mg2+ and/or Mn2+. The Asn residue of motif B is involved in selection of ribonucleoside triphosphates over dNTPs and thus determines whether RNA is synthesised rather than DNA []. The domain organisation [] and the 3D structure of the catalytic centre of a wide range of RdPp's, even those with a low overall sequence homology, are conserved. The catalytic centre is formed by several motifs containing a number of conserved amino acid residues. 
There are 4 superfamilies of viruses that cover all RNA containing viruses with no DNA stage: Viruses containing positive-strand RNA or double-strand RNA, except retroviruses and Birnaviridae: viral RNA-directed RNA polymerases including all positive-strand RNA viruses with no DNA stage, double-strand RNA viruses, and the Cystoviridae, Reoviridae, Hypoviridae, Partitiviridae, Totiviridae families. Mononegavirales (negative-strand RNA viruses with non-segmented genomes). Negative-strand RNA viruses with segmented genomes, i.e. Orthomyxoviruses (including influenza A, B, and C viruses, Thogotoviruses, and the infectious salmon anemia virus), Arenaviruses, Bunyaviruses, Hantaviruses, Nairoviruses, Phleboviruses, Tenuiviruses and Tospoviruses. Birnaviridae family of dsRNA viruses.  The RNA-directed RNA polymerases in the first of the above superfamilies can be divided into the following three subgroups: All positive-strand RNA eukaryotic viruses with no DNA stage. All RNA-containing bacteriophages - there are two families of RNA-containing bacteriophages: Leviviridae (positive ssRNA phages) and Cystoviridae (dsRNA phages). Reoviridae family of dsRNA viruses.   This signature is found in the RNA-directed RNA polymerase of apple chlorotic leaf spot virus and cherry mottle virus.; GO: 0003723 RNA binding, 0003968 RNA-directed RNA polymerase activity, 0005524 ATP binding, 0019079 viral genome replication
Probab=23.23  E-value=67  Score=23.27  Aligned_cols=89  Identities=13%  Similarity=0.197  Sum_probs=46.8

Q ss_pred             EEEeCCCCchhhHHHHHHhhcCCCchHHHHHHHHHHHhcChhcchhhhcCCCHHHHHHHhCCCCcccCHHHHHHHHHhhC
Q 028455            7 RRVIPSDNSCLFNAVGYVMEHDKNKAPELRQVIAATVASDPVKYSEAFLGKSNQEYCSWIQDPEKWGGAIELSILADYYG   86 (208)
Q Consensus         7 ~~~ip~DGnCLFrAis~~l~~~~~~~~~lR~~va~~I~~np~~y~e~~l~~~~~eY~~~i~~~~~WGG~iEL~als~~~~   86 (208)
                      ++.|.|--.|||-|+|..+...++          +.|...|...         +   +-|++.|.  .--.+.++++.|.
T Consensus         2 ~kFikGk~DClf~s~a~~I~Kkpe----------evm~~~phvl---------d---RCisNkGC--sidD~k~iC~~YE   57 (92)
T PF05413_consen    2 VKFIKGKYDCLFVSVAEIIHKKPE----------EVMMFLPHVL---------D---RCISNKGC--SIDDLKAICEKYE   57 (92)
T ss_pred             cceeccccccHHHHHHHHHhcCHH----------HHHHhChHHH---------H---HHHhcCCC--CHHHHHHHHhhcE
Confidence            567888899999999887664432          1122222222         1   12222221  2224678888888


Q ss_pred             ceEEEEECCCCceeEeCCCCCCCCeEEEEEcCcccee
Q 028455           87 REIAAYDIQTTRCDLYGQEKKYSERVMLIYDGLHYDA  123 (208)
Q Consensus        87 ~~I~V~d~~~~~~~~fg~~~~~~~~i~llY~G~HYD~  123 (208)
                      +.|.+-- +-| ...-|.-.  -+--.++..|+||..
T Consensus        58 iKveceG-DCG-lvE~Gs~G--l~~Gr~~LRGNHF~v   90 (92)
T PF05413_consen   58 IKVECEG-DCG-LVECGSIG--LPLGRMLLRGNHFSV   90 (92)
T ss_pred             EeeEecC-ccc-eEEecCcc--Cchhheeecccceee
Confidence            7765521 122 33334322  111135578899864


No 63 
>PRK03922 hypothetical protein; Provisional
Probab=21.76  E-value=1e+02  Score=23.56  Aligned_cols=38  Identities=13%  Similarity=-0.017  Sum_probs=27.5

Q ss_pred             hHHH-HHHHHHHHHHhh-CCCccccCCceeeccccCCCcC
Q 028455          150 GPAE-DLALKLVKEQQR-KKTYTDTANFTLRCGVCQIGVI  187 (208)
Q Consensus       150 ~~~~-~~a~~l~~~~~~-~~~~t~t~~~~~~C~~c~~~~~  187 (208)
                      |.|+ -+.-|..+.|+. .-.|-+..--...|..||..|.
T Consensus        21 dDAI~iAIseaGkrLn~~~l~yVeievG~~~cP~cge~~~   60 (113)
T PRK03922         21 DDAIGVAISEAGKRLNPEDLDYVEVEVGLTICPKCGEPFD   60 (113)
T ss_pred             HHHHHHHHHHHHhhcCcccCCeEEEecCcccCCCCCCcCC
Confidence            4333 334466666777 6778888888899999998765


No 64 
>PF15412 Nse4-Nse3_bdg:  Binding domain of Nse4/EID3 to Nse3-MAGE
Probab=21.75  E-value=31  Score=22.77  Aligned_cols=36  Identities=19%  Similarity=0.154  Sum_probs=28.2

Q ss_pred             CCcchHHHHHHHHHHHHHhhCCCccccCCceeeccc
Q 028455          146 GRTIGPAEDLALKLVKEQQRKKTYTDTANFTLRCGV  181 (208)
Q Consensus       146 ~~~~~~~~~~a~~l~~~~~~~~~~t~t~~~~~~C~~  181 (208)
                      .+.|-.+.++|.+-|+.|+-..-..|+..|.-+|-.
T Consensus         2 S~~Lv~aSdla~~ka~~lk~~~~~fd~deFv~~l~~   37 (56)
T PF15412_consen    2 SRLLVLASDLAAEKARNLKFGGSGFDVDEFVSKLKT   37 (56)
T ss_pred             cHHHHHHHHHHHHHHHHhccCCCccCHHHHHHHHHH
Confidence            345567788888888999999888899999666644


No 65 
>PF06107 DUF951:  Bacterial protein of unknown function (DUF951);  InterPro: IPR009296 This family consists of several short hypothetical bacterial proteins of unknown function.
Probab=21.61  E-value=48  Score=22.37  Aligned_cols=15  Identities=20%  Similarity=0.616  Sum_probs=12.0

Q ss_pred             CCceeeccccCCCcC
Q 028455          173 ANFTLRCGVCQIGVI  187 (208)
Q Consensus       173 ~~~~~~C~~c~~~~~  187 (208)
                      +.|.|||..||..+-
T Consensus        28 aDikikC~gCg~~im   42 (57)
T PF06107_consen   28 ADIKIKCLGCGRQIM   42 (57)
T ss_pred             CcEEEEECCCCCEEE
Confidence            568899999997654


No 66 
>PF13717 zinc_ribbon_4:  zinc-ribbon domain
Probab=21.30  E-value=58  Score=19.49  Aligned_cols=14  Identities=21%  Similarity=0.446  Sum_probs=10.9

Q ss_pred             CCceeeccccCCCc
Q 028455          173 ANFTLRCGVCQIGV  186 (208)
Q Consensus       173 ~~~~~~C~~c~~~~  186 (208)
                      .+..++|..||..+
T Consensus        22 ~g~~v~C~~C~~~f   35 (36)
T PF13717_consen   22 KGRKVRCSKCGHVF   35 (36)
T ss_pred             CCcEEECCCCCCEe
Confidence            45578999999765


No 67 
>PLN02748 tRNA dimethylallyltransferase
Probab=21.05  E-value=51  Score=31.27  Aligned_cols=26  Identities=31%  Similarity=0.496  Sum_probs=23.2

Q ss_pred             eeccccCC-CcCCHHHHHHHHHhhCCC
Q 028455          177 LRCGVCQI-GVIGQKEAVEHAQATGHV  202 (208)
Q Consensus       177 ~~C~~c~~-~~~g~~~a~~ha~~tgH~  202 (208)
                      ..|.+|++ .++|+.+=+.|-+...|-
T Consensus       419 ~~Ce~C~~~~~~G~~eW~~Hlksr~Hk  445 (468)
T PLN02748        419 YVCEACGNKVLRGAHEWEQHKQGRGHR  445 (468)
T ss_pred             ccccCCCCcccCCHHHHHHHhcchHHH
Confidence            36999998 899999999999998884


No 68 
>PF03884 DUF329:  Domain of unknown function (DUF329);  InterPro: IPR005584 The biological function of these short proteins is unknown, but they contain four conserved cysteines, suggesting that they all bind zinc. YacG (Q5X8H6 from SWISSPROT) from Escherichia coli has been shown to bind zinc and contains the structural motifs typical of zinc-binding proteins []. The conserved four cysteine motif in these proteins (-C-X(2)-C-X(15)-C-X(3)-C-) is not found in other zinc-binding proteins with known structures.; GO: 0008270 zinc ion binding; PDB: 1LV3_A.
Probab=21.01  E-value=46  Score=22.37  Aligned_cols=13  Identities=31%  Similarity=0.746  Sum_probs=7.1

Q ss_pred             ceeeccccCCCcC
Q 028455          175 FTLRCGVCQIGVI  187 (208)
Q Consensus       175 ~~~~C~~c~~~~~  187 (208)
                      |+.+|.+||+...
T Consensus         1 m~v~CP~C~k~~~   13 (57)
T PF03884_consen    1 MTVKCPICGKPVE   13 (57)
T ss_dssp             -EEE-TTT--EEE
T ss_pred             CcccCCCCCCeec
Confidence            5789999998754


No 69 
>PF08209 Sgf11:  Sgf11 (transcriptional regulation protein);  InterPro: IPR013246 The Sgf11 family is a SAGA complex subunit in Saccharomyces cerevisiae (Baker's yeast). The SAGA complex is a multisubunit protein complex involved in transcriptional regulation. SAGA combines proteins involved in interactions with DNA-bound activators and TATA-binding protein (TBP), as well as enzymes for histone acetylation and deubiquitylation [].; PDB: 3M99_B 2LO2_A 3MHH_C 3MHS_C.
Probab=20.91  E-value=60  Score=19.35  Aligned_cols=19  Identities=21%  Similarity=0.237  Sum_probs=14.4

Q ss_pred             ceeeccccCCCcCCHHHHH
Q 028455          175 FTLRCGVCQIGVIGQKEAV  193 (208)
Q Consensus       175 ~~~~C~~c~~~~~g~~~a~  193 (208)
                      ....|..|+..+...+-|+
T Consensus         3 ~~~~C~nC~R~v~a~RfA~   21 (33)
T PF08209_consen    3 PYVECPNCGRPVAASRFAP   21 (33)
T ss_dssp             -EEE-TTTSSEEEGGGHHH
T ss_pred             CeEECCCCcCCcchhhhHH
Confidence            4568999999999888775


No 70 
>PRK05452 anaerobic nitric oxide reductase flavorubredoxin; Provisional
Probab=20.62  E-value=1.1e+02  Score=28.98  Aligned_cols=53  Identities=11%  Similarity=0.130  Sum_probs=31.8

Q ss_pred             eeeeCCCCCcchHHHHHHHHHHHHHhhC--CCccc--------------cCCceeeccccCCCcCCHHHH
Q 028455          139 TIFPVQKGRTIGPAEDLALKLVKEQQRK--KTYTD--------------TANFTLRCGVCQIGVIGQKEA  192 (208)
Q Consensus       139 ~~f~~~~~~~~~~~~~~a~~l~~~~~~~--~~~t~--------------t~~~~~~C~~c~~~~~g~~~a  192 (208)
                      ..|..++ +.++.+.+.+++|++.++.+  +|.|-              ......+|..||++-..+..-
T Consensus       373 ~~~~P~e-e~~~~~~~~g~~la~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~c~~c~~~yd~~~g~  441 (479)
T PRK05452        373 AKWRPDQ-DALELCREHGREIARQWALAPLPQSTVNTVVKEETSATTTADLGPRMQCSVCQWIYDPAKGE  441 (479)
T ss_pred             EEecCCH-HHHHHHHHHHHHHHHHHhhCCccccccccccccccccccccCCCCeEEECCCCeEECCCCCC
Confidence            3444444 44588888888888766622  11111              123445999999987765443


No 71 
>PF00653 BIR:  Inhibitor of Apoptosis domain;  InterPro: IPR001370 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.   The baculovirus inhibitor of apoptosis protein repeat (BIR) is a domain of tandem repeats separated by a variable length linker that seems to confer cell death-preventing activity [, ]. The BIR domains characterise the Inhibitor of Apoptosis (IAP) family of proteins (MEROPS proteinase inhibitor family I32, clan IV) that suppress apoptosis by interacting with and inhibiting the enzymatic activity of both initiator and effector caspases (MEROPS peptidase family C14, IPR002398 from INTERPRO). Several distinct mammalian IAPs including XIAP, c-IAP1, c-IAP2, and ML-IAP, have been identified, and they all exhibit antiapoptotic activity in cell culture. The functional unit in each IAP protein is the baculoviral IAP repeat (BIR), which contains approximately 80 amino acids folded around a zinc atom. Most mammalian IAPs have more than one BIR domain, with the different BIR domains performing distinct functions. 
For example, in XIAP, the third BIR domain (BIR3) potently inhibits the catalytic activity of caspase-9, whereas the linker sequences immediately preceding the second BIR domain (BIR2) selectively target caspase-3 or -7.  The first-recognised members of MEROPS inhibitor family I32 were viral proteins that inhibited the apoptosis of infected cells: Cp-IAP from Cydia pomonella granulosis virus (CpGV) [] and Op-IAP from Orgyia pseudotsugata multicapsid polyhedrosis virus (OpMNPV) []. The discovery of homologous proteins in mammals followed soon after with the recognition that mutations in the gene for neuronal apoptosis inhibitory protein (NIAP) underlie spinal muscular atrophy []. The inhibitors in family I32 all possess one or more 80-residue domains known as BIR (baculovirus inhibitor repeat) domains and have accordingly been termed 'BIR-containing' or 'BIRC' proteins as well as IAP proteins.  The mechanism of inhibition of caspases by the IAP proteins is complex, and reactive site residues cannot yet be identified with any confidence. Despite the conservation of the BIR or IAP (inhibitor of apoptosis) domains throughout the family it seems clear that other parts of the molecules also make essential contributions to inhibitory activity.  Homologs of most components in the mammalian apoptotic pathway have been identified in fruit flies. The Drosophila Apaf-1, known as Dapaf-1, HAC-1 or Dark, shares significant sequence similarity with its mammalian counterpart, and is critically important for the activation of the Drosophila initiator caspase Dronc. Dronc, in turn, cleaves and activates the effector caspase DrICE. The Drosophila IAP, DIAP1, binds to and inactivates both DrICE and Dronc through its BIR1 and BIR2 domains. During apoptosis, the anti-death function of DIAP1 is countered by at least four pro-apoptotic proteins, Reaper, Hid, Grim, and sickle, through direct physical interactions. 
These four proteins represent the functional homologs of the mammalian protein Smac, and they all share a conserved IAP-binding motif at their N termini. The three proteins Reaper, Hid, and Grim are collectively referred to as the RHG proteins [, ].  Both XIAP and DIAP1 contain a RING domain at their C termini, and can act as an E3 ubiquitin ligase. Indeed, both XIAP and DIAP1 have been shown to promote self-ubiquitination and degradation as well as to negatively regulate the target caspases. Nonetheless, important differences exist between XIAP and DIAP1. The primary function of XIAP is thought to inhibit the catalytic activities of caspases; to what extent the ubiquitinating activity of XIAP contributes to its function remains unclear. For DIAP1, however, the ubiquitinating activity appears to be essential for its function.  Recently a Drosophila p53 protein has been identified that mediates apoptosis via a novel pathway involving the activation of the Reaper gene and subsequent inhibition of the inhibitors of apoptosis (IAPs). CIAP1, a major mammalian homologue of Drosophila IAPs, is irreversibly inhibited (cleaved) during p53-dependent apoptosis and this cleavage is mediated by a serine protease. Serine protease inhibitors that block CIAP1 cleavage inhibit p53-dependent apoptosis. Furthermore, activation of the p53 protein increases the transcription of the HTRA2 gene, which encodes a serine protease that interacts with CIAP1 and potentiates apoptosis. Therefore mammalian p53 protein activates apoptosis through a novel pathway functionally similar to that in Drosophila, which involves HTRA2 and subsequent inhibition of CIAP1 by cleavage [].; GO: 0005622 intracellular; PDB: 3HL5_B 3UW5_A 3CM7_A 1G3F_A 1G73_C 3G76_G 3CM2_C 2VSL_A 2OPZ_B 3CLX_A ....
Probab=20.28  E-value=1.2e+02  Score=20.50  Aligned_cols=39  Identities=21%  Similarity=0.269  Sum_probs=26.2

Q ss_pred             hhCCCccccCCceeeccccCCCcC----CHHHHHHHHHhhCCCc
Q 028455          164 QRKKTYTDTANFTLRCGVCQIGVI----GQKEAVEHAQATGHVN  203 (208)
Q Consensus       164 ~~~~~~t~t~~~~~~C~~c~~~~~----g~~~a~~ha~~tgH~~  203 (208)
                      ++.=|||.+ ...++|-.||..+.    ++.-.++|.+..-...
T Consensus        25 ~aGFyy~~~-~d~v~C~~C~~~l~~w~~~Ddp~~~H~~~sp~C~   67 (70)
T PF00653_consen   25 RAGFYYTGT-GDRVRCFYCGLELDNWEPNDDPWEEHKRHSPNCP   67 (70)
T ss_dssp             HTTEEEESS-TTEEEETTTTEEEES-STT--HHHHHHHHSTTBH
T ss_pred             HCCCEEcCC-CCEEEEeccCCEEeCCCCCCCHHHHHHHHCcCCe
Confidence            455667766 78899999999884    4455677877554443


No 72 
>PF08782 c-SKI_SMAD_bind:  c-SKI Smad4 binding domain;  InterPro: IPR014890 c-SKI is an oncoprotein that inhibits TGF-beta signalling through interaction with Smad proteins []. This protein binds to Smad4 [].; GO: 0005634 nucleus; PDB: 1MR1_C.
Probab=20.16  E-value=38  Score=25.22  Aligned_cols=25  Identities=20%  Similarity=0.235  Sum_probs=13.4

Q ss_pred             CCccccCCceeeccccCCCcCCHHH
Q 028455          167 KTYTDTANFTLRCGVCQIGVIGQKE  191 (208)
Q Consensus       167 ~~~t~t~~~~~~C~~c~~~~~g~~~  191 (208)
                      ..|+...+.-|+|.+|+..|.-++=
T Consensus        19 ~lY~~~~a~CI~C~~C~~~FsP~kF   43 (96)
T PF08782_consen   19 ELYSSPNAKCIECLECRGMFSPQKF   43 (96)
T ss_dssp             GG--STT---EEETTT--EE-HHHH
T ss_pred             hhcCCCCCCceEcccCCCEeCCcCE
Confidence            3588888999999999988876653


No 73 
>PF14300 DUF4375:  Domain of unknown function (DUF4375); PDB: 3VJZ_A.
Probab=20.12  E-value=55  Score=24.74  Aligned_cols=18  Identities=28%  Similarity=0.549  Sum_probs=15.0

Q ss_pred             HHHHHHHHHHHhcChhcc
Q 028455           33 PELRQVIAATVASDPVKY   50 (208)
Q Consensus        33 ~~lR~~va~~I~~np~~y   50 (208)
                      ..+=..++.||++||+.|
T Consensus       106 e~~~~l~~~Yv~~h~~~F  123 (123)
T PF14300_consen  106 EDLTELLARYVREHPEKF  123 (123)
T ss_dssp             HHHHHHHHHHHHHTHHHH
T ss_pred             cHHHHHHHHHHHHCHhhC
Confidence            466778899999999976


No 74 
>PF10571 UPF0547:  Uncharacterised protein family UPF0547;  InterPro: IPR018886  This domain may well be a type of zinc-finger as it carries two pairs of highly conserved cysteine residues though with no accompanying histidines. Several members are annotated as putative helicases. 
Probab=20.06  E-value=51  Score=18.48  Aligned_cols=12  Identities=17%  Similarity=0.293  Sum_probs=8.8

Q ss_pred             eeeccccCCCcC
Q 028455          176 TLRCGVCQIGVI  187 (208)
Q Consensus       176 ~~~C~~c~~~~~  187 (208)
                      ..+|..||+.|.
T Consensus        14 ~~~Cp~CG~~F~   25 (26)
T PF10571_consen   14 AKFCPHCGYDFE   25 (26)
T ss_pred             cCcCCCCCCCCc
Confidence            346999998874


Done!