Query         033920
Match_columns 109
No_of_seqs    94 out of 96
Neff          3.5 
Searched_HMMs 46136
Date          Fri Mar 29 07:47:50 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/033920.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/033920hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF09597 IGR:  IGR protein moti 100.0   5E-33 1.1E-37  180.6   5.3   57   39-98      1-57  (57)
  2 PF00536 SAM_1:  SAM domain (St  98.4 5.6E-07 1.2E-11   56.1   4.4   58   37-96      6-64  (64)
  3 cd00166 SAM Sterile alpha moti  98.2 2.8E-06 6.2E-11   51.3   4.4   58   37-96      5-63  (63)
  4 smart00454 SAM Sterile alpha m  97.9 2.9E-05 6.3E-10   47.0   4.7   59   37-97      7-67  (68)
  5 PF07647 SAM_2:  SAM domain (St  97.9 1.7E-05 3.7E-10   49.5   3.6   59   36-96      6-66  (66)
  6 KOG4384 Uncharacterized SAM do  96.5  0.0053 1.1E-07   52.4   5.4   59   37-97    216-276 (361)
  7 KOG0196 Tyrosine kinase, EPH (  84.9     2.3   5E-05   40.6   5.9   66   33-100   920-987 (996)
  8 KOG4374 RNA-binding protein Bi  80.4     2.8 6.1E-05   33.9   4.1   59   37-97    152-211 (216)
  9 PF01031 Dynamin_M:  Dynamin ce  78.8       3 6.5E-05   32.9   3.8   76   22-97     41-123 (295)
 10 PF07524 Bromo_TP:  Bromodomain  78.4     2.4 5.3E-05   27.5   2.7   41   39-83     36-76  (77)
 11 KOG3678 SARM protein (with ste  76.5     4.6 9.9E-05   37.3   4.7   60   34-95    465-526 (832)
 12 PF15652 Tox-SHH:  HNH/Endo VII  69.4     5.8 0.00013   28.7   3.0   38   56-96     62-99  (100)
 13 PF00730 HhH-GPD:  HhH-GPD supe  65.8     9.1  0.0002   25.4   3.2   34   63-96     29-66  (108)
 14 COG3272 Uncharacterized conser  57.6     6.7 0.00014   30.6   1.6   21   80-100   101-121 (163)
 15 PF01152 Bac_globin:  Bacterial  56.9      18  0.0004   24.5   3.5   37   62-98     80-116 (120)
 16 KOG4375 Scaffold protein Shank  56.2      16 0.00034   30.6   3.6   55   36-92    212-267 (272)
 17 COG1603 RPP1 RNase P/RNase MRP  54.8      25 0.00054   28.5   4.5   76    6-106   147-226 (229)
 18 TIGR02527 dot_icm_IcmQ Dot/Icm  53.0     9.9 0.00021   30.2   1.9   27   38-64     16-42  (182)
 19 PF10454 DUF2458:  Protein of u  50.8      32 0.00069   25.8   4.3   28   54-81     92-120 (150)
 20 PRK10308 3-methyl-adenine DNA   50.6      21 0.00045   28.9   3.4   37   64-100   158-194 (283)
 21 PF12836 HHH_3:  Helix-hairpin-  50.2      13 0.00029   23.4   1.9   32   69-102    10-42  (65)
 22 PF14520 HHH_5:  Helix-hairpin-  49.5      59  0.0013   19.9   4.7   41   52-92     16-58  (60)
 23 PTZ00096 40S ribosomal protein  48.9     9.6 0.00021   29.1   1.2   26   64-90     22-47  (143)
 24 cd00454 Trunc_globin Truncated  48.8      27 0.00058   23.4   3.3   38   62-99     76-113 (116)
 25 PF09475 Dot_icm_IcmQ:  Dot/Icm  48.7     9.5 0.00021   30.2   1.2   26   38-63     16-41  (179)
 26 PRK10361 DNA recombination pro  47.7      14 0.00031   32.7   2.3   49   35-83    381-432 (475)
 27 smart00460 TGc Transglutaminas  44.7      19 0.00041   21.5   1.9   13   74-86     19-31  (68)
 28 PRK00024 hypothetical protein;  44.6      44 0.00096   26.2   4.4   46   47-92     40-86  (224)
 29 PF01841 Transglut_core:  Trans  44.4      33 0.00071   22.2   3.1   37   33-84     37-74  (113)
 30 PRK11639 zinc uptake transcrip  43.6      29 0.00063   25.8   3.1   23   72-94     14-37  (169)
 31 cd00923 Cyt_c_Oxidase_Va Cytoc  43.4      33 0.00072   25.0   3.2   30   55-84     70-99  (103)
 32 PF08955 BofC_C:  BofC C-termin  42.7      22 0.00049   24.3   2.2   31   55-96     43-73  (75)
 33 smart00478 ENDO3c endonuclease  42.4      50  0.0011   23.1   4.0   37   61-97     21-61  (149)
 34 PF11328 DUF3130:  Protein of u  41.7      19 0.00041   25.7   1.7   35   38-80     42-76  (90)
 35 PRK13482 DNA integrity scannin  41.5      41 0.00089   28.9   4.0   56   41-97    287-344 (352)
 36 COG1577 ERG12 Mevalonate kinas  40.7      52  0.0011   27.4   4.4   56   38-95    201-259 (307)
 37 PF04362 Iron_traffic:  Bacteri  40.2      33 0.00072   24.2   2.8   50   44-98     24-76  (88)
 38 PF07487 SopE_GEF:  SopE GEF do  39.5      19 0.00041   28.2   1.6   31   64-105    62-92  (165)
 39 PF12826 HHH_2:  Helix-hairpin-  39.2      31 0.00066   21.8   2.3   40   55-94     17-57  (64)
 40 PF06320 GCN5L1:  GCN5-like pro  37.9      34 0.00074   24.7   2.6   32   48-79     57-88  (121)
 41 PF15144 DUF4576:  Domain of un  37.9      34 0.00073   24.3   2.5   22   36-57     43-64  (88)
 42 PF02284 COX5A:  Cytochrome c o  37.8      36 0.00078   25.0   2.7   30   55-84     73-102 (108)
 43 PRK13766 Hef nuclease; Provisi  35.8      78  0.0017   28.2   5.0   53   42-94    716-769 (773)
 44 PRK05408 oxidative damage prot  35.6      61  0.0013   23.0   3.5   49   45-98     25-76  (90)
 45 PF03457 HA:  Helicase associat  34.6      62  0.0013   20.0   3.1   39   64-102     8-56  (68)
 46 PTZ00418 Poly(A) polymerase; P  33.3      48   0.001   30.3   3.3   71   37-107    72-147 (593)
 47 KOG1170 Diacylglycerol kinase   32.9      18 0.00039   35.1   0.6   60   32-95    996-1058(1099)
 48 KOG2841 Structure-specific end  32.9      75  0.0016   26.4   4.1   53   39-92    193-247 (254)
 49 PRK04038 rps19p 30S ribosomal   32.0      32  0.0007   25.9   1.7   24   64-88     14-37  (134)
 50 smart00540 LEM in nuclear memb  31.1      64  0.0014   19.9   2.7   27   72-98     12-43  (44)
 51 PF13518 HTH_28:  Helix-turn-he  30.4      61  0.0013   18.4   2.4   23   72-97     16-38  (52)
 52 PRK00558 uvrC excinuclease ABC  30.3      77  0.0017   28.5   4.1   55   39-93    541-596 (598)
 53 COG1623 Predicted nucleic-acid  30.0      50  0.0011   28.6   2.7   45   41-85    293-339 (349)
 54 COG0122 AlkA 3-methyladenine D  28.8      56  0.0012   26.6   2.7   59   37-99    118-182 (285)
 55 PRK09462 fur ferric uptake reg  28.3      53  0.0011   23.5   2.3   23   72-94      5-28  (148)
 56 COG5457 Uncharacterized conser  28.1      60  0.0013   21.5   2.3   28   64-91     32-59  (63)
 57 PF10281 Ish1:  Putative stress  27.5      61  0.0013   18.6   2.1   21   75-95     13-37  (38)
 58 PF13812 PPR_3:  Pentatricopept  26.5      53  0.0011   16.7   1.5   19   62-81     15-33  (34)
 59 PF08349 DUF1722:  Protein of u  26.3      50  0.0011   23.1   1.8   23   78-100    64-86  (117)
 60 cd00056 ENDO3c endonuclease II  25.0      92   0.002   21.9   3.0   39   63-101    32-73  (158)
 61 KOG0005 Ubiquitin-like protein  24.8      57  0.0012   22.2   1.8   14   77-90     33-46  (70)
 62 PF14527 LAGLIDADG_WhiA:  WhiA   24.6      25 0.00054   24.1   0.0   21   37-59     65-85  (93)
 63 PF11899 DUF3419:  Protein of u  24.0 1.2E+02  0.0025   25.9   3.9   49   38-88    159-214 (380)
 64 KOG4374 RNA-binding protein Bi  23.5      59  0.0013   26.4   1.9   53   36-92    117-173 (216)
 65 COG4352 RPL13 Ribosomal protei  23.0      58  0.0012   24.2   1.6   34   44-87     55-88  (113)
 66 PF10305 Fmp27_SW:  RNA pol II   22.5      55  0.0012   22.8   1.4   16   36-51     80-95  (103)
 67 PF13871 Helicase_C_4:  Helicas  22.4 1.7E+02  0.0037   24.1   4.5   44   33-80    227-271 (278)
 68 COG3179 Predicted chitinase [G  22.2 1.2E+02  0.0026   24.6   3.4   42   52-93      8-51  (206)
 69 TIGR00608 radc DNA repair prot  22.1   2E+02  0.0042   22.7   4.6   41   49-89     33-77  (218)
 70 PF10330 Stb3:  Putative Sin3 b  21.8      65  0.0014   23.1   1.7   16   78-93     38-54  (92)
 71 TIGR03019 pepcterm_femAB FemAB  21.8 1.8E+02  0.0039   23.1   4.3   59   37-95    127-191 (330)
 72 PF05577 Peptidase_S28:  Serine  21.8      72  0.0016   26.3   2.2   49   32-80    147-202 (434)
 73 PF00633 HHH:  Helix-hairpin-he  21.7 1.1E+02  0.0025   17.0   2.4   27   64-90      2-29  (30)
 74 PF12972 NAGLU_C:  Alpha-N-acet  21.4 2.8E+02  0.0062   22.2   5.5   58   49-107   124-189 (267)
 75 smart00611 SEC63 Domain of unk  21.2      89  0.0019   24.4   2.5   35   61-95    172-207 (312)
 76 COG0735 Fur Fe2+/Zn2+ uptake r  21.0      92   0.002   22.7   2.4   24   72-95      9-33  (145)
 77 COG1305 Transglutaminase-like   20.9 1.2E+02  0.0027   22.4   3.1   37   33-83    180-216 (319)
 78 TIGR01025 rpsS_arch ribosomal   20.8      52  0.0011   24.8   1.1   26   64-90     12-37  (135)
 79 PF14304 CSTF_C:  Transcription  20.7      67  0.0015   20.2   1.4   34   64-99     11-44  (46)
 80 PF15368 BioT2:  Spermatogenesi  20.3 1.4E+02   0.003   23.6   3.3   35   36-74    126-164 (170)

No 1  
>PF09597 IGR:  IGR protein motif;  InterPro: IPR019083  This entry is found in fungal and plant proteins and contains a conserved IGR motif. Its function is unknown. 
Probab=99.98  E-value=5e-33  Score=180.63  Aligned_cols=57  Identities=46%  Similarity=0.853  Sum_probs=56.2

Q ss_pred             HHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhc
Q 033920           39 IPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLG   98 (109)
Q Consensus        39 V~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~G   98 (109)
                      |+|||++|||||++|++|||+   +|++||+++|.+||++||||++|||||+|+||||+|
T Consensus         1 V~tFL~~IGR~~~~~~~kf~~---~w~~lf~~~s~~LK~~GIp~r~RryiL~~~ek~r~G   57 (57)
T PF09597_consen    1 VETFLKLIGRGCEEHAEKFES---DWEKLFTTSSKQLKELGIPVRQRRYILRWREKYRQG   57 (57)
T ss_pred             CHHHHHHHcccHHHHHHHHHH---HHHHHHhcCHHHHHHCCCCHHHHHHHHHHHHHHhCc
Confidence            799999999999999999998   999999999999999999999999999999999998


No 2  
>PF00536 SAM_1:  SAM domain (Sterile alpha motif);  InterPro: IPR021129 The sterile alpha motif (SAM) domain is a putative protein interaction module present in a wide variety of proteins [] involved in many biological processes. The SAM domain, which spans around 70 residues, is found in diverse eukaryotic organisms []. SAM domains have been shown to homo- and hetero-oligomerise, forming multiple self-association architectures and also binding to various non-SAM domain-containing proteins [], although with a low affinity constant []. SAM domains also appear to possess the ability to bind RNA []. Smaug, a protein that helps to establish a morphogen gradient in Drosophila embryos by repressing the translation of nanos (nos) mRNA, binds to the 3' untranslated region (UTR) of nos mRNA via two similar hairpin structures. The 3D crystal structure of the Smaug RNA-binding region shows a cluster of positively charged residues on the Smaug-SAM domain, which could be the RNA-binding surface. This electropositive potential is unique among all previously determined SAM-domain structures and is conserved among Smaug-SAM homologs. These results suggest that the SAM domain might have a primary role in RNA binding.  Structural analyses show that the SAM domain is arranged in a small five-helix bundle with two large interfaces []. In the case of the SAM domain of EphB2, each of these interfaces is able to form dimers. The presence of these two distinct inter-monomer binding surfaces suggests that SAM could form extended polymeric structures []. This entry represents type 1 SAM domains. ; PDB: 2KIV_A 3HIL_B 3KKA_A 3K1R_B 3SEN_B 3SEI_B 1V85_A 2KE7_A 2EAM_A 1WWV_A ....
Probab=98.39  E-value=5.6e-07  Score=56.06  Aligned_cols=58  Identities=31%  Similarity=0.461  Sum_probs=51.8

Q ss_pred             CCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCC-CchhhhHHhhhhHhhh
Q 033920           37 VGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGI-PCKHRKLILKHTHKYR   96 (109)
Q Consensus        37 ~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~ekyR   96 (109)
                      -+|.+||+.||  +++|++.|+...=|.+.|+.++...|+++|| ++-+|+.|++-.+++|
T Consensus         6 ~~V~~WL~~~~--l~~y~~~F~~~~i~g~~L~~lt~~dL~~lgi~~~ghr~ki~~~i~~Lk   64 (64)
T PF00536_consen    6 EDVSEWLKSLG--LEQYAENFEKNYIDGEDLLSLTEEDLEELGITKLGHRKKILRAIQKLK   64 (64)
T ss_dssp             HHHHHHHHHTT--GGGGHHHHHHTTSSHHHHTTSCHHHHHHTT-SSHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHCC--CHHHHHHHHcCCchHHHHHhcCHHHHHHcCCCCHHHHHHHHHHHHHhC
Confidence            47999999997  9999999977677899999999999999999 5599999999998876


No 3  
>cd00166 SAM Sterile alpha motif.; Widespread domain in signalling and nuclear proteins. In EPH-related tyrosine kinases, appears to mediate cell-cell initiated signal transduction via the binding of SH2-containing proteins to a conserved tyrosine that is phosphorylated. In many cases mediates homodimerization.
Probab=98.20  E-value=2.8e-06  Score=51.25  Aligned_cols=58  Identities=31%  Similarity=0.404  Sum_probs=50.4

Q ss_pred             CCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCC-chhhhHHhhhhHhhh
Q 033920           37 VGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIP-CKHRKLILKHTHKYR   96 (109)
Q Consensus        37 ~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp-~r~RKyIL~~~ekyR   96 (109)
                      .+|.+||+.+|  +++|++.|...==|.+.|..++...|+++||+ +-+|+.|++..++++
T Consensus         5 ~~V~~wL~~~~--~~~y~~~f~~~~i~g~~L~~l~~~dL~~lgi~~~g~r~~i~~~i~~l~   63 (63)
T cd00166           5 EDVAEWLESLG--LGQYADNFRENGIDGDLLLLLTEEDLKELGITLPGHRKKILKAIQKLK   63 (63)
T ss_pred             HHHHHHHHHcC--hHHHHHHHHHcCCCHHHHhHCCHHHHHHcCCCCHHHHHHHHHHHHHcC
Confidence            48999999997  89999999865338999999999999999995 499999999887653


No 4  
>smart00454 SAM Sterile alpha motif. Widespread domain in signalling and nuclear proteins. In EPH-related tyrosine kinases, appears to mediate cell-cell initiated signal transduction via the binding of SH2-containing proteins to a conserved tyrosine that is phosphorylated. In many cases mediates homodimerisation.
Probab=97.89  E-value=2.9e-05  Score=46.98  Aligned_cols=59  Identities=32%  Similarity=0.401  Sum_probs=50.7

Q ss_pred             CCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhc-hHHHHhcCC-CchhhhHHhhhhHhhhh
Q 033920           37 VGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTR-TLKLKKLGI-PCKHRKLILKHTHKYRL   97 (109)
Q Consensus        37 ~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~-S~~LKelGI-p~r~RKyIL~~~ekyR~   97 (109)
                      .+|..||..+|  +.+|++.|...-=+-..|+..+ ...|+++|| ++-+|+.|++..+++|.
T Consensus         7 ~~v~~wL~~~g--~~~y~~~f~~~~i~g~~ll~~~~~~~l~~lgi~~~~~r~~ll~~i~~l~~   67 (68)
T smart00454        7 ESVADWLESIG--LEQYADNFRKNGIDGALLLLLTSEEDLKELGITKLGHRKKILKAIQKLKD   67 (68)
T ss_pred             HHHHHHHHHCC--hHHHHHHHHHCCCCHHHHHhcChHHHHHHcCCCcHHHHHHHHHHHHHHHh
Confidence            48999999997  9999999987533446788888 899999999 89999999999988874


No 5  
>PF07647 SAM_2:  SAM domain (Sterile alpha motif);  InterPro: IPR011510 The sterile alpha motif (SAM) domain is a putative protein interaction module present in a wide variety of proteins [] involved in many biological processes. The SAM domain, which spans around 70 residues, is found in diverse eukaryotic organisms []. SAM domains have been shown to homo- and hetero-oligomerise, forming multiple self-association architectures and also binding to various non-SAM domain-containing proteins [], although with a low affinity constant []. SAM domains also appear to possess the ability to bind RNA []. Smaug, a protein that helps to establish a morphogen gradient in Drosophila embryos by repressing the translation of nanos (nos) mRNA, binds to the 3' untranslated region (UTR) of nos mRNA via two similar hairpin structures. The 3D crystal structure of the Smaug RNA-binding region shows a cluster of positively charged residues on the Smaug-SAM domain, which could be the RNA-binding surface. This electropositive potential is unique among all previously determined SAM-domain structures and is conserved among Smaug-SAM homologs. These results suggest that the SAM domain might have a primary role in RNA binding.  Structural analyses show that the SAM domain is arranged in a small five-helix bundle with two large interfaces []. In the case of the SAM domain of EphB2, each of these interfaces is able to form dimers. The presence of these two distinct inter-monomer binding surfaces suggests that SAM could form extended polymeric structures []. This entry represents a second domain related to the SAM domain. ; GO: 0005515 protein binding; PDB: 1B0X_A 1X9X_B 1OW5_A 1V38_A 3BS7_A 3BS5_A 3TAD_A 3TAC_B 2K60_A 2DL0_A ....
Probab=97.88  E-value=1.7e-05  Score=49.48  Aligned_cols=59  Identities=24%  Similarity=0.420  Sum_probs=50.5

Q ss_pred             cCCHHHHHhhhccchhHHHHhhhhhh-hhHHHHhhhchHHHHhcCC-CchhhhHHhhhhHhhh
Q 033920           36 KVGIPEFLNGIGKGVETHSAKLESEI-GDFQRLLVTRTLKLKKLGI-PCKHRKLILKHTHKYR   96 (109)
Q Consensus        36 ~~dV~tFL~~IGRg~~eha~Kfes~~-gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~ekyR   96 (109)
                      ..+|.+||..+|  +.+|++.|...= ...+.|..++...|+++|| +..+|+.||+-.++.|
T Consensus         6 ~~~v~~WL~~~g--l~~y~~~f~~~~i~g~~~L~~l~~~~L~~lGI~~~~~r~kll~~i~~Lk   66 (66)
T PF07647_consen    6 PEDVAEWLKSLG--LEQYADNFRENGIDGLEDLLQLTEEDLKELGITNLGHRRKLLSAIQELK   66 (66)
T ss_dssp             HHHHHHHHHHTT--CGGGHHHHHHTTCSHHHHHTTSCHHHHHHTTTTHHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHCC--cHHHHHHHHHcCCcHHHHHhhCCHHHHHHcCCCCHHHHHHHHHHHHHcC
Confidence            458999999996  899999999733 4447799999999999999 8899999999887654


No 6  
>KOG4384 consensus Uncharacterized SAM domain protein [General function prediction only]
Probab=96.50  E-value=0.0053  Score=52.40  Aligned_cols=59  Identities=25%  Similarity=0.320  Sum_probs=50.3

Q ss_pred             CCHHHHHhhhccchhHHHHhhhh-hhhhHHHHhhhchHHHHhcCC-CchhhhHHhhhhHhhhh
Q 033920           37 VGIPEFLNGIGKGVETHSAKLES-EIGDFQRLLVTRTLKLKKLGI-PCKHRKLILKHTHKYRL   97 (109)
Q Consensus        37 ~dV~tFL~~IGRg~~eha~Kfes-~~gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~ekyR~   97 (109)
                      -.|+++|.+||  +++|.++|-. --.+++.+=..+-..|-++|| .+.+||.||.-+|.++.
T Consensus       216 ~~~~ewL~~i~--le~y~~~~L~nGYd~le~~k~i~e~dL~~lgI~nP~Hr~kLL~av~~~~e  276 (361)
T KOG4384|consen  216 KSLEEWLRRIG--LEEYIETLLENGYDTLEDLKDITEEDLEELGIDNPDHRKKLLSAVELLKE  276 (361)
T ss_pred             hHHHHHHHHhh--HHHHHHHHHHcchHHHHHHHhccHHHHHHhCCCCHHHHHHHHHHHHHHHh
Confidence            47999999998  9999999854 224577777888899999999 99999999999988774


No 7  
>KOG0196 consensus Tyrosine kinase, EPH (ephrin) receptor family [Signal transduction mechanisms]
Probab=84.92  E-value=2.3  Score=40.65  Aligned_cols=66  Identities=18%  Similarity=0.267  Sum_probs=56.7

Q ss_pred             ccccCCHHHHHhhhccchhHHHHhhhhhh-hhHHHHhhhchHHHHhcCC-CchhhhHHhhhhHhhhhccc
Q 033920           33 YIVKVGIPEFLNGIGKGVETHSAKLESEI-GDFQRLLVTRTLKLKKLGI-PCKHRKLILKHTHKYRLGLW  100 (109)
Q Consensus        33 ~~~~~dV~tFL~~IGRg~~eha~Kfes~~-gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~ekyR~Gl~  100 (109)
                      +..-..|.++|..|+  |..+.+.|-++= ...+.+..++.+.|+.+|| =+-+-|.||.-.|-.|.+.-
T Consensus       920 ~~~f~sv~~WL~aIk--m~rY~~~F~~ag~~s~~~V~q~s~eDl~~~Gitl~GhqkkIl~SIq~m~~q~~  987 (996)
T KOG0196|consen  920 FTPFRSVGDWLEAIK--MGRYKEHFAAAGYTSFEDVAQMSAEDLLRLGITLAGHQKKILSSIQAMRAQMR  987 (996)
T ss_pred             CcccCCHHHHHHHhh--hhHHHHHHHhcCcccHHHHHhhhHHHHHhhceeecchhHHHHHHHHHHHHHhc
Confidence            445689999999998  999999998754 7899999999999999999 67788889988888887763


No 8  
>KOG4374 consensus RNA-binding protein Bicaudal-C [RNA processing and modification]
Probab=80.40  E-value=2.8  Score=33.86  Aligned_cols=59  Identities=27%  Similarity=0.237  Sum_probs=49.5

Q ss_pred             CCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCC-CchhhhHHhhhhHhhhh
Q 033920           37 VGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGI-PCKHRKLILKHTHKYRL   97 (109)
Q Consensus        37 ~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~ekyR~   97 (109)
                      -+|--+|...|  |.++-.-|+-.==||++|..++...||.+|| +.--||.|+..-++-|.
T Consensus       152 ~~vl~~L~~lg--lg~y~~~f~~~evd~~~l~~lte~dlk~~gi~~~GpRkKi~~A~~~~r~  211 (216)
T KOG4374|consen  152 EGVLMELGILG--LGAYWKMFEAIEVDMDNLRLLTEEDLKDMGINSVGPRKKILCAIGKLRR  211 (216)
T ss_pred             chHHHHHHHHh--HHHHHHHHHHHHHHHHHHHhcccchhhhhcccccCcchhhhhhhhcccc
Confidence            34556677776  8888888876447999999999999999999 99999999998887664


No 9  
>PF01031 Dynamin_M:  Dynamin central region;  InterPro: IPR000375 Dynamin is a microtubule-associated force-producing protein of 100 Kd which is involved in the production of microtubule bundles. At the N terminus of dynamin is a GTPase domain (see IPR001401 from INTERPRO), and at the C terminus is a PH domain (see IPR001849 from INTERPRO). Between these two domains lies a central region of unknown function, which this entry represents.; GO: 0005525 GTP binding; PDB: 3ZVR_A 2AKA_B 2X2F_D 2X2E_D 3SNH_A 3ZYS_D 3ZYC_D 1JWY_B 1JX2_B 3SZR_A ....
Probab=78.80  E-value=3  Score=32.88  Aligned_cols=76  Identities=21%  Similarity=0.291  Sum_probs=54.5

Q ss_pred             cccccCCCC-CCccccCCHHHHHhhhccchhHHHHhh-hhhhhhHHHHhhhchHHHHhcCC-Cc----hhhhHHhhhhHh
Q 033920           22 SRFFTSKAS-NQYIVKVGIPEFLNGIGKGVETHSAKL-ESEIGDFQRLLVTRTLKLKKLGI-PC----KHRKLILKHTHK   94 (109)
Q Consensus        22 ~r~fs~~~~-~p~~~~~dV~tFL~~IGRg~~eha~Kf-es~~gdw~~Lf~~~S~~LKelGI-p~----r~RKyIL~~~ek   94 (109)
                      ..||+..|. +......+++..-+.+.+-..+|..+- ++-....++.+.-...+|+.+|- ++    .++.||+....+
T Consensus        41 ~~fF~~~~~~~~~~~~~G~~~L~~~L~~~L~~~I~~~LP~l~~~I~~~l~~~~~eL~~lG~~~~~~~~~~~~~l~~~~~~  120 (295)
T PF01031_consen   41 KEFFSNHPWYSSPADRCGTPALRKRLSELLVEHIRKSLPSLKSEIQKKLQEAEKELKRLGPPRPETPEEQRAYLLQIISK  120 (295)
T ss_dssp             HHHHHHSTTTGGGGGGSSHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHTHHHCSSSCHHHHHHHHHHHHHH
T ss_pred             HHHHhcccccCCcccccchHHHHHHHHHHHHHHHHHhCcHHHHHHHHHHHHHHHHHHHhCCCCCCCHHHHHHHHHHHHHH
Confidence            456666541 122235667666666666689998764 44338899999999999999988 33    688899999888


Q ss_pred             hhh
Q 033920           95 YRL   97 (109)
Q Consensus        95 yR~   97 (109)
                      |-+
T Consensus       121 f~~  123 (295)
T PF01031_consen  121 FSR  123 (295)
T ss_dssp             HHH
T ss_pred             HHH
Confidence            864


No 10 
>PF07524 Bromo_TP:  Bromodomain associated;  InterPro: IPR006565 This bromodomain is found in eukaryotic transcription factors and PHD domain containing proteins (IPR001965 from INTERPRO). The tandem PHD finger-bromodomain is found in many chromatin-associated proteins. It is involved in gene silencing by the human co-repressor KRAB-associated protein 1 (KAP1). The tandem PHD finger-bromodomain of KAP1 has a distinct structure that joins the two protein modules. The first helix, alpha(Z), of an atypical bromodomain forms the central hydrophobic core that anchors the other three helices of the bromodomain on one side and the zinc binding PHD finger on the other [].  The Rap1 GTPase-activating protein, Sipa1, is modulated by the cellular bromodomain protein, Brd4. Brd4 belongs to the BET family and is a multifunctional protein involved in transcription, replication, the signal transduction pathway, and cell cycle progression. All of these functions are linked to its association with acetylated chromatin. It has tandem bromodomains []. The dysregulation of the Brd4-associated pathways may play an important role in breast cancer progression []. Bovine papillomavirus type 1 E2 also binds to chromosomes in a complex with Brd4. Interaction with Brd4 is additionally important for E2-mediated transcriptional regulation [, ]. 
Probab=78.35  E-value=2.4  Score=27.47  Aligned_cols=41  Identities=15%  Similarity=0.340  Sum_probs=30.2

Q ss_pred             HHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCCch
Q 033920           39 IPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIPCK   83 (109)
Q Consensus        39 V~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp~r   83 (109)
                      +..|+..||+.+..+++..--...+..++.    ..|+++||.+.
T Consensus        36 ~~~yl~~l~~~~~~~ae~~gRt~~~~~Dv~----~al~~~gi~v~   76 (77)
T PF07524_consen   36 LQRYLQELGRTAKRYAEHAGRTEPNLQDVE----QALEEMGISVN   76 (77)
T ss_pred             HHHHHHHHHHHHHHHHHHcCCCCCCHHHHH----HHHHHhCCCCC
Confidence            458999999999999986554334455553    67899999764


No 11 
>KOG3678 consensus SARM protein (with sterile alpha and armadillo motifs) [Extracellular structures]
Probab=76.53  E-value=4.6  Score=37.30  Aligned_cols=60  Identities=23%  Similarity=0.299  Sum_probs=48.3

Q ss_pred             cccCCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHh-cCC-CchhhhHHhhhhHhh
Q 033920           34 IVKVGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKK-LGI-PCKHRKLILKHTHKY   95 (109)
Q Consensus        34 ~~~~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKe-lGI-p~r~RKyIL~~~eky   95 (109)
                      -.--||++++++||  -++|.+||....=|=+=|+.++-..||. .|+ .-=+||..||-.+..
T Consensus       465 Wt~AdVQ~WvkkIG--FeeY~EkFakQ~VDGDLLLqLTEndLk~DvGM~SGl~RKRFlRELqtL  526 (832)
T KOG3678|consen  465 WTCADVQYWVKKIG--FEEYVEKFAKQMVDGDLLLQLTENDLKHDVGMISGLHRKRFLRELQTL  526 (832)
T ss_pred             cchHHHHHHHHHhC--HHHHHHHHHHHhccchHHHhhhhhhhhhhhhhhhhhhHHHHHHHHHHH
Confidence            33569999999998  9999999998775555688999999984 577 777899888765543


No 12 
>PF15652 Tox-SHH:  HNH/Endo VII superfamily toxin with a SHH signature
Probab=69.39  E-value=5.8  Score=28.71  Aligned_cols=38  Identities=24%  Similarity=0.236  Sum_probs=31.7

Q ss_pred             hhhhhhhhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhhh
Q 033920           56 KLESEIGDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYR   96 (109)
Q Consensus        56 Kfes~~gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR   96 (109)
                      |+++   +.++=|...+++|-+.|||..+|+--|+..=||+
T Consensus        62 kw~t---~~~~Ef~~~~~eM~dAGV~~~~~~~~l~~~Ykyf   99 (100)
T PF15652_consen   62 KWST---TLQEEFNNSYREMFDAGVSKECRKKALKAQYKYF   99 (100)
T ss_pred             Cccc---hHHHHHHHHHHHHHHcCCCHHHHHHHHHHHHhhc
Confidence            3666   6778888889999999999999999998876664


No 13 
>PF00730 HhH-GPD:  HhH-GPD superfamily base excision DNA repair protein This entry corresponds to Endonuclease III This entry corresponds to Alkylbase DNA glycosidase;  InterPro: IPR003265 Endonuclease III (4.2.99.18 from EC) is a DNA repair enzyme which removes a number of damaged pyrimidines from DNA via its glycosylase activity and also cleaves the phosphodiester backbone at apurinic / apyrimidinic sites via a beta-elimination mechanism [, ]. The structurally related DNA glycosylase MutY recognises and excises the mutational intermediate 8-oxoguanine-adenine mispair []. The 3-D structures of Escherichia coli endonuclease III [] and catalytic domain of MutY [] have been determined. The structures contain two all-alpha domains: a sequence-continuous, six-helix domain (residues 22-132) and a Greek-key, four-helix domain formed by one N-terminal and three C-terminal helices (residues 1-21 and 133-211) together with the [Fe4S4] cluster. The cluster is bound entirely within the C-terminal loop by four cysteine residues with a ligation pattern Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys which is distinct from all other known Fe4S4 proteins. This structural motif is referred to as a [Fe4S4] cluster loop (FCL) []. Two DNA-binding motifs have been proposed, one at either end of the interdomain groove: the helix-hairpin-helix (HhH) and FCL motifs (see IPR003651 from INTERPRO). The primary role of the iron-sulphur cluster appears to involve positioning conserved basic residues for interaction with the DNA phosphate backbone by forming the loop of the FCL motif [, ].  The HhH-GPD domain gets its name from its hallmark helix-hairpin-helix and Gly/Pro rich loop followed by a conserved aspartate. This domain is found in a diverse range of structurally related DNA repair proteins that include: endonuclease III, 4.2.99.18 from EC and DNA glycosylase MutY, an A/G-specific adenine glycosylase. Both of these enzymes have a C-terminal iron-sulphur cluster loop (FCL). 
The methyl-CpG binding protein (MBD4) also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II 3.2.2.21 from EC, 8-oxoguanine DNA glycosylases and other members of the AlkA family.; GO: 0006284 base-excision repair; PDB: 3F0Z_A 3I0X_A 3F10_A 3I0W_A 3S6I_D 3N5N_Y 1PU7_A 1PU8_B 1PU6_B 1NGN_A ....
Probab=65.83  E-value=9.1  Score=25.36  Aligned_cols=34  Identities=18%  Similarity=0.242  Sum_probs=31.5

Q ss_pred             hHHHHhhhchHHHHhc----CCCchhhhHHhhhhHhhh
Q 033920           63 DFQRLLVTRTLKLKKL----GIPCKHRKLILKHTHKYR   96 (109)
Q Consensus        63 dw~~Lf~~~S~~LKel----GIp~r~RKyIL~~~ekyR   96 (109)
                      +++++..++..+|++.    |.+.+--+||..-++.+.
T Consensus        29 t~~~l~~~~~~el~~~i~~~G~~~~ka~~i~~~a~~~~   66 (108)
T PF00730_consen   29 TPEALAEASEEELRELIRPLGFSRRKAKYIIELARAIL   66 (108)
T ss_dssp             SHHHHHCSHHHHHHHHHTTSTSHHHHHHHHHHHHHHHH
T ss_pred             CHHHHHhCCHHHHHHHhhccCCCHHHHHHHHHHHHHhh
Confidence            4999999999999999    999888899999998887


No 14 
>COG3272 Uncharacterized conserved protein [Function unknown]
Probab=57.64  E-value=6.7  Score=30.62  Aligned_cols=21  Identities=19%  Similarity=0.306  Sum_probs=18.5

Q ss_pred             CCchhhhHHhhhhHhhhhccc
Q 033920           80 IPCKHRKLILKHTHKYRLGLW  100 (109)
Q Consensus        80 Ip~r~RKyIL~~~ekyR~Gl~  100 (109)
                      +...+|++++.|.|+||.|.-
T Consensus       101 L~s~er~~l~e~Ie~YR~G~~  121 (163)
T COG3272         101 LNSEERQELAELIESYRRGEQ  121 (163)
T ss_pred             hchHHHHHHHHHHHHHHcCCC
Confidence            467899999999999999973


No 15 
>PF01152 Bac_globin:  Bacterial-like globin;  InterPro: IPR001486 Globins are haem-containing proteins involved in binding and/or transporting oxygen. They belong to a very large and well studied family that is widely distributed in many organisms []. Globins have evolved from a common ancestor and can be divided into three groups: single-domain globins, and two types of chimeric globins, flavohaemoglobins and globin-coupled sensors. Bacteria have all three types of globins, while archaea lack flavohaemoglobins, and eukaryotes lack globin-coupled sensors []. Several functionally different haemoglobins can coexist in the same species. The major types of globins include:   Haemoglobin (Hb): tetramer of two alpha and two beta chains, although embryonic and foetal forms can substitute the alpha or beta chain for ones with higher oxygen affinity, such as gamma, delta, epsilon or zeta chains. Hb transports oxygen from lungs to other tissues in vertebrates []. Hb proteins are also present in unicellular organisms where they act as enzymes or sensors []. Myoglobin (Mb): monomeric protein responsible for oxygen storage in vertebrate muscle [].  Neuroglobin: a myoglobin-like haemoprotein expressed in vertebrate brain and retina, where it is involved in neuroprotection from damage due to hypoxia or ischemia []. Neuroglobin belongs to a branch of the globin family that diverged early in evolution.  Cytoglobin: an oxygen sensor expressed in multiple tissues. Related to neuroglobin []. Erythrocruorin: highly cooperative extracellular respiratory proteins found in annelids and arthropods that are assembled from as many as 180 subunits into hexagonal bilayers []. Leghaemoglobin (legHb or symbiotic Hb): occurs in the root nodules of leguminous plants, where it facilitates the diffusion of oxygen to symbiotic bacteroids in order to promote nitrogen fixation. Non-symbiotic haemoglobin (NsHb): occurs in non-leguminous plants, and can be over-expressed in stressed plants []. 
Flavohaemoglobins (FHb): chimeric, with an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD/FAD-binding domain. FHb provides protection against nitric oxide via its C-terminal domain, which transfers electrons to haem in the globin []. Globin-coupled sensors: chimeric, with an N-terminal myoglobin-like domain and a C-terminal domain that resembles the cytoplasmic signalling domain of bacterial chemoreceptors. They bind oxygen, and act to initiate an aerotactic response or regulate gene expression [, ].  Protoglobin: a single domain globin found in archaea that is related to the N-terminal domain of globin-coupled sensors []. Truncated 2/2 globin: lacks the first helix, giving it a 2-over-2 instead of the canonical 3-over-3 alpha-helical sandwich fold. Can be divided into three main groups (I, II and III) based on structural features [].   This entry represents a group of haemoglobin-like proteins found in eubacteria, cyanobacteria, protozoa, algae and plants, but not in animals or yeast. These proteins have a truncated 2-over-2 rather than the canonical 3-over-3 alpha-helical sandwich fold []. This entry includes:   HbN (or GlbN): a truncated haemoglobin-like protein that binds oxygen cooperatively with a very high affinity and a slow dissociation rate, which may exclude it from oxygen transport. It appears to be involved in bacterial nitric oxide detoxification and in nitrosative stress []. Cyanoglobin (or GlbN): a truncated haemoprotein found in cyanobacteria that has high oxygen affinity, and which appears to serve as part of a terminal oxidase, rather than as a respiratory pigment []. HbO (or GlbO): a truncated haemoglobin-like protein with a lower oxygen affinity than HbN. HbO associates with the bacterial cell membrane, where it significantly increases oxygen uptake over membranes lacking this protein. 
HbO appears to interact with a terminal oxidase, and could participate in an oxygen/electron-transfer process that facilitates oxygen transfer during aerobic metabolism []. Glb3: a nuclear-encoded truncated haemoglobin from plants that appears more closely related to HbO than HbN. Glb3 from Arabidopsis thaliana (Mouse-ear cress) exhibits an unusual concentration-independent binding of oxygen and carbon dioxide [].  ; GO: 0019825 oxygen binding, 0015671 oxygen transport; PDB: 2BKM_B 1UVY_A 1DLW_A 2XYK_B 2IG3_A 2GKM_B 1S61_A 1S56_B 1RTE_B 2GLN_A ....
Probab=56.87  E-value=18  Score=24.46  Aligned_cols=37  Identities=24%  Similarity=0.306  Sum_probs=31.9

Q ss_pred             hhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhc
Q 033920           62 GDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLG   98 (109)
Q Consensus        62 gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~G   98 (109)
                      ..++..+..=...|++.|+|...++.++...+.+|..
T Consensus        80 ~~f~~~~~~~~~al~~~~v~~~~~~~~~~~~~~~~~~  116 (120)
T PF01152_consen   80 EHFDRWLELLKQALDELGVPEELIDELLARLESLRDD  116 (120)
T ss_dssp             HHHHHHHHHHHHHHHHTTCTHHHHHHHHHHHHHHHHH
T ss_pred             HHHHHHHHHHHHHHHHhCCCHHHHHHHHHHHHHHHHH
Confidence            4577777777889999999999999999999988864


No 16 
>KOG4375 consensus Scaffold protein Shank and related SAM domain proteins [Signal transduction mechanisms]
Probab=56.21  E-value=16  Score=30.64  Aligned_cols=55  Identities=20%  Similarity=0.291  Sum_probs=43.4

Q ss_pred             cCCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCC-CchhhhHHhhhh
Q 033920           36 KVGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGI-PCKHRKLILKHT   92 (109)
Q Consensus        36 ~~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~   92 (109)
                      +.||.+.|.-++  +.||-++|.+-==|=.-|=..+..++.++|| -+-+|.-|=|-.
T Consensus       212 k~DV~dWLssl~--L~E~~~aF~d~eIdG~hLp~l~k~df~~LGVTRVgHRmnIerAL  267 (272)
T KOG4375|consen  212 KIDVNDWLSSLH--LIEYDDAFHDIEIDGKHLPLLRKLDFRGLGVTRVGHRMNIERAL  267 (272)
T ss_pred             cccHHHHHHhhh--hhhcchhhhhcccccchhhhcchhhhhcccchhhhhHHHHHHHH
Confidence            799999999997  9999999997212223455668889999999 899998875543


No 17 
>COG1603 RPP1 RNase P/RNase MRP subunit p30 [Translation, ribosomal structure and biogenesis]
Probab=54.83  E-value=25  Score=28.54  Aligned_cols=76  Identities=16%  Similarity=0.175  Sum_probs=49.5

Q ss_pred             HHHhhchhhhhhhc--CccccccCCCCCCccc--cCCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCC
Q 033920            6 IFNNAGANSMVAVS--GFSRFFTSKASNQYIV--KVGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIP   81 (109)
Q Consensus         6 ~~~~~~~~~~~~~~--~~~r~fs~~~~~p~~~--~~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp   81 (109)
                      ++++.|.++- ++.  ....+++..+.+|++.  .-||.+|++.+|=. +.+|.++.+                      
T Consensus       147 ~l~~lr~~lr-l~rk~~v~ivvtS~A~s~~elrsP~dv~sl~~~lG~e-~~ea~~~~~----------------------  202 (229)
T COG1603         147 LLSFLRSLLR-LARKYDVPIVVTSDAESPLELRSPRDVISLAKVLGLE-DDEAKKSLS----------------------  202 (229)
T ss_pred             HHHHHHHHHH-HHHhcCCCEEEeCCCCChhhhcChhhHHHHHHHhCCC-HHHHHHHHH----------------------
Confidence            4444444442 333  5566777777777776  46999999999822 234444433                      


Q ss_pred             chhhhHHhhhhHhhhhcccccCCCC
Q 033920           82 CKHRKLILKHTHKYRLGLWRPRAAP  106 (109)
Q Consensus        82 ~r~RKyIL~~~ekyR~Gl~~P~g~~  106 (109)
                       ...+.||+|..+.|.|...||...
T Consensus       203 -~~p~~iL~~~~~~~~~~i~~gv~~  226 (229)
T COG1603         203 -EYPRLILRNRNRIRDGFIVPGVGS  226 (229)
T ss_pred             -HhHHHHHHHhhhcCCceEEecccc
Confidence             234568888999999888887653


No 18 
>TIGR02527 dot_icm_IcmQ Dot/Icm secretion system protein IcmQ. Members of this protein family are the IcmQ component of Dot/Icm secretion systems, as found in obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favor calling this the Dot/Icm system. This protein was shown to be essential for translocation (PubMed:15661013).
Probab=53.02  E-value=9.9  Score=30.16  Aligned_cols=27  Identities=26%  Similarity=0.277  Sum_probs=23.6

Q ss_pred             CHHHHHhhhccchhHHHHhhhhhhhhH
Q 033920           38 GIPEFLNGIGKGVETHSAKLESEIGDF   64 (109)
Q Consensus        38 dV~tFL~~IGRg~~eha~Kfes~~gdw   64 (109)
                      +-..||+.||+++.+--+.|++.+|.=
T Consensus        16 eeSnFLRvIgKnL~eIRd~f~~~l~~~   42 (182)
T TIGR02527        16 DESLFLRNIGKKLIAIKDLFEEAIAAA   42 (182)
T ss_pred             hHHHHHHHHHHhHHHHHHHHHHHhccc
Confidence            678899999999999999999977543


No 19 
>PF10454 DUF2458:  Protein of unknown function (DUF2458);  InterPro: IPR018858  This entry represents a family of uncharacterised proteins. 
Probab=50.78  E-value=32  Score=25.82  Aligned_cols=28  Identities=21%  Similarity=0.464  Sum_probs=22.7

Q ss_pred             HHhhhhhh-hhHHHHhhhchHHHHhcCCC
Q 033920           54 SAKLESEI-GDFQRLLVTRTLKLKKLGIP   81 (109)
Q Consensus        54 a~Kfes~~-gdw~~Lf~~~S~~LKelGIp   81 (109)
                      .++|...| -.|.++-......|+++|||
T Consensus        92 L~~fD~kV~~a~~~m~~~~~~~L~~LgVP  120 (150)
T PF10454_consen   92 LDKFDEKVYKASKQMSKEQQAELKELGVP  120 (150)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHhcCCC
Confidence            35555555 67899999999999999998


No 20 
>PRK10308 3-methyl-adenine DNA glycosylase II; Provisional
Probab=50.55  E-value=21  Score=28.95  Aligned_cols=37  Identities=24%  Similarity=0.328  Sum_probs=34.3

Q ss_pred             HHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhccc
Q 033920           64 FQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLGLW  100 (109)
Q Consensus        64 w~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~Gl~  100 (109)
                      -+.|..++..+|++.|++-+--+||..-.+.+..|..
T Consensus       158 pe~La~~~~~eL~~~Gl~~~Ra~~L~~lA~~i~~g~l  194 (283)
T PRK10308        158 PERLAAADPQALKALGMPLKRAEALIHLANAALEGTL  194 (283)
T ss_pred             HHHHHcCCHHHHHHCCCCHHHHHHHHHHHHHHHcCCC
Confidence            8999999999999999998888999999999998865


No 21 
>PF12836 HHH_3:  Helix-hairpin-helix motif; PDB: 2EDU_A 2OCE_A 3BZK_A 3BZC_A 2DUY_A.
Probab=50.22  E-value=13  Score=23.42  Aligned_cols=32  Identities=28%  Similarity=0.397  Sum_probs=21.0

Q ss_pred             hhchHHHHhc-CCCchhhhHHhhhhHhhhhccccc
Q 033920           69 VTRTLKLKKL-GIPCKHRKLILKHTHKYRLGLWRP  102 (109)
Q Consensus        69 ~~~S~~LKel-GIp~r~RKyIL~~~ekyR~Gl~~P  102 (109)
                      +++..+|..+ ||..++=+.|+.++++.  |-|.=
T Consensus        10 ~as~~eL~~lpgi~~~~A~~Iv~~R~~~--G~f~s   42 (65)
T PF12836_consen   10 TASAEELQALPGIGPKQAKAIVEYREKN--GPFKS   42 (65)
T ss_dssp             TS-HHHHHTSTT--HHHHHHHHHHHHHH---S-SS
T ss_pred             cCCHHHHHHcCCCCHHHHHHHHHHHHhC--cCCCC
Confidence            5677788888 99888889999988776  55443


No 22 
>PF14520 HHH_5:  Helix-hairpin-helix domain; PDB: 3AUO_B 3AU6_A 3AU2_A 3B0X_A 3B0Y_A 1SZP_C 3LDA_A 1WCN_A 2JZB_B 2ZTC_A ....
Probab=49.49  E-value=59  Score=19.89  Aligned_cols=41  Identities=22%  Similarity=0.286  Sum_probs=29.3

Q ss_pred             HHHHhhhhh-hhhHHHHhhhchHHHHhc-CCCchhhhHHhhhh
Q 033920           52 THSAKLESE-IGDFQRLLVTRTLKLKKL-GIPCKHRKLILKHT   92 (109)
Q Consensus        52 eha~Kfes~-~gdw~~Lf~~~S~~LKel-GIp~r~RKyIL~~~   92 (109)
                      .-+.++-+. +.++++|...+-.+|.+. ||..++-.-|..+.
T Consensus        16 ~~a~~L~~~G~~t~~~l~~a~~~~L~~i~Gig~~~a~~i~~~~   58 (60)
T PF14520_consen   16 KRAEKLYEAGIKTLEDLANADPEELAEIPGIGEKTAEKIIEAA   58 (60)
T ss_dssp             HHHHHHHHTTCSSHHHHHTSHHHHHHTSTTSSHHHHHHHHHHH
T ss_pred             HHHHHHHhcCCCcHHHHHcCCHHHHhcCCCCCHHHHHHHHHHH
Confidence            344555555 667999999999999988 99666666555543


No 23 
>PTZ00096 40S ribosomal protein S15; Provisional
Probab=48.86  E-value=9.6  Score=29.07  Aligned_cols=26  Identities=23%  Similarity=0.405  Sum_probs=14.8

Q ss_pred             HHHHhhhchHHHHhcCCCchhhhHHhh
Q 033920           64 FQRLLVTRTLKLKKLGIPCKHRKLILK   90 (109)
Q Consensus        64 w~~Lf~~~S~~LKelGIp~r~RKyIL~   90 (109)
                      +|+|++++..+|-++ +|++|||.+.|
T Consensus        22 l~~L~~m~~~e~~~L-~~aR~RR~~~R   47 (143)
T PTZ00096         22 LEKLLALPEEELVEL-FRARQRRRINR   47 (143)
T ss_pred             HHHHHcCCHHHHHHH-cCccccccccc
Confidence            555666655555443 46666666543


No 24 
>cd00454 Trunc_globin Truncated hemoglobins (trHbs) are a family of oxygen-binding heme proteins found in cyanobacteria, eubacteria, unicellular eukaryotes, and plants. The truncated hemoglobins have a characteristic two-over-two alpha helical folding pattern that is distinct from the three-over-three pattern found in other globins.  A subset of these have been demonstrated to form homodimers.
Probab=48.78  E-value=27  Score=23.36  Aligned_cols=38  Identities=18%  Similarity=0.307  Sum_probs=32.1

Q ss_pred             hhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhcc
Q 033920           62 GDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLGL   99 (109)
Q Consensus        62 gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~Gl   99 (109)
                      .+++.++..=...|++.|+|...+..++...+..|..+
T Consensus        76 ~~f~~~l~~l~~al~~~~~~~~~~~~~~~~~~~~~~~~  113 (116)
T cd00454          76 EEFDAWLELLRDALDELGVPAELADALLARAERIADHM  113 (116)
T ss_pred             HHHHHHHHHHHHHHHHhCCCHHHHHHHHHHHHHHHHHH
Confidence            45777777778899999999999999999999887643


No 25 
>PF09475 Dot_icm_IcmQ:  Dot/Icm secretion system protein (dot_icm_IcmQ);  InterPro: IPR013365  Proteins in this entry are the IcmQ component of Dot/Icm secretion systems, as found in the obligate intracellular pathogens Legionella pneumophila and Coxiella burnetii. While this system resembles type IV secretion systems and has been called a form of type IV, the literature now seems to favor calling this the Dot/Icm system. This protein was shown to be essential for translocation.; PDB: 3FXE_A 3FXD_C.
Probab=48.68  E-value=9.5  Score=30.18  Aligned_cols=26  Identities=23%  Similarity=0.388  Sum_probs=22.5

Q ss_pred             CHHHHHhhhccchhHHHHhhhhhhhh
Q 033920           38 GIPEFLNGIGKGVETHSAKLESEIGD   63 (109)
Q Consensus        38 dV~tFL~~IGRg~~eha~Kfes~~gd   63 (109)
                      +-..||+.||+++.+--+.|.+.+|.
T Consensus        16 eeSnFLRvIgKnL~eIRd~f~~~l~~   41 (179)
T PF09475_consen   16 EESNFLRVIGKNLREIRDNFANQLGL   41 (179)
T ss_dssp             TSSHHHHHHHHHHHHHHHHHHHHHC-
T ss_pred             hHHHHHHHHHHhHHHHHHHHHHHhcc
Confidence            66789999999999999999997754


No 26 
>PRK10361 DNA recombination protein RmuC; Provisional
Probab=47.75  E-value=14  Score=32.70  Aligned_cols=49  Identities=10%  Similarity=0.224  Sum_probs=38.2

Q ss_pred             ccCCHHHHHhhhccchhHHHHhhhhhhhhHHH---HhhhchHHHHhcCCCch
Q 033920           35 VKVGIPEFLNGIGKGVETHSAKLESEIGDFQR---LLVTRTLKLKKLGIPCK   83 (109)
Q Consensus        35 ~~~dV~tFL~~IGRg~~eha~Kfes~~gdw~~---Lf~~~S~~LKelGIp~r   83 (109)
                      ....+.+=+..||+.++.-.+.|.++++++..   =+-.+-.+||++|+.++
T Consensus       381 kl~~f~~~~~klG~~L~~a~~~y~~A~~~L~~Grgnli~~a~~~k~Lg~~~~  432 (475)
T PRK10361        381 KMRLFVDDMSAIGQSLDKAQDNYRQAMKKLSSGRGNVLAQAEAFRGLGVEIK  432 (475)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCchHHHHHHHHHcCCCcC
Confidence            34456667788999999999999998888873   34557789999999663


No 27 
>smart00460 TGc Transglutaminase/protease-like homologues. Transglutaminases are enzymes that establish covalent links between proteins. A subset of transglutaminase homologues appear to catalyse the reverse reaction, the hydrolysis of peptide bonds. Proteins with this domain are both extracellular and intracellular, and it is likely that the eukaryotic intracellular proteins are involved in signalling events.
Probab=44.66  E-value=19  Score=21.52  Aligned_cols=13  Identities=38%  Similarity=0.672  Sum_probs=10.5

Q ss_pred             HHHhcCCCchhhh
Q 033920           74 KLKKLGIPCKHRK   86 (109)
Q Consensus        74 ~LKelGIp~r~RK   86 (109)
                      .|+..|||++--.
T Consensus        19 llr~~GIpar~v~   31 (68)
T smart00460       19 LLRSLGIPARVVS   31 (68)
T ss_pred             HHHHCCCCeEEEe
Confidence            6889999998643


No 28 
>PRK00024 hypothetical protein; Reviewed
Probab=44.61  E-value=44  Score=26.18  Aligned_cols=46  Identities=22%  Similarity=0.225  Sum_probs=33.6

Q ss_pred             ccchhHHHHhhhhhhhhHHHHhhhchHHHHh-cCCCchhhhHHhhhh
Q 033920           47 GKGVETHSAKLESEIGDFQRLLVTRTLKLKK-LGIPCKHRKLILKHT   92 (109)
Q Consensus        47 GRg~~eha~Kfes~~gdw~~Lf~~~S~~LKe-lGIp~r~RKyIL~~~   92 (109)
                      ++++.+-|.++-+.+|++.+|+.++-.+|++ .||.......|+.-.
T Consensus        40 ~~~~~~LA~~LL~~fgsL~~l~~as~~eL~~i~GIG~akA~~L~a~~   86 (224)
T PRK00024         40 GKSVLDLARELLQRFGSLRGLLDASLEELQSIKGIGPAKAAQLKAAL   86 (224)
T ss_pred             CCCHHHHHHHHHHHcCCHHHHHhCCHHHHhhccCccHHHHHHHHHHH
Confidence            3456688888888888999999999999988 499444333443333


No 29 
>PF01841 Transglut_core:  Transglutaminase-like superfamily;  InterPro: IPR002931 This domain is found in many proteins known to have transglutaminase activity, i.e. which cross-link proteins through an acyl-transfer reaction between the gamma-carboxamide group of peptide-bound glutamine and the epsilon-amino group of peptide-bound lysine, resulting in an epsilon-(gamma-glutamyl)lysine isopeptide bond. Transglutaminases have been found in a diverse range of species, from bacteria through to mammals. The enzymes require calcium binding and their activity leads to post-translational modification of proteins through acyl-transfer reactions, involving peptidyl glutamine residues as acyl donors and a variety of primary amines as acyl acceptors, with the generation of proteinase-resistant isopeptide bonds [].  Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterised transglutaminase, the human blood clotting factor XIIIa' []. On the basis of the experimentally demonstrated activity of the Methanobacterium phage psiM2 pseudomurein endoisopeptidase [], it is proposed that many, if not all, microbial homologs of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease [].  The structure of the A subunit of plasma Factor XIII revealed that each Factor XIIIA subunit is composed of four domains (termed N-terminal beta-sandwich, core domain (containing the catalytic and the regulatory sites), and C-terminal beta-barrels 1 and 2) and that two monomers assemble into the native dimer through the surfaces in domains 1 and 2, in opposite orientation. This organisation in four domains is highly conserved during evolution among transglutaminase isoforms [].; PDB: 2F4M_A 2F4O_A 1GGY_B 1FIE_B 1GGU_B 1GGT_B 1F13_A 1QRK_B 1EVU_A 1EX0_B ....
Probab=44.37  E-value=33  Score=22.17  Aligned_cols=37  Identities=27%  Similarity=0.348  Sum_probs=24.0

Q ss_pred             ccccCCHHHHHh-hhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCCchh
Q 033920           33 YIVKVGIPEFLN-GIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIPCKH   84 (109)
Q Consensus        33 ~~~~~dV~tFL~-~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp~r~   84 (109)
                      .....++.++|+ +.| .|.+++..|..              -|+.+||||+-
T Consensus        37 ~~~~~~~~~~l~~~~G-~C~~~a~l~~a--------------llr~~Gipar~   74 (113)
T PF01841_consen   37 SPGPRDASEVLRSGRG-DCEDYASLFVA--------------LLRALGIPARV   74 (113)
T ss_dssp             CCCCTTHHHHHHCEEE-SHHHHHHHHHH--------------HHHHHT--EEE
T ss_pred             CCCCCCHHHHHHcCCC-ccHHHHHHHHH--------------HHhhCCCceEE
Confidence            333556788877 444 78877777665              58899999863


No 30 
>PRK11639 zinc uptake transcriptional repressor; Provisional
Probab=43.56  E-value=29  Score=25.81  Aligned_cols=23  Identities=9%  Similarity=-0.009  Sum_probs=20.0

Q ss_pred             hHHHHhcCC-CchhhhHHhhhhHh
Q 033920           72 TLKLKKLGI-PCKHRKLILKHTHK   94 (109)
Q Consensus        72 S~~LKelGI-p~r~RKyIL~~~ek   94 (109)
                      ...||+.|+ .++||..||.....
T Consensus        14 ~~~L~~~GlR~T~qR~~IL~~l~~   37 (169)
T PRK11639         14 EKLCAQRNVRLTPQRLEVLRLMSL   37 (169)
T ss_pred             HHHHHHcCCCCCHHHHHHHHHHHh
Confidence            455899999 99999999999865


No 31 
>cd00923 Cyt_c_Oxidase_Va Cytochrome c oxidase subunit Va. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit Va is one of three mammalian subunits that lacks a transmembrane region. Subunit Va is located on the matrix side of the membrane and binds thyroid hormone T2, releasing allosteric inhibition caused by the binding of ATP to subunit IV and allowing high turnover at elevated intramitochondrial ATP/ADP ratios.
Probab=43.42  E-value=33  Score=25.01  Aligned_cols=30  Identities=23%  Similarity=0.245  Sum_probs=21.7

Q ss_pred             HhhhhhhhhHHHHhhhchHHHHhcCCCchh
Q 033920           55 AKLESEIGDFQRLLVTRTLKLKKLGIPCKH   84 (109)
Q Consensus        55 ~Kfes~~gdw~~Lf~~~S~~LKelGIp~r~   84 (109)
                      +|-++.-+-|.-+++-=...|+|+|||...
T Consensus        70 ~K~~~~~~~y~~~lqeikp~l~ELGI~t~E   99 (103)
T cd00923          70 DKCGAHKEIYPYILQEIKPTLKELGISTPE   99 (103)
T ss_pred             HHccCchhhHHHHHHHHhHHHHHHCCCCHH
Confidence            344332234888888889999999998764


No 32 
>PF08955 BofC_C:  BofC C-terminal domain;  InterPro: IPR015050 The C-terminal domain of the bacterial protein, bypass of forespore C (BofC), contains a three-stranded beta-sheet and three alpha-helices. The exact function is unknown []. ; PDB: 2BW2_A.
Probab=42.73  E-value=22  Score=24.31  Aligned_cols=31  Identities=29%  Similarity=0.432  Sum_probs=17.6

Q ss_pred             HhhhhhhhhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhhh
Q 033920           55 AKLESEIGDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYR   96 (109)
Q Consensus        55 ~Kfes~~gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR   96 (109)
                      +++|+  +++++|         +.||+++.+.-.++-.+-|.
T Consensus        43 ~~Les--~~~~~L---------~~GIrV~~~~ey~~vLe~~~   73 (75)
T PF08955_consen   43 EKLES--SDHDQL---------KRGIRVRSKEEYNSVLETFK   73 (75)
T ss_dssp             TTS-H--HHHHHH---------HH--S---HHHHHHHHHHHH
T ss_pred             HHcCH--hHHHHH---------hCCCeeCCHHHHHHHHHhhc
Confidence            34555  468888         88999999988777766653


No 33 
>smart00478 ENDO3c endonuclease III. Includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (AlkA family) and other DNA glycosidases
Probab=42.45  E-value=50  Score=23.06  Aligned_cols=37  Identities=14%  Similarity=0.146  Sum_probs=25.9

Q ss_pred             hhhHHHHhhhchHHH----HhcCCCchhhhHHhhhhHhhhh
Q 033920           61 IGDFQRLLVTRTLKL----KKLGIPCKHRKLILKHTHKYRL   97 (109)
Q Consensus        61 ~gdw~~Lf~~~S~~L----KelGIp~r~RKyIL~~~ekyR~   97 (109)
                      +++|+++..++..+|    ++.|.+.+-=+||..-.+.+..
T Consensus        21 ~~~~~~l~~~~~~eL~~~l~~~g~~~~ka~~i~~~a~~~~~   61 (149)
T smart00478       21 FPTPEDLAAADEEELEELIRPLGFYRRKAKYLIELARILVE   61 (149)
T ss_pred             CCCHHHHHCCCHHHHHHHHHHcCChHHHHHHHHHHHHHHHH
Confidence            345888888888777    6677777767777776665544


No 34 
>PF11328 DUF3130:  Protein of unknown function (DUF3130);  InterPro: IPR021477  This bacterial family of proteins has no known function. 
Probab=41.70  E-value=19  Score=25.69  Aligned_cols=35  Identities=23%  Similarity=0.376  Sum_probs=26.8

Q ss_pred             CHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCC
Q 033920           38 GIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGI   80 (109)
Q Consensus        38 dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI   80 (109)
                      .+..|-+.|    .+..++.|+    ++.+...+..+||++|+
T Consensus        42 sin~~r~Al----~dLv~~Ve~----fq~v~~~DA~RlkkmG~   76 (90)
T PF11328_consen   42 SINQLRTAL----IDLVDVVEN----FQQVVKKDASRLKKMGK   76 (90)
T ss_pred             hHHHHHHHH----HHHHHHHHH----HHHHHHHHHHHHHHHHH
Confidence            555665554    455556555    99999999999999998


No 35 
>PRK13482 DNA integrity scanning protein DisA; Provisional
Probab=41.47  E-value=41  Score=28.87  Aligned_cols=56  Identities=21%  Similarity=0.277  Sum_probs=41.2

Q ss_pred             HHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhc-CC-CchhhhHHhhhhHhhhh
Q 033920           41 EFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKL-GI-PCKHRKLILKHTHKYRL   97 (109)
Q Consensus        41 tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKel-GI-p~r~RKyIL~~~ekyR~   97 (109)
                      -.|..|.|--...++.+-+.||++++++.++-.+|++. || +.+.+. |-.-..++..
T Consensus       287 RiLs~IPrl~k~iAk~Ll~~FGSL~~Il~As~eeL~~VeGIGe~rA~~-I~e~l~Rl~e  344 (352)
T PRK13482        287 RLLSKIPRLPSAVIENLVEHFGSLQGLLAASIEDLDEVEGIGEVRARA-IREGLSRLAE  344 (352)
T ss_pred             HHHhcCCCCCHHHHHHHHHHcCCHHHHHcCCHHHHhhCCCcCHHHHHH-HHHHHHHHHH
Confidence            45666666666777777788888999999999999986 89 555554 6665555544


No 36 
>COG1577 ERG12 Mevalonate kinase [Lipid metabolism]
Probab=40.74  E-value=52  Score=27.36  Aligned_cols=56  Identities=25%  Similarity=0.363  Sum_probs=41.7

Q ss_pred             CHHHHHhhhccchhHHHHhhhhhhhh---HHHHhhhchHHHHhcCCCchhhhHHhhhhHhh
Q 033920           38 GIPEFLNGIGKGVETHSAKLESEIGD---FQRLLVTRTLKLKKLGIPCKHRKLILKHTHKY   95 (109)
Q Consensus        38 dV~tFL~~IGRg~~eha~Kfes~~gd---w~~Lf~~~S~~LKelGIp~r~RKyIL~~~eky   95 (109)
                      .+..|+..||.-+.+-..-+++  +|   +.++++....-|+++||....=+.|+.-.+++
T Consensus       201 ~~~~~~~~ig~~~~~a~~al~~--~d~e~lgelm~~nq~LL~~LgVs~~~L~~lv~~a~~~  259 (307)
T COG1577         201 VIDPILDAIGELVQEAEAALQT--GDFEELGELMNINQGLLKALGVSTPELDELVEAARSL  259 (307)
T ss_pred             HHHHHHHHHHHHHHHHHHHHhc--ccHHHHHHHHHHHHHHHHhcCcCcHHHHHHHHHHHhc
Confidence            4677889999666665555554  55   78889999999999999777767776666544


No 37 
>PF04362 Iron_traffic:  Bacterial Fe(2+) trafficking;  InterPro: IPR007457 The protein represented by this entry, YggX, serves to protect Fe-S clusters from oxidative damage []. The effect is two-fold: proteins that rely on Fe-S clusters do not become inactivated, and the release of free iron and hydrogen peroxide--a DNA damaging agent--is prevented. These observations are consistent with the hypothesis that YggX chelates free iron, and recent experiments show that YggX can indeed bind Fe(II) in vitro and in vivo []. Furthermore, YggX has a positive effect on the action of at least one Fe(II)-responsive protein. The combined actions of YggX are reminiscent of iron trafficking proteins [], and YggX is therefore proposed to play a role in Fe(II) trafficking []. In Escherichia coli, YggX was shown to be under the transcriptional control of the redox-sensing SoxRS system []. ; GO: 0005506 iron ion binding; PDB: 1YHD_A 1T07_A 1XS8_A.
Probab=40.17  E-value=33  Score=24.21  Aligned_cols=50  Identities=18%  Similarity=0.355  Sum_probs=39.1

Q ss_pred             hhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCC---CchhhhHHhhhhHhhhhc
Q 033920           44 NGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGI---PCKHRKLILKHTHKYRLG   98 (109)
Q Consensus        44 ~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI---p~r~RKyIL~~~ekyR~G   98 (109)
                      ..+|..+-++++|=     -|+.=+.-.+.-.-|+++   ++++|++|..+.++|=-|
T Consensus        24 G~lG~~I~~~iSk~-----AW~~W~~~QTmLINE~rLn~~dp~~R~~L~~qM~~Flf~   76 (88)
T PF04362_consen   24 GELGQRIYDNISKE-----AWQEWLEHQTMLINEYRLNMMDPEARKFLEEQMEKFLFG   76 (88)
T ss_dssp             SHHHHHHHHHSBHH-----HHHHHHHHHHHHHHHHT--TTSHHHHHHHHHHHHHHTTT
T ss_pred             CHHHHHHHHHHhHH-----HHHHHHHHHHHHHHhccCCCCCHHHHHHHHHHHHHHhcC
Confidence            45677777777772     288888888888888887   899999999999999654


No 38 
>PF07487 SopE_GEF:  SopE GEF domain;  InterPro: IPR016019  The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell [] and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Salmonella spp. secrete an effector protein called SopE that is responsible for stimulating the reorganisation of the host cell actin cytoskeleton, and ruffling of the cellular membrane []. It acts as a guanyl-nucleotide-exchange factor on Rho-GTPase proteins such as Cdc42 and Rac. As it is imperative for the bacterium to revert the cell back to its "normal" state as quickly as possible, another tyrosine phosphatase effector called SptP reverses the actions brought about by SopE [].   Recently, it has been found that SopE and its protein homologue SopE2 can activate different sets of Rho-GTPases in the host cell []. Far from being a redundant set of two similar type III effectors, they both act in unison to specifically activate different Rho-GTPase signalling cascades in the host cell during infection.  This entry represents the guanine nucleotide exchange factor domain of SopE. This domain has an alpha-helical structure consisting of two three-helix bundles arranged in a lambda shape [, ].; GO: 0005085 guanyl-nucleotide exchange factor activity, 0009405 pathogenesis, 0031532 actin cytoskeleton reorganization, 0032862 activation of Rho GTPase activity, 0005576 extracellular region; PDB: 1GZS_B 1R9K_A 1R6E_A 2JOL_A 2JOK_A.
Probab=39.48  E-value=19  Score=28.15  Aligned_cols=31  Identities=26%  Similarity=0.462  Sum_probs=22.4

Q ss_pred             HHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhcccccCCC
Q 033920           64 FQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLGLWRPRAA  105 (109)
Q Consensus        64 w~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~Gl~~P~g~  105 (109)
                      -+.++-...+..|+.|+|+..           ++|.|.|+|+
T Consensus        62 i~pFL~eiGeaak~aGLPge~-----------KNgVFtp~Ga   92 (165)
T PF07487_consen   62 IQPFLFEIGEAAKNAGLPGEN-----------KNGVFTPSGA   92 (165)
T ss_dssp             SHHHHHHHHHHHHHTT-SEEE-----------ETTEEEETT-
T ss_pred             ccHHHHHHHHHHHHCCCCccc-----------cCCeeccCCC
Confidence            555666667788899999975           5888988885


No 39 
>PF12826 HHH_2:  Helix-hairpin-helix motif; PDB: 1X2I_B 1DGS_A 1V9P_B.
Probab=39.16  E-value=31  Score=21.84  Aligned_cols=40  Identities=20%  Similarity=0.280  Sum_probs=27.3

Q ss_pred             HhhhhhhhhHHHHhhhchHHHHhc-CCCchhhhHHhhhhHh
Q 033920           55 AKLESEIGDFQRLLVTRTLKLKKL-GIPCKHRKLILKHTHK   94 (109)
Q Consensus        55 ~Kfes~~gdw~~Lf~~~S~~LKel-GIp~r~RKyIL~~~ek   94 (109)
                      ..+-..+|++++|..++-++|.+. ||..+.-+-|..|-+.
T Consensus        17 k~L~~~f~sl~~l~~a~~e~L~~i~gIG~~~A~si~~ff~~   57 (64)
T PF12826_consen   17 KLLAKHFGSLEALMNASVEELSAIPGIGPKIAQSIYEFFQD   57 (64)
T ss_dssp             HHHHHCCSCHHHHCC--HHHHCTSTT--HHHHHHHHHHHH-
T ss_pred             HHHHHHcCCHHHHHHcCHHHHhccCCcCHHHHHHHHHHHCC
Confidence            444455667999999999999887 9988888888877654


No 40 
>PF06320 GCN5L1:  GCN5-like protein 1 (GCN5L1);  InterPro: IPR009395 This family consists of several eukaryotic GCN5-like protein 1 (GCN5L1) sequences. The function of this family is unknown [,].
Probab=37.92  E-value=34  Score=24.71  Aligned_cols=32  Identities=25%  Similarity=0.416  Sum_probs=26.9

Q ss_pred             cchhHHHHhhhhhhhhHHHHhhhchHHHHhcC
Q 033920           48 KGVETHSAKLESEIGDFQRLLVTRTLKLKKLG   79 (109)
Q Consensus        48 Rg~~eha~Kfes~~gdw~~Lf~~~S~~LKelG   79 (109)
                      |.+.+++.+|-.....|..+..--...|||+|
T Consensus        57 k~L~~~~~~l~kqt~qw~~~~~~~~~~LKEiG   88 (121)
T PF06320_consen   57 KQLQRNTAKLAKQTDQWLKLVDSFNDALKEIG   88 (121)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhc
Confidence            55677888888888889999988889999988


No 41 
>PF15144 DUF4576:  Domain of unknown function (DUF4576)
Probab=37.86  E-value=34  Score=24.33  Aligned_cols=22  Identities=23%  Similarity=0.520  Sum_probs=17.6

Q ss_pred             cCCHHHHHhhhccchhHHHHhh
Q 033920           36 KVGIPEFLNGIGKGVETHSAKL   57 (109)
Q Consensus        36 ~~dV~tFL~~IGRg~~eha~Kf   57 (109)
                      .||.+.||+..|-...|.|..|
T Consensus        43 ~p~fPkFLn~LGteIiEnAVef   64 (88)
T PF15144_consen   43 EPDFPKFLNLLGTEIIENAVEF   64 (88)
T ss_pred             CCchHHHHHHhhHHHHHHHHHH
Confidence            6899999999997766666554


No 42 
>PF02284 COX5A:  Cytochrome c oxidase subunit Va;  InterPro: IPR003204 Cytochrome c oxidase (1.9.3.1 from EC) is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen []. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.  In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptidic subunits. One of these subunits is known as Va.; GO: 0004129 cytochrome-c oxidase activity; PDB: 2DYR_R 3AG1_E 3ABL_E 1V54_R 2EIJ_R 1OCR_E 2DYS_E 2EIM_E 2OCC_E 3ASN_R ....
Probab=37.84  E-value=36  Score=25.01  Aligned_cols=30  Identities=23%  Similarity=0.311  Sum_probs=19.1

Q ss_pred             HhhhhhhhhHHHHhhhchHHHHhcCCCchh
Q 033920           55 AKLESEIGDFQRLLVTRTLKLKKLGIPCKH   84 (109)
Q Consensus        55 ~Kfes~~gdw~~Lf~~~S~~LKelGIp~r~   84 (109)
                      +|-++.-+-|+-++.-=...|+|+|||..+
T Consensus        73 ~K~~~~~~~Y~~~lqElkPtl~ELGI~t~E  102 (108)
T PF02284_consen   73 DKCGNKKEIYPYILQELKPTLEELGIPTPE  102 (108)
T ss_dssp             HHTTT-TTHHHHHHHHHHHHHHHHT---TT
T ss_pred             HHccChHHHHHHHHHHHhhHHHHhCCCCHH
Confidence            444443335888888889999999998764


No 43 
>PRK13766 Hef nuclease; Provisional
Probab=35.79  E-value=78  Score=28.16  Aligned_cols=53  Identities=15%  Similarity=0.179  Sum_probs=39.3

Q ss_pred             HHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhc-CCCchhhhHHhhhhHh
Q 033920           42 FLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKL-GIPCKHRKLILKHTHK   94 (109)
Q Consensus        42 FL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKel-GIp~r~RKyIL~~~ek   94 (109)
                      +|..|..--..-+.++-+.+|++++++.++..+|++. ||..+.-+-|..+.++
T Consensus       716 ~L~~ipgig~~~a~~Ll~~fgs~~~i~~as~~~L~~i~Gig~~~a~~i~~~~~~  769 (773)
T PRK13766        716 IVESLPDVGPVLARNLLEHFGSVEAVMTASEEELMEVEGIGEKTAKRIREVVTS  769 (773)
T ss_pred             HHhcCCCCCHHHHHHHHHHcCCHHHHHhCCHHHHHhCCCCCHHHHHHHHHHHhh
Confidence            5666643334445666666788999999999999998 9987777777776654


No 44 
>PRK05408 oxidative damage protection protein; Provisional
Probab=35.62  E-value=61  Score=23.05  Aligned_cols=49  Identities=20%  Similarity=0.357  Sum_probs=39.1

Q ss_pred             hhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCC---CchhhhHHhhhhHhhhhc
Q 033920           45 GIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGI---PCKHRKLILKHTHKYRLG   98 (109)
Q Consensus        45 ~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI---p~r~RKyIL~~~ekyR~G   98 (109)
                      .+|+.+-++++|=     -|+.=+.-.+.-.-|.++   .+++|+||-.+.++|=-|
T Consensus        25 ~lGkrI~~~ISk~-----AW~~W~~~QTmLINE~rLn~~dp~ar~~L~~qMekF~F~   76 (90)
T PRK05408         25 ELGKRIYENISKE-----AWQEWLKHQTMLINEKRLNMMDPEARKFLEEQMEKFLFG   76 (90)
T ss_pred             HHHHHHHHHHHHH-----HHHHHHHhhHhhhhhccCCCCCHHHHHHHHHHHHHHhcC
Confidence            4677777777772     288888778888888877   899999999999999754


No 45 
>PF03457 HA:  Helicase associated domain;  InterPro: IPR005114 This short domain is found in multiple copies in bacterial helicase proteins. The domain is predicted to contain 3 alpha helices. The function of this domain may be to bind nucleic acid.; PDB: 2KTA_A.
Probab=34.62  E-value=62  Score=20.04  Aligned_cols=39  Identities=18%  Similarity=0.304  Sum_probs=22.3

Q ss_pred             HHHHhhhchHHHHhcC---CCchh-------hhHHhhhhHhhhhccccc
Q 033920           64 FQRLLVTRTLKLKKLG---IPCKH-------RKLILKHTHKYRLGLWRP  102 (109)
Q Consensus        64 w~~Lf~~~S~~LKelG---Ip~r~-------RKyIL~~~ekyR~Gl~~P  102 (109)
                      |++-|..=..--.+-|   ||...       -++|-+++.+||+|...|
T Consensus         8 W~~~~~~l~~y~~~~G~~~vp~~~~~~~~~Lg~Wl~~qR~~~r~g~L~~   56 (68)
T PF03457_consen    8 WEERYEALKAYKEEHGHLNVPRDYVTDGFPLGQWLNNQRRKYRKGKLTP   56 (68)
T ss_dssp             HHHHHHHHHHHHHHHS--S-SS-----SSHHHHHHHHHHHHHHHT---H
T ss_pred             HHHHHHHHHHHHHHHCCCCCCcccCcCCCcHHHHHHHHHHHHHcCCCCH
Confidence            5555544444445555   35443       788999999999987644


No 46 
>PTZ00418 Poly(A) polymerase; Provisional
Probab=33.27  E-value=48  Score=30.33  Aligned_cols=71  Identities=20%  Similarity=0.165  Sum_probs=45.2

Q ss_pred             CCHHHHHhhhccc-hhHHHHhhhhhhhhHHHHh-hhchHHHHhcCCCchhhh---HHhhhhHhhhhcccccCCCCC
Q 033920           37 VGIPEFLNGIGKG-VETHSAKLESEIGDFQRLL-VTRTLKLKKLGIPCKHRK---LILKHTHKYRLGLWRPRAAPA  107 (109)
Q Consensus        37 ~dV~tFL~~IGRg-~~eha~Kfes~~gdw~~Lf-~~~S~~LKelGIp~r~RK---yIL~~~ekyR~Gl~~P~g~~~  107 (109)
                      -.+.+||+..|-- .+|-..|=+..++.++++. .|-...-+++|++..+.+   -.|--.=-||.|.|+|+.|..
T Consensus        72 ~~L~~~L~~~~~fes~ee~~kR~~vL~~L~~iv~~wv~~vs~~k~~~~~~~~~~~g~I~tfGSYrLGV~~pgSDID  147 (593)
T PTZ00418         72 NELINLLKSYNLYETEEGKKKRERVLGSLNKLVREFVVEASIEQGINEEEASQISGKLFTFGSYRLGVVAPGSDID  147 (593)
T ss_pred             HHHHHHHHHcCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCChhHHhcCCeEEEEeccccccCCCCCCccc
Confidence            5678888876532 2222333333346677777 444455577888777655   334455679999999999974


No 47 
>KOG1170 consensus Diacylglycerol kinase [Lipid transport and metabolism]
Probab=32.92  E-value=18  Score=35.05  Aligned_cols=60  Identities=27%  Similarity=0.404  Sum_probs=48.2

Q ss_pred             CccccCCHHHHHhhhccchhHHHHhhhhh-h-hhHHHHhhhchHHHHhcCC-CchhhhHHhhhhHhh
Q 033920           32 QYIVKVGIPEFLNGIGKGVETHSAKLESE-I-GDFQRLLVTRTLKLKKLGI-PCKHRKLILKHTHKY   95 (109)
Q Consensus        32 p~~~~~dV~tFL~~IGRg~~eha~Kfes~-~-gdw~~Lf~~~S~~LKelGI-p~r~RKyIL~~~eky   95 (109)
                      ||-..-.|-+.|.-||  ++||.+.|+.. | |  ..|+.+--..||++|| .+-+=|.||.-.-..
T Consensus       996 ~~w~seeV~awLe~~~--LsEy~d~f~kndirG--seLl~L~rrDLkdlgvtkVGhvkril~aIkdl 1058 (1099)
T KOG1170|consen  996 PYWTSEEVCAWLESIG--LSEYKDTFRKNDIRG--SELLHLERRDLKDLGVTKVGHVKRILSAIKDL 1058 (1099)
T ss_pred             ccccHHHHHHHHhccc--cchhhhhhhccCccc--ceeeecCcccccccchhhhHHHHHHHHHHHHH
Confidence            4555667889999998  99999999852 2 4  5799999999999999 888888888654433


No 48 
>KOG2841 consensus Structure-specific endonuclease ERCC1-XPF, ERCC1 component [Replication, recombination and repair]
Probab=32.88  E-value=75  Score=26.44  Aligned_cols=53  Identities=23%  Similarity=0.344  Sum_probs=39.5

Q ss_pred             HHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhc-CC-CchhhhHHhhhh
Q 033920           39 IPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKL-GI-PCKHRKLILKHT   92 (109)
Q Consensus        39 V~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKel-GI-p~r~RKyIL~~~   92 (109)
                      +..||+.|-.-...-|..+-..||+++++++++..+|-.. |+ |.+.++ |....
T Consensus       193 ~~~~Lt~i~~VnKtda~~LL~~FgsLq~~~~AS~~ele~~~G~G~~kak~-l~~~l  247 (254)
T KOG2841|consen  193 LLGFLTTIPGVNKTDAQLLLQKFGSLQQISNASEGELEQCPGLGPAKAKR-LHKFL  247 (254)
T ss_pred             HHHHHHhCCCCCcccHHHHHHhcccHHHHHhcCHhHHHhCcCcCHHHHHH-HHHHH
Confidence            8899999954445566777777888999999999999876 88 555444 44443


No 49 
>PRK04038 rps19p 30S ribosomal protein S19P; Provisional
Probab=31.96  E-value=32  Score=25.94  Aligned_cols=24  Identities=17%  Similarity=0.426  Sum_probs=10.5

Q ss_pred             HHHHhhhchHHHHhcCCCchhhhHH
Q 033920           64 FQRLLVTRTLKLKKLGIPCKHRKLI   88 (109)
Q Consensus        64 w~~Lf~~~S~~LKelGIp~r~RKyI   88 (109)
                      .|+|++++-.+|-++ +|++|||.|
T Consensus        14 l~~L~~m~~~~~~~l-~~ar~RRsl   37 (134)
T PRK04038         14 LEELQEMSLEEFAEL-LPARQRRSL   37 (134)
T ss_pred             HHHHHcCCHHHHHHH-cchhhhhhh
Confidence            344444444444332 345555544


No 50 
>smart00540 LEM in nuclear membrane-associated proteins. LEM, domain in nuclear membrane-associated proteins, including lamina-associated polypeptide 2 and emerin.
Probab=31.10  E-value=64  Score=19.87  Aligned_cols=27  Identities=41%  Similarity=0.567  Sum_probs=21.3

Q ss_pred             hHHHHhcCCC-----chhhhHHhhhhHhhhhc
Q 033920           72 TLKLKKLGIP-----CKHRKLILKHTHKYRLG   98 (109)
Q Consensus        72 S~~LKelGIp-----~r~RKyIL~~~ekyR~G   98 (109)
                      ..+|++.|+|     ..+|+...+.-++++.|
T Consensus        12 ~~~L~~~G~~~gPIt~sTR~vy~kkL~~~~~~   43 (44)
T smart00540       12 RAELKQYGLPPGPITDTTRKLYEKKLRKLRRG   43 (44)
T ss_pred             HHHHHHcCCCCCCcCcchHHHHHHHHHHHHcC
Confidence            3578889985     37899999988888765


No 51 
>PF13518 HTH_28:  Helix-turn-helix domain
Probab=30.37  E-value=61  Score=18.42  Aligned_cols=23  Identities=22%  Similarity=0.449  Sum_probs=17.9

Q ss_pred             hHHHHhcCCCchhhhHHhhhhHhhhh
Q 033920           72 TLKLKKLGIPCKHRKLILKHTHKYRL   97 (109)
Q Consensus        72 S~~LKelGIp~r~RKyIL~~~ekyR~   97 (109)
                      +...++.||   .+.-|-+|..+|+.
T Consensus        16 ~~~a~~~gi---s~~tv~~w~~~y~~   38 (52)
T PF13518_consen   16 REIAREFGI---SRSTVYRWIKRYRE   38 (52)
T ss_pred             HHHHHHHCC---CHhHHHHHHHHHHh
Confidence            445678899   45678899999988


No 52 
>PRK00558 uvrC excinuclease ABC subunit C; Validated
Probab=30.27  E-value=77  Score=28.46  Aligned_cols=55  Identities=20%  Similarity=0.224  Sum_probs=39.7

Q ss_pred             HHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhc-CCCchhhhHHhhhhH
Q 033920           39 IPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKL-GIPCKHRKLILKHTH   93 (109)
Q Consensus        39 V~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKel-GIp~r~RKyIL~~~e   93 (109)
                      ....|..|..-=..-+.++-+.+|+++++++++..+|.+. ||+.+.-..|..|.+
T Consensus       541 ~~s~L~~IpGIG~k~~k~Ll~~FgS~~~i~~As~eeL~~v~Gig~~~A~~I~~~l~  596 (598)
T PRK00558        541 LTSALDDIPGIGPKRRKALLKHFGSLKAIKEASVEELAKVPGISKKLAEAIYEALH  596 (598)
T ss_pred             hhhhHhhCCCcCHHHHHHHHHHcCCHHHHHhCCHHHHhhcCCcCHHHHHHHHHHhc
Confidence            3444444432223344566666778999999999999999 999998888887764


No 53 
>COG1623 Predicted nucleic-acid-binding protein (contains the HHH domain) [General function prediction only]
Probab=29.99  E-value=50  Score=28.55  Aligned_cols=45  Identities=22%  Similarity=0.301  Sum_probs=38.8

Q ss_pred             HHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhc-CC-Cchhh
Q 033920           41 EFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKL-GI-PCKHR   85 (109)
Q Consensus        41 tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKel-GI-p~r~R   85 (109)
                      --|++|.|---.+++++-.++|+++++++++-+.|++. || ..+.|
T Consensus       293 R~l~kIpRlp~~iv~nlV~~F~~l~~il~As~edL~~VeGIGe~rAr  339 (349)
T COG1623         293 RLLNKIPRLPFAIVENLVRAFGTLDGILEASAEDLDAVEGIGEARAR  339 (349)
T ss_pred             HHHhcCcCccHHHHHHHHHHHhhHHHHHHhcHhHHhhhcchhHHHHH
Confidence            47889999999999999999999999999999999987 78 44433


No 54 
>COG0122 AlkA 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase [DNA replication, recombination, and repair]
Probab=28.82  E-value=56  Score=26.64  Aligned_cols=59  Identities=12%  Similarity=0.130  Sum_probs=46.2

Q ss_pred             CCHHHHHhhhccchhHHHHh------hhhhhhhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhcc
Q 033920           37 VGIPEFLNGIGKGVETHSAK------LESEIGDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLGL   99 (109)
Q Consensus        37 ~dV~tFL~~IGRg~~eha~K------fes~~gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~Gl   99 (109)
                      +.+..=-+.++|-++..-..      |++    =+.|...+...|++.|.+-+-=+||.+.++.+.+|.
T Consensus       118 vS~~~A~~i~~rl~~~~g~~~~~~~~fpt----pe~l~~~~~~~l~~~g~s~~Ka~yi~~~A~~~~~g~  182 (285)
T COG0122         118 VSVAAAAKIWARLVSLYGNALEIYHSFPT----PEQLAAADEEALRRCGLSGRKAEYIISLARAAAEGE  182 (285)
T ss_pred             hhHHHHHHHHHHHHHHhCCccccccCCCC----HHHHHhcCHHHHHHhCCcHHHHHHHHHHHHHHHcCC
Confidence            44444445555555555554      666    899999999999999999999999999999999994


No 55 
>PRK09462 fur ferric uptake regulator; Provisional
Probab=28.28  E-value=53  Score=23.47  Aligned_cols=23  Identities=30%  Similarity=0.302  Sum_probs=19.5

Q ss_pred             hHHHHhcCC-CchhhhHHhhhhHh
Q 033920           72 TLKLKKLGI-PCKHRKLILKHTHK   94 (109)
Q Consensus        72 S~~LKelGI-p~r~RKyIL~~~ek   94 (109)
                      ...||+.|+ .++||..||.....
T Consensus         5 ~~~l~~~glr~T~qR~~Il~~l~~   28 (148)
T PRK09462          5 NTALKKAGLKVTLPRLKILEVLQE   28 (148)
T ss_pred             HHHHHHcCCCCCHHHHHHHHHHHh
Confidence            356899999 99999999998754


No 56 
>COG5457 Uncharacterized conserved small protein [Function unknown]
Probab=28.05  E-value=60  Score=21.52  Aligned_cols=28  Identities=18%  Similarity=0.113  Sum_probs=22.6

Q ss_pred             HHHHhhhchHHHHhcCCCchhhhHHhhh
Q 033920           64 FQRLLVTRTLKLKKLGIPCKHRKLILKH   91 (109)
Q Consensus        64 w~~Lf~~~S~~LKelGIp~r~RKyIL~~   91 (109)
                      =..|..++..+|++.||.-.+..+....
T Consensus        32 r~eL~~lsd~~L~DiGisR~d~~~e~~k   59 (63)
T COG5457          32 RRELLRLSDHLLSDIGISRADIEAEAAK   59 (63)
T ss_pred             HHHHHHHhHHHHHHcCCCHHHHHHHHHH
Confidence            4568888999999999988887776654


No 57 
>PF10281 Ish1:  Putative stress-responsive nuclear envelope protein;  InterPro: IPR018803  This group of proteins, found primarily in fungi, consists of putative stress-responsive nuclear envelope protein Ish1 and homologues []. 
Probab=27.51  E-value=61  Score=18.56  Aligned_cols=21  Identities=43%  Similarity=0.575  Sum_probs=14.7

Q ss_pred             HHhcCCCch----hhhHHhhhhHhh
Q 033920           75 LKKLGIPCK----HRKLILKHTHKY   95 (109)
Q Consensus        75 LKelGIp~r----~RKyIL~~~eky   95 (109)
                      |++.||++.    .|..+|..+.++
T Consensus        13 L~~~gi~~~~~~~~rd~Ll~~~k~~   37 (38)
T PF10281_consen   13 LKSHGIPVPKSAKTRDELLKLAKKN   37 (38)
T ss_pred             HHHcCCCCCCCCCCHHHHHHHHHHh
Confidence            567899443    688888877553


No 58 
>PF13812 PPR_3:  Pentatricopeptide repeat domain
Probab=26.55  E-value=53  Score=16.72  Aligned_cols=19  Identities=26%  Similarity=0.291  Sum_probs=11.8

Q ss_pred             hhHHHHhhhchHHHHhcCCC
Q 033920           62 GDFQRLLVTRTLKLKKLGIP   81 (109)
Q Consensus        62 gdw~~Lf~~~S~~LKelGIp   81 (109)
                      |+|+..+.+=.. |++.||.
T Consensus        15 g~~~~a~~~~~~-M~~~gv~   33 (34)
T PF13812_consen   15 GDPDAALQLFDE-MKEQGVK   33 (34)
T ss_pred             CCHHHHHHHHHH-HHHhCCC
Confidence            556666555443 6678884


No 59 
>PF08349 DUF1722:  Protein of unknown function (DUF1722);  InterPro: IPR013560 This domain of unknown function is found in bacteria and archaea and is homologous to the hypothetical protein ybgA from Escherichia coli. 
Probab=26.28  E-value=50  Score=23.09  Aligned_cols=23  Identities=17%  Similarity=0.296  Sum_probs=19.8

Q ss_pred             cCCCchhhhHHhhhhHhhhhccc
Q 033920           78 LGIPCKHRKLILKHTHKYRLGLW  100 (109)
Q Consensus        78 lGIp~r~RKyIL~~~ekyR~Gl~  100 (109)
                      .-++..+|++++...++||+|..
T Consensus        64 ~~ls~~EK~~~~~~i~~yr~g~i   86 (117)
T PF08349_consen   64 KKLSSEEKQHFLDLIEDYREGKI   86 (117)
T ss_pred             HhCCHHHHHHHHHHHHHHHcCCc
Confidence            34688899999999999999973


No 60 
>cd00056 ENDO3c endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases
Probab=25.02  E-value=92  Score=21.87  Aligned_cols=39  Identities=13%  Similarity=0.060  Sum_probs=31.2

Q ss_pred             hHHHHhhhchHHHHhcCCC---chhhhHHhhhhHhhhhcccc
Q 033920           63 DFQRLLVTRTLKLKKLGIP---CKHRKLILKHTHKYRLGLWR  101 (109)
Q Consensus        63 dw~~Lf~~~S~~LKelGIp---~r~RKyIL~~~ekyR~Gl~~  101 (109)
                      +++.|..++..+|++.|.+   .+--+||..-.+.+..+..+
T Consensus        32 t~~~l~~~~~~~l~~~~~~~G~~~kA~~i~~~a~~~~~~~~~   73 (158)
T cd00056          32 TPEALAAADEEELRELIRSLGYRRKAKYLKELARAIVEGFGG   73 (158)
T ss_pred             CHHHHHCCCHHHHHHHHHhcChHHHHHHHHHHHHHHHHHcCC
Confidence            4999999999999998887   46667888888887776543


No 61 
>KOG0005 consensus Ubiquitin-like protein [Cell cycle control, cell division, chromosome partitioning; Posttranslational modification, protein turnover, chaperones]
Probab=24.84  E-value=57  Score=22.23  Aligned_cols=14  Identities=43%  Similarity=0.714  Sum_probs=11.4

Q ss_pred             hcCCCchhhhHHhh
Q 033920           77 KLGIPCKHRKLILK   90 (109)
Q Consensus        77 elGIp~r~RKyIL~   90 (109)
                      +-|||++|.|.|..
T Consensus        33 keGIPp~qqrli~~   46 (70)
T KOG0005|consen   33 KEGIPPQQQRLIYA   46 (70)
T ss_pred             hcCCCchhhhhhhc
Confidence            66999999888864


No 62 
>PF14527 LAGLIDADG_WhiA:  WhiA LAGLIDADG-like domain; PDB: 3HYI_A 3HYJ_D.
Probab=24.63  E-value=25  Score=24.06  Aligned_cols=21  Identities=29%  Similarity=0.447  Sum_probs=15.9

Q ss_pred             CCHHHHHhhhccchhHHHHhhhh
Q 033920           37 VGIPEFLNGIGKGVETHSAKLES   59 (109)
Q Consensus        37 ~dV~tFL~~IGRg~~eha~Kfes   59 (109)
                      -++.+||+.||  ..+.+-+||+
T Consensus        65 e~I~dfL~~iG--A~~s~~~~E~   85 (93)
T PF14527_consen   65 EQISDFLKLIG--AHKSVLEFEN   85 (93)
T ss_dssp             HHHHHHHHHTT----CHCCHHHH
T ss_pred             HHHHHHHHHcC--hHHHHHHHHH
Confidence            68999999998  7777777776


No 63 
>PF11899 DUF3419:  Protein of unknown function (DUF3419);  InterPro: IPR021829  This family of proteins is functionally uncharacterised. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 398 and 802 amino acids in length. 
Probab=23.96  E-value=1.2e+02  Score=25.93  Aligned_cols=49  Identities=20%  Similarity=0.350  Sum_probs=34.7

Q ss_pred             CHHHHHhhhccchhHHHHhhhhhhhh--HHHHhhhc-----hHHHHhcCCCchhhhHH
Q 033920           38 GIPEFLNGIGKGVETHSAKLESEIGD--FQRLLVTR-----TLKLKKLGIPCKHRKLI   88 (109)
Q Consensus        38 dV~tFL~~IGRg~~eha~Kfes~~gd--w~~Lf~~~-----S~~LKelGIp~r~RKyI   88 (109)
                      +|+.+++.  +..+|..+-|++.+..  |..++.+-     .-.|+-+|||+.|+++|
T Consensus       159 ~v~~l~~a--~tleeQr~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~lGvp~~q~~~l  214 (380)
T PF11899_consen  159 DVKRLLEA--RTLEEQRRIWEKKIRPLFWRRLVRWLFVGNRRFLWFALGVPPAQYKML  214 (380)
T ss_pred             HHHHHHcC--CCHHHHHHHHHHhhhHHHHHHHHHHHHhcCHHHHHHhcCCCHHHHHHH
Confidence            66666664  4677777777776533  55566443     34789999999999998


No 64 
>KOG4374 consensus RNA-binding protein Bicaudal-C [RNA processing and modification]
Probab=23.51  E-value=59  Score=26.40  Aligned_cols=53  Identities=17%  Similarity=0.100  Sum_probs=37.0

Q ss_pred             cCCHHHHHhhhccchhHHHHhhhhhhhhHHHH---hhhchHHHHhcCC-CchhhhHHhhhh
Q 033920           36 KVGIPEFLNGIGKGVETHSAKLESEIGDFQRL---LVTRTLKLKKLGI-PCKHRKLILKHT   92 (109)
Q Consensus        36 ~~dV~tFL~~IGRg~~eha~Kfes~~gdw~~L---f~~~S~~LKelGI-p~r~RKyIL~~~   92 (109)
                      .||.+++++.- .+.+.+...|..   +|+.|   .+.++.-++++|| -+-.|+-+|.--
T Consensus       117 ~pd~~~~~~~~-~~l~s~~~~~~~---~~~~l~~~~t~~~~vl~~L~~lglg~y~~~f~~~  173 (216)
T KOG4374|consen  117 RPDIQSLLTSR-LGLESYIKEFNL---QEIDLQTFGTLTEGVLMELGILGLGAYWKMFEAI  173 (216)
T ss_pred             CCchhhHHHHh-hcccccchhhhc---chHhhhhcccccchHHHHHHHHhHHHHHHHHHHH
Confidence            36889999982 335555555554   67774   5678889999999 777777666543


No 65 
>COG4352 RPL13 Ribosomal protein L13E [Translation, ribosomal structure and biogenesis]
Probab=22.97  E-value=58  Score=24.17  Aligned_cols=34  Identities=26%  Similarity=0.501  Sum_probs=24.8

Q ss_pred             hhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCCchhhhH
Q 033920           44 NGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIPCKHRKL   87 (109)
Q Consensus        44 ~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp~r~RKy   87 (109)
                      -++|||-+         +|.+++ ..++-.+-+.+||++.+||.
T Consensus        55 ~R~GRGFs---------lgEl~a-AGL~~~~AR~LGI~VD~RRr   88 (113)
T COG4352          55 VRAGRGFS---------LGELKA-AGLSARKARTLGIAVDHRRR   88 (113)
T ss_pred             eeccCCcc---------HHHHHH-cCcCHHHHHhhCcceehhhc
Confidence            57899976         122333 36778888999999999974


No 66 
>PF10305 Fmp27_SW:  RNA pol II promoter Fmp27 protein domain;  InterPro: IPR019415 The function of the FMP27 protein is not known. FMP27 is the product of a nuclear encoded gene but it is detected in highly purified mitochondria in high-throughput studies []. This entry represents a conserved region within FMP27 that contains characteristic SW and GKG sequence motifs. 
Probab=22.51  E-value=55  Score=22.80  Aligned_cols=16  Identities=44%  Similarity=0.893  Sum_probs=13.8

Q ss_pred             cCCHHHHHhhhccchh
Q 033920           36 KVGIPEFLNGIGKGVE   51 (109)
Q Consensus        36 ~~dV~tFL~~IGRg~~   51 (109)
                      .-++++||-.+|+|+-
T Consensus        80 l~~~pdFLh~~GkG~P   95 (103)
T PF10305_consen   80 LDDLPDFLHDVGKGVP   95 (103)
T ss_pred             chhhHHHHHHhCCCCC
Confidence            4689999999999974


No 67 
>PF13871 Helicase_C_4:  Helicase_C-like
Probab=22.44  E-value=1.7e+02  Score=24.11  Aligned_cols=44  Identities=25%  Similarity=0.236  Sum_probs=31.0

Q ss_pred             ccccCCHHHHHhhhcc-chhHHHHhhhhhhhhHHHHhhhchHHHHhcCC
Q 033920           33 YIVKVGIPEFLNGIGK-GVETHSAKLESEIGDFQRLLVTRTLKLKKLGI   80 (109)
Q Consensus        33 ~~~~~dV~tFL~~IGR-g~~eha~Kfes~~gdw~~Lf~~~S~~LKelGI   80 (109)
                      ....++|.+||++|-- -++....-|+-    +...++..-+++|..|.
T Consensus       227 ~~k~~~V~kFLNRLLGL~V~~Qn~LF~y----F~~~l~~~I~~AK~~G~  271 (278)
T PF13871_consen  227 LDKDPSVPKFLNRLLGLPVEMQNALFKY----FTDTLDAIIEQAKAEGR  271 (278)
T ss_pred             ccccccHHHHHHHhhCCCHHHHHHHHHH----HHHHHHHHHHHHHHcCC
Confidence            3456799999999842 23344445554    88888888888888875


No 68 
>COG3179 Predicted chitinase [General function prediction only]
Probab=22.18  E-value=1.2e+02  Score=24.60  Aligned_cols=42  Identities=14%  Similarity=0.241  Sum_probs=28.5

Q ss_pred             HHHHhhhhhhhhHHHHhhhchHHHHhcCCCc--hhhhHHhhhhH
Q 033920           52 THSAKLESEIGDFQRLLVTRTLKLKKLGIPC--KHRKLILKHTH   93 (109)
Q Consensus        52 eha~Kfes~~gdw~~Lf~~~S~~LKelGIp~--r~RKyIL~~~e   93 (109)
                      .|...|+.++.+.-..+.+=+..|++.||..  ++--+|-.+.|
T Consensus         8 ~~~ki~p~a~k~~~~v~~al~~~l~~~gi~~p~r~AmFlAQ~~H   51 (206)
T COG3179           8 DLRKIFPKARKEFVDVIVALQPALDEAGITTPLRQAMFLAQVMH   51 (206)
T ss_pred             HHHHhcchhhhhhHHHHHHHHHHHHHhcCCCHHHHHHHHHHHhh
Confidence            4455566666667777888899999999944  44445555554


No 69 
>TIGR00608 radc DNA repair protein radc. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).
Probab=22.10  E-value=2e+02  Score=22.68  Aligned_cols=41  Identities=20%  Similarity=0.144  Sum_probs=30.9

Q ss_pred             chhHHHHhhhhhh---hhHHHHhhhchHHHHh-cCCCchhhhHHh
Q 033920           49 GVETHSAKLESEI---GDFQRLLVTRTLKLKK-LGIPCKHRKLIL   89 (109)
Q Consensus        49 g~~eha~Kfes~~---gdw~~Lf~~~S~~LKe-lGIp~r~RKyIL   89 (109)
                      ++.+-|..+-+.+   |++..|+.++-.+|++ .||.....-.|+
T Consensus        33 ~~~~lA~~ll~~f~~~g~l~~l~~a~~~eL~~i~GiG~aka~~l~   77 (218)
T TIGR00608        33 DVLSLSKRLLDVFGRQDSLGHLLSAPPEELSSVPGIGEAKAIQLK   77 (218)
T ss_pred             CHHHHHHHHHHHhcccCCHHHHHhCCHHHHHhCcCCcHHHHHHHH
Confidence            5668887777777   8899999999999988 599443333333


No 70 
>PF10330 Stb3:  Putative Sin3 binding protein;  InterPro: IPR018818  This entry represents Sin3 binding proteins conserved in fungi. Sin3p does not bind DNA directly even though the yeast SIN3 gene functions as a transcriptional repressor. Sin3p is part of a large multiprotein complex []. Stb3 appears to bind directly to ribosomal RNA Processing Elements (RRPE) although there are no obvious domains which would accord with this, implying that Stb3 may be a novel RNA-binding protein []. 
Probab=21.80  E-value=65  Score=23.09  Aligned_cols=16  Identities=31%  Similarity=0.605  Sum_probs=13.8

Q ss_pred             cCC-CchhhhHHhhhhH
Q 033920           78 LGI-PCKHRKLILKHTH   93 (109)
Q Consensus        78 lGI-p~r~RKyIL~~~e   93 (109)
                      .+| |.||||.|..-.|
T Consensus        38 ~~ls~sKqRRLi~~ALE   54 (92)
T PF10330_consen   38 SDLSPSKQRRLIMAALE   54 (92)
T ss_pred             ccCCHHHHHHHHHHHHh
Confidence            356 8899999999988


No 71 
>TIGR03019 pepcterm_femAB FemAB-related protein, PEP-CTERM system-associated. Members of this protein family are found always as part of extended exopolysaccharide biosynthesis loci in bacteria. In nearly every case, these loci contain determinants for the processing of the PEP-CTERM proposed C-terminal protein sorting signal. This family shows remote, local sequence similarity to the FemAB protein family (see pfam02388), whose members
Probab=21.79  E-value=1.8e+02  Score=23.06  Aligned_cols=59  Identities=15%  Similarity=0.142  Sum_probs=45.4

Q ss_pred             CCHHHHHhhhccchhHHHHhhhhh------hhhHHHHhhhchHHHHhcCCCchhhhHHhhhhHhh
Q 033920           37 VGIPEFLNGIGKGVETHSAKLESE------IGDFQRLLVTRTLKLKKLGIPCKHRKLILKHTHKY   95 (109)
Q Consensus        37 ~dV~tFL~~IGRg~~eha~Kfes~------~gdw~~Lf~~~S~~LKelGIp~r~RKyIL~~~eky   95 (109)
                      .|.++++..+.+.+-...-|.+..      .++++.++.+-...|+.+|.|+..+.|.-+-.+.|
T Consensus       127 ~~~e~~~~~~~~k~R~~IRka~k~Gv~v~~~~~l~~F~~l~~~t~~r~g~p~~~~~~f~~l~~~~  191 (330)
T TIGR03019       127 ADPEANWLAIPRKQRAMVRKGIKAGLTVTVDGDLDRFYDVYAENMRDLGTPVFSRRYFRLLKDVF  191 (330)
T ss_pred             CCHHHHHHhcCHHHHHHHHHHHHCCeEEEECCcHHHHHHHHHHHHhcCCCCCCCHHHHHHHHHhc
Confidence            588999988877665555554421      25799999999999999999999899887766655


No 72 
>PF05577 Peptidase_S28:  Serine carboxypeptidase S28;  InterPro: IPR008758 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:  Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases.   In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.  Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S28 (clan SC). The predicted active site residues for members of this family and family S10 occur in the same order in the sequence: S, D, H. These serine proteases include several eukaryotic enzymes such as lysosomal Pro-X carboxypeptidase, dipeptidyl-peptidase II, and thymus-specific serine peptidase [, , , ].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis; PDB: 3N2Z_B 3JYH_A 3N0T_C.
Probab=21.76  E-value=72  Score=26.26  Aligned_cols=49  Identities=18%  Similarity=0.295  Sum_probs=33.0

Q ss_pred             CccccCCHHHHHhhhccchhH----HHHhhhhhhhhHHHHhhhch--HHH-HhcCC
Q 033920           32 QYIVKVGIPEFLNGIGKGVET----HSAKLESEIGDFQRLLVTRT--LKL-KKLGI   80 (109)
Q Consensus        32 p~~~~~dV~tFL~~IGRg~~e----ha~Kfes~~gdw~~Lf~~~S--~~L-KelGI   80 (109)
                      |.+.+.|-.+|...|.+.+..    -++.+..++...++++...+  .+| +++++
T Consensus       147 pv~a~~df~~y~~~v~~~~~~~~~~C~~~i~~a~~~i~~~~~~~~~~~~l~~~f~~  202 (434)
T PF05577_consen  147 PVQAKVDFWEYFEVVTESLRKYGPNCYDAIRAAFDQIDKLLKTGNGRQQLKKKFKL  202 (434)
T ss_dssp             -CCHCCTTTHHHHHHHHHHHCCSCCHHHHHHHHHHHHHHHCCTCHHHHHHHHHCTB
T ss_pred             eeeeecccHHHHHHHHHHHHhhccHHHHHHHHHHHHHHHHhhcccHHHHHHHHhhh
Confidence            888888988888876543221    33666667778999987776  566 44565


No 73 
>PF00633 HHH:  Helix-hairpin-helix motif;  InterPro: IPR000445 The HhH motif is an around 20 amino acids domain present in prokaryotic and eukaryotic non-sequence-specific DNA binding proteins [, , ]. The HhH motif is similar to, but distinct from, the HtH motif. Both of these motifs have two helices connected by a short turn. In the HtH motif the second helix binds to DNA with the helix in the major groove. This allows the contact between specific base and residues throughout the protein. In the HhH motif the second helix does not protrude from the surface of the protein and therefore cannot lie in the major groove of the DNA. Crystallographic studies suggest that the interaction of the HhH domain with DNA is mediated by amino acids located in the strongly conserved loop (L-P-G-V) and at the N-terminal end of the second helix []. This interaction could involve the formation of hydrogen bonds between protein backbone nitrogens and DNA phosphate groups []. The structural difference between the HtH and HhH domains is reflected at the functional level: whereas the HtH domain, found primarily in gene regulatory proteins, binds DNA in a sequence specific manner, the HhH domain is rather found in proteins involved in enzymatic activities and binds DNA with no sequence specificity []. The HhH domain of DisA, a bacterial checkpoint control protein, is a DNA-binding domain [].; GO: 0003677 DNA binding; PDB: 3C1Z_A 3C23_A 3C1Y_A 3C21_A 1Z00_A 2A1J_B 1KEA_A 1VRL_A 1RRQ_A 3G0Q_A ....
Probab=21.67  E-value=1.1e+02  Score=17.03  Aligned_cols=27  Identities=33%  Similarity=0.381  Sum_probs=17.0

Q ss_pred             HHHHhhhchHHHHhc-CCCchhhhHHhh
Q 033920           64 FQRLLVTRTLKLKKL-GIPCKHRKLILK   90 (109)
Q Consensus        64 w~~Lf~~~S~~LKel-GIp~r~RKyIL~   90 (109)
                      .+.++..+-++|.++ ||-.+.=..||.
T Consensus         2 ~~g~~pas~eeL~~lpGIG~~tA~~I~~   29 (30)
T PF00633_consen    2 LDGLIPASIEELMKLPGIGPKTANAILS   29 (30)
T ss_dssp             HHHHHTSSHHHHHTSTT-SHHHHHHHHH
T ss_pred             CCCcCCCCHHHHHhCCCcCHHHHHHHHh
Confidence            445666666777766 887776666664


No 74 
>PF12972 NAGLU_C:  Alpha-N-acetylglucosaminidase (NAGLU) C-terminal domain;  InterPro: IPR024732 Alpha-N-acetylglucosaminidase is a lysosomal enzyme required for the stepwise degradation of heparan sulphate []. Mutations on the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB, or Sanfilippo syndrome type B) characterised by neurological dysfunction but relatively mild somatic manifestations []. The structure shows that the enzyme is composed of three domains. This C-terminal domain has an all alpha helical fold [].; PDB: 2VC9_A 2VCC_A 2VCB_A 2VCA_A 4A4A_A.
Probab=21.41  E-value=2.8e+02  Score=22.21  Aligned_cols=58  Identities=21%  Similarity=0.353  Sum_probs=37.2

Q ss_pred             chhHHHHhhhhhhhhHHHHhhhchH--------HHHhcCCCchhhhHHhhhhHhhhhcccccCCCCC
Q 033920           49 GVETHSAKLESEIGDFQRLLVTRTL--------KLKKLGIPCKHRKLILKHTHKYRLGLWRPRAAPA  107 (109)
Q Consensus        49 g~~eha~Kfes~~gdw~~Lf~~~S~--------~LKelGIp~r~RKyIL~~~ekyR~Gl~~P~g~~~  107 (109)
                      ....+..+|.+-+.|.+.|+.+...        .-|..|-...++++.-. --+=--=+|+|.|.+.
T Consensus       124 ~~~~~~~~~l~ll~dlD~lL~t~~~f~Lg~Wi~~Ar~~g~~~~e~~~yE~-NAR~qIT~Wg~~g~l~  189 (267)
T PF12972_consen  124 AFKALSARFLELLDDLDRLLATNPEFLLGKWIEDARAWGTTPEEKDLYEY-NARNQITLWGPSGELH  189 (267)
T ss_dssp             HHHHHHHHHHHHHHHHHHHHTT-GGGBHHHHHHHHHHSSTT--HHHHHHH-HHHHHTTTSHTTTS-T
T ss_pred             HHHHHHHHHHHHHHHHHHHHCcCCCCCHHHHHHHHHHHCCCHHHHHHHHH-HHHHHhhccCCCCCcc
Confidence            4556677888877889999988876        45888987777655432 2234445678887653


No 75 
>smart00611 SEC63 Domain of unknown function in Sec63p, Brr2p and other proteins.
Probab=21.15  E-value=89  Score=24.36  Aligned_cols=35  Identities=14%  Similarity=0.150  Sum_probs=27.9

Q ss_pred             hhhHHHHhhhchHHHHhc-CCCchhhhHHhhhhHhh
Q 033920           61 IGDFQRLLVTRTLKLKKL-GIPCKHRKLILKHTHKY   95 (109)
Q Consensus        61 ~gdw~~Lf~~~S~~LKel-GIp~r~RKyIL~~~eky   95 (109)
                      +.++++|.+++..+++++ |.+.++-+-|++..++|
T Consensus       172 i~s~~~l~~~~~~~~~~ll~~~~~~~~~i~~~~~~~  207 (312)
T smart00611      172 VLSLEDLLELEDEERGELLGLLDAEGERVYKVLSRL  207 (312)
T ss_pred             CCCHHHHHhcCHHHHHHHHcCCHHHHHHHHHHHHhC
Confidence            456999999999888887 88777777788777654


No 76 
>COG0735 Fur Fe2+/Zn2+ uptake regulation proteins [Inorganic ion transport and metabolism]
Probab=21.00  E-value=92  Score=22.65  Aligned_cols=24  Identities=21%  Similarity=0.242  Sum_probs=20.2

Q ss_pred             hHHHHhcCC-CchhhhHHhhhhHhh
Q 033920           72 TLKLKKLGI-PCKHRKLILKHTHKY   95 (109)
Q Consensus        72 S~~LKelGI-p~r~RKyIL~~~eky   95 (109)
                      ...||+.|+ .++||.-||+...+-
T Consensus         9 ~~~lk~~glr~T~qR~~vl~~L~~~   33 (145)
T COG0735           9 IERLKEAGLRLTPQRLAVLELLLEA   33 (145)
T ss_pred             HHHHHHcCCCcCHHHHHHHHHHHhc
Confidence            467999999 999999999887543


No 77 
>COG1305 Transglutaminase-like enzymes, putative cysteine proteases [Amino acid transport and metabolism]
Probab=20.91  E-value=1.2e+02  Score=22.44  Aligned_cols=37  Identities=19%  Similarity=0.112  Sum_probs=26.3

Q ss_pred             ccccCCHHHHHhhhccchhHHHHhhhhhhhhHHHHhhhchHHHHhcCCCch
Q 033920           33 YIVKVGIPEFLNGIGKGVETHSAKLESEIGDFQRLLVTRTLKLKKLGIPCK   83 (109)
Q Consensus        33 ~~~~~dV~tFL~~IGRg~~eha~Kfes~~gdw~~Lf~~~S~~LKelGIp~r   83 (109)
                      +....+.+.+|..-...|..|+.-|-.              -|+-+|||+|
T Consensus       180 ~~~~~~~~~~l~~~~G~C~d~a~l~va--------------l~Ra~GIpAR  216 (319)
T COG1305         180 TPVTGSASDALRLGRGVCRDFAHLLVA--------------LLRAAGIPAR  216 (319)
T ss_pred             CCCCCCHHHHHHhCCcccccHHHHHHH--------------HHHHcCCcce
Confidence            445567677776665678777766554              5788999998


No 78 
>TIGR01025 rpsS_arch ribosomal protein S19(archaeal)/S15(eukaryotic). This model represents eukaryotic ribosomal protein S15 and its archaeal equivalent. It excludes bacterial and organellar ribosomal protein S19. The nomenclature for the archaeal members is unresolved and given variously as S19 (after the more distant bacterial homologs) or S15.
Probab=20.83  E-value=52  Score=24.84  Aligned_cols=26  Identities=23%  Similarity=0.438  Sum_probs=14.7

Q ss_pred             HHHHhhhchHHHHhcCCCchhhhHHhh
Q 033920           64 FQRLLVTRTLKLKKLGIPCKHRKLILK   90 (109)
Q Consensus        64 w~~Lf~~~S~~LKelGIp~r~RKyIL~   90 (109)
                      +++|..++-.++-++ +|++|||.|-+
T Consensus        12 l~~L~~m~~~e~~~l-~~ar~RRs~~R   37 (135)
T TIGR01025        12 LEELQDMSLEELAKL-LPARQRRRLKR   37 (135)
T ss_pred             HHHHHcCCHHHHHHH-cCcccCccccc
Confidence            455555555555443 46777766643


No 79 
>PF14304 CSTF_C:  Transcription termination and cleavage factor C-terminal; PDB: 2J8P_A.
Probab=20.65  E-value=67  Score=20.22  Aligned_cols=34  Identities=18%  Similarity=0.313  Sum_probs=22.4

Q ss_pred             HHHHhhhchHHHHhcCCCchhhhHHhhhhHhhhhcc
Q 033920           64 FQRLLVTRTLKLKKLGIPCKHRKLILKHTHKYRLGL   99 (109)
Q Consensus        64 w~~Lf~~~S~~LKelGIp~r~RKyIL~~~ekyR~Gl   99 (109)
                      ...+++++..+.-  -.|+.+|.-|+.-++++++|.
T Consensus        11 l~QVL~Lt~eQI~--~LPp~qR~~I~~Lr~ql~~~~   44 (46)
T PF14304_consen   11 LMQVLQLTPEQIN--ALPPDQRQQILQLRQQLMRGE   44 (46)
T ss_dssp             HHHHHTS-HHHHH--TS-HHHHTHHHHHHHHHH---
T ss_pred             HHHHHcCCHHHHH--hCCHHHHHHHHHHHHHHHhcC
Confidence            4456666666654  349999999999999999984


No 80 
>PF15368 BioT2:  Spermatogenesis family BioT2
Probab=20.34  E-value=1.4e+02  Score=23.60  Aligned_cols=35  Identities=17%  Similarity=0.169  Sum_probs=26.7

Q ss_pred             cCCHHHHHhhhccchhHHHHhhhhhh----hhHHHHhhhchHH
Q 033920           36 KVGIPEFLNGIGKGVETHSAKLESEI----GDFQRLLVTRTLK   74 (109)
Q Consensus        36 ~~dV~tFL~~IGRg~~eha~Kfes~~----gdw~~Lf~~~S~~   74 (109)
                      --|+..||..    |+++|.-+|.++    +-++.||.|--.|
T Consensus       126 gdD~~SFL~~----CS~faaQLEeAvKEE~niLeSLfKWFQ~Q  164 (170)
T PF15368_consen  126 GDDMNSFLLC----CSQFAAQLEEAVKEERNILESLFKWFQQQ  164 (170)
T ss_pred             cccHHHHHHH----HHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            4688888864    899999999887    6677777775444


Done!