Query         psy7099
Match_columns 75
No_of_seqs    47 out of 49
Neff          4.3 
Searched_HMMs 46136
Date          Fri Aug 16 18:07:30 2013
Command       hhsearch -i /work/01045/syshi/Psyhhblits/psy7099.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/7099hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PLN03099 PIR Protein PIR; Prov 100.0 2.7E-31 5.9E-36  225.9   6.2   72    2-74    861-932 (1232)
  2 KOG3534|consensus              100.0 1.9E-29 4.1E-34  210.1   6.0   73    2-74    898-970 (1253)
  3 PF05994 FragX_IP:  Cytoplasmic  99.9   4E-26 8.8E-31  189.6   4.0   69    2-74    511-579 (820)
  4 PF02186 TFIIE_beta:  TFIIE bet  92.5   0.029 6.4E-07   34.3  -0.4   54   12-75      6-60  (65)
  5 cd07977 TFIIE_beta_winged_heli  82.8     1.3 2.9E-05   27.5   2.5   58   12-75     10-69  (75)
  6 PF08352 oligo_HPY:  Oligopepti  81.6     1.2 2.7E-05   25.6   1.9   23   36-58      6-28  (64)
  7 PF10542 Vitelline_membr:  Vite  75.9     1.8 3.9E-05   24.5   1.3   20   54-73     19-38  (38)
  8 PF08900 DUF1845:  Domain of un  75.2      17 0.00037   26.4   6.6   63    6-69     35-106 (217)
  9 PF03945 Endotoxin_N:  delta en  69.9      17 0.00037   25.2   5.4   41   13-53     27-67  (226)
 10 COG0692 Ung Uracil DNA glycosy  63.9     4.1 8.8E-05   30.7   1.3   23    2-24    192-214 (223)
 11 PF09735 Nckap1:  Membrane-asso  60.0      14 0.00029   33.3   4.0   38   11-49    829-866 (1116)
 12 TIGR03761 ICE_PFL4669 integrat  58.2      45 0.00098   24.6   6.0   59    4-63     31-98  (216)
 13 TIGR03687 pupylate_cterm ubiqu  57.8      30 0.00066   19.0   3.8   25   29-53      5-29  (33)
 14 PF10923 DUF2791:  P-loop Domai  56.1     6.3 0.00014   31.6   1.3   42   13-54    229-277 (416)
 15 PF05596 Taeniidae_ag:  Taeniid  55.1      25 0.00053   21.6   3.5   30   24-53     31-63  (64)
 16 PF06419 COG6:  Conserved oligo  52.9      63  0.0014   26.8   6.6   40   15-54    346-386 (618)
 17 PHA03347 uracil DNA glycosylas  50.2      11 0.00023   28.7   1.6   22    4-25    223-244 (252)
 18 PF01273 LBP_BPI_CETP:  LBP / B  48.5      38 0.00082   22.1   3.9   28   28-55    137-164 (164)
 19 PHA03200 uracil DNA glycosylas  48.3      13 0.00028   28.3   1.8   23    3-25    223-245 (255)
 20 PF10158 LOH1CR12:  Tumour supp  47.8      61  0.0013   22.1   4.9   38   30-68     93-130 (131)
 21 PF05190 MutS_IV:  MutS family   47.5     8.6 0.00019   22.7   0.6   17   58-74     30-46  (92)
 22 PF07962 Swi3:  Replication For  46.7      31 0.00067   21.6   3.1   20   17-37      5-24  (83)
 23 KOG4515|consensus               44.9      59  0.0013   24.4   4.8   39   30-69    157-195 (217)
 24 TIGR00628 ung uracil-DNA glyco  37.8      23  0.0005   26.1   1.7   22    3-24    190-211 (212)
 25 PHA02855 anti-apoptotic membra  36.9 1.3E+02  0.0029   22.0   5.5   33   13-45    101-133 (180)
 26 PF12668 DUF3791:  Protein of u  35.5      33 0.00071   20.0   1.8   34    5-38     27-60  (62)
 27 cd08787 CARD_NOD2_1_CARD15 Cas  35.0      78  0.0017   20.8   3.7   33   28-60     52-84  (87)
 28 COG1383 RPS17A Ribosomal prote  35.0      55  0.0012   20.9   2.9   30   28-57     34-63  (74)
 29 PRK05254 uracil-DNA glycosylas  32.2      23 0.00049   26.2   0.9   24    3-26    195-218 (224)
 30 PHA03204 uracil DNA glycosylas  31.5      30 0.00065   27.3   1.5   22    4-25    292-313 (322)
 31 PF10444 Nbl1_Borealin_N:  Nbl1  30.8      69  0.0015   18.6   2.7   30   29-59     18-47  (59)
 32 COG0444 DppD ABC-type dipeptid  30.4      43 0.00094   26.2   2.2   24   35-58    237-260 (316)
 33 PF01399 PCI:  PCI domain;  Int  29.6 1.2E+02  0.0027   17.7   4.7   25   11-35      1-25  (105)
 34 KOG1917|consensus               26.9 1.9E+02  0.0041   26.5   5.7   43    9-52    829-871 (1125)
 35 PF05531 NPV_P10:  Nucleopolyhe  26.0 1.8E+02  0.0039   18.4   4.5   34   23-57      3-36  (75)
 36 PF15209 IL31:  Interleukin 31   25.5      94   0.002   21.9   3.0   25   10-34     67-91  (137)
 37 PF06585 JHBP:  Haemolymph juve  25.2 2.4E+02  0.0052   19.6   5.1   48   10-57    197-244 (248)
 38 PF12261 T_hemolysin:  Thermost  25.1     4.8  0.0001   28.8  -3.6   62    6-68     98-159 (179)
 39 KOG2994|consensus               24.9      40 0.00087   26.4   1.2   24    2-25    266-289 (297)
 40 PHA03202 uracil DNA glycosylas  24.8      47   0.001   26.1   1.6   22    4-25    284-305 (313)
 41 PF06887 DUF1265:  Protein of u  24.1      57  0.0012   19.3   1.5   17   30-52      1-17  (48)
 42 PHA02774 E1; Provisional        24.1      68  0.0015   27.4   2.5   31    5-37    395-425 (613)
 43 PF06757 Ins_allergen_rp:  Inse  23.9      86  0.0019   21.6   2.6   52    4-57     50-116 (179)
 44 KOG4130|consensus               23.4 2.4E+02  0.0052   22.1   5.1   39    7-46    171-209 (291)
 45 PF04036 DUF372:  Domain of unk  22.5      12 0.00025   21.2  -1.6    9    3-11     14-22  (38)
 46 PRK01151 rps17E 30S ribosomal   22.4      49  0.0011   20.0   1.0   25   28-52     33-57  (58)
 47 COG4608 AppF ABC-type oligopep  22.0      57  0.0012   25.0   1.5   22   36-57    194-215 (268)
 48 COG2981 CysZ Uncharacterized p  21.9 1.9E+02  0.0041   22.3   4.3   34   24-57     29-63  (250)
 49 PF05120 GvpG:  Gas vesicle pro  21.5      93   0.002   19.7   2.2   20   23-42     10-29  (79)
 50 PHA03395 p10 fibrous body prot  21.2 2.5E+02  0.0054   18.3   4.5   35   22-57      2-36  (87)
 51 PF05878 Phyto_Pns9_10:  Phytor  21.0 1.3E+02  0.0029   23.8   3.3   27   24-50    194-220 (312)
 52 PLN02936 epsilon-ring hydroxyl  20.6 1.2E+02  0.0026   23.5   3.0   42   12-54      6-48  (489)
 53 PF14271 DUF4359:  Domain of un  20.4   2E+02  0.0043   18.6   3.7   68    5-73     15-99  (107)
 54 cd00170 SEC14 Sec14p-like lipi  20.2 1.6E+02  0.0035   17.8   3.1   41   26-67    109-156 (157)

No 1  
>PLN03099 PIR Protein PIR; Provisional
Probab=99.97  E-value=2.7e-31  Score=225.91  Aligned_cols=72  Identities=29%  Similarity=0.469  Sum_probs=70.7

Q ss_pred             ccccCCcchhHHHHHHHHhccccHHHHHHHHHHHHHHHHHhhHHHHHHHHHHhCCccCCCCCcccCCcccccc
Q psy7099           2 VKFLGFVGAYHFRAMCRLLGYQGIAVVMEELLKIVTSLIQGSLLQFTKTLMDAMPKQCKLPRYDYGSPGVLGY   74 (75)
Q Consensus         2 ~ly~gFvG~pH~~aivRLLGy~gia~vi~elLk~v~~~i~~~l~~yv~~l~~~mPk~~kLP~~dyGs~G~~~y   74 (75)
                      ++|+||||+|||+|||||||||||||||+||||+++++|+ +|.|||++|+++|||+||||++||||+||++|
T Consensus       861 ~ly~~f~G~pH~~ai~rLlG~~~la~vi~~lL~~i~~~i~-~l~~~v~~L~~~mPk~~~Lp~~dyG~~g~~~~  932 (1232)
T PLN03099        861 ELHSKFFGLPHMFAIVKLLGSRSLPWLIRALLDHLSQKIT-TLEPMIEDLREAMPKAIGLPSFDGGVAGCMKI  932 (1232)
T ss_pred             HHHhcccCcHHHHHHHHHhcccchHHHHHHHHHHHHHHHH-HHHHHHHHHHHhCccccCCCccccCchHHHHH
Confidence            5899999999999999999999999999999999999999 89999999999999999999999999999987


No 2  
>KOG3534|consensus
Probab=99.96  E-value=1.9e-29  Score=210.11  Aligned_cols=73  Identities=68%  Similarity=1.213  Sum_probs=71.3

Q ss_pred             ccccCCcchhHHHHHHHHhccccHHHHHHHHHHHHHHHHHhhHHHHHHHHHHhCCccCCCCCcccCCcccccc
Q psy7099           2 VKFLGFVGAYHFRAMCRLLGYQGIAVVMEELLKIVTSLIQGSLLQFTKTLMDAMPKQCKLPRYDYGSPGVLGY   74 (75)
Q Consensus         2 ~ly~gFvG~pH~~aivRLLGy~gia~vi~elLk~v~~~i~~~l~~yv~~l~~~mPk~~kLP~~dyGs~G~~~y   74 (75)
                      +.|++|||.|||+||||||||||||||++||||++++++|++|.+||+++++.|||+|||||.||||+|+++|
T Consensus       898 ~~Y~~fvG~pHf~aicRLLgYQGIAVvmdelLKivk~Llqg~ilq~vktl~~~MPKiCkLPR~eYGSpgiL~y  970 (1253)
T KOG3534|consen  898 GSYRNFVGPPHFKAICRLLGYQGIAVVMDELLKIVKSLLQGTILQYVKTLMEVMPKICKLPRHEYGSPGILEY  970 (1253)
T ss_pred             HHHhcCCCcHHHHHHHHHhcccchHHHHHHHHHHHHHHHhhhHHHHHHHHHHhhHHhcCCCccccCChHHHHH
Confidence            3589999999999999999999999999999999999999999999999999999999999999999999987


No 3  
>PF05994 FragX_IP:  Cytoplasmic Fragile-X interacting family;  InterPro: IPR008081 Cytoplasmic fragile X mental retardation protein (FMRP) interacting protein belongs to a highly conserved but, as yet, functionally uncharacterised family. Absence of FMRP is responsible for pathologic manifestations in Fragile X Syndrome, the most frequent cause of inherited mental retardation []. FMRP is an RNA-binding protein that may have a role in local protein translation at neuronal dendrites and in dendritic spine maturation []. CYFIP1 and CYFIP2, which share a high level of sequence identity, have recently been identified as cytoplasmic FMRP interacting proteins []. CYFIP2 interacts with FMRP-related proteins FXR1P/2P, while CYFIP1 interacts exclusively with FMRP. The FMRP-CYFIP interaction involves the domain of FMRP that also mediates homo- and heteromerisation, suggesting competition between the various interaction partners. CYFIP1 also interacts with the small GTPase Rac1 implicated in development and maintenance of neuronal structures. CYFIP1/2 are both present in synaptosomal extracts [].  PIR121 (121F-specific p53 inducible RNA) is another functionally uncharacterised member of this family. The PIR121 gene maps to human chromosome 5q34, a region frequently translocated in acute myeloid leukaemia but not known to be amplified or deleted in solid tumours. Interaction between PIR121 and FMRP has been demonstrated, and hence PIR121 has also been termed CYFIP2 (Cytoplasmic FMRP Interacting Protein 2) [, ].  Shyc (Selective HYbridizing Clone) is a cytoplasmic protein of unknown function, expressed in the developing and embryonic nervous system. The protein has also been designated CYFIP1 due to the high sequence identity (98.7%) to its human orthologue. The CYFIP orthologues in Caenorhabditis elegans and Drosophila melanogaster (Fruit fly) share about 51% and 67% sequence identity with the human proteins, respectively []. 
The high level of conservation manifest throughout the entire CYFIP sequence between various orthologues suggests a number of functionally/structurally important domains. ; PDB: 3P8C_A.
Probab=99.92  E-value=4e-26  Score=189.59  Aligned_cols=69  Identities=59%  Similarity=1.091  Sum_probs=57.9

Q ss_pred             ccccCCcchhHHHHHHHHhccccHHHHHHHHHHHHHHHHHhhHHHHHHHHHHhCCccCCCCCcccCCcccccc
Q psy7099           2 VKFLGFVGAYHFRAMCRLLGYQGIAVVMEELLKIVTSLIQGSLLQFTKTLMDAMPKQCKLPRYDYGSPGVLGY   74 (75)
Q Consensus         2 ~ly~gFvG~pH~~aivRLLGy~gia~vi~elLk~v~~~i~~~l~~yv~~l~~~mPk~~kLP~~dyGs~G~~~y   74 (75)
                      ++|++|||+|||+||||||||+|||||++||++    +|+++++|||++++++|||+||||++|||++|||+|
T Consensus       511 ~~~~~FvG~pH~~ai~rLLg~~~la~li~ell~----~i~~~~~~~V~~l~~~mPk~~~lP~~dygs~g~~~~  579 (820)
T PF05994_consen  511 SLYRGFVGVPHFKAIVRLLGYRGLAVLIEELLK----LIQNKIEPYVKALMEAMPKSCKLPPFDYGSAGVYDY  579 (820)
T ss_dssp             GGGGS-B-HHHHHHHHHHHHHHHHHHHHHHHHH----HHHTHHHHHHHHHHHHS-SB-----GGG-HHHHHHH
T ss_pred             HHhCCccChHHHHHHHHHhCCCcHHHHHHHHHH----HHHHHHHHHHHHHHHhCCccccCCCccCCCHHHHHH
Confidence            589999999999999999999999999999999    888889999999999999999999999999999987


No 4  
>PF02186 TFIIE_beta:  TFIIE beta subunit core domain;  InterPro: IPR003166 Initiation of eukaryotic mRNA transcription requires melting of promoter DNA with the help of the general transcription factors TFIIE and TFIIH. In higher eukaryotes, the general transcription factor TFIIE consists of two subunits: the large alpha subunit (IPR002853 from INTERPRO) and the small beta (IPR003166 from INTERPRO). TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The approximately 120-residue central core domain of TFIIE beta plays a role in double-stranded DNA binding of TFIIE []. The TFIIE beta central core DNA-binding domain consists of three helices with a beta hairpin at the C terminus, resembling the winged helix proteins. It shows a novel double-stranded DNA-binding activity where the DNA-binding surface locates on the opposite side to the previously reported winged helix motif by forming a positively charged furrow []. This entry represents the beta subunit of the transcription factor TFIIE.; GO: 0006367 transcription initiation from RNA polymerase II promoter, 0005673 transcription factor TFIIE complex; PDB: 1D8K_A 1D8J_A.
Probab=92.54  E-value=0.029  Score=34.28  Aligned_cols=54  Identities=19%  Similarity=0.393  Sum_probs=36.2

Q ss_pred             HHHHHHHHhccccHHHHHHHHHHHHHHHHHhhHHHHHHHHHHhCCccCCCCCcccCCcc-ccccC
Q psy7099          12 HFRAMCRLLGYQGIAVVMEELLKIVTSLIQGSLLQFTKTLMDAMPKQCKLPRYDYGSPG-VLGYR   75 (75)
Q Consensus        12 H~~aivRLLGy~gia~vi~elLk~v~~~i~~~l~~yv~~l~~~mPk~~kLP~~dyGs~G-~~~y~   75 (75)
                      -++.+|..+-.++-|+-++|+++..+--+...+.++.+          +.|+..|-..| .|+|+
T Consensus         6 ql~~~VeymK~r~~Plt~~eI~d~l~~d~~~~~~~~Lk----------~npKI~~d~~~~~f~fk   60 (65)
T PF02186_consen    6 QLAKAVEYMKKRDHPLTLEEILDYLSLDIGKKLKQWLK----------NNPKIEYDPDGNTFSFK   60 (65)
T ss_dssp             HHHHHHHHHHHH-S-B-HHHHHHHHTSSS-HHHHHHHH----------H-TTEEEE-TT-CEEE-
T ss_pred             HHHHHHHHHHhcCCCcCHHHHHHHHcCCCCHHHHHHHH----------cCCCEEEecCCCEEEec
Confidence            46678888889999999999999988556555555554          66778887666 88774


No 5  
>cd07977 TFIIE_beta_winged_helix TFIIE_beta_winged_helix domain, located at the central core region of TFIIE beta, with double-stranded DNA binding activity. Transcription Factor IIE (TFIIE) beta winged-helix (or forkhead) domain is located at the central core region of TFIIE beta. The winged-helix is a form of helix-turn-helix (HTH) domain which typically binds DNA with the 3rd helix. The winged-helix domain is distinguished by the presence of a C-terminal beta-strand hairpin unit (the wing) that packs against the cleft of the tri-helical core. Although most winged-helix domains are multi-member families, TFIIE beta winged-helix domain is typically found as a single orthologous group. TFIIE is one of the six eukaryotic general transcription factors (TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH) that are required for transcription initiation of protein-coding genes. TFIIE is a heterotetramer consisting of two copies each of alpha and beta subunits. TFIIE beta contains several functional 
Probab=82.83  E-value=1.3  Score=27.45  Aligned_cols=58  Identities=14%  Similarity=0.292  Sum_probs=39.8

Q ss_pred             HHHHHHHHhcccc-HHHHHHHHHHHHH-HHHHhhHHHHHHHHHHhCCccCCCCCcccCCccccccC
Q psy7099          12 HFRAMCRLLGYQG-IAVVMEELLKIVT-SLIQGSLLQFTKTLMDAMPKQCKLPRYDYGSPGVLGYR   75 (75)
Q Consensus        12 H~~aivRLLGy~g-ia~vi~elLk~v~-~~i~~~l~~yv~~l~~~mPk~~kLP~~dyGs~G~~~y~   75 (75)
                      -+.-+|.-+-.++ -|+-++|+++..+ --+...++++.++..  .+   ..|++|-. .|.|.|+
T Consensus        10 ~l~~aV~ymK~r~~~Plt~~EIl~~ls~~d~~~~~~~~L~~~~--~~---~n~~~~~~-~~tf~fk   69 (75)
T cd07977          10 QLAKIVDYMKKRHQHPLTLDEILDYLSLLDIGPKLKEWLKSEA--LV---NNPKIDPK-DGTFSFK   69 (75)
T ss_pred             hHHHHHHHHHhcCCCCccHHHHHHHHhccCccHHHHHHHHhhh--hc---cCceeccC-CCEEEec
Confidence            3566788888899 9999999999988 555555555554221  11   26677754 6788875


No 6  
>PF08352 oligo_HPY:  Oligopeptide/dipeptide transporter, C-terminal region;  InterPro: IPR013563 ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per systems) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain [].  The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. 
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site [, , ]. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis [, , , , , ]. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions []. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette [, ]. More than 50 subfamilies have been described based on a phylogenetic and functional classification [, , ]; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). 
This entry features a region found towards the C terminus of oligopeptide ABC transporter ATP binding proteins, immediately following the ATP-binding domain (IPR003439 from INTERPRO). All characterised members appear able to be involved in the transport of oligopeptides or dipeptides. Some are important for sporulation or antibiotic resistance. Some dipeptide transporters also act on the haem precursor delta-aminolevulinic acid. ; GO: 0000166 nucleotide binding, 0005524 ATP binding, 0015833 peptide transport
Probab=81.57  E-value=1.2  Score=25.60  Aligned_cols=23  Identities=17%  Similarity=0.419  Sum_probs=20.1

Q ss_pred             HHHHHHhhHHHHHHHHHHhCCcc
Q psy7099          36 VTSLIQGSLLQFTKTLMDAMPKQ   58 (75)
Q Consensus        36 v~~~i~~~l~~yv~~l~~~mPk~   58 (75)
                      ++.++++...||++.|+++.|..
T Consensus         6 ~~~i~~~P~HPYT~~Ll~a~p~~   28 (64)
T PF08352_consen    6 TEDIFDNPRHPYTRALLAAVPSI   28 (64)
T ss_pred             HHHHHHCccCHHHHHHHHcCcch
Confidence            56788999999999999999854


No 7  
>PF10542 Vitelline_membr:  Vitelline membrane cysteine-rich region;  InterPro: IPR013135 In Drosophila melanogaster (Fruit fly) the vitelline membrane (VM) is the first layer of the eggshell produced by the follicular epithelium. It is composed of at least four different proteins. VM proteins are similarly organised with a central highly conserved 38-amino acid domain which is flanked by unrelated regions. Since the surrounding regions have diverged significantly, it is possible that the VM domain is of key importance in VM protein structure [, ]. The VM domain contains three highly conserved cysteines.
Probab=75.85  E-value=1.8  Score=24.46  Aligned_cols=20  Identities=35%  Similarity=0.673  Sum_probs=16.5

Q ss_pred             hCCccCCCCCcccCCccccc
Q psy7099          54 AMPKQCKLPRYDYGSPGVLG   73 (75)
Q Consensus        54 ~mPk~~kLP~~dyGs~G~~~   73 (75)
                      .-|..|.-|..+|||+|.+.
T Consensus        19 l~PvPC~~~a~~ygsagay~   38 (38)
T PF10542_consen   19 LKPVPCSAPAPSYGSAGAYT   38 (38)
T ss_pred             cccccCCCCCCccccccccC
Confidence            34778999999999999863


No 8  
>PF08900 DUF1845:  Domain of unknown function (DUF1845);  InterPro: IPR014996  Members of this protein family, such as PFL4669, are found in integrating conjugative elements (ICE) of the PFGI-1 class as in Pseudomonas fluorescens. 
Probab=75.21  E-value=17  Score=26.43  Aligned_cols=63  Identities=16%  Similarity=0.223  Sum_probs=48.9

Q ss_pred             CCcchhHHHHHHHHhcccc---------HHHHHHHHHHHHHHHHHhhHHHHHHHHHHhCCccCCCCCcccCCc
Q psy7099           6 GFVGAYHFRAMCRLLGYQG---------IAVVMEELLKIVTSLIQGSLLQFTKTLMDAMPKQCKLPRYDYGSP   69 (75)
Q Consensus         6 gFvG~pH~~aivRLLGy~g---------ia~vi~elLk~v~~~i~~~l~~yv~~l~~~mPk~~kLP~~dyGs~   69 (75)
                      +-+|+|-|.++++.+....         .=+-+||-|..++..++. +...++++++..|+.+++....+-.|
T Consensus        35 ~I~Gm~~~~~~~~~i~~~a~~DdPyAD~~L~~iEe~i~~~~~~l~~-~~~~l~~~l~~~p~~i~i~~~~s~~P  106 (217)
T PF08900_consen   35 AIIGMPGFASRLNRIWRDARQDDPYADWWLLRIEEKINEARQELQE-LIARLDALLAELPKGISISEIQSVQP  106 (217)
T ss_pred             CCcCHHHHHHHHHHHHHHHhcCCcHHHHHHHHHHHHHHHHHHHHHH-HHHHHHHHHHhCcCCccccccccCCC
Confidence            5689999999999887622         235678999999999976 55678888888999988877665544


No 9  
>PF03945 Endotoxin_N:  delta endotoxin, N-terminal domain;  InterPro: IPR005639 This family contains insecticidal toxins produced by Bacillus species of bacteria. During spore formation the bacteria produce crystals of this protein. When an insect ingests these proteins, they are activated by proteolytic cleavage. The N terminus is cleaved in all of the proteins and a C-terminal extension is cleaved in some members. Once activated, the endotoxin binds to the gut epithelium and causes cell lysis by the formation of cation-selective channels, which leads to death. The activated region of the delta toxin is composed of three distinct structural domains: an N-terminal helical bundle domain involved in membrane insertion and pore formation; a beta-sheet central domain (IPR001178 from INTERPRO) involved in receptor binding; and a C-terminal beta-sandwich domain (IPR005638 from INTERPRO) that interacts with the N-terminal domain to form a channel [, ]. This entry represents the conserved N-terminal domain.; GO: 0009405 pathogenesis; PDB: 3EB7_A 1I5P_A 2C9K_A 1CIY_A 1JI6_A 1DLC_A 1W99_A.
Probab=69.93  E-value=17  Score=25.16  Aligned_cols=41  Identities=17%  Similarity=0.342  Sum_probs=32.2

Q ss_pred             HHHHHHHhccccHHHHHHHHHHHHHHHHHhhHHHHHHHHHH
Q psy7099          13 FRAMCRLLGYQGIAVVMEELLKIVTSLIQGSLLQFTKTLMD   53 (75)
Q Consensus        13 ~~aivRLLGy~gia~vi~elLk~v~~~i~~~l~~yv~~l~~   53 (75)
                      +..++.+|.-.+=+=+.+++.+.++++|+..|..+..+...
T Consensus        27 ~s~l~~~lwp~~~~~~~~~~~~~ve~lI~~~I~~~~~~~~~   67 (226)
T PF03945_consen   27 LSTLIGILWPSSSPDIWEEIIKQVENLIDQKITEYDINILN   67 (226)
T ss_dssp             HHHHHHHCSSTGSHHHHHHHHHHHHHHHTHHHHHHHHHHHH
T ss_pred             HHHHHHHHCCCCCchHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence            33567777666655588999999999999999999887654


No 10 
>COG0692 Ung Uracil DNA glycosylase [DNA replication, recombination, and repair]
Probab=63.94  E-value=4.1  Score=30.66  Aligned_cols=23  Identities=30%  Similarity=0.486  Sum_probs=19.0

Q ss_pred             ccccCCcchhHHHHHHHHhcccc
Q psy7099           2 VKFLGFVGAYHFRAMCRLLGYQG   24 (75)
Q Consensus         2 ~ly~gFvG~pH~~aivRLLGy~g   24 (75)
                      +.++||||..||+.+=..|-.+|
T Consensus       192 Sa~rGFfG~~hFsk~N~~L~~~g  214 (223)
T COG0692         192 SAHRGFFGCKHFSKANEWLEKHG  214 (223)
T ss_pred             ccccCccCCCchHHHHHHHHHcC
Confidence            45899999999999888776655


No 11 
>PF09735 Nckap1:  Membrane-associated apoptosis protein;  InterPro: IPR019137 Nck-associated protein 1 is part of a lamellipodial complex that controls Rac-dependent actin remodeling. It associates preferentially with the first SH3 domain of Nck and is a component of the WAVE2 complex composed of ABI1, CYFIP1/SRA1, NCKAP1/NAP1 and WASF2/WAVE2. It is also a component of the WAVE1 complex composed of ABI2, CYFIP2, C3orf10/HSPC300, NCKAP1 and WASF1/WAVE1. CYFIP2 binds to activated RAC1 which causes the complex to dissociate, releasing activated WASF1. The complex can also be activated by NCK1. Expression of this protein was found to be markedly reduced in patients with Alzheimer's disease [].; PDB: 3P8C_B.
Probab=60.02  E-value=14  Score=33.28  Aligned_cols=38  Identities=26%  Similarity=0.434  Sum_probs=32.6

Q ss_pred             hHHHHHHHHhccccHHHHHHHHHHHHHHHHHhhHHHHHH
Q psy7099          11 YHFRAMCRLLGYQGIAVVMEELLKIVTSLIQGSLLQFTK   49 (75)
Q Consensus        11 pH~~aivRLLGy~gia~vi~elLk~v~~~i~~~l~~yv~   49 (75)
                      .=++|+++|+|--|+..+=+.|+.++.+.++. |+..|.
T Consensus       829 ~ELrAL~~LiGpyGvk~l~~~L~~~va~~v~e-lkk~v~  866 (1116)
T PF09735_consen  829 NELRALVELIGPYGVKFLDENLMWHVASQVNE-LKKLVR  866 (1116)
T ss_dssp             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH-HHHHHH
T ss_pred             HHHHHHHHHhCHHHHHHHHHHHHHHHHHHHHH-HHHHHH
Confidence            44789999999999999999999999999955 766654


No 12 
>TIGR03761 ICE_PFL4669 integrating conjugative element protein, PFL_4669 family. Members of this protein family, such as PFL4669, are found in integrating conjugative elements (ICE) of the PFGI-1 class as in Pseudomonas fluorescens.
Probab=58.22  E-value=45  Score=24.64  Aligned_cols=59  Identities=22%  Similarity=0.238  Sum_probs=44.9

Q ss_pred             ccCCcchhHHHHHHHHhccccH---------HHHHHHHHHHHHHHHHhhHHHHHHHHHHhCCccCCCCC
Q psy7099           4 FLGFVGAYHFRAMCRLLGYQGI---------AVVMEELLKIVTSLIQGSLLQFTKTLMDAMPKQCKLPR   63 (75)
Q Consensus         4 y~gFvG~pH~~aivRLLGy~gi---------a~vi~elLk~v~~~i~~~l~~yv~~l~~~mPk~~kLP~   63 (75)
                      ..+-+|+|.|.+.++.+-...-         =+=+||=|..++..++. +...+++.++.+|+.+++..
T Consensus        31 ~~~IiGl~~f~s~~~~i~~~a~~DdPyAD~~Ll~~E~~l~~~~~~l~~-~~~~l~~~l~~~p~~l~ls~   98 (216)
T TIGR03761        31 KPGIIGMPGFISRLNRINQASEQDDPYADWALLRIEEKLLSARQEMQA-LLQRLDDLLAQLPPALDLSE   98 (216)
T ss_pred             CCCCcCcHHHHHHHHHHHHHHHcCCcHHHHHHHHHHHHHHHHHHHHHH-HHHHHHHHHHhCCCccchhh
Confidence            4567999999999998864211         13468888889999966 77788888888998877653


No 13 
>TIGR03687 pupylate_cterm ubiquitin-like protein Pup. Members of this protein family are Pup, a small protein whose ligation to target proteins steers them toward degradation. This protein family occurs in a number of bacteria, especially Actinobacteria such as Mycobacterium tuberculosis, that possess an archaeal-type proteasome. All members of this protein family known during model construction end with the C-terminal motif [FY][VI]QKGG[QE]. Ligation is thought to occur between the C-terminal COOH of Pup and an epsilon-amino group of a Lys on the target protein. The N-terminal half of this protein is poorly conserved and not represented in the seed alignment.
Probab=57.81  E-value=30  Score=18.99  Aligned_cols=25  Identities=12%  Similarity=0.464  Sum_probs=21.0

Q ss_pred             HHHHHHHHHHHHHhhHHHHHHHHHH
Q psy7099          29 MEELLKIVTSLIQGSLLQFTKTLMD   53 (75)
Q Consensus        29 i~elLk~v~~~i~~~l~~yv~~l~~   53 (75)
                      ++.||+-|.+.+...-..||++..+
T Consensus         5 ~D~lLDeId~vLe~NAe~FV~~fVQ   29 (33)
T TIGR03687         5 VDDLLDEIDGVLESNAEEFVRGFVQ   29 (33)
T ss_pred             HHHHHHHHHHHHHHhHHHHHHHHHH
Confidence            5788998999998888999998763


No 14 
>PF10923 DUF2791:  P-loop Domain of unknown function (DUF2791);  InterPro: IPR021228  This is a family of proteins found in archaea and bacteria. Some of the proteins in this family are annotated as being methyl-accepting chemotaxis proteins and ATP/GTP binding proteins. 
Probab=56.10  E-value=6.3  Score=31.64  Aligned_cols=42  Identities=19%  Similarity=0.356  Sum_probs=30.5

Q ss_pred             HHHHHHHhccccHHHHHHHHHHH-------HHHHHHhhHHHHHHHHHHh
Q psy7099          13 FRAMCRLLGYQGIAVVMEELLKI-------VTSLIQGSLLQFTKTLMDA   54 (75)
Q Consensus        13 ~~aivRLLGy~gia~vi~elLk~-------v~~~i~~~l~~yv~~l~~~   54 (75)
                      |..++|..||+|+-++++|+-..       .+++-.+.|++.++++..+
T Consensus       229 L~~~lr~aGy~GLlI~lDE~e~l~kl~~~~~R~~~ye~lr~lidd~~~G  277 (416)
T PF10923_consen  229 LARFLRDAGYKGLLILLDELENLYKLRNDQAREKNYEALRQLIDDIDQG  277 (416)
T ss_pred             HHHHHHHcCCCceEEEEechHHHHhcCChHHHHHHHHHHHHHHHHHhcC
Confidence            56788999999999999998653       2445556666666666653


No 15 
>PF05596 Taeniidae_ag:  Taeniidae antigen;  InterPro: IPR008860 This family consists of several antigen proteins from Taenia and Echinococcus (tapeworm) species.
Probab=55.12  E-value=25  Score=21.60  Aligned_cols=30  Identities=23%  Similarity=0.312  Sum_probs=22.9

Q ss_pred             cHHHHHHHHHH---HHHHHHHhhHHHHHHHHHH
Q psy7099          24 GIAVVMEELLK---IVTSLIQGSLLQFTKTLMD   53 (75)
Q Consensus        24 gia~vi~elLk---~v~~~i~~~l~~yv~~l~~   53 (75)
                      .||=+..|+..   .++.+|...|..||+.|.+
T Consensus        31 kIa~l~kdw~~~~~~~r~KiR~~L~ey~k~L~~   63 (64)
T PF05596_consen   31 KIAQLAKDWNEICQEVRKKIRAALAEYCKGLKN   63 (64)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcc
Confidence            55555555544   7788999999999999864


No 16 
>PF06419 COG6:  Conserved oligomeric complex COG6;  InterPro: IPR010490 COG6 is a component of the conserved oligomeric golgi complex, which is composed of eight different subunits and is required for normal golgi morphology and localisation.
Probab=52.87  E-value=63  Score=26.77  Aligned_cols=40  Identities=18%  Similarity=0.330  Sum_probs=36.7

Q ss_pred             HHHHHhcccc-HHHHHHHHHHHHHHHHHhhHHHHHHHHHHh
Q psy7099          15 AMCRLLGYQG-IAVVMEELLKIVTSLIQGSLLQFTKTLMDA   54 (75)
Q Consensus        15 aivRLLGy~g-ia~vi~elLk~v~~~i~~~l~~yv~~l~~~   54 (75)
                      .+.+++|..+ +.-.+.||-+.....+.+.++.+++++.+-
T Consensus       346 ~~~k~i~~~s~L~~tl~~L~~~a~~~f~~~l~~~~~~l~~~  386 (618)
T PF06419_consen  346 TFSKLIGEDSSLIETLKELQDLAQKKFFSSLRDHVAKLLRS  386 (618)
T ss_pred             HHHHHcCCCchHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh
Confidence            4678899987 999999999999999999999999999976


No 17 
>PHA03347 uracil DNA glycosylase; Provisional
Probab=50.23  E-value=11  Score=28.75  Aligned_cols=22  Identities=27%  Similarity=0.480  Sum_probs=18.6

Q ss_pred             ccCCcchhHHHHHHHHhccccH
Q psy7099           4 FLGFVGAYHFRAMCRLLGYQGI   25 (75)
Q Consensus         4 y~gFvG~pH~~aivRLLGy~gi   25 (75)
                      |+||+|..||..+=..|-.+|.
T Consensus       223 ~~gF~Gs~hFsk~N~~L~~~g~  244 (252)
T PHA03347        223 WPKFLGCNHFVLANKYLTQHGK  244 (252)
T ss_pred             CCCCCCCCHHHHHHHHHHHcCC
Confidence            4689999999999888877664


No 18 
>PF01273 LBP_BPI_CETP:  LBP / BPI / CETP family, N-terminal domain;  InterPro: IPR017942 This entry represents the N-terminal domain found in several lipid-binding serum glycoproteins. The N- and C-terminal domains share a similar two-layer alpha/beta structure, but they show little sequence identity. Proteins containing this N-terminal domain include:   Bactericidal permeability-increasing protein (BPI) Lipopolysaccharide-binding protein (LBP) Cholesteryl ester transfer protein (CETP) Phospholipid transfer protein (PLTP) Palate, lung and nasal epithelium carcinoma-associated protein (PLUNC)    Bactericidal permeability-increasing protein (BPI) is a potent antimicrobial protein of 456 residues that binds to and neutralises lipopolysaccharides from the outer membrane of Gram-negative bacteria []. BPI contains two domains that adopt the same structural fold, even though they have little sequence similarity [].   Lipopolysaccharide-binding protein (LBP) is an endotoxin-binding protein that is closely related to, and functions in a co-ordinated manner with BPI to facilitate an integrated host response to invading Gram-negative bacteria []. Cholesteryl ester transfer protein (CETP) is a glycoprotein that facilitates the transfer of lipids (cholesteryl esters and triglycerides) between the different lipoproteins that transport them through plasma, including HDL, LDL, VLDL and chylomicrons. These lipoproteins shield the lipids from water by encapsulating them within a coating of polar lipids and proteins [].  Phospholipid transfer protein (PLTP) exchanges phospholipids between lipoproteins and remodels high-density lipoproteins (HDLs) []. Palate, lung and nasal epithelium carcinoma-associated protein (PLUNC) is a potential host defensive protein that is secreted from the submucosal gland to the saliva and nasal lavage fluid. 
PLUNC appears to be a secreted product of neutrophil granules that participates in an aspect of the inflammatory response that contributes to host defence []. Short palate, lung and nasal epithelium clone 1 (SPLUNC1) may bind the lipopolysaccharide of Gram-negative nanobacteria, thereby playing an important role in the host defence of nasopharyngeal epithelium [].; GO: 0008289 lipid binding; PDB: 1EWF_A 1BP1_A 2OBD_A.
Probab=48.49  E-value=38  Score=22.11  Aligned_cols=28  Identities=21%  Similarity=0.349  Sum_probs=21.1

Q ss_pred             HHHHHHHHHHHHHHhhHHHHHHHHHHhC
Q psy7099          28 VMEELLKIVTSLIQGSLLQFTKTLMDAM   55 (75)
Q Consensus        28 vi~elLk~v~~~i~~~l~~yv~~l~~~m   55 (75)
                      +.+.+-+.++..+++.++|-++++.+.|
T Consensus       137 ~~~~l~~~i~~~l~~~iC~~i~~~l~~l  164 (164)
T PF01273_consen  137 FINFLSKSIRSLLQKKICPVINSLLSNL  164 (164)
T ss_dssp             HHHHTHHHHHHHHHHHHHHHHHHHHHH-
T ss_pred             HHHHHHHHHHHHHHhhcchHhHHhhcCC
Confidence            3445555889999999999999887654


No 19 
>PHA03200 uracil DNA glycosylase; Provisional
Probab=48.28  E-value=13  Score=28.33  Aligned_cols=23  Identities=26%  Similarity=0.355  Sum_probs=19.6

Q ss_pred             cccCCcchhHHHHHHHHhccccH
Q psy7099           3 KFLGFVGAYHFRAMCRLLGYQGI   25 (75)
Q Consensus         3 ly~gFvG~pH~~aivRLLGy~gi   25 (75)
                      .++||+|..||..+=..|-.+|.
T Consensus       223 a~rgFfgs~hFsk~N~~L~~~g~  245 (255)
T PHA03200        223 ARTPFIGNNHFVLANEYLSTHGK  245 (255)
T ss_pred             cCCCCCCCCHHHHHHHHHHHcCC
Confidence            57899999999999888877664


No 20 
>PF10158 LOH1CR12:  Tumour suppressor protein;  InterPro: IPR018780 This entry represents a region of 130 amino acids that is the most conserved part of some hypothetical proteins involved in loss of heterozygosity, and thus, tumour suppression []. The exact function of these proteins is not known. 
Probab=47.78  E-value=61  Score=22.10  Aligned_cols=38  Identities=18%  Similarity=0.329  Sum_probs=30.8

Q ss_pred             HHHHHHHHHHHHhhHHHHHHHHHHhCCccCCCCCcccCC
Q psy7099          30 EELLKIVTSLIQGSLLQFTKTLMDAMPKQCKLPRYDYGS   68 (75)
Q Consensus        30 ~elLk~v~~~i~~~l~~yv~~l~~~mPk~~kLP~~dyGs   68 (75)
                      .+-|..+++++++ +.+-++.|=+.+|..=+||++-+-+
T Consensus        93 s~~L~~~~~lL~~-~v~~ie~LN~~LP~~~RLep~~~~~  130 (131)
T PF10158_consen   93 SQQLSRCQSLLNQ-TVPSIETLNEILPEEERLEPFVWTT  130 (131)
T ss_pred             HHHHHHHHHHHHH-HHHHHHHHHhhCChhhcCCCCCCCC
Confidence            4556677888855 6678899999999999999997644


No 21 
>PF05190 MutS_IV:  MutS family domain IV C-terminus.;  InterPro: IPR007861 Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication []. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base []. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch []. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level []. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.  MutS is a modular protein with a complex structure [], and is composed of:   N-terminal mismatch-recognition domain, which is similar in structure to tRNA endonuclease. Connector domain, which is similar in structure to Holliday junction resolvase ruvC. Core domain, which is composed of two separate subdomains that join together to form a helical bundle; from within the core domain, two helices act as levers that extend towards (but do not touch) the DNA. 
Clamp domain, which is inserted between the two subdomains of the core domain at the top of the lever helices; the clamp domain has a beta-sheet structure. ATPase domain (connected to the core domain), which has a classical Walker A motif. HTH (helix-turn-helix) domain, which is involved in dimer contacts.   The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair. Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein [].This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions []. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts [].  This entry represents the clamp domain (domain 4) found in proteins of the MutS family. The clamp domain is inserted within the core domain at the top of the lever helices. It has a beta-sheet structure [].; GO: 0005524 ATP binding, 0030983 mismatched DNA binding, 0006298 mismatch repair; PDB: 2WTU_A 1OH7_A 1OH5_B 1W7A_B 1NG9_A 1OH8_B 1WBD_A 1WB9_A 3K0S_A 1OH6_A ....
Probab=47.49  E-value=8.6  Score=22.72  Aligned_cols=17  Identities=18%  Similarity=0.212  Sum_probs=8.7

Q ss_pred             cCCCCCcccCCcccccc
Q psy7099          58 QCKLPRYDYGSPGVLGY   74 (75)
Q Consensus        58 ~~kLP~~dyGs~G~~~y   74 (75)
                      ..++|...|...+..+|
T Consensus        30 ~~~~~~lk~~~~~~~gy   46 (92)
T PF05190_consen   30 KLGIPSLKLVYIPKRGY   46 (92)
T ss_dssp             HCT-TTBEEEEETTTEE
T ss_pred             HcCCCcEEEEEcCceEE
Confidence            44446666655555444


No 22 
>PF07962 Swi3:  Replication Fork Protection Component Swi3;  InterPro: IPR012923 Replication fork pausing is required to initiate recombination events. More specifically, Swi1 is required for recombination near the mat1 locus. Swi3 has been found to co-purify with Swi1. Together they define a fork protection complex that coordinates leading- and lagging-strand synthesis and stabilises stalled replication forks []. This complex is required for accurate replication, fork protection and replication checkpoint signalling [, ].; GO: 0006974 response to DNA damage stimulus, 0007049 cell cycle, 0048478 replication fork protection, 0005634 nucleus
Probab=46.73  E-value=31  Score=21.60  Aligned_cols=20  Identities=25%  Similarity=0.378  Sum_probs=16.0

Q ss_pred             HHHhccccHHHHHHHHHHHHH
Q psy7099          17 CRLLGYQGIAVVMEELLKIVT   37 (75)
Q Consensus        17 vRLLGy~gia~vi~elLk~v~   37 (75)
                      -||+|-+|||.+.+.. +.++
T Consensus         5 ~rL~~~~Glp~l~~~~-k~~k   24 (83)
T PF07962_consen    5 ERLLSPKGLPYLRKNF-KKFK   24 (83)
T ss_pred             HHccCCCCHHHHHHHH-HHcC
Confidence            3899999999998877 6544


No 23 
>KOG4515|consensus
Probab=44.85  E-value=59  Score=24.42  Aligned_cols=39  Identities=15%  Similarity=0.342  Sum_probs=31.7

Q ss_pred             HHHHHHHHHHHHhhHHHHHHHHHHhCCccCCCCCcccCCc
Q psy7099          30 EELLKIVTSLIQGSLLQFTKTLMDAMPKQCKLPRYDYGSP   69 (75)
Q Consensus        30 ~elLk~v~~~i~~~l~~yv~~l~~~mPk~~kLP~~dyGs~   69 (75)
                      ...|.-++..|. .+-|.+++|-+.+|..=+||+|.+|+.
T Consensus       157 s~~l~riq~~l~-~~Vp~le~lN~~L~~~eRLePf~~~~d  195 (217)
T KOG4515|consen  157 SDDLCRIQIILE-DIVPMLETLNEILTPDERLEPFNLGSD  195 (217)
T ss_pred             HHHHHHHHHHHH-HhHHHHHHHHhcCCcccccCCcccCcc
Confidence            344556677774 478899999999999999999999985


No 24 
>TIGR00628 ung uracil-DNA glycosylase. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).
Probab=37.83  E-value=23  Score=26.08  Aligned_cols=22  Identities=32%  Similarity=0.492  Sum_probs=18.3

Q ss_pred             cccCCcchhHHHHHHHHhcccc
Q psy7099           3 KFLGFVGAYHFRAMCRLLGYQG   24 (75)
Q Consensus         3 ly~gFvG~pH~~aivRLLGy~g   24 (75)
                      .++||+|..||..+=..|-.+|
T Consensus       190 a~~gF~gs~~Fs~~N~~L~~~g  211 (212)
T TIGR00628       190 ARRGFFGCRHFSKANEYLEKHG  211 (212)
T ss_pred             cCCCCCCCCHHHHHHHHHHHcC
Confidence            5789999999999888776654


No 25 
>PHA02855 anti-apoptotic membrane protein; Provisional
Probab=36.92  E-value=1.3e+02  Score=22.03  Aligned_cols=33  Identities=18%  Similarity=0.342  Sum_probs=27.4

Q ss_pred             HHHHHHHhccccHHHHHHHHHHHHHHHHHhhHH
Q psy7099          13 FRAMCRLLGYQGIAVVMEELLKIVTSLIQGSLL   45 (75)
Q Consensus        13 ~~aivRLLGy~gia~vi~elLk~v~~~i~~~l~   45 (75)
                      ++.|+..+|+++.+.+++++++-+-++|.+.=+
T Consensus       101 lSiIiek~~~kn~~~v~s~lid~I~~kiSe~~~  133 (180)
T PHA02855        101 ISMIAEKKGYKNNNIVMSDLINEIANKISENSK  133 (180)
T ss_pred             HHHHHHHhccchHHHHHHHHHHHHHHHhhhhhH
Confidence            466889999999999999999988877766443


No 26 
>PF12668 DUF3791:  Protein of unknown function (DUF3791);  InterPro: IPR024269 This entry represents proteins of unknown function.
Probab=35.47  E-value=33  Score=20.01  Aligned_cols=34  Identities=9%  Similarity=-0.008  Sum_probs=23.4

Q ss_pred             cCCcchhHHHHHHHHhccccHHHHHHHHHHHHHH
Q psy7099           5 LGFVGAYHFRAMCRLLGYQGIAVVMEELLKIVTS   38 (75)
Q Consensus         5 ~gFvG~pH~~aivRLLGy~gia~vi~elLk~v~~   38 (75)
                      +.+-+...+..---.|..+|...+++++.+.++.
T Consensus        27 ~~~~~~~~i~~~Yd~lHt~s~~yivedi~~~l~~   60 (62)
T PF12668_consen   27 KRSGVIDYIIDCYDVLHTQSDEYIVEDIIEYLKN   60 (62)
T ss_pred             HHcCcHHHHHHcchHHHHCcHHHHHHHHHHHHHh
Confidence            3344444555555667888999999998887764


No 27 
>cd08787 CARD_NOD2_1_CARD15 Caspase activation and recruitment domain of NOD2, repeat 1. Caspase activation and recruitment domain (CARD) similar to that found in human NOD2 (CARD15), repeat 1. NOD2 is a member of the Nod-like receptor (NLR) family, which plays a central role in the innate immune response. NLRs typically contain an N-terminal effector domain, a central nucleotide-binding domain and a C-terminal ligand-binding region of several leucine-rich repeats (LRRs). In NOD2, as well as NOD1, the N-terminal effector domain is a CARD. NOD2 contains two N-terminal CARD repeats. Mutations in NOD2 have been associated with Crohn's disease and Blau syndrome. Nod2-CARDs have been shown to interact with the CARD domain of the downstream effector RICK (RIP2, CARDIAK), a serine/threonine kinase. In general, CARDs are death domains (DDs) found associated with caspases. They are known to be important in the signaling pathways for apoptosis, inflammation, and host-defense mechanisms. DDs are pr
Probab=35.03  E-value=78  Score=20.77  Aligned_cols=33  Identities=15%  Similarity=0.111  Sum_probs=27.2

Q ss_pred             HHHHHHHHHHHHHHhhHHHHHHHHHHhCCccCC
Q psy7099          28 VMEELLKIVTSLIQGSLLQFTKTLMDAMPKQCK   60 (75)
Q Consensus        28 vi~elLk~v~~~i~~~l~~yv~~l~~~mPk~~k   60 (75)
                      -.+.||+.|.+|=......+++++.++.|.+.+
T Consensus        52 ~aR~LLD~V~~KGe~~C~~fl~a~~ea~~esq~   84 (87)
T cd08787          52 NARQLLDTVYNKGEWACQKFLAAAQQALAEEQS   84 (87)
T ss_pred             HHHHHHHHHHhcChhHHHHHHHHHHHhcccccC
Confidence            467788888888888899999999999997653


No 28 
>COG1383 RPS17A Ribosomal protein S17E [Translation, ribosomal structure and biogenesis]
Probab=34.99  E-value=55  Score=20.88  Aligned_cols=30  Identities=17%  Similarity=0.382  Sum_probs=24.8

Q ss_pred             HHHHHHHHHHHHHHhhHHHHHHHHHHhCCc
Q psy7099          28 VMEELLKIVTSLIQGSLLQFTKTLMDAMPK   57 (75)
Q Consensus        28 vi~elLk~v~~~i~~~l~~yv~~l~~~mPk   57 (75)
                      +++|+..+.+..+.|.|.-|+..++..+-+
T Consensus        34 ~V~e~~~i~SK~lRN~IAGYiT~~~~~~~~   63 (74)
T COG1383          34 LVEELANIQSKKLRNRIAGYITRLVKRIKE   63 (74)
T ss_pred             HHHHHhcchhHHHHHHHHHHHHHHHHHHHH
Confidence            468999888888999999999999876543


No 29 
>PRK05254 uracil-DNA glycosylase; Provisional
Probab=32.23  E-value=23  Score=26.19  Aligned_cols=24  Identities=33%  Similarity=0.520  Sum_probs=20.1

Q ss_pred             cccCCcchhHHHHHHHHhccccHH
Q psy7099           3 KFLGFVGAYHFRAMCRLLGYQGIA   26 (75)
Q Consensus         3 ly~gFvG~pH~~aivRLLGy~gia   26 (75)
                      -++||+|..||..+=..|-.+|..
T Consensus       195 a~~gF~gs~~F~~~N~~L~~~~~~  218 (224)
T PRK05254        195 AHRGFFGSKHFSKANALLKQHGKT  218 (224)
T ss_pred             ccCCCCCCCHHHHHHHHHHHcCCC
Confidence            467999999999998888877653


No 30 
>PHA03204 uracil DNA glycosylase; Provisional
Probab=31.55  E-value=30  Score=27.29  Aligned_cols=22  Identities=23%  Similarity=0.246  Sum_probs=19.0

Q ss_pred             ccCCcchhHHHHHHHHhccccH
Q psy7099           4 FLGFVGAYHFRAMCRLLGYQGI   25 (75)
Q Consensus         4 y~gFvG~pH~~aivRLLGy~gi   25 (75)
                      ++||+|..||..+=..|-.+|.
T Consensus       292 ~rgFfGs~hFskaN~~L~~~g~  313 (322)
T PHA03204        292 RKPFAHCTHFKDANEFLCKMGK  313 (322)
T ss_pred             cCCCCCCChHHHHHHHHHHcCC
Confidence            7899999999998888877664


No 31 
>PF10444 Nbl1_Borealin_N:  Nbl1 / Borealin N terminal;  InterPro: IPR018851 This entry represents the N-terminal domain of borealin, and is also found in the N-terminal-Borealin-like (NBL; YHR199C-A) protein from Saccharomyces cerevisiae (Baker's yeast). NBL is a subunit of the conserved chromosomal passenger complex (CPC; Ipl1p-Sli15p-Bir1p-Nbl1p), which regulates mitotic chromosome segregation. It is not required for the kinase activity of the complex and it mediates the interaction of Sli15p and Bir1p [].; PDB: 2RAW_B 2RAX_Y 2QFA_B.
Probab=30.78  E-value=69  Score=18.61  Aligned_cols=30  Identities=20%  Similarity=0.218  Sum_probs=19.1

Q ss_pred             HHHHHHHHHHHHHhhHHHHHHHHHHhCCccC
Q psy7099          29 MEELLKIVTSLIQGSLLQFTKTLMDAMPKQC   59 (75)
Q Consensus        29 i~elLk~v~~~i~~~l~~yv~~l~~~mPk~~   59 (75)
                      ++++-...+.++++ ++.-.+--...||+++
T Consensus        18 ~~~lr~~~~~~~~~-~~~~~~~~l~riP~~v   47 (59)
T PF10444_consen   18 IRRLRAQYENLLQS-LRNRLEMELLRIPKAV   47 (59)
T ss_dssp             HHHHHHHHHHHHHH-HHHHHHHHHHHS-HHH
T ss_pred             HHHHHHHHHHHHHH-HHHHHHHHHHHcCHHH
Confidence            45666677777755 5556666667888765


No 32 
>COG0444 DppD ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component [Amino acid transport and metabolism / Inorganic ion transport and metabolism]
Probab=30.38  E-value=43  Score=26.23  Aligned_cols=24  Identities=17%  Similarity=0.468  Sum_probs=21.2

Q ss_pred             HHHHHHHhhHHHHHHHHHHhCCcc
Q psy7099          35 IVTSLIQGSLLQFTKTLMDAMPKQ   58 (75)
Q Consensus        35 ~v~~~i~~~l~~yv~~l~~~mPk~   58 (75)
                      .++..+.+...||++.|++++|+.
T Consensus       237 ~~~~i~~~P~HPYT~~Ll~s~P~~  260 (316)
T COG0444         237 PVEEIFKNPKHPYTRGLLNSLPRL  260 (316)
T ss_pred             CHHHHhcCCCChHHHHHHHhCccc
Confidence            356788999999999999999976


No 33 
>PF01399 PCI:  PCI domain;  InterPro: IPR000717 A homology domain of unclear function that occurs in the C-terminal region of several regulatory components of the 26S proteasome as well as in other proteins. This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15) []. Apparently, all of the characterised proteins containing PCI domains are parts of larger multi-protein complexes. Proteins with PCI domains include budding yeast proteasome regulatory components Rpn3 (Sun2), Rpn5, Rpn6, Rpn7 and Rpn9 []; mammalian proteasome regulatory components p55, p58 and p44.5, and translation initiation factor 3 complex subunits p110 and INT6 [, ]; Arabidopsis COP9 and FUS6/COP11 []; mammalian G-protein pathway suppressor GPS1, and several uncharacterised ORFs from plants, nematodes and mammals. The complete homology domain comprises approx. 200 residues; the highest conservation is found in the C-terminal half. Several of the proteins mentioned above have no detectable homology to the N-terminal half of the domain.; GO: 0005515 protein binding; PDB: 3TXM_A 3TXN_A 1UFM_A 3CHM_A 3T5X_A 3T5V_B.
Probab=29.58  E-value=1.2e+02  Score=17.68  Aligned_cols=25  Identities=16%  Similarity=0.061  Sum_probs=15.2

Q ss_pred             hHHHHHHHHhccccHHHHHHHHHHH
Q psy7099          11 YHFRAMCRLLGYQGIAVVMEELLKI   35 (75)
Q Consensus        11 pH~~aivRLLGy~gia~vi~elLk~   35 (75)
                      ||+..+++.+-...+.-.-+.+-++
T Consensus         1 ~~~~~l~~~~~~~~~~~~~~~l~~~   25 (105)
T PF01399_consen    1 PPYSELLRAFRSGDLQEFEEFLEKH   25 (105)
T ss_dssp             HHHHHHHHHHHCT-HHHHHHHHHHT
T ss_pred             CHHHHHHHHHHhCCHHHHHHHHHHH
Confidence            6777777777777766664444444


No 34 
>KOG1917|consensus
Probab=26.94  E-value=1.9e+02  Score=26.53  Aligned_cols=43  Identities=28%  Similarity=0.322  Sum_probs=33.2

Q ss_pred             chhHHHHHHHHhccccHHHHHHHHHHHHHHHHHhhHHHHHHHHH
Q psy7099           9 GAYHFRAMCRLLGYQGIAVVMEELLKIVTSLIQGSLLQFTKTLM   52 (75)
Q Consensus         9 G~pH~~aivRLLGy~gia~vi~elLk~v~~~i~~~l~~yv~~l~   52 (75)
                      -..-++|+++|+|-=|+-.+-|.|.=|+.++++. |+.-|.+=+
T Consensus       829 Dl~ElrAlvel~GpYGvk~Lse~LmwHvasqV~e-lkklv~tn~  871 (1125)
T KOG1917|consen  829 DLSELRALVELLGPYGVKFLSEMLMWHVASQVNE-LKKLVVTNK  871 (1125)
T ss_pred             CHHHHHHHHHHhCchhhHHHHHHHHHHHHHHHHH-HHHHHHhhH
Confidence            4556899999999999999966666699999966 665554433


No 35 
>PF05531 NPV_P10:  Nucleopolyhedrovirus P10 protein;  InterPro: IPR008702 This family consists of several nucleopolyhedrovirus P10 proteins which are thought to be involved in the morphogenesis of the polyhedra [].; GO: 0019028 viral capsid
Probab=26.00  E-value=1.8e+02  Score=18.39  Aligned_cols=34  Identities=24%  Similarity=0.358  Sum_probs=29.8

Q ss_pred             ccHHHHHHHHHHHHHHHHHhhHHHHHHHHHHhCCc
Q psy7099          23 QGIAVVMEELLKIVTSLIQGSLLQFTKTLMDAMPK   57 (75)
Q Consensus        23 ~gia~vi~elLk~v~~~i~~~l~~yv~~l~~~mPk   57 (75)
                      ++|-.+|++=++.+..+. ..+..-|+.+...+|.
T Consensus         3 ~NILl~Ir~dIk~vd~KV-daLq~~V~~l~~~~~~   36 (75)
T PF05531_consen    3 QNILLVIRQDIKAVDDKV-DALQTQVDDLESNLPD   36 (75)
T ss_pred             HhHHHHHHHHHHHHHHHH-HHHHHHHHHHHhcCCc
Confidence            688899999999999999 5588899999988886


No 36 
>PF15209 IL31:  Interleukin 31
Probab=25.48  E-value=94  Score=21.92  Aligned_cols=25  Identities=28%  Similarity=0.248  Sum_probs=18.5

Q ss_pred             hhHHHHHHHHhccccHHHHHHHHHH
Q psy7099          10 AYHFRAMCRLLGYQGIAVVMEELLK   34 (75)
Q Consensus        10 ~pH~~aivRLLGy~gia~vi~elLk   34 (75)
                      .|||++|-+|--.+.+-+|++.|=|
T Consensus        67 l~ylk~Ik~l~~n~~i~~Ii~~L~k   91 (137)
T PF15209_consen   67 LPYLKAIKRLSNNTVIDEIIEQLDK   91 (137)
T ss_pred             HHHHHHHHHhcccchHHHHHHHHHh
Confidence            4899999998777666666666554


No 37 
>PF06585 JHBP:  Haemolymph juvenile hormone binding protein (JHBP);  InterPro: IPR010562 This family consists of several insect specific haemolymph juvenile hormone binding proteins (JHBP). Juvenile hormone (JH) has a profound effect on insects. It regulates embryogenesis, maintains the status quo of larva development and stimulates reproductive maturation in the adult forms. JH is transported from the sites of its synthesis to target tissues by a haemolymph carrier called juvenile hormone-binding protein (JHBP). JHBP protects the JH molecules from hydrolysis by non-specific esterases present in the insect haemolymph []. The crystal structure of the JHBP from Galleria mellonella (Wax moth) shows an unusual fold consisting of a long alpha-helix wrapped in a much curved antiparallel beta-sheet. The folding pattern for this structure closely resembles that found in some tandem-repeat mammalian lipid-binding and bactericidal permeability-increasing proteins, with a similar organisation of the major cavity and a disulphide bond linking the long helix and the beta-sheet. It would appear that JHBP forms two cavities, only one of which, the one near the N- and C-termini, binds the hormone; binding induces a conformational change, of unknown significance [, ].; PDB: 3A1Z_D 3AOS_B 3AOT_A 2RQF_A 2RCK_A 3E8W_A 3E8T_A.
Probab=25.20  E-value=2.4e+02  Score=19.62  Aligned_cols=48  Identities=4%  Similarity=0.141  Sum_probs=37.4

Q ss_pred             hhHHHHHHHHhccccHHHHHHHHHHHHHHHHHhhHHHHHHHHHHhCCc
Q psy7099          10 AYHFRAMCRLLGYQGIAVVMEELLKIVTSLIQGSLLQFTKTLMDAMPK   57 (75)
Q Consensus        10 ~pH~~aivRLLGy~gia~vi~elLk~v~~~i~~~l~~yv~~l~~~mPk   57 (75)
                      -+.+...++-+=.+..+.+++|+-..++..+...+.+.+..+++..|-
T Consensus       197 ~~~l~~~~n~~in~~~~~~~~~~~p~i~~~~~~~i~~~~N~~l~~~p~  244 (248)
T PF06585_consen  197 NKELSDFINKFINENWPELLNEVKPDIEEILSKIITDIINKILSKVPY  244 (248)
T ss_dssp             HHHHHHHHHHHHHHCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHS-C
T ss_pred             CHHHHHHHHHHHHHhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCCH
Confidence            556666665555668888889999999999989899999988888873


No 38 
>PF12261 T_hemolysin:  Thermostable hemolysin;  InterPro: IPR022050  This family of proteins is found in bacteria. Proteins in this family are typically between 200 and 228 amino acids in length. T_hemolysin is a pore-forming toxin of bacteria, able to lyse erythrocytes from a number of mammalian species. 
Probab=25.12  E-value=4.8  Score=28.83  Aligned_cols=62  Identities=26%  Similarity=0.251  Sum_probs=44.2

Q ss_pred             CCcchhHHHHHHHHhccccHHHHHHHHHHHHHHHHHhhHHHHHHHHHHhCCccCCCCCcccCC
Q psy7099           6 GFVGAYHFRAMCRLLGYQGIAVVMEELLKIVTSLIQGSLLQFTKTLMDAMPKQCKLPRYDYGS   68 (75)
Q Consensus         6 gFvG~pH~~aivRLLGy~gia~vi~elLk~v~~~i~~~l~~yv~~l~~~mPk~~kLP~~dyGs   68 (75)
                      ...+..||.+++.+|-.+|.-|++-...+-+++.++. +---...|-++-|....-.+.+|||
T Consensus        98 ~g~~~~l~~~l~~~L~~~g~~w~vfTaT~~lr~~~~r-lgl~~~~La~Ad~~rl~~~~~~WGs  159 (179)
T PF12261_consen   98 PGAARLLFAALAQLLAQQGFEWVVFTATRQLRNLFRR-LGLPPTVLADADPSRLGDDRASWGS  159 (179)
T ss_pred             cccHHHHHHHHHHHHHHCCCCEEEEeCCHHHHHHHHH-cCCCceeccccCHhHcCcChhhhhh
Confidence            3456789999999999999999998888777777643 3322345556666666555666665


No 39 
>KOG2994|consensus
Probab=24.88  E-value=40  Score=26.41  Aligned_cols=24  Identities=33%  Similarity=0.557  Sum_probs=19.6

Q ss_pred             ccccCCcchhHHHHHHHHhccccH
Q psy7099           2 VKFLGFVGAYHFRAMCRLLGYQGI   25 (75)
Q Consensus         2 ~ly~gFvG~pH~~aivRLLGy~gi   25 (75)
                      +.|+||||--||+-.=.+|-.+|.
T Consensus       266 Sa~rgFFgC~HFsk~N~~Le~~g~  289 (297)
T KOG2994|consen  266 SAYRGFFGCRHFSKTNELLEESGK  289 (297)
T ss_pred             cccccccccchhHHHHHHHHHhCC
Confidence            468999999999988887766554


No 40 
>PHA03202 uracil DNA glycosylase; Provisional
Probab=24.76  E-value=47  Score=26.08  Aligned_cols=22  Identities=23%  Similarity=0.217  Sum_probs=18.3

Q ss_pred             ccCCcchhHHHHHHHHhccccH
Q psy7099           4 FLGFVGAYHFRAMCRLLGYQGI   25 (75)
Q Consensus         4 y~gFvG~pH~~aivRLLGy~gi   25 (75)
                      +++|+|..||..+=..|-.+|.
T Consensus       284 ~~gFfg~~hFsk~N~~L~~~g~  305 (313)
T PHA03202        284 RVNFRDCPHFLEANAYLTKTGR  305 (313)
T ss_pred             ccCCCCCCHHHHHHHHHHHcCC
Confidence            4689999999998888877664


No 41 
>PF06887 DUF1265:  Protein of unknown function (DUF1265);  InterPro: IPR009676 This family represents a conserved region approximately 50 residues long within a number of proteins of unknown function that seem to be restricted to Caenorhabditis elegans.
Probab=24.12  E-value=57  Score=19.25  Aligned_cols=17  Identities=29%  Similarity=0.524  Sum_probs=11.2

Q ss_pred             HHHHHHHHHHHHhhHHHHHHHHH
Q psy7099          30 EELLKIVTSLIQGSLLQFTKTLM   52 (75)
Q Consensus        30 ~elLk~v~~~i~~~l~~yv~~l~   52 (75)
                      |||.|+-+++.      ||++++
T Consensus         1 EEL~kN~EDl~------YV~nmL   17 (48)
T PF06887_consen    1 EELVKNHEDLM------YVCNML   17 (48)
T ss_pred             ChHHHhhhhHH------HHHhHh
Confidence            56777777665      666654


No 42 
>PHA02774 E1; Provisional
Probab=24.10  E-value=68  Score=27.41  Aligned_cols=31  Identities=19%  Similarity=0.471  Sum_probs=22.7

Q ss_pred             cCCcchhHHHHHHHHhccccHHHHHHHHHHHHH
Q psy7099           5 LGFVGAYHFRAMCRLLGYQGIAVVMEELLKIVT   37 (75)
Q Consensus         5 ~gFvG~pH~~aivRLLGy~gia~vi~elLk~v~   37 (75)
                      +...|.-+..+|+++|-||++.++  ..++.++
T Consensus       395 ~~~~~~g~w~~iv~fL~~q~v~~~--~fl~~lk  425 (613)
T PHA02774        395 DKVEGEGDWKPIVKFLRYQGVEFI--SFLTALK  425 (613)
T ss_pred             hhcCCCCCHHHHHHHHHccCccHH--HHHHHHH
Confidence            345567779999999999999886  3444433


No 43 
>PF06757 Ins_allergen_rp:  Insect allergen related repeat, nitrile-specifier detoxification;  InterPro: IPR010629 This entry represents several insect specific allergen repeats. These repeats are commonly found in various proteins from cockroaches, fruit flies and mosquitoes. It has been suggested that the repeat sequences have evolved by duplication of an ancestral amino acid domain, which may have arisen from the mitochondrial energy transfer proteins [].  This family exemplifies a case of novel gene evolution. The case in point is the arms-race between plants and their insect herbivores in the area of the glucosinolate-myrosinase system. Brassicas have developed the glucosinolate-myrosinase system as a chemical defence mechanism against the insects, and consequently the insects have adapted to produce a detoxifying molecule, nitrile-specifier protein (NSP). NSP is present in the Pieris rapae (Cabbage white butterfly). NSP is structurally different from and has no amino acid homology to any known detoxifying enzymes, and it appears to have arisen by a process of domain and gene duplication of a sequence of unknown function that is widespread in insect species and referred to as insect-allergen-repeat protein. Thus this family is found either as a single domain or as a multiple repeat-domain []. 
Probab=23.92  E-value=86  Score=21.62  Aligned_cols=52  Identities=17%  Similarity=0.367  Sum_probs=33.1

Q ss_pred             ccCCcchhHHHHHHHHhccccHHHHHHHHHHHHHHHH---------------HhhHHHHHHHHHHhCCc
Q psy7099           4 FLGFVGAYHFRAMCRLLGYQGIAVVMEELLKIVTSLI---------------QGSLLQFTKTLMDAMPK   57 (75)
Q Consensus         4 y~gFvG~pH~~aivRLLGy~gia~vi~elLk~v~~~i---------------~~~l~~yv~~l~~~mPk   57 (75)
                      +..+.-.|.+++++.-|..+||.+.  ..++.+...+               .+.++.+++.+...+|+
T Consensus        50 ~~~l~~~pE~~~l~~yL~~~gldv~--~~i~~i~~~l~~~~~~p~~~~~~~~~~g~~g~~~di~~~lP~  116 (179)
T PF06757_consen   50 WQQLEALPEVKALLDYLESAGLDVY--YYINQINDLLGLPPLNPTPSLSCSRGGGLNGFVDDILALLPR  116 (179)
T ss_pred             HHHHHcCHHHHHHHHHHHHCCCCHH--HHHHHHHHHHcCCcCCCCcccccccCCCHHHHHHHHHHHCCH
Confidence            4445567888888888888888764  3344333333               33456666777777764


No 44 
>KOG4130|consensus
Probab=23.39  E-value=2.4e+02  Score=22.11  Aligned_cols=39  Identities=21%  Similarity=0.249  Sum_probs=28.9

Q ss_pred             CcchhHHHHHHHHhccccHHHHHHHHHHHHHHHHHhhHHH
Q psy7099           7 FVGAYHFRAMCRLLGYQGIAVVMEELLKIVTSLIQGSLLQ   46 (75)
Q Consensus         7 FvG~pH~~aivRLLGy~gia~vi~elLk~v~~~i~~~l~~   46 (75)
                      |||+.|++-+..-|--.+.+.+ ..+|..+=+.+..++=-
T Consensus       171 fFGvAH~HHiyEqL~~g~~~~~-~ilL~t~fQfsYTtlFG  209 (291)
T KOG4130|consen  171 FFGVAHAHHIYEQLQEGSMTTV-SILLTTCFQFSYTTLFG  209 (291)
T ss_pred             HHhHHHHHHHHHHHHhcchHHH-HHHHHHHHHHHHHHHHH
Confidence            8999999999888877777776 66666666666555543


No 45 
>PF04036 DUF372:  Domain of unknown function (DUF372);  InterPro: IPR007179 This is a group of proteins of unknown function. It is found N-terminal to another domain of unknown function (IPR007181 from INTERPRO).; PDB: 2I52_B 2IEC_D 2OGF_C.
Probab=22.48  E-value=12  Score=21.22  Aligned_cols=9  Identities=33%  Similarity=0.541  Sum_probs=6.1

Q ss_pred             cccCCcchh
Q psy7099           3 KFLGFVGAY   11 (75)
Q Consensus         3 ly~gFvG~p   11 (75)
                      ||..|+|.|
T Consensus        14 LyHQF~GtP   22 (38)
T PF04036_consen   14 LYHQFVGTP   22 (38)
T ss_dssp             HHHHHTT-E
T ss_pred             HHHHhcCCc
Confidence            677788877


No 46 
>PRK01151 rps17E 30S ribosomal protein S17e; Validated
Probab=22.37  E-value=49  Score=19.99  Aligned_cols=25  Identities=16%  Similarity=0.335  Sum_probs=19.8

Q ss_pred             HHHHHHHHHHHHHHhhHHHHHHHHH
Q psy7099          28 VMEELLKIVTSLIQGSLLQFTKTLM   52 (75)
Q Consensus        28 vi~elLk~v~~~i~~~l~~yv~~l~   52 (75)
                      +++|+..+-+..+.|.|.-||..++
T Consensus        33 ~v~e~a~i~SK~lRNrIAGYiT~~~   57 (58)
T PRK01151         33 LVEELTNIESKKVRNRIAGYITRKV   57 (58)
T ss_pred             HHHHHhcCccHhHHHHHhhhhhhcc
Confidence            5688888777888888888887765


No 47 
>COG4608 AppF ABC-type oligopeptide transport system, ATPase component [Amino acid transport and metabolism]
Probab=21.95  E-value=57  Score=25.00  Aligned_cols=22  Identities=27%  Similarity=0.522  Sum_probs=19.4

Q ss_pred             HHHHHHhhHHHHHHHHHHhCCc
Q psy7099          36 VTSLIQGSLLQFTKTLMDAMPK   57 (75)
Q Consensus        36 v~~~i~~~l~~yv~~l~~~mPk   57 (75)
                      .+.++.+...||+++|.++.|.
T Consensus       194 ~~~~~~~p~HpYTk~Ll~a~p~  215 (268)
T COG4608         194 TEEVFSNPLHPYTKALLSAVPV  215 (268)
T ss_pred             HHHHhhCCCCHHHHHHHHhCCc
Confidence            4677889999999999999994


No 48 
>COG2981 CysZ Uncharacterized protein involved in cysteine biosynthesis [Amino acid transport and metabolism]
Probab=21.91  E-value=1.9e+02  Score=22.27  Aligned_cols=34  Identities=24%  Similarity=0.311  Sum_probs=24.5

Q ss_pred             cHHHHHHHHHH-HHHHHHHhhHHHHHHHHHHhCCc
Q psy7099          24 GIAVVMEELLK-IVTSLIQGSLLQFTKTLMDAMPK   57 (75)
Q Consensus        24 gia~vi~elLk-~v~~~i~~~l~~yv~~l~~~mPk   57 (75)
                      ++|+++.-+|= -+....-+...++++.++..+|.
T Consensus        29 ilpLl~ni~L~~gl~~~~~~~~~~wid~Lm~~iPd   63 (250)
T COG2981          29 ILPLLLNILLWGGLFWLLFSQALPWIDTLMPGIPD   63 (250)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcCcc
Confidence            45666665555 44555557799999999999995


No 49 
>PF05120 GvpG:  Gas vesicle protein G ;  InterPro: IPR007804 Gas vesicles are intracellular, protein-coated, and hollow organelles found in cyanobacteria and halophilic archaea. They are permeable to ambient gases by diffusion and provide buoyancy, enabling cells to move upwards in water to access oxygen and/or light. Proteins containing this family are involved in the formation of gas vesicles []. 
Probab=21.48  E-value=93  Score=19.67  Aligned_cols=20  Identities=15%  Similarity=0.258  Sum_probs=17.6

Q ss_pred             ccHHHHHHHHHHHHHHHHHh
Q psy7099          23 QGIAVVMEELLKIVTSLIQG   42 (75)
Q Consensus        23 ~gia~vi~elLk~v~~~i~~   42 (75)
                      +|+.||.+.+.+.++.-+.+
T Consensus        10 rgv~wv~e~I~~~Ae~E~~D   29 (79)
T PF05120_consen   10 RGVVWVAEQIQEQAERELYD   29 (79)
T ss_pred             HHHHHHHHHHHHHHHHHHcC
Confidence            69999999999999888765


No 50 
>PHA03395 p10 fibrous body protein; Provisional
Probab=21.17  E-value=2.5e+02  Score=18.32  Aligned_cols=35  Identities=17%  Similarity=0.304  Sum_probs=28.8

Q ss_pred             cccHHHHHHHHHHHHHHHHHhhHHHHHHHHHHhCCc
Q psy7099          22 YQGIAVVMEELLKIVTSLIQGSLLQFTKTLMDAMPK   57 (75)
Q Consensus        22 y~gia~vi~elLk~v~~~i~~~l~~yv~~l~~~mPk   57 (75)
                      +++|-.+|++=++.+..+. ..+..-|+.+.+-+|.
T Consensus         2 sqNILl~Ir~dIkavd~KV-dalQ~~V~~l~~nlpd   36 (87)
T PHA03395          2 SQNILLLIRQDIKAVSDKV-DALQAAVDDVRANLPD   36 (87)
T ss_pred             CchHHHHHHHHHHHHhhHH-HHHHHHHHHHHhcCCc
Confidence            4688888999999999998 4588888888888774


No 51 
>PF05878 Phyto_Pns9_10:  Phytoreovirus nonstructural protein Pns9/Pns10;  InterPro: IPR008776 This family consists of the Phytoreovirus nonstructural proteins Pns9 and Pns10. The function of this family is unknown.
Probab=21.01  E-value=1.3e+02  Score=23.77  Aligned_cols=27  Identities=15%  Similarity=0.161  Sum_probs=21.7

Q ss_pred             cHHHHHHHHHHHHHHHHHhhHHHHHHH
Q psy7099          24 GIAVVMEELLKIVTSLIQGSLLQFTKT   50 (75)
Q Consensus        24 gia~vi~elLk~v~~~i~~~l~~yv~~   50 (75)
                      .+---|+.+++.+++.|.+.+++||..
T Consensus       194 Ksk~qM~~~i~~~Rn~I~n~I~~fVn~  220 (312)
T PF05878_consen  194 KSKAQMRPEIQRIRNEILNKIQQFVNL  220 (312)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHh
Confidence            344567888899999999999999864


No 52 
>PLN02936 epsilon-ring hydroxylase
Probab=20.65  E-value=1.2e+02  Score=23.47  Aligned_cols=42  Identities=29%  Similarity=0.373  Sum_probs=33.1

Q ss_pred             HHHHHHHHh-ccccHHHHHHHHHHHHHHHHHhhHHHHHHHHHHh
Q psy7099          12 HFRAMCRLL-GYQGIAVVMEELLKIVTSLIQGSLLQFTKTLMDA   54 (75)
Q Consensus        12 H~~aivRLL-Gy~gia~vi~elLk~v~~~i~~~l~~yv~~l~~~   54 (75)
                      -+-++-||- |..|+||+ .|-|+.|....++....++.++.+.
T Consensus         6 ~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~   48 (489)
T PLN02936          6 WLTSLNRLWGDDSGIPVA-DAKLEDVTDLLGGALFLPLFKWMNE   48 (489)
T ss_pred             HHHhhhccCCCCCCCccH-HhHHhhHHHHhccHHHHHHHHHHHH
Confidence            345566665 46799998 9999999999888888888888764


No 53 
>PF14271 DUF4359:  Domain of unknown function (DUF4359)
Probab=20.40  E-value=2e+02  Score=18.63  Aligned_cols=68  Identities=15%  Similarity=0.263  Sum_probs=41.2

Q ss_pred             cCCcchhHHHHHHHHhcc-ccHHHHHHHHHHHHHHHHHhhHHHHHHHHHHhC----------------CccCCCCCcccC
Q psy7099           5 LGFVGAYHFRAMCRLLGY-QGIAVVMEELLKIVTSLIQGSLLQFTKTLMDAM----------------PKQCKLPRYDYG   67 (75)
Q Consensus         5 ~gFvG~pH~~aivRLLGy-~gia~vi~elLk~v~~~i~~~l~~yv~~l~~~m----------------Pk~~kLP~~dyG   67 (75)
                      +++.+..=...+-+-+-. .+.|-.++.+++++..++.+. +|-++++++.+                +-+=.+|+|.+-
T Consensus        15 ~e~a~~~l~~~l~~~~c~~~~~p~~l~~~~~~c~~lv~~~-~~~i~~~i~~~t~r~Ny~lfSiy~t~~~~~~~~p~y~~~   93 (107)
T PF14271_consen   15 EEYASEQLTTYLKKEVCDEKQLPGFLRSLIKNCKRLVDSQ-RPQIEALIDQSTTRQNYGLFSIYTTELGGKSPLPSYKFV   93 (107)
T ss_pred             HHHHHHHHHHHHHHHHhccccCchHHHHHHHHHHHHHHhh-hHHHHHHHHhhhhhcceEEEEEEEEeccCCCCCccceEE
Confidence            333333333344444443 577777777888888888654 66666665432                223367888888


Q ss_pred             Cccccc
Q psy7099          68 SPGVLG   73 (75)
Q Consensus        68 s~G~~~   73 (75)
                      |-|+++
T Consensus        94 tlGi~g   99 (107)
T PF14271_consen   94 TLGIFG   99 (107)
T ss_pred             EEeecc
Confidence            777765


No 54 
>cd00170 SEC14 Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits.
Probab=20.17  E-value=1.6e+02  Score=17.84  Aligned_cols=41  Identities=22%  Similarity=0.260  Sum_probs=25.5

Q ss_pred             HHHHHHHHHHHHHHHHhhHHHH-------HHHHHHhCCccCCCCCcccC
Q psy7099          26 AVVMEELLKIVTSLIQGSLLQF-------TKTLMDAMPKQCKLPRYDYG   67 (75)
Q Consensus        26 a~vi~elLk~v~~~i~~~l~~y-------v~~l~~~mPk~~kLP~~dyG   67 (75)
                      +++++.+++.++..+....+.-       .+.|.+-+|+++ ||....|
T Consensus       109 p~~~~~~~~~~~~~l~~~~~~ki~~~~~~~~~L~~~i~~~~-Lp~~~GG  156 (157)
T cd00170         109 PWFFKVLWKIVKPFLSEKTRKKIVFLGSDKEELLKYIDKEQ-LPEEYGG  156 (157)
T ss_pred             CHhHHHHHHHHHHhcCHhhhhhEEEecCCHHHHHhhCChhh-CcHhhCC
Confidence            4566667777666554433322       567888888655 7766554


Done!