Query         004949
Match_columns 722
No_of_seqs    120 out of 141
Neff          3.3 
Searched_HMMs 46136
Date          Thu Mar 28 15:34:10 2013
Command       hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/004949.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/004949hhsearch_cdd -cpu 12 -v 0 

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF04765 DUF616:  Protein of un 100.0  3E-111  6E-116  869.2  28.3  304  388-708     1-305 (305)
  2 cd04194 GT8_A4GalT_like A4GalT  90.9    0.28 6.1E-06   49.0   4.3   96  486-586    29-131 (248)
  3 cd00505 Glyco_transf_8 Members  87.5     1.9 4.1E-05   43.5   7.4  159  486-679    29-199 (246)
  4 PF01501 Glyco_transf_8:  Glyco  76.6     2.1 4.5E-05   41.3   2.8   48  536-585    86-133 (250)
  5 COG1681 FlaB Archaeal flagelli  70.1     4.7  0.0001   41.9   3.6   28   59-86      4-31  (209)
  6 PRK08541 flagellin; Validated   64.4       7 0.00015   40.6   3.6   28   59-86      4-31  (211)
  7 PF01917 Arch_flagellin:  Archa  53.2      13 0.00029   36.5   3.3   26   61-86      3-28  (190)
  8 PRK08455 fliL flagellar basal   51.3      35 0.00076   34.5   5.9   22   67-88     25-46  (182)
  9 PF15243 ANAPC15:  Anaphase-pro  48.9      24 0.00053   32.6   4.0   23  211-233    28-50  (92)
 10 PF04415 DUF515:  Protein of un  45.8      20 0.00042   40.8   3.5   25  592-616   342-366 (416)
 11 cd06429 GT8_like_1 GT8_like_1   45.2      31 0.00066   36.3   4.6   47  537-585   102-148 (257)
 12 PF10446 DUF2457:  Protein of u  44.1      71  0.0015   36.9   7.4   20  275-294    98-117 (458)
 13 PF09451 ATG27:  Autophagy-rela  39.1      27 0.00058   36.8   3.1   40   99-138   226-268 (268)
 14 PF09307 MHC2-interact:  CLIP,   36.6      12 0.00025   35.9   0.0   39   49-88     21-59  (114)
 15 cd02515 Glyco_transf_6 Glycosy  35.8      30 0.00064   37.5   2.8  115  444-566    20-141 (271)
 16 cd06431 GT8_LARGE_C LARGE cata  33.1      44 0.00094   35.5   3.5   97  486-585    29-135 (280)
 17 PF01034 Syndecan:  Syndecan do  32.5      15 0.00033   32.1   0.1   27   63-89     16-42  (64)
 18 cd06432 GT8_HUGT1_C_like The C  32.5      36 0.00078   35.3   2.8   45  539-585    85-130 (248)
 19 PF08391 Ly49:  Ly49-like prote  29.7      18 0.00038   34.8   0.0   18   63-80      7-24  (119)
 20 PF03407 Nucleotid_trans:  Nucl  29.3      54  0.0012   32.2   3.2   87  552-681    69-156 (212)
 21 PRK05529 cell division protein  29.2      38 0.00083   35.5   2.3   21   46-67     22-42  (255)
 22 PF05637 Glyco_transf_34:  gala  28.9      39 0.00085   35.0   2.3   44  636-680   144-193 (239)
 23 PF07423 DUF1510:  Protein of u  27.0      39 0.00084   35.4   1.9   16  203-218   143-158 (217)
 24 PF03452 Anp1:  Anp1;  InterPro  26.9      63  0.0014   34.9   3.5   47  539-588   133-179 (269)
 25 PF10731 Anophelin:  Thrombin i  26.2      78  0.0017   27.8   3.2   23   66-88      7-30  (65)
 26 PHA02849 putative transmembran  26.1      72  0.0016   29.2   3.1   24   59-82     10-33  (82)
 27 PF11119 DUF2633:  Protein of u  26.0      33 0.00071   29.7   1.0   43   60-102     6-48  (59)
 28 PLN02718 Probable galacturonos  25.2 1.5E+02  0.0033   35.5   6.4   95  486-585   341-452 (603)
 29 PF12911 OppC_N:  N-terminal TM  25.2      25 0.00054   28.1   0.1   26   60-87     16-41  (56)
 30 PLN02742 Probable galacturonos  24.7      77  0.0017   37.3   3.9   41  542-584   346-386 (534)
 31 PF14991 MLANA:  Protein melan-  24.2      24 0.00052   34.1  -0.2   30   60-89     26-55  (118)
 32 COG4726 PilX Tfp pilus assembl  24.1      55  0.0012   34.1   2.3   25   50-80      7-31  (196)
 33 PF14155 DUF4307:  Domain of un  23.1      87  0.0019   29.4   3.2   28   61-88      6-33  (112)
 34 PF03314 DUF273:  Protein of un  23.0      66  0.0014   34.1   2.7   62  542-609    35-96  (222)
 35 PF02532 PsbI:  Photosystem II   21.6 1.9E+02  0.0041   23.1   4.2   26   67-92      7-32  (36)
 36 PF07790 DUF1628:  Protein of u  21.6 1.1E+02  0.0024   26.4   3.4   29   58-86      2-30  (80)
 37 PF11239 DUF3040:  Protein of u  21.6      92   0.002   27.4   2.9   19   59-77     40-58  (82)
 38 PF06781 UPF0233:  Uncharacteri  20.4 2.3E+02  0.0049   26.2   5.2   31   58-88     27-57  (87)
 39 COG1783 XtmB Phage terminase l  20.3   2E+02  0.0044   33.0   5.9  117  556-675     3-146 (414)

No 1  
>PF04765 DUF616:  Protein of unknown function (DUF616);  InterPro: IPR006852 The entry represents a protein of unknown function. The function is unknown, although a number of the members are thought to be glycosyltransferases.
Probab=100.00  E-value=2.6e-111  Score=869.24  Aligned_cols=304  Identities=56%  Similarity=1.028  Sum_probs=292.5

Q ss_pred             ceeeEeeccCCCCCCCCCCCcCCccchHHHhhcccccC-cccccccccCCCCCCCCCCcCCHhhHhhhccCcEEEEeeee
Q 004949          388 FLQYTEVEEKPDGEAEWEPRFAGHQSLQEREESFLARD-QKINCGFVKAPEGYPSTGFDLAEDDANYNSRCHIAVISCIF  466 (722)
Q Consensus       388 ~L~yv~~e~~~~~~~~~~~~FgG~~sl~eR~~sF~~~~-~~vhCGF~~gp~~~~~~GFdi~e~D~~~M~~CKIVVYTAIF  466 (722)
                      +|+||.+|+++.  ..++|+|||||||+||++||++++ |+|||||++      +|||||+|.|+.||++|+||||||||
T Consensus         1 nl~y~~~~~~~~--~~~~~~f~g~~s~~~R~~sf~~~~~~~v~Cgf~~------~~gf~i~~~d~~~m~~c~vvV~saIF   72 (305)
T PF04765_consen    1 NLTYIEEENKPE--SGRGPSFGGNQSLEERESSFDIQEDMTVHCGFVK------NTGFDISESDRRYMEKCRVVVYSAIF   72 (305)
T ss_pred             CCcccccccccc--cCCCCCcCCcCCHHHHHHhcCCCCCceecccccc------CCCCCCCHHHHHHHhcCCEEEEEEec
Confidence            699999998777  778999999999999999999976 699999999      59999999999999999999999999


Q ss_pred             CCCccccCCCCCcccccCCCCceEEEEecccchhhhcccCCCCCCCCcccceEEEEccCCCCccccccccccccccccCC
Q 004949          467 GNSDRLRIPVGKTVTRLSRKNVCFVMFTDELTLQTLSSEGQIPDRTGFIGLWKMVVVKNLPYDDMRRVGKIPKLLPHRLF  546 (722)
Q Consensus       467 GnYD~L~qP~~~~Is~~s~knVcFicFTDe~tL~sl~~~g~vpd~~~~vG~WKIV~VknlPy~D~RRngRipKiLpHRLF  546 (722)
                      |+||.|+||.+  |++.+.++|||+||||+.|+++|+++|.++++.+++|+||||+|+++||.|+||+||+||||||+||
T Consensus        73 G~yD~l~qP~~--i~~~s~~~vcf~mF~D~~t~~~l~~~~~~~~~~~~ig~WrIv~v~~lp~~d~rr~~r~~K~lpHrlf  150 (305)
T PF04765_consen   73 GNYDKLRQPKN--ISEYSKKNVCFFMFVDEETLKSLESEGHIPDENKKIGIWRIVVVKNLPYDDPRRNGRIPKLLPHRLF  150 (305)
T ss_pred             CCCccccCchh--hCHHHhcCccEEEEEehhhHHHHHhcCCccccccccCceEEEEecCCCCcchhhcCcccceeccccC
Confidence            99999999987  7778889999999999999999999999999999999999999999999999999999999999999


Q ss_pred             CCCCEEEEEeCceeEecCHHHHHHHHHhcCCCeEEEecCCCCCcHHHHHHHHHHhccCChHHHHHHHHHHHHcCCCCCCC
Q 004949          547 PSARYSIWLDSKLRLQRDPLLILEYFLWRKGYEYAISNHYDRHCVWEEVAQNKKLNKYNHTVIDQQFAFYQADGLKRFDP  626 (722)
Q Consensus       547 PnyrYSIWIDgKIqL~~DPllLLE~fLwr~n~~fAIskHp~R~CVYEEAeackrl~K~~~~~Id~Qme~Yk~eGLp~~~~  626 (722)
                      |+|+|||||||||+|++||++||++|||+++++|||++||.|+||||||+||++++||+++.|++||++|+++|||++. 
T Consensus       151 p~y~ySIWID~ki~L~~Dp~~lie~~l~~~~~~~Ai~~H~~R~cvyeEa~a~~~~~k~~~~~I~~Qm~~Y~~eGlp~~s-  229 (305)
T PF04765_consen  151 PNYDYSIWIDGKIQLIVDPLLLIERFLWRKNADIAISKHPERNCVYEEAEACKRLGKYDPERIDEQMEFYKQEGLPPWS-  229 (305)
T ss_pred             CCCceEEEEeeeEEEecCHHHHHHHHHhcCCCcEEEeCCCCcccHHHHHHHHHHhcCCChHHHHHHHHHHHHcCCCccc-
Confidence            9999999999999999999999999999999999999999999999999999999999999999999999999999983 


Q ss_pred             CCCCCCCCCCCCCceEEEccCCcchhhhHHHHHHHHhcCCCCCcchHHHHHHHhhhcCCCCceeeccchhHHHHHHHHhh
Q 004949          627 SDPDRLLPSNVPEGSFIVRAHTPMSNLFSCLWFNEVDRFTSRDQLSFAYTYQKLRRMNPSKMFYLNMFKDCERRSMAKLF  706 (722)
Q Consensus       627 sdp~kl~pSdLpEgnVIVReHtp~sNlFmCLWFNEV~rFS~RDQLSFaYVlwKlr~mnp~~~f~lnMF~~cerr~l~~~f  706 (722)
                       .++.+++|+||||+||||+|+|++|+|||+|||||++||+||||||+||+||++     .+|+||||+||+||+||++|
T Consensus       230 -~~k~~l~s~v~E~~iIiR~H~~~~nlf~clWfnEv~rfs~RDQLSF~Yv~wk~~-----~~~~~~mf~~~~~~~~~~~~  303 (305)
T PF04765_consen  230 -PAKLPLPSDVPEGNIIIRKHNPMSNLFMCLWFNEVERFSPRDQLSFPYVLWKLG-----PKFKLNMFKDCERRQLVVLY  303 (305)
T ss_pred             -ccccccccCCccceEEEecCCchhHHHHHHHHHHHhcCCCcccchHHHHHHHhC-----CcccchhhhHHHHHHHHHhc
Confidence             456668999999999999999999999999999999999999999999999994     46999999999999999999


Q ss_pred             cc
Q 004949          707 RH  708 (722)
Q Consensus       707 ~H  708 (722)
                      +|
T Consensus       304 ~h  305 (305)
T PF04765_consen  304 RH  305 (305)
T ss_pred             CC
Confidence            99


No 2  
>cd04194 GT8_A4GalT_like A4GalT_like proteins catalyze the addition of galactose or glucose residues to the lipooligosaccharide (LOS) or lipopolysaccharide (LPS) of the bacterial cell surface. The members of this family of glycosyltransferases catalyze the addition of galactose or glucose residues to the lipooligosaccharide (LOS) or lipopolysaccharide (LPS) of the bacterial cell surface. The enzymes exhibit broad substrate specificities. The known functions found in this family include: Alpha-1,4-galactosyltransferase, LOS-alpha-1,3-D-galactosyltransferase, UDP-glucose:(galactosyl) LPS alpha1,2-glucosyltransferase, UDP-galactose: (glucosyl) LPS alpha1,2-galactosyltransferase, and UDP-glucose:(glucosyl) LPS alpha1,2-glucosyltransferase. Alpha-1,4-galactosyltransferase from N. meningitidis  adds an alpha-galactose from UDP-Gal (the donor) to a terminal lactose (the acceptor) of the LOS structure of outer membrane. LOSs are virulence factors that enable the organism to evade the immune sys
Probab=90.91  E-value=0.28  Score=48.96  Aligned_cols=96  Identities=13%  Similarity=0.170  Sum_probs=59.1

Q ss_pred             CCceEEEEecccchhhhcc-cCCCCCCCCcccceEEEEccCCCCcccc-cccc-----ccccccccCCCCCCEEEEEeCc
Q 004949          486 KNVCFVMFTDELTLQTLSS-EGQIPDRTGFIGLWKMVVVKNLPYDDMR-RVGK-----IPKLLPHRLFPSARYSIWLDSK  558 (722)
Q Consensus       486 knVcFicFTDe~tL~sl~~-~g~vpd~~~~vG~WKIV~VknlPy~D~R-RngR-----ipKiLpHRLFPnyrYSIWIDgK  558 (722)
                      ..++|++++++.+...++. ....+..   -...+++.+....+.... ...+     +.|++...+||+|+.-||||+-
T Consensus        29 ~~~~~~il~~~is~~~~~~L~~~~~~~---~~~i~~~~i~~~~~~~~~~~~~~~~~~~y~rl~l~~ll~~~~rvlylD~D  105 (248)
T cd04194          29 RDYDFYILNDDISEENKKKLKELLKKY---NSSIEFIKIDNDDFKFFPATTDHISYATYYRLLIPDLLPDYDKVLYLDAD  105 (248)
T ss_pred             CceEEEEEeCCCCHHHHHHHHHHHHhc---CCeEEEEEcCHHHHhcCCcccccccHHHHHHHHHHHHhcccCEEEEEeCC
Confidence            4689999998755433221 0001110   012344555331111111 1222     3488999999999999999999


Q ss_pred             eeEecCHHHHHHHHHhcCCCeEEEecCC
Q 004949          559 LRLQRDPLLILEYFLWRKGYEYAISNHY  586 (722)
Q Consensus       559 IqL~~DPllLLE~fLwr~n~~fAIskHp  586 (722)
                      +.+..|+..|.+.-+  ++..+|+..|.
T Consensus       106 ~lv~~di~~L~~~~~--~~~~~aa~~d~  131 (248)
T cd04194         106 IIVLGDLSELFDIDL--GDNLLAAVRDP  131 (248)
T ss_pred             EEecCCHHHHhcCCc--CCCEEEEEecc
Confidence            999999998887533  56678888764


No 3  
>cd00505 Glyco_transf_8 Members of glycosyltransferase family 8 (GT-8) are involved in lipopolysaccharide biosynthesis and glycogen synthesis. Members of this family are involved in lipopolysaccharide biosynthesis and glycogen synthesis. GT-8 comprises enzymes with a number of known activities: lipopolysaccharide galactosyltransferase, lipopolysaccharide glucosyltransferase 1, glycogenin glucosyltransferase, and N-acetylglucosaminyltransferase. GT-8 enzymes contain a conserved DXD motif which is essential in the coordination of a catalytic divalent cation, most commonly Mn2+.
Probab=87.48  E-value=1.9  Score=43.45  Aligned_cols=159  Identities=14%  Similarity=0.132  Sum_probs=92.1

Q ss_pred             CCceEEEEecccchhhhcccCCCCCCCCcccceEEEEccCCCCc------cccccccccccccccCCCCCCEEEEEeCce
Q 004949          486 KNVCFVMFTDELTLQTLSSEGQIPDRTGFIGLWKMVVVKNLPYD------DMRRVGKIPKLLPHRLFPSARYSIWLDSKL  559 (722)
Q Consensus       486 knVcFicFTDe~tL~sl~~~g~vpd~~~~vG~WKIV~VknlPy~------D~RRngRipKiLpHRLFPnyrYSIWIDgKI  559 (722)
                      ..+.|++++|..+....+.-..+....+  ...+++.++...+.      ......-+.|++...|||+++--||+|+-+
T Consensus        29 ~~~~~~il~~~is~~~~~~L~~~~~~~~--~~i~~~~~~~~~~~~~~~~~~~~~~~~y~RL~i~~llp~~~kvlYLD~D~  106 (246)
T cd00505          29 KPLRFHVLTNPLSDTFKAALDNLRKLYN--FNYELIPVDILDSVDSEHLKRPIKIVTLTKLHLPNLVPDYDKILYVDADI  106 (246)
T ss_pred             CCeEEEEEEccccHHHHHHHHHHHhccC--ceEEEEeccccCcchhhhhcCccccceeHHHHHHHHhhccCeEEEEcCCe
Confidence            3688999998755433221000000000  12445555322111      223344578999999999999999999999


Q ss_pred             eEecCHHHHHHHHHhcCCCeEEEecCCCCCcHHHHHHHHHHhccCChHHHHHHHHHH-HHcCCCCCCCCCCCCCCCCCCC
Q 004949          560 RLQRDPLLILEYFLWRKGYEYAISNHYDRHCVWEEVAQNKKLNKYNHTVIDQQFAFY-QADGLKRFDPSDPDRLLPSNVP  638 (722)
Q Consensus       560 qL~~DPllLLE~fLwr~n~~fAIskHp~R~CVYEEAeackrl~K~~~~~Id~Qme~Y-k~eGLp~~~~sdp~kl~pSdLp  638 (722)
                      .+..|+..|.+..+  ++..+|+..-..    .      ....           +.| +..+++..          ...+
T Consensus       107 iv~~di~~L~~~~l--~~~~~aav~d~~----~------~~~~-----------~~~~~~~~~~~~----------~~yf  153 (246)
T cd00505         107 LVLTDIDELWDTPL--GGQELAAAPDPG----D------RREG-----------KYYRQKRSHLAG----------PDYF  153 (246)
T ss_pred             eeccCHHHHhhccC--CCCeEEEccCch----h------hhcc-----------chhhcccCCCCC----------CCce
Confidence            99999999888654  666788875421    0      0000           011 12233211          1246


Q ss_pred             CceEEEccCCcc-----hhhhHHHHHHHHhcCCCCCcchHHHHHHH
Q 004949          639 EGSFIVRAHTPM-----SNLFSCLWFNEVDRFTSRDQLSFAYTYQK  679 (722)
Q Consensus       639 EgnVIVReHtp~-----sNlFmCLWFNEV~rFS~RDQLSFaYVlwK  679 (722)
                      -++|++=+-...     .......|.+...+..--||=.++.++..
T Consensus       154 NsGVmlinl~~~r~~~~~~~~~~~~~~~~~~~~~~DQd~LN~~~~~  199 (246)
T cd00505         154 NSGVFVVNLSKERRNQLLKVALEKWLQSLSSLSGGDQDLLNTFFKQ  199 (246)
T ss_pred             eeeeEEEechHHHHHHHHHHHHHHHHhhcccCccCCcHHHHHHHhc
Confidence            677877544322     12223445555567888999999999876


No 4  
>PF01501 Glyco_transf_8:  Glycosyl transferase family 8;  InterPro: IPR002495 The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (2.4.1.- from EC) and related proteins into distinct sequence based families has been described []. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Glycosyltransferase family 8 GT8 from CAZY comprises enzymes with a number of known activities; lipopolysaccharide galactosyltransferase (2.4.1.44 from EC), lipopolysaccharide glucosyltransferase 1 (2.4.1.58 from EC), glycogenin glucosyltransferase (2.4.1.186 from EC), inositol 1-alpha-galactosyltransferase (2.4.1.123 from EC). These enzymes have a distant similarity to family GT_24. ; GO: 0016757 transferase activity, transferring glycosyl groups; PDB: 1LL0_D 1ZCV_A 3USR_A 3V90_A 1ZCU_A 1ZCT_A 3V91_A 1ZCY_A 1ZDG_A 1ZDF_A ....
Probab=76.61  E-value=2.1  Score=41.31  Aligned_cols=48  Identities=19%  Similarity=0.243  Sum_probs=36.3

Q ss_pred             cccccccccCCCCCCEEEEEeCceeEecCHHHHHHHHHhcCCCeEEEecC
Q 004949          536 KIPKLLPHRLFPSARYSIWLDSKLRLQRDPLLILEYFLWRKGYEYAISNH  585 (722)
Q Consensus       536 RipKiLpHRLFPnyrYSIWIDgKIqL~~DPllLLE~fLwr~n~~fAIskH  585 (722)
                      -+.|++.+.+||+|+--||||+-+.+..|+..|.+..+  ++..+|+...
T Consensus        86 ~~~rl~i~~ll~~~drilyLD~D~lv~~dl~~lf~~~~--~~~~~~a~~~  133 (250)
T PF01501_consen   86 TFARLFIPDLLPDYDRILYLDADTLVLGDLDELFDLDL--QGKYLAAVED  133 (250)
T ss_dssp             GGGGGGHHHHSTTSSEEEEE-TTEEESS-SHHHHC-----TTSSEEEEE-
T ss_pred             HHHHhhhHHHHhhcCeEEEEcCCeeeecChhhhhcccc--hhhhcccccc
Confidence            35789999999999999999999999999998887554  4666777766


No 5  
>COG1681 FlaB Archaeal flagellins [Cell motility and secretion]
Probab=70.13  E-value=4.7  Score=41.90  Aligned_cols=28  Identities=36%  Similarity=0.598  Sum_probs=24.9

Q ss_pred             CccchhHHHHHHHHHHHHHHHHHHhhcC
Q 004949           59 RRLSIGSVIFVLLLVLLATVLAYLYISG   86 (722)
Q Consensus        59 ~~~~~~~~~~~~~~~l~~~v~~~~~~s~   86 (722)
                      .-.+||++|++.++||+|.|+||-.|..
T Consensus         4 G~~GIgtlIVfIAmVlVAAVaA~VlInt   31 (209)
T COG1681           4 GATGIGTLIVFIAMVLVAAVAAYVLINT   31 (209)
T ss_pred             cccchhHHHHHHHHHHHHHHHHHHHhcc
Confidence            4678999999999999999999988754


No 6  
>PRK08541 flagellin; Validated
Probab=64.36  E-value=7  Score=40.64  Aligned_cols=28  Identities=32%  Similarity=0.516  Sum_probs=24.6

Q ss_pred             CccchhHHHHHHHHHHHHHHHHHHhhcC
Q 004949           59 RRLSIGSVIFVLLLVLLATVLAYLYISG   86 (722)
Q Consensus        59 ~~~~~~~~~~~~~~~l~~~v~~~~~~s~   86 (722)
                      .-.+||++|+|.++|||+.|+||-.|..
T Consensus         4 G~~GIGTLIVFIAmVLVAAVAA~VLInT   31 (211)
T PRK08541          4 GAVGIGTLIVFIAMVLVAAVAAAVLINT   31 (211)
T ss_pred             chhhhHHHHHHHHHHHHHHHHHHHhhcc
Confidence            3568999999999999999999888754


No 7  
>PF01917 Arch_flagellin:  Archaebacterial flagellin;  InterPro: IPR002774 Archaeal motility occurs by the rotation of flagella that are different to bacterial flagella, but show similarity to bacterial type IV pili. These similarities include the multiflagellin nature of the flagellar filament, N-terminal sequence similarities, as well as the presence of homologous proteins in the two systems [, ]. Also unlike bacterial flagellins but similar to type IV pilins, archaeal flagellins are initially synthesised with a short leader peptide that is cleaved by a membrane-located peptidase [, ]. The enzyme responsible for the removal of this leader peptide is FlaK [].; GO: 0005198 structural molecule activity, 0006928 cellular component movement
Probab=53.17  E-value=13  Score=36.52  Aligned_cols=26  Identities=35%  Similarity=0.521  Sum_probs=22.2

Q ss_pred             cchhHHHHHHHHHHHHHHHHHHhhcC
Q 004949           61 LSIGSVIFVLLLVLLATVLAYLYISG   86 (722)
Q Consensus        61 ~~~~~~~~~~~~~l~~~v~~~~~~s~   86 (722)
                      .+++++|||.++||+|.|||+-.|..
T Consensus         3 ~gi~t~IvfIA~VlVAAv~a~vli~t   28 (190)
T PF01917_consen    3 VGIGTAIVFIAFVLVAAVAAGVLINT   28 (190)
T ss_pred             chhhhHHHHHHHHHHHHHHHHHHHHH
Confidence            36899999999999999999776653


No 8  
>PRK08455 fliL flagellar basal body-associated protein FliL; Reviewed
Probab=51.29  E-value=35  Score=34.48  Aligned_cols=22  Identities=32%  Similarity=0.458  Sum_probs=14.4

Q ss_pred             HHHHHHHHHHHHHHHHhhcCCC
Q 004949           67 IFVLLLVLLATVLAYLYISGYS   88 (722)
Q Consensus        67 ~~~~~~~l~~~v~~~~~~s~~~   88 (722)
                      +++|+|++++++++|.++++..
T Consensus        25 ~~~llll~~~G~~~~~~~~~~~   46 (182)
T PRK08455         25 GVVVLLLLIVGVIAMLLMGSKE   46 (182)
T ss_pred             HHHHHHHHHHHHHHHHHhcCCC
Confidence            3455555666788888777655


No 9  
>PF15243 ANAPC15:  Anaphase-promoting complex subunit 15
Probab=48.94  E-value=24  Score=32.61  Aligned_cols=23  Identities=26%  Similarity=0.218  Sum_probs=17.8

Q ss_pred             hhHHHHHHHHHHHhhhccccCCC
Q 004949          211 NELKMYEAEYEASLKNAGLSGNL  233 (722)
Q Consensus       211 ~elk~yeaeyeasl~~~g~~~~~  233 (722)
                      -||.++|++|+|-|.-+.+-.++
T Consensus        28 ~EL~~~Eq~~q~Wl~sI~ekd~n   50 (92)
T PF15243_consen   28 TELQQQEQQHQAWLQSIAEKDNN   50 (92)
T ss_pred             HHHHHHHHHHHHHHHHHHHhccC
Confidence            58999999999988766655333


No 10 
>PF04415 DUF515:  Protein of unknown function (DUF515)    ;  InterPro: IPR007509 This is a family of hypothetical archaeal proteins.
Probab=45.79  E-value=20  Score=40.83  Aligned_cols=25  Identities=16%  Similarity=0.276  Sum_probs=21.1

Q ss_pred             HHHHHHHHHhccCChHHHHHHHHHH
Q 004949          592 WEEVAQNKKLNKYNHTVIDQQFAFY  616 (722)
Q Consensus       592 YEEAeackrl~K~~~~~Id~Qme~Y  616 (722)
                      -+|+......+|++.+.|.+|+..|
T Consensus       342 l~~iL~A~a~gkl~~~~v~~~l~~Y  366 (416)
T PF04415_consen  342 LPEILKAIAAGKLDYSQVKEQLGNY  366 (416)
T ss_pred             HHHHHHHHHhCCcCHHHHHHHHHHh
Confidence            4677777888889889999999998


No 11 
>cd06429 GT8_like_1 GT8_like_1 represents a subfamily of GT8 with unknown function. A subfamily of glycosyltransferase family 8 with unknown function: Glycosyltransferase family 8 comprises enzymes with a number of known activities; lipopolysaccharide galactosyltransferase  lipopolysaccharide glucosyltransferase 1, glycogenin glucosyltransferase and inositol 1-alpha-galactosyltransferase. It is classified as a retaining glycosyltransferase, based on the relative anomeric stereochemistry of the substrate and product in the reaction catalyzed.
Probab=45.18  E-value=31  Score=36.34  Aligned_cols=47  Identities=21%  Similarity=0.273  Sum_probs=38.2

Q ss_pred             ccccccccCCCCCCEEEEEeCceeEecCHHHHHHHHHhcCCCeEEEecC
Q 004949          537 IPKLLPHRLFPSARYSIWLDSKLRLQRDPLLILEYFLWRKGYEYAISNH  585 (722)
Q Consensus       537 ipKiLpHRLFPnyrYSIWIDgKIqL~~DPllLLE~fLwr~n~~fAIskH  585 (722)
                      +.+++.-.+||+++--||+|+-+.+.+|...|.+.=+  ++..+|+...
T Consensus       102 y~Rl~ip~llp~~~kvlYLD~Dviv~~dl~eL~~~dl--~~~~~aav~d  148 (257)
T cd06429         102 FARFYLPELFPKLEKVIYLDDDVVVQKDLTELWNTDL--GGGVAGAVET  148 (257)
T ss_pred             HHHHHHHHHhhhhCeEEEEeCCEEEeCCHHHHhhCCC--CCCEEEEEhh
Confidence            4566777889999999999999999999998887633  5667777665


No 12 
>PF10446 DUF2457:  Protein of unknown function (DUF2457);  InterPro: IPR018853  This entry represents a family of uncharacterised proteins. 
Probab=44.06  E-value=71  Score=36.92  Aligned_cols=20  Identities=15%  Similarity=0.370  Sum_probs=14.2

Q ss_pred             CCCCCCCCCCcchhhhhccc
Q 004949          275 DSGHDKGDHSDVAKIQSQYQ  294 (722)
Q Consensus       275 d~g~~~~~~~~~~~~~~~~~  294 (722)
                      |.|.+-++..--|.|.++++
T Consensus        98 ddG~~TDnE~GFAdSDDEdD  117 (458)
T PF10446_consen   98 DDGNETDNEAGFADSDDEDD  117 (458)
T ss_pred             ccCccCcccccccccccccc
Confidence            56777777777777766665


No 13 
>PF09451 ATG27:  Autophagy-related protein 27;  InterPro: IPR018939 Autophagy is a degradative transport pathway that delivers cytosolic proteins to the lysosome (vacuole) [] and is induced by starvation []. Cytosolic proteins appear inside the vacuole enclosed in autophagic vesicles. Autophagy significantly differs from other transport pathways by using double membrane layered transport intermediates, called autophagosomes [, ]. The breakdown of vesicular transport intermediates is a unique feature of autophagy []. Autophagy can also function in the elimination of invading bacteria and antigens []. There are more than 25 AuTophaGy-related (ATG) genes that are essential for autophagy, although it is still not known how the autophagosome is made. Atg9 is a potential membrane carrier to deliver lipids that are used to form the vesicle. Atg27 is another transmembrane protein, and is a cycling protein []. It acts as an effector of VPS34 phosphatidylinositol 3-phosphate kinase signalling and regulates the cytoplasm to vacuole transport (Cvt) vesicle formation. It is also required for autophagy-dependent cycling of ATG9. 
Probab=39.09  E-value=27  Score=36.76  Aligned_cols=40  Identities=30%  Similarity=0.169  Sum_probs=27.3

Q ss_pred             hhcccCccccccccccccccccccccc---eEeecCCCCccCC
Q 004949           99 IISHSAVDDELKNDIDFLTNVTRTNTL---KVVGFGKGSISHG  138 (722)
Q Consensus        99 ~~~~~~~~~~~~~~~~~~~~~~~~~~~---k~~~fg~g~~~~g  138 (722)
                      .+.+...+.|+-...||++.+|..=+-   +|+.=-+|++.+|
T Consensus       226 ~~~~g~~g~e~iP~~dfw~~lP~l~kd~~~~v~~~~~g~~sRG  268 (268)
T PF09451_consen  226 YNRYGARGFELIPHFDFWRSLPYLIKDGVRFVVGTVQGSGSRG  268 (268)
T ss_pred             eccCCCCCceecccHhHHHhchHHHHHHHHHhhccccCCCCCC
Confidence            455555556666666999999876543   6666678887766


No 14 
>PF09307 MHC2-interact:  CLIP, MHC2 interacting;  InterPro: IPR015386 This domain is found in MHC class II-associated invariant chain (Ii), and in class II invariant chain-associated peptide (CLIP), and is required for association with class II major histocompatibility complex (MHC II) in the MHC II processing pathway []. Ii plays a critical role in the assembly of the MHC, as well as in MHC II antigen processing by stabilising peptide-free class II alpha/beta heterodimers in a complex soon after their synthesis and directing transport of the complex from the endoplasmic reticulum to compartments where peptide loading of class II takes place []. In antigen-presenting cells (APCs), loading of MHC II molecules with peptides is regulated by Ii, which blocks MHC II antigen-binding sites in pre-endosomal compartments []. Several factors modulate the surface expression of MHC II molecules via post-Golgi mechanisms, including CLIP. The Invariant chain contains a single transmembrane domain. Ii first assembles into a trimer and then associates with three class II alpha/beta MHC heterodimers. Although the membrane-proximal region of the Ii luminal domain is structurally disordered, the C-terminal segment of the luminal domain is largely alpha-helical and contains a major interaction site for the Ii trimer []. More information about these proteins can be found at Protein of the Month: MHC [].; GO: 0042289 MHC class II protein binding, 0006886 intracellular protein transport, 0006955 immune response, 0019882 antigen processing and presentation, 0016020 membrane; PDB: 1A6A_C 3QXD_F 3QXA_F 3PDO_C 1MUJ_C 3PGD_F 3PGC_F.
Probab=36.61  E-value=12  Score=35.90  Aligned_cols=39  Identities=10%  Similarity=0.220  Sum_probs=0.0

Q ss_pred             ccCCCCCCCCCccchhHHHHHHHHHHHHHHHHHHhhcCCC
Q 004949           49 RRSARSDKNGRRLSIGSVIFVLLLVLLATVLAYLYISGYS   88 (722)
Q Consensus        49 rr~~~~~~~~~~~~~~~~~~~~~~~l~~~v~~~~~~s~~~   88 (722)
                      +++.+.+++ |.+.|+++-+++||+|.-=++.-||+-.+.
T Consensus        21 ~~~~~~s~s-ra~~vagltvLa~LLiAGQa~TaYfv~~Qk   59 (114)
T PF09307_consen   21 GGPQRGSCS-RALKVAGLTVLACLLIAGQAVTAYFVFQQK   59 (114)
T ss_dssp             ----------------------------------------
T ss_pred             CCCCCCCcc-chhHHHHHHHHHHHHHHhHHHHHHHHHHhH
Confidence            455555555 778888887666665554444445554444


No 15 
>cd02515 Glyco_transf_6 Glycosyltransferase family 6 comprises enzymes responsible for the production of the human ABO blood group antigens. Glycosyltransferase family 6, GT_6, comprises enzymes with three known activities: alpha-1,3-galactosyltransferase, alpha-1,3 N-acetylgalactosaminyltransferase, and alpha-galactosyltransferase. UDP-galactose:beta-galactosyl alpha-1,3-galactosyltransferase (alpha3GT) catalyzes the transfer of galactose from UDP-alpha-d-galactose into an alpha-1,3 linkage with beta-galactosyl groups in glycoconjugates. The enzyme exists in most mammalian species but is absent from humans, apes, and old world monkeys as a result of the mutational inactivation of the gene. The alpha-1,3 N-acetylgalactosaminyltransferase and alpha-galactosyltransferase are responsible for the production of the human ABO blood group antigens. A N-acetylgalactosaminyltransferases use a UDP-GalNAc donor to convert the H-antigen acceptor to the A antigen, whereas a galactosyltransferase use
Probab=35.83  E-value=30  Score=37.47  Aligned_cols=115  Identities=16%  Similarity=0.195  Sum_probs=63.9

Q ss_pred             CcCCHhhHhhhccC-cEEEEeeeeCCCccccCCCCCcc--cccCCCCceEEEEecccchhhhcccCCCCC-CCCcccceE
Q 004949          444 FDLAEDDANYNSRC-HIAVISCIFGNSDRLRIPVGKTV--TRLSRKNVCFVMFTDELTLQTLSSEGQIPD-RTGFIGLWK  519 (722)
Q Consensus       444 Fdi~e~D~~~M~~C-KIVVYTAIFGnYD~L~qP~~~~I--s~~s~knVcFicFTDe~tL~sl~~~g~vpd-~~~~vG~WK  519 (722)
                      |+..=-+..|-++. +|.+.---+|.|-...+.--...  .=+-.-.|-|+.|||...+        +|. .-+---.=+
T Consensus        20 f~~~~l~~~y~~~n~tIgl~vfatGkY~~f~~~F~~SAEk~Fm~g~~v~YyVFTD~~~~--------~p~v~lg~~r~~~   91 (271)
T cd02515          20 FNPDVLDEYYRKQNITIGLTVFAVGKYTEFLERFLESAEKHFMVGYRVIYYIFTDKPAA--------VPEVELGPGRRLT   91 (271)
T ss_pred             CCHHHHHHHHHhcCCEEEEEEEEeccHHHHHHHHHHHHHHhccCCCeeEEEEEeCCccc--------CcccccCCCceeE
Confidence            44444445555433 35555555677754333211000  0022356899999997653        221 000001234


Q ss_pred             EEEc-cCC--CCccccccccccccccccCCCCCCEEEEEeCceeEecCHH
Q 004949          520 MVVV-KNL--PYDDMRRVGKIPKLLPHRLFPSARYSIWLDSKLRLQRDPL  566 (722)
Q Consensus       520 IV~V-knl--Py~D~RRngRipKiLpHRLFPnyrYSIWIDgKIqL~~DPl  566 (722)
                      |+.| ...  |+..++|.--+-+...-+++-+++|-..+|+++.+....-
T Consensus        92 V~~v~~~~~W~~~sl~Rm~~~~~~~~~~~~~e~DYlF~~dvd~~F~~~ig  141 (271)
T cd02515          92 VLKIAEESRWQDISMRRMKTLADHIADRIGHEVDYLFCMDVDMVFQGPFG  141 (271)
T ss_pred             EEEeccccCCcHHHHHHHHHHHHHHHHhhcccCCEEEEeeCCceEeecCC
Confidence            4445 223  5557776655545445567899999999999999996544


No 16 
>cd06431 GT8_LARGE_C LARGE catalytic domain has closest homology to GT8 glycosyltransferase involved in lipooligosaccharide synthesis. The catalytic domain of LARGE is a putative glycosyltransferase. Mutations of LARGE in mouse and human cause dystroglycanopathies, a disease associated with hypoglycosylation of the membrane protein alpha-dystroglycan (alpha-DG) and consequent loss of extracellular ligand binding. LARGE needs to both physically interact with alpha-dystroglycan and function as a glycosyltransferase in order to stimulate alpha-dystroglycan hyperglycosylation. LARGE localizes to the Golgi apparatus and contains three conserved DxD motifs. While two of the motifs are indispensable for glycosylation function, one is important for localization of the enzyme. LARGE was originally so named because it covers a large stretch of genomic DNA, more than 600bp long. The predicted protein structure contains an N-terminal cytoplasmic domain, a transmembrane region, a coiled-coil
Probab=33.10  E-value=44  Score=35.54  Aligned_cols=97  Identities=15%  Similarity=0.145  Sum_probs=59.1

Q ss_pred             CCceEEEEecccchhhhcccC-CCCCCCCcccceEEEEccCC--CC---cccccccc--ccccccccCCC-CCCEEEEEe
Q 004949          486 KNVCFVMFTDELTLQTLSSEG-QIPDRTGFIGLWKMVVVKNL--PY---DDMRRVGK--IPKLLPHRLFP-SARYSIWLD  556 (722)
Q Consensus       486 knVcFicFTDe~tL~sl~~~g-~vpd~~~~vG~WKIV~Vknl--Py---~D~RRngR--ipKiLpHRLFP-nyrYSIWID  556 (722)
                      ..++|..|+|+.+...++.-. ....   .--.+.++.+..+  ++   ......+.  +.+++.+.+|| +++--||||
T Consensus        29 ~~~~fhii~d~~s~~~~~~l~~~~~~---~~~~i~f~~i~~~~~~~~~~~~~~~s~~y~y~RL~ip~llp~~~dkvLYLD  105 (280)
T cd06431          29 NPLHFHLITDEIARRILATLFQTWMV---PAVEVSFYNAEELKSRVSWIPNKHYSGIYGLMKLVLTEALPSDLEKVIVLD  105 (280)
T ss_pred             CCEEEEEEECCcCHHHHHHHHHhccc---cCcEEEEEEhHHhhhhhccCcccchhhHHHHHHHHHHHhchhhcCEEEEEc
Confidence            569999999977654433210 0100   0123555555321  11   01111221  14888999999 799999999


Q ss_pred             CceeEecCHHHHHHHHH-hcCCCeEEEecC
Q 004949          557 SKLRLQRDPLLILEYFL-WRKGYEYAISNH  585 (722)
Q Consensus       557 gKIqL~~DPllLLE~fL-wr~n~~fAIskH  585 (722)
                      +=+.+.+|+..|.+.|. ..++.-+|+..+
T Consensus       106 ~Diiv~~di~eL~~~~~~~~~~~~~a~v~~  135 (280)
T cd06431         106 TDITFATDIAELWKIFHKFTGQQVLGLVEN  135 (280)
T ss_pred             CCEEEcCCHHHHHHHhhhcCCCcEEEEecc
Confidence            99999999999888741 235556677654


No 17 
>PF01034 Syndecan:  Syndecan domain;  InterPro: IPR001050 The syndecans are transmembrane proteoglycans which are involved in the organisation of cytoskeleton and/or actin microfilaments, and have important roles as cell surface receptors during cell-cell and/or cell-matrix interactions [, ]. Structurally, these proteins consist of four separate domains:   A signal sequence; An extracellular domain (ectodomain) of variable length whose sequence is not evolutionary conserved in the various forms of syndecans. The ectodomain contains the sites of attachment of the heparan sulphate glycosaminoglycan side chains;  A transmembrane region;  A highly conserved cytoplasmic domain of about 30 to 35 residues, which could interact with cytoskeletal proteins.    The proteins known to belong to this family are:    Syndecan 1.  Syndecan 2 or fibroglycan.  Syndecan 3 or neuroglycan or N-syndecan.  Syndecan 4 or amphiglycan or ryudocan.  Drosophila syndecan.   Caenorhabditis elegans probable syndecan (F57C7.3).    Syndecan-4, a transmembrane heparan sulphate proteoglycan, is a coreceptor with integrins in cell adhesion. It has been suggested to form a ternary signalling complex with protein kinase Calpha and phosphatidylinositol 4,5-bisphosphate (PIP2). Structural studies have demonstrated that the cytoplasmic domain undergoes a conformational transition and forms a symmetric dimer in the presence of phospholipid activator PIP2, and whose overall structure in solution exhibits a twisted clamp shape having a cavity in the centre of dimeric interface. In addition, it has been observed that the syndecan-4 variable domain interacts, strongly, not only with fatty acyl groups but also the anionic head group of PIP2. These findings indicate that PIP2 promotes oligomerisation of the syndecan-4 cytoplasmic domain for transmembrane signalling and cell-matrix adhesion [, ].; GO: 0008092 cytoskeletal protein binding, 0016020 membrane; PDB: 1EJQ_B 1EJP_B 1YBO_C 1OBY_Q.
Probab=32.51  E-value=15  Score=32.08  Aligned_cols=27  Identities=26%  Similarity=0.325  Sum_probs=0.8

Q ss_pred             hhHHHHHHHHHHHHHHHHHHhhcCCCC
Q 004949           63 IGSVIFVLLLVLLATVLAYLYISGYSN   89 (722)
Q Consensus        63 ~~~~~~~~~~~l~~~v~~~~~~s~~~~   89 (722)
                      +|+|+.+||+|||+..++|++-..|.-
T Consensus        16 aG~Vvgll~ailLIlf~iyR~rkkdEG   42 (64)
T PF01034_consen   16 AGGVVGLLFAILLILFLIYRMRKKDEG   42 (64)
T ss_dssp             ---------------------S-----
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHhcCCC
Confidence            456666778888888888998877763


No 18 
>cd06432 GT8_HUGT1_C_like The C-terminal domain of HUGT1-like is highly homologous to the GT 8 family. C-terminal domain of glycoprotein glucosyltransferase (UGT).  UGT is a large glycoprotein whose C-terminus contains the catalytic activity. This catalytic C-terminal domain is highly homologous to Glycosyltransferase Family 8 (GT 8) and contains the DXD motif that coordinates donor sugar binding, characteristic for Family 8 glycosyltransferases.  GT 8 proteins are retaining enzymes based on the relative anomeric stereochemistry of the substrate and product in the reaction catalyzed. The non-catalytic N-terminal portion of the human UGT1 (HUGT1) has been shown to monitor the protein folding status and activate its glucosyltransferase activity.
Probab=32.49  E-value=36  Score=35.35  Aligned_cols=45  Identities=24%  Similarity=0.381  Sum_probs=36.6

Q ss_pred             ccccccCCC-CCCEEEEEeCceeEecCHHHHHHHHHhcCCCeEEEecC
Q 004949          539 KLLPHRLFP-SARYSIWLDSKLRLQRDPLLILEYFLWRKGYEYAISNH  585 (722)
Q Consensus       539 KiLpHRLFP-nyrYSIWIDgKIqL~~DPllLLE~fLwr~n~~fAIskH  585 (722)
                      .+++..||| +++--||+|+-+.++.|...|.+.=+  ++.-+|+..|
T Consensus        85 rL~~~~lLP~~vdkvLYLD~Dilv~~dL~eL~~~dl--~~~~~Aav~d  130 (248)
T cd06432          85 ILFLDVLFPLNVDKVIFVDADQIVRTDLKELMDMDL--KGAPYGYTPF  130 (248)
T ss_pred             HHHHHHhhhhccCEEEEEcCCceecccHHHHHhcCc--CCCeEEEeec
Confidence            366777899 69999999999999999888877543  5777888776


No 19 
>PF08391 Ly49:  Ly49-like protein, N-terminal region;  InterPro: IPR013600 The sequences making up this entry are annotated as, or are similar to, Ly49 receptors (e.g. P20937 from SWISSPROT). These are type II transmembrane receptors expressed by mouse natural killer (NK) cells. They are classified as being activating (e.g. Ly49D and H) or inhibitory (e.g. Ly49A and G), depending on their effect on NK cell function. They are members of the C-type lectin receptor superfamily, and in fact in many family members this region is found immediately N-terminal to a lectin C-type domain (IPR001304 from INTERPRO).; PDB: 1QO3_D 3C8J_D 1P4L_D 3C8K_D 3G8K_B 1JA3_B 3CAD_A 3G8L_A.
Probab=29.74  E-value=18  Score=34.82  Aligned_cols=18  Identities=39%  Similarity=0.473  Sum_probs=0.0

Q ss_pred             hhHHHHHHHHHHHHHHHH
Q 004949           63 IGSVIFVLLLVLLATVLA   80 (722)
Q Consensus        63 ~~~~~~~~~~~l~~~v~~   80 (722)
                      |+.++.+|||+|||||.+
T Consensus         7 iav~LGILCllLLvtv~v   24 (119)
T PF08391_consen    7 IAVALGILCLLLLVTVAV   24 (119)
T ss_dssp             ------------------
T ss_pred             HHHHHHHHHHHHHHHHHH
Confidence            455566888888888766


No 20 
>PF03407 Nucleotid_trans:  Nucleotide-diphospho-sugar transferase;  InterPro: IPR005069 Proteins in this family have been predicted to be nucleotide-diphospho-sugar transferases.
Probab=29.27  E-value=54  Score=32.16  Aligned_cols=87  Identities=18%  Similarity=0.194  Sum_probs=61.3

Q ss_pred             EEEEeCceeEecCHHHHHHHHHhcCCCeEEEecCCCCCcHHHHHHHHHHhccCChHHHHHHHHHHHHcCCCCCCCCCCCC
Q 004949          552 SIWLDSKLRLQRDPLLILEYFLWRKGYEYAISNHYDRHCVWEEVAQNKKLNKYNHTVIDQQFAFYQADGLKRFDPSDPDR  631 (722)
Q Consensus       552 SIWIDgKIqL~~DPllLLE~fLwr~n~~fAIskHp~R~CVYEEAeackrl~K~~~~~Id~Qme~Yk~eGLp~~~~sdp~k  631 (722)
                      -+++|+=+.+..||+.+++.    .++++.++.-........                        .             
T Consensus        69 vl~~D~Dvv~~~dp~~~~~~----~~~Di~~~~d~~~~~~~~------------------------~-------------  107 (212)
T PF03407_consen   69 VLFSDADVVWLRDPLPYFEN----PDADILFSSDGWDGTNSD------------------------R-------------  107 (212)
T ss_pred             eEEecCCEEEecCcHHhhcc----CCCceEEecCCCcccchh------------------------h-------------
Confidence            56899999999999988822    556677665211111000                        0             


Q ss_pred             CCCCCCCCceEEEccCCcchhhhHHHHHHHHhcCCC-CCcchHHHHHHHhh
Q 004949          632 LLPSNVPEGSFIVRAHTPMSNLFSCLWFNEVDRFTS-RDQLSFAYTYQKLR  681 (722)
Q Consensus       632 l~pSdLpEgnVIVReHtp~sNlFmCLWFNEV~rFS~-RDQLSFaYVlwKlr  681 (722)
                        ...++-+++++=+.|+.+..|+..|...+...+. .||-.|..++....
T Consensus       108 --~~~~~n~G~~~~r~t~~~~~~~~~w~~~~~~~~~~~DQ~~~n~~l~~~~  156 (212)
T PF03407_consen  108 --NGNLVNTGFYYFRPTPRTIAFLEDWLERMAESPGCWDQQAFNELLREQA  156 (212)
T ss_pred             --cCCccccceEEEecCHHHHHHHHHHHHHHHhCCCcchHHHHHHHHHhcc
Confidence              0012345777777799999999999999999955 79999999999753


No 21 
>PRK05529 cell division protein FtsQ; Provisional
Probab=29.17  E-value=38  Score=35.48  Aligned_cols=21  Identities=19%  Similarity=0.313  Sum_probs=11.7

Q ss_pred             cccccCCCCCCCCCccchhHHH
Q 004949           46 ARARRSARSDKNGRRLSIGSVI   67 (722)
Q Consensus        46 ~~~rr~~~~~~~~~~~~~~~~~   67 (722)
                      .|.||..++.|+ |+..+.+++
T Consensus        22 ~~~~~~~~~~~~-r~~~~~~~~   42 (255)
T PRK05529         22 ERVRRFTTRIRR-RFILLACAV   42 (255)
T ss_pred             hhhhchhhhccc-hhhhHHHHH
Confidence            345666666666 555555443


No 22 
>PF05637 Glyco_transf_34:  galactosyl transferase GMA12/MNN10 family;  InterPro: IPR008630 This family contains a number of glycosyltransferase enzymes that contain a DXD motif. This family includes a number of Caenorhabditis elegans homologues where the DXD is replaced by DXH. Some members of this family are included in glycosyltransferase family 34.; GO: 0016758 transferase activity, transferring hexosyl groups, 0016021 integral to membrane; PDB: 2P72_B 2P73_A 2P6W_A.
Probab=28.86  E-value=39  Score=35.04  Aligned_cols=44  Identities=25%  Similarity=0.333  Sum_probs=0.0

Q ss_pred             CCCCceEEEccCCcchhhhHHHHHHHHhcCC------CCCcchHHHHHHHh
Q 004949          636 NVPEGSFIVRAHTPMSNLFSCLWFNEVDRFT------SRDQLSFAYTYQKL  680 (722)
Q Consensus       636 dLpEgnVIVReHtp~sNlFmCLWFNEV~rFS------~RDQLSFaYVlwKl  680 (722)
                      +|=-|.||||. ++-+..|+..|+...-+..      ..||=+|.|.+...
T Consensus       144 gLNtGsFliRn-s~ws~~fLd~w~~~~~~~~~~~~~~~~EQsAl~~ll~~~  193 (239)
T PF05637_consen  144 GLNTGSFLIRN-SPWSRDFLDAWADPLYRNYDWDQLEFDEQSALEHLLQWH  193 (239)
T ss_dssp             ---------------------------------------------------
T ss_pred             ccccccccccc-ccccccccccccccccccccccccccccccccccccccc
Confidence            45566788888 4445568899997665433      35688888887653


No 23 
>PF07423 DUF1510:  Protein of unknown function (DUF1510);  InterPro: IPR009988 This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown.
Probab=27.03  E-value=39  Score=35.42  Aligned_cols=16  Identities=19%  Similarity=0.175  Sum_probs=9.4

Q ss_pred             ccccccchhhHHHHHH
Q 004949          203 GLYNEAGRNELKMYEA  218 (722)
Q Consensus       203 glyne~gr~elk~yea  218 (722)
                      .-|..+--+=.+|-.|
T Consensus       143 ~~y~~~S~DW~Em~~A  158 (217)
T PF07423_consen  143 MTYDSGSVDWNEMLKA  158 (217)
T ss_pred             ccccCCCcCHHHHHHH
Confidence            3466666665556655


No 24 
>PF03452 Anp1:  Anp1;  InterPro: IPR005109 The members of this family (Anp1, Van1 and Mnn9) are membrane proteins required for proper Golgi function. These proteins colocalize within the cis Golgi, where they are physically associated in two distinct complexes.
Probab=26.92  E-value=63  Score=34.90  Aligned_cols=47  Identities=21%  Similarity=0.462  Sum_probs=36.9

Q ss_pred             ccccccCCCCCCEEEEEeCceeEecCHHHHHHHHHhcCCCeEEEecCCCC
Q 004949          539 KLLPHRLFPSARYSIWLDSKLRLQRDPLLILEYFLWRKGYEYAISNHYDR  588 (722)
Q Consensus       539 KiLpHRLFPnyrYSIWIDgKIqL~~DPllLLE~fLwr~n~~fAIskHp~R  588 (722)
                      -+|.|-|=|...+.+|+|+-|.  .-|-.||+.|+ ..+.+|.+++=+.+
T Consensus       133 ~LL~~aL~p~~swVlWlDaDIv--~~P~~lI~dli-~~~kdIivPn~~~~  179 (269)
T PF03452_consen  133 FLLSSALGPWHSWVLWLDADIV--ETPPTLIQDLI-AHDKDIIVPNCWRR  179 (269)
T ss_pred             HHHHhhcCCcccEEEEEecCcc--cCChHHHHHHH-hCCCCEEccceeec
Confidence            4677778889999999999888  67888999976 66778877654433


No 25 
>PF10731 Anophelin:  Thrombin inhibitor from mosquito;  InterPro: IPR018932  Members of this family are all inhibitors of thrombin, the peptidase that is at the end of the blood coagulation cascade and which creates the clot by cleaving fibrinogen. The interaction between thrombin and fibrinogen involves two different areas of contact - via the thrombin active site and via a second substrate-binding site known as an exosite. The inhibitor acts by blocking the exosite, rather than by interacting with the active site. The inhibitors are from mosquitoes that feed on human blood and which, by inhibiting thrombin, prevent the blood from clotting and keep it flowing. 
Probab=26.25  E-value=78  Score=27.81  Aligned_cols=23  Identities=35%  Similarity=0.416  Sum_probs=16.2

Q ss_pred             HHHHHHHHHHHHHHH-HHhhcCCC
Q 004949           66 VIFVLLLVLLATVLA-YLYISGYS   88 (722)
Q Consensus        66 ~~~~~~~~l~~~v~~-~~~~s~~~   88 (722)
                      ||-+||++|++.|-+ =.|--|+.
T Consensus         7 vialLC~aLva~vQ~APQYa~Gee   30 (65)
T PF10731_consen    7 VIALLCVALVAIVQSAPQYAPGEE   30 (65)
T ss_pred             HHHHHHHHHHHHHhcCcccCCCCC
Confidence            566788888886554 56777766


No 26 
>PHA02849 putative transmembrane protein; Provisional
Probab=26.09  E-value=72  Score=29.21  Aligned_cols=24  Identities=29%  Similarity=0.632  Sum_probs=19.2

Q ss_pred             CccchhHHHHHHHHHHHHHHHHHH
Q 004949           59 RRLSIGSVIFVLLLVLLATVLAYL   82 (722)
Q Consensus        59 ~~~~~~~~~~~~~~~l~~~v~~~~   82 (722)
                      ...++|++.+++.+|++++.++|.
T Consensus        10 ~~f~~g~v~vi~v~v~vI~i~~fl   33 (82)
T PHA02849         10 IEFDAGAVTVILVFVLVISFLAFM   33 (82)
T ss_pred             cccccchHHHHHHHHHHHHHHHHH
Confidence            567899998888888888777754


No 27 
>PF11119 DUF2633:  Protein of unknown function (DUF2633);  InterPro: IPR022576  This family is conserved largely in Proteobacteria. Several members are named as YfgG. The function is not known. 
Probab=26.04  E-value=33  Score=29.72  Aligned_cols=43  Identities=19%  Similarity=0.401  Sum_probs=33.8

Q ss_pred             ccchhHHHHHHHHHHHHHHHHHHhhcCCCCCCCCccchhhhcc
Q 004949           60 RLSIGSVIFVLLLVLLATVLAYLYISGYSNHNDDDQDKEIISH  102 (722)
Q Consensus        60 ~~~~~~~~~~~~~~l~~~v~~~~~~s~~~~~~~~~~~~~~~~~  102 (722)
                      ...+.-+|++++++.+.+.|+|.-|....-|++-+|..++.+.
T Consensus         6 ~~~mtriVLLISfiIlfgRl~Y~~I~a~~hHq~k~~a~~~~~~   48 (59)
T PF11119_consen    6 NSRMTRIVLLISFIILFGRLIYSAIGAWVHHQDKKQAQQIEQS   48 (59)
T ss_pred             cchHHHHHHHHHHHHHHHHHHHHHHhHHHHHHHHHhccccccc
Confidence            3456778888999999999999999888888877666655543


No 28 
>PLN02718 Probable galacturonosyltransferase
Probab=25.16  E-value=1.5e+02  Score=35.46  Aligned_cols=95  Identities=17%  Similarity=0.264  Sum_probs=59.2

Q ss_pred             CCceEEEEecccchhhhcccC-CCCCCCCcccceEEEEccCC---C--Ccc-----cccccc------ccccccccCCCC
Q 004949          486 KNVCFVMFTDELTLQTLSSEG-QIPDRTGFIGLWKMVVVKNL---P--YDD-----MRRVGK------IPKLLPHRLFPS  548 (722)
Q Consensus       486 knVcFicFTDe~tL~sl~~~g-~vpd~~~~vG~WKIV~Vknl---P--y~D-----~RRngR------ipKiLpHRLFPn  548 (722)
                      .++.|..|||..+...++.-- ..+..   -..+.|+.|++.   |  |..     +.++.+      +.+++.-.+||+
T Consensus       341 ~~ivFHVvTD~is~~~mk~wf~l~~~~---~a~I~V~~Iddf~~lp~~~~~~lk~l~s~~~~~~S~~~y~Rl~ipellp~  417 (603)
T PLN02718        341 EKIVFHVVTDSLNYPAISMWFLLNPPG---KATIQILNIDDMNVLPADYNSLLMKQNSHDPRYISALNHARFYLPDIFPG  417 (603)
T ss_pred             CcEEEEEEeCCCCHHHHHHHHHhCCCC---CcEEEEEecchhccccccchhhhhhccccccccccHHHHHHHHHHHHhcc
Confidence            359999999988876655210 01110   125666666532   1  111     011111      335666678999


Q ss_pred             CCEEEEEeCceeEecCHHHHHHHHHhcCCCeEEEecC
Q 004949          549 ARYSIWLDSKLRLQRDPLLILEYFLWRKGYEYAISNH  585 (722)
Q Consensus       549 yrYSIWIDgKIqL~~DPllLLE~fLwr~n~~fAIskH  585 (722)
                      ++--||+|+-+.+..|...|.+.-+  ++.-+|+...
T Consensus       418 l~KvLYLD~DvVV~~DL~eL~~iDl--~~~v~aaVed  452 (603)
T PLN02718        418 LNKIVLFDHDVVVQRDLSRLWSLDM--KGKVVGAVET  452 (603)
T ss_pred             cCEEEEEECCEEecCCHHHHhcCCC--CCcEEEEecc
Confidence            9999999999999999988877533  5666777643


No 29 
>PF12911 OppC_N:  N-terminal TM domain of oligopeptide transport permease C
Probab=25.16  E-value=25  Score=28.10  Aligned_cols=26  Identities=46%  Similarity=0.803  Sum_probs=11.8

Q ss_pred             ccchhHHHHHHHHHHHHHHHHHHhhcCC
Q 004949           60 RLSIGSVIFVLLLVLLATVLAYLYISGY   87 (722)
Q Consensus        60 ~~~~~~~~~~~~~~l~~~v~~~~~~s~~   87 (722)
                      +.++.+++++++++|++ +++ -+++.+
T Consensus        16 k~a~~gl~il~~~vl~a-i~~-p~~~p~   41 (56)
T PF12911_consen   16 KLAVIGLIILLILVLLA-IFA-PFISPY   41 (56)
T ss_pred             chHHHHHHHHHHHHHHH-HHH-HHcCCC
Confidence            44555554444443333 333 445555


No 30 
>PLN02742 Probable galacturonosyltransferase
Probab=24.73  E-value=77  Score=37.32  Aligned_cols=41  Identities=17%  Similarity=0.306  Sum_probs=32.6

Q ss_pred             cccCCCCCCEEEEEeCceeEecCHHHHHHHHHhcCCCeEEEec
Q 004949          542 PHRLFPSARYSIWLDSKLRLQRDPLLILEYFLWRKGYEYAISN  584 (722)
Q Consensus       542 pHRLFPnyrYSIWIDgKIqL~~DPllLLE~fLwr~n~~fAIsk  584 (722)
                      .-.+||+.+--||+|+-+.+..|...|.+.=|  ++.-+|+..
T Consensus       346 lP~llp~l~KvlYLD~DvVV~~DL~eL~~~DL--~~~viaAVe  386 (534)
T PLN02742        346 IPEIYPALEKVVFLDDDVVVQKDLTPLFSIDL--HGNVNGAVE  386 (534)
T ss_pred             HHHHhhccCeEEEEeCCEEecCChHHHhcCCC--CCCEEEEeC
Confidence            33689999999999999999999988876533  466677664


No 31 
>PF14991 MLANA:  Protein melan-A; PDB: 2GTZ_F 2GT9_F 3MRO_P 2GUO_C 3MRQ_P 2GTW_C 3L6F_C 3MRP_P.
Probab=24.25  E-value=24  Score=34.05  Aligned_cols=30  Identities=30%  Similarity=0.381  Sum_probs=2.6

Q ss_pred             ccchhHHHHHHHHHHHHHHHHHHhhcCCCC
Q 004949           60 RLSIGSVIFVLLLVLLATVLAYLYISGYSN   89 (722)
Q Consensus        60 ~~~~~~~~~~~~~~l~~~v~~~~~~s~~~~   89 (722)
                      -.+||.++|||.++||.+..-|++=||+-.
T Consensus        26 AaGIGiL~VILgiLLliGCWYckRRSGYk~   55 (118)
T PF14991_consen   26 AAGIGILIVILGILLLIGCWYCKRRSGYKT   55 (118)
T ss_dssp             --SSS-------------------------
T ss_pred             hccceeHHHHHHHHHHHhheeeeecchhhh
Confidence            367899999999999999998999999874


No 32 
>COG4726 PilX Tfp pilus assembly protein PilX [Cell motility and secretion / Intracellular trafficking and secretion]
Probab=24.11  E-value=55  Score=34.07  Aligned_cols=25  Identities=28%  Similarity=0.554  Sum_probs=13.5

Q ss_pred             cCCCCCCCCCccchhHHHHHHHHHHHHHHHH
Q 004949           50 RSARSDKNGRRLSIGSVIFVLLLVLLATVLA   80 (722)
Q Consensus        50 r~~~~~~~~~~~~~~~~~~~~~~~l~~~v~~   80 (722)
                      |..|++|+      .++||||.++||+|.|+
T Consensus         7 r~~r~qRG------~~LivvL~~LvvltLl~   31 (196)
T COG4726           7 RGSRRQRG------FALIVVLMVLVVLTLLG   31 (196)
T ss_pred             CCccccCc------eEeHHHHHHHHHHHHHH
Confidence            44455555      34555666555555554


No 33 
>PF14155 DUF4307:  Domain of unknown function (DUF4307)
Probab=23.08  E-value=87  Score=29.36  Aligned_cols=28  Identities=14%  Similarity=0.239  Sum_probs=16.0

Q ss_pred             cchhHHHHHHHHHHHHHHHHHHhhcCCC
Q 004949           61 LSIGSVIFVLLLVLLATVLAYLYISGYS   88 (722)
Q Consensus        61 ~~~~~~~~~~~~~l~~~v~~~~~~s~~~   88 (722)
                      +.+.++++++.++++++.++|.+.+..+
T Consensus         6 ~~~~~~v~~vv~~~~~~w~~~~~~~~~~   33 (112)
T PF14155_consen    6 LVIAGAVLVVVAGAVVAWFGYSQFGSPP   33 (112)
T ss_pred             eEehHHHHHHHHHHHHhHhhhhhccCCC
Confidence            4444444555555666667777665554


No 34 
>PF03314 DUF273:  Protein of unknown function, DUF273;  InterPro: IPR004988 This is a family of proteins of unknown function.
Probab=23.01  E-value=66  Score=34.08  Aligned_cols=62  Identities=19%  Similarity=0.380  Sum_probs=42.6

Q ss_pred             cccCCCCCCEEEEEeCceeEecCHHHHHHHHHhcCCCeEEEecCCCCCcHHHHHHHHHHhccCChHHH
Q 004949          542 PHRLFPSARYSIWLDSKLRLQRDPLLILEYFLWRKGYEYAISNHYDRHCVWEEVAQNKKLNKYNHTVI  609 (722)
Q Consensus       542 pHRLFPnyrYSIWIDgKIqL~~DPllLLE~fLwr~n~~fAIskHp~R~CVYEEAeackrl~K~~~~~I  609 (722)
                      .+.++|++++-++||+-|-++ +|...||.|+ ..+.++-++.-+   .- =|+.|..-+-|=++..+
T Consensus        35 va~~L~~~~~vlflDaDigVv-Np~~~iEefi-d~~~Di~fydR~---~n-~Ei~agsYlvkNT~~~~   96 (222)
T PF03314_consen   35 VAKILPEYDWVLFLDADIGVV-NPNRRIEEFI-DEGYDIIFYDRF---FN-WEIAAGSYLVKNTEYSR   96 (222)
T ss_pred             HHHHhccCCEEEEEcCCceee-cCcccHHHhc-CCCCcEEEEecc---cc-hhhhhccceeeCCHHHH
Confidence            356779999999999999776 8999999987 777788776543   22 34445333333344333


No 35 
>PF02532 PsbI:  Photosystem II reaction centre I protein (PSII 4.8 kDa protein);  InterPro: IPR003686 Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection.  This family represents the low molecular weight transmembrane protein PsbI, which is tightly associated with the D1/D2 heterodimer in PSII. The function of PsbI is unknown, but it may be involved in the assembly, dimerisation or stabilisation of PSII dimers.; GO: 0015979 photosynthesis, 0009523 photosystem II, 0009539 photosystem II reaction center, 0016020 membrane; PDB: 3A0H_i 3ARC_I 3A0B_i 3BZ2_I 3PRQ_I 3KZI_I 3PRR_I 2AXT_i 4FBY_I 1S5L_i ....
Probab=21.62  E-value=1.9e+02  Score=23.13  Aligned_cols=26  Identities=8%  Similarity=0.346  Sum_probs=18.9

Q ss_pred             HHHHHHHHHHHHHHHHhhcCCCCCCC
Q 004949           67 IFVLLLVLLATVLAYLYISGYSNHND   92 (722)
Q Consensus        67 ~~~~~~~l~~~v~~~~~~s~~~~~~~   92 (722)
                      ++-.++.+.|.+|.+-++|.|+..|.
T Consensus         7 ~Vy~vV~ffv~LFifGflsnDp~RnP   32 (36)
T PF02532_consen    7 FVYTVVIFFVSLFIFGFLSNDPGRNP   32 (36)
T ss_dssp             HHHHHHHHHHHHHHHHHHTTCTTSSS
T ss_pred             eehhhHHHHHHHHhccccCCCCCCCC
Confidence            34455566778888999999996443


No 36 
>PF07790 DUF1628:  Protein of unknown function (DUF1628);  InterPro: IPR012859 The sequences making up this family are derived from hypothetical proteins of unknown function expressed by various archaeal species. The region in question is approximately 160 residues long. 
Probab=21.60  E-value=1.1e+02  Score=26.39  Aligned_cols=29  Identities=21%  Similarity=0.352  Sum_probs=22.1

Q ss_pred             CCccchhHHHHHHHHHHHHHHHHHHhhcC
Q 004949           58 GRRLSIGSVIFVLLLVLLATVLAYLYISG   86 (722)
Q Consensus        58 ~~~~~~~~~~~~~~~~l~~~v~~~~~~s~   86 (722)
                      |=++-||.+++++..|+++++++...++-
T Consensus         2 avS~viGviLliaitVilaavv~~~~~~~   30 (80)
T PF07790_consen    2 AVSPVIGVILLIAITVILAAVVGAFVFGL   30 (80)
T ss_pred             CccHHHHHHHHHHHHHHHHHHHHHHHhcc
Confidence            35677899988888888888888655554


No 37 
>PF11239 DUF3040:  Protein of unknown function (DUF3040);  InterPro: IPR021401  Some members in this family of proteins with unknown function are annotated as membrane proteins however this cannot be confirmed. 
Probab=21.60  E-value=92  Score=27.40  Aligned_cols=19  Identities=37%  Similarity=0.578  Sum_probs=10.0

Q ss_pred             CccchhHHHHHHHHHHHHH
Q 004949           59 RRLSIGSVIFVLLLVLLAT   77 (722)
Q Consensus        59 ~~~~~~~~~~~~~~~l~~~   77 (722)
                      ++...|.+++++.++++++
T Consensus        40 r~~~~~~~~~v~gl~llv~   58 (82)
T PF11239_consen   40 RRRVLGVLLVVVGLALLVA   58 (82)
T ss_pred             hHHHHHHHHHHHHHHHHHH
Confidence            3445566666665544443


No 38 
>PF06781 UPF0233:  Uncharacterised protein family (UPF0233);  InterPro: IPR009619 This is a group of proteins of unknown function.
Probab=20.36  E-value=2.3e+02  Score=26.21  Aligned_cols=31  Identities=19%  Similarity=0.134  Sum_probs=18.4

Q ss_pred             CCccchhHHHHHHHHHHHHHHHHHHhhcCCC
Q 004949           58 GRRLSIGSVIFVLLLVLLATVLAYLYISGYS   88 (722)
Q Consensus        58 ~~~~~~~~~~~~~~~~l~~~v~~~~~~s~~~   88 (722)
                      +.++.-=+.+++.+++|-+.-++.+||++..
T Consensus        27 ~~sp~W~~p~m~~lmllGL~WiVvyYi~~~~   57 (87)
T PF06781_consen   27 KPSPRWYAPLMLGLMLLGLLWIVVYYISGGQ   57 (87)
T ss_pred             CCCCccHHHHHHHHHHHHHHHHhhhhcccCC
Confidence            3455555555555555555555677787764


No 39 
>COG1783 XtmB Phage terminase large subunit [General function prediction only]
Probab=20.30  E-value=2e+02  Score=33.04  Aligned_cols=117  Identities=21%  Similarity=0.331  Sum_probs=65.2

Q ss_pred             eCceeEecCHH-HHHHHHHhcCCCeEEEecCCCCCcHHHHHHHHHHhccCC---------------hHHHHHHHH-HHHH
Q 004949          556 DSKLRLQRDPL-LILEYFLWRKGYEYAISNHYDRHCVWEEVAQNKKLNKYN---------------HTVIDQQFA-FYQA  618 (722)
Q Consensus       556 DgKIqL~~DPl-lLLE~fLwr~n~~fAIskHp~R~CVYEEAeackrl~K~~---------------~~~Id~Qme-~Yk~  618 (722)
                      |.-|+++.||. ...-.|.|.+.+.+++..-.-+.+ |.=|..++...+-.               ...+-.|+. .-..
T Consensus         3 ~~~ii~v~~p~~~~~y~f~w~qk~~i~~G~rGS~KS-y~~alk~i~kl~~~~~~d~lvIR~v~nt~k~St~~~~~e~l~e   81 (414)
T COG1783           3 DSVIIKVIDPIIFEAYVFFWNQKYFIAKGGRGSSKS-YATALKNIEKLLSEPGIDVLVIREVENTHKQSTYALFKEALSE   81 (414)
T ss_pred             cchhhhhhchhhhhhhhccchheEEEEEccCCCchh-HHHHHHHHHHHHcCCCCcEEEEEEeccccchhHHHHHHHHHHH
Confidence            56677778884 444456688887777777766666 66665555432111               122233322 2234


Q ss_pred             cCCC---CCCCCCCCCCCCCCCCCceEEEccCC-c------chhhhHHHHHHHHhcCCCCCcchHHH
Q 004949          619 DGLK---RFDPSDPDRLLPSNVPEGSFIVRAHT-P------MSNLFSCLWFNEVDRFTSRDQLSFAY  675 (722)
Q Consensus       619 eGLp---~~~~sdp~kl~pSdLpEgnVIVReHt-p------~sNlFmCLWFNEV~rFS~RDQLSFaY  675 (722)
                      .|+.   ....++|....  .-.-..||++-|- |      ..|-++.+||+|...|+-=|-...-+
T Consensus        82 ~gv~~~f~~~~s~pe~i~--~~~G~ri~F~G~ddp~klKSi~~~~~s~~WfEE~~e~s~e~~~e~l~  146 (414)
T COG1783          82 LGVTKFFKISKSSPEIIL--KDTGQRIIFKGLDDPAKLKSIAVNWISDLWFEEASEFSYEDDIELLV  146 (414)
T ss_pred             hCccceeEEecCChhhee--cccCcEEEEecCCCHHHhhhhhcchhhhhhHHHHhhhhhhhhHHHHH
Confidence            4666   44444554431  1112247777765 3      12347889999999888544444433


Done!