Query psy14259 Match_columns 79 No_of_seqs 103 out of 632 Neff 5.5 Searched_HMMs 46136 Date Fri Aug 16 23:26:58 2013 Command hhsearch -i /work/01045/syshi/Psyhhblits/psy14259.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/14259hhsearch_cdd -cpu 12 -v 0 No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM 1 PF00930 DPPIV_N: Dipeptidyl p 99.8 4.4E-19 9.5E-24 131.5 4.7 70 5-78 29-98 (353) 2 KOG2100|consensus 99.3 1.6E-12 3.5E-17 106.4 6.5 66 13-79 139-205 (755) 3 PF07676 PD40: WD40-like Beta 96.8 0.0025 5.5E-08 33.4 3.5 21 23-43 10-30 (39) 4 PRK04043 tolB translocation pr 96.7 0.0062 1.3E-07 47.1 6.6 54 6-63 264-321 (419) 5 KOG2281|consensus 96.4 0.00089 1.9E-08 55.9 -0.0 56 23-79 201-264 (867) 6 PRK03629 tolB translocation pr 95.5 0.066 1.4E-06 41.0 6.9 39 23-62 288-330 (429) 7 PRK04043 tolB translocation pr 95.3 0.099 2.2E-06 40.4 7.2 38 24-62 235-276 (419) 8 PRK00178 tolB translocation pr 94.9 0.097 2.1E-06 39.4 6.1 37 23-60 200-240 (430) 9 PRK04922 tolB translocation pr 94.6 0.18 3.9E-06 38.4 6.9 37 25-62 295-335 (433) 10 PRK01029 tolB translocation pr 94.0 0.12 2.6E-06 39.9 4.9 37 24-61 329-369 (428) 11 PRK05137 tolB translocation pr 93.9 0.17 3.6E-06 38.6 5.6 36 23-59 203-242 (435) 12 PRK01029 tolB translocation pr 93.9 0.3 6.5E-06 37.7 6.9 53 6-62 358-414 (428) 13 PRK04792 tolB translocation pr 93.9 0.15 3.3E-06 39.4 5.3 37 23-60 219-259 (448) 14 PRK05137 tolB translocation pr 93.8 0.2 4.3E-06 38.2 5.8 37 24-61 292-332 (435) 15 PRK04792 tolB translocation pr 93.8 0.35 7.6E-06 37.4 7.1 39 24-63 308-350 (448) 16 PRK03629 tolB translocation pr 93.7 0.17 3.7E-06 38.8 5.3 37 22-59 199-239 (429) 17 PRK00178 tolB translocation pr 93.4 0.22 4.7E-06 37.5 5.3 38 24-62 289-330 (430) 18 COG0823 TolB Periplasmic compo 93.3 0.28 6.1E-06 38.4 5.9 38 26-64 286-327 (425) 19 PRK01742 tolB translocation pr 93.1 0.24 5.1E-06 37.8 5.2 36 23-59 205-244 (429) 20 PRK02889 tolB translocation pr 93.1 0.26 5.7E-06 37.6 5.5 36 23-59 197-236 (427) 
21 COG0823 TolB Periplasmic compo 92.4 0.19 4.1E-06 39.4 3.8 35 25-60 241-279 (425) 22 PRK02889 tolB translocation pr 92.3 0.37 8E-06 36.8 5.2 36 25-61 331-370 (427) 23 PRK04922 tolB translocation pr 91.4 0.53 1.2E-05 35.9 5.3 38 24-62 338-379 (433) 24 COG4946 Uncharacterized protei 90.4 0.29 6.4E-06 40.2 3.2 42 22-64 79-128 (668) 25 TIGR02800 propeller_TolB tol-p 90.2 0.89 1.9E-05 33.6 5.4 37 25-62 281-321 (417) 26 PF12894 Apc4_WD40: Anaphase-p 88.0 2.4 5.1E-05 23.7 4.9 28 21-48 11-39 (47) 27 PRK13616 lipoprotein LpqB; Pro 87.7 1.1 2.3E-05 36.6 4.7 24 23-46 449-472 (591) 28 TIGR02800 propeller_TolB tol-p 87.5 1.7 3.6E-05 32.1 5.3 37 24-61 236-276 (417) 29 PF08662 eIF2A: Eukaryotic tra 87.3 1.2 2.7E-05 30.7 4.2 26 23-48 61-89 (194) 30 PRK01742 tolB translocation pr 86.3 1.4 2.9E-05 33.7 4.3 33 26-59 337-369 (429) 31 PF10647 Gmad1: Lipoprotein Lp 84.8 2.6 5.7E-05 30.3 5.0 26 23-48 113-142 (253) 32 TIGR02171 Fb_sc_TIGR02171 Fibr 82.7 3 6.5E-05 36.2 5.2 30 22-51 350-386 (912) 33 PF12566 DUF3748: Protein of u 78.0 3.4 7.4E-05 27.9 3.3 25 23-47 69-93 (122) 34 PF14583 Pectate_lyase22: Olig 76.4 2.2 4.9E-05 33.6 2.4 38 23-61 37-78 (386) 35 PF04762 IKI3: IKI3 family; I 75.9 3.7 7.9E-05 35.3 3.7 40 9-48 108-148 (928) 36 KOG1523|consensus 74.7 4.7 0.0001 31.6 3.7 44 4-49 191-235 (361) 37 PF00930 DPPIV_N: Dipeptidyl p 74.1 2 4.3E-05 31.9 1.5 19 25-43 104-122 (353) 38 PF04053 Coatomer_WDAD: Coatom 74.0 5 0.00011 31.8 3.8 25 24-48 147-171 (443) 39 KOG1524|consensus 69.8 7.7 0.00017 32.6 4.1 35 23-57 147-181 (737) 40 cd00200 WD40 WD40 domain, foun 68.3 26 0.00057 22.4 5.7 28 22-49 10-38 (289) 41 PF08662 eIF2A: Eukaryotic tra 68.2 5.6 0.00012 27.4 2.7 21 20-40 142-162 (194) 42 KOG1445|consensus 65.5 10 0.00022 32.7 4.0 31 21-52 720-751 (1012) 43 PF15492 Nbas_N: Neuroblastoma 65.3 10 0.00023 28.8 3.8 36 13-48 35-71 (282) 44 PF00400 WD40: WD domain, G-be 63.5 16 0.00036 17.8 3.6 26 22-47 12-38 (39) 45 PRK13616 lipoprotein LpqB; Pro 63.3 26 0.00057 28.7 6.0 36 23-60 351-395 (591) 46 
TIGR03866 PQQ_ABC_repeats PQQ- 61.8 24 0.00052 23.8 4.8 40 24-64 251-293 (300) 47 KOG4497|consensus 60.4 11 0.00023 30.1 3.2 46 19-64 89-147 (447) 48 PF14583 Pectate_lyase22: Olig 54.4 44 0.00095 26.5 5.6 54 3-59 64-118 (386) 49 KOG1445|consensus 52.0 20 0.00042 31.0 3.5 41 19-59 168-209 (1012) 50 KOG1920|consensus 49.1 28 0.00061 31.4 4.1 42 8-49 96-138 (1265) 51 PF12657 TFIIIC_delta: Transcr 48.8 22 0.00047 24.0 2.9 23 24-47 7-29 (173) 52 PF04841 Vps16_N: Vps16, N-ter 46.5 75 0.0016 24.5 5.8 28 22-49 217-245 (410) 53 KOG1274|consensus 45.9 42 0.00091 29.5 4.6 35 14-48 225-260 (933) 54 KOG1332|consensus 42.2 83 0.0018 24.1 5.3 40 20-59 255-295 (299) 55 KOG2055|consensus 42.2 90 0.002 25.7 5.8 56 20-75 256-328 (514) 56 PF08955 BofC_C: BofC C-termin 41.6 12 0.00025 23.2 0.6 18 55-72 12-29 (75) 57 KOG2139|consensus 41.4 31 0.00068 27.7 3.0 19 23-41 282-300 (445) 58 KOG2314|consensus 41.0 17 0.00037 30.6 1.6 18 22-39 347-364 (698) 59 KOG0291|consensus 41.0 44 0.00096 29.1 4.0 31 21-51 96-126 (893) 60 KOG1407|consensus 40.8 44 0.00095 25.8 3.6 25 24-48 109-134 (313) 61 PF10647 Gmad1: Lipoprotein Lp 40.4 73 0.0016 22.8 4.7 28 21-48 23-54 (253) 62 KOG0286|consensus 39.2 71 0.0015 25.0 4.6 42 20-61 54-96 (343) 63 PF04993 TfoX_N: TfoX N-termin 37.3 33 0.00073 21.0 2.2 22 28-49 21-42 (97) 64 KOG0275|consensus 36.2 18 0.0004 28.9 1.1 34 3-38 197-230 (508) 65 KOG2107|consensus 35.6 39 0.00085 24.2 2.5 36 37-75 118-158 (179) 66 COG3386 Gluconolactonase [Carb 34.7 85 0.0018 23.6 4.4 23 26-48 167-191 (307) 67 KOG2106|consensus 34.7 62 0.0013 27.1 3.8 28 21-48 447-475 (626) 68 PF01436 NHL: NHL repeat; Int 34.2 61 0.0013 15.6 2.6 15 35-49 6-20 (28) 69 KOG2314|consensus 33.2 48 0.001 28.1 3.0 29 20-48 209-237 (698) 70 KOG1007|consensus 33.0 81 0.0018 24.8 4.1 31 23-53 125-155 (370) 71 KOG0318|consensus 33.0 1.3E+02 0.0029 25.2 5.4 48 6-53 174-223 (603) 72 KOG4497|consensus 32.7 46 0.00099 26.6 2.7 24 26-49 13-36 (447) 73 PF03079 ARD: ARD/ARD' family; 32.2 1.5E+02 0.0032 20.3 5.0 42 
30-74 109-156 (157) 74 PF10584 Proteasome_A_N: Prote 32.0 20 0.00043 17.5 0.4 9 27-35 6-14 (23) 75 KOG0771|consensus 30.8 51 0.0011 26.3 2.7 40 20-59 185-224 (398) 76 KOG0973|consensus 29.7 84 0.0018 27.8 4.0 30 20-49 128-158 (942) 77 PF09865 DUF2092: Predicted pe 28.8 1.3E+02 0.0028 21.6 4.3 46 29-76 30-77 (214) 78 PF05796 Chordopox_G2: Chordop 28.5 48 0.001 24.4 2.1 22 27-48 94-115 (216) 79 PF11061 DUF2862: Protein of u 28.4 27 0.00059 21.0 0.7 15 65-79 48-62 (64) 80 COG4946 Uncharacterized protei 28.3 1.4E+02 0.0031 25.1 4.9 43 18-61 440-487 (668) 81 KOG1539|consensus 28.1 91 0.002 27.4 3.9 27 23-49 619-647 (910) 82 cd01782 AF6_RA_repeat1 Ubiquit 28.0 1.4E+02 0.003 19.9 4.0 35 7-43 78-112 (112) 83 PRK10115 protease 2; Provision 27.5 74 0.0016 26.4 3.2 27 23-49 128-160 (686) 84 KOG1538|consensus 27.4 1.3E+02 0.0027 26.5 4.5 26 23-48 14-39 (1081) 85 smart00320 WD40 WD40 repeats. 27.1 62 0.0013 13.4 3.8 17 23-39 14-30 (40) 86 PF09142 TruB_C: tRNA Pseudour 25.5 75 0.0016 18.1 2.2 19 24-42 27-45 (56) 87 KOG2111|consensus 25.2 1.1E+02 0.0023 24.1 3.5 40 8-47 213-253 (346) 88 KOG2315|consensus 25.1 1.2E+02 0.0025 25.4 3.9 20 23-42 272-291 (566) 89 PF02897 Peptidase_S9_N: Proly 25.0 83 0.0018 23.4 2.9 26 24-49 126-157 (414) 90 KOG2394|consensus 24.1 62 0.0013 27.2 2.2 22 24-45 335-356 (636) 91 KOG1354|consensus 23.6 61 0.0013 26.0 2.0 26 23-48 407-432 (433) 92 PF07433 DUF1513: Protein of u 23.3 1.3E+02 0.0028 23.1 3.6 38 5-43 34-71 (305) 93 KOG0183|consensus 23.2 1.3E+02 0.0028 22.6 3.5 24 25-48 7-45 (249) 94 KOG1446|consensus 23.1 2.4E+02 0.0053 21.9 5.1 29 20-48 186-215 (311) 95 PTZ00420 coronin; Provisional 22.5 2.3E+02 0.0049 23.4 5.1 28 22-49 168-196 (568) 96 KOG0315|consensus 22.3 2.6E+02 0.0056 21.6 5.0 34 18-51 164-198 (311) 97 PHA03078 transcriptional elong 22.0 76 0.0016 23.4 2.1 21 28-48 92-112 (219) 98 KOG2100|consensus 21.9 64 0.0014 27.3 1.9 21 25-45 210-230 (755) 99 smart00415 HSF heat shock fact 21.9 94 0.002 19.5 2.3 14 25-38 20-33 (105) 100 PF07646 
Kelch_2: Kelch motif; 21.7 1.3E+02 0.0028 15.7 2.6 18 31-48 1-18 (49) 101 PF04053 Coatomer_WDAD: Coatom 21.7 2.2E+02 0.0047 22.6 4.8 67 6-74 15-83 (443) 102 KOG0319|consensus 20.9 3E+02 0.0066 23.9 5.6 23 27-49 25-47 (775) 103 KOG2919|consensus 20.6 1.1E+02 0.0024 24.4 2.8 29 23-51 51-80 (406) 104 cd02257 Peptidase_C19 Peptidas 20.5 1.4E+02 0.0031 19.2 3.0 18 31-48 207-226 (255) 105 COG5169 HSF1 Heat shock transc 20.4 95 0.0021 23.5 2.4 19 22-40 25-43 (282) 106 KOG4659|consensus 20.2 1.3E+02 0.0027 28.4 3.4 38 34-71 596-635 (1899) 107 KOG1864|consensus 20.1 1.1E+02 0.0023 25.5 2.8 23 26-48 519-543 (587) 108 COG4831 Roadblock/LC7 domain [ 20.1 76 0.0016 20.9 1.6 17 24-40 15-31 (109) No 1 >PF00930 DPPIV_N: Dipeptidyl peptidase IV (DPP IV) N-terminal region; InterPro: IPR002469 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. 
The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). 
This domain defines serine peptidases belonging to MEROPS peptidase family S9 (clan SC), subfamily S9B (dipeptidyl-peptidase IV). The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. This domain is an alignment of the region to the N-terminal side of the active site, which is found in IPR001375 from INTERPRO. CD26 (3.4.14.5 from EC) is also called adenosine deaminase-binding protein (ADA-binding protein) or dipeptidylpeptidase IV (DPP IV ectoenzyme). The exopeptidase cleaves off N-terminal X-Pro or X-Ala dipeptides from polypeptides (dipeptidyl peptidase IV activity). CD26 serves as the costimulatory molecule in T cell activation and is an associated marker of autoimmune diseases, adenosine deaminase-deficiency and HIV pathogenesis. Dipeptidyl peptidase IV (DPP IV) is responsible for the removal of N-terminal dipeptides sequentially from polypeptides having unsubstituted N termini, provided that the penultimate residue is proline. The enzyme catalyses the reaction: Dipeptidyl-Polypeptide + H(2)O = Dipeptide + Polypeptide. It is a type II membrane protein that forms a homodimer. CD molecules are leucocyte antigens on cell surfaces. CD antigen nomenclature is updated at Protein Reviews On The Web (http://prow.nci.nih.gov/). ; GO: 0006508 proteolysis, 0016020 membrane; PDB: 2RIP_A 3Q8W_B 2AJL_I 1TKR_B 1TK3_B 3C45_A 2G5P_A 3G0C_D 1R9M_C 1RWQ_A .... Probab=99.76 E-value=4.4e-19 Score=131.46 Aligned_cols=70 Identities=29% Similarity=0.522 Sum_probs=53.7 Q ss_pred cCCccceeecccCCCCCCCeeeeEEccCCCeEEEEEcCCEEEEcCCCCCCeEEeccCCCceeEecccceeeeec Q psy14259 5 RLPIRAEKQQNINDHEAPYLQHISWAPVDNALAFVYNRDVYYSPSATLQDIYRLSNTGSEVVSNGVPDWLYQAR 78 (79) Q Consensus 5 ~~~~~~~~~~~~~~~~~~~~q~a~wsP~g~~lafV~~nnly~~~~~~~~~~~~lT~dG~~~i~nG~~DWvYeEE 78 (79) .+.++..+++... 
...++.++|||+|+.||||++||||+.+...+ ..+|||+||...|+||+|||||||| T Consensus 29 d~~~~~~~~l~~~---~~~~~~~~~sP~g~~~~~v~~~nly~~~~~~~-~~~~lT~dg~~~i~nG~~dwvyeEE 98 (353) T PF00930_consen 29 DIETGEITPLTPP---PPKLQDAKWSPDGKYIAFVRDNNLYLRDLATG-QETQLTTDGEPGIYNGVPDWVYEEE 98 (353) T ss_dssp ETTTTEEEESS-E---ETTBSEEEE-SSSTEEEEEETTEEEEESSTTS-EEEESES--TTTEEESB--HHHHHH T ss_pred ecCCCceEECcCC---ccccccceeecCCCeeEEEecCceEEEECCCC-CeEEeccccceeEEcCccceecccc Confidence 3445555555544 56799999999999999999999999976544 6999999998899999999999999 No 2 >KOG2100|consensus Probab=99.35 E-value=1.6e-12 Score=106.36 Aligned_cols=66 Identities=33% Similarity=0.685 Sum_probs=56.1 Q ss_pred ecccCCCCCCCeeeeEEccCCCeEEEEEcCCEEEEcCCCCCCeEEeccCCCc-eeEecccceeeeecC Q psy14259 13 QQNINDHEAPYLQHISWAPVDNALAFVYNRDVYYSPSATLQDIYRLSNTGSE-VVSNGVPDWLYQARE 79 (79) Q Consensus 13 ~~~~~~~~~~~~q~a~wsP~g~~lafV~~nnly~~~~~~~~~~~~lT~dG~~-~i~nG~~DWvYeEEf 79 (79) +..++..+...+|.+.|||.|++++||++||||+...... ...++|.+|.. .+|||.+||+||||. T Consensus 139 ~~~~~~~~~~~~~~~~wsp~~~~l~yv~~~niy~~~~~~~-~~~~~~~~~~~~~i~ng~~Dw~yeeEv 205 (755) T KOG2100|consen 139 KLHPPEYEGSKIQYASWSPLGNDLAYVLHNNIYYQSSEED-EDVRIVSNGGEDVIFNGKPDWIYEEEV 205 (755) T ss_pred cccCcccCCCeeEEEEEcCCCCEEEEEEecccccccCcCC-CceEEEecCCCceEEcCCCCceeehhh Confidence 4455666777799999999999999999999999986544 47788888888 699999999999874 No 3 >PF07676 PD40: WD40-like Beta Propeller Repeat; InterPro: IPR011659 WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. 
Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events. This region appears to be related to the IPR001680 repeat from INTERPRO. This model is likely to miss copies within a sequence.; PDB: 2HQS_D 1C5K_A 2IVZ_A 2W8B_D 3IAX_A 1CRZ_A 1N6F_D 1N6D_C 1N6E_C 1K32_A .... Probab=96.78 E-value=0.0025 Score=33.35 Aligned_cols=21 Identities=24% Similarity=0.408 Sum_probs=16.9 Q ss_pred CeeeeEEccCCCeEEEEEcCC Q psy14259 23 YLQHISWAPVDNALAFVYNRD 43 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~nn 43 (79) .--.+.|||+|+.|+|+.+.+ T Consensus 10 ~~~~p~~SpDGk~i~f~s~~~ 30 (39) T PF07676_consen 10 DDGSPAWSPDGKYIYFTSNRN 30 (39) T ss_dssp SEEEEEE-TTSSEEEEEEECT T ss_pred cccCEEEecCCCEEEEEecCC Confidence 456889999999999998754 No 4 >PRK04043 tolB translocation protein TolB; Provisional Probab=96.72 E-value=0.0062 Score=47.06 Aligned_cols=54 Identities=11% Similarity=0.126 Sum_probs=37.8 Q ss_pred CCccceeecccCCCCCCCeeeeEEccCCCeEEEEEcC----CEEEEcCCCCCCeEEeccCCC Q psy14259 6 LPIRAEKQQNINDHEAPYLQHISWAPVDNALAFVYNR----DVYYSPSATLQDIYRLSNTGS 63 (79) Q Consensus 6 ~~~~~~~~~~~~~~~~~~~q~a~wsP~g~~lafV~~n----nly~~~~~~~~~~~~lT~dG~ 63 (79) ++.+..+++...+. .-..+.|||+|+.|+|+.++ +||+.+ ..+++..|||..|. 
T Consensus 264 l~~g~~~~LT~~~~---~d~~p~~SPDG~~I~F~Sdr~g~~~Iy~~d-l~~g~~~rlt~~g~ 321 (419) T PRK04043 264 TNTKTLTQITNYPG---IDVNGNFVEDDKRIVFVSDRLGYPNIFMKK-LNSGSVEQVVFHGK 321 (419) T ss_pred CCCCcEEEcccCCC---ccCccEECCCCCEEEEEECCCCCceEEEEE-CCCCCeEeCccCCC Confidence 44555555543322 23457899999999999987 899997 45566779987654 No 5 >KOG2281|consensus Probab=96.36 E-value=0.00089 Score=55.94 Aligned_cols=56 Identities=20% Similarity=0.261 Sum_probs=45.1 Q ss_pred CeeeeEEccC-CCeEEEEEcCCEEEEcCCCCCCeEEecc--CCCc-----eeEecccceeeeecC Q psy14259 23 YLQHISWAPV-DNALAFVYNRDVYYSPSATLQDIYRLSN--TGSE-----VVSNGVPDWLYQARE 79 (79) Q Consensus 23 ~~q~a~wsP~-g~~lafV~~nnly~~~~~~~~~~~~lT~--dG~~-----~i~nG~~DWvYeEEf 79 (79) ....++.+|. +..|||+++++|||.+. ..+.+.|+|. .|.. .+.+|+|.+|-+||| T Consensus 201 ~~~dP~lcP~~~~fia~i~~~dl~V~n~-~~~~ekrlt~~h~g~sn~~dd~~saGVasyv~QEEf 264 (867) T KOG2281|consen 201 TRMDPKLCPADPDFIAYIKVCDLWVLNI-LTGEEKRLTYIHNGSSNSKDDAISAGVASYVVQEEF 264 (867) T ss_pred CccCcccCCCCccceeeeehhhhhhhhh-hhchhhceeeeeccccccccchhhcCcchHHHHHHH Confidence 4567788887 89999999999999975 4456778873 4432 788999999999997 No 6 >PRK03629 tolB translocation protein TolB; Provisional Probab=95.52 E-value=0.066 Score=41.04 Aligned_cols=39 Identities=21% Similarity=0.296 Sum_probs=29.9 Q ss_pred CeeeeEEccCCCeEEEEEcC----CEEEEcCCCCCCeEEeccCC Q psy14259 23 YLQHISWAPVDNALAFVYNR----DVYYSPSATLQDIYRLSNTG 62 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~n----nly~~~~~~~~~~~~lT~dG 62 (79) ....+.|||+|+.|+|+.+. +||..+. 
.++...+||..+ T Consensus 288 ~~~~~~wSPDG~~I~f~s~~~g~~~Iy~~d~-~~g~~~~lt~~~ 330 (429) T PRK03629 288 NNTEPTWFPDSQNLAYTSDQAGRPQVYKVNI-NGGAPQRITWEG 330 (429) T ss_pred CcCceEECCCCCEEEEEeCCCCCceEEEEEC-CCCCeEEeecCC Confidence 45678999999999999875 6998863 444577887654 No 7 >PRK04043 tolB translocation protein TolB; Provisional Probab=95.27 E-value=0.099 Score=40.44 Aligned_cols=38 Identities=13% Similarity=0.181 Sum_probs=29.4 Q ss_pred eeeeEEccCCCeEEEEEc----CCEEEEcCCCCCCeEEeccCC Q psy14259 24 LQHISWAPVDNALAFVYN----RDVYYSPSATLQDIYRLSNTG 62 (79) Q Consensus 24 ~q~a~wsP~g~~lafV~~----nnly~~~~~~~~~~~~lT~dG 62 (79) ...+.|||+|+.|+|+.. .+||+.+. .++...+||... T Consensus 235 ~~~~~~SPDG~~la~~~~~~g~~~Iy~~dl-~~g~~~~LT~~~ 276 (419) T PRK04043 235 LVVSDVSKDGSKLLLTMAPKGQPDIYLYDT-NTKTLTQITNYP 276 (419) T ss_pred EEeeEECCCCCEEEEEEccCCCcEEEEEEC-CCCcEEEcccCC Confidence 446789999999999976 56999974 445578998643 No 8 >PRK00178 tolB translocation protein TolB; Provisional Probab=94.90 E-value=0.097 Score=39.40 Aligned_cols=37 Identities=14% Similarity=0.275 Sum_probs=27.6 Q ss_pred CeeeeEEccCCCeEEEEEcC----CEEEEcCCCCCCeEEecc Q psy14259 23 YLQHISWAPVDNALAFVYNR----DVYYSPSATLQDIYRLSN 60 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~n----nly~~~~~~~~~~~~lT~ 60 (79) .+..+.|||+|+.|+|+..+ +||+.+. ..+...+||. T Consensus 200 ~~~~p~wSpDG~~la~~s~~~~~~~l~~~~l-~~g~~~~l~~ 240 (430) T PRK00178 200 PILSPRWSPDGKRIAYVSFEQKRPRIFVQNL-DTGRREQITN 240 (430) T ss_pred ceeeeeECCCCCEEEEEEcCCCCCEEEEEEC-CCCCEEEccC Confidence 46788999999999999743 5998864 4444666664 No 9 >PRK04922 tolB translocation protein TolB; Provisional Probab=94.58 E-value=0.18 Score=38.44 Aligned_cols=37 Identities=24% Similarity=0.473 Sum_probs=28.1 Q ss_pred eeeEEccCCCeEEEEEcC----CEEEEcCCCCCCeEEeccCC Q psy14259 25 QHISWAPVDNALAFVYNR----DVYYSPSATLQDIYRLSNTG 62 (79) Q Consensus 25 q~a~wsP~g~~lafV~~n----nly~~~~~~~~~~~~lT~dG 62 (79) ..+.|||+|+.|+|+.+. +||+.+. 
.+++..+||..| T Consensus 295 ~~~~~spDG~~l~f~sd~~g~~~iy~~dl-~~g~~~~lt~~g 335 (433) T PRK04922 295 TEPTWAPDGKSIYFTSDRGGRPQIYRVAA-SGGSAERLTFQG 335 (433) T ss_pred cceEECCCCCEEEEEECCCCCceEEEEEC-CCCCeEEeecCC Confidence 467999999999999864 4999863 344577787654 No 10 >PRK01029 tolB translocation protein TolB; Provisional Probab=94.00 E-value=0.12 Score=39.85 Aligned_cols=37 Identities=19% Similarity=0.345 Sum_probs=29.2 Q ss_pred eeeeEEccCCCeEEEEEc----CCEEEEcCCCCCCeEEeccC Q psy14259 24 LQHISWAPVDNALAFVYN----RDVYYSPSATLQDIYRLSNT 61 (79) Q Consensus 24 ~q~a~wsP~g~~lafV~~----nnly~~~~~~~~~~~~lT~d 61 (79) ...+.|||+|+.|+|+.. .+||+++ ..+++..+||.+ T Consensus 329 ~~~p~wSPDG~~Laf~~~~~g~~~I~v~d-l~~g~~~~Lt~~ 369 (428) T PRK01029 329 SSCPAWSPDGKKIAFCSVIKGVRQICVYD-LATGRDYQLTTS 369 (428) T ss_pred ccceeECCCCCEEEEEEcCCCCcEEEEEE-CCCCCeEEccCC Confidence 467899999999999975 4699996 455567888854 No 11 >PRK05137 tolB translocation protein TolB; Provisional Probab=93.94 E-value=0.17 Score=38.57 Aligned_cols=36 Identities=8% Similarity=0.122 Sum_probs=27.9 Q ss_pred CeeeeEEccCCCeEEEEEc----CCEEEEcCCCCCCeEEec Q psy14259 23 YLQHISWAPVDNALAFVYN----RDVYYSPSATLQDIYRLS 59 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~----nnly~~~~~~~~~~~~lT 59 (79) .+..+.|||+|+.|+|+.. ..||+++. ..+...+|| T Consensus 203 ~v~~p~wSpDG~~lay~s~~~g~~~i~~~dl-~~g~~~~l~ 242 (435) T PRK05137 203 LVLTPRFSPNRQEITYMSYANGRPRVYLLDL-ETGQRELVG 242 (435) T ss_pred CeEeeEECCCCCEEEEEEecCCCCEEEEEEC-CCCcEEEee Confidence 4778999999999999964 46999974 444456666 No 12 >PRK01029 tolB translocation protein TolB; Provisional Probab=93.88 E-value=0.3 Score=37.67 Aligned_cols=53 Identities=13% Similarity=0.156 Sum_probs=35.4 Q ss_pred CCccceeecccCCCCCCCeeeeEEccCCCeEEEEEc----CCEEEEcCCCCCCeEEeccCC Q psy14259 6 LPIRAEKQQNINDHEAPYLQHISWAPVDNALAFVYN----RDVYYSPSATLQDIYRLSNTG 62 (79) Q Consensus 6 ~~~~~~~~~~~~~~~~~~~q~a~wsP~g~~lafV~~----nnly~~~~~~~~~~~~lT~dG 62 (79) +..+..+++... ......+.|||+|+.|+|... .+||+.+ ..+++..+||... 
T Consensus 358 l~~g~~~~Lt~~---~~~~~~p~wSpDG~~L~f~~~~~g~~~L~~vd-l~~g~~~~Lt~~~ 414 (428) T PRK01029 358 LATGRDYQLTTS---PENKESPSWAIDSLHLVYSAGNSNESELYLIS-LITKKTRKIVIGS 414 (428) T ss_pred CCCCCeEEccCC---CCCccceEECCCCCEEEEEECCCCCceEEEEE-CCCCCEEEeecCC Confidence 344555444422 123567899999999999865 4699986 4455688898643 No 13 >PRK04792 tolB translocation protein TolB; Provisional Probab=93.87 E-value=0.15 Score=39.40 Aligned_cols=37 Identities=19% Similarity=0.311 Sum_probs=27.1 Q ss_pred CeeeeEEccCCCeEEEEEcC----CEEEEcCCCCCCeEEecc Q psy14259 23 YLQHISWAPVDNALAFVYNR----DVYYSPSATLQDIYRLSN 60 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~n----nly~~~~~~~~~~~~lT~ 60 (79) .+..+.|||+|+.|+|+... +||+.+. ..++..++|. T Consensus 219 ~~~~p~wSPDG~~La~~s~~~g~~~L~~~dl-~tg~~~~lt~ 259 (448) T PRK04792 219 PLMSPAWSPDGRKLAYVSFENRKAEIFVQDI-YTQVREKVTS 259 (448) T ss_pred cccCceECCCCCEEEEEEecCCCcEEEEEEC-CCCCeEEecC Confidence 35678999999999999643 5999973 3444566663 No 14 >PRK05137 tolB translocation protein TolB; Provisional Probab=93.84 E-value=0.2 Score=38.16 Aligned_cols=37 Identities=22% Similarity=0.251 Sum_probs=28.9 Q ss_pred eeeeEEccCCCeEEEEEcC----CEEEEcCCCCCCeEEeccC Q psy14259 24 LQHISWAPVDNALAFVYNR----DVYYSPSATLQDIYRLSNT 61 (79) Q Consensus 24 ~q~a~wsP~g~~lafV~~n----nly~~~~~~~~~~~~lT~d 61 (79) ...+.|||+|+.|+|+.+. +||+.+ ..++...+||.. T Consensus 292 ~~~~~~spDG~~i~f~s~~~g~~~Iy~~d-~~g~~~~~lt~~ 332 (435) T PRK05137 292 DTSPSYSPDGSQIVFESDRSGSPQLYVMN-ADGSNPRRISFG 332 (435) T ss_pred cCceeEcCCCCEEEEEECCCCCCeEEEEE-CCCCCeEEeecC Confidence 4468999999999999864 699996 445557788764 No 15 >PRK04792 tolB translocation protein TolB; Provisional Probab=93.78 E-value=0.35 Score=37.39 Aligned_cols=39 Identities=26% Similarity=0.420 Sum_probs=29.7 Q ss_pred eeeeEEccCCCeEEEEEcC----CEEEEcCCCCCCeEEeccCCC Q psy14259 24 LQHISWAPVDNALAFVYNR----DVYYSPSATLQDIYRLSNTGS 63 (79) Q Consensus 24 ~q~a~wsP~g~~lafV~~n----nly~~~~~~~~~~~~lT~dG~ 63 (79) ...+.|||+|+.|+|+.+. +||+.+. .+++..+||.+|. 
T Consensus 308 ~~~p~wSpDG~~I~f~s~~~g~~~Iy~~dl-~~g~~~~Lt~~g~ 350 (448) T PRK04792 308 DTEPSWHPDGKSLIFTSERGGKPQIYRVNL-ASGKVSRLTFEGE 350 (448) T ss_pred ccceEECCCCCEEEEEECCCCCceEEEEEC-CCCCEEEEecCCC Confidence 4568999999999999864 5998863 4455778886553 No 16 >PRK03629 tolB translocation protein TolB; Provisional Probab=93.74 E-value=0.17 Score=38.79 Aligned_cols=37 Identities=16% Similarity=0.286 Sum_probs=27.3 Q ss_pred CCeeeeEEccCCCeEEEEE----cCCEEEEcCCCCCCeEEec Q psy14259 22 PYLQHISWAPVDNALAFVY----NRDVYYSPSATLQDIYRLS 59 (79) Q Consensus 22 ~~~q~a~wsP~g~~lafV~----~nnly~~~~~~~~~~~~lT 59 (79) .....+.|||+|+.|||+. +.+||+.+. .+++..+|| T Consensus 199 ~~~~~p~wSPDG~~la~~s~~~g~~~i~i~dl-~~G~~~~l~ 239 (429) T PRK03629 199 QPLMSPAWSPDGSKLAYVTFESGRSALVIQTL-ANGAVRQVA 239 (429) T ss_pred CceeeeEEcCCCCEEEEEEecCCCcEEEEEEC-CCCCeEEcc Confidence 3577899999999999985 246998863 444455655 No 17 >PRK00178 tolB translocation protein TolB; Provisional Probab=93.41 E-value=0.22 Score=37.51 Aligned_cols=38 Identities=18% Similarity=0.235 Sum_probs=28.5 Q ss_pred eeeeEEccCCCeEEEEEcC----CEEEEcCCCCCCeEEeccCC Q psy14259 24 LQHISWAPVDNALAFVYNR----DVYYSPSATLQDIYRLSNTG 62 (79) Q Consensus 24 ~q~a~wsP~g~~lafV~~n----nly~~~~~~~~~~~~lT~dG 62 (79) ...+.|||+|+.|+|+.+. +||+.+. .+++..++|..| T Consensus 289 ~~~~~~spDg~~i~f~s~~~g~~~iy~~d~-~~g~~~~lt~~~ 330 (430) T PRK00178 289 DTEPFWGKDGRTLYFTSDRGGKPQIYKVNV-NGGRAERVTFVG 330 (430) T ss_pred cCCeEECCCCCEEEEEECCCCCceEEEEEC-CCCCEEEeecCC Confidence 3457899999999999864 5998863 445577887554 No 18 >COG0823 TolB Periplasmic component of the Tol biopolymer transport system [Intracellular trafficking and secretion] Probab=93.26 E-value=0.28 Score=38.42 Aligned_cols=38 Identities=24% Similarity=0.480 Sum_probs=30.9 Q ss_pred eeEEccCCCeEEEEEcC----CEEEEcCCCCCCeEEeccCCCc Q psy14259 26 HISWAPVDNALAFVYNR----DVYYSPSATLQDIYRLSNTGSE 64 (79) Q Consensus 26 ~a~wsP~g~~lafV~~n----nly~~~~~~~~~~~~lT~dG~~ 64 (79) ...|||+|+.|+|+.|. +||+++. ......|+|.++.. 
T Consensus 286 ~Ps~spdG~~ivf~Sdr~G~p~I~~~~~-~g~~~~riT~~~~~ 327 (425) T COG0823 286 SPSWSPDGSKIVFTSDRGGRPQIYLYDL-EGSQVTRLTFSGGG 327 (425) T ss_pred CccCCCCCCEEEEEeCCCCCcceEEECC-CCCceeEeeccCCC Confidence 78999999999999886 5999964 45557899977654 No 19 >PRK01742 tolB translocation protein TolB; Provisional Probab=93.15 E-value=0.24 Score=37.81 Aligned_cols=36 Identities=17% Similarity=0.182 Sum_probs=25.6 Q ss_pred CeeeeEEccCCCeEEEEEcC----CEEEEcCCCCCCeEEec Q psy14259 23 YLQHISWAPVDNALAFVYNR----DVYYSPSATLQDIYRLS 59 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~n----nly~~~~~~~~~~~~lT 59 (79) .+..+.|||+|+.|||+... .||+++. ..+...+++ T Consensus 205 ~v~~p~wSPDG~~la~~s~~~~~~~i~i~dl-~tg~~~~l~ 244 (429) T PRK01742 205 PLMSPAWSPDGSKLAYVSFENKKSQLVVHDL-RSGARKVVA 244 (429) T ss_pred ccccceEcCCCCEEEEEEecCCCcEEEEEeC-CCCceEEEe Confidence 46789999999999999643 4999864 333333444 No 20 >PRK02889 tolB translocation protein TolB; Provisional Probab=93.13 E-value=0.26 Score=37.61 Aligned_cols=36 Identities=19% Similarity=0.226 Sum_probs=27.1 Q ss_pred CeeeeEEccCCCeEEEEEc----CCEEEEcCCCCCCeEEec Q psy14259 23 YLQHISWAPVDNALAFVYN----RDVYYSPSATLQDIYRLS 59 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~----nnly~~~~~~~~~~~~lT 59 (79) .+..+.|||+|+.|||+.. .+||+++. .+++..++| T Consensus 197 ~v~~p~wSPDG~~la~~s~~~~~~~I~~~dl-~~g~~~~l~ 236 (427) T PRK02889 197 PIISPAWSPDGTKLAYVSFESKKPVVYVHDL-ATGRRRVVA 236 (427) T ss_pred CcccceEcCCCCEEEEEEccCCCcEEEEEEC-CCCCEEEee Confidence 4668899999999999974 34999874 444466666 No 21 >COG0823 TolB Periplasmic component of the Tol biopolymer transport system [Intracellular trafficking and secretion] Probab=92.36 E-value=0.19 Score=39.37 Aligned_cols=35 Identities=29% Similarity=0.450 Sum_probs=27.4 Q ss_pred eeeEEccCCCeEEEEEcCC----EEEEcCCCCCCeEEecc Q psy14259 25 QHISWAPVDNALAFVYNRD----VYYSPSATLQDIYRLSN 60 (79) Q Consensus 25 q~a~wsP~g~~lafV~~nn----ly~~~~~~~~~~~~lT~ 60 (79) -.++|||+|+.|||+..+| ||+.+. 
.+....+||+ T Consensus 241 ~~P~fspDG~~l~f~~~rdg~~~iy~~dl-~~~~~~~Lt~ 279 (425) T COG0823 241 GAPAFSPDGSKLAFSSSRDGSPDIYLMDL-DGKNLPRLTN 279 (425) T ss_pred CCccCCCCCCEEEEEECCCCCccEEEEcC-CCCcceeccc Confidence 3579999999999998875 999975 4444667774 No 22 >PRK02889 tolB translocation protein TolB; Provisional Probab=92.26 E-value=0.37 Score=36.81 Aligned_cols=36 Identities=17% Similarity=0.262 Sum_probs=27.4 Q ss_pred eeeEEccCCCeEEEEEcC----CEEEEcCCCCCCeEEeccC Q psy14259 25 QHISWAPVDNALAFVYNR----DVYYSPSATLQDIYRLSNT 61 (79) Q Consensus 25 q~a~wsP~g~~lafV~~n----nly~~~~~~~~~~~~lT~d 61 (79) ..+.|||+|+.|+|+.+. +||+.+. ..++..++|.. T Consensus 331 ~~~~~SpDG~~Ia~~s~~~g~~~I~v~d~-~~g~~~~lt~~ 370 (427) T PRK02889 331 TSPRISPDGKLLAYISRVGGAFKLYVQDL-ATGQVTALTDT 370 (427) T ss_pred CceEECCCCCEEEEEEccCCcEEEEEEEC-CCCCeEEccCC Confidence 457899999999999764 4999974 44457778754 No 23 >PRK04922 tolB translocation protein TolB; Provisional Probab=91.42 E-value=0.53 Score=35.88 Aligned_cols=38 Identities=13% Similarity=0.096 Sum_probs=28.1 Q ss_pred eeeeEEccCCCeEEEEEcC----CEEEEcCCCCCCeEEeccCC Q psy14259 24 LQHISWAPVDNALAFVYNR----DVYYSPSATLQDIYRLSNTG 62 (79) Q Consensus 24 ~q~a~wsP~g~~lafV~~n----nly~~~~~~~~~~~~lT~dG 62 (79) ...+.|||+|+.|+|+... +||+.+. .++...+||.++ T Consensus 338 ~~~~~~SpDG~~Ia~~~~~~~~~~I~v~d~-~~g~~~~Lt~~~ 379 (433) T PRK04922 338 NARASVSPDGKKIAMVHGSGGQYRIAVMDL-STGSVRTLTPGS 379 (433) T ss_pred ccCEEECCCCCEEEEEECCCCceeEEEEEC-CCCCeEECCCCC Confidence 3468999999999998643 4999874 445577888653 No 24 >COG4946 Uncharacterized protein related to the periplasmic component of the Tol biopolymer transport system [Function unknown] Probab=90.42 E-value=0.29 Score=40.19 Aligned_cols=42 Identities=21% Similarity=0.360 Sum_probs=33.9 Q ss_pred CCeeeeEEccCCCeEEEE--------EcCCEEEEcCCCCCCeEEeccCCCc Q psy14259 22 PYLQHISWAPVDNALAFV--------YNRDVYYSPSATLQDIYRLSNTGSE 64 (79) Q Consensus 22 ~~~q~a~wsP~g~~lafV--------~~nnly~~~~~~~~~~~~lT~dG~~ 64 (79) ..+-.+++||+|..+||. ...|||+++. .+++..|||.-|.. 
T Consensus 79 GVvnn~kf~pdGrkvaf~rv~~~ss~~taDly~v~~-e~Ge~kRiTyfGr~ 128 (668) T COG4946 79 GVVNNPKFSPDGRKVAFSRVMLGSSLQTADLYVVPS-EDGEAKRITYFGRR 128 (668) T ss_pred ceeccccCCCCCcEEEEEEEEecCCCccccEEEEeC-CCCcEEEEEEeccc Confidence 347789999999999993 3578999975 45679999988754 No 25 >TIGR02800 propeller_TolB tol-pal system beta propeller repeat protein TolB. The Tol-Pal system is required for bacterial outer membrane integrity. E. coli TolB is involved in the tonB-independent uptake of group A colicins (colicins A, E1, E2, E3 and K), and is necessary for the colicins to reach their respective targets after initial binding to the bacteria. It is also involved in uptake of filamentous DNA. Study of its structure suggests that the TolB protein might be involved in the recycling of peptidoglycan or in its covalent linking with lipoproteins. The Tol-Pal system is also implicated in pathogenesis of E. coli, Haemophilus ducreyi, Salmonella enterica and Vibrio cholerae, but the mechanism(s) is unclear. Probab=90.19 E-value=0.89 Score=33.55 Aligned_cols=37 Identities=27% Similarity=0.558 Sum_probs=27.7 Q ss_pred eeeEEccCCCeEEEEEcC----CEEEEcCCCCCCeEEeccCC Q psy14259 25 QHISWAPVDNALAFVYNR----DVYYSPSATLQDIYRLSNTG 62 (79) Q Consensus 25 q~a~wsP~g~~lafV~~n----nly~~~~~~~~~~~~lT~dG 62 (79) ..+.|+|+|+.|+|+.+. +||+.+. .+++..++|..+ T Consensus 281 ~~~~~s~dg~~l~~~s~~~g~~~iy~~d~-~~~~~~~l~~~~ 321 (417) T TIGR02800 281 TEPSWSPDGKSIAFTSDRGGSPQIYMMDA-DGGEVRRLTFRG 321 (417) T ss_pred CCEEECCCCCEEEEEECCCCCceEEEEEC-CCCCEEEeecCC Confidence 457899999999999764 6999863 444566777554 No 26 >PF12894 Apc4_WD40: Anaphase-promoting complex subunit 4 WD40 domain Probab=88.04 E-value=2.4 Score=23.67 Aligned_cols=28 Identities=18% Similarity=0.343 Sum_probs=22.5 Q ss_pred CCCeeeeEEccCCCeEEEEEc-CCEEEEc Q psy14259 21 APYLQHISWAPVDNALAFVYN-RDVYYSP 48 (79) Q Consensus 21 ~~~~q~a~wsP~g~~lafV~~-nnly~~~ 48 (79) ...++.+.|||+.+-||.+.+ +.|++.. 
T Consensus 11 ~~~v~~~~w~P~mdLiA~~t~~g~v~v~R 39 (47) T PF12894_consen 11 PSRVSCMSWCPTMDLIALGTEDGEVLVYR 39 (47) T ss_pred CCcEEEEEECCCCCEEEEEECCCeEEEEE Confidence 345889999999999999976 4577764 No 27 >PRK13616 lipoprotein LpqB; Provisional Probab=87.68 E-value=1.1 Score=36.61 Aligned_cols=24 Identities=13% Similarity=0.189 Sum_probs=22.7 Q ss_pred CeeeeEEccCCCeEEEEEcCCEEE Q psy14259 23 YLQHISWAPVDNALAFVYNRDVYY 46 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~nnly~ 46 (79) .+....|||+|.+|||+.+..||+ T Consensus 449 ~Issl~wSpDG~RiA~i~~g~v~V 472 (591) T PRK13616 449 PISELQLSRDGVRAAMIIGGKVYL 472 (591) T ss_pred CcCeEEECCCCCEEEEEECCEEEE Confidence 488999999999999999999998 No 28 >TIGR02800 propeller_TolB tol-pal system beta propeller repeat protein TolB. The Tol-PAL system is required for bacterial outer membrane integrity. E. coli TolB is involved in the tonB-independent uptake of group A colicins (colicins A, E1, E2, E3 and K), and is necessary for the colicins to reach their respective targets after initial binding to the bacteria. It is also involved in uptake of filamentous DNA. Study of its structure suggest that the TolB protein might be involved in the recycling of peptidoglycan or in its covalent linking with lipoproteins. The Tol-Pal system is also implicated in pathogenesis of E. coli, Haemophilus ducreyi, Salmonella enterica and Vibrio cholerae, but the mechanism(s) is unclear. Probab=87.53 E-value=1.7 Score=32.12 Aligned_cols=37 Identities=22% Similarity=0.284 Sum_probs=26.8 Q ss_pred eeeeEEccCCCeEEEEEc----CCEEEEcCCCCCCeEEeccC Q psy14259 24 LQHISWAPVDNALAFVYN----RDVYYSPSATLQDIYRLSNT 61 (79) Q Consensus 24 ~q~a~wsP~g~~lafV~~----nnly~~~~~~~~~~~~lT~d 61 (79) ...+.|||+|+.|+|... .+||+.+. .++...++|.. 
T Consensus 236 ~~~~~~spDg~~l~~~~~~~~~~~i~~~d~-~~~~~~~l~~~ 276 (417) T TIGR02800 236 NGAPAFSPDGSKLAVSLSKDGNPDIYVMDL-DGKQLTRLTNG 276 (417) T ss_pred ccceEECCCCCEEEEEECCCCCccEEEEEC-CCCCEEECCCC Confidence 445899999999999864 35999874 34446677653 No 29 >PF08662 eIF2A: Eukaryotic translation initiation factor eIF2A; InterPro: IPR013979 This entry contains beta propellor domains found in eukaryotic translation initiation factors and TolB domain-containing proteins. Probab=87.31 E-value=1.2 Score=30.73 Aligned_cols=26 Identities=23% Similarity=0.579 Sum_probs=20.4 Q ss_pred CeeeeEEccCCCeEEEEEcC---CEEEEc Q psy14259 23 YLQHISWAPVDNALAFVYNR---DVYYSP 48 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~n---nly~~~ 48 (79) .+..++|||+|+.+|.+.+. .+-+.+ T Consensus 61 ~I~~~~WsP~g~~favi~g~~~~~v~lyd 89 (194) T PF08662_consen 61 PIHDVAWSPNGNEFAVIYGSMPAKVTLYD 89 (194) T ss_pred ceEEEEECcCCCEEEEEEccCCcccEEEc Confidence 39999999999999999753 455554 No 30 >PRK01742 tolB translocation protein TolB; Provisional Probab=86.29 E-value=1.4 Score=33.69 Aligned_cols=33 Identities=15% Similarity=0.058 Sum_probs=25.8 Q ss_pred eeEEccCCCeEEEEEcCCEEEEcCCCCCCeEEec Q psy14259 26 HISWAPVDNALAFVYNRDVYYSPSATLQDIYRLS 59 (79) Q Consensus 26 ~a~wsP~g~~lafV~~nnly~~~~~~~~~~~~lT 59 (79) .+.|||+|+.|+|+..++|+..+. ..+...++| T Consensus 337 ~~~~SpDG~~ia~~~~~~i~~~Dl-~~g~~~~lt 369 (429) T PRK01742 337 SAQISADGKTLVMINGDNVVKQDL-TSGSTEVLS 369 (429) T ss_pred CccCCCCCCEEEEEcCCCEEEEEC-CCCCeEEec Confidence 468999999999999999999863 444455565 No 31 >PF10647 Gmad1: Lipoprotein LpqB beta-propeller domain; InterPro: IPR018910 The Gmad1 domain is found associated with IPR019606 from INTERPRO, in bacterial spore formation. It is predicted to have a beta-propeller fold and to have a passive binding role rather than a catalytic function owing to the low number of conserved hydrophilic residues. 
Probab=84.82 E-value=2.6 Score=30.32 Aligned_cols=26 Identities=19% Similarity=0.275 Sum_probs=23.7 Q ss_pred CeeeeEEccCCCeEEEEE----cCCEEEEc Q psy14259 23 YLQHISWAPVDNALAFVY----NRDVYYSP 48 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~----~nnly~~~ 48 (79) .+..+.+||+|.++|+|. +..||+.- T Consensus 113 ~I~~l~vSpDG~RvA~v~~~~~~~~v~va~ 142 (253) T PF10647_consen 113 RITALRVSPDGTRVAVVVEDGGGGRVYVAG 142 (253) T ss_pred ceEEEEECCCCcEEEEEEecCCCCeEEEEE Confidence 688999999999999999 89999873 No 32 >TIGR02171 Fb_sc_TIGR02171 Fibrobacter succinogenes paralogous family TIGR02171. This model describes a paralogous family of the rumen bacterium Fibrobacter succinogenes. Eleven members are found in Fibrobacter succinogenes S85, averaging over 900 amino acids in length. More than half are predicted lipoproteins. The function is unknown. Probab=82.68 E-value=3 Score=36.17 Aligned_cols=30 Identities=20% Similarity=0.211 Sum_probs=22.3 Q ss_pred CCeeeeEEccCCCeEEE-EE-cC-----CEEEEcCCC Q psy14259 22 PYLQHISWAPVDNALAF-VY-NR-----DVYYSPSAT 51 (79) Q Consensus 22 ~~~q~a~wsP~g~~laf-V~-~n-----nly~~~~~~ 51 (79) ..+-.+.|||+|+.||| |. ++ .||+++..+ T Consensus 350 ~~i~sP~~SPDG~~vAY~ts~e~~~g~s~vYv~~L~t 386 (912) T TIGR02171 350 ISVYHPDISPDGKKVAFCTGIEGLPGKSSVYVRNLNA 386 (912) T ss_pred CceecCcCCCCCCEEEEEEeecCCCCCceEEEEehhc Confidence 34778999999999999 42 22 499987543 No 33 >PF12566 DUF3748: Protein of unknown function (DUF3748); InterPro: IPR022223 This domain family is found in bacteria and eukaryotes, and is approximately 120 amino acids in length. Probab=77.97 E-value=3.4 Score=27.86 Aligned_cols=25 Identities=24% Similarity=0.301 Sum_probs=19.8 Q ss_pred CeeeeEEccCCCeEEEEEcCCEEEE Q psy14259 23 YLQHISWAPVDNALAFVYNRDVYYS 47 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~nnly~~ 47 (79) ...--.|||+|+.|+|+++..+--. 
T Consensus 69 GtHvHvfSpDG~~lSFTYNDhVmhe 93 (122) T PF12566_consen 69 GTHVHVFSPDGSWLSFTYNDHVMHE 93 (122) T ss_pred CccceEECCCCCEEEEEecchhhcc Confidence 3445689999999999999887543 No 34 >PF14583 Pectate_lyase22: Oligogalacturonate lyase; PDB: 3C5M_C 3PE7_A. Probab=76.35 E-value=2.2 Score=33.56 Aligned_cols=38 Identities=18% Similarity=0.362 Sum_probs=21.7 Q ss_pred CeeeeEEccCCCeEEEEEc----CCEEEEcCCCCCCeEEeccC Q psy14259 23 YLQHISWAPVDNALAFVYN----RDVYYSPSATLQDIYRLSNT 61 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~----nnly~~~~~~~~~~~~lT~d 61 (79) ..-.-.|.++|++|.|.-+ .|||..+ ..+.+.+|||.. T Consensus 37 YF~~~~ft~dG~kllF~s~~dg~~nly~lD-L~t~~i~QLTdg 78 (386) T PF14583_consen 37 YFYQNCFTDDGRKLLFASDFDGNRNLYLLD-LATGEITQLTDG 78 (386) T ss_dssp -TTS--B-TTS-EEEEEE-TTSS-EEEEEE-TTT-EEEE---S T ss_pred eecCCCcCCCCCEEEEEeccCCCcceEEEE-cccCEEEECccC Confidence 3445689999999999876 6799997 456679999963 No 35 >PF04762 IKI3: IKI3 family; InterPro: IPR006849 Members of this family are components of the elongator multi-subunit component of a novel RNA polymerase II holoenzyme for transcriptional elongation []. Probab=75.88 E-value=3.7 Score=35.33 Aligned_cols=40 Identities=20% Similarity=0.299 Sum_probs=29.4 Q ss_pred cceeecccCCCCCCCeeeeEEccCCCeEEEEEc-CCEEEEc Q psy14259 9 RAEKQQNINDHEAPYLQHISWAPVDNALAFVYN-RDVYYSP 48 (79) Q Consensus 9 ~~~~~~~~~~~~~~~~q~a~wsP~g~~lafV~~-nnly~~~ 48 (79) .+.......+.-+..+..|+|||++..||+|.+ ++|.+.. T Consensus 108 ~~~~~~E~VG~vd~GI~a~~WSPD~Ella~vT~~~~l~~mt 148 (928) T PF04762_consen 108 PDEDEIEIVGSVDSGILAASWSPDEELLALVTGEGNLLLMT 148 (928) T ss_pred CCCceeEEEEEEcCcEEEEEECCCcCEEEEEeCCCEEEEEe Confidence 333444444555667999999999999999985 5777775 No 36 >KOG1523|consensus Probab=74.73 E-value=4.7 Score=31.59 Aligned_cols=44 Identities=18% Similarity=0.381 Sum_probs=30.7 Q ss_pred ecCCccceeecccCCCCCCCeeeeEEccCCCeEEEE-EcCCEEEEcC Q psy14259 4 SRLPIRAEKQQNINDHEAPYLQHISWAPVDNALAFV-YNRDVYYSPS 49 (79) Q Consensus 4 ~~~~~~~~~~~~~~~~~~~~~q~a~wsP~g~~lafV-~~nnly~~~~ 49 (79) +|+|-|..+.+.. 
..-.-+....|||+|+.|||| +|.-+++.+. T Consensus 191 sk~PFG~lm~E~~--~~ggwvh~v~fs~sG~~lawv~Hds~v~~~da 235 (361) T KOG1523|consen 191 SKMPFGQLMSEAS--SSGGWVHGVLFSPSGNRLAWVGHDSTVSFVDA 235 (361) T ss_pred cCCcHHHHHHhhc--cCCCceeeeEeCCCCCEeeEecCCCceEEeec Confidence 4555555555443 222237788999999999999 7888888863 No 37 >PF00930 DPPIV_N: Dipeptidyl peptidase IV (DPP IV) N-terminal region; InterPro: IPR002469 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. 
In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain defines serine peptidases belonging to MEROPS peptidase family S9 (clan SC), subfamily S9B (dipeptidyl-peptidase IV). The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. This domain is an alignment of the region to the N-terminal side of the active site, which is found in IPR001375 from INTERPRO.
CD26 (3.4.14.5 from EC) is also called adenosine deaminase-binding protein (ADA-binding protein) or dipeptidylpeptidase IV (DPP IV ectoenzyme). The exopeptidase cleaves off N-terminal X-Pro or X-Ala dipeptides from polypeptides (dipeptidyl peptidase IV activity). CD26 serves as the costimulatory molecule in T cell activation and is an associated marker of autoimmune diseases, adenosine deaminase-deficiency and HIV pathogenesis. Dipeptidyl peptidase IV (DPP IV) is responsible for the removal of N-terminal dipeptides sequentially from polypeptides having unsubstituted N termini, provided that the penultimate residue is proline. The enzyme catalyses the reaction:Dipeptidyl-Polypeptide + H(2)O = Dipeptide + Polypeptide It is a type II membrane protein that forms a homodimer. CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://prow.nci.nih.gov/). ; GO: 0006508 proteolysis, 0016020 membrane; PDB: 2RIP_A 3Q8W_B 2AJL_I 1TKR_B 1TK3_B 3C45_A 2G5P_A 3G0C_D 1R9M_C 1RWQ_A .... Probab=74.14 E-value=2 Score=31.93 Aligned_cols=19 Identities=26% Similarity=0.544 Sum_probs=14.6 Q ss_pred eeeEEccCCCeEEEEEcCC Q psy14259 25 QHISWAPVDNALAFVYNRD 43 (79) Q Consensus 25 q~a~wsP~g~~lafV~~nn 43 (79) ...-|||+|++|||.+-|+ T Consensus 104 ~~~~WSpd~~~la~~~~d~ 122 (353) T PF00930_consen 104 SAVWWSPDSKYLAFLRFDE 122 (353) T ss_dssp BSEEE-TTSSEEEEEEEE- T ss_pred cceEECCCCCEEEEEEECC Confidence 4567999999999998765 No 38 >PF04053 Coatomer_WDAD: Coatomer WD associated region ; InterPro: IPR006692 Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. 
Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer []. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins []. For example, the coatomer COP1 (coat protein complex 1) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi []. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and for budding from Golgi membranes []. Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated to ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least an alpha, beta, beta', gamma, delta, epsilon and zeta subunits. This entry represents the WD-associated region found in coatomer subunits alpha, beta and beta' subunits. The alpha-subunit (RET1P) of the coatomer complex in Saccharomyces cerevisiae (Baker's yeast), participates in membrane transport between the endoplasmic reticulum and Golgi apparatus. The protein contains six WD-40 repeat motifs in its N-terminal region []. More information about these proteins can be found at Protein of the Month: Clathrin [].; GO: 0005198 structural molecule activity, 0006886 intracellular protein transport, 0016192 vesicle-mediated transport, 0030117 membrane coat; PDB: 3MKQ_B. 
Probab=74.03 E-value=5 Score=31.75 Aligned_cols=25 Identities=16% Similarity=0.402 Sum_probs=19.9 Q ss_pred eeeeEEccCCCeEEEEEcCCEEEEc Q psy14259 24 LQHISWAPVDNALAFVYNRDVYYSP 48 (79) Q Consensus 24 ~q~a~wsP~g~~lafV~~nnly~~~ 48 (79) +.++.||++|+.+|++.++.+|+.+ T Consensus 147 vk~V~Ws~~g~~val~t~~~i~il~ 171 (443) T PF04053_consen 147 VKYVIWSDDGELVALVTKDSIYILK 171 (443) T ss_dssp -EEEEE-TTSSEEEEE-S-SEEEEE T ss_pred CcEEEEECCCCEEEEEeCCeEEEEE Confidence 6889999999999999999999986 No 39 >KOG1524|consensus Probab=69.82 E-value=7.7 Score=32.57 Aligned_cols=35 Identities=26% Similarity=0.480 Sum_probs=28.8 Q ss_pred CeeeeEEccCCCeEEEEEcCCEEEEcCCCCCCeEE Q psy14259 23 YLQHISWAPVDNALAFVYNRDVYYSPSATLQDIYR 57 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~nnly~~~~~~~~~~~~ 57 (79) .+.-++|.|+++.+.|..+..+|+.+.....+.+| T Consensus 147 ~v~c~~W~p~S~~vl~c~g~h~~IKpL~~n~k~i~ 181 (737) T KOG1524|consen 147 SIRCARWAPNSNSIVFCQGGHISIKPLAANSKIIR 181 (737) T ss_pred eeEEEEECCCCCceEEecCCeEEEeecccccceeE Confidence 36778999999999999999999998655444444 No 40 >cd00200 WD40 WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; 
residues on the top and botto Probab=68.26 E-value=26 Score=22.36 Aligned_cols=28 Identities=11% Similarity=0.029 Sum_probs=23.4 Q ss_pred CCeeeeEEccCCCeEEEEE-cCCEEEEcC Q psy14259 22 PYLQHISWAPVDNALAFVY-NRDVYYSPS 49 (79) Q Consensus 22 ~~~q~a~wsP~g~~lafV~-~nnly~~~~ 49 (79) ..+..+.|+|+++.|+... ++.|++.+. T Consensus 10 ~~i~~~~~~~~~~~l~~~~~~g~i~i~~~ 38 (289) T cd00200 10 GGVTCVAFSPDGKLLATGSGDGTIKVWDL 38 (289) T ss_pred CCEEEEEEcCCCCEEEEeecCcEEEEEEe Confidence 4588999999999999987 778888763 No 41 >PF08662 eIF2A: Eukaryotic translation initiation factor eIF2A; InterPro: IPR013979 This entry contains beta propellor domains found in eukaryotic translation initiation factors and TolB domain-containing proteins. Probab=68.17 E-value=5.6 Score=27.41 Aligned_cols=21 Identities=24% Similarity=0.374 Sum_probs=16.7 Q ss_pred CCCCeeeeEEccCCCeEEEEE Q psy14259 20 EAPYLQHISWAPVDNALAFVY 40 (79) Q Consensus 20 ~~~~~q~a~wsP~g~~lafV~ 40 (79) +....-.+.|||+|++|+-.. T Consensus 142 ~~~~~t~~~WsPdGr~~~ta~ 162 (194) T PF08662_consen 142 EHSDATDVEWSPDGRYLATAT 162 (194) T ss_pred ccCcEEEEEEcCCCCEEEEEE Confidence 444578899999999988764 No 42 >KOG1445|consensus Probab=65.49 E-value=10 Score=32.68 Aligned_cols=31 Identities=16% Similarity=0.237 Sum_probs=25.3 Q ss_pred CCCeeeeEEccCCCeEEEE-EcCCEEEEcCCCC Q psy14259 21 APYLQHISWAPVDNALAFV-YNRDVYYSPSATL 52 (79) Q Consensus 21 ~~~~q~a~wsP~g~~lafV-~~nnly~~~~~~~ 52 (79) ...+-.++|||+|+.+|-| +|..|.|.. 
+.+ T Consensus 720 tdqIf~~AWSpdGr~~AtVcKDg~~rVy~-Prs 751 (1012) T KOG1445|consen 720 TDQIFGIAWSPDGRRIATVCKDGTLRVYE-PRS 751 (1012) T ss_pred cCceeEEEECCCCcceeeeecCceEEEeC-CCC Confidence 3457788999999999998 789999986 444 No 43 >PF15492 Nbas_N: Neuroblastoma-amplified sequence, N terminal Probab=65.27 E-value=10 Score=28.80 Aligned_cols=36 Identities=14% Similarity=0.325 Sum_probs=26.8 Q ss_pred ecccCCCCCCCeeeeEEccCCCeEEEEEc-CCEEEEc Q psy14259 13 QQNINDHEAPYLQHISWAPVDNALAFVYN-RDVYYSP 48 (79) Q Consensus 13 ~~~~~~~~~~~~q~a~wsP~g~~lafV~~-nnly~~~ 48 (79) +-..+....++.+..+|||++..|||.+. ..|++.+ T Consensus 35 kcqVpkD~~PQWRkl~WSpD~tlLa~a~S~G~i~vfd 71 (282) T PF15492_consen 35 KCQVPKDPNPQWRKLAWSPDCTLLAYAESTGTIRVFD 71 (282) T ss_pred EEecCCCCCchheEEEECCCCcEEEEEcCCCeEEEEe Confidence 33456677778999999999999999876 3444443 No 44 >PF00400 WD40: WD domain, G-beta repeat; InterPro: IPR019781 WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase [, ]. 
In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.; PDB: 2ZKQ_a 3CFV_B 3CFS_B 1PEV_A 1NR0_A 1VYH_T 3RFH_A 3O2Z_T 3FRX_C 3U5G_g .... Probab=63.48 E-value=16 Score=17.76 Aligned_cols=26 Identities=23% Similarity=0.305 Sum_probs=18.8 Q ss_pred CCeeeeEEccCCCeEEEEEc-CCEEEE Q psy14259 22 PYLQHISWAPVDNALAFVYN-RDVYYS 47 (79) Q Consensus 22 ~~~q~a~wsP~g~~lafV~~-nnly~~ 47 (79) ..+....|+|+++.|+=.-. +.|.+. T Consensus 12 ~~i~~i~~~~~~~~~~s~~~D~~i~vw 38 (39) T PF00400_consen 12 SSINSIAWSPDGNFLASGSSDGTIRVW 38 (39) T ss_dssp SSEEEEEEETTSSEEEEEETTSEEEEE T ss_pred CcEEEEEEecccccceeeCCCCEEEEE Confidence 45888999999888877644 555554 No 45 >PRK13616 lipoprotein LpqB; Provisional Probab=63.29 E-value=26 Score=28.69 Aligned_cols=36 Identities=8% Similarity=0.121 Sum_probs=26.4 Q ss_pred CeeeeEEccCCCeEEEEE-------c--CCEEEEcCCCCCCeEEecc Q psy14259 23 YLQHISWAPVDNALAFVY-------N--RDVYYSPSATLQDIYRLSN 60 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~-------~--nnly~~~~~~~~~~~~lT~ 60 (79) .+..++.||+|+.+|||. + ..||+.+. . +...++|. T Consensus 351 ~vsspaiSpdG~~vA~v~~~~~~~~d~~s~Lwv~~~-g-g~~~~lt~ 395 (591) T PRK13616 351 NITSAALSRSGRQVAAVVTLGRGAPDPASSLWVGPL-G-GVAVQVLE 395 (591) T ss_pred CcccceECCCCCEEEEEEeecCCCCCcceEEEEEeC-C-Ccceeeec Confidence 567889999999999999 2 36888863 2 23456654 No 46 >TIGR03866 PQQ_ABC_repeats PQQ-dependent catabolism-associated beta-propeller protein. Members of this protein family consist of seven repeats each of the YVTN family beta-propeller repeat (see TIGR02276). Members occur invariably as part of a transport operon that is associated with PQQ-dependent catabolism of alcohols such as phenylethanol. Probab=61.80 E-value=24 Score=23.77 Aligned_cols=40 Identities=10% Similarity=0.101 Sum_probs=25.6 Q ss_pred eeeeEEccCCCeEEEE--EcCCEEEEcCCCCCCe-EEeccCCCc Q psy14259 24 LQHISWAPVDNALAFV--YNRDVYYSPSATLQDI-YRLSNTGSE 64 (79) Q Consensus 24 ~q~a~wsP~g~~lafV--~~nnly~~~~~~~~~~-~~lT~dG~~ 64 (79) .-...|+|+|+.|+.. .++.|.+++. ...+. 
.++..++.+ T Consensus 251 ~~~~~~~~~g~~l~~~~~~~~~i~v~d~-~~~~~~~~~~~~~~~ 293 (300) T TIGR03866 251 VWQLAFTPDEKYLLTTNGVSNDVSVIDV-AALKVIKSIKVGRLP 293 (300) T ss_pred cceEEECCCCCEEEEEcCCCCeEEEEEC-CCCcEEEEEEccccc Confidence 3456899999988765 3688999974 34333 455444333 No 47 >KOG4497|consensus Probab=60.40 E-value=11 Score=30.07 Aligned_cols=46 Identities=22% Similarity=0.345 Sum_probs=31.1 Q ss_pred CCCCCeeeeEEccCCCeEEEEEcCCE------------EEEcCCC-CCCeEEeccCCCc Q psy14259 19 HEAPYLQHISWAPVDNALAFVYNRDV------------YYSPSAT-LQDIYRLSNTGSE 64 (79) Q Consensus 19 ~~~~~~q~a~wsP~g~~lafV~~nnl------------y~~~~~~-~~~~~~lT~dG~~ 64 (79) .++..+..+.|||+|++|.-..+.++ |+.+-+. ..+.+.++.||.- T Consensus 89 eg~agls~~~WSPdgrhiL~tseF~lriTVWSL~t~~~~~~~~pK~~~kg~~f~~dg~f 147 (447) T KOG4497|consen 89 EGQAGLSSISWSPDGRHILLTSEFDLRITVWSLNTQKGYLLPHPKTNVKGYAFHPDGQF 147 (447) T ss_pred cCCCcceeeeECCCcceEeeeecceeEEEEEEeccceeEEecccccCceeEEECCCCce Confidence 45667999999999988887776664 3333222 2345677788865 No 48 >PF14583 Pectate_lyase22: Oligogalacturonate lyase; PDB: 3C5M_C 3PE7_A. Probab=54.42 E-value=44 Score=26.46 Aligned_cols=54 Identities=17% Similarity=-0.009 Sum_probs=30.4 Q ss_pred eecCCccceeecccCCCCCCCeeeeEEccCCCeEEEEEc-CCEEEEcCCCCCCeEEec Q psy14259 3 ESRLPIRAEKQQNINDHEAPYLQHISWAPVDNALAFVYN-RDVYYSPSATLQDIYRLS 59 (79) Q Consensus 3 ~~~~~~~~~~~~~~~~~~~~~~q~a~wsP~g~~lafV~~-nnly~~~~~~~~~~~~lT 59 (79) +..+.++..+||...+... ..-..+||+++++.||++ .+|+-.+. 
...+...|- T Consensus 64 ~lDL~t~~i~QLTdg~g~~--~~g~~~s~~~~~~~Yv~~~~~l~~vdL-~T~e~~~vy 118 (386) T PF14583_consen 64 LLDLATGEITQLTDGPGDN--TFGGFLSPDDRALYYVKNGRSLRRVDL-DTLEERVVY 118 (386) T ss_dssp EEETTT-EEEE---SS-B---TTT-EE-TTSSEEEEEETTTEEEEEET-TT--EEEEE T ss_pred EEEcccCEEEECccCCCCC--ccceEEecCCCeEEEEECCCeEEEEEC-CcCcEEEEE Confidence 3467788888888655322 336889999999999986 47887764 333344443 No 49 >KOG1445|consensus Probab=51.98 E-value=20 Score=31.01 Aligned_cols=41 Identities=10% Similarity=0.122 Sum_probs=33.1 Q ss_pred CCCCCeeeeEEccCCCeEEEE-EcCCEEEEcCCCCCCeEEec Q psy14259 19 HEAPYLQHISWAPVDNALAFV-YNRDVYYSPSATLQDIYRLS 59 (79) Q Consensus 19 ~~~~~~q~a~wsP~g~~lafV-~~nnly~~~~~~~~~~~~lT 59 (79) ....++|.|.||-+|.-||-- +|.+|-+.+...+.+.+|.| T Consensus 168 ~h~d~vQSa~WseDG~llatscKdkqirifDPRa~~~piQ~t 209 (1012) T KOG1445|consen 168 GHTDKVQSADWSEDGKLLATSCKDKQIRIFDPRASMEPIQTT 209 (1012) T ss_pred CCchhhhccccccCCceEeeecCCcceEEeCCccCCCccccc Confidence 355679999999999988764 78899988755567788888 No 50 >KOG1920|consensus Probab=49.05 E-value=28 Score=31.45 Aligned_cols=42 Identities=14% Similarity=0.266 Sum_probs=31.3 Q ss_pred ccceeecccCCCCCCCeeeeEEccCCCeEEEEEc-CCEEEEcC Q psy14259 8 IRAEKQQNINDHEAPYLQHISWAPVDNALAFVYN-RDVYYSPS 49 (79) Q Consensus 8 ~~~~~~~~~~~~~~~~~q~a~wsP~g~~lafV~~-nnly~~~~ 49 (79) +++.......+.-+..+..|.|||++..+|++.+ .+|++.+. T Consensus 96 d~et~~~eivg~vd~GI~aaswS~Dee~l~liT~~~tll~mT~ 138 (1265) T KOG1920|consen 96 DPETLELEIVGNVDNGISAASWSPDEELLALITGRQTLLFMTK 138 (1265) T ss_pred cccccceeeeeeccCceEEEeecCCCcEEEEEeCCcEEEEEec Confidence 3344444455556667999999999999999988 77777754 No 51 >PF12657 TFIIIC_delta: Transcription factor IIIC subunit delta N-term; InterPro: IPR024761 This entry represents a domain found towards the N terminus of the 90 kDa subunit of transcription factor IIIC (also known as subunit 9 in yeast []). The whole subunit is involved in RNA polymerase III-mediated transcription. 
It is possible that this N-terminal domain interacts with TFIIIC subunit 8 []. Probab=48.84 E-value=22 Score=24.02 Aligned_cols=23 Identities=17% Similarity=0.258 Sum_probs=20.3 Q ss_pred eeeeEEccCCCeEEEEEcCCEEEE Q psy14259 24 LQHISWAPVDNALAFVYNRDVYYS 47 (79) Q Consensus 24 ~q~a~wsP~g~~lafV~~nnly~~ 47 (79) ....+||.+| .||.+.+..|++. T Consensus 7 ~~~l~WS~Dg-~laV~t~~~v~IL 29 (173) T PF12657_consen 7 PNALAWSEDG-QLAVATGESVHIL 29 (173) T ss_pred CcCeeECCCC-CEEEEcCCeEEEE Confidence 4567999988 7999999999998 No 52 >PF04841 Vps16_N: Vps16, N-terminal region; InterPro: IPR006926 This protein forms part of the Class C vacuolar protein sorting (Vps) complex. Vps16 is essential for vacuolar protein sorting, which is essential for viability in plants, but not yeast []. The Class C Vps complex is required for SNARE-mediated membrane fusion at the lysosome-like yeast vacuole. It is thought to play essential roles in membrane docking and fusion at the Golgi-to-endosome and endosome-to-vacuole stages of transport []. The role of VPS16 in this complex is not known.; GO: 0006886 intracellular protein transport, 0005737 cytoplasm Probab=46.54 E-value=75 Score=24.54 Aligned_cols=28 Identities=14% Similarity=0.235 Sum_probs=23.2 Q ss_pred CCeeeeEEccCCCeEEEEEc-CCEEEEcC Q psy14259 22 PYLQHISWAPVDNALAFVYN-RDVYYSPS 49 (79) Q Consensus 22 ~~~q~a~wsP~g~~lafV~~-nnly~~~~ 49 (79) ..+...+.||+|+.||+..+ ++|++... T Consensus 217 ~~i~~iavSpng~~iAl~t~~g~l~v~ss 245 (410) T PF04841_consen 217 GPIIKIAVSPNGKFIALFTDSGNLWVVSS 245 (410) T ss_pred CCeEEEEECCCCCEEEEEECCCCEEEEEC Confidence 35889999999999999977 67887754 No 53 >KOG1274|consensus Probab=45.92 E-value=42 Score=29.48 Aligned_cols=35 Identities=9% Similarity=0.224 Sum_probs=25.8 Q ss_pred cccCCCCCCCeeeeEEccCCCeEEEE-EcCCEEEEc Q psy14259 14 QNINDHEAPYLQHISWAPVDNALAFV-YNRDVYYSP 48 (79) Q Consensus 14 ~~~~~~~~~~~q~a~wsP~g~~lafV-~~nnly~~~ 48 (79) ..........+...+|||+|.+||-. 
.+|.|-+++ T Consensus 225 ~Lr~~~~ss~~~~~~wsPnG~YiAAs~~~g~I~vWn 260 (933) T KOG1274|consen 225 KLRDKLSSSKFSDLQWSPNGKYIAASTLDGQILVWN 260 (933) T ss_pred eecccccccceEEEEEcCCCcEEeeeccCCcEEEEe Confidence 33444455558899999999999988 566677775 No 54 >KOG1332|consensus Probab=42.22 E-value=83 Score=24.14 Aligned_cols=40 Identities=15% Similarity=0.073 Sum_probs=27.8 Q ss_pred CCCCeeeeEEccCCCeEEEEEc-CCEEEEcCCCCCCeEEec Q psy14259 20 EAPYLQHISWAPVDNALAFVYN-RDVYYSPSATLQDIYRLS 59 (79) Q Consensus 20 ~~~~~q~a~wsP~g~~lafV~~-nnly~~~~~~~~~~~~lT 59 (79) -...+-.+.||+.|+.||--.+ |++++......++=++|+ T Consensus 255 f~~~~w~vSWS~sGn~LaVs~GdNkvtlwke~~~Gkw~~v~ 295 (299) T KOG1332|consen 255 FPDVVWRVSWSLSGNILAVSGGDNKVTLWKENVDGKWEEVG 295 (299) T ss_pred CCcceEEEEEeccccEEEEecCCcEEEEEEeCCCCcEEEcc Confidence 3344678899999999999876 556666544445555655 No 55 >KOG2055|consensus Probab=42.21 E-value=90 Score=25.70 Aligned_cols=56 Identities=18% Similarity=0.241 Sum_probs=39.3 Q ss_pred CCCCeeeeEEccCCCeEEEEEcCCEEEE--cCCC---------------CCCeEEeccCCCceeEecccceee Q psy14259 20 EAPYLQHISWAPVDNALAFVYNRDVYYS--PSAT---------------LQDIYRLSNTGSEVVSNGVPDWLY 75 (79) Q Consensus 20 ~~~~~q~a~wsP~g~~lafV~~nnly~~--~~~~---------------~~~~~~lT~dG~~~i~nG~~DWvY 75 (79) ....+|-+.|+|+|+..+|+....-|+. +... +-+...|..+|+-..++|..-|+| T Consensus 256 ~~fPi~~a~f~p~G~~~i~~s~rrky~ysyDle~ak~~k~~~~~g~e~~~~e~FeVShd~~fia~~G~~G~I~ 328 (514) T KOG2055|consen 256 EKFPIQKAEFAPNGHSVIFTSGRRKYLYSYDLETAKVTKLKPPYGVEEKSMERFEVSHDSNFIAIAGNNGHIH 328 (514) T ss_pred ccCccceeeecCCCceEEEecccceEEEEeeccccccccccCCCCcccchhheeEecCCCCeEEEcccCceEE Confidence 4556899999999999999888775543 2111 011235667888788899988886 No 56 >PF08955 BofC_C: BofC C-terminal domain; InterPro: IPR015050 The C-terminal domain of the bacterial protein, bypass of forespore C (BofC), contains a three-stranded beta-sheet and three alpha-helices. The exact function is unknown []. ; PDB: 2BW2_A. 
Probab=41.56 E-value=12 Score=23.24 Aligned_cols=18 Identities=22% Similarity=0.436 Sum_probs=13.9 Q ss_pred eEEeccCCCceeEecccc Q psy14259 55 IYRLSNTGSEVVSNGVPD 72 (79) Q Consensus 55 ~~~lT~dG~~~i~nG~~D 72 (79) ..=||.||.-++|+|.|+ T Consensus 12 YfGi~~dG~LslF~G~P~ 29 (75) T PF08955_consen 12 YFGISEDGVLSLFEGPPG 29 (75) T ss_dssp -EEEETTTEEEEBSSS-S T ss_pred eEEEcCCCcEEEEecCCC Confidence 567888998899999985 No 57 >KOG2139|consensus Probab=41.41 E-value=31 Score=27.70 Aligned_cols=19 Identities=26% Similarity=0.590 Sum_probs=16.8 Q ss_pred CeeeeEEccCCCeEEEEEc Q psy14259 23 YLQHISWAPVDNALAFVYN 41 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~ 41 (79) .+|.|.|||.|..|.|+.- T Consensus 282 rvqtacWspcGsfLLf~~s 300 (445) T KOG2139|consen 282 RVQTACWSPCGSFLLFACS 300 (445) T ss_pred ceeeeeecCCCCEEEEEEc Confidence 7999999999999988753 No 58 >KOG2314|consensus Probab=41.03 E-value=17 Score=30.59 Aligned_cols=18 Identities=33% Similarity=0.726 Sum_probs=15.8 Q ss_pred CCeeeeEEccCCCeEEEE Q psy14259 22 PYLQHISWAPVDNALAFV 39 (79) Q Consensus 22 ~~~q~a~wsP~g~~lafV 39 (79) +.++.+.|||.++-|||= T Consensus 347 ~gIr~FswsP~~~llAYw 364 (698) T KOG2314|consen 347 SGIRDFSWSPTSNLLAYW 364 (698) T ss_pred ccccCcccCCCcceEEEE Confidence 358999999999999985 No 59 >KOG0291|consensus Probab=41.03 E-value=44 Score=29.10 Aligned_cols=31 Identities=10% Similarity=0.005 Sum_probs=26.4 Q ss_pred CCCeeeeEEccCCCeEEEEEcCCEEEEcCCC Q psy14259 21 APYLQHISWAPVDNALAFVYNRDVYYSPSAT 51 (79) Q Consensus 21 ~~~~q~a~wsP~g~~lafV~~nnly~~~~~~ 51 (79) +..++..+|||+|..+|-..+|-|=++..+. 
T Consensus 96 k~~v~~i~fSPng~~fav~~gn~lqiw~~P~ 126 (893) T KOG0291|consen 96 KRGVGAIKFSPNGKFFAVGCGNLLQIWHAPG 126 (893) T ss_pred cCccceEEECCCCcEEEEEecceeEEEecCc Confidence 4468899999999999999999998886554 No 60 >KOG1407|consensus Probab=40.84 E-value=44 Score=25.79 Aligned_cols=25 Identities=16% Similarity=0.337 Sum_probs=18.4 Q ss_pred eeeeEEccCCCeEEEEEc-CCEEEEc Q psy14259 24 LQHISWAPVDNALAFVYN-RDVYYSP 48 (79) Q Consensus 24 ~q~a~wsP~g~~lafV~~-nnly~~~ 48 (79) =-++.|||.|++++++-. +.|-..+ T Consensus 109 ni~i~wsp~g~~~~~~~kdD~it~id 134 (313) T KOG1407|consen 109 NINITWSPDGEYIAVGNKDDRITFID 134 (313) T ss_pred ceEEEEcCCCCEEEEecCcccEEEEE Confidence 357899999999999954 4444443 No 61 >PF10647 Gmad1: Lipoprotein LpqB beta-propeller domain; InterPro: IPR018910 The Gmad1 domain is found associated with IPR019606 from INTERPRO, in bacterial spore formation. It is predicted to have a beta-propeller fold and to have a passive binding role rather than a catalytic function owing to the low number of conserved hydrophilic residues. Probab=40.43 E-value=73 Score=22.79 Aligned_cols=28 Identities=21% Similarity=0.285 Sum_probs=22.9 Q ss_pred CCCeeeeEEccCCCeEEEEE----cCCEEEEc Q psy14259 21 APYLQHISWAPVDNALAFVY----NRDVYYSP 48 (79) Q Consensus 21 ~~~~q~a~wsP~g~~lafV~----~nnly~~~ 48 (79) ...+..+++||+|+.+|||. ...||+.. T Consensus 23 ~~~~~s~AvS~dg~~~A~v~~~~~~~~L~~~~ 54 (253) T PF10647_consen 23 GYDVTSPAVSPDGSRVAAVSEGDGGRSLYVGP 54 (253) T ss_pred CccccceEECCCCCeEEEEEEcCCCCEEEEEc Confidence 34688999999999999999 35577775 No 62 >KOG0286|consensus Probab=39.17 E-value=71 Score=24.97 Aligned_cols=42 Identities=7% Similarity=0.092 Sum_probs=34.0 Q ss_pred CCCCeeeeEEccCCCeEEEE-EcCCEEEEcCCCCCCeEEeccC Q psy14259 20 EAPYLQHISWAPVDNALAFV-YNRDVYYSPSATLQDIYRLSNT 61 (79) Q Consensus 20 ~~~~~q~a~wsP~g~~lafV-~~nnly~~~~~~~~~~~~lT~d 61 (79) ...++..+.|++++++|+-- .|.-|.+++..+.++.++|+-. 
T Consensus 54 H~~Ki~~~~ws~Dsr~ivSaSqDGklIvWDs~TtnK~haipl~ 96 (343) T KOG0286|consen 54 HLNKIYAMDWSTDSRRIVSASQDGKLIVWDSFTTNKVHAIPLP 96 (343) T ss_pred cccceeeeEecCCcCeEEeeccCCeEEEEEcccccceeEEecC Confidence 44578899999999988776 6678999988888888888754 No 63 >PF04993 TfoX_N: TfoX N-terminal domain; InterPro: IPR007076 This domain is found in a number of bacterial proteins including the TfoX gene product of Haemophilus influenzae. TfoX may play a key role in the development of genetic competence by regulating the expression of late competence-specific genes []. This family corresponds to the N-terminal presumed domain of TfoX. The domain is found in association with the C-terminal domain in some, but not all members of this group, suggesting this is an autonomous and functionally unrelated domain.; PDB: 2OD0_A. Probab=37.27 E-value=33 Score=20.95 Aligned_cols=22 Identities=14% Similarity=0.167 Sum_probs=17.1 Q ss_pred EEccCCCeEEEEEcCCEEEEcC Q psy14259 28 SWAPVDNALAFVYNRDVYYSPS 49 (79) Q Consensus 28 ~wsP~g~~lafV~~nnly~~~~ 49 (79) .+-=+|+-+|.|.+|.||++.+ T Consensus 21 g~~~dg~mfa~v~~~~lylR~~ 42 (97) T PF04993_consen 21 GIYVDGKMFALVCDDRLYLRVD 42 (97) T ss_dssp EEEETTEEEEEEETTEEEEE-- T ss_pred EEEECCEEEEEEECCEEEEEeC Confidence 3444899999999999999964 No 64 >KOG0275|consensus Probab=36.24 E-value=18 Score=28.85 Aligned_cols=34 Identities=15% Similarity=0.288 Sum_probs=20.3 Q ss_pred eecCCccceeecccCCCCCCCeeeeEEccCCCeEEE Q psy14259 3 ESRLPIRAEKQQNINDHEAPYLQHISWAPVDNALAF 38 (79) Q Consensus 3 ~~~~~~~~~~~~~~~~~~~~~~q~a~wsP~g~~laf 38 (79) |.++|+.-......+.. 
+...-|.|||+|++|+- T Consensus 197 Ee~~Pt~l~r~IKFg~K--Sh~EcA~FSPDgqyLvs 230 (508) T KOG0275|consen 197 EERYPTQLARSIKFGQK--SHVECARFSPDGQYLVS 230 (508) T ss_pred hhhchHHhhhheecccc--cchhheeeCCCCceEee Confidence 33455444444333332 33667899999998874 No 65 >KOG2107|consensus Probab=35.65 E-value=39 Score=24.18 Aligned_cols=36 Identities=28% Similarity=0.384 Sum_probs=29.2 Q ss_pred EEEEcCCEEEEcCCCCCCeEEeccCCCc-----eeEecccceee Q psy14259 37 AFVYNRDVYYSPSATLQDIYRLSNTGSE-----VVSNGVPDWLY 75 (79) Q Consensus 37 afV~~nnly~~~~~~~~~~~~lT~dG~~-----~i~nG~~DWvY 75 (79) .||.-++|++.+. +-.+|.|.+.+. .+|-|.|-|.. T Consensus 118 i~vekGDlivlPa---GiyHRFTtt~~n~vkamRlF~~~p~wta 158 (179) T KOG2107|consen 118 IFVEKGDLIVLPA---GIYHRFTTTPSNYVKAMRLFVGEPKWTA 158 (179) T ss_pred EEEecCCEEEecC---cceeeeecCchHHHHHHHHhcCCccccc Confidence 6889999999963 448899998876 78889998864 No 66 >COG3386 Gluconolactonase [Carbohydrate transport and metabolism] Probab=34.68 E-value=85 Score=23.62 Aligned_cols=23 Identities=9% Similarity=0.141 Sum_probs=15.3 Q ss_pred eeEEccCCCeEEEEEc--CCEEEEc Q psy14259 26 HISWAPVDNALAFVYN--RDVYYSP 48 (79) Q Consensus 26 ~a~wsP~g~~lafV~~--nnly~~~ 48 (79) -.+|||+|+.|-++-- |-||-.. T Consensus 167 Gla~SpDg~tly~aDT~~~~i~r~~ 191 (307) T COG3386 167 GLAFSPDGKTLYVADTPANRIHRYD 191 (307) T ss_pred ceEECCCCCEEEEEeCCCCeEEEEe Confidence 3589999986655533 6666664 No 67 >KOG2106|consensus Probab=34.65 E-value=62 Score=27.09 Aligned_cols=28 Identities=18% Similarity=0.255 Sum_probs=22.9 Q ss_pred CCCeeeeEEccCCCeEEEE-EcCCEEEEc Q psy14259 21 APYLQHISWAPVDNALAFV-YNRDVYYSP 48 (79) Q Consensus 21 ~~~~q~a~wsP~g~~lafV-~~nnly~~~ 48 (79) .+.+.-+++||+|..||-- +||-||+.. 
T Consensus 447 ~~~ls~v~ysp~G~~lAvgs~d~~iyiy~ 475 (626) T KOG2106|consen 447 NEQLSVVRYSPDGAFLAVGSHDNHIYIYR 475 (626) T ss_pred CCceEEEEEcCCCCEEEEecCCCeEEEEE Confidence 5668999999999999987 566677664 No 68 >PF01436 NHL: NHL repeat; InterPro: IPR001258 The NHL repeat, named after NCL-1, HT2A and Lin-41, is found in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family which catalyse the C terminus alpha-amidation of biological peptides []. In many it occurs in tandem arrays, for example in the ringfinger beta-box, coiled-coil (RBCC) eukaryotic growth regulators []. The 'Brain Tumor' protein (Brat) is one such growth regulator that contains a 6-bladed NHL-repeat beta-propeller [, ]. The NHL repeats are also found in serine/threonine protein kinase (STPK) in a diverse range of pathogenic bacteria. These STPK are transmembrane receptors with an intracellular N-terminal kinase domain and extracellular C-terminal sensor domain. In the STPK, PknD, from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed beta-propeller composed of NHL repeats with a flexible tether to the transmembrane domain.; GO: 0005515 protein binding; PDB: 3FVZ_A 3FW0_A 1RWL_A 1RWI_A 1Q7F_A. Probab=34.24 E-value=61 Score=15.57 Aligned_cols=15 Identities=20% Similarity=0.352 Sum_probs=11.1 Q ss_pred eEEEEEcCCEEEEcC Q psy14259 35 ALAFVYNRDVYYSPS 49 (79) Q Consensus 35 ~lafV~~nnly~~~~ 49 (79) .|+.-.+++||+.+.
T Consensus 6 gvav~~~g~i~VaD~ 20 (28) T PF01436_consen 6 GVAVDSDGNIYVADS 20 (28) T ss_dssp EEEEETTSEEEEEEC T ss_pred EEEEeCCCCEEEEEC Confidence 356668899999874 No 69 >KOG2314|consensus Probab=33.19 E-value=48 Score=28.07 Aligned_cols=29 Identities=14% Similarity=0.141 Sum_probs=23.8 Q ss_pred CCCCeeeeEEccCCCeEEEEEcCCEEEEc Q psy14259 20 EAPYLQHISWAPVDNALAFVYNRDVYYSP 48 (79) Q Consensus 20 ~~~~~q~a~wsP~g~~lafV~~nnly~~~ 48 (79) +..---+++|||.|.+|+-.|..-|-++- T Consensus 209 enWTetyv~wSP~GTYL~t~Hk~GI~lWG 237 (698) T KOG2314|consen 209 ENWTETYVRWSPKGTYLVTFHKQGIALWG 237 (698) T ss_pred hcceeeeEEecCCceEEEEEeccceeeec Confidence 34445689999999999999999888874 No 70 >KOG1007|consensus Probab=33.03 E-value=81 Score=24.80 Aligned_cols=31 Identities=13% Similarity=0.293 Sum_probs=25.9 Q ss_pred CeeeeEEccCCCeEEEEEcCCEEEEcCCCCC Q psy14259 23 YLQHISWAPVDNALAFVYNRDVYYSPSATLQ 53 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~nnly~~~~~~~~ 53 (79) ++.-..|-|++..+|-+.+|+|-+.....+. T Consensus 125 ~i~cvew~Pns~klasm~dn~i~l~~l~ess 155 (370) T KOG1007|consen 125 KINCVEWEPNSDKLASMDDNNIVLWSLDESS 155 (370) T ss_pred ceeeEEEcCCCCeeEEeccCceEEEEcccCc Confidence 4667799999999999999999999754443 No 71 >KOG0318|consensus Probab=32.97 E-value=1.3e+02 Score=25.21 Aligned_cols=48 Identities=8% Similarity=0.112 Sum_probs=33.0 Q ss_pred CCccceeecccCCC-CCCCeeeeEEccCCCeEEEE-EcCCEEEEcCCCCC Q psy14259 6 LPIRAEKQQNINDH-EAPYLQHISWAPVDNALAFV-YNRDVYYSPSATLQ 53 (79) Q Consensus 6 ~~~~~~~~~~~~~~-~~~~~q~a~wsP~g~~lafV-~~nnly~~~~~~~~ 53 (79) |=+|.+.+-..... ....++...+||+|..+|=+ .|..+|+.+...+. 
T Consensus 174 ffeGPPFKFk~s~r~HskFV~~VRysPDG~~Fat~gsDgki~iyDGktge 223 (603) T KOG0318|consen 174 FFEGPPFKFKSSFREHSKFVNCVRYSPDGSRFATAGSDGKIYIYDGKTGE 223 (603) T ss_pred EeeCCCeeeeecccccccceeeEEECCCCCeEEEecCCccEEEEcCCCcc Confidence 33455555543333 33468889999999999988 67889999754433 No 72 >KOG4497|consensus Probab=32.71 E-value=46 Score=26.64 Aligned_cols=24 Identities=21% Similarity=0.235 Sum_probs=21.4 Q ss_pred eeEEccCCCeEEEEEcCCEEEEcC Q psy14259 26 HISWAPVDNALAFVYNRDVYYSPS 49 (79) Q Consensus 26 ~a~wsP~g~~lafV~~nnly~~~~ 49 (79) .+.|||+|+++|-..+..|.+++. T Consensus 13 ~c~fSp~g~yiAs~~~yrlviRd~ 36 (447) T KOG4497|consen 13 FCSFSPCGNYIASLSRYRLVIRDS 36 (447) T ss_pred ceeECCCCCeeeeeeeeEEEEecc Confidence 689999999999999998888863 No 73 >PF03079 ARD: ARD/ARD' family; InterPro: IPR004313 The two acireductone dioxygenase enzymes (ARD and ARD', previously known as E-2 and E-2') from Klebsiella pneumoniae share the same amino acid sequence Q9ZFE7 from SWISSPROT, but bind different metal ions: ARD binds Ni2+, ARD' binds Fe2+ []. ARD and ARD' can be experimentally interconverted by removal of the bound metal ion and reconstitution with the appropriate metal ion. The two enzymes share the same substrate, 1,2-dihydroxy-3-keto-5-(methylthio)pentene, but yield different products. ARD' yields the alpha-keto precursor of methionine (and formate), thus forming part of the ubiquitous methionine salvage pathway that converts 5'-methylthioadenosine (MTA) to methionine. This pathway is responsible for the tight control of the concentration of MTA, which is a powerful inhibitor of polyamine biosynthesis and transmethylation reactions []. ARD yields methylthiopropanoate, carbon monoxide and formate, and thus prevents the conversion of MTA to methionine. The role of the ARD catalysed reaction is unclear: methylthiopropanoate is cytotoxic, and carbon monoxide can activate guanylyl cyclase, leading to increased intracellular cGMP levels [, ]. 
This family also contains other proteins, whose functions are not well characterised.; GO: 0010309 acireductone dioxygenase [iron(II)-requiring] activity, 0055114 oxidation-reduction process; PDB: 1VR3_A 1ZRR_A 2HJI_A. Probab=32.19 E-value=1.5e+02 Score=20.32 Aligned_cols=42 Identities=17% Similarity=0.227 Sum_probs=29.8 Q ss_pred ccCCCeE-EEEEcCCEEEEcCCCCCCeEEeccCCCc-----eeEeccccee Q psy14259 30 APVDNAL-AFVYNRDVYYSPSATLQDIYRLSNTGSE-----VVSNGVPDWL 74 (79) Q Consensus 30 sP~g~~l-afV~~nnly~~~~~~~~~~~~lT~dG~~-----~i~nG~~DWv 74 (79) .+++..+ +.+..+||.+.+. +..++.|-+-++ .+|.+.+-|+ T Consensus 109 ~~~~~wiri~~e~GDli~vP~---g~~HrF~~~~~~~i~aiRlF~~~~gWv 156 (157) T PF03079_consen 109 DGDDVWIRILCEKGDLIVVPA---GTYHRFTLGESPYIKAIRLFKDEPGWV 156 (157) T ss_dssp -TTCEEEEEEEETTCEEEE-T---T--EEEEESTTSSEEEEEEESSCGGEE T ss_pred cCCCEEEEEEEcCCCEEecCC---CCceeEEcCCCCcEEEEEeecCCCCcc Confidence 5566667 7888899998863 458888877665 7888888887 No 74 >PF10584 Proteasome_A_N: Proteasome subunit A N-terminal signature; InterPro: IPR000426 The proteasome (or macropain) (3.4.25.1 from EC) [, , , , ] is a eukaryotic and archaeal multicatalytic proteinase complex that seems to be involved in an ATP/ubiquitin-dependent nonlysosomal proteolytic pathway. In eukaryotes the proteasome is composed of about 28 distinct subunits which form a highly ordered ring-shaped structure (20S ring) of about 700 kDa. Most proteasome subunits can be classified, on the basis on sequence similarities into two groups, alpha (A) and beta (B). This family contains the alpha subunit sequences which range from 210 to 290 amino acids. These sequences are classified as non-peptidase homologues in MEROPS peptidase family T1 (clan PB(T)). ; GO: 0004175 endopeptidase activity, 0006511 ubiquitin-dependent protein catabolic process, 0019773 proteasome core complex, alpha-subunit complex; PDB: 3H4P_M 1IRU_O 3UN4_U 1FNT_A 3OEV_G 3OEU_U 3SDK_U 3DY3_G 3MG7_G 3L5Q_C .... 
Probab=31.95 E-value=20 Score=17.48 Aligned_cols=9 Identities=11% Similarity=0.353 Sum_probs=6.7 Q ss_pred eEEccCCCe Q psy14259 27 ISWAPVDNA 35 (79) Q Consensus 27 a~wsP~g~~ 35 (79) ..|||+|+- T Consensus 6 t~FSp~Grl 14 (23) T PF10584_consen 6 TTFSPDGRL 14 (23) T ss_dssp TSBBTTSSB T ss_pred eeECCCCeE Confidence 468998863 No 75 >KOG0771|consensus Probab=30.79 E-value=51 Score=26.33 Aligned_cols=40 Identities=13% Similarity=0.086 Sum_probs=30.1 Q ss_pred CCCCeeeeEEccCCCeEEEEEcCCEEEEcCCCCCCeEEec Q psy14259 20 EAPYLQHISWAPVDNALAFVYNRDVYYSPSATLQDIYRLS 59 (79) Q Consensus 20 ~~~~~q~a~wsP~g~~lafV~~nnly~~~~~~~~~~~~lT 59 (79) ....+-...|||+|+.||++-.+-..|++...+....+.| T Consensus 185 ~~~eV~DL~FS~dgk~lasig~d~~~VW~~~~g~~~a~~t 224 (398) T KOG0771|consen 185 HHAEVKDLDFSPDGKFLASIGADSARVWSVNTGAALARKT 224 (398) T ss_pred hcCccccceeCCCCcEEEEecCCceEEEEeccCchhhhcC Confidence 3345888999999999999988888888754443334666 No 76 >KOG0973|consensus Probab=29.73 E-value=84 Score=27.77 Aligned_cols=30 Identities=20% Similarity=0.371 Sum_probs=25.2 Q ss_pred CCCCeeeeEEccCCCeEEEE-EcCCEEEEcC Q psy14259 20 EAPYLQHISWAPVDNALAFV-YNRDVYYSPS 49 (79) Q Consensus 20 ~~~~~q~a~wsP~g~~lafV-~~nnly~~~~ 49 (79) .+..++...|||++..||-+ .||-+.+++. T Consensus 128 H~~DV~Dv~Wsp~~~~lvS~s~DnsViiwn~ 158 (942) T KOG0973|consen 128 HDSDVLDVNWSPDDSLLVSVSLDNSVIIWNA 158 (942) T ss_pred CCCccceeccCCCccEEEEecccceEEEEcc Confidence 45579999999999999999 6788888863 No 77 >PF09865 DUF2092: Predicted periplasmic protein (DUF2092); InterPro: IPR019207 This entry represents various hypothetical prokaryotic proteins of unknown function. Probab=28.85 E-value=1.3e+02 Score=21.65 Aligned_cols=46 Identities=17% Similarity=0.156 Sum_probs=31.8 Q ss_pred EccCCCeEEEEEcCCEEEEcCCCCCCeEEeccCCCc--eeEecccceeee Q psy14259 29 WAPVDNALAFVYNRDVYYSPSATLQDIYRLSNTGSE--VVSNGVPDWLYQ 76 (79) Q Consensus 29 wsP~g~~lafV~~nnly~~~~~~~~~~~~lT~dG~~--~i~nG~~DWvYe 76 (79) -.++|..|.|....+|.+.. +..- ...++.|+.. 
.+|+|..=.+|- T Consensus 30 v~~~gqklq~~~~~~v~v~R-Pdkl-r~~~~gd~~~~~~~yDGkt~Tl~~ 77 (214) T PF09865_consen 30 VTPDGQKLQFSSSGTVTVQR-PDKL-RIDRRGDGADREFYYDGKTFTLYD 77 (214) T ss_pred ecCCCceEEEEEEEEEEEeC-CCeE-EEEEEcCCcceEEEECCCEEEEEc Confidence 45789999999999999985 4332 2233344433 889998876664 No 78 >PF05796 Chordopox_G2: Chordopoxvirus protein G2; InterPro: IPR008446 This family consists of several Chordopoxvirus isatin-beta-thiosemicarbazone dependent protein (protein G2) sequences. Inactivation of the gene coding for this protein renders the virus dependent upon isatin-beta-thiosemicarbazone (IBT) for growth []. Probab=28.52 E-value=48 Score=24.39 Aligned_cols=22 Identities=23% Similarity=0.338 Sum_probs=18.5 Q ss_pred eEEccCCCeEEEEEcCCEEEEc Q psy14259 27 ISWAPVDNALAFVYNRDVYYSP 48 (79) Q Consensus 27 a~wsP~g~~lafV~~nnly~~~ 48 (79) .+..++++.+|+|+|++||+.. T Consensus 94 ykL~~~m~Giaivk~~~V~v~~ 115 (216) T PF05796_consen 94 YKLPASMKGIAIVKDRNVYVRR 115 (216) T ss_pred hcCCcccCcEEEEcCCEEEEEc Confidence 3556788999999999999974 No 79 >PF11061 DUF2862: Protein of unknown function (DUF2862); InterPro: IPR021291 This family of proteins has no known function. 
Probab=28.36 E-value=27 Score=21.02 Aligned_cols=15 Identities=27% Similarity=0.652 Sum_probs=12.4 Q ss_pred eeEecccceeeeecC Q psy14259 65 VVSNGVPDWLYQARE 79 (79) Q Consensus 65 ~i~nG~~DWvYeEEf 79 (79) ..-||..-|.+|+|+ T Consensus 48 ~~~ng~~~WFFedEi 62 (64) T PF11061_consen 48 EFSNGSRTWFFEDEI 62 (64) T ss_pred EecCCceeEEchhhc Confidence 445899999999985 No 80 >COG4946 Uncharacterized protein related to the periplasmic component of the Tol biopolymer transport system [Function unknown] Probab=28.35 E-value=1.4e+02 Score=25.09 Aligned_cols=43 Identities=14% Similarity=0.343 Sum_probs=29.7 Q ss_pred CCCCCCeeeeEEccCCCeEEEEEcCC-----EEEEcCCCCCCeEEeccC Q psy14259 18 DHEAPYLQHISWAPVDNALAFVYNRD-----VYYSPSATLQDIYRLSNT 61 (79) Q Consensus 18 ~~~~~~~q~a~wsP~g~~lafV~~nn-----ly~~~~~~~~~~~~lT~d 61 (79) ..+...+-.+.|||++..+||-.--- |-+.+ ..+++..++|+. T Consensus 440 kS~~~lItdf~~~~nsr~iAYafP~gy~tq~Iklyd-m~~~Kiy~vTT~ 487 (668) T COG4946 440 KSEYGLITDFDWHPNSRWIAYAFPEGYYTQSIKLYD-MDGGKIYDVTTP 487 (668) T ss_pred ccccceeEEEEEcCCceeEEEecCcceeeeeEEEEe-cCCCeEEEecCC Confidence 44455678899999999999986533 33443 345567788864 No 81 >KOG1539|consensus Probab=28.15 E-value=91 Score=27.41 Aligned_cols=27 Identities=22% Similarity=0.283 Sum_probs=22.6 Q ss_pred CeeeeEEccCCCeEEEEEc--CCEEEEcC Q psy14259 23 YLQHISWAPVDNALAFVYN--RDVYYSPS 49 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~--nnly~~~~ 49 (79) ..-...+||+|+.||=++- |.||++.+ T Consensus 619 ~~~sls~SPngD~LAT~Hvd~~gIylWsN 647 (910) T KOG1539|consen 619 PCTSLSFSPNGDFLATVHVDQNGIYLWSN 647 (910) T ss_pred cceeeEECCCCCEEEEEEecCceEEEEEc Confidence 4667899999999998864 88999975 No 82 >cd01782 AF6_RA_repeat1 Ubiquitin domain of AT-6, first repeat. The AF-6 protein (also known as afadin and canoe) is a multidomain cell junction protein that contains two N-terminal Ras-associating (RA) domains in addition to FHA (forkhead-associated), DIL (class V myosin homology region), and PDZ domains and a proline-rich region. 
AF6 acts downstream of the Egfr (Epidermal Growth Factor-receptor)/Ras signalling pathway and provides a link from Egfr to cytoskeletal elements. Probab=27.99 E-value=1.4e+02 Score=19.91 Aligned_cols=35 Identities=23% Similarity=0.425 Sum_probs=26.4 Q ss_pred CccceeecccCCCCCCCeeeeEEccCCCeEEEEEcCC Q psy14259 7 PIRAEKQQNINDHEAPYLQHISWAPVDNALAFVYNRD 43 (79) Q Consensus 7 ~~~~~~~~~~~~~~~~~~q~a~wsP~g~~lafV~~nn 43 (79) +.|+.+++. +.+.|.+.-.-|+|+.....||-.|+ T Consensus 78 ~nGe~RKL~--d~E~PL~~RL~w~~~dre~~FvLk~~ 112 (112) T cd01782 78 ENGEERRLL--DDEKPLVVQLNWHKDDREGRFLLKND 112 (112) T ss_pred cCCceEEcC--CcCCCeEEeeccCCCCceeEEEeccC Confidence 345555555 66777788889999999999997664 No 83 >PRK10115 protease 2; Provisional Probab=27.51 E-value=74 Score=26.39 Aligned_cols=27 Identities=11% Similarity=0.144 Sum_probs=22.2 Q ss_pred CeeeeEEccCCCeEEEEEcCC------EEEEcC Q psy14259 23 YLQHISWAPVDNALAFVYNRD------VYYSPS 49 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~nn------ly~~~~ 49 (79) .+..+.|||+|+.|||..+.+ ||+++. T Consensus 128 ~l~~~~~Spdg~~la~~~d~~G~E~~~l~v~d~ 160 (686) T PRK10115 128 TLGGMAITPDNTIMALAEDFLSRRQYGIRFRNL 160 (686) T ss_pred EEeEEEECCCCCEEEEEecCCCcEEEEEEEEEC Confidence 466789999999999998754 888864 No 84 >KOG1538|consensus Probab=27.36 E-value=1.3e+02 Score=26.54 Aligned_cols=26 Identities=12% Similarity=0.231 Sum_probs=24.0 Q ss_pred CeeeeEEccCCCeEEEEEcCCEEEEc Q psy14259 23 YLQHISWAPVDNALAFVYNRDVYYSP 48 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~nnly~~~ 48 (79) -+.+.+|.|+|..|....++.||+.+ T Consensus 14 ci~d~afkPDGsqL~lAAg~rlliyD 39 (1081) T KOG1538|consen 14 CINDIAFKPDGTQLILAAGSRLLVYD 39 (1081) T ss_pred chheeEECCCCceEEEecCCEEEEEe Confidence 47789999999999999999999996 No 85 >smart00320 WD40 WD40 repeats. Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain. 
Probab=27.08 E-value=62 Score=13.44 Aligned_cols=17 Identities=18% Similarity=0.286 Sum_probs=12.0 Q ss_pred CeeeeEEccCCCeEEEE Q psy14259 23 YLQHISWAPVDNALAFV 39 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV 39 (79) .+..+.|+|.++.++.+ T Consensus 14 ~i~~~~~~~~~~~~~~~ 30 (40) T smart00320 14 PVTSVAFSPDGKYLASA 30 (40) T ss_pred ceeEEEECCCCCEEEEe Confidence 46777888877666555 No 86 >PF09142 TruB_C: tRNA Pseudouridine synthase II, C terminal; InterPro: IPR015225 Pseudouridine synthases catalyse the isomerisation of uridine to pseudouridine (Psi) in a variety of RNA molecules, and may function as RNA chaperones. Pseudouridine is the most abundant modified nucleotide found in all cellular RNAs. There are four distinct families of pseudouridine synthases that share no global sequence similarity, but which do share the same fold of their catalytic domain(s) and uracil-binding site and are descended from a common molecular ancestor. The catalytic domain consists of two subdomains, each of which has an alpha+beta structure that has some similarity to the ferredoxin-like fold (note: some pseudouridine synthases contain additional domains). The active site is the most conserved structural region of the superfamily and is located between the two homologous domains. These families are []: Pseudouridine synthase I, TruA. Pseudouridine synthase II, TruB, which contains an additional C-terminal PUA domain. Pseudouridine synthase RsuA (ribosomal small subunit) and RluC/RluD (ribosomal large subunits), both of which contain an additional N-terminal alpha-L RNA-binding motif. Pseudouridine synthase TruD, which has a natural circular permutation in the catalytic domain, as well as an insertion of a family-specific alpha+beta subdomain. TruB is responsible for the pseudouridine residue present in the T loops of virtually all tRNAs. TruB recognises the preformed 3-D structure of the T loop primarily through shape complementarity.
It accesses its substrate uridyl residue by flipping out the nucleotide and disrupts the tertiary structure of tRNA []. The C-terminal domain adopts a secondary structure consisting of a four-stranded beta sheet and one alpha helix, similar to that found in PUA domains. It is predominantly involved in RNA-binding, being mostly found in tRNA pseudouridine synthase B (TruB) []. ; GO: 0003723 RNA binding, 0009982 pseudouridine synthase activity, 0001522 pseudouridine synthesis, 0009451 RNA modification; PDB: 1SGV_B. Probab=25.47 E-value=75 Score=18.05 Aligned_cols=19 Identities=21% Similarity=0.256 Sum_probs=13.2 Q ss_pred eeeeEEccCCCeEEEEEcC Q psy14259 24 LQHISWAPVDNALAFVYNR 42 (79) Q Consensus 24 ~q~a~wsP~g~~lafV~~n 42 (79) -..+.++|+|+.||-+.+. T Consensus 27 g~~aa~~pdG~lvAL~~~~ 45 (56) T PF09142_consen 27 GPVAAFAPDGRLVALLEER 45 (56) T ss_dssp S-EEEE-TTS-EEEEEEEE T ss_pred ceEEEECCCCcEEEEEEcc Confidence 4688999999999988653 No 87 >KOG2111|consensus Probab=25.21 E-value=1.1e+02 Score=24.11 Aligned_cols=40 Identities=13% Similarity=0.016 Sum_probs=33.0 Q ss_pred ccceeecccCCCCCCCeeeeEEccCCCeEEEEEc-CCEEEE Q psy14259 8 IRAEKQQNINDHEAPYLQHISWAPVDNALAFVYN-RDVYYS 47 (79) Q Consensus 8 ~~~~~~~~~~~~~~~~~q~a~wsP~g~~lafV~~-nnly~~ 47 (79) +|+..+...++.....++-.+|||++..||--.| .-|++. T Consensus 213 ~g~~l~E~RRG~d~A~iy~iaFSp~~s~LavsSdKgTlHiF 253 (346) T KOG2111|consen 213 DGTLLQELRRGVDRADIYCIAFSPNSSWLAVSSDKGTLHIF 253 (346) T ss_pred CCcEeeeeecCCchheEEEEEeCCCccEEEEEcCCCeEEEE Confidence 5667777888899999999999999999998877 445554 No 88 >KOG2315|consensus Probab=25.07 E-value=1.2e+02 Score=25.42 Aligned_cols=20 Identities=25% Similarity=0.669 Sum_probs=17.2 Q ss_pred CeeeeEEccCCCeEEEEEcC Q psy14259 23 YLQHISWAPVDNALAFVYNR 42 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~n 42 (79) .+..++|||+|.-++-|++. 
T Consensus 272 PVhdv~W~~s~~EF~VvyGf 291 (566) T KOG2315|consen 272 PVHDVTWSPSGREFAVVYGF 291 (566) T ss_pred CceEEEECCCCCEEEEEEec Confidence 48999999999988888763 No 89 >PF02897 Peptidase_S9_N: Prolyl oligopeptidase, N-terminal beta-propeller domain; InterPro: IPR004106 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. 
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents the beta-propeller domain found at the N-terminal of prolyl oligopeptidase, including acylamino-acid-releasing enzyme (also known as acylaminoacyl peptidase), which belong to the MEROPS peptidase family S9 (clan SC), subfamily S9A. The prolyl oligopeptidase family consists of a number of evolutionarily related peptidases whose catalytic activity seems to be provided by a charge relay system similar to that of the trypsin family of serine proteases, but which evolved by independent convergent evolution.
The N-terminal domain of prolyl oligopeptidases forms an unusual 7-bladed beta-propeller consisting of seven 4-stranded beta-sheet motifs. Prolyl oligopeptidase is a large cytosolic enzyme involved in the maturation and degradation of peptide hormones and neuropeptides, which relate to the induction of amnesia. The enzyme contains a peptidase domain, where its catalytic triad (Ser554, His680, Asp641) is covered by the central tunnel of the N-terminal beta-propeller domain. In this way, large structured peptides are excluded from the active site, thereby protecting larger peptides and proteins from proteolysis in the cytosol []. The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. Mammalian acylaminoacyl peptidase is an exopeptidase that is a member of the same prolyl oligopeptidase family of serine peptidases. This enzyme removes acylated amino acid residues from the N terminus of oligopeptides [].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 2BKL_B 3DDU_A 1YR2_A 2XE4_A 1VZ3_A 3EQ9_A 1O6F_A 3EQ7_A 4AN0_A 1UOP_A .... Probab=25.00 E-value=83 Score=23.42 Aligned_cols=26 Identities=19% Similarity=0.162 Sum_probs=19.4 Q ss_pred eeeeEEccCCCeEEEEEc--CC----EEEEcC Q psy14259 24 LQHISWAPVDNALAFVYN--RD----VYYSPS 49 (79) Q Consensus 24 ~q~a~wsP~g~~lafV~~--nn----ly~~~~ 49 (79) +....+||+|+.|||..+ ++ |++.+. T Consensus 126 ~~~~~~Spdg~~la~~~s~~G~e~~~l~v~Dl 157 (414) T PF02897_consen 126 LGGFSVSPDGKRLAYSLSDGGSEWYTLRVFDL 157 (414) T ss_dssp EEEEEETTTSSEEEEEEEETTSSEEEEEEEET T ss_pred eeeeeECCCCCEEEEEecCCCCceEEEEEEEC Confidence 346789999999999954 33 777753 No 90 >KOG2394|consensus Probab=24.10 E-value=62 Score=27.21 Aligned_cols=22 Identities=18% Similarity=0.341 Sum_probs=18.0 Q ss_pred eeeeEEccCCCeEEEEEcCCEE Q psy14259 24 LQHISWAPVDNALAFVYNRDVY 45 (79) Q Consensus 24 ~q~a~wsP~g~~lafV~~nnly 45 (79) +--..|||+|++|+--=+.||.
T Consensus 335 LLCvcWSPDGKyIvtGGEDDLV 356 (636) T KOG2394|consen 335 LLCVCWSPDGKYIVTGGEDDLV 356 (636) T ss_pred eEEEEEcCCccEEEecCCcceE Confidence 5567999999999887777765 No 91 >KOG1354|consensus Probab=23.58 E-value=61 Score=25.97 Aligned_cols=26 Identities=23% Similarity=0.572 Sum_probs=23.1 Q ss_pred CeeeeEEccCCCeEEEEEcCCEEEEc Q psy14259 23 YLQHISWAPVDNALAFVYNRDVYYSP 48 (79) Q Consensus 23 ~~q~a~wsP~g~~lafV~~nnly~~~ 48 (79) ++-..+|.|.-+.+|-..-||||+.. T Consensus 407 kilh~aWhp~en~ia~aatnnlyif~ 432 (433) T KOG1354|consen 407 KILHTAWHPKENSIAVAATNNLYIFQ 432 (433) T ss_pred HHHhhccCCccceeeeeecCceEEec Confidence 56678999999999999999999863 No 92 >PF07433 DUF1513: Protein of unknown function (DUF1513); InterPro: IPR008311 There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. Probab=23.25 E-value=1.3e+02 Score=23.06 Aligned_cols=38 Identities=11% Similarity=0.059 Sum_probs=25.5 Q ss_pred cCCccceeecccCCCCCCCeeeeEEccCCCeEEEEEcCC Q psy14259 5 RLPIRAEKQQNINDHEAPYLQHISWAPVDNALAFVYNRD 43 (79) Q Consensus 5 ~~~~~~~~~~~~~~~~~~~~q~a~wsP~g~~lafV~~nn 43 (79) ..-+|...+....+...-.--++.|||+|+. 
.|+.+|+ T Consensus 34 D~~~g~~~~~~~a~~gRHFyGHg~fs~dG~~-LytTEnd 71 (305) T PF07433_consen 34 DCRTGQLLQRLWAPPGRHFYGHGVFSPDGRL-LYTTEND 71 (305) T ss_pred EcCCCceeeEEcCCCCCEEecCEEEcCCCCE-EEEeccc Confidence 3445666665555555555668899998885 5787877 No 93 >KOG0183|consensus Probab=23.16 E-value=1.3e+02 Score=22.56 Aligned_cols=24 Identities=8% Similarity=-0.003 Sum_probs=18.2 Q ss_pred eeeEEccCC---------------CeEEEEEcCCEEEEc Q psy14259 25 QHISWAPVD---------------NALAFVYNRDVYYSP 48 (79) Q Consensus 25 q~a~wsP~g---------------~~lafV~~nnly~~~ 48 (79) +...|||+| +..++|+++|..+.- T Consensus 7 altvFSPDGhL~QVEYAqEAvrkGstaVgvrg~~~vvlg 45 (249) T KOG0183|consen 7 ALTVFSPDGHLFQVEYAQEAVRKGSTAVGVRGNNCVVLG 45 (249) T ss_pred ceEEECCCCCEEeeHhHHHHHhcCceEEEeccCceEEEE Confidence 456788877 456899999988773 No 94 >KOG1446|consensus Probab=23.13 E-value=2.4e+02 Score=21.85 Aligned_cols=29 Identities=10% Similarity=0.311 Sum_probs=22.1 Q ss_pred CCCCeeeeEEccCCCeEEEEEcCC-EEEEc Q psy14259 20 EAPYLQHISWAPVDNALAFVYNRD-VYYSP 48 (79) Q Consensus 20 ~~~~~q~a~wsP~g~~lafV~~nn-ly~~~ 48 (79) .....-..+|||+|+.|.-..+++ +|+.+ T Consensus 186 ~~~ew~~l~FS~dGK~iLlsT~~s~~~~lD 215 (311) T KOG1446|consen 186 DEAEWTDLEFSPDGKSILLSTNASFIYLLD 215 (311) T ss_pred CccceeeeEEcCCCCEEEEEeCCCcEEEEE Confidence 344577889999999998887755 66665 No 95 >PTZ00420 coronin; Provisional Probab=22.52 E-value=2.3e+02 Score=23.36 Aligned_cols=28 Identities=14% Similarity=0.185 Sum_probs=22.8 Q ss_pred CCeeeeEEccCCCeEEEE-EcCCEEEEcC Q psy14259 22 PYLQHISWAPVDNALAFV-YNRDVYYSPS 49 (79) Q Consensus 22 ~~~q~a~wsP~g~~lafV-~~nnly~~~~ 49 (79) ..+..+.|+|+|+.|+-. .|+.|.+++. 
T Consensus 168 ~~V~SlswspdG~lLat~s~D~~IrIwD~ 196 (568) T PTZ00420 168 KKLSSLKWNIKGNLLSGTCVGKHMHIIDP 196 (568) T ss_pred CcEEEEEECCCCCEEEEEecCCEEEEEEC Confidence 347889999999999877 4778999963 No 96 >KOG0315|consensus Probab=22.29 E-value=2.6e+02 Score=21.64 Aligned_cols=34 Identities=18% Similarity=0.189 Sum_probs=27.3 Q ss_pred CCCCCCeeeeEEccCCCeEEEEEc-CCEEEEcCCC Q psy14259 18 DHEAPYLQHISWAPVDNALAFVYN-RDVYYSPSAT 51 (79) Q Consensus 18 ~~~~~~~q~a~wsP~g~~lafV~~-nnly~~~~~~ 51 (79) ++....+|.++..|+|..|+-+.+ +|.|++.... T Consensus 164 Pe~~~~i~sl~v~~dgsml~a~nnkG~cyvW~l~~ 198 (311) T KOG0315|consen 164 PEDDTSIQSLTVMPDGSMLAAANNKGNCYVWRLLN 198 (311) T ss_pred CCCCcceeeEEEcCCCcEEEEecCCccEEEEEccC Confidence 344467999999999999998866 8999997543 No 97 >PHA03078 transcriptional elongation factor; Provisional Probab=22.04 E-value=76 Score=23.43 Aligned_cols=21 Identities=19% Similarity=0.308 Sum_probs=17.7 Q ss_pred EEccCCCeEEEEEcCCEEEEc Q psy14259 28 SWAPVDNALAFVYNRDVYYSP 48 (79) Q Consensus 28 ~wsP~g~~lafV~~nnly~~~ 48 (79) +..+++.-+|+|+|++||+.. T Consensus 92 kL~~~m~Gia~i~~~~V~v~~ 112 (219) T PHA03078 92 KLPNNMKGIAVIKDRNVYVRR 112 (219) T ss_pred cCCcccCceEEEcCCEEEEEc Confidence 445678999999999999985 No 98 >KOG2100|consensus Probab=21.93 E-value=64 Score=27.27 Aligned_cols=21 Identities=29% Similarity=0.498 Sum_probs=16.5 Q ss_pred eeeEEccCCCeEEEEEcCCEE Q psy14259 25 QHISWAPVDNALAFVYNRDVY 45 (79) Q Consensus 25 q~a~wsP~g~~lafV~~nnly 45 (79) ...-|||+|..+||...||-- T Consensus 210 ~a~wwsp~g~~la~~~~~dt~ 230 (755) T KOG2100|consen 210 SAIWWSPDGDRLAYASFNDTK 230 (755) T ss_pred ccceeCCCCceeEEEEecccc Confidence 345689999999998877743 No 99 >smart00415 HSF heat shock factor. 
Probab=21.93 E-value=94 Score=19.45 Aligned_cols=14 Identities=29% Similarity=0.774 Sum_probs=10.7 Q ss_pred eeeEEccCCCeEEE Q psy14259 25 QHISWAPVDNALAF 38 (79) Q Consensus 25 q~a~wsP~g~~laf 38 (79) ..+.|+|+|+.+.- T Consensus 20 ~iI~W~~~G~~f~I 33 (105) T smart00415 20 KIISWSPSGKSFVI 33 (105) T ss_pred CEEEECCCCCEEEE Confidence 36899999987543 No 100 >PF07646 Kelch_2: Kelch motif; InterPro: IPR011498 Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified []. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP [] and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin [, ], and in galactose oxidase from the fungus Dactylium dendroides [, ]. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold []. The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila []. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase []. 
This entry represents a type of kelch sequence motif that comprises one beta-sheet blade.; GO: 0005515 protein binding
Probab=21.75  E-value=1.3e+02  Score=15.71  Aligned_cols=18  Identities=22%  Similarity=0.248  Sum_probs=12.8

Q ss_pred             cCCCeEEEEEcCCEEEEc
Q psy14259         31 PVDNALAFVYNRDVYYSP   48 (79)
Q Consensus        31 P~g~~lafV~~nnly~~~   48 (79)
                      |...+.+-+.++.||+.=
T Consensus         1 ~r~~hs~~~~~~kiyv~G   18 (49)
T PF07646_consen    1 PRYGHSAVVLDGKIYVFG   18 (49)
T ss_pred             CccceEEEEECCEEEEEC
Confidence            344556668899999984

No 101
>PF04053 Coatomer_WDAD: Coatomer WD associated region ; InterPro: IPR006692 Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer []. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins []. For example, the coatomer COP1 (coat protein complex 1) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi []. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and for budding from Golgi membranes []. Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles.
Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated to ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least an alpha, beta, beta', gamma, delta, epsilon and zeta subunits. This entry represents the WD-associated region found in coatomer subunits alpha, beta and beta' subunits. The alpha-subunit (RET1P) of the coatomer complex in Saccharomyces cerevisiae (Baker's yeast), participates in membrane transport between the endoplasmic reticulum and Golgi apparatus. The protein contains six WD-40 repeat motifs in its N-terminal region []. More information about these proteins can be found at Protein of the Month: Clathrin [].; GO: 0005198 structural molecule activity, 0006886 intracellular protein transport, 0016192 vesicle-mediated transport, 0030117 membrane coat; PDB: 3MKQ_B.
Probab=21.69  E-value=2.2e+02  Score=22.60  Aligned_cols=67  Identities=9%  Similarity=0.013  Sum_probs=29.6

Q ss_pred             CCccceeecccCCC--CCCCeeeeEEccCCCeEEEEEcCCEEEEcCCCCCCeEEeccCCCceeEeccccee
Q psy14259          6 LPIRAEKQQNINDH--EAPYLQHISWAPVDNALAFVYNRDVYYSPSATLQDIYRLSNTGSEVVSNGVPDWL   74 (79)
Q Consensus         6 ~~~~~~~~~~~~~~--~~~~~q~a~wsP~g~~lafV~~nnly~~~~~~~~~~~~lT~dG~~~i~nG~~DWv   74 (79)
                      +++|+...+.....  .....|...++|+|+.++-.-++.-.+... ..- .....-.|...++-+..++.
T Consensus        15 ~~dg~~~~l~~k~lg~~~~~p~~ls~npngr~v~V~g~geY~iyt~-~~~-r~k~~G~g~~~vw~~~n~yA   83 (443)
T PF04053_consen   15 IKDGERLPLSVKELGSCEIYPQSLSHNPNGRFVLVCGDGEYEIYTA-LAW-RNKAFGSGLSFVWSSRNRYA   83 (443)
T ss_dssp             --TTS-B----EEEEE-SS--SEEEE-TTSSEEEEEETTEEEEEET-TTT-EEEEEEE-SEEEE-TSSEEE
T ss_pred             cCCCceeeEEeccCCCCCcCCeeEEECCCCCEEEEEcCCEEEEEEc-cCC-cccccCceeEEEEecCccEE
Confidence            34455444332222  233468889999999888877776666652 222 12222333345555544443

No 102
>KOG0319|consensus
Probab=20.87  E-value=3e+02  Score=23.93  Aligned_cols=23  Identities=13%  Similarity=0.222  Sum_probs=19.5

Q ss_pred             eEEccCCCeEEEEEcCCEEEEcC
Q psy14259         27 ISWAPVDNALAFVYNRDVYYSPS   49 (79)
Q Consensus        27 a~wsP~g~~lafV~~nnly~~~~   49 (79)
                      ++||++|+.|+=..+|-|-..+.
T Consensus        25 ~~~s~nG~~L~t~~~d~Vi~idv   47 (775)
T KOG0319|consen   25 VAWSSNGQHLYTACGDRVIIIDV   47 (775)
T ss_pred             eeECCCCCEEEEecCceEEEEEc
Confidence            89999999998888887777753

No 103
>KOG2919|consensus
Probab=20.63  E-value=1.1e+02  Score=24.41  Aligned_cols=29  Identities=14%  Similarity=0.278  Sum_probs=22.1

Q ss_pred             CeeeeEEccCCCe-EEEEEcCCEEEEcCCC
Q psy14259         23 YLQHISWAPVDNA-LAFVYNRDVYYSPSAT   51 (79)
Q Consensus        23 ~~q~a~wsP~g~~-lafV~~nnly~~~~~~   51 (79)
                      .+.-.+|||+|.. |+-+.+|-|-+++.+.
T Consensus        51 f~kgckWSPDGSciL~~sedn~l~~~nlP~   80 (406)
T KOG2919|consen   51 FLKGCKWSPDGSCILSLSEDNCLNCWNLPF   80 (406)
T ss_pred             hhccceeCCCCceEEeecccCeeeEEecCh
Confidence            4667799999974 5677888888886544

No 104
>cd02257 Peptidase_C19 Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin.
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome.
Probab=20.49  E-value=1.4e+02  Score=19.25  Aligned_cols=18  Identities=17%  Similarity=0.296  Sum_probs=14.3

Q ss_pred             cCCCeEEEEEcC--CEEEEc
Q psy14259         31 PVDNALAFVYNR--DVYYSP   48 (79)
Q Consensus        31 P~g~~lafV~~n--nly~~~   48 (79)
                      -.||+++|+++.  +-|+.-
T Consensus       207 ~~GHY~~~~~~~~~~~W~~~  226 (255)
T cd02257         207 DSGHYVAYVKDPSDGKWYKF  226 (255)
T ss_pred             CCcCeEEEEeCCCCCceEEE
Confidence            468999999997  677664

No 105
>COG5169 HSF1 Heat shock transcription factor [Transcription]
Probab=20.43  E-value=95  Score=23.52  Aligned_cols=19  Identities=21%  Similarity=0.609  Sum_probs=14.4

Q ss_pred             CCeeeeEEccCCCeEEEEE
Q psy14259         22 PYLQHISWAPVDNALAFVY   40 (79)
Q Consensus        22 ~~~q~a~wsP~g~~lafV~   40 (79)
                      +.--...|||+|..++...
T Consensus        25 e~~k~I~Ws~~G~sfvI~~   43 (282)
T COG5169          25 EYYKLIQWSPDGRSFVILD   43 (282)
T ss_pred             ccCCceEECCCCCEEEEeC
Confidence            4456789999999976554

No 106
>KOG4659|consensus
Probab=20.15  E-value=1.3e+02  Score=28.40  Aligned_cols=38  Identities=18%  Similarity=0.274  Sum_probs=29.6

Q ss_pred             CeEEEEEcCCEEEEcCCCC--CCeEEeccCCCceeEeccc
Q psy14259         34 NALAFVYNRDVYYSPSATL--QDIYRLSNTGSEVVSNGVP   71 (79)
Q Consensus        34 ~~lafV~~nnly~~~~~~~--~~~~~lT~dG~~~i~nG~~   71 (79)
                      ..||+-++.-||+......  ++.++||+||.-.|+-|.+
T Consensus       596 r~Iavg~~G~lyvaEsD~rriNrvr~~~tdg~i~ilaGa~  635 (1899)
T KOG4659|consen  596 RDIAVGTDGALYVAESDGRRINRVRKLSTDGTISILAGAK  635 (1899)
T ss_pred             hceeecCCceEEEEeccchhhhheEEeccCceEEEecCCC
Confidence            6788888999999875332  5677999999768888854

No 107
>KOG1864|consensus
Probab=20.12  E-value=1.1e+02  Score=25.45  Aligned_cols=23  Identities=22%  Similarity=0.339  Sum_probs=17.3

Q ss_pred             eeEEccC-CCeEEEEEcCCE-EEEc
Q psy14259         26 HISWAPV-DNALAFVYNRDV-YYSP   48 (79)
Q Consensus        26 ~a~wsP~-g~~lafV~~nnl-y~~~   48 (79)
                      +..-+|+ ||++|||+++.. |+.-
T Consensus       519 H~G~~p~~GHYia~~r~~~~nWl~f  543 (587)
T KOG1864|consen  519 HLGSTPNRGHYVAYVKSLDFNWLLF  543 (587)
T ss_pred             eccCCCCCcceEEEEeeCCCCceec
Confidence            3445675 899999999888 6653

No 108
>COG4831 Roadblock/LC7 domain [Function unknown]
Probab=20.08  E-value=76  Score=20.93  Aligned_cols=17  Identities=6%  Similarity=0.155  Sum_probs=13.0

Q ss_pred             eeeeEEccCCCeEEEEE
Q psy14259         24 LQHISWAPVDNALAFVY   40 (79)
Q Consensus        24 ~q~a~wsP~g~~lafV~   40 (79)
                      +..-.|||+|+-++|--
T Consensus        15 ~AAGefs~DGkLv~Ykg   31 (109)
T COG4831          15 MAAGEFSPDGKLVEYKG   31 (109)
T ss_pred             eEeceeCCCCceEEeeC
Confidence            34458999999998843

Done!