RPS-BLAST 2.2.22 [Sep-27-2009] Database: CddA 21,609 sequences; 6,263,737 total letters Searching..................................................done Query= gi|254780747|ref|YP_003065160.1| putative protease IV transmembrane protein [Candidatus Liberibacter asiaticus str. psy62] (293 letters) >gnl|CDD|132934 cd07023, S49_Sppa_N_C, Signal peptide peptidase A (SppA), a serine protease, has catalytic Ser-Lys dyad. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV): SppA is found in all three domains of life and is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. This subfamily contains members with either a single domain (sometimes referred to as 36K type), such as sohB peptidase, protein C and archaeal signal peptide peptidase, or an amino-terminal domain in addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members (sometimes referred to as 67K type), similar to E. coli and Arabidopsis thaliana SppA peptidases. Site-directed mutagenesis and sequence analysis have shown these SppAs to be serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. Interestingly, the single membrane spanning E. coli SppA carries out catalysis using a Ser-Lys dyad with the serine located in the conserved carboxy-terminal protease domain and the lysine in the non-conserved amino-terminal domain. 
Length = 208 Score = 220 bits (562), Expect = 4e-58 Identities = 77/202 (38%), Positives = 127/202 (62%), Gaps = 6/202 (2%) Query: 37 HVARIAIRGQIED-----SQELIERIERISRDDSATALIVSLSSPGGSAYAGEAIFRAIQ 91 +A I I G I D + LIE++ + DDS A+++ ++SPGGS A E I+R I+ Sbjct: 1 KIAVIDIEGTISDGGGIGADSLIEQLRKAREDDSVKAVVLRINSPGGSVVASEEIYREIR 60 Query: 92 KVKNR-KPVITEVHEMAASAGYLISCASNIIVAAETSLVGSIGVLFQYPYVKPFLDKLGV 150 +++ KPV+ + ++AAS GY I+ A++ IVA T++ GSIGV+ Q P ++ LDKLG+ Sbjct: 61 RLRKAKKPVVASMGDVAASGGYYIAAAADKIVANPTTITGSIGVIGQGPNLEELLDKLGI 120 Query: 151 SIKSVKSSPMKAEPSPFSEVNPKAVQMMQDVVDSSYHWFVRLVSESRNIPYDKTLVLSDG 210 ++KS P K + SP + + ++Q +VD Y FV +V+E R + ++ L+DG Sbjct: 121 ERDTIKSGPGKDKGSPDRPLTEEERAILQALVDDIYDQFVDVVAEGRGMSGERLDKLADG 180 Query: 211 RIWTGAEAKKVGLIDVVGGQEE 232 R+WTG +A ++GL+D +GG ++ Sbjct: 181 RVWTGRQALELGLVDELGGLDD 202 >gnl|CDD|30961 COG0616, SppA, Periplasmic serine proteases (ClpP class) [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. Length = 317 Score = 183 bits (465), Expect = 6e-47 Identities = 70/234 (29%), Positives = 122/234 (52%), Gaps = 9/234 (3%) Query: 32 EDNSPHVARIAIRGQIEDS---------QELIERIERISRDDSATALIVSLSSPGGSAYA 82 + S +A I + G I ++ E + D S A+++ ++SPGGS A Sbjct: 55 KRGSKVIAVIHVEGAIVAGGGPLRFIGGDDIEEILRAARADPSVKAVVLRINSPGGSVVA 114 Query: 83 GEAIFRAIQKVKNRKPVITEVHEMAASAGYLISCASNIIVAAETSLVGSIGVLFQYPYVK 142 E I RA+++++ +KPV+ V AAS GY I+ A++ IVA +S+ GSIGV+ P + Sbjct: 115 SELIARALKRLRAKKPVVVSVGGYAASGGYYIALAADKIVADPSSITGSIGVISGAPNFE 174 Query: 143 PFLDKLGVSIKSVKSSPMKAEPSPFSEVNPKAVQMMQDVVDSSYHWFVRLVSESRNIPYD 202 L+KLGV + + + K SPF + + +++Q +D +Y FV V+E R + + Sbjct: 175 ELLEKLGVEKEVITAGEYKDILSPFRPLTEEEREILQKEIDETYDEFVDKVAEGRGLSDE 234 Query: 203 KTLVLSDGRIWTGAEAKKVGLIDVVGGQEEVWQSLYALGVDQSIRKIKDWNPPK 256 L+ GR+WTG +A ++GL+D +GG ++ + L + + + Sbjct: 235 AVDKLATGRVWTGQQALELGLVDELGGLDDAVKDAAELAGVKDVPVVYYLEEKS 288 >gnl|CDD|144805 pfam01343, Peptidase_S49, Peptidase family S49. 
Length = 154 Score = 130 bits (330), Expect = 3e-31 Identities = 51/141 (36%), Positives = 87/141 (61%) Query: 97 KPVITEVHEMAASAGYLISCASNIIVAAETSLVGSIGVLFQYPYVKPFLDKLGVSIKSVK 156 KPV+ AAS GY ++ A++ IVA T++VGSIGV+ Q + LDKLGV I +++ Sbjct: 7 KPVVAYAGNYAASGGYYLASAADKIVANPTTIVGSIGVIMQGLNYEGLLDKLGVKIDTIR 66 Query: 157 SSPMKAEPSPFSEVNPKAVQMMQDVVDSSYHWFVRLVSESRNIPYDKTLVLSDGRIWTGA 216 + K S F + P+ + +Q ++D +Y FV+ V+++RN+ D+ +++GR+WTG Sbjct: 67 AGEYKDAGSLFRPLTPEEREALQRMLDETYQMFVQKVAKNRNLTVDQVDKIAEGRVWTGQ 126 Query: 217 EAKKVGLIDVVGGQEEVWQSL 237 +A + GL+D +G ++ L Sbjct: 127 QAVEAGLVDELGTLDDAIARL 147 >gnl|CDD|132933 cd07022, S49_Sppa_36K_type, Signal peptide peptidase A (SppA) 36K type, a serine protease, has catalytic Ser-Lys dyad. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV) 36K type: SppA is found in all three domains of life and is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Members in this subfamily are all bacterial and include sohB peptidase and protein C. These are sometimes referred to as 36K type since they contain only one domain, unlike E. coli SppA that also contains an amino-terminal domain. Site-directed mutagenesis and sequence analysis have shown these SppAs to be serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. 
Length = 214 Score = 116 bits (293), Expect = 8e-27 Identities = 54/193 (27%), Positives = 93/193 (48%), Gaps = 11/193 (5%) Query: 51 QELIERIERISRDDSATALIVSLSSPGGSAY----AGEAIFRAIQKVKNRKPVITEVHEM 106 + + I D A+++ + SPGG +AI A + KP++ V+ + Sbjct: 28 EGIAAAIRAALADPDVRAIVLDIDSPGGEVAGVFELADAIRAA----RAGKPIVAFVNGL 83 Query: 107 AASAGYLISCASNIIVAAETSLVGSIGVLFQYPYVKPFLDKLGVSIKSVKSSPMKAEPSP 166 AASA Y I+ A++ IV T+ VGSIGV+ + L+K G+ + + + K + +P Sbjct: 84 AASAAYWIASAADRIVVTPTAGVGSIGVVASHVDQSKALEKAGLKVTLIFAGAHKVDGNP 143 Query: 167 FSEVNPKAVQMMQDVVDSSYHWFVRLVSESRNIPYDKTLVLSD-GRIWTGAEAKKVGLID 225 ++ +A +Q VD+ Y FV V+ +R + V + G ++ G EA GL D Sbjct: 144 DEPLSDEARARLQAEVDALYAMFVAAVARNRGLSAAA--VRATEGGVFRGQEAVAAGLAD 201 Query: 226 VVGGQEEVWQSLY 238 VG ++ +L Sbjct: 202 AVGTLDDALAALA 214 >gnl|CDD|132930 cd07019, S49_SppA_1, Signal peptide peptidase A (SppA), a serine protease, has catalytic Ser-Lys dyad. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV): SppAs in this subfamily are found in all three domains of life and are involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Site-directed mutagenesis and sequence analysis have shown these bacterial, archaeal and thylakoid SppAs to be serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad (both residues absolutely conserved within bacteria, chloroplast and mitochondrial signal peptidase family members) and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. In addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members, the E. coli SppA contains an amino-terminal domain, similar to Arabidopsis thaliana SppA1 peptidase. Others, including sohB peptidase, protein C and archaeal signal peptide peptidase, do not contain the amino-terminal domain. 
Interestingly, the single membrane spanning E. coli SppA carries out catalysis using a Ser-Lys dyad with the serine located in the conserved carboxy-terminal protease domain and the lysine in the non-conserved amino-terminal domain. Length = 211 Score = 103 bits (259), Expect = 4e-23 Identities = 58/175 (33%), Positives = 95/175 (54%), Gaps = 2/175 (1%) Query: 55 ERIERISRDDSATALIVSLSSPGGSAYAGEAIFRAIQKVKNR-KPVITEVHEMAASAGYL 113 +I D A+++ ++SPGGS A E I + + KPV+ AAS GY Sbjct: 28 AQIRDARLDPKVKAIVLRVNSPGGSVTASEVIRAELAAARAAGKPVVVSAGGAAASGGYW 87 Query: 114 ISCASNIIVAAETSLVGSIGVLFQYPYVKPFLDKLGVSIKSVKSSPMKAEPSPFSEVNPK 173 IS +N IVA ++L GSIG+ V+ LD +GV V +SP+ A+ S + P+ Sbjct: 88 ISTPANYIVANPSTLTGSIGIFGVITTVENSLDSIGVHTDGVSTSPL-ADVSITRALPPE 146 Query: 174 AVQMMQDVVDSSYHWFVRLVSESRNIPYDKTLVLSDGRIWTGAEAKKVGLIDVVG 228 A +Q +++ Y F+ LV+++R+ ++ ++ G +WTG +AK GL+D +G Sbjct: 147 AQLGLQLSIENGYKRFITLVADARHSTPEQIDKIAQGHVWTGQDAKANGLVDSLG 201 >gnl|CDD|132929 cd07018, S49_SppA_67K_type, Signal peptide peptidase A (SppA) 67K type, a serine protease, has catalytic Ser-Lys dyad. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV) 67K type: SppA is found in all three domains of life and is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Members in this subfamily contain an amino-terminal domain in addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members (sometimes referred to as 67K type), similar to E. coli and Arabidopsis thaliana SppA peptidases. Unlike the eukaryotic functional homologs that are proposed to be aspartic proteases, site-directed mutagenesis and sequence analysis have shown that members in this subfamily, mostly bacterial, are serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain. 
Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad (both residues absolutely conserved within bacteria, chloroplast and mitochondrial signal peptidase family members) and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. Interestingly, the single membrane spanning E. coli SppA carries out catalysis using a Ser-Lys dyad with the serine located in the conserved carboxy-terminal protease domain and the lysine in the non-conserved amino-terminal domain. Length = 222 Score = 91.4 bits (228), Expect = 2e-19 Identities = 46/190 (24%), Positives = 93/190 (48%), Gaps = 4/190 (2%) Query: 51 QELIERIERISRDDSATALIVSLSSPGGSAYAGEAIFRAIQKVKN-RKPVITEVHEMAAS 109 ++L+E +E+ + DD +++ L G E + +A+++ + KPVI + + Sbjct: 32 RDLLEALEKAAEDDRIKGIVLDLDGLSGGLAKLEELRQALERFRASGKPVIAYA-DGYSQ 90 Query: 110 AGYLISCASNIIVAAETSLVGSIGVLFQYPYVKPFLDKLGVSIKSVKSSPMKAEPSPFSE 169 Y ++ A++ I + V G+ + + K LDKLGV ++ + K+ PF+ Sbjct: 91 GQYYLASAADEIYLNPSGSVELTGLSAETLFFKGLLDKLGVEVQVFRVGEYKSAVEPFTR 150 Query: 170 VN--PKAVQMMQDVVDSSYHWFVRLVSESRNIPYDKTLVLSDGRIWTGAEAKKVGLIDVV 227 + P+A + Q ++DS + ++ V+ SR + D L D + EA + GL+D + Sbjct: 151 DDMSPEAREQTQALLDSLWDQYLADVAASRGLSPDALEALIDLGGDSAEEALEAGLVDGL 210 Query: 228 GGQEEVWQSL 237 ++E+ L Sbjct: 211 AYRDELEARL 220 >gnl|CDD|132923 cd00394, Clp_protease_like, Caseinolytic protease (ClpP) is an ATP-dependent protease. Clp protease (caseinolytic protease; ClpP; endopeptidase Clp; Peptidase S14; ATP-dependent protease, ClpAP)-like enzymes are highly conserved serine proteases and belong to the ClpP/Crotonase superfamily. Included in this family are Clp proteases that are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. They are also implicated in the control of cell growth, targeting DNA-binding protein from starved cells. 
The functional Clp protease is comprised of two components: a proteolytic component and one of several regulatory ATPase components, both of which are required for effective levels of protease activity in the presence of ATP. Active site consists of the triad Ser, His and Asp, preferring hydrophobic or non-polar residues at P1 or P1' positions. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function. Another family included in this class of enzymes is the signal peptide peptidase A (SppA; S49) which is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Mutagenesis studies suggest that the catalytic center of SppA comprises a Ser-Lys dyad and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. In addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members, the E. coli SppA contains an amino-terminal domain. Others, including sohB peptidase, protein C, protein 1510-N and archaeal signal peptide peptidase, do not contain the amino-terminal domain. The third family included in this hierarchy is nodulation formation efficiency D (NfeD) which is a membrane-bound Clp-class protease and only found in bacteria and archaea. Majority of the NfeD genomes have been shown to possess operons containing a homologous NfeD/stomatin gene pair, causing NfeD to be previously named stomatin operon partner protein (STOPP). NfeD homologs can be divided into two groups: long and short forms. Long-form homologs have a putative ClpP-class serine protease domain while the short form homologs do not. Downstream from the ClpP-class domain is the so-called NfeD or DUF107 domain. 
N-terminal region of the NfeD homolog PH1510 from Pyrococcus horikoshii has been shown to possess serine protease activity having a Ser-Lys catalytic dyad. Length = 161 Score = 86.3 bits (214), Expect = 1e-17 Identities = 58/190 (30%), Positives = 84/190 (44%), Gaps = 33/190 (17%) Query: 41 IAIRGQIED-SQELIERIERISRDD-SATALIVSLSSPGGSAYAGEAIFRAIQKVKNRKP 98 I I G IED S + + R + D S A+++ +++PGG AG I A+Q RKP Sbjct: 2 IFINGVIEDVSADQLAAQIRFAEADNSVKAIVLEVNTPGGRVDAGMNIVDALQAS--RKP 59 Query: 99 VITEVHEMAASAGYLISCASNIIVAAETSLVGSIGVLFQYPYVKPFLDKLGVSIKSVKSS 158 VI V AASAGY I+ A+N IV A + VGS G + Y Sbjct: 60 VIAYVGGQAASAGYYIATAANKIVMAPGTRVGSHGPIGGYGGNG---------------- 103 Query: 159 PMKAEPSPFSEVNPKAVQMMQDVVDSSYHWFVRLVSESRNIPYDKTLV-LSDGRIWTGAE 217 NP A + Q ++ F+ LV+E+R +K + + T E Sbjct: 104 ------------NPTAQEADQRIILYFIARFISLVAENRGQTTEKLEEDIEKDLVLTAQE 151 Query: 218 AKKVGLIDVV 227 A + GL+D + Sbjct: 152 ALEYGLVDAL 161 >gnl|CDD|132925 cd07014, S49_SppA, Signal peptide peptidase A. Signal peptide peptidase A (SppA; Peptidase S49; Protease IV): SppA is an intramembrane enzyme found in all three domains of life and is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Unlike the eukaryotic functional homologs that are proposed to be aspartic proteases, site-directed mutagenesis and sequence analysis have shown these bacterial, archaeal and thylakoid SppAs to be ClpP-like serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain, cleaving peptide bonds in the plane of the lipid bilayer. Mutagenesis studies also suggest that the catalytic center comprises a Ser-Lys dyad (both residues absolutely conserved within bacteria, chloroplast and mitochondrial signal peptidase family members) and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. 
In addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members, the E. coli SppA contains an amino-terminal domain (sometimes referred to as 67K type). Others, including sohB peptidase, protein C, protein 1510-N and archaeal signal peptide peptidase, do not contain the amino-terminal domain (sometimes referred to as 36K type). Interestingly, the single membrane spanning E. coli SppA carries out catalysis using a Ser-Lys dyad with the serine located in the conserved carboxy-terminal protease domain and the lysine in the non-conserved amino-terminal domain. This family also contains homologs that either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of peptidases. Length = 177 Score = 64.6 bits (157), Expect = 3e-11 Identities = 51/193 (26%), Positives = 85/193 (44%), Gaps = 40/193 (20%) Query: 48 EDSQELIERIERISRDDSA-TALIVSLSSPGGSAYAGEAIFRAIQKVKN-RKPVITEVHE 105 S + R +R D A+++ ++SPGGS A E I + + KPV+ Sbjct: 21 NVSGDTTAAQIRDARLDPKVKAIVLRVNSPGGSVTASEVIRAELAAARAAGKPVVASGGG 80 Query: 106 MAASAGYLISCASNIIVAAETSLVGSIGVLFQYPYVKPFLDKLGVSIKSVKSSPMKAEPS 165 AAS GY IS +N IVA ++LVGSIG+ Sbjct: 81 NAASGGYWISTPANYIVANPSTLVGSIGIF------------------------------ 110 Query: 166 PFSEVNPKAVQMMQDVVDSSYHWFVRLVSESRNIPYDKTLV-LSDGRIWTGAEAKKVGLI 224 A Q+ +++ Y F+ LV+++R+ ++ + ++ G +WTG +AK GL+ Sbjct: 111 ----GVQLADQLS---IENGYKRFITLVADNRHSTPEQQIDKIAQGGVWTGQDAKANGLV 163 Query: 225 DVVGGQEEVWQSL 237 D +G ++ L Sbjct: 164 DSLGSFDDAVAKL 176 >gnl|CDD|31233 COG1030, NfeD, Membrane-bound serine protease (ClpP class) [Posttranslational modification, protein turnover, chaperones]. 
Length = 436 Score = 53.0 bits (127), Expect = 1e-07 Identities = 36/125 (28%), Positives = 59/125 (47%), Gaps = 8/125 (6%) Query: 11 RYVMLSLVTLTVVYFSWSSHVEDNSPHVARIAIRGQIED-SQELIERIERISRDDSATAL 69 R ++ L L + + S V V I I G I+ S + ++R + + +++A A+ Sbjct: 3 RAGLIILALLLLALAAPS--VATAEKKVYVIEIDGAIDPASADYLQRALQSAEEENAAAV 60 Query: 70 IVSLSSPGGSAYAGEAIFRAIQKVKNRKPVITEVHE---MAASAGYLISCASNIIVAAET 126 ++ L +PGG + I RAI PVI V AASAG I A++I A Sbjct: 61 VLELDTPGGLLDSMRQIVRAILNSPV--PVIGYVVPDGARAASAGTYILMATHIAAMAPG 118 Query: 127 SLVGS 131 + +G+ Sbjct: 119 TNIGA 123 >gnl|CDD|31083 COG0740, ClpP, Protease subunit of ATP-dependent Clp proteases [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]. Length = 200 Score = 50.6 bits (121), Expect = 5e-07 Identities = 45/195 (23%), Positives = 76/195 (38%), Gaps = 33/195 (16%) Query: 41 IAIRGQIEDS--QELIERIERISRDDSATALIVSLSSPGGSAYAGEAIFRAIQKVKNRKP 98 I + G+IED ++ ++ + +D + + ++SPGGS AG AI+ +Q +K P Sbjct: 30 IFLGGEIEDHMANLIVAQLLFLEAEDPDKDIYLYINSPGGSVTAGLAIYDTMQFIK--PP 87 Query: 99 VITEVHEMAASAGYLISCASNIIVAAETSLVGSIGVLFQYPYVKPFLDKLGVSIKSVKSS 158 V T AAS G ++ A G G F P + Sbjct: 88 VSTICMGQAASMGSVLLMA------------GDKGKRFALPN----------------AR 119 Query: 159 PMKAEPSPFSEVNPKAVQMMQDVVDSSYHWFVRLVSESRNIPYDKTLVLSD-GRIWTGAE 217 M +PS ++ +++ + R+ +E +K +D + E Sbjct: 120 IMIHQPSGGAQGQASDIEIHAREILKIKERLNRIYAEHTGQTLEKIEKDTDRDTWMSAEE 179 Query: 218 AKKVGLIDVVGGQEE 232 AK+ GLID V E Sbjct: 180 AKEYGLIDKVIESRE 194 >gnl|CDD|132932 cd07021, Clp_protease_NfeD_like, Nodulation formation efficiency D (NfeD) is a membrane-bound ClpP-class protease. Nodulation formation efficiency D (NfeD; stomatin operon partner protein, STOPP; DUF107) is a member of membrane-anchored ClpP-class proteases. Currently, more than 300 NfeD homologs have been identified - all of which are bacterial or archaeal in origin. 
Majority of these genomes have been shown to possess operons containing a homologous NfeD/stomatin gene pair, causing NfeD to be previously named STOPP (stomatin operon partner protein). NfeD homologs can be divided into two groups: long and short forms. Long-form homologs have a putative ClpP-class serine protease domain while the short form homologs do not. Downstream from the ClpP-class domain is the so-called NfeD or DUF107 domain. N-terminal region of the NfeD homolog PH1510 (1510-N or PH1510-N) from Pyrococcus horikoshii has been shown to possess serine protease activity and has a Ser-Lys catalytic dyad, preferentially cleaving hydrophobic substrates. Difference in oligomeric form and catalytic residues between 1510-N (forming a dimer) and ClpP (forming a tetradecamer) shows a possible functional difference: 1510-N is likely to have a regulatory function while ClpP is involved in protein quality control. Length = 178 Score = 50.7 bits (122), Expect = 5e-07 Identities = 45/200 (22%), Positives = 82/200 (41%), Gaps = 35/200 (17%) Query: 41 IAIRGQIEDSQE-LIERIERISRDDSATALIVSLSSPGGSAYAGEAIFRAIQKVKNRKPV 99 I I G+I+ +ER + ++++ A A+++ + +PGG + I I P Sbjct: 4 IPIEGEIDPGLAAFVERALKEAKEEGADAVVLDIDTPGGRVDSALEIVDLILNS--PIPT 61 Query: 100 ITEVHEMAASAGYLISCASNIIVAAETSLVGSIGVLFQYPYVKPFLDKLGVSIKSVKSSP 159 I V++ AASAG LI+ A++ I A + +G+ + P K S Sbjct: 62 IAYVNDRAASAGALIALAADEIYMAPGATIGAAEPI-------PGDGNGAADEKVQ--SY 112 Query: 160 MKAEPSPFSE---VNPKAVQMMQDVVDSSYHWFVRLVSESRNIPYDKTLVLSDGRIWT-- 214 +A+ +E +P + M V + +P + + G + T Sbjct: 113 WRAKMRAAAEKKGRDPDIAEAM--------------VDKDIEVP---GVGIKGGELLTLT 155 Query: 215 GAEAKKVGLID-VVGGQEEV 233 EA KVG + + G +E+ Sbjct: 156 ADEALKVGYAEGIAGSLDEL 175 >gnl|CDD|132931 cd07020, Clp_protease_NfeD_1, Nodulation formation efficiency D (NfeD) is a membrane-bound ClpP-class protease. Nodulation formation efficiency D (NfeD; stomatin operon partner protein, STOPP; DUF107) is a member of membrane-anchored ClpP-class proteases. 
Currently, more than 300 NfeD homologs have been identified - all of which are bacterial or archaeal in origin. Majority of these genomes have been shown to possess operons containing a homologous NfeD/stomatin gene pair, causing NfeD to be previously named STOPP (stomatin operon partner protein). NfeD homologs can be divided into two groups: long and short forms. Long-form homologs have a putative ClpP-class serine protease domain while the short form homologs do not. Downstream from the ClpP-class domain is the so-called NfeD or DUF107 domain. N-terminal region of the NfeD homolog PH1510 (1510-N or PH1510-N) from Pyrococcus horikoshii has been shown to possess serine protease activity and has a Ser-Lys catalytic dyad, preferentially cleaving hydrophobic substrates. Difference in oligomeric form and catalytic residues between 1510-N (forming a dimer) and ClpP (forming a tetradecamer) shows a possible functional difference: 1510-N is likely to have a regulatory function while ClpP is involved in protein quality control. Length = 187 Score = 49.5 bits (119), Expect = 1e-06 Identities = 30/98 (30%), Positives = 48/98 (48%), Gaps = 6/98 (6%) Query: 38 VARIAIRGQIED-SQELIERIERISRDDSATALIVSLSSPGGSAYAGEAIFRAIQKVKNR 96 V + I G I + + +ER + + A ALI+ L +PGG + I +AI + Sbjct: 1 VYVLEINGAITPATADYLERAIDQAEEGGADALIIELDTPGGLLDSTREIVQAIL--ASP 58 Query: 97 KPVITEVH---EMAASAGYLISCASNIIVAAETSLVGS 131 PV+ V+ AASAG I A++I A + +G+ Sbjct: 59 VPVVVYVYPSGARAASAGTYILLAAHIAAMAPGTNIGA 96 >gnl|CDD|119339 cd06558, crotonase-like, Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, napthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydoxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. 
In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole.. Length = 195 Score = 48.7 bits (117), Expect = 2e-06 Identities = 43/209 (20%), Positives = 73/209 (34%), Gaps = 65/209 (31%) Query: 51 QELIERIERISRDDSATALIVS--------------LSSPGGSAYAGEAIFRAIQKV--- 93 EL ++ D ++++ L++ + A R +Q++ Sbjct: 29 DELAAALDEAEADPDVRVVVLTGAGKAFCAGADLKELAALSDAGEEARAFIRELQELLRA 88 Query: 94 --KNRKPVITEVHEMAASAGYLISCASNIIVAAETSLVGSIGVLFQYPYVKPFLDKLGVS 151 + KPVI V+ A G ++ A +I +AAE + F P V KLG+ Sbjct: 89 LLRLPKPVIAAVNGAALGGGLELALACDIRIAAEDA-------KFGLPEV-----KLGLV 136 Query: 152 IKSVKSSPMKAEPSPFSEVNPKAVQMMQDVVDSSYHWFVRLVSESRNIPYDKTLVLSDGR 211 P RLV +R + L+L+ GR Sbjct: 137 ------------PGG-----------------GGTQRLPRLVGPARA----RELLLT-GR 162 Query: 212 IWTGAEAKKVGLIDVVGGQEEVWQSLYAL 240 + EA ++GL+D V EE+ + L Sbjct: 163 RISAEEALELGLVDEVVPDEELLAAALEL 191 >gnl|CDD|132926 cd07015, Clp_protease_NfeD, Nodulation formation efficiency D (NfeD) is a membrane-bound ClpP-class protease. Nodulation formation efficiency D (NfeD; stomatin operon partner protein, STOPP; DUF107) is a member of membrane-anchored ClpP-class proteases. Currently, more than 300 NfeD homologs have been identified - all of which are bacterial or archaeal in origin. Majority of these genomes have been shown to possess operons containing a homologous NfeD/stomatin gene pair, causing NfeD to be previously named STOPP (stomatin operon partner protein). NfeD homologs can be divided into two groups: long and short forms. Long-form homologs have a putative ClpP-class serine protease domain while the short form homologs do not. Downstream from the ClpP-class domain is the so-called NfeD or DUF107 domain. 
N-terminal region of the NfeD homolog PH1510 (1510-N or PH1510-N) from Pyrococcus horikoshii has been shown to possess serine protease activity and has a Ser-Lys catalytic dyad, preferentially cleaving hydrophobic substrates. Difference in oligomeric form and catalytic residues between 1510-N (forming a dimer) and ClpP (forming a tetradecamer) shows a possible functional difference: 1510-N is likely to have a regulatory function while ClpP is involved in protein quality control. Length = 172 Score = 47.8 bits (113), Expect = 4e-06 Identities = 36/105 (34%), Positives = 53/105 (50%), Gaps = 6/105 (5%) Query: 38 VARIAIRGQIEDSQE-LIERIERISRDDSATALIVSLSSPGGSAYAGEAIFRAIQKVKNR 96 V I+GQI +R I+ D+A A+I+ L +PGG A A I + IQ+ K Sbjct: 1 VYVAQIKGQITSYTYDQFDRYITIAEQDNAEAIIIELDTPGGRADAAGNIVQRIQQSK-- 58 Query: 97 KPVITEVH---EMAASAGYLISCASNIIVAAETSLVGSIGVLFQY 138 PVI V+ AASAG I+ S++I A + +G+ + Y Sbjct: 59 IPVIIYVYPPGASAASAGTYIALGSHLIAMAPGTSIGACRPILGY 103 >gnl|CDD|132927 cd07016, S14_ClpP_1, Caseinolytic protease (ClpP) is an ATP-dependent, highly conserved serine protease. Clp protease (caseinolytic protease; ClpP; Peptidase S14) is a highly conserved serine protease present throughout in bacteria and eukaryota, but seems to be absent in archaea, mollicutes and some fungi. This subfamily only contains bacterial sequences. Clp proteases are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. They are also implicated in the control of cell growth, targeting DNA-binding protein from starved cells. ClpP has also been linked to the tight regulation of virulence genes in the pathogens Listeria monocytogenes and Salmonella typhimurium. 
This enzyme belong to the family of ATP-dependent proteases; the functional Clp protease is comprised of two components: a proteolytic component and one of several regulatory ATPase components, both of which are required for effective levels of protease activity in the presence of ATP, although the proteolytic subunit alone does possess some catalytic activity. Active site consists of the triad Ser, His and Asp; some members have lost all of these active site residues and are therefore inactive, while others may have one or two large insertions. ClpP seems to prefer hydrophobic or non-polar residues at P1 or P1' positions in its substrate. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function. Length = 160 Score = 46.8 bits (112), Expect = 7e-06 Identities = 24/93 (25%), Positives = 47/93 (50%), Gaps = 10/93 (10%) Query: 41 IAIRGQIEDS-----QELIERIERISRDDSATALIVSLSSPGGSAYAGEAIFRAIQKVKN 95 I I G I +E + ++ + D + V ++SPGG +AG AI+ A+++ K Sbjct: 3 IYIYGDIGSDWGVTAKEFKDALDALGDDSD---ITVRINSPGGDVFAGLAIYNALKRHKG 59 Query: 96 RKPVITEVHEMAASAGYLISCASNIIVAAETSL 128 + V ++ +AASA +I+ A + + ++ Sbjct: 60 K--VTVKIDGLAASAASVIAMAGDEVEMPPNAM 90 >gnl|CDD|110924 pfam01972, SDH_sah, Serine dehydrogenase proteinase. This family of archaebacterial proteins, formerly known as DUF114, has been found to be a serine dehydrogenase proteinase distantly related to ClpP proteinases that belong to the serine proteinase superfamily. The family has a catalytic triad of Ser, Asp, His residues, which shows an altered residue ordering compared with the ClpP proteinases but similar to that of the carboxypeptidase clan. 
Length = 286 Score = 43.7 bits (103), Expect = 7e-05 Identities = 42/141 (29%), Positives = 72/141 (51%), Gaps = 19/141 (13%) Query: 47 IEDSQELIERIERISRDDSATALIVSLSSPGGSAYAGEAIFRAIQKVKNRKPVITEVHEM 106 IEDS+E++ I R++ D LI+ +PGG A A I +A+++ K + VI V Sbjct: 75 IEDSEEILRAI-RLTPKDMPIDLIIH--TPGGLALAATQIAKALKEHKAKTTVI--VPHY 129 Query: 107 AASAGYLISCASNIIVAAETSLVGSIG-VLFQYPYVKPFLDKLGVSIKSV--KSSPMKAE 163 A S G LI+ A++ I+ E +++G + + QYP SI K P K + Sbjct: 130 AMSGGTLIALAADEIIMDENAVLGPVDPQIGQYP---------AASILKAVEKKGPKKID 180 Query: 164 PSPF--SEVNPKAVQMMQDVV 182 ++++ KA++ M++ V Sbjct: 181 DQTLILADISKKAIKQMEEFV 201 >gnl|CDD|144241 pfam00574, CLP_protease, Clp protease. The Clp protease has an active site catalytic triad. In E. coli Clp protease, ser-111, his-136 and asp-185 form the catalytic triad. A putative Clp protease from Cyanophora paradoxa has lost all of these active site residues and is therefore inactive. A member from Chlamydomonas eugametos contains two large insertions, a member from Chlamydomonas reinhardtii contains one large insertion. Length = 182 Score = 39.8 bits (94), Expect = 9e-04 Identities = 18/43 (41%), Positives = 23/43 (53%), Gaps = 2/43 (4%) Query: 75 SPGGSAYAGEAIFRAIQKVKNRKPVITEVHEMAASAGYLISCA 117 SPGGS AG AI+ +Q +K V T +AAS G + A Sbjct: 55 SPGGSVTAGLAIYDTMQFIKP--DVSTICLGLAASMGSFLLAA 95 >gnl|CDD|132928 cd07017, S14_ClpP_2, Caseinolytic protease (ClpP) is an ATP-dependent, highly conserved serine protease. Clp protease (caseinolytic protease; ClpP; Peptidase S14) is a highly conserved serine protease present throughout in bacteria and eukaryota, but seems to be absent in archaea, mollicutes and some fungi. Clp proteases are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. They are also implicated in the control of cell growth, targeting DNA-binding protein from starved cells. 
ClpP has also been linked to the tight regulation of virulence genes in the pathogens Listeria monocytogenes and Salmonella typhimurium. This enzyme belong to the family of ATP-dependent proteases; the functional Clp protease is comprised of two components: a proteolytic component and one of several regulatory ATPase components, both of which are required for effective levels of protease activity in the presence of ATP, although the proteolytic subunit alone does possess some catalytic activity. Active site consists of the triad Ser, His and Asp; some members have lost all of these active site residues and are therefore inactive, while others may have one or two large insertions. ClpP seems to prefer hydrophobic or non-polar residues at P1 or P1' positions in its substrate. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function. Length = 171 Score = 39.0 bits (92), Expect = 0.001 Identities = 20/45 (44%), Positives = 27/45 (60%), Gaps = 2/45 (4%) Query: 73 LSSPGGSAYAGEAIFRAIQKVKNRKPVITEVHEMAASAGYLISCA 117 ++SPGGS AG AI+ +Q +K PV T +AAS G L+ A Sbjct: 46 INSPGGSVTAGLAIYDTMQYIKP--PVSTICLGLAASMGALLLAA 88 >gnl|CDD|132924 cd07013, S14_ClpP, Caseinolytic protease (ClpP) is an ATP-dependent, highly conserved serine protease. Clp protease (caseinolytic protease; ClpP; Peptidase S14) is a highly conserved serine protease present throughout in bacteria and eukaryota, but seems to be absent in archaea, mollicutes and some fungi. Clp proteases are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. Additionally, they are implicated in the control of cell growth, targeting DNA-binding protein from starved cells. 
ClpP has also been linked to the tight regulation of virulence genes in the pathogens Listeria monocytogenes and Salmonella typhimurium. This enzyme belong to the family of ATP-dependent proteases; the functional Clp protease is comprised of two components: a proteolytic component and one of several regulatory ATPase components, both of which are required for effective levels of protease activity in the presence of ATP, although the proteolytic subunit alone does possess some catalytic activity. Active site consists of the triad Ser, His and Asp; some members have lost all of these active site residues and are therefore inactive, while others may have one or two large insertions. ClpP seems to prefer hydrophobic or non-polar residues at P1 or P1' positions in its substrate. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function. Length = 162 Score = 38.4 bits (89), Expect = 0.002 Identities = 22/81 (27%), Positives = 44/81 (54%), Gaps = 4/81 (4%) Query: 39 ARIAIRGQIED--SQELIERIERISRDDSATALIVSLSSPGGSAYAGEAIFRAIQKVKNR 96 I + G++ED + + ++ + + + + ++SPGG +AG AI+ I+ +K Sbjct: 1 REIMLTGEVEDISANQFAAQLLFLGAVNPEKDIYLYINSPGGDVFAGMAIYDTIKFIK-- 58 Query: 97 KPVITEVHEMAASAGYLISCA 117 V+T + +AAS G +I+ A Sbjct: 59 ADVVTIIDGLAASMGSVIAMA 79 >gnl|CDD|31227 COG1024, CaiD, Enoyl-CoA hydratase/carnithine racemase [Lipid metabolism]. 
Length = 257 Score = 37.4 bits (86), Expect = 0.005 Identities = 24/103 (23%), Positives = 39/103 (37%), Gaps = 22/103 (21%) Query: 48 EDSQELIERIERISRDDSATALIVSLSSPGGSAYAG---------------EAIFRAIQK 92 E EL E ++ D ++ L+ G + AG E + + Q Sbjct: 32 EMLDELAEALDEAEADPDVRVVV--LTGAGKAFSAGADLKELLSPEDGNAAENLMQPGQD 89 Query: 93 VKNR-----KPVITEVHEMAASAGYLISCASNIIVAAETSLVG 130 + KPVI V+ A G ++ A +I +AAE + G Sbjct: 90 LLRALADLPKPVIAAVNGYALGGGLELALACDIRIAAEDAKFG 132 >gnl|CDD|73197 cd00347, Flavin_utilizing_monoxygenases, Flavin-utilizing monooxygenases. Length = 90 Score = 36.9 bits (85), Expect = 0.007 Identities = 21/91 (23%), Positives = 32/91 (35%), Gaps = 8/91 (8%) Query: 25 FSWSSHVEDNSPHVARIAIRGQIEDSQELIERIERISRDDSATALIVSLSSPGGSAYAGE 84 F A + EL ER+ D + A+ SSP + AGE Sbjct: 3 FGLFLPPPGGGGATAAEDLE----YLVELARLAERLGFDAAWVAIWFGGSSPPVAEQAGE 58 Query: 85 A----IFRAIQKVKNRKPVITEVHEMAASAG 111 + +F A + + + E AA+AG Sbjct: 59 SGDGLLFAAREPPEEVAEALARYREAAAAAG 89 >gnl|CDD|36058 KOG0840, KOG0840, KOG0840, ATP-dependent Clp protease, proteolytic subunit [Posttranslational modification, protein turnover, chaperones]. Length = 275 Score = 36.5 bits (84), Expect = 0.009 Identities = 22/79 (27%), Positives = 40/79 (50%), Gaps = 4/79 (5%) Query: 41 IAIRGQIED--SQELIERIERISRDDSATALIVSLSSPGGSAYAGEAIFRAIQKVKNRKP 98 + + I+D + +I ++ + +D + + ++SPGGS AG AI+ +Q +K Sbjct: 95 VFLGQPIDDDVANLVIAQLLYLDSEDPKKPIYLYINSPGGSVTAGLAIYDTMQYIKP--D 152 Query: 99 VITEVHEMAASAGYLISCA 117 V T +AAS L+ A Sbjct: 153 VSTICVGLAASMAALLLAA 171 >gnl|CDD|144097 pfam00378, ECH, Enoyl-CoA hydratase/isomerase family. This family contains a diverse set of enzymes including: Enoyl-CoA hydratase. Naphthoate synthase. Carnitate racemase. 3-hydroxybutyryl-CoA dehydratase. Dodecanoyl-CoA delta-isomerase. 
Length = 169 Score = 33.4 bits (77), Expect = 0.069 Identities = 23/124 (18%), Positives = 50/124 (40%), Gaps = 25/124 (20%) Query: 52 ELIERIERISRDDSATALIVS--------------LSSPGGSAYA-----GEAIFRAIQK 92 ELI+ +E++ +D S A++++ +++ + A ++ ++ Sbjct: 21 ELIQALEKLEQDPSVRAVVLTGAPGAFSAGADIKEMAAKRPAQQAQFSLEALDLWSRLED 80 Query: 93 VKNRKPVITEVHEMAASAGYLISCASNIIVAAETSLVGSIGVLFQ-YP--YVKPFLDK-L 148 + KPVI V+ A G ++ A + +AA+ + G P L + + Sbjct: 81 LP--KPVIAAVNGYALGGGLELALACDYRIAADNAKFGLPETKLGIIPGAGGTQRLPRII 138 Query: 149 GVSI 152 G S Sbjct: 139 GHSA 142 >gnl|CDD|176969 CHL00028, clpP, ATP-dependent Clp protease proteolytic subunit. Length = 200 Score = 33.3 bits (77), Expect = 0.084 Identities = 18/46 (39%), Positives = 22/46 (47%), Gaps = 2/46 (4%) Query: 75 SPGGSAYAGEAIFRAIQKVKNRKPVITEVHEMAASAGYLISCASNI 120 SPGGS +G AI+ +Q VK V T +AAS I I Sbjct: 69 SPGGSVISGLAIYDTMQFVK--PDVHTICLGLAASMASFILAGGEI 112 >gnl|CDD|36272 KOG1054, KOG1054, KOG1054, Glutamate-gated AMPA-type ion channel receptor subunit GluR2 and related subunits [Inorganic ion transport and metabolism, Amino acid transport and metabolism, Signal transduction mechanisms]. Length = 897 Score = 33.1 bits (75), Expect = 0.10 Identities = 53/193 (27%), Positives = 80/193 (41%), Gaps = 32/193 (16%) Query: 111 GYLISCASNIIVAAET-SLVGSIGVLFQYPYVKPFLDKLGVSIKSVKSSPMKAEPSPFSE 169 G L+ ++I VA T +LV + F KPF+ LG+SI K P K++P FS Sbjct: 489 GELVYGRADIAVAPLTITLVREEVIDFS----KPFM-SLGISIMIKK--PQKSKPGVFSF 541 Query: 170 VNPKAVQMMQDVVDSSYHWFVRLVSESRNIPYDK-TLVLSDGRIWTGAEAKKVGLIDVVG 228 ++P A ++ +V + V L SR PY+ T GR + G+ + Sbjct: 542 LDPLAYEIWMCIVFAYIGVSVVLFLVSRFSPYEWHTEEFERGRFTPSDPPNEFGIFN--- 598 Query: 229 GQEEVWQSLYAL---GVDQSIRKIKDWNPPKNYWFCDL-------KNLSISSLLEDTIP- 277 +W SL A G D S R + +WF L NL+ +E + Sbjct: 599 ---SLWFSLGAFMQQGCDISPRSLSGRIVGGVWWFFTLIIISSYTANLAAFLTVERMVSP 655 Query: 278 ------LMKQTKV 284 L KQT++ Sbjct: 656 IESAEDLAKQTEI 668 >gnl|CDD|146756 pfam04286, DUF445, Protein of unknown function (DUF445). Predicted to be a membrane protein. 
Length = 367 Score = 30.7 bits (70), Expect = 0.48 Identities = 10/36 (27%), Positives = 17/36 (47%), Gaps = 3/36 (8%) Query: 31 VEDNSPHVARIAIRGQIE--DSQELIERIERISRDD 64 +E + ++ I + D++EL E IE I D Sbjct: 307 LERYHLEIGQL-ISETVNRWDAEELEELIELIVGRD 341 >gnl|CDD|33682 COG3894, COG3894, Uncharacterized metal-binding protein [General function prediction only]. Length = 614 Score = 29.9 bits (67), Expect = 0.94 Identities = 36/175 (20%), Positives = 60/175 (34%), Gaps = 26/175 (14%) Query: 70 IVSLSSPGGSAYAGE-------AIFRAIQKVKN--RKPVITEV--HEMAASAGYLISCAS 118 IV+ S+ G A+ G+ A AI V+ + V +EMA + G L C S Sbjct: 350 IVTASAAAGPAFEGQEISHGMRASPGAIDDVREFEGEEWRYTVLDNEMAKAPGVLGICGS 409 Query: 119 NIIV-AAETSLVGSIG----VLFQYPYVKPFLDKLGVSIKSVKSSPMKAEPSPFSEVNPK 173 I AE G G V ++K GV + F+E + + Sbjct: 410 GEIDEVAEMYANGITGTGVIVKIALAARSGLVEKPGVKFPAELLE--LGNGITFTEKDIE 467 Query: 174 AVQMMQDVVDSSYHWFVRLVSESRNIPYDKTLVL----SDGRIWTGAEAKKVGLI 224 + + + + + E I + + + G +A +GLI Sbjct: 468 EAGKAKGAIRAGH----MTLIEKAGIELEDIERIYMAGAFGTYIDAKKAMVIGLI 518 >gnl|CDD|147479 pfam05311, Baculo_PP31, Baculovirus 33KDa late protein (PP31). Autographa californica nuclear polyhedrosis virus (AcMNPV) pp31 is a nuclear phosphoprotein that accumulates in the virogenic stroma, which is the viral replication centre in the infected-cell nucleus, binds to DNA, and serves as a late expression factor. Length = 267 Score = 29.3 bits (66), Expect = 1.4 Identities = 19/67 (28%), Positives = 29/67 (43%), Gaps = 13/67 (19%) Query: 142 KPFLD------KLGVSIKSVKSSPMKAEPSPFSEVNPKAVQMMQDVVDSS-------YHW 188 KPF+D KLG SI+ +S E +P N K + + S Y+ Sbjct: 98 KPFVDIFDFMEKLGKSIEVKSASSSSTESNPGKRRNSKRTEANVAEIKESNEKRSKLYNE 157 Query: 189 FVRLVSE 195 F R+++E Sbjct: 158 FYRVLNE 164 >gnl|CDD|31758 COG1570, XseA, Exonuclease VII, large subunit [DNA replication, recombination, and repair]. 
Length = 440 Score = 29.1 bits (65), Expect = 1.5 Identities = 23/62 (37%), Positives = 34/62 (54%), Gaps = 10/62 (16%) Query: 50 SQELIERIERISRDDSATALIVSLSSPGGS-----AYAGEAIFRAIQKVKNRKPVITEV- 103 ++E++E IER ++ LIV+ GGS A+ E + RAI +R PVI+ V Sbjct: 178 AEEIVEAIERANQRGDVDVLIVARG--GGSIEDLWAFNDEIVARAI--AASRIPVISAVG 233 Query: 104 HE 105 HE Sbjct: 234 HE 235 >gnl|CDD|33213 COG3407, MVD1, Mevalonate pyrophosphate decarboxylase [Lipid metabolism]. Length = 329 Score = 29.1 bits (65), Expect = 1.6 Identities = 26/120 (21%), Positives = 47/120 (39%), Gaps = 14/120 (11%) Query: 6 KKIKTRYVMLSLVTLTVVYFSWSSHVEDNSPHVARIAIRGQIEDSQELIERIERISRDDS 65 KK+ +R M + Y +W H E++ + AIR +D +++ E E S + Sbjct: 188 KKVSSREGMQLTAETSPFYDAWLEHSEEDL-EEMKEAIRE--KDFEKIGELAENDSLEMH 244 Query: 66 ATALIVSLSSPGGSAYAGEAIFRAIQKVKNRKPVITEVHEMAASAGYLISCASNIIVAAE 125 AT +SS Y + R I+ V E+ + + + + N+ V Sbjct: 245 AT----LMSSGPPFFYLTDESLRIIEFVH-------ELRKEGNAVYFTMDAGPNVKVITL 293 >gnl|CDD|145733 pfam02738, Ald_Xan_dh_C2, Molybdopterin-binding domain of aldehyde dehydrogenase. Length = 543 Score = 28.8 bits (65), Expect = 1.7 Identities = 17/62 (27%), Positives = 25/62 (40%), Gaps = 10/62 (16%) Query: 38 VARIAIRGQIEDSQELIERIERISRDDSATALIVSLSSPGGSA---YAGEAIFRAIQKVK 94 VA+IA EL ++ I T + + S GS G A+ A +K+K Sbjct: 347 VAQIAAE-------ELGIPLDDIRVISGDTDKVPNGSGTYGSRGTDVNGNAVRLACEKLK 399 Query: 95 NR 96 R Sbjct: 400 ER 401 >gnl|CDD|144091 pfam00370, FGGY_N, FGGY family of carbohydrate kinases, N-terminal domain. This domain adopts a ribonuclease H-like fold and is structurally related to the C-terminal domain. 
Length = 245 Score = 28.4 bits (64), Expect = 2.1 Identities = 18/62 (29%), Positives = 25/62 (40%), Gaps = 8/62 (12%) Query: 231 EEVWQSLYALGVDQSIRKIKDWNPPKNYWFCDLKNLSISSLLEDTIPLMKQTKVQGLWAV 290 EE+WQ+L Q+IRKI +K + IS + L K K + Sbjct: 46 EEIWQALA-----QAIRKI---LQQSGISPKQIKGIGISGQGHGLVLLDKNDKPLYPAIL 97 Query: 291 WN 292 WN Sbjct: 98 WN 99 >gnl|CDD|143396 cd07077, ALDH-like, NAD(P)+-dependent aldehyde dehydrogenase-like (ALDH-like) family. The aldehyde dehydrogenase-like (ALDH-like) group of the ALDH superfamily of NAD(P)+-dependent enzymes which, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. This group includes families ALDH18, ALDH19, and ALDH20 and represents such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group. Length = 397 Score = 27.6 bits (61), Expect = 4.6 Identities = 23/115 (20%), Positives = 34/115 (29%), Gaps = 15/115 (13%) Query: 80 AYAGEAIFRAIQKVKNRKPVI-------------TEVHEMAASAGYLISCASNIIVAAET 126 A G A K PVI T E A+ + + A+E Sbjct: 187 ATGGRDAVDAAVKHSPHIPVIGFGAGNSPVVVDETADEERASGSVHDSKFFDQNACASEQ 246 Query: 127 SLVGSIGVLFQYPYVKPFLDKLGVSIKSVKSSPMKAEPSPFSEVNPKAVQMMQDV 181 +L VL P + F KL V V + +A++ M + Sbjct: 247 NLYVVDDVL--DPLYEEFKLKLVVEGLKVPQETKPLSKETTPSFDDEALESMTPL 299 >gnl|CDD|36895 KOG1682, KOG1682, KOG1682, Enoyl-CoA isomerase [Lipid transport and metabolism]. 
Length = 287 Score = 27.4 bits (60), Expect = 4.9 Identities = 19/67 (28%), Positives = 33/67 (49%), Gaps = 6/67 (8%) Query: 73 LSSPGGSAYAGEAIFRAIQKVKN-----RKPVITEVHEMAASAGYLISCASNIIVAAETS 127 L++ GS E +F+ V N PVI +V+ AA+AG + + +++VA + S Sbjct: 98 LTNEPGSDIHAE-VFQTCTDVMNDIRNLPVPVIAKVNGYAAAAGCQLVASCDMVVATKNS 156 Query: 128 LVGSIGV 134 + G Sbjct: 157 KFSTPGA 163 >gnl|CDD|36892 KOG1679, KOG1679, KOG1679, Enoyl-CoA hydratase [Lipid transport and metabolism]. Length = 291 Score = 27.3 bits (60), Expect = 5.1 Identities = 17/46 (36%), Positives = 26/46 (56%), Gaps = 3/46 (6%) Query: 203 KTLVLSDGRIWTGAEAKKVGLIDVVGGQEEVWQSLY--ALGVDQSI 246 K L+ + R+ GAEA K+GL++ V Q E + Y AL + + I Sbjct: 186 KELIFT-ARVLNGAEAAKLGLVNHVVEQNEEGDAAYQKALELAREI 230 >gnl|CDD|147546 pfam05416, Peptidase_C37, Southampton virus-type processing peptidase. Corresponds to Merops family C37. Norwalk-like viruses (NLVs), including the Southampton virus, cause acute non-bacterial gastroenteritis in humans. The NLV genome encodes three open reading frames (ORFs). ORF1 encodes a polyprotein, which is processed by the viral protease into six proteins. Length = 535 Score = 27.1 bits (60), Expect = 6.2 Identities = 11/30 (36%), Positives = 13/30 (43%) Query: 230 QEEVWQSLYALGVDQSIRKIKDWNPPKNYW 259 +EE + G D RK DWNP W Sbjct: 321 KEERAKLGLVTGSDIRKRKPIDWNPKGPLW 350 >gnl|CDD|30807 COG0459, GroL, Chaperonin GroEL (HSP60 family) [Posttranslational modification, protein turnover, chaperones]. 
Length = 524 Score = 26.7 bits (59), Expect = 7.8 Identities = 21/105 (20%), Positives = 39/105 (37%), Gaps = 4/105 (3%) Query: 31 VEDNSPHVARIAIRGQIEDSQELIERIERISRDDSATALIVSLSSPGGSAYAGEAIFRAI 90 + + VA I +RG E EL E+ RI +D+ + ++ G A A Sbjct: 353 RKAKAGGVATILVRGATE--VELDEKERRI--EDALNVVRAAVEEGKIVPGGGAAEIEAA 408 Query: 91 QKVKNRKPVITEVHEMAASAGYLISCASNIIVAAETSLVGSIGVL 135 +++ + E + + + AE + + I VL Sbjct: 409 LRLREYAMTVEGGDEQLGIEAFARALEAPPRQLAENAGLDPIEVL 453 Database: CddA Posted date: Feb 4, 2011 9:38 PM Number of letters in database: 6,263,737 Number of sequences in database: 21,609 Lambda K H 0.317 0.132 0.389 Gapped Lambda K H 0.267 0.0619 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Sequences: 21609 Number of Hits to DB: 3,481,304 Number of extensions: 173072 Number of successful extensions: 556 Number of sequences better than 10.0: 1 Number of HSP's gapped: 538 Number of HSP's successfully gapped: 52 Length of query: 293 Length of database: 6,263,737 Length adjustment: 93 Effective length of query: 200 Effective length of database: 4,254,100 Effective search space: 850820000 Effective search space used: 850820000 Neighboring words threshold: 11 Window for multiple hits: 40 X1: 16 ( 7.3 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.7 bits) S2: 57 (26.0 bits)
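The statistics footer above can be cross-checked with a few lines of arithmetic. The sketch below is not part of the RPS-BLAST output; it assumes the usual BLAST convention that the length adjustment is subtracted once from the query and once per database sequence, and it uses the standard relation E = m * n * 2^(-S') between a bit score S' and the expect value. The function names are hypothetical, chosen for illustration only. Because the report prints rounded bit scores, the computed E-values only approximate the printed ones.

```python
# Values copied from the statistics footer of this report.
QUERY_LENGTH = 293          # "Length of query"
DB_LETTERS = 6_263_737      # "Length of database"
DB_SEQUENCES = 21_609       # "Number of Sequences"
LENGTH_ADJUSTMENT = 93      # "Length adjustment"


def effective_search_space(query_len, db_letters, db_seqs, length_adj):
    """Return (effective query length, effective database length, search space).

    Assumes the adjustment is removed once from the query and once from
    every database sequence, which reproduces the footer values.
    """
    eff_query = query_len - length_adj
    eff_db = db_letters - db_seqs * length_adj
    return eff_query, eff_db, eff_query * eff_db


def expect_from_bits(bit_score, search_space):
    """Approximate E-value for a bit score: E = search_space * 2**(-bit_score)."""
    return search_space * 2.0 ** (-bit_score)


eff_q, eff_db, space = effective_search_space(
    QUERY_LENGTH, DB_LETTERS, DB_SEQUENCES, LENGTH_ADJUSTMENT
)
print(eff_q, eff_db, space)  # 200 4254100 850820000

# Rough E-value for the 39.0-bit ClpP hit reported with Expect = 0.001.
print(expect_from_bits(39.0, space))
```

The three printed integers match the "Effective length of query", "Effective length of database" and "Effective search space" entries in the footer, which suggests the assumed convention is the one used here.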