Query 036530
Match_columns 76
No_of_seqs 32 out of 34
Neff 2.5
Searched_HMMs 46136
Date Fri Mar 29 02:58:13 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/036530.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/036530hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF15386 Tantalus: Drosophila 51.6 8.3 0.00018 24.2 1.0 14 27-40 35-48 (61)
2 KOG2130 Phosphatidylserine-spe 44.4 17 0.00037 30.2 2.1 47 29-75 330-377 (407)
3 PF09707 Cas_Cas2CT1978: CRISP 40.8 13 0.00029 24.3 0.8 31 5-35 16-47 (86)
4 TIGR01873 cas_CT1978 CRISPR-as 37.7 8.1 0.00018 25.6 -0.6 22 5-26 16-37 (87)
5 PRK11558 putative ssRNA endonu 32.3 12 0.00026 25.3 -0.5 22 5-26 18-39 (97)
6 PF14009 DUF4228: Domain of un 32.1 21 0.00046 23.0 0.7 10 26-35 172-181 (181)
7 cd07051 BMC_like_1_repeat1 Bac 14.9 74 0.0016 22.3 0.7 13 7-19 33-45 (111)
8 PF10295 DUF2406: Uncharacteri 14.7 51 0.0011 21.2 -0.2 21 14-34 41-61 (69)
9 KOG1654 Microtubule-associated 13.6 18 0.00039 25.6 -2.7 14 28-41 89-102 (116)
10 PF02395 Peptidase_S6: Immunog 11.1 72 0.0016 28.1 -0.4 17 4-20 116-132 (769)
No 1
>PF15386 Tantalus: Drosophila Tantalus-like
Probab=51.59 E-value=8.3 Score=24.24 Aligned_cols=14 Identities=57% Similarity=0.695 Sum_probs=10.5
Q ss_pred ccccccchhhcccc
Q 036530 27 PKMLDTISEEEKDV 40 (76)
Q Consensus 27 ~~~LdTIaEEdrE~ 40 (76)
...|+||.||-++.
T Consensus 35 ~~~LETIfEEp~~~ 48 (61)
T PF15386_consen 35 PKNLETIFEEPKNE 48 (61)
T ss_pred cCCcchhhcccccc
Confidence 35899999985544
No 2
>KOG2130 consensus Phosphatidylserine-specific receptor PtdSerR, contains JmjC domain [Chromatin structure and dynamics; Signal transduction mechanisms]
Probab=44.40 E-value=17 Score=30.22 Aligned_cols=47 Identities=30% Similarity=0.293 Sum_probs=26.9
Q ss_pred ccccchhhcccccCCC-CccCCCCCCCCCCCcccccceeeecCCCCCC
Q 036530 29 MLDTISEEEKDVSGSD-ALASTRRSFSSCSSASSARASVACDSNSTNG 75 (76)
Q Consensus 29 ~LdTIaEEdrE~~~~d-s~~s~p~~~~~~s~~ssa~~~~~~~~~~~~~ 75 (76)
-|..|+++-.++...+ +.-++-.++++++++|+.+..--||.|..||
T Consensus 330 el~~l~~s~~~~e~~~~~~~sss~ssssssss~~~s~e~e~d~~G~~g 377 (407)
T KOG2130|consen 330 ELADLADSTHLEESTGLASDSSSDSSSSSSSSSSSSDEEESDDNGDNG 377 (407)
T ss_pred hHHHHhhhhccccccCcccccccccccccccCCCCCccccccccCccc
Confidence 4556666555433332 2234445555555666666666899888766
No 3
>PF09707 Cas_Cas2CT1978: CRISPR-associated protein (Cas_Cas2CT1978); InterPro: IPR010152 Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity. In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny. This entry represents a minor branch of the Cas2 family of CRISPR-associated proteins, which are found in IPR003799 from INTERPRO. Cas2 is one of four protein families (Cas1 to Cas4) that are associated with CRISPR elements and always occur near a repeat cluster, usually in the order cas3-cas4-cas1-cas2. The function of Cas2 (and Cas1) is unknown.
Cas3 proteins appear to be helicases while Cas4 proteins resemble RecB-type exonucleases, suggesting that these genes are involved in DNA metabolism or gene expression.
Probab=40.84 E-value=13 Score=24.34 Aligned_cols=31 Identities=19% Similarity=0.548 Sum_probs=21.7
Q ss_pred hhhhceeeecCCceeeeeecCC-ccccccchh
Q 036530 5 RMARFFMEVAPPQFVSVMRHRT-PKMLDTISE 35 (76)
Q Consensus 5 ~~ARf~mEVAPPq~iSv~R~r~-~~~LdTIaE 35 (76)
.++||++|++|--||.-+..|. -+.-+-|.+
T Consensus 16 ~Ltrwl~Ei~~GVyVg~~s~rVRe~lW~~v~~ 47 (86)
T PF09707_consen 16 FLTRWLLEIRPGVYVGNVSARVRERLWERVTE 47 (86)
T ss_pred hhhheeEecCCCcEEcCCCHHHHHHHHHHHHh
Confidence 5789999999999998666553 333344433
No 4
>TIGR01873 cas_CT1978 CRISPR-associated endoribonuclease Cas2, E. coli subfamily. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model represents a minor branch of the Cas2 family of CRISPR-associated endonuclease, whereas most Cas2 proteins are modeled instead by TIGR01573. This form of Cas2 is characteristic for the E. coli subtype of CRISPR/Cas locus.
Probab=37.69 E-value=8.1 Score=25.59 Aligned_cols=22 Identities=18% Similarity=0.312 Sum_probs=18.4
Q ss_pred hhhhceeeecCCceeeeeecCC
Q 036530 5 RMARFFMEVAPPQFVSVMRHRT 26 (76)
Q Consensus 5 ~~ARf~mEVAPPq~iSv~R~r~ 26 (76)
+++||++|++|--||.-+..|.
T Consensus 16 ~Lt~wllEv~~GVyVg~~s~rV 37 (87)
T TIGR01873 16 RLALWLLEPRAGVYVGGVSASV 37 (87)
T ss_pred hhhhheeecCCCcEEcCCCHHH
Confidence 5799999999999999666553
No 5
>PRK11558 putative ssRNA endonuclease; Provisional
Probab=32.27 E-value=12 Score=25.26 Aligned_cols=22 Identities=27% Similarity=0.486 Sum_probs=18.3
Q ss_pred hhhhceeeecCCceeeeeecCC
Q 036530 5 RMARFFMEVAPPQFVSVMRHRT 26 (76)
Q Consensus 5 ~~ARf~mEVAPPq~iSv~R~r~ 26 (76)
+++||++|++|--||.-+..|.
T Consensus 18 ~Lt~wllEv~~GVyVg~~S~rV 39 (97)
T PRK11558 18 RLAVWLLEVRAGVYVGDVSRRI 39 (97)
T ss_pred hhhhheEecCCCcEEcCCCHHH
Confidence 5799999999999999665553
No 6
>PF14009 DUF4228: Domain of unknown function (DUF4228)
Probab=32.08 E-value=21 Score=23.02 Aligned_cols=10 Identities=30% Similarity=0.352 Sum_probs=8.5
Q ss_pred Cccccccchh
Q 036530 26 TPKMLDTISE 35 (76)
Q Consensus 26 ~~~~LdTIaE 35 (76)
.++.||||.|
T Consensus 172 WrP~LesI~E 181 (181)
T PF14009_consen 172 WRPALESIPE 181 (181)
T ss_pred ccCCCCCcCc
Confidence 4689999987
No 7
>cd07051 BMC_like_1_repeat1 Bacterial Micro-Compartment (BMC)-like domain 1 repeat 1. BMC-like domains exist in cyanobacteria, proteobacteria, and actinobacteria and are homologs of the carboxysome shell proteins. They might be encoded from putative organelles involved in an unknown metabolic process. Although it has been suggested that these carboxysome shell protein homologs form hexamers and further assemble into the flat facets of the polyhedral bacterial organelle shell, at present no experimental evidence exists to directly support this view. Proteins in this CD contain two tandem BMC domains. This CD includes repeat 1 (the first BMC domain of BMC like 1 proteins).
Probab=14.91 E-value=74 Score=22.35 Aligned_cols=13 Identities=38% Similarity=0.639 Sum_probs=9.7
Q ss_pred hhceeeecCCcee
Q 036530 7 ARFFMEVAPPQFV 19 (76)
Q Consensus 7 ARf~mEVAPPq~i 19 (76)
|-.||||||---|
T Consensus 33 a~l~iEvsPG~~I 45 (111)
T cd07051 33 ASLWIEVAPGLAI 45 (111)
T ss_pred eEEEEEeccchhH
Confidence 4579999997544
No 8
>PF10295 DUF2406: Uncharacterised protein (DUF2406); InterPro: IPR018809 This entry represents a family of small proteins conserved in fungi. The function is not known.
Probab=14.65 E-value=51 Score=21.24 Aligned_cols=21 Identities=33% Similarity=0.425 Sum_probs=17.5
Q ss_pred cCCceeeeeecCCccccccch
Q 036530 14 APPQFVSVMRHRTPKMLDTIS 34 (76)
Q Consensus 14 APPq~iSv~R~r~~~~LdTIa 34 (76)
.=|-+..++|.|.-+=||||-
T Consensus 41 ~~PD~SNPTR~R~ERPLDTIR 61 (69)
T PF10295_consen 41 TDPDRSNPTRSRDERPLDTIR 61 (69)
T ss_pred CCCCCCCCCcccccCchHHHH
Confidence 347778899999999999994
No 9
>KOG1654 consensus Microtubule-associated anchor protein involved in autophagy and membrane trafficking [Cytoskeleton]
Probab=13.63 E-value=18 Score=25.65 Aligned_cols=14 Identities=36% Similarity=0.503 Sum_probs=10.5
Q ss_pred cccccchhhccccc
Q 036530 28 KMLDTISEEEKDVS 41 (76)
Q Consensus 28 ~~LdTIaEEdrE~~ 41 (76)
..+..|=||++|++
T Consensus 89 ~~ms~~Ye~~kdeD 102 (116)
T KOG1654|consen 89 ATMSALYEEEKDED 102 (116)
T ss_pred hhHHHHHHhhcccC
Confidence 46678888888764
No 10
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes.
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae. These peptidases cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=11.08 E-value=72 Score=28.05 Aligned_cols=17 Identities=35% Similarity=0.763 Sum_probs=7.4
Q ss_pred chhhhceeeecCCceee
Q 036530 4 SRMARFFMEVAPPQFVS 20 (76)
Q Consensus 4 s~~ARf~mEVAPPq~iS 20 (76)
.|+-+|++||||.-...
T Consensus 116 pRLnK~VTEvaP~~~t~ 132 (769)
T PF02395_consen 116 PRLNKFVTEVAPAEMTT 132 (769)
T ss_dssp EEESS---SS----BBS
T ss_pred eecCceEEEEecccccc
Confidence 47889999999976554
Done!