Query 033212
Match_columns 125
No_of_seqs 31 out of 33
Neff 2.0
Searched_HMMs 46136
Date Fri Mar 29 11:11:20 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/033212.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/033212hhsearch_cdd -cpu 12 -v 0
 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 PF09707 Cas_Cas2CT1978: CRISP   36.8      13 0.00028   26.4   0.2   26     38-63   16-41 (86)
  2 TIGR01873 cas_CT1978 CRISPR-as  36.6      12 0.00026   26.8   0.0   28     38-65   16-43 (87)
  3 PRK11558 putative ssRNA endonu  31.0      17 0.00037   26.5   0.1   28     38-65   18-45 (97)
  4 PF15386 Tantalus: Drosophila    27.2      28 0.00062   23.6   0.7   13     61-73   36-48 (61)
  5 PF14009 DUF4228: Domain of un   24.7      29 0.00064   24.1   0.4   11     58-68 171-181 (181)
  6 PF15509 DUF4650: Domain of un   16.5 1.1E+02  0.0024   28.4   2.3   23     15-37 312-334 (520)
  7 KOG2893 Zn finger protein [Gen  15.1      40 0.00086   29.5  -0.8   20      1-20   25-44 (341)
  8 PLN03121 nucleic acid binding   13.3      63  0.0014   27.0  -0.1   16    99-114 204-219 (243)
  9 cd07051 BMC_like_1_repeat1 Bac  11.4 1.2E+02  0.0025   23.1   0.9   13     40-52   33-45 (111)
 10 PF11923 DUF3441: Domain of un   11.2      59  0.0013   23.7  -0.8   17     28-44   54-70 (112)
No 1
>PF09707 Cas_Cas2CT1978: CRISPR-associated protein (Cas_Cas2CT1978); InterPro: IPR010152 Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against bacteriophages, possibly acting with an RNA interference-like mechanism to inhibit gene functions of invasive DNA elements. Differences in the number and type of spacers between CRISPR repeats correlate with phage sensitivity. It is thought that following phage infection, bacteria integrate new spacers derived from phage genomic sequences, and that the removal or addition of particular spacers modifies the phage-resistance phenotype of the cell. Therefore, the specificity of CRISPRs may be determined by spacer-phage sequence similarity. In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny. This entry represents a minor branch of the Cas2 family of CRISPR-associated protein which are found in IPR003799 from INTERPRO. Cas2 is one of four protein families (Cas1 to Cas4) that are associated with CRISPR elements and always occur near a repeat cluster, usually in the order cas3-cas4-cas1-cas2. The function of Cas2 (and Cas1) is unknown. Cas3 proteins appear to be helicases while Cas4 proteins resemble RecB-type exonucleases, suggesting that these genes are involved in DNA metabolism or gene expression.
Probab=36.80 E-value=13 Score=26.38 Aligned_cols=26 Identities=27% Similarity=0.520 Sum_probs=20.7
Q ss_pred hcccceeeecCCceeeeeeccccccc
Q 033212 38 RITRFVMEVAPPQFVSVMRHRTAKML 63 (125)
Q Consensus 38 r~ARf~mEVAPPq~ISv~RrR~s~mL 63 (125)
..+||++|++|--||.-+-.|....|
T Consensus 16 ~Ltrwl~Ei~~GVyVg~~s~rVRe~l 41 (86)
T PF09707_consen 16 FLTRWLLEIRPGVYVGNVSARVRERL 41 (86)
T ss_pred hhhheeEecCCCcEEcCCCHHHHHHH
Confidence 57899999999999997666665443
No 2
>TIGR01873 cas_CT1978 CRISPR-associated endoribonuclease Cas2, E. coli subfamily. CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This model represents a minor branch of the Cas2 family of CRISPR-associated endonuclease, whereas most Cas2 proteins are modeled instead by TIGR01573. This form of Cas2 is characteristic for the E. coli subtype of CRISPR/Cas locus.
Probab=36.61 E-value=12 Score=26.80 Aligned_cols=28 Identities=11% Similarity=0.181 Sum_probs=21.9
Q ss_pred hcccceeeecCCceeeeeeccccccccc
Q 033212 38 RITRFVMEVAPPQFVSVMRHRTAKMLDT 65 (125)
Q Consensus 38 r~ARf~mEVAPPq~ISv~RrR~s~mLDT 65 (125)
+.+||++|++|--||.-+..|...+|=.
T Consensus 16 ~Lt~wllEv~~GVyVg~~s~rVRe~lW~ 43 (87)
T TIGR01873 16 RLALWLLEPRAGVYVGGVSASVRERIWD 43 (87)
T ss_pred hhhhheeecCCCcEEcCCCHHHHHHHHH
Confidence 5799999999999999766666655433
No 3
>PRK11558 putative ssRNA endonuclease; Provisional
Probab=31.03 E-value=17 Score=26.47 Aligned_cols=28 Identities=21% Similarity=0.388 Sum_probs=21.8
Q ss_pred hcccceeeecCCceeeeeeccccccccc
Q 033212 38 RITRFVMEVAPPQFVSVMRHRTAKMLDT 65 (125)
Q Consensus 38 r~ARf~mEVAPPq~ISv~RrR~s~mLDT 65 (125)
+++||++|++|--||.-+..|...+|=.
T Consensus 18 ~Lt~wllEv~~GVyVg~~S~rVRd~lW~ 45 (97)
T PRK11558 18 RLAVWLLEVRAGVYVGDVSRRIREMIWQ 45 (97)
T ss_pred hhhhheEecCCCcEEcCCCHHHHHHHHH
Confidence 5799999999999999766666554433
No 4
>PF15386 Tantalus: Drosophila Tantalus-like
Probab=27.21 E-value=28 Score=23.60 Aligned_cols=13 Identities=54% Similarity=0.603 Sum_probs=10.0
Q ss_pred ccccchhhhhhcc
Q 033212 61 KMLDTISEEEKDA 73 (125)
Q Consensus 61 ~mLDTIaEEdrE~ 73 (125)
..||||-||-.+.
T Consensus 36 ~~LETIfEEp~~~ 48 (61)
T PF15386_consen 36 KNLETIFEEPKNE 48 (61)
T ss_pred CCcchhhcccccc
Confidence 4899999985543
No 5
>PF14009 DUF4228: Domain of unknown function (DUF4228)
Probab=24.73 E-value=29 Score=24.07 Aligned_cols=11 Identities=27% Similarity=0.323 Sum_probs=9.1
Q ss_pred cccccccchhh
Q 033212 58 RTAKMLDTISE 68 (125)
Q Consensus 58 R~s~mLDTIaE 68 (125)
..++.||||.|
T Consensus 171 ~WrP~LesI~E 181 (181)
T PF14009_consen 171 SWRPALESIPE 181 (181)
T ss_pred CccCCCCCcCc
Confidence 44689999987
No 6
>PF15509 DUF4650: Domain of unknown function (DUF4650)
Probab=16.46 E-value=1.1e+02 Score=28.44 Aligned_cols=23 Identities=43% Similarity=0.462 Sum_probs=18.2
Q ss_pred HHHHHHHHHHHhhchhhHHhhhh
Q 033212 15 LCHKLLSLIVLKLNQKKKKMANS 37 (125)
Q Consensus 15 ~~~~~~~~~~~~~~~~~~~MA~s 37 (125)
-.|||---..-||+.||||.|..
T Consensus 312 q~hKLRLKLLKKLKAKKkKLAsL 334 (520)
T PF15509_consen 312 QTHKLRLKLLKKLKAKKKKLASL 334 (520)
T ss_pred hHHHHHHHHHHHHHHhHHHHHHH
Confidence 35888666667899999999864
No 7
>KOG2893 consensus Zn finger protein [General function prediction only]
Probab=15.05 E-value=40 Score=29.50 Aligned_cols=20 Identities=45% Similarity=0.687 Sum_probs=15.5
Q ss_pred CccccccCcCCchhHHHHHH
Q 033212 1 MLIQNQKSTFPSSVLCHKLL 20 (125)
Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~ 20 (125)
.|||.||...=.--+|||-|
T Consensus 25 iliqhqkakhfkchichkkl 44 (341)
T KOG2893|consen 25 ILIQHQKAKHFKCHICHKKL 44 (341)
T ss_pred hhhhhhhhccceeeeehhhh
Confidence 47899988777777888754
No 8
>PLN03121 nucleic acid binding protein; Provisional
Probab=13.28 E-value=63 Score=27.03 Aligned_cols=16 Identities=31% Similarity=0.491 Sum_probs=12.7
Q ss_pred cchhhhcccccccccc
Q 033212 99 SASAVVAGSKNFLRGV 114 (125)
Q Consensus 99 sa~~a~~ns~~f~~~v 114 (125)
.|-.+..|++||.+|-
T Consensus 204 ~a~sai~~~~Y~~~Ga 219 (243)
T PLN03121 204 AAANAVVNSSYFSKGA 219 (243)
T ss_pred Hhhhhhhhcchhhcch
Confidence 4456788999999985
No 9
>cd07051 BMC_like_1_repeat1 Bacterial Micro-Compartment (BMC)-like domain 1 repeat 1. BMC-like domains exist in cyanobacteria, proteobacteria, and actinobacteria and are homologs of the carboxysome shell proteins. They might be encoded from putative organelles involved in an unknown metabolic process. Although it has been suggested that these carboxysome shell protein homologs form hexamers and further assemble into the flat facets of the polyhedral bacterial organelle shell, at present no experimental evidence exists to directly support this view. Proteins in this CD contain two tandem BMC domains. This CD includes repeat 1 (the first BMC domain of BMC like 1 proteins).
Probab=11.39 E-value=1.2e+02 Score=23.11 Aligned_cols=13 Identities=31% Similarity=0.434 Sum_probs=9.9
Q ss_pred ccceeeecCCcee
Q 033212 40 TRFVMEVAPPQFV 52 (125)
Q Consensus 40 ARf~mEVAPPq~I 52 (125)
|-.|+||||---|
T Consensus 33 a~l~iEvsPG~~I 45 (111)
T cd07051 33 ASLWIEVAPGLAI 45 (111)
T ss_pred eEEEEEeccchhH
Confidence 5579999997544
No 10
>PF11923 DUF3441: Domain of unknown function (DUF3441); InterPro: IPR021846 This presumed domain is functionally uncharacterised. This domain is found in archaea and eukaryotes. This domain is typically between 104 and 119 amino acids in length. This domain is found associated with PF05833 from PFAM, PF05670 from PFAM. This domain has two conserved residues (P and G) that may be functionally important.
Probab=11.20 E-value=59 Score=23.71 Aligned_cols=17 Identities=35% Similarity=0.386 Sum_probs=15.6
Q ss_pred chhhHHhhhhhccccee
Q 033212 28 NQKKKKMANSRITRFVM 44 (125)
Q Consensus 28 ~~~~~~MA~sr~ARf~m 44 (125)
|+||.|.|+..+.+|..
T Consensus 54 ~~KKGKaak~il~~f~~ 70 (112)
T PF11923_consen 54 NAKKGKAAKEILEYFTA 70 (112)
T ss_pred CcchHHHHHHHHHHHHh
Confidence 78999999999999965
Done!