Query         T0603 CRISPR associated protein Cas1 , E.coli, 305 residues
Match_columns 305
No_of_seqs    207 out of 468
Neff          6.3
Searched_HMMs 11830
Date          Mon Jul 5 09:06:40 2010
Command       /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/bin/hhsearch -i /home/syshi_3/CASP9/HHsearch4Targetseq/seq/T0603.hhm -d /home/syshi_2/2008/ferredoxin/manualcheck/update/HHsearch/database/pfamA_24_hhmdb -o /home/syshi_3/CASP9/HHsearch4Targetseq/pfamAsearch/T0603.hhr

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 PF01867 Cas_Cas1: CRISPR asso  100.0       0       0  367.0  16.4  237   17-257     1-280 (316)
  2 PF07085 DRTGG: DRTGG domain;    49.2     5.5 0.00047   16.5   3.0   53   29-81     40-94  (105)
  3 PF00491 Arginase: Arginase fa   42.6     7.1  0.0006   15.8   2.7   36   48-83     77-115 (273)
  4 PF10820 DUF2543: Protein of u   40.8     7.9 0.00067   15.5   3.2   58  216-273     9-76  (81)
  5 PF03983 SHD1: SLA1 homology d   40.6       8 0.00068   15.4   3.2   40   12-51     16-55  (70)
  6 PF02951 GSH-S_N: Prokaryotic    32.3      11 0.00091   14.6   5.0   64   13-76     30-112 (119)
  7 PF02606 LpxK: Tetraacyldisacc   30.4      11 0.00092   14.6   2.0   41   40-80     31-73  (326)
  8 PF04283 CheF-arch: Chemotaxis   29.4      12   0.001   14.3   3.3   33   21-54     26-58  (221)
  9 PF04010 DUF357: Protein of un   27.2      11 0.00096   14.5   1.7   14  172-185    48-61  (75)
 10 PF00951 Arteri_Gl: Arteriviru   21.8      17  0.0014   13.4   2.4   42  175-218    47-88  (179)

No 1
>PF01867 Cas_Cas1: CRISPR associated protein Cas1; InterPro: IPR002729 This family of proteins are found in archaea and bacteria and are, as yet, functionally uncharacterised. It is one of four protein families in prokaryotic genomes that contain multiple CRISPR elements. CRISPR is an acronym for Clustered Regularly Interspaced Short Palindromic Repeats. The cas genes are found near the repeats . This protein is otherwise uncharacterised.; PDB: 3god_D 2yzs_A.
Probab=100.00  E-value=0  Score=366.98  Aligned_cols=237  Identities=26%  Similarity=0.372  Sum_probs=205.8

Q ss_pred             EEEE-EE-EEEEEECCEEEEEECCCEEEECCCCCCCEEEECCCCEECHHHHHHHHHCCCEEEEECCCCCEEEECCCCCCC
Q ss_conf             6888-74-389985778999876865786161003566861773036889999997598899988987065512777764
Q T0603            17 MIFL-QY-GQIDVIDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEAGVRVYASGQPGGA 94 (305)
Q Consensus        17 ~lYl-e~-g~i~~~~~~l~i~~~~g~~~~IPi~~i~~IlL~~gvsIT~~al~~la~~Gi~V~f~g~~G~~~y~~~~~~~~ 94 (305)
                      .||+ ++ ++|++++++++|+++++.+..||+++|++|+++|+++||++||+.|+++||+|+||+++|.+++...++.+.
T Consensus         1 TLyv~~~g~~L~~~~~~l~v~~~~~~~~~iPl~~I~~Ivi~g~v~iSt~ai~~l~~~gI~v~f~~~~G~~~g~l~p~~~~ 80 (316)
T PF01867_consen    1 TLYVTEQGAYLSKKGGRLVVEKKGEIKKEIPLEDIDQIVIFGGVSISTAAIRLLAENGIPVVFLDENGRPYGRLYPPYNR 80 (316)
T ss_dssp             -EEE--S-EEEEEETTEEEEEECS-EEEEC-CCCCCEEEE-S--EEEHHHHHHHHH---BEEEB-----BSEEECTSS--
T ss_pred             CEEEEECCEEEEEECCEEEEEECCCEEEEECHHHHCEEEEECCCEECHHHHHHHHHCCCEEEEECCCCCEEEEEECCCCC
T ss_conf             98997689289998999999999957899707880789994896487999999998799899989899378998578898

Q ss_pred             CHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHCC-------------------------CHHHHHHHHHHHHHHHHHHHHH
Q ss_conf             089999999983598899999999999751001-------------------------4234108988516799999999
Q T0603            95 RSDKLLYQAKLALDEDLRLKVVRKMFELRFGEP-------------------------APARRSVEQLRGIEGSRVRATY 149 (305)
Q Consensus        95 ~~~~l~~Q~~~~~d~~~RL~vAR~m~~~R~~~~-------------------------~~~~~sie~LrG~EG~~ar~yy 149 (305)
                      ++..+++|++++.|+++|+.+||+|+..|+.+.                         +....++++|||+||.+++.||
T Consensus        81 ~~~~~~~Q~~~~~~~~~rl~iAr~ii~~Ki~nq~~~L~~~~~~~~~~~~i~~l~~~~~~~~~~~~~~l~g~Eg~aa~~Yf 160 (316)
T PF01867_consen   81 NVQLRRAQYQAYDDEEFRLAIAREIIKGKIRNQRALLKRYSKNRELSEAIDELEQIEELENAVSIDELRGIEGQAARIYF 160 (316)
T ss_dssp             -SHHHHHHHHHCCSHHHHHHHHHHHHHHHHHHHHHHHSHHHHHH-HHHHHHHHHHHHHHHH--SHHHHHHHHHHHHHHHH
T ss_pred             CHHHHHHHHHHHCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHHH
T ss_conf             64999999986059268999999999999850999999850344531113457669988756898787779999999999

Q ss_pred             HHHHHHHH--CCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCEEEECCCCC--CEEECCHHHHHH
Q ss_conf             9999871--885345247888888565366899999999999999999970888431256648987--513000555335
Q T0603           150 ALLAKQY--GVTWNGRRYDPKDWEKGDTINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPL--SFVYDIADIIKF 225 (305)
Q Consensus       150 ~~~a~~~--~~~~~gR~~~r~~~~~~DpvNa~LS~gyslLy~~~~~aI~~~Gl~P~lGflH~g~~~--Slv~DiADlfK~ 225 (305)
                      ..+++.+  ++.|.+|.+    +||.||+|++|||||++||+.|.+||+++||||++||+|+++++  |||||+||+|||
T Consensus       161 ~~l~~~~~~~~~F~~R~~----rp~~D~vNa~LsygY~iL~~~v~~ai~~~GLdP~~G~lH~~~~~~~sL~~DLmE~fRp 236 (316)
T PF01867_consen  161 EALFQLLPPGFGFSGRNR----RPPEDPVNALLSYGYAILYSEVARAIVAAGLDPYIGFLHEPRYGRPSLALDLMEPFRP 236 (316)
T ss_dssp             HHHHHHTT-----SS-------SS--SHHHHHHH---CCCHHHHHHHHCCSS--TT--SSS-------HHHHHHHCCCHC
T ss_pred             HHHHHHHCCCCCCCCCCC----CCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCEEECCCCCCCCHHHHHHHHHHH
T ss_conf             999998434678788777----9999878899999999999999999998599978765668999997159886887506

Q ss_pred             HHHHHHHHHHHHCCCCCH------------HHHHHHHHHHHHHH
Q ss_conf             666789999861487772------------48999999999986
Q T0603           226 DTVVPKAFEIARRNPGEP------------DREVRLACRDIFRS 257 (305)
Q Consensus       226 ~i~~p~aF~~~~~~~~~~------------e~~~R~~~r~~~~~ 257 (305)
                      .+||+++|++++++..++            ..+.|+.+...|.+
T Consensus       237 ~~vD~~v~~l~~~~~~~~~dF~~~~~~~~L~~~~rk~~~~~~~~ 280 (316)
T PF01867_consen  237 VIVDRLVFRLINRGEIKPEDFEKRENGCYLNKEGRKKFIKAFEE 280 (316)
T ss_dssp             CCCHHHHHHHHHTTSSHGHHCCCCTTEEEE-----HHHHHHHHH
T ss_pred             HHHHHHHHHHHHHCCCCHHHHCCCCCEEEECHHHHHHHHHHHHH
T ss_conf             87999999998726888788410188699758999999999998

No 2
>PF07085 DRTGG: DRTGG domain; InterPro: IPR010766 This presumed domain is about 120 amino acids in length. It is found associated with CBS domains IPR000644 from INTERPRO, as well as the CbiA domain IPR002586 from INTERPRO. The function of this domain is unknown. It is named the DRTGG domain after some of the most conserved residues. This domain may be very distantly related to a pair of CBS domains. There are no significant sequence similarities, but its length and association with CBS domains supports this idea. ; PDB: 2ioj_B.
Probab=49.19  E-value=5.5  Score=16.49  Aligned_cols=53  Identities=17%  Similarity=0.202  Sum_probs=37.5

Q ss_pred             CCEEEEEECCCEEEE--CCCCCCCEEEECCCCEECHHHHHHHHHCCCEEEEECCC
Q ss_conf             778999876865786--16100356686177303688999999759889998898
Q T0603            29 DGAFVLIDKTGIRTH--IPVGSVACIMLEPGTRVSHAAVRLAAQVGTLLVWVGEA 81 (305)
Q Consensus        29 ~~~l~i~~~~g~~~~--IPi~~i~~IlL~~gvsIT~~al~~la~~Gi~V~f~g~~ 81 (305)
                      .|.++++..+-..+.  -=.+.+++|+|-+|..++...++++-+.|++|.....+
T Consensus        40 ~g~lvIt~gdR~di~~~a~~~~i~~iIltgg~~~~~~vl~la~~~~ipvl~t~~d 94 (105)
T PF07085_consen   40 EGDLVITPGDREDIQLAAIEAGIAGIILTGGLEPDEEVLELAKEKGIPVLSTPYD 94 (105)
T ss_dssp             TSEEEEE----HHHHHHHTT-TEEEEEE-------HHHHHHHHH----EEE----
T ss_pred             CCEEEEECCCCHHHHHHHHHHCCCEEEEECCCCCCHHHHHHHHHCCCEEEEECCC
T ss_conf             9819998688599999999858989999299998999999997779869997687

No 3
>PF00491 Arginase: Arginase family; InterPro: IPR006035 The ureohydrolase superfamily includes arginase (3.5.3.1 from EC), agmatinase (3.5.3.11 from EC), formiminoglutamase (3.5.3.8 from EC) and proclavaminate amidinohydrolase (3.5.3.22 from EC) . These enzymes share a 3-layer alpha-beta-alpha structure , , , and play important roles in arginine/agmatine metabolism, the urea cycle, histidine degradation, and other pathways.
Arginase, which catalyses the conversion of arginine to urea and ornithine, is one of the five members of the urea cycle enzymes that convert ammonia to urea as the principal product of nitrogen excretion . There are several arginase isozymes that differ in catalytic, molecular and immunological properties. Deficiency in the liver isozyme leads to argininemia, which is usually associated with hyperammonemia. Agmatinase hydrolyses agmatine to putrescine, the precursor for the biosynthesis of higher polyamines, spermidine and spermine. In addition, agmatine may play an important regulatory role in mammals. Formiminoglutamase catalyses the fourth step in histidine degradation, acting to hydrolyse N-formimidoyl-L-glutamate to L-glutamate and formamide. Proclavaminate amidinohydrolase is involved in clavulanic acid biosynthesis. Clavulanic acid acts as an inhibitor of a wide range of beta-lactamase enzymes that are used by various microorganisms to resist beta-lactam antibiotics. As a result, this enzyme improves the effectiveness of beta-lactamase antibiotics . ; GO: 0016813 hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds, in linear amidines, 0046872 metal ion binding; PDB: 1woi_F 1wog_B 1woh_D 2a0m_A 1xfk_A 1gq7_E 1gq6_C 2ef5_A 2eiv_F 2ef4_A ....
Probab=42.56  E-value=7.1  Score=15.77  Aligned_cols=36  Identities=14%  Similarity=0.069  Sum_probs=30.4

Q ss_pred             CCCEEEECCCCEECHHHHHHHHHC---CCEEEEECCCCC
Q ss_conf             035668617730368899999975---988999889870
Q T0603            48 SVACIMLEPGTRVSHAAVRLAAQV---GTLLVWVGEAGV 83 (305)
Q Consensus        48 ~i~~IlL~~gvsIT~~al~~la~~---Gi~V~f~g~~G~ 83 (305)
                      ..-.|+|||..++|-.+++.+.+.   .+.|+|++.+.-
T Consensus        77 ~~~pi~lGGdhsit~~~~~al~~~~~~~~~vI~~DAH~D 115 (273)
T PF00491_consen   77 GKFPIVLGGDHSITYGAIRALARAYGGPIGVIHFDAHPD 115 (273)
T ss_dssp             CCEEEE----GGGGHHHHHHHTTCHSTTEEEEEESSS--
T ss_pred             CCEEEECCCCCCCCHHHHHHHHHHCCCCEEEEEECCCCC
T ss_conf             998997389630033777889987089769999716767

No 4
>PF10820 DUF2543: Protein of unknown function (DUF2543)
Probab=40.83  E-value=7.9  Score=15.46  Aligned_cols=58  Identities=21%  Similarity=0.222  Sum_probs=29.9

Q ss_pred             EECCHHHHHHHHHHHHH----------HHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHC
Q ss_conf             30005553356667899----------99861487772489999999999862078999999999836
Q T0603           216 VYDIADIIKFDTVVPKA----------FEIARRNPGEPDREVRLACRDIFRSSKTLAKLIPLIEDVLA 273 (305)
Q Consensus       216 v~DiADlfK~~i~~p~a----------F~~~~~~~~~~e~~~R~~~r~~~~~~~~l~~~i~~i~~ll~ 273 (305)
                      -|||+|.|-.+..-|++          |.+.-....+-|-.-..+-.+.-++..+-++-|+||.+.|.
T Consensus         9 yyDi~dEYatE~a~PV~~~Er~~LA~YFQlLitRL~nneEIsEeAQ~EMA~eAgi~~~RIDdIA~FLN 76 (81)
T PF10820_consen    9 YYDIVDEYATETAEPVSEAERDALAHYFQLLITRLMNNEEISEEAQQEMAREAGIDERRIDDIANFLN 76 (81)
T ss_pred             HHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHHHHHHHCCCCHHHHHHHHHHHH
T ss_conf             65558888776236610667648999999999998150753399999999982997776899999998

No 5
>PF03983 SHD1: SLA1 homology domain 1, SHD1 ; InterPro: IPR007131 The SLA1 homology domain is found in the cytoskeleton assembly control protein SLA1, which is responsible for the correct formation of the actin cytoskeleton.; PDB: 2hbp_A.
Probab=40.56  E-value=8  Score=15.43  Aligned_cols=40  Identities=18%  Similarity=0.354  Sum_probs=32.2

Q ss_pred             CCCCCEEEEEEEEEEEECCEEEEEECCCEEEECCCCCCCE
Q ss_conf             0475268887438998577899987686578616100356
Q T0603            12 KDRVSMIFLQYGQIDVIDGAFVLIDKTGIRTHIPVGSVAC 51 (305)
Q Consensus        12 ~dr~s~lYle~g~i~~~~~~l~i~~~~g~~~~IPi~~i~~ 51 (305)
                      .|+-.-.-+|---|...++.+.+...+|.++.||++.++.
T Consensus        16 tD~tG~F~VeA~fl~~~dgkV~L~k~nG~~I~VP~~kLS~ 55 (70)
T PF03983_consen   16 TDRTGKFKVEAEFLGLNDGKVHLHKTNGVKISVPLAKLSD 55 (70)
T ss_dssp             EB-----EEEEEE-------EEEE-----EEEE-TTSB-H
T ss_pred             ECCCCCEEEEEEEEEECCCEEEEEECCCEEEEEEHHHCCH
T ss_conf             8389988899999886298999993499499979599598

No 6
>PF02951 GSH-S_N: Prokaryotic glutathione synthetase, N-terminal domain; InterPro: IPR004215 Prokaryotic glutathione synthetase 6.3.2.3 from EC (glutathione synthase) catalyses the conversion of gamma-L-glutamyl-L-cysteine and glycine to orthophosphate and glutathione in the presence of ATP. This is the second step in glutathione biosynthesis. The enzyme is inhibited by 7,8-dihydrofolate, methotrexate and trimethoprim. This domain is the N-terminus of the enzyme.; GO: 0004363 glutathione synthase activity, 0006750 glutathione biosynthetic process; PDB: 1gsh_A 1gsa_A 1glv_A 2glt_A.
Probab=32.26  E-value=11  Score=14.59  Aligned_cols=64  Identities=19%  Similarity=0.387  Sum_probs=37.9

Q ss_pred             CCCCEEEEEEEEEEEECCEEEEE-------EC------CCEEEECCCCCCCEEEEC--CCCE----ECHHHHHHHHHCCC
Q ss_conf             47526888743899857789998-------76------865786161003566861--7730----36889999997598
Q T0603            13 DRVSMIFLQYGQIDVIDGAFVLI-------DK------TGIRTHIPVGSVACIMLE--PGTR----VSHAAVRLAAQVGT 73 (305)
Q Consensus        13 dr~s~lYle~g~i~~~~~~l~i~-------~~------~g~~~~IPi~~i~~IlL~--~gvs----IT~~al~~la~~Gi 73 (305)
                      -.+.+.|.+.+-|..++|.+...       +.      -+....+|++++++|++=  |-..    .++..+..+.+.|+
T Consensus        30 RGh~v~~~~~~dL~~~~g~~~a~~~~v~~~~~~~~~~~~~~~~~~~L~~~Dvv~mRkDPPfD~~yi~aT~lLe~ae~~g~ 109 (119)
T PF02951_consen   30 RGHEVFYYEPGDLSLRDGRVYARARPVTVRDDKGDWYKLGEEERLPLSEFDVVLMRKDPPFDMEYIYATYLLELAERQGV 109 (119)
T ss_dssp             ---EEEEE----EEE----EEEEEEEEEE-S-SS--EEE---EEEEGGGSSEEEBE--S---HHHHHHHHHHHHHHH---
T ss_pred             CCCEEEEEECCCEEEECCEEEEEEEEEEEECCCCCCEECCCCEECCCCCCCEEEEECCCCCCCHHHHHHHHHHHHCCCCC
T ss_conf             69999999437489989999999999998258878176178578770009999991698997068999999986540995

Q ss_pred             EEE
Q ss_conf             899
Q T0603            74 LLV 76 (305)
Q Consensus        74 ~V~ 76 (305)
                      .|+
T Consensus       110 ~Vv 112 (119)
T PF02951_consen  110 LVV 112 (119)
T ss_dssp             EEE
T ss_pred             EEE
T ss_conf             999

No 7
>PF02606 LpxK: Tetraacyldisaccharide-1-P 4'-kinase; InterPro: IPR003758 Tetraacyldisaccharide 4'-kinase 2.7.1.130 from EC phosphorylates the 4'-position of a tetraacyldisaccharide 1-phosphate precursor (DS-1-P) of lipid A, but the enzyme has not yet been purified because of instability . This enzyme is involved in the synthesis of lipid A portion of the bacterial lipopolysaccharide layer (LPS). ; GO: 0009029 tetraacyldisaccharide 4'-kinase activity, 0009245 lipid A biosynthetic process
Probab=30.36  E-value=11  Score=14.58  Aligned_cols=41  Identities=27%  Similarity=0.216  Sum_probs=35.0

Q ss_pred             EEEECCCCCCCEEEECC-C-CEECHHHHHHHHHCCCEEEEECC
Q ss_conf             57861610035668617-7-30368899999975988999889
Q T0603            40 IRTHIPVGSVACIMLEP-G-TRVSHAAVRLAAQVGTLLVWVGE 80 (305)
Q Consensus        40 ~~~~IPi~~i~~IlL~~-g-vsIT~~al~~la~~Gi~V~f~g~ 80 (305)
                      .+..+|+-.|+-|..|| | |-++..+++.|.++|..+..++.
T Consensus        31 ~~~~vPVIsVGNitvGGTGKTP~v~~L~~~L~~~G~~~~IlSR 73 (326)
T PF02606_consen   31 YRLPVPVISVGNITVGGTGKTPLVIWLARLLKARGYRPAILSR 73 (326)
T ss_pred             CCCCCCEEEECCCCCCCCCHHHHHHHHHHHHHHCCCCEEEECC
T ss_conf             6799989998881169998589999999999976993599358

No 8
>PF04283 CheF-arch: Chemotaxis signal transduction system protein F from archaea; InterPro: IPR007381 This is an archaeal protein of unknown function.
Probab=29.44  E-value=12  Score=14.28  Aligned_cols=33  Identities=27%  Similarity=0.425  Sum_probs=27.2

Q ss_pred             EEEEEEEECCEEEEEECCCEEEECCCCCCCEEEE
Q ss_conf             7438998577899987686578616100356686
Q T0603            21 QYGQIDVIDGAFVLIDKTGIRTHIPVGSVACIML 54 (305)
Q Consensus        21 e~g~i~~~~~~l~i~~~~g~~~~IPi~~i~~IlL 54 (305)
                      ..|++-..+.+|++...+++ ..||...|..|-.
T Consensus        26 ~~~ri~Ls~~RlvL~~~~~k-~tIpls~I~Di~~ 58 (221)
T PF04283_consen   26 RKGRIVLSNDRLVLAGNDGK-RTIPLSSIEDIGV 58 (221)
T ss_pred             EEEEEEEECCEEEEEECCCE-EEEECCCEEECCC
T ss_conf             79999996267999807980-8971304375035

No 9
>PF04010 DUF357: Protein of unknown function (DUF357); InterPro: IPR007155 Members of this entry are short (less than 100 amino acids) proteins found in archaebacteria. The function of these proteins is unknown.; PDB: 2pmr_A 2oo2_A.
Probab=27.19  E-value=11  Score=14.46  Aligned_cols=14  Identities=36%  Similarity=0.411  Sum_probs=10.3

Q ss_pred             CCCHHHHHHHHHHH
Q ss_conf             56536689999999
Q T0603           172 KGDTINQCISAATS 185 (305)
Q Consensus       172 ~~DpvNa~LS~gys 185 (305)
                      .+|++||+.++.|+
T Consensus        48 ~gD~v~Ala~~sYa 61 (75)
T PF04010_consen   48 KGDYVNALASFSYA 61 (75)
T ss_dssp             ---HHHHHHHHHHH
T ss_pred             CCCHHHHHHHHHHH
T ss_conf             78889999999999

No 10
>PF00951 Arteri_Gl: Arterivirus GL envelope glycoprotein; InterPro: IPR001332 Arteriviruses encode four envelope proteins, GL, GS, M and N. GL envelope glycoprotein is heterogenously glycosylated with N-acetyllactosamine in a cell-type-specific manner. The GL glycoprotein expresses the neutralization determinants .; GO: 0019031 viral envelope
Probab=21.83  E-value=17  Score=13.38  Aligned_cols=42  Identities=19%  Similarity=0.109  Sum_probs=18.8

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHHCCCCCCCEEEECCCCCCEEEC
Q ss_conf             36689999999999999999997088843125664898751300
Q T0603           175 TINQCISAATSCLYGVTEAAILAAGYAPAIGFVHTGKPLSFVYD 218 (305)
Q Consensus       175 pvNa~LS~gyslLy~~~~~aI~~~Gl~P~lGflH~g~~~Slv~D 218 (305)
                      ++...++++--=.- ...-.+....+ ++.||.|...-.|=+|-
T Consensus        47 ~~t~i~s~~~lt~~-hfl~~~~~~~~-~~~gf~~~ryvls~~y~ 88 (179)
T PF00951_consen   47 VVTHILSLGFLTTS-HFLDFLGLSAV-SYAGFVHGRYVLSSAYA 88 (179)
T ss_pred             HHHHHHHCCHHHHH-HHHHHHHHHHH-HHHEEEECCEEHHHHHH
T ss_conf             87998743638888-99999889888-55214524087799999

Done!