Query         gi|254781007|ref|YP_003065420.1| hypothetical protein CLIBASIA_04540 [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 411
No_of_seqs    1 out of 3
Neff          1.0
Searched_HMMs 39220
Date          Mon May 30 03:00:40 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254781007.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols  Query HMM  Template HMM
  1 cd01473 vWA_CTRP CTRP for CS    52.9      17 0.00044   16.6   9.6  155  212-393     22-190 (192)
  2 cd01480 vWA_collagen_alpha_1-V  52.7      15 0.00037   17.1   2.9   63  302-378    109-171 (186)
  3 KOG0099 consensus               51.7      18 0.00046   16.5   3.3   46  271-338    325-370 (379)
  4 cd01475 vWA_Matrilin VWA_Matri  50.2      19 0.00048   16.3   3.7   46  344-391    134-181 (224)
  5 KOG2531 consensus               47.2      21 0.00053   16.0   3.1   47   73-119    114-166 (545)
  6 pfam07888 CALCOCO1 Calcium bin  46.7       6 0.00015   19.9   0.2   20  146-165    294-313 (546)
  7 COG4854 Predicted membrane pro  40.1      27 0.00068   15.3   3.4   39  317-355      44-82 (126)
  8 cd01481 vWA_collagen_alpha3-VI  38.8      28 0.00071   15.2   3.0   32  344-377    132-163 (165)
  9 cd01474 vWA_ATR ATR (Anthrax T  37.9      29 0.00073   15.1   6.7   50  344-395    133-183 (185)
 10 TIGR01271 CFTR_protein cystic   37.4      29 0.00074   15.0   6.7   36    56-93  1099-1134(1534)
 11 PRK03958 tRNA 2'-O-methylase;   32.1      35 0.00089   14.4   4.0   61  346-410       3-67 (175)
 12 KOG0829 consensus               30.7      37 0.00094   14.3   2.6   47  315-361     78-132 (169)
 13 TIGR02599 TIGR02599 Verrucomic  30.5      18 0.00047   16.4   0.6  151   32-195     12-185 (396)
 14 cd01469 vWA_integrins_alpha_su  30.0      38 0.00097   14.2   7.2   36  343-378    130-170 (177)
 15 cd01482 vWA_collagen_alphaI-XI  29.4      39 0.00099   14.1   2.6   27  344-372    129-155 (164)
 16 cd00049 MH1 MH1 is a small DNA  28.8      22 0.00057   15.8   0.8   14  222-235    107-120 (121)
 17 pfam07727 RVT_2 Reverse transc  27.5     5.5 0.00014   20.1  -2.5   49    30-80     68-132 (246)
 18 KOG0090 consensus               27.3      28 0.00072   15.1   1.1   27  302-329    152-178 (238)
 19 TIGR00955 3a01204 Pigment prec  27.1      17 0.00044   16.6  -0.1   28  212-241    463-490 (671)
 20 smart00523 DWA Domain A in dwa  25.7      24 0.00061   15.6   0.4   15  221-235     92-106 (109)
 21 pfam03165 MH1 MH1 domain. The   24.8      28 0.00072   15.1   0.7   14  221-234     93-106 (107)
 22 TIGR00708 cobA cob(I)alamin ad  23.9      21 0.00054   16.0  -0.1   35    39-73    118-165 (191)
 23 pfam03542 Tuberin Tuberin. Tub  23.5      42  0.0011   13.9   1.4   25  368-401    328-352 (356)
 24 pfam01727 consensus             22.2      45  0.0012   13.6   1.3   25  277-303      13-37 (81)
 25 TIGR01405 polC_Gram_pos DNA po  20.9      34 0.00086   14.5   0.5   16  175-190    687-702 (1264)
 26 TIGR01524 ATPase-IIIB_Mg magne  20.8     7.9  0.0002   19.0  -2.8   65  129-199    312-383 (892)
 27 pfam09702 Cas_Csa5 CRISPR-asso  20.6      54  0.0014   13.1   1.4   29  310-346      13-41 (105)
 28 PRK12703 tRNA 2'-O-methylase;   20.6      56  0.0014   13.0   3.7   38  285-328    215-252 (339)

No 1
>cd01473 vWA_CTRP CTRP for CS protein-TRAP-related protein: Adhesion of Plasmodium to host cells is an important phenomenon in parasite invasion and in malaria associated pathology. CTRP encodes a protein containing a putative signal sequence followed by a long extracellular region of 1990 amino acids, a transmembrane domain, and a short cytoplasmic segment. The extracellular region of CTRP contains two separated adhesive domains. The first domain contains six 210-amino acid-long homologous VWA domain repeats. The second domain contains seven repeats of 87-60 amino acids in length, which share similarities with the thrombospondin type 1 domain found in a variety of adhesive molecules. Finally, CTRP also contains consensus motifs found in the superfamily of haematopoietin receptors. The VWA domains in these proteins likely mediate protein-protein interactions.
Probab=52.91  E-value=17  Score=16.63  Aligned_cols=155  Identities=19%  Similarity=0.268  Sum_probs=82.7

Q ss_pred             HHHHHHHCCCCCEECCCCHHHHHHHEEEEEEEEEECCCCCHHHHHCCCCCCCHHCCHHHHHHHHHHHHCCCCCCC-----
Q ss_conf             898876187432002720234434122112578625862125654186421100060899988874101346784-----
Q gi|254781007|r  212 VKNFLSQIPYKNFCMAPYHYSSILYWAVGTLTYSVDNKTTTREYYKDPYYATWDHFPYSFIKNVFDMTSNQFGDG-----  286 (411)
Q Consensus       212 vknflsqipyknfcmapyhyssilywavgtltysvdnktttreyykdpyyatwdhfpysfiknvfdmtsnqfgdg-----  286 (411)
                      +++|+..+- .+|-..|-.      --||-..||-.+++... .....+|   +  .-..++.+-.+...-+..|
T Consensus        22 v~~F~~~lv-~~f~Ig~~~------~rvgvv~yS~~~~~~~~-f~~~~~~---~--k~~~l~~i~~l~~~~~~gg~T~tg   88 (192)
T cd01473          22 VIPFTEKII-NNLNISKDK------VHVGILLFAEKNRDVVP-FSDEERY---D--KNELLKKINDLKNSYRSGGETYIV   88 (192)
T ss_pred             HHHHHHHHH-HHCCCCCCC------EEEEEEEECCCCCEEEE-CCCCCCC---C--HHHHHHHHHHHHHCCCCCCCCHHH
T ss_conf             999999999-875659896------19999995588740132-3554434---8--999999999987314689824799

Q ss_pred             EEE--ECCCCCCCCCCCCC--CEEEEEEECCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCEEEEEEEECCCCCCHHHH
Q ss_conf             264--13674344676677--07999973451003330289999999985998875114764168997404887420267
Q gi|254781007|r  287 QVL--TNTNHCFPHGASQN--KYMLMLAIGNQLSRSSVEKEKIEKVLQDCHYMHKRHRTGRDAITIFSVGFSPDQDTRYT  362 (411)
Q Consensus       287 qvl--tntnhcfphgasqn--kymlmlaignqlsrssvekekiekvlqdchymhkrhrtgrdaitifsvgfspdqdtryt  362 (411)
                      +.|  ...+.-...|+.++  |.++++.=|+.-+++...-+...+.|.            ...|+||.||-.--  .+-.
T Consensus        89 ~AL~~~~~~~~~~~g~R~~vpkv~IvlTDG~s~~~~~~~~~~~a~~lr------------~~gV~i~avGVg~~--~~~e  154 (192)
T cd01473          89 HALKYGLKNYTKHGNRRKDAPKVTMLFTDGNDTSASKKELQDISLLYK------------EENVKLLVVGVGAA--SENK  154 (192)
T ss_pred             HHHHHHHHHHCCCCCCCCCCCEEEEEEECCCCCCCCHHHHHHHHHHHH------------HCCCEEEEEEECCC--CHHH
T ss_conf             999999998634678888997499999569988731678999999999------------87978999980637--9999

Q ss_pred             HHHHCCCCHH-----EEEECCCCCHHHHHHHHHHHH
Q ss_conf             8763059032-----044258765344789877777
Q gi|254781007|r  363 LRQCASDPSK-----YYEINSDENVMPIAKSLARNV  393 (411)
Q Consensus       363 lrqcasdpsk-----yyeinsdenvmpiakslarnv  393 (411)
                      ||+-|+.|..     .|-..+=+++.+|.+.+.+++
T Consensus       155 L~~iag~~~~~~~c~~~~~~~fd~l~~i~~~l~~~v  190 (192)
T cd01473         155 LKLLAGCDINNDNCPNVIKTEWNNLNGISKFLTDKI  190 (192)
T ss_pred             HHHHHCCCCCCCCCCEEEECCHHHHHHHHHHHHHHH
T ss_conf             999869998899775799479789999999999972

No 2
>cd01480 vWA_collagen_alpha_1-VI-type VWA_collagen alpha(VI) type: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions.
Probab=52.72  E-value=15  Score=17.13  Aligned_cols=63  Identities=24%  Similarity=0.341  Sum_probs=37.7

Q ss_pred             CCCEEEEEEECCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCEEEEEEEECCCCCCHHHHHHHHCCCCHHEEEECC
Q ss_conf             77079999734510033302899999999859988751147641689974048874202678763059032044258
Q gi|254781007|r  302 QNKYMLMLAIGNQLSRSSVEKEKIEKVLQDCHYMHKRHRTGRDAITIFSVGFSPDQDTRYTLRQCASDPSKYYEINS  378 (411)
Q Consensus       302 qnkymlmlaignqlsrssvekekiekvlqdchymhkrhrtgrdaitifsvgfspdqdtrytlrqcasdpskyyeins  378 (411)
                      .+|-++.+.=|.  |.. -+...++...++..         +..|+||.||.....  .-.|++.||+|++.+-.++
T Consensus       109 ~~kvlvliTDG~--S~~-~~~~~~~~aa~~lr---------~~GV~ifaVGVG~~~--~~eL~~IAs~p~~~~~~~~  171 (186)
T cd01480         109 ENKFLLVITDGH--SDG-SPDGGIEKAVNEAD---------HLGIKIFFVAVGSQN--EEPLSRIACDGKSALYREN  171 (186)
T ss_pred             CCEEEEEEECCC--CCC-CCCHHHHHHHHHHH---------HCCCEEEEEEECCCC--HHHHHHHHCCCCCEEEECC
T ss_conf             853899984587--666-74066999999999---------879899999947488--7999998589973897368

No 3
>KOG0099 consensus
Probab=51.72  E-value=18  Score=16.51  Aligned_cols=46  Identities=37%  Similarity=0.741  Sum_probs=38.4

Q ss_pred             HHHHHHHHHCCCCCCCEEEECCCCCCCCCCCCCCEEEEEEECCCCCHHHHHHHHHHHHHHHHHHHHHH
Q ss_conf             99888741013467842641367434467667707999973451003330289999999985998875
Q gi|254781007|r  271 FIKNVFDMTSNQFGDGQVLTNTNHCFPHGASQNKYMLMLAIGNQLSRSSVEKEKIEKVLQDCHYMHKR  338 (411)
Q Consensus       271 fiknvfdmtsnqfgdgqvltntnhcfphgasqnkymlmlaignqlsrssvekekiekvlqdchymhkr  338 (411)
                      ||..-|--.|..-|||     ..+|+||=.                 ..|..|.|..|..||.-|-.|
T Consensus       325 fird~FlRiSta~~Dg-----~h~CYpHFT-----------------cAvDTenIrrVFnDcrdiIqr  370 (379)
T KOG0099         325 FIRDEFLRISTASGDG-----RHYCYPHFT-----------------CAVDTENIRRVFNDCRDIIQR  370 (379)
T ss_pred             HHHHHHHHHCCCCCCC-----CEECCCCEE-----------------EEECHHHHHHHHHHHHHHHHH
T ss_conf             4230677630346787-----530002346-----------------771538899998789999999

No 4
>cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity.
Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coil domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands.
Probab=50.22  E-value=19  Score=16.35  Aligned_cols=46  Identities=22%  Similarity=0.369  Sum_probs=32.1

Q ss_pred             CEEEEEEEECCCCCCHHHHHHHHCCCCH--HEEEECCCCCHHHHHHHHHH
Q ss_conf             4168997404887420267876305903--20442587653447898777
Q gi|254781007|r  344 DAITIFSVGFSPDQDTRYTLRQCASDPS--KYYEINSDENVMPIAKSLAR  391 (411)
Q Consensus       344 daitifsvgfspdqdtrytlrqcasdps--kyyeinsdenvmpiakslar  391 (411)
                      +.|.||.||..- - .+-.|++.||+|+  ..+.+++=+..-.++..|..
T Consensus       134 ~GV~ifaVGVg~-~-~~~eL~~IAs~P~~~hvf~v~~F~~l~~l~~~l~~  181 (224)
T cd01475         134 LGIEMFAVGVGR-A-DEEELREIASEPLADHVFYVEDFSTIEELTKKFQG  181 (224)
T ss_pred             CCCEEEEEECCC-C-CHHHHHHHHCCCCHHCEEEECCHHHHHHHHHHHHH
T ss_conf             798899996374-7-98999998559737568994798899999999876

No 5
>KOG2531 consensus
Probab=47.23  E-value=21  Score=16.04  Aligned_cols=47  Identities=26%  Similarity=0.196  Sum_probs=23.3

Q ss_pred             HHCCHHHHHHHHHHHHHHC---CCCCCCCCCC---CHHHHHHHHHEEECCCCC
Q ss_conf             7412235789999987400---3433334543---188874322113224673
Q gi|254781007|r   73 REAGRDNFIRHQIEKALNT---YNSRDLSNTG---SIESIVKDAVILTKNVNS  119 (411)
Q Consensus        73 reagrdnfirhqiekalnt---ynsrdlsntg---siesivkdaviltknvns  119 (411)
                      +.-.-+.+..+|+|.|+-.   -+-+|-|.|-   .+|.-|-.+--|.|-..|
T Consensus       114 ~~Ld~~~~L~eQle~aF~v~~sP~WmDsSTtkQC~ElE~~VGG~~~la~LTGS  166 (545)
T KOG2531         114 ESLDPEKSLHEQLESAFSVQTSPIWMDSSTTKQCQELEEAVGGAQELAKLTGS  166 (545)
T ss_pred             HCCCHHHHHHHHHHHHHCCCCCCCCCCCCHHHHHHHHHHHHCCHHHHHHHHCC
T ss_conf             63793367999998764146897622564088999999871648888776364

No 6
>pfam07888 CALCOCO1 Calcium binding and coiled-coil domain (CALCOCO1) like. Proteins found in this family are similar to the coiled-coil transcriptional coactivator protein coexpressed by Mus musculus (CoCoA/CALCOCO1). This protein binds to a highly conserved N-terminal domain of p160 coactivators, such as GRIP1, and thus enhances transcriptional activation by a number of nuclear receptors. CALCOCO1 has a central coiled-coil region with three leucine zipper motifs, which is required for its interaction with GRIP1 and may regulate the autonomous transcriptional activation activity of the C-terminal region.
Probab=46.68  E-value=6  Score=19.86  Aligned_cols=20  Identities=10%  Similarity=0.265  Sum_probs=10.2

Q ss_pred             HCCCCEEHHHHHHHHHHCCC
Q ss_conf             23582223554323332066
Q gi|254781007|r  146 QSKGKVDISRRKKVMYKQNI  165 (411)
Q Consensus       146 qskgkvdisrrkkvmykqni  165 (411)
                      ...+.+.-|+++-++....+
T Consensus       294 ~~qeql~AS~q~a~~L~~EL  313 (546)
T pfam07888       294 TLQERLESSQQKAGLLGEEL  313 (546)
T ss_pred             HHHHHHHHHHHHHHHHHHHH
T ss_conf             87888888899999999999

No 7
>COG4854 Predicted membrane protein [Function unknown]
Probab=40.08  E-value=27  Score=15.29  Aligned_cols=39  Identities=23%  Similarity=0.405  Sum_probs=30.6

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHHCCCCCEEEEEEEECCC
Q ss_conf             333028999999998599887511476416899740488
Q gi|254781007|r  317 RSSVEKEKIEKVLQDCHYMHKRHRTGRDAITIFSVGFSP  355 (411)
Q Consensus       317 rssvekekiekvlqdchymhkrhrtgrdaitifsvgfsp  355 (411)
                      --|+-|+..++|+.|-.-..--.|..|..|.+||+|+.-
T Consensus        44 ~l~l~k~Rv~~vvEDER~lrvse~aSr~TiqV~~is~Al   82 (126)
T COG4854          44 LLSLVKRRVDEVVEDERTLRVSERASRRTIQVFSISAAL   82 (126)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHEEEEEEEEHHHH
T ss_conf             999999999988525889888875300257888846988

No 8
>cd01481 vWA_collagen_alpha3-VI-like VWA_collagen alpha 3(VI) like: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions.
Probab=38.81  E-value=28  Score=15.16  Aligned_cols=32  Identities=28%  Similarity=0.405  Sum_probs=24.8

Q ss_pred             CEEEEEEEECCCCCCHHHHHHHHCCCCHHEEEEC
Q ss_conf             4168997404887420267876305903204425
Q gi|254781007|r  344 DAITIFSVGFSPDQDTRYTLRQCASDPSKYYEIN  377 (411)
Q Consensus       344 daitifsvgfspdqdtrytlrqcasdpskyyein  377 (411)
                      ..+.||.||...  -.+-.|++-||+|+.-|.++
T Consensus       132 ~gV~i~aVGvg~--~~~~eL~~IAs~p~~vf~~~  163 (165)
T cd01481         132 AGIVPFAIGARN--ADLAELQQIAFDPSFVFQVS  163 (165)
T ss_pred             CCCEEEEEECCC--CCHHHHHHHHCCCCCEEECC
T ss_conf             897899996897--99999999858987769738

No 9
>cd01474 vWA_ATR ATR (Anthrax Toxin Receptor): Anthrax toxin is a key virulence factor for Bacillus anthracis, the causative agent of anthrax. ATR is the cellular receptor for the anthrax protective antigen and facilitates entry of the toxin into cells. The VWA domain in ATR contains the toxin binding site and mediates interaction with protective antigen. The binding is mediated by divalent cations that bind to the MIDAS motif. These proteins are a family of vertebrate ECM receptors expressed by endothelial cells.
Probab=37.88  E-value=29  Score=15.06  Aligned_cols=50  Identities=16%  Similarity=0.229  Sum_probs=37.1

Q ss_pred             CEEEEEEEECCCCCCHHHHHHHHCCCCHHEEEECC-CCCHHHHHHHHHHHHHH
Q ss_conf             41689974048874202678763059032044258-76534478987777787
Q gi|254781007|r  344 DAITIFSVGFSPDQDTRYTLRQCASDPSKYYEINS-DENVMPIAKSLARNVIT  395 (411)
Q Consensus       344 daitifsvgfspdqdtrytlrqcasdpskyyeins-denvmpiakslarnvit  395 (411)
                      ..|+||+||.+ +- .+-.|++-||+|++-|.++. =+..-.|...|...+-+
T Consensus       133 ~gV~i~aVGV~-~~-~~~eL~~IAs~p~~vf~v~~~F~~L~~i~~~l~~~iC~  183 (185)
T cd01474         133 LGAIVYCVGVT-DF-LKSQLINIADSKEYVFPVTSGFQALSGIIESVVKKACI  183 (185)
T ss_pred             CCCEEEEEECC-CC-CHHHHHHHHCCCCEEEECCCCHHHHHHHHHHHHHHHCC
T ss_conf             89489999716-25-99999987199864898347577789999999985287

No 10
>TIGR01271 CFTR_protein cystic fibrosis transmembrane conductor regulator (CFTR); InterPro: IPR005291 ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energize diverse biological systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per systems) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components.
They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyze ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarize the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis.
Proteins known to belong to this family are classified in several functional subfamilies depending on the substrate used (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). These proteins are integral membrane proteins and they are involved in the transport of chloride ions. Many of these proteins are the cystic fibrosis transmembrane conductor regulators (CFTR) in eukaryotes. The principal role of this protein is chloride ion conductance. The protein is predicted to consist of 12 transmembrane domains. Mutations or lesions in the genetic loci have been linked to the aetiology of asthma, bronchiectasis, chronic obstructive pulmonary disease etc. Disease-causing mutations have been studied by 36Cl efflux assays in vitro cell cultures and electrophysiology, all of which point to the impairment of chloride channel stability and not the biosynthetic processing per se.; GO: 0005254 chloride channel activity, 0006811 ion transport, 0016020 membrane.
Probab=37.40  E-value=29  Score=15.01  Aligned_cols=36  Identities=36%  Similarity=0.613  Sum_probs=23.7

Q ss_pred             HHHHHHHCCCCCHHHHHHHCCHHHHHHHHHHHHHHCCC
Q ss_conf             98776520683323667741223578999998740034
Q gi|254781007|r   56 VLDHVIYRTSPKNLYDLREAGRDNFIRHQIEKALNTYN   93 (411)
Q Consensus        56 vldhviyrtspknlydlreagrdnfirhqiekalntyn   93 (411)
                      +..|.|  +|-|.|+.+|.-||..+..--..||||+..
T Consensus      1099 IFsHLi--~SLkGLWT~RAFGRQsYFETLFHKALN~HT 1134 (1534)
T TIGR01271      1099 IFSHLI--TSLKGLWTIRAFGRQSYFETLFHKALNLHT 1134 (1534)
T ss_pred             HHHHHH--HHHHHHHHHHHCCCCCHHHHHHHHHHHHHH
T ss_conf             678899--875233443303787526789999887678

No 11
>PRK03958 tRNA 2'-O-methylase; Reviewed
Probab=32.08  E-value=35  Score=14.44  Aligned_cols=61  Identities=28%  Similarity=0.555  Sum_probs=27.4

Q ss_pred             EEEEEEECCCCCCHHHHHHH----HCCCCHHEEEECCCCCHHHHHHHHHHHHHHHHHHHEEEEEEECCC
Q ss_conf             68997404887420267876----305903204425876534478987777787444332574130115
Q gi|254781007|r  346 ITIFSVGFSPDQDTRYTLRQ----CASDPSKYYEINSDENVMPIAKSLARNVITNWFSQFTITVVDSWR  410 (411)
Q Consensus       346 itifsvgfspdqdtrytlrq----casdpskyyeinsdenvmpiakslarnvitnwfsqftitvvdswr  410 (411)
                      |.+.-.|.-|+.|.|-|-.-    -|-..++.+--..|+.+    ..--+.|..+|=..|.+...++|+
T Consensus         3 I~VLRlGHR~~RD~RiTTHv~LtaRAfGA~~i~l~~~D~~~----~etv~~V~~rwGG~F~~e~~~~~~   67 (175)
T PRK03958           3 IVVLRLGHRPERDKRITTHVGLTARALGADKILFASEDEHV----KESVEDIVERWGGPFKVEVTKSWK   67 (175)
T ss_pred             EEEEECCCCCCCCCCHHHHHHHHHHHHCCCEEEECCCCHHH----HHHHHHHHHHCCCCEEEEECCCHH
T ss_conf             99995678876676313588898887268767876887668----999999998618966999768979

No 12
>KOG0829 consensus
Probab=30.73  E-value=37  Score=14.28  Aligned_cols=47  Identities=32%  Similarity=0.357  Sum_probs=32.3

Q ss_pred             CCHHHHHH---H-----HHHHHHHHHHHHHHHHCCCCCEEEEEEEECCCCCCHHH
Q ss_conf             00333028---9-----99999998599887511476416899740488742026
Q gi|254781007|r  315 LSRSSVEK---E-----KIEKVLQDCHYMHKRHRTGRDAITIFSVGFSPDQDTRY  361 (411)
Q Consensus       315 lsrssvek---e-----kiekvlqdchymhkrhrtgrdaitifsvgfspdqdtry  361 (411)
                      -|||++..   |     +...|-|--.-|..|||.-+++|.|.+|.--|..|++-
T Consensus        78 dSRsG~HNmYkEyRd~t~~gAV~q~y~dMaaRhRar~~~I~Iikv~~v~a~~~kR  132 (169)
T KOG0829          78 DSRSGTHNMYKEYRDTTRVGAVEQCYRDMAARHRARFRSIQIIKVAEVPAEDCKR  132 (169)
T ss_pred             CCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEEEEEEHHHHHH
T ss_conf             2577615789999876655589999999877765206612599887764887435

No 13
>TIGR02599 TIGR02599 Verrucomicrobium spinosum paralogous family TIGR02599.
Probab=30.50  E-value=18  Score=16.42  Aligned_cols=151  Identities=23%  Similarity=0.329  Sum_probs=63.6

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHCCHHHH--HHHHHH-HHHHCCCCCCCCCCCCHHHHHH
Q ss_conf             7899999987667777666789999877652068332366774122357--899999-8740034333345431888743
Q gi|254781007|r   32 SFVAIVDVVVDQVTVMQKTAWLQEVLDHVIYRTSPKNLYDLREAGRDNF--IRHQIE-KALNTYNSRDLSNTGSIESIVK  108 (411)
Q Consensus        32 sfvaivdvvvdqvtvmqktawlqevldhviyrtspknlydlreagrdnf--irhqie-kalntynsrdlsntgsiesivk  108 (411)
                      +.++|+=+|+-||+++-...|-+         ++-+ .-..||| |.-|  |--+.+ -.||+|=.--.++.++ .++-.
T Consensus        12 tvLsilm~v~~~v~~~tq~tw~~---------a~ar-~~qFREA-R~AFEaisr~LsQATLN~Yw~Y~~~~~~~-~~~~~   79 (396)
T TIGR02599        12 TVLSILMLVLAQVLSQTQRTWRR---------ATAR-AEQFREA-RAAFEAISRRLSQATLNAYWDYKYNAGTS-RTVAN   79 (396)
T ss_pred             HHHHHHHHHHHHHHHHHHHHHHH---------HHHH-HHHHHHH-HHHHHHHHHHHHHHHHCCHHHHHHCCCCC-CCCCC
T ss_conf             99999999998877558888886---------4545-4654999-99999985300023403135664326777-55664

Q ss_pred             HH-HEEECCCCCCCEEEEEEEE---EEEEEEE-HHHHHHHHHHCCCCEE-HHH-HHHH---HHHCCCCEEEEEECCCCEE
Q ss_conf             22-1132246733138999886---3101110-3789998742358222-355-4323---3320666388630004225
Q gi|254781007|r  109 DA-VILTKNVNSLPLQFTVDIA---LSTTVQL-RGSLLQMFSQSKGKVD-ISR-RKKV---MYKQNIGLMIMPFAWDGYW  178 (411)
Q Consensus       109 da-viltknvnslplqftvdia---lsttvql-rgsllqmfsqskgkvd-isr-rkkv---mykqniglmimpfawdgyw  178 (411)
                      ++ -|-++-.-.=-|+|-..-|   |.+..+. .-+=--+|=|.-=-++ ..- -.-+   --+.+-||--+--+| ||+
T Consensus        80 ~~~~~P~~Y~R~SELhFv~Gpa~~LL~~~~~ag~~~GHavFFQAPLG~~n~~p~a~~~GaP~~~~~~gL~~LLN~W-Gyy  158 (396)
T TIGR02599        80 AATEVPTGYERQSELHFVSGPASTLLGAVITAGAEPGHAVFFQAPLGVTNLDPGADGSGAPEQAGTEGLDNLLNAW-GYY  158 (396)
T ss_pred             CCCCCCCCCEEEEEEEEEECCCCCCCCCCCCCCCCCCCEEEECCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHC-CCE
T ss_conf             1011553312665016772456222012324688766104540253556777778888887775144476545112-243

Q ss_pred             EECC-------CCCC-CC--CCCCCHH
Q ss_conf             3125-------5224-35--4387101
Q gi|254781007|r  179 LASR-------GKVA-DS--KVHPPKY  195 (411)
Q Consensus       179 lasr-------gkva-ds--kvhppky  195 (411)
                      +.-+       |-++ .|  +++|++|
T Consensus       159 v~~~~D~~~RP~FL~~~s~~~~~p~R~  185 (396)
T TIGR02599       159 VEFGSDLPRRPAFLAKSSAKPTAPERY  185 (396)
T ss_pred             EEECCCCCCCCCCCCCCCCCCCCCCCC
T ss_conf             753788887874235777655788862

No 14
>cd01469 vWA_integrins_alpha_subunit Integrins are a class of adhesion receptors that link the extracellular matrix to the cytoskeleton and cooperate with growth factor receptors to promote cell survival, cell cycle progression and cell migration. Integrins consist of an alpha and a beta sub-unit. Each sub-unit has a large extracellular portion, a single transmembrane segment and a short cytoplasmic domain. The N-terminal domains of the alpha and beta subunits associate to form the integrin headpiece, which contains the ligand binding site, whereas the C-terminal segments traverse the plasma membrane and mediate interaction with the cytoskeleton and with signalling proteins. The VWA domains present in the alpha subunits of integrins seem to be a chordate specific radiation of the gene family being found only in vertebrates. They mediate protein-protein interactions.
Probab=29.99  E-value=38  Score=14.20  Aligned_cols=36  Identities=22%  Similarity=0.549  Sum_probs=26.7

Q ss_pred             CCEEEEEEEECCCCCCH---HHHHHHHCCCCHH--EEEECC
Q ss_conf             64168997404887420---2678763059032--044258
Q gi|254781007|r  343 RDAITIFSVGFSPDQDT---RYTLRQCASDPSK--YYEINS  378 (411)
Q Consensus       343 rdaitifsvgfspdqdt---rytlrqcasdpsk--yyeins  378 (411)
                      ++.|.||+||..+.-+.   ...|+..||+|..  .|.+++
T Consensus       130 ~~gv~vf~VGvG~~~~~~~~~~eL~~iAs~P~~~hvf~~~~  170 (177)
T cd01469         130 REGIIRYAIGVGGHFQRENSREELKTIASKPPEEHFFNVTD  170 (177)
T ss_pred             HCCEEEEEEEECCCCCCCCCHHHHHHHHCCCCHHCEEEECC
T ss_conf             79908999995551467451999999967985871998379

No 15
>cd01482 vWA_collagen_alphaI-XII-like Collagen: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions.
Probab=29.36  E-value=39  Score=14.13  Aligned_cols=27  Identities=37%  Similarity=0.607  Sum_probs=21.7

Q ss_pred             CEEEEEEEECCCCCCHHHHHHHHCCCCHH
Q ss_conf             41689974048874202678763059032
Q gi|254781007|r  344 DAITIFSVGFSPDQDTRYTLRQCASDPSK  372 (411)
Q Consensus       344 daitifsvgfspdqdtrytlrqcasdpsk  372 (411)
                      ..|+||.||. .+.| +..|++-||+|+.
T Consensus       129 ~gv~i~~VGV-g~~~-~~eL~~IAs~P~~  155 (164)
T cd01482         129 LGVNVFAVGV-KDAD-ESELKMIASKPSE  155 (164)
T ss_pred             CCCEEEEEEC-CCCC-HHHHHHHHCCCCH
T ss_conf             8938999978-8378-9999999689856

No 16
>cd00049 MH1 MH1 is a small DNA binding domain, binding in an unusual way involving a beta hairpin structure binding to the major groove.
MH1 is present in Smad proteins, an important family of proteins involved in TGF-beta signalling and frequent targets of tumorigenic mutations. Also known as Domain A in dwarfin family proteins.
Probab=28.76  E-value=22  Score=15.80  Aligned_cols=14  Identities=43%  Similarity=1.220  Sum_probs=11.3

Q ss_pred             CCEECCCCHHHHHH
Q ss_conf             32002720234434
Q gi|254781007|r  222 KNFCMAPYHYSSIL  235 (411)
Q Consensus       222 knfcmapyhyssil  235 (411)
                      ...|.-|||||-+.
T Consensus       107 ~~VC~NPyHy~rv~  120 (121)
T cd00049         107 DEVCINPYHYSRVV  120 (121)
T ss_pred             CEEEECCHHHHHHC
T ss_conf             95883850686512

No 17
>pfam07727 RVT_2 Reverse transcriptase (RNA-dependent DNA polymerase). A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model.
Probab=27.50  E-value=5.5  Score=20.10  Aligned_cols=49  Identities=33%  Similarity=0.493  Sum_probs=33.2

Q ss_pred             HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCC----------------HHHHHHHCCHHHH
Q ss_conf             98789999998766777766678999987765206833----------------2366774122357
Q gi|254781007|r   30 LSSFVAIVDVVVDQVTVMQKTAWLQEVLDHVIYRTSPK----------------NLYDLREAGRDNF   80 (411)
Q Consensus        30 lssfvaivdvvvdqvtvmqktawlqevldhviyrtspk----------------nlydlreagrdnf   80 (411)
                      +-+.+|.-|..+.|.-|  ++|.|+.-||+.||-.-|.                .||.||.|+|.=+
T Consensus        68 llaiaa~~~~~~~q~Dv--~tAFLn~~l~e~IYm~~P~G~~~~~~~~~V~kL~kaLYGLkqapr~W~  132 (246)
T pfam07727        68 LLALAAQRGWELHQMDV--KTAFLNGELEEEVYMKQPPGFEDPGKPNKVCRLKKSLYGLKQAPRAWY  132 (246)
T ss_pred             HHHHHHHCCCEEEEEEE--CHHHHCCCCCCCCEEECCCCCCCCCCCCEEEEEEEECCCCCCCHHHHH
T ss_conf             99999875986555454--112305767878449588764668888889999410016563689999

No 18
>KOG0090 consensus
Probab=27.27  E-value=28  Score=15.08  Aligned_cols=27  Identities=37%  Similarity=0.315  Sum_probs=14.8

Q ss_pred             CCCEEEEEEECCCCCHHHHHHHHHHHHH
Q ss_conf             7707999973451003330289999999
Q gi|254781007|r  302 QNKYMLMLAIGNQLSRSSVEKEKIEKVL  329 (411)
Q Consensus       302 qnkymlmlaignqlsrssvekekiekvl  329 (411)
                      -||.-+..|--.+.-|...||| |+++.
T Consensus       152 CNKqDl~tAkt~~~Ir~~LEkE-i~~lr  178 (238)
T KOG0090         152 CNKQDLFTAKTAEKIRQQLEKE-IHKLR  178 (238)
T ss_pred             ECCHHHHHCCCHHHHHHHHHHH-HHHHH
T ss_conf             5552232138599999999999-99999

No 19
>TIGR00955 3a01204 Pigment precourser permease; InterPro: IPR005284 ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energize diverse biological systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per systems) for function.
In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyze ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarize the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release.
In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis. Proteins known to belong to this family are classified in several functional subfamilies depending on the substrate used (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). This family includes different parts of a membrane-spanning permease system necessary for the transport of pigment precursor into pigment cells responsible for eye color. White protein dimerises with brown protein for the transport of guanine and with scarlet protein for the transport of tryptophan.; GO: 0006810 transport.
Probab=27.07  E-value=17  Score=16.62  Aligned_cols=28  Identities=32%  Similarity=0.859  Sum_probs=23.1

Q ss_pred             HHHHHHHCCCCCEECCCCHHHHHHHEEEEE
Q ss_conf             898876187432002720234434122112
Q gi|254781007|r  212 VKNFLSQIPYKNFCMAPYHYSSILYWAVGT  241 (411)
Q Consensus       212 vknflsqipyknfcmapyhyssilywavgt  241 (411)
                      +-+.|+.+|.  |-+.|+-|.+|.||.+|-
T Consensus       463 lak~la~lP~--~i~~p~~f~~I~Ywm~GL  490 (671)
T TIGR00955       463 LAKTLAELPL--FIILPALFTSIVYWMIGL  490 (671)
T ss_pred             HHHHHHHHHH--HHHHHHHHHHHHHHHHCC
T ss_conf             9999998489--997479999999875156

No 20
>smart00523 DWA Domain A in dwarfin family proteins.
Probab=25.69  E-value=24  Score=15.62  Aligned_cols=15  Identities=40%  Similarity=0.919  Sum_probs=11.7

Q ss_pred             CCCEECCCCHHHHHH
Q ss_conf             432002720234434
Q gi|254781007|r  221 YKNFCMAPYHYSSIL  235 (411)
Q Consensus       221 yknfcmapyhyssil  235 (411)
                      ....|.-||||+-|-
T Consensus        92 ~~~VC~NPYHy~Rv~  106 (109)
T smart00523       92 SDEVCCNPYHYSRVE  106 (109)
T ss_pred             CCEEEECCCCCCCCC
T ss_conf             997870883120011

No 21
>pfam03165 MH1 MH1 domain. The MH1 (MAD homology 1) domain is found at the amino terminus of MAD related proteins such as Smads. This domain is separated from the MH2 domain by a non-conserved linker region.
The crystal structure of the MH1 domain shows that a highly conserved 11 residue beta hairpin is used to bind the DNA consensus sequence GNCN in the major groove, shown to be vital for the transcriptional activation of target genes. Not all examples of MH1 can bind to DNA, however. Smad2 cannot bind DNA and has a large insertion within the hairpin that presumably abolishes DNA binding. A basic helix (H2) in MH1 with the nuclear localisation signal KKLKK has been shown to be essential for Smad3 nuclear import. Smads also use the MH1 domain to interact with transcription factors such as Jun, TFE3, Sp1, and Runx. Probab=24.84 E-value=28 Score=15.10 Aligned_cols=14 Identities=43% Similarity=1.127 Sum_probs=10.9 Q ss_pred CCCEECCCCHHHHH Q ss_conf 43200272023443 Q gi|254781007|r 221 YKNFCMAPYHYSSI 234 (411) Q Consensus 221 yknfcmapyhyssi 234 (411) ....|.-||||+-+ T Consensus 93 ~~~VC~NPyHy~Rv 106 (107) T pfam03165 93 KDEVCINPYHYSRV 106 (107) T ss_pred CCEEEECCCCHHHC T ss_conf 99388487334324 No 22 >TIGR00708 cobA cob(I)alamin adenosyltransferase; InterPro: IPR003724 ATP:cob(I)alamin (or ATP:corrinoid) adenosyltransferases (2.5.1.17 from EC), catalyse the conversion of cobalamin (vitamin B12) into its coenzyme form, adenosylcobalamin (coenzyme B12). Adenosylcobalamin (AdoCbl) is required for the activity of certain enzymes. AdoCbl contains an adenosyl moiety liganded to the cobalt ion of cobalamin via a covalent Co-C bond, and its synthesis is unique to certain prokaryotes. ATP:cob(I)alamin adenosyltransferases are classed into three groups: CobA-type, EutT-type and PduO-type. Each of the three enzyme types appears to be specialised for particular AdoCbl-dependent enzymes or for the de novo synthesis of AdoCbl. PduO and EutT are distantly related, sharing short conserved motifs, while CobA is evolutionarily unrelated and is an example of convergent evolution.
This entry represents the ATP:cob(I)alamin adenosyltransferases CobA (Salmonella typhimurium), CobO (Pseudomonas denitrificans), and BtuR (Escherichia coli). There is a high degree of sequence identity between these proteins. CobA is responsible for attaching the adenosyl moiety from ATP to the cobalt ion of the corrin ring, necessary for the conversion of cobalamin to adenosylcobalamin.; GO: 0005524 ATP binding, 0008817 cob(I)yrinic acid a,c-diamide adenosyltransferase activity, 0009236 cobalamin biosynthetic process. Probab=23.90 E-value=21 Score=15.96 Aligned_cols=35 Identities=37% Similarity=0.627 Sum_probs=24.4 Q ss_pred HHHHHHHHHHHHHHH--HHHHHH---------HH--HCCCCCHHHHHH Q ss_conf 987667777666789--999877---------65--206833236677 Q gi|254781007|r 39 VVVDQVTVMQKTAWL--QEVLDH---------VI--YRTSPKNLYDLR 73 (411) Q Consensus 39 vvvdqvtvmqktawl--qevldh---------vi--yrtspknlydlr 73 (411) |..|.+|||-+-.|| .||++- || =|-.|+.|+++- T Consensus 118 vLLDE~~~~l~~GyL~veeV~~~L~~kp~~~~vvlTGR~aP~~L~~~A 165 (191) T TIGR00708 118 VLLDELTVALKFGYLDVEEVVEVLQEKPKSQHVVLTGRGAPQELVELA 165 (191) T ss_pred EEHHHHHHHHHCCCCCHHHHHHHHHCCCCCCEEEEECCCCCHHHHHHC T ss_conf 403423455534897889999998558456778886687868899751 No 23 >pfam03542 Tuberin Tuberin. Tuberous sclerosis complex (TSC) is an autosomal dominant disorder and is characterized by the presence of hamartomas in many organs, such as brain, skin, heart, lung, and kidney. It is caused by mutation of either the TSC1 or TSC2 tumour suppressor gene. The TSC2 gene codes for tuberin, which interacts with hamartin (pfam04388), containing two coiled-coil regions, which have been shown to mediate binding to tuberin. These two proteins function within the same pathway(s) regulating cell cycle, cell growth, adhesion, and vesicular trafficking.
Probab=23.53 E-value=42 Score=13.87 Aligned_cols=25 Identities=36% Similarity=0.739 Sum_probs=18.7 Q ss_pred CCCHHEEEECCCCCHHHHHHHHHHHHHHHHHHHE Q ss_conf 5903204425876534478987777787444332 Q gi|254781007|r 368 SDPSKYYEINSDENVMPIAKSLARNVITNWFSQF 401 (411) Q Consensus 368 sdpskyyeinsdenvmpiakslarnvitnwfsqf 401 (411) .+|+||- --.-|||-+||..||-.- T Consensus 328 Tnp~ky~---------~YiVsLAhhVIa~WFlkc 352 (356) T pfam03542 328 TNPARYD---------HYTVSLAHHVIAAWFIKC 352 (356) T ss_pred CCCCCCH---------HHHHHHHHHHHHHHHHHH T ss_conf 7864301---------688999999999999981 No 24 >pfam01727 consensus Probab=22.24 E-value=45 Score=13.65 Aligned_cols=25 Identities=36% Similarity=0.602 Sum_probs=19.6 Q ss_pred HHHCCCCCCCEEEECCCCCCCCCCCCC Q ss_conf 410134678426413674344676677 Q gi|254781007|r 277 DMTSNQFGDGQVLTNTNHCFPHGASQN 303 (411) Q Consensus 277 dmtsnqfgdgqvltntnhcfphgasqn 303 (411) +-.-.|+|-|-.|++|| ||-|+|-- T Consensus 13 ~k~YqqyG~Gl~l~dtn--l~gGSSGS 37 (81) T pfam01727 13 GKEYQQYGYGLALNDTN--LPGGSSGS 37 (81) T ss_pred CCCHHHHCCEEEEECCC--CCCCCCCC T ss_conf 40255424205760366--69987666 No 25 >TIGR01405 polC_Gram_pos DNA polymerase III, alpha subunit, Gram-positive type; InterPro: IPR006308 These are the polypeptide chains of DNA polymerase III. Full-length homologs of this protein are restricted to the Gram-positive lineages, including the Mycoplasmas. This protein is designated alpha chain and given the gene symbol polC, but is not a full-length homolog of other polC genes. The N-terminal region of about 200 amino acids is rich in low-complexity sequence and poorly alignable. ; GO: 0003677 DNA binding, 0003887 DNA-directed DNA polymerase activity, 0006260 DNA replication, 0005737 cytoplasm. 
Probab=20.91 E-value=34 Score=14.55 Aligned_cols=16 Identities=56% Similarity=0.879 Sum_probs=8.4 Q ss_pred CCEEEECCCCCCCCCC Q ss_conf 4225312552243543 Q gi|254781007|r 175 DGYWLASRGKVADSKV 190 (411) Q Consensus 175 dgywlasrgkvadskv 190 (411) |||...|||-|.-|=| T Consensus 687 DGYlVGSRGSVGSSlV 702 (1264) T TIGR01405 687 DGYLVGSRGSVGSSLV 702 (1264) T ss_pred CCCEECCCCCCHHHHH T ss_conf 8846637743114578 No 26 >TIGR01524 ATPase-IIIB_Mg magnesium-translocating P-type ATPase; InterPro: IPR006415 This group describes the magnesium translocating P-type ATPase found in a limited number of bacterial species and best described in Salmonella typhimurium, which contains two isoforms. These transporters are active in low external Mg2+ concentrations and pump the ion into the cytoplasm. The magnesium ATPases have been classified as type IIIB by a phylogenetic analysis.; GO: 0015444 magnesium-importing ATPase activity, 0015693 magnesium ion transport, 0016021 integral to membrane. Probab=20.79 E-value=7.9 Score=19.00 Aligned_cols=65 Identities=29% Similarity=0.457 Sum_probs=42.7 Q ss_pred EEEEEEEEHHHHHHHHHH---CCCCEEHHHHHHHHHH----CCCCEEEEEECCCCEEEECCCCCCCCCCCCCHHCCHH Q ss_conf 631011103789998742---3582223554323332----0666388630004225312552243543871010277 Q gi|254781007|r 129 ALSTTVQLRGSLLQMFSQ---SKGKVDISRRKKVMYK----QNIGLMIMPFAWDGYWLASRGKVADSKVHPPKYLEYS 199 (411) Q Consensus 129 alsttvqlrgsllqmfsq---skgkvdisrrkkvmyk----qniglmimpfawdgywlasrgkvadskvhppkyleys 199 (411) ||+-.|-|--..|-|.-- .||-+..||||-+|-+ ||+|-|=. ----.-|.....|..--+|+.-| T Consensus 312 AlavAVGLTPEMLPMIVssnLAkGAi~mSk~KVIvK~L~AIQNfGAMDi------LCTDKTGTLT~Dki~L~~h~D~s 383 (892) T TIGR01524 312 ALAVAVGLTPEMLPMIVSSNLAKGAIKMSKKKVIVKRLDAIQNFGAMDI------LCTDKTGTLTQDKIVLEKHLDVS 383 (892) T ss_pred HHHHHHCCCCCCCHHHHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCCC------CCCCCCCCCCCHHHHHHHHHCCC T ss_conf 9998744772312267743567899986127425630330014343222------11388887430133221110258 No 27 >pfam09702 Cas_Csa5 CRISPR-associated protein (Cas_Csa5).
CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry represents a minor family of Cas proteins found in various species of Sulfolobus and Pyrococcus (all archaeal). It is found with two different CRISPR loci in Sulfolobus solfataricus. Probab=20.64 E-value=54 Score=13.13 Aligned_cols=29 Identities=45% Similarity=0.608 Sum_probs=18.5 Q ss_pred EECCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCEE Q ss_conf 7345100333028999999998599887511476416 Q gi|254781007|r 310 AIGNQLSRSSVEKEKIEKVLQDCHYMHKRHRTGRDAI 346 (411) Q Consensus 310 aignqlsrssvekekiekvlqdchymhkrhrtgrdai 346 (411) -|||.|| ||..|+||-||.-.- |+|++.- T Consensus 13 Ri~NALs-----kEaV~~vl~e~~Ri~---~sg~~~~ 41 (105) T pfam09702 13 RIGNALS-----KEAVEKVLYEAQRIV---RSGIERG 41 (105) T ss_pred HHHHHCC-----HHHHHHHHHHHHHHH---HHHHCCC T ss_conf 9853247-----788999999999999---9754002 No 28 >PRK12703 tRNA 2'-O-methylase; Reviewed Probab=20.60 E-value=56 Score=13.02 Aligned_cols=38 Identities=13% Similarity=0.321 Sum_probs=16.0 Q ss_pred CCEEEECCCCCCCCCCCCCCEEEEEEECCCCCHHHHHHHHHHHH Q ss_conf 84264136743446766770799997345100333028999999 Q gi|254781007|r 285 DGQVLTNTNHCFPHGASQNKYMLMLAIGNQLSRSSVEKEKIEKV 328 (411) Q Consensus 285 dgqvltntnhcfphgasqnkymlmlaignqlsrssvekekiekv 328 (411) -|..|..-..|-.||-.. -.+|..+-|+---.|++-+. T Consensus 215 aGaLLHDiGRs~Th~i~H------~v~Ga~i~r~~g~~e~v~~I 252 (339) T PRK12703 215 AGALLHDIGRTKTNGIDH------AVAGAEILRKENIDDRVVSI 252 (339) T ss_pred HHHHHHHHCCCCCCCCHH------HHHHHHHHHHCCCCHHHHHH T ss_conf 415775303343268407------77689999975998899999 Done!