Query         gi|254780914|ref|YP_003065327.1| hypothetical protein CLIBASIA_04065 [Candidatus Liberibacter asiaticus str. psy62]
Match_columns 408
No_of_seqs    1 out of 3
Neff          1.0
Searched_HMMs 39220
Date          Mon May 30 01:04:03 2011
Command       /home/congqian_1/programs/hhpred/hhsearch -i 254780914.hhm -d /home/congqian_1/database/cdd/Cdd.hhm

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 cd01480 vWA_collagen_alpha_1-V  52.1      15 0.00039   17.1   2.9   63   302-378  109-171 (186)
  2 cd01473 vWA_CTRP CTRP for CS    52.0      18 0.00045   16.6   9.6  155   212-393  22-190 (192)
  3 KOG0099 consensus               51.3      18 0.00047   16.5   3.3   46   271-338  325-370 (379)
  4 cd01475 vWA_Matrilin VWA_Matri  49.4      20  0.0005   16.3   3.7   46   344-391  134-181 (224)
  5 pfam07888 CALCOCO1 Calcium bin  46.9     5.9 0.00015   20.0   0.2   19   147-165  295-313 (546)
  6 KOG2531 consensus               46.7      21 0.00055   16.0   3.1   47   213-268  311-360 (545)
  7 COG4854 Predicted membrane pro  39.9      27 0.00068   15.3   3.4   38   318-355  45-82 (126)
  8 cd01481 vWA_collagen_alpha3-VI  38.2      29 0.00073   15.1   3.0   32   344-377  132-163 (165)
  9 TIGR01271 CFTR_protein cystic   36.8      30 0.00076   15.0   6.7   69     22-94  1047-1135(1534)
 10 cd01474 vWA_ATR ATR (Anthrax T  36.8      30 0.00076   15.0   6.8   50   344-395  133-183 (185)
 11 KOG0090 consensus               32.7      22 0.00055   16.0   1.2   27   302-329  152-178 (238)
 12 KOG0829 consensus               30.7      37 0.00095   14.3   2.6   47   315-361  78-132 (169)
 13 TIGR02599 TIGR02599 Verrucomic  30.2      19 0.00048   16.4   0.6   85    32-128  12-100 (396)
 14 cd01469 vWA_integrins_alpha_su  29.3      39   0.001   14.1   7.2   36   343-378  130-170 (177)
 15 cd01482 vWA_collagen_alphaI-XI  29.0      40   0.001   14.1   2.6   27   344-372  129-155 (164)
 16 cd00049 MH1 MH1 is a small DNA  28.3      23 0.00059   15.8   0.8   14   222-235  107-120 (121)
 17 pfam07727 RVT_2 Reverse transc  26.9     5.7 0.00015   20.1  -2.5   50     29-80  67-132 (246)
 18 TIGR00955 3a01204 Pigment prec  26.8      18 0.00045   16.6  -0.1   28   212-241  463-490 (671)
 19 pfam07299 FBP Fibronectin-bind  26.7      43  0.0011   13.8   2.3   35    77-111  6-41 (208)
 20 KOG0942 consensus               26.3      44  0.0011   13.8   2.4   14     22-35  135-148 (1001)
 21 TIGR01703 hybrid_clust hydroxy  25.8      27 0.00069   15.3   0.7   10   140-149  259-268 (567)
 22 smart00523 DWA Domain A in dwa  25.3      25 0.00063   15.6   0.4   15   221-235  92-106 (109)
 23 pfam03165 MH1 MH1 domain. The   24.4      29 0.00074   15.1   0.7   14   221-234  93-106 (107)
 24 pfam03542 Tuberin Tuberin. Tub  23.5      41  0.0011   14.0   1.3   24   368-400  328-351 (356)
 25 TIGR00708 cobA cob(I)alamin ad  23.5      22 0.00056   15.9  -0.1   35     39-73  118-165 (191)
 26 pfam01727 consensus             22.0      46  0.0012   13.6   1.3   25   277-303  13-37 (81)
 27 TIGR01405 polC_Gram_pos DNA po  20.8      34 0.00087   14.6   0.5   16   175-190  687-702 (1264)
 28 pfam09702 Cas_Csa5 CRISPR-asso  20.3      55  0.0014   13.1   1.4   29   310-346  13-41 (105)
 29 TIGR01524 ATPase-IIIB_Mg magne  20.2     8.3 0.00021   18.9  -2.8   65   129-199  312-383 (892)

No 1
>cd01480 vWA_collagen_alpha_1-VI-type VWA_collagen alpha(VI) type: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions.
Probab=52.07  E-value=15  Score=17.09  Aligned_cols=63  Identities=24%  Similarity=0.341  Sum_probs=37.6

Q ss_pred             CCCEEEEEEECCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCEEEEEEEECCCCCCHHHHHHHHCCCCHHEEEECC
Q ss_conf             77079999734510033302899999999859988751147641689974048874202678763059032044258
Q gi|254780914|r  302 QNKYMLMLAIGNQLSRSSVEKEKIEKVLQDCHYMHKRHRTGRDAITIFSVGFSPDQDTRYTLRQCASDPSKYYEINS  378 (408)
Q Consensus       302 qnkymlmlaignqlsrssvekekiekvlqdchymhkrhrtgrdaitifsvgfspdqdtrytlrqcasdpskyyeins  378 (408)
                      .+|-++.+.=|.  |.. -+...++...++..         +..|+||.||.....
.-.|++.||+|++.+-.++ T Consensus 109 ~~kvlvliTDG~--S~~-~~~~~~~~aa~~lr---------~~GV~ifaVGVG~~~--~~eL~~IAs~p~~~~~~~~ 171 (186) T cd01480 109 ENKFLLVITDGH--SDG-SPDGGIEKAVNEAD---------HLGIKIFFVAVGSQN--EEPLSRIACDGKSALYREN 171 (186) T ss_pred CCEEEEEEECCC--CCC-CCCHHHHHHHHHHH---------HCCCEEEEEEECCCC--HHHHHHHHCCCCCEEEECC T ss_conf 853899984587--666-74066999999999---------879899999947488--7999998589973897368 No 2 >cd01473 vWA_CTRP CTRP for CS protein-TRAP-related protein: Adhesion of Plasmodium to host cells is an important phenomenon in parasite invasion and in malaria associated pathology.CTRP encodes a protein containing a putative signal sequence followed by a long extracellular region of 1990 amino acids, a transmembrane domain, and a short cytoplasmic segment. The extracellular region of CTRP contains two separated adhesive domains. The first domain contains six 210-amino acid-long homologous VWA domain repeats. The second domain contains seven repeats of 87-60 amino acids in length, which share similarities with the thrombospondin type 1 domain found in a variety of adhesive molecules. Finally, CTRP also contains consensus motifs found in the superfamily of haematopoietin receptors. The VWA domains in these proteins likely mediate protein-protein interactions. Probab=52.03 E-value=18 Score=16.58 Aligned_cols=155 Identities=19% Similarity=0.268 Sum_probs=82.7 Q ss_pred HHHHHHHCCCCCEECCCCHHHHHHHEEEEEEEEEECCCCCHHHHHCCCCCCCHHCCHHHHHHHHHHHHCCCCCCC----- Q ss_conf 898876187432002720234434122112578625862125654186421100060899988874101346784----- Q gi|254780914|r 212 VKNFLSQIPYKNFCMAPYHYSSILYWAVGTLTYSVDNKTTTREYYKDPYYATWDHFPYSFIKNVFDMTSNQFGDG----- 286 (408) Q Consensus 212 vknflsqipyknfcmapyhyssilywavgtltysvdnktttreyykdpyyatwdhfpysfiknvfdmtsnqfgdg----- 286 (408) +++|+..+- .+|-..|-. --||-..||-.+++... 
.....+| + .-..++.+-.+...-+..| T Consensus 22 v~~F~~~lv-~~f~Ig~~~------~rvgvv~yS~~~~~~~~-f~~~~~~---~--k~~~l~~i~~l~~~~~~gg~T~tg 88 (192) T cd01473 22 VIPFTEKII-NNLNISKDK------VHVGILLFAEKNRDVVP-FSDEERY---D--KNELLKKINDLKNSYRSGGETYIV 88 (192) T ss_pred HHHHHHHHH-HHCCCCCCC------EEEEEEEECCCCCEEEE-CCCCCCC---C--HHHHHHHHHHHHHCCCCCCCCHHH T ss_conf 999999999-875659896------19999995588740132-3554434---8--999999999987314689824799 Q ss_pred EEE--ECCCCCCCCCCCCC--CEEEEEEECCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCEEEEEEEECCCCCCHHHH Q ss_conf 264--13674344676677--07999973451003330289999999985998875114764168997404887420267 Q gi|254780914|r 287 QVL--TNTNHCFPHGASQN--KYMLMLAIGNQLSRSSVEKEKIEKVLQDCHYMHKRHRTGRDAITIFSVGFSPDQDTRYT 362 (408) Q Consensus 287 qvl--tntnhcfphgasqn--kymlmlaignqlsrssvekekiekvlqdchymhkrhrtgrdaitifsvgfspdqdtryt 362 (408) +.| ...+.-...|+.++ |.++++.=|+.-+++...-+...+.|. ...|+||.||-.-- .+-. T Consensus 89 ~AL~~~~~~~~~~~g~R~~vpkv~IvlTDG~s~~~~~~~~~~~a~~lr------------~~gV~i~avGVg~~--~~~e 154 (192) T cd01473 89 EALKYGLKNYTKHGNRRKDAPKVTMLFTDGNDTSASKKELQDISLLYK------------EENVKLLVVGVGAA--SENK 154 (192) T ss_pred HHHHHHHHHHCCCCCCCCCCCEEEEEEECCCCCCCCHHHHHHHHHHHH------------HCCCEEEEEEECCC--CHHH T ss_conf 999999998634678888997499999569988731678999999999------------87978999980637--9999 Q ss_pred HHHHCCCCHH-----EEEECCCCCHHHHHHHHHHHH Q ss_conf 8763059032-----044258765344789877777 Q gi|254780914|r 363 LRQCASDPSK-----YYEINSDENVMPIAKSLARNV 393 (408) Q Consensus 363 lrqcasdpsk-----yyeinsdenvmpiakslarnv 393 (408) ||+-|+.|.. 
.|-..+=+++.+|.+.+.+++ T Consensus 155 L~~iag~~~~~~~c~~~~~~~fd~l~~i~~~l~~~v 190 (192) T cd01473 155 LKLLAGCDINNDNCPNVIKTEWNNLNGISKFLTDKI 190 (192) T ss_pred HHHHHCCCCCCCCCCEEEECCHHHHHHHHHHHHHHH T ss_conf 999869998899775799479789999999999972 No 3 >KOG0099 consensus Probab=51.34 E-value=18 Score=16.51 Aligned_cols=46 Identities=37% Similarity=0.741 Sum_probs=38.5 Q ss_pred HHHHHHHHHCCCCCCCEEEECCCCCCCCCCCCCCEEEEEEECCCCCHHHHHHHHHHHHHHHHHHHHHH Q ss_conf 99888741013467842641367434467667707999973451003330289999999985998875 Q gi|254780914|r 271 FIKNVFDMTSNQFGDGQVLTNTNHCFPHGASQNKYMLMLAIGNQLSRSSVEKEKIEKVLQDCHYMHKR 338 (408) Q Consensus 271 fiknvfdmtsnqfgdgqvltntnhcfphgasqnkymlmlaignqlsrssvekekiekvlqdchymhkr 338 (408) ||..-|--.|..-||| ..+|+||=. ..|..|.|..|..||.-|-.| T Consensus 325 fird~FlRiSta~~Dg-----~h~CYpHFT-----------------cAvDTenIrrVFnDcrdiIqr 370 (379) T KOG0099 325 FIRDEFLRISTASGDG-----RHYCYPHFT-----------------CAVDTENIRRVFNDCRDIIQR 370 (379) T ss_pred HHHHHHHHHCCCCCCC-----CEECCCCEE-----------------EEECHHHHHHHHHHHHHHHHH T ss_conf 4230677630346787-----530002346-----------------771538899998789999999 No 4 >cd01475 vWA_Matrilin VWA_Matrilin: In cartilaginous plate, extracellular matrix molecules mediate cell-matrix and matrix-matrix interactions thereby providing tissue integrity. Some members of the matrilin family are expressed specifically in developing cartilage rudiments. The matrilin family consists of at least four members. All the members of the matrilin family contain VWA domains, EGF-like domains and a heptad repeat coiled-coiled domain at the carboxy terminus which is responsible for the oligomerization of the matrilins. The VWA domains have been shown to be essential for matrilin network formation by interacting with matrix ligands. 
Probab=49.37 E-value=20 Score=16.30 Aligned_cols=46 Identities=22% Similarity=0.369 Sum_probs=32.2 Q ss_pred CEEEEEEEECCCCCCHHHHHHHHCCCCH--HEEEECCCCCHHHHHHHHHH Q ss_conf 4168997404887420267876305903--20442587653447898777 Q gi|254780914|r 344 DAITIFSVGFSPDQDTRYTLRQCASDPS--KYYEINSDENVMPIAKSLAR 391 (408) Q Consensus 344 daitifsvgfspdqdtrytlrqcasdps--kyyeinsdenvmpiakslar 391 (408) +.|.||.||..- - .+-.|++.||+|+ ..+.+++=+..-.++..|.. T Consensus 134 ~GV~ifaVGVg~-~-~~~eL~~IAs~P~~~hvf~v~~F~~l~~l~~~l~~ 181 (224) T cd01475 134 LGIEMFAVGVGR-A-DEEELREIASEPLADHVFYVEDFSTIEELTKKFQG 181 (224) T ss_pred CCCEEEEEECCC-C-CHHHHHHHHCCCCHHCEEEECCHHHHHHHHHHHHH T ss_conf 798899996374-7-98999998559737568994798899999999876 No 5 >pfam07888 CALCOCO1 Calcium binding and coiled-coil domain (CALCOCO1) like. Proteins found in this family are similar to the coiled-coil transcriptional coactivator protein coexpressed by Mus musculus (CoCoA/CALCOCO1). This protein binds to a highly conserved N-terminal domain of p160 coactivators, such as GRIP1, and thus enhances transcriptional activation by a number of nuclear receptors. CALCOCO1 has a central coiled-coil region with three leucine zipper motifs, which is required for its interaction with GRIP1 and may regulate the autonomous transcriptional activation activity of the C-terminal region. 
Probab=46.93 E-value=5.9 Score=20.01 Aligned_cols=19 Identities=11% Similarity=0.279 Sum_probs=9.7 Q ss_pred CCCCEEHHHHHHHHHHCCC Q ss_conf 3582223554323332066 Q gi|254780914|r 147 SKGKVDISRRKKVMYKQNI 165 (408) Q Consensus 147 skgkvdisrrkkvmykqni 165 (408) ..+.+.-|+++-++....+ T Consensus 295 ~qeql~AS~q~a~~L~~EL 313 (546) T pfam07888 295 LQERLESSQQKAGLLGEEL 313 (546) T ss_pred HHHHHHHHHHHHHHHHHHH T ss_conf 7888888899999999999 No 6 >KOG2531 consensus Probab=46.67 E-value=21 Score=16.01 Aligned_cols=47 Identities=30% Similarity=0.602 Sum_probs=22.5 Q ss_pred HHHHHHCCCCCEEC--CCCHHHHHHHEEEEEEEEEECCCCCHHHHHCC-CCCCCHHCCH Q ss_conf 98876187432002--72023443412211257862586212565418-6421100060 Q gi|254780914|r 213 KNFLSQIPYKNFCM--APYHYSSILYWAVGTLTYSVDNKTTTREYYKD-PYYATWDHFP 268 (408) Q Consensus 213 knflsqipyknfcm--apyhyssilywavgtltysvdnktttreyykd-pyyatwdhfp 268 (408) |++..-+-|.-||- .|-||-..|-+. |-.-|||--.+ ---..||+|- T Consensus 311 ~~~~p~~egHvf~hP~~~~~YM~mlCfk---------NgSL~RE~ir~~~~~~sWd~Fn 360 (545) T KOG2531 311 KEYHPSPEGHVFCHPTDPNHYMGMLCFK---------NGSLTRERIRNESANGSWDKFN 360 (545) T ss_pred CCCCCCCCCCEECCCCCCCCEEEEEEEC---------CCHHHHHHHHHCCCCCCHHHHH T ss_conf 7888897753402688864228999961---------7707899986222479778999 No 7 >COG4854 Predicted membrane protein [Function unknown] Probab=39.94 E-value=27 Score=15.31 Aligned_cols=38 Identities=24% Similarity=0.435 Sum_probs=28.7 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHCCCCCEEEEEEEECCC Q ss_conf 33028999999998599887511476416899740488 Q gi|254780914|r 318 SSVEKEKIEKVLQDCHYMHKRHRTGRDAITIFSVGFSP 355 (408) Q Consensus 318 ssvekekiekvlqdchymhkrhrtgrdaitifsvgfsp 355 (408) -|+-|+..++|+.|-.-..-..|..|..|.+||+|+.- T Consensus 45 l~l~k~Rv~~vvEDER~lrvse~aSr~TiqV~~is~Al 82 (126) T COG4854 45 LSLVKRRVDEVVEDERTLRVSERASRRTIQVFSISAAL 82 (126) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHEEEEEEEEHHHH T ss_conf 99999999988525889888875300257888846988 No 8 >cd01481 vWA_collagen_alpha3-VI-like VWA_collagen alpha 3(VI) like: The extracellular matrix represents a 
complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. Probab=38.17 E-value=29 Score=15.13 Aligned_cols=32 Identities=28% Similarity=0.405 Sum_probs=24.8 Q ss_pred CEEEEEEEECCCCCCHHHHHHHHCCCCHHEEEEC Q ss_conf 4168997404887420267876305903204425 Q gi|254780914|r 344 DAITIFSVGFSPDQDTRYTLRQCASDPSKYYEIN 377 (408) Q Consensus 344 daitifsvgfspdqdtrytlrqcasdpskyyein 377 (408) ..+.||.||... -.+-.|++-||+|+.-|.++ T Consensus 132 ~gV~i~aVGvg~--~~~~eL~~IAs~p~~vf~~~ 163 (165) T cd01481 132 AGIVPFAIGARN--ADLAELQQIAFDPSFVFQVS 163 (165) T ss_pred CCCEEEEEECCC--CCHHHHHHHHCCCCCEEECC T ss_conf 897899996897--99999999858987769738 No 9 >TIGR01271 CFTR_protein cystic fibrosis transmembrane conductor regulator (CFTR); InterPro: IPR005291 ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energize diverse biological systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. 
They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per systems) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain . The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyze ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarize the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site , , . The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. 
The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis. Proteins known to belong to this family are classified in several functional subfamilies depending on the substrate used (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). These proteins are integral membrane proteins and they are involved in the transport of chloride ions. Many of these proteins are the cystic fibrosis transmembrane conductor regulators (CFTR) in eukaryotes. The principal role of this protein is chloride ion conductance. The protein is predicted to consist of 12 transmembrane domains. Mutations or lesions in the genetic loci have been linked to the aetiology of asthma, bronchiectasis, chronic obstructive pulmonary disease etc. Disease-causing mutations have been studied by 36Cl efflux assays in vitro cell cultures and electrophysiology, all of which point to the impairment of chloride channel stability and not the biosynthetic processing per se.; GO: 0005254 chloride channel activity, 0006811 ion transport, 0016020 membrane. Probab=36.81 E-value=30 Score=14.98 Aligned_cols=69 Identities=33% Similarity=0.597 Sum_probs=43.6 Q ss_pred HHHHHHHHHHH------HHHHHHHHHHHHHHHHHHHHHHH--------------HHHHHHHCCCCCHHHHHHHCCHHHHH Q ss_conf 99999999987------89999998766777766678999--------------98776520683323667741223578 Q gi|254780914|r 22 FLVITAILLSS------FVAIVDVVVDQVTVMQKTAWLQE--------------VLDHVIYRTSPKNLYDLREAGRDNFI 81 (408) Q Consensus 22 flvitaillss------fvaivdvvvdqvtvmqktawlqe--------------vldhviyrtspknlydlreagrdnfi 81 (408) ..|+-||..-| |+|-+-|.| +-||-..-.|+. +..|.| +|-|.|+.+|.-||..+. 
T Consensus 1047 LIV~GAI~~Vsvl~PY~~~AaiPv~V--iFi~LR~YFL~TsQQLKQLEsEaRSPIFsHLi--~SLkGLWT~RAFGRQsYF 1122 (1534) T TIGR01271 1047 LIVVGAIAVVSVLQPYIFIAAIPVAV--IFIMLRAYFLRTSQQLKQLESEARSPIFSHLI--TSLKGLWTIRAFGRQSYF 1122 (1534) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHH--HHHHHHHHHHHHHHHHHHHHHCCCCHHHHHHH--HHHHHHHHHHHCCCCCHH T ss_conf 99998899887842589989989999--99999999862225667765215771678899--875233443303787526 Q ss_pred HHHHHHHHHCCCC Q ss_conf 9999987400342 Q gi|254780914|r 82 RHQIEKALNTYNS 94 (408) Q Consensus 82 rhqiekalntyns 94 (408) .--..||||+... T Consensus 1123 ETLFHKALN~HTA 1135 (1534) T TIGR01271 1123 ETLFHKALNLHTA 1135 (1534) T ss_pred HHHHHHHHHHHHH T ss_conf 7899998876789 No 10 >cd01474 vWA_ATR ATR (Anthrax Toxin Receptor): Anthrax toxin is a key virulence factor for Bacillus anthracis, the causative agent of anthrax. ATR is the cellular receptor for the anthrax protective antigen and facilitates entry of the toxin into cells. The VWA domain in ATR contains the toxin binding site and mediates interaction with protective antigen. The binding is mediated by divalent cations that binds to the MIDAS motif. These proteins are a family of vertebrate ECM receptors expressed by endothelial cells. Probab=36.78 E-value=30 Score=14.98 Aligned_cols=50 Identities=16% Similarity=0.229 Sum_probs=37.2 Q ss_pred CEEEEEEEECCCCCCHHHHHHHHCCCCHHEEEECC-CCCHHHHHHHHHHHHHH Q ss_conf 41689974048874202678763059032044258-76534478987777787 Q gi|254780914|r 344 DAITIFSVGFSPDQDTRYTLRQCASDPSKYYEINS-DENVMPIAKSLARNVIT 395 (408) Q Consensus 344 daitifsvgfspdqdtrytlrqcasdpskyyeins-denvmpiakslarnvit 395 (408) ..|+||+||.+ +- .+-.|++-||+|++-|.++. 
=+..-.|...|...+-+ T Consensus 133 ~gV~i~aVGV~-~~-~~~eL~~IAs~p~~vf~v~~~F~~L~~i~~~l~~~iC~ 183 (185) T cd01474 133 LGAIVYCVGVT-DF-LKSQLINIADSKEYVFPVTSGFQALSGIIESVVKKACI 183 (185) T ss_pred CCCEEEEEECC-CC-CHHHHHHHHCCCCEEEECCCCHHHHHHHHHHHHHHHCC T ss_conf 89489999716-25-99999987199864898347577789999999985287 No 11 >KOG0090 consensus Probab=32.69 E-value=22 Score=15.99 Aligned_cols=27 Identities=37% Similarity=0.315 Sum_probs=14.0 Q ss_pred CCCEEEEEEECCCCCHHHHHHHHHHHHH Q ss_conf 7707999973451003330289999999 Q gi|254780914|r 302 QNKYMLMLAIGNQLSRSSVEKEKIEKVL 329 (408) Q Consensus 302 qnkymlmlaignqlsrssvekekiekvl 329 (408) -||.-+..|--.+.-|...||| |+++. T Consensus 152 CNKqDl~tAkt~~~Ir~~LEkE-i~~lr 178 (238) T KOG0090 152 CNKQDLFTAKTAEKIRQQLEKE-IHKLR 178 (238) T ss_pred ECCHHHHHCCCHHHHHHHHHHH-HHHHH T ss_conf 5552232138599999999999-99999 No 12 >KOG0829 consensus Probab=30.71 E-value=37 Score=14.31 Aligned_cols=47 Identities=32% Similarity=0.357 Sum_probs=32.3 Q ss_pred CCHHHHHH---H-----HHHHHHHHHHHHHHHHCCCCCEEEEEEEECCCCCCHHH Q ss_conf 00333028---9-----99999998599887511476416899740488742026 Q gi|254780914|r 315 LSRSSVEK---E-----KIEKVLQDCHYMHKRHRTGRDAITIFSVGFSPDQDTRY 361 (408) Q Consensus 315 lsrssvek---e-----kiekvlqdchymhkrhrtgrdaitifsvgfspdqdtry 361 (408) -|||++.. | +...|-|--.-|..|||.-+++|.|.+|.--|..|++- T Consensus 78 dSRsG~HNmYkEyRd~t~~gAV~q~y~dMaaRhRar~~~I~Iikv~~v~a~~~kR 132 (169) T KOG0829 78 DSRSGTHNMYKEYRDTTRVGAVEQCYRDMAARHRARFRSIQIIKVAEVPAEDCKR 132 (169) T ss_pred CCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEEEEEEHHHHHH T ss_conf 2577615789999876655589999999877765206612599887764887435 No 13 >TIGR02599 TIGR02599 Verrucomicrobium spinosum paralogous family TIGR02599. 
Probab=30.24 E-value=19 Score=16.44 Aligned_cols=85 Identities=22% Similarity=0.346 Sum_probs=40.4 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCHHHHHHHCCHHHH--HHHHHH-HHHHCCCCCCHHHCCCHHHHHH Q ss_conf 7899999987667777666789999877652068332366774122357--899999-8740034220000000887743 Q gi|254780914|r 32 SFVAIVDVVVDQVTVMQKTAWLQEVLDHVIYRTSPKNLYDLREAGRDNF--IRHQIE-KALNTYNSRDLSNIGSIESIVK 108 (408) Q Consensus 32 sfvaivdvvvdqvtvmqktawlqevldhviyrtspknlydlreagrdnf--irhqie-kalntynsrdlsnigsiesivk 108 (408) +.++|+=+|+-||+++-...|-+ ++-+ .-..||| |.-| |--+.+ -.||+|=.--.++-++ .++-. T Consensus 12 tvLsilm~v~~~v~~~tq~tw~~---------a~ar-~~qFREA-R~AFEaisr~LsQATLN~Yw~Y~~~~~~~-~~~~~ 79 (396) T TIGR02599 12 tvLsilm~v~~~v~~~tq~tw~~---------a~ar-~~qFREA-R~AFEaisr~LsQATLN~Yw~Y~~~~~~~-~~~~~ 79 (396) T ss_pred HHHHHHHHHHHHHHHHHHHHHHH---------HHHH-HHHHHHH-HHHHHHHHHHHHHHHHCCHHHHHHCCCCC-CCCCC T ss_conf 99999999998877558888886---------4545-4654999-99999985300023403135664326777-55664 Q ss_pred HH-HEEECCCCCCCEEEEEEE Q ss_conf 22-113224673313899988 Q gi|254780914|r 109 DA-VILTKNVNSLPLQFTVDI 128 (408) Q Consensus 109 da-viltknvnslplqftvdi 128 (408) ++ -|-++-.-.=-|+|-..- T Consensus 80 ~~~~~P~~Y~R~SELhFv~Gp 100 (396) T TIGR02599 80 AATEVPTGYERQSELHFVSGP 100 (396) T ss_pred CCCCCCCCCEEEEEEEEEECC T ss_conf 101155331266501677245 No 14 >cd01469 vWA_integrins_alpha_subunit Integrins are a class of adhesion receptors that link the extracellular matrix to the cytoskeleton and cooperate with growth factor receptors to promote cell survival, cell cycle progression and cell migration. Integrins consist of an alpha and a beta subunit. Each subunit has a large extracellular portion, a single transmembrane segment and a short cytoplasmic domain. 
The N-terminal domains of the alpha and beta subunits associate to form the integrin headpiece, which contains the ligand binding site, whereas the C-terminal segments traverse the plasma membrane and mediate interaction with the cytoskeleton and with signalling proteins. The VWA domains present in the alpha subunits of integrins seem to be a chordate-specific radiation of the gene family, being found only in vertebrates. They mediate protein-protein interactions. Probab=29.32 E-value=39 Score=14.15 Aligned_cols=36 Identities=22% Similarity=0.549 Sum_probs=26.7 Q ss_pred CCEEEEEEEECCCCCCH---HHHHHHHCCCCHH--EEEECC Q ss_conf 64168997404887420---2678763059032--044258 Q gi|254780914|r 343 RDAITIFSVGFSPDQDT---RYTLRQCASDPSK--YYEINS 378 (408) Q Consensus 343 rdaitifsvgfspdqdt---rytlrqcasdpsk--yyeins 378 (408) ++.|.||+||..+.-+. ...|+..||+|.. .|.+++ T Consensus 130 ~~gv~vf~VGvG~~~~~~~~~~eL~~iAs~P~~~hvf~~~~ 170 (177) T cd01469 130 REGIIRYAIGVGGHFQRENSREELKTIASKPPEEHFFNVTD 170 (177) T ss_pred HCCEEEEEEEECCCCCCCCCHHHHHHHHCCCCHHCEEEECC T ss_conf 79908999995551467451999999967985871998379 No 15 >cd01482 vWA_collagen_alphaI-XII-like Collagen: The extracellular matrix represents a complex alloy of variable members of diverse protein families defining structural integrity and various physiological functions. The most abundant family is the collagens with more than 20 different collagen types identified thus far. Collagens are centrally involved in the formation of fibrillar and microfibrillar networks of the extracellular matrix, basement membranes as well as other structures of the extracellular matrix. Some collagens have about 15-18 vWA domains in them. The VWA domains present in these collagens mediate protein-protein interactions. 
Probab=28.97 E-value=40 Score=14.11 Aligned_cols=27 Identities=37% Similarity=0.607 Sum_probs=21.7 Q ss_pred CEEEEEEEECCCCCCHHHHHHHHCCCCHH Q ss_conf 41689974048874202678763059032 Q gi|254780914|r 344 DAITIFSVGFSPDQDTRYTLRQCASDPSK 372 (408) Q Consensus 344 daitifsvgfspdqdtrytlrqcasdpsk 372 (408) ..|+||.||. .+.| +..|++-||+|+. T Consensus 129 ~gv~i~~VGV-g~~~-~~eL~~IAs~P~~ 155 (164) T cd01482 129 LGVNVFAVGV-KDAD-ESELKMIASKPSE 155 (164) T ss_pred CCCEEEEEEC-CCCC-HHHHHHHHCCCCH T ss_conf 8938999978-8378-9999999689856 No 16 >cd00049 MH1 MH1 is a small DNA-binding domain that binds in an unusual way, with a beta hairpin structure binding to the major groove. MH1 is present in Smad proteins, an important family of proteins involved in TGF-beta signalling and frequent targets of tumorigenic mutations. Also known as Domain A in dwarfin family proteins. Probab=28.28 E-value=23 Score=15.77 Aligned_cols=14 Identities=43% Similarity=1.220 Sum_probs=11.3 Q ss_pred CCEECCCCHHHHHH Q ss_conf 32002720234434 Q gi|254780914|r 222 KNFCMAPYHYSSIL 235 (408) Q Consensus 222 knfcmapyhyssil 235 (408) ...|.-|||||-+. T Consensus 107 ~~VC~NPyHy~rv~ 120 (121) T cd00049 107 DEVCINPYHYSRVV 120 (121) T ss_pred CEEEECCHHHHHHC T ss_conf 95883850686512 No 17 >pfam07727 RVT_2 Reverse transcriptase (RNA-dependent DNA polymerase). A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model. 
Probab=26.94 E-value=5.7 Score=20.09 Aligned_cols=50 Identities=34% Similarity=0.510 Sum_probs=34.3 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCC----------------HHHHHHHCCHHHH Q ss_conf 998789999998766777766678999987765206833----------------2366774122357 Q gi|254780914|r 29 LLSSFVAIVDVVVDQVTVMQKTAWLQEVLDHVIYRTSPK----------------NLYDLREAGRDNF 80 (408) Q Consensus 29 llssfvaivdvvvdqvtvmqktawlqevldhviyrtspk----------------nlydlreagrdnf 80 (408) ++-+.+|.-|..+.|.-| ++|.|+.-||+.||-.-|. .||.||.|+|.=+ T Consensus 67 ~llaiaa~~~~~~~q~Dv--~tAFLn~~l~e~IYm~~P~G~~~~~~~~~V~kL~kaLYGLkqapr~W~ 132 (246) T pfam07727 67 LLLALAAQRGWELHQMDV--KTAFLNGELEEEVYMKQPPGFEDPGKPNKVCRLKKSLYGLKQAPRAWY 132 (246) T ss_pred HHHHHHHHCCCEEEEEEE--CHHHHCCCCCCCCEEECCCCCCCCCCCCEEEEEEEECCCCCCCHHHHH T ss_conf 899999875986555454--112305767878449588764668888889999410016563689999 No 18 >TIGR00955 3a01204 Pigment precourser permease; InterPro: IPR005284 ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energize diverse biological systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per systems) for function. 
In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain . The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyze ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarize the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site , , . The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. 
In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis , , , , , . Proteins known to belong to this family are classified in several functional subfamilies depending on the substrate used (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1). This family includes different parts of a membrane-spanning permease system necessary for the transport of pigment precursor into pigment cells responsible for eye color. White protein dimerises with brown protein for the transport of guanine and with scarlet protein for the transport of tryptophan.; GO: 0006810 transport. Probab=26.77 E-value=18 Score=16.63 Aligned_cols=28 Identities=32% Similarity=0.859 Sum_probs=23.1 Q ss_pred HHHHHHHCCCCCEECCCCHHHHHHHEEEEE Q ss_conf 898876187432002720234434122112 Q gi|254780914|r 212 VKNFLSQIPYKNFCMAPYHYSSILYWAVGT 241 (408) Q Consensus 212 vknflsqipyknfcmapyhyssilywavgt 241 (408) +-+.|+.+|. |-+.|+-|.+|.||.+|- T Consensus 463 lak~la~lP~--~i~~p~~f~~I~Ywm~GL 490 (671) T TIGR00955 463 LAKTLAELPL--FIILPALFTSIVYWMIGL 490 (671) T ss_pred HHHHHHHHHH--HHHHHHHHHHHHHHHHCC T ss_conf 9999998489--997479999999875156 No 19 >pfam07299 FBP Fibronectin-binding protein (FBP). This family consists of several bacterial fibronectin-binding proteins which are thought to be involved in virulence in Listeria species. 
Probab=26.73 E-value=43 Score=13.84 Aligned_cols=35 Identities=23% Similarity=0.538 Sum_probs=26.4 Q ss_pred HHHHHHHHHHHHHHCCCC-CCHHHCCCHHHHHHHHH Q ss_conf 235789999987400342-20000000887743221 Q gi|254780914|r 77 RDNFIRHQIEKALNTYNS-RDLSNIGSIESIVKDAV 111 (408) Q Consensus 77 rdnfirhqiekalntyns-rdlsnigsiesivkdav 111 (408) .=|||+.|++.-.|+|.+ -|.+-+-.+.+...+.+ T Consensus 6 qYN~IK~qv~~L~~~~~svnD~~vi~a~k~~~~~kI 41 (208) T pfam07299 6 QYNFIKQQVAQLVNAYRTVNDRNVIKAVKALAIEKI 41 (208) T ss_pred HHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHH T ss_conf 468999999999998641488899999999999999 No 20 >KOG0942 consensus Probab=26.27 E-value=44 Score=13.79 Aligned_cols=14 Identities=43% Similarity=0.598 Sum_probs=5.9 Q ss_pred HHHHHHHHHHHHHH Q ss_conf 99999999987899 Q gi|254780914|r 22 FLVITAILLSSFVA 35 (408) Q Consensus 22 flvitaillssfva 35 (408) |..|.--++.+|+. T Consensus 135 ~~~~p~~ll~~~~~ 148 (1001) T KOG0942 135 FQAIPRRLLESFTS 148 (1001) T ss_pred HHHHHHHHHHHHHH T ss_conf 89879999999999 No 21 >TIGR01703 hybrid_clust hydroxylamine reductase; InterPro: IPR010048 Hybrid cluster proteins (HCP, or Prismane) have been identified in bacteria, archaea and eukaryotic protozoa. No specific function has yet been assigned to these proteins, but it may involve oxidoreductase enzymatic activity. These proteins contain one 4Fe-4S cluster, and one hybrid 4Fe-2O-2S cluster, the latter being similar to the Ni-Fe-S cluster found in carbon monoxide dehydrogenase enzymes (IPR010047 from INTERPRO) , . This subfamily is heterogeneous with respect to the presence or absence of a region of about 100 amino acids not far from the N terminus of the protein. Members have been described as monomeric. ; GO: 0016661 oxidoreductase activity acting on other nitrogenous compounds as donors, 0051536 iron-sulfur cluster binding, 0006118 electron transport, 0005737 cytoplasm. 
Probab=25.85 E-value=27 Score=15.30 Aligned_cols=10 Identities=30% Similarity=0.664 Sum_probs=3.8 Q ss_pred HHHHHHHCCC Q ss_conf 9998742358 Q gi|254780914|r 140 LLQMFSQSKG 149 (408) Q Consensus 140 llqmfsqskg 149 (408) |-+...|.+| T Consensus 259 L~~LL~QT~g 268 (567) T TIGR01703 259 LEELLEQTEG 268 (567) T ss_pred HHHHHHHHHH T ss_conf 9999998741 No 22 >smart00523 DWA Domain A in dwarfin family proteins. Probab=25.26 E-value=25 Score=15.59 Aligned_cols=15 Identities=40% Similarity=0.919 Sum_probs=11.6 Q ss_pred CCCEECCCCHHHHHH Q ss_conf 432002720234434 Q gi|254780914|r 221 YKNFCMAPYHYSSIL 235 (408) Q Consensus 221 yknfcmapyhyssil 235 (408) ....|.-||||+-|- T Consensus 92 ~~~VC~NPYHy~Rv~ 106 (109) T smart00523 92 SDEVCCNPYHYSRVE 106 (109) T ss_pred CCEEEECCCCCCCCC T ss_conf 997870883120011 No 23 >pfam03165 MH1 MH1 domain. The MH1 (MAD homology 1) domain is found at the amino terminus of MAD related proteins such as Smads. This domain is separated from the MH2 domain by a non-conserved linker region. The crystal structure of the MH1 domain shows that a highly conserved 11 residue beta hairpin is used to bind the DNA consensus sequence GNCN in the major groove, shown to be vital for the transcriptional activation of target genes. Not all examples of MH1 can bind to DNA however. Smad2 cannot bind DNA and has a large insertion within the hairpin that presumably abolishes DNA binding. A basic helix (H2) in MH1 with the nuclear localisation signal KKLKK has been shown to be essential for Smad3 nuclear import. Smads also use the MH1 domain to interact with transcription factors such as Jun, TFE3, Sp1, and Runx. 
Probab=24.44 E-value=29 Score=15.07 Aligned_cols=14 Identities=43% Similarity=1.127 Sum_probs=10.9 Q ss_pred CCCEECCCCHHHHH Q ss_conf 43200272023443 Q gi|254780914|r 221 YKNFCMAPYHYSSI 234 (408) Q Consensus 221 yknfcmapyhyssi 234 (408) ....|.-||||+-+ T Consensus 93 ~~~VC~NPyHy~Rv 106 (107) T pfam03165 93 KDEVCINPYHYSRV 106 (107) T ss_pred CCEEEECCCCHHHC T ss_conf 99388487334324 No 24 >pfam03542 Tuberin Tuberin. Tuberous sclerosis complex (TSC) is an autosomal dominant disorder and is characterized by the presence of hamartomas in many organs, such as brain, skin, heart, lung, and kidney. It is caused by mutation of either the TSC1 or TSC2 tumour suppressor gene. The TSC2 gene codes for tuberin, which interacts with hamartin (pfam04388), a protein containing two coiled-coil regions that have been shown to mediate binding to tuberin. These two proteins function within the same pathway(s) regulating cell cycle, cell growth, adhesion, and vesicular trafficking. Probab=23.53 E-value=41 Score=13.97 Aligned_cols=24 Identities=38% Similarity=0.780 Sum_probs=18.3 Q ss_pred CCCHHEEEECCCCCHHHHHHHHHHHHHHHHHHE Q ss_conf 590320442587653447898777778733430 Q gi|254780914|r 368 SDPSKYYEINSDENVMPIAKSLARNVITNWFSQ 400 (408) Q Consensus 368 sdpskyyeinsdenvmpiakslarnvitnwfsq 400 (408) .+|+||- --.-|||-+||..||-. T Consensus 328 Tnp~ky~---------~YiVsLAhhVIa~WFlk 351 (356) T pfam03542 328 TNPARYD---------HYTVSLAHHVIAAWFIK 351 (356) T ss_pred CCCCCCH---------HHHHHHHHHHHHHHHHH T ss_conf 7864301---------68899999999999998 No 25 >TIGR00708 cobA cob(I)alamin adenosyltransferase; InterPro: IPR003724 ATP:cob(I)alamin (or ATP:corrinoid) adenosyltransferases (2.5.1.17 from EC) catalyse the conversion of cobalamin (vitamin B12) into its coenzyme form, adenosylcobalamin (coenzyme B12). Adenosylcobalamin (AdoCbl) is required for the activity of certain enzymes. AdoCbl contains an adenosyl moiety liganded to the cobalt ion of cobalamin via a covalent Co-C bond, and its synthesis is unique to certain prokaryotes. 
ATP:cob(I)alamin adenosyltransferases are classed into three groups: CobA-type, EutT-type and PduO-type. Each of the three enzyme types appears to be specialised for particular AdoCbl-dependent enzymes or for the de novo synthesis of AdoCbl. PduO and EutT are distantly related, sharing short conserved motifs, while CobA is evolutionarily unrelated and is an example of convergent evolution. This entry represents the ATP:cob(I)alamin adenosyltransferases CobA (Salmonella typhimurium), CobO (Pseudomonas denitrificans), and BtuR (Escherichia coli). There is a high degree of sequence identity between these proteins. CobA is responsible for attaching the adenosyl moiety from ATP to the cobalt ion of the corrin ring, necessary for the conversion of cobalamin to adenosylcobalamin.; GO: 0005524 ATP binding, 0008817 cob(I)yrinic acid ac-diamide adenosyltransferase activity, 0009236 cobalamin biosynthetic process. Probab=23.49 E-value=22 Score=15.94 Aligned_cols=35 Identities=37% Similarity=0.627 Sum_probs=24.6 Q ss_pred HHHHHHHHHHHHHHH--HHHHHH---------HH--HCCCCCHHHHHH Q ss_conf 987667777666789--999877---------65--206833236677 Q gi|254780914|r 39 VVVDQVTVMQKTAWL--QEVLDH---------VI--YRTSPKNLYDLR 73 (408) Q Consensus 39 vvvdqvtvmqktawl--qevldh---------vi--yrtspknlydlr 73 (408) |..|.+|||-+-.|| .||++- || =|-.|+.|+++- T Consensus 118 vLLDE~~~~l~~GyL~veeV~~~L~~kp~~~~vvlTGR~aP~~L~~~A 165 (191) T TIGR00708 118 VLLDELTVALKFGYLDVEEVVEVLQEKPKSQHVVLTGRGAPQELVELA 165 (191) T ss_pred EEHHHHHHHHHCCCCCHHHHHHHHHCCCCCCEEEEECCCCCHHHHHHC T ss_conf 403423455534897889999998558456778886687868899751 No 26 >pfam01727 consensus Probab=22.05 E-value=46 Score=13.64 Aligned_cols=25 Identities=36% Similarity=0.602 Sum_probs=19.6 Q ss_pred HHHCCCCCCCEEEECCCCCCCCCCCCC Q ss_conf 410134678426413674344676677 Q gi|254780914|r 277 DMTSNQFGDGQVLTNTNHCFPHGASQN 303 (408) Q Consensus 277 dmtsnqfgdgqvltntnhcfphgasqn 303 (408) +-.-.|+|-|-.|++|| ||-|+|-- T Consensus 13 ~k~YqqyG~Gl~l~dtn--l~gGSSGS 37 (81) T pfam01727 
13 GKEYQQYGYGLALNDTN--LPGGSSGS 37 (81) T ss_pred CCCHHHHCCEEEEECCC--CCCCCCCC T ss_conf 40255424205760366--69987666 No 27 >TIGR01405 polC_Gram_pos DNA polymerase III, alpha subunit, Gram-positive type; InterPro: IPR006308 These are the polypeptide chains of DNA polymerase III. Full-length homologs of this protein are restricted to the Gram-positive lineages, including the Mycoplasmas. This protein is designated alpha chain and given the gene symbol polC, but is not a full-length homolog of other polC genes. The N-terminal region of about 200 amino acids is rich in low-complexity sequence and poorly alignable. ; GO: 0003677 DNA binding, 0003887 DNA-directed DNA polymerase activity, 0006260 DNA replication, 0005737 cytoplasm. Probab=20.79 E-value=34 Score=14.57 Aligned_cols=16 Identities=56% Similarity=0.879 Sum_probs=7.8 Q ss_pred CCEEEECCCCCCCCCC Q ss_conf 4225312552243543 Q gi|254780914|r 175 DGYWLASRGKVADSKV 190 (408) Q Consensus 175 dgywlasrgkvadskv 190 (408) |||...|||-|.-|=| T Consensus 687 DGYlVGSRGSVGSSlV 702 (1264) T TIGR01405 687 DGYLVGSRGSVGSSLV 702 (1264) T ss_pred CCCEECCCCCCHHHHH T ss_conf 8846637743114578 No 28 >pfam09702 Cas_Csa5 CRISPR-associated protein (Cas_Csa5). CRISPR is a term for Clustered, Regularly Interspaced Short Palindromic Repeats. A number of protein families appear only in association with these repeats and are designated Cas (CRISPR-Associated) proteins. This entry represents a minor family of Cas proteins found in various species of Sulfolobus and Pyrococcus (all archaeal). It is found with two different CRISPR loci in Sulfolobus solfataricus. 
Probab=20.25 E-value=55 Score=13.07 Aligned_cols=29 Identities=45% Similarity=0.608 Sum_probs=18.4 Q ss_pred EECCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCEE Q ss_conf 7345100333028999999998599887511476416 Q gi|254780914|r 310 AIGNQLSRSSVEKEKIEKVLQDCHYMHKRHRTGRDAI 346 (408) Q Consensus 310 aignqlsrssvekekiekvlqdchymhkrhrtgrdai 346 (408) -|||.|| ||..|+||-||.-.- |+|++.- T Consensus 13 Ri~NALs-----kEaV~~vl~e~~Ri~---~sg~~~~ 41 (105) T pfam09702 13 RIGNALS-----kEAVEKVLYEAQRIV---RSGIERG 41 (105) T ss_pred HHHHHCC-----HHHHHHHHHHHHHHH---HHHHCCC T ss_conf 9853247-----788999999999999---9754002 No 29 >TIGR01524 ATPase-IIIB_Mg magnesium-translocating P-type ATPase; InterPro: IPR006415 This group describes the magnesium-translocating P-type ATPase found in a limited number of bacterial species and best described in Salmonella typhimurium, which contains two isoforms. These transporters are active in low external Mg2+ concentrations and pump the ion into the cytoplasm. The magnesium ATPases have been classified as type IIIB by a phylogenetic analysis.; GO: 0015444 magnesium-importing ATPase activity, 0015693 magnesium ion transport, 0016021 integral to membrane. Probab=20.17 E-value=8.3 Score=18.95 Aligned_cols=65 Identities=29% Similarity=0.457 Sum_probs=42.3 Q ss_pred EEEEEEEEHHHHHHHHHH---CCCCEEHHHHHHHHHH----CCCCEEEEEECCCCEEEECCCCCCCCCCCCCHHCCHH Q ss_conf 631011103789998742---3582223554323332----0666388630004225312552243543871010277 Q gi|254780914|r 129 ALSTTVQLRGSLLQMFSQ---SKGKVDISRRKKVMYK----QNIGLMIMPFAWDGYWLASRGKVADSKVHPPKYLEYS 199 (408) Q Consensus 129 alsttvqlrgsllqmfsq---skgkvdisrrkkvmyk----qniglmimpfawdgywlasrgkvadskvhppkyleys 199 (408) ||+-.|-|--..|-|.-- .||-+..||||-.|-+ ||+|-|=. 
----.-|.....|..--+|+.-| T Consensus 312 AlavAVGLTPEMLPMIVssnLAkGAi~mSk~KVIvK~L~AIQNfGAMDi------LCTDKTGTLT~Dki~L~~h~D~s 383 (892) T TIGR01524 312 ALAVAVGLTPEMLPMIVSSNLAKGAIKMSKKKVIVKRLDAIQNFGAMDI------LCTDKTGTLTQDKIVLEKHLDVS 383 (892) T ss_pred HHHHHHCCCCCCCHHHHHHHHHHHHHHHCCCCEEEEECCCCCCCCCCCC------CCCCCCCCCCCHHHHHHHHHCCC T ss_conf 9998744772312267743567899986127425630330014343222------11388887430133221110258 Done!