Query 035227
Match_columns 70
No_of_seqs 102 out of 140
Neff 4.8
Searched_HMMs 46136
Date Fri Mar 29 10:14:43 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/035227.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/035227hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF06331 Tbf5: Transcription f 100.0 2.8E-39 6.1E-44 196.5 5.6 68 1-70 1-68 (68)
2 KOG3451 Uncharacterized conser 100.0 2.7E-36 5.9E-41 184.2 7.3 69 1-69 1-69 (71)
3 PF12392 DUF3656: Collagenase 68.0 22 0.00048 22.4 5.2 45 18-64 70-117 (122)
4 PF13625 Helicase_C_3: Helicas 57.2 8.9 0.00019 24.5 1.9 36 4-43 92-129 (129)
5 PF06793 UPF0262: Uncharacteri 50.9 30 0.00065 24.2 3.8 40 30-69 4-44 (158)
6 PF13296 T6SS_Vgr: Putative ty 49.0 8.8 0.00019 25.0 0.9 14 2-15 65-78 (109)
7 PF06470 SMC_hinge: SMC protei 45.1 6.6 0.00014 23.8 -0.1 27 9-40 88-115 (120)
8 PRK02853 hypothetical protein; 42.7 46 0.00099 23.4 3.7 40 30-69 8-48 (161)
9 TIGR01680 Veg_Stor_Prot vegeta 41.7 31 0.00067 25.9 2.9 31 13-43 83-113 (275)
10 PF08854 DUF1824: Domain of un 40.7 16 0.00035 24.4 1.2 8 5-12 98-105 (125)
11 COG4551 Predicted protein tyro 39.8 25 0.00055 23.1 2.0 33 30-62 75-107 (109)
12 PF01990 ATP-synt_F: ATP synth 39.7 27 0.00058 21.2 2.0 27 41-67 44-70 (95)
13 PF05325 DUF730: Protein of un 37.8 14 0.0003 24.5 0.5 18 3-20 14-31 (122)
14 cd06166 Sortase_D_5 Sortase D 35.9 28 0.00061 22.1 1.7 7 8-14 108-114 (126)
15 cd00004 Sortase Sortases are c 34.7 29 0.00062 21.8 1.6 8 8-15 108-115 (128)
16 cd01721 Sm_D3 The eukaryotic S 33.9 23 0.0005 20.7 1.0 13 4-16 22-34 (70)
17 PTZ00061 DNA-directed RNA poly 33.3 1E+02 0.0022 22.3 4.4 38 3-46 93-133 (205)
18 PF01166 TSC22: TSC-22/dip/bun 31.4 38 0.00083 20.2 1.6 28 38-65 3-31 (59)
19 PF09875 DUF2102: Uncharacteri 30.7 84 0.0018 20.6 3.3 10 38-47 54-63 (104)
20 cd01723 LSm4 The eukaryotic Sm 30.7 25 0.00053 20.9 0.7 14 4-17 23-36 (76)
21 KOG0667 Dual-specificity tyros 30.5 1.2E+02 0.0025 25.3 4.8 47 19-68 235-285 (586)
22 PF06395 CDC24: CDC24 Calponin 30.5 43 0.00093 21.2 1.9 35 12-46 42-80 (89)
23 PF08234 Spindle_Spc25: Chromo 30.4 29 0.00063 20.4 1.0 21 6-26 35-58 (74)
24 cd01726 LSm6 The eukaryotic Sm 29.5 30 0.00064 19.9 0.9 14 3-16 21-34 (67)
25 cd01733 LSm10 The eukaryotic S 29.4 29 0.00062 20.9 0.9 13 4-16 31-43 (78)
26 cd01725 LSm2 The eukaryotic Sm 29.1 30 0.00064 20.9 0.9 14 4-17 23-36 (81)
27 KOG3293 Small nuclear ribonucl 29.1 30 0.00065 23.6 1.0 56 5-60 25-86 (134)
28 TIGR01675 plant-AP plant acid 28.9 50 0.0011 23.8 2.2 32 13-44 57-90 (229)
29 COG1436 NtpG Archaeal/vacuolar 27.9 71 0.0015 20.5 2.6 45 10-66 27-73 (104)
30 PF01957 NfeD: NfeD-like C-ter 25.6 53 0.0012 20.3 1.7 21 28-48 123-143 (144)
31 PRK04333 50S ribosomal protein 25.4 74 0.0016 19.7 2.3 20 29-48 18-38 (84)
32 cd04927 ACT_ACR-like_2 Second 25.0 88 0.0019 18.1 2.5 31 32-64 44-74 (76)
33 COG3827 Uncharacterized protei 24.5 66 0.0014 23.8 2.2 17 49-65 194-210 (231)
34 COG3764 SrtA Sortase (surface 24.5 52 0.0011 23.6 1.6 18 8-36 178-195 (210)
35 KOG1228 Uncharacterized conser 24.5 78 0.0017 23.7 2.6 56 10-65 123-198 (256)
36 PF15005 IZUMO: Izumo sperm-eg 24.2 89 0.0019 21.6 2.7 17 9-25 6-22 (160)
37 COG3697 CitX Phosphoribosyl-de 23.9 2.8E+02 0.006 19.9 5.6 62 6-67 91-179 (182)
38 cd06165 Sortase_A_1 Sortase A 23.6 56 0.0012 20.6 1.5 8 8-15 107-114 (127)
39 COG1838 FumA Tartrate dehydrat 23.6 2E+02 0.0044 20.6 4.4 21 5-25 24-45 (184)
40 cd04899 ACT_ACR-UUR-like_2 C-t 23.5 90 0.0019 16.9 2.2 26 32-59 43-68 (70)
41 COG3700 AphA Acid phosphatase 23.5 58 0.0012 24.0 1.7 39 15-56 50-99 (237)
42 cd01724 Sm_D1 The eukaryotic S 23.2 41 0.00089 20.8 0.8 13 4-16 23-35 (90)
43 TIGR03272 methan_mark_6 putati 23.0 1.3E+02 0.0028 20.5 3.2 29 7-47 33-62 (132)
44 PRK02228 V-type ATP synthase s 22.9 1.2E+02 0.0025 18.9 2.9 26 41-66 46-71 (100)
45 PF10691 DUF2497: Protein of u 22.9 90 0.002 18.9 2.3 21 45-65 35-55 (73)
No 1
>PF06331 Tbf5: Transcription factor TFIIH complex subunit Tfb5; InterPro: IPR009400 This entry represents nucleotide excision repair (NER) proteins, such as TTDA subunit of TFIIH basal transcription factor complex (also known as subunit 5 of RNA polymerase II transcription factor B), and Rex1. These proteins have a structural motif consisting of a 2-layer sandwich structure with an alpha/beta plait topology. Nucleotide excision repair is a major pathway for repairing UV light-induced DNA damage in most organisms. Transcription/repair factor IIH (TFIIH) is essential for RNA polymerase II transcription and nucleotide excision repair. The TFIIH complex consists of ten subunits: ERCC2, ERCC3, GTF2H1, GTF2H2, GTF2H3, GTF2H4, GTF2H5, MNAT1, CDK7 and CCNH. Defects in GTF2H5 cause the disease trichothiodystrophy (TTD), therefore GTF2H5 (general transcription factor 2H subunit 5) is also known as the TTD group A (TTDA) subunit (and as Tfb5) []. The TTDA subunit is responsible for the DNA repair function of the complex. TTDA is present both bound to TFIIH, and as a free fraction that shuffles between the cytoplasm and nucleus; induction of NER-type DNA lesions shifts the balance towards TTDA's more stable association with TFIIH []. TTDA is also required for the stability of the TFIIH complex and for the presence of normal levels of TFIIH in the cell. REX1 (required for excision 1) is required for DNA repair in the single-celled, photosynthetic algae Chlamydomonas reinhardtii [], and has homologues in other eukaryotes.; GO: 0003677 DNA binding, 0006289 nucleotide-excision repair; PDB: 2JNJ_B 1YDL_A 3DGP_B 3DOM_B.
Probab=100.00 E-value=2.8e-39 Score=196.46 Aligned_cols=68 Identities=50% Similarity=0.909 Sum_probs=60.1
Q ss_pred CccceeeeEEeeCHHHHHHHHHhhcCCCCCCCeEEEeccCceeEEccchHHHHHHHHHHHHHhcCCCCCC
Q 035227 1 MVNAIKGLFISCDIPMAQFIINMNASMPQSQKFIIHILDSTHLFVQPNMAEMIRSAIAEFRDQNSYEKPA 70 (70)
Q Consensus 1 Mv~a~kGvLi~CDpaiKq~il~lde~~~~~~~FIIedLDdthlfV~~~~v~~lk~~l~~~l~~n~~~~~~ 70 (70)
||||+|||||+|||||||||++||++++.+ |||+|||||||||+++++++||+||+++|++|+|++++
T Consensus 1 Mv~a~kGvLv~CDpa~Kq~il~ld~~~~~~--FIIedLDdthlfV~~~~v~~lk~~l~~~l~~n~~~~~~ 68 (68)
T PF06331_consen 1 MVNAIKGVLVECDPAIKQFILHLDESMPHG--FIIEDLDDTHLFVKPDVVEMLKEELDELLDQNSYPPTE 68 (68)
T ss_dssp --EEEEEEEEES-HHHHHHHHHHHHHCCTS--SEEEEECTTEEEE-CCCHHHHHHHHHHCCCC-TTTTS-
T ss_pred CCceeeeEEEEcCHHHHHHHHHHhcCCCCC--eEEEEcCCCeEEEcHhHHHHHHHHHHHHHHhcCCCCCC
Confidence 999999999999999999999999987666 99999999999999999999999999999999999974
No 2
>KOG3451 consensus Uncharacterized conserved protein [Function unknown]
Probab=100.00 E-value=2.7e-36 Score=184.19 Aligned_cols=69 Identities=45% Similarity=0.800 Sum_probs=68.3
Q ss_pred CccceeeeEEeeCHHHHHHHHHhhcCCCCCCCeEEEeccCceeEEccchHHHHHHHHHHHHHhcCCCCC
Q 035227 1 MVNAIKGLFISCDIPMAQFIINMNASMPQSQKFIIHILDSTHLFVQPNMAEMIRSAIAEFRDQNSYEKP 69 (70)
Q Consensus 1 Mv~a~kGvLi~CDpaiKq~il~lde~~~~~~~FIIedLDdthlfV~~~~v~~lk~~l~~~l~~n~~~~~ 69 (70)
||||.||+||+||||+||+|+++|++++++.+|||++||||||||+++.++++|.+|+++|++|+|+|+
T Consensus 1 Mvna~KGvlV~cDp~~kqlilnmd~sm~~~skfii~eLDdthLfV~p~~vemvk~~le~~le~n~ye~~ 69 (71)
T KOG3451|consen 1 MVNAKKGVLVTCDPAFKQLILNMDDSMQLGSKFIIEELDDTHLFVNPSIVEMVKNELERILENNNYEAV 69 (71)
T ss_pred CCccccceEEecChhHHHHhhhccccCCCCCCeeEEEeccceeeecHHHHHHHHHHHHHHHHhcCCCcC
Confidence 999999999999999999999999999999999999999999999999999999999999999999987
No 3
>PF12392 DUF3656: Collagenase ; InterPro: IPR020988 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. The peptidases families associated with clan U- have an unknown catalytic mechanism as the protein fold of the active site domain and the active site residues have not been reported. This domain is found in a number of proteins belonging to the MEROPS peptidase family U32. 
Peptidase family U32 contains endopeptidases, including collagenase, from bacteria.
Probab=67.98 E-value=22 Score=22.43 Aligned_cols=45 Identities=11% Similarity=0.186 Sum_probs=34.7
Q ss_pred HHHHHhhcCCCCCCCeEEEecc---CceeEEccchHHHHHHHHHHHHHhc
Q 035227 18 QFIINMNASMPQSQKFIIHILD---STHLFVQPNMAEMIRSAIAEFRDQN 64 (70)
Q Consensus 18 q~il~lde~~~~~~~FIIedLD---dthlfV~~~~v~~lk~~l~~~l~~n 64 (70)
++.-+|... .+-.|.+++++ |..+|+..+.+..||++.-+.|++.
T Consensus 70 ~i~~ql~Kl--G~T~F~~~~i~i~~~~~lFlP~s~LN~lRRea~e~L~~~ 117 (122)
T PF12392_consen 70 RIRKQLSKL--GNTPFELENIEIDLDEGLFLPISELNELRREAVEKLEEK 117 (122)
T ss_pred HHHHHHHhh--CCCcEEEEEEEEEcCCCEEEEHHHHHHHHHHHHHHHHHH
Confidence 344455443 35579999986 7899999999999999988887754
No 4
>PF13625 Helicase_C_3: Helicase conserved C-terminal domain
Probab=57.23 E-value=8.9 Score=24.50 Aligned_cols=36 Identities=17% Similarity=0.309 Sum_probs=23.5
Q ss_pred cee-eeEEee-CHHHHHHHHHhhcCCCCCCCeEEEeccCcee
Q 035227 4 AIK-GLFISC-DIPMAQFIINMNASMPQSQKFIIHILDSTHL 43 (70)
Q Consensus 4 a~k-GvLi~C-DpaiKq~il~lde~~~~~~~FIIedLDdthl 43 (70)
..+ +++++| |+.+-+-|++.-+ -.+|+.+.+.+|++
T Consensus 92 l~~~~~~l~~~d~~~l~~l~~~~~----~~~~~~~~~~p~v~ 129 (129)
T PF13625_consen 92 LYKGAYLLECDDPELLDELLADPE----LAKLILRRIAPTVF 129 (129)
T ss_pred EecCeEEEEECCHHHHHHHHhChh----hhhhhccccCCCcC
Confidence 344 677777 6666666665533 56688888887753
No 5
>PF06793 UPF0262: Uncharacterised protein family (UPF0262); InterPro: IPR008321 There are currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.
Probab=50.95 E-value=30 Score=24.25 Aligned_cols=40 Identities=23% Similarity=0.211 Sum_probs=32.2
Q ss_pred CCCeEEEeccCc-eeEEccchHHHHHHHHHHHHHhcCCCCC
Q 035227 30 SQKFIIHILDST-HLFVQPNMAEMIRSAIAEFRDQNSYEKP 69 (70)
Q Consensus 30 ~~~FIIedLDdt-hlfV~~~~v~~lk~~l~~~l~~n~~~~~ 69 (70)
.++.+=-+|||. -.--.|++...=+-.+-+++++|+|.|+
T Consensus 4 ~~Rl~~i~LDe~s~~~~~p~vE~ER~vAIfDLleeN~F~p~ 44 (158)
T PF06793_consen 4 DQRLIDIELDEASIGRRTPEVEHERAVAIFDLLEENSFAPV 44 (158)
T ss_pred cCcEEEEEeCCCCCCCCCccHHHHHHHHHHHHHhcCeeccC
Confidence 566777789984 4456788888888899999999999975
No 6
>PF13296 T6SS_Vgr: Putative type VI secretion system Rhs element Vgr
Probab=48.97 E-value=8.8 Score=25.01 Aligned_cols=14 Identities=50% Similarity=0.574 Sum_probs=12.5
Q ss_pred ccceeeeEEeeCHH
Q 035227 2 VNAIKGLFISCDIP 15 (70)
Q Consensus 2 v~a~kGvLi~CDpa 15 (70)
|||-+|+||+.++.
T Consensus 65 vRa~~GlliSt~~~ 78 (109)
T PF13296_consen 65 VRAGKGLLISTEAR 78 (109)
T ss_pred hhcccEEEEEcCCC
Confidence 78999999999875
No 7
>PF06470 SMC_hinge: SMC proteins Flexible Hinge Domain; InterPro: IPR010935 This entry represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction [].; GO: 0005515 protein binding, 0005524 ATP binding, 0051276 chromosome organization, 0005694 chromosome; PDB: 2WD5_A 1GXL_C 1GXK_A 1GXJ_A 3NWC_B 3L51_A.
Probab=45.05 E-value=6.6 Score=23.84 Aligned_cols=27 Identities=11% Similarity=0.140 Sum_probs=21.3
Q ss_pred EEee-CHHHHHHHHHhhcCCCCCCCeEEEeccC
Q 035227 9 FISC-DIPMAQFIINMNASMPQSQKFIIHILDS 40 (70)
Q Consensus 9 Li~C-DpaiKq~il~lde~~~~~~~FIIedLDd 40 (70)
+|+| |+.++..+.++ .++-||++++|+
T Consensus 88 ~i~~~d~~~~~~~~~l-----lg~~~vv~~l~~ 115 (120)
T PF06470_consen 88 LIEFPDEEYRPALEFL-----LGDVVVVDDLEE 115 (120)
T ss_dssp GEEESCGGGHHHHHHH-----HTTEEEESSHHH
T ss_pred hcccCcHHHHHHHHHH-----cCCEEEECCHHH
Confidence 5799 98999988887 466688887764
No 8
>PRK02853 hypothetical protein; Provisional
Probab=42.66 E-value=46 Score=23.40 Aligned_cols=40 Identities=18% Similarity=0.289 Sum_probs=30.1
Q ss_pred CCCeEEEeccCcee-EEccchHHHHHHHHHHHHHhcCCCCC
Q 035227 30 SQKFIIHILDSTHL-FVQPNMAEMIRSAIAEFRDQNSYEKP 69 (70)
Q Consensus 30 ~~~FIIedLDdthl-fV~~~~v~~lk~~l~~~l~~n~~~~~ 69 (70)
.++.+=-+|||..+ --.|++...=+-.+-+++++|+|.|+
T Consensus 8 ~~Rl~~i~LDe~~~~~~~p~vE~ER~vAIfDLlEeN~F~p~ 48 (161)
T PRK02853 8 RNRLVDVELDEASIGRSTPDVEHERAVAIFDLLEENSFAPE 48 (161)
T ss_pred ccceEEEEeccccCCCCChhHHHHHHHHHHHHhhhceecCC
Confidence 44666677887443 34677777777888999999999986
No 9
>TIGR01680 Veg_Stor_Prot vegetative storage protein. The proteins represented by this model are close relatives of the plant acid phosphatases (TIGR01675) and are limited to members of the Phaseoleae including Glycine max (soybean) and Phaseolus vulgaris (kidney bean). These proteins are highly expressed in the leaves of repeatedly depodded plants. VSP differs most strikingly from the acid phosphatases in the lack of the conserved nucleophilic aspartate residue in the N-terminus; thus, they should be inactive as phosphatases. This issue was confused by the publication in 1992 of an article claiming activity for the Glycine max VSP. In 1994 this assertion was refuted by the separation of the activity from the VSP.
Probab=41.71 E-value=31 Score=25.88 Aligned_cols=31 Identities=13% Similarity=0.028 Sum_probs=22.7
Q ss_pred CHHHHHHHHHhhcCCCCCCCeEEEeccCcee
Q 035227 13 DIPMAQFIINMNASMPQSQKFIIHILDSTHL 43 (70)
Q Consensus 13 DpaiKq~il~lde~~~~~~~FIIedLDdthl 43 (70)
+..+.|...++.+....+.+-+|-|||||-|
T Consensus 83 ~~v~~~a~~y~~~~~~~~~dA~V~DIDET~L 113 (275)
T TIGR01680 83 KTVNQQAYFFARDLEVHEKDTFLFNIDGTAL 113 (275)
T ss_pred HHHHHHHHHHHHhCcCCCCCEEEEECccccc
Confidence 4456666666655544578999999999966
No 10
>PF08854 DUF1824: Domain of unknown function (DUF1824); InterPro: IPR014953 This uncharacterised group of proteins are principally found in cyanobacteria. ; PDB: 2Q22_B.
Probab=40.71 E-value=16 Score=24.42 Aligned_cols=8 Identities=50% Similarity=1.094 Sum_probs=6.7
Q ss_pred eeeeEEee
Q 035227 5 IKGLFISC 12 (70)
Q Consensus 5 ~kGvLi~C 12 (70)
.+||||+|
T Consensus 98 ~rGVLiSc 105 (125)
T PF08854_consen 98 GRGVLISC 105 (125)
T ss_dssp -BEEEEEE
T ss_pred cceEEEEe
Confidence 48999999
No 11
>COG4551 Predicted protein tyrosine phosphatase [General function prediction only]
Probab=39.80 E-value=25 Score=23.11 Aligned_cols=33 Identities=9% Similarity=0.360 Sum_probs=29.6
Q ss_pred CCCeEEEeccCceeEEccchHHHHHHHHHHHHH
Q 035227 30 SQKFIIHILDSTHLFVQPNMAEMIRSAIAEFRD 62 (70)
Q Consensus 30 ~~~FIIedLDdthlfV~~~~v~~lk~~l~~~l~ 62 (70)
+++.|--|+-|...|.+++.++.|++++.-+|-
T Consensus 75 ~kRviCLDIPDdy~yMq~eLi~lLkrkv~p~L~ 107 (109)
T COG4551 75 GKRVICLDIPDDYEYMQPELIDLLKRKVGPYLR 107 (109)
T ss_pred CCeEEEEeCCchHhhcCHHHHHHHHHhhhhhhc
Confidence 677888999899999999999999999987764
No 12
>PF01990 ATP-synt_F: ATP synthase (F/14-kDa) subunit; InterPro: IPR008218 ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport [, ]. The different types include: F-ATPases (F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). V-ATPases (V1V0-ATPases), which are primarily found in eukaryotic vacuoles and catalyse ATP hydrolysis to transport solutes and lower pH in organelles. A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases (though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases). P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes. E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP. The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex that forms the membrane-spanning pore. 
The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis [, , ]. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases. This entry represents subunit F found in the V1 complex of V-ATPases (both eukaryotic and bacterial), as well as in the A1 complex of A-ATPases. Subunit F is a 16 kDa protein that is required for the assembly and activity of V-ATPase, and has a potential role in the differential targeting and regulation of the enzyme for specific organelles. This subunit is not necessary for the rotation of the ATPase V1 rotor, but it does promote catalysis []. More information about this protein can be found at Protein of the Month: ATP Synthases [].; GO: 0046933 hydrogen ion transporting ATP synthase activity, rotational mechanism, 0046961 proton-transporting ATPase activity, rotational mechanism, 0015991 ATP hydrolysis coupled proton transport, 0033178 proton-transporting two-sector ATPase complex, catalytic domain; PDB: 2D00_E 3A5C_P 3J0J_H 3A5D_H 2OV6_A 2QAI_B 3AON_B 2I4R_A.
Probab=39.65 E-value=27 Score=21.18 Aligned_cols=27 Identities=26% Similarity=0.554 Sum_probs=22.7
Q ss_pred ceeEEccchHHHHHHHHHHHHHhcCCC
Q 035227 41 THLFVQPNMAEMIRSAIAEFRDQNSYE 67 (70)
Q Consensus 41 thlfV~~~~v~~lk~~l~~~l~~n~~~ 67 (70)
.-++|.++..+.++.++.+++.++.++
T Consensus 44 gIIii~e~~~~~~~~~l~~~~~~~~~P 70 (95)
T PF01990_consen 44 GIIIITEDLAEKIRDELDEYREESSLP 70 (95)
T ss_dssp EEEEEEHHHHTTHHHHHHHHHHTSSSS
T ss_pred cEEEeeHHHHHHHHHHHHHHHhccCCc
Confidence 457889999999999999998887543
No 13
>PF05325 DUF730: Protein of unknown function (DUF730); InterPro: IPR007989 This family consists of several uncharacterised Arabidopsis thaliana proteins of unknown function.
Probab=37.79 E-value=14 Score=24.55 Aligned_cols=18 Identities=28% Similarity=0.422 Sum_probs=12.9
Q ss_pred cceeeeEEeeCHHHHHHH
Q 035227 3 NAIKGLFISCDIPMAQFI 20 (70)
Q Consensus 3 ~a~kGvLi~CDpaiKq~i 20 (70)
|--|||+|+||-..|-.+
T Consensus 14 rrdkgv~ie~dcnakvvv 31 (122)
T PF05325_consen 14 RRDKGVPIECDCNAKVVV 31 (122)
T ss_pred ccCCCcceeccCCceEEE
Confidence 346899999987655443
No 14
>cd06166 Sortase_D_5 Sortase D (SrtD) is a membrane transpeptidase found in gram-positive bacteria that anchors surface proteins to peptidoglycans of the bacterial cell wall envelope. This involves a transpeptidation reaction in which the surface protein substrate is cleaved at the cell wall sorting signal and covalently linked to peptidoglycan for display on the bacterial surface. Sortases are grouped into different classes and subfamilies based on sequence, membrane topology, genomic positioning, and cleavage site preference. Class D sortases are further classified into subfamilies 4 and 5. This group contains a subset of Class D sortases belonging to subfamily-5, represented by Clostridium perfringens CPE2315. Subfamily-5 sortases recognize a nonstandard sorting signal (LAXTG) and have replaced Sortase A in some gram-positive bacteria. They may play a housekeeping role in the cell.
Probab=35.92 E-value=28 Score=22.10 Aligned_cols=7 Identities=29% Similarity=0.719 Sum_probs=6.4
Q ss_pred eEEeeCH
Q 035227 8 LFISCDI 14 (70)
Q Consensus 8 vLi~CDp 14 (70)
.|++|+|
T Consensus 108 tLiTC~~ 114 (126)
T cd06166 108 TLITCTP 114 (126)
T ss_pred EEEEcCC
Confidence 6999999
No 15
>cd00004 Sortase Sortases are cysteine transpeptidases, found in gram-positive bacteria, that anchor surface proteins to peptidoglycans of the bacterial cell wall envelope. They do so by catalyzing a transpeptidation reaction in which the surface protein substrate is cleaved at a conserved cell wall sorting signal and covalently linked to peptidoglycan for display on the bacterial surface. Sortases are grouped into different classes and subfamilies based on sequence, membrane topology, genomic positioning, and cleavage site preference. The different classes are called Sortase A or SrtA (subfamily 1), B or SrtB (subfamily 2), C or SrtC (subfamily 3), D or SrtD (subfamilies 4 and 5), and E or SrtE. In two different sortase subfamilies, the N-terminus either functions as both a signal peptide for secretion and a stop-transfer signal for membrane anchoring, or it contains a signal peptide only and the C-terminus serves as a membrane anchor. Most gram-positive bacteria contain more than one s
Probab=34.74 E-value=29 Score=21.80 Aligned_cols=8 Identities=38% Similarity=0.945 Sum_probs=7.1
Q ss_pred eEEeeCHH
Q 035227 8 LFISCDIP 15 (70)
Q Consensus 8 vLi~CDpa 15 (70)
.|++|+|.
T Consensus 108 tLiTC~~~ 115 (128)
T cd00004 108 TLITCTPP 115 (128)
T ss_pred EEEEcCCC
Confidence 69999987
No 16
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=33.87 E-value=23 Score=20.71 Aligned_cols=13 Identities=23% Similarity=0.401 Sum_probs=11.3
Q ss_pred ceeeeEEeeCHHH
Q 035227 4 AIKGLFISCDIPM 16 (70)
Q Consensus 4 a~kGvLi~CDpai 16 (70)
..+|.|+.||..|
T Consensus 22 ~~~G~L~~~D~~M 34 (70)
T cd01721 22 VYRGKLIEAEDNM 34 (70)
T ss_pred EEEEEEEEEcCCc
Confidence 5789999999976
No 17
>PTZ00061 DNA-directed RNA polymerase; Provisional
Probab=33.31 E-value=1e+02 Score=22.33 Aligned_cols=38 Identities=16% Similarity=0.236 Sum_probs=25.7
Q ss_pred cceeeeEEee---CHHHHHHHHHhhcCCCCCCCeEEEeccCceeEEc
Q 035227 3 NAIKGLFISC---DIPMAQFIINMNASMPQSQKFIIHILDSTHLFVQ 46 (70)
Q Consensus 3 ~a~kGvLi~C---DpaiKq~il~lde~~~~~~~FIIedLDdthlfV~ 46 (70)
++.+|+||.. -|+.++.|..+. .+|.||-.-++-|+|+
T Consensus 93 n~~r~IlV~q~~ltp~Ar~~i~~~~------~~~~iE~F~E~eLlvn 133 (205)
T PTZ00061 93 DIQRAILVTQNVLTPFAKDAILEAA------PRHIIENFLETELLVN 133 (205)
T ss_pred CCceEEEEECCCCCHHHHHHHHhhC------CCcEEEEeeehheEEe
Confidence 4678899887 466777777662 3477777666655554
No 18
>PF01166 TSC22: TSC-22/dip/bun family; InterPro: IPR000580 Several eukaryotic proteins are evolutionary related and are thought to be involved in transcriptional regulation. These proteins are highly similar in a region of about 50 residues that include a conserved leucine-zipper domain most probably involved in homo- or hetero-dimerisation. Proteins containing this signature include: Vertebrate protein TSC-22 [], a transcriptional regulator which seems to act on C-type natriuretic peptide (CNP) promoter. Mammalian protein DIP (DSIP-immunoreactive peptide) [], a protein whose function is not yet known. Drosophila protein bunched [] (gene bun) (also known as shortsighted), a probable transcription factor required for peripheral nervous system morphogenesis, eye development and oogenesis. Caenorhabditis elegans hypothetical protein T18D3.7. ; GO: 0003700 sequence-specific DNA binding transcription factor activity, 0006355 regulation of transcription, DNA-dependent; PDB: 1DIP_B.
Probab=31.39 E-value=38 Score=20.17 Aligned_cols=28 Identities=36% Similarity=0.526 Sum_probs=19.1
Q ss_pred ccCceeEE-ccchHHHHHHHHHHHHHhcC
Q 035227 38 LDSTHLFV-QPNMAEMIRSAIAEFRDQNS 65 (70)
Q Consensus 38 LDdthlfV-~~~~v~~lk~~l~~~l~~n~ 65 (70)
|=.|||.. =.+-|+.||+++.++.++|+
T Consensus 3 LVKtHLm~AVrEEVevLK~~I~eL~~~n~ 31 (59)
T PF01166_consen 3 LVKTHLMYAVREEVEVLKEQIAELEERNS 31 (59)
T ss_dssp SCCCHGGGT-TTSHHHHHHHHHHHHHHHH
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Confidence 33455532 34668999999999988774
No 19
>PF09875 DUF2102: Uncharacterized protein conserved in archaea (DUF2102); InterPro: IPR012025 The exact function of this protein is unknown, but it is likely linked to methanogenesis or a process closely connected to it.
Probab=30.67 E-value=84 Score=20.61 Aligned_cols=10 Identities=50% Similarity=0.986 Sum_probs=6.8
Q ss_pred ccCceeEEcc
Q 035227 38 LDSTHLFVQP 47 (70)
Q Consensus 38 LDdthlfV~~ 47 (70)
+|.+|+|||.
T Consensus 54 ld~~~IF~Kd 63 (104)
T PF09875_consen 54 LDPNHIFVKD 63 (104)
T ss_pred hCCCceEeec
Confidence 5777777764
No 20
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.67 E-value=25 Score=20.87 Aligned_cols=14 Identities=29% Similarity=0.567 Sum_probs=11.3
Q ss_pred ceeeeEEeeCHHHH
Q 035227 4 AIKGLFISCDIPMA 17 (70)
Q Consensus 4 a~kGvLi~CDpaiK 17 (70)
..+|.|..||+.|=
T Consensus 23 ~~~G~L~~~D~~mN 36 (76)
T cd01723 23 TYNGHLVNCDNWMN 36 (76)
T ss_pred EEEEEEEEEcCCCc
Confidence 57899999999553
No 21
>KOG0667 consensus Dual-specificity tyrosine-phosphorylation regulated kinase [General function prediction only]
Probab=30.54 E-value=1.2e+02 Score=25.26 Aligned_cols=47 Identities=21% Similarity=0.433 Sum_probs=38.2
Q ss_pred HHHHhhcCCCCCCCeEEEecc----CceeEEccchHHHHHHHHHHHHHhcCCCC
Q 035227 19 FIINMNASMPQSQKFIIHILD----STHLFVQPNMAEMIRSAIAEFRDQNSYEK 68 (70)
Q Consensus 19 ~il~lde~~~~~~~FIIedLD----dthlfV~~~~v~~lk~~l~~~l~~n~~~~ 68 (70)
+|-+||..-+..+.-+|.-+| -.|++| +.|.|..-|-+++.+|.|.+
T Consensus 235 iL~~ln~~d~~~~~n~Vrm~d~F~fr~Hlci---VfELL~~NLYellK~n~f~G 285 (586)
T KOG0667|consen 235 ILELLNKHDPDDKYNIVRMLDYFYFRNHLCI---VFELLSTNLYELLKNNKFRG 285 (586)
T ss_pred HHHHHhccCCCCCeeEEEeeeccccccceee---eehhhhhhHHHHHHhcCCCC
Confidence 677888776778888999988 467777 46788899999999999875
No 22
>PF06395 CDC24: CDC24 Calponin; InterPro: IPR010481 This is a calponin homology domain.
Probab=30.46 E-value=43 Score=21.18 Aligned_cols=35 Identities=31% Similarity=0.559 Sum_probs=23.8
Q ss_pred eCHHHHHHHHHhhcCC--CCCCCeEEEec--cCceeEEc
Q 035227 12 CDIPMAQFIINMNASM--PQSQKFIIHIL--DSTHLFVQ 46 (70)
Q Consensus 12 CDpaiKq~il~lde~~--~~~~~FIIedL--DdthlfV~ 46 (70)
|=.|+--||...-... |..+=|+|.|| |+|+=|||
T Consensus 42 ~K~ai~~Fi~ack~~L~~~~~e~FtIsdl~~~dT~gfvK 80 (89)
T PF06395_consen 42 CKKAIYKFIQACKQELGFPDEELFTISDLYGDDTNGFVK 80 (89)
T ss_pred HHHHHHHHHHHHHHhcCCCccceeeeeccccCCCcchhh
Confidence 3345666666654443 34567999999 78888886
No 23
>PF08234 Spindle_Spc25: Chromosome segregation protein Spc25; InterPro: IPR013255 This is a family of chromosome segregation proteins. It contains Spc25, which is a conserved eukaryotic kinetochore protein involved in cell division. In fungi the Spc25 protein is a subunit of the Nuf2-Ndc80 complex [], and in vertebrates it forms part of the Ndc80 complex []. ; PDB: 2VE7_B.
Probab=30.43 E-value=29 Score=20.44 Aligned_cols=21 Identities=33% Similarity=0.604 Sum_probs=8.0
Q ss_pred eeeEEeeCHHH---HHHHHHhhcC
Q 035227 6 KGLFISCDIPM---AQFIINMNAS 26 (70)
Q Consensus 6 kGvLi~CDpai---Kq~il~lde~ 26 (70)
+..+++|+|++ .+++-.||++
T Consensus 35 ~Y~v~~~~P~l~~l~~l~~~LN~t 58 (74)
T PF08234_consen 35 KYEVISCDPPLEDLDELVDELNKT 58 (74)
T ss_dssp -EE----------THHHHHHHHH-
T ss_pred eEEEEEecCCcchHHHHHHHHhcc
Confidence 45688999985 6777778774
No 24
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=29.51 E-value=30 Score=19.93 Aligned_cols=14 Identities=21% Similarity=0.242 Sum_probs=11.2
Q ss_pred cceeeeEEeeCHHH
Q 035227 3 NAIKGLFISCDIPM 16 (70)
Q Consensus 3 ~a~kGvLi~CDpai 16 (70)
+..+|.|..+|+.|
T Consensus 21 ~~~~G~L~~~D~~m 34 (67)
T cd01726 21 VDYRGILACLDGYM 34 (67)
T ss_pred CEEEEEEEEEccce
Confidence 35789999999854
No 25
>cd01733 LSm10 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-member ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.
Probab=29.43 E-value=29 Score=20.93 Aligned_cols=13 Identities=31% Similarity=0.386 Sum_probs=11.5
Q ss_pred ceeeeEEeeCHHH
Q 035227 4 AIKGLFISCDIPM 16 (70)
Q Consensus 4 a~kGvLi~CDpai 16 (70)
..+|.|..||..|
T Consensus 31 ~~~G~L~~vD~~M 43 (78)
T cd01733 31 TVTGRIASVDAFM 43 (78)
T ss_pred EEEEEEEEEcCCc
Confidence 5789999999987
No 26
>cd01725 LSm2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm2 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=29.12 E-value=30 Score=20.90 Aligned_cols=14 Identities=29% Similarity=0.413 Sum_probs=11.8
Q ss_pred ceeeeEEeeCHHHH
Q 035227 4 AIKGLFISCDIPMA 17 (70)
Q Consensus 4 a~kGvLi~CDpaiK 17 (70)
..+|.|..||+-|-
T Consensus 23 ~~~G~L~~vD~~MN 36 (81)
T cd01725 23 SIRGTLHSVDQYLN 36 (81)
T ss_pred EEEEEEEEECCCcc
Confidence 46899999999873
No 27
>KOG3293 consensus Small nuclear ribonucleoprotein (snRNP) [RNA processing and modification]
Probab=29.12 E-value=30 Score=23.59 Aligned_cols=56 Identities=13% Similarity=0.231 Sum_probs=35.1
Q ss_pred eeeeEEeeCHHHHHHHHHhhcCCCCCCCeEEE-e--ccC---ceeEEccchHHHHHHHHHHH
Q 035227 5 IKGLFISCDIPMAQFIINMNASMPQSQKFIIH-I--LDS---THLFVQPNMAEMIRSAIAEF 60 (70)
Q Consensus 5 ~kGvLi~CDpaiKq~il~lde~~~~~~~FIIe-d--LDd---thlfV~~~~v~~lk~~l~~~ 60 (70)
..|.|+.||-.|-=-+...=..++.+.+|.+- + |-- ..|=|..++++.+|++...-
T Consensus 25 ~nGhL~~cD~wMNl~L~~Vi~ts~Dgdkf~r~pEcYirGttIkylri~d~iid~vkee~~~~ 86 (134)
T KOG3293|consen 25 YNGHLVNCDNWMNLHLREVICTSEDGDKFFRMPECYIRGTTIKYLRIPDEIIDKVKEECVSN 86 (134)
T ss_pred ecceeecchhhhhcchheeEEeccCCCceeecceeEEecceeEEEeccHHHHHHHHHHHHHh
Confidence 57999999999864444444444456666553 2 111 23445677888888777654
No 28
>TIGR01675 plant-AP plant acid phosphatase. This model explicitly excludes the VSPs which lack the nucleophilic aspartate. The possibility exists, however, that some members of this family may, while containing all of the conserved HAD-superfamily catalytic residues, lack activity and have a function related to the function of the VSPs rather than the acid phosphatases.
Probab=28.90 E-value=50 Score=23.85 Aligned_cols=32 Identities=13% Similarity=-0.022 Sum_probs=21.1
Q ss_pred CHHHHHHHHHhhcCCC--CCCCeEEEeccCceeE
Q 035227 13 DIPMAQFIINMNASMP--QSQKFIIHILDSTHLF 44 (70)
Q Consensus 13 DpaiKq~il~lde~~~--~~~~FIIedLDdthlf 44 (70)
+..+.+.+.++++-.+ .+..-||=|+|||-|=
T Consensus 57 ~~v~~~a~~y~~~~~~~~dg~~A~V~DIDET~Ls 90 (229)
T TIGR01675 57 KRVVDEAYFYAKSLALSGDGMDAWIFDVDDTLLS 90 (229)
T ss_pred HHHHHHHHHHHHHhhccCCCCcEEEEcccccccc
Confidence 4445566666654433 3668999999999664
No 29
>COG1436 NtpG Archaeal/vacuolar-type H+-ATPase subunit F [Energy production and conversion]
Probab=27.90 E-value=71 Score=20.51 Aligned_cols=45 Identities=16% Similarity=0.238 Sum_probs=32.8
Q ss_pred EeeCH--HHHHHHHHhhcCCCCCCCeEEEeccCceeEEccchHHHHHHHHHHHHHhcCC
Q 035227 10 ISCDI--PMAQFIINMNASMPQSQKFIIHILDSTHLFVQPNMAEMIRSAIAEFRDQNSY 66 (70)
Q Consensus 10 i~CDp--aiKq~il~lde~~~~~~~FIIedLDdthlfV~~~~v~~lk~~l~~~l~~n~~ 66 (70)
+--+| ..++++..|-+. |=.-++|.++..+.|++++.++..++..
T Consensus 27 v~~~~~~~~~~~~~~l~~~------------~~~iIiite~~a~~i~~~i~~~~~~~~~ 73 (104)
T COG1436 27 VADDEEDELRAALRVLAED------------DVGIILITEDLAEKIREEIRRIIRSSVL 73 (104)
T ss_pred EecChhHHHHHHHHhhccC------------CceEEEEeHHHHhhhHHHHHHHhhccCc
Confidence 34455 467777777442 3346789999999999999999877654
No 30
>PF01957 NfeD: NfeD-like C-terminal, partner-binding; InterPro: IPR002810 The nfe genes (nfeA, nfeB, and nfeD) are involved in the nodulation efficiency and competitiveness of Rhizobium meliloti (Sinorhizobium meliloti) on alfalfa roots. The specific function of this family is unknown although it is unlikely that NfeD is specifically involved in nodulation as the family contains several different archaeal and bacterial species most of which are not symbionts. This entry describes archaeal and bacterial proteins which are variously described, examples are: nodulation protein, nodulation efficiency protein D (nfeD), hypothetical protein and membrane-bound serine protease (ClpP class). A number of these proteins are classified in MEROPS peptidase family S49 as non-peptidase homologues or as unassigned peptidases. ; PDB: 2K5H_A 3CP0_A 2EXD_A.
Probab=25.59 E-value=53 Score=20.33 Aligned_cols=21 Identities=19% Similarity=0.445 Sum_probs=16.5
Q ss_pred CCCCCeEEEeccCceeEEccc
Q 035227 28 PQSQKFIIHILDSTHLFVQPN 48 (70)
Q Consensus 28 ~~~~~FIIedLDdthlfV~~~ 48 (70)
+.+.+..|.+.|.++|+|++.
T Consensus 123 ~~G~~V~Vv~v~g~~L~V~~~ 143 (144)
T PF01957_consen 123 PKGDRVRVVGVEGNTLIVEPV 143 (144)
T ss_dssp -TT-EEEEEEEESSCEEEEE-
T ss_pred CCCCEEEEEEEECCEEEEEEC
Confidence 458899999999999999874
No 31
>PRK04333 50S ribosomal protein L14e; Validated
Probab=25.36 E-value=74 Score=19.66 Aligned_cols=20 Identities=20% Similarity=0.303 Sum_probs=15.3
Q ss_pred CCCCeEEEec-cCceeEEccc
Q 035227 29 QSQKFIIHIL-DSTHLFVQPN 48 (70)
Q Consensus 29 ~~~~FIIedL-DdthlfV~~~ 48 (70)
.++.|+|-++ |+.+++|.-.
T Consensus 18 ~gk~~vIv~i~d~~~vlVdg~ 38 (84)
T PRK04333 18 AGRKCVIVDIIDKNFVLVTGP 38 (84)
T ss_pred CCCEEEEEEEecCCEEEEECC
Confidence 4677777776 9999999654
No 32
>cd04927 ACT_ACR-like_2 Second ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). This CD includes the second ACT domain, of a novel type of ACT domain-containing protein which is composed almost entirely of four ACT domain repeats (the "ACR" protein). ACR proteins, found only in Arabidopsis and Oryza, as yet, are proposed to function as novel regulatory or sensor proteins in plants. Nine ACR gene products (ACR1-8 in Arabidopsis and OsARC1-9 in Oryza) have been described, however, the ACR-like sequences in this CD are distinct from those characterized. This CD includes the Oryza sativa ACR-like protein (Os05g0113000) encoded on chromosome 5 and the Arabidopsis thaliana predicted gene product, At2g39570. Members of this CD belong to the superfamily of ACT regulatory domains.
Probab=24.99 E-value=88 Score=18.14 Aligned_cols=31 Identities=10% Similarity=0.009 Sum_probs=21.9
Q ss_pred CeEEEeccCceeEEccchHHHHHHHHHHHHHhc
Q 035227 32 KFIIHILDSTHLFVQPNMAEMIRSAIAEFRDQN 64 (70)
Q Consensus 32 ~FIIedLDdthlfV~~~~v~~lk~~l~~~l~~n 64 (70)
-|.|.|-+.. ...++..+.|++.|.+.+-++
T Consensus 44 ~F~V~d~~~~--~~~~~~~~~l~~~L~~~L~~~ 74 (76)
T cd04927 44 LFFITDAREL--LHTKKRREETYDYLRAVLGDS 74 (76)
T ss_pred EEEEeCCCCC--CCCHHHHHHHHHHHHHHHchh
Confidence 4888876555 355677888888888776554
No 33
>COG3827 Uncharacterized protein conserved in bacteria [Function unknown]
Probab=24.51 E-value=66 Score=23.81 Aligned_cols=17 Identities=35% Similarity=0.616 Sum_probs=15.1
Q ss_pred hHHHHHHHHHHHHHhcC
Q 035227 49 MAEMIRSAIAEFRDQNS 65 (70)
Q Consensus 49 ~v~~lk~~l~~~l~~n~ 65 (70)
..++||=.|.+|||+|-
T Consensus 194 a~eMLRPmLqdWLDkNL 210 (231)
T COG3827 194 AAEMLRPMLQDWLDKNL 210 (231)
T ss_pred HHHHHHHHHHHHHHccc
Confidence 57899999999999994
No 34
>COG3764 SrtA Sortase (surface protein transpeptidase) [Cell envelope biogenesis, outer membrane]
Probab=24.47 E-value=52 Score=23.59 Aligned_cols=18 Identities=17% Similarity=0.447 Sum_probs=12.7
Q ss_pred eEEeeCHHHHHHHHHhhcCCCCCCCeEEE
Q 035227 8 LFISCDIPMAQFIINMNASMPQSQKFIIH 36 (70)
Q Consensus 8 vLi~CDpaiKq~il~lde~~~~~~~FIIe 36 (70)
.||+|+|. . .+++++|++
T Consensus 178 TLiTC~p~-------~----~~~~RlIv~ 195 (210)
T COG3764 178 TLITCTPY-------G----SATKRLIVK 195 (210)
T ss_pred EEEEccCC-------C----CCceeEEEE
Confidence 69999997 1 236677765
No 35
>KOG1228 consensus Uncharacterized conserved protein [Function unknown]
Probab=24.47 E-value=78 Score=23.66 Aligned_cols=56 Identities=30% Similarity=0.462 Sum_probs=37.6
Q ss_pred Eee-CHHHHHHHHHhhcCC------C---CCCCeEEEe---------c-cCceeEEccchHHHHHHHHHHHHHhcC
Q 035227 10 ISC-DIPMAQFIINMNASM------P---QSQKFIIHI---------L-DSTHLFVQPNMAEMIRSAIAEFRDQNS 65 (70)
Q Consensus 10 i~C-DpaiKq~il~lde~~------~---~~~~FIIed---------L-DdthlfV~~~~v~~lk~~l~~~l~~n~ 65 (70)
|+| |-.-.|++..+--+- . .++++||.- | +...|||.|+++.+|-..-.+.|++|.
T Consensus 123 Vqcrdlq~Aq~L~~~Ais~GFReSGIt~~~~~k~ivAIR~sirleVPlg~s~kLmVTpEYv~fL~~~anekmdeN~ 198 (256)
T KOG1228|consen 123 VQCRDLQDAQILHSMAISCGFRESGITVGKRGKTIVAIRSSIRLEVPLGHSGKLMVTPEYVDFLLNVANEKMDENK 198 (256)
T ss_pred EehhhhhhHHHHHHHHHhcCccccccccccCCcEEEEEEeeceeeeccCCCccEEecHHHHHHHHHHHHHHHhhhH
Confidence 455 455567766654321 1 245688753 2 467899999999999887777777763
No 36
>PF15005 IZUMO: Izumo sperm-egg fusion
Probab=24.16 E-value=89 Score=21.61 Aligned_cols=17 Identities=12% Similarity=0.417 Sum_probs=13.2
Q ss_pred EEeeCHHHHHHHHHhhc
Q 035227 9 FISCDIPMAQFIINMNA 25 (70)
Q Consensus 9 Li~CDpaiKq~il~lde 25 (70)
.++|||++.+=+.+|-.
T Consensus 6 CL~CDp~v~eal~~L~~ 22 (160)
T PF15005_consen 6 CLQCDPSVVEALKSLRH 22 (160)
T ss_pred eeeCCHHHHHHHHHHHH
Confidence 36899998888877744
No 37
>COG3697 CitX Phosphoribosyl-dephospho-CoA transferase (holo-ACP synthetase) [Coenzyme metabolism / Lipid metabolism]
Probab=23.92 E-value=2.8e+02 Score=19.88 Aligned_cols=62 Identities=13% Similarity=0.188 Sum_probs=43.7
Q ss_pred eeeEEeeCHH--HHHHHHHhhcCCCCCCCeEEEecc-Cce-------------eE-Ec----------cchHHHHHHHHH
Q 035227 6 KGLFISCDIP--MAQFIINMNASMPQSQKFIIHILD-STH-------------LF-VQ----------PNMAEMIRSAIA 58 (70)
Q Consensus 6 kGvLi~CDpa--iKq~il~lde~~~~~~~FIIedLD-dth-------------lf-V~----------~~~v~~lk~~l~ 58 (70)
-|.++.|=|| .|+.-..|.++.|.|+=+=|--|| +.| .| +. .+.++.|..+++
T Consensus 91 E~~~~i~apAr~LK~~mi~LE~~~PLGRLwDiDVi~~~g~~LSR~~~~lp~R~CLiC~q~A~~CaR~rkHsveell~kIe 170 (182)
T COG3697 91 EAMLSIAAPARDLKLAMIALEESHPLGRLWDIDVLDAEGEILSRRDFGLPPRRCLICEQSAKVCARGRKHSVEELLNKIE 170 (182)
T ss_pred ceEEEecCcHHHHHHHHHHHHhcCChhhhccceeeccCCCEeeccccCCCCceeEeehhhHHHHhccccccHHHHHHHHH
Confidence 4678888887 899999999999988755444444 222 23 32 345788888999
Q ss_pred HHHHhcCCC
Q 035227 59 EFRDQNSYE 67 (70)
Q Consensus 59 ~~l~~n~~~ 67 (70)
+++..+.+.
T Consensus 171 ~ll~d~~~~ 179 (182)
T COG3697 171 ALLHDYDAC 179 (182)
T ss_pred HHHhhhhhh
Confidence 888766543
No 38
>cd06165 Sortase_A_1 Sortase A (SrtA) or subfamily-1 sortases are cysteine transpeptidases found in gram-positive bacteria that anchor surface proteins to peptidoglycans of the bacterial cell wall envelope. They do so by catalyzing a transpeptidation reaction in which the surface protein substrate is cleaved at a conserved cell wall sorting signal (usually a pentapeptide motif), and covalently linked to peptidoglycan for display on the bacterial surface. Sortases are grouped into different classes and subfamilies based on sequence, membrane topology, genomic positioning, and cleavage site preference. This group contains a subset of Class A (subfamily-1) sortases, excluding SrtA from Staphylococcus aureus. Sortase A cleaves between threonine and glycine of the LPXTG motif in a wide range of protein substrates. It affects the ability of a pathogen to establish successful infection. Sortase A contains an N-terminal region that functions as both a signal peptide for secretion and a stop-tra
Probab=23.63 E-value=56 Score=20.58 Aligned_cols=8 Identities=38% Similarity=0.787 Sum_probs=7.0
Q ss_pred eEEeeCHH
Q 035227 8 LFISCDIP 15 (70)
Q Consensus 8 vLi~CDpa 15 (70)
.|++|.|.
T Consensus 107 tLiTC~p~ 114 (127)
T cd06165 107 TLITCDDA 114 (127)
T ss_pred EEEecCCC
Confidence 69999987
No 39
>COG1838 FumA Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain [Energy production and conversion]
Probab=23.58 E-value=2e+02 Score=20.56 Aligned_cols=21 Identities=10% Similarity=0.308 Sum_probs=19.3
Q ss_pred eeeeEEee-CHHHHHHHHHhhc
Q 035227 5 IKGLFISC-DIPMAQFIINMNA 25 (70)
Q Consensus 5 ~kGvLi~C-DpaiKq~il~lde 25 (70)
..|.+++| |.|-|-++-.+++
T Consensus 24 lsG~I~t~RD~AH~ri~e~~~~ 45 (184)
T COG1838 24 LSGKIVTGRDAAHKRLLEMLDR 45 (184)
T ss_pred EeeEEEEehhHHHHHHHHHHhc
Confidence 57999999 9999999999985
No 40
>cd04899 ACT_ACR-UUR-like_2 C-terminal ACT domains of the bacterial signal-transducing uridylyltransferase /uridylyl-removing (UUR) enzyme, GlnD and related domains. This ACT domain family, ACT_ACR-UUR-like_2, includes the second of two C-terminal ACT domains of the bacterial signal-transducing uridylyltransferase /uridylyl-removing (UUR) enzyme, GlnD; including those enzymes similar to the GlnD found in enteric Escherichia coli and those found in photosynthetic, nitrogen-fixing bacterium Rhodospirillum rubrum. Also included in this CD are the second and fourth ACT domains of a novel protein composed almost entirely of ACT domain repeats, the ACR protein. These ACR proteins, found in Arabidopsis and Oryza, are proposed to function as novel regulatory or sensor proteins in plants. Members of this CD belong to the superfamily of ACT regulatory domains.
Probab=23.54 E-value=90 Score=16.85 Aligned_cols=26 Identities=27% Similarity=0.469 Sum_probs=16.7
Q ss_pred CeEEEeccCceeEEccchHHHHHHHHHH
Q 035227 32 KFIIHILDSTHLFVQPNMAEMIRSAIAE 59 (70)
Q Consensus 32 ~FIIedLDdthlfV~~~~v~~lk~~l~~ 59 (70)
-|.|.+-+... +.++..+.|+++|.+
T Consensus 43 ~f~i~~~~~~~--~~~~~~~~i~~~l~~ 68 (70)
T cd04899 43 VFYVTDADGQP--LDPERQEALRAALGE 68 (70)
T ss_pred EEEEECCCCCc--CCHHHHHHHHHHHHh
Confidence 36677666554 566677777777654
No 41
>COG3700 AphA Acid phosphatase (class B) [General function prediction only]
Probab=23.54 E-value=58 Score=23.98 Aligned_cols=39 Identities=23% Similarity=0.409 Sum_probs=24.6
Q ss_pred HHHHHHHHhhcCCCCCCCeEEEeccCceeEE-----------ccchHHHHHHH
Q 035227 15 PMAQFIINMNASMPQSQKFIIHILDSTHLFV-----------QPNMAEMIRSA 56 (70)
Q Consensus 15 aiKq~il~lde~~~~~~~FIIedLDdthlfV-----------~~~~v~~lk~~ 56 (70)
++.|+=-.|.-..|-.=.| |+|||-||- .|...++|+..
T Consensus 50 SvaqI~~SLeG~~Pi~VsF---DIDDTvLFsSp~F~~Gk~~~sPgs~DyLknq 99 (237)
T COG3700 50 SVAQIENSLEGRPPIAVSF---DIDDTVLFSSPGFWRGKKYFSPGSEDYLKNQ 99 (237)
T ss_pred EHHHHHhhhcCCCCeeEee---ccCCeeEecccccccCccccCCChHHhhcCH
Confidence 3556655665443333345 999999995 45667777743
No 42
>cd01724 Sm_D1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing DB, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=23.19 E-value=41 Score=20.81 Aligned_cols=13 Identities=23% Similarity=0.406 Sum_probs=11.2
Q ss_pred ceeeeEEeeCHHH
Q 035227 4 AIKGLFISCDIPM 16 (70)
Q Consensus 4 a~kGvLi~CDpai 16 (70)
..+|.|..||+-|
T Consensus 23 ~~~G~L~~vD~~M 35 (90)
T cd01724 23 IVHGTITGVDPSM 35 (90)
T ss_pred EEEEEEEEEcCce
Confidence 4689999999977
No 43
>TIGR03272 methan_mark_6 putative methanogenesis marker protein 6. Members of this protein family, to date, are found in a completed prokaryotic genome if and only if the species is one of the archaeal methanogens. The exact function is unknown, but likely is linked to methanogenesis or a process closely connected to it.
Probab=23.02 E-value=1.3e+02 Score=20.49 Aligned_cols=29 Identities=24% Similarity=0.403 Sum_probs=16.7
Q ss_pred eeEEeeCHH-HHHHHHHhhcCCCCCCCeEEEeccCceeEEcc
Q 035227 7 GLFISCDIP-MAQFIINMNASMPQSQKFIIHILDSTHLFVQP 47 (70)
Q Consensus 7 GvLi~CDpa-iKq~il~lde~~~~~~~FIIedLDdthlfV~~ 47 (70)
|.+|+.+.. +..++-.+- .+|.+|+||+.
T Consensus 33 G~~i~G~~e~V~~~v~~iR------------~ld~~~IF~Kd 62 (132)
T TIGR03272 33 GAIITGPEEEVMKVAERIR------------ELDPNHIFVKD 62 (132)
T ss_pred eeeeeCCHHHHHHHHHHHH------------hhCCCceEeec
Confidence 555555543 444444442 36778888864
No 44
>PRK02228 V-type ATP synthase subunit F; Provisional
Probab=22.95 E-value=1.2e+02 Score=18.86 Aligned_cols=26 Identities=8% Similarity=0.054 Sum_probs=21.5
Q ss_pred ceeEEccchHHHHHHHHHHHHHhcCC
Q 035227 41 THLFVQPNMAEMIRSAIAEFRDQNSY 66 (70)
Q Consensus 41 thlfV~~~~v~~lk~~l~~~l~~n~~ 66 (70)
.-++|.++..+.+.+++.+.+++...
T Consensus 46 gII~Ite~~~~~i~e~i~~~~~~~~~ 71 (100)
T PRK02228 46 GILVMHDDDLEKLPRRLRRTLEESVE 71 (100)
T ss_pred EEEEEehhHhHhhHHHHHHHHhcCCC
Confidence 35789999999999999998877654
No 45
>PF10691 DUF2497: Protein of unknown function (DUF2497) ; InterPro: IPR019632 Members of this family belong to the Alphaproteobacteria. The function of the family is not known.
Probab=22.92 E-value=90 Score=18.91 Aligned_cols=21 Identities=29% Similarity=0.499 Sum_probs=16.5
Q ss_pred EccchHHHHHHHHHHHHHhcC
Q 035227 45 VQPNMAEMIRSAIAEFRDQNS 65 (70)
Q Consensus 45 V~~~~v~~lk~~l~~~l~~n~ 65 (70)
+..=+.++||=.|.+|||+|-
T Consensus 35 lE~lvremLRPmLkeWLD~nL 55 (73)
T PF10691_consen 35 LEDLVREMLRPMLKEWLDENL 55 (73)
T ss_pred HHHHHHHHHHHHHHHHHHhcc
Confidence 344467889999999999984
Done!