Query psy15380
Match_columns 1738
No_of_seqs 607 out of 2104
Neff 7.4
Searched_HMMs 46136
Date Fri Aug 16 19:30:33 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy15380.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/15380hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd01644 RT_pepA17 RT_pepA17: R 100.0 1E-46 2.2E-51 422.6 14.0 212 1162-1380 1-213 (213)
2 PF05380 Peptidase_A17: Pao re 100.0 6.4E-37 1.4E-41 329.4 13.3 144 1558-1704 16-159 (159)
3 cd03715 RT_ZFREV_like RT_ZFREV 99.9 5E-27 1.1E-31 265.9 19.9 181 1129-1335 10-191 (210)
4 cd01645 RT_Rtv RT_Rtv: Reverse 99.9 3.2E-25 7E-30 251.2 16.9 181 1130-1335 11-195 (213)
5 PF05585 DUF1758: Putative pep 99.9 9.5E-24 2.1E-28 229.4 13.6 159 878-1040 1-160 (164)
6 PF03564 DUF1759: Protein of u 99.9 6.5E-22 1.4E-26 210.9 13.3 142 441-628 1-143 (145)
7 cd01647 RT_LTR RT_LTR: Reverse 99.8 2.7E-20 5.9E-25 204.1 14.2 158 1147-1335 1-158 (177)
8 cd03714 RT_DIRS1 RT_DIRS1: Rev 99.6 1.1E-15 2.5E-20 157.2 11.6 98 1226-1335 1-99 (119)
9 PF05380 Peptidase_A17: Pao re 99.5 9.8E-16 2.1E-20 165.3 1.1 95 1393-1501 1-101 (159)
10 PF00078 RVT_1: Reverse transc 98.8 6E-09 1.3E-13 118.3 8.7 119 1213-1335 56-193 (214)
11 cd00304 RT_like RT_like: Rever 98.6 5.6E-08 1.2E-12 96.4 6.1 79 1226-1337 1-79 (98)
12 cd05484 retropepsin_like_LTR_2 98.5 2.7E-07 5.8E-12 90.4 8.9 67 889-958 11-78 (91)
13 PF08284 RVP_2: Retroviral asp 98.4 1.1E-06 2.4E-11 92.2 9.8 101 866-1009 16-118 (135)
14 PF00077 RVP: Retroviral aspar 98.3 8.8E-07 1.9E-11 88.3 6.3 74 875-958 7-80 (100)
15 PF12384 Peptidase_A2B: Ty3 tr 98.3 3.4E-06 7.5E-11 87.5 9.5 84 889-975 45-130 (177)
16 cd05479 RP_DDI RP_DDI; retrope 98.2 5.9E-06 1.3E-10 85.7 9.6 81 889-1007 27-111 (124)
17 PF13650 Asp_protease_2: Aspar 98.1 1.2E-05 2.6E-10 78.1 9.4 67 889-958 9-77 (90)
18 PF09668 Asp_protease: Asparty 98.0 1E-05 2.2E-10 82.7 6.9 90 875-1007 26-119 (124)
19 cd05481 retropepsin_like_LTR_1 98.0 2.3E-05 4.9E-10 76.9 8.1 69 888-959 9-81 (93)
20 cd01648 TERT TERT: Telomerase 97.8 3.7E-05 8E-10 79.2 6.7 90 1226-1335 1-91 (119)
21 cd05483 retropepsin_like_bacte 97.7 0.00013 2.9E-09 71.6 9.0 64 875-944 4-68 (96)
22 cd06095 RP_RTVL_H_like Retrope 97.7 0.00016 3.4E-09 70.1 8.2 39 889-929 9-47 (86)
23 PF13975 gag-asp_proteas: gag- 97.6 0.0001 2.3E-09 68.7 6.5 58 874-936 9-68 (72)
24 TIGR02281 clan_AA_DTGA clan AA 97.5 0.0003 6.5E-09 72.6 8.7 50 875-929 13-64 (121)
25 cd05480 NRIP_C NRIP_C; putativ 97.5 0.00031 6.7E-09 67.7 7.1 83 888-1007 8-95 (103)
26 cd06094 RP_Saci_like RP_Saci_l 97.1 0.00098 2.1E-08 63.5 6.1 53 889-944 9-61 (89)
27 cd01650 RT_nLTR_like RT_nLTR: 97.0 0.00071 1.5E-08 77.5 5.1 99 1218-1335 79-180 (220)
28 cd03487 RT_Bac_retron_II RT_Ba 96.9 0.001 2.2E-08 76.1 5.3 116 1217-1335 53-174 (214)
29 cd00303 retropepsin_like Retro 96.7 0.0076 1.6E-07 56.6 8.6 68 888-958 8-78 (92)
30 TIGR03698 clan_AA_DTGF clan AA 96.5 0.013 2.9E-07 59.1 9.4 67 876-944 2-70 (107)
31 KOG0012|consensus 96.4 0.0031 6.8E-08 73.7 4.9 39 888-926 245-286 (380)
32 PF02160 Peptidase_A3: Caulifl 96.4 0.0054 1.2E-07 67.6 6.2 99 878-1012 9-108 (201)
33 PF00098 zf-CCHC: Zinc knuckle 96.0 0.0035 7.5E-08 41.8 1.5 18 722-739 1-18 (18)
34 cd01646 RT_Bac_retron_I RT_Bac 95.8 0.02 4.3E-07 62.2 7.3 72 1260-1335 49-120 (158)
35 cd01651 RT_G2_intron RT_G2_int 95.8 0.0087 1.9E-07 68.7 4.7 119 1217-1335 66-204 (226)
36 cd05482 HIV_retropepsin_like R 95.8 0.028 6.1E-07 54.3 7.2 65 889-958 9-73 (87)
37 cd06222 RnaseH RNase H (RNase 95.8 0.042 9.1E-07 56.0 9.1 93 1626-1734 2-98 (130)
38 PF00075 RNase_H: RNase H; In 95.6 0.009 1.9E-07 62.5 3.2 67 1623-1715 3-75 (132)
39 COG0328 RnhA Ribonuclease HI [ 95.5 0.037 7.9E-07 59.0 7.4 80 1623-1722 3-93 (154)
40 PRK07708 hypothetical protein; 95.3 0.2 4.3E-06 57.3 13.2 81 1623-1716 73-159 (219)
41 COG5082 AIR1 Arginine methyltr 94.9 0.014 3.1E-07 63.3 2.3 72 686-757 58-138 (190)
42 PF12382 Peptidase_A2E: Retrot 94.8 0.071 1.5E-06 50.7 6.2 69 889-961 47-117 (137)
43 COG5082 AIR1 Arginine methyltr 94.7 0.02 4.3E-07 62.2 2.8 36 719-755 58-93 (190)
44 PF14893 PNMA: PNMA 94.0 0.12 2.6E-06 62.1 7.6 97 436-533 164-268 (331)
45 PF14223 UBN2: gag-polypeptide 94.0 0.39 8.4E-06 49.4 10.5 71 578-649 42-114 (119)
46 PF09337 zf-H2C2: His(2)-Cys(2 94.0 0.025 5.4E-07 45.9 1.2 34 313-346 5-39 (39)
47 PTZ00368 universal minicircle 93.6 0.07 1.5E-06 57.3 4.3 53 688-749 27-86 (148)
48 PTZ00368 universal minicircle 93.3 0.071 1.5E-06 57.3 3.7 60 688-756 52-120 (148)
49 PF07727 RVT_2: Reverse transc 92.7 0.049 1.1E-06 63.6 1.4 102 1222-1325 78-190 (246)
50 PF14227 UBN2_2: gag-polypepti 92.5 1 2.2E-05 46.4 10.8 72 578-650 40-113 (119)
51 PF13456 RVT_3: Reverse transc 92.5 0.16 3.5E-06 48.6 4.6 54 24-78 19-75 (87)
52 PRK00203 rnhA ribonuclease H; 92.4 0.3 6.4E-06 52.7 6.9 90 1624-1735 4-110 (150)
53 PF00336 DNA_pol_viral_C: DNA 92.0 0.12 2.7E-06 55.9 3.3 49 28-88 142-196 (245)
54 cd06222 RnaseH RNase H (RNase 89.2 0.39 8.4E-06 48.7 4.0 75 2-84 45-127 (130)
55 PRK08719 ribonuclease H; Revie 88.4 0.72 1.6E-05 49.5 5.4 71 1622-1713 3-82 (147)
56 PRK13907 rnhA ribonuclease H; 88.3 2.6 5.6E-05 44.0 9.5 74 1624-1717 2-81 (128)
57 KOG4400|consensus 88.3 0.2 4.3E-06 59.2 1.2 35 722-756 144-180 (261)
58 COG3577 Predicted aspartyl pro 88.1 0.96 2.1E-05 50.0 6.1 43 889-931 116-160 (215)
59 PRK06548 ribonuclease H; Provi 86.0 2 4.3E-05 46.8 7.1 68 1623-1714 5-78 (161)
60 PF00075 RNase_H: RNase H; In 82.9 1.5 3.1E-05 45.8 4.4 52 27-78 58-116 (132)
61 PF13456 RVT_3: Reverse transc 82.6 2.7 5.9E-05 40.0 5.8 56 1672-1736 4-59 (87)
62 PF14787 zf-CCHC_5: GAG-polypr 81.4 0.87 1.9E-05 35.9 1.4 20 721-740 2-21 (36)
63 PRK07238 bifunctional RNase H/ 80.5 7.7 0.00017 48.4 10.3 77 1623-1716 2-84 (372)
64 COG5550 Predicted aspartyl pro 79.7 7.2 0.00016 39.9 7.6 68 874-944 11-80 (125)
65 PF13696 zf-CCHC_2: Zinc knuck 78.5 0.84 1.8E-05 35.3 0.6 19 722-740 9-27 (32)
66 PF13917 zf-CCHC_3: Zinc knuck 75.7 1 2.3E-05 37.2 0.4 20 720-739 3-22 (42)
67 COG0328 RnhA Ribonuclease HI [ 75.4 4.7 0.0001 43.4 5.3 69 1-77 46-128 (154)
68 PF03732 Retrotrans_gag: Retro 74.5 4.1 8.9E-05 39.6 4.4 52 474-526 2-57 (96)
69 cd01709 RT_like_1 RT_like_1: A 74.0 7.1 0.00015 47.2 6.9 111 1208-1336 39-150 (346)
70 PRK13907 rnhA ribonuclease H; 73.5 4.4 9.6E-05 42.2 4.6 59 24-83 59-121 (128)
71 KOG0119|consensus 72.2 1.6 3.6E-05 53.6 1.1 42 689-740 262-304 (554)
72 smart00343 ZnF_C2HC zinc finge 69.0 2.3 5E-05 31.3 0.9 17 723-739 1-17 (26)
73 KOG4400|consensus 66.1 3.2 6.9E-05 49.1 1.8 39 689-740 144-183 (261)
74 PF00098 zf-CCHC: Zinc knuckle 65.1 4.3 9.4E-05 27.3 1.5 16 690-705 2-18 (18)
75 PF14787 zf-CCHC_5: GAG-polypr 62.1 4.1 8.8E-05 32.3 1.1 18 688-705 2-20 (36)
76 KOG3752|consensus 58.1 22 0.00047 43.2 6.7 72 1624-1714 213-293 (371)
77 PRK00203 rnhA ribonuclease H; 55.8 14 0.00031 39.7 4.4 51 27-77 62-125 (150)
78 PRK06548 ribonuclease H; Provi 53.7 21 0.00045 39.0 5.2 50 27-76 62-124 (161)
79 PRK07708 hypothetical protein; 48.9 25 0.00054 40.4 5.1 53 28-81 142-200 (219)
80 PRK07238 bifunctional RNase H/ 46.3 25 0.00054 43.9 5.1 74 2-84 49-127 (372)
81 PF13696 zf-CCHC_2: Zinc knuck 42.3 10 0.00022 29.6 0.4 19 687-705 7-26 (32)
82 PF00336 DNA_pol_viral_C: DNA 41.1 20 0.00044 39.6 2.6 60 1624-1711 95-154 (245)
83 PF12353 eIF3g: Eukaryotic tra 40.6 13 0.00027 39.0 1.0 19 686-704 104-122 (128)
84 PF14392 zf-CCHC_4: Zinc knuck 38.8 11 0.00025 32.4 0.3 17 722-738 32-48 (49)
85 KOG4768|consensus 36.0 28 0.0006 44.5 3.0 131 1205-1335 344-539 (796)
86 PF14244 UBN2_3: gag-polypepti 34.2 1.1E+02 0.0024 32.9 7.1 84 436-526 8-115 (152)
87 KOG1005|consensus 27.5 1.1E+02 0.0023 41.2 6.3 90 1260-1357 630-720 (888)
88 PF13917 zf-CCHC_3: Zinc knuck 23.9 40 0.00086 28.2 1.0 19 687-705 3-22 (42)
89 PRK08719 ribonuclease H; Revie 22.8 1.3E+02 0.0029 32.3 5.1 48 29-76 69-129 (147)
90 PF15288 zf-CCHC_6: Zinc knuck 22.1 49 0.0011 27.3 1.2 19 722-740 2-22 (40)
No 1
>cd01644 RT_pepA17 RT_pepA17: Reverse transcriptase (RTs) in retrotransposons. This subfamily represents the RT domain of a multifunctional enzyme. C-terminal to the RT domain is a domain homologous to aspartic proteinases (corresponding to Merops family A17) encoded by retrotransposons and retroviruses. RT catalyzes DNA replication from an RNA template and is responsible for the replication of retroelements.
Probab=100.00 E-value=1e-46 Score=422.58 Aligned_cols=212 Identities=36% Similarity=0.670 Sum_probs=194.6
Q ss_pred ceeeccccccCCCCCCCceEEeecCCccccCCCCcccccccCccccchhHHHhhhcccCcceeecccccceeeeEecccC
Q psy15380 1162 GYFIPHHVVTKPDSPSSKMRIVYDASTKTSSGKSLNDILHAGSKMYNDLFCILLKFRLFPYALNGDITKMFLQIKLLSDY 1241 (1738)
Q Consensus 1162 ~~y~Ph~~V~k~~~~ttk~Riv~D~sa~~~~~~sLN~~l~~gp~~~~~l~~iL~r~r~~~~~~~~Dl~kaf~qv~l~~~d 1241 (1738)
.||+|||+|+|++|++||+|||+|||++. +|.+||+.+.+||+++|+|.++|++||+++||+++||++|||||+|+|+|
T Consensus 1 ~~y~ph~~V~~~~~~~~k~R~V~D~s~~~-~g~sLN~~l~~gp~~~~~l~~iL~~~R~~~~~~~~Di~~af~qI~i~~~d 79 (213)
T cd01644 1 VWYLPHHAVIKPSKTTTKLRVVFDASARY-NGVSLNDMLLKGPDLLNSLFGVLLRFRQGKIAVSADIEKMFHQVKVRPED 79 (213)
T ss_pred CcccCCceecCCCCCCCccEEEEeccccc-CCchhhHHhccCCccccchhhhheeeecCceeEehhHHHhhhheecCccc
Confidence 49999999999999899999999999984 89999999999999999999999999999999999999999999999999
Q ss_pred cceEEEEeccCCcccc-eEEEEEEeeeccccchHHHHHHHHHHHHhhhhcCcccccccccceeccceeeecCCHHHHHHH
Q psy15380 1242 WKFQKILWRFSNKEKI-DVYELRVVIFGTKASPYLAQRTVKQLIDDESKNFPLACQYIGESLYMDDCVISFPSQKEAIEF 1320 (1738)
Q Consensus 1242 r~~~~f~w~~~~~~~~-~~y~f~r~pFGl~~SP~~~~~~l~~~l~~~~~~~p~~~~~i~~~~YvDDili~~~s~ee~~~~ 1320 (1738)
+++++|+|+++++.+. +.|+|+|||||+++||++||++|++++.++.+. .+++.+...+|||||+++++|.+||.+.
T Consensus 80 ~~~~~F~w~~~~~~~~~~~Y~~~~~pFG~~~AP~~~~~~~~~~~~~~~~~--~~~~~i~~~~YvDDili~~~s~~e~~~~ 157 (213)
T cd01644 80 RDVLRFLWRKDGDEPKPIEYRMTVVPFGAASAPFLANRALKQHAEDHPHE--AAAKIIKRNFYVDDILVSTDTLNEAVNV 157 (213)
T ss_pred CceEEEEEeCCCCCCcceEEEEEEEccCCccchHHHHHHHHHHHhhcchh--hHHHHHHHeeecccceecCCCHHHHHHH
Confidence 9999999999876665 999999999999999999999999999988764 3566677889999999999999999999
Q ss_pred HHHHHHHHHhCCceEeeeccCChHhhhcCCCccchhhccccCCCCCCceEeeeEEeccCC
Q psy15380 1321 FNQTVKLFASGGFRFTKWSSNSPEILEQIPIHDRLAEMVSWHTDDFSHKILGMLWNTSSD 1380 (1738)
Q Consensus 1321 ~~~v~~~l~~~Gf~l~k~~SN~~~vl~~i~~~~~~~~~~~~~~~~~~~k~LGi~W~~~~D 1380 (1738)
++++.++|+++||+++||+||+.++++.++++.. ..-.+.+..+|+||+.|++.+|
T Consensus 158 ~~~v~~~L~~~Gf~l~kw~sn~~~~l~~~~~~~~----~~~~~~~~~~k~LGl~W~~~~D 213 (213)
T cd01644 158 AKRLIALLKKGGFNLRKWASNSQEVLDDLPEERV----LLDRDSDVTEKTLGLRWNPKTD 213 (213)
T ss_pred HHHHHHHHHhCCccchhcccCchhhhhccccccc----ccccccccchhcccceeeccCC
Confidence 9999999999999999999999999999998631 1112344669999999999887
No 2
>PF05380 Peptidase_A17: Pao retrotransposon peptidase; InterPro: IPR008042 This signature identifies members of the Pao retrotransposon family.
Probab=100.00 E-value=6.4e-37 Score=329.37 Aligned_cols=144 Identities=40% Similarity=0.633 Sum_probs=139.8
Q ss_pred ccccchhHHHHHHHHHHHHHhcCCCCCCCCChhhHHHHHHHHHhcCCCCcEEEeeeeeccCCCceeeeeeccccccccee
Q psy15380 1558 GRGLLAPVILWAKLLIRELCILKLDWDSIPPHHLVQLWQTFQAQLPLLESLAFPRHIGVYKDCKFQLIGFSDASEKGYGA 1637 (1738)
Q Consensus 1558 plg~~~~~~~~~k~l~q~l~~~~~~WD~~l~~~~~~~w~~~~~~l~~l~~~~ipR~~~~~~~~~~~L~~F~DAS~~ayga 1637 (1738)
|||+++|+++.+|+|||++|+.+++||++||+++...|..|.+++..++++.|||++...+...++||+|||||+.||||
T Consensus 16 PlGl~~p~~i~~K~llq~lw~~~l~WD~~lp~el~~~w~~~~~~l~~~~~i~iPR~i~~~~~~~~~L~~F~DAS~~ayga 95 (159)
T PF05380_consen 16 PLGLLAPIIIRAKLLLQKLWQSKLDWDDPLPDELRKEWKKWLKELESLSPIRIPRCIPISDYRSVELHVFCDASESAYGA 95 (159)
T ss_pred cchhhHHHHHHHHHHHHhhhccccchhhhhhHHHHHHHHHHHHHHhhcccccCCcccccccccceeeeEeecccccceee
Confidence 99999999999999999999999999999999999999999999999999999998876666789999999999999999
Q ss_pred EEEEEEecCCCcEEEEEEeeccccccCCCCccchhhHHHHHHHHHHHHHHHHhcCCCCCcccEEEEe
Q psy15380 1638 LIFSRVSLPNGSIVIQLICAKSRVSPLKVESIPRLELCAILLMSSLLKVVIDSYTPRYPIDSIYCFT 1704 (1738)
Q Consensus 1638 ~vy~r~~~~~g~~~~~l~~aKsrv~p~k~~tIprlEL~a~~~~~~l~~~~~~~l~~~~~~~~~~~~t 1704 (1738)
|+|+|+ ..+|...+.+++||+||+|+|+.|||||||+|+++|++|+.++.++++ +++.+++|||
T Consensus 96 vvYlr~-~~~~~~~~~ll~aKsrv~P~k~~tIPRlEL~a~~l~~~l~~~~~~~l~--~~~~~~~~wt 159 (159)
T PF05380_consen 96 VVYLRS-YSDGSVQVRLLFAKSRVAPLKTVTIPRLELLAALLGVRLANTVKKELD--IEISQVVFWT 159 (159)
T ss_pred EeEeee-ccCCceeeeeeeecccccCCCCCcHHHHHHHHHHHHHHHHHHHHHHcC--CCcceeEEeC
Confidence 999999 789999999999999999999999999999999999999999999999 9999999997
No 3
>cd03715 RT_ZFREV_like RT_ZFREV_like: A subfamily of reverse transcriptases (RTs) found in sequences similar to the intact endogenous retrovirus ZFERV from zebrafish and to Moloney murine leukemia virus RT. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs. Phylogenetic analysis suggests that ZFERV belongs to a distinct group of retroviruses.
Probab=99.95 E-value=5e-27 Score=265.93 Aligned_cols=181 Identities=17% Similarity=0.244 Sum_probs=164.5
Q ss_pred chhhHHHHHHHHHHHHhcCCeeecCCCCCCCCCceeeccccccCCCCCCCceEEeecCCccccCCCCcccccccCccccc
Q psy15380 1129 NINLRCDYNQAMTDLIDNGFMVKCASLDQGDLNGYFIPHHVVTKPDSPSSKMRIVYDASTKTSSGKSLNDILHAGSKMYN 1208 (1738)
Q Consensus 1129 ~p~l~~~y~~~i~e~l~~G~ie~v~~~~~~~~~~~y~Ph~~V~k~~~~ttk~Riv~D~sa~~~~~~sLN~~l~~gp~~~~ 1208 (1738)
.|+.+++.+++|++|++.|+|+++.++ |..|.++|.|+++ +++|+|+|+|. ||..+....+++|
T Consensus 10 ~~~~~~~~~~~v~~ll~~G~I~~~~s~-------~~sp~~~V~Kk~g--~~~R~~vD~r~-------lN~~~~~~~~~~p 73 (210)
T cd03715 10 PREAREGITPHIQELLEAGILVPCQSP-------WNTPILPVKKPGG--NDYRMVQDLRL-------VNQAVLPIHPAVP 73 (210)
T ss_pred CHHHHHHHHHHHHHHHHCCCeECCCCC-------CCCceEEEEeCCC--CcceEEEEhhh-------hhhcccccCcCCC
Confidence 367889999999999999999999664 9999999999864 39999999965 7999999999999
Q ss_pred hhHHHhhhcc-cCcceeecccccceeeeEecccCcceEEEEeccCCcccceEEEEEEeeeccccchHHHHHHHHHHHHhh
Q psy15380 1209 DLFCILLKFR-LFPYALNGDITKMFLQIKLLSDYWKFQKILWRFSNKEKIDVYELRVVIFGTKASPYLAQRTVKQLIDDE 1287 (1738)
Q Consensus 1209 ~l~~iL~r~r-~~~~~~~~Dl~kaf~qv~l~~~dr~~~~f~w~~~~~~~~~~y~f~r~pFGl~~SP~~~~~~l~~~l~~~ 1287 (1738)
.+.+++.++. +++||+++|+++||+||+|+|+++++++|.|.++ .|+|++||||+++||++|+++|+.++...
T Consensus 74 ~~~~~l~~l~~~~~~~s~lDl~~af~~i~l~~~~~~~taf~~~~~------~y~~~~lp~Gl~~sp~~f~~~~~~~l~~~ 147 (210)
T cd03715 74 NPYTLLSLLPPKHQWYTVLDLANAFFSLPLAPDSQPLFAFEWEGQ------QYTFTRLPQGFKNSPTLFHEALARDLAPF 147 (210)
T ss_pred cHHHHHHHhccCCeEEEEeeccCeEEEEEcccccEEeEEEEECCe------eEEEEEEeccccCcHHHHHHHHHHHHHHH
Confidence 9999999886 8999999999999999999999999999999987 99999999999999999999999999887
Q ss_pred hhcCcccccccccceeccceeeecCCHHHHHHHHHHHHHHHHhCCceE
Q psy15380 1288 SKNFPLACQYIGESLYMDDCVISFPSQKEAIEFFNQTVKLFASGGFRF 1335 (1738)
Q Consensus 1288 ~~~~p~~~~~i~~~~YvDDili~~~s~ee~~~~~~~v~~~l~~~Gf~l 1335 (1738)
...+. .....+|+|||+++++|.+||.+.++.|+.+|+++||.+
T Consensus 148 ~~~~~----~~~~~~Y~DDili~s~~~~e~~~~l~~v~~~l~~~gl~l 191 (210)
T cd03715 148 PLEHE----GTILLQYVDDLLLAADSEEDCLKGTDALLTHLGELGYKV 191 (210)
T ss_pred HhhCC----CeEEEEECCcEEEecCCHHHHHHHHHHHHHHHHHCCCCc
Confidence 53211 123567999999999999999999999999999999998
No 4
>cd01645 RT_Rtv RT_Rtv: Reverse transcriptases (RTs) from retroviruses (Rtvs). RTs catalyze the conversion of single-stranded RNA into double-stranded viral DNA for integration into host chromosomes. Proteins in this subfamily contain long terminal repeats (LTRs) and are multifunctional enzymes with RNA-directed DNA polymerase, DNA-directed DNA polymerase, and ribonuclease hybrid (RNase H) activities. The viral RNA genome enters the cytoplasm as part of a nucleoprotein complex, and reverse transcription takes place in the cytoplasm, generating a linear DNA duplex via an intricate series of steps. This duplex DNA is colinear with its RNA template, but contains terminal duplications known as LTRs that are not present in viral RNA. It has been proposed that two specialized template switches, known as strand-transfer reactions or "jumps", are required to generate the LTRs.
Probab=99.93 E-value=3.2e-25 Score=251.21 Aligned_cols=181 Identities=20% Similarity=0.189 Sum_probs=155.8
Q ss_pred hhhHHHHHHHHHHHHhcCCeeecCCCCCCCCCceeeccccccCCCCCCCceEEeecCCccccCCCCcccccccCcc---c
Q psy15380 1130 INLRCDYNQAMTDLIDNGFMVKCASLDQGDLNGYFIPHHVVTKPDSPSSKMRIVYDASTKTSSGKSLNDILHAGSK---M 1206 (1738)
Q Consensus 1130 p~l~~~y~~~i~e~l~~G~ie~v~~~~~~~~~~~y~Ph~~V~k~~~~ttk~Riv~D~sa~~~~~~sLN~~l~~gp~---~ 1206 (1738)
++.++++.++|++++++|+|+++.++ |.+|.++|.|+++ ++|+|+|+|. ||+.+..... .
T Consensus 11 ~~~~~~~~~~i~~ll~~g~I~~~~s~-------~~sp~~~v~K~~g---~~R~~~D~r~-------lN~~~~~~~~~~~~ 73 (213)
T cd01645 11 EEKLEALTELVTEQLKEGHIEPSTSP-------WNTPVFVIKKKSG---KWRLLHDLRA-------VNAQTQDMGALQPG 73 (213)
T ss_pred HHHHHHHHHHHHHHHHCCceecCCCC-------CcCcEEEEEcCCC---CeEEEechHH-------HhhhcccccccCCC
Confidence 57788999999999999999999864 9999999999875 9999999965 6998876533 2
Q ss_pred cchhHHHhhhcccCcceeecccccceeeeEecccCcceEEEEeccCC-cccceEEEEEEeeeccccchHHHHHHHHHHHH
Q psy15380 1207 YNDLFCILLKFRLFPYALNGDITKMFLQIKLLSDYWKFQKILWRFSN-KEKIDVYELRVVIFGTKASPYLAQRTVKQLID 1285 (1738)
Q Consensus 1207 ~~~l~~iL~r~r~~~~~~~~Dl~kaf~qv~l~~~dr~~~~f~w~~~~-~~~~~~y~f~r~pFGl~~SP~~~~~~l~~~l~ 1285 (1738)
+|.+ ..+.+.++|+++||++||+||+|+|+|+.+++|.|...+ ..+.+.|+|++||||+++||++|++.|+.++.
T Consensus 74 ~p~~----~~l~~~~~~s~lDl~~af~~i~l~~~~~~~taf~~~~~~~~~~~~~~~~~~lP~Gl~~SP~~f~~~m~~~l~ 149 (213)
T cd01645 74 LPHP----AALPKGWPLIVLDLKDCFFSIPLHPDDRERFAFTVPSINNKGPAKRYQWKVLPQGMKNSPTICQSFVAQALE 149 (213)
T ss_pred CCCh----HHcCCCceEEEEEccCcEEEeeeccCCcceeEEEeccccCCCCCceEEEEEeCCCCcChHHHHHHHHHHHHH
Confidence 2322 235678999999999999999999999999999996432 23566999999999999999999999999999
Q ss_pred hhhhcCcccccccccceeccceeeecCCHHHHHHHHHHHHHHHHhCCceE
Q psy15380 1286 DESKNFPLACQYIGESLYMDDCVISFPSQKEAIEFFNQTVKLFASGGFRF 1335 (1738)
Q Consensus 1286 ~~~~~~p~~~~~i~~~~YvDDili~~~s~ee~~~~~~~v~~~l~~~Gf~l 1335 (1738)
...+.++. .....|||||++.+++.+||.+.++.+..+|+++||.+
T Consensus 150 ~~~~~~~~----~~~~~Y~DDili~s~~~~~~~~~l~~v~~~l~~~gl~l 195 (213)
T cd01645 150 PFRKQYPD----IVIYHYMDDILIASDLEGQLREIYEELRQTLLRWGLTI 195 (213)
T ss_pred HHHHHCCC----eEEEEEcCCEEEEcCCHHHHHHHHHHHHHHHHHCCCEe
Confidence 88765442 33568999999999999999999999999999999999
No 5
>PF05585 DUF1758: Putative peptidase (DUF1758); InterPro: IPR008737 This is a family of nematode proteins of unknown function. However, it seems likely that these proteins act as aspartic peptidases.
Probab=99.90 E-value=9.5e-24 Score=229.42 Aligned_cols=159 Identities=28% Similarity=0.371 Sum_probs=138.3
Q ss_pred EEEEecCCCceeeeEEeeCCCCccccCHhhHhhhCCCcceeeeEEeec-cCCCcceeceEEEEEecccCCcceEEEEEEe
Q psy15380 878 ILVLDVYGKPHEMRFLLDCASMSNILSLSACKQLGLKTLFVATDLKGV-GSISSPVQGQVTMRFGSRFDKRYNYTIKALV 956 (1738)
Q Consensus 878 V~v~~~~g~~~~v~aLLDSGS~vS~Is~~la~~LgL~~~~~~i~I~g~-g~~~~~~~~~v~v~i~~~~~~~~~~~i~a~V 956 (1738)
|.|.|++|+...++||||||||.|||++++|++|+|+..+..+.+.+. |...........+.|.+...++ .+.+++++
T Consensus 1 v~V~n~~g~~~~~~~LlDsGSq~SfIt~~la~~L~L~~~~~~~~~~~~~g~~~~~~~~~~~~~i~~~~~~~-~~~i~alv 79 (164)
T PF05585_consen 1 VNVFNPNGNQVEARALLDSGSQRSFITESLANKLNLPGTGEKILVIGTFGSSSPKSKKCVRVKISSRTSNN-SLEIEALV 79 (164)
T ss_pred CEEECCCCCEEEEEEEEecCCchhHHhHHHHHHhCCCCCCceEEEEeccCccCccceeEEEEEEEEecCCC-ceEEEEEe
Confidence 578999999999999999999999999999999999999866444444 4444444566666777666554 48999999
Q ss_pred eccccCCCCccccchhhhHHhhhhhcccCCcCCCCCCceeeeeccccccccccCCccccCCCCCCeeeccccceEEecCC
Q psy15380 957 VNHVVDKLPVKSVDLSKLEYLQNIRHKLADDEFMEPGNICGILGAQIYPYLVSGDPLLGKDCNQPAAINSTLGYVVLGNA 1036 (1738)
Q Consensus 957 vp~i~~~lP~~~i~~~~~~~l~~l~~~Ladp~~~~~~~idlLIG~D~~~~i~~~~~~~~l~~~~p~a~~T~lGwii~G~~ 1036 (1738)
+|.|++++|..+++.++|.|++++. ||||.|+.+.+||||||+|+++.++.++.++. ..+.|+|++|.||||++|+.
T Consensus 80 v~~I~~~l~~~~i~~~~~~~~~~l~--lad~~f~~~~~iDiLIG~D~~~~ll~~~~i~~-~~~~~~a~~T~~GWiisG~~ 156 (164)
T PF05585_consen 80 VPKITGNLPSAPIDDSDWKHLNNLP--LADPNFRESSPIDILIGADYFWQLLTGGQIKR-LPGGPTAQETKFGWIISGKA 156 (164)
T ss_pred cCcccccccccccCHHHHhhhcCCc--cccccccCCCCCeEEEccchHHHHhCCceEec-CCCCCEEEeCCeEeEEeCcc
Confidence 9999999999999999999999999 99999999999999999999999999988887 78889999999999999987
Q ss_pred CCCC
Q psy15380 1037 PVLD 1040 (1738)
Q Consensus 1037 ~~~~ 1040 (1738)
....
T Consensus 157 ~~~~ 160 (164)
T PF05585_consen 157 SEQK 160 (164)
T ss_pred CCcc
Confidence 6543
No 6
>PF03564 DUF1759: Protein of unknown function (DUF1759); InterPro: IPR005312 This is a small family of proteins of unknown function.
Probab=99.87 E-value=6.5e-22 Score=210.93 Aligned_cols=142 Identities=31% Similarity=0.519 Sum_probs=134.1
Q ss_pred CCCCCcCChHHHHHHHHHHhccCCCCChHHHHHHHHHhccccHHHhhcCCCCCCCcHHHHHHHHHhHhccchhhHHHHHH
Q psy15380 441 FDGACIGNWPLFIEMYRINIHNRTDLTNAHKLQYLLSKLSGGALAVAAGIPPTEHNYQVIYDALLEKYDDKRNLATYYMD 520 (1738)
Q Consensus 441 F~G~~~~~~~~f~~~F~~~v~~~~~l~d~~K~~~L~s~L~G~A~~~i~~~~~t~~nY~~a~~~L~~~fg~~~~i~~~~~~ 520 (1738)
|+|+ +.+|+.|++.|+++||++..++|.+|++||+++|+|+|+++|++++++++||+.||+.|+++||++..+++++++
T Consensus 1 F~G~-~~~~~~F~~~F~~~v~~n~~~~d~~K~~~L~~~L~G~A~~~i~~~~~~~~~Y~~a~~~L~~~yg~~~~i~~~~~~ 79 (145)
T PF03564_consen 1 FDGD-PSEWPEFIDQFDSLVHENPDLSDIEKLNYLRSCLKGEAKELIRGLPLSEENYEEAWELLEERYGNPRRIIQALLE 79 (145)
T ss_pred CCCC-HHHHHHHHHHHHHHHhcccCCCHHHHHHHHHHHhcchHHHHHHcccccchhhHHHHHHHHHHhCCchHHHHHHHH
Confidence 9999 999999999999999998999999999999999999999999999999999999999999999999999999999
Q ss_pred HHhccc-ccCCChhhhhhhcccchhHHHHHHHHhhhhcchhhhhHhhhhhhhhcccchhHHHHHHHHHHHHHHHHhCCCC
Q psy15380 521 SLLNFK-TQSGSLETFIDYFGANVAQVIYDALLEKYDDKRNLATYYMDSLLNFKIQSGSLETFIDYFGANVAALKALDLP 599 (1738)
Q Consensus 521 ~l~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~L~~l~~~ 599 (1738)
+|.++| +.+++.. .|++|++.++.++.+|+.+|.+
T Consensus 80 ~l~~l~~~~~~d~~--------------------------------------------~L~~~~~~v~~~i~~L~~lg~~ 115 (145)
T PF03564_consen 80 ELRNLPPISNDDPE--------------------------------------------ALRSLVDKVNNCIRALKALGVN 115 (145)
T ss_pred HHhccccccchhHH--------------------------------------------HHHHHHHHHHHHHHHHHHcCCC
Confidence 999999 7887776 8999999999999999999998
Q ss_pred CCchhhHhhhcccCCCHHHHHHHHHHhCC
Q psy15380 600 NLSEFFLFYLGNSKLDESTRKQFELSLGK 628 (1738)
Q Consensus 600 ~~~~~~l~~~i~~kLp~~~~~~~~~~~~~ 628 (1738)
.+ +..++.+|++|||..++.+|.+...+
T Consensus 116 ~~-~~~l~~~i~~KLp~~~~~~w~~~~~~ 143 (145)
T PF03564_consen 116 VD-DPLLISIILSKLPPEIREKWEEHVKK 143 (145)
T ss_pred CC-CHHHHHHHHHHCCHHHHHHHHHHhhc
Confidence 88 45555999999999999999987643
No 7
>cd01647 RT_LTR RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long terminal repeats (LTRs) in their DNA copies but not in their RNA template. RT catalyzes DNA replication from an RNA template, and is responsible for the replication of retroelements. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs are present in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses.
Probab=99.83 E-value=2.7e-20 Score=204.05 Aligned_cols=158 Identities=22% Similarity=0.292 Sum_probs=146.0
Q ss_pred CCeeecCCCCCCCCCceeeccccccCCCCCCCceEEeecCCccccCCCCcccccccCccccchhHHHhhhcccCcceeec
Q psy15380 1147 GFMVKCASLDQGDLNGYFIPHHVVTKPDSPSSKMRIVYDASTKTSSGKSLNDILHAGSKMYNDLFCILLKFRLFPYALNG 1226 (1738)
Q Consensus 1147 G~ie~v~~~~~~~~~~~y~Ph~~V~k~~~~ttk~Riv~D~sa~~~~~~sLN~~l~~gp~~~~~l~~iL~r~r~~~~~~~~ 1226 (1738)
|+|++++++ |+.|.++|.|+++ |.|+|+|++. ||+++...++++|.+.+++..++++.++++.
T Consensus 1 g~i~~~~~~-------~~~p~~~v~k~~~---k~R~~~D~r~-------ln~~~~~~~~~~p~i~~~~~~~~~~~~~~~~ 63 (177)
T cd01647 1 GIIEPSSSP-------YASPVVVVKKKDG---KLRLCVDYRK-------LNKVTIKDRYPLPTIDELLEELAGAKVFSKL 63 (177)
T ss_pred CeeEccCCC-------CCCceEEEECCCC---CEEEEEcCHH-------HhcccCCCCCCCCCHHHHHHHhhcCcEEEec
Confidence 789998775 5599999999876 9999999955 7999999999999999999999999999999
Q ss_pred ccccceeeeEecccCcceEEEEeccCCcccceEEEEEEeeeccccchHHHHHHHHHHHHhhhhcCcccccccccceeccc
Q psy15380 1227 DITKMFLQIKLLSDYWKFQKILWRFSNKEKIDVYELRVVIFGTKASPYLAQRTVKQLIDDESKNFPLACQYIGESLYMDD 1306 (1738)
Q Consensus 1227 Dl~kaf~qv~l~~~dr~~~~f~w~~~~~~~~~~y~f~r~pFGl~~SP~~~~~~l~~~l~~~~~~~p~~~~~i~~~~YvDD 1306 (1738)
|+.+||+||+++++++.+++|.|..+ .|+++++|||+++||..++.+|+.++....+. ....||||
T Consensus 64 D~~~~~~~i~l~~~~~~~~~~~~~~~------~~~~~~~p~G~~~s~~~~~~~~~~~l~~~~~~--------~~~~y~DD 129 (177)
T cd01647 64 DLRSGYHQIPLAEESRPKTAFRTPFG------LYEYTRMPFGLKNAPATFQRLMNKILGDLLGD--------FVEVYLDD 129 (177)
T ss_pred ccccCcceeeeccCChhhceeecCCC------ccEEEEecCCCccHHHHHHHHHHhhhcccccc--------ccEEEecC
Confidence 99999999999999999999999988 99999999999999999999999998876431 25579999
Q ss_pred eeeecCCHHHHHHHHHHHHHHHHhCCceE
Q psy15380 1307 CVISFPSQKEAIEFFNQTVKLFASGGFRF 1335 (1738)
Q Consensus 1307 ili~~~s~ee~~~~~~~v~~~l~~~Gf~l 1335 (1738)
+++.+++.+||.+.++.+...|+++||.+
T Consensus 130 i~i~~~~~~~~~~~~~~~~~~l~~~~~~~ 158 (177)
T cd01647 130 ILVYSKTEEEHLEHLREVLERLREAGLKL 158 (177)
T ss_pred ccccCCCHHHHHHHHHHHHHHHHHcCCEe
Confidence 99999999999999999999999999998
No 8
>cd03714 RT_DIRS1 RT_DIRS1: Reverse transcriptases (RTs) occurring in the DIRS1 group of retrotransposons. Members of the subfamily include the Dictyostelium DIRS-1, Volvox carteri kangaroo, and Panagrellus redivivus PAT elements. These elements differ from LTR and conventional non-LTR retrotransposons. They contain split direct repeat (SDR) termini, and have been proposed to integrate via double-stranded closed-circle DNA intermediates assisted by an encoded recombinase which is similar to gamma-site-specific integrase.
Probab=99.63 E-value=1.1e-15 Score=157.20 Aligned_cols=98 Identities=20% Similarity=0.336 Sum_probs=84.8
Q ss_pred cccccceeeeEecccCcceEEEEeccCCcccceEEEEEEeeeccccchHHHHHHHHHHHHhhhhcCcccccccccceecc
Q psy15380 1226 GDITKMFLQIKLLSDYWKFQKILWRFSNKEKIDVYELRVVIFGTKASPYLAQRTVKQLIDDESKNFPLACQYIGESLYMD 1305 (1738)
Q Consensus 1226 ~Dl~kaf~qv~l~~~dr~~~~f~w~~~~~~~~~~y~f~r~pFGl~~SP~~~~~~l~~~l~~~~~~~p~~~~~i~~~~YvD 1305 (1738)
+|+.+||+||+|+|+|+++++|.|..+ .|+|++||||+++||++|+++|+.++...... ...+..|+|
T Consensus 1 lD~~~ay~~i~l~~~~~~~~af~~~~~------~~~~~~mp~Gl~~sp~~f~~~~~~i~~~~~~~------~~~v~~Y~D 68 (119)
T cd03714 1 VDLKDAYFHIPILPRSRDLLGFAWQGE------TYQFKALPFGLSLAPRVFTKVVEALLAPLRLL------GVRIFSYLD 68 (119)
T ss_pred CchhhceEEEecCCCCcceeeEEECCC------cEEEEecCCcccchHHHHHHHHHHHHHHhhcC------CeEEEEEec
Confidence 599999999999999999999999988 99999999999999999999999999876421 233568999
Q ss_pred ceeeecCCHHHHHHHHHHHHH-HHHhCCceE
Q psy15380 1306 DCVISFPSQKEAIEFFNQTVK-LFASGGFRF 1335 (1738)
Q Consensus 1306 Dili~~~s~ee~~~~~~~v~~-~l~~~Gf~l 1335 (1738)
||++.++|.++..+.+..+.. +++++||.+
T Consensus 69 Dili~~~~~~~~~~~~~~l~~~~l~~~gl~l 99 (119)
T cd03714 69 DLLIIASSIKTSEAVLRHLRATLLANLGFTL 99 (119)
T ss_pred CeEEEeCcHHHHHHHHHHHHHHHHHHcCCcc
Confidence 999999986666666666555 699999998
No 9
>PF05380 Peptidase_A17: Pao retrotransposon peptidase; InterPro: IPR008042 This signature identifies members of the Pao retrotransposon family.
Probab=99.54 E-value=9.8e-16 Score=165.33 Aligned_cols=95 Identities=33% Similarity=0.539 Sum_probs=90.3
Q ss_pred CCccccccc------ccccchhHHHHHHHHHHHHHhhccCCCCCCCchhhhHHHHHhhhcccccccccCcccchhhhccc
Q psy15380 1393 LTKRGHTRV------NFGLLAPVILWAKLLIRELCILKLDWDSIPPPHLVQLWQTFQAQLPLLESLAFPRHIGVYLRRKN 1466 (1738)
Q Consensus 1393 ~TkR~vls~------plG~~~p~~l~~k~~ls~l~~~~~~Wd~~l~~~~~~~w~~~~~~~~~~~~~~~pr~~~~~~~~~~ 1466 (1738)
+|||+++|. |+|+++|+++++|+++|.||+.+++||++|++++.+.|..|.++++.++.+.+||++
T Consensus 1 pTKR~ils~ia~~yDPlGl~~p~~i~~K~llq~lw~~~l~WD~~lp~el~~~w~~~~~~l~~~~~i~iPR~i-------- 72 (159)
T PF05380_consen 1 PTKRQILSFIASIYDPLGLLAPIIIRAKLLLQKLWQSKLDWDDPLPDELRKEWKKWLKELESLSPIRIPRCI-------- 72 (159)
T ss_pred CChHHHHHHHHHHcCcchhhHHHHHHHHHHHHhhhccccchhhhhhHHHHHHHHHHHHHHhhcccccCCccc--------
Confidence 699999995 999999999999999999999999999999999999999999999999999999988
Q ss_pred cCCCCCCCCchhhhhccCCCCCCCCCchhhHHHHH
Q psy15380 1467 KMGKPQNVDKDDKILMNGFSKPRRNGRHFVILLRR 1501 (1738)
Q Consensus 1467 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1501 (1738)
+..+...+++|+|+|+|+..|++++|++.
T Consensus 73 ------~~~~~~~~~L~~F~DAS~~aygavvYlr~ 101 (159)
T PF05380_consen 73 ------PISDYRSVELHVFCDASESAYGAVVYLRS 101 (159)
T ss_pred ------ccccccceeeeEeecccccceeeEeEeee
Confidence 34567789999999999999999999988
No 10
>PF00078 RVT_1: Reverse transcriptase (RNA-dependent DNA polymerase); InterPro: IPR000477 The use of an RNA template to produce DNA, for integration into the host genome and exploitation of a host cell, is a strategy employed in the replication of retroid elements, such as the retroviruses and bacterial retrons. The enzyme catalysing polymerisation is an RNA-directed DNA-polymerase, or reverse transcriptase (RT) (2.7.7.49 from EC). Reverse transcriptase occurs in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. Retroviral reverse transcriptase is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The discovery of retroelements in the prokaryotes raises intriguing questions concerning their roles in bacteria and the origin and evolution of reverse transcriptases and whether the bacterial reverse transcriptases are older than eukaryotic reverse transcriptases.; GO: 0003723 RNA binding, 0003964 RNA-directed DNA polymerase activity, 0006278 RNA-dependent DNA replication; PDB: 1MU2_B 3RWE_C 3DU6_B 3DU5_A 3KYL_A 2WOM_B 1DTQ_A 2OPS_A 3FFI_A 1VRU_B ....
Probab=98.84 E-value=6e-09 Score=118.26 Aligned_cols=119 Identities=20% Similarity=0.210 Sum_probs=101.6
Q ss_pred HhhhcccCcceeecccccceeeeEecccCcceEEEEeccC-------------------CcccceEEEEEEeeeccccch
Q psy15380 1213 ILLKFRLFPYALNGDITKMFLQIKLLSDYWKFQKILWRFS-------------------NKEKIDVYELRVVIFGTKASP 1273 (1738)
Q Consensus 1213 iL~r~r~~~~~~~~Dl~kaf~qv~l~~~dr~~~~f~w~~~-------------------~~~~~~~y~f~r~pFGl~~SP 1273 (1738)
.+...++..+++.+|+++||.+|+.++-.+.+.++.+.+. +.. ...+....+|+|...||
T Consensus 56 ~~~~~~~~~~~~~~Di~~~f~sI~~~~l~~~l~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~-~~~~~~~glpqG~~~S~ 134 (214)
T PF00078_consen 56 KLNRFKGYLYFLKLDISKAFDSIPHHRLLRKLKRFGVPKKLIRLIQNLLSDRTAKVYLDGDL-SPYFQKRGLPQGSPLSP 134 (214)
T ss_dssp HHHC-CGSSEEEEEECCCCGGGSBBHTTTGGGGEEEEECCSCHHHHHHHHHHHH-EECGCSS-SEEEEESBS-TTSTCHH
T ss_pred cccccccccccceeccccccccceeeeccccccccccccccccccccccccccccccccccc-ccccccccccccccccc
Confidence 4678899999999999999999999999999999987743 112 56899999999999999
Q ss_pred HHHHHHHHHHHHhhhhcCcccccccccceeccceeeecCCHHHHHHHHHHHHHHHHhCCceE
Q psy15380 1274 YLAQRTVKQLIDDESKNFPLACQYIGESLYMDDCVISFPSQKEAIEFFNQTVKLFASGGFRF 1335 (1738)
Q Consensus 1274 ~~~~~~l~~~l~~~~~~~p~~~~~i~~~~YvDDili~~~s~ee~~~~~~~v~~~l~~~Gf~l 1335 (1738)
.+++..|..+........ ........|+||+++.+++.+++.+.++.+.+.+++.|+.+
T Consensus 135 ~l~~~~l~~l~~~~~~~~---~~~~~~~rY~DD~~i~~~~~~~~~~~~~~i~~~~~~~gl~l 193 (214)
T PF00078_consen 135 LLFNIYLDDLDRELQQEL---NPDISYLRYADDILIISKSKEELQKILEKISQWLEELGLKL 193 (214)
T ss_dssp HHHHHHHHHHHHHHHHHS----TTSEEEEETTEEEEEESSHHHHHHHHHHHHHHHHHTTSBC
T ss_pred hhhccccccccccccccc---cccccceEeccccEEEECCHHHHHHHHHHHHHHHHHCCCEE
Confidence 999999999988877643 12344678999999999999999999999999999999998
No 11
>cd00304 RT_like RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs.
Probab=98.60 E-value=5.6e-08 Score=96.35 Aligned_cols=79 Identities=23% Similarity=0.232 Sum_probs=69.1
Q ss_pred cccccceeeeEecccCcceEEEEeccCCcccceEEEEEEeeeccccchHHHHHHHHHHHHhhhhcCcccccccccceecc
Q psy15380 1226 GDITKMFLQIKLLSDYWKFQKILWRFSNKEKIDVYELRVVIFGTKASPYLAQRTVKQLIDDESKNFPLACQYIGESLYMD 1305 (1738)
Q Consensus 1226 ~Dl~kaf~qv~l~~~dr~~~~f~w~~~~~~~~~~y~f~r~pFGl~~SP~~~~~~l~~~l~~~~~~~p~~~~~i~~~~YvD 1305 (1738)
+|++++|+||+ +|||...||.+++.+|..+........ .......|+|
T Consensus 1 ~d~~~~~~~~~----------------------------lPqG~~~Sp~l~~~~~~~l~~~~~~~~----~~~~~~~Y~D 48 (98)
T cd00304 1 FDVKSFFTSIP----------------------------LPQGSPLSPALANLYMEKLEAPILKQL----LDITLIRYVD 48 (98)
T ss_pred CCHHHcCCCCc----------------------------cCCCCchHHHHHHHHHHHHHHHHHHhc----CCceEEEeeC
Confidence 59999999998 999999999999999999988765422 2345678999
Q ss_pred ceeeecCCHHHHHHHHHHHHHHHHhCCceEee
Q psy15380 1306 DCVISFPSQKEAIEFFNQTVKLFASGGFRFTK 1337 (1738)
Q Consensus 1306 Dili~~~s~ee~~~~~~~v~~~l~~~Gf~l~k 1337 (1738)
|+++.+.+. ++.+.+..+.+.++++|++++.
T Consensus 49 D~~i~~~~~-~~~~~~~~l~~~l~~~gl~ln~ 79 (98)
T cd00304 49 DLVVIAKSE-QQAVKKRELEEFLARLGLNLSD 79 (98)
T ss_pred cEEEEeCcH-HHHHHHHHHHHHHHHcCcEECh
Confidence 999999999 9999999999999999999943
No 12
>cd05484 retropepsin_like_LTR_2 Retropepsins_like_LTR, pepsin-like aspartate proteases. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classif
Probab=98.54 E-value=2.7e-07 Score=90.37 Aligned_cols=67 Identities=21% Similarity=0.251 Sum_probs=59.7
Q ss_pred eeeEEeeCCCCccccCHhhHhhhCCCcce-eeeEEeeccCCCcceeceEEEEEecccCCcceEEEEEEeec
Q psy15380 889 EMRFLLDCASMSNILSLSACKQLGLKTLF-VATDLKGVGSISSPVQGQVTMRFGSRFDKRYNYTIKALVVN 958 (1738)
Q Consensus 889 ~v~aLLDSGS~vS~Is~~la~~LgL~~~~-~~i~I~g~g~~~~~~~~~v~v~i~~~~~~~~~~~i~a~Vvp 958 (1738)
.+++|+||||++|+|+++.+.++|.+... ....++++||....+.|.+.+.+...+ ..+.++++|++
T Consensus 11 ~i~~lvDTGA~~svis~~~~~~lg~~~~~~~~~~v~~a~G~~~~~~G~~~~~v~~~~---~~~~~~~~v~~ 78 (91)
T cd05484 11 PLKFQLDTGSAITVISEKTWRKLGSPPLKPTKKRLRTATGTKLSVLGQILVTVKYGG---KTKVLTLYVVK 78 (91)
T ss_pred EEEEEEcCCcceEEeCHHHHHHhCCCccccccEEEEecCCCEeeEeEEEEEEEEECC---EEEEEEEEEEE
Confidence 78999999999999999999999998544 789999999999888888888888876 45789999988
No 13
>PF08284 RVP_2: Retroviral aspartyl protease; InterPro: IPR013242 This region defines single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases.
Probab=98.40 E-value=1.1e-06 Score=92.20 Aligned_cols=101 Identities=20% Similarity=0.185 Sum_probs=73.6
Q ss_pred cccceeEEEEEEEEEEecCCCceeeeEEeeCCCCccccCHhhHhhhCCCcce--eeeEEeeccCCCcceeceEEEEEecc
Q psy15380 866 TTQTTVLLGTVKILVLDVYGKPHEMRFLLDCASMSNILSLSACKQLGLKTLF--VATDLKGVGSISSPVQGQVTMRFGSR 943 (1738)
Q Consensus 866 ~~~~~v~l~t~~V~v~~~~g~~~~v~aLLDSGS~vS~Is~~la~~LgL~~~~--~~i~I~g~g~~~~~~~~~v~v~i~~~ 943 (1738)
+....++..+..| .+. .+.+|+||||.-|||++++|++++++..+ .++.+.+.|+..........+.+.+.
T Consensus 16 ~~~~~vi~g~~~I--~~~-----~~~vLiDSGAThsFIs~~~a~~~~l~~~~l~~~~~V~~~g~~~~~~~~~~~~~~~i~ 88 (135)
T PF08284_consen 16 EESPDVITGTFLI--NSI-----PASVLIDSGATHSFISSSFAKKLGLPLEPLPRPIVVSAPGGSINCEGVCPDVPLSIQ 88 (135)
T ss_pred cCCCCeEEEEEEe--ccE-----EEEEEEecCCCcEEccHHHHHhcCCEEEEccCeeEEecccccccccceeeeEEEEEC
Confidence 3445566665554 332 78999999999999999999999999876 67888877665433334456677666
Q ss_pred cCCcceEEEEEEeeccccCCCCccccchhhhHHhhhhhcccCCcCCCCCCceeeeecccccccccc
Q psy15380 944 FDKRYNYTIKALVVNHVVDKLPVKSVDLSKLEYLQNIRHKLADDEFMEPGNICGILGAQIYPYLVS 1009 (1738)
Q Consensus 944 ~~~~~~~~i~a~Vvp~i~~~lP~~~i~~~~~~~l~~l~~~Ladp~~~~~~~idlLIG~D~~~~i~~ 1009 (1738)
+ ..+..++++++ | ...|+|+|+|++..+.+
T Consensus 89 g---~~~~~dl~vl~-------------------------l--------~~~DvILGm~WL~~~~~ 118 (135)
T PF08284_consen 89 G---HEFVVDLLVLD-------------------------L--------GGYDVILGMDWLKKHNP 118 (135)
T ss_pred C---eEEEeeeEEec-------------------------c--------cceeeEeccchHHhCCC
Confidence 5 46678888776 2 23699999999887643
No 14
>PF00077 RVP: Retroviral aspartyl protease The Prosite entry also includes Pfam:PF00026; InterPro: IPR018061 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belong to the MEROPS peptidase family A2 (retropepsin family, clan AA), subfamily A2A. The family includes the single domain aspartic proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). Retroviral aspartyl protease is synthesised as part of the POL polyprotein that contains; an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins.; PDB: 3D3T_B 3SQF_A 1NSO_A 2HB3_A 2HS2_A 2HS1_B 3K4V_A 3GGV_C 1HTG_B 2FDE_A ....
Probab=98.31 E-value=8.8e-07 Score=88.28 Aligned_cols=74 Identities=22% Similarity=0.275 Sum_probs=60.8
Q ss_pred EEEEEEEecCCCceeeeEEeeCCCCccccCHhhHhhhCCCcceeeeEEeeccCCCcceeceEEEEEecccCCcceEEEEE
Q psy15380 875 TVKILVLDVYGKPHEMRFLLDCASMSNILSLSACKQLGLKTLFVATDLKGVGSISSPVQGQVTMRFGSRFDKRYNYTIKA 954 (1738)
Q Consensus 875 t~~V~v~~~~g~~~~v~aLLDSGS~vS~Is~~la~~LgL~~~~~~i~I~g~g~~~~~~~~~v~v~i~~~~~~~~~~~i~a 954 (1738)
.+.|.+.+. .+.|||||||++|+|+++.+..++.+ ......+.|+|+.. ...+...+.+...+ ..+...+
T Consensus 7 ~i~v~i~g~-----~i~~LlDTGA~vsiI~~~~~~~~~~~-~~~~~~v~~~~g~~-~~~~~~~~~v~~~~---~~~~~~~ 76 (100)
T PF00077_consen 7 YITVKINGK-----KIKALLDTGADVSIISEKDWKKLGPP-PKTSITVRGAGGSS-SILGSTTVEVKIGG---KEFNHTF 76 (100)
T ss_dssp EEEEEETTE-----EEEEEEETTBSSEEESSGGSSSTSSE-EEEEEEEEETTEEE-EEEEEEEEEEEETT---EEEEEEE
T ss_pred eEEEeECCE-----EEEEEEecCCCcceeccccccccccc-ccCCceeccCCCcc-eeeeEEEEEEEEEC---ccceEEE
Confidence 345555544 88999999999999999999998877 56888999999988 77788888888877 4556778
Q ss_pred Eeec
Q psy15380 955 LVVN 958 (1738)
Q Consensus 955 ~Vvp 958 (1738)
+|+|
T Consensus 77 ~v~~ 80 (100)
T PF00077_consen 77 LVVP 80 (100)
T ss_dssp EESS
T ss_pred EecC
Confidence 8777
No 15
>PF12384 Peptidase_A2B: Ty3 transposon peptidase; InterPro: IPR024650 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Ty3 is a gypsy-type, retrovirus-like, element found in the budding yeast. The Ty3 aspartyl protease is required for processing of the viral polyprotein into its mature species [].
Probab=98.27 E-value=3.4e-06 Score=87.50 Aligned_cols=84 Identities=20% Similarity=0.188 Sum_probs=67.7
Q ss_pred eeeEEeeCCCCccccCHhhHhhhCCCcce-eeeEEeeccCCCcc-eeceEEEEEecccCCcceEEEEEEeeccccCCCCc
Q psy15380 889 EMRFLLDCASMSNILSLSACKQLGLKTLF-VATDLKGVGSISSP-VQGQVTMRFGSRFDKRYNYTIKALVVNHVVDKLPV 966 (1738)
Q Consensus 889 ~v~aLLDSGS~vS~Is~~la~~LgL~~~~-~~i~I~g~g~~~~~-~~~~v~v~i~~~~~~~~~~~i~a~Vvp~i~~~lP~ 966 (1738)
.+.+||||||-.|||+.+++++|+|+... .++.++|+.+.... ++..+.+.|...+ ..+.+.|||++.+..++..
T Consensus 45 ~i~vLfDSGSPTSfIr~di~~kL~L~~~~app~~fRG~vs~~~~~tsEAv~ld~~i~n---~~i~i~aYV~d~m~~dlII 121 (177)
T PF12384_consen 45 PIKVLFDSGSPTSFIRSDIVEKLELPTHDAPPFRFRGFVSGESATTSEAVTLDFYIDN---KLIDIAAYVTDNMDHDLII 121 (177)
T ss_pred EEEEEEeCCCccceeehhhHHhhCCccccCCCEEEeeeccCCceEEEEeEEEEEEECC---eEEEEEEEEeccCCcceEe
Confidence 78999999999999999999999999988 78999999866544 5688999999887 6789999999977655544
Q ss_pred cccchhhhH
Q psy15380 967 KSVDLSKLE 975 (1738)
Q Consensus 967 ~~i~~~~~~ 975 (1738)
.....+.|+
T Consensus 122 GnPiL~ryp 130 (177)
T PF12384_consen 122 GNPILDRYP 130 (177)
T ss_pred ccHHHhhhH
Confidence 332333333
No 16
>cd05479 RP_DDI RP_DDI; retropepsin-like domain of DNA damage inducible protein. The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI.
Probab=98.19 E-value=5.9e-06 Score=85.74 Aligned_cols=81 Identities=16% Similarity=0.240 Sum_probs=57.3
Q ss_pred eeeEEeeCCCCccccCHhhHhhhCCCcce-ee--eEEeeccCCCcceeceE-EEEEecccCCcceEEEEEEeeccccCCC
Q psy15380 889 EMRFLLDCASMSNILSLSACKQLGLKTLF-VA--TDLKGVGSISSPVQGQV-TMRFGSRFDKRYNYTIKALVVNHVVDKL 964 (1738)
Q Consensus 889 ~v~aLLDSGS~vS~Is~~la~~LgL~~~~-~~--i~I~g~g~~~~~~~~~v-~v~i~~~~~~~~~~~i~a~Vvp~i~~~l 964 (1738)
++++|+||||+.|||+.++|++||++... .+ ..+.|.|+ ....+.+ .+.+.+.+ ..+.+++.|+|.
T Consensus 27 ~~~~LvDTGAs~s~Is~~~a~~lgl~~~~~~~~~~~~~g~g~--~~~~g~~~~~~l~i~~---~~~~~~~~Vl~~----- 96 (124)
T cd05479 27 PVKAFVDSGAQMTIMSKACAEKCGLMRLIDKRFQGIAKGVGT--QKILGRIHLAQVKIGN---LFLPCSFTVLED----- 96 (124)
T ss_pred EEEEEEeCCCceEEeCHHHHHHcCCccccCcceEEEEecCCC--cEEEeEEEEEEEEECC---EEeeeEEEEECC-----
Confidence 68999999999999999999999998643 22 34555554 3333433 44566655 455677777751
Q ss_pred CccccchhhhHHhhhhhcccCCcCCCCCCceeeeecccccccc
Q psy15380 965 PVKSVDLSKLEYLQNIRHKLADDEFMEPGNICGILGAQIYPYL 1007 (1738)
Q Consensus 965 P~~~i~~~~~~~l~~l~~~Ladp~~~~~~~idlLIG~D~~~~i 1007 (1738)
...|+|||+|++..+
T Consensus 97 ----------------------------~~~d~ILG~d~L~~~ 111 (124)
T cd05479 97 ----------------------------DDVDFLIGLDMLKRH 111 (124)
T ss_pred ----------------------------CCcCEEecHHHHHhC
Confidence 146899999998865
No 17
>PF13650 Asp_protease_2: Aspartyl protease
Probab=98.12 E-value=1.2e-05 Score=78.12 Aligned_cols=67 Identities=25% Similarity=0.246 Sum_probs=47.8
Q ss_pred eeeEEeeCCCCccccCHhhHhhhCCCccee--eeEEeeccCCCcceeceEEEEEecccCCcceEEEEEEeec
Q psy15380 889 EMRFLLDCASMSNILSLSACKQLGLKTLFV--ATDLKGVGSISSPVQGQVTMRFGSRFDKRYNYTIKALVVN 958 (1738)
Q Consensus 889 ~v~aLLDSGS~vS~Is~~la~~LgL~~~~~--~i~I~g~g~~~~~~~~~v~v~i~~~~~~~~~~~i~a~Vvp 958 (1738)
++++||||||+.|+|+++++++|+++.... ...+.++|+. ........-++.+++ .....+++.+++
T Consensus 9 ~~~~liDTGa~~~~i~~~~~~~l~~~~~~~~~~~~~~~~~g~-~~~~~~~~~~i~ig~--~~~~~~~~~v~~ 77 (90)
T PF13650_consen 9 PVRFLIDTGASISVISRSLAKKLGLKPRPKSVPISVSGAGGS-VTVYRGRVDSITIGG--ITLKNVPFLVVD 77 (90)
T ss_pred EEEEEEcCCCCcEEECHHHHHHcCCCCcCCceeEEEEeCCCC-EEEEEEEEEEEEECC--EEEEeEEEEEEC
Confidence 789999999999999999999999988774 6888999987 333322222555544 112245565555
No 18
>PF09668 Asp_protease: Aspartyl protease; InterPro: IPR019103 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This family of eukaryotic aspartyl proteases have a fold similar to retroviral proteases which implies they function proteolytically during regulated protein turnover []. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 3S8I_A 2I1A_B.
Probab=98.02 E-value=1e-05 Score=82.65 Aligned_cols=90 Identities=20% Similarity=0.288 Sum_probs=57.9
Q ss_pred EEEEEEEecCCCceeeeEEeeCCCCccccCHhhHhhhCCCcce-e--eeEEeeccCCCcceeceEE-EEEecccCCcceE
Q psy15380 875 TVKILVLDVYGKPHEMRFLLDCASMSNILSLSACKQLGLKTLF-V--ATDLKGVGSISSPVQGQVT-MRFGSRFDKRYNY 950 (1738)
Q Consensus 875 t~~V~v~~~~g~~~~v~aLLDSGS~vS~Is~~la~~LgL~~~~-~--~i~I~g~g~~~~~~~~~v~-v~i~~~~~~~~~~ 950 (1738)
.+.+.+.|. .+.|++|||||.|+||.++|+++||...- + .....|.|. .+..|.+. +.+.+++ ..+
T Consensus 26 yI~~~ing~-----~vkA~VDtGAQ~tims~~~a~r~gL~~lid~r~~g~a~GvG~--~~i~G~Ih~~~l~ig~---~~~ 95 (124)
T PF09668_consen 26 YINCKINGV-----PVKAFVDTGAQSTIMSKSCAERCGLMRLIDKRFAGVAKGVGT--QKILGRIHSVQLKIGG---LFF 95 (124)
T ss_dssp EEEEEETTE-----EEEEEEETT-SS-EEEHHHHHHTTGGGGEEGGG-EE---------EEEEEEEEEEEEETT---EEE
T ss_pred EEEEEECCE-----EEEEEEeCCCCccccCHHHHHHcCChhhccccccccccCCCc--CceeEEEEEEEEEECC---EEE
Confidence 355555544 78999999999999999999999997543 2 233444532 23445554 7777766 567
Q ss_pred EEEEEeeccccCCCCccccchhhhHHhhhhhcccCCcCCCCCCceeeeecccccccc
Q psy15380 951 TIKALVVNHVVDKLPVKSVDLSKLEYLQNIRHKLADDEFMEPGNICGILGAQIYPYL 1007 (1738)
Q Consensus 951 ~i~a~Vvp~i~~~lP~~~i~~~~~~~l~~l~~~Ladp~~~~~~~idlLIG~D~~~~i 1007 (1738)
...+.|++. .++|+|||.|++..+
T Consensus 96 ~~s~~Vle~---------------------------------~~~d~llGld~L~~~ 119 (124)
T PF09668_consen 96 PCSFTVLED---------------------------------QDVDLLLGLDMLKRH 119 (124)
T ss_dssp EEEEEEETT---------------------------------SSSSEEEEHHHHHHT
T ss_pred EEEEEEeCC---------------------------------CCcceeeeHHHHHHh
Confidence 777877761 346899999987654
No 19
>cd05481 retropepsin_like_LTR_1 Retropepsins_like_LTR; pepsin-like aspartate protease from retrotransposons with long terminal repeats. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N and C-terminals, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identifi
Probab=97.96 E-value=2.3e-05 Score=76.90 Aligned_cols=69 Identities=26% Similarity=0.347 Sum_probs=60.5
Q ss_pred eeeeEEeeCCCCccccCHhhHhhhC---CCc-ceeeeEEeeccCCCcceeceEEEEEecccCCcceEEEEEEeecc
Q psy15380 888 HEMRFLLDCASMSNILSLSACKQLG---LKT-LFVATDLKGVGSISSPVQGQVTMRFGSRFDKRYNYTIKALVVNH 959 (1738)
Q Consensus 888 ~~v~aLLDSGS~vS~Is~~la~~Lg---L~~-~~~~i~I~g~g~~~~~~~~~v~v~i~~~~~~~~~~~i~a~Vvp~ 959 (1738)
..++++|||||++|+|+.+.+++|+ .+. .++++.++++|++...+.|.+.+.+..++ ..+.+.|+|++.
T Consensus 9 ~~v~~~vDtGA~vnllp~~~~~~l~~~~~~~L~~t~~~L~~~~g~~~~~~G~~~~~v~~~~---~~~~~~f~Vvd~ 81 (93)
T cd05481 9 QSVKFQLDTGATCNVLPLRWLKSLTPDKDPELRPSPVRLTAYGGSTIPVEGGVKLKCRYRN---PKYNLTFQVVKE 81 (93)
T ss_pred eeEEEEEecCCEEEeccHHHHhhhccCCCCcCccCCeEEEeeCCCEeeeeEEEEEEEEECC---cEEEEEEEEECC
Confidence 5889999999999999999999998 444 44899999999999999998888888777 457999999984
No 20
>cd01648 TERT TERT: Telomerase reverse transcriptase (TERT). Telomerase is a ribonucleoprotein (RNP) that synthesizes telomeric DNA repeats. The telomerase RNA subunit provides the template for synthesis of these repeats. The catalytic subunit of RNP is known as telomerase reverse transcriptase (TERT). The reverse transcriptase (RT) domain is located in the C-terminal region of the TERT polypeptide. Single amino acid substitutions in this region lead to telomere shortening and senescence. Telomerase is an enzyme that, in certain cells, maintains the physical ends of chromosomes (telomeres) during replication. In somatic cells, replication of the lagging strand requires the continual presence of an RNA primer approximately 200 nucleotides upstream, which is complementary to the template strand. Since there is a region of DNA less than 200 base pairs from the end of the chromosome where this is not possible, the chromosome is continually shortened. However, a surplus of repetitive DNA at
Probab=97.80 E-value=3.7e-05 Score=79.25 Aligned_cols=90 Identities=18% Similarity=0.089 Sum_probs=69.5
Q ss_pred cccccceeeeEecccCcceEEEEeccCCcccceEEEEEEeeeccccchHHHHHHHHHHHHhhhhcCcccccccccceecc
Q psy15380 1226 GDITKMFLQIKLLSDYWKFQKILWRFSNKEKIDVYELRVVIFGTKASPYLAQRTVKQLIDDESKNFPLACQYIGESLYMD 1305 (1738)
Q Consensus 1226 ~Dl~kaf~qv~l~~~dr~~~~f~w~~~~~~~~~~y~f~r~pFGl~~SP~~~~~~l~~~l~~~~~~~p~~~~~i~~~~YvD 1305 (1738)
.|+++||-.|+. .|.+..| +|.|...||.++.-.|+.+...................|+|
T Consensus 1 ~d~~~~~~~~~~--------~~~~~~G------------lpQG~~lSp~L~nl~l~~l~~~~~~~~~~~~~~~~~~rYaD 60 (119)
T cd01648 1 TDIKKCYDSIPQ--------YYRQKVG------------IPQGSPLSSLLCSLYYADLENKYLSFLDVIDKDSLLLRLVD 60 (119)
T ss_pred CChHHhccchhh--------hhhhcCc------------ccCCcchHHHHHHHHHHHHHHHHHhhcccCCCCceEEEEeC
Confidence 488999988888 3333333 99999999999999988887765442100111223456999
Q ss_pred ceeeecCCHHHHHHHHHHHHHHH-HhCCceE
Q psy15380 1306 DCVISFPSQKEAIEFFNQTVKLF-ASGGFRF 1335 (1738)
Q Consensus 1306 Dili~~~s~ee~~~~~~~v~~~l-~~~Gf~l 1335 (1738)
|+++.+++.+++.+.++.+...+ ++.||.+
T Consensus 61 D~li~~~~~~~~~~~~~~l~~~l~~~~gl~i 91 (119)
T cd01648 61 DFLLITTSLDKAIKFLNLLLRGFINQYKTFV 91 (119)
T ss_pred cEEEEeCCHHHHHHHHHHHHHhhHHhhCeEE
Confidence 99999999999999999999998 9999998
No 21
>cd05483 retropepsin_like_bacteria Bacterial aspartate proteases, retropepsin-like protease family. This family of bacterial aspartate proteases is a subfamily of the retropepsin-like protease family, which includes enzymes from retroviruses and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, these bacterial aspartate proteases are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=97.72 E-value=0.00013 Score=71.62 Aligned_cols=64 Identities=16% Similarity=0.188 Sum_probs=45.0
Q ss_pred EEEEEEEecCCCceeeeEEeeCCCCccccCHhhHhhhCCCcce-eeeEEeeccCCCcceeceEEEEEeccc
Q psy15380 875 TVKILVLDVYGKPHEMRFLLDCASMSNILSLSACKQLGLKTLF-VATDLKGVGSISSPVQGQVTMRFGSRF 944 (1738)
Q Consensus 875 t~~V~v~~~~g~~~~v~aLLDSGS~vS~Is~~la~~LgL~~~~-~~i~I~g~g~~~~~~~~~v~v~i~~~~ 944 (1738)
++++.+.+. ++++||||||+.|+|+.+++++|++.... ....+.+++|........ .-++++++
T Consensus 4 ~v~v~i~~~-----~~~~llDTGa~~s~i~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~-~~~i~ig~ 68 (96)
T cd05483 4 VVPVTINGQ-----PVRFLLDTGASTTVISEELAERLGLPLTLGGKVTVQTANGRVRAARVR-LDSLQIGG 68 (96)
T ss_pred EEEEEECCE-----EEEEEEECCCCcEEcCHHHHHHcCCCccCCCcEEEEecCCCccceEEE-cceEEECC
Confidence 355656533 78999999999999999999999983333 677788888866554322 33444443
No 22
>cd06095 RP_RTVL_H_like Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements. This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where
Probab=97.66 E-value=0.00016 Score=70.08 Aligned_cols=39 Identities=18% Similarity=0.238 Sum_probs=34.2
Q ss_pred eeeEEeeCCCCccccCHhhHhhhCCCcceeeeEEeeccCCC
Q psy15380 889 EMRFLLDCASMSNILSLSACKQLGLKTLFVATDLKGVGSIS 929 (1738)
Q Consensus 889 ~v~aLLDSGS~vS~Is~~la~~LgL~~~~~~i~I~g~g~~~ 929 (1738)
.+.+|+||||+.|+|+++.++++. ....+..+.|+||..
T Consensus 9 ~~~fLvDTGA~~tii~~~~a~~~~--~~~~~~~v~gagG~~ 47 (86)
T cd06095 9 PIVFLVDTGATHSVLKSDLGPKQE--LSTTSVLIRGVSGQS 47 (86)
T ss_pred EEEEEEECCCCeEEECHHHhhhcc--CCCCcEEEEeCCCcc
Confidence 789999999999999999999982 234889999999986
No 23
>PF13975 gag-asp_proteas: gag-polyprotein putative aspartyl protease
Probab=97.64 E-value=0.0001 Score=68.72 Aligned_cols=58 Identities=22% Similarity=0.224 Sum_probs=46.6
Q ss_pred EEEEEEEEecCCCceeeeEEeeCCCCccccCHhhHhhhCCCccee--eeEEeeccCCCcceeceE
Q psy15380 874 GTVKILVLDVYGKPHEMRFLLDCASMSNILSLSACKQLGLKTLFV--ATDLKGVGSISSPVQGQV 936 (1738)
Q Consensus 874 ~t~~V~v~~~~g~~~~v~aLLDSGS~vS~Is~~la~~LgL~~~~~--~i~I~g~g~~~~~~~~~v 936 (1738)
.++++.+.+. .+.+|+||||+.|||++++|++||++.... ...++.++|......+.+
T Consensus 9 ~~v~~~I~g~-----~~~alvDtGat~~fis~~~a~rLgl~~~~~~~~~~v~~a~g~~~~~~g~~ 68 (72)
T PF13975_consen 9 MYVPVSIGGV-----QVKALVDTGATHNFISESLAKRLGLPLEKPPSPIRVKLANGSVIEIRGVA 68 (72)
T ss_pred EEEEEEECCE-----EEEEEEeCCCcceecCHHHHHHhCCCcccCCCCEEEEECCCCccccceEE
Confidence 3455555543 788999999999999999999999999884 599999999876665443
No 24
>TIGR02281 clan_AA_DTGA clan AA aspartic protease, TIGR02281 family. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix. Sequences in the seed alignment and those scoring above the trusted cutoff are Proteobacterial; homologs scoring between the trusted and noise cutoffs are found in Pyrobaculum aerophilum str. IM2 (archaeal), Pirellula sp. (Planctomycetes), and Nostoc sp. PCC 7120 (Cyanobacteria).
Probab=97.53 E-value=0.0003 Score=72.64 Aligned_cols=50 Identities=20% Similarity=0.169 Sum_probs=41.0
Q ss_pred EEEEEEEecCCCceeeeEEeeCCCCccccCHhhHhhhCCCcce--eeeEEeeccCCC
Q psy15380 875 TVKILVLDVYGKPHEMRFLLDCASMSNILSLSACKQLGLKTLF--VATDLKGVGSIS 929 (1738)
Q Consensus 875 t~~V~v~~~~g~~~~v~aLLDSGS~vS~Is~~la~~LgL~~~~--~~i~I~g~g~~~ 929 (1738)
.+++.+.+. ++.+|+||||+.++|++++|++||++... ....+.++||..
T Consensus 13 ~v~~~InG~-----~~~flVDTGAs~t~is~~~A~~Lgl~~~~~~~~~~~~ta~G~~ 64 (121)
T TIGR02281 13 YATGRVNGR-----NVRFLVDTGATSVALNEEDAQRLGLDLNRLGYTVTVSTANGQI 64 (121)
T ss_pred EEEEEECCE-----EEEEEEECCCCcEEcCHHHHHHcCCCcccCCceEEEEeCCCcE
Confidence 466666543 78999999999999999999999998754 467888998854
No 25
>cd05480 NRIP_C NRIP_C; putative nuclear receptor interacting protein. Proteins in this family have been described as probable nuclear receptor interacting proteins. The C-terminal domain of this family is homologous to the retroviral aspartyl protease domain. The domain is structurally related to one lobe of the pepsin molecule. The conserved active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=97.46 E-value=0.00031 Score=67.70 Aligned_cols=83 Identities=24% Similarity=0.263 Sum_probs=53.1
Q ss_pred eeeeEEeeCCCCccccCHhhHhhhCCCcceeeeEE----eeccCCCcceeceE-EEEEecccCCcceEEEEEEeeccccC
Q psy15380 888 HEMRFLLDCASMSNILSLSACKQLGLKTLFVATDL----KGVGSISSPVQGQV-TMRFGSRFDKRYNYTIKALVVNHVVD 962 (1738)
Q Consensus 888 ~~v~aLLDSGS~vS~Is~~la~~LgL~~~~~~i~I----~g~g~~~~~~~~~v-~v~i~~~~~~~~~~~i~a~Vvp~i~~ 962 (1738)
+.++|++|||||.|+||..+|+++||...-..... .|+|.+ .+..|.+ .++++++. ..++..+.|++
T Consensus 8 ~~vkAfVDsGaQ~timS~~caercgL~r~v~~~r~~g~A~gvgt~-~kiiGrih~~~ikig~---~~~~CSftVld---- 79 (103)
T cd05480 8 KELRALVDTGCQYNLISAACLDRLGLKERVLKAKAEEEAPSLPTS-VKVIGQIERLVLQLGQ---LTVECSAQVVD---- 79 (103)
T ss_pred EEEEEEEecCCchhhcCHHHHHHcChHhhhhhccccccccCCCcc-eeEeeEEEEEEEEeCC---EEeeEEEEEEc----
Confidence 47899999999999999999999999754222122 233321 1222322 34555554 34455555554
Q ss_pred CCCccccchhhhHHhhhhhcccCCcCCCCCCceeeeecccccccc
Q psy15380 963 KLPVKSVDLSKLEYLQNIRHKLADDEFMEPGNICGILGAQIYPYL 1007 (1738)
Q Consensus 963 ~lP~~~i~~~~~~~l~~l~~~Ladp~~~~~~~idlLIG~D~~~~i 1007 (1738)
..++|+|+|.|.+..+
T Consensus 80 -----------------------------~~~~d~llGLdmLkrh 95 (103)
T cd05480 80 -----------------------------DNEKNFSLGLQTLKSL 95 (103)
T ss_pred -----------------------------CCCcceEeeHHHHhhc
Confidence 2467999999987754
No 26
>cd06094 RP_Saci_like RP_Saci_like, retropepsin family. Retropepsin on retrotransposons with long terminal repeats (LTR) including Saci-1, -2 and -3 of Schistosoma mansoni. Retropepsins are related to fungal and mammalian pepsins. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified
Probab=97.10 E-value=0.00098 Score=63.54 Aligned_cols=53 Identities=17% Similarity=0.255 Sum_probs=44.7
Q ss_pred eeeEEeeCCCCccccCHhhHhhhCCCcceeeeEEeeccCCCcceeceEEEEEeccc
Q psy15380 889 EMRFLLDCASMSNILSLSACKQLGLKTLFVATDLKGVGSISSPVQGQVTMRFGSRF 944 (1738)
Q Consensus 889 ~v~aLLDSGS~vS~Is~~la~~LgL~~~~~~i~I~g~g~~~~~~~~~v~v~i~~~~ 944 (1738)
.+++|+||||++|+|.....++. ....++.+.++||+.+...|...+.+..+.
T Consensus 9 ~~~fLVDTGA~vSviP~~~~~~~---~~~~~~~l~AANgt~I~tyG~~~l~ldlGl 61 (89)
T cd06094 9 GLRFLVDTGAAVSVLPASSTKKS---LKPSPLTLQAANGTPIATYGTRSLTLDLGL 61 (89)
T ss_pred CcEEEEeCCCceEeecccccccc---ccCCceEEEeCCCCeEeeeeeEEEEEEcCC
Confidence 35899999999999998887753 344678999999999999999898888865
No 27
>cd01650 RT_nLTR_like RT_nLTR: Non-LTR (long terminal repeat) retrotransposon and non-LTR retrovirus reverse transcriptase (RT). This subfamily contains both non-LTR retrotransposons and non-LTR retrovirus RTs. RTs catalyze the conversion of single-stranded RNA into double-stranded DNA for integration into host chromosomes. RT is a multifunctional enzyme with RNA-directed DNA polymerase, DNA directed DNA polymerase and ribonuclease hybrid (RNase H) activities.
Probab=97.00 E-value=0.00071 Score=77.45 Aligned_cols=99 Identities=18% Similarity=0.117 Sum_probs=77.4
Q ss_pred ccCcceeecccccceeeeEecccCcceEEEEeccCCcccceEEEEEEeeeccccchHHHHHHHHHHHHhhhhcC--cccc
Q psy15380 1218 RLFPYALNGDITKMFLQIKLLSDYWKFQKILWRFSNKEKIDVYELRVVIFGTKASPYLAQRTVKQLIDDESKNF--PLAC 1295 (1738)
Q Consensus 1218 r~~~~~~~~Dl~kaf~qv~l~~~dr~~~~f~w~~~~~~~~~~y~f~r~pFGl~~SP~~~~~~l~~~l~~~~~~~--p~~~ 1295 (1738)
....+++.+|+++||-.|..+.=.+.. .+|.|...||.+++-+|..+.+...... +...
T Consensus 79 ~~~~~~l~~Di~~aFdsi~~~~l~~~l-------------------GipQG~~lSp~l~~l~~~~l~~~~~~~~~~~~~~ 139 (220)
T cd01650 79 KKSLVLVFLDFEKAFDSVDHEFLLKAL-------------------GVRQGDPLSPLLFNLALDDLLRLLNKEEEIKLGG 139 (220)
T ss_pred CCceEEEEEEHHhhcCcCCHHHHHHHh-------------------CCccCCcccHHHHHHHHHHHHHHHHhhccccCCC
Confidence 567889999999999887643222211 7999999999999999999887765210 0011
Q ss_pred cccccceeccceeeecCCHH-HHHHHHHHHHHHHHhCCceE
Q psy15380 1296 QYIGESLYMDDCVISFPSQK-EAIEFFNQTVKLFASGGFRF 1335 (1738)
Q Consensus 1296 ~~i~~~~YvDDili~~~s~e-e~~~~~~~v~~~l~~~Gf~l 1335 (1738)
..+....|+||+++.+.+.+ ++.+.++.+...+.+.|+.+
T Consensus 140 ~~~~~~~yaDD~~i~~~~~~~~~~~~~~~~~~~~~~~gl~i 180 (220)
T cd01650 140 PGITHLAYADDIVLFSEGKSRKLQELLQRLQEWSKESGLKI 180 (220)
T ss_pred CccceEEeccceeeeccCCHHHHHHHHHHHHHHHHHcCCEE
Confidence 23446689999999999999 99999999999999999988
No 28
>cd03487 RT_Bac_retron_II RT_Bac_retron_II: Reverse transcriptases (RTs) in bacterial retrotransposons or retrons. The polymerase reaction of this enzyme leads to the production of a unique RNA-DNA complex called msDNA (multicopy single-stranded (ss)DNA) in which a small ssDNA branches out from a small ssRNA molecule via a 2'-5'phosphodiester linkage. Bacterial retron RTs produce cDNA corresponding to only a small portion of the retron genome.
Probab=96.90 E-value=0.001 Score=76.06 Aligned_cols=116 Identities=19% Similarity=0.202 Sum_probs=77.8
Q ss_pred cccCcceeecccccceeeeEecccCcceEEEEeccCC----cccceEEEEEEeeeccccchHHHHHHHHHHHHhhhhcCc
Q psy15380 1217 FRLFPYALNGDITKMFLQIKLLSDYWKFQKILWRFSN----KEKIDVYELRVVIFGTKASPYLAQRTVKQLIDDESKNFP 1292 (1738)
Q Consensus 1217 ~r~~~~~~~~Dl~kaf~qv~l~~~dr~~~~f~w~~~~----~~~~~~y~f~r~pFGl~~SP~~~~~~l~~~l~~~~~~~p 1292 (1738)
..+..|++.+||+++|-.|..+.=-+.+......... -..+..+. ..+|.|...||.+++-+|..+-...... .
T Consensus 53 ~~~~~~v~~~Di~~fFdsI~~~~L~~~l~~~~~~~~~~~~~l~~~~~~~-~GlpQG~~lSp~Lanl~l~~~d~~l~~~-~ 130 (214)
T cd03487 53 HCGAKYVLKLDIKDFFPSITFERVRGVFRSLGYFSPDVATILAKLCTYN-GHLPQGAPTSPALSNLVFRKLDERLSKL-A 130 (214)
T ss_pred hcCCCEEEEeehhhhcccCCHHHHHHHHHHcCCCCHHHHHHHHHHHhCC-CCcCCCCcccHHHHHHHHHHHHHHHHHH-H
Confidence 4567899999999999998775421111111110000 00011111 1799999999999999988765443321 0
Q ss_pred ccccccccceeccceeeecCCHH--HHHHHHHHHHHHHHhCCceE
Q psy15380 1293 LACQYIGESLYMDDCVISFPSQK--EAIEFFNQTVKLFASGGFRF 1335 (1738)
Q Consensus 1293 ~~~~~i~~~~YvDDili~~~s~e--e~~~~~~~v~~~l~~~Gf~l 1335 (1738)
. ...+...-||||+++.+++.+ ++.+.+..+...|++.||.+
T Consensus 131 ~-~~~~~~~RYaDD~~i~~~~~~~~~~~~~~~~i~~~l~~~gL~l 174 (214)
T cd03487 131 K-SNGLTYTRYADDITFSSNKKLKEALDKLLEIIRSILSEEGFKI 174 (214)
T ss_pred H-HcCCeEEEEeccEEEEccccchhHHHHHHHHHHHHHHHCCcee
Confidence 0 112334579999999999988 89999999999999999998
No 29
>cd00303 retropepsin_like Retropepsins; pepsin-like aspartate proteases. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples
Probab=96.67 E-value=0.0076 Score=56.59 Aligned_cols=68 Identities=26% Similarity=0.375 Sum_probs=47.3
Q ss_pred eeeeEEeeCCCCccccCHhhHhhhCC-Ccce-eeeEEeeccCCCcceece-EEEEEecccCCcceEEEEEEeec
Q psy15380 888 HEMRFLLDCASMSNILSLSACKQLGL-KTLF-VATDLKGVGSISSPVQGQ-VTMRFGSRFDKRYNYTIKALVVN 958 (1738)
Q Consensus 888 ~~v~aLLDSGS~vS~Is~~la~~LgL-~~~~-~~i~I~g~g~~~~~~~~~-v~v~i~~~~~~~~~~~i~a~Vvp 958 (1738)
..+.+|+|+||+.++++.+.+++++. .... ....+.++++......+. ..+.+...+ ..+.+.+++++
T Consensus 8 ~~~~~liDtgs~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~---~~~~~~~~~~~ 78 (92)
T cd00303 8 VPVRALVDSGASVNFISESLAKKLGLPPRLLPTPLKVKGANGSSVKTLGVILPVTIGIGG---KTFTVDFYVLD 78 (92)
T ss_pred EEEEEEEcCCCcccccCHHHHHHcCCCcccCCCceEEEecCCCEeccCcEEEEEEEEeCC---EEEEEEEEEEc
Confidence 37899999999999999999999987 3322 567777887765444333 455555544 34556666554
No 30
>TIGR03698 clan_AA_DTGF clan AA aspartic protease, AF_0612 family. Members of this protein family are clan AA aspartic proteases, related to family TIGR02281. These proteins resemble retropepsins, pepsin-like proteases of retroviruses such as HIV. Members of this family are found in archaea and bacteria.
Probab=96.50 E-value=0.013 Score=59.13 Aligned_cols=67 Identities=12% Similarity=0.011 Sum_probs=48.5
Q ss_pred EEEEEEecC-CCceeeeEEeeCCCCccc-cCHhhHhhhCCCcceeeeEEeeccCCCcceeceEEEEEeccc
Q psy15380 876 VKILVLDVY-GKPHEMRFLLDCASMSNI-LSLSACKQLGLKTLFVATDLKGVGSISSPVQGQVTMRFGSRF 944 (1738)
Q Consensus 876 ~~V~v~~~~-g~~~~v~aLLDSGS~vS~-Is~~la~~LgL~~~~~~i~I~g~g~~~~~~~~~v~v~i~~~~ 944 (1738)
+.+.+.|++ .+...+.+|+||||+..+ |+.++|++||++... ...+.+++|..... ......+...+
T Consensus 2 ~~v~~~~p~~~~~~~v~~LVDTGat~~~~l~~~~a~~lgl~~~~-~~~~~tA~G~~~~~-~v~~~~v~igg 70 (107)
T TIGR03698 2 LDVELSNPKNPEFMEVRALVDTGFSGFLLVPPDIVNKLGLPELD-QRRVYLADGREVLT-DVAKASIIING 70 (107)
T ss_pred EEEEEeCCCCCCceEEEEEEECCCCeEEecCHHHHHHcCCCccc-CcEEEecCCcEEEE-EEEEEEEEECC
Confidence 567888874 444589999999999887 999999999998864 55788888853332 22334444444
No 31
>KOG0012|consensus
Probab=96.44 E-value=0.0031 Score=73.69 Aligned_cols=39 Identities=23% Similarity=0.493 Sum_probs=31.2
Q ss_pred eeeeEEeeCCCCccccCHhhHhhhCCCcce---eeeEEeecc
Q psy15380 888 HEMRFLLDCASMSNILSLSACKQLGLKTLF---VATDLKGVG 926 (1738)
Q Consensus 888 ~~v~aLLDSGS~vS~Is~~la~~LgL~~~~---~~i~I~g~g 926 (1738)
+.|+|++|||||.|+||.++|+++||...- ..-..+|+|
T Consensus 245 ~~VKAfVDsGaq~timS~~Caer~gL~rlid~r~~g~a~gvg 286 (380)
T KOG0012|consen 245 VPVKAFVDSGAQTTIMSAACAERCGLNRLIDKRFQGEARGVG 286 (380)
T ss_pred EEEEEEEcccchhhhhhHHHHHHhChHHHhhhhhhccccCCC
Confidence 389999999999999999999999998754 223344555
No 32
>PF02160 Peptidase_A3: Cauliflower mosaic virus peptidase (A3); InterPro: IPR000588 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases (EC 3.4.23) of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of sequences contains an aspartic peptidase signature that belongs to MEROPS peptidase family A3, subfamily A3A (cauliflower mosaic virus-type endopeptidase, clan AA). Cauliflower mosaic virus belongs to the Retro-transcribing viruses and has a double-stranded DNA genome. The genome includes an open reading frame (ORF V) that shows similarities to the pol gene of retroviruses. This ORF codes for a polyprotein that includes a reverse transcriptase, which, on the basis of a DTG triplet near the N terminus, was suggested to include an aspartic protease. The presence of an aspartic protease has been confirmed by mutational studies, implicating Asp-45 in catalysis. The protease releases itself from the polyprotein and is involved in reactions required to process the ORF IV polyprotein, which includes the viral coat protein. The viral aspartic peptidase signature has also been found associated with a polyprotein encoded by integrated pararetrovirus-like sequences in the genome of Nicotiana tabacum (Common tobacco).; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis
Probab=96.39 E-value=0.0054 Score=67.63 Aligned_cols=99 Identities=9% Similarity=-0.051 Sum_probs=63.6
Q ss_pred EEEEecCCCceeeeEEeeCCCCccccCHhhHhhhCCCcceeeeEEeeccCCCccee-ceEEEEEecccCCcceEEEEEEe
Q psy15380 878 ILVLDVYGKPHEMRFLLDCASMSNILSLSACKQLGLKTLFVATDLKGVGSISSPVQ-GQVTMRFGSRFDKRYNYTIKALV 956 (1738)
Q Consensus 878 V~v~~~~g~~~~v~aLLDSGS~vS~Is~~la~~LgL~~~~~~i~I~g~g~~~~~~~-~~v~v~i~~~~~~~~~~~i~a~V 956 (1738)
|.+.-+......+.|++||||++.++++.+.-.-.......++.|+|++++...+. ..-.+.+.+.+ +.+.
T Consensus 9 ~~i~~~gy~~~~~~~~vDTGAt~C~~~~~iiP~e~we~~~~~i~v~~an~~~~~i~~~~~~~~i~I~~---~~F~----- 80 (201)
T PF02160_consen 9 VKISFPGYKKFNYHCYVDTGATICCASKKIIPEEYWEKSKKPIKVKGANGSIIQINKKAKNGKIQIAD---KIFR----- 80 (201)
T ss_pred EEEEEcCceeEEEEEEEeCCCceEEecCCcCCHHHHHhCCCcEEEEEecCCceEEEEEecCceEEEcc---EEEe-----
Confidence 33443433566889999999999999887765544444557789999998765543 34455555544 2222
Q ss_pred eccccCCCCccccchhhhHHhhhhhcccCCcCCCCCCceeeeeccccccccccCCc
Q psy15380 957 VNHVVDKLPVKSVDLSKLEYLQNIRHKLADDEFMEPGNICGILGAQIYPYLVSGDP 1012 (1738)
Q Consensus 957 vp~i~~~lP~~~i~~~~~~~l~~l~~~Ladp~~~~~~~idlLIG~D~~~~i~~~~~ 1012 (1738)
+|.+- -....+|||||++++..+.+.-+
T Consensus 81 IP~iY----------------------------q~~~g~d~IlG~NF~r~y~Pfiq 108 (201)
T PF02160_consen 81 IPTIY----------------------------QQESGIDIILGNNFLRLYEPFIQ 108 (201)
T ss_pred ccEEE----------------------------EecCCCCEEecchHHHhcCCcEE
Confidence 22110 11246899999999987666633
No 33
>PF00098 zf-CCHC: Zinc knuckle; InterPro: IPR001878 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence C-X2-C-X4-H-X4-C, where X can be any amino acid and each number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses, where they are required for viral genome packaging and for the early infection process.
They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0003676 nucleic acid binding, 0008270 zinc ion binding; PDB: 2L44_A 1A1T_A 1WWG_A 1U6P_A 1WWD_A 1WWE_A 1A6B_B 1F6U_A 1MFS_A 1NCP_C ....
Probab=96.05 E-value=0.0035 Score=41.82 Aligned_cols=18 Identities=44% Similarity=0.999 Sum_probs=16.1
Q ss_pred ccccccccCCCCCCCCCC
Q psy15380 722 DLCVNCLGTGHKANNCPS 739 (1738)
Q Consensus 722 ~lCf~Clk~GH~a~~C~s 739 (1738)
+.||+|++.||++++|++
T Consensus 1 ~~C~~C~~~GH~~~~Cp~ 18 (18)
T PF00098_consen 1 RKCFNCGEPGHIARDCPK 18 (18)
T ss_dssp SBCTTTSCSSSCGCTSSS
T ss_pred CcCcCCCCcCcccccCcc
Confidence 369999999999999974
No 34
>cd01646 RT_Bac_retron_I RT_Bac_retron_I: Reverse transcriptases (RTs) in bacterial retrotransposons or retrons. The polymerase reaction of this enzyme leads to the production of a unique RNA-DNA complex called msDNA (multicopy single-stranded (ss)DNA) in which a small ssDNA branches out from a small ssRNA molecule via a 2'-5'phosphodiester linkage. Bacterial retron RTs produce cDNA corresponding to only a small portion of the retron genome.
Probab=95.83 E-value=0.02 Score=62.20 Aligned_cols=72 Identities=19% Similarity=0.216 Sum_probs=60.2
Q ss_pred EEEEEeeeccccchHHHHHHHHHHHHhhhhcCcccccccccceeccceeeecCCHHHHHHHHHHHHHHHHhCCceE
Q psy15380 1260 YELRVVIFGTKASPYLAQRTVKQLIDDESKNFPLACQYIGESLYMDDCVISFPSQKEAIEFFNQTVKLFASGGFRF 1335 (1738)
Q Consensus 1260 y~f~r~pFGl~~SP~~~~~~l~~~l~~~~~~~p~~~~~i~~~~YvDDili~~~s~ee~~~~~~~v~~~l~~~Gf~l 1335 (1738)
.....+|.|...||.+++.+|..+-...... ...+...-||||+++.+++.+++.+.++.+...+++.|+.+
T Consensus 49 ~~~~GlpqG~~lS~~L~~~~l~~~d~~i~~~----~~~~~~~RY~DD~~i~~~~~~~~~~~~~~i~~~l~~~gL~l 120 (158)
T cd01646 49 GQTNGLPIGPLTSRFLANIYLNDVDHELKSK----LKGVDYVRYVDDIRIFADSKEEAEEILEELKEFLAELGLSL 120 (158)
T ss_pred CCCceEccCcchHHHHHHHHHHHHHHHHHhc----cCCceEEEecCcEEEEcCCHHHHHHHHHHHHHHHHHCCCEE
Confidence 3456799999999999999998875555432 12344567999999999999999999999999999999999
No 35
>cd01651 RT_G2_intron RT_G2_intron: Reverse transcriptases (RTs) with group II intron origin. RT transcribes DNA using RNA as template. Proteins in this subfamily are found in bacterial and mitochondrial group II introns. Their most probable ancestor was a retrotransposable element with both gag-like and pol-like genes. This subfamily of proteins appears to have captured the RT sequences from transposable elements, which lack long terminal repeats (LTRs).
Probab=95.82 E-value=0.0087 Score=68.67 Aligned_cols=119 Identities=18% Similarity=0.164 Sum_probs=78.4
Q ss_pred cccCcceeecccccceeeeEecc---------cCcceEEEEec---cC-CcccceEEEEEEeeeccccchHHHHHHHHHH
Q psy15380 1217 FRLFPYALNGDITKMFLQIKLLS---------DYWKFQKILWR---FS-NKEKIDVYELRVVIFGTKASPYLAQRTVKQL 1283 (1738)
Q Consensus 1217 ~r~~~~~~~~Dl~kaf~qv~l~~---------~dr~~~~f~w~---~~-~~~~~~~y~f~r~pFGl~~SP~~~~~~l~~~ 1283 (1738)
.....|++.+|++++|-.|.-+- .+.....++.. .. .........-..+|.|...||.++.-+|..+
T Consensus 66 ~~~~~~~~~~Di~~~Fdsi~~~~l~~~l~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~GlpqG~~lSp~L~~~~l~~l 145 (226)
T cd01651 66 KGGYTWVIEGDIKGFFDNIDHDLLLKILKRRIGDKRVLRLIRKWLKAGVLEDGKLVETEKGTPQGGVISPLLANIYLHEL 145 (226)
T ss_pred cCCCeEEEEccHHHhcCCCCHHHHHHHHHHhcccHHHHHHHHHHHhceEccCCeEeCCCCCcCCCccHHHHHHHHHHHHH
Confidence 56688999999999998774211 01110000000 00 0001111223459999999999999998887
Q ss_pred HHhhhhcCc-------ccccccccceeccceeeecCCHHHHHHHHHHHHHHHHhCCceE
Q psy15380 1284 IDDESKNFP-------LACQYIGESLYMDDCVISFPSQKEAIEFFNQTVKLFASGGFRF 1335 (1738)
Q Consensus 1284 l~~~~~~~p-------~~~~~i~~~~YvDDili~~~s~ee~~~~~~~v~~~l~~~Gf~l 1335 (1738)
......... .....+....|+||+++.+++.+++.+.++.+...+++.|+.+
T Consensus 146 d~~l~~~~~~~~~~~~~~~~~~~~~rY~DD~~i~~~~~~~~~~~~~~i~~~~~~~gl~l 204 (226)
T cd01651 146 DKFVEEKLKEYYDTSDPKFRRLRYVRYADDFVIGVRGPKEAEEIKELIREFLEELGLEL 204 (226)
T ss_pred HHHHHHhhhhcccccccccCceEEEEecCceEEecCCHHHHHHHHHHHHHHHHHcCCee
Confidence 665543210 0112344567999999999999999999999999999999998
No 36
>cd05482 HIV_retropepsin_like Retropepsins, pepsin-like aspartate proteases. This is a subfamily of retropepsins. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This gro
Probab=95.80 E-value=0.028 Score=54.30 Aligned_cols=65 Identities=14% Similarity=0.047 Sum_probs=41.0
Q ss_pred eeeEEeeCCCCccccCHhhHhhhCCCcceeeeEEeeccCCCcceeceEEEEEecccCCcceEEEEEEeec
Q psy15380 889 EMRFLLDCASMSNILSLSACKQLGLKTLFVATDLKGVGSISSPVQGQVTMRFGSRFDKRYNYTIKALVVN 958 (1738)
Q Consensus 889 ~v~aLLDSGS~vS~Is~~la~~LgL~~~~~~i~I~g~g~~~~~~~~~v~v~i~~~~~~~~~~~i~a~Vvp 958 (1738)
.+.+||||||+.|+|++..+.... +....+..|.|+||. ..+.....+.+...+. .....+++.|
T Consensus 9 ~~~~llDTGAd~Tvi~~~~~p~~w-~~~~~~~~i~GIGG~-~~~~~~~~v~i~i~~~---~~~g~vlv~~ 73 (87)
T cd05482 9 LFEGLLDTGADVSIIAENDWPKNW-PIQPAPSNLTGIGGA-ITPSQSSVLLLEIDGE---GHLGTILVYV 73 (87)
T ss_pred EEEEEEccCCCCeEEcccccCCCC-ccCCCCeEEEeccce-EEEEEEeeEEEEEcCC---eEEEEEEEcc
Confidence 679999999999999985554321 111266788899985 3444444566666552 3344455544
No 37
>cd06222 RnaseH RNase H (RNase HI) is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a non-sequence-specific manner. One of the important functions of RNase H is to remove the RNA primers of Okazaki fragments during DNA replication. RNase H knockout mice lack mitochondrial DNA replication and die as embryos. The retroviral reverse transcriptase contains an RNase H domain that plays an important role in converting a single stranded retroviral genomic RNA into a dsDNA for integration into host chromosomes. RNase H has been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.
Probab=95.76 E-value=0.042 Score=55.96 Aligned_cols=93 Identities=16% Similarity=0.057 Sum_probs=58.7
Q ss_pred eecccccc----cceeEEEEEEecCCCcEEEEEEeeccccccCCCCccchhhHHHHHHHHHHHHHHHHhcCCCCCcccEE
Q psy15380 1626 GFSDASEK----GYGALIFSRVSLPNGSIVIQLICAKSRVSPLKVESIPRLELCAILLMSSLLKVVIDSYTPRYPIDSIY 1701 (1738)
Q Consensus 1626 ~F~DAS~~----ayga~vy~r~~~~~g~~~~~l~~aKsrv~p~k~~tIprlEL~a~~~~~~l~~~~~~~l~~~~~~~~~~ 1701 (1738)
+|+|||-. .+|.-++++- .++.... ...+. . ...|.-..||.|++.|++.+. . ..+..+.
T Consensus 2 ~~~Dgs~~~~~~~~g~g~v~~~--~~~~~~~--~~~~~-~---~~~s~~~aEl~al~~al~~~~------~--~~~~~i~ 65 (130)
T cd06222 2 IYTDGSCRGNPGPAGAGVVLRD--PGGEVLL--SGGLL-G---GNTTNNRAELLALIEALELAL------E--LGGKKVN 65 (130)
T ss_pred EEecccCCCCCCceEEEEEEEe--CCCeEEE--ecccc-C---CCCcHHHHHHHHHHHHHHHHH------h--CCCceEE
Confidence 69999988 2233344442 3332221 11111 1 456889999999999988776 2 5689999
Q ss_pred EEechhhhHHhhcCCCccccchhhchHHHHHhh
Q psy15380 1702 CFTDSSVALCWAHSPSHLFNTFVANRISKIHQN 1734 (1738)
Q Consensus 1702 ~~tDS~i~l~~i~~~~~~~~~fv~nRv~~I~~~ 1734 (1738)
++|||..++..+++.......-....+..|++.
T Consensus 66 i~~Ds~~~~~~~~~~~~~~~~~~~~~~~~i~~~ 98 (130)
T cd06222 66 IYTDSQYVINALTGWYEGKPVKNVDLWQRLLAL 98 (130)
T ss_pred EEECHHHHHHHhhccccCCChhhHHHHHHHHHH
Confidence 999999999999986431222233344444443
No 38
>PF00075 RNase_H: RNase H; InterPro: IPR002156 The RNase H domain is responsible for hydrolysis of the RNA portion of RNA-DNA hybrids, and this activity requires the presence of divalent cations (Mg2+ or Mn2+) that bind its active site. This domain is a part of a large family of homologous RNase H enzymes of which the RNase HI protein from Escherichia coli is the best characterised. Secondary structure predictions for the enzymes from E. coli, yeast, human liver and diverse retroviruses (such as Rous sarcoma virus and the Foamy viruses) supported, in every case, the five beta-strands (1 to 5) and four or five alpha-helices (A, B/C, D, E) that have been identified by crystallography in the RNase H domain of Human immunodeficiency virus 1 (HIV-1) reverse transcriptase and in E. coli RNase H. Reverse transcriptase (RT) is a modular enzyme carrying polymerase and ribonuclease H (RNase H) activities in separable domains. Reverse transcriptase (RT) converts the single-stranded RNA genome of a retrovirus into a double-stranded DNA copy for integration into the host genome. This process requires ribonuclease H as well as RNA- and DNA-directed DNA polymerase activities. Retroviral RNase H is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. Bacterial RNase H (EC 3.1.26.4) catalyses endonucleolytic cleavage of RNA-DNA hybrids to 5'-phosphomonoesters. The 3D structures of the RNase H domain from diverse bacteria and retroviruses have been solved. All have four beta strands and four to five alpha helices. The E. coli RNase H1 protein binds a single Mg2+ ion cofactor in the active site of the enzyme. The divalent cation is bound by the carboxyl groups of four acidic residues, Asp-10, Glu-48, Asp-70, and Asp-134.
The first three acidic residues are highly conserved in all bacterial and retroviral RNase H sequences.; GO: 0003676 nucleic acid binding, 0004523 ribonuclease H activity; PDB: 3LP3_B 2KW4_A 3P1G_A 1RIL_A 2RPI_A 4EQJ_G 4EP2_B 3OTY_P 3U3G_D 2ZQB_D ....
Probab=95.56 E-value=0.009 Score=62.48 Aligned_cols=67 Identities=27% Similarity=0.257 Sum_probs=45.7
Q ss_pred eeeeeccccc------ccceeEEEEEEecCCCcEEEEEEeeccccccCCCCccchhhHHHHHHHHHHHHHHHHhcCCCCC
Q psy15380 1623 QLIGFSDASE------KGYGALIFSRVSLPNGSIVIQLICAKSRVSPLKVESIPRLELCAILLMSSLLKVVIDSYTPRYP 1696 (1738)
Q Consensus 1623 ~L~~F~DAS~------~ayga~vy~r~~~~~g~~~~~l~~aKsrv~p~k~~tIprlEL~a~~~~~~l~~~~~~~l~~~~~ 1696 (1738)
++.+|+|||- .++|.+++ +| . ....+....|.-+.||.|+..|++.+ . .
T Consensus 3 ~~~iytDgS~~~~~~~~~~g~v~~------~~-~--------~~~~~~~~~s~~~aEl~Ai~~AL~~~---~-~------ 57 (132)
T PF00075_consen 3 AIIIYTDGSCRPNPGKGGAGYVVW------GG-R--------NFSFRLGGQSNNRAELQAIIEALKAL---E-H------ 57 (132)
T ss_dssp SEEEEEEEEECTTTTEEEEEEEEE------TT-E--------EEEEEEESECHHHHHHHHHHHHHHTH---S-T------
T ss_pred cEEEEEeCCccCCCCceEEEEEEE------CC-e--------EEEecccccchhhhheehHHHHHHHh---h-c------
Confidence 5789999993 35555332 23 1 11122226799999999999998733 1 1
Q ss_pred cccEEEEechhhhHHhhcC
Q psy15380 1697 IDSIYCFTDSSVALCWAHS 1715 (1738)
Q Consensus 1697 ~~~~~~~tDS~i~l~~i~~ 1715 (1738)
..+.++|||+.++.+|..
T Consensus 58 -~~v~I~tDS~~v~~~l~~ 75 (132)
T PF00075_consen 58 -RKVTIYTDSQYVLNALNK 75 (132)
T ss_dssp -SEEEEEES-HHHHHHHHT
T ss_pred -ccccccccHHHHHHHHHH
Confidence 678899999999998887
No 39
>COG0328 RnhA Ribonuclease HI [DNA replication, recombination, and repair]
Probab=95.49 E-value=0.037 Score=59.01 Aligned_cols=80 Identities=21% Similarity=0.240 Sum_probs=54.7
Q ss_pred eeeeeccccc------ccceeEEEEEEecCCCcEEEEEEeeccccccCCCCccchhhHHHHHHHHHHHHHHHHhcCCCCC
Q psy15380 1623 QLIGFSDASE------KGYGALIFSRVSLPNGSIVIQLICAKSRVSPLKVESIPRLELCAILLMSSLLKVVIDSYTPRYP 1696 (1738)
Q Consensus 1623 ~L~~F~DAS~------~ayga~vy~r~~~~~g~~~~~l~~aKsrv~p~k~~tIprlEL~a~~~~~~l~~~~~~~l~~~~~ 1696 (1738)
.+++|+|+|- .|||++++ ..+++... ..+....|=.|+||.|++.|++.+.. ..
T Consensus 3 ~v~if~DGa~~gNpG~gG~g~vl~----~~~~~~~~--------s~~~~~tTNNraEl~A~i~AL~~l~~--------~~ 62 (154)
T COG0328 3 KVEIFTDGACLGNPGPGGWGAVLR----YGDGEKEL--------SGGEGRTTNNRAELRALIEALEALKE--------LG 62 (154)
T ss_pred ceEEEecCccCCCCCCceEEEEEE----cCCceEEE--------eeeeecccChHHHHHHHHHHHHHHHh--------cC
Confidence 4788999984 56888766 23443311 22333568899999999999887765 34
Q ss_pred cccEEEEechhhhHH----hhc-CCCccccc
Q psy15380 1697 IDSIYCFTDSSVALC----WAH-SPSHLFNT 1722 (1738)
Q Consensus 1697 ~~~~~~~tDS~i~l~----~i~-~~~~~~~~ 1722 (1738)
+..+.++|||+.|.. |+. .....|++
T Consensus 63 ~~~v~l~tDS~yv~~~i~~w~~~w~~~~w~~ 93 (154)
T COG0328 63 ACEVTLYTDSKYVVEGITRWIVKWKKNGWKT 93 (154)
T ss_pred CceEEEEecHHHHHHHHHHHHhhccccCccc
Confidence 788999999998843 633 34456653
No 40
>PRK07708 hypothetical protein; Validated
Probab=95.34 E-value=0.2 Score=57.29 Aligned_cols=81 Identities=15% Similarity=0.184 Sum_probs=52.0
Q ss_pred eeeeeccccc------ccceeEEEEEEecCCCcEEEEEEeeccccccCCCCccchhhHHHHHHHHHHHHHHHHhcCCCCC
Q psy15380 1623 QLIGFSDASE------KGYGALIFSRVSLPNGSIVIQLICAKSRVSPLKVESIPRLELCAILLMSSLLKVVIDSYTPRYP 1696 (1738)
Q Consensus 1623 ~L~~F~DAS~------~ayga~vy~r~~~~~g~~~~~l~~aKsrv~p~k~~tIprlEL~a~~~~~~l~~~~~~~l~~~~~ 1696 (1738)
.+++|+|||- .|+|+|++ . +.|.....+ .....+.+ ..|=...|+.|++.|++++..+ . +.
T Consensus 73 ~~~vY~DGs~~~n~g~aG~GvVI~--~--~~g~~~~~~-~~~~~l~~--~~TNN~AEy~Ali~aL~~A~e~----g--~~ 139 (219)
T PRK07708 73 EILVYFDGGFDKETKLAGLGIVIY--Y--KQGNKRYRI-RRNAYIEG--IYDNNEAEYAALYYAMQELEEL----G--VK 139 (219)
T ss_pred cEEEEEeeccCCCCCCcEEEEEEE--E--CCCCEEEEE-Eeeccccc--cccCcHHHHHHHHHHHHHHHHc----C--CC
Confidence 5889999976 34566655 2 334332222 21122322 2477888999999998876433 3 22
Q ss_pred cccEEEEechhhhHHhhcCC
Q psy15380 1697 IDSIYCFTDSSVALCWAHSP 1716 (1738)
Q Consensus 1697 ~~~~~~~tDS~i~l~~i~~~ 1716 (1738)
-..+.+++||+.++.|+++.
T Consensus 140 ~~~V~I~~DSqlVi~qi~g~ 159 (219)
T PRK07708 140 HEPVTFRGDSQVVLNQLAGE 159 (219)
T ss_pred cceEEEEeccHHHHHHhCCC
Confidence 23488999999999999975
No 41
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=94.92 E-value=0.014 Score=63.27 Aligned_cols=72 Identities=21% Similarity=0.395 Sum_probs=39.4
Q ss_pred ccCccccccC-CCCCcccccccc---cCCHHHHHHHHHhcccccccccCCCCCCCC-C---CCCCCCccccccc-cCCCC
Q psy15380 686 MRGNMCVLCS-EEHPLFKCNKFL---KLSPQERYGVVKQHDLCVNCLGTGHKANNC-P---SKSNCNICQFRHN-TLLHF 756 (1738)
Q Consensus 686 ~~~~~C~~C~-~~H~~~~C~~f~---~~s~~eR~~~vk~~~lCf~Clk~GH~a~~C-~---s~~~C~~C~~~Hh-t~lh~ 756 (1738)
.....|..|+ .+|..++||... -.-..-|....-+...|++|+.-||++++| + .+..|..|+..+| +..|+
T Consensus 58 ~~~~~C~nCg~~GH~~~DCP~~iC~~C~~~~H~s~~C~~~~~C~~Cg~~GH~~~dC~P~~~~~~~C~~C~s~~H~s~~Cp 137 (190)
T COG5082 58 EENPVCFNCGQNGHLRRDCPHSICYNCSWDGHRSNHCPKPKKCYNCGETGHLSRDCNPSKDQQKSCFDCNSTRHSSEDCP 137 (190)
T ss_pred ccccccchhcccCcccccCChhHhhhcCCCCcccccCCcccccccccccCccccccCcccccCcceeccCCCccccccCc
Confidence 4457888998 788888888100 000001111111225777777777777777 2 2335677776444 34454
Q ss_pred C
Q psy15380 757 N 757 (1738)
Q Consensus 757 ~ 757 (1738)
.
T Consensus 138 ~ 138 (190)
T COG5082 138 S 138 (190)
T ss_pred c
Confidence 4
No 42
>PF12382 Peptidase_A2E: Retrotransposon peptidase; InterPro: IPR024648 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. This entry represents a small family of fungal retroviral aspartyl peptidases.
Probab=94.80 E-value=0.071 Score=50.74 Aligned_cols=69 Identities=14% Similarity=0.184 Sum_probs=48.4
Q ss_pred eeeEEeeCCCCccccCHhhHhhhCCCcceeeeEEeeccCCCc-ce-eceEEEEEecccCCcceEEEEEEeecccc
Q psy15380 889 EMRFLLDCASMSNILSLSACKQLGLKTLFVATDLKGVGSISS-PV-QGQVTMRFGSRFDKRYNYTIKALVVNHVV 961 (1738)
Q Consensus 889 ~v~aLLDSGS~vS~Is~~la~~LgL~~~~~~i~I~g~g~~~~-~~-~~~v~v~i~~~~~~~~~~~i~a~Vvp~i~ 961 (1738)
.+.||+|+|||+++|+++.+..-+||..+-.-++ -+||.-. .. .....+.|.+++ -.+..+++|+....
T Consensus 47 sipclidtgaq~niiteetvrahklptrpw~~sv-iyggvyp~kinrkt~kl~i~lng---isikteflvvkkfs 117 (137)
T PF12382_consen 47 SIPCLIDTGAQVNIITEETVRAHKLPTRPWSQSV-IYGGVYPNKINRKTIKLNINLNG---ISIKTEFLVVKKFS 117 (137)
T ss_pred cceeEEccCceeeeeehhhhhhccCCCCcchhhe-EeccccccccccceEEEEEEecc---eEEEEEEEEEEecc
Confidence 5689999999999999999999999988732222 1333222 22 255667777766 45677888888653
No 43
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=94.75 E-value=0.02 Score=62.23 Aligned_cols=36 Identities=31% Similarity=0.755 Sum_probs=28.8
Q ss_pred HhcccccccccCCCCCCCCCCCCCCCccccccccCCC
Q psy15380 719 KQHDLCVNCLGTGHKANNCPSKSNCNICQFRHNTLLH 755 (1738)
Q Consensus 719 k~~~lCf~Clk~GH~a~~C~s~~~C~~C~~~Hht~lh 755 (1738)
.....||||+..||.+++|+ -.-|..|...-|.+.|
T Consensus 58 ~~~~~C~nCg~~GH~~~DCP-~~iC~~C~~~~H~s~~ 93 (190)
T COG5082 58 EENPVCFNCGQNGHLRRDCP-HSICYNCSWDGHRSNH 93 (190)
T ss_pred ccccccchhcccCcccccCC-hhHhhhcCCCCccccc
Confidence 45688999999999999999 4678888666666654
No 44
>PF14893 PNMA: PNMA
Probab=94.05 E-value=0.12 Score=62.13 Aligned_cols=97 Identities=19% Similarity=0.235 Sum_probs=82.7
Q ss_pred CCCCCCCCCCc------CChHHHHHHHHHHhccCCCCChHHHHHHHHHhccccHHHhhcCCC--CCCCcHHHHHHHHHhH
Q psy15380 436 LEIPTFDGACI------GNWPLFIEMYRINIHNRTDLTNAHKLQYLLSKLSGGALAVAAGIP--PTEHNYQVIYDALLEK 507 (1738)
Q Consensus 436 ~~lp~F~G~~~------~~~~~f~~~F~~~v~~~~~l~d~~K~~~L~s~L~G~A~~~i~~~~--~t~~nY~~a~~~L~~~ 507 (1738)
.++..|+|. . +.|..|.+.+..++..-.+++|.+|-..|+..|.|.|.+.++.+. ........-++.|+..
T Consensus 164 ~~L~iFSG~-~~~~~gee~fe~Wl~~a~~~v~~W~~~~e~ekrrrlle~L~GpA~~~~r~l~~~nP~~t~~~~l~aL~~~ 242 (331)
T PF14893_consen 164 RDLRIFSGR-EEPAPGEESFESWLEHANEMVKKWNDVSEEEKRRRLLESLRGPALDSRRKLQKKNPKQTAQDCLKALGQV 242 (331)
T ss_pred hhhhhhcCC-CCCCCCcccHHHHHHHHHHHHHhccCCchhhchhhhHHhcccHHHHHHHHHHhcCCCCCHHHHHHHHHHh
Confidence 568899997 4 449999999999999866699999999999999999999999884 4567789999999999
Q ss_pred hccchhhHHHHHHHHhcccccCCChh
Q psy15380 508 YDDKRNLATYYMDSLLNFKTQSGSLE 533 (1738)
Q Consensus 508 fg~~~~i~~~~~~~l~~~~~~~~~~~ 533 (1738)
||++......+...+...+..++...
T Consensus 243 Fg~~es~~~~~~kf~~~~Q~~~E~ls 268 (331)
T PF14893_consen 243 FGSSESRETLEAKFLNTFQEPGEKLS 268 (331)
T ss_pred cCCcccHHHHHHHHHHhhccCCCCHH
Confidence 99999998888777777665555555
No 45
>PF14223 UBN2: gag-polypeptide of LTR copia-type
Probab=94.01 E-value=0.39 Score=49.45 Aligned_cols=71 Identities=18% Similarity=0.182 Sum_probs=51.1
Q ss_pred hHHHHHHHHHHHHHHHHhCCCCCCchhhHhhhcccCCCHHHHHHHHHHhCCCCCc--cHHHHHHHHHHHHHHHH
Q psy15380 578 SLETFIDYFGANVAALKALDLPNLSEFFLFYLGNSKLDESTRKQFELSLGKHEIP--TFHKLLEFAQNHNKILN 649 (1738)
Q Consensus 578 ~l~~~~~~~~~~~~~L~~l~~~~~~~~~l~~~i~~kLp~~~~~~~~~~~~~~~~~--tl~~ll~fl~~~~~~~~ 649 (1738)
.+..|+.++...++.|..+|.+.....++ ..++..||+.-..-........+.+ |+++++..+...+....
T Consensus 42 sv~~y~~~~~~i~~~L~~~g~~i~d~~~v-~~iL~~Lp~~y~~~~~~i~~~~~~~~~t~~el~~~L~~~E~~~~ 114 (119)
T PF14223_consen 42 SVDEYISRLKEIVDELRAIGKPISDEDLV-SKILRSLPPSYDTFVTAIRNSKDLPKMTLEELISRLLAEEMRLK 114 (119)
T ss_pred cHHHHHHHHHHhhhhhhhcCCcccchhHH-HHHHhcCCchhHHHHHHHHhcCCCCcCCHHHHHHHHHHHHHHHH
Confidence 67778888888999999999977654444 8899999987654443332233444 89999999887765443
No 46
>PF09337 zf-H2C2: His(2)-Cys(2) zinc finger; InterPro: IPR015416 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents an H2C2-type zinc finger that binds to histone upstream activating sequence (UAS) elements found in histone gene promoters. More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Probab=93.97 E-value=0.025 Score=45.87 Aligned_cols=34 Identities=15% Similarity=0.066 Sum_probs=26.3
Q ss_pred CCchhhHHHHH-Hhhhhccccchhhhhcccccccc
Q psy15380 313 QPLVVNSFIAI-RATFILQYFKQNRVLGENCLISF 346 (1738)
Q Consensus 313 ~~~~~~~~~~~-r~~~~i~~r~~~r~~~~~C~~c~ 346 (1738)
|.|...++..| ++|||++.++-|+.++++|..|+
T Consensus 5 H~Gi~kT~~~i~~~y~W~gm~~~V~~~ir~C~~Cq 39 (39)
T PF09337_consen 5 HPGINKTTAKISQRYHWPGMKKDVRRVIRSCPQCQ 39 (39)
T ss_pred CCCHHHHHHHHHHhheecCHHHHHHHHHhcCcccC
Confidence 55555555544 47889999999999999998885
No 47
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=93.65 E-value=0.07 Score=57.35 Aligned_cols=53 Identities=26% Similarity=0.560 Sum_probs=31.0
Q ss_pred CccccccC-CCCCcccccccccCCHHHHHHHHHhcccccccccCCCCCCCCCCCC------CCCccccc
Q psy15380 688 GNMCVLCS-EEHPLFKCNKFLKLSPQERYGVVKQHDLCVNCLGTGHKANNCPSKS------NCNICQFR 749 (1738)
Q Consensus 688 ~~~C~~C~-~~H~~~~C~~f~~~s~~eR~~~vk~~~lCf~Clk~GH~a~~C~s~~------~C~~C~~~ 749 (1738)
...|..|+ .+|...+||+... ......||+|...||++++|+.+. .|..|++.
T Consensus 27 ~~~C~~Cg~~GH~~~~Cp~~~~---------~~~~~~C~~Cg~~GH~~~~Cp~~~~~~~~~~C~~Cg~~ 86 (148)
T PTZ00368 27 ARPCYKCGEPGHLSRECPSAPG---------GRGERSCYNCGKTGHLSRECPEAPPGSGPRSCYNCGQT 86 (148)
T ss_pred CccCccCCCCCcCcccCcCCCC---------CCCCcccCCCCCcCcCcccCCCcccCCCCcccCcCCCC
Confidence 35677776 5677777764321 012345777777777777776543 46666654
No 48
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=93.35 E-value=0.071 Score=57.30 Aligned_cols=60 Identities=23% Similarity=0.412 Sum_probs=45.1
Q ss_pred CccccccC-CCCCcccccccccCCHHHHHHHHHhcccccccccCCCCCCCCCCCC-------CCCccccccccC-CCC
Q psy15380 688 GNMCVLCS-EEHPLFKCNKFLKLSPQERYGVVKQHDLCVNCLGTGHKANNCPSKS-------NCNICQFRHNTL-LHF 756 (1738)
Q Consensus 688 ~~~C~~C~-~~H~~~~C~~f~~~s~~eR~~~vk~~~lCf~Clk~GH~a~~C~s~~-------~C~~C~~~Hht~-lh~ 756 (1738)
...|..|+ .+|...+||+-..- .....||+|.+.||++++|+.+. .|..|+..-|.. -|+
T Consensus 52 ~~~C~~Cg~~GH~~~~Cp~~~~~---------~~~~~C~~Cg~~GH~~~~C~~~~~~~~~~~~C~~Cg~~gH~~~~C~ 120 (148)
T PTZ00368 52 ERSCYNCGKTGHLSRECPEAPPG---------SGPRSCYNCGQTGHISRECPNRAKGGAARRACYNCGGEGHISRDCP 120 (148)
T ss_pred CcccCCCCCcCcCcccCCCcccC---------CCCcccCcCCCCCcccccCCCcccccccchhhcccCcCCcchhcCC
Confidence 45799999 78999999863211 13468999999999999998754 599999865544 343
No 49
>PF07727 RVT_2: Reverse transcriptase (RNA-dependent DNA polymerase); InterPro: IPR013103 A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This entry includes reverse transcriptases not recognised by IPR000477 from INTERPRO.
Probab=92.67 E-value=0.049 Score=63.56 Aligned_cols=102 Identities=23% Similarity=0.271 Sum_probs=70.0
Q ss_pred ceeecccccceeeeEecccCcceEEEEeccC-CcccceEEEEEEeeeccccchHHHHHHHHHHHHhhhhc----Ccc---
Q psy15380 1222 YALNGDITKMFLQIKLLSDYWKFQKILWRFS-NKEKIDVYELRVVIFGTKASPYLAQRTVKQLIDDESKN----FPL--- 1293 (1738)
Q Consensus 1222 ~~~~~Dl~kaf~qv~l~~~dr~~~~f~w~~~-~~~~~~~y~f~r~pFGl~~SP~~~~~~l~~~l~~~~~~----~p~--- 1293 (1738)
.+-.+|+..||++=.++++ -|.+..-.-. ...+-.++++.+--|||+-||..+...++..|....=. .|-
T Consensus 78 ~~~q~Dv~tAfL~~~l~e~--iym~~P~g~~~~~~~~~v~~L~kaLYGLKQa~r~W~~~l~~~L~~~GF~~~~~D~clfi 155 (246)
T PF07727_consen 78 ELHQMDVKTAFLNGDLDEE--IYMRQPPGFEDPGPPGKVCRLKKALYGLKQAPRLWYKTLDKFLKKLGFKQSKADPCLFI 155 (246)
T ss_pred ccccccccceeeecccccc--hhhcccccccccccccccccccccceecccccchhhhhcccccchhhhhcccccccccc
Confidence 3456799999999998776 2222211000 11245689999999999999999999888887754311 010
Q ss_pred ---cccccccceeccceeeecCCHHHHHHHHHHHH
Q psy15380 1294 ---ACQYIGESLYMDDCVISFPSQKEAIEFFNQTV 1325 (1738)
Q Consensus 1294 ---~~~~i~~~~YvDDili~~~s~ee~~~~~~~v~ 1325 (1738)
-...+.+.+||||+++.+.+.++..+..+++.
T Consensus 156 ~~~~~~~~ii~vYVDDili~~~~~~~i~~~~~~l~ 190 (246)
T PF07727_consen 156 KKSGDGFIIILVYVDDILIAGPSEEEIEEFKKELK 190 (246)
T ss_pred cccccccccccccccccccccccccceeccccccc
Confidence 12346678999999999999987777655443
No 50
>PF14227 UBN2_2: gag-polypeptide of LTR copia-type
Probab=92.52 E-value=1 Score=46.38 Aligned_cols=72 Identities=18% Similarity=0.087 Sum_probs=51.9
Q ss_pred hHHHHHHHHHHHHHHHHhCCCCCCchhhHhhhcccCCCHHHHHHHHH--HhCCCCCccHHHHHHHHHHHHHHHHh
Q psy15380 578 SLETFIDYFGANVAALKALDLPNLSEFFLFYLGNSKLDESTRKQFEL--SLGKHEIPTFHKLLEFAQNHNKILNR 650 (1738)
Q Consensus 578 ~l~~~~~~~~~~~~~L~~l~~~~~~~~~l~~~i~~kLp~~~~~~~~~--~~~~~~~~tl~~ll~fl~~~~~~~~~ 650 (1738)
.++..++.++..++.|+.+|.+.+.+... ..|+..||+.-..-... .......++++++...+..+......
T Consensus 40 ~v~~hi~~~~~l~~~L~~~g~~i~d~~~~-~~lL~sLP~sy~~~~~~l~~~~~~~~~tl~~v~~~L~~ee~~~~~ 113 (119)
T PF14227_consen 40 SVRDHINEFRSLVNQLKSLGVPIDDEDKV-IILLSSLPPSYDSFVTALLYSKPEDELTLEEVKSKLLQEEERRKK 113 (119)
T ss_pred hHHHHHHHHHHHHHhhccccccchHHHHH-HHHHHcCCHhHHHHHHHHHccCCCCCcCHHHHHHHHHHHHHHHHh
Confidence 67778888899999999999988655444 88999999985433322 11123678999999988886554443
No 51
>PF13456 RVT_3: Reverse transcriptase-like; PDB: 3ALY_A 2EHG_A 3HST_B.
Probab=92.50 E-value=0.16 Score=48.63 Aligned_cols=54 Identities=19% Similarity=0.125 Sum_probs=39.4
Q ss_pred cccccEEEEechHHHHHHhcCCCCCcchhhhhhHHHHhhcccC---cceeccCCCCCC
Q psy15380 24 YPIDSIYCFTDSSVALCWAHSPSHLFNTFVANRISKIQQNMKV---DSLYHVSGTENP 78 (1738)
Q Consensus 24 ~~i~~~~~wtDS~ivL~wI~~~~~~~~~fVaNRv~~I~~~~~~---~~w~hVpt~~NP 78 (1738)
+.+.++.+.|||..+...|++...... .....+.+|+..... ..|+|||-+.|=
T Consensus 19 ~g~~~i~v~sDs~~vv~~i~~~~~~~~-~~~~~~~~i~~~~~~~~~~~~~~i~r~~N~ 75 (87)
T PF13456_consen 19 LGIRKIIVESDSQLVVDAINGRSSSRS-ELRPLIQDIRSLLDRFWNVSVSHIPREQNK 75 (87)
T ss_dssp CT-SCEEEEES-HHHHHHHTTSS---S-CCHHHHHHHHHHHCCCSCEEEEE--GGGSH
T ss_pred CCCCEEEEEecCccccccccccccccc-cccccchhhhhhhccccceEEEEEChHHhH
Confidence 679999999999999999988755434 777778888776665 889999999884
No 52
>PRK00203 rnhA ribonuclease H; Reviewed
Probab=92.44 E-value=0.3 Score=52.65 Aligned_cols=90 Identities=19% Similarity=0.262 Sum_probs=53.5
Q ss_pred eeeecccccc------cceeEEEEEEecCCCcEEEEEEeeccccccCCCCccchhhHHHHHHHHHHHHHHHHhcCCCCCc
Q psy15380 1624 LIGFSDASEK------GYGALIFSRVSLPNGSIVIQLICAKSRVSPLKVESIPRLELCAILLMSSLLKVVIDSYTPRYPI 1697 (1738)
Q Consensus 1624 L~~F~DAS~~------ayga~vy~r~~~~~g~~~~~l~~aKsrv~p~k~~tIprlEL~a~~~~~~l~~~~~~~l~~~~~~ 1697 (1738)
+.+|+|+|-. |||++++ . .++.. . +.. +....|-.|.||.|++.|++.+. . .
T Consensus 4 v~iytDGs~~~n~~~~g~g~v~~--~--~~~~~--~-~~~-----~~~~~TN~~aEL~Ai~~AL~~~~------~----~ 61 (150)
T PRK00203 4 VEIYTDGACLGNPGPGGWGAILR--Y--KGHEK--E-LSG-----GEALTTNNRMELMAAIEALEALK------E----P 61 (150)
T ss_pred EEEEEEecccCCCCceEEEEEEE--E--CCeeE--E-Eec-----CCCCCcHHHHHHHHHHHHHHHcC------C----C
Confidence 7789999973 5665544 1 22221 1 111 12246889999999999987442 1 2
Q ss_pred ccEEEEechhhhHH----hhcC-CCcccc----chhhch--HHHHHhhc
Q psy15380 1698 DSIYCFTDSSVALC----WAHS-PSHLFN----TFVANR--ISKIHQNM 1735 (1738)
Q Consensus 1698 ~~~~~~tDS~i~l~----~i~~-~~~~~~----~fv~nR--v~~I~~~~ 1735 (1738)
..+.++|||+.++. |+.+ ..+.|+ .-|+|+ .++|.++.
T Consensus 62 ~~v~I~tDS~yvi~~i~~w~~~Wk~~~~~~~~g~~v~n~dl~~~i~~l~ 110 (150)
T PRK00203 62 CEVTLYTDSQYVRQGITEWIHGWKKNGWKTADKKPVKNVDLWQRLDAAL 110 (150)
T ss_pred CeEEEEECHHHHHHHHHHHHHHHHHcCCcccCCCccccHHHHHHHHHHh
Confidence 46899999987765 4432 112332 356676 45665543
No 53
>PF00336 DNA_pol_viral_C: DNA polymerase (viral) C-terminal domain; InterPro: IPR001462 This domain is at the C terminus of hepatitis B-type viruses P proteins and represents a functional domain that controls the RNase H activities of the protein. The domain is always associated with IPR000201 from INTERPRO.; GO: 0004523 ribonuclease H activity
Probab=92.01 E-value=0.12 Score=55.93 Aligned_cols=49 Identities=37% Similarity=0.456 Sum_probs=36.5
Q ss_pred cEEEEechHHHHHHhcCCCCCcchh------hhhhHHHHhhcccCcceeccCCCCCCCccCcCCCCh
Q psy15380 28 SIYCFTDSSVALCWAHSPSHLFNTF------VANRISKIQQNMKVDSLYHVSGTENPADGLSRGLLP 88 (1738)
Q Consensus 28 ~~~~wtDS~ivL~wI~~~~~~~~~f------VaNRv~~I~~~~~~~~w~hVpt~~NPAD~aSRG~~~ 88 (1738)
.-.+-|||++||+ +++.+| +||. ++...++-||||+-||||..|||...
T Consensus 142 ~r~l~tDnt~Vls------rkyts~PW~lac~A~w------iLrgts~~yVPS~~NPAD~PsR~~~~ 196 (245)
T PF00336_consen 142 ARCLGTDNTVVLS------RKYTSFPWLLACAANW------ILRGTSFYYVPSKYNPADDPSRGKLG 196 (245)
T ss_pred CcEEeecCcEEEe------cccccCcHHHHHHHHH------hhcCceEEEeccccCcCCCCCCCccc
Confidence 4447899999985 333333 3443 35668899999999999999999754
No 54
>cd06222 RnaseH RNase H (RNase HI) is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a non-sequence-specific manner. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. RNase H knockout mice lack mitochondrial DNA replication and die as embryos. The retroviral reverse transcriptase contains an RNase H domain that plays an important role in converting a single stranded retroviral genomic RNA into a dsDNA for integration into host chromosomes. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.
Probab=89.20 E-value=0.39 Score=48.73 Aligned_cols=75 Identities=12% Similarity=0.005 Sum_probs=49.7
Q ss_pred hhHHHHHHHHHHHHHHHhcCCccccccEEEEechHHHHHHhcCCCCCcchhhhhhHHHHhhcc---cCcceeccCC----
Q psy15380 2 FSTVILLMSSLLKVVIDSYTPRYPIDSIYCFTDSSVALCWAHSPSHLFNTFVANRISKIQQNM---KVDSLYHVSG---- 74 (1738)
Q Consensus 2 ~~~~~ll~a~l~~~~~~~~~~~~~i~~~~~wtDS~ivL~wI~~~~~~~~~fVaNRv~~I~~~~---~~~~w~hVpt---- 74 (1738)
|+++++.+.+.+. ...+..+.++|||..++.++.+.......-....+..|++.. ....++|||.
T Consensus 45 El~al~~al~~~~--------~~~~~~i~i~~Ds~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~i~~v~~h~~~ 116 (130)
T cd06222 45 ELLALIEALELAL--------ELGGKKVNIYTDSQYVINALTGWYEGKPVKNVDLWQRLLALLKRFHKVRFEWVPGHSGI 116 (130)
T ss_pred HHHHHHHHHHHHH--------hCCCceEEEEECHHHHHHHhhccccCCChhhHHHHHHHHHHHhCCCeEEEEEcCCCCCC
Confidence 4455555444443 257899999999999999998865422233444455555544 4588899999
Q ss_pred -CCCCCccCcC
Q psy15380 75 -TENPADGLSR 84 (1738)
Q Consensus 75 -~~NPAD~aSR 84 (1738)
..+.||..+|
T Consensus 117 ~~n~~ad~la~ 127 (130)
T cd06222 117 EGNERADALAK 127 (130)
T ss_pred cchHHHHHHHH
Confidence 5558886554
No 55
>PRK08719 ribonuclease H; Reviewed
Probab=88.43 E-value=0.72 Score=49.49 Aligned_cols=71 Identities=18% Similarity=0.195 Sum_probs=45.1
Q ss_pred eeeeeecccccc---------cceeEEEEEEecCCCcEEEEEEeeccccccCCCCccchhhHHHHHHHHHHHHHHHHhcC
Q psy15380 1622 FQLIGFSDASEK---------GYGALIFSRVSLPNGSIVIQLICAKSRVSPLKVESIPRLELCAILLMSSLLKVVIDSYT 1692 (1738)
Q Consensus 1622 ~~L~~F~DAS~~---------ayga~vy~r~~~~~g~~~~~l~~aKsrv~p~k~~tIprlEL~a~~~~~~l~~~~~~~l~ 1692 (1738)
-++++|+|+|-. |||+++| +++|.....+..+ +.+ ..|--|.||.|++.|++.+. .
T Consensus 3 ~~~~iYtDGs~~~n~~~~~~~G~G~vv~----~~~~~~~~~~~~~---~~~--~~Tnn~aEl~A~~~aL~~~~------~ 67 (147)
T PRK08719 3 ASYSIYIDGAAPNNQHGCVRGGIGLVVY----DEAGEIVDEQSIT---VNR--YTDNAELELLALIEALEYAR------D 67 (147)
T ss_pred ceEEEEEecccCCCCCCCCCcEEEEEEE----eCCCCeeEEEEec---CCC--CccHHHHHHHHHHHHHHHcC------C
Confidence 368899998872 6777665 2344432122111 111 25999999999999987542 1
Q ss_pred CCCCcccEEEEechhhhHHhh
Q psy15380 1693 PRYPIDSIYCFTDSSVALCWA 1713 (1738)
Q Consensus 1693 ~~~~~~~~~~~tDS~i~l~~i 1713 (1738)
...++|||+-++.=|
T Consensus 68 ------~~~i~tDS~yvi~~i 82 (147)
T PRK08719 68 ------GDVIYSDSDYCVRGF 82 (147)
T ss_pred ------CCEEEechHHHHHHH
Confidence 126999998886544
No 56
>PRK13907 rnhA ribonuclease H; Provisional
Probab=88.32 E-value=2.6 Score=43.98 Aligned_cols=74 Identities=19% Similarity=0.147 Sum_probs=50.1
Q ss_pred eeeecccccc------cceeEEEEEEecCCCcEEEEEEeeccccccCCCCccchhhHHHHHHHHHHHHHHHHhcCCCCCc
Q psy15380 1624 LIGFSDASEK------GYGALIFSRVSLPNGSIVIQLICAKSRVSPLKVESIPRLELCAILLMSSLLKVVIDSYTPRYPI 1697 (1738)
Q Consensus 1624 L~~F~DAS~~------ayga~vy~r~~~~~g~~~~~l~~aKsrv~p~k~~tIprlEL~a~~~~~~l~~~~~~~l~~~~~~ 1697 (1738)
+++|+|||-. |||+++ | +.+|.... + .+....|=.+.|+.|++.|++++.. ..+
T Consensus 2 ~~iy~DGa~~~~~g~~G~G~vi--~--~~~~~~~~----~----~~~~~~tn~~AE~~All~aL~~a~~--------~g~ 61 (128)
T PRK13907 2 IEVYIDGASKGNPGPSGAGVFI--K--GVQPAVQL----S----LPLGTMSNHEAEYHALLAALKYCTE--------HNY 61 (128)
T ss_pred EEEEEeeCCCCCCCccEEEEEE--E--ECCeeEEE----E----ecccccCCcHHHHHHHHHHHHHHHh--------CCC
Confidence 5689998764 455554 4 23443321 1 1223457789999999999887642 235
Q ss_pred ccEEEEechhhhHHhhcCCC
Q psy15380 1698 DSIYCFTDSSVALCWAHSPS 1717 (1738)
Q Consensus 1698 ~~~~~~tDS~i~l~~i~~~~ 1717 (1738)
..+.++|||+.++.++++..
T Consensus 62 ~~v~i~sDS~~vi~~~~~~~ 81 (128)
T PRK13907 62 NIVSFRTDSQLVERAVEKEY 81 (128)
T ss_pred CEEEEEechHHHHHHHhHHH
Confidence 67999999999999999743
No 57
>KOG4400|consensus
Probab=88.31 E-value=0.2 Score=59.18 Aligned_cols=35 Identities=23% Similarity=0.529 Sum_probs=29.2
Q ss_pred ccccccccCCCCCCCCCC--CCCCCccccccccCCCC
Q psy15380 722 DLCVNCLGTGHKANNCPS--KSNCNICQFRHNTLLHF 756 (1738)
Q Consensus 722 ~lCf~Clk~GH~a~~C~s--~~~C~~C~~~Hht~lh~ 756 (1738)
..||+|++.||+..+|+. ...|..|+..+|-...-
T Consensus 144 ~~Cy~Cg~~GH~s~~C~~~~~~~c~~c~~~~h~~~~C 180 (261)
T KOG4400|consen 144 AKCYSCGEQGHISDDCPENKGGTCFRCGKVGHGSRDC 180 (261)
T ss_pred CccCCCCcCCcchhhCCCCCCCccccCCCcceecccC
Confidence 569999999999999993 56899999988877643
No 58
>COG3577 Predicted aspartyl protease [General function prediction only]
Probab=88.11 E-value=0.96 Score=49.99 Aligned_cols=43 Identities=16% Similarity=0.254 Sum_probs=38.7
Q ss_pred eeeEEeeCCCCccccCHhhHhhhCCCcce--eeeEEeeccCCCcc
Q psy15380 889 EMRFLLDCASMSNILSLSACKQLGLKTLF--VATDLKGVGSISSP 931 (1738)
Q Consensus 889 ~v~aLLDSGS~vS~Is~~la~~LgL~~~~--~~i~I~g~g~~~~~ 931 (1738)
.+.+|+||||+.--++++.|++||+.... -++.+.++||...-
T Consensus 116 ~v~fLVDTGATsVal~~~dA~RlGid~~~l~y~~~v~TANG~~~A 160 (215)
T COG3577 116 KVDFLVDTGATSVALNEEDARRLGIDLNSLDYTITVSTANGRARA 160 (215)
T ss_pred EEEEEEecCcceeecCHHHHHHhCCCccccCCceEEEccCCcccc
Confidence 78999999999999999999999998776 68999999996544
No 59
>PRK06548 ribonuclease H; Provisional
Probab=86.04 E-value=2 Score=46.80 Aligned_cols=68 Identities=15% Similarity=0.071 Sum_probs=42.5
Q ss_pred eeeeeccccccc------ceeEEEEEEecCCCcEEEEEEeeccccccCCCCccchhhHHHHHHHHHHHHHHHHhcCCCCC
Q psy15380 1623 QLIGFSDASEKG------YGALIFSRVSLPNGSIVIQLICAKSRVSPLKVESIPRLELCAILLMSSLLKVVIDSYTPRYP 1696 (1738)
Q Consensus 1623 ~L~~F~DAS~~a------yga~vy~r~~~~~g~~~~~l~~aKsrv~p~k~~tIprlEL~a~~~~~~l~~~~~~~l~~~~~ 1696 (1738)
.+++|+|||-.+ ||++ +. .++ ....+....|=-|.||.|++.|++.+ . .+
T Consensus 5 ~~~IytDGa~~gnpg~~G~g~~--~~---~~~----------~~~g~~~~~TNnraEl~Aii~aL~~~-------~--~~ 60 (161)
T PRK06548 5 EIIAATDGSSLANPGPSGWAWY--VD---ENT----------WDSGGWDIATNNIAELTAVRELLIAT-------R--HT 60 (161)
T ss_pred EEEEEEeeccCCCCCceEEEEE--Ee---CCc----------EEccCCCCCCHHHHHHHHHHHHHHhh-------h--cC
Confidence 588999987663 4433 22 111 01223334688999999998876422 2 12
Q ss_pred cccEEEEechhhhHHhhc
Q psy15380 1697 IDSIYCFTDSSVALCWAH 1714 (1738)
Q Consensus 1697 ~~~~~~~tDS~i~l~~i~ 1714 (1738)
...+.++|||+.++.=|.
T Consensus 61 ~~~v~I~TDS~yvi~~i~ 78 (161)
T PRK06548 61 DRPILILSDSKYVINSLT 78 (161)
T ss_pred CceEEEEeChHHHHHHHH
Confidence 345899999999965444
No 60
>PF00075 RNase_H: RNase H; InterPro: IPR002156 The RNase H domain is responsible for hydrolysis of the RNA portion of RNA x DNA hybrids, and this activity requires the presence of divalent cations (Mg2+ or Mn2+) that bind its active site. This domain is a part of a large family of homologous RNase H enzymes of which the RNase HI protein from Escherichia coli is the best characterised. Secondary structure predictions for the enzymes from E. coli, yeast, human liver and diverse retroviruses (such as Rous sarcoma virus and the Foamy viruses) supported, in every case, the five beta-strands (1 to 5) and four or five alpha-helices (A, B/C, D, E) that have been identified by crystallography in the RNase H domain of Human immunodeficiency virus 1 (HIV-1) reverse transcriptase and in E. coli RNase H. Reverse transcriptase (RT) is a modular enzyme carrying polymerase and ribonuclease H (RNase H) activities in separable domains. Reverse transcriptase (RT) converts the single-stranded RNA genome of a retrovirus into a double-stranded DNA copy for integration into the host genome. This process requires ribonuclease H as well as RNA- and DNA-directed DNA polymerase activities. Retroviral RNase H is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. Bacterial RNase H (EC 3.1.26.4) catalyses endonucleolytic cleavage to 5'-phosphomonoester acting on RNA-DNA hybrids. The 3D structure of the RNase H domain from diverse bacteria and retroviruses has been solved. All have four beta strands and four to five alpha helices. The E. coli RNase H1 protein binds a single Mg2+ ion cofactor in the active site of the enzyme. The divalent cation is bound by the carboxyl groups of four acidic residues, Asp-10, Glu-48, Asp-70, and Asp-134. The first three acidic residues are highly conserved in all bacterial and retroviral RNase H sequences.; GO: 0003676 nucleic acid binding, 0004523 ribonuclease H activity; PDB: 3LP3_B 2KW4_A 3P1G_A 1RIL_A 2RPI_A 4EQJ_G 4EP2_B 3OTY_P 3U3G_D 2ZQB_D ....
Probab=82.93 E-value=1.5 Score=45.77 Aligned_cols=52 Identities=15% Similarity=0.064 Sum_probs=35.1
Q ss_pred ccEEEEechHHHHHHhcC-----CCCCcch--hhhhhHHHHhhcccCcceeccCCCCCC
Q psy15380 27 DSIYCFTDSSVALCWAHS-----PSHLFNT--FVANRISKIQQNMKVDSLYHVSGTENP 78 (1738)
Q Consensus 27 ~~~~~wtDS~ivL~wI~~-----~~~~~~~--fVaNRv~~I~~~~~~~~w~hVpt~~NP 78 (1738)
..+.++|||+.|+.+|.. ..+.-.. =+.+++.+....-....++|||+..|-
T Consensus 58 ~~v~I~tDS~~v~~~l~~~~~~~~~~~~~~~~~i~~~i~~~~~~~~~v~~~~V~~H~~~ 116 (132)
T PF00075_consen 58 RKVTIYTDSQYVLNALNKWLHGNGWKKTSNGRPIKNEIWELLSRGIKVRFRWVPGHSGV 116 (132)
T ss_dssp SEEEEEES-HHHHHHHHTHHHHTTSBSCTSSSBHTHHHHHHHHHSSEEEEEESSSSSSS
T ss_pred ccccccccHHHHHHHHHHhccccccccccccccchhheeeccccceEEeeeeccCcCCC
Confidence 789999999999998876 2221111 244566666544455889999999765
No 61
>PF13456 RVT_3: Reverse transcriptase-like; PDB: 3ALY_A 2EHG_A 3HST_B.
Probab=82.57 E-value=2.7 Score=40.04 Aligned_cols=56 Identities=20% Similarity=0.067 Sum_probs=41.6
Q ss_pred hhHHHHHHHHHHHHHHHHhcCCCCCcccEEEEechhhhHHhhcCCCccccchhhchHHHHHhhcC
Q psy15380 1672 LELCAILLMSSLLKVVIDSYTPRYPIDSIYCFTDSSVALCWAHSPSHLFNTFVANRISKIHQNMT 1736 (1738)
Q Consensus 1672 lEL~a~~~~~~l~~~~~~~l~~~~~~~~~~~~tDS~i~l~~i~~~~~~~~~fv~nRv~~I~~~~~ 1736 (1738)
.|+.|+..|++++. . +.+.++.+.|||..++..|++...... .....+.+|+.+.+
T Consensus 4 aE~~al~~al~~a~------~--~g~~~i~v~sDs~~vv~~i~~~~~~~~-~~~~~~~~i~~~~~ 59 (87)
T PF13456_consen 4 AEALALLEALQLAW------E--LGIRKIIVESDSQLVVDAINGRSSSRS-ELRPLIQDIRSLLD 59 (87)
T ss_dssp HHHHHHHHHHHHHH------C--CT-SCEEEEES-HHHHHHHTTSS---S-CCHHHHHHHHHHHC
T ss_pred HHHHHHHHHHHHHH------H--CCCCEEEEEecCccccccccccccccc-cccccchhhhhhhc
Confidence 58899998887764 2 458899999999999999998754444 77788888887765
No 62
>PF14787 zf-CCHC_5: GAG-polyprotein viral zinc-finger; PDB: 1CL4_A 1DSV_A.
Probab=81.41 E-value=0.87 Score=35.87 Aligned_cols=20 Identities=40% Similarity=0.591 Sum_probs=12.3
Q ss_pred cccccccccCCCCCCCCCCC
Q psy15380 721 HDLCVNCLGTGHKANNCPSK 740 (1738)
Q Consensus 721 ~~lCf~Clk~GH~a~~C~s~ 740 (1738)
.++|++|.+-.|.+.+|+++
T Consensus 2 ~~~CprC~kg~Hwa~~C~sk 21 (36)
T PF14787_consen 2 PGLCPRCGKGFHWASECRSK 21 (36)
T ss_dssp --C-TTTSSSCS-TTT---T
T ss_pred CccCcccCCCcchhhhhhhh
Confidence 46899999999999999987
No 63
>PRK07238 bifunctional RNase H/acid phosphatase; Provisional
Probab=80.52 E-value=7.7 Score=48.39 Aligned_cols=77 Identities=12% Similarity=0.086 Sum_probs=50.6
Q ss_pred eeeeecccccc------cceeEEEEEEecCCCcEEEEEEeeccccccCCCCccchhhHHHHHHHHHHHHHHHHhcCCCCC
Q psy15380 1623 QLIGFSDASEK------GYGALIFSRVSLPNGSIVIQLICAKSRVSPLKVESIPRLELCAILLMSSLLKVVIDSYTPRYP 1696 (1738)
Q Consensus 1623 ~L~~F~DAS~~------ayga~vy~r~~~~~g~~~~~l~~aKsrv~p~k~~tIprlEL~a~~~~~~l~~~~~~~l~~~~~ 1696 (1738)
.+++|+|||-. |+|++ +|. ++|.....-. +. ++...|-...|+.|++.|++++..+ .
T Consensus 2 ~~~i~~DGa~~~n~g~aG~G~v--i~~--~~~~~~~~~~--~~---~~~~~tnn~AE~~All~gL~~a~~~----g---- 64 (372)
T PRK07238 2 KVVVEADGGSRGNPGPAGYGAV--VWD--ADRGEVLAER--AE---AIGRATNNVAEYRGLIAGLEAAAEL----G---- 64 (372)
T ss_pred eEEEEecCCCCCCCCceEEEEE--EEe--CCCCcEEEEe--ec---ccCCCCchHHHHHHHHHHHHHHHhC----C----
Confidence 57899999866 44444 332 3332211111 11 1224567789999999998876532 2
Q ss_pred cccEEEEechhhhHHhhcCC
Q psy15380 1697 IDSIYCFTDSSVALCWAHSP 1716 (1738)
Q Consensus 1697 ~~~~~~~tDS~i~l~~i~~~ 1716 (1738)
+..+.+++||..++.-+++.
T Consensus 65 ~~~v~i~~DS~lvi~~i~~~ 84 (372)
T PRK07238 65 ATEVEVRMDSKLVVEQMSGR 84 (372)
T ss_pred CCeEEEEeCcHHHHHHhCCC
Confidence 56799999999999999864
No 64
>COG5550 Predicted aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=79.71 E-value=7.2 Score=39.94 Aligned_cols=68 Identities=15% Similarity=0.028 Sum_probs=45.1
Q ss_pred EEEEEEEEec-CCCceeeeEEeeCCCC-ccccCHhhHhhhCCCcceeeeEEeeccCCCcceeceEEEEEeccc
Q psy15380 874 GTVKILVLDV-YGKPHEMRFLLDCASM-SNILSLSACKQLGLKTLFVATDLKGVGSISSPVQGQVTMRFGSRF 944 (1738)
Q Consensus 874 ~t~~V~v~~~-~g~~~~v~aLLDSGS~-vS~Is~~la~~LgL~~~~~~i~I~g~g~~~~~~~~~v~v~i~~~~ 944 (1738)
.++++....+ +|...... |+|||.+ -..++.++|+++|++...+.-.+.+.|+.... +.....+...+
T Consensus 11 ~~v~~~f~~~~~Gd~~~~~-LiDTGFtg~lvlp~~vaek~~~~~~~~~~~~~a~~~~v~t--~V~~~~iki~g 80 (125)
T COG5550 11 VTVPVTFRLPGQGDFVYDE-LIDTGFTGYLVLPPQVAEKLGLPLFSTIRIVLADGGVVKT--SVALATIKIDG 80 (125)
T ss_pred eeEEEEEEecCCCcEEeee-EEecCCceeEEeCHHHHHhcCCCccCChhhhhhcCCEEEE--EEEEEEEEECC
Confidence 4567777774 46655545 9999999 78899999999999988644444455553222 33344455544
No 65
>PF13696 zf-CCHC_2: Zinc knuckle
Probab=78.51 E-value=0.84 Score=35.26 Aligned_cols=19 Identities=32% Similarity=0.835 Sum_probs=16.1
Q ss_pred ccccccccCCCCCCCCCCC
Q psy15380 722 DLCVNCLGTGHKANNCPSK 740 (1738)
Q Consensus 722 ~lCf~Clk~GH~a~~C~s~ 740 (1738)
-.|+.|.++||..++|+..
T Consensus 9 Y~C~~C~~~GH~i~dCP~~ 27 (32)
T PF13696_consen 9 YVCHRCGQKGHWIQDCPTN 27 (32)
T ss_pred CEeecCCCCCccHhHCCCC
Confidence 4799999999999999874
No 66
>PF13917 zf-CCHC_3: Zinc knuckle
Probab=75.74 E-value=1 Score=37.21 Aligned_cols=20 Identities=30% Similarity=0.692 Sum_probs=17.9
Q ss_pred hcccccccccCCCCCCCCCC
Q psy15380 720 QHDLCVNCLGTGHKANNCPS 739 (1738)
Q Consensus 720 ~~~lCf~Clk~GH~a~~C~s 739 (1738)
...+|.+|++.||.+-+|+.
T Consensus 3 ~~~~CqkC~~~GH~tyeC~~ 22 (42)
T PF13917_consen 3 ARVRCQKCGQKGHWTYECPN 22 (42)
T ss_pred CCCcCcccCCCCcchhhCCC
Confidence 45789999999999999994
No 67
>COG0328 RnhA Ribonuclease HI [DNA replication, recombination, and repair]
Probab=75.45 E-value=4.7 Score=43.38 Aligned_cols=69 Identities=17% Similarity=0.104 Sum_probs=45.3
Q ss_pred ChhHHHHHHHHHHHHHHHhcCCccccccEEEEechHHHHH----HhcCCCC-Ccch----hhhh--hHHHHhhcccC---
Q psy15380 1 MFSTVILLMSSLLKVVIDSYTPRYPIDSIYCFTDSSVALC----WAHSPSH-LFNT----FVAN--RISKIQQNMKV--- 66 (1738)
Q Consensus 1 ~~~~~~ll~a~l~~~~~~~~~~~~~i~~~~~wtDS~ivL~----wI~~~~~-~~~~----fVaN--Rv~~I~~~~~~--- 66 (1738)
||+++++.+-+.+.. +....+.++|||..|.. ||.+..+ .|++ .|-| ...++.++.+.
T Consensus 46 aEl~A~i~AL~~l~~--------~~~~~v~l~tDS~yv~~~i~~w~~~w~~~~w~~~~~~pvkn~dl~~~~~~~~~~~~~ 117 (154)
T COG0328 46 AELRALIEALEALKE--------LGACEVTLYTDSKYVVEGITRWIVKWKKNGWKTADKKPVKNKDLWEELDELLKRHEL 117 (154)
T ss_pred HHHHHHHHHHHHHHh--------cCCceEEEEecHHHHHHHHHHHHhhccccCccccccCccccHHHHHHHHHHHhhCCe
Confidence 677777777666533 47889999999998765 6533333 4553 3333 35555555554
Q ss_pred cceeccCCCCC
Q psy15380 67 DSLYHVSGTEN 77 (1738)
Q Consensus 67 ~~w~hVpt~~N 77 (1738)
..|++|++..+
T Consensus 118 v~~~WVkgH~g 128 (154)
T COG0328 118 VFWEWVKGHAG 128 (154)
T ss_pred EEEEEeeCCCC
Confidence 69999998765
No 68
>PF03732 Retrotrans_gag: Retrotransposon gag protein; InterPro: IPR005162 Transposable elements (TEs) promote various chromosomal rearrangements more efficiently, and often more specifically, than other cellular processes. Retrotransposons are structurally similar to retroviruses and are bounded by long terminal repeats. This entry represents eukaryotic Gag or capsid-related retrotransposon-related proteins. There is a central motif QGXXEXXXXXFXXLXXH that is common to Retroviridae gag-proteins, but is poorly conserved.
Probab=74.54 E-value=4.1 Score=39.56 Aligned_cols=52 Identities=15% Similarity=0.126 Sum_probs=38.7
Q ss_pred HHHHhccccHHHhhcCCCC-C---CCcHHHHHHHHHhHhccchhhHHHHHHHHhccc
Q psy15380 474 YLLSKLSGGALAVAAGIPP-T---EHNYQVIYDALLEKYDDKRNLATYYMDSLLNFK 526 (1738)
Q Consensus 474 ~L~s~L~G~A~~~i~~~~~-t---~~nY~~a~~~L~~~fg~~~~i~~~~~~~l~~~~ 526 (1738)
+...+|+|+|+.++..+.. . ..+|++..+.|.++|+.+.... ...++|.++.
T Consensus 2 ~~~~~L~g~A~~w~~~~~~~~~~~~~~W~~~~~~~~~~f~~~~~~~-~~~~~l~~l~ 57 (96)
T PF03732_consen 2 LFPSFLKGPARQWYRNLRPNEIRDFITWEEFKDAFRKRFFPPDRKE-QARQELNSLR 57 (96)
T ss_pred CchHhccCHHHHHHHHhHhcCCCCCCCHHHHHHHHHHHHhhhhccc-cchhhhhhhh
Confidence 3567899999999999853 2 3489999999999999976543 3444455544
No 69
>cd01709 RT_like_1 RT_like_1: A subfamily of reverse transcriptases (RTs). An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contains fungal mitochondrial introns and transposable elements that lack LTRs.
Probab=74.01 E-value=7.1 Score=47.21 Aligned_cols=111 Identities=13% Similarity=-0.053 Sum_probs=73.1
Q ss_pred chhHHHhhhcccCcceeecccccceeeeEecc-cCcceEEEEeccCCcccceEEEEEEeeeccccchHHHHHHHHHHHHh
Q psy15380 1208 NDLFCILLKFRLFPYALNGDITKMFLQIKLLS-DYWKFQKILWRFSNKEKIDVYELRVVIFGTKASPYLAQRTVKQLIDD 1286 (1738)
Q Consensus 1208 ~~l~~iL~r~r~~~~~~~~Dl~kaf~qv~l~~-~dr~~~~f~w~~~~~~~~~~y~f~r~pFGl~~SP~~~~~~l~~~l~~ 1286 (1738)
..|..+|.++....-.+ ++=+.|+..+|.. +|-+-... .-+=.-+|.|-..||.+++-.|..+-..
T Consensus 39 ~~Lm~vL~~~~~~~~wL--~li~r~L~APl~~~~dg~~~~~-----------r~r~rGtPqGgviSplLaNiyL~~lD~~ 105 (346)
T cd01709 39 STILAVLKFFGVPEKWL--DFFKKFLEAPLRFVADGPDAPP-----------RIRKRGTPMSHALSDVFGELVLFCLDFA 105 (346)
T ss_pred HHHHHHHHHhCCCHHHH--HHHHHHHhCceeecCCCCcccc-----------cccCCccCCCchhhHHHHHHHHHHHHHH
Confidence 34666666554444443 7777777777765 22110000 1223468999999999998888744333
Q ss_pred hhhcCcccccccccceeccceeeecCCHHHHHHHHHHHHHHHHhCCceEe
Q psy15380 1287 ESKNFPLACQYIGESLYMDDCVISFPSQKEAIEFFNQTVKLFASGGFRFT 1336 (1738)
Q Consensus 1287 ~~~~~p~~~~~i~~~~YvDDili~~~s~ee~~~~~~~v~~~l~~~Gf~l~ 1336 (1738)
....++ .....=|.||+++.+ +.+++.+....+.+.++..|+.++
T Consensus 106 v~~~~~----g~~l~RYaDD~vi~~-~~~~a~~aw~~i~~fl~~lGLelN 150 (346)
T cd01709 106 VNQATD----GGLLYRLHDDLWFWG-QPETCAKAWKAIQEFAKVMGLELN 150 (346)
T ss_pred HHhcCC----CceEEEEcCeEEEEc-CHHHHHHHHHHHHHHHHHcCceec
Confidence 333222 233445999999995 478999999999999999999994
No 70
>PRK13907 rnhA ribonuclease H; Provisional
Probab=73.48 E-value=4.4 Score=42.21 Aligned_cols=59 Identities=17% Similarity=0.110 Sum_probs=37.9
Q ss_pred cccccEEEEechHHHHHHhcCCCCCcchhhhhhHHHHhhcc---cCcceeccCCCCCC-CccCc
Q psy15380 24 YPIDSIYCFTDSSVALCWAHSPSHLFNTFVANRISKIQQNM---KVDSLYHVSGTENP-ADGLS 83 (1738)
Q Consensus 24 ~~i~~~~~wtDS~ivL~wI~~~~~~~~~fVaNRv~~I~~~~---~~~~w~hVpt~~NP-AD~aS 83 (1738)
..+..+.++|||..|..||.++-.. ..=...-+.+|+++. +...+.|||-..|= ||...
T Consensus 59 ~g~~~v~i~sDS~~vi~~~~~~~~~-~~~~~~l~~~~~~l~~~f~~~~~~~v~r~~N~~Ad~LA 121 (128)
T PRK13907 59 HNYNIVSFRTDSQLVERAVEKEYAK-NKMFAPLLEEALQYIKSFDLFFIKWIPSSQNKVADELA 121 (128)
T ss_pred CCCCEEEEEechHHHHHHHhHHHhc-ChhHHHHHHHHHHHHhcCCceEEEEcCchhchhHHHHH
Confidence 4567899999999999999874431 111223344554443 34556899998874 66543
No 71
>KOG0119|consensus
Probab=72.19 E-value=1.6 Score=53.56 Aligned_cols=42 Identities=24% Similarity=0.630 Sum_probs=34.5
Q ss_pred ccccccC-CCCCcccccccccCCHHHHHHHHHhcccccccccCCCCCCCCCCC
Q psy15380 689 NMCVLCS-EEHPLFKCNKFLKLSPQERYGVVKQHDLCVNCLGTGHKANNCPSK 740 (1738)
Q Consensus 689 ~~C~~C~-~~H~~~~C~~f~~~s~~eR~~~vk~~~lCf~Clk~GH~a~~C~s~ 740 (1738)
..|..|+ .+|.-.+|+. | +-...++|++|.-.||++++|+..
T Consensus 262 ~~c~~cg~~~H~q~~cp~--------r--~~~~~n~c~~cg~~gH~~~dc~~~ 304 (554)
T KOG0119|consen 262 RACRNCGSTGHKQYDCPG--------R--IPNTTNVCKICGPLGHISIDCKVN 304 (554)
T ss_pred ccccccCCCccccccCCc--------c--cccccccccccCCcccccccCCCc
Confidence 6899999 8999999994 2 112234999999999999999986
No 72
>smart00343 ZnF_C2HC zinc finger.
Probab=69.02 E-value=2.3 Score=31.31 Aligned_cols=17 Identities=47% Similarity=1.030 Sum_probs=15.4
Q ss_pred cccccccCCCCCCCCCC
Q psy15380 723 LCVNCLGTGHKANNCPS 739 (1738)
Q Consensus 723 lCf~Clk~GH~a~~C~s 739 (1738)
.|++|++.||.+++|+.
T Consensus 1 ~C~~CG~~GH~~~~C~~ 17 (26)
T smart00343 1 KCYNCGKEGHIARDCPK 17 (26)
T ss_pred CCccCCCCCcchhhCCc
Confidence 49999999999999984
No 73
>KOG4400|consensus
Probab=66.10 E-value=3.2 Score=49.11 Aligned_cols=39 Identities=31% Similarity=0.699 Sum_probs=34.6
Q ss_pred ccccccC-CCCCcccccccccCCHHHHHHHHHhcccccccccCCCCCCCCCCC
Q psy15380 689 NMCVLCS-EEHPLFKCNKFLKLSPQERYGVVKQHDLCVNCLGTGHKANNCPSK 740 (1738)
Q Consensus 689 ~~C~~C~-~~H~~~~C~~f~~~s~~eR~~~vk~~~lCf~Clk~GH~a~~C~s~ 740 (1738)
..|+.|+ .+|...+|+.- +.+.||.|...||.+++|++.
T Consensus 144 ~~Cy~Cg~~GH~s~~C~~~-------------~~~~c~~c~~~~h~~~~C~~~ 183 (261)
T KOG4400|consen 144 AKCYSCGEQGHISDDCPEN-------------KGGTCFRCGKVGHGSRDCPSK 183 (261)
T ss_pred CccCCCCcCCcchhhCCCC-------------CCCccccCCCcceecccCCcc
Confidence 5799999 69999999831 468999999999999999986
No 74
>PF00098 zf-CCHC: Zinc knuckle; InterPro: IPR001878 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence C-X2-C-X4-H-X4-C, where X can be any amino acid and each number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses, where they are required for viral genome packaging and for the early infection process. They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0003676 nucleic acid binding, 0008270 zinc ion binding; PDB: 2L44_A 1A1T_A 1WWG_A 1U6P_A 1WWD_A 1WWE_A 1A6B_B 1F6U_A 1MFS_A 1NCP_C ....
Probab=65.11 E-value=4.3 Score=27.34 Aligned_cols=16 Identities=31% Similarity=0.648 Sum_probs=14.3
Q ss_pred cccccC-CCCCcccccc
Q psy15380 690 MCVLCS-EEHPLFKCNK 705 (1738)
Q Consensus 690 ~C~~C~-~~H~~~~C~~ 705 (1738)
.|+.|+ .+|...+||+
T Consensus 2 ~C~~C~~~GH~~~~Cp~ 18 (18)
T PF00098_consen 2 KCFNCGEPGHIARDCPK 18 (18)
T ss_dssp BCTTTSCSSSCGCTSSS
T ss_pred cCcCCCCcCcccccCcc
Confidence 699999 8999999984
No 75
>PF14787 zf-CCHC_5: GAG-polyprotein viral zinc-finger; PDB: 1CL4_A 1DSV_A.
Probab=62.12 E-value=4.1 Score=32.31 Aligned_cols=18 Identities=22% Similarity=0.490 Sum_probs=10.5
Q ss_pred CccccccC-CCCCcccccc
Q psy15380 688 GNMCVLCS-EEHPLFKCNK 705 (1738)
Q Consensus 688 ~~~C~~C~-~~H~~~~C~~ 705 (1738)
...|+.|+ +.||.++|..
T Consensus 2 ~~~CprC~kg~Hwa~~C~s 20 (36)
T PF14787_consen 2 PGLCPRCGKGFHWASECRS 20 (36)
T ss_dssp --C-TTTSSSCS-TTT---
T ss_pred CccCcccCCCcchhhhhhh
Confidence 36799999 8999999974
No 76
>KOG3752|consensus
Probab=58.05 E-value=22 Score=43.25 Aligned_cols=72 Identities=21% Similarity=0.205 Sum_probs=49.5
Q ss_pred eeeeccccccc-------ceeEEEEEEecCCCcEEEEEEeeccccccCC--CCccchhhHHHHHHHHHHHHHHHHhcCCC
Q psy15380 1624 LIGFSDASEKG-------YGALIFSRVSLPNGSIVIQLICAKSRVSPLK--VESIPRLELCAILLMSSLLKVVIDSYTPR 1694 (1738)
Q Consensus 1624 L~~F~DAS~~a-------yga~vy~r~~~~~g~~~~~l~~aKsrv~p~k--~~tIprlEL~a~~~~~~l~~~~~~~l~~~ 1694 (1738)
+.|++|+|-.+ -|.+||.- +|.. +..-.|+. ..|--|-||.|+..| |++++.
T Consensus 213 ~vvytDGS~~~ng~~~~~AGyGvywg----~~~e-------~N~s~pv~~g~qtNnrAEl~Av~~A------Lkka~~-- 273 (371)
T KOG3752|consen 213 QVVYTDGSSSGNGRKSSRAGYGVYWG----PGHE-------LNVSGPLAGGRQTNNRAELIAAIEA------LKKARS-- 273 (371)
T ss_pred eEEEecCccccCCCCCCcceeEEeeC----CCCc-------ccccccCCCCcccccHHHHHHHHHH------HHHHHh--
Confidence 77899987655 33457743 3322 12233443 459999999999866 455555
Q ss_pred CCcccEEEEechhhhHHhhc
Q psy15380 1695 YPIDSIYCFTDSSVALCWAH 1714 (1738)
Q Consensus 1695 ~~~~~~~~~tDS~i~l~~i~ 1714 (1738)
.++..+++-|||.-+..||.
T Consensus 274 ~~~~kv~I~TDS~~~i~~l~ 293 (371)
T KOG3752|consen 274 KNINKVVIRTDSEYFINSLT 293 (371)
T ss_pred cCCCcEEEEechHHHHHHHH
Confidence 56779999999999988776
No 77
>PRK00203 rnhA ribonuclease H; Reviewed
Probab=55.78 E-value=14 Score=39.71 Aligned_cols=51 Identities=20% Similarity=0.257 Sum_probs=32.0
Q ss_pred ccEEEEechHHHHH----HhcC-CCCCcc----hhhhhh--HHHHhhccc--CcceeccCCCCC
Q psy15380 27 DSIYCFTDSSVALC----WAHS-PSHLFN----TFVANR--ISKIQQNMK--VDSLYHVSGTEN 77 (1738)
Q Consensus 27 ~~~~~wtDS~ivL~----wI~~-~~~~~~----~fVaNR--v~~I~~~~~--~~~w~hVpt~~N 77 (1738)
..+.++|||..|+. |+.. ..+.|+ .=|+|+ .++|.++.. ...|.|||+..+
T Consensus 62 ~~v~I~tDS~yvi~~i~~w~~~Wk~~~~~~~~g~~v~n~dl~~~i~~l~~~~~v~~~wV~~H~~ 125 (150)
T PRK00203 62 CEVTLYTDSQYVRQGITEWIHGWKKNGWKTADKKPVKNVDLWQRLDAALKRHQIKWHWVKGHAG 125 (150)
T ss_pred CeEEEEECHHHHHHHHHHHHHHHHHcCCcccCCCccccHHHHHHHHHHhccCceEEEEecCCCC
Confidence 57899999987765 4422 011122 246676 566666554 378999997764
No 78
>PRK06548 ribonuclease H; Provisional
Probab=53.68 E-value=21 Score=39.00 Aligned_cols=50 Identities=18% Similarity=0.102 Sum_probs=30.8
Q ss_pred ccEEEEechHHHHHHhc--------CCCC-Ccchhhhhh--HHHHhhccc--CcceeccCCCC
Q psy15380 27 DSIYCFTDSSVALCWAH--------SPSH-LFNTFVANR--ISKIQQNMK--VDSLYHVSGTE 76 (1738)
Q Consensus 27 ~~~~~wtDS~ivL~wI~--------~~~~-~~~~fVaNR--v~~I~~~~~--~~~w~hVpt~~ 76 (1738)
..+.++|||+.|+.=|. +.-+ .-+.-|.|| +.+|.++.. ..+|.||+...
T Consensus 62 ~~v~I~TDS~yvi~~i~~W~~~Wk~~gWk~s~G~pV~N~dL~~~l~~l~~~~~v~~~wVkgHs 124 (161)
T PRK06548 62 RPILILSDSKYVINSLTKWVYSWKMRKWRKADGKPVLNQEIIQEIDSLMENRNIRMSWVNAHT 124 (161)
T ss_pred ceEEEEeChHHHHHHHHHHHHHHHHCCCcccCCCccccHHHHHHHHHHHhcCceEEEEEecCC
Confidence 46899999999885443 2222 123457877 344444332 37899998753
No 79
>PRK07708 hypothetical protein; Validated
Probab=48.87 E-value=25 Score=40.44 Aligned_cols=53 Identities=23% Similarity=0.243 Sum_probs=33.3
Q ss_pred cEEEEechHHHHHHhcCCCCC----cchhhhhhHHHHhhcccC-cceeccCCCCCC-Ccc
Q psy15380 28 SIYCFTDSSVALCWAHSPSHL----FNTFVANRISKIQQNMKV-DSLYHVSGTENP-ADG 81 (1738)
Q Consensus 28 ~~~~wtDS~ivL~wI~~~~~~----~~~fVaNRv~~I~~~~~~-~~w~hVpt~~NP-AD~ 81 (1738)
.|.+++||..|..|+.++-+. ...|. .++.++-+.... ..+.|||-++|= ||-
T Consensus 142 ~V~I~~DSqlVi~qi~g~wk~~~~~l~~y~-~~i~~l~~~~~l~~~~~~VpR~~N~~AD~ 200 (219)
T PRK07708 142 PVTFRGDSQVVLNQLAGEWPCYDEHLNHWL-DRIEQKLKQLKLTPVYEPISRKQNKEADQ 200 (219)
T ss_pred eEEEEeccHHHHHHhCCCceeCChhHHHHH-HHHHHHHhhCCceEEEEECCchhhhHHHH
Confidence 489999999999999876332 22233 333332222332 567899998884 543
No 80
>PRK07238 bifunctional RNase H/acid phosphatase; Provisional
Probab=46.32 E-value=25 Score=43.88 Aligned_cols=74 Identities=8% Similarity=0.010 Sum_probs=46.8
Q ss_pred hhHHHHHHHHHHHHHHHhcCCccccccEEEEechHHHHHHhcCCCC----CcchhhhhhHHHHhhcccCcceeccCCCCC
Q psy15380 2 FSTVILLMSSLLKVVIDSYTPRYPIDSIYCFTDSSVALCWAHSPSH----LFNTFVANRISKIQQNMKVDSLYHVSGTEN 77 (1738)
Q Consensus 2 ~~~~~ll~a~l~~~~~~~~~~~~~i~~~~~wtDS~ivL~wI~~~~~----~~~~fVaNRv~~I~~~~~~~~w~hVpt~~N 77 (1738)
|+++.+.+-+++. ...+..+.++|||..|..-|.++-+ .+..+. ..+.+.........|.|||-++|
T Consensus 49 E~~All~gL~~a~--------~~g~~~v~i~~DS~lvi~~i~~~~~~~~~~l~~~~-~~i~~l~~~f~~~~i~~v~r~~N 119 (372)
T PRK07238 49 EYRGLIAGLEAAA--------ELGATEVEVRMDSKLVVEQMSGRWKVKHPDMKPLA-AQARELASQFGRVTYTWIPRARN 119 (372)
T ss_pred HHHHHHHHHHHHH--------hCCCCeEEEEeCcHHHHHHhCCCCccCChHHHHHH-HHHHHHHhcCCceEEEECCchhh
Confidence 4555555445442 2356789999999999999976432 122222 34455555556688999998776
Q ss_pred C-CccCcC
Q psy15380 78 P-ADGLSR 84 (1738)
Q Consensus 78 P-AD~aSR 84 (1738)
= ||...+
T Consensus 120 ~~AD~LA~ 127 (372)
T PRK07238 120 AHADRLAN 127 (372)
T ss_pred hHHHHHHH
Confidence 4 665444
No 81
>PF13696 zf-CCHC_2: Zinc knuckle
Probab=42.27 E-value=10 Score=29.63 Aligned_cols=19 Identities=26% Similarity=0.665 Sum_probs=16.4
Q ss_pred cCccccccC-CCCCcccccc
Q psy15380 687 RGNMCVLCS-EEHPLFKCNK 705 (1738)
Q Consensus 687 ~~~~C~~C~-~~H~~~~C~~ 705 (1738)
..-.|..|+ .+|++.+||.
T Consensus 7 ~~Y~C~~C~~~GH~i~dCP~ 26 (32)
T PF13696_consen 7 PGYVCHRCGQKGHWIQDCPT 26 (32)
T ss_pred CCCEeecCCCCCccHhHCCC
Confidence 346899999 7899999996
No 82
>PF00336 DNA_pol_viral_C: DNA polymerase (viral) C-terminal domain; InterPro: IPR001462 This domain is at the C terminus of hepatitis B-type virus P proteins and represents a functional domain that controls the RNase H activities of the protein. The domain is always associated with IPR000201 from INTERPRO.; GO: 0004523 ribonuclease H activity
Probab=41.06 E-value=20 Score=39.60 Aligned_cols=60 Identities=27% Similarity=0.186 Sum_probs=38.0
Q ss_pred eeeecccccccceeEEEEEEecCCCcEEEEEEeeccccccCCCCccchhhHHHHHHHHHHHHHHHHhcCCCCCcccEEEE
Q psy15380 1624 LIGFSDASEKGYGALIFSRVSLPNGSIVIQLICAKSRVSPLKVESIPRLELCAILLMSSLLKVVIDSYTPRYPIDSIYCF 1703 (1738)
Q Consensus 1624 L~~F~DAS~~ayga~vy~r~~~~~g~~~~~l~~aKsrv~p~k~~tIprlEL~a~~~~~~l~~~~~~~l~~~~~~~~~~~~ 1703 (1738)
--||+||+..+||.++- .|.....+ ++..-|--.||+|+-+|-- -+..-.+.
T Consensus 95 c~VfaDATpTgwgi~i~------~~~~~~Tf---------s~~l~IhtaELlaaClAr~-------------~~~~r~l~ 146 (245)
T PF00336_consen 95 CQVFADATPTGWGISIT------GQRMRGTF---------SKPLPIHTAELLAACLARL-------------MSGARCLG 146 (245)
T ss_pred CceeccCCCCcceeeec------Cceeeeee---------cccccchHHHHHHHHHHHh-------------ccCCcEEe
Confidence 35799999999998733 22222222 1234566789999876622 12233478
Q ss_pred echhhhHH
Q psy15380 1704 TDSSVALC 1711 (1738)
Q Consensus 1704 tDS~i~l~ 1711 (1738)
|||++||+
T Consensus 147 tDnt~Vls 154 (245)
T PF00336_consen 147 TDNTVVLS 154 (245)
T ss_pred ecCcEEEe
Confidence 99999875
No 83
>PF12353 eIF3g: Eukaryotic translation initiation factor 3 subunit G; InterPro: IPR024675 At least eleven different protein factors are involved in initiation of protein synthesis in eukaryotes. Binding of initiator tRNA and mRNA to the 40S subunit requires the presence of the translation initiation factors eIF-2 and eIF-3, with eIF-3 being particularly important for 80S ribosome dissociation and mRNA binding. eIF-3 is the most complex translation initiation factor, consisting of about 13 putative subunits and having a molecular weight of between 550 and 700 kDa in mammalian cells. Subunits are designated eIF-3a to eIF-3m; the large number of subunits means that the interactions between the individual subunits that make up the eIF-3 complex are complex and varied. Subunit G is required for eIF3 integrity. This entry represents a domain of approximately 130 amino acids in length found at the N terminus of eukaryotic translation initiation factor 3 subunit G. This domain is commonly found in association with the RNA recognition domain PF00076 from PFAM.
Probab=40.59 E-value=13 Score=39.04 Aligned_cols=19 Identities=26% Similarity=0.530 Sum_probs=16.8
Q ss_pred ccCccccccCCCCCccccc
Q psy15380 686 MRGNMCVLCSEEHPLFKCN 704 (1738)
Q Consensus 686 ~~~~~C~~C~~~H~~~~C~ 704 (1738)
.....|.+|+++||...||
T Consensus 104 ~~~v~CR~CkGdH~T~~CP 122 (128)
T PF12353_consen 104 KSKVKCRICKGDHWTSKCP 122 (128)
T ss_pred CceEEeCCCCCCcccccCC
Confidence 4456899999999999999
No 84
>PF14392 zf-CCHC_4: Zinc knuckle
Probab=38.76 E-value=11 Score=32.37 Aligned_cols=17 Identities=35% Similarity=1.042 Sum_probs=13.7
Q ss_pred ccccccccCCCCCCCCC
Q psy15380 722 DLCVNCLGTGHKANNCP 738 (1738)
Q Consensus 722 ~lCf~Clk~GH~a~~C~ 738 (1738)
..|++|+.-||..++|+
T Consensus 32 ~~C~~C~~~gH~~~~C~ 48 (49)
T PF14392_consen 32 RFCFHCGRIGHSDKECP 48 (49)
T ss_pred hhhcCCCCcCcCHhHcC
Confidence 56888888888888886
No 85
>KOG4768|consensus
Probab=35.98 E-value=28 Score=44.54 Aligned_cols=131 Identities=17% Similarity=0.134 Sum_probs=87.9
Q ss_pred cccchhHHHhhhcccCcceeecccccceeeeEecc---------cCcceEEE-Ee--ccCC--cccceEEEEEEeeeccc
Q psy15380 1205 KMYNDLFCILLKFRLFPYALNGDITKMFLQIKLLS---------DYWKFQKI-LW--RFSN--KEKIDVYELRVVIFGTK 1270 (1738)
Q Consensus 1205 ~~~~~l~~iL~r~r~~~~~~~~Dl~kaf~qv~l~~---------~dr~~~~f-~w--~~~~--~~~~~~y~f~r~pFGl~ 1270 (1738)
.-...++.+.--||+..||+..||++.|--|+-++ .|..+... .| +.|- +..--.|.|...|.|--
T Consensus 344 Sc~tAi~~~~n~f~gcnw~ie~DLkkcfdtIphd~LI~eL~~rIkdk~fidL~~kll~AGy~ten~ry~~~~lGtpqgsv 423 (796)
T KOG4768|consen 344 SCKTAILKTHNLFRGCNWFIEVDLKKCFDTIPHDELIIELQKRIKDKGFIDLNYKLLRAGYTTENARYHVEFLGTPQGSV 423 (796)
T ss_pred hhhHHHHHHHHHhhccceEEechHHHHhccccHHHHHHHHHHHHhhhhHHHHHHHHHhcCccccccceeccccccccccc
Confidence 33557889999999999999999999999887543 22222111 12 1121 12234778889999999
Q ss_pred cchHHHHHHHHHHHHhhhh----cCcc-------------c------------------------------cccccc---
Q psy15380 1271 ASPYLAQRTVKQLIDDESK----NFPL-------------A------------------------------CQYIGE--- 1300 (1738)
Q Consensus 1271 ~SP~~~~~~l~~~l~~~~~----~~p~-------------~------------------------------~~~i~~--- 1300 (1738)
.||-+.+..|+++-+.+.. +|.. + ......
T Consensus 424 vspil~nifL~~LDk~Lenk~~nefn~~~~~~~rhs~yr~L~~~iakaKl~s~~sKtirlrd~~qrn~~n~~~gfkr~~y 503 (796)
T KOG4768|consen 424 VSPILCNIFLRELDKRLENKHRNEFNAGHLRSARHSKYRNLSDSIAKAKLTSGMSKTIRLRDGVQRNETNDTAGFKRLMY 503 (796)
T ss_pred CCchhHHHHHHHHHHHHHHHHHhhhcccCcccccChhhhhHHHHHHHHHHhhhhhhhhhhhhcccccccCCccccceeeE
Confidence 9999888777655432222 1110 0 000111
Q ss_pred ceeccceeeec-CCHHHHHHHHHHHHHHHHhCCceE
Q psy15380 1301 SLYMDDCVISF-PSQKEAIEFFNQTVKLFASGGFRF 1335 (1738)
Q Consensus 1301 ~~YvDDili~~-~s~ee~~~~~~~v~~~l~~~Gf~l 1335 (1738)
.-|-||++++. -|..+..+.++.+...+.+-|+..
T Consensus 504 VRyadd~ii~v~GS~nd~K~ilr~In~f~sslGls~ 539 (796)
T KOG4768|consen 504 VRYADDIIIGVWGSVNDCKQILRDINNFLSSLGLSN 539 (796)
T ss_pred EEecCCEEEEEeccHHHHHHHHHHHHHHHHhhCccc
Confidence 23679999987 489999999999999999999875
No 86
>PF14244 UBN2_3: gag-polypeptide of LTR copia-type
Probab=34.23 E-value=1.1e+02 Score=32.93 Aligned_cols=84 Identities=19% Similarity=0.290 Sum_probs=48.9
Q ss_pred CCCCCCCCCCcCChHHHHHHHHHHhccCCC---C------ChH-------------HHHHHHHHhccccHHHhhcCCCCC
Q psy15380 436 LEIPTFDGACIGNWPLFIEMYRINIHNRTD---L------TNA-------------HKLQYLLSKLSGGALAVAAGIPPT 493 (1738)
Q Consensus 436 ~~lp~F~G~~~~~~~~f~~~F~~~v~~~~~---l------~d~-------------~K~~~L~s~L~G~A~~~i~~~~~t 493 (1738)
+..++++|. +|..|..+++.++....- + ++. .=+..|+..++.+-...|..+.
T Consensus 8 i~~~kL~g~---NY~~W~~~~~~~L~~~~l~~~i~g~~~~P~~~~~~~~~W~~~d~~v~swl~~sis~~i~~~i~~~~-- 82 (152)
T PF14244_consen 8 ITSIKLNGS---NYLSWSQQMEMALRGKGLWGFIDGTIPKPPETDPAYEKWERKDQLVLSWLLNSISPDILSTIIFCE-- 82 (152)
T ss_pred ccccCCCCc---cHHHHHHHHHHHHHhCCCcccccCccccccccchhhhhHHHhhhHHHHHHHHhhcHHHHhhhHhhh--
Confidence 444788887 688888888886653221 1 111 1123444555555555544433
Q ss_pred CCcHHHHHHHHHhHhccchh--hHHHHHHHHhccc
Q psy15380 494 EHNYQVIYDALLEKYDDKRN--LATYYMDSLLNFK 526 (1738)
Q Consensus 494 ~~nY~~a~~~L~~~fg~~~~--i~~~~~~~l~~~~ 526 (1738)
.=.++|+.|+++|++... -+-.+.++|..+.
T Consensus 83 --tak~~W~~L~~~f~~~~~~~r~~~L~~~l~~~k 115 (152)
T PF14244_consen 83 --TAKEIWDALKERFSQKSNASRVFQLRNELHSLK 115 (152)
T ss_pred --hHHHHHHHHHHHhhcccHHHHHHHHHHHHHHHh
Confidence 446789999999987772 2233445555544
No 87
>KOG1005|consensus
Probab=27.48 E-value=1.1e+02 Score=41.20 Aligned_cols=90 Identities=18% Similarity=0.178 Sum_probs=65.4
Q ss_pred EEEEEeeeccccchHHHHHHHHHHHHhhhhcCcccccc-cccceeccceeeecCCHHHHHHHHHHHHHHHHhCCceEeee
Q psy15380 1260 YELRVVIFGTKASPYLAQRTVKQLIDDESKNFPLACQY-IGESLYMDDCVISFPSQKEAIEFFNQTVKLFASGGFRFTKW 1338 (1738)
Q Consensus 1260 y~f~r~pFGl~~SP~~~~~~l~~~l~~~~~~~p~~~~~-i~~~~YvDDili~~~s~ee~~~~~~~v~~~l~~~Gf~l~k~ 1338 (1738)
-+=..+|.|-.-|..+..-.+..+.+.+-.- ...+. +...-||||.|+-+.+.+++...++-+ .+||+...+
T Consensus 630 vq~~GIpQGs~LSslLc~lyy~dle~~y~~~--~~~~g~~vLlR~vDDFLfITt~~~~a~kfl~~l-----~~Gf~~yn~ 702 (888)
T KOG1005|consen 630 VQKKGIPQGSILSSLLCHLYYGDLEDKYFSF--EKEDGSIVLLRYVDDFLFITTENDQAKKFLKLL-----SRGFNKYNF 702 (888)
T ss_pred EEecCccCCCchhHHHHHHHHHhHHHHHhhc--ccCCCcEEEEEeecceEEEecCHHHHHHHHHHH-----hccccccce
Confidence 3456799999999999998888887766432 12222 234459999999999999999987766 457777777
Q ss_pred ccCChHhhhcCCCccchhh
Q psy15380 1339 SSNSPEILEQIPIHDRLAE 1357 (1738)
Q Consensus 1339 ~SN~~~vl~~i~~~~~~~~ 1357 (1738)
.+|.++.. +++.++....
T Consensus 703 ~tn~~K~~-nF~~se~~~~ 720 (888)
T KOG1005|consen 703 FTNEPKTV-NFEVSEECGA 720 (888)
T ss_pred eccCcccc-cccchhccCc
Confidence 77887765 6676665543
No 88
>PF13917 zf-CCHC_3: Zinc knuckle
Probab=23.87 E-value=40 Score=28.22 Aligned_cols=19 Identities=21% Similarity=0.564 Sum_probs=16.2
Q ss_pred cCccccccC-CCCCcccccc
Q psy15380 687 RGNMCVLCS-EEHPLFKCNK 705 (1738)
Q Consensus 687 ~~~~C~~C~-~~H~~~~C~~ 705 (1738)
....|..|+ .+|+.++|+.
T Consensus 3 ~~~~CqkC~~~GH~tyeC~~ 22 (42)
T PF13917_consen 3 ARVRCQKCGQKGHWTYECPN 22 (42)
T ss_pred CCCcCcccCCCCcchhhCCC
Confidence 356899999 8999999993
No 89
>PRK08719 ribonuclease H; Reviewed
Probab=22.81 E-value=1.3e+02 Score=32.28 Aligned_cols=48 Identities=17% Similarity=0.056 Sum_probs=26.6
Q ss_pred EEEEechHHHHHHh--------cCCCCC-cchhhhhh--HHHHhhcc--cCcceeccCCCC
Q psy15380 29 IYCFTDSSVALCWA--------HSPSHL-FNTFVANR--ISKIQQNM--KVDSLYHVSGTE 76 (1738)
Q Consensus 29 ~~~wtDS~ivL~wI--------~~~~~~-~~~fVaNR--v~~I~~~~--~~~~w~hVpt~~ 76 (1738)
..++|||.-|+.=| ++.=++ -..-|.|+ ..+|.++. ...+|.|||...
T Consensus 69 ~~i~tDS~yvi~~i~~~~~~W~~~~w~~s~g~~v~n~dl~~~i~~l~~~~~i~~~~VkgH~ 129 (147)
T PRK08719 69 DVIYSDSDYCVRGFNEWLDTWKQKGWRKSDKKPVANRDLWQQVDELRARKYVEVEKVTAHS 129 (147)
T ss_pred CEEEechHHHHHHHHHHHHHHHhCCcccCCCcccccHHHHHHHHHHhCCCcEEEEEecCCC
Confidence 37999998887544 222221 12346563 12333332 237799999865
No 90
>PF15288 zf-CCHC_6: Zinc knuckle
Probab=22.09 E-value=49 Score=27.32 Aligned_cols=19 Identities=37% Similarity=0.725 Sum_probs=14.9
Q ss_pred ccccccccCCCCC--CCCCCC
Q psy15380 722 DLCVNCLGTGHKA--NNCPSK 740 (1738)
Q Consensus 722 ~lCf~Clk~GH~a--~~C~s~ 740 (1738)
..|-+|+..||.. +.|+-+
T Consensus 2 ~kC~~CG~~GH~~t~k~CP~~ 22 (40)
T PF15288_consen 2 VKCKNCGAFGHMRTNKRCPMY 22 (40)
T ss_pred ccccccccccccccCccCCCC
Confidence 4699999999996 566654
Done!