Query psy9877
Match_columns 1447
No_of_seqs 498 out of 2246
Neff 7.9
Searched_HMMs 46136
Date Fri Aug 16 21:55:33 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy9877.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/9877hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 cd01644 RT_pepA17 RT_pepA17: R 100.0 7.2E-41 1.6E-45 366.1 15.3 195 619-837 1-213 (213)
2 PF05380 Peptidase_A17: Pao re 100.0 2.1E-34 4.6E-39 302.0 12.7 159 854-1020 1-159 (159)
3 cd03715 RT_ZFREV_like RT_ZFREV 100.0 1.2E-27 2.6E-32 264.7 18.4 198 588-832 10-210 (210)
4 cd01645 RT_Rtv RT_Rtv: Reverse 99.9 6E-26 1.3E-30 251.1 15.8 199 589-831 11-212 (213)
5 PF05585 DUF1758: Putative pep 99.9 4.7E-22 1E-26 210.9 14.1 156 330-491 1-160 (164)
6 cd01647 RT_LTR RT_LTR: Reverse 99.8 5.6E-21 1.2E-25 205.2 12.3 176 605-831 1-176 (177)
7 PF03564 DUF1759: Protein of u 99.6 4.4E-16 9.5E-21 162.0 10.0 101 2-108 38-140 (145)
8 cd03714 RT_DIRS1 RT_DIRS1: Rev 99.6 6.8E-16 1.5E-20 154.8 9.6 117 681-831 1-118 (119)
9 PF00078 RVT_1: Reverse transc 99.0 2.9E-10 6.3E-15 126.3 7.5 139 668-831 56-213 (214)
10 cd00304 RT_like RT_like: Rever 98.7 1.8E-08 4E-13 97.3 5.8 87 733-832 12-98 (98)
11 PF00336 DNA_pol_viral_C: DNA 98.7 1.4E-08 3.1E-13 104.1 5.2 157 869-1074 39-196 (245)
12 PF08284 RVP_2: Retroviral asp 98.5 7.1E-07 1.5E-11 91.0 11.2 84 342-460 32-116 (135)
13 cd05484 retropepsin_like_LTR_2 98.5 5.7E-07 1.2E-11 85.6 9.0 71 336-413 8-78 (91)
14 PF12384 Peptidase_A2B: Ty3 tr 98.4 1.7E-06 3.6E-11 87.1 9.8 71 342-415 45-115 (177)
15 cd05479 RP_DDI RP_DDI; retrope 98.3 3E-06 6.6E-11 85.5 10.5 93 325-460 16-111 (124)
16 cd06222 RnaseH RNase H (RNase 98.2 2.7E-06 6E-11 85.3 8.1 112 940-1071 2-128 (130)
17 PF13650 Asp_protease_2: Aspar 98.1 1.2E-05 2.6E-10 76.0 9.8 67 342-413 9-77 (90)
18 PF00077 RVP: Retroviral aspar 98.1 5.8E-06 1.3E-10 80.1 6.4 76 326-414 6-81 (100)
19 PF09668 Asp_protease: Asparty 98.1 8.5E-06 1.8E-10 80.7 7.2 95 324-460 23-119 (124)
20 PF00075 RNase_H: RNase H; In 98.0 1.3E-05 2.8E-10 81.9 8.1 100 937-1064 3-116 (132)
21 PRK07708 hypothetical protein; 97.8 7.1E-05 1.5E-09 82.6 9.6 120 937-1071 73-204 (219)
22 PRK13907 rnhA ribonuclease H; 97.8 5.3E-05 1.1E-09 77.1 7.9 112 938-1071 2-123 (128)
23 TIGR02281 clan_AA_DTGA clan AA 97.7 0.00011 2.4E-09 73.7 8.8 52 327-384 13-65 (121)
24 cd06095 RP_RTVL_H_like Retrope 97.7 0.00012 2.7E-09 68.7 8.4 42 336-383 6-47 (86)
25 cd01648 TERT TERT: Telomerase 97.7 3.4E-05 7.4E-10 77.3 4.9 94 732-833 19-119 (119)
26 cd05481 retropepsin_like_LTR_1 97.7 0.00011 2.3E-09 70.0 8.0 69 342-414 10-81 (93)
27 cd05483 retropepsin_like_bacte 97.7 0.00016 3.4E-09 69.2 9.3 64 326-396 3-66 (96)
28 COG0328 RnhA Ribonuclease HI [ 97.6 0.00033 7.1E-09 72.1 9.7 113 937-1071 3-141 (154)
29 PF13975 gag-asp_proteas: gag- 97.6 0.00013 2.8E-09 66.0 6.0 58 325-388 8-66 (72)
30 cd05480 NRIP_C NRIP_C; putativ 97.5 0.00015 3.2E-09 67.6 5.6 27 342-368 9-35 (103)
31 PF07727 RVT_2: Reverse transc 97.5 4E-05 8.7E-10 86.7 1.4 179 634-843 35-220 (246)
32 cd03487 RT_Bac_retron_II RT_Ba 97.4 4.2E-05 9.1E-10 85.2 0.9 143 671-835 52-198 (214)
33 cd01650 RT_nLTR_like RT_nLTR: 97.2 0.0003 6.6E-09 78.6 4.8 105 673-817 79-186 (220)
34 PRK00203 rnhA ribonuclease H; 97.1 0.0027 5.9E-08 66.4 10.5 109 938-1070 4-137 (150)
35 PF13456 RVT_3: Reverse transc 97.1 0.00016 3.5E-09 67.7 0.9 76 989-1070 3-82 (87)
36 PRK07238 bifunctional RNase H/ 97.1 0.001 2.2E-08 80.8 7.4 117 937-1072 2-129 (372)
37 PRK08719 ribonuclease H; Revie 97.0 0.0017 3.8E-08 67.3 8.0 110 936-1068 3-140 (147)
38 PF00098 zf-CCHC: Zinc knuckle 97.0 0.00034 7.4E-09 44.7 1.4 18 267-284 1-18 (18)
39 cd01646 RT_Bac_retron_I RT_Bac 96.9 0.0019 4.1E-08 68.3 6.7 94 729-834 50-146 (158)
40 TIGR03698 clan_AA_DTGF clan AA 96.9 0.0059 1.3E-07 59.8 9.5 56 327-384 1-57 (107)
41 KOG0012|consensus 96.6 0.0021 4.6E-08 73.0 4.8 28 342-369 246-273 (380)
42 PF14223 UBN2: gag-polypeptide 96.6 0.015 3.2E-07 58.4 10.3 105 21-137 4-110 (119)
43 cd00303 retropepsin_like Retro 96.4 0.017 3.8E-07 52.8 9.2 69 342-413 9-78 (92)
44 COG5082 AIR1 Arginine methyltr 96.3 0.0019 4.1E-08 67.8 2.3 71 234-304 59-138 (190)
45 PRK06548 ribonuclease H; Provi 96.3 0.025 5.5E-07 59.4 10.2 81 984-1071 39-138 (161)
46 COG5082 AIR1 Arginine methyltr 96.2 0.0024 5.1E-08 67.1 2.3 34 265-301 59-92 (190)
47 PF02160 Peptidase_A3: Caulifl 96.1 0.0051 1.1E-07 65.9 4.2 56 327-384 6-61 (201)
48 PF14227 UBN2_2: gag-polypepti 96.1 0.041 8.8E-07 55.1 10.5 104 21-137 5-108 (119)
49 cd06094 RP_Saci_like RP_Saci_l 96.1 0.0097 2.1E-07 55.1 5.2 51 342-396 9-59 (89)
50 cd01651 RT_G2_intron RT_G2_int 96.0 0.011 2.3E-07 66.3 6.3 100 731-831 125-225 (226)
51 PTZ00368 universal minicircle 94.8 0.021 4.5E-07 59.6 3.3 61 236-303 53-120 (148)
52 PTZ00368 universal minicircle 94.7 0.024 5.2E-07 59.2 3.4 63 235-304 27-95 (148)
53 cd05482 HIV_retropepsin_like R 94.7 0.058 1.3E-06 50.5 5.5 53 342-397 9-61 (87)
54 PF12382 Peptidase_A2E: Retrot 94.6 0.093 2E-06 48.3 6.4 70 342-416 47-117 (137)
55 PF09337 zf-H2C2: His(2)-Cys(2 94.6 0.016 3.5E-07 45.2 1.2 25 1267-1292 1-25 (39)
56 KOG3752|consensus 94.1 0.18 4E-06 58.6 8.9 82 984-1071 253-361 (371)
57 PF14787 zf-CCHC_5: GAG-polypr 91.7 0.083 1.8E-06 39.7 1.1 21 266-286 2-22 (36)
58 KOG4400|consensus 91.2 0.073 1.6E-06 61.2 0.7 67 235-304 92-182 (261)
59 PF13696 zf-CCHC_2: Zinc knuck 89.8 0.13 2.9E-06 38.0 0.7 19 267-285 9-27 (32)
60 COG3577 Predicted aspartyl pro 88.1 0.81 1.8E-05 49.0 5.4 48 336-386 113-161 (215)
61 PF13917 zf-CCHC_3: Zinc knuck 87.4 0.18 3.9E-06 40.0 0.1 20 265-284 3-22 (42)
62 cd01709 RT_like_1 RT_like_1: A 85.2 3.1 6.6E-05 48.8 8.5 100 731-843 82-197 (346)
63 KOG0119|consensus 84.6 0.43 9.3E-06 56.9 1.3 43 235-285 261-304 (554)
64 smart00343 ZnF_C2HC zinc finge 84.2 0.43 9.3E-06 33.8 0.7 18 268-285 1-18 (26)
65 KOG4400|consensus 82.8 0.69 1.5E-05 53.2 2.0 40 236-286 144-184 (261)
66 PF00098 zf-CCHC: Zinc knuckle 79.3 1.3 2.9E-05 28.5 1.5 16 237-252 2-18 (18)
67 COG5550 Predicted aspartyl pro 69.4 21 0.00045 35.6 7.6 42 326-369 11-54 (125)
68 PF13696 zf-CCHC_2: Zinc knuck 64.4 3.3 7.1E-05 30.9 0.9 19 234-252 7-26 (32)
69 PF03732 Retrotrans_gag: Retro 64.2 7.8 0.00017 36.5 3.8 85 2-98 7-95 (96)
70 PF14392 zf-CCHC_4: Zinc knuck 62.0 2.9 6.4E-05 34.7 0.3 17 267-283 32-48 (49)
71 cd04714 BAH_BAHCC1 BAH, or Bro 55.8 15 0.00033 36.8 4.3 39 1341-1381 3-41 (121)
72 PF13917 zf-CCHC_3: Zinc knuck 54.9 7.4 0.00016 31.1 1.4 18 234-251 3-21 (42)
73 PF12353 eIF3g: Eukaryotic tra 54.2 5.6 0.00012 40.3 0.9 19 234-252 105-123 (128)
74 PF14787 zf-CCHC_5: GAG-polypr 51.5 6.1 0.00013 30.1 0.5 17 236-252 3-20 (36)
75 PF15288 zf-CCHC_6: Zinc knuck 47.7 10 0.00022 29.9 1.2 19 267-285 2-22 (40)
76 KOG0109|consensus 46.7 8.8 0.00019 43.2 1.0 19 268-286 162-180 (346)
77 PRK01191 rpl24p 50S ribosomal 45.0 28 0.0006 34.7 4.0 51 1339-1392 43-95 (120)
78 COG5222 Uncharacterized conser 43.6 9.7 0.00021 42.6 0.7 20 267-286 177-196 (427)
79 PF11302 DUF3104: Protein of u 42.9 21 0.00046 32.3 2.6 43 1341-1383 5-56 (75)
80 KOG4768|consensus 39.2 42 0.00091 41.9 5.1 165 661-834 345-562 (796)
81 cd04721 BAH_plant_1 BAH, or Br 32.8 52 0.0011 33.5 4.0 35 1339-1376 5-39 (130)
82 PF14893 PNMA: PNMA 32.7 1.8E+02 0.0039 34.6 8.9 114 2-130 212-330 (331)
83 KOG0119|consensus 31.7 28 0.00061 42.2 2.1 38 267-304 262-303 (554)
84 smart00439 BAH Bromo adjacent 30.0 73 0.0016 31.4 4.5 33 1342-1375 2-34 (120)
85 cd04717 BAH_polybromo BAH, or 29.7 47 0.001 33.3 3.1 34 1341-1375 3-36 (121)
86 cd04712 BAH_DCM_I BAH, or Brom 28.7 53 0.0011 33.5 3.2 37 1340-1376 4-49 (130)
87 PF07039 DUF1325: SGF29 tudor- 27.7 51 0.0011 33.6 2.9 57 1343-1400 1-60 (130)
88 KOG3116|consensus 26.3 19 0.00042 36.4 -0.4 19 268-286 29-47 (177)
89 KOG2879|consensus 26.2 39 0.00085 38.1 1.9 45 234-299 238-288 (298)
90 smart00743 Agenet Tudor-like d 25.6 1E+02 0.0022 26.5 4.1 55 1341-1400 2-58 (61)
91 PTZ00194 60S ribosomal protein 22.8 1.2E+02 0.0027 31.2 4.5 51 1339-1392 44-96 (143)
92 KOG0122|consensus 22.6 34 0.00073 38.0 0.5 19 234-252 118-136 (270)
93 PF08750 CNP1: CNP1-like famil 22.3 98 0.0021 31.9 3.7 47 1339-1386 16-63 (139)
94 PF05515 Viral_NABP: Viral nuc 22.2 64 0.0014 32.2 2.3 27 258-284 54-80 (124)
95 PRK05886 yajC preprotein trans 20.6 1.5E+02 0.0032 29.3 4.4 35 1339-1386 36-70 (109)
96 PF01426 BAH: BAH domain; Int 20.5 1.1E+02 0.0024 30.1 3.7 33 1342-1375 3-35 (119)
No 1
>cd01644 RT_pepA17 RT_pepA17: Reverse transcriptase (RTs) in retrotransposons. This subfamily represents the RT domain of a multifunctional enzyme. C-terminal to the RT domain is a domain homologous to aspartic proteinases (corresponding to Merops family A17) encoded by retrotransposons and retroviruses. RT catalyzes DNA replication from an RNA template and is responsible for the replication of retroelements.
Probab=100.00 E-value=7.2e-41 Score=366.14 Aligned_cols=195 Identities=45% Similarity=0.767 Sum_probs=180.4
Q ss_pred eeeecCceeccCCC-CCCeEEEEcCCCCCCCcccCccccCCCCcccchhHHHhhcccccceeecccccceeeeEeCCCCC
Q psy9877 619 GYYMPHHHVVKPGS-TTPVRPVFDASAKDNGVSLNDCLEKGPNLIETIPTSLAKFRINKIGISGDIAKAFLQISVSPQDR 697 (1447)
Q Consensus 619 ~~~~P~~~V~k~~~-t~k~R~v~D~~~~~~~~slN~~~~~g~~~~~~l~~~l~~~r~~~~~~~~Di~~af~qi~l~~~dr 697 (1447)
+||+|||||++++| +||+|+|+|||++.+|.|||+.+.+||+++++|.++|++||++++++++||++|||||+|+|+||
T Consensus 1 ~~y~ph~~V~~~~~~~~k~R~V~D~s~~~~g~sLN~~l~~gp~~~~~l~~iL~~~R~~~~~~~~Di~~af~qI~i~~~d~ 80 (213)
T cd01644 1 VWYLPHHAVIKPSKTTTKLRVVFDASARYNGVSLNDMLLKGPDLLNSLFGVLLRFRQGKIAVSADIEKMFHQVKVRPEDR 80 (213)
T ss_pred CcccCCceecCCCCCCCccEEEEecccccCCchhhHHhccCCccccchhhhheeeecCceeEehhHHHhhhheecCcccC
Confidence 49999999999999 99999999999998999999999999999999999999999999999999999999999999999
Q ss_pred CcccccccccccccceeEEEEecC-CCC-cEEEEEEeeecCccCChHHHHHHHHHHHHHhhcccCCCCCCCchhHHHhhh
Q psy9877 698 DCLSMRQPRIMVSRDCLRFLWQDE-NGR-VITYRHCRVVFGVSSSPFLLESCLKLHLELTLTDCREGKSSWPIHLVELLK 775 (1447)
Q Consensus 698 ~~~~~~~~~~~~~~~~~~f~w~~~-~~~-~~~y~~~~~pfGl~~sP~~~~~~l~~~l~~~~~~~~~~~~~~p~~~~~~~~ 775 (1447)
+ +++|+|+.. +.+ ++.|+|++||||+++||++|+++|++++.++... . +++.+.
T Consensus 81 ~--------------~~~F~w~~~~~~~~~~~Y~~~~~pFG~~~AP~~~~~~~~~~~~~~~~~--~--------~~~~i~ 136 (213)
T cd01644 81 D--------------VLRFLWRKDGDEPKPIEYRMTVVPFGAASAPFLANRALKQHAEDHPHE--A--------AAKIIK 136 (213)
T ss_pred c--------------eEEEEEeCCCCCCcceEEEEEEEccCCccchHHHHHHHHHHHhhcchh--h--------HHHHHH
Confidence 9 999999987 444 4999999999999999999999999999887654 2 566778
Q ss_pred ccccccccccccCcHHHHHHHHHHHHHHHHhcCCccccccccC--------C-------CCCCCcceeceeeecCCC
Q psy9877 776 DSFYVDNCLVSTDSQAEAEQFIQVASSIMKEKGFDLRGWELTG--------D-------KDDKPTNVLGLLWDKSSD 837 (1447)
Q Consensus 776 ~~~YvDDil~~~~s~~e~~~~~~~~~~~l~~~g~~l~k~~snp--------~-------~~~~~~k~LG~~w~~~~d 837 (1447)
..+|||||++++++.+||...++++.++|+++||+++||.||. . ..+...|.||+.|++..|
T Consensus 137 ~~~YvDDili~~~s~~e~~~~~~~v~~~L~~~Gf~l~kw~sn~~~~l~~~~~~~~~~~~~~~~~~k~LGl~W~~~~D 213 (213)
T cd01644 137 RNFYVDDILVSTDTLNEAVNVAKRLIALLKKGGFNLRKWASNSQEVLDDLPEERVLLDRDSDVTEKTLGLRWNPKTD 213 (213)
T ss_pred HeeecccceecCCCHHHHHHHHHHHHHHHHhCCccchhcccCchhhhhcccccccccccccccchhcccceeeccCC
Confidence 8999999999999999999999999999999999999999992 1 234579999999999876
No 2
>PF05380 Peptidase_A17: Pao retrotransposon peptidase ; InterPro: IPR008042 This signature identifies members of the Pao retrotransposon family.
Probab=100.00 E-value=2.1e-34 Score=302.05 Aligned_cols=159 Identities=42% Similarity=0.732 Sum_probs=151.5
Q ss_pred chHHHHHhhhhhhcCccccccceeehhHHHHHHHHhcCCCCCccCChhhHHHHHHHHhhcCccceeecceeeeCCCCCCC
Q psy9877 854 ITKKVMLSAAHRIFDPIGVICAFALIPKLLIQKTWETGLAWNDEVDENTKTKFVQWMAEVPDIAEIRVPRWISEPNVESR 933 (1447)
Q Consensus 854 ~TkR~~~s~~~~~~dplg~~~p~~~~~k~llq~l~~~~~~Wd~~l~~~~~~~~~~~~~~l~~l~~~~ipR~~~~~~~~~~ 933 (1447)
+|||+++|.++++|||+|+++|+++++|.++|.+|+.+++||+++|++....|..|++.+..+..+.+||++.... .
T Consensus 1 pTKR~ils~ia~~yDPlGl~~p~~i~~K~llq~lw~~~l~WD~~lp~el~~~w~~~~~~l~~~~~i~iPR~i~~~~---~ 77 (159)
T PF05380_consen 1 PTKRQILSFIASIYDPLGLLAPIIIRAKLLLQKLWQSKLDWDDPLPDELRKEWKKWLKELESLSPIRIPRCIPISD---Y 77 (159)
T ss_pred CChHHHHHHHHHHcCcchhhHHHHHHHHHHHHhhhccccchhhhhhHHHHHHHHHHHHHHhhcccccCCccccccc---c
Confidence 6999999999999999999999999999999999999999999999999999999999999988999999765433 3
Q ss_pred CcceEEEEecccccceeeEEEEEEEecCCceEEEEEeeecccccCCCCCCcccchhHhHHHHHHHHHHHHHHHHhccccc
Q psy9877 934 ESWSLHVFSDASKLAYAAAVFLRVEHSKTKVSLHLLAARARVAPSSKSKSTLTIPRLELLAASIATRLCQTVVKDYKLQD 1013 (1447)
Q Consensus 934 ~~~~L~vftDAS~~g~gavly~r~~~~~g~~~~~~~~sksr~~p~k~~~~~~siprlEL~A~~~a~~~~~~l~~~l~~~~ 1013 (1447)
...+||+|||||+.|||||+|+|. ..+|...+.+++||+|++|++ +.||||+||+|+++|++++.++.+++++.+
T Consensus 78 ~~~~L~~F~DAS~~aygavvYlr~-~~~~~~~~~ll~aKsrv~P~k----~~tIPRlEL~a~~l~~~l~~~~~~~l~~~~ 152 (159)
T PF05380_consen 78 RSVELHVFCDASESAYGAVVYLRS-YSDGSVQVRLLFAKSRVAPLK----TVTIPRLELLAALLGVRLANTVKKELDIEI 152 (159)
T ss_pred cceeeeEeecccccceeeEeEeee-ccCCceeeeeeeecccccCCC----CCcHHHHHHHHHHHHHHHHHHHHHHcCCCc
Confidence 678999999999999999999999 588999999999999999999 889999999999999999999999999999
Q ss_pred cceEEEe
Q psy9877 1014 VRTTFWT 1020 (1447)
Q Consensus 1014 ~~~~~~t 1020 (1447)
.++++||
T Consensus 153 ~~~~~wt 159 (159)
T PF05380_consen 153 SQVVFWT 159 (159)
T ss_pred ceeEEeC
Confidence 9999997
No 3
>cd03715 RT_ZFREV_like RT_ZFREV_like: A subfamily of reverse transcriptases (RTs) found in sequences similar to the intact endogenous retrovirus ZFERV from zebrafish and to Moloney murine leukemia virus RT. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs. Phylogenetic analysis suggests that ZFERV belongs to a distinct group of retroviruses.
Probab=99.95 E-value=1.2e-27 Score=264.69 Aligned_cols=198 Identities=19% Similarity=0.241 Sum_probs=176.2
Q ss_pred ChhH-HHHHHHHHHHHHcCcEEecCCCCCCCCeeeecCceeccCCCCC-CeEEEEcCCCCCCCcccCccccCCCCcccch
Q psy9877 588 DQYY-ADYKRVLDTWEMDKIIERVPQEELDNPGYYMPHHHVVKPGSTT-PVRPVFDASAKDNGVSLNDCLEKGPNLIETI 665 (1447)
Q Consensus 588 ~~~~-~~y~~~i~~~l~~G~i~~~~~~~~~~~~~~~P~~~V~k~~~t~-k~R~v~D~~~~~~~~slN~~~~~g~~~~~~l 665 (1447)
+++. ++++++|++|++.|+|+++.++ |..|.|+|.|++ | ++|+|+|+|. ||+.+....++.+.+
T Consensus 10 ~~~~~~~~~~~v~~ll~~G~I~~~~s~------~~sp~~~V~Kk~--g~~~R~~vD~r~------lN~~~~~~~~~~p~~ 75 (210)
T cd03715 10 PREAREGITPHIQELLEAGILVPCQSP------WNTPILPVKKPG--GNDYRMVQDLRL------VNQAVLPIHPAVPNP 75 (210)
T ss_pred CHHHHHHHHHHHHHHHHCCCeECCCCC------CCCceEEEEeCC--CCcceEEEEhhh------hhhcccccCcCCCcH
Confidence 3556 8899999999999999998544 899999999987 6 9999999999 999999999999999
Q ss_pred hHHHhhcc-cccceeecccccceeeeEeCCCCCCcccccccccccccceeEEEEecCCCCcEEEEEEeeecCccCChHHH
Q psy9877 666 PTSLAKFR-INKIGISGDIAKAFLQISVSPQDRDCLSMRQPRIMVSRDCLRFLWQDENGRVITYRHCRVVFGVSSSPFLL 744 (1447)
Q Consensus 666 ~~~l~~~r-~~~~~~~~Di~~af~qi~l~~~dr~~~~~~~~~~~~~~~~~~f~w~~~~~~~~~y~~~~~pfGl~~sP~~~ 744 (1447)
.+++..+. +.+++.++|+++||+||+|+|++++ +++|.|.. +.|+|++||||+++||++|
T Consensus 76 ~~~l~~l~~~~~~~s~lDl~~af~~i~l~~~~~~--------------~taf~~~~-----~~y~~~~lp~Gl~~sp~~f 136 (210)
T cd03715 76 YTLLSLLPPKHQWYTVLDLANAFFSLPLAPDSQP--------------LFAFEWEG-----QQYTFTRLPQGFKNSPTLF 136 (210)
T ss_pred HHHHHHhccCCeEEEEeeccCeEEEEEcccccEE--------------eEEEEECC-----eeEEEEEEeccccCcHHHH
Confidence 99999886 7899999999999999999999999 99999875 8999999999999999999
Q ss_pred HHHHHHHHHHhhcccCCCCCCCchhHHHhhhccccccccccccCcHHHHHHHHHHHHHHHHhcCCccccccccCCCCCCC
Q psy9877 745 ESCLKLHLELTLTDCREGKSSWPIHLVELLKDSFYVDNCLVSTDSQAEAEQFIQVASSIMKEKGFDLRGWELTGDKDDKP 824 (1447)
Q Consensus 745 ~~~l~~~l~~~~~~~~~~~~~~p~~~~~~~~~~~YvDDil~~~~s~~e~~~~~~~~~~~l~~~g~~l~k~~snp~~~~~~ 824 (1447)
+++|+.++..+...+.. ....+|||||++++++.++|.+.++.+...|.++||.++.-+|. ....+
T Consensus 137 ~~~~~~~l~~~~~~~~~------------~~~~~Y~DDili~s~~~~e~~~~l~~v~~~l~~~gl~l~~~K~~--~~~~~ 202 (210)
T cd03715 137 HEALARDLAPFPLEHEG------------TILLQYVDDLLLAADSEEDCLKGTDALLTHLGELGYKVSPKKAQ--ICRAE 202 (210)
T ss_pred HHHHHHHHHHHHhhCCC------------eEEEEECCcEEEecCCHHHHHHHHHHHHHHHHHCCCCcCHHHee--CCCCc
Confidence 99999998876432111 23458999999999999999999999999999999999877766 34578
Q ss_pred cceeceee
Q psy9877 825 TNVLGLLW 832 (1447)
Q Consensus 825 ~k~LG~~w 832 (1447)
++|||+.|
T Consensus 203 v~fLG~~~ 210 (210)
T cd03715 203 VKFLGVVW 210 (210)
T ss_pred eEEeeEEC
Confidence 99999975
No 4
>cd01645 RT_Rtv RT_Rtv: Reverse transcriptases (RTs) from retroviruses (Rtvs). RTs catalyze the conversion of single-stranded RNA into double-stranded viral DNA for integration into host chromosomes. Proteins in this subfamily contain long terminal repeats (LTRs) and are multifunctional enzymes with RNA-directed DNA polymerase, DNA-directed DNA polymerase, and ribonuclease hybrid (RNase H) activities. The viral RNA genome enters the cytoplasm as part of a nucleoprotein complex, and the process of reverse transcription generates, in the cytoplasm, a linear DNA duplex via an intricate series of steps. This duplex DNA is colinear with its RNA template, but contains terminal duplications known as LTRs that are not present in viral RNA. It has been proposed that two specialized template switches, known as strand-transfer reactions or "jumps", are required to generate the LTRs.
Probab=99.93 E-value=6e-26 Score=251.08 Aligned_cols=199 Identities=18% Similarity=0.216 Sum_probs=166.5
Q ss_pred hhH-HHHHHHHHHHHHcCcEEecCCCCCCCCeeeecCceeccCCCCCCeEEEEcCCCCCCCcccCccccCCCCcccchhH
Q psy9877 589 QYY-ADYKRVLDTWEMDKIIERVPQEELDNPGYYMPHHHVVKPGSTTPVRPVFDASAKDNGVSLNDCLEKGPNLIETIPT 667 (1447)
Q Consensus 589 ~~~-~~y~~~i~~~l~~G~i~~~~~~~~~~~~~~~P~~~V~k~~~t~k~R~v~D~~~~~~~~slN~~~~~g~~~~~~l~~ 667 (1447)
++. +++.++|+++++.|+|+++.++ |.+|.++|.|++ |++|+|+|+|. ||+.+.......+.+.
T Consensus 11 ~~~~~~~~~~i~~ll~~g~I~~~~s~------~~sp~~~v~K~~--g~~R~~~D~r~------lN~~~~~~~~~~~~~p- 75 (213)
T cd01645 11 EEKLEALTELVTEQLKEGHIEPSTSP------WNTPVFVIKKKS--GKWRLLHDLRA------VNAQTQDMGALQPGLP- 75 (213)
T ss_pred HHHHHHHHHHHHHHHHCCceecCCCC------CcCcEEEEEcCC--CCeEEEechHH------HhhhcccccccCCCCC-
Confidence 344 8899999999999999997754 899999999987 89999999999 9998876543222111
Q ss_pred HHhhcccccceeecccccceeeeEeCCCCCCcccccccccccccceeEEEEecC--CCCcEEEEEEeeecCccCChHHHH
Q psy9877 668 SLAKFRINKIGISGDIAKAFLQISVSPQDRDCLSMRQPRIMVSRDCLRFLWQDE--NGRVITYRHCRVVFGVSSSPFLLE 745 (1447)
Q Consensus 668 ~l~~~r~~~~~~~~Di~~af~qi~l~~~dr~~~~~~~~~~~~~~~~~~f~w~~~--~~~~~~y~~~~~pfGl~~sP~~~~ 745 (1447)
....+.+.++++++|+++||+||+|+|+++. +++|.|... .++.+.|+|++||||+++||++|+
T Consensus 76 ~~~~l~~~~~~s~lDl~~af~~i~l~~~~~~--------------~taf~~~~~~~~~~~~~~~~~~lP~Gl~~SP~~f~ 141 (213)
T cd01645 76 HPAALPKGWPLIVLDLKDCFFSIPLHPDDRE--------------RFAFTVPSINNKGPAKRYQWKVLPQGMKNSPTICQ 141 (213)
T ss_pred ChHHcCCCceEEEEEccCcEEEeeeccCCcc--------------eeEEEeccccCCCCCceEEEEEeCCCCcChHHHHH
Confidence 1123567889999999999999999999999 999998532 234689999999999999999999
Q ss_pred HHHHHHHHHhhcccCCCCCCCchhHHHhhhccccccccccccCcHHHHHHHHHHHHHHHHhcCCccccccccCCCCCCCc
Q psy9877 746 SCLKLHLELTLTDCREGKSSWPIHLVELLKDSFYVDNCLVSTDSQAEAEQFIQVASSIMKEKGFDLRGWELTGDKDDKPT 825 (1447)
Q Consensus 746 ~~l~~~l~~~~~~~~~~~~~~p~~~~~~~~~~~YvDDil~~~~s~~e~~~~~~~~~~~l~~~g~~l~k~~snp~~~~~~~ 825 (1447)
++|+.++......++. +....|||||++++++.++|.+.++.+.+.|.++||.++.-++.. ..++
T Consensus 142 ~~m~~~l~~~~~~~~~------------~~~~~Y~DDili~s~~~~~~~~~l~~v~~~l~~~gl~ln~~K~~~---~~~v 206 (213)
T cd01645 142 SFVAQALEPFRKQYPD------------IVIYHYMDDILIASDLEGQLREIYEELRQTLLRWGLTIPPEKVQK---EPPF 206 (213)
T ss_pred HHHHHHHHHHHHHCCC------------eEEEEEcCCEEEEcCCHHHHHHHHHHHHHHHHHCCCEeCHHHEeC---CCCe
Confidence 9999999877654432 344689999999999999999999999999999999998766652 4679
Q ss_pred ceecee
Q psy9877 826 NVLGLL 831 (1447)
Q Consensus 826 k~LG~~ 831 (1447)
+||||.
T Consensus 207 ~fLG~~ 212 (213)
T cd01645 207 QYLGYE 212 (213)
T ss_pred EeccEe
Confidence 999996
No 5
>PF05585 DUF1758: Putative peptidase (DUF1758); InterPro: IPR008737 This is a family of nematode proteins of unknown function []. However, it seems likely that these proteins act as aspartic peptidases.
Probab=99.87 E-value=4.7e-22 Score=210.87 Aligned_cols=156 Identities=30% Similarity=0.506 Sum_probs=128.7
Q ss_pred EEEecCCCCcceeEeEEEeCCCCcccccHHhHhhcCCCCCCeeEEEEEeeeCccccccceeEEEEEEEecCCCceEEEEE
Q psy9877 330 VKIRGKDGKQDKLARLMIDTGSQQSYVLEQTMNSLSYTPITKQSMRHALFGGSITDAMDHNLFKIVISNLDNTFACDFDV 409 (1447)
Q Consensus 330 v~v~~~~g~~~~~v~aLLDSGS~~S~Ise~la~~L~L~~~~~~~l~i~~~gg~~~~~~~~~~v~l~I~~~~~~~~~~v~~ 409 (1447)
|.|++++|+. ..++||||||||.|||++++|++|+|+......+.+..+|........ .+.+.|....++..+.+++
T Consensus 1 v~V~n~~g~~-~~~~~LlDsGSq~SfIt~~la~~L~L~~~~~~~~~~~~~g~~~~~~~~--~~~~~i~~~~~~~~~~i~a 77 (164)
T PF05585_consen 1 VNVFNPNGNQ-VEARALLDSGSQRSFITESLANKLNLPGTGEKILVIGTFGSSSPKSKK--CVRVKISSRTSNNSLEIEA 77 (164)
T ss_pred CEEECCCCCE-EEEEEEEecCCchhHHhHHHHHHhCCCCCCceEEEEeccCccCcccee--EEEEEEEEecCCCceEEEE
Confidence 5689999997 899999999999999999999999999944444555555544333332 4566666666666799999
Q ss_pred eeecCccccCCCC--CCCCccccccccCcccccCC-CCCcceeEEeccccccccccCceEe-cCCcceEEecceeeEeec
Q psy9877 410 FGQSKICSTIPSI--PSGPWIDELKQHNIELTDLQ-HPSTTVDILVGADIAGKFYTGKRVE-LPSGLVAMETCMGWTLSG 485 (1447)
Q Consensus 410 lvvp~i~~~lp~~--~~~~~~~~L~~~~i~L~d~~-~~~~~iDlLIG~D~~~~il~~~~i~-~~~gp~a~~T~lGwiv~G 485 (1447)
+++|.|++.+|.. +.+.|.| + +++.|+|+. +.+.+||||||+|+|++++.++.++ ++.+++|++|.||||++|
T Consensus 78 lvv~~I~~~l~~~~i~~~~~~~-~--~~l~lad~~f~~~~~iDiLIG~D~~~~ll~~~~i~~~~~~~~a~~T~~GWiisG 154 (164)
T PF05585_consen 78 LVVPKITGNLPSAPIDDSDWKH-L--NNLPLADPNFRESSPIDILIGADYFWQLLTGGQIKRLPGGPTAQETKFGWIISG 154 (164)
T ss_pred EecCcccccccccccCHHHHhh-h--cCCccccccccCCCCCeEEEccchHHHHhCCceEecCCCCCEEEeCCeEeEEeC
Confidence 9999999999966 5567999 8 999999988 8899999999999999999888665 455699999999999999
Q ss_pred CCCCCC
Q psy9877 486 KMPTRY 491 (1447)
Q Consensus 486 ~~~~~~ 491 (1447)
+.....
T Consensus 155 ~~~~~~ 160 (164)
T PF05585_consen 155 KASEQK 160 (164)
T ss_pred ccCCcc
Confidence 876553
No 6
>cd01647 RT_LTR RT_LTR: Reverse transcriptases (RTs) from retrotransposons and retroviruses which have long terminal repeats (LTRs) in their DNA copies but not in their RNA template. RT catalyzes DNA replication from an RNA template, and is responsible for the replication of retroelements. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs are present in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and Caulimoviruses.
Probab=99.84 E-value=5.6e-21 Score=205.19 Aligned_cols=176 Identities=22% Similarity=0.237 Sum_probs=157.4
Q ss_pred CcEEecCCCCCCCCeeeecCceeccCCCCCCeEEEEcCCCCCCCcccCccccCCCCcccchhHHHhhcccccceeecccc
Q psy9877 605 KIIERVPQEELDNPGYYMPHHHVVKPGSTTPVRPVFDASAKDNGVSLNDCLEKGPNLIETIPTSLAKFRINKIGISGDIA 684 (1447)
Q Consensus 605 G~i~~~~~~~~~~~~~~~P~~~V~k~~~t~k~R~v~D~~~~~~~~slN~~~~~g~~~~~~l~~~l~~~r~~~~~~~~Di~ 684 (1447)
|+|++++++ |..|.++|.|++ +|+|+|+|++. +|+.+...+..++.+.+++..+++..++.++|+.
T Consensus 1 g~i~~~~~~------~~~p~~~v~k~~--~k~R~~~D~r~------ln~~~~~~~~~~p~i~~~~~~~~~~~~~~~~D~~ 66 (177)
T cd01647 1 GIIEPSSSP------YASPVVVVKKKD--GKLRLCVDYRK------LNKVTIKDRYPLPTIDELLEELAGAKVFSKLDLR 66 (177)
T ss_pred CeeEccCCC------CCCceEEEECCC--CCEEEEEcCHH------HhcccCCCCCCCCCHHHHHHHhhcCcEEEecccc
Confidence 788887665 448999999987 79999999999 9999999999999999999999999999999999
Q ss_pred cceeeeEeCCCCCCcccccccccccccceeEEEEecCCCCcEEEEEEeeecCccCChHHHHHHHHHHHHHhhcccCCCCC
Q psy9877 685 KAFLQISVSPQDRDCLSMRQPRIMVSRDCLRFLWQDENGRVITYRHCRVVFGVSSSPFLLESCLKLHLELTLTDCREGKS 764 (1447)
Q Consensus 685 ~af~qi~l~~~dr~~~~~~~~~~~~~~~~~~f~w~~~~~~~~~y~~~~~pfGl~~sP~~~~~~l~~~l~~~~~~~~~~~~ 764 (1447)
+||+|++++++++. +++|.|.. +.|+++++|||+++||..++.+|+.++......+.
T Consensus 67 ~~~~~i~l~~~~~~--------------~~~~~~~~-----~~~~~~~~p~G~~~s~~~~~~~~~~~l~~~~~~~~---- 123 (177)
T cd01647 67 SGYHQIPLAEESRP--------------KTAFRTPF-----GLYEYTRMPFGLKNAPATFQRLMNKILGDLLGDFV---- 123 (177)
T ss_pred cCcceeeeccCChh--------------hceeecCC-----CccEEEEecCCCccHHHHHHHHHHhhhcccccccc----
Confidence 99999999999999 89998864 78999999999999999999999999887644333
Q ss_pred CCchhHHHhhhccccccccccccCcHHHHHHHHHHHHHHHHhcCCccccccccCCCCCCCcceecee
Q psy9877 765 SWPIHLVELLKDSFYVDNCLVSTDSQAEAEQFIQVASSIMKEKGFDLRGWELTGDKDDKPTNVLGLL 831 (1447)
Q Consensus 765 ~~p~~~~~~~~~~~YvDDil~~~~s~~e~~~~~~~~~~~l~~~g~~l~k~~snp~~~~~~~k~LG~~ 831 (1447)
..||||+++.+++.+++.+.++.+...+.++||.++..++. ......++|||.
T Consensus 124 ------------~~y~DDi~i~~~~~~~~~~~~~~~~~~l~~~~~~~~~~K~~--~~~~~~~~lG~~ 176 (177)
T cd01647 124 ------------EVYLDDILVYSKTEEEHLEHLREVLERLREAGLKLNPEKCE--FGVPEVEFLGHI 176 (177)
T ss_pred ------------EEEecCccccCCCHHHHHHHHHHHHHHHHHcCCEeCHHHce--eccCceEeeeEE
Confidence 48999999999999999999999999999999998866665 233568999986
No 7
>PF03564 DUF1759: Protein of unknown function (DUF1759); InterPro: IPR005312 This is a small family of proteins of unknown function.
Probab=99.64 E-value=4.4e-16 Score=162.03 Aligned_cols=101 Identities=27% Similarity=0.514 Sum_probs=98.6
Q ss_pred CCCcHhHhhccCCCCCCcHHHHHHHHHHhcCCcchhHHHHHHHHHHHHHhcCC--CCChhhHHHHHHHHHHHHHHHHhcC
Q psy9877 2 KGTPARELVDSYPATGNMYNQVVEALKARFGREDLLTEVYIRELLKLVLANTA--SHDKLPIVILYDRLQSHLRNLESLG 79 (1447)
Q Consensus 2 ~~G~A~~~I~~~~~t~~nY~~A~~~L~~ryg~~~~i~~~~~~~~~~~L~~~p~--~~d~~~Lr~l~d~l~~~i~aL~~lg 79 (1447)
|+|+|+++|++++++++||+.||++|+++||+++.++ ++|+++|.++|+ .+|..+|+.|++++++++++|+++|
T Consensus 38 L~G~A~~~i~~~~~~~~~Y~~a~~~L~~~yg~~~~i~----~~~~~~l~~l~~~~~~d~~~L~~~~~~v~~~i~~L~~lg 113 (145)
T PF03564_consen 38 LKGEAKELIRGLPLSEENYEEAWELLEERYGNPRRII----QALLEELRNLPPISNDDPEALRSLVDKVNNCIRALKALG 113 (145)
T ss_pred hcchHHHHHHcccccchhhHHHHHHHHHHhCCchHHH----HHHHHHHhccccccchhHHHHHHHHHHHHHHHHHHHHcC
Confidence 7999999999999999999999999999999999999 999999999996 8999999999999999999999999
Q ss_pred CCCcCCCcchHHHHHccCCHHHHHHHHhh
Q psy9877 80 VSADRCAPILMPLVSSSLPQDLLQMWERC 108 (1447)
Q Consensus 80 ~~~d~~~~~l~~~i~~KLP~~l~~~w~~~ 108 (1447)
++.+ +..++.+|++|||..++.+|++.
T Consensus 114 ~~~~--~~~l~~~i~~KLp~~~~~~w~~~ 140 (145)
T PF03564_consen 114 VNVD--DPLLISIILSKLPPEIREKWEEH 140 (145)
T ss_pred CCCC--CHHHHHHHHHHCCHHHHHHHHHH
Confidence 9998 78999999999999999999997
No 8
>cd03714 RT_DIRS1 RT_DIRS1: Reverse transcriptases (RTs) occurring in the DIRS1 group of retrotransposons. Members of the subfamily include the Dictyostelium DIRS-1, Volvox carteri kangaroo, and Panagrellus redivivus PAT elements. These elements differ from LTR and conventional non-LTR retrotransposons. They contain split direct repeat (SDR) termini, and have been proposed to integrate via double-stranded closed-circle DNA intermediates assisted by an encoded recombinase which is similar to gamma-site-specific integrase.
Probab=99.63 E-value=6.8e-16 Score=154.80 Aligned_cols=117 Identities=24% Similarity=0.355 Sum_probs=97.1
Q ss_pred cccccceeeeEeCCCCCCcccccccccccccceeEEEEecCCCCcEEEEEEeeecCccCChHHHHHHHHHHHHHhhcccC
Q psy9877 681 GDIAKAFLQISVSPQDRDCLSMRQPRIMVSRDCLRFLWQDENGRVITYRHCRVVFGVSSSPFLLESCLKLHLELTLTDCR 760 (1447)
Q Consensus 681 ~Di~~af~qi~l~~~dr~~~~~~~~~~~~~~~~~~f~w~~~~~~~~~y~~~~~pfGl~~sP~~~~~~l~~~l~~~~~~~~ 760 (1447)
+|+.+||+||+|+|++++ +++|.+.. +.|+|++||||+++||.+|+++|+.++......
T Consensus 1 lD~~~ay~~i~l~~~~~~--------------~~af~~~~-----~~~~~~~mp~Gl~~sp~~f~~~~~~i~~~~~~~-- 59 (119)
T cd03714 1 VDLKDAYFHIPILPRSRD--------------LLGFAWQG-----ETYQFKALPFGLSLAPRVFTKVVEALLAPLRLL-- 59 (119)
T ss_pred CchhhceEEEecCCCCcc--------------eeeEEECC-----CcEEEEecCCcccchHHHHHHHHHHHHHHhhcC--
Confidence 599999999999999999 99999875 899999999999999999999999998765411
Q ss_pred CCCCCCchhHHHhhhccccccccccccCcHHHHHHHHHHHHH-HHHhcCCccccccccCCCCCCCcceecee
Q psy9877 761 EGKSSWPIHLVELLKDSFYVDNCLVSTDSQAEAEQFIQVASS-IMKEKGFDLRGWELTGDKDDKPTNVLGLL 831 (1447)
Q Consensus 761 ~~~~~~p~~~~~~~~~~~YvDDil~~~~s~~e~~~~~~~~~~-~l~~~g~~l~k~~snp~~~~~~~k~LG~~ 831 (1447)
...+..|+|||++.+++.++++.....+.+ .++++||.++.-||.- ....+++|||+.
T Consensus 60 ------------~~~v~~Y~DDili~~~~~~~~~~~~~~l~~~~l~~~gl~ln~~K~~~-~~~~~v~fLG~~ 118 (119)
T cd03714 60 ------------GVRIFSYLDDLLIIASSIKTSEAVLRHLRATLLANLGFTLNLEKSKL-GPTQRITFLGLE 118 (119)
T ss_pred ------------CeEEEEEecCeEEEeCcHHHHHHHHHHHHHHHHHHcCCccChhhcEe-cCCCcEEECcEe
Confidence 123458999999999987777777666666 6999999998666551 234679999986
No 9
>PF00078 RVT_1: Reverse transcriptase (RNA-dependent DNA polymerase); InterPro: IPR000477 The use of an RNA template to produce DNA, for integration into the host genome and exploitation of a host cell, is a strategy employed in the replication of retroid elements, such as the retroviruses and bacterial retrons. The enzyme catalysing polymerisation is an RNA-directed DNA-polymerase, or reverse transcriptase (RT) (2.7.7.49 from EC). Reverse transcriptase occurs in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. Retroviral reverse transcriptase is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The discovery of retroelements in the prokaryotes raises intriguing questions concerning their roles in bacteria and the origin and evolution of reverse transcriptases and whether the bacterial reverse transcriptases are older than eukaryotic reverse transcriptases [].; GO: 0003723 RNA binding, 0003964 RNA-directed DNA polymerase activity, 0006278 RNA-dependent DNA replication; PDB: 1MU2_B 3RWE_C 3DU6_B 3DU5_A 3KYL_A 2WOM_B 1DTQ_A 2OPS_A 3FFI_A 1VRU_B ....
Probab=99.04 E-value=2.9e-10 Score=126.27 Aligned_cols=139 Identities=20% Similarity=0.192 Sum_probs=110.0
Q ss_pred HHhhcccccceeecccccceeeeEeCCCCCCcccccccccccccceeEEEEecC------------------CCC-cEEE
Q psy9877 668 SLAKFRINKIGISGDIAKAFLQISVSPQDRDCLSMRQPRIMVSRDCLRFLWQDE------------------NGR-VITY 728 (1447)
Q Consensus 668 ~l~~~r~~~~~~~~Di~~af~qi~l~~~dr~~~~~~~~~~~~~~~~~~f~w~~~------------------~~~-~~~y 728 (1447)
.+...+...+++.+|+++||.+|+.++-.+. +.++.+... +.. ...+
T Consensus 56 ~~~~~~~~~~~~~~Di~~~f~sI~~~~l~~~--------------l~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~ 121 (214)
T PF00078_consen 56 KLNRFKGYLYFLKLDISKAFDSIPHHRLLRK--------------LKRFGVPKKLIRLIQNLLSDRTAKVYLDGDLSPYF 121 (214)
T ss_dssp HHHC-CGSSEEEEEECCCCGGGSBBHTTTGG--------------GGEEEEECCSCHHHHHHHHHHHH-EECGCSSSEEE
T ss_pred cccccccccccceeccccccccceeeecccc--------------ccccccccccccccccccccccccccccccccccc
Confidence 3567888899999999999999999988888 666665421 011 5789
Q ss_pred EEEeeecCccCChHHHHHHHHHHHHHhhcccCCCCCCCchhHHHhhhccccccccccccCcHHHHHHHHHHHHHHHHhcC
Q psy9877 729 RHCRVVFGVSSSPFLLESCLKLHLELTLTDCREGKSSWPIHLVELLKDSFYVDNCLVSTDSQAEAEQFIQVASSIMKEKG 808 (1447)
Q Consensus 729 ~~~~~pfGl~~sP~~~~~~l~~~l~~~~~~~~~~~~~~p~~~~~~~~~~~YvDDil~~~~s~~e~~~~~~~~~~~l~~~g 808 (1447)
....+|+|...||.+++.+|..+...+...+ ...+....|+||+++.+.+.+++.+.++.+.+.+.+.|
T Consensus 122 ~~~glpqG~~~S~~l~~~~l~~l~~~~~~~~-----------~~~~~~~rY~DD~~i~~~~~~~~~~~~~~i~~~~~~~g 190 (214)
T PF00078_consen 122 QKRGLPQGSPLSPLLFNIYLDDLDRELQQEL-----------NPDISYLRYADDILIISKSKEELQKILEKISQWLEELG 190 (214)
T ss_dssp EESBS-TTSTCHHHHHHHHHHHHHHHHHHHS------------TTSEEEEETTEEEEEESSHHHHHHHHHHHHHHHHHTT
T ss_pred ccccccccccccchhhccccccccccccccc-----------cccccceEeccccEEEECCHHHHHHHHHHHHHHHHHCC
Confidence 9999999999999999999999988766542 01245679999999999999999999999999999999
Q ss_pred CccccccccCCCCCCCcceecee
Q psy9877 809 FDLRGWELTGDKDDKPTNVLGLL 831 (1447)
Q Consensus 809 ~~l~k~~snp~~~~~~~k~LG~~ 831 (1447)
+.++.-++.-.......+|||+.
T Consensus 191 l~ln~~Kt~~~~~~~~~~~lG~~ 213 (214)
T PF00078_consen 191 LKLNPEKTKILHPSDSVKFLGYV 213 (214)
T ss_dssp SBCSSTTTSCS--ESSEEETTEE
T ss_pred CEEChHHEEEEeCCCCEEEEeEE
Confidence 99985555422236789999986
No 10
>cd00304 RT_like RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs.
Probab=98.70 E-value=1.8e-08 Score=97.25 Aligned_cols=87 Identities=18% Similarity=0.160 Sum_probs=72.9
Q ss_pred eecCccCChHHHHHHHHHHHHHhhcccCCCCCCCchhHHHhhhccccccccccccCcHHHHHHHHHHHHHHHHhcCCccc
Q psy9877 733 VVFGVSSSPFLLESCLKLHLELTLTDCREGKSSWPIHLVELLKDSFYVDNCLVSTDSQAEAEQFIQVASSIMKEKGFDLR 812 (1447)
Q Consensus 733 ~pfGl~~sP~~~~~~l~~~l~~~~~~~~~~~~~~p~~~~~~~~~~~YvDDil~~~~s~~e~~~~~~~~~~~l~~~g~~l~ 812 (1447)
+|||...||.+++.+|..+......... .+....|+||+++.+.+. ++...+..+.+.+.+.|+.++
T Consensus 12 lPqG~~~Sp~l~~~~~~~l~~~~~~~~~------------~~~~~~Y~DD~~i~~~~~-~~~~~~~~l~~~l~~~gl~ln 78 (98)
T cd00304 12 LPQGSPLSPALANLYMEKLEAPILKQLL------------DITLIRYVDDLVVIAKSE-QQAVKKRELEEFLARLGLNLS 78 (98)
T ss_pred cCCCCchHHHHHHHHHHHHHHHHHHhcC------------CceEEEeeCcEEEEeCcH-HHHHHHHHHHHHHHHcCcEEC
Confidence 6999999999999999999876543221 245679999999999999 999999999999999999999
Q ss_pred cccccCCCCCCCcceeceee
Q psy9877 813 GWELTGDKDDKPTNVLGLLW 832 (1447)
Q Consensus 813 k~~snp~~~~~~~k~LG~~w 832 (1447)
.+++.........++||+.|
T Consensus 79 ~~Kt~~~~~~~~~~flG~~~ 98 (98)
T cd00304 79 DEKTQFTEKEKKFKFLGILV 98 (98)
T ss_pred hheeEEecCCCCeeeeceeC
Confidence 88886333567899999974
No 11
>PF00336 DNA_pol_viral_C: DNA polymerase (viral) C-terminal domain; InterPro: IPR001462 This domain is at the C terminus of hepatitis B-type viruses P proteins and represents a functional domain that controls the RNase H activities of the protein. The domain is always associated with IPR000201 from INTERPRO and .; GO: 0004523 ribonuclease H activity
Probab=98.70 E-value=1.4e-08 Score=104.14 Aligned_cols=157 Identities=22% Similarity=0.176 Sum_probs=96.0
Q ss_pred ccccccceeehhHHHHHHHHhcCCCCCccCChhhHHHHHHHHhhcCccceeecceeeeCCCCCCCCcceEEEEecccccc
Q psy9877 869 PIGVICAFALIPKLLIQKTWETGLAWNDEVDENTKTKFVQWMAEVPDIAEIRVPRWISEPNVESRESWSLHVFSDASKLA 948 (1447)
Q Consensus 869 plg~~~p~~~~~k~llq~l~~~~~~Wd~~l~~~~~~~~~~~~~~l~~l~~~~ipR~~~~~~~~~~~~~~L~vftDAS~~g 948 (1447)
-||+++|++..+-..+..|+..-..=....-....+.| +++..-++..+..-|. ..-.||+||+..|
T Consensus 39 lLgF~aPFTqcgy~aL~PlY~~iq~k~aF~FS~~Yk~~--L~kqy~~l~pvarqr~-----------~lc~VfaDATpTg 105 (245)
T PF00336_consen 39 LLGFAAPFTQCGYPALMPLYAAIQSKQAFTFSPTYKAF--LCKQYMNLYPVARQRP-----------GLCQVFADATPTG 105 (245)
T ss_pred hhhcccccccCCchhhhhHHHHHhhhheeecCHHHHHH--HHHhhccccccccCCC-----------CCCceeccCCCCc
Confidence 38999999999888877776541111111112233333 2223333333332232 2345899999999
Q ss_pred eeeEEEEEEEecCCceEEEEEeeecccccCCCCCCcccchhHhHHHHHHHHHHHHHHHHhccccccceEEEechHHHHHH
Q psy9877 949 YAAAVFLRVEHSKTKVSLHLLAARARVAPSSKSKSTLTIPRLELLAASIATRLCQTVVKDYKLQDVRTTFWTDATTVLAW 1028 (1447)
Q Consensus 949 ~gavly~r~~~~~g~~~~~~~~sksr~~p~k~~~~~~siprlEL~A~~~a~~~~~~l~~~l~~~~~~~~~~tDs~~~l~~ 1028 (1447)
||.++- .|.. +. +..+|+. |.-.||+|+.+|--+ ...-.+.|||+.|++.
T Consensus 106 wgi~i~------~~~~--~~----Tfs~~l~-------IhtaELlaaClAr~~-----------~~~r~l~tDnt~Vlsr 155 (245)
T PF00336_consen 106 WGISIT------GQRM--RG----TFSKPLP-------IHTAELLAACLARLM-----------SGARCLGTDNTVVLSR 155 (245)
T ss_pred ceeeec------Ccee--ee----eeccccc-------chHHHHHHHHHHHhc-----------cCCcEEeecCcEEEec
Confidence 998852 2211 11 2233444 888999999877633 1222389999999873
Q ss_pred HhcC-CCcchhhhhhHHHhhhcCCCceEEEccCCCCCCCCCCCcccc
Q psy9877 1029 IRRN-EPWNVFVMNRITEIRNLSQEHEWRHVPGEMNPADLPSRGCTM 1074 (1447)
Q Consensus 1029 i~~~-~~~~~~v~nrv~~I~~~~~~~~~~hvp~~~NpAD~~SRg~~~ 1074 (1447)
--.. --.-..++|++ +...++.|||++.||||..|||...
T Consensus 156 kyts~PW~lac~A~wi------Lrgts~~yVPS~~NPAD~PsR~~~~ 196 (245)
T PF00336_consen 156 KYTSFPWLLACAANWI------LRGTSFYYVPSKYNPADDPSRGKLG 196 (245)
T ss_pred ccccCcHHHHHHHHHh------hcCceEEEeccccCcCCCCCCCccc
Confidence 2222 11123355664 5578999999999999999999754
No 12
>PF08284 RVP_2: Retroviral aspartyl protease; InterPro: IPR013242 This region defines single domain aspartyl proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). These proteases are generally part of a larger polyprotein; usually pol, more rarely gag. Retroviral proteases appear to be homologous to a single domain of the two-domain eukaryotic aspartyl proteases.
Probab=98.50 E-value=7.1e-07 Score=91.02 Aligned_cols=84 Identities=20% Similarity=0.226 Sum_probs=62.8
Q ss_pred eEeEEEeCCCCcccccHHhHhhcCCCCC-CeeEEEEEeeeCccccccceeEEEEEEEecCCCceEEEEEeeecCccccCC
Q psy9877 342 LARLMIDTGSQQSYVLEQTMNSLSYTPI-TKQSMRHALFGGSITDAMDHNLFKIVISNLDNTFACDFDVFGQSKICSTIP 420 (1447)
Q Consensus 342 ~v~aLLDSGS~~S~Ise~la~~L~L~~~-~~~~l~i~~~gg~~~~~~~~~~v~l~I~~~~~~~~~~v~~lvvp~i~~~lp 420 (1447)
.+.+|+||||+.|||++++|++++++.. .+..+.+.+.|+..........+.+.+ ++..+..++++++
T Consensus 32 ~~~vLiDSGAThsFIs~~~a~~~~l~~~~l~~~~~V~~~g~~~~~~~~~~~~~~~i----~g~~~~~dl~vl~------- 100 (135)
T PF08284_consen 32 PASVLIDSGATHSFISSSFAKKLGLPLEPLPRPIVVSAPGGSINCEGVCPDVPLSI----QGHEFVVDLLVLD------- 100 (135)
T ss_pred EEEEEEecCCCcEEccHHHHHhcCCEEEEccCeeEEecccccccccceeeeEEEEE----CCeEEEeeeEEec-------
Confidence 5999999999999999999999999885 356788887766543333222344444 3345677777777
Q ss_pred CCCCCCccccccccCcccccCCCCCcceeEEecccccccc
Q psy9877 421 SIPSGPWIDELKQHNIELTDLQHPSTTVDILVGADIAGKF 460 (1447)
Q Consensus 421 ~~~~~~~~~~L~~~~i~L~d~~~~~~~iDlLIG~D~~~~i 460 (1447)
+ ...|+|||+||+..+
T Consensus 101 ----------l--------------~~~DvILGm~WL~~~ 116 (135)
T PF08284_consen 101 ----------L--------------GGYDVILGMDWLKKH 116 (135)
T ss_pred ----------c--------------cceeeEeccchHHhC
Confidence 4 468999999999887
No 13
>cd05484 retropepsin_like_LTR_2 Retropepsins_like_LTR, pepsin-like aspartate proteases. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classif
Probab=98.47 E-value=5.7e-07 Score=85.56 Aligned_cols=71 Identities=18% Similarity=0.258 Sum_probs=57.8
Q ss_pred CCCcceeEeEEEeCCCCcccccHHhHhhcCCCCCCeeEEEEEeeeCccccccceeEEEEEEEecCCCceEEEEEeeec
Q psy9877 336 DGKQDKLARLMIDTGSQQSYVLEQTMNSLSYTPITKQSMRHALFGGSITDAMDHNLFKIVISNLDNTFACDFDVFGQS 413 (1447)
Q Consensus 336 ~g~~~~~v~aLLDSGS~~S~Ise~la~~L~L~~~~~~~l~i~~~gg~~~~~~~~~~v~l~I~~~~~~~~~~v~~lvvp 413 (1447)
+|+. +.+|+||||++|+|+++.+.+++.+.+.+....+.+++|......+...+.+++ ++....+++++++
T Consensus 8 ng~~---i~~lvDTGA~~svis~~~~~~lg~~~~~~~~~~v~~a~G~~~~~~G~~~~~v~~----~~~~~~~~~~v~~ 78 (91)
T cd05484 8 NGKP---LKFQLDTGSAITVISEKTWRKLGSPPLKPTKKRLRTATGTKLSVLGQILVTVKY----GGKTKVLTLYVVK 78 (91)
T ss_pred CCEE---EEEEEcCCcceEEeCHHHHHHhCCCccccccEEEEecCCCEeeEeEEEEEEEEE----CCEEEEEEEEEEE
Confidence 5555 999999999999999999999999877889999999999887777653334444 3445888888888
No 14
>PF12384 Peptidase_A2B: Ty3 transposon peptidase; InterPro: IPR024650 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Ty3 is a gypsy-type, retrovirus-like, element found in the budding yeast. The Ty3 aspartyl protease is required for processing of the viral polyprotein into its mature species [].
Probab=98.37 E-value=1.7e-06 Score=87.12 Aligned_cols=71 Identities=11% Similarity=0.198 Sum_probs=60.9
Q ss_pred eEeEEEeCCCCcccccHHhHhhcCCCCCCeeEEEEEeeeCccccccceeEEEEEEEecCCCceEEEEEeeecCc
Q psy9877 342 LARLMIDTGSQQSYVLEQTMNSLSYTPITKQSMRHALFGGSITDAMDHNLFKIVISNLDNTFACDFDVFGQSKI 415 (1447)
Q Consensus 342 ~v~aLLDSGS~~S~Ise~la~~L~L~~~~~~~l~i~~~gg~~~~~~~~~~v~l~I~~~~~~~~~~v~~lvvp~i 415 (1447)
.+.+||||||-+|||+.+++++|+|+......++++|+.+.....+.. .|.+.+.. ++..+.+.++|.+.+
T Consensus 45 ~i~vLfDSGSPTSfIr~di~~kL~L~~~~app~~fRG~vs~~~~~tsE-Av~ld~~i--~n~~i~i~aYV~d~m 115 (177)
T PF12384_consen 45 PIKVLFDSGSPTSFIRSDIVEKLELPTHDAPPFRFRGFVSGESATTSE-AVTLDFYI--DNKLIDIAAYVTDNM 115 (177)
T ss_pred EEEEEEeCCCccceeehhhHHhhCCccccCCCEEEeeeccCCceEEEE-eEEEEEEE--CCeEEEEEEEEeccC
Confidence 399999999999999999999999999999999999999876665544 77777754 667789999999954
No 15
>cd05479 RP_DDI RP_DDI; retropepsin-like domain of DNA damage inducible protein. The family represents the retropepsin-like domain of DNA damage inducible protein. DNA damage inducible protein has a retropepsin-like domain and an amino-terminal ubiquitin-like domain and/or a UBA (ubiquitin-associated) domain. This CD represents the retropepsin-like domain of DDI.
Probab=98.32 E-value=3e-06 Score=85.48 Aligned_cols=93 Identities=15% Similarity=0.248 Sum_probs=61.6
Q ss_pred EeeEEEEEecCCCCcceeEeEEEeCCCCcccccHHhHhhcCCCCCCeeEE--EEEeeeCccccccce-eEEEEEEEecCC
Q psy9877 325 LQTLVVKIRGKDGKQDKLARLMIDTGSQQSYVLEQTMNSLSYTPITKQSM--RHALFGGSITDAMDH-NLFKIVISNLDN 401 (1447)
Q Consensus 325 l~tv~v~v~~~~g~~~~~v~aLLDSGS~~S~Ise~la~~L~L~~~~~~~l--~i~~~gg~~~~~~~~-~~v~l~I~~~~~ 401 (1447)
...+.+.+ +|.. +.+|+||||+.|||+.++|++||++....... .+.+.|+. ...+. ..+.+.| +
T Consensus 16 ~~~v~~~I---ng~~---~~~LvDTGAs~s~Is~~~a~~lgl~~~~~~~~~~~~~g~g~~--~~~g~~~~~~l~i----~ 83 (124)
T cd05479 16 MLYINVEI---NGVP---VKAFVDSGAQMTIMSKACAEKCGLMRLIDKRFQGIAKGVGTQ--KILGRIHLAQVKI----G 83 (124)
T ss_pred EEEEEEEE---CCEE---EEEEEeCCCceEEeCHHHHHHcCCccccCcceEEEEecCCCc--EEEeEEEEEEEEE----C
Confidence 44555655 4554 89999999999999999999999986433333 44454442 22221 1334444 3
Q ss_pred CceEEEEEeeecCccccCCCCCCCCccccccccCcccccCCCCCcceeEEecccccccc
Q psy9877 402 TFACDFDVFGQSKICSTIPSIPSGPWIDELKQHNIELTDLQHPSTTVDILVGADIAGKF 460 (1447)
Q Consensus 402 ~~~~~v~~lvvp~i~~~lp~~~~~~~~~~L~~~~i~L~d~~~~~~~iDlLIG~D~~~~i 460 (1447)
+..+.+++.++| + ...|+|||+|++..+
T Consensus 84 ~~~~~~~~~Vl~-----------------~--------------~~~d~ILG~d~L~~~ 111 (124)
T cd05479 84 NLFLPCSFTVLE-----------------D--------------DDVDFLIGLDMLKRH 111 (124)
T ss_pred CEEeeeEEEEEC-----------------C--------------CCcCEEecHHHHHhC
Confidence 344566777776 2 267899999999876
No 16
>cd06222 RnaseH RNase H (RNase HI) is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a non-sequence-specific manner. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. RNase H knockout mice lack mitochondrial DNA replication and die as embryos. The retroviral reverse transcriptase contains an RNase H domain that plays an important role in converting a single-stranded retroviral genomic RNA into a dsDNA for integration into host chromosomes. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription.
Probab=98.24 E-value=2.7e-06 Score=85.32 Aligned_cols=112 Identities=23% Similarity=0.222 Sum_probs=77.6
Q ss_pred EEeccccc------ceeeEEEEEEEecCCceEEEEEeeecccccCCCCCCcccchhHhHHHHHHHHHHHHHHHHhccccc
Q psy9877 940 VFSDASKL------AYAAAVFLRVEHSKTKVSLHLLAARARVAPSSKSKSTLTIPRLELLAASIATRLCQTVVKDYKLQD 1013 (1447)
Q Consensus 940 vftDAS~~------g~gavly~r~~~~~g~~~~~~~~sksr~~p~k~~~~~~siprlEL~A~~~a~~~~~~l~~~l~~~~ 1013 (1447)
+|+|||-. |||+++.. .++... +.+ .... . ..|....|+.|++.|++++. ....
T Consensus 2 ~~~Dgs~~~~~~~~g~g~v~~~----~~~~~~---~~~-~~~~--~----~~s~~~aEl~al~~al~~~~------~~~~ 61 (130)
T cd06222 2 IYTDGSCRGNPGPAGAGVVLRD----PGGEVL---LSG-GLLG--G----NTTNNRAELLALIEALELAL------ELGG 61 (130)
T ss_pred EEecccCCCCCCceEEEEEEEe----CCCeEE---Eec-cccC--C----CCcHHHHHHHHHHHHHHHHH------hCCC
Confidence 78999987 77776642 233221 111 1111 3 67899999999999998766 3457
Q ss_pred cceEEEechHHHHHHHhcC-CCcchhhhhhHHHhhhcC---CCceEEEccC----CCC-CCCCCCCc
Q psy9877 1014 VRTTFWTDATTVLAWIRRN-EPWNVFVMNRITEIRNLS---QEHEWRHVPG----EMN-PADLPSRG 1071 (1447)
Q Consensus 1014 ~~~~~~tDs~~~l~~i~~~-~~~~~~v~nrv~~I~~~~---~~~~~~hvp~----~~N-pAD~~SRg 1071 (1447)
..+.++|||+.++..+++. .....-....+..|++.. ..+.+.|||+ ..| .||.++|.
T Consensus 62 ~~i~i~~Ds~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~i~~v~~h~~~~~n~~ad~la~~ 128 (130)
T cd06222 62 KKVNIYTDSQYVINALTGWYEGKPVKNVDLWQRLLALLKRFHKVRFEWVPGHSGIEGNERADALAKE 128 (130)
T ss_pred ceEEEEECHHHHHHHhhccccCCChhhHHHHHHHHHHHhCCCeEEEEEcCCCCCCcchHHHHHHHHh
Confidence 8999999999999999987 312233444444555443 4789999999 787 79988763
No 17
>PF13650 Asp_protease_2: Aspartyl protease
Probab=98.14 E-value=1.2e-05 Score=76.03 Aligned_cols=67 Identities=22% Similarity=0.267 Sum_probs=46.4
Q ss_pred eEeEEEeCCCCcccccHHhHhhcCCCCCCe-eEEEEEeeeCccccccceeEEEEEEEecCCCceE-EEEEeeec
Q psy9877 342 LARLMIDTGSQQSYVLEQTMNSLSYTPITK-QSMRHALFGGSITDAMDHNLFKIVISNLDNTFAC-DFDVFGQS 413 (1447)
Q Consensus 342 ~v~aLLDSGS~~S~Ise~la~~L~L~~~~~-~~l~i~~~gg~~~~~~~~~~v~l~I~~~~~~~~~-~v~~lvvp 413 (1447)
.+.+||||||+.|+|+++++++|+++.... ....+.+++|.... ......+++| ++..+ .+++.+++
T Consensus 9 ~~~~liDTGa~~~~i~~~~~~~l~~~~~~~~~~~~~~~~~g~~~~-~~~~~~~i~i----g~~~~~~~~~~v~~ 77 (90)
T PF13650_consen 9 PVRFLIDTGASISVISRSLAKKLGLKPRPKSVPISVSGAGGSVTV-YRGRVDSITI----GGITLKNVPFLVVD 77 (90)
T ss_pred EEEEEEcCCCCcEEECHHHHHHcCCCCcCCceeEEEEeCCCCEEE-EEEEEEEEEE----CCEEEEeEEEEEEC
Confidence 489999999999999999999999988432 26888899887333 3332444444 22332 45555555
No 18
>PF00077 RVP: Retroviral aspartyl protease The Prosite entry also includes Pfam:PF00026; InterPro: IPR018061 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. 
More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of aspartic peptidases belong to the MEROPS peptidase family A2 (retropepsin family, clan AA), subfamily A2A. The family includes the single domain aspartic proteases from retroviruses, retrotransposons, and badnaviruses (plant dsDNA viruses). Retroviral aspartyl protease is synthesised as part of the POL polyprotein that contains; an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins.; PDB: 3D3T_B 3SQF_A 1NSO_A 2HB3_A 2HS2_A 2HS1_B 3K4V_A 3GGV_C 1HTG_B 2FDE_A ....
Probab=98.08 E-value=5.8e-06 Score=80.14 Aligned_cols=76 Identities=22% Similarity=0.309 Sum_probs=54.9
Q ss_pred eeEEEEEecCCCCcceeEeEEEeCCCCcccccHHhHhhcCCCCCCeeEEEEEeeeCccccccceeEEEEEEEecCCCceE
Q psy9877 326 QTLVVKIRGKDGKQDKLARLMIDTGSQQSYVLEQTMNSLSYTPITKQSMRHALFGGSITDAMDHNLFKIVISNLDNTFAC 405 (1447)
Q Consensus 326 ~tv~v~v~~~~g~~~~~v~aLLDSGS~~S~Ise~la~~L~L~~~~~~~l~i~~~gg~~~~~~~~~~v~l~I~~~~~~~~~ 405 (1447)
+.+.+.+ +|.. +.|||||||++|+|+++.+..++.. ......+.++||.. ...+...+++.+ +...+
T Consensus 6 p~i~v~i---~g~~---i~~LlDTGA~vsiI~~~~~~~~~~~--~~~~~~v~~~~g~~-~~~~~~~~~v~~----~~~~~ 72 (100)
T PF00077_consen 6 PYITVKI---NGKK---IKALLDTGADVSIISEKDWKKLGPP--PKTSITVRGAGGSS-SILGSTTVEVKI----GGKEF 72 (100)
T ss_dssp SEEEEEE---TTEE---EEEEEETTBSSEEESSGGSSSTSSE--EEEEEEEEETTEEE-EEEEEEEEEEEE----TTEEE
T ss_pred ceEEEeE---CCEE---EEEEEecCCCcceeccccccccccc--ccCCceeccCCCcc-eeeeEEEEEEEE----ECccc
Confidence 3455555 4544 9999999999999999999887655 67888999999987 444432334433 44556
Q ss_pred EEEEeeecC
Q psy9877 406 DFDVFGQSK 414 (1447)
Q Consensus 406 ~v~~lvvp~ 414 (1447)
...++++|.
T Consensus 73 ~~~~~v~~~ 81 (100)
T PF00077_consen 73 NHTFLVVPD 81 (100)
T ss_dssp EEEEEESST
T ss_pred eEEEEecCC
Confidence 667888874
No 19
>PF09668 Asp_protease: Aspartyl protease; InterPro: IPR019103 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionary related), and further sub-divided into families, largely on the basis of their tertiary structure. This family of eukaryotic aspartyl proteases have a fold similar to retroviral proteases which implies they function proteolytically during regulated protein turnover []. ; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis; PDB: 3S8I_A 2I1A_B.
Probab=98.06 E-value=8.5e-06 Score=80.74 Aligned_cols=95 Identities=14% Similarity=0.209 Sum_probs=55.4
Q ss_pred EEeeEEEEEecCCCCcceeEeEEEeCCCCcccccHHhHhhcCCCCC--CeeEEEEEeeeCccccccceeEEEEEEEecCC
Q psy9877 324 LLQTLVVKIRGKDGKQDKLARLMIDTGSQQSYVLEQTMNSLSYTPI--TKQSMRHALFGGSITDAMDHNLFKIVISNLDN 401 (1447)
Q Consensus 324 ~l~tv~v~v~~~~g~~~~~v~aLLDSGS~~S~Ise~la~~L~L~~~--~~~~l~i~~~gg~~~~~~~~~~v~l~I~~~~~ 401 (1447)
.+-++++.+ +|.. +.|++||||+.|+||.++|+++||..+ ..-.....|+|... ..+. +-.+.|+. +
T Consensus 23 ~mLyI~~~i---ng~~---vkA~VDtGAQ~tims~~~a~r~gL~~lid~r~~g~a~GvG~~~--i~G~-Ih~~~l~i--g 91 (124)
T PF09668_consen 23 SMLYINCKI---NGVP---VKAFVDTGAQSTIMSKSCAERCGLMRLIDKRFAGVAKGVGTQK--ILGR-IHSVQLKI--G 91 (124)
T ss_dssp ---EEEEEE---TTEE---EEEEEETT-SS-EEEHHHHHHTTGGGGEEGGG-EE-------E--EEEE-EEEEEEEE--T
T ss_pred ceEEEEEEE---CCEE---EEEEEeCCCCccccCHHHHHHcCChhhccccccccccCCCcCc--eeEE-EEEEEEEE--C
Confidence 344556655 5554 999999999999999999999999765 22333455554333 2222 33344443 4
Q ss_pred CceEEEEEeeecCccccCCCCCCCCccccccccCcccccCCCCCcceeEEecccccccc
Q psy9877 402 TFACDFDVFGQSKICSTIPSIPSGPWIDELKQHNIELTDLQHPSTTVDILVGADIAGKF 460 (1447)
Q Consensus 402 ~~~~~v~~lvvp~i~~~lp~~~~~~~~~~L~~~~i~L~d~~~~~~~iDlLIG~D~~~~i 460 (1447)
+..+.+.+.|++. .++|+|||.|++..+
T Consensus 92 ~~~~~~s~~Vle~-------------------------------~~~d~llGld~L~~~ 119 (124)
T PF09668_consen 92 GLFFPCSFTVLED-------------------------------QDVDLLLGLDMLKRH 119 (124)
T ss_dssp TEEEEEEEEEETT-------------------------------SSSSEEEEHHHHHHT
T ss_pred CEEEEEEEEEeCC-------------------------------CCcceeeeHHHHHHh
Confidence 4556666777662 356899999998654
No 20
>PF00075 RNase_H: RNase H; InterPro: IPR002156 The RNase H domain is responsible for hydrolysis of the RNA portion of RNA x DNA hybrids, and this activity requires the presence of divalent cations (Mg2+ or Mn2+) that bind its active site. This domain is a part of a large family of homologous RNase H enzymes of which the RNase HI protein from Escherichia coli is the best characterised []. Secondary structure predictions for the enzymes from E. coli, yeast, human liver and diverse retroviruses (such as Rous sarcoma virus and the Foamy viruses) supported, in every case, the five beta-strands (1 to 5) and four or five alpha-helices (A, B/C, D, E) that have been identified by crystallography in the RNase H domain of Human immunodeficiency virus 1 (HIV-1) reverse transcriptase and in E. coli RNase H []. Reverse transcriptase (RT) is a modular enzyme carrying polymerase and ribonuclease H (RNase H) activities in separable domains. Reverse transcriptase (RT) converts the single-stranded RNA genome of a retrovirus into a double-stranded DNA copy for integration into the host genome. This process requires ribonuclease H as well as RNA- and DNA-directed DNA polymerase activities. Retroviral RNase H is synthesised as part of the POL polyprotein that contains; an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. Bacterial RNase H 3.1.26.4 from EC catalyses endonucleolytic cleavage to 5'-phosphomonoester acting on RNA-DNA hybrids. The 3D structure of the RNase H domain from diverse bacteria and retroviruses has been solved [, , ]. All have four beta strands and four to five alpha helices. The E. coli RNase H1 protein binds a single Mg2+ ion cofactor in the active site of the enzyme. The divalent cation is bound by the carboxyl groups of four acidic residues, Asp-10, Glu-48, Asp-70, and Asp-134 []. 
The first three acidic residues are highly conserved in all bacterial and retroviral RNase H sequences. ; GO: 0003676 nucleic acid binding, 0004523 ribonuclease H activity; PDB: 3LP3_B 2KW4_A 3P1G_A 1RIL_A 2RPI_A 4EQJ_G 4EP2_B 3OTY_P 3U3G_D 2ZQB_D ....
Probab=98.02 E-value=1.3e-05 Score=81.88 Aligned_cols=100 Identities=20% Similarity=0.165 Sum_probs=64.6
Q ss_pred eEEEEecccc------cceeeEEEEEEEecCCceEEEEEeeecccccCCCCCCcccchhHhHHHHHHHHHHHHHHHHhcc
Q psy9877 937 SLHVFSDASK------LAYAAAVFLRVEHSKTKVSLHLLAARARVAPSSKSKSTLTIPRLELLAASIATRLCQTVVKDYK 1010 (1447)
Q Consensus 937 ~L~vftDAS~------~g~gavly~r~~~~~g~~~~~~~~sksr~~p~k~~~~~~siprlEL~A~~~a~~~~~~l~~~l~ 1010 (1447)
++.+|||||- .|+|.|++ +|. ....+.. ..|..+.||.|+..|++.+ .
T Consensus 3 ~~~iytDgS~~~~~~~~~~g~v~~------~~~---------~~~~~~~----~~s~~~aEl~Ai~~AL~~~---~---- 56 (132)
T PF00075_consen 3 AIIIYTDGSCRPNPGKGGAGYVVW------GGR---------NFSFRLG----GQSNNRAELQAIIEALKAL---E---- 56 (132)
T ss_dssp SEEEEEEEEECTTTTEEEEEEEEE------TTE---------EEEEEEE----SECHHHHHHHHHHHHHHTH---S----
T ss_pred cEEEEEeCCccCCCCceEEEEEEE------CCe---------EEEeccc----ccchhhhheehHHHHHHHh---h----
Confidence 5889999993 36666442 331 1112222 4688999999999999733 1
Q ss_pred ccccceEEEechHHHHHHHhc-----CC--C-cchhhhhhHHHhhhcCCCceEEEccCCCCC
Q psy9877 1011 LQDVRTTFWTDATTVLAWIRR-----NE--P-WNVFVMNRITEIRNLSQEHEWRHVPGEMNP 1064 (1447)
Q Consensus 1011 ~~~~~~~~~tDs~~~l~~i~~-----~~--~-~~~~v~nrv~~I~~~~~~~~~~hvp~~~Np 1064 (1447)
...++|+|||+.++.++.. .. . ...-+.+++.++...-..+.|+||||..|.
T Consensus 57 --~~~v~I~tDS~~v~~~l~~~~~~~~~~~~~~~~~i~~~i~~~~~~~~~v~~~~V~~H~~~ 116 (132)
T PF00075_consen 57 --HRKVTIYTDSQYVLNALNKWLHGNGWKKTSNGRPIKNEIWELLSRGIKVRFRWVPGHSGV 116 (132)
T ss_dssp --TSEEEEEES-HHHHHHHHTHHHHTTSBSCTSSSBHTHHHHHHHHHSSEEEEEESSSSSSS
T ss_pred --cccccccccHHHHHHHHHHhccccccccccccccchhheeeccccceEEeeeeccCcCCC
Confidence 1788999999999998887 31 1 111233344444333447999999999764
No 21
>PRK07708 hypothetical protein; Validated
Probab=97.81 E-value=7.1e-05 Score=82.63 Aligned_cols=120 Identities=13% Similarity=0.137 Sum_probs=77.7
Q ss_pred eEEEEecccc------cceeeEEEEEEEecCCceEEEEEeeecccccCCCCCCcccchhHhHHHHHHHHHHHHHHHHhcc
Q psy9877 937 SLHVFSDASK------LAYAAAVFLRVEHSKTKVSLHLLAARARVAPSSKSKSTLTIPRLELLAASIATRLCQTVVKDYK 1010 (1447)
Q Consensus 937 ~L~vftDAS~------~g~gavly~r~~~~~g~~~~~~~~sksr~~p~k~~~~~~siprlEL~A~~~a~~~~~~l~~~l~ 1010 (1447)
.+++|+|+|- .|+|+|++. .+|.....+..+ ..+.+ ..|-++.|+.|+..|++++.. ++
T Consensus 73 ~~~vY~DGs~~~n~g~aG~GvVI~~----~~g~~~~~~~~~-~~l~~------~~TNN~AEy~Ali~aL~~A~e----~g 137 (219)
T PRK07708 73 EILVYFDGGFDKETKLAGLGIVIYY----KQGNKRYRIRRN-AYIEG------IYDNNEAEYAALYYAMQELEE----LG 137 (219)
T ss_pred cEEEEEeeccCCCCCCcEEEEEEEE----CCCCEEEEEEee-ccccc------cccCcHHHHHHHHHHHHHHHH----cC
Confidence 5889999975 567777653 334322221111 22222 347788999999999976644 33
Q ss_pred ccccceEEEechHHHHHHHhcC-CCcchhhhhhHHHhhhcCC----CceEEEccCCCC-CCCCCCCc
Q psy9877 1011 LQDVRTTFWTDATTVLAWIRRN-EPWNVFVMNRITEIRNLSQ----EHEWRHVPGEMN-PADLPSRG 1071 (1447)
Q Consensus 1011 ~~~~~~~~~tDs~~~l~~i~~~-~~~~~~v~nrv~~I~~~~~----~~~~~hvp~~~N-pAD~~SRg 1071 (1447)
+....+.|++||+.++.|+++. +.........+.++++... .+.+.|||-+.| .||.+.+-
T Consensus 138 ~~~~~V~I~~DSqlVi~qi~g~wk~~~~~l~~y~~~i~~l~~~~~l~~~~~~VpR~~N~~AD~LAk~ 204 (219)
T PRK07708 138 VKHEPVTFRGDSQVVLNQLAGEWPCYDEHLNHWLDRIEQKLKQLKLTPVYEPISRKQNKEADQLATQ 204 (219)
T ss_pred CCcceEEEEeccHHHHHHhCCCceeCChhHHHHHHHHHHHHhhCCceEEEEECCchhhhHHHHHHHH
Confidence 3334589999999999999998 4433333334444443322 367799999999 69987764
No 22
>PRK13907 rnhA ribonuclease H; Provisional
Probab=97.81 E-value=5.3e-05 Score=77.07 Aligned_cols=112 Identities=18% Similarity=0.161 Sum_probs=73.3
Q ss_pred EEEEecccc------cceeeEEEEEEEecCCceEEEEEeeecccccCCCCCCcccchhHhHHHHHHHHHHHHHHHHhccc
Q psy9877 938 LHVFSDASK------LAYAAAVFLRVEHSKTKVSLHLLAARARVAPSSKSKSTLTIPRLELLAASIATRLCQTVVKDYKL 1011 (1447)
Q Consensus 938 L~vftDAS~------~g~gavly~r~~~~~g~~~~~~~~sksr~~p~k~~~~~~siprlEL~A~~~a~~~~~~l~~~l~~ 1011 (1447)
+++|+|||- .|||+|+ +. .+|... .+ .+.. ..|=.+.|+.|+..|++++...-
T Consensus 2 ~~iy~DGa~~~~~g~~G~G~vi--~~--~~~~~~----~~----~~~~----~~tn~~AE~~All~aL~~a~~~g----- 60 (128)
T PRK13907 2 IEVYIDGASKGNPGPSGAGVFI--KG--VQPAVQ----LS----LPLG----TMSNHEAEYHALLAALKYCTEHN----- 60 (128)
T ss_pred EEEEEeeCCCCCCCccEEEEEE--EE--CCeeEE----EE----eccc----ccCCcHHHHHHHHHHHHHHHhCC-----
Confidence 578999864 4667776 32 334221 11 1222 45678999999999998765432
Q ss_pred cccceEEEechHHHHHHHhcCCCcchhhhhhHHHhhhc---CCCceEEEccCCCC-CCCCCCCc
Q psy9877 1012 QDVRTTFWTDATTVLAWIRRNEPWNVFVMNRITEIRNL---SQEHEWRHVPGEMN-PADLPSRG 1071 (1447)
Q Consensus 1012 ~~~~~~~~tDs~~~l~~i~~~~~~~~~v~nrv~~I~~~---~~~~~~~hvp~~~N-pAD~~SRg 1071 (1447)
..++.++|||+.++.++++.-....-...-+.+++.+ ...+.+.|||...| .||.+.|.
T Consensus 61 -~~~v~i~sDS~~vi~~~~~~~~~~~~~~~l~~~~~~l~~~f~~~~~~~v~r~~N~~Ad~LA~~ 123 (128)
T PRK13907 61 -YNIVSFRTDSQLVERAVEKEYAKNKMFAPLLEEALQYIKSFDLFFIKWIPSSQNKVADELARK 123 (128)
T ss_pred -CCEEEEEechHHHHHHHhHHHhcChhHHHHHHHHHHHHhcCCceEEEEcCchhchhHHHHHHH
Confidence 4579999999999999998611112223334444443 34567799999999 69987663
No 23
>TIGR02281 clan_AA_DTGA clan AA aspartic protease, TIGR02281 family. This family consists of predicted aspartic proteases, typically from 180 to 230 amino acids in length, in MEROPS clan AA. This model describes the well-conserved 121-residue C-terminal region. The poorly conserved, variable-length N-terminal region usually contains a predicted transmembrane helix. Sequences in the seed alignment and those scoring above the trusted cutoff are Proteobacterial; homologs scoring between trusted and noise are found in Pyrobaculum aerophilum str. IM2 (archaeal), Pirellula sp. (Planctomycetes), and Nostoc sp. PCC 7120 (Cyanobacteria).
Probab=97.74 E-value=0.00011 Score=73.71 Aligned_cols=52 Identities=17% Similarity=0.297 Sum_probs=41.5
Q ss_pred eEEEEEecCCCCcceeEeEEEeCCCCcccccHHhHhhcCCCCCC-eeEEEEEeeeCccc
Q psy9877 327 TLVVKIRGKDGKQDKLARLMIDTGSQQSYVLEQTMNSLSYTPIT-KQSMRHALFGGSIT 384 (1447)
Q Consensus 327 tv~v~v~~~~g~~~~~v~aLLDSGS~~S~Ise~la~~L~L~~~~-~~~l~i~~~gg~~~ 384 (1447)
++++.+ +|.. +.+|+||||+.++|+.++|++||++.-. .....+.+++|...
T Consensus 13 ~v~~~I---nG~~---~~flVDTGAs~t~is~~~A~~Lgl~~~~~~~~~~~~ta~G~~~ 65 (121)
T TIGR02281 13 YATGRV---NGRN---VRFLVDTGATSVALNEEDAQRLGLDLNRLGYTVTVSTANGQIK 65 (121)
T ss_pred EEEEEE---CCEE---EEEEEECCCCcEEcCHHHHHHcCCCcccCCceEEEEeCCCcEE
Confidence 566777 5554 9999999999999999999999998632 34677888888643
No 24
>cd06095 RP_RTVL_H_like Retropepsin of the RTVL_H family of human endogenous retrovirus-like elements. This family includes aspartate proteases from retroelements with LTR (long terminal repeats) including the RTVL_H family of human endogenous retrovirus-like elements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where
Probab=97.72 E-value=0.00012 Score=68.71 Aligned_cols=42 Identities=17% Similarity=0.180 Sum_probs=36.6
Q ss_pred CCCcceeEeEEEeCCCCcccccHHhHhhcCCCCCCeeEEEEEeeeCcc
Q psy9877 336 DGKQDKLARLMIDTGSQQSYVLEQTMNSLSYTPITKQSMRHALFGGSI 383 (1447)
Q Consensus 336 ~g~~~~~v~aLLDSGS~~S~Ise~la~~L~L~~~~~~~l~i~~~gg~~ 383 (1447)
||.. +.+|+||||+.|+|+++.|+++ .. ......+.|+||..
T Consensus 6 nG~~---~~fLvDTGA~~tii~~~~a~~~--~~-~~~~~~v~gagG~~ 47 (86)
T cd06095 6 EGVP---IVFLVDTGATHSVLKSDLGPKQ--EL-STTSVLIRGVSGQS 47 (86)
T ss_pred CCEE---EEEEEECCCCeEEECHHHhhhc--cC-CCCcEEEEeCCCcc
Confidence 5655 9999999999999999999998 22 56899999999987
No 25
>cd01648 TERT TERT: Telomerase reverse transcriptase (TERT). Telomerase is a ribonucleoprotein (RNP) that synthesizes telomeric DNA repeats. The telomerase RNA subunit provides the template for synthesis of these repeats. The catalytic subunit of RNP is known as telomerase reverse transcriptase (TERT). The reverse transcriptase (RT) domain is located in the C-terminal region of the TERT polypeptide. Single amino acid substitutions in this region lead to telomere shortening and senescence. Telomerase is an enzyme that, in certain cells, maintains the physical ends of chromosomes (telomeres) during replication. In somatic cells, replication of the lagging strand requires the continual presence of an RNA primer approximately 200 nucleotides upstream, which is complementary to the template strand. Since there is a region of DNA less than 200 base pairs from the end of the chromosome where this is not possible, the chromosome is continually shortened. However, a surplus of repetitive DNA at
Probab=97.72 E-value=3.4e-05 Score=77.30 Aligned_cols=94 Identities=15% Similarity=-0.031 Sum_probs=71.8
Q ss_pred eeecCccCChHHHHHHHHHHHHHhhcccCCCCCCCchhHHHhhhccccccccccccCcHHHHHHHHHHHHHHH-HhcCCc
Q psy9877 732 RVVFGVSSSPFLLESCLKLHLELTLTDCREGKSSWPIHLVELLKDSFYVDNCLVSTDSQAEAEQFIQVASSIM-KEKGFD 810 (1447)
Q Consensus 732 ~~pfGl~~sP~~~~~~l~~~l~~~~~~~~~~~~~~p~~~~~~~~~~~YvDDil~~~~s~~e~~~~~~~~~~~l-~~~g~~ 810 (1447)
-+|.|...||.++.-.|..+.+........ .........|+||+++.+++.+++.+.++.+.+.+ .+.|+.
T Consensus 19 GlpQG~~lSp~L~nl~l~~l~~~~~~~~~~--------~~~~~~~~rYaDD~li~~~~~~~~~~~~~~l~~~l~~~~gl~ 90 (119)
T cd01648 19 GIPQGSPLSSLLCSLYYADLENKYLSFLDV--------IDKDSLLLRLVDDFLLITTSLDKAIKFLNLLLRGFINQYKTF 90 (119)
T ss_pred cccCCcchHHHHHHHHHHHHHHHHHhhccc--------CCCCceEEEEeCcEEEEeCCHHHHHHHHHHHHHhhHHhhCeE
Confidence 499999999999999998887665433100 01122345799999999999999999999999998 999999
Q ss_pred cccccccCC------CCCCCcceeceeee
Q psy9877 811 LRGWELTGD------KDDKPTNVLGLLWD 833 (1447)
Q Consensus 811 l~k~~snp~------~~~~~~k~LG~~w~ 833 (1447)
++.-|+... .......||||.+|
T Consensus 91 iN~~Kt~~~~~~~~~~~~~~~~flG~~i~ 119 (119)
T cd01648 91 VNFDKTQINFSFAQLDSSDLIPWCGLLIN 119 (119)
T ss_pred ECcccceeeccccccCCCCccCceeEeeC
Confidence 886665421 23567899999875
No 26
>cd05481 retropepsin_like_LTR_1 Retropepsins_like_LTR; pepsin-like aspartate protease from retrotransposons with long terminal repeats. Retropepsin of retrotransposons with long terminal repeats are pepsin-like aspartate proteases. While fungal and mammalian pepsins are bilobal proteins with structurally related N and C-terminals, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identifi
Probab=97.72 E-value=0.00011 Score=70.03 Aligned_cols=69 Identities=14% Similarity=0.183 Sum_probs=57.8
Q ss_pred eEeEEEeCCCCcccccHHhHhhcC---CCCCCeeEEEEEeeeCccccccceeEEEEEEEecCCCceEEEEEeeecC
Q psy9877 342 LARLMIDTGSQQSYVLEQTMNSLS---YTPITKQSMRHALFGGSITDAMDHNLFKIVISNLDNTFACDFDVFGQSK 414 (1447)
Q Consensus 342 ~v~aLLDSGS~~S~Ise~la~~L~---L~~~~~~~l~i~~~gg~~~~~~~~~~v~l~I~~~~~~~~~~v~~lvvp~ 414 (1447)
.++++|||||++|+|+.+.+++|+ .+.+.++++.+++++|+.....+. +.+.+.. ++..+.++|+|++.
T Consensus 10 ~v~~~vDtGA~vnllp~~~~~~l~~~~~~~L~~t~~~L~~~~g~~~~~~G~--~~~~v~~--~~~~~~~~f~Vvd~ 81 (93)
T cd05481 10 SVKFQLDTGATCNVLPLRWLKSLTPDKDPELRPSPVRLTAYGGSTIPVEGG--VKLKCRY--RNPKYNLTFQVVKE 81 (93)
T ss_pred eEEEEEecCCEEEeccHHHHhhhccCCCCcCccCCeEEEeeCCCEeeeeEE--EEEEEEE--CCcEEEEEEEEECC
Confidence 599999999999999999999998 677799999999999999888876 3444432 45568899999884
No 27
>cd05483 retropepsin_like_bacteria Bacterial aspartate proteases, retropepsin-like protease family. This family of bacteria aspartate proteases is a subfamily of retropepsin-like protease family, which includes enzymes from retrovirus and retrotransposons. While fungal and mammalian pepsin-like aspartate proteases are bilobal proteins with structurally related N- and C-termini, this family of bacteria aspartate proteases is half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate proteases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=97.71 E-value=0.00016 Score=69.21 Aligned_cols=64 Identities=20% Similarity=0.265 Sum_probs=46.3
Q ss_pred eeEEEEEecCCCCcceeEeEEEeCCCCcccccHHhHhhcCCCCCCeeEEEEEeeeCccccccceeEEEEEE
Q psy9877 326 QTLVVKIRGKDGKQDKLARLMIDTGSQQSYVLEQTMNSLSYTPITKQSMRHALFGGSITDAMDHNLFKIVI 396 (1447)
Q Consensus 326 ~tv~v~v~~~~g~~~~~v~aLLDSGS~~S~Ise~la~~L~L~~~~~~~l~i~~~gg~~~~~~~~~~v~l~I 396 (1447)
.++++.+. |. .+++||||||+.|+|+.+++++|++.........+.+.+|........ ..+++|
T Consensus 3 ~~v~v~i~---~~---~~~~llDTGa~~s~i~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~-~~~i~i 66 (96)
T cd05483 3 FVVPVTIN---GQ---PVRFLLDTGASTTVISEELAERLGLPLTLGGKVTVQTANGRVRAARVR-LDSLQI 66 (96)
T ss_pred EEEEEEEC---CE---EEEEEEECCCCcEEcCHHHHHHcCCCccCCCcEEEEecCCCccceEEE-cceEEE
Confidence 35666663 43 599999999999999999999999843356777888888876554332 344444
No 28
>COG0328 RnhA Ribonuclease HI [DNA replication, recombination, and repair]
Probab=97.59 E-value=0.00033 Score=72.05 Aligned_cols=113 Identities=24% Similarity=0.283 Sum_probs=73.5
Q ss_pred eEEEEeccc------ccceeeEEEEEEEecCCceEEEEEeeecccccCCCCCCcccchhHhHHHHHHHHHHHHHHHHhcc
Q psy9877 937 SLHVFSDAS------KLAYAAAVFLRVEHSKTKVSLHLLAARARVAPSSKSKSTLTIPRLELLAASIATRLCQTVVKDYK 1010 (1447)
Q Consensus 937 ~L~vftDAS------~~g~gavly~r~~~~~g~~~~~~~~sksr~~p~k~~~~~~siprlEL~A~~~a~~~~~~l~~~l~ 1010 (1447)
.+++|+|+| .-|||+|+. +.+++. ....+.. .-|=+|+||+|++.|++.+..
T Consensus 3 ~v~if~DGa~~gNpG~gG~g~vl~----~~~~~~--------~~s~~~~----~tTNNraEl~A~i~AL~~l~~------ 60 (154)
T COG0328 3 KVEIFTDGACLGNPGPGGWGAVLR----YGDGEK--------ELSGGEG----RTTNNRAELRALIEALEALKE------ 60 (154)
T ss_pred ceEEEecCccCCCCCCceEEEEEE----cCCceE--------EEeeeee----cccChHHHHHHHHHHHHHHHh------
Confidence 478999997 567888886 133332 1112233 557799999999999987765
Q ss_pred ccccceEEEechHHHHHHHh----cC--CCcch----hhhh--hHHHhhhc---CCCceEEEccCCCC-C----CCCCCC
Q psy9877 1011 LQDVRTTFWTDATTVLAWIR----RN--EPWNV----FVMN--RITEIRNL---SQEHEWRHVPGEMN-P----ADLPSR 1070 (1447)
Q Consensus 1011 ~~~~~~~~~tDs~~~l~~i~----~~--~~~~~----~v~n--rv~~I~~~---~~~~~~~hvp~~~N-p----AD~~SR 1070 (1447)
.....+.++|||+-++.-|. +- ..|++ .|.| ....+.++ ...+.|.+|||..+ | ||.+-|
T Consensus 61 ~~~~~v~l~tDS~yv~~~i~~w~~~w~~~~w~~~~~~pvkn~dl~~~~~~~~~~~~~v~~~WVkgH~g~~~NeraD~LA~ 140 (154)
T COG0328 61 LGACEVTLYTDSKYVVEGITRWIVKWKKNGWKTADKKPVKNKDLWEELDELLKRHELVFWEWVKGHAGHPENERADQLAR 140 (154)
T ss_pred cCCceEEEEecHHHHHHHHHHHHhhccccCccccccCccccHHHHHHHHHHHhhCCeEEEEEeeCCCCChHHHHHHHHHH
Confidence 24778999999998765443 32 44442 2332 23344443 34679999998775 3 777666
Q ss_pred c
Q psy9877 1071 G 1071 (1447)
Q Consensus 1071 g 1071 (1447)
.
T Consensus 141 ~ 141 (154)
T COG0328 141 E 141 (154)
T ss_pred H
Confidence 4
No 29
>PF13975 gag-asp_proteas: gag-polyprotein putative aspartyl protease
Probab=97.57 E-value=0.00013 Score=66.00 Aligned_cols=58 Identities=22% Similarity=0.362 Sum_probs=45.5
Q ss_pred EeeEEEEEecCCCCcceeEeEEEeCCCCcccccHHhHhhcCCCCCCee-EEEEEeeeCccccccc
Q psy9877 325 LQTLVVKIRGKDGKQDKLARLMIDTGSQQSYVLEQTMNSLSYTPITKQ-SMRHALFGGSITDAMD 388 (1447)
Q Consensus 325 l~tv~v~v~~~~g~~~~~v~aLLDSGS~~S~Ise~la~~L~L~~~~~~-~l~i~~~gg~~~~~~~ 388 (1447)
..++++.+ +|. .+.+|+||||+.|||++++|++||++..... ...++.++|......+
T Consensus 8 ~~~v~~~I---~g~---~~~alvDtGat~~fis~~~a~rLgl~~~~~~~~~~v~~a~g~~~~~~g 66 (72)
T PF13975_consen 8 LMYVPVSI---GGV---QVKALVDTGATHNFISESLAKRLGLPLEKPPSPIRVKLANGSVIEIRG 66 (72)
T ss_pred EEEEEEEE---CCE---EEEEEEeCCCcceecCHHHHHHhCCCcccCCCCEEEEECCCCccccce
Confidence 45566666 443 4889999999999999999999999995333 5999998887665554
No 30
>cd05480 NRIP_C NRIP_C; putative nuclear receptor interacting protein. Proteins in this family have been described as probable nuclear receptor interacting proteins. The C-terminal domain of this family is homologous to the retroviral aspartyl protease domain. The domain is structurally related to one lobe of the pepsin molecule. The conserved active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This group of aspartate peptidases is classified by MEROPS as the peptidase family A2 (retropepsin family, clan AA), subfamily A2A.
Probab=97.52 E-value=0.00015 Score=67.56 Aligned_cols=27 Identities=22% Similarity=0.375 Sum_probs=25.6
Q ss_pred eEeEEEeCCCCcccccHHhHhhcCCCC
Q psy9877 342 LARLMIDTGSQQSYVLEQTMNSLSYTP 368 (1447)
Q Consensus 342 ~v~aLLDSGS~~S~Ise~la~~L~L~~ 368 (1447)
.+.|++|||||.|+||..+|+++||..
T Consensus 9 ~vkAfVDsGaQ~timS~~caercgL~r 35 (103)
T cd05480 9 ELRALVDTGCQYNLISAACLDRLGLKE 35 (103)
T ss_pred EEEEEEecCCchhhcCHHHHHHcChHh
Confidence 599999999999999999999999975
No 31
>PF07727 RVT_2: Reverse transcriptase (RNA-dependent DNA polymerase); InterPro: IPR013103 A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This entry includes reverse transcriptases not recognised by IPR000477 from INTERPRO.
Probab=97.46 E-value=4e-05 Score=86.68 Aligned_cols=179 Identities=18% Similarity=0.182 Sum_probs=109.5
Q ss_pred CCeEEEEcCCCCCCCcccCccccCCCCc--ccchhHHHhhcccccceeecccccceeeeEeCCCCCCccccccccccccc
Q psy9877 634 TPVRPVFDASAKDNGVSLNDCLEKGPNL--IETIPTSLAKFRINKIGISGDIAKAFLQISVSPQDRDCLSMRQPRIMVSR 711 (1447)
Q Consensus 634 ~k~R~v~D~~~~~~~~slN~~~~~g~~~--~~~l~~~l~~~r~~~~~~~~Di~~af~qi~l~~~dr~~~~~~~~~~~~~~ 711 (1447)
=|.|+|.=+-....|+...+...+-... ..-+..+. ....-.+..+||..||++-.|.++ .. |..|
T Consensus 35 ~KARlVa~G~~Q~~g~dy~et~apv~~~~s~r~~la~a--a~~~~~~~q~Dv~tAfL~~~l~e~-iy---m~~P------ 102 (246)
T PF07727_consen 35 YKARLVARGFTQKPGVDYEETFAPVARLTSIRILLAIA--ASNGLELHQMDVKTAFLNGDLDEE-IY---MRQP------ 102 (246)
T ss_pred eeecccccccccccccchhccccccccccccccccccc--cccccccccccccceeeecccccc-hh---hccc------
Confidence 3567776554434444444333322221 11122111 122345678999999999988654 32 2221
Q ss_pred ceeEEEEecCCCCcEEEEEEeeecCccCChHHHHHHHHHHHHHhhcccCCCCCCCc-h----hHHHhhhccccccccccc
Q psy9877 712 DCLRFLWQDENGRVITYRHCRVVFGVSSSPFLLESCLKLHLELTLTDCREGKSSWP-I----HLVELLKDSFYVDNCLVS 786 (1447)
Q Consensus 712 ~~~~f~w~~~~~~~~~y~~~~~pfGl~~sP~~~~~~l~~~l~~~~~~~~~~~~~~p-~----~~~~~~~~~~YvDDil~~ 786 (1447)
.+|.-. ..+...+++.+-.|||+.||.++...++..|....-. . ....| + .....+.+.+||||+++.
T Consensus 103 --~g~~~~--~~~~~v~~L~kaLYGLKQa~r~W~~~l~~~L~~~GF~--~-~~~D~clfi~~~~~~~~ii~vYVDDili~ 175 (246)
T PF07727_consen 103 --PGFEDP--GPPGKVCRLKKALYGLKQAPRLWYKTLDKFLKKLGFK--Q-SKADPCLFIKKSGDGFIIILVYVDDILIA 175 (246)
T ss_pred --cccccc--ccccccccccccceecccccchhhhhcccccchhhhh--c-ccccccccccccccccccccccccccccc
Confidence 122111 1245789999999999999999999999888653211 0 01111 0 012346789999999999
Q ss_pred cCcHHHHHHHHHHHHHHHHhcCCccccccccCCCCCCCcceeceeeecCCCeEEEee
Q psy9877 787 TDSQAEAEQFIQVASSIMKEKGFDLRGWELTGDKDDKPTNVLGLLWDKSSDTLAINI 843 (1447)
Q Consensus 787 ~~s~~e~~~~~~~~~~~l~~~g~~l~k~~snp~~~~~~~k~LG~~w~~~~d~~~~~~ 843 (1447)
+.+.++..+..+++.+.|. ++ +.+....+||+......+++.+..
T Consensus 176 ~~~~~~i~~~~~~l~~~F~-----iK-------dlG~~~~fLGi~i~~~~~~i~lsQ 220 (246)
T PF07727_consen 176 GPSEEEIEEFKKELKKKFE-----IK-------DLGELKYFLGIEIERTKGGIFLSQ 220 (246)
T ss_pred cccccceeccccccccccc-----cc-------cccccccccceEEEECCCEEEEcH
Confidence 9999887776665554433 22 234457899999988888888776
No 32
>cd03487 RT_Bac_retron_II RT_Bac_retron_II: Reverse transcriptases (RTs) in bacterial retrotransposons or retrons. The polymerase reaction of this enzyme leads to the production of a unique RNA-DNA complex called msDNA (multicopy single-stranded (ss)DNA) in which a small ssDNA branches out from a small ssRNA molecule via a 2'-5'phosphodiester linkage. Bacterial retron RTs produce cDNA corresponding to only a small portion of the retron genome.
Probab=97.42 E-value=4.2e-05 Score=85.21 Aligned_cols=143 Identities=18% Similarity=0.184 Sum_probs=92.3
Q ss_pred hcccccceeecccccceeeeEeCCCCCCcccccc--cccccccceeEEEEecCCCCcEEEEEEeeecCccCChHHHHHHH
Q psy9877 671 KFRINKIGISGDIAKAFLQISVSPQDRDCLSMRQ--PRIMVSRDCLRFLWQDENGRVITYRHCRVVFGVSSSPFLLESCL 748 (1447)
Q Consensus 671 ~~r~~~~~~~~Di~~af~qi~l~~~dr~~~~~~~--~~~~~~~~~~~f~w~~~~~~~~~y~~~~~pfGl~~sP~~~~~~l 748 (1447)
...+..+++.+||+++|-.|.-+.=-+.+..... +.+. .-...+ ..+. ..+|.|...||.+++-+|
T Consensus 52 ~~~~~~~v~~~Di~~fFdsI~~~~L~~~l~~~~~~~~~~~--~~l~~~---------~~~~-~GlpQG~~lSp~Lanl~l 119 (214)
T cd03487 52 PHCGAKYVLKLDIKDFFPSITFERVRGVFRSLGYFSPDVA--TILAKL---------CTYN-GHLPQGAPTSPALSNLVF 119 (214)
T ss_pred HhcCCCEEEEeehhhhcccCCHHHHHHHHHHcCCCCHHHH--HHHHHH---------HhCC-CCcCCCCcccHHHHHHHH
Confidence 3556789999999999988875421111000000 0000 000000 0111 179999999999999999
Q ss_pred HHHHHHhhcccCCCCCCCchhHHHhhhccccccccccccCcHH--HHHHHHHHHHHHHHhcCCccccccccCCCCCCCcc
Q psy9877 749 KLHLELTLTDCREGKSSWPIHLVELLKDSFYVDNCLVSTDSQA--EAEQFIQVASSIMKEKGFDLRGWELTGDKDDKPTN 826 (1447)
Q Consensus 749 ~~~l~~~~~~~~~~~~~~p~~~~~~~~~~~YvDDil~~~~s~~--e~~~~~~~~~~~l~~~g~~l~k~~snp~~~~~~~k 826 (1447)
..+-...... . ....+....||||+++.+++.+ ++.+.+..+.+.|.+.|+.++.-++.-........
T Consensus 120 ~~~d~~l~~~-~---------~~~~~~~~RYaDD~~i~~~~~~~~~~~~~~~~i~~~l~~~gL~ln~~Kt~i~~~~~~~~ 189 (214)
T cd03487 120 RKLDERLSKL-A---------KSNGLTYTRYADDITFSSNKKLKEALDKLLEIIRSILSEEGFKINKSKTRISSKGSRQI 189 (214)
T ss_pred HHHHHHHHHH-H---------HHcCCeEEEEeccEEEEccccchhHHHHHHHHHHHHHHHCCceeCCCceEEccCCCCcE
Confidence 8764433211 0 0112345689999999999988 89999999999999999999855544222356789
Q ss_pred eeceeeecC
Q psy9877 827 VLGLLWDKS 835 (1447)
Q Consensus 827 ~LG~~w~~~ 835 (1447)
+||+.....
T Consensus 190 ~~G~~i~~~ 198 (214)
T cd03487 190 VTGLVVNNG 198 (214)
T ss_pred EEEEEEeCC
Confidence 999998654
No 33
>cd01650 RT_nLTR_like RT_nLTR: Non-LTR (long terminal repeat) retrotransposon and non-LTR retrovirus reverse transcriptase (RT). This subfamily contains both non-LTR retrotransposons and non-LTR retrovirus RTs. RTs catalyze the conversion of single-stranded RNA into double-stranded DNA for integration into host chromosomes. RT is a multifunctional enzyme with RNA-directed DNA polymerase, DNA directed DNA polymerase and ribonuclease hybrid (RNase H) activities.
Probab=97.22 E-value=0.0003 Score=78.57 Aligned_cols=105 Identities=18% Similarity=0.176 Sum_probs=78.2
Q ss_pred ccccceeecccccceeeeEeCCCCCCcccccccccccccceeEEEEecCCCCcEEEEEEeeecCccCChHHHHHHHHHHH
Q psy9877 673 RINKIGISGDIAKAFLQISVSPQDRDCLSMRQPRIMVSRDCLRFLWQDENGRVITYRHCRVVFGVSSSPFLLESCLKLHL 752 (1447)
Q Consensus 673 r~~~~~~~~Di~~af~qi~l~~~dr~~~~~~~~~~~~~~~~~~f~w~~~~~~~~~y~~~~~pfGl~~sP~~~~~~l~~~l 752 (1447)
....+++.+|+++||..|.-+. ++. .. .+|.|...||.+++.+|..+.
T Consensus 79 ~~~~~~l~~Di~~aFdsi~~~~----------------------l~~------~l----GipQG~~lSp~l~~l~~~~l~ 126 (220)
T cd01650 79 KKSLVLVFLDFEKAFDSVDHEF----------------------LLK------AL----GVRQGDPLSPLLFNLALDDLL 126 (220)
T ss_pred CCceEEEEEEHHhhcCcCCHHH----------------------HHH------Hh----CCccCCcccHHHHHHHHHHHH
Confidence 3457899999999997665321 111 00 699999999999999999987
Q ss_pred HHhhccc--CCCCCCCchhHHHhhhccccccccccccCcHH-HHHHHHHHHHHHHHhcCCcccccccc
Q psy9877 753 ELTLTDC--REGKSSWPIHLVELLKDSFYVDNCLVSTDSQA-EAEQFIQVASSIMKEKGFDLRGWELT 817 (1447)
Q Consensus 753 ~~~~~~~--~~~~~~~p~~~~~~~~~~~YvDDil~~~~s~~-e~~~~~~~~~~~l~~~g~~l~k~~sn 817 (1447)
+...... +. ....+....|+||+++.+.+.+ .+....+.+...+...|+.++..++.
T Consensus 127 ~~~~~~~~~~~--------~~~~~~~~~yaDD~~i~~~~~~~~~~~~~~~~~~~~~~~gl~in~~Kt~ 186 (220)
T cd01650 127 RLLNKEEEIKL--------GGPGITHLAYADDIVLFSEGKSRKLQELLQRLQEWSKESGLKINPSKSK 186 (220)
T ss_pred HHHHhhccccC--------CCCccceEEeccceeeeccCCHHHHHHHHHHHHHHHHHcCCEEChhheE
Confidence 7654210 00 0122455789999999999988 99999999999999999988765554
No 34
>PRK00203 rnhA ribonuclease H; Reviewed
Probab=97.12 E-value=0.0027 Score=66.41 Aligned_cols=109 Identities=22% Similarity=0.270 Sum_probs=66.4
Q ss_pred EEEEecccc------cceeeEEEEEEEecCCceEEEEEeeecccccCCCCCCcccchhHhHHHHHHHHHHHHHHHHhccc
Q psy9877 938 LHVFSDASK------LAYAAAVFLRVEHSKTKVSLHLLAARARVAPSSKSKSTLTIPRLELLAASIATRLCQTVVKDYKL 1011 (1447)
Q Consensus 938 L~vftDAS~------~g~gavly~r~~~~~g~~~~~~~~sksr~~p~k~~~~~~siprlEL~A~~~a~~~~~~l~~~l~~ 1011 (1447)
+.+|||+|- .|||+|+.. .++ . ..+ +. ... ..|-++.||.|++.|++.+.
T Consensus 4 v~iytDGs~~~n~~~~g~g~v~~~----~~~-~--~~~-~~----~~~----~~TN~~aEL~Ai~~AL~~~~-------- 59 (150)
T PRK00203 4 VEIYTDGACLGNPGPGGWGAILRY----KGH-E--KEL-SG----GEA----LTTNNRMELMAAIEALEALK-------- 59 (150)
T ss_pred EEEEEEecccCCCCceEEEEEEEE----CCe-e--EEE-ec----CCC----CCcHHHHHHHHHHHHHHHcC--------
Confidence 789999996 467776641 222 1 111 11 122 56789999999999997442
Q ss_pred cccceEEEechHHHHHHHhcC------CCcc----hhhhhh--HHHhhhcCC--CceEEEccCCC----C-CCCCCCC
Q psy9877 1012 QDVRTTFWTDATTVLAWIRRN------EPWN----VFVMNR--ITEIRNLSQ--EHEWRHVPGEM----N-PADLPSR 1070 (1447)
Q Consensus 1012 ~~~~~~~~tDs~~~l~~i~~~------~~~~----~~v~nr--v~~I~~~~~--~~~~~hvp~~~----N-pAD~~SR 1070 (1447)
....+.|+|||+.++.=|+.- +.|+ .-|.|+ +..|.++.. .+.|.|||+.. | -||.+.|
T Consensus 60 ~~~~v~I~tDS~yvi~~i~~w~~~Wk~~~~~~~~g~~v~n~dl~~~i~~l~~~~~v~~~wV~~H~~~~~N~~AD~lA~ 137 (150)
T PRK00203 60 EPCEVTLYTDSQYVRQGITEWIHGWKKNGWKTADKKPVKNVDLWQRLDAALKRHQIKWHWVKGHAGHPENERCDELAR 137 (150)
T ss_pred CCCeEEEEECHHHHHHHHHHHHHHHHHcCCcccCCCccccHHHHHHHHHHhccCceEEEEecCCCCCHHHHHHHHHHH
Confidence 135689999998776544321 2222 234443 344444333 68999999866 3 3786654
No 35
>PF13456 RVT_3: Reverse transcriptase-like; PDB: 3ALY_A 2EHG_A 3HST_B.
Probab=97.10 E-value=0.00016 Score=67.75 Aligned_cols=76 Identities=25% Similarity=0.181 Sum_probs=54.4
Q ss_pred hHhHHHHHHHHHHHHHHHHhccccccceEEEechHHHHHHHhcCCCcchhhhhhHHHhhhcC---CCceEEEccCCCC-C
Q psy9877 989 RLELLAASIATRLCQTVVKDYKLQDVRTTFWTDATTVLAWIRRNEPWNVFVMNRITEIRNLS---QEHEWRHVPGEMN-P 1064 (1447)
Q Consensus 989 rlEL~A~~~a~~~~~~l~~~l~~~~~~~~~~tDs~~~l~~i~~~~~~~~~v~nrv~~I~~~~---~~~~~~hvp~~~N-p 1064 (1447)
..|++|+..|++++.. +...++.+.|||+.++..|++.......+...+..|+... ..+.|.|||.+.| .
T Consensus 3 ~aE~~al~~al~~a~~------~g~~~i~v~sDs~~vv~~i~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~i~r~~N~~ 76 (87)
T PF13456_consen 3 EAEALALLEALQLAWE------LGIRKIIVESDSQLVVDAINGRSSSRSELRPLIQDIRSLLDRFWNVSVSHIPREQNKV 76 (87)
T ss_dssp HHHHHHHHHHHHHHHC------CT-SCEEEEES-HHHHHHHTTSS---SCCHHHHHHHHHHHCCCSCEEEEE--GGGSHH
T ss_pred HHHHHHHHHHHHHHHH------CCCCEEEEEecCccccccccccccccccccccchhhhhhhccccceEEEEEChHHhHH
Confidence 4799999999987643 3378999999999999999998332225566666666654 4789999999999 6
Q ss_pred CCCCCC
Q psy9877 1065 ADLPSR 1070 (1447)
Q Consensus 1065 AD~~SR 1070 (1447)
||.+.|
T Consensus 77 A~~LA~ 82 (87)
T PF13456_consen 77 ADALAK 82 (87)
T ss_dssp HHHHHH
T ss_pred HHHHHH
Confidence 997765
No 36
>PRK07238 bifunctional RNase H/acid phosphatase; Provisional
Probab=97.06 E-value=0.001 Score=80.77 Aligned_cols=117 Identities=13% Similarity=0.062 Sum_probs=75.5
Q ss_pred eEEEEeccccc------ceeeEEEEEEEecCCceEEEEEeeecccccCCCCCCcccchhHhHHHHHHHHHHHHHHHHhcc
Q psy9877 937 SLHVFSDASKL------AYAAAVFLRVEHSKTKVSLHLLAARARVAPSSKSKSTLTIPRLELLAASIATRLCQTVVKDYK 1010 (1447)
Q Consensus 937 ~L~vftDAS~~------g~gavly~r~~~~~g~~~~~~~~sksr~~p~k~~~~~~siprlEL~A~~~a~~~~~~l~~~l~ 1010 (1447)
+++||+|||-. |+|+++.. .+|..... .... ++. ..|-+..|+.|++.|++++..+-
T Consensus 2 ~~~i~~DGa~~~n~g~aG~G~vi~~----~~~~~~~~--~~~~---~~~----~~tnn~AE~~All~gL~~a~~~g---- 64 (372)
T PRK07238 2 KVVVEADGGSRGNPGPAGYGAVVWD----ADRGEVLA--ERAE---AIG----RATNNVAEYRGLIAGLEAAAELG---- 64 (372)
T ss_pred eEEEEecCCCCCCCCceEEEEEEEe----CCCCcEEE--Eeec---ccC----CCCchHHHHHHHHHHHHHHHhCC----
Confidence 57899999765 56665532 33321111 1111 222 34567899999999998765442
Q ss_pred ccccceEEEechHHHHHHHhcC-CCcchhhh---hhHHHhhhcCCCceEEEccCCCC-CCCCCCCcc
Q psy9877 1011 LQDVRTTFWTDATTVLAWIRRN-EPWNVFVM---NRITEIRNLSQEHEWRHVPGEMN-PADLPSRGC 1072 (1447)
Q Consensus 1011 ~~~~~~~~~tDs~~~l~~i~~~-~~~~~~v~---nrv~~I~~~~~~~~~~hvp~~~N-pAD~~SRg~ 1072 (1447)
+..+.|++||+.++.-+++. +....-+. ..+..+......+.|.|||.+.| .||.+.+.-
T Consensus 65 --~~~v~i~~DS~lvi~~i~~~~~~~~~~l~~~~~~i~~l~~~f~~~~i~~v~r~~N~~AD~LA~~a 129 (372)
T PRK07238 65 --ATEVEVRMDSKLVVEQMSGRWKVKHPDMKPLAAQARELASQFGRVTYTWIPRARNAHADRLANEA 129 (372)
T ss_pred --CCeEEEEeCcHHHHHHhCCCCccCChHHHHHHHHHHHHHhcCCceEEEECCchhhhHHHHHHHHH
Confidence 56899999999999999876 22222222 23333334445789999999999 699888743
No 37
>PRK08719 ribonuclease H; Reviewed
Probab=97.04 E-value=0.0017 Score=67.34 Aligned_cols=110 Identities=17% Similarity=0.164 Sum_probs=64.6
Q ss_pred ceEEEEeccccc---------ceeeEEEEEEEecCCceEEEEEeeecccccCCCCCCcccchhHhHHHHHHHHHHHHHHH
Q psy9877 936 WSLHVFSDASKL---------AYAAAVFLRVEHSKTKVSLHLLAARARVAPSSKSKSTLTIPRLELLAASIATRLCQTVV 1006 (1447)
Q Consensus 936 ~~L~vftDAS~~---------g~gavly~r~~~~~g~~~~~~~~sksr~~p~k~~~~~~siprlEL~A~~~a~~~~~~l~ 1006 (1447)
..+++|||+|-. |||++++. .+|...... + .+.. ...|-++.||.|+..|++.+..
T Consensus 3 ~~~~iYtDGs~~~n~~~~~~~G~G~vv~~----~~~~~~~~~--~----~~~~---~~~Tnn~aEl~A~~~aL~~~~~-- 67 (147)
T PRK08719 3 ASYSIYIDGAAPNNQHGCVRGGIGLVVYD----EAGEIVDEQ--S----ITVN---RYTDNAELELLALIEALEYARD-- 67 (147)
T ss_pred ceEEEEEecccCCCCCCCCCcEEEEEEEe----CCCCeeEEE--E----ecCC---CCccHHHHHHHHHHHHHHHcCC--
Confidence 358899998872 77877752 334321111 1 1121 1358999999999999975421
Q ss_pred HhccccccceEEEechHHHHHHHh--------cC-C-Ccchhhhhh--HHHhhhcC--CCceEEEccCCC----C-CCCC
Q psy9877 1007 KDYKLQDVRTTFWTDATTVLAWIR--------RN-E-PWNVFVMNR--ITEIRNLS--QEHEWRHVPGEM----N-PADL 1067 (1447)
Q Consensus 1007 ~~l~~~~~~~~~~tDs~~~l~~i~--------~~-~-~~~~~v~nr--v~~I~~~~--~~~~~~hvp~~~----N-pAD~ 1067 (1447)
...|+|||+-++.=++ +. + .-..-|.|+ +..|.++. ..++|.||||.. | -||.
T Consensus 68 --------~~~i~tDS~yvi~~i~~~~~~W~~~~w~~s~g~~v~n~dl~~~i~~l~~~~~i~~~~VkgH~g~~~Ne~aD~ 139 (147)
T PRK08719 68 --------GDVIYSDSDYCVRGFNEWLDTWKQKGWRKSDKKPVANRDLWQQVDELRARKYVEVEKVTAHSGIEGNEAADM 139 (147)
T ss_pred --------CCEEEechHHHHHHHHHHHHHHHhCCcccCCCcccccHHHHHHHHHHhCCCcEEEEEecCCCCChhHHHHHH
Confidence 2379999988776553 22 1 112334442 22333322 368999999954 4 3664
Q ss_pred C
Q psy9877 1068 P 1068 (1447)
Q Consensus 1068 ~ 1068 (1447)
+
T Consensus 140 l 140 (147)
T PRK08719 140 L 140 (147)
T ss_pred H
Confidence 4
No 38
>PF00098 zf-CCHC: Zinc knuckle; InterPro: IPR001878 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence C-X2-C-X4-H-X4-C, where X can be any amino acid and the number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses, where they are required for viral genome packaging and for the early infection process. They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding. More information about these proteins can be found at Protein of the Month: Zinc Fingers; GO: 0003676 nucleic acid binding, 0008270 zinc ion binding; PDB: 2L44_A 1A1T_A 1WWG_A 1U6P_A 1WWD_A 1WWE_A 1A6B_B 1F6U_A 1MFS_A 1NCP_C ....
Probab=97.01 E-value=0.00034 Score=44.70 Aligned_cols=18 Identities=28% Similarity=0.761 Sum_probs=16.2
Q ss_pred cccccccccccccccccc
Q psy9877 267 QVCYACLKFGHRVSRCRT 284 (1447)
Q Consensus 267 ~lCf~Cl~~GH~~~~C~s 284 (1447)
..||+|++.||++++||+
T Consensus 1 ~~C~~C~~~GH~~~~Cp~ 18 (18)
T PF00098_consen 1 RKCFNCGEPGHIARDCPK 18 (18)
T ss_dssp SBCTTTSCSSSCGCTSSS
T ss_pred CcCcCCCCcCcccccCcc
Confidence 369999999999999984
No 39
>cd01646 RT_Bac_retron_I RT_Bac_retron_I: Reverse transcriptases (RTs) in bacterial retrotransposons or retrons. The polymerase reaction of this enzyme leads to the production of a unique RNA-DNA complex called msDNA (multicopy single-stranded (ss)DNA) in which a small ssDNA branches out from a small ssRNA molecule via a 2'-5'phosphodiester linkage. Bacterial retron RTs produce cDNA corresponding to only a small portion of the retron genome.
Probab=96.87 E-value=0.0019 Score=68.31 Aligned_cols=94 Identities=19% Similarity=0.158 Sum_probs=72.1
Q ss_pred EEEeeecCccCChHHHHHHHHHHHHHhhcccCCCCCCCchhHHHhhhccccccccccccCcHHHHHHHHHHHHHHHHhcC
Q psy9877 729 RHCRVVFGVSSSPFLLESCLKLHLELTLTDCREGKSSWPIHLVELLKDSFYVDNCLVSTDSQAEAEQFIQVASSIMKEKG 808 (1447)
Q Consensus 729 ~~~~~pfGl~~sP~~~~~~l~~~l~~~~~~~~~~~~~~p~~~~~~~~~~~YvDDil~~~~s~~e~~~~~~~~~~~l~~~g 808 (1447)
...-+|.|...||.+++.+|..+-....... ..+....||||+++.+++.+.+.+.++.+.+.+.+.|
T Consensus 50 ~~~GlpqG~~lS~~L~~~~l~~~d~~i~~~~------------~~~~~~RY~DD~~i~~~~~~~~~~~~~~i~~~l~~~g 117 (158)
T cd01646 50 QTNGLPIGPLTSRFLANIYLNDVDHELKSKL------------KGVDYVRYVDDIRIFADSKEEAEEILEELKEFLAELG 117 (158)
T ss_pred CCceEccCcchHHHHHHHHHHHHHHHHHhcc------------CCceEEEecCcEEEEcCCHHHHHHHHHHHHHHHHHCC
Confidence 3457999999999999999988754433211 2345668999999999999999999999999999999
Q ss_pred CccccccccCCCCC---CCcceeceeeec
Q psy9877 809 FDLRGWELTGDKDD---KPTNVLGLLWDK 834 (1447)
Q Consensus 809 ~~l~k~~snp~~~~---~~~k~LG~~w~~ 834 (1447)
+.++..++.-.... ....+||+....
T Consensus 118 L~ln~~Kt~~~~~~~~~~~~~flg~~~~~ 146 (158)
T cd01646 118 LSLNLSKTEILPLPEGTASKDFLGYRFSP 146 (158)
T ss_pred CEEChhhceeeecCCCCccccccceEeeh
Confidence 99986655521112 347899988654
No 40
>TIGR03698 clan_AA_DTGF clan AA aspartic protease, AF_0612 family. Members of this protein family are clan AA aspartic proteases, related to family TIGR02281. These proteins resemble retropepsins, pepsin-like proteases of retroviruses such as HIV. Members of this family are found in archaea and bacteria.
Probab=96.85 E-value=0.0059 Score=59.83 Aligned_cols=56 Identities=23% Similarity=0.210 Sum_probs=44.5
Q ss_pred eEEEEEecCCCCcceeEeEEEeCCCCccc-ccHHhHhhcCCCCCCeeEEEEEeeeCccc
Q psy9877 327 TLVVKIRGKDGKQDKLARLMIDTGSQQSY-VLEQTMNSLSYTPITKQSMRHALFGGSIT 384 (1447)
Q Consensus 327 tv~v~v~~~~g~~~~~v~aLLDSGS~~S~-Ise~la~~L~L~~~~~~~l~i~~~gg~~~ 384 (1447)
++++.+.+++......+.+|+||||+..+ |++++|++||++. .. ...+.+++|...
T Consensus 1 ~~~v~~~~p~~~~~~~v~~LVDTGat~~~~l~~~~a~~lgl~~-~~-~~~~~tA~G~~~ 57 (107)
T TIGR03698 1 TLDVELSNPKNPEFMEVRALVDTGFSGFLLVPPDIVNKLGLPE-LD-QRRVYLADGREV 57 (107)
T ss_pred CEEEEEeCCCCCCceEEEEEEECCCCeEEecCHHHHHHcCCCc-cc-CcEEEecCCcEE
Confidence 36788888854433589999999999887 9999999999988 33 568888888543
No 41
>KOG0012|consensus
Probab=96.60 E-value=0.0021 Score=73.02 Aligned_cols=28 Identities=11% Similarity=0.258 Sum_probs=26.6
Q ss_pred eEeEEEeCCCCcccccHHhHhhcCCCCC
Q psy9877 342 LARLMIDTGSQQSYVLEQTMNSLSYTPI 369 (1447)
Q Consensus 342 ~v~aLLDSGS~~S~Ise~la~~L~L~~~ 369 (1447)
.|.|++||||+.|+||..+|+++||...
T Consensus 246 ~VKAfVDsGaq~timS~~Caer~gL~rl 273 (380)
T KOG0012|consen 246 PVKAFVDSGAQTTIMSAACAERCGLNRL 273 (380)
T ss_pred EEEEEEcccchhhhhhHHHHHHhChHHH
Confidence 4999999999999999999999999875
No 42
>PF14223 UBN2: gag-polypeptide of LTR copia-type
Probab=96.59 E-value=0.015 Score=58.36 Aligned_cols=105 Identities=18% Similarity=0.203 Sum_probs=82.6
Q ss_pred HHHHHHHHHhcCCcchhHHHHHHHHHHHHHhcCCCCChhhHHHHHHHHHHHHHHHHhcCCCCcCCCcchHHHHHccCCHH
Q psy9877 21 NQVVEALKARFGREDLLTEVYIRELLKLVLANTASHDKLPIVILYDRLQSHLRNLESLGVSADRCAPILMPLVSSSLPQD 100 (1447)
Q Consensus 21 ~~A~~~L~~ryg~~~~i~~~~~~~~~~~L~~~p~~~d~~~Lr~l~d~l~~~i~aL~~lg~~~d~~~~~l~~~i~~KLP~~ 100 (1447)
..+|+.|+.+|.....+.+.-+..+..++.++.. .+...+..++.++...++.|.++|.+.+ +..++..|++.||+.
T Consensus 4 ~e~W~~L~~~y~~~~~~~~~~~~~L~~~l~~~k~-~~~~sv~~y~~~~~~i~~~L~~~g~~i~--d~~~v~~iL~~Lp~~ 80 (119)
T PF14223_consen 4 KEAWDALKKRYEGQSKVKQARVQQLKSQLENLKM-KDGESVDEYISRLKEIVDELRAIGKPIS--DEDLVSKILRSLPPS 80 (119)
T ss_pred HHHHHHHHHHHcCCchHHHHHHHHHHHHHHHHHh-cccccHHHHHHHHHHhhhhhhhcCCccc--chhHHHHHHhcCCch
Confidence 5799999999999999665556677777777664 3677889999999999999999999877 789999999999977
Q ss_pred HHHHHHhhccccccccCCCCCCcC--CHHHHHHHHHHHH
Q psy9877 101 LLQMWERCSVTQLHATQPSAGCKE--YLDSLMNFLKAEV 137 (1447)
Q Consensus 101 l~~~w~~~~~~~~~~~~~~~~~~~--t~~~l~~fl~~~~ 137 (1447)
...-.... ... ...+ |+++++..|...-
T Consensus 81 y~~~~~~i-------~~~--~~~~~~t~~el~~~L~~~E 110 (119)
T PF14223_consen 81 YDTFVTAI-------RNS--KDLPKMTLEELISRLLAEE 110 (119)
T ss_pred hHHHHHHH-------Hhc--CCCCcCCHHHHHHHHHHHH
Confidence 66655443 211 2445 8999998776654
No 43
>cd00303 retropepsin_like Retropepsins; pepsin-like aspartate proteases. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements, as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples
Probab=96.43 E-value=0.017 Score=52.77 Aligned_cols=69 Identities=13% Similarity=0.246 Sum_probs=45.5
Q ss_pred eEeEEEeCCCCcccccHHhHhhcCC-CCCCeeEEEEEeeeCccccccceeEEEEEEEecCCCceEEEEEeeec
Q psy9877 342 LARLMIDTGSQQSYVLEQTMNSLSY-TPITKQSMRHALFGGSITDAMDHNLFKIVISNLDNTFACDFDVFGQS 413 (1447)
Q Consensus 342 ~v~aLLDSGS~~S~Ise~la~~L~L-~~~~~~~l~i~~~gg~~~~~~~~~~v~l~I~~~~~~~~~~v~~lvvp 413 (1447)
.+.+|+|+||+.++++.+.++++++ .........+.++++........ ...+.+.. +.....+.+++++
T Consensus 9 ~~~~liDtgs~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~i--~~~~~~~~~~~~~ 78 (92)
T cd00303 9 PVRALVDSGASVNFISESLAKKLGLPPRLLPTPLKVKGANGSSVKTLGV-ILPVTIGI--GGKTFTVDFYVLD 78 (92)
T ss_pred EEEEEEcCCCcccccCHHHHHHcCCCcccCCCceEEEecCCCEeccCcE-EEEEEEEe--CCEEEEEEEEEEc
Confidence 5899999999999999999999987 43356667778877755443332 23333332 2234455555555
No 44
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=96.35 E-value=0.0019 Score=67.78 Aligned_cols=71 Identities=27% Similarity=0.593 Sum_probs=38.8
Q ss_pred CCccccccc-cccccccCCC-----CCCHHHHHHHHHhccccccccccccccccc-ccccc-ccccccCCCCc-ccccCC
Q psy9877 234 KGESCIFCS-QGHYSGDCQK-----DMTLSQRQDIIRNKQVCYACLKFGHRVSRC-RTKSR-LKCEKCGSRHL-TILCPR 304 (1447)
Q Consensus 234 ~~~~C~~C~-~~H~~~~C~~-----~~~~~eR~~~~k~~~lCf~Cl~~GH~~~~C-~s~~~-~~C~~C~~~Hh-t~l~~~ 304 (1447)
....|..|+ .+|...+||. -....-|...--+...|++|+.-||++++| |++.. ..|..|+..+| +..|+.
T Consensus 59 ~~~~C~nCg~~GH~~~DCP~~iC~~C~~~~H~s~~C~~~~~C~~Cg~~GH~~~dC~P~~~~~~~C~~C~s~~H~s~~Cp~ 138 (190)
T COG5082 59 ENPVCFNCGQNGHLRRDCPHSICYNCSWDGHRSNHCPKPKKCYNCGETGHLSRDCNPSKDQQKSCFDCNSTRHSSEDCPS 138 (190)
T ss_pred cccccchhcccCcccccCChhHhhhcCCCCcccccCCcccccccccccCccccccCcccccCcceeccCCCccccccCcc
Confidence 456788888 7888888882 000000111011225677777777777777 33321 25677776443 344544
No 45
>PRK06548 ribonuclease H; Provisional
Probab=96.29 E-value=0.025 Score=59.41 Aligned_cols=81 Identities=23% Similarity=0.215 Sum_probs=52.5
Q ss_pred cccchhHhHHHHHHHHHHHHHHHHhccccccceEEEechHHHHHHHhcC------CCc----chhhhhh--HHHhhhcC-
Q psy9877 984 TLTIPRLELLAASIATRLCQTVVKDYKLQDVRTTFWTDATTVLAWIRRN------EPW----NVFVMNR--ITEIRNLS- 1050 (1447)
Q Consensus 984 ~~siprlEL~A~~~a~~~~~~l~~~l~~~~~~~~~~tDs~~~l~~i~~~------~~~----~~~v~nr--v~~I~~~~- 1050 (1447)
..|=+|.||+|+..|++.+ ......+.|+|||+.++.=++.- +.| ..-|.|+ +..|..+.
T Consensus 39 ~~TNnraEl~Aii~aL~~~-------~~~~~~v~I~TDS~yvi~~i~~W~~~Wk~~gWk~s~G~pV~N~dL~~~l~~l~~ 111 (161)
T PRK06548 39 IATNNIAELTAVRELLIAT-------RHTDRPILILSDSKYVINSLTKWVYSWKMRKWRKADGKPVLNQEIIQEIDSLME 111 (161)
T ss_pred CCCHHHHHHHHHHHHHHhh-------hcCCceEEEEeChHHHHHHHHHHHHHHHHCCCcccCCCccccHHHHHHHHHHHh
Confidence 4578999999999887522 11234689999999998766631 223 2345655 23333332
Q ss_pred -CCceEEEccCCC----C-CCCCCCCc
Q psy9877 1051 -QEHEWRHVPGEM----N-PADLPSRG 1071 (1447)
Q Consensus 1051 -~~~~~~hvp~~~----N-pAD~~SRg 1071 (1447)
..++|.||+|.. | -||.|-|.
T Consensus 112 ~~~v~~~wVkgHsg~~gNe~aD~LA~~ 138 (161)
T PRK06548 112 NRNIRMSWVNAHTGHPLNEAADSLARQ 138 (161)
T ss_pred cCceEEEEEecCCCCHHHHHHHHHHHH
Confidence 268999999866 4 37866654
No 46
>COG5082 AIR1 Arginine methyltransferase-interacting protein, contains RING Zn-finger [Posttranslational modification, protein turnover, chaperones / Intracellular trafficking and secretion]
Probab=96.24 E-value=0.0024 Score=67.13 Aligned_cols=34 Identities=26% Similarity=0.536 Sum_probs=24.3
Q ss_pred hccccccccccccccccccccccccccccCCCCcccc
Q psy9877 265 NKQVCYACLKFGHRVSRCRTKSRLKCEKCGSRHLTIL 301 (1447)
Q Consensus 265 ~~~lCf~Cl~~GH~~~~C~s~~~~~C~~C~~~Hht~l 301 (1447)
....||||++.||.+++||. . -|..|....|.+.
T Consensus 59 ~~~~C~nCg~~GH~~~DCP~-~--iC~~C~~~~H~s~ 92 (190)
T COG5082 59 ENPVCFNCGQNGHLRRDCPH-S--ICYNCSWDGHRSN 92 (190)
T ss_pred cccccchhcccCcccccCCh-h--HhhhcCCCCcccc
Confidence 56789999999999999992 3 5666654444433
No 47
>PF02160 Peptidase_A3: Cauliflower mosaic virus peptidase (A3); InterPro: IPR000588 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases (EC 3.4.23) of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described. Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This group of sequences contains an aspartic peptidase signature that belongs to MEROPS peptidase family A3, subfamily A3A (cauliflower mosaic virus-type endopeptidase, clan AA). Cauliflower mosaic virus belongs to the Retro-transcribing viruses, which have a double-stranded DNA genome. The genome includes an open reading frame (ORF V) that shows similarities to the pol gene of retroviruses. This ORF codes for a polyprotein that includes a reverse transcriptase, which, on the basis of a DTG triplet near the N terminus, was suggested to include an aspartic protease. The presence of an aspartic protease has been confirmed by mutational studies, implicating Asp-45 in catalysis. The protease releases itself from the polyprotein and is involved in reactions required to process the ORF IV polyprotein, which includes the viral coat protein. The viral aspartic peptidase signature has also been found associated with a polyprotein encoded by integrated pararetrovirus-like sequences in the genome of Nicotiana tabacum (Common tobacco).; GO: 0004190 aspartic-type endopeptidase activity, 0006508 proteolysis
Probab=96.14 E-value=0.0051 Score=65.93 Aligned_cols=56 Identities=20% Similarity=0.160 Sum_probs=41.5
Q ss_pred eEEEEEecCCCCcceeEeEEEeCCCCcccccHHhHhhcCCCCCCeeEEEEEeeeCccc
Q psy9877 327 TLVVKIRGKDGKQDKLARLMIDTGSQQSYVLEQTMNSLSYTPITKQSMRHALFGGSIT 384 (1447)
Q Consensus 327 tv~v~v~~~~g~~~~~v~aLLDSGS~~S~Ise~la~~L~L~~~~~~~l~i~~~gg~~~ 384 (1447)
.+.+.+.-++... ..+.|++||||++.++++.+...-.... .+.++.|++++++..
T Consensus 6 yI~~~i~~~gy~~-~~~~~~vDTGAt~C~~~~~iiP~e~we~-~~~~i~v~~an~~~~ 61 (201)
T PF02160_consen 6 YIKVKISFPGYKK-FNYHCYVDTGATICCASKKIIPEEYWEK-SKKPIKVKGANGSII 61 (201)
T ss_pred EEEEEEEEcCcee-EEEEEEEeCCCceEEecCCcCCHHHHHh-CCCcEEEEEecCCce
Confidence 3455556565455 6899999999999999988775544555 667789999987643
No 48
>PF14227 UBN2_2: gag-polypeptide of LTR copia-type
Probab=96.12 E-value=0.041 Score=55.14 Aligned_cols=104 Identities=17% Similarity=0.187 Sum_probs=77.9
Q ss_pred HHHHHHHHHhcCCcchhHHHHHHHHHHHHHhcCCCCChhhHHHHHHHHHHHHHHHHhcCCCCcCCCcchHHHHHccCCHH
Q psy9877 21 NQVVEALKARFGREDLLTEVYIRELLKLVLANTASHDKLPIVILYDRLQSHLRNLESLGVSADRCAPILMPLVSSSLPQD 100 (1447)
Q Consensus 21 ~~A~~~L~~ryg~~~~i~~~~~~~~~~~L~~~p~~~d~~~Lr~l~d~l~~~i~aL~~lg~~~d~~~~~l~~~i~~KLP~~ 100 (1447)
..+|+.|+..|+......+ -.++++|.++.- .+...++..++.++..+..|+++|.+.+ +...+.+|+..||+.
T Consensus 5 ~~~W~~L~~~y~~~~~~~~---~~l~~kl~~~k~-~~~~~v~~hi~~~~~l~~~L~~~g~~i~--d~~~~~~lL~sLP~s 78 (119)
T PF14227_consen 5 KEMWDKLKKKYEKKSFANK---IYLLRKLYSLKM-DEGGSVRDHINEFRSLVNQLKSLGVPID--DEDKVIILLSSLPPS 78 (119)
T ss_pred HHHHHHHHHHHcCCCHHHH---HHHHHHHHHhHh-ccchhHHHHHHHHHHHHHhhccccccch--HHHHHHHHHHcCCHh
Confidence 5789999999999988662 234677887766 2345699999999999999999999887 678999999999987
Q ss_pred HHHHHHhhccccccccCCCCCCcCCHHHHHHHHHHHH
Q psy9877 101 LLQMWERCSVTQLHATQPSAGCKEYLDSLMNFLKAEV 137 (1447)
Q Consensus 101 l~~~w~~~~~~~~~~~~~~~~~~~t~~~l~~fl~~~~ 137 (1447)
...--... ........+|++++..-|.++=
T Consensus 79 y~~~~~~l-------~~~~~~~~~tl~~v~~~L~~ee 108 (119)
T PF14227_consen 79 YDSFVTAL-------LYSKPEDELTLEEVKSKLLQEE 108 (119)
T ss_pred HHHHHHHH-------HccCCCCCcCHHHHHHHHHHHH
Confidence 33333221 1111136789998888777653
No 49
>cd06094 RP_Saci_like RP_Saci_like, retropepsin family. Retropepsin on retrotransposons with long terminal repeats (LTR) including Saci-1, -2 and -3 of Schistosoma mansoni. Retropepsins are related to fungal and mammalian pepsins. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified
Probab=96.10 E-value=0.0097 Score=55.08 Aligned_cols=51 Identities=12% Similarity=0.157 Sum_probs=39.0
Q ss_pred eEeEEEeCCCCcccccHHhHhhcCCCCCCeeEEEEEeeeCccccccceeEEEEEE
Q psy9877 342 LARLMIDTGSQQSYVLEQTMNSLSYTPITKQSMRHALFGGSITDAMDHNLFKIVI 396 (1447)
Q Consensus 342 ~v~aLLDSGS~~S~Ise~la~~L~L~~~~~~~l~i~~~gg~~~~~~~~~~v~l~I 396 (1447)
.+++|+||||++|+|.....+.. . .+.++.+.++||+....-+...+.+.+
T Consensus 9 ~~~fLVDTGA~vSviP~~~~~~~---~-~~~~~~l~AANgt~I~tyG~~~l~ldl 59 (89)
T cd06094 9 GLRFLVDTGAAVSVLPASSTKKS---L-KPSPLTLQAANGTPIATYGTRSLTLDL 59 (89)
T ss_pred CcEEEEeCCCceEeecccccccc---c-cCCceEEEeCCCCeEeeeeeEEEEEEc
Confidence 37999999999999998888753 2 566789999999887777653444444
No 50
>cd01651 RT_G2_intron RT_G2_intron: Reverse transcriptases (RTs) with group II intron origin. RT transcribes DNA using RNA as template. Proteins in this subfamily are found in bacterial and mitochondrial group II introns. Their most probable ancestor was a retrotransposable element with both gag-like and pol-like genes. This subfamily of proteins appears to have captured the RT sequences from transposable elements, which lack long terminal repeats (LTRs).
Probab=95.99 E-value=0.011 Score=66.31 Aligned_cols=100 Identities=17% Similarity=0.135 Sum_probs=70.5
Q ss_pred EeeecCccCChHHHHHHHHHHHHHhhcccCCCCCCCchhHHHhhhccccccccccccCcHHHHHHHHHHHHHHHHhcCCc
Q psy9877 731 CRVVFGVSSSPFLLESCLKLHLELTLTDCREGKSSWPIHLVELLKDSFYVDNCLVSTDSQAEAEQFIQVASSIMKEKGFD 810 (1447)
Q Consensus 731 ~~~pfGl~~sP~~~~~~l~~~l~~~~~~~~~~~~~~p~~~~~~~~~~~YvDDil~~~~s~~e~~~~~~~~~~~l~~~g~~ 810 (1447)
.-+|.|...||.++.-+|..+............ .........+....|+||+++.+++.+++.+..+.+.+.++..|+.
T Consensus 125 ~GlpqG~~lSp~L~~~~l~~ld~~l~~~~~~~~-~~~~~~~~~~~~~rY~DD~~i~~~~~~~~~~~~~~i~~~~~~~gl~ 203 (226)
T cd01651 125 KGTPQGGVISPLLANIYLHELDKFVEEKLKEYY-DTSDPKFRRLRYVRYADDFVIGVRGPKEAEEIKELIREFLEELGLE 203 (226)
T ss_pred CCcCCCccHHHHHHHHHHHHHHHHHHHhhhhcc-cccccccCceEEEEecCceEEecCCHHHHHHHHHHHHHHHHHcCCe
Confidence 458999999999999999887655443210000 0000001224567899999999999999999999999999999999
Q ss_pred cccccccCCCC-CCCcceecee
Q psy9877 811 LRGWELTGDKD-DKPTNVLGLL 831 (1447)
Q Consensus 811 l~k~~snp~~~-~~~~k~LG~~ 831 (1447)
++..++.-... .....|||+.
T Consensus 204 ln~~Kt~i~~~~~~~~~fLG~~ 225 (226)
T cd01651 204 LNPEKTRITHFKSEGFDFLGFT 225 (226)
T ss_pred echhhcceeecCCCCCeeCCeE
Confidence 88555542122 5678999985
No 51
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=94.83 E-value=0.021 Score=59.63 Aligned_cols=61 Identities=31% Similarity=0.795 Sum_probs=35.3
Q ss_pred ccccccc-cccccccCCCCCCHHHHHHHHHhcccccccccccccccccccccc-----ccccccCCC-CcccccC
Q psy9877 236 ESCIFCS-QGHYSGDCQKDMTLSQRQDIIRNKQVCYACLKFGHRVSRCRTKSR-----LKCEKCGSR-HLTILCP 303 (1447)
Q Consensus 236 ~~C~~C~-~~H~~~~C~~~~~~~eR~~~~k~~~lCf~Cl~~GH~~~~C~s~~~-----~~C~~C~~~-Hht~l~~ 303 (1447)
..|..|+ .+|...+||..... .....||+|++.||++++|+.+.+ ..|..|++. |...-|+
T Consensus 53 ~~C~~Cg~~GH~~~~Cp~~~~~-------~~~~~C~~Cg~~GH~~~~C~~~~~~~~~~~~C~~Cg~~gH~~~~C~ 120 (148)
T PTZ00368 53 RSCYNCGKTGHLSRECPEAPPG-------SGPRSCYNCGQTGHISRECPNRAKGGAARRACYNCGGEGHISRDCP 120 (148)
T ss_pred cccCCCCCcCcCcccCCCcccC-------CCCcccCcCCCCCcccccCCCcccccccchhhcccCcCCcchhcCC
Confidence 4577777 66777777651100 023467777777777777766431 247777765 3333454
No 52
>PTZ00368 universal minicircle sequence binding protein (UMSBP); Provisional
Probab=94.72 E-value=0.024 Score=59.16 Aligned_cols=63 Identities=32% Similarity=0.759 Sum_probs=47.1
Q ss_pred Cccccccc-cccccccCCCCCCHHHHHHHHHhccccccccccccccccccccc----cccccccCCC-CcccccCC
Q psy9877 235 GESCIFCS-QGHYSGDCQKDMTLSQRQDIIRNKQVCYACLKFGHRVSRCRTKS----RLKCEKCGSR-HLTILCPR 304 (1447)
Q Consensus 235 ~~~C~~C~-~~H~~~~C~~~~~~~eR~~~~k~~~lCf~Cl~~GH~~~~C~s~~----~~~C~~C~~~-Hht~l~~~ 304 (1447)
...|..|+ .+|.+.+||..... .....||+|+..||++++|+.+. ...|..|++. |...-|.+
T Consensus 27 ~~~C~~Cg~~GH~~~~Cp~~~~~-------~~~~~C~~Cg~~GH~~~~Cp~~~~~~~~~~C~~Cg~~GH~~~~C~~ 95 (148)
T PTZ00368 27 ARPCYKCGEPGHLSRECPSAPGG-------RGERSCYNCGKTGHLSRECPEAPPGSGPRSCYNCGQTGHISRECPN 95 (148)
T ss_pred CccCccCCCCCcCcccCcCCCCC-------CCCcccCCCCCcCcCcccCCCcccCCCCcccCcCCCCCcccccCCC
Confidence 46899999 88999999972111 13567999999999999998753 1269999985 55555654
No 53
>cd05482 HIV_retropepsin_like Retropepsins, pepsin-like aspartate proteases. This is a subfamily of retropepsins. The family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Retroviral aspartyl protease is synthesized as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. In aspartate peptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. This gro
Probab=94.71 E-value=0.058 Score=50.48 Aligned_cols=53 Identities=23% Similarity=0.270 Sum_probs=36.0
Q ss_pred eEeEEEeCCCCcccccHHhHhhcCCCCCCeeEEEEEeeeCccccccceeEEEEEEE
Q psy9877 342 LARLMIDTGSQQSYVLEQTMNSLSYTPITKQSMRHALFGGSITDAMDHNLFKIVIS 397 (1447)
Q Consensus 342 ~v~aLLDSGS~~S~Ise~la~~L~L~~~~~~~l~i~~~gg~~~~~~~~~~v~l~I~ 397 (1447)
.+.|||||||+.|+|++..+.... + ....+..|.|+||........ .+.+++.
T Consensus 9 ~~~~llDTGAd~Tvi~~~~~p~~w-~-~~~~~~~i~GIGG~~~~~~~~-~v~i~i~ 61 (87)
T cd05482 9 LFEGLLDTGADVSIIAENDWPKNW-P-IQPAPSNLTGIGGAITPSQSS-VLLLEID 61 (87)
T ss_pred EEEEEEccCCCCeEEcccccCCCC-c-cCCCCeEEEeccceEEEEEEe-eEEEEEc
Confidence 499999999999999986654321 1 134667899999865444332 4555554
No 54
>PF12382 Peptidase_A2E: Retrotransposon peptidase; InterPro: IPR024648 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. This entry represents a small family of fungal retroviral aspartyl peptidases.
Probab=94.65 E-value=0.093 Score=48.33 Aligned_cols=70 Identities=20% Similarity=0.232 Sum_probs=45.6
Q ss_pred eEeEEEeCCCCcccccHHhHhhcCCCCCCeeEEEEEeeeCc-cccccceeEEEEEEEecCCCceEEEEEeeecCcc
Q psy9877 342 LARLMIDTGSQQSYVLEQTMNSLSYTPITKQSMRHALFGGS-ITDAMDHNLFKIVISNLDNTFACDFDVFGQSKIC 416 (1447)
Q Consensus 342 ~v~aLLDSGS~~S~Ise~la~~L~L~~~~~~~l~i~~~gg~-~~~~~~~~~v~l~I~~~~~~~~~~v~~lvvp~i~ 416 (1447)
.+.||+|+|||+++|+++.+..-+|+. .+-.-++- +||- ....-.. ...+.|.- +...+..+++++....
T Consensus 47 sipclidtgaq~niiteetvrahklpt-rpw~~svi-yggvyp~kinrk-t~kl~i~l--ngisikteflvvkkfs 117 (137)
T PF12382_consen 47 SIPCLIDTGAQVNIITEETVRAHKLPT-RPWSQSVI-YGGVYPNKINRK-TIKLNINL--NGISIKTEFLVVKKFS 117 (137)
T ss_pred cceeEEccCceeeeeehhhhhhccCCC-CcchhheE-eccccccccccc-eEEEEEEe--cceEEEEEEEEEEecc
Confidence 589999999999999999999999987 54333332 2332 2222222 34444432 4456777888887654
No 55
>PF09337 zf-H2C2: His(2)-Cys(2) zinc finger; InterPro: IPR015416 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents an H2C2-type zinc finger that binds to histone upstream activating sequence (UAS) elements found in histone gene promoters. More information about these proteins can be found at Protein of the Month: Zinc Fingers.
Probab=94.56 E-value=0.016 Score=45.22 Aligned_cols=25 Identities=24% Similarity=0.339 Sum_probs=22.1
Q ss_pred hcccCCCchhHHHHHHHheEeecCCC
Q psy9877 1267 HLNNCHAGTQILMSILRQKYWILGET 1292 (1447)
Q Consensus 1267 H~~~~H~g~~~t~~~iR~~yWi~~~r 1292 (1447)
|.. +|.|++.|++.|+++|||++++
T Consensus 1 H~~-~H~Gi~kT~~~i~~~y~W~gm~ 25 (39)
T PF09337_consen 1 HNQ-GHPGINKTTAKISQRYHWPGMK 25 (39)
T ss_pred CCc-cCCCHHHHHHHHHHhheecCHH
Confidence 554 5999999999999999999876
No 56
>KOG3752|consensus
Probab=94.12 E-value=0.18 Score=58.59 Aligned_cols=82 Identities=20% Similarity=0.174 Sum_probs=55.1
Q ss_pred cccchhHhHHHHHHHHHHHHHHHHhccccccceEEEechHHHHHHHhcC------CCcc---------hhhh------hh
Q psy9877 984 TLTIPRLELLAASIATRLCQTVVKDYKLQDVRTTFWTDATTVLAWIRRN------EPWN---------VFVM------NR 1042 (1447)
Q Consensus 984 ~~siprlEL~A~~~a~~~~~~l~~~l~~~~~~~~~~tDs~~~l~~i~~~------~~~~---------~~v~------nr 1042 (1447)
..|-+|.||.|+..|++-+... .+.+++|.|||.-++.|++.- +.|+ .+|. .+
T Consensus 253 ~qtNnrAEl~Av~~ALkka~~~------~~~kv~I~TDS~~~i~~l~~wv~~~k~~~~k~~~~~~~i~~~v~n~~~~~e~ 326 (371)
T KOG3752|consen 253 RQTNNRAELIAAIEALKKARSK------NINKVVIRTDSEYFINSLTLWVQGWKKNGWKTSNGSDRICAYVKNQDFFNEL 326 (371)
T ss_pred cccccHHHHHHHHHHHHHHHhc------CCCcEEEEechHHHHHHHHHHHhhhccCccccccCCCccceeeecchHHHHH
Confidence 6789999999999998755433 355999999999998877642 2221 1222 22
Q ss_pred HHHhhhc-CCCceEEEccCCCC-----CCCCCCCc
Q psy9877 1043 ITEIRNL-SQEHEWRHVPGEMN-----PADLPSRG 1071 (1447)
Q Consensus 1043 v~~I~~~-~~~~~~~hvp~~~N-----pAD~~SRg 1071 (1447)
-..+++. -..++|.||+|... .||.+.|-
T Consensus 327 ~~l~q~~~~~~vq~~~V~Gh~gi~gne~Ad~lARk 361 (371)
T KOG3752|consen 327 DELEQEISNKKVQQEYVGGHSGILGNEMADALARK 361 (371)
T ss_pred HHHHhhhccCceEEEEecCcCCcchHHHHHHHHhh
Confidence 2233332 23799999999763 38888873
No 57
>PF14787 zf-CCHC_5: GAG-polyprotein viral zinc-finger; PDB: 1CL4_A 1DSV_A.
Probab=91.66 E-value=0.083 Score=39.71 Aligned_cols=21 Identities=38% Similarity=0.594 Sum_probs=13.2
Q ss_pred ccccccccccccccccccccc
Q psy9877 266 KQVCYACLKFGHRVSRCRTKS 286 (1447)
Q Consensus 266 ~~lCf~Cl~~GH~~~~C~s~~ 286 (1447)
.++|++|++-.|.+++|+++.
T Consensus 2 ~~~CprC~kg~Hwa~~C~sk~ 22 (36)
T PF14787_consen 2 PGLCPRCGKGFHWASECRSKT 22 (36)
T ss_dssp --C-TTTSSSCS-TTT---TC
T ss_pred CccCcccCCCcchhhhhhhhh
Confidence 478999999999999999986
No 58
>KOG4400|consensus
Probab=91.23 E-value=0.073 Score=61.17 Aligned_cols=67 Identities=27% Similarity=0.623 Sum_probs=42.9
Q ss_pred Cccccccc-cccccccCCCCCCHH------------------HHH--HHHHhccccccccccccccccccc--ccccccc
Q psy9877 235 GESCIFCS-QGHYSGDCQKDMTLS------------------QRQ--DIIRNKQVCYACLKFGHRVSRCRT--KSRLKCE 291 (1447)
Q Consensus 235 ~~~C~~C~-~~H~~~~C~~~~~~~------------------eR~--~~~k~~~lCf~Cl~~GH~~~~C~s--~~~~~C~ 291 (1447)
...|..|+ .+|...+|+...... ++. ..... ..||+|++.||+..+|+. .. .|.
T Consensus 92 ~~~c~~C~~~gH~~~~c~~~~~~~~~~~~~~~c~~~gh~~~~~~~~~~~~~~-~~Cy~Cg~~GH~s~~C~~~~~~--~c~ 168 (261)
T KOG4400|consen 92 AAACFNCGEGGHIERDCPEAGKEGSSETSCYSCGKTGHRGCPDADPVDGPKP-AKCYSCGEQGHISDDCPENKGG--TCF 168 (261)
T ss_pred chhhhhCCCCccchhhCCcccCcccccceeeccCCCccccCcccccccCCCC-CccCCCCcCCcchhhCCCCCCC--ccc
Confidence 56788888 788888888711111 000 11111 458999998999999983 33 788
Q ss_pred ccCCCCcccc-cCC
Q psy9877 292 KCGSRHLTIL-CPR 304 (1447)
Q Consensus 292 ~C~~~Hht~l-~~~ 304 (1447)
.|++.+|-.. |+.
T Consensus 169 ~c~~~~h~~~~C~~ 182 (261)
T KOG4400|consen 169 RCGKVGHGSRDCPS 182 (261)
T ss_pred cCCCcceecccCCc
Confidence 8888766554 443
No 59
>PF13696 zf-CCHC_2: Zinc knuckle
Probab=89.83 E-value=0.13 Score=37.98 Aligned_cols=19 Identities=37% Similarity=0.765 Sum_probs=13.3
Q ss_pred ccccccccccccccccccc
Q psy9877 267 QVCYACLKFGHRVSRCRTK 285 (1447)
Q Consensus 267 ~lCf~Cl~~GH~~~~C~s~ 285 (1447)
-.|+.|.++||+.++||..
T Consensus 9 Y~C~~C~~~GH~i~dCP~~ 27 (32)
T PF13696_consen 9 YVCHRCGQKGHWIQDCPTN 27 (32)
T ss_pred CEeecCCCCCccHhHCCCC
Confidence 3577777777777777664
No 60
>COG3577 Predicted aspartyl protease [General function prediction only]
Probab=88.10 E-value=0.81 Score=49.04 Aligned_cols=48 Identities=19% Similarity=0.279 Sum_probs=40.4
Q ss_pred CCCcceeEeEEEeCCCCcccccHHhHhhcCCCCC-CeeEEEEEeeeCccccc
Q psy9877 336 DGKQDKLARLMIDTGSQQSYVLEQTMNSLSYTPI-TKQSMRHALFGGSITDA 386 (1447)
Q Consensus 336 ~g~~~~~v~aLLDSGS~~S~Ise~la~~L~L~~~-~~~~l~i~~~gg~~~~~ 386 (1447)
||+. +.+|+||||+.-.++++.|++||+..- ..-++.+.|.+|.....
T Consensus 113 NGk~---v~fLVDTGATsVal~~~dA~RlGid~~~l~y~~~v~TANG~~~AA 161 (215)
T COG3577 113 NGKK---VDFLVDTGATSVALNEEDARRLGIDLNSLDYTITVSTANGRARAA 161 (215)
T ss_pred CCEE---EEEEEecCcceeecCHHHHHHhCCCccccCCceEEEccCCccccc
Confidence 6766 999999999999999999999998762 35678899999876543
No 61
>PF13917 zf-CCHC_3: Zinc knuckle
Probab=87.44 E-value=0.18 Score=39.99 Aligned_cols=20 Identities=25% Similarity=0.571 Sum_probs=17.8
Q ss_pred hccccccccccccccccccc
Q psy9877 265 NKQVCYACLKFGHRVSRCRT 284 (1447)
Q Consensus 265 ~~~lCf~Cl~~GH~~~~C~s 284 (1447)
...+|.+|++.||+..+|+.
T Consensus 3 ~~~~CqkC~~~GH~tyeC~~ 22 (42)
T PF13917_consen 3 ARVRCQKCGQKGHWTYECPN 22 (42)
T ss_pred CCCcCcccCCCCcchhhCCC
Confidence 45789999999999999993
No 62
>cd01709 RT_like_1 RT_like_1: A subfamily of reverse transcriptases (RTs). An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs.
Probab=85.15 E-value=3.1 Score=48.85 Aligned_cols=100 Identities=12% Similarity=-0.078 Sum_probs=70.2
Q ss_pred EeeecCccCChHHHHHHHHHHHHHhhcccCCCCCCCchhHHHhhhccccccccccccCcHHHHHHHHHHHHHHHHhcCCc
Q psy9877 731 CRVVFGVSSSPFLLESCLKLHLELTLTDCREGKSSWPIHLVELLKDSFYVDNCLVSTDSQAEAEQFIQVASSIMKEKGFD 810 (1447)
Q Consensus 731 ~~~pfGl~~sP~~~~~~l~~~l~~~~~~~~~~~~~~p~~~~~~~~~~~YvDDil~~~~s~~e~~~~~~~~~~~l~~~g~~ 810 (1447)
+-+|.|-..||.+.+-.|..+-..+...++ .+...-|.||+++.++ .+++.+..+.+.+.++..|+.
T Consensus 82 rGtPqGgviSplLaNiyL~~lD~~v~~~~~------------g~~l~RYaDD~vi~~~-~~~a~~aw~~i~~fl~~lGLe 148 (346)
T cd01709 82 RGTPMSHALSDVFGELVLFCLDFAVNQATD------------GGLLYRLHDDLWFWGQ-PETCAKAWKAIQEFAKVMGLE 148 (346)
T ss_pred CccCCCchhhHHHHHHHHHHHHHHHHhcCC------------CceEEEEcCeEEEEcC-HHHHHHHHHHHHHHHHHcCce
Confidence 468999999999999988843222222211 2345589999999954 689999999999999999999
Q ss_pred ccccccc-----CC-----------CCCCCcceeceeeecCCCeEEEee
Q psy9877 811 LRGWELT-----GD-----------KDDKPTNVLGLLWDKSSDTLAINI 843 (1447)
Q Consensus 811 l~k~~sn-----p~-----------~~~~~~k~LG~~w~~~~d~~~~~~ 843 (1447)
+++-++. +. -+...+++-=+..|+.++.+.++.
T Consensus 149 lN~eKT~iV~~~~~~r~~~~~~~~~LP~g~i~wgfL~Ld~~~G~~~Idq 197 (346)
T cd01709 149 LNKEKTGSVYLSDDTKTRDTTIDATLPEGPVRWGFLKLDPKTGRWEIDQ 197 (346)
T ss_pred eccccceEEEeccCCccCCCcccccCCCCCceeeeEEecCCCCcEEeeH
Confidence 9865443 00 123455665566677777676665
No 63
>KOG0119|consensus
Probab=84.63 E-value=0.43 Score=56.87 Aligned_cols=43 Identities=30% Similarity=0.672 Sum_probs=35.2
Q ss_pred Cccccccc-cccccccCCCCCCHHHHHHHHHhcccccccccccccccccccc
Q psy9877 235 GESCIFCS-QGHYSGDCQKDMTLSQRQDIIRNKQVCYACLKFGHRVSRCRTK 285 (1447)
Q Consensus 235 ~~~C~~C~-~~H~~~~C~~~~~~~eR~~~~k~~~lCf~Cl~~GH~~~~C~s~ 285 (1447)
...|..|+ .+|.-.+|+.+ +-...++|++|+-.||++++|+..
T Consensus 261 ~~~c~~cg~~~H~q~~cp~r--------~~~~~n~c~~cg~~gH~~~dc~~~ 304 (554)
T KOG0119|consen 261 NRACRNCGSTGHKQYDCPGR--------IPNTTNVCKICGPLGHISIDCKVN 304 (554)
T ss_pred cccccccCCCccccccCCcc--------cccccccccccCCcccccccCCCc
Confidence 36899999 89999999972 112234999999999999999886
No 64
>smart00343 ZnF_C2HC zinc finger.
Probab=84.25 E-value=0.43 Score=33.80 Aligned_cols=18 Identities=39% Similarity=0.805 Sum_probs=15.9
Q ss_pred cccccccccccccccccc
Q psy9877 268 VCYACLKFGHRVSRCRTK 285 (1447)
Q Consensus 268 lCf~Cl~~GH~~~~C~s~ 285 (1447)
.|++|++.||++++|+..
T Consensus 1 ~C~~CG~~GH~~~~C~~~ 18 (26)
T smart00343 1 KCYNCGKEGHIARDCPKX 18 (26)
T ss_pred CCccCCCCCcchhhCCcc
Confidence 499999999999999843
No 65
>KOG4400|consensus
Probab=82.75 E-value=0.69 Score=53.21 Aligned_cols=40 Identities=38% Similarity=0.891 Sum_probs=35.6
Q ss_pred ccccccc-cccccccCCCCCCHHHHHHHHHhccccccccccccccccccccc
Q psy9877 236 ESCIFCS-QGHYSGDCQKDMTLSQRQDIIRNKQVCYACLKFGHRVSRCRTKS 286 (1447)
Q Consensus 236 ~~C~~C~-~~H~~~~C~~~~~~~eR~~~~k~~~lCf~Cl~~GH~~~~C~s~~ 286 (1447)
..|+-|+ .+|...+|+.. +.+.||.|.+.||..++|++..
T Consensus 144 ~~Cy~Cg~~GH~s~~C~~~-----------~~~~c~~c~~~~h~~~~C~~~~ 184 (261)
T KOG4400|consen 144 AKCYSCGEQGHISDDCPEN-----------KGGTCFRCGKVGHGSRDCPSKQ 184 (261)
T ss_pred CccCCCCcCCcchhhCCCC-----------CCCccccCCCcceecccCCccc
Confidence 5699999 69999999951 5799999999999999999874
No 66
>PF00098 zf-CCHC: Zinc knuckle; InterPro: IPR001878 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence C-X2-C-X4-H-X4-C, where X can be any amino acid and each number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses, where they are required for viral genome packaging and for the early infection process. They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0003676 nucleic acid binding, 0008270 zinc ion binding; PDB: 2L44_A 1A1T_A 1WWG_A 1U6P_A 1WWD_A 1WWE_A 1A6B_B 1F6U_A 1MFS_A 1NCP_C ....
Probab=79.34 E-value=1.3 Score=28.52 Aligned_cols=16 Identities=44% Similarity=1.091 Sum_probs=14.4
Q ss_pred cccccc-cccccccCCC
Q psy9877 237 SCIFCS-QGHYSGDCQK 252 (1447)
Q Consensus 237 ~C~~C~-~~H~~~~C~~ 252 (1447)
.|..|+ .+|.+.+||+
T Consensus 2 ~C~~C~~~GH~~~~Cp~ 18 (18)
T PF00098_consen 2 KCFNCGEPGHIARDCPK 18 (18)
T ss_dssp BCTTTSCSSSCGCTSSS
T ss_pred cCcCCCCcCcccccCcc
Confidence 699999 8999999984
No 67
>COG5550 Predicted aspartyl protease [Posttranslational modification, protein turnover, chaperones]
Probab=69.44 E-value=21 Score=35.57 Aligned_cols=42 Identities=24% Similarity=0.192 Sum_probs=32.3
Q ss_pred eeEEEEEecC-CCCcceeEeEEEeCCCC-cccccHHhHhhcCCCCC
Q psy9877 326 QTLVVKIRGK-DGKQDKLARLMIDTGSQ-QSYVLEQTMNSLSYTPI 369 (1447)
Q Consensus 326 ~tv~v~v~~~-~g~~~~~v~aLLDSGS~-~S~Ise~la~~L~L~~~ 369 (1447)
.++++....+ +|.. .. .+|+|||.+ -..++.++|++++++..
T Consensus 11 ~~v~~~f~~~~~Gd~-~~-~~LiDTGFtg~lvlp~~vaek~~~~~~ 54 (125)
T COG5550 11 VTVPVTFRLPGQGDF-VY-DELIDTGFTGYLVLPPQVAEKLGLPLF 54 (125)
T ss_pred eeEEEEEEecCCCcE-Ee-eeEEecCCceeEEeCHHHHHhcCCCcc
Confidence 4567777664 3554 33 449999999 88999999999999983
No 68
>PF13696 zf-CCHC_2: Zinc knuckle
Probab=64.37 E-value=3.3 Score=30.89 Aligned_cols=19 Identities=37% Similarity=0.908 Sum_probs=16.9
Q ss_pred CCccccccc-cccccccCCC
Q psy9877 234 KGESCIFCS-QGHYSGDCQK 252 (1447)
Q Consensus 234 ~~~~C~~C~-~~H~~~~C~~ 252 (1447)
..-.|..|+ .+|++.+||.
T Consensus 7 ~~Y~C~~C~~~GH~i~dCP~ 26 (32)
T PF13696_consen 7 PGYVCHRCGQKGHWIQDCPT 26 (32)
T ss_pred CCCEeecCCCCCccHhHCCC
Confidence 567899999 8899999997
No 69
>PF03732 Retrotrans_gag: Retrotransposon gag protein; InterPro: IPR005162 Transposable elements (TEs) promote various chromosomal rearrangements more efficiently, and often more specifically, than other cellular processes. Retrotransposons are structurally similar to retroviruses and are bounded by long terminal repeats. This entry represents eukaryotic Gag or capsid-related retrotransposon-related proteins. There is a central motif QGXXEXXXXXFXXLXXH that is common to Retroviridae gag-proteins, but is poorly conserved.
Probab=64.19 E-value=7.8 Score=36.46 Aligned_cols=85 Identities=13% Similarity=0.116 Sum_probs=61.2
Q ss_pred CCCcHhHhhccCCC-C---CCcHHHHHHHHHHhcCCcchhHHHHHHHHHHHHHhcCCCCChhhHHHHHHHHHHHHHHHHh
Q psy9877 2 KGTPARELVDSYPA-T---GNMYNQVVEALKARFGREDLLTEVYIRELLKLVLANTASHDKLPIVILYDRLQSHLRNLES 77 (1447)
Q Consensus 2 ~~G~A~~~I~~~~~-t---~~nY~~A~~~L~~ryg~~~~i~~~~~~~~~~~L~~~p~~~d~~~Lr~l~d~l~~~i~aL~~ 77 (1447)
++|+|+.-...+.. . ..+|+...+.|.++|+.+.... ....+|.++.. ....+..++..++..+..+..
T Consensus 7 L~g~A~~w~~~~~~~~~~~~~~W~~~~~~~~~~f~~~~~~~-----~~~~~l~~l~Q--~~esv~~y~~rf~~l~~~~~~ 79 (96)
T PF03732_consen 7 LKGPARQWYRNLRPNEIRDFITWEEFKDAFRKRFFPPDRKE-----QARQELNSLRQ--GNESVREYVNRFRELARRAPP 79 (96)
T ss_pred ccCHHHHHHHHhHhcCCCCCCCHHHHHHHHHHHHhhhhccc-----cchhhhhhhhc--cCCcHHHHHHHHHHHHHHCCC
Confidence 78999999998864 2 2389999999999999987666 45566666555 677788898888877766665
Q ss_pred cCCCCcCCCcchHHHHHccCC
Q psy9877 78 LGVSADRCAPILMPLVSSSLP 98 (1447)
Q Consensus 78 lg~~~d~~~~~l~~~i~~KLP 98 (1447)
+.+ +...+..+++-|.
T Consensus 80 -~~~----e~~~v~~f~~GL~ 95 (96)
T PF03732_consen 80 -PMD----EEMLVERFIRGLR 95 (96)
T ss_pred -CcC----HHHHHHHHHHCCC
Confidence 222 2456666665553
No 70
>PF14392 zf-CCHC_4: Zinc knuckle
Probab=61.96 E-value=2.9 Score=34.66 Aligned_cols=17 Identities=29% Similarity=0.956 Sum_probs=12.3
Q ss_pred ccccccccccccccccc
Q psy9877 267 QVCYACLKFGHRVSRCR 283 (1447)
Q Consensus 267 ~lCf~Cl~~GH~~~~C~ 283 (1447)
..|++|+..||...+|+
T Consensus 32 ~~C~~C~~~gH~~~~C~ 48 (49)
T PF14392_consen 32 RFCFHCGRIGHSDKECP 48 (49)
T ss_pred hhhcCCCCcCcCHhHcC
Confidence 45777777777777775
No 71
>cd04714 BAH_BAHCC1 BAH, or Bromo Adjacent Homology domain, as present in mammalian BAHCC1 and similar proteins. BAHCC1 stands for BAH domain and coiled-coil containing 1. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.
Probab=55.84 E-value=15 Score=36.84 Aligned_cols=39 Identities=33% Similarity=0.615 Sum_probs=31.2
Q ss_pred ccccCcEEEeeccCCCCCCccccceeeeecCCCCCeEEEEE
Q psy9877 1341 IIKVGDVVLIGDDNVKRLDWPMGRVINLCPGADGHVRVVKL 1381 (1447)
Q Consensus 1341 ~~~~gd~Vlv~~~~~~~~~W~lgri~~~~~~~Dg~vR~v~v 1381 (1447)
-+++||-|+|+.++.+...| +|||.++..+.||.. .|.+
T Consensus 3 ~~~vGD~V~v~~~~~~~~py-IgrI~~i~e~~~g~~-~~~v 41 (121)
T cd04714 3 IIRVGDCVLFKSPGRPSLPY-VARIESLWEDPEGNM-VVRV 41 (121)
T ss_pred EEEcCCEEEEeCCCCCCCCE-EEEEEEEEEcCCCCE-EEEE
Confidence 46899999999877555566 999999999999876 4444
No 72
>PF13917 zf-CCHC_3: Zinc knuckle
Probab=54.87 E-value=7.4 Score=31.15 Aligned_cols=18 Identities=28% Similarity=0.822 Sum_probs=15.9
Q ss_pred CCccccccc-cccccccCC
Q psy9877 234 KGESCIFCS-QGHYSGDCQ 251 (1447)
Q Consensus 234 ~~~~C~~C~-~~H~~~~C~ 251 (1447)
....|..|+ .+|+.++|+
T Consensus 3 ~~~~CqkC~~~GH~tyeC~ 21 (42)
T PF13917_consen 3 ARVRCQKCGQKGHWTYECP 21 (42)
T ss_pred CCCcCcccCCCCcchhhCC
Confidence 346799999 899999999
No 73
>PF12353 eIF3g: Eukaryotic translation initiation factor 3 subunit G; InterPro: IPR024675 At least eleven different protein factors are involved in initiation of protein synthesis in eukaryotes. Binding of initiator tRNA and mRNA to the 40S subunit requires the presence of the translation initiation factors eIF-2 and eIF-3, with eIF-3 being particularly important for 80S ribosome dissociation and mRNA binding. eIF-3 is the most complex translation initiation factor, consisting of about 13 putative subunits and having a molecular weight of between 550 and 700 kDa in mammalian cells. Subunits are designated eIF-3a - eIF-3m; the large number of subunits means that the interactions between the individual subunits that make up the eIF-3 complex are complex and varied. Subunit G is required for eIF3 integrity. This entry represents a domain of approximately 130 amino acids in length found at the N terminus of eukaryotic translation initiation factor 3 subunit G. This domain is commonly found in association with the RNA recognition domain PF00076 from PFAM.
Probab=54.21 E-value=5.6 Score=40.32 Aligned_cols=19 Identities=21% Similarity=0.697 Sum_probs=17.2
Q ss_pred CCccccccccccccccCCC
Q psy9877 234 KGESCIFCSQGHYSGDCQK 252 (1447)
Q Consensus 234 ~~~~C~~C~~~H~~~~C~~ 252 (1447)
....|.+|+++|+...||-
T Consensus 105 ~~v~CR~CkGdH~T~~CPy 123 (128)
T PF12353_consen 105 SKVKCRICKGDHWTSKCPY 123 (128)
T ss_pred ceEEeCCCCCCcccccCCc
Confidence 5678999999999999995
No 74
>PF14787 zf-CCHC_5: GAG-polyprotein viral zinc-finger; PDB: 1CL4_A 1DSV_A.
Probab=51.54 E-value=6.1 Score=30.14 Aligned_cols=17 Identities=24% Similarity=0.704 Sum_probs=10.3
Q ss_pred ccccccc-cccccccCCC
Q psy9877 236 ESCIFCS-QGHYSGDCQK 252 (1447)
Q Consensus 236 ~~C~~C~-~~H~~~~C~~ 252 (1447)
..|+.|+ +.|++.+|..
T Consensus 3 ~~CprC~kg~Hwa~~C~s 20 (36)
T PF14787_consen 3 GLCPRCGKGFHWASECRS 20 (36)
T ss_dssp -C-TTTSSSCS-TTT---
T ss_pred ccCcccCCCcchhhhhhh
Confidence 4699999 8999999986
No 75
>PF15288 zf-CCHC_6: Zinc knuckle
Probab=47.71 E-value=10 Score=29.87 Aligned_cols=19 Identities=37% Similarity=0.723 Sum_probs=14.9
Q ss_pred ccccccccccccc--cccccc
Q psy9877 267 QVCYACLKFGHRV--SRCRTK 285 (1447)
Q Consensus 267 ~lCf~Cl~~GH~~--~~C~s~ 285 (1447)
..|.+|+..||.+ +.||-+
T Consensus 2 ~kC~~CG~~GH~~t~k~CP~~ 22 (40)
T PF15288_consen 2 VKCKNCGAFGHMRTNKRCPMY 22 (40)
T ss_pred ccccccccccccccCccCCCC
Confidence 4699999999987 556654
No 76
>KOG0109|consensus
Probab=46.67 E-value=8.8 Score=43.23 Aligned_cols=19 Identities=37% Similarity=0.758 Sum_probs=17.7
Q ss_pred ccccccccccccccccccc
Q psy9877 268 VCYACLKFGHRVSRCRTKS 286 (1447)
Q Consensus 268 lCf~Cl~~GH~~~~C~s~~ 286 (1447)
-|+.|++.||+.++||..+
T Consensus 162 ~cyrcGkeghwskEcP~~~ 180 (346)
T KOG0109|consen 162 GCYRCGKEGHWSKECPVDR 180 (346)
T ss_pred HheeccccccccccCCccC
Confidence 4999999999999999876
No 77
>PRK01191 rpl24p 50S ribosomal protein L24P; Validated
Probab=44.95 E-value=28 Score=34.70 Aligned_cols=51 Identities=24% Similarity=0.339 Sum_probs=37.2
Q ss_pred cCccccCcEEEeeccCCCCCCccccceeeeecCCC-CCeEEEEEEeCCc-eeeccc
Q psy9877 1339 EHIIKVGDVVLIGDDNVKRLDWPMGRVINLCPGAD-GHVRVVKLRTSSG-VLTRPI 1392 (1447)
Q Consensus 1339 ~~~~~~gd~Vlv~~~~~~~~~W~lgri~~~~~~~D-g~vR~v~v~~~~~-~~~r~i 1392 (1447)
.-.++.||.|.|-.....- .-|+|+++.+..+ -.|..|.+.++.| +...||
T Consensus 43 ~~~IkkGD~V~VisG~~KG---k~GkV~~V~~~~~~V~VeGvn~~k~~G~~~e~pI 95 (120)
T PRK01191 43 SLPVRKGDTVKVMRGDFKG---EEGKVVEVDLKRGRIYVEGVTVKKADGTEVPRPI 95 (120)
T ss_pred cceEeCCCEEEEeecCCCC---ceEEEEEEEcCCCEEEEeCcEEECCCCeEEEccc
Confidence 5579999999998666443 4599999998866 4566777777777 455555
No 78
>COG5222 Uncharacterized conserved protein, contains RING Zn-finger [General function prediction only]
Probab=43.64 E-value=9.7 Score=42.59 Aligned_cols=20 Identities=40% Similarity=0.906 Sum_probs=16.7
Q ss_pred cccccccccccccccccccc
Q psy9877 267 QVCYACLKFGHRVSRCRTKS 286 (1447)
Q Consensus 267 ~lCf~Cl~~GH~~~~C~s~~ 286 (1447)
-.||+||.+||+..+|+...
T Consensus 177 Y~CyRCGqkgHwIqnCpTN~ 196 (427)
T COG5222 177 YVCYRCGQKGHWIQNCPTNQ 196 (427)
T ss_pred eeEEecCCCCchhhcCCCCC
Confidence 47999999999999998764
No 79
>PF11302 DUF3104: Protein of unknown function (DUF3104); InterPro: IPR021453 This family of proteins with unknown function appears to be restricted to Cyanobacteria.
Probab=42.93 E-value=21 Score=32.26 Aligned_cols=43 Identities=35% Similarity=0.589 Sum_probs=32.4
Q ss_pred ccccCcEEEeeccC----CCCCCccccceeeeecC-CCCC----eEEEEEEe
Q psy9877 1341 IIKVGDVVLIGDDN----VKRLDWPMGRVINLCPG-ADGH----VRVVKLRT 1383 (1447)
Q Consensus 1341 ~~~~gd~Vlv~~~~----~~~~~W~lgri~~~~~~-~Dg~----vR~v~v~~ 1383 (1447)
.++.||.|+|.+++ .....|-||-|+.+.-| .|.. ..++.|-|
T Consensus 5 ~Vk~Gd~ViV~~~~~~~~~~~~dWWmg~Vi~~~ggaR~P~~~tlFQVadVDt 56 (75)
T PF11302_consen 5 SVKPGDTVIVQDEQEVGQKQDKDWWMGQVIHCEGGARDPKVPTLFQVADVDT 56 (75)
T ss_pred ccCCCCEEEEecCccccccCCCCcEEEEEEEEeccccCCCCCceEEEEEccC
Confidence 57899999999888 67789999999988765 3443 33455444
No 80
>KOG4768|consensus
Probab=39.16 E-value=42 Score=41.93 Aligned_cols=165 Identities=16% Similarity=0.125 Sum_probs=100.4
Q ss_pred cccchhHHHhhcccccceeecccccceeeeEeCC---------CCCCccccccc-ccccccceeEEEEecCCCCcEEEEE
Q psy9877 661 LIETIPTSLAKFRINKIGISGDIAKAFLQISVSP---------QDRDCLSMRQP-RIMVSRDCLRFLWQDENGRVITYRH 730 (1447)
Q Consensus 661 ~~~~l~~~l~~~r~~~~~~~~Di~~af~qi~l~~---------~dr~~~~~~~~-~~~~~~~~~~f~w~~~~~~~~~y~~ 730 (1447)
-...++.+.-.|++..|++..||++.|--|+-++ .|.. ++.= -=.+.-||+. ...--.|.|
T Consensus 345 c~tAi~~~~n~f~gcnw~ie~DLkkcfdtIphd~LI~eL~~rIkdk~---fidL~~kll~AGy~t------en~ry~~~~ 415 (796)
T KOG4768|consen 345 CKTAILKTHNLFRGCNWFIEVDLKKCFDTIPHDELIIELQKRIKDKG---FIDLNYKLLRAGYTT------ENARYHVEF 415 (796)
T ss_pred hhHHHHHHHHHhhccceEEechHHHHhccccHHHHHHHHHHHHhhhh---HHHHHHHHHhcCccc------cccceeccc
Confidence 4567888999999999999999999997665321 0111 0000 0000001111 122357888
Q ss_pred EeeecCccCChHHHHHHHHHH---HHH-hhcccCCC---CCCCch--------hHH------------------------
Q psy9877 731 CRVVFGVSSSPFLLESCLKLH---LEL-TLTDCREG---KSSWPI--------HLV------------------------ 771 (1447)
Q Consensus 731 ~~~pfGl~~sP~~~~~~l~~~---l~~-~~~~~~~~---~~~~p~--------~~~------------------------ 771 (1447)
.-.|.|--.||-+.+.+|+.+ +++ +...|..+ +.-.|. +.+
T Consensus 416 lGtpqgsvvspil~nifL~~LDk~Lenk~~nefn~~~~~~~rhs~yr~L~~~iakaKl~s~~sKtirlrd~~qrn~~n~~ 495 (796)
T KOG4768|consen 416 LGTPQGSVVSPILCNIFLRELDKRLENKHRNEFNAGHLRSARHSKYRNLSDSIAKAKLTSGMSKTIRLRDGVQRNETNDT 495 (796)
T ss_pred ccccccccCCchhHHHHHHHHHHHHHHHHHhhhcccCcccccChhhhhHHHHHHHHHHhhhhhhhhhhhhcccccccCCc
Confidence 999999999999998888776 444 33333321 112332 000
Q ss_pred ---HhhhccccccccccccC-cHHHHHHHHHHHHHHHHhcCCccccccccCCCCCCCcceeceeeec
Q psy9877 772 ---ELLKDSFYVDNCLVSTD-SQAEAEQFIQVASSIMKEKGFDLRGWELTGDKDDKPTNVLGLLWDK 834 (1447)
Q Consensus 772 ---~~~~~~~YvDDil~~~~-s~~e~~~~~~~~~~~l~~~g~~l~k~~snp~~~~~~~k~LG~~w~~ 834 (1447)
+.+.-.-|-||++++.. |..+..+.+..+.-.+++.|+....-++--.-..+...+|||....
T Consensus 496 ~gfkr~~yVRyadd~ii~v~GS~nd~K~ilr~In~f~sslGls~n~~kt~it~S~eg~~flg~nis~ 562 (796)
T KOG4768|consen 496 AGFKRLMYVRYADDIIIGVWGSVNDCKQILRDINNFLSSLGLSNNSSKTQITVSREGTHFLGYNIST 562 (796)
T ss_pred cccceeeEEEecCCEEEEEeccHHHHHHHHHHHHHHHHhhCcccCcccceEEeeccceeeeeceecc
Confidence 00112357799999886 8899999999999999999987652221100122348899987543
No 81
>cd04721 BAH_plant_1 BAH, or Bromo Adjacent Homology domain, plant-specific sub-family with unknown function. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.
Probab=32.75 E-value=52 Score=33.51 Aligned_cols=35 Identities=17% Similarity=0.237 Sum_probs=28.5
Q ss_pred cCccccCcEEEeeccCCCCCCccccceeeeecCCCCCe
Q psy9877 1339 EHIIKVGDVVLIGDDNVKRLDWPMGRVINLCPGADGHV 1376 (1447)
Q Consensus 1339 ~~~~~~gd~Vlv~~~~~~~~~W~lgri~~~~~~~Dg~v 1376 (1447)
+..+++||.|+|..++ ...| +|+|.++.-+.||..
T Consensus 5 ~~~i~vGD~V~v~~~~--~~~~-va~Ie~i~ed~~g~~ 39 (130)
T cd04721 5 GVTISVHDFVYVLSEE--EDRY-VAYIEDLYEDKKGSK 39 (130)
T ss_pred CEEEECCCEEEEeCCC--CCcE-EEEEEEEEEcCCCCE
Confidence 4458999999999766 3336 999999999999964
No 82
>PF14893 PNMA: PNMA
Probab=32.70 E-value=1.8e+02 Score=34.57 Aligned_cols=114 Identities=15% Similarity=0.099 Sum_probs=73.4
Q ss_pred CCCcHhHhhccCCC--CCCcHHHHHHHHHHhcCCcchhHHHHHHHHHHHHHhcCCCCChhhHHHHHHHHHHHHHHHHhcC
Q psy9877 2 KGTPARELVDSYPA--TGNMYNQVVEALKARFGREDLLTEVYIRELLKLVLANTASHDKLPIVILYDRLQSHLRNLESLG 79 (1447)
Q Consensus 2 ~~G~A~~~I~~~~~--t~~nY~~A~~~L~~ryg~~~~i~~~~~~~~~~~L~~~p~~~d~~~Lr~l~d~l~~~i~aL~~lg 79 (1447)
+.|.|++.++.+.. .......-++.|+..||++.... +.+.+-+..... ....+..|+..+...+...-..|
T Consensus 212 L~GpA~~~~r~l~~~nP~~t~~~~l~aL~~~Fg~~es~~----~~~~kf~~~~Q~--~~E~ls~yv~RlE~lLqkav~k~ 285 (331)
T PF14893_consen 212 LRGPALDSRRKLQKKNPKQTAQDCLKALGQVFGSSESRE----TLEAKFLNTFQE--PGEKLSAYVKRLESLLQKAVEKG 285 (331)
T ss_pred cccHHHHHHHHHHhcCCCCCHHHHHHHHHHhcCCcccHH----HHHHHHHHhhcc--CCCCHHHHHHHHHHHHHHHHHhc
Confidence 68999999999964 67889999999999999999999 766665555444 44555568888888887765444
Q ss_pred C-CCcCCCc-chHHHHHc-cCCHHHHHHHHhhccccccccCCCCCCcCCHHHHH
Q psy9877 80 V-SADRCAP-ILMPLVSS-SLPQDLLQMWERCSVTQLHATQPSAGCKEYLDSLM 130 (1447)
Q Consensus 80 ~-~~d~~~~-~l~~~i~~-KLP~~l~~~w~~~~~~~~~~~~~~~~~~~t~~~l~ 130 (1447)
. .+..-+. .+-+.+.. .+...++.+.... .. .+..|+|-.++
T Consensus 286 a~~p~~adq~rl~q~l~~a~~~e~lq~klr~~-------~~--~~~~P~fl~l~ 330 (331)
T PF14893_consen 286 AIKPSEADQVRLRQVLSGAVLSESLQDKLRAM-------GW--ESRPPGFLRLL 330 (331)
T ss_pred CCCccccCHHHHHHHHccCCCCHHHHHHHHHh-------hc--cCCCCchhhhc
Confidence 4 3332222 33344433 3444444444333 11 12447776554
No 83
>KOG0119|consensus
Probab=31.68 E-value=28 Score=42.21 Aligned_cols=38 Identities=29% Similarity=0.695 Sum_probs=29.8
Q ss_pred cccccccccccccccccccc---cccccccCCC-CcccccCC
Q psy9877 267 QVCYACLKFGHRVSRCRTKS---RLKCEKCGSR-HLTILCPR 304 (1447)
Q Consensus 267 ~lCf~Cl~~GH~~~~C~s~~---~~~C~~C~~~-Hht~l~~~ 304 (1447)
..|.+|+..||..-+|+.+. ...|..|+.- |.+.-|.-
T Consensus 262 ~~c~~cg~~~H~q~~cp~r~~~~~n~c~~cg~~gH~~~dc~~ 303 (554)
T KOG0119|consen 262 RACRNCGSTGHKQYDCPGRIPNTTNVCKICGPLGHISIDCKV 303 (554)
T ss_pred ccccccCCCccccccCCcccccccccccccCCcccccccCCC
Confidence 58999999999999999871 1279999985 55555654
No 84
>smart00439 BAH Bromo adjacent homology domain.
Probab=29.99 E-value=73 Score=31.41 Aligned_cols=33 Identities=27% Similarity=0.511 Sum_probs=27.1
Q ss_pred cccCcEEEeeccCCCCCCccccceeeeecCCCCC
Q psy9877 1342 IKVGDVVLIGDDNVKRLDWPMGRVINLCPGADGH 1375 (1447)
Q Consensus 1342 ~~~gd~Vlv~~~~~~~~~W~lgri~~~~~~~Dg~ 1375 (1447)
+++||.|+|..+.. ....-.|+|.++..+.+|.
T Consensus 2 ~~vgd~V~v~~~~~-~~~~~i~~I~~i~~~~~~~ 34 (120)
T smart00439 2 IRVGDFVLVEPDDA-DEPYYIGRIEEIFETKKNS 34 (120)
T ss_pred cccCCEEEEeCCCC-CCCCEEEEEEEEEECCCCC
Confidence 67999999998872 2345599999999999886
No 85
>cd04717 BAH_polybromo BAH, or Bromo Adjacent Homology domain, as present in polybromo and yeast RSC1/2. The human polybromo protein (BAF180) is a component of the SWI/SNF chromatin-remodeling complex PBAF. It is thought that polybromo participates in transcriptional regulation. Saccharomyces cerevisiae RSC1 and RSC2 are part of the 15-subunit nucleosome remodeling RSC complex. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.
Probab=29.73 E-value=47 Score=33.25 Aligned_cols=34 Identities=24% Similarity=0.323 Sum_probs=28.8
Q ss_pred ccccCcEEEeeccCCCCCCccccceeeeecCCCCC
Q psy9877 1341 IIKVGDVVLIGDDNVKRLDWPMGRVINLCPGADGH 1375 (1447)
Q Consensus 1341 ~~~~gd~Vlv~~~~~~~~~W~lgri~~~~~~~Dg~ 1375 (1447)
..++||-|+|..++.|-..| +|||.++..+.+|.
T Consensus 3 ~~~vGD~V~v~~~~~~~~~~-i~~I~~i~~~~~g~ 36 (121)
T cd04717 3 QYRVGDCVYVANPEDPSKPI-IFRIERLWKDEDGE 36 (121)
T ss_pred EEECCCEEEEeCCCCCCCCE-EEEEeEEEECCCCC
Confidence 46899999999888655555 99999999998886
No 86
>cd04712 BAH_DCM_I BAH, or Bromo Adjacent Homology domain, as present in DNA (Cytosine-5)-methyltransferases (DCM) 1. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the genome. These effects include transcriptional repression via inhibition of transcription factor binding, the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting, and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability. BAH domains are found in a variety of proteins playing roles in transcriptional silencing and the remodeling of chromatin. It is assumed that in most or all of these instances the BAH domain mediates protein-protein interactions.
Probab=28.72 E-value=53 Score=33.50 Aligned_cols=37 Identities=32% Similarity=0.512 Sum_probs=28.5
Q ss_pred CccccCcEEEeeccCCCCCC-------cc--ccceeeeecCCCCCe
Q psy9877 1340 HIIKVGDVVLIGDDNVKRLD-------WP--MGRVINLCPGADGHV 1376 (1447)
Q Consensus 1340 ~~~~~gd~Vlv~~~~~~~~~-------W~--lgri~~~~~~~Dg~v 1376 (1447)
..+++||+|+|..+...... |+ +++|....-+.||..
T Consensus 4 ~~i~vGD~V~v~~d~~~~~~~~~~~~~~~~~i~~V~~~~e~~~g~~ 49 (130)
T cd04712 4 LTIRVGDVVSVERDDADSTTKWNDDHRWLPLVQFVEYMKKGSDGSK 49 (130)
T ss_pred CEEeCCCEEEEcCCCCCccccccccccccceEEEEEEeeecCCCce
Confidence 46899999999988765433 33 788888888888853
No 87
>PF07039 DUF1325: SGF29 tudor-like domain; InterPro: IPR010750 SAGA-associated factor 29 is involved in transcriptional regulation, probably through association with histone acetyltransferase (HAT) complexes like the TFTC-HAT or STAGA complexes. It also may be involved in MYC-mediated oncogenic transformation. It is a component of the ATAC complex, which is a complex with histone acetyltransferase activity on histones H3 and H4. This entry represents a domain found in yeast and human SAGA-associated factor 29 proteins that is related to the tudor domain.; PDB: 3MP6_A 3MP1_A 3MP8_A 3MET_B 3ME9_A 3MEU_B 3MEA_A 3MEV_B 3LX7_A 3MEW_A.
Probab=27.72 E-value=51 Score=33.61 Aligned_cols=57 Identities=16% Similarity=0.212 Sum_probs=39.3
Q ss_pred ccCcEEEeecc-CCCCCCccccceeeeecCCCCCeEEEEEEeC--CceeecccceeEeccc
Q psy9877 1343 KVGDVVLIGDD-NVKRLDWPMGRVINLCPGADGHVRVVKLRTS--SGVLTRPIQRIYPLEF 1400 (1447)
Q Consensus 1343 ~~gd~Vlv~~~-~~~~~~W~lgri~~~~~~~Dg~vR~v~v~~~--~~~~~r~i~~l~~l~~ 1400 (1447)
|+||.|..+-. .-.-..|=|++|++..... +.-.+-.+-.. .+.++=+.++++|||.
T Consensus 1 q~G~~VAak~~~~~~~~~WIla~Vv~~~~~~-~rYeV~D~d~~~~~~~~~~~~~~iIPLP~ 60 (130)
T PF07039_consen 1 QPGDQVAAKVKQGNEEEEWILAEVVKYNSDG-NRYEVEDPDPEEEKKRYKLSRKQIIPLPK 60 (130)
T ss_dssp -TT-EEEEEECTTTTTCEEEEEEEEEEETTT-TEEEEEETTTCTTTEEEEEEGGGEEEE-S
T ss_pred CCCCEEEEEcCCCCCCCCEEEEEEEEEeCCC-CEEEEecCCCCCCCceEEeCHHHEEECCC
Confidence 67999998764 4456999999999987654 33333333332 3588889999999998
No 88
>KOG3116|consensus
Probab=26.30 E-value=19 Score=36.38 Aligned_cols=19 Identities=37% Similarity=0.864 Sum_probs=12.3
Q ss_pred ccccccccccccccccccc
Q psy9877 268 VCYACLKFGHRVSRCRTKS 286 (1447)
Q Consensus 268 lCf~Cl~~GH~~~~C~s~~ 286 (1447)
+|-.||+.||+.-+|+.++
T Consensus 29 rCQKClq~GHWtYECk~kR 47 (177)
T KOG3116|consen 29 RCQKCLQAGHWTYECKNKR 47 (177)
T ss_pred hHHHHHhhccceeeecCce
Confidence 5666666677666666554
No 89
>KOG2879|consensus
Probab=26.19 E-value=39 Score=38.08 Aligned_cols=45 Identities=24% Similarity=0.600 Sum_probs=32.1
Q ss_pred CCcccccccc----ccccccCCCCCCHHHHHHHHHhcccccccccccccccccccc--ccccccccCCCCcc
Q psy9877 234 KGESCIFCSQ----GHYSGDCQKDMTLSQRQDIIRNKQVCYACLKFGHRVSRCRTK--SRLKCEKCGSRHLT 299 (1447)
Q Consensus 234 ~~~~C~~C~~----~H~~~~C~~~~~~~eR~~~~k~~~lCf~Cl~~GH~~~~C~s~--~~~~C~~C~~~Hht 299 (1447)
....|++|++ +|.+..|. ..-||.|.+. .|-.. . .|..||..-++
T Consensus 238 ~~~~C~~Cg~~PtiP~~~~~C~--------------HiyCY~Ci~t-----s~~~~asf--~Cp~Cg~~~~~ 288 (298)
T KOG2879|consen 238 SDTECPVCGEPPTIPHVIGKCG--------------HIYCYYCIAT-----SRLWDASF--TCPLCGENVEP 288 (298)
T ss_pred CCceeeccCCCCCCCeeecccc--------------ceeehhhhhh-----hhcchhhc--ccCccCCCCcc
Confidence 5678999994 48888896 4789999874 33322 3 78889876553
No 90
>smart00743 Agenet Tudor-like domain present in plant sequences. Domain in plant sequences with possible chromatin-associated functions.
Probab=25.61 E-value=1e+02 Score=26.52 Aligned_cols=55 Identities=22% Similarity=0.145 Sum_probs=39.0
Q ss_pred ccccCcEEEeeccCCCCCCccccceeeeecCCCCCeEEEEEEe--CCceeecccceeEeccc
Q psy9877 1341 IIKVGDVVLIGDDNVKRLDWPMGRVINLCPGADGHVRVVKLRT--SSGVLTRPIQRIYPLEF 1400 (1447)
Q Consensus 1341 ~~~~gd~Vlv~~~~~~~~~W~lgri~~~~~~~Dg~vR~v~v~~--~~~~~~r~i~~l~~l~~ 1400 (1447)
.+++||+|-+..+. .+.|--|+|+++.. ++. -.|.... .....+=+..+|-|++.
T Consensus 2 ~~~~G~~Ve~~~~~--~~~W~~a~V~~~~~--~~~-~~V~~~~~~~~~~e~v~~~~LRp~~~ 58 (61)
T smart00743 2 DFKKGDRVEVFSKE--EDSWWEAVVTKVLG--DGK-YLVRYLTESEPLKETVDWSDLRPHPP 58 (61)
T ss_pred CcCCCCEEEEEECC--CCEEEEEEEEEECC--CCE-EEEEECCCCcccEEEEeHHHcccCCC
Confidence 46899999999765 78999999999976 343 3555554 33355556777777664
No 91
>PTZ00194 60S ribosomal protein L26; Provisional
Probab=22.85 E-value=1.2e+02 Score=31.21 Aligned_cols=51 Identities=16% Similarity=0.175 Sum_probs=36.9
Q ss_pred cCccccCcEEEeeccCCCCCCccccceeeeecCCC-CCeEEEEEEeCCc-eeeccc
Q psy9877 1339 EHIIKVGDVVLIGDDNVKRLDWPMGRVINLCPGAD-GHVRVVKLRTSSG-VLTRPI 1392 (1447)
Q Consensus 1339 ~~~~~~gd~Vlv~~~~~~~~~W~lgri~~~~~~~D-g~vR~v~v~~~~~-~~~r~i 1392 (1447)
+-.++.||.|.|-.....- .-|.|+++.+..+ -.|..|.+....| ...-||
T Consensus 44 s~~IkkGD~V~Vi~Gk~KG---k~GkV~~V~~k~~~ViVEgvn~~Kk~gk~~e~PI 96 (143)
T PTZ00194 44 SMPVRKDDEVMVVRGHHKG---REGKVTAVYRKKWVIHIEKITREKANGEPVQIGI 96 (143)
T ss_pred cceeecCCEEEEecCCCCC---CceEEEEEEcCCCEEEEeCeEEEecCCCEeecCc
Confidence 5579999999998666433 4599999998665 4566788888887 344444
No 92
>KOG0122|consensus
Probab=22.57 E-value=34 Score=37.96 Aligned_cols=19 Identities=21% Similarity=0.658 Sum_probs=16.9
Q ss_pred CCccccccccccccccCCC
Q psy9877 234 KGESCIFCSQGHYSGDCQK 252 (1447)
Q Consensus 234 ~~~~C~~C~~~H~~~~C~~ 252 (1447)
....|.+|+++|+..+||-
T Consensus 118 ~~~~CR~C~gdHwt~~CPy 136 (270)
T KOG0122|consen 118 SIVACRICKGDHWTTNCPY 136 (270)
T ss_pred ceeeeeecCCCeeeecCCc
Confidence 4578999999999999996
No 93
>PF08750 CNP1: CNP1-like family; InterPro: IPR014861 The proteins in this group are likely to be lipoproteins. CNP1 (cryptic neisserial protein) has been expressed in Escherichia coli and shown to be localised to the periplasm.
Probab=22.31 E-value=98 Score=31.92 Aligned_cols=47 Identities=19% Similarity=0.294 Sum_probs=30.8
Q ss_pred cCccccCcEEEeeccCCCCCCccccceeeeecCCCCCeE-EEEEEeCCc
Q psy9877 1339 EHIIKVGDVVLIGDDNVKRLDWPMGRVINLCPGADGHVR-VVKLRTSSG 1386 (1447)
Q Consensus 1339 ~~~~~~gd~Vlv~~~~~~~~~W~lgri~~~~~~~Dg~vR-~v~v~~~~~ 1386 (1447)
.+-|+.+|++-+.-....+++.-+- -..+..|.||+|| ++.+++++|
T Consensus 16 Pp~P~~~~l~~f~v~~~~~~~f~ID-~~Sisvg~DgvVRY~lv~~S~~G 63 (139)
T PF08750_consen 16 PPAPQDANLLPFDVSPTSPLKFFID-PKSISVGPDGVVRYTLVVRSPSG 63 (139)
T ss_pred CCCCCcCCccEEECCCCCCceEEEc-hhheEECCCCcEEEEEEEcCCCC
Confidence 5567888888777644444443331 2356779999999 666666666
No 94
>PF05515 Viral_NABP: Viral nucleic acid binding; InterPro: IPR008891 Members of this family are common to ssRNA positive-strand viruses and are commonly described as nucleic acid binding proteins (NABP).
Probab=22.24 E-value=64 Score=32.22 Aligned_cols=27 Identities=19% Similarity=0.547 Sum_probs=20.9
Q ss_pred HHHHHHHhccccccccccccccccccc
Q psy9877 258 QRQDIIRNKQVCYACLKFGHRVSRCRT 284 (1447)
Q Consensus 258 eR~~~~k~~~lCf~Cl~~GH~~~~C~s 284 (1447)
.-++.++..++||.||+--|--..|+.
T Consensus 54 A~KRRAkR~~~C~~CG~~l~~~~~C~~ 80 (124)
T PF05515_consen 54 AAKRRAKRYNRCFKCGRYLHNNGNCRR 80 (124)
T ss_pred HHHHHHHHhCccccccceeecCCcCCC
Confidence 345668889999999997776677763
No 95
>PRK05886 yajC preprotein translocase subunit YajC; Validated
Probab=20.57 E-value=1.5e+02 Score=29.25 Aligned_cols=35 Identities=20% Similarity=0.350 Sum_probs=26.5
Q ss_pred cCccccCcEEEeeccCCCCCCccccceeeeecCCCCCeEEEEEEeCCc
Q psy9877 1339 EHIIKVGDVVLIGDDNVKRLDWPMGRVINLCPGADGHVRVVKLRTSSG 1386 (1447)
Q Consensus 1339 ~~~~~~gd~Vlv~~~~~~~~~W~lgri~~~~~~~Dg~vR~v~v~~~~~ 1386 (1447)
..++++||-|+...-- .|+|+++. | ..|.|.++.|
T Consensus 36 ~~~Lk~GD~VvT~gGi-------~G~V~~I~---d---~~v~leia~g 70 (109)
T PRK05886 36 HESLQPGDRVHTTSGL-------QATIVGIT---D---DTVDLEIAPG 70 (109)
T ss_pred HHhcCCCCEEEECCCe-------EEEEEEEe---C---CEEEEEECCC
Confidence 6789999999966544 68888874 3 2678888777
No 96
>PF01426 BAH: BAH domain; InterPro: IPR001025 The BAH (bromo-adjacent homology) family contains proteins such as eukaryotic DNA (cytosine-5) methyltransferases IPR001525 from INTERPRO, the origin recognition complex 1 (Orc1) proteins, as well as several proteins involved in transcriptional regulation. The BAH domain appears to act as a protein-protein interaction module specialised in gene silencing, as suggested for example by its interaction within yeast Orc1p with the silent information regulator Sir1p. The BAH module might therefore play an important role by linking DNA methylation, replication and transcriptional regulation.; GO: 0003677 DNA binding; PDB: 4DA4_A 3PT6_B 3AV6_A 3AV5_A 3AV4_A 3PT9_A 3SWR_A 3PTA_A 1M4Z_A 1ZBX_A ....
Probab=20.47 E-value=1.1e+02 Score=30.12 Aligned_cols=33 Identities=27% Similarity=0.379 Sum_probs=29.6
Q ss_pred cccCcEEEeeccCCCCCCccccceeeeecCCCCC
Q psy9877 1342 IKVGDVVLIGDDNVKRLDWPMGRVINLCPGADGH 1375 (1447)
Q Consensus 1342 ~~~gd~Vlv~~~~~~~~~W~lgri~~~~~~~Dg~ 1375 (1447)
+++||.|+|..+. +....-+|+|.++.-+.++.
T Consensus 3 ~~vGD~V~v~~~~-~~~~~~v~~I~~i~~~~~~~ 35 (119)
T PF01426_consen 3 YKVGDFVYVKPDD-PPEPPYVARIEEIWEDKDGN 35 (119)
T ss_dssp EETTSEEEEECTS-TTSEEEEEEEEEEEEETTTS
T ss_pred EeCCCEEEEeCCC-CCCCCEEEEEEEEEcCCCCC
Confidence 5799999999988 67778899999999998887
Done!