Query 020292
Match_columns 328
No_of_seqs 184 out of 444
Neff 5.5
Searched_HMMs 46136
Date Fri Mar 29 08:34:42 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/020292.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/020292hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG2967 Uncharacterized conser 100.0 7.7E-68 1.7E-72 505.1 23.1 257 21-300 26-287 (314)
2 PF01746 tRNA_m1G_MT: tRNA (Gu 100.0 2.4E-45 5.2E-50 328.4 -17.5 165 124-290 1-186 (186)
3 COG2419 Uncharacterized conser 99.9 2E-24 4.4E-29 204.7 12.7 172 113-290 97-274 (336)
4 PF04252 RNA_Me_trans: Predict 94.7 0.14 2.9E-06 47.2 7.7 53 193-246 61-118 (196)
5 COG2428 Uncharacterized conser 93.9 0.16 3.4E-06 46.0 6.1 77 151-234 27-105 (196)
6 PLN03086 PRLI-interacting fact 87.6 11 0.00023 40.2 13.3 37 241-288 176-212 (567)
7 smart00250 PLEC Plectin repeat 82.6 1.2 2.6E-05 30.0 2.4 26 217-242 8-33 (38)
8 PF07946 DUF1682: Protein of u 79.2 14 0.00029 36.3 9.4 56 25-81 251-310 (321)
9 PF00681 Plectin: Plectin repe 71.0 1.5 3.3E-05 30.6 0.3 26 217-242 8-33 (45)
10 KOG3054 Uncharacterized conser 69.2 1.1E+02 0.0025 29.5 13.0 27 116-142 182-209 (299)
11 PF01344 Kelch_1: Kelch motif; 68.3 3 6.6E-05 28.2 1.3 12 213-224 11-22 (47)
12 KOG1029 Endocytic adaptor prot 61.3 81 0.0017 35.2 10.8 13 26-38 322-334 (1118)
13 KOG1029 Endocytic adaptor prot 55.4 1.6E+02 0.0035 33.0 11.8 9 211-219 712-720 (1118)
14 COG5493 Uncharacterized conser 54.7 12 0.00027 34.7 3.1 34 210-244 193-226 (231)
15 PF13964 Kelch_6: Kelch motif 54.0 5.9 0.00013 27.4 0.7 13 213-225 11-23 (50)
16 COG1901 Uncharacterized conser 49.4 32 0.0007 31.8 4.9 123 116-274 65-193 (197)
17 PF07788 DUF1626: Protein of u 47.7 18 0.00038 28.2 2.5 26 212-242 45-70 (70)
18 smart00612 Kelch Kelch domain. 45.6 11 0.00025 24.6 1.1 9 215-223 1-9 (47)
19 PRK11657 dsbG disulfide isomer 45.0 23 0.00051 33.4 3.5 50 194-254 57-112 (251)
20 KOG2967 Uncharacterized conser 43.4 58 0.0013 32.3 6.0 39 22-61 23-61 (314)
21 PF14213 DUF4325: Domain of un 40.9 94 0.002 23.6 5.7 61 106-169 9-69 (74)
22 KOG2769 Putative u4/u6 small n 39.3 1.4E+02 0.003 31.5 8.1 22 25-46 287-308 (522)
23 cd03020 DsbA_DsbC_DsbG DsbA fa 39.3 53 0.0011 29.2 4.7 50 195-254 21-72 (197)
24 PTZ00266 NIMA-related protein 38.3 3.5E+02 0.0077 31.1 11.8 23 200-222 748-772 (1021)
25 KOG2505 Ankyrin repeat protein 37.6 84 0.0018 33.3 6.3 17 20-36 495-511 (591)
26 PF13466 STAS_2: STAS domain 36.5 1.3E+02 0.0027 22.4 5.8 58 107-170 19-76 (80)
27 PF07646 Kelch_2: Kelch motif; 35.7 19 0.00041 24.9 1.0 10 213-222 11-20 (49)
28 PF07939 DUF1685: Protein of u 35.4 27 0.00058 26.8 1.8 37 116-152 10-48 (64)
29 PF02112 PDEase_II: cAMP phosp 35.3 61 0.0013 32.3 4.8 49 116-164 245-302 (335)
30 PRK02135 hypothetical protein; 35.2 1.3E+02 0.0029 27.9 6.7 70 194-275 126-196 (201)
31 PF01740 STAS: STAS domain; I 35.1 1.6E+02 0.0034 23.5 6.5 55 115-176 49-103 (117)
32 PLN02850 aspartate-tRNA ligase 34.6 64 0.0014 34.1 5.1 21 23-43 10-30 (530)
33 PF06258 Mito_fiss_Elm1: Mitoc 34.1 3.6E+02 0.0078 26.3 9.9 148 113-273 145-309 (311)
34 PF04122 CW_binding_2: Putativ 33.5 8.8 0.00019 30.1 -1.1 16 207-222 67-82 (92)
35 PRK12377 putative replication 32.5 3.4E+02 0.0074 25.6 9.2 53 115-168 65-117 (248)
36 PF13019 Telomere_Sde2: Telome 30.2 96 0.0021 27.9 4.7 24 32-55 126-149 (162)
37 KOG1135 mRNA cleavage and poly 30.2 14 0.00031 40.0 -0.6 33 113-145 23-55 (764)
38 PF09554 RE_HaeII: HaeII restr 28.9 70 0.0015 31.4 3.8 95 113-224 166-263 (338)
39 TIGR02192 HtrL_YibB conserved 28.4 29 0.00063 33.6 1.2 38 192-229 163-205 (270)
40 PF00965 TIMP: Tissue inhibito 28.3 27 0.00058 31.4 0.9 48 196-243 65-124 (181)
41 COG1578 Uncharacterized conser 27.2 2.6E+02 0.0056 27.4 7.3 44 196-242 153-205 (285)
42 KOG0068 D-3-phosphoglycerate d 27.0 55 0.0012 33.2 2.9 118 117-241 95-254 (406)
43 COG1366 SpoIIAA Anti-anti-sigm 26.9 3.3E+02 0.0071 22.0 7.2 55 114-175 44-98 (117)
44 PF08496 Peptidase_S49_N: Pept 26.6 4.5E+02 0.0099 23.3 8.6 42 121-163 103-144 (155)
45 PF13056 DUF3918: Protein of u 26.4 48 0.001 23.5 1.7 15 24-38 25-39 (43)
46 PLN00185 60S ribosomal protein 26.4 87 0.0019 32.1 4.2 93 116-211 155-268 (405)
47 PTZ00266 NIMA-related protein 26.1 9.9E+02 0.021 27.7 12.7 8 23-30 427-434 (1021)
48 PF13854 Kelch_5: Kelch motif 25.8 33 0.00071 23.1 0.8 10 214-223 15-24 (42)
49 COG5055 RAD52 Recombination DN 25.2 1E+02 0.0023 30.7 4.3 113 125-243 23-140 (375)
50 PRK13679 hypothetical protein; 24.8 1.1E+02 0.0025 26.6 4.3 77 129-210 10-94 (168)
51 PF13418 Kelch_4: Galactose ox 24.7 39 0.00084 23.0 1.0 15 212-226 11-25 (49)
52 PF13812 PPR_3: Pentatricopept 24.3 82 0.0018 19.0 2.4 19 265-283 3-22 (34)
53 PF11868 DUF3388: Protein of u 23.9 52 0.0011 30.0 1.9 21 208-228 100-120 (192)
54 cd07043 STAS_anti-anti-sigma_f 23.5 3.2E+02 0.0069 20.4 7.0 52 114-171 38-89 (99)
55 PF11208 DUF2992: Protein of u 23.5 2.8E+02 0.0061 24.0 6.3 24 39-62 94-117 (132)
56 PF06117 DUF957: Enterobacteri 22.9 61 0.0013 24.9 1.8 18 258-275 2-19 (65)
57 KOG4819 Uncharacterized conser 22.6 2.2E+02 0.0047 23.7 5.1 8 58-65 52-59 (106)
58 PTZ00428 60S ribosomal protein 22.5 1E+02 0.0023 31.3 3.9 93 116-210 150-261 (381)
59 PRK05339 PEP synthetase regula 22.4 86 0.0019 30.4 3.2 45 207-254 137-181 (269)
60 PF03618 Kinase-PPPase: Kinase 22.0 1E+02 0.0022 29.6 3.6 44 207-253 131-174 (255)
61 PF07358 DUF1482: Protein of u 21.8 1.1E+02 0.0024 23.0 2.9 26 228-253 25-50 (57)
62 KOG2357 Uncharacterized conser 21.5 5.5E+02 0.012 26.7 8.7 52 25-76 359-413 (440)
63 PRK08154 anaerobic benzoate ca 21.4 1.8E+02 0.0039 28.0 5.3 44 124-167 105-148 (309)
64 PRK11346 hypothetical protein; 21.3 41 0.00088 32.8 0.7 90 191-290 167-271 (285)
65 KOG4364 Chromatin assembly fac 20.5 3.4E+02 0.0073 29.9 7.3 123 27-150 301-425 (811)
66 PF01696 Adeno_E1B_55K: Adenov 20.2 72 0.0016 32.5 2.2 35 184-220 55-91 (386)
67 KOG1144 Translation initiation 20.1 1.3E+03 0.028 26.3 11.9 14 216-229 477-490 (1064)
No 1
>KOG2967 consensus Uncharacterized conserved protein [Function unknown]
Probab=100.00 E-value=7.7e-68 Score=505.09 Aligned_cols=257 Identities=42% Similarity=0.681 Sum_probs=209.9
Q ss_pred CCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcccHHHHHHHHHHHHHHHHHHHHhhhHHH
Q 020292 21 SQPSGLSKTAQKKRLKQLRYEARKAEKKAKMKVEKKREGERKRREWEEKLASLSEEERSKLIEERKGQRKERMEKRSEER 100 (328)
Q Consensus 21 ~~~~~lSK~q~Kkl~k~~~we~~k~~rk~~~kekkk~~kerkr~e~~~~la~~~~e~~~~~~~~r~~~~ke~~~~~~~e~ 100 (328)
+.+.+|||+|+|++.+++.|++.++.++++++++...++++ . + +..++..+...+..+...
T Consensus 26 p~~e~~sk~q~k~~~k~~~wee~~~~~~~~rr~~er~~k~~-~-k------------~~~~i~~g~~~r~~~~~~----- 86 (314)
T KOG2967|consen 26 PVAEPMSKKQLKRQKKQAEWEELKKKKKERRREKERLRKKQ-E-K------------RNELIELGLEVRLRRIRM----- 86 (314)
T ss_pred cccchhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhH-H-H------------HHhhcccCchhhHHHHHH-----
Confidence 34557999999999999999999999877444443333322 1 1 111222111112111100
Q ss_pred HHHHHHHHHhhhcCCcEEEecCCCcccCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCcchHHHH---hcCCCCCcc
Q 020292 101 EHKIQRLTKAKENGQNIIIDLEFSHLMSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKGDMESQL---QRLPGFDKW 177 (328)
Q Consensus 101 ~~~~~rl~~a~~~~~~IvIDcsf~~lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~~l~~~l---~~~~g~~~w 177 (328)
...++..++.+||+|||||+|+++|+.+|+.+|++||++||++||++..||+|+||||.+++.... ....||.+|
T Consensus 87 --~~r~~~~~~~s~~rivlD~sfd~lM~~kei~~l~~Qi~~~y~~Nr~a~~Pf~l~~~n~~~~~~~~~~l~~~~~~~~n~ 164 (314)
T KOG2967|consen 87 --EKRILAKAMDSGPRIVLDCSFDELMNEKEIVNLVNQIQRCYSENRRAKHPFHLHFTNFQGDIFKRQDLVKKNPGYENW 164 (314)
T ss_pred --HHHhHHhhhccCCeEEEeccHHHHHhHHHHHHHHHHHHHHhhhcccCCCCeEEEEecCCcchHHHHHHHhcCCCcccc
Confidence 023455788999999999999999999999999999999999999999999999999999876554 455678888
Q ss_pred cee-cccchhhhh-hhccCCcEEEecCCCccccCcCCCCceEEEeeeecCCCChhhHHHHHHHcCCcccccccccccccC
Q 020292 178 IIE-KENRSYIEA-LEDHKENLVYLTADSDTVLDDLDPNKIYIVGGLVDRNRWKGITMKKAQEQGLQTGKLPIGNYLKMS 255 (328)
Q Consensus 178 ~v~-~~~~~~~e~-f~~~~~~iVYLSpDS~~~L~eld~~~vYIIGGiVDrnr~Kglt~~kA~~~GI~taRLPI~eyi~l~ 255 (328)
..+ +.+.++... |. +++||||||||+++|++|+|++|||||||||+|++||+|+.+|+++||+||||||++||+|.
T Consensus 165 ~~~~~~~~~~s~~~~k--kenlVYLT~ds~~vL~dldp~~vYiIGglVD~n~~k~l~~~kA~~~gi~tarLPi~eyi~~~ 242 (314)
T KOG2967|consen 165 KRIILYPECTSLLPFK--KENLVYLTADSENVLEDLDPDKVYIIGGLVDKNRQKGLTLSKAQEYGIRTARLPLDEYIKME 242 (314)
T ss_pred eeeeccccccccccCC--ccceEEECCCCccchhccCCCcEEEEEEEecCCCCcchhHHHHHHcCCCcccCchHHhhhcC
Confidence 754 344555554 54 89999999999999999999999999999999999999999999999999999999999999
Q ss_pred CCcccchHHHHHHHHHhhcCCCHHHHHHhcCCCCccccCCCCCCC
Q 020292 256 SSQVLTVNQVLEILLKFLETRDWEASFFQVIPQRKRCEADSGEPQ 300 (328)
Q Consensus 256 ~~~VLtiN~V~eILl~~~e~~DW~~A~~~vIP~RK~~~~~~~~~~ 300 (328)
+++|||+||||+||+.|+++|||.+||+++||+||+..+.+....
T Consensus 243 s~~vLt~nqv~~Il~~~~~~~dW~~Al~~~ip~RKg~~~~~~~~~ 287 (314)
T KOG2967|consen 243 SRKVLTLNQVFEILLKYLETGDWKEALLSNIPKRKGAGLKSQESS 287 (314)
T ss_pred CCceeeHHHHHHHHHHHHhcCCHHHHHHHhCccccccCccchhhh
Confidence 999999999999999999999999999999999999998655543
No 2
>PF01746 tRNA_m1G_MT: tRNA (Guanine-1)-methyltransferase; InterPro: IPR016009 In transfer RNA many different modified nucleosides are found, especially in the anticodon region. tRNA (guanine-N1-)-methyltransferase (EC 2.1.1.31) is one of several nucleases operating together with the tRNA-modifying enzymes before the formation of the mature tRNA. It catalyses the reaction: S-adenosyl-L-methionine + tRNA -> S-adenosyl-L-homocysteine + tRNA containing N1-methylguanine, methylating guanosine (G) to N1-methylguanine (1-methylguanosine (m1G)) at position 37 of tRNAs that read CUN (leucine), CCN (proline), and CGG (arginine) codons. The presence of m1G improves the cellular growth rate and the polypeptide step time and also prevents the tRNA from shifting the reading frame []. The mechanism of the trmD3-induced frameshift involving mutant tRNA(Pro) and tRNA(Leu) species has been investigated []. It has been suggested that the conformation of the anticodon loop may be a major determining element for the formation of m1G37 in vivo [].; PDB: 3IEF_A 3KY7_A 1UAL_A 3AXZ_A 1UAM_A 1UAK_A 1UAJ_A 1OY5_B 3KNU_B 3QUV_A ....
Probab=100.00 E-value=2.4e-45 Score=328.37 Aligned_cols=165 Identities=41% Similarity=0.701 Sum_probs=135.1
Q ss_pred CcccCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCcchHHHH--hcC---CCCCccceecc-----cchhhhhhhcc
Q 020292 124 SHLMSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKGDMESQL--QRL---PGFDKWIIEKE-----NRSYIEALEDH 193 (328)
Q Consensus 124 ~~lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~~l~~~l--~~~---~g~~~w~v~~~-----~~~~~e~f~~~ 193 (328)
|++|+++|+.||++|+.+|||.||++.+|+++++||+++.+...+ ... .|+..|.+... ...+.++|+
T Consensus 1 ~~~m~~ke~~sl~~Q~~~~y~~nr~~~~p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~l~~-- 78 (186)
T PF01746_consen 1 DDLMTEKEIKSLAKQLLRCYSANRRAKVPDQLYGTGFGMVLKPEPEFRALDSVNGWRVILLSPQGKPFTQKSAEELFP-- 78 (186)
T ss_dssp THHHHHTTSEEEEEEEGGGGSTSTTG-SEE-BTTS-SS-EE-HHHHHHHHHHHHTSEEEEE-TTSEE--HHHHHHHTT--
T ss_pred CcChhhcCchhhhhhhHHHhhcCCCCCCCCcEEEeCcccccchhhhhHHHHHHhhhccccccccCcchhhHHHHhhcc--
Confidence 578999999999999999999999999999999999999876432 222 27888877654 567788887
Q ss_pred CCcEEEecCCCccccCcCCCCc--eEEEeeeecCCCChhh----HHHHHHHcCCc-ccccccccccccCC--CcccchHH
Q 020292 194 KENLVYLTADSDTVLDDLDPNK--IYIVGGLVDRNRWKGI----TMKKAQEQGLQ-TGKLPIGNYLKMSS--SQVLTVNQ 264 (328)
Q Consensus 194 ~~~iVYLSpDS~~~L~eld~~~--vYIIGGiVDrnr~Kgl----t~~kA~~~GI~-taRLPI~eyi~l~~--~~VLtiN~ 264 (328)
+++||||||||+++|+++++++ +||||||||+|+++|+ |+.+|...||. |+||||++|+.+.+ ++|||+||
T Consensus 79 ~~~lVyLs~d~e~~le~~~~~~~~vyiIGgiVD~~~~k~~~~~~~~~~a~~~Gi~~~~rLpi~~~l~~~~~~~k~l~in~ 158 (186)
T PF01746_consen 79 KENLVYLSPDSEGVLERVDPDKDDVYIIGGIVDRNGEKGAMTIADLEKALEPGIRNTARLPIDSFLLEYPHYTKVLTINQ 158 (186)
T ss_dssp SSEEEEEE-BTT-BBHHHHHHHTSEEESSSS--SCSHHHHHHHHHHHHTTSTTTSGGGG-SSSCCS----B-BSSSEETE
T ss_pred CCCEEEEECCccccccccccccceEEEEccEEccCCcccchhhhhHHHHHccCCCccccCcccccccCCCCCCcCceECc
Confidence 8999999999999999999999 9999999999999999 99999999999 99999999999987 89999999
Q ss_pred HHHHHHHhhc--CCCHHHHHHhcCCCCc
Q 020292 265 VLEILLKFLE--TRDWEASFFQVIPQRK 290 (328)
Q Consensus 265 V~eILl~~~e--~~DW~~A~~~vIP~RK 290 (328)
||+||+.+.+ ++||++||.++||+||
T Consensus 159 V~eILl~~~~~~~~~W~~A~~~~ip~Rk 186 (186)
T PF01746_consen 159 VPEILLSGNHKKIGDWKEALEKTIPKRK 186 (186)
T ss_dssp --GGGGSTSHHHHHHHHHHHHHHHHHHH
T ss_pred hHHHHHhhchhhhchHHHHHHHHhccCC
Confidence 9999999999 9999999999999997
No 3
>COG2419 Uncharacterized conserved protein [Function unknown]
Probab=99.91 E-value=2e-24 Score=204.72 Aligned_cols=172 Identities=25% Similarity=0.403 Sum_probs=139.9
Q ss_pred cCCcEEEecCCCcccCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCcchHHHHhcCCCCCccceecccchhhhhhhc
Q 020292 113 NGQNIIIDLEFSHLMSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKGDMESQLQRLPGFDKWIIEKENRSYIEALED 192 (328)
Q Consensus 113 ~~~~IvIDcsf~~lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~~l~~~l~~~~g~~~w~v~~~~~~~~e~f~~ 192 (328)
..|.|||||+|++.++++|..+|..||.-+||..|++.||-||.+|+.++...++| +++ +..-...-....++...+
T Consensus 97 ~~P~fVVDl~lwdeH~~~Ek~kl~lQi~vTl~~vR~ylwD~~L~ltn~~~e~~e~~-~l~-~~k~~~~~~~~e~l~~~~- 173 (336)
T COG2419 97 AYPYFVVDLRLWDEHSDKEKSKLELQIAVTLGTVRRYLWDRHLVLTNTNPEAKERL-KLP-LEKIGYEGKTEEFLKEIG- 173 (336)
T ss_pred CCCEEEEEeechhhcChHhHHHHHHHHHHHHHHHHHHhcCCceEEEecChhhhhhe-ecc-cccccccCCHHHHHhhcC-
Confidence 35889999999999999999999999999999999999999999999999988887 332 111000001122333333
Q ss_pred cCCcEEEecCCCccccCc--CCCCceEEEeeeecCC-CChhhHHHHHHHcCCcccccccccccccCCCcc---cchHHHH
Q 020292 193 HKENLVYLTADSDTVLDD--LDPNKIYIVGGLVDRN-RWKGITMKKAQEQGLQTGKLPIGNYLKMSSSQV---LTVNQVL 266 (328)
Q Consensus 193 ~~~~iVYLSpDS~~~L~e--ld~~~vYIIGGiVDrn-r~Kglt~~kA~~~GI~taRLPI~eyi~l~~~~V---LtiN~V~ 266 (328)
.+++|.|.|++++.|++ ...+++||||||||++ ..||.|...|..+|..- .+|-.+ |.+.|+.| -.|||++
T Consensus 174 -idrvVlLdPnade~lse~~~r~~~~FIIGGIVDk~~~~kg~Tari~~~l~~~g-ev~rrk-I~LRGsvvGVPDRIN~I~ 250 (336)
T COG2419 174 -IDRVVLLDPNADELLSEEEIRGAKAFIIGGIVDKKGTKKGATARIGEVLGREG-EVPRRK-IELRGSVVGVPDRINHIV 250 (336)
T ss_pred -cCceEEeCCCCccccchhhhccCceEEEeeeEcCCCCccchHHHhhhhcCCcc-ccceeE-EEEecCccCCchHHHHHH
Confidence 78999999999999975 7788999999999999 58999999999988774 444322 57887644 8999999
Q ss_pred HHHHHhhcCCCHHHHHHhcCCCCc
Q 020292 267 EILLKFLETRDWEASFFQVIPQRK 290 (328)
Q Consensus 267 eILl~~~e~~DW~~A~~~vIP~RK 290 (328)
|||++++++++-++|+.++.++|.
T Consensus 251 EILlR~~~Ge~lEkAIlavQap~~ 274 (336)
T COG2419 251 EILLRMLYGEPLEKAILAVQAPRD 274 (336)
T ss_pred HHHHHHHcCccHHHHHHHhcCcHH
Confidence 999999999999999999966653
No 4
>PF04252 RNA_Me_trans: Predicted SAM-dependent RNA methyltransferase; InterPro: IPR007364 Proteins in this family are predicted to be alpha/beta-knot SAM-dependent RNA methyltransferases [].
Probab=94.69 E-value=0.14 Score=47.21 Aligned_cols=53 Identities=19% Similarity=0.396 Sum_probs=38.5
Q ss_pred cCCcEEEecCCCccccC--cCCCCceEEEeeee-cCCCChhhHHHHHHH--cCCccccc
Q 020292 193 HKENLVYLTADSDTVLD--DLDPNKIYIVGGLV-DRNRWKGITMKKAQE--QGLQTGKL 246 (328)
Q Consensus 193 ~~~~iVYLSpDS~~~L~--eld~~~vYIIGGiV-Drnr~Kglt~~kA~~--~GI~taRL 246 (328)
.+++++.|.|.|+..|+ +.+..+++|+|||- ||-+ ++-|..--.. .|+...||
T Consensus 61 ~~~~VcLLDP~A~~~L~PeD~~~fd~fvfGGILGD~PP-rdRT~eLr~~~~~g~~~R~L 118 (196)
T PF04252_consen 61 DKSRVCLLDPAAEKELSPEDGEKFDYFVFGGILGDHPP-RDRTSELRTKKPKGFEGRRL 118 (196)
T ss_pred CcCCEEEeCCCCCCCCCccccCcccEEEECcccCCCCC-CCchHHHHhhhccCcccccc
Confidence 37899999999999997 57788999999998 3333 3444433333 47776666
No 5
>COG2428 Uncharacterized conserved protein [Function unknown]
Probab=93.88 E-value=0.16 Score=46.01 Aligned_cols=77 Identities=19% Similarity=0.318 Sum_probs=52.9
Q ss_pred CCceEEEecCCcchHHHHhcCCCCCccceecccchhhhhhhccCCcEEEecCCCccccC--cCCCCceEEEeeeecCCCC
Q 020292 151 SPAHLWLTGCKGDMESQLQRLPGFDKWIIEKENRSYIEALEDHKENLVYLTADSDTVLD--DLDPNKIYIVGGLVDRNRW 228 (328)
Q Consensus 151 ~P~~L~lt~~~~~l~~~l~~~~g~~~w~v~~~~~~~~e~f~~~~~~iVYLSpDS~~~L~--eld~~~vYIIGGiVDrnr~ 228 (328)
+-..+.|||..+...+.|.++ |+.- | +.++.. ++-....+|.|.++|+..|+ +.++++++|+|||.---.-
T Consensus 27 ~G~~~~vtna~pe~p~vlak~-g~~~--i---~e~~~~-~~l~r~rvivLDl~a~~~L~PEdas~~~~ivvGGIlGD~pp 99 (196)
T COG2428 27 WGDEFIVTNAKPEEPEVLAKI-GLSG--I---PESITR-LPLDRSRVIVLDLQAEEELKPEDASEDTYIVVGGILGDHPP 99 (196)
T ss_pred hchheeeecCCcchhHHHHHh-cccc--C---chhHhh-cccCCCCEEEecCCCCCCCCcccCCcccEEEECccccCCCC
Confidence 345688999998777777765 3221 1 122222 22236889999999999997 5778899999999855555
Q ss_pred hhhHHH
Q 020292 229 KGITMK 234 (328)
Q Consensus 229 Kglt~~ 234 (328)
+|-|+.
T Consensus 100 rgRT~~ 105 (196)
T COG2428 100 RGRTKE 105 (196)
T ss_pred CCcchh
Confidence 666653
No 6
>PLN03086 PRLI-interacting factor K; Provisional
Probab=87.64 E-value=11 Score=40.15 Aligned_cols=37 Identities=24% Similarity=0.289 Sum_probs=26.7
Q ss_pred CcccccccccccccCCCcccchHHHHHHHHHhhcCCCHHHHHHhcCCC
Q 020292 241 LQTGKLPIGNYLKMSSSQVLTVNQVLEILLKFLETRDWEASFFQVIPQ 288 (328)
Q Consensus 241 I~taRLPI~eyi~l~~~~VLtiN~V~eILl~~~e~~DW~~A~~~vIP~ 288 (328)
|..+.||-+.|+++. +++++ |++-.||+.-|+..+..
T Consensus 176 v~~~~Lpkgt~vklq---P~~~~--------f~di~npKavLE~~Lr~ 212 (567)
T PLN03086 176 VRYIWLPKGTYAKLQ---PDGVG--------FSDLPNHKAVLETALRQ 212 (567)
T ss_pred EEEeecCCCCEEEEe---eccCC--------cCCcccHHHHHHHHhhc
Confidence 566799999999986 23333 55667888888887743
No 7
>smart00250 PLEC Plectin repeat.
Probab=82.55 E-value=1.2 Score=29.99 Aligned_cols=26 Identities=23% Similarity=0.515 Sum_probs=23.2
Q ss_pred EEEeeeecCCCChhhHHHHHHHcCCc
Q 020292 217 YIVGGLVDRNRWKGITMKKAQEQGLQ 242 (328)
Q Consensus 217 YIIGGiVDrnr~Kglt~~kA~~~GI~ 242 (328)
.-+|||+|......+|+..|...|+-
T Consensus 8 ~~~~Giidp~t~~~lsv~eA~~~gli 33 (38)
T smart00250 8 SAIGGIIDPETGQKLSVEEALRRGLI 33 (38)
T ss_pred hheeEEEcCCCCCCcCHHHHHHcCCC
Confidence 36899999999999999999999874
No 8
>PF07946 DUF1682: Protein of unknown function (DUF1682); InterPro: IPR012879 The members of this family are all hypothetical eukaryotic proteins of unknown function. One member (Q920S6 from SWISSPROT) is described as being an adipocyte-specific protein, but no evidence of this was found.
Probab=79.16 E-value=14 Score=36.30 Aligned_cols=56 Identities=39% Similarity=0.444 Sum_probs=28.9
Q ss_pred CCCHHHHHHHHHHHHHHHHHHHHHHH---HHHH-HHHHHHHHHHHHHHHHhcccHHHHHHH
Q 020292 25 GLSKTAQKKRLKQLRYEARKAEKKAK---MKVE-KKREGERKRREWEEKLASLSEEERSKL 81 (328)
Q Consensus 25 ~lSK~q~Kkl~k~~~we~~k~~rk~~---~kek-kk~~kerkr~e~~~~la~~~~e~~~~~ 81 (328)
.||+..++|..+. |-+...+..|+. +.|. ..++.+++|++.++.++.||+++...+
T Consensus 251 ~l~~e~~~K~~k~-R~~~~~~~~K~~~~~r~E~~~~~k~e~kr~e~~~~~~~lspeeQrK~ 310 (321)
T PF07946_consen 251 KLSPEAKKKAKKN-REEEEEKILKEAHQERQEEAQEKKEEKKREERERKLSKLSPEEQRKY 310 (321)
T ss_pred eeCHHHHHHHHHH-HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCHHHHHHH
Confidence 5777777776553 222222222222 2222 223334445555667889999876544
No 9
>PF00681 Plectin: Plectin repeat; InterPro: IPR001101 Plectin may have a role in cross-linking intermediate filaments, in inter-linking intermediate filaments with microtubules and microfilaments and in anchoring intermediate filaments to the plasma and nuclear membranes. Plectin is recruited into hemidesmosomes, multiprotein complexes that facilitate adhesion of epithelia to the basement membrane, thereby providing linkage between the intracellular keratin filaments to the laminins of the extracellular matrix. Plectin binds to hemidesmosomes through association of its actin-binding domain with the first pair of fibronectin type III repeats and a small part of the connecting segment of the integrin-beta4 subunit, the latter (integrin-alpha6,beta4) acting as a receptor for the extracellular matrix component laminin-5. The plectin repeat is also seen in the cell adhesion junction plaque proteins, desmoplakin, envoplakin, and bullous pemphigoid antigen. The domains in plakins show considerable sequence homology. The N terminus consists of a plakin domain containing a number of subdomains with high alpha-helical content, while the central coiled-coil domain is composed of heptad repeats involved in the dimerisation of plakin, and the C terminus contains one or more homologous repeat sequences referred to as plectin repeats []. This entry represents the plectin repeats found in the C terminus of plakin proteins.; GO: 0005856 cytoskeleton; PDB: 1LM7_A 1LM5_A.
Probab=71.02 E-value=1.5 Score=30.65 Aligned_cols=26 Identities=23% Similarity=0.510 Sum_probs=22.7
Q ss_pred EEEeeeecCCCChhhHHHHHHHcCCc
Q 020292 217 YIVGGLVDRNRWKGITMKKAQEQGLQ 242 (328)
Q Consensus 217 YIIGGiVDrnr~Kglt~~kA~~~GI~ 242 (328)
+++|||||-....-++...|...|+=
T Consensus 8 ~~~gGiidp~tg~~lsv~~A~~~glI 33 (45)
T PF00681_consen 8 LATGGIIDPETGERLSVEEAIQRGLI 33 (45)
T ss_dssp HTTTSEEETTTTEEEEHHHHHHTTSS
T ss_pred eeeeeEEeCCCCeEEcHHHHHHCCCc
Confidence 35799999999999999999999863
No 10
>KOG3054 consensus Uncharacterized conserved protein [Function unknown]
Probab=69.20 E-value=1.1e+02 Score=29.50 Aligned_cols=27 Identities=7% Similarity=0.095 Sum_probs=17.4
Q ss_pred cEEEe-cCCCcccCHHHHHHHHHHHHHH
Q 020292 116 NIIID-LEFSHLMSRAEIQSLVQQIMYC 142 (328)
Q Consensus 116 ~IvID-csf~~lM~~kEi~sL~~Qi~~~ 142 (328)
.++|+ -||+..|.+.--+-|+--|.|+
T Consensus 182 aFsVeeEGtee~~~eeqdnll~eFv~YI 209 (299)
T KOG3054|consen 182 AFSVEEEGTEEVQGEEQDNLLSEFVEYI 209 (299)
T ss_pred heeeccccccccccchHHHHHHHHHHHH
Confidence 34444 5788888887665555555554
No 11
>PF01344 Kelch_1: Kelch motif; InterPro: IPR006652 Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified []. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP [] and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin [, ], and in galactose oxidase from the fungus Dactylium dendroides [, ]. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold []. The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila []. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase []. This entry represents a type of kelch sequence motif that comprises one beta-sheet blade.; GO: 0005515 protein binding; PDB: 2XN4_A 2WOZ_A 3II7_A 4ASC_A 1U6D_X 1ZGK_A 2FLU_X 2VPJ_A 2DYH_A 1X2R_A ....
Probab=68.35 E-value=3 Score=28.19 Aligned_cols=12 Identities=50% Similarity=0.938 Sum_probs=10.5
Q ss_pred CCceEEEeeeec
Q 020292 213 PNKIYIVGGLVD 224 (328)
Q Consensus 213 ~~~vYIIGGiVD 224 (328)
.+++|||||.-.
T Consensus 11 ~~~iyv~GG~~~ 22 (47)
T PF01344_consen 11 GNKIYVIGGYDG 22 (47)
T ss_dssp TTEEEEEEEBES
T ss_pred CCEEEEEeeecc
Confidence 578999999977
No 12
>KOG1029 consensus Endocytic adaptor protein intersectin [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=61.34 E-value=81 Score=35.15 Aligned_cols=13 Identities=8% Similarity=0.049 Sum_probs=6.1
Q ss_pred CCHHHHHHHHHHH
Q 020292 26 LSKTAQKKRLKQL 38 (328)
Q Consensus 26 lSK~q~Kkl~k~~ 38 (328)
++|-|+---+|++
T Consensus 322 y~kGqaELerRRq 334 (1118)
T KOG1029|consen 322 YEKGQAELERRRQ 334 (1118)
T ss_pred HhhhhHHHHHHHH
Confidence 4555554444443
No 13
>KOG1029 consensus Endocytic adaptor protein intersectin [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=55.42 E-value=1.6e+02 Score=32.97 Aligned_cols=9 Identities=44% Similarity=0.885 Sum_probs=4.8
Q ss_pred CCCCceEEE
Q 020292 211 LDPNKIYIV 219 (328)
Q Consensus 211 ld~~~vYII 219 (328)
+.|++|.|+
T Consensus 712 f~pGDII~V 720 (1118)
T KOG1029|consen 712 FEPGDIIIV 720 (1118)
T ss_pred ccCCCEEEE
Confidence 456655543
No 14
>COG5493 Uncharacterized conserved protein containing a coiled-coil domain [Function unknown]
Probab=54.71 E-value=12 Score=34.73 Aligned_cols=34 Identities=15% Similarity=0.321 Sum_probs=28.2
Q ss_pred cCCCCceEEEeeeecCCCChhhHHHHHHHcCCccc
Q 020292 210 DLDPNKIYIVGGLVDRNRWKGITMKKAQEQGLQTG 244 (328)
Q Consensus 210 eld~~~vYIIGGiVDrnr~Kglt~~kA~~~GI~ta 244 (328)
.+.++++|||||.|| .|++..-..-|..+||..+
T Consensus 193 gvki~~vivitpFih-dr~p~~~kAmAe~mGIeii 226 (231)
T COG5493 193 GVKINKVIVITPFIH-DRYPDRVKAMAERMGIEII 226 (231)
T ss_pred CCccceEEEEccccc-ccChHHHHHHHHHcCceec
Confidence 467899999999997 4567788888999999854
No 15
>PF13964 Kelch_6: Kelch motif
Probab=53.98 E-value=5.9 Score=27.41 Aligned_cols=13 Identities=38% Similarity=0.751 Sum_probs=11.1
Q ss_pred CCceEEEeeeecC
Q 020292 213 PNKIYIVGGLVDR 225 (328)
Q Consensus 213 ~~~vYIIGGiVDr 225 (328)
.+++||+||..+.
T Consensus 11 ~~~iyv~GG~~~~ 23 (50)
T PF13964_consen 11 GGKIYVFGGYDNS 23 (50)
T ss_pred CCEEEEECCCCCC
Confidence 5689999999884
No 16
>COG1901 Uncharacterized conserved protein [Function unknown]
Probab=49.43 E-value=32 Score=31.77 Aligned_cols=123 Identities=19% Similarity=0.343 Sum_probs=67.6
Q ss_pred cEEEecCCCcccCHHHHHHHHHHHHHHHHhhcCC--CCCc--eEEEecCCcchHHHHhcCCCCCccceecccchhhhhhh
Q 020292 116 NIIIDLEFSHLMSRAEIQSLVQQIMYCYAVNRKC--PSPA--HLWLTGCKGDMESQLQRLPGFDKWIIEKENRSYIEALE 191 (328)
Q Consensus 116 ~IvIDcsf~~lM~~kEi~sL~~Qi~~~Ys~NRr~--~~P~--~L~lt~~~~~l~~~l~~~~g~~~w~v~~~~~~~~e~f~ 191 (328)
.|.|+.+--..|++.| .|++.+|..+...+-.- ..+. -+++. ++.+ +.++..+.
T Consensus 65 tI~~~g~~~~~~~pdE-rs~a~~i~kAL~~~~~~~~~~~~~pGi~V~--~~~~-------------------e~ll~~l~ 122 (197)
T COG1901 65 TIRVEGSELRYLNPDE-RSLAILIKKALDAELGKEQTREVTPGIYVR--NGGF-------------------EALLAELA 122 (197)
T ss_pred EEEEEcccccccCcch-HHHHHHHHHHHHhhccccceeecCCCEEEe--cCCH-------------------HHHHHHHh
Confidence 3566666666777764 46677777776652111 1110 02221 1111 22333332
Q ss_pred ccCCcEEEecCCCccccC-cCCCCceEEEeeeecCCCChhhHHHHHHHc-CCcccccccccccccCCCcccchHHHHHHH
Q 020292 192 DHKENLVYLTADSDTVLD-DLDPNKIYIVGGLVDRNRWKGITMKKAQEQ-GLQTGKLPIGNYLKMSSSQVLTVNQVLEIL 269 (328)
Q Consensus 192 ~~~~~iVYLSpDS~~~L~-eld~~~vYIIGGiVDrnr~Kglt~~kA~~~-GI~taRLPI~eyi~l~~~~VLtiN~V~eIL 269 (328)
....++||.-|.+..-+ .+.++-+||+|-=. |+|.+..+-+ .+...+ |.+ |...|-.|||+-++
T Consensus 123 -~~~~ly~L~E~G~DI~~v~~~~np~FIlGDH~------g~t~e~~k~L~r~~~~~------ISl-GP~~lha~hcit~~ 188 (197)
T COG1901 123 -EGRSLYYLHEDGRDISEVDLIPNPVFILGDHI------GLTEEDEKLLERHAAKK------ISL-GPLSLHADHCITLL 188 (197)
T ss_pred -ccCcEEEEccCCccHhhcccCCCceEEeeCCC------CCCHHHHHHHHHhhCce------eEe-CchHHHHHHHHHHH
Confidence 13579999999987655 46899999999543 4444443221 001111 111 45678899999998
Q ss_pred HHhhc
Q 020292 270 LKFLE 274 (328)
Q Consensus 270 l~~~e 274 (328)
-.++.
T Consensus 189 h~~LD 193 (197)
T COG1901 189 HNLLD 193 (197)
T ss_pred HHHHh
Confidence 87764
No 17
>PF07788 DUF1626: Protein of unknown function (DUF1626); InterPro: IPR012431 This is a family consisting of sequences from hypothetical proteins of unknown function expressed by certain species of archaea. One member (Q9YCN7 from SWISSPROT) is thought to be similar to tropomyosin [].
Probab=47.69 E-value=18 Score=28.17 Aligned_cols=26 Identities=19% Similarity=0.431 Sum_probs=22.2
Q ss_pred CCCceEEEeeeecCCCChhhHHHHHHHcCCc
Q 020292 212 DPNKIYIVGGLVDRNRWKGITMKKAQEQGLQ 242 (328)
Q Consensus 212 d~~~vYIIGGiVDrnr~Kglt~~kA~~~GI~ 242 (328)
..+.+|||++.||.. ....|+++||.
T Consensus 45 k~~r~ivVtp~id~~-----a~~~A~~LGIe 70 (70)
T PF07788_consen 45 KVDRLIVVTPYIDDR-----AKEMAEELGIE 70 (70)
T ss_pred CcceEEEEEeecCHH-----HHHHHHHhCCC
Confidence 457899999999966 78899999984
No 18
>smart00612 Kelch Kelch domain.
Probab=45.62 E-value=11 Score=24.56 Aligned_cols=9 Identities=67% Similarity=1.250 Sum_probs=7.4
Q ss_pred ceEEEeeee
Q 020292 215 KIYIVGGLV 223 (328)
Q Consensus 215 ~vYIIGGiV 223 (328)
++|||||.-
T Consensus 1 ~iyv~GG~~ 9 (47)
T smart00612 1 KIYVVGGFD 9 (47)
T ss_pred CEEEEeCCC
Confidence 589999973
No 19
>PRK11657 dsbG disulfide isomerase/thiol-disulfide oxidase; Provisional
Probab=44.96 E-value=23 Score=33.37 Aligned_cols=50 Identities=16% Similarity=0.157 Sum_probs=36.0
Q ss_pred CCcEEEecCCCccccCcCCCCceEEEeeeecCCCChhhHHHHHHHcC------Cccccccccccccc
Q 020292 194 KENLVYLTADSDTVLDDLDPNKIYIVGGLVDRNRWKGITMKKAQEQG------LQTGKLPIGNYLKM 254 (328)
Q Consensus 194 ~~~iVYLSpDS~~~L~eld~~~vYIIGGiVDrnr~Kglt~~kA~~~G------I~taRLPI~eyi~l 254 (328)
...++|+|+|.. .+|.|-|+|-+ .+++|-....+.. ...+.||....|..
T Consensus 57 ~~~i~Y~t~dg~----------y~i~G~l~d~~-~~nlT~~~~~~~~~~~~~~~~~~~l~~~~~i~~ 112 (251)
T PRK11657 57 MGVTIYLTPDGK----------HAISGYMYDEK-GENLSEALLEKEVYAPMGREMWQRLEQSHWILD 112 (251)
T ss_pred CceEEEEcCCCC----------EEEEEEEEcCC-CCccCHHHHHHHhcCCccHHHHHHhhccCCccc
Confidence 455999999876 58889999987 4689876666532 22457777666655
No 20
>KOG2967 consensus Uncharacterized conserved protein [Function unknown]
Probab=43.37 E-value=58 Score=32.26 Aligned_cols=39 Identities=33% Similarity=0.394 Sum_probs=29.7
Q ss_pred CCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Q 020292 22 QPSGLSKTAQKKRLKQLRYEARKAEKKAKMKVEKKREGER 61 (328)
Q Consensus 22 ~~~~lSK~q~Kkl~k~~~we~~k~~rk~~~kekkk~~ker 61 (328)
..+|..+.--|+++|+++..+..++-++++++ +++..+|
T Consensus 23 ~~~p~~e~~sk~q~k~~~k~~~wee~~~~~~~-~rr~~er 61 (314)
T KOG2967|consen 23 GQQPVAEPMSKKQLKRQKKQAEWEELKKKKKE-RRREKER 61 (314)
T ss_pred CCCcccchhhHHHHHHHHHHHHHHHHHHHHHH-HHHHHHH
Confidence 34577888999999999999998888887777 4443333
No 21
>PF14213 DUF4325: Domain of unknown function (DUF4325)
Probab=40.89 E-value=94 Score=23.58 Aligned_cols=61 Identities=10% Similarity=0.218 Sum_probs=48.4
Q ss_pred HHHHhhhcCCcEEEecCCCcccCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCcchHHHHh
Q 020292 106 RLTKAKENGQNIIIDLEFSHLMSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKGDMESQLQ 169 (328)
Q Consensus 106 rl~~a~~~~~~IvIDcsf~~lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~~l~~~l~ 169 (328)
.+..+...|-.|+||++=-..++.-=...+..||...|+. ...--+|.+++.++.....+.
T Consensus 9 ~i~~~l~~~~~V~lDF~gv~~~~ssFl~eafg~l~~~~~~---~~~~~~l~~~~~~~~~~~~I~ 69 (74)
T PF14213_consen 9 EIEPALKEGEKVVLDFEGVESITSSFLNEAFGQLVREFGE---EEIKKRLKFKNANESIKEMIK 69 (74)
T ss_pred HHHHHHhcCCeEEEECCCcccccHHHHHHHHHHHHHHcCH---HHHhheeEEecCCHHHHHHHH
Confidence 4566677788899999999999999999999999999983 222346888898887666554
No 22
>KOG2769 consensus Putative u4/u6 small nuclear ribonucleoprotein [RNA processing and modification]
Probab=39.33 E-value=1.4e+02 Score=31.53 Aligned_cols=22 Identities=45% Similarity=0.443 Sum_probs=16.1
Q ss_pred CCCHHHHHHHHHHHHHHHHHHH
Q 020292 25 GLSKTAQKKRLKQLRYEARKAE 46 (328)
Q Consensus 25 ~lSK~q~Kkl~k~~~we~~k~~ 46 (328)
-|+|..+||+.+|-|-++.|+.
T Consensus 287 yLTKKErKKLRRQ~R~ea~KEk 308 (522)
T KOG2769|consen 287 YLTKKERKKLRRQRRKEARKEK 308 (522)
T ss_pred eecHHHHHHHHHHHHHHHHHHH
Confidence 5899999999887655554443
No 23
>cd03020 DsbA_DsbC_DsbG DsbA family, DsbC and DsbG subfamily; V-shaped homodimeric proteins containing a redox active CXXC motif imbedded in a TRX fold. They function as protein disulfide isomerases and chaperones in the bacterial periplasm to correct non-native disulfide bonds formed by DsbA and prevent aggregation of incorrectly folded proteins. DsbC and DsbG are kept in their reduced state by the cytoplasmic membrane protein DsbD, which utilizes the TRX/TRX reductase system in the cytosol as a source of reducing equivalents. DsbG differs from DsbC in that it has a more limited substrate specificity, and it may preferentially act later in the folding process to catalyze disulfide rearrangements in folded or partially folded proteins. Also included in the alignment is the predicted protein TrbB, whose gene was sequenced from the enterohemorrhagic E. coli type IV pilus gene cluster, which is required for efficient plasmid transfer.
Probab=39.32 E-value=53 Score=29.22 Aligned_cols=50 Identities=22% Similarity=0.367 Sum_probs=33.1
Q ss_pred CcEEEecCCCccccCcCCCCceEEEeeeecCCCCh-hhHHHHH-HHcCCccccccccccccc
Q 020292 195 ENLVYLTADSDTVLDDLDPNKIYIVGGLVDRNRWK-GITMKKA-QEQGLQTGKLPIGNYLKM 254 (328)
Q Consensus 195 ~~iVYLSpDS~~~L~eld~~~vYIIGGiVDrnr~K-glt~~kA-~~~GI~taRLPI~eyi~l 254 (328)
..++|+|+|.. ..|+|-++|-...+ ++|.... ....+..+.||....+.+
T Consensus 21 ~~~~y~~~dg~----------~~i~G~l~d~~~~~~~~t~~~~~~~~~~~~~~l~~~~~i~~ 72 (197)
T cd03020 21 GGVLYTDDDGR----------YLIQGNLYDAKGRKDDLTEARLAQLNAIDLSALPLDDAIVY 72 (197)
T ss_pred CEEEEEcCCCC----------EEEEeEEEEccCCCCChhHHHHHHhhhhhhhhCCcccCeEE
Confidence 56999999875 67789999966554 5554333 233455667887665554
No 24
>PTZ00266 NIMA-related protein kinase; Provisional
Probab=38.27 E-value=3.5e+02 Score=31.11 Aligned_cols=23 Identities=22% Similarity=0.343 Sum_probs=11.8
Q ss_pred ecCCCccccCc--CCCCceEEEeee
Q 020292 200 LTADSDTVLDD--LDPNKIYIVGGL 222 (328)
Q Consensus 200 LSpDS~~~L~e--ld~~~vYIIGGi 222 (328)
++|.+-+.++. +....-|-+|||
T Consensus 748 ~~~~~~~~~~~~~~~~~~~~~~~~~ 772 (1021)
T PTZ00266 748 TGPYIGNPMESTKYRDHNKYSVGGL 772 (1021)
T ss_pred cccccCCccccccccccccccccch
Confidence 44444444442 444455666666
No 25
>KOG2505 consensus Ankyrin repeat protein [General function prediction only]
Probab=37.57 E-value=84 Score=33.25 Aligned_cols=17 Identities=12% Similarity=0.386 Sum_probs=12.6
Q ss_pred CCCCCCCCHHHHHHHHH
Q 020292 20 NSQPSGLSKTAQKKRLK 36 (328)
Q Consensus 20 ~~~~~~lSK~q~Kkl~k 36 (328)
..+|.|||+.|--.+.-
T Consensus 495 t~i~~PltrEq~~eq~e 511 (591)
T KOG2505|consen 495 THIPEPLTREQEREQAE 511 (591)
T ss_pred hcCCCcccHHHHHHHHH
Confidence 45789999988766543
No 26
>PF13466 STAS_2: STAS domain
Probab=36.45 E-value=1.3e+02 Score=22.40 Aligned_cols=58 Identities=21% Similarity=0.265 Sum_probs=38.3
Q ss_pred HHHhhhcCCcEEEecCCCcccCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCcchHHHHhc
Q 020292 107 LTKAKENGQNIIIDLEFSHLMSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKGDMESQLQR 170 (328)
Q Consensus 107 l~~a~~~~~~IvIDcsf~~lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~~l~~~l~~ 170 (328)
+......+-.|+|||+--+.|. |.+-|+.......-+. ...++.|++.++.+...|..
T Consensus 19 l~~~~~~~~~v~lDls~v~~iD-----sagl~lL~~~~~~~~~-~g~~~~l~~~~~~~~~ll~~ 76 (80)
T PF13466_consen 19 LQALLASGRPVVLDLSGVEFID-----SAGLQLLLAAARRARA-RGRQLRLTGPSPALRRLLEL 76 (80)
T ss_pred HHHHHcCCCeEEEECCCCCeec-----HHHHHHHHHHHHHHHH-CCCeEEEEcCCHHHHHHHHH
Confidence 3344445679999999888765 5555655554443222 45678889999888776653
No 27
>PF07646 Kelch_2: Kelch motif; InterPro: IPR011498 Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified []. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP [] and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin [, ], and in galactose oxidase from the fungus Dactylium dendroides [, ]. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold []. The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila []. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase []. This entry represents a type of kelch sequence motif that comprises one beta-sheet blade.; GO: 0005515 protein binding
Probab=35.67 E-value=19 Score=24.86 Aligned_cols=10 Identities=50% Similarity=1.036 Sum_probs=9.2
Q ss_pred CCceEEEeee
Q 020292 213 PNKIYIVGGL 222 (328)
Q Consensus 213 ~~~vYIIGGi 222 (328)
.+++||+||.
T Consensus 11 ~~kiyv~GG~ 20 (49)
T PF07646_consen 11 DGKIYVFGGY 20 (49)
T ss_pred CCEEEEECCc
Confidence 6799999999
No 28
>PF07939 DUF1685: Protein of unknown function (DUF1685); InterPro: IPR012881 The members of this family are hypothetical eukaryotic proteins of unknown function. The region in question is approximately 100 amino acid residues long.
Probab=35.38 E-value=27 Score=26.77 Aligned_cols=37 Identities=24% Similarity=0.282 Sum_probs=26.2
Q ss_pred cEEEecCCCc--ccCHHHHHHHHHHHHHHHHhhcCCCCC
Q 020292 116 NIIIDLEFSH--LMSRAEIQSLVQQIMYCYAVNRKCPSP 152 (328)
Q Consensus 116 ~IvIDcsf~~--lM~~kEi~sL~~Qi~~~Ys~NRr~~~P 152 (328)
+=+||+||.- .--.-++.++.==|..||+.||++..+
T Consensus 10 kGc~dLGFgF~~~~~~p~L~~tlPaL~lyyavn~q~~~~ 48 (64)
T PF07939_consen 10 KGCIDLGFGFDEEDLDPRLCDTLPALELYYAVNRQYSDH 48 (64)
T ss_pred hhhhhhccccCccccChHHHhhhHHHHHHHHHHHHhccc
Confidence 3367776654 333356777778889999999998765
No 29
>PF02112 PDEase_II: cAMP phosphodiesterases class-II; InterPro: IPR000396 Cyclic-AMP phosphodiesterase (3.1.4.17 from EC) (PDE) catalyses the hydrolysis of cAMP to the corresponding nucleoside 5' monophosphate. On the basis of sequence similarity, most PDEs can be grouped together [], but some enzymes lie apart from the main family and represent a second distinct class [] that includes PDEs from Dictyostelium and yeast. This entry contains class-II cyclic-AMP phosphodiesterases.; GO: 0004115 3',5'-cyclic-AMP phosphodiesterase activity, 0006198 cAMP catabolic process
Probab=35.27 E-value=61 Score=32.29 Aligned_cols=49 Identities=8% Similarity=0.097 Sum_probs=37.6
Q ss_pred cEEEecCCCcc---------cCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCcch
Q 020292 116 NIIIDLEFSHL---------MSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKGDM 164 (328)
Q Consensus 116 ~IvIDcsf~~l---------M~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~~l 164 (328)
.|+|.|||.+. |+++-+...-.+|...|+.......-+++.||.+.+.+
T Consensus 245 aI~IEcS~~~~~~d~~LyGHLtP~~Li~EL~~L~~~~~~~~~~L~gL~VIItHIK~~~ 302 (335)
T PF02112_consen 245 AIFIECSYPNSQPDSQLYGHLTPKHLIEELKVLASKVGQTSPPLKGLNVIITHIKPSL 302 (335)
T ss_pred EEEEEeCCCCCCCchHhhccCCHHHHHHHHHHHHhccccccCCCCCCeEEEEEeCCcc
Confidence 69999999974 56666777777777777766667778889999886543
No 30
>PRK02135 hypothetical protein; Provisional
Probab=35.25 E-value=1.3e+02 Score=27.94 Aligned_cols=70 Identities=17% Similarity=0.427 Sum_probs=46.7
Q ss_pred CCcEEEecCCCccccC-cCCCCceEEEeeeecCCCChhhHHHHHHHcCCcccccccccccccCCCcccchHHHHHHHHHh
Q 020292 194 KENLVYLTADSDTVLD-DLDPNKIYIVGGLVDRNRWKGITMKKAQEQGLQTGKLPIGNYLKMSSSQVLTVNQVLEILLKF 272 (328)
Q Consensus 194 ~~~iVYLSpDS~~~L~-eld~~~vYIIGGiVDrnr~Kglt~~kA~~~GI~taRLPI~eyi~l~~~~VLtiN~V~eILl~~ 272 (328)
...++||.-|....-+ ++.++-+||+| ||....--..+.=.++|-. ++- + |.+.|-.+|++-++..+
T Consensus 126 ~~~l~~L~e~G~~i~~~~~~~~~~FvLg---DH~~~~~ee~~~L~~~ga~--~iS------l-GP~~l~AshcI~~vhn~ 193 (201)
T PRK02135 126 GKTLYYLHEDGEDIRDVEFPENPVFVLG---DHIGFTEEEENLLKRLGAE--KIS------L-GPKMLHADHCITLIHNE 193 (201)
T ss_pred CCcEEEEeCCCCchhhccCCCCCEEEEe---CCCCCCHHHHHHHHHhCCe--EEE------e-CcHHHHHHHHHHHHHHH
Confidence 4679999999998775 57778899999 7664322222222333322 222 2 56678889999999988
Q ss_pred hcC
Q 020292 273 LET 275 (328)
Q Consensus 273 ~e~ 275 (328)
+..
T Consensus 194 LD~ 196 (201)
T PRK02135 194 LDR 196 (201)
T ss_pred Hhh
Confidence 865
No 31
>PF01740 STAS: STAS domain; InterPro: IPR002645 The STAS (Sulphate Transporter and AntiSigma factor antagonist) domain is found in the C-terminal region of sulphate transporters and bacterial anti-sigma factor antagonists. It has been suggested that this domain may have a general NTP binding function. The establishment of differential gene expression in sporulating Bacillus subtilis involves four protein components, one of which is SpoIIAA (P10727 from SWISSPROT). The four components regulate the sporulation sigma factor F. Early in sporulation, SpoIIAA is in the phosphorylated state (SpoIIAA-P), as a result of the activity of the ATP-dependent protein kinase SpoIIAB (P10728 from SWISSPROT). The site at which this protein is phosphorylated is a conserved serine. SpoIIAB is an anti-sigma factor that in its free form inhibits F by binding to it. Competition by SpoIIAA (the anti-anti-sigma factor) for binding to SpoIIAB releases Sigma F activity []. The STAS domain is found in the anti-sigma factor antagonist SpoIIAA.; PDB: 3T6O_B 3LKL_B 1H4Z_A 1H4Y_B 1H4X_B 3NY7_A 3OIZ_A 1T6R_A 1VC1_B 1SBO_A ....
Probab=35.15 E-value=1.6e+02 Score=23.47 Aligned_cols=55 Identities=27% Similarity=0.360 Sum_probs=42.9
Q ss_pred CcEEEecCCCcccCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCcchHHHHhcCCCCCc
Q 020292 115 QNIIIDLEFSHLMSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKGDMESQLQRLPGFDK 176 (328)
Q Consensus 115 ~~IvIDcsf~~lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~~l~~~l~~~~g~~~ 176 (328)
..|||||+--..|.---+..|..=...+.. .-..++|+++++.+...|... |+.+
T Consensus 49 ~~vIlD~s~v~~iDssgi~~L~~~~~~~~~------~g~~~~l~~~~~~v~~~l~~~-~~~~ 103 (117)
T PF01740_consen 49 KNVILDMSGVSFIDSSGIQALVDIIKELRR------RGVQLVLVGLNPDVRRILERS-GLID 103 (117)
T ss_dssp SEEEEEETTESEESHHHHHHHHHHHHHHHH------TTCEEEEESHHHHHHHHHHHT-TGHH
T ss_pred eEEEEEEEeCCcCCHHHHHHHHHHHHHHHH------CCCEEEEEECCHHHHHHHHHc-CCCh
Confidence 589999999999998888877665555553 556799999999988888764 4433
No 32
>PLN02850 aspartate-tRNA ligase
Probab=34.63 E-value=64 Score=34.08 Aligned_cols=21 Identities=29% Similarity=0.194 Sum_probs=16.4
Q ss_pred CCCCCHHHHHHHHHHHHHHHH
Q 020292 23 PSGLSKTAQKKRLKQLRYEAR 43 (328)
Q Consensus 23 ~~~lSK~q~Kkl~k~~~we~~ 43 (328)
..++||+++||+.|++.+++.
T Consensus 10 ~~~~~~~~~~k~~~~~~~~~~ 30 (530)
T PLN02850 10 GEKISKKAAKKAAAKAEKLRR 30 (530)
T ss_pred CCCcCHHHHHHHHHHHHHHHH
Confidence 345999999999987766654
No 33
>PF06258 Mito_fiss_Elm1: Mitochondrial fission ELM1; InterPro: IPR009367 This family consists of several hypothetical eukaryotic and prokaryotic proteins. The function of this family is unknown.
Probab=34.07 E-value=3.6e+02 Score=26.34 Aligned_cols=148 Identities=15% Similarity=0.205 Sum_probs=90.7
Q ss_pred cCCcEEEecCCCc---ccCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCC---cchHHHHhcCC-CCC---ccceecc
Q 020292 113 NGQNIIIDLEFSH---LMSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCK---GDMESQLQRLP-GFD---KWIIEKE 182 (328)
Q Consensus 113 ~~~~IvIDcsf~~---lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~---~~l~~~l~~~~-g~~---~w~v~~~ 182 (328)
..++++|=.|=++ -+++.....|+.||...... .+..++||... ..+...|.+.. ... -| -...
T Consensus 145 ~~p~~avLIGG~s~~~~~~~~~~~~l~~~l~~~~~~-----~~~~~~vttSRRTp~~~~~~L~~~~~~~~~~~~~-~~~~ 218 (311)
T PF06258_consen 145 PRPRVAVLIGGDSKHYRWDEEDAERLLDQLAALAAA-----YGGSLLVTTSRRTPPEAEAALRELLKDNPGVYIW-DGTG 218 (311)
T ss_pred CCCeEEEEECcCCCCcccCHHHHHHHHHHHHHHHHh-----CCCeEEEEcCCCCcHHHHHHHHHhhcCCCceEEe-cCCC
Confidence 4577777666444 68889999999999988876 44677777652 23445554322 222 23 1223
Q ss_pred cchhhhhhhccCCcEEEecCCCccccCc-C-CCCceEEEeeeecCCCChhhH-HHHHHHcCCcccccccccc--c-ccC-
Q 020292 183 NRSYIEALEDHKENLVYLTADSDTVLDD-L-DPNKIYIVGGLVDRNRWKGIT-MKKAQEQGLQTGKLPIGNY--L-KMS- 255 (328)
Q Consensus 183 ~~~~~e~f~~~~~~iVYLSpDS~~~L~e-l-d~~~vYIIGGiVDrnr~Kglt-~~kA~~~GI~taRLPI~ey--i-~l~- 255 (328)
..+|.+.+. ..+.|+.|+||-+.+.| + .---|||+.-=- ++. |--. ++.=.+.|+-. |+... + .+.
T Consensus 219 ~nPy~~~La--~ad~i~VT~DSvSMvsEA~~tG~pV~v~~l~~-~~~-r~~r~~~~L~~~g~~r---~~~~~~~~~~~~~ 291 (311)
T PF06258_consen 219 ENPYLGFLA--AADAIVVTEDSVSMVSEAAATGKPVYVLPLPG-RSG-RFRRFHQSLEERGAVR---PFTGWRDLEQWTP 291 (311)
T ss_pred CCcHHHHHH--hCCEEEEcCccHHHHHHHHHcCCCEEEecCCC-cch-HHHHHHHHHHHCCCEE---ECCCccccccccc
Confidence 456887776 67899999999999987 3 334788885432 222 2222 22224566554 55544 2 233
Q ss_pred CCcccchHHHHHHHHHhh
Q 020292 256 SSQVLTVNQVLEILLKFL 273 (328)
Q Consensus 256 ~~~VLtiN~V~eILl~~~ 273 (328)
..+.--++.|.++|++.+
T Consensus 292 ~~pl~et~r~A~~i~~r~ 309 (311)
T PF06258_consen 292 YEPLDETDRVAAEIRERL 309 (311)
T ss_pred CCCccHHHHHHHHHHHHh
Confidence 234456888888888754
No 34
>PF04122 CW_binding_2: Putative cell wall binding repeat 2; InterPro: IPR007253 This repeat is found in multiple tandem copies in proteins including amidase enhancers [] and adhesins [].
Probab=33.53 E-value=8.8 Score=30.13 Aligned_cols=16 Identities=44% Similarity=0.856 Sum_probs=12.7
Q ss_pred ccCcCCCCceEEEeee
Q 020292 207 VLDDLDPNKIYIVGGL 222 (328)
Q Consensus 207 ~L~eld~~~vYIIGGi 222 (328)
.|..+.+.++|||||-
T Consensus 67 ~l~~~~~~~v~iiGg~ 82 (92)
T PF04122_consen 67 FLKSLNIKKVYIIGGE 82 (92)
T ss_pred HHHHcCCCEEEEECCC
Confidence 4556678999999985
No 35
>PRK12377 putative replication protein; Provisional
Probab=32.54 E-value=3.4e+02 Score=25.61 Aligned_cols=53 Identities=13% Similarity=0.088 Sum_probs=30.0
Q ss_pred CcEEEecCCCcccCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCcchHHHH
Q 020292 115 QNIIIDLEFSHLMSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKGDMESQL 168 (328)
Q Consensus 115 ~~IvIDcsf~~lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~~l~~~l 168 (328)
+...-+|+|++.....+-...+......|+.+-.. ..-.|+|+|-.|-.++.|
T Consensus 65 ~~~~~~~tFdnf~~~~~~~~~a~~~a~~~a~~~~~-~~~~l~l~G~~GtGKThL 117 (248)
T PRK12377 65 QPLHRKCSFANYQVQNDGQRYALSQAKSIADELMT-GCTNFVFSGKPGTGKNHL 117 (248)
T ss_pred CcccccCCcCCcccCChhHHHHHHHHHHHHHHHHh-cCCeEEEECCCCCCHHHH
Confidence 35568999999875322222233333334333221 235799999877666655
No 36
>PF13019 Telomere_Sde2: Telomere stability and silencing
Probab=30.23 E-value=96 Score=27.91 Aligned_cols=24 Identities=21% Similarity=0.309 Sum_probs=15.8
Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHH
Q 020292 32 KKRLKQLRYEARKAEKKAKMKVEK 55 (328)
Q Consensus 32 Kkl~k~~~we~~k~~rk~~~kekk 55 (328)
+-.++.+.|.+.+.+|.++++|++
T Consensus 126 ~~~k~l~~~~~~~~er~k~~~e~~ 149 (162)
T PF13019_consen 126 NEEKKLAEWLEKKPEREKKEKEKR 149 (162)
T ss_pred HHHHHHHHHHhcChhHHHHHHHHH
Confidence 334455779999988875555543
No 37
>KOG1135 consensus mRNA cleavage and polyadenylation factor II complex, subunit CFT2 (CPSF subunit) [RNA processing and modification]
Probab=30.18 E-value=14 Score=40.05 Aligned_cols=33 Identities=24% Similarity=0.308 Sum_probs=27.7
Q ss_pred cCCcEEEecCCCcccCHHHHHHHHHHHHHHHHh
Q 020292 113 NGQNIIIDLEFSHLMSRAEIQSLVQQIMYCYAV 145 (328)
Q Consensus 113 ~~~~IvIDcsf~~lM~~kEi~sL~~Qi~~~Ys~ 145 (328)
.|.+|.||||+++.|....+..|-.||--|=+.
T Consensus 23 D~~~iLiDcGwd~~f~~~~i~~l~~~i~~iDaI 55 (764)
T KOG1135|consen 23 DGVRILIDCGWDESFDMSMIKELKPVIPTIDAI 55 (764)
T ss_pred cCeEEEEeCCCcchhccchhhhhhcccccccEE
Confidence 368999999999999999999999888655433
No 38
>PF09554 RE_HaeII: HaeII restriction endonuclease; InterPro: IPR019058 There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements [, ], as summarised below: Type I enzymes (3.1.21.3 from EC) cleave at sites remote from recognition site; require both ATP and S-adenosyl-L-methionine to function; multifunctional protein with both restriction and methylase (2.1.1.72 from EC) activities. Type II enzymes (3.1.21.4 from EC) cleave within or at short specific distances from recognition site; most require magnesium; single function (restriction) enzymes independent of methylase. Type III enzymes (3.1.21.5 from EC) cleave at sites a short distance from recognition site; require ATP (but do not hydrolyse it); S-adenosyl-L-methionine stimulates reaction but is not required; exist as part of a complex with a modification methylase (2.1.1.72 from EC). Type IV enzymes target methylated DNA. Type II restriction endonucleases (3.1.21.4 from EC) are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis.
Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin []. However, there is still considerable diversity amongst restriction endonucleases [, ]. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone []. This family includes Type-2 restriction enzyme NgoBI, and HaeII. They recognise the double-stranded sequence RGCGCY and cleave after C-5.
Probab=28.87 E-value=70 Score=31.38 Aligned_cols=95 Identities=15% Similarity=0.208 Sum_probs=68.3
Q ss_pred cCCcEEEecCCCcccCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCcchHHHHhcCCCCCccceecccchhhhhhhc
Q 020292 113 NGQNIIIDLEFSHLMSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKGDMESQLQRLPGFDKWIIEKENRSYIEALED 192 (328)
Q Consensus 113 ~~~~IvIDcsf~~lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~~l~~~l~~~~g~~~w~v~~~~~~~~e~f~~ 192 (328)
-+..|=||.++..+.--+|-..++++|...-+.|-....+.+++=+|+.--...-|+ .|.||..
T Consensus 166 l~v~v~i~~~~~n~~Ll~Ef~dft~kil~l~~~~~~~~~~Aki~RVGVTNAADRGLD---MwaNfG~------------- 229 (338)
T PF09554_consen 166 LNVKVEIDLPENNYDLLKEFEDFTKKILNLNPSNPSLTLNAKIYRVGVTNAADRGLD---MWANFGP------------- 229 (338)
T ss_pred hCcEEEEecCchhhHHHHHHHHHHHHHhccCCCCccccccceeEEeccccchhcchh---HHhccCh-------------
Confidence 478899999999999999999999999988888888889999998887643222232 2444421
Q ss_pred cCCcEEEecCCC---ccccCcCCCCceEEEeeeec
Q 020292 193 HKENLVYLTADS---DTVLDDLDPNKIYIVGGLVD 224 (328)
Q Consensus 193 ~~~~iVYLSpDS---~~~L~eld~~~vYIIGGiVD 224 (328)
.=+|=+||-|. ++..+.+..+.++|++-=.+
T Consensus 230 -aiQiKhLSL~~~laE~IV~sissdrIvIVCkdae 263 (338)
T PF09554_consen 230 -AIQIKHLSLDEELAENIVSSISSDRIVIVCKDAE 263 (338)
T ss_pred -heeeeeecccHHHHHHHHhhcccCeEEEEecchh
Confidence 22355565553 45566788888888775444
No 39
>TIGR02192 HtrL_YibB conserved hypothetical protein HtrL. The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologs are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein.
Probab=28.45 E-value=29 Score=33.63 Aligned_cols=38 Identities=18% Similarity=0.281 Sum_probs=27.9
Q ss_pred ccCCcEEEecC----CCccccCc-CCCCceEEEeeeecCCCCh
Q 020292 192 DHKENLVYLTA----DSDTVLDD-LDPNKIYIVGGLVDRNRWK 229 (328)
Q Consensus 192 ~~~~~iVYLSp----DS~~~L~e-ld~~~vYIIGGiVDrnr~K 229 (328)
+.+=.++.++| +++..|.+ +-.+.+||+||++=-..++
T Consensus 163 ~~KI~lf~I~~~~~~~~~~~l~~l~~~n~~~I~GG~i~g~~~~ 205 (270)
T TIGR02192 163 EGKISLFKIKPYFDEVSSQDLKRLYRKNVALLSGGFIAGDKHA 205 (270)
T ss_pred CCcEEEEEecCCCCcchhhhHHHHHhcCceEEEccEEEcCHHH
Confidence 34566788888 66666665 6788999999998666544
No 40
>PF00965 TIMP: Tissue inhibitor of metalloproteinase; InterPro: IPR001820 Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. Tissue inhibitors of metalloproteinases (TIMPs, [, , ]) and their target matrix metalloproteinases (MMPs, MEROPS peptidase family M10A) are important in connective tissue re-modelling in diseases of the cardiovascular system and in the physiological degradation of connective tissue, as well as in pathological states such as tumour invasion and arthritis. TIMPs belong to MEROPS proteinase inhibitor family I35, clan IT. TIMPs complex with extracellular matrix metalloproteinases (such as collagenases) and irreversibly inactivate them. Members of this family are common in extracellular regions of vertebrate species []. TIMPs are proteins of about 200 amino acid residues, 12 of which are cysteines involved in disulphide bonds []. The basic structure of such a type of inhibitor is shown in the following schematic representation: CxCxCxxxxxxxxxxxxxxxxxCxxxxxxxxxCxxxxxxxCxCxCxCxCxxxxxCxxCxxx 'C': conserved cysteine involved in a disulphide bond.
The crystal structure of the human proMMP-2/TIMP-2 complex reveals an interaction between the hemopexin domain of proMMP-2 and the C-terminal domain of TIMP-2, leaving the catalytic site of MMP-2 and the inhibitory site of TIMP-2 distant and spatially isolated. The interfacial contact of these two proteins is characterised by two distinct binding regions composed of alternating hydrophobic and hydrophilic interactions. This unique structure provides information for how specificity for non-inhibitory MMP/TIMP complex formation is achieved []. ; GO: 0008191 metalloendopeptidase inhibitor activity; PDB: 2E2D_C 1BUV_T 1BQQ_T 1BR9_A 1GXD_D 2TMP_A 2J0T_F 1D2B_A 3MA2_C 1UEA_B ....
Probab=28.32 E-value=27 Score=31.36 Aligned_cols=48 Identities=23% Similarity=0.467 Sum_probs=24.8
Q ss_pred cEEEecCCCccccC--cCCCCceEEEee-ee-cCCCChhhH--------HHHHHHcCCcc
Q 020292 196 NLVYLTADSDTVLD--DLDPNKIYIVGG-LV-DRNRWKGIT--------MKKAQEQGLQT 243 (328)
Q Consensus 196 ~iVYLSpDS~~~L~--eld~~~vYIIGG-iV-Drnr~Kglt--------~~kA~~~GI~t 243 (328)
.++|+...+...+= .|+.++.|+|.| ++ |-+.+-++| +-.+++.|++.
T Consensus 65 ~~~~i~T~~~~s~CGv~l~~g~~YLIaG~~~~~g~l~i~~C~~v~~w~~lt~~Qr~g~~~ 124 (181)
T PF00965_consen 65 DIQFIYTPSSSSLCGVKLELGKEYLIAGRVVSDGKLHISLCSFVEPWDSLTPSQRRGLKH 124 (181)
T ss_dssp S-SEEEEESSGGGTS----SSSEEEEEEEEESTTEEE--TTSCEEEGGGS-HHHHHCCCT
T ss_pred ccceeccCCcccccCcccCCCceEEEEEEecCCCcEEEEECCcEeehhhCCHHHHhhhHH
Confidence 44444444443331 288999999988 77 444455555 55566777654
No 41
>COG1578 Uncharacterized conserved protein [Function unknown]
Probab=27.21 E-value=2.6e+02 Score=27.42 Aligned_cols=44 Identities=27% Similarity=0.524 Sum_probs=37.5
Q ss_pred cEEEecCCCcc---------ccCcCCCCceEEEeeeecCCCChhhHHHHHHHcCCc
Q 020292 196 NLVYLTADSDT---------VLDDLDPNKIYIVGGLVDRNRWKGITMKKAQEQGLQ 242 (328)
Q Consensus 196 ~iVYLSpDS~~---------~L~eld~~~vYIIGGiVDrnr~Kglt~~kA~~~GI~ 242 (328)
+|+||+-.+=+ +|.++...-|||++| .-...-.|.+-|++.||.
T Consensus 153 ~VlYl~DNaGEi~FD~vlie~ik~~~~~vv~vVrg---~PIlnDaT~EDak~~~i~ 205 (285)
T COG1578 153 SVLYLTDNAGEIVFDKVLIEVIKELGKKVVVVVRG---GPILNDATMEDAKEAGID 205 (285)
T ss_pred cEEEEecCCccHHHHHHHHHHHHhcCCceEEEEcC---CceechhhHHHHHHcCcc
Confidence 89999988865 345788899999999 566778999999999987
No 42
>KOG0068 consensus D-3-phosphoglycerate dehydrogenase, D-isomer-specific 2-hydroxy acid dehydrogenase superfamily [Amino acid transport and metabolism]
Probab=26.96 E-value=55 Score=33.17 Aligned_cols=118 Identities=23% Similarity=0.289 Sum_probs=65.8
Q ss_pred EEEecCCCcccCH-----HHHHHHHHHHHHHHHhhcCCCCCceEEE-----------ecCC---cchHHHHhcCC----C
Q 020292 117 IIIDLEFSHLMSR-----AEIQSLVQQIMYCYAVNRKCPSPAHLWL-----------TGCK---GDMESQLQRLP----G 173 (328)
Q Consensus 117 IvIDcsf~~lM~~-----kEi~sL~~Qi~~~Ys~NRr~~~P~~L~l-----------t~~~---~~l~~~l~~~~----g 173 (328)
+|+.-=+.+-.+- -.+-||++||..+|...|.-.|-..-++ -|++ ..+..+++.+. +
T Consensus 95 ~Vvn~P~~Ns~saAEltigli~SLaR~i~~A~~s~k~g~wnr~~~~G~el~GKTLgvlG~GrIGseVA~r~k~~gm~vI~ 174 (406)
T KOG0068|consen 95 LVVNTPTANSRSAAELTIGLILSLARQIGQASASMKEGKWNRVKYLGWELRGKTLGVLGLGRIGSEVAVRAKAMGMHVIG 174 (406)
T ss_pred EEEeCCCCChHHHHHHHHHHHHHHhhhcchhheeeecCceeecceeeeEEeccEEEEeecccchHHHHHHHHhcCceEEe
Confidence 3444444443333 4578999999999999988887654332 2332 12344444332 2
Q ss_pred CCcccee-------cccchhhhhhhccCCcE----EEecCCCccccCc----CCCCceEEE----eeeecCCCChhhHHH
Q 020292 174 FDKWIIE-------KENRSYIEALEDHKENL----VYLTADSDTVLDD----LDPNKIYIV----GGLVDRNRWKGITMK 234 (328)
Q Consensus 174 ~~~w~v~-------~~~~~~~e~f~~~~~~i----VYLSpDS~~~L~e----ld~~~vYII----GGiVDrnr~Kglt~~ 234 (328)
|+-.... +.--++.++++ +.+. +=|||++.+.|.+ .-..-|||| ||+||. -.+=
T Consensus 175 ~dpi~~~~~~~a~gvq~vsl~Eil~--~ADFitlH~PLtP~T~~lin~~tfA~mKkGVriIN~aRGGvVDe-----~ALv 247 (406)
T KOG0068|consen 175 YDPITPMALAEAFGVQLVSLEEILP--KADFITLHVPLTPSTEKLLNDETFAKMKKGVRIINVARGGVVDE-----PALV 247 (406)
T ss_pred ecCCCchHHHHhccceeeeHHHHHh--hcCEEEEccCCCcchhhccCHHHHHHhhCCcEEEEecCCceech-----HHHH
Confidence 2221100 00112233333 3444 4478999988875 234569999 999994 3455
Q ss_pred HHHHcCC
Q 020292 235 KAQEQGL 241 (328)
Q Consensus 235 kA~~~GI 241 (328)
+|-+-|+
T Consensus 248 ~Al~sG~ 254 (406)
T KOG0068|consen 248 RALDSGQ 254 (406)
T ss_pred HHHhcCc
Confidence 5666554
No 43
>COG1366 SpoIIAA Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) [Signal transduction mechanisms]
Probab=26.87 E-value=3.3e+02 Score=22.01 Aligned_cols=55 Identities=24% Similarity=0.330 Sum_probs=40.6
Q ss_pred CCcEEEecCCCcccCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCcchHHHHhcCCCCC
Q 020292 114 GQNIIIDLEFSHLMSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKGDMESQLQRLPGFD 175 (328)
Q Consensus 114 ~~~IvIDcsf~~lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~~l~~~l~~~~g~~ 175 (328)
+..|||||+--+.|.---+..|...+..+-.. - ..+.++|.++.++.-|... |+.
T Consensus 44 ~~~ivIDls~v~~~dS~gl~~L~~~~~~~~~~----g--~~~~l~~i~p~v~~~~~~~-gl~ 98 (117)
T COG1366 44 ARGLVIDLSGVDFMDSAGLGVLVALLKSARLR----G--VELVLVGIQPEVARTLELT-GLD 98 (117)
T ss_pred CcEEEEECCCCceechHHHHHHHHHHHHHHhc----C--CeEEEEeCCHHHHHHHHHh-Cch
Confidence 34499999999999987777777665555332 2 5789999999988777653 443
No 44
>PF08496 Peptidase_S49_N: Peptidase family S49 N-terminal; InterPro: IPR013703 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found to the N terminus of bacterial signal peptidases that belong to the MEROPS peptidase family S49 (protease IV family, clan SK) (see also IPR002142 from INTERPRO) [, ]. ; GO: 0004252 serine-type endopeptidase activity, 0005886 plasma membrane
Probab=26.59 E-value=4.5e+02 Score=23.26 Aligned_cols=42 Identities=21% Similarity=0.234 Sum_probs=32.1
Q ss_pred cCCCcccCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCcc
Q 020292 121 LEFSHLMSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKGD 163 (328)
Q Consensus 121 csf~~lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~~ 163 (328)
++|+--|.-.+..+|...|+-..+.-+.- .-+-|-+.|-+|-
T Consensus 103 ldF~Gdi~A~~v~~LReeisail~~a~~~-DeV~~rLES~GG~ 144 (155)
T PF08496_consen 103 LDFKGDIKASEVESLREEISAILSVATPE-DEVLVRLESPGGM 144 (155)
T ss_pred EecCCCccHHHHHHHHHHHHHHHHhCCCC-CeEEEEEecCCce
Confidence 45777788999999999999999986655 4455666666653
No 45
>PF13056 DUF3918: Protein of unknown function (DUF3918)
Probab=26.41 E-value=48 Score=23.52 Aligned_cols=15 Identities=27% Similarity=0.257 Sum_probs=12.3
Q ss_pred CCCCHHHHHHHHHHH
Q 020292 24 SGLSKTAQKKRLKQL 38 (328)
Q Consensus 24 ~~lSK~q~Kkl~k~~ 38 (328)
.-+|++|+||..|+-
T Consensus 25 ~m~n~R~MKKmrkrv 39 (43)
T PF13056_consen 25 DMMNKRQMKKMRKRV 39 (43)
T ss_pred cccchHHHHHHHHHH
Confidence 358999999998864
No 46
>PLN00185 60S ribosomal protein L4-1; Provisional
Probab=26.38 E-value=87 Score=32.10 Aligned_cols=93 Identities=16% Similarity=0.245 Sum_probs=60.4
Q ss_pred cEEEecCCCcccCHHHHHHHHHHHHHHHH------------------hhcCCCCCc-eEEEecC-CcchHHHHhcCCCCC
Q 020292 116 NIIIDLEFSHLMSRAEIQSLVQQIMYCYA------------------VNRKCPSPA-HLWLTGC-KGDMESQLQRLPGFD 175 (328)
Q Consensus 116 ~IvIDcsf~~lM~~kEi~sL~~Qi~~~Ys------------------~NRr~~~P~-~L~lt~~-~~~l~~~l~~~~g~~ 175 (328)
+||||-.+...-.-++...+.+.|. +|. -||++..+- .|.|++- ++.+...+.+++|.+
T Consensus 155 pLVV~d~~e~~~KTK~av~~Lk~lg-~~~d~~k~~~s~~iRaGkGKmR~Rr~~~~kg~LIV~~~~~~~l~kA~RNIPgV~ 233 (405)
T PLN00185 155 PLVVSDSAESIEKTSAAIKILKQIG-AYADVEKAKDSKGIRAGKGKMRNRRYVSRKGPLVVYGTEGAKIVKAFRNIPGVE 233 (405)
T ss_pred CEEEEeCccCCcCHHHHHHHHHHcC-CcccchhhhcccccccccccccccccccCCceEEEEcCCchhhhhhhcCCCCCe
Confidence 4677666677778888888888775 332 267776666 5666654 344777777777754
Q ss_pred ccceecccchhhhhhhcc-CCcEEEecCCCccccCcC
Q 020292 176 KWIIEKENRSYIEALEDH-KENLVYLTADSDTVLDDL 211 (328)
Q Consensus 176 ~w~v~~~~~~~~e~f~~~-~~~iVYLSpDS~~~L~el 211 (328)
- +....-...++.+.. ...+|.+|-+|-+.|+++
T Consensus 234 v--~~v~~LNv~dLapggh~gr~vI~TesA~~~Lee~ 268 (405)
T PLN00185 234 L--CSVDRLNLLQLAPGGHLGRFVIWTKSAFEKLDSI 268 (405)
T ss_pred E--EecCCccHHHHhccCCCCCEEEEEhHHHHHHHHH
Confidence 2 233344456666522 256899999998888763
No 47
>PTZ00266 NIMA-related protein kinase; Provisional
Probab=26.08 E-value=9.9e+02 Score=27.65 Aligned_cols=8 Identities=13% Similarity=0.106 Sum_probs=4.4
Q ss_pred CCCCCHHH
Q 020292 23 PSGLSKTA 30 (328)
Q Consensus 23 ~~~lSK~q 30 (328)
...++|.+
T Consensus 427 g~r~eke~ 434 (1021)
T PTZ00266 427 GGRVDKDH 434 (1021)
T ss_pred ccccchhH
Confidence 34566654
No 48
>PF13854 Kelch_5: Kelch motif
Probab=25.82 E-value=33 Score=23.08 Aligned_cols=10 Identities=60% Similarity=1.089 Sum_probs=9.3
Q ss_pred CceEEEeeee
Q 020292 214 NKIYIVGGLV 223 (328)
Q Consensus 214 ~~vYIIGGiV 223 (328)
+.+||+||..
T Consensus 15 ~~iyi~GG~~ 24 (42)
T PF13854_consen 15 NNIYIFGGYS 24 (42)
T ss_pred CEEEEEcCcc
Confidence 7899999998
No 49
>COG5055 RAD52 Recombination DNA repair protein (RAD52 pathway) [DNA replication, recombination, and repair]
Probab=25.18 E-value=1e+02 Score=30.65 Aligned_cols=113 Identities=19% Similarity=0.207 Sum_probs=59.1
Q ss_pred cccCHHHHHHHHHHHHHHHHhhcCCCCCc-eEEEecCCcchHHHHhcCCCCCccceeccc--chhhhhhhccCCcEEEec
Q 020292 125 HLMSRAEIQSLVQQIMYCYAVNRKCPSPA-HLWLTGCKGDMESQLQRLPGFDKWIIEKEN--RSYIEALEDHKENLVYLT 201 (328)
Q Consensus 125 ~lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~-~L~lt~~~~~l~~~l~~~~g~~~w~v~~~~--~~~~e~f~~~~~~iVYLS 201 (328)
.....+=..+|.++|.--|-.+|--.--. --+|-|+. +-+.-..+-||+.|...+-+ -+|.+.|... .=-|-+|
T Consensus 23 ~~r~~~lqs~L~rklgpeYis~R~G~g~~~iayIegw~--~I~lANeiFGfnGWss~I~sv~id~~ee~~e~-k~svg~s 99 (375)
T COG5055 23 VQRIGKLQSKLERKLGPEYISRRSGFGGSSIAYIEGWK--AIELANEIFGFNGWSSEIRSVEIDYCEEFEEK-KFSVGAS 99 (375)
T ss_pred hhHHHHHHHHHHHHhccHhhccccCCCCCceeeechhH--HHHHHHHhhCcCccccceeeeeeecccccccc-ceeeeeE
Confidence 33444445667777777777665432221 12222221 22223455799999643322 2344444321 1124444
Q ss_pred CCCccccCc--CCCCceEEEeeeecCCCChhhHHHHHHHcCCcc
Q 020292 202 ADSDTVLDD--LDPNKIYIVGGLVDRNRWKGITMKKAQEQGLQT 243 (328)
Q Consensus 202 pDS~~~L~e--ld~~~vYIIGGiVDrnr~Kglt~~kA~~~GI~t 243 (328)
+--.-+|.+ +..+ |==|-|+-.|.|+.||++|++.|+.-
T Consensus 100 aiVRVTLKDGty~Ed---iGyGsien~r~ka~ayek~KKEavtD 140 (375)
T COG5055 100 AIVRVTLKDGTYRED---IGYGSIENCRRKAEAYEKAKKEAVTD 140 (375)
T ss_pred EEEEEEecCCccccc---cccceeecccccHHHHHHHHhhhhHH
Confidence 444444442 1122 11267787889999999999877643
No 50
>PRK13679 hypothetical protein; Provisional
Probab=24.81 E-value=1.1e+02 Score=26.57 Aligned_cols=77 Identities=13% Similarity=0.277 Sum_probs=44.6
Q ss_pred HHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCc-------chHHHHhcC-CCCCccceecccchhhhhhhccCCcEEEe
Q 020292 129 RAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKG-------DMESQLQRL-PGFDKWIIEKENRSYIEALEDHKENLVYL 200 (328)
Q Consensus 129 ~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~-------~l~~~l~~~-~g~~~w~v~~~~~~~~e~f~~~~~~iVYL 200 (328)
+.++.....+++..|....+. .|.||+|...+. .+.+.+..+ .++....+.+.. ...|+. ...+|||
T Consensus 10 p~~~~~~l~~~~~~~~~~~~~-v~pHITL~f~g~~~~~~~~~l~~~l~~~~~~~~pf~l~l~~---~~~F~~-~~~vl~l 84 (168)
T PRK13679 10 SKKIQDFANSYRKRYDPHYAL-IPPHITLKEPFEISDEQLDSIVEELRAIASETKPFTLHVTK---VSSFAP-TNNVIYF 84 (168)
T ss_pred CHHHHHHHHHHHHhhCccccc-CCCceEEecCCCCCHHHHHHHHHHHHHHHhcCCCEEEEEec---cccCCC-CCCEEEE
Confidence 456888888888888766443 566888876432 123333332 123333343322 234542 3579999
Q ss_pred cCCCccccCc
Q 020292 201 TADSDTVLDD 210 (328)
Q Consensus 201 SpDS~~~L~e 210 (328)
+++....|..
T Consensus 85 ~~~~~~~L~~ 94 (168)
T PRK13679 85 KVEKTEELEE 94 (168)
T ss_pred EccCCHHHHH
Confidence 9987666654
No 51
>PF13418 Kelch_4: Galactose oxidase, central domain; PDB: 2UVK_B.
Probab=24.66 E-value=39 Score=22.96 Aligned_cols=15 Identities=33% Similarity=0.603 Sum_probs=8.9
Q ss_pred CCCceEEEeeeecCC
Q 020292 212 DPNKIYIVGGLVDRN 226 (328)
Q Consensus 212 d~~~vYIIGGiVDrn 226 (328)
..+.+||+||.-...
T Consensus 11 ~~~~i~v~GG~~~~~ 25 (49)
T PF13418_consen 11 GDNSIYVFGGRDSSG 25 (49)
T ss_dssp -TTEEEEE--EEE-T
T ss_pred eCCeEEEECCCCCCC
Confidence 457899999997753
No 52
>PF13812 PPR_3: Pentatricopeptide repeat domain
Probab=24.32 E-value=82 Score=19.04 Aligned_cols=19 Identities=21% Similarity=0.263 Sum_probs=13.7
Q ss_pred HHHHHHH-hhcCCCHHHHHH
Q 020292 265 VLEILLK-FLETRDWEASFF 283 (328)
Q Consensus 265 V~eILl~-~~e~~DW~~A~~ 283 (328)
.|++|+. +...|+|..|+.
T Consensus 3 ty~~ll~a~~~~g~~~~a~~ 22 (34)
T PF13812_consen 3 TYNALLRACAKAGDPDAALQ 22 (34)
T ss_pred HHHHHHHHHHHCCCHHHHHH
Confidence 4566666 567899999874
No 53
>PF11868 DUF3388: Protein of unknown function (DUF3388); InterPro: IPR024514 This domain is found in a family of bacterial proteins that are functionally uncharacterised. Proteins in this family are typically between 261 to 275 amino acids in length and have a N-terminal ACT domain.
Probab=23.91 E-value=52 Score=30.01 Aligned_cols=21 Identities=33% Similarity=0.667 Sum_probs=17.1
Q ss_pred cCcCCCCceEEEeeeecCCCC
Q 020292 208 LDDLDPNKIYIVGGLVDRNRW 228 (328)
Q Consensus 208 L~eld~~~vYIIGGiVDrnr~ 228 (328)
.+++.++.||||-|||-..|.
T Consensus 100 ~dE~~~~~ifIIDGivSt~r~ 120 (192)
T PF11868_consen 100 EDEYNENNIFIIDGIVSTRRS 120 (192)
T ss_pred hcccCcCcEEEEeeeeeeccC
Confidence 356789999999999987653
No 54
>cd07043 STAS_anti-anti-sigma_factors Sulphate Transporter and Anti-Sigma factor antagonist) domain of anti-anti-sigma factors, key regulators of anti-sigma factors by phosphorylation. Anti-anti-sigma factors play an important role in the regulation of several sigma factors and their corresponding anti-sigma factors. Upon dephosphorylation they bind the anti-sigma factor and induce the release of the sigma factor from the anti-sigma factor. In a feedback mechanism the anti-anti-sigma factor can be inactivated via phosphorylation by the anti-sigma factor. Well studied examples from Bacillus subtilis are SpoIIAA (regulating sigmaF and sigmaC which play an important role in sporulation) and RsbV (regulating sigmaB involved in the general stress response). The STAS domain is also found in the C- terminal region of sulphate transporters and stressosomes.
Probab=23.53 E-value=3.2e+02 Score=20.42 Aligned_cols=52 Identities=13% Similarity=0.169 Sum_probs=37.8
Q ss_pred CCcEEEecCCCcccCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCcchHHHHhcC
Q 020292 114 GQNIIIDLEFSHLMSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKGDMESQLQRL 171 (328)
Q Consensus 114 ~~~IvIDcsf~~lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~~l~~~l~~~ 171 (328)
...|||||+--..|.---+.-|..-+..+-. ...++.|+++++.+.+.|...
T Consensus 38 ~~~viid~~~v~~iDs~g~~~L~~l~~~~~~------~g~~v~i~~~~~~~~~~l~~~ 89 (99)
T cd07043 38 PRRLVLDLSGVTFIDSSGLGVLLGAYKRARA------AGGRLVLVNVSPAVRRVLELT 89 (99)
T ss_pred CCEEEEECCCCCEEcchhHHHHHHHHHHHHH------cCCeEEEEcCCHHHHHHHHHh
Confidence 4689999999998887766655544333322 356799999999888888754
No 55
>PF11208 DUF2992: Protein of unknown function (DUF2992); InterPro: IPR016787 There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.
Probab=23.47 E-value=2.8e+02 Score=24.01 Aligned_cols=24 Identities=42% Similarity=0.574 Sum_probs=11.6
Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHH
Q 020292 39 RYEARKAEKKAKMKVEKKREGERK 62 (328)
Q Consensus 39 ~we~~k~~rk~~~kekkk~~kerk 62 (328)
.+|+.|.+++...|+++.+.++++
T Consensus 94 q~E~~K~~rk~~~k~~re~~k~~k 117 (132)
T PF11208_consen 94 QREQRKKERKKRSKEQREAEKERK 117 (132)
T ss_pred HHHHHHHHHHHHHHHHHHHHHHHH
Confidence 445555555555555444444333
No 56
>PF06117 DUF957: Enterobacterial protein of unknown function (DUF957); InterPro: IPR009301 This family consists of several hypothetical proteins from Escherichia coli, Salmonella typhi, Shigella flexneri and Proteus vulgaris. The function of this family is unknown.
Probab=22.92 E-value=61 Score=24.85 Aligned_cols=18 Identities=50% Similarity=0.818 Sum_probs=15.7
Q ss_pred cccchHHHHHHHHHhhcC
Q 020292 258 QVLTVNQVLEILLKFLET 275 (328)
Q Consensus 258 ~VLtiN~V~eILl~~~e~ 275 (328)
+.||+-++++||+.|++.
T Consensus 2 ~~lt~~~~L~iLi~WLed 19 (65)
T PF06117_consen 2 QILTTEEALEILIAWLED 19 (65)
T ss_pred ccchHHHHHHHHHHHHHc
Confidence 368999999999999874
No 57
>KOG4819 consensus Uncharacterized conserved protein [Function unknown]
Probab=22.55 E-value=2.2e+02 Score=23.75 Aligned_cols=8 Identities=38% Similarity=0.584 Sum_probs=3.2
Q ss_pred HHHHHHHH
Q 020292 58 EGERKRRE 65 (328)
Q Consensus 58 ~kerkr~e 65 (328)
.+|+.|+.
T Consensus 52 Ere~~r~~ 59 (106)
T KOG4819|consen 52 EREKVRAD 59 (106)
T ss_pred HHHHHHHh
Confidence 34444443
No 58
>PTZ00428 60S ribosomal protein L4; Provisional
Probab=22.48 E-value=1e+02 Score=31.33 Aligned_cols=93 Identities=11% Similarity=0.153 Sum_probs=59.6
Q ss_pred cEEEecCCCcccCHHHHHHHHHHHHH------HHH-----------hhcCCCCCc-eEEEecCCcchHHHHhcCCCCCcc
Q 020292 116 NIIIDLEFSHLMSRAEIQSLVQQIMY------CYA-----------VNRKCPSPA-HLWLTGCKGDMESQLQRLPGFDKW 177 (328)
Q Consensus 116 ~IvIDcsf~~lM~~kEi~sL~~Qi~~------~Ys-----------~NRr~~~P~-~L~lt~~~~~l~~~l~~~~g~~~w 177 (328)
+||||=.++..-.-++...+.+.|.. +|. -||++..+- .|.|++-+..+...+.+++|.+-
T Consensus 150 plVV~d~~e~~~KTK~av~~Lk~lg~~~d~~k~~~s~~~R~gkGk~R~rr~~~~~g~LIV~~~d~~l~~A~RNIpgV~v- 228 (381)
T PTZ00428 150 PLVVSDSVESYEKTKEAVAFLKALGAFDDVNRVNDSKKIRAGKGKMRNRRYVMRRGPLVVYANDNGVTKAFRNIPGVDL- 228 (381)
T ss_pred CEEEEcCcCCCCCHHHHHHHHHHcCCcccchhhhcccccccccccccccccccCCceEEEEcCCcchhhhhcCCCCcEE-
Confidence 45555558877788888888887751 222 267776665 57776656566667777776542
Q ss_pred ceecccchhhhhhhccC-CcEEEecCCCccccCc
Q 020292 178 IIEKENRSYIEALEDHK-ENLVYLTADSDTVLDD 210 (328)
Q Consensus 178 ~v~~~~~~~~e~f~~~~-~~iVYLSpDS~~~L~e 210 (328)
+....-...++.+... .+.|.+|-.|-..|++
T Consensus 229 -~~v~~LNv~dLapggh~gr~vI~TesA~~~L~~ 261 (381)
T PTZ00428 229 -CNVTRLNLLQLAPGGHVGRFIIWTKSAFKKLDK 261 (381)
T ss_pred -EecCCccHHHHhccCCCCCEEEEEhHHHHHHHH
Confidence 2333444556665222 5689999888888876
No 59
>PRK05339 PEP synthetase regulatory protein; Provisional
Probab=22.37 E-value=86 Score=30.38 Aligned_cols=45 Identities=20% Similarity=0.346 Sum_probs=38.3
Q ss_pred ccCcCCCCceEEEeeeecCCCChhhHHHHHHHcCCccccccccccccc
Q 020292 207 VLDDLDPNKIYIVGGLVDRNRWKGITMKKAQEQGLQTGKLPIGNYLKM 254 (328)
Q Consensus 207 ~L~eld~~~vYIIGGiVDrnr~Kglt~~kA~~~GI~taRLPI~eyi~l 254 (328)
-...++..++-+|| |=|....-+|.-.|. .|+++|-+||-.-+.+
T Consensus 137 ~~~~l~~ADIiLvG--VSRtsKTPlS~YLA~-~G~KvAN~PLvpe~~l 181 (269)
T PRK05339 137 DPRGLDEADVILVG--VSRTSKTPTSLYLAN-KGIKAANYPLVPEVPL 181 (269)
T ss_pred CcCCcccCCEEEEC--cCCCCCcHHHHHHHc-cCCceEeeCCCCCCCC
Confidence 34567788899999 889999999999999 9999999999765543
No 60
>PF03618 Kinase-PPPase: Kinase/pyrophosphorylase; InterPro: IPR005177 This entry represents a family of uncharacterised proteins which are predicted to function as phosphotransferases.; GO: 0005524 ATP binding, 0016772 transferase activity, transferring phosphorus-containing groups
Probab=21.97 E-value=1e+02 Score=29.65 Aligned_cols=44 Identities=25% Similarity=0.366 Sum_probs=37.5
Q ss_pred ccCcCCCCceEEEeeeecCCCChhhHHHHHHHcCCcccccccccccc
Q 020292 207 VLDDLDPNKIYIVGGLVDRNRWKGITMKKAQEQGLQTGKLPIGNYLK 253 (328)
Q Consensus 207 ~L~eld~~~vYIIGGiVDrnr~Kglt~~kA~~~GI~taRLPI~eyi~ 253 (328)
....|+..++-+|| |=|....-+|.-.|. .|+++|-+||-.-+.
T Consensus 131 ~~~~l~~ADivLvG--VSRtsKTPlS~YLA~-~G~KvAN~PLvpe~~ 174 (255)
T PF03618_consen 131 NPRGLDEADIVLVG--VSRTSKTPLSMYLAN-KGYKVANVPLVPEVP 174 (255)
T ss_pred CccccccCCEEEEc--ccccCCCchhHHHHh-cCcceeecCcCCCCC
Confidence 34567788899998 889989999999999 999999999976554
No 61
>PF07358 DUF1482: Protein of unknown function (DUF1482); InterPro: IPR009954 This family consists of several Enterobacterial proteins of around 60 residues in length. The function of this family is unknown.
Probab=21.77 E-value=1.1e+02 Score=22.99 Aligned_cols=26 Identities=15% Similarity=0.221 Sum_probs=23.4
Q ss_pred ChhhHHHHHHHcCCcccccccccccc
Q 020292 228 WKGITMKKAQEQGLQTGKLPIGNYLK 253 (328)
Q Consensus 228 ~Kglt~~kA~~~GI~taRLPI~eyi~ 253 (328)
..-.|...|.++.|+.-=+|++++|.
T Consensus 25 se~~C~~a~~eQki~g~CyPve~~i~ 50 (57)
T PF07358_consen 25 SEQQCLAAADEQKIPGNCYPVEKVIH 50 (57)
T ss_pred CHHHHHHHHHhccCCCCceehhhccc
Confidence 46789999999999999999999886
No 62
>KOG2357 consensus Uncharacterized conserved protein [Function unknown]
Probab=21.45 E-value=5.5e+02 Score=26.66 Aligned_cols=52 Identities=25% Similarity=0.111 Sum_probs=25.9
Q ss_pred CCCHHHHHHHHHHHHHHHHH--HHHHHHHHHHHHHHHHHHH-HHHHHHHhcccHH
Q 020292 25 GLSKTAQKKRLKQLRYEARK--AEKKAKMKVEKKREGERKR-REWEEKLASLSEE 76 (328)
Q Consensus 25 ~lSK~q~Kkl~k~~~we~~k--~~rk~~~kekkk~~kerkr-~e~~~~la~~~~e 76 (328)
.||+-..+|.-+..+|.+.. ..-.+.|-|--+++++.++ ++.++..|++++|
T Consensus 359 ~lS~~~k~kt~~~RQ~~~e~~~K~th~~rqEaaQ~kk~Ek~Ka~kekl~a~~d~E 413 (440)
T KOG2357|consen 359 FLSKDAKAKTDKNRQRVEEEFLKLTHAARQEAAQEKKAEKKKAEKEKLKASGDPE 413 (440)
T ss_pred hchHHHHhhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcCCHH
Confidence 46777777766655554432 3333333333333333232 3445566667765
No 63
>PRK08154 anaerobic benzoate catabolism transcriptional regulator; Reviewed
Probab=21.40 E-value=1.8e+02 Score=28.05 Aligned_cols=44 Identities=16% Similarity=0.195 Sum_probs=36.9
Q ss_pred CcccCHHHHHHHHHHHHHHHHhhcCCCCCceEEEecCCcchHHH
Q 020292 124 SHLMSRAEIQSLVQQIMYCYAVNRKCPSPAHLWLTGCKGDMESQ 167 (328)
Q Consensus 124 ~~lM~~kEi~sL~~Qi~~~Ys~NRr~~~P~~L~lt~~~~~l~~~ 167 (328)
.+-|++.++..+...|...++.+|+...+..+.|+|+.|..+..
T Consensus 105 l~~l~~~~~~~~~~~l~~~~~~~~~~~~~~~I~l~G~~GsGKSt 148 (309)
T PRK08154 105 LEQASPAQLARVRDALSGMLGAGRRAARRRRIALIGLRGAGKST 148 (309)
T ss_pred HhcCCHHHHHHHHHHHHHHHhhhhhccCCCEEEEECCCCCCHHH
Confidence 34588888888888888999999999999999999998765443
No 64
>PRK11346 hypothetical protein; Provisional
Probab=21.27 E-value=41 Score=32.85 Aligned_cols=90 Identities=18% Similarity=0.250 Sum_probs=48.6
Q ss_pred hccCCcEEEecCC----Ccc-ccCcCCCCceEEEeeeecCCCChhhHHHHHHHcCCcccccccccccccCCCcccchHHH
Q 020292 191 EDHKENLVYLTAD----SDT-VLDDLDPNKIYIVGGLVDRNRWKGITMKKAQEQGLQTGKLPIGNYLKMSSSQVLTVNQV 265 (328)
Q Consensus 191 ~~~~~~iVYLSpD----S~~-~L~eld~~~vYIIGGiVDrnr~Kglt~~kA~~~GI~taRLPI~eyi~l~~~~VLtiN~V 265 (328)
+..|=.+..+++. +.. +.+-+-...+|||||++=-+.++-..+..-. ..-...+.+....--+|.
T Consensus 167 d~nKI~lF~I~~~~~~~d~~~l~di~~~n~~~I~GG~i~~~~e~w~~F~~lv----------~~~~~~ll~~~ivDDDQ~ 236 (285)
T PRK11346 167 DENKMHLFTIKKGLTVTSQQQVFDFMIGNHVYIIGGAIVGSQHKWKEFYKLV----------LESQKITLNNNIVDDDQG 236 (285)
T ss_pred CCCceEEEEecCCCcccchhhHHHHHhcChheEEcceEEecHHHHHHHHHHH----------HHHHHHHHhcCCcccchh
Confidence 3346668888884 444 4444678899999999866554433221110 000011112222233444
Q ss_pred HHHHHH----------hhcCCCHHHHHHhcCCCCc
Q 020292 266 LEILLK----------FLETRDWEASFFQVIPQRK 290 (328)
Q Consensus 266 ~eILl~----------~~e~~DW~~A~~~vIP~RK 290 (328)
+-+|+- |+-.++|-.||..-.|+-.
T Consensus 237 i~lmcy~~~P~iF~l~yl~~~~WFd~f~~F~~~~i 271 (285)
T PRK11346 237 IFVMCYYKRPDLFNLNYLGRGKWFDLFRCFRSNTL 271 (285)
T ss_pred hhhhhhhhCCceEEEeecccccHHHHHHHhccchH
Confidence 444442 2234789999988766543
No 65
>KOG4364 consensus Chromatin assembly factor-I [Chromatin structure and dynamics]
Probab=20.48 E-value=3.4e+02 Score=29.91 Aligned_cols=123 Identities=17% Similarity=0.128 Sum_probs=0.0
Q ss_pred CHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH--HHHHHHhcccHHHHHHHHHHHHHHHHHHHHhhhHHHHHHH
Q 020292 27 SKTAQKKRLKQLRYEARKAEKKAKMKVEKKREGERKRR--EWEEKLASLSEEERSKLIEERKGQRKERMEKRSEEREHKI 104 (328)
Q Consensus 27 SK~q~Kkl~k~~~we~~k~~rk~~~kekkk~~kerkr~--e~~~~la~~~~e~~~~~~~~r~~~~ke~~~~~~~e~~~~~ 104 (328)
++...||+.| +.-.+.|..+++.++.+|...++.+++ +..+.+..-.+|+|...-..+......+.....++--.=.
T Consensus 301 lekd~KKqqk-ekEkeEKrrKdE~Ek~kKqeek~KR~k~~Erkee~~rk~deerkK~e~ke~ea~E~rkkr~~aei~Kff 379 (811)
T KOG4364|consen 301 LEKDIKKQQK-EKEKEEKRRKDEQEKLKKQEEKQKRAKIMERKEEKSRKSDEERKKLESKEVEAQELRKKRHEAEIGKFF 379 (811)
T ss_pred HHHHHHHHHH-HHHHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhhhhhHHHHHHHHHHHHHHHHHhhh
Q ss_pred HHHHHhhhcCCcEEEecCCCcccCHHHHHHHHHHHHHHHHhhcCCC
Q 020292 105 QRLTKAKENGQNIIIDLEFSHLMSRAEIQSLVQQIMYCYAVNRKCP 150 (328)
Q Consensus 105 ~rl~~a~~~~~~IvIDcsf~~lM~~kEi~sL~~Qi~~~Ys~NRr~~ 150 (328)
+.+.......-..-.+|..-.-.--++--.|+.=+.-.+.-.+++.
T Consensus 380 qk~~~k~~~~~~a~~s~~~f~pFeikd~M~lapi~s~~~~~~~rsq 425 (811)
T KOG4364|consen 380 QKIDNKFSTTCEATVSDIRFEPFEIKDQMGLAPISSKKHWGMRRSQ 425 (811)
T ss_pred cccccccCCcccccccccccccchhhccccccceehhccchhhhhh
No 66
>PF01696 Adeno_E1B_55K: Adenovirus EB1 55K protein / large t-antigen; InterPro: IPR002612 This family consists of adenovirus E1B 55 kDa protein or large t-antigen. E1B 55 kDa binds p53 the tumor suppressor protein converting it from a transcriptional activator which responds to damaged DNA in to an unregulated repressor of genes with a p53 binding site []. This protects the virus against p53 induced host antiviral responses and prevents apoptosis as induced by the adenovirus E1A protein []. The E1B region of adenovirus encodes two proteins E1B 55 kDa, the large t-antigen as found in this family and E1B 19 kDa IPR002924 from INTERPRO, the small t-antigen. Both of these proteins inhibit E1A induced apoptosis.
Probab=20.15 E-value=72 Score=32.51 Aligned_cols=35 Identities=23% Similarity=0.247 Sum_probs=26.4
Q ss_pred chhhhhhhccCCcEEEecCCCccccCc--CCCCceEEEe
Q 020292 184 RSYIEALEDHKENLVYLTADSDTVLDD--LDPNKIYIVG 220 (328)
Q Consensus 184 ~~~~e~f~~~~~~iVYLSpDS~~~L~e--ld~~~vYIIG 220 (328)
+++.+.+. ..--|||-||...++.. .-...+||||
T Consensus 55 eDle~~I~--~haKVaL~Pg~~Y~i~~~V~I~~~cYIiG 91 (386)
T PF01696_consen 55 EDLEEAIR--QHAKVALRPGAVYVIRKPVNIRSCCYIIG 91 (386)
T ss_pred cCHHHHHH--hcCEEEeCCCCEEEEeeeEEecceEEEEC
Confidence 35555554 44569999999999975 4578999996
No 67
>KOG1144 consensus Translation initiation factor 5B (eIF-5B) [Translation, ribosomal structure and biogenesis]
Probab=20.10 E-value=1.3e+03 Score=26.25 Aligned_cols=14 Identities=36% Similarity=0.394 Sum_probs=7.7
Q ss_pred eEEEeeeecCCCCh
Q 020292 216 IYIVGGLVDRNRWK 229 (328)
Q Consensus 216 vYIIGGiVDrnr~K 229 (328)
|.-|=|-||....|
T Consensus 477 IcCilGHVDTGKTK 490 (1064)
T KOG1144|consen 477 ICCILGHVDTGKTK 490 (1064)
T ss_pred eEEEeecccccchH
Confidence 44455667765444
Done!