Query 010326
Match_columns 513
No_of_seqs 344 out of 1747
Neff 5.9
Searched_HMMs 46136
Date Thu Mar 28 23:13:35 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/010326.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/010326hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PF12315 DUF3633: Protein of u 100.0 2.8E-69 6.1E-74 513.2 13.8 194 311-512 1-194 (212)
2 KOG2272 Focal adhesion protein 99.9 1.2E-26 2.6E-31 224.8 -1.9 179 146-361 59-253 (332)
3 KOG1703 Adaptor protein Enigma 99.9 1.4E-24 2.9E-29 235.1 8.7 322 157-512 132-456 (479)
4 KOG1701 Focal adhesion adaptor 99.8 1.1E-22 2.5E-27 209.7 -1.9 166 156-353 271-438 (468)
5 KOG4577 Transcription factor L 99.8 4.6E-22 9.9E-27 196.0 -4.7 124 157-292 31-154 (383)
6 KOG1701 Focal adhesion adaptor 99.7 1.3E-19 2.8E-24 187.3 -0.3 136 145-292 320-463 (468)
7 KOG2272 Focal adhesion protein 99.6 2.4E-17 5.1E-22 160.5 -0.9 133 143-290 178-311 (332)
8 KOG1044 Actin-binding LIM Zn-f 99.6 3.1E-16 6.8E-21 167.1 2.1 118 157-289 131-248 (670)
9 KOG1703 Adaptor protein Enigma 99.5 3E-15 6.4E-20 162.6 5.3 132 158-306 302-433 (479)
10 PF00412 LIM: LIM domain; Int 99.2 3.8E-12 8.3E-17 98.6 3.2 57 162-218 1-58 (58)
11 KOG1044 Actin-binding LIM Zn-f 99.1 7.2E-11 1.6E-15 126.6 5.5 158 160-358 17-188 (670)
12 KOG1700 Regulatory protein MLP 98.7 4.7E-09 1E-13 102.1 0.4 134 158-293 6-168 (200)
13 PF00412 LIM: LIM domain; Int 98.4 2.1E-07 4.6E-12 71.9 3.2 57 222-291 1-57 (58)
14 smart00132 LIM Zinc-binding do 98.3 7.1E-07 1.5E-11 62.9 3.0 37 161-197 1-38 (39)
15 KOG4577 Transcription factor L 97.7 5.6E-06 1.2E-10 83.1 -1.4 81 135-217 69-153 (383)
16 KOG1702 Nebulin repeat protein 97.4 2E-05 4.2E-10 76.0 -1.7 59 160-218 5-63 (264)
17 KOG0490 Transcription factor, 97.2 6.3E-05 1.4E-09 73.5 -1.4 114 164-290 1-118 (235)
18 KOG1700 Regulatory protein MLP 97.1 0.00016 3.5E-09 70.5 0.5 63 156-218 105-167 (200)
19 smart00132 LIM Zinc-binding do 96.8 0.00083 1.8E-08 47.0 2.5 38 221-269 1-38 (39)
20 PF13485 Peptidase_MA_2: Pepti 93.4 0.087 1.9E-06 45.5 3.7 44 397-440 19-64 (128)
21 smart00726 UIM Ubiquitin-inter 86.5 0.56 1.2E-05 31.1 2.0 21 40-60 1-21 (26)
22 PF02809 UIM: Ubiquitin intera 83.1 0.57 1.2E-05 28.4 0.8 16 40-55 2-17 (18)
23 PF00595 PDZ: PDZ domain (Also 76.8 0.31 6.6E-06 39.8 -2.5 31 367-397 43-73 (81)
24 PF04450 BSP: Peptidase of pla 76.8 1.6 3.4E-05 43.0 2.1 38 401-439 94-131 (205)
25 PF01433 Peptidase_M1: Peptida 72.0 2 4.4E-05 45.0 1.6 43 402-445 294-339 (390)
26 PF10026 DUF2268: Predicted Zn 69.9 4 8.7E-05 39.5 3.0 44 402-445 64-113 (195)
27 TIGR02412 pepN_strep_liv amino 63.8 4.2 9.1E-05 47.9 2.2 40 404-443 288-329 (831)
28 KOG0320 Predicted E3 ubiquitin 63.0 3.1 6.6E-05 40.1 0.7 49 184-232 129-180 (187)
29 PF10460 Peptidase_M30: Peptid 62.8 5.4 0.00012 42.6 2.5 43 402-444 138-186 (366)
30 KOG3549 Syntrophins (type gamm 60.8 5.7 0.00012 42.0 2.2 35 368-402 100-134 (505)
31 PF06114 DUF955: Domain of unk 60.7 5.2 0.00011 33.9 1.7 52 390-441 29-86 (122)
32 PF14835 zf-RING_6: zf-RING of 60.5 8.8 0.00019 31.1 2.7 47 187-233 8-54 (65)
33 PF10367 Vps39_2: Vacuolar sor 58.6 41 0.0009 28.4 7.0 30 158-187 77-107 (109)
34 PRK14873 primosome assembly pr 58.2 7.1 0.00015 45.0 2.7 37 188-227 394-430 (665)
35 PTZ00415 transmission-blocking 56.9 3.8 8.2E-05 50.9 0.2 53 46-100 133-187 (2849)
36 KOG2199 Signal transducing ada 56.6 6.6 0.00014 42.2 1.9 27 37-63 161-187 (462)
37 PF05572 Peptidase_M43: Pregna 56.2 6.7 0.00015 36.7 1.7 48 372-420 33-85 (154)
38 COG2856 Predicted Zn peptidase 55.5 19 0.00041 35.7 4.8 55 390-444 59-120 (213)
39 PRK14559 putative protein seri 53.9 13 0.00028 42.7 3.8 11 160-170 2-12 (645)
40 PF10367 Vps39_2: Vacuolar sor 53.8 9.3 0.0002 32.5 2.1 29 187-215 79-108 (109)
41 KOG1702 Nebulin repeat protein 51.9 4.7 0.0001 39.7 -0.1 59 220-292 5-63 (264)
42 PRK14890 putative Zn-ribbon RN 51.7 12 0.00026 29.8 2.1 27 159-195 7-34 (59)
43 TIGR00595 priA primosomal prot 51.6 15 0.00033 40.8 3.9 48 159-227 213-261 (505)
44 PF13699 DUF4157: Domain of un 51.2 6.4 0.00014 32.8 0.6 16 405-420 63-78 (79)
45 smart00504 Ubox Modified RING 51.1 11 0.00023 29.0 1.9 45 187-232 2-48 (63)
46 PF10263 SprT-like: SprT-like 50.8 7.9 0.00017 35.5 1.3 23 398-420 55-77 (157)
47 PF14891 Peptidase_M91: Effect 50.7 7.3 0.00016 37.0 1.1 22 398-422 101-122 (174)
48 COG4357 Zinc finger domain con 50.6 2.7 5.9E-05 36.5 -1.7 50 161-210 37-86 (105)
49 KOG0320 Predicted E3 ubiquitin 50.6 6.6 0.00014 37.8 0.7 49 256-305 129-177 (187)
50 PHA02456 zinc metallopeptidase 50.5 6.1 0.00013 35.3 0.4 36 404-445 80-116 (141)
51 PF05299 Peptidase_M61: M61 gl 50.3 7.5 0.00016 35.2 1.0 40 404-443 5-57 (122)
52 PF00645 zf-PARP: Poly(ADP-rib 47.9 3.1 6.7E-05 34.4 -1.8 19 215-233 3-21 (82)
53 PF04889 Cwf_Cwc_15: Cwf15/Cwc 47.0 45 0.00097 33.8 6.0 6 104-109 155-160 (244)
54 TIGR02411 leuko_A4_hydro leuko 46.9 10 0.00023 43.0 1.7 38 405-443 281-321 (601)
55 PF01431 Peptidase_M13: Peptid 46.1 6.8 0.00015 37.7 0.0 15 403-417 36-50 (206)
56 PRK04023 DNA polymerase II lar 46.1 23 0.00049 42.5 4.2 55 156-233 623-677 (1121)
57 cd00136 PDZ PDZ domain, also c 44.2 5.8 0.00013 30.9 -0.7 26 366-391 30-55 (70)
58 PRK12495 hypothetical protein; 43.7 55 0.0012 32.7 5.8 27 160-197 43-69 (226)
59 COG2191 Formylmethanofuran deh 42.8 11 0.00024 37.0 0.9 30 188-217 174-203 (206)
60 PF11781 RRN7: RNA polymerase 42.5 14 0.00029 26.4 1.1 25 187-215 9-33 (36)
61 KOG3605 Beta amyloid precursor 41.9 6.1 0.00013 44.8 -1.1 32 361-392 768-799 (829)
62 KOG3209 WW domain-containing p 41.4 4.7 0.0001 46.1 -2.1 37 363-399 938-974 (984)
63 COG1645 Uncharacterized Zn-fin 41.2 15 0.00032 33.9 1.4 22 188-214 30-51 (131)
64 COG1198 PriA Primosomal protei 40.8 29 0.00063 40.5 4.0 49 159-228 435-484 (730)
65 PF10083 DUF2321: Uncharacteri 40.7 12 0.00027 35.3 0.8 55 206-273 27-83 (158)
66 PF13920 zf-C3HC4_3: Zinc fing 40.6 15 0.00032 27.4 1.1 44 187-231 3-49 (50)
67 PF13240 zinc_ribbon_2: zinc-r 40.0 17 0.00037 23.3 1.2 8 162-169 2-9 (23)
68 COG2888 Predicted Zn-ribbon RN 39.8 15 0.00033 29.2 1.1 37 259-303 10-46 (61)
69 PHA03378 EBNA-3B; Provisional 38.1 19 0.00041 41.2 1.9 36 16-53 303-340 (991)
70 PF14471 DUF4428: Domain of un 37.3 16 0.00034 28.1 0.8 30 260-291 1-30 (51)
71 KOG1832 HIV-1 Vpr-binding prot 36.8 14 0.00031 43.5 0.8 24 65-88 1404-1427 (1516)
72 PF12773 DZR: Double zinc ribb 36.6 35 0.00075 25.3 2.6 8 282-289 30-37 (50)
73 KOG3552 FERM domain protein FR 35.8 9.9 0.00022 44.8 -0.7 25 365-389 90-114 (1298)
74 PHA02608 67 prohead core prote 35.8 28 0.00061 29.1 2.1 17 44-60 34-50 (80)
75 cd00992 PDZ_signaling PDZ doma 35.4 6.5 0.00014 31.4 -1.7 24 365-388 42-65 (82)
76 KOG1813 Predicted E3 ubiquitin 35.1 21 0.00044 37.1 1.5 45 187-232 242-288 (313)
77 PF09943 DUF2175: Uncharacteri 35.1 10 0.00022 33.3 -0.6 28 161-188 4-32 (101)
78 KOG0490 Transcription factor, 34.1 9.6 0.00021 37.0 -1.0 52 253-305 18-69 (235)
79 PRK14015 pepN aminopeptidase N 33.7 26 0.00057 41.7 2.3 40 403-443 296-338 (875)
80 KOG0978 E3 ubiquitin ligase in 33.5 10 0.00023 43.6 -1.0 46 187-233 644-692 (698)
81 PF08394 Arc_trans_TRASH: Arch 32.7 23 0.00049 25.6 1.0 29 162-193 1-30 (37)
82 smart00731 SprT SprT homologue 32.6 22 0.00048 32.7 1.2 21 400-420 56-76 (146)
83 PF14634 zf-RING_5: zinc-RING 32.3 36 0.00077 24.7 2.0 41 261-301 2-42 (44)
84 KOG0478 DNA replication licens 32.0 41 0.0009 38.9 3.3 23 366-402 342-364 (804)
85 PF09538 FYDLN_acid: Protein o 31.9 28 0.00061 30.9 1.6 33 242-290 3-35 (108)
86 PF07607 DUF1570: Protein of u 31.7 27 0.00059 31.9 1.6 32 405-436 3-38 (128)
87 PRK05580 primosome assembly pr 31.1 39 0.00085 39.0 3.1 11 405-415 556-566 (679)
88 TIGR02420 dksA RNA polymerase- 30.5 35 0.00076 30.0 2.0 30 158-193 79-108 (110)
89 PF10235 Cript: Microtubule-as 30.2 31 0.00068 29.7 1.6 37 187-231 45-81 (90)
90 KOG4739 Uncharacterized protei 30.0 34 0.00073 34.5 2.0 34 198-231 15-49 (233)
91 KOG3039 Uncharacterized conser 29.5 2.5E+02 0.0053 28.8 7.9 78 156-233 180-273 (303)
92 PF09768 Peptidase_M76: Peptid 29.4 20 0.00044 34.4 0.4 14 404-417 72-85 (173)
93 smart00504 Ubox Modified RING 29.1 27 0.0006 26.7 1.0 41 259-302 2-42 (63)
94 COG0308 PepN Aminopeptidase N 28.4 30 0.00064 41.2 1.6 43 401-445 305-351 (859)
95 PF12674 Zn_ribbon_2: Putative 28.3 28 0.00062 29.2 1.0 32 260-291 2-36 (81)
96 PF06677 Auto_anti-p27: Sjogre 27.6 41 0.00089 24.7 1.6 21 261-287 20-40 (41)
97 PF06827 zf-FPG_IleRS: Zinc fi 27.4 25 0.00055 23.5 0.5 13 219-231 1-13 (30)
98 COG1645 Uncharacterized Zn-fin 27.3 33 0.00072 31.5 1.3 24 260-290 30-53 (131)
99 PRK00420 hypothetical protein; 27.1 37 0.0008 30.4 1.6 23 187-213 24-46 (112)
100 cd00162 RING RING-finger (Real 26.6 28 0.00062 23.9 0.6 40 189-228 2-44 (45)
101 KOG1280 Uncharacterized conser 26.2 52 0.0011 35.0 2.6 16 252-267 73-88 (381)
102 COG5148 RPN10 26S proteasome r 26.2 24 0.00053 34.5 0.3 28 42-69 207-234 (243)
103 cd00989 PDZ_metalloprotease PD 26.0 40 0.00088 26.6 1.5 18 366-383 29-46 (79)
104 PRK00420 hypothetical protein; 26.0 42 0.00091 30.1 1.7 26 260-291 25-50 (112)
105 PHA03377 EBNA-3C; Provisional 25.9 45 0.00098 38.5 2.3 34 17-50 310-344 (1000)
106 PF04931 DNA_pol_phi: DNA poly 25.4 42 0.00091 39.4 2.1 8 126-133 739-746 (784)
107 PF01421 Reprolysin: Reprolysi 25.3 42 0.00091 32.1 1.7 25 390-414 118-142 (199)
108 COG2191 Formylmethanofuran deh 25.3 32 0.00068 33.9 0.9 31 259-291 173-203 (206)
109 KOG0957 PHD finger protein [Ge 25.3 72 0.0016 35.5 3.6 139 158-305 118-262 (707)
110 PRK14714 DNA polymerase II lar 25.2 57 0.0012 40.2 3.1 11 221-231 711-721 (1337)
111 TIGR03826 YvyF flagellar opero 25.2 34 0.00074 31.7 1.1 23 258-290 81-103 (137)
112 PRK14714 DNA polymerase II lar 24.7 98 0.0021 38.3 4.9 10 408-417 907-916 (1337)
113 KOG2462 C2H2-type Zn-finger pr 24.6 49 0.0011 34.1 2.1 11 259-269 216-226 (279)
114 PF13834 DUF4193: Domain of un 24.6 24 0.00051 30.9 -0.1 29 258-287 70-98 (99)
115 cd04270 ZnMc_TACE_like Zinc-de 24.1 31 0.00068 34.6 0.6 21 395-415 157-179 (244)
116 PF04502 DUF572: Family of unk 23.9 54 0.0012 34.5 2.3 17 178-197 35-51 (324)
117 PF10083 DUF2321: Uncharacteri 22.6 32 0.00068 32.6 0.3 52 159-230 28-79 (158)
118 cd00991 PDZ_archaeal_metallopr 22.6 51 0.0011 26.7 1.5 19 365-383 26-44 (79)
119 cd04267 ZnMc_ADAM_like Zinc-de 22.5 40 0.00087 31.9 1.0 24 390-414 121-144 (192)
120 PF14446 Prok-RING_1: Prokaryo 22.5 52 0.0011 25.8 1.4 13 159-171 5-17 (54)
121 cd00990 PDZ_glycyl_aminopeptid 22.5 48 0.0011 26.3 1.4 18 366-383 29-46 (80)
122 PLN03208 E3 ubiquitin-protein 22.4 73 0.0016 31.2 2.7 13 220-232 69-81 (193)
123 PF13180 PDZ_2: PDZ domain; PD 22.3 43 0.00094 27.2 1.0 18 366-383 31-48 (82)
124 PRK10778 dksA RNA polymerase-b 22.3 1.4E+02 0.0031 28.0 4.6 31 158-194 110-140 (151)
125 TIGR02414 pepN_proteo aminopep 21.9 55 0.0012 39.0 2.1 40 403-443 283-325 (863)
126 KOG0609 Calcium/calmodulin-dep 21.9 18 0.0004 40.3 -1.6 24 363-386 161-184 (542)
127 KOG1420 Ca2+-activated K+ chan 21.8 61 0.0013 36.8 2.3 11 485-495 47-57 (1103)
128 PF13923 zf-C3HC4_2: Zinc fing 21.6 55 0.0012 23.0 1.3 36 261-298 1-36 (39)
129 cd04275 ZnMc_pappalysin_like Z 21.6 33 0.00072 34.2 0.2 49 370-419 97-152 (225)
130 cd00988 PDZ_CTP_protease PDZ d 21.5 26 0.00057 28.2 -0.4 19 366-384 30-48 (85)
131 KOG4286 Dystrophin-like protei 21.3 32 0.0007 39.9 0.1 54 218-290 602-656 (966)
132 PF06750 DiS_P_DiS: Bacterial 21.3 40 0.00086 28.9 0.6 39 160-200 34-72 (92)
133 PF10235 Cript: Microtubule-as 20.8 40 0.00087 29.1 0.5 36 259-306 45-80 (90)
134 PF01435 Peptidase_M48: Peptid 20.8 37 0.0008 32.5 0.4 14 404-417 90-103 (226)
135 PF01447 Peptidase_M4: Thermol 20.7 39 0.00084 31.6 0.5 18 399-416 131-148 (150)
136 cd04269 ZnMc_adamalysin_II_lik 20.7 60 0.0013 30.8 1.8 24 392-415 120-143 (194)
137 PHA00527 hypothetical protein 20.5 2E+02 0.0042 25.7 4.7 63 373-440 47-113 (129)
138 PF12156 ATPase-cat_bd: Putati 20.4 53 0.0012 27.9 1.2 12 221-232 2-13 (88)
139 PRK04439 S-adenosylmethionine 20.2 73 0.0016 34.5 2.4 54 364-427 72-125 (399)
140 PF12388 Peptidase_M57: Dual-a 20.1 61 0.0013 32.2 1.7 39 376-424 112-150 (211)
141 PF02591 DUF164: Putative zinc 20.1 26 0.00056 27.0 -0.7 15 219-233 22-36 (56)
No 1
>PF12315 DUF3633: Protein of unknown function (DUF3633); InterPro: IPR022087 This domain family is found in bacteria and eukaryotes, and is approximately 210 amino acids in length. The family is found in association with PF00412 from PFAM.
Probab=100.00 E-value=2.8e-69 Score=513.24 Aligned_cols=194 Identities=78% Similarity=1.208 Sum_probs=183.2
Q ss_pred hcCCccccccceEEeehhhHHHhhcCCCCCccccccccCcccCccchhcccccccccCCCCeeeeeccccccccccceee
Q 010326 311 GLNMKVEQQVPLLLVERQALNEAMEGEKNGHHHLPETRGLCLSEEQTVTTVLRRPRIGAGYRLIDMITEPYRLIRRCEVT 390 (513)
Q Consensus 311 ~l~~~i~~~iPv~LVe~~aLn~a~e~e~~g~~~~~e~rGlclSee~~v~~~~~~~~~~~G~rilei~~~p~~~~~~~eV~ 390 (513)
+|||+++++|||+||+++|||++.+.|++|++|.++||||||||+|+|++|.++|++++|+++++|.++|+++++.|+|+
T Consensus 1 ~lnmki~q~~PllLVe~~aLN~a~~~Ek~~~~~~~~tRGLclseeq~v~sv~~~p~~~~~~~~~~~~~e~~~~~~~~eV~ 80 (212)
T PF12315_consen 1 GLNMKIEQEIPLLLVERQALNEAEEGEKIGHHHMPETRGLCLSEEQTVTSVLRRPRMGPGNQLIDMSTEPQRLTRGCEVT 80 (212)
T ss_pred CCCCcccCCCCeEEecHHHHHHHHhhccCCCCCCeeeeeeeeeeeEEEEEEEecCCcCCCCccceeeecceeeccceeEE
Confidence 58999999999999999999999999999999999999999999999999999999999999999999999999999999
Q ss_pred eEeeecCchhhhhhhhhhccchhhHhhhcCCCCCCCcchhhHHHHHHHHHhhccccCCCCCCcCCCCCCCCCCCCCCCCC
Q 010326 391 AILILYGLPRLLTGSILAHEMMHAWLRLKGYPNLRPDVEEGICQVLAHMWLESEIYSGSGSDVASSSSSSASSSSSSPSS 470 (513)
Q Consensus 391 ~Il~l~glP~~L~gsilaHE~~Hawl~l~g~~~L~~~~eEG~cq~~a~~wl~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 470 (513)
+|+|||||||+|||||||||+|||||||+|||+|+|+||||||||||||||+++++++.+. ++++|++++++
T Consensus 81 ~Ilvl~GLPrll~gsiLAHE~mHa~Lrl~g~~~L~~~vEEGiCqvla~~wL~~~~~~~~~~--------~~~~s~~~s~~ 152 (212)
T PF12315_consen 81 AILVLYGLPRLLTGSILAHELMHAWLRLNGFPNLSPEVEEGICQVLAYLWLESELASGSGS--------SSSSSSSSSSS 152 (212)
T ss_pred EEEEECCCCHHHHhhHHHHHHHHHHhcccCCCCCChHHHHHHHHHHHHHHHhhhhhcccCC--------cccccCCCCCC
Confidence 9999999999999999999999999999999999999999999999999999999987651 11345556677
Q ss_pred CCcCCcCCCcchHHHHHHHHHHhhhhcCCCCCCchhhhhhhc
Q 010326 471 SSTSSKKGKRSDFEKDLGKFFKHQIESDTSSAYGDGLGKVVR 512 (513)
Q Consensus 471 ~~~~~~~~~~~~~~~~l~~~~~~qi~~d~s~~YG~Gfr~~~~ 512 (513)
++++||||++++||+||++||+|||++|+|||||||||+|++
T Consensus 153 ~~~~skkg~~s~~E~kL~~f~~~qIe~D~SpvYGdGFRaa~~ 194 (212)
T PF12315_consen 153 ASSSSKKGAKSQFEKKLGEFFKHQIETDTSPVYGDGFRAANE 194 (212)
T ss_pred cccccccccccHHHHHHHHHHHHHhccCCCcccchHHHHHHH
Confidence 778899999999999999999999999999999999999985
No 2
>KOG2272 consensus Focal adhesion protein PINCH-1, contains LIM domains [Signal transduction mechanisms; Cytoskeleton]
Probab=99.91 E-value=1.2e-26 Score=224.82 Aligned_cols=179 Identities=26% Similarity=0.543 Sum_probs=158.1
Q ss_pred CCCcCCCCCC-CCCCCcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCcccccccccc-----ccC
Q 010326 146 SGNIFQPFPF-FSGYRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKE-----QHH 219 (513)
Q Consensus 146 ~gsv~~p~~~-~~g~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~-----~f~ 219 (513)
.|..||..++ ....|.|++|++.| .|++|.+++.+|||.||+|..|++.|.+..|+...|+.+|..|-.+ +..
T Consensus 59 EgRkYCEhDF~~LfaPcC~kC~EFi-iGrVikamnnSwHp~CF~Cd~Cn~~Lad~gf~rnqgr~LC~~Cn~k~Ka~~~g~ 137 (332)
T KOG2272|consen 59 EGRKYCEHDFHVLFAPCCGKCGEFI-IGRVIKAMNNSWHPACFRCDLCNKHLADQGFYRNQGRALCRECNQKEKAKGRGR 137 (332)
T ss_pred cCcccccccchhhhchhhcccccch-hhHHHHhhccccCcccchhHHHHHHHhhhhhHhhcchHHhhhhhhhhcccccce
Confidence 4778899888 78899999999999 6999999999999999999999999999999999999999999866 234
Q ss_pred cccccCCCCcCCCCccceeeecccccccccCCCcccCCCCccCCCCCcCCCCCceEEccCCcccccccccccccCCCCCc
Q 010326 220 PKCDVCQNFIPTNSAGLIEYRAHPFWLQKYCPSHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDSAIMDTHECQ 299 (513)
Q Consensus 220 pkC~~C~~~I~~~~~g~i~~~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v~~t~~C~ 299 (513)
..|+.|+..|... .+.+++.|| |+.+|.|..|++.|..+.+-+ .|.+||+.|+++ |..|+|.
T Consensus 138 YvC~KCh~~iD~~---~l~fr~d~y----------H~yHFkCt~C~keL~sdaRev---k~eLyClrChD~--mgipiCg 199 (332)
T KOG2272|consen 138 YVCQKCHAHIDEQ---PLTFRGDPY----------HPYHFKCTTCGKELTSDAREV---KGELYCLRCHDK--MGIPICG 199 (332)
T ss_pred eehhhhhhhcccc---cccccCCCC----------Cccceecccccccccchhhhh---ccceeccccccc--cCCcccc
Confidence 5799999999885 689999997 777899999999998777765 679999999999 8999999
Q ss_pred ccchHHHHHHhhcCCccccccceEEeehhhHHHhhc----------CCCCCccccccccCcccCccchhccc
Q 010326 300 PLYLEIQEFYEGLNMKVEQQVPLLLVERQALNEAME----------GEKNGHHHLPETRGLCLSEEQTVTTV 361 (513)
Q Consensus 300 ~c~~~I~~f~e~l~~~i~~~iPv~LVe~~aLn~a~e----------~e~~g~~~~~e~rGlclSee~~v~~~ 361 (513)
+|.++|. .++ +.||+++|+ +++.||.|| |.+|++||++||+..+
T Consensus 200 aC~rpIe-----------erv------i~amgKhWHveHFvCa~CekPFlGHrHY-EkkGlaYCe~h~~qLf 253 (332)
T KOG2272|consen 200 ACRRPIE-----------ERV------IFAMGKHWHVEHFVCAKCEKPFLGHRHY-EKKGLAYCETHYHQLF 253 (332)
T ss_pred cccCchH-----------HHH------HHHhccccchhheeehhcCCcccchhhh-hhcCchhHHHHHHHHh
Confidence 9998883 334 788889884 888999998 9999999999997653
No 3
>KOG1703 consensus Adaptor protein Enigma and related PDZ-LIM proteins [Signal transduction mechanisms; Cytoskeleton]
Probab=99.90 E-value=1.4e-24 Score=235.14 Aligned_cols=322 Identities=43% Similarity=0.686 Sum_probs=267.0
Q ss_pred CCCCcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCcccccccccc-ccCcccccCCCCcCCCCcc
Q 010326 157 SGYRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKE-QHHPKCDVCQNFIPTNSAG 235 (513)
Q Consensus 157 ~g~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~-~f~pkC~~C~~~I~~~~~g 235 (513)
.....|.+|.-.|..+..+ ||.|..|..++. .+...||.. .-...|.+|...|.....+
T Consensus 132 ~~~~~~~~~~~~~~~~~~~----------~~~~~~~~~p~~----------~~~~~~~~~~~~~~~~~v~~~~~~~~~~~ 191 (479)
T KOG1703|consen 132 PLDSICGGCNSAIEHGRSV----------CFQCKRCSEPLS----------GFPKPSYHESGRSKNEDVEEASSPSSRAG 191 (479)
T ss_pred cccccccCCCcccccccch----------hhhhcccccccC----------Ccccccccccccccccccccccccccccc
Confidence 3457899999999766555 899999988882 233344544 3567899999999988778
Q ss_pred ceeeecccccccccCCCcccCCCCccCCCCCcCCCCCceEEccCCcccccccccccccCCCCCcccchHHHHHHhhcCCc
Q 010326 236 LIEYRAHPFWLQKYCPSHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDSAIMDTHECQPLYLEIQEFYEGLNMK 315 (513)
Q Consensus 236 ~i~~~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v~~t~~C~~c~~~I~~f~e~l~~~ 315 (513)
.+.++.++||.++||+.|.++.+..|..|.+..+.+..|..+.+++.+|..|....+|+.+.|++....++.++.+..|.
T Consensus 192 ~~~~~~~~~~~~~~~~~~e~~~tp~~~~~~r~e~~~~~~~~l~~~~~~~~~~~~~~~~~~p~~~p~~~~~~~~~~~~~~~ 271 (479)
T KOG1703|consen 192 LILSRSHPFWKQKYCPSHENDGTPKCCSCERLEPLDTRYVELADGRALCLECMGSASMDSPECQPLVSAPRPASEGLHMK 271 (479)
T ss_pred ccccccchhhhhcccccccCCCCCCcccccccccccccceecccchhhhhhccCCcccCCCccCcceecccccccccccc
Confidence 88999999999999999999999999999999876888999989999999999888899999999999999999999999
Q ss_pred cccccceEEeehhhHHHhhcCCCCCccccccccCcccCccchhcccccccccCCCCeeeeeccccccccccceeeeEeee
Q 010326 316 VEQQVPLLLVERQALNEAMEGEKNGHHHLPETRGLCLSEEQTVTTVLRRPRIGAGYRLIDMITEPYRLIRRCEVTAILIL 395 (513)
Q Consensus 316 i~~~iPv~LVe~~aLn~a~e~e~~g~~~~~e~rGlclSee~~v~~~~~~~~~~~G~rilei~~~p~~~~~~~eV~~Il~l 395 (513)
+.+..++.|+++++++....+......|. .++++|.++.++++++ ..|..++++.-+....|++.++.++
T Consensus 272 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~p~c~~c~~~i~~~---------~~i~~~~~~~h~~~~~c~~~~~~~~ 341 (479)
T KOG1703|consen 272 VEKELPLLLVESEALKKLREEEKPAEYHN-VTRPLCLSCNQKIRSV---------KVIVALGKEWHPEHFSCEVCAIVIL 341 (479)
T ss_pred cccccchhhcccccccccccccccccccc-cccccccccccCcccc---------eeEeeccccccccceeecccccccc
Confidence 99999999999999999887666554443 6789999999887553 3477888899999999999999999
Q ss_pred cCchhhhhhhhhhccchhhHhhhcCCCCCCCcchhhHHHHHHHHHhhccccCCCCCCcCCCCCCCCCCCCCCCCCCCcCC
Q 010326 396 YGLPRLLTGSILAHEMMHAWLRLKGYPNLRPDVEEGICQVLAHMWLESEIYSGSGSDVASSSSSSASSSSSSPSSSSTSS 475 (513)
Q Consensus 396 ~glP~~L~gsilaHE~~Hawl~l~g~~~L~~~~eEG~cq~~a~~wl~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 475 (513)
++.|+..+|.+++|++||+|++.++...+.+.++++||++ +.+|+....+-..-...-. ..+.+..+..+......
T Consensus 342 ~~~~~~~~g~~~c~~~~~~~~~p~C~~C~~~i~~~~v~a~-~~~wH~~cf~C~~C~~~~~---~~~~~~~~~~pyce~~~ 417 (479)
T KOG1703|consen 342 DGGPRELDGKILCHECFHAPFRPNCKRCLLPILEEGVCAL-GRLWHPECFVCADCGKPLK---NSSFFESDGEPYCEDHY 417 (479)
T ss_pred CCCccccCCCccHHHHHHHhhCccccccCCchHHhHhhhc-cCeechhceeeecccCCCC---CCcccccCCccchhhhH
Confidence 9999999999999999999999999999999999999999 9999998876642111100 11111233333344455
Q ss_pred cCCC--cchHHHHHHHHHHhhhhcCCCCCCchhhhhhhc
Q 010326 476 KKGK--RSDFEKDLGKFFKHQIESDTSSAYGDGLGKVVR 512 (513)
Q Consensus 476 ~~~~--~~~~~~~l~~~~~~qi~~d~s~~YG~Gfr~~~~ 512 (513)
+++. +..+++++++|+.++|+.|.+++||+|||.++.
T Consensus 418 ~~~~~~~~~~~~~p~~~~~~~ie~~~~~~h~~~F~c~~c 456 (479)
T KOG1703|consen 418 KKLFTTKCDYCKKPVEFGSRQIEADGSPFHGDCFRCANC 456 (479)
T ss_pred hhhccccchhccchhHhhhhHhhccCccccccceehhhh
Confidence 5554 678899999999999999999999999998864
No 4
>KOG1701 consensus Focal adhesion adaptor protein Paxillin and related LIM proteins [Signal transduction mechanisms]
Probab=99.84 E-value=1.1e-22 Score=209.74 Aligned_cols=166 Identities=19% Similarity=0.392 Sum_probs=135.6
Q ss_pred CCCCCcCcCCCcccccC-ceeeecCccccCCCcccCCCCCCCCCcceeecCCccccccccccccCcccccCCCCcCCCCc
Q 010326 156 FSGYRICAGCNTEIGHG-RYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHPKCDVCQNFIPTNSA 234 (513)
Q Consensus 156 ~~g~~~C~~C~~~I~~g-~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~pkC~~C~~~I~~~~~ 234 (513)
...+.+|.+|++.|... ..++||++.||..||+|..|++.|.++.|+..++++||+.||.. ...||.+|++.|++.
T Consensus 271 ~~~~~iC~~C~K~V~g~~~ac~Am~~~fHv~CFtC~~C~r~L~Gq~FY~v~~k~~CE~cyq~-tlekC~~Cg~~I~d~-- 347 (468)
T KOG1701|consen 271 EDYFGICAFCHKTVSGQGLAVEAMDQLFHVQCFTCRTCRRQLAGQSFYQVDGKPYCEGCYQD-TLEKCNKCGEPIMDR-- 347 (468)
T ss_pred hhhhhhhhhcCCcccCcchHHHHhhhhhcccceehHhhhhhhccccccccCCcccchHHHHH-HHHHHhhhhhHHHHH--
Confidence 34567999999999643 56899999999999999999999999999999999999999975 578999999999985
Q ss_pred cceeeecccccccccCCCcccCCCCccCCCCCcCCCCCceEEccCCcccccccccccccCCCCCcccchHHHHHHhhcCC
Q 010326 235 GLIEYRAHPFWLQKYCPSHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDSAIMDTHECQPLYLEIQEFYEGLNM 314 (513)
Q Consensus 235 g~i~~~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v~~t~~C~~c~~~I~~f~e~l~~ 314 (513)
++...+.. ||++||+|.+|.+.| +|..|.+..++++||..||.+ .++++|..|.++| |
T Consensus 348 -iLrA~Gka----------yHp~CF~Cv~C~r~l-dgipFtvd~~n~v~Cv~dfh~--kfAPrCs~C~~PI--------~ 405 (468)
T KOG1701|consen 348 -ILRALGKA----------YHPGCFTCVVCARCL-DGIPFTVDSQNNVYCVPDFHK--KFAPRCSVCGNPI--------L 405 (468)
T ss_pred -HHHhcccc----------cCCCceEEEEecccc-CCccccccCCCceeeehhhhh--hcCcchhhccCCc--------c
Confidence 56555554 499999999999999 589999988999999999999 5699999999999 6
Q ss_pred cccccc-ceEEeehhhHHHhhcCCCCCccccccccCcccC
Q 010326 315 KVEQQV-PLLLVERQALNEAMEGEKNGHHHLPETRGLCLS 353 (513)
Q Consensus 315 ~i~~~i-Pv~LVe~~aLn~a~e~e~~g~~~~~e~rGlclS 353 (513)
+-+.+- .|++|.+ ++.|.-.-+--|-.|+-||
T Consensus 406 P~~G~~etvRvvam-------dr~fHv~CY~CEDCg~~LS 438 (468)
T KOG1701|consen 406 PRDGKDETVRVVAM-------DRDFHVNCYKCEDCGLLLS 438 (468)
T ss_pred CCCCCcceEEEEEc-------cccccccceehhhcCcccc
Confidence 665443 3777733 3333222222366788888
No 5
>KOG4577 consensus Transcription factor LIM3, contains LIM and HOX domains [Transcription]
Probab=99.81 E-value=4.6e-22 Score=196.04 Aligned_cols=124 Identities=26% Similarity=0.556 Sum_probs=112.2
Q ss_pred CCCCcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCccccccccccccCcccccCCCCcCCCCccc
Q 010326 157 SGYRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHPKCDVCQNFIPTNSAGL 236 (513)
Q Consensus 157 ~g~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~pkC~~C~~~I~~~~~g~ 236 (513)
...++|++|.+.|.+..+++++++.||..|++|+.|..+|.+.+|. ++|.+||+.+|.++|+.+|..|...|++..
T Consensus 31 ~eip~CagC~q~IlDrFilKvl~R~wHs~CLkCs~C~~qL~drCFs-R~~s~yCkedFfKrfGTKCsaC~~GIpPtq--- 106 (383)
T KOG4577|consen 31 VEIPICAGCDQHILDRFILKVLDRHWHSSCLKCSDCHDQLADRCFS-REGSVYCKEDFFKRFGTKCSACQEGIPPTQ--- 106 (383)
T ss_pred cccccccchHHHHHHHHHHHHHhhhhhhhhcchhhhhhHHHHHHhh-cCCceeehHHHHHHhCCcchhhcCCCChHH---
Confidence 3678999999999877788999999999999999999999999987 689999999999999999999999999863
Q ss_pred eeeecccccccccCCCcccCCCCccCCCCCcCCCCCceEEccCCcccccccccccc
Q 010326 237 IEYRAHPFWLQKYCPSHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDSAI 292 (513)
Q Consensus 237 i~~~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v 292 (513)
+..+...| .||.+||.|..|+|.|..|++||+++|++++|+..|+++-
T Consensus 107 VVRkAqd~--------VYHl~CF~C~iC~R~L~TGdEFYLmeD~rLvCK~DYE~Ak 154 (383)
T KOG4577|consen 107 VVRKAQDF--------VYHLHCFACFICKRQLATGDEFYLMEDARLVCKDDYETAK 154 (383)
T ss_pred HHHHhhcc--------eeehhhhhhHhhhcccccCCeeEEeccceeehhhhHHHHH
Confidence 44555554 5799999999999999999999999999999999999863
No 6
>KOG1701 consensus Focal adhesion adaptor protein Paxillin and related LIM proteins [Signal transduction mechanisms]
Probab=99.75 E-value=1.3e-19 Score=187.29 Aligned_cols=136 Identities=23% Similarity=0.474 Sum_probs=118.8
Q ss_pred CCCCcCCCCCCCCCCCcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceee-cCCccccccccccccCcccc
Q 010326 145 ESGNIFQPFPFFSGYRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSM-SGNRPYHKHCYKEQHHPKCD 223 (513)
Q Consensus 145 ~~gsv~~p~~~~~g~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~-~dG~pYCk~CY~~~f~pkC~ 223 (513)
..+++||...|......|..|++.| ...+|+++|+.||+.||+|..|.+.|.+..|.+ .++.+||-.||+++|+|+|.
T Consensus 320 v~~k~~CE~cyq~tlekC~~Cg~~I-~d~iLrA~GkayHp~CF~Cv~C~r~ldgipFtvd~~n~v~Cv~dfh~kfAPrCs 398 (468)
T KOG1701|consen 320 VDGKPYCEGCYQDTLEKCNKCGEPI-MDRILRALGKAYHPGCFTCVVCARCLDGIPFTVDSQNNVYCVPDFHKKFAPRCS 398 (468)
T ss_pred cCCcccchHHHHHHHHHHhhhhhHH-HHHHHHhcccccCCCceEEEEeccccCCccccccCCCceeeehhhhhhcCcchh
Confidence 3678888888877888999999999 489999999999999999999999999999986 78899999999999999999
Q ss_pred cCCCCcCCCC----ccceeeecccccccccCCCcccCCCCccCCCCCcCC---CCCceEEccCCcccccccccccc
Q 010326 224 VCQNFIPTNS----AGLIEYRAHPFWLQKYCPSHERDGTPRCCSCERMEP---RDTKYLSLDDGRKLCLECLDSAI 292 (513)
Q Consensus 224 ~C~~~I~~~~----~g~i~~~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~---~g~~y~~l~dgr~yC~~Cy~~~v 292 (513)
+|+++|...+ ...|+...+.| |.+|++|..|+.+|. .+...|.+ ||.++|+.|+.+.+
T Consensus 399 ~C~~PI~P~~G~~etvRvvamdr~f----------Hv~CY~CEDCg~~LS~e~e~qgCyPl-d~HllCk~Ch~~Rl 463 (468)
T KOG1701|consen 399 VCGNPILPRDGKDETVRVVAMDRDF----------HVNCYKCEDCGLLLSSEEEGQGCYPL-DGHLLCKTCHLKRL 463 (468)
T ss_pred hccCCccCCCCCcceEEEEEccccc----------cccceehhhcCccccccCCCCcceec-cCceeechhhhhhh
Confidence 9999998764 23456667766 899999999999987 46678888 78999999987643
No 7
>KOG2272 consensus Focal adhesion protein PINCH-1, contains LIM domains [Signal transduction mechanisms; Cytoskeleton]
Probab=99.63 E-value=2.4e-17 Score=160.52 Aligned_cols=133 Identities=21% Similarity=0.480 Sum_probs=116.6
Q ss_pred CCCCCCcCCCCCC-CCCCCcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCccccccccccccCcc
Q 010326 143 RYESGNIFQPFPF-FSGYRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHPK 221 (513)
Q Consensus 143 r~~~gsv~~p~~~-~~g~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~pk 221 (513)
|.-+|.+||+... ..+.|+|+.|.++| .+++|.+||+.||.++|+|+.|.+|+-+...+.+.|.+||+.+|.++|+..
T Consensus 178 Revk~eLyClrChD~mgipiCgaC~rpI-eervi~amgKhWHveHFvCa~CekPFlGHrHYEkkGlaYCe~h~~qLfG~~ 256 (332)
T KOG2272|consen 178 REVKGELYCLRCHDKMGIPICGACRRPI-EERVIFAMGKHWHVEHFVCAKCEKPFLGHRHYEKKGLAYCETHYHQLFGNL 256 (332)
T ss_pred hhhccceeccccccccCCcccccccCch-HHHHHHHhccccchhheeehhcCCcccchhhhhhcCchhHHHHHHHHhhhh
Confidence 3446888999876 68999999999999 589999999999999999999999998888888999999999999999999
Q ss_pred cccCCCCcCCCCccceeeecccccccccCCCcccCCCCccCCCCCcCCCCCceEEccCCcccccccccc
Q 010326 222 CDVCQNFIPTNSAGLIEYRAHPFWLQKYCPSHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDS 290 (513)
Q Consensus 222 C~~C~~~I~~~~~g~i~~~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~ 290 (513)
|..|+..|.+. ++...+.. + =+.||+|+.|.+.|..-.+|+.+ |-+++|..||++
T Consensus 257 CF~C~~~i~G~---vv~al~Ka-----w-----Cv~cf~Cs~Cdkkl~~K~Kf~E~-DmkP~CKkCy~r 311 (332)
T KOG2272|consen 257 CFICNRVIGGD---VVSALNKA-----W-----CVECFSCSTCDKKLTQKNKFYEF-DMKPVCKKCYDR 311 (332)
T ss_pred heecCCccCcc---HHHHhhhh-----h-----ccccccccccccccccccceeee-ccchHHHHHHhh
Confidence 99999999985 44444443 2 45689999999999888899988 679999999997
No 8
>KOG1044 consensus Actin-binding LIM Zn-finger protein Limatin involved in axon guidance [Signal transduction mechanisms; Cytoskeleton]
Probab=99.59 E-value=3.1e-16 Score=167.09 Aligned_cols=118 Identities=25% Similarity=0.619 Sum_probs=103.9
Q ss_pred CCCCcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCccccccccccccCcccccCCCCcCCCCccc
Q 010326 157 SGYRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHPKCDVCQNFIPTNSAGL 236 (513)
Q Consensus 157 ~g~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~pkC~~C~~~I~~~~~g~ 236 (513)
.+...|++|++.|..|+.+.++++.||..||+|..|+..|.+ +|..++|.|||..||.+.|+.+|..|.++|.+. +
T Consensus 131 ~~ps~cagc~~~lk~gq~llald~qwhv~cfkc~~c~~vL~g-ey~skdg~pyce~dy~~~fgvkc~~c~~fisgk---v 206 (670)
T KOG1044|consen 131 YGPSTCAGCGEELKNGQALLALDKQWHVSCFKCKSCSAVLNG-EYMSKDGVPYCEKDYQAKFGVKCEECEKFISGK---V 206 (670)
T ss_pred cCCccccchhhhhhccceeeeeccceeeeeeehhhhcccccc-eeeccCCCcchhhhhhhhcCeehHHhhhhhhhh---h
Confidence 456789999999999999999999999999999999999987 566689999999999999999999999999995 5
Q ss_pred eeeecccccccccCCCcccCCCCccCCCCCcCCCCCceEEccCCccccccccc
Q 010326 237 IEYRAHPFWLQKYCPSHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLD 289 (513)
Q Consensus 237 i~~~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~ 289 (513)
+...++ |||+.|-+|+.|+.+|+.|++-|+- ...++-..|-.
T Consensus 207 Lqag~k----------h~HPtCARCsRCgqmF~eGEEMYlQ-Gs~iWHP~C~q 248 (670)
T KOG1044|consen 207 LQAGDK----------HFHPTCARCSRCGQMFGEGEEMYLQ-GSEIWHPDCKQ 248 (670)
T ss_pred hhccCc----------ccCcchhhhhhhccccccchheeec-cccccCCcccc
Confidence 555553 7799999999999999999988844 56788777754
No 9
>KOG1703 consensus Adaptor protein Enigma and related PDZ-LIM proteins [Signal transduction mechanisms; Cytoskeleton]
Probab=99.55 E-value=3e-15 Score=162.59 Aligned_cols=132 Identities=19% Similarity=0.430 Sum_probs=113.3
Q ss_pred CCCcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCccccccccccccCcccccCCCCcCCCCccce
Q 010326 158 GYRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHPKCDVCQNFIPTNSAGLI 237 (513)
Q Consensus 158 g~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~pkC~~C~~~I~~~~~g~i 237 (513)
..+.|..|++.|....++.++++.||+.+|.|..|...|....|...+|.+||..||.+.+.|+|..|+++|.++. |
T Consensus 302 ~~p~c~~c~~~i~~~~~i~~~~~~~h~~~~~c~~~~~~~~~~~~~~~~g~~~c~~~~~~~~~p~C~~C~~~i~~~~---v 378 (479)
T KOG1703|consen 302 TRPLCLSCNQKIRSVKVIVALGKEWHPEHFSCEVCAIVILDGGPRELDGKILCHECFHAPFRPNCKRCLLPILEEG---V 378 (479)
T ss_pred ccccccccccCcccceeEeeccccccccceeeccccccccCCCccccCCCccHHHHHHHhhCccccccCCchHHhH---h
Confidence 4489999999994339999999999999999999999999888888999999999999999999999999999873 4
Q ss_pred eeecccccccccCCCcccCCCCccCCCCCcCCCCCceEEccCCcccccccccccccCCCCCcccchHHH
Q 010326 238 EYRAHPFWLQKYCPSHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDSAIMDTHECQPLYLEIQ 306 (513)
Q Consensus 238 ~~~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v~~t~~C~~c~~~I~ 306 (513)
...++ |||+.||.|..|++.+. +..|+ ..++.+||..||...+ +..|..|.++|.
T Consensus 379 ~a~~~----------~wH~~cf~C~~C~~~~~-~~~~~-~~~~~pyce~~~~~~~--~~~~~~~~~p~~ 433 (479)
T KOG1703|consen 379 CALGR----------LWHPECFVCADCGKPLK-NSSFF-ESDGEPYCEDHYKKLF--TTKCDYCKKPVE 433 (479)
T ss_pred hhccC----------eechhceeeecccCCCC-CCccc-ccCCccchhhhHhhhc--cccchhccchhH
Confidence 44333 57999999999999885 44555 4489999999999954 578999988874
No 10
>PF00412 LIM: LIM domain; InterPro: IPR001781 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents LIM-type zinc finger (Znf) domains. LIM domains coordinate one or more zinc atoms, and are named after the three proteins (LIN-11, Isl1 and MEC-3) in which they were first found. They consist of two zinc-binding motifs that resemble GATA-like Znfs; however, the residues holding the zinc atom(s) are variable, involving Cys, His, Asp or Glu residues. 
LIM domains are found in proteins with differing functions, including gene expression, and cytoskeleton organisation and development. Proteins containing LIM Znf domains include: Caenorhabditis elegans mec-3, a protein required for the differentiation of the set of six touch receptor neurons in this nematode. C. elegans lin-11, a protein required for the asymmetric division of vulval blast cells. Vertebrate insulin gene enhancer binding protein isl-1. Isl-1 binds to one of the two cis-acting protein-binding domains of the insulin gene. Vertebrate homeobox proteins lim-1, lim-2 (lim-5) and lim-3. Vertebrate lmx-1, which acts as a transcriptional activator by binding to the FLAT element, a beta-cell-specific transcriptional enhancer found in the insulin gene. Mammalian LH-2, a transcriptional regulatory protein involved in the control of cell differentiation in developing lymphoid and neural cell types. Drosophila melanogaster (Fruit fly) protein apterous, required for the normal development of the wing and haltere imaginal discs. Vertebrate protein kinases LIMK-1 and LIMK-2. Mammalian rhombotins. Rhombotin-1 (RBTN1 or TTG-1) and rhombotin-2 (RBTN2 or TTG-2) are proteins of about 160 amino acids whose genes are disrupted by chromosomal translocations in T-cell leukemia. Mammalian and avian cysteine-rich protein (CRP), a 192 amino-acid protein of unknown function. It seems to interact with zyxin. Mammalian cysteine-rich intestinal protein (CRIP), a small protein which seems to have a role in zinc absorption and may function as an intracellular zinc transport protein. Vertebrate paxillin, a cytoskeletal focal adhesion protein. Mus musculus (Mouse) testin, which should not be confused with rat testin, which is a thiol protease homologue (see IPR000169 from INTERPRO). Helianthus annuus (Common sunflower) pollen specific protein SF3. Chicken zyxin. Zyxin is a low-abundance adhesion plaque protein which has been shown to interact with CRP.
Yeast protein LRG1, which is involved in sporulation. Saccharomyces cerevisiae (Baker's yeast) rho-type GTPase activating protein RGA1/DBM1. C. elegans homeobox protein ceh-14. C. elegans homeobox protein unc-97. S. cerevisiae hypothetical protein YKR090w. C. elegans hypothetical protein C28H8.6. These proteins generally contain two tandem copies of the LIM domain in their N-terminal section. Zyxin and paxillin are exceptions in that they contain respectively three and four LIM domains at their C-terminal extremity. In apterous, isl-1, LH-2, lin-11, lim-1 to lim-3, lmx-1, ceh-14 and mec-3 there is a homeobox domain some 50 to 95 amino acids after the LIM domains. LIM domains contain seven conserved cysteine residues and a histidine. The arrangement followed by these conserved residues is: C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD]. LIM domains bind two zinc ions. LIM does not bind DNA; rather, it seems to act as an interface for protein-protein interaction. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding; PDB: 2CO8_A 2EGQ_A 2CUR_A 3IXE_B 1CTL_A 1B8T_A 1X62_A 2DFY_C 1IML_A 2CUQ_A ....
Probab=99.24 E-value=3.8e-12 Score=98.58 Aligned_cols=57 Identities=39% Similarity=0.996 Sum_probs=52.1
Q ss_pred CcCCCcccccCcee-eecCccccCCCcccCCCCCCCCCcceeecCCcccccccccccc
Q 010326 162 CAGCNTEIGHGRYL-SCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQH 218 (513)
Q Consensus 162 C~~C~~~I~~g~~l-~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f 218 (513)
|.+|+++|..+..+ .++++.||+.||+|..|+.+|.+..|+..+|++||+.||.++|
T Consensus 1 C~~C~~~I~~~~~~~~~~~~~~H~~Cf~C~~C~~~l~~~~~~~~~~~~~C~~c~~~~f 58 (58)
T PF00412_consen 1 CARCGKPIYGTEIVIKAMGKFWHPECFKCSKCGKPLNDGDFYEKDGKPYCKDCYQKRF 58 (58)
T ss_dssp BTTTSSBESSSSEEEEETTEEEETTTSBETTTTCBTTTSSEEEETTEEEEHHHHHHHT
T ss_pred CCCCCCCccCcEEEEEeCCcEEEccccccCCCCCccCCCeeEeECCEEECHHHHhhhC
Confidence 88999999866655 7999999999999999999999888888999999999998865
No 11
>KOG1044 consensus Actin-binding LIM Zn-finger protein Limatin involved in axon guidance [Signal transduction mechanisms; Cytoskeleton]
Probab=99.10 E-value=7.2e-11 Score=126.65 Aligned_cols=158 Identities=16% Similarity=0.324 Sum_probs=117.2
Q ss_pred CcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCccccccccccccCcccccCCCCcCCCCccceee
Q 010326 160 RICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHPKCDVCQNFIPTNSAGLIEY 239 (513)
Q Consensus 160 ~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~pkC~~C~~~I~~~~~g~i~~ 239 (513)
-.|..|.+.- .|+++++.++.||..||.|..|+..|....|+.+++. +++++ ..|..+|.+. ++..
T Consensus 17 i~c~~c~~kc-~gevlrv~d~~fhi~cf~c~~cg~~la~~gff~k~~~--------~~ygt--~~c~~~~~ge---vvsa 82 (670)
T KOG1044|consen 17 IKCDKCRKKC-SGEVLRVNDNHFHINCFQCKKCGRNLAEGGFFTKPEN--------RLYGT--DDCRAFVEGE---VVST 82 (670)
T ss_pred eehhhhCCcc-ccceeEeeccccceeeeeccccCCCcccccceecccc--------eeecc--cchhhhccce---eEec
Confidence 4699999998 5999999999999999999999999999888876654 34444 6788888875 4666
Q ss_pred ecccccccccCCCcccCCCCccCCCCCcCCCCCceEEccCCccccccccccccc------CCCCCcccchHHHHHHhhcC
Q 010326 240 RAHPFWLQKYCPSHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDSAIM------DTHECQPLYLEIQEFYEGLN 313 (513)
Q Consensus 240 ~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v~------~t~~C~~c~~~I~~f~e~l~ 313 (513)
.+..| |+.||.|+.|+.+++.|++.. +.....+|..|-.-+-. +...|++|...|
T Consensus 83 ~gkty----------h~~cf~cs~ck~pf~~g~~vt-~~gk~~~c~~c~~~~~~~p~~~~~ps~cagc~~~l-------- 143 (670)
T KOG1044|consen 83 LGKTY----------HPKCFSCSTCKSPFKSGDKVT-FSGKECLCQTCSQPMPVSPAESYGPSTCAGCGEEL-------- 143 (670)
T ss_pred cccee----------ccccceecccCCCCCCCCeee-ecchhhhhhhhcCcccCCcccccCCccccchhhhh--------
Confidence 66654 899999999999999888754 44556888888654221 345799999877
Q ss_pred CccccccceEEeehhhHHHhhcC--------CCCCccccccccCcccCccchh
Q 010326 314 MKVEQQVPLLLVERQALNEAMEG--------EKNGHHHLPETRGLCLSEEQTV 358 (513)
Q Consensus 314 ~~i~~~iPv~LVe~~aLn~a~e~--------e~~g~~~~~e~rGlclSee~~v 358 (513)
..+|. .-||.++|+. -..-+..++..+|+.||+.+|.
T Consensus 144 --k~gq~------llald~qwhv~cfkc~~c~~vL~gey~skdg~pyce~dy~ 188 (670)
T KOG1044|consen 144 --KNGQA------LLALDKQWHVSCFKCKSCSAVLNGEYMSKDGVPYCEKDYQ 188 (670)
T ss_pred --hccce------eeeeccceeeeeeehhhhcccccceeeccCCCcchhhhhh
Confidence 12333 2256666651 1122334567889999988874
No 12
>KOG1700 consensus Regulatory protein MLP and related LIM proteins [Signal transduction mechanisms; Cytoskeleton]
Probab=98.66 E-value=4.7e-09 Score=102.07 Aligned_cols=134 Identities=20% Similarity=0.338 Sum_probs=90.4
Q ss_pred CCCcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCccccccccccccCcccccCCCC---cCCC--
Q 010326 158 GYRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHPKCDVCQNF---IPTN-- 232 (513)
Q Consensus 158 g~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~pkC~~C~~~---I~~~-- 232 (513)
....|..|++.++.-..+...|..||+.||+|..|...|....+..+++.+||+.||..+++|+=..=... +.+.
T Consensus 6 ~~~kc~~c~k~vy~~e~~~~~g~~~hk~c~~c~~~~k~l~~~~~~~~e~~~yc~~~~~~~~~~~~~~~~~~~~~~~~~~~ 85 (200)
T KOG1700|consen 6 TTDKCNACGKTVYFVEKVQKDGVDFHKECFKCEKCKKTLTLSGYSEHEGVPYCKNCHVAQFGPKGGGFGKGFQKAGGLGK 85 (200)
T ss_pred ccchhhhccCcchHHHHHhccCcchhhhHHhccccccccccccccccccccccccchHhhhCcccccccccccccCCCCc
Confidence 34589999999988777779999999999999999999998888889999999998877766654333321 0000
Q ss_pred -Cccceeeecccccc-------cccCC----------------CcccCCCCccCCCCCcCCCCCceEEccCCcccccccc
Q 010326 233 -SAGLIEYRAHPFWL-------QKYCP----------------SHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECL 288 (513)
Q Consensus 233 -~~g~i~~~~hpfw~-------~~yCp----------------~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy 288 (513)
..........+.|. ..-|+ .-||..||+|+.|+..|.. ..|... .+.++|...+
T Consensus 86 ~~~~~~~~~~~~~~~~~~~~g~~~~c~~c~k~vy~~Ek~~~~~~~~hk~cfrc~~~~~~ls~-~~~~~~-~g~l~~~~~~ 163 (200)
T KOG1700|consen 86 DGKSLNESKPNQSAKFQVFAGEKEKCARCQKTVYPLEKVTGNGLEFHKSCFRCTHCGKKLSP-KNYAAL-EGVLYCKHHF 163 (200)
T ss_pred ccccccccccccchhHHhhhccccccccccceeeehHHHhhhhhhhhhhheeecccccccCC-cchhhc-CCccccchhh
Confidence 00000000011010 00111 3469999999999999964 455544 6788887776
Q ss_pred ccccc
Q 010326 289 DSAIM 293 (513)
Q Consensus 289 ~~~v~ 293 (513)
...++
T Consensus 164 ~~~~~ 168 (200)
T KOG1700|consen 164 AQLFK 168 (200)
T ss_pred heeec
Confidence 65443
No 13
>PF00412 LIM: LIM domain; InterPro: IPR001781 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents LIM-type zinc finger (Znf) domains. LIM domains coordinate one or more zinc atoms, and are named after the three proteins (LIN-11, Isl1 and MEC-3) in which they were first found. They consist of two zinc-binding motifs that resemble GATA-like Znfs; however, the residues holding the zinc atom(s) are variable, involving Cys, His, Asp or Glu residues.
LIM domains are found in proteins with differing functions, including gene expression, and cytoskeleton organisation and development. Proteins containing LIM Znf domains include: Caenorhabditis elegans mec-3, a protein required for the differentiation of the set of six touch receptor neurons in this nematode. C. elegans lin-11, a protein required for the asymmetric division of vulval blast cells. Vertebrate insulin gene enhancer binding protein isl-1. Isl-1 binds to one of the two cis-acting protein-binding domains of the insulin gene. Vertebrate homeobox proteins lim-1, lim-2 (lim-5) and lim-3. Vertebrate lmx-1, which acts as a transcriptional activator by binding to the FLAT element, a beta-cell-specific transcriptional enhancer found in the insulin gene. Mammalian LH-2, a transcriptional regulatory protein involved in the control of cell differentiation in developing lymphoid and neural cell types. Drosophila melanogaster (Fruit fly) protein apterous, required for the normal development of the wing and haltere imaginal discs. Vertebrate protein kinases LIMK-1 and LIMK-2. Mammalian rhombotins. Rhombotin-1 (RBTN1 or TTG-1) and rhombotin-2 (RBTN2 or TTG-2) are proteins of about 160 amino acids whose genes are disrupted by chromosomal translocations in T-cell leukemia. Mammalian and avian cysteine-rich protein (CRP), a 192 amino-acid protein of unknown function. It seems to interact with zyxin. Mammalian cysteine-rich intestinal protein (CRIP), a small protein which seems to have a role in zinc absorption and may function as an intracellular zinc transport protein. Vertebrate paxillin, a cytoskeletal focal adhesion protein. Mus musculus (Mouse) testin, which should not be confused with rat testin, which is a thiol protease homologue (see IPR000169 from INTERPRO). Helianthus annuus (Common sunflower) pollen specific protein SF3. Chicken zyxin. Zyxin is a low-abundance adhesion plaque protein which has been shown to interact with CRP.
Yeast protein LRG1, which is involved in sporulation. Saccharomyces cerevisiae (Baker's yeast) rho-type GTPase activating protein RGA1/DBM1. C. elegans homeobox protein ceh-14. C. elegans homeobox protein unc-97. S. cerevisiae hypothetical protein YKR090w. C. elegans hypothetical protein C28H8.6. These proteins generally contain two tandem copies of the LIM domain in their N-terminal section. Zyxin and paxillin are exceptions in that they contain respectively three and four LIM domains at their C-terminal extremity. In apterous, isl-1, LH-2, lin-11, lim-1 to lim-3, lmx-1, ceh-14 and mec-3 there is a homeobox domain some 50 to 95 amino acids after the LIM domains. LIM domains contain seven conserved cysteine residues and a histidine. The arrangement followed by these conserved residues is: C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD]. LIM domains bind two zinc ions. LIM does not bind DNA; rather, it seems to act as an interface for protein-protein interaction. More information about these proteins can be found at Protein of the Month: Zinc Fingers.; GO: 0008270 zinc ion binding; PDB: 2CO8_A 2EGQ_A 2CUR_A 3IXE_B 1CTL_A 1B8T_A 1X62_A 2DFY_C 1IML_A 2CUQ_A ....
Probab=98.39 E-value=2.1e-07 Score=71.89 Aligned_cols=57 Identities=18% Similarity=0.385 Sum_probs=43.9
Q ss_pred cccCCCCcCCCCccceeeecccccccccCCCcccCCCCccCCCCCcCCCCCceEEccCCccccccccccc
Q 010326 222 CDVCQNFIPTNSAGLIEYRAHPFWLQKYCPSHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDSA 291 (513)
Q Consensus 222 C~~C~~~I~~~~~g~i~~~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~ 291 (513)
|..|+++|.+.. ..+...+.. ||..||+|..|++.|..+. |+.. +|++||..||.+.
T Consensus 1 C~~C~~~I~~~~-~~~~~~~~~----------~H~~Cf~C~~C~~~l~~~~-~~~~-~~~~~C~~c~~~~ 57 (58)
T PF00412_consen 1 CARCGKPIYGTE-IVIKAMGKF----------WHPECFKCSKCGKPLNDGD-FYEK-DGKPYCKDCYQKR 57 (58)
T ss_dssp BTTTSSBESSSS-EEEEETTEE----------EETTTSBETTTTCBTTTSS-EEEE-TTEEEEHHHHHHH
T ss_pred CCCCCCCccCcE-EEEEeCCcE----------EEccccccCCCCCccCCCe-eEeE-CCEEECHHHHhhh
Confidence 889999999764 122355543 4999999999999997554 6654 7899999999763
No 14
>smart00132 LIM Zinc-binding domain present in Lin-11, Isl-1, Mec-3. Zinc-binding domain family. Some LIM domains bind protein partners via tyrosine-containing motifs. LIM domains are found in many key regulators of developmental pathways.
Probab=98.25 E-value=7.1e-07 Score=62.87 Aligned_cols=37 Identities=43% Similarity=1.115 Sum_probs=33.6
Q ss_pred cCcCCCcccccC-ceeeecCccccCCCcccCCCCCCCC
Q 010326 161 ICAGCNTEIGHG-RYLSCMEAFWHPECFRCHSCNLPIT 197 (513)
Q Consensus 161 ~C~~C~~~I~~g-~~l~alg~~wHp~CFrCs~C~~~L~ 197 (513)
.|.+|+++|..+ ..+.+++..||+.||+|..|+.+|.
T Consensus 1 ~C~~C~~~i~~~~~~~~~~~~~~H~~Cf~C~~C~~~L~ 38 (39)
T smart00132 1 KCAGCGKPIRGGELVLRALGKVWHPECFKCSKCGKPLG 38 (39)
T ss_pred CccccCCcccCCcEEEEeCCccccccCCCCcccCCcCc
Confidence 489999999766 7788999999999999999999885
No 15
>KOG4577 consensus Transcription factor LIM3, contains LIM and HOX domains [Transcription]
Probab=97.69 E-value=5.6e-06 Score=83.08 Aligned_cols=81 Identities=22% Similarity=0.400 Sum_probs=65.4
Q ss_pred ccCCCCCCCCCCCCcCCCCCC-CCCCCcCcCCCcccccCceee-ecCccccCCCcccCCCCCCCCC-ccee-ecCCcccc
Q 010326 135 SLRVDSPPRYESGNIFQPFPF-FSGYRICAGCNTEIGHGRYLS-CMEAFWHPECFRCHSCNLPITD-VEFS-MSGNRPYH 210 (513)
Q Consensus 135 sl~~~sppr~~~gsv~~p~~~-~~g~~~C~~C~~~I~~g~~l~-alg~~wHp~CFrCs~C~~~L~~-~~F~-~~dG~pYC 210 (513)
.|......| .|++||..++ ..+.-.|..|...|.+.++|+ +.+..||..||.|..|+..|.. .+|+ +.|+++.|
T Consensus 69 qL~drCFsR--~~s~yCkedFfKrfGTKCsaC~~GIpPtqVVRkAqd~VYHl~CF~C~iC~R~L~TGdEFYLmeD~rLvC 146 (383)
T KOG4577|consen 69 QLADRCFSR--EGSVYCKEDFFKRFGTKCSACQEGIPPTQVVRKAQDFVYHLHCFACFICKRQLATGDEFYLMEDARLVC 146 (383)
T ss_pred HHHHHHhhc--CCceeehHHHHHHhCCcchhhcCCCChHHHHHHhhcceeehhhhhhHhhhcccccCCeeEEeccceeeh
Confidence 333333444 6899999887 777789999999998887764 8899999999999999999963 3454 68999999
Q ss_pred ccccccc
Q 010326 211 KHCYKEQ 217 (513)
Q Consensus 211 k~CY~~~ 217 (513)
+.+|..-
T Consensus 147 K~DYE~A 153 (383)
T KOG4577|consen 147 KDDYETA 153 (383)
T ss_pred hhhHHHH
Confidence 9999764
No 16
>KOG1702 consensus Nebulin repeat protein [Cytoskeleton]
Probab=97.41 E-value=2e-05 Score=75.97 Aligned_cols=59 Identities=20% Similarity=0.613 Sum_probs=53.7
Q ss_pred CcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCcccccccccccc
Q 010326 160 RICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQH 218 (513)
Q Consensus 160 ~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f 218 (513)
..|..|++.+++-+-++++.+.||..||+|..|+.+|....|...+.+|||..+|..+.
T Consensus 5 ~n~~~cgk~vYPvE~v~cldk~whk~cfkce~c~mtlnmKnyKgy~kkpycn~hYpkq~ 63 (264)
T KOG1702|consen 5 CNREDCGKTVYPVEEVKCLDKVWHKQCFKCEVCGMTLNMKNYKGYDKKPYCNPHYPKQV 63 (264)
T ss_pred chhhhhccccccHHHHhhHHHHHHHHhheeeeccCChhhhhccccccCCCcCcccccce
Confidence 46889999999888899999999999999999999999988887789999999998754
No 17
>KOG0490 consensus Transcription factor, contains HOX domain [General function prediction only]
Probab=97.15 E-value=6.3e-05 Score=73.47 Aligned_cols=114 Identities=19% Similarity=0.376 Sum_probs=87.4
Q ss_pred CCCcccccCceeeecCccccCCCcccCCCCCCCC--CcceeecCCcccccccccc--ccCcccccCCCCcCCCCccceee
Q 010326 164 GCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPIT--DVEFSMSGNRPYHKHCYKE--QHHPKCDVCQNFIPTNSAGLIEY 239 (513)
Q Consensus 164 ~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~--~~~F~~~dG~pYCk~CY~~--~f~pkC~~C~~~I~~~~~g~i~~ 239 (513)
+|+..|.+...+...+..||..|..|..|...+. ...|.. +|..||..+|.. .+..+|..|...|...+ .++.
T Consensus 1 ~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~g~~~~~~d~~~~~~~~~rr~rt~~~~~ql~--~ler 77 (235)
T KOG0490|consen 1 GCGRQILDRYLLRVLDRYWHASCLKCAECDNPLGVGDTCFSK-DGSIYCKRDYQREFKFSKRCARCKFTISQLD--ELER 77 (235)
T ss_pred CCCccccchHHhhcccHHHHHHHHhhhhhcchhccCCCcccC-CCcccccccchhhhhccccccCCCCCcCHHH--HHHH
Confidence 4778886556677789999999999999999998 667877 999999999998 88899999998885432 2221
Q ss_pred ecccccccccCCCcccCCCCccCCCCCcCCCCCceEEccCCcccccccccc
Q 010326 240 RAHPFWLQKYCPSHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDS 290 (513)
Q Consensus 240 ~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~ 290 (513)
.... . |--||.|..|.+.+..++.+.+.......|...+..
T Consensus 78 ~f~~---------~-h~Pd~~~r~~la~~~~~~e~rVqvwFqnrrak~r~~ 118 (235)
T KOG0490|consen 78 AFEK---------V-HLPCFACRECLALLLTGDEFRVQVWFQNRRAKDRKE 118 (235)
T ss_pred hhcC---------C-CcCccchHHHHhhcCCCCeeeeehhhhhhcHhhhhh
Confidence 1111 1 557999999999887777776554447777777655
No 18
>KOG1700 consensus Regulatory protein MLP and related LIM proteins [Signal transduction mechanisms; Cytoskeleton]
Probab=97.06 E-value=0.00016 Score=70.53 Aligned_cols=63 Identities=19% Similarity=0.408 Sum_probs=55.2
Q ss_pred CCCCCcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCcccccccccccc
Q 010326 156 FSGYRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQH 218 (513)
Q Consensus 156 ~~g~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f 218 (513)
......|..|.+.+++-+-+...+..||..||+|..|+..|+...|....|.+||+.++..+|
T Consensus 105 ~g~~~~c~~c~k~vy~~Ek~~~~~~~~hk~cfrc~~~~~~ls~~~~~~~~g~l~~~~~~~~~~ 167 (200)
T KOG1700|consen 105 AGEKEKCARCQKTVYPLEKVTGNGLEFHKSCFRCTHCGKKLSPKNYAALEGVLYCKHHFAQLF 167 (200)
T ss_pred hccccccccccceeeehHHHhhhhhhhhhhheeecccccccCCcchhhcCCccccchhhheee
Confidence 344578999999999888888999999999999999999999999998899999987766543
No 19
>smart00132 LIM Zinc-binding domain present in Lin-11, Isl-1, Mec-3. Zinc-binding domain family. Some LIM domains bind protein partners via tyrosine-containing motifs. LIM domains are found in many key regulators of developmental pathways.
Probab=96.85 E-value=0.00083 Score=47.00 Aligned_cols=38 Identities=16% Similarity=0.249 Sum_probs=28.4
Q ss_pred ccccCCCCcCCCCccceeeecccccccccCCCcccCCCCccCCCCCcCC
Q 010326 221 KCDVCQNFIPTNSAGLIEYRAHPFWLQKYCPSHERDGTPRCCSCERMEP 269 (513)
Q Consensus 221 kC~~C~~~I~~~~~g~i~~~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~ 269 (513)
+|..|+++|.+.. ..+...+.. ||..||+|..|++.|.
T Consensus 1 ~C~~C~~~i~~~~-~~~~~~~~~----------~H~~Cf~C~~C~~~L~ 38 (39)
T smart00132 1 KCAGCGKPIRGGE-LVLRALGKV----------WHPECFKCSKCGKPLG 38 (39)
T ss_pred CccccCCcccCCc-EEEEeCCcc----------ccccCCCCcccCCcCc
Confidence 5899999998852 234444443 5999999999999873
No 20
>PF13485 Peptidase_MA_2: Peptidase MA superfamily
Probab=93.36 E-value=0.087 Score=45.50 Aligned_cols=44 Identities=23% Similarity=0.260 Sum_probs=33.1
Q ss_pred CchhhhhhhhhhccchhhHhhhcC--CCCCCCcchhhHHHHHHHHH
Q 010326 397 GLPRLLTGSILAHEMMHAWLRLKG--YPNLRPDVEEGICQVLAHMW 440 (513)
Q Consensus 397 glP~~L~gsilaHE~~Hawl~l~g--~~~L~~~~eEG~cq~~a~~w 440 (513)
+.+..-...+|+||+.|+|+.... ...++..+.||++++++..|
T Consensus 19 ~~~~~~~~~~l~HE~~H~~~~~~~~~~~~~~~W~~EG~A~y~~~~~ 64 (128)
T PF13485_consen 19 GSDEDWLDRVLAHELAHQWFGNYFGGDDNAPRWFNEGLAEYVEGRI 64 (128)
T ss_pred CCCHHHHHHHHHHHHHHHHHHHHcCCCccCchHHHHHHHHHHhcCc
Confidence 344443447999999999987553 24778899999999999653
No 21
>smart00726 UIM Ubiquitin-interacting motif. Present in proteasome subunit S5a and other ubiquitin-associated proteins.
Probab=86.46 E-value=0.56 Score=31.13 Aligned_cols=21 Identities=29% Similarity=0.469 Sum_probs=16.9
Q ss_pred CChhHHHHHHHhhhhhhhhcC
Q 010326 40 FDNEEIDRAIALSLVEVDQKG 60 (513)
Q Consensus 40 ~~~~~~~~~~~~~~~~~~~~~ 60 (513)
.|+|+|.+||++||.|.....
T Consensus 1 ~EDe~Lq~Ai~lSl~e~e~~~ 21 (26)
T smart00726 1 DEDEDLQLALELSLQEAEESX 21 (26)
T ss_pred ChHHHHHHHHHHhHHHhhhcc
Confidence 368999999999998776543
No 22
>PF02809 UIM: Ubiquitin interaction motif; InterPro: IPR003903 The Ubiquitin Interacting Motif (UIM), or 'LALAL-motif', is a stretch of about 20 amino acid residues, which was first described in the 26S proteasome subunit PSD4/RPN-10 that is known to recognise ubiquitin. In addition, the UIM is found, often in tandem or triplet arrays, in a variety of proteins either involved in ubiquitination and ubiquitin metabolism, or known to interact with ubiquitin-like modifiers. Among the UIM proteins are two different subgroups of the UBP (ubiquitin carboxy-terminal hydrolase) family of deubiquitinating enzymes, one F-box protein, one family of HECT-containing ubiquitin-ligases (E3s) from plants, and several proteins containing ubiquitin-associated UBA and/or UBX domains. In most of these proteins, the UIM occurs in multiple copies and in association with other domains such as UBA (IPR015940 from INTERPRO), UBX (IPR001012 from INTERPRO), ENTH, EH (IPR000261 from INTERPRO), VHS (IPR002014 from INTERPRO), SH3 (IPR001452 from INTERPRO), HECT (IPR000569 from INTERPRO), VWFA (IPR002035 from INTERPRO), EF-hand calcium-binding, WD-40 (IPR001680 from INTERPRO), F-box (IPR001810 from INTERPRO), LIM (IPR001781 from INTERPRO), protein kinase (IPR000719 from INTERPRO), ankyrin (IPR002110 from INTERPRO), PX (IPR001683 from INTERPRO), phosphatidylinositol 3- and 4-kinase (IPR000403 from INTERPRO), C2 (IPR000008 from INTERPRO), OTU (IPR003323 from INTERPRO), dnaJ (IPR001623 from INTERPRO), RING-finger (IPR001841 from INTERPRO) or FYVE-finger (IPR017455 from INTERPRO). UIMs have been shown to bind ubiquitin and to serve as a specific targeting signal important for monoubiquitination. Thus, UIMs may have several functions in ubiquitin metabolism, each of which may require different numbers of UIMs. The UIM is unlikely to form an independent folding domain.
Instead, based on the spacing of the conserved residues, the motif probably forms a short alpha-helix that can be embedded into different protein folds. Some proteins known to contain a UIM are listed below: Eukaryotic PSD4/RPN-10/S5, a multi-ubiquitin binding subunit of the 26S proteasome. Vertebrate Machado-Joseph disease protein 1 (Ataxin-3), which acts as a histone-binding protein that regulates transcription; defects in Ataxin-3 cause the neurodegenerative disorder Machado-Joseph disease (MJD). Vertebrate epsin and epsin2. Vertebrate hepatocyte growth factor-regulated tyrosine kinase substrate (HRS). Mammalian epidermal growth factor receptor substrate 15 (EPS15), which is involved in cell growth regulation. Mammalian epidermal growth factor receptor substrate EPS15R. Drosophila melanogaster (Fruit fly) liquid facets (lqf), an epsin. Yeast VPS27 vacuolar sorting protein, which is required for membrane traffic to the vacuole.; PDB: 2KDE_A 2KDF_A 1YX6_A 1YX5_A 1YX4_A 1P9C_A 1UEL_B 1P9D_S 2KLZ_A.
Probab=83.07 E-value=0.57 Score=28.44 Aligned_cols=16 Identities=38% Similarity=0.646 Sum_probs=13.5
Q ss_pred CChhHHHHHHHhhhhh
Q 010326 40 FDNEEIDRAIALSLVE 55 (513)
Q Consensus 40 ~~~~~~~~~~~~~~~~ 55 (513)
.|+++|.+||++|+.|
T Consensus 2 ~Ed~~L~~Al~~S~~e 17 (18)
T PF02809_consen 2 DEDEDLQRALEMSLEE 17 (18)
T ss_dssp HHHHHHHHHHHHHHHH
T ss_pred chHHHHHHHHHhhhcc
Confidence 4678999999999865
No 23
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=76.78 E-value=0.31 Score=39.79 Aligned_cols=31 Identities=19% Similarity=0.279 Sum_probs=24.7
Q ss_pred cCCCCeeeeeccccccccccceeeeEeeecC
Q 010326 367 IGAGYRLIDMITEPYRLIRRCEVTAILILYG 397 (513)
Q Consensus 367 ~~~G~rilei~~~p~~~~~~~eV~~Il~l~g 397 (513)
|++||+|++|||++++.....++..+|.-.+
T Consensus 43 l~~GD~Il~INg~~v~~~~~~~~~~~l~~~~ 73 (81)
T PF00595_consen 43 LKVGDRILEINGQSVRGMSHDEVVQLLKSAS 73 (81)
T ss_dssp SSTTEEEEEETTEESTTSBHHHHHHHHHHST
T ss_pred cchhhhhheeCCEeCCCCCHHHHHHHHHCCC
Confidence 8899999999999999887777655554443
No 24
>PF04450 BSP: Peptidase of plants and bacteria; InterPro: IPR007541 These basic secretory proteins (BSPs) are believed to be part of the plant's defence mechanism against pathogens.
Probab=76.75 E-value=1.6 Score=42.99 Aligned_cols=38 Identities=26% Similarity=0.339 Sum_probs=31.5
Q ss_pred hhhhhhhhccchhhHhhhcCCCCCCCcchhhHHHHHHHH
Q 010326 401 LLTGSILAHEMMHAWLRLKGYPNLRPDVEEGICQVLAHM 439 (513)
Q Consensus 401 ~L~gsilaHE~~Hawl~l~g~~~L~~~~eEG~cq~~a~~ 439 (513)
.-.-.+|-||+||+|+- +|...-|.-+-|||..++-+.
T Consensus 94 ~Ei~Gvl~HE~~H~~Q~-~~~~~~P~~liEGIADyVRl~ 131 (205)
T PF04450_consen 94 DEIIGVLYHEMVHCWQW-DGRGTAPGGLIEGIADYVRLK 131 (205)
T ss_pred HHHHHHHHHHHHHHhhc-CCCCCCChhheecHHHHHHHH
Confidence 33345899999999998 777778889999999988765
No 25
>PF01433 Peptidase_M1: Peptidase family M1 This is family M1 in the peptidase classification.; InterPro: IPR014782 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. 
The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. This group of metallopeptidases belongs to the MEROPS peptidase family M1 (clan MA(E)), the type example being aminopeptidase N from Homo sapiens (Human). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA. Membrane alanine aminopeptidase (EC 3.4.11.2) is part of the HEXXH+E group; it consists entirely of aminopeptidases, spread across a wide variety of species. Functional studies show that CD13/APN catalyzes the removal of single amino acids from the amino terminus of small peptides and probably plays a role in their final digestion; one family member (leukotriene-A4 hydrolase) is known to hydrolyse the epoxide leukotriene-A4 to form an inflammatory mediator. This hydrolase has been shown to have aminopeptidase activity, and the zinc ligands of the M1 family were identified by site-directed mutagenesis on this enzyme. CD13 participates in trimming peptides bound to MHC class II molecules and cleaves MIP-1 chemokine, which alters target cell specificity from basophils to eosinophils.
CD13 acts as a receptor for specific strains of RNA viruses (coronaviruses) which cause a relatively large percentage of upper respiratory tract infections. CD molecules are leucocyte antigens on cell surfaces. CD antigen nomenclature is updated at Protein Reviews On The Web (http://prow.nci.nih.gov/).; GO: 0008237 metallopeptidase activity, 0008270 zinc ion binding; PDB: 2XQ0_A 2XPY_A 2XPZ_A 3SE6_B 3EBH_A 3EBG_A 3T8V_A 3Q44_A 3Q43_A 3EBI_A ....
Probab=72.01 E-value=2 Score=44.98 Aligned_cols=43 Identities=26% Similarity=0.357 Sum_probs=32.4
Q ss_pred hhhhhhhccchhhHhhhc--CCCCC-CCcchhhHHHHHHHHHhhccc
Q 010326 402 LTGSILAHEMMHAWLRLK--GYPNL-RPDVEEGICQVLAHMWLESEI 445 (513)
Q Consensus 402 L~gsilaHE~~Hawl~l~--g~~~L-~~~~eEG~cq~~a~~wl~~~~ 445 (513)
....+||||++|-|.. + ....- ...+-|||+..++++|++...
T Consensus 294 ~~~~~iahElahqWfG-n~vt~~~w~d~WL~Eg~a~y~~~~~~~~~~ 339 (390)
T PF01433_consen 294 EIASLIAHELAHQWFG-NLVTPKWWSDLWLNEGFATYLEYLILEKLF 339 (390)
T ss_dssp HHHHHHHHHHHTTTBT-TTEEESSGGGHHHHHHHHHHHHHHHHHHHH
T ss_pred hhHHHHHHHHHHHHhc-cCCccccchhhhHHHHHHHHHHHHhHhhcc
Confidence 3456899999999976 2 12222 236999999999999999755
No 26
>PF10026 DUF2268: Predicted Zn-dependent protease (DUF2268); InterPro: IPR018728 This domain, found in various hypothetical bacterial proteins, as well as predicted zinc dependent proteases, has no known function.
Probab=69.91 E-value=4 Score=39.52 Aligned_cols=44 Identities=20% Similarity=0.196 Sum_probs=31.8
Q ss_pred hhhhhhhccchhhHhhh------cCCCCCCCcchhhHHHHHHHHHhhccc
Q 010326 402 LTGSILAHEMMHAWLRL------KGYPNLRPDVEEGICQVLAHMWLESEI 445 (513)
Q Consensus 402 L~gsilaHE~~Hawl~l------~g~~~L~~~~eEG~cq~~a~~wl~~~~ 445 (513)
-.-++||||+-|++-.- ++...|...|-||+.+.++..-.....
T Consensus 64 ~l~~~iaHE~hH~~r~~~~~~~~~~~TLld~~I~EGlAe~f~~~~~g~~~ 113 (195)
T PF10026_consen 64 ELPALIAHEYHHNCRYEQIGWDPEDTTLLDSLIMEGLAEYFAEELYGEEY 113 (195)
T ss_pred HHHHHHHHHHHHHHHHhccCCCCCCCCHHHHHHHhhHHHHHHHHHcCCCC
Confidence 33579999999985321 234467789999999998887765544
No 27
>TIGR02412 pepN_strep_liv aminopeptidase N, Streptomyces lividans type. This family is a subset of the members of the zinc metallopeptidase family M1 (pfam01433), with a single member characterized in Streptomyces lividans 66 and designated aminopeptidase N. The spectrum of activity may differ somewhat from the aminopeptidase N clade of E. coli and most other Proteobacteria, well separated phylogenetically within the M1 family. The M1 family also includes leukotriene A-4 hydrolase/aminopeptidase (with a bifunctional active site).
Probab=63.80 E-value=4.2 Score=47.91 Aligned_cols=40 Identities=18% Similarity=0.305 Sum_probs=30.6
Q ss_pred hhhhhccchhhHhh-hcCCCCC-CCcchhhHHHHHHHHHhhc
Q 010326 404 GSILAHEMMHAWLR-LKGYPNL-RPDVEEGICQVLAHMWLES 443 (513)
Q Consensus 404 gsilaHE~~Hawl~-l~g~~~L-~~~~eEG~cq~~a~~wl~~ 443 (513)
..+||||+.|-|.. |-...-- ...+-|||..+|+++|++.
T Consensus 288 ~~viaHElAHqWFGnlVT~~wW~dlWLnEGFAty~e~~~~~~ 329 (831)
T TIGR02412 288 AGVILHEMAHMWFGDLVTMRWWNDLWLNESFAEYMGTLASAE 329 (831)
T ss_pred HHHHHHHHHHHHhCCEeccccccchhHHHHHHHHHHHHHHHh
Confidence 46999999999976 1122221 3588999999999999975
No 28
>KOG0320 consensus Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]
Probab=63.05 E-value=3.1 Score=40.06 Aligned_cols=49 Identities=18% Similarity=0.497 Sum_probs=38.0
Q ss_pred CCCcccCCCCCCCCCcc-eeecCCccccccccccc--cCcccccCCCCcCCC
Q 010326 184 PECFRCHSCNLPITDVE-FSMSGNRPYHKHCYKEQ--HHPKCDVCQNFIPTN 232 (513)
Q Consensus 184 p~CFrCs~C~~~L~~~~-F~~~dG~pYCk~CY~~~--f~pkC~~C~~~I~~~ 232 (513)
..||.|..|-....... +..+=|.+||+.|-+.. .+.+|..|++.|+..
T Consensus 129 ~~~~~CPiCl~~~sek~~vsTkCGHvFC~~Cik~alk~~~~CP~C~kkIt~k 180 (187)
T KOG0320|consen 129 EGTYKCPICLDSVSEKVPVSTKCGHVFCSQCIKDALKNTNKCPTCRKKITHK 180 (187)
T ss_pred ccccCCCceecchhhccccccccchhHHHHHHHHHHHhCCCCCCcccccchh
Confidence 35688888877776554 55677999999998864 567999999988764
No 29
>PF10460 Peptidase_M30: Peptidase M30; InterPro: IPR019501 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. 
The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This family contains metallopeptidases belonging to MEROPS peptidase family M30 (hyicolysin family, clan MA). Hyicolysin has a zinc ion which is liganded by two histidine and one glutamate residue.
Probab=62.75 E-value=5.4 Score=42.64 Aligned_cols=43 Identities=23% Similarity=0.264 Sum_probs=31.3
Q ss_pred hhhhhhhccchhh---Hhh--hcCC-CCCCCcchhhHHHHHHHHHhhcc
Q 010326 402 LTGSILAHEMMHA---WLR--LKGY-PNLRPDVEEGICQVLAHMWLESE 444 (513)
Q Consensus 402 L~gsilaHE~~Ha---wl~--l~g~-~~L~~~~eEG~cq~~a~~wl~~~ 444 (513)
.+-+|||||++|. +.+ +.|- ...+..++||+-+++.++.-...
T Consensus 138 ~~~sTlAHEfQHmInfy~~~v~~g~~~~~dtWLnE~lS~~aEdl~s~~~ 186 (366)
T PF10460_consen 138 TVYSTLAHEFQHMINFYQRGVLHGKQYAMDTWLNEMLSMSAEDLYSSKI 186 (366)
T ss_pred HHHHHHHHHHHHHHHHHHHHHhcCCCcccccHHHHHHHHHHHHHHhcCC
Confidence 3468999999996 333 2332 35788999999999999764433
No 30
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=60.77 E-value=5.7 Score=41.98 Aligned_cols=35 Identities=20% Similarity=-0.002 Sum_probs=30.6
Q ss_pred CCCCeeeeeccccccccccceeeeEeeecCchhhh
Q 010326 368 GAGYRLIDMITEPYRLIRRCEVTAILILYGLPRLL 402 (513)
Q Consensus 368 ~~G~rilei~~~p~~~~~~~eV~~Il~l~glP~~L 402 (513)
=+||.||.|||+-|+.-++++|-.||.--|=-..|
T Consensus 100 FvGDAilqvNGi~v~~c~HeevV~iLRNAGdeVtl 134 (505)
T KOG3549|consen 100 FVGDAILQVNGIYVTACPHEEVVNILRNAGDEVTL 134 (505)
T ss_pred EeeeeeEEeccEEeecCChHHHHHHHHhcCCEEEE
Confidence 37999999999999999999998888877776666
No 31
>PF06114 DUF955: Domain of unknown function (DUF955); InterPro: IPR010359 This is a family of bacterial and viral proteins with undetermined function. A conserved H-E-X-X-H motif is suggestive of a catalytic active site and shows similarity to IPR001915 from INTERPRO.; PDB: 3DTE_A 3DTK_A 3DTI_A.
Probab=60.71 E-value=5.2 Score=33.93 Aligned_cols=52 Identities=23% Similarity=0.150 Sum_probs=33.7
Q ss_pred eeEeeecCchhhhhhhhhhccchhhHhhhcCC------CCCCCcchhhHHHHHHHHHh
Q 010326 390 TAILILYGLPRLLTGSILAHEMMHAWLRLKGY------PNLRPDVEEGICQVLAHMWL 441 (513)
Q Consensus 390 ~~Il~l~glP~~L~gsilaHE~~Hawl~l~g~------~~L~~~~eEG~cq~~a~~wl 441 (513)
..|++-..++..-...+||||++|.++.-.+. ........|--+..+|...|
T Consensus 29 ~~I~in~~~~~~~~~f~laHELgH~~~~~~~~~~~~~~~~~~~~~~E~~An~fA~~lL 86 (122)
T PF06114_consen 29 PIIFINSNLSPERQRFTLAHELGHILLHHGDETFNYYLNYFFNERQEREANAFAAALL 86 (122)
T ss_dssp TEEEEESSS-HHHHHHHHHHHHHHHHHHH-HHHHHHHHHH--THHHHHHHHHHHHHHH
T ss_pred CEEEECCCCCHHHHHHHHHHHHHHHHhhhccccchhhccccchhhHHHHHHHHHHHHh
Confidence 45556667777777889999999999885542 23455566666666666554
No 32
>PF14835 zf-RING_6: zf-RING of BARD1-type protein; PDB: 1JM7_B.
Probab=60.49 E-value=8.8 Score=31.08 Aligned_cols=47 Identities=15% Similarity=0.318 Sum_probs=23.2
Q ss_pred cccCCCCCCCCCcceeecCCccccccccccccCcccccCCCCcCCCC
Q 010326 187 FRCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHPKCDVCQNFIPTNS 233 (513)
Q Consensus 187 FrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~pkC~~C~~~I~~~~ 233 (513)
.+|+.|...|..---...=...||..|-...++..|.+|+.|-...+
T Consensus 8 LrCs~C~~~l~~pv~l~~CeH~fCs~Ci~~~~~~~CPvC~~Paw~qD 54 (65)
T PF14835_consen 8 LRCSICFDILKEPVCLGGCEHIFCSSCIRDCIGSECPVCHTPAWIQD 54 (65)
T ss_dssp TS-SSS-S--SS-B---SSS--B-TTTGGGGTTTB-SSS--B-S-SS
T ss_pred cCCcHHHHHhcCCceeccCccHHHHHHhHHhcCCCCCCcCChHHHHH
Confidence 57888877665432122335789999999989999999998766543
No 33
>PF10367 Vps39_2: Vacuolar sorting protein 39 domain 2; InterPro: IPR019453 This entry represents a domain found in the vacuolar sorting protein Vps39 and transforming growth factor beta receptor-associated protein Trap1. Vps39, a component of the C-Vps complex, is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole [, ]. In Saccharomyces cerevisiae (Baker's yeast), Vps39 has been shown to stimulate nucleotide exchange []. Trap1 plays a role in the TGF-beta/activin signaling pathway. It associates with inactive heteromeric TGF-beta and activin receptor complexes, mainly through the type II receptor, and is released upon activation of signaling [, ]. The precise function of this domain has not been characterised. In Vps39 this domain is involved in localisation and in mediating the interactions with Vps11 [].
Probab=58.64 E-value=41 Score=28.42 Aligned_cols=30 Identities=17% Similarity=0.425 Sum_probs=20.1
Q ss_pred CCCcCcCCCcccccCceee-ecCccccCCCc
Q 010326 158 GYRICAGCNTEIGHGRYLS-CMEAFWHPECF 187 (513)
Q Consensus 158 g~~~C~~C~~~I~~g~~l~-alg~~wHp~CF 187 (513)
....|..|+++|..+.++. ..|..+|..|+
T Consensus 77 ~~~~C~vC~k~l~~~~f~~~p~~~v~H~~C~ 107 (109)
T PF10367_consen 77 ESTKCSVCGKPLGNSVFVVFPCGHVVHYSCI 107 (109)
T ss_pred CCCCccCcCCcCCCceEEEeCCCeEEecccc
Confidence 3457999999997655543 34566777765
No 34
>PRK14873 primosome assembly protein PriA; Provisional
Probab=58.18 E-value=7.1 Score=44.97 Aligned_cols=37 Identities=24% Similarity=0.685 Sum_probs=19.0
Q ss_pred ccCCCCCCCCCcceeecCCccccccccccccCcccccCCC
Q 010326 188 RCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHPKCDVCQN 227 (513)
Q Consensus 188 rCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~pkC~~C~~ 227 (513)
+|..|+.+|. |....+.+.|.-|-......+|..|+.
T Consensus 394 ~C~~C~~~L~---~h~~~~~l~Ch~CG~~~~p~~Cp~Cgs 430 (665)
T PRK14873 394 RCRHCTGPLG---LPSAGGTPRCRWCGRAAPDWRCPRCGS 430 (665)
T ss_pred ECCCCCCcee---EecCCCeeECCCCcCCCcCccCCCCcC
Confidence 5677776664 333344555555543333335555543
No 35
>PTZ00415 transmission-blocking target antigen s230; Provisional
Probab=56.87 E-value=3.8 Score=50.86 Aligned_cols=53 Identities=28% Similarity=0.344 Sum_probs=27.7
Q ss_pred HHHHHhhhhhhhhcCCc--cccCCcCChhhhhccCCCCCcccccchHHHHHhhhhHH
Q 010326 46 DRAIALSLVEVDQKGKK--VIENEYDSEDDLQCIKSDDSDEDELDEDEIRAIAQQEE 100 (513)
Q Consensus 46 ~~~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~e 100 (513)
-|-.|.-|+|||..+.. ++++|+ |++++....++||++++.+++.+.+.++++
T Consensus 133 ~~~~~r~l~eed~~~~~~~~~d~~~--~~~~~~~~~~~~~~e~~~~~~~~~~~~de~ 187 (2849)
T PTZ00415 133 KRRRARHLAEEDMSPRDNFVIDDDD--EDEDEDDDDEEDDEEEEEEEEEIKGFDDED 187 (2849)
T ss_pred ehHHhhccchhhcCcccccccCCcc--ccccccccccccccccccccccccCCCchh
Confidence 46678899999976654 444332 222222333334444444444555555543
No 36
>KOG2199 consensus Signal transducing adaptor protein STAM/STAM2 [Signal transduction mechanisms]
Probab=56.61 E-value=6.6 Score=42.17 Aligned_cols=27 Identities=41% Similarity=0.492 Sum_probs=22.7
Q ss_pred CCCCChhHHHHHHHhhhhhhhhcCCcc
Q 010326 37 SSGFDNEEIDRAIALSLVEVDQKGKKV 63 (513)
Q Consensus 37 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 63 (513)
.+..|.|||..||+|||.|..+.+|.+
T Consensus 161 ~~k~EeEdiaKAi~lSL~E~~~Q~k~a 187 (462)
T KOG2199|consen 161 SSKQEEEDIAKAIELSLKEQEKQKKLA 187 (462)
T ss_pred cccccHHHHHHHHHhhHHHHhhchhhc
Confidence 446889999999999999988877663
No 37
>PF05572 Peptidase_M43: Pregnancy-associated plasma protein-A; InterPro: IPR008754 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. 
The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belongs to the MEROPS peptidase M43 (cytophagalysin family, clan MA(M)), subfamily M43. The predicted active site residues for members of this family and thermolysin, the type example for clan MA, occur in the motif HEXXH. The type example of this family is the pregnancy-associated plasma protein A (PAPP-A), which cleaves insulin-like growth factor (IGF) binding protein-4 (IGFBP-4), causing a dramatic reduction in its affinity for IGF-I and -II. Through this mechanism, PAPP-A is a regulator of IGF bioactivity in several systems, including the Homo sapiens ovary and the cardiovascular system [, , , ].; PDB: 3LUN_A 3LUM_B 2J83_A 2CKI_A.
Probab=56.21 E-value=6.7 Score=36.74 Aligned_cols=48 Identities=25% Similarity=0.426 Sum_probs=24.4
Q ss_pred eeeeeccccccccccceeeeEeeecC-c----hhhhhhhhhhccchhhHhhhcC
Q 010326 372 RLIDMITEPYRLIRRCEVTAILILYG-L----PRLLTGSILAHEMMHAWLRLKG 420 (513)
Q Consensus 372 rilei~~~p~~~~~~~eV~~Il~l~g-l----P~~L~gsilaHE~~Hawl~l~g 420 (513)
.++.+...|........+..|++.+. + +..-.|.||+||++| ||-|..
T Consensus 33 ~~~G~A~~P~~~~~~~~~~~vv~~~~~l~~~~~~~~~g~TltHEvGH-~LGL~H 85 (154)
T PF05572_consen 33 SILGYAYFPWSGMSDNGTDGVVINYRYLGGNNSQYNFGKTLTHEVGH-WLGLYH 85 (154)
T ss_dssp EESEEE--TTS-GGG-SEEEEGGGSSSSTT--TTS-SSHHHHHHHHH-HTT---
T ss_pred CCCeEEeCCCCCCCCCCCCEEEEcCcccCCCCCccccccchhhhhhh-hhcccc
Confidence 34555556655334445555555431 2 233448899999999 777653
No 38
>COG2856 Predicted Zn peptidase [Amino acid transport and metabolism]
Probab=55.46 E-value=19 Score=35.72 Aligned_cols=55 Identities=27% Similarity=0.164 Sum_probs=36.5
Q ss_pred eeEeeecCchhhhhhhhhhccchhhHhhhcC------CCCC-CCcchhhHHHHHHHHHhhcc
Q 010326 390 TAILILYGLPRLLTGSILAHEMMHAWLRLKG------YPNL-RPDVEEGICQVLAHMWLESE 444 (513)
Q Consensus 390 ~~Il~l~glP~~L~gsilaHE~~Hawl~l~g------~~~L-~~~~eEG~cq~~a~~wl~~~ 444 (513)
..|++-...+...-.=|||||+.|+||.-.+ .+++ ....-|--|+.+|--.|-..
T Consensus 59 ~~I~iN~n~~~~r~rFtlAHELGH~llH~~~~~~~~~~~~~~~~~~~E~~AN~FAa~lLmP~ 120 (213)
T COG2856 59 PVIYINANNSLERKRFTLAHELGHALLHTDLNTRFDAEPTLQQDRKIEAEANAFAAELLMPE 120 (213)
T ss_pred ceEEEeCCCCHHHHHHHHHHHHhHHHhccccchhhhcccccchhHHHHHHHHHHHHHHhCCh
Confidence 3455555444444445999999999998554 1232 23556788999998887643
No 39
>PRK14559 putative protein serine/threonine phosphatase; Provisional
Probab=53.93 E-value=13 Score=42.74 Aligned_cols=11 Identities=36% Similarity=0.661 Sum_probs=7.5
Q ss_pred CcCcCCCcccc
Q 010326 160 RICAGCNTEIG 170 (513)
Q Consensus 160 ~~C~~C~~~I~ 170 (513)
.+|..|+..+.
T Consensus 2 ~~Cp~Cg~~n~ 12 (645)
T PRK14559 2 LICPQCQFENP 12 (645)
T ss_pred CcCCCCCCcCC
Confidence 36777777764
No 40
>PF10367 Vps39_2: Vacuolar sorting protein 39 domain 2; InterPro: IPR019453 This entry represents a domain found in the vacuolar sorting protein Vps39 and transforming growth factor beta receptor-associated protein Trap1. Vps39, a component of the C-Vps complex, is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole [, ]. In Saccharomyces cerevisiae (Baker's yeast), Vps39 has been shown to stimulate nucleotide exchange []. Trap1 plays a role in the TGF-beta/activin signaling pathway. It associates with inactive heteromeric TGF-beta and activin receptor complexes, mainly through the type II receptor, and is released upon activation of signaling [, ]. The precise function of this domain has not been characterised. In Vps39 this domain is involved in localisation and in mediating the interactions with Vps11 [].
Probab=53.81 E-value=9.3 Score=32.50 Aligned_cols=29 Identities=24% Similarity=0.587 Sum_probs=22.9
Q ss_pred cccCCCCCCCCCcceee-cCCccccccccc
Q 010326 187 FRCHSCNLPITDVEFSM-SGNRPYHKHCYK 215 (513)
Q Consensus 187 FrCs~C~~~L~~~~F~~-~dG~pYCk~CY~ 215 (513)
-.|..|+++|....|.. -+|..+|..|+.
T Consensus 79 ~~C~vC~k~l~~~~f~~~p~~~v~H~~C~~ 108 (109)
T PF10367_consen 79 TKCSVCGKPLGNSVFVVFPCGHVVHYSCIK 108 (109)
T ss_pred CCccCcCCcCCCceEEEeCCCeEEeccccc
Confidence 36999999999877664 567888888864
No 41
>KOG1702 consensus Nebulin repeat protein [Cytoskeleton]
Probab=51.88 E-value=4.7 Score=39.66 Aligned_cols=59 Identities=10% Similarity=0.085 Sum_probs=37.9
Q ss_pred cccccCCCCcCCCCccceeeecccccccccCCCcccCCCCccCCCCCcCCCCCceEEccCCcccccccccccc
Q 010326 220 PKCDVCQNFIPTNSAGLIEYRAHPFWLQKYCPSHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDSAI 292 (513)
Q Consensus 220 pkC~~C~~~I~~~~~g~i~~~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v 292 (513)
+.|..|++.+.+-+ .+ .....+ ||..||.|..|+..+. -.+|- -.+.++||-.+|.+.+
T Consensus 5 ~n~~~cgk~vYPvE--~v-~cldk~---------whk~cfkce~c~mtln-mKnyK-gy~kkpycn~hYpkq~ 63 (264)
T KOG1702|consen 5 CNREDCGKTVYPVE--EV-KCLDKV---------WHKQCFKCEVCGMTLN-MKNYK-GYDKKPYCNPHYPKQV 63 (264)
T ss_pred chhhhhccccccHH--HH-hhHHHH---------HHHHhheeeeccCChh-hhhcc-ccccCCCcCcccccce
Confidence 45777887665421 11 122334 4889999999998763 22332 2377999999998654
No 42
>PRK14890 putative Zn-ribbon RNA-binding protein; Provisional
Probab=51.66 E-value=12 Score=29.78 Aligned_cols=27 Identities=26% Similarity=0.548 Sum_probs=16.9
Q ss_pred CCcCcCCCcccccCc-eeeecCccccCCCcccCCCCCC
Q 010326 159 YRICAGCNTEIGHGR-YLSCMEAFWHPECFRCHSCNLP 195 (513)
Q Consensus 159 ~~~C~~C~~~I~~g~-~l~alg~~wHp~CFrCs~C~~~ 195 (513)
.++|..|+..|.... .+ =|.|..|+..
T Consensus 7 ~~~CtSCg~~i~~~~~~~----------~F~CPnCG~~ 34 (59)
T PRK14890 7 PPKCTSCGIEIAPREKAV----------KFLCPNCGEV 34 (59)
T ss_pred CccccCCCCcccCCCccC----------EeeCCCCCCe
Confidence 357888888885433 22 2677777764
No 43
>TIGR00595 priA primosomal protein N'. All proteins in this family for which functions are known are components of the primosome which is involved in replication, repair, and recombination. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).
Probab=51.55 E-value=15 Score=40.81 Aligned_cols=48 Identities=17% Similarity=0.525 Sum_probs=28.3
Q ss_pred CCcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCccccccccccc-cCcccccCCC
Q 010326 159 YRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQ-HHPKCDVCQN 227 (513)
Q Consensus 159 ~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~-f~pkC~~C~~ 227 (513)
.-.|..|+..+ +|..|+.+|. |....+.+.|.-|-... +...|..|+.
T Consensus 213 ~~~C~~Cg~~~------------------~C~~C~~~l~---~h~~~~~l~Ch~Cg~~~~~~~~Cp~C~s 261 (505)
T TIGR00595 213 NLLCRSCGYIL------------------CCPNCDVSLT---YHKKEGKLRCHYCGYQEPIPKTCPQCGS 261 (505)
T ss_pred eeEhhhCcCcc------------------CCCCCCCceE---EecCCCeEEcCCCcCcCCCCCCCCCCCC
Confidence 34677777765 6777877764 33455666666664332 3335666654
No 44
>PF13699 DUF4157: Domain of unknown function (DUF4157)
Probab=51.20 E-value=6.4 Score=32.84 Aligned_cols=16 Identities=38% Similarity=0.677 Sum_probs=12.9
Q ss_pred hhhhccchhhHhhhcC
Q 010326 405 SILAHEMMHAWLRLKG 420 (513)
Q Consensus 405 silaHE~~Hawl~l~g 420 (513)
.+||||++|+++.-.|
T Consensus 63 ~llaHEl~Hv~Qq~~g 78 (79)
T PF13699_consen 63 ALLAHELAHVVQQRRG 78 (79)
T ss_pred hhHhHHHHHHHhhccC
Confidence 5899999999976443
No 45
>smart00504 Ubox Modified RING finger domain. Modified RING finger domain, without the full complement of Zn2+-binding ligands. Probable involvement in E2-dependent ubiquitination.
Probab=51.13 E-value=11 Score=29.05 Aligned_cols=45 Identities=13% Similarity=0.214 Sum_probs=32.8
Q ss_pred cccCCCCCCCCCcceeecCCcccccccccccc--CcccccCCCCcCCC
Q 010326 187 FRCHSCNLPITDVEFSMSGNRPYHKHCYKEQH--HPKCDVCQNFIPTN 232 (513)
Q Consensus 187 FrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f--~pkC~~C~~~I~~~ 232 (513)
|.|..|+..+.+ ......|..||+.|..+.+ ..+|..|++.+...
T Consensus 2 ~~Cpi~~~~~~~-Pv~~~~G~v~~~~~i~~~~~~~~~cP~~~~~~~~~ 48 (63)
T smart00504 2 FLCPISLEVMKD-PVILPSGQTYERRAIEKWLLSHGTDPVTGQPLTHE 48 (63)
T ss_pred cCCcCCCCcCCC-CEECCCCCEEeHHHHHHHHHHCCCCCCCcCCCChh
Confidence 678888888876 4445678999998876532 45788888877543
No 46
>PF10263 SprT-like: SprT-like family; InterPro: IPR006640 This is a family of uncharacterised bacterial proteins which includes Escherichia coli SprT (P39902 from SWISSPROT). SprT is described as a regulator of bolA gene in stationary phase []. The majority of members contain the metallopeptidase zinc binding signature which has a HExxH motif, however there is no evidence for them being metallopeptidases.
Probab=50.85 E-value=7.9 Score=35.53 Aligned_cols=23 Identities=35% Similarity=0.239 Sum_probs=19.1
Q ss_pred chhhhhhhhhhccchhhHhhhcC
Q 010326 398 LPRLLTGSILAHEMMHAWLRLKG 420 (513)
Q Consensus 398 lP~~L~gsilaHE~~Hawl~l~g 420 (513)
.|...+-.||.|||.|+|+.+.+
T Consensus 55 ~~~~~~~~tL~HEm~H~~~~~~~ 77 (157)
T PF10263_consen 55 NPEEELIDTLLHEMAHAAAYVFG 77 (157)
T ss_pred hHHHHHHHHHHHHHHHHHhhhcc
Confidence 45666778999999999998774
No 47
>PF14891 Peptidase_M91: Effector protein
Probab=50.75 E-value=7.3 Score=36.99 Aligned_cols=22 Identities=32% Similarity=0.617 Sum_probs=17.7
Q ss_pred chhhhhhhhhhccchhhHhhhcCCC
Q 010326 398 LPRLLTGSILAHEMMHAWLRLKGYP 422 (513)
Q Consensus 398 lP~~L~gsilaHE~~Hawl~l~g~~ 422 (513)
.|-.+ +|+|||.|||=.++|--
T Consensus 101 ~~p~v---~L~HEL~HA~~~~~Gt~ 122 (174)
T PF14891_consen 101 RPPFV---VLYHELIHAYDYMNGTM 122 (174)
T ss_pred HHHHH---HHHHHHHHHHHHHCCCC
Confidence 34455 99999999999999853
No 48
>COG4357 Zinc finger domain containing protein (CHY type) [Function unknown]
Probab=50.59 E-value=2.7 Score=36.46 Aligned_cols=50 Identities=24% Similarity=0.547 Sum_probs=31.9
Q ss_pred cCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCcccc
Q 010326 161 ICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYH 210 (513)
Q Consensus 161 ~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYC 210 (513)
.|..|+..+..-.+..-.-..+++.+..|.+|...|+-.+|...+.-|||
T Consensus 37 aCy~CHdel~~Hpf~p~~~~~~~~~~iiCGvC~~~LT~~EY~~~~~Cp~C 86 (105)
T COG4357 37 ACYHCHDELEDHPFEPWGLQEFNPKAIICGVCRKLLTRAEYGMCGSCPYC 86 (105)
T ss_pred hHHHHHhHHhcCCCccCChhhcCCccEEhhhhhhhhhHHHHhhcCCCCCc
Confidence 46667776654455544446778888888888888876666544443443
No 49
>KOG0320 consensus Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]
Probab=50.58 E-value=6.6 Score=37.84 Aligned_cols=49 Identities=24% Similarity=0.543 Sum_probs=37.6
Q ss_pred CCCCccCCCCCcCCCCCceEEccCCcccccccccccccCCCCCcccchHH
Q 010326 256 DGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDSAIMDTHECQPLYLEI 305 (513)
Q Consensus 256 ~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v~~t~~C~~c~~~I 305 (513)
..||.|-.|-.....-.. +.-+=|.+||..|...++..+..|--|.+.|
T Consensus 129 ~~~~~CPiCl~~~sek~~-vsTkCGHvFC~~Cik~alk~~~~CP~C~kkI 177 (187)
T KOG0320|consen 129 EGTYKCPICLDSVSEKVP-VSTKCGHVFCSQCIKDALKNTNKCPTCRKKI 177 (187)
T ss_pred ccccCCCceecchhhccc-cccccchhHHHHHHHHHHHhCCCCCCccccc
Confidence 467888888765432211 3345799999999999999999999999888
No 50
>PHA02456 zinc metallopeptidase motif-containing protein
Probab=50.52 E-value=6.1 Score=35.34 Aligned_cols=36 Identities=31% Similarity=0.490 Sum_probs=21.3
Q ss_pred hhhhhccchhhHhhhc-CCCCCCCcchhhHHHHHHHHHhhccc
Q 010326 404 GSILAHEMMHAWLRLK-GYPNLRPDVEEGICQVLAHMWLESEI 445 (513)
Q Consensus 404 gsilaHE~~Hawl~l~-g~~~L~~~~eEG~cq~~a~~wl~~~~ 445 (513)
.-|||||+.|+|+.-. |+ .-|. -.-.|+-.|=-.+.
T Consensus 80 ~~TL~HEL~H~WQ~RsYG~--i~PI----TY~F~~~~WE~~~P 116 (141)
T PHA02456 80 RDTLAHELNHAWQFRTYGL--VQPI----TYAFSAKVWEPEVP 116 (141)
T ss_pred HHHHHHHHHHHHhhhccce--eeee----ehhhhHhhcCCCCC
Confidence 4599999999998722 43 2221 12356667743333
No 51
>PF05299 Peptidase_M61: M61 glycyl aminopeptidase; InterPro: IPR007963 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. 
The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belongs to the MEROPS peptidase family M61 (glycyl aminopeptidase family, clan MA(E)). The predicted active site residues for members of this family and thermolysin, the type example for clan MA, occur in the motif HEXXH. The type example is glycyl aminopeptidase from Sphingomonas capsulata.
Probab=50.32 E-value=7.5 Score=35.24 Aligned_cols=40 Identities=30% Similarity=0.506 Sum_probs=28.5
Q ss_pred hhhhhccchhhHh--hhcC---------CCCCC--CcchhhHHHHHHHHHhhc
Q 010326 404 GSILAHEMMHAWL--RLKG---------YPNLR--PDVEEGICQVLAHMWLES 443 (513)
Q Consensus 404 gsilaHE~~Hawl--~l~g---------~~~L~--~~~eEG~cq~~a~~wl~~ 443 (513)
-.++|||+-|+|- |+.. -|+.. +.|-||+-+.++.+-|-+
T Consensus 5 l~l~sHEffH~WnvkrirP~~l~p~dy~~~~~t~~LWv~EG~T~Y~~~l~l~R 57 (122)
T PF05299_consen 5 LGLLSHEFFHSWNVKRIRPAELGPFDYEKPNYTELLWVYEGFTSYYGDLLLVR 57 (122)
T ss_pred hhhhhhhccccccceEeccccccCCCCCCCCCCCCEeeeeCcHHHHHHHHHHH
Confidence 4689999999995 3332 11221 278899999999988653
No 52
>PF00645 zf-PARP: Poly(ADP-ribose) polymerase and DNA-Ligase Zn-finger region; InterPro: IPR001510 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents PARP (Poly(ADP) polymerase) type zinc finger domains. NAD(+) ADP-ribosyltransferase (2.4.2.30 from EC) [, ] is a eukaryotic enzyme that catalyses the covalent attachment of ADP-ribose units from NAD(+) to various nuclear acceptor proteins. This post-translational modification of nuclear proteins is dependent on DNA. 
It appears to be involved in the regulation of various important cellular processes such as differentiation, proliferation and tumour transformation as well as in the regulation of the molecular events involved in the recovery of the cell from DNA damage. Structurally, NAD(+) ADP-ribosyltransferase consists of three distinct domains: an N-terminal zinc-dependent DNA-binding domain, a central automodification domain and a C-terminal NAD-binding domain. The DNA-binding region contains a pair of PARP-type zinc finger domains which have been shown to bind DNA in a zinc-dependent manner. The PARP-type zinc finger domains seem to bind specifically to single-stranded DNA and to act as a DNA nick sensor. DNA ligase III [] contains, in its N-terminal section, a single copy of a zinc finger highly similar to those of PARP. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0003677 DNA binding, 0008270 zinc ion binding; PDB: 1UW0_A 3OD8_D 3ODA_A 4AV1_A 2DMJ_A 4DQY_D 2L30_A 2CS2_A 2L31_A 3ODE_B ....
Probab=47.89 E-value=3.1 Score=34.43 Aligned_cols=19 Identities=21% Similarity=0.490 Sum_probs=13.0
Q ss_pred ccccCcccccCCCCcCCCC
Q 010326 215 KEQHHPKCDVCQNFIPTNS 233 (513)
Q Consensus 215 ~~~f~pkC~~C~~~I~~~~ 233 (513)
.+....+|.+|++.|..+.
T Consensus 3 Aks~Ra~Ck~C~~~I~kg~ 21 (82)
T PF00645_consen 3 AKSGRAKCKGCKKKIAKGE 21 (82)
T ss_dssp -SSSTEBETTTSCBE-TTS
T ss_pred CCCCCccCcccCCcCCCCC
Confidence 3444568999999998765
No 53
>PF04889 Cwf_Cwc_15: Cwf15/Cwc15 cell cycle control protein; InterPro: IPR006973 This family represents Cwf15/Cwc15 (from Schizosaccharomyces pombe and Saccharomyces cerevisiae respectively) and their homologues. The function of these proteins is unknown, but they form part of the spliceosome and are thus thought to be involved in mRNA splicing [].; GO: 0000398 nuclear mRNA splicing, via spliceosome, 0005681 spliceosomal complex
Probab=47.04 E-value=45 Score=33.81 Aligned_cols=6 Identities=17% Similarity=0.180 Sum_probs=2.2
Q ss_pred HHHHHH
Q 010326 104 RAKAQQ 109 (513)
Q Consensus 104 ~~~~~~ 109 (513)
|.+|.+
T Consensus 155 LekIKk 160 (244)
T PF04889_consen 155 LEKIKK 160 (244)
T ss_pred HHHHHH
Confidence 333333
No 54
>TIGR02411 leuko_A4_hydro leukotriene A-4 hydrolase/aminopeptidase. Members of this family represent a distinctive subset within the zinc metallopeptidase family M1 (pfam01433). The majority of the members of pfam01433 are aminopeptidases, but the sequences in this family for which the function is known are leukotriene A-4 hydrolase. A dual epoxide hydrolase and aminopeptidase activity at the same active site is indicated. The physiological substrate for aminopeptidase activity is not known.
Probab=46.87 E-value=10 Score=43.05 Aligned_cols=38 Identities=21% Similarity=0.285 Sum_probs=29.1
Q ss_pred hhhhccchhhHhhhcC--CCCC-CCcchhhHHHHHHHHHhhc
Q 010326 405 SILAHEMMHAWLRLKG--YPNL-RPDVEEGICQVLAHMWLES 443 (513)
Q Consensus 405 silaHE~~Hawl~l~g--~~~L-~~~~eEG~cq~~a~~wl~~ 443 (513)
.+||||++|-|.. |- ...= ...+-|||+-+|.+++++.
T Consensus 281 ~viaHElAHqWfG-NlVT~~~W~d~WLnEGfaty~e~~~~~~ 321 (601)
T TIGR02411 281 DVIAHELAHSWSG-NLVTNCSWEHFWLNEGWTVYLERRIVGR 321 (601)
T ss_pred hhHHHHHHhhccC-ceeecCCchHHHHHhhHHHHHHHHHHHH
Confidence 5999999999987 32 2222 3478999999999987763
No 55
>PF01431 Peptidase_M13: Peptidase family M13 This is family M13 in the peptidase classification. ; InterPro: IPR018497 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. 
The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belongs to the MEROPS peptidase family M13 (neprilysin family, clan MA(E)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA, and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH []. M13 peptidases are well-studied proteases found in a wide range of organisms including mammals and bacteria. In mammals they participate in processes such as cardiovascular development, blood-pressure regulation, nervous control of respiration, and regulation of the function of neuropeptides in the central nervous system. In bacteria they may be used for digestion of milk [, ]. The family includes eukaryotic and prokaryotic oligopeptidases, as well as some of the proteins responsible for the molecular basis of the blood group antigens e.g. Kell []. Neprilysin (3.4.24.11 from EC) is another member of this group; it is variously known as common acute lymphoblastic leukemia antigen (CALLA), enkephalinase (gp100) and neutral endopeptidase metalloendopeptidase (NEP). 
It is a plasma membrane-bound mammalian enzyme that is able to digest biologically-active peptides, including enkephalins []. The zinc ligands of neprilysin are known and are analogous to those in thermolysin, a related peptidase [, ]. Neprilysins, like thermolysin, are inhibited by phosphoramidon, which appears to selectively inhibit this family in mammals. The enzymes are all oligopeptidases, digesting oligo- and polypeptides, but not proteins []. Neprilysin consists of a short cytoplasmic domain, a membrane-spanning region and a large extracellular domain. The cytoplasmic domain contains a conformationally-restrained octapeptide, which is thought to act as a stop transfer sequence that prevents proteolysis and secretion [, ].; GO: 0004222 metalloendopeptidase activity, 0006508 proteolysis; PDB: 2QPJ_A 1R1I_A 1R1J_A 1Y8J_A 1R1H_A 1DMT_A 2YB9_A 3DWB_A 3ZUK_A.
Probab=46.11 E-value=6.8 Score=37.66 Aligned_cols=15 Identities=60% Similarity=0.815 Sum_probs=13.0
Q ss_pred hhhhhhccchhhHhh
Q 010326 403 TGSILAHEMMHAWLR 417 (513)
Q Consensus 403 ~gsilaHE~~Hawl~ 417 (513)
+|+|||||+||+.-.
T Consensus 36 lG~ilahel~hafd~ 50 (206)
T PF01431_consen 36 LGFILAHELMHAFDP 50 (206)
T ss_dssp HHHHHHHHHHHCTST
T ss_pred HHHHHHHHHHHHHHH
Confidence 499999999999754
No 56
>PRK04023 DNA polymerase II large subunit; Validated
Probab=46.10 E-value=23 Score=42.47 Aligned_cols=55 Identities=24% Similarity=0.373 Sum_probs=35.4
Q ss_pred CCCCCcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCccccccccccccCcccccCCCCcCCCC
Q 010326 156 FSGYRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHPKCDVCQNFIPTNS 233 (513)
Q Consensus 156 ~~g~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~pkC~~C~~~I~~~~ 233 (513)
+.+.+.|..|+... .=|+|..|+.. .....+|..|=.......|..|+..+....
T Consensus 623 EVg~RfCpsCG~~t---------------~~frCP~CG~~--------Te~i~fCP~CG~~~~~y~CPKCG~El~~~s 677 (1121)
T PRK04023 623 EIGRRKCPSCGKET---------------FYRRCPFCGTH--------TEPVYRCPRCGIEVEEDECEKCGREPTPYS 677 (1121)
T ss_pred cccCccCCCCCCcC---------------CcccCCCCCCC--------CCcceeCccccCcCCCCcCCCCCCCCCccc
Confidence 34556677777763 11778888775 123457888866555567888887776643
No 57
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=44.16 E-value=5.8 Score=30.87 Aligned_cols=26 Identities=12% Similarity=0.129 Sum_probs=19.6
Q ss_pred ccCCCCeeeeeccccccccccceeee
Q 010326 366 RIGAGYRLIDMITEPYRLIRRCEVTA 391 (513)
Q Consensus 366 ~~~~G~rilei~~~p~~~~~~~eV~~ 391 (513)
.|.+||.|+.||++++......++..
T Consensus 30 gl~~GD~I~~Ing~~v~~~~~~~~~~ 55 (70)
T cd00136 30 GLQAGDVILAVNGTDVKNLTLEDVAE 55 (70)
T ss_pred CCCCCCEEEEECCEECCCCCHHHHHH
Confidence 47899999999999987664444333
No 58
>PRK12495 hypothetical protein; Provisional
Probab=43.68 E-value=55 Score=32.72 Aligned_cols=27 Identities=30% Similarity=0.580 Sum_probs=18.5
Q ss_pred CcCcCCCcccccCceeeecCccccCCCcccCCCCCCCC
Q 010326 160 RICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPIT 197 (513)
Q Consensus 160 ~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~ 197 (513)
..|..|+.+|. -++.|-+|..|+....
T Consensus 43 ~hC~~CG~PIp-----------a~pG~~~Cp~CQ~~~~ 69 (226)
T PRK12495 43 AHCDECGDPIF-----------RHDGQEFCPTCQQPVT 69 (226)
T ss_pred hhcccccCccc-----------CCCCeeECCCCCCccc
Confidence 47888999884 1366777777775443
No 59
>COG2191 Formylmethanofuran dehydrogenase subunit E [Energy production and conversion]
Probab=42.78 E-value=11 Score=37.03 Aligned_cols=30 Identities=20% Similarity=0.499 Sum_probs=17.5
Q ss_pred ccCCCCCCCCCcceeecCCccccccccccc
Q 010326 188 RCHSCNLPITDVEFSMSGNRPYHKHCYKEQ 217 (513)
Q Consensus 188 rCs~C~~~L~~~~F~~~dG~pYCk~CY~~~ 217 (513)
+|..|+..+....-...+|++.|+.||...
T Consensus 174 ~C~kCGE~~~e~~~~~~ng~~vC~~C~~~~ 203 (206)
T COG2191 174 RCSKCGELFMEPRAVVLNGKPVCKPCAEKK 203 (206)
T ss_pred eccccCcccccchhhhcCCceecccccccc
Confidence 344444443333333468889999998753
No 60
>PF11781 RRN7: RNA polymerase I-specific transcription initiation factor Rrn7; InterPro: IPR021752 Rrn7 is a transcription binding factor that associates strongly with both Rrn6 and Rrn11 to form a complex which itself binds the TATA-binding protein and is required for transcription by the core domain of the RNA PolI promoter [],[].
Probab=42.53 E-value=14 Score=26.41 Aligned_cols=25 Identities=20% Similarity=0.312 Sum_probs=17.9
Q ss_pred cccCCCCCCCCCcceeecCCccccccccc
Q 010326 187 FRCHSCNLPITDVEFSMSGNRPYHKHCYK 215 (513)
Q Consensus 187 FrCs~C~~~L~~~~F~~~dG~pYCk~CY~ 215 (513)
+.|..|+.. .|...+|..||..|-.
T Consensus 9 ~~C~~C~~~----~~~~~dG~~yC~~cG~ 33 (36)
T PF11781_consen 9 EPCPVCGSR----WFYSDDGFYYCDRCGH 33 (36)
T ss_pred CcCCCCCCe----EeEccCCEEEhhhCce
Confidence 458888764 4566899999977743
No 61
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=41.95 E-value=6.1 Score=44.82 Aligned_cols=32 Identities=13% Similarity=0.258 Sum_probs=25.9
Q ss_pred cccccccCCCCeeeeeccccccccccceeeeE
Q 010326 361 VLRRPRIGAGYRLIDMITEPYRLIRRCEVTAI 392 (513)
Q Consensus 361 ~~~~~~~~~G~rilei~~~p~~~~~~~eV~~I 392 (513)
|.-|.-+++|+||+||||+.|-.++++.|-.+
T Consensus 768 IAERGGVRVGHRIIEINgQSVVA~pHekIV~l 799 (829)
T KOG3605|consen 768 IAERGGVRVGHRIIEINGQSVVATPHEKIVQL 799 (829)
T ss_pred chhccCceeeeeEEEECCceEEeccHHHHHHH
Confidence 44566788999999999999998888876443
No 62
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=41.37 E-value=4.7 Score=46.10 Aligned_cols=37 Identities=14% Similarity=0.176 Sum_probs=28.4
Q ss_pred cccccCCCCeeeeeccccccccccceeeeEeeecCch
Q 010326 363 RRPRIGAGYRLIDMITEPYRLIRRCEVTAILILYGLP 399 (513)
Q Consensus 363 ~~~~~~~G~rilei~~~p~~~~~~~eV~~Il~l~glP 399 (513)
+..+|++||.|+||||++-+...+..-..||...|.-
T Consensus 938 rdGrm~VGDqi~eINGesTkgmtH~rAIelIk~gg~~ 974 (984)
T KOG3209|consen 938 RDGRMRVGDQITEINGESTKGMTHDRAIELIKQGGRR 974 (984)
T ss_pred ccCceeecceEEEecCcccCCCcHHHHHHHHHhCCeE
Confidence 6788999999999999999988776644444444443
No 63
>COG1645 Uncharacterized Zn-finger containing protein [General function prediction only]
Probab=41.16 E-value=15 Score=33.85 Aligned_cols=22 Identities=23% Similarity=0.709 Sum_probs=16.1
Q ss_pred ccCCCCCCCCCcceeecCCcccccccc
Q 010326 188 RCHSCNLPITDVEFSMSGNRPYHKHCY 214 (513)
Q Consensus 188 rCs~C~~~L~~~~F~~~dG~pYCk~CY 214 (513)
.|..|+.|| |. ++|.+||.-|-
T Consensus 30 hCp~Cg~PL----F~-KdG~v~CPvC~ 51 (131)
T COG1645 30 HCPKCGTPL----FR-KDGEVFCPVCG 51 (131)
T ss_pred hCcccCCcc----ee-eCCeEECCCCC
Confidence 477888877 33 78999987663
No 64
>COG1198 PriA Primosomal protein N' (replication factor Y) - superfamily II helicase [DNA replication, recombination, and repair]
Probab=40.78 E-value=29 Score=40.48 Aligned_cols=49 Identities=18% Similarity=0.487 Sum_probs=30.0
Q ss_pred CCcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCccccccccccc-cCcccccCCCC
Q 010326 159 YRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQ-HHPKCDVCQNF 228 (513)
Q Consensus 159 ~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~-f~pkC~~C~~~ 228 (513)
.-.|..|+... +|..|+.+|+ |....+.+.|.-|-... .-..|..|+..
T Consensus 435 ~l~C~~Cg~v~------------------~Cp~Cd~~lt---~H~~~~~L~CH~Cg~~~~~p~~Cp~Cgs~ 484 (730)
T COG1198 435 LLLCRDCGYIA------------------ECPNCDSPLT---LHKATGQLRCHYCGYQEPIPQSCPECGSE 484 (730)
T ss_pred eeecccCCCcc------------------cCCCCCcceE---EecCCCeeEeCCCCCCCCCCCCCCCCCCC
Confidence 34688887654 7888888775 33445677776665442 22356666654
No 65
>PF10083 DUF2321: Uncharacterized protein conserved in bacteria (DUF2321); InterPro: IPR016891 This entry is represented by Bacteriophage 'Lactobacillus prophage Lj928', Orf-Ljo1454. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.
Probab=40.65 E-value=12 Score=35.26 Aligned_cols=55 Identities=16% Similarity=0.398 Sum_probs=35.5
Q ss_pred CccccccccccccCcccccCCCCcCCCCc--cceeeecccccccccCCCcccCCCCccCCCCCcCCCCCc
Q 010326 206 NRPYHKHCYKEQHHPKCDVCQNFIPTNSA--GLIEYRAHPFWLQKYCPSHERDGTPRCCSCERMEPRDTK 273 (513)
Q Consensus 206 G~pYCk~CY~~~f~pkC~~C~~~I~~~~~--g~i~~~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~ 273 (513)
..-||.+|-.+ .-..|..|+.+|.+... |++.+..+ |+. =-.|..|++..|+...
T Consensus 27 ~~~fC~kCG~~-tI~~Cp~C~~~IrG~y~v~gv~~~g~~-----------~~~-PsYC~~CGkpyPWt~~ 83 (158)
T PF10083_consen 27 REKFCSKCGAK-TITSCPNCSTPIRGDYHVEGVFGLGGH-----------YEA-PSYCHNCGKPYPWTEN 83 (158)
T ss_pred HHHHHHHhhHH-HHHHCcCCCCCCCCceecCCeeeeCCC-----------CCC-ChhHHhCCCCCchHHH
Confidence 35689888544 45689999999998642 44444322 121 1238899998886543
No 66
>PF13920 zf-C3HC4_3: Zinc finger, C3HC4 type (RING finger); PDB: 2YHN_B 2YHO_G 3T6P_A 2CSY_A 2VJE_B 2VJF_B 2HDP_B 2EA5_A 2ECG_A 3EB5_A ....
Probab=40.63 E-value=15 Score=27.41 Aligned_cols=44 Identities=20% Similarity=0.437 Sum_probs=29.4
Q ss_pred cccCCCCCCCCCcceeecCCcc-ccccccccc--cCcccccCCCCcCC
Q 010326 187 FRCHSCNLPITDVEFSMSGNRP-YHKHCYKEQ--HHPKCDVCQNFIPT 231 (513)
Q Consensus 187 FrCs~C~~~L~~~~F~~~dG~p-YCk~CY~~~--f~pkC~~C~~~I~~ 231 (513)
+.|..|........+. .=|.. +|..|+.+. ...+|..|.++|..
T Consensus 3 ~~C~iC~~~~~~~~~~-pCgH~~~C~~C~~~~~~~~~~CP~Cr~~i~~ 49 (50)
T PF13920_consen 3 EECPICFENPRDVVLL-PCGHLCFCEECAERLLKRKKKCPICRQPIES 49 (50)
T ss_dssp SB-TTTSSSBSSEEEE-TTCEEEEEHHHHHHHHHTTSBBTTTTBB-SE
T ss_pred CCCccCCccCCceEEe-CCCChHHHHHHhHHhcccCCCCCcCChhhcC
Confidence 3566777766554444 34566 999998876 56799999998863
No 67
>PF13240 zinc_ribbon_2: zinc-ribbon domain
Probab=40.01 E-value=17 Score=23.28 Aligned_cols=8 Identities=50% Similarity=1.344 Sum_probs=4.1
Q ss_pred CcCCCccc
Q 010326 162 CAGCNTEI 169 (513)
Q Consensus 162 C~~C~~~I 169 (513)
|..|+..|
T Consensus 2 Cp~CG~~~ 9 (23)
T PF13240_consen 2 CPNCGAEI 9 (23)
T ss_pred CcccCCCC
Confidence 44555555
No 68
>COG2888 Predicted Zn-ribbon RNA-binding protein with a function in translation [Translation, ribosomal structure and biogenesis]
Probab=39.81 E-value=15 Score=29.21 Aligned_cols=37 Identities=22% Similarity=0.429 Sum_probs=23.6
Q ss_pred CccCCCCCcCCCCCceEEccCCcccccccccccccCCCCCcccch
Q 010326 259 PRCCSCERMEPRDTKYLSLDDGRKLCLECLDSAIMDTHECQPLYL 303 (513)
Q Consensus 259 F~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v~~t~~C~~c~~ 303 (513)
..|.+|+..|..++.++ ...|.+|-+.. .-+|+-|.+
T Consensus 10 ~~CtSCg~~i~p~e~~v-----~F~CPnCGe~~---I~Rc~~CRk 46 (61)
T COG2888 10 PVCTSCGREIAPGETAV-----KFPCPNCGEVE---IYRCAKCRK 46 (61)
T ss_pred ceeccCCCEeccCCcee-----EeeCCCCCcee---eehhhhHHH
Confidence 56788888776666665 45677776553 345666653
No 69
>PHA03378 EBNA-3B; Provisional
Probab=38.10 E-value=19 Score=41.24 Aligned_cols=36 Identities=28% Similarity=0.347 Sum_probs=16.9
Q ss_pred ccccCCCCCcccc-cccc-cCCCCCCCChhHHHHHHHhhh
Q 010326 16 YHARYGDDRTWDE-RRYS-AADDSSGFDNEEIDRAIALSL 53 (513)
Q Consensus 16 ~~~~~~~~~~~~~-~~~~-~~~~~~~~~~~~~~~~~~~~~ 53 (513)
-.|++.-..-|.. .+.. +-+ .....+||+-|.+..+
T Consensus 303 ~iGt~kpt~PWl~a~P~e~pYh--rpLtsedi~~AfarGq 340 (991)
T PHA03378 303 CTGRPRPTKPWLRAHPVAVPYD--DPLTSEEIDLAYARGL 340 (991)
T ss_pred hcCCCCCCCcccCCCCcccccc--ccchHHHHHHHHHHHH
Confidence 3455544555762 2221 111 2344677776655444
No 70
>PF14471 DUF4428: Domain of unknown function (DUF4428)
Probab=37.33 E-value=16 Score=28.06 Aligned_cols=30 Identities=23% Similarity=0.679 Sum_probs=21.9
Q ss_pred ccCCCCCcCCCCCceEEccCCccccccccccc
Q 010326 260 RCCSCERMEPRDTKYLSLDDGRKLCLECLDSA 291 (513)
Q Consensus 260 ~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~ 291 (513)
.|..|+.-++.-.+ +-+.|| .+|..|+.++
T Consensus 1 ~C~iCg~kigl~~~-~k~~DG-~iC~~C~~Kl 30 (51)
T PF14471_consen 1 KCAICGKKIGLFKR-FKIKDG-YICKDCLKKL 30 (51)
T ss_pred CCCccccccccccc-eeccCc-cchHHHHHHh
Confidence 48889987765444 446788 6899998874
No 71
>KOG1832 consensus HIV-1 Vpr-binding protein [Cell cycle control, cell division, chromosome partitioning]
Probab=36.83 E-value=14 Score=43.48 Aligned_cols=24 Identities=29% Similarity=0.544 Sum_probs=12.2
Q ss_pred cCCcCChhhhhccCCCCCcccccc
Q 010326 65 ENEYDSEDDLQCIKSDDSDEDELD 88 (513)
Q Consensus 65 ~~~~~~~~~~~~~~~~~~~~~~~~ 88 (513)
++|+++|||++-+..||||++|+|
T Consensus 1404 ~dd~DeeeD~e~Ed~dEddd~edd 1427 (1516)
T KOG1832|consen 1404 DDDSDEEEDDETEDEDEDDDEEDD 1427 (1516)
T ss_pred ccccCccccchhhccccccccccc
Confidence 455555555555554444444433
No 72
>PF12773 DZR: Double zinc ribbon
Probab=36.65 E-value=35 Score=25.31 Aligned_cols=8 Identities=38% Similarity=0.953 Sum_probs=3.5
Q ss_pred cccccccc
Q 010326 282 KLCLECLD 289 (513)
Q Consensus 282 ~yC~~Cy~ 289 (513)
.+|..|-.
T Consensus 30 ~~C~~Cg~ 37 (50)
T PF12773_consen 30 KICPNCGA 37 (50)
T ss_pred CCCcCCcC
Confidence 44444443
No 73
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=35.84 E-value=9.9 Score=44.83 Aligned_cols=25 Identities=20% Similarity=0.323 Sum_probs=20.3
Q ss_pred cccCCCCeeeeecccccccccccee
Q 010326 365 PRIGAGYRLIDMITEPYRLIRRCEV 389 (513)
Q Consensus 365 ~~~~~G~rilei~~~p~~~~~~~eV 389 (513)
.++.|||.|+-||++||+..+++.|
T Consensus 90 GKL~PGDQIl~vN~Epv~daprerv 114 (1298)
T KOG3552|consen 90 GKLQPGDQILAVNGEPVKDAPRERV 114 (1298)
T ss_pred ccccCCCeEEEecCcccccccHHHH
Confidence 4678999999999999996665543
No 74
>PHA02608 67 prohead core protein; Provisional
Probab=35.76 E-value=28 Score=29.11 Aligned_cols=17 Identities=24% Similarity=0.120 Sum_probs=8.8
Q ss_pred HHHHHHHhhhhhhhhcC
Q 010326 44 EIDRAIALSLVEVDQKG 60 (513)
Q Consensus 44 ~~~~~~~~~~~~~~~~~ 60 (513)
+.--+||.|+.=|--.+
T Consensus 34 e~k~eIA~sv~iEGEe~ 50 (80)
T PHA02608 34 EEKVEIARSVMIEGEEP 50 (80)
T ss_pred HHHHHHHHHHhhcCCCC
Confidence 33456777764443333
No 75
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=35.38 E-value=6.5 Score=31.40 Aligned_cols=24 Identities=13% Similarity=0.155 Sum_probs=18.2
Q ss_pred cccCCCCeeeeeccccccccccce
Q 010326 365 PRIGAGYRLIDMITEPYRLIRRCE 388 (513)
Q Consensus 365 ~~~~~G~rilei~~~p~~~~~~~e 388 (513)
..|++||.|+.||+.++......+
T Consensus 42 ~gl~~GD~I~~ing~~i~~~~~~~ 65 (82)
T cd00992 42 GGLRVGDRILEVNGVSVEGLTHEE 65 (82)
T ss_pred CCCCCCCEEEEECCEEcCccCHHH
Confidence 468899999999999887433333
No 76
>KOG1813 consensus Predicted E3 ubiquitin ligase [Posttranslational modification, protein turnover, chaperones]
Probab=35.11 E-value=21 Score=37.11 Aligned_cols=45 Identities=18% Similarity=0.355 Sum_probs=32.0
Q ss_pred cccCCCCCCCCCcceeecCCcccccccccccc--CcccccCCCCcCCC
Q 010326 187 FRCHSCNLPITDVEFSMSGNRPYHKHCYKEQH--HPKCDVCQNFIPTN 232 (513)
Q Consensus 187 FrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f--~pkC~~C~~~I~~~ 232 (513)
|.|..|......- ....=+..||+.|....| +++|.+|++.+.+.
T Consensus 242 f~c~icr~~f~~p-Vvt~c~h~fc~~ca~~~~qk~~~c~vC~~~t~g~ 288 (313)
T KOG1813|consen 242 FKCFICRKYFYRP-VVTKCGHYFCEVCALKPYQKGEKCYVCSQQTHGS 288 (313)
T ss_pred ccccccccccccc-hhhcCCceeehhhhccccccCCcceecccccccc
Confidence 7788888766542 222456788999887654 47999999988763
No 77
>PF09943 DUF2175: Uncharacterized protein conserved in archaea (DUF2175); InterPro: IPR018686 This family of various hypothetical archaeal proteins has no known function.
Probab=35.10 E-value=10 Score=33.32 Aligned_cols=28 Identities=32% Similarity=0.593 Sum_probs=14.5
Q ss_pred cCcCCCcccccCceeeec-CccccCCCcc
Q 010326 161 ICAGCNTEIGHGRYLSCM-EAFWHPECFR 188 (513)
Q Consensus 161 ~C~~C~~~I~~g~~l~al-g~~wHp~CFr 188 (513)
.|.-|+++|+.|+.+++. +..-|-.||+
T Consensus 4 kC~iCg~~I~~gqlFTF~~kG~VH~~C~~ 32 (101)
T PF09943_consen 4 KCYICGKPIYEGQLFTFTKKGPVHYECFR 32 (101)
T ss_pred EEEecCCeeeecceEEEecCCcEeHHHHH
Confidence 466666666666555432 2444444443
No 78
>KOG0490 consensus Transcription factor, contains HOX domain [General function prediction only]
Probab=34.15 E-value=9.6 Score=36.95 Aligned_cols=52 Identities=15% Similarity=0.261 Sum_probs=41.6
Q ss_pred cccCCCCccCCCCCcCCCCCceEEccCCcccccccccccccCCCCCcccchHH
Q 010326 253 HERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDSAIMDTHECQPLYLEI 305 (513)
Q Consensus 253 H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v~~t~~C~~c~~~I 305 (513)
+||..|..|..|...+..+...+.. +|..||...|.........|..|.+.|
T Consensus 18 ~~~~~~~~~~~~~~~~~~~~~~~~~-~g~~~~~~d~~~~~~~~~rr~rt~~~~ 69 (235)
T KOG0490|consen 18 YWHASCLKCAECDNPLGVGDTCFSK-DGSIYCKRDYQREFKFSKRCARCKFTI 69 (235)
T ss_pred HHHHHHHhhhhhcchhccCCCcccC-CCcccccccchhhhhccccccCCCCCc
Confidence 4688999999999988644666666 899999999987334467888888776
No 79
>PRK14015 pepN aminopeptidase N; Provisional
Probab=33.73 E-value=26 Score=41.75 Aligned_cols=40 Identities=20% Similarity=0.414 Sum_probs=29.3
Q ss_pred hhhhhhccchhhHhhhcCC--CCC-CCcchhhHHHHHHHHHhhc
Q 010326 403 TGSILAHEMMHAWLRLKGY--PNL-RPDVEEGICQVLAHMWLES 443 (513)
Q Consensus 403 ~gsilaHE~~Hawl~l~g~--~~L-~~~~eEG~cq~~a~~wl~~ 443 (513)
..++||||+.|-|.. |.. ..- ...+-|||.-++.++|.+.
T Consensus 296 i~~vIaHElaHqWFG-NlVT~~~W~dLWLnEGFAty~e~~~~~~ 338 (875)
T PRK14015 296 IESVIAHEYFHNWTG-NRVTCRDWFQLSLKEGLTVFRDQEFSAD 338 (875)
T ss_pred HHHHHHHHHHHHHHh-CcceecchhhhhhhhHHHHHHHHHHHHH
Confidence 457999999999975 332 111 2357999999998888764
No 80
>KOG0978 consensus E3 ubiquitin ligase involved in syntaxin degradation [Posttranslational modification, protein turnover, chaperones]
Probab=33.55 E-value=10 Score=43.58 Aligned_cols=46 Identities=20% Similarity=0.509 Sum_probs=30.4
Q ss_pred cccCCCCCCCCCcceeecCCccccccccccccCc---ccccCCCCcCCCC
Q 010326 187 FRCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHP---KCDVCQNFIPTNS 233 (513)
Q Consensus 187 FrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~p---kC~~C~~~I~~~~ 233 (513)
.+|+.|+....+.-. ..=+.+||..|-..++.. +|..|+.....++
T Consensus 644 LkCs~Cn~R~Kd~vI-~kC~H~FC~~Cvq~r~etRqRKCP~Cn~aFganD 692 (698)
T KOG0978|consen 644 LKCSVCNTRWKDAVI-TKCGHVFCEECVQTRYETRQRKCPKCNAAFGAND 692 (698)
T ss_pred eeCCCccCchhhHHH-HhcchHHHHHHHHHHHHHhcCCCCCCCCCCCccc
Confidence 577777765443211 123568888888776543 7999998877765
No 81
>PF08394 Arc_trans_TRASH: Archaeal TRASH domain; InterPro: IPR013603 This region is found in the C terminus of a number of archaeal transcriptional regulators. It is thought to function as a metal-sensing regulatory module [].
Probab=32.69 E-value=23 Score=25.62 Aligned_cols=29 Identities=21% Similarity=0.380 Sum_probs=18.5
Q ss_pred CcCCCcccccC-ceeeecCccccCCCcccCCCC
Q 010326 162 CAGCNTEIGHG-RYLSCMEAFWHPECFRCHSCN 193 (513)
Q Consensus 162 C~~C~~~I~~g-~~l~alg~~wHp~CFrCs~C~ 193 (513)
|.-|+++|... .+++..++.||. .|..|.
T Consensus 1 Cd~CG~~I~~eP~~~k~~~~~y~f---CC~tC~ 30 (37)
T PF08394_consen 1 CDYCGGEITGEPIVVKIGNKVYYF---CCPTCL 30 (37)
T ss_pred CCccCCcccCCEEEEEECCeEEEE---ECHHHH
Confidence 67788888533 346677888874 444443
No 82
>smart00731 SprT SprT homologues. Predicted to have roles in transcription elongation. Contains a conserved HExxH motif, indicating a metalloprotease function.
Probab=32.60 E-value=22 Score=32.67 Aligned_cols=21 Identities=43% Similarity=0.388 Sum_probs=16.3
Q ss_pred hhhhhhhhhccchhhHhhhcC
Q 010326 400 RLLTGSILAHEMMHAWLRLKG 420 (513)
Q Consensus 400 ~~L~gsilaHE~~Hawl~l~g 420 (513)
...+-.||.|||.|+++.+.|
T Consensus 56 ~~~l~~~l~HEm~H~~~~~~g 76 (146)
T smart00731 56 RDRLRETLLHELCHAALYLFG 76 (146)
T ss_pred HHHHHhhHHHHHHHHHHHHhC
Confidence 334457999999999988754
No 83
>PF14634 zf-RING_5: zinc-RING finger domain
Probab=32.32 E-value=36 Score=24.74 Aligned_cols=41 Identities=17% Similarity=0.335 Sum_probs=27.1
Q ss_pred cCCCCCcCCCCCceEEccCCcccccccccccccCCCCCccc
Q 010326 261 CCSCERMEPRDTKYLSLDDGRKLCLECLDSAIMDTHECQPL 301 (513)
Q Consensus 261 C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v~~t~~C~~c 301 (513)
|..|.........++++.=|..+|..|..+.......|.-|
T Consensus 2 C~~C~~~~~~~~~~~l~~CgH~~C~~C~~~~~~~~~~CP~C 42 (44)
T PF14634_consen 2 CNICFEKYSEERRPRLTSCGHIFCEKCLKKLKGKSVKCPIC 42 (44)
T ss_pred CcCcCccccCCCCeEEcccCCHHHHHHHHhhcCCCCCCcCC
Confidence 56676766445567777788999999988754223344444
No 84
>KOG0478 consensus DNA replication licensing factor, MCM4 component [Replication, recombination and repair]
Probab=31.97 E-value=41 Score=38.89 Aligned_cols=23 Identities=30% Similarity=0.290 Sum_probs=16.7
Q ss_pred ccCCCCeeeeeccccccccccceeeeEeeecCchhhh
Q 010326 366 RIGAGYRLIDMITEPYRLIRRCEVTAILILYGLPRLL 402 (513)
Q Consensus 366 ~~~~G~rilei~~~p~~~~~~~eV~~Il~l~glP~~L 402 (513)
+..||||| +||+|+....++.--
T Consensus 342 ~v~pGDrv--------------~VTGi~ra~p~r~np 364 (804)
T KOG0478|consen 342 KVRPGDRV--------------EVTGILRATPVRVNP 364 (804)
T ss_pred ccCCCCeE--------------EEEEEEEeEEeccCc
Confidence 45699999 578888777665543
No 85
>PF09538 FYDLN_acid: Protein of unknown function (FYDLN_acid); InterPro: IPR012644 Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.
Probab=31.94 E-value=28 Score=30.92 Aligned_cols=33 Identities=27% Similarity=0.547 Sum_probs=22.8
Q ss_pred ccccccccCCCcccCCCCccCCCCCcCCCCCceEEccCCcccccccccc
Q 010326 242 HPFWLQKYCPSHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDS 290 (513)
Q Consensus 242 hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~ 290 (513)
+|-||.| ..|-+|++ +||-|+..-+.|..|-..
T Consensus 3 kpelGtK----------R~Cp~CG~------kFYDLnk~PivCP~CG~~ 35 (108)
T PF09538_consen 3 KPELGTK----------RTCPSCGA------KFYDLNKDPIVCPKCGTE 35 (108)
T ss_pred ccccCCc----------ccCCCCcc------hhccCCCCCccCCCCCCc
Confidence 4667776 45777765 678787766778888554
No 86
>PF07607 DUF1570: Protein of unknown function (DUF1570); InterPro: IPR011464 This entry represents hypothetical proteins confined to bacteria.
Probab=31.67 E-value=27 Score=31.89 Aligned_cols=32 Identities=31% Similarity=0.412 Sum_probs=23.2
Q ss_pred hhhhccchhhHhhhcC-CCCCC---CcchhhHHHHH
Q 010326 405 SILAHEMMHAWLRLKG-YPNLR---PDVEEGICQVL 436 (513)
Q Consensus 405 silaHE~~Hawl~l~g-~~~L~---~~~eEG~cq~~ 436 (513)
+||+||..|--+.--| .+++. ..|-|||...+
T Consensus 3 ~T~~HEa~HQl~~N~Gl~~r~~~~P~Wv~EGlA~yF 38 (128)
T PF07607_consen 3 ATIAHEATHQLAFNTGLHPRLADWPRWVSEGLATYF 38 (128)
T ss_pred hHHHHHHHHHHHHHccccccCCCCchHHHHhHHHHc
Confidence 6999999997666446 45553 38888887744
No 87
>PRK05580 primosome assembly protein PriA; Validated
Probab=31.11 E-value=39 Score=39.03 Aligned_cols=11 Identities=18% Similarity=0.132 Sum_probs=8.0
Q ss_pred hhhhccchhhH
Q 010326 405 SILAHEMMHAW 415 (513)
Q Consensus 405 silaHE~~Haw 415 (513)
+++.|++.-.|
T Consensus 556 ~~~~~d~~~f~ 566 (679)
T PRK05580 556 ALLAQDYDAFA 566 (679)
T ss_pred HHHhCCHHHHH
Confidence 57788886655
No 88
>TIGR02420 dksA RNA polymerase-binding protein DksA. The model that is the basis for this family describes a small, pleiotropic protein, DksA (DnaK suppressor A), originally named as a multicopy suppressor of temperature sensitivity of dnaKJ mutants. DksA mutants are defective in quorum sensing, virulence, etc. DksA is now understood to bind RNA polymerase directly and modulate its response to small molecules to control the level of transcription of rRNA. Nearly all members of this family are in the Proteobacteria. Whether the closest homologs outside the Proteobacteria function equivalently is unknown. The low value set for the noise cutoff allows identification of possible DksA proteins from outside the proteobacteria. TIGR02419 describes a closely related family of short sequences usually found in prophage regions of proteobacterial genomes or in known phage.
Probab=30.49 E-value=35 Score=30.02 Aligned_cols=30 Identities=33% Similarity=0.759 Sum_probs=19.8
Q ss_pred CCCcCcCCCcccccCceeeecCccccCCCcccCCCC
Q 010326 158 GYRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCN 193 (513)
Q Consensus 158 g~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~ 193 (513)
..++|..|+++|... .+.+ -|++..|..|.
T Consensus 79 ~yG~C~~Cge~I~~~-RL~a-----~P~a~~Cv~Cq 108 (110)
T TIGR02420 79 EYGYCEECGEEIGLR-RLEA-----RPTATLCIDCK 108 (110)
T ss_pred CCCchhccCCcccHH-HHhh-----CCCccccHHhH
Confidence 446999999999533 3333 45666676664
No 89
>PF10235 Cript: Microtubule-associated protein CRIPT; InterPro: IPR019367 The CRIPT protein is a cytoskeletal protein involved in microtubule production. This C-terminal domain is essential for binding to the PDZ3 domain of the SAP90 protein, one of a super-family of PDZ-containing proteins that play an important role in coupling the membrane ion channels with their signalling partners [].
Probab=30.16 E-value=31 Score=29.73 Aligned_cols=37 Identities=19% Similarity=0.511 Sum_probs=25.6
Q ss_pred cccCCCCCCCCCcceeecCCccccccccccccCcccccCCCCcCC
Q 010326 187 FRCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHPKCDVCQNFIPT 231 (513)
Q Consensus 187 FrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~pkC~~C~~~I~~ 231 (513)
-.|..|...+. ..|.-||..|-.+ .-+|+-|++.|.+
T Consensus 45 ~~C~~CK~~v~------q~g~~YCq~CAYk--kGiCamCGKki~d 81 (90)
T PF10235_consen 45 SKCKICKTKVH------QPGAKYCQTCAYK--KGICAMCGKKILD 81 (90)
T ss_pred ccccccccccc------cCCCccChhhhcc--cCcccccCCeecc
Confidence 35777766543 2367899999654 3479999998865
No 90
>KOG4739 consensus Uncharacterized protein involved in synaptonemal complex formation [Cell cycle control, cell division, chromosome partitioning; General function prediction only]
Probab=29.95 E-value=34 Score=34.45 Aligned_cols=34 Identities=18% Similarity=0.431 Sum_probs=25.9
Q ss_pred Ccceee-cCCccccccccccccCcccccCCCCcCC
Q 010326 198 DVEFSM-SGNRPYHKHCYKEQHHPKCDVCQNFIPT 231 (513)
Q Consensus 198 ~~~F~~-~dG~pYCk~CY~~~f~pkC~~C~~~I~~ 231 (513)
...|+. .=+.+||..|...-+.+.|..|++.|..
T Consensus 15 ~~~f~LTaC~HvfC~~C~k~~~~~~C~lCkk~ir~ 49 (233)
T KOG4739|consen 15 QDPFFLTACRHVFCEPCLKASSPDVCPLCKKSIRI 49 (233)
T ss_pred CCceeeeechhhhhhhhcccCCccccccccceeee
Confidence 334443 4467999999988888899999998764
No 91
>KOG3039 consensus Uncharacterized conserved protein [Function unknown]
Probab=29.54 E-value=2.5e+02 Score=28.82 Aligned_cols=78 Identities=9% Similarity=0.105 Sum_probs=53.0
Q ss_pred CCCCCcCcCCCcccccCce----eeecCc-------cccCCCcccCCCCCCCCCc---ceeecCCccccccccccccC--
Q 010326 156 FSGYRICAGCNTEIGHGRY----LSCMEA-------FWHPECFRCHSCNLPITDV---EFSMSGNRPYHKHCYKEQHH-- 219 (513)
Q Consensus 156 ~~g~~~C~~C~~~I~~g~~----l~alg~-------~wHp~CFrCs~C~~~L~~~---~F~~~dG~pYCk~CY~~~f~-- 219 (513)
+....+|..-+++|--... ++-++. .-|..=|.|..|...|++. .+...-|.++|.+|..++..
T Consensus 180 P~~~v~CP~s~kplklkdL~~VkFT~l~s~~~et~l~a~s~ryiCpvtrd~LtNt~~ca~Lr~sg~Vv~~ecvEklir~D 259 (303)
T KOG3039|consen 180 PSTTVVCPVSGKPLKLKDLFAVKFTPLNSEETETKLIAASKRYICPVTRDTLTNTTPCAVLRPSGHVVTKECVEKLIRKD 259 (303)
T ss_pred CCceeeccCCCCccchhhcceeeeeecCCchhhhhhhhhccceecccchhhhcCccceEEeccCCcEeeHHHHHHhcccc
Confidence 4445679998998843222 122222 3344668999999999764 24457789999999876543
Q ss_pred cccccCCCCcCCCC
Q 010326 220 PKCDVCQNFIPTNS 233 (513)
Q Consensus 220 pkC~~C~~~I~~~~ 233 (513)
-.|.+|+++....+
T Consensus 260 ~v~pv~d~plkdrd 273 (303)
T KOG3039|consen 260 MVDPVTDKPLKDRD 273 (303)
T ss_pred ccccCCCCcCcccc
Confidence 37889998888765
No 92
>PF09768 Peptidase_M76: Peptidase M76 family; InterPro: IPR019165 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. 
The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. Mitochondrial inner membrane protease ATP23 has two roles in the assembly of mitochondrial ATPase. Firstly, it acts as a protease that removes the N-terminal 10 residues of mitochondrial ATPase CF(0) subunit 6 (ATP6) at the intermembrane space side. Secondly, it is involved in the correct assembly of the membrane-embedded ATPase CF(0) particle, probably mediating association of ATP6 with the subunit 9 ring [, ].; GO: 0004222 metalloendopeptidase activity
Probab=29.37 E-value=20 Score=34.41 Aligned_cols=14 Identities=36% Similarity=0.660 Sum_probs=11.7
Q ss_pred hhhhhccchhhHhh
Q 010326 404 GSILAHEMMHAWLR 417 (513)
Q Consensus 404 gsilaHE~~Hawl~ 417 (513)
.-||+|||.|||=.
T Consensus 72 ~~~l~HELIHayD~ 85 (173)
T PF09768_consen 72 EDTLTHELIHAYDH 85 (173)
T ss_pred HHHHHHHHHHHHHH
Confidence 56999999999833
No 93
>smart00504 Ubox Modified RING finger domain. Modified RING finger domain, without the full complement of Zn2+-binding ligands. Probable involvement in E2-dependent ubiquitination.
Probab=29.09 E-value=27 Score=26.69 Aligned_cols=41 Identities=7% Similarity=0.004 Sum_probs=28.4
Q ss_pred CccCCCCCcCCCCCceEEccCCcccccccccccccCCCCCcccc
Q 010326 259 PRCCSCERMEPRDTKYLSLDDGRKLCLECLDSAIMDTHECQPLY 302 (513)
Q Consensus 259 F~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v~~t~~C~~c~ 302 (513)
|.|..|+..+. .. +....|..||..|..+.+.....|--|.
T Consensus 2 ~~Cpi~~~~~~--~P-v~~~~G~v~~~~~i~~~~~~~~~cP~~~ 42 (63)
T smart00504 2 FLCPISLEVMK--DP-VILPSGQTYERRAIEKWLLSHGTDPVTG 42 (63)
T ss_pred cCCcCCCCcCC--CC-EECCCCCEEeHHHHHHHHHHCCCCCCCc
Confidence 57889998874 23 4567899999999987665444444444
No 94
>COG0308 PepN Aminopeptidase N [Amino acid transport and metabolism]
Probab=28.42 E-value=30 Score=41.16 Aligned_cols=43 Identities=21% Similarity=0.387 Sum_probs=32.4
Q ss_pred hhhhhhhhccchhhHhhhcCCCCCC----CcchhhHHHHHHHHHhhccc
Q 010326 401 LLTGSILAHEMMHAWLRLKGYPNLR----PDVEEGICQVLAHMWLESEI 445 (513)
Q Consensus 401 ~L~gsilaHE~~Hawl~l~g~~~L~----~~~eEG~cq~~a~~wl~~~~ 445 (513)
+-+.+++|||+.|.|-. |- ..+. ..+-|||.-+|.+.|.++..
T Consensus 305 ~~~~~viaHElaHqWfG-nl-VT~~~W~~lWLnEgfat~~e~~~~~~~~ 351 (859)
T COG0308 305 ENVEEVIAHELAHQWFG-NL-VTMKWWDDLWLNEGFATFREVLWSEDLG 351 (859)
T ss_pred HHHHHHHHHHHhhhccc-ce-eeccCHHHHHHhhhhHHHHHHHHHHHhc
Confidence 34455999999999965 21 1222 58999999999999998665
No 95
>PF12674 Zn_ribbon_2: Putative zinc ribbon domain
Probab=28.28 E-value=28 Score=29.24 Aligned_cols=32 Identities=25% Similarity=0.541 Sum_probs=19.9
Q ss_pred ccCCCCCcCCCCCceEEccCC---ccccccccccc
Q 010326 260 RCCSCERMEPRDTKYLSLDDG---RKLCLECLDSA 291 (513)
Q Consensus 260 ~C~~C~r~l~~g~~y~~l~dg---r~yC~~Cy~~~ 291 (513)
.|-+|+.++.....+-...|| .-||.-||..-
T Consensus 2 ~CQSCGMPl~~~~~~Gte~dGs~s~~YC~yCy~~G 36 (81)
T PF12674_consen 2 FCQSCGMPLSKDEDFGTEADGSKSEDYCSYCYQNG 36 (81)
T ss_pred cCCcCcCccCCccccccccCCCCchhHHHHHhcCC
Confidence 377888877544423333343 56999999763
No 96
>PF06677 Auto_anti-p27: Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27); InterPro: IPR009563 The proteins in this entry are functionally uncharacterised and include several proteins that characterise Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27). It is thought that the potential association of anti-p27 with anti-centromere antibodies suggests that autoantigen p27 might play a role in mitosis [].
Probab=27.57 E-value=41 Score=24.75 Aligned_cols=21 Identities=29% Similarity=0.796 Sum_probs=12.9
Q ss_pred cCCCCCcCCCCCceEEccCCccccccc
Q 010326 261 CCSCERMEPRDTKYLSLDDGRKLCLEC 287 (513)
Q Consensus 261 C~~C~r~l~~g~~y~~l~dgr~yC~~C 287 (513)
|..|+.++ +...+|+.||..|
T Consensus 20 Cp~C~~PL------~~~k~g~~~Cv~C 40 (41)
T PF06677_consen 20 CPDCGTPL------MRDKDGKIYCVSC 40 (41)
T ss_pred cCCCCCee------EEecCCCEECCCC
Confidence 44556654 2345778888877
No 97
>PF06827 zf-FPG_IleRS: Zinc finger found in FPG and IleRS; InterPro: IPR010663 Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a zinc finger domain found at the C-terminal in both DNA glycosylase/AP lyase enzymes and in isoleucyl tRNA synthetase. In these two types of enzymes, the C-terminal domain forms a zinc finger. Some related proteins may not bind zinc. DNA glycosylase/AP lyase enzymes are involved in base excision repair of DNA damaged by oxidation or by mutagenic agents. 
These enzymes have both DNA glycosylase activity (3.2.2 from EC) and AP lyase activity (4.2.99.18 from EC) []. Examples include formamidopyrimidine-DNA glycosylases (Fpg; MutM) and endonuclease VIII (Nei). Formamidopyrimidine-DNA glycosylase (Fpg, MutM) is a trifunctional DNA base excision repair enzyme that removes a wide range of oxidation-damaged bases (N-glycosylase activity; 3.2.2.23 from EC) and cleaves both the 3'- and 5'-phosphodiester bonds of the resulting apurinic/apyrimidinic site (AP lyase activity; 4.2.99.18 from EC). Fpg has a preference for oxidised purines, excising oxidized purine bases such as 7,8-dihydro-8-oxoguanine (8-oxoG). Its AP (apurinic/apyrimidinic) lyase activity introduces nicks in the DNA strand, cleaving the DNA backbone by beta-delta elimination to generate a single-strand break at the site of the removed base with both 3'- and 5'-phosphates. Fpg is a monomer composed of 2 domains connected by a flexible hinge []. The two DNA-binding motifs (a zinc finger and the helix-two-turns-helix motifs) suggest that the oxidized base is flipped out from double-stranded DNA in the binding mode and excised by a catalytic mechanism similar to that of bifunctional base excision repair enzymes []. Fpg binds one ion of zinc at the C terminus, which contains four conserved and essential cysteines []. Endonuclease VIII (Nei) has the same enzyme activities as Fpg above, but with a preference for oxidized pyrimidines, such as thymine glycol, 5,6-dihydrouracil and 5,6-dihydrothymine [, ]. An Fpg-type zinc finger is also found at the C terminus of isoleucyl tRNA synthetase (6.1.1.5 from EC) [, ]. This enzyme catalyses the attachment of isoleucine to tRNA(Ile). As IleRS can inadvertently accommodate and process structurally similar amino acids such as valine, to avoid such errors it has two additional distinct tRNA(Ile)-dependent editing activities. One activity is designated as 'pre-transfer' editing and involves the hydrolysis of activated Val-AMP. 
The other activity is designated 'post-transfer' editing and involves deacylation of mischarged Val-tRNA(Ile) []. More information about these proteins can be found at Protein of the Month: Zinc Fingers [].; GO: 0003824 catalytic activity; PDB: 1K82_C 1Q39_A 2OQ4_B 2OPF_A 1K3X_A 1K3W_A 1Q3B_A 2EA0_A 1Q3C_A 2XZF_A ....
Probab=27.42 E-value=25 Score=23.51 Aligned_cols=13 Identities=38% Similarity=0.802 Sum_probs=7.4
Q ss_pred CcccccCCCCcCC
Q 010326 219 HPKCDVCQNFIPT 231 (513)
Q Consensus 219 ~pkC~~C~~~I~~ 231 (513)
+.+|..|...|..
T Consensus 1 G~~C~rC~~~~~~ 13 (30)
T PF06827_consen 1 GEKCPRCWNYIED 13 (30)
T ss_dssp TSB-TTT--BBEE
T ss_pred CCcCccCCCcceE
Confidence 4578899988875
No 98
>COG1645 Uncharacterized Zn-finger containing protein [General function prediction only]
Probab=27.30 E-value=33 Score=31.54 Aligned_cols=24 Identities=25% Similarity=0.557 Sum_probs=18.3
Q ss_pred ccCCCCCcCCCCCceEEccCCcccccccccc
Q 010326 260 RCCSCERMEPRDTKYLSLDDGRKLCLECLDS 290 (513)
Q Consensus 260 ~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~ 290 (513)
.|..|+.+| |. ++|..||..|-.+
T Consensus 30 hCp~Cg~PL------F~-KdG~v~CPvC~~~ 53 (131)
T COG1645 30 HCPKCGTPL------FR-KDGEVFCPVCGYR 53 (131)
T ss_pred hCcccCCcc------ee-eCCeEECCCCCce
Confidence 377788876 33 6999999999744
No 99
>PRK00420 hypothetical protein; Validated
Probab=27.11 E-value=37 Score=30.43 Aligned_cols=23 Identities=17% Similarity=0.424 Sum_probs=12.9
Q ss_pred cccCCCCCCCCCcceeecCCccccccc
Q 010326 187 FRCHSCNLPITDVEFSMSGNRPYHKHC 213 (513)
Q Consensus 187 FrCs~C~~~L~~~~F~~~dG~pYCk~C 213 (513)
-.|..|+.+|.. .++|+.||..|
T Consensus 24 ~~CP~Cg~pLf~----lk~g~~~Cp~C 46 (112)
T PRK00420 24 KHCPVCGLPLFE----LKDGEVVCPVH 46 (112)
T ss_pred CCCCCCCCccee----cCCCceECCCC
Confidence 356667766532 25677665544
No 100
>cd00162 RING RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)
Probab=26.57 E-value=28 Score=23.87 Aligned_cols=40 Identities=13% Similarity=0.404 Sum_probs=17.9
Q ss_pred cCCCCCCCCCcceeecCCcccccccccccc---CcccccCCCC
Q 010326 189 CHSCNLPITDVEFSMSGNRPYHKHCYKEQH---HPKCDVCQNF 228 (513)
Q Consensus 189 Cs~C~~~L~~~~F~~~dG~pYCk~CY~~~f---~pkC~~C~~~ 228 (513)
|..|...+........=|..||..|....+ ..+|..|+..
T Consensus 2 C~iC~~~~~~~~~~~~C~H~~c~~C~~~~~~~~~~~Cp~C~~~ 44 (45)
T cd00162 2 CPICLEEFREPVVLLPCGHVFCRSCIDKWLKSGKNTCPLCRTP 44 (45)
T ss_pred CCcCchhhhCceEecCCCChhcHHHHHHHHHhCcCCCCCCCCc
Confidence 445555442212222234556666654322 3356666543
No 101
>KOG1280 consensus Uncharacterized conserved protein containing ZZ-type Zn-finger [General function prediction only]
Probab=26.25 E-value=52 Score=34.98 Aligned_cols=16 Identities=31% Similarity=0.420 Sum_probs=13.3
Q ss_pred CcccCCCCccCCCCCc
Q 010326 252 SHERDGTPRCCSCERM 267 (513)
Q Consensus 252 ~H~H~~CF~C~~C~r~ 267 (513)
.||-+.||+|-.|++.
T Consensus 73 ~~y~~qSftCPyC~~~ 88 (381)
T KOG1280|consen 73 SHYDPQSFTCPYCGIM 88 (381)
T ss_pred cccccccccCCccccc
Confidence 4777789999999985
No 102
>COG5148 RPN10 26S proteasome regulatory complex, subunit RPN10/PSMD4 [Posttranslational modification, protein turnover, chaperones]
Probab=26.16 E-value=24 Score=34.45 Aligned_cols=28 Identities=18% Similarity=0.330 Sum_probs=20.3
Q ss_pred hhHHHHHHHhhhhhhhhcCCccccCCcC
Q 010326 42 NEEIDRAIALSLVEVDQKGKKVIENEYD 69 (513)
Q Consensus 42 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 (513)
+-||.-||.||+.|+.++.+......++
T Consensus 207 DpELA~AlrLSmeEek~rQe~~~qk~~e 234 (243)
T COG5148 207 DPELAEALRLSMEEEKKRQEVAAQKSSE 234 (243)
T ss_pred CHHHHHHHHhhHHHHHHHHHHHHHhhhh
Confidence 5578899999998887777666544433
No 103
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=25.97 E-value=40 Score=26.60 Aligned_cols=18 Identities=17% Similarity=0.307 Sum_probs=15.6
Q ss_pred ccCCCCeeeeeccccccc
Q 010326 366 RIGAGYRLIDMITEPYRL 383 (513)
Q Consensus 366 ~~~~G~rilei~~~p~~~ 383 (513)
.|++||.|+.|||.|+..
T Consensus 29 gl~~GD~I~~ing~~i~~ 46 (79)
T cd00989 29 GLKAGDRILAINGQKIKS 46 (79)
T ss_pred CCCCCCEEEEECCEECCC
Confidence 478999999999998863
No 104
>PRK00420 hypothetical protein; Validated
Probab=25.95 E-value=42 Score=30.08 Aligned_cols=26 Identities=23% Similarity=0.369 Sum_probs=18.0
Q ss_pred ccCCCCCcCCCCCceEEccCCccccccccccc
Q 010326 260 RCCSCERMEPRDTKYLSLDDGRKLCLECLDSA 291 (513)
Q Consensus 260 ~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~ 291 (513)
.|..|+.++ +.+.+|..||..|-...
T Consensus 25 ~CP~Cg~pL------f~lk~g~~~Cp~Cg~~~ 50 (112)
T PRK00420 25 HCPVCGLPL------FELKDGEVVCPVHGKVY 50 (112)
T ss_pred CCCCCCCcc------eecCCCceECCCCCCee
Confidence 355676654 44568999999997643
No 105
>PHA03377 EBNA-3C; Provisional
Probab=25.92 E-value=45 Score=38.54 Aligned_cols=34 Identities=12% Similarity=0.250 Sum_probs=17.0
Q ss_pred cccCCCCCccccccccc-CCCCCCCChhHHHHHHH
Q 010326 17 HARYGDDRTWDERRYSA-ADDSSGFDNEEIDRAIA 50 (513)
Q Consensus 17 ~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~ 50 (513)
.|++.-..-|..+++.. -++.-..-.++|+-|.+
T Consensus 310 iGt~kp~~PWl~~P~E~Pyh~~rglt~~~i~~Af~ 344 (1000)
T PHA03377 310 IGNFKPYYPWNAPPNENPYHARRGIKEDVIQTAFR 344 (1000)
T ss_pred cCCCCCCCCCCCCCccCcccccccchHHHHHHHHH
Confidence 45555556687655411 11122355677765544
No 106
>PF04931 DNA_pol_phi: DNA polymerase phi; InterPro: IPR007015 Proteins of this family are predominantly nucleolar. The majority are described as transcription factor transactivators. The family also includes the fifth essential DNA polymerase (Pol5p) of Schizosaccharomyces pombe (Fission yeast) and Saccharomyces cerevisiae (Baker's yeast) (2.7.7.7 from EC). Pol5p is localized exclusively to the nucleolus and binds near or at the enhancer region of rRNA-encoding DNA repeating units.; GO: 0003677 DNA binding, 0003887 DNA-directed DNA polymerase activity, 0006351 transcription, DNA-dependent
Probab=25.41 E-value=42 Score=39.43 Aligned_cols=8 Identities=50% Similarity=0.584 Sum_probs=4.0
Q ss_pred HHHHHHHh
Q 010326 126 EQLAKAIQ 133 (513)
Q Consensus 126 E~Laralq 133 (513)
++|+.++.
T Consensus 739 ~~La~~Fk 746 (784)
T PF04931_consen 739 EQLAAIFK 746 (784)
T ss_pred HHHHHHHH
Confidence 45555544
No 107
>PF01421 Reprolysin: Reprolysin (M12B) family zinc metalloprotease This Prosite motif covers only the active site.; InterPro: IPR001590 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. 
The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belongs to the MEROPS peptidase family M12, subfamily M12B (adamalysin family, clan MA(M)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH []. The adamalysins are zinc dependent endopeptidases found in snake venom. There are some mammalian proteins such as P78325 from SWISSPROT, and fertilin Q28472 from SWISSPROT. Fertilin and closely related proteins appear to lack some active site residues and may not be active enzymes. CD156 (also called ADAM8 (3.4.24 from EC) or MS2 human) has been implicated in extravasation of leukocytes. CD molecules are leucocyte antigens on cell surfaces. CD antigen nomenclature is updated at Protein Reviews On The Web (http://prow.nci.nih.gov/). ; GO: 0004222 metalloendopeptidase activity, 0006508 proteolysis; PDB: 2E3X_A 2W15_A 2W14_A 2W13_A 2W12_A 1ND1_A 3K7L_A 2DW2_A 2DW0_B 2DW1_A ....
Probab=25.34 E-value=42 Score=32.07 Aligned_cols=25 Identities=28% Similarity=0.235 Sum_probs=17.9
Q ss_pred eeEeeecCchhhhhhhhhhccchhh
Q 010326 390 TAILILYGLPRLLTGSILAHEMMHA 414 (513)
Q Consensus 390 ~~Il~l~glP~~L~gsilaHE~~Ha 414 (513)
-+|....+.....++.|||||++|.
T Consensus 118 ~~i~~~~~~~~~~~a~~~AHelGH~ 142 (199)
T PF01421_consen 118 CGIVEDHSRSGLSFAVIIAHELGHN 142 (199)
T ss_dssp EEEEE-SSSSHHHHHHHHHHHHHHH
T ss_pred CcEeeeccchhHHHHHHHHHHHHHh
Confidence 3444555566777899999999993
No 108
>COG2191 Formylmethanofuran dehydrogenase subunit E [Energy production and conversion]
Probab=25.26 E-value=32 Score=33.94 Aligned_cols=31 Identities=19% Similarity=0.534 Sum_probs=23.6
Q ss_pred CccCCCCCcCCCCCceEEccCCccccccccccc
Q 010326 259 PRCCSCERMEPRDTKYLSLDDGRKLCLECLDSA 291 (513)
Q Consensus 259 F~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~ 291 (513)
-+|..|+-++. .....+.+|+++|..|+...
T Consensus 173 v~C~kCGE~~~--e~~~~~~ng~~vC~~C~~~~ 203 (206)
T COG2191 173 VRCSKCGELFM--EPRAVVLNGKPVCKPCAEKK 203 (206)
T ss_pred eeccccCcccc--cchhhhcCCceecccccccc
Confidence 58999998874 23345568999999999863
No 109
>KOG0957 consensus PHD finger protein [General function prediction only]
Probab=25.25 E-value=72 Score=35.54 Aligned_cols=139 Identities=20% Similarity=0.354 Sum_probs=66.3
Q ss_pred CCCcCcCCCcccc--cCceeee--cCccccCCCcccCCCCCCCCCcceeecCCccccccccccccCcccccCCCCcCCCC
Q 010326 158 GYRICAGCNTEIG--HGRYLSC--MEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHPKCDVCQNFIPTNS 233 (513)
Q Consensus 158 g~~~C~~C~~~I~--~g~~l~a--lg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~pkC~~C~~~I~~~~ 233 (513)
...+|..|-..-. .+.++.+ -|..-|..|+-=.. +..|.+..-.-....-||+.|-.....|.|.-|-...
T Consensus 118 k~~iCcVClg~rs~da~ei~qCd~CGi~VHEgCYGv~d-n~si~s~~s~~stepWfCeaC~~Gvs~P~CElCPn~~---- 192 (707)
T KOG0957|consen 118 KAVICCVCLGQRSVDAGEILQCDKCGINVHEGCYGVLD-NVSIPSGSSDCSTEPWFCEACLYGVSLPHCELCPNRF---- 192 (707)
T ss_pred cceEEEEeecCccccccceeeccccCceeccccccccc-ccccCCCCccCCCCchhhhhHhcCCCCCccccCCCcC----
Confidence 3457888865432 2455543 23445555542220 1111111100011346899997777778999995421
Q ss_pred ccceeeecccccccccCCCcccCCCCc--cCCCCCcCCCCCceEEccCCcccccccccccccCCCCCcccchHH
Q 010326 234 AGLIEYRAHPFWLQKYCPSHERDGTPR--CCSCERMEPRDTKYLSLDDGRKLCLECLDSAIMDTHECQPLYLEI 305 (513)
Q Consensus 234 ~g~i~~~~hpfw~~~yCp~H~H~~CF~--C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v~~t~~C~~c~~~I 305 (513)
|+..-..-.-|-+..|....|---|- =.-|+..+. .-.|. +-|+..|-.|-+..+.-+.+|-.|..-+
T Consensus 193 -GifKetDigrWvH~iCALYvpGVafg~~~~l~~Vtl~-em~ys--k~Gak~Cs~Ced~~fARtGvci~CdaGM 262 (707)
T KOG0957|consen 193 -GIFKETDIGRWVHAICALYVPGVAFGQTHTLCGVTLE-EMDYS--KFGAKTCSACEDKIFARTGVCIRCDAGM 262 (707)
T ss_pred -CcccccchhhHHHHHHHhhcCccccccccccccccHH-Hhhhh--hhccchhccccchhhhhcceeeeccchh
Confidence 22222222235554444322211111 112333221 11222 2467788888877766677787777544
No 110
>PRK14714 DNA polymerase II large subunit; Provisional
Probab=25.23 E-value=57 Score=40.17 Aligned_cols=11 Identities=18% Similarity=0.585 Sum_probs=6.1
Q ss_pred ccccCCCCcCC
Q 010326 221 KCDVCQNFIPT 231 (513)
Q Consensus 221 kC~~C~~~I~~ 231 (513)
.|..|+.+...
T Consensus 711 ~CP~CGtplv~ 721 (1337)
T PRK14714 711 ECPRCDVELTP 721 (1337)
T ss_pred cCCCCCCcccc
Confidence 56666655443
No 111
>TIGR03826 YvyF flagellar operon protein TIGR03826. This gene is found in flagellar operons of Bacillus-related organisms. Its function has not been determined and an official gene symbol has not been assigned, although the gene is designated yvyF in B. subtilis. A tentative assignment as a regulator is suggested in the NCBI record GI:16080597.
Probab=25.22 E-value=34 Score=31.66 Aligned_cols=23 Identities=17% Similarity=0.439 Sum_probs=17.3
Q ss_pred CCccCCCCCcCCCCCceEEccCCcccccccccc
Q 010326 258 TPRCCSCERMEPRDTKYLSLDDGRKLCLECLDS 290 (513)
Q Consensus 258 CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~ 290 (513)
-..|..|+.+|..| +| |..|...
T Consensus 81 ~~~CE~CG~~I~~G-r~---------C~~C~~~ 103 (137)
T TIGR03826 81 GYPCERCGTSIREG-RL---------CDSCAGE 103 (137)
T ss_pred cCcccccCCcCCCC-Cc---------cHHHHHH
Confidence 47899999999766 54 6666554
No 112
>PRK14714 DNA polymerase II large subunit; Provisional
Probab=24.74 E-value=98 Score=38.27 Aligned_cols=10 Identities=50% Similarity=0.557 Sum_probs=5.1
Q ss_pred hccchhhHhh
Q 010326 408 AHEMMHAWLR 417 (513)
Q Consensus 408 aHE~~Hawl~ 417 (513)
||=+-||=.|
T Consensus 907 AHPyFHAAKR 916 (1337)
T PRK14714 907 AHPYFHAAKR 916 (1337)
T ss_pred ccchhhhHhh
Confidence 5555555444
No 113
>KOG2462 consensus C2H2-type Zn-finger protein [Transcription]
Probab=24.64 E-value=49 Score=34.11 Aligned_cols=11 Identities=18% Similarity=0.404 Sum_probs=7.3
Q ss_pred CccCCCCCcCC
Q 010326 259 PRCCSCERMEP 269 (513)
Q Consensus 259 F~C~~C~r~l~ 269 (513)
|.|..|+|.|.
T Consensus 216 F~C~hC~kAFA 226 (279)
T KOG2462|consen 216 FSCPHCGKAFA 226 (279)
T ss_pred ccCCcccchhc
Confidence 66777777663
No 114
>PF13834 DUF4193: Domain of unknown function (DUF4193)
Probab=24.61 E-value=24 Score=30.94 Aligned_cols=29 Identities=28% Similarity=0.673 Sum_probs=19.5
Q ss_pred CCccCCCCCcCCCCCceEEccCCccccccc
Q 010326 258 TPRCCSCERMEPRDTKYLSLDDGRKLCLEC 287 (513)
Q Consensus 258 CF~C~~C~r~l~~g~~y~~l~dgr~yC~~C 287 (513)
=|+|++|.-.-.+ .+-..-.+|.++|..|
T Consensus 70 EFTCssCFLV~HR-SqLa~~~~g~~iC~DC 98 (99)
T PF13834_consen 70 EFTCSSCFLVHHR-SQLAREKDGQPICRDC 98 (99)
T ss_pred ceeeeeeeeEech-hhhccccCCCEecccc
Confidence 3899999765432 2333345789999988
No 115
>cd04270 ZnMc_TACE_like Zinc-dependent metalloprotease; TACE_like subfamily. TACE, the tumor-necrosis factor-alpha converting enzyme, releases soluble TNF-alpha from transmembrane pro-TNF-alpha.
Probab=24.13 E-value=31 Score=34.56 Aligned_cols=21 Identities=33% Similarity=0.536 Sum_probs=14.9
Q ss_pred ecCc--hhhhhhhhhhccchhhH
Q 010326 395 LYGL--PRLLTGSILAHEMMHAW 415 (513)
Q Consensus 395 l~gl--P~~L~gsilaHE~~Haw 415 (513)
.+|. |...+..|+|||++|.+
T Consensus 157 ~~~~~~~~~~~a~t~AHElGHnl 179 (244)
T cd04270 157 NYGKRVPTKESDLVTAHELGHNF 179 (244)
T ss_pred ccCCccchhHHHHHHHHHHHHhc
Confidence 4554 44446679999999965
No 116
>PF04502 DUF572: Family of unknown function (DUF572) ; InterPro: IPR007590 This entry represents eukaryotic proteins with undetermined function belonging to the CWC16 family.
Probab=23.89 E-value=54 Score=34.46 Aligned_cols=17 Identities=18% Similarity=0.302 Sum_probs=8.4
Q ss_pred cCccccCCCcccCCCCCCCC
Q 010326 178 MEAFWHPECFRCHSCNLPIT 197 (513)
Q Consensus 178 lg~~wHp~CFrCs~C~~~L~ 197 (513)
+-..|+.+ |..|+..|.
T Consensus 35 f~~Pf~i~---C~~C~~~I~ 51 (324)
T PF04502_consen 35 FMMPFNIW---CNTCGEYIY 51 (324)
T ss_pred EcCCccCc---CCCCccccc
Confidence 34445543 445555554
No 117
>PF10083 DUF2321: Uncharacterized protein conserved in bacteria (DUF2321); InterPro: IPR016891 This entry is represented by Bacteriophage 'Lactobacillus prophage Lj928', Orf-Ljo1454. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function.
Probab=22.62 E-value=32 Score=32.57 Aligned_cols=52 Identities=19% Similarity=0.383 Sum_probs=33.7
Q ss_pred CCcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcceeecCCccccccccccccCcccccCCCCcC
Q 010326 159 YRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVEFSMSGNRPYHKHCYKEQHHPKCDVCQNFIP 230 (513)
Q Consensus 159 ~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~F~~~dG~pYCk~CY~~~f~pkC~~C~~~I~ 230 (513)
...|.+||.... -.|..|+.+|.+.++. +|.++=..+|. --..|+.|+++-+
T Consensus 28 ~~fC~kCG~~tI----------------~~Cp~C~~~IrG~y~v--~gv~~~g~~~~--~PsYC~~CGkpyP 79 (158)
T PF10083_consen 28 EKFCSKCGAKTI----------------TSCPNCSTPIRGDYHV--EGVFGLGGHYE--APSYCHNCGKPYP 79 (158)
T ss_pred HHHHHHhhHHHH----------------HHCcCCCCCCCCceec--CCeeeeCCCCC--CChhHHhCCCCCc
Confidence 356888887653 3688899999876443 44444444443 1226999998765
No 118
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=22.58 E-value=51 Score=26.70 Aligned_cols=19 Identities=11% Similarity=0.153 Sum_probs=16.2
Q ss_pred cccCCCCeeeeeccccccc
Q 010326 365 PRIGAGYRLIDMITEPYRL 383 (513)
Q Consensus 365 ~~~~~G~rilei~~~p~~~ 383 (513)
..|++||.|+.||+.+++.
T Consensus 26 aGL~~GDiI~~Ing~~v~~ 44 (79)
T cd00991 26 AVLHTGDVIYSINGTPITT 44 (79)
T ss_pred cCCCCCCEEEEECCEEcCC
Confidence 3578999999999999873
No 119
>cd04267 ZnMc_ADAM_like Zinc-dependent metalloprotease, ADAM_like or reprolysin_like subgroup. The adamalysin_like or ADAM family of metalloproteases contains proteolytic domains from snake venoms, proteases from the mammalian reproductive tract, and the tumor necrosis factor alpha convertase, TACE. ADAMs (A Disintegrin And Metalloprotease) are glycoproteins, which play roles in cell signaling, cell fusion, and cell-cell interactions.
Probab=22.55 E-value=40 Score=31.90 Aligned_cols=24 Identities=33% Similarity=0.463 Sum_probs=16.2
Q ss_pred eeEeeecCchhhhhhhhhhccchhh
Q 010326 390 TAILILYGLPRLLTGSILAHEMMHA 414 (513)
Q Consensus 390 ~~Il~l~glP~~L~gsilaHE~~Ha 414 (513)
.+|....+. ....+.|+|||++|.
T Consensus 121 ~~v~~~~~~-~~~~~~~~aHElGH~ 144 (192)
T cd04267 121 VGVVEDTGF-TLLTALTMAHELGHN 144 (192)
T ss_pred eEEEecCCc-ceeehhhhhhhHHhh
Confidence 344444443 455688999999994
No 120
>PF14446 Prok-RING_1: Prokaryotic RING finger family 1
Probab=22.54 E-value=52 Score=25.76 Aligned_cols=13 Identities=15% Similarity=0.353 Sum_probs=9.2
Q ss_pred CCcCcCCCccccc
Q 010326 159 YRICAGCNTEIGH 171 (513)
Q Consensus 159 ~~~C~~C~~~I~~ 171 (513)
..+|..|++.|..
T Consensus 5 ~~~C~~Cg~~~~~ 17 (54)
T PF14446_consen 5 GCKCPVCGKKFKD 17 (54)
T ss_pred CccChhhCCcccC
Confidence 3578888888853
No 121
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=22.52 E-value=48 Score=26.32 Aligned_cols=18 Identities=17% Similarity=0.036 Sum_probs=15.6
Q ss_pred ccCCCCeeeeeccccccc
Q 010326 366 RIGAGYRLIDMITEPYRL 383 (513)
Q Consensus 366 ~~~~G~rilei~~~p~~~ 383 (513)
.|.+||.|+.||+.+++.
T Consensus 29 Gl~~GD~I~~Ing~~v~~ 46 (80)
T cd00990 29 GLVAGDELVAVNGWRVDA 46 (80)
T ss_pred CCCCCCEEEEECCEEhHH
Confidence 478999999999998864
No 122
>PLN03208 E3 ubiquitin-protein ligase RMA2; Provisional
Probab=22.38 E-value=73 Score=31.25 Aligned_cols=13 Identities=38% Similarity=0.961 Sum_probs=9.0
Q ss_pred cccccCCCCcCCC
Q 010326 220 PKCDVCQNFIPTN 232 (513)
Q Consensus 220 pkC~~C~~~I~~~ 232 (513)
++|..|+..|...
T Consensus 69 ~~CPvCR~~Is~~ 81 (193)
T PLN03208 69 PKCPVCKSDVSEA 81 (193)
T ss_pred CcCCCCCCcCChh
Confidence 4677777777654
No 123
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=22.35 E-value=43 Score=27.15 Aligned_cols=18 Identities=11% Similarity=0.230 Sum_probs=14.8
Q ss_pred ccCCCCeeeeeccccccc
Q 010326 366 RIGAGYRLIDMITEPYRL 383 (513)
Q Consensus 366 ~~~~G~rilei~~~p~~~ 383 (513)
-|.+||.|+.||+.++..
T Consensus 31 Gl~~GD~I~~ing~~v~~ 48 (82)
T PF13180_consen 31 GLQPGDIILAINGKPVNS 48 (82)
T ss_dssp TS-TTEEEEEETTEESSS
T ss_pred CCCCCcEEEEECCEEcCC
Confidence 478999999999999853
No 124
>PRK10778 dksA RNA polymerase-binding transcription factor; Provisional
Probab=22.31 E-value=1.4e+02 Score=27.97 Aligned_cols=31 Identities=32% Similarity=0.757 Sum_probs=21.0
Q ss_pred CCCcCcCCCcccccCceeeecCccccCCCcccCCCCC
Q 010326 158 GYRICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNL 194 (513)
Q Consensus 158 g~~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~ 194 (513)
..+.|..|+.+|...+ +.+ -|.+..|..|..
T Consensus 110 tYG~Ce~CGe~I~~~R-L~A-----~P~A~~CI~CQe 140 (151)
T PRK10778 110 DFGYCESCGVEIGIRR-LEA-----RPTADLCIDCKT 140 (151)
T ss_pred CCceeccCCCcccHHH-Hhc-----CCCccccHHHHH
Confidence 4579999999995333 333 366677777754
No 125
>TIGR02414 pepN_proteo aminopeptidase N, Escherichia coli type. The M1 family of zinc metallopeptidases contains a number of distinct, well-separated clades of proteins with aminopeptidase activity. Several are designated aminopeptidase N, EC 3.4.11.2, after the Escherichia coli enzyme, suggesting a similar activity profile. This family consists of all aminopeptidases closely related to E. coli PepN and presumed to have similar (not identical) function. Nearly all are found in Proteobacteria, but members are found also in Cyanobacteria, plants, and apicomplexan parasites. This family differs greatly in sequence from the family of aminopeptidases typified by Streptomyces lividans PepN (TIGR02412), from the membrane bound aminopeptidase N family in animals, etc.
Probab=21.93 E-value=55 Score=39.05 Aligned_cols=40 Identities=20% Similarity=0.419 Sum_probs=28.2
Q ss_pred hhhhhhccchhhHhhhcC--CCCC-CCcchhhHHHHHHHHHhhc
Q 010326 403 TGSILAHEMMHAWLRLKG--YPNL-RPDVEEGICQVLAHMWLES 443 (513)
Q Consensus 403 ~gsilaHE~~Hawl~l~g--~~~L-~~~~eEG~cq~~a~~wl~~ 443 (513)
..++||||+.|-|.. |. +..- ...+-|||.-++.++|.+.
T Consensus 283 i~~VIaHElaHqWfG-NlVT~~~W~~LWLnEGfAty~e~~~~~~ 325 (863)
T TIGR02414 283 IESVIAHEYFHNWTG-NRVTCRDWFQLSLKEGLTVFRDQEFSAD 325 (863)
T ss_pred HHHHHHHHHHHHHhc-ceeeecchhhhhhhhhHHHHHHHHHHHH
Confidence 357999999999964 22 1111 2357999999988877553
No 126
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=21.92 E-value=18 Score=40.26 Aligned_cols=24 Identities=8% Similarity=0.056 Sum_probs=20.0
Q ss_pred cccccCCCCeeeeecccccccccc
Q 010326 363 RRPRIGAGYRLIDMITEPYRLIRR 386 (513)
Q Consensus 363 ~~~~~~~G~rilei~~~p~~~~~~ 386 (513)
|.+.+|.||.|+||||..|.....
T Consensus 161 r~glL~~GD~i~EvNGi~v~~~~~ 184 (542)
T KOG0609|consen 161 RQGLLHVGDEILEVNGISVANKSP 184 (542)
T ss_pred hccceeeccchheecCeecccCCH
Confidence 667899999999999999886533
No 127
>KOG1420 consensus Ca2+-activated K+ channel Slowpoke, alpha subunit [Inorganic ion transport and metabolism; Signal transduction mechanisms]
Probab=21.80 E-value=61 Score=36.77 Aligned_cols=11 Identities=9% Similarity=0.174 Sum_probs=5.3
Q ss_pred HHHHHHHHhhh
Q 010326 485 KDLGKFFKHQI 495 (513)
Q Consensus 485 ~~l~~~~~~qi 495 (513)
+++=.|+..-|
T Consensus 47 r~~w~fl~ss~ 57 (1103)
T KOG1420|consen 47 RMWWAFLASSM 57 (1103)
T ss_pred HHHHHHHHHHH
Confidence 44445555444
No 128
>PF13923 zf-C3HC4_2: Zinc finger, C3HC4 type (RING finger); PDB: 3HCU_A 2ECI_A 2JMD_A 3HCS_B 3HCT_A 3ZTG_A 2YUR_A 3L11_A.
Probab=21.58 E-value=55 Score=22.99 Aligned_cols=36 Identities=19% Similarity=0.499 Sum_probs=22.9
Q ss_pred cCCCCCcCCCCCceEEccCCcccccccccccccCCCCC
Q 010326 261 CCSCERMEPRDTKYLSLDDGRKLCLECLDSAIMDTHEC 298 (513)
Q Consensus 261 C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v~~t~~C 298 (513)
|..|...+. ...+...=|..||..|..+.+.....|
T Consensus 1 C~iC~~~~~--~~~~~~~CGH~fC~~C~~~~~~~~~~C 36 (39)
T PF13923_consen 1 CPICLDELR--DPVVVTPCGHSFCKECIEKYLEKNPKC 36 (39)
T ss_dssp ETTTTSB-S--SEEEECTTSEEEEHHHHHHHHHCTSB-
T ss_pred CCCCCCccc--CcCEECCCCCchhHHHHHHHHHCcCCC
Confidence 556766553 354567788999999988765544444
No 129
>cd04275 ZnMc_pappalysin_like Zinc-dependent metalloprotease, pappalysin_like subfamily. The pregnancy-associated plasma protein A (PAPP-A or pappalysin-1) cleaves insulin-like growth factor-binding proteins 4 and 5, thereby promoting cell growth by releasing bound growth factor. This model includes pappalysins and related metalloprotease domains from all three kingdoms of life. The three-dimensional structure of an archaeal representative, ulilysin, has been solved.
Probab=21.57 E-value=33 Score=34.19 Aligned_cols=49 Identities=31% Similarity=0.383 Sum_probs=28.3
Q ss_pred CCeeeeecccccccccc-ceeeeEeeec-Cchhh-----hhhhhhhccchhhHhhhc
Q 010326 370 GYRLIDMITEPYRLIRR-CEVTAILILY-GLPRL-----LTGSILAHEMMHAWLRLK 419 (513)
Q Consensus 370 G~rilei~~~p~~~~~~-~eV~~Il~l~-glP~~-----L~gsilaHE~~Hawl~l~ 419 (513)
+..++.+..-|...... .....|.+++ -+|-. -.|.||+||++| ||.|-
T Consensus 97 ~~~~lG~a~fP~~~~~~~~~~dGvvi~~~~~~~~~~~~~n~g~t~~HEvGH-~lGL~ 152 (225)
T cd04275 97 GGGLLGYATFPDSLVSLAFITDGVVINPSSLPGGSAAPYNLGDTATHEVGH-WLGLY 152 (225)
T ss_pred CCCcCEEEECCCcccCCccccceEEEeccccCCCCcccccccceeEEeccc-eeeee
Confidence 34456666666654432 2344555554 22332 347899999999 66655
No 130
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=21.49 E-value=26 Score=28.19 Aligned_cols=19 Identities=26% Similarity=0.302 Sum_probs=16.3
Q ss_pred ccCCCCeeeeecccccccc
Q 010326 366 RIGAGYRLIDMITEPYRLI 384 (513)
Q Consensus 366 ~~~~G~rilei~~~p~~~~ 384 (513)
.|.+||.|+.||+.++...
T Consensus 30 gl~~GD~I~~vng~~i~~~ 48 (85)
T cd00988 30 GIKAGDIIVAIDGEPVDGL 48 (85)
T ss_pred CCCCCCEEEEECCEEcCCC
Confidence 5789999999999988764
No 131
>KOG4286 consensus Dystrophin-like protein [Cell motility; Signal transduction mechanisms; Cytoskeleton]
Probab=21.34 E-value=32 Score=39.88 Aligned_cols=54 Identities=17% Similarity=0.255 Sum_probs=34.5
Q ss_pred cCcccccCCC-CcCCCCccceeeecccccccccCCCcccCCCCccCCCCCcCCCCCceEEccCCcccccccccc
Q 010326 218 HHPKCDVCQN-FIPTNSAGLIEYRAHPFWLQKYCPSHERDGTPRCCSCERMEPRDTKYLSLDDGRKLCLECLDS 290 (513)
Q Consensus 218 f~pkC~~C~~-~I~~~~~g~i~~~~hpfw~~~yCp~H~H~~CF~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~ 290 (513)
...+|.+|++ +|.| +.|+.-. ||.|..|...|-.|..--..+-+.++|.-|-.+
T Consensus 602 H~~kCniCk~~pIvG-----~RyR~l~--------------~fn~dlCq~CF~sgraak~hk~~~pM~Ey~~~t 656 (966)
T KOG4286|consen 602 HQAKCNICKECPIIG-----FRYRSLK--------------HFNYDICQSCFFSGRAAKGHKMHYPMVEYCTPT 656 (966)
T ss_pred hhhhcchhhhCccce-----eeeeehh--------------hcChhHHhhHhhhcccccCCCCCCCceeeeCCC
Confidence 4568999987 5544 5666543 788888877664443222334567888888655
No 132
>PF06750 DiS_P_DiS: Bacterial Peptidase A24 N-terminal domain; InterPro: IPR010627 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Aspartic endopeptidases 3.4.23. from EC of vertebrate, fungal and retroviral origin have been characterised []. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin [] and archaean preflagellin have been described [, ]. 
Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure. This domain is found at the N terminus of bacterial aspartic peptidases belonging to MEROPS peptidase family A24 (clan AD), subfamily A24A (type IV prepilin peptidase, IPR000045 from INTERPRO). Its function has not been specifically determined; however some of the family have been characterised as bifunctional [], and this domain may contain the N-methylation activity. The domain consists of an intracellular region between a pair of transmembrane domains. This intracellular region contains an invariant proline and four conserved cysteines. These Cys residues are arranged in a two-pair motif, with the Cys residues of a pair separated (usually) by 2 aa and with each pair separated by 21 largely hydrophilic residues (C-X-X-C...X21...C-X-X-C); they have been shown to be essential to the overall function of the enzyme [, ]. The bifunctional enzyme prepilin peptidase (PilD) from Pseudomonas aeruginosa is a key determinant in both type-IV pilus biogenesis and extracellular protein secretion, in its roles as a leader peptidase and methyl transferase (MTase). 
It is responsible for endopeptidic cleavage of the unique leader peptides that characterise type-IV pilin precursors, as well as proteins with homologous leader sequences that are essential components of the general secretion pathway found in a variety of Gram-negative pathogens. Following removal of the leader peptides, the same enzyme is responsible for the second posttranslational modification that characterises the type-IV pilins and their homologues, namely N-methylation of the newly exposed N-terminal amino acid residue [].
Probab=21.29 E-value=40 Score=28.94 Aligned_cols=39 Identities=18% Similarity=0.388 Sum_probs=25.3
Q ss_pred CcCcCCCcccccCceeeecCccccCCCcccCCCCCCCCCcc
Q 010326 160 RICAGCNTEIGHGRYLSCMEAFWHPECFRCHSCNLPITDVE 200 (513)
Q Consensus 160 ~~C~~C~~~I~~g~~l~alg~~wHp~CFrCs~C~~~L~~~~ 200 (513)
..|..|++++..-+.+...+-.+.. -+|..|+.+|.-.+
T Consensus 34 S~C~~C~~~L~~~~lIPi~S~l~lr--GrCr~C~~~I~~~y 72 (92)
T PF06750_consen 34 SHCPHCGHPLSWWDLIPILSYLLLR--GRCRYCGAPIPPRY 72 (92)
T ss_pred CcCcCCCCcCcccccchHHHHHHhC--CCCcccCCCCChHH
Confidence 5688888888655555555444444 36777777776543
No 133
>PF10235 Cript: Microtubule-associated protein CRIPT; InterPro: IPR019367 The CRIPT protein is a cytoskeletal protein involved in microtubule production. This C-terminal domain is essential for binding to the PDZ3 domain of the SAP90 protein, one of a super-family of PDZ-containing proteins that play an important role in coupling the membrane ion channels with their signalling partners [].
Probab=20.82 E-value=40 Score=29.09 Aligned_cols=36 Identities=22% Similarity=0.425 Sum_probs=26.6
Q ss_pred CccCCCCCcCCCCCceEEccCCcccccccccccccCCCCCcccchHHH
Q 010326 259 PRCCSCERMEPRDTKYLSLDDGRKLCLECLDSAIMDTHECQPLYLEIQ 306 (513)
Q Consensus 259 F~C~~C~r~l~~g~~y~~l~dgr~yC~~Cy~~~v~~t~~C~~c~~~I~ 306 (513)
-.|..|...+. ..|..||..|-.+ ..+|+-|.+.|+
T Consensus 45 ~~C~~CK~~v~--------q~g~~YCq~CAYk----kGiCamCGKki~ 80 (90)
T PF10235_consen 45 SKCKICKTKVH--------QPGAKYCQTCAYK----KGICAMCGKKIL 80 (90)
T ss_pred ccccccccccc--------cCCCccChhhhcc----cCcccccCCeec
Confidence 46888887652 1367889999776 468999988873
No 134
>PF01435 Peptidase_M48: Peptidase family M48 This is family M48 in the peptidase classification. ; InterPro: IPR001915 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. 
The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belongs to MEROPS peptidase family M48 (Ste24 endopeptidase family, clan M-); members of both subfamilies are represented. The members of this set of proteins are mostly described as probable protease htpX homologue (3.4.24 from EC) or CAAX prenyl protease 1, which proteolytically removes the C-terminal three residues of farnesylated proteins. They are integral membrane proteins associated with the endoplasmic reticulum and Golgi, binding one zinc ion per subunit. In Saccharomyces cerevisiae (Baker's yeast) Ste24p is required for the first NH2-terminal proteolytic processing event within the a-factor precursor, which takes place after COOH-terminal CAAX modification is complete. The Ste24p contains multiple predicted membrane spans, a zinc metalloprotease motif (HEXXH), and a COOH-terminal ER retrieval signal (KKXX). The HEXXH protease motif is critical for Ste24p activity, since Ste24p fails to function when conserved residues within this motif are mutated. 
The Ste24p homologues occur in a diverse group of organisms, including Escherichia coli, Schizosaccharomyces pombe (Fission yeast), Haemophilus influenzae, and Homo sapiens (Human), which indicates that the gene is highly conserved throughout evolution. Ste24p and the proteins related to it define a subfamily of proteins that are likely to function as intracellular, membrane-associated zinc metalloproteases []. HtpX is a zinc-dependent endoprotease member of the membrane-localized proteolytic system in E. coli, which participates in the proteolytic quality control of membrane proteins in conjunction with FtsH, a membrane-bound and ATP-dependent protease. Biochemical characterisation revealed that HtpX undergoes self-degradation upon cell disruption or membrane solubilization. It can also degrade casein and cleave solubilized membrane proteins, for example, SecY []. Expression of HtpX in the plasma membrane is under the control of CpxR, with the metalloproteinase active site of HtpX located on the cytosolic side of the membrane. This suggests a potential role for HtpX in the response to mis-folded proteins [].; GO: 0004222 metalloendopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 3CQB_A 3C37_B.
Probab=20.80 E-value=37 Score=32.49 Aligned_cols=14 Identities=43% Similarity=0.584 Sum_probs=11.9
Q ss_pred hhhhhccchhhHhh
Q 010326 404 GSILAHEMMHAWLR 417 (513)
Q Consensus 404 gsilaHE~~Hawl~ 417 (513)
.++||||+.|...+
T Consensus 90 ~aVlaHElgH~~~~ 103 (226)
T PF01435_consen 90 AAVLAHELGHIKHR 103 (226)
T ss_dssp HHHHHHHHHHHHTT
T ss_pred HHHHHHHHHHHHcC
Confidence 67999999997655
No 135
>PF01447 Peptidase_M4: Thermolysin metallopeptidase, catalytic domain This Prosite motif covers only the active site. This is family M4 in the peptidase classification. ; InterPro: IPR013856 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. 
In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belongs to the MEROPS peptidase family M4 (thermolysin family, clan MA(E)). The protein fold of the peptidase domain of thermolysin is the type example for members of the clan MA. The thermolysin family is composed only of secreted eubacterial endopeptidases. The zinc-binding residues are H-142, H-146 and E-166, with E-143 acting as the catalytic residue. Thermolysin also contains 4 calcium-binding sites, which contribute to its unusual thermostability. The family also includes enzymes from a number of pathogens, including Legionella and Listeria, and the protein pseudolysin, all with a substrate specificity for an aromatic residue in the P1' position. Three-dimensional structure analysis has shown that the enzymes undergo a hinge-bend motion during catalysis. Pseudolysin has a broader specificity, acting on large molecules such as elastin and collagen, possibly due to its wider active site cleft []. 
This entry represents a domain found in peptidase M4 family members.; GO: 0004222 metalloendopeptidase activity; PDB: 3NQX_A 3NQZ_B 3NQY_B 1BQB_A 1U4G_A 1EZM_A 3DBK_A 1ESP_A 1NPC_A 1LND_E ....
Probab=20.74 E-value=39 Score=31.64 Aligned_cols=18 Identities=28% Similarity=0.379 Sum_probs=11.8
Q ss_pred hhhhhhhhhhccchhhHh
Q 010326 399 PRLLTGSILAHEMMHAWL 416 (513)
Q Consensus 399 P~~L~gsilaHE~~Hawl 416 (513)
|+.-.--|+|||++|+..
T Consensus 131 ~~~~~lDVvaHEltHGVt 148 (150)
T PF01447_consen 131 PFASSLDVVAHELTHGVT 148 (150)
T ss_dssp -GGG-HHHHHHHHHHHHH
T ss_pred cCccccceeeeccccccc
Confidence 433334599999999864
No 136
>cd04269 ZnMc_adamalysin_II_like Zinc-dependent metalloprotease; adamalysin_II_like subfamily. Adamalysin II is a snake venom zinc endopeptidase. This subfamily contains other snake venom metalloproteinases, as well as membrane-anchored metalloproteases belonging to the ADAM family. ADAMs (A Disintegrin And Metalloprotease) are glycoproteins, which play roles in cell signaling, cell fusion, and cell-cell interactions.
Probab=20.71 E-value=60 Score=30.80 Aligned_cols=24 Identities=25% Similarity=0.220 Sum_probs=15.8
Q ss_pred EeeecCchhhhhhhhhhccchhhH
Q 010326 392 ILILYGLPRLLTGSILAHEMMHAW 415 (513)
Q Consensus 392 Il~l~glP~~L~gsilaHE~~Haw 415 (513)
|....+-....+..|+|||++|.+
T Consensus 120 v~~~~~~~~~~~a~~~AHElGH~l 143 (194)
T cd04269 120 VVQDHSRNLLLFAVTMAHELGHNL 143 (194)
T ss_pred EEEeCCcchHHHHHHHHHHHHhhc
Confidence 333444334566789999999954
No 137
>PHA00527 hypothetical protein
Probab=20.46 E-value=2e+02 Score=25.67 Aligned_cols=63 Identities=24% Similarity=0.295 Sum_probs=40.1
Q ss_pred eeeecccccccccccee-eeEeeecCchhhhhhhhhhccchhhH---hhhcCCCCCCCcchhhHHHHHHHHH
Q 010326 373 LIDMITEPYRLIRRCEV-TAILILYGLPRLLTGSILAHEMMHAW---LRLKGYPNLRPDVEEGICQVLAHMW 440 (513)
Q Consensus 373 ilei~~~p~~~~~~~eV-~~Il~l~glP~~L~gsilaHE~~Haw---l~l~g~~~L~~~~eEG~cq~~a~~w 440 (513)
.|--||.+.-.+...++ -.|=|..|.. +||+||.+|.= .+--|...-|-+.-|-.|-+|.-|.
T Consensus 47 mla~~~~S~~~s~~~~~L~~~GVFNGK~-----~T~~HECAH~AF~vC~~VGV~~E~G~ANETYCY~~~R~~ 113 (129)
T PHA00527 47 MLAGATQSYCNTETGENLYLLGVFNGKA-----ATLVHECAHVAFYVCRDVGVTTEPGDANETYCYMLDRMF 113 (129)
T ss_pred hhhccccccccccCCCeEEEEEEeccHH-----HHHHHHHHHHHHHHHHhcCcccCCCccchhHHHHHHHHH
Confidence 44455555555554443 3334455653 59999999962 2323665556688899999998876
No 138
>PF12156 ATPase-cat_bd: Putative metal-binding domain of cation transport ATPase; InterPro: IPR021993 This domain is found in bacteria, and is approximately 90 amino acids in length. It is found associated with PF00403 from PFAM, PF00122 from PFAM, PF00702 from PFAM. The cysteine-rich nature and composition suggest this might be a cation-binding domain; most members are annotated as being cation transport ATPases.
Probab=20.39 E-value=53 Score=27.86 Aligned_cols=12 Identities=33% Similarity=0.683 Sum_probs=8.5
Q ss_pred ccccCCCCcCCC
Q 010326 221 KCDVCQNFIPTN 232 (513)
Q Consensus 221 kC~~C~~~I~~~ 232 (513)
.|..|+.+|+.+
T Consensus 2 ~C~HCg~~~p~~ 13 (88)
T PF12156_consen 2 KCYHCGLPVPEG 13 (88)
T ss_pred CCCCCCCCCCCC
Confidence 477788888644
No 139
>PRK04439 S-adenosylmethionine synthetase; Provisional
Probab=20.22 E-value=73 Score=34.52 Aligned_cols=54 Identities=28% Similarity=0.375 Sum_probs=31.8
Q ss_pred ccccCCCCeeeeeccccccccccceeeeEeeecCchhhhhhhhhhccchhhHhhhcCCCCCCCc
Q 010326 364 RPRIGAGYRLIDMITEPYRLIRRCEVTAILILYGLPRLLTGSILAHEMMHAWLRLKGYPNLRPD 427 (513)
Q Consensus 364 ~~~~~~G~rilei~~~p~~~~~~~eV~~Il~l~glP~~L~gsilaHE~~Hawl~l~g~~~L~~~ 427 (513)
.|++++|.-| +|++..-....|.-.--..+|.- .||-+-...||+ +..|+|+|+
T Consensus 72 ~p~fGGG~vi-----~Pi~ii~~GRAt~~~~g~~iPv~----~Ia~~Aak~~L~-~~l~~lD~e 125 (399)
T PRK04439 72 APKFGGGEVI-----EPIYIILGGRATKEVGGEEIPVG----EIAIEAAKEYLR-ENLRNLDPE 125 (399)
T ss_pred eccCCCceEE-----eeEEEEEecceeeeECCeEecHH----HHHHHHHHHHHH-HhCccCCcc
Confidence 4566555544 34444433333333222235663 577788899999 888888873
No 140
>PF12388 Peptidase_M57: Dual-action HEIGH metallo-peptidase; InterPro: IPR024653 This entry represents the metallopeptidases M10, M27 and M57. The catalytic triad for proteases in this entry is HE-H-H, which in many members is in the sequence motif HEIGH [].
Probab=20.12 E-value=61 Score=32.20 Aligned_cols=39 Identities=21% Similarity=0.368 Sum_probs=23.8
Q ss_pred eccccccccccceeeeEeeecCchhhhhhhhhhccchhhHhhhcCCCCC
Q 010326 376 MITEPYRLIRRCEVTAILILYGLPRLLTGSILAHEMMHAWLRLKGYPNL 424 (513)
Q Consensus 376 i~~~p~~~~~~~eV~~Il~l~glP~~L~gsilaHE~~Hawl~l~g~~~L 424 (513)
.+|.|.+.+.. .-+......+...+|+||+.|+- |+|+-
T Consensus 112 s~G~P~~~I~I------~~~~~~~~~~~~hvi~HEiGH~I----GfRHT 150 (211)
T PF12388_consen 112 SNGNPYKFIQI------YGLSNYSVNVIEHVITHEIGHCI----GFRHT 150 (211)
T ss_pred CCCCCCceEEE------EecCCCchhHHHHHHHHHhhhhc----ccccc
Confidence 45555554442 33344455566789999999963 55553
No 141
>PF02591 DUF164: Putative zinc ribbon domain; InterPro: IPR003743 This entry describes proteins of unknown function.
Probab=20.12 E-value=26 Score=26.98 Aligned_cols=15 Identities=20% Similarity=0.530 Sum_probs=10.6
Q ss_pred CcccccCCCCcCCCC
Q 010326 219 HPKCDVCQNFIPTNS 233 (513)
Q Consensus 219 ~pkC~~C~~~I~~~~ 233 (513)
+-.|.+|+..|++..
T Consensus 22 ~~~C~gC~~~l~~~~ 36 (56)
T PF02591_consen 22 GGTCSGCHMELPPQE 36 (56)
T ss_pred CCccCCCCEEcCHHH
Confidence 346888888887753
Done!