Analysis of CASP8 targets

A collection of pages for all CASP8 targets, with links to specialized target pages.
Use the menu on the right to access individual target pages.

CASP8 offered 128 targets for server prediction, from T0387 to T0514. As of 21-Nov-2008, structures for 125 of them (including those of proteins close in sequence to T0498 and T0499) were available from PDB and other public sources and were used to evaluate predictions. Of the remaining 3 targets, 2 (T0403 and T0439) will not have structures determined in the near future, and T0500 was structurally disordered. Links in the rightmost panel lead to individual target pages for all CASP8 targets.
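The target bookkeeping above can be checked with a short sketch. The ID range and the three targets without structures are taken from the text; the function and variable names (`target_ids`, `NO_STRUCTURE`) are hypothetical and used only for illustration.

```python
# Target ID range as stated in the text: T0387 through T0514.
FIRST, LAST = 387, 514


def target_ids(first=FIRST, last=LAST):
    """Return target IDs in the T0xxx format used on this page."""
    return [f"T{n:04d}" for n in range(first, last + 1)]


# Targets without a structure usable for evaluation, per the text:
# T0403 and T0439 (not determined), T0500 (structurally disordered).
NO_STRUCTURE = {"T0403", "T0439", "T0500"}

all_targets = target_ids()
evaluated = [t for t in all_targets if t not in NO_STRUCTURE]

print(len(all_targets))  # 128 targets offered for server prediction
print(len(evaluated))    # 125 targets used for evaluation
```

Note that this sketch only reproduces the counts; it does not capture that T0498 and T0499 were evaluated using structures of proteins close in sequence to them.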

A web page is dedicated to each target. On it we provide basic information about the target's sequence and structure, illustrate the structure, specify the boundaries of its evolutionary domains, classify the domains by sequence and structure, assign them to a prediction difficulty category, list discrepancies between the PDB file and the target sequence, and tabulate server predictions for the whole chain and for individual domains, evaluated with various scores. Some targets that reveal unexpected nuances about proteins or predictions are described in greater detail: sequence alignments for their families are provided, and additional features are illustrated and discussed. For instance, T0467 was particularly interesting, both because its fold was difficult to predict and because server predictions revealed a structurally meaningful but non-homologous similarity.

The most significant part of the individual target pages is the evolutionary classification of these proteins and their domains. Whenever possible, we tried to stay within the framework of SCOP. For many domains such classification was straightforward, because strong sequence similarity existed between the targets and proteins in PDB. However, some targets were particularly interesting because their homologs in PDB were not easy to find. For instance, we show that T0465 is a very distant version of the FYSH domain, and T0460 is a modified NADH-quinone oxidoreductase chain 5 (Nqo5) domain with a singleton sequence. A common question is whether a particular target or domain has a novel fold. This issue is discussed here.

As a group specializing in protein evolution, we are very excited about the evolutionary classification of targets. However, while such classification is very important for interpreting and understanding predictions, we believe that it is of limited value for evaluating them. In CASP, classification by category, which is usually based on how difficult a target is to predict, is more relevant. We offer a rather detailed look and bin targets into 5 categories suggested by the data on prediction accuracy. Our analysis of prediction accuracy indicates that at least 3 categories are required (hard, medium and easy). To give a finer-grained view of predictions, 'hard' and 'medium' were further split, which leads to 5 categories: FM (free modeling), FR (fold recognition), CM_H (comparative modeling: hard), CM_M (comparative modeling: medium), and CM_E (comparative modeling: easy).

In CASP8, a group of targets was designated for "human" prediction, i.e. non-server predictors applied their expert knowledge to such targets. 57 targets (~45%) were assigned to the 'Human/Server' group. The remaining 71 were in the 'Server only' group and usually were not worked on by human experts. Here, we provide sets of pages, each dedicated to a particular group of targets. Browsing through links on these pages keeps you within that group of targets. In addition to the 'Human/Server' and 'Server only' sets, we offer sets of pages for targets whose structure was determined by a particular method: X-ray (107 targets) and NMR (21 targets).

Human/Server targets
X-ray targets

Server only targets
NMR targets


Domain parse of targets

To display structures with a mouse-click

Discussion of whole chain vs. domain evaluations

New folds. Were there any?
Target category definition

Preparation of NMR structures

Download: target seq | target str | domain def | evaluation scores