Predictions without templates:
New folds, secondary structure & contacts in CASP5

Patrick Aloy, Alexander Stark, Caroline Hadley, Robert B. Russell
EMBL, Meyerhofstrasse 1, 69117 Heidelberg, Germany


Web site accompanying our assessment
Last updated 17th November 2003

The special issue is now published. When referring to these data, please cite:

P. Aloy, A. Stark, C. Hadley, R.B. Russell
Predictions without templates: New folds, secondary structure and contacts in CASP5.
PROTEINS: Struct. Funct. Genet., 53, 436-456, 2003 PubMed 14579333

3D/Coordinate predictions

New Fold Targets (NF)

New Fold / Fold Recognition Borderline Targets (NF/FR)


Full versions of tables to appear (truncated) in the paper

Visual assessments (Table 2) (1st models only)

Z-score (GDT_TS) assessment (Table 3) (1st models only)

Survey of methods used (Table 4), based on a questionnaire given to the best predictors.

Visual Inspection

Rob Russell looked through each target, with results sorted by raw GDT_TS score, judging whether or not each group had correctly predicted the overall fold. Two points were awarded for "excellent" predictions (those with a correct or nearly correct structure), one point for "good" predictions (those where features of the fold were correct, but with distortions or differences), and zero points otherwise. He went down the ranked list until it became clear that no more points would be awarded; the number of models considered varied from 20 to 124, and roughly 1000 models were inspected manually.

Summary for NF/FR


Numerical evaluations based on GDT_TS

Filter for overlapping contacts

Initial inspection of the predictions revealed a number in which we thought that overlapping coordinates were giving artificially high GDT_TS scores. We thus repeated all calculations after first applying a filter to try to remove this effect. We removed any prediction that had 10 or more clashes (defined as Calpha to Calpha distances <= 3.0 angstroms, ignoring adjacent residues). More discussion of problems with automated GDT_TS-based evaluations can be found here and on the FORCASP website. See here for some of our findings on how overlapping coordinates can increase GDT_TS.

Inspection of known 3D structures shows that they never have more than one contact between C-alphas < 3.7 angstroms. We also took precedent from the Critical Assessment of PRediction of Interactions meeting (assessment of docking; CAPRI), where a similar filter is applied for similar reasons.

A total of 146 (out of 4916) models were removed by this filter. Results below are either presented "raw" or "filtered".
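The clash filter described above can be sketched as follows. This is only an illustration under stated assumptions, not the code used for the assessment: "adjacent residues" is taken to mean neighbours in sequence, the input is assumed to be a list of C-alpha coordinates in residue order, and the names `count_clashes` and `passes_filter` are hypothetical.

```python
import math

def count_clashes(ca_coords, cutoff=3.0, min_separation=2):
    """Count C-alpha pairs at or below `cutoff` angstroms.

    Assumption: ca_coords is a list of (x, y, z) tuples in residue order,
    and "adjacent" means sequence neighbours (|i - j| == 1), which are skipped.
    """
    clashes = 0
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                clashes += 1
    return clashes

def passes_filter(ca_coords, max_clashes=10):
    """A model is kept only if it has fewer than `max_clashes` clashes."""
    return count_clashes(ca_coords) < max_clashes
```

For example, an idealised extended chain with residues 3.8 angstroms apart has no clashes and passes, whereas a model with many residues collapsed onto the same point is removed.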

Z-score

We calculated the mean, standard deviation and Z-score for all predictions per target. For each group, we then calculated the sum of the best (positive) Z-score per target over all targets and the average Z-score (performance, sum divided by number of predictions made). Z-scores acknowledge outstanding performance on difficult targets.
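A minimal sketch of this Z-score summary is given here. Several details are assumptions made for illustration: the population standard deviation is used, a group's best Z-score on a target contributes only if it is positive, and "number of predictions made" is read as the number of targets the group predicted. The function name `z_score_summary` and the input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

def z_score_summary(scores):
    """scores: {target: [(group, gdt_ts), ...]} -> {group: (sum, average)}."""
    best = defaultdict(dict)  # group -> {target: best Z-score on that target}
    for target, preds in scores.items():
        values = [s for _, s in preds]
        mu, sigma = mean(values), pstdev(values)
        for group, s in preds:
            z = (s - mu) / sigma if sigma > 0 else 0.0
            if z > best[group].get(target, float("-inf")):
                best[group][target] = z
    summary = {}
    for group, per_target in best.items():
        # Assumption: negative best Z-scores contribute nothing to the sum.
        total = sum(max(z, 0.0) for z in per_target.values())
        summary[group] = (total, total / len(per_target))
    return summary
```

Because each score is expressed relative to the spread of all predictions for that target, a prediction far ahead of the field on a hard target earns a large Z-score, which is the property noted above.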

Summaries for: NF, NF (filtered), NF+NF/FR, NF+NF/FR (filtered)
(Text files NF, NF (filtered), NF+NF/FR, NF+NF/FR (filtered))

Percentiles

We calculated the percentile (0-100) for each GDT_TS score per target. For the best prediction per group, we then gave points depending on the percentile of the GDT_TS score (>= 90: 6; >= 80: 4; >= 70: 3; >= 60: 2; >= 50: 1; < 50: 0) and normalized them in the range 0-100 (see Lesk et al., Proteins Suppl 5:98-118 (2001)). Percentiles acknowledge good sustained performance but remove differences within the different ranges. All targets have equal weights.
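The point scheme above can be sketched as follows. The thresholds come from the text; the rest is assumed for illustration: a percentile is taken as the percentage of predictions for the target scoring at or below the given score, and normalization divides by the maximum attainable points (6 per target, over all targets). All function names are hypothetical.

```python
def percentile_points(percentile):
    """Map a percentile (0-100) to points using the thresholds in the text."""
    for threshold, points in ((90, 6), (80, 4), (70, 3), (60, 2), (50, 1)):
        if percentile >= threshold:
            return points
    return 0

def percentile_of(score, all_scores):
    """Assumption: percentage of predictions scoring at or below `score`."""
    return 100.0 * sum(1 for s in all_scores if s <= score) / len(all_scores)

def group_percentile_score(group_best, target_scores):
    """group_best: {target: best GDT_TS of the group};
    target_scores: {target: all GDT_TS scores submitted for that target}.
    Assumption: normalize by 6 points times the total number of targets."""
    total = sum(
        percentile_points(percentile_of(score, target_scores[target]))
        for target, score in group_best.items()
    )
    return 100.0 * total / (6 * len(target_scores))
```

Under these assumptions, two predictions at the 91st and 99th percentile both earn 6 points, which illustrates how the scheme removes differences within a range.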

Summaries for: NF, NF (filtered), NF/FR, NF/FR (filtered)
(Text files NF, NF (filtered), NF/FR, NF/FR (filtered))

%-Rank

We sorted the groups according to the GDT_TS score and normalized the ranking for each target by removing the lower half of the predictions and giving points (0: worst to 100: best) for the remaining ones. We then added up the points for the best prediction per group for all targets. This acknowledges good performance and takes small differences into account. All targets have equal weights.
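One way to read the %-Rank procedure is sketched here. The linear spacing of points between 0 and 100 across the surviving upper half, the handling of ties (by sort order), and the function names are assumptions made for illustration only.

```python
def percent_rank_points(preds):
    """preds: [(group, gdt_ts), ...] for one target -> {group: points}.

    Assumption: after dropping the lower half, points are spaced linearly
    from 0 (worst remaining) to 100 (best remaining). Groups in the dropped
    half are absent from the result and so score nothing.
    """
    ranked = sorted(preds, key=lambda p: p[1])  # worst first
    upper = ranked[len(ranked) // 2:]           # remove the lower half
    span = max(len(upper) - 1, 1)
    points = {}
    for i, (group, _) in enumerate(upper):
        pts = 100.0 * i / span
        # Keep only the best prediction per group.
        points[group] = max(points.get(group, 0.0), pts)
    return points

def percent_rank_totals(scores):
    """scores: {target: [(group, gdt_ts), ...]} -> {group: summed points}."""
    totals = {}
    for preds in scores.values():
        for group, pts in percent_rank_points(preds).items():
            totals[group] = totals.get(group, 0.0) + pts
    return totals
```

Unlike the percentile bands, every rank within the upper half earns a different number of points in this reading, so small differences between groups are preserved.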

Summaries for: NF, NF (filtered), NF/FR, NF/FR (filtered)
(Text files NF, NF (filtered), NF/FR, NF/FR (filtered))

Simple Rankings

These tables report the ranks for each group and target based on GDT_TS score. Sum and group average are the same as for GDT_TS Z-score.

Summaries for: NF, NF (filtered), NF/FR, NF/FR (filtered)
(Text files NF, NF (filtered), NF/FR, NF/FR (filtered))

Secondary Structure Predictions

Text files and explanations