GRID-based three-dimensional pharmacophores II: PharmBench, a benchmark data set for evaluating pharmacophore elucidation methods

Cross, Simon; Ortuso, Francesco; Baroni, Massimo; Costa, Giosuè; Distinto, Simona; Moraca, Federica; Alcaro, Stefano; Cruciani, Gabriele

doi:10.1021/ci300154n

To date, published pharmacophore elucidation approaches have typically used only a handful of data sets for validation. Here, we have assembled a data set for 81 targets, containing 960 ligands aligned using their cocrystallized protein targets, to provide the experimental "gold standard". The two-dimensional structures are also assembled to remove conformational bias; an ideal method would take these structures as input, find the common features, and reproduce the bioactive conformations and their alignments in agreement with the X-ray-determined gold standard alignments. We present this data set and describe three objective measures to evaluate performance: the ability to identify the bioactive conformation, the ability to identify and correctly align this conformation for 50% of the molecules in each data set, and the pharmacophoric field similarity. We have applied this validation methodology to our pharmacophore elucidation method FLAPpharm, published in the first paper of this series, and discuss the limitations of the data set and of the objective success criteria. Starting from two-dimensional structures and producing unbiased models, FLAPpharm identified the bioactive conformations for 67% of the ligands and produced successful models according to the second metric for 67% of the PharmBench data sets. Inspection of the unsuccessful models highlighted the limitation of this root mean square (rms)-derived metric: many were found to be pharmacophorically reasonable, increasing the overall success rate to 83%. The PharmBench data set is available at http://www.moldiscovery.com/PharmBench, along with a web service that lets users score model alignments from external methods in the same way as presented here, thereby establishing a pharmacophore elucidation benchmark data set for use by the community.
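
The second success criterion (correct alignment for at least 50% of the molecules in a data set) can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: atoms are listed in the same order in the model and the X-ray coordinates, the model coordinates are already expressed in the X-ray reference frame (so no superposition is performed), and the 2.0 Å cutoff is a hypothetical value chosen only for illustration. The function names are illustrative and do not correspond to the PharmBench web service.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two atom lists in the same order.

    No superposition is applied: the deviation is measured in the shared
    reference frame, so a misplaced (but internally correct) pose is penalised.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def fraction_within(model, reference, threshold):
    """Fraction of reference ligands whose modelled pose lies within threshold."""
    hits = sum(
        1
        for name, ref_coords in reference.items()
        if name in model and rmsd(model[name], ref_coords) <= threshold
    )
    return hits / len(reference)


def dataset_success(model, reference, threshold=2.0, min_fraction=0.5):
    """True if at least min_fraction of the ligands are aligned within threshold.

    threshold=2.0 is a hypothetical cutoff, not a value taken from the paper.
    """
    return fraction_within(model, reference, threshold) >= min_fraction


# Toy data: two ligands with two atoms each (coordinates in angstroms).
reference = {"lig1": [(0, 0, 0), (1, 0, 0)], "lig2": [(0, 0, 0), (0, 1, 0)]}
model = {"lig1": [(0, 0, 0), (1, 0, 0)], "lig2": [(3, 0, 0), (3, 1, 0)]}
print(dataset_success(model, reference))
```

In the toy data, one ligand is reproduced exactly and the other is translated by 3 Å, so half of the ligands fall within the hypothetical cutoff. Because the score depends only on coordinate deviation, it cannot recognise a model that overlays equivalent pharmacophoric features in a different pose, which is the limitation of the rms-derived metric noted above.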
