Multivariate analysis of experimental and computational descriptors of molecular lipophilicity

CRUCIANI, Gabriele
1998

Abstract

Two experimental descriptors (log P, RMw) and 17 calculated descriptors of molecular lipophilicity (fragmental, atom-based, or based on molecular properties) were investigated by multivariate analysis for a database of 159 compounds, including both simple structures and more complex drug molecules. Principal component analysis (PCA) of the entire database exhibits a clustering of chemical groups; the precision of the clustering corresponds to chemical similarity. Thus, diversity searching in databases might be performed effectively by PCA on the basis of calculated log P. The comparative validity check of experimental and computational procedures by regression analysis and PCA was performed with a chemically balanced, reduced data set (n = 55) representing 11 chemical groups with 5 members each. Regression of the experimental descriptors (log Poct versus RMw) proves that chromatographic data, obtained under well-defined experimental conditions, can be used as valid substitutes for log P. Regression of calculated versus experimental lipophilicity data shows that fragmental methods are superior to atom-based methods and to approaches based on molecular properties, as indicated by correlation coefficients, slopes and intercepts. In addition, PCA revealed that fragmental methods (Rekker-type, KOWWIN, KLOGP) sense the compound ranking in log P data to almost the same extent as experimental approaches. For atom-based procedures and CLOGP, both the comparability of absolute values and the sensing of the compound ranking in the database are slightly lower. This trend is more pronounced for the methods based on molecular properties, with the exception of BLOGP.
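
The two kinds of analysis named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure, software, or data: every number below is invented for illustration, and the names `calc_a`, `calc_b`, `linear_fit` and `first_pc` are hypothetical. The sketch regresses calculated on experimental log P (an ideal method would give a slope near 1, an intercept near 0 and r near 1) and then extracts the first principal component of the correlation matrix by power iteration, as a simple stand-in for a full PCA.

```python
import math

def linear_fit(x, y):
    """Least-squares fit y = slope * x + intercept; also returns Pearson r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)

def standardize(col):
    """Centre a column and scale it to unit sample standard deviation."""
    n = len(col)
    m = sum(col) / n
    s = math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
    return [(v - m) / s for v in col]

def first_pc(columns, iterations=200):
    """PC1 loadings and explained-variance fraction of the correlation matrix."""
    z = [standardize(c) for c in columns]
    n, k = len(z[0]), len(z)
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(k)] for i in range(k)]
    v = [1.0] * k  # starting vector for power iteration
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(t * t for t in w))
        v = [t / norm for t in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(k))
                     for i in range(k))
    # The trace of a correlation matrix equals the number of variables, k.
    return v, eigenvalue / k

# Invented values for six hypothetical compounds (NOT data from the paper).
log_p_exp = [0.5, 1.2, 2.0, 2.9, 3.6, 4.4]   # "experimental" log P
calc_a = [0.6, 1.1, 2.1, 2.8, 3.7, 4.3]      # hypothetical method tracking experiment
calc_b = [1.4, 1.5, 2.4, 2.3, 3.1, 3.0]      # hypothetical method, compressed scale

slope_a, icpt_a, r_a = linear_fit(log_p_exp, calc_a)
slope_b, icpt_b, r_b = linear_fit(log_p_exp, calc_b)
print(f"method A: slope={slope_a:.2f} intercept={icpt_a:.2f} r={r_a:.3f}")
print(f"method B: slope={slope_b:.2f} intercept={icpt_b:.2f} r={r_b:.3f}")

# Variables (columns) are the descriptors; objects (rows) are the compounds.
loadings, explained = first_pc([log_p_exp, calc_a, calc_b])
print("PC1 loadings:", [round(w, 3) for w in loadings])
print(f"PC1 explained variance fraction: {explained:.3f}")
```

In this toy setting, a descriptor that ranks the compounds differently from the others correlates less with them and so receives a smaller PC1 loading, while slope and intercept show how well absolute values agree. The sketch standardizes each descriptor before extracting the component; whether the original study scaled its data this way is an assumption.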

Use this identifier to cite or link to this document: http://hdl.handle.net/11391/908968