
About: Prediction of n-Octanol/Water Partition Coefficients from PHYSPROP Database Using Artificial Neural Networks and E-State Indices

Attributes    Values
type
Is Part Of
Subject
Title
  • Prediction of n-Octanol/Water Partition Coefficients from PHYSPROP Database Using Artificial Neural Networks and E-State Indices
has manifestation of work
related by
Author
Abstract
  • A new method, ALOGPS v 2.0 (http://www.lnh.unil.ch/∼itetko/logp/), for the assessment of n-octanol/water partition coefficient, log P, was developed on the basis of neural network ensemble analysis of 12 908 organic compounds available from PHYSPROP database of Syracuse Research Corporation. The atom and bond-type E-state indices as well as the number of hydrogen and non-hydrogen atoms were used to represent the molecular structures. A preliminary selection of indices was performed by multiple linear regression analysis, and 75 input parameters were chosen. Some of the parameters combined several atom-type or bond-type indices with similar physicochemical properties. The neural network ensemble training was performed by efficient partition algorithm developed by the authors. The ensemble contained 50 neural networks, and each neural network had 10 neurons in one hidden layer. The prediction ability of the developed approach was estimated using both leave-one-out (LOO) technique and training/test protocol. In case of interseries predictions, i.e., when molecules in the test and in the training subsets were selected by chance from the same set of compounds, both approaches provided similar results. ALOGPS performance was significantly better than the results obtained by other tested methods. For a subset of 12 777 molecules the LOO results, namely correlation coefficient r² = 0.95, root mean squared error, RMSE = 0.39, and an absolute mean error, MAE = 0.29, were calculated. For two cross-series predictions, i.e., when molecules in the training and in the test sets belong to different series of compounds, all analyzed methods performed less efficiently. The decrease in the performance could be explained by a different diversity of molecules in the training and in the test sets. However, even for such difficult cases the ALOGPS method provided better prediction ability than the other tested methods. We have shown that the diversity of the training sets rather than the design of the methods is the main factor determining their prediction ability for new data. A comparative performance of the methods as well as a dependence on the number of non-hydrogen atoms in a molecule is also presented.
article type
is part of this journal
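
The abstract reports three summary statistics (r², RMSE, MAE) without defining them. The sketch below shows one common way to compute them from paired observed and predicted log P values. It is not the authors' code: the function name `regression_metrics` is hypothetical, and treating r² as the squared Pearson correlation coefficient is an assumption, since the record does not say which definition the paper uses.

```python
import math


def regression_metrics(observed, predicted):
    """Return (r2, rmse, mae) for paired observed/predicted values.

    Assumption: r2 is the squared Pearson correlation coefficient.
    The catalog record does not state which definition the paper uses.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    # Root mean squared error and mean absolute error of the predictions.
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    # Squared Pearson correlation between observed and predicted values.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    r2 = (cov * cov) / (var_o * var_p)

    return r2, rmse, mae
```

In a leave-one-out setting, `predicted[i]` would be the value produced for molecule `i` by a model trained without that molecule; the metrics themselves are computed the same way as for a separate test set.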


