Abstract
A data set of 712 compounds was used for classification into two classes with respect to membrane permeation in a cell-based assay: (0) apparent permeability (Papp) below 4 × 10⁻⁶ cm/s and (1) Papp of 4 × 10⁻⁶ cm/s or higher. Nine molecular descriptors were calculated for each compound, and nearest-neighbor classification was applied using five neighbors, as optimized by full cross-validation. A model based on five descriptors (number of flexible bonds, number of hydrogen bond acceptors and donors, and molecular and polar surface area) was selected by variable selection. In an external test set of 112 compounds, 104 compounds were classified and 8 compounds were judged as “unknown”. Among the 104 classified compounds, 16 were misclassified, corresponding to a misclassification rate of 15%, and no compounds were falsely predicted to be in the nonpermeable class.
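The classification scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, the toy two-dimensional data, and in particular the rule for returning "unknown" (fewer than `min_votes` of the five neighbors agreeing) are assumptions made for illustration, since the abstract does not state how "unknown" was assigned. In practice the descriptors would also need to be scaled before distances are computed.

```python
import math
from collections import Counter


def knn_classify(train_X, train_y, query, k=5, min_votes=4):
    """Majority vote among the k nearest training points.

    Returns "unknown" when the winning class has fewer than min_votes
    votes. This rejection rule is a hypothetical choice, not taken
    from the source.
    """
    # Pair each training point's Euclidean distance with its label.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    label, count = votes.most_common(1)[0]
    return label if count >= min_votes else "unknown"


def misclassification_rate(n_classified, n_wrong):
    """Fraction of classified compounds assigned to the wrong class."""
    return n_wrong / n_classified


# Toy data (made up): two well-separated clusters in a 2-D descriptor space.
train_X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
           (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
train_y = [0] * 5 + [1] * 5

near_class_0 = knn_classify(train_X, train_y, (0.2, 0.2))
near_class_1 = knn_classify(train_X, train_y, (10.2, 10.2))
in_between = knn_classify(train_X, train_y, (5.5, 5.5))

# Figures reported in the abstract: 16 of 104 classified compounds were wrong.
rate = misclassification_rate(104, 16)
```

With the reported figures, 16 / 104 ≈ 0.154, which matches the stated misclassification rate of 15%. A query that falls between the two toy clusters splits its five votes 3 to 2 and is returned as "unknown" under the assumed threshold.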