Abstract
In large-scale virtual screening (VS) campaigns, data are often computed for millions of compounds to identify leads, but there remains the task of prioritizing VS “hits” for experimental assays and the dilemma of assessing true/false positives. We present two statistical methods for mining large databases: (1) a general scoring metric based on the VS signal-to-noise level within a compound neighborhood; (2) a neighborhood-based sampling strategy for reducing database size, in lieu of property-based filters.