Virtual screening of molecular databases using a support vector machine

Robert Jorissen, Michael K. Gilson

Research output: Contribution to journal (Article)

199 Citations (Scopus)

Abstract

The Support Vector Machine (SVM) is an algorithm that derives a model used for the classification of data into two categories and which has good generalization properties. This study applies the SVM algorithm to the problem of virtual screening for molecules with a desired activity. In contrast to typical applications of the SVM, we emphasize not classification but enrichment of actives by using a modified version of the standard SVM function to rank molecules. The method employs a simple and novel criterion for picking molecular descriptors and uses cross-validation to select SVM parameters. The resulting method is more effective at enriching for active compounds with novel chemistries than binary fingerprint-based methods such as binary kernel discrimination.
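The abstract's emphasis on enrichment rather than classification can be illustrated with a small sketch. The abstract does not say which enrichment measure the authors used, so the function below assumes the commonly used enrichment factor: the fraction of actives in the top-ranked part of the list divided by the fraction of actives in the whole library. The scores are hypothetical stand-ins for the output of any ranking function (for example, an SVM decision value), and the names `enrichment_factor` and `top_fraction` are illustrative, not taken from the paper.

```python
from typing import Sequence


def enrichment_factor(
    scores: Sequence[float],
    is_active: Sequence[bool],
    top_fraction: float = 0.1,
) -> float:
    """Enrichment of actives in the top-ranked fraction of a screened library.

    Assumed definition (not taken from the paper):
    (actives in top / size of top) / (actives overall / library size).
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    n_total = len(scores)
    n_actives = sum(is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")

    # Rank molecules from highest to lowest score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)

    # Size of the top slice; always inspect at least one molecule.
    n_top = max(1, round(n_total * top_fraction))
    hits_in_top = sum(active for _, active in ranked[:n_top])

    return (hits_in_top / n_top) / (n_actives / n_total)


# Hypothetical scores for ten molecules, three of which are active.
scores = [2.1, 1.7, 0.9, 0.4, 0.1, -0.2, -0.5, -0.8, -1.3, -1.9]
labels = [True, True, False, True, False, False, False, False, False, False]

ef_top20 = enrichment_factor(scores, labels, top_fraction=0.2)
ef_all = enrichment_factor(scores, labels, top_fraction=1.0)
print(f"EF at top 20%: {ef_top20:.2f}; EF over whole list: {ef_all:.2f}")
```

A value above 1 means the ranking places actives near the top more often than random selection would; taking the whole list always gives exactly 1.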

Language: English
Pages: 549-561
Number of pages: 13
Journal: Journal of Chemical Information and Modeling
Volume: 45
Issue number: 3
DOI: 10.1021/ci049641u
Publication status: Published - 1 May 2005

ASJC Scopus subject areas

  • Chemistry(all)
  • Chemical Engineering(all)
  • Computer Science Applications
  • Library and Information Sciences

Cite this

@article{d8b511a87c8944f5a3fe6e7362eaf8f9,
title = "Virtual screening of molecular databases using a support vector machine",
abstract = "The Support Vector Machine (SVM) is an algorithm that derives a model used for the classification of data into two categories and which has good generalization properties. This study applies the SVM algorithm to the problem of virtual screening for molecules with a desired activity. In contrast to typical applications of the SVM, we emphasize not classification but enrichment of actives by using a modified version of the standard SVM function to rank molecules. The method employs a simple and novel criterion for picking molecular descriptors and uses cross-validation to select SVM parameters. The resulting method is more effective at enriching for active compounds with novel chemistries than binary fingerprint-based methods such as binary kernel discrimination.",
author = "Jorissen, Robert and Gilson, {Michael K.}",
year = "2005",
month = "5",
day = "1",
doi = "10.1021/ci049641u",
language = "English",
volume = "45",
pages = "549--561",
journal = "Journal of Chemical Information and Modeling",
issn = "1549-9596",
number = "3",
}

Virtual screening of molecular databases using a support vector machine. / Jorissen, Robert; Gilson, Michael K.

In: Journal of Chemical Information and Modeling, Vol. 45, No. 3, 01.05.2005, p. 549-561.


TY - JOUR

T1 - Virtual screening of molecular databases using a support vector machine

AU - Jorissen, Robert

AU - Gilson, Michael K.

PY - 2005/5/1

Y1 - 2005/5/1


AB - The Support Vector Machine (SVM) is an algorithm that derives a model used for the classification of data into two categories and which has good generalization properties. This study applies the SVM algorithm to the problem of virtual screening for molecules with a desired activity. In contrast to typical applications of the SVM, we emphasize not classification but enrichment of actives by using a modified version of the standard SVM function to rank molecules. The method employs a simple and novel criterion for picking molecular descriptors and uses cross-validation to select SVM parameters. The resulting method is more effective at enriching for active compounds with novel chemistries than binary fingerprint-based methods such as binary kernel discrimination.

UR - http://www.scopus.com/inward/record.url?scp=20444410410&partnerID=8YFLogxK

U2 - 10.1021/ci049641u

DO - 10.1021/ci049641u

M3 - Article

VL - 45

SP - 549

EP - 561

JO - Journal of Chemical Information and Modeling

T2 - Journal of Chemical Information and Modeling

JF - Journal of Chemical Information and Modeling

SN - 1549-9596

IS - 3

ER -