Introduction of $r_m^2$(rank) metric incorporating rank-order predictions as an additional tool for validation of QSAR/QSPR models

Kunal Roy, Indrani Mitra, Probir Kumar Ojha, Supratik Kar, Rudra Narayan Das, Humayun Kabir

Research output: Contribution to journal › Article › peer-review

73 Scopus citations

Abstract

In silico techniques involving the development of quantitative regression models have been used extensively to predict the activity, property and toxicity of new chemicals. The acceptability and subsequent applicability of such models for prediction are determined from several internal and external validation statistics. Among the different validation metrics, $Q^2$ and $R^2_{\mathrm{pred}}$ are the classical metrics for internal and external validation, respectively. Additionally, the $r_m^2$ metrics introduced by Roy and coworkers have been widely used by several groups of authors to ensure close agreement between the predicted and observed response data. However, none of the currently available and commonly used validation metrics provides any information about rank-order predictions for the test set. Thus, to incorporate rank-order predictions into the common validation metrics, which are originally calculated using a Pearson's correlation coefficient-based algorithm, the $r_m^2$(rank) metric is introduced in this work as a new variant of the $r_m^2$ series of metrics. The ability of this new metric to reflect rank-order prediction is assessed by applying it to judge the quality of predictions of regression-based quantitative structure-activity/property relationship (QSAR/QSPR) models for four different data sets. The validation metrics calculated in each case were compared for their ability to reflect rank-order predictions, based on their correlation with the conventional Spearman's rank correlation coefficient. A sum of ranking differences analysis, performed using Spearman's rank correlation coefficient as the reference, showed that the $r_m^2$(rank) metric exhibited the least difference in ranking from the reference metric. Thus, the close correlation of the $r_m^2$(rank) metric with Spearman's rank correlation coefficient indicates that the new metric can aptly capture rank-order prediction for the test data set and can be used as an additional validation tool, besides the conventional metrics, for assessing the acceptability and predictive ability of a QSAR/QSPR model.
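
As a rough illustration of the comparison described in the abstract, the sketch below computes Spearman's rank correlation coefficient and a rank-based variant of the $r_m^2$ metric for a small test set. The abstract does not give the formula, so two assumptions are made here: that $r_m^2 = r^2\,(1 - \sqrt{|r^2 - r_0^2|})$, with $r_0^2$ taken from a regression through the origin of observed on predicted values, and that the rank variant applies the same expression to rank-transformed data. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def ranks(values):
    """Return 1-based ranks, assigning the average rank to tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(observed, predicted):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson(ranks(observed), ranks(predicted))


def rm2(observed, predicted):
    """Assumed form: r^2 * (1 - sqrt(|r^2 - r0^2|))."""
    r2 = pearson(observed, predicted) ** 2
    # r0^2: fit observed = k * predicted with the intercept fixed at zero.
    k = sum(o * p for o, p in zip(observed, predicted)) / sum(p * p for p in predicted)
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - k * p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r0_2 = 1 - ss_res / ss_tot
    return r2 * (1 - math.sqrt(abs(r2 - r0_2)))


def rm2_rank(observed, predicted):
    """Assumed rank variant: the same expression applied to rank-transformed data."""
    return rm2(ranks(observed), ranks(predicted))


# Hypothetical test-set responses: one prediction set preserves the observed
# order, the other swaps the second and third compounds.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
pred_same_order = [1.1, 1.9, 3.2, 3.9, 5.3]
pred_swapped = [1.1, 3.2, 1.9, 3.9, 5.3]

for label, pred in [("same order", pred_same_order), ("swapped", pred_swapped)]:
    print(label, spearman(observed, pred), rm2(observed, pred), rm2_rank(observed, pred))
```

Under these assumptions, both the Spearman coefficient and the rank-based value reach 1 when the predicted order matches the observed order exactly, and both drop once two compounds are misordered, whereas the value-based metric also reacts to the size of the numerical errors.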

Original language: English
Pages (from-to): 200-210
Number of pages: 11
Journal: Chemometrics and Intelligent Laboratory Systems
Volume: 118
State: Published - 15 Aug 2012

Keywords

  • Pearson's correlation coefficient
  • QSAR
  • QSPR
  • QSTR
  • Spearman's rank correlation coefficient
  • Validation
