TY - JOUR
T1 - Prediction reliability of QSAR models
T2 - an overview of various validation tools
AU - De, Priyanka
AU - Kar, Supratik
AU - Ambure, Pravin
AU - Roy, Kunal
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2022/5
Y1 - 2022/5
AB - The reliability of any quantitative structure–activity relationship (QSAR) model depends on multiple aspects such as the accuracy of the input dataset, selection of significant descriptors, the appropriate splitting process of the dataset, statistical tools used, and most notably on the measures of validation. Validation, the most crucial step in QSAR model development, confirms the reliability of the developed QSAR models and the acceptability of each step in the model development. The present review deals with various validation tools that involve multiple techniques that improve the model quality and robustness. The double cross-validation tool helps in building improved quality models using different combinations of the same training set in an inner cross-validation loop. This exhaustive method is also integrated for small datasets (< 40 compounds) in another tool, namely the small dataset modeler tool. The main aim of QSAR researchers is to improve prediction quality by lowering the prediction errors for the query compounds. ‘Intelligent’ selection of multiple models and consensus predictions integrated in the intelligent consensus predictor tool were found to be more externally predictive than individual models. Furthermore, another tool called Prediction Reliability Indicator was explained to understand the quality of predictions for a true external set. This tool uses a composite scoring technique to identify query compounds as ‘good’ or ‘moderate’ or ‘bad’ predictions. We have also discussed a quantitative read-across tool which predicts a chemical response based on the similarity with structural analogues. The discussed tools are freely available from https://dtclab.webs.com/software-tools or http://teqip.jdvu.ac.in/QSAR_Tools/DTCLab/ and https://sites.google.com/jadavpuruniversity.in/dtc-lab-software/home (for read-across).
KW - Double cross-validation
KW - Intelligent consensus prediction
KW - QSAR
KW - Read across
KW - Small dataset modeling
KW - Validation
UR - http://www.scopus.com/inward/record.url?scp=85126112956&partnerID=8YFLogxK
U2 - 10.1007/s00204-022-03252-y
DO - 10.1007/s00204-022-03252-y
M3 - Review article
C2 - 35267067
AN - SCOPUS:85126112956
SN - 0340-5761
VL - 96
SP - 1279
EP - 1295
JO - Archives of Toxicology
JF - Archives of Toxicology
IS - 5
ER -