An implicit scoring model is capable of capturing ‘chemical intuition’, researchers from Novartis and Microsoft write in Nature Communications. They see their tool as a valuable addition that allows drug leads to be selected more efficiently.

In chemistry, many problems can be tackled using a rational approach. Theoretical underpinning, well-established physico-chemical relationships and a wealth of literature and experimental data are all available. Even so, these do not fully explain all the choices chemists make on a daily basis. Accumulated experience, often expressed as ‘intuition’, also plays a major role.


It is hard to make that intuition explicit, but that is exactly what researchers from Novartis and Microsoft set out to do. If you can reveal the underlying patterns, you can formalise that ‘hidden’ knowledge and use it in new, predictive machine learning techniques that make the selection and optimisation of new drug leads more efficient.

To explore whether there actually are patterns underlying that elusive intuition, 35 experienced medicinal chemists from Novartis were presented with well-annotated molecular structures in several rounds, each time with the assignment to choose the best candidate from sets of two for further lead optimisation.   

Their choices were then used as input for training a so-called implicit scoring model, which uses all feedback from the chemists, combined with the other known parameters, to predict whether new, as yet unexplored molecules are suitable drug candidates. The outcome: intuition can be captured. The predictions of the trained model were consistent and correlated well with choices that chemists currently make, consciously or intuitively, based on parameters such as QED (quantitative estimate of drug-likeness), ease of synthesis and solubility.
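To give a feel for how pairwise choices can be turned into a score, here is a minimal sketch of preference learning with a linear Bradley-Terry-style model. It is an illustration only and not the model from the paper: the function names, the descriptor matrix and the linear form of the score are all assumptions made for this example.

```python
import numpy as np


def train_pairwise_scorer(features, pairs, lr=0.1, epochs=500):
    """Fit a linear score s(x) = w . x from pairwise choices.

    features: (n_molecules, n_features) array of hypothetical descriptors.
    pairs: sequence of (preferred_index, rejected_index) tuples, one per choice.
    """
    pairs = np.asarray(pairs)
    # Descriptor difference between the chosen and the rejected molecule.
    diff = features[pairs[:, 0]] - features[pairs[:, 1]]
    w = np.zeros(features.shape[1])
    for _ in range(epochs):
        # Modelled probability that the chosen molecule wins the comparison.
        p = 1.0 / (1.0 + np.exp(-(diff @ w)))
        # Gradient ascent on the mean log-likelihood of the observed choices.
        w += lr * (diff.T @ (1.0 - p)) / len(diff)
    return w


def score(features, w):
    """Return the learned score for each molecule (higher = more preferred)."""
    return features @ w
```

Once trained on the recorded choices, such a scorer can rank molecules that were never shown to the chemists, which is the role the article describes for the implicit scoring model.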

Vague parameters  

According to the Novartis-Microsoft team, their scoring model thus provides a meaningful addition to the manual scoring of drug candidates and can make the lead optimisation process much more efficient. ‘This is a very nice and useful paper,’ says Gerard van Westen, professor of artificial intelligence and drug discovery at Leiden University. ‘They show that the sought-after chemical intuition can actually be captured and that there is a correlation with old, fuzzy parameters like QED. The central question in drug development is what makes a molecule a drug. This tool can certainly help to answer that question.’

According to Van Westen, these results also illustrate the increasing power of machine learning methods. ‘Back in 2006, we tried to reproduce the way chemists score molecules in a computational model, but it didn’t work out. By now, the techniques are apparently good enough to achieve this.’ 

He also appreciates the fact that the new tool has been made openly available. Quite extraordinary, given that a pharma giant like Novartis is one of the developers. ‘Well, not really,’ Van Westen responds. ‘Of all the names in Big Pharma, Novartis is one of the most approachable companies.’ Now that it is available to everyone, wouldn’t it be very interesting to see if the model also captures the intuition of, say, chemists at other pharma companies? Van Westen: ‘I actually don’t think that would make much difference. I would rather see a comparison with the choices made by academic chemists. I’m really curious to learn how that would turn out.’

Oh-Hyeon Choung, et al., Extracting medicinal chemistry intuition via preference machine learning, Nature Communications (2023)