Machine learning for chromatography


It will come as no surprise to anyone who has developed methods for chromatographic separations: separating (complex) mixtures is a lot of work. But with the advent of machine learning, a group at KU Leuven is speeding up the process. ‘People often get stuck in their own methods.’

Chromatography is an essential technique used in many laboratories to separate mixtures of compounds. However, optimising a separation often involves a lot of trial and error, which is a poor use of time and resources. ‘Everyone involved in chromatography looks at method development, usually based on their own experience,’ says Deirdre Cabooter, Professor of Pharmaceutical Analysis at KU Leuven. ‘We wanted to tackle method development in a more efficient way by using machine learning.’

Along came Alexander Kensert, who did his PhD as a machine learning expert on this project and is currently a postdoc in Cabooter’s group. ‘When I came to Deirdre’s lab, I was literally looking for problems’, Kensert says. ‘Usually it’s the other way around; there’s a problem that needs solving and you try to find a way to solve it. But I already had the solution and needed a problem to solve.’

Kensert has mainly been working on three machine learning projects: signal processing (using convolutional neural networks), retention time modelling (using graph neural networks), and method optimisation (using deep reinforcement learning). ‘We want to be able to remove noise from complex chromatograms to better locate the peaks. It would also be useful to find correlations between the structure of a molecule and its retention time. Finally, we want to intelligently and automatically select the right chromatographic parameters for optimal separation.’

Artificial

However, it’s not as easy as it sounds. ‘Currently, you need a lot of data for the machine learning algorithms to learn efficiently’, explains Cabooter. ‘But it’s very expensive and time-consuming to generate a lot of data from real samples and measurements.’ Instead, the researchers generate artificial chromatograms that reflect reality to train the algorithms. Kensert says: ‘You can’t have learning without good examples, so we had to make simulators that generate these chromatograms to train on.’ But Cabooter stresses that real examples are also used. ‘Everyone in our group does chromatography, so we can use that data and test the algorithm at the same time’, she says. ‘This is a quick way to improve it.’
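To make the idea of an artificial chromatogram concrete, here is a minimal sketch of what such a simulator might look like. It is not the group's actual simulator: the function name `simulate_chromatogram`, the Gaussian peak shape, the white detector noise and all parameter ranges are assumptions chosen for illustration only.

```python
import math
import random


def simulate_chromatogram(n_peaks=5, n_points=1000, duration=10.0,
                          noise_sd=0.02, seed=0):
    """Return (times, clean, noisy, peak_times) for a toy chromatogram.

    Assumption: every peak is a Gaussian and the detector noise is white
    Gaussian noise. Real peaks often tail and real baselines drift.
    """
    rng = random.Random(seed)
    # Evenly spaced time axis from 0 to `duration` (arbitrary time units).
    times = [duration * i / (n_points - 1) for i in range(n_points)]
    # Hypothetical peak parameters: retention time, height, width (sigma).
    peaks = [(rng.uniform(1.0, duration - 1.0),
              rng.uniform(0.2, 1.0),
              rng.uniform(0.03, 0.10)) for _ in range(n_peaks)]
    # Clean signal: the sum of all Gaussian peaks at each time point.
    clean = [sum(h * math.exp(-0.5 * ((t - tr) / w) ** 2)
                 for tr, h, w in peaks)
             for t in times]
    # Noisy signal: what a model would receive as input during training.
    noisy = [y + rng.gauss(0.0, noise_sd) for y in clean]
    # Known peak positions: the labels a peak-finding model could train on.
    peak_times = sorted(tr for tr, _, _ in peaks)
    return times, clean, noisy, peak_times


times, clean, noisy, peak_times = simulate_chromatogram()
```

Because the simulator knows the clean signal and the true peak positions, each noisy trace comes with its own labels, which is what makes simulated data cheap to produce in large quantities compared with real measurements.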

‘We aim to remove noise from complex chromatograms’

‘The biggest challenge is definitely getting good quality data or examples to train the machine learning model to make useful predictions’, says Kensert. ‘We have come a long way with retention modelling, with evidence of both good performance and interpretability.’ He has also taken steps in the signal processing algorithms. ‘We need further validation and testing’, admits Kensert, ‘but we have successfully developed models and published the results in the Journal of Chromatography A.’

Seconds

At the moment, the machine learning algorithms are mainly focused on small molecules, which are relevant to the pharmaceutical field. ‘But the goal is to go beyond that’, says Cabooter. ‘We have recently started working with larger biomolecules, so we are trying to implement the algorithms for those as well.’ Ultimately, in Cabooter’s view, these algorithms should be of benefit to anyone involved in liquid chromatography. In every field of analysis, whether it’s food or environmental studies, chromatography methods are being developed. A lot of time is lost doing these analyses manually, so using machine learning could really help a lot of people. One of Cabooter’s students has already experienced the benefits of Kensert’s algorithms. Instead of spending weeks analysing data herself, she had the computer do the job in a matter of seconds.

So, on the one hand, the computer has to learn from all the data you feed it. On the other hand, you can learn from the machine. ‘Improving the interpretability of machine learning models is a hot topic right now’, says Kensert. ‘In one of our projects, we are trying to understand what part of the molecule the machine learning model is looking at when it makes predictions about retention time. This could give us further insight into which substructures might be important for retention to better understand chromatographic processes.’

The hope is to make a significant contribution to the field of chromatography, not only in terms of methodology, but also in terms of insight. Cabooter: ‘People often get stuck in their own methods and think that the way they have been doing method development is the best way to go. But chromatography still takes a lot of time, including manual tweaking, re-checking parameters and so on. It’s just not optimal. Hopefully, we will be able to learn from the computer what the optimal way should be.’