Using Neural Networks to Discover Antibiotics

Antibiotic resistance is one of the greatest challenges of modern medicine. More than a hundred thousand people die every year because doctors cannot treat their bacterial infections. There is, however, an unexpected ally in this fight that could help overcome bacterial resistance to existing drugs: neural networks. Scientists from the Massachusetts Institute of Technology have demonstrated that a well-trained neural network can identify new antibiotics among millions of candidate molecules.

Why many antibiotics are becoming ineffective

There is a simple rule — the less frequently you take antibiotics, the less likely the bacterial communities are to adapt to them. This applies both to individuals and to entire continents. Unfortunately, when most of the bacteria on the planet develop resistance to a substance, the bacteria that made you ill are likely to have this resistance too, even if you have never taken any antibiotics.

Solving the problem of antibiotic resistance requires many simultaneous efforts: reducing antibiotic use in agriculture, controlling antibiotic sales, and monitoring resistant nosocomial (hospital-acquired) strains. But even all these efforts combined will not be very effective unless new substances are discovered, and this task is becoming more difficult every day.

It is well known that the first antibiotic, penicillin, was discovered by accident — a mold that produces penicillin got into a petri dish containing a bacterial culture. Nevertheless, Alexander Fleming, the author of this discovery, won the Nobel prize, and hundreds of thousands of people were saved from a number of infections (at least for a while).

But today, discovering new medicines is a demanding and costly task: all the low-hanging fruit has already been picked, and scientists spend more and more effort only to rediscover substances that are already known. Researchers are therefore turning to new computational methods of analysis to streamline the discovery of new antibiotics and to ease the endless “manual” search through different substances.

How neural networks solve this problem

Interestingly, in complex studies of genomes or other biological data, researchers often need not only the neural network’s predictions but also an after-the-fact understanding of what it actually learned. For instance, a neural network can find a pattern in the interaction of particular proteins with particular segments of DNA and learn to predict which new proteins will have similar properties. However, scientists will still need to work out what exactly the discovered pattern is, because a neural network does not learn the way people do: it follows its own logic and reaches its conclusions by a different route.

The effective use of neural networks in biology and medicine is still in its infancy. In a new article published in Cell, a group of MIT researchers led by James Collins reported that they had successfully screened millions of antibiotic candidates using deep learning (a family of machine learning methods built on multilayer neural networks).

During training, the neural network learned to spot potential antibiotics among 2,335 molecules whose effect on the model bacterium, Escherichia coli, was well known. The chemical structure of each molecule was encoded as a set of numbers describing the connections between its atoms. The task of the neural network was to detect the motifs in these structures that are responsible for antimicrobial activity.
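To make the idea of turning a molecular structure into numbers more concrete, here is a minimal Python sketch. The article does not say which encoding the study actually used, so the adjacency-matrix representation, the function name `encode_bonds`, and the ethanol example are all assumptions chosen for simplicity.

```python
def encode_bonds(n_atoms, bonds):
    """Encode a molecule as a symmetric matrix of bond orders.

    bonds is a list of (atom_index, atom_index, bond_order) tuples.
    This toy encoding is an assumption, not the one used in the study.
    """
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j, order in bonds:
        matrix[i][j] = order
        matrix[j][i] = order  # a bond connects both atoms equally
    return matrix


# Ethanol without hydrogens: C-C-O, two single bonds.
atoms = ["C", "C", "O"]
bonds = [(0, 1, 1), (1, 2, 1)]
adjacency = encode_bonds(len(atoms), bonds)

for symbol, row in zip(atoms, adjacency):
    print(symbol, row)
```

A model trained on many such encodings, each labeled by whether the molecule inhibits bacterial growth, can then look for recurring substructures associated with activity.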

Once the system had learned to predict the properties of a substance from the shape and composition of its molecule, it was applied to several much larger electronic chemical libraries. Together these libraries contained more than a hundred million molecules, and the overwhelming majority of them had never been studied for their effect on bacterial cells.

And what is the result?

Scientists were able to find at least one potent substance whose antibacterial properties had not previously been known. They named the compound “halicin”; it is an understudied kinase inhibitor that had not previously been used as an antibiotic. Laboratory experiments showed that halicin does inhibit the growth of a wide range of bacteria, including strains that are resistant to most modern antibiotics. It works through a mechanism not previously exploited by antibiotics: halicin inhibits proton pump activity by reducing the sensitivity of bacterial membranes to changes in pH. Because the proton pump is essential to the bacterial cell, losing its activity is incompatible with the cell’s survival.

In addition to halicin, the neural network predicted at least eight new compounds that could have antibacterial properties. These compounds, too, appear to act through mechanisms that existing antibacterial agents do not use, and at least two of them performed well in laboratory tests. The scientists note that although these results look impressive, further studies of this type will still face many difficulties.

Why it is not so simple

Neural networks need good training samples, meaning large and detailed ones. However, the available experimental data and the knowledge of the structure and mechanisms of action of various substances are still limited. In addition, it is impossible to say in advance which aspects will prove important. Testing everything is too expensive, and the neural network itself is often too optimistic: when it screens a million candidates, it may flag hundreds of them as promising.
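A rough base-rate calculation shows why a shortlist of hundreds can still contain very few real antibiotics. The Python sketch below is only an illustration: the function `expected_shortlist` is hypothetical, and the number of true hits, the sensitivity, and the false-positive rate are invented values, not figures from the study.

```python
def expected_shortlist(n_candidates, n_true_hits, sensitivity, false_positive_rate):
    """Return the expected shortlist size and the share of real hits in it."""
    true_positives = n_true_hits * sensitivity
    false_positives = (n_candidates - n_true_hits) * false_positive_rate
    flagged = true_positives + false_positives
    precision = true_positives / flagged if flagged else 0.0
    return flagged, precision


# All four numbers are hypothetical and are NOT taken from the study.
flagged, precision = expected_shortlist(
    n_candidates=1_000_000,
    n_true_hits=10,
    sensitivity=0.9,
    false_positive_rate=0.0005,
)
print(f"flagged: {flagged:.0f}, share of real hits: {precision:.1%}")
```

Under these assumed numbers, even a model that wrongly flags only one inactive molecule in two thousand produces a shortlist dominated by false leads, each of which would still have to be tested in the laboratory.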

Six months ago, Canadian scientists taught a neural network to predict whether the antibiotic prescribed to a patient with a urinary tract infection would actually help. They trained the network on nine years of patients’ clinical data and then checked whether it would correctly predict what happened in the tenth year. It turned out that physicians had prescribed an ineffective antibiotic in 8.5% of cases. Had they chosen the medicine at random, they would have been wrong almost as often, in 10% of cases. The neural network recommended an ineffective antibiotic in only 5% of cases, which is an improvement, although not a dramatic one. It should be noted, however, that this study was still quite empirical. The nature of the clinical history data, the patients’ responses to particular antibiotics, and the specifics of particular populations all impose serious limitations, so the advice of a neural network cannot be followed blindly in medical practice without additional checks.
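The evaluation design described here, training on earlier years and testing on the final one, and the comparison of error rates can be sketched in Python as follows. The record layout and the function names `temporal_split` and `error_reduction` are assumptions made for illustration; only the 8.5% and 5% error rates come from the text.

```python
def temporal_split(records, test_year):
    """Train on every record before test_year, evaluate on test_year only."""
    train = [r for r in records if r["year"] < test_year]
    test = [r for r in records if r["year"] == test_year]
    return train, test


def error_reduction(baseline_rate, model_rate):
    """Return the absolute and relative reduction in error rate."""
    absolute = baseline_rate - model_rate
    relative = absolute / baseline_rate
    return absolute, relative


# Hypothetical placeholder records: one entry per year, years 1 to 10.
records = [{"year": year, "patient_id": year} for year in range(1, 11)]
train, test = temporal_split(records, test_year=10)

# Error rates quoted in the article, in percent.
absolute, relative = error_reduction(baseline_rate=8.5, model_rate=5.0)
print(f"{absolute:.1f} percentage points, {relative:.0%} relative reduction")
```

Splitting by time rather than at random keeps the test honest: the model never sees data from the year it is asked to predict. With the figures quoted above, the drop from 8.5% to 5% amounts to 3.5 percentage points, or roughly 41% fewer ineffective prescriptions in relative terms.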

To narrow down the number of candidate antibiotics, the authors of the study published in Cell looked at how similar the candidate molecules were to known antibiotics (even though, as mentioned above, their mechanism of action could be quite different) and also introduced further restrictions, for example on a substance’s potential toxicity to humans. But the more such refinements and rules are introduced, the more the process looks like a rollback to manual data processing, whereas scientists want the computer to do everything on its own.
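A filtering step of this kind can be sketched in Python. Tanimoto similarity is a common way to compare molecules represented as sets of structural features, but the article does not say which measure or cutoffs the authors used, so the function `shortlist`, the feature sets, the scores, and the thresholds below are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def shortlist(candidates, known_antibiotics, min_score, max_toxicity):
    """Drop low-scoring or toxic candidates and report, for each survivor,
    its similarity to the closest known antibiotic."""
    kept = []
    for c in candidates:
        if c["score"] < min_score or c["toxicity"] > max_toxicity:
            continue
        nearest = max(tanimoto(c["features"], k) for k in known_antibiotics)
        kept.append((c["name"], nearest))
    return kept


# Hypothetical data: integers stand in for structural features.
known_antibiotics = [{1, 2, 3, 4}]
candidates = [
    {"name": "A", "features": {1, 2, 3, 5}, "score": 0.9, "toxicity": 0.1},
    {"name": "B", "features": {7, 8}, "score": 0.95, "toxicity": 0.8},  # too toxic
    {"name": "C", "features": {1, 2}, "score": 0.2, "toxicity": 0.1},   # low score
    {"name": "D", "features": {4, 9}, "score": 0.8, "toxicity": 0.2},
]
result = shortlist(candidates, known_antibiotics, min_score=0.5, max_toxicity=0.5)
print(result)
```

Whether a high or a low similarity to known drugs counts in a candidate's favor is a design decision left to the researcher, and each such hand-written rule is exactly the kind of manual intervention the paragraph above describes.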

Nevertheless, the researchers are optimistic. They believe that better methods and training samples (for example, samples divided more clearly into groups according to their effect on bacteria and their interaction with different types of substances), together with new neural network algorithms, will make future projects increasingly fruitful.