With an antimicrobial resistance crisis looming on the public health horizon and a generation of antibiotic treatments falling victim to drug-resistant bacteria, the search for new antibiotics and antimicrobial agents is more important than ever. The natural world remains a supreme source of products that could drive the next wave of antibiotics research, with bacterial metabolites proving to be a particularly rich wellspring of clinically promising compounds.
Genome mining techniques allow researchers to sequence bacterial genomes for in-depth study, hunting for clinically relevant gene clusters responsible for encoding the secondary metabolites that don’t influence a micro-organism’s growth, development or reproduction but often play a role in defending against bacterial predators.
This genome mining process is labour-intensive, and because of its place in the early stages of the drug discovery process, has increasingly proven too high-risk for significant investment from the biotech and pharma sectors. Bacterial genome screening, then, is primarily driven by smaller groups of academic researchers.
Nevertheless, data analytics technologies are helping to speed up the search for metabolites, none more so than antiSMASH, an open-source software suite that allows the rapid genome-wide identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genomes. It integrates and cross-links with a large number of previously published computational tools for secondary metabolite analysis.
Initially developed by Kai Blin and Marnix Medema in 2010 when they were both PhD students in Germany and the Netherlands respectively, antiSMASH has now become ubiquitous among natural product researchers doing bacterial and fungal genome mining, with more than 418,000 jobs processed to date and contributions coming from research groups all over the world. More recently, the team behind the software launched the antiSMASH database, helping researchers to save time by allowing them to search for pre-calculated antiSMASH results from nearly 25,000 full and draft genomes.
Here, antiSMASH co-creator Kai Blin, who now works at the Technical University of Denmark’s Novo Nordisk Foundation Center for Biosustainability as a researcher and scientific software engineer, elaborates on the benefits of antiSMASH and its database, and the advantages of an open-source approach to scientific software design.
Chris Lo: What challenges are faced by researchers looking for bacterial metabolites to drive drug discovery?
Kai Blin: There are a lot of natural products produced by micro-organisms that are used in clinics. So it’s really attractive to go and look for more compounds there. But unfortunately after a golden age of antibiotic discovery from bacteria in the late 60s and early 70s, the rediscovery rate has gone up. We still keep finding compounds that can kill bacteria, but if you continue the analysis, you start to see that these are compounds that have been isolated before, and for whatever reason haven’t made it to a drug state. Sometimes researchers even keep rediscovering compounds that are already approved drugs.
I guess in the early 90s, pharmaceutical companies started to spin down their efforts on natural product isolation from microbes quite a bit, because of that challenge of the high rediscovery rate, and also because they were really jumping on to this train of drug design and high-throughput screening of existing compounds, this really target-based design that was really hip in the 90s.
When the Human Genome Project started winding down, the technologies evolving during sequencing for that gave a big push to the sequencing industry overall. Genome sequencing of bacteria became much cheaper and more feasible at that point. That gave rise to this whole idea that you can do genome mining, where you sequence bacteria, look at the genome and maybe find some gene clusters that look interesting, and then as a next step try to figure out how to get the compound that is hidden in there. That’s where antiSMASH comes in.
CL: When was the antiSMASH tool first introduced?
KB: This was a collaboration between two universities [Tübingen and Groningen]. Both of our groups had some previous pipelines that covered part of the feature set of antiSMASH, and we decided instead of trying to publish two competing tools that both only showed part of the picture, to join forces and come up with something that covers pretty much everything that is out there.
The idea is to collect pretty much everything we know, from the science perspective, about secondary metabolites, and we basically built this into an easy-to-use tool that has an easy web interface to submit jobs and a nice HTML output that is usable by biologists. So you don’t have to be a bioinformatician to use it.
CL: How has antiSMASH evolved since it launched?
KB: The nice thing about this multi-disciplinary group that we have here is how close we are to each other. I’m a bioinformatician, so I take care of all the software development. But I share an office with the users of my software, the wet lab biologists I built the tool for. So that really allows us to get some really tight feedback cycles in terms of what new features we should work on. Those tend to be the things that are pain points while working in the lab and trying to tackle a specific challenge.
Over the last couple of years, antiSMASH has established itself as the gold-standard tool that everybody in the field uses as their baseline analysis. I don’t think you will find a lot of people these days who work in the natural product field with microbes and haven’t used it yet.
We’re getting ready to release version 5 of antiSMASH. Our ideas about new features tend to be driven both by what we encounter in-house as things that are missing, and by publications where people are saying this is how this type of secondary metabolite cluster works in detail. In that case we’ll obviously go and look at it and ask if we can turn it into a more detailed prediction module for antiSMASH. Also, over the last two or three years, people have started approaching us, saying, ‘Hey, we have this really nice dataset on this type of cluster, how do we go about adding this to antiSMASH?’ So people have been approaching us with ideas for new features, while already delivering the datasets that we need. So that’s pretty nice.
CL: With relatively little investment from the pharma industry going into antibiotics R&D, do you think the burden of new discoveries falls mostly on academic research teams?
KB: I think that’s what it looks like at the moment, especially seeing how even some of the smaller companies that were still working in the field are folding up or pivoting to other projects. It’s a bit of a difficult model, because at the end of the day, at this stage you’re really early in the game, so there’s a high risk involved that you’re not going to find anything relevant. So the big pharmaceutical companies that have been playing that game since the 70s and haven’t found a lot of things that were worth the money decided to get out of it. [Academic] researchers have kept at it because there’s less of a ‘we need to make money’ perspective in academia. So yes, at the moment the burden is a bit in that direction, but I think with the tools moving the way they are currently, and also with sequencing getting cheaper and cheaper again and long-read sequencing becoming viable and useful, maybe in the future this might actually become more commercially viable again. I think it will always, with the way things work these days, remain something for small startup-scale companies, where you can shoulder the risk but also move fast, and then if you get somewhere that looks reasonable and you’re ready to go into clinical trials, at that point try to partner with pharma to take it further.