Animal studies represent one of the fundamental layers of modern clinical research. Preclinical testing of drug candidates on animal models – most commonly mice and dogs, but also pigs, sheep and others – is a vital step in validating their safety and efficacy profile to justify the transition to human clinical trials.
Given animal studies’ status as a preclinical keystone in drug development, without which the medical research pipeline would cave in, there is naturally a great deal of oversight governing the conduct of this research, especially when it comes to safety. Regulatory bodies such as the US Food and Drug Administration (FDA) and the European Medicines Agency (EMA) have strict requirements for assessing a comprehensive range of safety factors to minimise the risk to human participants in subsequent clinical trials.
“The nonclinical safety assessment for marketing approval of a pharmaceutical usually includes pharmacology studies, general toxicity studies, toxicokinetic and nonclinical pharmacokinetic studies, reproduction toxicity studies, genotoxicity studies and, for drugs that have special cause for concern or are intended for a long duration of use, an assessment of carcinogenic potential,” notes an FDA Guidance for Industry document.
While safety is the primary focus for preclinical study oversight, the picture is far less clear on efficacy. Given the high failure rates in drug development and the eye-watering costs involved in running clinical trials, it’s more important than ever that human trials are supported by robust, reproducible and well-reported preclinical efficacy data. Evidence suggests, however, that this may not be the case.
Preclinical efficacy: questions to answer
The latest alarm-bell over the reporting of animal efficacy data was rung by a team of researchers at Hannover Medical School in Germany and McGill University in Canada. For a systematic analysis, the researchers gained access to 109 investigator brochures – documents used by regulators and review boards to assess the potential efficacy of experimental drugs before they move to human trials – that were reviewed by three institutional review boards in Germany between 2010 and 2016.
The study’s findings, which were published in PLOS Biology on 5 April, detailed some concerning gaps in the reporting of preclinical animal studies. Of the more than 700 animal studies presented in the investigator brochures reviewed by the German medical boards, fewer than 5% referenced the use of randomisation, blinded outcome assessment or sample size calculations – all measures that are commonly used to validate efficacy results and minimise the risk of bias. This doesn’t necessarily mean that these measures weren’t employed, but if they were, their use wasn’t documented in the investigator brochures.
“Whether the studies were that bad, whether they really did not include all these methodological steps, we don’t know,” says Dr Daniel Strech, professor of bioethics at Hannover Medical School and senior author of the study. “But if you are one of those who have to make a risk-benefit assessment based on the animal studies, you currently more or less cannot do it.”
The lack of these measures is compounded by another of the study’s findings: only 11% of the animal studies investigated made reference to a publication in a peer-reviewed journal, meaning review boards and regulators assessing these brochures would also have lacked the validation of a separate review on data quality by independent academics. What’s more, all but 6% of the animal studies reported positive outcomes. While this isn’t initially surprising – after all, why would a clinical trial be proposed if the preclinical data was negative? – Strech emphasises the statistical oddity of having so little negative data, and stresses his concern.
“These animal studies had very low sample sizes,” he says. “On average we found that in one animal study there were about eight animals that they tested for the intervention, and eight animals in the control group. If you have these low sample sizes, just by chance, from time to time, you will have a negative study.
“The fact that they do not show up speaks a little bit in the direction that some selection of animal studies has taken place when they decided what studies to present in the investigator brochures. You need negative animal studies to better understand where the window of opportunity for your drug is. You need, for example, animal studies that demonstrate when the dosing scheme or the timing for the intervention is best, and when it becomes negative. If there’s nothing negative in the preclinical evidence, then you just lack any of these demarcations.”
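Strech's statistical point can be illustrated with a small simulation. The sketch below is not taken from the study: it assumes normally distributed outcomes with equal variance, a one-off comparison of eight treated against eight control animals, and a two-sided Student's t-test at the 5% level, counting a study as "positive" only when the result is significant in favour of the treatment. The effect sizes and the function name `share_positive` are illustrative choices.

```python
import random

# Two-sided 5% critical value of Student's t with 14 degrees of freedom
# (8 treated + 8 control animals - 2). Assumption: equal-variance t-test.
T_CRIT_DF14 = 2.145
N_PER_GROUP = 8  # average group size quoted in the article


def mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def share_positive(effect_size, n_sim=5000, seed=0):
    """Fraction of simulated studies that are significant in favour of treatment.

    effect_size is the assumed true benefit, in standard deviations.
    """
    rng = random.Random(seed)
    positives = 0
    for _ in range(n_sim):
        control = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
        treated = [rng.gauss(effect_size, 1.0) for _ in range(N_PER_GROUP)]
        pooled_var = (sample_variance(control) + sample_variance(treated)) / 2
        std_err = (2 * pooled_var / N_PER_GROUP) ** 0.5
        t_stat = (mean(treated) - mean(control)) / std_err
        if t_stat > T_CRIT_DF14:
            positives += 1
    return positives / n_sim


if __name__ == "__main__":
    # Hypothetical true effects, from "no effect" to "very large".
    for d in (0.0, 0.5, 1.0, 1.5):
        print(f"true effect {d:.1f} SD: {share_positive(d):.0%} of studies positive")
```

Under these assumptions, even a true effect as large as one standard deviation leaves a substantial share of simulated studies non-significant, which is why a body of evidence made up almost entirely of positive results from groups this small looks statistically unusual.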
Given the cross-border nature of multi-centre trials and the fact that many preclinical studies are bankrolled by pharma giants with global reach, Strech believes that this is an issue that is not limited to Germany, and is likely to be happening across regulatory systems. “The French, the German, the British regulatory agencies meet very often, so it would be strange if the data looked like this in Germany and the regulatory bodies accept this despite the fact that in other European countries they see a completely different picture.”
Lacklustre reporting: not a new issue
Strech acknowledges that although the study is the first systematic analysis of the data reporting in investigator brochures, concerns over slipshod reporting of preclinical animal data are not new. Several well-known studies have previously been published on the topic of preclinical efficacy data, including one by Amgen’s former head of haematology and oncology research C. Glenn Begley, whose lab attempted to reproduce the findings of 53 ‘landmark’ preclinical studies in cancer research. The researchers were able to confirm the findings for six out of the 53 studies.
“It was shocking,” Begley told Reuters in 2012. “These are the studies the pharmaceutical industry relies on to identify new targets for drug development. But if you’re going to place a $1m or $2m or $5m bet on an observation, you need to be sure it’s true. As we tried to reproduce these papers we became convinced you can’t take anything at face value.”
The issue of patchy reporting is prevalent enough that in 2010, the UK’s National Centre for the Replacement, Refinement and Reduction of Animals in Research introduced a voluntary set of guidelines called Animal Research: Reporting of In Vivo Experiments (ARRIVE), detailing methodological elements that should be included in the reporting of any animal research. More than 600 research journals had signed up to the ARRIVE guidelines by 2016, but subsequent studies have concluded that they have had a limited impact on the quality of reporting, and that researchers largely ignore them.
For Strech, the core question that lingers is: Why haven’t the relevant stakeholders made more of a fuss about these flaws in reporting? For regulators and review boards, he expresses surprise that “nobody has complained about this before”. A possible answer lies in the heavy emphasis on safety and toxicology reporting from preclinical studies, and the tendency for efficacy data to fall by the wayside.
“There was a paper published in Nature maybe a year ago now, from two of our co-authors, where they addressed this point and got some quotations from FDA officials, who said, more or less, they never looked at the preclinical efficacy data, they only do safety and toxicology assessments,” says Strech. “I heard this in personal conversations from many other stakeholders as well. I’m not aware of a very transparent and official statement about this. It would more or less mean that they do not do what they are legally obliged to do, because the legal requirement is for a risk-benefit assessment, not only a risk assessment.”
Giving animal studies a fair assessment
Perhaps even more puzzling is the lack of a strong response from the pharmaceutical industry. Given that the vast majority of preclinical animal studies are funded by industry and the data they generate informs their drug discovery and development programmes, this is high-stakes data that could send expensive clinical trials off the rails if reported poorly.
“[Pharma companies] themselves say that this is probably part of the high failure rate in clinical research,” Strech says. “It costs millions of euros to conduct studies that finally fail, so why do they build their clinical studies on this poor preclinical evidence? I don’t know. For financial reasons, it’s unclear why industry is accepting this and doing this.”
It may well be that there’s a lack of industry focus on efficacy data from preclinical studies because findings from animal models already have a poor reputation for predicting outcomes in humans. Perhaps drug developers feel that as long as a drug candidate has been demonstrated to be safe and well-tolerated before going into human trials, any efficacy findings should not, as Begley said, be taken at face value.
With regulators around the world espousing the ‘3Rs’ principles (replace, reduce, refine) for limiting the use of animals in medical research, and ‘organs-on-chips’ technology poised to potentially change the preclinical research paradigm entirely, perhaps the days of animal data as an efficacy predictor in humans are drawing to a close. Strech disagrees, however.
“How can you know if animal studies are more or less predictive for how drugs work in humans if you use these study designs?” he argues. “If you have low sample sizes and everything just happens by chance – no blinding of outcome assessment, no sample size calculations – we simply don’t know whether they are predictive or not. If we lived in a world where over the past ten years, we did really randomised, blinded animal studies, then we would be in a much better position to judge whether the failures and the success stories in clinical research were more or less predicted by the animal research. Currently I would say we probably need animal research, but we should really start to build bigger databases, registries for animal research, that then help us to understand how predictive animal studies really are.”