When the Chief Medical Officer, Sally Davies, lowered the drinking guidelines for men last year, she cited a report from
the Sheffield Alcohol Research Group (SARG) as supporting evidence.
In October 2014, SARG had been commissioned by Public Health England to
help define a ‘safe’ level of alcohol consumption, having used their
computer model to predict the impact of minimum pricing on several
occasions in the past.
The SARG report was published on 8 January 2016, the same
day as the Chief Medical Officer announced the new ‘limits’. Its authors
stressed that it was not their job to recommend specific limits, but
nevertheless concluded that ‘the implied weekly guidelines in this
report vary between 7 and 13 units per week for males and 13 and 15
units per week for females’. These figures were significantly lower than
the ‘safe level’ implied by epidemiological evidence,
but they were consistent with the new advice which lowered the male
guidelines from 21 units a week to 14 units a week (the female
guidelines remained at 14 units).
But there is another version of the SARG report tucked away on the Department of Health website that
few people have ever seen. Along with a series of e-mails released
under the Freedom of Information Act, it shines a light on the process
that led to the Chief Medical Officer telling the nation that there is no safe level of drinking.
A year before the guidelines were changed, a draft of the
SARG report was sent to Public Health England that was very different to
the final publication. For example, it contained a graph (see below)
based on SARG’s model showing the relationship between the amount of
alcohol consumed and the risk of alcohol-related death.
Reflecting the epidemiological evidence,
mortality risk is lower for light drinkers than for teetotallers but it
then rises. According to this graph, drinkers’ mortality risk rises to
that of a teetotaller at 17.6 units per week for women and 21.2 units
per week for men. On this analysis, a guideline of 21 units for men was
appropriate and a guideline of 14 units for women was slightly
over-cautious.
But that graph was never published. When the SARG report was
released in January 2016, the findings had been altered and the graph
now looked like this…
All of a sudden, the implied safe limit for men
was barely half of that shown in the original and was now lower than the
implied limit for women. The health benefits of moderate drinking,
which were downplayed in the original, had almost disappeared for men
and only applied at very low levels for women.
Without this volte-face from the Sheffield team, the Chief
Medical Officer would have found it difficult to justify changing the
guidelines. As reported in the Sunday Times yesterday,
e-mails sent between SARG and government agencies strongly suggest that
this change was forced on the Sheffield team by Public Health England.
Now the full story can be told.
On 22 December 2014, the Sheffield team sent Public Health
England the first draft of their report. Several revisions were
suggested and a second draft was submitted on 14 January 2015. In an
e-mail that accompanied the second draft, the Sheffield team made it
clear that they did not expect to make any significant changes to their
findings. Explaining that some of the team had been off sick, the author
of the e-mail said that they would ‘like to go over the text again
before committing to a final public version’ but that ‘[w]e do not
expect to make any further changes to the numbers’. Although the team
had made substantial changes to the text of the report since the first
draft was reviewed, the basic conclusion had remained the same: a safe
level of alcohol consumption was ‘between 12 and 21 units per week for
males and 15 and 18 units per week for females’.
At this stage, it seems that SARG expected the Chief Medical
Officer to keep the guidelines at 14 units for women and 21 units for
men. These ‘limits’ had been criticised in the past, with Richard Smith,
the former editor of the British Medical Journal who had sat on the original guidelines panel, famously claiming that they were ‘plucked out of the air’.
The SARG report would give them some scientific credibility, even if
only from a theoretical model. As the Sheffield team stated on page 6 of
the document: ‘These implied guideline thresholds are generally similar
to those in the current UK lower drinking guidelines’.
Public Health England passed the draft report on to the
Guidelines Development Group (GDG) who were ultimately responsible for
formulating the government’s advice. On 21 January, the GDG held a meeting at which SARG’s John Holmes and Colin Angus presented their findings. The
minutes of this meeting contain the first mention of an idea that would
have a profound impact on the whole project. It was suggested that SARG
researchers should ‘estimate risk curves without threshold effects for
wholly alcohol-attributable chronic conditions’.
To grasp the significance of this, it must be understood
that researchers distinguish between diseases that are wholly caused by
alcohol and those for which alcohol is only one risk factor. Alcoholic
liver cirrhosis, for example, is a wholly alcohol-attributable chronic
condition. You cannot get it unless you are a heavy drinker and
every case of it is caused by drinking. Breast cancer, by contrast, is a
partially attributable chronic condition. Although drinking increases
the risk of breast cancer, a woman does not have to drink alcohol to get
breast cancer and there are many other risk factors. There are
relatively few chronic conditions that are 100 per cent caused by
alcohol (ten are listed by SARG), but they are responsible for a large
proportion of alcohol-related deaths.
It is generally accepted that there is a threshold above
which a person needs to drink to put themselves at risk of these
diseases. If you only have one drink a day, for example, you are at no
more risk of alcohol-induced pancreatitis than a teetotaller. You have
to drink above a certain threshold. This is not just common sense; it has been shown empirically.
Removing these thresholds from the Sheffield model, as the
GDG suggested, was bound to make moderate drinking look more dangerous
than it is. It would make it appear that there was no safe level of
alcohol consumption for several of the most serious alcohol-related
diseases. It would force the computer to assume that any amount of
drinking caused these diseases and, therefore, that some moderate
drinkers were dying of them.
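To see what is at stake, here is a minimal sketch in Python of the two kinds of risk curve. It is an illustration only: the function names, the threshold of 20 units a week and the slope are hypothetical values chosen for the example, not figures from the Sheffield model.

```python
def risk_with_threshold(units_per_week, threshold=20.0, slope=0.01):
    """Hypothetical excess risk: zero up to the threshold, linear above it."""
    return max(0.0, units_per_week - threshold) * slope


def risk_without_threshold(units_per_week, slope=0.01):
    """Hypothetical excess risk with the threshold removed: any drinking adds risk."""
    return units_per_week * slope


# Compare a light, a moderate and a heavy drinker under both assumptions.
for units in (7, 14, 30):
    print(units, risk_with_threshold(units), risk_without_threshold(units))
```

With the threshold in place, the drinker on 7 or 14 units is assigned no excess risk of the disease. With it removed, the same drinker is assigned some, however little they drink.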
Although there was no scientific justification for such a
change, Public Health England followed up the idea in an e-mail to SARG
on 9 February, asking the team if they were ‘able to deliver additional
work to support the Guidelines Development Group’. The deadline was 11
March when the Chief Medical Officers were due to review the evidence.
PHE had six amendments to the report in mind. Point 4 was ‘Threshold
effects – a sensitivity analyses [sic]’. Point 5 was ‘Threshold effects –
a new base case’.
A sensitivity analysis is basically an alternative scenario.
When your model is heavily dependent on certain assumptions – as the
Sheffield alcohol model is – it can be useful to see what happens to the
results when new assumptions are fed in. Sensitivity analyses show
scientists how sensitive their findings are to different scenarios.
Generally speaking, if the results do not change a great deal, the model
is considered robust.
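The idea can be sketched with a toy model in Python. Everything here is invented for the example (the formula, the parameter names and all the numbers) and none of it comes from SAPM. The toy assumes a fixed protective benefit for drinkers and a harm that rises in a straight line above a threshold, so the implied guideline is the level of drinking at which the harm cancels out the benefit.

```python
def implied_guideline(threshold, harm_slope, benefit):
    """Toy model: weekly units at which a drinker's net risk returns to a
    teetotaller's, i.e. where harm_slope * (units - threshold) == benefit."""
    return threshold + benefit / harm_slope


# Hypothetical base-case assumptions.
base = {"threshold": 10.0, "harm_slope": 0.02, "benefit": 0.1}

# Each sensitivity analysis changes one assumption and reruns the model.
scenarios = {
    "base case": base,
    "no threshold": {**base, "threshold": 0.0},
    "smaller benefit": {**base, "benefit": 0.05},
}

results = {name: implied_guideline(**params) for name, params in scenarios.items()}
spread = max(results.values()) - min(results.values())

for name, units in results.items():
    print(f"{name}: {units:.1f} units per week")
print(f"spread across scenarios: {spread:.1f} units per week")
```

A small spread would suggest that the result does not depend much on the assumption being tested. A large one means that the choice of assumption is doing most of the work.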
Public Health England wanted SARG to do a sensitivity
analysis with threshold effects removed. Although there was no obvious
reason to model such an unrealistic scenario, it could, arguably, be
excusable if done as an academic exercise tucked away in the appendix of
the report. But point 5 was more serious. PHE were suggesting that SARG
remove the assumption of a threshold from their core model (the base
case) so that this unrealistic scenario dictated the study’s main
findings.
This idea evidently concerned the Sheffield team. Writing
back the following day, they said that ‘the first four items on the list
are not a problem and can be done before 11th March for £7,800
including VAT’. But with regard to Point 5 they were ‘unclear exactly
what was being requested here and why it was requested in addition to
item 4 (a sensitivity analysis on threshold effects).’
Could PHE really be asking them to rip up their model and
begin again from a patently false premise? Initially, SARG stood their
ground, saying: ‘Our view remains that it does not seem right to assign
people drinking at very low levels a risk of acquiring alcoholic liver
disease and similar conditions. Unless there are strong opposing views,
we think it better to keep the threshold in the base case.’
In an attempt to meet PHE halfway, they instead proposed to do the following work:
‘Base case: Threshold effect for wholly attributable chronic [diseases] only
Sensitivity analysis 1: Threshold effect for wholly attributable chronic [diseases] and for all acute conditions.
Sensitivity analysis 2: No threshold effects for any condition’
Although SARG were clearly unhappy with PHE’s proposal, the agency was
their sole funder for this research and the team let them know that
they would capitulate if PHE were insistent, writing: ‘If you remain
keen for us to change the base case, please let us know and I can
quickly update the costs and timing. As noted previously, this carries
some extra costs as changing the base case means updating the whole
report.’
This was all the encouragement PHE needed. At 10.40pm that
evening, they e-mailed back to say:
‘Thank you for your swift response. Could you provide costs/timing for changing the base case please?’
The following day, SARG replied with some quotes for the new
work but were clearly still keen to dissuade Public Health England from
changing the base case. Their e-mail to PHE gave the agency one last
chance to change its mind. It reads, in full: ‘Please see attached a
revised costing. As creating a new base case removes the need for a
further sensitivity analysis on threshold effects, I have presented two
options – a new base case OR a new sensitivity analysis.’
But it was obvious that PHE were not interested in a mere
sensitivity analysis. They wanted to change the headline findings and
one of their employees wrote back on 13 February to announce that ‘I
have now secured PHE funding to proceed with option 2’. By 19 February,
SARG had received a letter of intent from PHE and had started work on
the new model.
Over the next few weeks, the Sheffield team complained about
problems that emerged from their attempts to adjust the model to fit
the new assumptions. They were doing something that they had never done
before. It is unlikely that they had even contemplated doing it before.
Moderate drinkers do not develop diseases such as alcoholic liver
cirrhosis and they knew it. As a result of dealing with a ‘problem in
adapting the pre-existing model to undertake the drinking guidelines
analysis’, the 11 March deadline was missed and it was not until 25
March that SARG could provide PHE with an update. Assuring their funder
that they were ‘now satisfied that we have identified and fixed the
problems with the model’ SARG wrote: ‘The headline message from the new
base case analysis is removing all of the threshold effects and
remedying the problems with the model leads to implied guideline
thresholds which are around 30% – 50% lower than those in the previous
base case.’
For male drinkers, some of the thresholds had nearly halved:
‘For example, under the Canadian approach [to defining a ‘safe level’],
the implied daily guideline for males in the new base case vary between
1.2 and 2.4 units per day depending on number of drinking days per
week. In the old base case the equivalent figures were 2.3 to 4.5
units.’
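The size of the drop can be checked with a few lines of Python, using only the four daily figures quoted above. The variable names are my own.

```python
# Implied daily guidelines for males under the Canadian approach (units per day),
# as quoted in SARG's e-mail: (lower end, upper end).
old_base_case = (2.3, 4.5)
new_base_case = (1.2, 2.4)

# Percentage reduction at each end of the range, rounded to one decimal place.
reductions = [
    round(100 * (old - new) / old, 1)
    for old, new in zip(old_base_case, new_base_case)
]
print(reductions)
```

Both ends of the range fall by a little under half, which puts them at the top of the ‘30% – 50%’ band that SARG described.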
One suspects that this was music to the ears of PHE and the
GDG. The new figures not only seemed to justify the government’s
existing recommendations for women but could be used as a reason to
lower the guidelines for men. Nobody in the guidelines committee was
likely to object to such a change. As I have previously described, the committee was packed to the gunwales with temperance campaigners.
In so far as there was dissent, it came from the Sheffield
team, but they were careful to voice their misgivings in a low key. In a
meeting held on 8 April, SARG’s John Holmes presented the revised model
to the GDG. According to the minutes, he pointed out that the new,
linear risk curves were ‘not precisely consistent with the literature’,
thereby producing the peculiar result that ‘lower risk guideline levels
for women were now higher than for men’. The minutes note that: ‘John
Holmes felt that the overall message from the different analyses was
that the new base case should not be taken as definitive.’ He also
argued that ‘it would be possible for [the new] guidelines to be little
different from the current ones’.
A briefing note from SARG explaining the revised findings
gently attempted to nudge the GDG away from the new base case. After
stating that the effect of removing threshold effects had been to
‘markedly lower the implied guideline thresholds’, SARG invited the
committee to take a middle path between the original model and the one
that PHE had forced upon them. Moreover, they actively discouraged the
GDG from basing the guidelines on their new research. ‘The true risk
function is likely to lie somewhere between these two scenarios’, they
wrote, ‘and highlights the residual need for expert judgement’. In case
they had not got their message across, they added: ‘There are not strong
reasons for preferring the base case over these alternative analyses
and this challenges the rationale for deriving guidelines directly from
the results of the base case.’
These hints fell on deaf ears at Public Health England and
among the guidelines committee. It appears that they had got the result
they wanted. When the SARG report was published in January 2016, most of
the text was identical to that of the original draft with only the
numbers changing – the opposite of what SARG had expected to see happen
when they submitted the draft a year earlier.
Reading the two documents side by side gives a glimpse of
what might have been if SARG had stuck to their guns.
For example, on
page 6 of the original report, the Sheffield researchers wrote:
These implied guideline thresholds are generally similar to those in the current UK lower drinking guidelines (assuming at least three drinking days per week) and are also similar to those selected in Canada and Australia.
In the same section of the final report (page 7) this has become…
These implied guideline thresholds for males are generally lower than those in the current UK lower risk drinking guidelines (assuming at least three drinking days per week) whereas for females they are similar to the current guidelines. The implied guidelines thresholds are also lower than those selected in Canada and Australia.
In the original they say:
Assuming drinkers consume alcohol at least three times per week, implied weekly guidelines in this report vary between 12 and 21 units per week for males and 15 and 18 units per week for females.
But thanks to the dropped thresholds, this is changed in the final version to:
Assuming drinkers consume alcohol between three and five times a week, the implied weekly guidelines in this report vary between 7 and 13 units per week for males and 13 and 15 units per week for females.
When discussing the sensitivity analyses, the original
report stressed how robust the model was, with different estimates being
within five units of each other, except in the highly unrealistic
scenario of there being no health benefits from moderate drinking. They
could no longer make such a claim in the final report because their main
alternative scenario (i.e. their original model) produced results for
men that were twice as large as those produced by the new base case.
This is the original text:
In most cases, the sensitivity analyses suggest the results are moderately sensitive to alternative assumptions. Under different analyses, implied guideline thresholds for mean weekly consumption vary by up to five units per week. A key exception is the results for the sensitivity analysis where all evidence of protective effects was removed.
And this is the published version…
For most sensitivity analyses (e.g. modelling a ten year time period, assuming lower CVD mortality rates, varying the threshold within the Australian approach) the size of variation in implied guideline thresholds from the base case is of the order of three units per week. However, for other sensitivity analyses (e.g. reintroducing threshold effects used in previous versions of SAPM, assuming no cardioprotective effects from moderate alcohol consumption) the variation in results from the base case are larger and of the order of ten units per week.
At this point in the text, SARG could not resist making the
point that they had made repeatedly to Public Health England and the GDG
in private:
These results suggest the base case should not be accepted uncritically as the implied guideline thresholds are sensitive to alternative assumptions and baseline data and there are not strong arguments for preferring the base case specifications over those used in the sensitivity analyses.
This was a brave statement to make in such an influential
public document, although its significance was not noticed by the media
at the time. The original draft had included a similar statement about
there being no strong arguments for preferring the base case over the
sensitivity analyses but that was when the model was only ‘moderately
sensitive to alternative assumptions’ and the implied guidelines varied
‘by up to five units a week’. Now there was a huge difference between
the base case and the alternative case presented in the sensitivity
analysis. If the real figure was twice as high as the base case
suggested – and there were ‘no strong arguments’ to think it wasn’t –
the Chief Medical Officer might as well pick numbers at random.
At various stages in the report, the reader gets the
impression that the authors are trying to distance themselves from their
work. On more than one occasion, they stress that they would not
normally program the Sheffield Alcohol Policy Model (SAPM) in the way
they had. On page 28, they explain that ‘threshold effects normally
included within SAPM were also removed for wholly-attributable acute and
chronic conditions’, and on page 32 they add a new section, saying:
‘…previous analyses using SAPM have included threshold effects within risk functions for acute conditions and wholly-attributable chronic conditions such that risk only begins to increase above a pre-specified consumption level. At the request of the commissioners (Public Health England), this threshold effect was removed for the base case analysis’.
In another new section on page 55, they almost seem to be winking at the reader:
Although the implied guidelines thresholds presented here are lower than those within previous studies, they remain of the same order of magnitude and different assumptions examined within the sensitivity analyses, particularly the reinstatement of threshold effects used in previous versions of SAPM, bring the implied guideline thresholds close to those found elsewhere. The methodological differences described should not be overlooked as these may, in part, be responsible for both the lower estimates presented here and the general similarity of findings in terms of order of magnitude.
When reading the published report and considering the new
guidelines, it should be remembered that they are built on an assumption
that the Sheffield team told PHE ‘does not seem right’. When PHE
commissioned SARG to produce the report (from a shortlist of one; nobody
else applied) they were buying access to the computer model. Whatever
the merits and flaws of that model, it had been used in alcohol research
since 2008. But when it failed to produce the results needed to justify
a change in the guidelines, PHE told SARG to program it in a way that
it had never been programmed before, using an assumption that had no
scientific basis and about which the Sheffield team had obvious,
well-founded reservations.
These facts, which have only come to light as a result of
Freedom of Information requests, give the lie to the idea that research
funded by government is necessarily more neutral or ‘independent’ than
research funded by other means. SARG were clearly not allowed to use
their own judgement. Instead, Public Health England and the guidelines
group leant on the Sheffield team to get a report that was more to their
liking than the document they were originally presented with.
The more that we learn about the process that generated the
new guidelines, the more questions are raised about Public Health
England. Far from being an honest broker in this story, the agency seems
to have acted more like an activist group working towards a particular
conclusion. Its relationship with the anti-drink lobby,
which extends to holding its Alcohol Leadership Board meetings at the
offices of a temperance group, is worryingly cosy for a state agency.
Its decision to appoint leading anti-alcohol campaigners such as Ian
Gilmore and Katherine Brown (both of the Alcohol Health Alliance) to the
guidelines committee shows that it has become politicised.
This bias was on display again at the start of this year when Public Health England published an error-strewn policy document, which
it released to the media with a headline claim that was so incorrect
that it had to be retracted. That report was put together by the same
familiar faces who dominated the guidelines review process. The revision
of those guidelines may seem a relatively minor achievement for the
anti-drink lobby. You can ignore them, after all. But, as the minutes of
one GDG meeting say, it is ‘important to bear in mind that, while
guidelines might have limited influence on behaviour, they could be
influential as a basis for Government policies’. That is why the
guidelines are important and, I would suggest, it is why Public Health
England went to such lengths to change them.