Wednesday, 25 September 2013

No evidence of a rise in problem gambling in Scotland

Estimates of rates of problem gambling in Britain come primarily from the three British Gambling Prevalence Surveys, published in 1999, 2007 and 2010. Each report used two slightly different screening methodologies and found the following:

In 1999, it was estimated that 0.6% of the adult population were problem gamblers under the DSM methodology, with a confidence interval of 0.4-0.8%. Under the alternative measure of the South Oaks Gambling Screen (SOGS), problem gambling prevalence was estimated to be slightly higher: 0.8% with a confidence interval of 0.6-1.0%.

The second British Gambling Prevalence Survey (2007) found similar rates: 0.6% under the DSM methodology (with a confidence interval of 0.5-0.8%) and 0.5% under the PGSI methodology (with a confidence interval of 0.4-0.8%).

The third and, as it transpired, final British Gambling Prevalence Survey in 2010 found a problem gambling prevalence of 0.9% under the DSM methodology with a confidence interval of 0.7-1.2%. Under the PGSI methodology, problem gambling prevalence was 0.7% with a confidence interval of 0.5-1.0%.

None of these figures differ greatly from the others. The confidence intervals are quite wide because problem gambling is relatively rare. The 2010 survey, for example, questioned 7,756 people but found only 64 problem gamblers. We must be wary of drawing firm conclusions on the basis of a handful of people and the authors of the British Gambling Prevalence Survey are right to refuse to pin problem gambling rates down to a tenth of one per cent. There is a 95% chance that the true figure falls somewhere within the confidence interval, but the researchers can be certain about little more than that.
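To get a feel for why the intervals are so wide, here is a rough sketch of a simple (unweighted) binomial confidence interval for the 2010 figures. The published intervals are wider than this because the survey uses complex sampling and weighting, which inflate the standard error:

```python
import math

# 2010 BGPS: 64 problem gamblers identified among 7,756 respondents
hits, n = 64, 7756
p = hits / n  # sample prevalence, about 0.83%

# Simple Wald 95% interval: p +/- 1.96 * sqrt(p(1-p)/n).
# This ignores the survey's design effect, so it understates the
# width of the published intervals somewhat.
half_width = 1.96 * math.sqrt(p * (1 - p) / n)
lo, hi = p - half_width, p + half_width

print(f"prevalence {p:.2%}, 95% CI {lo:.2%} to {hi:.2%}")
```

Even this optimistic calculation spans roughly 0.6% to 1.0%, so pinning prevalence to a tenth of a percentage point is not credible.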

Unfortunately, anti-gambling campaigners and the media are less respectful of statistical probity. Campaigners against fixed odds betting terminals have claimed that "problem gambling in the UK has increased by 50% in three years"—an assertion that is based on comparing the mid-point estimates from 2007 and 2010 under the DSM methodology. In fact, all of the figures reported in the three British Gambling Prevalence Surveys are consistent with Britain having a problem gambling prevalence of around 0.7%, which is low to middling by international standards.

Responsibility for collating problem gambling data has since been moved to public health departments in England, Scotland and Wales. The first of the new figures were published today in the Scottish Health Survey and they indicate that there has been no rise in problem gambling since 1999. Under both the DSM and PGSI methodologies, problem gambling prevalence in Scotland was found to be 0.7%, with similar confidence intervals of 0.5-1.1% and 0.5-1.2%.

The report notes that these figures are in line with previous surveys:

These 2012 estimates are similar to those observed for Scotland in the BGPS 2010, which estimated that 1.1% (DSM-IV) and 0.9% (PGSI) of adults in Scotland were problem gamblers. The confidence intervals around the BGPS estimates were large due to small base sizes for Scotland. The 95% confidence interval for the BGPS DSM-IV estimate was 0.4% - 2.8% and for the PGSI was 0.4% - 2.2%. This meant that we were 95% confident that the true estimate fell between these figures. The figures produced for the Scottish Health Survey in 2012 (0.7%) are well within this range and are not statistically different from the BGPS estimates.

If we used the campaigners' trick of looking only at the mid-point estimates, we could say that problem gambling has fallen by 40% in Scotland since 2010. That would be highly disingenuous, of course. We can only say that seven years after the last Labour government relaxed Britain's gambling laws—and twelve years after fixed odds betting terminals were introduced to betting shops—there is still no evidence of a rise in problem gambling prevalence.

Incidentally, the report also has smoking prevalence stats for Scotland. Figures for the last five years are as follows:

2008: 26%
2009: 25%
2010: 25%
2011: 23%
2012: 25% 

I'm assured by one of the people who compiled this report that the rise between 2011 and 2012 was not statistically significant (surely they should increase the sample size?). That being so, there has been no significant decline in the smoking rate for some years despite the most intense burst of anti-smoking activity ever. Great success!


Rory Morrison said...

It's not practical to increase the sample size of the Scottish Health Survey, as the various detailed modules (such as the DSM & PGSI gambling instruments, along with a range of biomeasurements) are very time-consuming to conduct, so to significantly increase the sample would be prohibitively costly.

Fortunately, you can get a more precise estimate of smoking prevalence in Scotland from this larger survey, which has about double the sample of the health survey...

2008: 25.2%
2009: 24.3%
2010: 24.2%
2011: 23.3%
2012: 22.9%

PJH said...

Chris, didn't the rate of smoking stop decreasing around the same time all the anti-smoker rhetoric started?

BrianB said...

Interesting that the two series of smoking rates show strong concordance until the latest year. Suggests to me that one of the two surveys may have issues, methodological or otherwise, but we will never know, I suppose. A weighted average of the two might be a better guide.

Chris, I think the "Campaigner's Trick" can also be described as "campaigners are thick" in that they totally fail to understand statistical uncertainty and the purpose of confidence ranges.

Even your own terminology of "mid-point estimate" is wrong (although I am guilty at times of using it myself). It is the 'mid-point' by design, ie the confidence range (CR) is mathematically calculated to be equidistant on both sides of the sample estimate (on a linear or log scale). Once the CR has been calculated, the sample estimate is irrelevant, ie the CR is the estimate, it is everything!

Whilst the explanation of a CR that "we were 95% confident that the true estimate fell between these figures" isn't mathematically true, it is close enough to give it a pass. Lay folk have enough trouble understanding how to interpret derived statistics at the best of times, without blowing their brains out trying to get them to understand the maths behind them as well!

But that's where the "campaigners" come unstuck. The "mid-point estimates" that they are comparing across time periods are completely meaningless - they only ever applied to the original samples, and they do not offer some kind of 'most likely' value for the whole population.

All values within the CR have equal chance of being the 'true' value, so it can only ever be valid to compare the two CRs. When you do this, you will end up with a range of possible % changes that is the product (multiplication) of the two original CRs - and will hence be enormously wide by comparison.

But, most of all, there is no way - given the CRs you are quoting - that there will be any statistically significant changes.

That, surely, is the important result.

Fredrik Eich said...

And despite all this, Scotland has not many fewer lung cancer deaths than Mexico, which, considering there are nine Mexicans for every Scot, I find very puzzling.

Jonathan Bagley said...

The sample size for the Scottish household survey quoted by Rory Morrison is around 10K. Half that size gives a 95% CI of about +/-1.2%: for 2012, 23.8% to 26.2%. So maybe a different question is asked in the two surveys? The one Chris quoted tends to give higher smoking prevalence.
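Jonathan's back-of-envelope figure is easy to check with the simple binomial formula (these sample sizes are his rough guesses, and the calculation ignores the design effects Rory mentions below):

```python
import math

def ci_half_width(p, n):
    """95% half-width of a simple binomial proportion CI."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

# Roughly 25% smoking prevalence on a sample of about 5,000
hw = ci_half_width(0.25, 5000)
print(f"+/- {hw:.1%}")  # about +/- 1.2%
```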

Rory Morrison said...

Jonathan, the CIs are probably a little bit wider than that as both surveys use complex sampling (clustering, stratification, survey weights) which increases standard error. But they do both ask the same question through the same method (in-person interviews).

On gambling, Brian, you say:

All values within the CR have equal chance of being the 'true' value, so it can only ever be valid to compare the two CRs. When you do this, you will end up with a range of possible % changes that is the product (multiplication) of the two original CRs - and will hence be enormously wide by comparison.

But, most of all, there is no way - given the CRs you are quoting - that there will be any statistically significant changes.

This is incorrect (or at least, the way you are expressing it is problematic).

You don't get a range of values for the difference in proportion by multiplying confidence intervals for each point estimate together, you get it from application of this formula.

Doing this very roughly with the 1999 and 2010 British Gambling Prevalence Surveys:

1999 data
prevalence of problem gambling: 0.6%
(approx 46 people of 7,680 surveyed)

2010 data
prevalence of problem gambling: 0.9%
(approx 70 people of 7,756 surveyed)

[does maths]

...therefore the 95% CI of the differences in percentage of problem gamblers between the 1999 and 2010 surveys is about: 0.03% to 0.58%. So it could be not very much at all, but it could also be nearly double the 1999 estimate.

Because the 95% CI of the difference in proportions doesn't contain zero (indicating no difference between years), this is equivalent to saying it's a statistically significant difference at the conventional 5% level (though obviously, only just).

Or in relative terms, the 'risk' of being a problem gambler in 2010 compared to 1999: RR 1.50 (95% CI 1.04 to 2.18).

(These figures are likely all a bit under-conservative, as I'm guessing the gambling survey is weighted, which requires some more complex adjustments to the algebra.)
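The "[does maths]" step can be reconstructed with the standard large-sample formulae for a difference in proportions and a risk ratio, using the approximate counts quoted above (and, as noted, ignoring survey weighting):

```python
import math

# Approximate counts from the two surveys, as given in the comment
x1, n1 = 46, 7680   # 1999 BGPS, ~0.6% prevalence
x2, n2 = 70, 7756   # 2010 BGPS, ~0.9% prevalence
p1, p2 = x1 / n1, x2 / n2

# 95% CI for the difference in proportions:
# (p2 - p1) +/- 1.96 * sqrt(p1(1-p1)/n1 + p2(1-p2)/n2)
se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
diff = p2 - p1
d_lo, d_hi = diff - 1.96 * se_diff, diff + 1.96 * se_diff
print(f"difference: {diff:.2%}, 95% CI {d_lo:.2%} to {d_hi:.2%}")

# Risk ratio, with its CI computed on the log scale
rr = p2 / p1
se_log_rr = math.sqrt(1/x1 - 1/n1 + 1/x2 - 1/n2)
rr_lo = math.exp(math.log(rr) - 1.96 * se_log_rr)
rr_hi = math.exp(math.log(rr) + 1.96 * se_log_rr)
print(f"RR {rr:.2f}, 95% CI {rr_lo:.2f} to {rr_hi:.2f}")
```

This reproduces the quoted figures: a difference of roughly 0.03% to 0.58% and a risk ratio of about 1.5 (95% CI 1.04 to 2.18).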

So I think campaigners saying that there is a 50% increase in prevalence is defensible on one level. Given random error, it could plausibly be a lot less, but it could also be quite a lot more. (If you want to consider the implications of the lower bounds of the confidence range, you also need to consider the upper.)

You might still be cautious though: random sampling error is only one kind of error in these surveys, and often the least important one. Perhaps, because of changed societal attitudes around gambling in the same period, the same people who would have responded one way to the DSM questions on gambling in 1999 respond a different way in 2010, creating more 'problem gamblers' by the same criteria even though underlying behaviours haven't changed. I don't know for sure about that; ask an expert.

I think there is also quite a strong possibility that the true prevalence of problem gambling is underestimated: I would question whether really problematic gamblers are likely to respond to these kinds of surveys; they are probably systematically under-represented in a way survey weighting won't be able to compensate for (in a similar fashion to extremely heavy drinkers).

BrianB said...

I just knew as I wrote the word "multiplication" that I was laying myself open to a smartarse response (no offence intended, Rory). I need to choose my words more carefully in future, since I didn't mean it in a literal sense.

But I believe that you missed my point, which was much simpler than your diving for the textbook statistical method for calculating the CI of the difference in two means.

Note that I referred to the "range of possible %age changes", and in this respect I was referring to the contention that the change from a gambling rate of 0.006 (0.6%) to 0.009 (0.9%) could be simply stated as a "50% increase". That is nonsense.

I am prepared to accept, for the purpose of the exercise, that the respective CIs (0.005-0.008 and 0.007-0.012 - I don't like calculating %age changes in %age values) are acceptable (if not 100% accurate). If I also accept that these ranges represent the only estimates of the whole-population gambling rates (ie the sample means are now irrelevant), then the change from one year to the next will fall within a possible range of:

-0.001 (0.007-0.008 ie from the highest value in the first CI to the lowest value in the second)
to 0.007 (0.012-0.005 ie from the lowest value in the first CI to the highest value in the second).

In %age terms, this translates to a %age change somewhere between -12.5% and +140%, and since it includes negative values, it does not support the contention that there was an increase at all.
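That endpoint arithmetic can be reproduced as follows (note that, as the thread goes on to discuss, this is not the standard way to construct a CI for the change; it treats the interval endpoints as hard bounds):

```python
# The two confidence intervals used above, as proportions
ci_1999 = (0.005, 0.008)
ci_2010 = (0.007, 0.012)

# Smallest and largest possible changes if each true value could lie
# anywhere inside its interval
min_change = ci_2010[0] - ci_1999[1]   # about -0.001
max_change = ci_2010[1] - ci_1999[0]   # about  0.007

# Expressed as percentage changes relative to the 1999 endpoints
min_pct = min_change / ci_1999[1] * 100   # -12.5%
max_pct = max_change / ci_1999[0] * 100   # +140%
print(f"{min_pct:.1f}% to {max_pct:.1f}%")
```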

Note that I am not calculating the CI of the difference in the sample estimates - this is a purely arithmetic approach but one based on an acceptance of the uncertainty of those estimates.

I would also point out that, given that the two original CIs overlap each other, you cannot conclude that the population estimates differ at all.

The formula that you linked to is flawed (and not just due to its horrendous typo) - it may well be a standard 'text book' formula, but I don't accept that it is mathematically valid. Why? Because it falls into the exact same trap when, at the end of a pseudo-mathematical formula transformation, it just replaces the population estimates (P) by the sample estimates (P hat) (as an 'approximation') - without which step it cannot calculate the CI. But we already know that the real population estimate is a range, incorporating the uncertainty bounds (CI), so it just throws away all of that uncertainty, as if it doesn't matter.

But it does matter, and it is this lost 'uncertainty' that leads to your calculated CI (of the difference in proportions) being much narrower than my equivalent, but simpler, arithmetic calculation.

This type of problem occurs so often in statistical calculations that it makes me want to weep at times. There is too much acceptance of the precision of statistics that are the result of 'approximation' formulae, yet the true variability (uncertainty) is rarely properly accounted for.

It becomes much worse when (particularly) epidemiologists start introducing other variables, or coarse distribution ranges, for use as weighting or other 'adjustment' factors, which are themselves only sample estimates and so should carry their own ranges of uncertainty into the further statistical calculations. If this were properly done, there would be far, far fewer spurious '95% significant' results and hence far fewer claims of 'junk' science.

I'm sure you will be itching to fire missiles back at me, but I don't really want to engage in a drawn-out, largely sterile debate about the mathematics of statistical methods. As a mathematician who has 'done' enough statistical analyses in my professional career, I recognize the value of statistical analysis when used with eyes open, but I deplore the 'blind faith' approach. I also deeply resent their use, and wilful abuse, by people with a political axe to grind - especially if the axe is intended to be used on my neck!