I have the Wikipedia page which collates polling data on the EU referendum bookmarked, as all nerds do, and I was taking my daily cursory look at it this morning when something struck me about it. I fed the data into Excel to check I wasn’t overinterpreting, and there it was again.
The following graph shows sample size (on the X axis) plotted against Remain’s lead in the polls (on the Y axis).
In general, the smaller the sample size, the larger the Remain lead is. This is largely because of the telephone polls: ORB’s phone polls always have a sample size of 800 exactly, and all the sub-1,000 sample polls are conducted by phone.
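For anyone who would rather check this outside Excel, here is a minimal sketch of measuring how strongly sample size and Remain lead move together, using a plain Pearson correlation. The function name `pearson_r` is my own, and the numbers in the two lists are invented placeholders to show the shape of the input, not real poll results. A negative value would match the pattern described above (smaller samples, bigger Remain leads).

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Placeholder values only (NOT real polls): replace with the sample sizes
# and Remain leads copied from the polling table.
sample_sizes = [800, 1000, 2000, 4000]
remain_leads = [10, 6, 2, 0]

print(f"correlation: {pearson_r(sample_sizes, remain_leads):.2f}")
```

A correlation on its own says nothing about why the two move together, so it is worth running it separately for phone and online polls if the table records which is which.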
What to make of this? Here’s another chart, this time plotting the general election polls and the Conservative lead:
It’s perhaps a little hard to make out, but if I’m not mistaken, the general election polls don’t show such a sharp disparity. Small sample sizes show exaggerated leads in one direction or another, but there are almost as many points below the X axis as above it. At the general election, the polls all clustered around a tie, which turned out to be very wrong indeed. Contrast that with the EU referendum polls, where there’s a much more marked profusion of big Remain leads in the little polls and a much closer picture in the big ones.
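To put a rough number on how much small polls should bounce around by chance alone, here is a back-of-the-envelope sketch of the margin of error on a lead. It assumes simple random sampling, which real polls with their weighting and quotas are not, and the 50/50 split is purely an illustrative input. The function name `lead_margin_of_error` is mine.

```python
import math

def lead_margin_of_error(remain_share, leave_share, sample_size, z=1.96):
    """Approximate 95% margin of error on the lead (remain minus leave).

    Assumes simple random sampling; shares are fractions between 0 and 1.
    """
    lead = remain_share - leave_share
    # Variance of the difference between two shares from the same sample.
    variance = (remain_share + leave_share - lead ** 2) / sample_size
    return z * math.sqrt(variance)

# Illustrative 50/50 split, at a phone-poll size and a larger online size.
for n in (800, 2000):
    moe = lead_margin_of_error(0.5, 0.5, n)
    print(f"n={n}: lead margin of error is about +/-{100 * moe:.1f} points")
```

On those assumptions, an 800-person poll has a margin of roughly 7 points on the lead and a 2,000-person poll roughly 4.4. But sampling error cuts both ways: it would explain scatter above and below the axis, as in the general election chart, not a pile of big leads on one side only.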
I’d hazard a guess that the bigger sample sizes are probably more accurate, but that depends on the polling companies having sorted out their methodologies. As it is, we’re flying blind.