The Difference Between Valid and Representative Polling Data


The first thing to know about the poll below is that there is nothing wrong with it.  Too many people editorialize and accuse good organizations that are doing their best.  The truth is that data collection can be both perfectly valid and completely inaccurate.

The problem, as it always is, is response rate, and this is the worst-case scenario for the polling industry.  If ideology correlates with response, then as the Democratic base becomes younger and more female, its response rates will drop disproportionately compared with those of right-leaning respondents.

Let me put it in different terms, but first, let’s have QPoll chime in with some data.

                  REGISTERED VOTERS
                                                        WHITE
                                                      4 YR COLL DEG
                     Tot   Rep   Dem   Ind   Men   Wom   Yes    No
Favorable            36%    3%   79%   26%   30%   42%   52%   25%
Unfavorable          56    94    14    64    64    49    45    71
Hvn’t hrd enough      4     2     5     5     3     5     2     3
REFUSED               4     1     3     5     4     4     2     2

Let my poll vets perk up at a bizarre sight.  While President Biden is at -20 in net favorability among all registered voters (I am sure he isn’t among the overall electorate), white voters with a four-year college degree have him at plus 7.  For reference, Mitt Romney carried all college graduates by 4 points over President Obama and, if I recall correctly, white college graduates by 12.

                          AGE IN YRS           WHITE
                   18-34 35-49 50-64   65+   Men   Wom   Wht   Blk   Hsp
Favorable            25%   37%   34%   49%   29%   39%   35%   55%   32%
Unfavorable          60    57    62    44    67    55    61    27    58
Hvn’t hrd enough      9     2     1     5     2     3     3    11     4
REFUSED               6     4     3     2     2     2     2     7     6

In table two we see that in this poll, conducted from 10-12 to 10-16, President Biden’s favorable rating with whites overall is 35 percent.  So his net favorability is plus 7 with white college graduates, but -26 with whites overall.  We have to ask, “is this possible?”

Let’s consult Cook Political’s Swingometer.  Here we see these numbers:

[Image: Capture20WV.JPG, Cook Political Swingometer screenshot]

From 2020, we can see that white college grads were roughly 43 percent of the overall white vote, leaving roughly 57 percent for non-college whites.  The end result was 41 percent of the white vote for President Biden.  As we can see, the poll above has him at 35.

Possible?  Certainly.  Likely?  Not hardly.  For the 35 to hold with the 2020 makeup of the white vote and 52 percent among white college grads, President Biden would have to be at 22 percent among non-college whites.  Again possible, but highly improbable.  This poll has him at 25 percent.
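For anyone who wants to check that arithmetic, here is a rough sketch in Python.  It uses only numbers already quoted: the roughly 43/57 split of the 2020 white vote, and the poll's favorable figures of 52 percent (white college), 25 percent (white non-college) and 35 percent (all whites).  The function names are made up for illustration, and treating favorability as if it were vote share is a simplifying assumption.

```python
# Back-of-envelope check of the white-vote arithmetic (illustrative names).
COLLEGE_SHARE = 0.43      # white college grads as a share of the 2020 white vote
NONCOLLEGE_SHARE = 0.57   # the remainder: non-college whites


def blended(college_pct, noncollege_pct):
    """Overall white number implied by the two subgroup numbers."""
    return COLLEGE_SHARE * college_pct + NONCOLLEGE_SHARE * noncollege_pct


def implied_noncollege(overall_pct, college_pct):
    """Non-college number needed to produce a given overall white number."""
    return (overall_pct - COLLEGE_SHARE * college_pct) / NONCOLLEGE_SHARE


# What non-college whites would have to be for whites overall to sit at 35.
needed = implied_noncollege(overall_pct=35, college_pct=52)

# What the poll's own subgroup numbers blend to under the 2020 composition.
mixed = blended(college_pct=52, noncollege_pct=25)

print(f"non-college needed for 35 overall: {needed:.1f}")
print(f"52 and 25 blended at 43/57: {mixed:.1f}")
```

The first figure comes out near 22 and the second near 37, which is the gap the paragraph above is pointing at.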

The problem is that this is a poll of all registered voters.  For these numbers to portend anything about the election, we would have to accept that non-college whites will be 62 percent of the overall white vote, not 57.  Put another way, the non-college margin over white college as a share of white turnout would have to grow from 14 points (57 to 43) to 24 points (62 to 38), an increase of about 70 percent.
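The composition math can be run the other way too.  This sketch solves for the non-college share of the white vote that would make 52 and 25 blend to 35, then compares the 57/43 and 62/38 splits.  The helper names are hypothetical, and because the poll toplines are rounded, the solved share is only good to a point or so.

```python
def implied_noncollege_share(overall_pct, college_pct, noncollege_pct):
    """Share s solving: s * noncollege + (1 - s) * college = overall."""
    return (college_pct - overall_pct) / (college_pct - noncollege_pct)


def margin_growth(old_share_pct, new_share_pct):
    """Relative growth in the non-college lead over college within the white vote."""
    old_margin = old_share_pct - (100 - old_share_pct)   # 57 vs 43 -> 14
    new_margin = new_share_pct - (100 - new_share_pct)   # 62 vs 38 -> 24
    return (new_margin - old_margin) / old_margin


share = implied_noncollege_share(overall_pct=35, college_pct=52, noncollege_pct=25)
growth = margin_growth(old_share_pct=57, new_share_pct=62)

print(f"implied non-college share of white vote: {share:.1%}")
print(f"growth in non-college margin, 57/43 to 62/38: {growth:.0%}")
```

With the rounded toplines the solved share lands at about 63 percent, within rounding of the 62 cited above, and the margin growth comes out at roughly 71 percent.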

In any given election, this is possible.  But using the overall universe of registered voters to project a result is fraught with error, because white college voters turn out at higher rates than white non-college voters.

Do you remember 2004?  When Democrats were doing better with working class voters, they were getting absolutely trounced with the white college vote.  So we would look at a poll of all registered voters, and Kerry would lead.  Then we would look at a poll of likely voters and want to throw soda cups at the TV screen.

Well, the same dynamic is playing out now.  President Biden does better among likely voters than among all registered voters, because one of the groups most likely to vote, white college grads, supports him.  In 2020, their turnout was 72 percent.

So why is polling so unfavorable to the President?  Imagine we are polling a universe that includes young people.  Young people are notoriously hard to reach by phone, so we have to weight to account for their lack of response.  Say we need, out of a poll of 500 respondents, 100 young people, but we can only reach 50.  Each of those 50 then has to count double.

As nonresponse rises, weighting ever-smaller subsets of respondents up to fill in for those groups creates wide variance.
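To put rough numbers on the 500-respondent example, here is a small sketch.  It assumes simple random sampling within each group, a 50/50 split of opinion (the worst case for sampling error), and that the other 450 respondents are weighted down to fill the remaining 400 slots.  The effective sample size uses Kish's approximation, which is a choice made for this illustration and not necessarily what any pollster uses.

```python
import math


def moe(n, p=0.5, z=1.96):
    """95 percent margin of error for a proportion from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)


def kish_effective_n(weights):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# 50 young respondents standing in for 100; 450 others standing in for 400.
weights = [100 / 50] * 50 + [400 / 450] * 450
n_eff = kish_effective_n(weights)

print(f"young subgroup, 100 reached: +/- {moe(100) * 100:.1f} points")
print(f"young subgroup,  50 reached: +/- {moe(50) * 100:.1f} points")
print(f"effective sample size of the 500: {n_eff:.0f}")
```

Under these assumptions the young subgroup's margin of error goes from just under 10 points to nearly 14, and the full sample of 500 behaves more like one of 450.  The weights fix the proportions, not the noise.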

The other thing to remind everyone of is that, across all demographics, voters who will talk rather than text are more likely to be conservative, as I have noted in previous posts.

From last May’s The Claw News:

In the modern era, a random sampling is almost impossible because there is more cultural diversity, perhaps more importantly, behavioral diversity within cultures, than ever.  A college student that texts only is more likely to be progressive or a Democrat than a college student who talks on the phone.  We knew this nearly ten years ago.

Overall, self-designated political conservatives appear to be the least advanced, and active, when it comes to mobile technology. Pew found that while liberal, conservative and independent voters are equally likely to own a cell phone, only 40% of conservative voters own a smartphone, significantly fewer than liberal (56%) or moderate (55%) voters. Also, only 68% of conservatives use text messaging, compared to 78% of moderates and 81% of liberals.

So those 50 young people we reached?  They are not representative of young people as a whole, who are much more likely to text than talk.
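A toy example shows why weighting cannot repair this.  Every number below is invented purely for illustration (the text-only share, the support levels, the response rates); none of it comes from Quinnipiac or Pew.  The point is only the mechanism: if the young people who answer the phone differ from the ones who do not, scaling them up by age leaves the error untouched.

```python
# Hypothetical young-voter universe; all figures are made up for illustration.
groups = {
    "text_only": {"pop_share": 0.75, "support": 0.60, "response_rate": 0.02},
    "will_talk": {"pop_share": 0.25, "support": 0.40, "response_rate": 0.10},
}

# Support among all young people, responders or not.
true_support = sum(g["pop_share"] * g["support"] for g in groups.values())

# Who actually lands in a phone sample: population share times response rate.
reached = {name: g["pop_share"] * g["response_rate"] for name, g in groups.items()}
phone_estimate = (
    sum(reached[name] * groups[name]["support"] for name in groups)
    / sum(reached.values())
)

# An age weight multiplies every young respondent by the same factor,
# so it cancels out and the within-group estimate does not move.
age_weight = 2.0
weighted_estimate = (
    sum(age_weight * reached[name] * groups[name]["support"] for name in groups)
    / sum(age_weight * reached[name] for name in groups)
)

print(f"true support among the young: {true_support:.1%}")
print(f"phone sample estimate:        {phone_estimate:.1%}")
print(f"after age weighting:          {weighted_estimate:.1%}")
```

In this made-up universe the phone sample understates support by several points, and doubling the weight on each young respondent changes nothing.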

Right now Quinnipiac is going through the Twitter treatment, with partisans calling its credibility and agenda into question, which is ridiculous.  I can assure you, nobody at Quinnipiac is giggling to themselves, nobody is deliberately skewing data, and the data itself is valid.  Valid data is not always representative, and in this era, response rates are at an all-time low.

From the Harvard Data Science Review:

Because it is incredibly difficult to randomly sample in the contemporary polling environment, most pollsters augment random sampling with weighting and related tools such as quota sampling and multilevel regression with poststratification. These weighting-type adjustments make the nonrandom samples resulting from nonresponse look like they came from a random sample, but with a cost: the techniques require us to assume that the decision to respond is independent of the content of response once the weighting variables have been controlled for.

So my advice is simple:

Look at the cross tabs of any poll and compare those to long-established data trends to get a full read of the poll.

Going forward, the clearest indicators we will be able to take from a poll, most any poll, are its internals.  It is somewhat easier to know how subsets of people are feeling, though by no means a breeze, but response rates make modeling an electorate extremely difficult.

And for the heck of it, where do I see the race, right now?

I am confident that 2024 will not be close.  As attention focuses on President Biden's leadership and inflation recedes, his numbers will rise.  On election day, I am predicting an approval rating of 51 percent and a popular vote margin of at least nine points, with him carrying all of his 2020 states plus N.C., Ohio being a tossup/lean Joe, and Texas being a straight-up coin flip.

-ROC