Numbers are important in politics.  One reason is that they help win elections.  Campaigns must know voter breakdowns, where to spend time and dollars, and where they need additional manpower.  The four M’s of winning an election are money, message, machine (that is, campaign organization), and math.

But numbers are also important for shaping policy.  Here, I’m talking about the general use of statistics.  We throw numbers around left and right to support our positions.  Indeed, we seek out the numbers that support our positions and ignore those that don’t, rather than allowing the numbers to inform our positions.

Generally, there is nothing wrong with this practice, but we need to look more carefully at the statistics we believe and use.

One Set Of Numbers, Two Conclusions

How can it be that people on opposite sides of an ideological divide can examine the same set of numbers and reach different conclusions about what those numbers say?  The most obvious answer is the reader’s own biases.

One example is the wage gap debate (which I’ve written about recently).  Wage gappers interpret data that highlights differences in men’s and women’s pay as indicative of sexism in the workplace.  The numbers themselves indicate no such thing, but people interpret them in their own way.  In this case, there is not necessarily a failure in the numbers, but rather in their use.
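To make the point concrete, here is a toy calculation.  Everything in it (the roles, salaries, and head counts) is invented for illustration, not taken from real labor data.  It shows how a raw average pay gap can appear even when every person in the same role is paid identically, which is why the raw number alone cannot tell us what is causing it.

```python
# Hypothetical illustration: invented roles, salaries, and head counts.
# Pay is identical within each role, yet a raw average gap still appears
# because the two groups are distributed differently across roles.
jobs = {"engineer": 90_000, "analyst": 60_000}   # salary by role

men_counts   = {"engineer": 70, "analyst": 30}   # head counts by role
women_counts = {"engineer": 30, "analyst": 70}

def average_pay(counts):
    """Average pay for a group, given its head count in each role."""
    total_pay = sum(jobs[role] * n for role, n in counts.items())
    return total_pay / sum(counts.values())

men_avg = average_pay(men_counts)
women_avg = average_pay(women_counts)

print(f"Men's average pay:   ${men_avg:,.0f}")    # $81,000
print(f"Women's average pay: ${women_avg:,.0f}")  # $69,000
print(f"Raw gap: {1 - women_avg / men_avg:.0%}")  # 15%
```

In this toy case the raw averages differ by about fifteen percent even though no one in the same role is paid differently; the gap comes entirely from how the groups are distributed across roles.  A real analysis would have to ask which explanation actually fits the data.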

This is a very common problem in political discourse.  In fact, I believe that 91% of liberals do this on a daily basis.

What Do These Numbers REALLY Say, Anyway?

Another, more serious, problem with statistics arises in how they are created in the first place.  Statisticians, scientists, and others who collect data often fail to do so accurately.

Darrell Huff, author of the excellent book How to Lie With Statistics, offers a helpful example called the “sample with the built-in bias.”  It goes something like this: the average Yale graduate of the class of 1924 makes $25,111 per year.  “What does this number mean?” asks Huff.  Does it mean that we ought to send our boys to Yale?

Huff says two things about the sample figure are suspicious.  First, it is oddly precise.  It’s unlikely that the yearly income of any group can be accurately known down to the dollar.  Huff points out that “It is unlikely that you know your own income from last year down to the dollar like that, unless it were all salary.”  Incomes that large ($25,000 was a lot of money back then) are likely not all salary; they may also include income from investments.

The second, and more important, problem with the sample statistic is the sampling procedure.  Is the sample representative?  How many class members could reasonably have been contacted?  And what kind of people would actually respond to a survey letter asking about income?

Huff argues that only those with high incomes would have responded to the survey, because they have something to boast about.  Those with lower incomes have less incentive to respond, whether out of embarrassment or for some other reason.  The sample, therefore, is likely not representative of the whole class.

The conclusion readers can draw is that the average is based on whatever some graduates said they earned.  Some answers may have been rough guesses, others more precise.  And, as we know, some people inflate what they make and others understate it, for whatever reason.
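A quick way to see why this kind of self-selection inflates an average is to simulate it.  The sketch below does not use Huff’s data; it assumes a hypothetical income distribution and a made-up response model in which higher earners are more likely to reply, and then compares the survey average with the true average.

```python
import random

random.seed(1924)

# Toy model (not Huff's actual figures): 1,000 graduates whose true incomes
# follow an assumed log-normal distribution.
true_incomes = [random.lognormvariate(8.5, 0.8) for _ in range(1000)]

def responds(income, midpoint=7_000):
    """Assumed response model: the chance of answering rises with income."""
    return random.random() < min(0.9, income / (income + midpoint))

respondents = [income for income in true_incomes if responds(income)]

true_mean = sum(true_incomes) / len(true_incomes)
survey_mean = sum(respondents) / len(respondents)

print(f"True average income:   ${true_mean:,.0f}")
print(f"Survey average income: ${survey_mean:,.0f}")
print(f"Response rate:         {len(respondents) / len(true_incomes):.0%}")
```

Under these assumptions the respondents’ average comes out noticeably above the true figure, even though no one in the simulation lied; the bias comes entirely from who chose to answer.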

Be Critical Of Your Data

Huff offers many other examples of how often statistics are carelessly produced.  The point is that anybody can say almost anything they want with numbers.

It is imperative that we scrutinize statistical information.  We have to ask how it was gathered and examine what it actually says before trusting that it is true.