Lott’s look at those numbers again, shall we?

Update: More comments here.

John “Data Crashes ‘R Us” Lott and his ten research assistants try to come to Limbaugh’s defense:

Whether Rush Limbaugh’s comments on Donovan McNabb were “racist,” there is a general agreement that he was factually wrong, that Limbaugh did not know what he is talking about. Yet, what is the evidence? [snip]

The evidence suggests that Rush is right, though the simplest measures indicate that the difference is not huge. Looking at just the averages, without trying to account for anything else, reveals a ten-percent difference in coverage (with 67 percent of stories on blacks being positive, 61 percent for whites).

The evidence suggests Rush is right? Sadly, No! Actually, the evidence suggests that if you do a good enough job of picking a subset of your data, you can find pretty much any result you want.

In the first section of his article, Lott claims to be examining news coverage of quarterbacks, and overall bias is what Rush had alleged. But what subset of his data did Lott use when presenting his simple comparison of averages, the one showing articles about black QBs to be 67% positive compared to 61% for whites? He used only the weeks in which these QBs played. Should that matter? Big time.

Lott’s total dataset includes 1346 articles: 1013 about white QBs and 333 [was originally listed as 303, then accidentally for about 20 minutes as 343!*] about black QBs. The “simplest measure” shows that 57% of all articles about white QBs are positive, while only 53% of articles about black QBs are. Though again a small difference, the result is exactly the opposite of what Lott (and Limbaugh) claimed. [Lott’s selection excludes 50 articles about white QBs that are 70% positive, and 41 articles about black QBs, 30 of which are negative. What a convenient feature of the data this turned out to be.]
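The mechanics are easy to reproduce. Here is a minimal sketch, with hypothetical article counts chosen to mimic the pattern described above (we don’t have Lott’s actual spreadsheet), showing how dropping a small, skewed slice of the data can reverse the direction of a simple comparison of averages:

```python
# Hypothetical counts that mimic the pattern described above;
# these are NOT Lott's actual numbers.
def pct_positive(positive, total):
    return 100.0 * positive / total

# Full dataset: white QBs covered slightly MORE positively.
white_pos, white_total = 577, 1013   # ~57% positive
black_pos, black_total = 177, 333    # ~53% positive

# The excluded slice: mostly-positive white articles,
# mostly-negative black ones.
excl_white_pos, excl_white = 35, 50  # 70% positive
excl_black_pos, excl_black = 11, 41  # 30 of 41 negative

# Lott-style subset: drop the excluded slice before averaging.
sub_white = pct_positive(white_pos - excl_white_pos, white_total - excl_white)
sub_black = pct_positive(black_pos - excl_black_pos, black_total - excl_black)

full_white = pct_positive(white_pos, white_total)
full_black = pct_positive(black_pos, black_total)

print(f"full dataset:  white {full_white:.0f}%, black {full_black:.0f}%")
print(f"Lott's subset: white {sub_white:.0f}%, black {sub_black:.0f}%")
```

With these made-up numbers the full dataset favors white QBs, while the subset favors black QBs: pick the slice, pick the result.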

When it comes to Lott’s regression, excluding weeks in which the QB did not play does make sense, since there is no “objective” measure of player quality that can be used to control for the (alleged) race effect. [In the Eagles’ bye week, Donovan McNabb’s news coverage was 3 positive articles and 23 negative ones.] But we were puzzled by this statement:

In addition, I accounted for average differences in media coverage both in the quarterback’s city and the opponent’s city as well as differences across weeks of the season.

But Lott’s dependent variable, i.e. the one whose variance he wants to explain, is the percentage of articles that are positive about a given QB. Had he used the total number of positive articles, controlling for media coverage would have made sense; here, however, we fail to see why. Nor do we see why one would want to control for differences across weeks: the dependent variable is a percentage, and hence size-neutral. Moreover, the alleged bias is the media’s own. Whether a given game is covered by 3 papers or by 8, the bias effect should show up regardless.
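To spell out what “size-neutral” means, here is a toy illustration (hypothetical rate, nothing to do with Lott’s data): if some fixed fraction of a QB’s coverage is positive, the *share* of positive articles comes out the same no matter how many papers cover the game.

```python
# Hypothetical illustration: the SHARE of positive articles depends
# only on the rate of (alleged) bias, not on the amount of coverage.
positive_rate = 0.6  # assumed rate of positive coverage for some QB

for n_papers in (3, 8):
    positive = positive_rate * n_papers  # expected positive articles
    share = positive / n_papers          # ...as a share of all coverage
    print(f"{n_papers} papers -> {share:.0%} positive")
```

The share prints as 60% in both cases, which is why a control for the extent of media coverage buys nothing once the dependent variable is already a percentage.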

So what happens if one runs a regression on Lott’s data without “correcting” for the extent of media coverage? To find out, we put our staff of one to work using EasyReg International, a free econometrics program for regression analysis. We imported Lott’s data, and ran a tobit regression keeping the same dependent variable and using as our independent variables:

  • week
  • Monday Night
  • QB Rating
  • Win
  • Points For
  • Points Against
  • Team Rank (QB)
  • Team Rank (Opp)

(These are the variables listed by Lott here.)

Our results? The only variable that is statistically significant (at the 95% level) in predicting positive coverage is the QB rating for that game. A win comes close (significant at the 90% level). A QB’s race doesn’t even come close. We wonder what Mary Rosh would make of this.
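For readers who want to try this sort of exercise at home without EasyReg, here is a rough sketch in Python. It uses ordinary least squares on synthetic data as a stand-in for the tobit model we ran (the variable names echo Lott’s list, but every number below is made up): when the share of positive coverage is driven by performance alone, the race dummy’s estimated coefficient comes out near zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400  # synthetic QB-weeks; all numbers here are made up

qb_rating = rng.normal(80, 20, n)                # passer rating for the game
win       = rng.integers(0, 2, n).astype(float)  # 1 = team won
race      = rng.integers(0, 2, n).astype(float)  # dummy, 1 = black QB

# Generate a "share of positive articles" that depends ONLY on performance.
share_positive = np.clip(
    0.3 + 0.004 * qb_rating + 0.02 * win + rng.normal(0, 0.1, n), 0, 1)

# OLS via least squares (a simple stand-in for a tobit regression).
X = np.column_stack([np.ones(n), qb_rating, win, race])
beta, *_ = np.linalg.lstsq(X, share_positive, rcond=None)
for name, b in zip(["const", "qb_rating", "win", "race"], beta):
    print(f"{name:>10}: {b:+.4f}")
```

The point of the sketch is methodological, not empirical: with performance controls in the model, a race coefficient indistinguishable from zero is exactly what “no detectable bias” looks like in the output.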

There is a more fundamental problem in Lott’s crude analysis. He writes:

Depending on whether positive or negative words were used to describe the quarterback, stories were classified as positive, negative, or falling into both categories.

What is the point of this dependent variable? [There isn’t one? We just spent 4 hours writing about it!] Lott’s coding would classify as both positive and negative a story with 10 positive references and 1 negative one, assigning it exactly the same value as an article with 10 negative comments and 1 positive one.
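The flaw is easy to state in code. Here is our reconstruction of the coding rule as he describes it (Lott doesn’t publish his classifier, so this is an assumption about its logic): the rule looks only at whether *any* positive or negative words appear, never at how many of each.

```python
# Reconstruction (assumed, not Lott's actual code) of a coding rule
# that ignores the COUNTS of positive and negative references.
def lott_code(n_positive_words, n_negative_words):
    if n_positive_words and n_negative_words:
        return "both"
    if n_positive_words:
        return "positive"
    if n_negative_words:
        return "negative"
    return "neutral"

# A glowing article (10 positive refs, 1 negative aside)...
print(lott_code(10, 1))   # -> "both"
# ...gets exactly the same code as a hatchet job (1 positive, 10 negative).
print(lott_code(1, 10))   # -> "both"
```

Any coding scheme that collapses a puff piece and a hatchet job into one category has thrown away precisely the information a bias study needs.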

Exploring the possibility of media bias in sports coverage isn’t a pointless endeavor. Lott’s way of going about it, however, is.

Jesse at pandagon and the increasingly malicious Roger Ailes have more.

Thanks to Tim Lambert for pointing out the typo in 333 v. 303 above.
