Thursday, September 17, 2009

Screwy Statistical Reasoning

As a break from mocking economists for their silly notions, a favorite pastime, it's time to get real on this: psychologists suck at statistical reasoning.

Consider this:

"Oakes (1986, p. 82) reported that 96% of academic psychologists erroneously believed that the level of significance specifies the probability that the hypothesis under question is true or false" (Gigerenzer, "The Superego, the Ego, and the Id in Statistical Reasoning").

Note that I don't have immediate access to the Oakes reference to see what this sample of "academic psychologists" consisted of and so forth, but still, it blows my mind.

What's happening here (presumably) is a serious misunderstanding of conditional probabilities. (A conditional probability is the likelihood of something being true, given that something else is true.)

Here's an example of some conditional probabilities:

(1) What is the probability that a creature is named Leopold Rex, given that the creature is a rabbit?

vs.

(2) What is the probability that a creature is a rabbit, given that the creature is named Leopold Rex?

Now, we don't know the probability of a creature being a rabbit or being named Leopold Rex with any precision, but it seems reasonable to believe that there are a lot of rabbits and not very many Leopold Rexes. So for (1), we'd expect that the probability of a rabbit being named Leopold Rex is pretty small (after all, most rabbits in the world don't have any "name" in the human sense at all). But for (2), once we know we're dealing with a Leopold Rex (a rare thing), the probability that he is a rabbit may be pretty good.

Even though we can't say a lot about these two probabilities, we should be able to see fairly easily that the probabilities are not going to be the same, or at least aren't necessarily the same. If we were able to determine that 33% of all Leopold Rexes are rabbits, we certainly can't then say that 33% of rabbits are Leopold Rexes.
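To put rough numbers on this, here's a minimal sketch in Python. The counts are completely made up for illustration (nobody has surveyed the world's Leopold Rexes), and the names `rabbits`, `leopold_rexes`, and `conditional` are just my own hypothetical labels. The point is only that the same overlap divided by two different totals gives two very different conditional probabilities.

```python
from fractions import Fraction

# Hypothetical counts, invented purely for illustration.
rabbits = 1_000_000        # creatures that are rabbits
leopold_rexes = 3          # creatures named Leopold Rex
rabbit_leopold_rexes = 1   # creatures that are both

def conditional(joint_count, given_count):
    """P(A | B) = count(A and B) / count(B)."""
    return Fraction(joint_count, given_count)

# Same numerator, different denominators.
p_name_given_rabbit = conditional(rabbit_leopold_rexes, rabbits)
p_rabbit_given_name = conditional(rabbit_leopold_rexes, leopold_rexes)

print("P(Leopold Rex | rabbit) =", p_name_given_rabbit)
print("P(rabbit | Leopold Rex) =", p_rabbit_given_name)
```

With these invented counts, a third of Leopold Rexes are rabbits, but only one rabbit in a million is a Leopold Rex.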

The fact that the two probabilities aren't the same is even more striking when you compare:

(1) What is the probability of a creature being my pet, given that he's a rabbit?

vs.

(2) What is the probability of a creature being a rabbit, given that he's my pet?

The probability in (1) is vanishingly small, since I have one pet and the world is full of rabbits. But the probability in (2) is 1 - since I have only one pet, once we know we're talking about my pet, we know for certain that the creature is a rabbit.

But it's also not the case that you can say that if 33% of Leopold Rexes are rabbits, then 67% of rabbits are Leopold Rexes. It's even more obvious that if 100% of my pets are Leopold Rexes, it doesn't follow that 0% of Leopold Rexes are my pets. However, it appears that a lot of academic psychologists get this all mixed up in their heads when thinking about their hypotheses and their data.

The p-value only tells you the probability of having gotten data at least as extreme as yours, given that the null hypothesis is true (i.e., that the effect you were looking for isn't there). This does not magically morph into telling you the probability of the hypothesis being true given your data, even though that's what you really want to know.
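Here's a rough sketch of how far apart the two quantities can be. It uses Bayes' rule on a simplified version of the problem: instead of an exact p-value, it asks for the probability that the null hypothesis is true given that a result came out "significant" at some level. Every input is an assumption I picked for illustration: the significance level `alpha`, the `power` of the study, and `prior_null` (the fraction of tested hypotheses for which the null really is true). The function name is hypothetical too.

```python
def prob_null_given_significant(prior_null, alpha, power):
    """P(null is true | result is significant), via Bayes' rule.

    prior_null: assumed P(null is true) before seeing any data
    alpha:      P(significant | null is true)
    power:      P(significant | null is false)
    """
    sig_and_null = alpha * prior_null
    sig_and_not_null = power * (1 - prior_null)
    return sig_and_null / (sig_and_null + sig_and_not_null)

alpha = 0.05   # assumed significance level
power = 0.80   # assumed power

# Try two assumed priors: nulls true half the time, or 90% of the time.
for prior_null in (0.5, 0.9):
    posterior = prob_null_given_significant(prior_null, alpha, power)
    print(f"prior P(null) = {prior_null:.1f} -> "
          f"P(null | significant) = {posterior:.3f}")
```

Under these assumptions, P(significant | null) is fixed at 0.05 the whole time, while P(null | significant) moves around with the prior and is not 0.05 in either case. You can't get the second number from the first without bringing in information the p-value doesn't contain.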

Lately I've been thinking about how relatively unprepared psychology students (even master's/PhD students) are mathematically. I have not been able to find (in my very cursory search thus far) any information on what the math background of the typical experimental psychology PhD student is, but given how little math prep is expected by even the very good programs, I believe that it isn't very much - possibly not enough to complete undergrad-level stats courses that demand mathematical rigor. And while I believe that it is possible to develop good statistical reasoning / intuition without having the math chops to prove the central limit theorem, etc., I do suspect that it's harder to understand this stuff without a reasonably solid grasp of the basic underlying mathematical concepts.

Of course, according to Gigerenzer, it's the writers of the psychology research methods/stats textbooks and the people teaching these classes who are largely to blame for the faulty reasoning, since they either explicitly misrepresent what the p-values are, etc., or do not bother to be clear, thus inviting misunderstanding. (Sadly, many of them don't appear to be doing even as well as merely inviting confusion.) Maybe they themselves are confused, and maybe it's easy to slip into error when you're trying to address an audience that isn't really mathematically up to the job of understanding what the hell you're talking about.

I know it's easy for me to say this, since I'm decent at math, but it seems that psychology as a discipline would benefit from requiring more mathematical talent and preparation among those entering graduate programs in research-based subdisciplines. (I would like to say that I can't see how it's relevant to the large number of psych students in the practitioner-oriented fields and programs, but as Robert has pointed out, if there is any famously and dangerously statistically incompetent profession in the world, it's medical doctors. Whether people doing marital counseling need the same level of statistical sophistication as doctors, I couldn't say.) There are already approximately a gazillion people who want to get a PhD in psychology, so reducing that number to those who are capable of and willing to get a higher level of math preparation wouldn't decimate the field. It's a separate question whether increasing the math skill requirements would cut out too many people with high levels of creativity as researchers and theorists, but an economist might note that psychology has too many theories already, and I would suspect that many-to-most of the theorists whose work has proven to be important would have been capable of more math had it been asked of them.

4 comments:

rvman said...

So, you are finding that Psych grad students may need to find a happy mathematical medium between the "I took Calculus and a research methods class while getting my psych BA" level common in Psychology and the "I was an Engineering/Operations Research/Statistics major as an undergrad, and I took a couple of Econ classes as distribution" preparation level often found in Economics?

Sally said...

Yes. I would say that both groups could move toward a more moderate level of math preparation, though definitely with the prep for econ > prep for psych.

And actually, I'm not convinced that all psych majors even take a calculus class...though I hope that those wanting to pursue a PhD in experimental do.

Tam said...

I looked up the requirement for psych majors at my school. You do have to take four statistics/research methods classes from the psychology department (two seem to be about statistics and the other two are more research-oriented). Other than that, there is no math requirement whatsoever, except the general college requirement that you take one of the following:

* mathematical modes of thought
* college algebra
* introduction to statistics
* finite math for the management and social sciences
* integrated mathematics I

Tam said...

Oh, also, none of the psychology courses in that 4-course sequence have any prereqs in math, so I don't think any of it is calculus-based.