Empirical Question: May 2007

Wednesday, May 30, 2007

Sympathetic Magic in the Grocery Aisles

Tam emailed me this article and said she would like to read more about it. Since I happened to have, at that very moment, a copy of the most recent issue of the Journal of Marketing Research (which contains the write-up of the study discussed in the article) sitting on the desk in my office, this was too good an opportunity to pass up, so I brought the journal home to blog about for your reading pleasure.

This work involves applying the "law of contagion" to consumer attitudes and choices. The law of contagion states that objects can transfer aspects of their essence through physical contact and that the contamination of the previously neutral/desirable object by a disgusting item continues after the objects are no longer touching. There is an obvious basis here in the possibility of microbial contamination of food through contact with bodily products, vermin, etc., but the law holds even in situations where actual physical contamination is impossible.

The study covers 6 related experiments that included the use of a disgusting object* (a package of menstrual pads or a tub of lard), a desirable target object that comes into contact with the disgusting one (a package of cookies or rice cakes), and 2 other neutral grocery store items. The researchers varied conditions such as whether the disgusting item touched the desirable item and whether the desirable item was in an opaque or transparent container. Subjects looked at the tableau of items and then rated on a 10-point scale how much they would like to try/use the target object.

* They were able to identify "disgusting" objects in a separate study by asking people to rate a number of grocery store items on a disgust scale. Items such as trash bags, kitty litter, and diapers were also commonly found to be disgusting.

Experiment 1 showed that subjects reported a higher interest in trying cookies and notebook paper when the item was not touching a (new, sealed) box of menstrual pads. Contrary to their hypothesis, the effect was the same for both the consumable and nonconsumable target objects.

In experiment 2, they found a contaminating effect of physical contact with a disgusting item even when subjects rated the cookies an hour after seeing the tableau (the subjects spent the intervening time in an interactive classroom session to distract them from what they had seen).

In experiment 3, the researchers investigated whether contagion effects would be found for products that had a negative association but were not deemed disgusting* by comparing the effect of contact with menstrual pads and income tax software. Subjects were less interested in trying cookies that had been in contact with menstrual pads, but were not affected by the cookies being in contact with the software. This suggested that the decreased interest in the cookies was driven not by an overall negative association with the item that it touched, but rather more specifically by the disgusting nature of the pads (even though the cookies did not come into real contact with the pads themselves).

* Again, they did pre-testing on items to ensure that the disgusting and merely negative items yielded the same overall level of negativity in subjects, so that the difference was in the particular emotions they inspired.

Experiment 4 demonstrated that subjects were less interested in trying cookies that had been in contact with menstrual pads when the cookies were in a clear container but not when they were in an opaque (and labeled) container. They found the same effect for rice cakes in a clear vs. opaque container touching a box of lard, showing that contagion effects can occur even when the item is not seen as generally disgusting, but disgusting in some particular way (by being extremely fattening).

In experiment 5, subjects who saw a clear container of rice cakes in contact with the container of lard reported them as having more fat than did subjects who saw the two items not touching. (There was no difference for the opaque container of rice cakes.) However, the groups did not differ in their ratings of the calorie content of the rice cakes across the two conditions. In this case, the lard appeared to transfer its primary quality (fattiness) specifically to the rice cakes rather than simply making the rice cakes appear different (worse) along multiple dimensions.

In experiment 6, the researchers performed a mediation analysis to determine how the variables of "disgust" (measured by asking subjects to rate their level of disgust with the menstrual pads) and "contact with the pads" influence reduced interest in trying the cookies. They found that contact leads to disgust and disgust leads to a lower interest in eating the cookies.
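
For the curious, here is a minimal sketch of what a regression-based mediation analysis of this general kind can look like. It is not the paper's actual procedure or data: the numbers are simulated, and the variable names, sample size, and effect sizes are all assumptions made up for illustration. The idea is to split the total effect of contact on interest into a direct part and an indirect part that runs through disgust.

```python
import random


def slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def two_predictor_slopes(x, m, y):
    """Slopes for x and m from a least-squares regression of y on both."""
    n = len(y)
    mean_x, mean_m, mean_y = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    smm = sum((mi - mean_m) ** 2 for mi in m)
    sxm = sum((xi - mean_x) * (mi - mean_m) for xi, mi in zip(x, m))
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    smy = sum((mi - mean_m) * (yi - mean_y) for mi, yi in zip(m, y))
    det = sxx * smm - sxm ** 2
    slope_x = (smm * sxy - sxm * smy) / det
    slope_m = (sxx * smy - sxm * sxy) / det
    return slope_x, slope_m


random.seed(0)

# Simulated data (NOT the study's data): contact is 0/1, and by construction
# it raises disgust, which in turn lowers interest in the cookies.
contact = [i % 2 for i in range(200)]
disgust = [2.0 * c + random.gauss(0, 1) for c in contact]
interest = [7.0 - 0.8 * d + random.gauss(0, 1) for d in disgust]

total = slope(contact, interest)      # total effect of contact on interest
a = slope(contact, disgust)           # contact -> disgust
direct, b = two_predictor_slopes(contact, disgust, interest)
indirect = a * b                      # contact -> disgust -> interest

print("total effect:   ", round(total, 3))
print("direct effect:  ", round(direct, 3))
print("indirect effect:", round(indirect, 3))
```

With ordinary least squares, the total effect always equals the direct effect plus the indirect effect, so a result like the one described above would show up as an indirect effect that accounts for most of the total.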

So overall, the study finds that people are less interested in an item when it has been in contact with a disgusting object due to a contamination effect. Further, this effect is stronger when the item is in a clear, rather than opaque, package, due to the role of visualization.

The clear implication of this line of research is that if you want to make the cookies you keep in your house (and that you claim to yourself have to be there to keep your spouse/kids happy) less desirable to you, you should place them in a clear container (precisely the opposite of the usual dieting advice) with some disgusting item in physical contact with it. I'm thinking that a line of clear plastic containers with extremely realistically rendered cockroaches on it would do wonders. The devil of it is in figuring out how to get the cookies into the cockroach container, since, ick!, who would want to touch it? The containers would have to be sold with special gloves for handling the container.

I found the Time article a bit misleading in its treatment of the positive effect a desirable object can have on other objects. Although this kind of effect has been found, it isn't quite the same to my mind as the contagion effect. There is no situation in which, for instance, a bowl of pure spring water with a piece of dogshit floating in it will be seen as purifying the shit rather than contaminating the water. The "holy icon"/"my beloved's sweater" effect seems to stem from positive associations with an individual - interpersonal factors - rather than from strictly physical contamination.

Source: Morales, A. C., and Fitzsimons, G. J. (2007). Product contagion: changing consumer evaluations through physical contact with "disgusting" products. Journal of Marketing Research, 44, 272-283.

Tuesday, May 29, 2007

The Pain of Paying

I recently took a web survey from our friends at the Carnegie Mellon Department of Social and Decision Sciences* which classifies consumers into three categories based on their patterns of under- or over-spending relative to the amount that they desire to spend. The survey is based on the theory that the immediate emotion (experienced at the moment of choosing to buy or not buy) of the anticipatory pain of spending money varies between individuals and impacts spending choices above and beyond the influence of old-school, rationalistic considerations of the forgone future consumption that a present purchase represents.

*Remember that this department remains on my list of Top 20 Grad School Possibilities.

They break down consumers into the three categories as follows:

Tightwads (21%) - their affective reaction to spending may lead them to spend less than their more deliberative (i.e., consequentialist) selves would prefer
Spendthrifts (18%) - the failure to feel the pain of paying may lead these consumers to spend more than their consequentialist selves would prefer
Unconflicted consumers (61%) - tend to spend about as much as their more deliberative selves would prefer

Anybody want to guess where I fell on this scale? Yes, a tightwad.

The researchers have an interesting, freely accessible journal article on their development and validation of the Tightwad-Spendthrift (TW-ST) scale. I will bullet point some of the findings I found of particular interest.

Women are more likely than men to be spendthrifts.
Correlation between income and TW-ST scale is low (r=.12), suggesting that reluctance to spend is not a function of ability to spend.
Tightwads are motivated more by avoiding the pain of spending than by gaining the pleasure of saving. (Correlation between the two statements was an impressively low r=.08). Seeking the pleasure of saving is a defining characteristic of the "frugal" personality, considered a separate concept from the tightwad.
Tightwads are more likely to find spending painful before a purchase, while spendthrifts are more likely to feel the pain after the purchase.
Consistent with the researchers' hypotheses (and this is correlational, not causal), tightwads and spendthrifts both report lower levels of overall happiness than unconflicted consumers. However, highly frugal individuals are happier than the less frugal.
Tightwads find it difficult to "suspend (otherwise beneficial) self-control when doing so would be desirable." Spendthrifts have the opposite problem with exercising self-control.
Tightwads spend significantly more on investments than consumables, relative to their own desired levels of spending. In other words, spending money on investments (where "thoughts of future consumption dampen the pain" of spending) is less painful to tightwads than spending money on stuff to use, so they come closer to spending the "right" amount on investments.
Spendthrifts overspend significantly more on consumables than investments (consistent with "present-based" time perspective).
In an experiment, tightwads were more willing to pay a $5 fee when it was characterized as a "small $5 fee" than a "$5 fee." Spendthrifts and unconflicted consumers did not differ in their response based on how the fee was framed.
There is no correlation between TW-ST scores and the Eating Attitudes Test. However, the researchers (and I!) are interested in further investigation into possible relationships between the TW-ST scale and dieting behavior.

I find the comparisons of frugal vs. tightwad individuals pretty interesting. I wonder if tightwads are more likely to "underearn" relative to frugal folks, since they are primarily about not spending rather than actually building up savings. (Of course, there is no reason why a person can't have elements of both.)

Right now, I'm glad that I'm a tightwad. That reluctance to spend money has allowed me to build up some reserves to go toward keeping me off the streets while I am back in school. And truly, I shudder to think how many pairs of shoes I would own if I were a merely "unconflicted" consumer.

Sunday, May 27, 2007

First Zuma, then Calculus, then the World

For a while now, my goal in playing the free Yahoo game Zuma has been to complete all 16 levels on the first frog and thus utterly beat the game so I can give it up for good. (You start the game with three frogs and lose one when the chain of balls hits a certain point on the board.) After a more than two week hiatus and in a sick, slightly feverish state, I just now played a round of Zuma with very low expectations of my performance and finally beat the game - 16 levels with a final score of over 110,000 points on one frog.

So there goes one addictive time-waster that has potentially stood in the way of full devotion to my calculus class. Speaking of which, I still have not received a single word about when my test will be ready at the testing center. Even though they told me it would be ready in three days in that scary message at the time I requested the exam, it has now been over two weeks and I've heard nada. (They were supposed to send a postcard to confirm.) I'll have to call the testing center on Tuesday morning and find out what gives. This examination limbo is really stretching the credibility of their claim that the course is "self-paced" and I'm tired of the whole screwed-up situation. At least I have now gotten grades on homework assignments 7 and 9 (100 and 98 points) but not on assignment 8, which was mailed in at the same time.

I am going to soldier through Calc I, but I have decided that I am not taking the Calc II course through the UT extension service. For a $10 application fee, a copy of my Rice BA transcript, and a minimal application, I can become a "non degree seeking graduate student" at Texas State (~20 minutes south of my apartment) and take Calc II there, as well as other undergraduate and graduate courses. (I am looking at the introduction to marketing class, which would allow me to take a consumer behavior class, which is right up my alley, the following semester; the undergraduate "independent study" in psychology; and two psych grad courses, advanced statistics and attitude assessment.) So conceivably, I could start my Cunning Plan of going back to school a semester earlier than previously planned, due to this special grad student status available at Texas State. This does not change my plan to enroll at UT in the spring semester. It would, however, give me a totally legitimate reason to leave my job in August instead of January, which is seeming more and more necessary for my mental health and future plans for a multitude of reasons which I will not bore you with in this venue at this time but that really crystalized for me in reading my work email while at my training class at CSU. (I will update you on my last week at CSU in a separate post.)

In other calculus news, Tam and I did some investigation into the math course descriptions at Rice, UT, and her school (Metropolitan State College of Denver) that finally clarified something that has been confusing me for a long time. The infamous "Calculus III" class that is offered at MSCD, Texas State, community colleges, and a lot of other colleges I got from a quick Google search just now (e.g. Georgia Tech, Columbia, Maryland, Iowa State, Florida State, Purdue, Minnesota), and that Tam took at MSCD and I took at Tulsa Community College right after I graduated from Rice, is, as I had thought from my experience of the class, the same class that is generally called "vector" calculus at many other schools. (At Rice, its official title is Math 212, Multivariate Calculus, but it was always called vector.) The difference is in the order of the courses. Some schools offer vector as the third class in the calculus sequence, followed by differential equations, while others offer diff eq then vector. (And in a rare display of full disclosure, Stanford acknowledges that the two classes are independent of each other and says they can be taken in either order.)

Looking at the UT course listings, Tam also helped me determine that the discrete math class is the same as her "proofs" class at MSCD and is, as I had hoped, not necessary for what I want to do (despite its inclusion on the extremely helpful BS in mathematical sciences - prob & stat concentration degree plan I got from the UT web site). So that brings my post-Calc II math target down to 5 instead of 6 courses.

OK, after all that excitement, it's now time to lie down. Again.

Saturday, May 12, 2007

Going to Colorado, Again

This blog will be updated sporadically at best for the next two weeks while I am in Colorado for my last week of human dimensions training and then a week of site visits to Jefferson County Open Space parks (among other things) with Tam. I will be taking my math book too in the hopes that I will return to Texas having memorized all those pesky formulas. Because as my mom said, it's important that I learn these by heart in the event that some disaster occurs which both makes all computers and calculators stop working and destroys all books. Well, if my plane back from Colorado gets swept way off course and crashes into an island (a la Lost), I should be prepared to whip out the formula for Tan(A-B). That'll keep those monsters at bay!

And I just have to shout out for the writers of the show Lost, by the way. I've caught up with the end of Season One and it just continues to be exciting and fun and occasionally goosebump-inducing. If you do watch Lost, you should also be reading this Lost Blog. Good times, people. If you don't watch Lost, well, your ... um ... loss. I will allow no comments about Lost on this post. I cannot stand the idea of risking someone letting loose a spoiler of any kind, even if it only amounts to "Season 3 starts getting lame." I'm too invested in this thing. You could tell me, but then I would have to kill you, and that just wouldn't be right. This is one of the downsides to watching all my TV via Netflix; I am not at the same point of the shows as anyone else but Robert, and when it comes to a show with as heavily mysterious a plot development as this one, that means not being able to discuss it at all.

Beautiful People in Film, a Novelty

I just finished watching the movie Closer, which was character-driven yet had characters that I found ultimately boring and disappointing when they weren't being immediately obnoxious, stupid, and shallow. I was surprised to read several reviews (after watching the movie) that suggest that the dialogue was intelligent and witty. Have we reached a point of such diminished expectations for movies that exchanges such as "You've ruined my life" / "You'll get over it" are considered conversational gems? It also featured a couple of the most eye-rolling "meet cutes" I have seen recently. (I learned this phrase from the at turns amusing and gagorific The Holiday. The scenes with Jude Law and Cameron Diaz become increasingly unwatchable due to way overload on giggly saccharine sweetness.) This is not to say I disliked the movie exactly, but I didn't like it - I guess I was influenced by the fundamental isolation (all the characters are emotionally distanced from anyone but themselves, and sometimes even then, despite whatever frenzied, ridiculous declarations of love are made to people they have hardly even spoken to) but not by the ongoing manipulation. I reached the end of the movie with a kind of "okay, who gives a rat's ass" attitude.

I remember in the advertising and discussion around the movie when it came out that much was made of the fact that the film was about the lives of four "beautiful people," which always struck me as idiotic. Because, yes, what we don't have enough of in Hollywood blockbusters is beautiful people. I hate how we are forced to watch so many movies with so many unattractive actors. Like, the aforementioned The Holiday. Jude Law, Kate Winslet, Cameron Diaz - all beautiful people right? But then Jack Black makes the fourth. While surprisingly charming in the movie, certainly not beautiful. This movie was working at only 75% beauty in its main four characters, to say nothing of the old dude who probably wasn't even beautiful in his prime!

And OK, I know it's bitchy to say, but let's face facts: Julia Roberts is no longer looking all that beautiful. If her face gets any thinner, you could cut rope with it. She still has appeal, in a sort of big-eyed, elf-like way, but she's not the stunning beauty she once was. And the blonde hair really wasn't working for her at all.

Friday, May 11, 2007

Distance Learning Frustration

It shouldn't be this hard to figure out what one must do, along what time line, and in what order to arrange to take the midterm exam for an online extension class at UT. They have information in about four different places on the course web site and it conflicts. One place says that I should request my exam at least two weeks before I plan to take it. So when I finally find the spot where I can "request midterm exam" and push the button, I get a message saying, "You can take your exam 3 business days from today." And it gives me the testing location address. This doesn't jibe so I go back and see that there was a message above the button warning me "be sure you are ready to take the exam before submitting your request" because "the request cannot be recalled." What? Was the idea that I am to be "ready to take the exam" and stay in this primed, ready state for the next two weeks? That's not how any college student in history has faced an exam. I still need to, you know, study for the damn thing.

And they don't tell me whether it is necessary for me to schedule a time to take the exam or if I can just show up during their business hours at any time. But if that is the case, and they will basically sit on the exam until I show up, why all the scary talk about "the request cannot be recalled"?

I guess I will have to call the testing center on Monday and find out what the hell is going on. Call them from Colorado State, I mean, since I am in class all next week.

Ah, I just found some other information that says: "Your online confirmation will be followed in the U.S. mail by a postcard which will give you the information you need to take the exam; for example, your testing site and the date when your eligibility for exam administration expires. It will tell you also what identifying information and authorized materials you will need to bring to the testing center." So I guess I wait until the post card comes and Robert tells me what it says. Hopefully my testing eligibility will last a while after I get back from Colorado.

Grrrrr.

By Popular Demand, My Gold-Toothed Rock Rabbit - View at Own Risk

I hasten to say, I take no responsibility for the aesthetic horror that is this statue. Does the gold tooth distract you from the shockingly disturbing hot pink eyes and devil horn-like ears? How about the drooping 1970's era redneck truck driver mustache? I thought not. It could only be more horrifying if it were wearing Birkenstocks. Although, now that I look at this photo more closely, can't you imagine that it is wearing a pair of dark grey Crocs? Yikes.

Successfully Too Short

I managed to get my hair cut too short at Super Cuts on my way home from work this afternoon. I told her that I usually get my hair cut at the hairline but I actually wanted it somewhat shorter than this and the extra hair on the neck shaved. She looked at it and said, "I can go an inch shorter than your hairline because your hair is so blonde. If it were black, no." I agreed and ended up with nicely too short hair that I am looking forward to washing and drying extremely quickly while I'm at Colorado State next week and on a night-time shower cycle. And if I'm lucky, it will be at least 3 days before my hair starts frizzing out again at the ends.

Woo hoo, I was able to get a photograph of the back of my head! Note the impressive use of double mirrors. The lighting in the bathroom makes my hair look really orange. I have not actually been washing my hair in Libby's pumpkin puree, despite what a fan of the product I am and that I still have a dozen or so cans of it from that crazy 17c per can (or whatever it was) sale at Albertson's. Man, my neck looks really weird with my head turned slightly to see the camera - I look like I'm about to go all Hulk any moment. You know, if Hulk turned orange.

Wednesday, May 9, 2007

Highlights from Day 2 of the Recreation Planner Conference

My coworker K and I attended one day of this conference, being held in Austin this year, on Tuesday. We went to three presentations in the morning, had a lunch of good-tasting food in fairly small portions (including the thinnest sliver of Italian cream cake I have ever had the good fortune of encountering) at the hotel while being serenaded by a cowboy singer, and took an afternoon field trip to a couple locations in Wimberley (where I had never been).

Unbeatable Visual Highlight:
Floyd fell into the water at Jacob's Well when the unstable planks making a bridge across the water slipped off the rock they were rather half-assedly braced against. He took this in much better spirits than one might expect, especially given that his cell phone was ruined. Fortunately, he fell into the waist-deep water on the near side of the planks and did not fall into the actual cave. A couple young swimmers, including one with full body tattoos that a fellow visitor aptly described as giving him the appearance of a scaled fish (and not, in my view, in a bad way), held the boards steady so I was able to get back across with dry feet, though I was prepared to walk back through the water with my backpack held over my head.

Best Nature Sighting:
The locally rare Chatterbox Orchid growing along the stream bank. Pretty.

Coincidence #1:
I let K pick the seminars we went to, so not having paid much attention at all to what they were about, I was surprised about 90 seconds into the very first one to discover that it was about the development of trails and parks by Jefferson County Open Space in Colorado, where Tam lives. It kept me awake (I had been half-snoozing in a comfy chair in the hotel lobby before the session began) to hear about and look at photos of places I had actually been - Crown Hill Park (which Jeffco Open Space did not originally support, by the way), Wheat Ridge Recreation Center, the trail along the creek in Golden, Golden Gate Canyon State Park (not operated by Jeffco, but in the area), and more. Otherwise, one of the two most interesting things about this session was seeing how fucking much money these people have to play with. (Presentations at these kinds of conferences are generally given by groups with the kind of money that my agency - and many to most others - can only dream of. I wish there were more "Public Input Processes on a Shoestring" and "High Quality, Fast, Cheap, Pick 2, Maybe: Planning in the Real World" type sessions available, but somehow, nobody has really figured out how to do a good job of things in those situations.) The other was a comment by another attendee that surveys of users of their park/trail system have found a higher percentage of people bringing dogs than bringing children. This pointed up to me the strange disconnect, common to a great many park systems, between actual visitors and the interpretive programming that is made available; too often interpreters provide "family" or "child" oriented programming but nothing for adults.

Coincidence #2:
Our musical entertainment, Jane Leche, is a member of the US Forest Service group "The Fiddlin' Foresters." One of the items available at the Silent Auction was their CD, which featured a cover of one of my favorite songs, "Cold Missouri Waters," which tells the story of the famous Mann Gulch fire in Montana. The leader of the fire crew independently came up with the idea of setting a fire to an area of grass and lying down in the burned out area; because the fire was deprived of fuel in that spot, the flames went around him, saving his life. I was disappointed that Jane did not sing "Cold Missouri Waters" for us at our lunch concert, but we can all listen to her version here. Not as great as Richard Shindell's amazing, goose-bump-inducing version, but it's a good song nonetheless.

Silliest Question:
After I attested to the difficulty of shooting a traditional English longbow given a draw weight of 100 pounds or more that you have to hold to aim, unlike a compound bow, which has a much lower hold weight, Floyd of the wet pants asked me, "Sally, are you an Englishman?" I said no. (I'm not much of an archer either. Or an expert on medieval weapons.) This was in the context of discussing the story (story, not reality) that the raised middle finger and the phrase "fuck you" originated with the English bowmen at the Battle of Agincourt raising their middle fingers to the French (who had threatened to cut the middle finger off any captured Englishmen, thereby making them incapable of shooting bows which required three fingers to pull) and saying that they could still "pluck the yew" (the tree from which the bows were made). Although this story is not true, it does make for an amusing bon mot. May "pluck yew" join "Chuck you, Farley" in my personal lexicon of "fuck you" alternatives.

Most Surprising Overheard Comment:
"I spent $5,000 on it used and got another 320,000 miles out of that [American - can't remember which brand] truck before I rebuilt the engine."

Unplanned Side Stop for the Wimberley Field Trip:
Sally's apartment complex! Robert had taken me to the hotel downtown in the morning and was planning to pick me up after the field trip was over. But since we were driving by the apartment on the way back from Wimberley, I asked if they would mind dropping me off on the way back to the hotel. Either they really didn't mind or they admired my pluck/ballsiness in asking enough that they went along with it, though a couple of people said that a spot of single malt scotch would go down quite well before they returned to the hotel. (Little did they know that I actually do have several quite nice bottles in my kitchen.) This saved me a good 45 minutes of commute time, possibly more. I had fun calling Robert on my cell phone and, barely able to hear him and uncertain how well he could hear me, saying "Go home. I have a ride. OK? Just go home!" He got it. Somebody asked me how much my apartment cost, and I told him, and several people from other states seemed surprised - welcome to Austin real estate! (To their credit, they could not know that it's a luxuriously comfortable 1350 square foot 2 bedroom/2 bathroom apartment with two people and one rabbit. They did not actually get the tour.)

Lesson in Electrical Engineering

Upon inspection of our freezer, the repairman told Robert: "It's a good thing your rabbit chewed the white wire and not the black one."

The voltage in the white wire is basically zero. The black wire can give you a nasty shock (and electrocute a rabbit).

So remember this, little bunnies:

Chewing wire
Can be dire:
If you must bite,
Nip the white.
If you chomp black,
It will fight back;
Don't make this chew
The last thing you do.

Sunday, May 6, 2007

A Series of Annoying Events

Thursday evening, our air conditioner broke. The apartment complex repair guy came to look at it on Friday and determined that they needed to replace the motor. A couple other guys, who work for a company the complex contracts out to, came out on Saturday morning and replaced it. But the A/C didn't seem to really be working right even after that, so they came back to check it out and realized that the person at the parts supply store gave them the wrong motor. So they will have to get another one on Monday when the store is open again and switch them out.

Today, I was doing some sewing in Leo's room (where I have to block him away from the sewing table area because he loves to pick up the sewing pedal and toss it out from under my foot) and when I finished, for some reason I can't quite explain (but I assume must have been the result of a sub-conscious appreciation that the room wasn't as noisy as it should be), I decided to open our one-week-old freezer (which contains 5 dozen muffins, 8 pieces of lasagna, and 4 bowls of soup) that is also in Leo's room. And everything was unfrozen and just kind of chilled feeling. Robert came to look at it and discovered that Leo had chewed a hole in the cord. This was quite mysterious because Robert had specifically positioned the freezer unit and the cord against the wall such that Leo's lovely round body could not possibly fit behind it. Yet this was the only explanation consistent with the bitten cord. I was starting to wonder whether rabbits have some amazing ability to squeeze into impossibly small places, but as Robert pointed out, Leo would have had to dislocate his hip to get back there.

But on further examination of the freezer, Robert noticed that it wasn't actually as flush to the wall as it had been, which he found confusing until I realized: on Saturday, the contractor needed to access the breaker box in Leo's room, which is next to the freezer. He must have moved the freezer to get to it more easily and then pushed it back, only not as far as it had been, thus leaving Leo with just a large enough opening to get in.

So we currently have a half-assed A/C running and no freezer. The repair phone line the store referred Robert to is not open on the weekend, so we won't know until tomorrow (or later) what will have to be done to get the freezer back in action. Presumably they can simply replace the cord at who knows what expense. (The possibility that we have a $500, ~12 cubic foot paperweight is too depressing to contemplate.)

This, on top of the fact that Robert can't get his new desktop computer that he purchased to run his huge SAS programs to talk to his laptop, has made for a weekend of all kinds of stuff just not working right. I even ran into some difficulty with my sewing machine that finally resolved itself with quite a bit of frustration but no serious mistakes. Our Internet has been iffy all weekend as well, only connecting about half of the time and thus frequently requiring Robert to perform an increasingly complicated series of rituals to get it to work.

At this point, I am almost afraid to try starting my car in the morning. (Actually, for the past couple weeks, it has been giving me a little bit of a hassle when I first start it to come home from work; not enough to worry me, but just enough to notice.)

Next: Robert and I make the stereo burst into flames just by looking at it funny.

Saturday, May 5, 2007

Student Ratings of Teaching, Part 1

Note: If this long post starts to wear you down, buck up - it is followed by a video of the funniest thing it has been my pleasure to see in a long time - 1:00 of hilarious bunny-filled brilliance. (Thanks, Mom, for telling me about this commercial and "thank you, science.")

In the comments to the post on people’s ability to evaluate an instructor’s personality based on viewing 6 seconds of silent video, Tam asks:

"I also wonder what the results are of studies comparing these factors (the ones that influence student evaluations, or just the results of student evaluations themselves) to actual effectiveness in teaching - i.e., how much students learn."

The easy answer is, it depends on whom you ask. The validity of student evaluation of teachers is a matter of great controversy among researchers (to say nothing of the larger general academic public). At the most extreme, this divides into two camps. The first consists of those who have built a prominent part of their careers on research that presupposes student evaluations are reasonably valid measures, and who are tempted to believe this despite whatever contradictory evidence may appear. The second consists of those who believe that student evaluations are of obvious bogosity and are tempted to hold researchers in this area to a standard that is perhaps unfairly rigorous (and that they are unlikely to match in their own areas of interest), such that their opponents can never make a good enough case to satisfy them.

Of course, my own tendencies toward critical assessment make me naturally inclined to be skeptical of student evaluations and a half-assed cursory examination of the literature does not make me any less a tentative ally of those in the second camp.

Here I will discuss at length one particular journal article that I liked a lot (written by a critic of student evaluations of teaching named Olivares), briefly another that had a useful run-down of some empirical findings (written by Ahmadi and Cotton), and my general thoughts on the subject. (Sources at the end of the post as usual.)

The Olivares article begins with a review of the as-near-to-universally-accepted-as-I-can-imagine definitions of validity. It poses the general question, what would it mean to say that student ratings of teaching (SRTs) are valid? At its most basic level, a valid measure is one that measures what it is supposed to measure. There are several types of validity that psychologists talk about:
- Content validity: Does the measure (SRTs) represent all aspects of teacher effectiveness?
- Criterion validity: Is there a meaningful relationship between the measure (SRTs) and some measure of the relevant behavior (such as “student learning”)? Note that this is related to the question that Tam asks.
- Construct validity: Do SRTs measure a trait or characteristic of interest? Does “teacher effectiveness” exist?

Olivares argues that SRTs do not hold up well to an examination of their validity. One major problem is that without a good definition of “teacher effectiveness,” it becomes next to impossible to judge whether SRTs do a good job of measuring it. In the absence of such a definition, the ratings themselves become the de facto operational definition of teaching effectiveness. He quotes another critic who has issues with the most obvious definition of teacher effectiveness – how much students learn: “The best teaching is not that which produces the most learning, since what is learned may be worthless.”

I am inclined to agree that the lack of a well-formulated definition of teacher effectiveness is problematic, and the pervasive use of SRTs as the de facto operational definition can put the field in the uncomfortable position of being caught in a circular reference: What is teacher effectiveness? What this test of teacher effectiveness measures. I feel sure that most researchers who use SRTs as a measure in their work appreciate the fact that teacher effectiveness is a multi-faceted concept, but I know that there is a strong tendency to privilege in your mind whatever aspect of some complex thing you can measure and do something with. In my opinion, psychologists, who generally work in an experimental mode and hence have a bit better control over their datasets, can be less prone to this than other social scientists* (and even doctors perhaps**), but it is a danger in all of these fields. I think it’s entirely appropriate for curmudgeonly critics (and I am obviously a student member of the Curmudgeonly Critics of America) to occasionally remind researchers of this fact.

* As Robert has said, economists are forced to use whatever data they can find and thus use very strange measures indeed, like tractors-per-capita, as proxies for their variables of interest, and when asked to explain what one of these measures means, are inclined to respond, “I don’t know, but it explains 73% of the variance.”

** My mom recently commented to me that she was starting to wonder if her doctor’s insistent focus on her cholesterol level was a true reflection of the importance of that level to her health or was simply an artifact of cholesterol being something that she could measure.

It is interesting to think about the ways that even “how much students learn” fails as a universally acceptable definition for teacher effectiveness. Even assuming there was some way to get a very good measure of this (using some kind of pre-/post- measure of knowledge and controlling for student variables like intelligence, motivation, study habits, etc., that could impact learning), maximizing the sheer amount of learning is not always the sole (or in some cases, primary) goal of teaching. One aspect that I think is important is a teacher’s ability to stimulate interest in and future study/thinking about a subject in students. It is easy to imagine the instructor who by blunt force crams a significant amount of knowledge into students’ heads long enough to take the exam, but whose students come away hating the subject and eager to forget this boring crap as soon as possible. Depending on the situation, the ideal balance between knowledge and interest may shift, but in most cases, I believe you want to do both.

It’s certainly true that learning a great amount of trivial information is less desirable than mastery of the fundamental concepts of a subject. (For instance, an American history student may be able to regurgitate a large number of names, dates, and places without understanding how any of it ties together.)

Also, it’s possible that in specialized situations, teachers would focus on very different things. For example, a teacher working with students disadvantaged by some combination of circumstances (e.g. socio-economic status) and innate abilities (e.g. learning disability), and with a history of low achievement, may emphasize increasing the students’ motivation to learn and feelings of self-efficacy toward learning at the expense of short-term mastery of the subject material. (By this I mean students who have done poorly in classes that progressed at a normal pace might be placed in a more slowly paced course that allows them to realize, hey, with effort, I can learn something; it just takes me longer to do it.) Most if not all junior colleges and some universities teach what at Rice and (to my knowledge) most other schools is a two-semester course in calculus over three semesters, presumably because they recognize that their students are not prepared to take on this material at such a fast pace. On the flip side, as Robert pointed out, other institutions use a “weed out” process to separate out those who can advance through difficult material very quickly from the masses who cannot. And even I can see the value of this to an elite program for astronauts, specialized doctors, or that kind of thing.

Another huge issue to me, one that is more methodological than theoretical, is that effective teaching results in learning that lasts beyond the final exam.

The article lays out four assumptions about student ratings of teachers that Olivares believes are not sufficiently met:
"- rating forms adequately capture the domain of teacher effectiveness across instructional settings, academic disciplines, instructors and course levels and types;
- students know what effective teaching is, hold a common view of teacher effectiveness, and are objective and reliable sources of teacher effectiveness data;
- relatedly, ratings are, for all intents and purposes, unaffected by potential biasing variables; and, collaterally;
- teacher effectiveness is being measured as opposed to, for example, course difficulty or differences in disciplines, student characteristics, grading leniency, teacher expressiveness, teacher popularity or any number of other variables."

Olivares states (in a sentence I enjoy very much), “To think that students, who have no training in evaluation, are not content experts, and possess myriad idiosyncratic tendencies, would not be susceptible to errors in judgment is specious.” I agree that to the degree that the validity of SRTs is dependent on believing otherwise, the SRT project is doomed.

Ahmadi and Cotton, who conclude in their article that “in general, student ratings tend to be statistically reliable, valid, and relatively free from bias or the need for control,” give a run-down of some findings that may be useful and, more significantly, on point to answering Tam’s question.

First, they report that studies have found correlations between exam grades and SRTs such that classes that gave higher ratings tend to be the ones where students learned more (i.e. scored higher on an [I believe standardized] exam). However, they acknowledge that many variables related to student learning are themselves related to student ability rather than teacher performance. This might imply that the answer to Tam’s question is “Yes, but…”

They list the following factors that have been found to not be related to student ratings: instructor research productivity, age, teaching experience, race, and gender (though there can be interaction effects between student race or gender and teacher race or gender, with students giving higher ratings to instructors with similar characteristics); student age, level (e.g. freshman, grad student), GPA, and personality; class size and time of day.

They also list factors that are related: faculty rank and teacher expressiveness; students’ expected grades and motivation; work load and difficulty (perhaps surprisingly and reassuringly, those correlate positively, with classes perceived as more difficult getting higher ratings), level of course (higher-level courses are rated more positively), and academic field (humanities and arts > social sciences > math and science).

Getting back to our fellow curmudgeonly critic Olivares. He talks about how, when pressed, many supporters of SRTs will fall back on an argument for their utility; he quotes one who wrote, “Student ratings almost certainly contain useful information that is independent of their correlation with student achievement. That is, student ratings provide information on how well students like a course.”

Of course, this is where yet another of my buttons gets pushed – customer satisfaction. So join me later for the continuation of this discussion, focusing on the theory of customer satisfaction and practice of its measurement, SRTs as a customer satisfaction measure, comparisons of c-sat in teaching and in a field I know quite a bit about, viewing students as the single relevant consumer group, the implications of SRTs for teacher behavior, and the use of SRTs. I do not mean use as in, are Likert scales, which are considered ordinal level data following Stevens' arguably invalid definitions in measurement theory, appropriately reported using parametric statistics (which is an interesting if highly geeky debate with implications for the calculation of student GPAs as well), but rather: the use and misuse of SRTs (understood as a c-sat measure) as an element in making instructor personnel decisions, such as granting tenure.

Sources:

A Conceptual and Analytic Critique of Student Ratings of Teachers in the USA with Implications for Teacher Effectiveness and Student Learning, Teaching in Higher Education, Vol. 8, No. 2, 2003, pp. 233–245, ORLANDO J. OLIVARES

Assessing Students’ Ratings of Faculty, Assessment Update, September–October 1998, Volume 10, Number 5, Reza T. Ahmadi, Samuel E. Cotton

Hilarious Rabbit Foot Commercial

Thursday, May 3, 2007

Judging Teacher Effectiveness in 6 Seconds

A couple of highlights from a 1993 article describing the outcome of a series of experiments in which female students were shown brief, video-only (no sound) clips of university instructors in the classroom and then rated the instructors on a series of personality variables. The scores on these personality variables were compared to the ratings the instructors received in the university's normal end-of-semester student evaluation program.

"Teachers who were rated higher by their students [on the end of semester evaluations] were judged to be significantly more optimistic, confident, dominant, active, enthusiastic, likable, warm, competent, and supportive on the basis of their nonverbal behavior."

A different group of female students also rated the physical attractiveness of the instructors based on still photographs. Correlations suggested that student evaluation ratings are somewhat influenced by physical attractiveness (r=.32), but that when controlling for the scores on the personality variables, the relationship between physical attractiveness and teacher evaluation ratings dropped to r=.14.
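For the curious, here is a rough sketch of what "controlling for" does to a correlation. It uses the textbook first-order partial correlation formula with a single control variable. The article controlled for a whole set of personality variables, so this is a simplification and not the authors' actual analysis. Only the r=.32 comes from the article; the other two correlations are made-up numbers for illustration, and the function name is my own.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z.

    r_xy: correlation between x and y (e.g. attractiveness and ratings)
    r_xz: correlation between x and the control variable z
    r_yz: correlation between y and the control variable z
    """
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denominator


# r_xy = .32 is reported in the article. The other two values are
# HYPOTHETICAL, chosen only to show how the relationship can shrink
# once a correlated third variable is taken into account.
r_attract_ratings = 0.32
r_attract_personality = 0.40  # assumption, not from the article
r_personality_ratings = 0.60  # assumption, not from the article

print(partial_corr(r_attract_ratings,
                   r_attract_personality,
                   r_personality_ratings))
```

The more strongly the control variable is tied to both attractiveness and ratings, the more of the original correlation it soaks up.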

They performed the same kind of study of high school teachers, using principals' ratings of teacher effectiveness, and found similar results - except the correlation between physical attractiveness and ratings from the principal was negative. However, scores on the personality variables given by strangers watching a video clip still predicted the principals' ratings.

I was surprised that ratings on the personality variables were not more reliable when people watched three 10-second clips than when they watched three 2-second clips. People are able to make these kinds of judgments extremely quickly and with a great deal of consensus.

An examination of nonverbal behaviors showed that nodding and laughing are related to higher student evaluation scores while fidgeting with hands or an object is related to lower ratings.

(Source: Half a minute: Predicting teacher evaluations from thin slices of nonverbal behavior and physical attractiveness, Journal of Personality and Social Psychology, Vol. 64, No. 3, March 1993, Nalini Ambady and Robert Rosenthal)

Article linked from Marginal Revolution. Like Tyler, and as I have talked about before in the context of credibility, I think giving the appearance of confidence is very helpful in the public speaking arena.

Wednesday, May 2, 2007

Orwellian Traffic Signs

As I've discussed before, there is a stretch of the southbound I-35 access road between the Slaughter and Slaughter Creek Overpass exits that has two strange merge signs that my fellow drivers do not understand. Or, I should say, had these signs. About a month ago, the confusing straight line/broken line signs that indicated that the left lane ends and the right lane continues were replaced by signs saying "Lane Ends Merge Left." Um, yes, that's correct - they changed it so that the right lane ends and the left lane continues (which both Robert and RB independently decided the road appeared to suggest by design, though it looks completely ambiguous to me). Once I got over boggling at the turn of events, I thought it was more logical - the right lane is usually bogged down by cars turning into the Wal-Mart shopping center or onto Old San Antonio Road, so it made more sense for the left lane to be the through lane. And traffic seemed to me to be slightly more sane along that part of the road after we all got used to the change and/or the rules of the road changed to conform to what we were already doing.

But then - in a bizarre switcheroo that had me talking in my car at loud volume "What?! Now they're just fucking with me!" - on Tuesday night, the signs had been changed again; they now read "Lane Ends Merge Right," thus re-establishing the original protocol, only with words instead of those pictograms that are equally incomprehensible in every language. Well, I say "now" but really, who knows? Since 5:15 this afternoon, they could have been changed to the semi-mysterious drawing showing that we should merge left or replaced by a series of Burma Shave-esque signs:

You must suffer
From aphasia
We were always at war
with Eurasia
Merge now
