Returning to my roots: A literary analysis of the Blogger SAT Challenge

As of yesterday, readers had made an astonishing 3,878 individual ratings of the essays in the Blogger SAT Challenge. The average rating was 2.76, compared to 2.9 from the expert judges. Averaging the most popular rating for each essay yields an even lower number, 2.51. Anyone who thought that blog readers would judge bloggers more favorably than the experts was sorely mistaken.
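
For anyone wondering how the average rating and the "most popular rating" figure can come apart, here is a minimal Python sketch of the two calculations. The ratings in it are invented for illustration and are not the actual challenge data. The names `ratings_by_essay` and `most_popular` are hypothetical, and breaking ties toward the lower rating is an assumption, not something the challenge specified.

```python
from collections import Counter
from statistics import mean

# Hypothetical reader ratings per essay (NOT the real challenge data).
ratings_by_essay = {
    "essay_a": [1, 1, 1, 2, 6, 6],
    "essay_b": [2, 2, 3, 5, 5, 2],
    "essay_c": [3, 3, 4, 6, 6, 3],
}


def most_popular(ratings):
    """Return the plurality rating; ties go to the lower rating (an assumption)."""
    counts = Counter(ratings)
    top = max(counts.values())
    return min(r for r, c in counts.items() if c == top)


# Method 1: average every individual rating across all essays.
all_ratings = [r for rs in ratings_by_essay.values() for r in rs]
overall_mean = mean(all_ratings)

# Method 2: find each essay's most popular rating, then average those.
mean_of_modes = mean(most_popular(rs) for rs in ratings_by_essay.values())

print(round(overall_mean, 2), mean_of_modes)
```

In this made-up sample, a handful of enthusiastic 5s and 6s pull the overall average up, while the most popular rating for each essay stays low. That is one way the second figure could end up below the first.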

Of the 109 entries, just 11 received a score of 5 or higher from the expert judges. Casual readers of the challenge were even stingier with their marks: only 8 essays were ranked 5 or higher by a plurality of readers. Interestingly, 5 of these 8 were not among the expert graders' top choices. One of them received a 1 from the experts!

Analyzing the ratings in a third way finds that just 6 essays had an average reader rating of 4 or higher. Only one of these received a 5 or higher from the expert judges, and only 3 were ranked 5 or higher by a plurality of readers.

So what's the difference between an essay rated highly by the experts and one scoring high marks from casual readers? Careful CogDaily readers know that I have more degrees in English than I do in science. Today I'm going to use my two English degrees to try to understand what these different judges were looking for. I'm also going to compare a couple of the top-scoring essays from our challenge with the top high school entries. It's a long ride, but there'll be a couple of polls at the end, so make sure you stay with me the whole way!

The three ways of analyzing the ratings have produced 19 top-scoring essays, with only one essay common to all three groups. Clearly, casual readers are judging the essays by different standards than the expert judges. But what are those standards?
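
As a quick check on how many distinct essays the three methods produce, the overlaps reported above can be run through inclusion-exclusion. This is only a sketch. It assumes that the "expert graders' top choices" means the 11 essays the experts scored 5 or higher, and the variable names are my own labels. Under that reading the total comes to 19, the number of essays analyzed in the next section.

```python
# Group sizes as reported in the post.
experts = 11     # scored 5 or higher by the expert judges
plurality = 8    # rated 5 or higher by a plurality of readers
average = 6      # average reader rating of 4 or higher

# Overlaps as reported in the post.
experts_and_plurality = 8 - 5   # 5 of the 8 were not expert top choices (assumed reading)
experts_and_average = 1         # only one of the 6 got a 5+ from the experts
plurality_and_average = 3       # only 3 of the 6 were plurality 5+
all_three = 1                   # one essay common to all three groups

# Inclusion-exclusion: add the groups, subtract the pairwise overlaps,
# then add back the essay that was subtracted one time too many.
distinct = (
    experts + plurality + average
    - experts_and_plurality - experts_and_average - plurality_and_average
    + all_three
)
print(distinct)
```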

Perhaps the casual reader favors quick wit over reasoned analysis. If so, they would probably like this essay, which, as one commenter noted, has "gotta be a parody." While the average reader ranking of 1.49 was slightly higher than the expert grade of 1, clearly most readers didn't enjoy this essay. Neither were they especially appreciative of obvious joke entries, scoring them 0.14 and 0.27 respectively.

Delving a little deeper into the responses, I think I may have found something. Now, with just 19 responses to analyze, I won't be able to do statistically valid correlations, but that never stopped an English major before, so let's have at it. I tried analyzing the number of examples used in these essays to see if that affected the ratings, but found nothing. Similarly, whether the thesis was placed at the beginning or end of the essay did not matter. However, when I looked at the substance of the thesis statements, I found something interesting:

[Figure: expert and reader ratings by type of thesis statement]

The thesis statements that agreed with Booker T. Washington's claim that "success is to be measured not so much by the position that one has reached in life as by the obstacles which he has overcome while trying to succeed" all rated fairly well. When a writer disagreed with Washington, scores tended to decrease, but only slightly. But for thesis statements which problematized the question, suggesting that there is no easy answer, there was a dramatic shift. Expert graders consistently rated these essays lower, while blog readers rated them higher than any other type of essay.

Consider the essay rated highest by readers among those with this type of thesis. It was one of only two essays where 6 was the most popular rating, and it had the second highest average rating among casual readers, a 4.11. This essay was given a 3 by the expert graders. Its thesis is probably best stated in its final sentence: "when we attempt the challenging, we succeed in ways that cannot be measured against simple win/lose dichotomies." The writer argues that "Attempting to 'choose' which is, indeed, more important -- struggle or achievement -- misses the point that Washington is trying to make." Clearly this person can write. It may be that the expert judges didn't think the writer took a clear point of view, or supported the thesis adequately, but these flaws don't appear to bother blog readers.

I do, however, want to move on to one more point. The initial purpose of the Blogger SAT Challenge, you may recall, was to uncover whether bloggers can perform as well on the SAT writing test as the brightest high schoolers. It's actually a tough challenge for bloggers -- the essays published in the New York Times article which inspired this challenge are among the few receiving perfect scores of 6 from both graders. They represent less than one percent of the high schoolers taking the test. The top one percent of our study would be represented by just one essay. I'm going to choose a few of the top rated essays to compare to the high schoolers, but this requires me to dip substantially lower into our pool than the Times did with its sample of essays.

First, let's consider Essay 2 from the Times, about whether people can learn from their mistakes. Here's the thesis from that essay: "More often, humans will follow one of two paths: either they will allow the miscue to enfold them and find themselves unable to move on past it, or they will become utterly obstinate and lose hindsight, unable and unwilling to change and thus limiting their future and present success." That's a rather convoluted way of saying "no," but it's clear enough.

Compare that to the essay that received some of the best ratings using all three metrics. Here's the thesis: "I agree with Booker Washington that success ought to be measured by what obstacles a person overcomes. To use the other standard, and judge a person by the position that they have reached in life, is to take no account of the accidents of birth or inheritance." I'd say that's comparable to the student version. But take a look at this essay's conclusion:

In my own life, I have observed the emotional intelligence of others who have had to battle illness, pain, physical or emotional challenges, and felt respect for their ability to hold onto the positive, the life-affirming elements of their life experience. These people are not wealthy, do not drive fine cars, live in modest circumstances, yet are successful because they can put food on the table and face each day with confidence.

That's quite well said. Contrast it to the finale of the student essay:

With these examples, it is clear to see that the past does not automatically allow for people's success later on; rather, it often hinders it. Human nature will never allow people to ignore their gut instincts.

This paragraph is peppered with poor word choices, and only obliquely points back to the essay's main point. It's true that the student essay has very well-developed examples, but the transitions between paragraphs are quite clumsy. Perhaps the problem isn't so much that bloggers write badly, but that we're too critical of ourselves. But you don't have to take my word for it: let's put it to a vote.

Now let's consider an essay that received relatively high marks from our graders but wasn't appreciated so much by the readers. Here's its thesis: "Struggle can overcome the negative luck in many cases, and enhance the positive luck. Good fortune, with no contribution from struggle, is not an accomplishment at all." The author goes on to show how he overcame the obstacle of rheumatoid arthritis to become an accomplished golfer.

Now contrast it to this "perfect" student essay. This essay only offers an implied thesis: "this year I resolved to forget the past -- but learn from it -- and get into All-State Orchestra." The implication is that since the writer can learn from the past, then anyone can.

Both essays offer similar approaches: an extended personal story used to support the claim. But in my view, the blogger's essay does it much more succinctly and elegantly. Again, let's put it to a vote.

I suppose we'll have to wait for the results to be sure, but my guess is that we'll find that the entire Blogger SAT Challenge endeavor has been stifled by standards that are a bit too high, both from the expert graders and from casual readers. That said, there were plenty of atrocious responses from the bloggers. They're definitely not out of the woods yet. I've got one more analysis planned for tomorrow. Then, hopefully, we'll be done with the SAT for good (or at least until Jim's junior year in high school -- now just two years away!).


There may be other considerations: my path to a Bachelor's degree wended through junior college. I never followed the "prep" road, never took the SAT--in fact, the "artistes" at the University so frustrated me, I went back to junior college for another semester.

Ah, well..., that's what a kid does when he doesn't have any mentors.

Nevertheless, I've got an ongoing marriage and two successfully grown children now. But sometimes, in our media-saturated society, it feels like squeezing water from rocks to "face each day with confidence."

Dave, really interesting analysis with lots to ponder.

One thing that was probably different in the grading done by "experts" and the general readership is that the graders know that the score depends on the writer taking a position on the question posed (the "prompt") and not on a discussion of the quotations. It's really irrelevant (to the score) what a writer thought of what Booker T. Washington had to say, regardless of how eloquently that opinion was stated.

Frankly, I wish the SAT would just eliminate the quotation; it ends up being more confusing than helpful to most writers.