A while back, I links dumped Josh Rosenau’s post Firing Bad Teachers Doesn’t Create Good Teachers, arguing that rather than just firing teachers who need some improvement, schools should look at, well, helping them improve. This produced a bunch of scoffing in a place I can’t link to, basically taking the view that people are either good at what they do or they’re not, and if they’re not, you just fire them and hire somebody else. I was too busy to respond at the time, but marked it down as something to come back to. So I was psyched when I saw this paper in Science about a scientific trial of a teacher coaching service, which claims that:
The intervention produced substantial gains in measured student achievement in the year following its completion, equivalent to moving the average student from the 50th to the 59th percentile in achievement test scores.
“Ah-hah!” I said, “Scientific proof that teachers can, in fact, be improved with some extra instruction.” So I sat down to go through the paper for ResearchBlogging purposes. Which is when I hit a problem, because the paper is kind of awful.
The awfulness isn’t primarily on the scientific side, which is reasonably sound. They ran a controlled trial in Virginia with 78 teachers and more than 2000 students, randomly assigning teachers to control and intervention groups. Teachers in the intervention group received coaching in making their classes more interactive, and regularly recorded themselves teaching, then sent the recordings off for review. Experts at the coaching service being tested reviewed the recordings, sent the teachers pointers on what they could do better, and followed up with a phone conversation.
The result wasn’t all that dramatic, but in the year after the coaching, the teachers from the intervention group did substantially better than those from the control group. Performance was measured by comparing students’ scores on the previous year’s state-mandated end-of-year test to their scores on the end-of-year test for the class being studied. The year after the trial, the intervention group’s students improved from a raw score of 479 the previous year to 488 for the year being studied, while the control group went from 495 the previous year to 482. That difference is statistically significant, and it’s the origin of the 50th-to-59th-percentile claim.
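If you want to see roughly where a percentile claim like that comes from, here’s a crude back-of-the-envelope version of the arithmetic in Python. The raw scores are from the paper’s supporting tables; the standard deviation of roughly 70 is the figure I mention below; and treating the simple difference in gains as a normal-distribution effect size is my simplification, not the paper’s regression-adjusted analysis, so it comes out a bit above their 59th-percentile figure:

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Raw scores from the supporting tables (year after the trial)
intervention_gain = 488 - 479   # +9 points
control_gain = 482 - 495        # -13 points
diff_in_gains = intervention_gain - control_gain   # 22 points

sd = 70.0  # rough student-level standard deviation of the scores
effect_size = diff_in_gains / sd
percentile = normal_cdf(effect_size)
print(f"effect size ~ {effect_size:.2f} SD, "
      f"average student moves to ~ the {percentile:.0%} percentile")
```

The paper’s published number is smaller because they adjust for pre-test scores and demographics rather than just subtracting raw averages, but the logic is the same: express the gain difference in standard deviations, then read off where the formerly average student would land.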
So what’s awful?
Well, for one thing, while the intervention group showed a statistically significant improvement over the control group in the year after they did the intervention, the difference during the study was pretty minimal. The intervention group in that year went from 467 to 460, while the control group went from 470 to 464. They try to shrug this off, writing:
This result lends a cautionary note to these findings. It is, however, consistent with the idea that student gains in achievement would occur only after teachers had the benefit of a year’s worth of their own growth, such that students would actually experience enhanced teacher-student interactions over a substantial portion of their academic year.
That sounds an awful lot like retconning to me.
More significantly, because the study only considered two years, there’s no way to tell whether this is just a statistical fluke. Looking at just the pre-test scores, you see a pretty big spread: from 467 and 470 in the study year to 479 and 495 in the year after the study. If you’re going to say anything sensible about the effect of the intervention, you need more of a baseline. How much variation in these achievement scores do you see from one class to another without the intervention? The standard deviations of all those average scores are in the neighborhood of 70, so I would expect to see a good deal of jumping around from one class to the next. It’s conceivable that the whole effect here is just a matter of chance: if they did the same study again the next year, they might see the results reversed.
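To put a rough number on how much jumping around chance alone could produce, here’s a quick Monte Carlo sketch. The student-level SD of 70 is from the numbers above; the between-class SD, class count, and class size are my own guesses at plausible values (the paper doesn’t report a class-level variance), so take the output as an illustration of the method, not a claim about this particular study:

```python
import random

random.seed(0)  # reproducible run

STUDENT_SD = 70.0   # rough SD of individual student scores, per the paper
CLASS_SD = 20.0     # assumed between-class (teacher/cohort) SD -- a guess
N_CLASSES = 39      # ~78 teachers split into two groups
CLASS_SIZE = 26     # ~2000 students spread over 78 classes

def group_mean():
    """Average score for one group of classes, with class-level noise."""
    total = 0.0
    for _ in range(N_CLASSES):
        class_effect = random.gauss(0.0, CLASS_SD)
        total += sum(random.gauss(class_effect, STUDENT_SD)
                     for _ in range(CLASS_SIZE))
    return total / (N_CLASSES * CLASS_SIZE)

# How much does a group average wander purely by chance?
means = [group_mean() for _ in range(2000)]
avg = sum(means) / len(means)
sd_of_means = (sum((m - avg) ** 2 for m in means) / len(means)) ** 0.5
print(f"chance SD of a group average: ~{sd_of_means:.1f} points")
```

With these assumed inputs the chance spread of a group average comes out at a few points, which is the kind of baseline you’d want before deciding whether pre-test averages ranging from 467 to 495 reflect real cohort-to-cohort variation or just noise; crank the between-class SD up and the "it could all be a fluke" worry gets correspondingly worse.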
And then there’s the fact that the paper itself is long on edu-jargon and descriptions of their programs, and short on actual, you know, data. The only data presented in the paper itself is one bar graph, with two sets of two bars (which all by itself is a data presentation method whose ridiculousness is exceeded only by the two-point scatter plot in the next paper in that issue). The useful numbers are all buried in the Supporting Online Material, where you find reasonably informative data tables. The main text actually cites tables that can only be found in the online material, which strikes me as incredibly obnoxious (though for all I know, it’s standard practice in the education research world, in which case we need to line up education researchers and slap them all).
Even worse, when you look at the data tables, you find this table of results for the trial year:
Look closely at the row for the pre-post change in scores. See the problem? This is probably just a typo, but it doesn’t really speak well for the care taken when preparing this article. Which, I remind you, was published in Science, one of the world’s most prestigious journals.
So, as much as I would like to use these results to argue that it does, in fact, make sense to offer teachers targeted training to help them get better, there are just too many holes in this to take it seriously. I started out thinking “Hey, cool, science supports my position!”, but after reading the paper I was left asking “How did this get into Science?”
Allen, J., Pianta, R., Gregory, A., Mikami, A., & Lun, J. (2011). An Interaction-Based Approach to Enhancing Secondary School Instruction and Student Achievement. Science, 333(6045), 1034–1037. DOI: 10.1126/science.1207998