The Faulty Premise of Value-Added Testing of Teachers

Recently, I described how unreliable value-added testing is when used to evaluate teacher performance. Whenever I write about that subject, someone inevitably suggests, in comments or email, that we develop a better method of evaluating teachers, such as more frequent tests. But I think that misses the point--or, more accurately, the flawed assumption underlying the value-added testing movement (aka 'education reform').

The basic assumption the reformers make is that most teachers could perform considerably better if given the proper incentives. That assumption strikes me as utterly flawed. Think back to your own experience. If you're old enough (sorry to depress you), you probably don't remember most of your teachers--only the good and the bad ones. But in many subject areas, you did learn the material. Most teachers, even if not of the caliber featured in movies starring Edward James Olmos, did their jobs. They can't all be in the top ten percent--the Lake Wobegon Effect applies to teachers as well as students.

If you want figures backing that up, keep in mind that U.S. students in schools with a poverty rate below ten percent are the best in the world on the international PISA exam. Students in schools with ten to twenty-five percent poverty are very competitive (and outclass countries with comparable poverty rates). Many teachers seem to be doing their jobs. Students at high-poverty schools do score poorly, but maybe that has something to do with the poverty?

When teaching is at fault, the problem often isn't poor individual instruction, but curriculum and pedagogy. Let's not even broach the subject of the incomplete teaching of biology (it would be nice if the reformers showed up for that fight). Value-added testing and performance-based hiring and firing won't fix those problems. Real educational reform that focuses on teaching will.

Of course, the backdrop to all of this is the staggering number of households that qualify as low-income. But austerity and one percent inflation are more important, I guess...

I come from a family with a long history of teaching, and I was surrounded by teachers for as long as I can remember--family friends, my parents' coworkers, and relatives. One thing I never, ever suspected was that these teachers weren't trying their hardest to do their jobs. Teachers care.

And evaluating teachers by student "performance" (whatever that means) is insulting. Insulting and counterproductive, because such systems of evaluation are either so complex that the results don't mean anything or so simple that they're remarkably easy to game.

Mike T.M.B.:
Please pardon me for playing devil's advocate here:

1. Could you comment on how effective value-added testing is at the school level rather than at the level of individual teachers? Once you correct for income level, class attendance rates, and class size, how much variance is left to be attributed to a "good" or "bad" school?

2. I tend to think the big reason for volatility in value-added testing of teachers is the distribution of highly disruptive students across classrooms. Whether or not the disruptive students themselves test well, each one added to a class means every student in that class gets fewer minutes of instruction time, and the effect isn't linear w/r/t the number of disruptive students. A school where student assignment is truly random (hah) would hand each teacher a roll-of-a-die number of disruptive students each year. A good school would try to spread them out evenly. A bad school would dump them all on the teachers with little clout. (A quick simulation after these questions illustrates the volatility this alone produces.)

Has any study tried to take this into account? I realize that deciding which students are "disruptive" is yet another value judgement that could be gamed if stakes are attached to it, but looking at cumulative disciplinary records over a window of two to three years would be a start.

3. I'm not sure teachers should be happy about value-added testing failing to distinguish teacher quality. If income level, class attendance, class size, etc. determine almost all of the variance in student outcomes, why strive for high-quality teachers? They're not going to make a difference. Wouldn't the community be better served by a larger number of lower-paid, lower-quality teachers, so that class sizes drop instead? On the other hand, assume there is significant variance attributable to teacher quality: how do you show that (as opposed to just assuming it) without value-added testing? (One possible approach is sketched at the end of this comment.)
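To make the volatility point in question 2 concrete, here's a minimal simulation. Every number in it--the Poisson mean, the quadratic penalty, the noise level--is a made-up assumption for illustration, not an estimate from any study, and every teacher is equally effective by construction; the only thing that varies is the randomly drawn count of disruptive students per class. If value-added scores measured teacher quality, rankings would be stable from year to year; here they reshuffle almost completely:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(42)
N_TEACHERS = 500  # every teacher identical by construction

def value_added_scores():
    """One year of class-average score gains, where the only varying
    input is the randomly assigned count of disruptive students."""
    disruptive = rng.poisson(2.0, N_TEACHERS)     # random assignment
    # Assumed non-linear cost: each additional disruptive student
    # eats more instruction time than the last (purely illustrative).
    lost_gain = 0.5 * disruptive + 0.15 * disruptive**2
    noise = rng.normal(0.0, 1.5, N_TEACHERS)      # ordinary test noise
    return 10.0 - lost_gain + noise

year1 = value_added_scores()
year2 = value_added_scores()

# With identical teachers, any year-over-year stability is artifact.
rho, _ = spearmanr(year1, year2)
print(f"year-over-year rank correlation: {rho:.2f}")  # roughly zero

# How many teachers land in the "bottom 20%" two years running?
bottom1 = year1 < np.quantile(year1, 0.2)
bottom2 = year2 < np.quantile(year2, 0.2)
print(f"flagged 'bottom 20%' both years: {np.mean(bottom1 & bottom2):.0%}")
```

Chance alone predicts that about 4% (0.2 x 0.2) of these identical teachers get flagged as "bottom 20%" two years running--before we even get to schools that deliberately dump disruptive students on the teachers with the least clout.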

My hope is that value-added testing gets improved until it is truly prescriptive, instead of just being a political football to be gamed or ignored.
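As for questions 1 and 3, the usual way to ask "how much variance is left for teachers?" is a fixed-effects regression: control for the out-of-school covariates, then see how much additional variance teacher dummies explain. Here's a toy sketch with entirely invented effect sizes (a deliberately small teacher effect next to large income and attendance effects--none of this is real data):

```python
import numpy as np

rng = np.random.default_rng(0)
N_STUDENTS, N_TEACHERS = 10_000, 250

# Invented data-generating process: big covariate effects, small teacher effect.
income = rng.normal(0.0, 1.0, N_STUDENTS)           # standardized family income
attendance = rng.normal(0.0, 1.0, N_STUDENTS)       # standardized attendance rate
teacher = rng.integers(0, N_TEACHERS, N_STUDENTS)   # student-teacher assignment
teacher_effect = rng.normal(0.0, 0.3, N_TEACHERS)   # assumed small spread

score = (2.0 * income + 1.0 * attendance
         + teacher_effect[teacher]
         + rng.normal(0.0, 2.0, N_STUDENTS))        # idiosyncratic noise

def r_squared(X, y):
    """In-sample R^2 of an OLS fit of y on X (intercept included)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

covariates = np.column_stack([income, attendance])
base = r_squared(covariates, score)

# Add teacher fixed effects (dummies, dropping one to avoid collinearity).
dummies = np.eye(N_TEACHERS)[teacher][:, 1:]
full = r_squared(np.column_stack([covariates, dummies]), score)

print(f"R^2, covariates only:          {base:.3f}")
print(f"R^2, covariates + teacher FEs: {full:.3f}")
print(f"extra variance 'explained':    {full - base:.3f}")
```

Note that the in-sample jump in R^2 overstates the true teacher share, because 250 dummies soak up some noise too--which is itself a version of the problem the post describes. But the point stands: you can demonstrate that teacher-level variance exists this way without ever turning the noisy per-teacher estimates into hire-and-fire scores.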