More on Value-Added

Value-added modeling is back in the news (although did it ever really go away?). Among its appearances this week, Stephen Sawchuk reported that Representatives Jared Polis (D-Colorado) and Susan Davis (D-California) recently introduced a bill that would require states to oversee new systems for evaluating teachers and principals. It would also require that student achievement growth as measured by state or local assessments—or value-added analysis where available—be the predominant factor in teacher evaluations.
While Sawchuk points out that this bill will likely not advance on its own, he says to watch it: given that both Polis and Davis sit on the committee that will take the lead in shaping the House’s revision of ESEA, it could be an indicator of what is to come.
Language like that of this bill frustrates me to no end. While most, if not all, education stakeholders agree that current methods of teacher evaluation are flawed, why does it always come back to value-added models? After reviewing the research, which clearly shows the limitations of this strategy, how can one argue that more than 50% of a teacher’s evaluation should be based on it?
And please do not say, “It’s better than doing nothing.” Over at School Finance 101, Bruce Baker shares an analogy in response to that argument:
If we were in a society that still walked pretty much everywhere, and some tech genius invented a new cool thing – called the automobile – but the automobile would burst into a superheated fireball on every fifth start, I think I’d keep walking until they worked out that little kink. If they never worked out that little kink, I’d probably still be walking.
He had previously written about how a failure rate like that relates to the errors likely in teacher dismissals (dismissing effective teachers as ineffective) under typical value-added modeling approaches. (That post was based on the recent National Center for Education Statistics report Error Rates in Measuring Teacher and School Performance Based on Student Test Score Gains.)
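To see the mechanics behind that concern, here is a minimal sketch in Python. It is a toy simulation, not the NCES analysis: the teacher count, noise level, and dismissal cutoff below are illustrative assumptions I am making, chosen only to show how estimation noise in value-added scores can flag genuinely above-average teachers as ineffective.

```python
import random

# Toy simulation of misclassification under noisy value-added estimates.
# All parameters are illustrative assumptions, not figures from the
# NCES "Error Rates" report.
random.seed(1)

N_TEACHERS = 10_000
NOISE_SD = 1.5      # assumed estimation noise, relative to a true-effect SD of 1.0
CUTOFF_PCT = 0.10   # flag the bottom 10% of measured scores for dismissal

# True teacher effects (unobservable in practice).
true_effect = [random.gauss(0, 1) for _ in range(N_TEACHERS)]
# Measured value-added = true effect + estimation noise.
measured = [t + random.gauss(0, NOISE_SD) for t in true_effect]

# Flag everyone at or below the 10th percentile of measured scores.
cutoff = sorted(measured)[int(N_TEACHERS * CUTOFF_PCT)]
flagged = [i for i, m in enumerate(measured) if m <= cutoff]

# Count flagged teachers whose true effect is actually above average.
false_dismissals = sum(1 for i in flagged if true_effect[i] > 0)
print(f"flagged: {len(flagged)}")
print(f"flagged but truly above average: {false_dismissals} "
      f"({100 * false_dismissals / len(flagged):.0f}%)")
```

In runs of this toy setup, a nontrivial share of the flagged teachers turn out to have above-average true effects; the NCES report works out error rates of this kind rigorously, under realistic assumptions about how many years of test data are available.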
And Baker’s post only really looked at the statistical problems with value-added assessments. It did not consider whether the tests on which teacher evaluations would be based validly measure student knowledge of things we actually care about. Roxanna Elden’s piece in the latest Education Next gives me reason to doubt that they do. It includes an experience with a practice FCAT reading question:
Which of the owls’ names is the most misleading?
[Elden] was stuck between (F) the screech owl, because its call rarely approximates a screech, and (I) the long-eared owl, because its real ears are behind its eyes and covered by feathers. The passage explains that owls hear through holes behind their eyes, so the term long-eared owl seemed misleading. Then again, a screech owl that rarely screeches? That is pretty misleading, too.
I personally don’t want 50% of a teacher’s evaluation based on whether s/he was able to prepare students to answer a test full of questions like that.
But acknowledging problems with value-added modeling doesn’t mean we should accept the status quo. Instead, as Baker suggests, we could look beyond the false dichotomy (or “really stupid argument”) that there are only two choices in teacher evaluation: the status quo or value-added. We could consider peer reviews of teachers. We could consider other evidence of student achievement, such as portfolios or writing samples. Et cetera. We owe it to our teachers—and ultimately, to our students—to ensure that those in the classroom are the ones who really should be there.
Comments
Undemocratic ideals being proposed by Democrats. How Republican of them.
If the problems with the value-added approach are problems of reliability and validity, then can we reasonably look to peer reviews, portfolios, and writing samples as alternatives? Those approaches have demonstrable validity and, especially, reliability issues of their own.
The approaches you mention have a few clear advantages. It's possible to gather much more evidence, and if it doesn't seem like enough, go get even more. Such assessments can be done repeatedly and reviewed openly by everyone involved. With state tests, it's one-and-done, and the results are much more difficult to review in many cases. Here in CA, we're not allowed to see or discuss the questions that were actually on the state tests.
James Popham, professor emeritus at UCLA, is widely regarded as one of the ultimate authorities on assessment. His conclusion was that while no approach is perfect, the best approach for teacher evaluation is the professional judgment of peers. I assume he meant that evaluation should be grounded in clear standards of practice, with established protocols for gathering and analyzing information. (No citation for that at the moment - sorry.) And since most teachers can't be linked to the type and amount of data needed for VAM, wouldn't it make more sense to focus on approaches that will work for evaluating all teachers and promoting their improved practice, rather than VAM for teacher evaluation, which runs counter to the findings of the NRC, the National Academies, APA, EPI, AERA, NCME, the US DOE, etc.?
Granted, you can get more information, but that is a different matter from whether the information is valid or reliable. On the latter, when it comes to peer review (in any profession), there is often a world of difference between a solid system with established protocols, as suggested, and what happens in reality. Especially as the system operates over time, interpretation and application of the protocols can become almost random, varying substantially from location to location and peer to peer. If clear, established protocols are in place it can work, but maintaining that premise in reality, in a highly decentralized system of schools, over time, is extremely unlikely. Five or ten years into such a system, the standards used by peers in one school building are likely to be different (and different in unknowable ways) from the standards used by peers in a school building in a neighboring school district. And if you can't compare school building to school building, district to district, state to state, then there is no "standard" at all.
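To make that drift concern concrete, here is a toy sketch. Every number in it is an illustrative assumption, not data: it imagines a single teacher of exactly average quality being rated under site rubrics whose interpretations have drifted apart.

```python
import random

# Toy illustration of rubric drift across sites: the same teaching
# quality, rated under site-specific standards. All numbers are
# illustrative assumptions.
random.seed(2)

N_SITES = 8
# Each site's interpretation of the rubric has drifted by some offset.
site_bias = [random.gauss(0, 0.7) for _ in range(N_SITES)]

true_quality = 0.0  # one teacher of exactly average quality

for site, bias in enumerate(site_bias):
    # Observed peer rating = true quality + site drift + rater noise.
    rating = true_quality + bias + random.gauss(0, 0.3)
    verdict = "meets standard" if rating >= 0 else "below standard"
    print(f"site {site}: rating {rating:+.2f} -> {verdict}")
```

The same performance lands on different sides of the bar depending on where it is rated, which is exactly the comparability problem described above.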
Over 39 years ago I described a way and means to use a primitive - I thought - equation/rubric/algorithm for recording and compiling what was happening in randomly selected classrooms throughout a region or across the nation for as little as 15 minutes a day. The "metrics" it would yield would tell us, with today's computer technology, in near real time what was happening, not happening, and needed to happen to ensure powerful learning outcomes in classrooms with similar demographics. The audience of professors I presented it to at a conference thought it was an egghead joke. I tried again at another conference about 6 years ago; there was no response, and the paper was denied publication in the conference proceedings, ostensibly because I hadn't done it, just described it. Formative estimation of the quality of teaching and learning is incredibly easy to do. But unless you have ever tried to answer a question that someone has not properly raised, you cannot begin to know frustration. Help me find someone with the authority and leverage to listen to a presentation of this two-page notion and equation, and this problem would not only be solved but would serve as a heuristic unlocking many other seemingly insurmountable issues in teaching and learning. But you had better hurry: I'm 70 years old and have had a quadruple bypass.
Tony Manzo
avmanzo@aol.com