Thursday, February 12, 2015

Standardized Tests: Bitter But Necessary Medicine?

At EdWeek, Cristina Duncan Evans had this to say about standardized testing:

What's worse than annual standardized testing? Not having it at all.

Well, no. I don't think so. Her argument is not an unusual one.


What would happen if we no longer had to take the bitter pill of standardized testing? At the most basic level, it would become much harder to figure out which schools aren't doing an adequate job of reaching students. 

I don't think so. I don't believe that standardized tests are telling us that now, so this is kind of like arguing that closing down the telegraph company would be bad because I would never get any more phone calls from that guy who never calls me on the phone.

There are at least two disconnects. One, the tests aren't telling us how adequate schools are, and two, they never will, because they can't.

Politicians and bureaucrats could game statistics to make achievement gaps disappear in order to appeal to voters who don't know what is going on in their local schools. 

Yes, because the past decade of test-driven accountability has kept politicians so honest.

In fact, we've been treated to a decade of politicians gaming statistics to make schools look like failures in order to justify initiatives for charters, vouchers, turnaround scammers and other folks lined up to get their mitts on the goose that lays golden taxpayer-financed eggs. If there's anything standardized tests have NOT been used for, it's to let people know what's going on in their local schools.

And, as always, I have a problem with the idea that local folks have no knowledge of what's going on in schools unless a government bureaucrat with a test results spreadsheet tells them.

Without comparisons, failing schools would face little pressure to improve.

Really? Nobody would know they were failing? Not students nor parents nor teachers working there? And the only clue, the only possible hint that they were failing would be standardized test results? A click-and-bubble test that narrowly measures slim aspects of two disciplines is the best measure we can think of for telling whether a school is failing or not?

The needs of historically underserved populations would go unnoticed beyond their classrooms.   

I just addressed this, so I'll be brief. This is a legitimate concern, but after a decade-plus of NCLB, there is no evidence that standardized tests help with the issue in the slightest, and plenty of evidence that they hurt.

Without standardized testing, successful schools with a strong sense of mission would continue to thrive, but would their lessons be adopted for all students?

Because other teachers aren't interested in hearing about what works, or because they have no means of contacting fellow professionals? And why does success need to be scalable? Can it be scalable? What makes you think that something that works at my school with my students when implemented by me will work at your school in your classroom with your students? I think I'm a pretty good husband to my wife. Does it follow that my statement is only true if I would be a great husband to every straight woman and gay man in America?

In the comments, Evans goes on to underline that she believes we need to be able to compare schools so that we know if students are getting a good education. This makes no sense. Do I need to compare my performance as a husband to that of other husbands to know whether I have a good marriage or not, or can my wife and I depend on our own judgment of our own circumstances? Every student should get a good education, and that means something different in every situation. Comparison has nothing to do with it.

Then in the comments Evans adds this:

That's why I favor fewer, better tests that are well designed and that align with not just standards, but our values. If we value critical thinking, creativity, and depth of knowledge, then we need to design assessments that measure those things. Would that be expensive? Certainly. Would such assessments be computer graded? Almost certainly not.

Sigh. I favor magical unicorns flying in on rainbow wings to lick my head and make my hair magically grow back. But it's not going to happen. I agree that the tests she describes would be useful, but we don't have those tests, and we are never, ever, ever, EVER going to have those tests. Instead we have tests that devalue and disincentivize the qualities she lists. She really lost me here-- it's like saying we'd like a really great house paint for our home, but until we can have that, we'll just have to bathe the walls in flames instead.

Finally, this:


I don't trust schools and states to equitably teach ALL of their students without some oversight, because historically, that just doesn't tend to happen in this country.

In this, we agree. But I don't think standardized tests help with this problem in the slightest. In fact, they make things worse by creating the illusion that the issue is being addressed and take resources away from initiatives that actually would help. Standardized tests are not the solution, not in the slightest.
 



 

28 comments:

  1. "That's why I favor fewer, better tests that are well designed and that align with not just standards, but our values."

    I value honesty, integrity, compassion, justice and decency. Lord help us when they try to come out with standardized tests to align with that.

    Anyway, what in the world did we all do before Lewis Terman developed his tests to tell us what we don't know about our students, teachers and schools?

    Replies
    1. What did we do before Terman's tests? Not educate black people, for one thing. Emphasize memorization and recitation for another. I'm not saying tests led to better education for blacks or deeper learning. I'm saying holding up a pre-1900 vision of schools as an example isn't useful because the education system before 1900 isn't what we want for our kids today. Teaching is as old as Socrates, but it can look different in 2015 than it did when he taught and still honor integrity, compassion, decency, and justice.

    2. You got your cart and your horse mixed up. We didn't educate blacks because we believed them to be inferior. Terman's tests "confirmed" that belief in an "objective" way that made it even harder for blacks to obtain educational equality. IQ tests have been no boon to blacks.

    3. I think you have a tendency to stop reading altogether when you come across an idea you disagree with? Take a look at the fourth sentence onward of my last comment. I know it's probably foolish to continue a conversation with someone who isn't reading thoughtfully, but hey, I'm a history teacher, and getting people to read thoughtfully is part of my job. Old habits die hard. :-)

    4. No, I think you're the one with a reading comprehension problem. No one is talking about a pre-1900 approach to education. Just because I want to get rid of Terman and everything he spawned (which made the education system worse, not better), does not mean I want to throw out all educational advances made since 1900.

    5. I'm with you Dienne, but from the ad hominem attacks it looks like we have a troll on our hands. Seeing as how he doesn't bother to answer your other questions or respond to me. So, pointless to engage any more. Sigh.

    6. I'm not a troll, I'm the author of the original post from Ed Week. I care about this issue and was just trying to understand your point and make myself understood. There's definitely been a misunderstanding on both our parts, and I apologize for not getting what you meant originally and in your other comments. Rebecca, I didn't respond to your comment because I mostly agree with you, and didn't have anything new to add to your point.

  2. I understand a lot of your points, and I think that on many of them we just disagree. But there are two points you raised above that I keep coming back to in my mind. (I’m afraid my thoughts will run long, so I’m going to address them in two different comments.)
    I teach history (and in many ways I'm a moderate), so I tend to distrust blanket statements. The first idea in your response that I can’t let go is the idea that standardized tests tell us nothing about the quality of an education a student receives. I think that a typical, low-quality state standardized test can give you misleading information about the middle 50% of schools, but it can probably give you some good information for the top 25% and the bottom 25%**. But honestly, I’m not advocating for low-quality tests, and I think they should be shelved for something better. For the sake of argument, let's talk about a standardized test that is generally highly regarded by teachers: the AP Calculus Exam. If you look at schools that overwhelmingly get 5s, and the schools that overwhelmingly get 1s, there are a few conclusions you can draw about the school with the higher scores:
    - its students come from wealthy families with stable home lives and extra resources like tutoring
    - its teachers are extraordinarily skilled
    - the school environment supports and encourages students to take advanced classes by preparing students early or recruiting talented students from feeder middle schools
    - the students are highly motivated
    - there was rampant cheating
    Most likely, there is a combination of these factors at play, and definitely a few more. Standardized test scores alone won’t lead us to the answers about which factors dominate. To figure out which combination of the above factors produces strong results takes further analysis, like examining student work, demographic data, and classroom observations. (And no one who’s serious in education is saying that standardized tests should be the only factor that determines how we think about school quality.) But common assessments give us the baseline data to begin that analysis. Identifying the contributing factors above helps us understand better how students achieve, and what the factors are that contribute to success. Knowing with certainty, not based on hunches, what conditions allow students to create good work allows us to recreate those conditions elsewhere. If the root cause of academic success is family wealth, then we need to raise our minimum wage and support economic policies that help working families and the very poor. If it’s the second answer, then we need to invest in teaching, and so on.
    When people hear about standardized tests they automatically assume that it means a paper-based test, but I would challenge that idea and extend the example above to any common assessment worth giving. For instance, I really love National History Day, and I think it’s a great educational program. Students all over the country follow the same basic rules and are graded by the same rubrics. I think that rubrics can be subjective (when evaluators aren’t trained to use them properly) and statistically unreliable, but if the AP and IB programs can reliably train people to evaluate student work, then why can’t we think outside the bubble when it comes to standardized testing? (Because it’s expensive. But that’s a different conversation I think our country needs to have.)
    **I know that I’m vastly oversimplifying the issue by assuming that you can even categorize or rank schools, but I hope that you’ll accept that premise so that we can continue to have a productive conversation.

    Replies
    1. "**I know that I’m vastly oversimplifying the issue by assuming that you can even categorize or rank schools, but I hope that you’ll accept that premise so that we can continue to have a productive conversation."

      So we have to accept a completely invalid premise in order to have a "productive conversation"? Typical rephormer - accept my terms or you're not a Serious Person.

    2. I appreciate your thoughtful participation here. When I read this long business about how the tests could be useful if we turn our head and squint, all I can think is-- why bother? We should be talking about how best to find out the things we want to know, not how to squeeze the tests to get something useful out.

      We shouldn't be talking about how best to use a hammer to drive in a screw. We should be trying to find a screwdriver.

  3. Sigh. I favor magical unicorns flying in on rainbow wings to lick my head and make my hair magically grow back. But it's not going to happen. I agree that the tests she describes would be useful, but we don't have those tests, and we are never, ever, ever, EVER going to have those tests.
    By arguing for the elimination of standardized testing, you’re the one who’s wishing for magical unicorns, not me. In this country testing will be eliminated right around the time that your hair grows back ☺. This is a democracy, and if teachers, parents, students, universities, teachers' unions, and civil rights groups advocate for better tests, government will respond. This type of coalition was the impetus for PARCC and Smarter Balanced. Are those tests good enough? Absolutely not. Are they better than what we previously had in MD? Yes. So we keep pushing for better tests, and the test creators get a little closer. Slowly, slowly, slowly. And if the testing industry time after time is unable to get us the high quality tests that we demand, then your position (eliminate testing) begins to look reasonable rather than radical. But right now, in this climate, my advocacy is more practical than yours. The ‘better testing’ coalition I mentioned earlier stands in contrast to a vocal but small minority of ‘anti-testers’. Eliminating testing altogether is a non-starter, because the business community is opposed to it. I don’t like that fact, but I acknowledge it as a political reality. In our current system Pearson/ETS and the US Chamber of Commerce have to be negotiated with – they can’t just be opposed unilaterally. I’m more satisfied with gradual progress than none at all. But history tells us that your coalition will be remembered as the people who push progress faster, and I think that radicals play an important role in any type of social progress, so I’m okay with the fact that this discussion won’t change your mind – I just want to convince you that our two positions complement each other, and that we’re better off pushing against a common foe.
    Finally, back to your original point about how we’ll never, ever get tests that are good enough, I actually think that there might be a small chance that you’re right here, that standardized assessments can’t measure real, rigorous learning, and I definitely think that they’re not appropriate for all students. But for a majority of students, they can be appropriate. I think that teaching students to be metacognitive and asking students to explain how they think and answer problems is a step in the right direction, testing-wise, and helps us get closer to assessing learning.

    Replies
    1. "And if the testing industry time after time is unable to get us the high quality tests that we demand, then your position (eliminate testing) begins to look reasonable rather than radical."

      So how long do they get to keep trying? It's been a dozen years since NCLB. We had standardized tests long before that. Haven't yet gotten that magical "better" test, and it's not for lack of trying. Give it another dozen years? Two dozen? Hundred? Just how long do we have to keep up this farce and harming our kids?

    2. I agree that standardized tests will never disappear-- at least not as long as somebody can make money selling them. It's okay; I've also finished grieving my hair. My realistic goal would be that they are relegated to the kind of irrelevance that they deserve-- we'd still be giving them, but nobody would be paying attention, which I think is the natural outcome, and the people who insist on attaching high stakes to them know that. I mean-- if we all really believed that these tests were showing worthwhile information, would we have needed the feds to force everyone to attach such high stakes to testing?

      As I think I've already said elsewhere in these comments, my opposition to standardized tests rests on a belief that it simply can't be done. There is no way to produce a standardized test that can be administered and scored on a national scale and which provides any useful information that cannot be more accurately, easily and cheaply collected in other ways.

      It's using a ten-foot stepladder to climb to the moon. It's so not going to happen, so "closer" really doesn't mean anything.

  4. Okay, I lied, there’s a third thing that I want to continue to talk about.

    Your comment: Do I need to compare my performance as a husband to that of other husbands to know whether I have a good marriage or not, or can my wife and I depend on our own judgment of our own circumstances. Every student should get a good education, and that means something different in every situation. Comparison has nothing to do with it.

    Most wife beaters will say that they have pretty good marriages too. I’m not saying you’re a wife beater, but I’m saying that not all definitions of ‘good’ – in marriage and in education – are equally valid. (I’ve met history teachers who argue that using the textbook as their only instructional resource is ‘good’ history teaching.) If you’re going to disagree with someone about what constitutes ‘good’ you need a common definition of what ‘good’ is so that you can compare that marriage/education to the common definition. No single definition of a ‘good’ education will work for our 50 million public school students, but for most students, say, those involved in your average 10th grade math class, we can say that there are certain skills that students need to have in order to show that they have learned the material. (This brings up a related topic – who gets to say what the definition of ‘good’ is, and for too long, classroom educators have played only a minimal role in that debate – we need more teacher input on standards, and we need a wider set of standards that acknowledges the needs of different students.) So, in response to your point about comparison being useless, I’m refining my viewpoint and my wording – maybe the important thing isn’t comparison between groups of students, but comparison to a commonly accepted standard. But as I type that I still have a sense in my gut that I don’t fully believe it. The fact that poor, black and Latino students don’t meet standards at the same level of frequency as white students IS an issue, and it shouldn’t be ignored. Any conversation about equity begins with recognizing disparities, and if we don’t have comparative data, we can’t have a productive conversation that addresses the achievement gap.

    If I can presume to summarize our differences in opinion, neither of us likes standardized testing in its current form, but while you think that no good can ever come from it, I believe that we should improve common assessments, and recognize that they’ve helped encourage a much-needed conversation about achievement and equity in this country. That conversation hasn’t led us to widespread improvements yet, but without data as a starting point, we can’t have meaningful, productive conversations at all.

    Replies
    1. You're right about the part radicals have played in social progress. But wrong, I think, about the promise of better standardized testing. Standardized tests now claim to measure critical thinking, which is even more dubious than the claim that they measure learning. The old tests had directions like: "Find the perimeter of the rectangle below." The fourth-grade test taker had to have memorized some facts, like what a perimeter is and what procedure you would use to calculate perimeter, given some information about a shape. Now, however, we do not value mere memorization (too low on Bloom's taxonomy), so we instead test to see whether the test-taker can avoid choosing a plausible but incorrect answer when asked to analyze a text or made-up scenario. If they choose correctly, they then choose the statement that "best explains" the previous answer (which represents the test-designers' thinking, not their own). The tests become lists of trick questions, instead of straightforward assessments of content knowledge. And the test prep required to succeed on the new tests is even less likely to lead to any useful real-world skills or knowledge than the old test prep was. We are spending millions of dollars on tests that are worse, with less construct validity, and that is money that is not being spent on creating better and more equitable curriculum, instruction, facilities, and services for students.

    2. CDE, your point about the lack of an objective definition for "good" is well taken. It's true that my definition of "good" might not match anyone else's, but if it matches my wife's, and we both think we're in a good marriage, I'm not sure I see the value in a disconnected third party to come in and tell us differently. But this is a point worth gnawing on (which probably means another post).

      I think your refinement is important-- to me there is a huge difference between comparing students to each other and comparing them to a separate standard. I think comparison to each other is pointless, and inevitably leads to picking winners and losers, which has no utility in education. The problem with comparing to a separate standard is that such a standard is inevitably sold as "objective" and it never is, ever, ever, ever. Standardized testing has therefore too often led us to check to see if poor urban black kids can imitate well-off non-urban white kids.

      I guess what it ultimately comes down to for me is that I do not believe in an objective test or an objective standard, and therefore any system that's based on finding such a thing will be a lie.

      I also agree completely with gkm001-- I'd say that's exactly how we've ended up with tests that are supposed to be better but are actually worse.

    3. The thing about comparison to a separate standard is that it inevitably becomes comparison to each other. On a criterion-referenced test, it is at least theoretically possible that all students would meet or exceed the criteria and pass the test. But if ever there was a test that all students passed, it would immediately be declared an invalid test. So the questions and the cut-scores would be reworked so that a certain percentage would fail, which essentially turns it into a norm-referenced test.

    4. Peter, I think you did address the question of what is "good" in your post about meritocracy from November 2014.

    5. I've always said there's no such thing as an "objective" test. The best you can try for is fair, and you can't even come close to that except for students you actually know.

    6. And I think here the question of "good" is related to the question of what "An Educated Person" should know from your post of that name from May 2014.

  5. All studies show that standardized test scores only correlate consistently with socio-economic level. So we don't need more testing, we already know what the problems are and where they are. They all have to do with poverty.

  6. I have to thank everyone for engaging here-- my comments section is usually pleasant but sleepy. What a treat to come home and find such an engaged and thoughtful conversation sprung up here.

  7. It sounds as though we do all agree about something: all kids deserve educational opportunities, and those should be afforded equitably. This is obviously not our reality right now. In my blue-collar, mostly Latino, inner-ring suburb of Chicago, the high school offers a maximum of 16 academic credits for most students, a 5-period school day, no opportunity to retake failed classes during the regular school day, a few clubs and a marching band, and just a smattering of electives (including Spanish, the only language offered), for which you will only have room in your schedule if you cut out one or more academic classes. The much wealthier, mostly white, adjacent suburb provides a 7-period school day; French, German, Italian, Latin, Japanese, and Mandarin Chinese, as well as Spanish; courses in philosophy, sociology, and creative writing; a gospel choir, two jazz bands, two concert orchestras, a symphony orchestra, a dance team, a television studio, and over 60 clubs. Plus it has more counselors, tutoring sessions, and health services, and a lower student-teacher ratio, than our school has.

    Do we need data in order to start to have a conversation about equity here? Let's say the answer is yes, we do need data. Well, we have a couple of decades' worth of state tests telling us that the wealthy suburbs' students have better test scores. Is our most pressing need right now for even more data? Why?

  8. I think it's great that the author engaged here, so I'd like to ask Ms. Duncan Evans some questions.

    - Do you believe the federal government is justified in mandating that all US children need to hit the same benchmarks by a certain age, regardless of the pace of their individual physical, cognitive and emotional development?

    - Do you endorse standards written for the nation's top achievers but then forced on all students, regardless of their actual functioning level?

    - What do you think of test data in a school district where 50% of students refused the test?

    - Do you endorse NYC's policy to use Math or ELA test scores for 20-40% of the evaluations of teachers of science, social studies, art, gym, music, drama, technology, foreign language, or health?

    - Do you think that children given tests far above their functioning level should randomize their guesses, for example spelling their name in the grid in order to try getting a 2 if they are lucky enough, or should they bubble the answer sheets in a straight line, ensuring a 1, but guaranteeing they get at least some answers right? If so, do you advise rows B and C?

  9. The dispositive argument against our public school test obsession may well be the current policy memo of the National Education Policy Center, "Reauthorization of the Elementary and Secondary Education Act: Time to Move Beyond Test-Focused Policies." A petition supporting the policy memo has been signed by 1,300 educational researchers, with the period available for signing-on open until Thursday, February 20, 2015. Here's the URL: http://nepc.colorado.edu/publication/esea

  10. I don't know if this is moderated or not. I posted a long response but it didn't make it on.

    Jerry: signed.

    CDE: The regime of test-based evaluation is based on seriously flawed assumptions.

    1. Evaluating schools based on test scores does not measure the effectiveness of schools. Standardized tests like Smarter Balanced or PARCC take a snapshot of a student's ability on a given day on a specific, narrowly defined set of skills. Interpreting the results beyond that is a misapplication of the results. These tests are not a reliable indicator of school or teacher effectiveness. There is no research base that supports the position that they are.

    2. What is the purpose of standardized tests? Do they measure the prospects of future success? Again, research shows that the best indicator of future post-secondary success is HS grade point average (which undermines the "teacher's grades are inconsistent" argument, by the way).

    3. If standardized tests don't measure school or teacher effectiveness or predict future success, what is their purpose? There are much less expensive and less obtrusive assessments, formative assessments, that are superior indicators and guides for instruction and intervention than large-scale standardized tests.

    Please, let's stop with the false, uninformed rhetoric about the need for standardized testing. The propaganda campaign is just about as annoying as the test itself.
