Are New Jersey’s new state tests more “honest” than its old ones? Now that the data has been released, we can confidently say: no.
For the last few years, New Jersey’s education policy has focused on the transition from the old NJASK test to the new PARCC (Partnership for Assessment of Readiness for College and Careers), a multi-state test created by a consortium whose membership has dwindled to half of its original states.
Despite a growing resistance to the Common Core State Standards on which the PARCC is based, and a burgeoning opt-out movement here and around the nation, both the New Jersey Department of Education and a group of education “reform” outfits have declared that PARCC is a better test than the NJASK, and vitally necessary to improve student outcomes.
We can debate whether the content standards of the Common Core are better than those of New Jersey’s previous standards. We can debate whether PARCC is a better assessment of learning than the NJASK. I’m hardly a fan of the old test, which, to my knowledge, was never properly assessed for its validity anyway.
But now that we have PARCC school-level results, it’s quite clear that PARCC isn’t telling us anything we didn’t already know about New Jersey schools. And it only takes a couple of graphs to prove it.
Figure 1 is a chart showing the distribution of each school’s average grade 8 English Language Arts (ELA) score for the 2014 NJASK test. The shape is quite close to what statisticians call a “normal” distribution, commonly referred to as a “bell curve.” A few schools scored, on average, high on the NJASK, a few scored low, and most scored in the middle.
If you look at the scores for any tested grade in either math or ELA, you’ll find the same shape. That’s because the tests are designed to yield these bell curves: they have some easy questions, some hard questions, and some in between. When it comes to test scores, normal distributions are … well, normal.
I’ve included a red line in the chart: it shows the “cut score,” the minimum score needed to be rated “proficient.” In 2014, about 88 percent of New Jersey schools tested in grade 8 ELA had an average score at or above proficient (to the right of the red line).
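The idea is easy to demonstrate. Here is a hypothetical sketch in Python (the mean, spread, and cut score below are invented for illustration, not the actual NJ figures): a bell curve of school averages, and the share of schools clearing a cut.

```python
import random
import statistics

random.seed(0)

# Hypothetical illustration (invented numbers, not actual NJ data):
# simulate 1,300 school-average scores on a 100-300 NJASK-style scale,
# drawn from a normal distribution, and count how many clear a cut score.
scores = [random.gauss(224, 20) for _ in range(1300)]

cut = 200  # hypothetical proficiency cut score (the "red line")
proficient = sum(s >= cut for s in scores) / len(scores)

print(f"mean school score: {statistics.mean(scores):.1f}")
print(f"share of schools at/above the cut: {proficient:.0%}")
```

With these made-up parameters, a cut score a little over one standard deviation below the mean puts roughly nine schools in ten on the “proficient” side of the line.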
Now let’s jump ahead to PARCC:
Figure 2 shows the distribution for the grade 8 ELA scores in the new test. The first thing you’ll notice is that, once again, the test scores are distributed in a bell curve: a few schools scored high, a few scored low, and most scored in the middle.
You will also notice that the numbers on the horizontal axis are different: they now span from 650 to 850, instead of from 100 to 300 like the NJASK.
This doesn’t matter in the slightest. Changing the scale of a test is like switching from Fahrenheit to Celsius when measuring temperature: it still feels just as cold, whether you call it 32 degrees F or 0 degrees C.
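The temperature analogy can be made literal: converting one scale to another is a linear map, and a linear map preserves every school’s rank. A small sketch (the mapping below is my own illustration, not PARCC’s actual scaling procedure):

```python
# Made-up school averages on a 100-300 NJASK-style scale.
njask_style = [180, 210, 235, 250, 275]

def rescale(x, old_lo=100, old_hi=300, new_lo=650, new_hi=850):
    """Linearly map a 100-300 score onto a 650-850 scale."""
    return new_lo + (x - old_lo) * (new_hi - new_lo) / (old_hi - old_lo)

parcc_style = [rescale(s) for s in njask_style]
print(parcc_style)  # [730.0, 760.0, 785.0, 800.0, 825.0]

# The ordering of schools is untouched by the change of scale.
assert sorted(njask_style) == njask_style
assert sorted(parcc_style) == parcc_style
```

Since both ranges here happen to span 200 points, this particular map is just a 550-point shift; the general point holds for any linear rescaling.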
The important thing to know is that the school-level test scores are still distributed the same way. And the high-scoring schools remain high-scoring, while the low-scoring schools continue to score low. My analysis shows that 80 percent of the variation in grade 8 ELA PARCC scores for 2015 can be statistically explained by the 2014 NJASK scores; other grades are similar.
It’s also worth noting that PARCC scores, like NJASK scores, are highly correlated with student characteristics. If you want to guess how a school will score, look at the percentage of its students eligible for free or reduced-price lunch; that one number will make your prediction far more accurate.
The only thing that’s really changed between the scores of the old and new tests is where the cut score for proficiency now lies. In 2015, about 50 percent of schools reporting grade 8 ELA got an average PARCC score that is above the cut score for proficiency. Notice how the red line has shifted from 2014 to 2015: less proficiency, even though the shapes of the distributions are the same.
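The mechanics are easy to demonstrate. In the hypothetical sketch below (invented numbers, not the real distributions), the score distribution is held fixed and only the cut moves; the “proficiency rate” swings from roughly 88 percent to roughly 50 percent anyway.

```python
import random

random.seed(2)

# Hypothetical sketch (invented numbers): one fixed bell curve of school
# averages on a 650-850-style scale, judged against two different cuts.
scores = [random.gauss(750, 20) for _ in range(1300)]

cuts = {"lenient": 726, "strict": 750}
shares = {name: sum(s >= c for s in scores) / len(scores)
          for name, c in cuts.items()}

for name, c in cuts.items():
    print(f"{name} cut ({c}): {shares[name]:.0%} of schools 'proficient'")
```

Same schools, same bell curve; the only thing that changed is where the red line was drawn.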
Is the new cut score more “honest”? PARCC promoters say it represents “college and career readiness,” but I contend it’s really showing “four-year college readiness.” And I must have missed the meeting where we decided all students should earn a bachelor’s degree; it seems to me that over-credentialing workers who do necessary work but don’t need four years of college would be an enormous waste of resources.
Again: we can debate that point. What we shouldn’t do is pretend that the drop in New Jersey’s proficiency rates has much of anything to do with the switch to the PARCC. Both the old and the new tests yield bell curve distributions; the choice of where to set the cut point for proficiency has nothing to do with that.
There are some concerns about the PARCC going forward. A recent report suggests students who take the computerized version of the test score worse than those who take it on paper. That matters for accountability purposes. There is also a real concern that the NJASK had a “ceiling effect,” which actually makes the PARCC invalid for use in teacher evaluation; more on that in a future column.
But as for the 2015 scores: there’s little evidence that the PARCC is more “honest” than the NJASK. The state could have raised the proficiency cut score on the NJASK and gotten almost the same results as the PARCC. High-scoring schools would still score high, and they would still be much less likely to enroll large percentages of economically disadvantaged children.
And that’s the honest truth.