Harvard Civil Rights Project No Child Left Behind Study
(July 29, 2013)
Prefatory note: This subpage of the Educational Disparities page of jpscanlan.com discusses a 2006 study by the Civil Rights Project of Harvard University that found larger relative racial differences in proficiency rates on tests with lower overall pass rates than on tests with higher overall pass rates. The study did so without recognizing that whenever one test has substantially lower pass rates than another, the test with lower pass rates will almost invariably show larger relative differences in pass rates, but smaller relative differences in failure rates, than the test with higher overall pass rates.
The Educational Disparities page and its Disparities by Subject subpage discuss the problematic appraisal of racial and ethnic differences in educational outcomes undertaken without recognizing the way standard measures of differences in outcome rates tend to be affected by the prevalence of an outcome. As I have explained in many places, and most comprehensively in one place in the Harvard University Measurement Letter, the rarer an outcome the greater tends to be the relative difference in experiencing it and the smaller tends to be the relative difference in avoiding it. Thus, for example, tests with a high cutoff will tend to show larger relative differences in pass rates, but smaller relative differences in failure rates, than tests with a low cutoff; relative differences between the rates at which a demographic group passes a test with a high cutoff and a test with a low cutoff will be smaller for the higher-scoring group than for the lower-scoring group, while relative differences between the rates at which a demographic group fails a test with a high cutoff and a test with a low cutoff will be larger for the higher-scoring group than for the lower-scoring group. Absolute differences between rates tend also to be affected by the prevalence of an outcome, and the failure to understand such patterns substantially undermines many analyses of educational disparities. But absolute difference issues are not important to the subject of this subpage.
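The prevalence pattern described above can be illustrated with a minimal sketch. It assumes, purely for illustration, two groups whose scores are normally distributed with equal standard deviations and means half a standard deviation apart; the group parameters, the cutoffs, and the function name `ratios` are hypothetical and are not drawn from the study.

```python
from statistics import NormalDist

# Hypothetical setup: two groups with normally distributed scores, equal
# standard deviations, and means half a standard deviation apart.
higher = NormalDist(mu=0.0, sigma=1.0)   # higher-scoring group
lower = NormalDist(mu=-0.5, sigma=1.0)   # lower-scoring group


def ratios(cutoff):
    """Return (pass-rate ratio, failure-rate ratio) at a given cutoff."""
    pass_hi = 1 - higher.cdf(cutoff)
    pass_lo = 1 - lower.cdf(cutoff)
    # Higher-scoring group's pass rate over lower-scoring group's pass rate.
    pass_ratio = pass_hi / pass_lo
    # Lower-scoring group's failure rate over higher-scoring group's failure rate.
    fail_ratio = (1 - pass_lo) / (1 - pass_hi)
    return pass_ratio, fail_ratio


# Raise the cutoff step by step (that is, make passing rarer).
for cutoff in (-1.0, 0.0, 1.0):
    pass_ratio, fail_ratio = ratios(cutoff)
    print(f"cutoff {cutoff:+.1f}: pass ratio {pass_ratio:.2f}, fail ratio {fail_ratio:.2f}")
```

Under these assumptions, the ratio of pass rates grows and the ratio of failure rates shrinks as the cutoff rises, even though the gap between the two groups' means never changes.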
In 2006 the Civil Rights Project of Harvard University issued a study titled “Tracking Achievement Gaps and Assessing the Impact of NCLB on the Gaps: An In-Depth Look into National and State Reading and Math Outcome Trends,” authored by Jaekyung Lee of the University at Buffalo’s Graduate School of Education. The study attempted to appraise the effect of the No Child Left Behind Law (NCLB) on public school achievement scores and proficiency rates and on racial gaps in those scores and rates. This subpage offers no view on the validity of the study’s appraisals concerning achievement gaps or the study’s analyses of trends as to any matter. But the study, like all other efforts to appraise racial or ethnic differences as to some outcome, fails to consider the implications of the overall prevalence of an outcome for the measure employed.
That failure is most pertinent to the study’s appraisals of racial and ethnic differences in achieving proficiency in reading and mathematics according to state tests and National Assessment of Educational Progress (NAEP) tests. The latter are much more difficult (have much lower overall pass rates) than the former and hence will commonly show larger relative differences in pass rates but smaller relative differences in failure rates than the former. The study, like virtually all research into demographic differences in favorable or adverse outcome rates, shows no understanding of this pattern.
The study principally analyzes the matter by comparing the size of the relative difference between white pass rates on state tests and white pass rates on NAEP tests with the size of the relative difference between minority pass rates on state tests and minority pass rates on NAEP tests. As suggested in the prefatory note, and discussed at pages 17-18 of the Harvard letter and on the Subgroup Effects sub-page of the Scanlan’s Rule page, a pattern by which, compared with disadvantaged groups, advantaged groups will tend to show smaller relative differences in favorable outcomes, but larger relative differences in adverse outcomes, in settings with different overall adverse (or favorable) outcome rates is simply one manifestation of the pattern whereby the rarer an outcome the greater tends to be the relative difference in experiencing it and the smaller tends to be the relative difference in avoiding it.
Figures 14 and 15 of the study (at 48) present in bar graph form the rates at which 4th graders of various racial/ethnic groups achieve proficiency in reading and math according to state tests and NAEP tests. Table 7 then shows the ratios of the state proficiency rates to the NAEP proficiency rates for each group (as well as for 8th graders, though the underlying figures are not made available for the 8th graders).
It is not possible to precisely determine the rates reflected by the bar graph. But Table 1 below shows proficiency rates estimated on the basis of the bars in Figures 14 and 15, as informed by the rate ratios in the study’s Table 7. That is, visual estimates of the rates reflected in the bar graph were adjusted in a manner that would result in rate ratios that closely approximated those in Table 7. The process is inexact but not to a degree that would bear materially on any point made below.
Table 1 then presents for whites and blacks (a) the ratios of rates of proficiency on state tests to rates of proficiency on NAEP tests (as calculated based on the proficiency rates shown in Table 1, and which differ slightly from the rate ratios in the study’s Table 7), (b) the ratios of rates of failing to achieve proficiency according to NAEP test standards to rates of failing to achieve proficiency according to state tests, and (c) the difference between underlying means in each setting derived from the proficiency rates (EES for estimated effect size, which measure is explained, among other places, on the main Educational Disparities page).
The table shows that, just as would commonly occur when the prevalence of an outcome differs in two settings and as would almost invariably occur when the prevalence of an outcome differs greatly in the two settings, in the setting where the favorable outcome is more common (i.e., among whites) the relative difference in the favorable outcome is smaller, while the relative difference in the adverse outcome is larger, than in the situation where the favorable outcome is less common (i.e., among blacks).
Table 1. Black and White Proficiency Rates in Fourth Grade Reading and Math According to State Tests and NAEP Tests, with Ratios for Each Race of Rates of Achieving Proficiency on State Tests to Rates of Achieving Proficiency on NAEP Tests, Ratios for Each Race of Rates of Failing to Achieve Proficiency on NAEP Tests to Rates of Failing to Achieve Proficiency on State Tests, and Estimated Effect Size (EES) of Differences Between Proficiency Rates
S/N Prof Ratio
N/S NonProf Ratio
It is the former figure (the ratio of state proficiency rates to NAEP proficiency rates) on which the study relies for the conclusion (at 49) that discrepancies between state and NAEP proficiency are largest for minorities. But the report fails to recognize that, solely for statistical reasons, one will almost invariably find such a pattern, as well as an opposite pattern as to the opposite outcome.
The final column shows an estimate of the difference between the settings based on deriving from the outcome rates the difference between means of the underlying distribution. This shows that the difference between types of assessment is indeed larger for blacks than for whites, though the difference between the sizes of the differences for blacks and whites is a good deal smaller than would be suggested by the rate ratios in the study’s Table 7. As discussed in the Educational Disparities page, however, when actual scores are available they provide the best means of illustrating patterns of disparities.
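One way to read the EES calculation is as a difference between probits, that is, between the standard-normal z-scores corresponding to two outcome rates. The sketch below assumes normal underlying distributions with equal standard deviations, which is an interpretation of the description above rather than a confirmed statement of the method; the function name `ees` and the rates used are hypothetical and are not the Table 1 estimates.

```python
from statistics import NormalDist


def ees(rate_a, rate_b):
    """Estimated effect size: the gap, in standard deviations, between the
    means of two normal distributions (equal SDs assumed) that would produce
    the two outcome rates at a common cutoff."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(rate_a) - z(rate_b)


# Hypothetical proficiency rates for one group on two tests.
state_rate, naep_rate = 0.70, 0.40
print(f"EES from proficiency rates: {ees(state_rate, naep_rate):.3f}")

# The same figure results when the corresponding failure rates are used,
# so the measure does not depend on which outcome is examined.
print(f"EES from failure rates:     {ees(1 - naep_rate, 1 - state_rate):.3f}")
```

Unlike the two relative differences, a measure built this way gives the same answer whether one examines the favorable or the adverse outcome.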
It is important to recognize that, whereas in the instant situation the relative difference in the favorable outcome was consistent as to direction with the EES, one would commonly observe the same pattern (as well as the contrasting patterns as to relative differences in the adverse outcome) when the EES figure was the same as to both settings and even when the EES figure was somewhat larger for whites than for blacks.
(I do not want to complicate this item with substantial discussion of absolute differences and hence do not present those figures in Table 1. I note, however, that observers are increasingly relying on absolute differences as a measure of disparities in proficiency rates, among many other things, and the absolute difference between state and NAEP proficiency rates is the same for whites as for blacks in reading (26 percentage points) and only slightly larger for blacks than whites for math (37 versus 33 percentage points). But as discussed at pages 18-24 of the Harvard letter, the absolute difference is no better a measure of association than either of the two relative differences, and reliance upon the absolute difference is lately causing as much confusion as reliance on either relative difference.)
While the study did not present detailed information on the size of between-group disparities in the two settings, it did rely on such information to quantify the difference in the size of the disparities in the two settings (which it does in terms of ratios of ratios in the bottom part of Table 7). For that reason (and in order to illustrate another methodological issue), Table 2 below presents information on differences between the rates of blacks and whites according to the two methods of assessment. For reference, I note that the approaches in Tables 1 and 2 here may be compared to the approaches in Tables 2a and 2b on the main Educational Disparities page.
The estimates of rates that appeared approximately consistent with the ratios shown in the top part of the study’s Table 7 fail to provide as close approximations of the ratios of rate ratios in the bottom part of Table 7 or the rate ratios discussed in the text of the study at 50-51 (and discussed below). This suggests that my estimates of the proficiency rates shown by bars in Figures 14 and 15 may be somewhat inexact. But, again, discrepancies ought not to materially affect any point made here.
Table 2. Black and White Proficiency Rates in Fourth Grade Reading and Math According to State Tests and NAEP Tests, with Ratios of White Proficiency Rate to Black Proficiency Rate for Each Test, Ratios of Black Rate of Failing to Achieve Proficiency to White Rate of Failing to Achieve Proficiency for Each Test, and Estimated Effect Size (EES) of Differences Between Proficiency Rates
W/B Prof Ratio
As with Table 1, Table 2 shows the common pattern whereby, for both subjects, the relative difference in the favorable outcome is smaller in the setting where the favorable outcome is more common (state assessment) while the relative difference in the adverse outcome is larger in that setting. In accordance with the pattern of EES figures shown in Table 1, such figures reflecting the black-white differences are larger for NAEP tests than for state tests.
(As explained in the note attached to the paragraph preceding Table 2 in the Subgroup Effects sub-page of the Scanlan’s Rule page (and on the main Educational Disparities page), there is no reason for the EES figures regarding between-test figures for each racial group and the EES figures for between-race figures for each test to match. But there is reason to expect the difference between the EES figures in each pair of EES figures shown for each subject to match. Thus, the difference between the two EES figures for Reading in Table 1 (.175 standard deviations) approximates the difference between the two EES figures for Reading in Table 2 (.165 standard deviations), with the difference between the two figures being a function of rounding or other inexactness in the calculations. Similarly, the difference between the two EES figures for Math in Table 1 (.31 standard deviations) approximates the difference between the two EES figures for Math in Table 2 (.32 standard deviations), again with the difference between the two figures being a function of rounding or other inexactness in the calculations.)
With respect to the quantification of the size of racial gaps shown by the state and NAEP tests, in a note to Table 7, the study says that it is calculating an odds ratio. But it then describes a calculation of a ratio of rate ratios (with the smaller ratio as the numerator), and it appears to be on the basis of the latter that the study in fact attempts to quantify the difference between the racial gaps in the two settings.
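The reason the two differences should match can be checked with a short sketch. If each EES figure is taken to be a difference between z-scores (an assumption about the calculation, as is the normality it implies), the between-test gap and the between-race gap are algebraically the same quantity. The four rates below are hypothetical and are not the Table 1 or Table 2 estimates.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # probit: rate -> standard-normal z-score

# Hypothetical proficiency rates by race and test.
rates = {
    ("white", "state"): 0.75, ("white", "naep"): 0.40,
    ("black", "state"): 0.50, ("black", "naep"): 0.12,
}

# Between-test EES for each race (the approach of Table 1).
between_test = {
    race: z(rates[race, "state"]) - z(rates[race, "naep"])
    for race in ("white", "black")
}

# Between-race EES for each test (the approach of Table 2).
between_race = {
    test: z(rates["white", test]) - z(rates["black", test])
    for test in ("state", "naep")
}

# Difference within each pair of EES figures.
gap_table1 = between_test["black"] - between_test["white"]
gap_table2 = between_race["naep"] - between_race["state"]
print(f"{gap_table1:.4f} {gap_table2:.4f}")
```

Both gaps reduce to the same combination of the four z-scores, so any mismatch between figures computed this way can only come from rounding or from inexact input rates.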
But a ratio of rate ratios does not provide a basis for quantifying the difference in the relative differences. The problem lies in a conflation of the rate ratio (RR), also termed relative risk, with the relative difference that the figures represent (a conflation that is not uncommon, and may be found, among other places, in the University of Michigan Measuring Health Disparities online course). The relative difference is actually RR – 1 where RR is above 1 and 1 – RR where RR is below 1.
Thus, for example, suppose that in situation A the white rate is 40% and the black rate is 30%, and in situation B the white rate is 40% and the black rate is 20%. The rate ratios would be 1.33 in situation A and 2.0 in situation B and the ratio of rate ratios would be .665, which the study would regard as reflecting the fact that the gap in situation A is two-thirds of the gap in situation B. But the white rate is 33.3% greater than the black rate in situation A and 100% greater than the black rate in situation B. Thus the former relative difference is one-third, not two-thirds, of the latter relative difference.
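The arithmetic of this example can be reproduced with a short sketch. The function names are hypothetical; the rates are those of situations A and B above.

```python
def rate_ratio(higher_rate, lower_rate):
    """Ratio of the higher rate to the lower rate (RR)."""
    return higher_rate / lower_rate


def relative_difference(rr):
    """RR - 1 where RR is above 1, and 1 - RR where RR is below 1."""
    return rr - 1 if rr >= 1 else 1 - rr


rr_a = rate_ratio(0.40, 0.30)  # situation A: white 40%, black 30%
rr_b = rate_ratio(0.40, 0.20)  # situation B: white 40%, black 20%

# What the study's approach compares: the rate ratios themselves.
ratio_of_rate_ratios = rr_a / rr_b

# What a comparison of gaps requires: the relative differences.
ratio_of_relative_differences = relative_difference(rr_a) / relative_difference(rr_b)

print(f"ratio of rate ratios:           {ratio_of_rate_ratios:.3f}")
print(f"ratio of relative differences:  {ratio_of_relative_differences:.3f}")
```

Computed without intermediate rounding, the ratio of rate ratios comes to about .667 rather than .665; the latter figure results from rounding the first rate ratio to 1.33 before dividing.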
In terms of actual figures, the study (at 50) refers to a 1.8 white-black gap in grade 4 math according to the state test. Though it then characterizes the matter as reflecting a situation where the white rate is “1.8 times greater” than the black rate, I assume that the study in fact means that the white rate is 1.8 times as great as the black rate, which is to say that it is 80% greater. Technically, the ratio should not be described as the gap just as it should not be described as the relative difference. (The characterization of a figure as 1.8 times greater rather than 1.8 times as great raises a somewhat different, though related, presentation issue, which is discussed at length on the Times Higher subpage of the Vignettes page.) In any case, after then describing (at 51) a 4.3 rate ratio for the NAEP assessment in similar terms, the study (at 51) references an “M” term of .45 (which also appears in the bottom part of Table 7) and states that the state gap is only half of the NAEP gap. Given those figures, the correct statement would be that the former gap is only 24% of the latter gap (i.e., .8 over 3.3 rather than 1.8 over 4.3).
I note that, apparently because of imprecision in my deriving the proficiency rates from Figure 15, the corresponding rate ratios shown in Table 2 are 1.57 and 4.0 rather than 1.8 and 4.3. Further, I calculate the ratio of 1.8 over 4.3 as .42 rather than .45. But I do not believe these discrepancies are of consequence to the point that the study incorrectly regarded proportionate differences between rate ratios as proportionate differences between the relative differences that the rate ratios represented.
I belabor the above technical points partly because of an interest in clarity of presentation (as reflected in the above-referenced Times Higher subpage and other subpages to the Vignettes page) but also in order to have a working figure for comparing the difference between the racial gaps according to each test with the difference between the EES figures. According to the EES figures, the racial gap in math is indeed larger on NAEP tests than on state tests (1.03 standard deviations compared with .71 on the state tests). But the impression as to the size of the difference between differences is not the same as one would form based on comparisons of relative differences.
The same measurement issues would apply to calculating the differences between the relative differences in failure rates. In the case of failure rates, however, the larger differences are for the state tests than for the NAEP tests.
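The same comparison can be run on the rate ratios quoted above: those the study reports (1.8 and 4.3) and those derived from the estimated rates (1.57 and 4.0). This is only a sketch of the arithmetic; the function name `compare` is hypothetical.

```python
def compare(state_rr, naep_rr):
    """Return (ratio of rate ratios, ratio of relative differences) for two
    rate ratios that are both above 1."""
    ratio_of_rate_ratios = state_rr / naep_rr
    ratio_of_relative_differences = (state_rr - 1) / (naep_rr - 1)
    return ratio_of_rate_ratios, ratio_of_relative_differences


pairs = {
    "study (at 50-51)": (1.8, 4.3),
    "estimated rates": (1.57, 4.0),
}

for label, (state_rr, naep_rr) in pairs.items():
    rr_ratio, rd_ratio = compare(state_rr, naep_rr)
    print(f"{label}: ratio of rate ratios {rr_ratio:.2f}, "
          f"ratio of relative differences {rd_ratio:.2f}")
```

With either pair of rate ratios, the ratio of relative differences is well below the ratio of rate ratios, which is the point at issue regardless of the imprecision in the estimated rates.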
Finally, as discussed on the Times Higher subpage, the attention I give in various places to characterizations of relative differences in outcome rates should not be regarded as suggesting that either rate ratios or the relative differences they represent are useful measures of association. Both are in fact illogical measures of association, as discussed on the Illogical Premises and the Illogical Premises II subpages of the Scanlan’s Rule page, and recently in the Comment on Hingorani BMJ 2013.