Note: This page (especially its Section A) is closely related to the Lending Disparities page of this site. Both involve situations where federal enforcement policy is based on a statistical perception that is the exact opposite of reality. Both situations are analogous to one in which the federal government would pressure employers to lower cutoffs on their employment tests because of the impact of those tests on minorities – which, as any competent statistician will advise, will tend to reduce relative differences in pass rates but increase relative differences in failure rates – and then single employers out for litigation based on the size of relative differences in failure rates caused by their tests. Both also involve situations where, assuming the involved agencies in fact come to understand the issues (see Duncan/Ali Letter subpage of this page and Holder/Perez Letter subpage of the Lending Disparities page), their concerns about public perceptions of their expertise may conflict with their interests in rationally executing their law enforcement responsibilities. Subsequent to the initial creation of this page, some of the issues it raises were addressed in “Racial Differences in School Discipline Rates” (The Recorder, June 22, 2012), “The Paradox of Lowering Standards” (Baltimore Sun, Aug. 5, 2013), and “Misunderstanding of Statistics Leads to Misguided Law Enforcement Policies” (Amstat News, Dec. 2012), the last of which pertains to perceptions about both lending disparities and discipline disparities.
Both the discipline disparities and lending disparities issues were addressed to some extent in an October 17, 2012 presentation titled “The Mismeasure of Group Differences in the Law and the Social and Medical Sciences” at an Applied Statistics Workshop of the Institute for Quantitative Social Science at Harvard University and in a somewhat similar presentation at a September 25, 2012 Department of Mathematics and Statistics Colloquium at American University. They are also discussed at pages 13-14 of the October 9, 2012 Harvard University Measurement Letter.
This page addresses four somewhat related matters, aspects of which involve perceptions that are the exact opposite of reality. Section A addresses the perception that large racial differences in public school discipline rates are the consequence of stringent discipline standards when in fact the more stringent the policy the smaller will tend to be the relative difference in discipline rates. Section B addresses perceptions about disparities in adverse outcomes in employment settings that were discussed in “Getting it Straight When Statistics Can Lie” (Legal Times, June 23, 1993). Section C addresses perceptions about racial differences in mandatory life sentences and other adverse outcomes in the criminal justice system that were discussed in “Mired in Numbers” (Legal Times, Oct. 12, 1996). Section D addresses perceptions about the racial impact of the NCAA academic eligibility standards that were discussed in “An Issue of Numbers” (National Law Journal, Mar. 5, 1990) and “The Perils of Provocative Statistics” (Public Interest, Winter 1991).
A. Racial Disparities in Public School Discipline Rates
There are many remarkable examples of misinterpretations of data on demographic differences as a result of the failure to recognize the pattern whereby the rarer an outcome the greater tends to be the relative difference in experiencing it and the smaller tends to be the relative difference in avoiding it. In Sections B.1 and B.2 of the Scanlan’s Rule page and the Feminization of Poverty page, I cite perceptions about the so-called “feminization of poverty” as one of the more remarkable of these examples, because the pattern is so evident in the census data that researchers analyze in discussing the issue. That is, published income data on proportions of the population falling below various ratios of the poverty line should make it evident that reducing poverty, including the poverty of female-headed families, will tend to cause those families to comprise a larger proportion of the poor than they previously did, while increasing poverty will do the opposite. See Table 1 and Figure 1 in “Can We Actually Measure Health Disparities?” (Chance 2006), which present patterns by race that census data also show by household type.
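The pattern described above can be illustrated with a minimal Python sketch. It is not drawn from any data discussed on this page: the two groups, their normal distributions, the half standard deviation gap between their means, and the cutoffs are all hypothetical assumptions chosen only to show the tendency.

```python
from statistics import NormalDist

# Hypothetical setup (an assumption for illustration, not data from this page):
# two normally distributed groups with equal standard deviations whose means
# differ by half a standard deviation.
advantaged = NormalDist(mu=0.5, sigma=1.0)
disadvantaged = NormalDist(mu=0.0, sigma=1.0)


def ratios_at_cutoff(cutoff):
    """Return (fail-rate ratio, pass-rate ratio) for a given cutoff.

    fail-rate ratio = disadvantaged fail rate / advantaged fail rate
    pass-rate ratio = advantaged pass rate / disadvantaged pass rate
    """
    fail_adv = advantaged.cdf(cutoff)      # share of advantaged group below cutoff
    fail_dis = disadvantaged.cdf(cutoff)   # share of disadvantaged group below cutoff
    return fail_dis / fail_adv, (1 - fail_adv) / (1 - fail_dis)


# Move from a stringent (high) cutoff to a lenient (low) one,
# so that failure becomes progressively rarer.
cutoffs = [1.0, 0.5, 0.0, -0.5, -1.0, -1.5]
results = [ratios_at_cutoff(c) for c in cutoffs]

for c, (fail_ratio, pass_ratio) in zip(cutoffs, results):
    print(f"cutoff {c:+.1f}: fail ratio {fail_ratio:.2f}, pass ratio {pass_ratio:.2f}")
```

Under these assumptions, each step toward a more lenient cutoff yields a larger ratio of failure rates and a smaller ratio of pass rates. Real distributions need not be normal, so the sketch shows a tendency rather than a guarantee.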
The failure to recognize that more lenient lending criteria tend to increase racial differences in mortgage rejection rates, a principal subject of the Lending Disparities page and the recent “The Lending Industry’s Conundrum” (National Law Journal, Apr. 2, 2012) and “‘Disparate Impact’: Regulators Need a Lesson in Statistics” (American Banker, June 5, 2012), is another quite remarkable example, among other reasons, because so much of the early research in the area was conducted by economists of the Federal Reserve Board. The Board was also among those entities encouraging lenders to relax mortgage lending criteria in the belief that doing so would reduce, rather than increase, rejection rate disparities.
The situation discussed in Comment on Morita Pediatrics 2008 may warrant mention as well, given that the authors, relying on relative differences in favorable outcomes, found dramatic decreases in immunization disparities, while the National Center for Health Statistics (NCHS), relying on relative differences in adverse outcomes, would find dramatic increases in disparities. But everything NCHS, the Agency for Healthcare Research and Quality, and the Centers for Disease Control and Prevention do with regard to the measurement of health and healthcare disparities – all done with complete disregard for the way the measure tends to be affected by the overall prevalence of an outcome – can be deemed remarkable. See Section A.6 of the Scanlan’s Rule page.
Perceptions about the connection between large racial differences in public school discipline rates and stringent discipline standards are also a remarkable example of the failure to appreciate the above-described tendency, given the universality of the perception that stringent standards cause large disparities in discipline rates and the fact that such perception is the exact opposite of reality.[i] Presumably, as with perceptions that more stringent mortgage lending standards yield larger rejection rate disparities, the perception is based on the general notion that stringent standards are associated with large racial differences. The notion is true as to differences in satisfying the standards, but it is false as to differences in failing to satisfy them. Compare the treatment in the Mortality and Survival page of the way health disparities researchers, particularly when studying cancer outcomes, discuss relative differences in mortality and survival interchangeably without recognizing that the two differences tend to change in opposite directions as the overall prevalence of the outcome changes.
In any case, when large disparities in discipline rates were receiving public attention more than a decade ago, as reflected in the unpublished Data and Discipline (2000), the patterns were commonly attributed to the “zero tolerance” policies in effect in recent decades. Similar perceptions about the racial impact of stringent discipline policies led the Departments of Education and Justice in July 2011 to jointly create the Supportive School Discipline Initiative aimed at promoting the exploration of more lenient alternatives to existing school discipline policies.
When, in March of 2012, the Department of Education’s Office for Civil Rights (OCR) released data showing large differences between rates at which minority and white public school students are disciplined for behavior problems, numerous observers again attributed the disparities to “zero tolerance policies,” while calling for more lenient discipline policies. For example, a March 6 New York Times editorial, entirely accepting the connection between large disparities in discipline rates and stringent discipline standards, suggested that OCR should press “school systems with the worst records to develop fair and sensible strategies that involve working with troubled children and their families instead of reflexively showing them the door.”
In early May, Colorado passed legislation easing zero tolerance policies. While the legislation was already in the works when OCR released its disparities data in March 2012, according to a Denver Post account, concerns about racial disparities contributed to its support. According to a March 13, 2012 report of the Center for American Progress, Massachusetts and California were both considering legislation to address the disparities problem by reducing overall discipline rates, and school boards in various jurisdictions were themselves exploring similar approaches. A summary of the legislation introduced in California aimed at both reductions in overall discipline rates and closer monitoring of disparities may be found here. Whether further legislation will be enacted in California or elsewhere (or like measures will be imposed administratively) without recognition by anyone involved in the process that more lenient policies tend to increase racial disparities in discipline rates remains to be seen.
Data made available by OCR themselves illustrate the pattern whereby the rarer an outcome the greater tends to be the relative difference in experiencing it and the smaller tends to be the relative difference in avoiding it. That and certain other aspects of the data are discussed below. I note, however, that like the reportage of OCR’s findings, I am relying on aggregate data. Given the extent to which students in many public schools and school districts tend to be overwhelmingly of one race, one certainly would want to see the data adjusted by school and school district (which, I note, the data collected and published by OCR would seem to make possible). While some may maintain that even if the aggregate disparities are principally or entirely a result of high discipline rates at schools or districts where the students are overwhelmingly minority and where the administrators are largely of the same minority group, one still should want to be able to distinguish between disparities arising from such situations and those arising from decisions individual administrators make about both minority and white students. Appropriate adjustment might also alter somewhat the patterns described below, though unlikely in a way that would affect the key points I make about those patterns.
Table 1 below presents data on three categories of severe discipline – (1) Suspension – Out of School, (2) Total Expulsion, (3) Expulsion – Total Cessation – ordered according to increasing severity. The table shows, by gender, the rate of receiving any form of the discipline ((1), (2), or (3)), the two most severe forms ((2) or (3)), and the most severe ((3)), along with the black/white ratios of rates of receiving the discipline and the white/black ratios of rates of avoiding the discipline.[ii] The final column, termed “EES” for “Estimated Effect Size,” shows the difference between means of hypothesized underlying distributions, in terms of percentages of a standard deviation, derived from each pair of rates, which, for reasons explained on the Solutions subpage of the Measuring Health Disparities page, is the only plausible method for appraising the size of a difference reflected by a pair of rates that is unaffected by the prevalence of an outcome.[iii]
Table 1. Black and White Rates of Discipline by Gender with Ratios of Rates of Experiencing and Avoiding Discipline and Estimated Effect Sizes [ref b2813 c 4]
Several things about the patterns in Table 1 warrant note. To begin with, comparison of the male and female figures reveals the common pattern whereby the relative (racial) difference in adverse outcome rates tends to be greater among female students (where such outcomes are rarer) while the relative difference in avoiding those outcomes is greater among male students.
The EES figures in the final column can, for reference, be compared with the differences in math proficiency shown in Table 2a of the Educational Disparities page. The black and white math proficiency rates shown in the table (respectively, 53% and 81% in 2003 and 81% and 97% in 2006) reflect differences between hypothesized means of .80 standard deviations in 2003 and 1.0 standard deviation in 2009. For further reference, see Figure 4 of the 2009 Royal Statistical Society presentation, which is based on a .50 standard deviation difference between means on test scores, a situation where scores of approximately 30% of the lower-scoring group exceed the mean of the higher-scoring group. That figure also shows how that difference between means translates into contrasting relative differences in favorable and adverse outcomes at different levels of overall prevalence.
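The relationship between a pair of rates and a difference between hypothesized means can be sketched in a few lines of Python. This is a minimal sketch that assumes both underlying distributions are normal with equal standard deviations. The function name is hypothetical, and the sketch is not the online calculator or the Solutions Database referenced on this page, which may compute their values differently.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def estimated_effect_size(lower_rate, higher_rate):
    """Difference between the means of two hypothesized normal distributions
    (equal standard deviations assumed), in standard deviations, implied by
    two groups' rates of experiencing the same outcome."""
    return STD_NORMAL.inv_cdf(higher_rate) - STD_NORMAL.inv_cdf(lower_rate)


# The two pairs of math proficiency rates cited in the text.
ees_first_pair = estimated_effect_size(0.53, 0.81)
ees_second_pair = estimated_effect_size(0.81, 0.97)

# With means half a standard deviation apart, the share of the lower-scoring
# group whose scores exceed the mean of the higher-scoring group.
share_above_higher_mean = 1 - STD_NORMAL.cdf(0.5)

print(f"{ees_first_pair:.2f} {ees_second_pair:.2f} {share_above_higher_mean:.1%}")
```

Under the normality assumption, the two pairs of rates give values close to the .80 and 1.0 standard deviation figures stated above, and the share is close to the approximately 30% figure.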
Table 1 above also shows that for male students the relative difference in discipline increases as the severity of the discipline increases. In “Mired in Numbers” (Legal Times, Oct. 12, 1996), which underlies Section C of this page, I discussed the fact that the increase in racial disparities in sentencing as the severity of the sentence increased had been read as evidence of unfairness in sentencing, and I pointed out the purely statistical reasons to expect larger racial differences in adverse outcomes (though smaller racial disparities in favorable outcomes) as the adverse outcomes became less common. The final column of Table 1 suggests that, for both male and female students, whatever the forces driving the patterns of differences in discipline rates, to the extent that such forces can be measured, they appear to be diminished with respect to the more severe of the three forms of discipline addressed in the table.
The pattern of increasing relative differences in adverse outcomes as the outcome becomes less common does not hold for female students (though the pattern of decreasing relative differences in the favorable outcome does, as it necessarily would when the former pattern does not hold). The decrease in the force of the factors driving the differences in the more severe forms of discipline was apparently enough to overcome the prevalence-related patterns for the adverse outcomes while reinforcing the patterns for the favorable outcome.
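A minimal Python sketch can show how a weakening of the underlying forces may offset the prevalence-related pattern for the adverse outcome while reinforcing it for the favorable outcome. The rates and gaps below are hypothetical assumptions, not values from Table 1, and the sketch again assumes normal distributions with equal standard deviations.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def adverse_and_favorable_ratios(advantaged_adverse_rate, gap_in_sd):
    """Return (adverse-outcome ratio, favorable-outcome ratio) when the
    disadvantaged group's mean lies gap_in_sd standard deviations closer
    to the adverse tail than the advantaged group's mean."""
    threshold = STD_NORMAL.inv_cdf(advantaged_adverse_rate)
    disadvantaged_adverse_rate = STD_NORMAL.cdf(threshold + gap_in_sd)
    adverse_ratio = disadvantaged_adverse_rate / advantaged_adverse_rate
    favorable_ratio = (1 - advantaged_adverse_rate) / (1 - disadvantaged_adverse_rate)
    return adverse_ratio, favorable_ratio


# Hypothetical scenarios (assumptions for illustration only).
common = adverse_and_favorable_ratios(0.05, 0.5)             # common outcome
rare_same_gap = adverse_and_favorable_ratios(0.005, 0.5)     # rarer outcome, same gap
rare_smaller_gap = adverse_and_favorable_ratios(0.005, 0.3)  # rarer outcome, smaller gap

for label, (adverse, favorable) in [
    ("common outcome, gap .5", common),
    ("rarer outcome, gap .5", rare_same_gap),
    ("rarer outcome, gap .3", rare_smaller_gap),
]:
    print(f"{label}: adverse ratio {adverse:.2f}, favorable ratio {favorable:.3f}")
```

With the gap held constant, the rarer outcome shows the larger adverse-outcome ratio. With the gap reduced, the adverse-outcome ratio for the rarer outcome falls below that for the common outcome, while the favorable-outcome ratio declines further still.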
Table 2 presents information similar to that in Table 1, but addressing gender differences by race rather than racial differences by gender.
Table 2. Male and Female Rates of Discipline by Race with Ratios of Rates of Experiencing and Avoiding Discipline and Estimated Effect Sizes [ref b2813 d 3]
In Table 2 one observes the standard patterns whereby for all categories of discipline the relative gender differences in adverse outcomes are greater among whites (where such outcomes are less common) than among blacks, while the relative differences in avoiding the outcomes are greater among blacks than among whites. For both races, one observes that relative differences for the two less common (more severe) outcomes are greater than for the most common (least severe) outcome, while the relative differences in the favorable outcome are smaller for the less common outcomes. One also observes that the forces driving the gender difference seem to decrease somewhat as the severity of the discipline increases.
The prevalence-related forces are not, however, evident in all the patterns. For both blacks and whites, the relative difference in adverse outcomes decreases between the second most and most severe categories. Such pattern is a function of the decrease in the forces driving the difference, as reflected in the EES column. Given the very low rates in the most severe category, random variation may have a role as well. Given the small numbers involved in categories 2 and 3, they would probably be better treated grouped together, which is the situation reflected in the “2&3” row and which, as noted, is entirely in accord with the prevalence-related patterns.
As I discuss in many places, while the prevalence-related forces may not always be evident in the observed patterns, they are almost invariably present and interacting with meaningful differences between the differences in the circumstances of the comparison groups at each prevalence level. See, e.g., the “Paradox of Success and Failure” section of “Race and Mortality” (Society Jan/Feb 2000) and Section B of the 2006 British Society for Population Studies presentation. There is little reason to doubt, however, that any general reduction in discipline levels will tend to be accompanied by increasing relative differences in discipline rates and declining relative differences in rates of avoiding discipline.
The above discussion, save for the paragraph about adjustments, relates to the perceived racial impact of stringent discipline standards, which is the main focus of recent public discussion of the matter. There also exist issues of whether discrimination is responsible for some part of observed differences. That issue is addressed on the Disparate Treatment subpage.
Disparate Treatment (addressing issues concerning determining the extent to which observed disparities occur because biased teachers or administrators treat certain groups more harshly)
Offense Type Issues (addressing the perception that white students tend to be disciplined for objectively identified (i.e., more serious) infractions while black students tend to be disciplined for subjectively identified (i.e., less serious) infractions)
Los Angeles SWPBS (addressing the increase in relative differences in discipline rates following implementation of a program in Los Angeles aimed at reducing discipline rates)
Suburban Disparities (addressing reportage of larger relative differences in discipline rates in suburbs of Philadelphia than in Philadelphia itself)
Disabilities – PL 108-446 (addressing provisions of the Individuals with Disabilities Education Improvement Act that require responses to observed disability-related differences in discipline rates that would be likely to increase those differences)
Gender Disparities (addressing the way researchers tend to regard the likelihood that gender differences in discipline rates may result from bias in the same way they regard the likelihood that racial differences in discipline rates may result from bias)
NEPC Colorado Study (addressing a National Education Policy Center study of disparities in discipline rates in Colorado that reflects the view that stringent policies lead to large relative differences in discipline rates and that raises certain other issues)
NEPC National Study (addressing a National Education Policy Center study of national disparities in discipline rates showing changes in rates over time)
Flawed Inferences – Discipline (addressing perceptions about the comparative size of relative differences that fail to consider that factors that affect outcome rates will tend to show larger proportionate effects on lower baseline rates)
Oakland Agreement (addressing an agreement between the Department of Education and the Oakland Unified School District that calls for general reductions in discipline rates and decreases in relative differences in discipline rates)
DOE Equity Report (addressing a November 2012 report of the Department of Education noting that a number of states were modifying zero tolerance policies, in some cases relying on Department of Education data, but also presenting data indicating that relative differences in suspension rates were greater in districts without zero tolerance policies than in districts with zero tolerance policies)
Duncan/Ali Letter (addressing an effort to make the Department of Education recognize that actions it recommends aimed at reducing racial differences in discipline rates will tend to increase such differences)
B. Certain Adverse Outcomes in the Employment Setting
The June 23, 1993 Legal Times article “Getting it Straight When Statistics Can Lie” addresses a number of situations where data on group differences in the employment setting were misinterpreted as a result of the failure to recognize the pattern whereby the rarer an outcome the greater tends to be the relative difference in experiencing it and the smaller tends to be the relative difference in avoiding it. These include:
(a) In Fisher v. Transco-Services-Milwaukee, 979 F.2d 1239 (1992), the Seventh Circuit reversed a grant of summary judgment to the defendant in an age discrimination case challenging a computerized system for monitoring the rate at which warehouse workers filled orders. The Seventh Circuit reached its decision based on the view that the stringency of the performance standard led to a large relative difference in rates of failing to meet the standard. But the more stringent the standard, the smaller would tend to be the relative difference in failing to meet it.
(b) The discipline practices of the Internal Revenue Service were subjected to intense scrutiny because of widely disparate rates at which blacks and whites were disciplined for workplace infractions. An extensive report was produced attributing the disparity largely to race-neutral factors and recommending largely race-neutral approaches to address the situations causing the discipline problems. If such approaches are effective, they would tend to reduce the racial disparity in avoiding discipline problems. But they would tend to increase further the racial disparities in discipline rates that attracted attention to the situation in the first place.
(c) A study of racially disparate termination rates among Postal Service workers explored whether such disparities would be as great at a quasi-federal agency with an excellent reputation as a fair employer as they were in the private sector. Yet the authors failed to consider that the greater protections afforded public sector workers, by reducing overall termination rates, and hence reducing disparities in keeping one’s job, lead to greater disparities in losing one’s job.
(d) The federal government’s Uniform Guidelines on Employee Selection Procedures speak generally of a “four-fifths rule” (sometimes termed the “80-percent rule”) whereby federal enforcement action usually will be limited to situations where one group’s rate of satisfying a selection criterion is less than 80 percent of the rate of another group. Relying on the Guidelines, many courts also have applied the four-fifths rule to limit disparate impact claims to situations where the impact can be deemed serious.[iv] But interpretations accompanying the Guidelines have noted a particular exception to the focus on selection rates in the case of the use of arrest or conviction records as disqualifying criteria. Since usually a large enough majority of members of all races will satisfy the requirement of having no arrest or conviction record, such policies do not often violate the four-fifths rule. In such cases, the interpretations have stated, the appropriate focus is upon disparities in disqualification rates.
A problem arises, however, when one seeks a less discriminatory alternative to a policy barring hire of persons with arrest or conviction records. The seemingly obvious less discriminatory alternative to a rule barring hire of persons with arrest records is a rule barring hire of persons with conviction records, and the seemingly obvious alternative to a rule barring hire of persons with any arrest or conviction record is a rule barring hire of persons only with arrests or convictions for serious crimes. But the probable tendency of these alternatives is to reduce the disparate impact only as measured in terms of rates of meeting the requirement of not having such a record. They would likely increase the disparity in the rates at which members of two groups are disqualified, which the interpretation indicates ought to be the focus.
C. Racial Disparities in the Adverse Outcomes in the Criminal Justice System
The October 12, 1996 Legal Times article “Mired in Numbers” addresses a number of issues involving perceptions about large racial differences in adverse outcomes in the criminal justice system, including application of California’s three-strikes law. Among other things, the article explains that measures that would cause the three-strikes law to be applied less frequently would tend to increase the relative differences in rates at which it is applied. The article also addresses interpretative issues related to the fact that at each more extreme level of sanction (hence, involving a rarer adverse outcome), the relative difference in experiencing that outcome increases (as also discussed in Section A above).
D. Racial Impact of NCAA Academic Ineligibility Standards
The March 5, 1990 National Law Journal article “An Issue of Numbers” and the Winter 1991 Public Interest article “The Perils of Provocative Statistics” discuss perceptions that the overwhelmingly black representation among persons disqualified from intercollegiate athletics due to the failure to meet the NCAA’s Proposition 48 academic eligibility standard reflected the fact that the standard was too high. The articles explain that under a lower standard the black representation among disqualified athletes would be expected to be greater. More recently, it appears that the same things that were said about Proposition 48 are being said about its successor, Proposition 16, but still without recognition that lower standards tend to yield a greater disproportion in rates of failing to meet them than higher standards do.
[i] Perceptions that a larger relative difference in an outcome reflects a stronger force causing the difference (as, for example, in the case of inferences drawn about increasing relative differences in mortality or even inferences drawn about larger relative differences in discipline rates at one school compared with another), while flawed for failure to consider the implications of overall prevalence, are not necessarily incorrect. Indeed, all things being equal as to prevalence, a larger relative difference in experiencing an adverse outcome will be associated with a larger relative difference in experiencing the corresponding favorable outcome. But in the same way that a belief that lowering test cutoffs would reduce relative differences in failure rates is the opposite of reality, the perception that making discipline standards more lenient would reduce relative differences in discipline rates is the opposite of reality.
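The expectation that a lower standard would increase the lower-scoring group's representation among those disqualified can be illustrated with a minimal Python sketch. All of the inputs are hypothetical assumptions rather than NCAA data: two normal score distributions one standard deviation apart, a lower-scoring group making up 25% of the pool, and four arbitrary cutoffs.

```python
from statistics import NormalDist

# Hypothetical inputs (assumptions for illustration, not NCAA figures).
higher_scoring = NormalDist(mu=1.0, sigma=1.0)
lower_scoring = NormalDist(mu=0.0, sigma=1.0)
LOWER_GROUP_SHARE_OF_POOL = 0.25


def lower_group_share_of_disqualified(cutoff):
    """Proportion of all persons falling below the cutoff who belong to the
    lower-scoring group."""
    below_lower = LOWER_GROUP_SHARE_OF_POOL * lower_scoring.cdf(cutoff)
    below_higher = (1 - LOWER_GROUP_SHARE_OF_POOL) * higher_scoring.cdf(cutoff)
    return below_lower / (below_lower + below_higher)


# From a higher standard to a lower one.
cutoffs = [0.5, 0.0, -0.5, -1.0]
shares = [lower_group_share_of_disqualified(c) for c in cutoffs]

for cutoff, share in zip(cutoffs, shares):
    print(f"cutoff {cutoff:+.1f}: lower-scoring group is {share:.0%} of those disqualified")
```

Under these assumptions, each reduction in the cutoff disqualifies fewer people overall, but the lower-scoring group comprises a larger proportion of those who remain disqualified.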
[iii] The EES figures in Tables 1 and 2 are from the online calculator referenced on the Solutions subpage. For the outcome rates below 1% that calculator provides values that differ from those derived from the Solutions Database in a way that appears to involve something other than the inexactness of the Solutions Database (a matter warranting further investigation).
[iv] The four-fifths rule may seem to provide a useful measure of the size of an impact of an employment criterion. But situations where the rates of meeting a criterion are 72% for the disadvantaged group and 90% for the advantaged group and where those rates are 8% for the disadvantaged group and 10% for the advantaged group both reflect the 80% break point for compliance/non-compliance with the four-fifths rule. Yet in the former situation the EES is .70 while in the latter it is .12.
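The contrast in this note can be checked with a minimal Python sketch. It assumes that the EES is the difference between the standard normal quantiles of the two rates (normal distributions with equal standard deviations). The function names are hypothetical, and the sketch is not the calculator referenced in note [iii].

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def selection_ratio(disadvantaged_rate, advantaged_rate):
    """Ratio compared against the four-fifths (80 percent) threshold."""
    return disadvantaged_rate / advantaged_rate


def estimated_effect_size(disadvantaged_rate, advantaged_rate):
    """Gap between hypothesized means in standard deviations, assuming
    normal distributions with equal standard deviations."""
    return STD_NORMAL.inv_cdf(advantaged_rate) - STD_NORMAL.inv_cdf(disadvantaged_rate)


# The two situations described in the note.
high_rates = (0.72, 0.90)
low_rates = (0.08, 0.10)

ratio_high = selection_ratio(*high_rates)
ratio_low = selection_ratio(*low_rates)
ees_high = estimated_effect_size(*high_rates)
ees_low = estimated_effect_size(*low_rates)

print(f"72% vs 90%: ratio {ratio_high:.2f}, EES {ees_high:.2f}")
print(f" 8% vs 10%: ratio {ratio_low:.2f}, EES {ees_low:.2f}")
```

Both pairs sit at the 80% break point, yet under the normality assumption the implied gaps between means differ severalfold, close to the .70 and .12 figures stated above.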