Racially Polarized Voting - Redistricting Data Hub

What is a Racially Polarized Voting Analysis?

Racially Polarized Voting (RPV) analyses are done to determine compliance with the Voting Rights Act (VRA). The VRA was passed in 1965 with the intent to enforce the 15th Amendment, which states, “The right of citizens of the United States to vote shall not be denied or abridged by the United States or by any State on account of race, color, or previous condition of servitude.” Section 2 of the Voting Rights Act protects against voting practices or procedures that inhibit the right to vote based on race, color, or membership in a language minority group. In addition, many state constitutions have language similar to the VRA about free and open elections and the right to vote.

Thornburg v. Gingles (1986) is a landmark US Supreme Court case in which Black plaintiffs challenged a North Carolina state legislative district plan on the grounds that it violated Section 2 of the Voting Rights Act by diminishing their ability to elect representatives of their choice. To prove this, the plaintiffs had to show that the redistricting plan placed Black voters in districts with a majority of white voters who would vote against, and defeat, their preferred candidates. Thus, they had to prove that racially polarized voting was occurring. In an opinion written by Justice William Brennan, the Court established three criteria that are necessary for establishing a Section 2 vote dilution claim. They are collectively called the Gingles criteria.

Gingles I: The minority group must be large and geographically compact enough to constitute a majority of a single-member district.
Gingles II: The minority group must be politically cohesive (i.e., minority voters tend to vote similarly to one another).
Gingles III: The majority group must be politically cohesive and have consistently voted as a bloc such that the minority preferred candidate is usually defeated.

All three criteria must be satisfied for a Section 2 vote dilution claim to move forward. Analysis of Gingles II and III is generally called a racial bloc voting or racially polarized voting (RPV) analysis.

When Congress amended Section 2 in 1982, it directed courts to consider the totality of circumstances in a jurisdiction. Since Gingles, courts take up this inquiry after the Gingles criteria have been met. The Senate Committee on the Judiciary’s 1982 report included seven factors, known as the Senate Factors, to guide it. The court may also consider other factors, and it is not required to find any of the Senate Factors present in order to rule that Section 2 was violated.

In 2000, the US Office of Management and Budget (OMB) stated that US agencies are required to give people the option to select one or more races when reporting information on race in Federal data collection. At minimum, the following five race categories must be included: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White. The US Supreme Court ruled in Bartlett v. Strickland (2009) that, to meet Gingles I, the minority group must comprise a numerical majority of the voting-age population in a district; a district in which the minority group can elect its preferred candidate only with crossover votes from other voters does not meet Gingles I.

What Data are Used in RPV Analysis?

RPV analysis is complex and data intensive. In the US, we use a secret ballot system, meaning that ballots have no information about the voter or their demographics. There are voter file datasets that indicate whether an individual cast a vote in a given election, but not who they voted for. In other words, Gingles II and III cannot be directly answered because there is no record of how individuals of a particular racial, ethnic, or language group voted in a given election. Therefore, the process of uncovering whether racially polarized voting occurs requires working with multiple aggregate-level datasets to draw conclusions about individuals. An RPV analyst must collect and combine election results with demographic data to make inferences about the voting patterns of individuals within a certain geographic area.

Precinct Boundaries & Election Results

Election results are typically reported at the precinct level. These datasets have information about how many votes a given candidate or initiative received in the precinct, in addition to the total votes cast in the precinct. Election results must be joined to precinct boundary files to create a Precinct Boundary and Election Results file. This is a difficult process in itself because of the challenges associated with collecting precinct boundaries. Once election results and precinct boundaries are joined together, they must be merged with Census Blocks. This is also a complex process, since Census Block and precinct boundaries often do not align neatly with one another, so each precinct’s results must be apportioned among the blocks that the precinct overlaps.
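One common way to move precinct-level votes down to Census Blocks is to apportion each precinct’s votes in proportion to block population. The sketch below assumes that approach and assumes each block has already been assigned to a single precinct; the identifiers and numbers are made up, and the spatial overlay that produces the block-to-precinct assignment is not shown.

```python
from collections import defaultdict

def disaggregate_votes(precinct_votes, block_to_precinct, block_pop):
    """Split each precinct's vote total across its blocks by population share."""
    # Total population of the blocks assigned to each precinct.
    precinct_pop = defaultdict(int)
    for block, precinct in block_to_precinct.items():
        precinct_pop[precinct] += block_pop[block]

    block_votes = {}
    for block, precinct in block_to_precinct.items():
        total = precinct_pop[precinct]
        if total == 0:
            # No population to weight by, so no votes are assigned.
            block_votes[block] = 0.0
        else:
            block_votes[block] = precinct_votes[precinct] * block_pop[block] / total
    return block_votes

# Hypothetical data: precinct P1 cast 300 votes for a candidate.
precinct_votes = {"P1": 300}
block_to_precinct = {"B1": "P1", "B2": "P1", "B3": "P1"}
block_pop = {"B1": 50, "B2": 100, "B3": 150}

for block, votes in disaggregate_votes(precinct_votes, block_to_precinct, block_pop).items():
    print(block, round(votes, 1))
```

Weighting by voting age population or by registered voters instead of total population is a reasonable alternative; the choice of weight is an analyst’s assumption and affects the block-level estimates.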

Voter Files

Voter file data contain information about who voted in which elections. They also have information collected from voter registration forms. These forms vary by state and are maintained by counties, but a typical voter file contains a voter’s name, address, and voting history. It may also contain emails, phone numbers, racial identification, and partisan identification, among other information. However, even in states that collect race information, individuals are not required to provide it. For more information about what data each state collects in voter registration forms, see this list from the National Conference of State Legislatures (NCSL). In RPV analysis, voter files are used to determine how many registered and actual voters live in a given area, and they are most useful when they include race information.

ACS

The American Community Survey (ACS) can be used for RPV because it offers information about citizenship, which the Decennial Census does not. Some Circuits require citizenship information to demonstrate Gingles I. Citizen Voting Age Population (CVAP) data comes from the ACS. It is useful because only citizens can vote in state and federal elections, and only people aged 18 or older can vote. CVAP data allows RPV analysts to exclude people who cannot legally vote from their analysis, although this does not account for people who choose not to vote. The ACS is a sample survey, unlike the Decennial Census, which is a count of everyone in the US. Its estimates are therefore subject to sampling error and are less precise than Decennial Census counts. In addition, the smallest geographic area for which ACS data is available is the Block Group.
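As a rough illustration of how CVAP data might feed a Gingles I check, the sketch below sums hypothetical Block Group CVAP counts for a proposed district and tests whether one group forms a numerical majority. The field names (`cvap_total`, `cvap_black`) and all numbers are assumptions for illustration, and the ACS margins of error are ignored here.

```python
def minority_cvap_share(block_groups, group_field):
    """Return the group's share of total CVAP across the given Block Groups."""
    total = sum(bg["cvap_total"] for bg in block_groups)
    group = sum(bg[group_field] for bg in block_groups)
    return group / total if total else 0.0

# Hypothetical Block Groups making up one proposed district.
district = [
    {"cvap_total": 1000, "cvap_black": 600},
    {"cvap_total": 800, "cvap_black": 500},
    {"cvap_total": 1200, "cvap_black": 450},
]

share = minority_cvap_share(district, "cvap_black")
print(f"Black share of CVAP: {share:.1%}")
print("Numerical majority:", share > 0.5)
```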

Data from the ACS can be used to support claims about the seven Senate Factors, which are used in Voting Rights Act Section 2 cases. These data include income, healthcare status, educational attainment, and poverty rates.

PL 94-171 Census Data

For Gingles I, experts use demographic data from the Census that is reported at the Block level. Census data is used to find out how many people of a given race and ethnicity live in each Block. PL 94-171 data is useful in RPV analysis for determining Block-level race and ethnicity data, including for the voting-age population, but it does not have information about individuals or their voting behavior.

Three Methods of RPV Analysis

There are three methods that are usually used for RPV analysis: homogeneous precincts, Ecological Regression (ER), and Ecological Inference (EI). All three methods involve making estimates about individuals using aggregate-level data with varying degrees of precision.

The homogeneous precincts method is the simplest of the three and compares election results from precincts that are composed homogeneously of one race or ethnicity. For example, if a precinct is 100% Hispanic and voted 85% for Candidate A, it follows that 85% of Hispanic voters in that precinct voted for Candidate A.
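The homogeneous precincts idea can be sketched as follows. Because precincts that are exactly 100% one group are rare, this sketch assumes a cutoff (here 90%) for treating a precinct as homogeneous; the cutoff, the field names, and the data are all hypothetical.

```python
def homogeneous_precinct_estimate(precincts, share_field, threshold=0.9):
    """Pooled candidate vote share in precincts where the group is at least `threshold` of voters."""
    selected = [p for p in precincts if p[share_field] >= threshold]
    total_votes = sum(p["total_votes"] for p in selected)
    if total_votes == 0:
        return None  # no homogeneous precincts, so no estimate
    return sum(p["candidate_votes"] for p in selected) / total_votes

# Hypothetical precinct data.
precincts = [
    {"hispanic_share": 0.95, "candidate_votes": 170, "total_votes": 200},
    {"hispanic_share": 1.00, "candidate_votes": 85, "total_votes": 100},
    {"hispanic_share": 0.40, "candidate_votes": 10, "total_votes": 100},
]

estimate = homogeneous_precinct_estimate(precincts, "hispanic_share")
print(f"Estimated Hispanic support for Candidate A: {estimate:.0%}")
```

The estimate only describes voters in the selected precincts, who may not vote the same way as members of the same group living in mixed precincts.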

Ecological Regression is conducted using bivariate analysis, meaning there are two variables: race or ethnicity of voters and votes for a candidate in each precinct. RPV analysts generate a line of best fit for the data which shows the relationship between the proportion of each precinct that is the minority race/ethnicity and the percentage of votes for a given candidate. The fitted line is then read at 0% and 100% minority to estimate each group’s support for the candidate.
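A minimal sketch of the bivariate regression described above, using ordinary least squares on made-up precinct data. It is unweighted for simplicity (weighting precincts by turnout is a common refinement that is not shown), and reading the line at 0% and 100% minority assumes each group votes the same way in every precinct.

```python
def ecological_regression(minority_share, candidate_share):
    """Fit candidate_share = intercept + slope * minority_share by least squares.

    Returns (estimated majority-group support, estimated minority-group support),
    i.e. the fitted line evaluated at minority_share = 0 and 1.
    """
    n = len(minority_share)
    mean_x = sum(minority_share) / n
    mean_y = sum(candidate_share) / n
    sxx = sum((x - mean_x) ** 2 for x in minority_share)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(minority_share, candidate_share))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, intercept + slope

# Hypothetical precincts: share of voters who are minority, and Candidate A's vote share.
x = [0.1, 0.3, 0.5, 0.7, 0.9]
y = [0.26, 0.38, 0.50, 0.62, 0.74]

majority_support, minority_support = ecological_regression(x, y)
print(f"Majority-group support: {majority_support:.2f}")
print(f"Minority-group support: {minority_support:.2f}")
```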

Ecological Inference is similar to ER. However, it also uses the method of bounds to constrain the estimates of each racial group’s voting pattern to a feasible range. It also uses Maximum Likelihood Estimation to fit a bivariate normal distribution of the possible percentages of votes for a particular candidate by different racial and ethnic groups, which yields estimates with confidence intervals. EI is generally considered the most accurate of the three methods.
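The method of bounds can be illustrated on its own. If a precinct’s voters are a share x minority and the candidate received a share t of the vote, then t = x·b + (1 − x)·w, where b and w are the unknown support rates among minority and majority voters. Since both rates must lie between 0 and 1, b is confined to a range. The sketch below computes only that range; it is not a full EI model, and the function name and inputs are illustrative.

```python
def method_of_bounds(minority_share, candidate_share):
    """Feasible range for minority support in one precinct, given aggregate shares."""
    if not 0 < minority_share <= 1:
        raise ValueError("minority_share must be in (0, 1]")
    majority_share = 1 - minority_share
    # Lowest minority support: every majority voter backed the candidate.
    lower = max(0.0, (candidate_share - majority_share) / minority_share)
    # Highest minority support: no majority voter backed the candidate.
    upper = min(1.0, candidate_share / minority_share)
    return lower, upper

# Hypothetical precincts: (minority share of voters, candidate vote share).
for x, t in [(0.6, 0.7), (0.9, 0.85), (1.0, 0.85)]:
    low, high = method_of_bounds(x, t)
    print(f"x={x:.2f}, t={t:.2f}: minority support between {low:.2f} and {high:.2f}")
```

The bounds tighten as a precinct becomes more homogeneous, and in a fully homogeneous precinct they collapse to a single value.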

Data Issues Associated with RPV Analysis

This section discusses the problems RPV analysts may encounter and outlines the reasons why the analysis is challenging. While it addresses specific questions and issues, it is not an exhaustive list, and there are many other questions for which there is no clear answer.

Ecological Fallacy

RPV analysis methodology involves drawing conclusions about individuals based on aggregate-level data. Making assumptions about individual characteristics from group-level data can lead to an issue known as the ecological fallacy. The ecological fallacy occurs when we assume that every individual in a group behaves the way the group appears to behave when viewed as a whole. Here is a simplified example of a geographic ecological fallacy: Say a Census Block has five households, and you know the average income is $1,000,000 per year. An ecological fallacy would be to assume that everyone in the Census Block is rich, when in reality, just one household in the Block makes $5,000,000, and the others make $0.
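The income example above can be checked with a few lines of arithmetic. The sketch uses the same five hypothetical households and compares the average with the median to show how the aggregate hides the individual values.

```python
from statistics import mean, median

# The five hypothetical households from the example above.
household_incomes = [5_000_000, 0, 0, 0, 0]

average = mean(household_incomes)
typical = median(household_incomes)
share_below_average = (
    sum(1 for income in household_incomes if income < average) / len(household_incomes)
)

print("Average income:", average)
print("Median income:", typical)
print("Share of households below the average:", share_below_average)
```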

RPV analysis is complicated because of the lack of data at the individual level about race and voting patterns. As a result, analysts must look at aggregate data and voting patterns to make inferences. It is important to keep the ecological fallacy in mind and understand that it is not possible to know with certainty how individual members of a racial or ethnic group voted.

What if voter files do not have race information?

If race information is not included in voter files, analysts may use the Bayesian Improved Surname Geocoding (BISG) method to estimate the racial composition of a group of individuals. This method uses the U.S. Census Bureau’s surname list to obtain information about an individual’s likely race/ethnicity and combines that with information about the racial or ethnic makeup of the Census Block where they live. The result is an estimated probability that a given individual identifies with each race or ethnicity. However, because the BISG method produces a statistical estimate based only on where someone lives and their last name, it does not guarantee accuracy for any individual.
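The core of the BISG calculation is a Bayes’ rule update. The sketch below assumes the standard simplification that surname and location are independent once race is known, and combines a surname-based prior with the share of each group that lives in the person’s block. All probabilities are made up for illustration; real inputs would come from the Census surname list and Block-level counts.

```python
def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Combine surname and geography evidence into P(race | surname, geography)."""
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("inputs give zero probability to every group")
    return {race: value / total for race, value in unnormalized.items()}

# Hypothetical inputs for one voter.
surname_prior = {"hispanic": 0.7, "white": 0.2, "black": 0.1}      # P(race | surname)
block_share = {"hispanic": 0.002, "white": 0.001, "black": 0.001}  # P(this block | race)

posterior = bisg_posterior(surname_prior, block_share)
for race, prob in posterior.items():
    print(f"{race}: {prob:.3f}")
```

In aggregate analysis, these probabilities are typically summed across voters rather than used to assign a single race to each person.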

Analysts may also use Census data, which includes race at the Block level. This method is easier and does not rely on statistical estimates, but it does not show whether an individual of a given race was registered to vote or voted in a given election. Census data are also aggregate counts, so they contain less detail than voter files about the individuals in each Block.

Who is the “preferred candidate?”

Typically, we assume that a minority group would prefer a candidate of the same race or ethnicity, but this is not always the case. For example, if all the candidates are white males, it is more difficult to identify the minority group’s preferred candidate and, if necessary, to show that race is the primary reason for the preference. The court ultimately decides this question, but analysis should generally reveal who the preferred candidate is.

How cohesive must voters be?

Gingles II states that the minority group must be “politically cohesive.” However, there is no official threshold or definition for cohesion. In fact, the Gingles Court purposely did not offer a definition for cohesion, stating that the size of majority voting blocs necessary to have an impact on minority voting blocs varies by district. No definition has been established since Gingles. Instead, courts analyze cohesion across multiple elections and consider other factors. Overall, the lack of a threshold for cohesion gives courts the power to look at the evidence and decide whether the data suggest that minority groups are politically cohesive.

Similarly, there is no consistent rule or measurement for how much polarized voting is considered a Section 2 violation.

What is “Compactness?”

Gingles I states that the minority group must be large and geographically compact enough to constitute a majority of a single-member district. However, compactness has never been officially defined in this context. It has generally been interpreted as populations with fairly regular boundaries and areas that are not “far flung.” In other words, the population should be organized geographically so it could realistically be a district.

Correlation vs. Causation

Because RPV analyses use observational data, analysts cannot make claims about causation. Analysts can use statistics and patterns to make inferences, but ultimately, we cannot be certain that race, or any other factor, is directly causing voting decisions. Research has shown that Black, Asian, and Hispanic voters lean Democratic, which suggests that race is a plausible explanatory factor.

The issue of correlation and causation is also a legal one. In the original ruling in Thornburg v. Gingles, a plurality of justices concluded that RPV only refers to a correlation between race of voters and selection of candidates, meaning that some other factor could be involved, such as partisanship. In other words, they did not think it mattered why minority voters and majority voters voted differently from one another, only that they did. However, other justices disagreed, arguing that race had to be the primary factor to fulfill Gingles III. Because there was no majority agreement, different circuit courts have different approaches and requirements with respect to this question.

Before the Gingles trial, Congress amended Section 2 of the VRA and stated that discriminatory effect alone would be enough to prove a Section 2 violation, without proof of intent to dilute minority votes.

Data Problems in Small Jurisdictions

RPV analysis generates estimates of voting patterns with confidence intervals. Larger samples produce more precise estimates and narrower confidence intervals. In small jurisdictions with low population, these confidence intervals may be too wide to be used in court. For example, if an RPV analysis results in an estimate that 80% of a particular racial group voted for Candidate A, but the confidence interval indicates that the true percentage is between 40 and 120% due to the small population, the estimate is not particularly useful. Small jurisdictions may have extreme RPV, but the estimates are so imprecise that it is difficult to prove in court.
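To see why small samples produce wide intervals, the sketch below uses the textbook normal-approximation interval for a simple proportion. This is an assumption made purely for illustration: it is not how ER or EI confidence intervals are computed, but it shows the same general pattern of interval width shrinking as the sample grows.

```python
import math

def ci_half_width(p, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# Same estimate (80% support) at different hypothetical sample sizes.
for n in (25, 100, 2500):
    half = ci_half_width(0.8, n)
    print(f"n={n}: 80% plus or minus {half:.1%}")
```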

A related issue is that estimates can fall outside the possible range of 0 to 100%, particularly in areas with extreme racially polarized voting. For example, analysis may yield an estimate, or a confidence interval, that exceeds 100% of a particular minority group voting for Candidate A.

