
Partner Data Validation

To ensure that the Precinct Boundaries and Election Results (PBER) files that the RDH hosts are accurate and reproducible, our Data Team started the Partner Data Validation (PDV) project. In past redistricting cycles, each organization working on redistricting validated these files for its own state, often duplicating one another’s work. By tackling the project across the U.S., the RDH saved these organizations time and effort.

Precinct Boundaries and Election Results files are integral to drawing legally compliant maps. However, collecting and processing these files is a complex project. For more detail about how the Precinct Boundaries and Election Results files are sourced and merged, and why the merged files are so important, see Precinct Boundaries and Election Results.

The Redistricting Data Hub worked with a number of data partners who collected election results from state agencies, usually the Secretary of State, and precinct boundary shapefiles from local election officials, the Census Bureau (through its Voting District Project), or the Secretary of State. Our data partners merged these files so that the election results from a given precinct are tied to the geography of that precinct. This allows an individual to use the datasets while drawing a map. The RDH took these merged files from our data partners and began the PDV process.

The PDV process had two key parts:

  1. Ensuring that our partners edited and merged the precinct boundary files and election results files without errors.
  2. Ensuring that the documentation explaining how those files were edited and merged allows for exact reproduction, including saving the source files.

Following the documentation written by our data partners, the RDH Data Team took the original election results files and precinct boundary shapefile and attempted to replicate the entire merging process from start to finish. At each step, the Data Team compared its results to the results of our data partners.

The first check was whether the candidate vote totals in the state’s election results file were equal to the vote totals in our data partner’s file. The Data Team checked these vote totals at the statewide, county, and precinct levels. Oftentimes a state does not allocate its absentee or early votes to the precinct where the voter lives, and instead reports these votes separately. When this occurred, our data partners reallocated these votes to the correct precinct. The RDH Data Team independently performed this reallocation and then confirmed that, after the reallocation, all the precincts had candidate vote totals equal to those reported by our data partners. If our Data Team’s results matched those reported by our data partners, the Data Team moved on to merging the precinct boundaries with the precinct-level election results.
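The three-level comparison described above can be sketched in a few lines of Python. This is an illustration only, not the RDH's actual script: the record layout (`county`, `precinct`, `candidate`, `votes`), the function names, and the sample data are all assumptions made for the example.

```python
from collections import defaultdict

def totals_by(rows, keys):
    """Sum votes per candidate, grouped by the given key fields.

    An empty tuple of keys yields statewide totals.
    """
    totals = defaultdict(int)
    for row in rows:
        group = tuple(row[k] for k in keys) + (row["candidate"],)
        totals[group] += row["votes"]
    return dict(totals)

def compare_totals(official_rows, partner_rows):
    """Return mismatched totals at the statewide, county, and precinct levels."""
    levels = {
        "statewide": (),
        "county": ("county",),
        "precinct": ("county", "precinct"),
    }
    mismatches = {}
    for level, keys in levels.items():
        official = totals_by(official_rows, keys)
        partner = totals_by(partner_rows, keys)
        # Record (official, partner) for every group whose totals differ.
        mismatches[level] = {
            group: (official.get(group, 0), partner.get(group, 0))
            for group in official.keys() | partner.keys()
            if official.get(group, 0) != partner.get(group, 0)
        }
    return mismatches

# Hypothetical sample data: the partner file is 5 votes short in precinct 2.
official = [
    {"county": "Adams", "precinct": "1", "candidate": "Smith", "votes": 120},
    {"county": "Adams", "precinct": "2", "candidate": "Smith", "votes": 80},
]
partner = [
    {"county": "Adams", "precinct": "1", "candidate": "Smith", "votes": 120},
    {"county": "Adams", "precinct": "2", "candidate": "Smith", "votes": 75},
]
result = compare_totals(official, partner)
```

Checking all three levels narrows down where a discrepancy sits: in this sample the difference appears at every level, but only one precinct is flagged.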

This part of the process is more involved. Usually, local election officials, rather than state-level agencies, keep track of precinct boundaries in their county. These officials often create unique names for the precincts within their county, such as “01 – Boston” and “02 – Boston.” However, the state agency that holds election results may title those same precincts differently, such as “1 – Boston, 2016” and “2 – Boston, 2016,” when storing the election results. Because of this variation, the election results for each precinct cannot be automatically merged with the precinct shapefiles.
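To make the mismatch concrete, the sketch below shows that the two example labels are different strings, and that a few normalization rules can reconcile them. The rules and the function name are hypothetical, chosen only to fit the example names above. Real precinct names vary in ways that simple rules will not catch, so anything they miss still needs manual review.

```python
import re

def normalize_precinct_name(name):
    """Reduce a precinct label to a comparable key (illustrative rules only)."""
    text = name.strip().lower()
    text = re.sub(r",\s*\d{4}$", "", text)     # drop a trailing ", 2016"-style year
    text = re.sub(r"\s*[–-]\s*", " - ", text)  # unify en dashes and hyphens
    text = re.sub(r"^0+(?=\d)", "", text)      # drop leading zeros: "01" -> "1"
    return text

local_name = "01 – Boston"       # as named by the local election official
state_name = "1 – Boston, 2016"  # as stored with the state's election results

same_raw = local_name == state_name
same_normalized = (
    normalize_precinct_name(local_name) == normalize_precinct_name(state_name)
)
```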

Additionally, the number of precinct boundaries sometimes exceeded the number of precincts in the election results file because a precinct boundary was split. When this occurred, our data partners recombined the split precinct if they could determine which one had been split; if not, they contacted the jurisdiction directly to understand what had occurred.
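A minimal sketch of how split precincts might be spotted, assuming each boundary feature already carries a precinct identifier (the identifiers and function name here are hypothetical). Identifiers that appear on more than one boundary feature are candidates for recombining. Identifiers with no counterpart in the election results are the cases that would need follow-up with the jurisdiction. Merging the geometries themselves is not shown.

```python
from collections import Counter

def find_split_candidates(boundary_ids, result_ids):
    """Return (ids on several boundary features, boundary ids absent from results)."""
    counts = Counter(boundary_ids)
    split = sorted(pid for pid, n in counts.items() if n > 1)
    unmatched = sorted(set(boundary_ids) - set(result_ids))
    return split, unmatched

# Hypothetical data: five boundary features, four precincts with results.
boundary_ids = ["A1", "A2", "A2", "A3", "A4-N"]
result_ids = ["A1", "A2", "A3", "A4"]

split, unmatched = find_split_candidates(boundary_ids, result_ids)
```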

Our data partners combed through the precinct shapefiles and election results files to give each precinct a unique identifier. This ensures that the votes from a given precinct are applied to the corresponding precinct in the shapefile. The RDH’s Data Team then went through the exact same process of giving each precinct a unique identifier, ensuring that no errors occurred during this process. After every precinct was given a unique identifier, the election results for each precinct were merged with that precinct’s shapefile. Again, the Data Team checked that its results matched those of the data partner. One final check was performed against a third-party source, such as Ballotpedia. The Data Team compared candidate vote totals at the state level to those reported by the third-party source to make sure that, while processing the PBER file, the team did not duplicate an error made by our data partner. Once this last check was complete, the final Precinct Boundaries and Election Results file was hosted on the RDH website.
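The merge and the final third-party check can be sketched as follows. This is an illustration under stated assumptions, not the Data Team's code: the identifier format, the field names, the placeholder geometry strings, and the third-party total are all made up for the example.

```python
def merge_results(boundaries, results):
    """Join election results onto boundaries by unique precinct ID.

    Returns the merged records plus the IDs found on only one side,
    so that unmatched precincts are reported rather than silently dropped.
    """
    merged = {
        pid: {**boundaries[pid], **results[pid]}
        for pid in boundaries.keys() & results.keys()
    }
    only_boundaries = sorted(boundaries.keys() - results.keys())
    only_results = sorted(results.keys() - boundaries.keys())
    return merged, only_boundaries, only_results

def statewide_total(merged, field):
    """Sum one vote field across all merged precincts."""
    return sum(record[field] for record in merged.values())

# Hypothetical inputs keyed by a unique precinct identifier.
boundaries = {
    "001-01": {"geometry": "<polygon placeholder>"},
    "001-02": {"geometry": "<polygon placeholder>"},
}
results = {
    "001-01": {"votes_smith": 120},
    "001-02": {"votes_smith": 80},
}

merged, only_boundaries, only_results = merge_results(boundaries, results)

third_party_total = 200  # hypothetical figure from an independent source
matches_third_party = statewide_total(merged, "votes_smith") == third_party_total
```

Comparing against an independent total guards against the case where both teams repeat the same mistake and therefore agree with each other.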

While going through the entire validation process, the Data Team created a Validation Report, such as this example for Georgia. This report documents the PDV process for that specific Precinct Boundaries and Election Results file, and includes information about how to access the raw data used to create the PBER file, processing steps to create the PBER file, and additional information relevant to the validation process. Indicating the raw files the Data Team began with and documenting any changes that were performed allows our data users to identify errors and trace them back to where they occurred. Additionally, by outlining the methodologies for creating the PBER files in the Validation Report, the Data Team gave users the ability to review the methodologies employed and determine whether they would like to use our data.

Scripts

The scripts used to create a specific Precinct Boundaries and Election Results file can be found on the RDH’s GitHub page.

Validation Report Summary

A visual summary of the Validation Report can be found just below the Validation Report download button on a specific PBER’s page. The visual summary provides Yes, No, or N/A answers to six validation criteria:

Criterion

  1. Raw Data available?
  2. Processing steps available?
  3. Able to replicate joining election data and shapefiles?
  4. Able to replicate by joining demographic data?
  5. Able to replicate by joining boundary data?
  6. Successfully ran validation?
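For readers who want to work with these summaries programmatically, one possible representation is sketched below. The `Answer` enum and `summarize` helper are hypothetical and are not part of any RDH tooling. Only the criterion wording is taken from the list above.

```python
from enum import Enum

class Answer(Enum):
    YES = "Yes"
    NO = "No"
    NA = "N/A"

CRITERIA = (
    "Raw Data available?",
    "Processing steps available?",
    "Able to replicate joining election data and shapefiles?",
    "Able to replicate by joining demographic data?",
    "Able to replicate by joining boundary data?",
    "Successfully ran validation?",
)

def summarize(answers):
    """Pair each criterion with its answer, requiring exactly one answer each."""
    if len(answers) != len(CRITERIA):
        raise ValueError(f"expected {len(CRITERIA)} answers, got {len(answers)}")
    return dict(zip(CRITERIA, answers))

# Hypothetical summary for a file with no demographic or boundary joins.
summary = summarize(
    [Answer.YES, Answer.YES, Answer.YES, Answer.NA, Answer.NA, Answer.YES]
)
```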

Checkbox Criteria Explainer

Checkbox Criteria Specifics:

Is all raw data available?

Yes: if all raw data used to create the final shapefile is available. This may include the precinct boundaries, election results, demographic data, and anything else included in the file. If the group collected the precinct boundaries itself, this means its raw county maps are also available.

No: if any of the data, even a single file or source, is not publicly available.


Processing steps available?

(Reports uploaded before 07/06/21)

Yes: this criterion is applied generously. We mark it as a yes if the group provides any amount of documentation or explanation of any processing that was completed, beyond providing the raw sources and listing the column names in the final file.

No: if no steps are publicly available detailing how the data was processed.


Processing steps available?

(Reports uploaded after 07/06/21)

Yes: this criterion is applied generously. We mark it as a yes if the group provides any amount of documentation or explanation of any processing that was completed, beyond providing the raw sources and listing the column names in the final file. We also mark it as a yes if no documentation is needed.

No: if no steps are publicly available detailing how the data was processed.


Able to replicate joining data and shapefiles?

(Reports uploaded before 07/06/21)

Yes: if we were able to join the election data without issue by following the documentation on the precinct name changes that were made. If no documentation on precinct name changes was provided but we had to adjust fewer than 10 precinct names in non-substantive ways, we still count the join as replicated.

No: if more than 10 precinct names were changed in substantive ways and these changes were not documented.

N/A: we did not attempt this join


Able to replicate joining data and shapefiles?

(Reports uploaded after 07/06/21)

Yes: if we were able to join the election data without issue by following the documentation on the precinct name changes that were made. If no documentation on precinct name changes was provided but we had to adjust fewer than 10 precinct names in non-substantive ways, we still count the join as replicated.

No: if more than 10 precinct names were changed in substantive ways and these changes were not documented, or if we were not able to attempt this join because a shapefile or other source file was unavailable.


Able to replicate joining demographic data to block-level shapefiles?

Yes: if the join completes and all demographic fields and counts match.

No: if we are unable to join sources, or the demographic fields or counts do not match.

N/A: this join is not needed on this file
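The Yes/No rule for the demographic join can be expressed as a simple comparison. The sketch below assumes each dataset is a mapping from a block identifier to its demographic counts. The identifiers, field names, and function name are hypothetical. Deciding that the join is not needed (N/A) happens before any comparison and is left out.

```python
def demographic_join_matches(expected, joined, fields):
    """Return "Yes" only if every block has identical counts for every field."""
    if expected.keys() != joined.keys():
        return "No"  # some blocks failed to join
    for block_id, counts in expected.items():
        other = joined[block_id]
        for field in fields:
            # A field missing on either side counts as a mismatch.
            if field not in counts or field not in other:
                return "No"
            if counts[field] != other[field]:
                return "No"
    return "Yes"

# Hypothetical block-level counts.
fields = ["total_pop", "vap"]
expected = {
    "block_a": {"total_pop": 100, "vap": 80},
    "block_b": {"total_pop": 50, "vap": 40},
}
joined_ok = {
    "block_a": {"total_pop": 100, "vap": 80},
    "block_b": {"total_pop": 50, "vap": 40},
}
joined_bad = {
    "block_a": {"total_pop": 100, "vap": 80},
    "block_b": {"total_pop": 50, "vap": 41},
}

answer_ok = demographic_join_matches(expected, joined_ok, fields)
answer_bad = demographic_join_matches(expected, joined_bad, fields)
```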


Able to replicate joining boundary data?

Yes: if we were able to replicate most precinct assignments to nearly all political districts

No: if we were not able to replicate precinct assignments to all political districts.

N/A: this join is not needed on this file


Successfully validated election results?

(Reports uploaded before 06/21/21)

Yes: we are able to confirm election results match the official results by precinct.

No: if election results do not match official election results or we are unable to validate due to other issues.


Successfully ran validation?

(Reports uploaded after 06/21/21)

Yes: we are able to confirm election results match the official results by precinct and that any precinct shapefile differences are explainable or occur in no-vote areas.

No: election results do not match official election results, we are unable to validate due to other issues, there are unexplainable precinct shapefile differences that occur outside no-vote areas, or we were not able to perform the shapefile validation.
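One way to read the post-06/21/21 rule is as a small decision function. The sketch below takes the "Yes" definition as the starting point (results match, and every shapefile difference is either explainable or falls in a no-vote area) and treats everything else as "No". The function name and the `explained` and `votes` fields are assumptions made for the example.

```python
def ran_validation(results_match, shapefile_checked, differences):
    """Apply the "Successfully ran validation?" rule to one file.

    differences: one dict per precinct shapefile difference, with
      "explained" (bool) and "votes" (votes cast in the affected area).
    """
    if not results_match or not shapefile_checked:
        return "No"
    for diff in differences:
        in_no_vote_area = diff["votes"] == 0
        if not diff["explained"] and not in_no_vote_area:
            return "No"
    return "Yes"

# Hypothetical differences found while comparing shapefiles.
harmless = [
    {"explained": True, "votes": 350},  # documented boundary change
    {"explained": False, "votes": 0},   # unexplained, but no votes affected
]
problem = harmless + [{"explained": False, "votes": 12}]

answer_harmless = ran_validation(True, True, harmless)
answer_problem = ran_validation(True, True, problem)
```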

2020 Validation Reports

Reports are available for states with 2020 Precinct Boundaries and Election Results shapefiles from VEST. Reports for previous years are also available on the state download pages by filtering by year or file type.
