PL 94-171 dataset

This article explains what the PL 94-171 dataset is, outlines the legacy and tabulated data formats, describes how the RDH tabulated and validated the data, and lists what is included in the files hosted by the RDH.

Overview

The PL 94-171 dataset is the Redistricting Data File created by the United States Census Bureau based on the decennial census. It was created for use by the states in redistricting.

The RDH hosts tabulated PL 94-171 datasets, which are being used in most states. Some states, however, require adjustments to the PL 94-171 data prior to redistricting, primarily to reallocate incarcerated persons. We have adjusted datasets for most states that require this reallocation. Regardless, we encourage users to check directly with their state or another resource to ensure the correct dataset is being used.

For states that have indicated that they are using the census data without adjustment and will not be hosting a downloadable version of data on their website, our data should match what is being used by the state. For more information, read more about states that adjust the census data.

We have created a list of the fields and descriptions found in the PL 94-171 dataset by referencing the 2020 Census State Redistricting Data (Public Law 94-171) Summary File Technical Documentation.

What is the PL 94-171 dataset?

The PL 94-171 dataset is the Redistricting Data File created by the U.S. Census Bureau based on the decennial census for use by the states in redistricting. In the remainder of this article, we are referring to the PL 94-171 dataset whenever we reference “the data.”

The dataset is released by the Census Bureau in two formats: the legacy data format, and the tabulated data format.

What is the legacy data format?

The legacy data format is a pipe-delimited text file: each row is one record, and the fields within a row are separated by “|”. A single file contains the decennial census data at many levels of geography, from the state down to the Census Block, which is the smallest unit of geography and the one most commonly used as the basis for legislative and congressional redistricting.
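To make "pipe-delimited" concrete, here is a minimal sketch of reading such text with Python's standard csv module. The two sample rows and their field layout are made up for illustration; real legacy files have many more fields and their own column order.

```python
import csv
import io

# Made-up sample, for illustration only: the values and the field order
# are assumptions, not a real extract from a legacy file.
SAMPLE = "XX|040|0000001|State A\nXX|050|0000002|County B\n"


def read_pipe_delimited(text):
    """Split each line of pipe-delimited text into a list of string fields."""
    # QUOTE_NONE treats quotation marks as ordinary characters.
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    return list(reader)


rows = read_pipe_delimited(SAMPLE)
for row in rows:
    print(row)
```

Every field is read as a string, which preserves leading zeros in codes and record numbers.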

What is the tabulated data format?

The tabulated data format contains the data in a data frame, with the columns containing the information from the census (e.g., number of people in a household) and the rows containing the units of geography (e.g., Census Blocks). In addition, the data are organized and separated by geography: the data for all the Census Blocks are together, the data for all the Census Block groups are together, and so on. Unlike the legacy data format, the tabulated data format is readily usable in any spreadsheet software.

Why is the Census Bureau releasing the data in two formats?

Due to the COVID-19 pandemic, the release of PL 94-171 data from the Census Bureau was delayed. A number of states expressed concern about the delayed redistricting timeline and their ability to meet state-mandated deadlines around redistricting and elections. As a result, the Census Bureau announced that it would also release the data in legacy format by August 12, 2021, so that states could process and tabulate the data on their own, if desired, before the release of the tabulated data at the end of September.

How did the Redistricting Data Hub tabulate the data?

Our data team first downloaded the legacy format data from the Census Bureau website. The legacy format data is provided in one zip file per state. Each zip file contains four files: three “segments,” each containing the data for one or more standard redistricting tables, and one “geographic header” file.

The first segment contains the data for Tables P1 (Race) and P2 (Hispanic or Latino, and Not Hispanic or Latino by Race). The second segment contains data for Tables P3 (Race for the Population 18 Years and Over), P4 (Hispanic or Latino, and Not Hispanic or Latino by Race for the Population 18 Years and Over), and H1 (Occupancy Status). The third segment contains Table P5 (Group Quarters Population by Major Group Quarters Type), which was not part of the 2010 PL 94-171 data release.

The pipe-delimited files were imported into Python as data frames, and the columns were renamed. The segments were then joined to each other and to the geographic header file using the logical record number, LOGRECNO.
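The join on LOGRECNO can be sketched as follows. This is not the RDH's actual code: it uses plain dictionaries from the standard library instead of data frames, and the miniature files, their column order, and the short column lists are assumptions for illustration (P0010001 and P0030001 stand in for one field each from the first and second segments).

```python
import csv
import io

# Made-up miniature files; real files have many more columns.
GEO_TEXT = "040|0000001|State A\n050|0000002|County B\n"
SEG1_TEXT = "0000001|100\n0000002|40\n"
SEG2_TEXT = "0000001|80\n0000002|30\n"

# Hypothetical column lists used to "rename" the unnamed fields.
GEO_COLUMNS = ["SUMLEV", "LOGRECNO", "NAME"]
SEG1_COLUMNS = ["LOGRECNO", "P0010001"]
SEG2_COLUMNS = ["LOGRECNO", "P0030001"]


def read_rows(text, names):
    """Read pipe-delimited text and label each field with a column name."""
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    return [dict(zip(names, row)) for row in reader]


def join_on_logrecno(geo_rows, *segments):
    """Attach each segment's fields to the geographic row with the same LOGRECNO."""
    joined = {row["LOGRECNO"]: dict(row) for row in geo_rows}
    for segment in segments:
        for row in segment:
            if row["LOGRECNO"] in joined:  # skip records with no geographic match
                joined[row["LOGRECNO"]].update(row)
    return list(joined.values())


records = join_on_logrecno(
    read_rows(GEO_TEXT, GEO_COLUMNS),
    read_rows(SEG1_TEXT, SEG1_COLUMNS),
    read_rows(SEG2_TEXT, SEG2_COLUMNS),
)
```

Each joined record carries the geographic fields plus the table fields from every segment, keyed by a single record number.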

Next, the team queried the data by 10 different summary levels, each corresponding to a particular unit of geography: state, county, tract, block group, block, congressional district, state legislative district – lower, state legislative district – upper, minor civil division, and census place.
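Querying by summary level amounts to filtering the joined records on their summary level code. The sketch below assumes the code is stored in a field named SUMLEV and uses the three-digit codes as we understand them from the Census Bureau's summary level list, with minor civil divisions taken to be the county subdivision level; please verify the codes against the technical documentation before relying on them. The sample records are made up.

```python
# Assumed summary level codes; confirm against the Census Bureau documentation.
SUMMARY_LEVELS = {
    "state": "040",
    "county": "050",
    "minor civil division": "060",
    "tract": "140",
    "block group": "150",
    "census place": "160",
    "congressional district": "500",
    "state legislative district - upper": "610",
    "state legislative district - lower": "620",
    "block": "750",
}


def select_level(records, level_name):
    """Keep only the records tabulated at the named unit of geography."""
    code = SUMMARY_LEVELS[level_name]
    return [record for record in records if record["SUMLEV"] == code]


# Made-up records for illustration.
sample_records = [
    {"SUMLEV": "040", "NAME": "State A"},
    {"SUMLEV": "050", "NAME": "County B"},
    {"SUMLEV": "050", "NAME": "County C"},
    {"SUMLEV": "750", "NAME": "Block 1000"},
]

counties = select_level(sample_records, "county")
```

Running the same filter once per level yields one table per unit of geography, which matches how the tabulated files are organized.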

Our data team then merged the corresponding geographies with the PL 94-171 shapefiles based on Census GEOIDs. (You can learn more about GEOIDs on the Census Bureau’s website.) This means the data can be used in mapping software without any additional processing.
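As a rough illustration of a GEOID-based merge, the sketch below builds a block-level GEOID by concatenating state (2 digits), county (3), tract (6), and block (4) codes, then matches it against shape attributes. The field names STATE, COUNTY, TRACT, BLOCK, and GEOID20 are assumptions about the input files, the geometry values are placeholders, and all sample values are made up; a real workflow would use a geospatial library rather than plain dictionaries.

```python
def block_geoid(record):
    """Concatenate the component codes into a 15-character block GEOID."""
    return record["STATE"] + record["COUNTY"] + record["TRACT"] + record["BLOCK"]


def attach_population(shapes, records, field="P0010001"):
    """Copy one data field onto every shape whose GEOID matches a record."""
    by_geoid = {block_geoid(record): record for record in records}
    merged = []
    for shape in shapes:
        match = by_geoid.get(shape["GEOID20"])
        if match is not None:
            merged.append({**shape, field: match[field]})
    return merged


# Made-up block record and shape attributes for illustration.
block_records = [
    {"STATE": "99", "COUNTY": "001", "TRACT": "000100", "BLOCK": "1000", "P0010001": "25"},
]
shapes = [
    {"GEOID20": "990010001001000", "geometry": "<polygon placeholder>"},
    {"GEOID20": "990010001001001", "geometry": "<polygon placeholder>"},
]

merged = attach_population(shapes, block_records)
```

Keeping the codes as strings matters here, because converting them to integers would drop leading zeros and break the match.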

Finally, the tabulated data were exported in CSV and shapefile formats.

How did the RDH validate the tabulation?

One of our data partners also tabulated the legacy format data. We then compared population totals for every state, at multiple levels of geography.
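A comparison of this kind could look like the sketch below, which checks two independent tabulations against each other level by level and reports any geography whose total differs or is missing from one side. The function name, the nested dictionary layout, and the sample totals are all hypothetical; the article does not describe the exact procedure used.

```python
def compare_totals(ours, theirs):
    """Return {geoid: (our_total, their_total)} for every disagreement."""
    mismatches = {}
    for geoid in sorted(set(ours) | set(theirs)):
        our_total = ours.get(geoid)      # None if the geography is missing
        their_total = theirs.get(geoid)
        if our_total != their_total:
            mismatches[geoid] = (our_total, their_total)
    return mismatches


# Made-up population totals keyed by level, then by GEOID.
our_tabulation = {
    "state": {"99": 65},
    "county": {"99001": 40, "99003": 25},
}
partner_tabulation = {
    "state": {"99": 65},
    "county": {"99001": 40, "99003": 26},
}

report = {
    level: compare_totals(our_tabulation[level], partner_tabulation[level])
    for level in our_tabulation
}
```

An empty dictionary for a level means the two tabulations agree on every geography at that level.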

Please note that we have not validated against the official data used by your state’s redistricting body or bodies. Some states reallocate incarcerated persons or exclude non-permanent residents from the PL 94-171 data file for redistricting. Other states may make additional modifications.

Does my state require use of decennial census data for redistricting?

The National Conference of State Legislatures has put together a guide on the subject; you can see whether your state requires decennial census data for legislative and/or congressional redistricting on their Redistricting and Use of Census Data page.

What fields are included in the PL file?

We have organized the Fields and Descriptions into a table by referencing the 2020 Census State Redistricting Data (Public Law 94-171) Summary File Technical Documentation.

Is the data on your website the official dataset used for redistricting in my state?

The RDH hosts tabulated PL 94-171 datasets, which are being used in most states. We also host adjusted datasets for most states that require adjustments. We encourage users to check directly with their state or another resource to ensure the correct dataset is being used.

Adjusting States

These states have a constitutional or statutory requirement to adjust the PL 94-171 data prior to redistricting. The wording listed here is a reference to each state’s constitution or statutes; we encourage you to consult the state’s constitution or statutes directly.

States and Adjustment Wording

Do you have more questions?

Our help desk team can answer your questions about redistricting data and the redistricting process. Send a message and they will respond within one business day!