PL 94-171 dataset
- What is the PL 94-171 dataset?
- What is the legacy data format?
- What is the tabulated data format?
- Why is the Census Bureau releasing the data in two formats?
- How did the Redistricting Data Hub tabulate the data?
- How did you validate the tabulation?
- What fields are included in the PL file?
- Is the data on your website the official dataset used for redistricting in my state?
What is the PL 94-171 dataset?
The PL 94-171 dataset is the Redistricting Data File created by the U.S. Census Bureau for use by the states in redistricting based on the decennial census. In the remainder of this article, “the data” refers to the PL 94-171 dataset.
The data will be released by the Census Bureau in two formats: the legacy data format, and the tabulated data format.
What is the legacy data format?
The legacy data format is a pipe-delimited text file, meaning the data are in rows and the values in each row are separated by “|”. A single file provides the information contained in the decennial census at many levels of geography, from the state down to the census block. The census block is the smallest unit of geography and the one most commonly used as the basis for legislative and congressional redistricting.
What is the tabulated data format?
The tabulated data format contains the data in a data frame, with the columns containing the information from the census (e.g., number of people in a household) and the rows containing the units of geography (e.g., census blocks). In addition, the data are organized and separated by geography: the data for all the census blocks are together, the data for all the census block groups are together, and so on. Unlike the legacy data format, the tabulated data format is readily usable in any spreadsheet software.
Why is the Census Bureau releasing the data in two formats?
Statutorily, the PL 94-171 redistricting data should be released by March 31, 2021. Due to the COVID-19 pandemic, however, this deadline was pushed back to September 30, 2021. In response, a number of states expressed concern about the delayed redistricting timeline and their ability to meet state-mandated deadlines around redistricting and elections. As a result, the Census Bureau announced that it would also release the data in legacy format by August 12, so that states could process and tabulate the data on their own, if desired, before the release of the tabulated data at the end of September.
How did the Redistricting Data Hub tabulate the data?
Our data team first downloaded the legacy format data from the Census Bureau website. The legacy format data is provided in one zip file per state. Each zip file contains four files: three “segments,” each containing the data for one or more standard redistricting tables, and one “geographic header” file.
The first segment contains the data for Tables P1 (Race) and P2 (Hispanic or Latino, and Not Hispanic or Latino by Race). The second segment contains data for Tables P3 (Race for the Population 18 Years and Over), P4 (Hispanic or Latino, and Not Hispanic or Latino by Race for the Population 18 Years and Over), and H1 (Occupancy Status). The third segment contains Table P5 (Group Quarters Population by Major Group Quarters Type), which was not part of the 2010 PL 94-171 data release.
The pipe-delimited files were imported into Python as data frames, and the columns were renamed. The segments were then joined to each other and to the geographic header file using the logical record number (LOGRECNO).
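The following is a minimal sketch of this kind of import and join, not the actual tabulation code. It uses only the Python standard library rather than a data frame library, and the sample text, the shortened column lists, and the helper names `read_pipe_delimited` and `join_on_logrecno` are all illustrative assumptions. Real legacy files have many more columns; the column order should be taken from the Census Bureau's technical documentation.

```python
import csv
import io

# Made-up stand-ins for one segment file and the geographic header file.
# The column lists are shortened, assumed layouts for illustration only.
SEGMENT_1 = "PLST|XX|000|01|0000001|1000\nPLST|XX|000|01|0000002|250\n"
GEO_HEADER = "PLST|XX|040|0000001|XX\nPLST|XX|050|0000002|XX001\n"

SEGMENT_COLUMNS = ["FILEID", "STUSAB", "CHARITER", "CIFSN", "LOGRECNO", "P0010001"]
GEO_COLUMNS = ["FILEID", "STUSAB", "SUMLEV", "LOGRECNO", "GEOCODE"]


def read_pipe_delimited(text, columns):
    """Parse headerless pipe-delimited text into a list of dicts."""
    reader = csv.reader(io.StringIO(text), delimiter="|")
    return [dict(zip(columns, row)) for row in reader]


def join_on_logrecno(geo_rows, segment_rows):
    """Attach each segment row to the geo row with the same LOGRECNO."""
    by_logrecno = {row["LOGRECNO"]: row for row in segment_rows}
    joined = []
    for geo in geo_rows:
        segment = by_logrecno.get(geo["LOGRECNO"])
        if segment is not None:
            # Geo fields win on shared column names such as FILEID.
            joined.append({**segment, **geo})
    return joined


geo_rows = read_pipe_delimited(GEO_HEADER, GEO_COLUMNS)
segment_rows = read_pipe_delimited(SEGMENT_1, SEGMENT_COLUMNS)
joined_rows = join_on_logrecno(geo_rows, segment_rows)
```

LOGRECNO is kept as a string here so that its leading zeros survive the join.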
Next, our data team queried the joined data by 10 different summary levels, each corresponding to a particular unit of geography: state, county, tract, block group, block, congressional district, state legislative district – lower, state legislative district – upper, minor civil division, and census place.
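As an illustration of querying by summary level, the sketch below groups rows by the value of a summary level column. The `SUMLEV` column name, the three-digit codes, and the helper `split_by_summary_level` are assumptions based on common Census conventions, and the rows are made up. Please verify the codes against the technical documentation before relying on them.

```python
# Assumed summary level codes for the ten geographies listed above.
SUMMARY_LEVELS = {
    "state": "040",
    "county": "050",
    "minor civil division": "060",
    "tract": "140",
    "block group": "150",
    "place": "160",
    "congressional district": "500",
    "state legislative district - upper": "610",
    "state legislative district - lower": "620",
    "block": "750",
}


def split_by_summary_level(rows):
    """Group rows by geography name, keeping only the levels listed above."""
    name_by_code = {code: name for name, code in SUMMARY_LEVELS.items()}
    tables = {name: [] for name in SUMMARY_LEVELS}
    for row in rows:
        name = name_by_code.get(row["SUMLEV"])
        if name is not None:
            tables[name].append(row)
    return tables


# Made-up joined rows; "999" stands in for a summary level that is not kept.
rows = [
    {"SUMLEV": "040", "LOGRECNO": "0000001"},
    {"SUMLEV": "050", "LOGRECNO": "0000002"},
    {"SUMLEV": "750", "LOGRECNO": "0000003"},
    {"SUMLEV": "750", "LOGRECNO": "0000004"},
    {"SUMLEV": "999", "LOGRECNO": "0000005"},
]
tables = split_by_summary_level(rows)
```

Each entry in `tables` then corresponds to one geography-specific table, which mirrors how the tabulated format separates blocks, block groups, and so on.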
Our data team then merged the corresponding geographies with the PL 94-171 shapefiles based on Census GEOIDs. (You can learn more about GEOIDs on the Census Bureau’s website.) This means the data can be used in mapping software without any additional processing.
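A GEOID-based merge can be sketched as follows. This is a simplified illustration, not the actual workflow: the shapefile attribute records are represented as plain dictionaries instead of being read with a GIS library, and the `GEOID` field name, the sample values, and the helper `merge_on_geoid` are hypothetical.

```python
def merge_on_geoid(shape_records, tabulated_rows, key="GEOID"):
    """Attach tabulated data to shape records that share the same GEOID.

    Returns the merged records and the GEOIDs that had no tabulated match.
    """
    data_by_geoid = {row[key]: row for row in tabulated_rows}
    merged, unmatched = [], []
    for record in shape_records:
        data = data_by_geoid.get(record[key])
        if data is None:
            unmatched.append(record[key])
        else:
            merged.append({**record, **data})
    return merged, unmatched


# Made-up records. GEOIDs are strings so that leading zeros are preserved.
shape_records = [
    {"GEOID": "01001", "NAME": "Example County A"},
    {"GEOID": "01003", "NAME": "Example County B"},
]
tabulated_rows = [
    {"GEOID": "01001", "P0010001": "1000"},
]
merged, unmatched = merge_on_geoid(shape_records, tabulated_rows)
```

Tracking the unmatched GEOIDs is a design choice that makes it easy to spot geographies present in the shapefile but missing from the tabulation.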
Finally, the tabulated data were exported in CSV and shapefile formats.
How did you validate the tabulation?
One of our data partners also tabulated the legacy format data independently. We then compared population totals between the two tabulations for every state, at multiple levels of geography.
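A comparison of this kind could look like the sketch below. The keying by (state, geography level), the made-up totals, and the helper `compare_totals` are assumptions for illustration; they do not describe the actual validation scripts.

```python
def compare_totals(ours, partner):
    """Return entries whose totals differ or that appear in only one tabulation."""
    mismatches = {}
    for key in sorted(set(ours) | set(partner)):
        our_total = ours.get(key)
        partner_total = partner.get(key)
        if our_total != partner_total:
            mismatches[key] = (our_total, partner_total)
    return mismatches


# Made-up population totals keyed by (state, geography level).
ours = {("XX", "state"): 1000, ("XX", "county"): 1000, ("XX", "block"): 1000}
partner = {("XX", "state"): 1000, ("XX", "county"): 990}
mismatches = compare_totals(ours, partner)
```

An empty result means the two tabulations agree on every total that was checked; a missing entry on either side is reported with `None` in its place.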
Please note that we have
Does my state require use of decennial census data for redistricting?
The National Conference of State Legislatures has put together a guide on the subject; you can see whether your state requires decennial census data for legislative and/or congressional redistricting on their Redistricting and Use of Census Data page.
What fields are included in the PL file?
We have organized the Fields and Descriptions into a table by referencing the 2020 Census State Redistricting Data (Public Law 94-171) Summary File Technical Documentation.
Is the data on your website the official dataset used for redistricting in my state?
Only the official redistricting body or bodies in your state will maintain the official data used for redistricting. Your state may or may not make that data publicly available.
We have not modified the data to meet any state-specific requirements for adjusting census data, and thus we cannot guarantee that the PL 94-171 data hosted on the website is identical to the data used by the official redistricting body or bodies in your state.
States that will not host official redistricting datasets
These states have indicated that they are using the census data without modification, and will not be hosting a downloadable version of data on their website.
- New Hampshire
- North Dakota
These states have a constitutional or statutory requirement to modify the PL 94-171 data prior to redistricting. What we provide is only a reference to the state’s constitution or statutes; we encourage you to consult that state’s constitution or statutes directly.
Do you have more questions?
Our help desk team can answer your questions about redistricting data and the redistricting process. Send a message and they will respond within one business day!