Frequently Asked Questions

Decennial Census Data

What is the timeline for delivery of census redistricting data?

Typically, the U.S. Census Bureau reports final census numbers to the President by December 31, and the data are released to the states by March 31 of the following year. Due to the pandemic, however, data collection was extended and the Census Bureau did not meet its December 31 deadline. Instead, the apportionment data were delivered to the President on April 26, 2021. The redistricting data, sometimes referred to as the PL 94-171 or PL data, were released on August 12, 2021 in legacy format. RDH tabulated these data within days and made them available for download a few weeks before the Census Bureau released the data on September 16.

How did the delay in receiving census data affect redistricting?

There are three ways in which the delay in receiving census data might have affected the redistricting process in your state:

  1. Some states have deadlines for passing maps that assume receipt of the data by April 1. This meant those states had less time to redistrict or had to change their deadlines.
  2. All states have filing deadlines for their primary elections. Redistricting must be completed in order for candidates to know what district they are running to represent. Delays in passing maps could affect candidate decision making, or require rescheduling the primary to a later date.
  3. Some legislatures adjourn in summer or even spring. This means that redistricting took place during a special session, rather than the normal legislative session.

For more information on the effects of the delay in releasing census data, see these reports from the Brennan Center and the National Conference of State Legislatures.

What data / variables are included in the decennial census datasets?

The field names for the 2020 census are available in the Census Bureau’s technical documentation, starting on page 99. You can also view these field names directly on our website.

Which states modify the decennial census (PL) data for redistricting?

Several states modify their data for congressional and/or legislative redistricting by reallocating incarcerated persons. We keep an updated list on the States and Modifications page.

What data / variables are contained in the P1, P2, P3, P4, P5, and H1 tables?

As the Census Bureau states in the technical documentation, population counts for the total population and for the population 18 years and over are presented by race and by Hispanic or Latino origin, and for the total group quarters population by major group quarters type. The fields in the most recent census data are split into six tables, one for each type of data:

  • P1: counts for Race
  • P2: counts for Hispanic or Latino, and Not Hispanic or Latino ethnicities by Race
  • P3: counts for Race for the Population 18 Years and Over
  • P4: counts for Hispanic or Latino, and Not Hispanic or Latino ethnicities by Race for the Population 18 Years and Over
  • P5: counts for Group Quarters Population by Major Group Quarters Type
  • H1: counts related to Occupancy Status for Housing

You can confirm which table a field came from by looking at its first four characters: P00X corresponds to table PX, and H001 corresponds to table H1.
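
As a rough illustration of the prefix rule above, the following Python sketch maps a field name to its table. The function name `table_for_field` is ours and not part of any RDH or Census Bureau tool. The sample field names (such as `P0010001`) are assumed to follow the pattern of a table letter, a three-digit table number, and a cell number, so check them against the metadata of your dataset.

```python
import re

# Assumed field-name pattern: a table letter (P or H), a three-digit table
# number, then the cell number, e.g. "P0010001" or "H0010003".
_FIELD_PATTERN = re.compile(r"^([PH])(\d{3})(\d+)$")


def table_for_field(field_name: str) -> str:
    """Return the table label (e.g. "P1" or "H1") for a PL-style field name."""
    match = _FIELD_PATTERN.match(field_name.strip().upper())
    if match is None:
        raise ValueError(f"Not a recognized PL field name: {field_name!r}")
    letter, table_number, _cell = match.groups()
    # "001" -> 1, so "P001..." becomes "P1".
    return f"{letter}{int(table_number)}"


for name in ["P0010001", "P0040002", "H0010003"]:
    print(name, "->", table_for_field(name))
```

Column names that do not fit the pattern, such as geographic identifiers, raise a `ValueError` rather than being assigned to a table.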

Other Census Data

What data / variables are included in the Citizen Voting Age Population datasets?

CVAP Special Tabulation data is available for 2010-2021 at the block, block group, census tract, county, and state level in both SHP and CSV format. Additional CVAP data is available at the Place, SLDU, SLDL, MCD, and Congressional Districts level for 2018-2020. Fields were modified to match OMB race categories as used in voting rights analysis. The CVAP data contains estimates by race (for Non-Hispanic/Latino) for total population (not available at the block group or tract levels), voting age population (not available at the block group or tract levels), citizen voting age population, and citizen population. For more information on field names, see the metadata of your desired dataset.

What data / variables are included in the American Community Survey datasets?

Select fields from the American Community Survey 5-year estimates are available for 2010 to 2021 at the block group, census tract, county, and state level geographies in SHP and CSV format where applicable. The ACS collects a wide range of data, and on our site we host population totals by race and estimates for language spoken at home (not available at the block group level) through 2020. In 2021, we added a number of variables on income, education, poverty, citizenship, and language, and plan to continue including these fields in future ACS data releases. For more information on field names, see the metadata of your desired dataset.

Election Results and Precinct Boundaries

What data do you validate, and how?

For the 2020 and earlier precinct and election result files on our site, we attempted to replicate each dataset using the documentation provided by our data partners. A full description of this process can be found in our election results and precinct boundaries article.

Where can I find state election results or precinct shapefiles?

Currently, the RDH hosts precinct and election results files on its website. The majority of 2016, 2018, and 2020 files come from VEST, but there are a handful of files from MGGG and PGP. The RDH team is currently joining these data in key states for 2022 and beyond. If for some reason we are not hosting an election result file that you are interested in, it may be available through MEDSL or Open Elections. If there is a precinct shapefile you are interested in not posted on the site, it may be in the pipeline – please contact our Help Desk to learn more.

Incumbent Address Data

How were addresses for the incumbent address data file produced?

First, winners of state legislative general elections were pulled from the State Legislative Election Returns database, updated through the November 2020 elections. Legislators from any legislative chamber that had at least some of its seats filled by general elections held before the November 2020 elections were then compared to state legislative websites’ current lists of state legislators to find “partial term” legislators. Names were then matched with names and addresses in the L2 voter file data, through 3 rounds of matching:

  1. Dr. Klarner looked for matches based on last name to cast a broad net. This included non-hyphenated compound last names, hyphenated last names, and the component parts of such names. In some cases this resulted in many possible matches.
  2. Potential matches were winnowed down, still using fairly broad match criteria. The most common match criteria in round 2 were first name, middle name, and suffix. Because many legislators go by their middle name informally, voters were also added to the group of possible matches with the legislator if their middle names matched the first name of a legislator.
  3. Additional information was used to narrow down the possible matches to one match for the remaining match failures. Codes in the field “matchcode” identify what variables were used to identify a match.
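
The first two rounds described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions, not the code actually used to build the file. The `Person` record, the function names, and all sample names are hypothetical, and round 2 here checks only the first-name and middle-name rules, leaving out suffixes.

```python
import re
from typing import NamedTuple


class Person(NamedTuple):
    first: str
    middle: str
    last: str


def last_name_variants(last: str) -> set[str]:
    """Component parts of a surname, plus its hyphenated and spaced forms."""
    parts = [p for p in re.split(r"[-\s]+", last.lower().strip()) if p]
    variants = set(parts)
    if len(parts) > 1:
        variants.add("-".join(parts))
        variants.add(" ".join(parts))
    return variants


def round_one(legislator: Person, voters: list[Person]) -> list[Person]:
    """Broad net: keep voters who share any last-name variant."""
    wanted = last_name_variants(legislator.last)
    return [v for v in voters if last_name_variants(v.last) & wanted]


def round_two(legislator: Person, candidates: list[Person]) -> list[Person]:
    """Keep voters whose first OR middle name equals the legislator's first name."""
    first = legislator.first.lower()
    return [v for v in candidates if first in (v.first.lower(), v.middle.lower())]


# Entirely made-up records for illustration.
legislator = Person("Pat", "", "Garcia-Lopez")
voters = [
    Person("James", "Pat", "Garcia"),
    Person("Pat", "Lee", "Lopez"),
    Person("Maria", "", "Garcia Lopez"),
    Person("Pat", "", "Nguyen"),
]
candidates = round_two(legislator, round_one(legislator, voters))
print(candidates)
```

In this toy example two candidates survive round 2, which is the situation where round 3 would bring in additional information to settle on a single match.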

For more information about this methodology, and for questions regarding the coding of race and gender of incumbents, email info@redistrictingdatahub.org.

What data / variables are included in the incumbent address datasets?

The incumbent address data includes:

  • Full name of 2020 incumbents
  • Address and other location information
  • Party affiliation
  • Birthdate and age
  • Race and gender

Population Projections

How did HaystaqDNA generate the population projections hosted on your website?

The RDH hosts population projections generated by HaystaqDNA at both the block and block group levels. These projections were generated in six steps. A full breakdown of the steps can be found on the documentation pages listed below. The most important thing to note is that these projections were created before the August release of 2020 census data and are thus based on population data from the 2019 ACS 5-year estimates.

What data / variables are included in the population projections datasets?

HaystaqDNA produced 2020 – 2030 population projections aggregated to the 2010 and 2020 census block and block group levels split into P1 and P2 fields of the PL redistricting file. They are available in CSV and SHP formats.

Voter File Data

How does L2 generate racial/ethnic data for voters?

Some states collect this information on their voter registration forms. For the majority of states that do not collect this information – and for the voters that do not provide that information – L2 generates probabilities for being in a particular racial or ethnic group based on an analysis of both given names and surnames. Please see the PDF document of L2’s Ethnic Coding for more details on how racial and ethnic data are generated for voters for whom this information was not self-reported.

What data / variables are included in the voter file datasets?

The information collected varies by state. Typically a voter file will contain a voter’s name, address, and voting history, but it may also contain other information, including but not limited to emails, phone numbers, racial identification, and partisan identification. National Conference of State Legislatures (NCSL) breaks this down state by state. RDH calculates turnout statistics based on voting history and includes that as separate datasets. Commercial voter files include all of the previous information, plus up to hundreds more fields with known and predicted attitudes and behaviors.

Account

I am having trouble registering and/or logging in to the website.

Registering

You can register an account on our website at https://redistrictingdatahub.org/register/. The most common registration problem is mistyping the email address and therefore not receiving (and confirming) the verification email. Please type your email address carefully. Please email info@redistrictingdatahub.org if you encounter any issues registering or logging into your account.

You can also watch a video demonstration that walks you through the process of signing up for an account and downloading data.

Logging In

If you enter the incorrect password 5 times or more, you will be locked out of your account for 24 hours.

Please wait until the following day before trying to reset your account and log in again.

I am having trouble downloading files.

To download files, you need to have registered and verified your account. If you have not created an account yet, please do so here https://redistrictingdatahub.org/register/. If you are logged into your account and unable to download files, the issue is likely that you did not verify it. When you created your account we sent an email (that may have ended up in your spam folder) to you. If you are unable to find this, please reach out to us at info@redistrictingdatahub.org and we will resend the email.

From our homepage, click Data, Download Data, scroll down to the State Menu and click the state you are interested in. From that state page, scroll down to the Individual Data Downloads section, click the file you wish to download, and on the next screen select the format you’d like to download the file in (usually csv or shp). On this page, you can click Open Metadata to get more information on the particular file you’re interested in. Finally, after selecting your preferred file format, you’ll click to agree to our terms and conditions and then click Download.

You can also watch a video demonstration that walks you through the process of signing up for an account and downloading data.

How do I verify my account?

Upon registering for an account, we send our users a verification email to make sure they own the email they say they do. This email, which may have ended up in your spam folder, has a link which you need to click to verify your account. If you’re having trouble finding this email, please reach out to us at info@redistrictingdatahub.org and we can send it again.

How can I download multiple datasets at one time?

  1. In order to download data in bulk using our API, you must first create an account.
  2. After you have verified your account, you can request API access by filling out this request form. Approval for API access depends on identity verification and your description of intended use, in order to ensure the data will not be used to harm others or engage in gerrymandering.
  3. Use of the API requires Jupyter and Python on your computer. You can download Jupyter for free and download Python for free. You can then access the API from our GitHub.
  4. You can also watch a short tutorial that walks you through this process.

General

Where can I find political donor data?

Donor Data

At present we do not host any political donor data on the Redistricting Data Hub website. If you are interested in this data, there are a few organizations that collect this data, including the FEC, Open Secrets, and Follow the Money.

What are the different geographies you host data at?

The five main census geographies you’ll encounter on the site are (in increasing order of size): blocks, block groups, tracts, counties, and states. In addition to these geographies, selected data are also available at the Congressional District, Census Place, State Legislative District, Voting Tabulation District (VTD), ZIP Code Tabulation Area (ZCTA, similar to ZIP codes), and American Indian/Alaska Native/Native Hawaiian Area (AIANNH) levels. Our precinct and election results datasets are at the precinct level, which is separate from census geographies.
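
Because the five main census geographies nest inside one another, a census block’s 15-digit GEOID contains the identifiers of all of its parent geographies: 2 digits for the state, 3 for the county, 6 for the tract, and 4 for the block, where the first block digit is the block group. The short Python sketch below splits a block GEOID along those boundaries. The function name `parent_geoids` and the sample GEOID are illustrative only.

```python
def parent_geoids(block_geoid: str) -> dict[str, str]:
    """Split a 15-digit census block GEOID into the GEOIDs of its parents."""
    if len(block_geoid) != 15 or not block_geoid.isdigit():
        raise ValueError("Expected a 15-digit census block GEOID")
    return {
        "state": block_geoid[:2],         # 2-digit state FIPS code
        "county": block_geoid[:5],        # state + 3-digit county code
        "tract": block_geoid[:11],        # county + 6-digit tract code
        "block_group": block_geoid[:12],  # tract + first digit of the block
        "block": block_geoid,             # tract + 4-digit block code
    }


# Made-up GEOID used only to show the slicing.
print(parent_geoids("060371234561001"))
```

Keeping GEOIDs as text rather than numbers preserves leading zeros, which matters for states whose FIPS code begins with 0.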

What data can be used to identify partisan gerrymandering?

Partisan Gerrymandering

Although there are several ways to measure it, there is no commonly accepted metric for measuring partisan gerrymandering. Nonetheless, the data required to calculate these metrics is often the same: election results and/or voter file data. Election results can indicate the partisan lean of a proposed district, based on how people in that district have voted in past elections. Since not everyone who is registered votes, voter file data are a different indicator of partisan lean, one that is based on a relatively larger segment of the population in a proposed district.
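
As one simple illustration of using election results to gauge the partisan lean of proposed districts, the Python sketch below adds up precinct-level votes by district and computes the two-party vote share. It is a minimal example, not a gerrymandering metric in itself. The keys `district`, `dem_votes`, and `rep_votes` and all vote counts are made up for illustration and do not correspond to field names in our datasets.

```python
from collections import defaultdict


def two_party_share_by_district(precincts):
    """Return each district's Democratic share of the two-party vote."""
    totals = defaultdict(lambda: [0, 0])  # district -> [dem, rep]
    for p in precincts:
        totals[p["district"]][0] += p["dem_votes"]
        totals[p["district"]][1] += p["rep_votes"]
    shares = {}
    for district, (dem, rep) in totals.items():
        # Leave the share undefined if no two-party votes were cast.
        shares[district] = dem / (dem + rep) if (dem + rep) else None
    return shares


# Made-up precinct results assigned to two hypothetical proposed districts.
precincts = [
    {"district": "A", "dem_votes": 600, "rep_votes": 400},
    {"district": "A", "dem_votes": 300, "rep_votes": 700},
    {"district": "B", "dem_votes": 800, "rep_votes": 200},
]
print(two_party_share_by_district(precincts))
```

In practice, assigning precincts to proposed districts is a separate spatial step, and results from several past elections are usually compared rather than relying on a single contest.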

Where can I find prison population data?

Prison-based gerrymandering

You can access data on the incarcerated (prison) population by using the P5 table of the PL 94-171 census data. This data can be found by searching for “94-171” on any state page, then downloading the data at the geographic level of your choice.
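
Once you have downloaded a PL 94-171 CSV, a tally along the following lines can total the incarcerated population by county. This Python sketch makes several assumptions that you should confirm against the metadata of your file: that the block identifier column is named `GEOID20`, and that the count for adult correctional facilities is in the P5 field `P0050003`. The sample rows are invented.

```python
import csv
import io
from collections import Counter

# Invented sample rows; real files contain many more columns.
SAMPLE_CSV = """GEOID20,P0050001,P0050003
010010201001000,0,0
010010201001001,120,115
010030101002000,40,0
"""


def adult_correctional_by_county(csv_text, geoid_col="GEOID20", field="P0050003"):
    """Sum an assumed P5 correctional-facility field by county."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        county = row[geoid_col][:5]  # state (2 digits) + county (3 digits)
        totals[county] += int(row[field])
    return dict(totals)


print(adult_correctional_by_county(SAMPLE_CSV))
```

To read from a downloaded file instead of the in-memory sample, pass the file’s contents as `csv_text`, and adjust the column names if your dataset uses different ones.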

What data can be used to identify racial gerrymandering?

Racial gerrymandering

Racial gerrymandering is prohibited under the Voting Rights Act (VRA) as well as the Fourteenth Amendment. Section 2 of the VRA prohibits minority vote dilution, which is typically assessed by performing what is known as a racial bloc voting or racially polarized voting (RPV) analysis. RPV analyses require data about election results and the race of voters. In some states, the race of voters is collected during registration and is thus known. Voter file data from commercial vendors, such as the voter file data we host, also include predicted race and ethnicity for voters, generated by what is known as Bayesian Improved Surname Geocoding (BISG). In addition, the race of potentially eligible voters can be found in the citizen voting age population (CVAP) data, which is derived from the American Community Survey (ACS) data.

Do you have more questions?

Our help desk team can answer your questions about redistricting data and the redistricting process. Send a message and they will respond within one business day!