American Community Survey - Redistricting Data Hub

Overview

The American Community Survey (ACS) is a survey conducted by the US Census Bureau to collect information about households’ social, demographic, economic, and housing characteristics. Data is collected continuously throughout the year, then combined and reported as 1-year or 5-year estimates. The ACS is important because it collects more detailed information than the Decennial Census, which is useful for determining how communities are shifting. The statistics generated by the ACS are used by federal, tribal, state, and local officials, as well as the private sector.

What are the differences between the ACS and the Decennial Census?

The U.S. Decennial Census occurs every ten years, in years ending in 0. It is a short series of questions collecting information about every household. Between 1970 and 2000, there was a section of the Decennial Census called the long form, which the Census Bureau sent to a sample of about one sixth of households in the country. As the name suggests, it was a much more detailed survey than the census short form that all households filled out, and it collected information about household demographics, economic status, and housing characteristics.

In 2005, the Census Bureau launched the American Community Survey (ACS) to replace the long form, and the 2010 Decennial Census was the first year since 1970 without the long form. Each year, the Census Bureau sends questionnaires to a sample of the population (about 3.5 million addresses) to collect data and generate estimates on a variety of population and household characteristics. Here is the 2021 ACS questionnaire and an explanation for why the Census Bureau asks each question. Data collection for the Decennial Census typically occurs between March and June (every ten years), whereas collection for the ACS happens almost every day of the year. Overall, the Decennial Census and the ACS are both important for collecting information about the US population, and they serve different purposes.

What is the sampling process?

There are two groups involved in the ACS: housing unit (HU) addresses and residents of group quarters (GQ) facilities. Addresses are found using the Master Address File, which is a nationwide file that obtains its data from the U.S. Postal Service, local governments, and Census Bureau employees in the field.

About 3.5 million housing units (HU) are sampled each year from every county and county-equivalent in the United States, including Washington D.C. and Puerto Rico.

How are the data collected?

The US Census Bureau collects ACS data by internet, mail, telephone, and in person. There are multiple steps to data collection if selected households do not fill out the survey right away.

The Census Bureau sends a pre-notice letter telling residents that they will receive instructions to fill out the ACS in a few days. Next, an initial mail package is sent including an envelope informing residents that the ACS is required by law. This mailing also includes a cover letter, an instruction card for responding via the internet for those who do not want to respond by mail, and a brochure with Frequently Asked Questions. A toll-free number is provided to answer the survey in English or another language. A replacement questionnaire may be sent as well as up to three reminder postcards until a month after the initial mailing.

If respondents do not fill out the survey, the Census Bureau may select them for a personal visit. However, it is expensive to send Census Bureau employees to nonresponse residences, so only about a third of nonrespondent housing units are selected for in-person interviews. The proportion of nonrespondent housing units selected for in-person interviews depends on their predicted response rates; in areas with lower predicted response rates, a larger proportion of nonrespondent housing units are selected. During the personal visit, if the resident refuses to respond to the ACS, there will be no further attempts.

In addition to housing units, a sample of people living in group quarters facilities such as colleges, nursing homes, and prisons is also selected to complete the ACS.

How are the data processed?

Imputation

Imputation is the process of replacing missing or invalid values with estimates. In the ACS, blank fields come from respondents refusing to answer a question or marking “don’t know.” The Census Bureau completes the imputation process on a state-by-state basis. There are two methods for filling in missing data: assignment and allocation. Assignment uses other data provided by the same respondent to estimate missing answers. It also uses data reported by other household members to fill in blank data or to correct inconsistent data. Allocation finds other respondents in the sample with similar answers and uses their responses to estimate the missing data. Any responses filled by imputation are flagged by the Census Bureau. The Census Bureau provides annual allocation rates for each topic on the ACS. The overall housing allocation rate and the overall person allocation rate were slightly higher in 2020 than in the previous four years, at 5.2 and 12.3 percent, respectively.
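
The difference between assignment and allocation can be illustrated with a toy example. This is only a sketch and not the Census Bureau's actual procedure: the record fields (birth_year, age, education, income), the reference year, and the rule for choosing a similar respondent are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

SURVEY_YEAR = 2020  # hypothetical reference year for this sketch


@dataclass
class Person:
    birth_year: Optional[int]
    age: Optional[int]
    education: str
    income: Optional[int]
    imputed: Set[str] = field(default_factory=set)  # fields filled by imputation


def assign_age(person: Person) -> None:
    """Assignment: fill a blank using the same respondent's other answers."""
    if person.age is None and person.birth_year is not None:
        person.age = SURVEY_YEAR - person.birth_year
        person.imputed.add("age")  # flag the imputed value


def allocate_income(person: Person, sample: List[Person]) -> None:
    """Allocation: borrow the answer of a similar respondent (a donor)."""
    if person.income is not None or person.age is None:
        return
    # Assumed similarity rule: same education, reported income, closest age.
    donors = [
        d for d in sample
        if d is not person
        and d.income is not None
        and "income" not in d.imputed
        and d.age is not None
        and d.education == person.education
    ]
    if not donors:
        return  # no suitable donor; this sketch leaves the field blank
    donor = min(donors, key=lambda d: abs(d.age - person.age))
    person.income = donor.income
    person.imputed.add("income")  # flag the imputed value


people = [
    Person(birth_year=1980, age=None, education="BA", income=None),
    Person(birth_year=1982, age=38, education="BA", income=61000),
    Person(birth_year=1950, age=70, education="BA", income=30000),
]
for p in people:
    assign_age(p)
for p in people:
    allocate_income(p, people)
```

In this example the first person's age is assigned from their own birth year, and their income is allocated from the respondent closest in age with the same education. Both values are flagged in imputed.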

Estimates

The Census Bureau releases one and five-year ACS estimates that reflect data collected over a period of time. Because ACS data is collected nearly every day, these estimates are how the Census Bureau reports data collected at different times in one release. This is different from the Decennial Census, which releases data reflecting a single point in time every ten years. Three-year ACS estimates were released between 2005 and 2013 but have been discontinued.

One-year estimates contain 12 months of collected data and are used only for areas with populations greater than 65,000. This population threshold exists because 1-year estimates are based on a relatively small sample. Because the sample is smaller, the margin of error is larger, and therefore 1-year estimates are less reliable than 3-year or 5-year estimates. Beginning in 2014, “1-year Supplemental Estimates” have been produced for areas with at least 20,000 people. These are simplified versions of ACS tables that provide statistics for geographies with mid-sized populations. However, these will not be released for 2020 ACS data.

Five-year estimates contain data for all areas from 60 months of collection. These estimates contain the largest sample size and are therefore the most reliable. However, since they contain data from five years, they are also less current than the 1-year estimates. Block Group data is only available in 5-year estimates. Here is more information on the geographic hierarchy of the ACS. Five-year estimates have been released annually since 2009.
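
The population thresholds described above can be summarized in a short sketch. The function name and product labels are hypothetical, and the sketch ignores year-specific exceptions such as the 2020 releases.

```python
from typing import List

ONE_YEAR_MIN_POP = 65_000       # 1-year estimates: populations greater than this
SUPPLEMENTAL_MIN_POP = 20_000   # 1-year supplemental estimates: at least this many people


def available_estimates(population: int) -> List[str]:
    """Return the ACS estimate products an area of this size would have."""
    products = ["5-year"]  # 5-year estimates cover all areas
    if population >= SUPPLEMENTAL_MIN_POP:
        products.append("1-year supplemental")
    # Boundary handling mirrors the wording in the text ("greater than 65,000").
    if population > ONE_YEAR_MIN_POP:
        products.append("1-year")
    return products


small_town = available_estimates(5_000)
mid_sized_city = available_estimates(30_000)
large_county = available_estimates(100_000)
```

Here small_town only has 5-year estimates, mid_sized_city adds the supplemental tables, and large_county has all three.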

Estimates are created using weights, which are designed to compensate for inconsistencies between the sample and the full population and variation in sampling rates across different areas, including differences due to nonresponding households. Weighting does not change a person’s answer to the ACS, only how many times their answer is repeated for calculating a statistic.
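
As a rough illustration of how weights enter a statistic, the sketch below computes a weighted total and a weighted share. The records, weights, and field names are made up for the example. Actual ACS weights come from the Census Bureau's own weighting process.

```python
# Each record is one respondent; "weight" is how many people the response
# stands for when a statistic is calculated (illustrative values only).
respondents = [
    {"owns_home": True, "weight": 120},
    {"owns_home": False, "weight": 300},
    {"owns_home": True, "weight": 80},
]


def weighted_total(records, key):
    """Sum the weights of respondents whose answer to `key` is true."""
    return sum(r["weight"] for r in records if r[key])


def weighted_share(records, key):
    """Weighted total divided by the sum of all weights."""
    return weighted_total(records, key) / sum(r["weight"] for r in records)


owners = weighted_total(respondents, "owns_home")       # 120 + 80
owner_share = weighted_share(respondents, "owns_home")  # 200 / 500
unweighted_share = sum(r["owns_home"] for r in respondents) / len(respondents)
```

The answers themselves are unchanged. Two of three respondents own their home, but because the non-owner carries a larger weight, the weighted share of owners is lower than the unweighted share.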

2020 ACS Data and COVID-19

In 2020, the ACS received only about two-thirds of the responses it typically receives in a year. In addition, there was a high nonresponse bias: residents who did not respond to the survey had significantly lower income and education and were less likely to own their home than people who did respond. Typically, adjustments can be made in processing the data to give more weight to underrepresented groups to account for nonresponse bias. However, because the 2020 cycle was particularly unusual, the issues could not be adequately addressed to meet the Census Bureau’s internal quality standards for a 1-year estimate. In July of 2021, the Census Bureau announced that standard 1-year estimates would not be released for the 2020 ACS. Instead, experimental estimates were released. While the 2020 Decennial Census postponed Nonresponse Followup to be able to complete the process fully, the ACS was not able to do so.

For the 2020 experimental estimates, the Census Bureau determined a new set of weights using data from surveys and administrative records and applied them to the 2020 ACS data collected during the pandemic. The weighting technique for the 2020 experimental estimates is called Entropy Balancing, which is designed to handle additional inputs to the weighting model. Administrative data added to the weighting algorithm came from 1040 and 1099 forms from the IRS and from demographic and program participation data from the Social Security Administration (SSA), such as retirement and disability information. However, the Census Bureau discourages using the 2020 ACS 1-Year Experimental Estimates as a replacement for the usual 1-year estimates. These estimates were released on November 30, 2021.

As a result of the pandemic, the response rate for the 2020 ACS was much lower than in a typical year, at 71.2 percent. For reference, the response rate between 2000 and 2019 ranged from 86 percent to 97.6 percent. The main reason for nonresponse was the “Other” category, at 16.9 percent. Refusal to complete the ACS was the second highest reason, at 8 percent in 2020, compared to 0.8 percent to 4.7 percent over the previous 19 years.

ACS Data Available on the RDH

The ACS includes many questions that are not directly relevant to redistricting (see the 2021 ACS questionnaire). As a result, the RDH's ACS files include only a small selection of the variables. Specifically, the variables we include are population estimates by race and ethnicity and estimates for education, income, poverty, citizenship, and language spoken at home. Language data is not available at the Block Group level due to privacy concerns. More information about the variables can be found in the metadata for each dataset. In addition, the RDH only hosts 5-year estimates. These estimates are available at the state, county, Census Tract, and Block Group levels in CSV and shapefile formats.

Citizen Voting Age Population (CVAP)

Citizen Voting Age Population datasets contain estimates of the total population of citizens in the US who are eligible to vote based on their age, by race and ethnicity. CVAP datasets were created for use in voting rights analysis. The US Census Bureau releases a CVAP special tabulation each year using survey data from the American Community Survey 5-year estimates. The first CVAP special tabulation was released in 2002 using data from the Census 2000 long form. In 2011, the Census Bureau began publishing CVAP special tabulations every year using data from the ACS.

CVAP data is published down to the Block Group level. However, the RDH developed a methodology for disaggregating CVAP data to the Block level, and the disaggregated data is available for download. For more information about the methodology, refer to the metadata of the dataset you downloaded. The CVAP data files hosted by the RDH are posted at the state, Congressional District, State Legislative District (Upper House and Lower House), County, Place, and Block Group levels for all states, and at the Minor Civil Division (MCD) level for states where applicable. Census Places are statistical geographic entities representing locally recognized communities that do not have legally defined boundaries. MCDs are legally defined county subdivisions. They are commonly known as towns, townships, and districts. There are 12 states with CVAP data at the MCD level, and in these states, MCDs usually serve as general-purpose local governments.

In addition to CVAP, the RDH also hosts Voting Age Population (VAP) data, which counts all people in the US ages 18 or older. VAP data is at the Census Tract level.

Original CVAP fields

Refer to the Census documentation for more information on CVAP categories.

Total
Not Hispanic or Latino
American Indian or Alaska Native Alone (Not Hispanic or Latino)
Asian Alone (Not Hispanic or Latino)
Black or African American Alone (Not Hispanic or Latino)
Native Hawaiian or Other Pacific Islander Alone (Not Hispanic or Latino)
White Alone (Not Hispanic or Latino)
American Indian or Alaska Native and White (Not Hispanic or Latino)
Asian and White (Not Hispanic or Latino)
Black or African American and White (Not Hispanic or Latino)
American Indian or Alaska Native and Black or African American (Not Hispanic or Latino)
Remainder of Two or More Race Responses (Not Hispanic or Latino)
Hispanic or Latino

OMB Categories as Used in Our CVAP Datasets

These three-letter abbreviations are used in the field names of the CVAP data for every year. In our data this would look like CVAP_TOT19, C_HSP16, ALL_ASW17, or VAP_WHT18, where CVAP is Citizen Voting Age Population Estimate, C is Citizen Estimate, ALL is total estimate (for that racial/ethnic category), and VAP is Voting Age Population. The two digits at the end indicate the data year.
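
A field name of this form can be split into its parts programmatically. The sketch below assumes every field follows the prefix, underscore, three-character abbreviation, two-digit year pattern seen in the examples above, and that the two digits refer to a year in the 2000s. The function and class names are hypothetical.

```python
import re
from typing import NamedTuple

# Prefix meanings as described in the text above.
MEASURES = {
    "CVAP": "Citizen Voting Age Population Estimate",
    "C": "Citizen Estimate",
    "ALL": "Total Estimate",
    "VAP": "Voting Age Population",
}

# Assumed pattern: PREFIX_ABBYY, e.g. CVAP_TOT19 (abbreviations may contain a digit, e.g. 2OM).
FIELD_PATTERN = re.compile(r"^(CVAP|C|ALL|VAP)_([A-Z0-9]{3})(\d{2})$")


class ParsedField(NamedTuple):
    measure: str
    category: str
    year: int


def parse_field(name: str) -> ParsedField:
    """Split a field name such as 'CVAP_TOT19' into measure, category, and year."""
    match = FIELD_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognized field name: {name!r}")
    prefix, category, yy = match.groups()
    # Assumption: two-digit years refer to the 2000s.
    return ParsedField(MEASURES[prefix], category, 2000 + int(yy))


total_cvap_2019 = parse_field("CVAP_TOT19")
hispanic_citizens_2016 = parse_field("C_HSP16")
```

Names that do not fit the assumed pattern raise a ValueError rather than being guessed at.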

#	Abbreviation	Field
1	TOT	Total
2	NHS	Not Hispanic or Latino
3	AIA	American Indian or Alaska Native Alone (Not Hispanic or Latino) + American Indian or Alaska Native and White (Not Hispanic or Latino) + American Indian or Alaska Native and Black or African American (Not Hispanic or Latino)
4	ASN	Asian Alone (Not Hispanic or Latino) + Asian and White (Not Hispanic or Latino)
5	BLK	Black or African American Alone (Not Hispanic or Latino) + Black or African American and White (Not Hispanic or Latino) + American Indian or Alaska Native and Black or African American (Not Hispanic or Latino)
6	NHP	Native Hawaiian or Other Pacific Islander Alone (Not Hispanic or Latino)
7	WHT	White Alone (Not Hispanic or Latino)
8	AIW	American Indian or Alaska Native and White (Not Hispanic or Latino)
9	ASW	Asian and White (Not Hispanic or Latino)
10	BLW	Black or African American and White (Not Hispanic or Latino)
11	AIB	American Indian or Alaska Native and Black or African American (Not Hispanic or Latino)
12	2OM	Remainder of Two or More Race Responses (Not Hispanic or Latino)
13	HSP	Hispanic or Latino
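
Rows 3 to 5 of the table combine several of the original CVAP fields. The sketch below shows that arithmetic. The dictionary keys and the counts are hypothetical and are not the RDH's actual column names or data.

```python
# Illustrative counts for the original (Not Hispanic or Latino) CVAP fields.
original = {
    "aian_alone": 50,
    "asian_alone": 200,
    "black_alone": 400,
    "aian_white": 10,
    "asian_white": 20,
    "black_white": 30,
    "aian_black": 5,
}

# Which original fields are added together for each combined abbreviation,
# following rows 3 to 5 of the table above.
COMBINATIONS = {
    "AIA": ["aian_alone", "aian_white", "aian_black"],
    "ASN": ["asian_alone", "asian_white"],
    "BLK": ["black_alone", "black_white", "aian_black"],
}


def combine(counts):
    """Sum the original fields that make up each combined category."""
    return {
        abbr: sum(counts[part] for part in parts)
        for abbr, parts in COMBINATIONS.items()
    }


combined = combine(original)
```

Note that the American Indian or Alaska Native and Black or African American field contributes to both AIA and BLK, so the combined categories overlap and should not be added together to reach a total.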

For information on how these data are used in disaggregation, refer to any disaggregation README for a dictionary with PL 2020 comparison tables that are used, such as this example README.
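
The RDH's actual disaggregation method is documented in the README and in the code on GitHub. As a general illustration only, one common way to disaggregate a Block Group estimate to Blocks is to split it in proportion to a Block-level count. The sketch below assumes such a weight (for example, a Block's voting age population from the PL 2020 data) is available. The function name, block identifiers, and numbers are hypothetical.

```python
from typing import Dict


def disaggregate(block_group_cvap: float, block_weights: Dict[str, float]) -> Dict[str, float]:
    """Split a Block Group estimate across Blocks in proportion to their weights."""
    if not block_weights:
        return {}
    total_weight = sum(block_weights.values())
    if total_weight == 0:
        # Assumption for this sketch: with no weight anywhere, split evenly.
        share = block_group_cvap / len(block_weights)
        return {block: share for block in block_weights}
    return {
        block: block_group_cvap * weight / total_weight
        for block, weight in block_weights.items()
    }


# Illustrative Block-level weights within one Block Group.
weights = {"block_a": 60, "block_b": 30, "block_c": 10}
block_cvap = disaggregate(500, weights)
```

The Block-level values sum back to the Block Group estimate, which is the main property a proportional split is meant to preserve.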

Visit our GitHub to see the code that processes the CVAP data and performs the calculations.

How are ACS Data Used in Redistricting?

CVAP data are used in the analysis of Voting Rights Act cases. Redistricting plans must be in accordance with Section 2 of the Voting Rights Act, which is designed to prevent minority vote dilution. CVAP data are especially important when looking at whether racially polarized voting (RPV) has occurred in existing districts because they allow experts to better estimate how many potential voters of a certain race or ethnicity are in an area.

Data from the ACS include socioeconomic information that can be used to support claims about the seven Senate Factors, which are used in Voting Rights Act Section 2 cases. These data include income, healthcare status, educational attainment, and poverty rates. ACS data such as languages spoken at home can also be used to support claims about the existence of Communities of Interest.
