American Community Survey Data Products, Data Uses, and Data Needs
One of the main justifications for including a sample of the group quarters (GQ) population in the American Community Survey (ACS) is based on the original vision for the survey, which was that it would serve as a replacement for the census long-form sample. The long-form sample included both institutional and noninstitutional group quarters, and GQ facilities are currently included in the ACS to remain faithful to that goal. However, the cost of collecting data from a hard-to-reach population—such as the residents of group quarters—is higher per interview than the cost of housing unit interviews because of the more complex survey operations required (e.g., higher rates of face-to-face interaction with individual respondents and facility managers). Moreover, an inadequate GQ sample size jeopardizes not only the estimates for the GQ population but also the estimates for the total population in areas where a relatively large number of persons live in GQ facilities.
A fundamental question the panel had to consider was whether there is a demonstrated and sufficiently compelling need for collecting GQ data as part of the ACS. Other national surveys conducted by both government agencies and private research organizations typically exclude the institutional population and the active-duty military population and treat residents of civilian, noninstitutional group quarters as if they were part of the household population. These surveys are representative of the U.S. civilian, noninstitutional population. In addition, several other federal and private survey efforts focus specifically on segments of the GQ population (generally at a national level), raising the question of whether there is any redundancy of effort and overlap with the GQ data collection in the ACS.
To understand data user needs, and in particular the relevance of the GQ data to users of the American Community Survey, the panel sought input from researchers and stakeholders, attempting to identify data users who may have specific programmatic requirements for information about GQ residents. A workshop was held with a broad spectrum of users of the ACS data on December 13, 2010, in Washington, DC. The goal of the meeting was to gain a thorough understanding not only of what data users’ needs are, but also of how the GQ data are used and to discuss enhancements and alternatives to the current ACS design and methodologies. In reviewing data user needs, the panel was also assisted by consultants who were asked to examine federal as well as state and local uses of the ACS GQ data, including uses for funding allocation and to meet programmatic needs.
This chapter discusses the input received from data users and draws on two papers commissioned by the panel:
- “The American Community Survey: A Review of the Universe Requirements in Federal Legislation” by Cynthia M. Taeuber and Rachel Blanchard Carpenter, and
- “The Importance of American Community Survey Data on the Group Quarters Population” by Robert Scardamalia.
ACS DATA PRODUCTS
As discussed, since 2006, the Census Bureau has been publishing annual 1-year ACS estimates for geographic entities with a population of at least 65,000. Three-year ACS estimates for geographic entities with populations of at least 20,000 have been published since 2008. The first release based on 5 years of data collection was published in 2010, with estimates for all statistical, legal, and administrative entities, including areas as small as census block groups. Data from 2005 include only the household population, whereas data beginning with 2006 include both households and group quarters. The 2010 release of 5-year period estimates was based on 1 year of data (2005) that did not include a GQ sample and 4 years of data (2006-2009) that included GQ samples (with the GQ data weighted to reflect a 5-year period estimate window). Beginning with the release of the 2006-2010 estimates in December 2011, all new ACS data products will be based on samples of both households and group quarters for every year included.
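The population thresholds described above can be read as a simple decision rule. The following is a minimal sketch, assuming only the cutoffs stated in this chapter (at least 65,000 for 1-year estimates, at least 20,000 for 3-year estimates, and no minimum for 5-year estimates). The function name and return format are hypothetical and are not part of any Census Bureau tool.

```python
def available_acs_releases(population: int) -> list[str]:
    """Return the ACS period estimates published for an area of this size.

    Thresholds are the ones stated in the chapter; the function itself is
    an illustrative assumption, not a Census Bureau API.
    """
    if population < 0:
        raise ValueError("population must be non-negative")

    releases = []
    if population >= 65_000:
        releases.append("1-year")  # published annually since 2006
    if population >= 20_000:
        releases.append("3-year")  # published since 2008
    releases.append("5-year")      # all areas, down to census block groups
    return releases


# Areas of three hypothetical sizes
for size in (250_000, 40_000, 15_000):
    print(size, available_acs_releases(size))
```

Under this rule, an area with fewer than 20,000 residents depends entirely on the 5-year release, which is where the limits on GQ detail discussed in this chapter matter most.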
ACS data products are expected to evolve on the basis of data needs and feedback from researchers and other data users; Table 3-1 summarizes the current products. Not all releases include all of these products. For example, the 5-year release does not include comparison profiles, state ranking tables, or selected population profiles. Some of the derived data products, such as data and narrative profiles, subject profiles, and geographic comparison tables, are produced only for a subset of the geographic summary levels in the 5-year release.
TABLE 3-1 Main American Community Survey Data Products
| Data Product | Description |
|---|---|
| Data profiles | Provide broad social, economic, housing, and demographic profiles |
| Narrative profiles | Summarize the information in the data profiles using concise, nontechnical text and graphical displays |
| Selected population profiles | Provide broad social, economic, and housing profiles for a large number of race, ethnic, ancestry, and country/region of birth groups |
| Ranking tables | Provide state rankings of estimates across key variables |
| Subject tables | Provide detailed data on a particular topic |
| Detailed tables | Provide access to the most in-depth data available on all topics and geographic areas |
| Geographic comparison tables | Compare other types of geographic areas in addition to states (e.g., counties or congressional districts) for key variables |
| Thematic maps | Interactive, online maps that can be used to display the same estimates available in the geographic comparison tables |
| Summary files | Provide access to the detailed tables through a series of comma-delimited text files on the Census Bureau’s FTP site |
| Public Use Microdata Sample files | Untabulated, anonymized records that contain information collected about individual people and housing units, as well as residents of GQs |
SOURCE: Based on U.S. Census Bureau Data Product Descriptions. Available: http://www.census.gov/acs/www/data_documentation/product_descriptions/.
Selected data tables report a breakdown of the total population into those living in households (often accompanied by additional characteristics) and those living in group quarters (with limited detail). Box 3-1 summarizes the data products that highlight group quarters in the 1- and 3-year releases, and Box 3-2 summarizes the data products that highlight group quarters in the 5-year release.
To illustrate the available tables that include information on the GQ population, Appendix D contains the tables published for the state of Virginia based on the 2005-2009 ACS. Appendix E contains the tables published for the Virginia county of Goochland, which had a population of 16,863 based on the 2000 census. The example of Goochland and the impact of the GQ population on the quality of the estimates in this county are discussed further in later chapters.
BOX 3-1
ACS Tables That Highlight Group Quarters in the 1- and 3-Year Data Releases
Base Tables
- B26001: total GQ population.
- Selected base tables with a single data line for the GQ population (e.g., B09016, household type by relationship).
Subject Tables
- S2601A: characteristics of the GQ population (total population, total GQ population, institutional population, noninstitutional population) at the national, regional, and census division levels.
- S2601B: characteristics of the GQ population by GQ type (total population, total GQ population, adult correctional facilities, nursing facilities, college/university housing) at the national level.
- S2601C: characteristics of the GQ population in the United States (total population, total GQ population) for states that meet a population threshold.
SOURCE: Stern (2010).
BOX 3-2
ACS Tables That Highlight Group Quarters in the 5-Year Data Releases
Base Tables
- B26001: total GQ population for all geographic areas.
- Selected base tables with a single data line for the GQ population (e.g., B09016, household type by relationship).
Subject Tables
- S2601A: characteristics of the GQ population (total population, total GQ population, institutional population, noninstitutional population) at the national, regional, census division, and state levels.
- S2601B: characteristics of the GQ population by GQ type (total population, total GQ population, adult correctional facilities, nursing facilities, college/university housing) at the national level.
SOURCE: Stern (2010).
For now, suffice it to say that relatively little information is released about the GQ population, even based on 5 years of cumulated data. For geographic entities below the state level, no characteristics data or population counts by GQ type are available. In connection with this, it is important to note that the data user workshop and the research conducted by the panel’s consultants took place in the second half of 2010, before the first data release based on 5 years of ACS data. Although information about the Census Bureau’s plans for the 5-year data products had been available prior to the release date, there was some confusion among data users about the level of detail that was going to be available for the GQ populations. Specifically, many users were not aware of the fact that the release of GQ data below the state level would be limited to counts.
DATA USES AND DATA NEEDS
The input from ACS users revealed that although some never use the GQ data (often excluding this population from the population totals before conducting analyses), most perform research that requires information about the characteristics of the total population, which by definition includes both households and group quarters. The allocation of federal program funds is often based on formulas that require information about the total population.
A recent study found that in fiscal year 2008, ACS data or data derived from the ACS were used by 184 federal domestic assistance programs to guide the geographic distribution of $416 billion in funds, representing 29 percent of all federal assistance (Reamer, 2010). Some of the data sets that are derived from the ACS include the Small Area Income and Poverty Estimates, area median income, and fair market rents. The ACS data on international migration feed into the Census Bureau’s Population Estimates Program (PEP), and the journey-to-work data are used by the Bureau of Economic Analysis to determine per capita income and by the Office of Management and Budget to determine statistical area boundaries (Reamer, 2010). The Reamer study’s summary of the main uses of the ACS is shown in Box 3-3.
The panel’s consultants reviewed the federal programs discussed in the Reamer study to identify the programs that use ACS total population estimates in their allocation formulas (as opposed to data limited to the household population only). They found that 133 of the 184 federal programs discussed in the Reamer report use total population estimates from the ACS or based on the ACS in their allocation formulas or to establish eligibility for the distribution of $342 billion in funding.
Appendix F shows the 10 largest federal assistance programs that use estimates that are at least partially based on data from the ACS, along with brief descriptions of the allocation formulas. Although none of these programs requires estimates specific to the population living in group quarters, most are based on estimates of the total population, which by definition includes both households and group quarters. The most frequently used ACS-based data in the formulas include commuting data in the per capita income estimates to define metropolitan and nonmetropolitan status, migration estimates in the population estimates, demographic characteristics, and social characteristics, such as ability to speak English and disability status.
BOX 3-3
Uses of the ACS Data
Public Policy
- ACS data guide the equitable flow of hundreds of billions of dollars in federal domestic assistance across the nation.
- ACS data provide key benchmarks for federal enforcement of civil rights and antidiscrimination laws and court decisions.
- Federal agencies use ACS data to inform the design, implementation, and evaluation of programs and policies in every government realm, such as education, health, housing, transportation, small business development, human services, and environmental protection.
- State and local governments rely on ACS data to make on-the-ground investment decisions across all policy domains.
Economy
- Businesses of all types and sizes use ACS data to identify markets; select business locations; make investment decisions in plant, equipment, and new product development; determine goods and services to be offered; and assess labor markets.
- Nonprofit organizations, such as hospitals and community service organizations, rely on ACS data to better understand and serve the needs of their constituencies.
- ACS data are essential to efforts by state and local governments, chambers of commerce, and public-private partnerships to promote business attraction, expansions, and startups that lead to job creation and a larger tax base.
SOURCE: Reamer (2010).
Most federal funding is distributed at the state level and then further allocated to substate areas by the states themselves. However, some assistance programs send funding directly to substate areas. Appendix G shows the 10 largest programs that involve funds distributed at the substate level. The data used by these programs include total population estimates, commuting data to define metropolitan areas, and income data.
ACS data are also widely used by state and local organizations, including government organizations. At the subnational level, there is also often a need for a better understanding of the GQ population beyond its role as an integral part of the total population. Information about the GQ population is often necessary for an accurate picture of small governmental jurisdictions, whether this population is ultimately removed from the universe of interest for the analysis or kept in as part of the total population. Without the group quarters population, the ACS estimates would not reflect local characteristics accurately (data quality concerns about estimates that do include group quarters notwithstanding), and this was evident in the numbers released for some geographic areas based on the first year of the ACS (2005) that did not include group quarters. Although group quarters are now included in the ACS, researchers said that they do not trust these data for small areas because the margins of error are so large or because known GQ facilities are omitted entirely from estimates for small areas of interest.
The data user workshop and the research conducted by the panel’s consultants revealed that at the state and local levels, some of the most frequent uses of the ACS data representing the total population include policy making, program development and administration, and research. Again, some data users are interested primarily in information about the total population for an area, and they are concerned about the effects of group quarters on the total population characteristics in particular. Many would be satisfied with estimates of the size of the GQ population that are accurate and reliable in small areas, especially if the numbers were available by GQ type, because this would provide clues as to how the presence of different GQ types may affect the total population estimates.
Others, particularly those interested in the population characteristics of smaller jurisdictions, would like to have characteristics data by GQ type, because information about characteristics often loses its meaning when data from different GQ types are combined. GQ types that tend to be large and represent a relatively large proportion of the population in small communities, such as correctional facilities, nursing and other institutional facilities, and student housing, are especially important to many data users. Student housing tends to be of particular interest to data users at the local and county levels. The correct assessment of the student population is of concern because students often divide their residence between their college community (whether they reside in dormitories or off-campus housing) and their parents’ home.1 Due to the nature of the year-round data collection and the residence rules in the ACS, this can lead to some degree of double counting of students.
1 An estimate is that among full-time enrollees of 4-year universities, about 40 percent live in on-campus housing, 42 percent in off-campus housing, and 18 percent with their parents. For 2-year colleges, only about 3 percent are estimated to live in on-campus facilities (National Research Council, 2006).
Living arrangements for the elderly are of particular interest to researchers in various disciplines, because they are likely to become increasingly important as the U.S. population ages and people live longer. The Census Bureau’s major GQ categories sometimes do not reflect the range of changes in living situations for this population. Some data users reported that they used various administrative data that provide information on the characteristics of residents in some GQ types; however, these data are usually limited to basic demographic characteristics.
Users who are interested in one or more GQ types would like additional details or better measures beyond what the ACS is currently able to provide. For example, migration is a recurring topic of interest, which data users thought should be captured more accurately. Another example of a specific need for better measurement is related to workers’ group living quarters, which are within the scope of the ACS but are often missed because of their geographic remoteness and unusual nature, particularly in the case of farm worker housing.
Some of the discussions with data users centered around the difference between institutional and noninstitutional group quarters. Noninstitutional group quarters, such as college dormitories and military quarters, are of particular interest to many data users because they tend to be large, and they are comparable to the household population in the context of many applications of the data. Other data users were more concerned about the representation of smaller group quarters that change status frequently or resemble households so much that they are especially easy to miss as part of the GQ data collection. Data users tend to also have less confidence in the accuracy of the sampling frame in the case of these types of group quarters. Based on these considerations, some data users argued that including noninstitutional group quarters in the ACS should be a priority. However, others would prefer to see more emphasis placed on institutional group quarters, such as correctional facilities and nursing homes, precisely because they differ more from the general population, and excluding them from the total population is likely to affect the population characteristics more significantly.
Although many national surveys collect data on the civilian, noninstitutional population, they generally do not have sample sizes large enough for state-level, let alone for substate-level, analysis. Sources of information about specific GQ types include the periodic censuses of prisons and jails conducted by the Bureau of Justice Statistics, periodic surveys of nursing homes by the National Center for Health Statistics, and information collected by the National Center for Education Statistics on enrollment in colleges and universities. However, these are not potential substitutes for the ACS data, either, because they generally do not provide the same geographic detail as the ACS aims to do (acknowledging that the ACS in its current form does not provide characteristics information about the GQ population below the state level, either).
In some cases, states have extensive information about the populations in group quarters, which prompted a discussion about placing more of the burden of collecting data of particular interest for states at the local level. However, this would represent a prohibitively high expense, without a clear source of funding, for many states. In addition, the lack of a centralized approach to the data collection would also mean that the data would be less consistent across states and therefore potentially less useful to many users. The Census Bureau would also find it more difficult to resume the collection of GQ data after a period of hiatus should funds for a larger sample become available in the future.
Data users indicated that GQ data from the ACS would be particularly useful in informing a variety of local planning decisions if they were available and reliable for small areas. Those who participated in the workshop organized by the panel also discussed data quality issues and were asked to consider the fact that, with the current sample size and design, the ACS is unable to produce detailed, high-quality characteristics data about the GQ population, particularly for small geographic areas. Although not all users were aware of the extent of the data quality and reporting limitations (discussed further in subsequent sections), they understood that compromises would have to be made in order to maintain the GQ data as part of the total population data. Some participants indicated that they were open to statistical solutions, such as modeling the data to provide information about group quarters not in the sample. One specific suggestion was to focus on obtaining accurate counts at the facility level and model the characteristics data for small areas, based on information collected at higher levels of aggregation (i.e., at the state level).
Especially as regards the GQ population, it is clear that the ACS cannot satisfy all the wishes described by data users who provided input to the panel. It is also important to note that, because of the short history of the ACS as well as the range and complexity of other census products, there was some confusion in the data user community about what data are available from the ACS and what products are based on the decennial census or the estimates produced by the PEP. This should become less of a problem as data users become familiar with the first 5-year release from the ACS, but at this stage it is apparent that many data users overestimate the reliability and detail of the ACS GQ data and hence their projected use of them.
CONCLUSIONS ON AMERICAN COMMUNITY SURVEY DATA USES AND DATA NEEDS
Discussions about the GQ population with data users revealed a lack of knowledge, even among experienced users, about what data products would be available for group quarters based on the ACS. The confusion was due not only to the timing of the first 5-year data product release but also to several other factors: (1) the decennial census produces detailed demographic characteristics of GQ residents by GQ type; (2) at higher levels of geography, the ACS is also able to produce detailed data for some group quarters types (e.g., correctional facilities); and (3) some data users are able to access local administrative data on selected GQ facilities (e.g., nursing homes) through licensing systems. Once an understanding of the limitations associated with the GQ data develops among data users, concerns immediately follow about the potential effects on estimates of the total population characteristics in small areas, and the large impact of what was originally perceived by many to be a problem limited to a small population becomes apparent.
Given the limitations of the GQ data that can be published based on the ACS in its current form, the panel carefully considered whether continuing to collect GQ data as part of the ACS is necessary and justified. The review of data uses by the panel’s consultants and the discussions with members of the data user community were by no means a comprehensive or systematic evaluation of all uses or potential uses of the GQ data from the ACS. However, a clear priority emerged from these efforts, which helped inform the panel’s recommendations throughout the report. Specifically, there is little doubt about the importance of incorporating the GQ population into the total population estimates for small areas. There are many data users whose primary interests are in one or more specific GQ types, and they would benefit from more data about the GQ population. However, given the very limited information that can be made available to data users about GQ residents because of the small sample sizes, a more realistic goal that addresses the most pressing need is to ensure that the GQ data are integrated into the estimates of the characteristics of the total population without adversely affecting those estimates, particularly in small areas.
Given that other large-scale national surveys sponsored by the federal government typically limit their study population to housing units and exclude group quarters, the panel initially considered whether it would be possible to envision a similar approach for the ACS. However, it became clear that the ACS fulfills a unique role and meets important needs that no other data collection does in the federal statistical system. The panel thinks that the spirit of the ACS as a program that aims to provide information about the U.S. population—not only those who reside in housing units—for all geographic areas deserves to be preserved. However, improvements to the survey’s design are essential to accomplish the goals set forth for the ACS.
Recommendation 3-1: Data on the characteristics of the total population fulfill an important need, particularly for small geographic areas. The Census Bureau should identify ways of improving the group quarters estimates from the American Community Survey as input to estimates of total population characteristics for small geographic areas.
The reality for the foreseeable future will be that data collection from GQ populations is more resource intensive than data collection from household populations. In subsequent sections of the report, the panel discusses strategies that could improve the quality of GQ estimates from the ACS for small geographic areas. Some of the most cost-effective solutions are likely to involve such alternatives as modeling or imputing some of the GQ data.
The panel’s expectation is that the Census Bureau will be able to improve the GQ estimates and the estimates of the total characteristics that combine the GQ and household populations for small areas. However, if the Census Bureau finds that the American Community Survey cannot satisfy these basic data user needs at an acceptable cost, then the goals of the survey should be reconsidered. Possible solutions could involve dropping some or all GQ types from the ACS and providing users with substitute information, such as from administrative records or from censuses or surveys of GQ types that are periodically fielded by other statistical agencies.
The use of administrative records could be a particularly promising avenue to explore because group quarters have a generally unexploited advantage. By definition, they are “owned or managed by an entity or organization providing housing and/or services for the residents,” meaning that, in nearly every case, systematic records exist about the residents. An evaluation of the 2000 census enumeration of group quarters noted that more GQ questionnaires were completed by relying on administrative records than by any other method, and that administrative data were used particularly frequently in correctional institutions, nursing homes, hospitals, and group homes (U.S. Census Bureau, 2003). The analyses are not yet in, but one can anticipate that this situation continued, and possibly became even more widespread, in the 2010 census enumeration. Some of the GQ questionnaires in the ACS are also completed on the basis of administrative records rather than interviews, but the GQ data collection is still conceptualized as interviews with GQ residents.
An evaluation of the quality and scope of administrative records available from different types of GQ facilities could enable the Census Bureau to make greater use of administrative records for ACS data collection for portions of these populations. Indeed, it may be possible to reconceptualize the ACS as primarily a household survey with GQ data contributed largely from administrative records. Other sources could also be considered for the GQ data, including the periodic censuses of prisons and jails conducted by the Census Bureau for the Bureau of Justice Statistics.
One concern about the use of administrative records for ACS GQ data is that the administrative records typically do not contain data for the full range of attributes obtained from the ACS questionnaire. As an alternative to the concept described above, the ACS could be reenvisioned as a source of data on the characteristics of the noninstitutional population, supplemented with counts of the institutional population, which could be obtained from administrative records relatively consistently. This would be similar to the approach used by Statistics Canada, which relies primarily on administrative records to collect data about many institutional facilities (also known as collective dwellings in Canada) as part of its census of the population.