3
The Ideal Business Data System

3.1
GUIDING DESIGN PRINCIPLES

Business data serve many purposes and are relied on by many users. To meet this wide range of needs, a business data system must be flexible along a number of dimensions.1 For measurement of economic activity, it is important that the system be designed with the capacity to disaggregate data at different units of analysis (for example, establishment versus business line versus firm level) in multiunit firms. As noted in Chapter 2, the appropriate unit of measurement varies by sector and by the type of activity being examined. Furthermore, to track employment and other trends geographically, it is critical that data can be disaggregated by location. To capture sectoral trends, the data system must be designed with the capability to follow individual businesses over time, even when the nature of a firm’s business or its capital ownership changes. Finally, an ideal data system would be constructed to facilitate the development of modules that allow linkages between business registers, existing surveys, and administrative data. As we describe below, allowing for linkages across sources can also help minimize respondent burden and increase the usefulness of final data products.

1

This theme—that it is important for a business data system to be flexible and to permit “drilling down” from aggregate statistics to the sectoral and firm level—has been explored in the literature. See, for example, McGuckin (1992, 1995) and Becker et al. (2006).



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.


Understanding Business Dynamics: An Integrated Data System for America’s Future

3.1.1
Recognizing and Responding to Multiple User Needs

Among the users of data on business formation and dynamics are federal agencies, researchers, and businesses themselves. One way to increase the probability that final data products will meet demands is to involve users in survey design and data collection strategies from the start. Ideally, statistical agencies should facilitate front-end collaboration with the academic and business communities and not rely solely on postsurvey follow-up. Considering user needs up front should increase the relevance of final data products.

Of course, users’ data needs vary enormously. Even among the statistical agencies, whose missions are well documented, this is the case. For example, construction of local-level employment statistics relies more heavily on data for young and small firms than does production of the national economic accounts. Certain organizations (perhaps most notably the Small Business Administration, charged to “aid, counsel, assist and protect the interests of small business concerns”) are focused almost entirely on such data. Another example is the Census Bureau’s Economic Planning and Coordination Division (EPCD), which is charged with, among other things, editing and publishing the Census Bureau’s nonemployer statistics. EPCD has a number of information needs that could be addressed by a regular survey of nonemployer entities. One information gap that has been problematic for EPCD over the years is the extent to which employee leasing or other nontraditional employment arrangements are used by nonemployer entities with very large receipts.

Beyond the agencies, mayors, other local government leaders, and chambers of commerce need information for their cities and surrounding metropolitan areas, while governors need information for their states and counties.
Urban and regional planners and congressional representatives require business data aggregated to areas of different sizes. Local business owners take advantage of information about their competition at the local, even neighborhood, level for purposes of planning various aspects of their operations. Ideally, data should be collected and made accessible in a way that helps business owners answer a broad array of operational questions: Where are my customers and potential customers located? Where is the competition located? Where do my employees and potential employees live? Where should I locate my stores, offices, and plants? How much should I produce? How much should I order? How much should I hold in inventory? How should I set my prices? What is the best way to promote my products and services? How much should I pay my employees?

Researchers, policy makers, and businesses use information, disaggregated to various levels of granularity, to plan and set goals for economic growth, to track progress against goals, to identify pockets of underperformance, and to devise new programs and corrective actions to improve performance. For many of these purposes, the ideal federal statistical system would facilitate the production of data for small domains, that is, localized geographic areas, specific industries, or any other defined set of firms for which survey data might be desired, even though the corresponding sample sizes may be too small to support direct estimation.

Recent events, such as the aftermath of September 11 and natural disasters such as Hurricane Katrina, have highlighted the need for information on business activity at regional, state, county, and very local levels. Ideally, one could measure not only the impact on economic activity of the initial destruction, but also follow business dynamics during recovery periods in a way that allows the impact of relief efforts to be assessed. Similarly, local area data are an ideal input to many decision processes, such as federal defense office realignments, base closures, and even military reserve unit call-ups, as well as to the analysis of the resulting economic impact of such decisions.2

Designing data systems with user needs in mind therefore means building in the capability to readily aggregate to a variety of geographic scales, some of which are unknown a priori. For maximum flexibility in aggregating information, each business establishment should be assigned a unique identifier, where applicable, that records the location as a specific latitude-longitude coordinate (a point location); having the street address of a business location is sufficient, as that information can be linked to geographic coordinates. Because the street addresses of establishments and firms are known and readily accessible to owners and managers, providing this information should not be burdensome to most businesses.
For certain research and policy questions, noting whether a business operates from a home location or not would also be useful.

Because a point-level geocode such as a street address uniquely identifies a site of business activity, issues of confidentiality arise with the release of such data. Detailed information on business locations is typically public but, even so, inclusion of unique geocodes in combination with other variables in a database can compromise confidentiality. Yet, because there is such demand for data that enable aggregation to any one of many possible geographic units, a tension is created between user needs and respondent confidentiality; this issue is discussed in Chapter 4.

2

For example, research on the effect of military reserve deployments and activations on employment at the county level (e.g., Loughran, Klerman, and Savych, 2006) reveals that data limitations make it difficult to assess the impact of these actions on smaller firms and the self-employed.
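The flexibility argument in Section 3.1.1 can be illustrated with a minimal sketch: if each establishment record carries a point location, totals can be computed for any area defined after the fact. The record layout (`Establishment`), the rectangular `BoundingBox` areas, and the sample values are all hypothetical and chosen only for illustration. Real geographic units would be polygons, and any release of such totals would remain subject to the confidentiality concerns noted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Establishment:
    """Hypothetical register record with a point location."""
    est_id: str
    lat: float
    lon: float
    employment: int


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical rectangular area; real areas would be polygons."""
    name: str
    south: float
    west: float
    north: float
    east: float

    def contains(self, est: Establishment) -> bool:
        return (self.south <= est.lat < self.north
                and self.west <= est.lon < self.east)


def employment_by_area(establishments, areas):
    """Sum employment for each area; areas may overlap or be defined later."""
    totals = {area.name: 0 for area in areas}
    for est in establishments:
        for area in areas:
            if area.contains(est):
                totals[area.name] += est.employment
    return totals


# Illustrative data only.
ESTABLISHMENTS = [
    Establishment("E1", 38.90, -77.03, 12),
    Establishment("E2", 38.95, -77.10, 40),
    Establishment("E3", 39.29, -76.61, 7),
]
AREAS = [
    BoundingBox("area_a", 38.8, -77.2, 39.0, -76.9),
    BoundingBox("area_b", 39.2, -76.7, 39.4, -76.5),
    BoundingBox("region", 38.0, -78.0, 40.0, -76.0),
]

if __name__ == "__main__":
    print(employment_by_area(ESTABLISHMENTS, AREAS))
```

Because the aggregation works from points rather than from precomputed area codes, a new area (for example, a disaster zone drawn after the event) only requires a new entry in `AREAS`, not a change to the stored records.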

3.1.2
Managing Respondent Burden

A critical element of an ideal business data system is the capability of managing the burden to businesses of responding to federal surveys. Firms have an obligation to participate in surveys; their participation is a prerequisite to the production of high-quality data that are, in turn, essential for the analyses and planning required for an efficient, productive economy. However, one reason employers give for survey nonresponse is that surveys take too much time, especially for smaller businesses. So, to ensure accurate representation of economic activity, data collection should be designed in a manner that minimizes the burden on information providers.3

One aspect of burden management is to avoid asking for unnecessary or redundant data. This highlights the need to make as much use of administrative data as possible. It is important to consider whether there are currently underutilized sources of administrative data that could contribute to the goal of measuring business formation and dynamics while minimizing the survey response burden.

Nongovernment data sources may also play a role in an ideal data system. Given the prominence in the U.S. economy of payroll processing firms, it may be possible to obtain data from a large number of businesses while directly approaching only a few. The firm ADP, for example, claims that it handles payments for 1 in 6 private-sector workers (http://www.adp.com). Adding in the next few biggest processing firms would expand coverage considerably further. In addition, the use of such processing services is not limited to large firms, making it a potentially lower cost way to collect information on smaller ones. ADP offers services for businesses with 50 or fewer employees, major accounts services for firms with 50-999 employees, and national account services for firms with 1,000 or more employees.
Potentially, if confidentiality concerns can be addressed, a company such as ADP could provide employment numbers per payroll period in a consistent format for each of its client firms. Given that the smallest firms are the most likely to realize proportionately large benefits from outsourcing payroll and human resource functions, payroll processing firms may be able to provide a timely picture of small business dynamics. At a minimum, data collection systems should be structured in a manner similar to those used by these payroll processing firms in order to simplify the survey response by businesses. Whenever possible (that is, when businesses keep records in the same way and with the information that government wants), responding to government data collection requests should be made no more difficult than gathering the information necessary for carrying out basic administrative functions for the firm. This is more likely to be the case for information such as payroll, but less so for information relating to, say, energy consumption or investment.

3

Although there is a substantial amount of research on the views of individuals on confidentiality, data sharing, and respondent burden, much of it conducted for the decennial censuses, information about business views on these issues is much more limited. Qualitative research by Willimack and Nichols (2001) suggests that “large companies generally supported data sharing among statistical agencies under well-specified conditions and with rigorous security and confidentiality provisions in place. They only saw value in this sharing, however, if it reduced the reporting burden placed on them” (p. 1). The authors suggest that redundant external data requests, especially for larger firms, appear to make the prospect of data sharing by designated statistical agencies preferable.

One way to facilitate the collection of administrative data is to encourage increased adoption of extensible business reporting language (XBRL). XBRL is a member of the family of languages based on XML, or extensible markup language, which is a standard for the electronic exchange of data between businesses (see www.xbrl.org). In terms of U.S. government data collection, XBRL has already made some inroads with regulators. For example, the Federal Financial Institutions Examination Council uses XBRL for quarterly bank reporting. In terms of broader types of data collection, tax authorities in several countries (e.g., the United Kingdom’s HM Revenue and Customs and the Australian Tax Office) have begun to encourage the use of XBRL for corporate tax returns. As described in more detail in Appendix A, administrative data are currently used in the United States mainly to create the business registers of the Census Bureau and the Bureau of Labor Statistics (BLS).
Tax records from the Internal Revenue Service (IRS) form the basis of the Census business register, and ES-202 forms from state unemployment agencies form the basis of the BLS Business List. Similarly, in the United Kingdom, administrative data from “pay as you earn” employee and value-added taxes form the starting point for the business register. In some European countries, especially in Scandinavia, essentially all interactions of individuals or businesses with the government generate administrative data that are maintained and available for analysis. One reason the United States has two different business registers is that the use of IRS data is extremely limited for confidentiality reasons. The advantages and disadvantages of a unified business registry are discussed below.

In addition to making use of administrative data, burden management should include methods for distributing survey response requirements across firms evenly, to the extent it can be done without damaging the representativeness of the survey data. Two equivalent firms should, over time and surveys, be asked to carry a similar burden; acceptable burden should be defined to be proportional to the size of the firm. Cooperation of most or all of the largest enterprises is required in nearly all federal surveys. Small to medium-size firms are only sampled, and it is this domain of firms where the greatest need for burden management exists. Given the theme of this report that a need exists to increase sampling of young and small firms in business surveys, it is especially important to develop mechanisms to help spread the burden across those firms. Two different approaches can achieve this ideal level of burden sharing, and both should be examined by survey managers and statisticians.

The first approach may be called the PRN system, which entails the assignment of a permanent random number to each firm (or establishment) in the register. The random number would be stored as a separate field and would be available permanently both for a given survey over time and to different surveys across agencies. The random number would be used for specifying probability samples of firms. Because the number is permanent, it would provide a basis for spreading the sampling across all firms in the population. The implication is that all firms would be asked to respond to a similar number of surveys, leveling the response burden across firms. No firm would be asked to respond to a disproportionately large number of surveys, which could hypothetically happen today. The coordination or management of reporting burden according to a PRN system has been implemented in Sweden and is described by Ohlsson (1995).

The second approach may be called the burden budget system. If Xi is the size of the i-th firm, then define a burden budget Bi proportional to Xi, for each firm in the population. The burden budget might be expressed in units of hours, completed survey questionnaires, or the like. The burden budget would represent the firm’s total obligation to respond over a defined period of time, such as five years. Every five years, the burden budget might be reset to its original value.
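The PRN approach described above can be sketched in a few lines. This is a minimal illustration of one common way to use permanent random numbers, in which each survey selects the firms whose number falls in an assigned interval of [0, 1); it is an assumption for illustration and not a description of the Swedish system or of any agency's procedure. The function names, the seed, and the firm identifiers are hypothetical.

```python
import random


def assign_prns(firm_ids, seed=0):
    """Give each firm a permanent random number in [0, 1), drawn once and stored."""
    rng = random.Random(seed)
    return {firm_id: rng.random() for firm_id in firm_ids}


def in_interval(u, start, fraction):
    """True if u falls in [start, start + fraction), wrapping around at 1."""
    return (u - start) % 1.0 < fraction


def select_sample(prns, start, fraction):
    """Select every firm whose permanent random number falls in the interval."""
    return {firm_id for firm_id, u in prns.items()
            if in_interval(u, start, fraction)}


if __name__ == "__main__":
    prns = assign_prns([f"F{n:04d}" for n in range(1000)])
    # Two surveys with non-overlapping intervals never select the same firm.
    survey_a = select_sample(prns, start=0.00, fraction=0.10)
    survey_b = select_sample(prns, start=0.10, fraction=0.10)
    print(len(survey_a), len(survey_b), len(survey_a & survey_b))
```

Giving different surveys non-overlapping intervals keeps their samples disjoint, and shifting a survey's start point over time rotates firms in and out of its sample, which is how the stored number can spread burden across the population.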
The business register would record the specific surveys for which each firm was selected to participate, both over time and across agencies. With each survey in which the firm participated, the cumulative burden budget would be reduced by a measure of its participation. At any given time, Bi would represent the firm’s remaining burden in the five-year period. In sampling for a new survey, the firm would be selected with a probability proportional to, or otherwise positively related to, Bi. Thus, as the firm fulfills its assigned reporting obligation through participation in surveys, its remaining Bi declines and its probability of selection in future surveys declines. In this way, firms that have not participated eventually do participate because their remaining Bi has not declined and, correspondingly, their probabilities of selection in future surveys increase relative to those of firms that have already participated. A simple system of this type operates in the United Kingdom for small firms. Participation in a survey in a given year earns very small firms a guarantee that they will not be asked to participate in another survey for a set period, three years in the UK example (Office for National Statistics, 2005). A similar regime is also being considered for adoption in Denmark and the Netherlands.

It is apparent that a central business register and the information it contains are essential to driving the kind of burden management system described here. Thus, we turn now to describing the characteristics of the ideal business register, before moving on to more general aspects of the business data system.

3.2
DEFINING AND TRACKING BUSINESSES OVER TIME: THE BUSINESS REGISTER

3.2.1
Ideal Business Register Characteristics

The ideal business register is no doubt more easily described than created. Nonetheless, it is important to first consider the ideal, before turning to the possible, especially since what is feasible tomorrow will be different from what is feasible today.

A business register must first be comprehensive; it must cover the entire business population of firms conducting business in the United States. The implication, then, is that the ideal register would include not only employer businesses, but also nonemployers, and would pick up firm births and deaths with very little lag. Although 100-percent inclusion rates are impossible to attain, the coverage must still be substantial and known. The business register should also indicate the enterprise structure of a firm, such that subsidiaries and multiple work sites are apparent, and mergers, acquisitions, and other status changes are reflected in a timely fashion. To maximize the value of a business register as a sampling frame, these data items must be included and should be available to all approved users of a register (including those positioned beyond the Census Bureau and BLS). In order to manage respondent burden, it is necessary to maintain a complete record of the sampling histories of firms and establishments.
As noted above, this would include recording which survey samples the business has been selected into and what the reporting history was for that survey. Note that the complete contact information must also be included. An additional set of items should ideally be linkable to these core business register items, with appropriate safeguards for confidentiality. First, given the importance of the ability to analyze the entrepreneurial process, indicators of both firm size and ownership type should be available, with owner demographics provided for privately held firms. Because small-domain estimates are important, detailed geocodes, industry codes, and product codes are essential. In order to link different aspects of the data with the main business register, a unique identifier is also needed (see below).
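The burden budget system described in Section 3.1.2 depends on exactly this kind of sampling history. The sketch below is a simplified illustration under stated assumptions: the `BurdenLedger` class, the rule that the initial budget equals firm size times a constant, the fixed number of hours charged per survey, and the one-at-a-time weighted draw are all hypothetical choices, not a procedure taken from the text or from any agency.

```python
import random


class BurdenLedger:
    """Hypothetical ledger tracking each firm's remaining burden budget Bi."""

    def __init__(self, firm_sizes, hours_per_unit_size=1.0):
        # Assumed rule: initial budget Bi is proportional to firm size Xi.
        self.initial = {f: hours_per_unit_size * x for f, x in firm_sizes.items()}
        self.remaining = dict(self.initial)
        self.history = {f: [] for f in firm_sizes}

    def draw_sample(self, survey, n, hours, rng):
        """Draw up to n distinct firms, each pick weighted by remaining budget."""
        candidates = {f: b for f, b in self.remaining.items() if b > 0}
        chosen = []
        while candidates and len(chosen) < n:
            firms = list(candidates)
            weights = [candidates[f] for f in firms]
            pick = rng.choices(firms, weights=weights, k=1)[0]
            chosen.append(pick)
            del candidates[pick]
        for f in chosen:
            # Participation uses up part of the budget and is recorded.
            self.remaining[f] = max(0.0, self.remaining[f] - hours)
            self.history[f].append(survey)
        return chosen

    def reset(self):
        """Restore every budget to its original value at the end of the period."""
        self.remaining = dict(self.initial)


if __name__ == "__main__":
    rng = random.Random(42)
    ledger = BurdenLedger({"small": 5, "medium": 20, "large": 100})
    for wave in range(3):
        print(wave, ledger.draw_sample(f"survey_{wave}", n=2, hours=5, rng=rng))
    print(ledger.remaining)
```

A firm whose budget reaches zero drops out of the candidate pool until `reset` is called, which mirrors the idea that firms that have met their obligation are passed over in favor of those that have not.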

The ideal business list would also provide a unified sampling frame, in contrast to the overlapping but not entirely consistent business lists currently maintained separately by the Census Bureau and BLS. A growing amount of time and effort is being spent comparing and attempting to reconcile these different lists. The goals of the major comparison project (described by Paul Hanczaryk of the Census Bureau and James Spletzer of BLS in National Research Council, 2006) are twofold: to understand the differences in the lists and to identify the strengths and weaknesses of each. There are indeed clear advantages and disadvantages to alternative administrative and survey data from BLS and the Census Bureau, respectively. Thus, “unified” should not be interpreted as drawing from a single administrative and survey source or agency. Rather, an ideal business list would integrate multiple sources so that a common registry is used for key business surveys underlying the National Income and Product Accounts (and related federal business statistics).

3.2.2
Unique Business Identifiers

The ideal business list would assign a single, unique number that would be used to identify each business in all data sources (including household surveys, for example, when listing the employer of record). This unique identifier would then give users the ability to link business registry information to data items from both employer and demographic surveys that provide data on worker characteristics, self-employment, and household-centered businesses. It would also allow the linking of business registry information to individual-level administrative records, such as those from the Social Security Administration and the Longitudinal Employer-Household Dynamics Program. Another benefit of a unique business identifier is its potential to help minimize respondent burden.
By simply entering its one ID number, a business could update all of its information at once. In addition, creation of cross-walks between the unique business number and the current Unemployment Insurance (UI) number, along with data sharing by the state, would facilitate updating of the business list on a more timely (quarterly) basis.

Note that, as discussed earlier, what is meant by a business may differ by sector and can potentially change over time. Thus, it will be important for the register to preserve or develop indicators for the legal form and type of business entity. As part of the latter, registers should indicate franchise operations, multilevel marketing operations (e.g., Amway), joint ventures, special-purpose financial entities, and the like that are not captured by traditional “legal form of organization” indicators. It may even be necessary to allow for multiple definitions. One might also consider including additional information on ownership relations, for example, in cases in which one firm owns a portion of another, or in which a firm is jointly owned by two others but neither exercises unilateral control.4

Use of a unique identifier also facilitates the linking of basic business register data to more in-depth firm information, thus enhancing confidentiality. That is, increasingly detailed information can be made available to qualified users, perhaps through the use of a relational database, while, for other uses, a more limited set of data items may be sufficient. Many kinds of users could legitimately claim a need to access information on the existence of a business, but additional safeguards must be in place for users granted access to more detailed information. Examples of more detailed information that could be linked to the business register include financial information (such as bank loans data, which might be bought from credit bureaus), as well as detailed and consistent product descriptions that may be in widespread use in the private sector. Examples of the latter include universal product codes for retail firms, national drug codes for pharmaceuticals, and insurance reimbursement codes for the health industry. Finally, in certain sectors for which the government has special data sources, such as farming, railroads, and nonprofit and public organizations, the unique identifier could be used to link these data.

Other governments have already moved forward with respect to this ideal. For example, in the United Kingdom, a major review by the Treasury to identify ways to reduce the administrative cost of regulation concluded that a single business identifier, in combination with data sharing across agencies, could meet this goal. Data sharing is key to reducing duplicate data requests (Hampton, 2005). It can also help streamline the administrative agencies themselves and lead to increased efficiency (O’Donnell, 2004).
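The linking role of a unique identifier, and the suggestion that a relational database might hold the more detailed items, can be illustrated with a small sketch using an in-memory SQLite database. The table names, column names, and sample rows are hypothetical; they stand in for a core register, a survey source, and an administrative source that share one `business_id`.

```python
import sqlite3

# Hypothetical schema: three sources keyed by the same unique identifier.
SCHEMA = """
CREATE TABLE register (
    business_id TEXT PRIMARY KEY,
    name        TEXT,
    legal_form  TEXT
);
CREATE TABLE survey_response (
    business_id TEXT REFERENCES register (business_id),
    year        INTEGER,
    employment  INTEGER
);
CREATE TABLE admin_record (
    business_id TEXT REFERENCES register (business_id),
    year        INTEGER,
    payroll     REAL
);
"""


def build_demo_db():
    """Create an in-memory database with illustrative rows only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO register VALUES (?, ?, ?)", [
        ("B001", "Alpha", "partnership"),
        ("B002", "Beta", "corporation"),
        ("B003", "Gamma", "sole proprietorship"),
    ])
    conn.executemany("INSERT INTO survey_response VALUES (?, ?, ?)", [
        ("B001", 2006, 12),
        ("B002", 2006, 340),
    ])
    conn.executemany("INSERT INTO admin_record VALUES (?, ?, ?)", [
        ("B001", 2006, 410000.0),
        ("B003", 2006, 38000.0),
    ])
    return conn


def linked_records(conn, year):
    """Join register entries to survey and administrative data for one year."""
    query = """
        SELECT r.business_id, r.legal_form, s.employment, a.payroll
        FROM register AS r
        LEFT JOIN survey_response AS s
               ON s.business_id = r.business_id AND s.year = :year
        LEFT JOIN admin_record AS a
               ON a.business_id = r.business_id AND a.year = :year
        ORDER BY r.business_id
    """
    return conn.execute(query, {"year": year}).fetchall()


if __name__ == "__main__":
    for row in linked_records(build_demo_db(), 2006):
        print(row)
```

The left joins keep every registered business in the result even when one source has no record for it, so gaps in coverage show up as missing values rather than as missing businesses.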
In the United States, reconciliation of the major business registers generated by the UI system and the Census Bureau would be a major step forward. To the extent that reconciliation of establishment lists can take place as soon as data become available, the ability to rapidly and continuously update sample forms would be improved. The presence of unique geocode identifiers would also help facilitate data integration across sources.

4

The statistical agencies (in the United States and abroad) have done quite a bit of work on the issue of determining ownership structures of businesses; see Armington (2004).

3.2.3
Effective Data Sharing

By this point, it is clear that creation of the ideal business register implies a greater degree of data sharing among statistical agencies than is currently in place.5 We have already cited the business list reconciliation example. Because there are advantages and disadvantages to both the BLS and Census Bureau systems (and thus reconciliation should take place downstream to minimize loss of information), a need for data sharing between these two agencies is implied. For example, data on sole proprietors, which originate from Schedule C filings to which only the Census Bureau currently has access, would need to be included. Even without worrying about reconciliation per se, a unified register implies data sharing. Note, though, that since the Schedule C data will include all individuals, including those whose activities can more accurately be described as a hobby than as a business, it is likely that some revenue threshold would need to be imposed before a sole proprietor is actually placed on the list.

Not every piece of information on the business register needs to be accessible to every agency, but a much broader range of users should be allowed than is currently the case. The idea of different agencies having access to the business register, with a well-defined set of rights (in terms of which fields are available) and responsibilities (in terms of confidentiality), is similar to a point made for the United Kingdom (Hampton, 2005). The recommendation there was that various agencies should be given access to only those fields in the database that were necessary for their regulatory efforts. The central database, along with data sharing, would play a large part in reducing the administrative burden on UK companies.
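Two of the ideas above, a revenue threshold for sole proprietors and field-level access rights for different agencies, can be sketched together. Everything named here is an assumption for illustration: the agency labels, the field lists in `FIELD_RIGHTS`, the record layout, and the cutoff value, which the text does not specify.

```python
# Hypothetical field permissions; agency names and field lists are illustrative.
FIELD_RIGHTS = {
    "agency_a": {"business_id", "name", "address", "industry_code", "employment"},
    "agency_b": {"business_id", "industry_code"},
}

# Hypothetical revenue cutoff; the text does not give a value.
SOLE_PROPRIETOR_MIN_RECEIPTS = 1_000


def on_register(record, min_receipts=SOLE_PROPRIETOR_MIN_RECEIPTS):
    """Keep sole proprietors only if receipts reach the threshold; keep all others."""
    if record["legal_form"] != "sole proprietor":
        return True
    return record["receipts"] >= min_receipts


def view_for(agency, record):
    """Return only the fields the agency is permitted to see."""
    allowed = FIELD_RIGHTS.get(agency, set())
    return {field: value for field, value in record.items() if field in allowed}


if __name__ == "__main__":
    record = {
        "business_id": "B1",
        "name": "Example Co",
        "address": "1 Main St",
        "industry_code": "5411",
        "employment": 3,
        "legal_form": "sole proprietor",
        "receipts": 5_000,
    }
    if on_register(record):
        print(view_for("agency_a", record))
        print(view_for("agency_b", record))
```

An agency that is not listed in `FIELD_RIGHTS` receives an empty view, so access has to be granted explicitly rather than assumed.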
In addition to data sharing among agencies, the basic elements of the employer section of the business register should also be made accessible to all qualified agency and research users to serve as a sampling frame for their surveys.6 Researcher access to some confidential data is already allowed in 5 “Data sharing” is the exchange of information collected from businesses and individuals or reported to the IRS in identifiable form for statistical purposes. Identifiable form means information “that permits the identity of the respondent to whom the information applies to be reasonably inferred by either direct or indirect means.” Statistical purposes involve “the description, estimation, or analysis of the characteristics of groups, without identifying the individuals or organizations that comprise such groups.” They also include methods and procedures related to the “collection, compilation, processing, or analysis” of data about these groups and the development of related “measurement methods, models, statistical classifications, or sampling frames” (National Research Council, 2006, p. 56). 6 Nonagency use of a business list should be tightly monitored. One danger is that, if there is a business register that can be used as a master sampling frame then, potentially, more surveys will be launched, particularly if the cost for using the frame is trivial. This could lead to the unacceptable consequence of increasing burden placed on business respondents. Our primary concern here is with agencies’ programs that currently must construct frames (or purchase them from Dun & Bradstreet) because they do not have access to BLS or the Census Bureau lists.

OCR for page 47
Understanding Business Dynamics: An Integrated Data System for America’s Future BOX 3-1 The UK Business Data Linking Project The ONS collects large amounts of business microdata. Their Business Data Linking Project provides access to the data via its secure “microdata lab,” where academic researchers can carry out statistical analyses. These data are confidential and access is tightly restricted: Only researchers fully employed at bona fide academic or charitable research institutes, or civil servants, may have access. There is no facility at the moment for PhD students. The employer is required to sign an agreement taking collective responsibility for the actions of all its researchers. Researchers are required to agree to standard secondment contract terms. There is no access without signed agreements. Projects must be of academic value and demonstrate (a) a clear interest for ONS in the results, and (b) a specific need for the data sets requested. Access is granted only through the secure microdata lab onsite at ONS. Researchers must specify which data set(s) they want to use, why they want to use it, and why the data cannot be found elsewhere. In some cases, additional data sets may be linked by researchers. the United States through the safeguards provided for via research data centers. Other countries have programs through which researchers can access data more broadly. 
For example, in the United Kingdom, the Business Data Linking Project (see Box 3-1) allows access in a secure setting to “safe people” who have been vetted by the Office for National Statistics (ONS).7 More broadly, ONS allows access to a broad range of unpublished microdata to researchers upon application.8

7

See http://www.statistics.gov.uk/about/bdl/ for more details about the program.

8

Much of this has resulted from the Deregulation and Contracting Out Act of 1994, which has been applied to the Statistics of Trade Act of 1947 and others. See http://www.statistics.gov.uk/about/NS_ONS/ONS_microdata_releases.asp for further details, including a list of data releases.

3.3
IDEAL DATA COLLECTION CHARACTERISTICS

3.3.1
Contents of the Ideal Business Data System

Not only must the ideal business data system be capable of producing accurate and timely aggregate statistics on income and output, employment, investment, prices, and productivity, but it also should allow research that requires use of microdata. In particular, as noted above, individual-level business data are important for studying such areas as productivity growth, firm entry and exit, the role of young and small businesses in fueling economic growth, the characteristics of business owners, and the interactions between large and small businesses. A system capable of such analyses must thus collect a range of key data items beyond the basic information contained in the business register, although the register is certainly one important input.

Measurement of transitions (e.g., mergers, acquisitions, and spin-offs) allows researchers to identify such things as expanding employment areas, structural changes in the economy, and business cycle turning points. Just as important as the capacity to track transitions for existing businesses is the ability to detect the birth and growth of new firms. Maintaining ownership type and owner demographics on the business register is just the first step. The data system must also facilitate measurement of gross flows from employee to self-employed categories. To best understand these nascent entrepreneurs, additional information is necessary. First, it would be useful to be able to identify the location of self-employment, specifically whether the business activity takes place at home or somewhere else. Data capturing time use in different business activities are also essential.

In order to understand the underlying sources of growth in the economy, data are needed on a range of business attributes that may be linked to performance. Tangible capital is one important area for which information is sparse, particularly for small firms. Becker et al.
(2006) have shown that, in the Annual Capital Expenditures Survey, entrants and younger firms are underrepresented, even though the evidence suggests that capital expenditures in the start-up process of businesses are high. If so, the current survey is missing an important, and nonrandom, component of capital expenditures for U.S. businesses.

For new and growing firms, knowing more about financing is of particular interest. A business data system would, therefore, ideally include financial and balance sheet information, including equity, debt financing, and venture capital financing; measures of capital stocks and investments would also be useful. Important expenditures include investments in physical capital and also in technology and research and development (R&D), human capital, and organizational capital. Corrado, Hulten, and Sichel (2006) argue that better measurement is needed of three types of intangible capital: computerized information, knowledge acquired through scientific R&D and creative activities, and economic competencies. The ideal business data system would allow users to calculate intangible capital figures for such categories as the stock of education and training in a firm and new investments in human capital and organizational capital. We emphasize this as an ideal, recognizing that, in practice, collecting information of this kind, especially from smaller firms, would be a challenge.

Black and Lynch (2005) argue that a firm’s organizational capital (broadly categorized as workforce training, employee voice, and work design) contributes in significant ways to its productive capacity. The authors conclude that calculating the stock of training in a firm is difficult; however, even capturing measures of new investments in training would be a significant improvement. At the moment, U.S. statistical agencies collect no ongoing measures of the amount of training workers acquire year over year. This is in sharp contrast to the European Union, which includes training questions in all member countries’ annual household surveys. Some of these kinds of questions could be incorporated into the Current Population Survey as well as most business surveys. Other measures of organizational practices, such as the percentage of workers meeting on a regular basis to discuss workplace issues, unionization, layers of management, benchmarking usage, and the existence of incentive-based compensation, could, in principle, also be gathered at the establishment level. Doing so would create minimal respondent burden since, as Black and Lynch recommend, training and compensation data need be collected only on an annual basis. Other components of organizational capital, such as employee voice and work design, could be collected on a less frequent basis; every other year is likely to be sufficient since these practices do not change with high frequency. With minimal respondent burden, policy makers and businesses would then have the ability to understand how these dimensions of intangible capital affect the productive capacity of individual firms and the economy more generally.
In 2004, Eurostat began collecting information on organizational innovation in its Community Innovation Survey. Eurostat broadly defines organizational innovation as changes in firm structure or management methods that are intended to improve a firm’s use of knowledge, the quality of goods and services, or the efficiency of work flows. The survey then operationalizes this definition with a range of more specific questions.

A full understanding of business success (and failure) requires data not only on these important inputs, but also on outputs, performance, and the market environment. Thus, the ideal business data system would go beyond the detailed industry and product codes on the business register and include such things as productivity measures, profit levels, prices, and even patent information. In addition, it is important to have information on business relationships, and to be able to link a given firm to its suppliers (or whom it supplies) and competitors. Related to this is the need for indicators identifying a franchise or a licensee, or a spin-off from other firms.

Clearly, this broad list of information comprising the ideal business data system need not be available for all firms. Rather, the system would make appropriate use of sampling and surveys. The following sections discuss more fully whom to survey, for how long, and how often.

3.3.2
Whom to Survey

Many of the interesting questions related to business dynamics are addressed by statistical estimates that measure relative changes in activities. On the one hand, a focus on dynamics forces attention to the act of creation of new business entities, a moment at which fundamental changes take place. Since the start of entrepreneurial activities is often limited in scope and magnitude, high growth rates are possible. On the other hand, many of these new entities die quickly. To understand the dynamics of business entities, nascent units must be included.

A critical question, addressed in Chapter 2, is, at what moment does a new business entity begin? Is it when the founder first thinks of the business opportunity? Is it when the founder begins active planning to launch the activity? Is it when the founder creates legal entities? Is it when the entity first takes actions toward producing goods or services? The business lists currently used in the federal government require longer and perhaps more elaborate business activities for inclusion in the target population than would be implied above. Hence, reliance on the current sampling frames may miss much of the economic activity of key interest in business dynamics.

A fully developed data system may require a new blending of demographic and economic surveys. Demographic surveys are typically defined as surveys of households and persons; sampling frames consist of addresses and other indicators of housing units in which people reside. Self-reports by the households in answering questions about employment, health, victimization, etc., form the basis of demographic statistics.
It may also be possible to identify and measure the volume of nascent business entities through surveys of persons who are in the process of founding them. Samples of these persons could come from demographic surveys now ongoing, perhaps through question modules that ask “screener” questions about entrepreneurial activities.9 If a sample person reports the requisite activities, then he or she would be eligible for further measurement over time to track those activities. While, in an ideal data system, one would like to have this information, identifying start-ups through households (the majority of which do not include individuals engaged in entrepreneurial activity) and then linking the information to business identifiers to follow firms from genesis forward is not a trivial task.

9

As described in Chapter 2, substantial work has been done in association with the Panel Study of Entrepreneurial Dynamics to develop cost-effective procedures for identifying entrepreneurs using household surveys.

Use of demographic surveys as a screening tool for new businesses must also acknowledge the problem of multiplicities. Multiplicities exist in sampling processes when the target population unit (in this case, a new business) might be linked to several frame elements (in this case, persons within households). When two or more persons are engaged in the joint creation of a new business entity, each of them could report it in a demographic survey. Thus, entities with multiple founders would be overrepresented in demographic surveys of persons. Such multiplicities would have to be measured as part of the screening process in order to create selection weights useful for inference to the population of business start-ups. A further source of multiplicities is the fact that some business entities reported by demographic survey respondents will already be listed in the existing sampling frames; others will not. This issue, too, would have to be addressed with selection weights.

Another issue tied to using demographic surveys is the loss of unambiguous and objective criteria for eligibility (as can be the case with Current Employment Statistics’ use of UI filing). Self-reports by household members about their business activity would create the basis for eligibility into the sampling process. Careful development of screening questions to ensure that accurate reports of eligibility could be obtained would be essential.
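The multiplicity adjustment described above can be made concrete with a small sketch. It is not a procedure specified in this report: it assumes the common weight-share rule in which each report of a start-up carries the reporting person's base weight divided by the number of founders who could have reported it, and it assumes all founders of a business had the same chance of selection. The class, field names, and figures are hypothetical, and dropping frame-listed businesses is only one simple way to avoid double counting when combining with frame-based estimates.

```python
from dataclasses import dataclass


@dataclass
class StartupReport:
    """One screener report of a start-up (all field names are hypothetical)."""
    business_id: str         # identifier assigned during screening
    base_weight: float       # inverse of the reporting person's selection probability
    n_founders: int          # founders who could each have reported this business
    on_existing_frame: bool  # True if the business is already on a business list


def multiplicity_weight(report: StartupReport) -> float:
    """Share the person's base weight across all founders who could report."""
    if report.n_founders < 1:
        raise ValueError("n_founders must be at least 1")
    return report.base_weight / report.n_founders


def estimate_startups(reports, exclude_frame_listed=True) -> float:
    """Estimate the number of start-ups from person-level screener reports.

    Assumption: businesses already on an existing frame are counted there,
    so they are dropped here unless exclude_frame_listed is False.
    """
    total = 0.0
    for report in reports:
        if exclude_frame_listed and report.on_existing_frame:
            continue
        total += multiplicity_weight(report)
    return total


# Illustrative data only: three reports from a household survey screener.
reports = [
    StartupReport("A", base_weight=1000.0, n_founders=1, on_existing_frame=False),
    StartupReport("B", base_weight=1000.0, n_founders=2, on_existing_frame=False),
    StartupReport("C", base_weight=500.0, n_founders=4, on_existing_frame=True),
]
print(estimate_startups(reports))
print(estimate_startups(reports, exclude_frame_listed=False))
```

Without the division by the number of founders, business B would count as if it were two separate start-ups whenever both founders fell into the sample, which is the overrepresentation the text describes.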
3.3.3
How to Allocate the Sample of Business Entities

It is simple to show that the precision of estimates of population totals in business surveys is maximized when the sample of business entities gives very high probabilities of selection to very large firms and small probabilities of selection to very small entities. Typically, high-volume firms are sampled with certainty (i.e., they always appear in the survey sample). This allocation of the sample ensures that large portions of economic activity are represented in the sample.

With a focus on business dynamics, however, when the interest is in relative changes in business activity, then alternative allocations of the sample become more attractive. Allocation of the sample proportionate to some function of change in business activity is reasonable. Depending on the estimates of interest, preferred sample allocation might place more emphasis on small, quickly changing new businesses.

Since these two purposes (estimating total volume and estimating rates of relative change) suggest different allocations of the sample, some consideration might be given to a new data collection vehicle of business dynamics that could be viewed as a supplemental survey for the current economic measurement systems. This survey would allocate the sample disproportionately to new business starts, those strata of economic activity that form only small parts of current economic surveys. The data from this new survey, perhaps called the Survey of New Business Dynamics, could be combined with that of other surveys to provide full estimates of relative changes in key activities.

Another sample allocation issue concerns the sector of economic activity to be covered. This is best illustrated by considering e-commerce, a sector in which intensely dynamic business activity takes place. A new focus on business dynamics would require much more attention to such sectors that pose new survey measurement problems. Units in the e-commerce sector need to be identifiable with well-defined geographic locations, yet entities exist in cyberspace. The physical location of the server infrastructure can change instantaneously, which makes it difficult to link people and places to business activity. Although the federal statistical agencies have programs addressing e-commerce, studies of business dynamics would benefit from increased investment in survey methodology for technology sectors. Similarly, the activities of nonprofit entities are highly dynamic over time, so that sector could also be part of a new program in business dynamics.

A further sample allocation feature that must be addressed is the nature of the longitudinal structure to be carried out in the survey. To maximize the precision of estimated changes in the volume of economic activity, it is common to measure sample units on a frequency that is related to the historic rates of change of the phenomena measured. Phenomena that exhibit relatively rapid change (e.g., employment and unemployment) should be, and typically are, measured more frequently than phenomena that are less volatile (e.g., state government tax revenues).
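The contrast drawn earlier in this section between allocating a sample for totals and allocating it for relative change can be illustrated with a short sketch. It uses Neyman allocation, a standard textbook rule that assigns sample to each stratum in proportion to its population size times its within-stratum standard deviation; the report does not name this rule, and the stratum names, counts, and standard deviations below are invented for illustration only.

```python
def neyman_allocation(strata, n_total, spread_key):
    """Allocate n_total sample units across strata in proportion to N_h * S_h.

    strata     : dict mapping stratum name -> dict with population count "N"
                 and one or more standard deviation entries
    spread_key : which standard deviation to use (of levels or of changes)
    Returns fractional allocations; rounding and capping at N_h are omitted.
    """
    products = {name: s["N"] * s[spread_key] for name, s in strata.items()}
    denom = sum(products.values())
    if denom <= 0:
        raise ValueError("at least one stratum needs positive size and spread")
    return {name: n_total * p / denom for name, p in products.items()}


# Hypothetical strata: few large firms with very dispersed levels, and many
# small young firms whose activity changes quickly relative to their size.
STRATA = {
    "large_established": {"N": 1_000, "sd_level": 500.0, "sd_change": 25.0},
    "small_young": {"N": 50_000, "sd_level": 2.0, "sd_change": 4.0},
}

for_totals = neyman_allocation(STRATA, n_total=900, spread_key="sd_level")
for_change = neyman_allocation(STRATA, n_total=900, spread_key="sd_change")
print("allocation for totals:", for_totals)
print("allocation for change:", for_change)
```

With these made-up inputs, most of the sample goes to large firms when the target is a total and to small young firms when the target is change, which is the kind of shift that would motivate a supplemental survey.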
A statistical system that measures highly volatile attributes frequently and less volatile attributes less frequently could be constructed through a set of questionnaires that have flexible modules that are used at different intervals over time. Such a flexible design can reduce the burden of measurement on sample businesses and can be used to gather information at appropriate intervals.

3.3.4
How Long Should a Sample Business Be Measured?

Many of the current economic surveys have longitudinal designs; that is, a sample business enters the survey and is repeatedly measured over time. While such a design theoretically permits the analysis of micro-level changes in specific business entities (e.g., gross changes in size over time within a sector), the designs are traditionally not exploited for that purpose. Instead, because many of the surveys are used to produce estimates of change in totals over adjacent time periods, precision is enhanced by having overlap of sample businesses.

Studies of business dynamics require following the same business over time, measuring similar phenomena repeatedly, and assembling longitudinal data on each sample business that can be used to examine precursors to change, what types of businesses experience different types of change, and which types of characteristics of the birth process lead to more or less stable entities. Such purposes require that more attention be paid to following time-based linking rules than is true for most current business surveys. Several problems arise and need further study: How are mergers and acquisitions to be handled in the study of business dynamics? How are spin-off businesses from the original entity to be handled? When is a new business de facto the death of a sample business, and when is it a new component of the original sample business?10

In demographic surveys, there exist various following rules, depending on the purposes of the longitudinal survey, some of which permit the inclusion into later waves of the panel persons who join families originally sampled into the survey. For example, the Panel Study of Income Dynamics, which started with a sample of families in 1968, has continued to follow not only those families, but also the new families formed by any member of the original family.11 Such designs permit more complete understanding of the temporal dynamics of families and households. Similar following rules could also be used in carrying out business surveys.

Longitudinal surveys are used to produce estimates of the population at any one moment in time and changes in the population as defined at earlier points.
Designs with such goals are often rotating panel surveys that, at any one point in time, sample new units and rotate out sample units that have been measured over prior multiple waves. Such designs have the advantage over fixed panel surveys (which draw a sample at one time point and follow it for long periods of time without any refreshment of the sample) by reducing the respondent burden and providing both micro-level change estimates and cross-sectional estimates.

10

Beyond the work done in the United States (discussed throughout this report), approaches for dealing with problems related to mergers and acquisitions, and births and deaths, have also been recommended by the Organisation for Economic Co-operation and Development, Eurostat, and others internationally; see Pilat (2002) for an overview of methods used in a range of countries (and for further references).

11

See http://psidonline.isr.umich.edu/ for more information on the survey and how it follows family members over time.

A key issue in rotating panel designs is the number of waves of measurement for each sample unit. In line with the logic above, studies of business dynamics might follow units subject to high likelihoods of change for longer periods and units that are unlikely to undergo major changes only briefly. Determining the strata for long follow-up and short follow-up would require some study for each outcome variable selected.

Finally, a separate issue is the length of the interwave interval: how often should a sample unit be measured? If life expectancy is low for new businesses, frequent measurement is needed to acquire observations about life-cycle processes. There are many attributes of new businesses for which information is essential (e.g., payroll, employment counts). To reduce the reporting burden of high-frequency measurement, new alliances between the federal statistical agencies and payroll processing firms might be considered; use of scanner data for sales volume estimates might also be studied. Other variables may not require such frequent measurement to be useful (e.g., use of new technologies). Clearly, measurement of business dynamics would benefit if the method of record construction permitted data to be acquired from units in a temporally flexible manner.
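The rotating panel design discussed in this section can be sketched in a few lines. The sketch assumes the simplest possible rotation, in which one new cohort enters at every wave and each cohort is measured for a fixed number of consecutive waves; the stratum names and follow-up lengths are hypothetical and are not taken from any existing survey.

```python
def cohorts_in_sample(wave, waves_in_panel):
    """Return the entry waves of the cohorts measured at a given wave.

    Assumes waves are numbered from 1, one new cohort enters at each wave,
    and every cohort stays for `waves_in_panel` consecutive waves.
    """
    if wave < 1 or waves_in_panel < 1:
        raise ValueError("wave and waves_in_panel must be positive")
    first = max(1, wave - waves_in_panel + 1)
    return list(range(first, wave + 1))


def overlap_fraction(waves_in_panel):
    """Share of cohorts common to two adjacent waves once the panel is full."""
    return (waves_in_panel - 1) / waves_in_panel


# Hypothetical follow-up lengths: units likely to change are followed longer.
FOLLOW_UP_WAVES = {"new_or_volatile": 8, "established_stable": 2}

for stratum, k in FOLLOW_UP_WAVES.items():
    print(stratum, cohorts_in_sample(wave=10, waves_in_panel=k), overlap_fraction(k))
```

Longer follow-up raises the overlap between adjacent waves, which helps estimates of change and micro-level analysis, at the cost of more reporting burden on each sample business.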