| Copyright © 2009. National Academy of Sciences. All rights reserved. Terms of Use and Privacy Statement |
2
Database Development
INTRODUCTION
The project's first task was to develop datasets for use in the analyses, and this
chapter describes the compilation and development of these data. We developed
two principal databases, one for the cross-sectional analysis of data representing
the 33 metropolitan areas with populations of at least one million in 1980, and the
other for the analysis of time series data of cordon counts for New York City.
This chapter documents the investigation into the potential data sources which
were considered for analysis, their strengths and weaknesses, and the construction
of the final datasets used for model estimation. Also included is an explanation of
the limitations of these data when used in this type of analysis.
CROSS-SECTIONAL DATABASE
Based on some preliminary analyses undertaken in Chapter 2 of CRA's TCRP
report Building Transit Ridership: An Exploration of Transit's Market Share
and the Public Policies that Influence It, we chose as the basis for the cross-
sectional dataset the 33 metropolitan areas with populations of at least one million
in 1980. This set of cities was selected for the following reasons:
· Relevance to the study. This study is concerned with problems of the major
urban areas, and so it was logical to focus on these locations. Moreover, it is
in the largest cities where we have the opportunity to observe the competition
of transit and the private vehicle on a large scale.
· Availability of data. The quality and availability of data for smaller areas can
be a problem at the level of detail of interest to this study.
· Representativeness. While these thirty-three areas represent only a small
fraction of the 284 MSAs currently defined by the Census, they nonetheless
represent almost 50% of the total population of the US and, perhaps even
more importantly, almost 90% of the total transit trips in the US.
· Compatibility. It was our intention to examine changes in the data from 1980
to 1990; using the largest cities will maximize the likelihood that consistent
data can be obtained for multiple time periods.
The sections below describe the specific series compiled for the database.
Data and data sources for dependent variables
The conceptual framework for the analysis involves explaining the variations in
observed shares or occupancies of the private vehicle or transit modes. These
general measures were therefore used as the "dependent variables" in the models
estimated in the analysis. Given the limitations of time and budget, however, these
data needed to be compiled from existing, publicly-available sources. We
therefore explored the most likely potential sources of relevant data that could
help to build up the picture of how commuting mode shares and average vehicle
occupancy vary across areas, and have been changing over the last 15 years. We
examined each of these sources to appraise the nature and content of the data
available, as well as the strengths and weaknesses of that source relative to
possible analysis for the objectives of this project. A summary of each of the
sources examined is provided in Table 5, which is followed by a more detailed
description.
Census of Population journey-to-work questions
As part of the regular population counts that are performed every 10 years by the
US Bureau of the Census, more detailed information is collected from a random
sample of about 8% of all households. This survey is administered through the
so-called "long form" and includes, in addition to more detailed demographic and
socioeconomic information than is collected on the standard Census form,
questions about household commuting travel to and from work locations.
Specifically, the questions address (for the Census day) workplace locations, the
number of household vehicles available, the commuting mode used (for the
longest part of the journey only), the number of persons traveling in a private
vehicle, departure times, and travel times.
The very large sample size of the Census allows these data to provide usable
statistics for quite small geographic areas. Specifically, mode shares can be
computed for specific areas or flows within individual metropolitan areas, a topic
of particular interest to this study. There are limitations to these data, however,
and they are discussed in greater detail later in this report.
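The two general measures discussed here, mode share and average vehicle occupancy, can be illustrated with a minimal sketch. It assumes a simple tabulation of workers by primary commute mode and by number of persons per private vehicle; the category names and counts are hypothetical and are not Census figures.

```python
def mode_shares(workers_by_mode):
    """Return each mode's share of all commuters (fractions that sum to 1)."""
    total = sum(workers_by_mode.values())
    if total <= 0:
        raise ValueError("tabulation contains no commuters")
    return {mode: count / total for mode, count in workers_by_mode.items()}


def average_vehicle_occupancy(workers_by_persons_per_vehicle):
    """Average persons per private vehicle.

    Keys are persons per vehicle (1 = drove alone, 2 = two-person carpool, ...);
    values are the number of workers reporting that arrangement.
    """
    persons = sum(workers_by_persons_per_vehicle.values())
    # Each group of `size` workers shares one vehicle.
    vehicles = sum(n / size for size, n in workers_by_persons_per_vehicle.items())
    if vehicles <= 0:
        raise ValueError("tabulation contains no vehicles")
    return persons / vehicles


# Hypothetical counts for one area (illustration only).
commuters = {"private vehicle": 800, "transit": 150, "other": 50}
private_vehicle_workers = {1: 600, 2: 160, 4: 40}

shares = mode_shares(commuters)
occupancy = average_vehicle_occupancy(private_vehicle_workers)
```

Because only the primary mode is recorded, a multimodal trip contributes to exactly one key of the first tabulation, which is one reason shares computed this way are approximate.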
Table 5. Summary of data sources explored as dependent variables
Data source: Census of Population (journey-to-work data)
Nature: Decennial survey of trips to work on the Census day
Relevant data content:
· Workplace and residence geography
· Primary commute mode
· Vehicle occupancy
· Vehicle availability
· Departure time
· Travel time
Strengths and weaknesses:
· Very large sample for even small areas
· Broad geographic coverage
· Copious classification details
· Limited time series
· Poor handling of multimodal trips
· Includes commute trips only

Data source: Nationwide Personal Transportation Survey (NPTS)
Nature: Four surveys (since 1969) of household tripmaking on a sample day
Relevant data content:
· Person trips
· Vehicle trips
· VMT
· Mode, purpose, timing
· Vehicle occupancy
Strengths and weaknesses:
· Rich detail about local tripmaking for a nationally-representative sample
· Sample too small to provide a picture of individual MSAs
· Limited detail about trip geography
· Major changes in method limit comparability over time

Data source: American Housing Survey (AHS)
Nature: Annual survey includes some journey-to-work data
Relevant data content:
· Primary commute mode
· Trip length
· Trip timing
Strengths and weaknesses:
· Helps identify inter-Census trends at the national level and for specific MSAs

Data source: FHWA Highway Statistics
Nature: Annual time series of key statistics re highway provision and use
Relevant data content:
· Road mileage
· Total vehicle registrations
· Licensed drivers
· VMT
· Fuel consumption
· Fuel taxes
· Highway speeds
Strengths and weaknesses:
· Long time series of most data items
· Few data items of relevance
· Little geographical detail below the state level
· No vehicle occupancy statistics
· Uncertainties about cross-state comparability

Data source: AAMA Motor Vehicle Facts & Figures
Nature: Annual time series of key statistics re motor vehicle manufacture & use
Relevant data content:
· Vehicle sales
· Vehicle stock
· New vehicle registrations
· Vehicle costs
· Demographics of new car buyers
Strengths and weaknesses:
· Little of relevance that is not taken from other primary sources
· Limited metro area detail
· Use of some proprietary data restricted

Data source: Cordon counts for selected cities
Nature: Periodic observations of traffic into central areas
Relevant data content:
· Numbers of vehicles
· Numbers of people, by mode
· Some trip geography
· Trip timing
Strengths and weaknesses:
· Potentially long time series
· Market very relevant to transit
· MSA or corridor level data
· Methodological inconsistencies
· Occupancy data sometimes infrequent
Nationwide Personal Transportation Survey
This survey, carried out under the aegis of the US Department of Transportation,
was begun in 1969 and has since been repeated four more times, in 1977,
1983/84, 1990, and 1995. For a national sample of households, the NPTS
inventories the trips (of 75 miles or less) of all household members for a specified
24-hour period. Results of the 1995 survey have not yet been published.
Some major changes in coverage and procedures were made with the 1990 NPTS,
which (in our view) severely limit the ability to make comparisons with the data
from earlier years. Prior to the 1990 survey, the NPTS was carried out by the
Bureau of the Census, using personal, in-home interviews. For the 1990 study,
the NPTS was combined logistically with a similar US government periodic
survey of longer-distance travel by households (the National Travel Survey), and
both were undertaken by a private contractor using computer-assisted telephone
interviews of a much larger sample of households: 22,000 in 1990, compared
with 6,500 in 1983/84. Proxy information about a person's travel was accepted in
1990 if the household member could not be reached in person, unlike the situation
in the prior surveys. Transit trips are relatively rare in the NPTS databases, and
for a variety of reasons successive NPTS surveys have been judged to
underrepresent transit tripmaking.
Since the survey focuses specifically on household travel patterns, its major
strength is that it covers trips made for all purposes at a more detailed level than is
available from (say) the Census journey-to-work data. The data can be used to
examine, for example, estimates of the vehicle-miles and passenger-miles of
travel, intra-household vehicle allocation behavior, multi-modal journeys, and
non-work trips. Vehicle occupancy levels have been the subject of a number of
NPTS-based studies,2 but there has been some concern from time to time (most
notably in connection with the 1969 NPTS) about the accuracy of the occupancy
data.
American Housing Survey
This is a sample survey, administered by the Bureau of the Census for the US
Department of Housing and Urban Development. Although primarily a survey
about housing characteristics, the AHS does contain some questions about the
mode used for travel to work, as well as travel time and trip distance. There are
actually two samples for this survey:
· The national sample, the results of which contain only summary level
information for the entire US; and
· The metropolitan area sample, which surveys a group of 44 MSAs across the US.
The first American Housing Survey was conducted in 1973. It was next
conducted in 1976/77 and then again in 1979 and 1982. From 1985 it was conducted
not quite every other year, in 1985, 1989, 1991, and 1993. The latest version is
1995 but the results have not yet been published. Prior to 1985, there was a
different sample with different questions compared to current practices. Most
importantly, the journey-to-work (JTW) information has not been collected for the
metropolitan area sample since then; it is only available from the national sample,
which includes separate totals for metro areas and rural areas. The JTW questions
were also not asked every time the survey was conducted before 1985, and
perhaps the biggest limitation is that the metro areas sampled in each survey are a
rotating subset of the 44 areas described above; each MSA has probably been
sampled only once since the survey began.
Finally, the survey has a finer level of geography only for residence location and
not for workplace location. Thus while the metro area sample data could
potentially be used to determine mode shares and occupancies, for example, for
people living in suburbs or people living in central cities, it cannot provide figures
representing flows of those living in the suburbs and working in central cities.
FHWA's Highway Statistics
A compilation of highway-related data is published annually by the Federal
Highway Administration, primarily concerned with highway finance and physical
infrastructure. It is generally a good source for macro trend information, but the
smallest geographic unit is often the state and Highway Statistics contains no data
on vehicle occupancy levels. It does contain VMT estimates, however, assembled
by the states using purportedly standardized methods.
AAMA annual statistical compendium
The American Automobile Manufacturers' Association publishes an annual
compendium of data under the title Motor Vehicle Facts and Figures. This
provides a concise, single volume reference for annual statistics on the US and
worldwide motor vehicle industry, containing everything from car production
statistics to demographic data on new car buyers. The data of most relevance to
our purposes mostly duplicates what can be found in the primary sources already
discussed, with the exception of data on the composition of the vehicle stock and
on new (as distinct from total) vehicle registrations by state. Unfortunately, the
new registrations data are proprietary information, compiled privately by the R. L.
Polk Company, and cannot be used for further analysis without payment.
Data for explanatory variables
Table 6 provides a summary of the explanatory variables compiled for the cross-
sectional dataset for 1990. Tables containing the full datasets are presented in
Appendix A. Detailed sources for the data are provided in Appendix B. Most of
the data for the explanatory variables comes from the Censuses of Population and
Housing, including population and households, income, demographics, vehicle
availability, and housing and land use statistics. The primary source for the public
transit-related data was the National Transit Database (Section 15 data), and much
of the private vehicle level of service data came from various Census publications
and the FHWA's Highway Statistics.
Table 7 provides a similar summary for the cross-sectional data assembled for
1980. These data were collected for use in models relating changes in mode
shares and occupancies from 1980 to 1990. Where possible, we attempted to
obtain the same series as for 1990, but not all of these data were readily available.
As a result, this portion of the dataset is much smaller. These data were obtained
from essentially the same sources as those used for 1990.
TIME SERIES DATABASES
The introductory chapter pointed out that our original intention had been to
supplement the cross-sectional study of differences across metropolitan areas with
a time-series analysis within a few selected areas. This section describes selection
of possible cities for study and the exploration of available time-series data for
each of these cities.
Data on total motor vehicle registrations, by state, are publicly available from FHWA, but the
Polk data provide independent estimates of new registrations for each year and of the number
of passenger vehicles in operation, by model year.
Table 6. Explanatory Variables in Cross-Sectional Dataset for 1990
Category · MSA · Central City · [third column heading illegible]
Socioeconomics · Population · Population · Population
· % central city population · Households · Households
· Households · Household size · Household size
· Household size · Employment · Employment
· % suburban households · Employed labor force · Employed labor force
· Employment · Workers not working at home · Workers not working at home
· Employed labor force · Workers per household · Workers per household
· % central city employment · Median household income
· % suburban employment · % families household inc >$75K
· Workers not working at home · % families household inc <$15K
· Workers per household
· Median household income
· Median family income
· Per capita income
Demographics · % age 65+
· % black
· % foreign born
· % one person household
· % one parent family household
· % families on public assistance
Vehicle · Total vehicles available · Total vehicles available · Total vehicles available
availability · Private vehicles per capita · Private vehicles per capita · Private vehicles per capita
· Private vehicles per household · Private vehicles per household · Private vehicles per household
· Private vehicles per worker not working at home · Private vehicles per worker not working at home · Private vehicles per worker not working at home
· % 0 vehicle households
· % 1 vehicle households
· % 2 vehicle households
· % 1+ vehicle households
· % 2+ vehicle households
Private vehicle · TTI congestion index · Highway spending per capita
level of service · Average commute trip time
· % w/ 45+ min. commute
· Downtown parking costs
· State gas tax
· Average gasoline price
· Freeway miles per square mile
· Total freeway mileage
· Total road mileage
· HOV lane mileage
Transit level of · Peak fare
service · Peak vehicles
· Vehicle miles
· Vehicle hours
· Vehicles/vehicle miles
· Vehicles/vehicle hours
· Vehicle miles/worker
· Vehicle hours/worker
· Peak vehicles per worker
· Capital spending
· Rail dummy
· Heavy rail dummy
Land Use · Land area · Population density
· Population density · % zoned as residential
· % single family detached houses · % housing built prewar
· % single unit structure · % housing renter occupied
· % high density apartments
· % 5+ units in building
· % housing built prewar
· % housing built post-1970
· Housing density
· Median year housing built
· Median housing value
Other · Annual precipitation
· Average January temperature
· % non-peak departure time to work
· Coefficient of variation of departure time
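Several of the variables listed above are ratios of base series (for example, private vehicles per household), and one is a coefficient of variation. The sketch below shows one plausible way such derived variables could be computed. The report does not state its exact method, so the weighting scheme, the use of interval midpoints measured in minutes after midnight, and all input values here are assumptions made for illustration.

```python
import math


def per_unit(total, units):
    """Ratio variable such as vehicles per household or vehicle miles per worker."""
    if units <= 0:
        raise ValueError("denominator must be positive")
    return total / units


def coefficient_of_variation(midpoints, workers):
    """Standard deviation divided by mean of grouped departure times.

    `midpoints` are departure-interval midpoints (assumed: minutes after
    midnight); `workers` are the counts of workers leaving in each interval.
    """
    total = sum(workers)
    if total <= 0:
        raise ValueError("no workers in distribution")
    mean = sum(m * w for m, w in zip(midpoints, workers)) / total
    variance = sum(w * (m - mean) ** 2 for m, w in zip(midpoints, workers)) / total
    return math.sqrt(variance) / mean


# Hypothetical inputs (illustration only).
vehicles_per_household = per_unit(1_500_000, 1_000_000)
departure_cv = coefficient_of_variation([420, 480, 540], [1, 2, 1])
```

Note that a coefficient of variation computed this way depends on the chosen time origin, so values are comparable across areas only if the same origin is used throughout.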
Table 7. Explanatory Variables in Cross-Sectional Dataset for 1980
Category · MSA · Central City · [third column heading illegible]
Socioeconomics · Population · Population · Population
· % central city population · Households · Households
· Households · Household size · Household size
· Household size · Employment · Employment
· % suburban households · Workers not working at home · Employed labor force
· Employment · Workers per household · Workers not working at home
· Employed labor force · Workers per household
· % central city employment
· % suburban employment
· Workers not working at home
· Workers per household
· Median household income
· Per capita income
Vehicle · Total vehicles available · Total vehicles available · Total vehicles available
availability · Private vehicles per capita · Private vehicles per capita · Private vehicles per capita
· Private vehicles per household · Private vehicles per household · Private vehicles per household
· Private vehicles per worker not working at home · Private vehicles per worker not working at home · Private vehicles per worker not working at home
· % 0 vehicle households
· % 1 vehicle households
· % 2 vehicle households
· % 1+ vehicle households
· % 2+ vehicle households
Private vehicle · TTI congestion index (1982)
level of service · Average gasoline price
· Freeway miles per square mile
· Total freeway mileage
· Total road mileage
· HOV lane mileage
Transit level of · Peak fare
service · Peak vehicles
· Vehicle miles
· Vehicle hours
· Vehicles/vehicle miles
· Vehicles/vehicle hours
· Vehicle miles/worker
· Vehicle hours/worker
· Peak vehicles per worker
Land Use · % single unit structure
Data sources for dependent variables
Certain local metropolitan planning organizations periodically undertake
observational studies of the vehicular and/or passenger volumes passing key entry
points into the urban core. Natural funnels like bridges or tunnels are obvious
locations to make such "screen line" or "cordon count" observations, sometimes
using vehicle sensor equipment. Less frequently, these studies may also classify
private vehicles by type, and count the numbers of visible vehicle occupants (a so-
called "vehicle classification and occupancy" study). It was data like these that
we hoped to use as "dependent variables" in time series analysis, to explore how
changes over time in the relative costs and service characteristics of different
competing modes influence trip volumes and mode shares.
One advantage of cordon count time series data is that they usually focus on a
market that transit can serve reasonably well, the suburb-to-central-city commuter.
While there are reasons to suspect a measure of noise in the data (because of
changing geographical definitions, changes in methodology, lack of
methodological documentation, and the like), such time series would provide a
potentially rich source for increasing our quantitative understanding of commuting
behavior.
For the time-series analysis, we initially sought to identify four or five
prototypical cities for study, with the final selection of these cities being based on
the following criteria:
· Availability of good quality periodic data, without inordinate data collation
effort. The data would need to include (in addition to the cordon counts
themselves) vehicle classification/occupancy, demographics, employment,
prices, and perhaps some measure of congestion levels, development densities,
and so on.
· Variety of cities with respect to (for example) size, transit share, density, age
or state of development, and geographic location, as well as the levels of
commute trip mode share and vehicle occupancy and rates of change.
· Polar cases with respect to transit share and private vehicle availability.
Examples include areas such as New York (lowest private vehicle ownership,
highest transit share, one-third of the nation's transit ridership) and Los
Angeles/Orange County (high freeway provision, rapidly increasing highway
congestion, etc.).
Unfortunately, a survey of the MPOs in the relevant major metropolitan areas
revealed this plan to be overly optimistic. Table 8 shows that only a very limited
amount of cordon count data is collected on any regular basis. The only area
where counts have been consistently collected every year is New York City, but
even there occupancy is measured only every ten years. There are several years of
data for Washington DC, but the table shows that the construction of a complete
time series would be problematic. While there are historical records on freeway
and arterial vehicle volumes for Los Angeles, as well as a growing body of data
on HOV lane volumes, there has been no systematic measurement of vehicles
entering the central core area. Likewise, there were no suitable data available for
San Francisco. While there is a long series of screen line counts performed at the
Bay Bridge, the incentive to carpooling created by the Bay Bridge tolls makes this
location atypical of the rest of the area, and cordon counts are not performed at
any of the other major points of entry to the city.
In addition to the MPOs, we also contacted some of the city or state departments
of transportation. Most areas do some kind of average daily (or weekday) traffic
counts on a limited number of streets in the city, but they are not cordon counts
measuring entry into the core area. Accordingly, these data were not suitable for
our analysis.
Table 8. Summary of Cordon Count Data Availability
Metro Area        Data Availability
New York          Annual peak period cordon counts 1971-1994; decennial occupancy survey
Washington, DC    Peak period cordon counts with occupancy for 1968, 1974-1981, 1983, 1985, 1987, 1990, and 1993
Philadelphia      Cordon counts in 1960, 1980, 1985, 1990, 1995
Chicago           MPO did counts in 1981-1985 only
San Francisco     Bay Bridge only; city did one cordon count in 1984
Seattle           One cordon count in 1977
Minneapolis       One cordon count in 1984
Houston           Only one cordon count in mid-80s
Boston            Very infrequent; highway counts only
San Diego         Average weekday traffic only; occupancy survey every couple of years
Portland          Average daily traffic only; no occupancy data
Phoenix           Average weekday traffic only; no occupancy data
Kansas City       Average daily traffic only; limited occupancy data
Los Angeles       No cordon count data
Atlanta           No cordon count data
Explanatory variables
Table 9 below presents a summary of the explanatory variables in the time-series
database for New York City. The actual data and the sources from which they
were obtained are provided in Appendices C and D respectively.
Table 9. Explanatory Variables in Time-Series Database for New York City
Socioeconomic/demographic statistics
· Manhattan population
· Brooklyn population
· Queens population
· Bronx population
· Staten Island population
· Manhattan population as % of New York City population
· Manhattan population as % of New York CMSA population
· CMSA employment
· PMSA employment
· PMSA employed labor force
· New York City employment
· New York City CBD employment
· % central city employment
· % central county employment
· % CBD employment
Vehicle availability
· Total registered passenger vehicles in New York CMSA
· Registered vehicles per capita in New York CMSA
Prices
· Gasoline prices
· Bridge/tunnel tolls
· Quality adjusted new car prices
· MTA Transit fares
With only one city allowing a possible time series analysis, the task of collecting
data for explanatory variables was made somewhat simpler. Nevertheless, the
nature of time-series data makes it difficult to assemble a complete dataset with
many variables. Appendix C shows that many of the variables were not available
for a consistent set of years, and there are inevitably issues of compatibility among
data from different years. Most notably, though, because of budgetary and other
constraints, data for many variables that might be important in this type of
analysis are not collected every year, and thus can't be used in a time-series
analysis without further interpolation or other strong assumptions.
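To make concrete what "further interpolation" would involve, the following minimal sketch fills in years between sparse observations by straight-line interpolation. It is an illustration of the kind of strong assumption the text warns about, not a procedure used in this study; the function name and the occupancy values are hypothetical.

```python
def interpolate_linear(observed, years):
    """Fill in values for `years` by straight lines between observed years.

    `observed` maps a year to a measured value (e.g. a decennial occupancy
    survey). Years outside the observed range raise an error rather than
    being extrapolated.
    """
    known = sorted(observed)
    filled = {}
    for year in years:
        if year in observed:
            filled[year] = observed[year]
            continue
        if year < known[0] or year > known[-1]:
            raise ValueError(f"{year} lies outside the observed range")
        lower = max(k for k in known if k < year)
        upper = min(k for k in known if k > year)
        weight = (year - lower) / (upper - lower)
        filled[year] = observed[lower] + weight * (observed[upper] - observed[lower])
    return filled


# Hypothetical occupancy values measured ten years apart (illustration only).
occupancy_observed = {1980: 1.40, 1990: 1.30}
occupancy_annual = interpolate_linear(occupancy_observed, range(1980, 1991))
```

Interpolated years carry no independent information about year-to-year variation, so a model estimated on them would overstate the effective sample size.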
LIMITATIONS OF THE DATA
In attempting to analyze observed travel patterns to account for variations over
time or different geographical areas, there are several key issues that need to be
addressed: geographical definitions, the potential simultaneous determination of
correlated variables, and other factors relating to quality of the available data.
These issues are summarized in the following sections.
Geographic definition issues
As we have described, the data for the dependent variables in the cross-sectional
models developed in this study are taken from the Census of Population and
Housing. These data represent mode shares and occupancies for each of thirty
three metropolitan areas, computed from the Census data. There are several
issues relating to how the Census defines the geographic units into which the data
are classified that make their use in this type of analysis problematic. Most
important to this study are the definitions of what constitutes a "metropolitan
area" and the definition of so-called "central cities."
Metropolitan area definitions are an issue for two reasons: the size of the area
defined as the metropolitan area affects the mode shares and occupancies
computed for that area, and the definitions tend to change over
time, making it difficult to compare data across different years.
We have chosen as the relevant geographic unit of analysis for the cross-sectional
models the largest region defined by the Census to constitute the metropolitan
area for each city. That is, where the Census has designated a Consolidated
Metropolitan Statistical Area (CMSA) for the city, we have used this area for the
data. While some of the CMSAs represent very large areas, incorporating several
large central cities (such as Miami, for example), the use of only the MSA
definition was thought to be too small an area for the largest cities. The New
York City MSA, for example, includes only the City of New York, and none of
the surrounding counties.
Table 10 shows how the size of the area defined as the metropolitan area has
changed from 1980 to 1990, reflecting the official Office of Management and
Budget revision of the CMSA/MSA definitions carried out in 1983.3 As the table
indicates, there have been many changes in geography.
Table 10. Effects of 1983 Revision of Metropolitan Area Definitions
Area Unchanged
Los Angeles
Houston
Miami
Cleveland
Seattle
San Diego
Phoenix
Milwaukee
San Antonio
Buffalo
New York
Chicago
San Francisco
Philadelphia
Detroit
Boston
Washington
Minneapolis
St. Louis
Baltimore
Pittsburgh
Tampa
Kansas City
Sacramento
Portland
Columbus
New Orleans
Dallas
Atlanta
Denver
Cincinnati
Indianapolis
Providence
Source: Federal Highway Administration.
Clearly this would seem to suggest an inherent incompatibility in comparisons of
1980 and 1990 mode shares and occupancies derived from these data. One
potential solution to this problem might be to recompute the results based on
identical geography, but unfortunately the attempts to do so have been made only
at the national level.
But it is also not clear that having the same geography would resolve the
compatibility problem. The Census definitions of the metropolitan areas are far
from arbitrary, and are meant to describe an area relating to a specific locus of
economic and social activity. This relevant area will necessarily change over
time, and as such the definition of the metropolitan area should change
accordingly. As commuters to a central city area continue to move further away
from the city, they should continue to be counted in the calculations of shares and
occupancies for that city.
So, for example, there is something to be said for comparing transit shares in 1980
for what the Census calls "the Boston area" at that time to transit shares in 1990
for what the Census currently calls "the Boston area."
There is likewise an even more pervasive problem with the redefinition of "central
cities." Central cities are officially designated by the Office of Management and
Budget using criteria that include both total population and the ratio of employed to total
residents,4 with the goal being to select those cities that constitute the major
employment centers of the area. At any given point in time, these definitions will
affect mode shares and occupancies computed at the sub-regional levels of
geography, and they will often change markedly over time. Again, these
definitions are changed for good reason; a suburban "bedroom community" in
1980 could be a busy center attracting its own workforce in 1990. As such, it is
ambiguous as to whether the "correct" solution to this problem is to compare like
geographies (the same city to the same city) or like definitions (the Census-
designated central cities to the Census-designated central cities).
Given these considerations, we have elected to estimate the models of changes
between 1980 and 1990 for the total commute market only (rather than at sub-
regional levels of geography). For these models, we have calculated the changes
between 1980 and 1990 by comparing directly the 1980 and 1990 MSA
definitions, rather than attempting to adjust one or the other to obtain completely
consistent geography.
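The difference between comparing "like definitions" and comparing "like geographies" can be shown with a small numerical sketch. The two-county area, its boundary change, and all counts below are invented purely for illustration and do not correspond to any area in the dataset.

```python
def transit_share(counts, counties):
    """Transit share of commuters over the listed counties."""
    transit = sum(counts[c]["transit"] for c in counties)
    total = sum(counts[c]["total"] for c in counties)
    return transit / total


# Hypothetical commuter counts by county and year (illustration only).
counts_1980 = {"core": {"transit": 200, "total": 1000},
               "fringe": {"transit": 10, "total": 500}}
counts_1990 = {"core": {"transit": 180, "total": 1000},
               "fringe": {"transit": 20, "total": 1000}}

# Hypothetical boundary revision: the fringe county joins the area by 1990.
definition_1980 = ["core"]
definition_1990 = ["core", "fringe"]

# Like definitions: each year uses the boundary in force at the time,
# which is the comparison adopted in the text.
change_like_definitions = (transit_share(counts_1990, definition_1990)
                           - transit_share(counts_1980, definition_1980))

# Like geography: the 1990 boundary is applied to both years.
change_like_geography = (transit_share(counts_1990, definition_1990)
                         - transit_share(counts_1980, definition_1990))
```

In this invented case the measured decline in transit share is ten percentage points under like definitions but four under like geography, which shows how much the choice can matter.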
Other mode share calculation issues
Although the Census Journey-to-Work data have the strong advantages of national
scope and very large sample sizes - large enough to provide usable statistics for
quite small geographic areas - the restricted amount of space on the Census form
imposes some important limitations on how such essentially complex and diverse
behavior as commuting can be characterized in a uniform manner. Most
importantly, the questionnaire only allows respondents to record the "primary"
mode of their work trip, forcing all commute trips into a "single mode"
classification and thereby reducing the precision of mode share calculations. This
can be particularly problematic for cities with complex transit systems and
(presumably) many multimodal trips. A similar limitation involves workers with
multiple jobs - trip characteristics are only recorded for one occupation per
household member.
Cities within a metropolitan area meeting any of the following criteria are designated by OMB as
central cities in that metropolitan area:
1.) the largest city in population within the metropolitan area.
2.) cities with population of at least 250,000 or employment of at least 100,000.
3.) cities with population of 25,000 or more and an employment/population ratio of at least
0.75 and at least 40% of the employed residents working in the city.
4.) cities with population of 15,000-24,999 and at least one-third the population of the largest
central city and an employment/population ratio of at least 0.75 and at least 40% of the
employed residents working in the city.
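The OMB central-city criteria listed in the note to this section can be read as a simple rule set, and the sketch below encodes them that way. It is an illustration under stated assumptions, not OMB's own procedure: the `City` fields and the function name are hypothetical, "employment" is assumed to mean jobs located in the city, and the largest central city is taken to be the largest city in the area (which follows from criterion 1).

```python
from dataclasses import dataclass


@dataclass
class City:
    name: str
    population: int
    employment: int                  # assumed: jobs located in the city
    share_working_in_city: float     # fraction of employed residents working in the city


def is_central_city(city, largest_city_population):
    """Apply the four criteria; meeting any one is sufficient."""
    ratio = city.employment / city.population if city.population else 0.0
    # Shared employment test used by criteria 3 and 4.
    employment_test = ratio >= 0.75 and city.share_working_in_city >= 0.40

    if city.population >= largest_city_population:                    # criterion 1
        return True
    if city.population >= 250_000 or city.employment >= 100_000:      # criterion 2
        return True
    if city.population >= 25_000 and employment_test:                 # criterion 3
        return True
    if (15_000 <= city.population <= 24_999                           # criterion 4
            and city.population >= largest_city_population / 3
            and employment_test):
        return True
    return False


# Hypothetical cities in one metropolitan area (illustrative only).
core = City("core", 300_000, 120_000, 0.50)
bedroom_suburb = City("bedroom suburb", 40_000, 10_000, 0.20)
edge_city = City("edge city", 40_000, 35_000, 0.45)

for c in (core, bedroom_suburb, edge_city):
    print(c.name, is_central_city(c, largest_city_population=300_000))
```

Because criteria 3 and 4 depend on employment as well as population, a suburb that gains jobs between censuses can become a central city without any change in its boundaries, which is the redefinition problem discussed in this section.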
Lack of adequate time series data
Since we are interested in identifying the determinants of trends in mode shares
and occupancies over time, time series analysis is naturally an appropriate
approach to this study. While the Census data provide the level of detail needed,
they are only collected every ten years, and the earlier discussion highlights the
problems with obtaining a consistent time series of mode share or occupancy
measures for individual metropolitan areas. While the relationships we are
examining are sufficiently complex that no amount of data will allow us to
quantify them exactly, adequate time series data would be a very important
component in adding to our understanding of them. Ideally, this analysis would
be performed with a panel dataset of 20 or more years of annual time-series data
for each of the 33 metro areas.
Simultaneity considerations
An issue inherent in all analyses of this kind is the simultaneity of the
relationships of interest. The issue pervades many areas of the study of travel
(and related) behaviors. For example, does the decision to live in the suburbs
determine the level of vehicle ownership or vice versa? Likewise, we know that
level of transit service (frequency, for example) is a determinant of the demand for
transit service, but at the same time we know that a transit operator will determine
appropriate service frequencies based partly on anticipated demand.
Given the pervasive nature of this problem, then, what does it mean for this
analysis? In the most general sense, it means that this analysis will be limited in
its ability to identify fully the complex relationships of interest to the study, as
was made clear in the introductory chapter. On a more detailed level, though, it
also has some implications for the aspects of the relationships that we can identify
with the modeling techniques and data available to us.
A fundamental assumption of regression models is that the regressors are
uncorrelated with the error component or disturbance term of the model. When
one of the regressors is endogenous to the model rather than independent (perhaps
because it is a function of the dependent variable), this assumption is violated and
the coefficients of the model will be biased. Typically, this problem is addressed
through two-stage regression techniques employing instrumental variables -
variables that are not a function of the dependent variable, but are a good predictor
of the problem endogenous variable. In this type of analysis, though, finding
suitable instrumental variables can be highly problematic (for example, what is a
good predictor of frequency that is unrelated to demand?). We have attempted to
avoid this problem by selecting as regressors only variables that do not appear to
be endogenous.
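To make the bias and the two-stage remedy concrete, the following sketch simulates a single endogenous regressor and compares ordinary least squares with a two-stage estimate. Everything in it is an assumption made for illustration: the data are synthetic, the coefficients (a true slope of 2.0, a loading of 0.8 on the disturbance) are arbitrary, and the instrument `z` is valid by construction, which is exactly what is hard to guarantee with real transit data. It is not the estimation procedure used in this study.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def _cov(a, b):
    ma, mb = _mean(a), _mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def ols_slope(x, y):
    """Slope from a one-regressor least-squares fit of y on x."""
    return _cov(x, y) / _cov(x, x)


def two_stage_slope(x, y, z):
    """Two-stage least squares with a single instrument z."""
    # Stage 1: regress the endogenous regressor on the instrument.
    b1 = ols_slope(z, x)
    a1 = _mean(x) - b1 * _mean(z)
    x_hat = [a1 + b1 * zi for zi in z]
    # Stage 2: regress the dependent variable on the fitted values.
    return ols_slope(x_hat, y)


def simulate(n=5000, true_slope=2.0, seed=0):
    """Synthetic data in which x is correlated with the disturbance u."""
    rng = random.Random(seed)
    x, y, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)                   # instrument, independent of u
        ui = rng.gauss(0, 1)                   # disturbance term
        xi = zi + 0.8 * ui + rng.gauss(0, 1)   # endogenous regressor
        x.append(xi)
        y.append(true_slope * xi + ui)
        z.append(zi)
    return x, y, z


x, y, z = simulate()
print("OLS slope:      ", round(ols_slope(x, y), 3))
print("Two-stage slope:", round(two_stage_slope(x, y, z), 3))
```

With this setup the OLS slope settles noticeably above the true value of 2.0 because the regressor moves with the disturbance, while the two-stage estimate stays close to it.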
Other quality issues for explanatory variables
We have stressed that the relationships we are examining are inherently complex,
and are likely to be influenced by many factors, some easier to measure than
others. While we designed our analysis with the goal of tractability in mind, and
specifically used the largest metropolitan areas to maximize the amount and
quality of data available, there were nonetheless limitations in the content and
quality of data available to us.
Issues that one might think potentially important, such as parking availability, or
the incidence of employer-provided parking or employer-based programs and
incentives, could not be examined because no consistent source of metro area-
level data exists at present. There are also less direct measures of transportation
service that are believed to affect mode choices, but for which consistent
quantitative measures for the metropolitan areas of interest were also not
available. For example, quality of service measures such as on-time performance
or travel time reliability, transit vehicle or station cleanliness, availability of
traveler information, crime rates, etc., could not be included in the analysis.
Even where data are available or can be compiled for all of the metropolitan areas
in our cross-sectional dataset, some variables may represent an average over an
otherwise quite varied range of values for each city (parking costs, for example),
and will thus be a somewhat crude measure. In other cases, such as the incidence
of HOV facilities, we were able to obtain only an imprecise measurement of the
variable (we collected data on total HOV lane miles, but data on such issues as the
location of the facilities relative to major points of congestion are far more
difficult to obtain).
SUMMARY
The development of the model databases used in the analysis began with an initial
exploration of potential data sources. This exploration revealed that for
cross-sectional analysis, the only suitable data providing mode shares and occupancies
at the sub-metro area level of geography for many metro areas were those
collected in the Journey-to-Work section of the Census of Population and
Housing. These data were therefore used to create the dependent variables of the
cross-sectional dataset. The cross-sectional dataset contains data for the 33
metropolitan areas having population of one million or more in 1980. Data were
collected for both 1990 and 1980. Much of the data for the explanatory variables
also came from the Census, but was supplemented with several other sources.
Our examination of potential data sources for time-series analysis indicated a
strictly limited availability of suitable data. Contacts with local and state agencies
associated with fourteen of the largest cities revealed that New York City was the
only city with adequate time series data from which to compute annual mode
shares.
1 Hu, P.S., and Young, J., Nationwide Personal Transportation Survey: 1990 NPTS Databook
(two volumes), Washington, DC, US Department of Transportation, Federal Highway
Administration (1993).
2 See, for example,
Strate, H.E., Nationwide Personal Transportation Study. Report no. 1: Automobile
Occupancy, Washington, DC, US Department of Transportation, Federal Highway
Administration (1972).
Kuzmyak, J.R., 1977 Nationwide Personal Transportation Study. Report no. 6: Vehicle
Occupancy, Washington, DC, US Department of Transportation, Federal Highway
Administration (1981).
Klinger, D. and Kuzmyak, J.R., Personal Travel in the United States. 1983-1984 Nationwide
Personal Transportation Study (two volumes), Washington, DC, US Department of
Transportation, Federal Highway Administration (1986).
Hu and Young, op. cit.
3 Rosetti, M.A. and Eversole, B.S., op. cit.
4 US Office of Management and Budget, "Revised Standards for Defining Metropolitan Areas in
the 1990s". Federal Register, Vol. 55, No. 62, March 30, 1990.