Suggested Citation: "Appendix C: Some Statistical Issues in Inspection and Maintenance Evaluations." Transportation Research Board and National Research Council. 2001. Evaluating Vehicle Emissions Inspection and Maintenance Programs. Washington, DC: The National Academies Press. doi: 10.17226/10133.

Appendix C

Some Statistical Issues in Inspection and Maintenance Evaluations

VEHICLE EMISSIONS DISTRIBUTIONS

A few broken vehicles contribute a disproportionate share of total emissions, which comprise exhaust hydrocarbon (HC), carbon monoxide (CO), and nitrogen oxide (NOx) emissions and evaporative HC emissions (including diurnal, running-loss, liquid-leak, and hot-soak emissions). A vehicle can deteriorate in many ways, and different types of deterioration affect emissions differently. As discussed in Chapter 1 of this report, there is considerable correlation between exhaust HC and CO emissions in high-emitting vehicles, but little relationship between these and high NOx emissions. High evaporative emissions may or may not be correlated with high exhaust emissions (Pierson et al. 1999). Vehicles with high diurnal evaporative emissions are usually different from those with high hot-soak emissions.

Because most vehicles do not have high emissions, the distribution of emissions among vehicles for any single emission type is highly skewed and is best characterized by a log-normal or gamma distribution (Zhang et al. 1994; Wenzel et al. 2000). Emissions distributions are more skewed for newer model-year vehicles than for their older counterparts, because only a few newer vehicles have broken fuel delivery or emissions-control equipment. HC and CO emissions distributions are more skewed than those of NOx.
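To make the effect of skewness concrete, the following is a minimal sketch that assumes emissions follow an exact log-normal distribution. The function name `top_share` and the values of `sigma` (the standard deviation of log emissions) are illustrative assumptions, not parameters taken from the report. It uses the standard closed-form result that the dirtiest fraction p of a log-normal population contributes a share 1 - Φ(z - sigma) of the total, where z is the standard-normal cutoff for the top p.

```python
from statistics import NormalDist


def top_share(sigma: float, top_fraction: float) -> float:
    """Share of total emissions from the dirtiest `top_fraction` of vehicles,
    assuming log-normal emissions with log-scale standard deviation `sigma`."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - top_fraction)  # cutoff in standard-normal units
    return 1.0 - std_normal.cdf(z - sigma)


# Hypothetical sigma values: a larger sigma means a more skewed fleet.
for sigma in (0.5, 1.0, 1.5, 2.0):
    share = top_share(sigma, 0.10)
    print(f"sigma={sigma:.1f}: top 10% of vehicles emit {share:.0%} of the total")
```

Under this assumption the share held by the top 10% of vehicles grows steadily with `sigma`, which is one way to read the statement that newer model years, with only a few broken vehicles, have more skewed distributions.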

SAMPLING METHODS AND BIAS

Special care must be taken to avoid selection bias. Because a small percentage of vehicles emits a large percentage of the emissions, any sampling technique that decreases or increases the proportion of these higher emitters in the sample may lead to questionable conclusions. Every report describing an analysis of an inspection and maintenance (I/M) program that uses a sample of the vehicle fleet to estimate the benefits of the program should estimate the degree of selection bias in the sample. The methods for selecting the vehicles in the sample and the tests used to determine whether the sample is representative of the fleet should be described. An estimate of the magnitude of selection bias should also be made for any vehicle sample used to derive a correlation between an I/M test and the Federal Test Procedure (FTP) for the purpose of estimating the tons of pollutant per day reduced by the I/M program.

Mail solicitation of vehicles for laboratory testing has been practiced by the California Air Resources Board (CARB), the U.S. Environmental Protection Agency (EPA), and auto manufacturers. Responses to mail solicitations have shown low voluntary acceptance by vehicle owners; CARB and EPA have typically experienced acceptance rates on the order of 10%. In the California I/M pilot study, when CARB threatened vehicle owners solicited by mail with loss of registration if they did not come in for testing, only a 60% acceptance rate was obtained.1 The selection bias in mail solicitation sampling will be influenced by the rewards, penalties, and risks perceived by the recipient of the solicitation. If vehicle owners think that they might be penalized by the result of the tests they are asked to volunteer for, they will be less likely to agree. If free inspections and repairs are offered for the vehicles to be tested, a disproportionate number of dirty vehicles might be included in the sample.
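The size of this kind of selection bias can be illustrated with a small calculation. The sketch below is not from the report: the function `sample_mean_with_refusals` and every number in it (the share of high emitters, their emission levels, and the acceptance rates) are hypothetical values chosen only to show the mechanism.

```python
def sample_mean_with_refusals(frac_high, e_high, e_low, accept_high, accept_low):
    """Mean emissions among vehicles whose owners agree to be tested.

    All inputs are hypothetical illustration values, not data from the report.
    """
    n_high = frac_high * accept_high        # recruited high emitters, per fleet vehicle
    n_low = (1.0 - frac_high) * accept_low  # recruited normal emitters, per fleet vehicle
    return (n_high * e_high + n_low * e_low) / (n_high + n_low)


# Hypothetical fleet: 10% high emitters at 10 units, 90% normal emitters at 1 unit.
frac_high, e_high, e_low = 0.10, 10.0, 1.0
fleet_mean = frac_high * e_high + (1.0 - frac_high) * e_low

# Owners of high emitters accept half as often as other owners (assumed).
biased_mean = sample_mean_with_refusals(frac_high, e_high, e_low, 0.30, 0.60)
# Equal acceptance rates leave the sample mean unbiased.
unbiased_mean = sample_mean_with_refusals(frac_high, e_high, e_low, 0.60, 0.60)

print(f"fleet mean:            {fleet_mean:.2f}")
print(f"sample mean (biased):  {biased_mean:.2f}")
print(f"sample mean (equal):   {unbiased_mean:.2f}")
```

With these assumed inputs, the sample understates the fleet average even though the overall acceptance rate looks respectable, because the few vehicles that dominate the total are the ones most likely to be missing.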
1 The CARB acceptance rate of 60% is not a 60% response rate to the letters that were mailed out. Fifteen hundred solicitation letters were mailed. Of the 1,500, 444 vehicles were dismissed, usually because (1) the vehicle had previously received a Smog Check test, (2) the addressee no longer owned the vehicle, or (3) the solicitation letter was returned as undeliverable. The "60% response rate" is 60% of the (1,500 - 444) remaining vehicles. Of the 1,500 letters mailed out, only 43% resulted in a recruited vehicle.
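The arithmetic in this footnote can be checked directly. The sketch below uses only the three figures the footnote gives (1,500 letters, 444 dismissed, and the 60% and 43% rates). The number of recruited vehicles is not stated in the text, so it is back-calculated here from each percentage in turn; the variable names are illustrative.

```python
mailed = 1500      # solicitation letters sent (from the footnote)
dismissed = 444    # already Smog Checked, no longer owned, or undeliverable
eligible = mailed - dismissed

# The recruited count is not given in the text; back-calculate it both ways.
recruited_from_60pct = 0.60 * eligible  # implied by the 60% acceptance rate
recruited_from_43pct = 0.43 * mailed    # implied by the 43% overall figure

print(f"eligible vehicles: {eligible}")
print(f"60% of eligible = {recruited_from_60pct:.0f} vehicles "
      f"= {recruited_from_60pct / mailed:.1%} of letters mailed")
print(f"43% of mailed   = {recruited_from_43pct:.0f} vehicles "
      f"= {recruited_from_43pct / eligible:.1%} of eligible vehicles")
```

The two routes agree only approximately, which suggests that at least one of the reported percentages is rounded.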

Higher acceptance rates have been obtained when vehicles were solicited directly rather than by mail. Direct recruitment at an I/M lane for multiday evaporative emissions testing resulted in a 90% acceptance rate in the Coordinating Research Council's (CRC) study of evaporative emissions. High acceptance rates were also obtained in the 1997-1999 California roadside tests, where vehicles were pulled over by police officers, their drivers were asked if they would agree to be tested, and the vehicles were then tested at roadside. In a sample of the California roadside-test program, where vehicles were also measured by remote sensing as they left the roadside test area, a 92% acceptance rate was obtained. The remote-sensing results in this case showed no differences in the average HC and CO emissions for a model year between the vehicles whose drivers agreed to be tested and those whose drivers refused. An earlier roadside testing program, however, showed evidence of considerable bias because of higher emissions from vehicles whose owners refused to have them tested.2

Samples of vehicles taken from I/M lanes will not include vehicles that avoid being tested. These include both unregistered vehicles and vehicles registered in an area where I/M is not required but garaged and driven in the I/M area. If recruitment at I/M lanes is limited to vehicles arriving for their initial tests without pre-test repairs, selection bias may exist, because vehicles that received pre-test repairs probably had higher emissions. Vehicles that are unsafe to test on a dynamometer would not be tested in an I/M program using loaded-mode testing.3

2 In an earlier roadside testing program in California, remote sensing showed that vehicles whose owners refused to allow testing had emissions more than twice those of vehicles whose owners agreed to the roadside tests (Stedman et al. 1994).

3 There is a group of vehicles (a relatively small fraction of the fleet) that cannot be tested on a two-wheel drive (2WD) dynamometer because of all-wheel drive (AWD) or non-switchable traction control. However, many centralized I/M programs are either using or considering the use of AWD dynamometers and therefore will be able to test these vehicles. The number of vehicles that are unsafe to test on a dynamometer is small enough that it can be considered insignificant. Many of these are older (pre-1981) vehicles that are not subject to loaded-mode testing.

FACTORS INFLUENCING VEHICLE EMISSIONS

Vehicle emissions are influenced by numerous factors other than I/M. When evaluating the effect of an I/M program, other factors that may be partly responsible for emissions changes (either reducing or increasing them) need to be considered. To isolate the influence of the I/M program itself, these other factors should be shown to have a low impact on the results, to be controlled, or to be randomized through sample selection.

Vehicle age and model year are important factors in vehicle emissions, but the most important factor is vehicle maintenance. Most analyses group vehicle data by model year, which is closely associated with vehicle technology. For example, the shift from carbureted to fuel-injected vehicles occurred over a period of a few years. Since emissions deterioration rates have been decreasing due to improvements in vehicle design, the amount of emissions reduction attributable to an I/M program will be a function of the year during which the evaluation was made. When comparing evaluations of I/M programs in different years, changes in vehicle technology must be taken into account. This includes vehicle design (engine design, fuel delivery system, and emission controls) and vehicle type (i.e., passenger car, light-duty truck), especially if different vehicle model years and different vehicle types within a model year were built to different regulatory emission standards.

The amount of vehicle use (miles driven per year) can also influence vehicle emissions. High-use vehicles (e.g., taxis) can be expected to have faster rates of deterioration than similar vehicles driven fewer miles.

Another important factor influencing vehicle emissions is driving mode (i.e., cold start, warm start, low load, high load, high acceleration or deceleration). In a cold start, the emissions control systems will not be operating at full capability. Under high load, many vehicles are designed to run fuel rich, causing very high CO emissions. During high acceleration or deceleration, some vehicles have high HC emissions.

Fuel quality also affects emissions.
Laboratory tests using fuel different from that blended for local conditions can introduce bias into the data. Fuel composition parameters influencing emissions include volatility (especially for evaporative emissions), sulfur level, the presence of oxygenates (especially for older, carbureted vehicles), and other fuel reformulation, such as federal and California reformulated gasolines. Where there is a seasonal change in fuel composition (e.g., higher oxygenates and volatility in winter), comparisons of year-to-year changes in vehicle emissions should use vehicles sampled at the same time of year. In the wintertime, areas using oxygenated fuels may have a lower I/M test failure rate, because CO emissions are reduced by as much as 10%. In this situation, the use of oxygenated fuels allows vehicles that normally would have failed to "pass the test" without being repaired.

Additionally, ambient conditions (temperature, altitude, and humidity) can have an impact on vehicle emissions, as evidenced by the strong seasonal variation in emissions seen in I/M and remote-sensing data.

Finally, motorist socioeconomics can have an effect on vehicle emissions. Less affluent areas will have, on average, older vehicles. But age-corrected vehicle emissions from a less affluent area still showed higher emissions (Stedman et al. 1994). Correlations have been found between average vehicle emissions and the zip code where the vehicle was registered (Singer and Harley 2000). Vehicles of the same model year and vehicle type registered in more affluent areas have lower emissions (Wenzel 1997).

HUMAN BEHAVIORAL FACTORS

The introduction of a more severe I/M program can lead to the re-registration of vehicles into areas not requiring the new I/M program (Stedman et al. 1997; McClintock 1998). Some of these re-registered vehicles still drive in the I/M program area but are not subject to inspection. The degree of enforcement affects how often motorists avoid the test.

An evaluation of an I/M program should help in determining where additional enforcement and/or additional economic incentives would improve program benefits. Fraud, such as clean piping, distorts program assessment based solely on I/M program records. Clean piping is a type of test fraud in which a technician tests a clean vehicle and attributes the result to the vehicle that is supposed to be tested, to ensure that the vehicle passes. The inspection-lane data would then indicate that the program is more effective than it actually is. Roadside pullover testing and remote-sensing measurements are not sensitive to test fraud and could help to identify testing stations that should be subject to covert audits to detect fraudulent behavior.
NUMBER OF VEHICLES

I/M programs are designed to minimize the percentage of high-emitting vehicles. To tell whether such vehicles are becoming less prevalent in the fleet, sufficient numbers of vehicles need to be in the sample. Comparing two populations requires enough vehicles in each to make a valid comparison. If further subdivision of the population is necessary, additional vehicle emissions data are required. A small sample may produce too much uncertainty to adequately describe the average emissions reductions due to the I/M program. The size of the sample to use depends on the amount of confidence one wants in the results.4

The number of vehicles needed to characterize fleet emissions is also influenced by vehicle test-to-test variability. Some vehicles, especially some high-emitting ones, show considerable variation in emissions in repeated tests under the same conditions.

The shape of the vehicle emissions distribution has consequences for how to choose vehicle samples for analysis. Stratified sampling is a method of reducing the total number of vehicles sampled while still obtaining sufficient numbers of high-emitting vehicles. For this purpose, more vehicles are selected from segments of the population that are expected to have more high-emitting vehicles. To keep the frequency of these vehicles in perspective, a parallel record of the frequency of each fleet segment in the total fleet has to be obtained.

A variety of stratified sampling methodologies have been used. EPA and CARB have used sampling strategies based on vehicle technology groupings and vehicle model year. California has used Radian/Eastern Research Group's high-emitter index to select vehicles for sending to test-only inspection stations. The CRC E-35 evaporative emissions study selected equal numbers of vehicles in three age groups corresponding to their evaporative emissions-control technologies.

4 When Stedman et al. (1997) applied the step method in Denver, five daily averages were used to estimate the effect of the enhanced I/M program being introduced.
With about 4,000 remote-sensing measurements per day (2,000 each for "enhanced I/M tested" and "not enhanced I/M tested"), the study found an emissions reduction benefit of about 7% ± 2-3%, with 95% confidence. Because Stedman et al. ran the experiment halfway through the new biennial program, almost all other factors were randomized. The appropriate sample size in any specific test will depend on whether the significance of other factors (such as vehicle type, fuel, and socioeconomics) needs to be understood. Guidance should be sought from a statistician familiar with handling non-normal distributions.
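The role of the parallel record of fleet-segment frequencies in stratified sampling can be sketched as follows. This is a minimal illustration of a standard stratified estimator, not a method prescribed by the report; the function `stratified_mean`, the two strata, their fleet shares, and the emission values are all hypothetical.

```python
def stratified_mean(strata):
    """Fleet-average emissions from a stratified sample.

    `strata` maps a stratum name to (fleet_share, sampled_emissions).
    Each stratum mean is weighted by its share of the fleet, not of the sample.
    """
    total_share = sum(share for share, _ in strata.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("fleet shares must sum to 1")
    return sum(share * (sum(xs) / len(xs)) for share, xs in strata.values())


# Hypothetical data: older vehicles are 20% of the fleet but deliberately oversampled.
sample = {
    "older": (0.20, [2.0, 9.0, 1.0, 12.0, 4.0, 2.0]),
    "newer": (0.80, [0.5, 1.5]),
}

weighted = stratified_mean(sample)

# Pooling all measurements ignores the oversampling and overstates the fleet mean.
pooled = [x for _, xs in sample.values() for x in xs]
naive = sum(pooled) / len(pooled)

print(f"stratified estimate: {weighted:.2f}")
print(f"naive pooled mean:   {naive:.2f}")
```

Weighting by fleet share is what keeps the oversampled high-emitting segment "in perspective"; without the parallel record of segment frequencies, the weights would be unknown.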

AVERAGE VALUES AND LOG TRANSFORMS

Some researchers take the log of (sometimes binned) vehicle emissions data to obtain a near-normal distribution and then take the mean and the error limits of the log-transformed distribution. This reduces the weight (importance) of the high-emitting vehicles in the relationships (Pollack et al. 1999). The mean of a log-transformed sample corresponds to the geometric mean of the sample rather than the arithmetic mean. The total vehicle emissions introduced into the atmosphere are the sum of all the individual vehicle emissions, that is, the arithmetic average of emissions per vehicle times the number of vehicles. The geometric mean of a set of log-normally distributed vehicle emissions will always be less than the arithmetic average.

TESTING NULL HYPOTHESES

If the sample is sufficiently large, it can be randomly divided into two sets, and the difference between the (model-year weighted) averages of the two sets should be zero to within the range of uncertainty. Assumptions about the lack of influence of certain factors should be checked by testing a null hypothesis.

CONFIDENCE LIMITS

Confidence limits of vehicle emissions in log-normal or gamma distributions are asymmetric and can be generated using bootstrap analysis. A bootstrap approach is a Monte Carlo-style simulation technique used to estimate the confidence interval when errors are non-normally distributed (Chatterjee et al. 1997). Normal statistics can be applied to the arithmetic averages of sufficiently large subsamples of non-normal distributions. The confidence limits, in this case symmetrical, apply to the means of the samples.

CORRELATIONS

Pearson product-moment correlation is not appropriate for log-normal or gamma distributions because the R² values are overly influenced by the small, high-emitting fraction. Scatter plots should be presented in linear space so the reader can visually assess the degree of correlation. Spearman rank correlation can be used instead, however, to reduce the influence of the high emitters.

ISSUES IN THE USE OF I/M TEST DATA TO ESTIMATE I/M BENEFITS

The difference between the initial fail and final pass results for the same vehicle tested in one cycle of the I/M program overstates the benefit of the I/M program because of regression to the mean.5 Also, the deterioration of emissions until the next required test (the next test cycle) is not taken into account.

5 A description of the statistical concept of regression to the mean by W.M. Trochim, a professor in the Department of Policy Analysis and Management at Cornell University, is available on the web at http://trochim.human.cornell.edu/kb/regrmean.htm.

COMPARISON OF TEST RESULTS AMONG PROGRAMS

I/M evaluation methods depend on comparing emissions from one vehicle fleet with those from another, or comparing emissions from one vehicle fleet at different times. Fleet emissions depend on the load that the vehicles are under during the test. To compare a test fleet with a reference fleet measured using a different test, a correlation equation is necessary. This equation is created from a third fleet (a "correlation fleet") that has undergone both tests. The test and reference fleets need to be free from selection bias and representative of the same population. The correlation fleet also needs to be free from selection bias and representative of the same vehicle population, unless it can be shown that the correlation equation is not sensitive to potential differences between the correlation fleet and the test and reference fleets. The correlation equations should be derived from fleets subject to the same kind of test procedure.

Significant differences between fleets may be caused by differences in any or all of the following: vehicle emission control and fuel system technologies, vehicle ages, vehicle types, inspection and maintenance histories, and owners' socioeconomic histories. In addition, similar fuel and environmental conditions (altitude, temperature, etc.) may be required for the test, reference, and correlation fleet emission measurements. However, fuel and environmental conditions are not taken into account in the pass/fail cutpoints when the scheduled I/M test is performed.
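The points made under AVERAGE VALUES AND LOG TRANSFORMS and CONFIDENCE LIMITS can be illustrated together. The sketch below is one common form of the bootstrap (the percentile method) and is not necessarily the procedure used in the cited studies. The simulated sample, its log-normal parameters, the seed, the helper `bootstrap_ci`, and the number of resamples are all assumptions made for illustration.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical skewed emissions sample (log-normal; parameters are illustrative only).
emissions = [rng.lognormvariate(0.0, 1.5) for _ in range(500)]

# The arithmetic mean tracks total emissions; the geometric mean is the
# back-transformed mean of the logs and down-weights the high emitters.
arith_mean = statistics.fmean(emissions)
geo_mean = math.exp(statistics.fmean([math.log(x) for x in emissions]))


def bootstrap_ci(data, n_resamples=2000, level=0.95, rng=rng):
    """Percentile bootstrap confidence interval for the arithmetic mean."""
    n = len(data)
    # Resample with replacement and record the mean of each resample.
    means = sorted(statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples))
    tail = (1.0 - level) / 2.0
    lo = means[int(tail * n_resamples)]
    hi = means[int((1.0 - tail) * n_resamples) - 1]
    return lo, hi


low, high = bootstrap_ci(emissions)

print(f"arithmetic mean: {arith_mean:.3f}")
print(f"geometric mean:  {geo_mean:.3f}")
print(f"95% bootstrap interval for the arithmetic mean: [{low:.3f}, {high:.3f}]")
print(f"distance below mean: {arith_mean - low:.3f}, above mean: {high - arith_mean:.3f}")
```

For skewed data the upper limit typically sits farther from the mean than the lower limit, which is the asymmetry the text describes; the last printed line lets the reader check this for the simulated sample rather than taking it on trust.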
