Currently Skimming:

Chapter 3: Statistics and Data Analysis
Pages 38-63

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 38...
... no rollover) to be worth using for consumer information." To this end, the agency undertook a statistical study to investigate the relationship between measured values of SSF for a range of vehicles and corresponding rollover rates determined from real-world crash data (Federal Register 2000)
From page 39...
... The rationale behind the agency's choice of crash data is then reviewed, with particular emphasis on the selection of data from six states for use in constructing the rollover resistance rating system.

Crash Data Files

Four major databases maintained by NHTSA have the potential to support evaluation of rollover collisions, including rollover rates: • State Data System (SDS)
From page 40...
... database

• Approximately 50,000 crashes included annually
• Data acquired from sample of police-reported crashes in 400 jurisdictions within 60 areas across the United States
• Can be used to produce national estimates of crash-related safety problems at all levels of injury severity, from property damage–only to fatal

• Estimates of rollover rates, injury severity, and other characteristics associated with rollover crashes should provide reasonable national estimates of the problem, provided the sampling is not biased

Crashworthiness Data System (CDS)

• Part of NASS
• Includes detailed postcrash data collected by trained investigators
• 4,000–5,000 crashes included annually, selected randomly from a sample of jurisdictions; includes all levels of injury severity
• Data acquisition includes detailed review of crash site, examination of vehicle(s)

• Contains most-detailed crash data available in any national file, including an entire subset of variables associated with rollover
• Does not contain sufficient numbers of rollover crashes to be useful for national modeling analysis
• Used by NHTSA to assess relative frequencies of "investigator defined"
From page 41...
... Thus, although the FARS, GES, and CDS databases were deemed inadequate, they were useful in informing NHTSA's analyses.

Importance of Single-Vehicle Crashes

NHTSA's analyses used SDS crash data relating to single-vehicle events only (see below)
From page 42...
... NHTSA selected six states for modeling: Florida, Maryland, Missouri, North Carolina, Pennsylvania, and Utah. The corresponding single-vehicle crash data were used in the modeling analysis that resulted in the curve used to establish the star rating values for individual vehicle models.
From page 43...
... In light of NHTSA's responsibilities for establishing national policy and providing information relevant at the national level, it is important that the rollover crash data used to derive consumer information be representative of all states. Hence, the agency undertook an additional effort that involved using the GES database to determine whether the rollover rate for a national sample of single-vehicle crashes was similar to the rate for the six states included in the original analysis.
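
One simple way to compare a rollover rate from the six-state data with a rate from a national sample is a two-proportion z-test. The sketch below is only an illustration of that idea: the counts are hypothetical, the function name is ours, and the test ignores the survey weighting that a sample such as GES would require. The source does not say which comparison method the agency used.

```python
import math

def two_proportion_z(rollovers_a, crashes_a, rollovers_b, crashes_b):
    """Compare two rollover rates with a pooled two-proportion z-test.

    Returns (rate_a, rate_b, z, two_sided_p_value).
    Assumes simple random samples; survey weights are NOT handled.
    """
    rate_a = rollovers_a / crashes_a
    rate_b = rollovers_b / crashes_b
    pooled = (rollovers_a + rollovers_b) / (crashes_a + crashes_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / crashes_a + 1.0 / crashes_b))
    z = (rate_a - rate_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return rate_a, rate_b, z, p_value

# Hypothetical counts (rollovers, single-vehicle crashes), for illustration only
six_state = (4_000, 20_000)
national = (1_050, 5_000)

rate_six, rate_nat, z, p = two_proportion_z(*six_state, *national)
print(f"six-state rate={rate_six:.3f} national rate={rate_nat:.3f} z={z:.2f} p={p:.3f}")
```

A small p-value would suggest the two rates differ by more than sampling noise; a large one would be consistent with the six states being broadly representative on this measure.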
From page 44...
... The explanatory variables are typically divided into two groups: the vehicle metrics are in one group and the driver characteristics and environmental variables in the other. This latter group defines what is called a scenario.
From page 45...
... The purpose of the statistical analysis is to investigate what the crash data indicate about the effect of SSF on a vehicle's propensity to roll over and whether the magnitude of this effect depends on driver and environmental variables. The example of a double-decker bus illustrates the complexities involved in interpreting the results of such crash data analyses.
From page 46...
... A binary-response model is referred to as a logit model if F is the cumulative logistic distribution function and as a probit model if F is the cumulative normal distribution function. NHTSA employed a logit model in its statistical analysis of rollover crash data.
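
The distinction between the two models can be sketched in a few lines. The example assumes the rollover probability is F applied to a linear index with only an intercept and an SSF term; the coefficient values are hypothetical and chosen purely for illustration, and a fuller model would also include the scenario variables.

```python
import math

def logistic_cdf(x: float) -> float:
    """Cumulative logistic distribution function (logit model)."""
    return 1.0 / (1.0 + math.exp(-x))

def normal_cdf(x: float) -> float:
    """Cumulative standard normal distribution function (probit model)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def rollover_probability(ssf, intercept, slope, link=logistic_cdf):
    """P(rollover) = F(intercept + slope * SSF) for a chosen F."""
    return link(intercept + slope * ssf)

# Hypothetical coefficients, NOT estimates from any crash data
INTERCEPT, SLOPE = 4.0, -4.5

for ssf in (1.0, 1.2, 1.4):
    p_logit = rollover_probability(ssf, INTERCEPT, SLOPE)
    p_probit = rollover_probability(ssf, INTERCEPT, SLOPE, link=normal_cdf)
    print(f"SSF={ssf:.2f} logit={p_logit:.3f} probit={p_probit:.3f}")
```

With the same coefficients the two links give different probabilities, because the logistic distribution has heavier tails than the normal; in practice each model would be fitted separately and would yield its own coefficients.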
From page 47...
... The current rating system for rollover resistance was constructed using an estimated rollover curve also based on an exponential model. The uncertainties associated with this estimated rollover curve were not considered in deriving the star rating categories.
From page 48...
... In contrast, the rollover probability for the average scenario is _ P*
From page 49...
... Specifically, statistically reliable estimates of the rollover probabilities are obtained when the logit model is estimated by maximum likelihood from the ungrouped binary data. Consequently, statistical uncertainty about the rollover curve is not an issue when the logit model is used.
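
As a rough sketch of what maximum-likelihood estimation from ungrouped binary data involves, the example below fits a one-variable logit model by Newton-Raphson and derives a 95 percent interval for a predicted probability from the inverse information matrix. Everything here is an assumption made for illustration: the data are synthetic, the "true" coefficients are invented, the function names are ours, and the source does not state which numerical procedure was actually used.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def score_and_information(a, b, xs, ys):
    """Gradient of the log-likelihood and the 2x2 information matrix."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(a + b * x)
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return (g0, g1), (h00, h01, h11)

def invert_2x2(h00, h01, h11):
    """Inverse of a symmetric 2x2 matrix, returned as (c00, c01, c11)."""
    det = h00 * h11 - h01 * h01
    return (h11 / det, -h01 / det, h00 / det)

def fit_logit(xs, ys, max_iter=50, tol=1e-10):
    """Newton-Raphson MLE for P(y=1) = sigmoid(a + b*x).

    Returns (a, b, cov), where cov is the estimated covariance of (a, b).
    """
    a = b = 0.0
    for _ in range(max_iter):
        (g0, g1), info = score_and_information(a, b, xs, ys)
        c00, c01, c11 = invert_2x2(*info)
        da = c00 * g0 + c01 * g1
        db = c01 * g0 + c11 * g1
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    _, info = score_and_information(a, b, xs, ys)
    return a, b, invert_2x2(*info)

def predict_with_interval(ssf, a, b, cov, z=1.96):
    """Predicted probability with a Wald interval built on the linear index."""
    c00, c01, c11 = cov
    eta = a + b * ssf
    se = math.sqrt(c00 + 2.0 * ssf * c01 + ssf * ssf * c11)
    return sigmoid(eta - z * se), sigmoid(eta), sigmoid(eta + z * se)

# Synthetic crash records: one (SSF, rolled over?) pair per crash
rng = random.Random(0)
TRUE_A, TRUE_B = 4.0, -4.5  # hypothetical values used only to simulate data
ssf_values = [rng.uniform(0.95, 1.60) for _ in range(5000)]
rolled = [1 if rng.random() < sigmoid(TRUE_A + TRUE_B * s) else 0 for s in ssf_values]

a_hat, b_hat, cov = fit_logit(ssf_values, rolled)
lower, predicted, upper = predict_with_interval(1.20, a_hat, b_hat, cov)
print(f"a={a_hat:.2f} b={b_hat:.2f} P(rollover | SSF=1.20)={predicted:.3f} "
      f"[{lower:.3f}, {upper:.3f}]")
```

Because each crash contributes its own term to the likelihood, the information matrix grows with the number of records, so the interval narrows as more crashes are added.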
From page 50...
... At the request of the Alliance of Automobile Manufacturers, Exponent Failure Analysis Associates reviewed NHTSA's statistical analyses of crash data that serve as the basis for the star rating system for rollover resistance. As part of this review, Exponent (Donelson and Ray 2001)
From page 51...
... Hence, an increase in the size of the crash dataset does not improve the accuracy of the estimates according to the formulas employed by Exponent and NHTSA. This result indicates that something is wrong with the method used to calculate the confidence intervals shown in Figure 3-1.
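
The expectation being violated here can be illustrated with the simplest possible case. The sketch below uses a Wald interval for a single proportion rather than the rollover curve itself, so it is an analogy and not a reproduction of the formulas used by Exponent or NHTSA; the rollover rate and sample sizes are hypothetical.

```python
import math

def interval_width(n, rate=0.20, z=1.96):
    """Width of a Wald 95% interval for a proportion observed in n crashes."""
    half_width = z * math.sqrt(rate * (1.0 - rate) / n)
    return 2.0 * half_width

# Quadrupling the number of crashes should roughly halve the interval width
for n in (1_000, 4_000, 16_000):
    print(f"n={n:>6} width={interval_width(n):.4f}")
```

A correctly computed interval shrinks in proportion to one over the square root of the sample size, which is why an interval that does not narrow as data are added points to a flaw in the calculation.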
From page 52...
... Hence, if the objective is to estimate the make or model rollover probability for an old make or model group, there is no reason to estimate the rollover curve. For new make and model groups there is no crash history, or a very limited one; that is, the crash dataset contains a small number of crashes, if any.
From page 53...
... The second point is that the widths of the confidence intervals

FIGURE 3-2 Estimated probability of rollover and 95 percent confidence intervals based on maximum-likelihood estimation of a logit model using data from six states combined (n = 206,822). [Plot of rollover probability per single-vehicle crash (vertical axis, 0.00 to 0.70) against Static Stability Factor (SSF) (horizontal axis, 0.90 to 1.60), showing the predicted curve, the upper and lower 95% confidence limits, and the SSF ranges for passenger cars, SUVs, passenger vans, and pickup trucks.]
From page 54...
... The confidence intervals displayed in Figure 3-2 suggest that, from a statistical perspective, it is possible to discriminate meaningfully among the reported rollover rates for vehicles within a single vehicle class using the logit model. The range of SSF for the four vehicle types used in the analysis is plotted in Figure 3-2 for comparison.
From page 55...
... The average scenario–average state logit model developed to estimate the probability of a single-vehicle rollover crash across all scenarios and states is shown in Figure 3-2. The estimated rollover curves and their 95 percent confidence intervals for the six selected scenarios, averaged across states, are presented in Figures 3-3 through 3-8.
From page 56...
... FIGURE 3-4 Estimated probability of rollover and 95 percent confidence intervals based on maximum-likelihood estimation of a logit model using data from six states for 25th-percentile–risk scenario.
From page 57...
... FIGURE 3-6 Estimated probability of rollover and 95 percent confidence intervals based on maximum-likelihood estimation of a logit model using data from six states for median-risk scenario.
From page 58...
... FIGURE 3-8 Estimated probability of rollover and 95 percent confidence intervals based on maximum-likelihood estimation of a logit model using data from six states for high-risk scenario.
From page 59...
... The estimated rollover curve based on the logit model appears to be a reasonable approximation to the nonparametric-based rollover curve using limited data, suggesting that the logit model is a sensible starting point for constructing a rollover rating system.

ROLLOVER CURVE AND STAR RATING SYSTEM

NHTSA derived its five star rating categories for rollover resistance from the estimated rollover curve shown in Figure 3-1.
From page 60...
...

Accuracy

The approach adopted by NHTSA was to approximate a continuous curve (the estimated rollover curve) by a discrete set of five levels, or star rating categories. This is a coarse approximation that results in a substantial loss of information, particularly at lower SSF values where the rollover curve is relatively steep.
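
The loss of information from discretizing a curve can be sketched as follows. Both the curve coefficients and the probability cut points below are assumptions made for illustration, not values taken from this chapter; the point is only that two vehicles with clearly different estimated probabilities can land in the same category.

```python
import math
from bisect import bisect_right

# Hypothetical logit curve coefficients (illustrative only)
INTERCEPT, SLOPE = 4.0, -4.5

# Hypothetical cut points on rollover probability separating five categories
CUTS = (0.10, 0.20, 0.30, 0.40)

def rollover_probability(ssf):
    """Rollover probability from the assumed logit curve."""
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + SLOPE * ssf)))

def stars(prob):
    """Map a probability to a star category: 5 = lowest risk, 1 = highest."""
    return 5 - bisect_right(CUTS, prob)

for ssf in (1.10, 1.18):
    p = rollover_probability(ssf)
    print(f"SSF={ssf:.2f} probability={p:.3f} stars={stars(p)}")
```

Under these assumed values the two vehicles receive the same rating even though their probabilities differ by several percentage points, which is the kind of detail a five-level scale cannot convey.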
From page 61...
... FIGURE 3-11 Example of using seven SSF categories based on the model in Figure 3-2. [Plot of rollover probability per single-vehicle crash (vertical axis, 0.00 to 0.50) against Static Stability Factor (SSF) (horizontal axis, 0.95 to 1.60), showing the predicted curve, the upper and lower 95% confidence limits, and the SSF ranges for passenger cars, SUVs, passenger vans, and pickup trucks.]
From page 62...
... One important consequence is that the SSF intervals in the lower SSF range (up to approximately 1.25), where rollover probability changes quite rapidly with changing SSF, are too wide to permit discrimination among vehicles, even though analysis using the logit model indicates that such discrimination is statistically meaningful on the basis of real-world crash experience.
From page 63...
...

Recommendations

3-1. Instead of using an exponential model, NHTSA should use a logit model as a starting point for analysis of the relation between rollover risk and SSF.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.