Surface Temperature Reconstructions for the Last 2,000 Years
and one or more measurements are made on a response (e.g., the proxy). Subsequently, in a second controlled experiment under identical conditions, the response is measured for an unknown level of the factor, and the regression relationship is used to infer the value of the factor. This approach is known as inverse regression (Eisenhart 1939) because the roles of the response and factor are reversed from the more direct prediction illustrated in Figure 9-1. Attaching an uncertainty to the result is nontrivial, but conservative approximations are known (Fieller 1954). There remains some debate in the statistical literature concerning the circumstances when inverse or direct methods are better (Osborne 1991).
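The inversion step can be sketched numerically. The calibration values below are made up for illustration (they are not data from the report), and no uncertainty interval is computed; as noted above, attaching one is nontrivial.

```python
import numpy as np

# Hypothetical calibration data: controlled factor levels x and the
# measured responses y (e.g., a proxy read at known factor settings).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 1.9, 4.1, 6.0, 7.9])  # roughly y = 2x

# Ordinary least-squares fit of the response on the factor.
slope, intercept = np.polyfit(x, y, 1)

# Inverse regression: given a new response y0 measured under the same
# conditions, invert the fitted line to estimate the unknown factor.
y0 = 5.0
x_hat = (y0 - intercept) / slope  # estimated factor level, about 2.5
```

This is the "inverse" direction: the regression is fitted with the factor as predictor, then solved backward for the factor, rather than regressing factor on response directly.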
The temperature reconstruction problem does not fit into this framework because neither the temperature nor the proxy values are controlled. A more useful model is to treat the proxy and the target climate variable as a bivariate observation on a complex system. The statistical solution to the reconstruction problem is then to state the conditional distribution of the unobserved member of the pair, temperature, given the value of the observed member, the proxy. This is also termed the random calibration problem by Brown (1982). If the bivariate distribution is Gaussian, the conditional distribution is itself Gaussian, with a mean that is a linear function of the proxy and a constant variance. From a sample of completely observed pairs, the regression methods outlined above give unbiased estimates of the intercept and slope of that linear function. In reality, the bivariate distribution is not expected to be exactly Gaussian, so the linear function is only an approximation; the adequacy of the approximation can, however, be checked against the data using standard regression diagnostics. With multiple proxies, the dimension of the joint distribution increases, but the calculation of the conditional distribution is a direct generalization of the bivariate (single-proxy) case.
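The bivariate Gaussian calculation can be made concrete. The means and covariance below are illustrative placeholders, not real climate values:

```python
import numpy as np

# Hypothetical bivariate-Gaussian model for (temperature, proxy);
# the numbers are illustrative, not real climate data.
mu = np.array([15.0, 100.0])            # means of [temperature, proxy]
cov = np.array([[4.0, 3.0],
                [3.0, 9.0]])            # correlation = 3 / sqrt(4 * 9) = 0.5

def conditional_temp(p, mu, cov):
    """Conditional distribution of temperature given proxy = p:
    Gaussian, with a mean linear in p and a constant variance."""
    slope = cov[0, 1] / cov[1, 1]
    mean = mu[0] + slope * (p - mu[1])
    var = cov[0, 0] - cov[0, 1] ** 2 / cov[1, 1]
    return mean, var

m, v = conditional_temp(103.0, mu, cov)  # m = 16.0, v = 3.0

# With a sample of completely observed pairs, ordinary regression of
# temperature on proxy recovers the same slope (3/9) on average.
rng = np.random.default_rng(0)
pairs = rng.multivariate_normal(mu, cov, size=2000)
temp, proxy = pairs[:, 0], pairs[:, 1]
slope_hat, intercept_hat = np.polyfit(proxy, temp, 1)
```

Note that the conditional variance does not depend on the observed proxy value, which is the "constant variance" property mentioned above.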
Regression with Correlated Data
In most cases, calibrations are based on proxy and temperature data that are sequential in time, and such geophysical series are often autocorrelated. Autocorrelation reduces the effective sample size of the data, which in turn decreases the accuracy of the estimated regression coefficients and causes standard errors to be underestimated over the calibration period. To avoid these problems and form a credible measure of uncertainty, the autocorrelation of the input data must be taken into account.
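The sample-size reduction can be quantified under a simple assumption. For a first-order autoregressive (AR(1)) error process with lag-1 autocorrelation rho, a standard approximation for the effective sample size is n_eff = n(1 − rho)/(1 + rho); a minimal sketch:

```python
import numpy as np

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a series."""
    x = np.asarray(x, dtype=float)
    xd = x - x.mean()
    return np.dot(xd[:-1], xd[1:]) / np.dot(xd, xd)

def effective_sample_size(n, rho):
    """Approximate effective sample size for AR(1) errors:
    n_eff = n * (1 - rho) / (1 + rho)."""
    return n * (1.0 - rho) / (1.0 + rho)

# An AR(1) series with rho = 0.5 carries only about a third as much
# independent information as white noise of the same length.
n_eff = effective_sample_size(300, 0.5)  # -> 100.0
```

So 300 autocorrelated observations with rho = 0.5 constrain the regression roughly as well as 100 independent ones, and standard errors computed with the nominal n are correspondingly too small.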
The statistical strategy for accommodating correlation in the data used in a regression model is two-pronged. The first part is to specify a model for the correlation structure and to use modified regression estimates (generalized least squares) that achieve better precision. The correctness of the specification can be tested using, for example, the Durbin-Watson statistic (Durbin and Watson 1950, 1951, 1971). The second part of the strategy is to recognize that correlation structure is usually too complex to be captured with parsimonious models. This structure may be revealed by a significant Durbin-Watson statistic or some other test, or it may be suspected on other grounds. In this case, the model-based standard errors of estimated coefficients may be replaced by more robust versions, discussed for instance by Hinkley and Wang (1991). For time series data, Andrews (1991) describes estimates of standard errors that are consistent in the presence of autocorrelated errors with changing variances. For time series data, the correlations are usually modeled as stationary; parsimonious
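The Durbin-Watson test mentioned above is straightforward to compute from regression residuals; the residual series below are simulated for illustration:

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic d = sum((e_t - e_{t-1})^2) / sum(e_t^2).
    d near 2 is consistent with uncorrelated errors; d well below 2
    points to positive lag-1 autocorrelation."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(1)

# White-noise residuals: d sits near 2.
d_white = durbin_watson(rng.normal(size=200))

# Slowly varying (random-walk) residuals, as a calibration regression
# with strongly autocorrelated errors might leave behind: d falls
# toward 0.
walk = np.cumsum(rng.normal(size=200))
d_walk = durbin_watson(walk - walk.mean())
```

A significant departure of d from 2 is the kind of evidence that would prompt either a generalized-least-squares fit with a modeled correlation structure or the robust standard errors discussed above.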