models for stationary time series, such as ARMA, were popularized by Box and Jenkins (Box et al. 1994). Note that this approach does *not* require either the temperature or the proxy to be stationary, only the errors in the regression equation.

An indication of the uncertainty of a reconstruction is an important part of any display of the reconstruction itself. Usually this is in the form:

reconstruction ± 2 × standard error,

and the standard error is given by conventional regression calculations.

The prediction mean squared error is the square of the standard error and is the sum of two terms. One is the variance of the errors in the regression equation, which is estimated from calibration data, and may be modified in the light of differences between the calibration errors and the validation errors. This term is the same for all reconstruction dates. The other term is the variance of the estimation error in the regression parameters, and this varies in magnitude depending on the values of the proxies and also the degree of autocorrelation in the errors. This second term is usually small for a date when the proxies are well within the range represented by the calibration data, but may become large when the equation is used to extrapolate to proxy values outside that range.
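The two-term decomposition can be illustrated with a minimal sketch for ordinary least squares. It assumes uncorrelated errors with constant variance, which is a simplification: with autocorrelated errors the second term would need a different covariance for the estimated parameters. The function name `prediction_variance` and its arguments are hypothetical, chosen only for this illustration.

```python
import numpy as np

def prediction_variance(X_cal, y_cal, x_new):
    """Return (prediction, error-variance term, parameter-estimation term).

    Assumes ordinary least squares with uncorrelated, equal-variance errors.
    X_cal: (n, p) proxies in the calibration period
    y_cal: (n,) temperatures in the calibration period
    x_new: (p,) proxy values at the reconstruction date
    """
    n = X_cal.shape[0]
    A = np.column_stack([np.ones(n), X_cal])            # design matrix with intercept
    a0 = np.concatenate([[1.0], np.atleast_1d(x_new)])  # same layout for the new date
    beta, *_ = np.linalg.lstsq(A, y_cal, rcond=None)
    resid = y_cal - A @ beta
    sigma2 = resid @ resid / (n - A.shape[1])           # estimated error variance
    AtA_inv = np.linalg.inv(A.T @ A)
    term1 = sigma2                                      # same for every date
    term2 = sigma2 * (a0 @ AtA_inv @ a0)                # grows as x_new leaves the calibration range
    return a0 @ beta, term1, term2
```

Under these assumptions the standard error at a given date is `np.sqrt(term1 + term2)`. Calling the function with a proxy value near the calibration mean and again with a value far outside the calibration range shows the first term unchanged and the second term much larger in the extrapolated case.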

Reconstructions are often shown in a smoothed form, both because the main features are brought out by smoothing and because the reconstruction of low-frequency features may be more precise than short-term behavior. The two parts of the prediction variance are both affected by smoothing but in different ways. The effect on the first depends on the correlation structure of the errors, which may require some further modeling, but is always a reduction in size. The second term depends on the smoothed values of the proxies and may become either larger or smaller but typically becomes a more important part of the resulting standard error, especially when extrapolating.
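As a rough illustration of how the first term responds to smoothing, the sketch below computes the variance of a weighted average of regression errors. It assumes the errors follow a stationary AR(1) process with lag-one correlation `rho`; that model, and the function name `smoothed_error_variance`, are assumptions made for this example rather than anything prescribed here.

```python
import numpy as np

def smoothed_error_variance(sigma2, rho, weights):
    """Variance of a weighted average of stationary AR(1) errors.

    sigma2:  variance of a single (unsmoothed) error
    rho:     assumed lag-one autocorrelation of the errors
    weights: smoothing weights (for example, a moving average summing to 1)
    """
    w = np.asarray(weights, dtype=float)
    idx = np.arange(w.size)
    lags = np.abs(np.subtract.outer(idx, idx))  # |i - j| for every pair of dates
    R = rho ** lags                             # AR(1) correlation matrix
    return sigma2 * (w @ R @ w)

# Five-point moving average applied to errors with unit variance.
w5 = np.full(5, 1 / 5)
white = smoothed_error_variance(1.0, 0.0, w5)   # uncorrelated errors
red = smoothed_error_variance(1.0, 0.8, w5)     # strongly autocorrelated errors
```

With uncorrelated errors the five-point average cuts the variance to one fifth. With positively autocorrelated errors the variance still falls below that of a single error, but by much less, which is why the correlation structure has to be modeled before the smoothed standard error can be stated.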

The basic idea behind principal component regression is to replace the predictors (i.e., individual proxies) with a smaller number of objectively determined variables that are linear combinations of the original proxies. The new variables are designed to contain as much information as possible from the original proxies. As the number of principal components becomes large, the principal component regression becomes close to the regression on the full set of proxies. However, in practice the number of principal components is usually kept small, to avoid overfitting and the consequent loss of prediction skill. No known statistical theory suggests that limiting the number of principal components used in regression leads to good predictions, although this practice has been found to work well in many applications. Fritts et al. (1971) introduced the idea to dendroclimatology, and it was discussed by Briffa and Cook (1990).
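The procedure described above can be sketched in a few lines. This is a minimal illustration, not the method of any particular study: the proxies are only centered (in practice they are often standardized as well), the components come from a singular value decomposition of the proxy matrix, and the names `pcr_fit` and `pcr_predict` are hypothetical.

```python
import numpy as np

def pcr_fit(P, y, k):
    """Regress y on the first k principal components of the proxy matrix P.

    P: (n, p) proxies over the calibration period
    y: (n,) temperatures over the calibration period
    k: number of principal components retained
    """
    mu = P.mean(axis=0)
    Pc = P - mu                                   # center each proxy
    _, _, Vt = np.linalg.svd(Pc, full_matrices=False)
    V_k = Vt[:k].T                                # (p, k) loadings of the leading components
    scores = Pc @ V_k                             # (n, k) linear combinations of the proxies
    A = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # intercept plus one coefficient per component
    return mu, V_k, coef

def pcr_predict(model, P_new):
    """Apply a fitted model to new proxy values of shape (m, p) or (p,)."""
    mu, V_k, coef = model
    scores = (np.atleast_2d(P_new) - mu) @ V_k
    return coef[0] + scores @ coef[1:]
```

When `k` equals the number of proxies (and the proxy matrix has full column rank), the fitted values coincide with those of the regression on the full set of proxies. Choosing a small `k` trades some calibration fit for fewer estimated coefficients.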