B
Modern Statistical Methods and
Weather Modification Research
Christopher K. Wikle
Department of Statistics, University of Missouri-Columbia
July 24, 2003
INTRODUCTION
As discussed in the report, statistical science is important in the design, analysis,
and verification of weather modification experiments. Given the complexity of the
problem, the necessity to include statisticians in the planning and analysis of such
experiments was recognized early in the history of weather modification. Indeed, many
excellent and well-known statisticians have collaborated on such experiments over the
years. In addition to improvements in deterministic modeling, fundamental science, and
technology, there have been tremendous strides in the statistical sciences over the past
two decades as well. Given the importance of statistics to weather modification
experiments, this is indeed a significant and relevant development.
The aforementioned revolution in statistical methodology and computation has
led to many new perspectives that were not available in past weather modification
research programs. For example, one will never be able to "randomize" effectively all
sources of uncontrollable bias in weather modification experiments. Consequently,
sophisticated statistical models have to be considered to explore potentially significant
effects. That is, one can now compare "treatment" and "control" environments from a
spatio-temporal perspective rather than some potentially inappropriate summary over
space/time/variate. Complicated (realistic) spatio-temporal statistical methodologies were
either not available or could not be implemented in realistic settings until the 1990s. A
simple analogy is that R. A. Fisher was aware of the effects of spatial dependence in
nearby field plots in agricultural experiments, but the computational and modeling
technology did not exist at the time to adequately model such effects. Consequently,
randomization was utilized to mitigate the effects of spatial correlation. However, just as
blocking designs can improve efficiency over randomization, one can get more efficient
estimates by modeling the spatial (and spatio-temporal) effects (e.g., see Cressie, 1993).
This appendix was added by request of the Committee to supplement the statistical discussion in
the main body of the report.
Statistical modeling theory has advanced significantly since the last major
weather modification initiative. In particular, in addition to advancements in spatial and
spatio-temporal approaches, methodologies such as generalized additive models and
generalized linear mixed models have proven to be quite powerful and relevant. For
example, the generalized linear mixed model framework allows for a broad class of data
distributions (i.e., one is not restricted to normality) and considers some function of the
expected mean response to be the sum of a deterministic (i.e., regression) component and
a (correlated) random component, if needed. Thus, in addition to known covariate effects
in the deterministic component, unknown spatial, temporal, or spatio-temporal effects can
be considered explicitly as the random effects in this framework. This is critical, as
discussed above, since weather modification experiments occur over space and time.
Thus, this framework provides a natural way to incorporate the advancements in spatial
statistics within a broader model-based analysis. Estimation for these models is
performed by relatively computer-intensive approximate numerical procedures. For an
overview see McCulloch and Searle (2001).
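The generalized linear mixed model described in this paragraph can be written compactly. The notation below is introduced here only for illustration and does not come from the report: $y_i$ is the response at location/time $i$, $\mu_i = E(y_i \mid \eta_i)$ is its conditional mean, $g$ is a link function, $x_i$ is a vector of known covariates with coefficients $\beta$, and $\eta_i$ is the random effect. The Gaussian form for the random effects is a common choice, assumed here rather than required.

```latex
\begin{align*}
y_i \mid \mu_i &\sim f(y_i;\, \mu_i)
  && \text{data distribution, not necessarily normal} \\
g(\mu_i) &= x_i^{\top}\beta + \eta_i
  && \text{regression component plus random component} \\
(\eta_1, \ldots, \eta_n)^{\top} &\sim \mathrm{N}\bigl(0,\, \Sigma(\phi)\bigr)
  && \text{dependence in space and/or time enters through } \Sigma(\phi)
\end{align*}
```

For example, a Bernoulli $f$ with a logit link would give a model for the occurrence of rain, with $\Sigma(\phi)$ carrying the spatial or spatio-temporal correlation.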
Perhaps an even more "revolutionary" development in statistics was the
realization that Markov Chain Monte Carlo (MCMC) methods could be used to
implement Bayesian statistical models. Inspired by the use of such methods in image
analysis by Geman and Geman (1984), Gelfand and Smith (1990) realized that MCMC
can be used as a general approach in which to implement Bayesian statistical models.
This led to a dramatic increase in the types and complexity of problems that can be
modeled in this context. For an overview of the approach see Robert and Casella (1999).
This development is critical to the science of weather modification for two
reasons. First, the Bayesian paradigm provides a natural statistical framework in which
to explicitly account for ALL sources of uncertainty, be they data, model, or parameter
uncertainties (e.g., Berliner, 1996). Second, such models can be used to incorporate very
complicated spatial and temporal dependence in the generalized linear mixed model
framework discussed above with relative ease (e.g., Diggle et al., 1998). Furthermore,
one can include complicated physical insight (i.e., model physics) directly into this
framework (Wikle et al., 2001). This methodology is outlined in greater detail in the
following section.
HIERARCHICAL BAYESIAN MODELS
The use of Bayesian ideas in weather modification is not new (e.g., see Olsen,
1975), yet such ideas have not entered the mainstream of weather modification research.
This is unfortunate, as the Bayesian paradigm is ideal for combining different sources of
information (e.g., physics and data) and accounting for uncertainty. Common
meteorological procedures, such as those found in data assimilation, have long been recognized
as inherently Bayesian in nature (e.g., Lorenc and Hammon, 1988). In addition, it has
recently been recognized that one of the fundamental approaches to characterizing
uncertainty in climate change assessment is Bayesian (e.g., Berliner et al., 2000; Leroy,
1998). However, traditionally it has been difficult to model the full data, process, and
parameter distributions in general from the Bayesian perspective. Recently, it has been
shown that hierarchical approaches to such models provide an ideal framework in which
to account for all such uncertainties in geophysical processes (e.g., Royle et al., 1999;
Wikle et al., 2001).
The hierarchical Bayesian statistical paradigm is based in probability theory (e.g.,
Berger, 1985; Bernardo and Smith, 1994). Assume we are interested in some process
$Y$ and we have observational data for this process, denoted by $z$. Furthermore, there are
parameters associated with our physical-statistical representation of the $Y$ process, as
well as the statistical model for the observations. The collection of these parameters is
denoted by $\theta$. A Bayesian hierarchical analysis develops a joint probability model for all
these variables as the product of a sequence of distributions; formally,
$$[z, Y, \theta] = [z \mid Y, \theta]\,[Y \mid \theta]\,[\theta], \tag{1}$$
where the brackets $[\;]$ denote probability distribution and vertical bars $\mid$ identify
conditional dependencies for a given process upon other processes and/or parameters. For
example, $[z \mid Y, \theta]$ denotes the distribution of the data $z$ conditional on the process $Y$ and
parameters $\theta$. The process distribution is then given by $[Y \mid \theta]$ and the parameter
distribution by $[\theta]$. Learning about the unknown quantities of interest (e.g., $Y$ and $\theta$)
relies on the probability relationship (Bayes's Theorem):
$$[Y, \theta \mid z] \propto [z \mid Y, \theta]\,[Y \mid \theta]\,[\theta], \tag{2}$$
where the constant of proportionality arises by integrating the right-hand side of (2) with
respect to $Y$ and $\theta$.
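As a concrete illustration of how the posterior in (2) can be explored by MCMC, the following is a minimal sketch of a Gibbs sampler for a toy three-stage model. Everything specific here is an assumption made for illustration and is not taken from the report: the process $Y$ and parameter $\theta$ are scalars, all three stages are Gaussian with known variances (`sigma2`, `tau2`, `s02`), and the data values are invented. The model is simple enough that the posterior mean of $Y$ is available in closed form, which gives a check on the sampler.

```python
import numpy as np


def gibbs_sampler(z, sigma2, tau2, m0, s02, n_iter=5000, burn_in=500, seed=0):
    """Draw from [Y, theta | z] for the toy hierarchy
         data:      z_i | Y   ~ N(Y, sigma2)
         process:   Y | theta ~ N(theta, tau2)
         parameter: theta     ~ N(m0, s02)
    by alternating between the two full conditional distributions."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=float)
    n = z.size
    Y, theta = z.mean(), m0          # starting values
    draws = np.empty((n_iter, 2))
    for k in range(n_iter):
        # [Y | theta, z] is proportional to [z | Y][Y | theta]
        prec_Y = n / sigma2 + 1.0 / tau2
        mean_Y = (z.sum() / sigma2 + theta / tau2) / prec_Y
        Y = rng.normal(mean_Y, np.sqrt(1.0 / prec_Y))
        # [theta | Y] is proportional to [Y | theta][theta]
        prec_t = 1.0 / tau2 + 1.0 / s02
        mean_t = (Y / tau2 + m0 / s02) / prec_t
        theta = rng.normal(mean_t, np.sqrt(1.0 / prec_t))
        draws[k] = Y, theta
    return draws[burn_in:]


def exact_posterior_mean_Y(z, sigma2, tau2, m0, s02):
    """Closed-form E[Y | z]; integrating out theta gives Y ~ N(m0, tau2 + s02)."""
    z = np.asarray(z, dtype=float)
    prior_var = tau2 + s02
    prec = z.size / sigma2 + 1.0 / prior_var
    return (z.sum() / sigma2 + m0 / prior_var) / prec


# Invented observations, for illustration only.
z_obs = np.array([2.1, 1.7, 2.6, 2.0, 2.4])
samples = gibbs_sampler(z_obs, sigma2=0.5, tau2=1.0, m0=0.0, s02=4.0)
post_mean_Y, post_mean_theta = samples.mean(axis=0)
exact_mean_Y = exact_posterior_mean_Y(z_obs, 0.5, 1.0, 0.0, 4.0)
```

In realistic geophysical applications the full conditionals are rarely this simple, and the normalizing constant in (2) is not available, which is the situation in which sampling-based methods become essential.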
We can make use of physical relationships to aid in the specification of the
"prior distributions" $[Y \mid \theta]$ and $[\theta]$. Our ultimate interest is with the left-hand side
(LHS) of (2), the so-called "posterior distribution." This distribution of the process and
parameters given the data updates the prior formulations in light of the observed data.
For instance, as shown by Royle et al. (1999), if the process consists of winds $u$, $v$, and
pressure $P$, we can exploit the geostrophic relationship, which would allow us to write a
stochastic model for the wind field given the pressure field, $[u, v \mid P, \theta]$. Note that this is
a stochastic relationship (i.e., a distribution), which quantifies a source of variability with
respect to deviations from the gradient relationship (e.g., $u \propto \partial P / \partial y$, $v \propto \partial P / \partial x$). We
can model additional uncertainty by specifying distributions for the parameters $\theta$ as
well. For example, the geostrophic model suggests a parameter (to be included as an
element of the vector $\theta$) that is proportional to the inverse of the product of the density and
the Coriolis term. One might specify this as the prior expected value. A variance about
this expected value is then prescribed to generate a distribution for this parameter. The
net result is that with relatively simple physical and stochastic representations in the
sequence of conditional models (e.g., RHS of (2)), we can obtain a posterior distribution
for $u$ and $v$ that has very complicated spatial structure; one that, through the
quantification of uncertainty, can "adapt" to a wide variety of observations and our prior
knowledge of the geophysical system.
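The idea of a physically motivated stochastic model for the wind given the pressure can be sketched in a few lines. The code below uses the standard geostrophic balance, $u_g = -(\rho f)^{-1}\,\partial P/\partial y$ and $v_g = (\rho f)^{-1}\,\partial P/\partial x$, as the conditional mean and adds independent Gaussian deviations. The independent-error form, the grid, the pressure field, and the values of `rho`, `f`, and `sd` are all assumptions made for illustration; this is not the model of Royle et al. (1999), which has considerably richer structure.

```python
import numpy as np


def geostrophic_wind(P, dx, dy, rho=1.2, f=1.0e-4):
    """Geostrophic wind (m/s) from a pressure field P[y, x] in Pa on a regular
    grid with spacings dx, dy in metres. 1 / (rho * f) plays the role of the
    parameter 'proportional to the inverse of density times Coriolis term'."""
    dP_dy, dP_dx = np.gradient(P, dy, dx)   # derivatives along axis 0 (y), axis 1 (x)
    u_g = -dP_dy / (rho * f)
    v_g = dP_dx / (rho * f)
    return u_g, v_g


def sample_wind_given_pressure(P, dx, dy, sd=1.0, rho=1.2, f=1.0e-4, seed=0):
    """One draw from an illustrative [u, v | P, theta]: geostrophic mean plus
    independent N(0, sd^2) deviations at every grid point (an assumption)."""
    rng = np.random.default_rng(seed)
    u_g, v_g = geostrophic_wind(P, dx, dy, rho, f)
    u = u_g + rng.normal(0.0, sd, size=P.shape)
    v = v_g + rng.normal(0.0, sd, size=P.shape)
    return u, v


# Hypothetical pressure field: decreases northward by 1 Pa per km.
ny, nx = 10, 12
dx = dy = 50_000.0                                   # 50 km grid spacing
y = np.arange(ny) * dy
P = 101_000.0 - 1.0e-3 * y[:, None] * np.ones((1, nx))

u_g, v_g = geostrophic_wind(P, dx, dy)
u, v = sample_wind_given_pressure(P, dx, dy, sd=1.5)
```

Placing a prior distribution on `1 / (rho * f)`, rather than fixing it, would correspond to the parameter-stage uncertainty described in the text.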
Each stage of the hierarchical model (i.e., data, process, and parameter stages) can
be further partitioned into subcomponents. This is critical in that it allows for inclusion of
many complications that are extremely difficult to account for in traditional statistical
implementations. Each stage is further discussed below.
Data Models
Datasets commonly considered for atmospheric processes are complicated and
usually exhibit substantial spatial, temporal, or spatio-temporal dependence. The major
advantage of modeling the conditional distribution of the data given the true process is
that substantial simplifications in model form are possible. For example, let $z_a$ be data
observed for some process $Y$, and let $\theta_a$ be parameters. The data model is written
$[z_a \mid Y, \theta_a]$. Usually, this conditional distribution is much simpler than the unconditional
distribution of $[z_a]$, since most of the complicated structure comes from the process $Y$.
Often, this model simply represents measurement error. Note that in this general
framework the measurement error need not be additive. Furthermore, and perhaps more
importantly, this framework can also accommodate data that are at a different resolution in
space and/or time than the process.
This framework also provides a natural way to combine datasets. For example,
assume that $z_a$ and $z_c$ represent data from two different sources (e.g., rain gauge and
radar measurements of precipitation). Again, let $Y$ be the process of interest (e.g., the
true precipitation process) and $\theta_a$, $\theta_c$ be parameters. In this case, the data model is often
written
$$[z_a, z_c \mid Y, \theta_a, \theta_c] = [z_a \mid Y, \theta_a]\,[z_c \mid Y, \theta_c]. \tag{3}$$
Thus, conditioned on the true process, the data are assumed to be independent. Of course,
this does not suggest that the two datasets are unconditionally independent. Rather, the
majority of the dependence among the datasets is due to the process, $Y$. This assumption
of independence is exactly that, an assumption. Although often very reasonable, it must
be assessed critically for each problem.
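The practical effect of the factorization in (3) can be seen in a minimal sketch. Assume, purely for illustration, that the true process $Y$ is a single scalar with a Gaussian prior, and that both sources measure it with additive, unbiased Gaussian error of known variance (the text notes that additivity is not required in general). Under these assumptions the two likelihood terms simply multiply, and the posterior is a precision-weighted combination of the sources. All names and numbers below are hypothetical.

```python
import numpy as np


def fuse_two_sources(z_a, z_c, var_a, var_c, prior_mean, prior_var):
    """Posterior mean and variance of a scalar Y given two conditionally
    independent datasets:
        z_a[i] | Y ~ N(Y, var_a)   (e.g., gauges)
        z_c[j] | Y ~ N(Y, var_c)   (e.g., radar)
        Y          ~ N(prior_mean, prior_var)
    Because [z_a, z_c | Y] = [z_a | Y][z_c | Y], the precisions add."""
    z_a = np.atleast_1d(np.asarray(z_a, dtype=float))
    z_c = np.atleast_1d(np.asarray(z_c, dtype=float))
    post_prec = 1.0 / prior_var + z_a.size / var_a + z_c.size / var_c
    post_mean = (prior_mean / prior_var
                 + z_a.sum() / var_a
                 + z_c.sum() / var_c) / post_prec
    return post_mean, 1.0 / post_prec


# Invented values: two precise gauge readings and one noisier radar estimate.
post_mean, post_var = fuse_two_sources(
    z_a=[4.8, 5.3], z_c=[6.0],
    var_a=0.25, var_c=1.0,
    prior_mean=5.0, prior_var=100.0,
)
```

The more precise source dominates the result, and the posterior variance is smaller than what either source could deliver alone.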
The conditional partitioning of the datasets in (3) is often similarly applied to
multivariate models. That is, say our processes of interest are denoted $Y_a$ and $Y_c$ with
associated observations $z_a$ and $z_c$. One might write
$$[z_a, z_c \mid Y_a, Y_c, \theta_a, \theta_c] = [z_a \mid Y_a, \theta_a]\,[z_c \mid Y_c, \theta_c]. \tag{4}$$
Again, this represents the assumption that given the true processes of interest, the datasets
are independent. Such an assumption must be evaluated and is not required in
hierarchical analysis, but it is often very reasonable and can lead to dramatic
simplifications in the computations.
Process Models
Developing the process distribution is usually the most critical
step in constructing the hierarchical model. This distribution is often further factored
hierarchically into a series of submodels. For example, assume the process of interest is
composed of two subprocesses, $Y_a$ and $Y_c$. Perhaps $Y_a$ represents precipitation for a
geographical region and $Y_c$ might represent the state of the atmospheric circulation over
the same region. Furthermore, define parameters $\theta_Y = \{\theta_{Y_a}, \theta_{Y_c}\}$ that describe these two
processes. One might consider the decomposition
$$[Y_a, Y_c \mid \theta_Y] = [Y_a \mid Y_c, \theta_Y]\,[Y_c \mid \theta_Y]. \tag{5}$$
This is just a fact of probability theory and can always be written. However, it may be the
case that one can assume the parameters are conditionally independent, in which case the
right-hand side of (5) can be written as $[Y_a \mid Y_c, \theta_{Y_a}]\,[Y_c \mid \theta_{Y_c}]$. The challenge is the
specification of these component distributions. Indeed, most of the effort in the
development of hierarchical models is related to constructing these distributions. It is
often the case, however, that there is very good scientific insight that can suggest
appropriate conditioning order and possible models for the component distributions. For
example, it is probably more reasonable to condition precipitation on the atmospheric
circulation state variables, rather than the alternative. Similarly, $Y_a$ might represent the
process of interest at time $t$ and $Y_c$ the same process at the previous time, $t - 1$. Natural
deterministic models for process evolution could suggest the form of such models.
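When the conditioning is on the previous time step, applying the decomposition in (5) repeatedly yields a dynamic process model. As a sketch, assume (this is an added assumption, not stated in the text) that the process is first-order Markov, so that the process at time $t$ depends on the past only through its value at time $t - 1$, and write $Y_t$ for the process at times $t = 0, \ldots, T$:

```latex
\begin{equation*}
[Y_0, Y_1, \ldots, Y_T \mid \theta_Y]
  = [Y_0 \mid \theta_Y] \prod_{t=1}^{T} [Y_t \mid Y_{t-1}, \theta_Y].
\end{equation*}
```

A simple linear instance would be $Y_t = M Y_{t-1} + \eta_t$ with Gaussian $\eta_t$, where the hypothetical propagator matrix $M$ could be suggested by a discretized deterministic evolution equation and would itself be included among the parameters $\theta_Y$.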
Parameter Models
The parameter distributions may require significant modeling effort. As is the
case with the data and process models, the joint distribution of parameters is often
partitioned into a product of marginal distributions. For example, consider the data model
(4) and process model (5). One must specify the parameter distribution $[\theta_a, \theta_c, \theta_{Y_a}, \theta_{Y_c}]$.
Often, one can make reasonable independence assumptions regarding this distribution,
e.g., $[\theta_a, \theta_c, \theta_{Y_a}, \theta_{Y_c}] = [\theta_a]\,[\theta_c]\,[\theta_{Y_a}]\,[\theta_{Y_c}]$. Of course, this assumption must be justified.
There are usually appropriate submodels for parameters as well, leading to other levels of
the model hierarchy. In many cases, for complicated processes, there is substantial
scientific insight that can go into developing the parameter models (e.g., Wikle et al.,
2001). In other cases, one does not know much about the parameter distribution,
suggesting that "vague priors" or data-based estimates be used. That is, it is often useful to
think empirically at first and perform exploratory data analysis in order to develop
understanding about the process. The emphasis in this case is on model building.
The development of parameter distributions has often been the focus of
objections due to its implied subjectiveness. Of course, the formulation of the data and
process models is quite subjective as well, but those choices have not generated as much
concern, probably because such subjectiveness is just as much a part of classical model
building as it is of the Bayesian approach. One must recognize that a strength of the
hierarchical (Bayesian) approach is the quantification of such subjective judgment.
Hierarchical models provide a coherent probabilistic framework in which to incorporate
explicitly in the model the uncertainty related to judgment, scientific reasoning,
subjective decisions, and experience.
EXPERIMENTAL DESIGN
As indicated in the report, the proper statistical design of weather modification
experiments is paramount. Advances in statistical modeling, some of which were outlined
above, should be considered in this aspect of the problem as well. For example, there has
been a significant amount of work considering the design of efficient monitoring
networks in cases where the underlying process of interest is spatial. A nice recent review
of such work can be found in Muller (2000). In addition, in the context of spatio-temporal
processes, work has been done to consider how one might gain efficiency by allowing
monitoring networks to be dynamic in time (e.g., Wikle and Royle, 1999). Finally, there
has been recent work related to utilizing the advantages of the Bayesian paradigm in the
context of experimental design (e.g., Besag and Higdon, 1999). Weather modification
research could benefit from these advances. For example, experimental data from past
weather modification experiments could be used to develop understanding of spatio-
temporal dependencies in the atmospheric variables and constituents of interest. This
understanding (prior knowledge) could then be expressed formally in terms of a statistical
model. At that point, one could utilize a decision-theoretic framework to optimize
specific objectives. For example, one might be interested in determining the optimal
location for rain gauges in order to maximize the ability to detect a significant difference
in seeded precipitation over a given spatial region. It may be, in this example, that such a
network would be optimized by allowing some monitors to be fixed and others to vary
location at different times, depending on the underlying dynamical environment. The
underlying framework presented here would suggest the optimal locations for such
monitors. In each phase of this analysis, modern model-based statistical methods could be
used. Although such a model-based design perspective is advantageous, one could still
use the model building and data analysis approach suggested here to analyze results from
past experiments or from new experiments that were not designed from this perspective.
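A model-based design calculation of the kind described above can be sketched as follows. The setting is deliberately simplified and every specific is an assumption made for illustration: sites lie on a one-dimensional transect, the process is Gaussian with an exponential covariance function whose parameters are treated as known (in practice they might be estimated from past experiments), and the design criterion is the average posterior variance over the region, which stands in for whatever decision-theoretic objective is actually of interest. The greedy search is a convenient heuristic and is not guaranteed to find the optimal design.

```python
import numpy as np


def exp_cov(a, b, sill=1.0, range_=10.0):
    """Exponential covariance between 1-D locations a and b (assumed model)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return sill * np.exp(-np.abs(a[:, None] - b[None, :]) / range_)


def avg_posterior_variance(sites, grid, sill=1.0, range_=10.0, noise_var=0.1):
    """Average posterior variance of the process over `grid` when noisy
    observations are taken at `sites`."""
    sites = np.asarray(sites, dtype=float)
    if sites.size == 0:
        return sill                                   # prior variance everywhere
    C_ss = exp_cov(sites, sites, sill, range_) + noise_var * np.eye(sites.size)
    C_gs = exp_cov(grid, sites, sill, range_)
    weights = np.linalg.solve(C_ss, C_gs.T)           # shape (n_sites, n_grid)
    reduction = np.sum(C_gs.T * weights, axis=0)      # c' C_ss^{-1} c per grid point
    return float(np.mean(sill - reduction))


def greedy_design(candidates, grid, k):
    """Add one gauge at a time, each time choosing the candidate site that
    gives the lowest average posterior variance."""
    chosen, remaining = [], list(candidates)
    for _ in range(k):
        scores = [avg_posterior_variance(chosen + [c], grid) for c in remaining]
        best = remaining[int(np.argmin(scores))]
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical 100 km transect with candidate gauge sites every 10 km.
grid = np.linspace(0.0, 100.0, 101)
candidates = np.arange(0.0, 101.0, 10.0)
design = greedy_design(candidates, grid, k=3)
var_one_gauge = avg_posterior_variance(design[:1], grid)
var_three_gauges = avg_posterior_variance(design, grid)
```

Allowing the candidate set or the covariance to change with time would be one way to explore the dynamic networks mentioned in the text.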
CONCLUSION
In addition to new technological advances in the atmospheric sciences,
substantial advances also have occurred in the statistical sciences over the past three
decades. These developments, which have not yet been applied to weather
modification, could greatly improve the design, analysis, and verification of
experiments. With the appropriate combination of statistical, computational, and
scientific advances, many of the uncertainties in establishing the validity of weather
modification research and operational results could be diminished.
REFERENCES
Berger, J. O. 1985. Statistical Decision Theory and Bayesian Analysis. New York:
Springer-Verlag.
Berliner, L. M. 1996. Hierarchical Bayesian time series models. In Maximum Entropy
and Bayesian Methods, K. Hanson and R. Silver (Eds.), Kluwer Academic
Publishers, 15-22.
Berliner, L. M., R. A. Levine, and D. J. Shea. 2000. Bayesian climate change assessment.
J. Climate 13:3805-3820.
Bernardo, J. M., and A. F. M. Smith. 1994. Bayesian Theory. New York: Wiley.
Besag, J., and D. Higdon. 1999. Bayesian analysis of agricultural field experiments (with
discussion). J. R. Stat. Soc. B 61:691-746.
Cressie, N. A. C. 1993. Statistics for Spatial Data, Revised Edition. New York: Wiley.
Diggle, P. J., J. A. Tawn, and R. A. Moyeed. 1998. Model-based geostatistics (with
discussion). Appl. Stat. 47:299-350.
Gelfand, A. E., and A. F. M. Smith. 1990. Sampling-based approaches to calculating
marginal densities. J. Am. Stat. Assoc. 85:398-409.
Geman, S., and D. Geman. 1984. Stochastic relaxation, Gibbs distributions and the
Bayesian restoration of images. IEEE Trans. Pattern Anal. 6:721-741.
Leroy, S. S. 1998. Detecting climate signals: Some Bayesian aspects. J. Climate 11:640-
651.
Lorenc, A., and O. Hammon. 1988. Objective quality control of observations using
Bayesian methods. Theory and a practical implementation. Q. J. Roy. Meteorol.
Soc. 114:515-543.
McCulloch, C. E., and S. R. Searle. 2001. Generalized, Linear, and Mixed Models. New
York: Wiley.
Muller, W. G. 2000. Collecting Spatial Data, 2nd Ed. Physica Verlag.
Olsen, A. R. 1975. Bayesian and classical statistical methods applied to randomized
weather modification experiments. J. Appl. Meteorol. 14:970-973.
Robert, C. P., and G. Casella. 1999. Monte Carlo Statistical Methods. New York:
Springer.
Royle, J. A., L. M. Berliner, C. K. Wikle, and R. Milliff. 1999. A hierarchical spatial
model for constructing wind fields from scatterometer data in the Labrador Sea.
Case Studies in Bayesian Statistics, eds. C. Gatsonis et al., pp. 376-382. Springer-
Verlag.
Wikle, C. K., and J. A. Royle. 1999. Space-time models and dynamic design of
environmental monitoring networks. J. Agri. Biol. Environ. Stat. 4:489-507.
Wikle, C. K., R. F. Milliff, D. Nychka, and L. M. Berliner. 2001. Spatiotemporal
hierarchical Bayesian modeling: Tropical ocean surface winds. J. Am. Stat.
Assoc. 96:382-397.