an event, such as the appearance of a disease or death. Many statistical models have been developed to test the significance of differences among means of these types of data. Detailed discussions of the models can be found in books on statistics (Cohen, 1988; Fleiss, 1981; Snedecor & Cochran, 1989), in manuals for various computer programs used for statistical analyses (Kirkpatrick & Feeney, 2000; SAS, 2000), and on web sites that present elementary courses in statistics (e.g., www.ruf.rice.edu/~lane/rvls.html).

DEFINING THE HYPOTHESIS TO BE TESTED

Although experimental designs can be complicated, an investigator’s hypothesis can usually be reduced to one or a few important questions. It is then possible to compute a sample size that has a specified probability of detecting, with statistical significance, the effect (or difference) that the investigator has postulated. Simple methods are presented below for computing the sample size for each of the three types of variables listed above. Note that the smaller the difference the investigator wishes to detect, or the larger the population variability, the larger the sample size must be to detect a significant difference.

EFFECT SIZE, STANDARD DEVIATION, POWER, AND SIGNIFICANCE LEVEL

In general, several factors must be known or estimated to calculate sample size: the effect size (usually the difference between two groups), the population standard deviation (for continuous data), the desired power of the experiment to detect the postulated effect, and the significance level. The first two are unique to the particular experiment; the last two are generally fixed by convention. The magnitude of the effect that the investigator wishes to detect must be stated quantitatively, and an estimate of the population standard deviation of the variable of interest must be available from a pilot study, from data obtained in a previous experiment in the investigator’s laboratory, or from the scientific literature. Power is the probability of detecting a difference between treatment groups when one exists and is defined as 1 − β, where β is the probability of committing a Type II error (concluding that no difference between treatment groups exists when, in fact, there is a difference). The significance level, denoted α, is the probability of committing a Type I error (concluding that a difference between treatment groups exists when, in fact, there is no difference). Once values for power and significance level are chosen and the statistical model (such as chi-squared, t-test, analysis of variance, or linear regression) is selected, sample size can be computed from the size of the effect that the investigator wishes to detect and the estimate of the population standard deviation of the variable to be studied.
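As a rough illustration of how these four quantities combine, the sketch below uses a common normal-approximation formula for comparing the means of two equal-sized groups with a two-sided test: n per group = 2 × [(z for 1 − α/2) + (z for power)]² × (SD / effect)². This formula is an assumption made for illustration and is not necessarily the method used elsewhere in this text. The function name, the parameter names, and the default values of 0.80 for power and 0.05 for α are likewise illustrative choices. Because the approximation ignores the small-sample correction of the t distribution, it tends to give slightly smaller numbers than exact t-based calculations.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect, sd, power=0.80, alpha=0.05):
    """Approximate sample size per group for a two-sided comparison of two means.

    Illustrative normal-approximation sketch (an assumption, not the
    source's method): n = 2 * ((z_alpha + z_beta) * sd / effect) ** 2
    """
    if effect <= 0 or sd <= 0:
        raise ValueError("effect and sd must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value for alpha
    z_beta = std_normal.inv_cdf(power)           # quantile for power = 1 - beta

    n = 2 * ((z_alpha + z_beta) * sd / effect) ** 2
    return math.ceil(n)  # round up to a whole number of subjects


if __name__ == "__main__":
    # Effect equal to one standard deviation
    print(sample_size_per_group(effect=10, sd=10))
    # Halving the effect to be detected
    print(sample_size_per_group(effect=5, sd=10))
    # Doubling the standard deviation instead
    print(sample_size_per_group(effect=10, sd=20))
```

Under this approximation, only the ratio of the standard deviation to the effect matters, and the required sample size grows with the square of that ratio. Halving the effect to be detected or doubling the standard deviation therefore roughly quadruples the number of subjects needed in each group.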

It should be noted that in the following discussion of sample-size calculations, the aim is to simplify the question being addressed so that power calcula-


