A good example of the fallibility of methods occurred in astronomy in the early part of the twentieth century. One of the most ardent debates in astronomy at that time concerned the nature of what were then known as spiral nebulae—diffuse pinwheels of light that powerful telescopes revealed to be quite common in the night sky. Some astronomers thought that these nebulae were spiral galaxies like the Milky Way at such great distances from the earth that individual stars could not be distinguished. Others believed that they were clouds of gas within our own galaxy.

One astronomer who thought that spiral nebulae were within the Milky Way, Adriaan van Maanen of the Mount Wilson Observatory, sought to resolve the issue by comparing photographs of the nebulae taken several years apart. After making a series of painstaking measurements, van Maanen announced that he had found roughly consistent unwinding motions in the nebulae. The detection of such motions indicated that the spirals had to be within the Milky Way, since motions would be impossible to detect in distant objects.

Van Maanen's reputation caused many astronomers to accept a galactic location for the nebulae. A few years later, however, van Maanen's colleague Edwin Hubble, using the new 100-inch telescope at Mount Wilson, conclusively demonstrated that the nebulae were in fact distant galaxies; van Maanen's observations had to be wrong. Studies of van Maanen's procedures have not revealed any intentional misrepresentation or sources of systematic error. Rather, he was working at the limits of observational accuracy, and his expectations influenced his measurements.


Deborah, a third-year graduate student, and Kathleen, a postdoc, have made a series of measurements on a new experimental semiconductor material using an expensive neutron source at a national laboratory. When they return to their own laboratory and examine the data, they obtain the following data points. A newly proposed theory predicts results indicated by the curve.

During the measurements at the national laboratory, Deborah and Kathleen observed that there were power fluctuations they could not control or predict. Furthermore, they discussed their work with another group doing similar experiments, and they knew that the other group had gotten results confirming the theoretical prediction and was writing a manuscript describing their results.

In writing up their own results for publication, Kathleen suggests dropping the two anomalous data points near the abscissa (the solid squares) from the published graph and from the statistical analysis. She proposes that the paper mention the existence of these data points, note that they may be due to power fluctuations, and state that they lie outside the expected standard deviation calculated from the remaining data points. "These two runs," she argues to Deborah, "were obviously wrong."

  1. How should the data from the two suspected runs be handled?

  2. Should the data be included in tests of statistical significance, and why?

  3. What other sources of information, in addition to their faculty advisor, can Deborah and Kathleen use to help decide?

Copyright © National Academy of Sciences. All rights reserved.