APPENDIX B
Abstracts of Background Papers
Although the panel was not charged with developing or executing technical analyses related to operational testing and evaluation, we found that exploring certain technical issues in depth contributed to our deliberations. We present here abstracts of three studies, which are published separately (Cohen, Rolph, and Steffey, 1998).
STRATEGIC INFORMATION GENERATION AND TRANSMISSION: THE EVOLUTION OF INSTITUTIONS IN DoD OPERATIONAL TESTING
Eric M. Gaier, Logistics Management Institute, and Robert C. Marshall, Pennsylvania State University
This paper presents a model that extends the information transmission literature to consider the question of strategic information generation. We analyze a principal-agent game in which the agent strategically chooses the probability with which s/he can distinguish a given state from its complement. We call this stage test design. After observing an information partition, the agent reports to the principal. We analyze several ways in which the principal might choose to extend oversight authority over the process. As the main result of the paper, we show that oversight of the test design stage always improves the welfare of the principal, whereas oversight of the reporting stage may not. The model is used to examine the historical evolution of operational testing in the Department of Defense.
ON THE PERFORMANCE OF WEIBULL LIFE TESTS BASED ON EXPONENTIAL LIFE TESTING DESIGNS
Francisco J. Samaniego and Yun Sam Chong, University of California, Davis
It is common to plan a life test based on the assumption of exponentiality of observed lifetimes or lives between failures. Analysts are then able to calculate specifically how many items should be placed on test (or the number of observed failures it takes to terminate the test) and the maximum total time on test required to resolve the hypothesis test of interest. Once the test data are in hand, one has the opportunity to confirm the exponentiality assumption or to decide that an alternative modeling assumption is preferable. This paper pursues the question: "What if the data point toward a non-exponential Weibull model?" We identify circumstances in which the available data permit testing the original hypotheses with better performance characteristics (that is, smaller error probabilities) than the test originally planned; a complementary analysis of situations leading to poorer performance is also given. We give indications of the potential savings in the number of systems and the time on test that would accrue from having modeled the experiment correctly in the first place.
Various approaches to testing hypotheses concerning Weibull means are discussed. The first two sections of the paper are expository and review the main issues in exponential life testing and some properties and procedures associated with the Weibull distribution. In Sections 3 and 4, we develop the mechanics of Weibull life testing and carefully examine the performance of Weibull life tests based on exponential life test plans.
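As an illustration of the kind of exponential test plan described above (a sketch under stated assumptions, not the authors' procedure), the following Python code assumes a failure-terminated test in which r items are all run to failure without replacement, so that the total time on test divided by the mean life follows a gamma distribution with shape r. The hypotheses (mean life theta0 versus a smaller mean theta1), the function names, and the numerical values are all hypothetical. The last function uses simulation to estimate how often the same plan rejects when lifetimes are in fact Weibull with the same mean.

```python
import math
import random


def erlang_cdf(x, r):
    """P(G <= x) where G ~ Gamma(shape=r, scale=1), r a positive integer.

    Adequate for moderate r and x; not written for extreme arguments.
    """
    if x <= 0:
        return 0.0
    term, total = 1.0, 1.0
    for k in range(1, r):
        term *= x / k
        total += term
    return 1.0 - math.exp(-x) * total


def erlang_quantile(p, r):
    """Invert erlang_cdf by bisection."""
    lo, hi = 0.0, float(r)
    while erlang_cdf(hi, r) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, r) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def plan_exponential_test(theta0, theta1, alpha, beta, max_r=200):
    """Smallest r, and the cutoff on total time on test, such that
    rejecting H0 (mean = theta0) when the total falls below the cutoff
    has type I error alpha and type II error at most beta at theta1.
    Assumes theta1 < theta0 and exponential lifetimes.
    """
    for r in range(1, max_r + 1):
        cutoff = theta0 * erlang_quantile(alpha, r)
        power = erlang_cdf(cutoff / theta1, r)
        if power >= 1.0 - beta:
            return r, cutoff
    raise ValueError("no plan found with r <= max_r")


def simulated_rejection_rate(r, cutoff, mean, shape, trials=20000, seed=0):
    """Monte Carlo estimate of P(total time on test < cutoff) when the
    r lifetimes are Weibull with the given mean and shape.
    shape = 1 reproduces the exponential case.
    """
    rng = random.Random(seed)
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    rejections = 0
    for _ in range(trials):
        total = sum(rng.weibullvariate(scale, shape) for _ in range(r))
        if total < cutoff:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    r, cutoff = plan_exponential_test(1000.0, 500.0, 0.10, 0.10)
    print("items on test:", r, "cutoff on total time:", round(cutoff, 1))
    for shape in (1.0, 2.0):
        rate = simulated_rejection_rate(r, cutoff, 1000.0, shape)
        print("shape", shape, "estimated type I error", rate)
```

Under these assumptions, comparing the estimated rejection rates for shape 1 and a shape greater than 1 gives a rough sense of how the error probabilities of an exponential plan can shift when the data are actually Weibull. The simulation uses the exponential cutoff unchanged, whereas the paper considers tests constructed for the Weibull model.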
APPLICATION OF STATISTICAL SCIENCE TO TESTING AND EVALUATING SOFTWARE INTENSIVE SYSTEMS
Jesse H. Poore, University of Tennessee, and Carmen J. Trammell, Software Engineering Technology, Inc.
Defense systems are becoming increasingly software intensive. While software enhances the effectiveness and flexibility of these systems, it also introduces vulnerabilities related to inadequacies in software design, maintenance, and configuration control. Effective testing of these systems must take into account the special vulnerabilities introduced by software. The software testing problem is complex because of the astronomical number of scenarios and states of use. The domain of testing is large and complex beyond human intuition. Because the software testing problem is so complex, statistical principles must be used to guide testing strategy in order to get the best information for the resources invested in testing.
In general, the concept of "testing in" quality is costly and ineffectual; software quality is achieved in the requirements, architecture, specification, design, and coding activities. The problem of doing just enough testing to remove uncertainty regarding critical performance issues, and to support the decisions that must be made in the software life cycle, is amenable to solution by statistical science. The question is not whether to test, but when to test, what to test, and how much to test.
Statistical testing enables efficient collection of empirical data that will remove uncertainty about the behavior of the software-intensive system and support economic decisions regarding further testing, deployment, maintenance, and evolution. A statistical principle of fundamental importance is that a population to be studied must first be characterized, and that characterization must include the infrequent and exceptional as well as the common and typical. It must be possible to represent all questions of interest and all decisions to be made in terms of this characterization. When applied to software testing, the population is the set of all possible scenarios of use with each accurately represented as to frequency of occurrence. The operational usage model is a formalism presented in this paper that enables the application of many statistical principles to software testing and forms the basis for efficient testing in support of decision making.
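To make the idea of a usage model concrete, the following Python sketch represents the population of scenarios of use as a Markov chain over states of use and draws random test scenarios from it. The Markov chain representation is an assumption made for this illustration, since the abstract does not specify the formalism, and the state names, transition probabilities, and function names are all hypothetical.

```python
import random
from collections import Counter

# Hypothetical usage model: each state maps to (next_state, probability)
# pairs. "Exit" is the terminal state and has no outgoing transitions.
USAGE_MODEL = {
    "Start": [("Login", 1.0)],
    "Login": [("Browse", 0.9), ("LoginError", 0.1)],
    "LoginError": [("Login", 0.8), ("Exit", 0.2)],
    "Browse": [("Browse", 0.5), ("Save", 0.3), ("Exit", 0.2)],
    "Save": [("Browse", 0.7), ("Exit", 0.3)],
}


def validate(model, terminal="Exit"):
    """Check that every row sums to 1 and every target is a known state."""
    for state, transitions in model.items():
        total = sum(prob for _, prob in transitions)
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"probabilities for {state!r} sum to {total}")
        for target, _ in transitions:
            if target != terminal and target not in model:
                raise ValueError(f"unknown target state {target!r}")


def sample_scenario(model, rng, start="Start", terminal="Exit"):
    """Random walk from start to terminal; the path is one test scenario."""
    path = [start]
    state = start
    while state != terminal:
        targets, weights = zip(*model[state])
        state = rng.choices(targets, weights=weights, k=1)[0]
        path.append(state)
    return path


def sample_test_suite(model, n, seed=0):
    """Draw n scenarios, so frequent uses appear in proportion to their
    assumed frequency of occurrence."""
    rng = random.Random(seed)
    return [sample_scenario(model, rng) for _ in range(n)]


def state_coverage(suite):
    """Fraction of scenarios in the suite that visit each state."""
    counts = Counter()
    for path in suite:
        counts.update(set(path))
    return {state: counts[state] / len(suite) for state in counts}


if __name__ == "__main__":
    validate(USAGE_MODEL)
    suite = sample_test_suite(USAGE_MODEL, 200)
    print(" -> ".join(suite[0]))
    print(state_coverage(suite))
```

Because scenarios are drawn according to the modeled frequencies, rare paths such as the hypothetical "LoginError" state still appear in the suite, but less often than common ones. This reflects the requirement that the characterization include the infrequent and exceptional as well as the common and typical.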
Most usage modeling and related statistical testing experience to date is with embedded real-time systems, application program interfaces, and graphical user interfaces. One very advanced industrial user of this technology is the mass storage device business. Use of this technology has led to extensive test automation, significant reduction in the time these software-intensive products are in testing, improved feedback to the developers regarding product deficiencies or quality, improved advice to management regarding suitability for deployment, and greatly improved field reliability of products shipped.
From a statistical point of view, all the topics in this paper follow sound problem-solving principles and are direct applications of well-established theory and methodology. From a software testing point of view, the application of statistical science is relatively new and rapidly evolving, as an increasing range of statistical principles is applied to a growing variety of systems. Statistical testing is used in pockets of industry and agencies of government, including DoD, on both experimental and routine bases. This paper is a composite of what is in hand and within reasonable reach in the application of statistical science to software testing.