aspects of the acquisition process as a whole that affected the application of statistical techniques. It became clear to us that adopting effective statistical practices that command wide and consistent support within the DoD acquisition community would yield substantial gains.
The panel's main conclusions concerning the current use of operational testing as part of system development cover broad aspects of DoD operational testing and evaluation.
Currently, operational testing is a collective final event in system development. Since many design flaws in both industrial and defense systems become apparent only in operationally relevant use, some testing with operational realism needs to be carried out as early as possible. Inexpensive, small, focused preliminary tests with operational aspects could help to identify problems before a design is final. Such tests could also identify scenarios in which the current design performs poorly and which system characteristics should be the focus of subsequent tests.
In addition, it is currently uncommon to use developmental test data, or test data for related systems, to augment data from operational testing except for direct pooling of reliability test data. This omission derives in part from understandable concerns about the relevance of developmental test data or test data from related systems for the evaluation of a system's operational performance, but it also originates, in part, from a lack of statistical expertise about how to use the information and a lack of access to the information.
Conclusion 2.1: For many defense systems, the current operational testing paradigm restricts the application of statistical techniques and thereby reduces their potential benefits by preventing the integration of all available and relevant information for use in planning and carrying out tests and in making production decisions.
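As a rough illustration of what integrating available and relevant information could look like in the simplest case, the sketch below combines pass/fail results from developmental testing with operational test results, down-weighting the developmental data to reflect doubts about its operational relevance. This is one possible approach (a beta-binomial model with a discount weight, in the spirit of a power prior), not a method prescribed by the panel. The function name, the discount weight, the uniform prior, and all counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BetaPosterior:
    """Beta distribution summarizing belief about a success probability."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def discounted_pool(s_op, n_op, s_dev, n_dev, weight,
                    prior_alpha=1.0, prior_beta=1.0):
    """Combine operational and developmental pass/fail data.

    weight = 0 ignores developmental data entirely; weight = 1 is direct
    pooling. Intermediate values count each developmental trial as a
    fraction of an operational trial. The weight is an assumption that
    must be justified by subject-matter judgment, not estimated here.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if not (0 <= s_op <= n_op and 0 <= s_dev <= n_dev):
        raise ValueError("successes must not exceed trials")
    alpha = prior_alpha + weight * s_dev + s_op
    beta = prior_beta + weight * (n_dev - s_dev) + (n_op - s_op)
    return BetaPosterior(alpha, beta)


# Hypothetical counts: 8 of 10 operational successes, 45 of 50 developmental.
for w in (0.0, 0.5, 1.0):
    post = discounted_pool(s_op=8, n_op=10, s_dev=45, n_dev=50, weight=w)
    print(f"weight={w:.1f}  posterior mean={post.mean:.3f}")
```

The choice of weight carries the concern about relevance noted above: a value near zero treats developmental results as nearly uninformative about operational performance, while a value of one reproduces direct pooling.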
Also, the incentive structure in military procurement gives the major participants strong, often differing and even competing perspectives. These participants include the program manager for a system, the test director, the contractor, various activities in the Office of the Secretary of Defense (OSD), and Congress. This set of complicated dynamics affects a variety of aspects of the test and evaluation process: budgets, schedules, test requirements, test size, which test events should be excluded because of possible test abnormalities, and even the rules for scoring a test event a "failure." It is critical that the perspectives of the participants be understood and taken into account in decision making concerning test design and test evaluation.
Further, for operational tests of most complicated systems, the required sample sizes that would support significance tests at the usual levels of statistical