etc., then peculiarities of that data, machine, lab, etc. will be shared between the samples used to develop the computational model and the samples used to evaluate it. As a result, even if the computational model performs well on that independent set of samples, it might not perform well on samples from patients at a different hospital or processed by a different technician in a different lab. The OvaCheck case study (further discussed in Chapter 6 and Appendix A) illustrates the importance of independent datasets for confirmation: most of the tissue specimens used to test the computational model were obtained from the same institution that provided the specimens used to train the computational model (Baggerly et al., 2004; Petricoin et al., 2002).
In some cases it will not be possible to obtain independent sets of specimens and associated clinical data with all of these characteristics; however, it is important to keep in mind that the quality of evidence provided by good model performance on an independent specimen set depends critically on the characteristics of that set. Hence, it is important that full descriptions of the independent specimen set are reported along with results of the computational model’s performance in the discovery phase. Below, two “levels of evidence” are presented for assessing omics-based computational model performance on an independent specimen set.
Lower Level of Evidence: Independent sets of specimens and clinical data collected at a single institution using carefully controlled protocols, with samples from the same patient population.
Under these circumstances, good performance of the locked-down computational procedure indicates that it works in the particular setting that was studied, with that institution's protocols and patient profile. However, this candidate omics-based test might not work well with a slightly different patient population, with samples processed in a different laboratory, or with a slightly different protocol.
Higher Level of Evidence: Independent sets of specimens and clinical data collected at multiple institutions.
Success in this setting strongly suggests that the omics measurements and locked-down computational procedure will work well on future patients. It provides evidence that the test is robust to the kinds of things that might change between locations, such as the biology of the populations that tend to go to a particular hospital, sample collection and handling, and measurement techniques. This is important because such differences can have large effects on the omics measurements obtained, often larger than the differences associated with the phenotypes of interest.
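To make the multi-institution evaluation described above concrete, the following minimal sketch scores a locked-down model separately for each collection site, so that a site-specific drop in performance is not hidden in a pooled average. Everything in it is an assumption made for illustration: the function names (`locked_model`, `accuracy_by_site`), the weights and threshold, the site labels, and the data are all hypothetical and do not come from any study discussed here.

```python
from collections import defaultdict


def locked_model(features):
    """Hypothetical locked-down model: fixed weights and threshold, never refit."""
    weights = (0.8, -0.5)   # assumed values, frozen before evaluation
    threshold = 0.1         # assumed value, frozen before evaluation
    score = sum(w * x for w, x in zip(weights, features))
    return 1 if score > threshold else 0


def accuracy_by_site(samples, model):
    """Return {site: fraction of correct predictions} for independent samples.

    Each sample is a tuple (site, features, true_label).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for site, features, label in samples:
        total[site] += 1
        if model(features) == label:
            correct[site] += 1
    return {site: correct[site] / total[site] for site in total}


# Made-up illustrative data, not taken from any real specimen set.
independent_samples = [
    ("hospital_A", (1.0, 0.2), 1),
    ("hospital_A", (0.1, 0.9), 0),
    ("hospital_B", (0.9, 0.1), 1),
    ("hospital_B", (0.6, 0.5), 0),
]

per_site = accuracy_by_site(independent_samples, locked_model)
for site, acc in sorted(per_site.items()):
    print(f"{site}: accuracy = {acc:.2f}")
```

Reporting performance per site rather than only in aggregate is one simple way to check whether the test is robust to between-location differences. In practice, a real evaluation would use far larger specimen sets and performance measures appropriate to the intended clinical use.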