

The National Academies of Sciences, Engineering, and Medicine
500 Fifth St. N.W. | Washington, D.C. 20001

Copyright © National Academy of Sciences. All rights reserved.




OCR for page 197
PART III Implications for a National Reference Ballistic Image Database

8 Experimental Evidence on Sources of Variability and Imaging Standards

The performance studies of the Integrated Ballistics Identification System (IBIS) platform, the current standard for ballistic imaging, summarized in Section 4–D provide context for the committee’s own analyses. The core of the experimental work performed by the committee was coordinated by the Office of Law Enforcement Standards of the National Institute of Standards and Technology (NIST), with whom the National Institute of Justice executed a separate contract to perform analyses at the committee’s direction.

Given the committee’s basic charge, an ideal test would involve the creation of a prototype national reference ballistic image database (RBID), exceeding the size of the De Kinder et al. (2004) exhibit set, and thus getting a direct impression of automated systems’ ability to detect sameness amidst a vast array of exhibits with highly similar class characteristics. However, such a massive collection was clearly beyond the scope of our available resources. Working with NIST, and recognizing the work in previous studies, we judged it best to focus our analyses on narrower objectives. Our experimentation was aimed at generating an exhibit set that, although small in size, could facilitate studies of system performance when both firearm and ammunition type are varied.

Significantly, a main objective of our work was to study the effectiveness of one possible enhancement to the current National Integrated Ballistic Information Network (NIBIN) or, alternately, a possible design choice for a new national RBID: a switch from two-dimensional photography to three-dimensional surface measurement as an imaging standard. When NIST and the committee began their work, three-dimensional profilometry analyses had been performed on bullets but had not yet been attempted on cartridge case markings; our experimental work was intended to shed light on the tradeoff between two-dimensional and three-dimensional imaging in computer-assisted firearms identification.

An important set of caveats is in order at the outset regarding this work. Since NIST and the committee began their work, Forensic Technology WAI, Inc. (FTI) has refined its BulletTRAX-3D offering, and an FTI system for three-dimensional analysis of cartridge casings is in production; see Box 4-1 and Chapter 7. The three-dimensional analyses described in this chapter and expressed fully in NIST’s report (Vorburger et al., 2007) do not make use of FTI’s three-dimensional software and systems, although they do share common technology. Indeed, since our three-dimensional analyses are exclusively focused on cartridge case markings, using the FTI three-dimensional equipment was not possible since its cartridge case system was not in production during the committee’s period of analysis. In this chapter, we do make reference to performance comparisons between NIST’s three-dimensional system and “IBIS,” referring to the standard IBIS two-dimensional-photography-based product, even though FTI has branded its three-dimensional systems with the IBIS name. As described in Box 4-1, this nomenclature and comparison is appropriate since it is the two-dimensional version of IBIS that is currently used by the NIBIN program, which is central to our charge.

Although construction of a full-fledged national RBID prototype was not within our committee’s resources, we did wish to do the next best thing: namely, work with data from the existing state-level RBIDs. We accepted an invitation from the New York Combined Ballistic Identification System (CoBIS) RBID to perform limited data entry and experimentation at its Albany headquarters location.

The design of the exhibit sets studied in this chapter is described in Section 8–A, as well as the process used to acquire three-dimensional topographic images.
Section 8–B summarizes the work done on the committee’s behalf by NIST and concentrates on the comparison between two-dimensional and three-dimensional performance. The results of limited experimentation with the existing New York RBID are described in Section 8–C. Overall conclusions from this work are not presented in this chapter, but are rather deferred to Chapter 9’s discussion of the advisability of a national RBID.

8–A  Data Sources, Design, and Image Acquisition

The committee’s experimental work relied principally on two sets of cartridge case exhibits: an extract of casings from the De Kinder et al. (2004) study of large image database performance and a new set of test firings commissioned by the committee and collected by NIST. We describe these sources below, along with the steps taken to acquire two-dimensional and three-dimensional images and measurements; for ease of reference, the basic design of the exhibit sets is summarized in Box 8-1.

BOX 8-1  Design of Test-Fire Cartridge Sets

DKT (De Kinder et al., 2004) Exhibit Set

• Firearms Used: 600 California Highway Patrol service pistols, all generally of the same SIG Sauer P226 make. Forty-six of the pistols were of the P225, P228, or P229 series, but these models were judged to be consistent in breech face and firing pin configurations with the P226.

• Ammunition Used: Six brands: CCI, Federal, Remington, Speer, Winchester, and Wolf.

• Firing Protocol: A set of seven cartridges (two using Remington ammunition, and one each from the other five ammunition brands) were loaded into a magazine and fired from each pistol. It is not known whether the same ammunition sequence was used in each of the firings.

• Analysis Set: NIST obtained access to the full set of 4,200 casings. At the committee’s direction, 10 of the 554 guns known to be of the P226 make were selected at random and the 7 casings from those guns were extracted to form a 70-element analysis set.

NBIDE (Vorburger et al., 2007) Exhibit Set

• Firearms Used: 12, 4 from each of 3 brands, purchased as new by NIST from standard vendors. Makes were chosen to try to obtain a range of known quality and tooling, subject to constraints on NIST’s ability to purchase from available dealers. Only 9mm caliber firearms were considered, for simplicity. Chosen makes were Ruger P95D, SIG Sauer P226, and Smith & Wesson 9VE. The SIG Sauer pistols bore consecutive serial numbers; the Ruger pistols bore closely proximate serial numbers; the Smith & Wesson pistols included 3 with close serial numbers.

• Ammunition Used: Four brands, all 115 grain and full metal jacketed. Chosen brands were PMC Eldorado, Remington, Speer, and Winchester. All but the Winchester have nickel-plated primers, while the Winchester is brass.

• Firing Protocol: The firearms were inspected and cleaned prior to test firings. One set of repetitions was performed on each of three days. The 12 pistols were fired in randomly chosen order using one ammunition type before going on to the next ammunition brand; the order in which the ammunition brands were handled was varied across the 3 days. After the full set of cartridge casings was collected and labeled, a new randomization was performed and new labels assigned before the casings were analyzed.

• Analysis Set: The full exhibit set has 144 casings (4 guns × 3 gun brands × 4 ammunition brands × 3 ammunition repetitions). Due to time constraints, NIST only processed 108 casings, excluding the Speer-brand firings, using three-dimensional surface measurements.
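The set sizes quoted in Box 8-1 follow directly from the factorial designs. As a minimal sketch for checking that arithmetic (the variable and function names are illustrative only and do not come from the NIST report), the NBIDE cells can be enumerated as follows:

```python
from itertools import product

# Factors of the NBIDE design as listed in Box 8-1.
GUN_BRANDS = ["Ruger P95D", "SIG Sauer P226", "Smith & Wesson 9VE"]
GUNS_PER_BRAND = 4
AMMO_BRANDS = ["PMC Eldorado", "Remington", "Speer", "Winchester"]
REPETITIONS = 3  # one repetition on each of the three firing days


def nbide_design():
    """Enumerate every (gun brand, gun number, ammunition, repetition) cell."""
    return list(product(GUN_BRANDS,
                        range(1, GUNS_PER_BRAND + 1),
                        AMMO_BRANDS,
                        range(1, REPETITIONS + 1)))


full_set = nbide_design()
# The three-dimensional analysis left out the Speer firings.
reduced_set = [cell for cell in full_set if cell[2] != "Speer"]

# DKT: 600 pistols with 7 casings each; the analysis subset uses 10 pistols.
dkt_full = 600 * 7
dkt_analysis = 10 * 7

print(len(full_set), len(reduced_set), dkt_full, dkt_analysis)
# -> 144 108 4200 70
```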

8–A.1  DKT: De Kinder et al. (2004) Exhibit Set

NIST staff obtained access to the 4,200-element exhibit set analyzed by De Kinder et al. (2004), representing firings of seven cartridges in each of 600 SIG Sauer pistols. From the pistols known to be of the SIG Sauer P226 model (some of the 600 pistols were very similar to the P226 but not that exact model), the committee randomly selected 10 pistols; all 7 casings for each of those guns were extracted from the exhibit set for further analysis. For convenience, we refer to this sample of 70 casings as the DKT exhibit set (using the initials for the first two authors).

8–A.2  NBIDE: NIST New Test-Fire Exhibit Set

The De Kinder et al. (2004) analysis made use of a natural opportunity for test firing many similar weapons: a large order of new firearms for a law enforcement agency. In addition to the advantage of database size, it is also strong due to its attention to varying one major factor in the quality of ballistic toolmarks as registered by photographic techniques, namely ammunition type. Strong though it is, it is also limited by its focus on only one firearm type or brand. It is also somewhat limited by its lack of repetitions within firearm and ammunition combinations; only two of the seven firings from each pistol repeated the same combination of major factors (i.e., the same firearm using two rounds of Remington-Peters ammunition); hence, it is limited in its ability to study shot-to-shot variability between firings.

Working with NIST, the committee sought to develop a small exhibit set addressing both of these limitations that could then be subjected to both two-dimensional and three-dimensional analysis. NIST used the terminology “NIST Ballistics Identification Designed Experiment” to describe its work, and so we use the label NBIDE to refer to the experiment and the new test-fire exhibit set produced for it.
For simplicity, we restricted attention to a single caliber of firearms, 9mm. Moreover, absent the ability to obtain firearms with consecutively manufactured parts or to acquire guns direct off the production line (which was not within our resources), we elected to focus on firearms purchased as new from standard dealers. Within those constraints, the intent for the NBIDE exhibit set was to select several gun models representing a range of perceived quality and precision tooling. The Smith & Wesson 9VE and Ruger P95D were identified as choices, Smith & Wesson being a relatively finely tooled weapon and Ruger being a perceived mid-range choice. However, acquiring as-new firearms on the low end of that continuum (for instance, the relatively inexpensive Lorcin or Bryco firearms that still show up among the most traced guns even though the manufacturers are now out of business) proved difficult for NIST. We opted instead to add another relatively high-end firearm model: the same SIG Sauer P226 model used in the De Kinder et al. (2004) study. This serves to give us a point of comparison with that study, albeit adding more repetitions using the same ammunition.

For each of these three brands, four new guns were purchased. All four Ruger P95D firearms and three of the four Smith & Wesson 9VE guns bore close serial numbers (within eight units of each other); three of the SIG Sauer P226 guns bore consecutive serial numbers.

Like the choice of gun make and model, the selection of ammunition masks or subsumes other individual factors affecting the marks left on fired rounds and the ability to detect them through imaging. These individual factors include variation in such areas as the plating of the primer, the presence of nonfiring manufacturing marks, or the presence and thickness of lacquer on the primer. For the NBIDE exhibit set, we elected to retain three of the ammunition brands used by De Kinder et al. (2004), namely Remington, Winchester, and Speer, while adding another, PMC (Eldorado) brand ammunition. All the selected ammunition had the same bullet weight, 115 grain.

The full NBIDE exhibit set has 144 elements: three repetitions of each of four ammunition brands, fired through four guns from each of three makes. However, the NIST analysis (Vorburger et al., 2007) uses only a reduced 108-element subset of the exhibits, excluding the Speer brand ammunition firings from analysis. This reduction in size was done to reduce the analysis burden, when it was unclear how time consuming three-dimensional surface measurement would be. Although only the 108-element set was subjected to three-dimensional analysis, all 144 exhibits were later analyzed using the current IBIS system.
Prior to test firing, the firearms were inspected and cleaned: in particular, excess oil left inside the weapons at the factory was removed. The test firings were completed over the course of 3 days inside a range facility at NIST’s Gaithersburg, Maryland, campus. Only the cartridge casings were retained during firing, caught in a windsock-type attachment after each shot was fired; bullets were fired into a destructive, scrap rubber-type trap. One set of repetitions was performed each day; the ordering of guns and ammunition was randomized across the 3 days.

After the test firing was complete, the 144 exhibits were re-randomized and labeled (though this was done so that the Speer rounds could be separated out for NIST’s three-dimensional measurement purposes). The exact mapping of exhibits back to their parent gun was sealed and kept unknown to NIST’s analysts, so that they were blind to the true results until imaging and processing were complete.
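The blind relabeling step can be pictured with a short sketch. This is only an illustration of the general idea, not NIST’s actual procedure: the label format, the seed, and the function name `blind_relabel` are all hypothetical.

```python
import random


def blind_relabel(exhibit_ids, seed=None):
    """Shuffle the exhibits and hand out new sequential labels.

    Returns the list of new labels (what analysts would see) and the sealed
    key mapping each new label back to the original exhibit identifier.
    """
    rng = random.Random(seed)
    shuffled = list(exhibit_ids)
    rng.shuffle(shuffled)
    sealed_key = {f"X{n:03d}": original
                  for n, original in enumerate(shuffled, start=1)}
    return list(sealed_key), sealed_key


# Hypothetical identifiers: 12 guns x 4 ammunition brands x 3 repetitions.
original_ids = [f"gun{g:02d}-ammo{a}-rep{r}"
                for g in range(1, 13)
                for a in range(1, 5)
                for r in range(1, 4)]

labels, sealed_key = blind_relabel(original_ids, seed=0)
print(len(labels), labels[0])
```

Analysts would work only from `labels`; `sealed_key` stays closed until imaging and processing are finished.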

8–A.3  Image Acquisition

Three-dimensional surface measurements of the firing pin, breech face, and ejector mark impressions on the DKT and NBIDE (108) casings were gathered using NanoFocus µSurf microscopes. Measurements were made using a microscope at NIST’s Gaithersburg campus and one at the Rockville, Maryland, facilities of Intelligent Automation, Inc., with whom NIST subcontracted on this work. Each of the microscopes was checked for calibration on a daily basis during the measurement acquisition process, making use of the “standard bullet” and prototypes of a “standard casing” under development under separate studies as NIST-designated reference measurement standards (see Vorburger et al., 2007:Section 4.2).

Subsequently, the DKT and NBIDE (144) casings were submitted to the Bureau of Alcohol, Tobacco, Firearms, and Explosives (ATF) National Laboratory at Ammendale, Maryland, for entry on an IBIS station. The two batches were processed separately; that is, an IBIS comparison request was generated for each of the 70 DKT casings, comparing each entry against the other 69 in the set. Similar work was done for each of the 144 NBIDE casings. By default, each of these comparison runs produced a hard-copy cover sheet returning the “top 10” comparison results, such as that illustrated in Figure 4-2.

Through arrangement with FTI, NIST and the committee were able to obtain additional information from each of the two exhibit sets:

• For the DKT casings, the raw IBIS images (including the placement of region-of-interest delimiters on the standard center-light images as well as the side-light image) were extracted and provided in electronic form.

• For the NBIDE (144) casings, FTI performed a complete comparison that waived the 20 percent threshold and coarse comparison steps, generating the IBIS correlation scores for each casing against the remaining 143 elements, and provided those scores in electronic form.
Figure 8-1 contrasts the greyscale photographic images collected by the current IBIS platform with representations of three-dimensional surface measurement data, for both the breech face and firing pin markings of a particular cartridge casing. The raw data for the three-dimensional surface measurements are just that: numeric distance measurements over a fine array of spatial coordinates. For graphical purposes, these can be rendered in many ways, using colors to suggest “height” or “depth” or simulating lighting from any desired angle. The two three-dimensional plots in Figure 8-1 use different color and texture schemes to approximate the appearance of the surfaces.

FIGURE 8-1  IBIS two-dimensional images and rendered three-dimensional surfaces, breech face, and firing pin impressions from one casing.
NOTES: Breech face images in row 1; firing pin images in row 2. Images are from the DKT exhibit set, the Federal casing from pistol number 535. The region-of-interest delimiter circles are superimposed on the IBIS images.

8–A.4  Processing Steps and Similarity Measures for Three-Dimensional Measurement Data

Vorburger et al. (2007:Section 8) describe the data processing steps for NIST’s topographic measurements in considerable detail. Here, we describe the basic steps:

• Data trimming and thinning: Alone, the sheer size and detail of the topographic measurement datasets make the cross-comparison of three-dimensional “images” a time- and computer-intensive activity. For breech face impressions in particular, where measurements may be picked up on the cartridge rim, some data are trimmed so as to include only the primer surface. NIST also worked with algorithms for thinning the data, reducing the lateral resolution of breech face images from on the order of a 2,600 × 2,600 grid of data points to 650 × 650. (Measurements for firing pin impressions were not thinned, however.)

• Removal of dropout and outlier points: Good though a three-dimensional sensor may be, it does ultimately provide only an estimate of height or depth; there are individual spatial points that the sensor may simply fail to acquire (dropouts) and others where the estimate is made with appreciable noise or inaccuracy (outliers). Code developed by NIST analyzed the three-dimensional measurement datasets for these problematic points and interpolated new values from nearest neighboring points.

• Filtering: As a rough means to try to emphasize individual characteristics rather than class characteristics in the three-dimensional images, NIST applied standard filters in noncontact optical profilometry, based on spatial wavelengths in the topographic image data. Spatial wavelength calculations are based on distance between consecutive peaks after subtracting out a mean surface depth; in this particular case, both very short and long wavelengths are subtracted out, removing effects that can be thought of as corresponding to system measurement noise and broad structural features (class characteristics), respectively. This filtering adjustment stops short of true feature extraction, using algorithms to try to detect and highlight particular image features.

• Registration: Finally, the adjusted topographic image is processed by another program, intended to find the rotation and horizontal/vertical shift that gives best correspondence between images.
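The first three steps above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not NIST’s code: thinning is done by plain subsampling, dropouts are assumed to be stored as NaN and are filled from the 3 × 3 neighborhood mean, and the band-pass step uses a hard Fourier cutoff as a stand-in for the profilometry filters described in Vorburger et al. (2007). The cutoff values and all function names are hypothetical, and registration is omitted.

```python
import numpy as np


def thin(z, step=4):
    """Keep every `step`-th point in each direction (2,600 -> 650 for step=4)."""
    return z[::step, ::step]


def fill_dropouts(z):
    """Replace NaN dropouts with the mean of their valid 3x3 neighbors."""
    filled = z.copy()
    padded = np.pad(z, 1, mode="constant", constant_values=np.nan)
    for r, c in zip(*np.where(np.isnan(z))):
        window = padded[r:r + 3, c:c + 3]  # 3x3 block centered on (r, c)
        if np.any(~np.isnan(window)):
            filled[r, c] = np.nanmean(window)
    return filled


def bandpass(z, short_cutoff, long_cutoff, spacing=1.0):
    """Remove spatial wavelengths shorter than short_cutoff (noise) and
    longer than long_cutoff (broad form), after subtracting the mean height."""
    z = z - z.mean()
    fy = np.fft.fftfreq(z.shape[0], d=spacing)
    fx = np.fft.fftfreq(z.shape[1], d=spacing)
    freq = np.hypot(*np.meshgrid(fy, fx, indexing="ij"))
    keep = (freq >= 1.0 / long_cutoff) & (freq <= 1.0 / short_cutoff)
    return np.real(np.fft.ifft2(np.fft.fft2(z) * keep))


# Toy demonstration on synthetic data (not real casing measurements).
x = np.arange(64)
fine = np.sin(2 * np.pi * x / 8)     # wavelength 8: inside the pass band
broad = np.sin(2 * np.pi * x / 64)   # wavelength 64: treated as broad structure
surface = np.tile(fine + broad + 5.0, (64, 1))
surface[10, 10] = np.nan             # one simulated dropout
cleaned = fill_dropouts(surface)
filtered = bandpass(cleaned, short_cutoff=4, long_cutoff=16)
```

In this toy case the band-pass output is essentially the wavelength-8 ripple alone, apart from a small disturbance introduced by the interpolated dropout.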
To compare images, NIST used areal cross-correlation functions, as are common in spatial statistics. Like the standard statistical correlation score, the cross-correlation scores are scaled; two topographic images that are exactly the same would yield an areal cross-correlation of 1.0. As the measures are computed, the functions used by NIST are very slightly asymmetric; that is, the cross-correlation of image A compared with image B can be slightly different from the result when image B is used as the reference and compared with image A. Noting that the standard IBIS two-dimensional comparison score is similarly asymmetric, NIST judged the discrepancies to be generally insignificant (Vorburger et al., 2007:126–127).
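To make the scaling concrete, the following sketch computes a normalized cross-correlation between two height maps and takes the maximum over small integer shifts. It is a generic textbook formulation, not the function NIST used: the shifts are circular (`np.roll`) for brevity, rotation is ignored, and this simplified version is symmetric, so it does not reproduce the slight asymmetry noted above. The names `areal_ccf` and `max_ccf` are illustrative.

```python
import numpy as np


def areal_ccf(a, b):
    """Normalized cross-correlation of two equally sized height maps
    at zero shift; identical maps give 1.0."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))


def max_ccf(reference, test, max_shift=3):
    """Largest correlation over integer shifts of `test` within
    +/- max_shift points in each direction (circular shifts, an assumption)."""
    best = -1.0
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(test, (dy, dx), axis=(0, 1))
            best = max(best, areal_ccf(reference, shifted))
    return best


# Synthetic example: a random surface and a displaced copy of it.
rng = np.random.default_rng(0)
surface = rng.normal(size=(32, 32))
displaced = np.roll(surface, (2, -1), axis=(0, 1))

same = areal_ccf(surface, surface)        # 1.0 up to rounding
unaligned = areal_ccf(surface, displaced)  # near 0 for this random surface
aligned = max_ccf(surface, displaced)      # back to about 1.0 after the shift search
```

The shift search plays the role of a crude registration step: the displaced copy scores poorly until the right offset is tried.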

8–B  Analysis of Two- and Three-Dimensional Image Data

In this section, we describe the results of analyzing the DKT and NBIDE datasets using both two-dimensional image data (e.g., the current IBIS platform) and three-dimensional topographic data (using NIST’s acquisition protocols and algorithms). With specific regard to the DKT data (and, perhaps, to the firings from SIG Sauer pistols in the NBIDE dataset), these analyses are partly meant to assess the consistency of our results with those by De Kinder et al. (2004), as described in Chapter 4. But, in addition to getting a sense of the capability of the current IBIS to detect known “sister” casings from the same gun, this work is also meant to shed some light on the tradeoff between two-dimensional and three-dimensional measurement, to see whether the latter offers clear-cut advantages over the former.

In what follows, we rely heavily on “top 10” analyses, looking at the 10 highest-ranked possible matches by different markings. This is somewhat unfortunate given our assessment in Section 4–F that there is no special magic in the top 10 as a cutoff (and, indeed, that the focus on the top 10 in current training and IBIS reports has the effect of overpromising the system). However, it is a practical limitation necessitated by a desire to stick to standard IBIS analysis and experience as much as possible: that is, we look principally at the top 10 because generation of the top 10 cover sheet scores is the system default, and we confine ourselves to the top 10 ranks using three-dimensional measurements for consistency. A fuller analysis would have considered larger cuts at the rankings, such as the top 50: more than the strict limit of the top 10, but still within the number of results a human examiner might reasonably routinely scroll through onscreen to find potential matches.
However, as we will discuss, we did obtain a full set of comparison scores for the full NBIDE dataset and discuss those results as well.

8–B.1  Two-Dimensional and Three-Dimensional Performance, DKT Data

As shown in Box 8-1, each exhibit in the DKT exhibit set has six possible same-gun matches. Table 8-1 summarizes the same-gun entries found in the top 10 rankings in a standard IBIS search against the 69 other DKT exhibits, and Table 8-2 provides the same results based on NIST’s analysis of three-dimensional topographic data.

On firing pins, the two-dimensional and three-dimensional systems do comparably well. While the three-dimensional system does a much better job at finding the casings from pistols 375, 430, and 535, the two-

casings against the 107 other casings. Accordingly, each comparison was against a database containing 8 same-gun matches (three ammunition types × three repetitions – one) and 99 nonmatches (11 other firearms × three ammunition types × three repetitions). The cell counts in Tables 8-3 and 8-4 indicate the number of same-gun matches for each firearm and ammunition combination, averaged across the three repetitions.

For the pure SIG Sauer DKT exhibit set, both the two-dimensional and three-dimensional systems appeared to do a better job at finding same-gun matches using the firing pin mark than the breech face; the opposite appears to be true for the NBIDE dataset. The success of the three-dimensional system in finding the same-gun matches in the top 10 ranks on breech face is excellent; indeed, it is near perfect. For the two-dimensional systems, the success rates on breech face generally exceeded those for firing pin; the two-dimensional scores are not weak (averaging about six of eight same-gun matches detected for the Smith & Wesson pistols, and doing less well, about four of eight, on the SIG Sauers), but they do not approach the high success of the three-dimensional analysis.

On firing pins, the scores corresponding to the SIG Sauer firings are generally lower than those for the Ruger and Smith & Wesson guns. The second of the SIG Sauer pistols seems particularly difficult, yielding fewer than three out of eight same-gun matches on either the two-dimensional or three-dimensional systems. As suggested in the DKT analysis above, ammunition seems to have a strong effect, with the Remington ammunition producing consistently fewer matches than the PMC or Winchester rounds. Overall, the three-dimensional analysis appears to outperform the two-dimensional, particularly for the Ruger firings and the second Ruger pistol.
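The pool sizes quoted above, and the idea of counting same-gun “hits” in the top 10, can be expressed compactly. The sketch below is illustrative only: the function names, the toy scores, and the choice to break ties by exhibit ID are assumptions made for the example, not part of the IBIS or NIST software.

```python
def pool_composition(n_guns, casings_per_gun):
    """Return (pool size, same-gun matches, nonmatches) for one reference
    casing compared against every other casing in the set."""
    pool = n_guns * casings_per_gun - 1
    same_gun = casings_per_gun - 1
    nonmatch = (n_guns - 1) * casings_per_gun
    return pool, same_gun, nonmatch


def top_k_hits(scores, reference_gun, gun_of, k=10):
    """Count same-gun exhibits among the k highest-scoring entries.

    scores maps exhibit ID -> similarity score; gun_of maps exhibit ID ->
    gun. Ties are broken by exhibit ID (an assumption for this sketch).
    """
    ranked = sorted(scores, key=lambda ex: (-scores[ex], ex))
    return sum(gun_of[ex] == reference_gun for ex in ranked[:k])


print(pool_composition(12, 3 * 3))   # NBIDE, 108-element set -> (107, 8, 99)
print(pool_composition(10, 7))       # DKT analysis set       -> (69, 6, 63)
print(pool_composition(12, 4 * 3))   # NBIDE, full 144 set    -> (143, 11, 132)

# Toy score list with hypothetical exhibit IDs and scores.
toy_scores = {"A": 310, "B": 290, "C": 290, "D": 120}
toy_guns = {"A": 1, "B": 2, "C": 1, "D": 1}
hits = top_k_hits(toy_scores, reference_gun=1, gun_of=toy_guns, k=3)
print(hits)  # -> 2
```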
Some further insight into the two-dimensional/IBIS performance on this dataset can be had by considering a complete set of scores and rankings, waiving the coarse comparison and 20 percent threshold steps, that was prepared for the committee by FTI. Table 8-5 summarizes the distribution of the ranks of matching exhibits in these complete score lists; for this analysis, we include the Speer casings and use the complete 144-element NBIDE set. The table combines the 144 separate comparison reports, indicating the distribution of all 144 × 143 = 20,592 pairwise comparisons, of which 1,584 were between exhibits that were fired from the same gun.

Out of 1,440 possible top-10-ranked positions by breech face, across the 144 different comparisons, about 57 percent were between reference and test exhibits from the same firearm; 33 percent were from different firearms but the same gun brand, while 10 percent were from exhibits from completely different gun brands. On the firing pin impression alone, the share of top-10 positions filled by same-firearm matches dips to 42 percent while the share from same-brand-but-different-firearm comparisons grows to 43 percent.

TABLE 8-5  Summary of IBIS Comparisons for Full 144-Exhibit NBIDE Set

Ranks of Matching Exhibits in Complete Score List

Relationship of Reference          Breech Face Only              Firing Pin Only               Breech Face or Firing Pin
to Test Casing                     #1   #2–10  #11–29   #30+     #1   #2–10  #11–29   #30+     #1   #2–10  #11–29   #30+

Different Gun Brand
  Different Ammunition              0      71     602  9,695      6     140     640  9,582      6     210   1,182  8,970
  Same Ammunition                   0      70     383  3,003      0      71     286  3,099      0     133     615  2,708

Same Gun Brand
  Different Specific Firearm
    Different Ammunition            1     181     960  2,746     13     347   1,064  2,464     14     502   1,613  1,759
    Same Ammunition                11     288     451    546      8     256     426    606     19     450     522    305
  Same Specific Firearm
    Different Ammunition           49     553     303    391     55     373     281    587     92     678     336    190
    Same Ammunition                83     133      37     35     62     109      39     78    115     118      30     25

NOTES: Tabulations are from score lists generated by Forensic Technology WAI, Inc., waiving the coarse comparison pass and 20 percent threshold. Each of the 144 casings was compared with the 143 other exhibits in the set, for a total of 20,592 comparisons. The cutoff at rank 29 corresponds to a strict 20 percent threshold on a sample size of 143; the actual effective sample size for any of these comparisons in a standard IBIS run would be somewhat larger because both breech face and firing pin images are considered, but we use the 29 cutoff for simplicity. For purposes of generating ranks, tie scores in the score lists are assigned final ranks by sorting by the NIST-assigned ID number for the test (in-database) exhibits. This is tantamount to a random assignment of ranks for tie scores because the ID numbers were randomly mixed prior to analysis.
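The percentages quoted in the text can be recomputed from the first two rank columns (#1 and #2–10) of Table 8-5. The short script below is offered only as an arithmetic check; the dictionary layout and names are ours.

```python
import math

# Counts at ranks #1 and #2-10 from Table 8-5, summed over the
# "different ammunition" and "same ammunition" rows of each group.
TOP10 = {
    "breech_face": {
        "different_brand": (0 + 71) + (0 + 70),
        "same_brand_other_gun": (1 + 181) + (11 + 288),
        "same_gun": (49 + 553) + (83 + 133),
    },
    "firing_pin": {
        "different_brand": (6 + 140) + (0 + 71),
        "same_brand_other_gun": (13 + 347) + (8 + 256),
        "same_gun": (55 + 373) + (62 + 109),
    },
}


def shares(counts):
    """Convert counts to whole-number percentages of their total."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


breech_shares = shares(TOP10["breech_face"])
pin_shares = shares(TOP10["firing_pin"])
print(sum(TOP10["breech_face"].values()))  # -> 1440 top-10 positions
print(breech_shares)  # same_gun 57, same_brand_other_gun 33, different_brand 10
print(pin_shares)     # same_gun 42, same_brand_other_gun 43, different_brand 15

# Totals quoted in the text and table notes.
total_comparisons = 144 * 143          # 20,592 pairwise comparisons
same_gun_comparisons = 144 * 11        # 1,584 same-gun comparisons
rank_cutoff = math.ceil(0.20 * 143)    # 29, the strict 20 percent threshold
```

The recomputed shares agree with the 57, 33, and 10 percent figures for breech face and the 42 and 43 percent figures for firing pin.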

8–B.3  Analysis of Matching and Nonmatching Distributions of Similarity Scores

Extending beyond analysis of ranks, as in Table 8-5 for the two-dimensional IBIS data, the NIST study (Vorburger et al., 2007:Section 9.5) derives an overlap metric to assess its cross-correlation similarity scores for three-dimensional topography data.

The empirical distribution of the cross-correlation scores can be derived separately for the matching (same-firearm) and nonmatching pairwise comparisons in a dataset of topographic “images”; Figure 8-2 illustrates such a distribution for the scores generated from the firing pin measurements using the NBIDE exhibit set. Continuous distributions can then be estimated for the matching and nonmatching comparisons, with the intent of calculating the degree of overlap between them. Ideally, the matching and nonmatching distributions would have no overlap and be wholly distinct from each other, with the matching scores at the high end of the range and nonmatching scores concentrated near 0. The more the matching and nonmatching distributions overlap, the more false matches may be expected, since matches and nonmatches become harder to distinguish. The extent of overlap can be summarized by estimating p, the probability that the similarity score (maximum cross-correlation) of a randomly chosen element of the nonmatching distribution is larger than a randomly selected member from the matching distribution. In the ideal separation described above, p would be 0; in a completely overlapping distribution, p would be 0.5.

FIGURE 8-2  Empirical distribution of matching and nonmatching pairwise comparisons (percentage of comparisons plotted against CCF score, with separate curves for matches and nonmatches).
NOTE: Data used are the scores from comparisons of the three-dimensional topographic measurements of firing pin impressions using the NBIDE exhibit set.
SOURCE: Vorburger et al. (2007:Fig. 9-16).

This same logic can be applied to an exhibit set as a whole (to provide a single estimate of p) or on subsets: specific values of p could be derived for particular firearms or particular casings. For instance, for the 108-element NBIDE dataset (excluding the Speer firings), a casing-specific p can be derived from the 107 pairwise comparisons using that casing as the reference, 8 of which are same-gun matches and 99 of which are nonmatches.

Table 8-6 summarizes estimates of casing-specific values of p from the DKT and NBIDE exhibit sets. The table confirms the three-dimensional system’s strong performance on breech face measurements for the NBIDE exhibits, with 90 percent of the casing-specific estimates of p being less than 0.001. For the NBIDE firing pin measurements, separation was less clear; only 18 percent of the p estimates were less than 0.001. The degree of overlap is more pronounced for the DKT data; the maximum estimated p for the DKT firing pin comparisons was 0.415, quite close to complete overlap.
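As a rough illustration of the overlap metric, p can be approximated directly from the empirical scores by comparing every nonmatching score with every matching score. This is a simplification we adopt for the example: the NIST study estimated continuous distributions first, whereas the sketch below simply counts pairs. The function name and the toy scores are hypothetical.

```python
import numpy as np


def overlap_p(match_scores, nonmatch_scores):
    """Fraction of (match, nonmatch) score pairs in which the nonmatching
    score is strictly larger than the matching score."""
    m = np.asarray(match_scores, dtype=float)[:, None]
    n = np.asarray(nonmatch_scores, dtype=float)[None, :]
    return float(np.mean(n > m))


# Well-separated toy scores: every match outscores every nonmatch.
p_separated = overlap_p([0.90, 0.80, 0.85], [0.10, 0.20, 0.30])

# Interleaved toy scores: half of the pairs favor the nonmatch.
p_overlapping = overlap_p([0.2, 0.6], [0.1, 0.7])

print(p_separated, p_overlapping)  # -> 0.0 0.5
```

For a casing-specific estimate on the 108-element NBIDE set, `match_scores` would hold the 8 same-gun scores and `nonmatch_scores` the 99 others, giving 792 pairs per casing.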
TABLE 8-6  Summary of Overlap Metrics for Three-Dimensional Images

                       Proportion of p ≤
Image Type             0.001    0.01     0.1

DKT Data
  Firing Pin            0.09    0.21    0.45
  Breech Face           0.01    0.03    0.21

NBIDE Data
  Firing Pin            0.18    0.25    0.56
  Breech Face           0.90    0.95    1.00

NOTE: The table summarizes the proportion of casing-specific estimates of p falling below a particular value, where p = 0 indicates perfect separation of matching and nonmatching distributions; see text for further derivation.

8–B.4  General Assessment

Although we defer our main conclusions from this experimental work to our discussion of a national reference ballistic image database in Chapter 9, some comment is in order here on the trade-off between two-dimensional photography and three-dimensional surface measurement as an imaging standard, particularly as a possible technical enhancement to the current NIBIN system.

We conclude that NIST’s work on the committee’s behalf on a prototype three-dimensional ballistics evidence comparison system suggests that such a system has strong merit. Although in early development, NIST’s version of a three-dimensional analysis system produced results on par with the current two-dimensional system in detecting same-gun sisters and, for some markings, outperformed it. That it did not consistently produce same-gun match rates that exceed the two-dimensional system (for instance, that it did not always do best at handling breech face markings) suggests that there is room for improvement and refinement. Much work also remains to be done on streamlining the acquisition and data processing steps. As a first foray, one geared to ensuring proper calibration of equipment and to testing different algorithms and computer programs for generating comparison scores, the data acquisition process was time consuming and comparisons took many hours to run to completion. In both respects, the specific three-dimensional system developed by NIST is unsuitable for deployment and immediate use. We have no information on the performance of the new Forensic Technology WAI, Inc., three-dimensional-based IBIS for cartridge casings and hence cannot comment on it.
However, we are confident that three-dimensional surface measurement of ballistics evidence can be made tractable; though not ready for immediate implementation, three-dimensional measurement and analysis of bullet and cartridge evidence should be a high priority for continued research and development.

8–C Basic Experimentation With New York CoBIS Database

A subgroup of committee members and staff made two visits to the New York State Police (NYSP) Forensic Investigation Center in Albany in March and July 2005 to see a state RBID in operation and to perform some small-scale tests of system performance. Our analysis was deliberately limited in nature, so as to avoid unduly interfering with the center’s operations for part or all of a day. The exploration we pursued consisted in part of entering a subsample of exhibits from the DKT set, for which we also had (independent) IBIS analysis by ATF and three-dimensional measurement by NIST. We also drew a sample from past CoBIS caseload for reacquisition

and analysis, and observed the entry of new exhibits waiting in the queue.

In advance of the visits, committee staff asked NYSP for a basic breakdown of gun makes in the CoBIS dataset, to get a sense of high- and low-frequency cases. Casings from 9mm pistols make up the largest portion of the CoBIS data (about 38 percent of the total); of the 9mm pistols, Glock pistols are most highly represented (46.5 percent), followed by Smith & Wesson (18 percent). The second-most represented caliber group is .40 caliber firearms, followed by .22 caliber; Glock and Smith & Wesson are the largest entrants among .40 caliber arms, while Kimber and Ruger firearms are the largest constituent parts of the .22 caliber database.

We selected four manufacturer-and-caliber combinations, including both high-frequency cases and very-low-frequency cases. For the latter, we selected Kimber 9mm, a group for which only 23 exhibits, of the entire 29,355-element 9mm pool, were known to be in the database. The other combinations we sought were Glock 9mm (the single largest component of the database), Beretta .22, and Smith & Wesson .357.

For each of these groups, one exhibit was retrieved arbitrarily from storage (archive) and one from waiting caseload (new). The “new” exhibits were checked in and entered into CoBIS as usual, and the envelopes retained so that they could be reentered later. As the exhibits were being processed, we learned of Glock’s practice of including two casings in the sample envelope packaged with its new guns. In standard CoBIS practice, the IBIS technician briefly looks at both casings and chooses one as the “best” casing for entry. Only the casing deemed to be the best for entry was reacquired into the system, but we generated comparison scores using both casings as references to see if the second casing matched well with the best. The technician-determined best casing was labeled 1, and the second was labeled 2.
For the archive cases, exhibits were drawn from 2004 forward, since boxes containing those exhibits were accessible near the IBIS entry room. As described below, we had occasion to retrieve one specimen from CoBIS pre-2004 archive. One of the casings used as a “new” exhibit was extremely new; it was from a firing performed on-site on the morning of our visit, when a dealer brought in a gun for firing and cartridge collection prior to sale.

Our analysis set also had one other unplanned addition. The very first comparison score results to be returned on screen were for NYSP05, a “new” casing from a Smith & Wesson .357 model 640 revolver; that casing had a CoBIS ID with stub 05-05061 (05 indicating 2005 and the last five digits being a sequential ID number). It turned out that the highest-

Only the breech face and firing pin images were acquired; one of our chosen weapons, the Beretta .22, is a rimfire firearm, and so its single mark is acquired in the free-hand trace method usually used for ejector marks. This set of casings was supplemented by a small extract of eight casings from the DKT exhibit set. Two DKT pistols (numbers 535 and 68)

were chosen; both Remington casings plus the CCI and Speer casings were selected from gun 535, and both Remington casings plus the Wolf and Federal casings were selected from gun number 68. These were entered into the system as new evidence, labeled NAS01 through NAS08, respectively. The workload for entering the NAS exhibits and the “new” caseload exhibits was divided among four CoBIS operators; however, when exhibits were entered for querying the database, all entries were made by the same person. The exhibit set analyzed in Albany is summarized in Box 8-2.

8–C.1 Basic IBIS Results, NAS Exhibits

Table 8-7 reports the IBIS breech face and firing pin scores and ranks for the eight NAS exhibits, extracted from the DKT exhibit set. Practically, these comparison runs looked at the performance of the current IBIS in finding elements of an eight-exhibit set, nested within a database of effective sample size 15,082 of casing images from new firearms of the same caliber and basic demographic characteristics.

The first thing that is evident from Table 8-7 is that the system did effectively find matches between images of the absolute highest similarity: different image entries of the exact same casing, differing only (possibly) by the operator who acquired the images. These exact image repetitions are the top-ranked possible match on both the breech face and firing pin marks, and the raw scores dwarf the others.

The second finding shown in the table is that performance on these exhibits is far from the ideal, which is that each exhibit (when used as the reference) would find its three known “sister” casings from the same gun as very highly ranked possible matches, and that the four NAS-labeled exhibits known to be from a different gun would not be highly ranked.
ranked possible match by breech face (other than the known image in the system, already entered) bore the ID stub 01-05061, a casing from 2001 bearing the same ID number as the new 2005 case. The breech face score (60) was not exceptional relative to the rest of the distribution, and indeed, visual examination of the images suggested nothing close to a true match. But the happenstance of having a very similar revolver (same manufacturer, slightly different model) from 4 years prior show up at the top of the correlation heap raised some curiosity (was the system somehow sorting on ID?), so the 2001 exhibit was pulled from deep storage for direct examination.

For only one of the exhibits, NAS05, do all three of the casings known to be from the same gun even appear on the “full” IBIS comparison score report: the others are rejected in the coarse comparison and 20 percent threshold steps described in Chapter 4. Out of 24 possible same-gun matches (eight exhibits times three “sisters” from the same gun):

• Only 3 are found in the top 11 ranks by either breech face or firing pin (11 is used because the image from the exact same exhibit is always the #1 entry). These are NAS01 to NAS02, NAS02 to NAS01 (on firing pin only), and NAS06 to NAS05; all three involve casings using the same ammunition (Remington-Peters) as well as the same gun.
• Half (12) had a best ranking (between the breech face and firing pin) worse than 11th, none of them better than 27th and most of them worse than 100th, but still merited inclusion in the “full” correlation report.
• The balance, 9, failed the coarse comparison pass and 20 percent threshold.

BOX 8-2
Exhibit Set Tested in Work with CoBIS Database

DKT (De Kinder et al., 2004) Exhibit Set Extract: All cartridges are firings from SIG Sauer P226 pistols and represent a subset of the DKT data analyzed by NIST.
• NAS01: Pistol #535, Remington-Peters casing 1
• NAS02: Pistol #535, Remington-Peters casing 2
• NAS03: Pistol #535, CCI casing
• NAS04: Pistol #535, Speer casing
• NAS05: Pistol #68, Remington-Peters casing 1
• NAS06: Pistol #68, Remington-Peters casing 2
• NAS07: Pistol #68, Wolf casing
• NAS08: Pistol #68, Federal casing

CoBIS Extract: For each gun type, one “new” case was selected from casings awaiting entry in the database and one “archive” case drawn from the past 1–2 years of entered exhibits.
• NYSP01: Beretta .22, new
• NYSP02: Beretta .22, archive
• NYSP03-1: Glock 9mm, new, “best” of the two sample casings included in envelope by manufacturer
• NYSP03-2: Glock 9mm, new, second sample casing from manufacturer
• NYSP04-1: Glock 9mm, archive, “best” of the two sample casings included in envelope by manufacturer
• NYSP04-2: Glock 9mm, archive, second sample casing from manufacturer
• NYSP05: Smith & Wesson .357 640 revolver, new
• NYSP06: Smith & Wesson .357 640 revolver, archive
• NYSP07: Kimber 9mm, new
• NYSP08: Kimber 9mm, archive

TABLE 8-7  IBIS Comparison Results, DKT Exhibit Set Extract in CoBIS Database

Reference ID                  Breech Face       Firing Pin
(# of results)    Test ID     Score    Rank     Score    Rank
NAS01 (1,014)     NAS01         229       1       226       1
                  NAS02          29      11       101       2
                  NAS04          10     728        61      86
                  NAS05          12     641        55     199
                  NAS06           7     861        50     340
NAS02 (1,106)     NAS01          26     279       106       2
                  NAS02         249       1       366       1
                  NAS04          22     594        52     160
                  NAS05          25     342        49     225
                  NAS06          11     838        46     338
NAS03 (1,024)     NAS02          15     641        41     177
                  NAS03         280       1       253       1
                  NAS04          25     339        28     657
NAS04 (987)       NAS03          36      69        43     645
                  NAS04         273       1       299       1
                  NAS08          12     711        65     288
NAS05 (1,039)     NAS01          17     637        54     305
                  NAS04          25     143        58     175
                  NAS05         275       1       226       1
                  NAS06          15     676        63      69
                  NAS07           4   1,013        55     251
                  NAS08          14     706        65      48
NAS06 (1,048)     NAS01          15     655        70      37
                  NAS02          11     763        60     140
                  NAS04          22     167        60     139
                  NAS05          33       3        99       2
                  NAS06         190       1       203       1
                  NAS08          18     484        60     138
NAS07 (1,018)     NAS05           4     899        63      27
                  NAS07         275       1       253       1
                  NAS08           1   1,001        51     724
NAS08 (1,049)     NAS02          20     486        36     761
                  NAS04          21     409        60     337
                  NAS05          23     281        63     215
                  NAS08         169       1       276       1

NOTES: All exhibits are of the same 9mm caliber, and so all had the same effective sample size of 15,082 exhibits. The (# of results) entries represent the number of entries included in the “full” IBIS comparison report and are the number of exhibits that survive the coarse comparison and 20 percent threshold steps (see Chapter 4). NAS01–04 are from the same SIG Sauer P226, De Kinder et al. (2004) pistol 535; NAS01 and 02 use the same Remington ammunition. NAS05–08 are from the same SIG Sauer P226 pistol, De Kinder et al. (2004) pistol 68; NAS05 and 06 use Remington ammunition.
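The three-way tally of the 24 possible same-gun matches can be checked directly against the ranks in Table 8-7. The sketch below is only an illustration of that bookkeeping: the ranks are transcribed from the table, while the variable and function names are our own and are not part of IBIS or CoBIS. A same-gun pair counts as "top 11" if its better rank (breech face or firing pin) is 11 or lower, as "listed lower" if it appears in the full report with a worse rank, and as "missing" if it does not appear at all.

```python
# (breech face rank, firing pin rank) from Table 8-7: reference -> test exhibit.
TABLE_8_7_RANKS = {
    "NAS01": {"NAS01": (1, 1), "NAS02": (11, 2), "NAS04": (728, 86),
              "NAS05": (641, 199), "NAS06": (861, 340)},
    "NAS02": {"NAS01": (279, 2), "NAS02": (1, 1), "NAS04": (594, 160),
              "NAS05": (342, 225), "NAS06": (838, 338)},
    "NAS03": {"NAS02": (641, 177), "NAS03": (1, 1), "NAS04": (339, 657)},
    "NAS04": {"NAS03": (69, 645), "NAS04": (1, 1), "NAS08": (711, 288)},
    "NAS05": {"NAS01": (637, 305), "NAS04": (143, 175), "NAS05": (1, 1),
              "NAS06": (676, 69), "NAS07": (1013, 251), "NAS08": (706, 48)},
    "NAS06": {"NAS01": (655, 37), "NAS02": (763, 140), "NAS04": (167, 139),
              "NAS05": (3, 2), "NAS06": (1, 1), "NAS08": (484, 138)},
    "NAS07": {"NAS05": (899, 27), "NAS07": (1, 1), "NAS08": (1001, 724)},
    "NAS08": {"NAS02": (486, 761), "NAS04": (409, 337), "NAS05": (281, 215),
              "NAS08": (1, 1)},
}

# NAS01-04 come from DKT pistol 535, NAS05-08 from pistol 68 (Box 8-2).
GUN_OF = {f"NAS0{i}": (535 if i <= 4 else 68) for i in range(1, 9)}


def tally_same_gun_pairs(ranks, gun_of, top=11):
    """Count same-gun (reference, sister) pairs by how well they ranked."""
    in_top = listed_lower = missing = 0
    for ref in gun_of:
        for test in gun_of:
            if test == ref or gun_of[test] != gun_of[ref]:
                continue  # skip the self-match and casings from the other gun
            pair = ranks.get(ref, {}).get(test)
            if pair is None:
                missing += 1        # rejected before the full report
            elif min(pair) <= top:
                in_top += 1         # best of the two marks is within the top ranks
            else:
                listed_lower += 1   # in the full report, but ranked lower
    return in_top, listed_lower, missing


counts = tally_same_gun_pairs(TABLE_8_7_RANKS, GUN_OF)
print(counts)  # (3, 12, 9)
```

The three counts sum to 24, that is, eight reference exhibits times three sisters each.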

NAS-labeled exhibits from a different gun than the reference exhibit did appear in the full comparison reports. In fact, when NAS06 was used as the reference, three NAS exhibits from the other SIG Sauer pistol could be found in the comparison report compared to two sister entries from the same pistol. However, none of these comparisons yielded scores that cracked the top 11 rankings by either mark.

With CoBIS staff, we examined the firearms makes and models for the 16 highest-ranked possible matches, on the breech face and firing pin lists, for each of the NAS-labeled exhibits. A wide variety of 9mm pistols appear throughout the listings, including Smith & Wesson, Beretta, and Taurus arms, with smaller numbers of Kahr, Springfield, and Keltec guns. Other SIG models are not uncommon in the listings, but the highest ranks are not dominated by them. It is of interest that some of the NAS casings do frequently match to casings from two runs of near-consecutive serial numbers and, consequently, near-consecutive CoBIS IDs. These likely correspond to large batch sales, such as police department orders. For instance, for the NAS02 casing, the top 16 ranks by firing pin include four entries from one of these runs, entered in 2003 (two other nonrelated SIG exhibits from 2001 are also highly ranked on firing pin); however, none of these pistols appears in the highest ranked possible matches by breech face.

Independently, a committee subgroup visited the New York City Police Department forensic laboratory and ran tests on NAS01–NAS04; the Albany test had the effect of seeing how these same-gun casings were handled in an RBID of images from new firearms, while the New York City test contrasted that with performance in a large database of crime scene evidence. The results were very consistent with those reported in Table 8-7, against a crime-evidence database of 12,427 exhibits after demographic filtering.
Another oddity that shows up in the top 16 rankings is that the NAS01 and NAS02 casings, in particular, find three test casings (apparently entered by FTI in setting up and maintaining CoBIS’ IBIS equipment) among the top ranks by breech face. The make and model of these test rounds (which stand out in the listings because they, like the NAS-labeled exhibits, do not use the typical CoBIS naming conventions) are unknown. It should be emphasized, though, that although they appear in the high ranks, the actual scores and visual match on the images are unremarkable.

Specifically, when NAS01 was used as the reference, NAS03 was again excluded by the coarse comparison pass; NAS02 was nearly top-ranked on firing pin (but low-ranked on breech face), and NAS04 ranked 90th on firing pin and 673rd on breech face.

8–C.2 Basic IBIS Results, NYSP Exhibits

Our work with the NYSP-labeled exhibits described in Box 8-2 was similar to that done for the NAS-labeled exhibits. “New” exhibits were entered into CoBIS, and then subsequently re-imaged for comparisons.

We were interested in determining whether the system reliably found this exact same image (differing only by acquisition at different times, possibly by different operators) in databases of different effective sample sizes (between 5,312 and 15,082) after the standard filtering. The Glock entries (NYSP03 and NYSP04) provided the opportunity to see whether casings in the same manufacturer-supplied envelope related strongly to each other. And, on a follow-up visit to the NYSP Forensic Investigation Center, the makes and models for the top 10 results by both breech face and firing pin were recorded to see whether like models dominated the top rankings.

As with the NAS-labeled exhibits, the current IBIS system had no problem detecting the “needle” (a new instance of the same exhibit image) in “haystacks” of varying sizes, with one prominent exception. That exception was NYSP01, a “new” Beretta .22; because it is a rimfire weapon, image entry is done by manually tracing the region of interest (see Section 4–B.2), and a single ejector mark/rimfire impression score is returned by the system. For this exhibit, the image on file in CoBIS, acquired that same morning, appeared as the 137th ranked possible match; its score was 328, compared to the top score (to another Beretta pistol, entered in 2004) of 571. The effective sample size was 8,106, so the link from NYSP01 to itself was not in great danger of being excluded by the coarse comparison and 20 percent threshold steps. Visual examination of the surface images suggests a curious ridge-like structure in the rimfire impression that apparently registers differently under slightly discrepant lighting and orientation. NYSP02, the “archive” Beretta casing, encountered no such difficulty; the original image from 2004 was found in the #1 position with score 2,631, with the score dropping to 444 for the #2 entry.
For each of the Glock exhibits, the second casing in the manufacturer-supplied envelope could be found as a match in the top 10 by one of the marks. When NYSP03-2 was used as the reference, NYSP03-1 was returned as the #4-ranked entry on breech face (raw score 61, relative to a maximum of 64) but was not within the top 10 on firing pin. When NYSP04-2 was the reference, NYSP04-1 was the top-ranked potential match by firing pin (score 168, with an unrelated Glock scoring 163 as the #2 possibility) but dropped out of the top 10 on breech face. In all these cases, the demographic filtering by Glock firing pin gave an effective sample size of 12,353 casings.

For the remaining NYSP-labeled exhibits, the same-gun image was very comfortably returned as the #1-ranked entry on both the breech face and firing pin scores, with wide separations between it and the remaining entries. The top 10 lists for each exhibit are mostly populated by guns from the same manufacturer and of the same model, except for the Kimber exhibits (NYSP07 and NYSP08), for which none of the fewer than 30 Kimbers in the CoBIS system were returned as top-10 candidates by either score.