PART III
Implications for a National Reference Ballistic Image Database



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.





8
Experimental Evidence on Sources of Variability and Imaging Standards

The performance studies of the Integrated Ballistics Identification System (IBIS) platform—the current standard for ballistic imaging—summarized in Section 4–D provide context for the committee's own analyses. The core of the experimental work performed by the committee was coordinated by the Office of Law Enforcement Standards of the National Institute of Standards and Technology (NIST), with whom the National Institute of Justice executed a separate contract to perform analyses at the committee's direction.

Given the committee's basic charge, an ideal test would involve the creation of a prototype national reference ballistic image database (RBID), exceeding the size of the De Kinder et al. (2004) exhibit set, and thus getting a direct impression of automated systems' ability to detect sameness amidst a vast array of exhibits with highly similar class characteristics. However, such a massive collection was clearly beyond the scope of our available resources. Working with NIST, and recognizing the work in previous studies, we judged it best to focus our analyses on narrower objectives. Our experimentation was aimed at generating an exhibit set that—although small in size—could facilitate studies of system performance when both firearm and ammunition type are varied. Significantly, a main objective of our work was to study the effectiveness of one possible enhancement to the current National Integrated Ballistic Information Network (NIBIN) or, alternately, a possible design choice for a new national RBID: a switch from two-dimensional photography to three-dimensional surface measurement as an imaging standard. When NIST and the committee began their work, three-dimensional profilometry analyses had been performed on bullets but had not yet been attempted on cartridge case markings; our experimental work

was intended to shed light on the tradeoff between two-dimensional and three-dimensional imaging in computer-assisted firearms identification.

An important set of caveats is in order at the outset regarding this work. Since NIST and the committee began their work, Forensic Technology WAI, Inc. (FTI) has refined its BulletTRAX-3D offering, and an FTI system for three-dimensional analysis of cartridge casings is in production; see Box 4-1 and Chapter 7. The three-dimensional analyses described in this chapter and expressed fully in NIST's report (Vorburger et al., 2007) do not make use of FTI's three-dimensional software and systems, although they do share common technology. Indeed, since our three-dimensional analyses are exclusively focused on cartridge case markings, using the FTI three-dimensional equipment was not possible since its cartridge case system was not in production during the committee's period of analysis. In this chapter, we do make reference to performance comparisons between NIST's three-dimensional system and "IBIS," referring to the standard IBIS two-dimensional-photography-based product, even though FTI has branded its three-dimensional systems with the IBIS name. As described in Box 4-1, this nomenclature and comparison is appropriate since it is the two-dimensional version of IBIS that is currently used by the NIBIN program, which is central to our charge.

Although construction of a full-fledged national RBID prototype was not within our committee's resources, we did wish to do the next best thing: namely, work with data from the existing state-level RBIDs. We accepted an invitation from the New York Combined Ballistic Identification System (CoBIS) RBID to perform limited data entry and experimentation at its Albany headquarters location.
The design of the exhibit sets studied in this chapter is described in Section 8–A, as well as the process used to acquire three-dimensional topographic images. Section 8–B summarizes the work done on the committee's behalf by NIST and concentrates on the comparison between two-dimensional and three-dimensional performance. The results of limited experimentation with the existing New York RBID are described in Section 8–C. Overall conclusions from this work are not presented in this chapter, but are rather deferred to Chapter 9's discussion of the advisability of a national RBID.

8–A DATA SOURCES, DESIGN, AND IMAGE ACQUISITION

The committee's experimental work relied principally on two sets of cartridge case exhibits: an extract of casings from the De Kinder et al. (2004) study of large image database performance and a new set of test firings commissioned by the committee and collected by NIST. We describe these sources below, along with the steps taken to acquire two-dimensional and three-dimensional images and measurements; for ease of reference, the basic design of the exhibit sets is summarized in Box 8-1.

BOX 8-1
Design of Test-Fire Cartridge Sets

DKT (De Kinder et al., 2004) Exhibit Set

• Firearms Used: 600 California Highway Patrol service pistols, all generally of the same SIG Sauer P226 make. Forty-six of the pistols were of the P225, P228, or P229 series, but these models were judged to be consistent in breech face and firing pin configurations with the P226.
• Ammunition Used: Six brands—CCI, Federal, Remington, Speer, Winchester, and Wolf.
• Firing Protocol: A set of seven cartridges—two using Remington ammunition, and one each from the other five ammunition brands—were loaded into a magazine and fired from each pistol. It is not known whether the same ammunition sequence was used in each of the firings.
• Analysis Set: NIST obtained access to the full set of 4,200 casings. At the committee's direction, 10 of the 554 guns known to be of the P226 make were selected at random and the 7 casings from those guns were extracted to form a 70-element analysis set.

NBIDE (Vorburger et al., 2007) Exhibit Set

• Firearms Used: 12, 4 from each of 3 brands, purchased as new by NIST from standard vendors. Makes were chosen to try to obtain a range of known quality and tooling, subject to constraints on NIST's ability to purchase from available dealers. Only 9mm caliber firearms were considered, for simplicity. Chosen makes were Ruger P95D, SIG Sauer P226, and Smith & Wesson 9VE. The SIG Sauer pistols bore consecutive serial numbers; the Ruger pistols bore closely proximate serial numbers; the Smith & Wesson pistols included 3 with close serial numbers.
• Ammunition Used: Four brands, all 115 grain and full metal jacketed. Chosen brands were PMC Eldorado, Remington, Speer, and Winchester. All but the Winchester have nickel-plated primers, while the Winchester is brass.
• Firing Protocol: The firearms were inspected and cleaned prior to test firings. One set of repetitions was performed on each of three days. The 12 pistols were fired in randomly chosen order using one ammunition type before going on to the next ammunition brand; the order in which the ammunition brands were handled was varied across the 3 days. After the full set of cartridge casings was collected and labeled, a new randomization was performed and new labels assigned before the casings were analyzed.
• Analysis Set: The full exhibit set has 144 casings (4 guns × 3 gun brands × 4 ammunition brands × 3 ammunition repetitions). Due to time constraints, NIST only processed 108 casings—excluding the Speer-brand firings—using three-dimensional surface measurements.
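The blinded randomization in the NBIDE firing protocol (gun order re-randomized within each ammunition brand, brand order varied by day, and a fresh relabeling before analysis) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not NIST's actual procedure; the gun labels, brand list, and seed are hypothetical.

```python
import random

def firing_schedule(guns, ammo_brands, days=3, seed=0):
    """Build a blinded firing schedule: each day, visit the ammunition brands
    in a day-specific random order; within each brand, fire the guns in a
    freshly randomized order (one casing per gun per brand per day)."""
    rng = random.Random(seed)
    schedule = []  # (day, ammo, gun) tuples in firing order
    for day in range(1, days + 1):
        for ammo in rng.sample(ammo_brands, len(ammo_brands)):
            for gun in rng.sample(guns, len(guns)):
                schedule.append((day, ammo, gun))
    # Re-randomize exhibit labels so analysts are blind to the gun of origin;
    # the returned mapping plays the role of the sealed key.
    ids = list(range(1, len(schedule) + 1))
    rng.shuffle(ids)
    return dict(zip(ids, schedule))

key = firing_schedule([f"gun{i}" for i in range(1, 13)],
                      ["PMC", "Remington", "Speer", "Winchester"])
print(len(key))  # 144 casings: 12 guns x 4 brands x 3 repetitions
```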

8–A.1 DKT: De Kinder et al. (2004) Exhibit Set

NIST staff obtained access to the 4,200-element exhibit set analyzed by De Kinder et al. (2004), representing firings of seven cartridges in each of 600 SIG Sauer pistols. From the pistols known to be of the SIG Sauer P226 model (some of the 600 pistols were very similar to the P226 but not that exact model), the committee randomly selected 10 pistols; all 7 casings for each of those guns were extracted from the exhibit set for further analysis. For convenience, we refer to this sample of 70 casings as the DKT exhibit set (using the initials for the first two authors).

8–A.2 NBIDE: NIST New Test-Fire Exhibit Set

The De Kinder et al. (2004) analysis made use of a natural opportunity for test firing many similar weapons—a large order of new firearms for a law enforcement agency. In addition to the advantage of database size, it is also strong due to its attention to varying one major factor in the quality of ballistic toolmarks as registered by photographic techniques, namely ammunition type. Strong though it is, it is also limited by its focus on only one firearm type or brand. It is also somewhat limited by its lack of repetitions within firearm and ammunition combinations; only two of the seven firings from each pistol repeated the same combination of major factors (i.e., the same firearm using two rounds of Remington-Peters ammunition); hence, it is limited in its ability to study shot-to-shot variability between firings. Working with NIST, the committee sought to develop a small exhibit set addressing both of these limitations that could then be subjected to both two-dimensional and three-dimensional analysis.
NIST used the terminology "NIST Ballistics Identification Designed Experiment" to describe its work, and so we use the label NBIDE to refer to the experiment and the new test-fire exhibit set produced for it.

For simplicity, we restricted attention to a single caliber of firearms—9mm. Moreover, absent the ability to obtain firearms with consecutively manufactured parts or to acquire guns direct off the production line—which was not within our resources—we elected to focus on firearms purchased as new from standard dealers. Within those constraints, the intent for the NBIDE exhibit set was to select several gun models representing a range of perceived quality and precision tooling. The Smith & Wesson 9VE and Ruger P95D were identified as choices, Smith & Wesson being a relatively finely tooled weapon and Ruger being a perceived mid-range choice. However, acquiring as-new firearms on the low end of that continuum—for instance, the relatively inexpensive Lorcin or Bryco firearms that still show up among the most traced guns even though the manufacturers are now out

of business—proved difficult for NIST to procure. We opted instead to add another relatively high-end firearm model: the same SIG Sauer P226 model used in the De Kinder et al. (2004) study. This serves to give us a point of comparison with that study, albeit adding more repetitions using the same ammunition. For each of these three brands, four new guns were purchased. All four Ruger P95D firearms and three of the four Smith & Wesson 9VE guns bore close serial numbers (within eight units of each other); three of the SIG Sauer P226 guns bore consecutive serial numbers.

Like the choice of gun make and model, the selection of ammunition masks or subsumes other individual factors affecting the marks left on fired rounds and the ability to detect them through imaging. These individual factors include variation in such areas as the plating of the primer, the presence of nonfiring manufacturing marks, or the presence and thickness of lacquer on the primer. For the NBIDE exhibit set, we elected to retain three of the ammunition brands used by De Kinder et al. (2004)—Remington, Winchester, and Speer—while adding another, PMC (Eldorado) brand ammunition. All the selected ammunition had the same bullet weight, 115 grain.

The full NBIDE exhibit set has 144 elements: three repetitions of each of four ammunition brands, fired through four guns from each of three makes. However, the NIST analysis (Vorburger et al., 2007) uses only a reduced 108-element subset of the exhibits, excluding the Speer brand ammunition firings from analysis. This reduction in size was done to reduce the analysis burden, when it was unclear how time consuming three-dimensional surface measurement would be. Although only the 108-element set was subjected to three-dimensional analysis, all 144 exhibits were later analyzed using the current IBIS system.
Prior to test firing, the firearms were inspected and cleaned: in particular, excess oil left inside the weapons at the factory was removed. The test firings were completed over the course of 3 days inside a range facility at NIST's Gaithersburg, Maryland, campus. Only the cartridge casings were retained during firing, caught in a windsock-type attachment after each shot was fired; bullets were fired into a destructive, scrap rubber-type trap. One set of repetitions was performed each day; the ordering of guns and ammunition was randomized across the 3 days.

After the test firing was complete, the 144 exhibits were re-randomized and labeled (though this was done so that the Speer rounds could be separated out for NIST's three-dimensional measurement purposes). The exact mapping of exhibits back to their parent gun was sealed and kept unknown to NIST's analysts, so that they were blind to the true results until imaging and processing were complete.

8–A.3 Image Acquisition

Three-dimensional surface measurements of the firing pin, breech face, and ejector mark impressions on the DKT and NBIDE (108) casings were gathered using NanoFocus µSurf microscopes. Measurements were made using a microscope at NIST's Gaithersburg campus and one at the Rockville, Maryland, facilities of Intelligent Automation, Inc., with whom NIST subcontracted on this work. Each of the microscopes was checked for calibration on a daily basis during the measurement acquisition process, making use of the "standard bullet" and prototypes of a "standard casing" under development in separate studies as NIST-designated reference measurement standards (see Vorburger et al., 2007:Section 4.2).

Subsequently, the DKT and NBIDE (144) casings were submitted to the Bureau of Alcohol, Tobacco, Firearms, and Explosives (ATF) National Laboratory at Ammendale, Maryland, for entry on an IBIS station. The two batches were processed separately; that is, an IBIS comparison request was generated for each of the 70 DKT casings, comparing each entry against the other 69 in the set. Similar work was done for each of the 144 NBIDE casings. By default, each of these comparison runs produced a hard-copy cover sheet returning the "top 10" comparison results, such as that illustrated in Figure 4-2. Through arrangement with FTI, NIST and the committee were able to obtain additional information from each of the two exhibit sets:

• For the DKT casings, the raw IBIS images—including the placement of region-of-interest delimiters on the standard center-light images as well as the side-light image—were extracted and provided in electronic form.
• For the NBIDE (144) casings, FTI performed a complete comparison that waived the 20 percent threshold and coarse comparison steps, generating the IBIS correlation scores for each casing against the remaining 143 elements, and provided those scores in electronic form.

Figure 8-1 contrasts the greyscale photographic images collected by the current IBIS platform with representations of three-dimensional surface measurement data, for both the breech face and firing pin markings of a particular cartridge casing. The raw data for the three-dimensional surface measurements are just that—numeric distance measurements over a fine array of spatial coordinates; for graphical purposes, these can be rendered in many ways, using colors to suggest "height" or "depth" or simulating lighting from any desired angle. The two three-dimensional plots in Figure 8-1 use different color and texture schemes to approximate the appearance of the surfaces.

FIGURE 8-1 IBIS two-dimensional images and rendered three-dimensional surfaces, breech face, and firing pin impressions from one casing.
NOTES: Breech face images in row 1; firing pin images in row 2. Images are from the DKT exhibit set, the Federal casing from pistol number 535. The region-of-interest delimiter circles are superimposed on the IBIS images.

8–A.4 Processing Steps and Similarity Measures for Three-Dimensional Measurement Data

Vorburger et al. (2007:Section 8) describe the data processing steps for NIST's topographic measurements in considerable detail. Here, we describe the basic steps:

• Data trimming and thinning: Alone, the sheer size and detail of the topographic measurement datasets make the cross-comparison of three-dimensional "images" a time- and computer-intensive activity. For breech face impressions in particular, where measurements may be picked up on the cartridge rim, some data are trimmed so as to include only the primer surface. NIST also worked with algorithms for thinning the data, reducing the lateral resolution of breech face images from on the order of a 2,600 × 2,600 grid of data points to 650 × 650. (Measurements for firing pin impressions were not thinned, however.)
• Removal of dropout and outlier points: Good though a three-dimensional sensor may be, it does ultimately provide only an estimate of height or depth; there are individual spatial points that the sensor may simply fail to acquire (dropouts) and others where the estimate is made with appreciable noise or inaccuracy (outliers). Code developed by NIST analyzed the three-dimensional measurement datasets for these problematic points and interpolated new values from nearest neighboring points.
• Filtering: As a rough means to try to emphasize individual characteristics rather than class characteristics in the three-dimensional images, NIST applied standard filters in noncontact optical profilometry, based on spatial wavelengths in the topographic image data. Spatial wavelength calculations are based on distance between consecutive peaks after subtracting out a mean surface depth; in this particular case, both very short and long wavelengths are subtracted out, removing effects that can be thought of as corresponding to system measurement noise and broad structural features (class characteristics), respectively. This filtering adjustment stops short of true feature extraction, in which algorithms would try to detect and highlight particular image features.
• Registration: Finally, the adjusted topographic image is processed by another program, intended to find the rotation and horizontal/vertical shift that gives best correspondence between images.

To compare images, NIST used areal cross-correlation functions, as are common in spatial statistics. Like the standard statistical correlation score, the cross-correlation scores are scaled; two topographic images that are exactly the same would yield an areal cross-correlation of 1.0. As the measures are computed, the functions used by NIST are very slightly asymmetric—that is, the cross-correlation of image A compared with image B can be slightly different from the results when image A is used as the reference and compared with image B. Noting that the standard IBIS two-dimensional comparison score is similarly asymmetric, NIST judged the discrepancies to be generally insignificant (Vorburger et al., 2007:126–127).
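The processing steps above can be sketched in miniature as follows. This is a simplified stand-in, not NIST's code: it assumes topographies stored as NumPy arrays with dropouts coded as NaN, and it shows only generic versions of dropout interpolation, thinning, and a normalized areal cross-correlation (the wavelength filtering and registration search are omitted).

```python
import numpy as np

def fill_dropouts(z):
    """Replace NaN dropout points with the mean of their valid 4-neighbors
    (a crude stand-in for nearest-neighbor interpolation)."""
    z = z.copy()
    bad = np.isnan(z)
    padded = np.pad(z, 1, constant_values=np.nan)
    neighbors = np.stack([padded[:-2, 1:-1], padded[2:, 1:-1],
                          padded[1:-1, :-2], padded[1:-1, 2:]])
    z[bad] = np.nanmean(neighbors, axis=0)[bad]
    return z

def thin(z, factor=4):
    """Reduce lateral resolution by decimation (e.g., 2,600^2 -> 650^2)."""
    return z[::factor, ::factor]

def areal_ccf(a, b):
    """Normalized areal cross-correlation of two equal-size topographies,
    scaled so that identical images score 1.0."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))
```

In a fuller pipeline, the score would be maximized over candidate rotations and horizontal/vertical shifts (the registration step), and a bandpass filter would first remove the shortest and longest spatial wavelengths.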

8–B ANALYSIS OF TWO- AND THREE-DIMENSIONAL IMAGE DATA

In this section, we describe the results of analyzing the DKT and NBIDE datasets using both two-dimensional image data (i.e., the current IBIS platform) and three-dimensional topographic data (using NIST's acquisition protocols and algorithms). With specific regard to the DKT data (and, perhaps, to the firings from SIG Sauer pistols in the NBIDE dataset), these analyses are partly meant to assess the consistency of our results with those by De Kinder et al. (2004), as described in Chapter 4. But—in addition to getting a sense of the capability of the current IBIS to detect known "sister" casings from the same gun—this work is also meant to shed some light on the tradeoff between two-dimensional and three-dimensional measurement, to see whether the latter offers clear-cut advantages over the former.

In what follows, we rely heavily on "top 10" analyses, looking at the 10 highest-ranked possible matches by different markings. This is somewhat unfortunate given our assessment in Section 4–F that there is no special magic in the top 10 as a cutoff (and, indeed, that the focus on the top 10 in current training and IBIS reports has the effect of overpromising the system). However, it is a practical limitation necessitated by a desire to stick to standard IBIS analysis and experience as much as possible: that is, we look principally at the top 10 because generation of the top 10 cover sheet scores is the system default, and we confine ourselves to the top 10 ranks using three-dimensional measurements for consistency.
A fuller analysis would have considered larger cuts at the rankings, such as the top 50—more than the strict limit of the top 10, but still within the number of results a human examiner might reasonably routinely scroll through onscreen to find potential matches. However, as we will discuss, we did obtain a full set of comparison scores for the full NBIDE dataset and discuss those results as well.

8–B.1 Two-Dimensional and Three-Dimensional Performance, DKT Data

As shown in Box 8-1, each exhibit in the DKT exhibit set has six possible same-gun matches. Table 8-1 summarizes the same-gun entries found in the top 10 rankings in a standard IBIS search against the 69 other DKT exhibits, and Table 8-2 provides the same results based on NIST's analysis of three-dimensional topographic data.

On firing pins, the two-dimensional and three-dimensional systems do comparably well. While the three-dimensional system does a much better job at finding the casings from pistols 375, 430, and 535, the two-

casings against the 107 other casings. Accordingly, each comparison was against a database containing 8 same-gun matches (three ammunition types × three repetitions – one) and 99 nonmatches (11 other firearms × three ammunition types × three repetitions). The cell counts in Tables 8-3 and 8-4 indicate the number of same-gun matches for each firearm and ammunition combination, averaged across the three repetitions.

For the pure SIG Sauer DKT exhibit set, both the two-dimensional and three-dimensional systems appeared to do a better job at finding same-gun matches using the firing pin mark than the breech face; the opposite appears to be true for the NBIDE dataset. The success of the three-dimensional system in finding the same-gun matches in the top 10 ranks on breech face is excellent; indeed, it is near perfect. For the two-dimensional systems, the success rates on breech face generally exceeded those for firing pin; the two-dimensional scores are not weak (averaging about six of eight same-gun matches detected for the Smith & Wesson pistols, and doing less well—about four of eight—on the SIG Sauers), but they do not approach the high success of the three-dimensional analysis.

On firing pins, the scores corresponding to the SIG Sauer firings are generally lower than those for the Ruger and Smith & Wesson guns. The second of the SIG Sauer pistols seems particularly difficult, yielding fewer than three out of eight same-gun matches on either the two-dimensional or three-dimensional systems. As suggested in the DKT analysis above, ammunition seems to have a strong effect, with the Remington ammunition producing consistently fewer matches than the PMC or Winchester rounds. Overall, the three-dimensional analysis appears to outperform the two-dimensional, particularly for the Ruger firings and the second Ruger pistol.
Some further insight into the two-dimensional/IBIS performance on this dataset can be had by considering a complete set of scores and rankings—waiving the coarse comparison and 20 percent threshold steps—that was prepared for the committee by FTI. Table 8-5 summarizes the distribution of the ranks of matching exhibits in these complete score lists; for this analysis, we include the Speer casings and use the complete 144-element NBIDE set. The table combines the 144 separate comparison reports, indicating the distribution of all 144 × 143 = 20,592 pairwise comparisons, of which 1,584 were between exhibits that were fired from the same gun. Out of 1,440 possible top-10-ranked positions by breech face, across the 144 different comparisons, about 57 percent were between reference and test exhibits from the same firearm; 33 percent were from different firearms but the same gun brand, while 10 percent were from exhibits from completely different gun brands. On the firing pin impression alone, the share of top-10 positions filled by same-firearm matches dips to 42 percent while the share from same-brand-but-different-firearm comparisons grows to 43 percent.
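A tabulation of this sort reduces to ranking a reference casing's complete score list and classifying the top-10 slots by their relationship to the reference. A minimal sketch, with an illustrative data layout (dictionaries with hypothetical gun and brand fields) and random tie-breaking standing in for the randomized-ID sort described in the table notes:

```python
import random

def top10_relationships(reference, score_list, seed=0):
    """score_list: list of (exhibit, score) pairs, where each exhibit is a
    dict with 'gun' and 'brand' keys. Ranks by descending score (ties broken
    randomly) and counts the relationship of each top-10 exhibit to the
    reference casing."""
    rng = random.Random(seed)
    ranked = sorted(score_list, key=lambda e: (-e[1], rng.random()))
    counts = {"same_gun": 0, "same_brand": 0, "different_brand": 0}
    for exhibit, _score in ranked[:10]:
        if exhibit["gun"] == reference["gun"]:
            counts["same_gun"] += 1
        elif exhibit["brand"] == reference["brand"]:
            counts["same_brand"] += 1
        else:
            counts["different_brand"] += 1
    return counts
```

Aggregating these counts over every reference casing in the set, and splitting further by same versus different ammunition, yields a cross-tabulation of the Table 8-5 form.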

TABLE 8-5 Summary of IBIS Comparisons for Full 144-Exhibit NBIDE Set

Ranks of Matching Exhibits in Complete Score List

Relationship of Reference        Breech Face Only           Firing Pin Only            Breech Face or Firing Pin
to Test Casing                 #1  #2–10 #11–29   #30+    #1  #2–10 #11–29   #30+    #1  #2–10 #11–29   #30+
Different Gun Brand
  Different Ammunition          0     71    602  9,695     6    140    640  9,582     6    210  1,182  8,970
  Same Ammunition               0     70    383  3,003     0     71    286  3,099     0    133    615  2,708
Same Gun Brand
  Different Specific Firearm
    Different Ammunition        1    181    960  2,746    13    347  1,064  2,464    14    502  1,613  1,759
    Same Ammunition            11    288    451    546     8    256    426    606    19    450    522    305
  Same Specific Firearm
    Different Ammunition       49    553    303    391    55    373    281    587    92    678    336    190
    Same Ammunition            83    133     37     35    62    109     39     78   115    118     30     25

NOTES: Tabulations are from score lists generated by Forensic Technology WAI, Inc., waiving the coarse comparison pass and 20 percent threshold. Each of the 144 casings was compared with the 143 other exhibits in the set, for a total of 20,592 comparisons. The cutoff at rank 29 corresponds to a strict 20 percent threshold on a sample size of 143; the actual effective sample size for any of these comparisons in a standard IBIS run would be somewhat larger because both breech face and firing pin images are considered, but we use the 29 cutoff for simplicity. For purposes of generating ranks, tie scores in the score lists are assigned final ranks by sorting by the NIST-assigned ID number for the test (in-database) exhibits. This is tantamount to a random assignment of ranks for tie scores because the ID numbers were randomly mixed prior to analysis.

8–B.3 Analysis of Matching and Nonmatching Distributions of Similarity Scores

Extending beyond analysis of ranks—as in Table 8-5 for the two-dimensional IBIS data—the NIST study (Vorburger et al., 2007:Section 9.5) derives an overlap metric to assess its cross-correlation similarity scores for three-dimensional topography data.

FIGURE 8-2 Empirical distribution of matching and nonmatching pairwise comparisons (percentage of comparisons versus CCF score, with separate curves for matches and nonmatches).
NOTE: Data used are the scores from comparisons of the three-dimensional topographic measurements of firing pin impressions using the NBIDE exhibit set.
SOURCE: Vorburger et al. (2007:Fig. 9-16).

The empirical distribution of the cross-correlation scores can be derived separately for the matching (same-firearm) and nonmatching pairwise comparisons in a dataset of topographic "images"; Figure 8-2 illustrates such a distribution for the scores generated from the firing pin scores using the NBIDE exhibit set. Continuous distributions can then be estimated for the matching and nonmatching comparisons, with the intent of calculating the degree of overlap between them. Ideally, the matching and nonmatching distributions would have no overlap and be wholly distinct from each other,

with the matching scores at the high end of the range and nonmatching scores concentrated near 0. The more the matching and nonmatching distributions overlap, the greater the degree of false matches that may be expected, since matches and nonmatches become harder to distinguish. The extent of overlap can be summarized by estimating p, the probability that the similarity score (maximum cross-correlation) of a randomly chosen element of the nonmatching distribution is larger than that of a randomly selected member of the matching distribution. In the ideal separation described above, p would be 0; for completely overlapping distributions, p would be 0.5.

This same logic can be applied to an exhibit set as a whole (to provide a single estimate of p) or on subsets: specific values of p could be derived for particular firearms or particular casings. For instance, for the 108-element NBIDE dataset (excluding the Speer firings), a casing-specific p can be derived from the 107 pairwise comparisons using that casing as the reference, 8 of which are same-gun matches and 99 of which are nonmatches.

Table 8-6 summarizes estimates of casing-specific values of p from the DKT and NBIDE exhibit sets. The table confirms the three-dimensional system's strong performance on breech face measurements for the NBIDE exhibits, with 90 percent of the casing-specific estimates of p being less than 0.001. For the NBIDE firing pin measurements, separation was less clear; only 18 percent of the p estimates were less than 0.001. The degree of overlap is more pronounced for the DKT data; the maximum estimated p for the DKT firing pin comparisons was 0.415, quite close to complete overlap.
TABLE 8-6 Summary of Overlap Metrics for Three-Dimensional Images

                    Proportion of p ≤
Image Type        0.001    0.01    0.1
DKT Data
  Firing Pin       0.09    0.21   0.45
  Breech Face      0.01    0.03   0.21
NBIDE Data
  Firing Pin       0.18    0.25   0.56
  Breech Face      0.90    0.95   1.00

NOTE: The table summarizes the proportion of casing-specific estimates of p falling below a particular value, where p = 0 indicates perfect separation of matching and nonmatching distributions; see text for further derivation.
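The overlap probability p has a direct empirical estimator: count, over all (match, nonmatch) score pairs, how often the nonmatching score exceeds the matching one. This is the complement of the area under the ROC curve (the Mann-Whitney statistic). NIST's estimates use fitted continuous distributions, so the pairwise-counting sketch below is an approximation of the same quantity, not NIST's exact method:

```python
def overlap_p(match_scores, nonmatch_scores):
    """Estimate p = P(nonmatch score > match score) by counting over all
    pairs, with ties counted as half. p = 0 means perfect separation of the
    two distributions; p = 0.5 means complete overlap."""
    wins = sum((n > m) + 0.5 * (n == m)
               for m in match_scores for n in nonmatch_scores)
    return wins / (len(match_scores) * len(nonmatch_scores))

# Perfectly separated score sets give p = 0
print(overlap_p([0.9, 0.8], [0.1, 0.2]))  # 0.0
```

A casing-specific estimate uses that casing's 8 matching and 99 nonmatching scores; pooling all casings gives the set-wide estimate.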

8–B.4 General Assessment

Although we defer our main conclusions from this experimental work to our discussion of a national reference ballistic image database in Chapter 9, some comment is in order here on the trade-off between two-dimensional photography and three-dimensional surface measurement as an imaging standard, particularly as a possible technical enhancement to the current NIBIN system.

We conclude that NIST's work on the committee's behalf on a prototype three-dimensional ballistics evidence comparison system suggests that such a system has strong merit. Although in early development, NIST's version of a three-dimensional analysis system produced results on par with—and, for some markings, outperformed—the current two-dimensional system in detecting same-gun sisters. That it did not consistently produce same-gun match rates that exceed the two-dimensional system—for instance, that it did not always do best at handling breech face markings—suggests that there is room for improvement and refinement. Much work also remains to be done on streamlining the acquisition and data processing steps. As a first foray—one geared to ensuring proper calibration of equipment and to testing different algorithms and computer programs for generating comparison scores—the data acquisition process was time consuming, and comparisons took many hours to run to completion. In both respects, the specific three-dimensional system developed by NIST is unsuitable for deployment and immediate use. We have no information on the performance of the new Forensic Technology WAI, Inc., three-dimensional-based IBIS for cartridge casings and hence cannot comment on it.
However, we are confident that three-dimensional surface measurement of ballistics evidence can be made tractable; though not ready for immediate implementation, three-dimensional measurement and analysis of bullet and cartridge evidence should be a high priority for continued research and development.

8–C BASIC EXPERIMENTATION WITH NEW YORK CoBIS DATABASE

A subgroup of committee members and staff made two visits to the New York State Police (NYSP) Forensic Investigation Center in Albany in March and July 2005 to see a state RBID in operation and to perform some small-scale tests of system performance. Our analysis was deliberately limited in nature, so as to avoid unduly interfering with the center's operations for part or all of a day. The exploration we pursued consisted in part of entering a subsample of exhibits from the DKT set, for which we also had (independent) IBIS analysis by ATF and three-dimensional measurement by NIST. We also drew a sample from past CoBIS caseload for reacquisition

and analysis, and observed the entry of new exhibits waiting in the queue for entry.

In advance of the visits, committee staff asked NYSP for a basic breakdown of gun makes in the CoBIS dataset, to get a sense of high- and low-frequency cases. Casings from 9mm pistols make up the largest portion of the CoBIS data (about 38 percent of the total); of the 9mm pistols, Glock pistols are most highly represented (46.5 percent), followed by Smith & Wesson (18 percent). The second-most represented caliber group is .40 caliber firearms, followed by .22 caliber; Glock and Smith & Wesson are the largest entrants among .40 caliber arms, while Kimber and Ruger firearms are the largest constituent parts of the .22 caliber database.

We selected four manufacturer-and-caliber combinations, including both high-frequency cases and very-low-frequency cases. For the latter, we selected Kimber 9mm, a group for which only 23 exhibits, of the entire 29,355-element 9mm pool, were known to be in the database. The other combinations we sought were Glock 9mm (the single largest component of the database), Beretta .22, and Smith & Wesson .357.

For each of these groups, one exhibit was retrieved arbitrarily from storage (archive) and one from waiting caseload (new).1 The "new" exhibits were checked in and entered into CoBIS as usual, and the envelopes retained so that they could be reentered later. As the exhibits were being processed, we learned of Glock's practice of including two casings in the sample envelope packaged with its new guns. In standard CoBIS practice, the IBIS technician briefly looks at both casings and chooses one as the "best" casing for entry.
Only the casing deemed to be the best for entry was reacquired into the system, but we generated comparison scores using both casings as references to see if the second casing matched well with the best. The technician-determined best casing was labeled 1, and the second was labeled 2. Only the breech face and firing pin images were acquired; one of our chosen weapons, the Beretta .22, is a rimfire firearm, and so its single mark is acquired in the free-hand trace method usually used for ejector marks.

This set of casings was supplemented by a small extract of eight casings from the DKT exhibit set.2 Two DKT pistols (numbers 535 and 68) were chosen; both Remington casings plus the CCI and Speer casings were selected from gun 535, and both Remington casings plus the Wolf and Federal casings were selected from gun number 68. These were entered into the system as new evidence, labeled NAS01 through NAS08, respectively. The workload for entering the NAS exhibits and the "new" caseload exhibits was divided among four CoBIS operators; however, when exhibits were entered for querying the database, all entries were made by the same person. The exhibit set analyzed in Albany is summarized in Box 8-2.

1 For the archive cases, exhibits were drawn from 2004 forward, since boxes containing those exhibits were accessible near the IBIS entry room. As described below, we had occasion to retrieve one specimen from the CoBIS pre-2004 archive. One of the casings used as a "new" was extremely new; it was from a firing performed on-site on the morning of our visit, when a dealer brought in a gun for firing and cartridge collection prior to sale.

2 Our analysis set also had one other unplanned addition. The very first comparison score results to be returned on screen were for NYSP05, a "new" casing from a Smith & Wesson .357 model 640 revolver; that casing had a CoBIS ID with stub 05-05061 (05 indicating 2005 and the last five digits being a sequential ID number). It turned out that the highest-ranked possible match by breech face—other than the known image in the system, already entered—bore the ID stub 01-05061, a casing from 2001 bearing the same ID number as the new 2005 case. The breech face score (60) was not exceptional relative to the rest of the distribution, and indeed, visual examination of the images suggested nothing close to a true match. But the happenstance of having a very similar revolver (same manufacturer, slightly different make) from 4 years prior show up at the top of the correlation heap raised some curiosity (was the system somehow sorting on ID?), so the 2001 exhibit was pulled from deep storage for direct examination.

8–C.1 Basic IBIS Results, NAS Exhibits

Table 8-7 reports the IBIS breech face and firing pin scores and ranks for the eight NAS exhibits, extracted from the DKT exhibit set. Practically, these comparison runs looked at the performance of the current IBIS in finding elements of an eight-exhibit set, nested within a database of effective sample size 15,082 of casing images from new firearms of the same caliber and basic demographic characteristics.

The first thing that is evident from Table 8-7 is that the system did effectively find matches between images of the absolute highest similarity: different image entries of the exact same casing, differing only (possibly) by the operator who acquired the images. These exact image repetitions are the top-ranked possible match on both the breech face and firing pin marks, and the raw scores dwarf the others.

The second finding shown in the table is that performance on these exhibits is far from the ideal, which is that each exhibit (when used in the reference) would find its three known "sister" casings from the same gun as very highly ranked possible matches, and that the four NAS-labeled exhibits known to be from a different gun would not be highly ranked. For only one of the exhibits—NAS05—do all three of the casings known to be from the same gun even appear on the "full" IBIS comparison score report; the others are rejected in the coarse comparison and 20 percent threshold steps described in Chapter 4. Out of 24 possible same-gun matches (eight exhibits times three "sisters" from the same gun):

•  Only 3 are found in the top 11 ranks by either breech face or firing pin (11 is used because of the presence of the image from the exact same exhibit, which is always the #1 entry). These are NAS01 to NAS02, NAS02 to NAS01 (on firing pin only), and NAS06 to NAS05—all three casings using the same ammunition (Remington-Peters) as well as the same gun.
•  Half (12) had a best ranking (between the breech face and firing pin) outside the top 11—none better than rank 27, and most worse than rank 100—but still merited inclusion in the "full" correlation report.
•  The balance, 9, failed the coarse comparison pass and 20 percent threshold.

BOX 8-2  Exhibit Set Tested in Work with CoBIS Database

DKT (De Kinder et al., 2004) Exhibit Set Extract: All cartridges are firings from SIG Sauer P226 pistols and represent a subset of the DKT data analyzed by NIST.
• NAS01: Pistol #535, Remington-Peters casing 1
• NAS02: Pistol #535, Remington-Peters casing 2
• NAS03: Pistol #535, CCI casing
• NAS04: Pistol #535, Speer casing
• NAS05: Pistol #68, Remington-Peters casing 1
• NAS06: Pistol #68, Remington-Peters casing 2
• NAS07: Pistol #68, Wolf casing
• NAS08: Pistol #68, Federal casing

CoBIS Extract: For each gun type, one "new" case was selected from casings awaiting entry in the database and one "archive" case drawn from the past 1–2 years of entered exhibits.
• NYSP01: Beretta .22, new
• NYSP02: Beretta .22, archive
• NYSP03-1: Glock 9mm, new, "best" of the two sample casings included in envelope by manufacturer
• NYSP03-2: Glock 9mm, new, second sample casing from manufacturer
• NYSP04-1: Glock 9mm, archive, "best" of the two sample casings included in envelope by manufacturer
• NYSP04-2: Glock 9mm, archive, second sample casing from manufacturer
• NYSP05: Smith & Wesson .357 640 revolver, new
• NYSP06: Smith & Wesson .357 640 revolver, archive
• NYSP07: Kimber 9mm, new
• NYSP08: Kimber 9mm, archive

TABLE 8-7  IBIS Comparison Results, DKT Exhibit Set Extract in CoBIS Database

                                Breech Face      Firing Pin
Reference ID (# of results)  Test ID   Score   Rank   Score   Rank
NAS01 (1,014)                NAS01       229      1     226      1
                             NAS02        29     11     101      2
                             NAS04        10    728      61     86
                             NAS05        12    641      55    199
                             NAS06         7    861      50    340
NAS02 (1,106)                NAS01        26    279     106      2
                             NAS02       249      1     366      1
                             NAS04        22    594      52    160
                             NAS05        25    342      49    225
                             NAS06        11    838      46    338
NAS03 (1,024)                NAS02        15    641      41    177
                             NAS03       280      1     253      1
                             NAS04        25    339      28    657
NAS04 (987)                  NAS03        36     69      43    645
                             NAS04       273      1     299      1
                             NAS08        12    711      65    288
NAS05 (1,039)                NAS01        17    637      54    305
                             NAS04        25    143      58    175
                             NAS05       275      1     226      1
                             NAS06        15    676      63     69
                             NAS07         4  1,013      55    251
                             NAS08        14    706      65     48
NAS06 (1,048)                NAS01        15    655      70     37
                             NAS02        11    763      60    140
                             NAS04        22    167      60    139
                             NAS05        33      3      99      2
                             NAS06       190      1     203      1
                             NAS08        18    484      60    138
NAS07 (1,018)                NAS05         4    899      63     27
                             NAS07       275      1     253      1
                             NAS08         1  1,001      51    724
NAS08 (1,049)                NAS02        20    486      36    761
                             NAS04        21    409      60    337
                             NAS05        23    281      63    215
                             NAS08       169      1     276      1

NOTES: All exhibits are of the same 9mm caliber, and so all had the same effective sample size of 15,082 exhibits. The (# of results) entries represent the number of entries included in the "full" IBIS comparison report and are the number of exhibits that survive the coarse comparison and 20 percent threshold steps (see Chapter 4). NAS01–04 are from the same SIG Sauer P226, De Kinder et al. (2004) pistol 535; NAS01 and 02 use the same Remington ammunition. NAS05–08 are from the same SIG Sauer P226 pistol, De Kinder et al. (2004) pistol 68; NAS05 and 06 use Remington ammunition.
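The three-way tally of same-gun comparisons (3 top-ranked, 12 merely in-report, 9 rejected) can be reproduced mechanically from Table 8-7. The sketch below transcribes only the same-gun (sister) rows of the table; a sister absent from a reference's report is counted as rejected by the coarse comparison and 20 percent threshold steps. The function name and data layout are our own, not part of IBIS:

```python
# Same-gun (sister) comparisons from Table 8-7: for each reference
# exhibit, the (breech-face rank, firing-pin rank) of every sister
# casing that survived into the "full" IBIS comparison report.
SISTER_RANKS = {
    "NAS01": {"NAS02": (11, 2), "NAS04": (728, 86)},
    "NAS02": {"NAS01": (279, 2), "NAS04": (594, 160)},
    "NAS03": {"NAS02": (641, 177), "NAS04": (339, 657)},
    "NAS04": {"NAS03": (69, 645)},
    "NAS05": {"NAS06": (676, 69), "NAS07": (1013, 251), "NAS08": (706, 48)},
    "NAS06": {"NAS05": (3, 2), "NAS08": (484, 138)},
    "NAS07": {"NAS05": (899, 27), "NAS08": (1001, 724)},
    "NAS08": {"NAS05": (281, 215)},
}
SISTERS_PER_REFERENCE = 3  # each pistol contributed four casings

def tally(sister_ranks, top_k=11):
    """Classify the 24 directed same-gun comparisons as top-ranked
    (best rank <= top_k on either mark), merely in-report, or
    rejected by the coarse comparison/20 percent threshold steps."""
    top = in_report = rejected = 0
    for found in sister_ranks.values():
        rejected += SISTERS_PER_REFERENCE - len(found)
        for bf_rank, fp_rank in found.values():
            if min(bf_rank, fp_rank) <= top_k:
                top += 1
            else:
                in_report += 1
    return top, in_report, rejected

print(tally(SISTER_RANKS))  # (3, 12, 9)
```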

NAS-labeled exhibits from a different gun than the reference exhibit did appear in the full comparison reports. In fact, when NAS06 was used in the reference, three NAS exhibits from the other SIG Sauer pistol could be found in the comparison report, compared to two sister entries from the same pistol. However, none of these comparisons yielded scores that cracked the top 11 rankings by either mark.

With CoBIS staff, we examined the firearms makes and models for the 16 highest-ranked possible matches, on the breech face and firing pin lists, for each of the NAS-labeled exhibits. A wide variety of 9mm pistols appear throughout the listings, including Smith & Wesson, Beretta, and Taurus arms, with smaller numbers of Kahr, Springfield, and Keltec guns. Other SIG models are not uncommon in the listings, but the highest ranks are not dominated by them. It is of interest that some of the NAS casings do frequently match to casings from two runs of near-consecutive serial numbers and, consequently, near-consecutive CoBIS IDs. These likely correspond to large batch sales, such as police department orders. For instance, for the NAS02 casing, the top 16 ranks by firing pin include four entries from one of these runs, entered in 2003 (two other nonrelated SIG exhibits from 2001 are also highly ranked on firing pin); however, none of these pistols appears in the highest ranked possible matches by breech face.3

Independently, a committee subgroup visited the New York City Police Department forensic laboratory and ran tests on NAS01–NAS04; the Albany test had the effect of seeing how these same-gun casings were handled in an RBID of images from new firearms, while the New York City test contrasted that with performance in a large database of crime scene evidence.
The results were very consistent with those reported in Table 8-7, against a crime-evidence database of 12,427 exhibits after demographic filtering.4

3 Another oddity that shows up in the top 16 rankings is that the NAS01 and NAS02 casings, in particular, find three test casings—apparently entered by FTI in setting up and maintaining CoBIS' IBIS equipment—among the top ranks by breech face. The make and model of these test rounds (which stand out in the listings because they, like the NAS-labeled exhibits, do not use the typical CoBIS naming conventions) are unknown. It should be emphasized, though, that although they appear in the high ranks, the actual scores and visual match on the images are unremarkable.

4 Specifically, when NAS01 was used as the reference, NAS03 was again excluded by the coarse comparison pass; NAS02 was nearly top-ranked on firing pin (but low-ranked on breech face), and NAS04 ranked 90th on firing pin and 673rd on breech face.

8–C.2 Basic IBIS Results, NYSP Exhibits

Our work with the NYSP-labeled exhibits described in Box 8-2 was similar to that done for the NAS-labeled exhibits. "New" exhibits were entered into CoBIS, and then subsequently re-imaged for comparisons.

We were interested in determining whether the system reliably found this exact same image—differing only by acquisition at different times, possibly by different operators—in databases of different effective sample sizes (between 5,312 and 15,082) after the standard filtering. The Glock entries (NYSP03 and NYSP04) provided the opportunity to see whether casings in the same manufacturer-supplied envelope related strongly to each other. And, on a follow-up visit to the NYSP Forensic Investigation Center, the makes and models for the top 10 results by both breech face and firing pin were recorded to see whether like models dominated the top rankings.

As with the NAS-labeled exhibits, the current IBIS system had no problem detecting the "needle"—a new instance of the same exhibit image—in "haystacks" of varying sizes, with one prominent exception. That exception was NYSP01, a "new" Beretta .22; because it is a rimfire weapon, image entry is done by manually tracing the region of interest (see Section 4–B.2), and a single ejector mark/rimfire impression score is returned by the system. For this exhibit, the image on file in CoBIS—acquired that same morning—appeared as the 137th ranked possible match; its score was 328, compared to the top score (to another Beretta pistol, entered in 2004) of 571. The effective sample size was 8,106, so the link from NYSP01 to itself was not in great danger of being excluded by the coarse comparison and 20 percent threshold steps. Visual examination of the surface images suggests a curious ridge-like structure in the rimfire impression that apparently registers differently under slightly discrepant lighting and orientation. NYSP02, the "archive" Beretta casing, encountered no such difficulty; the original image from 2004 was found in the #1 position with score 2,631, with the score dropping to 444 for the #2 entry.
For each of the Glock exhibits, the second casing in the manufacturer-supplied envelope could be found as a match in the top 10 by one of the marks. When NYSP03-2 was used as the reference, NYSP03-1 was returned as the #4-ranked entry on breech face (raw score 61, relative to a maximum of 64) but was not within the top 10 on firing pin. When NYSP04-2 was the reference, NYSP04-1 was the top-ranked potential match by firing pin (score 168, with an unrelated Glock scoring 163 as the #2 possibility) but dropped out of the top 10 on breech face. In all these cases, the demographic filtering by Glock firing pin gave an effective sample size of 12,353 casings.

For the remaining NYSP-labeled exhibits, the same-gun image was very comfortably returned as the #1-ranked entry on both the breech face and firing pin scores, with wide separations between it and the remaining entries. The top 10 lists for each exhibit are mostly populated by guns of the same manufacturer and model, except for the Kimber exhibits (NYSP07 and NYSP08), for which none of the fewer than 30 Kimbers in the CoBIS system were returned as top-10 candidates by either score.