11
Best Standards for Future Developments in Computer-Assisted Firearms Identification

The technology of the Integrated Ballistics Identification System (IBIS) provides a significant benefit in reducing the time it takes to identify a match and increasing the overall capacity of toolmark examiners to find matches and to link crimes committed with the same gun. The committee believes that, properly used, the current National Integrated Ballistic Information Network (NIBIN) can be a valuable investigative tool, providing important leads to law enforcement through searches of ballistics evidence images stored in a database. However, a mature scientific approach is required to improve the reliability of automated searches and, if possible, ultimately to reduce costs, particularly labor costs associated with acquisition and search. Neither the current system nor newer technologies under development have demonstrated the ability to operate with the precision, safety, and cost effectiveness needed for a national reference ballistic image database (RBID).

The current system has been designed to support the traditional task of having a firearms examiner confirm that a particular cartridge was fired from a particular gun or that two or more cartridges were fired from the same gun. Chapter 6 provides a number of recommendations regarding the kinds of operational and technical improvements that are needed to smooth the progress of this task using the current system. However, the enormous number of firearms crimes committed annually in the United States with their accompanying toll of serious injury and death would seem to call for a far more robust research enterprise in the area of firearms identification than exists in the nation today. This chapter discusses what the government can do to advance the science in acquisition technology, search, and pattern



recognition to improve the specific performance of technologies designed to assist in firearms identification and to address systematically the problems that prevent current technologies from being scaled up.

11–A VERIFICATION, SEARCH, AND THE CHALLENGE OF SCALE

Forensic analysis of firearms has traditionally been a process in which an expert examiner is charged with the task of matching spent cartridge cases or bullets with a particular firearm or linking evidence from different crimes to a particular weapon. This is fundamentally a process of verification, in which a hypothesis (that the same firearm was used in two firings) is accepted, rejected, or found to be inconclusive. This judgment is made on the basis of physical markings on the cartridge case or bullet, generally observed visually by the firearms examiner with the assistance of a microscope. An examiner must usually support the judgment of a match in court and thus seeks considerable evidence of a match in order to reach the conclusion of a definitive match.

In considering the development of ballistic image databases, it is critically important to distinguish this traditional process of verification, in which there are external reasons that lead investigators to ask whether two bullets or casings were fired by the same firearm, from the process of search in which a number of cases are compared with the goal of finding possible reasons to tie them together. In verification, one is validating or rejecting a specific hypothesis on the basis of additional data, in this case forensic evidence. In search, one is trying to come up with potential hypotheses by filtering through potentially large amounts of data. In general, search tasks are considerably more difficult than verification tasks. This same distinction arises in a number of areas other than ballistics, most notably biometrics.
For instance, it is a considerably easier task to determine whether two particular fingerprints match each other than it is to find potentially matching fingerprints from a large database.

A central distinction between verification and search is that in a verification task one can be quite conservative, not accepting a match unless there is overwhelming evidence. In law enforcement this is ensured by the courts and expert testimony. In fingerprint-based security systems, this is ensured by requiring a very high-quality match of an individual’s stored fingerprint to the one read by a scanner, even if that requires several attempts by a user to have the print correctly read. In contrast, for a search task, if a system is too conservative it does not generate any useful potential matches, or hypotheses, to consider. Yet if a search system is not conservative enough, it generates too many useless hypotheses or false leads. Neither of these approaches is very useful. Thus, for search-based tasks, such as a ballistic image database, it is very important that the system have both a low false alarm rate (reporting of incorrect matches) and a high true detection rate (reporting of correct matches).

Simultaneously achieving a low false alarm rate and high true detection rate is well known in the statistics and pattern recognition scientific literatures to be challenging, although for many tasks not insurmountable. A given false alarm rate and true detection rate may even produce acceptable performance for a particular database size but still not scale up effectively to larger databases. For instance, if a database grows by a factor of 100, for a given false alarm rate the number of incorrect matches reported will also be expected to grow by a factor of 100. This may simply be too many potential leads to follow up on. Thus, as a rule of thumb, the false alarm rate often must get better (lower) as the database size increases, while at the same time maintaining the true detection rate.

Early ballistic image “databases” consisted of photographs of bullets and shell casings hanging on the wall. These photographs were taken with a camera attached to a forensic microscope. For unsolved cases these photographs served as reminders in the event that an examiner encountered other evidence that could possibly be tied to these cases. Ballistic image database systems, such as NIBIN, can be viewed as a means of automating this manual process of hanging photos on the wall, enabling investigators to potentially tie cases together based on images of a larger number of bullets and shell casings than can be considered by manual inspection. These systems are now routinely used to handle much larger databases of ballistic images than one could hang on a wall, and in several law enforcement jurisdictions have been effective for finding “cold hits” or links between cases that were not otherwise known.
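
The scaling argument made earlier in this section (a fixed false alarm rate applied to a growing database) can be illustrated with a small arithmetic sketch. The function names and the numerical rate used here are hypothetical, chosen only for illustration; they are not measurements of NIBIN or of any other system, and the sketch assumes each database entry has the same independent chance of being falsely reported.

```python
def expected_false_leads(database_size: int, false_alarm_rate: float) -> float:
    """Expected number of incorrect matches reported for a single query,
    assuming every non-matching entry is falsely reported with the same
    probability (an illustrative simplification)."""
    return database_size * false_alarm_rate


def required_false_alarm_rate(database_size: int, max_false_leads: float) -> float:
    """False alarm rate needed so that the expected number of false leads
    per query stays at or below a fixed follow-up budget."""
    return max_false_leads / database_size


# Hypothetical rate, for illustration only.
rate = 0.001
small = expected_false_leads(10_000, rate)
large = expected_false_leads(1_000_000, rate)  # database 100 times larger
```

Under these assumptions the expected number of false leads grows in proportion to the database size, so holding the number of leads per query constant requires the false alarm rate to fall by the same factor by which the database grows.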
One can thus view NIBIN as an illustration of the potential that an automated image database search has to increase the capacity to tie cases together in comparison with the manual examination of images (or evidence itself). However, as detailed throughout this report, there is a finite limit on the extent to which such a database can be scaled up and still prove useful. This is both an empirical fact for the particular technologies used by the NIBIN system and a question of both theory and experimentation for other imaging technologies and other pattern recognition techniques. In this chapter we briefly review some of the relevant technologies and techniques and offer suggestions for improving the system.

11–B VISUAL PATTERN RECOGNITION

The goal of visual pattern recognition methods is to find possible matches between images. Pattern recognition methods can be used as part of either a verification or a search task. As discussed above, the former involves validating a particular hypothesis, in this case assessing a potential match between a particular pair of images, and search involves matching a query or probe image against a potentially large set of other images to find potential matches. Pattern matching techniques used for search are generally specifically designed in order to be able to efficiently consider a large number of images. Search techniques generally also provide a ranking of how well each potential image matches the query (such rankings for collections of text are now widely familiar in web search engines).

There are typically two parts to the pattern recognition process: how to compare a single pair of a probe and a target image and how to structure a search over a large set of targets. Clearly, the search element incorporates the comparison stage as part of its process. The comparison step further typically involves two key elements: what features, signature, or other representations of the image content are to be used in the actual comparison operation and what measure is used to compare these features. Associated with the measure will often be a set of allowable transformations: for example, objects may be allowed to translate, rotate, or scale without penalty, or they may be allowed to deform in certain other ways without penalty. These transformations are often not only geometric in nature, but also include transformations that might result from other sources of variation, such as changes in lighting.

Search-based pattern recognition methods involve a broad range of possible techniques. The most straightforward are sequential searches and rankings that in effect verify a match between the query image and each image in the dataset. However, this kind of approach only works for relatively small datasets.
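
A sequential search and ranking of the kind just described can be sketched as follows. This is a minimal illustration, not the method used by IBIS or any other system: it assumes each image has already been reduced to a fixed-length numeric feature vector and uses Euclidean distance as the comparison measure. The names `distance` and `sequential_search` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length
    (an assumed comparison measure; smaller means more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sequential_search(
    query: Sequence[float],
    targets: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Compare the query against every target in turn and return the
    top_k targets ranked from best (smallest distance) to worst."""
    scored = [(name, distance(query, feats)) for name, feats in targets.items()]
    scored.sort(key=lambda item: item[1])
    return scored[:top_k]


# Toy database of three hypothetical feature vectors.
database = {"case_a": [0.0, 0.0], "case_b": [1.0, 1.0], "case_c": [5.0, 5.0]}
ranking = sequential_search([0.9, 1.0], database)
```

Because every target is compared with the query, the cost of one search grows linearly with the size of the database, which is why this approach is practical only for relatively small collections.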
More sophisticated methods include hierarchical search methods, in which one first uses a coarse set of features to roughly rank the targets and then a refined comparison is performed only on the top few selections, or hashing function methods, in which a small set of features are used to index into a precomputed arrangement of the targets, focusing on a small set of most likely matches.

Fingerprints are a good example with which to illustrate these tradeoffs. There are many choices of possible features. One could use minutiae (i.e., sets of distinctive local points in the pattern of lines, based on sharp changes in curvature). One could use a broader distribution of the overall orientation of the lines in the pattern, or the density of lines in the pattern, such as histograms of orientation. One could use a learned representation of distinctive features (that is, a set of local features that have been learned as distinctive for this particular print by a series of trials against a large database). Or one could use model-driven features, in which an analysis of the process of generation of fingerprints or an analysis of a particular pattern is used to determine which specific features are distinctive (such as using a local feature focus method).

Thus, in matching fingerprints, images of two fingerprints are not compared directly, pixel for pixel: instead, each fingerprint image is preprocessed to extract certain features. These features are then compared. Human fingerprint experts use features such as minutiae. Pattern recognition systems use features or signatures that are derived mathematically or with machine learning techniques. For automated pattern recognition systems, such formally derived features generally work better than do features that are used by human experts. A recent study by the National Institute of Standards and Technology (NIST) on the accuracy of fingerprint recognition systems found that the best pattern recognition methods are able to achieve a 98.6 percent correct detection rate using a single finger and a 99.9 percent correct detection rate using four fingers, with a false alarm rate of 0.01 percent (Wilson et al., 2004). That is, such a system will correctly match two fingerprints from the same person much of the time while incorrectly saying there is a match (when there is none) only 1 in 10,000 times (for details, see, e.g., http://www.sciencedaily.com/releases/2004/07/040716080142.htm [February 2008]).

When considering possible comparison measures, there is again a broad range of options. Again we use fingerprints as an example. One approach is simply to measure the degree of overlap between two patterns: that is, search over all possible alignments of the features or signature (query for a particular target) and count something, such as the number of pixels in the query and target, to find pairs that have the same value or values within some tolerance.
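
The overlap measure described above can be sketched in a few lines. This is an illustrative simplification, not a description of any fielded fingerprint or ballistic system: it assumes that each pattern has been reduced to a set of integer feature locations, that the only allowable transformation is a small translation, and that two features agree only when they coincide exactly. The names `overlap_count` and `best_overlap` are hypothetical.

```python
from typing import Set, Tuple

Point = Tuple[int, int]


def overlap_count(query: Set[Point], target: Set[Point], shift: Point) -> int:
    """Number of query features that land on a target feature after the
    query is translated by shift = (row offset, column offset)."""
    dr, dc = shift
    return sum(1 for (r, c) in query if (r + dr, c + dc) in target)


def best_overlap(
    query: Set[Point], target: Set[Point], max_shift: int = 2
) -> Tuple[int, Point]:
    """Search every translation within +/- max_shift in each direction and
    return the highest overlap count together with the shift achieving it."""
    best_score, best_shift = -1, (0, 0)
    for dr in range(-max_shift, max_shift + 1):
        for dc in range(-max_shift, max_shift + 1):
            score = overlap_count(query, target, (dr, dc))
            if score > best_score:
                best_score, best_shift = score, (dr, dc)
    return best_score, best_shift


# Toy patterns: the target contains the query moved by (1, 2) plus one extra point.
query_pts = {(0, 0), (0, 1), (2, 3)}
target_pts = {(1, 2), (1, 3), (3, 5), (9, 9)}
score, shift = best_overlap(query_pts, target_pts)
```

Allowing rotation, scaling, or a distance tolerance would enlarge the set of alignments to be searched, which is one reason the choice of allowable transformations matters for both accuracy and search cost.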
Note that inherent in this definition is the notion of allowable transformations between the probe and the target, which may be abstracted out in the feature extraction process, part of the matching process, or a mixture of both.

11–C BEST PRACTICES FOR LESS MATURE TECHNOLOGIES

Current NIBIN technology has been developed using a single vendor approach. This kind of approach is common when the technological problem to be solved (in this case, automating the search function in firearms identification) seems to be straightforward and the market for the resulting product is limited. However, any vendor must necessarily choose a particular approach based on its best judgment as to what is most feasible and cost effective. The kinds and scope of empirical questions involved in advancing the technologies and improving performance and scalability are difficult for a single vendor to address. The challenge, then, is how to divide the task so that particular pieces of the application can be addressed through a competitive research and development process.

There are two recent examples of government mandated large-scale system developments based on (initially) nonmature technologies: fingerprint identification and facial recognition. Both systems required the creation of dedicated pattern recognition algorithms, similar to the requirements of the proposed RBID. Instead of relying on a single system produced by a single vendor, both systems were organized as competitions between vendors. In the following sections, we first describe the two competitions and then extract best practice suggestions from those experiences.

11–C.1 Fingerprint Identification

The statutory mandate of NIST under Section 403c of the USA PATRIOT Act requires that NIST examine and certify biometric technologies that may be used, among others, in the U.S. Visitor and Immigrant Status Indication Technology (VISIT), formerly known as the U.S. entry-exit system. The Fingerprint Vendor Technology Evaluation (FpVTE) 2003 was conducted on behalf of the Justice Management Division of the Department of Justice in the fall of 2003, to evaluate the accuracy of commercial fingerprint matching, identification, and verification systems (Wilson et al., 2004; see also http://fpvte.nist.gov [January 15, 2007]).

FpVTE 2003 was designed to assess the capability of fingerprint systems to meet requirements for both large-scale and small-scale real-world applications. FpVTE 2003 consists of multiple tests performed with combinations of fingers (e.g., single fingers, 2 index fingers, 4 to 10 fingers) and different types and qualities of operational fingerprints (e.g., flat livescan images from visa applicants, multifinger slap livescan images from present-day booking or background check systems, or rolled and flat inked fingerprints from legacy criminal databases).

FpVTE 2003 was among the most comprehensive evaluations of fingerprint matching systems ever executed, particularly in terms of the number and variety of systems and fingerprints: 18 companies participated, with 34 systems tested.
The test used 48,105 sets of flat slap or rolled fingerprint sets from 25,309 individuals, with a total of 393,370 distinct fingerprint images. The tests revealed that, when four fingerprints were used for matching, the most accurate fingerprint system tested always had a true accept rate that was higher than 99.9 percent with a false accept rate of 0.01 percent.

The evaluations were conducted to (1) measure the accuracy of fingerprint matching, identification, and verification systems using operational fingerprint data; (2) identify the most accurate fingerprint matching systems; (3) determine the effect of a wide variety of variables on matcher accuracy; and (4) develop well-vetted sets of operational data from a variety of sources for use in future research. As such, the fingerprint identification system is considered to be a system in continuous evolution. As better algorithms become available, the system can be updated to improve the identification success rate.

The use of a systematic competitive test between vendors ensures that the best possible algorithms are developed and used. In addition, the effects of various external factors on the accuracy of the identifications can be quantitatively addressed. For instance, it was shown unambiguously that the variables that had the largest effect on system accuracy were the number of fingers used and fingerprint quality. A national RBID would require a similar systematic study of the effect of external variables on the accuracy of matching for both cartridge cases and bullets.

11–C.2 Facial Recognition

The U.S. Department of Defense Counterdrug Technology Development Program Office began the Face Recognition Program (FERET) in 1993 and sponsored it through its completion in 1998. Total funding for the program was a little over $6.5 million. The goal of FERET was to develop automatic face recognition capabilities that could be employed to assist security, intelligence, and law enforcement personnel in the performance of their duties. FERET consisted of three major elements. First, the program sponsored research that advanced facial recognition from theory to working laboratory algorithms. Many of the algorithms that were developed in FERET form the foundation of today’s commercial systems. Second was the collection and distribution of the FERET database, which contains 14,126 facial images of 1,199 individuals. (The FERET database is currently maintained at NIST.) The development portion of the FERET database has been distributed to more than 100 groups outside the original program. The final, and most recognized, part of the FERET program involved the FERET evaluations that compared the abilities of various facial recognition algorithms using the FERET database.
A standard database of face imagery was essential to the success of FERET, both to supply standard imagery to the algorithm developers and to supply a sufficient number of images to allow testing of these algorithms. Before the start of FERET, there was no way to accurately evaluate or compare facial recognition algorithms (see http://www.frvt.org/FERET/default.htm [February 2008]). FERET set out to establish a large database of facial images that was gathered independently from the algorithm developers. The database made it possible for researchers to develop algorithms on a common database and to report results in the literature using this database.

The results reported in the standard literature did not provide a direct comparison among algorithms because each researcher reported results using different assumptions, scoring methods, and images. The independently administered FERET evaluations, using well-defined and published evaluation methodologies (Phillips et al., 2000), allowed for a direct quantitative assessment of the relative strengths and weaknesses of different approaches. One of the most important aspects of the use of this database was that the variability of the data could be controlled (e.g., images of a person taken on the same day under different lighting conditions, images taken on different days or a year apart, and so on). It is only after the intrinsic variability of the data is explicitly taken into account that a facial recognition system can function reliably. The FERET database has been used in two face-recognition vendor tests (FRVT), one in 2000 and one in 2002, and a face-recognition grand challenge in 2004–2006. The grand challenge was motivated by advances in computer vision techniques, computer design, and sensor design that held the promise of reducing the error rate of the present systems by an order of magnitude (see http://www.frvt.org/FRGC/ [February 2008]).

The use of a standardized evaluation method also allows for a comparison of different systems, in this case of the accuracy of the fingerprint systems and the facial recognition systems. It was concluded that leading contemporary fingerprint systems are substantially more accurate than the face-recognition systems tested in 2002 (Wilson et al., 2004).

11–D BEST PRACTICES

Both of the systems discussed in the preceding sections share several commonalities with the proposed national RBID:

• Fingerprint, facial recognition, and ballistic imaging all use images as input.
• There is considerable variability between the images in each of these areas.
• All three systems have potentially large databases that must be searched with high accuracy and within a reasonable search time.

One important distinction between the three systems is that fingerprint and facial recognition attempt to directly connect an image with a person but the proposed national RBID connects an image to a weapon, not to a person.
A second important distinction emanates from the stochastic nature of ballistics: noise and variation in fingerprints come from acquisition, whereas in ballistics there is an additional source of variation, because the process that generates the physical characteristics to be acquired changes with each firing.

Just as automated fingerprint and facial recognition systems were considered to be nonmature technologies in the 1990s, automated ballistic imaging can today be considered a nonmature technology. The use of large-scale evaluations of the fingerprint and face recognition technologies through controlled competitive vendor tests has advanced those technologies tremendously. The committee believes that it is likely that a similar competitive research program for ballistic imaging, involving university, federal and state agency, and industrial researchers, would lead to significant improvements in image matching algorithms. The research could be partitioned into separable components that have applicability across a wide range of research applications. For example, image acquisition could be investigated separately from search and pattern recognition. In addition, the competitive vendor tests approach could be used to test the safety, durability, and cost-effectiveness of engraving identifiers on firearms parts and/or bullets and cartridge cases, such as the microstamping approaches discussed in Chapter 10.

Given the cost of the current system, the need for improved performance of the system documented in this report and elsewhere, the costs to society of crimes committed with firearms, and the clear interest of state legislatures and Congress to make improvements in firearms identification, the committee believes that such an investment in research to support the development of technologies to assist in firearms identification is critically important.