4
Hard Problems and Promising Approaches

The National Geospatial-Intelligence Agency (NGA) has historically carried significant responsibilities in mapping, charting, geodesy, and imagery analysis to gather geospatial intelligence, such as the locations of obstacles, navigable areas, friends, foes, and noncombatants. However, the creation of geospatial intelligence not only requires optimal performance in these four areas but also demands an effective integration of the four functions that make up persistent TPED (i.e., tasking, processing, exploitation, and dissemination of data) over vast geographic areas and at the time intervals of interest. Moreover, as NGA transitions from a "data-driven" model to a "data- and process-driven" model in order to provide timely, accurate, and precise geospatial intelligence, the need to integrate other sources of intelligence with geospatial intelligence becomes even more critical.

This chapter lists a set of as-yet unsolved or "hard" problems faced by NGA in the post-9/11 world. They are organized into six classes that align with the NGA top 10 challenges: achieving persistent TPED; compressing the time line of geospatial intelligence generation; exploiting all forms of intelligence (which covers challenges 2-6); sharing geospatial information with coalition and allied forces; supporting homeland security; and promoting horizontal integration. Note that the third category has been broadened from "all forms of imagery" to "all forms of intelligence" to reflect the evolution of geospatial intelligence (GEOINT) beyond an imagery focus. Also identified are promising methods and tools for addressing the hard problems.



Virtually none of these tools are part of NGA's current systems architecture or set of operating procedures, and so they should be termed "disruptive." Disruptive methods necessitate retraining and redesign at the least. However, many of the tools are likely to be introduced incrementally, so the transformation itself may feel evolutionary to those involved.

Many of the problems involve extensions to spatial database management systems (S-DBMS), which have long been seen as distinct from the standard DBMS used in information technology and commerce. Such systems are essential for managing vast data holdings, yet only recently have they been adapted to geospatial data and the special needs of GEOINT.

Based on the committee's knowledge of the hard problems in geographic information science (GIScience), and on information from NGA (described in earlier chapters) about the current and future challenges in developing GEOINT, the committee selected the subset of hard geospatial research problems most relevant to NGA as the list of "hard problems" identified here. Aspects that can be addressed in the short term versus the long term are discussed after each hard problem. Then, based on knowledge of current research and literature, and after considerable debate and discussion, the committee selected the methods and techniques that seem most promising for addressing the hard problems. These are not ranked in any way but were seen by the committee as potential starting points for future research. As a final step, a prioritization of the hard problems is proposed in Chapter 6.

ACHIEVE PERSISTENT TPED

Hard Problems

In the post-9/11 world, persistent tasking, processing, exploitation, and dissemination of geospatial intelligence over geographic space and time is crucial. However, current sensor networks (i.e., remote sensing using satellites and aircraft) and database management systems are inadequate to achieve persistent TPED for several reasons. First, current sensor networks were designed for tracking fixed targets (e.g., buildings, military equipment). Second, they are sparse in space and time, and it takes a long time (e.g., hours) to move sensors to focus on the desired geographic area of interest for the relevant time interval. Lastly, even if an appropriate network were deployed, current databases do not scale up to the significantly higher data rates and volumes generated by deployed sensor arrays. Basic and applied research on next-generation sensors, sensor networks, and spatiotemporal databases is crucial to achieving persistent TPED.

Of particular importance has been the rapid development and deployment of unpiloted aircraft with multiple sensor systems that can remain aloft for long periods of time, such as Predator and Global Hawk. In the future, swarms of such aircraft linked to more static sensor webs will provide enormous amounts of space-time data. Ground sensors include cameras, microelectromechanical systems (motes), data retrieved via the Internet such as weather information, and other devices (Warneke and Pister, 2002). Such systems of linked sensors will create a sensor web or network with enhanced capabilities, just as connecting computers into networks has transformed computation. Yet sensor network theory is in its infancy, and even some of the first-generation technology lacks operational robustness. Much of the research and development to date has been on applications outside the geospatial context (Bulusu and Jha, 2005). Lastly, the ever-growing suite of positioning technologies continues to improve in accuracy and to overcome some of the initial problems of the global positioning system (GPS). Similarly, as location-based services move into broad consumer use, new products and services have become available for GEOINT.

Research is needed to improve the design and effectiveness of sensor networks. Many of the issues are highly spatial: for example, the optimization of sensor suites, quantities, and locations; the mix of streamed versus temporally sampled data; the mix of static versus mobile sensors; and the movement of sensors to adaptively sample activity. In addition, new sensor types can be used to supplement and build on existing systems for imaging, mapping, and data collection. For example, a software program monitoring Internet traffic is a sensor, as are civil systems such as air traffic control and camera-based traffic monitoring. Indeed, any mobile operative with a positioning device could be considered an input device to a sensor net. Furthermore, sensors can now be adaptable in timing, fault tolerance, and power consumption, in addition to geographic placement and movement.

Given the importance of nontraditional sensor networks, their linkage to geographic space, and the need to integrate the information they supply both within and across systems, the committee recommends the following.

RECOMMENDATION 1: Sensor network research should focus on the impact of sensor networks on the entire knowledge assimilation process (acquisition, identification, integration, analysis, dissemination, and preservation) in order to improve the use and effectiveness of new and nontraditional sensor networks. Particular emphasis should be placed on the relations between sensor networks and space, sensor networks and time, accuracy and uncertainty, and sensor networks and data integration.

From NGA's perspective, this is more important than pursuing new variants of existing sensors, since industry now seems capable of delivering innovations for a growing nonmilitary sensor market in the coming years.
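To make the data-integration burden concrete, the following is a minimal sketch (in Python, with hypothetical sensor names and wire formats) of the normalization step that Recommendation 1's emphasis on data integration implies: readings from heterogeneous sensors are reduced to a single timestamped, georeferenced record type and merged into one time-ordered stream.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
import heapq

@dataclass(frozen=True)
class Observation:
    """Common record: every reading reduces to (time, location, source,
    payload) so downstream TPED stages treat heterogeneous sensors alike."""
    t: datetime      # UTC timestamp
    lat: float       # WGS84 latitude
    lon: float       # WGS84 longitude
    source: str      # sensor identifier
    payload: dict    # sensor-specific content

def from_mote(raw: dict) -> Observation:
    # Assumed mote wire format: {"ts": epoch_seconds, "pos": [lat, lon], "temp": C}
    return Observation(datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
                       raw["pos"][0], raw["pos"][1], "mote",
                       {"temp_c": raw["temp"]})

def merge_streams(*streams):
    """Merge per-sensor streams (each already time-ordered) into one
    time-ordered stream, as a sensor-web fusion stage might."""
    yield from heapq.merge(*streams, key=lambda ob: ob.t)

motes = [from_mote({"ts": 0, "pos": [33.3, 44.4], "temp": 21.5}),
         from_mote({"ts": 60, "pos": [33.3, 44.4], "temp": 21.7})]
web = [Observation(datetime.fromtimestamp(30, tz=timezone.utc),
                   33.0, 44.0, "weather-feed", {"wind_kt": 12})]
for ob in merge_streams(motes, web):
    print(ob.t.isoformat(), ob.source)
```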

One of the shorter-term issues related to sensor networks is the scheduling of sensors. Traditionally, NGA has relied on space-based and airborne sensors. Even though the resolution of measurements is improving over time, space-based sensors tend to have coarse resolution and require time for repositioning; space-based sensor systems are also costly and must be designed and deployed years in advance of use. In the short term, NGA will deploy novel sensors to detect subsurface and hidden human activity and military equipment. Sensor networks will include ground-based fixed as well as mobile sensors to provide even finer resolution and better persistence. However, it is expensive to provide persistent coverage of large geographic areas over long periods of time. Thus, benefits can be gained in the short term by addressing the geospatial scheduling problem: minimizing the time to reach arbitrary locations while maximizing coverage. Scheduling will involve sequencing a suite of sensors, both ground and air, not simply dealing with the details of aircraft access and orbital position. Scheduling problems of this type, however, can become computationally complex and involve multiple, conflicting criteria. Consequently, research is needed on efficient multicriteria optimization methods that decision makers can use to configure sensor arrays.
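A minimal sketch of the kind of trade-off involved (all sensor names, cells, and weights are hypothetical): a greedy scheduler that scores candidate sensor assignments by a weighted sum of coverage gained and time to reach, which is one standard scalarization of a multicriteria objective. Real schedulers would explore the Pareto frontier rather than fix the weights in advance.

```python
# Hypothetical candidates: (sensor_id, cell_covered, hours_to_reach).
candidates = [
    ("uav-1", "A", 2.0), ("uav-1", "B", 5.0),
    ("ground-3", "B", 0.5), ("ground-3", "C", 1.5),
    ("sat-7", "C", 8.0), ("sat-7", "D", 8.0),
]

def schedule(candidates, w_cover=1.0, w_time=0.2, budget=3):
    """Greedy weighted-sum scalarization of a two-criterion problem:
    reward newly covered cells, penalize time to reach them."""
    covered, plan = set(), []
    pool = list(candidates)
    for _ in range(budget):
        def score(c):
            gain = 0 if c[1] in covered else 1
            return w_cover * gain - w_time * c[2]
        best = max(pool, key=score)
        if score(best) <= 0:      # no remaining assignment helps
            break
        plan.append(best)
        covered.add(best[1])
        pool.remove(best)
    return plan, covered

plan, covered = schedule(candidates)
print(plan, covered)   # ground-3 is tasked first: cheap coverage of cell B
```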

The new streams of multisensor data will strain existing database systems and require new approaches to dealing with vast quantities of time- and space-stamped information. There are challenges across the board for the development of spatiotemporal database management systems (ST-DBMS) and of analysis routines based on the time-space patterns they reveal. Research will have to build a theoretical understanding of the tracking and recognition of movement, both of objects and of more complex entities such as smoke, clouds, weather systems, and biothreats. While GIScience has begun work on methods for the analytical treatment of time-space trajectories or lifelines (e.g., Laub et al., 2005; Peuquet, 2002), and the importance of the concept is well represented in the University Consortium for Geographic Information Science (UCGIS) research agenda (McMaster and Usery, 2004), much work on data structures, analytical methods, and theory remains. Research to date has centered on transportation systems and human activity space, including visualization and description of process (McCray et al., 2005; McIntosh and Yuan, 2005; Miller, 2005a). Much is based on Torsten Hägerstrand's concept of a time line or prism (Kraak, 2003; Miller, 2005b). As yet, however, little research pertinent to GEOINT has been done. Consequently, the committee recommends the following.

RECOMMENDATION 2: Research should be encouraged on spatiotemporally enabled geospatial data mining and knowledge discovery, on the visualization of time-space, and on the detection and description of time-space patterns, such as trajectories, in order to provide the new data models, theory, and analytical methods required for persistent TPED. Specific problems are real-time inputs, sparse and incomplete data, uncertain conditions, content-based filtering, moving targets, multiple targets, and changing profiles in time and space.

In addition, there is a strong likelihood that future sensor networks will outstrip the capacity and capabilities of systems for data management, reduction, and visualization. Many statistical packages and GISs, for example, place a limit on the maximum number of features or records they can process (e.g., samples, nodes, polygons). While ArcSDE and Oracle 10g support raster databases, current S-DBMS in general only poorly support many subtypes of geospatial intelligence data models, including raster (e.g., imagery), indoor spaces, subsurface objects (e.g., caves, bunkers), visibility relationships, and direction predicates (e.g., left, north). Research is needed to develop next-generation S-DBMS if current commercial or research prototype S-DBMS fail to meet the performance needs of persistent TPED. The committee recommends that such research be conducted.

RECOMMENDATION 3: Research should be targeted at the ability of current database architectures and data models to scale up to meet the demands of agile geospatial sensor networks. The next generation of spatial database management systems must be able to flexibly and efficiently manage billions of georeferenced data records. These systems should support time-space tracking, automatically create and save metadata, and make active use of data on source accuracy and uncertainty.

Research on the problems of representing, storing, and managing unprecedented amounts of spatiotemporal data streaming from sensor networks is generally a longer-term issue. Specific challenges (Koubarakis et al., 2003) are semantic data models, query languages, query processing techniques, and indexing methods for representing spatiotemporal datasets from sensor networks.
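As a toy illustration of the indexing challenge (a minimal sketch, not a production index): bucketing georeferenced records into uniform space-time cells gives constant-time insertion and restricts range queries to a handful of cells, the same locality idea that quadtrees, R-trees, and their time-parameterized descendants implement far more adaptively.

```python
from collections import defaultdict

class SpaceTimeGrid:
    """Uniform (lat, lon, t) binning; a stand-in for real spatiotemporal
    indexes such as time-parameterized R-trees."""
    def __init__(self, cell_deg=0.1, cell_sec=3600):
        self.cell_deg, self.cell_sec = cell_deg, cell_sec
        self.cells = defaultdict(list)

    def _key(self, lat, lon, t):
        return (int(lat // self.cell_deg),
                int(lon // self.cell_deg),
                int(t // self.cell_sec))

    def insert(self, lat, lon, t, record):
        self.cells[self._key(lat, lon, t)].append((lat, lon, t, record))

    def query(self, lat0, lat1, lon0, lon1, t0, t1):
        """Visit only cells overlapping the query box, then filter exactly."""
        k0, k1 = self._key(lat0, lon0, t0), self._key(lat1, lon1, t1)
        for i in range(k0[0], k1[0] + 1):
            for j in range(k0[1], k1[1] + 1):
                for k in range(k0[2], k1[2] + 1):
                    for lat, lon, t, rec in self.cells.get((i, j, k), []):
                        if (lat0 <= lat <= lat1 and lon0 <= lon <= lon1
                                and t0 <= t <= t1):
                            yield rec
```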

In particular, research should explore high-performance computing techniques (e.g., data structures, algorithms, parallel computing, grids) to deal with the volume of data coming from sensor networks in order to achieve persistent TPED. Also, the steady migration of imagery from the multispectral to the hyperspectral and ultraspectral realms will demand new and more efficient algorithms and models to enhance imagery exploitation and feature extraction. Moreover, the increasing generation of three-dimensional datasets (including light detection and ranging [LIDAR]) from active and passive remote sensors will have to be used in ways quite different from the traditional data models developed for geospatial data in two-dimensional space. In addition to the three-dimensional potential of LIDAR, interferometric synthetic aperture radar (IFSAR) will generate substantial amounts of detailed surface data, including details of surface cover such as buildings, structures, and vegetation canopy. However, analysis, representation, and visualization of geospatial intelligence will have to be accomplished in both two-dimensional and three-dimensional environments, on both mobile and virtual clients, and in near real time or real time.

Also in the long term, research will have to be aimed at integrating vastly different data from traditional and nontraditional sensors in both time and space. Given the simultaneous sensing of events by sensors in the air and space, on the ground, and through non-NGA systems (e.g., census data, employment records, criminal justice system reports), the likelihood of false duplicates is probably greater than the likelihood of missing data. Future systems will have to resolve the ambiguity that results from multiple sensors sensing multiple moving targets, probably in real time. Similarly, each sensor will have its own relative and absolute accuracy and a level of statistical certainty associated with features and their locations and descriptions. It is essential that integration solutions devise means to store and use the known measures of these properties so that they can be applied to derivative products and decisions. For all of these reasons, sensor integration is considered a priority in the research agenda.

Other long-term issues include the development of techniques for combining data of different spatial and temporal resolution, with different levels of accuracy and uncertainty, including both conflation and generalization. Integration applies not only to data items but also to data catalogs. Since the types of features apparent in an image or dataset may vary with resolution, the development of thesauri that match feature semantics (including behavior) rather than feature types is a current research need. This could lead to the deployment of fully operational multiscale or scaleless databases. A third area of relevance is the fusion of two-dimensional and three-dimensional datasets, resolving uncertainties such as those caused by building shadows and varying information quality (Edwards and Jeansoulin, 2004).
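One concrete way to "store and use the known measures" of sensor accuracy, sketched under simple assumptions (independent sensors, Gaussian errors, hypothetical values): inverse-variance weighting fuses position estimates so that more accurate sensors dominate, and the fused uncertainty is carried forward to derivative products.

```python
# Each estimate: (value, variance). Hypothetical eastings (m) of one target
# from three sensors with stated 1-sigma accuracies of 5 m, 15 m, and 50 m.
estimates = [(1203.0, 5.0**2), (1190.0, 15.0**2), (1260.0, 50.0**2)]

def fuse(estimates):
    """Inverse-variance weighted mean for independent Gaussian errors.
    Returns the fused value and its (smaller) fused variance."""
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / total

value, var = fuse(estimates)
print(f"fused easting = {value:.1f} m, 1-sigma = {var ** 0.5:.1f} m")
# The 50 m sensor barely moves the answer; the fused 1-sigma (~4.7 m)
# is tighter than any single sensor's.
```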

Promising Methods and Techniques

An Agile Geospatial Sensor Network

The emerging research area of location-based services (LBS) is producing algorithms for determining the optimal positioning of mobile servers to minimize the time to reach an arbitrary geographic area of interest. Such algorithms may be used to evaluate the quality of current geospatial scheduling methods for mobile sensor platforms (Bespamyatnikh et al., 2000). If current scheduling methods are not optimal, LBS research can be used to reduce the time to reach unanticipated geographic areas specified by customers of NGA. In addition, newer sensor platforms, such as motes and remote-controlled mobile platforms deployable via air drop, promise to reduce the time needed to position sensors over geographic areas of interest (Warneke and Pister, 2002). Moreover, autonomous and distributed sensor networks capable of locally optimizing sensor tasking and collection, rather than relying on centralized accumulation and processing of geospatial-temporal data, will provide greater efficiency in information generation (Lesser et al., 2003).
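In its simplest form, the positioning problem just described is a minimax facility-location (1-center) problem. A brute-force sketch follows (hypothetical coordinates; planar distances for brevity, where a real system would use geodesic distance and travel time): choose the loiter point that minimizes the worst-case distance to any area of interest.

```python
import itertools, math

# Hypothetical areas of interest (x, y), in km on a local planar grid.
areas = [(10, 40), (35, 5), (60, 50), (20, 70)]

def worst_case(p, targets):
    return max(math.dist(p, t) for t in targets)

def best_loiter_point(targets, step=1.0):
    """Grid-search approximation of the 1-center: the point whose
    maximum distance to any target is smallest."""
    xs = [x for x, _ in targets]
    ys = [y for _, y in targets]
    candidates = itertools.product(
        [min(xs) + i * step for i in range(int((max(xs) - min(xs)) / step) + 1)],
        [min(ys) + j * step for j in range(int((max(ys) - min(ys)) / step) + 1)],
    )
    return min(candidates, key=lambda p: worst_case(p, targets))

p = best_loiter_point(areas)
print(p, worst_case(p, areas))   # loiter point and worst-case reach (km)
```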

Spatiotemporal Database Management and Analysis Systems

Many current GIS software vendors have moved their systems architecture to a georelational database model incorporating object-oriented properties. The resulting tools have led to experiments with data model application templates (e.g., for intelligence uses) that encourage reuse and interoperability, support more complex "model building" operations, and foster systems database design that is more conducive to use over the Internet in a variety of client-server architectures. Exploration of the consequences of this transition is not yet complete, especially for processing time-related transactions (Worboys and Duckham, 2005). New theory may be necessary. Research is needed on semantic data models, query languages, query processing techniques, and indexing methods for representing spatiotemporal datasets from heterogeneous sensor networks. Extensions beyond the Quad, R, and S trees will be necessary, and new search and query tools based on spatiotemporal zones and patterns will be required. Some convergence of time-space geography and time-space data management will also be necessary.

Extensions to analytical methods to incorporate temporal as well as spatial description and inference should be a priority. Multicriteria analyses and object tracking are in the early stages of development (Bennett et al., 2004). Multicriteria analyses become important, for example, when a decision maker must trade off risk exposure against time expedience. What tools can be used to make objective decisions and to explore their consequences? Analytical methods that incorporate input from multiple participants (for example, in the specification of parameters) offer promise, since complex problems require a range of expertise that is rarely held by a single individual (Armstrong, 1994). With further advances in change detection, the monitoring of individual time-space trajectories is becoming robust and is leading to some potential analytical approaches (Laub et al., 2005; McCray et al., 2005). Monitoring, minimizing, and communicating uncertainty in analytic outcomes is another area of high-priority investigation that is developing methods of value (Foody and Atkinson, 2003).

Technologies are beginning to be developed for resolving locations to geographic place names, advancing beyond current address matching and geocoding capabilities into telephone communications, news reports, IP addresses, and e-mail (Hill and Zheng, 1999). These place name, or toponymic, services need to offer multilinguistic transliteration and to handle temporal place name shifts. Cultural transliteration remains a difficult problem, since the names given to localities can vary among local communities according to cultural activity or context. Research is needed to crosswalk the toponymic view with map and image views.

Performance Benchmarking

Performance benchmarks (Transaction Processing Performance Council, 2005) are an objective way to evaluate the ability of alternative systems (e.g., sensor networks, S-DBMS) to support the goal of achieving persistent TPED. Consider a benchmark to evaluate an S-DBMS (Shekhar and Chawla, 2003) for managing the data rates and volumes from persistent TPED sensor networks. The benchmark may contain specific geospatial intelligence data streams and datasets, geospatial analysis tasks, performance measures, and target values for those measures. Spatial database vendors and research groups could be invited to evaluate commercial systems (e.g., object-relational database management systems that support OpenGIS spatial data types and operations) and research prototype spatial database management systems. If current S-DBMS meet the performance needs of NGA, adoption of current S-DBMS for various kinds of geospatial intelligence data may be appropriate; specific tasks could include the development of a semantic schema and object-relational table structures, plus conversion of existing geospatial datasets and applications to the new data representations. However, additional research is needed to develop next-generation S-DBMS if current commercial or research prototype S-DBMS fail to meet the performance needs of persistent TPED.
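A minimal sketch of such a benchmark harness (the toy store, workload, and target values are all hypothetical): run a fixed query workload against a candidate system, collect latencies, and compare a percentile against a declared target, in the spirit of TPC-style benchmarks.

```python
import random, statistics, time

class ListStore:
    """Toy system under test: linear scan over (lat, lon, t) records."""
    def __init__(self, records):
        self.records = records
    def query(self, lat0, lat1, lon0, lon1, t0, t1):
        return [r for r in self.records
                if lat0 <= r[0] <= lat1 and lon0 <= r[1] <= lon1
                and t0 <= r[2] <= t1]

def run_benchmark(store, workload, p95_target_ms=50.0):
    """Time each query and check a 95th-percentile latency target."""
    latencies = []
    for query in workload:
        t0 = time.perf_counter()
        store.query(*query)
        latencies.append((time.perf_counter() - t0) * 1000.0)
    p95 = statistics.quantiles(latencies, n=20)[18]   # 95th percentile
    return {"queries": len(latencies),
            "median_ms": round(statistics.median(latencies), 3),
            "p95_ms": round(p95, 3),
            "meets_target": p95 <= p95_target_ms}

records = [(random.uniform(0, 10), random.uniform(0, 10),
            random.uniform(0, 86_400)) for _ in range(10_000)]
workload = [(2, 3, 2, 3, 0, 7_200)] * 100
print(run_benchmark(ListStore(records), workload))
```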

In summary, the hard problems in achieving persistent TPED are the effective use of sensor networks, spatiotemporal data mining and knowledge discovery, and spatiotemporal database management systems. Promising solutions are suggested in the areas of developing agile sensor networks; spatiotemporal database management models and theory; detecting patterns within spatiotemporal data; and exploiting performance benchmarking.

COMPRESS THE TIME LINE OF GEOSPATIAL INTELLIGENCE GENERATION

Hard Problems

The timeliness of geospatial intelligence is becoming more crucial because of, among other things, the increasing number of mobile targets. Thus, the field of geospatial intelligence is making a transition from deliberate targeting to time-sensitive targeting. It is becoming increasingly important to move toward real-time data generation, processing, and dissemination to reduce latency in intelligence generation and delivery. However, the traditional geospatial intelligence generation process relies heavily on manual interpretation of data gathered from geospatial sensors and sources. This poses an immediate challenge given the increasing volume of data from geospatial sensors. Characterization and reengineering of the geospatial intelligence cycle are crucial to the goal of compressing the time line of geospatial intelligence generation.

Characterization of the processes of generating geospatial intelligence, including the necessity for and levels of human intervention in those processes, would be a good starting point for NGA. For illustration, consider the following process: Raw Data → Annotated Subset → Summary Patterns → Knowledge and Understanding. In other words, a large amount of raw sensor data is gathered continuously over geographic areas of interest. Human analysts review the stream of raw sensor data to identify and annotate interesting subsets. The collection of interesting data items is analyzed to produce summaries and to identify interesting, nontrivial, and useful patterns. Human analysts then correlate these patterns with other information to create knowledge and understanding of their meaning and underlying causes. Once the process of geospatial intelligence generation is characterized, the bottleneck steps can be identified and ways found to reduce the time to complete those steps, possibly via automation or via the provision of tools to speed up the manual steps. Creating a system that provides the most efficient flow for the knowledge required would be the target of this research.
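A minimal sketch of that characterization idea (stage names follow the illustration above; the stage bodies are hypothetical stubs): compose the cycle as instrumented stages so that the bottleneck step is identified by measurement rather than guesswork.

```python
import time

def annotate(raw):        # analyst-in-the-loop triage (stubbed)
    return [x for x in raw if x % 7 == 0]

def summarize(subset):    # pattern extraction (stubbed)
    return {"count": len(subset), "max": max(subset, default=None)}

def interpret(patterns):  # correlation with other sources (stubbed)
    return f"knowledge derived from {patterns['count']} annotated items"

def run_pipeline(raw, stages):
    """Run Raw Data -> Annotated Subset -> Summary Patterns -> Knowledge,
    timing each stage to expose the bottleneck."""
    timings, data = {}, raw
    for stage in stages:
        t0 = time.perf_counter()
        data = stage(data)
        timings[stage.__name__] = time.perf_counter() - t0
    bottleneck = max(timings, key=timings.get)
    return data, timings, bottleneck

knowledge, timings, bottleneck = run_pipeline(
    range(1_000_000), [annotate, summarize, interpret])
print(knowledge)
print("bottleneck stage:", bottleneck)   # automation effort goes here first
```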

An important area for research is determining the scope of what is achievable in time line reduction using automation versus human cognition (Egenhofer and Golledge, 1998; Nyerges et al., 1995). Are there limits to autonomy? At what parts of the knowledge cycle can complete automation produce the best results (e.g., image processing, georectification, and mosaicking versus source discovery, temporal conflation, feature detection, and extraction)? What roles do humans play best in the knowledge discovery loop? NGA can benefit directly from research that delineates which tasks can be semi- or fully autonomous and which will remain best served by trained interpreters. Automatic feature extraction algorithms continue to advance but remain somewhat unreliable. Although they can combine spectral and textural information and recognize primitive shapes and their combinations, the automated segmentation and labeling of entire images remains an elusive goal. This leads to the following.

RECOMMENDATION 4: Research should be directed toward determining which processes in the intelligence cycle (acquisition, identification, integration, analysis, dissemination, preservation) are most suitable for automated processing, which favor human cognition, and which need a combination of human-machine assistance in order to compress the GEOINT time line. This is equally important for current and future systems.

Benefits could be gained in the near future by characterizing the processes of generating geospatial intelligence, possibly by observing codified as well as tacit organizational information flows and by surveying operational analysts. This would include examining the information flow dependencies between tasks and categorizing them into necessary and accidental dependencies. Once the steps of common processes are characterized with their dependencies, NGA can gather data on the typical time needed to complete the overall cycle as well as its individual subprocesses. Other potential short-term areas of focus include studying ways to automate the bottleneck steps in the processes of geospatial intelligence generation and use, and identifying ways to eliminate unnecessary waiting and dependencies by exploring alternative routes to the same results. Ways to speed up the remaining manual tasks, those that cannot be automated because of the need for higher accuracy or for other reasons, could be studied in the longer term, becoming the target of research designed to yield information about human behavior and cognition and of human-computer interaction studies.

While this research evolves, however, it would be useful to explore tools that aid analysts in completing the remaining manual steps by understanding the cognitive processes that human analysts use. Such information would also be of value in training and evaluating analysts.

Effective interpretation, representation, and visualization of spatial information across all types of displays (virtual, web-based, mobile, three-dimensional, and two-dimensional) call for innovative research. Recent trends indicate a strong inclination toward digital maps that can be delivered quickly to a variety of end-user thick (desktop and larger) and thin (handheld and mobile) clients. These trends foreshadow a new paradigm of spatial data visualization. Simply moving away from static hard-copy maps to interactive digital media will not necessarily solve the issues of static information; the goal is "dynamic" maps rather than merely "digital" maps. Visualization of dynamic spatiotemporal information within a traditional cartographic framework will be an exciting area for future research, addressing new ways of depicting space-time changes in geospatial features or objects through animated symbology and cartographic design. Moreover, end users of geospatial intelligence are expected to be using a variety of thin or thick clients that in turn will have variable connectivity, dictating the amount of information that can be efficiently delivered and visualized. Thus, middleware that performs optimized filtering of geospatial intelligence for content delivery, based on the end user's connectivity and visualization environment, will be an important research area.

High-priority research includes the development of intelligent algorithms that become more proficient over time at browsing and sifting through image and data archives. Autonomous workflow management procedures would streamline retrieval of specific types of information by anticipating what an analyst might search for next, given what has just been requested. By learning from the results of previous analysis tasks, it would also be possible to suggest "best-practice" approaches to new tasks. Agent-based information retrieval can seek out additional sources that may be distributed in friendly or foreign archives. Accomplishing these tasks requires advances in intelligent image compression, multiple levels of security masking, and routines for efficient, on-the-fly semantic indexing. Each of these stages must be advanced in the context of significant computational resources that will be locally unavailable but accessible through a network. Middleware that supports distributed data sharing and computation is an important area of future work (e.g., Armstrong et al., 2005). Other common themes among these research challenges include the development of procedures for handling, storing, and disseminating intelligence that is context sensitive, and the development of self-describing resources that can be linked readily to other, possibly disparate forms of data with similar content and that include information on uncertainty and provenance. The equivalent of landscape intervisibility …

… grid toolkits such as Globus (Foster and Kesselman, 1997); agent-based approaches; the creation of new network philosophies for lightweight communications protocols; methods for dealing with trust and provenance; and methods for dealing with metadata and annotation. Most important are the knowledge technologies, which include knowledge capture tools, dynamic content linkage, annotation-based search, annotation reuse repositories, and natural language processing.

Toponymic Services

There is relatively little research on the relationship between toponymy and advanced search tools. Areas of promise include geographic information retrieval methods and geoparsing, a cross between natural language understanding and gazetteer lookup. Tools are needed to support the modeling of information about named geographic places and access to distributed, independent gazetteer resources. This work has involved semantic webs, resource description frameworks (RDF), ontologies, and markup languages such as the extensible markup language (XML) and the geography markup language (GML). Related to this is research into the type classification of named places (e.g., feature types, populated places) and the correspondence between different approaches to such classifications. A component of place name research deals with the history and etymology of place names and their cultural context. While the methods used in this area are simple, the research is nevertheless important and can be enhanced with advances in information technology.

Reuse and Preservation of Data

Promising methods for the reuse and preservation of data have emerged from research in geospatial databases and from the various Digital Libraries initiatives of the National Science Foundation and the National Aeronautics and Space Administration. There have been considerable advances in the creation of effective metadata standards and of tools for authoring metadata. Next-generation systems will leverage Federal Geographic Data Committee (FGDC)-style metadata to support advanced reuse. Many of the promising methods and techniques reflect those of semantic interoperability, large-scale database management, and toponymic services. Spatial OnLine Analytical Processing (SOLAP), a software technology that enables rapid information retrieval from multidimensional data and has been extended to geospatial data and GIS, has some potential to integrate metadata about data quality into the information processing flow (Devilliers et al., 2005).
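A minimal illustration of the gazetteer-lookup half of geoparsing described under Toponymic Services (the gazetteer entries and report text are hypothetical): candidate place names are extracted from text and matched against a gazetteer, and ambiguity (one name, many places) is surfaced rather than hidden.

```python
import re

# Toy gazetteer: name -> list of (lat, lon, feature_type). Real services
# would query distributed gazetteers with transliteration support.
GAZETTEER = {
    "Springfield": [(39.80, -89.64, "populated place"),
                    (42.10, -72.59, "populated place")],
    "Tigris": [(34.0, 44.0, "river")],
}

def geoparse(text):
    """Extract capitalized token runs as candidate toponyms, then look
    each up; returns (name, candidate locations) pairs."""
    candidates = re.findall(r"\b[A-Z][a-z]+(?:\s[A-Z][a-z]+)*\b", text)
    return [(name, GAZETTEER[name]) for name in candidates if name in GAZETTEER]

report = "Convoy sighted near Springfield moving toward the Tigris."
for name, places in geoparse(report):
    print(name, "->", places)   # 'Springfield' is ambiguous: two candidates
```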

Database and Sensor Technologies for Moving Objects

Sensor technologies for moving objects include technical measurement solutions for positioning, with GPS and similar technologies, in both indoor and constrained environments. They also include video-capture and machine vision methods, both mature research fields. Less mature is the database management research aimed at creating computer systems for managing and exploiting moving-object data. Recommendation 2 covers theory and visualization for moving objects. Güting (2005) has recently presented promising methods and techniques for moving objects databases, including extended query languages and data models (e.g., Transect-Structured Query Language), spatiobitemporal objects, event-based and transaction-processing approaches (Worboys and Duckham, 2005; Worboys and Hornsby, 2004), trajectory uncertainty analysis, spatiotemporal predicates, indexing methods (e.g., the time-parameterized R-tree, kinetic B-tree, and kinetic external range tree), and special cases such as the analysis of movement on networks.
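A minimal sketch of the moving-object representation these systems manage (hypothetical track; linear interpolation between fixes): a trajectory stored as timestamped fixes, with a spatiotemporal predicate asking whether the object was ever inside a region during an interval. An ST-DBMS would evaluate such predicates analytically and at scale; the sampled check below only illustrates the semantics.

```python
from bisect import bisect_left

# A trajectory: time-ordered (t_seconds, x_km, y_km) fixes for one object.
track = [(0, 0.0, 0.0), (600, 4.0, 1.0), (1200, 8.0, 5.0), (1800, 8.0, 9.0)]

def position_at(track, t):
    """Linearly interpolate position between recorded fixes."""
    times = [p[0] for p in track]
    i = bisect_left(times, t)
    if i == 0:
        return track[0][1:]
    if i == len(track):
        return track[-1][1:]
    (t0, x0, y0), (t1, x1, y1) = track[i - 1], track[i]
    f = (t - t0) / (t1 - t0)
    return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))

def ever_inside(track, box, t0, t1, step=30):
    """Spatiotemporal predicate: was the object inside box during [t0, t1]?"""
    x0, x1, y0, y1 = box
    return any(x0 <= x <= x1 and y0 <= y <= y1
               for t in range(t0, t1 + 1, step)
               for x, y in [position_at(track, t)])

print(ever_inside(track, (7.0, 9.0, 4.0, 6.0), 900, 1500))   # True
```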

In summary, hard problems related to the exploitation of all forms of intelligence include information fusion across diverse sources, the role of text and place name search in data integration, the preservation of data in forms that can be easily reused, and techniques for using multiple sources to detect moving objects. Promising methods include evidential reasoning over homogeneous spatiotemporal frameworks, geospatial framework synchronization, robust frameworks for geospatial computation and reasoning, semantic interoperability, toponymic services, methods for the reuse and preservation of data, and database and sensor technologies to support moving objects.

SHARE WITH COALITION FORCES: INTEROPERABILITY

Hard Problems

Interoperability will be a key challenge for NGA in the coming years as it pursues its goal of sharing geospatial intelligence not only with other U.S. organizations but also with coalition forces and foreign partners. For example, consider an exchange of navigation maps among coalition forces. One source may focus on terrain maps in which each route segment is navigable by a land vehicle (e.g., a tank), possibly because it primarily serves Army missions. Maps from another source may include land- as well as water-based route segments for amphibious vehicles, possibly because they serve the U.S. Marines. If the maps from the two sources are merged without accounting for differences in semantic meaning, land vehicles may lose egress routes during battles or be routed into deep water. In addition, precise tracking of a moving target becomes difficult if the geopositions recorded by two different sources use disparate coordinate systems, data file formats, and map symbols. In general, combining the maps from these two sources will require careful consideration of differences at the semantic (e.g., meaning of route segments), structural (e.g., coordinate system, other metadata), and syntactic (e.g., data format) levels.

Some critical problems in spatiotemporal interoperability are the role of real-time sensor inputs, dealing with incomplete and sparse data, disparate ontologies, uncertainty management, content-based filtering, moving targets, and changing profiles in time and space (e.g., growth, aging, decay). These all have implications for data conflation, analysis, and data mining, and they are components of Recommendation 2. Issues of syntactic interoperability are already being addressed through techniques such as spatial data standards, especially those of the Open Geospatial Consortium. Similarly, structural interoperability can be addressed by practical means. Therefore, these were not considered hard research problems, although, because of their importance, they are still included in the following discussion. Semantic interoperability, however, is considered a hard problem. There can be little progress in pursuing interoperability without a thorough examination of the abstract set of objects or features of interest to GEOINT, so that they can be formally defined and converted into abstract objects that are transferable because they are complete in their descriptions. While GIScience research has begun this task, there is little motive outside of NGA to target an ontology toward NGA's needs. Nevertheless, a generic ontology would have great value to other agencies, software developers, and researchers. As a result, the committee makes the following recommendation.

RECOMMENDATION 11: Research that creates a complete descriptive schema for geospatial objects of importance to GEOINT, formalized in a GEOINT ontology, should be pursued to ensure effective data interoperability and fusion of multisource intelligence. This ontology should have a set of object descriptions, should contain precise definitions, and should translate into a unified modeling language (UML) or other diagram suitable for adaptation into spatial data models.
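The navigation-map example above can be made concrete with a small sketch (both source schemas are hypothetical): if each source's notion of "navigable" is made explicit as a set of vehicle classes during merging, routes traversable only by amphibious vehicles are never offered to land vehicles.

```python
# Source A: Army terrain maps; "navigable" implicitly means land vehicles.
source_a = [{"id": "a1", "navigable": True}]
# Source B: Marine maps; segments may be water-only.
source_b = [{"id": "b7", "mode": "land"}, {"id": "b9", "mode": "water"}]

def merge(a_segments, b_segments):
    """Merge two route sources into records whose semantics are explicit:
    each segment carries the set of vehicle classes that can traverse it."""
    merged = []
    for seg in a_segments:                    # A's schema: boolean flag
        if seg["navigable"]:
            merged.append({"id": seg["id"], "traversable_by": {"land"}})
    for seg in b_segments:                    # B's schema: mode string
        modes = {"land"} if seg["mode"] == "land" else {"amphibious"}
        merged.append({"id": seg["id"], "traversable_by": modes})
    return merged

def routes_for(vehicle_class, segments):
    return [s["id"] for s in segments if vehicle_class in s["traversable_by"]]

network = merge(source_a, source_b)
print(routes_for("land", network))   # ['a1', 'b7']: water-only b9 excluded
```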

A short-term issue is addressing the syntactic interoperability of geospatial data, such as differences in data file formats. One option is to carefully diagram and examine a complete catalog of the differences between two spatiotemporal conceptual models, sufficient that a translation procedure between them can be either automated or exhaustively described procedurally. A good pair of models to choose would be two that cause known interoperability problems at NGA. The structural interoperability challenges are longer-term issues; examples include interoperability across geospatial intelligence sources that differ in conceptual schemas (e.g., an entity-relationship or UML diagram) and in metadata such as coordinate systems, resolution, and accuracy. Long-term challenges include semantic interoperability, addressing differences in the meanings (e.g., definitions) of geospatial intelligence across sources. This is an extremely difficult and long-standing problem. Thus, it will be important to support high-risk research that explores promising approaches (e.g., the semantic web, ontology translation) to important subproblems.

Promising Methods and Techniques

Geospatial Intelligence Standards

Initial efforts are addressing syntactic interoperability by developing common standards. The Open Geospatial Consortium (OGC) has provided a sound foundation for work in distributed geoprocessing, real-time processing, sensor-web challenges, geospatial semantic webs, and the brokering of multiple distributed ontologies. The first step is to determine whether intelligence needs dictate new standards, require extensions to existing standards, or fit within existing standards. Applied research can evaluate current geospatial data interchange standards, especially those of OGC, for exchanging geospatial intelligence across U.S. organizations as well as coalition organizations. If current standards do not cover crucial types of geospatial intelligence, it would benefit NGA to encourage the extension of current standards or the development of new standards for exchanging geospatial intelligence data and services from a variety of sources, such as sensors, human interpretation, modeling, and simulation. Effective standards are based on a consensus among major stakeholders, including the producers and consumers of geospatial intelligence within the United States and its partner countries. Thus, basic and applied researchers need to address how their results can be incorporated into geospatial and/or computational interoperability standards through the evolution of those standards.
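A minimal sketch of syntactic interoperability (the formats and fields are hypothetical): the same kind of waypoint arrives as CSV from one source and as GeoJSON-style JSON from another, and each is translated into one common record structure, the kind of mapping that standards-based interchange automates.

```python
import csv, io, json

def from_csv(text):
    """Source 1: CSV with columns id, lat, lon (assumed layout)."""
    return [{"id": row["id"], "lat": float(row["lat"]), "lon": float(row["lon"])}
            for row in csv.DictReader(io.StringIO(text))]

def from_geojson(text):
    """Source 2: GeoJSON-style points; note the lon-lat axis order,
    itself a classic syntactic interoperability trap."""
    doc = json.loads(text)
    return [{"id": f["properties"]["id"],
             "lat": f["geometry"]["coordinates"][1],
             "lon": f["geometry"]["coordinates"][0]}
            for f in doc["features"]]

csv_src = "id,lat,lon\nwp1,33.31,44.36\n"
gj_src = json.dumps({"features": [{"properties": {"id": "wp2"},
                                   "geometry": {"coordinates": [44.40, 33.35]}}]})
records = from_csv(csv_src) + from_geojson(gj_src)
print(records)   # both sources now share one syntactic form
```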

Spatial and Spatiotemporal DBMS Interoperability

Structural differences (e.g., conceptual data models, reference coordinate systems, resolution, accuracy) could be addressed by a combination of automatic and manual methods. For example, geospatial intelligence analysts and their database designers could review the differences between the geospatial conceptual models (e.g., entity-relationship diagrams with pictograms) of a pair of sources to develop translation schemes. This requires careful analysis of issues such as synonyms and homonyms, as well as the establishment of correspondences between the building blocks of the two conceptual data models. Once a translation scheme is developed and validated, future interchange of geospatial intelligence between the source pair can be automated by implementing the scheme in software. However, the manual effort of this pairwise approach grows superlinearly with the number of sources (n sources require on the order of n(n - 1)/2 translation schemes), so an alternative approach based on a global conceptual schema, which needs only one mapping per source, becomes more attractive. The time and effort of developing a global schema and translation procedures can, of course, be reduced by the provision of appropriate tools.

Geospatial Intelligence Ontology

Semantic differences across sources are difficult to resolve, largely because of differing ontologies. The availability of a geospatial intelligence ontology (e.g., a concept dictionary, thesaurus, or set of concept taxonomies) is likely to help with the manual tasks of developing global schemas and translation procedures. It may also help formalize geospatial intelligence and make it accessible to large audiences, facilitating the training of new analysts. Several ontologies have been explored in the GIScience literature and can provide a framework for further work (e.g., Agarwal, 2005). Prior standards efforts such as the federal Spatial Data Transfer Standard (SDTS) include feature lists and definitions, and geometric objects both with and without topology, that could be building blocks for future work. Future work will also build on the ongoing body of research on the semantic web (Berners-Lee et al., 2001). The GML standard and the work of the Open Geospatial Consortium already form a significant element of NGA's research programs; OGC in particular has been closely integrated with NGA's research, and it would be beneficial to continue this in the future. Research into geospatial intelligence ontologies will build on the more generic work described above but will produce tools specific to the needs of geospatial intelligence analysts.
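A toy sketch of how even a small concept dictionary supports translation (all concepts and synonyms are hypothetical): source-specific feature type labels are normalized to canonical ontology concepts before schemas are merged, and unknown labels are flagged for an analyst rather than guessed.

```python
# Hypothetical fragment of a GEOINT concept dictionary: canonical concept
# -> known source-specific synonyms. A real ontology would also carry
# definitions, properties, and taxonomic (is-a) relations.
CONCEPTS = {
    "bridge":   {"bridge", "overpass", "brücke"},
    "airfield": {"airfield", "airstrip", "aerodrome", "landing ground"},
    "waterway": {"waterway", "canal", "navigable river"},
}
SYNONYM_TO_CONCEPT = {syn: c for c, syns in CONCEPTS.items() for syn in syns}

def normalize(feature_type: str) -> str:
    """Map a source's feature type label onto a canonical concept."""
    return SYNONYM_TO_CONCEPT.get(feature_type.strip().lower(),
                                  f"UNMAPPED({feature_type})")

for label in ["Overpass", "landing ground", "pontoon crossing"]:
    print(label, "->", normalize(label))
```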

In summary, the hard problem associated with sharing data is semantic interoperability. Promising methods and techniques for increasing interoperability include further research on and development of geospatial intelligence standards, translation schemes for spatiotemporal conceptual models, and geospatial intelligence ontologies.

SUPPORTING HOMELAND SECURITY

The creation of the Department of Homeland Security (DHS) in response to the nation's increased vulnerability to terrorist attack has produced another significant demand for GEOINT from NGA. To quote a recent planning document (MITRE, 2004): "In the war against terrorism, our front-line troops are not all soldiers, sailors, fliers, and marines. They are also police, firefighters, medical first responders, and other civilian personnel. These are groups whose historical access to sources of national intelligence has been near zero; yet their need for real-time and analytical intelligence is now critical."

The extension of NGA's responsibilities to work far more closely with civilian agencies, including but not limited to DHS, has broadened NGA's mission. For the most part, DHS's needs place NGA in the category of an information supplier; with current trends, NGA will be better suited to serving as a knowledge supplier. In this case, few options are available for collaboration in research. There remains an opportunity for the intelligence world to collaborate with academics and others in the conduct of research. The few DHS-funded centers in universities are starting points, and a large number of vehicles are already in place to encourage collaboration and the sharing of experience, expertise, and resources. It is in NGA's interest to explore relationships between DHS and the existing research encouragement mechanisms reviewed in the next chapter.

The committee believes that working with DHS involves many of the same issues as sharing GEOINT with coalition forces and ensuring horizontal integration; homeland security is therefore supported by many of the recommendations in this report. However, while many of the report's recommendations are applicable to homeland security, the distinction between domestic and foreign intelligence is of great importance, and the need for and use of such information within the United States will be substantially different from that outside the United States. With a new institutional infrastructure for intelligence in the United States, NGA is well placed to clarify and support the role of GEOINT in the new integration-based intelligence environment.

PROMOTING HORIZONTAL INTEGRATION

Hard Problems

Horizontal integration refers to "the desired end-state where intelligence of all kinds flows rapidly and seamlessly to the warfighter, and enables information dominance warfare" (MITRE, 2004). The expanded role of NGA, and the new clients for NGA services brought by international collaboration and work with civilian communities and agencies, place strains on the mechanisms for protecting the security of assets and technologies available to NGA but not available elsewhere. As GEOINT2 evolves and creates new ways to assimilate geospatial intelligence for a particular problem, new vehicles will be necessary "so that the full value of the information can be realized by delivering it to the broadest set of users consistent with its prudent protection" (MITRE, 2004). Yet this multilevel demand for information brings risks. In the past, a culture of separable content-producer and protector roles has existed, and a bias toward withholding knowledge as the default has led to an extraordinary amount of geospatial data being withheld from the potential user community. In at least one case, there is evidence that such overprotection of geospatial data is unnecessary or even damaging (Baker et al., 2004). New protocols must be established to promote safe data exchange in light of legitimately changing demands for geospatial intelligence products. Existing solutions include using and sharing similar data, such as commercially available high-resolution imagery, without security classification.

Research can contribute reliable security protocols in several ways. The committee's discussions focused on two reported problems. First, how can GEOINT products be modified so that their content at any given level of clearance is visible only to those at that level? For example, could an image be made such that its display resolution varies depending on the interpreter? Could a GIS dataset be made that hides detail or entire features on the same basis? Such data could be distributed universally in a single form but used differentially under the control of keys associated with different security levels. Such key-based methods are the domain of research in cryptography and steganography, where GEOINT has received less attention than other domains. Second, what alternative data sources in the public or commercial domain can be shared, so that information can flow while sources are protected? Anecdotally, commercial high-resolution remote sensing seems to be filling this need in many contexts. Research could contribute to both of these options.
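As an illustration of the first option, here is a minimal sketch (the clearance levels and block sizes are hypothetical, and this is not an actual NGA scheme): a single raster product is distributed, and the viewer's clearance selects how coarsely it is rendered, here by simple block averaging. A fielded system would bind the selection to cryptographic keys rather than a lookup table.

```python
import numpy as np

# Hypothetical policy: block size (pixels averaged together) per clearance.
BLOCK_FOR_CLEARANCE = {"top_secret": 1, "secret": 4, "coalition": 16}

def render(image: np.ndarray, clearance: str) -> np.ndarray:
    """Degrade a square raster by averaging b-by-b blocks; b = 1 returns
    the image unchanged."""
    b = BLOCK_FOR_CLEARANCE[clearance]
    if b == 1:
        return image
    n = image.shape[0] // b * b                 # trim to a multiple of b
    img = image[:n, :n]
    blocks = img.reshape(n // b, b, n // b, b)
    coarse = blocks.mean(axis=(1, 3))
    return np.kron(coarse, np.ones((b, b)))    # re-expand to original size

image = np.random.rand(64, 64)
for level, b in BLOCK_FOR_CLEARANCE.items():
    print(level, render(image, level).shape, "effective cells:", (64 // b) ** 2)
```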

RECOMMENDATION 12: Research should be directed toward the particular needs of geospatial data for protection with multilevel security to promote safe data exchange, including innovative coding schemes, steganography, cryptography, and lineage tracking. Similarly, the processing of data so that they resemble public-domain (e.g., digital orthophotoquadrangle) or commercially produced structures and formats should be pursued.

In the short term, issues of image and map degradation are pertinent. For example, what other than a median filter can be used to gracefully degrade the contents of a high-resolution image so that it can be shared? There are also tasks relating to metadata that can be done immediately: for example, allowing a web-based query to indicate that spatial data covering a particular area of interest exist, while providing only relevant contact information rather than access. In addition, within computer networks, firewall protection systems and sub-local area networks (LANs) can easily limit access by Internet domain and can alert NGA to users seeking access inappropriately. Also in the short term, new steganographic methods to support multilevel security could be researched, and protocols developed for location-specific identification and spatially constrained security. For example, a user undeniably situated at a particular location (e.g., from GPS codes) could be granted access to data covering that location; alternatively, users from or in particular locations could be denied access, perhaps even temporarily. In the longer term, increased research in spatial data licensing (NRC, 2005), geospatial digital rights management, location privacy rights, and geospatial denial and deception methods would be beneficial.

Promising Methods and Techniques

The field of image processing has developed numerous ways of encoding and selectively processing imagery. However, little work has extended these methods to multispectral and hyperspectral sources or to other spatial data such as digital elevation models. Similarly, little work has been done with place name or vector data (Armstrong et al., 1999). An emerging body of research in location-based services is examining some of the technical issues (Schiller and Voisard, 2005), but policy issues such as location privacy need further study. Location authentication research has focused on the technology of GPS, yet next-generation systems will use both new systems such as Galileo and new positioning approaches (Rizos and Drane, 2004).
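A minimal sketch of the spatially constrained security idea (the tile footprint, coordinates, and radii are hypothetical, and the reported position is assumed to be already authenticated): access to a data tile is granted only if the user's position lies within the tile's footprint and outside any temporarily denied zone, using a great-circle distance check.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS84 points, in kilometers."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical data tile: center and radius of its footprint.
TILE = {"name": "tile-21", "lat": 33.30, "lon": 44.40, "radius_km": 25.0}

def grant_access(user_lat, user_lon, tile, denied_zones=()):
    """Grant access only from inside the tile footprint and outside any
    denied zone, given as (lat, lon, radius_km) triples."""
    for lat, lon, radius in denied_zones:
        if haversine_km(user_lat, user_lon, lat, lon) <= radius:
            return False
    return (haversine_km(user_lat, user_lon, tile["lat"], tile["lon"])
            <= tile["radius_km"])

print(grant_access(33.35, 44.36, TILE))                      # True: inside
print(grant_access(33.35, 44.36, TILE, [(33.3, 44.4, 50)]))  # False: denied
```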

Methods are already available to support multiresolution imagery and, to some extent, maps, such as quadtrees, recursive and adaptive meshes, and resolution-dependent georeferencing (Shekhar and Chawla, 2003).

In summary, the hard problem associated with horizontal integration is the issue of multilevel security. Promising methods and techniques include current research in location-based services, location authentication, and methods for selectively processing multiresolution imagery.

TABLE 4.1 Summary of Hard Problems

NGA Challenge (1), Achieving persistent TPED:
- Assimilation of new, numerous, and disparate sensor networks within the TPED process (Recommendation 1)
- Spatiotemporal data mining and knowledge discovery from heterogeneous sensor data streams (Recommendation 2)
- Spatiotemporal database management systems (Recommendation 3)

NGA Challenge (7), Compress time line:
- Process automation versus human cognition (Recommendation 4)
- Visualization (Recommendation 5)
- High-performance grid computing for geospatial data (Recommendation 6)

NGA Challenges (2-6), Exploit all forms of imagery (and intelligence):
- Image data fusion across space, time, spectrum, and scale (Recommendation 7)
- Role of text and place name search in data integration (Recommendation 8)
- Reuse and preservation of data (Recommendation 9)
- Detection of moving objects from multiple heterogeneous intelligence sources (Recommendation 10)

NGA Challenge (8), Sharing with coalition forces, partners, and communities at large:
- GEOINT ontology (Recommendation 11)

NGA Challenge (9), Supporting homeland security:
- Covered by other areas

NGA Challenge (10), Promoting horizontal integration:
- Multilevel security (Recommendation 12)

SUMMARY

This chapter has presented recommendations on the "hard problems" in geospatial science that NGA should address in order to meet its evolving mission toward GEOINT2. It has also examined promising methods, approaches, and technologies for solving those hard problems. The hard research problems and associated methods are summarized in Table 4.1. Many of the technical problems are ontological; that is, the solution of the architecture and interoperability problems lies in the creation of a comprehensive ontology for the collection, handling, and archiving of geospatial information. The chapter also shows that the nature of input networks, and the volume and type of data coming from them, are likely to change markedly in the future. By exploiting foreknowledge of these changes, NGA can position itself for the radical shift in geospatial paradigms discussed in Chapter 3. Nevertheless, responding to the hard problems outlined in this chapter will be disruptive to NGA both technologically and organizationally. Chapter 5 makes recommendations intended to ease the transitions that these hard problems will bring.
