4
Priorities for Geographic Information Science

Chapter 3 explores the significant challenges and opportunities at the USGS for processing spatial data. This chapter addresses the maintenance and enhancement of analytical tools and the creation of specific geographic products. The chapter describes the Survey’s primary priority as The National Map and defines the processing and research necessary for its creation. It also outlines the following more general secondary research priorities related to the generation and distribution of geographic products:

  • resolution and scale;

  • delivery of vector products to users;

  • standards for GIS products; and

  • spatial statistics and analysis.

These more general research priorities in GIScience are important because they present problems that must be resolved for two end uses. First, they support the completion of The National Map, which cannot become a reality until progress is made in these areas. Second, these general GIScience topics must be addressed if the USGS is to successfully produce an entire range of map products in new digital formats.

PRIMARY PRIORITIES

The National Map

Developing The National Map is the most important single initiative in the Geography Discipline at the USGS (USGS, 2001b). USGS administrators



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.
Research Opportunities in Geography at the U.S. Geological Survey

view The National Map as more than a single map or a static atlas. Rather, it will be a spatial database covering the United States and territories as a continually updated, “cooperative” topographic map. In this report the committee views The National Map as a digital product, a long-term research and production core project for the Geography Discipline. This definition is more restrictive than the use of the term by the USGS, where it is used to encompass almost all the activities of the Geography Discipline. The committee believes that the activities of the Geography Discipline should be more wide-ranging than The National Map, involving several lines of geographic investigation in the Critical Zone.

Currently, the Geography Discipline compiles, integrates, and maintains databases that form the foundation of The National Map, but many additional datasets from outside the Survey will also be included. For example, street and highway locations and alignments will be derived from proprietary data, as well as federal data including human census data from the Bureau of the Census, agricultural data from the Department of Agriculture, and airport, railway, and port data from the Department of Transportation.

The USGS is the appropriate agency to serve as the focal point of these various data streams and their expression in The National Map because the Survey is congressionally mandated to serve as the nation’s manager of spatial data in the National Spatial Data Infrastructure. The USGS also has a historical foundation for its role as primary manager of spatial data. The Survey is the national purveyor of authoritative maps, and The National Map is a logical extension of that activity. The proposed project is ambitious, requiring appropriate funding and considerable expertise in the processing of geographic information.
The project is intended to provide integrated geospatial digital data for the nation, which will enable advancement in geographic research at the USGS, and promote the application of geographic information in decision making. Unprecedented collaboration and partnerships among agencies, private organizations, and individuals will be necessary to develop and continually update The National Map (USGS, 2001b). Creation of The National Map involves much more than simply digitizing the current topographic maps; it requires a seamless geospatial database that has information from individual topographic maps restructured to achieve the goal of a map without edges. As discussed in Chapter 3, this information is currently available in fragmented form in individual topographic quadrangle maps, but the new database will require seamless integration of the information from the full set of quadrangle maps. Another goal of The National Map is the capability to update individual database elements as soon as change occurs in the landscape. The average age of the quadrangle maps is 23 years, and the current update cycles for paper topographic maps are unacceptable to modern users. Updating the paper-printed product is inefficient and costly. The goal for The National Map is to

update weekly, essentially making the product a real-time spatial database. Given its other responsibilities for coordination and collaboration in map production, spatial data management, and the NSDI, the Geography Discipline should also assume responsibility for creating The National Map. The database will have exceptional positional accuracy, with other federal agencies, private companies, and state and local entities providing information through the Cooperative Topographic Mapping Program (CTM). The inclusion of multiple partners means that the USGS’s maintenance of data standards is a critical national function.

Although Chapter 3 discussed spatial data in general, the following subsections describe the specific data needs for The National Map. The USGS identifies the following data sets as the foundation for the map (USGS, 2001b):

  • high-resolution digital orthorectified imagery;

  • high-resolution surface elevation (topographic) digital data;

  • vector feature data including hydrographic data, transportation data, structures, and boundaries;

  • land cover data; and

  • geographic place names.

The digital terrain information and the digital orthorectified imagery in The National Map are the foundation of spatial information in the NSDI (NRC, 1995). The digital orthoimagery is provided by the USGS, while the first-order geodetic control, from which digital terrain datasets are created, comes from the National Geodetic Survey. The National Geodetic Survey is a Department of Commerce agency within NOAA that provides accurate land survey and terrain elevation data by physical, on-the-ground methods. The National Map will grow from the NSDI, but will be a far-reaching extension of it.
Major components of The National Map [for example, transportation, hydrology, cadastral (land ownership boundaries differentiating private from public parcels of terrain), and natural resources] are in harmony with the foundation and framework layers that reside in the NSDI (NRC, 1995). In addition, the committee identifies biogeographic data (the distribution of flora and fauna) as an essential dataset for The National Map. The committee did not include structures and boundaries in the list of priority research areas for The National Map, but these datasets are important to the Survey’s clients and efforts to maintain and improve them should continue. In the future The National Map will be more useful than paper topographic maps to the American public because it will be more accessible, more current, and will include a broader range of geographic information types from multiple sources. However, present knowledge, methods, and tools are inadequate to create The National Map. In order to achieve the project’s goals,

coordinated geographic research will be required in all three existing programs of the Geography Discipline: Cooperative Topographic Mapping (CTM), Land Remote Sensing (LRS), and Geographic Analysis and Monitoring (GAM). Additional research will require coordination between the Geography Discipline and the other disciplines at the USGS. The committee observes that there is no well-developed tradition of such interdisciplinary cooperation involving geography on large projects such as The National Map, but these intra-agency connections will be vital to a successful product.

The Cooperative Topographic Mapping program should have primary responsibility for the implementation and maintenance of The National Map, both challenging tasks that require research and development of new GIScience. Issues such as positional accuracy, seamless integration of disparate data from a variety of sources, resolution of scale issues, and construction of suitable database architecture require research prior to implementation. An additional challenge is the construction of an efficient Web-based interface for agency users and the general public.

The Land Remote Sensing program will contribute Landsat and similar data to The National Map. LRS presently coordinates and distributes satellite and aerial photographic imagery for the nation. LRS’s broader mission is to expand the use of these data. To do so, LRS must first coordinate remote-sensing data acquisition, processing, and distribution. These tasks have not been synchronized in the past, and federal efforts have been duplicated in independent projects that do not benefit from potential economies of scale. To coordinate these activities in a single program, LRS must also be able to advance remote sensing technology and improve the analysis of sensor records. Finally, LRS provides clients with a better understanding of the applications and benefits of remote sensing.
For example, the USGS’s Earthshots Web site provides new users of remotely sensed products a basic introduction to applications of imagery for education, research, and problem solving (USGS, 2001c).

A focal point for project-specific geographic research at the USGS, the Geographic Analysis and Monitoring program (GAM) investigates the dynamics of environmental systems in the Critical Zone. GAM explores issues of sustainability, human health, natural hazards (including wildfire), and the geographic aspects of the global carbon budget. The focus of the program is on land surface dynamics. GAM applies the spatial, nature-society, and integrative perspectives of geography to USGS science.

Databases for The National Map

The committee identifies the following foundation data sets for The National Map: (1) orthorectified imagery (orthoimagery), (2) digital elevation

data, (3) land cover data, (4) biogeographic data, (5) hydrographic data, (6) transportation feature data, and (7) geographic place names. Although the USGS deals with an enormous array of databases, these seven should receive the highest priority.

Orthorectified Imagery

High-resolution digital orthorectified imagery, aerial photography, and satellite imagery from which distortions have been removed should form the basis of The National Map. Such images combine the characteristics of a photograph with the geometric qualities of a map (Thrower and Jensen, 1976). Orthorectified images provide useful data on land use, built structures, vegetation patterns, and transportation features. Such images are often used by USGS and other agencies (for example, the U.S. Department of Agriculture for soil surveys) as a platform for the compilation and orientation of information from other datasets. Traditionally the USGS has produced orthorectified imagery of much of the United States from 1:80,000-scale National High Altitude Photography program imagery, 1:40,000-scale National Aerial Photography program imagery, or high-resolution imagery from satellites (Figure 4.1).

The use of orthorectified imagery in The National Map and in other digital map products of the future requires new knowledge. Orthorectified imagery research should include the development of methods to eliminate edge-match problems and to unify the color balance throughout the dataset. Research should also be directed towards reducing visibility problems in cloud-shrouded environments, perhaps by integrating orthoimagery produced from RADAR, or even from thermal infrared imagery. New methods of cartographic symbolization are needed that will enhance the visual utility of the images without obscuring the information shown on the base maps.
High-quality metadata that describe the accuracy, processing chronology, and various sources of orthoimagery will be critical for analysis and sharing of orthoimagery.

Digital Elevation Data

High-resolution surface elevation data, including Digital Elevation Models (DEMs), are important by-products of the orthorectification process and provide the land surface for The National Map. The USGS has compiled terrain information for all of the nation’s lands, but the coverage is at varying levels of detail. The most widely known map is the 7.5-minute quadrangle created from a DEM with a spatial resolution of 30m. For Alaska, most quadrangles are distributed at a scale of 1:63,360. Unfortunately, as the average age of the paper map series increases, much of the digital terrain data are out-of-date, particularly in areas subject to rapid change by human activities.

FIGURE 4.1 An example of an orthorectified image. A view of downtown Columbia, South Carolina, constructed from two separate images with different resolutions. IKONOS pan-sharpened 1×1 meter image courtesy of Space Imaging, Inc.

The USGS’s current National Elevation Dataset (NED) is one of the most important datasets required for producing The National Map. The NED has grown out of the effort to create, from a variety of sources, a seamless global elevation dataset with a consistent geodetic datum at a nominal spatial resolution of 1km. The NED data are used for hydrologic modeling at the regional and continental scales, for transportation planning and development, and for geographic, geologic and biologic modeling problems such as wetland delineation, habitat management, and national response to environmental

catastrophes (e.g., earthquakes). On average, approximately 20 percent of the 60-gigabyte database of 1-by-1 degree quadrangles is updated bi-monthly.

The initial compilation of the NED is complete (Figure 4.2). The USGS’s researchers are now updating and refining the NED using a variety of federal, state, and local sources. Next, research should define the most efficient protocols for NED updates and refinements. For example, the USGS recently developed a derivative hydrological 1km dataset that includes slope and aspect information, which will be useful for modeling water flows and defining stream locations. However, this resolution is too coarse for identifying numerous smaller catchments, particularly those at high elevation. Limitations such as this can impede important national applications, such as predicting future sources of potable water for urban areas undergoing rapid growth.

Improvement and refinement of a comprehensive national elevation dataset should be a high priority, and research is needed to guide this effort. GIScience research is also needed to improve data integration algorithms for fusing the data from multiple agencies and for cleaning existing digital terrain data, which are prone to error accumulation during compilation. Management of data updating would also be improved with additional research. Increased emphasis should be placed on incorporating higher resolution elevation data from soft-copy photogrammetric techniques, light detection and ranging (LIDAR) remote sensing instruments, interferometric synthetic aperture radar (InSAR), and in situ point observations drawn from Global Positioning Systems (GPS) measurements.

Land Cover

General land cover information is required for many environmental, land management, and modeling applications (USGS, 2001d), and should be included in The National Map.
Land cover data may be derived at a range of resolutions, depending on its source and its intended use. Spatially coarse land characterization datasets, such as those derived from the 1km Advanced Very High Resolution Radiometer (AVHRR), are well suited for global analyses. However, coarse-resolution data sets are often of limited value for regional investigations, and they are not appropriate for local land use studies. Although land cover datasets with very fine resolution (e.g., 1m by 1m cells or pixels) are appropriate for local land use planning, they are generally too voluminous for regional to global analyses (Vogelman et al., 2001). In late 2000, the USGS EROS Data Center, in cooperation with the EPA, compiled the initial version of the National Land Cover Data (NLCD) set. The NLCD provides reasonably consistent and seamless 30m digital land cover data for the conterminous United States. It is an intermediate-scale national

land cover dataset for assessing water quality, ecosystem health, wildlife habitat, land cover, and land management issues (Figure 4.3). The NLCD was derived from 30m Landsat Thematic Mapper imagery and other ancillary data from the early 1990s, but significant research needs remain to be addressed.

FIGURE 4.2 Mosaic of examples of the U.S. Geological Survey’s National Elevation Dataset. A. A shaded-relief image B. Yosemite National Park in California at a spatial resolution of 2-arc seconds. The data have been rotated counter-clockwise 90 degrees. C. Yosemite National Park in California at a spatial resolution of 2-arc seconds displayed in an oblique, analytical hill-shading format with 2× vertical exaggeration.

FIGURE 4.3 An example map created from data in the USGS National Elevation Dataset: a three-dimensional image of the Colorado Central Front Range near Denver, land use and land cover. SOURCE: USGS Front Range Infrastructure Resources Project; Data Source: USGS National Landcover Database (30 meter resolution), 1990.

Overall accuracy for the eastern United States was 81 percent for Anderson Level I aggregations (water, urban, barren land, forest, agricultural land, wetland, and rangeland) (Anderson et al., 1976), and 60 percent for all Anderson Level II classes (more finely divided classes, such as streams, lakes, and marshes for water features) (Vogelman et al., 2001). Since scientists and public agencies are generally unwilling to incorporate thematic data with only 60 percent accuracy, the committee believes that USGS geographers, in coordination with other scientists, should strive for 85 percent accuracy for Level II classes. This task represents a significant research challenge for the USGS.

The USGS EROS Data Center has commenced refining the NLCD based on Landsat Enhanced Thematic Mapper Plus (ETM+) 30m data from early 2000. The Geography Discipline should use expert systems and machine algorithms to incorporate additional information in the NLCD, including

canopy closure/density, soil taxonomy, permeability, and digital elevation data. In the future, USGS geographers should integrate finer spatial-resolution remote sensing information into The National Map. Stereoscopic aerial photography with fine resolution will be needed to obtain accurate urban/suburban land cover and land-use information. In this way, The National Map can achieve the accuracy and completeness required for modeling urban dynamics, land use change, and mitigation of agricultural and open space resources. Other types of supportive research required for land cover contributions to The National Map include improving change detection methods, developing error assessment mechanisms for the various classes of land cover, and making the data more accessible for general users by improving reporting.

Biogeographic Data

Data on flora and fauna along with their distributions should be included in The National Map. Ideally, databases for past patterns should be included and should be linked to models generating possible future biogeographies. The Geography Discipline should interact with the Biology Discipline and other organizations such as the Nature Conservancy to produce useful products for geographic science and the public good.

Biogeographic data from the Gap Analysis Program (GAP) should also be included in The National Map. GAP, coordinated by the USGS and carried out by state agencies, links predicted distributions of vertebrate species with natural land covers. The objective of GAP is to determine the effects of land management on the long-term maintenance of biodiversity. This program identifies species likely to be endangered by changing land uses. GAP data support and enhance the utility of other biogeographic data in The National Map.
Hydrographic Data

Hydrographic features are important components of The National Map for many applications in water quality, water use, land-use planning, and environmental management. The basic geographic building block for hydrologic systems in the Critical Zone is the watershed. The outlines of watersheds are major research, planning, and management tools for water and water-related resources such as habitat. Assessment of watershed resources, including water

quality, has been part of the USGS mission since the Survey’s founding, and support for watershed analysis has always been part of the geographic contributions of the Survey. In the late 1880s, the Survey developed a Hydrologic Survey to map dam sites and watersheds. Under the general guidance of the Water Resources Council, the USGS used several data sources to refine the basic geographic classification system for watersheds. The USGS created boundaries for 21 water resource regions, 222 planning regions, 352 accounting units, and 2,149 cataloging units (Seaber et al., 1987). These parcels of geographic space form the background for many potential applications using The National Map.

Watersheds are included in the National Hydrography Dataset (NHD), a geospatial vector dataset for the United States that includes all significant water features (USGS and EPA, 2000). Currently, NHD contains intermediate-scale data (1:100,000 scale) for most states and high-resolution data (1:24,000 scale) for some others. The hydrography of the nation is an important part of The National Map because many future users of the map will require this information.

Transportation Data

Transportation data describing railroads, airfields, highways, and city streets for The National Map are among the most valuable and widely used of all the data layers, but they are also among the most difficult to develop. These data are among the most needed components of the nation’s data infrastructure. Linkage of geospatial transportation data to GPS has stimulated automation and increased efficiency in emergency response, freight and mail delivery, and many other commercial routing applications. Highways and streets are not only conduits for the movement of people and goods, but they are also the alignments for a locational network of other data layers, particularly those for economic activity and population distribution and movement.
The U.S. Bureau of the Census uses address files to supply locational identifiers for its fundamental data on human population, so that accurate locations related to street addresses represent the crossover point between transportation and socio-economic data of all kinds. An important potential partner for the transportation layer of The National Map is the Department of Transportation’s Bureau of Transportation Statistics (BTS), created in 1992. The USGS would be unlikely to contribute substantial amounts of transportation data to BTS, but it is likely that BTS would be a strong source of data for The National Map. The development of a USGS-DOT partnership requires improved interagency cooperation.

The maintenance of a transportation data layer creates one of the most serious challenges to The National Map’s objective of a near real-time database. The substance of many data layers, such as the land surface configuration, changes slowly over time, but transportation lines change rapidly. Within a single year, for example, many new streets are created, rural routes are abandoned, and significant changes occur in highway alignments as construction crews complete modifications. As the bridge network ages, load-bearing capabilities change and are reported at the state level by engineering inspectors. As a result of rapid changes such as highway modification and load-bearing capacity, it is unlikely that the USGS will be able to update its own transportation database quickly enough to provide real-time data; instead, it will depend on acquisition of these data from other agencies or from private sources. For these reasons, the USGS should begin developing partnerships and promoting data standards for the transportation data needed in The National Map.

Geographic Place Names

The National Map will be useful only if accurately defined and located place names are included. The USGS is responsible for defining place names of natural and cultural features. Place names are a critical part of any geographic database because of their official sanction, which avoids confusing duplication with standard locations for each named feature. Each state has several thousand place names, so that managing these data is a large task. The committee notes that The National Map should include the location of named features, and incorporate automatic and intelligent decisions about which names to include, and where to place them on the finished map. When users of The National Map zoom in or out, the system supporting the map should be able to automatically adjust its representation of place names to accommodate the user’s needs.
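The zoom-dependent behavior described above can be illustrated with a minimal Python sketch. It is not a USGS or GNIS method: the `PlaceName` record, its `rank` field, the scale breakpoints in `max_rank_for_scale`, and the sample records are all hypothetical, chosen only to show how a system might decide which names to draw at a given display scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceName:
    """Hypothetical gazetteer record; 'rank' is an assumed importance class."""
    name: str
    lat: float
    lon: float
    rank: int  # 1 = most prominent feature, larger numbers = minor features


def max_rank_for_scale(scale_denominator: int) -> int:
    """Deepest rank displayed at scale 1:scale_denominator.

    The breakpoints are illustrative assumptions, not published values.
    """
    if scale_denominator >= 1_000_000:
        return 1
    if scale_denominator >= 100_000:
        return 2
    return 3


def names_for_view(names, scale_denominator, bbox):
    """Select the names to draw inside bbox = (south, west, north, east)."""
    south, west, north, east = bbox
    limit = max_rank_for_scale(scale_denominator)
    visible = [
        p for p in names
        if p.rank <= limit
        and south <= p.lat <= north
        and west <= p.lon <= east
    ]
    # Most prominent names first, then alphabetical for a stable order.
    return sorted(visible, key=lambda p: (p.rank, p.name))


# Made-up sample records, for illustration only.
sample = [
    PlaceName("Example City", 34.0, -81.0, 1),
    PlaceName("Example Creek", 34.1, -81.1, 3),
    PlaceName("Example Town", 34.2, -80.9, 2),
]
view = (33.5, -81.5, 34.5, -80.5)
print([p.name for p in names_for_view(sample, 250_000, view)])
```

Zooming in (a smaller scale denominator) raises the rank limit, so minor features such as the creek appear only in detailed views. Choosing where to place each label on the finished map is a separate problem that this sketch does not address.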
Because of the connection between place names and the map, a gazetteer is an integral part of The National Map. A gazetteer converts the place name into geographic coordinates, permitting a GIS to identify a location on-screen, or to browse for information about the location in the data archive. Gazetteers preserve past as well as current names and support data retrieval for historical geographic research. The Geographic Names Information System (GNIS), a digital gazetteer developed by the USGS in cooperation with the U.S. Board on Geographic Names (BGN), contains information on nearly 2 million physical and cultural geographic features in the United States. The federally recognized

name of each feature described in the database is listed, and the feature’s location by state, county, and geographic coordinates is defined. The GNIS is the nation’s official repository of domestic geographic names information. The committee recognizes that additional research is required to translate data from GNIS into useful information for The National Map.

SECONDARY PRIORITIES

Resolution and Scale

Construction of The National Map and other digital map products requires a better understanding of how to deal with resolution of the original data and its final representation, because it will be a product that integrates data from a wide variety of sources. The content and geometry of vector data change with resolution. For example, tectonic processes are evident on maps at scales of 1:1,000,000 or coarser resolution that show vast areas of terrain simultaneously. More research is needed to fuse data from multiple sources to preserve geometric relationships (Quattrocchi and Goodchild, 1997). Research should also enable The National Map to model dynamic spatial processes.

Questions about the appropriate resolution also affect the creation and management of raster datasets. In some cases, the density of detail should vary within a single dataset. One example is the National Land Cover Dataset (NLCD), which must use the most appropriate resolution for each region of the nation rather than a uniform resolution for all regions regardless of complexity. For example, urban areas and strategic military sites might be compiled at a finer resolution than national forests. In particular, the question of how to develop detailed land-use information in the urban and suburban environment, and then include it in a more general database, remains a significant research need.
It will be necessary to create automated techniques to extract information on structures (e.g., buildings, bridges), perimeter, and height from monoscopic and/or stereoscopic remotely sensed data. In addition, some of the information on urban infrastructure might more logically be derived from local LIDAR and InSAR data, both of which provide the exceptionally high resolution needed in complex urban areas. To make them manageable, the resulting datasets will require new algorithms for data compression. Resolution is closely related to scale. The most obvious unsolved scale issues are connected to the need to use data from a variety of sources at different scales to create a product at a single scale. Additionally, all the data in any digital map are unlikely to be used at the same scale, so that when the

user “zooms in” some previously unobtrusive data become apparent. The consequence of these scale-dependent problems is that it is impossible to create a single detailed national database containing all geographic themes (transportation, hydrography, elevation, vegetation, and so on) that would be needed to model every application at all possible levels of resolution.

Because geographic data can be expensive to collect (e.g., data from extreme arctic or alpine locations) or available at unpredictable points in time (e.g., volcanic or seismic activity), data must be integrated from many sources at different levels of resolution. An important and challenging component of GIScience research is the integration or fusion of multi-scale datasets into a coherent and integrated database with established and usable scale limits for modeling, statistical analysis, and visualization. Multi-scale databases are often needed in a single application (Longley et al., 1999). For example, routing a national package delivery service requires geographic data at national, regional, and local levels to determine cost breakpoints for flying or driving the delivery routes.

Delivering Vector Data to Users

Much of the USGS’s digital information can be viewed over the Internet using the Earth Explorer program. The USGS has been the world leader in developing systems for Internet access to digital environmental data, particularly through:

  • Terraserver, providing access to the entire topographic map collection of the USGS;

  • Earth Explorer, a gateway to the aerial photographic and satellite imagery collections of the survey;

  • the National Atlas, a multiple-part mapping engine that allows users to customize individual map products; and

  • the National Water Information System (NWIS), which provides geographically designed access to water quantity and quality information.
These systems have been extremely popular with users (they are accessed several thousand times each day), both inside and outside the federal government. Demand for digital data will undoubtedly increase as GIS technology spreads, and the quantity of data to be stored and accessed will grow exponentially with frequent revision of The National Map.

Transmitting geospatial data via the Internet has become commonplace, usually using raster data that is efficient for imagery, orthophotography, and satellite data products. Transmission proceeds incrementally: transmitting additional raster rows and columns in random order refines an initial coarse resolution. The computations are straightforward, and file size remains predictable and constant for a given resolution, regardless of feature complexity. At the viewer’s workstation, the Internet browser displays the visual effect of a blurred image whose details gradually sharpen.

Unlike raster data, vector data remains a challenge for Internet transmission (NRC, 1999a; Buttenfield, 1999). Vector files tend to be large, and file size increases unpredictably depending on the complexity of feature geometry. It would be ideal to transmit vector data by first sending a coarse “browse version” and then amplifying details. This problem is a high priority research area because 911 emergency response systems and other uses require real-time access to vector data. The USGS should pursue research to improve transmission of vector files because they are essential for most of the digital products likely to be produced by the Survey.

Standards for GIS Products

Standards for geospatial products address data lineage, accuracy, the compilation sources, the processing methods, and the chronology of processing, as well as error and uncertainty. The elements of standards for geospatial data are positional and attribute accuracy, completeness, logical consistency, and lineage. The description of these elements is in the metadata for any geospatial product. Initially, the FGDC developed data standards for U.S. agencies to share national geographic information. The creation of the standards was a monumental organizational achievement involving government, private sector, and academic participation. The resulting Spatial Data Transfer Standard (SDTS) was incorporated as Federal Information Processing Standard (FIPS-173) by legal mandate (NIST, 1994).
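As a rough illustration of how the five quality elements named above (positional accuracy, attribute accuracy, completeness, logical consistency, and lineage) might be carried in a product's metadata, the following Python sketch defines a hypothetical record type and reports which elements a producer has left undocumented. The class name, field names, and sample entries are assumptions made for illustration only; they are not drawn from SDTS or from any FGDC schema.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class QualityReport:
    """Hypothetical container for the five data quality elements."""
    positional_accuracy: Optional[str] = None
    attribute_accuracy: Optional[str] = None
    completeness: Optional[str] = None
    logical_consistency: Optional[str] = None
    lineage: Optional[str] = None


def missing_elements(report: QualityReport) -> List[str]:
    """Return the names of quality elements that have no description."""
    return [f.name for f in fields(report) if not getattr(report, f.name)]


# Illustrative (made-up) entries for a fictional dataset.
example = QualityReport(
    positional_accuracy="Horizontal positions checked against surveyed control points",
    lineage="Compiled from aerial photography; digitized and edge-matched",
)
print(missing_elements(example))
```

A check of this kind could run before a product is released, so that gaps in the quality description are visible to the producer before they reach the user.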
The NRC has been continually involved in the standards process through its Mapping Science Committee.

Assessing the reliability of geographical data for inference and reasoning is an important and emerging area of research in GIScience. With the development of GIS and related technologies, large volumes of data may be displayed and analyzed rapidly. But interpretive capabilities depend, to a certain extent, on the ability to visualize accurately and to discern the quality of the identified patterns. Data quality influences the credibility of data analysis and the confidence attached to data interpretation. It affects the reliability of interpretations and thus decision-making based on GIS modeling, sensitivity analysis, and data exploration. Data quality usually varies across the map surface, and this variation needs to be communicated to the map user. Data producers, such as the USGS, have a responsibility to advise data users about the quality of generated data products. USGS data products form a foundation for many critical decisions and policies about the nation and the environment. For example, it is difficult to ascertain whether data collected in ecological field studies are positionally accurate, correctly categorized, complete, or logically consistent (Buttenfield, 2001). Nonetheless, these observations affect policies on wetland delineation, habitat loss of indicator species, and similar issues affecting urban corridor development and spread. The USGS should conduct research on how to model spatial variation in error, how error is introduced into geographic information during data processing, and under what circumstances error may be reduced by particular computations.

Research underlying the development of data standards includes many components of data production that at first seem unimportant. For example, standards include rules for thesaurus development to ensure the consistency of data definitions (geographic meaning as well as semantics) (Salge, 1995). Thesaurus development becomes especially important when a dataset is used for multiple purposes. For example, the definition of an “address” to the U.S. Postal Service is the coordinates of a mailbox. The definition of an “address” to the 911 Emergency Dispatch teams across the nation is the coordinates of a front door. In urban areas, the geographic distance between a front door and a mailbox can be measured in inches. In rural areas, the distance may extend a quarter mile or more. When the Postal Service and 911 Dispatch first exchanged national datasets, both agencies intended to reduce their cost of digital data compilation. However, the differences in data definitions of an “address” made it impossible for the agencies to distinguish semantic differences from database errors (Buttenfield, 1997).

Research on GIS standards has progressed in other countries.
With international collaboration for creating global databases, including in the International Geosphere-Biosphere Programme (IGBP), the need for data sharing with other national governments complicates the standards development procedure (Wortman and Buttenfield, 1994). Specifications for data production may differ dramatically. The differences in positional accuracy for cadastral data compiled in East Germany and West Germany before the fall of the Berlin Wall were not greater than 16 centimeters. Yet when the two national databases were integrated, the positional discrepancy was propagated across hundreds of kilometers, and entire land parcels dropped out of virtual existence in the new database. This problem had important implications for the landowners and for the newly unified German government as it tried to establish demographic inventories for social services and taxation.

Priorities in Spatial Analysis

The successful creation and use of USGS digital products for geographic data require a sophisticated capability in spatial statistics and analysis in two ways. First, USGS personnel must know about analytic techniques if they are to create products that lend themselves effectively to analysis. Second, USGS personnel must possess substantial analytical capability to address the Survey’s vision and mission, both heavily reliant on spatial data.

Spatial analysis uses transformations and statistical manipulations to identify trends, reveal patterns, and detect outliers or extreme values. Spatial analysis also provides computational guidance, confirming the presence of a pattern and its significance in situations where visual displays may confuse or deceive the researcher. The distinction between spatial analysis and conventional statistical analysis is that the former assumes that the results of a method will vary as the location of the objects under scrutiny varies, and location can be defined in terms of location on the planet or in the database. Longley et al. (2001) identified six areas of spatial analysis: queries and spatial reasoning, geospatial measurements, data transformations, descriptive summaries, spatial optimization, and hypothesis testing for spatial pattern. These areas, which use spatial inference, are enhanced by and rely on effective research on the techniques of spatial analysis.

To support the Geography Discipline at the USGS, fundamental research in spatial analysis is needed. Research in spatial analysis will improve understanding of spatial scale, spatial association, spatial variation, and spatial movement. Such research will also improve the Survey’s service to users by improving query and retrieval processes as well as the handling of very large spatial datasets (e.g., disaggregated census data and remotely-sensed data on a global scale).
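To make the idea of measuring spatial association concrete, the sketch below computes global Moran's I, one widely used statistic of this kind. The text does not name a particular statistic, so the choice is an assumption made for illustration, as are the function name, the binary neighbor weights, and the four-cell sample data.

```python
from typing import Dict, List, Sequence


def morans_i(values: Sequence[float], neighbors: Dict[int, List[int]]) -> float:
    """Global Moran's I with binary weights (w_ij = 1 if j is a neighbor of i)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    # Sum of cross-products of deviations over all neighboring pairs.
    cross = sum(dev[i] * dev[j] for i, js in neighbors.items() for j in js)
    # Total weight is the number of (i, j) neighbor links.
    total_weight = sum(len(js) for js in neighbors.values())
    variance_sum = sum(d * d for d in dev)

    if total_weight == 0 or variance_sum == 0:
        raise ValueError("Moran's I is undefined without links or without variation")
    return (n / total_weight) * (cross / variance_sum)


# Four cells in a row; each cell neighbors the cells beside it (illustrative data).
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

smooth = morans_i([1.0, 2.0, 3.0, 4.0], chain)       # steady gradient
alternating = morans_i([1.0, 4.0, 1.0, 4.0], chain)  # high/low alternation
print(f"gradient: {smooth:.3f}, alternating: {alternating:.3f}")
```

The gradient yields a positive value (similar values sit next to each other) and the alternating pattern a negative one, which is the kind of computational confirmation of a pattern that the text contrasts with visual inspection. A significance test, which this sketch omits, would be needed before drawing conclusions from real data.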
Research in spatial analysis will likewise yield appropriate tools for modeling geographic processes. In the United States, enormous quantities of data are now available to help solve local and regional problems, but the magnitude of the data sometimes makes them difficult to use. Research on spatial analysis will help researchers and decision-makers use available data optimally, even when those data are in terabyte- or petabyte-sized collections (UCGIS, 1998). As satellite imagery archives, digital government repositories, and other georeferenced datasets become larger, methods should be developed for browsing, classifying, and manipulating the information efficiently. Real-time database access can improve support for emergency teams such as national 911 Emergency Dispatch services designed to speed responses to natural, technological, or terrorist hazards.

To help achieve the vision and mission of the USGS, the Survey should improve its contributions to geographic knowledge, tools, and techniques by developing the capability to address the high priority subjects of resolution and scale, delivering vector data to users, standards, and spatial statistics and analysis.

SUMMARY

This chapter explores the role of the USGS in contributing to new knowledge in the areas of cartography and GIScience, and delineates the priority research items. The National Map is expected to be the flagship product of the Geography Discipline in the next decade. However, the accomplishment of this worthy goal is not assured with present knowledge. Further basic GIScience research is needed to make the product a reality. The USGS should undertake additional GIScience research, both toward the development of The National Map and of a more general nature, if it is to fulfill its vision and mission.