1
INTRODUCTION
We live in an age of information, and in recent years the nation has made unprecedented investments in both information and the means to assemble, store, process, analyze, and disseminate it. Given the high costs of these activities, the nation needs to develop policies that are designed to invest and allocate information resources wisely and to ensure the greatest possible efficiency, effectiveness, and equity in the use of information.
The Mapping Science Committee (MSC) focuses its attention on a particular type of data: spatially referenced digital data. Spatial data (note 1) establish the positions of objects or activities on the surface of the Earth. Applications need to use data that have the appropriate accuracy. Some applications might require a positional accuracy of 1 mm and others 1 km. Although many applications are satisfied with two-dimensional analysis, there is an increasing demand for position in three dimensions.
In the past few years the MSC has focused its discussions on the concept of a national spatial data infrastructure (NSDI) and addressed specific aspects of its development and use. A previous National Research Council report (note 2) defined the concept of the NSDI to be “the means to assemble geographic information that describes the arrangement and attributes of features and phenomena on the Earth.” The primary purpose of the present report is to address an important issue within the overall structure of the NSDI, namely, identification of a foundation that provides a common reference system for the generation and exchange of spatial data. The MSC believes that it is in the public interest for government to play a leading and facilitating role in research and production activities to develop spatial data and to make those data available for public use and exchange. Many of the production activities will take advantage of the capabilities of the private sector through contracts and other arrangements; to a certain extent, market forces will further push the production, maintenance, and distribution of certain data sets. This report:
- identifies three categories of spatial data that form the foundation for the NSDI,
- identifies minimum specifications required to integrate other spatial data with the foundation, and
- recommends specific activities that should be pursued to achieve an integrated and accessible NSDI.
NEED FOR SPATIAL DATA
People need spatial data to establish the position of identified features on the surface of the Earth. But why is position important? First, knowledge of the location of an activity allows it to be linked to other activities or features that occur in the same or nearby locations. Location allows distances to be calculated, maps to be made, directions to be given, and decisions to be made about complex issues. Examples range from local to national scale and address issues such as land-use planning and zoning, transmission corridors, new shopping centers or schools, siting of landfills, environmental regulation, and emergency relief; the potential list of uses is enormous.
Currently the bulk of the spatial data knowledge of the nation is embodied in the agencies, people, and technologies that make and use the nation’s maps. These maps include commercial road maps for drivers, property maps maintained as land records, nautical charts used by mariners, floodplain maps used to determine control measures and the need for flood insurance, and the extensive series of basic topographic quadrangle (“quad”) maps provided by the U.S. Geological Survey (USGS), which historically have had a wide range of users from hikers and hunters to resource planners.
The needs for spatial data are continually changing and include address matching, real-time monitoring of weather observations, water quality modeling, and countless other types of analyses requiring much more information than is traditionally represented on maps. These analyses may require real-time animation, scene comparison, data overlay, buffering, and other operations that cannot be supported by analyses of paper maps alone. This is not to say that paper maps are no longer needed but rather that their production should no longer be the primary driver for spatial data production.
Digital spatial data have become a critical ingredient in the decision-making process for government and business alike and can be an important agent of improved productivity in many sectors of the economy. Digital information has become increasingly important in effective siting of public and private facilities. Processing of spatial data also can lead to greater efficiency in the logistical operation of vehicles, with concomitant savings in fossil fuel use and reduction of pollution and traffic congestion. Likewise, spatial data are essential to responsible development or preservation of natural resources such as agricultural soils and wetlands.
As a consequence, spatial data that are customarily represented on maps and aerial photos are migrating to computer storage. The technology of data conversion is surprisingly complicated, and there is much more to moving spatial knowledge from paper to computers than simply digitizing current paper maps. Although digitizing paper maps automates the printing of more paper maps, it does not by itself support analytical needs.
What is needed is a realignment of priorities from paper map production to that of digital data production, which supports both digital analysis and paper map products. This requires rethinking the production process, retraining staff, adopting new technologies, and setting aside old attitudes.
The people in the NSDI will be profoundly affected by the technical complexity of converting spatial data from paper to computer databases. Seemingly endless technical issues, such as differences in data models, data content, data quality, and data transfer, must be resolved. Attention must be given to the problems of data redundancy, standards, and training in the new technologies. The incentive to make the conversion has to do with the ease with which digital data may be shared by many users with diverse spatial data needs.
IMPORTANCE OF SHARING SPATIAL DATA
People need to share spatial data to avoid duplication of expenses (note 3) associated with generation and maintenance of data and their integration with other data. In paper map form, data sharing is obstructed if scales differ, if projections differ, if symbologies are not uniform, if legends do not identify all map items, and so on. In digital form most of these same problems exist and must be taken into account. Often, the spatial data produced for one application can be applied in others, thus saving money by sharing data. Mechanisms to facilitate the use and exchange of digital spatial data are a major justification for developing and expanding the NSDI.
A recent example demonstrates society’s need for digital spatial data sharing at a level not now available. Analysis of problems resulting from major flooding in the upper Mississippi drainage basin during the summer of 1993 demonstrates the benefits of sharing digital spatial data for potential damage assessment (note 4) and forecasting results of mitigation scenarios. Had they been available, digital data layers such as geodetic control, digital elevation, hydrology, wetlands, street centerlines, geology, soils, and existing flood control structures could have been integrated with other information (e.g., meteorological, levee conditions and breaks, and other local data) to forecast the extent of flooding at a given river stage. An analytical system supported by digital spatial data could have determined routes of drainage and water absorption by soils and substructures. In addition, automatic traffic routing at each stage of flooding could also have been supported, allowing emergency services and commercial transportation planning to be put in place before crises of street and bridge loss occurred. These same data layers, if available, could be shared now that the flood is over to develop dynamic models of dike restoration or addition and to model impacts of alternative floodplain land-use scenarios. The availability and use of these data should lead to decisions that could help mitigate future financial loss and personal tragedies.
DATA SHARING AND DATA QUALITY
Each application has its own data requirements, including data resolution or precision, locational and attribute accuracy, logical integrity and semantic consistency, completeness, and temporal currentness. Response to an emergency 911 call may require positional accuracy of perhaps 10 m, while a surveyor of a property boundary will require measurements having a resolution in units of perhaps 1 cm. Land cover databases for the nation could incorporate hundreds of themes, while mapping surface material for highways may require only tens of categories. Several of the themes for mapping forest resources might require frequent updating (e.g., annually or semiannually), whereas soil mapping may be accomplished with less frequent updates.
Data quality is an important component of data sharing because people need to know the reliability of interpretations and decisions based on the data generated by one agency and used in an application by another organization. No single federal, state, or local agency can effectively respond to all the possible spatial data needs of its constituencies. Nor can a single level of accuracy, consistency, or currentness be reasonably applied to all data products or applications. Table 1 presents examples of spatial data applications and their corresponding precision or resolution requirements.
To provide the necessary assurances of data quality, data sets may be evaluated according to guidelines and strategies set forth in the Spatial Data Transfer Standard (SDTS; FIPS Publication 173; see note 5). For example, when evaluating positional accuracy, data may be compared with other data sets of higher positional accuracy for the necessary registration and control. Table 1 shows this in the form of a hierarchy, where the less precise applications depend on source data of greater resolution. Notice that the most accurate layer is not essential for all applications. It is an oversimplification to assume that the most precise or accurate data are always the “best” for a given application. Computational requirements, time available to reach a decision, and precision of other data to be integrated may in fact preclude the need for the most precise data in every application. It is true, however, that one data-producing agency must often depend on another data producer for source data, which points once again to the importance of sharing data and of integrating data from multiple sources.
TABLE 1. Example Applications of Spatial Data and Their Common Resolution Requirements (a)
Horizontal Resolution | Example Applications
0.001 m (1 mm) | Crustal motion, geodynamics, geophysics
0.01 m | Property surveying, civil engineering
0.1 m | Cadastral mapping, utility location
1 m | Facilities management for utilities
10 m | Mapping, soil, and wetlands mapping, National Biological Survey
100 m | Small-scale mapping
1,000 m (1 km) | Ice flows, global change research
(a) Exceptions to these requirements will depend on specific user needs.
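The positional accuracy check described above, in which a data set is compared with one of higher positional accuracy, can be sketched as follows. This is a minimal illustration rather than a procedure taken from the SDTS: the use of root-mean-square error as the summary measure, the function name horizontal_rmse, the check-point coordinates, and the 10 m threshold (borrowed from the emergency 911 example) are all assumptions made for the example.

```python
import math


def horizontal_rmse(test_points, reference_points):
    """Root-mean-square horizontal error between matched (x, y) points.

    Both lists must hold the same features in the same order, in the same
    units; the reference list is assumed to be the more accurate source.
    """
    if len(test_points) != len(reference_points) or not test_points:
        raise ValueError("need equal-length, non-empty lists of matched points")
    squared_errors = [
        (tx - rx) ** 2 + (ty - ry) ** 2
        for (tx, ty), (rx, ry) in zip(test_points, reference_points)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical check points, in metres.
reference = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
tested = [(3.0, 4.0), (100.0, 5.0), (97.0, 96.0), (0.0, 95.0)]

rmse = horizontal_rmse(tested, reference)

# Assumed requirement for illustration: 10 m, as in the 911 example.
meets_requirement = rmse <= 10.0
```

With these made-up points every check point is off by 5 m, so the data set would satisfy a 10 m requirement but not a 1 m one.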
CHALLENGE OF DATA INTEGRATION
The sharing of spatial data involves more than just ensuring that data are in digital form and that the metadata (information on data quality, accuracy, lineage, currentness, etc.) are included with the spatial data. Because data are drawn from many producing organizations, and each source (private, local, state, or federal) may have a different application in mind, there are likely to be differences in data definitions as well as in resolution, accuracy, and other data quality components. Users need to integrate spatial data to process them and understand their patterns. Methods for spatial data integration can range from very simple to very complicated processing steps, depending on the application and the differences in the type of data (see Figure 1).
For example, data sets to support emergency response have often been created from USGS Digital Elevation Models and the Bureau of the Census’ TIGER (Topologically Integrated Geographic Encoding and Referencing) files, registered to geodetic control. Facilities management applications may depend on cadastral maps (often produced at the county level) for identifying utility hookups. Integrating data can require merging position, merging attribute categories, correcting geometry or topology, and/or revising data definitions to embed the contents of one data set into another.
Data integration is not solely limited to positional registration. A hydrologist integrating two data sources may have as first priority the need to preserve the topological integrity of tributary flows rather than the need to locate them precisely. Integrating data from two different census decades may require statistical correction of attributes enumerated for different census tract boundaries. To take an actual example, some of the USGS Level II Land Cover categories are drawn or modified from categories in the Anderson et al. (1976) system (note 6), whereas other categories are drawn from the Defense Mapping Agency’s Digital Chart of the World. These data sources contain varying data definitions; thus, some conceptual revision is necessary to integrate all the sources.
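The statistical correction mentioned above for census attributes can take many forms. One simple possibility, shown here only as a sketch, is area-weighted reallocation, which assumes the attribute is spread uniformly within each source tract. The report does not name a method; the function reallocate_by_area, the tract labels, and all counts and areas are hypothetical.

```python
def reallocate_by_area(source_counts, overlap_areas):
    """Area-weighted reallocation of counts from source zones to target zones.

    source_counts: {source_zone: count}
    overlap_areas: {(source_zone, target_zone): area of their intersection}
    Assumes the counted attribute is uniformly distributed in each source zone
    and that the overlaps cover each source zone completely.
    """
    # Total area of each source zone, summed from its overlap pieces.
    source_area = {}
    for (src, _tgt), area in overlap_areas.items():
        source_area[src] = source_area.get(src, 0.0) + area

    # Give each target zone its area share of every overlapping source zone.
    target_counts = {}
    for (src, tgt), area in overlap_areas.items():
        share = area / source_area[src]
        target_counts[tgt] = target_counts.get(tgt, 0.0) + source_counts[src] * share
    return target_counts


# Hypothetical tracts: A and B from the earlier census, X and Y from the later.
old_population = {"A": 1000, "B": 600}
overlaps = {("A", "X"): 3.0, ("A", "Y"): 1.0, ("B", "Y"): 2.0}

new_population = reallocate_by_area(old_population, overlaps)
```

Here three quarters of tract A falls in X and one quarter in Y, while all of B falls in Y, so the total population is preserved across the new boundaries.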
Regardless of the type of operation (geometric, topological, statistical, or conceptual), data integration requires a specific spatial reference to be successful. The reference might be a set of registered data layers, a set of data themes, a published standard for evaluating logical consistency, or some combination of these. The choice of a single and unified reference provides a foundation for full integration of all other data into a common framework. That is, integrating some data to one reference and other data to another reference will not necessarily integrate all data to each other. This is especially true for spatial data processing, where it is common to combine multiple themes of data from many different sources.
For a single application, the principle of a single reference is straightforward, and one chooses the most convenient reference strategy. Within an organization, where a single data set may be used for multiple applications, data integration becomes more problematic as every integration process requires time and money. It is advantageous to adopt a single reference strategy and integrate data for any application to that reference, to minimize integration costs. For the nation as a whole, if each data producer relied on a different reference, data sharing, data exchange, and data integration would be impeded. Removing this impediment will strengthen the NSDI.
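The point that data registered to different references are not automatically integrated with each other can be illustrated with a toy case. The sketch below assumes the two references differ by a simple known translation, which is a deliberate simplification of real datum or control differences; the reference names, offsets, and coordinates are hypothetical.

```python
import math

# Hypothetical offsets (metres) that map each local reference onto one
# common reference. A pure translation is assumed for simplicity.
OFFSETS = {"ref_a": (0.0, 0.0), "ref_b": (12.0, -7.0)}


def to_common(point, reference_name):
    """Shift an (x, y) point from its own reference into the common one."""
    dx, dy = OFFSETS[reference_name]
    return (point[0] + dx, point[1] + dy)


# The same physical feature as recorded in two separately registered data sets.
hydrant_in_a = (500.0, 250.0)  # utility layer, registered to ref_a
hydrant_in_b = (488.0, 257.0)  # street layer, registered to ref_b

# Overlaying the raw coordinates leaves an apparent gap between the copies.
raw_gap = math.dist(hydrant_in_a, hydrant_in_b)

# Bringing both layers to the single common reference removes the gap.
common_gap = math.dist(
    to_common(hydrant_in_a, "ref_a"),
    to_common(hydrant_in_b, "ref_b"),
)
```

Each layer is internally consistent, yet the feature appears nearly 14 m apart until both layers are expressed in the one shared reference.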
NOTES
1. The MSC continues to use the term “spatial data” in the context of “geographically” referenced data. The Federal Geographic Data Committee recently started to use the term “geospatial” within the same context.
2. Toward a Coordinated Spatial Data Infrastructure for the Nation (1993), Mapping Science Committee, National Research Council, National Academy Press, Washington, DC, 171 pp.
3. The Office of Management and Budget (OMB) determined that federal spatial data activities amounted to about $4.4 billion in FY1994; this number resulted from a data call described in OMB Bulletin 93-14. Most analysts agree that an equal or greater amount is spent on spatial data activities by state and local governments and the private sector.
4. For additional information on how spatial data were used in response to the flooding, see V. Speed (1994), “GIS and Satellite Imagery Take Center Stage in Mississippi Flood Relief,” GeoInfo Systems 4(1), 40-43.
5. Spatial Data Transfer Standard, Federal Information Processing Standard Publication 173, National Institute of Standards and Technology, Gaithersburg, MD.
6. J. R. Anderson, E. E. Hardy, J. T. Roach, and R. E. Witmer (1976). A Land Use and Land Cover Classification System for Use with Remote Sensor Data, USGS Professional Paper 964, 28 pp.