
3

Characteristics of Scientific and Technical Databases

DR. SERAFIN: We are going to begin now with the scientific data panels, which will describe and discuss the salient characteristics of scientific and technical databases in four disciplines—geography, genomics, chemistry and chemical engineering, and meteorology—from the government, not-for-profit, and commercial perspectives. [NOTE: Prior to the workshop, the National Research Council study committee distributed a set of questions to the data panelists requesting detailed information on their respective data activities. The data panelists' prepared responses to these questions, which were distributed to the workshop participants, are included in these proceedings because they are more comprehensive than the transcribed text of the oral workshop presentations. See Box 3.1 for a list of questions to the data panelists.]

BOX 3.1 QUESTIONS FOR DATA PANELISTS

The study committee prepared a list of nine questions for the participants of the workshop's data panels. The committee asked that the panelists use their current activities as a baseline but also provide information about major changes that have taken place over the past five years and the changes that they anticipate in each area over the next five years, and state why these changes have, or will, occur.

Provide a description of your organization and database-related operations.

1a. What is the primary purpose of your organization?

1b. What are the main incentives for your database activities (both economic and other)?

2a. What are your data sources and how do you obtain data from them?

2b. What barriers do you encounter in getting these data and integrating them, and how do you deal with those barriers?

3. What are the main cost drivers of your database operations?

4a. Describe the main products you distribute/sell.

4b. What are the main issues in developing those products?

4c. Are you the only source of all or some of your data products? If not, please describe the competition you have for your data products and services.

5a. What methods/formats do you use in disseminating your products?

5b. What are the most significant problems you confront in disseminating your data?

6a. Who are your principal customers (categories/types)?

6b. What terms and conditions do you place on access to and use of your data?

6c. Do you provide differential terms for certain categories of customers?

7a. What are the principal sources of funding for your database activities?

7b. What pricing structure do you use and how do you differentiate (e.g., by product, time, format, type of customers, etc.)?

7c. Do your revenues meet your targets/projections? Please elaborate, if possible.

8a. Have you encountered problems from unduly restrictive access or use provisions pertaining to any external source databases?



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.

PROCEEDINGS OF THE WORKSHOP ON PROMOTING ACCESS TO SCIENTIFIC AND TECHNICAL DATA FOR THE PUBLIC INTEREST: AN ASSESSMENT OF POLICY OPTIONS

8b. What problems have you had with legal protection of your own database activities and what are some examples of harm to you or misuse of your data that you have experienced, if any?

8c. How have these problems differed according to data product, medium, or form of delivery, and how have you addressed them (e.g., using management, technology, and contractual means)?

8d. What specific legal or policy changes would you like to see implemented to help address the problems identified above?

9. Do you believe the main problems/barriers/issues you have described above are representative of other similar data activities in your discipline or sector? If so, which ones? If not, what other major issues can you identify that other organizations in your area of activity face?

The moderator of the first panel, which focuses on geographic data, is Harlan Onsrud, professor at the University of Maine.

GEOGRAPHIC DATA PANEL

MR. ONSRUD: My name is, again, Harlan Onsrud with the Department of Spatial Information Science and Engineering at the University of Maine, which is also affiliated with the National Center for Geographic Information and Analysis. We will have two speakers today, since James Brunt, from the Long-Term Ecological Research Network Office at the University of New Mexico, is unable to join us.

Our first speaker is Barbara Ryan. She is associate director for operations for the U.S. Geological Survey (USGS). Barbara is going to be highlighting her agency's experience in the creation, sharing, and handling of geographic data, as well as some of the other data that the agency collects. USGS, of course, is both a creator of geographic data and a major user of geographic data. So, both of those perspectives are represented.

Government Data Activity

Barbara Ryan, U.S. Geological Survey

Response to Committee Questions

Provide a description of your organization and database-related operations.

The U.S. Geological Survey (USGS) and its information assets provide a gateway to the Earth. Sound stewardship of the nation's land, natural, and biological resources requires up-to-date, and often up-to-the-minute, information on how these vital resources are being used, as well as an understanding of how possible changes in use might impact the national economy, the environment, and the quality of life for all Americans. A core responsibility of the federal government is to enhance and protect the quality of life for its citizens, and the USGS provides the scientific underpinning for sound stewardship decisions that have an impact in each community, but that also extend beyond state boundaries and benefit the nation as a whole. With scientific information from the USGS, policy makers can foresee possible impacts of their decisions on America's economy, on the environment, and on the lives of the citizens they represent.

With an interdisciplinary mix of nearly 10,000 scientists including geologists, biologists, hydrologists, cartographers, computer scientists, and support staff at work in every state and in cooperation with over 2,000 local, state, and other federal organizations, the USGS is uniquely positioned to serve the science needs of the communities, the states, and the federal government by describing processes that occur in, on, and around the Earth.

1a. What is the primary purpose of your organization?

The USGS serves the nation by providing reliable scientific information to (1) describe and understand the Earth; (2) minimize loss of life and property from natural disasters; (3) manage its water, biological, energy, and mineral resources; and (4) enhance and protect the quality of life. It is the primary science agency of the Department of the Interior. The USGS carries out its research and activities at the global, national, regional, state, and local levels. Because the USGS encompasses numerous natural science disciplines, the bureau can bring both physical and biological science to natural resource management problems. The aggregation of this information provides a national perspective on the landscape of the country, from understanding processes deep beneath the Earth's surface to preserving habitat for threatened and endangered species.
A sampling of current USGS programs includes (1) biological activities such as the cooperative biological research units, the Gap analysis program, biomonitoring of environmental status and trends, and the Species at Risk program; (2) geologic activities such as the Energy and Mineral Resource Assessment, National Cooperative Geologic Mapping, landscape and coastal assessment, and geologic hazards assessments; (3) mapping activities such as the mapping cooperative partnerships, business partner product distribution program, cooperative research and development agreement partnerships with Microsoft TerraServer, Environmental Systems Research Institute, Lizard Tech, and Now What, the National Atlas of the United States of America, Center for Integration of Natural Disaster Information, National Geographic Research program, and National Satellite Land Remote Sensing Data Archive; and (4) water resource activities, such as the Federal-State Cooperative Water Resources Program, National Water Quality Assessment Program, Water Resources Research Act Grant Program, ground-water resources program, toxic substances hydrology program, and national water resources research program.

1b. What are the main incentives for your database activities (both economic and other)?

As a science agency, a fundamental part of the USGS mission is the collection, quality assurance, storage (archiving), and dissemination of basic natural science data that are reliable and have continuity over time and space. Embodied in its mission is also a commitment to make USGS data and information more accessible to more people. Other important incentives are as follows:

Meet a growing number of requirements and support a wide array of constituents by using rapidly advancing technology.

Provide updated and revised graphic topographic maps and ensure that the nation has access to the best available geospatial information in formats and on media best suited to customer needs.

Use creativity in cooperation and coordination; seek and find matching dollars from other government agencies and the private sector in many different kinds of partnerships and consortia of customers.

Ensure timely presentation of scientific information and effective use of this information by decision makers.

Ensure that products are published in digital format, have consistent data standards, and are available through the National Spatial Data Infrastructure (NSDI).

Provide searchable indexes to access USGS projects.

Provide reliable, impartial, and timely information that is needed to understand the nation's natural resources.

Establish a network of distributed databases and information sources on natural resources directed toward the needs and responsibilities of Interior resource management bureaus.

2a. What are your data sources and how do you obtain data from them?

The USGS Geospatial Data Clearinghouse provides information about USGS geospatial or spatially referenced data holdings. The agency is an active participant in the NSDI. The USGS NSDI node encompasses a distributed set of sites organized on the basis of the USGS's four principal data themes: biological resource information, geological information, national mapping information, and water resources information. (See <http://nsdi.usgs.gov/nsdi/> for additional information.)

For biologic data, the USGS works cooperatively with many government agencies; nongovernmental institutions including academia, the private sector, and museums; and international organizations to share data and information. At this time, the National Biological Information Infrastructure (NBII) is based upon a fully distributed, World Wide Web-based architecture, in which the provider sites, in addition to providing data and information for the NBII, also serve the data and information. As the infrastructure develops and matures, it may be possible in the future to create a central server site that allows provider sites to concentrate on their primary functions, except for providing their data for public availability. The centralized server node then would take care of virtually all of the additional mechanics required for making the data accessible. This second model is under consideration for future implementation.

Geographic and cartographic data are obtained primarily from state and local government mapping and Geographic Information System agencies, other federal agencies, and partnerships and relationships with the private sector. These data are obtained mainly through cooperative agreements or innovative partnerships. The USGS has relationships with both the National Oceanic and Atmospheric Administration (NOAA) and the National Aeronautics and Space Administration (NASA) for archiving satellite data.

Sources of geologic data include the USGS; state geological surveys and academic institutions through the National Cooperative Geologic Mapping Program; academic institutions that operate, through cooperative agreements with the USGS, regional earthquake monitoring networks; and international partners (academic institutions or foreign government agencies) that operate nodes of the Global Seismographic Network through agreements with the USGS.

The National Water Information System (NWIS) is the primary corporate database for the USGS water information. NWIS receives data from a variety of sources, including field instruments through a variety of telemetry methods, field computers, laboratory instruments, and direct input from investigators.

2b. What barriers do you encounter in getting these data and integrating them, and how do you deal with those barriers?

Funding and integration of common data requirements among several partners. There are many issues here related to content, format, accuracy, etc. It is best to look for common ground and minimum specifications that work for both parties.

Database content and merging of data that have different content specifications. Work toward a specification that ensures a minimum level of content and find partners that are willing to provide data to that minimum level.

Copyright problems when working with private-sector organizations. To deal with this problem, look for data exchange opportunities or possible degradation of the copyrighted data.

Great variety of data types located in many legacy systems and formats; lack of common data models. The USGS is dealing with these barriers by working to develop common data models and migrating priority legacy data sets to make them more widely available.

When dealing with real-time data, absence of data due to problems with the reliability of system components and erroneous readings resulting from damaged or malfunctioning system components. The USGS deals with these problems through vigorous quality control procedures and the use of hardened or redundant components.

Building of partnerships representing a broad array of organization types coming together for a unified purpose. This task is important but difficult. The diverse needs of such organizations raise issues and challenges. Each type of organization must be offered a benefit that makes its participation worthwhile. One method that has been effective in meeting this challenge is the dialog and demonstration method, that is, participating actively in groups where the highest number of partner and potential partner organizations can be reached to deliver information about the status and progress of the partnerships. In addition, one-on-one dialog and technical support can be maintained as needed with new partner organizations to assist them in complying with the requirements for participation. Monetary support is sometimes provided to organizations with key data sources.

3. What are the main cost drivers of your database operations?

Cost drivers for USGS information products can be grouped into two categories.

(1) Data collection and management costs including interpretation, maintenance, administration, archive, and analysis; software enhancement; hardware upgrades; hardened and/or redundant systems; World Wide Web page development and maintenance; searchable online clearinghouses; controlled vocabularies; data discovery, retrieval, and access tools; assessment and documentation of user requirements; partnerships with key non-USGS sources of data (such as state government agencies, academic scientists, or natural history museums) to assist in their efforts to document and serve important data sets and information products; and support of trained staff to prepare high-quality metadata documentation of data sets and information products. These cost drivers are funded by congressional appropriations and cooperative funds.

(2) Reproduction and distribution costs, with the primary cost drivers being customer service, order taking, accounting, and order fulfillment. These cost drivers are funded by congressional appropriations for legislatively required distributions; all other distributions are funded through cost reimbursement fees.

Cost drivers for reproduction-related costs for maps, map products, and digital data are inspection of the press-ready combined negatives, press plate production, press setup and press plate calibration, production supplies, quality control, equipment amortization, equipment maintenance, space and utilities, and shipment to the main USGS distribution facility.

Cost drivers for distribution-related costs for maps, map products, and digital data are receiving and processing the shipments received from the USGS printing operation into inventory, inventory management and quality control, processing orders from operational databases, customer service, order taking, accounting, order fulfillment, packaging, postage, distribution supplies, order closeout, and marketing.

If the maps or map products are in digital format, costs are similar to graphic maps with the exception of media costs, research, and order staging. Equipment amortization and maintenance costs for digital format production equipment are somewhat higher.

Text products have additional costs of editing and Government Printing Office contract overrides as well as higher unit costs due to limited demand and small production lot sizes.

4a. Describe the main products you distribute/sell.

The USGS products, information, and services are based on or support natural science data and include the following formats: publications (professional papers, circulars, and general interest), both in electronic and hard copy forms; fact sheets; digital data; maps (including geologic, hydrologic, and topographic); analytical studies; technical assistance; tangible technology; new processes and procedures; emergency assistance; predictive modeling and analysis; environmental assessments and reports; water-resource assessments; biological assessments; biological status and trends reports; satellite imagery; and aerial photography.

Information products disseminated by the USGS are grouped into four general categories: (1) maps and map products, (2) text products, (3) scientific data, and (4) remotely sensed imagery. These products are made available in various formats including paper, plastic, film, and digital. The 1:24,000-scale standard topographic quadrangle maps (topoquads) on paper are probably the best known USGS product and are distributed most widely. In fiscal year (FY) 1998, the USGS disseminated approximately 3.1 million 1:24,000-scale topoquad sheets and approximately 4.3 million topoquad sheets across all available scales.

The USGS also disseminates information generated by other federal agencies, for example, the National Imagery and Mapping Agency of the Department of Defense, the United States Forest Service of the Department of Agriculture, other Department of the Interior bureaus, and the U.S. Customs Service of the Department of Commerce.

The USGS holds databases across many subject areas including biological information, climate, natural hazards, minerals, ecosystems, coastal and marine geology, energy, geography, real-time streamflow discharge, water-use, groundwater, and water-quality data.

4b. What are the main issues in developing those products?

Trying to produce national data sets from many regional data sets that do not have common standards and may be incomplete.

Producing printed products as part of cooperative agreements with state and local agencies and other federal agencies through a distributed production process that decentralizes the approval, preparation, and distribution activities.

Migrating toward more electronic publishing and distribution of products (toward an as yet undetermined end point). The USGS is still dealing with various issues in the print world and the evolving technology of electronic publishing. In addition, the costs of getting to that end point, coupled with level or decreasing funding for production and printing, are dynamic issues.

Evaluating the potential effect of distribution. Due to the nature of its scientific focus, USGS research sometimes results in data and information about threatened or endangered species. While the agency has no security restrictions or limitations on distribution of these publications, it does find it necessary to evaluate the potential effect of publication on the resource being studied. For example, it is an unfortunate fact that publication of endangered species data sometimes results in further harm to the species at the hands of those who wish to possess rare commodities.

4c. Are you the only source of all or some of your data products? If not, please describe the competition you have for your data products and services.

The USGS is not the only source of many of its data products, although it produces some specific research products that can be found only at the USGS. The National Water Information System is a unique national database providing consistent, reliable, long-term water information. However, many private sector concerns, state governments, and academic institutions gather information similar to that collected by the USGS. The USGS strives to develop multiuse information products on a national level. Its competitors, both public and private, develop information products with a specific customer in mind that meet certain demand-level projections. The USGS strives to work cooperatively with many organizations to collect, coordinate, and share data and information, e.g., Incorporated Research Institutions for Seismology data center, state geological surveys, and state geographical information system groups. Often the greatest value of USGS database activities is derived from the federation of partners we strive to create, and we are not in a competitive role with regard to the other producers. The national coverage provided by the USGS ensures consistent management of all U.S. land, water, and natural resources for the betterment of all.

5a. What methods/formats do you use in disseminating your products?

The tangible products (inventoried items) of the USGS and custom products produced on demand are disseminated out of the USGS Denver warehouse via the mail and over the counter. The format is mostly paper, although USGS products come in a wide range of flat maps, folded maps, books, etc. Some inventoried items are on CD-ROM.

Digital products are produced on demand (primarily) and distributed via the mail, over the counter, through retail business partners, and over the Internet. Formats vary widely, but USGS is trying to standardize on the Spatial Data Transfer Standard, the native format (the archive format), other nonproprietary formats such as GeoTIFF, and sometimes proprietary formats like ARC-INFO. A variety of media are offered including CD-R, CD-ROM, 8-mm tape, 3480 cartridge, and digital linear tape.

Much of the USGS digital data and information are distributed over the Internet. The rapid movement of the Web from a novelty to mainstream distribution mechanism has presented the USGS with challenges unthought of just five years ago. The biggest challenge has been to organize, integrate, and present, in a sensible manner, the broad range of data and information types that characterize USGS products.

The Web medium has made USGS products visible to a vast and varied clientele, ranging from the traditional USGS customer base among scientists and policy makers to hobbyists and the K-12 education community. These new audiences have their own unique needs and abilities to digest and use USGS products, which has placed great pressure on the agency to create multiple views and tailored extracts of its Web products and services. For example, genealogists are now a major nonscientific user group for the online USGS Geographic Names Information System, and whitewater recreationists are heavy users of the USGS online real-time stream flow data.

5b. What are the most significant problems you confront in disseminating your data?

A fundamental goal of the USGS is to maximize the dissemination of information products to the broadest possible audience given the constraint of recovering costs associated with reproduction and distribution. Fees for USGS information products are therefore based on reproduction and distribution costs and not on the value of the product provided. These fees pursue full recovery of costs, including indirect costs such as depreciation of equipment. USGS information products are in the public domain, carry no copyrights, and may be used and shared freely. The public policy rationale for charging no more than the cost of reproduction and distribution for information products is that the taxpayer has already expended resources to create the data. The costs associated with reproduction and distribution to specific customers represent the incremental or additional cost that the USGS incurs to disseminate the information products to these customers.

The most significant problem with digital data is that every order is customized. This causes problems in ordering the correct data type and format for the customer. It also creates bottlenecks within the production processes, sometimes resulting in delays in distribution. Due to file size, distribution over the Internet is limited by bandwidth, both on the USGS end and the customer end. The Web “pipeline” is presently inadequate to efficiently deliver some USGS products, such as remotely sensed satellite imagery.

Another goal is to provide customers with data, information, and products in the format they most need, in a timely manner, and at a level of information that is appropriate to the intended audience. In addition, the proprietary nature of information that is collected as part of some cooperative agreements presents a problem in the broad release of information.

The current, inconsistent pattern of electronic publishing (some products are available on the Web; some are not) is based not on an established policy but on arbitrary decisions. The support of printed products and their distribution is also a significant problem in addressing cost recovery mandates and in long-term funding of free products. The USGS is striving to find more cost-effective means to disseminate a large variety of distinct products that may each have a relatively small or specialized customer base.

6a. Who are your principal customers (categories/types)?

Because the USGS mission encompasses a broad range of natural science studies, issues, and interests, the agency serves many different customers. It defines its customers as anyone who uses USGS information, services, and products or as anyone who works with USGS to produce and deliver these. Its customers include the engineer who uses USGS data to revise building codes, the resource manager who uses USGS information to make critical resource and land management decisions at the state and local levels, the water manager who uses the data and information from USGS research and investigations and data collection in fulfilling his or her responsibilities to manage the nation's water resources, and the hiker who uses USGS topographic maps.

These customers also include Congress; state and local agencies; federal government agencies such as the Forest Service, NOAA, the Department of Energy, Environmental Protection Agency, U.S. Army Corps of Engineers, NASA, and the Federal Aviation Administration; land and resource management bureaus of the Department of the Interior (Bureau of Land Management, National Park Service, Minerals Management Service, Bureau of Reclamation, Fish and Wildlife Service, and Bureau of Indian Affairs); the science community; elected officials at the state and local levels; other state, local, and tribal authorities; federal, state, and local emergency management agencies (Federal Emergency Management Agency, state offices of emergency services); producers and users of mineral and energy commodities; nongovernment organizations (e.g., insurance sector, structural engineering industry, not-for-profit natural resource interest groups); the news media; the private sector; citizens; universities and schools; representatives of other countries; and other USGS employees (internal customers).

6b. What terms and conditions do you place on access to and use of your data?

USGS data are in the public domain and are not subject to copyright protection. Copyright is considered to be a barrier to use of data as a public good. Although not a term or condition per se, the fact that streamflow information is being served in real time on the Internet requires a statement that the data are provisional and subject to quality assurance and quality control.

6c. Do you provide differential terms for certain categories of customers?
The USGS provides a volume discount pricing structure for registered business partners, federal agencies, and non-profit organizations that is different from the prices offered to the general public.

7a. What are the principal sources of funding for your database activities?

The principal sources of funding for USGS database activities are congressional appropriations, interagency cooperative agreements (other federal agencies, and state and local agencies), and joint funding arrangements for geospatial data collection, analysis, and interpretation. Reproducing and distributing copies of USGS archival information is funded by congressional appropriations for legislatively required distributions, and through fees established to recover costs associated with reproduction and distribution to all others. A mix of legislation and executive direction authorizes and requires the USGS to charge for the dissemination of information products to customers both within and outside the federal government. The USGS is required to recover the full costs associated with the reproduction and dissemination of information products.

Three fundamental concepts describe the philosophy that underlies USGS pricing policy: (1) the goal of the USGS pricing policy is to maximize the dissemination of information products to the broadest possible audience given the constraint of recovering the cost of reproduction and distribution; (2) prices should be based on costs, not on the value of the product provided; and (3) prices should pursue the full recovery of costs, including indirect costs such as depreciation of equipment.

7b. What pricing structure do you use and how do you differentiate (e.g., by product, time, format, type of customers, etc.)?

The USGS pricing structures are based on algorithms designed to track estimates of the actual costs of reproduction and distribution. Whenever possible, products are grouped by like type and are priced accordingly. Since reproduction and distribution costs are similar regardless of customer, the USGS pricing structures are applied equally. Projected targets for reimbursable revenues from the sale of USGS information products, coupled with congressional appropriations and cooperative funding, are used in developing USGS budgets.

7c. Do your revenues meet your targets/projections? Please elaborate, if possible.

The USGS has made cost recovery a priority activity for the past two years. The overall USGS FY 1998 recovery rate is 100 percent. On a product-line basis, recovery rates for several product lines are less than 95 percent. However, the USGS is taking aggressive steps to update processes, contain costs, and update prices where necessary for each of these product lines.

8a. Have you encountered problems from unduly restrictive access or use provisions pertaining to any external source databases?

No. However, the lack of adequate copyright guidance for federal agencies when publishing in the electronic era is a problem (see question 8d). As the National Biological Information Infrastructure federation is expanded to include international partners, it is anticipated that problems will arise pertaining to World Intellectual Property Organization (WIPO) issues. However, as yet the USGS has no experience with this. In addition, since it is a government agency, information in USGS possession is subject to Freedom of Information Act (FOIA) guidelines. Since anyone may make a FOIA request for information in the agency's possession, some organizations have been reluctant to pass over to the USGS their data and information for the reasons described in question 4b.

8b. What problems have you had with legal protection of your own database activities and what are some examples of harm to you or misuse of your data that you have experienced, if any?
Because USGS data are not copyrighted, the USGS identity is sometimes not carried or acknowledged on products that reproduce or use USGS data. This practice may be harmful as it could blend data from multiple sources and of different quality. Primary harm has been experienced when species have been researched, especially when the data or information produced reveals their exact location. For example, after USGS sent out a FOIA-requested release of information from a research study concerning the location of certain wolves, the animals were soon found dead.

8c. How have these problems differed according to data product, medium, or form of delivery, and how have you addressed them (e.g., using management, technology, and contractual means)?

No differences.

8d. What specific legal or policy changes would you like to see implemented to help address the problems identified above?

The problem statement is that there is no clear mechanism for guiding USGS authors with respect to copyright privileges and responsibilities. The two areas needing policy development are (1) public domain of reports in compliance with OMB Circular A-130 and (2) use of copyrighted material. Exceptions should be provided to the FOIA guidelines that would exclude the mandatory release of data and information pertaining to threatened and endangered species.

9. Do you believe the main problems/barriers/issues you have described above are representative of other similar data activities in your discipline or sector? If so, which ones? If not, what other major issues can you identify that other organizations in your area of activity face?

Yes, especially barriers that deal with difficulty in integrating data from various legacy systems. Headway is being made in these areas, as both more standards and better tools are developed for integrating data from different sources. Two specific problems are (1) lack of restrictions on FOIA guidelines and (2) potential difficulties in cultivating international partnerships due to WIPO-induced restrictions. Both of these problems will be encountered by any federal agency attempting to provide access to data and information about threatened and endangered species or attempting to partner internationally. The former problem pertains only to federal agencies. The latter problem might be encountered by all who engage in international partnerships if the WIPO were to adopt a treaty based on the E.U. Database Directive model.

General Discussion

PARTICIPANT: Can you tell us something about the financial relationship between USGS and Microsoft?

MS. RYAN: Yes; with the guidelines on entering into CRADAs—cooperative research and development agreements—with the private sector, we are starting to see more of these, not just with Microsoft. So, as pressure starts to hit the public sector for finances, I think there will be a much broader range of partnerships with the private sector. Right now, Microsoft has purchased the digital orthophoto quadrangles (DOQ) data, just like any other customer would purchase those DOQ data. That is about the only financial exchange of research.
In return for that, we had to advertise the CRADA in the Federal Register, so that any other group that wanted to do something similar had the ability to do that right up front.

PARTICIPANT: To follow that, two questions. One, how do you access the information if you don't go through Microsoft? Two, what if Netscape comes along and wants to do the same thing? Will the CRADA with Microsoft permit the USGS to enter into the same deal with someone else?

MS. RYAN: Let me just answer that first question. The DOQ data are probably our best example of information available over the Internet. For any of these other data sets, that is the challenge that we have internally. Right now we have something like 300 or 400 home pages out there. Each of these individual data sets has its own home page. So, the challenge is currently getting those together, so that when you want to focus on a place on Earth, you can get the full range of these data.

In terms of your question about another group entering into it, I think, in the life of the CRADA, they likely couldn't come in at that juncture. Their opportunity to enter into that was at the beginning when it was advertised in the Federal Register. If they wanted to come, and if it was to our benefit to spin off a different angle, then we would similarly advertise the goals, the missions, the functions for that, and enter into new CRADAs. There are actually a couple of other different partners in this CRADA with Microsoft. They wanted to get worldwide data as well as U.S. data. So, one of the goals was to use other partners for the other parts of the world, such as the Russians and their spy satellite data.

detection network, the NWS radar network, and (soon) the commercial airlines—can be accessed only after direct agreements are struck between the university and the provider. In no case are licenses or contractual agreements with Unidata required to access data, though we point recipients to a warning statement (see <http://www.unidata.ucar.edu/data/data_usage.html> for additional information), which refers to conditions placed on the data by the NWS and foreign weather services and cautions against using the data for purposes other than education and research.

6c. Do you provide preferential terms for certain categories of customers?

Yes; colleges and universities in North America, the Caribbean, and Central America have essentially unlimited, free access to Unidata software and services, including comprehensive support. Much Unidata software is freely available to anyone via the Internet, but support is not guaranteed.

7a. What pricing structure do you use and how do you differentiate (e.g., by product, time, format, type of customer, etc.)?

We do not price our products, and most are available to universities at no cost. In the one exception—radar data—the pricing structure was set by the vendor who won our competitive procurement. In an effort to minimize university costs, the evaluation criteria for our procurement included the pricing structure that would be imposed upon university recipients of the data. There are a few nonuniversity recipients of our data products. These are groups (mostly government agencies) with whom we collaborate, and such organizations can receive only a subset of the data available to our university users, in accordance with our data-access agreements. Except where prohibited by the (external) owner, Unidata software is available to anyone, and the cost is always zero.

7b. Do your revenues meet your targets/projections? Please elaborate if possible.

Unidata seeks no revenue from its products, and we meet that target exactly. The contractor who provides our radar data probably has a revenue target that is not being met. I estimate that the provider's Unidata-related revenues—the sum of our contract (about $70,000 per year) and fees from universities (about $50,000 per year)—fall short of the target by at least 50 percent.

8a. Have you encountered problems from unduly restrictive access or use provisions pertaining to any external source databases?

Though I hesitate to describe the provisions as “unduly restrictive,” it is clear that costs and redistribution constraints are limiting the educational uses of certain data we acquire. In contrast, where data can be used without restrictions, our university community has shown remarkable ingenuity in creating Web-based materials of educational value in a surprising number of fields.

8b. What problems have you had with legal protection of your own database activities and what are some examples of harm to you or misuse of your data that you have experienced, if any?

We have not sought legal protections for our database activities, and we do not think Unidata products have been misused with respect to our rights or those of our data providers. Our view notwithstanding, complaints have been raised—to the NWS and the U.S. Congress—about university use of Unidata services to create Web pages that “unfairly compete” with private-sector products here and abroad.

8c. How have these problems differed according to data product, medium, or form of delivery, and how have you addressed them (e.g., using management, technology, and contractual means)?

Where the data being conveyed are proprietary, we have helped protect providers' rights by using point-to-point delivery methods (i.e., direct from provider to university) rather than the data-sharing delivery methods we employ for most data streams. This imposes a greater computing and networking load on the provider, but allows more direct control over who receives data. For example, some providers require signed usage agreements.

Except for the above technical approach—where providers implement their own (contractual) protections—Unidata generally employs informal (managerial) mechanisms to prevent data misuse. For example, certain data from the NWS are designated by the country of origin as “not for export, except for research and education purposes.” We have, through e-mail and newsletter announcements, discouraged universities from posting these data or derived products on the Web, even though such restraint may not be legally required. This matter is under discussion.

8d. What specific legal or policy changes would you like to see implemented to help address the problems identified above?

The ideal—from a purely educational and research perspective—would be for data depicting Earth's natural systems to be available at no cost and without distribution constraints. Of course similar benefits would derive from a policy allowing unlimited use specifically for research and education, if such usage could be properly distinguished. However, educational use increasingly depends on access via the Web, and user/usage characteristics cannot be determined in this medium without a level of effort that is beyond most educational organizations.
I am unable to articulate an overarching approach that fully resolves this issue, knowing that Web-based distribution can cause monetary or other harm. However, there are clear educational and economic benefits to government policies that maximize the availability of data depicting our environment. Perhaps the law of eminent domain should apply to databases and their encryption keys. In addition, it might be sensible for governments to offer legal protections only to those database authors who guarantee access at marginal cost for uses “such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research,” as described in the current copyright law.

9. Do you believe the main problems/barriers/issues you have described above are representative of other similar data activities in your discipline or sector? If so, which ones? If not, what other major issues can you identify that other organizations in your area of activity face?

Though Unidata focuses primarily on current data, I think the problems, barriers, and issues we face are similar for retrospective databases in all of the natural sciences. In particular, the absence of common methods and metadata to handle spatial and temporal referencing—especially across databases from different disciplines—is a problem faced in all of the Earth sciences. Similarly, the tension between educational and commercial data interests exists in all disciplines. Actually, the tension may be worse in other disciplines because the global nature of atmospheric phenomena has created a culture of free and open data exchange, at least on some levels.

Issues in geoscience, broadly defined, that have not arisen in Unidata include cases where the data are politically loaded (because they reflect government activities, government inaction, threats to tourism, etc.) or where the most crucial data are unaffordable (as with Landsat, for example) or highly proprietary (as with oil-well data).

Finally, I am concerned that current efforts to strengthen database protections may damage a long history of judicial and legislative efforts to balance authors' rights to exclusive control over their creative works against users' rights to utilize the ideas contained in such works. The need for balance—as reflected, for example, in current “fair-use” legislation—derives from the “Progress” objective set forth in the Constitution: “The Congress shall have Power . . . to promote the Progress of Science and useful Arts, by Securing for Limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.” To an increasing extent, the “progress of science” is manifest as a succession of databases, each predicated on previous ones. (I note that a computer model can be encoded in a database; hence even the evolution of models may be viewed as a series of databases.) As yet this is not an issue in Unidata. However, I foresee the need for regulations and policies that foster rather than inhibit the creation of derivative databases, especially where the derivatives show creative differences from the originals.

General Discussion

MR. REICHMAN: Jerry Reichman, Vanderbilt Law School. It seems to me you already have a consortium of universities that is exchanging data for noncommercial purposes. I wonder if this model is capable of being enlarged into something much bigger and broader.
In other words, would it be workable, in your opinion, if universities did this generally with data that they generate? Would it be workable to have at least a two-tiered price structure, or term structure—one for other universities participating in the consortium and one for outside commercial people who want to take these data and do other things with them? In simple form, would a consortium system solve the problem of universities, which want to generate and need access to data, to distribute data for scientific purposes, but also to commercialize data?

MR. FULKER: I think you pose a good question. I don't know that it could be put in quite such a broad context as that. We have been motivated to avoid creating sensitivities to competition with private-sector vendors. We have been very careful to think up ways for distribution serving universities. You are proposing a different model. Quite frankly, I can't think of any reason why it wouldn't be possible.

PARTICIPANT: Can you give an example of database protection that would inhibit your ability to provide service?

MR. FULKER: The service that we provide most directly is not, I think, especially vulnerable to most of the database protection efforts. The biggest problem that we have has to do with redistribution constraints, preventing our universities from exercising the full range of educational opportunities, which have included, to a very successful extent I believe, the provision of information in the K-12 context. Instead of using our distribution system, they are turning around and putting information on the Web, making it accessible for use in the schools. The general indication from our universities is that access control is impractical in such extended contexts. I don't think there are any examples where we or universities are directly using the data for other than education or research, but there may be secondary usage via the Web which is not so constrained.
Thus I find myself alarmed by provisions that rely heavily on distinctions between educational and private uses of data. The problem concerning the World Meteorological Organization Resolution 40 is that I believe nations have a public-good responsibility to share data with other nations on an unrestricted basis. I think that is the biggest threat, and the database protections encourage it.

DR. SERAFIN: I would just like to comment on that. Were you talking about some example beyond the radar example that Dave described?

PARTICIPANT: I was just asking the general question.

DR. SERAFIN: That radar example is an interesting one. The radar data are actually provided or collected or acquired through the National Weather Service radar. The National Weather Service determined that it did not have the resources to broadly distribute those data to the community, even its own weather forecasting offices in the network. So, it went to a private-sector mechanism for doing that, actually contracted with several vendors so that there would be competition, and allowed them, through charging for those services, to distribute those data. Whenever you see, on The Weather Channel or your local weathercast, the radar picture of the country or the radar picture of your region, they are getting those data through a private-sector company, but those data originated with the National Weather Service. What we have seen is that a rather large number of universities feel that they can't afford that. Of course, they can turn on The Weather Channel in their departments and see some of it there. They may not have some of the same tailored products that they would prefer.

The next speaker is Bob Brammer. Bob, a long-time colleague of mine, is the vice president and chief technology officer for TASC.

Commercial Data Activity

Robert Brammer, TASC

Response to Committee Questions

1a. What is the primary purpose of your organization?

TASC is a diversified information systems integration corporation. Our customers are both government and commercial organizations, primarily in the United States but with a growing international segment. For the purposes of this NRC workshop, we will focus on TASC's information businesses in weather and agriculture. These operating entities are organized into TASC subsidiaries—the WSI Corporation (weather) and Emerge, Inc. (agriculture). While these do not form the majority of TASC's revenues, they are significant parts of our business. WSI recently had its twentieth anniversary, while Emerge is a recently formed start-up.

1b. What are the main incentives for your database activities (both economic and other)?

As a commercial for-profit business and a subsidiary of a publicly traded firm (Litton Industries), TASC obviously expects its business units to be growing and profitable, according to approved business plans. In addition, TASC believes that these information businesses are strong strategic fits with the information technology focus of TASC's overall business and have excellent growth potential over the next several years.

2a. What are your data sources and how do you obtain data from them?

The WSI Corporation is primarily a real-time business. We receive our information via several digital communication networks from a variety of sources, both government and commercial. Our primary supplier is the U.S. National Weather Service (Family of Services). We also downlink information directly from both U.S. and international weather satellites. Additionally, we receive information from a variety of other government agencies and private organizations under many types of terms and conditions. The information from these sources is integrated and processed in many ways to create a variety of information products. Conceptually, this model has not materially changed in the past five years, although we have a significantly more diverse database today than we had five years ago. We expect that this model will still be relevant in the next five years, although we will likely have a much broader range of commercial data sources than we have today.

In our Emerge agricultural information unit, the primary data sources are aircraft multispectral remote-sensing systems. We lease aircraft and host our uniquely designed scanners on these aircraft and fly surveys under contract from various agribusiness organizations. The data are sent back to our central computing facility for processing to create value-added information products. These products are transmitted to our clients. In the course of doing these surveys, we also use data from the Global Positioning System for precise navigation and data from our clients concerning their agricultural operations. Since our Emerge unit is new, we don't have five years of history or a strong basis for future prediction. However, we anticipate rapid growth in data sources as the business builds.

2b. What barriers do you encounter in getting these data and integrating them, and how do you deal with those barriers?
The primary barriers are the technology issues and associated costs of implementing data communication networks, satellite downlink stations, aircraft remote-sensing systems, etc. Obviously, we deal with those challenges with a mix of staff expertise and technology. Occasionally in the weather aspects of our business there are political barriers to receiving data from international organizations. We work cooperatively with the U.S. National Weather Service in those areas.

3. What are the main cost drivers of your database operations?

The main cost drivers are the costs of the skilled labor required to preprocess and quality-assure the incoming data, to operate the information systems, and to respond to customer questions and requests. The associated hardware, software, and networking technology are also significant budget items.

4a. Describe the main products you distribute/sell.

For the weather information part of our business, we have a variety of workstation products and weather information products that are addressed to our various markets. The primary markets are the news media (network and cable television), aviation, energy and power, and agribusiness. Our agricultural information services are targeted at large growers. (These are described in further detail at our Web sites; see <www.wsicorp.com> and <www.emerge.wsicorp.com>.)

WSI Weather Information Products and Systems

Weather Radar Products

NOWrad® mosaic radar imagery providing local, regional, and national coverage with 5- and 15-minute updates.

Unaltered single-site NEXRAD imagery 4-tilt base reflectivity. Composite reflectivity. 3-Layer composite reflectivity. Echo tops. Velocity azimuth display winds. Vertically integrated liquid. 4-Tilt radial velocity. 2-Tilt mean storm relative velocity maps. Increased radar sensitivity for better coverage and definition of precipitation. One- and three-hour storm accumulation. Total storm precipitation. Hourly digital rainfall array. Free text message. Product updates: 10 minutes in clear air mode, 6 minutes in precipitation mode, 5 minutes when local severe weather is detected.

Enhanced NEXRAD mosaic imagery. Complete reflectivity. 3-Layer composite reflectivity. Echo tops. Vertically integrated liquid. Constant Altitude Planned Position Indicator Winds. Enhanced velocity azimuth display winds: Contoured echo tops.

Radar summary. Regional and national coverage. Combines NOWrad radar mosaics with NEXRAD storm information—including storm-cell movement, echo top heights, hail, mesocyclone, tornadic vortex signatures, and severe weather watch boxes. Simultaneous viewing of multiple radar sites in a single image. Automatic suppression of most false echoes. 15-Minute updates via dial-up or via satellite delivery on WSI's HCSN.

Winter storm mosaic regional, national coverage. 15-Minute updates via dial-up or satellite delivery on WSI's HCSN. Color-coded NOWrad mosaic radar indicates precipitation type: rain, snow, mixed. Automatic suppression of most false echoes. Simultaneous viewing of multiple radar sites in a single image.

PRECIP rainfall estimates regional and national coverage. NOWrad mosaic radar interpreted into quantitative precipitation amounts. Cumulative totals appear in color-contoured bands.
Real-time hourly estimates available by dial-up or via satellite delivery on WSI' s HCSN. Climatic summaries: daily, weekly, monthly, seasonally, and yearly. Meteorological Satellite Image Products WSI provides worldwide satellite imagery with 100% global coverage, including the U.S. GOES and NOAA Polar Orbiters, Japan's GMS and Europe's Meteosat. Imagery included infrared, visible, water vapor, thresholded, and full spectrum. Alphanumeric Data Raw data, decoded or plain language observations, severe weather, forecasts, technical discussions, numerical model output data, weather summaries, calculations, and conversions. Access to National Weather Service, domestic, public and international data plus FAA 604 circuit. Data available within seconds of receipt from NWS and available by dial-up or via satellite delivery on WSI's HCSN. DIFAX Operational weather charts with timely, frequent updates. AVcharts ™ for aviation professionals, weather charts for professionals and enthusiasts. Uses high-resolution forecast model data service gridded model data from the following models: Aviation spectral, Nested Grid, European Center for Meteorological Weather forecasting, Medium Range Forecast, Rapid update cycle and ETA. Timely delivery via satellite delivery on WSI's HCSN. Raw data available hours before NWS DIFAX charts. DATAsuite DATAsuite incorporates all of WSI's data and value-added products into one offering with the added advantage of including all future data products still in development during the life of a customer's contract. DATAsuite includes unlimited domestic and

international satellite imagery, and the NOWrad® family of radar products: winter storm mosaics, radar summary, STORMcast®, and PRECIP™ rainfall mosaics. Also, unlimited NEXRAD single-site products from all WSR-88D sites, HRS Forecast Model data, DIFAX and SUPERfax™ charts, NWS text products, our complete family of on-air WEATHERcharts™, and more.

STORMcast®: Weather information for the media market.
STORMcast® automatically locates, tracks, and forecasts intense storms as they bear down on a station's area. Images showing storm cell position, movement, and intensity are updated and sent over a dedicated network within two minutes of the WSI radar scan. Severe storm tracking and path projection are depicted in clear, crisp icons with smooth, visually appealing technical radar echoes.

WEATHERcast: Forecasting information for the media.
With this software package for WEATHERproducer, broadcasters now have access to ready-made, on-air graphical products together with meteorological tools that actually illustrate what their viewers want most: future weather conditions, automatically. Embedded intelligence puts WSI data and reliable, science-based tools in the hands of the meteorologist. The latest projections, detailed graphics, and proven computer modeling from WEATHERcast create graphical forecasts that help viewers peer into the future. They can watch as their weather week emerges: sun and cloud casts, temperature, rain or snow, fog, thunderstorm, and severe weather forecasting.

WEATHERproducer: Media.
WEATHERproducer, the totally integrated data-to-graphics workstation from WSI, builds ratings by delivering more of what broadcasters want (the forecast) automatically. As a single, integrated workstation, WEATHERproducer appeals to the science-driven meteorologist and the audience-driven station management.

WEATHERworkstation for Aviation is a monitoring and alerting system designed for operations where weather plays a critical role in safety and profit and loss. Briefings can be tailored to a user's specific needs.

WEATHERworkstation for Industry is a one-of-a-kind weather monitoring, alerting, and forecasting system designed for strategic and tactical industry applications. Markets include utilities, transportation, geology, construction, agriculture, travel, insurance, education, and entertainment.

Internet Services
Advertiser-sponsored consumer-oriented Web site, Intellicast (see <www.intellicast.com>), as well as a subscription service for energy companies, EnergyCast (see <www.energycast.wsicorp.com>).

Services
Services include round-the-clock customer and technological service. Customers can talk to WSI meteorologists to consult on weather or reporting anomalies or reach a systems expert for tech support. Service also includes a full range of specialties, such as consulting, design, animation, programming, and forecasting services.

Emerge Agricultural Information Products
Emerge is a comprehensive precision agricultural information service that provides real-time site-specific data to subscribers. Emerge products assist in detecting crop variability, determining possible causes, and deciding what
remedial actions could be taken, if necessary. Emerge gives growers a complete informational view of a farm or agricultural operation, with access 24 hours a day, 7 days a week. The Emerge service includes information products such as:

Detailed infrared imagery and enhanced vegetation maps, enabling detection and measurement of areas of variability.
Critical weather data, forecasts, and agricultural weather alerts, at both a regional and field-specific level. These include such parameters as growing degree days, evapotranspiration, local inversions, and other information essential for crop management.
Complete management of and access to important field data, such as yield maps, soil tests, and field inputs.
Pest and disease alerts based on the exact weather conditions on designated fields.
Crop yield modeling software, predicting potential yields based on specific seed, soil, and other inputs.
EmergeView™ mapping workstation software for information display and analysis.
Information access through a customized and secure Internet site.
Ongoing field-level support and assistance.

4b. What are the main issues in developing those products?

The main issues are ensuring that our products are focused on the specific applications that our customers require, that our implementations are better than the competition's, and that we deal effectively and cost-effectively with the various technology issues associated with these developments.

4c. Are you the only source for all or some of your data products? If not, please describe the competition you have for your data products and services.

WSI is the largest of the providers of real-time weather information. However, there are competitors in the various segments of the weather information business. In the United States these competitors tend to be small, privately held firms that focus their expertise and competitive products on specific market segments. Internationally, to the extent that there are competing services, these are generally provided by the different countries' national weather services. Our aircraft remote sensing information service for agriculture is a relatively new business, and it does not yet have direct competitors providing similar services.

5a. What methods/formats do you use in disseminating your products?

Our products are transmitted through various private and public data communications networks. For the weather information part of our business, we make heavy use of satellite broadcasting services from various satellite providers. For the agricultural unit, much of our information is distributed on a subscription basis through the Internet. We also use the Internet for our weather information business. Additionally, there are various private networks that some of our customers use to obtain our information products.

5b. What are the most significant problems you confront in disseminating your data?

There are many operational problems in dealing with a variety of telecommunications providers. Variations in quality of service and reliability are significant and expensive issues. The Internet is also a somewhat uncertain medium.
6a. Who are your principal customers (categories/types)?

Television meteorologists, major airlines, air freight companies, electric power utilities, and major agribusiness firms are our principal customers. Some federal, state, and local government agencies are also important customers.

6b. What terms and conditions do you place on access to and use of your data?

Generally, a monthly subscription fee provides access to a defined broadcast stream. Dial-up connections are also available on a connect-time fee basis. Licenses for specialized user software and redistribution rights are also established. There are also statements about the advisory nature of the forecasting services and certain limitations of liability. There are also advertising fees, since some of our Internet services are advertiser-sponsored.

6c. Do you provide differential terms for certain categories of customers?

Yes; distinctions on resolution (spatial and spectral) and timeliness are commonly used differentiators. Variations in user software functionality and in redistribution rights are also used.

7a. What are the principal sources for funding for your database activity?

These are commercial businesses. The funds for the database activities come from the revenues from selling the products on commercial terms.

7b. What pricing structure do you use and how do you differentiate (e.g., by product, time, format, type of customer, etc.)?

As noted in the response to question 6b, most of our revenue is derived from subscriptions. The customers sign a contract for a period of time (generally a year) and pay monthly for the information that we provide. Product differentiation is done by all of the methods named in the question above. Products can be differentiated by resolution (spatial or spectral), by timeliness (minutes are very significant in some applications), or by type of customer (we differentiate by functionality and by data volume). Additional revenues are derived from the sale of workstation systems and/or local area networks that receive our information products. In some cases we provide integration services to connect our systems with customer operations.

7c. Do your revenues meet your targets/projections? Please elaborate, if possible.

In general, we meet our business plan objectives. If there were to be significant deviations from plans, we would make necessary changes. We do not report revenues at the subsidiary level.

8a. Have you encountered problems from unduly restrictive access or use provisions pertaining to any external source databases?

In general, within the United States we can get the information needed on a commercial basis, if we feel that there is sufficient market demand. Until recently the commercial terms from many national weather services were far too expensive for us to obtain data from them on a profitable basis. However, we are now seeing some very large price reductions, due to commercialization efforts in some countries, that are changing this situation significantly. These changes, if sustained, may do much to stimulate weather information services internationally.
8b. What problems have you had with legal protection of your own database activities and what are some examples of harm to you or misuse of your data that you have experienced, if any?

We have had some instances of unauthorized copying or redistribution of data. Although this has not yet been a major problem in our businesses, there are enough instances that we have to devote some staff time to reviewing reports of misuse. Certainly, there is a risk that such problems could grow. For example, we have seen some of our image products (e.g., weather radar images) used in promotional material without attribution despite the clear presence of copyright statements on these image products. We have called the offending organizations to attempt to resolve these issues, with varying degrees of success. Almost surely, there are incidents like this that we never hear about. To some extent, there is loss of revenue and profit from this type of misuse. We do not feel that this has yet been material in our business, but we certainly will continue to monitor it within our resources.

8c. How have these problems differed according to data product, medium, or form of delivery, and how have you addressed them (e.g., using management, technology, and contractual means)?

Much of our revenue and profit derives from image and graphic products. In recent years, we have marked all these products with copyright statements. We believe that this has helped inhibit some misuse. The real-time nature of much of our information business is also a partial inhibitor to redistributors. The delays involved in redistribution would limit the value of this type of unauthorized use. We use the various methods of intellectual property protection, including trademarks, trade secrets, and copyrights. Our contracts specify the rights of the customer for redistribution. In some cases redistribution is the intent of the agreement, and there are specific measures detailing how such redistribution is to be done and what limits are placed on it.

We have done some experimenting with “watermark” technical approaches to inhibit unauthorized copying or redistribution. Subtle signatures can be placed into image, graphics, or other types of information products to demonstrate authorship. These encrypted signatures can be placed into the data without being apparent to uninformed users. We are currently investigating the operational implications of such techniques before placing them under full-scale development and implementation.

We also use logging and reporting techniques to see who is using our Internet sites. In some cases, we have found apparent program-automated accesses that indicate likely retrieval and storage of some of our data. We are able to track the users and to investigate their usage. Generally, we can limit this type of access with today's technology. This may be more difficult in the future, depending on technical developments in computer security.

8d. What specific legal or policy changes would you like to see implemented to help address the problems addressed above?

In the United States there are already applicable policies and laws governing our types of products and services. In particular, it seems clear that our image and graphic products are protected under copyright. In some cases, better enforcement might help. Further legislation does not appear necessary, although consistency in court rulings on what types of information can be copyrighted would be of benefit to the information industry. Internationally, however, there are certain countries in which stronger local laws and enforcement would definitely be an improvement. The lack of a uniform legal framework is an inhibitor to certain types of information businesses in these countries. As a company with
growing international markets, we would like to see uniformity in international laws for intellectual property.

9. Do you believe that the main problems/barriers/issues you have described above are representative of other similar data activities in your discipline or sector? If so, which ones? If not, what other major issues can you identify that other organizations in your area of activity face?

The problems that we face are representative of those faced by similar data activities elsewhere. The strict time-limit requirements of much of our business limit some of the unauthorized copying and redistribution that other types of information businesses may face. Furthermore, image and graphic products are somewhat easier to protect under copyright than archival text databases. These are not the reasons that we are focused primarily on real-time information services, but that aspect does provide some measure of protection.

General Discussion

PARTICIPANT: You mentioned that some of your sales go back to government agencies. What, if any, restrictions are placed on the redistribution of or open access to those data sets that go back to government organizations?

DR. BRAMMER: Generally, the government agencies contract for them for their own use. As for redistribution, we come to an agreement in the contract for those services and how they're used.

PARTICIPANT: Could one access it under the Freedom of Information Act?

DR. BRAMMER: That really hasn't come up. One of the advantages of that part of our business is that those are real-time products for the most part. The unauthorized redistribution has not, at least to date, been a real problem for us. Occasionally we see some of our image products on the covers of publications, maybe an image product from a hurricane or some other special event. We copyright all of these image products, and we believe that these copyrights are viable. Occasionally they are violated. So, it hasn't been a big loss in revenue, but we do see it once in a while. As far as I am aware, we haven't had a Freedom of Information Act occurrence with our customers.

DR. SERAFIN: I was reminded by Barbara Ryan earlier that we have been looking at four different disciplinary types of databases. Within each of these disciplines we have heard that there are distributed, diverse data sets whose combination or integration can result in rather significant scientific advances. She also pointed out, and I think this is important, that there are also benefits to be gained, and perhaps even greater benefits, by going across those disciplines, and the four that we talked about this morning are only four. There are many others that would be valid and worthwhile to cut across. We are using these today, I think, as our examples of databases and how they might be used. By no means do we have an exhaustive list before us.