9

Commercial-Sector Data

DR. SERAFIN: I am going to pose five questions (see Box 7.1 in chapter 7). Each of the panelists will give their perspective or comments regarding each question, and then we will have a general discussion of those questions. The panelists are Leslie Singer, Institute for Scientific Information; Myra Williams, Molecular Applications Group; Barry Glick, GeoSystems Global Corporation (retired); and Robert Brammer, TASC. Our rapporteur is Mark Stefik from Xerox Palo Alto Research Center, and I am sitting in for Martha Williams as moderator of this panel. So let me begin with the first question, which is to identify and discuss the principal benefits and opportunities to your database production or dissemination activities from the current legal policy regime. Try to rank them in order of importance.

MS. SINGER: We use both copyright and licenses together and issue cease and desist letters if we find unauthorized uses of our databases. Having said that, we have always believed that the license that is part of our contract is our main protection, and we put a lot of thought and effort into the contracts that we put out in the marketplace. We spend a lot of time and effort in the negotiations with our customers, especially on the consortium level, which are quite intense. In today's business environment that is our primary focus, and we use that extensively. If people don't adhere to our contracts in certain ways, then the leverage we do have is just to stop shipping content, and that has worked for us in places where licensing is taken seriously.

DR. WILLIAMS: We are in a very similar situation, in that our real protection in today's environment comes from our licenses, as well as from the proprietary technology that generates the information for our databases. I don't worry that much about inappropriate uses. Yes, certainly it happens on occasion, but it has not been a major problem for us.

The thing that has been of greater concern for us, and that is very valuable about the current public policy, is that we do have access to information, and we use that information judiciously. If we are extracting only a small portion of public information to feed into our collection systems, then there is no need to have any kind of negotiation in most cases. In some cases we do pay even for the initial access.

If we would like to work with someone to include a substantial portion of their data, then we will negotiate specifically with that individual or the institution. It is the concept today of fair use (even though no one has qualified what fair use means) that has been very important to us as a company.

Some of the technology that we utilize goes out dynamically to the World Wide Web, identifies information, and brings it back and holds it. Even if it doesn't put it in a permanent database, it holds it in cache or the information is stored in flat files. However, it could be going into a permanent database. We are free to do that under the current regulations. This capability is very important to us because it means that the scientists that use our software always will be able either to use the information that is in the database or go out and update it with the most current information. But it does mean that they are extracting information from a wide number of sources over the Web to do that.

I think that those are the most important things—having access to information and being able to use that information freely by adding value to it in the creation of derivative databases.



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.

PROCEEDINGS OF THE WORKSHOP ON PROMOTING ACCESS TO SCIENTIFIC AND TECHNICAL DATA FOR THE PUBLIC INTEREST: AN ASSESSMENT OF POLICY OPTIONS

Currently we can do this without having any legal concern that we are depriving someone else of subsequent revenue for which they have no defined plan on that specific day.

MR. GLICK: I have had a very similar experience. If you include in the current legal and policy regime the whole public-domain issue—the U.S. perspective on public domain that was discussed in the last plenary session—that clearly would be of the highest importance to us. The fact that in the United States we have the right to use public-domain data without any cost at all to us and without any restrictions is essential.

For protection we also rely primarily on licensing. I note, and there are some people who know a lot more about this than I do, that one of the legal protections that has been discussed here is shrink-wrap licensing. I note that this is under attack and may not hold water, the whole concept of implied license when someone tears a shrink-wrap. The kind of licenses that we have on our Web site—which are basically that you agree, by using our Web site, to our license terms, which include our third-party license terms—I doubt that that would hold water either. So, the existing protections are very weak in terms of the benefits they provide to the company.

In our industry there is a kind of ad hoc assumption of what fair use is. For example, we can use multiple copyrighted materials to compile from as long as we don't rely on a single one, and we have got confirmation from a number of them; that is something we rely on and that is important. I don't know if that is codified anywhere, but that seems to be generally accepted practice.

DR. BRAMMER: I think I can make a couple of distinctions.
First, on the supply side of our weather information in the United States, I would say that the principal benefit is the policy that the National Weather Service has had of providing a defined external interface that allows private corporations access to information in a defined way. It allows the Weather Service freedom to alter their own operations without altering the interface and allows us to get data in a predictable way, and then to add value and disseminate them. We pay back the cost of that. The Weather Service is providing us the information, but then we are free to use it for our commercial businesses. Occasionally there are problems in deciding exactly what the government is going to do and what the private sector is going to do, but by and large those are issues we can resolve.

On the customer side of our business, there we have a variety of terms in which we can license the information to a secondary distributor under certain terms and conditions, or it can be sold or licensed directly to an end user. In either case those mechanisms seem adequate. At least we don't see widespread misappropriation or improper redistribution of our information. I think part of that is due to the fact that most of the data are real-time information, but mostly I think we are just dealing with real commercial businesses that want to know the terms and conditions under which they are operating and then they negotiate a satisfactory price for that. So by and large this has worked well for us.

If we look at this internationally it is a lot less clear to me. Most of our revenues are from domestic U.S. sales. So a lot of this is relatively new territory for us. The various international weather services don't have the same sort of mechanisms as our National Weather Service, and there is a definite ambivalence on providing access to some of their information on reasonable terms. In some cases the prices are so high that purchasing it would not be commercially viable. That seems to be changing, but it is very recent. We also have problems with redistribution. We have some international customers, and as far as I can tell they are not doing anything improper or violating our license agreements. Again, that is relatively recent and a relatively small part of our revenue. So there are still question marks there.

DR. SERAFIN: We have heard from all of our panelists, so the floor is now open for comments and questions from any of the people in the room.

DR. FISCHER: I have a question for Myra Williams. You mentioned that you get a lot of data on the Web without any problems or restrictions. Do you think some of these data are from commercial publishers who normally sell their data under other conditions?

DR. WILLIAMS: We negotiate licenses to acquire any information that is provided by commercial sources.

DR. FISCHER: So the data you are acquiring are from sites that make them freely available?

DR. WILLIAMS: That is correct. There are some academic sites that require commercial institutions to have a license—we pay for access to those sites. One thing that may not have been clear is that when I said that our software brings in information dynamically and caches some of it, it is actually caching the Web pages. What does that really mean from a legal standpoint? Scientists don't just want our analysis. They want to know what the source of the information is and they would like to see the original data. They can drill down and get access to that information now on those sources that we utilize. That is why you are hearing so much agony today over the privatization of SWISS-PROT, because it is not clear that we are going to be able to continue to form derivative databases using SWISS-PROT as one of many sources. Even though we have a license for the use of SWISS-PROT, they would say that all of our customers would have to have licenses if they are going to cache SWISS-PROT data internally.

MR. UHLIR: I would like to ask all the panelists if they believe that current technical protection measures, particularly online, have been adequate or if they have problems with unauthorized access and use, or theft of data online?

MS. SINGER: Before we went to an Internet product we did a lot of research and a pilot project in which we used watermarking, encryption, and a lot of other technologies. We went to five or six commercial and academic institutions to actually install what we had put together. This was technology admittedly from three to four years ago, but the technology was too heavy for the commercial environment and our customers just blatantly said that they would not buy a commercial product with this type of technology. We were very concerned because a lot of our content comes from primary publishers and, in fact, we had a cooperative arrangement with them at that time for this pilot to actually deliver their full text as well as our bibliographic database. We were very cognizant of the primary publishers in that we were doing a small pilot delivering full text, and that really made us go the extra yard. After that, we came out with a commercial database product that does not have any encryption or watermarking. The intellectual property is password protected. The primary publishers followed us in not using that technology as well. I don't know if great advances have been made, but at that point in time the technology was just too heavy.

MR. GLICK: On the one hand, the technological measures are not adequate to protect the databases, but on the other hand, it almost doesn't matter very much because we don't allow direct access to our database through our public Web sites, only to derived products. We are comfortable that the process of reverse engineering back to the database from the derived products would be extremely difficult. That is the protection that we use.

DR. BRAMMER: I would agree with Barry Glick regarding the database itself, although I took the question to mean are we satisfied with the protection of the product and not just the database. If you had asked this question a couple of years ago I would have said, “Yes, they are adequate for distributing most of our information via satellite broadcast with encoding and decoding at the user end.” It could have been broken, but I am not sure the cost would have been worth it to someone to do that. What we have seen recently, though, I find a little disturbing. On our advertiser-sponsored Web site, we have had a few indications of people setting up production operations to download some of the image products. We see the same individual hitting our Web site and hitting particular products repeatedly, and in some cases we have actually tracked them down. At least we can trace it adequately and we have gotten them to stop, but this is fairly recent. I am not sure where this is going. It has not been a big fraction of our revenue. I don't think we have lost a lot of money, but it is raising some alarm bells that there may be something coming, so we are going to pay more attention to it.

I agree with Leslie Singer's comments about the watermarking and other types of technology as being overkill. Frankly, we can tell whether someone is using our products or images just by particular features. We don't even need anything as subtle as a watermark, and in fact, we have seen some inappropriate use of some of our products by government agencies. So we would like that additional protection. We have spent a lot of money putting one of our images on the cover of some publication, so they should provide proper credit for that.

MR. LEAVITT: You mentioned end user and further distribution, and my understanding then is that you are licensing to any use that an end user puts to it, but what constitutes an end user rather than a penultimate? How do you handle that?

DR. BRAMMER: The user signs an agreement for the type of use of the information. There may be cases that we just don't know about, but generally we don't see any indication of widespread misuse of the information.

MR. LEAVITT: Let me clarify the point I was trying to make. I admit that I probably didn't state it very well. What I am thinking is that someone gets data from you and there are many things they can do with these data. They can take the data and reformat them or create another product from the data. Then the reason that the user wants your data is to create this secondary product, to use them to work within his own operation, say for farming or something like that. The question is, What do you consider legal end-user use of your data? Do they have to be consumed at the user site or in the user operation?

DR. BRAMMER: No, not necessarily. For example, we supply information to companies like Bloomberg or Reuters, which then redistribute the information on a very large scale. There is an agreement signed that talks about the price estimate of doing data redistribution. It is the intent of the agreement that they will redistribute. In other cases, the intent of the agreement is that they are going to use it only for their own internal purposes—maybe integrate it into their own flight operations, if it is an aviation organization or something like that. We hold our sales reps responsible for knowing at least in general terms what the customer is doing with the information.

DR. SERAFIN: I want to go to the second question now. This one has to do with the problems and challenges to your database activities that current legal policy presents to you.

DR. WILLIAMS: I will lead off on this one. We have already discussed it today. The privatization of government-funded research data is the major issue. Most of the problems we face currently involve European databases. We have already talked about SWISS-PROT. There is another database called SCOP, which is a structural classification of proteins database. It is becoming increasingly difficult to utilize information from SCOP to create derivative databases.

Thus, although the situation is still controllable to a certain degree now, it is of growing concern as we look toward the future.

One of the other things that needs clarification as we move forward is that academic scientists increasingly are involved in commercial activities. They are funded in their work by grants, by government contracts, and by cooperative research and development agreements (CRADAs). Each of these funding mechanisms comes with very different stipulations about what they can do with the data. We heard today with great clarity that if your research is funded by a grant and you publish the information, the data themselves can be freely used by anyone. The copyright only covers the publication, not the data. Evidently, that is not the case if it is funded by contract or if it is developed under a CRADA. The universities themselves lack clarity on these issues, which is why it takes us so long to negotiate with them about exactly what is the appropriate way to obtain access to their information. As these scientists increasingly are getting involved in their own commercial companies and using the work that was funded by the government as the foundation for the initial products of those companies, I think it is going to be a greater issue.

DR. BRAMMER: You and a couple of the other people this morning talked about privatization—and I don't know anything about the SWISS-PROT database—as if it is necessarily a bad thing. From my point of view, however, privatization is a good thing. I would much rather deal with an organization that is trying to operate a business on a commercial basis.
Our experience with trying to get weather or environmental information out of government agencies internationally, when they were not set up to provide the information on a regular operational commercial basis, has been that the costs were prohibitive. They didn't have a delivery mechanism that was commercially viable. Now that there is some privatization, we are beginning to see some reality setting in and costs coming down and both operations and access improving. So from our viewpoint privatization is a good thing.

DR. WILLIAMS: Yes, as long as it can be balanced with some type of assurance of access, because the problem that we were discussing this morning is that scientists need to utilize information from numerous different sources in their analyses. They may collect 2,000 different bits of data and then use statistics to determine which data are statistically relevant for some particular prediction. They bring in vast quantities of information and store large quantities of data. One major pharmaceutical company said that they will have 20 terabytes of information stored by the end of this year—some data they are bringing in from the Web and some they are generating internally. So the concern is that, yes you will improve quality, but you will also at the same time run the risk of reducing that kind of dynamic access.

DR. SERAFIN: I did not see an inconsistency in your positions. Myra Williams stated that data that have previously been fully open and accessible and useful to them were now being privatized and restrictions were being placed on them. Robert Brammer's example was that there are certain databases that are not easy to get at. Government is not making them available, and if they try to, the costs are prohibitive. A private-sector source might provide a better avenue, and I think anyone could agree with both of these opinions.

DR. BRAMMER: But isn't it in the interest of anyone privatizing a database to promote access to it?

MR. GLICK: Not if it is a monopoly.

DR. WILLIAMS: Chris Overton said this morning that he was thinking about preventing commercial access to his data. If all academics prevent commercial access to their data, then those companies that have actually added great value through derivative use of the data and have benefited from such access will be handicapped. It will just cascade. There is a limit on how much companies, even companies as large as Merck, can pay to bring in information from 500 different sources that are academic sources.

MS. SINGER: I guess I will switch to a completely different tack. Because we are a global company and our sales are around the world, it is very important for us to have a level playing field. It would greatly benefit us if there could be some type of compromise between the European Community and the United States that both entities could work out. It also is the other nations around the world that don't have really very mature attitudes toward copyright. If we could get an international agreement, maybe we could sell in places where we don't presently sell. So to us, this is the major issue.

DR. BRAMMER: Consistency in international law would be very helpful.

MR. GLICK: I want to echo that as well, because we also have customers around the world and sources around the world, too. I think the number one problem and challenge is just uncertainty about the law. We have heard some of it here. What is covered under copyright? What kind of licenses really work? We just don't know, and our lawyers don't know. We have to spend more time negotiating because our vendors don't know either, and that means more time with lawyers; and that is not the most productive use of our resources in general. Dealing with non-U.S. governments is a very difficult situation. For example, the Canadian Census Bureau has given a sole-source contract to a commercial company in Canada, which is the only entity you can negotiate with for access, and as you can imagine they feel they have a lot of leverage in negotiations, which makes it difficult.

MR. LEAVITT: I want to clarify something on that.
It seems to me that if a database is privatized for the purpose of the entity creating a product, which a company then has exclusive control over, that is quite a different thing than if you are talking about privatizing a database for the purpose of marketing the database because the model is going to be totally different.

DR. BRAMMER: Yes, I understood that to be the response. You have plenty of examples of both cases.

DR. SERAFIN: I am going to go to the third question. What specific conduct on the part of others most adversely impacts your organization's database activities? In answering this question consider the impacts on your data activities caused by other database producers, data product disseminators, and data users.

MS. SINGER: I will start with database producers. I guess it is a normal competitive environment. We are in competition with other database producers, and we are also in competition with the National Library of Medicine (NLM), and I will echo what I said initially. It would be grand for us to have a more definitive idea of what the government's role is in information gathering and information dissemination so that we know where to invest our dollars and we know how to play in the commercial marketplace and what the role of the government is in that marketplace.

DR. BRAMMER: I absolutely agree with that. That is also true in our business. Getting a consistent statement about what the government is and is not going to do would be very helpful. Then you would know what your environment is and you can adjust your business accordingly, but when it is inconsistent or varies from either one part of the country to another or one year to the next, that is when problems come up.

MR. GLICK: Earlier I mentioned a company that is rapidly developing a monopoly position for navigable databases. What they have done is use that database to basically carve out a reserve market for themselves, which is the automobile market, and basically they have said, “We are not going to license data to anyone who wants to be in that business, but we will let you use it for other things.” So this is essentially the exercise of monopoly power, and the cost of reproducing that database would be so enormous that no one else would possibly do that.

DR. WILLIAMS: But they produced it with private funds.

MR. GLICK: Yes, totally private. However, the underlying data came from the government.

DR. WILLIAMS: But the underlying data are still accessible. Anyone who has the wisdom to identify something where they can actually carve out a niche can do so. We look for opportunities to create important derivative databases, but all our competitors still have access to the underlying data.

MR. GLICK: The interesting situation there is that of course they can maintain a high price level, which they need to do to recover that investment. It would be an irrational decision for any other business to go into that because then there would be competition and prices would come down, and no one could recover. So it almost has to be a monopoly or a government type of activity.

DR. BRAMMER: But that wasn't a monopoly created by government fiat. That was a business growing its business and carving out a good position based on its decisions. So more power to them.

MR. GLICK: Again, I will go back to the TV guide situation in Northern Ireland in the Magill case. If the information is withheld, if, for example, there is some public good application like an emergency response system that needs these data, and they withhold that, then there may be an issue there. I think someone said that antitrust law generally hasn't been applied to databases in the United States, but it has been applied in Europe. This may be something that is going to have to be looked at at some point. I don't think this company is going to do that.
They are going to make the data available, but they do have the ability to prevent it.

DR. WILLIAMS: There is an interesting situation in the genomics area involving Incyte, a company that has spent hundreds of millions of dollars sequencing human DNA. Incyte charges a substantial annual license fee for these sequences. Celera is now coming in many years after Incyte claiming that they will do the same thing, but that ultimately their sequences will go into the public domain. The one certain conclusion is that the price for accessing these sequences will be driven down. It is an interesting challenge to think of how Perkin-Elmer, the major investor in Celera, will ever recover that sort of investment because it will cost them hundreds of millions of dollars to generate the data.

MR. GLICK: If, for example, the Maptech database was available on the Web and someone could, through some process, download all that value-added information quite easily and add it to the government source, they would be able to recover their investment at a much lower cost. This would create an unfair playing field advantage for the second comer who would be taking advantage of that work done by the pioneer.

DR. WILLIAMS: The other problem that we face is that some of our competitors, particularly those in other countries, still have substantial government funding. Being a start-up company that lives off venture capital funding and private investment makes a very different sort of playing field for us in what we can afford to charge for our products and the kind of access we can provide. If, for example, I could have exclusive rights to commercialize all the work done by the National Center for Biotechnology Information (NCBI), we would dominate the world in bioinformatics. There is a company in Germany that is getting exclusive rights from the European Molecular Biology Laboratory (EMBL) to commercialize some of the work that is generated by the EMBL. As a result, they have enormous leverage.

DR. BRAMMER: Why would the government grant an exclusive agreement?

DR. WILLIAMS: To encourage the growth of that industry in Europe.

DR. BRAMMER: By granting an exclusive agreement?

DR. WILLIAMS: Yes. It is a German company, and they want to see that company thrive.

MR. BAND: I have many clients who are in both the financial services area and the technology area. A lot of them have had problems similar to the ones that have been identified by Barry Glick in terms of monopoly pricing. This is certainly the case in the financial area because of the stock market. They are obviously monopolies controlling that information and there have already been increasing prices, increasing during the time over which they exercise control over their information. More database protection will lead to more of this type of protection, and all those costs will have to be passed on. It allows the monopolists to decide whether they will license the information or whether they will choose to keep the information and add value on their own, again causing ripple effects and diminishing competition downstream.

MR. MAURER: I just want to make a comment about monopolies. You mentioned that a second entrant could come in and then both of them would lose their shirts. However, the classical theory says that a dual-operator solution helps the consumer some but still maintains an elevated price. It is only in the market where only one investor can make the database and the whole revenue stream from that market is needed to pay off that first database that you have a natural monopoly solution. In any other situation, more and more entrants will come in and discipline that market.
Of course, when you have an infinite number of people who can be supported, then it gets down to the market solution. So the first point that I want to make is that you have to have an accident where the database creation cost is an appreciable fraction of all the revenues that will ever come from that market (and automobile navigation may be a good example of that) for this monopoly problem to ever happen, even if the initial database is very expensive to create. The second point is that if you do have such a market, and you do have a natural monopolist, how do you fix it? Presumably we want the monopolist if the other choice is that we don't have the database at all. Laura Tyson called this the niche market problem. You need to have some sense of how often we get into the situation where there are natural monopolies. I think that would be an interesting thing for people to comment on. And second, when you do have that situation, what are we going to do about it, because not having the information at all may be worse than having a monopolist.

MR. BAND: You have to consider that every market is different. Again, in financial markets, it is not a natural monopoly, but the publisher of information is also the person who creates the information. The stock market is a perfect example of that, where it is very easy for them to maintain a monopoly. In theory, someone could call up all the 3,000 companies that are traded on the New York Stock Exchange at the end of the day to find out where all the stocks ended up. It would be cost prohibitive relative to the cost that the New York Stock Exchange incurs in obtaining that information, because the Exchange is the place where the trading takes place. For the Exchange, it is very easy to get the information. Another example, of course, is in telephone service. The phone company has all the phone listings, which are ancillary to its provision of telephone service. There are other examples, the Internet and consumer places, where a person has a database or is a publisher of a database that is ancillary to some other function, and if they have a monopoly in that other function, they have similar leverage over the information and in marketing the resulting database.

MR. EISGRAU: I am with the American Library Association. I am not trying to put words in the panel's mouth, but I think I heard what I am about to describe, and I want to make sure. It seems as if none of the panelists, in their lines of business, are aggrieved at the moment or even substantially worried about being victimized by the following situation, where someone has lawfully acquired their information initially and then makes a subsequent transformative use, a derivative use of the information in a way that doesn't compete with their core business.

DR. WILLIAMS: We address that in our license agreements. Whether we could enforce it is a different matter.

MR. EISGRAU: It is either that you are not concerned or that you have another way to deal with it.

DR. WILLIAMS: We recognize that this is a serious issue, one that could deprive us of future revenues. Our license agreements give licensees the right to use software and databases to create derivative products for internal use. If they commercialize the derivative products, they would need to renegotiate the license agreement.

MR. EISGRAU: Does anyone on the panel believe that, whether by license or by the nature of the business, they are exposed to risks that have not been controlled for by some other means, transformative noncompetitive use?

MR. GLICK: When you asked that question initially, what did you mean by “noncore”?

MR. EISGRAU: Noncompetitive. Let me make it concrete with an example.
I acquire access to a database of restaurant reviews, and in point of fact I don't really care about whether a restaurant has three stars or one. What I care about, because I am in the tofu business, is the percentage of vegetarian meals that a restaurant has available on its menu so that I can, in combination with data that I have acquired lawfully from other sources, crank my sales force up, use the proprietary information, and target my tofu team on going to the right places. So, if you are the producer of that restaurant review database, what I do with what turns out to be field 12 out of 22 fields in what I bought from you is not competitive at all. I am not putting out alternative restaurant reviews or anything close to it. I am just extracting field number 12 from the 22 to help my tofu sales.

MR. GLICK: So your question is, What if someone has licensed data from me for a totally different purpose, let us say for an end-user license, for example, and instead has taken pieces of my data and now used them commercially?

MR. EISGRAU: Would it concern you and how do you control for that now?

MR. GLICK: Again, hopefully through license agreements. The bar is set on gaining commercial benefit from our data without proper recompense to us, whether or not it is in a competitive area. That is where the licensing angle comes in. I think Leslie Singer mentioned that earlier today about niche providers who do a query on her database and then go out and do business in a particular industry.

DR. SERAFIN: I would like some further clarification on this issue if I might. I imagine that some of your customers are biotech companies, and so therefore they are in the business of making money too. One of the reasons they are buying information from you is that they want to help their business. Where then do you draw the line, because they are going to improve their commercial position, their competitive position, through the use of your information? Where do you draw the line and say that no, you cannot do that, but you can do this?

MR. GLICK: Again, it is the license terms that define the use of the product. In most cases the product is used to enhance an internal process, whether it is research or customer service or producing analyses, or whatever. Of course they can use that to their advantage by getting business and doing things more efficiently, and that is all perfectly well and good. What is expressly prohibited is reselling the data, aligning the data with other things and shipping it out to someone.

DR. SERAFIN: Reselling I would say is competitive, but what about starting a new business line that incorporates your data, starting a new business line that is based on information that they received from you?

MR. GLICK: A new business line that is not in the database business?

DR. SERAFIN: Not in the database business. I am in the shipping business and next year I am in the airplane business because I found that your data are valuable to me in both.

MR. GLICK: I would think most of the corporate business we do would allow some of that.

MR. EISGRAU: May I just follow that up? To what degree are you satisfied with the ability of licensing and other currently available mechanisms to secure your data in a way that is sufficiently protective in order to facilitate the business? Does licensing work?

MR. GLICK: I think all of us have said that it pretty much works.

DR. MARTINEZ: Regarding the earlier point about competition with government in producing data, I think there is a restriction that the government is not supposed to compete with the private sector. Where does that stand in light of this discussion?

MR. GLICK: We would like to know that.

DR. BRAMMER: That gets tested a little bit in the market every day. Sometimes there is a perception on the part of some companies that there is competition, and there is a government policy that says that the government is not to get into the practice.

DR. MARTINEZ: The NLM was mentioned. What about that?

DR. WILLIAMS: The NCBI is a subset of the NLM. The government relationship is one of the reasons, as you heard from Jim Ostell, that they give away everything. We can get their algorithms. We can get anything we want from the NCBI, and so can every other commercial company. When they were first founded, they built a fabulous innovative group of scientists within the normally bureaucratic government organization. NCBI has been a remarkable success story. At one time, they had a lot of negative feedback from the commercial providers because the companies felt that the government was now competing with them directly. So that is why NCBI just said, “Okay, you can have it. We will provide a transfer of technology.”

MR. GLICK: It used to be difficult for regular people to get access to government data. For example, there is a small cottage industry that takes Census Bureau data and puts it onto CDs and makes them available on the Web. If the Census Bureau moves more and more (and they have been doing that) to make it easier for people to access a particular piece of data on the Web, that is going to compete, but where do they stop in fulfilling their mission?

DR. MARTINEZ: Am I hearing that this is not a sensitive enough issue yet?

MS. SINGER: It is sensitive to me because we don't use government data. So you have a positive aspect of this when NLM and NCBI go out and collect all this wonderful data and offer them for free; the public gets the positive effects of that, and that is wonderful. But I am in the same business. I offer a multidisciplinary database, and a little bit of NLM overlaps with what I am doing. If NLM decided one day to expand from medicine into physics, that would take a chunk away. If they go into chemistry, that would take another chunk away. It is very confusing to us as to what the government mandate is and how we are protected. There have been cases (and I don't know if they are real or not) where we have heard that things are being considered that are outside what we view as the normal parameters, and it is very difficult for us to really understand what is in the government purview and what is in ours.

DR. SERAFIN: What is the role of the federal government in disseminating its databases? We have been struggling with this issue for a long time on a National Research Council committee that is providing advice to the National Weather Service. It is apparent, to me anyway, that technology is going to change roles. I have heard many people, from the Vice President down to the director of the National Weather Service down to the local weather forecaster, say that we want to make as much data available as broadly as possible within the constraints posed by our budgets. Until the advent of the Internet, if the National Weather Service wanted to think about disseminating its satellite data or its weather radar data, it would have had to make really big investments in communications. So they simply couldn't do that, and they said, “We cannot do that as much as we would like to.
The private sector has always been the primary disseminator and we will just find mechanisms working with them to do it, and in fact in some cases enter into contracts with them that will give them certain rights.” They did that, but now with the Internet and with the explosion of information technology they are going to be able to provide 100 times, 1,000 times, a million times the amount of data in the future that they did in the past. My perspective is that the taxpayer should be entitled to the data because the data now cost the government very little. The data used to cost a lot, and now they do not. Within their budgets they can now do this. So I see this happening, and I also see that, from the private sector's standpoint, you all have to be really nimble to see these things happening and to determine where the value-added line is going to be in the future and to adjust to that.

DR. BRAMMER: I think that is exactly the challenge. I don't see the point in trying to fight that battle as a policy issue and say that the government can't provide the information. I think the government is going to do it in its own way, and the challenge for our business is to find the ways to provide additional value that people are willing to pay for. Getting a clear statement out of the government about what it is going to do and when would help in the planning of that. I think that is an area where there could be some improvement, and it would help the government, frankly, to set its own priorities because its resources are limited and sometimes dissipated by trying to do too many things at one time. So you have those issues, but by and large the challenge for our business is continuing to add more and more value.

DR. SERAFIN: If we have time we will come back to this later, but I want to get on to question 4.
It says, “Identify and discuss the principal benefits and problems to data users.” These are users presumably of databases that you all provide and the benefits and problems that are posed by the current legal and policy regime.

DR. WILLIAMS: The users' perspective is similar to our own because they want to have the right to use data in forming derivative databases, etc., and they can do that under the current legislation. In many cases, at least the commercial users of our systems would feel the same as we do about the current legal regime giving them flexibility to get maximum power out of this information.

MR. GLICK: I think in our business there are some examples of people withholding the kind of rich databases they have from direct public availability because of concern with lack of protection, both legal and, probably more importantly, technical. For example, in our case people are just able to look at images, rather than having the ability to actually interact with our databases. We wouldn't be able to get a license, for example, to put navigation-type data in their raw vector form to make them available for end users to use on the Internet, no matter what license agreement we had. I don't think any kind of legislation would make someone feel comfortable about that by itself without additional technical means for protection. The only other problem I would say is that end users see a lot of different terms and conditions, which new legislation might help eliminate. On our Web site there are five pages and if end users ever bothered reading them, they would get a little bit confused. But since everyone ignores those anyway, this is probably not much of a problem.

DR. BRAMMER: I think I can follow up on that. The problems to our customers might be in the regularly changing and more complex conditions that people are forced to put up with given the new technology developments. There can be a whole lot more options for what you might do, and it is a constant ebb and flow of new developments and then trying to rein it back in or at least derive some benefit from it. We try to put language in our agreement that says that “the customer is entitled to use the information by its own employees for its own operations except in . . .,” and then enumerate very explicitly what the customer is or is not allowed to do with it. If the customer comes up with another idea it leads to another negotiation, or another contract perhaps.
New developments provide more alternatives. Now, that is good news, too, but it is more complex.

MS. SINGER: Our data users are usually researchers or authors of information or both, in the corporate and academic environment, and we do everything we can to enable them to download data appropriately from our database. We actually have bibliographic tools that allow them to take a search result and export it right into their own database, post it on a Web site, and do whatever they have to do that they are legally entitled to do. Our customers can create their own databases of citations from our products that they use in various different ways, and we facilitate that.

MR. MILES: Do you think that they could reach a point where they were competing with you by doing that if they developed citations for a particular subspecialty and put that out?

MS. SINGER: We have had commercial entities, not end users, take pieces of our database and tailor it to a specific industry and resell it for commercial use. Depending on the level of intensity of what they are doing and how it impacts us commercially, we will do a cease and desist order or we will just let it slide by. It all depends on the magnitude. We have never had one of our own end users who created a database of citations in their particular line of work do anything that was detrimental to us, and we fully realize that for them, creating a database of citations that they need in their research is what they are using our product for.

DR. LIDE: I am a handbook and database editor, that is, a content provider for databases that are sold commercially. I consider myself both a user and a producer of data. I take data from hundreds, even thousands, of sources to create data products through a process of evaluation, selection, and organization. I would consider a textbook author as a data user, as well as a scientist who writes a review article for Reviews of Modern Physics.
The existing legal situation permits such authors and editors to extract data from many sources without the burden of asking permission to use each number (or paying license fees). The resulting compilations and reviews are of major importance to scientific research and education.

What concerns me about some of the proposed changes is that they would introduce a tremendous overhead burden. If I had to ask for permission for every number I take from the primary literature, it would be virtually impossible to carry out this type of compilation process. This problem has not been addressed sufficiently in the discussion of the E.U. Database Directive or with the proposed legislation in this country. The extra transaction costs that would result from having to keep track of where each item of data came from, and to contact the source for permission, would strongly discourage such work. It is a very nice situation now. I know if I want to reproduce a figure verbatim, I write the publisher and I get permission; but if I take some numbers from here and another number from there and put them all together and create a new type of product, then I know that I am safe.

MR. MAURER: I am fascinated by Mr. Glick's example with the automobile database. This afternoon Rich Gilbert said that even in the world of patent protection where you have such ferocious legal rights, you cannot get people to open up (secrecy is the most popular method), and I just want an off-the-cuff estimate. Even if all the database protection in the world were enacted, would it ever be possible to get this automobile system on the Internet where anyone could look at it, given technical protections? This is an honest question. It is not meant to be skepticism.

MR. GLICK: Yes, it is hard to answer. With some of the things that Mark Stefik discussed, there may be some level of technical trusted systems that would provide sufficient comfort to allow someone like us to make that accessible, but certainly legal protection alone will not do that. I did want to respond actually to something you said earlier about that situation.
You said, “How should we deal with this monopoly situation?” I would hope that basic capitalism would take care of it in the sense that in their core target business, where the need is to maintain high prices, they are going to maintain a monopoly; but outside of that, for other uses, it is viewed as kind of just gravy, and there is no reason why they wouldn't make that very economical for people to access. The approach should be to make it very easy, and we do this in our Web site too by providing free use of maps legally to anyone who wants to download just one or two; and that is okay in the licensing. I think that is going to be part of the solution to making things available on the Internet. On the one hand I think we are comfortable with dealing with corporations; it is legitimate in our business-to-business dealings. On the other hand, on the consumer side we are kind of nervous; on that side, legal protections by themselves certainly are not the answer. But in the middle of that I think we all agree that copyright protection by itself does not protect our databases or our third-party vendor databases either.

MS. LEVINE: I want to follow up on what Barry Glick just said. Some of these are business decisions and I guess you make some of your maps available for free in hopes that it will lead to other business down the line. But I wanted to ask whether, if you were more certain of legal protection, you could foresee developing smaller products that you could sell without going through the same arrangements, products that are not worth making available now because the licensing aspect makes them economically unfeasible?

MR. GLICK: That is a good question. I don't know how much of this is due to normal economic factors versus the fact that data can be easy to copy or pirate. Prices are low.
I think it was the National Oceanic and Atmospheric Administration example where the resources devoted to databases are not considered a high priority compared to other things, and it is difficult to extract high prices for what is viewed as a basic commodity unless you have something that is highly proprietary, highly monopolistic. So you could argue that prices would be a little higher with more protection for databases. When we get venture capital funding, they say, “What is your information technology protection?” That is part of the due-diligence process. If we could say that if we built this database we would have relatively strong protection, versus the situation today, it might be easier to get investment and justify higher prices in our spreadsheets. I don't know how to answer that, but it would be definitely worth looking at.

DR. SERAFIN: I would like to move to the fifth question, which asks us to look into our crystal balls. Would the answers to any of the first four questions that we posed change significantly, based on what you see five years from now?

MS. SINGER: We could go back to what you were alluding to before. Government data used to be made available in raw form but were not easy to access or easy to understand. With the technological changes being what they are now, however, it becomes easier for those data to be accessed on a very wide basis and maybe, with technology, even to be enhanced substantially when you are talking about summarizers, automated indexing, translation routines, what have you. So I think technology is really going to change the paradigm of what is happening in the marketplace, not only how data are accessed but how they are enhanced as well.

DR. WILLIAMS: I think our environment will be increasingly complex. I believe that we will have to negotiate with virtually every source of data for the appropriate use of those data and pay according to what we are going to be doing. It will greatly increase the administrative load.
The changing legislation in Europe as contrasted to the United States is not likely to be resolved in only a matter of months. It will take us a long time to harmonize that legislation. There is a question of how much protection we will have for our database products then in the intervening time, and so I think that there are many storm clouds on the horizon that will make our life far more complex five years from now than it has been to date.

MR. GLICK: I think things will be very different in five years. The whole equation of what a value-added provider is, as Leslie Singer said, will be very different. The ability to respond to a specific query by looking at a whole range of databases around the world and just target an answer to a question instead of a retrieval of a database, or a portion of a database, and the ability of data providers to somehow get the right compensation and to ensure that it doesn't get stolen or pirated through technical means, is going to change the way we do business in information services in general.

DR. BRAMMER: I think if you look five years back, the first browser was just being written, and remarkably few people had heard of the Internet. If you project five years ahead, one could adopt different scenarios. But let us just assume for the moment that the unauthorized access and the vandalism and so forth are much beyond what it is now. What I would see for our business is that a much higher fraction of our data will be obtained on a commercial footing from various suppliers and the competitive aspects will be providing information more quickly. I think time will increasingly become a factor, along with customization and more integration. So providing data, data the way we think of them at least in our business, will give way toward providing information products that are increasingly specific to an end user's operation. That I think actually militates against the need for more copyright protection.
So if it really is customized, then it is unique to the user, and people won't be so interested in stealing it. So the trend for us would be much more customization, getting the information there faster with more integration of diverse data sources. That is how I see it. I don't think we are going to be relying on copyright protection any more five years from now than we are today.

DR. MARTINEZ: On the demand side, what projections might there be? Any idea?

DR. BRAMMER: On the demand side the market demand is going to be huge.

DR. WILLIAMS: People are beginning to see the value of information in a way they never did before. Now that the ease of access has improved dramatically, the way that information is utilized will be revolutionized and the level of activity will increase dramatically.

MR. BAND: If you are saying that in the future there is going to be more customization, that the technology allows that kind of customization, that would almost suggest that increased intellectual property protection will delay and impede that kind of value-added customization. Database producers will not have as much economic incentive to do these kinds of value-added customized services because they will be able to make more money just from selling the raw data. Moreover, that kind of increased level of protection will make it more difficult, arguably, for the customizers to get the data that they are customizing. So it seems that more protection will slow down the process of making all the wonderful uses of information that the technology is going to allow us to make in the future.

PARTICIPANT: I certainly don't agree with that.

DR. BRAMMER: Maybe I didn't say it well. Let me take another shot at it. The reason that we have situations like Barry Glick was talking about with the monopoly of the automobile database is that the acquisition of the information is very expensive. So if one organization happens, for whatever reason, to get way ahead, it is difficult for a competitor to come in.
What I see is that, as technology advances, a lot of these processes will continue to become less expensive and that means that the customization will be more affordable. The dissemination will be a lot easier because you have not universal, but much easier, access to the data. So you can afford to do the customization and, I think, the applications. I certainly agree with what Myra Williams said that the people will value it more and make better use of information. It will be much more common than it is today. I think there will be a lot of growth, but I think the growth will be in selling a lot of relatively inexpensive things, rather than a smaller number of very big expensive things.

MR. LEAVITT: What I have heard here is that what you need with government databases is a more consistent policy so that you can make long-range plans effectively. Of course, they can pull the rug out from under you at any turn, but this has nothing to do with copyright. What concerns me the most is that when so-called “free data” are provided by the government and they get enamored with the idea of putting all these data on the Internet at the expense of the quality of the data, this is where the problem is. They have got to start considering that a degraded database is not a good database. This is what we see happening: they are more interested in getting garbage out than in improving the quality of the database, and that is where the money should be going.

MR. MASSANT: Earlier you said that you felt comfortable with licenses to protect your data. Was that as far as being protected from the people you sell the data to, the commercial purchasers? What about a third party accessing your database somehow, and then you wouldn't have protection that way? Is that a concern at all?

MR. GLICK: That is what I meant by making the right products available instead of the database itself available, for that very reason.

MR. MASSANT: Then I think legal protection would help cover that situation whereas a contract license wouldn't.

DR. WILLIAMS: It is the cost and benefit that we are talking about. If protection adds a lot of obstacles that impede our ability to create things in the future, then that added protection is not desirable.

MS. SINGER: I am leery about going forward without copyright protection even though we have very stringent licensing, especially around the world. It sends a terrible message.

MR. GLICK: That is right, if U.S. database providers do not have any protection in their own market.

DR. SERAFIN: Are there other comments or questions? If not, I thank everyone for participating.