Suggested Citation:"11 An Unfair Competition Model for Protecting Databases." National Research Council. 1999. Proceedings of the Workshop on Promoting Access to Scientific and Technical Data for the Public Interest: An Assessment of Policy Options. Washington, DC: The National Academies Press. doi: 10.17226/9693.

11 An Unfair Competition Model for Protecting Databases

MR. PERLMAN: I am Harvey Perlman, and I will serve as moderator for this breakout session. I am a professor of law at the University of Nebraska. I have taught unfair competition law and intellectual property for about 30 years and have participated in the background on this issue by giving some advice to the National Academy of Sciences regarding this latest database bill. I am also fairly actively involved in the discussions with respect to the licensing provisions of the Uniform Commercial Code, Article 2B. Before we begin this session, I would like all the participants to provide a brief statement about who you are and either who you represent or why you are here.

MR. BAND: I am Jonathan Band with Morrison & Foerster. The clients that I have represented in this database area are both in the financial services industry and information technology industry, and they tend to be skeptical of the legislation that has been proposed thus far. We have been trying to advocate a more narrow form of database protection than that which was introduced in the 104th and then in the 105th Congress.

DR. LOFTUS: Philip Loftus. I am with Glaxo Wellcome. I am going to be rapporteur for this session.

MR. KAHIN: I am Brian Kahin with the White House Office of Science and Technology Policy, and I have been working with Chris Kelly and Justin Hughes and others within the administration on this issue.

DR. LEDLEY: I am Robert Ledley from the Protein Information Resource. I testified before Congressman Coble's committee last year on the bill that was passed in some form by the House and that I thought was pretty good. The only thing is that when they were finished, Mr. Coble said to me, “Dr. Ledley, you seem to have a different opinion than all your other scientific colleagues. Could you please write me a letter and tell me why. You know, I am gathering data.”

DR. BARKER: I am Winona Barker, also with the Protein Information Resource, and I am also an ex officio member of the U.S. National Committee for CODATA. I tend to be skeptical of more legislation, although it seems to me that the genie is already out of the bottle. It may not be possible to easily correct the problems that people like Chris Overton are having, but I am not sure that more legislation is going to make it better.

DR. GILBERT: I am Richard Gilbert, professor of economics at the University of California, Berkeley.

MR. COHEN: I am Bill Cohen from the Federal Trade Commission, and I am interested in the competition model.

MR. BARRON: I am Ed Barron, counsel of the Senate Judiciary Committee for the ranking member Senator Leahy.

DR. GILMAN: I am Paul Gilman. I work for a company called Celera Genomics. We are investing about $300 million to create a database, and we are very concerned that, while we want it to be widely accessible to the research community, some of our commercial competitors will simply copy our database and make it available in some other way.

MR. MAURER: I am Steve Maurer. I am an intellectual property lawyer at Berkeley, California. I had the privilege of working on a background report for the National Research Council study committee (“Raw Knowledge: Protecting Technical Databases for Science and Industry,” in Appendix C of these Proceedings).

DR. BROWN: I am Carole Ganz Brown from the Division of International Programs of the National Science Foundation, and I have been working with the General Counsel's Office on various of our research provisions on these issues.

DR. MCDOWELL: I am Bruce McDowell with the National Academy of Public Administration. We have a small contract from the U.S. Geological Survey (USGS) to take a look at some of the potential data limitations that might affect a global disaster information network, which is one of the Vice President's initiatives. The idea of that network is to put together everything in one place that an emergency manager might need worldwide and share it in real time. Not a lot of attention has been given to the difficulties of achieving that grand desire. So, we are taking a look at intellectual property, privacy, liability, and security issues that might limit the information that should go into that system or be shared through it.

MR. KELLY: I am Chris Kelly. I work with intellectual property issues at the Antitrust Division of the Justice Department, where I have worked with Richard Gilbert and learned a lot from him, and I have been working with Brian Kahin and Justin Hughes on these issues for the last year and one-half or so.

MR. MOHR: My name is Chris Mohr. I represent the coalition that supported the goals of the database bill that went through the House.

MS. SAEZ: I am Carolina Saez with the U.S. Copyright Office.

MS. KELLY: I am Maureen Kelly, from BIOSIS. We are a not-for-profit publisher of a secondary database, which means that we are both a user and a producer of scientific information.

MR. RINDFLEISCH: I am Tom Rindfleisch from Stanford University. I am director of the Lane Medical Library where we are attempting to become a digital library to disseminate information for clinical care and research and education. I am also a computer scientist who has been involved in a number of projects trying to synthesize various kinds of databases—data resources for new kinds of applications.

DR. BENSON: I am Dennis Benson from the National Center for Biotechnology Information at the National Institutes of Health (NIH), and our group is responsible for building and distributing Genbank.

DR. WILLIAMS: Myra Williams from Molecular Applications Group, representing the genomic sector.

DR. OVERTON: I am Chris Overton, director of the Center for Bioinformatics at the University of Pennsylvania. I represent the academic sector for use of genomic information, and one of my chief concerns is that the pending legislation looks like it is going to be a hidden tax on knowledge. In my opinion, it is going to impede biomedical research.

MR. PETTINGER: I am Larry Pettinger from the USGS, and my main involvement has been in representing USGS in some of the discussions among the federal science agencies on these issues.

MR. PERLMAN: Let me introduce the session and do a couple of things. One, the purpose of these sessions was to get a dialogue between those who see the need for additional protection and those who are concerned about it, and we have the panel organized in such a way that we thought we would produce that result.

Unfortunately, Mike Klipper is not here because of the weather. I think he is a strong advocate for increased protection, and I will play devil's advocate if the occasion arises to try to present that view, but I am not as passionate about it as some others around this table. So please feel free if you represent that view to help me.

The idea for this session was to go through the questions that were provided for the workshop. (See Box 11.1 for a list of questions used to guide the discussion.) I will ask Jonathan Band, as he has indicated that he is skeptical of increased protection, and whoever else would like to chime in, to take a minute or two and respond to each question from their particular points of view. Then the three people on the panel who are active in the area, all of them in the biotechnology area as it turns out, can respond to the two lawyers' positions. Tom Rindfleisch and Dennis Benson, who are both noncommercial data users and disseminators, will give us their reaction to these comments, and then I'll open the discussion up for questions. If it seems this strategy is working as we answer one or two of these questions, then we will continue that way, but I don't want to have this as a restraint on free interaction and discussion. So I am not going to hold you to these questions; our experience during yesterday's breakout sessions was that conversation tends to blend the questions together.

Box 11.1: Issues for the Discussion Session on an Unfair Competition Model Protecting Databases

Identify the potential benefits and problems of this legal model in your database activities in comparison to the status quo.
How would you define the scope of prohibited activities by users? Should the law distinguish between different categories of users? If so, how?
What specific provisions regarding access and use (both authorized and unauthorized) would you want included in such legislation? Why?
What specific exclusions and limitations on the rights of database owners (e.g., by category of user, type of use, or type of database) would you want included? Should sole-source databases be subject to any greater requirements for openness (e.g., compulsory licenses, fee regulation, etc.)? Why?
Are there prerequisites that a database producer should meet before protection is accorded? Why?
Should the property right be limited in time? If so, what's an appropriate length of time, and why?
Are there any special provisions needed for access to and use of government data incorporated into privately produced databases? If so, what should they be, and why?
Are there any special provisions needed for access to and use of data generated through government-sponsored research by parties outside government? If so, what should they be, and why?
Identify other issues important to public-interest access to and use of data and databases under the unfair competition model, and state why they are important. In particular, are there any technological trends that may alter the balance of rights substantially?

Let me set the parameters for our discussion, again not in a rigid way but at least to focus our attention. At this workshop, there are concurrent discussion sessions like this one dealing with different models of possible database protection, and labels are less important than the provisions of a particular bill. In this session, we are supposed to discuss what is designated an unfair competition model, and there might be some confusion about what we mean by that because there is some debate about whether the Coble bill of the last congressional session is an unfair competition model.

I would like to frame the issue fairly narrowly, at least to start out. My view of an unfair competition model is one that allows a database owner protection against activity that directly competes with and prevents that owner from capturing the economic value of the database. It is fairly narrow in the sense that it comes out of the unfair competition tradition, where you are protected only against acts that would likely prevent you from making the investment in the first instance and only against acts that interfere with competition in the product that you are currently selling. I think this is where there is debate over whether the Coble bill is an unfair competition model or a property rights model.

The Coble bill, as some of you may know, allowed a database owner to be protected both in the actual markets in which they were engaged and also in any potential markets; and when you open it to potential markets, which means any market discovered in the future in which this database might be economically exploited, you essentially end up with a property rights bill, or something very close to one. I don't want to hold that distinction rigidly, but at least you get a sense that our model is one that attempts to define specific behavior that we find interferes with the investment of the owner, as opposed to some other models that might make sure that an owner can exploit all of the actual and potential benefits of the database.

With that, I start with Jonathan Band. The first question is to identify the benefits and problems that this model might make in your database activities in comparison to the status quo.

MR. BAND: I think as always the devil is in the details, and it depends on exactly how the unfair competition model is structured. If it is truly narrowly drawn in the manner that Harvey Perlman was outlining, I think that it has a lot of benefits and relatively few problems. The benefits, of course, would be that to the extent that there is a gap—and I am not sure there is a gap—in existing forms of protection, this would largely fill it. One of the problems that database producers have talked about, and certainly among the problems that are talked about most often, is wholesale copying by a direct competitor. The cases or examples that have been cited where that is a problem have been the Zeidenberg case [see ProCD v. Zeidenberg, 86 F.3d 1447 (7th Cir. 1996)], where a graduate student copied a CD-ROM of the telephone directory and put that on the Internet, or the publishing case that was raised yesterday, which involved the scanning in or the keying in of a huge amount of information straight from the Cable Fact Book and making it available in digital form. Again, the theory or the argument of proponents of additional legislation is that with digital technology it is easier to copy and easier to disseminate a database, so that the risk of this sort of wholesale appropriation, which would totally wipe out the value to the original producer, is why there is a need for additional protection.

I think an unfair competition model will address this problem. If copyright, license, or technological protections, or all those other forms of protection don't work, this will be an additional weapon in the arsenal of the database publisher to get at someone who is doing bodily appropriation of the data. At the same time it does not preclude most forms of value-added activity, so that it doesn't prohibit you from taking some of the information and developing a different kind of product, whether it is in a potential market, neighboring market, or call it what you will, as long as it is a different market. Thereby this kind of added protection does not stifle second-generation innovation, and so that is why it seems to have little downside.

The real question then, and again this is where the details will come in, is how broadly the legislation is drafted. What about some kind of value-added product, which at the margins may compete with the original product, or what if someone just has raw data, and then someone else adds value so that those data become useful for the first time? Of course that will hurt the actual market for the first comer, for the first publisher, and so that is a gray area where a lot would depend on how the bill is drafted and how it is applied by courts. This is something that we can talk about, whether this is what we want to encourage or discourage. We are not, in theory, concerned about the progress of science and the arts, but certainly it seems to me that that should be an ultimate goal here, along with consumer welfare.

MR. PERLMAN: Would the representative from the coalition like to respond rather than have me try to do it?

MR. MOHR: Unfortunately I am not authorized to respond to this question. I probably will interject some comments later on.

MR. PERLMAN: I suppose the argument on the other side is that, to the extent that you want significant investment in some of these databases that are not protected by copyright, the more returns and exploitation one can achieve through the database, the more money is going to come into the database owner to continue to invest in keeping it up or creating new ones. In addition, if transformative uses are a significant part of the marketplace, one would presume that the owner of the database would license that database to permit those transformative uses, while extracting a fee for that privilege.

DR. LEDLEY: I think it ultimately is the argument, but there is a nominative advantage also to having protection, which is that even if you don't charge the user, then the user must at least contact you so that you know who the user is. That is often extremely important, especially with government-funded materials, to identify what the usage is—essentially to know how important your database is. Without some way of inquiring or at least being able to request that the user identify himself, there is no way of keeping track, and keeping track is a very important thing.

MR. PERLMAN: I will call on the various panelists, first to respond, if you want to respond, and then we will open up the discussion. The issue is a model with fuzzy fringes that essentially would make different uses free as opposed to a model that would allow the database owner to either limit, restrict, or otherwise exploit other uses besides the one that he is currently using it for.

PARTICIPANT: I just want to respond to this comment. If the major reason you want this legislation is to track uses, then any law like this is just overkill. I don't think that would justify anything that would make such a substantial change.

DR. LEDLEY: I am not saying that this is the major reason. I am just saying that this is another reason.

MR. PERLMAN: Why don't we let each of the panelists respond, and then we will have a discussion.

DR. OVERTON: I make a living using transformations of databases and integration of databases. So in some sense, my primary activity is to take fairly large chunks of other databases, combine them, and do something innovative with them.

My concern is certainly what the consequence of this legislation is going to be because the other part of this is that the transformed databases that we generate are then provided to other scientists through the Web or through bulk downloads of the whole database. As far as I understand the different options, this unfair competition model seems the least abusive, from my perspective, of any of the options I have heard so far. In any case, even with the unfair competition model, my concern is conveyed with the following scenario: I have created a new database, and now for each piece of data that I integrated into this new database, I am going to have to track that piece of data. If I serve that datum up to someone on the Web and I have to be concerned somehow with the copyright or fees for usage of this, I am going to have to do that for every single piece of data in the new combination, in the new form that I have created it in.

When they hear something like this, my colleagues in the computer science department say, “That is cool. This is going to mean a new research project for us.” It is an area called data provenance, and I use data provenance to do exactly that. I keep track of every piece of data that comes into the database and the origin of that piece of data, but that is a research project, and when I would be able to use that in practice is years down the road. In the meantime this legislation could restrict my use of this information.

MR. PERLMAN: Was everyone here yesterday so that you know what Chris Overton does with the databases? Do you want to give that 15 seconds?

DR. OVERTON: I take multiple, heterogeneously distributed databases—they could be databases from all over the Web or local databases—and I transform and combine them to produce a new database, a data warehouse of that information. As part of that activity, I add value to the data through various means so that there is a lot of work that goes into creating these new databases. But the bottom line is that the new database, the derived database, is composed of elements of existing databases, plus manual curation, plus derived data through computation. So we have new insights that are in our databases based on the data, the next-level-down information from previous databases.

MR. PERLMAN: And these are genetic data that give you insight about the information by combining those you wouldn't get—

PARTICIPANT: That you wouldn't get otherwise, exactly.

DR. WILLIAMS: Molecular Applications Group has some overlap with what Chris Overton has described in terms of the creation of derivative databases that add significant value over the original database. We also have software that dynamically accesses over 150 sites concurrently using the World Wide Web. These sites can be specified by the customer. Some companies are more interested in agricultural sites. Some want all seven-transmembrane-related sites. Others want things that are very specific. Our software can be easily customized. The system knows where to look for respective types of information and then populates the database with it.

These are very large databases. The idea of having to track exactly which databases were accessed, what information was used, what percentage of that came from which database is just a nightmare because we are talking about data compilations that will grow to terabytes in size very quickly. That aspect would be one of great concern. The thing that was quite positive to me was hearing Harvey Perlman's narrow definition of unfair competition. I think there are two things that need to be clarified as we do any kind of legislation. One issue is the protection provided to the database vendor, and the other is the protection provided to the user of the information. Let me elaborate on each of these.

As long as the protection provided to the database vendor is very narrow and is specified, as Harvey Perlman described, where it says, “I am protecting only that which has already been created, and I am not, in fact, protecting against future possibilities that have not yet been implemented,” then I don't have much of a problem. What I found formidable was the idea that someone in retrospect could say, “Oh, that derivative database included some of my data, and I have been intending to do that as well.” How does the provider document that they had that idea? That becomes a very fuzzy area that would be very difficult for us to define. I don't see great advantage from the added protection, especially for our particular organization, but I also don't see that much of a problem. If it prevents these overt cases of abuse that we have been hearing about at this meeting, then protection is probably appropriate.

At the same time, however, in terms of science and technology, it is very important that the rights of the user be protected, whether that user is an academic, a not-for-profit organization, or a commercial organization. One of the things that we found out yesterday is that there seemed to be a lack of clarity on the part of the scientists, the administrative staff at universities, and certainly those of us who are data users on exactly what our rights are to have access to data.

For example, it was said that under new legislation if even 10 percent of the funding to create a database came from a government grant, you are obligated to provide access to those data to everyone. I don't think that is widely known. It certainly doesn't seem to be known among some of the groups that we have been talking about; and the rules are different according to whether or not the work was funded by a grant or by a contract or by a cooperative research and development agreement (CRADA). So, as we consider any kind of legislation, it is very important for all of us to understand what our rights are to have access to the data and in fact to build value-added derivative products from the data without having to pay exorbitant fees to all the people who are involved.

DR. BENSON: I would like to echo that because I think as a general comment there is concern about legislation. Most people, I think, involved in science don't appreciate the subtleties of the law. As this legislation was proposed, if you went to Web sites and saw a number of the issue papers that were prepared, it seems to me that a lot of the issues were, in fact, addressed by some of the subsequent legislation that came out. Yet overall, the impact was a chilling one: scientists felt that this was going to impact their day-to-day use of databases. We have to be very careful about the message that goes out about this legislative process and how it will impact or not impact day-to-day science.

In terms of this specific issue about misappropriation of data, this would not affect our organization because we go to the end user directly and collect sequence data from the end user. There is one area that would be a potential danger I think, which was alluded to yesterday, and that is in terms of electronic publishing where journals may be completely in the electronic realm and the data that support the underlying article may be part of that electronic publication, and the publisher may retain rights to all of the background or underlying data. In our particular case, if a publisher were to retain rights to the sequence data, that could obviously be an impediment to the free exchange of sequence data that we currently have. I think that is one concern we would have.

MR. RINDFLEISCH: I want to speak primarily from the user point of view, but I would like to distinguish what I think are three main areas of technology or formulation of these kinds of repositories that we are talking about. First are the data themselves, whether they are genomic data, whether they are textual data from the literature, or whatever. Second is the interface that the database provider generates in terms of the expectation of how users are going to use those data. Third is the user, the person who is trying to accomplish a particular task, care for a patient, do research, construct material for a course; and from the user's point of view, the user is unconcerned about where the data come from. The user is trying to pull together different kinds of information that will allow optimal accomplishment of whatever that task is, and I am concerned that the kind of protection that we are talking about will impede the optimal development of the new technologies that are just now becoming widely available in the development of tools that let people do their jobs.

We have run into situations where vendors put a lot of money into developing databases, into developing these interfaces, and want to license them. For example, I run the Lane Medical Library at Stanford, and we have approximately 251 titles online digitally that represent various kinds of vendors. The vendors are primarily focused on having you use the data as they present them. They do not want the end user to be able to go flexibly between publishers and all of these databases to accomplish their task. That is an impediment to the optimal use of these data; and, in fact, the economic protections that we are talking about are intended to protect the marketplace against having to do further innovation.

That means that we have talked to vendors and said, “Your database is faulty in the following way . . . . It does not accomplish the following tasks . . . .” The database vendors look at it from a business point of view. What will it cost me to change that interface or to change the organization of the data in order to accomplish this new task? If the investment is high or if the income stream is already quite profitable so that these vendors feel that it is not worth the additional investment, putting these protections in place I believe will impede the development of technology that we are only beginning to understand. I also believe that the long-term economic advantage of these technologies is to free the development, the exploration, the interrelationship of these different kinds of information resources in ways that the vendors have never imagined. So, whereas I am sympathetic to the investment of large amounts of effort to accumulate these data, I don't see that the protection that is warranted should be any more rigorous than a company in Silicon Valley developing a new piece of Web software or a new piece of applications software that a competitor can look at or duplicate. The length of time of the technological advantage in that kind of an arena is very short, and in fact maybe a measure of this is to look at Moore's law, which says that computer technology turns over about every 18 months.

So why should we be putting in place legal restraints on the uses of these data that are any more durable than the turnover in the ways we are conceiving of taking advantage of these technologies? I feel a tension between understanding the provider's point of view and taking the user's point of view, where we are trying to do the best possible thing that we can to improve patient care, to improve the efficiency of engineering new artifacts, and to do research that needs to make use of these data in innovative ways, and we absolutely have got to avoid restraining that innovation.

MR. PERLMAN: The discussion is open now, so I invite people to comment.

MR. HUGHES: Justin Hughes, Patent and Trademark Office. I don't think it makes much sense to apply Moore's law to this database issue. Moore's law is about the computing power of chips, not about anything else related to the computer.

Back to the other point. I am not sure what you said about software, but I don't think it was quite right. You characterized software as something people can go out and duplicate, but they cannot go out and duplicate it. They can reverse engineer it under certain conditions, but copyright protection for software does provide some viable protection of the investment, and there is a coterie of people all over this building and over the office of the U.S. Trade Representative who practically devote their lives to preventing people like the Chinese from duplicating our software, and we don't stop at 18 months and say, “Go ahead, take that version of DOS.”

MR. RINDFLEISCH: I understand what you are saying, and I was not implying that Moore's law, which has traditionally applied to hardware, rigorously applies to this new area, but I believe that the development of the ideas that are embodied in software, that are embodied in the ways we organize and interlink information, is a new generation of technology that we are just beginning to explore. That is what these new companies like Yahoo and Excite and others are coming to produce: new products that rely not so much on the underlying software and hardware technology but on the ways in which information is put together.

PARTICIPANT: But to more precisely draw the analogy you are trying to draw, you would want to explore the social parameters that we should impose or do impose on interoperability of software and say, “Yes, we would like everyone to invest in developing innovative software products, but we want those products to be interoperable with each other,” and therefore what kind of terms and conditions do we put on that? That might be a closer analogy to what do we need to do in the database world to make sure that people can take bits and pieces from things in a useful way.

MR. MAURER: I find this exchange very interesting and useful, and I think one of the things that we need to focus on is how long a protection period should be. The Europeans said 15 years; the Americans came back and said, “We have to match the Europeans,” so they proposed 25 years.

I think there is a question about how long it takes a database owner, from an economic point of view, to recover their investment, and certainly for the American companies it isn't anything like a time horizon of 15 or 25 years. That, I think, is the heart of what Mr. Rindfleisch was trying to say.

The other thing I think we should keep in mind as the great strength of the unfair competition approach is that it is traditionally a sensible case-by-case view that gives you flexibility, which is often a good thing. One of the things that has come out in this workshop, I think, is that there is a gray scale of possible protections, and the challenge is to find something that gives enough protection to the database producers but doesn't give them so much protection that you get pathology. I think it is an empirical question ultimately of how much is enough, and whether you need a strict copyright model or whether something less will do is something that needs to be looked at very closely.

MR. KAHIN: On this interoperability issue, I didn't see it as a question of interoperability of software. It is having the ability to use the data in a way that allows the user to interoperate the data—that is, to construct the user's own interface.

MR. RINDFLEISCH: That is what I meant. I meant it just as a metaphor.

PARTICIPANT: But there are new technologies that interlink and facilitate users' tasks, which involve using these pieces of information in ways that the provider never imagined.

DR. GILBERT: I wonder if we are conceptualizing a hypothetical unfair competition model that doesn't really exist anywhere; and that is a question that maybe Jonathan Band or others who have experience with this model can answer. It seems to me that the unfair competition model does have a lot of advantages that we could articulate, but ultimately it does come down to the details and what the legal precedents are for what constitutes different markets and whether a particular data enhancement would represent unfair competition under this enforcement regime or whether it would not.

MR. BAND: I think you have put your finger on the issue. There is this existing, let us say, misappropriation doctrine, which deals with “hot news,” and courts have applied different standards and different definitions of what is hot news and so forth. That is probably narrower than what we are talking about here because I don't think that the kind of model that Harvey Perlman was positing would be hot news. The stock market, for example, is hot for 15 minutes; I don't think we are thinking about something that is of such finite duration. I think the idea is that it is a hypothetical conceptual model of another alternative, an alternative to what was introduced last year in Congress and, according to Marybeth Peters, is going to be introduced next week once again by Congressman Coble. So there is a different proposal. Even then it is going to be subject to interpretation.

In response to the next question, what I want to do is run through some cases. I don't know whether the language I will come up with really explains those cases or would lead to that result, but these are what I think the results should be in a couple of cases. Again, maybe people will think that it is too much or too little, but it will be helpful in terms of focusing the discussion. But you are right; it all comes down to details and definitions.

MR. PERLMAN: It is clear that this model would require judicial intervention at the margin. So, if certainty is a requirement for your conception of what the law should be in this area, then I think this is not the model.

PARTICIPANT: But neither is the property rights model.

MR. PERLMAN: I understand. There is no such thing as certainty.

PARTICIPANT: On a spectrum that might be more certain than this or at least the burden of proof which—

PARTICIPANT: I just wanted to say three things. Two of them reply to Chris Overton's comments. I don't believe that even the Coble bill that was introduced and didn't go any place prohibits or requires identification of each individual part of the database. It just doesn't require tracking every piece of data. I don't believe that is so.

Also, I think that the derived database is a different database. That is my understanding. So, therefore, even if you use our database in your database it wouldn't do anything to us. We would like to know if you did use our databases so that we can tell people, but it is not a requirement as far as I can tell. If we had something like 50 redistributors at one time and they patched our database with other things and added programs and went ahead and disseminated it, well, we knew who they were because generally in those days they had to ask us for it. So we sent them the tapes.

PARTICIPANT: Why do you want to know? Why do you care?

PARTICIPANT: We care because then we could tell the National Library of Medicine, which funds us, “Look at how broad the usage is of our database.” They repackage it, and their customers use it. We have direct customers and so forth. This is very important. How else are they going to justify spending all that money on us if they don't know that it is doing any good, and that is very important. All right, those are the two things.

Now, the third thing is a very short story, which has not been mentioned here. Why would you want to bother with this database legislation to begin with? I will use as an example the patent system. This isn't a patent, but it is just an example.

Benjamin Franklin invented the Franklin stove, and in his autobiography he said that he didn't patent his Franklin stove because he wanted it to be available to the public. That was the end of it. He invented it. He didn't patent it, and no one made it. Someone in England read about it. They patented it, and it became very popular. So, one of the reasons for protection is to make the idea, the concept, the database, whatever it is, available to the public. Without that it may not ever become available to the public. No one is going to pay to advertise something where you are not going to get any remuneration. You have got to pay for the advertising.

PARTICIPANT: Let me ask the question from the other side. Why should you be able to do what you do without at least contributing something to the cost of the databases that you are mining? What is the theory for that?

DR. OVERTON: That is not my job. My job is to advance science and biomedical knowledge.

PARTICIPANT: My job is to advance the wealth of my family by making things better, but I have to pay for all the goods and services I use. So why shouldn't you?

PARTICIPANT: I would say that universities, in fact, do generate much of the data that go into these databases, in the literature, in the scientific research that becomes part of the gene banks and clinical trials. I think that this is basically where the data comes from, and these are the people who are looking for ways of optimizing that process.

PARTICIPANT: And to the extent that content is published, universities are the big payers for that as well.

PARTICIPANT: We give it away and buy it back. It is a wonderful process.

PARTICIPANT: If you have a public database that is being provided free of charge to many different commercial users and there is competition in those commercial-user markets, then what is going to happen is the commercial users are going to compete away the profits that accrue from access to the underlying public database.

That doesn't mean that the value vanishes; it means that the value is transferred to consumers. And so in that sense it may be a good thing that the value isn't transferred back to the original database, if it is publicly funded. But of course the analysis of that problem is very different if the underlying database is a private one, in which case you have to transfer profit back. And that is a very good example of the trade-off between access and protection.

PARTICIPANT: If I may, just two other answers to that question, especially the case where the underlying data were derived commercially, not with public funds. One is that I think one can argue that the raw data have little use. They are useful only when they are organized, when there is some degree of an interface, and when they are presented in a useful way. So the incentive in developing the legal regime should be not just the collecting of the data but taking at least the next step or few steps in transforming the data in a way that is useful. If you give too much protection to simply collecting the data, then you reduce the publisher's incentive to do the next step to make the data useful. That is one argument—that you encourage not only the gathering of information but the gathering and the processing in a useful way.

The second point is that the economics of certain areas of the information market are such that often you can only have one player. The investment is so large. However, it is historical information. The existing players have enormous advantages by virtue of having been there first, and so you have a very serious competition problem. If you don't give those publishers, who are the sole source for either historical or economic reasons, an incentive to make the data more useful, then they are simply going to sit there as monopolists often do and get their monopoly rent and impede innovation and competition. You need to have a way of making sure that you have competition, innovation, and progress in a useful way as opposed to just raw material.

PARTICIPANT: Is there anything unique about scientific and technical data that informs the answer to this question?

PARTICIPANT: I think that to a certain degree there is. There are a zillion databases of all kinds, and the barriers, of course, are intended to protect any databases, scientific or not, such as furniture databases or mattress databases.

PARTICIPANT: Yes, there are all kinds of databases, but the one thing that is true is that with scientific databases there are more databases that are built on a not-for-profit basis and appeal to the contributor's scientific integrity or reputation and so forth, which becomes almost an end in and of itself, to some degree. I think that is the difference.

DR. GILMAN: I just had two clarifying questions. One is for Chris Overton. Is your concern about a restriction on your ability to use your derived databases or a concern about your ability to redistribute those derived databases?

DR. OVERTON: Both. Part of the way I use those derived databases is to provide them to other scientists. I think one of the distinguishing features between scientific databases and most other commercial databases I can imagine, like the mattress database or furniture database, is that we take scientific data and we build knowledge out of those data. First you take these raw data. You build some information, and from the information you get new scientific insights; and to the extent that any of these protection barriers impede our ability to freely access those data, manipulate them, and then distribute them to other scientists, we are going to impede scientific research. There is no question about it.

My concern is in the ability to take the data, transform them, do what I want to do with them, and then to pass my knowledge, my derived information, my building up on these layers on to other scientists.

DR. GILMAN: I have one clarifying question I wanted to ask Dennis Benson. Did I understand you to say that you have concerns about data coming into the NIH GenBank that might have a restriction on redistribution in competition with the contributors of the data?

DR. BENSON: No. The concern I was trying to raise was a potential one that doesn't exist today. In the world of electronic publishing if there were an all-electronic journal and, as part of that electronic journal, the underlying data behind the article were to be copyrighted by the publisher, there could be restrictions on the author having those data submitted to a public database.

PARTICIPANT: I have two questions. The first one is how do you get your data now?

DR. OVERTON: We access a lot of databases; there are literally hundreds of databases relevant to molecular biology, cell biology, and genomics. One database has approximately 400 such databases, and essentially all of these databases until this year were freely distributed and you could do anything you wanted with them. There were no restrictions, and now more and more of these databases are attaching licensing restrictions to the access, use, transformation of derived data, and all kinds of things that may come from these. They differentiate between different classes of user. It has to do with the granularity. So, responding in part to this earlier question, if we have to track data to where they came from in order to propagate these license restrictions, then at some point the granularity of the data is an issue.

I don't know how to interpret the law in these cases. So what is the granularity of the piece of datum that has a license agreement attached to it? If I have to track that, I guarantee that it is going to add a burden to what I should be doing, which is focusing my energy on research, not tracking licensing agreements.

PARTICIPANT: It would seem to me then that at least the presence of legislation would tend to lower rather than increase your transactions costs.

DR. OVERTON: These are all European databases, by the way, that we are having the trouble with licensing; but you can imagine that it will happen here. A bioinformatics company that is just being formed, which will be a repackager of databases, has approached us. They wanted to take our databases and fold them in with a couple of dozen other databases and resell those databases. That is not going to be a good situation.

PARTICIPANT: Why is that though? Isn't that how you described what you do?

DR. OVERTON: I don't sell databases. I distribute them. I give them freely to the public.

PARTICIPANT: I keep making the distinction between commercial databases, which are generated with commercial funds, versus databases that are generated with government funding in some way, whether it be by grant or CRADA or whatever means, because I think that there should be no question that if Celera has spent $300 million, I should not be able to have access to that database and create a derivative database, which I then turn around and sell without negotiating the appropriate agreement with Celera that says that I have the right to do that. But if, in fact, I have the legal right to access any information that has been generated by a government grant, then there should not be any restrictions on the way I utilize that information in the creation of derivative databases, and I do much the same as what Chris Overton is doing.

PARTICIPANT: Has anybody suggested that?

PARTICIPANT: We are seeing this very thing happening in Europe where, with multiple databases such as SWISS-PROT, you now have to pay for and negotiate the right to reuse any of that information.

PARTICIPANT: Has anyone proposed that? I am just not aware of that.

PARTICIPANT: I haven't seen the details of this proposed legislation. What I am hearing is U.S. academics who say that we are going to start filtering out commercial sites so that they cannot access our data.

DR. OVERTON: That was me.

PARTICIPANT: But it is not just you. This is a trend. They are seeing the Europeans doing it, so they are saying, “We want to do it too.” I think we had better be certain that there is clarity that we are not changing the policy that says that if the data are generated by a government grant, then this information is public information for anyone to use, whether it be an academic or commercial.

MR. PERLMAN: We have nine questions with this session, and we have about an hour and a half to try to prioritize these questions. It seems to me that two significant questions would be, Assuming this kind of narrow model of protection, would there be an appropriate place to draw the line between activities that are permitted and those that are not? The second major question would be, Assuming that you had drawn a line, would there be specific activities that would need to be excluded—that is, would there be an additional need for some kind of fair-use provision for scientific data or some other kind of amelioration to the doctrine?

We could spend 15 or 20 minutes on each of these questions. The rest of the questions are essentially procedural, which we will try to go through relatively quickly, but if they come up in the context of the discussion that would be fine too. Not to give him an advantage, but Jonathan Band has suggested that he had a number of hypotheticals under a limited unfair competition model that might serve as a useful way to focus our attention.

MR. BAND: I will try to address these hypothetical models very quickly. I actually came up with these models after having a conversation with Justin Hughes when he asked about my moral compass. As such, I tried to figure out what my moral compass was because, when you are talking about a derivative use, one man's derivative, transformative use is another man's infringement. I wanted to sketch out what I thought would be clearly bad and clearly okay.

We are dealing with Able and Baker. Able publishes a directory listing all the restaurants in the District of Columbia (D.C.) and then he organizes it by the style of cuisine (Chinese, Italian, Mexican, and so forth), and within each style he breaks it down alphabetically. So that is the hypothetical model: Able's restaurant directory is comprehensive but does have a minimal amount of arrangement, by ethnic group and, within each ethnic group, by alphabet. Baker then does several things to Able's directory.

First, Baker comes along and copies the whole thing. Clearly I think we would all agree that that should be prohibited. He just downloads the whole thing or copies the whole thing. That should not be okay.

PARTICIPANT: I take it he sells it then?

MR. BAND: Yes, he takes it and sells it.

PARTICIPANT: Not for private, personal use.

PARTICIPANT: Isn't there an issue of what Able did to generate that database to start with?

MR. BAND: In my model I am not that concerned about it.

PARTICIPANT: Suppose Able just picked it up?

MR. BAND: Assume, at least for this hypothetical model, that he invested some minimal amount of work in actually collecting information himself and that he certainly didn't steal it from anyone else.

PARTICIPANT: I suppose under an environment like this, if Able just picked it up then the evil that Baker does is picking it from Able rather than from where Able got it or from where Able ran a program. Fair enough.

MR. BAND: That is right. Assume that he has put some investment and effort into that.

In the second case, Able does the same thing with a second directory. Baker duplicates the Chinese restaurant section and sells it. I still think that should be prohibited; you can say that even though it is a part of the overall database, the listing of Chinese restaurants within that overall database is another database, and so that smaller database has been wholly appropriated by Baker. That should be prohibited.

In the third case, Able does the same directory. Now, Baker takes the Chinese restaurant section from that directory and adds it to a directory that he is putting together from Montgomery County and Northern Virginia. In this new directory, Montgomery County has its own chapter, D.C. has its own chapter, and Virginia has its own chapter. In my view that should still be an infringement because Baker took this whole smaller database and is reproducing it even though he has it as part of a bigger database. Even though it is sort of segregated within that bigger database, at some level, his new database is the same as that first database.

Next case, Able publishes the same directory in alphabetical order. Baker now takes the whole Chinese restaurant section for the District of Columbia and breaks it down by neighborhood. I think at that point it depends how broken down it is, meaning if Baker simply took downtown and the rest of D.C., or Chinatown and the rest of D.C., at that point I think it is still a little bit too similar to Able's to allow him to get away with it.

On the other hand, in case five, he really breaks it down by neighborhood, such as Adams Morgan, Georgetown, Chinatown, Downtown, Upper Connecticut, and so forth. That kind of organization, in my view, is transformative enough that it should be permitted even though he took all the data for the Chinese restaurant section. But, again, it is going to have to have judicial intervention case by case to see if there has been enough transformation to allow it to be acceptable.

In case six, Able does the same directory. Baker just looks at the directory. He sees the listing of all the Chinese restaurants just by scanning it, but he only takes out the Chinese restaurants in Adams Morgan, and again, he sells that listing of Chinese restaurants in Adams Morgan. I think that should be okay because, again, he hasn't taken the whole database. He has taken at that point a small enough subsection, which shouldn't be a problem.

With case seven, Able does the same directory. Baker copies the Chinese restaurant section and then he merges it into a database with Chinese restaurants from Virginia and Maryland, but he doesn't break it out as separate chapters. He really has the directory integrated as a whole and does an overall alphabetical listing where maybe he breaks it down by region of China in terms of Hunan, Szechwan, and so forth so that he has merged it as opposed to just having it as a stand-alone chapter. There, too, I think even though you might say, “Gee, that should be okay,” as a general matter it again depends on the specific facts. Let us say that there are 200 listings from the D.C. part that Baker took from Able, but that Baker had only 20 suburban listings. At that point I think you would say, “Well, his database is just a little bit too similar to Able's to be acceptable,” because what Baker supplied is such a minimal part relative to what he took. On the other hand, if he had 100 listings from Northern Virginia and 100 listings from Maryland and merged those with the 200 listings from D.C., at that point I think a judge would be more likely to say that now it is different enough that it is okay.

Anyway those are some of the cases I had in my mind, and that is my moral compass. It is a compass. I don't know if it is moral.
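
The arithmetic behind the two variants of case seven can be sketched in a few lines. This is only an illustration: the function name `share_taken` is hypothetical, and nothing in the discussion fixes a numerical threshold; Mr. Band leaves that judgment to a court on the specific facts.

```python
def share_taken(taken, supplied):
    """Fraction of Baker's merged directory that came from Able's directory."""
    return taken / (taken + supplied)


# Variant 1: 200 D.C. listings taken from Able, 20 suburban listings supplied by Baker.
mostly_copied = share_taken(200, 20)

# Variant 2: the same 200 D.C. listings, merged with 100 Northern Virginia
# and 100 Maryland listings supplied by Baker.
evenly_merged = share_taken(200, 100 + 100)

print(f"Variant 1: {mostly_copied:.0%} of the merged directory came from Able")
print(f"Variant 2: {evenly_merged:.0%} of the merged directory came from Able")
```

In the first variant roughly nine-tenths of the merged directory is Able's material; in the second it is one-half, which is the contrast the hypothetical turns on.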

PARTICIPANT: Clarifying question? In addition to the hypothetical models, which are interesting, you have views on how they should be resolved. May I ask what the reasoning behind those views is? Is it legal reasoning or is it economic reasoning, and can you be more explicit about whichever those are? Can you say what the principles are?

MR. BAND: I think I tried to come up with a legal test, which is first of all that the second comer, Baker, is taking the whole database or something that approaches a whole database.

PARTICIPANT: What is the legal reasoning? What is the legal test?

MR. BAND: What I have come up with in the draft is, basically, duplicating the database. But then of course, the plaintiff has the ability to define what the database is, so Able could say that the database here is not the whole directory, but it is the Chinese restaurant subsection. So that is a way that database creators are able to somewhat limit the definition; and now there are other tests. To some extent like anything else, what a database is is arbitrary, but presenting the hypothetical cases was a way of trying to say, at least give my sense, in either the philosophical or the economic approaches, that the transformative uses are okay and should be permitted. Then the question is, How do you define what is transformative? So this is an effort to try to come up with some tests and then apply the tests to the hypothetical cases.

MR. PERLMAN: We don't need to debate each of these hypotheticals, but at least they focus the question of whether we can define effectively the scope of prohibited uses. Does this make sense to those of you here who are going to actually deal with data? I assume the legal rule could be some kind of substantiality questions, or the economic rule I assume is the extent to which it undermines the competitive value of the initial database.

PARTICIPANT: These legal principles and economic principles are sometimes in conflict, which is why I asked the question. The economic principles are obviously badly articulated in most statutory law about intellectual property and also case law; but an economist would be concerned with recovering the costs of all parties, and that would be the motivating question: What were the costs of each party and who do you have to protect?

DR. OVERTON: This is an interesting example, but I have a fundamental problem with it. It brings up the question of what is a database? A database isn't just the data. It is a way to make the data available.

We have database management systems that are used to access the data. Say I build a database that only gave a narrow view of the data, so that I have a data set and then along with that I have some tool for accessing the data. In all of those hypothetical cases that you just mentioned, suppose someone has the data as a flat file and they could just do simple look-ups or do a key-word search on the database and things popped up, and they take those data. This is the kind of thing I do all the time. I take that whole data set, and I transform it into a relational database, and then I add some geographical information to that database. Then I could do any one of those kinds of queries that you just went through and present it depending on what a particular user might want to do with that database at that time.

So now there is a complete blur about what is going on as to whether or not each one of these things is acceptable or not. It depends on how it is used.
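
The transformation Dr. Overton describes can be made concrete with a minimal sketch, assuming a flat file with one `name|cuisine` line per restaurant. Every name here (the sample restaurants, the `NEIGHBORHOODS` mapping, `build_database`, `query`) is hypothetical and chosen only to mirror the restaurant hypotheticals; the sketch uses Python's standard `sqlite3` module and is not a description of any system mentioned in the session.

```python
import sqlite3

# Hypothetical flat-file listing: one line per restaurant, "name|cuisine".
FLAT_FILE = """\
Golden Wok|Chinese
Hunan Garden|Chinese
Trattoria Roma|Italian
Casa Verde|Mexican
"""

# Hypothetical geographic information added by the second comer.
NEIGHBORHOODS = {
    "Golden Wok": "Chinatown",
    "Hunan Garden": "Adams Morgan",
    "Trattoria Roma": "Georgetown",
    "Casa Verde": "Adams Morgan",
}


def build_database(flat_text, neighborhoods):
    """Load the flat file into a relational table and add a neighborhood column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE restaurant "
        "(name TEXT PRIMARY KEY, cuisine TEXT, neighborhood TEXT)"
    )
    for line in flat_text.splitlines():
        name, cuisine = line.split("|")
        conn.execute(
            "INSERT INTO restaurant VALUES (?, ?, ?)",
            (name, cuisine, neighborhoods.get(name)),
        )
    return conn


def query(conn, cuisine=None, neighborhood=None):
    """Return names matching the optional filters, in alphabetical order."""
    sql = "SELECT name FROM restaurant WHERE 1=1"
    params = []
    if cuisine is not None:
        sql += " AND cuisine = ?"
        params.append(cuisine)
    if neighborhood is not None:
        sql += " AND neighborhood = ?"
        params.append(neighborhood)
    sql += " ORDER BY name"
    return [row[0] for row in conn.execute(sql, params)]


conn = build_database(FLAT_FILE, NEIGHBORHOODS)
all_chinese = query(conn, cuisine="Chinese")  # whole cuisine section
adams_morgan_chinese = query(conn, cuisine="Chinese", neighborhood="Adams Morgan")
```

Once the data sit in a relational table, the same copied data set can yield the whole Chinese section, a single neighborhood's listings, or any other slice on demand, which is why the outcome in each hypothetical depends on the query a user runs rather than on the stored data alone.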

MR. BAND: Right, and in that case it depends on whether Able has a database that was a relational database and Baker came along and basically copied all of it or chunks of it; then it would be more problematic, obviously.

DR. OVERTON: At that point you can do a lot of things that Able couldn't do because you have transformed it into a more powerful format with a more powerful retrieval system.

So then what? That is a derived database that you have added value to and still have the same data set associated with it. My point is that the trouble I have with a lot of this is that you have to look at this in terms of the data set plus the way that the data are accessed to understand what the real issues are going to be, and I don't think we have talked about it at all in that way yet. We just talked about a data set, and so should that whole data set be protected?

PARTICIPANT: Right, and that is a good question. To rephrase it, if Able has a printed database and Baker copies all that information but then makes it an electronic database with a search engine that is capable of doing all those things, is that transformative or not?

PARTICIPANT: That happens all the time.

MR. BAND: I have a feeling, at least under my construct, that if he copies all of the data and makes them available and has not added any data, but has added software, then I think under the model I put together, that would be prohibited.

PARTICIPANT: We do that all the time. People have optical character readers. You can throw paper through the things and you can—

PARTICIPANT: Yes, but we do it all the time and make people liable for doing it too.

PARTICIPANT: No, because when we build a database, those data are transformed into a different structure. So you are transforming the data.

PARTICIPANT: It is not a copyright violation. The two cases that we know about are cases that were at least a violation of a license agreement, but the model for all this is that I take someone's database and I put it on the Internet. They are trying to sell the Zagat survey of restaurants in Washington, and I just scan it in and put it on the Internet. So, now, people can search it and manipulate it and print it out, etc. That is clearly bad, isn't it? You certainly shouldn't be able to do that.

PARTICIPANT: Why? You have provided a completely different way to access the data that is much more valuable than the way they were in the beginning.

PARTICIPANT: But they are still the underlying data. You could get permission. It is no different than doing a play off of a novel. It is a different way to access the novel, but you would assume that the novelist would have some say about that.

PARTICIPANT: I have a question for Jonathan Band. To what extent did the model you articulated differ from sweat-of-the-brow protection that existed prior to 1991?

MR. BAND: Maybe not a whole lot, and the reason I say that is because I think sweat-of-the-brow protection before Feist was very narrow, much narrower than H.R. 2652. The cases to which sweat of the brow was applied were typically cases where there was wholesale misappropriation of a database for which there was little or no copyright protection, and that is what I think is wrong. We had wholesale misappropriation, but those cases didn't deal with taking bits and pieces and value-added. Those cases weren't litigated, and so this, in many ways, is a more faithful implementation of sweat of the brow than H.R. 2652.

MR. MAURER: The existing NBA case [see National Basketball Association v. Motorola, Inc., 105 F.3d 841 (2d Cir. 1997)] expressly codified economic considerations, and it tells the judge, “Here you need to think about public policy in this particular instance and come up with the best economic rule.” I think we need to remember that if we are going to discuss legal reasoning as an economic approach, the great strength of these unfair competition approaches is precisely that they tell the court to go out and consider the things that we cannot make a general rule for, and I think that is a very strong aspect.

PARTICIPANT: Except that I suppose the only response to that is that yes, in a perfect world, the judge does do a serious economic analysis, but the truth is you know even though the standard might be that it diminishes the incentive to create—

PARTICIPANT: We expect the judge eventually to have accumulated wisdom based on having seen so many of these cases that he can have a rule, but the rule should follow the underlying economic logic.

MR. PERLMAN: What do the members of the scientific community and the non-lawyers think about all this? As someone in the scientific community without trying to draw the line, but understanding that some kind of fuzzy inquiry into either fairness or economics would be required, how do you see this impacting on the flow of scientific data?

PARTICIPANT: I think one of the issues is quality and continual quality improvement. Suppose Able didn't bother to keep his database up to date and someone else had a better mechanism for doing that. In a scientific regime if you had a database of gene data or drug data, or something like that, and they were sloppy data and someone else had a better way of keeping those data more current and more useful in terms of building the scientific enterprise, there is value to the scientific work of having a better database. It is conceivable that one would simply collect a bunch of data and, using this law, prohibit others from working with the data, improving them simply perhaps even by the threat of suit.

PARTICIPANT: Why would anyone do that?

PARTICIPANT: Simply to protect an economic investment that they made.

PARTICIPANT: You don't have one. I create a database and it is a sloppy database, which means it has very little economic value, and you come along and want to enhance it into a quality database. If I had protection, my instinct would be to say, “Yes, go ahead and pay me a small royalty. It is better than what I am getting now because I have got this sloppy database that I cannot sell to anyone.”

PARTICIPANT: Maybe at some point it is the only game in town because you are the first in the market.

PARTICIPANT: Even so, I cannot capture any gains unless I get you to make it better.

PARTICIPANT: But you can prevent me from improving that database.

PARTICIPANT: But why would I, unless I am crazy?

PARTICIPANT: You don't give money to improve the database.

PARTICIPANT: Yes, but he is not going to do it for free either, but I am going to let him do it.

PARTICIPANT: But let us talk about the law. The previous law said that if Able made the database, whether good or bad, and Baker came along and reconstructs the database from scratch and it is a much better database, then Able cannot sue him.

PARTICIPANT: That is fine. That is not what we are talking about. I am saying, “Go ahead.” I mean every rational person would say, “Please do that.”

PARTICIPANT: In a business sense they might not do that.

PARTICIPANT: Why wouldn't they in science?

PARTICIPANT: Because they have a flow of revenue, and they can maintain that flow of revenue without investing any additional money in development.

PARTICIPANT: Then the price I will charge you to make it better is going to be my current flow of revenue plus a dollar.

PARTICIPANT: But people don't necessarily do that.

PARTICIPANT: Why not, and what is it about you scientists?

PARTICIPANT: I am going to do this myself in two years. I want to keep that option open.

PARTICIPANT: In several cases we tried to negotiate the rights to be able to use or redistribute or include information that would have actually increased the market size for the group that we were working with, and we have been turned down.

PARTICIPANT: Is there some reason?

PARTICIPANT: I think part of it is that people don't have any experience in negotiating those sorts of things; so they are worried about anything other than the status quo of the university.

PARTICIPANT: And indeed your very inquiry that you might want to make it better hints that there is perhaps a business prospect and the person wants to hold onto it and go back to it and reevaluate.

PARTICIPANT: It is a threat.

PARTICIPANT: You start raising all these issues about what is the potential value of a database and how the originator of the database might want to participate in that, whether they want to participate as an equity partner or simply as a licensor.

MR. PERLMAN: Other comments on the idea of trying to be able to draw a line here somewhere?

PARTICIPANT: I would just add, going down the list of these hypotheticals in terms of H.R. 2652, at first blush some of these may not be actionable at all. Take, for example, the Chinese restaurant subdivision of the directory. There is an initial question of what the database is. Using something I am familiar with, like LEXIS-NEXIS, it is one thing if you copied an entire file library out of that database. I think that is an easier case than if you are taking one section.

PARTICIPANT: Of course, the flip side of that is trying to draft anything relative to databases becomes difficult if you cannot inherently define what a database is. Then none of the models make any sense.

PARTICIPANT: Language is an imperfect tool. There have to be some ambiguities written into the legislation. There is an element of res ipsa loquitur here.

PARTICIPANT: You are right. It depends on how you interpret the legislation, but certainly in the negotiations we had last summer, we both had the unpleasant experience of talking about this. H.R. 2652 addressed a qualitatively, not quantitatively, substantial part of the database, which of course is totally ambiguous if you are the second comer. How are you ever going to know what is qualitatively substantial to the owner? Moreover, we made it quite clear that one piece of the data was not considered qualitatively substantial, so there was an exclusion that was put in for one piece of data. However, it was made clear to us that two pieces of data very well could be qualitatively substantial and even quantitatively substantial. So with that analysis there is no question in my mind that the chapter dealing with Chinese restaurants within this bigger database of D.C. restaurants would be quantitatively and qualitatively substantial.

MR. PERLMAN: Let me change the focus without changing the problem. If you cannot solve some of these problems by defining at the outset what the standard is other than in an ambiguous way, how about going at it from the back door and excluding certain kinds of use specifically as being fair. I don't mean to draw too much of a metaphor to copyright law, but it would be useful articulating some things that would be permissible if the database is protected, and our focus is of course on the scientific community. How about just saying that any scientifically transformed use, any advance of science is permitted, something like that?

PARTICIPANT: I want to jump into Chris Overton's camp for a moment and bring it back to the scientific application. These kinds of database issues are endlessly fascinating, partly because the cases are of concern, but they may not be so applicable to the scientific databases simply because the scientific databases are amenable to self-help and other kinds of protections that aren't available for publishing restaurant lists. I think it would be a mistake to lose sight of this simply because it is so amusing to consider these listings. So this discussion may not be the most germane one for scientific databases.

PARTICIPANT: It strikes me that there may be a way to take your case and make it more relevant. For example, say someone was to take the whole database and instead of making another listing so that people could pick what restaurant they wanted, used it as a resource to mine for specific information, such as hot foods, or used it as a target list for marketing. You just want it either as a service or you make up listings or brochures and use it in a totally different way, mining out data that were there but were incidental to the intent. Maybe you need to blend in another database that says, “It is this kind of restaurant,” and you have a database that says, “For restaurants of this kind, they will use these food groups.” Then you cross correlate the two databases, and you could draw conclusions like that. I think your case could be extended to a more relevant scenario, something more like Chris Overton's. What happens then?

DR. OVERTON: Let me make a comment about what makes me uneasy about this notion of qualitative and significant pieces of databases. Suppose I do a bunch of queries on genome databases or a combination of genome-related databases, and they all look about the same to an outside user. The results of these queries return approximately the same amount of data, but in one of those I make a fabulous discovery leading to a patentable gene or something like that; by the way, this goes on all the time. The other queries don't lead anywhere and it is lipid, the gene for fat, or something like that. So was that qualitatively different from the other queries? I mean the economic outcome from that was substantially, hugely different for that particular query, but qualitatively from the point of view of a query, they look all the same. So how do we make a judgment call there?

MR. HUGHES: To answer that, H.R. 2652 was never focused on the economic result of taking the data. It has been the investment in the data, and it is hard to come up with great examples of situations where there is a great disparity in the quality of investment, but in the scientific field you can in different data. So if this datum required sending a plane over Mount Everest to get the temperature at a particular time and it goes into a data set of temperatures where the other temperatures were just recorded by people lying on the beach who were there anyway, maybe taking those airplane-measured high-atmosphere temperatures—a small number of them—is taking a qualitatively substantial part of the investment of the database versus a quantitative one.

So I think that the question is not the end result but the input, and I agree that that still creates an enormous problem for the user because the user has no immediate way to discern what are the investment-laden data and which ones aren't.

PARTICIPANT: Is there a way to exempt science from this process, and then we could all go home, and you commercial folks could just do what you wanted?

PARTICIPANT: One item is, it is not unfair competition if you don't use it for commercial gain.

PARTICIPANT: There are two answers to that. One, you have to cure it for the LaMacchia problem. Does everyone in this room know what the LaMacchia problem is? The problem is an ambitious Massachusetts Institute of Technology student who decided that software should be free and therefore put Windows and WordPerfect and a bunch of software on the Internet for people to download. He couldn't be prosecuted under the copyright law criminally because the criminal statute under copyright law requires for you to gain economically from your wrongdoing, which he did not. He gained ideologically. So the problem that everyone is conscious of is that, particularly in the digital environment where we do have some competing values, you have to take care of situations where people may be out to denude people of their investment purely on ideological grounds.

PARTICIPANT: Or personal grounds.

PARTICIPANT: Yes, and the second is the problem, which is just as vexing. Chris Overton has described situations where scientists create products that they share within a small community, which, from some commercial people's perspective, is taking away some market share or potential market share. That is the gray area that you have to figure out and then you could exempt science, if you could just do that.

PARTICIPANT: If you exempt science, what about economists? I think economists feel that they need a lot of data too, and I am sure they believe that what they do is every bit as socially useful and scientific as these—

PARTICIPANT: The dismal science is included.

PARTICIPANT: But in using a database, whether it constitutes some sort of infringement or unfair competition, it is valid to make a distinction between whether it is actually used for a commercial or ideological purpose or whether it merely represents a segment of the market in which you cannot deny use.

PARTICIPANT: The difficulty I think is that some of those may in fact affect the marketplace. If in academic research you point out a defect in a database or something like that, that may jeopardize the market. The person who is providing the database has no incentive to let you do this. I would like to find a way of somehow exempting academic not-for-profit activities, but it seems to me that the practical—

PARTICIPANT: I think there is a valid distinction. Obviously this has an impact. There is no question it has an impact. However, I think it is valid to distinguish whether or not there should be a subsidy for academic uses or scientific uses of a database, and I am less convinced of that, versus whether there should be an exemption for certain types of activities that have a complementary nature, incremental investment nature, improving the product type of nature. I think I find that easier, personally, to accept than an outright subsidy that says that academics don't have to pay for databases.

MR. PERLMAN: Any additional comments?

PARTICIPANT: I am puzzled as to whether any of the proposed models of legislation solve Chris Overton's problem. His problem happened without any change in legislation in the United States. Now, it may have had something to do with the European Union Database Directive, but the genie is out of the bottle already and there is no going back.

PARTICIPANT: That is a good point because one of the things that Harvey Perlman alluded to before was the Uniform Commercial Code (UCC) Article 2B drafting process where we know that one of the things it is trying to do, depending on your view, is to determine whether shrink-wrap licenses should be enforced. There is no question that there is a greater movement toward not only having license restrictions on online databases, but even on databases that are distributed in CD-ROM and other forms as well.

PARTICIPANT: It is a little broader than that. There is nothing in our focus so far with respect to fashioning a database bill that would prevent or otherwise interfere with the owner of a database contracting any way he wants to. If you cannot get the database because I haven't published it, unless you come to me and go through me, then I suppose I can extract from you any kind of compensation that I want, including limitations on use; at this point there is nothing that we talk about that can have any impact on that at all. You could say that if we could construct an appropriate balance of a database, one could clearly say that you couldn't contract around that balance, and there have been in the process some suggestions that that could be done. So I don't know whether that helps.

PARTICIPANT: Would you expect the bill to differentiate those databases generated with government grant money?

PARTICIPANT: That is a separate question. I think yes, but your problem is how.

MR. HUGHES: We have made that differentiation, and I am sure everyone will be grappling with that question of “How?” Does everyone understand the UCC Article 2B drafting process? I think that we are leaving some people in the dark.

So that everyone understands the framework, in the copyright world we have a well-understood system of fair use, and the question is whether or not as we move more and more toward licensed products and less and less toward physical copies of those products, which are subject to what is called the first-sale doctrine in copyright law, the owners of the copyrighted work can impose conditions on the use of the copyrighted work that go against the balance that the fair-use provisions bring into the copyright law. What Harvey Perlman is saying is that right now there is nothing to stop a database owner, particularly an online database owner, from imposing egregious terms, whatever terms they want in order to make the online database accessible. They do it now, though I would love to see more of those contracts because people talk about them and I don't get to see them as much.

The question is, Is there nothing there now to protect you? If you built a piece of legislation that set out permitted uses and reasonable uses, you would then have one of two options. Option number one would be you would say that those permitted uses and reasonable uses could not be contracted around. Now that may sound very desirable to a lot of people in this room. I am not sure how politically feasible it is, but that is a different issue. The second option would allow those permitted uses and reasonable uses and then you could say that you would be silent on whether or not they could be contracted around, as the copyright law is.

Now, I believe, even in a world where you are silent on that issue, if you had reasonable and permitted uses, this would put some brakes on the attempts to assert egregious terms and contracts because I as a commercial database vendor would then say, “Whoa, if I push the limit too far, he will take me into court, and I have a crap shoot as to whether or not the judge will say that this is an egregious contract under public policy terms, and I am not going to enforce it.” So even if you don't say that these permitted uses and reasonable uses can be contracted around, if you built them into the law, you would be creating some soft protections against overreaching in the form of, again, transaction costs. The commercial database maker won't want to try it.

PARTICIPANT: Let me speak against that argument. It is obviously a problem to create legislation that isn't necessary. So I disagree with the idea that there are no protections against egregious contract terms. The protection is in the discipline of the market, and the discipline of the market means that as long as there are competitors supplying similar software services on the Internet, if someone is requiring egregious contract terms, some other competitor is going to come in and offer better terms. So it is at best in these markets where you have only one supplier that this argument cuts. It would be a mistake to make such a legal revolution in respect of those perhaps rare circumstances of niche markets when perhaps most such markets have the discipline of the market undermining the egregious contract terms.

PARTICIPANT: The down side to that is that in many of the distribution models for digital information the terms of the contract aren't available prior to the payment, and in that circumstance it is unlikely that you will get as strong a competition over terms as you would otherwise.

PARTICIPANT: But you could solve that in contract law.

PARTICIPANT: Could I rephrase this conversation in terms of a specific example, and this gets back to text and journals. There isn't as much competition as you suggest. There are journal titles that are quite definitive in areas—for example, the American Society of Microbiology, which provides a digital form of its journal. The terms of the contract are that I can make the journal available at a seat in my library, no other seat. If I buy the hard copy, anyone can use it at any seat in the library or check it out. From my point of view, that makes the utility of the digital product useless. On the other hand, I have no alternative to that title. Is that the kind of thing that we are talking about?

PARTICIPANT: Yes. I am not saying that is an unfair provision, but that is the nature of the problem.

PARTICIPANT: Just a couple of quick observations. We have all been enjoying renting cars and flying on airplanes, and there is substantial competition in those industries. If you try to file a claim for lost luggage for more than the amount indicated on the back of the ticket, and I forgot my magnifying glass last time I flew, you will find that regardless of the contents of your suitcase, every airline has essentially the same amount of limitation, even in the most egregious circumstances under which the luggage was lost. If you don't have car insurance of your own and you rent that rental car, you are going to be under the rental agency's insurance policy. So trusting to the market is not necessarily going to produce, even in the context of robust competition, the results you want. There is an option for a term to become a competitive advantage or disadvantage, but in the face of whatever reason, whether it is a natural result or not, you can have uniform crummy standards that don't get you where you want even in a competitive industry.

The second observation is that we are talking about new intellectual property protection which hasn't existed before. The rationale for introducing that protection is that there is a gap, I will go so far as to say a perceived but relatively small gap, in existing protections, and yet the nature of the solutions under almost any of the models we are proposing here sweep very broadly.

So we are talking about a situation of making laws that sweep very broadly to deal with harms which many people have suggested are, if not hypothetical, at least minimal. So I would submit for people's comments and consideration that the thrust of the market theory is best applied in terms of “let us see what kind of protections are necessary as markets evolve,” and if it is very hard to track those protections in advance, do we need to?

MR. PERLMAN: Let me segue into the question from that which doesn't work very well, but I will do it anyway: Are there prerequisites that should be required of a database producer before protection is accorded? I think that question might be thought of in a couple of different ways, substantively and procedurally. Substantively, you could think that a database would have to meet some kinds of quality criteria before it would come under whatever protection we would give. Procedurally, should the database owner, in order to get protection under our narrow scheme, have to register or deposit or give notice or something like that? There seems to be a concern that we have heard in other sessions about the uncertainty of all this. What can I do? What can't I do? So maybe there should be some pre-steps that database owners should have to do before they acquire any protection at all.

What are your thoughts about that?

PARTICIPANT: I think the idea of registration is a very positive one. It is how you structure it; how you define what gets registered and what doesn't, but even the process of going through it intellectually gives you the mind-set that this is a database that I can now protect under a certain environment.

PARTICIPANT: It is just another administrative barrier to academics.

PARTICIPANT: I thought your question was the reverse, if it is not registered.

PARTICIPANT: Right.

PARTICIPANT: So how is the owner of the database supposed to know who is using it if they don't register it?

PARTICIPANT: Then they can register it.

PARTICIPANT: They should register it.

PARTICIPANT: Let me suggest that there is a particularly vulnerable time where it is very difficult to have a registration requirement. That is, the conceiving of a new database represents a new market. The database isn't predefined. It is very hard to register something that is under development, but during that time you are most vulnerable to other people grabbing your ideas. Basically all you can do is use trade secret or keep the idea under wraps; but in order to test many of these, you need a user community of some sort. It may be that this is an area where it is very difficult to define what should be protected and in which the law may offer no recourse if someone, for example, breaks your security and takes your idea.

PARTICIPANT: You would have recourse in that circumstance. A database is not an idea. It is a collection of facts or information.

PARTICIPANT: I would say that insofar as the way the data are organized and the way the interface to the data facilitates their use, that is an idea. That is a concept that gets beyond this notion of just pieces of data.

MR. PERLMAN: Moving along then, if you have no great concerns about a deposit requirement, then what about time limits? Under an unfair competition model, a narrow protection model, for how long should a database proprietor have this kind of protection?

PARTICIPANT: I cannot answer that question because what I find is in this discussion there are so many disparate kinds of databases. There is the weather database where the weather conditions are practically only good for today and by tomorrow no one cares if someone copies it, except if you are doing climatological statistics or something. This type of database is time dependent and it is not useful after that time has gone by.

Then there is the kind of data that doesn't ever change. Once you have collected them and someone steals them, they have stolen your database. They might make it pink instead of blue and sell it for more because consumers like pink better.

The kind of database that we work on is updated every day. You are improving it every day. You are deciding that this protein that has been in the database for 14 years could be given a better name, which when people do a search against the database will give them intellectual insight that may allow further discovery that they won't discover if you leave the name as XYZ protein or something that says nothing.

How are you going to determine the time limits in that case? I read somewhere that someone suggested that about half the time, the mean time between updates is a good time. If the database is updated every night, are you going to protect it for half a day? It makes no sense.

I have a problem with this idea that you would have to register it every time you updated it and you might make it shorter.

PARTICIPANT: I think that as is the case in state misappropriation law there would be no time limit at all.

PARTICIPANT: Would there be a need for it?

PARTICIPANT: Obviously people want a time limit, but I don't think, in the standard way things work, if the competitor is stealing something because it has a value, then at that point you are protected. It varies depending on the product.

PARTICIPANT: If you have a market-harm test in the bill, does the time limit become less relevant?

PARTICIPANT: It would seem to me it would.

MR. MAURER: A lot of misappropriation law has a ferocious time limit. It is the hot news limit; the logic of those cases is that the data are valuable for a limited, economically limited, period of time. The point is well taken that there will always be exceptions, but if most of the users in the society update a database every year or every two years, then that is something to shoot for. For example, you might do that as an extension of the hot news cases, that we want to protect data for the time it is necessary to allow the people who made the data to have the incentive to update them, and then they should go into the public domain.

PARTICIPANT: That is subject matter. Hot news is valuable because it comes from what you are protecting. So the time limits may be totally different.

MR. MAURER: It could be enough of an incentive though.

MR. HUGHES: The perversity of this, of course, is that if you created a legal regime based on how and when something was revised, you would create a disincentive for revision because if I know that it is good for a year because I revise every year, then I am not going to revise because you will take part of my market. So I will let it go to pot for 5 or 10 years and then come in.

PARTICIPANT: Not if you are in a competitive market you won't.

PARTICIPANT: The new version is protected though.

MR. MAURER: I was just saying that each update gets protected, but it only lasts two years whether you are doing anything or not.

PARTICIPANT: The new version of the database gets protected. However, I am fairly sure that in a lot of these markets where there is an annual update that if someone could free ride on my last year's investment and offer a product on the market, for example, for a buck compared to my $50 guide to cable services in the country, a lot of people would say, “You know, I can do with last year's version.”

MR. MAURER: I would make one point though. All the people who bought last year's guide, and the cable market is a good example, could just keep the old one. In our own lives we always want to go out and get that new edition, and that is why the updates have such a strength.

PARTICIPANT: I am not sure as a premise that we want the database creator to extract every bit of value that is there to be had. Both copyright and patent leave something to the public domain.

PARTICIPANT: The unfair competition model that we are trying to use doesn't allow you to extract all of the economic value because it is only the market that you are actually in, not the potential markets, which are part of the economic value.

PARTICIPANT: Words like “market” always make me nervous, but it is too important to ignore here because you might have very different legislative results. If you build a bill around the notion of “market harm” in the process, then you wind up with something like what I gather state law has been based on in part —that is, the action of the defendant in the suit sufficient to deprive the compiler of the database in the first place of the incentive to put the database together originally or to continue to maintain it. That is one standard, and that produces one debate and maybe different legislative results.

What we heard during the Senate negotiations over the summer was that a lost sale constitutes sufficient harm to warrant protection, and that was the goal to be targeted against. It was literally lost individual sales or individual licensing.

So when we go to market, as I think we obviously shouldn't have to, we have to be careful to translate that to the larger environment.

MR. PERLMAN: We have about one-half hour, and we should turn to questions 7 and 8, which we can handle together because the issue of government-funded or government-generated data is one of the things that distinguishes scientific data from some of the other areas that seem to be driving this. If one envisions a kind of narrow protection for databases built on market harm somehow defined, how would you go about approaching the problem of government data incorporated into a private database, and to what extent should they get the same protection? I think that is the issue, isn't it? To what extent should a publishing company be protected for grabbing all the federal court decisions or weather data?

PARTICIPANT: Chris Kelly, Justin Hughes, and I along with other people here had a conversation about that earlier. The government has had different policies about how they give those data away. One way is to auction the data to exclusive users, and a worse version of that is they give them away to an exclusive user. Another way is that they make the data available freely in a competitive market, and they let all the competitive users use them that want to. I think it is worth keeping in mind how those two approaches are different.

If you auction the data, that has the value of giving money back to the government, and that is a good thing. We would rather give money back to the taxpayers than let it go to the companies, but, in my view, it would be a bad thing because of the cost-recovery model. It supports an exclusive market because it creates a monopoly by auctioning to a single user. That model is a different model of how the government should behave than one in which they put the data out for competitive use and allow competitors in the same market to use the underlying government reports, compete away the value of the underlying government reports by which I mean keep prices low in the secondary market, transferring the value of the underlying source to the consumers. That model is less appealing to Congress because it is hard to say to Congress, “Yes, we have created this value that you cannot see,” but there is a lot of economics there. So these two models are important to these focal discussions.

DR. GILBERT: It does seem that this is an issue that can and should be left to contract between the government agencies and the developers and users of the data, perhaps with some statutory language about the good of the people. I can imagine a circumstance where the government invested funds to develop a data set, but for the data set to be useful it has to have another layer of development, and the government is not in the position to do it, and the only way it is going to get done is to contract it out under some exclusivity terms. So you wouldn't want a statute that says that you can never do that.

PARTICIPANT: May I just interject that the approach taken in the legislation that has been written thus far is that whatever other laws may apply to exclusive licenses between the government and a contractor, H.R. 2652 or its other iterations does not. So the database protection is separate from that. In other words, if the government licenses to a military contractor, the information is confidential. Whatever other remedies they may have in terms of keeping that information from being disclosed, this bill will not be one of them. That is an entirely separate carve-out.

There are a couple of issues here that I think are still under discussion and will certainly come up again. One issue is how one crafts provisions that deal with data that are required to be kept by a statute or regulation and the other is under what circumstances, if at all, should access be given, but those are distinct questions.

DR. OVERTON: One of the provisions I would like to see for government data is that even if they are sole source, all of the data have to be available to the scientific community in a cost-effective form. In other words, you couldn't sole source them to a provider. Again we go back to this interface of the database that restricts your use in some way to the full value of the data. The full value of the data is only there if you have all of the data, allowing for data mining, aggregating the data, and so on. These tasks can only be done when you have access through your tools often to the whole set of data. So I would say that this issue would have to be part of any provisions of distribution of government data.

PARTICIPANT: What if that provision meant that you would never get the data?

DR. OVERTON: No, that provision means I do get the data. What I am saying is that there has to be a provision that says that I get all of the data, not just some provider's view of the data.

PARTICIPANT: I think hypothetically you could have a circumstance in which certain types of data that you would like simply aren't going to be generated or put in a useful way unless there is some exclusivity.

DR. OVERTON: I am assuming that they are government data. So they have been generated, and then the issue is how they are going to be presented and made available to the scientific community.

MR. HUGHES: I will give you an example of Professor Gilbert's idea. At the Commerce Department years ago—and I have only now been learning a little bit about it in the past couple of weeks—for 35 years the Commerce Department published the U.S. Industrial Trade Outlook, which contained 35 years of very good, very useful statistics to economists, social scientists, and businesses on roughly 10 to 20 industrial sectors.

In 1994, budget cutbacks forced that to stop. Now, the International Trade Administration (ITA) was still able to do about seven industries, but it really wasn't enough of a critical mass to bring out the book. ITA entered into a CRADA with McGraw-Hill, and they now publish a book that covers more than 20 industries. That is an example of a situation where you have data sets that are useful but they aren't useful enough to be marketable, and you are trying to find a way to bring them out and to get them into the distribution flow. It is hard to come up with a rule for every circumstance, because as much as I believe in the public data model, I think that the idea is the right thing there in the face of their constraints.

PARTICIPANT: So that is a case where they were blended with the government data set. So you are saying that the government data set is no longer available except through McGraw-Hill?

MR. HUGHES: McGraw-Hill has been very generous in the sense that this book is now deposited in all the federal depository libraries, and 5,000 copies are made available to the ITA to distribute as they wish. So, in a sense, there is still a public domain of all the information, but it is an example where the government data could only be viable in the distribution system when they were blended with another layer of private-sector work.

PARTICIPANT: What this keeps coming back to is that what we really need is licensing people in the government who are sensitive to the consequences of licenses that they enter into and understand the trade-offs between exclusivity and access and try their best to make the right decisions. Traditionally, that is what we have counted on with patent licensing. I am not sure that it has always worked so far, but it seems to me that whoever does the licensing decisions is getting better as people in the government become more sensitive to the significance of the rights in the marketplace, and I guess we have to hope that would happen with data too.

PARTICIPANT: There are essentially two worlds in which you have to think about this. One world regulates the extent to which, and how, under what terms the government licenses the data to X.

In the other world there are questions: To what extent should I be able to do anything with the data X provides differently or more freely than I could if X was a commercial private database producer producing their own private information? Do I have any more rights? And regardless of the relationship between the database owner now and the government, should the rights in that database be more restricted because it contains government data?

MR. BAND: In every case it would be different, but the commercial publisher should only be able to protect that which he added. Again, it could be in the example that Justin Hughes was giving of this with the ITA. It could be that there was a lot of processing of the data by McGraw-Hill—that they took a lot of raw numbers, they processed them, and had a database. But I would submit that that kind of selection, coordination, and arrangement would certainly be covered by copyright.

So to the extent that some things would not be covered, because you would say that the data are government data and therefore not protected under a database bill, it could very well be that the selection, coordination, and arrangement of them is protected under copyright. Either way the publisher gets compensated.

PARTICIPANT: But can I buy the McGraw-Hill book wherever it is or get it free through the public domain distribution?

MR. HUGHES: You can go get everything in it from a depository library, and the depository library didn't pay for it.

PARTICIPANT: So I get it and I take all of the data that the ITA provided, and I—

MR. HUGHES: And McGraw-Hill does not assert any rights over that. That is correct, but we thought a lot about Jonathan Band's problem. Let me give you a very difficult problem, and that is what West Publishing is facing.

You have to understand that a lot of the value of what West Publishing does is that they go to the courthouse and they get the opinion, and the value is partly in the distribution system. Copyright law is very clear that federal material is uncopyrighted and it is supposed to be marked as such, and what is the material you can mark, and ideally there is some kind of citation system. If we are talking about protecting the investment in the database, I do understand West's problem, in that West goes out to hundreds of courts all over the country, gathers the cases, puts them online, and adds its own notes. It adds its own little citation system, but there is value in just the government information there. There is investment by West in gathering it from these hundreds of places, putting it into a format, and distributing it. I am puzzled about what we do about that because if Matthew Bender can come along and just download all of Federal Second and Federal Third Circuit Court opinions from West, and then take out all of West's footnotes and their original materials, then West has lost a substantial investment that they made when they went to all the courthouses.

So I am sympathetic to there being a problem here even when we are not talking about their visibly added value, because it is value-added through the collection and distribution process.

PARTICIPANT: That is like in the old days when it was laborious and hard to collect all the court decisions, and I think that it is easier now. In five years it will be still easier with every court posting everything on the Web, and it could be that technology has put West out of business, and that is life.

I had a meeting the other day with some newspaper publishers, and they were very concerned about their classified ads, among other things. A lot of newspapers are concerned about protecting their classified ads. They were saying, “Gee, it is terrible that people can go and pick out some of the listings from the classified ads and put them up on the Internet with other things and then the advertisers realize that and stop advertising in our classifieds.” For a little while I was sympathetic, and then I said, “Wait a minute, why should I be sympathetic? In five years a person is going to have to be crazy to pay a newspaper $40 for a classified ad when you can go to the eBay Web site for a quarter.” Why should we put a law in place that will preserve an antiquated way of doing business and impede a better way of doing business?

PARTICIPANT: If we do arrive at a world where West gets everything off the Internet, as you can get Michigan court opinions now, then Matthew Bender won't take Federal Second and Federal Third Court decisions from West, they'll get it off the Internet.

The whole premise of this is that it is a world where West is just an example of where there is an incentive for a free rider. If it is true that West really invests nothing, and I look forward to that world, then Matthew Bender will just take it off the Internet. They won't go to West.

PARTICIPANT: It might be a little bit of investment.

PARTICIPANT: In science there is always going to be someone out at the forefront of the technology creating data. The creation costs at that point will be high. Now, it is true at some point later in time that data won't have the same value, but for that window of time I can understand that they want to be compensated for that investment and creation, and science may benefit from the fact that they created the data now instead of waiting until the entire community could do it in a cheap way.

MR. MAURER: For 20 years LEXIS had a wonderful self-help system. They just didn't give you the whole database. You submitted a search. They did the search. They gave it back to you. The only reason these things are out in the public domain where people can copy them is that there are now other self-help games that involve giving out the disk, knowing that in a month's time someone else may get it, but in the meantime I am going to recover my investment in the disk. This whole subject is a dramatic example of how people find ways to protect themselves. This cannot be overlooked.

PARTICIPANT: Justin Hughes' point raises one thing that I think is something people have to keep in mind, which is that when you talk about the kind of investment that West certainly used to make, trudging to the courthouse and sweet talking the clerk into giving them those opinions, etc., you are also talking about just the sheer cost of assembling the database, which means that frequently you are talking about what we call a natural monopoly. You are relatively unlikely to see serious competition for someone who does what West does until it becomes cheaper to do what West does. One thing we are always thinking about when we talk about creating property rights for someone who is doing what West does is how it is going to affect what may be already significant power over a market.

PARTICIPANT: The natural instinct then is to think about regulation or compulsory licensing or other kinds of restraints, right?

PARTICIPANT: All of the things we shrink from.

PARTICIPANT: But again I think that the market power that West has is not so much over the future as it is over the past. If they are the only source for a lot of those old cases, then unfortunately we lawyers rely on precedent, and we are always looking at the old decisions because they support the proposition we are trying to advance.

MR. PERLMAN: Further comments on the government data question?

DR. SCOTCHMER: If there are no further comments that are immediately relevant, I want to come back to an issue that was raised about an hour ago by two people taking very different views and that is the question of how we should think of government-sponsored data as opposed to government-generated data; that is, granting to academic universities. One side brought up the Franklin stove as an example that without patent protection, a new idea just didn't get disseminated. Then the other side of that was, why should the users pay for the data, given that the government already sponsored them; those are two very opposed views of whether we should have protection on such data. That issue has been confronted before the database question in the Bayh-Dole Act for patents, and basically they bought the Franklin stove argument in order to get the universities to create the licensing infrastructure to get those things out in the public domain.

I don't know how to think about that. I realize it is true; there is a lot of ex post facto evidence that it is true because there has been a lot of licensing activity in universities. So I just wanted to point that out in relation to the previous discussion.

MR. PERLMAN: The last question is a general question, and we only have about 15 minutes left. So I invite comments from anyone with respect to the unfair competition model or any of the issues that have been raised.

MR. KAHIN: Let me just add something on Suzanne Scotchmer's point of the Franklin stove model versus the open model because which one works best depends on the size and the nature of the market. The principle of the Orphan Drug Act is that these markets are so small they need the Franklin stove model. For large markets or databases or technologies that have a lot of potential for broad applications building off of a lot of different directions, the open model may work better, the Internet being a classic example.

PARTICIPANT: Just to add to what Brian Kahin said, keep in mind that even when we buy into the Franklin stove model, the government preserves march-in rights on all patents. So we do adopt a fail-safe mechanism that says that if you don't market this correctly we can march in and market it instead. So, in effect, we have tried to have our cake and eat it too. In an ideal world there would be some grand federal licensing office that looked around and said, “Oh, you haven't licensed your patent very well. We will march in and put it out on the market.”

PARTICIPANT: Does that happen very often?

PARTICIPANT: I don't think so. It is theory.

PARTICIPANT: In the cases where the courts have been asked for that, they have refused.

MR. PERLMAN: General views? Other issues that seem to emerge from this model? Other models?

PARTICIPANT: I would like to return to a general issue that was discussed earlier, which is the question of what to do about the duration of protection and the difficulty of databases that are continually revised. It is an area that clearly needs a lot more thought. I am not convinced that it is a deposit problem, that it cannot be dealt with.

If you think of patent protection, for example, you file a patent, but the patent doesn't give you a right to all future improvements of that product. You get that product as it exists. You have the doctrine of equivalents, which says that you can exercise your patent rights with respect to not just that product, but things that are very similar to that product.

So it may be that if you registered or announced a database, and you would have to identify it in some sense but I don't know what the right term is, it seems to me that just by identifying what it is you are claiming confers some value that would apply to an incremental change to that database; if it is something that gets created every day, then it is not clear to me what exactly you are protecting. So obviously a lot more thought needs to go into this, and I don't want to even suggest that I have done any effective thinking.

PARTICIPANT: Say you were in an unfair competition model where the market harm, however defined, is the trigger for protection. For example, I do a 1990 phone book. The 1991 phone book is the 1990 phone book revised, and I revise it every year. In that setting what would be the market value for the 1990 phone book once the 1991 phone book comes out? Zero.

PARTICIPANT: No.

PARTICIPANT: All right, if it is not, then it has some market value, and someone else takes the 1990 outdated phone book and uses it to penetrate the residual market that you think still exists, would that violate the act?

PARTICIPANT: No.

PARTICIPANT: But at some point, maybe in 1999, the 1990 phone book loses most of its economic value at which point anyone else could come in and take it without a problem. I could also take the 1990 phone book if I was going to do something other than some potentially new market that the phone company wasn't exploiting at the time.

So I don't see that the revised issue is a problem that doesn't flow logically from the model that we are talking about. As long as there is a market for the outdated database, then there is a potential for market harm.

PARTICIPANT: Yes.

PARTICIPANT: If it is the case that the 1991 phone book is merely a small increment to the 1990 phone book, then under almost any regime, copying the 1990 phone book in order to get the 1991 phone book would be either a substantial harm or an infringement or whatever.

PARTICIPANT: It depends on what you are doing with the phone book. The market harm would be selling it, but I certainly cannot go around and say to homeowners that I am going to give you the 1990 phone book. Homeowners aren't going to take the 1990 phone book if they can get the 1991 edition.

PARTICIPANT: If it is a database, and I take that for the purposes of doing my own update to it to save all the initial entry costs and database building costs, and I take your file structure and do the updates, but all I have paid for is the update, where do I stand then?

MR. BAND: I think, at least under the hypothetical I gave under my moral compass, chances are if you took the 1998 edition and updated it to 1999, you probably would be infringing. Imagine a directory that really is not a phone book but something else that is updated rarely, for whatever reason, and at that point what I come out with is going to be a different product or be substantially different, then I would say that that would be a factor.

PARTICIPANT: Many of those cases leave at risk the sweat of the brow of the original investment.

PARTICIPANT: The term is an outside limit, and if you market for a particular database, it may fail at some point before that. There are other issues, obviously, which we don't have time to consider here, but I think that is something on which there is relative agreement.

PARTICIPANT: You cannot confuse the value left in the 1990 phone book with whether there is enough value in doing the update for 1991, which the gentleman who owns the 1990 phone book is going to do in 1991 whether that competing product comes out or not. Phone books are a good example; the 1990 phone book may be useful for a lot of purposes but you have got to believe that the phone company is still going to put out the 1991 phone book. There are markets like that.

PARTICIPANT: The purpose of that phone book is excluded from protection.

PARTICIPANT: It is all regulated, and they have to come out every year with a new phone book. Databases are being updated all the time. The real question again is, Why are we here? Why do we care about that? It is very difficult, if not impossible, to do some kind of market harm to that kind of database because even with existing technology I don't think there is a good way. For example, one of my clients has a database exactly like that, and they are not worried because they don't think that anyone is likely to spend the time and effort involved in going page by page and downloading—to the extent that you can even download. By the time you are done with that, the database is all different.

PARTICIPANT: I know, but it would depend. Again, most of the online databases that people are talking about involve a large amount of effort and assembling and so forth. The examples that are given, again, are updated often and the value is in them being current. That is why people want the data, and why they are so hard to copy that there is no chance for market failure.

PARTICIPANT: But there are some database products that are using outdatedness as a method of price discrimination as well in which there really is a market for the outdated version.

PARTICIPANT: You want to preserve competition because that seems to me to be one of the ways in which this market works. There is almost always a lower-quality data source that you can use in a lot of applications. I think that might also distinguish a lot of scientific applications in which you want everything. You want the whole thing. I know in a lot of other commercial applications there is always another lower-quality data set, but most of the really expensive, sensitive commercial financial applications don't even keep the old data. They are always marketing. They care about the new, the latest data, and to them what is one year old is valueless.

PARTICIPANT: Almost, but not entirely valueless for many scientific purposes.

MR. MAURER: This goes to price discrimination, which is a good thing; and we want to promote that. But a lot of it has to do with the data and also the mode of distribution. I keep going back to it with the LEXIS-NEXIS marketing decision to distribute some data on disk or on CD-ROM, knowing that it can be copied. But they weren't concerned because the information is only good for a month, which is enough of a lead time for their marketing so that it doesn't matter. The truth is that with a lot of these cases, for example the Warren publishing case, the data or information could have been distributed differently. They could have distributed the information online, in which case it would be very difficult for someone to download and to come out with a kind of competing product, but that wasn't the channel of distribution they chose. A lot of the problems here can be taken care of by designing a distribution form that minimizes the need for protection. They also acknowledge that they could have used licensing; and that too would have, in that case, taken care of it because the person bought one copy of the book. So if it had been a shrink-wrap license, if it was online maybe they would have had another form or another remedy against that person.

DR. OVERTON: Does the fair-use model provide a blanket restriction on copying a whole database? Here is what I have in mind. I am trying to make up something here. The reason you do this in the first place is to protect someone against competing with you in your market. But suppose I copied the phone book from someone and despite the fact that there is no problem with this, let us pretend there is. All I wanted to do was an analysis of all the first names of people by location, but in order to do that I had to copy the whole database. Am I prevented from doing that under fair use, for example?

PARTICIPANT: Under property rights and fair use you probably would be.

DR. OVERTON: That is what I was afraid of.

PARTICIPANT: But again it all depends on how the fair-use provision is worded.

PARTICIPANT: You could imagine that, unless this is very delicately worded, there are going to be plenty of cases like that, where I might come up with some bright idea that would require my use of the bulk data. It is going to be restricted.

PARTICIPANT: And I assume that under the Coble bill it would be prohibited, assuming phone books were included. It would be a substantial taking of data and of a potential market.

PARTICIPANT: It may well be. The problem with this is you have a question of how you are going to use the database and the harm that arises. Okay, you have copied it and you are using it for something, but is the harm something that the legislation is designed to prevent?

PARTICIPANT: In Dr. Overton's example, he is going to sell a directory of first names associated by geographic location.

PARTICIPANT: Was that really his question? Was Chris Overton then going to market your analysis or—

DR. OVERTON: Let us suppose I did. I am a scientist. So pretend I am going to market this analysis. I am going to do something that is not what the database was intended to be used for. I have come up with some completely new use for this database, but it depends on my having access to all of the data in order to do that.

PARTICIPANT: I think the argument could be under the Coble bill that those are potential markets, and they could argue that a potential market includes our licensing it for bizarre uses. I am exaggerating, but the argument is certainly the market potential for licensing the product for other uses. That is a potential market.

PARTICIPANT: And I think the word “potential” in itself is somewhat circular. I agree. However, in the legislation there have been attempts to cabin it in such a way that there are elements of custom such that if this is something that this company normally does or that is normally done in the industry, it is reasonable to expect that they would go into this area; then yes, you have a problem. You may well have a problem.

MR. PERLMAN: I think this conversation could probably go on forever and be continually interesting and nuanced. On behalf of the NRC study committee, I want to thank all of you for participating in this session. It has been helpful, and I think many of us who have to work on a set of recommendations are going to be enlightened by this conversation.