11

An Unfair Competition Model for Protecting Databases

MR. PERLMAN: I am Harvey Perlman, and I will serve as moderator for this breakout session. I am a professor of law at the University of Nebraska. I have taught unfair competition law and intellectual property for about 30 years and have participated in the background on this issue by giving some advice to the National Academy of Sciences regarding this latest database bill. I am also fairly actively involved in the discussions with respect to the licensing provisions of the Uniform Commercial Code, Article 2B. Before we begin this session, I would like all the participants to provide a brief statement about who you are and either who you represent or why you are here.

MR. BAND: I am Jonathan Band with Morrison & Foerster. The clients that I have represented in this database area are both in the financial services industry and information technology industry, and they tend to be skeptical of the legislation that has been proposed thus far. We have been trying to advocate a more narrow form of database protection than that which was introduced in the 104th and then in the 105th Congress.

DR. LOFTUS: Philip Loftus. I am with Glaxo Wellcome. I am going to be rapporteur for this session.

MR. KAHIN: I am Brian Kahin with the White House Office of Science and Technology Policy, and I have been working with Chris Kelly and Justin Hughes and others within the administration on this issue.

DR. LEDLEY: I am Robert Ledley from the Protein Information Resource. I testified before Congressman Coble's committee last year on the bill that was passed in some form by the House and that I thought was pretty good. The only thing is when they were finished, Mr. Coble said to me, “Dr. Ledley, you seem to have a different opinion than all your other scientific colleagues. Could you please write me a letter and tell me why. You know, I am gathering data.”

DR. BARKER: I am Winona Barker, also with the Protein Information Resource, and I am also an ex officio member of the U.S. National Committee for CODATA. I tend to be skeptical of more legislation, although it seems to me that the genie is already out of the bottle. It may not be possible to easily correct the problems that people like Chris Overton are having, but I am not sure that more legislation is going to make it better.

DR. GILBERT: I am Richard Gilbert, professor of economics at the University of California, Berkeley.

MR. COHEN: I am Bill Cohen from the Federal Trade Commission, and I am interested in the competition model.

MR. BARRON: I am Ed Barron, counsel of the Senate Judiciary Committee for the ranking member Senator Leahy.

DR. GILMAN: I am Paul Gilman. I work for a company called Celera Genomics. We are investing about $300 million to create a database, and we are very concerned that, while we want it to be widely accessible to the research community, some of our commercial competitors will simply copy our database and make it available in some other way.

MR. MAURER: I am Steve Maurer. I am an intellectual property lawyer at Berkeley, California. I had the privilege of working on a background report for the National Research



Copyright © National Academy of Sciences. All rights reserved.



PROCEEDINGS OF THE WORKSHOP ON PROMOTING ACCESS TO SCIENTIFIC AND TECHNICAL DATA FOR THE PUBLIC INTEREST: AN ASSESSMENT OF POLICY OPTIONS

Council study committee (“Raw Knowledge: Protecting Technical Databases for Science and Industry,” in Appendix C of these Proceedings).

DR. BROWN: I am Carole Ganz Brown from the Division of International Programs of the National Science Foundation, and I have been working with the General Counsel's Office on various of our research provisions on these issues.

DR. MCDOWELL: I am Bruce McDowell with the National Academy of Public Administration. We have a small contract from the U.S. Geological Survey (USGS) to take a look at some of the potential data limitations that might affect a global disaster information network, which is one of the Vice President's initiatives. The idea of that network is to put together everything in one place that an emergency manager might need worldwide and share it in real time. Not a lot of attention has been given to the difficulties of achieving that grand desire. So, we are taking a look at intellectual property, privacy, liability, and security issues that might limit the information that should go into that system or be shared through it.

MR. KELLY: I am Chris Kelly. I work with intellectual property issues at the Antitrust Division of the Justice Department, where I have worked with Richard Gilbert and learned a lot from him, and I have been working with Brian Kahin and Justin Hughes on these issues for the last year and one-half or so.

MR. MOHR: My name is Chris Mohr. I represent the coalition that supported the goals that went through the House database bill.

MS. SAEZ: I am Carolina Saez with the U.S. Copyright Office.

MS. KELLY: I am Maureen Kelly, from BIOSIS. We are a not-for-profit publisher of a secondary database, which means that we are both a user and a producer of scientific information.

MR. RINDFLEISCH: I am Tom Rindfleisch from Stanford University. I am director of the Lane Medical Library where we are attempting to become a digital library to disseminate information for clinical care and research and education. I am also a computer scientist who has been involved in a number of projects trying to synthesize various kinds of databases (data resources for new kinds of applications).

DR. BENSON: I am Dennis Benson from the National Center for Biotechnology Information at the National Institutes of Health (NIH), and our group is responsible for building and distributing GenBank.

DR. WILLIAMS: Myra Williams from Molecular Applications Group, representing the genomic sector.

DR. OVERTON: I am Chris Overton, director of the Center for Bioinformatics at the University of Pennsylvania. I represent the academic sector for use of genomic information, and one of my chief concerns is that the pending legislation looks like it is going to be a hidden tax on knowledge. In my opinion, it is going to impede biomedical research.

MR. PETTINGER: I am Larry Pettinger from the USGS, and my main involvement has been in representing USGS in some of the discussions among the federal science agencies on these issues.

MR. PERLMAN: Let me introduce the session and do a couple of things. One, the purpose of these sessions was to get a dialogue between those who see the need for additional protection and those who are concerned about it, and we have the panel organized in such a way that we thought we would produce that result. Unfortunately, Mike Klipper is not here because of the weather. I think he is a strong advocate for increased protection, and I will play devil's advocate if the occasion arises to try to
present that view, but I am not as passionate about it as some others around this table. So please feel free if you represent that view to help me.

The idea for this session was to go through the questions that were provided for the workshop. (See Box 11.1 for a list of questions used to guide the discussion.) I will ask Jonathan Band, as he indicates that he is skeptical of increased protection, and whoever else would like to chime in, to take a minute or two and respond to each question from their particular points of view. Then the three people of the panel who are active in the area, all of them in the biotechnology area as it turns out, can respond to the two lawyers' positions. Tom Rindfleisch and Dennis Benson, who are both noncommercial data users and disseminators, will give us their reaction to these comments, and then I'll open the discussion up for questions. If it seems this strategy is working as we answer one or two of these questions, then we will continue that way, but I don't want to have this as a restraint on free interaction and discussion. So I am not going to hold you to these questions; our experience during yesterday's breakout sessions was that conversation tends to blend the questions together.

Box 11.1: Issues for the Discussion Session on an Unfair Competition Model Protecting Databases

Identify the potential benefits and problems of this legal model in your database activities in comparison to the status quo.
How would you define the scope of prohibited activities by users? Should the law distinguish between different categories of users? If so, how?
What specific provisions regarding access and use (both authorized and unauthorized) would you want included in such legislation? Why?
What specific exclusions and limitations on the rights of database owners (e.g., by category of user, type of use, or type of database) would you want included?
Should sole-source databases be subject to any greater requirements for openness (e.g., compulsory licenses, fee regulation, etc.)? Why?
Are there prerequisites that a database producer should meet before protection is accorded? Why?
Should the property right be limited in time? If so, what's an appropriate length of time, and why?
Are there any special provisions needed for access to and use of government data incorporated into privately produced databases? If so, what should they be, and why?
Are there any special provisions needed for access to and use of data generated through government-sponsored research by parties outside government? If so, what should they be, and why?
Identify other issues important to public-interest access to and use of data and databases under the unfair competition model, and state why they are important. In particular, are there any technological trends that may alter the balance of rights substantially?

Let me set the parameters for our discussion, again not in a rigid way but at least to focus our attention. At this workshop, there are concurrent discussion sessions like this one dealing with different models of possible database protection, and labels are less important than the provisions of a particular bill. In this session, we are supposed to discuss what is designated an
unfair competition model, and there might be some confusion about what we mean by that because there is some debate about whether the Coble bill of the last congressional session is an unfair competition model. I would like to frame the issue fairly narrowly, at least to start out. My view of an unfair competition model is one that allows a database owner protection against activity that directly competes with and prevents that owner from capturing the economic value of the database. It is fairly narrow in the sense that it comes out of the unfair competition tradition, where you only are protected against acts that would likely prevent you from making the investment in the first instance and only against acts that interfere with competition in the product that you are currently selling.

I think this is where there is debate between whether the Coble bill is an unfair competition model or a property rights model. The Coble bill, as some of you may know, allowed a database owner to be protected both in the actual markets in which they were engaged and also in any potential markets; and when you open it to potential markets, which means any market discovered in the future, as to how this database might be economically exploited, you essentially end up with a property or very close to a property rights bill. I don't want to hold that distinction rigidly, but at least you get a sense that our model is one that attempts to define specific behavior that we find interferes with the investment of the owner, as opposed to some other models that might make sure that an owner can exploit all of the actual and potential benefits of the database.

With that, I start with Jonathan Band. The first question is to identify the benefits and problems that this model might make in your database activities in comparison to the status quo.

MR. BAND: I think as always the devil is in the details, and it depends on exactly how the unfair competition model is structured. If it is truly narrowly drawn in the manner that Harvey Perlman was outlining, I think that it has a lot of benefits and relatively few problems. The benefits, of course, would be that to the extent that there is a gap (and I am not sure there is a gap) in existing forms of protection, this would largely fill it.

One of the problems that database producers have talked about, and certainly among the problems that are talked about most often, is wholesale copying by a direct competitor. The cases or examples that have been cited where that is a problem have been the Zeidenberg case [see ProCD v. Zeidenberg, 86 F.3d 1447 (7th Cir. 1996)], where a graduate student copied a CD-ROM of the telephone directory and put that on the Internet, or the publishing case that was raised yesterday, which involved the scanning in or the keying in of a huge amount of information straight from the Cable Fact Book and making it available in digital form. Again, the theory or the argument of proponents of additional legislation is that with digital technology it is easier to copy and easier to disseminate a database, so that the risk of this sort of wholesale appropriation, which would totally wipe out the value to the original producer, is why there is a need for additional protection.

I think an unfair competition model will address this problem. If copyright, license, or technological protections, or all those other forms of protection don't work, this will be an additional weapon in the arsenal of the database publisher to get at someone who is doing bodily appropriation of the data. At the same time it does not preclude most forms of value-added activity, so that it doesn't prohibit you from taking some of the information and developing a different kind of product, whether it is in a potential market, neighboring market, or call it what you will as long as it is a different market. Thereby this kind of added protection does not stifle second-generation innovation, and so that is why it seems to have little downside.

The real question then, and again this is where the details will come in, is how broadly the legislation is drafted. What about some kind of value-added product, which at the margins
may compete with the original product, or what if someone just has raw data, and then someone else adds value so that those data become useful for the first time? Of course that will hurt the actual market for the first comer, for the first publisher, and so that is a gray area where a lot would depend on how the bill is drafted and how it is applied by courts. This is something that we can talk about, whether this is what we want to encourage or discourage. We are not, in theory, concerned about the progress of science and the arts, but certainly it seems to me that that should be an ultimate goal here, along with consumer welfare.

MR. PERLMAN: Would the representative from the coalition like to respond rather than have me try to do it?

MR. MOHR: Unfortunately I am not authorized to respond to this question. I probably will interject some comments later on.

MR. PERLMAN: I suppose the argument on the other side is that, to the extent that you want significant investment in some of these databases that are not protected by copyright, the more returns and exploitation one can achieve through the database, the more money is going to come into the database owner to continue to invest in keeping it up or increasing new ones. In addition, if transformative uses are a significant part of the marketplace, one would presume that the owner of the database would license that database to permit those transformative uses, but extracting a fee for that privilege.

DR. LEDLEY: I think it ultimately is the argument, but there is a nominative advantage also to having protection, which is that even if you don't charge the user, then the user must at least contact you so that you know who the user is. That is often extremely important, especially with government-funded materials, to identify what the usage is, essentially to know how important your database is. Without some way of inquiring or at least being able to request that the user identify himself, then there is no way of keeping track, and keeping track is a very important thing.

MR. PERLMAN: I will call on the various panelists, first to respond, if you want to respond, and then we will open up the discussion. The issue is a model with fuzzy fringes that essentially would make different uses free as opposed to a model that would allow the database owner to either limit, restrict, or otherwise exploit other uses besides the one that he is currently using it for.

PARTICIPANT: I just want to respond to this comment. If the major reason you want this legislation is to track uses, then any law like this is just overkill. I don't think that would justify anything that would make such a substantial change.

DR. LEDLEY: I am not saying that this is the major reason. I am just saying that this is another reason.

MR. PERLMAN: Why don't we let each of the panelists respond, and then we will have a discussion.

DR. OVERTON: I make a living using transformations of databases and integration of databases. So in some sense, my primary activity is to take fairly large chunks of other databases, combine them, and do something innovative with them. My concern is certainly what the consequence of this legislation is going to be because the other part of this is that the transformed databases that we generate are then provided to other scientists through the Web or through bulk downloads of the whole database. As far as I understand the different options, this unfair competition model seems the least abusive, from my perspective, of any of the options I have heard so far. In any case, even with the unfair competition model, my concern is conveyed with the following scenario: I have created a new
database, and now for each piece of data that I integrated into this new database, I am going to have to track that piece of data. If I serve that datum up to someone on the Web and I have to be concerned somehow with the copyright or fees for usage of this, I am going to have to do that for every single piece of data in the new combination, in the new form that I have created it in. When they hear something like this, my colleagues in the computer science department say, “That is cool. This is going to mean a new research project for us.” It is an area called data provenance, and I use data provenance to do exactly that. I keep track of every piece of data that comes into the database and the origin of that piece of data, but that is a research project, and when I would be able to use that in practice is years down the road. In the meantime this legislation could restrict my use of this information.

MR. PERLMAN: Was everyone here yesterday so that you know what Chris Overton does with the databases? Do you want to give that 15 seconds?

DR. OVERTON: I take multiple, heterogeneously distributed databases (they could be databases from all over the Web or local databases) and I transform and combine them to produce a new database, a data warehouse of that information. As part of that activity, I add value to the data through various means so that there is a lot of work that goes into creating these new databases. But the bottom line is that the new database, the derived database, is composed of elements of existing databases, plus manual curation, plus derived data through computation. So we have new insights that are in our databases based on the data, the next-level-down information from previous databases.

MR. PERLMAN: And these are genetic data that give you insight about the information by combining those you wouldn't get . . .

PARTICIPANT: That you wouldn't get otherwise, exactly.

DR. WILLIAMS: Molecular Applications Group has some overlap with what Chris Overton has described in terms of the creation of derivative databases that add significant value over the original database. We also have software that dynamically accesses over 150 sites concurrently using the World Wide Web. These sites can be specified by the customer. Some companies are more interested in agricultural sites. Some want all seven-transmembrane-related sites. Others want things that are very specific. Our software can be easily customized. The system knows where to look for respective types of information and then populates the database with it. These are very large databases. The idea of having to track exactly which databases were accessed, what information was used, what percentage of that came from which database is just a nightmare because we are talking about data compilations that will grow to terabytes in size very quickly. That aspect would be one of great concern.

The thing that was quite positive to me was hearing Harvey Perlman's narrow definition of unfair competition. I think there are two things that need to be clarified as we do any kind of legislation. One issue is the protection provided to the database vendor, and the other is the protection provided to the user of the information. Let me elaborate on each of these.

As long as the protection provided to the database vendor is very narrow and is specified, as Harvey Perlman described, where it says, “I am protecting only that which has already been created, and I am not, in fact, protecting against future possibilities that have not yet been implemented,” then I don't have much of a problem. What I found formidable was the idea that someone in retrospect could say, “Oh, that derivative database included some of my data, and I have been intending to do that as well.” How does the provider document that they had that idea? That becomes a very fuzzy area that would be very difficult for us to define. I don't see
great advantage from the added protection, especially for our particular organization, but I also don't see that much of a problem. If it prevents these overt cases of abuse that we have been hearing about at this meeting, then protection is probably appropriate.

At the same time, however, in terms of science and technology, it is very important that the rights of the user be protected, whether that user is an academic, a not-for-profit organization, or a commercial organization. One of the things that we found out yesterday is that there seemed to be a lack of clarity on the part of the scientists, the administrative staff at universities, and certainly those of us who are data users on exactly what our rights are to have access to data. For example, it was said that under new legislation if even 10 percent of the funding to create a database came from a government grant, you are obligated to provide access to those data to everyone. I don't think that is widely known. It certainly doesn't seem to be known among some of the groups that we have been talking about; and the rules are different according to whether or not the work was funded by a grant or by a contract or by a cooperative research and development agreement (CRADA). So, as we consider any kind of legislation, it is very important for all of us to understand what our rights are to have access to the data and in fact to build value-added derivative products from the data without having to pay exorbitant fees to all the people who are involved.

DR. BENSON: I would like to echo that because I think as a general comment there is concern about legislation. Most people, I think, involved in science don't appreciate the subtleties of the law. As this legislation was proposed, if you went to Web sites and saw a number of the issue papers that were prepared, it seems to me that a lot of the issues were, in fact, addressed by some of the subsequent legislation that came out. Yet overall, the impact was a chilling one that scientists felt that this was going to impact their day-to-day use of databases. We have to be very careful about the message that goes out about this legislative process and how it will impact or not impact day-to-day science.

In terms of this specific issue about misappropriation of data, this would not affect our organization because we go to the end user directly and collect sequence data from the end user. There is one area that would be a potential danger I think, which was alluded to yesterday, and that is in terms of electronic publishing where journals may be completely in the electronic realm and the data that support the underlying article may be part of that electronic publication, and the publisher may retain rights to all of the background or underlying data. In our particular case, if a publisher were to retain rights to the sequence data, that could obviously be an impediment to the free exchange of sequence data that we currently have. I think that is one concern we would have.

MR. RINDFLEISCH: I want to speak primarily from the user point of view, but I would like to distinguish what I think are three main areas of technology or formulation of these kinds of repositories that we are talking about. First are the data themselves, whether they are genomic data, whether they are textual data from the literature, or whatever. Second is the interface that the database provider generates in terms of the expectation of how users are going to use those data. Third is the user, the person who is trying to accomplish a particular task, care for a patient, do research, construct material for a course; and from the user's point of view, the user is unconcerned about where the data come from. The user is trying to pull together different kinds of information that will allow optimal accomplishment of whatever that task is, and I am concerned that the kind of protection that we are talking about will impede the optimal development of the new technologies that are just now becoming widely available in the development of tools that let people do their jobs.

OCR for page 218
PROCEEDINGS OF THE WORKSHOP ON PROMOTING ACCESS TO SCIENTIFIC AND TECHNICAL DATA FOR THE PUBLIC INTEREST: AN ASSESSMENT OF POLICY OPTIONS We have run into situations where vendors put a lot of money into developing databases, into developing these interfaces, and want to license them. For example, I run the Lane Medical Library at Stanford, and we have approximately 251 titles online digitally that represent various kinds of vendors. The vendors are primarily focused on having you use the data as they present them. They do not want the end user to be able to go flexibly between publishers and all of these databases to accomplish their task. That is an impediment to the optimal use of these data; and, in fact, the economic protections that we are talking about are intended to protect the marketplace against having to do further innovation. That means that we have talked to vendors and said, “Your database is faulty in the following way . . .. It does not accomplish the following tasks . . . .” The database vendors look at it from a business point of view. What will it cost me to change that interface or to change the organization of the data in order to accomplish this new task? If the investment is high or if the income stream is already quite profitable so that these vendors feel that it is not worth the additional investment, putting these protections in place I believe will impede the development of technology that we are only beginning to understand. I also believe that the long-term economic advantage of these technologies is to free the development, the exploration, the interrelationship of these different kinds of information resources in ways that the vendors have never imagined. 
So, whereas I am sympathetic to the investment of large amounts of effort to accumulate these data, I don't see that the protection that is warranted should be any more rigorous than that for a company in Silicon Valley developing a new piece of Web software or a new piece of applications software that a competitor can look at or duplicate. The length of time of the technological advantage in that kind of an arena is very short, and in fact maybe a measure of this is to look at Moore's law, which says that computer technology turns over about every 18 months. So why should the legal restraints we put in place on the uses of these data, and on the ways we are conceiving of taking advantage of these technologies, be any more durable? I feel a tension between understanding the provider's point of view and taking the user's point of view, where we are trying to do the best possible thing that we can to improve patient care, to improve the efficiency of engineering new artifacts, and to do research that needs to make use of these data in innovative ways; we absolutely have got to avoid restraining that innovation.

MR. PERLMAN: The discussion is open now, so I invite people to comment.

MR. HUGHES: Justin Hughes, Patent and Trademark Office. I don't think it makes much sense to apply Moore's law to this database issue. Moore's law is about the computing power of chips, not about anything else related to the computer. Back to the other point. I am not sure what you said about software, but I don't think it was quite right. You characterized software as something people can go out and duplicate, but they cannot go out and duplicate it. They can reverse engineer it under certain conditions, but copyright protection for software does provide some viable protection of the investment, and there is a coterie of people all over this building and over at the office of the U.S. Trade Representative who practically devote their lives to preventing people like the Chinese from duplicating our software, and we don't stop at 18 months and say, “Go ahead, take that version of DOS.”

MR. RINDFLEISCH: I understand what you are saying, and I was not implying that Moore's law, which has traditionally applied to hardware, rigorously applies to this new area, but I believe that the development of the ideas that are embodied in software, that are embodied in the ways we organize and interlink information, is a new generation of technology that we are just beginning to explore. That is what these new companies like Yahoo and Excite and others are coming to produce: new products that rely not so much on the underlying software and hardware technology but on the ways in which information is put together.

PARTICIPANT: But to more precisely draw the analogy you are trying to draw, you would want to explore the social parameters that we should impose or do impose on interoperability of software and say, “Yes, we would like everyone to invest in developing innovative software products, but we want those products to be interoperable with each other,” and therefore what kind of terms and conditions do we put on that? That might be a closer analogy to what we need to do in the database world to make sure that people can take bits and pieces from things in a useful way.

MR. MAURER: I find this exchange very interesting and useful, and I think one of the things that we need to focus on is how long a protection period should be. The Europeans said 15 years; the Americans came back and said, “We have to match the Europeans,” so they proposed 25 years. I think there is a question about how long it takes a database owner from an economic point of view to recover their investment, and certainly for the American companies it isn't anything like a time horizon of 15 or 25 years. That, I think, is the heart of what Mr. Rindfleisch was trying to say. The other thing I think we should keep in mind as the great strength of the unfair competition approach is that it is traditionally a sensible case-by-case view that gives you flexibility, which is often a good thing.
One of the things that has come out in this workshop, I think, is that there is a gray scale of possible protections, and the challenge is to find something that gives enough protection to the database producers but doesn't give them so much protection that you get pathology. I think it is an empirical question ultimately of how much is enough, and whether you need a strict copyright model or whether something less will do is something that needs to be looked at very closely.

MR. KAHIN: On this interoperability issue, I didn't see it as a question of interoperability of software. It is having the ability to use the data in a way that allows the user to interoperate the data, that is, to construct the user's own interface.

MR. RINDFLEISCH: That is what I meant. I meant it just as a metaphor.

PARTICIPANT: But there are new technologies that interlink and facilitate users' tasks, which involve using these pieces of information in ways that the provider never imagined.

DR. GILBERT: I wonder if we are conceptualizing a hypothetical unfair competition model that doesn't really exist anywhere; and that is a question that maybe Jonathan Band or others who have experience with this model can answer. It seems to me that the unfair competition model does have a lot of advantages that we could articulate, but ultimately it does come down to the details and what the legal precedents are for what constitutes different markets and whether a particular data enhancement would represent unfair competition under this enforcement regime or whether it would not.

MR. BAND: I think you have put your finger on the issue. There is this existing, let us say, misappropriation doctrine, which deals with “hot news,” and courts have applied different standards and different definitions of what is hot news and so forth. That is probably narrower than what we are talking about here because I don't think that the kind of model that Harvey Perlman was positing would be hot news.
The stock market, for example, is hot for 15 minutes; I don't think we are thinking about something that is of such finite duration. I think the idea is that it is a hypothetical conceptual model of another alternative: an alternative to what was introduced last year in Congress and, according to Marybeth Peters, is going to be introduced next week once again by Congressman Coble. So there is a different proposal. Even then it is going to be subject to interpretation. In response to the next question, what I want to do is run through some cases. I don't know whether the language I will come up with really explains those cases or would lead to that result, but these are what I think the results should be in a couple of cases. Again, maybe people will think that it is too much or too little, but it will be helpful in terms of focusing the discussion. But you are right; it all comes down to details and definitions.

MR. PERLMAN: It is clear that this model would require judicial intervention at the margin. So, if certainty is a requirement for your conception of what the law should be in this area, then I think this is not the model.

PARTICIPANT: But neither is the property rights model.

MR. PERLMAN: I understand. There is no such thing as certainty.

PARTICIPANT: On a spectrum that might be more certain than this or at least the burden of proof which ...

PARTICIPANT: I just wanted to say three things. Two of them reply to Chris Overton's comments. I don't believe that even the Coble bill that was introduced and didn't go any place prohibits or requires identification of each individual part of the database. It just doesn't require tracking every piece of data. I don't believe that is so. Also, I think that the derived database is a different database. That is my understanding. So, therefore, even if you use our database in your database it wouldn't do anything to us.
We would like to know if you did use our databases so that we can tell people, but it is not a requirement as far as I can tell. We had something like 50 redistributors at one time, and they patched our database with other things and added programs and went ahead and disseminated it; well, we knew who they were because generally in those days they had to ask us for it. So we sent them the tapes.

PARTICIPANT: Why do you want to know? Why do you care?

PARTICIPANT: We care because then we could tell the National Library of Medicine, which funds us, “Look at how broad the usage is of our database.” They repackage it, and their customers use it. We have direct customers and so forth. This is very important. How else are they going to justify spending all that money on us if they don't know that it is doing any good? All right, those are the two things. Now, the third thing is a very short story, which has not been mentioned here. Why would you want to bother with this database legislation to begin with? I will use as an example the patent system. This isn't a patent, but it is just an example. Benjamin Franklin invented the Franklin stove, and in his autobiography he said that he didn't patent his Franklin stove because he wanted it to be available to the public. That was the end of it. He invented it. He didn't patent it, and no one made it. Someone in England read about it. They patented it, and it became very popular. So, one of the reasons for protection is to make the idea, the concept, the database, whatever it is, available to the public. Without that it may not ever become available to the public. No one is going to pay to advertise something for which they are not going to get any remuneration. You have got to pay for the advertising.

PARTICIPANT: Let me ask the question from the other side. Why should you be able to do what you do without at least contributing something to the cost of the databases that you are mining?
What is the theory for that?

DR. OVERTON: That is not my job. My job is to advance science and biomedical knowledge.

PARTICIPANT: My job is to advance the wealth of my family by making things better, but I have to pay for all the goods and services I use. So why shouldn't you?

PARTICIPANT: I would say that universities, in fact, do generate much of the data that go into these databases, in the literature, in the scientific research that becomes part of the gene banks and clinical trials. I think that this is basically where the data come from, and these are the people who are looking for ways of optimizing that process.

PARTICIPANT: And to the extent that content is published, universities are the big payers for that as well.

PARTICIPANT: We give it away and buy it back. It is a wonderful process.

PARTICIPANT: If you have a public database that is being provided free of charge to many different commercial users and there is competition in those commercial-user markets, then what is going to happen is the commercial users are going to compete away the profits that accrue from access to the underlying public database. That doesn't mean that the value vanishes; it means that the value is transferred to consumers. And so in that sense it may be a good thing that the value isn't transferred back to the original database, if it is publicly funded. But of course the analysis of that problem is very different if the underlying database is a private one, in which case you have to transfer profit back. And that is a very good example of the trade-off between access and protection.

PARTICIPANT: If I may, just two other answers to that question, especially the case where the underlying data were derived commercially, not with public funds. One is that I think one can argue that the raw data have little use.
They are useful only when they are organized, when there is some degree of an interface, and when they are presented in a useful way. So the incentive in developing the legal regime should be not just the collecting of the data but taking at least the next step or few steps in transforming the data in a way that is useful. If you give too much protection to simply collecting the data, then you reduce the publisher's incentive to do the next step to make the data useful. That is one argument: that you encourage not only the gathering of information but the gathering and the processing in a useful way. The second point is that the economics of certain areas of the information market are such that often you can only have one player. The investment is so large. However, it is historical information. The existing players have enormous advantages by virtue of having been there first, and so you have a very serious competition problem. If you don't give those publishers, who are the sole source for either historical or economic reasons, an incentive to make the data more useful, then they are simply going to sit there as monopolists often do and get their monopoly rent and impede innovation and competition. You need to have a way of making sure that you have competition, innovation, and progress in a useful way as opposed to just raw material.

PARTICIPANT: Is there anything unique about scientific and technical data that informs the answer to this question?

PARTICIPANT: I think that to a certain degree there is. There are a zillion databases of all kinds, and the barriers, of course, are intended to protect any databases, scientific or not, such as furniture databases or mattress databases.

PARTICIPANT: Yes, there are all kinds of databases, but the one thing that is true is that with scientific databases there are more databases that are built on a not-for-profit basis and

not, you can have uniform crummy standards that don't get you where you want even in a competitive industry. The second observation is that we are talking about new intellectual property protection which hasn't existed before. The rationale for introducing that protection is that there is a gap, I will go so far as to say a perceived but relatively small gap, in existing protections, and yet the nature of the solutions under almost any of the models we are proposing here sweeps very broadly. So we are talking about a situation of making laws that sweep very broadly to deal with harms which many people have suggested are, if not hypothetical, at least minimal. So I would submit for people's comments and consideration that the thrust of the market theory is best applied in terms of “let us see what kind of protections are necessary as markets evolve,” and if it is very hard to track those protections in advance, do we need to?

MR. PERLMAN: Let me segue from that into the question, which doesn't work very well, but I will do it anyway: Are there prerequisites that should be required of a database producer before protection is accorded? I think that question might be thought of in a couple of different ways, substantively and procedurally. Substantively, you could think that a database would have to meet some kinds of quality criteria before it would come under whatever protection we would give. Procedurally, should the database owner, in order to get protection under our narrow scheme, have to register or deposit or give notice or something like that? There seems to be a concern that we have heard in other sessions about the uncertainty of all this. What can I do? What can't I do? So maybe there should be some pre-steps that database owners should have to do before they acquire any protection at all.
What are your thoughts about that?

PARTICIPANT: I think the idea of registration is a very positive one. It is how you structure it; how you define what gets registered and what doesn't, but even the process of going through it intellectually gives you the mind-set that this is a database that I can now protect under a certain environment.

PARTICIPANT: It is just another administrative barrier to academics.

PARTICIPANT: I thought your question was the reverse, if it is not registered.

PARTICIPANT: Right.

PARTICIPANT: So how is the owner of the database supposed to know who is using it if they don't register it?

PARTICIPANT: Then they can register it.

PARTICIPANT: They should register it.

PARTICIPANT: Let me suggest that there is a particularly vulnerable time where it is very difficult to have a registration requirement. That is, the conceiving of a new database represents a new market. The database isn't predefined. It is very hard to register something that is under development, but during that time you are most vulnerable to other people grabbing your ideas. Basically all you can do is use trade secret or keep the idea under wraps; but in order to test many of these, you need a user community of some sort. It may be that this is an area where it is very difficult to define what should be protected and in which the law may offer no recourse if someone, for example, breaks your security and takes your idea.

PARTICIPANT: You would have recourse in that circumstance. A database is not an idea. It is a collection of facts or information.

PARTICIPANT: I would say that insofar as the way the data are organized and the way the interface to the data facilitates their use, that is an idea. That is a concept that gets beyond this notion of just pieces of data.

MR. PERLMAN: Moving along then, if you have no great concerns about a deposit requirement, then what about time limits? Under an unfair competition model, a narrow protection model, for how long should a database proprietor have this kind of protection?

PARTICIPANT: I cannot answer that question because what I find is in this discussion there are so many disparate kinds of databases. There is the weather database where the weather conditions are practically only good for today and by tomorrow no one cares if someone copies it, except if you are doing climatological statistics or something. This type of database is time dependent and it is not useful after that time has gone by. Then there is the kind of database that doesn't ever change. Once you have collected the data and someone steals them, they have stolen your database. They might make it pink instead of blue and sell it for more because consumers like pink better. The kind of database that we work on is updated every day. You are improving it every day. You are deciding that this protein that has been in the database for 14 years could be given a better name, which when people do a search against the database will give them intellectual insight that may allow further discovery that they won't discover if you leave the name as XYZ protein or something that says nothing. How are you going to determine the time limits in that case? I read somewhere that someone suggested that about half the mean time between updates is a good time. If the database is updated every night, are you going to protect it for half a day? It makes no sense.
I have a problem with this idea that you would have to register it every time you updated it and you might make it shorter.

PARTICIPANT: I think that as is the case in state misappropriation law there would be no time limit at all.

PARTICIPANT: Would there be a need for it?

PARTICIPANT: Obviously people want a time limit, but I don't think, in the standard way things work, if the competitor is stealing something because it has a value, then at that point you are protected. It varies depending on the product.

PARTICIPANT: If you have a market-harm test in the bill, does the time limit become less relevant?

PARTICIPANT: It would seem to me it would.

MR. MAURER: A lot of misappropriation law has a ferocious time limit. It is the hot news limit; the logic of those cases is that the data are valuable for a limited, economically limited, period of time. The point is well taken that there will always be exceptions, but if most of the users in the society update a database every year or every two years, then that is something to shoot for. For example, you might do that as an extension of the hot news cases, that we want to protect data for the time it is necessary to allow the people who made the data to have the incentive to update them, and then they should go into the public domain.

PARTICIPANT: That is subject matter. Hot news is valuable because it comes from what you are protecting. So the time limits may be totally different.

MR. MAURER: It could be enough of an incentive though.

MR. HUGHES: The perversity of this, of course, is that if you created a legal regime based on how and when something was revised, you would create a disincentive for revision because if I know that it is good for a year because I revise every year, then I am not going to revise because you will take part of my market. So I will let it go to pot for 5 or 10 years and then come in.

PARTICIPANT: Not if you are in a competitive market you won't.

PARTICIPANT: The new version is protected though.

MR. MAURER: I was just saying that each update gets protected, but it only lasts two years whether you are doing anything or not.

PARTICIPANT: The new version of the database gets protected. However, I am fairly sure that in a lot of these markets where there is an annual update that if someone could free ride on my last year's investment and offer a product on the market, for example, for a buck compared to my $50 guide to cable services in the country, a lot of people would say, “You know, I can do with last year's version.”

MR. MAURER: I would make one point though. All the people who bought last year's guide, and the cable market is a good example, could just keep the old one. In our own lives we always want to go out and get that new edition, and that is why the updates have such a strength.

PARTICIPANT: I am not sure as a premise that we want the database creator to extract every bit of value that is there to be had. Both copyright and patent leave something to the public domain.

PARTICIPANT: The unfair competition model that we are trying to use doesn't allow you to extract all of the economic value because it is only the market that you are actually in, not the potential markets, which are part of the economic value.

PARTICIPANT: Words like “market” always make me nervous, but it is too important to ignore here because you might have very different legislative results.
If you build a bill around the notion of “market harm” in the process, then you wind up with something like what I gather state law has been based on in part: that is, whether the action of the defendant in the suit is sufficient to deprive the compiler of the database in the first place of the incentive to put the database together originally or to continue to maintain it. That is one standard, and that produces one debate and maybe different legislative results. What we heard during the Senate negotiations over the summer was that a lost sale constitutes sufficient harm to warrant protection, and that was the goal to be targeted against. It was literally lost individual sales or individual licensing. So when we go to market, as I think we obviously shouldn't have to, we have to be careful to translate that to the larger environment.

MR. PERLMAN: We have about one-half hour, and we should turn to questions 7 and 8, which we can handle together because the issue of government-funded or government-generated data is one of the things that distinguishes scientific data from some of the other areas that seem to be driving this. If one envisions a kind of narrow protection for databases built on market harm somehow defined, how would you go about approaching the problem of government data incorporated into a private database, and to what extent should they get the same protection; I think that is the issue, isn't it? To what extent should a publishing company be protected for grabbing all the federal court decisions or weather data?

PARTICIPANT: Chris Kelly, Justin Hughes, and I along with other people here had a conversation about that earlier. The government has had different policies about how they give those data away. One way is to auction the data to exclusive users, and a worse version of that is they give them away to an exclusive user.
Another way is that they make the data available freely in a competitive market, and they let all the competitive users use them that want to. I think it is worth keeping in mind how those two approaches are different.

If you auction the data, that has the value of giving money back to the government, and that is a good thing. We would rather give money back to the taxpayers than let it go to the companies, but, in my view, it would be a bad thing because of the cost-recovery model. It supports an exclusive market because it creates a monopoly by auctioning to a single user. That model is a different model of how the government should behave than one in which they put the data out for competitive use and allow competitors in the same market to use the underlying government reports, compete away the value of the underlying government reports, by which I mean keep prices low in the secondary market, transferring the value of the underlying source to the consumers. That model is less appealing to Congress because it is hard to say to Congress, “Yes, we have created this value that you cannot see,” but there is a lot of economics there. So these two models are important to these focal discussions.

DR. GILBERT: It does seem that this is an issue that can and should be left to contract between the government agencies and the developers and users of the data, perhaps with some statutory language about the good of the people. I can imagine a circumstance where the government invested funds to develop a data set, but for the data set to be useful it has to have another layer of development, and the government is not in the position to do it, and the only way it is going to get done is to contract it out under some exclusivity terms. So you wouldn't want a statute that says that you can never do that.

PARTICIPANT: May I just interject that the approach taken in the legislation that has been written thus far is that whatever other laws may apply to exclusive licenses between the government and a contractor, H.R. 2652 or its other iterations does not.
So the database protection is separate from that. In other words, if the government licenses to a military contractor, the information is confidential. Whatever other remedies they may have in terms of keeping that information from being disclosed, this bill will not be one of them. That is an entirely separate carve-out. There are a couple of issues here that I think are still under discussion and will certainly come up again. One issue is how one crafts provisions that deal with data that are required to be kept by a statute or regulation and the other is under what circumstances, if at all, should access be given, but those are distinct questions.

DR. OVERTON: One of the provisions I would like to see for government data is that even if they are sole source, all of the data have to be available to the scientific community in a cost-effective form. In other words, you couldn't sole source them to a provider. Again we go back to this interface of the database that restricts your use in some way to the full value of the data. The full value of the data is only there if you have all of the data, allowing for data mining, aggregating the data, and so on. These tasks can only be done when you have access through your tools often to the whole set of data. So I would say that this issue would have to be part of any provisions of distribution of government data.

PARTICIPANT: What if that provision meant that you would never get the data?

DR. OVERTON: No, that provision means I do get the data. What I am saying is that there has to be a provision that says that I get all of the data, not just some provider's view of the data.

PARTICIPANT: I think hypothetically you could have a circumstance in which certain types of data that you would like simply aren't going to be generated or put in a useful way unless there is some exclusivity.

DR. OVERTON: I am assuming that they are government data. So they have been generated, and then the issue is how they are going to be presented and made available to the scientific community.

MR. HUGHES: I will give you an example of Professor Gilbert's idea. At the Commerce Department years ago (and I have only now been learning a little bit about it in the past couple of weeks) for 35 years the Commerce Department published the U.S. Industrial Trade Outlook, which contained 35 years of very good, very useful statistics to economists, social scientists, and businesses on roughly 10 to 20 industrial sectors. In 1994, budget cutbacks forced that to stop. Now, the International Trade Administration (ITA) was still able to do about seven industries, but it really wasn't enough of a critical mass to bring out the book. ITA entered into a CRADA with McGraw-Hill, and they now publish a book that covers more than 20 industries. That is an example of a situation where you have data sets that are useful but they aren't useful enough to be marketable, and you are trying to find a way to bring them out and to get them into the distribution flow. It is hard to come up with a rule for every circumstance, because as much as I believe in the public data model, I think that the idea is the right thing there in the face of their constraints.

PARTICIPANT: So that is a case where they were blended with the government data set. So you are saying that the government data set is no longer available except through McGraw-Hill?

MR. HUGHES: McGraw-Hill has been very generous in the sense that this book is now deposited in all the federal depository libraries, and 5,000 copies are made available to the ITA to distribute as they wish.
So, in a sense, there is still a public domain of all the information, but it is an example where the government data could only be viable in the distribution system when they were blended with another layer of private-sector work.

PARTICIPANT: What this keeps coming back to is that what we really need is licensing people in the government who are sensitive to the consequences of licenses that they enter into and understand the trade-offs between exclusivity and access and try their best to make the right decisions. Traditionally, that is what we have counted on with patent licensing. I am not sure that it has always worked so far, but it seems to me that whoever does the licensing decisions is getting better as people in the government become more sensitive to the significance of the rights in the marketplace, and I guess we have to hope that would happen with data too.

PARTICIPANT: There are essentially two worlds in which you have to think about this. One world regulates the extent to which, and how, under what terms the government licenses the data to X. In the other world there are questions: To what extent should I be able to do anything with the data X provides differently or more freely than I could if X was a commercial private database producer producing their own private information? Do I have any more rights? And regardless of the relationship between the database owner now and the government, should the rights in that database be more restricted because it contains government data?

MR. BAND: In every case it would be different, but the commercial publisher should only be able to protect that which he added. Again, it could be in the example that Justin Hughes was giving of this with the ITA. It could be that there was a lot of processing of the data by McGraw-Hill, that they took a lot of raw numbers, they processed them, and had a database.
But I would submit that that kind of selection, coordination, and arrangement would certainly be covered by copyright.

So to the extent that some things would not be covered, because you would say that the data are government data and therefore not protected under a database bill, it could very well be that the selection, coordination, and arrangement of them is protected under copyright. Either way the publisher gets compensated.

PARTICIPANT: But can I buy the McGraw-Hill book wherever it is or get it free through the public domain distribution?

MR. HUGHES: You can go get everything in it from a depository library, and the depository library didn't pay for it.

PARTICIPANT: So I get it and I take all of the data that the ITA provided, and I ...

MR. HUGHES: And McGraw-Hill does not assert any rights over that. That is correct, but we thought a lot about Jonathan Band's problem. Let me give you a very difficult problem, and that is what West Publishing is facing. You have to understand that a lot of the value of what West Publishing does is that they go to the courthouse and they get the opinion, and the value is partly in the distribution system. Copyright law is very clear that federal material is uncopyrighted and it is supposed to be marked as such, and what is the material you can mark, and ideally there is some kind of citation system. If we are talking about protecting the investment in the database, I do understand West's problem, in that West goes out to hundreds of courts all over the country, gathers the cases, puts them online, and adds its own notes. It adds its own little citation system, but there is value in just the government information there. There is investment by West in gathering it from these hundreds of places, putting it into a format, and distributing it.
I am puzzled about what we do about that because if Matthew Bender can come along and just download all of Federal Second and Federal Third Circuit Court opinions from West, and then take out all of West's footnotes and their original materials, then West has lost a substantial investment that they made when they went to all the courthouses. So I am sympathetic to there being a problem here even when we are not talking about their visibly added value, because it is value-added through the collection and distribution process.

PARTICIPANT: That is like in the old days when it was laborious and hard to collect all the court decisions, and I think that it is easier now. In five years it will be still easier with every court posting everything on the Web, and it could be that technology has put West out of business, and that is life. I had a meeting the other day with some newspaper publishers, and they were very concerned about their classified ads, among other things. A lot of newspapers are concerned about protecting their classified ads. They were saying, “Gee, it is terrible that people can go and pick out some of the listings from the classified ads and put them up on the Internet with other things and then the advertisers realize that and stop advertising in our classifieds.” For a little while I was sympathetic, and then I said, “Wait a minute, why should I be sympathetic? In five years a person is going to have to be crazy to pay a newspaper $40 for a classified ad when you can go to the eBay Web site for a quarter.” Why should we put a law in place that will preserve an antiquated way of doing business and impede a better way of doing business?

PARTICIPANT: If we do arrive at a world where West gets everything off the Internet, as you can get Michigan court opinions now, then Matthew Bender won't take Federal Second and Federal Third Court decisions from West, they'll get it off the Internet.
The whole premise of this is that it is a world where West is just an example of where there is an incentive for a free rider. If it is true that West really invests nothing, and I look

forward to that world, then Matthew Bender will just take it off the Internet. They won't go to West.

PARTICIPANT: It might be a little bit of investment.

PARTICIPANT: In science there is always going to be someone out at the forefront of the technology creating data. The creation costs at that point will be high. Now, it is true at some point later in time that data won't have the same value, but for that window of time I can understand that they want to be compensated for that investment and creation, and science may benefit from the fact that they created the data now instead of waiting until the entire community could do it in a cheap way.

MR. MAURER: For 20 years LEXIS had a wonderful self-help system. They just didn't give you the whole database. You submitted a search. They did the search. They gave it back to you. The only reason these things are out in the public domain where people can copy them is that there are now other self-help games that involve giving out the disk, knowing that in a month's time someone else may get it, but in the meantime I am going to recover my investment in the disk. This whole subject is a dramatic example of how people find ways to protect themselves. This cannot be overlooked.

PARTICIPANT: Justin Hughes' point raises one thing that I think is something people have to keep in mind, which is that when you talk about the kind of investment that West certainly used to make, trudging to the courthouse and sweet talking the clerk into giving them those opinions, etc., you are also talking about just the sheer cost of assembling the database, which means that frequently you are talking about what we call a natural monopoly. You are relatively unlikely to see serious competition for someone who does what West does until it becomes cheaper to do what West does.
One thing we are always thinking about when we talk about creating property rights for someone who is doing what West does is how it is going to affect what may be already significant power over a market.

PARTICIPANT: The natural instinct then is to think about regulation or compulsory licensing or other kinds of restraints, right?

PARTICIPANT: All of the things we shrink from.

PARTICIPANT: But again I think that the market power that West has is not so much over the future as it is over the past. If they are the only source for a lot of those old cases, then unfortunately we lawyers rely on precedent, and we are always looking at the old decisions because they support the proposition we are trying to advance.

MR. PERLMAN: Further comments on the government data question?

DR. SCOTCHMER: If there are no further comments that are immediately relevant, I want to come back to an issue that was raised about an hour ago by two people taking very different views, and that is the question of how we should think of government-sponsored data as opposed to government-generated data; that is, grants to academic universities. One side brought up the Franklin stove as an example that without patent protection, a new idea just didn't get disseminated. Then the other side of that was, why should the users pay for the data, given that the government already sponsored them; those are two very opposed views of whether we should have protection on such data. That issue has been confronted before the database question in the Bayh-Dole Act for patents, and basically they bought the Franklin stove argument in order to get the universities to create the licensing infrastructure to get those things out in the public domain.

I don't know how to think about that. I realize it is true; there is a lot of ex post facto evidence that it is true because there has been a lot of licensing activity in universities. So I just wanted to point that out in relation to the previous discussion.

MR. PERLMAN: The last question is a general question, and we only have about 15 minutes left. So I invite comments from anyone with respect to the unfair competition model or any of the issues that have been raised.

MR. KAHIN: Let me just add something on Suzanne Scotchmer's point of the Franklin stove model versus the open model because which one works best depends on the size and the nature of the market. The principle of the Orphan Drug Act is that these markets are so small they need the Franklin stove model. For large markets or databases or technologies that have a lot of potential for broad applications building off of a lot of different directions, the open model may work better, the Internet being a classic example.

PARTICIPANT: Just to add to what Brian Kahin said, keep in mind that even when we buy into the Franklin stove model, the government preserves march-in rights on all patents. So we do adopt a fail-safe mechanism that says that if you don't market this correctly we can march in and market it instead. So, in effect, we have tried to have our cake and eat it too. In an ideal world there would be some grand federal licensing office that looked around and said, “Oh, you haven't licensed your patent very well. We will march in and put it out on the market.”

PARTICIPANT: Does that happen very often?

PARTICIPANT: I don't think so. It is theory.

PARTICIPANT: In the cases where the courts have been asked for that, they have refused.

MR. PERLMAN: General views? Other issues that seem to emerge from this model? Other models?
PARTICIPANT: I would like to return to a general issue that was discussed earlier, which is the question of what to do about the duration of protection and the difficulty of databases that are continually revised. It is an area that clearly needs a lot more thought. I am not convinced that it is a deposit problem, that it cannot be dealt with. If you think of patent protection, for example, you file a patent, but the patent doesn't give you a right to all future improvements of that product. You get that product as it exists. You have the doctrine of equivalents, which says that you can exercise your patent rights with respect to not just that product, but things that are very similar to that product. So it may be that if you registered or announced a database, and you would have to identify it in some sense but I don't know what the right term is, it seems to me that just by identifying what it is you are claiming confers some value that would apply to an incremental change to that database; if it is something that gets created every day, then it is not clear to me what exactly you are protecting. So obviously a lot more thought needs to go into this, and I don't want to even suggest that I have done any effective thinking.

PARTICIPANT: Say you were in an unfair competition model where the market harm, however defined, is the trigger for protection. For example, I do a 1990 phone book. The 1991 phone book is the 1990 phone book revised, and I revise it every year. In that setting what would be the market value for the 1990 phone book once the 1991 phone book comes out? Zero.

PARTICIPANT: No.

PARTICIPANT: All right, if it is not, then it has some market value, and someone else takes the 1990 outdated phone book and uses it to penetrate the residual market that you think still exists, would that violate the act?

PARTICIPANT: No.

PARTICIPANT: But at some point, maybe in 1999, the 1990 phone book loses most of its economic value at which point anyone else could come in and take it without a problem. I could also take the 1990 phone book if I was going to do something other than some potentially new market that the phone company wasn't exploiting at the time. So I don't see that the revised issue is a problem that doesn't flow logically from the model that we are talking about. As long as there is a market for the outdated database, then there is a potential for market harm.

PARTICIPANT: Yes.

PARTICIPANT: If it is the case that the 1991 phone book is merely a small increment to the 1990 phone book, then under almost any regime, copying the 1990 phone book in order to get the 1991 phone book would be either a substantial harm or an infringement or whatever.

PARTICIPANT: It depends on what you are doing with the phone book. The market harm would be selling it, but I certainly cannot go around and say to homeowners that I am going to give you the 1990 phone book. Homeowners aren't going to take the 1990 phone book if they can get the 1991 edition.

PARTICIPANT: If it is a database, and I take that for the purposes of doing my own update to it to save all the initial entry costs and database building costs, and I take your file structure and do the updates, but all I have paid for is the update, where do I stand then?

MR. BAND: I think, at least under the hypothetical I gave under my moral compass, chances are if you took the 1998 edition and updated it to 1999, you probably would be infringing.
Imagine a directory that really is not a phone book but something else that is updated rarely, for whatever reason, and at that point what I come out with is going to be a different product or be substantially different, then I would say that that would be a factor.

PARTICIPANT: Many of those cases leave at risk the sweat of the brow of the original investment.

PARTICIPANT: The term is an outside limit, and if you market for a particular database, it may fail at some point before that. There are other issues, obviously, which we don't have time to consider here, but I think that is something on which there is relative agreement.

PARTICIPANT: You cannot confuse the value left in the 1990 phone book with whether there is enough value in doing the update for 1991, which the gentleman who owns the 1990 phone book is going to do in 1991 whether that competing product comes out or not. Phone books are a good example; the 1990 phone book may be useful for a lot of purposes but you have got to believe that the phone company is still going to put out the 1991 phone book. There are markets like that.

PARTICIPANT: The purpose of that phone book is excluded from protection.

PARTICIPANT: It is all regulated, and they have to come out every year with a new phone book. Databases are being updated all the time. The real question again is, Why are we here? Why do we care about that? It is very difficult, if not impossible, to do some kind of market harm to that kind of database because even with existing technology I don't think there is a good way. For example, one of my clients has a database exactly like that, and they are not worried because they don't think that anyone is likely to spend the time and effort involved in going page by page and downloading, to the extent that you can even download. By the time you are done with that, the database is all different.

PARTICIPANT: I know, but it would depend.
Again, most of the online databases that people are talking about involve a large amount of effort and assembling and so forth. The

examples that are given, again, are updated often and the value is in them being current. That is why people want the data, and why they are so hard to copy that there is no chance for market failure.

PARTICIPANT: But there are some database products that are using outdatedness as a method of price discrimination as well in which there really is a market for the outdated version.

PARTICIPANT: You want to preserve competition because that seems to me to be one of the ways in which this market works. There is almost always a lower-quality data source that you can use in a lot of applications. I think that might also distinguish a lot of scientific applications in which you want everything. You want the whole thing. I know in a lot of other commercial applications there is always another lower-quality data set, but most of the really expensive, sensitive commercial financial applications don't even keep the old data. They are always marketing. They care about the new, the latest data, and to them what is one year old is valueless.

PARTICIPANT: Almost, but not entirely valueless for many scientific purposes.

MR. MAURER: This goes to price discrimination, which is a good thing; and we want to promote that. But a lot of it has to do with the data and also the mode of distribution. I keep going back to it with the LEXIS-NEXIS marketing decision to distribute some data on disk or on CD-ROM, knowing that it can be copied. But they weren't concerned because the information is only good for a month, which is enough of a lead time for their marketing so that it doesn't matter. The truth is that with a lot of these cases, for example the Warren publishing case, the data or information could have been distributed differently.
They could have distributed the information online, in which case it would be very difficult for someone to download and to come out with a kind of competing product, but that wasn't the channel of distribution they chose. A lot of the problems here can be taken care of by designing a distribution form that minimizes the need for protection. They also acknowledge that they could have used licensing; and that too would have, in that case, taken care of it because the person bought one copy of the book. So if it had been a shrink-wrap license, if it was online maybe they would have had another form or another remedy against that person.

DR. OVERTON: Does the fair-use model provide a blanket restriction on copying a whole database? Here is what I have in mind. I am trying to make up something here. The reason you do this in the first place is to protect someone against competing with you in your market. But suppose I copied the phone book from someone and despite the fact that there is no problem with this, let us pretend there is. All I wanted to do was an analysis of all the first names of people by location, but in order to do that I had to copy the whole database. Am I prevented from doing that under fair use, for example?

PARTICIPANT: Under property rights and fair use you probably would be.

DR. OVERTON: That is what I was afraid of.

PARTICIPANT: But again it all depends on how the fair-use provision is worded.

PARTICIPANT: You could imagine that unless this is very delicately worded that there are going to be plenty of cases like that where I might come up with some bright idea that would require my use of the bulk data. It is going to be restricted.

PARTICIPANT: And I assume that under the Coble bill it would be prohibited assuming phone books were included. It would be a substantial taking of data and potential market.

PARTICIPANT: It may well be. The problem with this is you have a question of how you are going to use the database and the harm that arises.
Okay, you have copied it and you are using it for something, but is the harm something that the legislation is designed to prevent?

PARTICIPANT: In Dr. Overton's example, he is going to sell a directory of first names associated by geographic location.

PARTICIPANT: Was that really his question? Was Chris Overton then going to market your analysis or ...

DR. OVERTON: Let us suppose I did. I am a scientist. So pretend I am going to market this analysis. I am going to do something that is not what the database was intended to be used for. I have come up with some completely new use for this database, but it depends on my having access to all of the data in order to do that.

PARTICIPANT: I think the argument could be under the Coble bill that those are potential markets, and they could argue that a potential market includes our licensing it for bizarre uses. I am exaggerating, but the argument is certainly the market potential for licensing the product for other uses. That is a potential market.

PARTICIPANT: And I think the word “potential” in itself is somewhat circular. I agree. However, in the legislation there have been attempts to cabin it in such a way that there are elements of custom such that if this is something that this company normally does or that is normally done in the industry, it is reasonable to expect that they would go into this area; then yes, you have a problem. You may well have a problem.

MR. PERLMAN: I think this conversation could probably go on forever and be continually interesting and nuanced. On behalf of the NRC study committee, I want to thank all of you for participating in this session. It has been helpful, and I think many of us who have to work on a set of recommendations are going to be enlightened by this conversation.