APPENDIX A
Network Technology

NETWORK BUILDING BLOCKS

The essence of an electronic network is connectivity between computers. The first computers ran as stand-alone machines that could be accessed only from their immediate physical location.[1] Later, they became reachable through "dumb" terminals connected to hard-wired lines and dial-in ports. More recently, computer-to-computer connections were implemented through hard-wired and dial-in ports and through local area and wide area networks. Today, many computers participate in the growing global network, sometimes referred to as "the net." One of the main components of this network is the Internet, which itself is a network of networks with international reach. Transmissions over the global network may pass through several computers and network gateways before reaching their final destinations.

[1] Even today, there are valid reasons to establish "islands" of computing capability that do not interact with other systems. For example, a corporation may choose not to connect its network or its individual computers to the external world because of security concerns. Still, recent experience demonstrates that important and powerful synergistic effects are possible when many individual computing elements are connected to each other.

When computers are networked, the network must support a method of uniquely addressing the various computers connected to it. Any user regardless of location (but connected to the network) who wishes
to interact with another computer must be able to refer specifically to that computer and not some other computer. Assigning unique addresses to computers on a network is the equivalent of assigning unique telephone numbers to telephones connected to the telephone system.

Computer networks can implement store-and-forward communications, real-time connections, and distributed computing.

Store-and-Forward Communication

With store-and-forward communication, the contents of a communication are temporarily stored on intermediate computers before reaching their final destination. Electronic mail is a good example. A message is typed at the originating computer and is then handed to a "mailer" running on the same computer or on another "mail host" computer tied in through a local area network. Over some period of time, the message will be transferred to and temporarily stored at a series of other computers until it reaches its ultimate destination. At this destination, it is stored on a host computer until the addressee checks her or his electronic mail.

Store-and-forward systems now generally have various enhancements, including the capability to attach files and perform transfers. However, because the native communication protocols of intermediate nodes are not controllable throughout the process, or may in fact be unknown to all systems, file transfers are often limited to text. In any case, the transmission of the message does not create a real-time connection between the sender's and the recipient's computers, and so true real-time interactivity is not possible.

One technical note: packet-switched networks also implement a kind of "store-and-forward" communication of the packets that are the basic unit of transmission. However, intermediate nodes forward packets nearly instantaneously, and these packets remain at the intermediate nodes for very short times.
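As a rough illustration of the electronic mail example above, the following Python sketch models a message being stored at, and then forwarded by, a chain of mail hosts until it reaches the addressee's mailbox. The class name MailHost, the host names, and the message fields are all hypothetical and chosen only for illustration; real mail systems are considerably more involved.

```python
from collections import deque


class MailHost:
    """Hypothetical mail host: stores messages, then forwards or delivers them."""

    def __init__(self, name, next_hop=None):
        self.name = name
        self.next_hop = next_hop   # the next MailHost on the route, if any
        self.stored = deque()      # messages held temporarily on this host
        self.mailboxes = {}        # delivered messages, keyed by addressee

    def accept(self, message):
        # Store the message and record that it passed through this host.
        message["path"].append(self.name)
        self.stored.append(message)

    def run_transfer_cycle(self):
        # Runs "over some period of time": each stored message is either
        # delivered locally or handed to the next host on the route.
        while self.stored:
            message = self.stored.popleft()
            if message["host"] == self.name:
                self.mailboxes.setdefault(message["user"], []).append(message)
            elif self.next_hop is not None:
                self.next_hop.accept(message)
            else:
                raise RuntimeError("no route to " + message["host"])


# Three hosts in a fixed chain (an assumption made to keep the sketch small).
dest = MailHost("dest.example")
relay = MailHost("relay.example", next_hop=dest)
origin = MailHost("origin.example", next_hop=relay)

msg = {"user": "alice", "host": "dest.example", "body": "hello", "path": []}
origin.accept(msg)

# Sender and recipient never hold a direct connection; each host acts in turn.
for host in (origin, relay, dest):
    host.run_transfer_cycle()

delivered = dest.mailboxes["alice"][0]
```

The message waits in alice's mailbox on the destination host until she reads it, and its "path" field records every intermediate host that stored it along the way.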
Real-Time Connections

A real-time connection is one that allows a user on one computer to access a remote computer and directly perform actions on that remote computer. These actions may be as simple as transferring files or searching a database on the remote computer or as complex as controlling the operation of the remote computer. Like the other types of interconnections, real-time connections now often involve several levels of intermediate networks. These intermediate networks
are transparent to the user, except that there may be some small time delay induced by the bandwidth, processing, or data communications rate of the intervening physical network media.

Historically, real-time access to remote computers was the first type of networking to be widely developed. It was a practical way around the scarcity of computing power, which was often measured in processor cycles. The idea was to allow researchers to use powerful computers that they did not own themselves. Even with the tremendous decreases in the price of computing power, real-time connections play an important role in providing researchers access to the fastest supercomputers.

Distributed Computing

Distributed computing allows a database, file system, or application to be dispersed across a networked set of computers. Some records in a distributed database or files in a distributed file system may be replicated across several computers to provide greater reliability, faster access, or simultaneous access to a larger number of people. Applications may be distributed in order to take advantage of unused resources on other machines, but they are also distributed so that part of an application can run on a user's workstation while another part runs on a file server or database server and accesses information requested by the user.

NETWORK SERVICES

Networks support many different types of computer-mediated communications. The most basic form of communication, a "one-to-one" message communication between two individuals, is supported through electronic mail and real-time "talk" facilities that allow the parties to simulate a telephone conversation through exchanges of text (advanced systems support voice and video as well). But networking technology greatly expands this basic notion of communication with ease.
For example, networks also support the notion of "one-to-many" communication, a mode that could be characterized as a broadcast mode in which a single source transmits information to many people. Perhaps most important, electronic networks support a mode of communication for which there is no close historical analog—a many-to-many mode of communication in which many people write and many people read simultaneously. A many-to-many communications mode that can be operated for essentially the same individual effort as a one-to-one mode is unprecedented in the history of
communications media.[2] This mode has facilitated a new pattern of social interaction that is difficult to achieve through other communications media.

[2] The cost in resources required to support many-to-many communications is an entirely different matter. In fact, the ease with which many-to-many communications can be achieved (from the standpoint of the individual end user) may well mask the true cost in resources needed in support. By all accounts, these resources are substantial.

Described below are several of the most important forms of computer-mediated communications. Note that these forms do not necessarily map cleanly or uniquely to the one-to-one, one-to-many, or many-to-many modes described above.

Electronic Mail

Electronic mail (or e-mail) is today the single most common form of communication on electronic networks. E-mail has gone mainstream as it is increasingly used in business settings. E-mail is used most often in a one-to-one mode to send private messages from one person to another. However, simply by adding to the address list, it is possible to send (broadcast) the same message to many parties, illustrating the use of e-mail in a one-to-many mode.

The primary advantage of e-mail communication is that it eliminates the need for the message sender and receiver to be active simultaneously; a message sent by one party need not be read by its recipient until it is convenient for the recipient to read it. Postal mail or interdepartmental mail is similar but suffers from the delays and uncertainties inherent in moving physical objects.[3] E-mail is often used instead of the telephone because it solves the problem of "telephone tag." It also provides a written record, and for many people it is free. Still, it is usually more time-consuming to carry on a dialogue using e-mail, and there can be uncertainty about whether the recipient has received a message and acted on requests.

[3] E-mail is not immune to delays and uncertainties either, although overall delays are generally much smaller than those associated with postal mail. The telephone system, however, is highly reliable and operates in real time. Thus, fax transmission rather than e-mail is often the most reliable way to transmit a single copy of a document.

Store-and-Forward Conferencing

Store-and-forward conferencing is a many-to-many mode of communication in which messages are created by members of a group and read by others within that group. Conferencing goes under many
names (e.g., electronic bulletin boards, "newsgroups," newsletters, forums, mailing lists), but the basic idea is essentially public discussion. People post messages, other people reply, and a structured conversation emerges over a period of time.

The boundaries of the relevant group in a conference may or may not be well defined. Some networked conferences strictly limit those who may participate (e.g., a conference running on a corporate network may be limited strictly to employees of the owning company). Other conferences are essentially public: the group consists of anyone who wishes to join the conference. Still other conferences screen potential members through an application process. In all cases, potential conference members need to know of the conference's existence; in a world in which electronic networks are ever more common, this may be the most daunting admission "requirement" of all.

Some conferences are "moderated"; others are not. In an unmoderated conference, all messages posted by all members are visible to all conference members. In a moderated conference, a moderator screens or reviews messages submitted for posting by group members. Messages that the moderator deems irrelevant or inappropriate for public posting are eliminated or sent back to the sender for revision. For example, a moderator may decide that a message contains a personal attack on another group member and send that message back with a request to rephrase the message.

Properly speaking, an electronic bulletin board is a mode of communication in which all messages ever sent are easily accessible to all participants.[4] A user viewing such a board would be able to see new messages as well as old messages, perhaps dating back to a time before he or she had joined the bulletin board for the first time. However, "bulletin board" has also come to mean an automatic mail redistribution site. In this mode, a mail redistribution site is set up to service discussion on a particular topic among a list of people. If the list is open, then new participants can add themselves to the list automatically by sending a request to the redistribution site. The discussion goes on as users send mail on the topic to the redistribution site, and all members of the list receive it.[5]

[4] Another use of the term "bulletin board" refers to small dial-in computer systems typically operated by private "sysops." This type of system is described below in the section "Computer Bulletin Board Systems."

[5] This characteristic distinguishes it from a simple, individually owned mailing list, in which the sender of a message must know the individual addresses of all those with whom one wishes to communicate. Indeed, in some mail redistribution schemes, it is difficult if not impossible for the message sender to know the individual identities (or addresses) of everyone who receives the message.

Old messages—i.e., messages sent
among board participants before a given user joins the list—are usually not accessible, unless someone has explicitly archived those messages. (A more proper name of this type of bulletin board is a LISTSERVER.)

Real-Time Conferencing

Real-time conferencing is a many-to-many mode of communication that is similar to store-and-forward conferencing, except that the messages are sent back and forth in real time. Real-time conferencing builds on the real-time connections described in the previous section. One use of real-time conferencing is the real-time chat, in which members of the conference are logged into the conference simultaneously and messages typed by those members are displayed in real time. Real-time chats are the typed network equivalent of citizens' band (CB) radio. Two other forms of real-time conferencing are games such as multiuser dungeons (MUDs) and multiuser simulation environments (MUSEs). MUDs and MUSEs enable remote participants to join ongoing computer-mediated games (dungeons) or simulations. More sophisticated real-time conferences based on audio and video links are beginning to take place on the Internet today.

File Transfer

File transfers provide one type of remote access to information contained in computer files: a user in New York can move a file to or from a computer located in California. A file may contain text, graphic images, or any other type of digitally encoded information. File transfers may be anonymous or restricted. In the case of anonymous file transfers, any user on any computer connected to the network can obtain a file, knowing only the location on which the file is stored and the name of the file. Restricted file transfers are limited to some well-defined group of individuals who, for example, can obtain certain files from certain computers only if they have been granted access rights to those files.
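The difference between anonymous and restricted file transfers can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the FileArchive class, the file names, and the user names are hypothetical, and a real file transfer service would also authenticate users rather than trust a supplied name.

```python
class FileArchive:
    """Hypothetical file archive supporting anonymous and restricted files."""

    def __init__(self):
        # Maps file name -> (contents, allowed users); None means "anyone".
        self.files = {}

    def publish(self, name, contents, allowed_users=None):
        allowed = None if allowed_users is None else frozenset(allowed_users)
        self.files[name] = (contents, allowed)

    def fetch(self, name, user="anonymous"):
        contents, allowed = self.files[name]
        # Anonymous files need only the file name; restricted files also
        # require that this user has been granted access rights.
        if allowed is not None and user not in allowed:
            raise PermissionError(user + " may not read " + name)
        return contents


archive = FileArchive()
archive.publish("README", "Welcome")                              # anonymous
archive.publish("budget.txt", "draft", allowed_users=["pat"])     # restricted

public_copy = archive.fetch("README")
private_copy = archive.fetch("budget.txt", user="pat")

try:
    archive.fetch("budget.txt")          # anonymous user, restricted file
    denied = False
except PermissionError:
    denied = True
```

In this sketch the only thing an anonymous user must know is the file's name on the archive, which mirrors the description above; everything else about access control is simplified.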
Remote Computer Use

Another way to obtain information remotely is to use the network as essentially a very long cable that connects a user at a terminal or workstation in New York directly to a computer in California—a remote log-in. This user, sitting in New York, has access to capabilities of the California computer that would be available to a
user sitting at a terminal that was directly connected to it.[6] Remote use provides a path for the retrieval of information that is not contained in complete files (e.g., the browsing of entries in a catalog, the conducting of a database search) and thus not obtainable through file transfer.

Remote computer use makes it possible to share computing resources; a user in New York in need of a faster computer may be able to find one in California. At the same time, it is more difficult to control remote, network-enabled access to a computer than to control access when a user must physically appear at a directly connected machine. Passwords and other security measures help to control remote access, but facilitating ease of remote use and denying unauthorized access are goals that are inherently in tension.

Information Search Services

As described above, one major use of electronic networks is to transfer information between sites for the benefit of remote users. But all of the services described above require that the potential user know a lot about the information being sought: on what computer it is located and under what file name it is stored. A potential user who does not know this information can be handicapped in his or her search for information.

A number of tools have been developed for use on the Internet to help users search for and retrieve information (Table A.1). Gopher and the Wide-Area Information Server (WAIS), for example, provide a menu-driven interface for obtaining information from both local and remote systems on the network. Gopher gives the impression of a single large distributed database.
Although these and other information search tools are used for different purposes, they share one theme—they reduce the amount of "low-level" information needed by the user to retrieve information, allowing him or her to specify a request for information in terms that are more meaningful (e.g., by a set of key words or phrases that specify the topic of interest on which information is being sought).

[6] In many cases, the capabilities available to remote users are restricted due to security considerations. In other cases, the capabilities are identical, and remote users have exactly the same capabilities as do local users.
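The idea that search tools let a user ask by key word instead of by host and file name can be illustrated with a small Python sketch. It is not a model of how Gopher, WAIS, or any other specific tool works; the catalog contents, host names, and function names below are hypothetical.

```python
def build_index(catalog):
    """Map each lower-cased word in a description to the files it describes."""
    index = {}
    for location, description in catalog.items():
        for word in description.lower().split():
            index.setdefault(word, set()).add(location)
    return index


def search(index, keywords):
    """Return the (host, file name) pairs whose descriptions match every key word."""
    results = None
    for word in keywords:
        matches = index.get(word.lower(), set())
        results = matches if results is None else results & matches
    return sorted(results or [])


# Hypothetical catalog: (host, file name) -> short description of the file.
catalog = {
    ("archive.example", "pub/net/tcpip.txt"): "TCP/IP protocol overview",
    ("archive.example", "pub/net/history.txt"): "History of packet networks and protocol design",
    ("library.example", "docs/bbs-guide.txt"): "Guide to bulletin board systems",
}

index = build_index(catalog)
protocol_hits = search(index, ["protocol"])
history_hits = search(index, ["protocol", "history"])
```

The user supplies only topic words; the "low-level" details (which computer, which file name) come back as the result of the search rather than being required as its input.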
TABLE A.1 Information-finding Resources for Use on the Internet

Name of Tool | Type of Information Sought | How Search Is Specified
Finger | E-mail address (log-in name) of a user on a given system | Last name of user, system on which he/she may reside
Archie | Usually information in text and binary files | File names to be searched for on various FTP sites
Gopher | Usually information in text and binary files | Menus that contain descriptions of general categories of interest; user browses these menus and is automatically connected to the systems on which menu items of interest are found
Veronica | Files, gopher sites, and menus | Key words to be searched for on various gopher menus, key words associated with various files
Wide-Area Information Server | Information in files | Key words or phrases likely to be found in the files that are desired. (If the file is not text, key words may be appended to auxiliary files that point to the desired nontext file.)
World-Wide Web | Information in files | Hyper-text search; a user browses a document and comes across a reference to locate. (He/she pursues an automatic link from that item in the document to the reference.)
Mosaic[a] | Information in files | Mouse pointing and clicking
Whois | Information about a user without knowing the system on which he/she may reside | Name of user

NOTE: A more complete description of most of these services can be found in Ed Krol, The Whole Internet: User's Guide and Catalog, O'Reilly and Associates, Sebastopol, Calif., 1992.

[a] Mosaic is a convenient and easy-to-use graphical user interface to the Internet that became popular in the latter part of 1993 and has been responsible for driving much of the recent Internet use. Mosaic makes use of many of the resources described above.
A ROUGH TYPOLOGY OF NETWORKS

A very rough typology of the various types of networks is the following: global network, computer bulletin board systems, and commercial services. The typology is rough because there is some overlap between the types, but as a first approximation it will suffice.

The Global Network and the Internet

The term "global network" is used to refer to the worldwide connection of computers and networks that is part of and connected to the Internet. The Internet is a set of interconnected networks, numbering in the tens of thousands, most of which make use of a protocol suite known as TCP/IP for communication. The Internet is perhaps the single most important driver of the global network.

Networks other than the Internet have some significance as well, though the networks described here can connect to the Internet. BITNET is a cooperative network consisting mostly of academic institutions and even today provides primary connectivity to network communications for certain institutions. Usenet is a network that supports only newsgroups. UUCP refers to an associated network that supports only mail, not newsgroups. Fidonet is a network that connects personal computers, primarily those using MS-DOS.[7]

[7] For more discussion of these and other networks, see John Quarterman, The Matrix: Computer Networks and Conferencing Systems Worldwide, Digital Press, Bedford, Mass., 1990.

Computer Bulletin Board Systems

From a technical perspective, a bulletin board system (BBS) is quite simple. In its most basic form, a BBS involves a computer with a modem on a telephone line. BBS users make their connections by telephone, and data flows between the user and the BBS. Connections between user and BBS are transient, lasting only for the duration of the telephone call. A user dials a connection to the BBS and may interactively read "messages" posted by others, and, if authorized, post his or her own messages. The connection from the user to the BBS is typically a terminal-style interface, e.g., a VT100 terminal emulation, rather than e-mail in the sense described above.

BBSs are an example of grass-roots computing—inexpensive, generally informal, open to everyone with a modem and a computer, and often
short-lived. Some BBSs charge for usage, others seek voluntary contributions, and still others are entirely free of charge. BBSs cover a wide range of subject material, e.g., coin collecting, parakeet raising, politics, lifestyle, religion, law enforcement, distribution of government information, and hacker information. The low cost of setting up and running a BBS has allowed many individuals to establish their own. In 1992, Jack Rickard, editor of Boardwatch magazine, estimated the number of publicly accessible bulletin boards in the United States at 45,000, and the number has grown substantially since then.[8] Many BBSs are connected to each other or to other networks; other BBSs stand alone.

Freenets, of which one of the most famous is the Cleveland FreeNet, are community-based networks that are open to the public and provide BBS capabilities. They serve many of the same functions as public libraries and town meetings.

Commercial Services

A number of commercial networks have emerged in the last decade. Among the most prominent are CompuServe, the Prodigy Services Company, Genie, and America OnLine. Although the services provided by commercial networks vary, they generally include access to a variety of information sources (e.g., magazine and newspaper articles, financial information for publicly owned companies), electronic mail among subscribers, public conferences on a variety of subjects ranging from romance to nuclear energy, and a variety of consumer services (e.g., home shopping).

In general, commercial services charge users for the time they are connected to the network and for the specific services they use (though a set of basic services may be available for a fixed fee). Policies exist regarding acceptable use of the network services offered, and they are enforced to varying degrees.[9] Common carriers (e.g., local exchange companies) may begin to offer similar services in the near future.
[8] This total includes only systems run by either individuals or companies that would welcome a call from a stranger. These bulletin boards typically host between 200 and 2,000 callers each, with an average "unique" caller base of about 250 per board (discounting the common caller base among boards), according to Rickard. This indicates a total U.S. caller base of over 11 million.

[9] For example, a commercial information service might offer a public "chat" service to its users (i.e., a conference), subject to the condition that users not engage in the use of profanity. These "terms of service" may stipulate that participants using profane language are subject to disconnection from the conference, but in actual practice action may be taken only when another participant complains. Another service might terminate the connection when the profane language first appears.
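The caller-base figure attributed to Rickard in the bulletin board footnote follows from simple multiplication of the two numbers quoted there. The short Python check below only reproduces that arithmetic; the variable names are illustrative, and the inputs are the estimates given in the text, not independent data.

```python
# Figures quoted in the text (Rickard's 1992 estimates).
boards = 45_000                    # publicly accessible U.S. bulletin boards
unique_callers_per_board = 250     # average "unique" caller base per board

# Total caller base implied by those two estimates.
total_callers = boards * unique_callers_per_board
total_in_millions = total_callers / 1_000_000
```

The product is 11.25 million, which is consistent with the text's statement of "over 11 million."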
THE INTERNET

The Internet is a worldwide network of networks that originated in research and education communities but now also accommodates some commercial traffic. Member networks share a common set of protocols that enable communication between them, but each member network is administratively distinct in much the same way that a given road might pass through a number of separate and distinct states with different law enforcement practices and rules of the road. As a result, the Internet does not have the character of a single, centrally run organization, though certain aspects of its operations are coordinated. The Internet provides all of the functions and services described in the section "Network Services," above.

Organization

The organization of the Internet is best described in terms of progressively higher levels of aggregation:

The local view. Typically, the local view centers on an institution. The institution has a mainframe computer or a local area network, and people use terminals, personal computers, or workstations that are physically wired to the network or the mainframe.

The regional view. At the next level of aggregation, each of those local networks or mainframes is connected into some sort of a regional network. Thus, the regional network consists of many interconnected state or local networks. For example, the California Education and Research Foundation Network (CERFNet) alone connects more than 150 academic and commercial institutions to the Internet. There are a few dozen regional networks, most of which began as academic cooperatives and some of which have become commercial enterprises. In addition, several fully commercial enterprises now provide Internet connections to institutions.

The national view. Regional networks serving research and education users have been connected through a high-speed "backbone" network known as NSFNET, which has been run with financial support from the National Science Foundation since 1988. (NSFNET is expected to be replaced in 1995 by a web of competing commercial and nonprofit backbone networks.) Some networks outside the United States are multinational, e.g., NORDUNET (Scandinavian nations) and DANTE and EBONE (Europe).

The global view. The Internet now reaches internationally. In recent years, various links have been established to networks in other
countries. However, the detailed structure of networks in these other countries may not parallel that in the United States. (For example, networks internal to other nations are often similar to regional networks in the United States.)

But even these views do not quite explain the "true" nature of the interconnections among Internet institutions. For example, multiple connections between regional mid-level networks on an ad hoc basis are common; it is even possible for individuals to become Internet sites on their own. The resulting network is more like a seamless and tangled web of interconnections that conform to the common protocols than a hierarchically structured organism. Among other things, the Internet's lack of hierarchy makes it far more robust and adaptable to new circumstances.

Management

The management of the Internet is decentralized. All Internet sites share communications protocols and an agreed-upon set of naming and addressing conventions. A central body, the Internet Assigned Numbers Authority, allocates Internet addresses. Day-to-day operation is conducted by a set of hierarchically related Network Information Centers. Other than these shared elements, very little is common, though a spirit of friendly cooperation enables this decentralized operation to function. As Susan Estrada, former executive director of the California Education and Research Foundation Network, noted, "everybody or nobody" runs the Internet.

In the past, the decentralized character of the Internet has interfered with enforcement of the NSF acceptable use policy (AUP). The AUP attempted to regulate the nature of traffic carried across NSFNET and was intended to restrict the use of NSFNET to research and education purposes. However, many commercial enterprises are (and have been) connected to the Internet. Since a user generally has no way to control the precise routing of any given traffic, a commercial user may send commercial traffic across NSFNET without knowing it (or even caring about it). Indeed, the NSF AUP has often been honored more in the breach than in actual practice. (It is expected that efforts to apply this policy will change in 1994-1995 as NSF support for the backbone is reduced and the backbone network is privatized.)
Size and Scale

Since it is so easy to add a connection to the Internet, the size of the Internet changes rapidly. The Internet connects more than 70 countries (around 150 countries if e-mail links to the Internet are included), between 2 million and 20 million users, and some 3,000 newsgroups. It also connects many thousands of information archives of various sorts. The Internet connected 46,000 domains in July 1994,[10] and the number of added networks doubles every year.[11] These numbers are growing rapidly. For several years, the traffic across the NSFNET backbone (measured in terms of number of packets of information carried) has grown at an average rate of 15 to 20 percent per month, driven primarily by increases in the number of users.

[10] A "domain" is approximately one administrative entity that is connected to the Internet; typically, it is a single university or a single company. The domain may be subdivided into a number of smaller subdomains.

[11] Vinton Cerf, personal communication, 1994.

A BRIEF HISTORICAL BACKGROUND

The earliest roots of computer networking can be clearly traced back to the time-sharing services of the 1960s. At that time, computers were large and expensive, and consequently rather scarce. Time-sharing, using remote terminal communications, was developed as a way to expand the availability of limited computing resources. General Electric and Tymshare were among the better known, early commercial time-sharing services; many universities and companies maintained their own time-sharing mainframe computers.

As computers became more common, many corporations established their own mainframe-based local area networks, often with proprietary networking technologies. Subsequently, they linked a number of their own computing sites to form the first long-haul or wide area networks. Xerox, Digital Equipment Corporation, International Business Machines (IBM), and American Telephone and Telegraph (AT&T) were among the early pioneers. Also beginning in the late 1960s, networking began to move out of the single-corporation realm with the establishment of several small, experimental, packet-switched networks in Europe to link scientific research facilities.

Today's widespread availability of fast, reliable, global, and nearly ubiquitous networking is directly related to two significant developments:
ARPANET and personal computers. ARPANET was developed in 1968 by the Advanced Research Projects Agency of the U.S. Department of Defense and implemented in 1969 by Bolt, Beranek, and Newman both as a network research project per se and, as it turned out, a very successful method to link military research computers. It demonstrated the viability and system-wide reliability of long-haul packet-switched networks.

The development of the Transmission Control Protocol (TCP) and the Internet Protocol (IP) in the mid to late 1970s enabled the linking of a growing number of wide area and local area networks via ARPANET and thus greatly increased the number of researchers with network access. This linking of a number of networks eventually led to the use of the name "ARPA Internet" in 1977 to stress the internetwork aspects of this growing resource for scientific research. More formally, "Internet" was formed in 1983, when the Defense Communications Agency reorganized military networking and mandated the use of the TCP/IP protocols for all hosts on military networks.

In the mid to late 1980s the National Science Foundation (NSF) established a number of supercomputer centers to make greatly increased computing power available to the broad spectrum of research scientists outside of the military research community. After some initial experience using ARPANET, NSF established the NSFNET backbone for the Internet in 1987 and 1988 and began to link an increasing number of colleges and universities to the network. This greatly increased both the capacity and number of users on the network and reinforced the fact that the original ARPANET had become only one of the many networks on an already large and continually growing Internet.