YOUTH, PORNOGRAPHY, AND THE INTERNET
Appendix B: Glossary and Acronyms
Acceptable use policy (AUP) is a set of guidelines and expectations about how individuals will conduct themselves online.
Adolescent is generally an individual older than 13 but younger than the age of majority or 18, whichever is smaller.
Adult is an individual who has attained the age of majority. Note that "adult" is a term with legal significance and also with social significance. The age of majority varies by state and even by context (e.g., a 19-year-old can vote but not drink alcohol in many states).
Adult verification service is a service provided to businesses that validates the adult status of certain customers. Often, but not always, a credit card is used to provide assurance of one's adult status.
Algorithm is a step-by-step problem-solving procedure, especially an established computational procedure for solving a problem in a finite number of steps.
Authentication is the process of confirming an asserted identity with a specified, or understood, level of confidence. The mechanism can be based on something the user knows, such as a password, something the user possesses, such as a smart card, something intrinsic to the person, such as a fingerprint, or a combination of two or more of these.
AVI is a format for online video. A file named "example.avi" is likely to be a full-motion video that can be played on a computer.
Bandwidth refers to the amount of data that can be transmitted in a fixed amount of time. For digital devices, bandwidth is usually expressed in bits per second (bps) or bytes per second. A standard dial-up connection to the Internet, for example, typically has a bandwidth of around 56 k (or 56,000 bits per second).
Binary refers to a number system that has just two unique digits. Computers operate on the binary numbering system (binary code), which consists of just two unique numbers, 0 and 1 (or, as some like to think of it, "on" and "off").
Bit (or binary digit) is the smallest element of computer storage. It is a single digit in a binary number (0 or 1).
Black list, in Internet filtering technology, is a list of Web sites (or URLs) to which access from a given workstation or user has been specifically forbidden. Contrast with white list.
Boolean logic refers to a system of logic based on operators such as AND, OR, and NOT. In many search engines, search terms are linked with these Boolean operators to formulate more precise queries.
Broadband is a term used commonly to refer to communications or Web access that is faster than dial-up (56 k). Such access would include cable modems and digital subscriber lines.
Browser software is the actual computer program used to view documents on the World Wide Web (e.g., Netscape's Navigator or Microsoft's Internet Explorer).
Bulletin board is a computer system used as an information source and forum for a particular interest group. The bulletin board typically holds postings made by various participants and replies to those postings from other participants.
Byte is the common unit of computer storage. It is made up of eight bits (or binary digits). A byte holds the equivalent of a single character, such as the letter A, a dollar sign, or a decimal point.
Cache refers to a place to store files locally for quicker access. Caches are used to speed up data transfer and may be either temporary or permanent. Memory and disk caches are used in every computer to speed up instruction execution and data retrieval.
CERT is the Computer Emergency Response Team based at Carnegie Mellon University.
Chat is real-time conferencing between two or more users on the Internet. Chatting is usually accomplished by typing on the keyboard, not speaking, and each message is transmitted directly to the recipient.
Chat room is a virtual room where a chat session takes place. Technically, a chat room is really a channel, but the term "room" is used to promote the chat metaphor.
Child is used in the report to denote a broad category of individuals who are younger than adult.
Click (or mouse click) is a way of making a selection online.
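The bandwidth, bit, and byte entries above combine into a simple transfer-time estimate. The following is a minimal sketch rather than a measurement tool: the function name `transfer_seconds` and the sample file size are assumptions made for illustration, and real connections rarely sustain their nominal rate.

```python
BITS_PER_BYTE = 8  # a byte is made up of eight bits


def transfer_seconds(size_bytes: int, bandwidth_bps: float) -> float:
    """Ideal time to move size_bytes over a link rated at bandwidth_bps."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth must be positive")
    return size_bytes * BITS_PER_BYTE / bandwidth_bps


# Hypothetical example: a 1,048,576-byte file over a 56,000 bps dial-up link.
dialup = transfer_seconds(1_048_576, 56_000)
print(f"{dialup:.0f} seconds")  # roughly two and a half minutes
```

Because bandwidth is quoted in bits per second while file sizes are usually quoted in bytes, the factor of eight is the step most easily overlooked.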
Client is an application that runs on a personal computer or workstation and relies on a server to perform some operations. For example, an e-mail client is an application that enables one to send and receive e-mail.
Client-server model is a network architecture in which each computer or process on the network is either a client or a server. Servers are powerful computers or processes dedicated to managing disk drives (file servers), printers (print servers), or network traffic (network servers). Clients are PCs or workstations on which users run applications. Clients rely on servers for resources, such as files, devices, and even processing power.
Client-side refers to any operation that is performed at the client workstation. Contrast with server-side.
Content provider is an organization or individual that creates information, educational, or entertainment content for the Internet. A content provider may or may not provide the software or network infrastructure used to access the material.
Cookie is a message given to a Web browser by a Web server. The browser stores the message in a file (generally called cookie.txt). The message is then sent back to the server each time the browser requests a page from the server. Cookies are often used by Web sites to track users and their preferences.
Cost per acquisition (CPA) refers to an advertising model in which an advertiser pays a Web site operator for displaying an ad based on the number of new subscriptions the ad generates.
Cost per click (CPC) refers to an advertising model in which an advertiser pays a Web site operator a certain amount (e.g., $0.05) each time a user clicks on one of the advertiser's ads on the operator's Web site.
Cost per mille [thousand] (CPM) refers to an advertising model in which an advertiser pays a Web site operator each time the advertiser's ad is displayed (e.g., $3 per 1,000 displays).
Crawler: see spider or Web crawler.
Cybersex refers to online real-time dialog with someone (usually text-based) that interactively describes sexual behavior and actions with one's online partner for erotic purposes and expression.
Cyberspace is a term coined by William Gibson in his 1984 novel Neuromancer that refers to the Internet or to the online or digital world in general.
CyberTipline is the program operated by the National Center for Missing and Exploited Children for reporting child abuse and child pornography.
Database is a collection of information organized in such a way that users (often both people and computer programs) can quickly select desired pieces of data.
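The round trip described in the cookie entry above (server sends a message, browser stores it, browser sends it back) can be sketched with Python's standard `http.cookies` module. The cookie name `visitor_id` and its value are made up for this illustration, and the "browser" here is just a variable standing in for real storage.

```python
from http.cookies import SimpleCookie

# Server side: build the cookie sent along with a response.
# The name "visitor_id" and its value are hypothetical.
outgoing = SimpleCookie()
outgoing["visitor_id"] = "abc123"
set_cookie_header = outgoing.output(header="Set-Cookie:")

# Browser side: keep the name=value pair and return it on later requests.
stored_pair = set_cookie_header.split(":", 1)[1].strip()

# Server side again: parse the returned cookie to recognize the visitor.
incoming = SimpleCookie()
incoming.load(stored_pair)
print(incoming["visitor_id"].value)
```

The server never looks inside the browser's storage; it only sees what the browser chooses to send back with each request.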
Dial-up is the most common method for accessing the Internet. It involves making a connection from a user's computer (by using a modem) over a standard phone line to an Internet service provider. Contrast with "always-on" access methods such as cable modems or DSL lines.
Digital subscriber line (DSL) is a term used to denote a class of technologies that use copper phone lines to establish high-speed Internet connections between telephone switching stations and homes or businesses (a so-called "last-mile" technology).
Domain name service (DNS) is an Internet service that translates domain names into IP addresses. For example, the domain name "www.nationalacademies.org" might translate to "22.214.171.124."
Download refers to the act of copying data (usually an entire file) from a main source to a peripheral device. The term is often used to describe the process of copying a file from an online service to one's own computer.
E-mail is short for "electronic mail," the transmission of messages over networks.
Encryption is any procedure used in cryptography to convert plain text into cyphertext to prevent anyone but the intended recipient from reading the data.
File attachment is a method by which users of e-mail can attach files to messages (e.g., one might send a relative a digital picture of a newborn in an e-mail announcing his or her birth).
Filter (or filtering) is a type of technology that allows Internet material or activities that are deemed inappropriate to be blocked, so that the individual using that filtered computer cannot gain access to that material or participate in those activities.
Gnutella is a file-sharing system on the Internet that lets users search for software and documents on the GnutellaNet, a loose federation of users and organizations that make a wide variety of information available to the world at large. Gnutella is also an example of peer-to-peer networking.
Graphics file is a file that holds an image. JPEG and GIF are two popular formats for image files.
Hard disk is a computer's primary storage medium, and it is usually a fixed component within the computer itself. Contrast with "floppy" disks, which are temporary, disposable, and removable.
Harvester is an automated program that is designed to collect e-mail addresses by scanning Web sites, bulletin boards, and chat rooms (among other things).
History file refers to the list most Web browsers maintain of downloaded pages in a session so that users can quickly review everything that has been retrieved. However, history files can be cleared or altered easily.
Hypertext markup language (HTML) is the authoring language used to create documents on the Web. Web pages are built with HTML codes (usually called "tags") embedded in the text; these tags define page layout, fonts, and graphic elements, as well as hypertext links to other documents on the Web.
Hypertext transfer protocol (HTTP) is a communications protocol used to connect clients and servers on the Web. HTTP's primary function is to establish a connection with a Web server and transmit HTML pages to the client browser.
ICQ ("I Seek You") is a conferencing program for the Internet. Much like AOL's Instant Messenger service, ICQ provides interactive chat and file transfer, and can alert users when someone on their predefined list has come online.
ICRA is the Internet Content Rating Association (www.icra.org).
Image recognition/analysis (or "recognition") is the process through which a computer can identify an image (e.g., this graphics file contains an image of a naked woman).
Information retrieval refers to the processes, methods, and procedures used to selectively recall recorded data from a database.
Instant message (IM) is a two-way, real-time, private dialog between two users. A user initiating an IM sends an invitation to talk to another (specific) user who is online at the same time. Instant messaging is very popular today, because unlike participation in chat rooms, one tends to talk to people whom one already knows. Note also that IMs are often used in conjunction with chat rooms: a user in a chat room can send an IM to someone else in the chat room (because he or she sees the other party's screen name or "handle"), thus establishing a private communication.
Intellectual property refers to the ownership of ideas and control over the tangible or virtual representation of those ideas.
Internet is a decentralized global communications network connecting millions of individual users and machines.
Internet protocol (IP) is a part of the TCP/IP suite of protocols that allows the various machines that make up the Internet to communicate with each other.
Internet relay chat (IRC) is another conferencing system used on the Internet. However, unlike in instant messaging, users do not communicate directly with each other; rather, the server broadcasts all messages to all current users of a particular channel.
Internet service provider (ISP) is an organization or company that provides access to the Internet. Examples of national-level ISPs include America Online (AOL), Earthlink, and Microsoft Network (MSN).
Internet telephony refers to two-way transmission of audio over the Internet. Internet telephony allows users to employ the Internet as the transmission medium for telephone calls. It is also commonly referred to as voice-over-IP (VoIP).
IP address is an identifier for a computer or device on a TCP/IP network. Networks using the TCP/IP protocol route messages based on the IP address of the destination. The format of an IP address is a 32-bit numeric address written as four numbers separated by periods. Each number can be zero to 255 (e.g., "126.96.36.199" could be an IP address).
Keystroke log is a method for recording each keystroke made by a user on a given computer.
Link (or hyperlink) is a reference or pointer to another document. Clicking on (or selecting) a link on a Web page generally takes one to the document being referenced (e.g., clicking on a link to the NRC's home page will open the NRC's home page document in the user's browser).
Local area network (LAN) is a computer network that spans a relatively small area. Most LANs are confined to a single building or group of buildings. However, one LAN can be connected to other LANs over any distance via telephone lines and radio waves.
Log is a file that lists actions that have occurred. For example, Web servers maintain log files listing every request made to the server. See also keystroke log.
Login refers to the way that computers recognize users. Logins are also commonly referred to as user names. Generally, the combination of a correct login (or user name) and password is required to gain access to networked computers.
Megabyte (MB) is the term used to denote 1 million bytes (or, more precisely, 1,048,576 bytes).
Meta tags are elements within HTML code that allow page creators to describe the content of Web pages. Meta tags are often read and indexed by search engines.
Metadata is a component of data which describes the data. It describes the content, quality, condition, or other characteristics of data. For instance, in the HEAD section of most HTML documents, many Web page creators encode information about the title, author, date of creation or update, and keywords relating to or descriptions of the document's content.
Minor is a term generally used in a legal context to denote individuals who are younger than adults.
Modem is a device that enables a computer to transmit digital data over analog telephone lines.
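The IP address entry above describes a 32-bit number written as four values from 0 to 255. The sketch below shows how the four numbers pack into those 32 bits; the helper name `dotted_to_int` is an assumption for illustration, and the result is cross-checked against Python's standard `ipaddress` module.

```python
import ipaddress


def dotted_to_int(dotted: str) -> int:
    """Pack four 0-255 numbers into a single 32-bit integer."""
    parts = [int(p) for p in dotted.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a dotted-quad address: {dotted!r}")
    value = 0
    for part in parts:
        value = (value << 8) | part  # each number fills 8 of the 32 bits
    return value


example = "126.96.36.199"  # the sample address used in the glossary entry
packed = dotted_to_int(example)
# Cross-check against the standard library's own parser.
assert packed == int(ipaddress.IPv4Address(example))
print(f"{packed:032b}")  # the same address as 32 binary digits
```

Each of the four numbers occupies exactly one byte, which is why no value can exceed 255.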
Moderated newsgroup is a mailing list in which all postings are "moderated" by a specific individual with the authority and power to reject individual postings that he or she deems inappropriate.
Mousetrapping is a technique that forces a user to remain on a specific Web site by not allowing the user to leave the site. Whenever the user tries to leave the site by closing the browser window or going to a new URL, the site that is mousetrapping will automatically open a new browser window with its URL or not allow the browser to go to the new URL.
MPEG is a term that refers to the family of digital video compression standards and file formats developed by the Moving Picture Experts Group (hence, MPEG). The term is also often used to refer to the files of digital video and audio data available on the Internet.
Multimedia refers to applications that combine text, graphics, full-motion video, and/or sound into an integrated package.
Napster was initially an application that gave individuals access to one another's music (MP3) files by creating a unique file-sharing system over the Internet.
Net is a short term for Internet.
Newsgroup is an online discussion group. On the Internet, there are literally thousands of newsgroups covering every conceivable interest (e.g., see ~).
Offline refers to the time that a user is not connected to the Internet. Contrast with online.
Online refers to the time that a user is connected to the Internet. Contrast with offline.
Overblocking refers to a situation where Internet filtering software blocks access to resources that authorities did not intend to block. Contrast with underblocking.
Password is a secret series of characters that enables a user to access a file, computer, or program. On multiuser systems, each user must enter his or her correct user-name/password combination before the computer will respond to commands.
Peer-to-peer network is a communications network that allows all computers in the network to act as servers and share their files with all other users on the network. Gnutella is one example of peer-to-peer networking on the Internet. Contrast with client-server.
Pixel is the smallest discrete element of an image or picture on a computer monitor (usually a single-colored dot).
Platform for Internet content selection (PICS) is a system for rating the content of Web sites that has been endorsed by the World Wide Web Consortium.
Plug-in is an auxiliary program that works with Internet browser software to enhance its capability (e.g., RealNetworks' RealPlayer or Microsoft's Media Player).
Portal is a Web site or service that offers a broad array of resources and services, such as e-mail, search engines, subject directories, and forums. Yahoo! is an example of a portal.
Precision is a measure of the effectiveness of information retrieval and is often expressed as the ratio of relevant documents to the total number of documents retrieved in response to a specific search. For example, using a Web search engine, if a search retrieves 100 documents, but only 30 of them are truly relevant to the search, the precision would be 30 percent. Contrast with recall.
Probabilistic algorithm is an algorithm that works for all practical purposes but has a theoretical chance of being wrong.
Proxy server is a server that sits between a client application, such as a Web browser, and a real server. It intercepts all requests to the real server to see if it can fulfill the requests itself. If not, it forwards the request to the real server. Proxy servers can also be used to filter requests to prevent users from accessing specific Web sites.
Push transfer refers to a form of data delivery in which data is automatically delivered to the user's computer without the user's having to make a request for the data.
Real-time audio/video refers to communication of either sound or images over the Internet that occurs without delay in real time, much like a telephone conversation.
Recall is a measure of the effectiveness of document retrieval expressed as the ratio of the number of relevant entries or documents retrieved in response to a specific search to the total number of relevant documents in a given database (or on the Web). However, determining a search's recall can be problematic because it is often very difficult to determine the total number of relevant entries in all but very small databases. Contrast with precision.
Remote viewing is the capability of system administrators (whether they be information technology "helpdesk" personnel or teachers in a classroom) to view what is being displayed on a given workstation or computer from their own location.
Scanner is a device that can copy text or illustrations printed on paper and translate that information into a form a computer can use.
Screen name is an alias (or short nickname) chosen by a computer user to employ when accessing his or her online service or network account. See also login.
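The precision and recall entries above can be made concrete with a small calculation. This is a minimal sketch using the usual information-retrieval convention, in which recall is the fraction of all relevant documents that a search actually retrieves. The document numbers and the figure of 60 relevant documents in the database are invented for illustration; only the 100 retrieved and 30 relevant figures come from the glossary's own example.

```python
def precision(retrieved: set, relevant: set) -> float:
    """Share of retrieved documents that are relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def recall(retrieved: set, relevant: set) -> float:
    """Share of all relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


# Hypothetical collection: 100 documents retrieved, 30 of them relevant,
# out of 60 relevant documents in the whole database.
retrieved = set(range(100))      # documents 0..99
relevant = set(range(70, 130))   # documents 70..129; 70..99 overlap
print(precision(retrieved, relevant))  # 0.3, matching the glossary example
print(recall(retrieved, relevant))     # 0.5
```

The two functions differ only in the denominator, which is why recall is the harder one to measure: the set of all relevant documents is rarely known outside small databases.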
Search engine is a program that searches documents (or indexes of documents) for specified words or phrases and returns a list of the documents where those items were found.
Server is a computer (as well as the software that is running on that computer) that delivers (or serves up) Web pages.
Server-side refers to any operation that is performed at the server. Contrast with client-side.
Smart card is a small physical hardware device (typically the size of a credit card) containing read-only non-volatile memory and a microprocessor that can be inserted into a card reader attached to a computer. In most scenarios, the individual user carries the card and inserts it into an Internet access point that requires such a device. The memory on-board the device can store information about the user, including his or her age, preferences for material to be blocked, and so on. Software installed on the computer, and on Web sites visited, would check the smart card for dates of birth when necessary, and if the user were underage for certain types of material, would refuse to grant access to that material.
Spam generally refers to unsolicited e-mail, particularly unsolicited e-mail of a commercial nature.
Spider is a computer program that automatically retrieves Web documents. Spiders are often used to feed pages to search engines for indexing. Another term for these programs is Web crawler.
Streaming media refers to a technique for transferring data in such a way that it can be processed as a steady and continuous stream (as opposed to the user's needing to download the entire file before being able to view or listen to it).
Surfing (or Web surfing) is a metaphor for browsing the contents of the Web.
TCP/IP (or Transmission Control Protocol/Internet Protocol) is the suite of communications protocols used to connect machines on the Internet. TCP/IP allows different hosts on the Internet to establish a connection with each other and exchange streams of data.
Teaser refers to Web pages or portions of Web sites that are intended to entice users to spend more time at a given Web site or become paying customers.
Thumbnail is a miniature display of a page or image. Thumbnails enable users to see the layout of many items on the screen at once.
Top-level domains are the major subdivisions within the Internet's domain name service (DNS). Examples of top-level domains include, among others, .com, .gov, and .edu.
Traffic refers to the load on a given Web site or resource. A high-traffic Web site, for instance, receives many visitors or requests for data.
Traffic forwarding is the practice whereby one Web site forwards traffic to another Web site and may receive a fee for doing so.
Underblocking refers to a situation whereby Internet filtering software does not block access to resources that authorities intended to block. Contrast with overblocking.
Uniform resource locator (URL) is the address of documents and other resources on the World Wide Web. The first part of a URL indicates what protocol to use to access the document (e.g., "http" or "ftp"), while the second part specifies the domain name where the resource is located (e.g., www.example.com) as well as the directory and the name of the requested document.
Usenet is a worldwide bulletin board system that can be accessed through the Internet or through many online services. It contains more than 14,000 forums, called newsgroups, that cover almost every imaginable interest group.
V-chip is an electronic circuit or mechanism in a television that parents can use to block programs they consider inappropriate for their children. V-chips can be configured to block all programs of a given rating.
Virtual hosting refers to the ability of Internet service providers or Web site operators to "host" Web sites or other services for different entities on one computer while giving the appearance that they exist on separate servers. For instance, with virtual hosting, one might have Web sites from two separate, distinct organizations residing on and being served from one particular server (with one particular IP address).
Web is a shortened form of World Wide Web (WWW).
Web crawler is a computer program that automatically retrieves Web documents. Web crawlers are often used to feed pages to search engines for indexing. Another term for these programs is spider.
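The parts of a URL named in the uniform resource locator entry above (protocol, domain name, directory, and document name) can be pulled apart with Python's standard `urllib.parse` module. The address below is a made-up example on the reserved example.com domain, and the directory/document split is a simple convention chosen for this sketch.

```python
from urllib.parse import urlsplit

# Hypothetical address on the reserved example.com domain.
url = "http://www.example.com/reports/2002/index.html"
parts = urlsplit(url)

print(parts.scheme)    # protocol: http
print(parts.hostname)  # domain name: www.example.com

# Split the path at its last slash into directory and document name.
directory, _, document = parts.path.rpartition("/")
print(directory)       # directory: /reports/2002
print(document)        # document name: index.html
```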
Web page hosting refers to the ability of Internet service providers, companies, or other organizations to act as a server of Web pages.
Webcam is a video camera that is used to capture periodic images or continuous frames to a Web site for display.
WebTV is a service that makes a connection to the Internet via a user's telephone service and then converts the downloaded Web pages to a format that can be displayed on a television.
White list, in Internet filtering technology, is a list of Web sites (or URLs) to which access from a given workstation or user has been specifically approved. Contrast with black list.
Workstation refers to a computer connected to a network (often the Internet).
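The black list and white list entries describe two opposite filtering policies. The sketch below is one simple way to express them, not a description of any particular filtering product: the function name `is_allowed`, its parameters, and all host names are hypothetical, and matching is done on the host name only.

```python
from urllib.parse import urlsplit


def is_allowed(url, black_list=frozenset(), white_list=None):
    """Return True if the filter would let this URL through.

    With a white list, only listed hosts pass; otherwise any host
    not on the black list passes.
    """
    host = (urlsplit(url).hostname or "").lower()
    if white_list is not None:
        return host in white_list
    return host not in black_list


# Hypothetical host names, for illustration only.
blocked = {"blocked.example.com"}
approved = {"library.example.org"}

print(is_allowed("http://blocked.example.com/page", black_list=blocked))  # False
print(is_allowed("http://news.example.net/", black_list=blocked))         # True
print(is_allowed("http://news.example.net/", white_list=approved))        # False
```

The last two calls show the trade-off in miniature: the same unlisted site passes a black list, which risks underblocking, but fails a white list, which risks overblocking.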
World Wide Web (WWW) refers to the set of all the information resources that can be accessed via HTTP.
World Wide Web Consortium (W3C) is one of the main standards bodies for the World Wide Web. The W3C works with the global community to help establish international standards for client and server protocols that enable communications on the Internet.