OCR for page 1181
A Proposed Information Handling System for a Large Research Organization
W. K. LOWRY and J. C. ALBRECHT
Bell Telephone Laboratories, Murray Hill, New Jersey.
The governing factor in the useful application of information is that determined by need rather than the totality of information available. Information is available in amounts far greater than required by any single research organization and to the degree that it is extraneous to needs it constitutes a barrier to research progress if attempts are made to handle it in an information system. To determine what information is relevant, to locate, acquire, and announce it for ultimate use constitute the preliminaries for effective storage and retrieval. They are no less important than techniques of information control such as cataloging, indexing, and coding, and when properly oriented to needs they assist greatly in the simplification of control requirements by reducing the amount of information to be controlled. The proper control of information is a prerequisite to its effective storage and use and should also be governed by the needs of the user rather than the total possibilities for control offered by a body of information. There have been innumerable attempts to solve the difficulties of information storage and retrieval with insufficient consideration having been given to the reduction of problems in the preliminary phases of the information handling cycle. The general failure of these attempts demonstrates the necessity for delineating true information requirements, reducing the amount of irrelevant material in an information system and designing systems for efficient operation in the searching procedure. By proper attention to each phase of the information handling cycle it is possible to make good use of presently available electronic equipments to aid in storage and retrieval. Further aid is possible by employing oral techniques to speed up the flow of technical communication within a research organization.
This paper will suggest a system of information
OCR for page 1182
handling for a large research group but it is believed that most of the approaches and procedures are equally applicable to small organizations. Numerous problem areas of information handling contribute to dissatisfaction with most systems presently in use. This is not to say that all current systems are unsatisfactory. There appear to be a number, particularly for scientific disciplines enjoying well developed taxonomies or in limited fields, which work quite well. On the other hand, satisfactory systems for handling large amounts of information to meet the diverse needs of many research organizations still need to be developed. In such situations problems begin when information is first put on paper by a research worker and continue to accumulate through subsequent stages of the communication cycle. The net effect is a serious and complex situation with respect to the latter stages concerned with storage, retrieval and use of information. It is the reduction of this entropy in information systems which must be achieved and the following proposals are governed largely by this objective.
Establishing the need for information
In all research organizations information is available from two general sources, that produced by the group itself and that available from outside sources. It is assumed that internally produced information is more reflective of an organization’s research activities and interests than that produced externally, and it is recognized that the need for externally produced information is at least as great as that produced internally. From a practical standpoint it would seem desirable as a first step in determining the information requirements of a particular research group to establish these needs in relation to (1) the research projects being pursued, (2) the scientific and technical interests of the group members, and (3) the information produced by the group itself.
All these relationships may be established readily since they are subject only to internal considerations and are not affected by outside factors beyond the group’s control. Suggested techniques for exploitation of these areas will be described in the following paragraphs. The first objective, it should be noted, is to establish a dictionary or list of information requirements. In most research organizations it is necessary to prepare a description or justification for each research or development project before funds, personnel, and facilities are authorized. These statements in themselves are frequently sufficiently informative to permit clear definition of areas where information will be needed. They will also usually carry the signatures of those responsible for project work; so questions concerning information requirements can be readily answered. Analysis of work authorizations has the additional advantage
OCR for page 1183
of indicating information needs before or at the time project work starts. The search for information, its compilation, and presentation at an early date should reduce both the time and money requirements for research projects. Successful research administration has recognized the long-range advantages resulting from a policy which permits the individual scientist to explore areas of knowledge which are not task-oriented. The special scientific interests held by members of a research organization sometimes appear to be only remotely related to current research projects but encouragement of such interests has frequently resulted in comprehension of relationships in knowledge which open new frontiers in science having practical applications. The larger the research staff and the more diversified the program, the more likely it is that the aggregate of individual interests will approximate those inherent in the research projects of a laboratory or institute. There are several methods of determining individual interests, one of which is by use of a well-designed questionnaire to be filled in by scientists and engineers. This approach would provide a second possibility for establishing information needs and the preparation of a list of information requirements. In another part of this paper where oral communication techniques will be discussed, reference will be made to the development of a catalog of skills as an aid to internal communication within a group. To determine individual skills, a questionnaire approach is also suggested, so it seems advisable to combine the inquiry on interests and skills in one questionnaire. A third approach to establishing information requirements is that based on areas of interest as reflected in the publications produced by a research organization. These constitute the written record of scientific activity and represent the distillation of long hours of intellectual effort directed toward research objectives.
As such, they provide a firm basis for the determination of information needs pertinent to these objectives. Analysis of internally produced publications for subject content will assist in the preparation of an index of information requirements. The methods suggested imply participation by the research staff. An accurate estimate of requirements is unlikely if it is made by any other means and the time required of the research staff for this effort is insignificant compared to that required for laboriously searching through irrelevant information unnecessarily controlled by generally unsatisfactory techniques. The principal contribution of the scientist or engineer in this respect is that of indicating areas of pertinent interest in a language generally acceptable to his professional community. The bibliographical aspects necessary to the development of the information system and the general administration of information services should be the responsibility of personnel having experience in these areas.
OCR for page 1184
Preparation of an index of information requirements
The foregoing indicates three sources for determining the information requirements of a research organization. To specify these requirements in a useful form, it is suggested that they be expressed as a list of scientific and technical terms initiated by members of the research staff to insure a terminology which is familiar to them. This suggests the need to prepare brief instructions for assigning terms to documents which are most likely to meet the direct or related interests of others. The specific rules should be developed to meet the needs of the individual research organization and, when formulated, they should be published, distributed, and considered standard for indexing purposes throughout the organization. In accordance with the Instructions for Indexing, current work authorizations could then be reviewed by personnel responsible for each research project and a list of terms reflecting the information requirements for a project compiled. The terms assigned by each project group would then be forwarded to an Index Review Panel. To determine the scientific interests of individual scientists and engineers, a questionnaire should be designed which would explain the objectives of the general indexing program and the method of listing interests in accordance with the indexing instructions. The questionnaire would also provide a section for listing the specific research project interests of each individual if these differ from personal scientific interests. A third section would permit the listing of technical skills which the individual feels competent to discuss with other members of the staff who might be in need of his advice. When completed, these questionnaires would be sent to the Index Review Panel. To obtain a list of terms pertinent to internal publications, authors would review their papers and submit appropriate terms in accordance with the general Instructions for Indexing.
The extent to which previously published reports and papers should be indexed would be determined by the individual research organization. It may be that analysis of publications issued during the previous year would be sufficient to establish the initial list of terms or it could extend backwards for five to ten years. Terms assigned to internal publications would be forwarded to the Index Review Panel. After receiving suggested terms from the research staff, the Index Review Panel would review them for form and to eliminate inconsistency, synonymous terms, etc., and would prepare a basic list and a list of cross references. The cross-reference list would include all terms submitted but not included in the basic list. The Index Review Panel should be comprised of scientific and technical personnel having backgrounds representative of the major areas of research
OCR for page 1185
interest and should be headed by a person having the necessary training and experience in the preparation of indexes. It is assumed that the special expertise required for the development of an effective index would not be available from research personnel and that their technical contribution would have to be strongly supported by a competent indexing specialist. After compilation of the master List of Index Terms, it would be published and distributed to all scientific and technical personnel for use both as an aid in subsequent indexing and as a guide for terms to be used in requesting information at a later date. New terms would be added to the list only after review by the Index Review Panel and revisions of the published list would be issued as determined by the changing interests in the research program. Since the List of Index Terms would reflect the scientific interests of the research staff, it would also serve as an aid in selecting publications prepared externally which would be of value to internal information needs. As such, it becomes a useful guide for the acquisition of information pertinent but not extraneous to those needs.
Establishing information controls at the time of publication
As was noted earlier, information handling problems begin with the preparation of new information by a research worker. Certain controls can be exercised at the time of publication which will facilitate subsequent indexing, abstracting and announcement, and for this purpose Instructions for Abstracting and an Information Control Sheet are useful. Rules for preparing author abstracts should be developed and distributed to all technical personnel in the research organization. Advice concerning the need to be informative and the necessity to avoid the use of terminology, code words and abbreviations that are not well known outside a particular area will do much to aid in communicating information within an organization.
A control sheet utilizes the advantages of standardized presentation of information for indexing and abstracting and can serve as final copy for the preparation of abstract bulletins. The space provided on the control sheet for the assignment of index terms will facilitate the prompt preparation of subject indexes for abstract bulletins. The information control sheets would be forwarded to the Index Review Panel for an additional purpose. Usually an author is well qualified to suggest terms reflecting the technical content of his paper, but he may not recognize the usefulness it holds for other scientific or technical fields. These relationships are important in view of the interplay of interests between various disciplines. The review panel, representing a cross section of total interests, can assist materially in recognizing them and assigning additional useful terms.
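The panel's handling of terms, as described above (a basic list, a cross-reference list holding every submitted term not admitted to the basic list, and review of any new term before it is added), can be illustrated with a short sketch. This is only an assumed software analogue, not a procedure given in the paper: the function names `normalise`, `build_term_lists` and `resolve`, the sample terms, and the idea of recording panel decisions as a simple mapping are all hypothetical.

```python
def normalise(term):
    """Put a term in a uniform form: lower case, single spaces."""
    return " ".join(term.lower().split())


def build_term_lists(submitted, preferred_for):
    """Split submitted terms into a basic list and cross references.

    submitted     -- terms suggested by the research staff
    preferred_for -- hypothetical record of panel decisions, mapping a
                     rejected (already normalised) term to the basic
                     term that replaces it
    """
    basic = set()
    cross_refs = {}
    for raw in submitted:
        term = normalise(raw)
        if term in preferred_for:
            # Rejected term: keep it only as a cross reference.
            cross_refs[term] = preferred_for[term]
            basic.add(preferred_for[term])
        else:
            basic.add(term)
    return sorted(basic), cross_refs


def resolve(term, basic, cross_refs):
    """Return the basic term to use, or None if the panel must review it."""
    term = normalise(term)
    if term in basic:
        return term
    return cross_refs.get(term)


# Hypothetical sample data for illustration only.
submitted = ["Oscillator", "crystal  controlled", "quartz controlled", "transistor"]
panel_decisions = {"quartz controlled": "crystal controlled"}

basic, cross_refs = build_term_lists(submitted, panel_decisions)
print(basic)       # ['crystal controlled', 'oscillator', 'transistor']
print(cross_refs)  # {'quartz controlled': 'crystal controlled'}
```

A term suggested on a control sheet that `resolve` cannot place (it returns `None`) corresponds to a new term awaiting review by the panel.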
OCR for page 1186
Correcting superfluity and paucity in information announcement
The large amount of information published has resulted in a common complaint that scientists are unable to determine what is pertinent to their interests. This is supported by the constantly increasing number of scientific and technical journals, the delays and duplication associated with abstracting services, and the time required to sift through the chaff of newly published materials to get a few useful kernels of information. This situation results in a rate of information loss which probably approaches that of information use and points up the need for improved techniques of announcement to individual scientists. This is not only true for scientific publication generally, but frequently applies to publications prepared within large research organizations. Previous suggestions on relating information needs to research objectives also apply in respect to announcement of information although the direction of information flow is reversed. In this phase of information handling, the needs of the research group and its individual members have been established as noted earlier, and the present concern is that of further reducing the amount of technical man-hours required for literature review by developing a system of announcement specifically tailored to individual requirements in addition to the announcement system used to cover the needs of the research group as a whole. To meet overall needs, an announcement bulletin issued regularly and containing abstracts of all internally produced publications can be prepared readily from copy supplied by authors on the Information Control Sheet previously mentioned. The standard format of the control sheet permits ready preparation of the bulletin by using xerography and offset processes.
Abstracts or titles of externally produced journal articles and reports selected and indexed by the Index Review Panel could be included in the same bulletin after copy has been prepared for these items. By assigning serial numbers in sequence to each item, electronic searching and location of desired items is possible. Serial numbering does not preclude the possibility of a classified arrangement within the bulletin, if this is desired, since numbers may be assigned after page format is completed. An index may be prepared for each issue and cumulatively if desired by use of the Flexowriter programatic automatic writing machine with its tape-to-card punch feature or by any of several other techniques. In addition to the general announcement bulletin, it is also possible to prepare announcement lists pertaining to the specific interests of individual scientists and engineers as indicated by their earlier statements of information requirements. By utilizing the electronic techniques to be described later, the code
OCR for page 1187
numbers representative of fields of interest for each member of the staff can be matched in search with those relating to the indexed documents and pertinent document numbers obtained. These special interest reference lists could be attached to the abstract bulletin before mailing and would serve to pin-point items of particular significance to the recipient. The time required to print out such lists will depend upon the number of matches obtained and the output device used but the time for search can be estimated fairly accurately. Using punched cards which specify terms of interest for each of 2000 individuals and assuming that each month 2000 documents had 30 index terms assigned as an average, there would be on the order of 30 feet of magnetic tape required to store the term and administrative codes for the documents. With an automatic card feed and a continuous tape, the time required for search and solution of the problems of 2000 individuals would be of the order of one hour. This assumes the use of buffer output tapes and off-the-line print out of the announcement lists which would take from two to seven hours, assuming a relatively inexpensive line printer. The automatic preparation of 2000 monthly lists by this technique would aid in meeting the requirement for prompt announcement of information of special interest to individual scientists. The details of this procedure will be explained later.
Oral communication in the information process
One of the consequences resulting from the large amount of information which is inaccessible is the increasing reliance which scientists place upon other scientists for their information needs. The increase in oral communication to meet information requirements is undoubtedly influenced by the failure of conventional library techniques to be of sufficient assistance to the scientific community in finding and making available information when it is needed.
The procedure of obtaining knowledge from experts in various fields is in itself becoming more difficult and in a large organization presents a number of problems similar to those encountered in exploiting written information. Personal contact becomes difficult when a research organization has its facilities dispersed, its program diversified, and its staff enlarged to the point where it is difficult to determine just who is the expert to consult. Another factor contributing to the increasing failure of the personal contact technique is the relatively high turnover of personnel which most large organizations experience. The ability to communicate personally is particularly difficult for newly employed scientists and engineers who are unfamiliar with the talents which their new organization may possess. Even with these barriers to personal communication, it remains one of the most used methods to obtain information and in
OCR for page 1188
view of this, efforts should be made to improve the possibility of personal contact in large groups. As noted earlier, the questionnaire to be filled in by all members of the research staff provided a section where each person would indicate the special areas in which he felt competent to give advice. The preparation of a Register of Skills, based on information from the questionnaire, would be a relatively simple task and would do much to facilitate personal communication. This register could be maintained in a central place, its availability made known generally, and when an individual wished to discuss a particular area of interest, the names and telephone numbers of qualified co-workers could be provided. The method of establishing the register, maintaining it, and providing service on it could be completely manual or it could be semi-automatic by using edge-punched cards or any of several other suitable approaches. It is entirely possible that certain staff members may indicate proficiency in a particular area and that their estimate of personal ability could be overemphasized; therefore, it would perhaps be useful to have their immediate superiors review these estimates. There are, of course, other approaches to establishing the Register of Skills, such as review of statements of experience and training which are usually available from personnel offices, or an approach could be made directly to the supervisory level for an opinion of the skills available from persons within their department. It is believed, however, that the suggested method has advantages not held by other methods. By way of supporting the foregoing suggestion, a survey was made in at least one large research organization to determine what sources of information were most used.
The results of this survey indicated that in the case of about 80% of the individuals surveyed, their answer was that “they asked somebody.” In about 50% of the cases (the second most used source), internal publications produced by members of the organization were found most useful. Further support for improving oral communication is evident in a recent editorial by Edmund C. Berkeley appearing in Computers and Automation, in which he states:
…there ought to be much better techniques of communication at computer meetings: (1) The tag which you wear at a meeting should show: your name, your organization, your main interests. This would aid communication by telling whether or not this man is interesting to me; we should talk together.
Electronic storage, search, and retrieval of information
Intelligent selection of material to meet user requirements as outlined in the preceding suggestions will substantially reduce the amount of information which must be stored for future reference. However, in a large organization the
OCR for page 1189
amount of information stored will soon exceed the capacity of manual or commonly accepted retrieval methods. A device having the intelligence of the human mind and the speed of a modern computer would be an ideal solution to the problem of data retrieval since the mind is extremely flexible and is capable of subtle associations and correlations necessary to efficient retrieval. Unfortunately, however, such intelligence is difficult, if not impossible, to achieve fully in a machine. Where material to be retrieved can be clearly specified as in the case of a parts man searching for a particular component, simple machine methods are acceptable since the problem can be clearly outlined in a search request. A more difficult problem, however, lies in the retrieval of information relating to desired subject matter, such as is the case in the compilation of a bibliography or in the situation where a scientist is interested in information pertinent to specific subject matter. In these situations, the problem is extremely complex since the subject matter of interest may be discussed in detail in certain of the information indexed or merely just in passing in other material, depending upon the scope and setting of the information source. For example, a development engineer interested in a specific electric circuit or mechanical movement faces the problem of finding not only all information directed specifically to the area of his interest, but also all information wherein his particular area of interest is treated in context. In most instances, the area of interest of a person seeking retrieval of information is not fully matched by the scope and setting of information stored and indexed for future reference. It is in this situation that the culling process achieved through the associations and correlations of the mind may be employed to advantage to determine what stored information may be of interest, even though it does not fully define the user’s area of interest.
A simple technique which approaches a function of the mind and which may be readily embodied in machine searching of information files comprises the establishment of two classes of indexing terms, namely general and essential. From this classification, it is not to be inferred that an indexing term is defined in the standard list of terms as being general or essential, but rather is so defined at the time a particular search is undertaken. Indexing term, as employed herein, relates not only to single keywords, but also to indexing phrases, names, etc. This classification of terms in data retrieval is not limited to a specific vocabulary or hierarchy of definitions, but can be employed to advantage regardless of the indexing approach. As previously discussed, all material stored for future reference is indexed in accordance with a standard list of terms. In requesting a search, the subject matter about which information is desired is defined in accordance with terms drawn from the standard indexing list. The person requesting the search or possibly a skilled operator determines which of
OCR for page 1190
these terms must apply to stored information for it to be of interest, and therefore are essential to the search, and which other terms should, but need not apply, for the document to be of interest, and are therefore termed general terms. A direct approach to the problem of retrieving information in accordance with a request indicating essential and general terms is to record the identity of all documents to which the essential terms and, in addition, a threshold number of general terms apply. In accordance with such a system, any or none of the search terms may be designated as essential, and the number and order of search terms is not limited by the action of the machine. This philosophy of data retrieval can probably best be understood by reference to Figs. 1–5.
FIG. 1. The arrangement of administrative and indexing terms in a magnetic tape employed in the proposed system.
In Fig. 1, the direction of tape travel is as indicated. The document “Start” and document “End” codes are special administrative codes which will be discussed later in detail, the words 1 through N are codes representative of indexing terms taken from a standard list or vocabulary previously described, and the document identity is, as might be expected, a code representative of the document to which the preceding indexing codes apply. The operation of the proposed search machine can be understood by a description of Fig. 3 in terms of a particular search problem. A development engineer faced with the problem of designing a new high-frequency and extremely stable oscillator might request a machine search to determine the identity of available written material relating to his particular design problem. Figure 2 shows a suggested form whereby the engineer may
OCR for page 1191
request a machine search to provide him with the desired information. In Fig. 2, only five identifier words are shown associated with the search. However, this is not a limiting number, and any reasonable number of terms within the capacity of the machine could be specified. The identifier terms are taken from the previously described standard vocabulary or list of indexing terms, and the identifier codes representative of the identifier words are similarly obtained from the standard vocabulary.
FIG. 2. A simple example of an office form whereby a request for machine search may be made.
The translation from identifier word to identifier code has been shown as a manual translation; however, at the expense of added cost and machine complexity, a machine translation could be performed if the human translation were too burdensome. In the right-hand column of the request form of Fig. 2, a note is made as to whether each of the identifier words is of general or essential interest to the particular search. In the example, the word oscillator has been marked as essential, while the words transistor, modulated, radio frequency, and crystal controlled are marked as being nonessential or general terms. With the exception of the box labeled “Number of General Identifiers Required to Match,” the remainder of the information at the bottom of the request form is for administrative purposes. The notation “3” under the lower right-hand box of Fig. 2 indicates that the person requesting the search is
OCR for page 1192
interested in all documents relating to oscillators and to which three or more of the general terms apply. For example, documents relating to radio-frequency, crystal-controlled, modulated oscillators, as well as documents relating to transistor, radio-frequency, crystal-controlled oscillators, are of possible interest. Further, all documents to which the word oscillator and to which, in addition, any combination of three or more of the general terms applies are of possible interest.
FIG. 3. A basic block diagram of a special purpose computer arranged to retrieve data in accordance with the previously described search approach.
Figure 3 shows in block diagram form the essential elements of a machine
OCR for page 1193
capable of solving the problem established in the previously described request form. The input devices are shown in general form as these may range from a simple arrangement of a plurality of keys or switches manually settable to establish the codes representative of the indexing terms to more complex arrangements, such as punched cards, magnetic tape, or operator key sets which when manipulated are effective to generate codes representative of the indexing terms. In this discussion a specific code for storing information in the search files has not been designated, as this decision is relatively unimportant to the operation of the machine so long as an efficient unambiguous coding system is employed. A decision as to what code would be most appropriate depends in part on the codes used by other machines employed in conjunction with the retrieval machine, and such factors as the addition of error detection or error correction would in large part depend upon the accuracy demanded of the system. The storage blocks for identifier words 1 through N are shown as separate from the input devices. However, in the case of simple input devices comprising keys or switches or even punched cards, the storage would be included in the input device rather than as a separate box as shown. The two permanent storage elements marked “Start” and “End” are merely wired arrangements in which codes representative of the previously mentioned document “Start” and document “End” codes are stored. The comparison circuits 1 through N and the comparison circuits “Start” and “End” are arranged to compare information stored in an associated word storage and parallel information read from the magnetic tape, and to provide an output pulse whenever coincidence occurs. In our example, the identifier words of the request, by means of the input devices, are entered in any of the descriptive word storage blocks 1 through N without regard for order.
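The search rule set out for the request form (every essential term must apply to a document, together with at least a threshold number of the general terms) can be sketched in software. The following Python fragment is only an illustration of that rule, not a model of the special purpose machine of Fig. 3; the names `Document` and `search`, the document identities D1 to D4, and their term assignments are hypothetical, and index terms are written as words rather than codes for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    identity: str      # document identity code
    terms: frozenset   # index terms assigned to the document


def search(documents, essential, general, threshold):
    """Return identities of documents that carry every essential term
    and at least `threshold` of the general terms."""
    essential = set(essential)
    general = set(general)
    hits = []
    for doc in documents:
        if not essential <= doc.terms:
            continue                      # an essential term is missing
        if len(general & doc.terms) >= threshold:
            hits.append(doc.identity)     # threshold of general terms met
    return hits


# Hypothetical file of indexed documents.
docs = [
    Document("D1", frozenset({"oscillator", "radio frequency",
                              "crystal controlled", "modulated"})),
    Document("D2", frozenset({"oscillator", "transistor",
                              "radio frequency", "crystal controlled"})),
    Document("D3", frozenset({"oscillator", "transistor"})),
    Document("D4", frozenset({"transistor", "modulated",
                              "radio frequency", "crystal controlled"})),
]

# The request of the example: "oscillator" essential, four general
# terms, of which three or more must match.
essential = ["oscillator"]
general = ["transistor", "modulated", "radio frequency", "crystal controlled"]
print(search(docs, essential, general, 3))  # ['D1', 'D2']
```

Because the essential list may be empty, the same function covers the case in which none of the search terms is designated as essential; the threshold on general terms then decides the match alone.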
A double deck switch such as is shown schematically at the right side of Fig. 3 is associated with each of the comparison circuits 1 through N. The purpose of these switches is to designate whether the identifier stored in the associated storage block is of general or essential interest to the search or, if the associated storage block is not in use, to establish the idle condition for that comparison and storage circuit.

The magnetic tape has a plurality of tracks; the exact number of tracks is not essential to an understanding of the proposed system. This number would be determined by the number of identifier words employed in the system and by the coding scheme employed to identify the documents. For example, if a continuous file covering a number of years were employed, the number of tracks would have to be sufficient to record a code descriptive of the largest anticipated number of documents in the system. However, if information is stored on annual tapes, considerably fewer tracks would suffice, since redundant information defining the year can be omitted.

The magnetic tape, arranged as shown in Fig. 1, travels past the parallel magnetic tape reading heads, as indicated in Fig. 3, to provide signals representative of the information stored on the tape to the line amplifier and to the gated output amplifier. If it is assumed that the magnetic reading heads first encounter a document “Start” code, the information from the line amplifier and that in the “Start” permanent storage will match, and a “Clear Counter” signal will be generated in the “Start” comparison circuit. The clear counter signal resets the threshold counter to “0” count and resets the flip-flop inputs to the output gate. The machine is now prepared to examine the document identifiers contained in the tape between the just encountered document “Start” code and the document “End” code succeeding word N.

As the document identifier words are read serially from the storage tape, the code representative of each word is compared in parallel with the codes entered in the storage blocks at the time the search is initiated. If a match is found, an output is provided by the particular comparison circuit. For example, if the code representative of the word “transistor” is stored in the first storage block, the first comparison circuit output conductor would be energized each time the code representative of “transistor” is encountered in the search file. The setting of the rotary switch associated with a particular comparison circuit determines whether the output signal from that comparison circuit will be directed to the input of the threshold counter, which is associated with the general identifier terms of the search, or to the output AND gate through a flip-flop circuit, as is the case when the term is stated to be of essential interest to the search.

In Fig. 3, the rotary switch associated with the first comparison circuit is shown in the general position, and thereby the output of the first comparison circuit is connected to the input of the threshold counter. Accordingly, each time the word stored in the first word storage block matches the code read from the magnetic tape, the threshold counter will be advanced one count. The flip-flop associated with the first comparison circuit is set whenever a comparison is found. However, with the rotary switch in either the idle or general position, the output of the flip-flop will be open-circuited. When the rotary switch is in the essential position, as is the case for the switch associated with the second comparison circuit, the output of the associated flip-flop will be connected to an inhibit lead of the output gate. Resetting of the flip-flop energizes its output conductor, thereby inhibiting the output gate until such time as a comparison is found for the essential identifier, which sets the flip-flop. If the information encountered between a document “Start” and a document “End” code matches the conditions imposed by the search problem, the identity
of the pertinent document should be recorded in the output device. The flip-flop shown adjacent to the output gate is controlled by the threshold counter output which, in accordance with the search request, is set in position 3. Accordingly, if the threshold counter reaches a count of 3 between a document “Start” code and a document “End” code, an output signal is provided from the threshold counter to set the associated flip-flop to its “1” state. If the flip-flops associated with essential words and the flip-flop associated with the threshold counter are all set at the time the document “End” code is read from the magnetic tape file, the output gate will be enabled to provide an output pulse which in turn enables the output amplifier. Accordingly, the information immediately succeeding the document “End” code, namely the document “Identity” code, is gated to the output device. Because only the transient signal from the output gate is employed, only the document “Identity” code is read out; extraneous information that follows, such as the next document “Start” code, is not.

As in the case of the input devices, the output devices are shown only in general form. Obviously a very high-speed direct printing device would suffice for this application. However, on-line use of such a device probably would not be economically sound. Accordingly, in most instances, buffer storage of one type or another would appear to be advantageous. If a rule of operation is established such that a minimum number of lines on the search tape will be left between identity codes, a simple constant-speed magnetic tape could be employed to advantage as an output buffer. For example, if a minimum of thirty lines were reserved for each document indexed, the output tape could be run at a constant speed of one-thirtieth of the speed of the search tape. Accordingly, identities read from adjacent entries on the input tape will appear as adjacent entries on the output tape.
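The selection rule performed by the machine of Fig. 3 (every essential identifier found, and the threshold count of general identifiers reached, between a document “Start” and a document “End” code) can be imitated in software. The following Python fragment is only an illustrative sketch, not part of the proposed hardware: the names `SearchRequest` and `scan_tape`, the placeholder codes `<START>` and `<END>`, the document identities, and the exact list of general terms are all assumptions made for the example.

```python
from dataclasses import dataclass

# Administrative codes on the search tape (placeholder values, assumed here).
START, END = "<START>", "<END>"


@dataclass
class SearchRequest:
    essential: frozenset  # every one of these identifiers must be found
    general: frozenset    # identifiers that advance the threshold counter
    threshold: int        # minimum count of general identifiers


def scan_tape(tape, request):
    """Return identity codes of the documents that satisfy the request."""
    hits = []
    counter = 0              # plays the role of the threshold counter
    essential_found = set()  # one "flip-flop" per essential identifier
    gate_state = None        # True/False once "End" is read, otherwise None
    for code in tape:
        if gate_state is not None:
            # The code immediately after "End" is the document identity.
            if gate_state:
                hits.append(code)
            gate_state = None
        elif code == START:
            counter = 0              # "Clear Counter" signal
            essential_found.clear()  # reset the flip-flops
        elif code == END:
            gate_state = (counter >= request.threshold
                          and essential_found == request.essential)
        elif code in request.essential:
            essential_found.add(code)
        elif code in request.general:
            counter += 1             # advances on every match, as in the text
    return hits


request = SearchRequest(
    essential=frozenset({"oscillator"}),
    general=frozenset({"transistor", "radio-frequency",
                       "crystal-controlled", "modulated"}),
    threshold=3,
)
tape = [
    START, "oscillator", "radio-frequency", "crystal-controlled", "modulated", END, "D-101",
    START, "oscillator", "transistor", "modulated", END, "D-102",
    START, "amplifier", "transistor", "radio-frequency", "crystal-controlled", END, "D-103",
    START, "transistor", "radio-frequency", "crystal-controlled", "oscillator", END, "D-104",
]
matches = scan_tape(tape, request)  # D-101 and D-104 satisfy the request
```

In this sketch the second document fails because only two general terms apply, and the third fails because the essential term is absent, which mirrors the roles of the threshold counter and the inhibit lead respectively.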
Where a large number of documents exist in the file between the identities of pertinent references, spaces will occur between identity numbers on the output tape. This does not appear to be a serious problem, since the output tape may be removed and employed to control printing devices away from the search machine in a typical off-the-line mode of operation.

The subject system readily lends itself to an orderly increase in search capacity. An extension of the system of Fig. 3 to permit three simultaneous searches is shown in Fig. 4. An orderly expansion is possible since the identifier word storage and comparison circuits are not permanently associated with a particular search problem, but rather are assigned to a particular problem in accordance with search requirements. For example, if thirty word storage and comparison circuits with associated flip-flops were provided to operate with three search circuits, each comprising a threshold counter, a flip-flop, an output
FIG. 4. A block diagram showing an expansion of the arrangements of Fig. 3 to permit simultaneous solution of more than one search problem.
gate, a gated output amplifier, and an output device, it is doubtful that each problem would always require one-third of the comparison and storage circuits. It appears more likely that certain search problems would require more than ten comparison circuits while others would require fewer; therefore, the flexibility available through the use of the rotary switches, which direct the output of the comparison circuits to their proper search tasks, is of considerable worth.

In Fig. 4, identifier storage and comparison circuits are assigned to a particular search at the time a search is initiated. It is possible that identifier codes representative of the same indexing term may be entered in separate storage circuits which are associated with separate searches. Accordingly, it is a distinct possibility that comparison circuits not associated with the same search will be simultaneously energized to advance the respective threshold counters or to set associated flip-flops. The operation of the circuit of Fig. 4 is identical to that of Fig. 3 except that now a plurality of searches may be simultaneously undertaken. The separate searches may refer to separate search problems or they may specify different ways of posing a search problem relating to a particular piece of information. Accordingly, the second and third searches may specify only different degrees of search requirements with regard to a single search problem, or they may advantageously relate to separate and distinct search problems.

A mechanized system of information announcement

A monthly notice to pinpoint abstracts of particular interest to personnel of an organization has been mentioned herein without specific reference to machine arrangements. A modification of the arrangements of Fig. 4 to accomplish this task is shown in Fig. 5.
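Since Fig. 5 builds on Fig. 4, it may help to see the shared-pool idea of Fig. 4 in miniature: each comparison circuit holds one identifier word and is assigned, at the time a search is set up, to one search as a general or essential term, or is left idle. The Python fragment below is a rough sketch under stated assumptions; `Circuit`, `scan_shared`, the placeholder tape codes, and the sample data are hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass

START, END = "<START>", "<END>"  # placeholder administrative codes (assumed)
IDLE, GENERAL, ESSENTIAL = "idle", "general", "essential"


@dataclass(frozen=True)
class Circuit:
    word: str    # identifier held in the word storage block
    search: int  # number of the search this circuit is assigned to
    role: str    # switch position: IDLE, GENERAL, or ESSENTIAL


def scan_shared(tape, circuits, thresholds):
    """Run every search in `thresholds` during a single pass of the tape."""
    active = [c for c in circuits if c.role != IDLE]
    hits = {s: [] for s in thresholds}
    counters = {s: 0 for s in thresholds}  # one threshold counter per search
    found = set()                          # essential circuits that matched
    satisfied = None                       # searches passed at "End", else None
    for code in tape:
        if satisfied is not None:
            # The code after "End" is the document identity.
            for s in satisfied:
                hits[s].append(code)
            satisfied = None
        elif code == START:
            counters = {s: 0 for s in thresholds}
            found = set()
        elif code == END:
            satisfied = [
                s for s in thresholds
                if counters[s] >= thresholds[s]
                and all(c in found for c in active
                        if c.search == s and c.role == ESSENTIAL)
            ]
        else:
            # The same word may be held by circuits of different searches.
            for c in active:
                if c.word == code:
                    if c.role == GENERAL:
                        counters[c.search] += 1
                    else:
                        found.add(c)
    return hits


pool = [
    Circuit("oscillator", 1, ESSENTIAL),
    Circuit("transistor", 1, GENERAL),
    Circuit("radio-frequency", 1, GENERAL),
    Circuit("transistor", 2, ESSENTIAL),
    Circuit("amplifier", 2, GENERAL),
    Circuit("", 0, IDLE),  # an unused circuit
]
tape = [
    START, "oscillator", "transistor", "radio-frequency", END, "D-1",
    START, "transistor", "amplifier", END, "D-2",
    START, "oscillator", END, "D-3",
]
results = scan_shared(tape, pool, thresholds={1: 2, 2: 1})
```

Note that “transistor” is held by two circuits belonging to different searches, so a single word read from the tape advances one search's counter and sets another search's essential flip-flop at the same moment, as the text anticipates.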
It has been estimated that in a large organization this service might be furnished to several thousand individuals, each having nine or more areas of interest either relating directly to their present work or to special fields of interest not necessarily related to their present jobs. It has been further estimated that the number of documents indexed each month might run on the order of 2000. A large organization having 2000 scientists, each having nine areas of interest, would require 18,000 search problems per month merely to prepare the notices to accompany the abstract books. Although 18,000 serial searches of a file having 2000 document entries might appear to be a tremendously lengthy process, the arrangements of Fig. 5 appear to be capable of performing this task in a reasonable period of time. Figure 5 is an extension of Fig. 4 with specific suggestions in the area of input and output devices; the use of electronic switching means, rather than mechanical switches, to gate the information signals throughout the system; and the
FIG. 5. A variation of the arrangements of Fig. 4 to provide rapid preparation of announcement lists.
use of a continuous tape search file as opposed to the arrangements of Fig. 4, wherein tape is shuttled from reel to reel as the search progresses.

The format of the search file as shown in Fig. 6 is identical to the search file shown in Fig. 1 with the exception of the addition of an administrative code to indicate the end of the indexing information. In Fig. 6, this additional administrative code has been labeled “Tape End.”

FIG. 6. The search file.

A minimum number of lines are left between identity codes in the monthly search file so that the output buffer tape may be run at a uniform, relatively slow speed. As previously discussed, the speed of the output buffer tape would be equal to the speed of the search tape divided by the number of lines between document “Identity” codes. Information is set into the arrangements of Fig. 5 by means of punched cards which are fed alternately from two piles of cards. The use of two card feed mechanisms considerably relaxes the requirements for such devices with regard to speed of operation.

The philosophy of operation of Fig. 5 is identical to that of Fig. 4, but a short description of the overall operation will clarify the areas wherein equipment changes have been made. Assuming as a starting point that there are no cards in the card sensing mechanisms shown at the left side of Fig. 5 and that the search file tape is being run past the reading heads in the direction shown in Fig. 6, the “Tape End” comparison circuit will be energized when the “Tape End” code is encountered in the search file. The comparison circuit output signal is connected to the single stage binary counter and, when energized, will drive the counter either from “0” to “1” or from “1” to “0,” depending upon the prior state of the counter. If the counter is driven from “0” to “1,” an enabling pulse will be
transmitted to card feed 1 to shift a card into the first card sensing mechanism and to energize the output AND gate associated with the second card sensing mechanism. Since there is no card in the second sensing mechanism, there is no possibility of conducting a search during the first pass of the tape file. However, the next time the “Tape End” signal is encountered, the binary counter will be shifted from the “1” to the “0” state to enable the second card feed to place a card in the second card sensing mechanism, and to enable the output AND gate associated with the first card sensing mechanism. Enablement of the AND gate establishes the desired search conditions within the machine and provides a momentary path from the card sensing mechanism to the recording heads. Accordingly, as soon as a card is sensed, the identity code of the person to whom the information should be directed is entered on the buffer output tape. In this way, document identities read to the buffer tape from the search files in accordance with the search problems defined by the input cards will follow the identity of the person requesting the search.

The input card contains not only the identity of the person requesting the search, but also all information relating to the search, such as the codes of the words defining the search problem; the determination of whether these words are general or essential identifiers; the number of the search problem to which these search terms are to apply; and the threshold number of general identifiers which must apply for a document to be of interest.

Figure 5 has been simplified considerably through the use of one heavy line indicating a control cable having a large number of control conductors and a second heavy line indicating a multiconductor bus bar type of interconnection conveying the parallel bits representative of the codes read from the search file.
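The alternation between the two card sensing mechanisms, driven by the single stage binary counter, can be traced with a short simulation. This Python sketch is an illustration only: `schedule_passes` is a hypothetical name, and the two piles of cards are modeled, as a simplifying assumption, by one ordered deck dealt alternately to the two mechanisms.

```python
def schedule_passes(cards):
    """Return, for each pass of the tape, the card whose search is active.

    An entry of None marks a pass during which no search can be conducted.
    """
    deck = iter(cards)       # assumed: one ordered deck dealt to both feeds
    counter = 0              # single stage binary counter
    sensing = [None, None]   # cards held in sensing mechanisms 1 and 2
    schedule = []
    while True:
        counter ^= 1                       # "Tape End" code toggles the counter
        load = 0 if counter == 1 else 1    # mechanism receiving a new card
        sensing[load] = next(deck, None)
        active = sensing[1 - load]         # gate enabled for the other mechanism
        if active is None and sensing[load] is None:
            break                          # no cards left anywhere
        schedule.append(active)
    return schedule


passes = schedule_passes(["card A", "card B", "card C"])
```

Under these assumptions, n cards take n + 1 passes of the tape, because the first pass only loads a card. If each card carried three search problems for one person (an inference from the three output gates, not a figure stated in the text), the 18,000 monthly problems would correspond to 6000 cards.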
Further, where pluralities of paths exist, amplifiers, AND gates, and OR gates have each been represented by a single gate or amplifier. However, it must be fully understood that in such instances a plurality of gates or amplifiers is necessary. The search identifying terms sensed from the input card provide parallel inputs to the comparison circuits 1 through N, and control information read from the input card enables specific AND gates to establish paths from the comparison circuit output conductors and their associated flip-flops to either the threshold counter associated with the desired search or the output AND gate associated with that search. Energizing of a comparison circuit output conductor will effect setting of its associated flip-flop, and if a term is stated in the problem to be an essential identifier, one of the gates connected to the “0” output conductor of the flip-flop will be enabled to complete a path from the flip-flop “0” terminal to the inhibit input terminal of one of the three output AND gates, depending upon the problem to which the comparison circuit is assigned. If the identifier assigned to the comparison circuit is a general term,
the “0” output conductor of the flip-flop will be open-circuited, since none of the essential AND gates associated with that circuit will be enabled, and the comparison circuit output conductor will be connected through one of the general AND gates to the threshold counter of the proper search problem. Accordingly, the selective enabling of the essential and general AND gates accomplishes the functions previously delegated to the rotary switches of Fig. 4.

The output AND gates are arranged so that if the “0” output conductor of a flip-flop associated with a comparison circuit is energized and this state is transmitted to one of the three output AND gates, that output gate will be inhibited. When a comparison is found for an essential identifier, the flip-flop will be set to its “1” state to deenergize the “0” conductor and thereby remove the inhibiting signal from the associated output AND gate. As in Fig. 4, each time a comparison circuit associated with a general identifier is energized, the threshold counter associated with that search will be advanced one count. The threshold count assigned to a particular search problem is established by enabling one of the threshold counter output AND gates so that if a counter reaches the preassigned threshold count, the counter output flip-flop will be set. Whenever the conditions of a search problem are met between a document “Start” and a document “End” code, the output AND gate associated with that particular problem will be enabled to in turn enable the associated gated output amplifier. Thereby, the identity of the document meeting the search requirements will be read to the output buffer tape through the recording heads.

There is the possibility that a document will be of interest to more than one of the three simultaneous search problems. Since all three searches relate to one man’s interest, the output AND gates may be arranged in a simple preference chain to prevent mutilation of output signals.
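The inhibit logic of an output gate and the preference chain among the three gates can be expressed as two small Boolean functions. This is a minimal sketch, assuming that a preference chain means the first enabled gate blocks all later ones; the function names `output_gate` and `preference_chain` are hypothetical and the source does not specify how the chain is wired.

```python
def output_gate(threshold_ff, essential_ffs):
    """Model one output AND gate.

    threshold_ff:  True once the threshold counter's flip-flop is set.
    essential_ffs: one Boolean per essential identifier, True when set ("1").
    A flip-flop still in its "0" state energizes the inhibit lead.
    """
    inhibited = any(not ff for ff in essential_ffs)
    return threshold_ff and not inhibited


def preference_chain(gate_outputs):
    """Pass only the first enabled gate so one identity is written once."""
    granted = []
    blocked = False
    for enabled in gate_outputs:
        granted.append(enabled and not blocked)
        blocked = blocked or enabled
    return granted


# A document satisfying searches 2 and 3 of the same person.
gates = [
    output_gate(True, [True, False]),  # search 1: an essential term is missing
    output_gate(True, [True]),         # search 2: all conditions met
    output_gate(True, []),             # search 3: no essential terms required
]
granted = preference_chain(gates)
```

Here only the second gate is granted, so the document identity would be written to the buffer tape a single time even though two of the person's searches were satisfied.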
After the tape file has been completely searched, the “Tape End” signal will again be encountered to advance a new card into the just used card sensing mechanism, and to enable the output AND gate associated with the other card sensing device. The searching process will continue until all cards have been processed. It is contemplated that the output tape would be taken away from the searching machine, and the information thereon printed out by standard off-the-line printing procedures.

Summary

The suggestions made in this paper are directed to several major problems in information handling. They are based on the belief that in most large research organizations these problems exist because sufficient attention has not been
given to systematic study of ways and means to correct them. It is believed that research organizations frequently acquire more information than they need because of failure to identify information needs, and that the general philosophy that information needs will be met simply by increased acquisition cannot be supported from the standpoint of economy or practical use of much of the information which is acquired. In this respect, the number of articles selected by the Index Review Panel from journals received could well serve to indicate the value of a particular journal to the organization and should influence decisions on further subscription to little used journals. In broader scope, the determination of information requirements should provide more accurate selection of all forms of information according to stated needs.

Three approaches to establishing information requirements have been suggested, the first being concerned with needs in relation to immediate tasks, the second in relation to developing the scientific potential of the staff, and the third in relation to organization interests as reflected in the publication of information resulting from research activities. There are, of course, many other sources which could be utilized to relate information needs to objectives, including patent activities, administrative correspondence, laboratory notebooks, and conferences, all of which relate to the three approaches discussed.

In the review and control of information, a primary consideration is the need for participation by the research staff and by scientifically or technically trained personnel in the information group. To the degree that a research organization wants good information control, it will support staff participation in gaining that control. The use of standardized techniques in the information control process will facilitate the orderly and consistent development of a system for effective control.
The techniques suggested for announcing new information to members of the staff provide for announcement to the staff generally and to individuals specifically. This approach does not withhold information from the man who enjoys going through the published literature assiduously, but at the same time it does pinpoint items of particular significance for those who are not inclined to do this.

In several respects, oral and machine techniques can serve a useful purpose in improving an information handling system, and suggestions for their use have been made. It is not suggested that mechanization is desirable if it is more economical to use manual techniques or when machines are incapable of doing work where human effort is required. An attempt has been made to bring into combination manual, intellectual, and machine operations where it seemed advantageous to do so.