Suggested Citation:"Variable Scope Search System: VS3." National Research Council. 1959. Proceedings of the International Conference on Scientific Information: Two Volumes. Washington, DC: The National Academies Press. doi: 10.17226/10866.

Variable Scope Search System: VS3

JACOB LEIBOWITZ, JULIUS FROME, and DON D. ANDREWS

This paper will describe a system being developed by the U.S. Patent Office for mechanizing searches for organic chemical compound disclosures. This effort is part of an overall program for mechanization of the entire searching operation of the Patent Office. Its objective is to provide a solution to the complex problems of patent searching occasioned by the rapid, exponential growth of the art, the multiple and variable points of view required in patent search requirements, and the relative inability of rigid manual classification systems to provide such multiple access to subject matter.

The usefulness of the new system is not limited to patent searching. It can be used for any search with respect to chemical compounds in terms of structural characteristics whether the search is done by the patent profession, the research scientist or the industrial organization.

Detailed descriptions of the nature of the patent search problem and some of the mechanization research toward its solution have appeared in the literature (1–11). This paper will give, therefore, only a brief general statement of the problem with respect to chemical compound searching for which the new structural search system is designed.

The problem

A patent search is performed with respect to the claimed subject matter of an application for a patent to determine its patentability by comparison with the prior art subject matter. The examiner searches both from the point of view of (a) novelty and (b) invention. He is therefore interested not only in identical subject matter but also in similar or the most closely related subject matter. Thus, patent searches are ordinarily from generic points of view regardless of whether the claimed subject matter is for a specific embodiment or a generic class.

JACOB LEIBOWITZ and JULIUS FROME, Office of Research and Development, U.S. Patent Office, Department of Commerce, Washington, D.C.

DON D. ANDREWS, Director of Research and Development, U.S. Patent Office, Department of Commerce, Washington, D.C.


Another feature of patent searching is the variability in search requirements. The point of view of searching varies in accordance with the constantly shifting emphasis which is characteristic of the field of new developments and invention.

Features of the new system

The search system is called VS3 (variable scope search system). It is designed to provide multiple access to organic chemical compound disclosures with great variability in scope both generically and specifically.

Coding of the compound for machine searching is done from the structural formula of the compound. The coding is done by nontechnical clerical personnel.

The system permits the use of a simplified coding method. Each compound is regarded as consisting of certain building-block units in certain associations with each other. The building blocks are single ring configurations and selected nonring or chain-unit configurations, e.g., the benzene ring, the diazine ring, and the carboxamide chain unit. The assemblage of codes for these building blocks per se is constant in each compound, while the associations among these building blocks are variable with each compound. The constants, therefore, have preassigned codes and corresponding prepunched cards. In coding a compound, the relationships among these building blocks are indicated and the prepunched cards are assembled and completed according to the indicated associations.

Another feature of the system is the fact that there is no limit imposed on the size of the dictionary providing the descriptive terminology for description of the compounds or the amount of description that can be given to the disclosure of any particular document.

The system is a punched card system using the new machine ILAS previously described (1, 2).

Types of search questions answered by VS3

The body of art selected for experimentation with the VS3 system is the “thiazine” art. The example in Fig. 1 constitutes the disclosure of a compound in this art, taken from U.S. Patent No. 1,996,867.

FIGURE 1


The system is designed to permit retrieval of this compound, inter alia, on the basis of questions of the following type.

To find:

  1. An aryl methyl ether

  2. An aryl sulfonamidobenzene compound

  3. An azine

  4. A thiazine

  5. A 1,2,4-trichlorophenthiazine

  6. A 1,4-thiazine

  7. A dichlorobenzene

  8. A sulfonamidodiphenylamine

  9. An aminochlorophenothiazine

and so on.

It is important to note that retrieval will be, in each case, not only of the compound depicted but also of any other compounds meeting the requirements of the search. Also, the search varies in scope. It may be expressed in terms of one or several characteristics. The search may be for an ether broadly or a methyl ether more specifically, for a heterocyclic compound, or more specifically an azine, or a thiazine, or a 1,4-thiazine. It can require limitation to positions of substitution or ignore positions of substitution.

Vocabulary of the system and codes

A standard 80-column IBM card is used for punching the code. Codes are punched horizontally across the card. Thus there may be punched on each card 12 code words, each containing 80 bits or punching positions. Figure 2 shows the format of a code word. The first 68 punching positions are divided into 17 characters of 4 punching positions each. The code is punched in hexadecimal digits. Thus this part of the word has available 17 hexadecimal digits.

FIGURE 2


This 17-character part of the word is further subdivided into three fields: a field of one character for the signal, a field of 4 characters (M, 1, 2, 3) for the modulant, and the remaining 12 characters for subject matter. The last 12 punching positions, columns 69–80, form the “interfix” field, which is in turn subdivided into 6 punching positions for the “homo” field and 6 punching positions for the “hetero” field.
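The word layout just described can be sketched in Python. This is only an illustration under stated assumptions: the names `CodeWord` and `parse_word` are hypothetical, a word is represented as a string of 80 characters `0`/`1`, the leftmost punching position of each character is taken as the most significant bit, and the homo field is taken to precede the hetero field.

```python
from dataclasses import dataclass

HEX_CHARS = 17  # signal + modulant (M, 1, 2, 3) + subject matter (4..15)


@dataclass(frozen=True)
class CodeWord:
    signal: str    # 1 hexadecimal character
    modulant: str  # 4 hexadecimal characters
    subject: str   # 12 hexadecimal characters
    homo: int      # 6 punching positions as a bit mask (assumed first half)
    hetero: int    # 6 punching positions as a bit mask (assumed second half)


def parse_word(bits: str) -> CodeWord:
    """Split one 80-position code word into the fields named in the text."""
    if len(bits) != 80 or set(bits) - {"0", "1"}:
        raise ValueError("a code word is exactly 80 punching positions")
    # First 68 positions: 17 characters of 4 positions each, read as hex digits.
    hex_part = "".join(
        format(int(bits[i:i + 4], 2), "X") for i in range(0, 4 * HEX_CHARS, 4)
    )
    interfix = bits[4 * HEX_CHARS:]  # last 12 positions, columns 69-80
    return CodeWord(
        signal=hex_part[0],
        modulant=hex_part[1:5],
        subject=hex_part[5:],
        homo=int(interfix[:6], 2),
        hetero=int(interfix[6:], 2),
    )


example = parse_word("0000" + "0101" + "0" * 60 + "100000" + "000001")
```

Here `example.modulant` is `"5000"`, and the two interfix halves come back as separate 6-bit masks.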

Descriptors

The terms appearing in the dictionary which are used to describe and define the characteristics of the chemical compounds are designated as descriptors. The descriptors are translatable into code and so the term descriptor and code will be used synonymously.

It is convenient to consider the descriptors and their code equivalents as being of two types, (1) substantive and (2) organizational.

The substantive descriptors describe and define the characteristics of the chemical compounds and they appear in the coding dictionary, a portion of which is illustrated in Appendix A. (The complete dictionary is available at the Office of Research and Development, United States Patent Office). The organizational descriptors express the relationship among these characteristics. This division is purely arbitrary since substantive codes also express relationships. The division is based, however, on the means employed in the system to set forth the relationships.

The substantive codes appear in the 16 characters M through 15 of the word. The organizational codes, expressive of relationships among the substantive codes, appear in the signal and interfix part of the word.

Substantive codes

The substantive code contains a modulant and subject matter codes. The modulant is a modifier of the subject matter codes; it is a device for using the same code to mean a variety of things according to the particular modulant used. At present, there are 12 modulants employed, although provision has been made for over 65,000 by the allotment of 4 hexadecimal characters to the modulant field. The modulants are listed at the beginning of Appendix A, and the types of information recorded in each type of word according to these modulants are briefly exemplified throughout Appendix A.

Organizational codes

There are two types of organization codes, (1) the grouping shown by the signal code and (2) the relationships among the groupings shown by the interfix code.

GROUPING (SIGNALS)

The compound is coded in terms of substructures of various sizes up to and including the complete molecule. The smallest substructural units are the ring and the chain unit. The ring is the familiar cyclic structure; the chain unit is an element or collection of elements in an acyclic configuration. The limits of the chain unit are determined by the groups appearing in the dictionary in the M-1, M-2, M-3, M-4, and M-C words. Most of these groups are the conventional “functional” groups of chemistry although there was no hesitation in synthesizing groups whenever it was deemed necessary from the point of view of retrieval.

The compound of Fig. 1 has been rewritten in Fig. 3 to illustrate the grouping organizations of the structure. The structure has been transformed into the type of skeletal formula set up preparatory to coding. The rings are indicated by Roman numerals; the chain units are encircled. The next substructures to consider are the ring systems and the chains.

FIGURE 3

Ring system and ring

The ring system is a structural entity which comprehends within its scope one ring or a collection of rings in “fused face” relationship.

There are four ring systems in the formula (Fig. 3), as follows:

  1. [(I) (II) (III)]

  2. [(IV)]

  3. [(V)]

  4. [(VI)]

The parentheses are used to symbolize the enclosure of the codes pertaining to a ring, and the external brackets symbolize an enclosure of the collection of rings pertaining to the ring system. It will be noted that a ring system may contain only one ring, as in items 2, 3, and 4. Rings connected by a bond juncture are not in the same ring system, while rings in fused face relationship are.

A ring system is not only a collection of codes for the individual rings it contains. It consists of other characteristics not present in the individual ring codes, such as the fused face relationship (see Appendix A) and also characteristics of the type provided for by the ring system word M-7 (Appendix A). In setting up the codes for the ring system, then, this additional information is included within the ring system grouping.

Chain unit and chain

The relationship between the chain unit and the chain is analogous to the relationship between the ring and the ring system. The chain is a continuity of chain units; its continuity is terminated by the interposition of a ring. It may consist of one chain unit only. The following are the chains found in the compound of Fig. 3 (with the same bracketing symbology).

  1. [(C) (O)]

  2. [(N)]

  3. [(Cl)]

  4. [(Cl)]

  5. [(Cl)]

  6. [(C)]

This is again a substructure within a larger substructure.

Compound grouping

The next order of grouping is the enclosure of the whole set as a compound unit. Within this unit codes are then added to indicate additional information pertaining to the entire compound group. The type of information added is indicated in words M-8, M-9, M-A, M-B (see Appendix A).

Patent grouping

The next and final grouping is that which encloses all the compounds as pertaining to the same document. This grouping organization of the codes for a compound structure may be compared to the organization of the written language. The substantive codes constitute the identification of the letters of the alphabet and their meanings in the word; the collection of words constitutes the larger organization of the sentence. In addition, however, since the meaning of the sentence is more than the mere sum of the words, additional substantive information is added pertaining to any additional concepts provided by the sentence as a whole. The sentences then are in a group constituting the larger substructure of the paragraph, with further information pertaining to the paragraph not available as the mere contextual sum of the sentences.

Signals

The grouping of the codes is handled through the signals, a list of which appears in Appendix A. These signals permit the proper correlations among the codes so that the codes for one ring do not get scrambled with the codes for a different ring, or the codes for one compound do not become correlated with the codes for a different compound.

There is no fixed limit to the number of codes that may be included within any signal group. Thus, any number of descriptors may be applied to definition of a ring, a ring system, a chain unit, a chain, or the compound as a whole. By the same token there is no particular limit to the amount of information that can be recorded for any one document.

In the machine operation, the presence of the signal signifies the termination of all the codes pertaining to the particular structural unit defined by the signal.
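The terminating role of the signal can be sketched as a small stack-based grouping routine. This is a minimal sketch, not the ILAS logic itself: words are reduced to plain labels, the function name `group_words` is hypothetical, and the use of S-5 as the compound-level signal is an assumption (the text names S-1, S-2, S-3, S-4, and S-6 only).

```python
# Signal -> (kind of group it closes, kinds of subgroups it may enclose).
ABSORBS = {
    "S-1": ("ring", ()),
    "S-2": ("ring_system", ("ring",)),
    "S-3": ("chain_unit", ()),
    "S-4": ("chain", ("chain_unit",)),
    "S-5": ("compound", ("ring_system", "chain")),  # assumed signal number
    "S-6": ("document", ("compound",)),
}


def group_words(words):
    """Nest a flat sequence of word labels into (kind, members) groups."""
    pending = []
    for word in words:
        if word not in ABSORBS:
            pending.append(word)  # substantive code, waits for its signal
            continue
        kind, children = ABSORBS[word]
        members = []
        # The signal terminates every loose code and eligible subgroup before it.
        while pending and (
            isinstance(pending[-1], str) or pending[-1][0] in children
        ):
            members.append(pending.pop())
        members.reverse()
        pending.append((kind, members))
    return pending


stream = ["M-5", "M-6", "S-1", "S-2", "M-C", "S-3", "S-4", "M-8", "S-5", "S-6"]
document = group_words(stream)
```

Because each signal closes only what precedes it, any number of codes can sit inside a group, which matches the statement that no fixed limit applies.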

While properties and functions have not been encoded into the system at present, it will be obvious that this can be done according to the grouping logic (11). Properties of a compound would be grouped on the compound level; properties of a substructure would be grouped within the level of said substructure. Thus in searching, correlations can be made between compound and function or substructure and function.

Interfix

Another organizational relationship is that of connectivity of the substructures. In the compound of Fig. 3, ring I is joined to N, ring IV is joined to N and to ring V, ring VI is joined to chains -O-C and -NSO2-, and so on. This relationship is handled by the interfix device. This involves the assignment of a pair of identical arbitrarily selected numbers to those substructures which are joined to each other.


The following types of connections are defined by interfix:

Homo interfix

(a) Ring to ring by bond juncture

(b) Chain unit to chain unit

Hetero interfix

(c) Ring to chain unit

(d) Ring to chain

Types (a) and (b) are called the homo interfixes and are coded in columns 1 to 6 of the interfix field; (c) and (d) are the hetero interfixes, coded in columns 7 to 12.

For illustration, a portion of the formula of Fig. 3 has been extracted and interfix numbers assigned as shown in Fig. 4. Each interfix number has been placed at the bond juncture to indicate that it is assigned to each of the substructures involved in the connection. Thus in coding, the units will have the following interfix numbers (written after the colon): (C: 1), (O: 1, 7), (VI: 7, 8), (V: 9). The substructures of Fig. 4 can be reconstructed according to the rule that those substructures which have the same numbers are joined to each other. It must be emphasized that the value of the number is of no consequence. What is significant is the identity of the pair of numbers.

FIGURE 4

Thus, the groups can be renumbered as shown in Fig. 5. The structure is equivalently defined and can be reconstructed from the interfix relationship.

FIGURE 5
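The reconstruction rule can be sketched as follows. The first mapping uses the interfix numbers quoted in the text for the Fig. 4 excerpt; the second mapping is a hypothetical renumbering chosen for illustration, since the actual numbers of Fig. 5 are not reproduced here. The function name `connections` is likewise an assumption.

```python
from collections import defaultdict
from itertools import combinations


def connections(units):
    """Return the set of joined pairs: units sharing an interfix number."""
    by_number = defaultdict(list)
    for label, numbers in units.items():
        for number in numbers:
            by_number[number].append(label)
    edges = set()
    for labels in by_number.values():
        for a, b in combinations(sorted(labels), 2):
            edges.add((a, b))
    return edges


# Interfix numbers as listed in the text for the excerpt of Fig. 4.
fig4 = {"C": {1}, "O": {1, 7}, "VI": {7, 8}, "V": {9}}
# Hypothetical renumbering: different values, same pairings.
renumbered = {"C": {3}, "O": {3, 2}, "VI": {2, 5}, "V": {6}}

edges = connections(fig4)
```

Both mappings yield the same two connections, C to O and O to VI. Numbers 8 and 9 stay unpaired in this excerpt because their partners lie outside the listed units.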

The interfix is punched in any of columns 69 to 80 according to the number assigned. While scanning and sensing of the codes occur row by row, determination of the interfix relationship is done by a vertical comparison down the column. Thus one code word in a row is connected to some other code word in a different row, either on the same card or on a different card, through recognition of a pair of punches in the same vertical column. This principle is further illustrated in Appendix B.

Coding: set-subset, exact pattern

Two methods of recording the codes are used according to two different desired searching devices: (1) the set-subset method and (2) the exact pattern method. To exemplify this, the code words used to describe a ring, i.e., M-5 for set-subset and M-6 for an exact pattern, have been selected. In the M-5 word (see Appendix A), assignment of codes is on a bit-by-bit (set-subset) basis. In character 7, for example, the bits have been allotted the meanings shown in Fig. 6. A ring which is both unsaturated and heterocyclic will be coded as 7–5 in the M-5 word as in Fig. 7. A search for a heterocyclic ring 7–4 will involve searching for a hole in the 4 bit and will result in retrieval of the structure. Similarly, a search for an unsaturated ring 7–1 or an unsaturated heterocyclic ring 7–5 will result in retrieval of the indicated ring.

FIGURE 6

FIGURE 7
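The set-subset test amounts to checking that every hole required by the question is present in the stored character. A minimal sketch, using only the two bit meanings stated in the text (1 for unsaturated, 4 for heterocyclic); the function name `retrieves` is hypothetical, and the meanings of the other two bits of character 7 are not assumed.

```python
UNSATURATED = 0x1   # search code 7-1 in the text
HETEROCYCLIC = 0x4  # search code 7-4 in the text


def retrieves(stored: int, query: int) -> bool:
    """True when every bit punched in the query is also punched in the record."""
    return (stored & query) == query


ring_char7 = UNSATURATED | HETEROCYCLIC  # coded 7-5

hits = [
    retrieves(ring_char7, HETEROCYCLIC),                # heterocyclic ring
    retrieves(ring_char7, UNSATURATED),                 # unsaturated ring
    retrieves(ring_char7, UNSATURATED | HETEROCYCLIC),  # both together
]
```

All three questions retrieve the ring, while a record coded 7–4 alone would not answer the 7–5 question.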

Characters 9, 10, 11, 12, 13, and 14 involve the same type of coding. Thus, a search for a ring containing a heterocyclic nitrogen ortho to sulfur will result in retrieval of said structure even though other groups are present such as:

In addition to the bit-by-bit basis, the terms of characters 4, 5, 6, and 8 have been coded on what may be called an “at least” basis, i.e., a ring containing 4 or more nitrogens has at least 3, 2, and 1 nitrogens; a 3 nitrogen ring contains at least 2 and 1 nitrogens; and a 2 nitrogen ring contains at least 1 nitrogen. Thus character 4 of the M-5 word will be punched as follows for a 4N, a 3N, a 2N, and a 1N containing ring:

FIGURE 8
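Since the figure is not reproduced here, the following sketch shows one way the “at least” coding could be realized. The bit assignment is an assumption, not taken from Fig. 8: bit 1 stands for at least one nitrogen, bit 2 for at least two, bit 4 for at least three, and bit 8 for four or more. The function names are hypothetical.

```python
def at_least_code(count: int) -> int:
    """Punch one bit for every threshold (1, 2, 3, 4+) that the count reaches."""
    level = min(max(count, 0), 4)
    return (1 << level) - 1


def at_least_query(threshold: int) -> int:
    """Single bit to look for when asking for 'at least threshold' atoms."""
    return 1 << (threshold - 1)


codes = {n: at_least_code(n) for n in (1, 2, 3, 4)}
# Under the assumed assignment: 1N -> 1, 2N -> 3, 3N -> 7, 4N -> F (hexadecimal).

with_two_or_more = [n for n, code in codes.items() if code & at_least_query(2)]
```

A question for at least two nitrogens then needs to sense only a single hole, and it retrieves the 2N, 3N, and 4N rings but not the 1N ring.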

The exact pattern coding is illustrated in M-6 (Appendix A). Each code is uniquely different from any other. A 4-membered ring 12–4 is not found within a 6-membered ring 12–6. Similarly, a 5 carbon ring is not found within a 6 carbon ring.

In searching, the codes can be used in any combination desired. Thus, where a search is for a diazine and it is desired to exclude the 3 N-or-over containing rings, the combination of codes requiring a 6-membered ring, at least 2 nitrogens, and exactly 4 carbons will result in this exclusion.
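The diazine question can be worked through on a few familiar rings. This is a sketch under assumptions: the record type `RingCodes`, the helper names, and the bit layout of the “at least” nitrogen code are hypothetical, and ring size and carbon count are simply held as exact values in place of the actual M-6 codes.

```python
from typing import NamedTuple


class RingCodes(NamedTuple):
    name: str
    members: int        # exact pattern: ring size
    carbons: int        # exact pattern: number of ring carbons
    nitrogen_bits: int  # "at least" coding, assumed bits 1, 2, 4, 8


def nitrogen_bits(count: int) -> int:
    return (1 << min(count, 4)) - 1


RINGS = [
    RingCodes("pyridine", 6, 5, nitrogen_bits(1)),
    RingCodes("pyrimidine", 6, 4, nitrogen_bits(2)),
    RingCodes("1,3,5-triazine", 6, 3, nitrogen_bits(3)),
    RingCodes("imidazole", 5, 3, nitrogen_bits(2)),
]

AT_LEAST_2N = 0b0010


def search(rings, members, carbons=None):
    """6-membered, at least 2 N, and optionally an exact carbon count."""
    hits = []
    for ring in rings:
        if ring.members != members:
            continue
        if not ring.nitrogen_bits & AT_LEAST_2N:
            continue
        if carbons is not None and ring.carbons != carbons:
            continue
        hits.append(ring.name)
    return hits


broad = search(RINGS, members=6)
narrow = search(RINGS, members=6, carbons=4)
```

Without the carbon requirement the triazine is retrieved along with the diazine; adding the exact count of 4 carbons leaves only the diazine.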

RINGS

Rings are coded according to the M-5 and M-6 words as already described. In addition to the descriptors found on pages of the coding dictionary in Appendix A, a code called an index number code is assigned in characters 4, 5, and 6 of the M-6 word. A list of rings with their index numbers appears in the coding dictionary. These are unique numbers defining the specific rings as they are disclosed in the Ring Index (12). The rings in the list have already been completely coded and are maintained in a card file. In addition, a set of prepunched cards has been prepared which contain the punched codes for each of these rings. When these rings are encountered in a formula, they need not be coded again. They are merely identified and the prepunched cards are selected for further processing. As new rings are found in the disclosures they are coded, unique index numbers are assigned, and they are added to the file.

The signal is coded in a separate word as the last word of the set. Thus, three codes are shown for each ring:

M-5 and M-6 followed by S-1.

RING SYSTEM

The ring system is the sum of the codes for the rings contained within the ring system plus any other codes required to define relationships not provided for by the sum of these codes. Where there is only one ring in the ring system, two codes for the ring plus S-1 are followed by S-2, and the coding for the ring system is complete. Where there are two or more rings in the ring system, additional codes are added to define relationships not expressed by the individual ring codes. One of these relationships is the “fused face” pattern. The fused face pattern defines the relative positions of the fused faces of each ring in the system, in accordance with the graphic portrayal of the coding dictionary in Appendix A. The heavy lines refer to a position of fused face joining with another ring. The patterns are not necessarily limited to carbon rings; the same relationships are intended to be applicable regardless of the kind of ring elements.

Thus the structure shows the following patterns for each of the rings:


The structure shows the same patterns.

The ring system shows the following patterns for each ring: VII, pattern A; VIII, pattern C; IX, pattern A. Thus it is possible in searching to discriminate on the basis of ring orientation.

The codes are arranged to provide for finding any pattern contained within another pattern. Thus pattern D is found within F, H, I, J, K, and L but not within A, B, C, E, and G.

A generic search, therefore, on the basis of a ring system of type I will result in retrieval of a structure of type II, since I is contained within II as follows:

  I          Relationship           II
  Ring A     is equal to            Ring A
  Ring A     is contained within    Ring B
  Ring D     is contained within    Ring F

On the other hand, a structure will not be retrieved since D is not within G.


The ring system is further coded according to the type of information in M-7 (Appendix A). This will provide further variation in scope as to searching. The search for a structure of type I, above, for example, can be made more specific by requiring that there be only 3 rings in the system (code M-7, 7–3), whereupon structure II would be excluded.

Ring systems are also precoded and assigned index numbers in the manner indicated for the ring.

CHAIN UNITS

The term chain unit is one of organization. Although the terms exemplified in M-1, M-2, M-3, and M-4 of the dictionary are referred to as chain units, they are, with certain exceptions, equally useful in describing ring structures. Thus, by inclusion of the code within the ring group, it would become a ring description. For the time being, however, these codes are being used as chain descriptors only.

There are five types of chain units: M-1, M-2, M-3, M-4, and M-C. The M-C word refers to the carbon unit. The M-1 to M-4 types are described by general formulas. Thus, the M-1 group is indicated to be of type ABX(=Y)CD. The letters themselves are of no significance. The formula represents an arbitrary notation system which serves as a guide for writing functional groups in a particular sequence. The groups in M-1 all have an element doubly bonded to another which serves as the focal point with reference to which a sequence is written for the chain word.

While each chain unit has a unique code, there are various subgeneric units found within it. For example, a carboxamido group M-1, 6–2, 7–3 is found within a urethane, a urea, a semicarbazide, and so on. A search on this basis will result in retrieval of all these groups. On the other hand, the search can be limited to any specific one desired, e.g., carboxamide per se, M-1, 4–1, 5–1, 6–2, 7–3, 8–5, 9–1, since the code is unique for this unit.

As a further example, (see Appendix A) a search for an amine [M-3, 5–5] will result in retrieval of primary, secondary, tertiary, and quaternary amines, imines, hydrazine, etc. A requirement for a secondary amine, M-3 codes 4–1, 5–5, 6–2, will limit the search to secondary amines only.
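The generic and specific chain unit questions above can be sketched as partial matches on the characters of a word. The carboxamide and secondary amine codes are those quoted in the text; the entry `UREA_LIKE` is a placeholder whose characters other than 6 and 7 are invented for illustration and are not dictionary values. Matching by equality on the queried characters is also an assumption, as is the function name `found_within`.

```python
# A word is modelled as (modulant, {character position: hexadecimal value}).
CARBOXAMIDE = ("M-1", {4: 0x1, 5: 0x1, 6: 0x2, 7: 0x3, 8: 0x5, 9: 0x1})
SECONDARY_AMINE = ("M-3", {4: 0x1, 5: 0x5, 6: 0x2})
# Placeholder record: only characters 6 and 7 follow the text.
UREA_LIKE = ("M-1", {4: 0x5, 5: 0x1, 6: 0x2, 7: 0x3, 8: 0x5, 9: 0x1})

FILE = {
    "carboxamide": CARBOXAMIDE,
    "urea-like unit": UREA_LIKE,
    "secondary amine": SECONDARY_AMINE,
}


def found_within(query, stored) -> bool:
    """True when the stored word has the same modulant and queried characters."""
    query_modulant, query_chars = query
    stored_modulant, stored_chars = stored
    if query_modulant != stored_modulant:
        return False
    return all(stored_chars.get(pos) == val for pos, val in query_chars.items())


def search(query):
    return [name for name, word in FILE.items() if found_within(query, word)]


generic_carboxamido = search(("M-1", {6: 0x2, 7: 0x3}))
specific_carboxamide = search(CARBOXAMIDE)
generic_amine = search(("M-3", {5: 0x5}))
```

The fewer characters the question specifies, the broader the retrieval; specifying the full word narrows it to the single unit.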

These chain units have been generated from a small dictionary of elements, some actual and some synthetic (as “Z”). As more chain units are found, codes for them are generated according to the notation system.

The chain unit codes are terminated by signal S-3.

CHAIN

The S-4 signal groups the chain units. The interfix number of a chain connection to a ring is the sum of the interfix numbers of its chain unit connections to the ring. If a chain is intermediate to a number of rings, i.e., its terminal chain units are interfixed (ring to chain unit), the S-4 for the chain has the interfixes of all the terminal chain units (hetero interfixes). In searching, a connection of a chain or any part of a chain to any particular ring can be found with or without specificity as to the particular chain unit by which it is connected.
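One reading of this rule is that the S-4 word carries the union of the hetero interfix punches of its chain units. The sketch below assumes that reading, models interfixes as sets of numbers, and uses hypothetical unit labels and function names; the ring numbers 7 and 8 are those given for ring VI in the text.

```python
def chain_interfix(unit_interfixes):
    """Union of the hetero interfix numbers carried by a chain's units."""
    combined = set()
    for numbers in unit_interfixes.values():
        combined |= numbers
    return combined


def joined(ring_numbers, other_numbers) -> bool:
    """A ring is joined to a unit or chain when they share an interfix number."""
    return bool(ring_numbers & other_numbers)


# Hypothetical chain lying between two rings: only its terminal units
# carry ring-to-chain-unit (hetero) interfixes.
chain_units = {"unit_a": {8}, "unit_b": set(), "unit_c": {9}}
s4_interfix = chain_interfix(chain_units)

ring_vi = {7, 8}
to_chain = joined(ring_vi, s4_interfix)             # without unit specificity
via_unit_a = joined(ring_vi, chain_units["unit_a"])  # with unit specificity
via_unit_c = joined(ring_vi, chain_units["unit_c"])
```

Comparing against the S-4 set answers whether the ring is attached to the chain at all, while comparing against a single unit answers through which unit.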

Carbon word

The carbon word is coded to indicate the number of carbon atoms both specifically and generically. The specific code permits limitation in searching to a specific alkyl group. The generic code permits the finding of a 2 carbon chain within a 2+ carbon chain, a 3 carbon chain within a 3+ carbon chain, etc. A search can be made for alkyl groups, saturated or unsaturated, straight or branched, lower or higher, as desired.

Positions of substitution

The fact that the coding of a compound takes place in terms of substructures permits the use of a standard numbering system. Each ring, except for a homocyclic carbon ring, is assigned an arbitrary numbering system and the positions of substitution of chain units are indicated according to this number standard. The same standardization is done for the ring system. The position of substitution of a chain unit to a ring and to a ring system is coded in the chain unit word. The position of substitution of a ring to a ring is coded in the S-1 word of the ring code.

Example of coding

The coding of the compound of Fig. 1 is illustrated in Appendix B.

Search

The variability in scope of search in this system arises from the following features:

  1. The “building blocks” of the system are small units. These units are separately and independently described, which permits the asking of a large number and large variety of search questions for retrieval.

  2. The descriptors are variable in scope—one descriptor may merely indicate a 6-membered ring while another indicates a positional relationship of heterocyclic elements or substituted groups. Search questions may be expressed in terms of combinations of descriptors, which gives variability in scope of the question.

  3. Each collection of codes is gathered into a substructural entity. This permits search questions with respect to chemical compounds which vary in scope as desired with respect to selected portions of the molecule.

  4. The groupings and interfixes provide the ability to specify relationships among the substructures and to obtain as much specificity as desired with respect to the compound search.

  5. The machine used is the ILAS. Scanning of the cards is a continuous operation from card to card, as many cards being used as needed to encode the disclosure of any one document. Termination of the card or cards pertaining to the document is obtained by the signal S-6.

Experimental nature of system

The system is being tested. Thus far, it appears to be fully operative according to the principles involved. The coding procedure itself appears to be quite simple. The compounds are coded from the formulas by clerical personnel. The precoding device for the constants of the formulas (rings, ring systems, and chain units) has been found to be a very useful procedure from the point of view of speed, accuracy, simplicity of coding, and uniformity.

Future work

An effort is being made to solve the problem of the Markush disclosure. The Markush type of disclosure, where a formula is presented with a number of sets of alternatives, has been described (2, 4, 8). It is desired to code this type of disclosure so that in retrieval only one member of each set of alternatives may be selected in combination with only one member of any other set. To handle this problem, a signal is being used to provide a counting method for each group in “and” relationship, so that of a set of alternatives, no more than one is selected.

The polymer system

The principles described herein are deemed applicable to a more comprehensive problem such as exists in the disclosure of chemical compositions and processes. By an expansion of the principle of groupings, compositions can be provided for within a higher level of grouping enclosure, and processes can be provided for by a still further order of grouping. This principle is being used in a system now under development for encoding the disclosures in the polymer art which includes compounds, compositions, and processes. The method for describing the compounds in that art is not the same as described herein. However, the VS3 system is deemed compatible with this more comprehensive system, and it will be for the future to provide a determination as to whether or not the methods should be merged.

Acknowledgment

Acknowledgment is made of the assistance of Mr. H. P. Luhn of IBM in helping develop some of the concepts herein described, particularly with respect to the concept of fused face patterns. Appreciation is also expressed for the assistance of Mrs. R. W. Swanson of the examining staff and Mrs. A. Replogle, Mrs. J. M. Hale, Mrs. M. Bender, and the various other members of the clerical staff in this experiment.

BIBLIOGRAPHY

1. DON D.ANDREWS. Interrelated Logic Accumulating Scanner (ILAS). Patent Office Research and Development Report No. 6. Washington 25, D. C., Department of Commerce, 1957.

2. DON D.ANDREWS, JULIUS FROME, H.R.KOLLER, JACOB LEIBOWITZ, and H. PFEFFER. Recent Advances in Patent Office Searching, Steroid Compounds and ILAS. Patent Office Research and Development Report No. 8. Washington 25, D. C., Department of Commerce, 1957.

3. DON D.ANDREWS and SIMON M.NEWMAN. Storage and Retrieval of Contents of Technical Literature, Nonchemical Information, Preliminary Report. Patent Office Research and Development Report No. 1. Washington 25, D. C., Department of Commerce, 1956.

4. JULIUS FROME and JACOB LEIBOWITZ. A Punched Card System for Searching Steroid Compounds. Patent Office Research and Development Report No. 7. Washington 25, D. C., Department of Commerce, 1957.

5. B.E.LANHAM, J.LEIBOWITZ, and H.R.KOLLER. Advances in Mechanization of Patent Searching, Chemical Field. Patent Office Research and Development Report No. 2. Washington 25, D. C., Department of Commerce, 1956.

6. B.E.LANHAM, J.LEIBOWITZ, H.R.KOLLER, and H.PFEFFER. Organization of Chemical Disclosures for Mechanized Retrieval. Patent Office Research and Development Report No. 5. Washington 25, D. C., Department of Commerce, 1957.

7. SIMON M.NEWMAN. “Linguistics and Information Retrieval; Toward a solution of the Patent Office Problem.” Monograph Series in Linguistics and Language Studies, No. 10. Washington 25, D. C., Georgetown University Press, 1957.

8. SIMON M.NEWMAN. Linguistic Problems in Mechanization of Patent Searching. Patent Office Research and Development Report No. 9. Washington 25, D. C., Department of Commerce, 1957.

9. SIMON M.NEWMAN. Problems in Mechanizing the Search in Examining Patent Applications. Patent Office Research and Development Report No. 3. Washington 25, D. C., Department of Commerce, 1956.

10. SIMON M.NEWMAN. Storage and Retrieval of Contents of Technical Literature, Nonchemical Information, First Supplementary Report. Patent Office Research and Development Report No. 4. Washington 25, D. C., Department of Commerce, 1957.

11. M.F.BAILEY, B.E.LANHAM, and J.LEIBOWITZ. Mechanized searching in the U.S. Patent Office. Journal of the Patent Office Society, XXXV (1953), 566–587.

12. AUSTIN M.PATTERSON and LEONARD T.CAPELL. The Ring Index, Reinhold Publishing Corp., New York, 1940.

APPENDIX A Excerpts from VS3 index of codes

The codes contained in each word are introduced by a modulant. Twelve modulants are presently defined as follows:

Modulants

M-1 to M-4 for chain units

M-5 and M-6 for rings

M-7 for ring systems

M-8 for compounds

M-9, M-A, M-B for metals

M-C for carbon chain units

Signals are employed to group the words pertaining to rings, ring systems, functional groups, and chains as follows:

Signals

S-1 pertaining to individual rings

S-2 pertaining to the ring system

S-3 pertaining to the chain unit

S-4 pertaining to the complete chain

S-5 pertaining to the complete compound

S-6 pertaining to the complete patent

Interfixes are employed to define the bond relationships. Twelve interfixes are provided, numbers 1–6 for homo relationships (ring to ring or chain unit to chain unit) and numbers 7–12 for hetero relationships (ring to chain).
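
The three kinds of marker listed above can be gathered into lookup tables. The following is a minimal sketch in Python; the dictionary names and the helper `interfix_kind` are hypothetical conveniences, and only the assignments themselves come from the lists above.

```python
# Modulant -> what the codes in the word describe (as listed above).
MODULANTS = {
    "M-1": "chain units", "M-2": "chain units",
    "M-3": "chain units", "M-4": "chain units",
    "M-5": "rings", "M-6": "rings",
    "M-7": "ring systems",
    "M-8": "compounds",
    "M-9": "metals", "M-A": "metals", "M-B": "metals",
    "M-C": "carbon chain units",
}

# Signal -> the level of grouping it closes (as listed above).
SIGNALS = {
    "S-1": "individual rings",
    "S-2": "the ring system",
    "S-3": "the chain unit",
    "S-4": "the complete chain",
    "S-5": "the complete compound",
    "S-6": "the complete patent",
}

def interfix_kind(number: int) -> str:
    """Classify an interfix number: 1-6 are homo relationships
    (ring to ring or chain unit to chain unit), 7-12 are hetero
    relationships (ring to chain)."""
    if 1 <= number <= 6:
        return "homo"
    if 7 <= number <= 12:
        return "hetero"
    raise ValueError("only interfixes 1 through 12 are provided")

print(len(MODULANTS))      # twelve modulants are defined
print(interfix_kind(3))    # homo
print(interfix_kind(9))    # hetero
```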

MODULANT M-1 Chain unit A B X(=Y) C D

| Chemical formula | Name | Structural formula | A | B | X | =Y | C | D | |
|---|---|---|---|---|---|---|---|---|---|
| COF | Carboxylfluoride | C(=O)F | Z 4–1 | Z 5–1 | C 6–2 | =O 7–3 | F 8-B | Hal 9-D | |
| CON | Carboxylamide | C(=O)N | Z 4–1 | Z 5–1 | C 6–2 | =O 7–3 | N 8–5 | Z 9–1 | 10–8 |
| CON2 | Isourea | OC(=N)N | Z 4–1 | O 5–3 | C 6–2 | =N 7–5 | N 8–5 | Z 9–1 | 10–8 |
| CON2 | Urea | NC(=O)N | Z 4–1 | N 5–5 | C 6–2 | =O 7–3 | N 8–5 | Z 9–1 | 10–8 |
| CON3 | Semicarbazide | NC(=O)NN | Z 4–1 | N 5–5 | C 6–2 | =O 7–3 | N 8–5 | N 9–5 | 10–8 |
| CO2N | Urethane | OC(=O)N | Z 4–1 | O 5–3 | C 6–2 | =O 7–3 | N 8–5 | Z 9–1 | 10–9 |
| SO2I | Sulfonyliodide | | Z 4–1 | Z 5–1 | S 6–4 | | I 8–9 | Hal 9-D | |
| SO2N | Sulfonamide | | Z 4–1 | Z 5–1 | S 6–4 | | N 8–5 | Z 9–1 | 10–8 |
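
The M-1 codes lend themselves to searches of varying scope: the fewer codes a query requires, the more chain units it retrieves. The following Python sketch illustrates this with the eight entries tabulated above. It assumes that a search is a simple "all required codes present" test, which the excerpt does not state; the names `M1_CODES` and `search` are hypothetical, and the codes are written with plain hyphens.

```python
# Codes for each M-1 chain unit, transcribed from the table above.
M1_CODES = {
    "Carboxylfluoride": {"4-1", "5-1", "6-2", "7-3", "8-B", "9-D"},
    "Carboxylamide":    {"4-1", "5-1", "6-2", "7-3", "8-5", "9-1", "10-8"},
    "Isourea":          {"4-1", "5-3", "6-2", "7-5", "8-5", "9-1", "10-8"},
    "Urea":             {"4-1", "5-5", "6-2", "7-3", "8-5", "9-1", "10-8"},
    "Semicarbazide":    {"4-1", "5-5", "6-2", "7-3", "8-5", "9-5", "10-8"},
    "Urethane":         {"4-1", "5-3", "6-2", "7-3", "8-5", "9-1", "10-9"},
    "Sulfonyliodide":   {"4-1", "5-1", "6-4", "8-9", "9-D"},
    "Sulfonamide":      {"4-1", "5-1", "6-4", "8-5", "9-1", "10-8"},
}

def search(required):
    """Return the names of all chain units carrying every required code
    (assumed matching rule: subset test)."""
    return sorted(name for name, codes in M1_CODES.items() if required <= codes)

# Broad scope: any unit with carbon at X (6-2) doubly bonded to oxygen (7-3).
print(search({"6-2", "7-3"}))
# Narrower: also require nitrogen at B (5-5).
print(search({"6-2", "7-3", "5-5"}))
# Narrowest: also require nitrogen at D (9-5).
print(search({"6-2", "7-3", "5-5", "9-5"}))
```

Urea and isourea share the chemical formula CON2 but differ in their B and =Y codes, so the second query retrieves urea without isourea.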

MODULANT M-2 Chain unit A X (=Y) B

| Chemical formula | Name | Structural formula | A | X | =Y | B |
|---|---|---|---|---|---|---|
| CO | Ketene | =C(=O) | =4-E | C 5–2 | =O 6–3 | Z 7–1 |
| CO | Ketone | C(=O) | Z 4–1 | C 5–2 | =O 6–3 | C 7–2 |
| CO | Quinone | C=O | C 4–2 | C 5–2 | =O 6–3 | C 7–2 |
| CS | Thioketene | =C(=S) | =4-E | C 5–2 | =S 6–4 | Z 7–1 |
| CS | Thioketone | C(=S) | Z 4–1 | C 5–2 | =S 6–4 | C 7–2 |

MODULANT M-3 Chain unit A X Y

| Chemical formula | Name | Structural formula | A | X | Y | |
|---|---|---|---|---|---|---|
| NH | Secondary amine | | Z 4–1 | N 5–5 | C 6–2 | |
| N | Tertiary amine | | C 4–2 | N 5–5 | C 6–2 | |
| O | Ether | -O- | Z 4–1 | O 5–3 | Z 6–1 | 10–4 |
| S | Thioether | -S- | Z 4–1 | S 5–4 | Z 6–1 | 10–4 |
| N | Quaternary amine | | Z 4–1 | N 5–5 | Z 6–1 | |
| N | Imine | =N | =4-E | N 5–5 | Z 6–1 | |
| NH2 | Primary amine | H-N-H | Z 4–1 | N 5–5 | H 6–6 | |
| N2 | Hydrazine | | Z 4–1 | N 5–5 | N 6–5 | |
| Br | Bromine | -Br | Hal 4-D | Br 5-C | Z 6–1 | |
| Cl | Chlorine | -Cl | Hal 4-D | Cl 5-A | Z 6–1 | |
| F | Fluorine | -F | Hal 4-D | F 5-B | Z 6–1 | |
| I | Iodine | -I | Hal 4-D | I 5–9 | Z 6–1 | |

MODULANT M-4 Chain unit

| Chemical formula | Name | Structural formula | Notation code |
|---|---|---|---|
| CN | Nitrile | −C≡N | 4-B |
| CN | Isonitrile | −N=C | 4–6 |
| CN2 | Carbodiimide | N=C=N | 4–8 |
| CNO | Cyanate | O−C≡N | 4–3 |
| CNO | Isocyanate | N=C=O | 4–4 |

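MODULANT M-C Carbon chain

| No. carbons | No. carbons specifically (columns 4, 5) | No. carbons generically (columns 8, 9) | CU Posn. on ring | Code | CU Posn. on R.S. | Code |
|---|---|---|---|---|---|---|
| 1 | 5–1 | 9–1 | 1 | 11–1 | 1 | 13–1 |
| 2 | 5–2 | 9–1 | 2 | 11–2 | 2 | 13–2 |
| 3 | 5–3 | 9–1 | 3 | 11–3 | 3 | 13–3 |
| 4 | 5–4 | 9–3 | 4 | 11–4 | 4 | 13–4 |
| 5 | 5–5 | 9–3 | 5 | 11–5 | 5 | 13–5 |
| 6 | 5–6 | 9–7 | 6 | 11–6 | 6 | 13–6 |
| 7 | 5–7 | 9–7 | 7 | 11–7 | 7 | 13–7 |
| 8 | 5–8 | 9-F | 8 | 11–8 | 8 | 13–8 |
| 9 | 5–9 | 9-F | 9 | 11–9 | 9 | 13–9 |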
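
Each carbon count in the M-C schedule carries two codes: a specific one that names the exact count and a generic one shared by a range of counts. A minimal Python sketch of that pairing follows, covering only the nine counts tabulated in this excerpt; the function name `carbon_chain_codes` is hypothetical, and the codes are written with plain hyphens.

```python
def carbon_chain_codes(n: int) -> tuple[str, str]:
    """Return (specific code, generic code) for a chain of n carbons,
    following the rows of the M-C schedule for n = 1 through 9."""
    if not 1 <= n <= 9:
        raise ValueError("only 1 to 9 carbons are tabulated in the excerpt")
    specific = f"5-{n}"
    # Generic brackets as tabulated: 1-3, 4-5, 6-7, 8-9.
    if n <= 3:
        generic = "9-1"
    elif n <= 5:
        generic = "9-3"
    elif n <= 7:
        generic = "9-7"
    else:
        generic = "9-F"
    return specific, generic

for n in (2, 4, 7, 9):
    print(n, carbon_chain_codes(n))
```

A search on the specific code retrieves one chain length; a search on the generic code retrieves the whole bracket.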

MODULANT M-5 Ring schedule

| No. | N atoms | O atoms | S atoms | Misc. hetero atoms | Type of ring | Code |
|---|---|---|---|---|---|---|
| 1 | 4–1 | 5–1 | 6–1 | 8–1 | Unsaturated | 7–1 |
| 2 | 4–3 | 5–3 | 6–3 | 8–3 | Saturated | 7–2 |
| 3 | 4–7 | 5–7 | 6–7 | 8–7 | Heterocyclic | 7–4 |
| 4,+ | 4-F | 5-F | 6-F | 8-F | Homocyclic | 7–8 |

| Combinations of atoms | Ortho | Meta | Para |
|---|---|---|---|
| N-N | 9–1 | 11–1 | 13–1 |
| N-O | 9–2 | 11–2 | 13–2 |
| N-S | 9–4 | 11–4 | 13–4 |
| O-O | 9–8 | 11–8 | 13–8 |
| O-S | 10–1 | 12–1 | 14–1 |
| S-S | 10–2 | 12–2 | 14–2 |
| Misc. (a)-Misc. (a) [a] | 10–4 | 12–4 | 14–4 |
| Misc. (a)-Misc. (b) [a] | 10–8 | 12–8 | 14–8 |

[a] Miscellaneous combinations include all combinations not specifically provided for. Misc. (a)-Misc. (a) refers to combinations of the same elements. Misc. (a)-Misc. (b) refers to combinations of different elements.
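
The row values in the M-5 schedule have a regular binary structure. The count codes 1, 3, 7, and F are the hexadecimal digits whose bit patterns are 0001, 0011, 0111, and 1111, while the ring-type and atom-combination codes 1, 2, 4, and 8 each have a single bit set. The Python sketch below works through that arithmetic. Two points in it are assumptions that the excerpt does not state: that an "at least n" question can be answered by testing one bit of a count code, and that single-bit codes in the same column may be superimposed. All function and variable names are hypothetical.

```python
def count_code(n: int) -> int:
    """Value for a count of 1, 2, 3, or 4-and-over: 1, 3, 7, 15 (hex F)."""
    if n < 1:
        raise ValueError("no code is tabulated for a count of zero")
    return (1 << min(n, 4)) - 1

def records_at_least(code: int, n: int) -> bool:
    """Assumed reading: bit n-1 is set exactly when the coded count is >= n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return bool(code & (1 << (min(n, 4) - 1)))

# Single-bit ring-type values from column 7 of the schedule.
RING_TYPE = {"unsaturated": 0x1, "saturated": 0x2,
             "heterocyclic": 0x4, "homocyclic": 0x8}

def punch(column: int, value: int) -> str:
    """Format a column number and a row value as 'column-HEXDIGIT'."""
    return f"{column}-{value:X}"

print(punch(4, count_code(2)))                  # two N atoms
print(punch(6, count_code(5)))                  # four or more S atoms
print(records_at_least(count_code(3), 2))       # 3 atoms satisfies "at least 2"
print(records_at_least(count_code(1), 2))       # 1 atom does not
# Assumed superimposition of two single-bit values in one column:
print(punch(7, RING_TYPE["unsaturated"] | RING_TYPE["heterocyclic"]))
```

Under these assumptions a single coded digit answers both exact and "n or more" questions, which is one way such a schedule could support searches of variable scope.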

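MODULANT M-6 Ring schedule

| No. elements in ring | Code | No. carbons in ring | Code |
|---|---|---|---|
| 1 | | 1 | |
| 2 | | 2 | |
| 3 | 12–3 | 3 | 13–3 |
| 4 | 12–4 | 4 | 13–4 |
| 5 | 12–5 | 5 | 13–5 |
| 6 | 12–6 | 6 | 13–6 |
| 7 | 12–7 | 7 | 13–7 |
| 8 | 12–8 | 8 | 13–8 |
| 9 | 12–9 | 9 | 13–9 |
| 10 | 12-A | 10 | 13-A |
| 11 | 12-B | 11 | 13-B |
| 12 | 12-C | 12 | 13-C |
| 13 | 12-D | 13 | 13-D |
| 14 | 12-E | 14 | 13-E |
| 15,+ | 12-F | 15,+ | 13-F |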
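
The M-6 codes follow a simple pattern: the row value is the count written as a single hexadecimal digit, with F standing for 15 and over, and no code is tabulated for counts of 1 or 2. A minimal Python sketch of that pattern follows; the helper name `ring_size_code` is hypothetical, and the codes are written with plain hyphens.

```python
def ring_size_code(column: int, count: int) -> str:
    """Return the M-6 code for a count in the given column
    (12 = elements in ring, 13 = carbons in ring)."""
    if column not in (12, 13):
        raise ValueError("M-6 uses columns 12 and 13 only")
    if count < 3:
        raise ValueError("no code is tabulated for counts below 3")
    # Counts of 15 and over all share the digit F.
    return f"{column}-{min(count, 15):X}"

# A six-membered ring with six carbons (a benzene ring).
print(ring_size_code(12, 6), ring_size_code(13, 6))
# A six-membered ring with five carbons (a pyridine ring).
print(ring_size_code(12, 6), ring_size_code(13, 5))
# Large rings saturate at F.
print(ring_size_code(12, 18))
```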

MODULANT M-7 Ring system schedule

| No. | Rings | N rings | O rings | S rings | Benzene rings | Miscellaneous hetero rings |
|---|---|---|---|---|---|---|
| 1 | 7–1 | 8–1 | 9–1 | 10–1 | 11–1 | 12–1 |
| 2 | 7–2 | 8–2 | 9–2 | 10–2 | 11–2 | 12–2 |
| 3 | 7–3 | 8–3 | 9–3 | 10–3 | 11–3 | 12–3 |
| 4 | 7–4 | 8–4 | 9–4 | 10–4 | 11–4 | 12–4 |
| 5 | 7–5 | 8–5 | 9–5 | 10–5 | 11–5 | 12–5 |

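MODULANT M-8 Compound schedule

| No. | Ring systems | Saturated rings | Unsatd. rings | 7 and 7+ rings | N rings | O rings | S rings | Benzene rings |
|---|---|---|---|---|---|---|---|---|
| 1 | 4–1 | 5–1 | 6–1 | 7–1 | 8–1 | 9–1 | 10–1 | 11–1 |
| 2 | 4–2 | 5–2 | 6–2 | 7–2 | 8–2 | 9–2 | 10–2 | 11–2 |
| 3 | 4–3 | 5–3 | 6–3 | 7–3 | 8–3 | 9–3 | 10–3 | 11–3 |
| 4 | 4–4 | 5–4 | 6–4 | 7–4 | 8–4 | 9–4 | 10–4 | 11–4 |
| 5 | 4–5 | 5–5 | 6–5 | 7–5 | 8–5 | 9–5 | 10–5 | 11–5 |

etc.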

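MODULANTS M-9, M-A, M-B Metals schedule

| Symbol | Metal | M-9 | M-A | M-B | |
|---|---|---|---|---|---|
| Ca | Calcium | 8–1 | | 5–1 | |
| Cr | Chromium | | 4–1 | 6–1 | 10–1 |
| Co | Cobalt | | 6–1 | 6–1 | 12–1 |
| Cu | Copper | 13–1 | | 6–1 | 13–1 |
| Au | Gold | 13–1 | | 6–1 | 13–1 |
| Fe | Iron | 10–1 | | 6–1 | 12–1 |
| Pb | Lead | | 11–1 | 6–1 | 15–1 |
| Li | Lithium | 11–1 | | 4–1 | |

| | M-B |
|---|---|
| Group IA | 4–1 |
| Group IIA | 5–1 |
| Heavy metal | 6–1 |
| Group IIIB | 7–1 |
| Group IVB | 8–1 |
| Group VA | 9–1 |
| Group VIB | 10–1 |
| Group VIIB | 11–1 |

etc.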

APPENDIX B

FIGURE B-1. Preparatory skeletal formula.

FIGURE B-2

FIGURE B-3

FIGURE B-4. Code meanings for phenothiazine.

FIGURE B-5. Outline of the punched card used in VS3. The twelve codes reading from row 9 up correspond respectively to the first twelve codes shown in Fig. B-3.


The launch of Sputnik caused a flurry of governmental activity in science information. The 1958 International Conference on Scientific Information (ICSI) was held in Washington from Nov. 16 to 21, 1958, and was sponsored by NSF, NAS, and the American Documentation Institute, the predecessor to the American Society for Information Science. In 1959, 20,000 copies of the two-volume proceedings were published by NAS; they included 75 papers (1600 pages) by dozens of pioneers in seven areas:

  • Literature and reference needs of scientists
  • Function and effectiveness of A & I services
  • Effectiveness of Monographs, Compendia, and Specialized Centers
  • Organization of information for storage and search: comparative characteristics of existing systems
  • Organization of information for storage and retrospective search: intellectual problems and equipment considerations
  • Organization of information for storage and retrospective search: possibility for a general theory
  • Responsibilities of Government, Societies, Universities, and industry for improved information services and research.

It is now an out-of-print classic in the field of science information studies.
