percentage of correct predictions on a residue-by-residue basis) approached 60 percent.
Subsequent work has developed more sophisticated variations on the theme. In 1978, Robson and co-workers introduced an information-theoretic formalism that supplements the conformational preferences of isolated residues with preferences based on pairwise interactions (Garnier et al., 1978). The method was easy to implement as a computer algorithm and achieved approximately 64 percent accuracy. More recently, various authors (Qian and Sejnowski, 1988; Holley and Karplus, 1989) have employed neural networks, which belong to a general class of machine learning algorithms that can efficiently "learn" an optimal translation of one data string (for example, a protein sequence) into another (for example, the sequential secondary structure assignments).
The network is a group of input nodes connected to a group of output nodes, with an optional hidden layer or layers of nodes in between (see Figure 9.8). A matrix of weights maps the input information onto the nodes along the path to the output layer. As with neurons in the nervous system, a cooperative nonlinear "firing" potential decides whether enough information has accumulated to switch on an output node (see Figure 9.9). For secondary structure prediction, this "all or none" output node predicts an α-helix when the accumulated helical propensity of the residue of interest and its neighbors crosses the threshold. A window specifies the number of neighboring residues that can contribute to the conformational state of the residue of interest. The weights for the connections that relate input nodes to output nodes are learned by example: case after case of input amino acid sequence and output secondary structure is presented to the network, and a least squares algorithm defines an optimal set of weights for the encoding of the data set using a back propagation strategy. A more complete description of neural networks can be found in a chapter of the book Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1 (Rumelhart et al., 1986).
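The windowed network described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the configuration published by Qian and Sejnowski or by Holley and Karplus: the 21 input units per window position (20 amino acids plus a spacer for positions beyond the chain ends), the hidden layer size, the learning rate, the three-state labelling, and all names (`encode_window`, `WindowNet`, `train`) are hypothetical choices made here. The sequence and labels are a made-up toy pair used only to exercise the mechanics; they are not real structural data.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STATES = "HEC"  # assumed labels: helix, strand, coil


def encode_window(seq, center, half_width):
    """One-hot encode the residues in a window around `center`.

    Each position contributes 21 inputs: 20 amino acids plus one
    spacer unit that is set when the position is off the chain.
    """
    vec = []
    for pos in range(center - half_width, center + half_width + 1):
        units = [0.0] * (len(AMINO_ACIDS) + 1)
        if 0 <= pos < len(seq):
            units[AMINO_ACIDS.index(seq[pos])] = 1.0
        else:
            units[-1] = 1.0
        vec.extend(units)
    return vec


def sigmoid(x):
    """Smooth nonlinear 'firing' function of the summed input."""
    return 1.0 / (1.0 + math.exp(-x))


class WindowNet:
    """Input nodes -> one hidden layer -> one output node per state."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # The extra column in each weight matrix is a bias weight.
        self.w1 = [[rng.uniform(-0.3, 0.3) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.3, 0.3) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        xb = x + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        o = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return h, o

    def train_step(self, x, target, lr=0.5):
        """One gradient step on the squared error for one example."""
        h, o = self.forward(x)
        xb, hb = x + [1.0], h + [1.0]
        # Output deltas: derivative of 0.5 * (o - t)^2 through the sigmoid.
        d_out = [(ok - tk) * ok * (1.0 - ok) for ok, tk in zip(o, target)]
        # Hidden deltas: output errors propagated back through w2.
        d_hid = [hj * (1.0 - hj) *
                 sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j, hj in enumerate(h)]
        for k, row in enumerate(self.w2):
            for j in range(len(row)):
                row[j] -= lr * d_out[k] * hb[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] -= lr * d_hid[j] * xb[i]
        return sum((ok - tk) ** 2 for ok, tk in zip(o, target))

    def predict(self, x):
        """Return the state whose output node is most strongly 'on'."""
        _, o = self.forward(x)
        return STATES[o.index(max(o))]


def train(net, seq, labels, half_width, epochs=100):
    """Present every residue repeatedly; return the error per epoch."""
    errors = []
    for _ in range(epochs):
        total = 0.0
        for i, state in enumerate(labels):
            target = [1.0 if s == state else 0.0 for s in STATES]
            total += net.train_step(encode_window(seq, i, half_width), target)
        errors.append(total)
    return errors


# Made-up toy data (NOT real protein data), for illustration only.
toy_seq = "AEAEAEGPGPGPVIVIVI"
toy_labels = "HHHHHHCCCCCCEEEEEE"
half_width = 2  # window of 5 residues
net = WindowNet(n_in=(2 * half_width + 1) * 21, n_hidden=4, n_out=3)
errors = train(net, toy_seq, toy_labels, half_width)
```

After training, `net.predict(encode_window(toy_seq, i, half_width))` returns the predicted state for residue `i`, and `errors` records the summed squared error for each pass through the data. Picking the largest output node is one simple way to turn the continuous outputs into an "all or none" assignment; a fixed threshold on a single helix node, as the text describes, would be an equally valid choice.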
Neural networks easily achieve an accuracy of 64 percent, a figure comparable to that for other methods. It is useful to explore the connection weights derived by the network that relate amino acids to their secondary structure preferences. Figure 9.10 is a Hinton diagram of these weights (the magnitude of the weight is proportional to the area of the square; positive weights are in white, and negative weights are in black). Alanine