4
Category 1—Blocking and Limiting the Impact of Compromise

The goal of requirements in Category 1 of the committee’s illustrative research agenda is that of ensuring that the impact of compromises in accountability or system security is limited. This broad category—blocking and limiting the impact of compromise—includes secure information systems and networks that resist technical compromise; technological and organizational approaches that reveal attempts to compromise information technology (IT) components, systems, or networks; containment of breaches; backup and recovery; convenient and ubiquitous encryption that can prevent unauthorized parties from obtaining sensitive or confidential data; system lockdowns under attack; and so on.

A basic principle underlying Category 1 is that of defense in depth. A great deal of experience in dealing with cybersecurity issues suggests that no individual defensive measure is impossible to circumvent. Thus, it makes sense to consider defense in depth, which places in the way of a cyberattacker a set of varied hurdles, all of which must be penetrated or circumvented if the cyberattacker is to achieve its goal. When different hurdles are involved, an attacker must have access to a wider range of expertise to achieve its goal and also must have the increased time and resources needed to penetrate all of the defenses.
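The idea can be made concrete with a small sketch. The layers, names, and policies below are invented for illustration only and are not drawn from the report; the point is simply that a request must survive every hurdle, so an attacker who defeats one check has gained nothing unless the others fall as well.

```python
# Illustrative defense in depth: a request is admitted only if every
# independent layer accepts it. Each layer is a different kind of hurdle,
# so defeating one does not defeat the others.

def network_filter(request):
    # Hurdle 1: only traffic from an allowed network range.
    return request["source_ip"].startswith("10.0.")

def authenticate(request):
    # Hurdle 2: the caller must present a valid credential.
    return request.get("token") in {"alice-token", "bob-token"}

def authorize(request):
    # Hurdle 3: the authenticated caller must hold the needed permission.
    permissions = {"alice-token": {"read"}, "bob-token": {"read", "write"}}
    return request["action"] in permissions.get(request.get("token"), set())

def validate_input(request):
    # Hurdle 4: the payload must be well formed before it is processed.
    return isinstance(request.get("payload"), str) and len(request["payload"]) < 1024

LAYERS = [network_filter, authenticate, authorize, validate_input]

def handle(request):
    for layer in LAYERS:
        if not layer(request):
            return "rejected by " + layer.__name__
    return "accepted"

if __name__ == "__main__":
    print(handle({"source_ip": "10.0.0.7", "token": "alice-token",
                  "action": "read", "payload": "hello"}))    # accepted
    print(handle({"source_ip": "10.0.0.7", "token": "alice-token",
                  "action": "write", "payload": "hello"}))   # rejected by authorize
```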

4.1 SECURE DESIGN, DEVELOPMENT, AND TESTING

The principle that security must be a core attribute of system design, development, and testing simply reflects the point that it is more effective
to reduce vulnerabilities by not embedding them in a system than to fix the problems that these vulnerabilities cause as they appear in operation.[1] Vulnerabilities can result from design, as when system architects embed security flaws in the structure of a system. Vulnerabilities also result from flaws in development—good designs can be compromised because they are poorly implemented. Testing for security flaws is necessary because designers and implementers inevitably make mistakes or because they have been compromised and have deliberately introduced such flaws.

[1] For example, Soo Hoo et al. determined empirically that fixing security defects after deployment cost almost seven times as much as fixing them before deployment. Furthermore, security investments made in the design stage are 40 percent more cost-effective than similar investments in the development stage. See K. Soo Hoo, A. Sudbury, and A. Jaquith, “Tangible ROI Through Secure Software Engineering,” Secure Business Quarterly, Quarter 4, 2001.

4.1.1 Research to Support Design

4.1.1.1 Principles of Sound and Secure Design

In the past 40+ years, a substantial amount of effort has been expended in the (relatively small) security community to articulate principles of sound design and to meet the goal of systems that are “secure by design.” On the basis of examinations of a variety of systems, researchers have found that the use of these principles by systems designers and architects correlates highly with the security and reliability of a system. Box 4.1 summarizes the classic Saltzer-Schroeder principles, first published in 1975, that have been widely embraced by cybersecurity researchers. Systems not built in accord with such principles will almost certainly exhibit inherent vulnerabilities that are difficult or impossible to address.

These principles, although well known in the research community and available in the public literature, have not been widely adopted in the mainstream computer hardware and software design and development community. There have been efforts to develop systems following these principles, but observable long-term progress relating specifically to the multitude of requirements for security is limited. For example, research in programming languages has resulted in advances that can obviate whole classes of errors—buffer overflows, race conditions, off-by-one errors, format string attacks, mismatched types, divide-by-zero crashes, and unchecked procedure-call arguments. But these advances, important though they are, have not been adopted on a sufficient scale to make these kinds of errors uncommon.

Nonetheless, the principles remain valid—so why have they had so little impact in the design and development process?

In the committee’s view, three primary reasons account for the lack of such impact: the mismatch between these principles and real-world software development environments, short-term expenses associated with serious adherence to these principles, and potential conflicts with performance.

4.1.1.1.1 The Mismatch with Current Development Methodologies

One reason for the lack of impact is the deep mismatch between the principles of system design in Box 4.1 and real-world software development environments. Even a cursory examination of the principles discussed in Box 4.1 suggests that their serious application is predicated on a thorough and deep understanding of what the software designers and architects are trying to do. To apply these principles, software designers and architects have to know very well and in some considerable detail just what the ultimate artifact is supposed to do.

The software development model most relevant to this state of affairs is often called the waterfall model, explicated in considerable detail by Boehm.[2] This model presumes a linear development process that proceeds from requirements specification, to design, to implementation/coding, to integration, to testing/debugging, to installation, to maintenance, although modified versions of the model acknowledge some role for feedback between each of these stages and preceding ones.

[2] Barry Boehm, Software Engineering Economics, Prentice-Hall, Englewood Cliffs, N.J., 1981.

But despite its common use in many software development projects (especially large ones), the waterfall model is widely viewed as inadequate for real-world software development. The reason is that many—perhaps even most—software artifacts grow organically. The practical reality is that large software systems emerge from incremental additions to small software systems in ways entirely unanticipated by the designers of the original system. If the original system is successful, users will almost certainly want to add new functionality. The new functionality desired is by definition unanticipated—if the designers had known that it would be useful, they would have included it in the first place. Indeed, it is essentially impossible in practice for even the most operationally experienced IT applications developers to be able to anticipate in detail and in advance all of a system’s requirements and specifications. (Sometimes users change their minds about the features they want, or even worse, want contradictory features! And, of course, it is difficult indeed to anticipate all potential uses.)

Thus, system requirements and specifications are always inherently incomplete, even though they underlie and drive the relationships among various modules and their interfaces, inputs, state transitions, internal state information, outputs, and exception conditions. Put differently, the paradox is that successful principled development requires a nontrivial understanding of the entire system in its ultimate form before the system can be successfully developed. Systems designers need experience to understand the implications of their design choices. But experience can be gained only by making mistakes and learning from them.

BOX 4.1 The Saltzer-Schroeder Principles of Secure System Design and Development

Saltzer and Schroeder articulate eight design principles that can guide system design and contribute to an implementation without security flaws:

Economy of mechanism: The design should be kept as simple and small as possible. Design and implementation errors that result in unwanted access paths will not be noticed during normal use (since normal use usually does not include attempts to exercise improper access paths). As a result, techniques such as line-by-line inspection of software and physical examination of hardware that implements protection mechanisms are necessary. For such techniques to be successful, a small and simple design is essential.

Fail-safe defaults: Access decisions should be based on permission rather than exclusion. The default situation is lack of access, and the protection scheme identifies conditions under which access is permitted. The alternative, in which mechanisms attempt to identify conditions under which access should be refused, presents the wrong psychological base for secure system design. This principle applies both to the outward appearance of the protection mechanism and to its underlying implementation.

Complete mediation: Every access to every object must be checked for authority. This principle, when systematically applied, is the primary underpinning of the protection system. It forces a system-wide view of access control, which, in addition to normal operation, includes initialization, recovery, shutdown, and maintenance. It implies that a foolproof method of identifying the source of every request must be devised. It also requires that proposals to gain performance by remembering the result of an authority check be examined skeptically. If a change in authority occurs, such remembered results must be systematically updated.

Open design: The design should not be secret. The mechanisms should not depend on the ignorance of potential attackers, but rather on the possession of specific, more easily protected, keys or passwords. This decoupling of protection mechanisms from protection keys permits the mechanisms to be examined by many reviewers without concern that the review may itself compromise the safeguards.

In addition, any skeptical users may be allowed to convince themselves that the system they are about to use is adequate for their individual purposes. Finally, it is simply not realistic to attempt to maintain secrecy for any system that receives wide distribution.

Separation of privilege: Where feasible, a protection mechanism that requires two keys to unlock it is more robust and flexible than one that allows access to the presenter of only a single key. The reason for this greater robustness and flexibility is that, once the mechanism is locked, the two keys can be physically separated and distinct programs, organizations, or individuals can be made responsible for them. From then on, no single accident, deception, or breach of trust is sufficient to compromise the protected information.

Least privilege: Every program and every user of the system should operate using the least set of privileges necessary to complete the job. This principle reduces the number of potential interactions among privileged programs to the minimum for correct operation, so that unintentional, unwanted, or improper uses of privilege are less likely to occur. Thus, if a question arises related to the possible misuse of a privilege, the number of programs that must be audited is minimized.

Least common mechanism: The amount of mechanism common to more than one user and depended on by all users should be minimized. Every shared mechanism (especially one involving shared variables) represents a potential information path between users and must be designed with great care to ensure that it does not unintentionally compromise security. Further, any mechanism serving all users must be certified to the satisfaction of every user, a job presumably harder than satisfying only one or a few users.

Psychological acceptability: It is essential that the human interface be designed for ease of use, so that users routinely and automatically apply the protection mechanisms correctly. More generally, the use of protection mechanisms should not impose burdens on users that might lead users to avoid or circumvent them—when possible, the use of such mechanisms should confer a benefit that makes users want to use them. Thus, if the protection mechanisms make the system slower or cause the user to do more work—even if that extra work is “easy”—they are arguably flawed.

SOURCE: Adapted from J.H. Saltzer and M.D. Schroeder, “The Protection of Information in Computer Systems,” Proceedings of the IEEE 63(9), 1975.

For these reasons, software development methodologies such as incremental development, spiral development, and rapid prototyping have been created that presume an iterative approach to building systems based on extensive prototyping and strong user feedback. Doing so increases the chances that what is ultimately delivered to the end users meets their needs, but entails a great deal of instability in “the requirements.”

Moreover, when such “design for evolvability” methodologies are used with modularity, encapsulation, abstraction, and well-defined interfaces, development and implementation even in the face of uncertain requirements are much easier to undertake. The intellectual challenge—and thus the research question—is how to fold security principles into these kinds of software development processes.

4.1.1.1.2 The Short-Term Expense

A second reason that adherence to the principles listed in Box 4.1 is relatively rare is that such adherence is—in the short term—almost always more expensive than ignoring the principles. If only short-term costs and effort are taken into account, it is—today—significantly more expensive and time-consuming to integrate security from the beginning of a system’s life cycle, compared with doing nothing about security and giving in to the pressures of short-timeline deliverables. This reality arises from a real-world environment in which software developers often experience false starts, and there is a substantial amount of “playing around” that helps to educate and orient developers to the task at hand. In such an environment, when many artifacts are thrown away, it makes very little sense to invest up front in that kind of adherence unless such adherence is relatively inexpensive. The problem is further compounded by the fact that the transition from the “playing around” environment to the “serious development” environment (when it makes more sense to adhere to these principles) is often unclear.

An example is the design of interfaces between components. Highly constrained interfaces increase the stability of a system incorporating such components. At the same time, that kind of constraining effort is inevitably more expensive than the effort involved when an interface is only lightly policed. In this context, a constrained interface is one in which calling sequences and protocols are guaranteed to be valid, meaningful, and appropriate. Guarantees must be provided that malformed sequences and protocols will be excluded. Providing such guarantees requires resources and programming that are unnecessary if the sequences and protocols are simply assumed to be valid.

A second example arises from cooperative development arrangements. In practice, system components are often developed by different parties. With different parties involved, especially in different organizations, communications difficulties are inevitable, and they often include incompatibilities among interface assumptions, the existence of proprietary internal and external interfaces, and performance degradations resulting from the inability to optimize across components. This point suggests the need for well-defined and carefully analyzed specifications for the constituent components, but it is obviously easier and less expensive to simply assume that specifications are unambiguous.
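To make the first example concrete, the sketch below shows one way a constrained interface might police calling sequences and argument formats. The component, its protocol, and the specific checks are hypothetical, invented for illustration; a lightly policed interface would simply omit the checks and trust its callers.

```python
# A hypothetical file-transfer component with a constrained interface:
# calls are rejected unless they arrive in a valid order (open -> send* -> close)
# and with well-formed arguments. A "lightly policed" version would simply
# trust the caller and omit these checks.

class ProtocolError(Exception):
    pass

class ConstrainedChannel:
    # Legal state transitions for the calling protocol.
    _TRANSITIONS = {
        "new":    {"open"},
        "open":   {"send", "close"},
        "closed": set(),
    }

    def __init__(self):
        self._state = "new"

    def _require(self, call):
        # Every call is checked against the allowed calling sequence.
        if call not in self._TRANSITIONS[self._state]:
            raise ProtocolError(f"{call}() not allowed in state {self._state!r}")

    def open(self, peer: str):
        self._require("open")
        if not peer or any(c.isspace() for c in peer):
            raise ProtocolError("malformed peer name")
        self._state = "open"

    def send(self, data: bytes):
        self._require("send")
        if not isinstance(data, bytes) or len(data) > 65536:
            raise ProtocolError("malformed or oversized payload")
        # ... transmit data ...

    def close(self):
        self._require("close")
        self._state = "closed"

if __name__ == "__main__":
    ch = ConstrainedChannel()
    try:
        ch.send(b"too early")        # rejected: violates the calling sequence
    except ProtocolError as e:
        print("rejected:", e)
    ch.open("peer-a")
    ch.send(b"payload")
    ch.close()
```

Even in this toy form, the extra state tracking and argument validation illustrate why a constrained interface costs more to build, in the short term, than one that simply assumes its callers behave.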

In both of these examples, an unstructured and sloppy design and implementation effort is likely to “work” some of the time. Although such an effort can provide insight to designers and offer an opportunity for them to learn about the nature of the problem at hand, transitioning successfully to a serious production environment generally requires starting over from scratch rather than attempting to evolve an unstructured system into the production system. But in practice, organizations pressed by resources and schedule often believe—incorrectly and without foundation—that evolving an unstructured system into the production system will be less expensive. Later, they pay the price, and dearly.

4.1.1.1.3 The Potential Conflict with Functionality and Ease of Use

A third important reason that adherence to the principles in Box 4.1 is relatively rare is the potential conflict with functionality. In many cases, introducing cybersecurity to a system’s design slows it down or makes it harder to use. Implementing the checking, monitoring, and recovery needed for secure operation requires a great deal of computation and does not come for free. At the same time, commodity products—out of which many critical operational systems are built—are often constrained by limited resources and cost, even while the market demands ever-higher performance and functionality.

4.1.1.2 The Relevant Research

In light of the issues above and the historically well-known difficulties in conventional computer system development (and especially the software), research and development (R&D) should be undertaken that is aimed at adapting the design principles of Box 4.1 for use in realistic and common software development environments, without making excessive sacrifices in performance or cost. Today, there are well-established methodologies for design-to-cost and design-for-performance but no comparable methodologies for designing systems in such a way that security functionality can be implemented systematically or even that the security properties of a system can be easily understood. Indeed, security reviews are generally laborious and time-consuming, a fact that reduces the attention that can be paid to security in the design process. In general, the design process needs to consider security along with performance and cost.

One essential element of a “design-for-security evaluation” will be approaches for dealing with system complexity, so that genuinely modular system construction is possible and the number of unanticipated interactions between system components is kept to a bare minimum, as discussed in Box 4.1. In any given case, the right balance will need to be determined between reducing the intrinsic complexity of a system (e.g., as expressed in the realistic requirements for security, reliability, availability, survivability, human safety, and so on) and using architectural means that simplify the interfaces and maintainability (e.g., through abstraction, encapsulation, clean interface design, and design tools that identify and enable the removal of undesired interactions, incompatibilities, and hindrances to composability). This point also illustrates the need to address security issues in the overall architecture of applications and not just as added-on security appliances or components to protect an intrinsically unsafe design.

Another important element is the tracing of requirements to design decisions through implementation. That is, from a security standpoint (as well as for other purposes, such as system maintenance), it is important to know what code (or circuitry) in the final artifact corresponds to what requirements in the system’s specification. Any code or circuitry that does not correspond to something in the system specification is inherently suspect. (See also Section 4.1.3.1.) Today, this problem is largely unsolved, and such documentation—in those rare instances when it does exist—is generated manually. Apart from the labor-intensiveness of the manual generation of such documentation, a manual approach applied to a complex system virtually guarantees that some parts of the code or circuitry will remain untraced to any requirement, simply because they have been overlooked. Moreover, for all practical purposes, a manual process requires that the original designers and implementers be intimately involved, since the connections between requirement and code or circuitry must be documented in near real time. Once these individuals are no longer available for consultation, these connections are inevitably lost.

With respect to the issue of short-term expense, R&D might develop both technical and organizational approaches to reducing short-term costs. From a technical perspective, it would be desirable to have tools that facilitate the reuse of existing design work. From an organizational perspective, different ways of structuring design and development teams might enable a more cost-effective way of exploiting and leveraging existing knowledge and good judgment.

Finally, it is worth developing design methods that proactively anticipate potential attacks. Threat-based design is one possible approach; it requires identifying and characterizing the threats and potential attacks, finding mechanisms that hostile parties may employ to attack or gain entry to a computing system, and redesigning these mechanisms to eliminate or mitigate these potential security vulnerabilities.

A further challenge is that of undertaking such design in a way that does not compromise design-to-cost and design-for-performance goals, such as high performance, low cost, small footprint, low energy consumption, and ease of use.

4.1.2 Research to Support Development

4.1.2.1 Hardware Support for Security

Today, systems developers embody most of the security functionality in software. But hardware and computer architecture can also support more secure systems. In the past two to three decades, computer and microprocessor architects have focused on improving the performance of computers. However, in the same way that processing capability has been used in recent years to improve the user experience (e.g., through the use of compute-intensive graphics), additional increases in hardware performance (e.g., faster processors, larger memories, higher-bandwidth connections) may well be usable for improving security.

Compared with software-based security functionality, hardware-based support for security has two primary advantages. One advantage is that new hardware primitives can be used to make security operations fast and easily accessible, thus eliminating the performance penalty often seen when the same functionality is based in software and increasing the likelihood that this functionality will be used. A second advantage is that hardware tends to be more trustworthy, because it is much harder for an attacker to corrupt hardware than to corrupt software.

Some critics of implementing security in hardware believe that hardware-based security is inflexible and cannot adapt to changes in the environment or in attacker patterns. But hardware support for security need not imply that the entire security function desired must be implemented in hardware. Research is needed to determine the fundamental hardware primitives or features that should be added to allow flexible use by software to construct more secure systems.

Hardware support can be leveraged in several ways. First, faster computing allows software to do more checking and more encrypting. Increases in raw processing performance can be large enough to allow more modular, more trustworthy software to run at acceptable speeds—that is, special-purpose software tricks that were used to enhance performance but that also violated canons of secure program construction are much less necessary than they were in the past.

Second, specific checking capability can be added to the processor itself, supporting a kind of “hardware reference monitor.” This is especially easy to contemplate at the moment, given the current trend toward multicore architectures—some cores can be used for checking other cores.
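In software terms, the kind of checking such a monitoring core might perform can be sketched as follows. This is a minimal, purely illustrative reference monitor; the subjects, actions, and policy rules are invented for the example and are not taken from any particular hardware proposal.

```python
# Illustrative reference monitor: every requested action is checked against a
# policy, and some checks depend on the history of past execution (for example,
# no writes outside a sandbox once untrusted network input has been consumed).

class Monitor:
    def __init__(self):
        self.tainted = False   # remembered context from past execution

    def check(self, subject, action, target):
        # Context-free rule: only the updater may touch system files at all.
        if target.startswith("/system/") and subject != "updater":
            return False
        # Context-dependent rule: once a process has consumed untrusted
        # network input, it may no longer write outside its sandbox.
        if self.tainted and action == "write" and not target.startswith("/sandbox/"):
            return False
        return True

    def record(self, subject, action, target):
        if action == "recv_network":
            self.tainted = True

    def mediate(self, subject, action, target):
        if not self.check(subject, action, target):
            raise PermissionError(f"denied: {subject} {action} {target}")
        self.record(subject, action, target)
        return f"allowed: {subject} {action} {target}"

if __name__ == "__main__":
    m = Monitor()
    print(m.mediate("app", "write", "/home/user/notes.txt"))  # allowed
    print(m.mediate("app", "recv_network", "socket:80"))      # allowed, but taints
    try:
        m.mediate("app", "write", "/home/user/notes.txt")     # now denied
    except PermissionError as e:
        print(e)
```

In the hardware setting contemplated here, such checks would be carried out by dedicated logic or by a separate checking core rather than by the software being checked.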

The checks possible can be quite sophisticated, monitoring not only what actions are being requested but also checking those actions in the context of past execution.[3] Such checking can be used to ensure that applications, middleware, and even privileged operating system software do not perform actions that violate security policies. Hardware can also provide a safety net for potentially harmful actions taken by software, such as executing code that should be considered data. Since the hardware processor executes all software code, it can provide valuable “defense-in-depth” support in preventing software from compromising system security and integrity.

[3] Paul Williams and Eugene H. Spafford, “CuPIDS: An Exploration of Highly Focused, Coprocessor-Based Information System Protection,” Computer Networks, 51(5): 1284-1298, April 2007.

Third, security-specific operations can be added to the hardware. For example, processors can be designed in which data written to memory are encrypted as they leave the processor and decrypted when they return to the processor. Or, instructions can be stored in memory in encrypted form and then decrypted by the hardware just prior to execution. Some proposals for hardware-implemented security operations even go so far as to make these special operations invisible to other computations that occur on that processor.

Hardware can also implement a trustworthy and protected memory for storing secrets (typically, a small number). These secrets cannot be retrieved by software (so they are guaranteed to remain secret no matter what software is running); rather, they are used—for example, to encrypt data—by invoking hardware primitives that use those secrets and return the result. This approach was first implemented in smart cards some years ago, but smart cards have often proved slow and inconvenient to use. Smart cards were followed by a succession of other positionings of the functionality, including outboard secure co-processors and modified microprocessors. The desirability of any given positioning depends, at least in part, on the nature of the threat. For example, if the hardware support for security appears on additional chips elsewhere on a board, then an attacker with physical access to the computer board might succeed without very sophisticated equipment. Placing the support on the microprocessor chip itself significantly complicates such attacks.

An example of embedding security-specific features into hardware to protect a user’s information is provided by Lee et al.,[4] who have developed a secret-protected (SP) architecture that enables the secure and convenient protection of a user’s sensitive information stored in an online environment by providing hardware protection of critical secrets, such as cryptographic keys, belonging to a given user. In the SP architecture, keys follow their users and are not associated with any particular device. Thus, a given user can securely employ his or her keys on multiple devices, and a given device can be used by different users.

[4] R. Lee, P. Kwan, J.P. McGregor, J. Dwoskin, and Z. Wang, “Architecture for Protecting Critical Secrets in Microprocessors,” Proceedings of the 32nd International Symposium on Computer Architecture, IEEE Computer Society, Washington, D.C., pp. 2-13, June 2005.

The SP architecture is based on several elements. One element is the existence of a concealed execution mode in an SP-enhanced microprocessor, which allows a process to execute without its state being tampered with or observed by other processes, including the main operating system running on the processor; it includes a very efficient mechanism for runtime attestation of trusted code. A second element is a trusted software module running in concealed execution mode that performs the necessary protected computations on users’ secret keys, thus protecting all key information (the keys themselves, the computations, and intermediate states) from observation and tampering by adversaries. A third element is a chain of user cryptographic keys that is needed for accessing, and protecting by encryption, any amount of sensitive information. This chain is stored in encrypted form (and thus can be resident anywhere), but it can be decrypted with a master key known only to the user. Similarly, user data, programs, and files encrypted by these keys can be stored safely in public online storage and accessed over public networks. A fourth element is a secure input/output (I/O) channel that enables the user to pass the master key to the SP hardware and the trusted software module without the risk that other modules may intercept the master key. (The SP architecture also requires a variety of specific hardware and operating system enhancements to implement these elements.)

Lee et al. suggest that the SP architecture may be valuable for applications other than protecting cryptographic keys—applications such as digital rights management and privacy protection systems. Also, different scenarios, such as those requiring “transient trust” in providing protected data to crisis responders, can be supported with small extensions to the SP architecture. Lee et al. also note that while various proposals exist for secure I/O and secure bootstrapping, more research is needed to study alternatives that can be integrated into SP-like architectures for commodity computing and communications devices. The SP architecture demonstrates that security-enhancing hardware features can be easily added to microprocessors and flexibly employed by software applications without degrading a system’s performance, cost, or ease of use.

Another example of recent work in this area is the new generation of hardware being shipped with secure co-processors that can store encryption keys and can perform encryption and hash functions. Specifically, the Trusted Computing Group is an industry consortium that has proposed

As the recent attack on the SHA-1 hash algorithm suggests,[22] the intellectual infrastructure of cryptography for commercial and other nonmilitary/nondiplomatic use is not as secure as one might believe. Growing computational power (which led to the vulnerability of the Data Encryption Standard to brute-force decryption) and increasingly sophisticated cryptanalytic tools mean that the study of even these very basic cryptographic primitives (encryption and hash algorithms) has continuing value. Moreover, what had been viewed as esoteric cryptographic primitives and methods of mostly theoretical interest—threshold cryptography, proactive security, and multiparty computation—are now being seen as exactly the right primitives for building distributed systems that are more secure.

[22] More precisely, an attack against the SHA-1 algorithm has been developed that reduces its known run-time collision resistance by a factor of 2^11 (from 2^80 to 2^69) (Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu, “Finding Collisions in the Full SHA-1,” Advances in Cryptology—Crypto ’05; available at http://www.infosec.sdu.edu.cn/paper/sha1-crypto-auth-new-2-yao.pdf). In addition, Adi Shamir announced during the Rump Session at Crypto ’05 (on August 15, 2005) that Wang and other collaborators had demonstrated the possibility of finding a collision in SHA-1 in 2^63 operations, although no actual collisions had been found. This result applies only to collision resistance, which means that digital signatures are placed at risk, but the result does not affect constructions for key derivation, message authentication codes, or random function behavior (i.e., it does not affect any construction in which specific content may be at issue).

Nor are interesting areas in cryptology restricted to cryptography. For example, the development of secure protocols is today more of an art than a science, at least in the public literature, and further research on the theory of secure protocols is needed. A related point is that real-world cryptosystems or components can be implemented in such a way that the security they allegedly provide can be compromised through unanticipated information “leakages” that adversaries can exploit or cause.[23]

[23] For example, Paul Kocher has developed attacks on certain real-world systems that can reveal secret keys in much less time than would be required by brute-force techniques, even though the cryptography in these systems has been implemented perfectly. Kocher’s attacks are based on timing and/or power measurements of the systems involved. See, for example, Paul Kocher et al., “Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems,” December 1995, available at http://www.cryptography.com/resources/whitepapers/TimingAttacks.pdf; and Paul Kocher et al., “Introduction to Differential Power Analysis and Related Attacks,” 1998, available at http://www.cryptography.com/dpa/technical/.

In addition, despite the widespread availability of encryption tools, most electronic communications and data are still unencrypted—a point suggesting that the infrastructure of cryptology remains ill-suited for widespread and routine use. Many practical problems, such as the deployment of usable public-key infrastructures, continue to lack scalable solutions.

The conceptual complexity of employing encryption and the potential exposures that come with doing it wrong strongly suggest the need for research to understand where, how, and when encryption fits into a security architecture.

As an example of bringing cryptographic theory into practice, consider multiparty computations. Here, a collection of parties engages in computing some function of the values that each party holds, but no party learns the values that the others have. Moreover, some protocols defend against having a fraction of the participants be compromised. Threshold digital signatures are a simple example of a multiparty computation. This functionality is useful (though it has not yet enjoyed widespread practical use) when a service is implemented by a replicated set of servers. (Any majority of the servers can together create a signature for responses from the service, but no individual server is capable of impersonating the service.) However, more sophisticated multiparty computation algorithms have not yet made the transition from theory to practice.

So-called proactive cryptographic protocols are another area of interest. These protocols call for the periodic changing of secrets so that information that an attacker gleans from successfully compromising a host is short-lived. Effecting the transition of this cryptographically supported functionality from theory to practice will change the toolbox that systems builders use and could well enable systems that are more secure through the clever deployment of these new cryptographic primitives.

Finally, as new mathematical methods are discovered and as new computing technology becomes available, what is unbreakable today may be penetrable next week. As one example, consider that quantum computing, if made practical, would invalidate several existing methods thought to be unbreakable. Likewise, it has not yet been proven that prime factorization cannot be solved in polynomial time, or that NP is not reducible to P. Thus, it is possible that future discoveries could change a number of the current assumptions about systems such as the RSA algorithm—suggesting that work on developing new basic cryptographic primitives is useful as a hedge against such possibilities.

4.1.3 Research to Support Testing and Evaluation

Testing and evaluation (T&E) are necessary because information technology artifacts are designed and implemented by people, who make mistakes. T&E generally consumes half or more of the overall cost of a software system. T&E occurs at every level of granularity in a system (unit to subassembly, to overall system, to deployed system in situ) and at all process phases, starting with requirements.

Traditional testing involves issues of coverage. Testing every statement may not be enough, but even that may be difficult to achieve.

Testing every branch and path is even harder, since there is generally a combinatorially large number of paths. How much coverage is needed, and what are the metrics of coverage?

4.1.3.1 Finding Unintended Functionality

One of the most challenging problems in testing and evaluation is that of auditing a complex artifact for functionality that has not been included in the specification of requirements and that may result in security vulnerabilities. In a world of outsourced and offshore chip fabrication and/or code development, and given the possibility that trusted designers or programmers might not be so trustworthy, it is an important task to ensure that no functionality inconsistent with the system’s specifications has been added to a hardware or software system. However, the complexity of today’s IT artifacts is such that this task is virtually impossible to accomplish for any real system, and the problem will only get worse in the future. Today, the best testing methodologies can be divided into two types: (1) efforts to find problems whose presence is known a priori, and (2) directed but random testing of everything else that might reveal an “unknown unknown.” Formal methods may also offer some promise for finding unintended functionality, although their ability to handle large systems is still quite limited.

These considerations suggest that comprehensive cybersecurity involves both secure hardware and secure software at every level of the protocol stack, from the physical layer up. This is not to say that every IT application must be run on hardware or software that has been designed and fabricated by trustworthy parties—only that the sensitivity of the application should determine what level of concern should be raised about possible cybersecurity flaws that may have been deliberately embedded in hardware or software.

4.1.3.2 Test Case Generation

A second dimension of testing is ensuring that testing is based on a “good” set of test cases. For example, it is well known that test cases should include some malformed inputs and some that are formally derived from specifications and from code, and in particular, cases that go outside the specification and break its assumptions. Such cases will often reveal security vulnerabilities if they exist.

Testing can focus on particular attributes beyond just functional behavior. For example, a security test might focus on behavior with out-of-specification inputs, or on behavior when the system is under load beyond its declared range, and so on.

Similarly, unit or subsystem testing could focus on the “robustness” of internal interfaces as a way to assess how well an overall system might contain an error, keeping the error within the confines of a subsystem by tolerating it and recovering.

A related point is the development of test suites for commonly used software for which there are multiple implementations. For example, Chen et al. documented the existence of different semantics in three different versions of Unix (Linux, Solaris, and FreeBSD) for the system calls (the uid-setting system calls) that manage the system privileges afforded to users.[24] Their conclusion was that these differing semantics were responsible for many security vulnerabilities. Appropriate test suites would help to verify the semantics and standards compliance of system calls, library routines, compilers, and so on.

[24] Hao Chen, David Wagner, and Drew Dean, “Setuid Demystified,” Proceedings of the 11th USENIX Security Symposium, pp. 171-190, 2002; available at http://www.cs.berkeley.edu/~daw/papers/setuid-usenix02.pdf.

4.1.3.3 Tools for Testing and Evaluation

A third important dimension of testing and evaluation is the real-world usability of tools and approaches for T&E, many of which suffer from real-world problems of scalability, adoptability, and cost. For example:

Tools for static code analysis are often clumsy to use and sometimes flag an enormous number of issues that must be ignored because they are not prioritized in any way and because resources are not available to address all of them.

Dynamic behavior analysis, especially in distributed asynchronous systems, is poorly developed. For example, race conditions—the underlying cause of a number of major vulnerabilities—are difficult to find, and tools oriented toward their discovery are largely absent.

Model checking, code and program analysis, formal verification, and other “semantics-based” techniques are becoming practical only for modestly sized real-system software components. Considerable further work is needed to extend the existing theory of formal verification to the composition of subsystems.

All of these T&E techniques require some kind of specification of what is intended. With testing, the test cases themselves form a specification, and indeed agile techniques rely on testing for this purpose. Inspection allows more informal descriptions.

Analysis and semantics-based techniques rely on various focused, “attribute-specific” specifications of intent.

Inspection is another important technique related to testing and evaluation. Inspection underlies the Common Criteria (ISO 15408), but it relies on subjective human judgment, even though the attention of the human inspectors may be guided through the use of tools and agreed frameworks for inspection. Moreover, the use of human inspectors is expensive, suggesting that inspection as a technique for testing and evaluation does not easily scale to large projects.

4.1.3.4 Threat Modeling

Today, most security certification and testing are based on a “test to the specification” process. That is, the process begins with an understanding of the threats against which defenses are needed. Defenses against those threats are reflected as system specifications that are included in the overall specification process for a system. Testing is then performed against those specifications. While this process is reasonably effective in finding functionality that is absent from the system as implemented (such absences can be found because the functionality is reflected in the specification), it has two major weaknesses.

The first weakness of the test-to-the-specification process is that it requires a set of clear and complete specifications that can be used to drive the specifics of the testing procedure. However, as noted in Section 4.1.1, a great deal of real-world software development uses methodologies based on spiral and incremental development, in which the software “evolves” to meet the new needs that users express as they learn and use the software. This makes it essentially impossible to specify complex software on an a priori basis. Thus, specifications used for testing are generally written after the software has been written. This means that the implemented functionality determines the specifications, and consequently the specifications themselves are no better than the developers’ and implementers’ understanding of the system. That understanding is necessarily informal (and hence incomplete), because it is, by assumption, not based on any kind of formal methodology. (The fact that these specifications are developed after the fact also makes them late and not very relevant to the software development process, but those issues are beyond the scope of this report.)

The second weakness, related to the first, is that this methodology is not particularly good at finding additional functionality that goes beyond what is formally specified. (Section 4.1.3.1 addresses some of the difficulties in finding such problems.)
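The contrast can be illustrated with a small sketch of testing that deliberately goes outside the specification, in the spirit of Section 4.1.3.2 and of the threat-based testing discussed next. The message format, the parser, and the threat list here are all hypothetical and exist only for illustration.

```python
# Hypothetical parser for a tiny length-prefixed message format, plus tests
# that go outside the specification: truncated input, inconsistent lengths,
# and oversized claims, the kinds of malformed cases an attacker would try.

def parse_message(data: bytes) -> bytes:
    """Format: 2-byte big-endian length N, followed by exactly N payload bytes."""
    if len(data) < 2:
        raise ValueError("truncated header")
    length = int.from_bytes(data[:2], "big")
    if length > 4096:
        raise ValueError("declared length exceeds limit")
    payload = data[2:]
    if len(payload) != length:
        raise ValueError("declared length does not match payload")
    return payload

# Each test case is paired with the threat it probes, not just a clause of the spec.
THREAT_CASES = [
    ("truncation / buffer over-read", b"\x00"),
    ("length confusion / overflow",   b"\x00\x05abc"),
    ("resource exhaustion",           b"\xff\xff" + b"A" * 10),
]

def run_threat_tests():
    for threat, case in THREAT_CASES:
        try:
            parse_message(case)
            print(f"FAIL ({threat}): malformed input was accepted")
        except ValueError:
            print(f"ok   ({threat}): rejected safely")
        except Exception as e:
            print(f"FAIL ({threat}): unexpected crash {type(e).__name__}")

if __name__ == "__main__":
    run_threat_tests()
    assert parse_message(b"\x00\x03abc") == b"abc"   # in-specification case still works
```

A test-to-the-specification process would exercise only the in-specification case at the end; the threat-driven cases are what probe behavior the specification never mentions.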

Weaknesses in a test-to-the-specification approach suggest that complementary approaches are needed. In particular, threat modeling and threat-based testing are becoming increasingly important. In these approaches, a set of threats is characterized, and testing activities include testing defenses against those threats. (This is the complement to threat-based design, described in Section 4.1.1.2.) This approach can be characterized as, “Tell me the threats that you are defending against, and prove to me that you have done so.” Research in this domain involves the development of techniques to characterize broader categories of threat and more formal methods to determine the adequacy of defenses against those threats. For those situations in which a threat is known and a vulnerability is present but no defense is available, developing instrumentation to monitor the vulnerability for information on the threat may be useful as well. Research is also needed to enable spiral methodologies to take new threats into account as a system “evolves” to have new features.

4.2 GRACEFUL DEGRADATION AND RECOVERY

If the principle of defense in depth is taken seriously, system architects and designers must account for the possibility that defenses will be breached, in which case it is necessary to contain the damage that a breach might cause and/or to recover from the damage that was caused. Although security efforts should focus on reducing vulnerabilities proactively where possible, it is important that a system provide containment to limit the damage that a security breach can cause and recovery to maximize the ease with which a system or network can recover from an exploitation. Progress in this area most directly supports Provision II and Provision III of the Cybersecurity Bill of Rights, and indirectly supports Provision VII.

4.2.1 Containment

There are many approaches to containing damage:

Engineered heterogeneity. In agriculture, monocultures are known to be highly vulnerable to blight.

In a computer security context, a population of millions of identically programmed digital objects is systematically vulnerable to an exploit that targets a specific security defect, especially if all of those objects are attached to the Internet.[25] If it is the specifics of a given object code that result in a particular vulnerability, a different object code rewritten automatically to preserve the original object code’s high-end functionality may eliminate that vulnerability. (Of course, it is a requirement of such rewriting that it not introduce another vulnerability. Moreover, such methods can interfere with efforts to debug software undertaken at the object-code level, as well as with legitimate third-party software add-ons and enhancements, suggesting that there are trade-offs to be analyzed concerning whether or not automatic rewriting is appropriate in any given situation.)

[25] Monocultures in information technology also have an impact on the economics of insuring against cyber-disasters. Because the existence of a monoculture means that risks to systems in that monoculture are not independent, insurers face a much larger upper bound on their liability than if these risks were independent, since they might be required to pay off a large number of claims at once.

Disposable computing. An attacker who compromises or corrupts a system designed to be disposable—that is, a computing environment whose corruption or compromise does not matter much to the user—is unlikely to gain much in the way of additional resources or privileges.[26] A disposable computing environment can thus be seen as a buffer between the outside world and the “real” computing environment in which serious business can be undertaken. When the outside world manifests a presence in the buffer zone, the resulting behavior is observed, thus providing an empirical basis for deciding whether and/or in what form to allow that presence to be passed through to the “real” environment. As in the case of process isolation, the challenge in disposable computing is to develop methods for safe interaction between the buffer and the “real” environment. One classic example of disposable computing is Java, which was widely adopted because its sandboxing technology created a perimeter around the execution context of the applet code. That is, an applet could do anything inside the sandbox but was constrained from affecting anything outside the sandbox.

[26] Perhaps the most important gain from such an attack is knowledge and insight into the structure of that computing environment—which may be useful in conducting another attack against another similarly constructed system.

Virtualization and isolation. As discussed in Section 4.1.2.3, isolation is one way of confining the reach of an application or a software module.
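A rough sketch of the disposable-computing idea above, at the level of a single process: untrusted input is handled in a throwaway child process whose behavior is observed, and only a vetted result is passed through to the “real” environment. The handler and the policy here are hypothetical, and real isolation would add operating-system-level sandboxing that this sketch omits.

```python
# Illustrative "disposable" buffer zone: untrusted input is handled in a
# throwaway child process. If the handler crashes, hangs, or produces nothing,
# the parent only observes the outcome; the child's state is discarded either way.
import multiprocessing as mp
import queue as queue_mod

def untrusted_handler(data, result_queue):
    # Hypothetical risky processing; in the worst case it crashes or hangs.
    result_queue.put(data.decode("utf-8", errors="strict").upper())

def handle_in_buffer_zone(data, timeout=2.0):
    results = mp.Queue()
    child = mp.Process(target=untrusted_handler, args=(data, results))
    child.start()
    child.join(timeout)
    if child.is_alive():                 # hung: discard the disposable environment
        child.terminate()
        child.join()
        return None, "discarded: handler timed out"
    if child.exitcode != 0:              # crashed: nothing crosses the boundary
        return None, "discarded: handler crashed"
    try:
        return results.get(timeout=1), "passed through after observation"
    except queue_mod.Empty:
        return None, "discarded: no result produced"

if __name__ == "__main__":
    print(handle_in_buffer_zone(b"benign input"))
    print(handle_in_buffer_zone(b"\xff\xfe not valid utf-8"))  # child raises and exits nonzero
```

The buffer zone here is only a separate process; the same pattern extends to virtual machines and the other isolation mechanisms mentioned above.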

4.2.2 Recovery

A second key element of a sound defensive strategy is the ability to recover quickly from the effects of a security breach, should one occur. Indeed, in the limiting case, and when information leakage is not the threat of concern, allowing corruption or compromise of a computer system may be acceptable if that system can be (almost) instantaneously restored to its correct previous state. That is, recovery can itself be regarded as a mechanism of cyberdefense when foiling an attack is not possible or feasible. Recent work in embedding transaction and journaling capabilities into basic file system structures in operating systems suggests that there is some commercial demand for this approach. Because of the difficulty of high-confidence prevention of system compromise against high-end threats, recovery is likely to be a key element of defending against such threats.

Illustrative research topics within this domain include the following:

Rebooting. Rebooting a system resets the system state to a known initial configuration and is a necessary step in many computer operations. For example, rebooting is often necessary when a resident system file is updated. Rebooting is also often necessary when an attack has wreaked havoc on the system state. However, rebooting is normally a time-consuming activity that results in the loss of a great deal of system state that is perfectly “healthy.” Rebooting is particularly difficult when a large-scale distributed system is involved. Micro-rebooting (an instantiation of a more general approach to recovery known as software rejuvenation[27]) is a technique that reboots only the parts of the system that are failing rather than the entire system. Research in micro-rebooting includes, among other things, the development of techniques to identify components in need of rebooting and ways to further reduce the duration of the outage associated with rebooting. Such considerations are particularly important in environments that require extremely high availability.

[27] Software rejuvenation is a technique proposed to deal with the phenomenon of software aging, in which the performance of a software system degrades with time as the result of factors such as exhaustion of operating system resources and data corruption. In general terms, software rejuvenation calls for occasionally terminating an application or a system, cleaning its internal state and/or its environment, and restarting it. See, for example, Kalyanaraman Vaidyanathan and Kishor S. Trivedi, “A Comprehensive Model for Software Rejuvenation,” IEEE Transactions on Dependable and Secure Computing, 2(2, April-June): 124-137, 2005. See also http://srejuv.ee.duke.edu.

Online production testing. An essential element of recovery is fault identification. One approach to facilitating such identification is online testing, in which test inputs (and sometimes deliberately faulty inputs) are inserted into running production systems to verify their proper operation. In addition, modules in the system can be designed to be self-testing and to verify the behavior of the other modules with which they interact.

Large-scale undo capabilities. An undo capability enables system operators to roll back a system to an earlier state, and multiple layers of undo capability enable correspondingly longer roll-back periods. If a successful cyberattack occurs at a given time, rolling back the system’s state to before that time is one way of recovering from the attack—and it does not depend on knowing anything about the specific nature of the attack.[28]

[28] Aaron B. Brown, A Recovery-Oriented Approach to Dependable Services: Repairing Past Errors with System-Wide Undo, University of California, Berkeley, Computer Science Division Technical Report UCB//CSD-04-1304, December 2003, available at http://roc.cs.berkeley.edu/projects/undo/index.html; A. Brown and D. Patterson, “Undo for Operators: Building an Undoable E-Mail Store,” in Proceedings of the 2003 USENIX Annual Technical Conference, San Antonio, Tex., June 2003, available at http://roc.cs.berkeley.edu/papers/brown-emailundo-usenix03.pdf.

4.3 SOFTWARE AND SYSTEMS ASSURANCE

Software and systems assurance is focused on two related but logically distinct goals: the creation of systems that will do the right thing under the range of possible operating conditions, and human confidence that the system will indeed do the right thing. For much of computing’s history, high-assurance computing has been most relevant to systems such as real-time avionics, nuclear command and control, and so on. But in recent years, the issue of electronic voting has brought questions related to high-assurance computing squarely into the public eye. At its roots, the debate is an issue of assurance: how does (or should) the voting public become convinced that the voting process has not been compromised? In such a context, it is not enough that a system has not been compromised; it must be known not to have been compromised. This issue has elements of traditional high-assurance concerns (e.g., Does the program meet its specifications?) but also raises broader questions, such as support for recounts and making sure that the larger context cannot be used for corruption (e.g., configuration management).

A variety of techniques have been developed to promote software and systems assurance, including formal requirements analysis, architectural reviews, and the testing and verification of the properties of components, compositions, and entire systems. It makes intuitive sense that developing secure systems would be subsumed under systems assurance—by definition, secure systems are systems that function predictably even when they are under attack.[29] An additional challenge is how to design a system and demonstrate its assurance to a general (lay) audience. In the example above, it is the general voting public—not simply the computer science community—that is the ultimate judge of whether or not it is “sufficiently assured” that electronic voting systems are acceptably secure.

[29] For more discussion of this point, see National Research Council, Trust in Cyberspace, National Academy Press, Washington, D.C., 1999.

Some techniques used to enhance reliability are relevant to cybersecurity—much of software engineering research is oriented toward learning how to decide on and formulate system requirements (including trade-offs among functionality, complexity, schedule, and cost); developing methods and tools for specifying systems; languages and tools for programming systems (especially systems involving concurrent and distributed processing); middleware to provide common services for software systems; and so on. Testing procedures and practices (Section 4.1.3) are also intimately connected with assurance. All of these areas are relevant to the design and implementation of more secure systems, although attention to these issues can result in common solutions that address reliability, survivability, and evolvability as well.

Software engineering advances also leverage basic research in areas that seem distant from system building per se. Success in developing tools for program analysis, in developing languages for specifications, and in developing new programming languages and computational models typically leverages more foundational work—in applied logic, in algorithms, in computational complexity, in programming-language design, and in compilers.

At the same time, assurance and security are not identical, and they often seek different goals. Consider the issue of system reliability, usually regarded as a key dimension of assurance. In contrast with threats to security, threats to system reliability are nondirected and in some sense are more related to robustness against chance events, such as power outages or uninformed users doing surprising or unexpected things.

By contrast, threats to security are usually deliberate, involving a human adversary who has the intention to do damage and who takes actions that are decidedly not random. A test and evaluation regime oriented toward reliability will not necessarily be informative about security. The same is true about using redundancy as a solution to reliability, since redundancy can be at odds with heterogeneity in designing for security. Thus, it would be a mistake to conclude that focusing solely on reliability will automatically lead to high levels of cybersecurity.