4
Category 1—Blocking and Limiting the Impact of Compromise

The goal of requirements in Category 1 of the committee’s illustrative research agenda is that of ensuring that the impact of compromises in accountability or system security is limited. This broad category—blocking and limiting the impact of compromise—includes secure information systems and networks that resist technical compromise; technological and organizational approaches that reveal attempts to compromise information technology (IT) components, systems, or networks; containment of breaches; backup and recovery; convenient and ubiquitous encryption that can prevent unauthorized parties from obtaining sensitive or confidential data; system lockdowns under attack; and so on.

A basic principle underlying Category 1 is that of defense in depth. A great deal of experience in dealing with cybersecurity issues suggests that no individual defensive measure is impossible to circumvent. Thus, it makes sense to consider defense in depth, which places in the way of a cyberattacker a set of varied hurdles, all of which must be penetrated or circumvented if the cyberattacker is to achieve its goal. When different hurdles are involved, an attacker must have access to a wider range of expertise to achieve its goal and also must have the increased time and resources needed to penetrate all of the defenses.
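The idea can be made concrete with a small sketch. The layers, names, and policies below are invented for illustration only and are not drawn from the report; the point is simply that a request must survive every hurdle, so an attacker who defeats one check has gained nothing unless the others fall as well.

```python
# Illustrative defense in depth: a request is admitted only if every
# independent layer accepts it. Each layer is a different kind of hurdle,
# so defeating one does not defeat the others.

def network_filter(request):
    # Hurdle 1: only traffic from an allowed network range.
    return request["source_ip"].startswith("10.0.")

def authenticate(request):
    # Hurdle 2: the caller must present a valid credential.
    return request.get("token") in {"alice-token", "bob-token"}

def authorize(request):
    # Hurdle 3: the authenticated caller must hold the needed permission.
    permissions = {"alice-token": {"read"}, "bob-token": {"read", "write"}}
    return request["action"] in permissions.get(request.get("token"), set())

def validate_input(request):
    # Hurdle 4: the payload must be well formed before it is processed.
    return isinstance(request.get("payload"), str) and len(request["payload"]) < 1024

LAYERS = [network_filter, authenticate, authorize, validate_input]

def handle(request):
    for layer in LAYERS:
        if not layer(request):
            return "rejected by " + layer.__name__
    return "accepted"

if __name__ == "__main__":
    print(handle({"source_ip": "10.0.0.7", "token": "alice-token",
                  "action": "read", "payload": "hello"}))    # accepted
    print(handle({"source_ip": "10.0.0.7", "token": "alice-token",
                  "action": "write", "payload": "hello"}))   # rejected by authorize
```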

4.1 SECURE DESIGN, DEVELOPMENT, AND TESTING

The principle that security must be a core attribute of system design, development, and testing simply reflects the point that it is more effective
to reduce vulnerabilities by not embedding them in a system than to fix the problems that these vulnerabilities cause as they appear in operation.[1] Vulnerabilities can result from design, as when system architects embed security flaws in the structure of a system. Vulnerabilities also result from flaws in development—good designs can be compromised because they are poorly implemented. Testing for security flaws is necessary because designers and implementers inevitably make mistakes or because they have been compromised and have deliberately introduced such flaws.

[1] For example, Soo Hoo et al. determined empirically that fixing security defects after deployment cost almost seven times as much as fixing them before deployment. Furthermore, security investments made in the design stage are 40 percent more cost-effective than similar investments in the development stage. See K. Soo Hoo, A. Sudbury, and A. Jaquith, “Tangible ROI Through Secure Software Engineering,” Secure Business Quarterly, Quarter 4, 2001.

4.1.1 Research to Support Design

4.1.1.1 Principles of Sound and Secure Design

In the past 40+ years, a substantial amount of effort has been expended in the (relatively small) security community to articulate principles of sound design and to meet the goal of systems that are “secure by design.” On the basis of examinations of a variety of systems, researchers have found that the use of these principles by systems designers and architects correlates highly with the security and reliability of a system. Box 4.1 summarizes the classic Saltzer-Schroeder principles, first published in 1975, that have been widely embraced by cybersecurity researchers. Systems not built in accord with such principles will almost certainly exhibit inherent vulnerabilities that are difficult or impossible to address.

These principles, although well known in the research community and available in the public literature, have not been widely adopted in the mainstream computer hardware and software design and development community. There have been efforts to develop systems following these principles, but observable long-term progress relating specifically to the multitude of requirements for security is limited. For example, research in programming languages has resulted in advances that can obviate whole classes of errors—buffer overflows, race conditions, off-by-one errors, format string attacks, mismatched types, divide-by-zero crashes, and unchecked procedure-call arguments. But these advances, important though they are, have not been adopted on a sufficient scale to make these kinds of errors uncommon.

Nonetheless, the principles remain valid—so why have they had so little impact in the design and development process?

In the committee’s view, three primary reasons account for the lack of such impact: the mismatch between these principles and real-world software development environments, short-term expenses associated with serious adherence to these principles, and potential conflicts with performance.

4.1.1.1.1 The Mismatch with Current Development Methodologies

One reason for the lack of impact is the deep mismatch between the principles of system design in Box 4.1 and real-world software development environments. Even a cursory examination of the principles discussed in Box 4.1 suggests that their serious application is predicated on a thorough and deep understanding of what the software designers and architects are trying to do. To apply these principles, software designers and architects have to know very well and in some considerable detail just what the ultimate artifact is supposed to do.

The software development model most relevant to this state of affairs is often called the waterfall model, explicated in considerable detail by Boehm.[2] This model presumes a linear development process that proceeds from requirements specification, to design, to implementation/coding, to integration, to testing/debugging, to installation, to maintenance, although modified versions of the model acknowledge some role for feedback between each of these stages and preceding ones.

[2] Barry Boehm, Software Engineering Economics, Prentice-Hall, Englewood Cliffs, N.J., 1981.

But despite its common use in many software development projects (especially large ones), the waterfall model is widely viewed as inadequate for real-world software development. The reason is that many—perhaps even most—software artifacts grow organically. The practical reality is that large software systems emerge from incremental additions to small software systems in ways entirely unanticipated by the designers of the original system. If the original system is successful, users will almost certainly want to add new functionality. The new functionality desired is by definition unanticipated—if the designers had known that it would be useful, they would have included it in the first place. Indeed, it is essentially impossible in practice for even the most operationally experienced IT applications developers to be able to anticipate in detail and in advance all of a system’s requirements and specifications. (Sometimes users change their minds about the features they want, or even worse, want contradictory features! And, of course, it is difficult indeed to anticipate all potential uses.)

Thus, system requirements and specifications are always inherently incomplete, even though they underlie and drive the relationships among various modules and their interfaces, inputs, state transitions, internal state information, outputs, and exception conditions. Put differently, the paradox is that successful principled development requires a nontrivial understanding of the entire system in its ultimate form before the system can be successfully developed. Systems designers need experience to understand the implications of their design choices. But experience can be gained only by making mistakes and learning from them.

BOX 4.1 The Saltzer-Schroeder Principles of Secure System Design and Development

Saltzer and Schroeder articulate eight design principles that can guide system design and contribute to an implementation without security flaws:

Economy of mechanism: The design should be kept as simple and small as possible. Design and implementation errors that result in unwanted access paths will not be noticed during normal use (since normal use usually does not include attempts to exercise improper access paths). As a result, techniques such as line-by-line inspection of software and physical examination of hardware that implements protection mechanisms are necessary. For such techniques to be successful, a small and simple design is essential.

Fail-safe defaults: Access decisions should be based on permission rather than exclusion. The default situation is lack of access, and the protection scheme identifies conditions under which access is permitted. The alternative, in which mechanisms attempt to identify conditions under which access should be refused, presents the wrong psychological base for secure system design. This principle applies both to the outward appearance of the protection mechanism and to its underlying implementation.

Complete mediation: Every access to every object must be checked for authority. This principle, when systematically applied, is the primary underpinning of the protection system. It forces a system-wide view of access control, which, in addition to normal operation, includes initialization, recovery, shutdown, and maintenance. It implies that a foolproof method of identifying the source of every request must be devised. It also requires that proposals to gain performance by remembering the result of an authority check be examined skeptically. If a change in authority occurs, such remembered results must be systematically updated.

Open design: The design should not be secret. The mechanisms should not depend on the ignorance of potential attackers, but rather on the possession of specific, more easily protected, keys or passwords. This decoupling of protection mechanisms from protection keys permits the mechanisms to be examined by many reviewers without concern that the review may itself compromise the safeguards.

In addition, any skeptical users may be allowed to convince themselves that the system they are about to use is adequate for their individual purposes. Finally, it is simply not realistic to attempt to maintain secrecy for any system that receives wide distribution.

Separation of privilege: Where feasible, a protection mechanism that requires two keys to unlock it is more robust and flexible than one that allows access to the presenter of only a single key. The reason for this greater robustness and flexibility is that, once the mechanism is locked, the two keys can be physically separated and distinct programs, organizations, or individuals can be made responsible for them. From then on, no single accident, deception, or breach of trust is sufficient to compromise the protected information.

Least privilege: Every program and every user of the system should operate using the least set of privileges necessary to complete the job. This principle reduces the number of potential interactions among privileged programs to the minimum for correct operation, so that unintentional, unwanted, or improper uses of privilege are less likely to occur. Thus, if a question arises related to the possible misuse of a privilege, the number of programs that must be audited is minimized.

Least common mechanism: The amount of mechanism common to more than one user and depended on by all users should be minimized. Every shared mechanism (especially one involving shared variables) represents a potential information path between users and must be designed with great care to ensure that it does not unintentionally compromise security. Further, any mechanism serving all users must be certified to the satisfaction of every user, a job presumably harder than satisfying only one or a few users.

Psychological acceptability: It is essential that the human interface be designed for ease of use, so that users routinely and automatically apply the protection mechanisms correctly. More generally, the use of protection mechanisms should not impose burdens on users that might lead users to avoid or circumvent them—when possible, the use of such mechanisms should confer a benefit that makes users want to use them. Thus, if the protection mechanisms make the system slower or cause the user to do more work—even if that extra work is “easy”—they are arguably flawed.

SOURCE: Adapted from J.H. Saltzer and M.D. Schroeder, “The Protection of Information in Computer Systems,” Proceedings of the IEEE 63(9), 1975.

For these reasons, software development methodologies such as incremental development, spiral development, and rapid prototyping have been created that presume an iterative approach to building systems based on extensive prototyping and strong user feedback. Doing so increases the chances that what is ultimately delivered to the end users meets their needs, but entails a great deal of instability in “the requirements.”

Moreover, when such “design for evolvability” methodologies are used with modularity, encapsulation, abstraction, and well-defined interfaces, development and implementation even in the face of uncertain requirements are much easier to undertake. The intellectual challenge—and thus the research question—is how to fold security principles into these kinds of software development processes.

4.1.1.1.2 The Short-Term Expense

A second reason that adherence to the principles listed in Box 4.1 is relatively rare is that such adherence is—in the short term—almost always more expensive than ignoring the principles. If only short-term costs and effort are taken into account, it is—today—significantly more expensive and time-consuming to integrate security from the beginning of a system’s life cycle, compared with doing nothing about security and giving in to the pressures of short-timeline deliverables. This reality arises from a real-world environment in which software developers often experience false starts, and there is a substantial amount of “playing around” that helps to educate and orient developers to the task at hand. In such an environment, when many artifacts are thrown away, it makes very little sense to invest up front in that kind of adherence unless such adherence is relatively inexpensive. The problem is further compounded by the fact that the transition from the “playing around” environment to the “serious development” environment (when it makes more sense to adhere to these principles) is often unclear.

An example is the design of interfaces between components. Highly constrained interfaces increase the stability of a system incorporating such components. At the same time, that kind of constraining effort is inevitably more expensive than the effort involved when an interface is only lightly policed. In this context, a constrained interface is one in which calling sequences and protocols are guaranteed to be valid, meaningful, and appropriate. Guarantees must be provided that malformed sequences and protocols will be excluded. Providing such guarantees requires resources and programming that are unnecessary if the sequences and protocols are simply assumed to be valid.

A second example arises from cooperative development arrangements. In practice, system components are often developed by different parties. With different parties involved, especially in different organizations, communications difficulties are inevitable, and they often include incompatibilities among interface assumptions, the existence of proprietary internal and external interfaces, and performance degradations resulting from the inability to optimize across components. This point suggests the need for well-defined and carefully analyzed specifications for the constituent components, but it is obviously easier and less expensive to simply assume that specifications are unambiguous.
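To make the first example concrete, the sketch below shows one way a constrained interface might police calling sequences and argument formats. The component, its protocol, and the specific checks are hypothetical, invented for illustration; a lightly policed interface would simply omit the checks and trust its callers.

```python
# A hypothetical file-transfer component with a constrained interface:
# calls are rejected unless they arrive in a valid order (open -> send* -> close)
# and with well-formed arguments. A "lightly policed" version would simply
# trust the caller and omit these checks.

class ProtocolError(Exception):
    pass

class ConstrainedChannel:
    # Legal state transitions for the calling protocol.
    _TRANSITIONS = {
        "new":    {"open"},
        "open":   {"send", "close"},
        "closed": set(),
    }

    def __init__(self):
        self._state = "new"

    def _require(self, call):
        # Every call is checked against the allowed calling sequence.
        if call not in self._TRANSITIONS[self._state]:
            raise ProtocolError(f"{call}() not allowed in state {self._state!r}")

    def open(self, peer: str):
        self._require("open")
        if not peer or any(c.isspace() for c in peer):
            raise ProtocolError("malformed peer name")
        self._state = "open"

    def send(self, data: bytes):
        self._require("send")
        if not isinstance(data, bytes) or len(data) > 65536:
            raise ProtocolError("malformed or oversized payload")
        # ... transmit data ...

    def close(self):
        self._require("close")
        self._state = "closed"

if __name__ == "__main__":
    ch = ConstrainedChannel()
    try:
        ch.send(b"too early")        # rejected: violates the calling sequence
    except ProtocolError as e:
        print("rejected:", e)
    ch.open("peer-a")
    ch.send(b"payload")
    ch.close()
```

Even in this toy form, the extra state tracking and argument validation illustrate why a constrained interface costs more to build, in the short term, than one that simply assumes its callers behave.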

In both of these examples, an unstructured and sloppy design and implementation effort is likely to “work” some of the time. Although such an effort can provide insight to designers and offer an opportunity for them to learn about the nature of the problem at hand, transitioning successfully to a serious production environment generally requires starting over from scratch rather than attempting to evolve an unstructured system into the production system. But in practice, organizations pressed by resources and schedule often believe—incorrectly and without foundation—that evolving an unstructured system into the production system will be less expensive. Later, they pay the price, and dearly.

4.1.1.1.3 The Potential Conflict with Functionality and Ease of Use

A third important reason that adherence to the principles in Box 4.1 is relatively rare is the potential conflict with functionality. In many cases, introducing cybersecurity to a system’s design slows it down or makes it harder to use. Implementing the checking, monitoring, and recovery needed for secure operation requires a great deal of computation and does not come for free. At the same time, commodity products—out of which many critical operational systems are built—are often constrained by limited resources and cost, even while the market demands ever-higher performance and functionality.

4.1.1.2 The Relevant Research

In light of the issues above and the historically well-known difficulties in conventional computer system development (and especially the software), research and development (R&D) should be undertaken that is aimed at adapting the design principles of Box 4.1 for use in realistic and common software development environments, without making excessive sacrifices in performance or cost. Today, there are well-established methodologies for design-to-cost and design-for-performance but no comparable methodologies for designing systems in such a way that security functionality can be implemented systematically or even that the security properties of a system can be easily understood. Indeed, security reviews are generally laborious and time-consuming, a fact that reduces the attention that can be paid to security in the design process. In general, the design process needs to consider security along with performance and cost.

One essential element of a “design-for-security evaluation” will be approaches for dealing with system complexity, so that genuinely modular system construction is possible and the number of unanticipated interactions between system components is kept to a bare minimum, as discussed in Box 4.1. In any given case, the right balance will need to be determined between reducing the intrinsic complexity of a system (e.g., as expressed in the realistic requirements for security, reliability, availability, survivability, human safety, and so on) and using architectural means that simplify the interfaces and maintainability (e.g., through abstraction, encapsulation, clean interface design, and design tools that identify and enable the removal of undesired interactions, incompatibilities, and hindrances to composability). This point also illustrates the need to address security issues in the overall architecture of applications and not just as added-on security appliances or components to protect an intrinsically unsafe design.

Another important element is the tracing of requirements to design decisions through implementation. That is, from a security standpoint (as well as for other purposes, such as system maintenance), it is important to know what code (or circuitry) in the final artifact corresponds to what requirements in the system’s specification. Any code or circuitry that does not correspond to something in the system specification is inherently suspect. (See also Section 4.1.3.1.) Today, this problem is largely unsolved, and such documentation—in those rare instances when it does exist—is generated manually. Apart from the labor-intensiveness of the manual generation of such documentation, a manual approach applied to a complex system virtually guarantees that some parts of the code or circuitry will remain untraced to any requirement, simply because they have been overlooked. Moreover, for all practical purposes, a manual process requires that the original designers and implementers be intimately involved, since the connections between requirement and code or circuitry must be documented in near real time. Once these individuals are no longer available for consultation, these connections are inevitably lost.

With respect to the issue of short-term expense, R&D might develop both technical and organizational approaches to reducing short-term costs. From a technical perspective, it would be desirable to have tools that facilitate the reuse of existing design work. From an organizational perspective, different ways of structuring design and development teams might enable a more cost-effective way of exploiting and leveraging existing knowledge and good judgment.

Finally, it is worth developing design methods that proactively anticipate potential attacks. Threat-based design is one possible approach; it requires identifying and characterizing the threats and potential attacks, finding mechanisms that hostile parties may employ to attack or gain entry to a computing system, and redesigning these mechanisms to eliminate or mitigate these potential security vulnerabilities.

A further challenge is that of undertaking such design in a way that does not compromise design-to-cost and design-for-performance goals, such as high performance, low cost, small footprint, low energy consumption, and ease of use.

4.1.2 Research to Support Development

4.1.2.1 Hardware Support for Security

Today, systems developers embody most of the security functionality in software. But hardware and computer architecture can also support more secure systems. In the past two to three decades, computer and microprocessor architects have focused on improving the performance of computers. However, in the same way that processing capability has been used in recent years to improve the user experience (e.g., through the use of compute-intensive graphics), additional increases in hardware performance (e.g., faster processors, larger memories, higher-bandwidth connections) may well be usable for improving security.

Compared with software-based security functionality, hardware-based support for security has two primary advantages. One advantage is that new hardware primitives can be used to make security operations fast and easily accessible, thus eliminating the performance penalty often seen when the same functionality is based in software and increasing the likelihood that this functionality will be used. A second advantage is that hardware tends to be more trustworthy, because it is much harder for an attacker to corrupt hardware than to corrupt software.

Some critics of implementing security in hardware believe that hardware-based security is inflexible and cannot adapt to changes in the environment or in attacker patterns. But hardware support for security need not imply that the entire security function desired must be implemented in hardware. Research is needed to determine the fundamental hardware primitives or features that should be added to allow flexible use by software to construct more secure systems.

Hardware support can be leveraged in several ways. First, faster computing allows software to do more checking and more encrypting. Increases in raw processing performance can be large enough to allow more modular, more trustworthy software to run at acceptable speeds—that is, special-purpose software tricks that were used to enhance performance but that also violated canons of secure program construction are much less necessary than they were in the past.

Second, specific checking capability can be added to the processor itself, supporting a kind of “hardware reference monitor.” This is especially easy to contemplate at the moment, given the current trend toward multicore architectures—some cores can be used for checking other cores.
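In software terms, the kind of checking such a monitoring core might perform can be sketched as follows. This is a minimal, purely illustrative reference monitor; the subjects, actions, and policy rules are invented for the example and are not taken from any particular hardware proposal.

```python
# Illustrative reference monitor: every requested action is checked against a
# policy, and some checks depend on the history of past execution (for example,
# no writes outside a sandbox once untrusted network input has been consumed).

class Monitor:
    def __init__(self):
        self.tainted = False   # remembered context from past execution

    def check(self, subject, action, target):
        # Context-free rule: only the updater may touch system files at all.
        if target.startswith("/system/") and subject != "updater":
            return False
        # Context-dependent rule: once a process has consumed untrusted
        # network input, it may no longer write outside its sandbox.
        if self.tainted and action == "write" and not target.startswith("/sandbox/"):
            return False
        return True

    def record(self, subject, action, target):
        if action == "recv_network":
            self.tainted = True

    def mediate(self, subject, action, target):
        if not self.check(subject, action, target):
            raise PermissionError(f"denied: {subject} {action} {target}")
        self.record(subject, action, target)
        return f"allowed: {subject} {action} {target}"

if __name__ == "__main__":
    m = Monitor()
    print(m.mediate("app", "write", "/home/user/notes.txt"))  # allowed
    print(m.mediate("app", "recv_network", "socket:80"))      # allowed, but taints
    try:
        m.mediate("app", "write", "/home/user/notes.txt")     # now denied
    except PermissionError as e:
        print(e)
```

In the hardware setting contemplated here, such checks would be carried out by dedicated logic or by a separate checking core rather than by the software being checked.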

The checks possible can be quite sophisticated, monitoring not only what actions are being requested but also checking those actions in the context of past execution.[3] Such checking can be used to ensure that applications, middleware, and even privileged operating system software do not perform actions that violate security policies. Hardware can also provide a safety net for potentially harmful actions taken by software, such as executing code that should be considered data. Since the hardware processor executes all software code, it can provide valuable “defense-in-depth” support in preventing software from compromising system security and integrity.

[3] Paul Williams and Eugene H. Spafford, “CuPIDS: An Exploration of Highly Focused, Coprocessor-Based Information System Protection,” Computer Networks, 51(5): 1284-1298, April 2007.

Third, security-specific operations can be added to the hardware. For example, processors can be designed in which data written to memory are encrypted as they leave the processor and decrypted when they return to the processor. Or, instructions can be stored in memory in encrypted form and then decrypted by the hardware just prior to execution. Some proposals for hardware-implemented security operations even go so far as to make these special operations invisible to other computations that occur on that processor.

Hardware can also implement a trustworthy and protected memory for storing secrets (typically, a small number). These secrets cannot be retrieved by software (so they are guaranteed to remain secret no matter what software is running); rather, they are used—for example, to encrypt data—by invoking hardware primitives that use those secrets and return the result. This approach was first implemented in smart cards some years ago, but smart cards have often proved slow and inconvenient to use. Smart cards were followed by a succession of other positionings of the functionality, including outboard secure co-processors and modified microprocessors. The desirability of any given positioning depends, at least in part, on the nature of the threat. For example, if the hardware support for security appears on additional chips elsewhere on a board, then an attacker with physical access to the computer board might succeed without very sophisticated equipment. Placing the support on the microprocessor chip itself significantly complicates such attacks.

An example of embedding security-specific features into hardware to protect a user’s information is provided by Lee et al.,[4] who have developed a secret-protected (SP) architecture that enables the secure and convenient protection of a user’s sensitive information stored in an online environment by providing hardware protection of critical secrets, such as cryptographic keys, belonging to a given user. In the SP architecture, keys follow their users and are not associated with any particular device. Thus, a given user can securely employ his or her keys on multiple devices, and a given device can be used by different users.

[4] R. Lee, P. Kwan, J.P. McGregor, J. Dwoskin, and Z. Wang, “Architecture for Protecting Critical Secrets in Microprocessors,” Proceedings of the 32nd International Symposium on Computer Architecture, IEEE Computer Society, Washington, D.C., pp. 2-13, June 2005.

The SP architecture is based on several elements. One element is the existence of a concealed execution mode in an SP-enhanced microprocessor, which allows a process to execute without its state being tampered with or observed by other processes, including the main operating system running on the processor; it includes a very efficient mechanism for runtime attestation of trusted code. A second element is a trusted software module running in concealed execution mode that performs the necessary protected computations on users’ secret keys, thus protecting all key information (the keys themselves, the computations, and intermediate states) from observation and tampering by adversaries. A third element is a chain of user cryptographic keys that is needed for accessing, and protecting by encryption, any amount of sensitive information. This chain is stored in encrypted form (and thus can be resident anywhere), but it can be decrypted with a master key known only to the user. Similarly, user data, programs, and files encrypted by these keys can be stored safely in public online storage and accessed over public networks. A fourth element is a secure input/output (I/O) channel that enables the user to pass the master key to the SP hardware and the trusted software module without the risk that other modules may intercept the master key. (The SP architecture also requires a variety of specific hardware and operating system enhancements to implement these elements.)

Lee et al. suggest that the SP architecture may be valuable for applications other than protecting cryptographic keys—applications such as digital rights management and privacy protection systems. Also, different scenarios, such as those requiring “transient trust” in providing protected data to crisis responders, can be supported with small extensions to the SP architecture. Lee et al. also note that while various proposals exist for secure I/O and secure bootstrapping, more research is needed to study alternatives that can be integrated into SP-like architectures for commodity computing and communications devices. The SP architecture demonstrates that security-enhancing hardware features can be easily added to microprocessors and flexibly employed by software applications without degrading a system’s performance, cost, or ease of use.

Another example of recent work in this area is the new generation of hardware being shipped with secure co-processors that can store encryption keys and can perform encryption and hash functions. Specifically, the Trusted Computing Group is an industry consortium that has proposed

As the recent attack on the SHA-1 hash algorithm suggests,[22] the intellectual infrastructure of cryptography for commercial and other nonmilitary/nondiplomatic use is not as secure as one might believe. Growing computational power (which led to the vulnerability of the Data Encryption Standard to brute-force decryption) and increasingly sophisticated cryptanalytic tools mean that the study of even these very basic cryptographic primitives (encryption and hash algorithms) has continuing value. Moreover, what had been viewed as esoteric cryptographic primitives and methods of mostly theoretical interest—threshold cryptography, proactive security, and multiparty computation—are now being seen as exactly the right primitives for building distributed systems that are more secure.

[22] More precisely, an attack against the SHA-1 algorithm has been developed that reduces its known run-time collision resistance by a factor of 2^11 (from 2^80 to 2^69) (Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu, “Finding Collisions in the Full SHA-1,” Advances in Cryptology—Crypto ’05; available at http://www.infosec.sdu.edu.cn/paper/sha1-crypto-auth-new-2-yao.pdf). In addition, Adi Shamir announced during the Rump Session at Crypto ’05 (on August 15, 2005) that Wang and other collaborators had demonstrated the possibility of finding a collision in SHA-1 in 2^63 operations, although no actual collisions had been found. This result applies only to collision resistance, which means that digital signatures are placed at risk, but the result does not affect constructions for key derivation, message authentication codes, or random function behavior (i.e., it does not affect any construction in which specific content may be at issue).

Nor are interesting areas in cryptology restricted to cryptography. For example, the development of secure protocols is today more of an art than a science, at least in the public literature, and further research on the theory of secure protocols is needed. A related point is that real-world cryptosystems or components can be implemented in such a way that the security they allegedly provide can be compromised through unanticipated information “leakages” that adversaries can exploit or cause.[23]

[23] For example, Paul Kocher has developed attacks on certain real-world systems that can reveal secret keys in much less time than would be required by brute-force techniques, even though the cryptography in these systems has been implemented perfectly. Kocher’s attacks are based on timing and/or power measurements of the systems involved. See, for example, Paul Kocher et al., “Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems,” December 1995, available at http://www.cryptography.com/resources/whitepapers/TimingAttacks.pdf; and Paul Kocher et al., “Introduction to Differential Power Analysis and Related Attacks,” 1998, available at http://www.cryptography.com/dpa/technical/.

In addition, despite the widespread availability of encryption tools, most electronic communications and data are still unencrypted—a point suggesting that the infrastructure of cryptology remains ill-suited for widespread and routine use. Many practical problems, such as the deployment of usable public-key infrastructures, continue to lack scalable solutions.

The conceptual complexity of employing encryption and the potential exposures that come with doing it wrong strongly suggest the need for research to understand where, how, and when encryption fits into a security architecture.

As an example of bringing cryptographic theory into practice, consider multiparty computations. Here, a collection of parties engages in computing some function of the values that each party holds, but no party learns the values that the others have. Moreover, some protocols defend against having a fraction of the participants be compromised. Threshold digital signatures are a simple example of a multiparty computation. This functionality is useful (though it has not yet enjoyed widespread practical use) when a service is implemented by a replicated set of servers. (Any majority of the servers can together create a signature for responses from the service, but no individual server is capable of impersonating the service.) However, more sophisticated multiparty computation algorithms have not yet made the transition from theory to practice.

So-called proactive cryptographic protocols are another area of interest. These protocols call for the periodic changing of secrets so that information that an attacker gleans from successfully compromising a host is short-lived. Effecting the transition of this cryptographically supported functionality from theory to practice will change the toolbox that systems builders use and could well enable systems that are more secure through the clever deployment of these new cryptographic primitives.

Finally, as new mathematical methods are discovered and as new computing technology becomes available, what is unbreakable today may be penetrable next week. As one example, consider that quantum computing, if made practical, would invalidate several existing methods thought to be unbreakable. Likewise, it has not yet been proven that prime factorization cannot be solved in polynomial time, or that NP is not reducible to P. Thus, it is possible that future discoveries could change a number of the current assumptions about systems such as the RSA algorithm—suggesting that work on developing new basic cryptographic primitives is useful as a hedge against such possibilities.

4.1.3 Research to Support Testing and Evaluation

Testing and evaluation (T&E) are necessary because information technology artifacts are designed and implemented by people, who make mistakes. T&E generally consumes half or more of the overall cost of a software system. T&E occurs at every level of granularity in a system (unit to subassembly, to overall system, to deployed system in situ) and at all process phases, starting with requirements.

Traditional testing involves issues of coverage. Testing every statement may not be enough, but even that may be difficult to achieve.

Testing every branch and path is even harder, since there is generally a combinatorially large number of paths. How much coverage is needed, and what are the metrics of coverage?

4.1.3.1 Finding Unintended Functionality

One of the most challenging problems in testing and evaluation is that of auditing a complex artifact for functionality that has not been included in the specification of requirements and that may result in security vulnerabilities. In a world of outsourced and offshore chip fabrication and/or code development, and given the possibility that trusted designers or programmers might not be so trustworthy, it is an important task to ensure that no functionality inconsistent with the system’s specifications has been added to a hardware or software system. However, the complexity of today’s IT artifacts is such that this task is virtually impossible to accomplish for any real system, and the problem will only get worse in the future. Today, the best testing methodologies can be divided into two types: (1) efforts to find problems whose presence is known a priori, and (2) directed but random testing of everything else that might reveal an “unknown unknown.” Formal methods may also offer some promise for finding unintended functionality, although their ability to handle large systems is still quite limited.

These considerations suggest that comprehensive cybersecurity involves both secure hardware and secure software at every level of the protocol stack, from the physical layer up. This is not to say that every IT application must be run on hardware or software that has been designed and fabricated by trustworthy parties—only that the sensitivity of the application should determine what level of concern should be raised about possible cybersecurity flaws that may have been deliberately embedded in hardware or software.

4.1.3.2 Test Case Generation

A second dimension of testing is ensuring that testing is based on a “good” set of test cases. For example, it is well known that test cases should include some malformed inputs and some that are formally derived from specifications and from code, and in particular, cases that go outside the specification and break its assumptions. Such cases will often reveal security vulnerabilities if they exist.

Testing can focus on particular attributes beyond just functional behavior. For example, a security test might focus on behavior with out-of-specification inputs, or on behavior when the system is under load beyond its declared range, and so on.

Similarly, unit or subsystem testing could focus on the “robustness” of internal interfaces as a way to assess how well an overall system might contain an error, keeping the error within the confines of a subsystem by tolerating it and recovering.

A related point is the development of test suites for commonly used software for which there are multiple implementations. For example, Chen et al. documented the existence of different semantics in three different versions of Unix (Linux, Solaris, and FreeBSD) for the system calls (the uid-setting system calls) that manage the system privileges afforded to users.[24] Their conclusion was that these differing semantics were responsible for many security vulnerabilities. Appropriate test suites would help to verify the semantics and standards compliance of system calls, library routines, compilers, and so on.

[24] Hao Chen, David Wagner, and Drew Dean, “Setuid Demystified,” Proceedings of the 11th USENIX Security Symposium, pp. 171-190, 2002; available at http://www.cs.berkeley.edu/~daw/papers/setuid-usenix02.pdf.

4.1.3.3 Tools for Testing and Evaluation

A third important dimension of testing and evaluation is the real-world usability of tools and approaches for T&E, many of which suffer from real-world problems of scalability, adoptability, and cost. For example:

Tools for static code analysis are often clumsy to use and sometimes flag an enormous number of issues that must be ignored because they are not prioritized in any way and because resources are not available to address all of them.

Dynamic behavior analysis, especially in distributed asynchronous systems, is poorly developed. For example, race conditions—the underlying cause of a number of major vulnerabilities—are difficult to find, and tools oriented toward their discovery are largely absent.

Model checking, code and program analysis, formal verification, and other “semantics-based” techniques are becoming practical only for modestly sized real-system software components. Considerable further work is needed to extend the existing theory of formal verification to the composition of subsystems.

All of these T&E techniques require some kind of specification of what is intended. With testing, the test cases themselves form a specification, and indeed agile techniques rely on testing for this purpose. Inspection allows more informal descriptions.

Analysis and semantics-based techniques rely on various focused, “attribute-specific” specifications of intent.

Inspection is another important technique related to testing and evaluation. Inspection underlies the Common Criteria (ISO 15408), but it relies on subjective human judgment, even though the attention of the human inspectors may be guided through the use of tools and agreed frameworks for inspection. Moreover, the use of human inspectors is expensive, suggesting that inspection as a technique for testing and evaluation does not easily scale to large projects.

4.1.3.4 Threat Modeling

Today, most security certification and testing are based on a “test to the specification” process. That is, the process begins with an understanding of the threats against which defenses are needed. Defenses against those threats are reflected as system specifications that are included in the overall specification process for a system. Testing is then performed against those specifications. While this process is reasonably effective in finding functionality that is absent from the system as implemented (such absences can be found because the functionality is reflected in the specification), it has two major weaknesses.

The first weakness of the test-to-the-specification process is that it requires a set of clear and complete specifications that can be used to drive the specifics of the testing procedure. However, as noted in Section 4.1.1, a great deal of real-world software development uses methodologies based on spiral and incremental development, in which the software “evolves” to meet the new needs that users express as they learn and use the software. This makes it essentially impossible to specify complex software on an a priori basis. Thus, specifications used for testing are generally written after the software has been written. This means that the implemented functionality determines the specifications, and consequently the specifications themselves are no better than the developers’ and implementers’ understanding of the system. That understanding is necessarily informal (and hence incomplete), because it is, by assumption, not based on any kind of formal methodology. (The fact that these specifications are developed after the fact also makes them late and not very relevant to the software development process, but those issues are beyond the scope of this report.)

The second weakness, related to the first, is that this methodology is not particularly good at finding additional functionality that goes beyond what is formally specified. (Section 4.1.3.1 addresses some of the difficulties in finding such problems.)
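The contrast can be illustrated with a small sketch of testing that deliberately goes outside the specification, in the spirit of Section 4.1.3.2 and of the threat-based testing discussed next. The message format, the parser, and the threat list here are all hypothetical and exist only for illustration.

```python
# Hypothetical parser for a tiny length-prefixed message format, plus tests
# that go outside the specification: truncated input, inconsistent lengths,
# and oversized claims, the kinds of malformed cases an attacker would try.

def parse_message(data: bytes) -> bytes:
    """Format: 2-byte big-endian length N, followed by exactly N payload bytes."""
    if len(data) < 2:
        raise ValueError("truncated header")
    length = int.from_bytes(data[:2], "big")
    if length > 4096:
        raise ValueError("declared length exceeds limit")
    payload = data[2:]
    if len(payload) != length:
        raise ValueError("declared length does not match payload")
    return payload

# Each test case is paired with the threat it probes, not just a clause of the spec.
THREAT_CASES = [
    ("truncation / buffer over-read", b"\x00"),
    ("length confusion / overflow",   b"\x00\x05abc"),
    ("resource exhaustion",           b"\xff\xff" + b"A" * 10),
]

def run_threat_tests():
    for threat, case in THREAT_CASES:
        try:
            parse_message(case)
            print(f"FAIL ({threat}): malformed input was accepted")
        except ValueError:
            print(f"ok   ({threat}): rejected safely")
        except Exception as e:
            print(f"FAIL ({threat}): unexpected crash {type(e).__name__}")

if __name__ == "__main__":
    run_threat_tests()
    assert parse_message(b"\x00\x03abc") == b"abc"   # in-specification case still works
```

A test-to-the-specification process would exercise only the in-specification case at the end; the threat-driven cases are what probe behavior the specification never mentions.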

Weaknesses in a test-to-the-specification approach suggest that complementary approaches are needed. In particular, threat modeling and threat-based testing are becoming increasingly important. In these approaches, a set of threats is characterized, and testing activities include testing defenses against those threats. (This is the complement to threat-based design, described in Section 4.1.1.2.) This approach can be characterized as, “Tell me the threats that you are defending against, and prove to me that you have done so.” Research in this domain involves the development of techniques to characterize broader categories of threat and more formal methods to determine the adequacy of defenses against those threats. For those situations in which a threat is known and a vulnerability is present but no defense is available, developing instrumentation to monitor the vulnerability for information on the threat may be useful as well. Research is also needed to enable spiral methodologies to take new threats into account as a system “evolves” to have new features.

4.2 GRACEFUL DEGRADATION AND RECOVERY

If the principle of defense in depth is taken seriously, system architects and designers must account for the possibility that defenses will be breached, in which case it is necessary to contain the damage that a breach might cause and/or to recover from the damage that was caused. Although security efforts should focus on reducing vulnerabilities proactively where possible, it is important that a system provide containment to limit the damage that a security breach can cause and recovery to maximize the ease with which a system or network can recover from an exploitation. Progress in this area most directly supports Provision II and Provision III of the Cybersecurity Bill of Rights, and indirectly supports Provision VII.

4.2.1 Containment

There are many approaches to containing damage:

Engineered heterogeneity. In agriculture, monocultures are known to be highly vulnerable to blight.

In a computer security context, a population of millions of identically programmed digital objects is systematically vulnerable to an exploit that targets a specific security defect, especially if all of those objects are attached to the Internet.[25] If it is the specifics of a given object code that result in a particular vulnerability, a different object code rewritten automatically to preserve the original object code’s high-end functionality may eliminate that vulnerability. (Of course, it is a requirement of such rewriting that it not introduce another vulnerability. Moreover, such methods can interfere with efforts to debug software undertaken at the object-code level, as well as with legitimate third-party software add-ons and enhancements, suggesting that there are trade-offs to be analyzed concerning whether or not automatic rewriting is appropriate in any given situation.)

[25] Monocultures in information technology also have an impact on the economics of insuring against cyber-disasters. Because the existence of a monoculture means that risks to systems in that monoculture are not independent, insurers face a much larger upper bound on their liability than if these risks were independent, since they might be required to pay off a large number of claims at once.

Disposable computing. An attacker who compromises or corrupts a system designed to be disposable—that is, a computing environment whose corruption or compromise does not matter much to the user—is unlikely to gain much in the way of additional resources or privileges.[26] A disposable computing environment can thus be seen as a buffer between the outside world and the “real” computing environment in which serious business can be undertaken. When the outside world manifests a presence in the buffer zone, the resulting behavior is observed, thus providing an empirical basis for deciding whether and/or in what form to allow that presence to be passed through to the “real” environment. As in the case of process isolation, the challenge in disposable computing is to develop methods for safe interaction between the buffer and the “real” environment. One classic example of disposable computing is Java, which was widely adopted because its sandboxing technology created a perimeter around the execution context of the applet code. That is, an applet could do anything inside the sandbox but was constrained from affecting anything outside the sandbox.

[26] Perhaps the most important gain from such an attack is knowledge and insight into the structure of that computing environment—which may be useful in conducting another attack against another similarly constructed system.

Virtualization and isolation. As discussed in Section 4.1.2.3, isolation is one way of confining the reach of an application or a software module.
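A rough sketch of the disposable-computing idea above, at the level of a single process: untrusted input is handled in a throwaway child process whose behavior is observed, and only a vetted result is passed through to the “real” environment. The handler and the policy here are hypothetical, and real isolation would add operating-system-level sandboxing that this sketch omits.

```python
# Illustrative "disposable" buffer zone: untrusted input is handled in a
# throwaway child process. If the handler crashes, hangs, or produces nothing,
# the parent only observes the outcome; the child's state is discarded either way.
import multiprocessing as mp
import queue as queue_mod

def untrusted_handler(data, result_queue):
    # Hypothetical risky processing; in the worst case it crashes or hangs.
    result_queue.put(data.decode("utf-8", errors="strict").upper())

def handle_in_buffer_zone(data, timeout=2.0):
    results = mp.Queue()
    child = mp.Process(target=untrusted_handler, args=(data, results))
    child.start()
    child.join(timeout)
    if child.is_alive():                 # hung: discard the disposable environment
        child.terminate()
        child.join()
        return None, "discarded: handler timed out"
    if child.exitcode != 0:              # crashed: nothing crosses the boundary
        return None, "discarded: handler crashed"
    try:
        return results.get(timeout=1), "passed through after observation"
    except queue_mod.Empty:
        return None, "discarded: no result produced"

if __name__ == "__main__":
    print(handle_in_buffer_zone(b"benign input"))
    print(handle_in_buffer_zone(b"\xff\xfe not valid utf-8"))  # child raises and exits nonzero
```

The buffer zone here is only a separate process; the same pattern extends to virtual machines and the other isolation mechanisms mentioned above.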

4.2.2 Recovery

A second key element of a sound defensive strategy is the ability to recover quickly from the effects of a security breach, should one occur. Indeed, in the limiting case, and when information leakage is not the threat of concern, allowing corruption or compromise of a computer system may be acceptable if that system can be (almost) instantaneously restored to its correct previous state. That is, recovery can itself be regarded as a mechanism of cyberdefense when foiling an attack is not possible or feasible. Recent work in embedding transaction and journaling capabilities into basic file system structures in operating systems suggests that there is some commercial demand for this approach. Because of the difficulty of high-confidence prevention of system compromise against high-end threats, recovery is likely to be a key element of defending against such threats.

Illustrative research topics within this domain include the following:

Rebooting. Rebooting a system resets the system state to a known initial configuration and is a necessary step in many computer operations. For example, rebooting is often necessary when a resident system file is updated. Rebooting is also often necessary when an attack has wreaked havoc on the system state. However, rebooting is normally a time-consuming activity that results in the loss of a great deal of system state that is perfectly “healthy.” Rebooting is particularly difficult when a large-scale distributed system is involved. Micro-rebooting (an instantiation of a more general approach to recovery known as software rejuvenation[27]) is a technique that reboots only the parts of the system that are failing rather than the entire system. Research in micro-rebooting includes, among other things, the development of techniques to identify components in need of rebooting and ways to further reduce the duration of the outage associated with rebooting. Such considerations are particularly important in environments that require extremely high availability.

[27] Software rejuvenation is a technique proposed to deal with the phenomenon of software aging, in which the performance of a software system degrades with time as the result of factors such as exhaustion of operating system resources and data corruption. In general terms, software rejuvenation calls for occasionally terminating an application or a system, cleaning its internal state and/or its environment, and restarting it. See, for example, Kalyanaraman Vaidyanathan and Kishor S. Trivedi, “A Comprehensive Model for Software Rejuvenation,” IEEE Transactions on Dependable and Secure Computing, 2(2, April-June): 124-137, 2005. See also http://srejuv.ee.duke.edu.

Online production testing. An essential element of recovery is fault identification. One approach to facilitating such identification is online testing, in which test inputs (and sometimes deliberately faulty inputs) are inserted into running production systems to verify their proper operation. In addition, modules in the system can be designed to be self-testing and to verify the behavior of the other modules with which they interact.

Large-scale undo capabilities. An undo capability enables system operators to roll back a system to an earlier state, and multiple layers of undo capability enable correspondingly longer roll-back periods. If a successful cyberattack occurs at a given time, rolling back the system’s state to before that time is one way of recovering from the attack—and it does not depend on knowing anything about the specific nature of the attack.[28]

[28] Aaron B. Brown, A Recovery-Oriented Approach to Dependable Services: Repairing Past Errors with System-Wide Undo, University of California, Berkeley, Computer Science Division Technical Report UCB//CSD-04-1304, December 2003, available at http://roc.cs.berkeley.edu/projects/undo/index.html; A. Brown and D. Patterson, “Undo for Operators: Building an Undoable E-Mail Store,” in Proceedings of the 2003 USENIX Annual Technical Conference, San Antonio, Tex., June 2003, available at http://roc.cs.berkeley.edu/papers/brown-emailundo-usenix03.pdf.

4.3 SOFTWARE AND SYSTEMS ASSURANCE

Software and systems assurance is focused on two related but logically distinct goals: the creation of systems that will do the right thing under the range of possible operating conditions, and human confidence that the system will indeed do the right thing. For much of computing’s history, high-assurance computing has been most relevant to systems such as real-time avionics, nuclear command and control, and so on. But in recent years, the issue of electronic voting has brought questions related to high-assurance computing squarely into the public eye. At its roots, the debate is an issue of assurance: how does (or should) the voting public become convinced that the voting process has not been compromised? In such a context, it is not enough that a system has not been compromised; it must be known not to have been compromised. This issue has elements of traditional high-assurance concerns (e.g., Does the program meet its specifications?) but also raises broader questions, such as support for recounts and making sure that the larger context cannot be used for corruption (e.g., configuration management).

A variety of techniques have been developed to promote software and systems assurance, including formal requirements analysis, architectural reviews, and the testing and verification of the properties of components, compositions, and entire systems. It makes intuitive sense that developing secure systems would be subsumed under systems assurance—by definition, secure systems are systems that function predictably even when they are under attack.[29] An additional challenge is how to design a system and demonstrate its assurance to a general (lay) audience. In the example above, it is the general voting public—not simply the computer science community—that is the ultimate judge of whether or not it is “sufficiently assured” that electronic voting systems are acceptably secure.

[29] For more discussion of this point, see National Research Council, Trust in Cyberspace, National Academy Press, Washington, D.C., 1999.

Some techniques used to enhance reliability are relevant to cybersecurity—much of software engineering research is oriented toward learning how to decide on and formulate system requirements (including trade-offs among functionality, complexity, schedule, and cost); developing methods and tools for specifying systems; languages and tools for programming systems (especially systems involving concurrent and distributed processing); middleware to provide common services for software systems; and so on. Testing procedures and practices (Section 4.1.3) are also intimately connected with assurance. All of these areas are relevant to the design and implementation of more secure systems, although attention to these issues can result in common solutions that address reliability, survivability, and evolvability as well.

Software engineering advances also leverage basic research in areas that seem distant from system building per se. Success in developing tools for program analysis, in developing languages for specifications, and in developing new programming languages and computational models typically leverages more foundational work—in applied logic, in algorithms, in computational complexity, in programming-language design, and in compilers.

At the same time, assurance and security are not identical, and they often seek different goals. Consider the issue of system reliability, usually regarded as a key dimension of assurance. In contrast with threats to security, threats to system reliability are nondirected and in some sense are more related to robustness against chance events, such as power outages or uninformed users doing surprising or unexpected things.

By contrast, threats to security are usually deliberate, involving a human adversary who has the intention to do damage and who takes actions that are decidedly not random. A test and evaluation regime oriented toward reliability will not necessarily be informative about security. The same is true about using redundancy as a solution to reliability, since redundancy can be at odds with heterogeneity in designing for security. Thus, it would be a mistake to conclude that focusing solely on reliability will automatically lead to high levels of cybersecurity.