
Critical Code: Software Producibility for Defense (2010)


4
Adopt a Strategic Approach to Software Assurance

SOFTWARE ASSURANCE AND EVIDENCE

One of the great challenges for both defense and civilian systems is software quality assurance. Software assurance encompasses reliability, security, robustness, safety, and other quality-related attributes. Diverse studies suggest that overall software assurance costs account for 30 to 50 percent of total project costs for most software projects.1 Despite this cost, current approaches to software assurance, primarily testing and inspection, are inadequate to provide the levels of assurance required for many categories of both routine and critical systems.2

In major defense systems, the assurance process is heavily complicated by the arm’s-length relationship that exists between a contractor development team and government stakeholders. This relationship—in which sometimes even minor changes to up-front commitments may necessitate amendments to contracts and adjustments in costing—can create barriers to the effective and timely sharing of information that can assist the customer in efficiently reaching accurate assurance judgments. Additionally, it can be difficult to create incentives for the appropriate use of preventive measures such as those referenced in this chapter.

1

In “Software Debugging, Testing, and Verification,” IBM Systems Journal 41(1), 2002, B. Hailpern and P. Santhanam say, “In a typical commercial development organization, the cost of providing this assurance via appropriate debugging, testing, and verification activities can easily range from 50 to 75 percent of the total development cost.” In Estimating Software Costs (McGraw-Hill, 1998), Capers Jones provides a table relating percentage of defects removed to percentage of development effort devoted to testing, with data points that include 90 to 39 percent, 96 to 48 percent, and 99.9 to 58 percent. In Software Cost Estimation with COCOMO II (Prentice Hall, 2000), Barry W. Boehm, Chris Abts, A. Winsor Brown, Sunita Chulani, Bradford K. Clark, Ellis Horowitz, Ray Madachy, Donald Reifer, and Bert Steece indicate that the cost of test planning and running tests is typically 20 to 30 percent, plus rework due to defects discovered. In Balancing Agility and Discipline (Addison-Wesley, 2004), Barry Boehm and Richard Turner provide an analysis of the COCOMO II Architecture and Risk Resolution scale factor, indicating that the increase in rework due to poor architecture and risk resolution is roughly 18 percent for typical 10-KSLOC (KSLOC stands for thousand software lines of code) projects and roughly 91 percent for typical 10,000-KSLOC projects. (COCOMO II, or constructive cost model II, is a software cost, effort, and schedule estimation model.) This analysis suggests that improvements are needed in up-front areas as well as in testing, and it underscores the importance of architecture research, especially for ultra-large systems.

2

The challenges relating to assurance were highlighted by several briefers to the committee. In addition, this issue is a core concern in the Defense Science Board (DSB), September 2007, Report of the Defense Science Board Task Force on Mission Impact of Foreign Influence on DoD Software, Washington, DC: Office of the Under Secretary of Defense for Acquisition, Technology, and Logistics, at pp. 30-38. Available online at http://stinet.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA473661. The 2007 NRC report Software for Dependable Systems also addressed the issue of testing and noted, “Testing … will not in general suffice, because even the largest test suites typically used will not exercise enough paths to provide evidence that the software is correct nor will it have sufficient statistical significance for the levels of confidence usually desired” (p. 13). See NRC, Daniel Jackson, Martyn Thomas, and Lynette I. Millett, eds. 2007, Software for Dependable Systems, National Academies Press, Washington, DC. Available online at http://www.nap.edu/catalog.php?record_id=11923. Last accessed August 20, 2010.



In this chapter the committee first considers the trends related to the challenges of software assurance. It then offers a concise conceptual framework for certain software assurance issues. Finally, it identifies significant technical opportunities and potential future challenges to improving our ability to provide assurance. (Some of these are elaborated in Chapter 5.)

Failures in software assurance can be of particularly high consequence for defense systems because of their roles in protecting human lives, in warfighting, in safeguarding national assets, and in other pivotal roles. The probability of failure can also be high, due to the frequent combination of scale, innovative character, and diversity of sourcing in defense systems. Unless exceptional attention is devoted to assurance, a high level of risk derives from this combination of high consequence and high likelihood.

Assurance considerations also relate to progress tracking, as discussed in Chapter 2—assessment of readiness for operational evaluation and release is based not just on progress in building a system, but also on progress in achieving developmental assurance. Additionally, the technologies and practices used to achieve assurance may also contribute useful metrics to guide process decision making.

Assurance Is a Judgment

Software assurance is a human judgment of fitness for use. In practice, assurance judgments are based on application of a broad range of techniques that include both preventive and evaluative methods and that are applied throughout a software engineering process. Indeed, for modern systems, and not just critical systems, the design of a software process is driven not only by issues related to engineering risk and uncertainty, but also, in a fundamental way, by quality considerations.3 These, in turn, are driven by systems risks—hazards—as described in Chapter 2 and also in Box 4.1 (cybersecurity).

An important reality of defense software assurance is the need to achieve safety—that is, in war there are individual engagements where lives are at stake and where software is the deciding factor in the outcome. In many life-and-death situations the proper overriding assurance criterion may not be optimum performance, but rather the “minimization of maximum regret.” This challenge is exacerbated by the fact that, while a full-scale operational test of many capabilities may not be feasible, assurance must nonetheless be achieved. This applies, for example, to certain systems that support strategic defense and disaster mitigation. The committee notes, however, that there are great benefits in architecting systems and structuring requirements such that many capabilities that would otherwise be reserved for rare “emergencies” are also used in an ongoing mode for more routine operations. This creates benefits from operational feedback and user familiarity. It also permits iterative development and deployment, such as is familiar to users of many evolving commercial online services.

Another reality of defense software that affects assurance is that it is developed by contractors working at arm’s length from the DoD. This means, for example, that the information sharing necessary to assess and achieve assurance must be negotiated explicitly.

There are many well-publicized examples of major defense systems exhibiting operational failures of various kinds that are, evidently, consequences of inadequate assurance practices. A recent example of this type of top-level systems engineering issue was the failure of an F-22 flight management system when it was flown across the international date line for the first time en route from Hawaii to Japan. In a CNN interview, Maj. Gen. Don Sheppard (ret.) said, “At the international date line, whoops, all systems dumped and when I say all systems, I mean all systems, their navigation, part of their communications,

3

Michael Howard and Steve Lipner, 2006, The Security Development Lifecycle, Redmond, WA: Microsoft Press. See also Box 2.3.


BOX 4.1

Assurance and Cybersecurity—A Brief Consideration

Cybersecurity


Although it is not a principal focus of this report, cybersecurity is an unavoidable and critical dimension of software assurance. It is rarely possible to contemplate software assurance without also giving major attention to security considerations. This is particularly challenging because security, like assurance, must be addressed at every phase of development and the software lifecycle overall.1

A system can only be assured if it is well understood. The main text elaborates the concept of a chain of evidence, which documents this understanding as traceability from intentions to outcomes, including functional requirements, quality attributes, and architectural constraints. Security adds the additional dimension of threats and attacks. For software, these can occur not only during operations, but also at every stage of the lifecycle, from development through to ongoing evolution and update during operations. The crudest categorization of threats yields three different avenues of attack: (1) external attackers—adversaries gaining access from points external to the system, typically via network connections; (2) operational insiders—adversaries gaining access to a DoD software system through inappropriate privileging, compromised physical access, or compromised personnel; and (3) engineering insiders—adversaries influencing or participating in the engineering process at some point in the supply chain for an overall system.

Attacks can have different goals, typically characterized as “CIA”—breaching Confidentiality of data, damaging the Integrity of data, and disrupting Availability of a computational service. The analysis of possible threats and attacks is a key element of secure software development. This analysis is strongly analogous to hazard analysis (as discussed elsewhere in this report), and it can lead to a host of security considerations to address in the development of systems, relating, for example, to identity and attribution, network situational awareness, secure mobility, policy models and usability, and forensics.

From the standpoint of secure software development, the committee highlights two principal policy considerations, chosen because they are most likely to significantly influence both software architecture and development practice. The first relates to separation—minimizing and managing the coupling among components in a way that reduces both the overall extent of those most sensitive components in a system that require the highest levels of assurance and the “attack surface” of those components with respect to the various avenues of attack noted above. The second relates to configuration integrity—the assurance that any deviations or dynamic alterations to an operational system are consistent with architectural intent.


Separation


The first example of a security-related chain is the separation chain. Construction of this chain of evidence entails documenting relationships among critical shared resources and the software and system components that should, or should not, have access to or otherwise influence those resources.2 This chain documents the means by which access to resources is provided—or denied—to the components of a software system that need to rely on those resources. A less trusted component, for example, may be excluded by policy from observing, changing, or influencing access by others to a critical resource such as a private key.
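To make this concrete, the following Java fragment is a purely illustrative sketch (the class and method names are invented for this example, not drawn from the report): a signing component confines a private key to an encapsulated field and exposes only a signing operation, so that less trusted callers can request signatures but have no way to observe or replace the key itself.

    import java.security.GeneralSecurityException;
    import java.security.PrivateKey;
    import java.security.Signature;

    // Illustrative sketch: the private key is confined to this one small component.
    // No method returns or stores the key elsewhere, so the "separation" argument
    // an evaluator must make is local to this class rather than system-wide.
    public final class SigningService {

        private final PrivateKey key;  // never exposed outside this class

        public SigningService(PrivateKey key) {
            this.key = key;
        }

        // The only operation offered to less trusted components.
        public byte[] sign(byte[] message) throws GeneralSecurityException {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(key);
            signer.update(message);
            return signer.sign();
        }
    }

An inspection or a code-analysis tool can then check a simple, local property (no method leaks the key field) instead of reasoning about every component that merely uses the signing service.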


The ability to construct chains of this kind is determined by architectural decisions and implementation practices. Concepts from security architecture such as process separation, isolation, encapsulation, and secure communication architecture determine whether this kind of chain can be feasibly constructed, with minimal exposure of the most sensitive portions of a system. For example, modern commercial PC operating systems are designed to achieve security goals while offering tremendous generality and power in their underlying services and resource-management capabilities. Operating systems more focused on media delivery may offer less generality and flexibility, but may do better in providing assurance relating to security because their architectures are designed to more tightly regulate access to resources.

  

1 Michael Howard and Steve Lipner, 2006, The Security Development Lifecycle, Redmond, WA: Microsoft Press. See also Gary McGraw, 2006, Software Security: Building Security In, Boston: Addison-Wesley.

  

2 This documentation should be formal wherever possible, such as might be derived from code analysis, verification, and modeling.



Research advances can expand architectural options for which assurance of this kind can be achieved. This is influenced both through enhancement of architectural sophistication and through the ability to model and assure policies.


Configuration


The second example of a security-related chain is the configuration chain. This chain documents the configuration integrity that is established when a system starts up and that is sustained through operations. The chain, in this case, typically links a known hardware configuration with the full complexity of an overall running system, including software code, firmware, and hardware operating within that configuration. Loss of integrity can occur, for example, when malware arrives over a network and embeds itself within a system. It should be clear that this chain (like the other chain) is significant not only for networked systems but also for any system with a diverse supply chain, due to the differing trust levels conferred on system components. The assurance enabled by this chain is that the assumptions that underlie the construction of other kinds of chains (and the architectural, functional, and other decisions that enable that construction) are reflected in the reality of the code that executes—and so the conclusions can be trusted. Put simply, this chain assures an absence of tampering. This has proven to be a singular challenge for commercial operating systems, as evidenced by the difficulty of detecting and eradicating rootkits, for example.

Documentation of this second kind of chain is complicated by a diversity of factors. One is the dynamism of modern architectures, which afford the flexibility and convenience of dynamically loading software components such as device drivers and libraries. Another is the layered and modular structure that is the usual result of considerations related to development of the first kind of chain. A third factor is assuring configuration integrity of the hardware itself. Including hardware in the chain can be much more challenging than the analogous process for software, because of the added need to “reverse engineer” physical hardware.3 A fourth factor derives from the “bootstrap” process through which initial software configurations are loaded onto bare hardware, generally layer by layer. This affords the opportunity of an iterative and ongoing process of loading and integrity checking, such as has been envisioned in the development of the Trusted Platform Module (TPM) chips that are present on the motherboards of most PCs and game platforms.4 In this model, the intent is to assure integrity by fingerprinting software components and monitoring their integrity as they are loaded and configured, both through the bootstrap process and during operations. These four factors, combined with a highly competitive environment that discourages compromise on systems functionality and performance, have made adoption of commercial off-the-shelf operating systems highly challenging for the DoD, for example.5
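To illustrate the fingerprinting idea in ordinary code, the sketch below is hypothetical and mimics the “extend” style of measurement associated with TPM-based boot processes; it does not call any real TPM interface. A loader folds the hash of each component into a running measurement register as the component is admitted, so the final value reflects both the content and the order of everything loaded.

    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    // Hypothetical sketch of TPM-style measurement: the register is repeatedly
    // "extended" with the hash of each newly loaded component, so any change to a
    // component, or to the load order, produces a different final fingerprint.
    public final class MeasurementChain {

        private byte[] register = new byte[32];  // initial value, all zeros

        public void extend(byte[] componentImage) throws NoSuchAlgorithmException {
            byte[] componentHash = MessageDigest.getInstance("SHA-256").digest(componentImage);
            MessageDigest combiner = MessageDigest.getInstance("SHA-256");
            combiner.update(register);       // previous register value
            combiner.update(componentHash);  // hash of the newly loaded component
            register = combiner.digest();    // new register value
        }

        public byte[] currentValue() {
            return register.clone();
        }
    }

An evaluator, or a remote verifier during attestation, can then compare the final register value against the fingerprint expected for the approved configuration.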


A Note on Secrecy


Security-related faults lead to hazards precisely when attackers are able to exploit those faults to create errors and failures. It may be tempting, therefore, to think that full secrecy of the software code base would preclude such possibilities. For defense systems there are many good reasons for secrecy, but, from the perspective of exploitation of vulnerabilities, over-reliance on secrecy (“security through obscurity”) is a dangerous approach. There are two reasons. First, faults can very often be detected through sophisticated “black-box” methods, in which attackers probe and poke a system based on hypotheses regarding its likely structure and function—these methods are analogous to those used in software development for operational and systems-level testing. Second, if secrecy enables developers to become complacent about fundamentals, such as appropriate security architectures (see below), then the overall risk can increase dramatically. A minor coding flaw may expose a vulnerability, but with good development and assurance practice that flaw can be readily eliminated either directly through analysis or indirectly through multi-layer defense. An architectural flaw, on the other hand, may be much more difficult or even impossible to mitigate without taking the entire system offline and undertaking significant reengineering.

  

3 DSB, February 2005, Report of the Defense Science Board Task Force on High Performance Microchip Supply, Washington, DC: Office of the Under Secretary of Defense for Acquisition, Technology, and Logistics, Available online at http://stinet.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA473661. Last accessed August 20, 2010.

  

4 See http://www.trustedcomputinggroup.org/.

  

5 DSB, September 2007, Report of the Defense Science Board Task Force on Mission Impact of Foreign Influence on DoD Software, Washington, DC: Office of the Under Secretary of Defense for Acquisition, Technology, and Logistics. Available online at http://stinet.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA473661. Last accessed August 20, 2010.




Opaque Code


As noted in the main text, modern software systems often consist of components drawn from diverse sources. The gradients of trust among components are often complicated by the fact that many components are relatively opaque compared with others—for example, only executable code is available. There are several reasons for this opacity in DoD systems, many of which are driven by commercial considerations related to the protection of intellectual property manifest in source code and design documentation. These considerations may apply both to commercial vendors and to subcontractors who may be potential competitors with their prime contractor on other projects. Indeed, some development organizations may not want to share source code and design information with the government because they are concerned about potential public release or about the possibility of similar requests for access from other governments whom they may seek as customers. This is a particular challenge for commercial vendors, who typically conduct business globally and so may face similar requirements from other governments. This risk of exposure may even deter some firms from conducting business in the U.S. government supply chain.

This issue motivates technologies related to sandboxing and isolation, such as those used in web browsers for JavaScript and (as a research goal) technologies for “evidence-carrying code,” where evidence of security or safety can be provided in a way that may nonetheless cloak vendor trade secrets.

These considerations notwithstanding, a principal consideration in assurance is the reduction in the extent of code that, in the end, remains opaque to DoD acceptance evaluators. One mechanism, embodied in the Common Criteria model, is the use of mutually trusted third parties to support assurance activities. A key issue is how that evaluation can be done such that two goals are addressed: (1) There is minimal added cost and delay, and (2) Evidence can be produced that protects the interests of the developers and that manifests the necessary links in the various required chains of evidence. The first of these goals could be supported, for example, through the involvement of evaluation teams throughout development. But it could also be addressed through a consistent practice of “evidence production,” whereby developers create links in the necessary chains of evidence that can support a more efficient third-party or government evaluation.

One of the challenges in evidence production is achieving a return-on-investment model that has the characteristic of “early gratification” for development teams. This was considered unachievable for many years. But there is now evidence in modern team practice, with intensive use of tools for team coordination, defect/issue tracking, and software assurance (unit testing and analysis), that costly after-the-fact practices are giving way to ongoing evidence production, in the same spirit as test-driven development. The second of these goals suggests a number of challenging research problems related to the production of evidence that supports assurance but may also cloak proprietary algorithms from other development teams working on the same system. Both goals also suggest a research challenge related to enhancing the scope of specification of APIs to facilitate demonstration of compliance with API rules of the road. This is significant from an architectural standpoint, because it enables development teams to work more independently of each other, given the added certainty regarding API-mediated interactions.


their fuel systems. They were—they could have been in real trouble. They were with their tankers…. The [F-22 crews] tried to reset their systems, couldn’t get them reset. The tankers brought them back to Hawaii. This could have been real serious. It certainly could have been real serious if the weather had been bad. It turned out OK. It was fixed in 48 hours. It was a computer glitch in the millions of lines of code, somebody made an error in a couple lines of the code and everything goes.” The contact with the tankers was visual: “Had they gotten separated from their tankers or had the weather been bad, they had no attitude reference. They had no communications or navigation. They would have turned around and probably could have found the Hawaiian Islands. But if the weather had been bad on approach, there could have been real trouble.”4

There Are Diverse Quality Attributes and Methods

Software assurance encompasses a wide range of quality attributes. For defense systems, there is particular emphasis on addressing hazards related to security (primarily confidentiality, integrity, and access of service, see Box 4.1), availability and responsiveness (up time and speed of response), safety (life and property), adherence to policy (rules of engagement), and diverse other attributes. There is a very broad range of kinds of failures, errors, and faults that can lead to such hazards (Box 4.2). Software assurance practices must therefore encompass a correspondingly broad range of techniques and practices.

There is a false perception that assurance can be achieved entirely through acceptance evaluation such as operational and systems testing. Systems testing is certainly a necessary step in assuring functional properties and many performance properties. But it is by no means sufficient. Testing cannot readily provide assurance for many kinds of concerns: failures related to security, intermittent failures due to non-determinism and concurrency, readiness for likely future evolution and interoperation requirements, readiness for infrastructure upgrades, behavior in highly complex state spaces, and others.

A comprehensive assurance practice requires attention to quality issues throughout the development and operations lifecycle, at virtually every stage of the process and at all links in the supply chain supporting the overall system. The latter point is a consequence of the observation above regarding the fallacy of relying entirely on acceptance evaluation and operational testing. Although the DoD relies extensively on vendor software and undertakes considerable testing of that software, it also implicitly relies on a relationship founded in trust (rather than “verify”) to assure many of the quality attributes (listed above) that are not effectively supported through this kind of testing. This issue is explored at length in a report by the Defense Science Board on foreign software in defense systems.5

It is now increasingly well understood by software engineers and managers that quality, including security, is not “tested in,” but rather must be “built in.”6 But there are great challenges to succeeding in both “building in quality,” using preventive methods, and assuring that it is there, using evaluative methods. The nature of the challenge is determined by a combination of factors, including the potential operational hazards, the system requirements, infrastructure choices, and many other factors.

4

“F-22 Squadron Shot Down by the International Date Line,” Defense Industry Daily, March 1, 2007. Available online at http://www.defenseindustrydaily.com/f22-squadron-shot-down-by-the-international-date-line-03087/. Last accessed August 10, 2010. There are also numerous public accounts of software failures of diverse kinds and consequences, such as those cited in the Forum on Risks to the Public in Computers and Related Systems, available online at http://www.risks.org.

5

Defense Science Board (DSB), 2007, Report of the Defense Science Board Task Force on Mission Impact of Foreign Influence on DoD Software, Office of the Under Secretary of Defense for Acquisition, Technology, and Logistics. Available online at http://stinet.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA473661. Last accessed August 20, 2010.

6

This is not a comment about test-driven development, which is an excellent way to transform the valuable evaluative practice of testing into a more valuable preventive practice of test-driven development—building test cases and code simultaneously or even writing test cases before code is written. Note here that “test” should be broadly construed, encompassing quality techniques such as inspection, modeling, and analysis. There are benefits to writing code from the outset that more readily support, for example, modeling, sound analysis, and structured inspection.


BOX 4.2

Faults, Errors, Failures, and Hazards

A failure is a manifestation of a system that is inconsistent with its functional or quality intent—it fails to perform to specification. A hazard is a consequence to an organization or its mission that is the result of a system manifesting a failure. That is, if a system has been placed in a critical role and a failure occurs, then the hazard is the consequence to that role. For example, if an aircraft navigation system delivers incorrect results, the hazard is the potential consequence to the aircraft, its occupants, its owner, and so on, of incorrect navigation. An error, like a failure, is a manifestation when a system is running. But an error can be contained entirely within a system, not necessarily leading to failures. For example, some database systems can detect and remediate “local deadlocks” that involve perhaps a pair of threads, and they can do this in a generally transparent manner. Another example is an unexpected exception (such as might be raised when a null pointer is de-referenced) being handled locally within a component or subsystem. More broadly, architectures can be designed to detect errors, including security problems, within individual components and can reconfigure themselves to isolate or otherwise neutralize those errors.1 Errors, in turn, are enabled by local faults in code. A fault is a static flaw in the code at a particular place or region or identifiable set of places. Examples of faults include points in code where integrity tests are not made (leading to robustness errors), where locks are not acquired (leading to potential race conditions), where data is incorrectly interpreted (leading to erroneous output values), where program logic is flawed (leading to incorrect results), and so on.
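A minimal Java illustration of the distinction (hypothetical code, not taken from any fielded system): the fault is the missing synchronization in recordHit; an error occurs only in runs where two threads interleave badly and an update is lost; the failure is the externally visible undercount reported to whoever relies on the total.

    // Hypothetical illustration of fault vs. error vs. failure.
    public final class HitCounter {

        private long hits = 0;

        // FAULT: the update below is not synchronized, so two threads can read the
        // same value of 'hits' and each write back value + 1 (a race condition).
        public void recordHit() {
            hits = hits + 1;  // ERROR arises at run time only under unlucky interleavings
        }

        // FAILURE: a caller observes a total lower than the number of hits recorded.
        public long total() {
            return hits;
        }
    }

The fault is present in the code whether or not a given run ever exhibits the error.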

In systems that include hardware, probabilistic models are used to make predictions regarding when errors or failures are likely to occur—for example, to compute mean time to failure or expected lifetimes of components. These models are the core of reliability theory, and they can involve complex relationships of conditional probability (i.e., faults that are more likely in the presence of other faults), coupled probability (e.g., when many faults are made more likely in adverse weather), and other complexities. With software, these probabilistic models are less useful, since the failures are caused by intrinsic design flaws that require implementation changes for correction. Intermittent errors in software are thus “designed into the code” (albeit unintentionally). Repair thus means making changes in the flawed design. For embedded software, where the software includes fault-tolerance roles, hybrid models are often most helpful.

This model helps to highlight the challenges associated with effective software testing, inspection, and static analysis. The results of tests that fail are manifestations of errors (unit tests) or failures (system tests). Assuming the tests are valid, the engineer must then ascertain which faults may have led to the error or failure manifestations. This reverse-engineering puzzle can be challenging, or not, depending on the scope of the tests and the complexity of the code. Failures in system tests, for example, can derive from the full scope of the code, including incorporated vendor components and infrastructure. Test results are generally of moderate to high value, because they reflect the priorities implicit in the test coverage strategy that guided their creation.2

One of the pitfalls of late testing, as would be the case if unit testing were deferred, is that the faults identified may have become very expensive to repair, adding substantially to engineering risk. If the fault is fundamental to the design of a particular interface, then all clients and suppliers that share that interface may be affected as part of the repair process. If the fault is architectural, the costs may be greater, and there may be new engineering risks associated with exploration of alternative options. This suggests both that testing be done at the component level early in the process and that commitments related to architecture and interface design be evaluated through modeling, simulation, and analysis as early as possible in the lifecycle.

The results of inspections, on the other hand, generally point to specific places in code or in models where there are problems of one kind or another. This, from a purely theoretical basis, may be why inspections are sometimes measured as being more effective than testing by a measure of defects found per hour. Because inspections usually combine explicit targeting of issues and opportunistic exploration, the issues found are generally high value.

Static analysis results, including both sound analyses and heuristic analyses, generally point to faults in the code. They thus share with inspections the productivity advantage of avoiding the puzzle-solving inherent in the handling of adverse test results. Additionally, static analysis results can highlight low-probability intermittent errors that might routinely crash continuously operating servers but not be readily detectable using conventional testing. Unlike validated tests, analysis results can include false positives, which are indications of possible faults when there are actually no faults. (Unvalidated tests can also produce false positives in cases where the code is “correct,” but the test case is not.) Sound static analysis (i.e., static analysis with no false negatives) is used in compiler type checkers and some free-standing analysis tools. Its results are usually tightly targeted to very particular attributes and can lead fairly directly to repairs. Heuristic static analysis results, such as from open-source tools PMD and FindBugs, have considerably broader coverage than targeted sound analysis. But the results are typically less exact, and include false negatives (faults not found) as well as false positives. Additionally, there can be large numbers of results ranging from serious issues to code layout style suggestions. This necessitates an explicit process to set priorities among the results. An analysis of the open-source Hadoop system, for example, can yield more than 10,000 findings.
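As a hypothetical example of the kind of finding such tools report (the class and method names are invented), a heuristic analyzer would typically flag the possible null dereference below, a fault that testing may miss because the triggering condition, an absent configuration entry, can be rare:

    // Hypothetical example of a fault that heuristic static analysis typically flags.
    public final class StartupCheck {

        // Returns null when the named configuration entry is absent.
        static String getConfig(String name) {
            return System.getProperty(name);
        }

        static int configuredPort() {
            String port = getConfig("service.port");
            return Integer.parseInt(port.trim());  // possible null dereference of 'port'
        }
    }

A finding like this one is genuine, but tools also report similar patterns in code where the value happens always to be set, which is how false positives arise and why an explicit triage process is needed.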

  

1 Of course, it is possible that the mechanism by which errors are contained results in a loss of information regarding both the errors and the fact that they were contained. This information loss can create dangerous situations. The well-known case of the Therac-25 failures (Nancy G. Leveson and Clark S. Turner, 1993, “An Investigation of the Therac-25 Accidents,” IEEE Computer 26(7):18-41) is a particularly compelling example of the consequences of inadequate information regarding actual error containment in operations. In this case, engineers acted on a false supposition regarding the extent of error containment by a hardware mechanism in operations, resulting in fatal x-ray doses being administered to cancer patients.

2 Test coverage metrics can be useful, but there are many kinds of coverage criteria. Pure “statement coverage” may be misleading, because it may indicate a prevalence of regression tests crafted in response to defects rather than of tests motivated by more “proactive” criteria.

Underlying both preventive and evaluative methods are the two most critical broad influences on quality: judicious choice of process and practices, and the capability and training of the people involved in the process. Process and practices can include techniques for measurement and feedback in process execution in support of iteration, progress and earned value tracking, and engineering-risk management. Indeed, a key feature of process design is the concept of feedback loops specifically creating opportunities for feedback at low cost and with high benefit in terms of reducing engineering risk.7 Practices can also include approaches to defect tracking, root cause analysis, and so on.

There is overlap between preventive and evaluative methods because evaluative methods are most effective when applied throughout a development process and not just as part of a systems-level acceptance evaluation activity. When used at the earliest stages in the process, evaluative methods shorten feedback loops and guide development choices, thus lessening engineering risk. (To illustrate the range of methods and interventions related to quality in software, a summary is presented in Box 4.3.)

7

These feedback loops may be conceptualized as “OODA loops”—Observe, Orient, Decide, Act. The OODA model for operational processes was articulated by COL John Boyd, USAF, and is widely used as a conceptual framework for iterative planning and replanning processes.


BOX 4.3

Examples of Preventive and Evaluative Methods

Below are several illustrative examples of preventive methods. Underlying all of these particular methods is an emphasis on preventing the introduction of defects or finding them as soon as possible after they are introduced.

  • Requirements analysis. Assess operational hazards derived from context of use, adjusting operational plans to the extent possible to minimize potential hazard. Assess goals and limits with respect to quality attributes.

  • Architecture design. Adopt structural approaches that enhance reliability, robustness, and security while also providing flexibility in areas of anticipated change.

  • Ecosystem choice. Affiliate with ecosystems based on quality assessments of components and infrastructure derived from the associated supply chain.

  • Detail design. Adopt software structures and patterns that enhance localization of data and control over access.

  • Specification and documentation. Capture explicit formal and informal representations of functional and quality-attribute requirements, architecture description, detail design commitments, rationale, etc.

  • Modeling and simulation. Many software projects fail because the consequences of early decisions are not understood until late in the process, when the costs of revising those decisions appear to be prohibitively high, leading to costly workarounds and acceptance of additional engineering risk. It may be perceived by project managers that evaluation cannot be done before code is written and can be run. In fact, a range of techniques related to modeling and simulation can be employed to achieve “early validation” of critical up-front decisions. These techniques include prototyping, architectural simulation, model checking of specifications, and other kinds of analysis.1

  • Coding. Adopt secure coding practices and more transparent structured coding styles that facilitate the various evaluative methods.

  • Programming language. Select languages that provide first-class encapsulation and controlled storage management.

  • Tooling. Support traceability and logging structures in tooling, providing direct (and ideally semantics-based) interlinking among related design artifacts such as architecture and design specifications, source code, functional specifications, quality-attribute specifications, test cases, etc.

  




Here are several illustrative examples of evaluative methods. These are applied throughout a lifecycle to assess various kinds of software artifacts.

  • Inspection of the full range of software-related artifacts, ranging from models and simulation results supporting requirements and architecture design to detailed design specifications, code, and test cases.

  • Testing of code with respect to function, performance, usability, integration, and other characteristics. Test cases can be developed to operate at the system level, for example, simulating web-browser clients in testing e-commerce or other web services systems, or they can operate on code “units” across software interfaces to test aspects of component behavior. Test cases are selected according to a combination of coverage strategies determined by architecture and ecosystem, software design, programming language choice, potential operational risks, secure coding practices, and other considerations.

  • Direct analysis of source, intermediate, or binary code, using sound tools that target particular quality attributes and heuristic tools that address a broader range of quality attributes.

  • Monitoring of operational code and dynamic analysis of running code, focused on particular quality attributes. As with testing, monitoring can operate at the system level, including logging and event capture, as well as at the unit level, such as for transaction and other internally focused event logs. Monitoring supports prevention, evaluation, and also forensics after failures occur. Infrastructure for monitoring can support a range from real-time to short-time delayed to forensic analyses of the collected event data. In the absence of other feedback loops, this can assist in focusing attention on making repairs and doing rework.

  • Verification of code against specifications. A number of formal “positive verification” capabilities have become practical in recent years for two reasons: First, scalability and usability are more readily achievable when verification is targeted to particular quality attributes.2 Second, new techniques are emerging, based on model checking or sound analysis, that support this more targeted verification without excessive requirements for writing formal specifications and assertions in code.

Various process models have been proposed that provide a framework within which these various preventive and evaluative methods can be applied in a systematic fashion, structured, as it were, within Observe-Orient-Decide-Act (OODA) loops of various durations. Two of the most prominent are the Lipner-Howard method (the SDL, or Security Development Lifecycle) and the method proposed by McGraw.

  

1 Daniel Jackson’s Alloy model checker is an example of an early validation technique for specifications. Daniel Jackson and Martin Rinard, 2000, “Software Analysis: A Roadmap,” in The Future of Software Engineering, Anthony Finkelstein, ed., New York: ACM, pp. 215-224.

2 An example is the Microsoft Static Driver Verifier tool developed by Tom Ball for verifying protocol compliance of Windows device driver code using model checking. See Michael Howard and Steve Lipner, 2006, The Security Development Lifecycle: A Process for Developing Demonstrably More Secure Software, Redmond, WA: Microsoft Press.

Judgments Are Based on Chains of Evidence

The goal of assurance methods is to create connections, a set of “chains of evidence” that ultimately connect the code that executes with architectural, functional, and quality requirements. The creation of these chains is necessarily an incremental process, with “links” in the chains being created and adapted as the development process proceeds. An example of a link is a test case that connects code with a particular expectation regarding behavior at an internal software interface. Another link, perhaps created using model-based analysis techniques, would connect this specific interface expectation with a more global architectural property. Another link is the connection of a fragmentary program annotation (“not null”) with the code it decorates. A further link would connect that global architectural property with a required system-level quality attribute. Validation of this small chain of links could come from system-level testing or monitoring that provides evidence to support presence of the system-level quality attribute.
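As a small, hypothetical illustration of such links expressed directly in code (the names are invented, and a JUnit-style test is assumed), an informal interface annotation records a local expectation and a unit test ties running code to that same expectation; further links, established by analysis or system-level testing, would connect the expectation to an architectural property and a system-level quality attribute:

    import org.junit.Test;
    import static org.junit.Assert.assertNotNull;

    // Hypothetical example of two "links" in a chain of evidence.
    public class RouteLookupTest {

        // Link 1: an annotation-style comment records a local expectation
        // ("never returns null") that inspection or analysis can check.
        interface RouteTable {
            /* @return the route for the destination; never null (falls back to a default) */
            Route lookup(String destination);
        }

        static final class Route {
            final String gateway;
            Route(String gateway) { this.gateway = gateway; }
        }

        static final class DefaultingRouteTable implements RouteTable {
            public Route lookup(String destination) {
                return new Route("default-gateway");  // satisfies the stated expectation
            }
        }

        // Link 2: a test case connects executing code to the same interface expectation.
        @Test
        public void lookupNeverReturnsNull() {
            RouteTable table = new DefaultingRouteTable();
            assertNotNull(table.lookup("unknown-destination"));
        }
    }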

This metaphor is useful in highlighting several significant features that influence assurance practice and the cost and potential to achieve high levels of assurance. Here are some examples of influences on the success of assurance practice:

  • There is a great diversity of the particular kinds of attributes that are to be assured. These range from functional behavior, performance, and availability, to security, usability, and interface compliance for service application programming interfaces (APIs) and frameworks. The Mitre Corporation maintains a catalog, the Common Weakness Enumeration (CWE),8 that illustrates the diversity in its identification of more than 800 specific kinds of “software weaknesses.”

  • There is also a great diversity of kinds of artifacts that must be linked in the chains. These include code, design models, architectural models, and specifications of functional and quality requirements. These also include more focused artifacts such as individual test cases, inspection results, analysis results, annotations and comments in code, and performance test results.

  • There is a range of formality among these artifacts—some have precise structure and meaning, and others are informal descriptions in natural language or presented as diagrams. (This issue is elaborated below.)

  • Components and services encompassed in a system may have diverse sources, with varying degrees of access to the artifacts and support/cooperation in an overall assurance process. Identification and evaluation of sources in an overall supply chain is a significant issue for cybersecurity (see Box 4.1), for which both provenance (trust) and direct evidence (verification) are considerations that influence the cost and effectiveness of an assurance process.

  • Many different kinds of techniques must be employed to assess consistency among artifacts and to build links in the chain. The most widely used are testing and inspection. Other techniques that are increasing in importance include modeling and simulation (e.g., for potential architecture choices), static analysis, formal verification and model checking (for code, designs, specifications, and models), and dynamic analysis and monitoring (for code, design, and models).

  • Some techniques are based not on reasoning about an artifact or component, but on safely containing it to insulate system data and control flow from adverse actions of the component. Techniques include sandboxing, process separation, virtual machines, etc.9

  • Different links in the chain may have different levels of “confidence,” with some providing (contingent) verification results and others providing more probabilistic outcomes that may (or may not) increase confidence in consistency among artifacts. Test coverage analysis, for example, can be used to assess the appropriate degree to which a particular set of test results may be generalized to give confidence with respect to some broad assurance criterion.

  • Methods or their implementations may be flawed or implemented in a heuristic way that may lead to false positives and/or false negatives in the process of building chains.

Perhaps most importantly, the cost-effectiveness of activities related to software assurance is heavily influenced by particular choices made in development practice—factors that are in the control of developers, managers, and program managers. Here are examples of factors that influence the effectiveness and cost of both preventive and evaluative methods:

  • In assurance activities, access is provided not only to source code, but also to specifications, models, and other documentation. Without this information, evaluators must expend resources to “reverse engineer” design intent on code produced within their own organization and create these intermediate models. In the 1980s, a study suggested that, in fact, the DoD spends almost half of its post-deployment cost (47 percent) reverse engineering its own code.10 Of course, this reverse engineering was for diverse purposes, but it illustrates the failure of documentation and traceability.

  • Traceability exists among the diverse software artifacts, including code and the various model and documentation components. The goal, as noted above, is to ultimately connect code with architectural, functional, and quality requirements. In some software engineering groups, evaluators ignore documentation on the premise that it is easier to reverse engineer the code being evaluated (but see above). That is, while the artifacts exist, traceability is lacking, making it difficult both to locate the correct document in a sea of documentation and to verify that the description in the document remains current with as-built code. Modern team-based software tooling has produced a revolution in traceability and logging—for example, each line of code in modern tool-enhanced code bases can have direct links to its complete history, including which developers have “touched” that line of code and for what purpose.

  • Choices are made regarding architecture, design, and coding that facilitate more definitive evaluation outcomes. These choices relate to formality, explicit complexity in structure, and information hiding and modularity, as well as to the characteristics of possible executions of the code. For example, distributed and concurrent systems can, for an unchanging input, exhibit different behaviors with each run. This is due to the asynchrony often characteristic of concurrent execution. When errors are unlikely but possible, testing and even inspection may not offer sufficiently useful results.11

  • Product-line and ecosystems choices can enable leveraging of assurance activity across multiple projects. This benefit is proportional, however, to the extent that assurance techniques can be composed, which in turn is enabled by our ability to model assurance-related attributes at component or protocol interfaces. (This is a research issue identified in Chapter 5.)

  • Choice of programming language (and coding style) can significantly influence ability to assure. Highly complex “tangled code” in a language such as C (which lacks first-class encapsulation and controlled access to storage) may present formidable barriers to evaluative methods in achieving confident assurance judgments when compared, for example, with well-structured programs in modern languages such as C#, Java, and Ada that have comparable functionality.12 In these latter cases, “well-structured” means two things: First, modular structures can be crafted using modern type systems to replace tangled complexity with organization. Second, intrinsic support for information hiding and encapsulated data simplifies the structure of the various links in the chain of evidence that need to be constructed.
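As a small, hypothetical contrast (not drawn from the report), the Java fragment below shows what first-class encapsulation buys an evaluator: every update to the odometer value passes through one short method, so the argument that the reading never decreases is local to this class rather than spread across every file that might touch a shared global variable.

    // Hypothetical illustration: encapsulation keeps the evidence local.
    public final class Odometer {

        private long kilometers = 0;  // not visible outside this class

        // The single point of update; an inspection, a test, or a static analysis
        // only needs to examine this method to argue the reading never decreases.
        public void advance(long distance) {
            if (distance < 0) {
                throw new IllegalArgumentException("distance must be non-negative");
            }
            kilometers += distance;
        }

        public long reading() {
            return kilometers;
        }
    }

In a C code base where the counter is a global written from many translation units, the same argument requires examining every write site, which is the kind of tangled evidence burden described above.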

All evaluative methods are challenged by the difficulty of defining the scope of the operating environment that may be delineated as the “boundaries” for evaluation.13 Unanticipated features of the operational environment that affect system operation may influence not only hazards, but also the validity of requirements. An example of such a scoping error occurred during a test of an F-22 that originated at Edwards Air Force Base and flew to an altitude where it became exposed to the many radio emitters in the Los Angeles basin. This was the first such intensive radio exposure in the test process for the jet, despite much testing. Unexpectedly, the software concluded that the jet was under attack, and it went into an electronic defensive mode. The crew was forced to shut down all functions to prevent unintended consequences. This experience led the F-35 Joint Strike Fighter developers to test all their software in a fully realized flying testbed well before the actual fighter was flown. This flying testbed is now one of many steps in a highly comprehensive (and expensive) process of operational testing in support of acceptance.

8

The CWE inventory (available online at http://cwe.mitre.org/) focuses primarily on security-related attributes. See also, for example, Robert C. Seacord, 2005, Secure Coding in C and C++, Boston: Addison-Wesley, for an inventory of potential issues related to not only secure, but also safe and high-quality code. There is substantial overlap of attributes related to safe and quality coding, on the one hand, and security, on the other.

9

Use of these containment or isolation techniques may create benefits for components that are opaque (some vendor executables, for example) or that are difficult to assure intrinsically (mobile code and scripts in a web services environment, for example). But there are also potential hazards associated with the containment infrastructure itself (such as a virtual machine or a web-client sandbox), which must often also be assured to a high level of confidence.

10

See Girish Parikh and Nicholas Zvegintzov, eds., 1983, Tutorial on Software Maintenance, Silver Spring, MD: IEEE Computer Society Press. See also Center for Software Engineering at the University of Southern California (USC), 2000, “COCOMO II,” Los Angeles: University of Southern California. Available online at http://csse.usc.edu/csse/research/COCOMOII/cocomo2000.0/CII_modelman2000.0.pdf. Last accessed August 20, 2010.

11

An early example was the start-up failure in establishing communications among the five computers on the NASA Space Shuttle on April 10, 1981. Later investigation of the design showed that there had been a 1 in 67 chance that the computers would not synchronize. This meant that, in testing, there was a better than 98 percent chance that the error would not be observed. If there had been anticipation of stochastic phenomena, the error could have been found with more pervasive testing. But in practice this is infeasible for two reasons: (1) there are significant errors that might occur with much lower frequencies, and (2) there are too many different kinds of interactions that might prompt this kind of testing.

12

The performance gap between “lower level” languages such as C and modern encapsulation-based languages has generally been closed and, indeed, modern languages may offer better performance in many cases since runtime checks can be eliminated when static verification is achieved by compilers for typing and encapsulation properties, for example.

13

Michael Jackson, 1995, Software Requirements & Specifications: A Lexicon of Practice, Principles and Prejudices, Boston: Addison-Wesley.



Practices Influence Feasibility and Cost of Assurance

The examples above illustrate that development practices and technologies can profoundly influence the ability to achieve successful and cost-effective evaluation outcomes. These development choices range from architectural choices to choices of programming language and coding style. As noted above, complex tangled code is more difficult to evaluate than structurally simpler code, regardless of whether the evaluation is done using testing, inspection, static analysis, or model checking. It may be, for example, that a 2 percent performance improvement gained through additional complexity is not justified once the added evaluation costs are considered.14

One of the great benefits of modern tooling is that a much more comprehensive record of development can be created to facilitate evaluation. When more of the various development-related artifacts are formal (i.e., have precise structure and meanings), then tooling can be used to greater advantage in both prevention and evaluation (as well as in prototyping and other analogs of the modeling and simulation common in the development of physical systems). Degree of formality is an important characteristic of software-related artifacts, discussed at greater length below.


Finding 4-1: The feasibility of achieving high assurance for a particular system is strongly influenced by early engineering choices, particularly architectural and tooling choices.

Assurance Techniques and Results Can Benefit Developers Directly

Because of recent advances in traceability, evaluative techniques, and expressiveness of models, this record of artifacts associated with development is gaining considerable value in contributing to the creation of chains of evidence. When development teams see immediate benefits from the evidentiary material, they are naturally led to adopt a broader range of preventive practices to create additional links in the chain of evidence. It is increasingly apparent that modern assurance techniques can provide immediate benefits in the form of direct feedback loops and greater transparency in the processes implemented by small teams and even by individual developers. The techniques and associated models are also enablers of flexibility and evolution, which are essential in long-lived software systems of all kinds, because of the rapid changes in operational requirements, infrastructure, ecosystems, and underlying hardware capabilities.

SOFTWARE ASSURANCE FUNDAMENTALS

Software Reliability Is Different

Unlike other engineering materials, software does not wear out or suffer transient faults. But it can suffer transient errors, for example, because of concurrency (see Box 4.2). This is both an obvious and

14

Even if choices related to architecture and language affect performance or code size by observable constant factors, a Pareto principle suggests that this can be mitigated through performance-focused tuning of a small number of “hot spots” in code. This enables the benefits of superior structure to be realized without adverse performance cost. This point notwithstanding, the idea of a tradeoff of speed against structure and safety is not necessarily principled and may, in the long run, prove incorrect.


subtle point. It is obvious in the sense that there is no analog of metal fatigue, rust and oxidation, or other kinds of physical deterioration or environmentally induced change in physical properties. It is subtle because software is often the mechanism of choice for handling such faults in associated hardware, such as sensors and actuators in a robotic or cyber-physical system, or faults in underlying computing hardware such as processor chips, memory, and communication channels. When software delivers “bad results,” including transient errors, these are due to “permanently faulty” software design, which is addressed by changes in the software code—that is, a “new software design” in the sense of changing the mechanism that is implemented.
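
As a concrete illustration (a minimal sketch, not an example from the report), the following Java program contains a permanent design fault, an unsynchronized read-modify-write on shared state, whose visible failure is transient: some runs print the expected total and others print less, even though the code never changes.

    // Minimal sketch of a concurrency-induced transient error: the unsynchronized
    // increment is a permanent design fault, but the failure appears only on some runs.
    public class RacyCounter {
        private int count = 0;                    // shared state, no synchronization (the fault)

        private void increment() { count++; }     // read-modify-write is not atomic

        public static void main(String[] args) throws InterruptedException {
            RacyCounter c = new RacyCounter();
            Runnable work = () -> { for (int i = 0; i < 100_000; i++) c.increment(); };
            Thread t1 = new Thread(work), t2 = new Thread(work);
            t1.start(); t2.start();
            t1.join(); t2.join();
            // Expected 200000; lost updates may make the observed value smaller,
            // and whether they occur varies from run to run.
            System.out.println("count = " + c.count);
        }
    }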

Despite these differences, the terminology of reliability is usefully applied to software.15 The core of the terminology is four words: fault, error, failure, and hazard. These are defined and illustrated in Box 4.2.

Information Loss and Traceability

As noted above, the software engineering process is almost always characterized by cycles of information loss and recovery. Although code16 is all that is necessary for the software to operate, considerable additional information is needed to effectively support ongoing evolution of the software over its lifespan. Some of this information is formal—that is, its expressions are precisely structured and have exact meanings—while other information is “informal,” which typically means expressed in the form of natural language text, presentation charts, sketches, and informal diagrams. Examples of formal information include test cases, assertions in code, and certain kinds of design models such as unified modeling language (UML) StateCharts and formal structural architectural models (such as Acme). Examples of informal information include comments in code, design description and rationale, structured API documentation (such as Javadoc), and architecture and design diagrams such as those from UML.

Two small scenarios illustrate the value of this kind of information in the design process:

  1. A planning process for system enhancement leads to reconsideration of a principal architectural commitment such as choice of ecosystem, design of structural architecture, or choice of infrastructure components. Original designers and developers are sought out to help a new team of planners to understand elements of decision rationale for the as-built system, including other alternatives that were considered and why those choices were made.

  2. An internal algorithmic enhancement is made in a module that connects to the rest of the system through a software interface or a network protocol. Questions arise concerning particular “rules of the road” for that interface or protocol, and they can be resolved only through an examination of other modules in the system that operate through that interface or protocol. Other questions arise due to the possible dependency of client code on “accidental features” visible through an interface or protocol but not intended to be promised.

Software producibility is directly influenced not only by the extent of design-related information that is retained and managed, but also by the means by which this design-related information is represented.17 There are four dimensions of representation that are most significant. These are formality, modeling, consistency, and usability.

15

Daniel P. Siewiorek and Robert S. Swarz, 1998, Reliable Computer Systems: Design and Evaluation, Natick, MA: AK Peters, Ltd.

16

“Code,” in this context, includes both executable files and associated declarative configuration files such as the XML files often used in .Net and Java EE web systems.

17

The committee uses the phrase “design-related information” in a broad sense to include not only architectural and structural commitments, but also other commitments related to quality and functional attributes not otherwise explicit in the code itself.


Formality

When information is represented formally, tools not only can make maximum use of the meanings that are expressed but also can rely on those meanings as being exact. Tools can also use informal information, of course, but the inexactness of meanings limits their ability—and consequently the ability of software developers—to rely on any particular meaning as being correct. This suggests a strong bias for formality.18 The extent of formality (i.e., expressiveness of formal notations) is limited, however, by the state of practice regarding what we know how to express. Much of the advancement in programming languages (assembly language to C to C++ to C# and Java, for example) and design notations (informal ad hoc structural diagrams to the UML family, for example) is enabled by technical progress in the research community. One of the challenges is understanding what scope of the worldly context of operations must be modeled in order to support reasoning regarding the full range of functional and quality requirements.19 This is an area where there has been steady progress in research, along with significant influence of that research on practice.

Formal information can be very simple, such as references to version numbers, identifiers in defect databases, web links (URLs), or the extensive structured metadata in a defect management database. This illustrates the notion of partial formality, sometimes called “semiformal” or “semi-structured,” wherein formal information (such as web links) is embedded in informal texts, or vice versa (e.g., textual comments embedded in code). Defect databases offer another example: within the wealth of “formal” metadata there remains considerable latitude for informal expression—for example, the words used to describe the defect or the constituent messages in the “blog” record associated with the defect. Formality can also be semantically very “deep,” such as the temporal logic specifications used to express models for model-checking tools.20
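
A small sketch of such partial formality follows (the defect identifier, URL, and timing threshold are hypothetical). Informal rationale text carries formal anchors such as an issue identifier and a web link, while the executable assertion in the body is a fully formal fragment that tools can check or monitor.

    // Sketch of "partial formality": informal rationale text that carries formal
    // anchors (an issue ID, a URL) alongside a fully formal executable assertion.
    final class FreshnessCheck {

        /**
         * Rejects stale track reports.
         *
         * Rationale (informal text with formal anchors): see defect DEFECT-1742,
         * https://issues.example.org/DEFECT-1742, for why the threshold is
         * 5 seconds rather than 2.
         */
        static boolean isFresh(long reportTimeMillis, long nowMillis) {
            // Formal fragment: an executable assertion a tool can check or monitor.
            assert nowMillis >= reportTimeMillis : "report timestamp is in the future";
            return (nowMillis - reportTimeMillis) <= 5_000;
        }
    }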

A key insight is that any step from informal text to structured metadata facilitates traceability and analysis. These steps involve making structure more explicit and identifying precise meanings for the elements of the structure. This is not to say that all models should be formal—achieving formality can create constraints on flexibility and expressiveness. This is why there is so much partial formality. But it also reminds us that incremental steps can be made as research progresses.

Semantic expressiveness is a key distinguishing feature among programming languages, within which small steps can make a considerable difference. For example, the first-class typing of Ada, C#, and Java creates significant advantages for development teams in managing structural aspects of larger-scale systems, and particularly in ongoing assessment of consistency of as-built code with architectural specifications. The C language does not afford such advantages. Although the C++ language provides some of the benefits, it is possible to “bypass” the protection mechanisms in C++ programs and thus lose some of them. Much of modern programming language research concerns how to increase the expressiveness of type systems and other structuring mechanisms, both to facilitate more modular management of large code bases as they evolve and to support more concise expression of the abstractions represented in the code and their relationships.
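
The following sketch (illustrative names only) shows the kind of encapsulation that a type system such as Java’s enforces outright: client code can reach the balance only through the typed interface, so an analyzer can establish the invariant by examining this one class, whereas in C or C++ a cast could reach the underlying storage and silently invalidate that reasoning.

    // Sketch of encapsulation enforced by the type system: the invariant on the
    // balance can be verified locally because clients cannot reach the field.
    interface Account {
        void deposit(long cents);
        long balance();
    }

    final class CheckedAccount implements Account {
        private long cents = 0;   // not reachable from client code

        @Override public void deposit(long amount) {
            if (amount < 0) throw new IllegalArgumentException("negative deposit");
            cents += amount;      // invariant: cents never decreases via deposit
        }

        @Override public long balance() { return cents; }
    }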

18

This does not necessarily relate to “formal methods” as traditionally construed. See footnote 20 below. The idea of “formality” is about precision of structure and meaning—and even HTML tags confer a small increment of formality. This is distinct from many of the methodologically focused ideas proposed under the rubric of “formal methods” over the past four decades. Much of the recent success of mathematically based approaches that build on the tradition of formal methods has been in areas often called “lightweight formal methods”—approaches that trade scope and generality for scalability and ease of use. These more scalable approaches include model checking, sound static analysis, and some approaches based on assertion-passing verification. Because they focus more narrowly on particular quality or functional attributes, these approaches have achieved success in professional development practice. An example is Microsoft’s use of diverse analysis tools such as SLAM, PreFast, Spec#, and others.

19

This issue is explored at length in Michael A. Jackson, 2001, Problem Frames: Analysing and Structuring Software Development Problems, Boston: Addison-Wesley.

20

The term “formal methods” refers to techniques for reasoning about code or design models, generally focusing on logical relationships between specifications and the code or models.


Modeling

From the standpoint of assurance, models of all kinds—architecture, design, performance, structural, and semantic—form the intermediate way-points that facilitate linking (in the chain of evidence) of executable code with requirements of various kinds. The way-points include “domain-oriented” models related to requirements.21 The UML family of design models includes models that are more formal, such as StateCharts, and others that are less formal, such as deployment diagrams. The advantage of the more formal models is that there is more that tools can do to support traceability and analysis. StateCharts has a precise semantics rooted in state machines, which enables creation of a range of tools for analysis, simulation, consistency checking with code, and the like.
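
As a minimal sketch of what “precise semantics rooted in state machines” buys (a flat state machine, far simpler than a full StateChart; the modes and events are hypothetical), the model below makes states and transitions explicit data that a tool can simulate, check for unreachable states or forbidden transitions, or compare against the implementing code.

    // Illustrative flat state machine: states and transitions are explicit,
    // analyzable data rather than behavior buried in control flow.
    enum Mode { IDLE, ARMED, FIRING }

    final class LauncherModel {
        // Transition function: returns the next mode for an event, or null if the
        // event is not permitted in the current mode (an analyzable property).
        static Mode next(Mode current, String event) {
            switch (current) {
                case IDLE:
                    return "arm".equals(event) ? Mode.ARMED : null;
                case ARMED:
                    if ("fire".equals(event))   return Mode.FIRING;
                    if ("disarm".equals(event)) return Mode.IDLE;
                    return null;
                case FIRING:
                    return "safe".equals(event) ? Mode.IDLE : null;
                default:
                    return null;
            }
        }
    }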

There are benefits, of course, when models can not only support the software development process and management of engineering risks (e.g., through simulation and analysis), but also facilitate the activities related to assurance. Many of the topics identified in Chapter 6 relate to modeling and the use of models for various purposes.

Tools such as model checkers and static analysis tools are informed by formal specification fragments, which are a kind of model. These are sometimes expressed in self-contained specifications (e.g., linear temporal logic specifications or Alloy specifications for model checkers) and sometimes use fragmentary annotations associated with code or models. Some verification tools make use of highly expressive specification languages for functional properties.

In general there is an advancing frontier from informal to formal models—more precisely, from less formal to more formal models—and modern tooling is creating momentum to push this frontier more rapidly and effectively. Chapter 5 discusses research goals related both to advancing modeling and specification capability and to improving techniques and tools for reasoning and analysis. Examples include techniques ranging from theorem proving, model checking, and analysis to type modeling and checking, architectural and design analysis, and analyses related to concurrency and parallelism. Much of the recent progress in program analysis, which is particularly evident in certain leading vendor development practices, is built on these ideas.

Consistency

Information in a software development process is gathered incrementally over time. Almost always, systems are evolving and so are detailed choices regarding architecture, requirements, and design. A seemingly unavoidable consequence is a loss of consistency within the database of information captured over time. Indeed, developers often set aside documents and model descriptions, and resort to interviewing colleagues and doing reverse engineering of code in order to develop confidence in the models they are building or evolving. Precision in models (formality) can be useful in achieving consistency when tools can be used to analyze consistency on an ongoing basis. Tool use ranges from maintenance of batteries of regression tests to the use of verification and analysis tools to compare code with models. With both formal and informal information, explicit hyperlinking can expose interrelationships to developers and enable them to more readily sustain consistency among design artifacts.

Extensive hyperlinking is a feature of modern development tools, including team tools and developer tools. It is an essential feature, for example, of modern open-source development and build environments.22 With automated tools, a very fine granularity can be achieved without adding to developer effort. For example, an open-source developer can check in code by submitting a simple “patch” file,

21

Requirements always start with informal articulations that are made precise and potentially formal (in the sense of this chapter) through the development process. One of the great benefits of high-quality models for requirements and associated domain concepts is the opportunity for early validation. These models can include scenarios, use cases, mock-ups, etc.

22

Linking and other kinds of support for traceability are supported in most commercial development tools and in high-end open-source ecosystems. An example that can be readily explored is the Mozilla development ecosystem—see, for example, code and tools at https://hg.mozilla.org/mozilla-central.


and from this the tools can update the information database in a way that shows the identity of the developer who last changed every individual line of code, along with some informal and semi-formal rationale information such as a reference to a file version number and an identifier from the issue/defect database.

Usability

Even the highest quality information does not add value if it is not readily accessible to, and usable by, the key stakeholders in the software development process—developers, managers, evaluators, and others. With respect to search, for example, there are enormous differences in efficiency between traditional paper documents and electronic records. Augmenting search with linking and with direct support for anticipated workflows is another large step in efficiency. Choice of representation for expressing design information and models can also make a significant difference—“developer-accessible” notations can reduce training requirements and lower barriers to entry for developers to capture information that otherwise might not be expressed at all.

Indeed, we can contemplate a concept of “developer economics” that can be used as a guide for assessing potential motivation of individual developers in using assurance-related tools. An example of bad developer economics is when a developer or team is asked to devote considerable time and effort to expressing design information when payback is uncertain, diffuse, or most likely far in the future. A goal in formulating incentive models that motivate developer effort (beyond management or contractual mandates) is to afford developers increments of value for increments of time invested in capturing design information, and to provide that value as soon as possible after the effort has been invested. Thus, when a developer writes a single unit test case, it becomes possible both to execute that test case right away on an existing small unit and to validate the test case against other design information (and to capture links with that design information to support consistency). This “early gratification incrementality” can be a challenge to achieve for certain kinds of tools and formal documentation, however. Success in achieving this “early gratification” is one of the reasons why unit testing has caught on, and model checking and analysis are also emerging into practice.23
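
A minimal sketch of this incrementality (JUnit 4 is assumed as a representative test framework; the unit under test is hypothetical): a developer writes one small test, runs it immediately against one small unit, and the test itself becomes a durable, re-executable link in the chain of evidence.

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    // Tiny unit under test: clamps a raw reading into a percentage range.
    final class Clamp {
        static int toPercent(int value) {
            return Math.max(0, Math.min(100, value));
        }
    }

    // One unit test, runnable the moment it is written.
    public class ClampTest {
        @Test
        public void valuesAreKeptWithinRange() {
            assertEquals(100, Clamp.toPercent(250));
            assertEquals(0, Clamp.toPercent(-3));
            assertEquals(42, Clamp.toPercent(42));
        }
    }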


Finding 4-2: Assurance is facilitated by advances in diverse aspects of software engineering practice and technology, including modeling, analysis, tools and environments, traceability, programming languages, and process support. Advances focused on simultaneous creation of assurance-related evidence with ongoing development effort have high potential to improve the overall assurance of systems.

CHALLENGES FOR DEFENSE AND SIMILAR COMPLEX SYSTEMS

Hazards

The extent and rigor adopted for an evaluation process are most directly influenced by the potential hazards associated with the intended operational environment. Missile launch control, cryptographic tools, infusion pumps for medication administration, automobile brake systems, and fly-by-wire avionics are all “critical systems” whose design and construction are profoundly influenced by considerations of evaluation and assurance. For many critical systems, standards have been established that regulate various aspects of process, supply-chain decisions, developer training and certification, and evaluation. These standards are ultimately directed toward assurances regarding quality attributes in running code. From the particular perspective of assurance, any focus on aspects other than the intended delivered

23

Difficulty in achieving this kind of incrementality has been a challenge to the adoption of emerging prototype functional verification systems.


code (and its associated chains of evidence) is intended either as a predictor of ultimate code quality or, often, as a surrogate for direct evaluation of some critical quality of that running code. The latter approach is often used as a “work-around” when direct evaluation is thwarted by the raw complexity of the system or the inadequacy of methods and tools available for direct evaluation.

Indeed, system managers often feel that they face an uncomfortable tradeoff between enhancing the capability of a system and delivering a high level of assurance. This folkloric “quality-capability tradeoff” is particularly challenging because it may be difficult to know exactly where on the quality axis a particular design is likely to reside. Greater incentives for quality have had the effect of “pushing outward” this tradeoff curve for both preventive and evaluative methods. This observation explains, for example, why vendors such as Microsoft have made such a strong commitment to advancing in all areas of prevention and evaluation: doing so enables them to offer simultaneous increases in quality and capability.

Capability and Complexity

A major complicating factor in software assurance for defense is the rapid growth in the scale, complexity, and criticality of software in systems of all kinds. (This is elaborated in Chapter 1.) This growth adds to both factors in the risk product: extent of consequence (hazard, due to the growing criticality of software systems, and cost of repair, due to the growing significance of early commitments) and potential for consequence (due to complexity and interlinking with other systems). The transition to fly-by-wire aircraft, which was for many years hotly debated, is an example of the growing consequence of software. In the commercial world, we are now analogously moving to “drive-by-wire” vehicles, where the connections between brake and accelerator pedals and the respective mechanical actuators are increasingly computer mediated. The benefits are significant, in the form of anti-lock braking, cruise control, fuel economy, gas/electric hybrid designs, and other factors. But so are the risks, as documented in recent cases regarding software upgrades for the brake mechanisms of certain Toyota and Ford vehicles.

The risks of fly-by-wire were demonstrated when an F-22 pilot had to eject from his aircraft (which eventually crashed) because, due to an unexpected gyro shutdown, he had no ability to control the aircraft from the cockpit. He realized this only after takeoff, when the aircraft initiated a series of uncommanded maneuvers. In modern fighters, if the Vehicle Management System (VMS) computers are lost, so is the aircraft.

As noted in National Research Council reports, more constrained domains such as medical devices and avionics benefit from rigorous standards of quality and practice such as DO-178B.24 These standards prescribe specific documents, process choices (including iterative models), consistency management and traceability practices, and assurance arguments (“verification”) that include various links of the chain, as described earlier in this chapter. These approaches are extremely valuable, but they also appear to be more effective in domains with less diversity and scale than is experienced in DoD critical systems.

Complexity and Supply Chains

An additional complicating factor in software assurance for defense is the changing character of the architecture and supply structure for software systems generally, including defense software systems. The changes, which are enabled by advances in the underlying software technologies, particularly related to languages, tools, and runtime architectures, allow for more complex architectures and richer and more diverse supply chains. Even routine software for infrastructure users such as banks, for example, can involve dozens of major modules from a similar number of vendor and developer

24

NRC, Daniel Jackson, Martyn Thomas, and Lynette I. Millett, eds., 2007, Software for Dependable Systems, Washington, DC: National Academies Press. Available online at http://www.nap.edu/catalog.php?record_id=11923. Accessed August 20, 2010.


organizations, as well as custom software components developed by multiple independent in-house development teams. This is in addition to the challenge, particular to defense and government, of the customer and key stakeholders working at arm’s length from the development teams.

When systems are modular and component-based, there are sometimes opportunities to structure the assurance task in an analogously modular fashion. Unfortunately, many critical software attributes do not “compose” in this fashion, but there are some that do. For example, type correctness of software in modern languages such as Java, C#, and Ada is composable, which permits separate compilation of distinct modules. But without composability, the problem of creating “links” in the assurance chain can rapidly become intractable. Composability is therefore an important goal in the design of models, languages, and analysis capabilities.
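
A brief sketch of what composability looks like at the language level (the interface and class names are hypothetical): the client below type-checks against the interface alone, so it can be compiled and analyzed separately from whichever supplier’s implementation is eventually linked in.

    // Sketch of composable type correctness: the client module depends only on
    // the interface, so its assurance argument need not be redone per implementation.
    interface Crypto {
        byte[] encrypt(byte[] plaintext);
    }

    final class ReportSender {
        private final Crypto crypto;

        ReportSender(Crypto crypto) { this.crypto = crypto; }

        byte[] prepare(byte[] report) {
            // Type checking here relies only on the Crypto interface, not on any
            // particular supplier's implementation module.
            return crypto.encrypt(report);
        }
    }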

Additionally, modern systems make greater use of complex software frameworks and libraries. This is a great success in reuse, but there is also great complexity. Frameworks provide aggregate functionalities such as graphical user interaction, application server capability, web services support, mobile device capabilities, software development environments, enterprise resource planning (ERP), and the like. These frameworks embody many of the technical commitments associated with the ecosystems described in Chapter 1, and they now appear ubiquitously in larger-scale commercial applications. A framework is different from a library, roughly, because it embodies greater architectural commitment, including the structure of its associated subsystems, patterns for the flow of control, and representations for key data structures. This approach, which is enabled by modern object-oriented technology and languages, greatly reduces engineering risk for framework users, because the established frameworks embody proven architectures. But it does create some assurance challenges due to the complexity of the relationships among the framework, its client code, and potentially framework add-ins that augment capability in various ways.
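
The control-flow distinction can be sketched in a few lines (illustrative names only). With a library, application code calls in and keeps control; with a framework, the application supplies add-ins through an extension interface and the framework decides when they run, which is precisely why assurance must consider the relationships among the framework, its client code, and any add-ins.

    // Sketch of the framework-versus-library distinction via inversion of control.
    interface MessageHandler {                 // extension point defined by the framework
        void onMessage(String message);
    }

    final class MiniFramework {
        private final java.util.List<MessageHandler> handlers = new java.util.ArrayList<>();

        void register(MessageHandler h) { handlers.add(h); }

        // The framework, not the application, decides when registered handlers run.
        void dispatch(String message) {
            for (MessageHandler h : handlers) {
                h.onMessage(message);
            }
        }
    }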

Frameworks and Components

The success of component-based architectures, libraries, and frameworks has led to larger and more capable software applications that draw from a much greater diversity of sources for code. This is a mixed blessing. On the one hand, highly capable and innovative applications can be created largely by selecting ecosystems and assembling components, with a relatively small proportion of new custom design and code development. Often the overall architecture can be highly innovative, even when it incorporates subsystems and components drawn from established ecosystems. This approach is particularly well suited to incremental methods that facilitate accommodation of the refresh cycles for the various constituent components. It also facilitates prototyping, because functional capabilities can often be approximated through the assembly process, with additional custom code added in later iterations to tailor to more detailed functional needs, as they become better understood.

Trust

This model, while attractive in many respects, poses significant challenges for assurance. Because there are diverse components from diverse sources, there will necessarily be differences in the levels of trust conferred on both components and suppliers. This means that, in the parlance of cybersecurity, there are potential attack surfaces inside as well as outside the software application and that rigorous defense must be supported at the interfaces within the application. In other words, the new perimeter is within the application rather than around it or its platform. This can imply, for example, that the kinds of architecture analyses alluded to in Chapter 3 that relate to modularity and coupling may also be useful in assuring that there is no “connectivity” among components in a system (e.g., involving access to data or control of resources) other than that intended by the architects.
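
A minimal sketch of defense at an internal interface (the names and policy are hypothetical): rather than handing a less-trusted component the data store itself, the architecture hands it a mediating facade that enforces which records are reachable, so the only connectivity is the connectivity the architects intended.

    // Sketch of an internal perimeter: the less-trusted component sees only this
    // facade, which checks each access to the sensitive store.
    interface RecordReader {
        String read(String recordId);
    }

    final class GuardedReader implements RecordReader {
        private final java.util.Map<String, String> store;
        private final java.util.Set<String> permittedIds;

        GuardedReader(java.util.Map<String, String> store, java.util.Set<String> permittedIds) {
            this.store = store;
            this.permittedIds = permittedIds;
        }

        @Override public String read(String recordId) {
            if (!permittedIds.contains(recordId)) {
                throw new SecurityException("access to " + recordId + " is not permitted");
            }
            return store.get(recordId);   // only the intended connectivity is reachable
        }
    }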

This new reality for large systems poses great challenges for assurance, because of the potentially reduced ability to influence the many sources in the supply chain and also because of the technical


challenges of composing assessment results for individual components and subsystems into aggregate conclusions that can support an assurance case.

Vendor components are very often accepted on the basis of trust and expectations rather than direct analysis. There are both technical and legal barriers to direct analysis that often thwart the ability of the DoD to make sound assessments that can lead to reliable conclusions regarding assurance. There are several options in these cases. One is to employ a formal third-party assessment process such as Common Criteria (ISO 15408), which is in fact derived from the old “Orange Book” process defined in the early 1980s. These processes can be expensive and can create delay.25 Additionally, results can be invalidated when components must be configured, plug-ins are added, or other small changes are made such as adding device drivers to an operating system configuration. There has been much consideration of alternate approaches to such assessments. (Detailed consideration of this issue is beyond the scope of this report, but consideration is given in the referenced DSB report.26)

TWO SCENARIOS FOR SOFTWARE ASSURANCE

To illustrate evaluative techniques and the value of preventive techniques when software is developed at arm’s length, the committee presents two speculative scenarios for software assurance. In the first scenario, evaluators are given full access to an already existing software system that is proposed for operational release. The access includes source code for all custom development as well as all associated development documents. The evaluators also have access to threat experts, and they may have the opportunity to interview members of the development team. In the second scenario, a similar system is developed, but evaluators have access to the development team from the outset of the project, and the development team leaders have specific contractual incentives to obtain favorable judgments of high assurance.

The first scenario, which is fully after the fact, may be read as a strawman for the second and more desirable scenario. Unfortunately, an after-the-fact response such as sketched in the first scenario is all too often called for in practice—and indeed in some cases may be optimistic due to the opacity of many code and service components.

First Scenario—After the Fact

In the informal narrative below, the committee starts with the first scenario and then (under the same paragraph headings) explores the potential benefits of the greater access in the second scenario.

  • Hazard and requirements analysis. The first step for the evaluators is to engage with the threat experts and the operational stakeholders for the purpose of identifying the key hazards. These could include hazards related to quality attributes: security hazards (e.g., confidentiality, integrity, and access in some combination), safety hazards (e.g., related to weapons release), and reliability and performance hazards. This will include identification of the principal hazards relating to functional attributes—correctness of operation, usability and ergonomic considerations, and compliance with interoperation requirements

25

The Common Criteria standard (ISO 15408) is generally considered to be more successful for well-scoped categories of products such as firewalls and other self-contained devices—as contrasted with general-purpose operating systems, for example. Success with Common Criteria is also challenged by dynamic reconfiguration, such as through dynamically loaded libraries, device driver additions, and reconfiguration of system settings by users and administrators. Additionally, much of the evaluation undertaken through the Common Criteria process is focused on design documents rather than on the code to be executed. There may be no full traceability of executing code corresponding to the evaluated design documents.

26

DSB, 2007, Report of the Defense Science Board Task Force on Mission Impact of Foreign Influence on DoD Software, Washington, DC: Office of the Under Secretary of Defense for Acquisition, Technology, and Logistics. Available online at http://stinet.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA473661. Last accessed August 20, 2010.


and, more generally, with standards associated with interlinked systems (ultra-scale, net-centric, system of systems, etc.).

  • Architecture and component identification. The system and its associated documents are then analyzed to determine the overall intended and as-built system architectures. The intended architecture may not correspond exactly to the as-built, but it should be as close as possible, with deviations plausibly explainable as design or coding defects. As part of this process, the internal component structure of the system is modeled, including the adoption of off-the-shelf components and frameworks from established ecosystems. For example, if the system uses web capabilities, then there will likely be major subsystems implemented as configured vendor frameworks. The result of this step is an architectural model, an identification of the principal internal interfaces that mediate interactions among components (frameworks, libraries, local services, network-accessed services, custom components, etc.), and an identification of significant semantic invariants regarding shared data, critical process flows, timing and performance constraints, and other significant architectural features.27

  • Component-level error and failure modeling. If successful, the architectural analysis yields an understanding of principal constraints on the components of the system that relate to attributes such as timing, resource usage, data flows and access, user interaction constraints, and potentially many other attributes depending on the kind of system. This process, and also the architecture analysis process, is informed by documents and developer interviews.

  • Supply-chain and development history appraisal. Based on information regarding component sourcing and supply-chain management practices, levels of trust are assigned to system components. This will inform priority setting in assessment of the individual components. Custom components from less-trusted sources may merit greater attention, for example, than off-the-shelf commercial components from more trusted sources. A similar analysis should apply to services (e.g., cloud services, software-as-a-service capabilities, etc.). Open-source components afford visibility into code, rationale, and history. They may also afford access to test cases, performance analyses, and other pertinent artifacts. It is also helpful, from the standpoint of security threats (see Box 4.1), to assess detailed historical development data. This can include not only data regarding producer/consumer interfaces within the supply chain, but also, when possible, code check-in records from modern development databases (as captured in open-source systems such as SVN and CVS and in similar commercial products and services).

  • Analysis of architecture and component models. Proceeding on the (as yet unverified) assumption that component implementations are consistent with their constraints, the models at the granularity of architecture and component interactions can be subject to analysis. Because of the diversity of attributes of the models that can trace to the identified failures and hazards, multiple modeling exercises are likely to be undertaken, each focusing on particular attributes. When the models can be rendered formally, then tools for semi-automated analysis can be used for model checking, theorem proving, static analysis (at model level), simulation, and other kinds of mathematically based analysis. If certain models can be formalized only partially or not at all, then a more manual approach must be adopted to undertake the analysis.

  • Identify high-interest components. Component analyses can be prioritized on the basis of a combination of trust level (from the supply-chain analysis) and potential role with respect to hazards, or “architectural criticality.” Greater attention, for example, would be devoted to a component that handles sensitive information and that is custom developed by an unknown or less trusted supplier.

  • Develop a component evaluation plan. The evaluation plan involves allocating resources, setting priorities, identifying assurance requirements, and establishing progress measures on the basis of the analyses above.

  • Assess individual components. This can involve a combination of evaluative techniques. “Static”

27

This documentation, focused on succinct renderings of traceability and technical attributes, should not be confused with the “for the record” documentation often required with development contracts—which may be of limited value in an assurance exercise that relies on efficient tool-assisted evaluation.


techniques, which do not involve executing the code, include inspection (with design documents), sound static analysis, and heuristic static analysis. These analyses may involve the construction of various kinds of abstract models that can themselves be analyzed to assess various functional and quality attributes. This activity is facilitated when models can be made more formal—informally expressed models necessarily require people to make interpretations and assessments. The analyses may also involve “dynamic” techniques, which involve execution of the code, either in situ in the running system (analogous to in vivo testing in life sciences) or in test scaffolds (analogous to in vitro testing in life sciences). If the project had used unit testing, then scaffold code would be included in the corpus, and this could be adapted and reused. Dynamic methods also include dynamic analysis and monitoring and can be used to inform the development of static models to provide assurance in cases where this is significant—particularly concurrent and performance-sensitive code. The results of this assessment are in the form of an identification of areas of confidence and areas of remaining assessment risk with respect to the component interface specifications derived from the architecture analysis.

  • Select courses of action for custom components. On the basis of the identification of high-interest components and the component assessment results, specific options are identified for mitigation of the remaining assessment risks. These options could range from acceptance of the component (a positive assurance judgment) to wholesale replacement of the component. Intermediate options include, for example, containment (“sandboxing” the component behind a façade that monitors and regulates control and data flows, either within the process or in a separate process or virtual machine), refactoring, and other kinds of rework that might lead to more definitive assessment results. For example, simplification of code control structure and localization of state (data) can greatly facilitate analyses of all kinds. On the other hand, if there are major issues that afflict multiple components and the value is deemed sufficient, then this kind of refactoring and rework could be done at the architectural level, facilitating assessment for multiple components.

  • Select courses of action for opaque components and services. For opaque components (typically products from vendors), the options are more constrained. In these cases, the extent of the intervention may be influenced by the extent of trust vested in the particular vendor in its supply-chain role. When trust is relatively low, potential interventions include sandboxing (as noted above) and architectural intervention to assure that the untrusted component does not have access to the most sensitive data and control flows. Outsourced services, for example, can also be sandboxed and monitored. An alternative is to replace the component or to rework the arm’s-length contractual arrangements to facilitate access and evaluation.

  • Refine system-level assessment. On the basis of the results of the component assessments and interventions (where appropriate and practical), architecture-level refactoring can sometimes be considered as a means to improve modularity, isolating components for which high levels of assurance cannot be achieved. Most importantly, the architectural-level models should be reconsidered in the light of the information acquired and verified in the foregoing steps. This reconsideration should focus on the hazards, quality attributes, and functional requirements as identified in the initial steps. If the component- and architecture-level assurances do not combine to yield sufficient assurances for the hazards identified, then more drastic options need to be contemplated, including canceling the project, redefining the mission context to reduce the unaddressed hazards, revising initial thresholds regarding system risks, or undertaking a more intensive reengineering process on the offending components of the system and/or its overall architecture. As noted in Chapter 3, reworking architecture commitments at this late stage can be very costly, because there can be considerable consequent rework in many components.

This scenario is intended to illustrate not only the potential challenges in an evaluation process, but also some of the added costs and risks that arise when either preventive effort or evaluator involvement in the development phase is insufficient. In the second scenario, the committee briefly considers how these steps might be different were the evaluators and developers to work in partnership during the development process rather than after the fact.


Second Scenario—Preventive Practices

The steps are the same as those for the first scenario, but the descriptions focus on the essential differences from the after-the-fact scenario above. This scenario should make evident the value of incentives in the development process for “design for assurability.”

  • Hazard and requirements analysis. This step is similar, but performed as part of the overall scoping of the system. Because architecture is such a primary driver of quality attributes and assurance (as illustrated above), in this preventive scenario, a savvy manager would couple the architecture definition with the hazard analysis and, if possible, limit early commitment regarding specific functional characteristics to broad definitions of the “scope” of the system (see Chapter 2). At this stage, the first set of overall progress metrics is defined, and these could include credit to be allocated for resolving engineering risks associated with assurance. These metrics can also relate to compliance with standards associated with interlinked systems, as noted in the first scenario.

  • Architecture and component identification. As noted earlier, the architecture definition is coupled with hazard identification and scope definition. The exceedingly high engineering risk for assurance and architecture in the after-the-fact scenario (assuming innovative architectural elements are required) is replaced with an up-front process of architecture modeling, supported by various early-validation techniques such as simulation, prototyping, and direct analysis (such as with model checking). Certain detail-level architectural commitments can be made incrementally. Progress metrics related to assurance-related engineering risk are refined and elaborated.

  • Component-level error and failure modeling. A key difference is that the component-level modeling, combined with the supply-chain appraisal, provides an early feedback mechanism regarding engineering risks in the evolving architecture design. Risks can be assessed related not only to quality attributes and technical feasibility, but also to sourcing costs and risks. For example, choices might be made regarding opaque commercial components from a trusted source, custom components, wrapped untrusted components, and open-source components that afford stakeholders both visibility and the possibility of useful intervention (e.g., adding test cases, adapting APIs, adding features, etc.). This process can also lead to the early creation of unit test cases, analysis and instrumentation strategies, and other quality-related interventions in the component engineering process. Process metrics defined in earlier stages can inform allocation of resources in this stage of the process. The metrics are also refined as part of the incremental development process.

  • Supply-chain and development history appraisal. See above. The committee notes that it is sometimes asserted that offshore development is intrinsically too dangerous. However, one could argue that badly managed onshore development by cleared individuals may be more dangerous than offshore development with best practices and evidence creation along with coding. A well-managed offshore approach may be feasible for many kinds of components when elements of the evolving best practice are adopted, such as (1) highly modular architectures enabling simplicity in interface specifications and concurrent development, (2) unit testing, regression testing, and code analysis, with results (and tests) delivered as evidence along with the code, (3) frequent builds, (4) best-practice configuration control, and (5) agile-style gating and process management.28 Metrics can relate to a combination of adoption of best practices and production of separately verifiable evidence to support any assurance claims. As noted above, full line-by-line historical tracking of changes to a code base is now commonplace for development projects of all sizes. A key benefit of such tracking is that it provides full traceability not only among artifacts, but also to individual developers, which is useful for security and to assure that individual developers are fully up-to-date with best practices.

28

Michael A. Cusumano, Alan MacCormack, Chris F. Kemerer, and William Crandall, 2009, Critical Decisions in Software Development: Updating the State of the Practice, IEEE Software 26(5):84-87. See also Alan MacCormack, Chris F. Kemerer, Michael Cusumano, and Bill Crandall, 2003, “Trade-offs Between Productivity and Quality in Selecting Software Development Practices,” IEEE Software 20(5):78-85.

  • Analysis of architecture and component models. This becomes part of the iterative early-stage process of refining architecture, quality attribute goals, functional scoping, and sourcing. If there are portions of the configuration that may create downstream challenges for evaluators, this is the opportunity to revisit design decisions to facilitate evaluation. For example, an engineer might suggest a change in programming language for a component in order to get a 5 percent speed up. At this stage of the process, that proposal can be considered in the light of how it might influence assurance with respect to quality attributes, interface compliance, correct functionality, and other factors. The decision could be made not to change the programming language, but rather to incentivize the vendor to make the next set of improvements in its compiler technology. These decisions are made using a multi-criteria metric approach, with criteria and weightings informed by the earlier stages.

  • Identify high-interest components. Regardless of the front end of the process, there will be a set of high-interest components. Ideally, however, as a result of architecture decisions, the components in this category are not also opaque and untrusted. Regardless, components are prioritized on the basis of measured assurance-related engineering risk, with metrics as set forth in the earlier stages. This assessment will account for ongoing improvements in development technologies (e.g., languages, environments, traceability and knowledge management), assurance tools (e.g., test, inspection, analysis, and monitoring support), and modeling (for various quality attributes including usability).

  • Develop a component evaluation plan. Allocate resources, set priorities, and identify assurance requirements on the basis of the analyses above. In this preventive scenario, this plan is largely a consequence of the early decisions regarding architecture, sourcing, hazards, and functional scope. Metrics are defined for resolution of engineering risk in all components (but particularly high-interest components), so progress can be assessed and credit assigned.

  • Assess individual components. As above, this involves a combination of many different kinds of techniques. In the preventive scenario, component development can be done in a way that delivers not only code, but also a body of evidence including test cases, analysis results, in-place instrumentation and probes, and possibly also proofs of the most critical properties. (These proofs are analogous to what is now possible for type-safety and encapsulation integrity, a ubiquitous analysis that is composable and scalable.) This supporting body of evidence, delivered with the code, enables acceptance evaluators to verify claims very efficiently regarding quality attributes, functionality, or other properties critical to assurance. Metrics are developed to support co-production of component code and supporting evidence.

  • Select courses of action for custom components. See above.

  • Select courses of action for opaque components and services. For existing vendor components, the same considerations apply as in the previous scenario. If new code is to be developed in a proprietary environment, then there is the challenge of how to make an objective case (not based purely on trust) that the critical properties hold. Existing approaches rely on mutually trusted third parties (as in Common Criteria), but there may be other approaches whereby proof information is delivered in a semi-opaque fashion with the code.29 Additionally, the proprietary developer could develop the code in a way that is designed to operate within a sandbox, in a separate process, or in another container—in this approach, the design is influenced by the need to tightly regulate control and data flows in and out of the contained component. Metrics would weight various criteria, with a long-term goal of diminishing the extent of reliance on trust vested in commercial vendors in favor of evidence production in support of explicit “assurability” claims.

  • Refine system-level assessment. Given the high risks and costs of architectural change, in a preventive scenario, any adjustments to architecture are done incrementally as part of the overall process. Metrics would relate to the extent of architectural revisions necessary at each stage of the process.

29

There is a wealth of literature on proof-carrying code and related techniques.


Conclusion

A key conclusion from these scenarios is the importance of three factors: (1) the extremely high value of incorporating assurance considerations (including security considerations—see Box 4.1) into the full systems lifecycle, starting with conceptualization, throughout development and acceptance evaluation, and into operations and evolution; (2) the strong influence of technology choices on the potential to succeed with assurance practices; and (3) as a consequence, the value to DoD software producibility of enhancements to critical technologies related to assurance, including both what is delivered (programming languages, infrastructure) and what is used during development (models and analytics, measurement and process support, tools and environments).


Recommendation 4-1: Effective incentives for preventive software assurance practices and production of evidence across the lifecycle should be instituted for prime contractors and throughout the supply chain.


This includes consideration of incentives regarding assurance for commercial vendor components, services, and infrastructure included in a system.

As illustrated in the scenario, when incentives are in place, there are emerging practices that can make significant differences in the outcomes, cost, and risk of assurance. The experience at Microsoft with the Lipner-Howard Security Development Lifecycle (SDL)30 reinforces this—the lifecycle not only leads to better software but also incentivizes continuous improvement in assurance technologies and practices.

When ecosystems, vendor components, open-source components, and other commercial off-the-shelf (COTS) elements are employed, assurance practices usually require the DoD to continually revisit selection criteria and particular choices. The relative weighting among the various sourcing options, from an assurance standpoint, will differ from project to project, based on factors including transparency of the development process and of the product itself, either to the government or to third parties. This affords an opportunity to create incentives for commercial vendor components to include packaged assurance-related evidence somewhere between the two poles of “as is” and “fully Common Criteria certified.” Advancement in research and practice could build on ideas already nascent in the research community regarding ways that the evidence could be packaged to support quality claims and to protect trade secrets or other proprietary technology embodied in the components.


Recommendation 4-2: The DoD should expand its research focus on and its investment in both fundamental and incremental advances in assurance-related software engineering technologies and practices.


This investment, if well managed, could have broad impact throughout the DoD supply chain. When both recommendations are implemented, a demand-pull is created for improved assurance practices and technologies.


Recommendation 4-3: The DoD should examine commercial best practices for more rapidly transitioning assurance-related best practices into development projects, including contracted custom development, supply-chain practice, and in-house development practice.

30

Steve Lipner and Michael Howard, 2006, The Security Development Lifecycle: A Process for Developing Demonstrably More Secure Software, Redmond, WA: Microsoft Press.


Several leading vendors have developed explicit management models to accelerate the development of assurance-related technologies and practices, to validate them on selected projects, and to transition them rapidly into broader use.31

31

Microsoft is well known for its aggressive use of development practices including process (the Security Development Lifecycle (SDL) noted earlier; see http://msdn.microsoft.com/en-us/library/ms995349.aspx) and analysis tools (such as SLAM, PreFast, and others; see, for example, Thomas Ball, 2008, “The Verified Software Challenge: A Call for a Holistic Approach to Reliability,” pp. 42-48 in Verified Software: Theories, Tools, Experiments, Bertrand Meyer and Jim Woodcock, eds., Berlin: Springer-Verlag).
