| Copyright © 2009. National Academy of Sciences. All rights reserved. Terms of Use and Privacy Statement |
7
Reliability Concepts
The AAN battle force concept is predicated on all systems being highly reliable.
Furthermore, to reduce logistics demand the Army must make reliability an equal partner
with lethality, survivability, and mobility considerations. This chapter describes the
reliability concepts and technologies needed to develop AAN mission-reliable systems.
LOGISTICAL IMPLICATIONS OF HIGHLY RELIABLE SYSTEMS
Improving the reliability of the systems used by an AAN battle force will have a
multiplier effect on reducing logistics demands. This multiplier effect can be illustrated
by reviewing two functional requirements arising from the assumed AAN concept for
logistics support. First, as briefed to the committee, the AAN battle force will take no
separate maintenance and supply units into the area of operations. Second, a battle unit
support element (BUSE) located at the staging area will be responsible for rapid
refitting, refueling, and repair of battle force systems between combat pulses in
preparation for subsequent pulses. From these two functional requirements, it is clear
that decreasing the maintenance and spare parts needed for subsequent pulses is an
essential aspect of AAN reliability. Thus, the functional requirements can be used to
help define the essential degree of reliability and the things that are unnecessary or too
costly, even if they are desirable in principle.
The classic definition for the reliability of an item, system, or component is the
probability that it will operate successfully during its mission (see Box 7-1). A typical
AAN battle force mission will require systems that are at least reliable enough to meet
the two functional requirements described above. No doubt, other functional
requirements for AAN systems will also generate reliability requirements, but these two
basic requirements are sufficient to illustrate that (1) improving reliability reduces
logistics demand and (2) the reliability of any system is relative to the mission context in
which it is expected to operate.
Pulse-Reliable Systems
For the AAN assumption of no maintenance and repair support in the battle
space during an operational pulse, all systems must be reliable enough for commanders
to meet their force readiness and deployment timelines.
BOX 7-1 Classical Definitions of Reliability and Related Concepts
Reliability. The probability that an item, component, or system will
operate successfully during its mission.
Maintainability. The probability that an item, component, or system
will remain in a specified operational condition or can be restored to that
condition within a given period of time, when maintenance is performed
according to prescribed procedures and resources.
Availability. The probability that, at any random instant, an item,
component, or system will be in proper condition to begin a mission.
Durability. The probability that an item, component, or system will
successfully survive its projected service life, overhaul point, or rebuild
point without a catastrophic failure. (A catastrophic failure is a failure
that requires that the item, component, or system be rebuilt or replaced.)
For a combat operation to accomplish its objectives, all systems taken into the
area of operations must be close to fully operational and require neither repair nor
maintenance throughout the period of the operation. Systems must not fail during that
operational time period. Even in a dynamic, high-stress environment, all systems must
maintain a high level of operability, even if a part or component has been damaged or
malfunctions. For the purposes of this report, an AAN system is pulse-reliable if it meets
the following criteria:
• requires no maintenance or wear-related repair or replacement by external
logistics personnel for the duration of a combat pulse
• can continue to perform within minimum operational parameters even with
damage to subsystems
If all AAN systems are pulse-reliable, no maintenance and repair support
elements (troops, tools, or spare parts) would have to accompany the combat elements.
It follows that no fuel, food, water, or shelter would be needed for support elements, and
no combat capability would have to be diverted to protect them. No transport systems,
which have their own logistics requirements, would be needed to bring in and retrieve
the support elements. If commanders could rely on the pulse-reliability of their combat
systems, the effectiveness of a force of any given size would be increased for planning
purposes. In other words, a given objective could be met with a smaller force, requiring
proportionately less fuel, ammunition, etc. These indirect effects of pulse-reliable
systems are based on the assumption that every system taken into a combat operation
will perform successfully for the duration of the operation.
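The planning effect described above can be illustrated with a short calculation. The following Python sketch is not part of the committee's analysis. It assumes that every system survives a pulse independently and with the same probability r (an idealization), and the function names and example values (20 systems needed, 95 percent confidence) are hypothetical.

```python
from math import comb


def prob_at_least(k: int, n: int, r: float) -> float:
    """Probability that at least k of n independent systems, each with
    pulse reliability r, are still operational at the end of the pulse."""
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


def systems_to_deploy(k_needed: int, r: float, confidence: float) -> int:
    """Smallest force size n for which at least k_needed systems finish
    the pulse with the requested probability (assumed binomial model)."""
    if not 0.0 < r <= 1.0:
        raise ValueError("r must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    n = k_needed
    while prob_at_least(k_needed, n, r) < confidence:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical planning case: 20 working systems needed at pulse end.
    for r in (0.80, 0.90, 0.99):
        print(f"r = {r:.2f}: deploy {systems_to_deploy(20, r, 0.95)} systems")
```

Under these assumptions, the number of systems that must be taken into the area of operations shrinks toward the number actually needed as r approaches 1, which is the sense in which pulse reliability lets a smaller force meet a given objective.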
REDUCING THE LOGISTICS BURDEN FOR THE ARMY AFTER NEXT
Fast Refitting through Improved Maintainability
The pulsed operations of an AAN battle force will require that the refueling,
refitting, and repair phase of each cycle be short enough to keep an opponent off guard
and incapable of regrouping and responding effectively to the next combat pulse. AAN
systems will require maintenance and repair, but the speed with which a resuming battle
element can be reconstituted to a state of pulse-reliability and readiness will determine
how short this part of the cycle can be. In this context, the overall mission reliability of a
system depends on improved maintainability, including longer times between preventive
maintenance cycles, faster diagnosis and repair or replacement of parts, and preventive
or predictive (prognostic) maintenance, rather than reactive maintenance. This
performance goal will be called "fast refit."
The logistics impacts of fast refit include (1) fewer maintenance personnel in the
BUSE per unit of combat strength required for a given refitting (higher tooth-to-tail
ratio); (2) fewer spare parts per average combat-day, including fewer spare systems to
replace systems that cannot be repaired before the next pulse; (3) reduced logistics
burdens (fuel, food, water, and energy) at the staging area to support a smaller
maintenance element; and (4) simpler planning requirements to ensure that the BUSE
can sustain the war-fighting operations.
AAN Mission Reliability Versus Ultrareliability
The discussion above shows how the reliability for AAN missions, or AAN
mission reliability, can be analyzed into specific reliability requirements for systems,
such as pulse reliability and fast refit. These requirements can also be expressed as
objective, quantifiable measures, or performance metrics. A hypothetical example of a
metric for pulse reliability would be a 90 percent level of confidence that a system
verified as pulse reliable will be able to meet eight standards for full operating
performance throughout the duration of a 14-day pulse, with a 99 percent level of
confidence that it will perform at a degraded (but still operable) level of performance for
no more than two of those measures during a pulse. The fast refit requirement might be
expressed as an availability metric, such as a 98 percent probability that a given system
will be available for the next pulse (pulse reliable and ready) after 12 person-hours of
maintenance and refitting, if no battle damage was sustained in the previous operation.
The details of both of these hypothetical reliability metrics depend on the specifics of
how an AAN battle force would fight and be sustained for the duration of a campaign.
Exploring other AAN operational concepts will reveal additional elements of AAN
mission reliability.1
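The two hypothetical metrics above can be turned into rough design targets. The Python sketch below is illustrative only and rests on several assumptions that are not in the report: the stated confidence levels are treated as simple probabilities, failures occur at a constant rate (exponential time to failure), repair time is exponentially distributed, and the 12 person-hours are read as 12 clock hours for a single maintainer. The function names are hypothetical.

```python
import math


def required_mtbf_hours(pulse_days: float, reliability: float) -> float:
    """Mean time between failures needed so that exp(-t / MTBF) equals the
    target reliability over one pulse (constant failure rate assumed)."""
    if not 0.0 < reliability < 1.0:
        raise ValueError("reliability must be in (0, 1)")
    t_hours = pulse_days * 24.0
    return -t_hours / math.log(reliability)


def max_mean_repair_hours(refit_hours: float, probability: float) -> float:
    """Largest mean repair time for which 1 - exp(-t / MTTR) still reaches
    the target probability of finishing within refit_hours."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be in (0, 1)")
    return -refit_hours / math.log(1.0 - probability)


if __name__ == "__main__":
    mtbf = required_mtbf_hours(pulse_days=14, reliability=0.90)
    mttr = max_mean_repair_hours(refit_hours=12, probability=0.98)
    print(f"MTBF needed for a 14-day pulse at 0.90: {mtbf:.0f} h")
    print(f"Mean repair time allowed for 0.98 within 12 h: {mttr:.2f} h")
```

Under these assumptions, a 90 percent chance of completing a 14-day pulse implies a mean time between failures of roughly ten pulse lengths. A real metric would also have to account for the degraded-performance allowance and for battle damage, which this sketch ignores.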
Basing reliability on how an AAN mission would be conducted is very different
from the idea of "ultrareliability," which is too often sold as a context-free property that
can be achieved by adopting a particular technology or design. But the concept of
ultrareliability is too general to help achieve AAN mission-reliable systems. For this
reason, the committee decided to avoid the term ultrareliability and to focus on AAN
mission reliability, that is, the minimum reliability requirements that can be used in a
distributed M&S environment to develop AAN systems.
1 For example, an analysis of the soldier-machine interfaces required for the complexity, tempo, and
intensity of an AAN operation in a three-dimensional battle space would reveal important features of what
might be called "trainability": the ease with which a system operator can attain and maintain a high level of
proficiency for a specific mission profile.
AAN Mission Reliability and RAMD
Reliability, availability, maintainability, and durability (RAMD) have become
linked as key factors in keeping future systems affordable, both in terms of investing
scarce dollars for research, development, testing, and evaluation (RDT&E) and in terms
of fielded equipment that can be procured in adequate quantities within budgetary
constraints.2 The system characteristics that contribute to RAMD are sometimes contrasted
with operational performance values because, in the past, performance has typically been
achieved at the expense of RAMD. AAN mission reliability, however, cannot be
separated from other aspects of system performance. Systems that fail to meet reliability
requirements will also fail to meet AAN mission objectives.
Three approaches can be used to ensure that AAN mission reliability (and
related RAMD qualities) receives appropriate consideration along with other performance
objectives. First, RAMD qualities must be interpreted into objective, assessable
characteristics that can be designed into a system. Rather than being lumped together as a
vague quality to which lip service is paid with terms like "ultrareliability," RAMD must
be defined in terms of concrete metrics that reflect operational requirements. System
designs should be assessed and engineered against these metrics, just as they are against
metrics for mobility, lethality, or any other performance requirement. Second, these
objective characteristics must be weighed, along with other performance characteristics, in
system trade-offs when designs or prototypes cannot meet all performance goals.
The third approach is a longer-term alternative to the second. Instead of being
forced to trade off a desirable level of performance (whether in a reliability measure or
some other goal, such as cross-country mobility, lethality, or survivability) to achieve the
optimum performance for all performance characteristics, new technology and new
design concepts can be developed to improve overall (combined) performance. In short,
the third approach is to seek new and better solutions. Applied and basic research, when
informed by the specific characteristics required to meet difficult AAN system
constraints, can, in time, provide new solutions. These three approaches are not mutually
exclusive. All three are likely to be needed for complex systems with demanding
performance requirements.
The M&S environment described in Chapter 3 provides a near-term
implementation of the first two approaches: designing systems for AAN mission reliability and
making system trade-offs that do not sacrifice reliability to other performance goals.
M&S is an essential and powerful tool for systems engineering to move AAN mission
reliability off the bullet charts and into the battle force. The next section describes the
necessary elements for a distributed, hierarchical federation of M&S tools adequate for
building AAN mission reliability (or "RAMD for AAN") into each system at every
level, from subsystems down through components, structures, and materials. After that,
ways to enhance the third approach are discussed.
2 Sometimes adaptability is added to these four characteristics, making the acronym "RAAMD."
The argument made here applies to RAAMD, as well as to RAMD.
USING AN M&S ENVIRONMENT TO DEVELOP AAN
MISSION-RELIABLE SYSTEMS
Designing systems and performing trade-off analyses with tools that can
simulate whether feasible systems, subsystems, components, and structures will meet
mission-specific RAMD metrics is the key to a realistic strategy for achieving AAN
mission-reliable systems by 2025. A hierarchy of model domains, illustrated in Figure
7-1, can be constructed for any complex system developed for the AAN battle force.
Note that Figure 7-1 is based on the discussion in Chapter 3 of the distributed M&S
environment illustrated in Figures 3-1 and 3-2. Fielding AAN materiel in 2025 will
require that engineering and manufacturing development begin by 2010. To meet this
milestone, extremely complex system trade-offs will have to be made, and the
supporting technologies for engineering and manufacturing development will have to be
available. In the judgment of the committee, the only way to perform the systems
engineering essential to making trade-off analyses while reducing the costs, in time,
resources, and risk, of trial-and-error developmental approaches is to use the simulation
techniques described in Chapter 3, beginning with conceptual design. This approach is
used extensively by leading manufacturers to design highly reliable subsystems and can
be effectively exploited by the Army to significantly enhance the reliability of AAN
systems.
Unfortunately, existing M&S tools cannot feed data on achievable reliability and
performance levels at the component and subsystem levels back to the operational level,
at which system trade-offs should be made. Without performing iterative simulations up
and down a hierarchy of M&S tools, as illustrated in Figure 7-1, determining through
M&S whether a design concept will meet AAN mission reliability objectives will be
impossible.3
Reliability (much less the pulse and mission reliability needed by AAN systems)
is not currently part of the design process, but it can be easily included by adding
reliability analysis at appropriate levels of the M&S hierarchy. Designing candidate
AAN system concepts to meet AAN mission reliability requirements will require the
following extensions of current capabilities and design approaches:
• M&S systems must be adequate at every level in the hierarchy at which
"designing for reliability" is done, from the top level of force-on-force
engagement down to the lowest level at which the reliability of design options is
evaluated.
• Metrics for reliability at each level must be defined in terms of operational
requirements, so that reliability can be assessed objectively at that level.
• The design process must include iterative simulations up and down the
hierarchy.
3 For the argument supporting this point, see "M&S Environment to Support AAN Logistics
Trade-off Analysis" in Chapter 3.
[Figure 7-1 diagram. Model domains, from top to bottom: Virtual Proving Ground (single vehicle);
System Architecture; Subsystem Architecture; Component Design; Materials Selection and Processing
Analysis; M&S for Design and Processing of Material Microstructure and Composition. Passed down
between levels: AAN mission reliability metrics, system-level reliability metrics, subsystem-level
reliability metrics, component-specific reliability metrics, material characteristics required for
components, and properties needed in new materials to meet component performance requirements,
including component reliability. Passed up between levels: simulated material properties and
processing approaches, material properties related to component reliability, component performance
results related to subsystem reliability, subsystem performance results related to reliability,
system structural characteristics related to system performance, and single vehicle performance
results in meeting AAN mission reliability requirements.]
FIGURE 7-1 Hierarchy of model domains. An extended M&S environment can be used to design
reliability into AAN systems, perform system trade-off analyses, and develop new options for
enhancing reliability. The figure is based on Figures 3-1 and 3-2. For the sake of simplicity note
that the top two engagement levels are not shown. Also, the requirements and metrics for other
performance goals (left side of Figure 3-1) and M&S results relevant to them (right side of Figure
3-1) are not shown.
• At the lowest level at which M&S is being used, valid data on alternatives must
be available for the characteristics that determine reliability at that level (i.e.,
estimates of the metrics for reliability at that level of system decomposition must
be realistic, not guesses or wishful thinking).
• During the iterative design process, and subsequently during engineering
development, testing, and evaluation, the mission reliability of the system (i.e.,
the reliability-related minimum requirements set for the top level of system
performance in simulated AAN engagements) must not be traded away to
sustain or increase another desired aspect of overall system value.
In practice, each of these extensions can be achieved in varying degrees.
Therefore, the extent to which AAN systems can be designed for reliability will depend
on how well the M&S environment and the methodology of using it meet these five
goals. The challenges and opportunities in these five areas are explored below.
Adequate M&S Systems
Chapters 3 and 5 examined at length the existing mobility M&S systems at each
level in the hierarchy, diagnosed some of their limitations, and recommended
improvements. Because AAN mission reliability is a cumulative outcome of complex
system-level behaviors, it will be helpful to consider the necessary capabilities of each
M&S system type described in Chapter 3 to provide a reasonable simulation for
assessing reliability.
For example, at each of the three engagement levels shown in Figure 3-1 (force-on-force,
multiple systems with operators, and single system with operator), both the
normal or "expected" duty cycle and the frequency-versus-severity profile of excursions
from the normal cycle must be realistically simulated and exercised. At the system and
subsystem levels, the models must include system-stressing loads and conditions and
variable patterns of operation, not just baseline operating scenarios. Because time is
often a key factor in the appearance of failure modes that reduce reliability, either the
simulations at each level must be run for durations required by AAN-mission reliability
or the analytical methods used to extrapolate from shorter run times to the durations
characteristic of AAN operations and duty cycles must be validated.
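The caution about extrapolating from short runs can be made concrete with a small numerical sketch. The Python example below is an illustration under assumed values, not a method from the report. It compares a wear-out failure model (a Weibull distribution with shape greater than 1) against a constant-failure-rate extrapolation fitted to a short simulated run. The scale, shape, and run lengths are hypothetical.

```python
import math


def weibull_reliability(t_hours: float, scale_hours: float, shape: float) -> float:
    """Survival probability at t_hours for a Weibull time-to-failure model.

    shape == 1 gives a constant failure rate; shape > 1 models wear-out.
    """
    return math.exp(-((t_hours / scale_hours) ** shape))


def constant_rate_extrapolation(r_short: float, t_short: float, t_long: float) -> float:
    """Extrapolate survival from a short run, assuming a constant failure rate."""
    rate = -math.log(r_short) / t_short
    return math.exp(-rate * t_long)


if __name__ == "__main__":
    # Hypothetical wear-out component: scale 600 h, shape 3.
    scale, shape = 600.0, 3.0
    short_run, pulse = 48.0, 14 * 24.0  # 48 h run versus a 14-day pulse
    r_short = weibull_reliability(short_run, scale, shape)
    print("extrapolated from short run:",
          round(constant_rate_extrapolation(r_short, short_run, pulse), 4))
    print("wear-out model at full pulse:",
          round(weibull_reliability(pulse, scale, shape), 4))
```

When failures are driven by wear, the short run sees almost none of them, so the constant-rate extrapolation is optimistic. This is one reason an extrapolation method has to be validated against the failure modes expected over a full duty cycle.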
Because reliability is relative to context (e.g., mission or duty-cycle profile), the
realism of the higher-level models in the hierarchy will be critical to using an M&S
environment for designing reliability into an AAN system. In effect, a systems engineer
will have to rely on the results from the higher-level models to define the behaviors of
the subsystems, components, and materials critical to making the entire system mission
reliable. One may, of course, rely on engineering experience or rules of thumb to make a
reasonable guess at characteristics that will affect reliability at higher levels of system
integration. But these approaches are difficult to quantify into metrics or validate.
Reliance on qualitative and heuristic approaches to reliability has probably contributed
to the ease with which reliability has typically been traded away for performance
characteristics that could be more easily quantified during requirements specification,
design, and evaluation.
Defining Reliability in Measurable Characteristics
Once mission-specific reliability is accepted as a performance value that applies
at each level of an M&S hierarchy, each level will require that appropriate reliability
measures be specified against which model runs can be evaluated. Less obvious perhaps is
that the appropriateness of the reliability measures at one level is determined by the
performance properties needed at the next higher level to achieve the reliability characteristics
required there. This linkage of the reliability measures at a given level to the required
performance characteristics for reliability at the next higher level makes it possible for
assessable reliability requirements to "flow down" from mission-specific, functional
requirements (like pulse reliability or rapid refit) to reliability requirements for particular
subsystems and components. The linkages between levels enable systematic design to
achieve the ultimate (i.e., top level) reliability requirements.
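A minimal sketch of such a flow-down, and of the corresponding roll-up, is given below in Python. It assumes the simplest possible structure, in which all subsystems are in series (every one must work) and fail independently, and it allocates the system target equally among them. Real allocations would weight subsystems by complexity and criticality. The subsystem names and numbers are hypothetical.

```python
import math


def allocate_equal(system_target: float, n_subsystems: int) -> float:
    """Per-subsystem reliability so that n subsystems in series meet the target."""
    if not 0.0 < system_target <= 1.0 or n_subsystems < 1:
        raise ValueError("target must be in (0, 1] and n_subsystems >= 1")
    return system_target ** (1.0 / n_subsystems)


def roll_up(subsystem_reliabilities) -> float:
    """Series-system reliability: the product of the subsystem reliabilities."""
    return math.prod(subsystem_reliabilities)


def shortfalls(requirement: float, simulated: dict) -> dict:
    """Subsystems whose simulated reliability misses the flowed-down requirement."""
    return {name: r for name, r in simulated.items() if r < requirement}


if __name__ == "__main__":
    target = 0.90  # hypothetical system-level pulse reliability
    simulated = {"power": 0.985, "drive": 0.960, "sensors": 0.990, "comms": 0.975}
    need = allocate_equal(target, len(simulated))
    print("per-subsystem requirement:", round(need, 4))
    print("misses requirement:", shortfalls(need, simulated))
    print("rolled-up system reliability:", round(roll_up(simulated.values()), 4))
```

The same two functions run in opposite directions: allocation passes a requirement down a level, and roll-up passes simulated results back up so that they can be compared with the requirement set at the higher level.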
As noted in Chapter 3, the absence of this linkage in existing models prevents
the flow of lower-level data up to the systems level in the hierarchy, where alternative
designs and trade-offs of one system value for another ought to be made. To state the
problem in practical terms, the value of using iterative M&S cycles as an alternative to
trial-and-error cycles of design, building, testing, and modifying depends on how closely
the M&S hierarchy can model the causal relations between the metrics for reliability at
one level and the properties at the next higher level of integration that affect system
performance.
Iterative Simulation
In Chapter 3, the committee stressed the importance of iterative simulations up
and down the M&S hierarchy. As a high-level system requirement, AAN-mission
reliability is a good example of why the iterative approach is essential for making design
decisions. (Performance characteristics defined at the system level for such things as
energy management, mobility, and lethality require an iterative approach for the same
basic reasons.)
A model by its nature is not an exact replica of the thing it models. When a
general model is applied to a specific case (for example, when a force-on-force model is
used to simulate a particular type of AAN mission or the NRMM is used to simulate the
behavior of a particular vehicle concept over selected terrains), the fit of the model can
be improved by specifying the input parameters and selecting the settings for the model's
run parameters. If the concept to be modeled is at an early design stage, results from
earlier runs can be used as feedback to "refine, tune, and tweak" both the model and the
design being tested.
Reliability outcomes of detailed engineering model runs for a combat vehicle
(for example) indicating that loads on a bearing approach or exceed design limits may
lead to a redesign of the vehicle. Systems that are frequently or easily defeated in a
force-on-force simulation or that consistently run out of fuel or ammunition may require
that the system be redesigned or modified, that tactics for using the system be
reconsidered, or that the training for operators be changed. Another alternative is that the
validity of the model itself may be questioned, leading to corrections and refinements in
the model.
In an M&S hierarchy of tools, this feedback process extends beyond a single
model at any given level. Determining the implications for reliability metrics at the
system or subsystem level of a particular design using particular components in a
particular configuration will require modeling up through several layers of the hierarchy.
The general axiom of systems engineering applies: optimization for a quality (such as
reliability) at a sublevel in a structural-functional hierarchy does not necessarily lead to
optimization even for the analogous quality expressed at a higher level of system
integration. Furthermore, the Army will have to optimize more than one quality (e.g.,
pulse reliability for at least two weeks of pulses, plus various mobility and lethality
objectives). Optimization at lower levels for any one of the overall performance qualities
may not provide the best system solution for all of them.
For all of these reasons, upward iterations through the hierarchy will be crucial.
Similar reasoning applies to the downward flow of performance requirements (including
reliability requirements) from the top level to the component level (and below that to the
materials selection and materials design levels). High-level functional requirements may,
on analysis at lower levels, turn out to be inherently incompatible (for example, they
jointly "violate the laws of physics"). Or they may be jointly unachievable for all
existing design options. In either case, some kind of "goal leveling" across performance
requirements will be necessary. Additional downward iterations of different combinations
of modified requirements will be necessary to make reasoned decisions about the "best"
system trade-offs. For instance, lightening a vehicle by using advanced composite
materials may increase its range per fuel load and may improve its pulse reliability, but these
materials may require longer maintenance checks between pulses and, therefore, require
additional maintenance specialists in the staging area to meet the fast refit requirement.
As the larger AAN process evolves with new combinations of tactics and
doctrine, performance specifications will change (including the mission-specific metrics
that define reliability requirements). Modeling the new options downward through the
hierarchy and running alternative solutions back up will again be necessary.
Valid Data on Alternatives
Even in a well coupled hierarchical M&S environment, independent input
variables must be set at each level. These include variables that specify design choices or
environmental conditions specific to each level, as well as the input variables that
represent design choices at the lowest model level in the overall simulation scheme. The
utility and validity of a simulation exercise for making design decisions and system
performance trade-offs depend on how accurately these input data characterize the
design options and conditions that the simulation is supposed to represent. This simple
point has important consequences for a simulation being used to assess a complex
variable like AAN mission reliability.
When new design concepts are introduced at any level in the simulation
hierarchy, the properties that influence the reliability metrics at that level may not be
well characterized. New structural options may in principle be available for insertion
into designs (for example, new materials that might be used in components and
structures), but valid data on even well established properties that affect reliability may
not be available to the designer. For AAN mission reliability to be analyzed objectively
and reasonably, modelers will need sound data for all design options of potential interest
and for all properties that significantly affect the reliability metrics at each model level.
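One way to see why poorly characterized inputs matter is to carry their uncertainty through to the system level. The Python sketch below is illustrative only. It assumes independent components in series and represents what is known about each component as a (low, high) interval on its reliability. Because the series product increases with every factor, multiplying the lower bounds and the upper bounds gives bounds on system reliability. The function name and the intervals are hypothetical.

```python
import math
from typing import Iterable, Tuple


def series_reliability_bounds(
    intervals: Iterable[Tuple[float, float]]
) -> Tuple[float, float]:
    """Bounds on series-system reliability from per-component (low, high) bounds."""
    pairs = list(intervals)
    for low, high in pairs:
        if not 0.0 <= low <= high <= 1.0:
            raise ValueError(f"invalid interval: ({low}, {high})")
    return math.prod(p[0] for p in pairs), math.prod(p[1] for p in pairs)


if __name__ == "__main__":
    # Nine well-characterized components plus one novel material option
    # whose reliability is only loosely known (all values hypothetical).
    known = [(0.995, 0.997)] * 9
    novel = [(0.90, 0.999)]
    print("all well characterized:", series_reliability_bounds(known + [(0.995, 0.997)]))
    print("with one novel option: ", series_reliability_bounds(known + novel))
```

In this toy case a single loosely characterized input dominates the width of the system-level interval, which is the practical reason modelers need sound data for every option of potential interest.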
Preserving Mission Reliability during System Trade-offs
Once RAMD is represented by measurable characteristics reflecting operational
requirements in the system conception, design, and testing processes, the fundamental
systems engineering issue is whether all metrics representing these and other
operational requirements can be met with known technology. If all metrics cannot be
satisfied in one system, trade-offs can be made to optimize the outcome. In the past,
whether this was done systematically or haphazardly, RAMD values were often
sacrificed for "performance" values, real or perceived. For AAN systems, it may be
necessary to sacrifice some desirable mobility, lethality, or survivability characteristics
to maintain the level of AAN mission reliability required for an acceptable probability of
success.
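One simple way to keep reliability from being traded away is to treat the mission reliability requirement as a constraint that every candidate design must satisfy, and to trade off only the remaining performance scores. The Python sketch below illustrates that idea. It is not the committee's method, and the candidate names, scores, weights, and the 0.90 floor are all hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    mission_reliability: float  # simulated probability of mission success
    mobility: float             # normalized 0..1 scores (hypothetical)
    lethality: float
    survivability: float


def select_design(candidates: Sequence[Candidate], min_reliability: float,
                  weights: Mapping[str, float]) -> Optional[Candidate]:
    """Best weighted performance among designs that meet the reliability floor."""
    feasible = [c for c in candidates if c.mission_reliability >= min_reliability]
    if not feasible:
        return None  # no design qualifies; requirements must be revisited
    return max(feasible,
               key=lambda c: sum(w * getattr(c, k) for k, w in weights.items()))


if __name__ == "__main__":
    designs = [
        Candidate("A", 0.85, mobility=0.9, lethality=0.9, survivability=0.8),
        Candidate("B", 0.93, mobility=0.7, lethality=0.8, survivability=0.7),
        Candidate("C", 0.95, mobility=0.6, lethality=0.7, survivability=0.7),
    ]
    weights = {"mobility": 1.0, "lethality": 1.0, "survivability": 1.0}
    best = select_design(designs, min_reliability=0.90, weights=weights)
    print(best.name if best else "no feasible design")
```

In this example, design A scores highest on the other characteristics but is excluded because it falls below the reliability floor, so some mobility and lethality are given up to preserve mission reliability.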
A principal benefit of an M&S environment like the one proposed in Chapter 3
is that it allows the established trade-off methods of systems engineering to be applied to
novel AAN systems, beginning very early with design conception and continuing, with
increasing precision and certainty, through detailed design, engineering development,
testing, and evaluation. With rigorous adherence to good systems engineering practices,
a performance goal like mission reliability, which is a global property of overall system
performance across a system's mission profile, can be achieved.
AAN mission reliability can only be assured if the trade-offs inherent in creating
a novel and complex product that can be fielded by 2025 maintain adequate levels of
mission reliability. Assume that an adequate M&S environment is available for a
proposed AAN system concept. A reasonable starting point for designing the new
concept is to attempt to meet all performance metrics (including those for reliability)
with existing, well characterized solutions. Suppose, though, that all of the requirements
cannot be met jointly. This is a likely outcome for the leap-ahead systems needed. The
next step might be to look for less well characterized options that can be substituted for
some of the tried-and-true standard materials, structures, and components. Because less
is known about these options, additional physical testing of the proposed alternatives and
modeling of the system configured with them will have to be done. Data on the
alternatives must be validated, and additional cycles of iterative simulation will assess
whether the new design can meet the requirements for mission reliability and other
performance qualities.
Advances in materials engineering may be able to help here by providing new
approaches to obtaining data about relatively untested options. For example, it is
difficult to use accelerated testing methods to determine how a component fashioned by new
means from novel materials of construction will respond to a complex duty cycle.
Knowledge of the physical properties of the materials gained from experimental data,
including their dynamic responses throughout the duty cycle, may make it possible to
model the long-term failure, wear, and aging behavior of the alternative, in the context of
a particular design for a particular system.
A radical form of this "search for better system inputs" is to look to materials
engineering to provide a "new solution" that meets the particular requirements (e.g.,
specific strength or resistance to failure modes of the familiar options, or ease of
replacement) of an element in the modeled system. "Designing" a new material (or novel
structuring of known materials) depends, like the modeling of hard-to-test physical
behavior, on knowing the physical properties that will provide the desired functional
behavior and knowing how to engineer those properties into the structural element in
question. These approaches will probably not be valid for the development of the first
AAN systems, but they are discussed in the following section as longer term options for
meeting the AAN functional requirements for reliability.
An equally valid approach from the standpoint of systems engineering is for
designers to re-examine the performance requirements, including the reliability
requirements, at each level in the M&S environment from the top down, to see if any can
be relaxed without sacrificing the essential requirements for the system to do its job (i.e.,
top-down reduction of functional requirements). Eventually it may be necessary to
compromise on functional requirements to find an acceptable system solution.
In the past, when a lower requirement for one performance goal was traded to
achieve an acceptable metric for another, system reliability was often "traded away." To
varying degrees, the justification for the other performance characteristic was considered
"more important than cost," and decreased reliability could be compensated for by buying
additional quantities of the system (for replacements). Two other reasons for
sacrificing reliability have been, first, the lack of objective, assessable metrics for
mission-specific reliability at each level in the structural-functional hierarchy of Army
systems and, second, a dearth of hard data about the reliability-relevant properties of
system elements that were introduced to meet other performance objectives. If reliability
is considered on an equal basis with lethality, survivability, and mobility, then reliability
can no longer be used as an excuse for poor design.
Even the best systems engineering in the world will not consistently produce
AAN mission-reliable systems unless and until the following steps are taken to
supplement systems engineering throughout the design, development, and testing
process:
· Reliability must not be traded away to meet other performance objectives, at
least not to the point that mission reliability will be threatened or lost.
· Designers must have a design construct (e.g., an M&S environment) for highly
complex systems that incorporates meaningful, quantifiable characteristics that
define mission reliability at the topmost system (platform) level and
characteristics that are closely coupled with mission reliability at each lower
level of the system structure-function hierarchy.
· Contractors who offer proposals to build a system, subsystem, or component
should be evaluated (using the M&S environment) on the basis of the proposed
design's capability to achieve the requirements for AAN mission reliability (as
well as the requirements for other mission-critical system goals, such as system
fuel efficiency [Chapter 4], vehicle mobility [Chapter 5], and precision
engagement [Chapter 6]).
· Contracts should be awarded on the basis of meeting mission-specific reliability
requirements, and contractors should be held to delivering what they promise.
· Source selection criteria must be changed to consider reliability on an equal
basis with other mission-specific goals. Currently, reliability is often traded off
for performance, which increases logistics support requirements for new
systems.
THE THIRD APPROACH: RESEARCH TO ENABLE
NEW RELIABILITY SOLUTIONS
Let us assume that the system design and trade-off analyses performed with
M&S tools indicate that not all of the logistics reduction and performance requirements,
including the reliability-related requirements, can be met jointly with materials,
structures, and components that are well characterized from testing and accumulated
experience in similar applications. An alternative to relaxing one or more requirements
to optimize the system design is to search for better designs or for better components or
materials. Although the better designs, components, and materials may not be ready for
engineering and manufacturing development by 2010, the research to find them may be
necessary to meet all of the AAN requirements at a later date. Furthermore, it is always
possible that a breakthrough or burst of progress in a key area will lead to improvements
sooner.
Improving System Reliability at the Level
of Component Analysis and Design
Although much can be done in the near term (by 2010) to improve existing M&S
tools, research will be necessary in the following areas, even if the results do not bear
directly on systems for AAN until after 2010.
· mechanisms of failure modeling, to relate structural failure modes at one level in
the M&S hierarchy to physical properties at the next lower level
· materials selection and materials design to provide new options (and inputs at
the level of component design and analysis) in the M&S hierarchical
environment
· prognostics (the design and application of prognostic sensing technology) to
monitor for physical precursors of failure when the mechanisms of failure for a
design or a material are known but no better design or material (with respect to
meeting all system performance objectives) is available
Modeling Mechanisms of Failure
Iterative simulation runs up and down an M&S hierarchy can only ensure system
reliability to the extent that the models accurately represent the causal relations between
the reliability-related characteristics (performance metrics) at one structural level and the
physical properties at the next lower level of structure. Models can misrepresent these
linkages in three ways:
· The models may be inaccurate because of errors in the assumptions or approximations
used in the modeling tool itself or in the runs for a particular
configuration.
· The models may be reasonably accurate around a "good design" point
(anticipated range of operation and performance) but may not be able to predict
off-design performance or identify failure signatures and failure modes.
· Data for accurate simulation of the system may be insufficient.
Resolving these problems amounts to increasing the fidelity of the models by
incorporating more complete knowledge of the mechanisms of failure into
them.
When a system fails to operate properly during its mission, a failure has
occurred. "Mechanisms of failure" is just another name for the causal linkages between
structure at one level and successful performance at the next higher level of integration.
"Physics of failure," a term often used to describe the gathering and applying of knowledge
of these causal linkages, originated in efforts to improve the reliability of electronic
materials and structures. Thinking in terms of the physics of failures has been highly
productive in semiconductor electronics because a great deal is known about how the
physical structure of semiconducting materials produces the functional characteristics of
the "component" electronic device at the next higher level of organization. Building on
the success of the physics of failure approach, the Army is now implementing physics of
failure studies to assess the reliability of electronic packaging concepts that are still at
the design stage.
A fundamental constraint on a "physics of failure" approach to determining
mechanisms of failure, a constraint that is not always clearly recognized (or stated), is
that the ability to predict failure modes and failure events from underlying physical
properties depends on two factors. First, specific physical properties must be strongly
linked to specific failures (for example, is the occurrence of condition A sufficient in itself
to cause failure mode F, or is it just a contributing factor that requires other conditions
before F occurs?). Second, the causal structure that determines whether or not a failure
will occur must be understood. As the causal relations between physical conditions
and the occurrence of failure become more complex and our knowledge of that
complexity becomes more tenuous, predicting failure modes and events becomes more
and more speculative.
A more practical way to express this theoretical point is that research on the
mechanisms of failure for AAN systems is unlikely to reveal all of the fundamental
failure modes of a system and what causes them. In most cases, this is probably an
impossible, or even a meaningless, task. However, to design and manufacture highly
reliable components that can meet AAN performance requirements, enough must be
known about the underlying properties and conditions that can create known or
suspected failure modes (i.e., some of the causal linkages) to build components with
superior performance in reliability-related characteristics.
An example far afield from the area of semiconductor design illustrates the
potential value of modeling mechanisms of failure. Given the importance of energy
management in reducing logistics burdens (see Chapter 4), many AAN vehicles or other
systems will require high-horsepower engines that operate at high fuel efficiencies
throughout the range of operating conditions required for AAN mission scenarios. These
engines must also be highly reliable, not just when operated at "design" conditions but
also under any conditions that occur during an AAN mission. Even if the engine is
operated outside its design envelope for optimal performance, it must continue to
perform, at least until a combat pulse has been completed and it can be returned to the
staging area.
Designing this engine will require an understanding of some of the detailed
physical characteristics of the engine's subsystems in relation to the performance of the
entire engine. For example, the design engineer would want to know how the fuel-air
mixing and combustion process is affected by local mixing inefficiencies, pressure
oscillations in the fuel feed line, combustion instability in the combustor, soot formation
and resulting inhibition of the ignition system, and the stability of lean flames, including
local extinction and reignition. In each of these areas, knowing something about the
conditions that can lower operating efficiency or damage the engine structures over time
would help in modeling the mechanisms of failure for the high-efficiency, high-power,
highly reliable engine the designer is trying to build.
If the designer has a simulation model that incorporates this knowledge of
failure mechanisms, the model will be better at simulating how a real engine would
perform under a broader range of conditions than a model that represents only the
optimum design point of operation. However, the designer is never going to sit down
with a set of fundamental physical equations (even a very large set) and "deduce" an
engine design from them. Nevertheless, a simulation model that incorporates "first-
principles" parametric representations for even some of the physical processes in an
engine is likely to show the designer some unexpected failure modes when the model is
run under off-design or nonoptimal conditions. But even a completely accurate physical
description of the engine will not enable the designer to deduce all of the failure modes
of the engine.
As the example of engine operating efficiency illustrates, much of the research
that is often described as investigating "physics of failure" (that is, investigating the
mechanisms of failure) can and should be part of research on the physical processes
underlying the complex technologies needed for AAN systems, such as advanced
engines, active suspension systems, and lightweight protection systems. However, the
duty cycles for many commercial applications for these technologies will be very
different from the duty cycles in AAN systems. Therefore, the Army may have to support
and encourage basic research on the broad issues in the mechanisms of failure that are
unique to the duty cycles for AAN concepts of operations. Two broad issues are (1) the
relationship between dynamic physical conditions and properties (conditions and
properties that do not vary uniformly over time) and failure modes of subsystems and
components, and (2) how failure modes are affected by materials with different
structural patterns at different spatial scales (ranging from atomic to micron scale), as
opposed to materials with bulk properties determined predominantly by their atomic-
scale structure.
Materials Selection for Improved Reliability
If all of the performance requirements, including the reliability metrics, for
component-level models of an AAN system cannot be met with standard materials,
seeking new materials is an alternative to relaxing the requirements. Appendix C,
Materials Selection and Design, discusses how materials science and engineering can
develop solutions for component design and analysis. Alternatives may exist among
lesser known materials, and better materials databases and selection charts can help
designers find potential solutions. Often these materials may require testing to determine
how well they meet design requirements.
As materials science develops better tools for modeling the performance
characteristics of materials based on their underlying structures, and as methods for
forming materials with novel structures, particularly at very fine scales, are improved, an
even more innovative approach may become possible. The component designer may be
able to call on the materials designer to design a new material to meet the performance
requirements for a particularly demanding application. The M&S tools needed to
support "materials by design" must have the same generic capabilities as the tools at
various levels in the M&S hierarchy for systems design. In fact, the M&S tools for
materials design and processing can be considered another level of structure-function
relationships below the component level. The extensions of current capabilities required
to develop AAN systems will also be necessary at this new level in the hierarchy.
Prognostics
Prognostics (prognostic sensing technology) can be described as the use of
sensor technology to detect precursors of failure before the failure occurs. Prognostics
applies our limited knowledge of the mechanisms of failure to detect "failures in the
making" so that the failure can be prevented, avoided, or ameliorated (the graceful
degradation of performance).
If the M&S capability at each level in the hierarchy, including the just-emerging
"materials design" level, could simulate physical reality and predict failure modes and
events perfectly (i.e., if the mechanisms of failure linking each structure-function level to
the ones above and below it were fully understood and had been incorporated into the
models at every level) and if engineers knew how to design each level of a system so that
all failure-causing conditions could be avoided, then prognostics would not be needed.
Often, though, something is known about conditions that cause operational failures but
not enough to ensure that none of them occurs. In some cases, an optimal design for the
full set of performance requirements for a given system, subsystem, or component is
known to be subject to a particular failure mode when certain antecedent conditions
arise. In these cases, prognostic sensing technology can improve the reliability of the
system.
The use of prognostic sensors is well established at the higher levels in the
hierarchy of systems design. A warning light goes on when the lining of an automobile
brake is worn to the point that a replacement is needed to avoid brake failure. An oil
pressure gauge is not used by a knowledgeable driver or mechanic as a means to measure
oil pressure but as a prognostic sensor indicating a condition that could lead to the
catastrophic failure of the vehicle (oil pump wear or failure, a system leak, or
overheating). The more innovative (and sometimes controversial) uses of prognostic
sensing are for detecting precursors of structural failures at small spatial scales,
particularly by sensors embedded in the material.
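
The brake-lining example can be made concrete with a minimal sketch. It assumes, purely for illustration, that lining thickness is measured periodically and that wear is roughly linear with distance traveled. The function name, the sample readings, and the 2.0 mm replacement threshold are hypothetical and do not come from this report.

```python
def distance_until_threshold(readings, threshold_mm):
    """Estimate the distance remaining before thickness falls to threshold_mm.

    readings: list of (distance_km, thickness_mm) pairs.
    Fits a least-squares straight line (an assumed linear wear model)
    and extrapolates it to the threshold. Returns None when the fitted
    line shows no wear (slope >= 0), because no crossing can be predicted.
    """
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings")
    mean_x = sum(d for d, _ in readings) / n
    mean_y = sum(t for _, t in readings) / n
    sxx = sum((d - mean_x) ** 2 for d, _ in readings)
    if sxx == 0:
        raise ValueError("readings must span more than one distance")
    sxy = sum((d - mean_x) * (t - mean_y) for d, t in readings)
    slope = sxy / sxx                      # mm of lining lost per km (negative)
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        return None
    crossing_km = (threshold_mm - intercept) / slope
    latest_km = max(d for d, _ in readings)
    # Never report a negative remaining distance.
    return max(0.0, crossing_km - latest_km)


# Hypothetical readings: (distance in km, lining thickness in mm)
readings = [(0, 10.0), (1000, 9.0), (2000, 8.0), (3000, 7.0)]
remaining = distance_until_threshold(readings, threshold_mm=2.0)
print(remaining)
```

A warning light corresponds to the simplest version of this logic, a fixed threshold. The extrapolation step is what would let maintenance be scheduled before the threshold is reached, and its usefulness depends entirely on how well the assumed wear model matches the real mechanism of failure.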
These new technological possibilities for small-scale sensors (measured in
micrometers or even nanometers) to detect equally small causal preconditions for a
structural failure have the potential to operate at the material and component levels of
complex structures in a manner analogous to the more familiar prognostic sensors at
higher, larger scales. The following general principles apply to prognostics at any level
of system design:
· Using a sensor system for prognostics implies that something is known about the
mechanisms of failure for the performance characteristic that the sensor is
monitoring.
· If one knows enough about the failure mode and knows a way to avoid it, it is
better to design the system not to fail than to use a sensor to predict when
the failure will occur. If we do not know for sure how to avoid it and the failure
mode is important enough, a prognostic sensor may be useful.
· Not every precondition for failure that can be monitored with a sensor is worth
monitoring.
· Prognostic sensors are useful when the causal link between the precondition and
the consequence is well established, the consequence is likely to lead to overall
operational failure of the larger system, and something useful and relatively easy
can be done to prevent the operational failure of the larger system.
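
These principles can be read as a simple screening rule for candidate sensors. The sketch below is one possible encoding, not a procedure given in this report. The class name, field names, and the two example preconditions are hypothetical, and the yes/no fields stand in for engineering judgments that would be far less clear-cut in practice.

```python
from dataclasses import dataclass


@dataclass
class FailurePrecondition:
    """A candidate condition that a prognostic sensor could monitor."""
    name: str
    causal_link_established: bool       # precondition reliably leads to the consequence
    threatens_larger_system: bool       # consequence likely causes overall operational failure
    practical_intervention_exists: bool  # something useful and relatively easy can be done
    avoidable_by_design: bool = False   # a known design change removes the failure mode


def worth_monitoring(p: FailurePrecondition) -> bool:
    """Apply the screening principles listed above.

    Designing the failure mode out is preferred to sensing it, so a
    precondition that can be avoided by design is never selected.
    Otherwise all three usefulness conditions must hold.
    """
    if p.avoidable_by_design:
        return False
    return (
        p.causal_link_established
        and p.threatens_larger_system
        and p.practical_intervention_exists
    )


# Hypothetical candidates, for illustration only.
crack = FailurePrecondition("nascent crack in structural member", True, True, True)
scuff = FailurePrecondition("cosmetic surface scuffing", True, False, True)

selected = [p.name for p in (crack, scuff) if worth_monitoring(p)]
print(selected)
```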
Prognostic sensors could be used to speed the refitting of AAN systems between
combat pulses. For example, knowing that an embedded sensor would detect nascent
crack formation in a key structural component could be a faster way to ensure that a
system is pulse-reliable than performing a laborious, and possibly destructive, testing
procedure during each maintenance check. A prognostic sensor might contribute to pulse
reliability by warning a well-trained driver to avoid certain stresses, thereby accepting a
constraint on vehicle operation (a small degradation in performance) in exchange for
avoiding a larger system failure. Although prognostics is not a substitute for AAN mission reliability, it is clearly
a complementary technology.
SCIENCE AND TECHNOLOGY INITIATIVES TO ACHIEVE
AAN MISSION RELIABILITY
Based on the preceding analyses of the role of reliability (and related concepts,
such as maintainability, availability, and durability) in reducing logistics burdens for
AAN systems and the technological opportunities for improving reliability, the
committee concluded that the Army should pursue the following areas of scientific
research and technology development. The order of the numbered items reflects a rough
order of priority.
AAN Mission Reliability
Defining AAN Mission Reliability. Reliability for AAN systems (or RAMD for AAN)
must be defined in relation to AAN operational concepts, as illustrated in this chapter by
the examples of pulse-reliable systems and the improved maintainability required for
rapid refitting of a battle force between combat pulses. Additional aspects of AAN mission
reliability can be defined as the operational concept for the AAN evolves. However,
the war-fighters and technologists must make every effort to define mission reliability in
objective, quantifiable, and accountable terms. In short, the terms must be usable in
system design and trade-off processes. Working definitions of mission reliability must
begin at the highest levels of small unit and force-on-force engagement analysis and proceed
down to the reliability requirements for individual systems (e.g., AAN combat
vehicles). Factoring reliability into the logistics analyses necessary to design, develop,
and field systems that will meet AAN performance objectives by 2025 will require
clearly linking reliability requirements to mission performance. The higher-level, functional
definitions of reliability should be reviewed and updated by both war-fighters and
technologists as the concepts of AAN operations evolve. The science and technology
community should ensure that the lower levels of system analysis include reliability-
related performance metrics that contribute to reliability at the next higher level of
system integration.
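
One simple way to picture how lower-level metrics contribute to reliability at the next higher level is the textbook series model, in which an assembly works only if every element works and failures are independent. Both conditions are assumptions made here for illustration. The report does not prescribe this model, and the structure and numbers below are hypothetical.

```python
import math


def series_reliability(element_reliabilities):
    """Reliability of a series assembly of independent elements.

    Each value is the probability that one element survives the mission.
    Under the series and independence assumptions, the assembly survives
    only if all elements do, so the probabilities multiply.
    """
    values = list(element_reliabilities)
    for r in values:
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"reliability out of range: {r}")
    return math.prod(values)


def roll_up(node):
    """Roll reliabilities up a nested hierarchy.

    node is either a number (a leaf element's mission reliability) or a
    list of child nodes (a subsystem made of those children in series).
    """
    if isinstance(node, (int, float)):
        return series_reliability([node])
    return series_reliability(roll_up(child) for child in node)


# Hypothetical platform: two subsystems plus one standalone component.
platform = [[0.99, 0.98], [0.995], 0.97]
print(roll_up(platform))
```

Even in this idealized form, the platform value is lower than that of any single element, which is why measurable reliability requirements are needed at every level rather than only at the top.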
Three Approaches to Mission Reliability
1. Designing for Reliability with a Distributed M&S Environment. Five extensions
of current capabilities and design approaches must be incorporated into M&S tools at
every level of system structure, from components to fully integrated systems. These
extensions are (1) models incorporated into the M&S environment that can represent the
system properties and environmental conditions that affect AAN mission reliability
requirements, (2) measurable reliability-related requirements defined for the models at
each level in the M&S hierarchy, (3) iterative simulations up and down the hierarchy of
models in the design and engineering process, (4) provisions for obtaining valid data on
lesser known design options that could contribute to satisfying combinations of AAN
performance goals (including the goal of mission reliability), and (5) mission reliability,
defined by assessable reliability requirements, as a performance objective for design and
engineering development that cannot be compromised to meet other performance
objectives.
2. System Trade-offs That Include Reliability as a Primary Performance Goal. The
M&S environment used for designing reliability into systems can also enable rational
trade-offs when existing technologies and design concepts do not meet all of the primary
performance goals. Especially in the near term, compromises will be necessary, and optimum
system performance should take priority over meeting individual performance
goals or the performance of a subsystem. If objective, measurable reliability requirements
have been defined for the system at each level in the M&S environment, then
adjustments to those requirements should flow down the hierarchy, and the consequences
of other design changes on reliability should be assessed upward through the
hierarchy. Novel or less conventional technological or design alternatives should be
evaluated in terms of their impact on reliability, as well as on other performance goals.
Contractor proposals should be evaluated, and contracts awarded, on the basis of how
well they meet reliability requirements, as well as other performance goals.
3. Application of Materials Science to New Reliability Solutions. Although the
payoffs are likely to come after the 2010 deadline for decisions on major AAN systems,
several important areas of basic research are likely to provide important contributions to
developing systems that can meet AAN reliability requirements, as well as other primary
performance requirements. The Army should continue to leverage its resources in these
areas of research through networking with industry and academic partners and through
active participation in joint programs. Three research areas that are particularly
important for improving reliability are (1) investigating mechanisms of failure and
incorporating this knowledge into M&S tools, (2) selecting or designing alternatives for
materials that can meet AAN requirements, including reliability requirements, that
familiar materials cannot meet, and (3) using embedded prognostic sensing technology
in designing structures and components. The potential impact of this research and a
realistic assessment of its potential contributions to AAN solutions should be framed
in terms of refining, improving, and extending to smaller spatial scales the hierarchical
M&S environment for systems design and logistics trade-off analyses.