APPENDIX C
LETTER REPORTS TO THE ADMINISTRATOR OF NASA
AND NASA RESPONSE
Prior to this final report, the Shuttle Criticality Review and Hazard Analysis Audit Committee issued
two interim letter reports to the Administrator of the National Aeronautics and Space Administration.
The Administrator of NASA provided a response to the Committee regarding the first interim report. It
also was referenced in NASA's Report to the President of June 1987. These documents are contained in
this appendix.
First interim letter report to the Administrator of NASA from Committee Chairman Alton D.
Slay, January 13, 1987, 4 pp.
Reply to Committee Chairman Alton D. Slay from the Administrator of NASA regarding the
first report, April 22, 1987
Report to the President. Implementation of the Recommendations of the Presidential Commission
on the Space Shuttle Challenger Accident, NASA, June 1987, excerpts from pp. 41-42
Second interim letter report to the Administrator of NASA from Committee Chairman Alton
D. Slay, July 22, 1987, 8 pp.
NATIONAL RESEARCH COUNCIL
COMMISSION ON ENGINEERING AND TECHNICAL SYSTEMS
2101 Constitution Avenue, Washington, D.C. 20418
AERONAUTICS AND SPACE
ENGINEERING BOARD
January 13, 1987
The Honorable James C. Fletcher
Administrator
National Aeronautics and Space Administration
Washington, D.C. 20546
Dear Jim:
This is an interim progress report of the Shuttle Criticality Review
and Hazard Analysis Audit Committee. The National Research Council
formed this committee in response to your request for an audit of the
NASA response to the Presidential Commission Recommendation III
regarding criticality review and hazard analysis.
The Committee has been a functioning entity since its first meeting on
September 22, 1986. We have thus far received presentations from and
engaged in detailed discussions with NASA Headquarters, the National
Space Transportation System program office, Johnson Space Center,
Marshall Space Flight Center, and Kennedy Space Center. Similar
meetings were held at Rocketdyne (Space Shuttle Main Engine) and
Rockwell International (Orbiter), and by a working group at Morton
Thiokol (Solid Rocket Motors). All of the participants described their
efforts and progress in reevaluating the Failure Modes and Effects
Analysis (FMEA) and Critical Items List (CIL) status and in reassessing
hazard analysis and risk management. The Committee also has
received a briefing on and discussed the process being used by the
U.S. Air Force Systems Command-Space Division to determine launch
readiness and safety status. The Titan 34D Recovery Program was
described as an example.
The Committee has been favorably impressed by the dedicated effort and
extremely beneficial results obtained thus far from the FMEA/CIL and
hazard analysis processes. We are very appreciative of the frank and
open manner in which NASA and contractor personnel have worked with
the Committee. Our suggestions have been received in a very responsive
manner in all quarters. We wish to commend Admiral Truly, Arnold
Aldrich and the NASA Shuttle team involved in the FMEA/CIL/hazard
analysis processes for the significant work they have performed so
far. Although our general impressions are favorable, we do have some
suggestions for improvement. In summary, they are:
o Criticality 1 and 1R items should be assigned priorities
based on the probability of occurrence.
o Since many of the Criticality 1 and 1R items differ substantially
in terms of the probability of failure, NASA should
consider modifying the definition of critical items to
account for these differences.
The National Research Council is the principal operating agency of the National Academy of Sciences and the National Academy of Engineering
to serve government and other organizations
Letter to the Honorable James C. Fletcher
o NASA should incorporate its present total system review procedures
in an integrated systems assessment process coupled
closely with the FMEA/CIL reevaluation now being undertaken.
o Linkage between the STS engineering change activities and the
FMEA/CIL/hazard analysis processes should be assured.
SETTING PRIORITIES FOR CRITICALITY 1 AND 1R ITEMS
NASA does not now set priorities for Criticality 1 and 1R items nor
does it consider the probability of occurrence of an event in the
treatment of these items. The Committee recommends that NASA devise
some mechanism for and assign priorities to the Criticality 1 and 1R
items. It suggests that probability of occurrence should be an
important element of any such priority ranking. Basing priorities
on this fundamental measurement of risk will help NASA and those
interested in its progress to evaluate the adequacy of changes being
made to Shuttle hardware, software, or procedures in the interest of
enhancing safety.
Essential to the success of any risk assessment process is the certain
and timely feedback of preflight and postflight system performance
data, along with test data and failure or degradation reports. Such
inputs are critical to any successful FMEA/CIL and hazard analysis
program and can form the basis for more precise evaluation of risk.
While it is clear to the Committee that these data are used in readiness
reviews and other NASA activities, it is not clear that they are
used in the FMEA/CIL or hazard analysis processes. The Committee
believes that this information can, if properly used, assist greatly
in the FMEA/CIL and hazard analysis processes and in the determination
of priorities.
The present decision-making process within NASA with regard to
FMEA/CIL appears to be based on the judgment of experienced practitioners
and has received very little contribution from quantitative analysis.
We believe that the failure of NASA to use numerical techniques
as an input to decision-making detracts from the overall effectiveness
of the FMEA/CIL and hazard analysis processes. Such techniques could
provide a more realistic assessment of risk, at least on a relative
basis. We do not wish to suggest that NASA subordinate technical
judgment to numerical analysis.
Such an approach would be, in our
opinion, [illegible] and perhaps counterproductive.
Currently waiver authority for all Criticality 1 and 1R items rests
with NASA Level I. The Committee believes that Level I should focus
its attention on the highest priority items resulting from the
suggested selection process, along with the rationale that produced
the priority rating. The waiver decision authority for the remainder
of the Criticality 1 and 1R items should be delegated to Levels II and
perhaps III.
DEFINITION OF CRITICALITY CATEGORIES
The Committee notes that the dedicated response of the entire NASA
organization and its contractors has produced a variety of items
which, by precise definition, must be placed in the Criticality 1 or
1R categories. Many of the items differ substantially from one
another in terms of the probability of failure or performance and
thus their potential impact on Shuttle operational safety. The
Committee suggests that NASA consider a modification of the Critical
Items List to account for these differences, help the priority
selection process, and better focus present or future efforts to
achieve safer Shuttle operations.
INTEGRATED SPACE TRANSPORTATION SYSTEM ANALYSIS
The Committee understands that various mechanisms are being used by
NASA to examine total system operation, including propagation of failure
modes to interfacing or physically adjacent modules or subsystems.
The Committee does not perceive, however, any formal relationship of
such evaluation methods to the ongoing FMEA/CIL process. The Committee
suggests that NASA devise an integrated STS systems assessment
process which is closely coupled with the FMEA/CIL activity to assure
assessment of the truly critical safety elements in the STS. This
includes all combinations of hardware/software/procedural failures and
cascading failures.
RELATION BETWEEN FMEA/CIL/HAZARD ANALYSIS AND DESIGN CHANGES
We note that many engineering changes have been undertaken since the
51-L accident to improve Shuttle safety prior to resumption of flight,
now scheduled for February 1988. In parallel, the FMEA/CIL and hazard
analysis reevaluations are under way with completion expected during
the summer of 1987. Thus, the FMEA/CIL reevaluation may not adequately
reflect all of the engineering changes, nor will there be time to
incorporate any substantial design changes that may be indicated by
the outcome of the FMEA/CIL reevaluation, hazard analyses, and related
activities. The Committee recommends that NASA assure a close linking
between the STS engineering change activities and the FMEA/CIL/hazard
analysis processes.
FUTURE WORK
The Committee is continuing its effort to audit the FMEA/CIL, hazard
analysis, and related processes dealing with risk assessment. We have
planned additional visits to NASA centers and contractor facilities
where we will continue to examine the mechanisms used by NASA and its
contractors to provide for the overall safety of the STS as an integrated
system. We also will further refine some of the points raised
here in future reports to you. While we recognize that it is not
possible a priori to ensure mission success and flight safety, through
this review and audit we hope to assist NASA in taking those prudent
steps which will provide a reasonable and responsible level of assurance
of flight safety. We will, of course, remain in close contact
with your staff throughout this activity.
Sincerely yours,
Alton D. Slay
Chairman
Committee on Shuttle Criticality
Review and Hazard Analysis Audit
cc: Admiral Richard H. Truly
NASA
National Aeronautics and
Space Administration
Washington, D.C. 20546
Office of the Administrator
General Alton D. Slay
National Research Council
National Academy of Engineering
2101 Constitution Avenue, NW (NAS 307)
Washington, DC 20418
Dear Al:
In reply to your January 13, 1987, interim progress report of the
Committee on Shuttle Criticality Review and Hazard Analysis, your four
suggestions are repeated, along with NASA's response to each.
NRC Comment: "Criticality 1 and 1R items should be assigned priorities
based on the probability of occurrence." (This comment also suggested the use
of probability analysis techniques and the delegation of certain criticality
items to lower levels of the organization.)
NASA Response: The National Space Transportation System is in the
process of selecting and implementing a critical items prioritization
technique for the Shuttle program. Five different techniques have been
evaluated by review teams at JSC, MSFC, and KSC. One of these techniques has
been selected to be presented to the program manager at a Program Requirements
Control Board (PRCB) for baselining as a formal program requirement. The
chosen approach will overlay the existing Failure Mode and Effects
Analysis/Critical Items List (FMEA/CIL) activity with minimum perturbation,
yet provide an effective measure of relative risk in order to focus future
review emphasis and resource allocations. In parallel with the prioritization
technique development, an effort is also under way to assess the utility of
probabilistic risk assessment in the NSTS FMEA/CIL process. Activities have
been initiated to engage two independent firms with expertise in probabilistic
risk assessment to perform detailed reviews of the orbiter auxiliary power
unit and the shuttle main propulsion pressurization system. A decision to
apply such probabilistic risk assessment techniques to other elements of the
Shuttle will depend upon assessments of the results and impacts of those
efforts and comparison of these results with the results of the mainline
FMEA/CIL activity. Delegating the review and approval of certain critical
items will be decided after the results of the prioritization and risk
assessment activities have been thoroughly assessed.
NRC Comment: "Since many of the Criticality 1 and 1R items differ
substantially in terms of the probability of failure, NASA should consider
modifying the definition of critical items to account for these differences."
NASA Response: We expect the FMEA/CIL prioritization process will
provide the necessary definitions and program focus in this regard.
NRC Comment: "NASA should incorporate its present total system review
procedures in an integrated systems assessment process coupled closely with
the FMEA/CIL reevaluation now being undertaken."
NASA Response: Since the Challenger accident, NASA has reemphasized its
risk management effort. An important feature of the revised effort must be a
"systems engineering" approach that integrates the various elements of the
risk management process to assure assessment of the combinations of hardware,
software, procedures, and cascading failures. NASA's new Associate
Administrator for Safety, Reliability, Maintainability and Quality Assurance
has been tasked to develop a new agencywide risk management system.
NRC Comment: "Linkage between the STS engineering change activities and
the FMEA/CIL hazard analysis processes should be assured."
NASA Response: Engineering changes are processed through the same Space
Shuttle configuration control boards that conduct the review of the
FMEA/CIL. A recent change to the procedure requires an assessment of each
change request to determine if it affects any Criticality 1 or 2 hardware.
The nature of the combined change control and FMEA/CIL processes is such that
the total process cannot be completed until the last change to be implemented
before flight has itself undergone a FMEA and been dispositioned by the
board. Regardless of the timetable established by the NSTS working schedule
for FMEA/CIL preparation and review, the changes that result will be dealt
with in the same manner as the generating FMEA items. All changes mandatory
for first flight will undergo the same rigor, even if this results in a flight
schedule impact. The NSTS Systems Design Reviews which began early last year
have significantly reduced the likelihood of new changes being identified that
have major schedule impacts.
The dedication of your committee and the sincerity of its comments are
very much appreciated by NASA. I hope you find our actions in response to
your suggestions to be both appropriate and timely. Thank you again for your
help.
Sincerely,
James C. Fletcher
Administrator
Report to the President
IMPLEMENTATION
of the
RECOMMENDATIONS
of the Presidential Commission
on the Space Shuttle
Challenger Accident
June 1987
EIFA's have been conducted on ET/
orbiter, SSME/orbiter, and SRB/ET/orbiter
interfaces. These analyses have been
reviewed by NASA and the systems integra-
tion contractor, and the results are under
evaluation by the element project offices and
the NSTS Engineering Integration Office.
When this review is completed, the finalized
EIFA's will be presented to the PRCB for for-
mal approval.
NATIONAL RESEARCH
COUNCIL AUDIT
The Shuttle Criticality Review and Haz-
ard Analysis Audit Committee of the
National Research Council (NRC), chaired
by retired USAF General Alton Slay, reports
directly to the NASA Administrator and is
responsible for verifying the adequacy of the
proposed actions for returning the Space
Shuttle to flight status (see Appendix F for
panel membership and a summary of
responsibilities).
The committee has discussed the FMEA/
CIL/HA reevaluation process with representatives
from NASA Headquarters, JSC,
KSC, and MSFC. Meetings have been held
at the centers and at Rockwell International's
Space Transportation Systems and
Rocketdyne divisions; Morton Thiokol;
United Space Boosters, Inc.; Sundstrand
Corporation; and NRC Headquarters. The
committee is evaluating the adequacy of the
review process, checking for continuity
across all elements of the program, and
reviewing changes that NASA and its con-
tractors have made since the accident.
A preliminary report was submitted to
the NASA Administrator on January 13,
1987, indicating that the committee has been
favorably impressed with the results obtained
from the FMEA/CIL and hazard analysis
processes. While the committee's general
impressions were favorable, it did make some
suggestions for improvements. In summary,
these suggestions are: (1) Criticality 1 and 1R
items should be assigned priorities based on
the probability of occurrence; (2) since many
of the Criticality 1 and 1R items differ sub-
stantially in terms of the probability of fail-
ure, NASA should consider modifying the
definition of critical items to account for
these differences; (3) NASA should incorpo-
rate its present system review procedures into
an integrated system assessment process
coupled closely with the FMEA/CIL reevalu-
ation now being undertaken; (4) linkage
between the STS engineering change activi-
ties and the FMEA/CIL/HA processes
should be provided.
NASA has responded to these sugges-
tions in the following manner:
1. Several candidate systems for prioritizing
critical items have been evaluated by each
of the projects. A hybrid system has been
developed that incorporates the positive
features of the candidate systems and specifically
addresses probability of occurrence.
The approach can be overlaid on
the existing FMEA activity with mini-
mum perturbation, providing an effective
measure of relative risk.
In parallel with the development of
prioritization techniques, an effort is
under way to determine the applicability
of probability risk assessment to the
FMEA/CIL process. This technique is
used in the nuclear power industry to provide
relative-risk assessments. Two firms
with expertise in probability analysis have
been selected to perform detailed assessments
of the orbiter auxiliary power unit
and the main propulsion engine pressurization
system. A decision to apply probability
analysis techniques to other systems
of the program will depend on the results
of these assessments.
2. The FMEA/CIL prioritization process
will provide the necessary program focus
and more definitive definitions in
response to the committee's concern
expressed in their second suggestion.
3. Since the accident, NASA has reempha-
sized its risk management effort. An
important feature of the revised effort is a
"systems engineering" approach that inte-
grates the various elements of hardware
and software failure analysis. Further dis-
cussion of risk management is included in
the response to Recommendation IV.
4. Engineering changes are processed
through the same project and program
control boards that conduct and approve
the reviews of the FMEA/CIL. Each
change request will be assessed to deter-
mine if it affects any Criticality 1 or 2
hardware to ensure that the required link-
age is provided.
The NRC audit committee is reviewing
additional areas to identify potential meth-
ods of reducing risk. These include the design
qualification and flight certification pro-
cesses, launch commit criteria and waiver
policy, and the generation, review, and
approval of retention rationale for waivers to
critical items.
Also being reviewed are the overall
safety, reliability, maintainability, and quality
assurance program, the definition of struc-
tural analysis requirements, the establish-
ment and verification of analyses for margins
of safety, the risk management processes for
software, and the processes for analyzing pay-
load safety.
Interim findings and recommendations
from these reviews will be submitted to the
NASA Administrator through letter reports,
as required. The final report, anticipated in
1987, will include an assessment of the proce-
dures reviewed and recommendations for
improving the Shuttle risk management system.
As reports are received, any recommendations
included will be reviewed by NASA
and responses will be provided to NRC.
NATIONAL RESEARCH COUNCIL
COMMISSION ON ENGINEERING AND TECHNICAL SYSTEMS
2101 Constitution Avenue, Washington, D.C. 20418
AERONAUTICS AND SPACE
ENGINEERING BOARD
July 22, 1987
The Honorable James C. Fletcher
Administrator
National Aeronautics and Space Administration
Washington, D.C. 20546
Dear Jim:
I am pleased to provide this second interim progress report of the
National Research Council's Committee on Shuttle Criticality Review and
Hazard Analysis Audit. I wish to thank you for your letter of April 22,
1987, in which you summarized the steps that the National Aeronautics and
Space Administration (NASA) is taking in response to the suggestions in
our first report to you of January 13, 1987. The Committee is indeed
gratified by the progress NASA is making in strengthening the Space
Transportation System (STS) risk management program. We also appreciate
the continued close collaboration with NASA and contractor personnel, and
note the interest they show and their responsiveness to the Committee's
suggestions. The purpose of this letter is to react to the actions of
NASA taken in response to our first letter, and to comment on some
additional aspects of STS risk management.
Since our last report, the full Committee has met six more times,
including visits to Marshall Space Flight Center, Kennedy Space Center,
again to Rocketdyne on the Space Shuttle Main Engine (SSME), and with
Rockwell Space Transportation System Division on STS integration. Working
groups of the Committee also met at appropriate NASA centers and
contractors to review the risk management aspects of the Solid Rocket
Booster (SRB); Orbiter Auxiliary Power Unit (APU) and SRB Hydraulic Power
Unit (HPU); Shuttle structural analysis, margins and verification; Orbiter
nose wheel steering; software; and Space Shuttle Main Engine. This
continued audit has allowed the Committee to evaluate the changes NASA is
making in the STS risk management processes and to identify some
additional views which we thought would be useful to share with you in
this interim report.
Regarding the response of NASA to the first report, the Committee's
reaction is, in summary:
o The work under way to assign priorities to Criticality 1 and 1R
items appears to be a significant step forward. We also are
pleased to note the tests of Probabilistic Risk Assessment (PRA)
now being conducted.
o The Committee looks forward to learning how the prioritization
process will be used to redefine the critical items by taking
into account the differences in the probability of occurrence.
o We enthusiastically support the agency-wide risk management
system now being developed. However, we are still concerned with
the apparent lack of consideration of the STS as a single,
complex system rather than a collection of subsystems.
o The steps taken to link the engineering change control and the
Failure Modes and Effects Analyses/Critical Items List (FMEA/CIL)
processes are both appropriate and welcome. We are also
reassured by your statement that the flight schedule will not be
allowed to reduce the rigor with which the risk management tasks
will be conducted.
The Committee's continuing audit since our last interim report leads us to
provide initial comments on the following topics:
o Persons involved in the STS program frequently give the
impression that decisions are made collectively by panels,
boards, etc., rather than by the responsible individuals. We
believe that the Administrator of NASA should periodically remind
the NASA organization of the specific individuals responsible for
final decisions based on the advice received from each advisory
body.
o The new System Integrity Assurance Program (SIAP), especially its
Program Compliance Assurance and Status System (PCASS), now being
implemented by the National Space Transportation System (NSTS)
Program office, will be invaluable as a tool in support of STS
risk management. The STS failures data base, when completed, can
be of major importance in determining the probability that the
worst case effect postulated in the FMEA will actually occur.
o The progress being made in improvements to the SSME as a result
of the FMEA/CIL reevaluation is very encouraging.
o The changes being introduced in NASA Headquarters Safety,
Reliability, Maintainability and Quality Assurance (SRM&QA)
appear to be well planned and in the right direction. However,
we are concerned that it is not adequately staffed to cope with
the demands placed upon it, and recognize that close
collaboration with the centers and program offices is necessary
to improve risk management in NASA.
o A risk assessment report, based upon both the FMEA/CIL/retention
rationale and a comprehensive hazard and safety assessment,
should be the basis for the acceptance rationale in considering
waivers to fly Criticality 1 components.
o There appear to have been unexplained differences among the STS
elements in the approach to and the rigor of the FMEA/CIL
reevaluations. The methods being used should be reviewed to
assure that any differences which exist will not compromise the
FMEA/CIL reevaluation process.
o The panels and boards (Program Requirements Change Board, Flight
Readiness Review, etc.) that advise key NASA decision makers are
not adequately staffed with people skilled in the statistical
sciences of data analysis, statistical inference, and
probabilistic risk assessment; persons with such skills should be
added to provide improved support of the decision making process.
o A greater effort is needed to plan for additional elimination or
reduction of risks in the STS.
Following is an elaboration on these topics.
COMMENTS ON NASA RESPONSE
Setting priorities for Criticality 1 and 1R items
We are pleased to see the steps being taken to assign priorities to the
critical items. The Committee notes that the technique proposed for
implementation lends itself to the incorporation of quantitative measures
of risk and probabilities of occurrence as these measures are developed.
However, the Committee urges that care be taken to assure that over-simplified
but potentially inaccurate quantitative measures are not used.
We have been assured by a representative of the NSTS office that the
prioritization process can be completed well before the next Shuttle
launch, which we believe to be an important consideration. We look
forward to learning how NASA plans to use the results of this process. I
can understand your desire to defer a decision to delegate from Level I of
NASA the review and approval of waivers on certain critical items until
you have assessed the results of the new prioritization and risk
assessment processes. However, the Committee believes that before the
next launch some method should be used to assure that NASA Level I gives
special attention to the highest priority items identified through the
prioritization process.
The Committee is delighted to learn that NASA is testing the use of
Probabilistic Risk Assessment (PRA) on the APU and HPU, and the Shuttle
main propulsion pressurization system. We also are aware of the SSME
certification process assessment study being conducted at the Jet
Propulsion Laboratory, which includes a PRA of the SSME. The Committee
cautions NASA on its intention to evaluate PRA by comparing the results of
only two or three disparate tests of PRA with the results obtained earlier
by the FMEA/CIL process. The criterion should not only be whether a
significant new problem is identified by the PRA. The PRA test results
should be used by NASA to answer the questions: Would the PRA have helped
in making NASA's original decisions, e.g., on a Criticality 1 waiver?
Would it have given more confidence in the decisions that were made? The
current sample size is too small to judge its merits when applied to the
entire STS or even a complex element such as the Orbiter. The PRA should
increase in value as the scope of its coverage of the STS is widened. It
also should be useful in better understanding the nature of the failure
modes.
Integrated Space Transportation System analysis
The Committee is pleased to note that the NASA Associate Administrator for
SRM&QA has been directed to develop an agency-wide risk management
system. We believe that it is important to call attention to the totality
of "risk management" as the sum of a number of separate processes which
ultimately must be considered on an integrated basis.
The committee is still concerned that at the NSTS office at JSC we have
not found a consolidated, integrated STS systems engineering analysis,
including system safety analysis, that views the sum of the STS elements
as a single system. Such a "top-down" engineering analysis would help
avoid potential gaps which may exist as a result of the present very
thorough "bottom-up" analyses centered at the subsystem and element
project levels.
We have recently become aware of the Avionics Audit which is conducted by
Rockwell International-STS Division for the NSTS Program office. We
understand that this audit process will be expanded to embrace eventually
the entire STS. The Committee believes that an expanded audit of this
type could serve as the nucleus of the needed integrated STS engineering
analysis in support of risk management.
Relation between FMEA/CIL/Hazard Analysis and design changes
The Committee is reassured by the steps NASA has taken to tighten the
procedure for assessing the impact of any proposed design change on
Criticality 1 or 2 hardware; by the requirement that all changes
introduced before a flight must undergo a FMEA which also must be accepted
by the change board; and by your statement that the flight schedule will
not be permitted to reduce the rigor with which these risk management
tasks are conducted.
COMMENTS ON NEW TOPICS
Role of panels and boards in STS decisions
The Committee recognizes the important role played by the many panels and
boards in the NSTS program in providing coordination, resolving problems
and technical conflicts, and reviewing and recommending actions. These
entities allow the different interests and skill groups to bring forward
their inputs, contribute their knowledge, and thus minimize the risk that
a proposed action will negatively affect some aspect of the STS. We
presume that each of these entities recommends an action to an appropriate
official, such as a project manager at Level III or the Deputy Director of
the NSTS Program at Level II, who actually makes and takes responsibility
for the decision.
The Committee is concerned about a possible attitudinal problem regarding
the decision process on the part of the NASA personnel engaged in it.
When we ask a NASA manager about how a decision is made, often we are told
that it is made by such-and-such a board. We are concerned that there may
be a tendency for those involved in the multi-layered review and decision
process to hide in the anonymity of panels and boards, and that each
person who must sign off on an item may not be inclined to concentrate
enough on his or her individual responsibility in light of the number of
levels of group reviews involved in the decision process. The Committee
recommends that the Administrator of NASA periodically remind all of the
NASA organization of the specific individuals by name and position who are
responsible for final decisions (and the organizational relationships
among them) based on the advice coming from each panel and board. This
would not detract from the important role played by all members of the
panels and boards in providing advice to the decision maker.
Potential of the Program Compliance Assurance Status System (PCASS)
The Committee is enthusiastic about the potential of the PCASS, which is
being established as a major part of the new System Integrity Assurance
Program (SIAP) of the NSTS.
It should improve the quality of information
available to key decision makers (e.g., at Flight Readiness Reviews) by
providing in near real-time an integrated view of the status of problems
with the STS, including trends, anomalies and deviations, assessments, and
closure information. Plans to keep up to date and computerize the FMEA
will provide a very useful input to PCASS. The Committee also has learned
of the data base maintained by the Johnson Space Center (JSC) SR&QA office
which documents in one place the failures which have occurred on the
Orbiter during ground testing and in flight. It is encouraging to note
that of those failures of components on the Orbiter categorized as
Criticality 1 which have occurred during flight, none resulted in the
worst-case effect postulated in the FMEA. These failure data can be very
valuable in connection with the new CIL prioritization system in
establishing the probability that the postulated effects will actually
occur, given the failure in flight. We understand that this, and similar
data bases for the other STS elements, will be integrated into the PCASS.
We believe that PCASS, as a real-time data base, has the potential to
become a key element of the STS risk management, and thus its full and
timely development should be encouraged and supported. The Committee
recommends that this development be given a high priority and that the
potential users of PCASS, including key decision makers, be involved
closely now in its development.
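To illustrate how failure data of this kind might bound the probability that a postulated worst-case effect occurs, given a failure in flight, the following is a minimal sketch in Python. It assumes a hypothetical count of in-flight Criticality 1 failures, none of which produced the worst-case effect, and applies the exact one-sided binomial upper bound for zero observed events. The function name and the counts are illustrative assumptions; they are not data from the JSC data base, nor a method described in this letter.

```python
def zero_event_upper_bound(n_failures: int, confidence: float = 0.95) -> float:
    """Upper confidence bound on P(worst-case effect | failure) when zero
    worst-case effects were observed among n_failures failures.

    With zero events, the exact one-sided binomial (Clopper-Pearson) bound
    is the value of p that solves (1 - p) ** n_failures = 1 - confidence.
    """
    if n_failures <= 0:
        raise ValueError("n_failures must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_failures)


if __name__ == "__main__":
    # Hypothetical failure counts, for illustration only.
    for n in (5, 20, 100):
        print(n, round(zero_event_upper_bound(n), 4))
```

The bound shrinks only slowly as the number of observed failures grows, so a clean record over a small number of failures supports only a modest statement about the underlying probability.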
Progress on the SSME as a result of the FMEA/CIL reevaluation
Based on its second visit to Rockwell International - Rocketdyne Division,
the Committee is encouraged by the progress being made in improving the
SSME as a result of the FMEA/CIL reevaluation. We also applaud the
improvements in the test program which are designed to validate the
reliability of the modified SSME before first flight. The SSME is one of
the few cases in which the Committee has found that changes have been made
as a result of the FMEA/CIL. In most other cases, the Committee observes
that the initiation of changes has not originated with the FMEA/CIL
process.
NASA Headquarters Safety, Reliability, Maintainability and Quality
Assurance (SRM&QA) program
In April, the Committee received a comprehensive briefing regarding the
status and plans for the NASA Headquarters SRM&QA program. We are
encouraged by the progress that has been made. The Committee believes
that the program is going in the right direction. We recognize the
magnitude of the task ahead; however, the goals and the program plans
developed so far appear to be sound. The Committee is concerned that
SRM&QA (at Headquarters and the centers) is not adequately staffed to cope
with the demands being placed upon it, perhaps necessitating the
additional use of contract personnel in order to carry out their functions
before the launch of the next Shuttle. The Committee also believes that
it will be particularly important to develop close collaboration with the
NASA centers as well as other program offices in order to do those things
which are needed to create a total risk management system augmenting the
independent check and balance role of SRM&QA.
Input to waiver decisions
The Committee understands that FMEAs, CIL determinations, and their
retention rationale are developed by the STS design and development
people. The SRM&QA, operations and other relevant personnel contribute as
appropriate. The FMEA/CIL and retention rationale so produced are among
the inputs to the hazard analyses which are done by the safety people. In
this case, design, development, operations and other relevant personnel
contribute as appropriate. The output of these two processes (FMEA/CIL/
retention rationale on the one hand, and hazard analyses on the other) are
individually approved by the Program Requirements Control Board (PRCB).
However, the Committee is concerned that the FMEA/CILs with their
design-based retention rationale have become the only effective input to
Levels II and I in their waiver decisions to accept the designs as safe
enough to fly.
The Committee recommends that the present design-based retention rationale
should be only one part of the rationale required to accept the hazards
which can result from each critical failure mode. The other part should
be the output of the hazard and safety assessments, including evaluations
of the probability that the hazardous conditions will actually develop and
the probability that these conditions will lead to a Criticality 1
consequence. A risk assessment report, embracing the design retention
rationale and the hazards/safety assessment, should provide the acceptance
rationale for consideration by Level II and I managers in reaching their
decisions on the granting of waivers.
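The two probabilities named above can be combined in a simple way, sketched here in Python as an illustration only. The sketch assumes that a Criticality 1 consequence can arise from a given failure mode only through the hazardous condition, so the two conditional probabilities multiply. The class name, field names, and numbers are hypothetical and do not come from any NASA procedure or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HazardAssessment:
    """Hypothetical record for one critical failure mode."""
    failure_mode: str
    p_hazard_given_failure: float  # P(hazardous condition develops | failure)
    p_crit1_given_hazard: float    # P(Criticality 1 consequence | hazard)

    def p_crit1_given_failure(self) -> float:
        """P(Criticality 1 consequence | failure), assuming the consequence
        can only occur by way of the hazardous condition."""
        for p in (self.p_hazard_given_failure, self.p_crit1_given_hazard):
            if not 0.0 <= p <= 1.0:
                raise ValueError("probabilities must lie in [0, 1]")
        return self.p_hazard_given_failure * self.p_crit1_given_hazard


if __name__ == "__main__":
    # Illustrative values only; not drawn from any FMEA/CIL.
    example = HazardAssessment("example valve leak", 0.1, 0.5)
    print(example.failure_mode, example.p_crit1_given_failure())
```

A risk assessment report of the kind recommended could carry such an estimate alongside the design-based retention rationale, with the supporting evidence for each probability stated explicitly.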
Differences in FMEA/CIL reevaluation process among STS elements
In the Committee's audit of the reevaluation of the FMEA/CILs, a number of
differences were found in the process being used by different element
project offices and contractors. In some cases, we were unable to
ascertain the reasons for the observed differences. For example, the
independent contractors evaluating the FMEA/CILs for the STS elements
managed by the Marshall Space Flight Center are required to review all
subsystems and to file a Review Item Discrepancy (RID) when they differ
with the results of the element contractor's analysis. On the other hand,
the independent contractor for the Orbiter evaluation was not directed to
review all parts of the Orbiter and does not file RIDs. We understand
that JSC now has directed the contractor to review all subsystems in the
Orbiter. An audit by the Committee of the documentation and review
process used in the case of the Orbiter indicates that it is a reasonable
alternative to the RID process. Nevertheless, the Committee suggests that
the NSTS program office review the FMEA/CIL reevaluation processes as
implemented for each STS element to assure itself that any differences
will not compromise the quality and completeness of the STS FMEA/CIL
effort as a whole.
Expertise in Statistical Sciences
The key technical decision makers in NASA operate as chairmen of bodies
that review relevant technical information. The decisions involve design,
requirements, waivers, launch decisions, etc. Much of this information is
in the form of complex engineering data, such as test, inspection, flight,
and weather data. These bodies draw upon experts in many engineering
disciplines to deal with the complexities. Indeed, it is important that
there be close ties among the design engineers, test and analysis people,
and decision makers throughout the process of designing, building,
certifying, and using components and systems. However, the Committee
finds that these bodies are not adequately supported by people skilled in
the statistical sciences to aid in the transformation of sampled data into
information useful for decision making.
The Committee recommends that NASA build up its staff of experts in the
statistical sciences (civil servants and contract support) to provide
improved analytical support of risk management and of key decision makers
by the application of modern statistical analysis, inference and
assessment techniques.
Reducing the risk in the Space Transportation System
Even with the current FMEA/CIL and hazard analysis efforts which are
supported thoroughly within NASA and by its contractors, the Committee
receives the impression that changes often may only be considered which
will reduce risks to that level which has been previously accepted in the
STS program. The Committee believes that such risks, accepted in the
past, logical as that may have appeared to be at the time, should not now
be accepted without a concentrated effort to plan and implement a program
to remove or reduce these risks.
FUTURE WORK
The Committee is continuing its audit by examining other aspects of the
STS risk management process. Among these are the design qualification and
flight certification processes; a further look at integrated systems
analysis; launch commit criteria and waiver policy; the process for
generating, reviewing, revising and approving the retention rationale for
waivers to permit flight of the Shuttle with critical items that affect
safety; the process for structural analysis, establishment of margins, and
verification of analyses and margins; the risk management process for STS
software; and the process for analyzing the effect of payloads on the
safety of the Shuttle, ground personnel, and flight crews.
Admiral Richard H. Truly
We plan to issue a final report of the Committee late this year. It will
include our assessment of all of the procedures reviewed and recommendations
for improvement of the STS risk management system. If it should
appear desirable, we will provide another interim letter report to convey
findings and recommendations which may emerge from the reviews now under
way.
Sincerely yours,
Alton D. Slay
Chairman
Committee on Shuttle Criticality
Review and Hazard Analysis Audit