
OCR for page 97
APPENDIX C

LETTER REPORTS TO THE ADMINISTRATOR OF NASA AND NASA RESPONSE

Prior to this final report, the Shuttle Criticality Review and Hazard Analysis Audit Committee issued two interim letter reports to the Administrator of the National Aeronautics and Space Administration. The Administrator of NASA provided a response to the Committee regarding the first interim report. It also was referenced in NASA's Report to the President of June 1987. These documents are contained in this appendix.

First interim letter report to the Administrator of NASA from Committee Chairman Alton D. Slay, January 13, 1987, 4 pp.

Reply to Committee Chairman Alton D. Slay from the Administrator of NASA regarding the first report, April 22, 1987

Report to the President: Implementation of the Recommendations of the Presidential Commission on the Space Shuttle Challenger Accident, NASA, June 1987, excerpts from pp. 41-42

Second interim letter report to the Administrator of NASA from Committee Chairman Alton D. Slay, July 22, 1987, 8 pp.

NATIONAL RESEARCH COUNCIL
COMMISSION ON ENGINEERING AND TECHNICAL SYSTEMS
2101 Constitution Avenue, Washington, D.C. 20418
AERONAUTICS AND SPACE ENGINEERING BOARD

January 13, 1987

The Honorable James C. Fletcher
Administrator
National Aeronautics and Space Administration
Washington, D.C. 20546

Dear Jim:

This is an interim progress report of the Shuttle Criticality Review and Hazard Analysis Audit Committee. The National Research Council formed this committee in response to your request for an audit of the NASA response to the Presidential Commission Recommendation III regarding criticality review and hazard analysis.

The Committee has been a functioning entity since its first meeting on September 22, 1986. We have thus far received presentations from and engaged in detailed discussions with NASA Headquarters, the National Space Transportation System program office, Johnson Space Center, Marshall Space Flight Center, and Kennedy Space Center. Similar meetings were held at Rocketdyne (Space Shuttle Main Engine) and Rockwell International (Orbiter), and by a working group at Morton Thiokol (Solid Rocket Motors). All of the participants described their efforts and progress in reevaluating the Failure Modes and Effects Analysis (FMEA) and Critical Items List (CIL) status and in reassessing hazard analysis and risk management.

The Committee also has received a briefing on and discussed the process being used by the U.S. Air Force Systems Command-Space Division to determine launch readiness and safety status. The Titan 34D Recovery Program was described as an example.

The Committee has been favorably impressed by the dedicated effort and extremely beneficial results obtained thus far from the FMEA/CIL and hazard analysis processes. We are very appreciative of the frank and open manner in which NASA and contractor personnel have worked with the Committee. Our suggestions have been received in a very responsive manner in all quarters.

We wish to commend Admiral Truly, Arnold Aldrich and the NASA Shuttle team involved in the FMEA/CIL-hazard analysis processes for the significant work they have performed so far.

Although our general impressions are favorable, we do have some suggestions for improvement. In summary, they are:

o Criticality 1 and 1R items should be assigned priorities based on the probability of occurrence.

o Since many of the Criticality 1 and 1R items differ substantially in terms of the probability of failure, NASA should consider modifying the definition of critical items to account for these differences.

The National Research Council is the principal operating agency of the National Academy of Sciences and the National Academy of Engineering to serve government and other organizations

Letter to the Honorable James C. Fletcher

o NASA should incorporate its present total system review procedures in an integrated systems assessment process coupled closely with the FMEA/CIL reevaluation now being undertaken.

o Linkage between the STS engineering change activities and the FMEA/CIL-hazard analysis processes should be assured.

SETTING PRIORITIES FOR CRITICALITY 1 AND 1R ITEMS

NASA does not now set priorities for Criticality 1 and 1R items nor does it consider the probability of occurrence of an event in the treatment of these items. The Committee recommends that NASA devise some mechanism for and assign priorities to the Criticality 1 and 1R items. It suggests that probability of occurrence should be an important element of any such priority ranking. Basing priorities on this fundamental measurement of risk will help NASA and those interested in its progress to evaluate the adequacy of changes being made to Shuttle hardware, software, or procedures in the interest of enhancing safety.

Essential to the success of any risk assessment process is the certain and timely feedback of preflight and postflight system performance data, along with test data and failure or degradation reports. Such inputs are critical to any successful FMEA/CIL and hazard analysis program and can form the basis for more precise evaluation of risk. While it is clear to the Committee that these data are used in readiness reviews and other NASA activities, it is not clear that they are used in the FMEA/CIL or hazard analysis processes. The Committee believes that this information can, if properly used, assist greatly in the FMEA/CIL and hazard analysis processes and in the determination of priorities.

The present decision-making process within NASA with regard to FMEA/CIL appears to be based on the judgment of experienced practitioners and has received very little contribution from quantitative analysis. We believe that the failure of NASA to use numerical techniques as an input to decision-making detracts from the overall effectiveness of the FMEA/CIL and hazard analysis processes. Such techniques could provide a more realistic assessment of risk, at least on a relative basis. We do not wish to suggest that NASA subordinate technical judgment to numerical analysis. Such an approach would be, in our opinion, [illegible] and perhaps counterproductive.

Currently waiver authority for all Criticality 1 and 1R items rests with NASA Level I. The Committee believes that Level I should focus its attention on the highest priority items resulting from the

suggested selection process, along with the rationale that produced the priority rating. The waiver decision authority for the remainder of the Criticality 1 and 1R items should be delegated to Levels II and perhaps III.

DEFINITION OF CRITICALITY CATEGORIES

The Committee notes that the dedicated response of the entire NASA organization and its contractors has produced a variety of items which, by precise definition, must be placed in the Criticality 1 or 1R categories. Many of the items differ substantially from one another in terms of the probability of failure or performance and thus their potential impact on Shuttle operational safety. The Committee suggests that NASA consider a modification of the Critical Items List to account for these differences, help the priority selection process, and better focus present or future efforts to achieve safer Shuttle operations.

INTEGRATED SPACE TRANSPORTATION SYSTEM ANALYSIS

The Committee understands that various mechanisms are being used by NASA to examine total system operation, including propagation of failure modes to interfacing or physically adjacent modules or subsystems. The Committee does not perceive, however, any formal relationship of such evaluation methods to the ongoing FMEA/CIL process. The Committee suggests that NASA devise an integrated STS systems assessment process which is closely coupled with the FMEA/CIL activity to assure assessment of the truly critical safety elements in the STS. This includes all combinations of hardware/software/procedural failures and cascading failures.

RELATION BETWEEN FMEA/CIL-HAZARD ANALYSIS AND DESIGN CHANGES

We note that many engineering changes have been undertaken since the 51-L accident to improve Shuttle safety prior to resumption of flight, now scheduled for February 1988. In parallel, the FMEA/CIL and hazard analysis reevaluations are under way with completion expected during the summer of 1987. Thus, the FMEA/CIL reevaluation may not adequately reflect all of the engineering changes, nor will there be time to incorporate any substantial design changes that may be indicated by the outcome of the FMEA/CIL reevaluation, hazard analyses, and related activities. The Committee recommends that NASA assure a close linking between the STS engineering change activities and the FMEA/CIL-hazard analysis processes.
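The priority-setting suggestion made earlier in this letter (rank Criticality 1 and 1R items with probability of occurrence as an important element, and have Level I concentrate on the highest priority items) can be sketched in a few lines. This is only an illustration under stated assumptions: the `CriticalItem` type, the item names, the probability values, and the cutoff `level_one_count` are all hypothetical and do not come from the letter or from any NASA procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriticalItem:
    name: str            # hypothetical item label
    criticality: str     # "1" or "1R"
    p_occurrence: float  # assumed estimate of probability of occurrence


def rank_by_probability(items):
    """Order items from highest to lowest estimated probability of occurrence."""
    return sorted(items, key=lambda item: item.p_occurrence, reverse=True)


def split_by_review_level(ranked, level_one_count):
    """Keep the top-ranked items for Level I; the remainder would be delegated."""
    return ranked[:level_one_count], ranked[level_one_count:]


# Placeholder values chosen only to exercise the sketch.
items = [
    CriticalItem("item A", "1", 1e-4),
    CriticalItem("item B", "1R", 5e-3),
    CriticalItem("item C", "1", 2e-5),
]

ranked = rank_by_probability(items)
level_one, delegated = split_by_review_level(ranked, level_one_count=1)
print([item.name for item in ranked])
```

A single sort key is the simplest possible reading of "priority"; the letter itself says probability should be one important element alongside technical judgment, so a real scheme would presumably combine several factors.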

FUTURE WORK

The Committee is continuing its effort to audit the FMEA/CIL, hazard analysis, and related processes dealing with risk assessment. We have planned additional visits to NASA centers and contractor facilities where we will continue to examine the mechanisms used by NASA and its contractors to provide for the overall safety of the STS as an integrated system. We also will further refine some of the points raised here in future reports to you.

While we recognize that it is not possible a priori to ensure mission success and flight safety, through this review and audit we hope to assist NASA in taking those prudent steps which will provide a reasonable and responsible level of assurance of flight safety. We will, of course, remain in close contact with your staff throughout this activity.

Sincerely yours,

Alton D. Slay
Chairman
Committee on Shuttle Criticality Review and Hazard Analysis Audit

cc: Admiral Richard H. Truly

NASA
National Aeronautics and Space Administration
Washington, D.C. 20546
Office of the Administrator

General Alton D. Slay
National Research Council
National Academy of Engineering
2101 Constitution Avenue, NW (NAS 307)
Washington, DC 20418

Dear Al:

In reply to your January 13, 1987, interim progress report of the Committee on Shuttle Criticality Review and Hazard Analysis, your four suggestions are repeated, along with NASA's response to each.

NRC Comment: "Criticality 1 and 1R items should be assigned priorities based on the probability of occurrence." (This comment also suggested the use of probability analysis techniques and the delegation of certain criticality items to lower levels of the organization.)

NASA Response: The National Space Transportation System is in the process of selecting and implementing a critical items prioritization technique for the Shuttle program. Five different techniques have been evaluated by review teams at JSC, MSFC, and KSC. One of these techniques has been selected to be presented to the program manager at a Program Requirements Control Board (PRCB) for baselining as a formal program requirement. The chosen approach will overlay the existing Failure Mode and Effects Analysis/Critical Items List (FMEA/CIL) activity with minimum perturbation, yet provide an effective measure of relative risk in order to focus future review emphasis and resource allocations.

In parallel with the prioritization technique development, an effort is also under way to assess the utility of probabilistic risk assessment in the NSTS FMEA/CIL process. Activities have been initiated to engage two independent firms with expertise in probabilistic risk assessment to perform detailed reviews of the orbiter auxiliary power unit and the shuttle main propulsion pressurization system. A decision to apply such probabilistic risk assessment techniques to other elements of the Shuttle will depend upon assessments of the results and impacts of those efforts and comparison of these results with the results of the mainline FMEA/CIL activity.

Delegating the review and approval of certain critical items will be decided after the results of the prioritization and risk assessment activities have been thoroughly assessed.

NRC Comment: "Since many of the Criticality 1 and 1R items differ substantially in terms of the probability of failure, NASA should consider modifying the definition of critical items to account for these differences."

NASA Response: We expect the FMEA/CIL prioritization process will provide the necessary definitions and program focus in this regard.

NRC Comment: "NASA should incorporate its present total system review procedures in an integrated systems assessment process coupled closely with the FMEA/CIL reevaluation now being undertaken."

NASA Response: Since the Challenger accident, NASA has reemphasized its risk management effort. An important feature of the revised effort must be a "systems engineering" approach that integrates the various elements of the risk management process to assure assessment of the combinations of hardware, software, procedures, and cascading failures. NASA's new Associate Administrator for Safety, Reliability, Maintainability and Quality Assurance has been tasked to develop a new agencywide risk management system.

NRC Comment: "Linkage between the STS engineering change activities and the FMEA/CIL hazard analysis processes should be assured."

NASA Response: Engineering changes are processed through the same Space Shuttle configuration control boards that conduct the review of the FMEA/CIL. A recent change to the procedure requires an assessment of each change request to determine if it affects any Criticality 1 or 2 hardware. The nature of the combined change control and FMEA/CIL processes is such that the total process cannot be completed until the last change to be implemented before flight has itself undergone a FMEA and been dispositioned by the board. Regardless of the timetable established by the NSTS working schedule for FMEA/CIL preparation and review, the changes that result will be dealt with in the same manner as the generating FMEA items. All changes mandatory for first flight will undergo the same rigor, even if this results in a flight schedule impact. The NSTS Systems Design Reviews which began early last year have significantly reduced the likelihood of new changes being identified that have major schedule impacts.

The dedication of your committee and the sincerity of its comments are very much appreciated by NASA. I hope you find our actions in response to your suggestions to be both appropriate and timely. Thank you again for your help.

Sincerely,

James C. Fletcher
Administrator

Report to the President

IMPLEMENTATION of the RECOMMENDATIONS of the Presidential Commission on the Space Shuttle Challenger Accident

June 1987

EIFA's have been conducted on ET/orbiter, SSME/orbiter, and SRB/ET/orbiter interfaces. These analyses have been reviewed by NASA and the systems integration contractor, and the results are under evaluation by the element project offices and the NSTS Engineering Integration Office. When this review is completed, the finalized EIFA's will be presented to the PRCB for formal approval.

NATIONAL RESEARCH COUNCIL AUDIT

The Shuttle Criticality Review and Hazard Analysis Audit Committee of the National Research Council (NRC), chaired by retired USAF General Alton Slay, reports directly to the NASA Administrator and is responsible for verifying the adequacy of the proposed actions for returning the Space Shuttle to flight status (see Appendix F for panel membership and a summary of responsibilities).

The committee has discussed the FMEA/CIL/HA reevaluation process with representatives from NASA Headquarters, JSC, KSC, and MSFC. Meetings have been held at the centers and at Rockwell International's Space Transportation Systems and Rocketdyne divisions; Morton Thiokol; United Space Boosters, Inc.; Sundstrand Corporation; and NRC Headquarters. The committee is evaluating the adequacy of the review process, checking for continuity across all elements of the program, and reviewing changes that NASA and its contractors have made since the accident.

A preliminary report was submitted to the NASA Administrator on January 13, 1987, indicating that the committee has been favorably impressed with the results obtained from the FMEA/CIL and hazard analysis processes. While the committee's general impressions were favorable, it did make some suggestions for improvements. In summary, these suggestions are: (1) Criticality 1 and 1R items should be assigned priorities based on the probability of occurrence; (2) since many of the Criticality 1 and 1R items differ substantially in terms of the probability of failure, NASA should consider modifying the definition of critical items to account for these differences; (3) NASA should incorporate its present system review procedures into an integrated system assessment process coupled closely with the FMEA/CIL reevaluation now being undertaken; (4) linkage between the STS engineering change activities and the FMEA/CIL/HA processes should be provided.

NASA has responded to these suggestions in the following manner:

1. Several candidate systems for prioritizing critical items have been evaluated by each of the projects. A hybrid system has been developed that incorporates the positive features of the candidate systems and specifically addresses probability of occurrence. The approach can be overlaid on the existing FMEA activity with minimum perturbation, providing an effective measure of relative risk.

In parallel with the development of prioritization techniques, an effort is under way to determine the applicability of probability risk assessment to the FMEA/CIL process. This technique is used in the nuclear power industry to provide relative-risk assessments. Two firms with expertise in probability analysis have been selected to perform detailed assessments of the orbiter auxiliary power unit and the main propulsion engine pressurization system. A decision to apply probability analysis techniques to other systems of the program will depend on the results of these assessments.

2. The FMEA/CIL prioritization process will provide the necessary program focus and more definitive definitions in response to the committee's concern expressed in their second suggestion.

3. Since the accident, NASA has reemphasized its risk management effort. An important feature of the revised effort is a "systems engineering" approach that integrates the various elements of hardware and software failure analysis. Further discussion of risk management is included in the response to Recommendation IV.

4. Engineering changes are processed through the same project and program control boards that conduct and approve the reviews of the FMEA/CIL. Each

change request will be assessed to determine if it affects any Criticality 1 or 2 hardware to ensure that the required linkage is provided.

The NRC audit committee is reviewing additional areas to identify potential methods of reducing risk. These include the design qualification and flight certification processes, launch commit criteria and waiver policy, and the generation, review, and approval of retention rationale for waivers to critical items.

Also being reviewed are the overall safety, reliability, maintainability, and quality assurance program, the definition of structural analysis requirements, the establishment and verification of analyses for margins of safety, the risk management processes for software, and the processes for analyzing payload safety.

Interim findings and recommendations from these reviews will be submitted to the NASA Administrator through letter reports, as required. The final report, anticipated in 1987, will include an assessment of the procedures reviewed and recommendations for improving the Shuttle risk management system. As reports are received, any recommendations included will be reviewed by NASA and responses will be provided to NRC.

NATIONAL RESEARCH COUNCIL
COMMISSION ON ENGINEERING AND TECHNICAL SYSTEMS
2101 Constitution Avenue, Washington, D.C. 20418
AERONAUTICS AND SPACE ENGINEERING BOARD

July 22, 1987

The Honorable James C. Fletcher
Administrator
National Aeronautics and Space Administration
Washington, D.C. 20546

Dear Jim:

I am pleased to provide this second interim progress report of the National Research Council's Committee on Shuttle Criticality Review and Hazard Analysis Audit. I wish to thank you for your letter of April 22, 1987, in which you summarized the steps that the National Aeronautics and Space Administration (NASA) is taking in response to the suggestions in our first report to you of January 13, 1987. The Committee is indeed gratified by the progress NASA is making in strengthening the Space Transportation System (STS) risk management program. We also appreciate the continued close collaboration with NASA and contractor personnel, and note the interest they show and their responsiveness to the Committee's suggestions. The purpose of this letter is to react to the actions of NASA taken in response to our first letter, and to comment on some additional aspects of STS risk management.

Since our last report, the full Committee has met six more times, including visits to Marshall Space Flight Center, Kennedy Space Center, again to Rocketdyne on the Space Shuttle Main Engine (SSME), and with Rockwell Space Transportation System Division on STS integration. Working groups of the Committee also met at appropriate NASA centers and contractors to review the risk management aspects of the Solid Rocket Booster (SRB); Orbiter Auxiliary Power Unit (APU) and SRB Hydraulic Power Unit (HPU); Shuttle structural analysis, margins and verification; Orbiter nose wheel steering; software; and Space Shuttle Main Engine. This continued audit has allowed the Committee to evaluate the changes NASA is making in the STS risk management processes and to identify some additional views which we thought would be useful to share with you in this interim report.

Regarding the response of NASA to the first report, the Committee's reaction is, in summary:

o The work under way to assign priorities to Criticality 1 and 1R items appears to be a significant step forward. We also are pleased to note the tests of Probabilistic Risk Assessment (PRA) now being conducted.

o The Committee looks forward to learning how the prioritization process will be used to redefine the critical items by taking into account the differences in the probability of occurrence.

o We enthusiastically support the agency-wide risk management system now being developed. However, we are still concerned with the apparent lack of consideration of the STS as a single, complex system rather than a collection of subsystems.

o The steps taken to link the engineering change control and the Failure Modes and Effects Analyses/Critical Items List (FMEA/CIL) processes are both appropriate and welcome. We are also reassured by your statement that the flight schedule will not be allowed to reduce the rigor with which the risk management tasks will be conducted.

The Committee's continuing audit since our last interim report leads us to provide initial comments on the following topics:

o Persons involved in the STS program frequently give the impression that decisions are made collectively by panels, boards, etc., rather than by the responsible individuals. We believe that the Administrator of NASA should periodically remind the NASA organization of the specific individuals responsible for final decisions based on the advice received from each advisory body.

o The new System Integrity Assurance Program (SIAP), especially its Program Compliance Assurance and Status System (PCASS), now being implemented by the National Space Transportation System (NSTS) Program office, will be invaluable as a tool in support of STS risk management. The STS failures data base, when completed, can be of major importance in determining the probability that the worst case effect postulated in the FMEA will actually occur.

o The progress being made in improvements to the SSME as a result of the FMEA/CIL reevaluation is very encouraging.

o The changes being introduced in NASA Headquarters Safety, Reliability, Maintainability and Quality Assurance (SRM&QA) appear to be well planned and in the right direction. However, we are concerned that it is not adequately staffed to cope with the demands placed upon it, and recognize that close collaboration with the centers and program offices is necessary to improve risk management in NASA.

o A risk assessment report, based upon both the FMEA/CIL/retention rationale and a comprehensive hazard and safety assessment, should be the basis for the acceptance rationale in considering waivers to fly Criticality 1 components.

o There appear to have been unexplained differences among the STS elements in the approach to and the rigor of the FMEA/CIL reevaluations. The methods being used should be reviewed to assure that any differences which exist will not compromise the FMEA/CIL reevaluation process.

o The panels and boards (Program Requirements Change Board, Flight Readiness Review, etc.) that advise key NASA decision makers are not adequately staffed with people skilled in the statistical sciences of data analysis, statistical inference, and probabilistic risk assessment; persons with such skills should be added to provide improved support of the decision making process.

o A greater effort is needed to plan for additional elimination or reduction of risks in the STS.

Following is an elaboration on these topics.

COMMENTS ON NASA RESPONSE

Setting priorities for Criticality 1 and 1R items

We are pleased to see the steps being taken to assign priorities to the critical items. The Committee notes that the technique proposed for implementation lends itself to the incorporation of quantitative measures of risk and probabilities of occurrence as these measures are developed. However, the Committee urges that care be taken to assure that oversimplified but potentially inaccurate quantitative measures are not used. We have been assured by a representative of the NSTS office that the prioritization process can be completed well before the next Shuttle launch, which we believe to be an important consideration. We look forward to learning how NASA plans to use the results of this process.

I can understand your desire to defer a decision to delegate from Level I of NASA the review and approval of waivers on certain critical items until you have assessed the results of the new prioritization and risk assessment processes. However, the Committee believes that before the next launch some method should be used to assure that NASA Level I gives special attention to the highest priority items identified through the prioritization process.

The Committee is delighted to learn that NASA is testing the use of Probabilistic Risk Assessment (PRA) on the APU and HPU, and the Shuttle main propulsion pressurization system. We also are aware of the SSME certification process assessment study being conducted at the Jet Propulsion Laboratory, which includes a PRA of the SSME. The Committee cautions NASA on its intention to evaluate PRA by comparing the results of only two or three disparate tests of PRA with the results obtained earlier by the FMEA/CIL process. The criterion should not only be whether a significant new problem is identified by the PRA. The PRA test results should be used by NASA to answer the questions: Would the PRA have helped

in making NASA's original decisions, e.g., on a Criticality 1 waiver? Would it have given more confidence in the decisions that were made? The current sample size is too small to judge its merits when applied to the entire STS or even a complex element such as the Orbiter. The PRA should increase in value as the scope of its coverage of the STS is widened. It also should be useful in better understanding the nature of the failure modes.

Integrated Space Transportation System analysis

The Committee is pleased to note that the NASA Associate Administrator for SRM&QA has been directed to develop an agency-wide risk management system. We believe that it is important to call attention to the totality of "risk management" as the sum of a number of separate processes which ultimately must be considered on an integrated basis.

The Committee is still concerned that at the NSTS office at JSC we have not found a consolidated, integrated STS systems engineering analysis, including system safety analysis, that views the sum of the STS elements as a single system. Such a "top-down" engineering analysis would help avoid potential gaps which may exist as a result of the present very thorough "bottom-up" analyses centered at the subsystem and element project levels.

We have recently become aware of the Avionics Audit which is conducted by Rockwell International-STS Division for the NSTS Program office. We understand that this audit process will be expanded to embrace eventually the entire STS. The Committee believes that an expanded audit of this type could serve as the nucleus of the needed integrated STS engineering analysis in support of risk management.

Relation between FMEA/CIL-Hazard Analysis and design changes

The Committee is reassured by the steps NASA has taken to tighten the procedure for assessing the impact of any proposed design change on Criticality 1 or 2 hardware; by the requirement that all changes introduced before a flight must undergo an FMEA which also must be accepted by the change board; and by your statement that the flight schedule will not be permitted to reduce the rigor with which these risk management tasks are conducted.

COMMENTS ON NEW TOPICS

Role of panels and boards in STS decisions

The Committee recognizes the important role played by the many panels and boards in the NSTS program in providing coordination, resolving problems and technical conflicts, and reviewing and recommending actions. These

OCR for page 97
Letter to the Honorable James C. Fletcher entities allow the different interests and skill groups to br m g forward Sneer Inputs, contribute their knowledge, and thus munlmize the risk that a proposed action will negatively affect some aspect of the STS. We presume that each of these entities recommends an action to an appropriate official, such as a project manager at Level III or the Deputy Director of the NSTS Program at Level II, who actually makes and takes responsibility for the decision. The Committee is concerned about a possible attitudinal problem regarding the decision process on the part of the NASA personnel engaged in it. When we ask a NASA manager about how a decision is made, often we are told that it is made by such-and-such a board. We are concerned that there may be a tendency for those involved in the mNlti-layered review and decision process to hide in the anonymity of panels and boards, and that each person who must sign off on an item may not be inclined to concentrate enough on his or her individual responsibility in light of the number of levels of group reviews involved in the decision process. The Committed recommends that the Administrator of NASA periodically remind all of the NASA organization of the specific individuals by name and position who are responsible for final decisions (and the organizational relationships among them) based on the advice coming from each panel and board. This would not detract freon the important role played by all members of the panels and boards in prcvid m g advice to the decision maker. Potential of the Program CcmPl~ance Assurance Status Svstem fPrAR.~N The Committee is enthusi~.ctic about the potential of the PCASS, which is be Meg established ~.c a major Dart of the new .~vct~m Tnt=~rit~r ~CC1lr=~^ Program (SIAP) of the NSTS. 
It should improve the quality of information available to key decision makers (e.g., at Flight Readiness Reviews) by providing in near real-time an integrated view of the status of problems with the STS, including trends, anomalies and deviations, assessments, and closure information. Plans to keep up to date and computerize the FMEA will provide a very useful input to PCASS.

The Committee also has learned of the data base maintained by the Johnson Space Center (JSC) SR&QA office which documents in one place the failures which have occurred on the Orbiter during ground testing and in flight. It is encouraging to note that of those failures of components on the Orbiter categorized as Criticality 1 which have occurred during flight, none resulted in the worst-case effect postulated in the FMEA. These failure data can be very valuable in connection with the new CIL prioritization system in establishing the probability that the postulated effects will actually occur, given the failure in flight. We understand that this, and similar data bases for the other STS elements, will be integrated into the PCASS.

We believe that PCASS, as a real-time data base, has the potential to become a key element of the STS risk management, and thus its full and timely development should be encouraged and supported. The Committee recommends that this development be given a high priority and that the potential users of PCASS, including key decision makers, be involved closely now in its development.
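The use of flight failure data described above, to establish the probability that a postulated worst-case effect occurs given a failure, can be illustrated with a minimal sketch. It assumes, hypothetically, that the in-flight Criticality 1 failures are independent trials and that none produced the worst-case effect, and it computes the exact one-sided upper confidence bound on that conditional probability. The function name and the count of 30 failures are illustrative assumptions, not figures from the JSC data base.

```python
def zero_event_upper_bound(n_failures: int, confidence: float = 0.95) -> float:
    """Upper confidence bound on P(worst-case effect | failure)
    when 0 of n_failures observed failures showed the worst-case effect.

    Solves (1 - p) ** n_failures = 1 - confidence for p, which is the
    exact one-sided binomial bound for the zero-event case.
    """
    if n_failures <= 0:
        raise ValueError("n_failures must be a positive integer")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_failures)


if __name__ == "__main__":
    # Hypothetical count of in-flight Criticality 1 failures (assumption).
    hypothetical_failures = 30
    bound = zero_event_upper_bound(hypothetical_failures)
    print(f"95% upper bound on conditional probability: {bound:.4f}")
```

Under these assumptions the bound shrinks roughly in proportion to 1/n, so a record of zero worst-case outcomes supports only a limited claim about the conditional probability until many failures have been observed.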
Progress on the SSME as a result of the FMEA/CIL reevaluation

Based on its second visit to Rockwell International - Rocketdyne Division, the Committee is encouraged with the progress being made in improving the SSME as a result of the FMEA/CIL reevaluation. We also applaud the improvements in the test program which are designed to validate the reliability of the modified SSME before first flight. The SSME is one of the few cases in which the Committee has found that changes have been made as a result of the FMEA/CIL. In most other cases, the Committee observes that the initiation of changes has not originated with the FMEA/CIL process.

NASA Headquarters Safety, Reliability, Maintainability and Quality Assurance (SRM&QA) program

In April, the Committee received a comprehensive briefing regarding the status and plans for the NASA Headquarters SRM&QA program. We are encouraged by the progress that has been made. The Committee believes that the program is going in the right direction. We recognize the magnitude of the task ahead; however, the goals and the program plans developed so far appear to be sound. The Committee is concerned that SRM&QA (at Headquarters and the centers) is not adequately staffed to cope with the demands being placed upon it, perhaps necessitating the additional use of contract personnel in order to carry out their functions before the launch of the next Shuttle. The Committee also believes that it will be particularly important to develop close collaboration with the NASA centers as well as other program offices in order to do those things which are needed to create a total risk management system augmenting the independent check and balance role of SRM&QA.

Input to waiver decisions

The Committee understands that FMEAs, CIL determinations, and their retention rationale are developed by the STS design and development people.
The SRM&QA, operations and other relevant personnel contribute as appropriate. The FMEA/CIL and retention rationale so produced are among the inputs to the hazard analyses which are done by the safety people. In this case, design, development, operations and other relevant personnel contribute as appropriate. The outputs of these two processes (FMEA/CIL/retention rationale on the one hand, and hazard analyses on the other) are individually approved by the Program Requirements Control Board (PRCB). However, the Committee is concerned that the FMEA/CILs with their design-based retention rationale have become the only effective input to Levels II and I in their waiver decisions to accept the designs as safe enough to fly.

The Committee recommends that the present design-based retention rationale should be only one part of the rationale required to accept the hazards which can result from each critical failure mode. The other part should
be the output of the hazard and safety assessments, including evaluations of the probability that the hazardous conditions will actually develop and the probability that these conditions will lead to a Criticality 1 consequence. A risk assessment report, embracing the design retention rationale and the hazards/safety assessment, should provide the acceptance rationale for consideration by Level II and I managers in reaching their decisions on the granting of waivers.

Differences in FMEA/CIL reevaluation process among STS elements

In the Committee's audit of the reevaluation of the FMEA/CILs, a number of differences were found in the process being used by different element project offices and contractors. In some cases, we were unable to ascertain the reasons for the observed differences. For example, the independent contractors evaluating the FMEA/CILs for the STS elements managed by the Marshall Space Flight Center are required to review all subsystems and to file a Review Item Discrepancy (RID) when they differ with the results of the element contractor's analysis. On the other hand, the independent contractor for the Orbiter evaluation was not directed to review all parts of the Orbiter and does not file RIDs. We understand that JSC now has directed the contractor to review all subsystems in the Orbiter. An audit by the Committee of the documentation and review process used in the case of the Orbiter indicates that it is a reasonable alternative to the RID process. Nevertheless, the Committee suggests that the NSTS program office review the FMEA/CIL reevaluation processes as implemented for each STS element to assure itself that any differences will not compromise the quality and completeness of the STS FMEA/CIL effort as a whole.

Expertise in Statistical Sciences

The key technical decision makers in NASA operate as chairmen of bodies that review relevant technical information.
The decisions involve design, requirements, waivers, launch decisions, etc. Much of this information is in the form of complex engineering data, such as test, inspection, flight, and weather data. These bodies draw upon experts in many engineering disciplines to deal with the complexities. Indeed, it is important that there be close ties among the design engineers, test and analysis people, and decision makers throughout the process of designing, building, certifying, and using components and systems. However, the Committee finds that these bodies are not adequately supported by people skilled in the statistical sciences to aid in the transformation of complex data into information useful for decision making.

The Committee recommends that NASA build up its staff of experts in the statistical sciences (civil servants and contract support) to provide improved analytical support of risk management and of key decision makers by the application of modern statistical analysis, inference and assessment techniques.
Reducing the risk in the Space Transportation System

Even with the current FMEA/CIL and hazard analysis efforts which are supported thoroughly within NASA and by its contractors, the Committee receives the impression that changes often may only be considered which will reduce risks to that level which has been previously accepted in the STS program. The Committee believes that such risks, accepted in the past, logical as that may have appeared to be at the time, should not now be accepted without a concentrated effort to plan and implement a program to remove or reduce these risks.

FUTURE WORK

The Committee is continuing its audit by examining other aspects of the STS risk management process. Among these are the design qualification and flight certification processes; a further look at integrated systems analysis; launch commit criteria and waiver policy; the process for generating, reviewing, revising and approving the retention rationale for waivers to permit flight of the Shuttle with critical items that affect safety; the process for structural analysis, establishment of margins, and verification of analyses and margins; the risk management process for STS software; and the process for analyzing the effect of payloads on the safety of the Shuttle, ground personnel, and flight crews.

We plan to issue a final report of the Committee late this year. It will include our assessment of all of the procedures reviewed and recommendations for improvement of the STS risk management system. If it should appear desirable, we will provide another interim letter report to convey findings and recommendations which may emerge from the reviews now under way.

Sincerely yours,

Alton D. Slay
Chairman
Committee on Shuttle Criticality Review and Hazard Analysis Audit

cc: Admiral Richard H. Truly