4 The Overall Rating of Program Quality

The dimensional measures provide a summary of program performance along individual dimensions that are of importance in doctoral education. The overall rating combines the variables that make up the dimensional measures into a single measure. In addition to reflecting the faculty preferences in each field as derived from the faculty questionnaire, it includes the results of the importance measures derived from the rating survey. This chapter describes in non-technical terms how the overall rating of a program is calculated. Readers who want more technical detail are referred to Appendix A.

THE OVERARCHING IDEA

There is a great deal of uncertainty in ratings of the quality of programs. Uncertainty can come from a variety of sources. For example, although many academics may think that they can identify the top five or ten programs in their field, this certainty about perceived quality decreases as more and more programs are included. Furthermore, one program may be strong in one area while a second program's strengths may lie in a different area. Faculty asked to rate programs may differ in their views about the importance of these strengths, and the programs may differ in various characteristics, many of which may be considered important to the perceived quality of a doctoral program.

Describing this uncertainty was a key task of the predecessor committee that produced Assessing Research-Doctorate Programs: A Methodology Study.22 This committee examined the methodology of the 1995 study and recommended that the next study rely more explicitly on program data.

22 National Research Council. Assessing Research-Doctorate Programs: A Methodology Study. Washington, D.C.: The National Academies Press, 2003.

The predecessor study also contained two key recommendations as to how the methodology of obtaining reputation measures should be revised:

"The next study should have sufficient resources to collect and analyze auxiliary information from peer raters and the programs being rated to give meaning and context to the rating ranges that are obtained for the programs…." (p. 5)

and

"Re-sampling methods should be applied to ratings to give ranges of rankings for each program that reflect the variability of ratings by peer raters. The panel investigated two related methods, one based on Bootstrap re-sampling and another closely related method based on Random Halves, and found that either method would be appropriate." (p. 5)

The dimensional ratings, described in the previous chapter, fulfill the first recommendation. This chapter describes how the second recommendation was followed and combined with the first to obtain an overall rating for each program within a field.

THE OVERALL APPROACH

A schematic description of the overall approach appears in Box 4-1 and is described in the text that follows.

Box 4-1: The Overall Approach

1. DATA. More than 5,000 doctoral programs at 222 institutions, in 61 fields across the sciences, engineering, social sciences, arts, and humanities. The data cover institutional practices, program characteristics, and faculty and student demographics, and were obtained through a combination of original surveys (of faculty, students, and institutions and programs) and existing data sources (NSF surveys and ISI publication and citation data).

2. WEIGHTS. In two surveys, program faculty provided the NRC with information on what they value most in Ph.D. programs: (1) faculty were asked directly how important they felt each of 21 items in a list of program characteristics was; (2) a sample of faculty rated a sample of programs in their field, and these ratings were then related through regressions to the same items as appeared in (1).

3. ANALYSIS. The "direct" and "regression-based" weights provided by faculty were averaged into one combined set of weights, reflecting the multi-dimensional views faculty hold about the factors that contribute to the quality of doctoral programs.

4. RANGES OF RANKINGS. Each program's rating was calculated 500 times by randomly selecting half of the raters from the faculty sample in Step 2 and also incorporating statistical and measurement variability. Similarly, 500 samples of direct weights were selected. Combined weights were then applied to 500 randomly selected sets of program data to produce ratings for each program. The ratings from each of the 500 samples determine a rank ordering of the programs. A "range of rankings" was then constructed showing the middle half of the calculated rankings. What may be compared, among programs in a field, is this range of rankings.
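To make Step 4 of Box 4-1 concrete, the following is a minimal sketch, in Python, of the Random Halves idea: each of 500 draws takes a random half of the raters, averages their ratings, and regresses those averages on the program characteristics to recover one set of regression-based weights. The array names (`ratings`, `X`) are hypothetical, plain least squares stands in for the study's regression model (which corrected for correlation among the characteristics), and the sketch ignores the fact that each rater saw only a sample of programs.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_halves_weights(ratings, X, n_draws=500):
    """One set of regression-based weights per random half of the raters.

    ratings : (n_raters, n_programs) array of 1-6 program ratings
    X       : (n_programs, n_chars) standardized program characteristics
    """
    n_raters = ratings.shape[0]
    weights = np.empty((n_draws, X.shape[1]))
    for d in range(n_draws):
        # Draw half of the raters without replacement (Random Halves),
        # rather than a full-size sample with replacement (Bootstrap).
        half = rng.choice(n_raters, size=n_raters // 2, replace=False)
        # Average the selected raters' ratings of each program.
        y = ratings[half].mean(axis=0)
        # Plain least squares stands in for the study's regression model.
        weights[d] = np.linalg.lstsq(X, y, rcond=None)[0]
    return weights
```

The spread of the 500 rows of `weights` is what carries rater variability into the ranges of rankings described below.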

Faculty were surveyed to get their views on the importance of different characteristics of programs as measures of quality. Ratings were based on faculty members' views of how those measures related to program quality, as discussed in the chapter on dimensional measures. The views were related to program quality using two distinct methods: (1) directly, through answers to questions on the faculty survey; and (2) regression-based, obtained by asking faculty raters to provide program ratings for a sample of programs in a field and then relating these ratings, through a regression model that corrected for correlation among the characteristics, to data on the program characteristics.

The two methods approach the ratings from different perspectives. The direct approach is a "bottom-up" approach that builds up the ratings from the importance that faculty members gave to specific program characteristics, independent of reference to any actual program. The regression-based method is a "top-down" approach that starts with ratings of actual programs and uses statistical techniques to infer the weights given by the raters to specific program characteristics.

The direct approach is idealized: it asks about the characteristics that faculty feel contribute to the quality of doctoral programs without reference to any particular program. The second approach presented each respondent with 15 programs in his or her field and asked for ratings of program quality,23 but respondents were not explicitly queried about the basis of their ratings.

Because these different approaches turned out to give results that were similar in magnitude24 but not strongly correlated,25 the two views of the importance of program characteristics were combined26 to obtain an overall view (or combined weight) for each measured program characteristic. The sum of these weighted characteristics yielded a rating for each program. As is explained below, each rating is recalculated 500 times using different samples of raters. The program ratings obtained from all these calculations can then be arranged in rank order and, in conjunction with all the ratings from all the other programs in the field, used to determine a range of possible rankings.

23 The question given raters about program quality was: "On a scale from 1 to 6, where 1 equals not adequate for doctoral education and 6 equals a distinguished program, how would you rate this program?" The response options were: 1 = Not Adequate for Doctoral Education; 2 = Marginal; 3 = Adequate; 4 = Good; 5 = Strong; 6 = Distinguished; 9 = Don't Know Well Enough.

24 That is, the resulting direct and regression-based weights were similar in magnitude.

25 For any given measure, the results from the two methods are not highly correlated with one another, permitting us to assume that the results from the two approaches are statistically independent.

26 If there were no uncertainty, the weights would simply be averaged. Because there is uncertainty, the optimal combined weight is not so simple but takes into account the variances of the separate coefficients. See equations (19) and (20) in Appendix A and the related discussion.
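Footnote 26 says the combined weight takes into account the variances of the separate coefficients; the exact formulas are equations (19) and (20) in Appendix A. As a rough illustration only, assuming the two estimates are independent (footnote 25), a precision-weighted (inverse-variance) average behaves the way the footnote describes: the more precisely measured estimate pulls the combination toward itself. This is a sketch of that general idea, not the study's actual formula.

```python
import numpy as np

def combine_weights(w_direct, var_direct, w_regress, var_regress):
    """Inverse-variance average of the direct and regression-based weights.

    Only a sketch consistent with footnote 26; the study's exact
    combination is given by equations (19) and (20) in Appendix A.
    """
    precision_d = 1.0 / np.asarray(var_direct)
    precision_r = 1.0 / np.asarray(var_regress)
    return (precision_d * w_direct + precision_r * w_regress) / (precision_d + precision_r)

# Toy example: the direct weight is measured four times as precisely,
# so the combined value (0.34) sits closer to it than a plain average would.
print(combine_weights(0.40, 0.01, 0.10, 0.04))
```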

Because of the various sources of uncertainty, which are discussed at greater length in Appendix A, each ranking is expressed as a range of values. These ranges were obtained by taking into account the different sources of uncertainty in the ratings: statistical variability from the estimation, program data variability, and variability among raters. The measure of uncertainty is expressed by reporting the end points of the inter-quartile range of rankings for each program; that is, the range that contains the middle half of a large number of rating calculations that take uncertainty into account.27 An example of the derivation of rankings for a program is given in Chapter 5.

In summary, we obtain a range of rankings for each program in a given field by first obtaining two sets of weights through two different methods, direct and regression-based. We then standardize all the measures to put them on the same scale and obtain ratings by multiplying the value of each standardized measure by its weight. We obtain both the direct weights and the coefficients from the regressions through calculations carried out 500 times, each time with a different set of faculty, to generate a distribution of ratings that reflects their uncertainties. We obtain the range of rankings for each program by trimming the bottom quarter and the top quarter of the 500 rankings to obtain the inter-quartile range. This method of calculating ratings and rankings takes into account variability in rater assessment of what contributes to program quality within a field, variability in the values of the measures for a particular program, and the range of error in the statistical estimation.

It is important that these techniques give us a range of rankings for most programs. We do not know the exact ranking for each program, and to try to obtain one (by averaging, for example) could be misleading, because we have not imposed any particular distribution on the range of rankings.28 The database that presents the range of rankings for each program will list the programs alphabetically and give the range for each program. Users are encouraged to look at groups of programs that are in the same range as their own programs, as well as programs whose ranges are above or below, in trying to answer the question, "Where do we stand?" The next chapter provides an example of how the ranges of rankings were calculated for a particular program.

27 The inter-quartile range eliminates the top and bottom 125 ratings calculated from the 500 regressions and 500 samples of direct weights from faculty. It is a range that contains half of all the rankings for a program.

28 For example, most of the rank-ordered ratings could be at the top of the range.
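The summary above translates directly into a short computation. The sketch below, under the simplifying assumption that the standardized program data X are held fixed across draws (the actual study also resampled program data), takes 500 draws of combined weights, rates and ranks every program within each draw, and reports each program's inter-quartile range of ranks; all names are hypothetical.

```python
import numpy as np

def range_of_rankings(weight_draws, X):
    """Inter-quartile range of rankings, one (low, high) pair per program.

    weight_draws : (n_draws, n_chars) combined weights, one row per draw
    X            : (n_programs, n_chars) standardized program data
    """
    # Rating of every program under every draw: (n_draws, n_programs).
    ratings = weight_draws @ X.T
    # Rank the programs within each draw; rank 1 is the highest-rated.
    order = (-ratings).argsort(axis=1)
    ranks = order.argsort(axis=1) + 1
    # Trim the top and bottom quarter of each program's rankings
    # (the top and bottom 125 of 500, per footnote 27).
    low = np.percentile(ranks, 25, axis=0)
    high = np.percentile(ranks, 75, axis=0)
    return low, high
```

A program reported as, say, (3, 9) could plausibly rank anywhere from 3rd to 9th in its field, which is exactly the kind of comparison the text encourages users to make.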

