
The National Academies of Sciences, Engineering, and Medicine
500 Fifth St. N.W. | Washington, D.C. 20001

Copyright © National Academy of Sciences. All rights reserved.

PREPUBLICATION COPY—UNEDITED PROOFS

1 Introduction

The Assessment of Research Doctorate Programs conducted by the National Research Council (NRC) provides data that allow comparisons to be made among similar doctoral programs around the United States, with the goal of informing efforts to improve current practices in doctoral education. The assessment, which covers doctoral programs in 61 fields at 222 institutions, offers accessible data about program characteristics that will be of interest to policymakers, researchers, university administrators, and faculty, as well as to students who are considering doctoral study. Furthermore, the assessment analyzes and combines these data to create ranges of rankings that allow the comparison of different doctoral programs within a field.

PURPOSE OF THE METHODOLOGY GUIDE

This methodology guide is intended mainly for one specific audience: those people in universities who will be asked to explain the results of the NRC Assessment to their presidents and provosts. This intended audience consists primarily of faculty, many of whom are serving as graduate deans and graduate program directors, as well as institutional researchers. Other potential audiences include those people who will be asked to explain the use of the study to the public, as well as those students who are considering doctoral study. Participants at the 2007 Annual Meeting of the Council of Graduate Schools requested that the NRC provide this guide in advance of the release of the assessment so that these various users may prepare for it.

The assessment itself will be a separate document: a brief report on doctoral education in U.S. universities accompanied by online spreadsheets that will contain data, dimensional measures, and ranges of rankings for programs on a field-by-field, program-by-program basis.

This methodology guide is organized into the following chapters:

• A brief description of the data: This section lays out how the study was designed and how the data were collected. In particular, it covers the recruitment of the participating institutions, the questionnaires, how the taxonomy of fields was determined, the determinants of program inclusion, the reasons for dropping some programs and some fields, how a sample survey of faculty was used in obtaining ratings¹, and how the faculty questionnaire was used to determine direct measures of quality.
• How ratings in three dimensions are calculated: In addition to the overall measure provided by the assessment for each program at each institution, dimensional measures were constructed in three areas: research activity, student support and outcomes, and student and faculty diversity. These measures take into account only the variables relevant to each area.
• Calculating the overall rating of a program: This section covers the sources of variability in ratings, direct measurement of quality as perceived by faculty, regression-based measures of the importance of measured variables to program quality, combined direct and regression-based measures, and how ratings are calculated and converted to a range of rankings. The calculation includes all the variables (20 for non-humanities fields, 19 for the humanities).
• An example: The calculation of the range of rankings for a program in economics is presented and explained.

This guide also presents technical information about the current study. Appendix A describes the statistical techniques used to obtain the ratings and ranges of rankings and is intended for those interested in the statistical basis of the summary measure. Appendix B contains a link to the questionnaires used to obtain the data about the universities, programs, faculty, and students. Appendix C is a list of the number of programs in each field included in the assessment. Appendix D contains a web link to a list of all the programs and their institutions by field. A detailed description of the 20 variables used in the calculations of the overall range of rankings is provided in Appendix E. Appendix F provides the weights for broad fields for each of the dimensional measures and the variables used in determining them.
Appendix G shows the range of rankings for the dimensional measures for 117 (anonymous) programs in economics as an example. Appendix H shows the average number of ratings obtained per program in the sample survey.

¹ We use the term rating to mean a number on a scale from 1 to 6 that indicates the perceived quality of a program, or the statistically estimated perceived quality. Ratings from many raters were aggregated for programs as described in this guide and were thus arranged in order, from highest to lowest, to yield a program ranking. A rating is a score. A ranking is calculated from an ordered list of ratings. In our study, we calculate multiple ratings for each program, and from the multiple ratings, obtain ranges of rankings for each program.

DATA FOR A DYNAMIC DISCUSSION

The assessment has collected a great deal of data from doctoral programs across the United States, and it has statistically summarized these data along a variety of dimensions. The data that were assembled with great effort by U.S. research universities and their faculty, combined with the analytical talent of the many experts with whom we have consulted, have enabled us to produce a study with procedures designed to provide a richer array of results than those of previous NRC efforts and those of commercial vendors.

This study and its methodology, however, are merely the beginning of an informed discussion, not the last word. Users of the assessment and its methodology should understand that it was not the intent of the assessment committee to produce the final verdict (as of 2006) on the characteristics and quality of doctoral programs. Rather, we intend to present data that are relevant to the assessment of doctoral programs and to make them available to others. Users will want to bring to these data their own knowledge of programs and to compare the assessment that the NRC has produced with that knowledge. This should be a dynamic process that leads to further discussion and insights. We seek to make users aware of the strengths and limitations of the data and believe in the importance of this dynamic process.

We have operated under the assumption that outstanding programs have certain measurable characteristics in common. For example, one can see evidence of a vibrant scholarly community by looking at measures of the number of faculty who produce scholarship and whose scholarship is recognized through citations, awards given by scholarly societies, and the percentage of the faculty who receive grants. Nonetheless, the question of assessing how well a program accomplishes the dual objectives of conducting research and educating students to become scholars, researchers, and educators is a complex one. The quality of doctoral programs is a multidimensional concept, and assessing that quality requires highlighting some of the more significant factors underlying it. This study has attempted to collect data that will capture this multidimensionality and to design measures that will best reflect it.
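
The distinction this guide draws between a rating (a score), a ranking (a position in an ordered list of ratings), and a range of rankings (obtained from multiple ratings per program) can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the program names, the rating values, and the function names are invented for illustration, and taking the best and worst rank across the rating sets is only an assumption, since this chapter does not say how the endpoints of a range are chosen.

```python
def rank_from_ratings(ratings):
    """Order programs from highest to lowest rating; rank 1 is the highest."""
    ordered = sorted(ratings, key=ratings.get, reverse=True)
    return {program: position for position, program in enumerate(ordered, start=1)}


def ranking_ranges(rating_sets):
    """Collect each program's rank in every rating set and return
    (best_rank, worst_rank). Using min and max is an assumption."""
    ranks = {}
    for ratings in rating_sets:
        for program, rank in rank_from_ratings(ratings).items():
            ranks.setdefault(program, []).append(rank)
    return {program: (min(r), max(r)) for program, r in ranks.items()}


# Hypothetical ratings on the 1-to-6 scale for three made-up programs.
# Each dictionary stands for one of the "multiple ratings" per program.
rating_sets = [
    {"Program A": 4.8, "Program B": 4.5, "Program C": 3.1},
    {"Program A": 4.4, "Program B": 4.6, "Program C": 3.3},
    {"Program A": 4.9, "Program B": 4.2, "Program C": 3.0},
]

ranges = ranking_ranges(rating_sets)
for program, (best, worst) in sorted(ranges.items()):
    print(f"{program}: ranked between {best} and {worst}")
```

In this made-up data, Programs A and B trade places across the rating sets, so neither can be assigned a single rank with confidence, while Program C ranks last every time. Reporting a range rather than a single number is meant to convey that kind of uncertainty.
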
Among the dimensions that we have sought to measure are: (1) the research activity of program faculty; (2) student support and outcomes; (3) diversity of the academic environment; and, taking these measures into account, (4) a summary measure that provides a range of rankings of the estimated overall quality of programs, which includes all these separate dimensions, included with differing weights, and which is based on recent quantitative measurements. Each of these four measures necessarily collapses interesting and informative measures of doctoral programs. We hope that users of the study will want to mine the data that underlie each metric, to examine additional information collected in the course of the study, and then construct their own comparisons. This will be possible by using the online spreadsheets that will accompany the final report.

In this undertaking, we were necessarily limited to examining what is countable². Many will argue that program quality goes well beyond what can be measured: the existence of a scholarly community, the creative blending of interdisciplinary perspectives, or the excitement generated by the exploration of new paradigms. We agree. We also understand that some of these important qualitative dimensions will elude even the most carefully conceived quantitative measures. In order to capture as fully as possible those subjective dimensions that correlate with excellence in doctoral education, however, we surveyed a sample of program faculty about the perceived quality of a sample of programs in their individual fields and then used standard statistical techniques to find the measurable characteristics that best correlated with these subjective estimates of program quality. We balanced this by asking faculty members in each field for their explicit views of the characteristics that are most important in facilitating a strong Ph.D. program. We then made a blend of these two estimators (the “regression-based” views of faculty as expressed through their ratings of sample programs, and their “direct” views as obtained through explicit identification of important program characteristics) to give us the quantitative tool that most robustly measured overall program quality.

² “Perceived quality,” a notion that underlies the rating part of the study, is measurable, but not countable. Most of the other variables in the study, such as numbers of faculty, students, citations, or publications, are countable.
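
To make the idea of blending the "direct" and "regression-based" views more concrete, here is a minimal Python sketch. It is not the NRC procedure: the variable names, the weight values, the equal mix of the two weight sets, and the use of standardized program values are all assumptions made for illustration, since this chapter does not specify how the blend is formed.

```python
def normalize(weights):
    """Scale weights so that their absolute values sum to 1."""
    total = sum(abs(w) for w in weights.values())
    return {name: w / total for name, w in weights.items()}


def blend_weights(direct, regression, mix=0.5):
    """Convex combination of two normalized weight sets.
    mix=0.5 (an equal blend) is an assumption, not a value from the guide.
    Both dictionaries must contain the same variable names."""
    direct_n = normalize(direct)
    regression_n = normalize(regression)
    return {
        name: mix * direct_n[name] + (1 - mix) * regression_n[name]
        for name in direct_n
    }


def rating(standardized_values, weights):
    """Weighted sum of a program's standardized variable values."""
    return sum(weights[name] * standardized_values[name] for name in weights)


# Hypothetical weights for three made-up variables.
direct = {"publications": 0.5, "citations": 0.3, "grants": 0.2}      # stated importance
regression = {"publications": 0.2, "citations": 0.6, "grants": 0.2}  # inferred from ratings

blended = blend_weights(direct, regression)

# Hypothetical standardized values (z-scores) for one made-up program.
program_z = {"publications": 1.0, "citations": 0.0, "grants": -1.0}
program_rating = rating(program_z, blended)
print(blended, program_rating)
```

With these invented numbers, the blended weight on citations falls between what faculty said directly and what their ratings of sample programs implied. A variable that only one of the two views treats as important still counts, but with reduced weight.
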