
The National Academies of Sciences, Engineering, and Medicine
500 Fifth St. N.W. | Washington, D.C. 20001

Copyright © National Academy of Sciences. All rights reserved.

PREPUBLICATION COPY: UNEDITED PROOFS

2 The Data and How They Were Obtained

The long history of the NRC Assessment of Research Doctorate Programs in the United States (this is the third in a series of such assessments since 1982) will not be recounted in detail here. Rather, we offer a shortened history that begins with the decision of the National Research Council to undertake another study following the assessment published in 1995. The first step in developing this new assessment was the publication of Assessing Research-Doctorate Programs: A Methodology Study (“the Methodology Study”), which was completed in 2003 and provided a roadmap for the large-scale study. At that point, universities still had to be recruited to join the study, the final taxonomy of disciplines had to be settled, and the questionnaires had to be finalized and administered.

RECRUITING UNIVERSITIES

In November 2006 the chairman of the National Research Council, Ralph Cicerone, notified presidents and chancellors of U.S. universities offering doctoral degrees of the NRC’s intention to conduct a new assessment of doctoral programs. The universities were asked to contribute funding to the project, with the amount determined by a sliding scale that reflected the number of doctoral degrees in selected fields granted in 2003-2004 according to the National Science Foundation’s Survey of Doctoral Recipients.[3] Two hundred twenty-two universities chose to participate.[4]

Most of the data collection was carried out in late fall 2006 and spring 2007. Data were checked through fall 2007 via correspondence with many institutions. Data collection was completed in the spring of 2008. At that point the study had collected data for more than 5,000 programs in 61 fields in the physical sciences and mathematics, agricultural and life sciences, health sciences, engineering, social sciences, and arts and humanities.[5] Unless otherwise stated, the data reported in this study are for the 2005-2006 academic year. The universities and their programs are listed on the Web site whose URL is given in Appendix D.

[3] A contribution was not required for participation, but almost all of the participating universities did contribute funds.
[4] The institutions that chose not to participate generally had very few doctoral programs and often were undergoing administrative reorganization. Although the NRC followed up with institutions that did not respond, a handful of institutions that had been invited were excluded because of non-response.

THE TAXONOMY

At the same time as the universities were being recruited, we consulted widely in order to settle on a taxonomy of disciplines.[6] To assist us in this task, we examined the taxonomy of fields used by the National Science Foundation (NSF) in its Doctorate Records File,[7] reviewed the Classification of Instructional Programs (CIP) of the U.S. Department of Education, and consulted with a number of scholarly societies. These societies were especially helpful when it came to the life sciences, because the taxonomy used in the 1995 NRC study for that area had become outdated. In particular, interdisciplinary study in the life sciences had grown considerably since 1995. This is reflected in the current study by the addition of an interdisciplinary field, “Biology/Integrated Biology/Integrated Biomedical Sciences,” which includes 120 programs.

Most of the other changes from the 1995 NRC study served to expand the disciplines that were included. For example, programs in agricultural fields, public health, nursing, public administration, and communication were added. We decided not to include doctoral programs in schools of education because, in many cases, research-oriented and practice-oriented doctoral programs could not be separated. A separate study of these programs is now beginning under the auspices of the American Educational Research Association.
The criteria for inclusion of a field or a discipline in the study were that it had produced at least 500 Ph.D.’s in the five years prior to 2004-2005 and that there were programs in the field in at least 25 universities.[8] The criterion for inclusion of a program was that it had produced at least five Ph.D.’s in the five years prior to 2005-2006.[9] Given these criteria, each university chose which of its programs to include. The disciplines and programs covered by the study are listed in Appendixes C and D.

[5] Data were collected for 67 fields in all, but 6 of these were emerging fields with too few programs to rate. Only partial data were collected for 5 of these fields. The other field that was not rated was Languages, Societies, and Cultures, which is discussed below.
[6] A provisional taxonomy had been suggested in Assessing Research-Doctorate Programs: A Methodology Study. This was revisited by a panel of the current Committee.
[7] The Doctorate Records File, administered by the National Science Foundation (NSF), is a joint data-gathering activity of NSF, the U.S. Department of Agriculture, the U.S. Department of Education, the U.S. Department of Energy, the National Institutes of Health, and the National Endowment for the Humanities.
[8] The fields of German and classics were included, although they did not meet these criteria, because they had been included in earlier NRC assessments. In 2006, they were included not only for historical reasons but also because they qualified on the basis of the number of programs in the field.
[9] The dates for the test of field inclusion differ from those for program inclusion because of the lag in NSF data on Ph.D. production by field. Program data, which were obtained from the universities, were more current.
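The inclusion thresholds stated above can be written as two simple predicates. The following is only a minimal sketch: the class and function names are hypothetical, and the historical exception made for German and classics is deliberately not modeled.

```python
from dataclasses import dataclass

# Thresholds taken from the text; all identifiers are illustrative assumptions.
FIELD_MIN_PHDS = 500          # Ph.D.'s in the five years prior to 2004-2005
FIELD_MIN_UNIVERSITIES = 25   # universities with programs in the field
PROGRAM_MIN_PHDS = 5          # Ph.D.'s in the five years prior to 2005-2006


@dataclass
class Field:
    name: str
    phds_last_five_years: int
    universities_with_programs: int


def field_qualifies(field: Field) -> bool:
    """A field must meet both the Ph.D. production and the breadth criteria."""
    return (field.phds_last_five_years >= FIELD_MIN_PHDS
            and field.universities_with_programs >= FIELD_MIN_UNIVERSITIES)


def program_qualifies(phds_last_five_years: int) -> bool:
    """A program must have produced at least five Ph.D.'s in five years."""
    return phds_last_five_years >= PROGRAM_MIN_PHDS
```

Both tests are inclusive ("at least"), so a field with exactly 500 Ph.D.'s in 25 universities passes, as does a program with exactly five Ph.D.'s.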

QUESTIONNAIRE CONSTRUCTION AND DATA COLLECTION

During the winter of 2005-2006, a panel consisting of graduate deans and institutional researchers met to review the questionnaires that had been developed for the Methodology Study and to suggest additional and alternative questions. Once the draft questionnaires had been posted on the project Web site, many suggestions were also received from the universities. The questionnaires were finalized in November 2006, and a link to them appears in Appendix B.

The administration of the questionnaires involved the following steps:

• Questionnaire design: Five questionnaires were designed:

1) an institutional questionnaire, which contained questions about institution-wide practices and asked for a list of doctoral programs at the institution.
2) a program questionnaire, which was sent to each doctoral program in most cases.[10] In addition to questions about students, faculty, and characteristics of the program, programs were asked to provide lists of their doctoral faculty and, for five fields, their advanced doctoral students (see below).
3) the faculty questionnaire, which asked individual faculty members about their educational and work history, grants, publications, what characteristics they felt were important to the quality of a doctoral program, and whether they would be willing to answer a survey asking them to provide ratings for programs in their field.
4) the student questionnaire, sent to advanced students in English, chemical engineering, economics, physics, and neuroscience, which asked about student educational background, research experiences while in the program, program practices that they had experienced, and post-graduation plans.
5) the rating questionnaire, which was sent to a stratified sample of those who had answered on the faculty questionnaire that they were willing to provide ratings of programs in their field.
The operation of administering all these questionnaires was conducted by our contractor, Mathematica Policy Research, in close collaboration with NRC staff. All questionnaires were submitted to and approved by the Institutional Review Board (IRB) of the National Research Council, and most institutions also received approval from their own IRBs.

• Data collection: Each of the participating universities was asked to name an institutional coordinator (IC) who would be responsible for collection of data from the university. On the institutional questionnaire, the IC provided the names of the programs at that university that met the NRC criterion for inclusion. Each of these programs was then sent the program questionnaire through the IC. Some universities had a well-developed centralized data-collection capability and provided much of the data centrally. Others did not and gave the program questionnaires to each of their programs to complete.

Each program was asked for a list of faculty members who were involved in doctoral education according to the NRC definition of a program that was given on the institutional and program questionnaires. On the program questionnaire, we asked respondents to divide their program faculty into three groups: (1) core faculty, who either were actively supervising doctoral dissertations or were serving on an admissions or curriculum committee for the doctoral program; (2) new faculty, who were tenured or tenure-track faculty hired in the previous three years and expected to become core faculty; and (3) associated faculty, who were not core faculty in the program but were working in the program supervising dissertations and were regular faculty members at the institution.

The faculty questionnaire was then sent to core and new faculty in each program and included a section (Section G) asking what aspects of doctoral programs the faculty member thought were important to quality. Faculty in programs in five fields (physics, English, chemical engineering, economics, and neuroscience) were asked to provide lists of enrolled students who had been admitted to candidacy. These students were then each sent a copy of the student questionnaire. All questionnaires were delivered and answered online. Selected results of the student survey will be provided in the final report but are not discussed in this guide.

As part of the faculty questionnaire, faculty members were asked if they would be willing to complete a rating survey. Those who indicated they were willing were put into a pool that was used to obtain the stratified sample of raters for the rating survey. Although response rates varied by field, there were no detectable characteristics of non-respondents that would suggest response bias.

[10] Some large institutions with well-equipped institutional research offices answered centrally those program questions they could and then sent the remaining questions to the doctoral programs to answer.
• Sampling for the rating survey: Programs and raters within a field were classified according to the size of the program (measured by faculty size) and the program’s geographic region. Raters were also classified by faculty rank. In the fields with a large number of programs, 50 programs were sampled at random from a stratified classification. In fields with a smaller number of programs, 30 programs were chosen in a similar manner. A sample of raters in each field was chosen so that the sample duplicated the distribution by program size, faculty rank, and geographic region for all programs in the field.

Each rater was given a set of 15 programs to rate on a six-point scale, for which 1 was “not adequate for doctoral education” and 6 was “distinguished.” The questionnaire also asked about the rater’s familiarity with each program and provided information about the program and a reference to the program Web site. On average, programs received ratings from about 58 percent of the selected raters who had been given data about them. Non-respondents were replaced by other raters from the same stratum until almost every program had been rated by 50 raters.[11] Most programs in the rating sample received at least 40 ratings.[12] The numbers of raters for programs in each rated field are shown in Appendix H.[13]

[11] The average number of raters taken over all programs was 44. See Appendix H.
[12] Since the committee did not know in advance how many programs there would be in each discipline, special treatment was given during the regression calculations to programs in disciplines with fewer than 35 programs.
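The stratified selection of raters described above can be illustrated with a proportional-allocation sketch. The text does not give the exact procedure, so this version assumes that each stratum (program size, faculty rank, geographic region) receives a share of the sample equal to its share of the rater pool; the function name, the dictionary keys, and the use of the pool as the target distribution are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(pool, n, key, seed=0):
    """Draw about n items so each stratum's share matches its share of the pool.

    `key` maps an item to its stratum label. Rounding each quota means the
    total can differ slightly from n.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in pool:
        strata[key(item)].append(item)

    sample = []
    for members in strata.values():
        quota = round(n * len(members) / len(pool))
        # Sample without replacement inside the stratum.
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample


def stratum_of(rater):
    # Hypothetical stratum label built from the three classifiers in the text.
    return (rater["program_size"], rater["rank"], rater["region"])


# Hypothetical pool: 60 raters in one stratum and 40 in another.
pool = (
    [{"id": i, "program_size": "large", "rank": "full", "region": "NE"} for i in range(60)]
    + [{"id": 60 + i, "program_size": "small", "rank": "associate", "region": "W"} for i in range(40)]
)
chosen = stratified_sample(pool, 10, stratum_of)
```

Replacing a non-respondent would then mean drawing another unused rater with the same stratum label, which keeps the distribution of the sample unchanged.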

• Method of collecting publications, citations, and awards: With the exception of fields in the humanities, publications and citations were collected through the Institute for Scientific Information (ISI), now a part of Thomson Scientific, and matched to faculty lists for fields in the sciences (including the social sciences). To assist in matching publications to faculty, faculty were asked for a list of ZIP codes that had appeared on their publications. These were used to match publications to faculty who had moved and to distinguish faculty with the same name and field. Although faculty were also asked about their publications in Section D of the faculty questionnaire, these lists were used only to check the completeness of the ISI data. The citation count is for the years 2000-2006 and relates to papers published between 1981 and 2006.

In the case of the humanities, for which we do not have a comprehensive bibliographic source, we analyzed faculty members’ curricula vitae, which were submitted along with the faculty questionnaire, or the publication lists they provided in answer to the questionnaire. We then counted books and publications going back to 1996 and recorded these counts, giving books a weight of 5 and articles a weight of 1. Finally, lists of honors and awards were collected from 224 scholarly societies for all fields, and awards were classified either as “highly prestigious,” which received a weight of 5, or as other awards, which received a weight of 1.

• Key variables: Twenty-one key variables[14] were identified by the committee for inclusion in the rating process; these are described in Appendix E. One other variable that the committee wished to include, the number of student publications and presentations, was excluded because of lack of data.
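The two weighting rules described above (books 5 and articles 1 for humanities publications; “highly prestigious” awards 5 and other awards 1) reduce to simple weighted sums. The sketch below uses hypothetical function and constant names and does not reproduce how the counts were later normalized per faculty member.

```python
# Weights taken from the text; identifiers are illustrative assumptions.
BOOK_WEIGHT = 5
ARTICLE_WEIGHT = 1
PRESTIGIOUS_AWARD_WEIGHT = 5
OTHER_AWARD_WEIGHT = 1


def weighted_publications(books: int, articles: int) -> int:
    """Weighted publication count used for humanities fields."""
    return BOOK_WEIGHT * books + ARTICLE_WEIGHT * articles


def weighted_awards(highly_prestigious: int, other: int) -> int:
    """Weighted count of honors and awards."""
    return PRESTIGIOUS_AWARD_WEIGHT * highly_prestigious + OTHER_AWARD_WEIGHT * other
```

For example, a faculty member with 2 books and 7 articles has a weighted publication count of 5 × 2 + 1 × 7 = 17.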
Most of these variables are expressed as per capita or “intensive” variables; that is, we divided the measure of interest (e.g., publications, citations) by the “allocated” faculty in the program, or, in the case of citations, we divided citations by the number of publications for each faculty member. This allocation was designed to ensure that no more than 100 percent of a faculty member was assigned to all programs taken together. The use of these key variables is described in Chapter 3.[15]

• Final data review: Once all the data had been collected, they were reviewed by NRC staff for completeness and consistency. The institutional coordinators were asked to revise anomalous data and populate missing cells. If, after this request, the programs were still unable to provide missing data, two procedures were followed. If data on two or fewer measures were missing, the cells were populated with the mean value for programs that had provided data.[16] If data for three or more measures were missing, the program was dropped and the institutional coordinator was informed. If the data were then provided, the program was reinstated. Program names and assignment to a field were also reviewed by staff; the institutional coordinator was consulted if anomalies were found, and his or her recommendation was followed.

[12, continued] Programs in disciplines with fewer than 35 programs were combined with another field that had similar “direct” weights in order to obtain the regression-derived ratings.
[13] Languages, Societies, and Cultures was a special case that was not rated when it became clear to the committee that the programs included in the “field” were too heterogeneous for ratings to be obtained that were comparable across the field and that no subfield had more than 20 programs. Respondents included programs in Italian, romance languages, Russian studies, Middle Eastern studies, African studies, and a number of other fields. Full data about these programs will be published in the database accompanying the final report.
[14] There were only 19 for the humanities fields, since citation data were unavailable.
[15] The justification of each of the variables will be discussed in the final report. Two variables, however, one controversial and one novel, should be mentioned at this point. There is a large literature about the use of citations as a measure of excellence. A citation measure for an individual faculty member may be manipulated by self-citation. Flawed results may be highly cited but not indicative of quality. We grant the validity of these objections, but remind the reader that we are aggregating citations across the publications of all the faculty members in a program. Even with aggregated data, subdisciplines within a field can have varying patterns of productivity, and the number of citations an article may receive is not independent of the size of the subdiscipline, so that the value of the measure for a program will depend on its specialty composition, not the quality of the program. The final report will have a short discussion of these pitfalls. We use the variable here, in intensive form, because, other things equal, we believe that a program whose faculty are more cited and that has a greater number of citations per publication will be a higher-quality program. The novel variable is interdisciplinarity. It, too, will be discussed at greater length in the final report. We measure interdisciplinarity by the percent of program faculty who are serving on dissertation committees from outside the program (associated faculty). This is an imperfect measure, since it will depend on institutional practices, e.g., how broad doctoral programs are. We felt, however, that some measure, however imperfect, would be informative. It rarely shows up as an important variable in determining program ratings.
[16] These values will be identified in the data tables that accompany the final report. Eight hundred fifty-four programs out of 4,915 total had at least one missing value. Programs were dropped if they did not submit a faculty list, so there were no missing values for the publications, citations, or awards measures.
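The missing-data rule from the final data review (fill with the mean when two or fewer measures are missing, drop the program when three or more are missing) can be sketched as follows. This is an illustration under stated assumptions, not the committee's procedure: the text does not say whether means were taken within a field or across all programs, so here the mean is taken over every program in the input that reported the measure, and all names are hypothetical.

```python
from statistics import mean

MAX_IMPUTABLE = 2  # programs missing more measures than this are dropped


def review_programs(programs):
    """Apply the mean-fill / drop rule.

    `programs` maps a program name to a dict of measure -> value (None if
    missing). Returns (kept, dropped): kept programs with gaps filled, and
    the names of dropped programs.
    """
    measures = sorted({m for values in programs.values() for m in values})

    # Mean of each measure over the programs that reported it (assumption).
    means = {}
    for m in measures:
        reported = [v[m] for v in programs.values() if v.get(m) is not None]
        means[m] = mean(reported) if reported else None

    kept, dropped = {}, []
    for name, values in programs.items():
        missing = [m for m in measures if values.get(m) is None]
        if len(missing) > MAX_IMPUTABLE:
            dropped.append(name)
            continue
        filled = dict(values)
        for m in missing:
            filled[m] = means[m]
        kept[name] = filled
    return kept, dropped


# Hypothetical data: B is missing one measure, C is missing three.
example = {
    "A": {"w": 4.0, "x": 1.0, "y": 2.0, "z": 3.0},
    "B": {"w": 6.0, "x": 3.0, "y": None, "z": 5.0},
    "C": {"w": 8.0, "x": None, "y": None, "z": None},
}
kept, dropped = review_programs(example)
```

In this example program B keeps its place with the missing measure filled by the mean of the reported values, while program C is dropped; under the rule in the text, C would be reinstated if its institution later supplied the data.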