Origins of Study
and Selection of Programs
Each year more than 22,000 candidates are awarded doctorates in
engineering, the humanities, and the sciences from approximately 250
U.S. universities. They have spent, on the average, five-and-a-half
years in intensive education and research in preparation for careers
either in universities or in settings outside the academic sector, and
many will make significant contributions to research. Yet we are
poorly informed concerning the quality of the programs producing these
graduates. This study is intended to provide information pertinent to
this complex and controversial subject.
The charge to the study committee directed it to build upon the
planning that preceded it. The planning stages included a detailed
review of the methodologies and the results of past studies that had
focused on the assessment of doctoral-level programs. The committee
has taken into consideration the reactions of various groups and indi-
viduals to those studies. The present assessment draws upon previous
experience with program evaluation, with the aim of improving what was
useful and avoiding some of the difficulties encountered in past stud-
ies. The present study, nevertheless, is not purely reactive: it has
its own distinctive features. First, it focuses only on programs
awarding research doctorates and their effectiveness in preparing stu-
dents for careers in research. Although other purposes of graduate
education are acknowledged to be important, they are outside the scope
of this assessment. Second, the study examines a variety of different
indices that may be relevant to the program quality. This multidimen-
sional approach represents an explicit recognition of the limitations
of studies that rely entirely on peer ratings of perceived quality--the
so-called reputational ratings. Finally, in the compilation of repu-
tational ratings in this study, evaluators were provided the names of
faculty members involved with each program to be rated and the number
of research doctorates awarded in the last five years. In previous
reputational studies evaluators were not supplied such information.
During the past two decades increasing attention has been given to
describing and measuring the quality of programs in graduate education.
It is evident that the assessment of graduate programs is highly im-
portant for university administrators and faculty, for employers in
industrial and government laboratories, for graduate students and
prospective graduate students, for policymakers in state and national
organizations, and for private and public funding agencies. Past ex-
perience, however, has demonstrated the difficulties with such assess-
ments and their potentially controversial nature. As one critic has
asserted:
. . . the overall effect of these reports seems quite
clear. They tend, first, to make the rich richer and
the poor poorer; second, the example of the highly
ranked clearly imposes constraints on those institu-
tions lower down the scale (the "Hertz-Avis" effect).
And the effect of such constraints is to reduce diver-
sity, to reward conformity or respectability, to penal-
ize genuine experiment or risk. There is, also, I be-
lieve, an obvious tendency to promote the prevalence of
disciplinary dogma and orthodoxy. All of this might
be tolerable if the reports were tolerably accurate and
judicious, if they were less prescriptive and more de-
scriptive; if they did not pretend to "objectivity" and
if the very fact of ranking were not pernicious and in-
vidious; if they genuinely promoted a meaningful "meri-
tocracy" (instead of simply perpetuating the status quo
ante and an establishment mentality). But this is pre-
cisely what they cannot claim to be or do.1
The widespread criticisms of ratings in graduate education were
carefully considered in the planning of this study. At the outset
consideration was given to whether a national assessment of graduate
programs should be undertaken at this time and, if so, what methods
should be employed. The next two sections in this chapter examine the
background and rationale for the decision by the Conference Board of
Associated Research Councils2 to embark on such a study. The remain-
der of the chapter describes the selection of disciplines and programs
to be covered in the assessment.
1William A. Arrowsmith, "Preface" in The Ranking Game: The Power of
the Academic Elite, by W. Patrick Dolan, University of Nebraska Print-
ing and Duplicating Service, Lincoln, Nebraska, 1976, p. ix.
2The Conference Board includes representatives of the American Coun-
cil of Learned Societies, American Council on Education, National Re-
search Council, and Social Science Research Council.
The overall study encompasses a total of 2,699 graduate programs
in 32 disciplines. In this report--the fifth and final report issuing
from the study--we examine 639 programs in seven disciplines in the
social and behavioral sciences: anthropology, economics, geography,
history, political science, psychology, and sociology. These programs
account for more than 90 percent of the research doctorates awarded in
these seven disciplines. It should be emphasized that the selection
of disciplines to be covered was determined on the basis of total doc-
toral awards during the FY1976-78 period (as described later in this
chapter), and the exclusion of a particular discipline was in no way
based on a judgment of the importance of graduate education or research
in that discipline. Also, although the assessment is limited to pro-
grams leading to the research-doctorate (Ph.D. or equivalent) degree,
the Conference Board and study committee recognize that graduate
schools provide many other forms of valuable and needed education.
PRIOR ATTEMPTS TO ASSESS QUALITY IN GRADUATE EDUCATION
Universities and affiliated organizations have taken the lead in
the review of programs in graduate education. At most institutions
program reviews are carried out on a regular basis and include a com-
prehensive examination of the curriculum and educational resources as
well as the qualifications of faculty and students. One special form
of evaluation is that associated with institutional accreditation:
The process begins with the institutional or program-
matic self-study, a comprehensive effort to measure
progress according to previously accepted objectives.
The self-study considers the interest of a broad cross-
section of constituencies--students, faculty, admini-
strators, alumni, trustees, and in some circumstances
the local community. The resulting report is reviewed
by the appropriate accrediting commission and serves
as the basis for evaluation by a site-visit team from
the accrediting group. . . . Public as well as educa-
tional needs must be served simultaneously in determin-
ing and fostering standards of quality and integrity
in the institutions and such specialized programs as
they offer. Accreditation, conducted through nongov-
ernmental institutional and specialized agencies, pro-
vides a major means for meeting those needs.3
Although formal accreditation procedures play an important role in
higher education, many university administrators do not view such pro-
cedures as an adequate means of assessing program quality. Other ef-
forts are being made by universities to evaluate their programs in
graduate education. The Educational Testing Service, with the sponsor-
ship of the Council of Graduate Schools in the United States and the
Graduate Record Examinations Board, has recently developed a set of
procedures to assist institutions in evaluating their own graduate
programs.4
3Council on Postsecondary Accreditation, The Balance Wheel for Ac-
creditation, Washington, D.C., July 1981, pp. 2-3.
4For a description of these procedures, see M. J. Clark, Graduate
Program Self-Assessment Service: Handbook for Users, Educational
Testing Service, Princeton, New Jersey, 1980.
While reviews at the institutional (or state) level have proven
useful in assessing the relative strengths and weaknesses of individual
programs, they have not provided the information required for making
national comparisons of graduate programs. Several attempts have been
made at such comparisons. The most widely used of these have been the
studies by Keniston (1959), Cartter (1966), and Roose and Andersen
(1970). All three studies covered a broad range of disciplines in en-
gineering, the humanities, and the sciences and were based on the opin-
ions of knowledgeable individuals in the program areas covered. Ken-
iston5 surveyed the department chairmen at 25 leading institutions.
The Cartter6 and Roose-Andersen7 studies compiled ratings from much
larger groups of faculty peers. The stated motivation for these stud-
ies was to increase knowledge concerning the quality of graduate edu-
cation:
A number of reasons can be advanced for undertaking
such a study. The diversity of the American system of
higher education has properly been regarded by both the
professional educator and the layman as a great source
of strength, since it permits flexibility and adapta-
bility and encourages experimentation and competing
solutions to common problems. Yet diversity also poses
problems. . . . Diversity can be a costly luxury if it
is accompanied by ignorance. . . . Just as consumer
knowledge and honest advertising are requisite if a
competitive economy is to work satisfactorily, so an
improved knowledge of opportunities and of quality is
desirable if a diverse educational system is to work
effectively.8
Although the program ratings from the Cartter and Roose-Andersen stud-
ies are highly correlated, some substantial differences in successive
ratings can be detected for a small number of programs--suggesting
changes in the programs or in the perception of the programs. For the
past decade the Roose-Andersen ratings have generally been regarded as
the best available source of information on the quality of doctoral
programs. Although the ratings are now more than 10 years out of date
and have been criticized on a variety of grounds, they are still used
extensively by individuals within the academic community and by those
in federal and state agencies.
5H. Keniston, Graduate Study in Research in the Arts and Sciences at
the University of Pennsylvania, University of Pennsylvania Press, Phil-
adelphia, 1959.
6A. M. Cartter, An Assessment of Quality in Graduate Education, Amer-
ican Council on Education, Washington, D.C., 1966.
7K. D. Roose and C. J. Andersen, A Rating of Graduate Programs, Amer-
ican Council on Education, Washington, D.C., 1970.
8Cartter, p. 3.
A frequently cited criticism of the Cartter and Roose-Andersen
studies is their exclusive reliance upon reputational measurement.
The ACE rankings are but a small part of all the eval-
uative processes, but they are also the most public,
and they are clearly based on the narrow assumptions
and elitist structures that so dominate the present
direction of higher education in the United States.
As long as our most prestigious source of information
about postsecondary education is a vague popularity
contest, the resultant ignorance will continue to pro-
vide a cover for the repetitious aping of a single
model. . . . All the attempts to change higher educa-
tion will ultimately be strangled by the "legitimate"
evaluative processes that have already programmed a
single set of responses from the start.9
A number of other criticisms have been leveled at reputational rank-
ings of graduate programs.10 First, such studies inherently reflect
perceptions that may be several years out of date and do not take into
account recent changes in a program. Second, the ratings of individ-
ual programs are likely to be influenced by the overall reputation of
the university--i.e., an institutional "halo effect." Also, a dispro-
portionately large fraction of the evaluators are graduates of and/or
faculty members in the largest programs, which may bias the survey re-
sults. Finally, on the basis of such studies it may not be possible
to differentiate among many of the lesser known programs in which rel-
atively few faculty members have established national reputations in
research.
Despite such criticisms several studies based on methodologies
similar to that employed by Cartter and Roose and Andersen have been
carried out during the past 10 years. Some of these studies evaluated
post-baccalaureate programs in areas not covered in the two earlier
reports--including business, religion, educational administration, and
medicine. Others have focused exclusively on programs in particular
disciplines within the sciences and humanities. A few attempts have
been made to assess graduate programs in a broad range of disciplines,
many of which were covered in the Roose-Andersen and Cartter ratings,
but in the opinion of many each has serious deficiencies in the meth-
ods and procedures employed. In addition to such studies, a myriad of
articles have been written on the assessment of graduate programs
since the release of the Roose-Andersen report. With the heightening
interest in these evaluations, many in the academic community have
recognized the need to assess graduate programs, using other criteria
in addition to peer judgment.
9Dolan, p. 81.
10For a discussion of these criticisms, see David S. Webster, "Meth-
ods of Assessing Quality," Change, October 1981, pp. 20-24.
Though carefully done and useful in a number of ways,
these ratings (Cartter and Roose-Andersen) have been
criticized for their failure to reflect the complexity
of graduate programs, their tendency to emphasize the
traditional values that are highly related to program
size and wealth, and their lack of timeliness or cur-
rency. Rather than repeat such ratings, many members
of the graduate community have voiced a preference for
developing ways to assess the quality of graduate pro-
grams that would be more comprehensive, sensitive to
the different program purposes, and appropriate for
use at any time by individual departments or universi-
ties.11
Several attempts have been made to go beyond the reputational assess-
ment. Clark, Harnett, and Baird, in a pilot study12 of graduate pro-
grams in chemistry, history, and psychology, identified as many as 30
possible measures significant for assessing the quality of graduate ed-
ucation. Glower13 has ranked engineering schools according to the
total amount of research spending and the number of graduates listed
in Who's Who in Engineering. House and Yeager14 rated economics de-
partments on the basis of the total number of pages published by full
professors in 45 leading journals in this discipline. Other ratings
based on faculty publication records have been compiled for graduate
programs in a variety of disciplines, including political science, psy-
chology, and sociology. These and other studies demonstrate the feasi-
bility of a national assessment of graduate programs that is founded on
more than reputational standing among faculty peers.
11Clark, p. 1.
12M. J. Clark, R. T. Harnett, and L. L. Baird, Assessing Dimensions
of Quality in Doctoral Education: A Technical Report of a National
Study in Three Fields, Educational Testing Service, Princeton, New
Jersey, 1976.
13Donald D. Glower, "A Rational Method for Ranking Engineering Pro-
grams," Engineering Education, May 1980.
14Donald R. House and James H. Yeager, Jr., "The Distribution of Pub-
lication Success Within and Among Top Economics Departments: A Disag-
gregate View of Recent Evidence," Economic Inquiry, Vol. 16, No. 4,
October 1978, pp. 593-598.
DEVELOPMENT OF STUDY PLANS
In September 1976 the Conference Board, with support from the Car-
negie Corporation of New York and the Andrew W. Mellon Foundation, con-
vened a three-day meeting to consider whether a study of programs in
graduate education should be undertaken. The 40 invited participants
in this meeting included academic administrators, faculty members, and
agency and foundation officials and represented a variety of insti-
tutions, disciplines, and convictions. In these discussions there was
considerable debate concerning whether the potential benefits of such a
study outweighed the possible misrepresentations of the results. On
the one hand, "a substantial majority of the Conference [participants
believed] that the earlier assessments of graduate education have re-
ceived wide and important use: by students and their advisors, by the
institutions of higher education as aids to planning and the allocation
of educational functions, as a check on unwarranted claims of excel-
lence, and in social science research."16 On the other hand, the
Conference participants recognized that a new study assessing the qual-
ity of graduate education "would be conducted and received in a very
different atmosphere than were the earlier Cartter and Roose-Andersen
reports. . . . Where ratings were previously used in deciding where to
increase funds and how to balance expanding programs, they might now
be used in deciding where to cut off funds and programs."
After an extended debate of these issues, it was the recommendation
of this conference that a study with particular emphasis on the effec-
tiveness of doctoral programs in educating research personnel be under-
taken. The recommendation was based principally on four considera-
tions:
(1) the importance of the study results to national
and state bodies,
(2) the desire to stimulate continuing emphasis on
quality in graduate education,
(3) the need for current evaluations that take into
account the many changes that have occurred in
programs since the Roose-Andersen study, and
(4) the value of extending the range of measures used
in evaluative studies of graduate programs.
Although many participants expressed interest in an assessment of mas-
ter's degree and professional degree programs, insurmountable problems
prohibited the inclusion of these types of programs in this study.
Following this meeting a 13-member committee,17 co-chaired by
Gardner Lindzey and Harriet A. Zuckerman, was formed to develop a de-
tailed plan for a study limited to research-doctorate programs and de-
signed to improve upon the methodologies utilized in earlier studies.
In its deliberations the planning committee carefully considered the
criticisms of the Roose-Andersen study and other national assessments.
Particular attention was paid to the feasibility of compiling a variety
of specific measures (e.g., faculty publication records, quality of
students, program resources) that were judged to be related to the
quality of research-doctorate programs. Attention was also given to
making improvements in the survey instrument and procedures used in the
Cartter and Roose-Andersen studies. In September 1978 the planning
group submitted a comprehensive report describing alternative strate-
gies for an evaluation of the quality and effectiveness of research-
doctorate programs.
15See Appendix G for a list of the participants in this conference.
16From a summary of the Woods Hole Conference (see Appendix G).
17See Appendix H for a list of members of the planning committee.
The proposed study has its own distinctive features.
It is characterized by a sharp focus and a multidimen-
sional approach. (1) It will focus only on programs
awarding research doctorates; other purposes of doc-
toral training are acknowledged to be important, but
they are outside the scope of the work contemplated.
(2) The multidimensional approach represents an ex-
plicit recognition of the limitations of studies that
make assessments solely in terms of ratings of per-
ceived quality provided by peers--the so-called repu-
tational ratings. Consequently, a variety of quality-
related measures will be employed in the proposed study
and will be incorporated in the presentation of the
results of the study.
This report formed the basis for the decision by the Conference Board
to embark on a national assessment of doctorate-level programs in the
sciences, engineering, and the humanities.
In June 1980 an 18-member committee was appointed to oversee the
study. The committee,19 made up of individuals from a diverse set of
disciplines within the sciences, engineering, and the humanities, in-
cludes seven members who had been involved in the planning phase and
several members who presently serve or have served as graduate deans
in either public or private universities. During the first eight
months the committee met three times to review plans for the study ac-
tivities, make decisions on the selection of disciplines and programs
to be covered, and design the survey instruments to be used. Early in
the study an effort was made to solicit the views of presidents and
graduate deans at more than 250 universities. Their suggestions were
most helpful to the committee in drawing up final plans for the assess-
ment. With the assistance of the Council of Graduate Schools in the
United States, the committee and its staff have tried to keep the grad-
uate deans informed about the progress being made in this study. The
final section of this chapter describes the procedures followed in de-
termining which research-doctorate programs were to be included in the
assessment.
18National Research Council, A Plan to Study the Quality and Effective-
ness of Research-Doctorate Programs, 1978 (unpublished report).
19See p. vii for a list of members of the study committee.
SELECTION OF DISCIPLINES AND PROGRAMS TO BE EVALUATED
One of the most difficult decisions made by the study committee was
the selection of disciplines to be covered in the assessment. Early in
the planning stage it was recognized that some important areas of grad-
uate education would have to be left out of the study. Limited finan-
cial resources required that efforts be concentrated on a total of no
more than about 30 disciplines in the biological sciences, engineering,
humanities, mathematical and physical sciences, and social and behav-
ioral sciences. At its initial meeting the committee decided that the
selection of disciplines within each of these five areas should be made
primarily on the basis of the total number of doctorates awarded nation-
ally in recent years.
At the time the study was undertaken, aggregate counts of doctoral
degrees earned during the FY1976-78 period were available from two in-
dependent sources--the Educational Testing Service (ETS) and the Na-
tional Research Council (NRC). Table 1.1 presents doctoral awards data
for 10 disciplines within the social and behavioral sciences. As al-
luded to in footnote 1 of the table, discrepancies between the ETS and
NRC counts may be explained, in part, by differences in the data col-
lection procedures. The ETS counts, derived from information provided
by universities, have been categorized according to the discipline of
the department/academic unit in which the degree was earned. The NRC
counts were tabulated from the survey responses of FY1976-78 Ph.D. re-
cipients, who had been asked to identify their fields of specialty.
Originally the committee had decided to include only the first six so-
cial and behavioral science disciplines listed in Table 1.1. However,
at the urging of many individuals in the academic community and at the
request of the National Science Foundation, which provided supplemental
funding, geography20 was added to the list of social and behavioral
science disciplines to be covered in the assessment. Since the deci-
sion to include geography was not made until spring 1981, the survey
of evaluators in this discipline was not undertaken until five months
after the survey in other disciplines.
The selection of the research-doctorate programs to be evaluated in
each discipline was made in two stages. Programs meeting either of the
following criteria were initially nominated for inclusion in the
study:
(1) more than a specified number (see below) of re-
search doctorates awarded during the FY1976-78
period or
(2) more than one-third of that specified number of
doctorates awarded in FY1979.
20Geography was among the disciplines covered in the Roose-Andersen
study.
21In the first three volumes of the committee's study, which pertain
to the mathematical and physical sciences, humanities, and engineer-
ing, it is mistakenly reported that a third criterion based on results
from the Roose-Andersen study was used in the nomination of programs
to be included in the assessment. This third criterion, while at one
time considered by the committee, was not adopted.
TABLE 1.1 Number of Research-Doctorates Awarded in Social
and Behavioral Science Disciplines, FY1976-78
                                              Source of Data*
                                              ETS        NRC
Disciplines Included in the Assessment
  Psychology                                  6,977      8,868
  History                                     2,511      2,819
  Economics                                   2,323      2,524
  Political Science                           2,021      2,195
  Sociology                                   1,981      2,069
  Anthropology                                1,252      1,290
  Geography                                     523        469
  Total                                      17,588     20,234
Disciplines Not Included in the Assessment
  Area Studies                                  479        333
  Public Administration                         413        423
  Urban Studies                                  62        247
  Other Social and Behavioral Sciences          N/A        694
  Total                                                  1,697
*Data on FY1976-78 doctoral awards were derived from two independent
sources: Educational Testing Service (ETS), Graduate Programs and Ad-
missions Manual, 1979-81, and the NRC's Survey of Earned Doctorates,
1976-78. Differences in field definitions account for discrepancies
between the ETS and NRC data.
In each discipline the specified number of doctorates required for
inclusion in the study was determined in such a way that the programs
meeting this criterion accounted for at least 90 percent of the doctor-
ates awarded in that discipline during the FY1976-78 period. In the
social and behavioral science disciplines the following numbers of
FY1976-78 doctoral awards were required to satisfy the first criterion
(above):
Anthropology--9 or more doctorates
Economics--12 or more doctorates
Geography--1 or more doctorates
History--11 or more doctorates
Political Science--10 or more doctorates
Psychology--22 or more doctorates
Sociology--9 or more doctorates.
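The cutoff rule and the two nomination criteria described above can be illustrated with a short Python sketch. It is not the committee's actual procedure: the function names, the sample counts, and two readings of the text are assumptions made for illustration. The first assumption is that the cutoff is the largest award count for which the qualifying programs still account for at least 90 percent of a discipline's FY1976-78 doctorates. The second is that "more than one-third of that specified number" is measured against the "N or more" figures listed above.

```python
from typing import Dict


def inclusion_threshold(awards: Dict[str, int], coverage_pct: int = 90) -> int:
    """Return the largest cutoff such that programs with at least that many
    doctorates account for at least coverage_pct percent of all doctorates
    in the discipline (an assumed reading of the 90 percent rule)."""
    total = sum(awards.values())
    if total <= 0:
        raise ValueError("no doctorates recorded for this discipline")
    best = 1
    # Try each observed award count as a candidate cutoff, smallest first.
    for cutoff in sorted({n for n in awards.values() if n > 0}):
        covered = sum(n for n in awards.values() if n >= cutoff)
        # Integer comparison avoids floating-point rounding at the boundary.
        if covered * 100 >= coverage_pct * total:
            best = cutoff
    return best


def is_nominated(awards_fy1976_78: int, awards_fy1979: int, threshold: int) -> bool:
    """Apply the two initial nomination criteria (assumed interpretation)."""
    meets_three_year_count = awards_fy1976_78 >= threshold
    # "More than one-third of the specified number" in FY1979 alone.
    meets_single_year_count = 3 * awards_fy1979 > threshold
    return meets_three_year_count or meets_single_year_count


# Hypothetical counts for five programs in one discipline (not study data).
sample_awards = {"A": 40, "B": 30, "C": 22, "D": 5, "E": 3}
threshold = inclusion_threshold(sample_awards)
nominated = {name: is_nominated(n, 0, threshold) for name, n in sample_awards.items()}
```

With these hypothetical counts, programs A, B, and C together hold 92 of the 100 doctorates, so the cutoff falls at the award count of program C. Raising the cutoff to the count of program B would drop coverage to 70 percent.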
A list of the nominated programs at each institution was then sent to
a designated individual (usually the graduate dean) who had been ap-
pointed by the university president to serve as study coordinator for
the institution. The coordinator was asked to review the list and
eliminate any programs no longer offering research doctorates or not
belonging in the designated discipline. The coordinator also was given
an opportunity to nominate additional programs that he or she believed
should be included in the study.22 Coordinators were asked to re-
strict their nominations to programs that they considered to be "of
uncommon distinction" and that had awarded no fewer than two research-
doctorates during the past two years. In order to be eligible for in-
clusion, of course, programs had to belong in one of the disciplines
covered in the study. If the university offered more than one re-
search-doctorate program in a discipline, the coordinator was instruc-
ted to provide information on each of them so that these programs could
be evaluated separately.
The committee received excellent cooperation from the study coor-
dinators at universities. Of the 243 institutions that were identified
as having one or more research-doctorate programs satisfying the cri-
teria (listed earlier) for inclusion in the study, only 7 declined to
participate in the study and another 8 failed to provide the program
information requested within the three-month period allotted (despite
several reminders). None of these 15 institutions had doctoral pro-
grams that had received strong or distinguished reputational ratings in
prior national studies. Since the information requested had not been
provided, the committee decided not to include programs from these in-
stitutions in any aspect of the assessment. In each of the seven chap-
ters that follows, a list is given of the universities that met the
criteria for inclusion in a particular discipline but that are not rep-
resented in the study.
As a result of nominations by institutional coordinators, some pro-
grams were added to the original list and others dropped. Table 1.2
reports the final coverage in each of the seven social and behavioral
science disciplines. The number of programs evaluated varies consider-
ably by discipline. A total of 150 psychology programs have been in-
cluded in the study; in geography and anthropology fewer than half this
number have been included. Although the final determination of whether
a program should be included in the assessment was left in the hands of
the institutional coordinator, it is entirely possible that a few pro-
grams meeting the criteria for inclusion in the assessment were over-
looked by the coordinators. In the chapter that follows, a detailed
description is given of each of the measures used in the evaluation of
research-doctorate programs in the social and behavioral sciences. The
description includes a discussion of the rationale for using the mea-
sure, the source from which data for that measure were derived, and any
known limitations that would affect the interpretation of the data re-
ported. The committee wishes to emphasize that there are limitations
associated with each of the measures and that none of the measures
should be regarded as a precise indicator of the quality of a program
in educating scientists for careers in research. The reader is
strongly urged to consider the descriptive material presented in Chap-
ter II before attempting to interpret the program evaluations reported
in subsequent chapters. In presenting a frank discussion of any short-
comings of each measure, the committee's intent is to reduce the possi-
bility of misuse of the results from this assessment of research-doc-
torate programs.
22See Appendix A for the specific instructions given to the coordi-
nators.
TABLE 1.2 Number of Programs Evaluated in Each Discipline and the
Total FY1976-80 Doctoral Awards from These Programs
Discipline            Programs    FY1976-80 Doctorates*
Anthropology                70                    1,960
Economics                   93                    3,770
Geography                   49                      762
History                    102                    3,877
Political Science           83                    2,909
Psychology                 150                   10,582
Sociology                   92                    3,061
TOTAL                      639                   26,921
*The data on doctoral awards were provided by the study coordinator at
each of the universities covered in the assessment.