Copyright © 2009. National Academy of Sciences. All rights reserved.
Evaluation of Economic Claims
There is no question that any individual employer who can be
selective in hiring workers will benefit. What is problematic is the
magnitude of the economic benefits that would accrue to the individual
employer or to the economy as a whole if ability testing were more
widely used. Part of the Department of Labor's rationale for promoting
the VG-GATB Referral System is based on very specific claims of
economic benefits. John Hunter, the author of U.S. Employment
Service (USES) Test Research Report No. 47, which contains an
analysis of the economic benefits of personnel selection using ability
tests (U.S. Department of Labor, 1983e), estimates that a "potential
increase in work force productivity among the employers who hire
through the service would come to $79.36 billion per year." That report
also refers the reader to the work of Hunter and Schmidt (1982), in
which they estimate productivity gains of between $13 billion and $153
billion in the economy as a whole due to using ability tests for selection.
In this chapter we review these claims.
UTILITY ANALYSIS: GAINS FOR THE INDIVIDUAL FIRM
In the first part of the discussion we review the model (known as utility
analysis) that Hunter and Schmidt used to estimate how much an
individual employer would gain by using ability tests to select workers.
The formula that Hunter and Schmidt derive to measure the gains from
using ability testing is taken from Brogden (1946):
236 ASSESSMENT OF THE VG-GATB PROGRAM
G = (r)(s)(A),
where
G = the dollar gain per worker per year due to hiring in order of test
score rather than randomly,
r = the correlation between test score and productivity,
s = the standard deviation of yearly productivity in dollars among
workers in the applicant pool, and
A = the average test score of those applicants selected, when test
scores are standardized to have mean 0 and variance 1 in the
applicant pool.
In this formula, the economic benefits to an employer are determined
by three parameters. The first is the validity of the test, the extent to
which test performance is correlated with productivity. The second and
third parameters measure the potential an employer has for improving
productivity by selecting better workers. How much productivity could
improve depends on the variability of productivity in the employer's
applicant pool and on the latitude the employer has in selecting workers.
If productivity varies widely, an employer will benefit from using a test
that selects the best workers. However, if one worker is about as good as
another, the gains from selecting the best will be small. Similarly, if an
employer must hire everyone who applies for a job, then it does not help
him to know who is best. However, if it is possible to reject 90 or 95
percent of all applicants, it is obviously advantageous to be able to
identify the most able workers.
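The role of the three parameters can be sketched in a few lines of Python. This is only an illustration of the formula G = (r)(s)(A) as stated above; the function name and the dollar figures are hypothetical and are not taken from the report.

```python
def brogden_gain(validity, sd_productivity, mean_selected_score):
    """Gain per worker per year, G = r * s * A.

    The result is in the same units as sd_productivity
    (dollars here; a percentage of wages elsewhere in the chapter).
    """
    return validity * sd_productivity * mean_selected_score


# Hypothetical inputs, chosen only for illustration.
# A selective employer: hired applicants average 1 SD above the pool mean.
gain_selective = brogden_gain(validity=0.3, sd_productivity=5000.0,
                              mean_selected_score=1.0)

# An employer who must hire everyone: hired applicants average 0.
gain_unselective = brogden_gain(validity=0.3, sd_productivity=5000.0,
                                mean_selected_score=0.0)

print(gain_selective)    # dollars per worker per year
print(gain_unselective)  # no gain without selectivity
```

Because the three terms multiply, a zero in any one of them (no validity, no variation in productivity, or no latitude to select) removes the gain entirely.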
If the selection were random, then the average test score among
selected workers would be zero, and there would be no gains in
productivity. The gain is derived because the employer can select the
top-scoring percentage of those who apply for a job. If the test score
distribution is normal, the influence of selectivity, p, on the employer's
gains is measured by M(p), a statistical formula that is the inverse of the
Mill's ratio.1 For our purposes, it suffices to note that M(p) calibrates the
influence of selectivity, p, on productivity gains. M(p) is a decreasing
function of p; the more selective an employer can be, the lower is p and
the greater are the potential gains from using ability tests to hire the best
workers.
1The formal definition of M(p) is
M(p) = f[H(1 - p)]/p
where f and H are, respectively, the density and the inverse of the cumulative distribution
function of the standardized normal distribution function.
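The definition M(p) = f[H(1 - p)]/p can be evaluated directly with the Python standard library. The sketch below assumes, as the text does, that test scores are normally distributed; the function name is our own.

```python
from statistics import NormalDist


def selectivity_multiplier(p):
    """M(p): mean standardized score of the top fraction p of applicants."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    std_normal = NormalDist()              # mean 0, standard deviation 1
    cutoff = std_normal.inv_cdf(1.0 - p)   # H(1 - p): score exceeded by fraction p
    return std_normal.pdf(cutoff) / p      # f[H(1 - p)] / p


m_top_tenth = selectivity_multiplier(0.10)        # employer hires 1 in 10
m_top_nine_tenths = selectivity_multiplier(0.90)  # employer hires 9 in 10

print(m_top_tenth, m_top_nine_tenths)
```

Hiring 1 applicant in 10 gives a value of roughly 1.75, while hiring 9 in 10 gives roughly 0.2, which illustrates that M(p) falls as p rises.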
Potential Benefits of Employment Service Use of the VG-GATB
As a demonstration of the use of the utility formula, we examine the
Hunter estimate that optimal test use would have resulted in an estimated
benefit of $79.36 billion to employers using the Employment Service
system in 1980 (U.S. Department of Labor, 1983e). That figure is widely
quoted in promotional literature for the General Aptitude Test Battery
(GATB). (The numbers in this discussion relate to 1980. The technique
could be applied to contemporary data with corrections for inflation and
the scale of Employment Service operations.)
The first number needed for the formula is the correlation between test
score and productivity, which Hunter takes to be .5, based on USES
validity generalization studies connecting test score and supervisor ratings.
The second number is the standard deviation of worker productivity,
which Hunter estimates to be 40 percent of average wages. This figure is
based on six empirical studies that covered clerks, nurse's aides, grocery
clerks, adding machine operators, and radial drill-press operators, with
estimated standard deviations of 20 percent, 15 percent, 15 percent, 10
percent, 10 percent, and 25 percent (Hunter and Schmidt, 1982: Table
7.1). It is also based on a method of variability assessment developed by
Hunter and Schmidt (see U.S. Department of Labor, 1983e) in which
supervisors are asked to estimate the dollar value of an average worker
and of a worker at the 85th percentile. The ratio of the two estimates is an
estimate of the standard deviation of worker productivity (under the
assumption that productivity is normally distributed, an assumption that
has been supported by Hunter and Schmidt in a study of computer
programmers.) Hunter and Schmidt developed values of 60 percent and
55 percent for budget analysts and computer programmers. Combining
these estimates with the previous empirical studies produces their overall
estimate of the standard deviation of worker productivity as 40 percent of
average annual wages.
The final number is the referral ratio, the proportion of applicants
referred. Hunter takes the value of 10 percent based on an "informal
enquiry that the U.S. Employment Service has jobs for only about 1 in 10
of the applicants." The value of M(p) for this referral ratio is 1.76; this
means that the average test score over the top 10 percent of scorers is 1.76,
when the test is standardized to have mean 0 and standard deviation 1.
Applying Brogden's formula gives a percentage gain, per worker per
year, of
G = .50 x 40 x 1.76 = 35%.
In 1980, the Employment Service placed 4 million applicants in jobs.
Average annual wage in the jobs served by the Employment Service is
$16,000. Average job tenure in the United States is 3.6 years. Thus the
total wages spent on workers hired in a particular year, over the expected
tenure of their jobs, is $230 billion, and, according to Hunter's
calculations, the savings if they had been hired top-down in order of test score
would be 35 percent x $230 billion = $80.5 billion.
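The arithmetic behind Hunter's figure can be retraced step by step. The sketch below uses only the inputs quoted above; the variable names are our own, and the small differences from the published figures come from rounding the intermediate values.

```python
# Inputs quoted in the text (1980 figures).
validity = 0.50               # correlation between test score and productivity
sd_pct_of_wage = 40.0         # SD of productivity, as a percentage of wages
mean_selected_score = 1.76    # M(p) for a referral ratio of 1 in 10

placements = 4_000_000        # applicants placed in 1980
average_wage = 16_000         # dollars per year
average_tenure = 3.6          # years

# Brogden's formula, with s expressed as a percentage of wages.
gain_pct = validity * sd_pct_of_wage * mean_selected_score

# Wages paid over the expected tenure of one year's placements.
total_wages = placements * average_wage * average_tenure

# The text rounds to 35 percent and $230 billion before multiplying.
savings_rounded = 0.35 * 230e9
savings_unrounded = gain_pct / 100.0 * total_wages

print(gain_pct)                    # about 35.2 percent
print(total_wages / 1e9)           # about 230.4 (billions of dollars)
print(savings_rounded / 1e9)       # about 80.5
print(savings_unrounded / 1e9)     # about 81.1
```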
Will VG-GATB Testing Save $80 Billion?
We examine the applicability of Brogden's formula for evaluating gains
from the use of the GATB by the Employment Service and reconsider the
particular numerical inputs used by Hunter (U.S. Department of Labor,
1983e).
There are two points to consider about the r value (correlation between
test score and productivity), which Hunter estimates at .5. First, the .5
value is based on corrections for restriction of range and for unreliability
in the criterion that the committee does not accept (see Chapter 8) and is
significantly larger than is supported by the second wave (post-1972) of
GATB validity studies.
Second, Brogden's formula measures the gains to an employer from
using ability tests, under the assumption that, without the tests, hiring is
random. Hunter asserts that the counseling used by the Employment
Service instead of the test "is equivalent to random selection" (U.S.
Department of Labor, 1983e). We do not have convincing evidence,
however, that the other techniques used by the Employment Service and
by employers are of no value. (If, indeed, workers are being selected at
random from applicant pools by the alternative methods, how can it be
argued, as Hunter does elsewhere, that it is necessary to correct
correlations computed on worker groups for restriction of range in order to
estimate their values for applicant groups?) In any case, some employers
use their own selection methods to screen applicants sent by the
Employment Service. In assessing the gains from using ability tests, it would be
necessary to understand how ability tests complement existing
procedures.
Suppose an employer is using a procedure that has a validity of .10.
For example, an employer uses some combination of interviews and
biographic information to rank job applicants and hires those who come
out best in that ranking. The ranking has a correlation of .10 with
productivity.
Now suppose the employer adds an ability test, which in combination
with other selection methods has a validity of .3 to select applicants. The
gain in productivity can be measured by Brogden's formula, but the
validity term in the formula must be replaced by .30 - .10 = .20, the
change in validity due to adopting the new procedure.
At first glance, it might be thought that the employer's prior procedure
with validity .10 could be combined with a cognitive test of validity .30 to
produce a combined selection procedure with validity .40, so that the gain
in validity due to using the cognitive test is .30. That, however, is not the
case. Even if the two are uncorrelated, the correlation of the combined
procedures is only .33; if they are positively correlated, it will be somewhat
less than this. To discover the improvement due to using a cognitive test,
one cannot avoid adjusting for the validity of the prior procedure.
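The point that validities do not simply add can be checked with the standard formula for the multiple correlation of two predictors with a criterion. The formula is a textbook result that we bring in for illustration; it is not stated in the chapter, and the inter-predictor correlation of .3 in the second call is a hypothetical value.

```python
from math import sqrt


def combined_validity(r1, r2, rho):
    """Validity of the best linear composite of two predictors.

    r1, r2: correlations of each predictor with productivity.
    rho:    correlation between the two predictors (must satisfy |rho| < 1).
    """
    r_squared = (r1 ** 2 + r2 ** 2 - 2.0 * r1 * r2 * rho) / (1.0 - rho ** 2)
    return sqrt(r_squared)


# Prior procedure with validity .10, cognitive test with validity .30.
uncorrelated = combined_validity(0.10, 0.30, rho=0.0)
moderately_correlated = combined_validity(0.10, 0.30, rho=0.3)  # hypothetical rho

print(uncorrelated)           # well below the naive sum of .40
print(moderately_correlated)  # lower still
```

With uncorrelated predictors the composite validity is the square root of .10, a little over .3, which is close to the figure cited above and far short of .40. A moderate positive correlation between the predictors lowers it further.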
Thus, in place of Hunter's estimate of .5, we suggest that the gain in the
validity of an employer's selection procedures from using the GATB is
more likely to range from .1 to .3. The .1 corresponds to jobs for which
the employer already has a reasonable selection procedure, and the .3
corresponds to jobs for which the current selection procedure is
effectively random.
Hunter's estimate of the second value in the Brogden formula is also
open to question. The empirical evidence cited for the standard deviation
of worker productivity is quite slight: eight studies by five authors (U.S.
Department of Labor, 1983e). Six of these studies are for jobs in the Job
Families IV and V principally served by the Employment Service, and the
standard deviations of output as a percentage of wages average 16
percent. Two of the studies, using a questionnaire of supervisors
developed by Hunter and Schmidt, give values of 55 percent and 60 percent for
budget analysts and computer programmers, respectively. However, the
Employment Service does not see many applicants for jobs such as budget
analyst and computer programmer. It seems overly optimistic to produce a figure
of 40 percent as the consensus figure for Employment Service jobs. In
Schmidt and Hunter (1983) the low-complexity jobs were estimated to
have standard deviations of 20 percent, and in more recent work (Hunter
et al., 1988) the estimates have been revised downward to 15 percent. In
our judgment, a more appropriate consensus figure for Employment
Service jobs would be about 20 percent.
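A quick check of the eight study values quoted in this chapter shows how sensitive the consensus figure is to the two supervisor-questionnaire studies. The sketch below takes simple unweighted means, which is our own assumption and not necessarily how Hunter combined the studies.

```python
from statistics import mean

# Standard deviations of productivity as a percentage of wages,
# as quoted in the text.
output_based_studies = [20, 15, 15, 10, 10, 25]   # six empirical studies
supervisor_estimate_studies = [60, 55]            # budget analysts, programmers

mean_output_based = mean(output_based_studies)
mean_all_studies = mean(output_based_studies + supervisor_estimate_studies)

print(mean_output_based)   # close to the 16 percent cited above
print(mean_all_studies)    # pulled upward by the two high values
```

Under this simple averaging, the six output-based studies alone give about 16 percent, and adding the two high values raises the mean to about 26 percent, still well short of 40 percent.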
The third figure in Brogden's formula is the selection ratio, which
Hunter takes to be 1 in 10 (1 selected for every 10 applicants). In 1980 the
Employment Service placed 4 million applicants in jobs. To achieve a
selection ratio of 1 in 10, it would have needed 40 million applicants, the
top 4 million test scorers being placed. The figures for 1986-1987 were 3.2
million placements of 6.9 million referrals for 19.2 million applicants, a
ratio of 1 in 6 (and perhaps 1 in 4 would be more reasonable, because 7
million of the 19.2 million were unemployment insurance claimants legally
obliged to register). The theoretical gains to be reaped from testing come
from allocating the top X percent of test scorers to jobs and the bottom
100 - X percent to no jobs. Hunter's numbers would mean that 10 percent
would be selected and 90 percent would not. For an individual employer
who can afford to be highly selective, Brogden's formula may well be
applicable. But it cannot apply to the whole economy, for which the
prospect of the top-scoring 10 percent working and the bottom 90 percent
not working is absurd. And the Employment Service is a microcosm of
the economy; of the 16 million applicants not placed during 1986-1987,
many will have already had jobs when they applied or will get them
through some other route than the Employment Service. Thus, even if
they score low on the test, they will get to work, and their productivity
must be allowed for.
Suppose there was only one job and all job seekers were tested, and the
top 90 percent of test scorers were employed and the rest were
unemployed. Ten percent is regarded as a reasonably high rate of
unemployment. The gains from testing against random hiring would be computed
using a selection ratio of 9 in 10. The corresponding inverse Mill's ratio is
.20, which should be compared with an M(p) value of 1.76 when the
selection ratio is 1 in 10.
Taking a more optimistic view, let us now assume a selection ratio of 1
in 6 based on the 1986-1987 figures. (This is optimistic in the sense that
it supposes that the 16 million workers not placed by the Employment
Service did not have or find jobs and so did not lower average
productivity.) The corresponding value of M(p) is 1.40. If one accepts the
committee's more cautious estimates of the first two values in the
Brogden formula, and if the Employment Service referred in order of test
score and the employers hired in order of test score, the economic gain by
Brogden's rule would be:
G = .2 x 20 x 1.40 = 5.6%.
This would lead to an estimated dollar gain, in 1980, of $13 billion as
opposed to Hunter's $80 billion. However, this is still an overestimate
because the average job tenure figure was not discounted for the decreased
value of the savings over time. Rather, one year's savings was multiplied
by the 3.6-year average tenure figure. A value of 3 would be more
appropriate, since next year's savings are not as valuable as this year's.2
This correction would reduce the dollar gain to about $10.75 billion.
2To correctly estimate the amount discounted, one would need to know both the
appropriate discount rate and the distribution of job tenure (not just its mean). To arrive at
the value 3, we took 10 percent as a discount rate. This is probably conservative. The most
conservative assumption one could make about the distribution of job tenure would be to
suppose that every worker stays on the job for exactly 3.6 years and then quits. Under that
assumption, the discounted present value of savings to a firm is 3.15 times annual savings.
A less conservative procedure would assume that workers leave jobs at a constant rate. In
this case the discounted present value of one year's savings should be multiplied by 2.63. A
reasonable compromise value is 3.
The more radical view, with a selection ratio of 9 in 10 (that is, 9 of 10
Job Service applicants get jobs one way or another), would lead to a gain
of 0.8 percent. Including the discounted job tenure figure, the dollar gain
in this scenario would be on the order of $1.5 billion.
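The two revised scenarios can be reproduced from the figures given above. The sketch below reuses the 1980 placement and wage figures, the committee's validity of .2 and standard deviation of 20 percent, the M(p) values quoted in the text, and the discounted tenure multiplier of 3; the function and variable names are our own.

```python
def brogden_gain_pct(validity, sd_pct_of_wage, m_value):
    """Percentage gain per worker per year, G = r * s * A."""
    return validity * sd_pct_of_wage * m_value


placements = 4_000_000       # applicants placed in 1980
average_wage = 16_000        # dollars per year
discounted_tenure = 3.0      # committee's discounted tenure multiplier

wage_base = placements * average_wage * discounted_tenure

# Optimistic scenario: M(p) = 1.40, as quoted in the text.
optimistic_pct = brogden_gain_pct(0.2, 20.0, 1.40)
optimistic_dollars = optimistic_pct / 100.0 * wage_base

# More radical scenario: selection ratio of 9 in 10, M(p) = .20.
radical_pct = brogden_gain_pct(0.2, 20.0, 0.20)
radical_dollars = radical_pct / 100.0 * wage_base

print(optimistic_pct, optimistic_dollars / 1e9)  # about 5.6 percent, $10.75 billion
print(radical_pct, radical_dollars / 1e9)        # about 0.8 percent, $1.5 billion
```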
The committee concludes that both the logic and the numbers used in
the estimate of $80 billion to be gained from testing are flawed, and that
an estimate in the range $1.5 billion to $10 billion is more plausible.
Although we regard this as a plausible estimate of savings, provided
both the Employment Service and employers used the GATB optimally,
we emphasize that it is not reasonable to conclude that the economy as a
whole would save this amount of money or that the gross national product
(GNP) could increase by this amount. Employment Service use of the
VG-GATB will not improve the quality of the labor force as a whole. If
employers using the Employment Service get better workers, employers
not using the Employment Service will necessarily have a less competent
labor force. One firm's gain is another firm's loss.
With great ambivalence, we have developed alternative computations
of the economic gains to be anticipated from widespread use of the
VG-GATB system. Such dramatic claims of dollar gains have been
proposed and given a credence perhaps not originally intended that we
feel compelled to demonstrate that a careful critique of the assumptions
and the numbers would lead many experts to very different, and much
more modest, estimates.
Our ambivalence stems from a reluctance to do anything to encourage
further use of dollar estimates in Employment Service literature. Given
the paucity of empirical evidence and the state of the art, all estimates of
productivity gains from ability testing are highly speculative. The choice
of a dollar metric lends a false precision to the analysis. We feel that it is
more likely to mislead than to inform policy.
GAINS TO THE ECONOMY AS A WHOLE ARE FROM
JOB MATCHING
Several attempts have been made to calculate the gains that would accrue
to the economy as a whole if ability testing were used to select all workers in
the economy. This calculation cannot be made simply by applying Brogden's
formula to the economy as a whole. The reason is that an important source
of increased productivity is an employer's ability to select the best-qualified
workers and to avoid hiring the least-qualified workers. If there is no
selectivity, then an employer gains nothing by identifying the able, since
this identification will not affect the hiring decisions.
The economy as a whole is very much like a single employer who must
accept all workers. All workers must be employed. Whereas it may be
true for an individual firm that more than 10 percent of its workers fit into
the top 10 percent of the ability distribution, this can never be true of the
entire labor force. The economy as a whole must make do with a labor
force that has only 10 percent of the workers who fit into the top 10
percent of the ability distribution. It must somehow reserve 10 percent of
its jobs for the least able 10 percent. This situation contrasts with that of
the individual employer. If a firm uses tests to identify the able and if the
firm can be selective, then it can improve the quality of its work force.
The economy as a whole cannot; the economy as a whole must employ
the labor force as a whole.3
Testing can increase aggregate productivity only if there are gains to be
made from matching people to jobs. Estimating those gains requires
models and procedures that are different from those used to measure the
gains that accrue to an individual employer who uses ability tests. In
estimating the effect on the economy as a whole, the model must balance
the single employer's gains against the losses of others.
To summarize, utility analysis cannot be applied to the economy as a
whole because the economy as a whole cannot have a selection ratio of much
less than 100 percent. The economy as a whole must make do with the labor
force that it has. It is not possible to assign the best workers to every job.
Economic Gains Based on the Hunter and Schmidt Job-Matching Model
In job matching, individuals are assigned to jobs to maximize overall
productivity. In the simplest case, when there is one predictor for each of
several jobs, gains over random assignment occur only if the quantity
validity x standard deviation of productivity
varies over the different jobs. The higher-scoring workers are assigned to
the jobs with the higher values of this quantity (Cronbach and Gleser,
1965: Chap. 5).
3What about the unemployed? One not entirely frivolous answer is that being unem-
ployed is a job; unemployment is essential to the smooth functioning of the economy. If
there were no unemployment, then inflation would be unacceptably high. Furthermore,
unemployment is necessary if the labor force is to respond to changing economic demands.
Without unemployment we would have many blacksmiths and no computer technicians. The
fact that the unemployment rate (or at least the unemployment rate that is consistent with
reasonable price stability) changes quite slowly is support for this view. If one takes
seriously this point of view, then it is clear that productivity can increase if the most able are
given the job "work" and the least able remain unemployed. But this conclusion rests on the
observation that some jobs are more productive than others and that aggregate productivity
increases when the more able are assigned to the more productive jobs. In other words, this
is a theory about how good job matching enhances productivity.
Brogden (1955, 1959, 1964) developed algorithms for optimal
classification when separate equations are used for predicting success in the
different jobs. The assignment part of the problem is mathematically
standard. There are m jobs and n workers, and each worker has an
expected dollar productivity for each job. Each worker is assigned to a
job to maximize expected total productivity. This is a problem in the field
of linear programming called the assignment problem. It will take a while
to do the calculation when m and n are large, but it is clear what needs to
be done. The hard problem is developing a plausible estimate of dollar
productivity for each worker for each job, then assessing the gain in using
optimal assignment versus random assignment.
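For a very small case, the assignment problem and the comparison with random assignment can be shown by brute force. The productivity figures below are invented for illustration, and enumerating every assignment is practical only for a handful of workers; realistic problems would need a dedicated method such as the Hungarian algorithm or a linear programming solver.

```python
from itertools import permutations

# Hypothetical expected dollar productivity.
# Rows are workers, columns are jobs; one worker per job.
productivity = [
    [30_000, 22_000, 18_000],
    [26_000, 24_000, 17_000],
    [21_000, 20_000, 19_000],
]


def assignment_value(matrix, jobs_for_workers):
    """Total expected productivity when worker w takes job jobs_for_workers[w]."""
    return sum(matrix[w][j] for w, j in enumerate(jobs_for_workers))


# Enumerate every one-to-one assignment of workers to jobs.
all_assignments = list(permutations(range(len(productivity))))
values = [assignment_value(productivity, a) for a in all_assignments]

best_value = max(values)
best_assignment = all_assignments[values.index(best_value)]

# Under random assignment every permutation is equally likely.
random_value = sum(values) / len(values)
gain = best_value - random_value

print(best_assignment, best_value, gain)
```

The quantity of interest is the last one: the difference between the optimal total and the average total over all assignments.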
Under some simplifying assumptions, Brogden (1959) showed that the
gain from optimal assignment was proportional to (1 - c), where c is the
correlation between the predictors used in the different jobs. Under these
assumptions, it is thus important to classify jobs so that different
prediction equations are appropriate for the different jobs.
Schmidt and Hunter (1983) present two job-matching models that
assign workers optimally. In the first of these, the univariate model, they
divide jobs into four types: management-professional, skilled trade,
clerical, and semiskilled and unskilled labor. Productivity is predicted by
a single predictor, cognitive ability, with correlation .4 in all jobs. The
standard deviation of productivity is assumed proportional to average
productivity in the job. Thus the optimal classification assigns higher-
ability workers to the higher-wage jobs, for which their expected
productivity is higher because the standard deviation of productivity in dollars is
higher.
If there is a single predictor, then Brogden (1959) would predict no
gains from the use of testing. Hunter and Schmidt's different conclusion
is based on a different assumption about the way in which:
validity x standard deviation of productivity
varies across jobs. Hunter and Schmidt argue that the higher the average
productivity of a job, the greater is the influence of a worker's ability on
the output of the job. Some fragmentary confirming evidence that
supports this point of view can be found in Hunter et al. (1988). Brogden
implicitly assumes that the effect of ability on job output is the same for
all jobs. We regard the Hunter and Schmidt assumption as plausible but
note that there is very little evidence about the nature of the relationship
of ability to output.
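The difference between the two assumptions can be seen in a toy example with a single predictor. The sketch below assumes expected output is linear in the standardized test score, with slope equal to validity times the job's standard deviation of productivity; the four workers, four jobs, and all dollar figures are hypothetical.

```python
from itertools import permutations


def total_expected_output(scores, job_means, job_slopes, jobs_for_workers):
    """Sum of job mean + slope * score over all worker-job pairs."""
    return sum(job_means[j] + job_slopes[j] * z
               for z, j in zip(scores, jobs_for_workers))


def gain_over_random(scores, job_means, job_slopes):
    """Best total minus the average total over all one-to-one assignments."""
    totals = [total_expected_output(scores, job_means, job_slopes, a)
              for a in permutations(range(len(scores)))]
    return max(totals) - sum(totals) / len(totals)


scores = [-1.5, -0.5, 0.5, 1.5]                        # standardized test scores
job_means = [10_000.0, 15_000.0, 20_000.0, 40_000.0]   # hypothetical mean output
validity = 0.4

# Case 1: the same dollar SD in every job (ability matters equally everywhere).
equal_sds = [3_000.0] * 4
gain_equal = gain_over_random(scores, job_means,
                              [validity * s for s in equal_sds])

# Case 2: SD proportional to mean output (here 20 percent of the mean).
proportional_sds = [0.2 * m for m in job_means]
gain_proportional = gain_over_random(scores, job_means,
                                     [validity * s for s in proportional_sds])

print(gain_equal)         # no gain from matching
print(gain_proportional)  # positive gain from matching
```

When validity times standard deviation is the same in every job, every assignment yields the same total and matching gains nothing. When it rises with average output, placing the higher scorers in the higher-output jobs produces a gain.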
In the second of Hunter and Schmidt's models, the multivariate model,
different predictors are used for the different job types. Cognitive ability
is used for managerial-professional and for semiskilled-unskilled, with an
assumed correlation of .4 for each. Cognitive ability and spatial ability
predict productivity in skilled trades, and the three correlations between
the two abilities and productivity are assumed to be .4. Cognitive ability
and perceptual ability predict productivity in clerical work, and the three
correlations between the two abilities and productivity are assumed to be
.4. Finally, the correlation between spatial and perceptual ability is
assumed to be .16.
The workers are assigned in the second model as follows: first, those
scoring highest on cognitive ability are assigned to the management-
professional group; then, of those remaining, the highest scorers on
spatial plus cognitive ability are assigned to the skilled trades; of those
remaining, the highest scorers on perceptual plus cognitive ability are
assigned to clerical work; and the remainder go to semiskilled-unskilled
labor. (Although it is a minor academic point, this assignment does not
maximize productivity; despite their high cognitive ability, some
prodigious scorers on spatial ability should be assigned to skilled trades.)
Hunter and Schmidt use their models to estimate the amount by which
the GNP would increase if testing were used to place all workers
optimally in jobs. Under the assumption that validity is .4, their estimates
range from 1.7 percent of the GNP for the univariate model (using a low
estimate, 16 percent of average output, of the standard deviation of
productivity) to 8.1 percent of GNP for the multivariate model (using a
high estimate, 40 percent of average output, of the standard deviation
of productivity).
Using our preferred parameters (validity is .2 and the standard deviation
of productivity on a job is 20 percent of output on that job), their
univariate model suggests that improved job matching would increase the
GNP by about 1.1 percent; the multivariate model suggests an increase of
2.1 percent.
These percentage increases should be compared with the 35 percent
increase estimated by Hunter for Employment Service jobs (U.S.
Department of Labor, 1983e). Hunter and Schmidt argue that their
multivariate model overestimates the potential gain from a testing
program because it does not take into account that placement is not now
at random. They suggest that a reasonable way to correct this estimate
of the potential gains is to take the difference between the multivariate
and univariate models. Under their assumptions, these gains would
range from 1.6 percent to 4 percent of the GNP; under our preferred
assumptions, this technique puts potential gains at 1 percent of the
GNP.
How do these economy-wide models relate to Employment Service use
of the GATB? This is an important question, because a policy that would
increase the GNP by just 1 percent would be of enormous value to the
country (1 percent of the GNP in 1987 was $45 billion). In answering this
question it is important to remember that only a small fraction of those
who find jobs each year do so through the Employment Service system.
The gains that Hunter and Schmidt calculate would be realized only if all
employers used tests optimally. It is also important to remember that the
most important assumptions of the Hunter-Schmidt models rest on a very
slim empirical foundation.
Nevertheless, the committee views the economy-wide matching
models as a promising way to assess the economic effects of testing. By
looking beyond a single job, they offer the Employment Service a
device for balancing the demands of all employers and all applicants. In
particular, if they are to be taken seriously, they would require a job
classification scheme that as much as possible reduces the correlation
between predictors in different job classes. The present five-family
classification scheme is not adequate for effective multivariate matching.
Few economists have tried to answer the question of how productivity
is affected by the way in which workers are matched to jobs. Those who
have approached this problem have used models and procedures that are
very different from those used by Hunter and Schmidt. Most economic
models assume that workers choose the job for which they are best fitted.
With this maintained assumption it is not possible to address the question
that Hunter and Schmidt ask. Some economic models (notably those of
Heckman and Sedlacek, 1985, and Willis and Rosen, 1979) have been
tested in the sense that they have been successfully fitted to data about
the U.S. economy. In this weak sense they have a firmer empirical base
than the Hunter-Schmidt models. However, on the issue of how much
output would go up if people were better fitted to their jobs, they are, at
present, silent.
Hunter and Schmidt's economy-wide models are based on simple
assumptions for which the empirical evidence is slight. The most impor-
tant one is that the standard deviation of productivity is proportional to
average wage of the job. That assumption is supported by only a very few
studies. Without that effect there would be no gains in placing higher-
scoring workers in the more highly paid jobs. The second set of assump-
tions concerns the correlation of various aptitudes with productivity.
Although there are many more data on which to base these correlations,
there is much variation in the data and considerable disagreement about
what the correlations should be. The general concept of the models is
promising, but the particular numerical values used can be regarded as
only illustrative. We do not know how well employers and workers match
themselves already. We do not have a classification of jobs that lends
itself to job matching, so the gains from the multivariate model are only
theoretical.
SUPERVISOR RATINGS AND TRUE PRODUCTIVITY
Proponents of the VG-GATB claim that its use will lead to increased
productivity. The scientific base of GATB research does not support
such an inference directly. This is because the GATB validity studies do
not report correlations between test performance and productivity;
instead they report correlations between test performance and a surro-
gate for productivity, supervisor ratings. (A number of studies report
correlations between test scores and performance in training programs.
In analysis of the economic benefits of using the GATB, the data on
training are largely ignored.) A small number of studies, discussed in
Chapter 10, have attempted to measure the economic benefits of using
the GATB directly. The small number and mixed quality of these
studies make it difficult to draw inferences that can be generalized to
other settings.
The correlation between test scores and true productivity could well
be either higher or lower than the correlation between test scores and
supervisor's ratings. If the GATB measures productivity, and if
supervisor ratings are imperfect measures of productivity, then the
correlation between productivity and test scores will be higher than the
reported correlation between test scores and supervisor ratings. For an
elaboration of this point see the discussion of criterion unreliability in
Chapter 6.
If, however, the GATB measures well what supervisors regard highly
and if supervisor ratings tend to ignore or overlook significant
contributions to productivity, contributions that are not well measured by the
GATB, then the correlation between supervisor ratings and GATB
scores will exceed the correlation between productivity and GATB
scores.
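The second case can be sketched with a toy variance-components model. Everything in it is an assumption made for illustration and does not come from the report: the test and the rating both reflect a component A plus independent noise, while productivity is A plus a further independent component B that neither the test nor the supervisor captures.

```python
import math


def toy_correlations(var_a, var_b, var_test_noise, var_rating_noise):
    """Return (corr(test, rating), corr(test, productivity)) under a toy model.

    Assumed model (hypothetical, all components independent):
        test         T = A + e_T
        rating       R = A + e_R
        productivity P = A + B
    T shares only A with R and with P, so both covariances equal var_a.
    """
    covariance = var_a
    sd_test = math.sqrt(var_a + var_test_noise)
    sd_rating = math.sqrt(var_a + var_rating_noise)
    sd_productivity = math.sqrt(var_a + var_b)
    r_test_rating = covariance / (sd_test * sd_rating)
    r_test_productivity = covariance / (sd_test * sd_productivity)
    return r_test_rating, r_test_productivity


# Hypothetical variances: the overlooked component B is as large as A,
# and the rating noise is comparatively small.
r_rating, r_productivity = toy_correlations(
    var_a=1.0, var_b=1.0, var_test_noise=1.0, var_rating_noise=0.25
)
print(round(r_rating, 3), round(r_productivity, 3))
```

In this toy model the correlation with ratings exceeds the correlation with productivity whenever the variance of the overlooked component B is larger than the rating noise variance. With the inequality reversed, the ordering flips, so the sketch does not indicate which situation holds in practice.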
Which is the case? In the absence of direct data on the joint distribution
of test scores, supervisor ratings, and productivity, we cannot say with
confidence whether reported validity coefficients overstate or understate
the true correlation between test scores and productivity.
It seems highly unlikely that data that will resolve this problem will
exist in the near future (or ever). What then is to be done? The most
reasonable course would seem to be to regard correlation with supervisor
ratings as the best available estimate of the correlation between test
scores and productivity. However, those who use these numbers to
evaluate potential economic gains should be aware of the uncertain
scientific base on which their estimates rest.
FINDINGS, CONCLUSIONS, AND RECOMMENDATIONS
A major attraction of the VG-GATB system is the anticipation of
substantial economic gains. USES Test Research Report No. 47 (U.S.
Department of Labor, 1983e), written by John Hunter, contends that a
potential increase in work force productivity of $79.36 billion per year
would accrue if the 4 million placements made by the Employment
Service system were based on top-down referral from GATB test
scores.
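As a rough check on the scale of this claim, the aggregate figure can be divided by the number of placements quoted with it. The calculation below is simple division of the two numbers in the preceding paragraph; the variable names are illustrative, and the per-placement result is not a figure quoted from the report.

```python
# Figures quoted in the text above.
claimed_annual_gain_dollars = 79_360_000_000  # $79.36 billion per year
annual_placements = 4_000_000                 # 4 million placements

# Implied average productivity gain attributed to each placement.
gain_per_placement = claimed_annual_gain_dollars / annual_placements
print(f"${gain_per_placement:,.0f} per placement per year")
```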
Our evaluation of the potential economic effects of the VG-GATB
Referral System included study of the work of labor economists as well as
the utility analysis developed in recent years by psychologists. We have
looked carefully at Hunter's work with GATB data as well as the more
elaborate models proposed by Hunter and Schmidt in other contexts.
Findings
Benefits to the Individual Employer
1. There is evidence in the economics and industrial/organizational
psychology research literature that people who score higher on ability
tests tend to produce more and make fewer errors, as well as to complete
training somewhat faster and stay on the job longer.
2. How selective an individual firm can be depends on the people
available and how much the firm can offer its employees in pay and other
benefits. Selection can operate only within those conditions, and the
potential gains are commensurately constrained.
Aggregate Economic Effects
1. There is no well-developed body of evidence from which to estimate
the aggregate effects of better personnel selection. A number of
theoretical models have been developed that imply various estimates of
productivity gains from improved selection and placement. But we have seen no
empirical evidence that any of them provides an adequate basis for
estimating the aggregate economic effects of implementing the VG-GATB
on a nationwide basis.
The Hunter-Schmidt Models
1. The Hunter-Schmidt univariate and multivariate models for estimat-
ing the aggregate economic gain of optimal selection are potentially
valuable. However, we have seen no empirical evidence that supports
their estimates of dollar gains in the GNP if employment testing with
top-down scoring were widely used.
Conclusions
1. Our review of the economics literature and our analysis of the
Hunter-Schmidt theoretical models lead us to reject their estimates of
specific dollar gains from test-based selection.
2. Furthermore, given the state of scientific knowledge, we do not
believe that realistic dollar estimates of aggregate gains from improved
selection are even possible. Such estimates lend a spurious certainty to the
argument for the VG-GATB Referral System that can only mislead policy
makers, employers, and those who administer the referral system.
3. We agree that better selection of workers would be likely to benefit
individual employers and that a better matching of people to jobs
according to their particular abilities or other work-related characteristics
would tend to foster the economic health of the community, all other
things being equal. But the current state of economic knowledge does not
permit estimation of the overall economic effects of widespread testing.
Recommendation
1. Given the primitive state of knowledge about the aggregate eco-
nomic effects of better personnel selection, we recommend that Employ-
ment Service officials refrain from making dollar estimates of the gains
that would result from test-based selection.
PART V
CONCLUSIONS AND RECOMMENDATIONS
Whereas the committee's specific conclusions and recommendations
appear at the end of each chapter, Part V highlights the committee's most
important recommendations. Chapter 13 presents the committee's
recommendations on the use of score adjustments for black and Hispanic job
seekers in the VG-GATB Referral System and its recommendations on
what scores to report to test takers and employers. Chapter 14 is a
summary of the committee's central recommendations: it recapitulates
the committee's statements on operational use of the VG-GATB system,
methods of referring applicants to jobs, options for reporting GATB
scores to employers and to job seekers, promotion of the VG-GATB
system, research on its effects, and action with regard to veterans and
people with handicapping conditions.
Representative terms from entire chapter:
test score