6
Implementing Impact
Evaluations in the Field
INTRODUCTION
A counterfactual question—how would things have looked in the
absence of the U.S. Agency for International Development (USAID) pro-
gram?—lies at the core of any design for impact evaluations. Chapter 5
made the case that randomized evaluations provide the soundest meth-
odology for generating definitive answers to this question. However, it
is one thing to specify what may be optimal theoretically and another
thing altogether to implement that methodology on the ground. Practical
impediments may make the implementation of randomized evaluation
difficult, even impossible, at least in a pure form. For example, factors
outside of USAID’s control may render it not feasible to gather baseline
data, to identify and monitor outcomes in a control group, or to select by
lottery the units in which programs should be implemented. Although
Chapter 5 provided examples of several successful randomized evalua-
tions, only a handful of these are in the democracy and governance (DG)
area, and none of them are examples of evaluations of USAID’s own
programs. Thus, even if willing to accept the desirability in principle of
adopting the methodology of randomized evaluation, it is reasonable to
wonder how readily it can be applied to the sorts of programs that USAID
missions in the field regularly undertake.
To find out, the committee commissioned three expert teams to visit
USAID missions overseas in an effort to assess the viability of impact
evaluations for past and present DG programming. The key task for each
team was to talk with implementers, local partners, and USAID mission
IMPROVING DEMOCRACY ASSISTANCE
personnel on the ground to assess the feasibility of actually implementing
in practice the evaluation methodologies outlined in the previous chapter.
The first part of this chapter presents the results of those field visits. The
second part provides responses to the most commonly raised objections
that the committee and its field teams heard expressed about the use of
randomized evaluations in DG programs.
Before turning to the details of what the field teams found, it is impor-
tant to highlight a clear and consistent message that came through from
all three field visits. All three teams concluded, first, that the introduction
of randomized evaluations into USAID project evaluation was both feasible and
cost-effective in many of the contexts they investigated. They were unanimous
that, where possible, adopting such methods would represent an improve-
ment over current practices. Second, they reported that, for projects where
randomized evaluations were not possible, other improvements to USAID
evaluation (for example, improved measurement, systematic collection of baseline
data, and comparisons across treated and untreated units) also have the potential to
yield significant improvements in the agency’s ability to attribute project impact.
These issues are discussed in Chapter 7. Finally, the teams returned from
the field energized by their interactions with mission staff and confident
that a willingness, and even excitement, exists about improving the qual-
ity of project evaluations. The teams were also impressed with some of
the work already being done as part of current project monitoring, in
particular in the broadening of measurement strategies beyond project
outputs to include an assessment of outcomes.
FIELD VISITS TO USAID MISSIONS
As a complement to the deliberations in Washington and extensive
engagement with USAID staff and implementers, the committee felt
strongly that its recommendations should be informed by a set of extended
field visits to USAID missions. The committee therefore identified a set of
missions, representing a diversity of regions, that were engaged in sub-
stantial programming on DG issues and were in the process of designing
large, new projects in one of USAID’s core DG areas (rule of law, elections
and political processes, civil society, and governance). From the list of
missions provided, USAID explored the willingness of the missions to
host the team and consider new approaches to project evaluation. After
negotiating issues of timing and access, USAID and the committee agreed
to send field teams to Albania, Peru, and Uganda. The field visits were
intended to accomplish three main goals:
1. to better understand current strategies used for project evaluation,
including approaches to data collection;
2. to explore the feasibility of introducing impact evaluations in the
future, including (but not limited to) randomized evaluation; and
3. to obtain the perspectives of mission personnel and USAID
implementers regarding the possibilities for, and impediments to, new
approaches to evaluation.
The committee encouraged the field teams to explore the range of
DG activities currently under way in each mission, assess the adequacy
of current evaluation approaches, and provide concrete examples of how
existing approaches could be improved. In addition, the field teams were
directed to focus particular attention on the development of an impact
evaluation design in one specific area in each mission. The teams focused
on local government/decentralization in Albania and Peru and support
for multiparty democracy in Uganda.1
Each field team was composed of methodological consultants, aca-
demic or other experts with relevant experience in research design or
program evaluation and DG issues, and country or regional expertise;
a Washington-based USAID staff member who was familiar with the
mission, the committee’s work, and USAID policies and practices; and
National Research Council professional staff, who assisted the consultants
in meeting the team’s objectives and coordinated the logistics of the field
visits.
In evaluating the findings of the three field teams, it is important to
keep in mind that the field teams visited missions that had expressed an
interest in improving their evaluation strategies. The field teams’ conclusions
about the applicability of impact evaluations, especially their sense
that standard objections to these designs can be addressed, thus reflect
the experiences gleaned from this (nonrandom) sample of missions. It is
not known if other missions, especially smaller ones with leaner budgets
or those in countries experiencing violent conflicts or particularly rapid
political change, would be as amenable to new approaches to evaluation:
The committee has no control group of non-self-selecting missions with
which to compare its findings. Yet the committee believes it unlikely that
missions that did not invite the committee to send a field team would
have offered novel additional objections. Over the 15 months of the study
period, the committee talked with numerous USAID staff and implementers
from a variety of areas, with backgrounds and experience in DG
programming in a great many countries, and the set of objections that are
taken up in the second part of this chapter dominated the responses of
everyone with whom the committee spoke.
1 Key results of the field visits are discussed in this chapter and the next. Additional infor-
mation can be found in Appendix E.
EMPLOYING RANDOMIZED IMPACT EVALUATIONS
FOR USAID DG PROJECTS IN THE FIELD
Randomized evaluations are widely considered the best method for
determining the causal effects of treatment in a broad range of areas,
including public health, education, microfinance, and agriculture. As
the Olken (2007) and Gugerty and Kremer (2006) studies described in
Chapter 5 show, such methodologies are also beginning to be applied to
evaluate the effectiveness of projects in the area of democratic governance.
Nonetheless, the committee learned from its consultations with USAID
staff and implementers that there is a general feeling that randomized
evaluation is not an option for many of the projects that USAID carries
out. Even in those cases where randomized evaluations might be possible
theoretically, the assumption among USAID staff seemed to be that such
approaches would be too difficult to implement in practice, owing to an
inability to select treatment groups by lottery, the difficulty of preserv-
ing a control group, the difficulty of identifying good indicators for key
outcomes, the high cost of the extensive data collection that would be
required, or the tension between the flexibility staff believe they need to
respond to opportunities and challenges as projects go forward and the
need to minimize changes to ensure an effective evaluation.
These are legitimate concerns. To address them, this section discusses
how randomized evaluations could be used in current USAID projects,
drawing on examples gleaned from the field visits and consultations with
practitioners. We begin with a decentralization project in Peru that has
already been implemented, outlining how the project monitoring strategy
that was employed could have been adjusted to accommodate a random-
ized component that would have made it an impact evaluation design
and showing how such an adjustment would have permitted the mis-
sion to generate much stronger inferences about project impact.2 Then a
planned multipronged effort to support multiparty democracy in Uganda
is described, emphasizing how pieces of the existing project might be
amenable to randomized evaluation and showing how adopting such an
evaluation method would improve USAID’s ability to assess the project’s
effects.3 The committee’s goal is to use these projects as illustrations of the
potential payoffs that could accrue from improved evaluation strategies.
2 The discussion here of decentralization in Peru is drawn from the report of a field team
led by Thad Dunning, assistant professor of political science, Yale University.
3 These designs were developed by a team led by Devra Moehler, assistant professor of
political science, Cornell University.
Decentralization in Peru
USAID/Peru launched a project in 2002 to support national decen-
tralization policies initiated by the Peruvian government. Over a five-year
period, the Pro-Decentralization (PRODES) program was intended to:
• support the implementation of mechanisms for citizen participa-
tion with subnational governments (such as “participatory budgeting”),
• strengthen the management skills of subnational governments in
selected regions of Peru, and
• increase the capacity of nongovernmental organizations in these
same regions to interact with their local government (USAID/Peru
2002).
With the exception of some activities relating to national-level policies,
all interventions under the project took place in seven selected subna-
tional regions (also called departments): Ayacucho, Cusco, Huanuco,
Junin, Pasco, San Martin, and Ucayali.4 These seven regions contain 61
provinces, which in turn contain 536 districts.5 Workshops on participa-
tory budgeting, training of civil society organizations (CSOs), and other
interventions took place at the regional, provincial, and district levels.6
The ultimate goal of the project was to promote “increased respon-
siveness of subnational elected governments to citizens at the local level
in selected regions” (USAID/Peru 2002). This outcome is potentially mea-
surable on different units of observation. For example, government capac-
ity and responsiveness could be measured at the district or provincial
level (through expert appraisals or other means), while citizens’ percep-
tions of government responsiveness may be measured at the individual
level (through surveys).
The PRODES decentralization project represented an ambitious effort.
By all accounts it was a well-executed program; the performance of the
local contractor received high marks from mission staff at USAID/Peru.
The questions of interest here do not relate to the performance of the con-
tractor in relation to project outputs or very proximate outcomes, which
4 The regions were nonrandomly selected for programs because they share high poverty
rates, significant indigenous populations, and narcotics-related activities and because a
number of the departments were strongholds for the Shining Path movement in the 1980s.
5 Peru has 24 departments plus one “constitutional province”; the 24 departments in turn
comprise 194 provinces and 1,832 districts. Provinces and districts are often both called
“municipalities” in Peru and both have mayors. Sometimes two or more districts combine
to form a city, however.
6 Relevant subnational authorities include members of regional councils, provincial may-
ors, and mayors of districts.
were the focus of the project monitoring plan used by the implementer.7
Instead, the question is how we could know whether such a project had
impacts on targeted policy outcomes, such as the responsiveness of local
governments to citizens’ demands.
Since the project was not designed with impact evaluation, as defined
here, in mind, it suffered from a number of serious deficiencies in that
regard. The main deficiencies parallel the general points raised in Chap-
ter 5: the absence of indicators for at least some of the most important
policy outcomes, the absence of comparison units, and the absence of
treatment randomization. Taken together, these shortcomings present
almost insuperable obstacles to an impact evaluation. One important find-
ing of the team was that with foresight some of these deficiencies might
have been fairly easily corrected and for not much additional cost. Indeed,
some of the changes outlined below would likely yield cost savings.
As mentioned, the decentralization project sought to foster citizen
participation, transparency, and accountability at the local level, with
the ultimate objective of promoting “increased responsiveness of subna-
tional elected governments to citizens.” Though some of these outcomes
are potentially, albeit imperfectly, measurable, indicators gathered at the
local level related almost exclusively to outputs rather than outcomes.
For example, the indicators gathered included the percentage of munici-
palities that signed “participation agreements” with local contractors;
the percentage of participating municipalities from which at least two
individuals (local authorities or representatives of CSOs) attended a train-
ing course in participatory planning and budgeting; the percentage of
targeted provincial governments in which at least two CSOs exercised
regular oversight of municipal government operations, as measured by
participation in at least two public forums during the year; and the per-
centage of participating local governments that establish technical teams
to assist with decentralization efforts (PRODES PMP 2007).
Such indicators are designed to monitor the implementer’s perfor-
mance and perhaps measure very proximate outcomes, such as formal
participation in the decentralization process. However, they do little to
help discern the impact of interventions on the main outcomes that the
project was designed to affect. For purposes of evaluating impact—and
even for improved project monitoring—we want to know not how many
training courses there were or how many officials attended them but
rather whether they led subnational elected governments to be more
responsive to their citizens.8
7 A description of current USAID project monitoring can be found in Chapter 2.
8 The USAID/Peru team and local contractors were clearly aware of the distinction
between measures of contractor performance and measures useful for assessing impact; this
Several indicators gathered through surveys did tap citizens’ percep-
tions of the responsiveness of subnational elected governments in targeted
municipalities. Surveys taken in 2003, 2005, and 2006 asked respondents:
Are the services provided by the (district, provincial, or regional) gov-
ernment very good, good, average, bad, or very bad? Another question,
administered only in the 2003 and 2005 surveys,9 asked: Do you think that
the (district, provincial, or regional) government is responsive to what the
people want almost always, on the majority of occasions, from time to
time, almost never, or never? (PRODES PMP 2006, 2007).
In principle, such survey questions may provide useful proxy mea-
sures of the outcomes of interest. In practice, however, there were a num-
ber of issues that limited the usefulness of these measures. First, only the
first question was asked in a comparable manner across all three surveys,
allowing for a very limited time series on the outcome of interest. Second
and perhaps more importantly, as discussed further below, was the failure
to gather measures on control units in all but the 2006 survey.
Finally, a “baseline” assessment of municipal capacity was prepared
at the start of the program by a local institution. All district and provin-
cial municipalities in the seven selected regions were coded along several
dimensions, including extent of socioeconomic needs and management
capacities of district and provincial governments (GRADE 2003).
Poverty rates and related indicators played a preponderant role in the
local institution’s calculations, which may have limited the usefulness of
the index for assessing changes in subnational government capacity or
responsiveness. In theory, however, repeated assessments of this kind
could have provided useful data on municipal capacity, which is an out-
come of interest under the decentralization project. As far as the team
could determine, the assessment was not repeated.
USAID/Peru’s implementer was tasked with carrying out the decen-
tralization project in all 536 districts of the seven selected regions. Once
the rollout of interventions in all municipalities had been completed, no
untreated municipalities remained available in the selected regions. The
absence of appropriate control units (untreated municipalities) is perhaps
the biggest problem for effective evaluation of the decentralization proj-
ect. In addition, since rollout was completed by the second year of the
program, there was little opportunity to compare outcomes in treated and
untreated units in the seven regions.
distinction is made in some of the relevant program monitoring plans (e.g., PRODES PMP
2006). However, most of the impact measures appear to be fairly proximate outcome mea-
sures related to the process of supporting decentralization.
9 The 2003 and 2005 questions were administered as part of the Democratic Indicators
Monitoring Survey, whereas for 2006, data came from the Latin American Public Opinion
Project.
In principle, comparisons could be made across treated municipali-
ties in the seven selected regions and untreated municipalities outside
these regions. Since the seven regions were nonrandomly selected on the
basis of characteristics that almost surely covary with municipal capac-
ity and subnational government responsiveness (e.g., high poverty rates,
narcotics-related activities, past presence of the Shining Path), however,
inferences drawn from such comparisons would be problematic, although
not completely uninformative. In practice, however, the data do not exist
for such comparisons because virtually no data were gathered on control
units. The exception is the 2006 commissioned survey taken as a part of
the Latin American Public Opinion Project (LAPOP), which administered
a questionnaire to a nationwide probability sample of adults including
an oversample of residents in the seven regions in which USAID works
(Carrión et al. 2007).10 This survey includes several questions that would
be useful measures of the outcome variables (though only one question
is comparable to questions asked in the earlier non-LAPOP surveys taken
in treated municipalities in 2003 and 2005).11 The 2006 LAPOP national
survey, had it been carried out beginning in 2003, could have established
a national baseline against which the selected regions could have been
measured before the program began.12 The project implementers would
then have known, for example, if as was hypothesized, satisfaction with
local government, participation in local government, corruption in local
government, and so forth, were more problematic in the targeted regions
than in the rest of the country. Since the regions selected were poorer and
more rural than the nation as a whole, covariate controls could have been
introduced in an analysis-of-variance design that could have statistically
forced the nation and the control groups to look more alike. Then, in
each subsequent round of surveys, comparisons could have been made
between the nation and the targeted regions, thereby making it pos-
sible to observe the rate of change. Had satisfaction with local govern-
ment nationwide remained unchanged while the targeted areas showed
increased satisfaction, project impact could have been established with
a reasonable degree of confidence. Indeed, if national satisfaction had
10 In addition to 1,500 respondents in the nationwide sample, an oversample of 2,100 (300
per region) was taken from the seven regions (Patricia Zárate, Instituto de Estudios Pe-
ruanos, personal communication June 2007). Inter alia, this survey asked respondents their
opinions of the quality of local government services, as noted above.
11 The LAPOP instruments include questions that are comparable across 20 surveyed
countries; see Seligson (2006). For useful information, the committee is grateful to Patricia
Zárate, Instituto de Estudios Peruanos.
12 Of course, the national sample would need to have had removed from it any sample
segments lying in the project area in order for the national “control” group not to have been
contaminated by the project inputs.
declined over the life of the project while the target areas held steady,
this, too, could have been an indicator of project success. It is important
to stress that since the mission was already regularly conducting national
samples of public opinion, there would have been no added data-gathering
costs in the hypothetical strategy just proposed. The only cost would have
been the minimal expense of analyzing the data.
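
The comparison just described amounts to a difference-in-differences calculation: the change in the targeted regions minus the change in the rest of the nation. The following is a minimal sketch in Python. The function name is hypothetical and the percentages are invented placeholders chosen only to illustrate the arithmetic; they are not survey results.

```python
def difference_in_differences(target_before, target_after,
                              national_before, national_after):
    """Change in the targeted regions minus change in the rest of the nation."""
    target_change = target_after - target_before
    national_change = national_after - national_before
    return target_change - national_change


# Hypothetical shares (percent) rating local services "good" or "very good".
# Scenario A: national satisfaction is flat while the targeted regions improve.
scenario_a = difference_in_differences(30.0, 36.0, 40.0, 40.0)

# Scenario B: national satisfaction declines while the targeted regions hold steady.
scenario_b = difference_in_differences(30.0, 30.0, 40.0, 34.0)

print(scenario_a, scenario_b)
```

Both hypothetical scenarios yield the same positive estimate, which is why a steady target area against a declining national trend could also indicate project success. Because the regions were not randomly selected, such an estimate would still rest on the assumption that the targeted regions and the rest of the country would otherwise have followed similar trends.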
Outside the LAPOP 2006 survey, no data were gathered on untreated
municipalities. The universe of the 2003 and 2005 surveys was limited
to residents of the seven regions (and thus only to residents of treated
municipalities). Evaluations of municipal capacity (e.g., the GRADE study
mentioned above) were conducted only on districts and provinces in the
seven selected regions.
Although some data were collected in control municipalities outside
the seven regions, the absence of a control group within the regions has
serious consequences for evaluation. As just one example, many munici-
palities in the seven regions had been ravaged by the conflict with the
Shining Path during the 1980s and 1990s. Investment and population
return have picked up in some areas during the past decade, especially
the past five years; at least some of this upturn must be due to the end
of the war and other factors.13 Improvements in measured municipal
capacity or in citizens’ perceptions of local government responsiveness
during the life of the program may, therefore, not be readily attributable
to USAID support for decentralization. If control municipalities had been
selected from the outset at random and the treatment municipalities had
outperformed the controls, we would have greater confidence that the
project had a positive impact.
In sum, as discussed further below, if the project had been designed to
permit rigorous impact evaluation rather than monitoring, a plan for gath-
ering data on control units would have been created as part of the initial
project design. Ideally, one would have compared treated and untreated
municipalities inside the seven regions. In the absence of untreated munici-
palities inside the regions, data could have been gathered on appropri-
ately selected municipalities outside the region.14 Surveys should have
included residents of untreated municipalities, and evaluations of munici-
pal capacity (such as the GRADE study) should have included pre- and
postmeasures on municipalities with which USAID/Peru’s contractor was
not assigned to work.
13 Interviews, Ayacucho, June 27, 2007.
14 However, as discussed below, without random assignment, data on controls may also not help
with the inferential issues mentioned in the previous paragraph.
An Alternative Evaluation Design
It is possible, looking backward, to describe an ideal randomized
impact evaluation design for the decentralization project that could have
been implemented in 2002. Assume that the decision to implement the
decentralization project in the seven nonrandomly chosen regions was not
negotiable; inferences about the effect of the intervention would then be
made to the districts and provinces that comprise these regions.
The simplest design would involve randomization of treatment at the
district level. Districts in the treatment group would be invited to receive
the full bundle of interventions associated with the decentralization proj-
ect (e.g., training in participatory budgeting, assistance for civil society
groups); control districts would receive no interventions.
There are two disadvantages to randomizing at the district level,
however. One is that some of the relevant interventions in fact take place
at the provincial level.15 Another is that district mayors and other actors
may more easily become aware of treatments in neighboring districts.
For both of these reasons it would be useful to randomize instead at the
provincial level. Then all districts in a province that is randomly selected
for treatment would be invited to receive the bundle of interventions.
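
A province-level lottery of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure any mission used: the province and district identifiers, the seed value, and the function names are all hypothetical, and only the count of 61 provinces comes from the text.

```python
import random


def assign_provinces(province_ids, seed):
    """Randomly split provinces into treatment and control groups."""
    rng = random.Random(seed)  # fixed seed so the lottery can be reproduced and audited
    shuffled = list(province_ids)
    rng.shuffle(shuffled)
    n_treated = len(shuffled) // 2
    treated = set(shuffled[:n_treated])
    return {p: ("treatment" if p in treated else "control") for p in province_ids}


def district_assignments(district_to_province, province_assignment):
    """Every district inherits the status of the province that contains it."""
    return {d: province_assignment[p] for d, p in district_to_province.items()}


# Hypothetical identifiers for the 61 provinces in the seven regions.
provinces = [f"province_{i:02d}" for i in range(1, 62)]
assignment = assign_provinces(provinces, seed=2002)

# Illustrative district-to-province lookup (not real place names).
districts = {
    "district_a": "province_01",
    "district_b": "province_01",
    "district_c": "province_02",
}
by_district = district_assignments(districts, assignment)
```

Because assignment happens at the province level, districts in the same province always share a status, which is what limits the awareness of treatment in neighboring districts noted above.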
Several different kinds of outcome measures could be gathered. Sur-
vey evidence on citizens’ perceptions of local government responsiveness
would be useful, as would information on participation in local govern-
ment and evaluations of municipal governance capacity taken across all
municipalities in the seven regions (both treated and untreated).
A difference in average outcomes across groups at the end of the
project—for example, differences in the percentage of residents who say
government services are “good” or “very good,” or the percentage who
say the government responds “almost always” or “on the majority of
occasions” to what the people want—could then be reliably attributed to
the effect of the bundle of interventions, if the difference is bigger than
might reasonably arise by chance.
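
One way to judge whether a difference is bigger than might reasonably arise by chance is a permutation test, which reshuffles the treatment labels many times and counts how often a difference at least as large appears. The sketch below is a generic illustration, not the committee's prescribed analysis. The province-level shares are invented placeholders, and the function names are hypothetical.

```python
import random


def difference_in_means(treated, control):
    """Average outcome among treated units minus average among controls."""
    return sum(treated) / len(treated) - sum(control) / len(control)


def permutation_p_value(treated, control, n_permutations=10000, seed=0):
    """Share of random relabelings whose absolute difference is at least the observed one."""
    rng = random.Random(seed)
    observed = abs(difference_in_means(treated, control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(difference_in_means(pooled[:n_treated], pooled[n_treated:]))
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            extreme += 1
    return extreme / n_permutations


# Hypothetical province-level shares of residents rating services "good" or "very good".
treated_shares = [0.42, 0.38, 0.45, 0.40, 0.47, 0.39]
control_shares = [0.33, 0.36, 0.31, 0.37, 0.30, 0.35]

estimate = difference_in_means(treated_shares, control_shares)
p_value = permutation_p_value(treated_shares, control_shares)
```

The outcomes here are aggregated to the province, the unit assumed to have been randomized, so that the reshuffling mirrors the original lottery.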
One feature of this design that may be perceived as disadvantageous
is the fact that treated municipalities are subject to a bundle of inter-
ventions. Thus, if a difference is observed across treated and untreated
groups, it may not be known which particular intervention was respon-
sible (or most responsible) for the difference: Did training in participatory
budgeting matter most? Assistance to CSOs? Or some other aspect of
the bundle of interventions? This problem arises as well in some medi-
cal trials and other experiments involving complex treatments, where
15 Some interventions also occurred at the regional level, particularly toward the end of the
project, yet these interventions constitute a relatively minor part of the project.
it may not be clear exactly what aspect of treatment is responsible for
differences in average outcomes across treatment and control groups.
Despite this drawback, it seems preferable to design an evaluation plan
that would allow USAID to know with some confidence whether a project
it financed made any difference. Bundling the interventions may provide
the best chance to estimate a causal effect of treatment. Once this ques-
tion is answered, one might then want to ask what aspect of the bundle
of interventions made a difference, using further experimental designs.
However, another possibility discussed below is to implement a more
complex design in which different municipalities would be randomized
to different bundles of interventions.
USAID/Peru is preparing to roll out a second five-year phase of the
decentralization project, possibly again in the seven regions in which it
typically works. At this point, all municipalities in the seven regions have
already been treated (or at least targeted for treatment) in the first phase. This
may raise some special considerations for the second-phase design. The
committee’s understanding is that there are several possibilities for the
actual implementation of the second phase of the project; which option
is chosen will depend on the available budget and other factors. One is
that all 536 municipalities are again targeted for treatment. As in the first-
phase design, this would not allow the possibility of partitioning munici-
palities in the seven regions into a treatment group and controls.
In this case the best option for an experimental design may be to ran-
domly assign different treatments—bundles of interventions—to different
municipalities. While such an approach would not allow comparison of
treated and untreated cases, it would allow us to assess the relative effects
of different bundles of interventions. This may be quite useful, particu-
larly for assessing the question raised above about which aspect of a given
bundle of interventions has the most impact on outcomes. Do workshops
on participatory budgeting matter more than training CSOs? Randomly
assigning workshops to some municipalities and training to others would
allow us to find out.
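The lottery over bundles described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the committee's procedure: the municipality identifiers, the two bundle names, and the fixed seed are hypothetical.

```python
import random

def assign_bundles(units, bundles, seed=0):
    """Randomly assign each unit to one bundle, keeping group sizes as equal as possible."""
    rng = random.Random(seed)   # a fixed seed makes the lottery reproducible and auditable
    shuffled = list(units)
    rng.shuffle(shuffled)
    # Deal the shuffled units out to the bundles in turn, like dealing cards.
    return {unit: bundles[i % len(bundles)] for i, unit in enumerate(shuffled)}

# Hypothetical identifiers standing in for the 536 municipalities.
municipalities = [f"muni_{i:03d}" for i in range(536)]
assignment = assign_bundles(municipalities, ["budget_workshops", "cso_training"])
```

Because every municipality receives some bundle, a comparison of average outcomes across the two groups speaks only to the relative effect of the bundles, not to the effect of treatment versus no treatment.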
A second possibility for the second phase of the project is to reduce
the number of municipalities treated, for budgetary reasons. Suppose the
number of municipalities were reduced by half. The best option in this
case is probably to randomize the control municipalities out of treatment,
leaving half of the universe assigned to treatment and the other half as the
control. Those municipalities assigned to treatment would be offered the
full menu of interventions in the decentralization program.
Of course, randomizing some municipalities out of treatment is sure
to displease authorities in control municipalities as well as USAID offi-
cials who would want to choose municipalities where they believe they
have the greatest chances for success. Yet if the budget only allows for 268
OCR for page 151
IMPROVING DEMOCRACY ASSISTANCE
ods of working with selected local governments and civil society groups
at the sub-county levels in the identified districts” (USAID/Uganda
2007a:16). Although the specific interventions were as yet undefined, the
Request for Proposals suggested working with elected and appointed
leaders, traditional leaders, women, youth, constituents, and CSOs at
a subcounty level. Most likely, the program will consist of a bundle of
interventions rather than a single activity.
The fact that USAID plans to work with a sample of subcounties
(within 10 preselected districts) makes this activity an excellent candidate
for randomized evaluation. The number of subcounties within the 10
districts will almost certainly be enough to provide for a large N random-
ized evaluation. Therefore, in planning interventions at the subcounty
level, provision would be made for the random selection of treatment
and control subcounties. One approach would be to randomly select half
the subcounties within the 10 districts to be in the treatment group and
receive the full bundle of interventions. The remainder of the subcounties
would receive no interventions and thus serve as a control group. Alter-
natively, subcounties could be stratified along district boundaries or other
criteria, and random selection could take place within strata to facilitate
equivalence on important dimensions.
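A minimal sketch of random selection within strata follows. It assumes, purely for illustration, 10 districts with 12 subcounties each and an even split within each district; the identifiers and counts are hypothetical, since the actual number of subcounties is not given here.

```python
import random
from collections import defaultdict

def stratified_assignment(unit_to_stratum, seed=0):
    """Within each stratum, randomly assign half the units to treatment and the rest to control."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for unit, stratum in sorted(unit_to_stratum.items()):
        by_stratum[stratum].append(unit)
    assignment = {}
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        rng.shuffle(members)
        n_treat = len(members) // 2   # with an odd count, the extra unit goes to control
        for i, unit in enumerate(members):
            assignment[unit] = "treatment" if i < n_treat else "control"
    return assignment

# Hypothetical data: 10 districts with 12 subcounties each.
subcounties = {f"d{d}_s{s}": f"district_{d}" for d in range(10) for s in range(12)}
groups = stratified_assignment(subcounties)
```

Running the lottery separately inside each district guarantees that every district contributes equally to both groups, which is the sense in which stratification facilitates equivalence.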
It is difficult to determine the most appropriate measurement tools
without a better understanding of the exact interventions and the goals
of the program. Regardless of the measurement approach, equivalent data
would need to be collected in the subcounties in the control group as well
as those in the treatment group. Ideally, baseline data would be collected
before implementation of the program and then again during and after.
USAID could also investigate the possibility of contributing to ongoing
data collection efforts by the government or other agencies (such as the
yearly school census, the service delivery survey, the Afrobarometer pub-
lic opinion survey, and public expenditure tracking surveys) in order to
provide the necessary funds for oversampling in the 10 selected districts.
In most cases, oversampling will be necessary to obtain data that are rep-
resentative at the subcounty level.
Interparty Debates
In an effort to support multiparty democracy, USAID envisions inter-
ventions to “foster discussion and dialogue among the political parties so
that difficult decisions can be achieved through compromise and nego-
tiation before they result in conflict and stalemate” (USAID/Uganda
2007b:18). Building on successful interparty dialogues during the cam-
paign before the 2006 presidential elections, USAID is considering spon-
soring local-level political debates at the district level and below to engage
citizens in multiparty politics more effectively. In thinking about how to
evaluate such activities, it is natural to ask: How does exposure to inter-
party debates impact citizen knowledge and attitudes about politics, voter
turnout, voting outcomes, and political conflict at the local level?
Randomized evaluation offers a powerful tool for assessing the impact
of interparty dialogues. Five voting precincts could be randomly selected
to be in the treatment group for each of 14 different parliamentary con-
stituencies. Remaining precincts in the 14 constituencies would make up
the control group. In each of the 70 treatment precincts, interparty debates
would be held between candidates for Parliament in advance of the next
election. Specifically, a given group of candidates vying for a single par-
liamentary seat would participate in interparty debates in five different
precincts within their own constituency. This would take place across 14
different groups of candidates in 14 different constituencies.
Many outcomes of interest are already collected by the electoral com-
mission—voter registration, voter turnout, and the percent vote for each
candidate. If interparty candidate debates help mobilize voters, there
should be higher registration and turnout rates in treatment precincts. One
might imagine also that debates inform citizens about lesser-known candi-
dates and thus increase the vote for nonincumbent candidates or parties.
Therefore, if debates create a more informed citizenry, there should be a
smaller share of the vote for incumbents in treatment precincts. If, instead,
debates remind voters of the greater experience and access to largess pos-
sessed by the incumbent, the opposite effect would be evident. To gain
greater power, a difference-in-differences estimation strategy21 could be
used to evaluate changes from the last election in turnout and vote out-
comes (assuming that the boundaries of the voting precincts are relatively
stable since the last election and polling-station-level data are available
for the last election). An analysis of the distance of control precincts from
treatment precincts can also be performed to account for the fact that
citizens in neighboring constituencies in the control group may attend or
learn about debates in the treatment precincts.
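The comparison of changes since the last election can be illustrated with a small calculation. The turnout figures below are invented for illustration, and the function name and record layout are assumptions rather than anything specified in the evaluation design.

```python
from statistics import mean

def diff_in_diff(records):
    """records: iterable of (group, before, after), with group "treatment" or "control".

    Returns the average change in the treatment group minus the
    average change in the control group.
    """
    change = {"treatment": [], "control": []}
    for group, before, after in records:
        change[group].append(after - before)
    return mean(change["treatment"]) - mean(change["control"])

# Made-up precinct turnout rates (last election, this election).
precincts = [
    ("treatment", 0.50, 0.58),
    ("treatment", 0.60, 0.66),
    ("control",   0.55, 0.57),
    ("control",   0.45, 0.49),
]
estimate = diff_in_diff(precincts)
```

In this toy data, turnout rises by about 7 points in treatment precincts and about 3 points in control precincts, so the estimate is roughly 4 percentage points. Differencing removes any fixed precinct-level differences in turnout that predate the debates.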
To assess the impact of interparty debates on local conflict, one could
also compare measures of election-day violence and intimidation gath-
ered by DEMGroup, party observers, or outside monitors.
If resources were available to conduct surveys in treatment and con-
trol precincts, the evaluation would provide an even richer perspective on
citizen knowledge, attitudes, political tolerance, and behaviors, enabling
a better understanding of the causal pathways linking debates with reg-
istration, turnout, and vote choice. Ideally, pre- and posttreatment panel
surveys would be carried out in treatment and control sites. Of course,
care must be taken to ensure that the population surveyed in the treatment
sites is comparable to those surveyed in the control sites. For example, it
would be misleading to survey only those individuals who attended the
debates in the treatment sites but to survey a random sample of individuals
in the control group (including those who would have attended if the
debate were held in their area and those who would not have). A random
sample of all adult citizens in both treatment and control groups would
be more informative.
21 See Chapter 5 for a description of this evaluation design.
While the field team in Peru described how a past project might have
been designed in a way that permitted rigorous evaluation, the Uganda
team focused on a multifaceted set of projects that were just getting
started. Working with mission staff, the committee’s experts identified
a series of planned interventions, each of which could be assessed using
tools of randomized evaluation. Although these evaluation models do not
cover every planned intervention currently under consideration by the
Uganda mission, if implemented, they would provide substantial new
evidence about the efficacy of USAID DG programming in Uganda.
CHALLENGES IN APPLYING RANDOMIZED EVALUATION TO DG PROGRAMS
The evaluation designs described above are the basis for the unani-
mous conclusion of the field teams that randomized evaluations, apart
from being valuable where they can be successfully applied, are also fea-
sible designs for measuring the impact of (at least some) ongoing USAID
DG projects. Yet demonstrating that randomized evaluations can be designed
without significant modifications of “normal” DG projects does not mean
that adopting them will be free of trade-offs. Indeed, USAID staff and
implementers in all three countries
visited raised objections and concerns about some of the problems that
randomized evaluations might pose. While several of these problems do,
in fact, constitute real obstacles to program implementation or evaluation,
the field teams concluded that alternatives exist in many cases that could
help partially or wholly address the concerns that were raised. This sec-
tion discusses these problems and how randomized evaluations could
be designed to minimize them.22 Two important issues are deferred and
discussed separately: what to do with projects that treat too few units to
be suitable for randomized evaluation (taken up in the next chapter) and
problems arising from the incentives (or disincentives) that DG staff and
implementers have to conduct impact evaluations and their current
capabilities to do so (taken up in Chapter 9).
22 See Savedoff et al. (2006) for another discussion of objections to rigorous evaluations and
ways they can be overcome.
Randomly selecting units for treatment is simply not workable. Adopt-
ing the principle of random assignment runs the risk that certain units
that project designers would very much like to include in the treatment
group will wind up being excluded from the program. For some USAID
staff and implementers with whom the committee spoke, this was a major
reason to resist adoption of randomized evaluations. It was pointed out,
for example, that in many situations USAID and its implementers can
only work with local authorities that accept their help. Moreover, it was
suggested that units (municipalities, ministries, groups) that lacked the
“political will” to work with USAID to fully implement the programs in
question would not be likely to achieve successful outcomes and thus do
not merit an investment of resources. It was also suggested that units with
exemplary past performance sometimes appeared to be such sure bets
for program success that excluding them from participation in the new
project appeared wasteful.
These are reasonable objections; however, accepting their merit need
not imply jettisoning a randomized design. One option that satisfies the
need for randomized selection of treatment units while also recognizing
that rolling out a program in some units may not be feasible would be to
select the set of units that are eligible for treatment on the basis of political
will and other criteria that USAID believes maximize the chances for suc-
cess and then to assign units randomly to treatment and control groups
within this group of eligible units. This approach is also useful for situa-
tions where USAID seeks to limit programs to needy or conflict-affected
areas, as long as there are more units than USAID can possibly treat.
Another option, suitable for situations where, for political or other
reasons, allocating treatment to one or several units may be nonnego-
tiable (i.e., the consensus among project designers is that a particular unit
or units simply must be included in the treatment group), is to go ahead
with random selection of units for treatment but leave aside a certain
percentage of the project budget (e.g., 10 to 15 percent) to pay for the
implementation of program activities in units that were not selected but
that organizers feel must be included. In such a case the evaluation would
be based on a comparison of the regular treated group (not including the
added units) with the control group. Of course, one can always look as
well at outcomes in the non-randomly selected—the “must have”—units.
Yet comparing outcomes in such units to nontreated units would be less
informative about the causal impact of the USAID intervention than
comparing outcomes across the units that were randomly assigned to the
treatment group and the control group.
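The second option can be sketched as follows: run the lottery over all eligible units, then flag any "must have" unit that lost the lottery as a nonrandom addition and keep it out of the impact comparison. The unit names, the number treated, and the status labels are hypothetical; this is an illustration under those assumptions rather than a prescribed procedure.

```python
import random

def lottery_with_set_aside(units, must_have, n_treat, seed=0):
    """Draw n_treat units by lottery; must-have units not drawn are added outside the lottery."""
    rng = random.Random(seed)
    drawn = set(rng.sample(sorted(units), n_treat))
    status = {}
    for unit in units:
        if unit in drawn:
            status[unit] = "treatment"          # randomly selected for treatment
        elif unit in must_have:
            status[unit] = "added_nonrandom"    # treated from the set-aside budget
        else:
            status[unit] = "control"
    return status

# Hypothetical pool of 40 eligible units, 18 treatment slots, 2 must-have units.
eligible = [f"unit_{i:02d}" for i in range(40)]
status = lottery_with_set_aside(eligible, must_have={"unit_03", "unit_17"}, n_treat=18)

# The impact comparison uses only randomly assigned units.
comparison = {u: s for u, s in status.items() if s != "added_nonrandom"}
```

One design choice this sketch leaves open is how to handle must-have units that happen to win the lottery; here they simply remain in the treatment group.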
It is unethical or impossible to preserve a control group. Is it ethical to
deny treatment to control groups? This issue arises frequently in public
health programs but may also be relevant in projects where, as with inter-
ventions in the area of DG, the assistance is welfare improving even if not,
strictly speaking, life saving. As with public health studies, the standard
defense applies: Without an experiment, how do we know whether or
not the intervention helps? USAID intervenes to assist DG all over the
world. As in the public health field, it behooves us to know with as much
confidence as possible what works and what does not. Continuing to
channel scarce resources to projects that, once properly evaluated, turn
out to have no positive impact is wasteful, particularly when properly
executed randomized evaluations could put USAID in a position to iden-
tify projects that do work and whose reach and impact could usefully be
expanded with a shift in resources from those that have been found to be
underperforming.
A second defense of randomized assignments against the criticism
that some units will go untreated is that, in any project being implemented
across a large number of potential units, there will virtually always be
untreated units. In the context of a decentralization project involving doz-
ens of municipalities, it is simply not feasible for USAID to work with all
of them; in the context of a project designed to support CSO development,
it is simply not possible for USAID to work with every group. Given the
impossibility of treating every unit, the only question is how untreated
units will be chosen. In many contexts it may be fairest, and most ethically
defensible, to choose untreated units by lottery, as would be the case in a
randomized evaluation.
Finally, even if every unit is to be treated, it may be reasonable to
delay treatment for a portion of the units by a randomized rollout. In this
case, while some units (chosen by lottery) will get assistance first, others
will have a delay before they receive assistance. Yet for the group that
faces delay, this may be more than compensated by the possibility that
the delayed group will either be spared an ineffective treatment or will
receive improved assistance, since the initial phase of the rollout provides
the basis for learning from a randomized impact study of the treatment’s
effects.
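A randomized rollout of this kind can be sketched as a lottery over start dates. The unit names, the number of phases, and the helper functions below are assumptions made for illustration.

```python
import random

def randomized_rollout(units, n_phases, seed=0):
    """Return {unit: phase}, with phases 1..n_phases assigned by lottery in near-equal waves."""
    rng = random.Random(seed)
    order = list(units)
    rng.shuffle(order)
    return {unit: (i % n_phases) + 1 for i, unit in enumerate(order)}

def comparison_groups(schedule, current_phase):
    """Units already treated versus units still waiting at a given phase."""
    treated = sorted(u for u, p in schedule.items() if p <= current_phase)
    waiting = sorted(u for u, p in schedule.items() if p > current_phase)
    return treated, waiting

# Hypothetical example: 30 units phased in over 3 waves.
schedule = randomized_rollout([f"unit_{i}" for i in range(30)], n_phases=3)
treated, waiting = comparison_groups(schedule, current_phase=1)
```

During each phase, the units still waiting serve as the comparison group for those already treated; once the final phase begins, no untreated comparison group remains.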
Isolating control from treatment groups is not feasible in practice. A
third objection involves the great difficulty in preventing the effects of
treated units from “spilling over” and affecting control units. For exam-
ple, a project that provides support for CSOs to advocate improved ser-
vice delivery may impact not only the area in which the CSOs are based
but also neighboring areas (either because local governments fear similar
mobilization and act to forestall it or because CSOs in neighboring areas
become emboldened by the example of what their colleagues are doing
next door and step up their own advocacy). Another example of spillover
is when grassroots party activities in one locale yield benefits in other
places, either because party contacts extend across administrative bound-
aries or because changing attitudes are transmitted across familial and
social networks. Whenever there are spillover effects (and there often are),
the difference between the control and treatment groups is attenuated,
and this will bias the evaluation toward a finding of no effect.
Sometimes, design modifications can help minimize the likelihood of
spillover. For example, in the context of the Peruvian decentralization proj-
ect discussed earlier, randomizing at the provincial level might decrease
the probability that district mayors are aware of treatments administered
to other units. In this case all municipalities in a province would be in
either the treatment group or the control group, thereby minimizing the
likelihood of spillover from municipality to municipality (except insofar
as they happen to be located adjacent to a provincial boundary).
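Randomizing at the provincial level amounts to running the lottery over provinces and letting every municipality inherit its province's status. The sketch below assumes a hypothetical mapping of municipalities to provinces and an even split of provinces.

```python
import random

def cluster_assignment(unit_to_cluster, seed=0):
    """Assign whole clusters to treatment or control; units inherit their cluster's status."""
    rng = random.Random(seed)
    clusters = sorted(set(unit_to_cluster.values()))
    rng.shuffle(clusters)
    treated_clusters = set(clusters[: len(clusters) // 2])
    return {
        unit: ("treatment" if cluster in treated_clusters else "control")
        for unit, cluster in unit_to_cluster.items()
    }

# Hypothetical data: 8 provinces with 5 municipalities each.
unit_to_province = {f"p{p}_m{m}": f"province_{p}" for p in range(8) for m in range(5)}
status = cluster_assignment(unit_to_province)
```

The trade-off is that the effective number of randomized units is the number of provinces rather than the number of municipalities, so fewer independent comparisons are available.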
But while problematic for inference, spillover effects may be impor-
tant to measure in their own right. In their study of deworming pro-
grams in Western Kenya, for example, Miguel and Kremer (2004) found
that deworming interventions are not cost-effective unless the positive
externalities of the program that spill over into neighboring untreated
communities are accounted for. Taking advantage of the fact that the
treatment is randomly assigned across space, they estimate the size of
these spillover effects and then use the estimates to calculate the true
effects of the deworming program, which they find to be positive once
the spillover effects are accounted for. Their study underscores that not
just minimizing but also measuring contamination must be a core aspect
of any well-designed randomized evaluation.
A related problem is the possibility that donors from other countries
might concentrate their programs in areas in which USAID is not under-
taking program activities, thereby, as one program officer put it, “flooding
the controls.” This may happen intentionally, when donors coordinate
and divide up areas of focus to avoid duplication of efforts. Or projects
not intended to directly influence democracy, such as programs to create
entrepreneurs or regional cooperative associations, may in fact help the
spread of democracy in the area being observed. If this occurs, the other
donors’ interventions become a confounding factor associated with treat-
ment, and this will almost certainly bias inferences about the effect of
USAID interventions.23
23 However, it might be pointed out that, if anything, this is likely to dilute the (it is hoped
positive) effect of treatment. If other donors flood the controls and there is still a difference
between groups, a causal effect of USAID’s intervention can be inferred. (At least, the effect
of USAID relative to other donors can be evaluated.)
One possible response to this issue is not to advertise the existence of
control units. For example, in the context of a decentralization project it
may be known that USAID is working in seven regions, but it need not be
made publicly known which particular municipalities it is working with
in each region. A second solution is to commit in advance to implement
the project in all units (and to make this publicly known) but to roll it out
gradually, using untreated units as a comparison group for treated units
in the years before they are added to the intervention (as in the second
design for the Peru decentralization program described earlier). Another
option is to randomize different treatments across all municipalities. In
other words, USAID would work with all municipalities in the seven
regions (thereby leaving no municipalities to be flooded) but randomly
assign different treatments to different municipalities (again, as discussed
earlier for Peru). One final possibility is to engage other donors in con-
ceptualizing the evaluation exercise. If multiple donors are implementing
similar interventions, all would benefit from an impact evaluation of their
projects. In such circumstances it may be possible to coordinate USAID’s
activities with theirs to preserve a control group.
It is hard to plan an evaluation (or stick to one) because mission objec-
tives and programs change all the time. A common concern the field
teams heard was that randomized evaluations are insufficiently flexible to
be practical. As a political officer at the U.S. Embassy in Peru commented,
the embassy is sometimes compelled to “put out fires.” For example, in an
experimental evaluation of the impact of municipal-level interventions in
mining towns, the embassy might have to intervene if a conflict broke out
in a community. This may or may not pose an issue for causal inference.
Some “fires” may be independent of treatment assignment—that is, they
may be equally likely to occur in treated units as in control units. However,
other “fires” may be products of the treatment. They may reflect,
for example, the absence of a desired treatment among controls, which
necessarily feel left out. This raises more serious issues. Unanticipated
events that require additional interventions in either treatment or control
communities must be recorded so that they can be taken into account in
the final evaluation. Such events may make interpretation of the results
more complicated, but the possibility that they might arise is not an argument
to forgo randomized evaluations per se.
In addition, missions may wish to adjust programming midstream,
either by learning lessons from an early assessment of outcomes or by
responding to new developments on the ground. Sometimes this is quite
consistent with the purposes of a good evaluation. For example, if there
is powerful evidence part way through that a project is working, USAID
may wish to extend its reach into communities that were previously in the
control group (medical trials are often abandoned early if there is robust
evidence of the benefits, or dangerous consequences, of a treatment). The
phrase “if there is powerful evidence” is crucial here. Since the whole pur-
pose of the randomized evaluation is to generate evidence for a project’s
success or failure, there is no trade-off whatsoever in abandoning it or in
tweaking it midstream, if “powerful evidence” for the project’s efficacy
has already emerged. A real trade-off presents itself only if the evidence
for the project’s success or failure is still tentative. In such a situation
a judgment call would have to be made about the relative importance
of confirming what the initial evidence seems to suggest (which would
require not altering the design of the randomized evaluation) or mov-
ing ahead with the change in course (which might have the benefit of
maximizing impact but risks acting on a hunch that may have been ill
founded).
The more difficult issue is when, as frequently happens, unforeseen
challenges arise in project implementation that USAID thinks require
slight adjustments in the interventions or sometimes the replacement
of implementers. Changing the treatment part of the way through the
process is, of course, not ideal. As long as the adjustments are consistent
across the treatment group, however, there is no threat to causal inference
(although it should be kept in mind that the ultimate evaluation measures
a more complicated treatment). Whatever the source of the midstream
correction, responsible officials will need to remember that the benefits of
continuing with the rigorous evaluation design accrue agency-wide and
are not limited to the particular mission or project. So the advantages of
a midcourse correction for a project or mission will need to be balanced
against the potential loss of valuable evaluation information that could be
usefully applied to programs in other countries.
Randomized evaluations are too complex; USAID does not have the
expertise to design and oversee them. Staff both in the field and in Wash-
ington consistently raised the objection that USAID is not well equipped
to design and implement, or even simply oversee, randomized evalua-
tions. This is a valid concern. While the idea of randomized evaluation is
intuitive and easy to understand, the design of high-quality randomized
evaluations requires additional academic training, specialized expertise,
and good instincts for research design. It is likely that many (or most)
USAID DG staff do not have training in research methods and causal infer-
ence, thus making it difficult for them to evaluate the quality of proposed
impact evaluations or to play a role in their design and implementation.
The committee wants to emphasize that the guidance provided in
this report should not be seen as a “cookbook” of ready-made evaluation
designs for DG officers. It would be a mistake for USAID to endorse the
typology of evaluation designs outlined in Chapter 5 and then require DG
officers to put these new designs into practice without additional training
or support. Because the issue of competence and capacity is so central to
the prospect of improving evaluation in USAID DG programs, Chapter 9
is dedicated to providing recommendations about how USAID DG could
make the necessary investments and provide appropriate incentives to
encourage using impact evaluations of its projects where appropriate
and feasible.
It will cost too much to conduct randomized evaluations. Perhaps the
most important objection the committee encountered in the field is that
randomized evaluation will cost too much. In part, this is a question
of USAID’s priorities. If the agency is committed to knowing whether
important projects achieve an impact, it will need to commit the neces-
sary resources to the task. But aside from whether the agency commits to
higher quality evaluations, it is legitimate to ask how much more random-
ized evaluation will cost than the procedures currently employed.
The committee’s field teams were tasked with some detective work in
an effort to answer this question. As discussed in Chapter 8, the committee
discovered that USAID could not provide concrete information about how
much it spends on monitoring and evaluation (M&E) every year, even
for a subsample of DG programs. The committee therefore encouraged
the field teams to explore the cost of current approaches by reviewing
project documents and through discussions with mission staff. They, too,
encountered insurmountable obstacles; project documents almost never
provided line items for M&E and what was reported was not consistent
from one project to another. Based on interviews with implementers, the
field teams reported that nontrivial amounts of time were dedicated to
the collection of output and outcome indicators and the monitoring of
performance, but no team could arrive at any hard numbers related to
current expenditures. The committee thus cannot answer the question of
how much more it will cost to introduce baseline measures, data collec-
tion for comparison groups, or random assignment, relative to current
expenditures on M&E. At best, it can be said that in a number of the cases
the field teams examined, substantial improvements in all of these areas
apparently could be obtained for little or no additional cost, but that in
other cases the costs could be substantial. Much depends on whether data
are being collected from third parties or local governments versus being
generated by surveys or other primary data collection by implementers,
on whether surveys are already being used for the projects or would need
to be developed specifically for the project in question, and on the specific
outcomes that have to be measured in the treatment and control groups.
As noted, in some cases—such as reducing the initial number of units
treated in order to preserve a control group—an impact evaluation could
actually save money compared to providing all groups with assistance
immediately, before the effects of the project have been tested.
But how much will a randomized evaluation cost? Answering this
question requires two different calculations. The first is the straightfor-
ward calculation of how much more it will cost to collect the necessary
data. This will depend on the number of control and treatment units
required for a useful random assignment; the more subtle the expected
effects, the larger the number of units that will be required, with a corre-
sponding increase in the cost of data collection. The factor to keep in mind
is that, even if data collection is more costly in a randomized evaluation
design, the potential benefit is that it would put USAID in a position to
assess the impact of the project with much more confidence and to detect
subtle improvements that might not be visible without a randomized
design.
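The relationship between the size of the expected effect and the number of units required can be made concrete with a standard textbook approximation for comparing two group means. The sketch below is an assumption-laden illustration, not a costing tool: it assumes equal-sized arms, individually randomized units, a two-sided test, a normal approximation, and an effect expressed in standard-deviation units; the default significance level and power are conventional choices, not figures from this chapter.

```python
import math
from statistics import NormalDist

def units_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate units needed in each arm to detect a standardized effect.

    Uses n = 2 * (z_{1-alpha/2} + z_{power})^2 / effect_size^2,
    the usual normal-approximation formula for a two-arm comparison of means.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)

moderate = units_per_arm(0.5)   # a fairly visible effect
subtle = units_per_arm(0.2)     # a subtler effect needs many more units
```

Because the effect size enters as a square in the denominator, halving the expected effect roughly quadruples the number of units, and with it the cost of data collection.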
The second, much trickier, calculation lies in assessing (1) the cost of
selecting units at random, which may entail not implementing project
activities in units where USAID might have reason to believe that the
project will have a large positive impact and/or (2) going ahead with
the implementation of project activities in units where USAID has reason
to believe that the project will fail. Here the cost is less a direct expense
than an opportunity cost. Again, these costs must be weighed against
the potential benefit of being able to conclude whether or not the project
worked. Note, however, that the latter type of cost (of directing program
funds either to places where staff are convinced the project will not work
or away from places where staff are convinced that it will) will be greater
the more confident staff members are that they can accurately predict
where a project will be successful and where it will not. If it is already known whether (or
where) a project will work, then randomized evaluations are not needed
to answer this question. The real peril lies in believing wrongly that the
consequences of a program are, in fact, known and allocating resources
on that basis when the hypotheses behind a program have not been tested
by impact evaluations.
CONCLUSIONS
The committee’s consultants believed they had demonstrated that at
least some of the types of projects USAID is now undertaking could be
subject to the most powerful impact evaluation designs—large N random-
ized evaluations—within the normal parameters of the project design.
For a majority of committee members, this provided a “proof of concept”
that the designs would also be feasible in the sense that they would
work in practice as well as in theory. However, one committee member
with experience in actually managing DG programs remained skeptical
as to whether the complexity and dynamic nature of DG programming
would allow random assignment evaluation designs to be implemented
successfully. The committee also notes that doing random assignment
evaluations in the highly politicized field of democracy assistance will
likely be controversial. It is, therefore, recommended in Chapter 9, as part
of a broader effort to improve evaluations and learning regarding DG
programs at USAID, that USAID begin with a limited but high-visibility
initiative to provide a test of the feasibility and value of applying impact
evaluation methods to a select number of its DG projects.
REFERENCES
Carrión, J.F., Zárate, P., and Seligson, M.A. 2007. The Political Culture of Democracy in
Peru: 2006. Latin American Public Opinion Project (LAPOP), Vanderbilt University and
Instituto de Estudios Peruanos, Lima, Peru. Available at http://sitemason.vanderbilt.edu/
files/gcfLNu/Peru_English_DIMS%000with0corrections0,pdf. Accessed on April
26, 2008.
Dehn, J., and Svensson, J. 2003. Survey Tools for Assessing Performance in Service Delivery.
Working Paper. Development Research Group, The World Bank, Washington, DC.
GRADE. 2003. Grupo de Análisis para el Desarrollo, Linea de Base Rapida: Gobiernos Sub-
nacionales e Indicadores de Desarrollo. Lima, Peru: GRADE.
Gugerty, M.K., and Kremer, M. 2006. Outside Funding and the Dynamics of Participation in
Community Associations. Background Papers. Washington, DC: World Bank. Available
at http://siteresources.worldbank.org/INTPA/Resources/Training-Materials/OutsideFunding.pdf.
Accessed on April 26, 2008.
Miguel, E., and Kremer, M. 2004. Worms: Identifying Impacts on Education and Health in
the Presence of Treatment Externalities. Econometrica 72(1):159-217.
Olken, B.A. 2007. Monitoring Corruption: Evidence from a Field Experiment in Indonesia.
Journal of Political Economy 115:200-249.
PRODES PMP. 2006. Pro Decentralization Performance Monitoring Plan, 2003-2006. Lima,
Peru: ARD, Inc.
PRODES PMP. 2007. Pro Decentralization Performance Monitoring Plan, Fifth Year Option,
February 2007-February 2008. Lima, Peru: ARD, Inc.
Savedoff, W.D., Levine, R., and Birdsall, N. 2006. When Will We Ever Learn? Improving Lives
Through Impact Evaluation. Washington, DC: Center for Global Development.
Seligson, M. 2006. The AmericasBarometer, 2006: Background to the Study. Available at:
http://sitemason.vanderbilt.edu/lapop/americasbarometer00eng. Accessed on February 23,
2008.
USAID/Peru. 2002. Request for Proposals (RFP) No. 527-P-02-019, Strengthening of The
Decentralization Process and Selected Sub-National Governments in Peru (“the Pro-
Decentralization Program”). Lima, Peru: USAID/Peru.
USAID/Uganda. 2007a. Request for Proposals (RFP): Strengthening Democratic Linkages
in Uganda. Kampala, Uganda: USAID/Uganda.
USAID/Uganda. 2007b. Request for Proposals (RFP): Strengthening Multi-Party Democracy.
Kampala, Uganda: USAID/Uganda.