Synthesis v. Purity and Large-N Studies: How Might We Assess the Gap between Promise and Performance?

 
Will Hathaway Moore Florida State University

Moore, Will H. (2006) "Synthesis v. Purity and Large-N Studies: How Might We Assess the Gap between Promise and Performance?," Human Rights & Human Welfare: Vol. 6: Iss. 1, Article 20.

In a recent essay, Nicholas Kristof (2006) bemoaned the fact that the type of U.S. inaction that
Samantha Power (2001) documented with respect to Rwanda in the mid-1990s is again evident in
Darfur. Ne plus jamais? No: we have déjà vu. The standard explanation that one finds among
commentators and pundits is that “a lack of political will” prevents countries from acting on their
international legal and moral obligations to respond to crimes against humanity with action to stop
the killing, torture, etc. Yet as the reader knows only too well, Darfur is but the most acute case of
violations of the international human rights regime. Despite the relative success of the regime, there
remains a considerable gap between the ideals of the international human rights regime and the
practice of states and their agents.


A handful of scholars have recently turned to large-N statistics to evaluate this gap, and the most
substantial of these efforts is Todd Landman’s book Protecting Human Rights.1 The realist schools of
international relations provide a well-known argument to explain why treaties are “meaningless
scraps of paper,” and the Keith (1999), Hathaway (2002), and Hafner-Burton and Tsutsui (2005)
studies support this claim as they find that, once one controls for variables such as democracy and
national income, human rights treaties do not influence the behavior of the countries that sign them.
The constructivist school, on the other hand, offers an alternative view advancing the claim that as
norms become diffuse and entrenched, state behavior will change as well.2 Cardenas suggests that
the constructivist school suffers from a tendency to focus attention on cases of compliance and
thereby has “tended to overlook an important puzzle: why human rights violations sometimes
persist despite ongoing pressures for compliance” (2004: 227). She points to the body of large-N
studies of personal integrity rights violations and state coercion/repression (e.g., Poe and Tate 1994;
and Davenport 1995, 1999) as a source of concepts and findings that could be profitably exploited
to correct this misplaced focus, but does not comment on the recent studies that are pursuing her
suggestion.


Four published studies and a review article calling for such work suggest that something is afoot
and that we ought to pay attention. Yet there is more here than a need for sound empirical research.
As is always the case, theoretical choices are consequential. Cardenas and Landman recognize this
and advocate a synthesis of theoretical approaches. I wish to dissent and offer an alternative.

1 See also Keith (1999); Hathaway (2002); and Hafner-Burton and Tsutsui (2005).
2 See especially Risse, Ropp and Sikkink (1999).

Published by Digital Commons @ DU, 2006

Both Cardenas and Landman observe a synthetic trend in human rights literature that abandons
realist versus constructivist (and other) dichotomies in favor of drawing from multiple schools of
thought to produce new theories. And both scholars contend that this emergent theoretical
perspective makes a compelling case for why the international human rights regime influences the
behavior of states, though far less than the states, persons and activists who champion it would like.
While neither author appears to be aware of the other’s work, Landman partially puts into practice that for which
Cardenas calls.
I submit that while well-intentioned calls for theoretical synthesis and cross-fertilization are
commonplace, the resulting efforts are generally so vague that they are little more than pablum.
By contrast, opposing claims that advocate theoretical purity (e.g., Eckstein 1980) are relatively rare.
This essay uses Landman’s book as a lens through which to explore the trade-offs between
theoretical purity and synthesis. It also examines the strengths and weaknesses of large-N statistical
analyses of human rights violations. It then concludes with a brief consideration of the intersection
of these two issues.


Synthesis or Purity?


Cardenas distinguishes rationalist from constructivist schools of international relations, noting
two types of rationalist approaches: power-centric and self-interest centered. She observes that each
approach identifies causal processes at the international (systemic/dyadic), domestic (monadic), and
domestic-international levels of analysis.3 Landman sketches a more complex typology of the
literature,4 and suggests that a double convergence “centered on the idea of constrained agency” offers
an opportunity to develop an empirical theory that can account for cross-national and cross-
temporal variation in the human rights behavior of states (13). The agent-structure debate occupies
center-stage in Landman’s argument, and the “double convergence” that he identifies is the
common effort in international relations and comparative politics to resolve this debate by recognizing that the behavior of actors is neither dictated by, nor independent of, structure. Because
it involves structure (anarchy and the norms of the human rights regime) and agency at both the
domestic and international levels of analysis (and their intersection), the human rights behavior of
states, argues Landman, is at the center of such debates. Cardenas would presumably concur.
I am less sanguine about the prospects of such a synthetic effort, largely because I believe that a
solution to the agent-structure problem is the “great white whale” of social science. Let us consider
two views. The first suggests that theories that take structures or the preferences of agents as given
are necessarily deficient since none of us really believes that either is fixed and immutable (i.e., we all
believe that in reality preferences and structures are both malleable and mutually constitutive). The
second suggests that models of human behavior must make simplifying assumptions and that to
make theories tractable scholars must fix either structures or preferences because constructing
“mutually constitutive” theories is simply too difficult. The former position is advanced by those
who prefer models that “closely reflect reality” while the latter is advanced by those who prefer
abstract models where the connection between assumptions and implications is prized over the
theory’s “correspondence to reality.” Debate between these two approaches is well worn and this
essay will not contribute to it. Instead, I wish to briefly consider a less well worn discussion
concerning synthesis versus purity.
In a discussion of what he labeled “inherent v. contingent theories of civil strife,” Eckstein
(1980) makes a case for eschewing synthesis. He implicitly embraces a Lakatosian perspective on the
progression of knowledge that prizes the competition between theories. That neither of two given
theories “corresponds with reality” is beside the point, according to this view. The temptation to
increase that correspondence via synthesis is a siren song: the proper way to evaluate the usefulness of a
theory is to contrast it with another theory with respect to 1) the scope of its explanatory power
(e.g., how many hypotheses does it produce?) and 2) its ability to withstand falsification. By this
account accumulation of knowledge occurs in a field when two (or more) rival theories press one
another to explore their logic in an effort to produce more implications. Efforts at synthesis are
likely to undermine that effort because they tend to obfuscate the logic (or causal mechanisms) of
the rival theories.

As an example, one might consider Taylor’s (1989) effort to establish the hegemony of agent-
centered theory relative to structure-centered theory in comparative politics. In international
relations, Wendt (1987, 1992, 1999) makes the case for a mutually-constitutive synthesis of the
agent-structure problem as superior to structural realism. Is the path advocated by Eckstein and
illustrated by Taylor more effective, or should we pursue a synthetic path à la Wendt? Cardenas and
Landman advocate the latter, and while Cardenas’s article is a call for such an effort, Landman claims to
have forged one. Has he?
One of the strengths of Landman’s effort is the feedback process that he specifies between
human rights law and human rights protection. Other studies of the gap between the obligations of
treaties and the behavior of states study the impact of the former on the latter while excluding the
possibility that behavior also influences international law (Keith 1999; Hathaway 2002; and Hafner-
Burton and Tsutsui 2005). The “double convergence” of constrained agency in comparative politics
and international relations motivates Landman to specify such a feedback loop. Further, the
empirical analyses support the specification: Landman finds that the processes of democratization,
economic development, and interdependence have an impact on both the extent to which states
endorse the human rights regime and observe those rights, and that signing international treaties has
a weak effect on observation of the obligations contained within those documents (147). The latter
finding is at odds with previous studies, but the former findings are quite consistent with the large-N
literature to which Cardenas pointed as a foundation on which to build a synthetic model of human
rights behavior. Empirically Landman’s findings appear to be on stronger ground than those of
Keith (1999), Hathaway (2002), or Hafner-Burton and Tsutsui (2005): those studies impose a zero
restriction on any feedback, and if such a feedback process exists (and Landman’s results suggest that
it does), it is well known that a model with such a restriction will produce biased estimates.
Landman’s results place the burden squarely on others to show that their results still hold when the
feedback is included (or that the feedback relationship is spurious).
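
The bias that arises from a zero restriction on feedback can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not a reproduction of Landman’s specification or data: the variable names (`rights`, `treaty`), the parameter values, and the instrument `z` are all hypothetical and chosen only for illustration. Treaty commitment is assumed to have a true effect of 0.2 on rights protection, while rights protection feeds back on treaty commitment with a coefficient of 0.5.

```python
import numpy as np


def simulate_feedback(n=200_000, beta=0.2, gamma=0.5, seed=0):
    """Draw data from a two-equation system with feedback.

    Structural equations (all names and values are illustrative):
        rights = beta  * treaty + u
        treaty = gamma * rights + z + v
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)  # unobserved shocks to rights protection
    v = rng.normal(size=n)  # unobserved shocks to treaty commitment
    z = rng.normal(size=n)  # hypothetical instrument: shifts treaty only
    # Solve the two equations jointly (the reduced form).
    denom = 1.0 - beta * gamma
    rights = (beta * (z + v) + u) / denom
    treaty = (gamma * u + z + v) / denom
    return rights, treaty, z


def ols_slope(y, x):
    """Slope from regressing y on x, which assumes no feedback from y to x."""
    cov = np.cov(x, y)
    return cov[0, 1] / cov[0, 0]


def iv_slope(y, x, z):
    """Simple instrumental-variable estimate of the effect of x on y."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


if __name__ == "__main__":
    rights, treaty, z = simulate_feedback()
    print("true effect of treaty on rights: 0.2")
    print("OLS with zero restriction on feedback:", round(ols_slope(rights, treaty), 3))
    print("IV estimate allowing for feedback:", round(iv_slope(rights, treaty, z), 3))
```

Under these assumed values the single-equation estimate converges to 0.4, twice the true effect, because the regressor is correlated with the error term. The instrumental-variable estimate recovers a value close to 0.2, but only because the simulation supplies a valid instrument by construction.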

Leaving empirics aside, what of the success of Landman’s theoretical synthesis? Unfortunately,
the causal mechanisms are unclear. His exposition of the theory that drives his specification is a
combination of a broad-gauge discussion of schools of theory in comparative politics and
international relations and a review of the extant findings in the quantitative literature on human
rights violations and state coercion/repression. By contrast, Cardenas (2004) identifies three
important actors: the state, elites, and groups that support the international human rights regime.
She centers her discussion on perceived threats to state rule and the almost reflexive response of
states to such threats with violations of rights. What explains the variation in that near-reflexive
response to repress, asks Cardenas? The answer, she submits, will be found in synthetic theories that
focus on the preferences of those three actors. As she is writing a review piece, she leaves the
creation of such a theory to the future, yet it is precisely the absence of such a theory that weakens
the contribution of Landman’s book. He fails to identify the actors and discuss their preferences and
how they are formed. In essence, his “double convergence” of constrained agency is not a theory,
but instead an insight that he uses to suggest that it is important to specify a feedback between
states’ treaty-signing behavior and their human rights behavior.
Lest this be read as denigrating the importance of that insight, let me note that Landman’s study
is a landmark. The statistical work is of the highest quality and sets a new standard in an area where
high-quality statistical work is de rigueur.5 More importantly, it is difficult to imagine that many scholars
will dispute the conjecture that there is feedback, yet Landman is the first to specify and examine it. I
anticipate that future studies that fail to include a feedback specification will struggle to find
their way into print. Further, his findings that states’ human rights obligations and observance of
those obligations are both largely a function of democratization, economic productivity, and
interdependence, but that the “scraps of paper” do have a limited effect, not only have substantial
face validity, but will become the new starting point for debate in the field. So, Landman is to be
congratulated for making an important contribution to the advancement of our understanding of the
development of the human rights regime and the practice of states within it.
His failure to specify the actors involved in the process, their preferences, the structures that
constrain them, and the interaction among preferences, behavior, and structure is an opportunity
that future scholarship can exploit. I wish to suggest, however, that we are more likely to produce
useful theory if we abandon the search for a solution to the agent-structure problem in favor of
taking either preferences or structures as given and exploring the implications of doing so. As long
as there is a distribution among scholars such that some explore the implications of fixing
preferences and exploring how structures affect behavior while other scholars fix structures and
focus on how preferences affect behavior we should produce a healthy debate that will spur
advances in our understanding.


As Cardenas suggests, the constructivist literature on human rights suffers from biased case selection, something that is difficult to do when one
employs a global sample. Cardenas contends that this biased sampling has led the constructivist
literature to ignore the puzzling gap between state signatory behavior and observance behavior. One
virtue of using global samples, then, is that they are considerably less likely to suffer from sample
selection bias. That is not to say that sample selection bias is a non-issue in large-N statistical studies.
Indeed, the literature on treaty compliance is presently debating the issue (e.g., Von Stein 2005).
Nonetheless, global samples are considerably less likely to suffer from selection on the dependent
variable than small-N studies.
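
Why selecting cases on the outcome matters can be sketched with simulated data. The example below is illustrative only and rests on assumptions not found in the studies discussed: `pressure` and `compliance` are hypothetical variables, the true effect is set to 1.0, and "studying only the successes" is represented by keeping cases whose compliance exceeds an arbitrary cutoff.

```python
import numpy as np


def slope(y, x):
    """Bivariate regression slope of y on x."""
    cov = np.cov(x, y)
    return cov[0, 1] / cov[0, 0]


def simulate_selection(n=100_000, effect=1.0, cutoff=1.0, seed=1):
    """Compare the slope in the full sample with the slope in a sample
    chosen on the dependent variable (compliance above the cutoff)."""
    rng = np.random.default_rng(seed)
    pressure = rng.normal(size=n)  # hypothetical explanatory variable
    compliance = effect * pressure + rng.normal(size=n)  # hypothetical outcome
    full = slope(compliance, pressure)
    keep = compliance > cutoff  # keep only the "cases of compliance"
    selected = slope(compliance[keep], pressure[keep])
    return full, selected


if __name__ == "__main__":
    full, selected = simulate_selection()
    print("slope, full sample:", round(full, 3))
    print("slope, cases selected on the outcome:", round(selected, 3))
```

In this setup the full sample recovers a slope near the assumed effect, while the sample selected on the outcome yields a markedly smaller slope, even though the underlying relationship is identical in both.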
Were such samples based on random sampling, one could extol the virtue of the external
validity of such studies, but since they are census populations of the cases that are not missing data,
one does not have access to such an appeal (Ward, Siverson and Cao 2005). Nevertheless, the
coefficient estimates from such an analysis represent the average effect of each variable on the
dependent variable. Further, the standard error of the estimate gives us a measure of our confidence
in the estimate. Knowing the average effect and its dispersion can be very useful and, importantly, is
something we cannot ascertain via alternative methods.
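
The sense in which a coefficient summarizes an average effect, with the standard error as a measure of confidence, can be made concrete. The following is a minimal sketch on simulated data, assuming (hypothetically) that each observation has its own effect drawn around a mean of 0.5 and that these effects are unrelated to the explanatory variable; without that second assumption the average-effect reading need not hold. The standard error uses a heteroskedasticity-robust formula because unit-specific effects make the error variance depend on `x`.

```python
import numpy as np


def pooled_estimate(n=50_000, mean_effect=0.5, effect_sd=0.3, seed=2):
    """Pooled regression when the effect of x differs across observations."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    # Each observation has its own effect, drawn independently of x (assumption).
    unit_effect = rng.normal(mean_effect, effect_sd, size=n)
    y = unit_effect * x + rng.normal(size=n)

    xc = x - x.mean()
    slope = np.sum(xc * (y - y.mean())) / np.sum(xc ** 2)
    intercept = y.mean() - slope * x.mean()
    resid = y - intercept - slope * x
    # Heteroskedasticity-robust (HC0) standard error of the slope.
    se = np.sqrt(np.sum(xc ** 2 * resid ** 2)) / np.sum(xc ** 2)
    return slope, se, unit_effect.mean()


if __name__ == "__main__":
    slope, se, average_effect = pooled_estimate()
    print("pooled slope:", round(slope, 3))
    print("standard error:", round(se, 4))
    print("average of the unit-level effects:", round(average_effect, 3))
```

The pooled slope lands close to the average of the unit-level effects, and the standard error indicates how tightly that average is pinned down. It says nothing about how far any single case departs from the average.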


What do Large-N Studies Obscure?


Landman is unusually self-conscious about the limitations of statistical inference based on a
global sample of cases. He singles out the influences of political, sociological, and personal
relationships among the actors involved; the lobbying involved; the impact of different mobilization
strategies; and the effect of different cultural understandings as processes that cannot be explored
when using global pooled cross-sectional time-series samples (54, 58).
In addition, while one can conduct analysis of outliers, the virtue of identifying general patterns
has a downside: by definition such studies explain outlying cases poorly. This is especially
problematic if one’s theory is about necessary and/or sufficient conditions. Such strong theories are
relatively rare in international relations, but there is no a priori reason they should be. Put differently,
the usefulness of large-N studies hinges wholly on whether one adopts a probabilistic account of
causation when constructing theory. Strong theories that eschew probabilistic causation can be
falsified by a single case, and global pooled cross-sectional time-series samples are useless for testing
such theories.
Finally, measurement is necessarily gross (i.e., imprecise) in global pooled cross-sectional time-series
data structures. Researchers compromise and use both proxy indicators and data that are
considerably noisier than they would like. Small-N analyses need not make either compromise.
Patrick Ball’s work at the American Association for the Advancement of Science demonstrates the
value of what can be done to develop quantitative measures in individual cases.7 One implication of
the superior validity and reliability of data collected over time in individual countries is that we ought
to conduct more single-case, time-series studies (e.g., Pion-Berlin 1983).

Another strength of Landman’s study is that he is careful to distinguish between cross-sectional
and temporal variation in his data, and employs histograms and plots to useful effect. A future
direction for work that relies on statistical inference is to adopt the single-country design and
conduct time-series analysis. These types of studies would add a useful complement to the
qualitative case studies and large-N statistical analyses that presently dominate the literature.


To What Effect?


I began by observing that states’ observance of the human rights regime leaves a great deal to be
desired and that commentators and pundits blame a lack of political will within Western
democracies to back their proclamations with action. Realist theorists of international relations
submit that national security interests prevent states from concerning themselves with moral
obligations such as those enshrined in the treaties that comprise the international human rights
regime. Constructivist theorists of international relations, on the other hand, contend that norms
influence behavior and observe not only that proclamations in favor of human rights are far more
common than they used to be, but also that the practices of specific states are closer to those norms
than they used to be.


With that as context I would like to conclude by examining the value of large-N studies like
Landman’s. As Cardenas observes, large-N studies are less likely than small-N studies to be victims
of biases resulting from selection on the dependent variable. Even when they are not based on
random sampling they are capable of determining the average effects of independent variables on a
dependent variable. To the extent that we are interested in determining the average impact of a state
accepting the obligations of an international treaty on its subsequent behavior (and vice versa) large-
N studies are useful. They help us establish baseline information about general (or typical)
relationships.

They are also useful for testing hypotheses. Statistical inference provides one with an explicit technique and criteria for evaluating the probability that a given hypothesis is consistent with
relevant evidence. While a single study is of limited import, a body of studies is able to establish both
the baseline expectations we can have about a given relationship as well as the strength or weakness
of specific hypotheses and the theories that produce them.
What of the charge that large-N statistics are inherently conservative—that because their reliance
on data restricts them to the study of a given status quo, they are unable to comment on how to
change the status quo? This is an important issue in the study of human rights as the vast majority of
scholars working in the area (presumably) have a normative motivation that is at least as strong as
their positive motives. Does the charge of a conservative bias stick?
I submit that it does not. However, it may appear to be a problem. Many statistical analyses
focus attention on structural characteristics of states that are, by definition, slow to change. The
policy implications of such a study are, by design, limited. Yet there is nothing inherent in large-N
studies that requires they focus on structural characteristics. One might observe that it is easier to
measure structural characteristics than behavior and relational characteristics. But if this is true, then
it is a failure of conceptualization and measurement, not a weakness of large-N studies.

We might thus level one last charge at Landman’s book: by failing to carefully specify the actors,
their preferences, and the structures that constrain their behavior, Landman limits the policy
implications that he might have drawn. I argued above that the book is empirically strong, but
delivers less theoretically. Thus it should not be surprising that its ability to provide policy
implications is limited. We learn that to improve the status of human rights on the planet, we should
promote democratic governance, economic productivity, and interdependence, and that the more
that the states of the world share these characteristics, the more they will observe human rights, the
more they will endorse the international human rights regime, and that such endorsement will have a
small additional impact on observance. These are important broad-based conclusions, but we will
have to await future work that specifies actors, their preferences, and the structures that constrain
their behavior before we can draw more fine-grained implications from what appears to be a
convergence on the usefulness of agent-structure approaches to the study of human rights.
 

References

© 2006, Graduate School of International Studies, University of Denver.