Abstract
Goenner (Conflict Management and Peace Science, 28(5): 1–20, 2011) criticizes the simultaneous equations regression model (SEM) of bilateral trade flows (BT) and militarized interstate disputes (MID) developed by Keshk, Pollins and Reuveny (Journal of Politics 66(4): 1155–1179, 2004) and extended by Keshk, Reuveny and Pollins (Conflict Management and Peace Science, 27(1): 1–20, 2010). Like Hegre, Oneal and Russett (Journal of Peace Research 47(6): 763–774, 2010), he does not agree with Keshk, Reuveny and Pollins that a larger BT has no effect on MID. Unlike Hegre et al. (2010), who focus on the role of distance between capital cities on MID in Keshk et al.’s (2004) SEM, Goenner finds faults in their econometrics. Once these faults are fixed, he says, a larger BT reduces the probability of MID. His analysis is unconvincing. We believe our essay is of interest beyond the trade and conflict research community, as it illustrates the risk of emphasizing technique over substance.
Introduction
Keshk, Pollins and Reuveny (2004) developed a simultaneous equations model (SEM) of bilateral trade flows (BT) and the probability of interstate militarized disputes (MID). They find that a larger BT does not affect MID while MID reduce BT. Goenner (2011) rejects this model and argues that trade brings peace. Provocatively titled “Simultaneity between trade and conflict: Endogenous instruments of mass destruction,” Goenner’s paper comes in the wake of the Hegre, Oneal and Russett (2010) paper, which argues that Keshk et al. (2004) failed to control for distance between capital cities in their MID equation. When distance is included, they say, trade brings peace. Keshk, Reuveny and Pollins (2010) show that the Keshk et al. (2004) result of no effect continues to hold when distance is added to their MID equation, while the Hegre et al. (2010) result reflects their creation of missing BT data cells. Regardless, Hegre et al. (2010) offer a reasonable criticism, for it begins in theory and continues through appropriate methodical exploration. Goenner’s analysis, in contrast, is merely alarmist, lacking substance and veracity. As such, we remain unmoved by his argument.
The context of our response
Goenner says that Keshk et al. (2004) misstep in using a full-information (FI) SEM estimator, as opposed to a limited-information (LI) estimator, and use improper instrumental variables (IVs) for BT in the MID equation and for MID in the BT equation. He offers an alternative model, concluding that MID reduces BT and a larger BT reduces the likelihood of MID. This section sets forth the context of our response.
In macro-social science, we observe the data through econometric models, but we typically begin with stylized facts. Militarily uninteresting trading dyads such as Denmark–Norway or Canada–USA have long been peaceful, but BT did not prevent MID for dyads such as UK–Imperial Germany, USA–Japan, Nazi Germany–USSR, Nazi Germany–USA, USA–USSR, Greece–Turkey, India–Pakistan, USSR–Afghanistan, Israel–Palestinian Authority, France–Iraq, USA–Iraq, USA–Libya, USA–Iran, UK–Iran and Italy–Libya. MID, in contrast, lead to a significant decline and sometimes cessation of BT. For its part, theory sees competing effects for more BT on MID, but expects MID to reduce BT. The result of no effect obtained by Keshk et al. (2004) is intuitive because their large N sample averages out all the different effects.
Econometrics is, of course, one thing in the textbook and another in reality. Goenner equates simultaneity to mere endogeneity manifested as correlation with the error term. The inaccuracy of his assertion stems from the weakness of this assumption, as the correlation may actually reflect measurement errors, incorrect functional forms, sample selection and model misspecification (omitted variables). Any investigation of endogeneity should start with its possible causes. Unlike in the natural sciences, we generally cannot do much to improve our data’s accuracy, and we almost always use linear forms. Our samples are chosen for us by history, so the best we can do is not to arbitrarily remove available data; Goenner commits this cardinal transgression.
These considerations were implicit in Keshk et al. (2004), so they focused on model specification. They developed a theory suggesting that MID and BT simultaneously affect each other. Aware that BT and MID also depend on other theory-driven covariates, they tapped state of the art studies of BT and MID in specifying these covariates. Goenner, we will show, essentially specifies his model ad hoc and in this failure displaces the relevance of his paper from Keshk et al.’s (2004) work.
Simultaneity and endogeneity are related but demand different treatments. Researchers handle simultaneity by taking into account theoretical causal channels in a structural system of simultaneous equations, while they address endogeneity by replacing problematic variables in a single equation. The Keshk et al. (2004) theory emphasized simultaneity, but Goenner focuses on the correlation, and in his rush to detect and correct it, he employs inappropriate methods. We look at each of these issues in turn.
Simultaneity between trade and MID
Goenner argues that Keshk et al. (2004) slip up first in using the FI estimator, rather than the LI estimator. This is a problem, he says, because unlike the LI estimator, which uses information from one equation at a time, the FI estimator uses information from all the equations, making its results relatively more sensitive to a misspecification in any of the SEM equations.
His argument is problematic on several counts. First, both the LI and FI estimators use all of the SEM variables, and hence all the information, in estimating each equation. Second, Goenner does not note that the increased sensitivity of the FI approach to misspecification is based on a textbook theory that assumes knowledge of the true model. As the true model is unknowable, this result is unhelpful. The final issue is that the Amemiya (1978) algorithm that Keshk et al. (2004) used is not an FI SEM estimator, as Goenner suggests, as it does not estimate the variance–covariance matrix of the SEM and does not include a third step of estimating all the equations at once using this matrix.
Next, Goenner suggests LI estimation for each equation. This, too, is troublesome. First, irrespective of Keshk et al.’s (2004) analysis, his decision not to use FI estimation despite its greater efficiency implies an assumption that his equations are misspecified and so beget a problem requiring containment. If so, we should not trust them at all. Second, his claim that the Keshk et al. (2004) equations are misspecified is atheoretical, for he fails to elucidate the nature of their misspecification, when actually Keshk et al. (2004) specified them based on the most dominant specifications in the field.
He analyses separate equations for MID and BT, ignoring the other equation in each case. He seems to think that BT and MID work separately, except that BT in the MID equation and MID in the BT equation are correlated with their respective error terms for reasons not fully explained. This is an odd modeling approach if the goal is to say something about an analysis based on a theory arguing that BT and MID interact simultaneously. By studying the effects of BT on MID and MID on BT as separate exercises, as opposed to a system, Goenner misunderstands the central purpose of Keshk et al. (2004).
Specification
Even harder to understand is why Goenner claims he addresses Keshk et al. (2004; and 2010, which he cites), considering that he changes their theory-based equations. He uses their dependent variables (DVs), ongoing MID and the natural logarithm (Log) of BT, but for his BT equation, he removes their Log (GDP A ), Log (GDP B ), Log (Population A ), Log (Population B ), Log (Democracy L ) and Lagged Log (BT) (where A and B denote the dyad’s countries; L is the lower score) and adds Log (GDP A ⋅ GDP B ), Log (Population A ⋅ Population B ) and Contiguity. For his MID equation, he removes their Trend of Dependence H , Log (Democracy L ) and Lagged MID (H is the higher score), and adds Log (Distance), Democracy L , Peace Years and three Splines. Given these changes, it is unsurprising that his results differ from Keshk et al. (2004). The question, instead, is why he specifies the equations in this way rather than using his own (Goenner, 2004) MID equation, which shows that more BT does not affect MID (as in Keshk et al., 2004), or the MID equation in Goenner (2010), which indicates that more BT promotes MID. We wonder why Goenner (2011) shoves Goenner (2004) into a footnote and ignores Goenner (2010).
In any case, his specification is problematic. To begin with, his BT variable includes a transformation error, as it comes from Keshk et al. (2004). Keshk et al. (2010), which he cites, fixed the error and still found results similar to those obtained by Keshk et al. (2004). Goenner’s results may change should he use the corrected variable because his model and his sample (see later) differ from Keshk et al. (2004).
Second, BT follows a dynamic process owing to the time it takes to change preferences and technologies, produce goods, bring goods to markets, execute trade contracts, etc. As a result, trade markets are never at economic equilibrium and the value of BT in one period depends on its previous value. Goenner errs in excluding the Lagged Log (BT) from the BT equation, ignoring the trade dynamic.
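The trade dynamic described here can be sketched as a partial-adjustment equation. The symbols below ($\rho$, $\beta$, $x_t$, $\varepsilon_t$) are illustrative assumptions and are not the notation of Keshk et al. (2004) or Goenner (2011).

```latex
\log BT_{t} = \rho \, \log BT_{t-1} + x_{t}'\beta + \varepsilon_{t}, \qquad 0 < \rho < 1 .
```

If $\rho \neq 0$ and the lag is dropped, the term $\rho \log BT_{t-1}$ moves into the error, which then becomes serially correlated and correlated with any persistent regressor in $x_t$.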
Third, he includes contiguity in the BT equation, unlike Keshk et al. (2004, 2010), Hegre et al. (2010) and others (see Keshk et al., 2010). He says that Keshk et al. (2004) (and others) ignore contiguity’s effect on BT, but he fails to see the full picture. Transportation cost, not contiguity, affects BT. Contiguity is an ambiguous proxy for transportation cost because it does not inform us about the ease of crossing the border. Including contiguity in the BT equation puts dyads such as India–China on a par with, say, Belgium–Luxembourg.
Fourth, his use of the Log (GDP A ⋅ GDP B ) and Log (Population A ⋅ Population B ) forces the parameter estimates of the two GDPs and the two populations, respectively, into equality. The theoretical micro-foundations of the BT equation indicate that the gross domestic product (GDP) and population of each country in the dyad play separate roles in BT. Again, the restriction is atheoretical. Finally, recalling that the BT equation works in logs, his inclusion of the level of Democracy L in this equation suggests that he thinks democracy should enter the underlying trade relation exponentially, that is, as exp(Democracy L ).
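The equality restriction on the GDP and population coefficients follows directly from the log of a product. A minimal worked sketch, with hypothetical coefficient names ($\beta_1$, $\beta_2$, and the unrestricted $\beta_{1A}, \beta_{1B}, \beta_{2A}, \beta_{2B}$) that appear in neither paper:

```latex
\begin{align*}
\text{Restricted:}\quad \log BT_{AB}
  &= \beta_0 + \beta_1 \log(GDP_A \cdot GDP_B) + \beta_2 \log(Pop_A \cdot Pop_B) + \dots \\
  &= \beta_0 + \beta_1 \log GDP_A + \beta_1 \log GDP_B
     + \beta_2 \log Pop_A + \beta_2 \log Pop_B + \dots \\[4pt]
\text{Unrestricted:}\quad \log BT_{AB}
  &= \beta_0 + \beta_{1A} \log GDP_A + \beta_{1B} \log GDP_B
     + \beta_{2A} \log Pop_A + \beta_{2B} \log Pop_B + \dots
\end{align*}
```

The product form is thus the unrestricted equation with $\beta_{1A} = \beta_{1B}$ and $\beta_{2A} = \beta_{2B}$ imposed rather than tested.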
Goenner’s MID equation is similarly flawed. First, he errs in excluding the lagged ongoing MID. The ongoing MID is the right variable to use for trade and conflict because all the years of an MID can affect BT, but there is a need to model the ongoing MID dyad–years dynamics because they are not independent events. Keshk et al. (2004, 2010), Hegre et al. (2010) and others account for this by including the lagged MID, while Goenner’s treatment of ongoing MID dyad–years as independent events yields deficient estimates. Second, his omission of the lower Trend of Dependence is erroneous, as established theory advocates its use. Third, the inclusion of the unlogged Democracy L is problematic, as this variable is skewed. Fourth, his inclusion of distance is disconcerting. Despite Keshk et al.’s (2010) serious concerns about this variable, Goenner notes their argument only in passing and adds distance to his equation anyway. While Keshk et al. (2010) find that the Keshk et al. (2004) result of no effect for a rise in BT on MID holds with distance, the Goenner result that trade pacifies does not hold without it.
Finally, his discussion of the Keshk et al. (2010) MID equation is misleading with regard to GDP. In the paragraph citing it, he says that Hegre et al. (2010) believe that the GDP of the two countries must be included, reporting that his results still hold in this case. The implication here is that Keshk et al. (2010) do not include two GDPs in the MID equation when they do.
Estimation
Regression with an endogenous regressor requires replacing it with its predicted value from an auxiliary regression on the original exogenous variables and overidentifying restrictions, or external IVs. The procedure assumes that the overidentifying restrictions affect the endogenous variable, and neither directly affect nor are affected by the DV. In an SEM, these restrictions are naturally the exogenous variables from the other equations, provided they are excluded from the equation at stake. This is intuitive, for the SEM is a theory-based system in which endogenous variables depend both on each other and on all the exogenous variables. Keshk et al. (2004, 2010), Hegre et al. (2010) and many other SEM applications use this approach, including the larger models, which drive public policy.
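The two-step logic can be illustrated with a small simulation. This is only a sketch under strong simplifying assumptions: both endogenous variables are continuous (unlike the binary MID), the coefficients are hypothetical, and the function names are ours. It is not the Amemiya (1978) procedure used by Keshk et al. (2004). Here `y1` stands in for a conflict-like variable and `y2` for a trade-like variable, with the true effect of `y2` on `y1` set to zero.

```python
import numpy as np

def simulate_system(n=20_000, a=0.0, c=-0.5, seed=0):
    """Simulate y1 = a*y2 + x1 + u1 and y2 = c*y1 + x2 + u2 (hypothetical values)."""
    rng = np.random.default_rng(seed)
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    u1, u2 = rng.normal(size=n), rng.normal(size=n)
    det = 1.0 - a * c  # solve the two structural equations jointly
    y1 = (a * (x2 + u2) + x1 + u1) / det
    y2 = (c * (x1 + u1) + x2 + u2) / det
    return y1, y2, x1, x2

def ols(y, X):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, endog, exog_included, exog_excluded):
    """Replace the endogenous regressor by its fit on all exogenous variables."""
    const = np.ones(len(y))
    Z = np.column_stack([const, exog_included, exog_excluded])
    endog_hat = Z @ ols(endog, Z)            # first stage (auxiliary regression)
    X_hat = np.column_stack([const, endog_hat, exog_included])
    return ols(y, X_hat)                     # second stage

y1, y2, x1, x2 = simulate_system()
b_ols = ols(y1, np.column_stack([np.ones(len(y1)), y2, x1]))
b_iv = two_stage_least_squares(y1, y2, x1, x2)
print("single-equation OLS effect of y2 on y1:", round(b_ols[1], 3))
print("2SLS effect of y2 on y1:", round(b_iv[1], 3))
```

In this setup the excluded exogenous variable `x2` comes from the other equation of the system, which is the choice of restriction the paragraph above describes. Single-equation OLS reports a spurious negative effect, while the two-stage estimate stays near the true value of zero.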
In contrast, without offering justification, Goenner chooses the overidentifying restrictions for each endogenous variable in the equation they enter, an approach that is again divorced from theory. His auxiliary regression for replacing MID includes determinants of BT and external IVs, omitting theoretical determinants of MID, and his auxiliary regression for replacing BT includes determinants of MID and external IVs, omitting theoretical determinants of BT. The resulting auxiliary regressions are therefore biased, and so the results building on them in the second stage regressions are unreliable.
Of additional concern is the problematic nature of his chosen overidentifying restrictions. For BT in the MID equation, he chooses the lagged BT and the absolute value of the difference between the land-to-population ratios in a dyad, and for MID in the BT equation he chooses the lagged MID and the high-to-low military expenditures ratio in a dyad. Let us evaluate these choices.
Goenner says that military expenditure proxies MID, but these expenditures include non-MID components such as wages, pensions, medical benefits, training, arms upkeep, base upkeep and R&D.
The expenses are also not dyadic because they are rarely spent with one country in mind. His high-to-low expenditure ratio is dyadic, but only mechanically. Even were it to be dyadic, it misleads. A higher ratio can indicate a larger numerator (more conflict by his logic), a smaller denominator (less conflict) or some mix. Indeed, we do not know of any study using his dyadic ratio as a proxy for dyadic MID.
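A short arithmetic sketch shows the ambiguity of the ratio. The spending figures are invented for illustration and the function name is ours.

```python
def high_to_low_ratio(spending_a, spending_b):
    """Ratio of the larger to the smaller military budget in a dyad."""
    high, low = max(spending_a, spending_b), min(spending_a, spending_b)
    return high / low

# Hypothetical budgets in arbitrary units.
baseline = high_to_low_ratio(100.0, 50.0)   # starting point
buildup = high_to_low_ratio(200.0, 50.0)    # the larger spender doubles its budget
cutback = high_to_low_ratio(100.0, 25.0)    # the smaller spender halves its budget
print(baseline, buildup, cutback)
```

The build-up and the cutback scenarios yield the same ratio, so the ratio alone cannot distinguish more spending by one side from less spending by the other.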
He claims that the land-to-population ratio proxies BT because it captures differences in national factor endowments, which the Heckscher and Ohlin (H-O) model links to comparative advantage, and which Deardorff (1998) uses to derive a trade gravity model. However, this is problematic on two counts. Goenner employs total land and total population, which misrepresents comparative advantage, for land and population come in different economic types (e.g. desert, arable/skilled, unskilled). Further, while H-O trade occurs among countries with dissimilar factor endowments trading different types of goods, most international trade occurs among countries with similar factor endowments trading similar types of products, so-called intra-industry trade or similar–similar trade (Krugman, 2009). About 40% of world trade is H-O or dissimilar–dissimilar, with the other 60% or so being similar–similar.
Theory indicates that his overidentifying restrictions are affected by BT or MID in the equation they enter; that is, they are endogenous. For BT in the MID equation, the lagged BT is affected by MID owing to MID expectations and the effect of MID on population and land. For MID in the BT equation, the lagged MID is affected by BT owing to expectations, arms trade and the effect of BT on economic growth and, therefore, government revenues and expenditures.
Goenner’s overidentifying restrictions also directly affect the DV in their respective equations. In the BT equation, the lagged MID affects BT, and military expenditures do the same by changing labor allocations and arms trade. In the MID equation, the lagged BT can affect MID by Goenner’s liberal explanation, and population density (one over his restriction) may figure into MID decisions since more population-dense countries may suffer more casualties owing to MID or get involved in MID to seek more land.
In sum, Goenner’s external IVs are atheoretical, arbitrary and misleading. They also directly affect and are affected by the DV in the equation they enter. Econometric theory in this case indicates that the IV estimates are inconsistent. We find no way to view them as other than deleterious to understanding.
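The inconsistency result can be seen in a small simulation. This is a sketch with invented parameter values (`beta`, `gamma`, `pi`) and a single instrument `z`; it is not a replication of Goenner's model. When `z` also enters the outcome equation directly with coefficient `gamma`, the simple IV estimator converges to `beta + gamma / pi` rather than to `beta`.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
beta, gamma, pi = 0.0, 0.3, 1.0   # hypothetical true values

z = rng.normal(size=n)             # candidate instrument
v = rng.normal(size=n)
u = 0.5 * v + rng.normal(size=n)   # error correlated with x, so x is endogenous
x = pi * z + v                     # first-stage relation

def iv_estimate(y, x, z):
    """Simple IV (Wald) estimator: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

y_valid = beta * x + u                # z is excluded from the outcome equation
y_invalid = beta * x + gamma * z + u  # z also affects the outcome directly

iv_valid = iv_estimate(y_valid, x, z)
iv_invalid = iv_estimate(y_invalid, x, z)
print("valid instrument:", round(iv_valid, 3))
print("invalid instrument:", round(iv_invalid, 3))
```

More data does not help in the second case: the estimate settles near `beta + gamma / pi`, not near `beta`.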
Grave as these problems are, the list is not yet exhaustive. Goenner does not model the ongoing MID dynamics, so serial correlation probably biases his inferences. He uses the Stata estimator ivreg2 for the BT equation, so his fitted MID in the first stage comes from a linear probability model, but there is no reason to expect that the MID probability depends linearly on variables across their range, and the linear probability model’s predicted probabilities can lie outside the 0–1 range. Finally, he arbitrarily drops about 24,000 observations from the Keshk et al. (2004) sample in estimating his MID equation, and 3000 observations in estimating his BT equation, casting yet more doubt on his results.
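The range problem of the linear probability model is easy to reproduce. The sketch below uses simulated data with made-up coefficients for a fairly rare binary event; it does not use the Keshk et al. (2004) sample or Stata's ivreg2.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5_000
x = rng.normal(size=n)

# Hypothetical data-generating process: a logistic probability of the event.
p_true = 1.0 / (1.0 + np.exp(-(-3.0 + 2.0 * x)))
y = (rng.uniform(size=n) < p_true).astype(float)

# Linear probability model: OLS of the 0/1 outcome on a constant and x.
X = np.column_stack([np.ones(n), x])
coef = np.linalg.lstsq(X, y, rcond=None)[0]
fitted = X @ coef

n_below = int((fitted < 0).sum())
n_above = int((fitted > 1).sum())
print("fitted values below 0:", n_below)
print("fitted values above 1:", n_above)
print("smallest fitted value:", fitted.min())
```

Any fitted value below zero cannot be read as a probability, yet it would still be carried into a second-stage regression as the fitted regressor.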
Endogeneity tests
Theory indicates that Goenner’s external IVs are endogenous to BT and MID, respectively. Still, his Hansen J for BT and Lee (1992) test for MID report that his IVs are exogenous. Puzzlingly, Goenner runs the tests but seems unfazed by their counterintuitive results.
Let us first define some terms. The vector
Under the null hypothesis (H0) for the Hansen J or Lee tests,
For all that he addresses it, Goenner may have overlooked this point entirely. He reports, for example, that his test says that his external IVs for BT are exogenous when he excludes contiguity from the MID equation, and are endogenous when distance is excluded. He takes this abrupt change in the results as absolute, turning a blind eye to more than one red flag. For one, he accepts the results of a test that excludes contiguity, one of our most robust determinants of MID, and allows a decision on whether some variable affects another (endogeneity) to depend on including or excluding other variables (distance) in the test.
These basic problems arise because the Goenner tests of endogeneity for the overidentifying restrictions perform calculations assuming all the other variables in each regression are exogenous to BT and MID, respectively. Since they actually are not exogenous, a small change in specification changes the result abruptly. Goenner does not even begin to discuss the applicability of his tests’ assumptions to his analyses. Since their assumptions do not hold, the tests’ findings are not meaningful.
The bigger picture: Are endogeneity tests ever appropriate?
As Greene says, in macroeconomics, let alone in the international system, almost no variable can be said to be exogenous (Greene, 2008: 357), so these tests are not applicable here. They are applicable only when the studied unit of analysis is small enough that some of its determinants change autonomously outside of its boundaries. Examples of work concerning units of analysis of this type include Wooldridge (2002) and Cameron and Trivedi (2005), which Goenner cites, and Cameron and Trivedi (2010). The latter two authors even title their works as microeconometrics, or econometrics for units of analysis sufficiently small to be subject to external forces that can be taken as given.
The international system is small compared with, say, our galaxy, but galactic forces have so far not made an appearance in our theories. Until they do, our models of macroeconomics, international economics, international relations, international political economy, and so on, do not really include exogenous variables. We have long known that all of our variables are endogenous to the forces we ask them to explain. Even the climate is becoming increasingly endogenous—we are changing it.
Given all of this, we reject both the form and findings of Goenner’s paper, but then what are we to do in the meantime? One option is to examine parts of the system, assuming others change autonomously and leave it at that. This is akin to doing partial equilibrium analysis in economics. As Cameron and Trivedi (2010: 181) write, even when a test of overidentifying restrictions is applicable (i.e. the units are small), exogeneity relies on theory (i.e. assumptions) and norms established in prior studies.
A second option is to argue that we can only estimate reduced forms that depend on several lags of each of our variables, the position taken by vector autoregressions (Kennedy, 2008: 181), although it, too, is imperfect. Notably, the estimation is imprecise since the lagged values of a variable are typically highly correlated, so that inference has to rest on joint significance levels of parameter estimates, sums of parameter estimates and/or impulse response functions rather than on individual coefficients. Since human actors have expectations, the right-hand side variables are still endogenous to the DV.
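For reference, a reduced-form vector autoregression of order $p$ can be written as below. The stacking of BT and MID into $y_t$ is our illustrative assumption; a binary MID would in practice call for a different treatment than this linear sketch.

```latex
y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \dots + A_p y_{t-p} + \varepsilon_t ,
\qquad
y_t = \begin{pmatrix} \log BT_t \\ MID_t \end{pmatrix}
```

Each $A_j$ is a $2 \times 2$ coefficient matrix, so a single variable contributes $p$ highly correlated lags to every equation, which is why inference turns to joint tests and impulse responses.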
A third option is to use larger models, assuming that some variables are exogenous depending on the modeling goal. This has been our position in Keshk et al. (2004, 2010). We have had good teachers. In the early days of the behavioral revolution in international relations, some wise people attempted to eat the whole cake at once in their Globus world model. The field apparently was not ready for that, for this trailblazing project has largely disappeared from our discourse. Perhaps the time has come to revisit it; we have made progress since those early days in Berlin. A partial path to its resurrection is the accumulation of knowledge by modeling parts of the system in the less ambitious SEM environments.
The crucial point in this proposed endeavor is to avoid what Goenner has done, relying on statistics to do the work for us. Greene (2008: 381) refers to this approach as “for better or worse.” His “for better” part is understood; the approach can be useful for small units of analysis. It is Greene’s “for worse” part that should interest us the most, for it applies in our (and his) macro domains. The Goenner way is easy to implement, but it is a “for worse” way; in emphasizing technique over substance, it risks forfeiting the primary advantages we have over computers: curiosity and intuition.
Funding
This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
