I will be giving a presentation at the Peter D. Watson Seminar Series next week on why weapons inspections are useful. Here’s the abstract:
Abstract: How do weapons inspections alter international bargaining environments? While conventional wisdom focuses on informational aspects, this paper focuses on inspections’ impact on the cost of a potential program–weapons inspectors shut down the most efficient avenues to development, forcing rising states to pursue more costly means to develop arms. To demonstrate the corresponding positive effects, this paper develops a model of negotiating over hidden weapons programs in the shadow of preventive war. If the cost of arms is large, efficient agreements are credible even if declining states cannot observe violations. However, if the cost is small, a commitment problem leads to positive probability of preventive war and costly weapons investment. Equilibrium welfare under this second outcome is mutually inferior to the equilibrium welfare of the first outcome. Consequently, both rising states and declining states benefit from weapons inspections even if those inspections cannot reveal all private information.
Download the paper here.
Download the slides here.
First, a brief background. The main theoretical chapter of my dissertation shows that nonproliferation agreements are fairly easy to establish. Even if a potential rising state proliferates, it will ultimately only be able to receive some amount of concessions from a rival. As such, to deter proliferation, the rival only needs to offer most of the concessions the rising state would receive if it did proliferate. Nuclear investment is no longer profitable, as most of the concessions that proliferation would yield have already been given up. The rising state is happy to maintain the status quo because it is already getting most of what it wants. (Building would yield slightly more concessions, but it would not be worth paying the investment cost.) The rival is happy because it can keep a small amount of the concessions to itself, as the rising state is willing to accept slightly reduced offers due to the aforementioned cost savings.
The only obstacle is if the costs of proliferation are very small. Here, the rival cannot scale back very much on the deal, so the value of a nonproliferation agreement is smaller for that rival. In turn, the rival may prefer impatiently hoarding as much of the bargaining good as possible for as long as possible, forcing the rising state to proliferate. At that point, the rival makes great concessions.
Note that the key comparative static determining whether proliferation occurs is the cost of weapons. If the cost is high, nonproliferation agreements work. If not, proliferation occurs. Fortunately, nuclear weapons (and their necessary delivery systems) are incredibly expensive. Consequently, nonproliferation prevails most of the time.
However, nuclear “prestige” seems like a hindrance to the nonproliferation regime. Advocates of this theory claim that nuclear weapons bestow international prestige on their possessors, separating the owners from everyone else above and beyond the power nuclear weapons provide. There are many reasons to doubt whether such prestige actually exists–I can’t remember the last time one country was excited to hear that another one was proliferating–but let’s go with it for a moment. It then seems like prestige might sabotage those nonproliferation agreements, as it reduces the perceived investment cost for the rising state.
This has been in the back of my mind for a year or two now, and I have wondered what it meant for the robustness of my dissertation’s nonproliferation argument. Luckily, I had a mental breakthrough a couple of nights ago. The prestige argument is mostly harmless.
The key here is that prestige is zero sum. If nuclear weapons are prestigious, then no country is prestigious if all countries have them. As a result, it is incorrect to think of prestige as affecting a rising state’s perception of the cost of proliferation. Rather, prestige matters for determining the amount of goodies the rising state will receive in the future. Additional prestige means the rising state will receive more, whereas the relatively less prestigious countries (compared to today’s status quo) will receive less.
But this is just a complicated way of saying that nuclear weapons give their possessors additional concessions. Consequently, those who would have to give up the concessions should proliferation occur have incentive to reach nonproliferation agreements for the reasons outlined above. In turn, prestige has little effect on the viability of nonproliferation agreements.
Nevertheless, this logic explains the competing beliefs about the existence of prestige. Rising states claim that prestige exists–because, if it does, rivals will have to give them more to reach nonproliferation settlements. Their rivals claim that prestige does not exist–because, if it does not, the cost of reaching a nonproliferation agreement will be lower for them.
If you’d like to see the argument in action, take a look at the chapter. The prestige argument is the first robustness check I run.
So Syria’s chemical weapons have made life a little more interesting lately. I know enough to know that I don’t know enough to say whether we should intervene in Syria. I do know something about Iran, though. Here, I’m briefly going to argue that not intervening (or a very limited intervention) in Syria could help our ongoing negotiations with Iran regarding its nuclear program.
Before going further, I want to emphasize that this departs from the conventional wisdom. The standard argument is that failure to intervene in Syria shows weakness–if we aren’t resolved enough to attack Syria, we are less likely to attack Iran, so we should attack Syria to maintain the plausibility of the bluff.
The problem with this argument is that it overlooks the scope of these missions. We could plausibly commit to an air campaign in Syria a la Libya from a few years ago. Iran’s nuclear program, however, is complicated enough that air strikes might do more harm than good. And if you look at public opinion polls on Syria, a majority(ish) supports strikes. Due to Iraq’s shadow, the support for a ground war is in the single digits. Thus, the overall American war narrative here is that air strikes are okay but ground wars are not.
Back to Iran. One reason we cannot reach an agreement with Iran is our inability to credibly commit to a bargain. Given our negotiation history, Iran is worried that if we ever find ourselves in a position of strength again (like right after the fall of Baghdad but before the start of the insurgency), we will immediately cut all concessions to Tehran. Consequently, Iran views nuclear weapons as a costly insurance policy–a wasteful but necessary evil. Yet, if we could simply keep our commitment to not intervene credible, Iran would have no need to proliferate, and we would all be better off.
So if we stage a full-scale assault on Syria, we officially signal that the lessons from Iraq are irrelevant, and we as Americans just don’t mind seeking total victory against weaker states. If we don’t do anything to Syria, or just limit the intervention to light aerial bombardments, we signal that Iraq ushered in a new era in which we just don’t do that type of thing anymore. In the former case, Iran will scramble to finish a nuclear bomb as quickly as possible. In the latter case, we reassure Tehran that our commitments to nonproliferation inducements are credible.
The whole thing is a strategic mess. But if you want to learn more about credible commitment in nonproliferation agreements (and incidentally preview my upcoming Peace Science presentation), check out this chapter from my dissertation.
Abstract
This paper develops a model of negotiating over costly weapons programs. Surprisingly, in equilibrium, rising states rarely invest in arms. First, if the extent of the power shift is large, the declining state leverages the threat of preventive war to induce the rising state not to build. Second, if the power shift is too small to be worth the investment, the declining state offers no concessions and still induces non-armament. In between, if the costs are sizable, the declining state offers concessions-for-weapons, or butter-for-bombs deals. Even though the rising state could take those concessions and build anyway, it nevertheless accepts the payments and maintains the status quo. Armament only occurs in the least important of cases–that is, when the power shift is minimal and not costly. The results indicate that major power shifts–such as those caused by nuclear proliferation–are not non-negotiable and are instead the result of other bargaining problems.
At the center of the cultural problems was a management system called “stack ranking.” Every current and former Microsoft employee I interviewed—every one—cited stack ranking as the most destructive process inside of Microsoft, something that drove out untold numbers of employees. The system—also referred to as “the performance model,” “the bell curve,” or just “the employee review”—has, with certain variations over the years, worked like this: every unit was forced to declare a certain percentage of employees as top performers, then good performers, then average, then below average, then poor…
For that reason, executives said, a lot of Microsoft superstars did everything they could to avoid working alongside other top-notch developers, out of fear that they would be hurt in the rankings. And the reviews had real-world consequences: those at the top received bonuses and promotions; those at the bottom usually received no cash or were shown the door…
“The behavior this engenders, people do everything they can to stay out of the bottom bucket,” one Microsoft engineer said. “People responsible for features will openly sabotage other people’s efforts. One of the most valuable things I learned was to give the appearance of being courteous while withholding just enough information from colleagues to ensure they didn’t get ahead of me on the rankings.” Worse, because the reviews came every six months, employees and their supervisors—who were also ranked—focused on their short-term performance, rather than on longer efforts to innovate…
This strikes two of my pet peeves. First, it shows a fundamental misunderstanding as to how normal distributions work. Despite many of my high school English teachers insisting otherwise, not everything is normally distributed. In fact, there is very good reason to expect Microsoft employees to not be normally distributed. Microsoft should be hiring the very best of the best programmers in the world. Thus, they are sampling from the right tail of the normal distribution–there are a lot of very good programmers, fewer very very good programmers, and yet fewer very very very good programmers.
Second, the system institutes perverse incentives. It is fully reasonable to believe that everyone within a division is a productive employee. Yet the system assumes otherwise. As a formal theorist, I am all for making assumptions when they serve a purpose. There is no purpose here. Managers should be evaluating everyone’s performance over the hypothetical replacement employee. Instead, a certain percentage are forced to be bad by assumption. This leads to stupid infighting–employee rankings are zero sum, so some might need to step on others to reach a higher rank.
It also wouldn’t surprise me if many employees bolted to other companies in search of fairer wages. If only the top five people are receiving large bonuses but the sixth-best contributed a great deal to the company, that sixth person is also worth an equitable share of the bonus. If he can’t get it from Microsoft, he’s going to go to a place that will give it to him.
Abstract: How do weapons inspections alter international bargaining environments? While conventional wisdom focuses on informational aspects, this paper focuses on inspections’ impact on the cost of a potential program–weapons inspectors shut down the most efficient avenues to development, forcing rising states to pursue more costly means to develop arms. To demonstrate the corresponding positive effects, this paper develops a model of negotiating over hidden weapons programs in the shadow of preventive war. If the cost of arms is large, efficient agreements are credible even if declining states cannot observe violations. However, if the cost is small, a commitment problem leads to positive probability of preventive war and costly weapons investment. Equilibrium welfare under this second outcome is mutually inferior to the equilibrium welfare of the first outcome. Consequently, both rising states and declining states benefit from weapons inspections even if those inspections cannot reveal all private information.
If you are here for the long haul, you can download the chapter on the purpose of weapons inspections here. Being that it is a later chapter from my dissertation, here is a quick version of the basic “butter-for-bombs” model:
Imagine a two period game between R(ising state) and D(declining state). In the first period, D makes an offer x to R, to which R responds by accepting, rejecting, or building weapons. Accepting locks in the proposal; R receives x and D receives 1-x for the rest of time. Rejecting locks in war payoffs; R receives p – c_R and D receives 1 – p – c_D. Building requires a cost k > 0. D responds by either preventing–locking in the war payoffs from before–or advancing to the post-shift state of the world.
In the post-shift state, D makes a second offer y to R, which R accepts or rejects. Accepting locks in the offer for the rest of time. Rejecting leads to war payoffs; R receives p’ – c_R and D receives 1 – p’ – c_D, where p’ > p. Thus, R fares better in war post-shift and D fares worse.
As usual, the actors share a common discount factor δ.
The main question is whether D can buy off R. Perhaps surprisingly, the answer is yes, and easily so. To see why, note that even if R builds, it only receives a larger portion of the pie in the later stage. Specifically, D must offer p’ – c_R to appease R and will do so, since provoking war leads to unnecessary destruction. Thus, if R ever builds, it receives p’ – c_R for the rest of time.
Now consider R’s decision whether to build in the first period. Let’s ignore the reject option, as D will never be silly enough to offer an amount that leads to unnecessary war. If R accepts x, it receives x for the rest of time. If it builds (and D does not prevent), then R pays the cost k and receives x today and p’ – c_R for the rest of time. Thus, R is willing to forgo building if:
x ≥ (1 – δ)x + δ(p’ – c_R) – (1 – δ)k
Solving for x yields:
x ≥ p’ – c_R – (1 – δ)k/δ
It’s as simple as that. As long as D offers at least p’ – c_R – (1 – δ)k/δ, R accepts. There is no need to build if you are already getting all of the concessions you seek. Meanwhile, D happily bribes R in this manner, as it gets to steal the surplus created by R not wasting the investment cost k.
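To make the cutoff concrete, here is a minimal Python sketch that computes the smallest offer that keeps R from building and checks that R is exactly indifferent at that offer. The function names and the parameter values are my own illustrative assumptions, not numbers from the chapter. Payoffs are normalized the same way as in the inequality above.

```python
def min_offer(p_post, c_r, k, delta):
    """Smallest x satisfying x >= p' - c_R - (1 - delta) * k / delta."""
    if not 0 < delta < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    return p_post - c_r - (1 - delta) * k / delta


def accept_payoff(x):
    """Normalized payoff to R from receiving x for the rest of time."""
    return x


def build_payoff(x, p_post, c_r, k, delta):
    """Normalized payoff to R from taking x today, paying k, then getting p' - c_R forever."""
    return (1 - delta) * x + delta * (p_post - c_r) - (1 - delta) * k


# Hypothetical parameter values, chosen only for illustration.
p_post, c_r, k, delta = 0.7, 0.1, 0.2, 0.9

x_star = min_offer(p_post, c_r, k, delta)
print(f"minimum offer: {x_star:.4f}")
print(f"accept: {accept_payoff(x_star):.4f}")
print(f"build:  {build_payoff(x_star, p_post, c_r, k, delta):.4f}")
```

At the cutoff the two payoffs coincide, and the cutoff sits strictly below p’ – c_R whenever k > 0. That gap is the surplus D keeps.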
The chapter looks at the same situation but with imperfect information–the declining state does not know whether the rising state built when it chooses whether to prevent. Things get a little hairy, but the states can still hammer out agreements most of the time.
I hope you enjoy the chapter. Feel free to shoot me a cold email with any comments you might have.
I can tell you when I began losing faith in standardized testing: August 11, 2009. That was the date of my GRE. My writing prompt was as follows:
Explain the causes of war.
Wow! This could not have been any more perfect. Here I was taking the GRE so I could go to grad school and write a dissertation on the causes of war. ETS threw me a softball!
Then I received my score: 4.5. While not terrible, the 4.5 corresponded to the 63rd percentile. According to ETS (roughly), more than a third of GRE takers could write my dissertation better than I could!
Maybe the essay was not that great. Maybe I am a terrible writer. Maybe I don’t understand the causes of war. The academic job market will likely be the ultimate arbiter of my abilities. But until then, the 4.5 seems silly.
_________
Years later, I came across this revealing article on standardized testing graders’ incentive structure. Many tests use some sort of consensus method. Begin by giving the test to two graders. If the marks are close, average the grades and move to the next exam. If the marks are not close, bring in a third grader and average the three grades in some pre-defined way.
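The consensus rule described above can be sketched in a few lines of Python. The tolerance of one point and the simple three-way mean are assumptions on my part; the article only says the marks must be "close" and that the three grades are combined in some pre-defined way.

```python
def consensus_score(first, second, third_reader=None, tolerance=1.0):
    """Return (final_score, used_third_reader) under a two-reader consensus rule.

    tolerance and the plain three-way average are hypothetical choices.
    third_reader is a callable that produces the third mark when needed.
    """
    if abs(first - second) <= tolerance:
        # The marks are close: average them and move to the next exam.
        return (first + second) / 2, False
    if third_reader is None:
        raise ValueError("marks disagree and no third reader is available")
    third = third_reader()
    return (first + second + third) / 3, True


# Two readers who roughly agree never trigger the costly third read.
print(consensus_score(4.0, 5.0))

# A disagreement brings in a third reader (here a stand-in returning 4.0).
print(consensus_score(6.0, 3.0, third_reader=lambda: 4.0))
```

The second return value is the thing the company cares about: every `True` is an extra read it has to pay for.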
Standardized testing companies are not in the business of giving correct grades–they are in the business of grading tests as quickly as possible. The potential for a third grader merely appeases a school’s desire to have some semblance of legitimacy. For the company, the third grader is a speed bump. Every test that reaches a third reader requires 50% more labor–and a non-negligible decrease in profitability for that particular test.
Realizing this problem, testing companies pay careful attention to their graders’ agreement rate. A low agreement rate is the mark of a bad employee. Supervisors might creatively limit the number of tests such an employee can grade (thus inflating the supervisor’s overall agreement rate for his or her team), while managers might fire him or her.
The testing companies deserve respect for creating mechanisms to keep employees in line with company goals. Unfortunately, those company goals do not comport with what we as consumers want out of the standardized testing scores we buy.
_________
I have loved game shows all my life. One of my earliest memories is of watching Family Feud. The game is simple: producers survey 100 people with a variety of questions. They then tally the responses. Contestants on the show must then guess which answer was most frequently given.
Note the emphasis here is on matching, not correctness. For example, suppose the prompt was “name a cause of war.” As a contestant, I would say “greed” or “irrationality” way before I said “private information” and “commitment problems.” The first two responses are terrible, terrible answers. The second two are fantastic. Yet, because your average survey taker has not read “Rationalist Explanations for War,” I would not expect many people to give sensible responses to the survey. So commitment problems would yield a smaller score than greed. In turn, I as a contestant pander to their ignorance and say the silly thing.
_________
Suppose you are a test grader. Better yet, say you are the test grader. God has endowed you with absolute authority in test grading matters. You know a 10 essay is a 10 essay and a 1 essay is a 1 essay. Whereas others struggle to see the difference, your observations are perfect.
But you are also broke and need a job. I hand you a test. You recognize it is a clear 10. What grade do you give it?
In a world of justice, your answer is 10. But in a world where you need to eat, you take a different route. Perhaps the essay did something strange, something you have never seen before–something like argue that bargaining problems cause war. You recognize that such an argument reflects scholarly consensus and would be the baseline for tenure at Stanford. But you also know that other graders will think that the argument is just plain bizarre. So you credit the writer for having decent organizational structure but not much substance and turn in a 7.
The other grader gives it a 6.5. The system counts it as a match and you do not get into trouble.
Grading systems are not grading systems–they are coordination games with multiple equilibria. Like in the Family Feud, a reader should not give the grade the writer actually deserves but rather what he or she thinks other readers will give it.
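Here is a minimal sketch of that coordination game in Python. The payoffs are hypothetical: a grader earns 1 for matching the other reader and 0 for a mismatch that triggers a third read. The menu of scores is also made up for illustration.

```python
from itertools import product

SCORES = [2, 7, 10]  # hypothetical menu of marks a grader might assign


def payoff(own, other):
    """Hypothetical payoff: 1 if the two marks match, 0 otherwise."""
    return 1 if own == other else 0


def pure_nash_equilibria(scores):
    """All pairs of marks from which neither grader gains by deviating alone."""
    equilibria = []
    for a, b in product(scores, repeat=2):
        a_is_best = all(payoff(a, b) >= payoff(alt, b) for alt in scores)
        b_is_best = all(payoff(b, a) >= payoff(alt, a) for alt in scores)
        if a_is_best and b_is_best:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria(SCORES))
```

Every matching pair is an equilibrium, and the essay's true quality never enters the payoff function at all.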
But this leads to perverse equilibria. For example, if we all created the rule that essays beginning with a vowel are 10s and essays beginning with consonants are 2s, no one would want to break the system and risk the wrath of being labeled inefficient. So substance goes out the window. Graders instead look for focal points to coordinate their scores.
Those cues need not reflect anything of substance. To wit, consider the following review of a supervisor’s team:
[A] representative from a southeastern state’s Department of Education visited to check on how her state’s essays were doing. As it turned out, the answer was: not well. About 67 percent of the students were getting 2s.
That’s when the representative informed Farley that the rubric for her state’s scoring had suddenly changed.
“We can’t give this many 1s and 2s,” she told him firmly.
The scorers would not be going back to re-grade the hundreds of tests they’d already finished—there just wasn’t time. Instead, they were just going to give out more 3s.
3s magically appeared out of nowhere–partially because the testing company wanted a higher average, and partially because graders feared that giving a 1 or 2 would result in a mismatch and disciplining from the supervisor.
The article is full of other lovely anecdotes and worth the read. And it is also terrifying to think about.
_________
In retrospect, I should not have written about how bargaining problems cause war. From the point of view of a grader, it is just too bizarre. I deserve my 4.5 for not properly playing the game.
(I suppose three years of grad school has made me more cynical about life. Perhaps a better subtitle for this article is “How I Learned to Stop Worrying and Love Pandering.”)
Of course, that is precisely the problem. The incentive system for grading is perverse and rewards students for writing safe–and not particularly insightful–essays. In contrast, the academic job market rewards the opposite approach. Write a dissertation that has been done a thousand times before, and you won’t find employment. Do something revolutionary, and have your pick of the top jobs.
The test companies do not have final say over the method of standardized testing grading. We do. It’s time we demand change in the system.
In standard bargaining situations, both parties understand the fundamentals of the agreement. For example, if I offer you a $20 per hour wage, then I will pay you $20 per hour; if I propose a 1% sales tax increase, then sales tax will increase by 1%. But not all such deals are evident. Senate confirmation of judicial nominees is particularly troublesome: the President has a much better idea of the nominee’s true ideology than the Senate does. Indeed, as the Senate votes to confirm or reject, the Senate may very well be unsure what it is buying.
This situation is the center of a new working paper from Maya Sen and myself. We develop a formal model of the interaction between the President and the Senate during the judicial nomination process. At first thought, it might seem as though the President benefits from the lack of information by occasionally sneaking in extremist justices the Senate would otherwise reject. However, our main results show that this lack of information ultimately harms both parties.
To unravel the logic, suppose the President could nominate a moderate or an extremist. Now imagine that the Senate is ideologically opposed, so it only wants to confirm the moderate. The choice to reject is not so simple, though, because the Senate cannot directly observe the nominee’s type but rather must make inferences based on a noisy signal. Specifically, the Senate receives a signal with probability p if the President chooses an extremist. (This signal might come from the media uncovering a “smoking gun” document.) The President suffers a reputation cost if he is caught in this manner. If the President selects a moderate, the Senate receives no signal at all. Thus, upon not receiving a signal, the Senate cannot be sure whether the President nominated a moderate or extremist.
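The Senate's inference problem is an application of Bayes' rule. The sketch below is only an illustration: q is a symbol I am assuming for the probability that the President nominates an extremist (the setup above only names p), and the numbers are made up.

```python
def posterior_extremist(q, p, signal_observed):
    """Senate's belief that the nominee is an extremist.

    q: probability the President nominates an extremist (assumed symbol)
    p: probability a signal appears when the nominee is an extremist
    Moderates never generate a signal.
    """
    if not (0 <= q <= 1 and 0 <= p <= 1):
        raise ValueError("q and p must be probabilities")
    if signal_observed:
        if q * p == 0:
            raise ValueError("a signal cannot occur under these parameters")
        return 1.0  # only extremists produce smoking guns
    no_signal = q * (1 - p) + (1 - q)
    if no_signal == 0:
        raise ValueError("silence cannot occur under these parameters")
    return q * (1 - p) / no_signal


# Hypothetical values: the President picks an extremist half the time
# and the signal is weak.
print(round(posterior_extremist(0.5, 0.2, signal_observed=False), 4))
```

With a weak signal, silence barely moves the Senate's belief away from q, which is why the lack of a smoking gun is not much reassurance.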
With those dynamics in mind, consider how the President acts when the signal is weak. Can he only nominate an extremist? No–the Senate would obviously always reject regardless of its signal. Can he only nominate a moderate? No–the Senate would respond by confirming the nominee despite the lack of a signal, but the President could then gamble by selecting an extremist and hoping that the weak signal works in his favor. As such, the President must mix between nominating a moderate and nominating an extremist.
Similarly, the Senate must mix as well. If it were to always confirm, the President would nominate extremists exclusively, but that cannot be sustainable for the reasons outlined above. If the Senate were to always reject, the President would only nominate moderates to avoid smoking guns. But then the Senate could confirm the moderates it was seeking.
Thus, both parties mix. Put differently, the President sometimes bluffs and sometimes does not; the Senate sometimes calls what it perceives as bluffs and sometimes lets them go.
These devious behaviors have an unfortunate welfare implication–both parties are worse off than if they could agree to appoint a moderate. Since the Senate mixes, it must be indifferent between accepting and rejecting. The indifference condition means that the Senate receives its rejection payoff in expectation, which is worse than if it could induce the President to appoint a moderate. Meanwhile, the President is also mixing, so he must be indifferent between nominating a moderate and nominating an extremist. But whenever he nominates a moderate, the Senate sometimes rejects. This also leaves the President in a worse position than if he could credibly commit to appointing moderates exclusively.
Further, we show that the President and Senate can only benefit from more information about judicial nominees when they are ideologically opposed. And yet there seems to be little serious effort to change the current charade of judicial nominee hearings. (During Clarence Thomas’s hearing, when asked whether Roe v. Wade was correctly decided, he unconvincingly replied that he did not have an opinion “one way or the other.”) Why not?
The remainder of our paper investigates this question. We point to the potential benefits of keeping nominee ideology secret when the Senate is ideologically aligned with the President. Under these conditions, the President can nominate extremists and still induce the Senate to accept. Keeping the process quiet allows the President to nominate such extremists without worrying about suffering reputation costs as a result. Consequently, the current system persists.
Although our focus is on judicial nominations, the same obstacles are likely present in other nominations processes. And coming from an IR background, I have been thinking about similar situations in interstate bargaining. In any case, please check out the paper if you have a chance. We welcome your comments on it.
Two years ago today, I published the first incarnation of Game Theory 101: The Complete Textbook. (It was incomplete back then, heh.) Every summer, I like to go through it and make changes where I can. This time around, I decided to add a new lesson on games with infinite strategy spaces, like Hotelling’s game, second price auctions, and Cournot competition. I have correspondingly added some content to the MOOC version. Videos below.
Initially, I was hesitant to add more material to the textbook because Amazon’s fee increases as the file size of the book increases. Yet, the size of the textbook shrank because I cut down on unnecessarily wordy sentences. (Switching “is greater than” to “beats” probably chopped off 300 words from the book.)
The optimistic interpretation: Readers now learn more while reading less!
The pessimistic interpretation: I really, really need to work on writing shorter sentences.
According to a new study by Menusch Khadjavi and Andreas Lange, prisoners cooperate more frequently in prisoner’s dilemmas than college students. Here’s the abstract from their article “Prisoners and Their Dilemma”:
We report insights into the behavior of prisoners in dilemma situations that so famously carry their name. We compare female inmates and students in a simultaneous and a sequential Prisoner’s Dilemma. In the simultaneous Prisoner’s Dilemma, the cooperation rate among inmates exceeds the rate of cooperating students. Relative to the simultaneous dilemma, cooperation among first-movers in the sequential Prisoner’s Dilemma increases for students, but not for inmates. Students and inmates behave identically as second movers. Hence, we find a similar and significant fraction of inmates and students to hold social preferences.
I have always thought that the prisoner’s dilemma was a terrible example of strict dominance for introductory classes in game theory. Students tend to shoot back with “snitches get stitches” or something similar, so prisoners would cooperate in such a situation. This leads to an awkward conversation about expected utilities when all you really want to do is explain the logic of strict dominance. The study further suggests we drop the prisoner story from the prisoner’s dilemma.[1]
Nevertheless, after having read the article, I have serious problems with the results. The article is extremely short on theoretical mechanisms, so I am going to step in and provide some speculation. Here are three explanations for the result:
1. The mechanism the authors would prefer is that prisoners are incredibly strategic. To quote Red from Shawshank Redemption, prison time is slow time. Prisoners have nothing better to do than plot, strategize, and scheme against one another.[2] While this might initially appear detrimental to cooperation, the truth is just the opposite. Everyone knows you can’t get away with doing stupid things, so people don’t bother. As such, prisoners cooperate–even though the game is anonymous, cooperation is less likely to lead to a witch hunt later and less likely to cause problems greater than a few Euros worth of phone credit.
2. Prisoners didn’t believe that the game really is anonymous. The experimenters stressed to participants that they would be the only one to see the results. However, this is utterly ridiculous. No one, NO ONE, NO ONE took that statement at face value. Marek Kaminski, a political scientist at UC Irvine, spent some time in a Polish prison for publishing anti-communist materials way back in the day. He wrote Games Prisoners Play based off of his experiences.[3] It is a compelling read. Kaminski might be the only serious social scientist to spend time in a prison. He makes a big point that we really shouldn’t trust any studies of prisoners simply because prisoners do not trust supposedly confidential experimenters. At all. Prisoners might have cooperated in the study because they believed the prison staff would see the results, or other prisoners would see the results, or whatever. College students, meanwhile, know the results will be confidential and are therefore free to defect all they wish.
3. The results have nothing to do with the prison/free dichotomy but rather education levels. Non-prisoners in the study were all college educated. The prisoners averaged below 10 years of schooling. The authors only obtain statistically significant results without any controls. Once they popped in education level, the only thing that was statistically significant was coffee consumption. (Good luck explaining that result theoretically!) But education and the prisoner dummy are about as multicollinear as multicollinearity gets. We probably shouldn’t trust the coefficients on either of those variables. But this also would mean that education would be statistically significant if you ran a regression without any controls. Perhaps the prisoners just don’t see that defection strictly dominates cooperation. The data tell us nothing here.[4]
On the surface, the paper is neat. However, authors of any quantitative model need to think hard about their data generating process and then construct their research design model accordingly. The authors don’t do that here, especially when it comes to point 3. As such, this study is…lacking.
[1]But this is a coordination problem, and we are well past that tipping point.
[2]This is why Orange Is the New Black is an interesting series.
[3]The authors do not cite Kaminski. It should be required reading (and a required citation) for all studies of prisoners.
[4]I would imagine someone has previously studied the effect of education level on prisoner’s dilemma cooperation, but I am unaware of any such study and the authors do not cite any.