By Warren D. Smith; heavily revised March 2009 after learning about paper by Anthony Quas. (Skip to summary) (Romanian translation)
Monotonicity is the property of a voting system that both
One can also consider sets of voters increasing their vote for C in (I), causing (if monotonicity fails) C to stop winning, etc.
Unfortunately, Instant Runoff Voting (IRV) is not monotonic; indeed, it fails both criteria. Two realworld IRV elections (both held in 2009) illustrating this were Burlington VT (USA) and Frome Australia.
Criterion I fails: when two voters change their vote to C, that stops C from winning. Details: C wins this 17voter IRV election (B is eliminated and 1 more of his 5 votes transfer to C than to A, causing C to win 9:8 over A). But after the two A>C>B voters instead give their vote to C, then B wins. (In the new election, A is eliminated and his 4 votes transfer to B, causing B to win 9:8 over C.) This occurred in the Louisiana 1991 governor election, which was a (noninstant) runoff election with 12 candidates but only the top 3 got over 410,000 votes each; the remaining 9 candidates each got under 83,000 individually and under 125,000 in the aggregate. It also occurred in Burlington Vermont 2009 (a genuine IRV election for which we have full set of ballots) and Peru 2006 (a top2 runoff election in which the top 3 candidates got 79% of the votes). 
 
Criterion II fails: when two voters change their vote away from B, that causes B to win. Details: C wins this 17voter IRV election (A is eliminated and his 4 votes transfer to C so that C wins 9:8 over B). But after 2 of the B>A>C voters instead give their vote to A, then B wins. (In the new election, C is eliminated and his votes transfer to B so that B wins 11:6 over A.) This occurred in Frome Australia 2009 and also (in an altered form, with extra Currie voters added, of) the Irish 1990 presidential election, which was an IRV election with 3 candidates. 

The two particular election examples above both feature "Condorcet cycles" C>A>B>C and C>B>A>C respectively. However, both these kinds of nonmonotonicity can also occur without a cycle:
Criterion I fails: when two voters change their vote to C, that stops C from winning. Details: C wins this 17voter IRV election. But after the two A>C>B voters instead give their vote to C, then B wins.

 
Criterion II fails: when two voters change their vote away from B, that causes B to win. Details: C wins this 17voter IRV election. But after 2 of the B>A>C voters instead give their vote to A, then B wins. 

In all the examples above, C was the IRV winner while the plainplurality winner was somebody else (or it was a tie). TypeII (but not typeI) failure can also occur in situations (both with or without a Condorcet cycle) where C simultaneously is both the IRV and plurality winner:
Criterion II fails: when two voters change their vote away from B, that causes B to win. There is a Condorcet cycle in this example: C>A>B>C. Details: C wins this 21voter IRV election. (Also the plainplurality winner.) But after 2 of the B>A>C voters instead give their vote to A, then B wins.

 
Criterion II fails: when two voters change their vote away from B, that causes B to win. This example has no cycle. Details: C wins this 21voter IRV election. (Also the plainplurality winner.) But after 2 of the B>A>C voters instead give their vote to A, then B wins. 

There is one saving grace for IRV, though: it is not possible for both typeI and typeII failure to occur in the same election at the same time in the same way. That is, there is no IRV election in which X, by giving away some votes to the current IRVwinner W, can cause X to win and W to stop winning. This disjointness is easy to see and will implicitly underlie our probability calculations below.
For now, let's consider only 3candidate IRV elections. (We shall consider IRV elections with more than 3 candidates later; in them monotonicity failures presumably are more frequent. If there are two or fewer candidates, failure is impossible. In the limit of a large number of candidates, we shall see failure becomes overwhelmingly likely.) In the "random elections model" [which has also been called the "Impartial Culture" in the literature, see, e.g. Ilia Tsetlin, Michel Regenwetter, Bernard Grofman: The impartial culture maximizes the probability of majority cycles, Social Choice and Welfare 21,3 (Dec 2003) 387398; William V. Gehrlein: Weighted scoring rules, the impartial culture condition, and homogeneity, Quality and Quantity 20,1 (1986) 85107; etc.] monotonicity failures of type (I) occur about once per 45 elections and of type (II) occur once per 8.2 elections. In total, monotonicity failure occurs once every 6.9 elections, i.e. 14.5% of the time.
IRV advocates have falsely claimed that nonmonotonicity in IRV elections is very rare.
One reason was a published paper by Crispin Allard in which he made an incorrect calculation of these numbers. Allard made at least 3 different errors, all of which pushed his number erroneously too small.
As a result of these errors, Allard got a number about 1000 times smaller than the truth. His incorrect paper was Estimating the Probability of Monotonicity Failure in a UK General Election, Voting Matters 5 (January 1996).
A second reason was the perception by IRV advocates that if phenomena like nonmonotonicity occurred in an important real election, then the press would shout all about it; since they were unaware of the press ever doing that, therefore, it must never occur in real life. Since we have mentioned very important realworld elections above in which monotonicity (and other properties too!) failed (and the press, as far as I can tell, never mentioned that even once) we know that perception was erroneous. IRV property failure is quite common in real life – it occurs at rates comparable to those predicted by the random elections and Dirichlet models – despite the lack of mention of that in the press.
It actually is possible to derive exact formulas for these probabilities using "Schläfli functions" (see puzzle #4). But a simpler approach is just to perform "Monte Carlo experiments" where you simply try lots of random elections to see how often these phenomena occur.
Here is a simple computer program which does this for typeI nonmonotonicity:
for NNL from 2 to 5 do NN := 6*10^NNL; #NN = no. of experiments to try = 600, 6000, 60000, 600000 FailCt := 0; #counter initially zero CAwinsCt := 0; #this counter also initially zero for i from 1 to NN do #perform NN experiments #step 1: generate random normal numbers as the 6 kinds of vote count ABC := RandomNormal(); ACB := RandomNormal(); BAC := RandomNormal(); BCA := RandomNormal(); CAB := RandomNormal(); CBA := RandomNormal(); if(CAB+CBA > BAC+BCA and #B elimd in original IRV election ABC+ACB > BAC+BCA and #B elimd in original IRV election CAB+CBA+BCA > ABC+ACB+BAC #C wins original IRV election ) then CAwinsCt := CAwinsCt+1; #to simplify, we only examine elections which "C" won with "A" second t := min(BAC+BCA, 0.5*(CAB+CBA+ACB+ABC), ABC); #CORRECTION added March 2009 if( #after ACB votes (plus ABCt votes of form ABC) moved from A to C then A elimd... t+BCA+BAC > CAB+CBA+ACB+ABCt #...whereupon B wins the new IRV election ) then FailCt := FailCt+1: #increment counter since monoI failure example seen fi; fi; od; print(FailCt, " out of ", CAwinsCt, " typeI mono failures where C won but stops winning if given ACB votes"); od; 
Note the correction (in red) added March 2009. The old uncorrected code had read
if(ABC < BAC+BCA and #after ACB votes moved from A to C then A elimd... ABC < CAB+CBA+ACB and ABC+BCA+BAC > CAB+CBA+ACB #...whereupon B wins the new IRV election ) 
This old code underestimated the typeI nonmonotonicity probability because it erroneously thought that it was only worth converting ACB votes to C votes to stop C winning. In fact, it sometimes also helps to convert some ABC votes too.
The output of this program (two runs, using any normal distribution with much larger positive mean than its standard deviation) was:
13 out of 92 typeI mono failures where C won but stops winning if given ACB votes 125 out of 1015 typeI mono failures where C won but stops winning if given ACB votes 1278 out of 9985 typeI mono failures where C won but stops winning if given ACB votes 12134 out of 100008 typeI mono failures where C won but stops winning if given ACB votes 8 out of 89 typeI mono failures where C won but stops winning if given ACB votes 123 out of 1021 typeI mono failures where C won but stops winning if given ACB votes 1240 out of 10003 typeI mono failures where C won but stops winning if given ACB votes 12088 out of 99549 typeI mono failures where C won but stops winning if given ACB votes 
Hence we conclude the typeI failure probability for IRV 3candidate elections is
(13+125+1278+12134 + 8+123+1240+12088)/(92+1015+9985+100008 + 89+1021+10003+99549)which with a generous error bar is (12.2±0.1)%, or 1 failure per 8.2 elections.
= 27009/221762
= 12.18%
Here is the analogous program for typeII monotonicity failure; it yields (2.312±0.001)% failure rate.
for NNL from 2 to 5 do NN := 6*10^NNL; #NN = no. of experiments to try = 600, 6000, 60000, 600000 FailCt := 0; #counter initially zero CBwinsCt := 0; #this counter also initially zero for i from 1 to NN do #perform NN experiments #step 1: generate random normal numbers as the 6 kinds of vote count ABC := RandomNormal(); ACB := RandomNormal(); BAC := RandomNormal(); BCA := RandomNormal(); CAB := RandomNormal(); CBA := RandomNormal(); if(CAB+CBA > ABC+ACB and #A elimd in original IRV election BAC+BCA > ABC+ACB and #A elimd in original IRV election CAB+CBA+ACB > BCA+BAC+ABC #C wins original IRV election ) then CBwinsCt := CBwinsCt+1; #to simplify, we only examine elections which "C" won with "B" second x := min(BCA+BACCABCBA, 0.5*(CBA+BCA+BAC ABCACBCAB) ); if( x>0 and CAB+CBA < ABC+ACB+x and CAB+CBA < BAC+BCAx #after x BAC (andor BCA) votes moved from B to A then C elimd & B wins #in older work years back, I forgot this last condition and hence got "4.9% failure rate"; #when the older program is corrected, it agrees with this one. ) then FailCt := FailCt+1: #increment counter since monoII failure example seen fi; fi; od; print(FailCt, " out of ", CBwinsCt, " typeII mono failures where C won but B wins if B gives A votes"); od; 
Fortunately this program seems not to exhibit any error analogous to the one I corrected (March 2009) in the first program.
A different probabilistic model is sometimes proposed in which the 6 votenumbers (for the 6 vote types ABC, ACB, BAC, BCA, CAB, CBA) are to be chosen uniformly in the 5dimensional simplex
(using 3letter variable names). I call this the "Dirichlet model." If you replace "RandomNormal()" in the above computer programs by "RandomExponential()" then you get that effect (as Dirichlet was the first to point out).
In the Dirichlet model, for 3candidate IRV elections I found in this way using the uncorrected program that the probability of typeI monotonicity failure was (2.2140±0.0057)% which agrees mildly well with Eivind Stensholt's finding [Nonmonotonicity in AV, Voting Matters 15 (June 2002)] that "2.17% of the profiles are nonmonotonic." ( Incidentally, Stensholt perhaps erroneously thinks the Dirichlet model is the same as the Impartial Culture model, and erroneously is unaware of typeII failure. Stensholt did not cite the erroneous Allard paper, which if he had seen it would have let him know about typeII. ) However, this was wrong! I thought it was right for several years because of the agreement with Stensholt, but then I found a paper by Anthony Quas in which he found a substantially larger number, and soon after I found the error (which Stensholt and I had both made)! After running the corrected program, I find that
The probability of typeI monotonicity failure is (4.5133±0.0008)%.
This agrees with Quas's exact finding of 13/288=4.513888...%
Anthony Quas: Anomalous outcomes in preferential voting (pdf), Stochastics & Dynamics 4,1 (2004) 95105.
and the independently computed (also exactly) finding of 13/288 also by
D.Lepelley, F.Chantreuil, S.Berg: The likelihood of monotonicity paradoxes in runoff elections Mathematical Social Sciences 31,3 (June 1996) 133146.
It is surprising how greatly the number changes when we change the underlying probabilistic model in this way: 12.2% in the first model versus only 4.5% in the second.
For typeII monotonicity, while we in our original Impartial Culture model found a
2.312% monotonicityfailure rate,
in the Dirichlet model we get
(0.98±0.094)%.
A computer program I had written independently years before for the purpose
of correcting Allard's calculation had found
(1.050±0.001)%
which agrees fine with, and confirms our present
(0.98±0.094)%;
but unfortunately a later longer run of my present program found
(0.90935±0.0001)%.
Allard's incorrect number was 0.025%.
(Actually it ought to be possible in the Dirichlet model to express all these
probabilities as exact rational numbers
arising from polytopevolume ratios, as opposed to our approximate MonteCarlo results.
Perhaps the present number is 1/110=0.0090909...
)
In this model, the N candidates are randomuniform points on the real interval (0,1). The voters are the uniform distribution on this same interval. Each voter orders the candidates besttoworst in order of increasing distance away from her.
Quas says that he has evaluated the probability of a typeI monotonicity failure when N=3 in this model and it is exactly 5/72=6.9444...%. My computer (MonteCarlo calculation) finds (6.945±0.002)%, confirming Quas. Quas [after some clarification of his wording] claims the following characterization of 3candidate typeInonmonotonic IRV elections in the political spectrum model:
There is a nonmonotonic [type I] allocation of votes if and only if the central candidate gets more than 25% of the vote and is the first to be eliminated. (In this case, the centrist automatically is a "Condorcet winner.") To see this, note that in this case, the nonwinning outer candidate can give the winner enough of her vote to have slightly fewer votes than the centrist. She then is eliminated (now having the fewest votes) and all her votes transfer to the centrist rather than to the more extreme winner, allowing the centrist to become the new winner.
That was, in fact, exactly what happened in Burlington 2009.
Apparently 3candidate typeIInonmonotonic IRV elections are impossible in this model. We thank Peter J. Sloan for pointing this out.
At first, confusedly, I had thought the following election was nonmonotonic [type II]. [As a picture:0==============P=======M===============E=1With Instant Runoff Voting (IRV with honest voters), E is eliminated, then M wins (versus P). With plurality, P just wins.
But if some P supporters dishonestly & strategically give their P votes to E (i.e change their honest votes P>M>E to E>M>P), then M is eliminated in IRV, whereupon P wins. That's nonmonotonicity: fewer P votes cause P to win.]
My computer finds the frequency of occurrence of that kind of failure is (2.7778±0.0003)%, which suggests to me that the exact answer is 1/36=2.7777...%. However this does not meet the technical definition of nonmonotonicity! This election is an example of strategic ballot reversal which is actually an even crazier pathology than nonmonotonicity – maximally lying about your preferences causes your favorite to win, while honest voting would cause him to lose. But because the strategists are also being dishonest about M vs E, this strictly speaking is not "nonmonotonicity" which would involve dishonesty about P only.
Peter Sloan's impossibility proof (sketch):
Let's call the plurality winner P, the central candidate M, for moderate, and the other candidate E ("extremist"). The only honest ranking for P's voters allowed under the 1D political spectrum is P>M>E. Now to get a nonmonotonicity typeII paradox, we need some IRV voters to demote P in their vote, causing P to win. This will not work if the voters honestly ranked P second or third since it will not alter the IRV elimination order (except perhaps by eliminating P). So we must have some P voters switching from P>M>L to either L>M>P or L>P>M, paradoxically causing P to win. That's disallowed since this does not leave the relative rankings of the rest (i.e. L and M) unchanged. The honest ranking is M>L. Or finally, P>M>L could switch to M>L>P or M>P>L, but this will not help eliminate M, so it won't work. Conclusion: typeI nonmonotonicity is the only kind allowed in a threecandidate race under the political spectrum model. Q.E.D.
THEOREM: Consider Ncandidate elections in either
PROOF SKETCH: The Dirichlet case of this theorem was proved by Quas (in the article we cited) as his "theorem 1." The other case can be proved by a parallel argument pointed out by Warren D. Smith in 2009. In both cases the argument proceeds roughly as follows.
Let X be the IRV winner and let Y be preferred over X by a majority (earlier lemmas by Smith & Quas had shown Y exists with probability→1 since X is a Condorcet winner with probability→0 since indeed with probability→1 no Condorcet Winner exists; see this puzzle for the proofs of those lemmas). Change votes saying "X>Y" to now vote X top (apparent "improvement" for X). Now slightly less than half the ballots say X is top. But Y is now the IRV winner with probability →1 since it is ranked top by about 1/N of the voters (versus about 0.5/N for everybody else besides X) hence with probability →1 Y survives to the final IRV round then beats X. (That, to be solid, really required some estimates, but Quas made them.) So the "improvement" stopped X winning.
More details:
In case 1 (the random election model with V→∞),
Quas's beta distributions become normal distributions
(correlated ones
with known correlations).
Quas's finding (at top of p.104 in his published article) that
Q.E.D.
REMARK: The above theorem and proof (with slight changes) should also work for WBSIRV instead of IRV. For BTRIRV it will work provided that once the BTRIRV process winnows it down to 3 survivors, either the final two besides X both pairwisedefeat X, or the one that doesn't is pairwisedefeated by the other one. Those requirements are not necessarily going to be satisfied by themselves (although they should be satisfied with a decent probability, say 0.4 or so), but can presumably be assured with probability→1 with some slight amount of malice aforethought – the undesired survivors need to be eliminated in earlier rounds because X gets raised ahead of them in ballots a little more often.
CONJECTURE:
In the 1D political spectrum model with N candidates,
the IRV election is nonmonotonic (typeI failure) with
PROOF SKETCH: We now sketch a proof of Quas's conjecture that the IRV election in the 1D political spectrum model is nonmonotone with probability→1 when N=#candidates→∞. I'm optimistic this proof sketch is, in fact, convertible into a genuine proof, but (as of 20 March 2009) not yet willing to say I'm sure.
By Black's singlepeakedness theorem, a Condorcet winner Y exists and is the candidate located nearest to 1/2. Let X be the IRV winner. Let us select a candidate C somewhat nearer to 1/2 than X (but on the same side) and (say) having a comparatively large initial Voronoi region (i.e. comparatively large initial number of toprank votes). [We shall need a lemma here that with probability>1O(ε), the distance between X and 1/2 exceeds ε and there are numerous candidates located between 1/2 and X.] Consider a cabal, organized by me, of voters who raise X in their votes and whose strategic goal is to create a "buffer zone of safety" for C. That is, some of the voters who prefer Z>C>X, will raise X above Z. The point is this will cause C's nearby rivals (Z) to be eliminated before C is eliminated (even though C would normally have been eliminated because squeezed between two rivals). In this way it seems to me we can keep C alive. In the same way it seems to me we can enhance the survival of other selected candidates. The goal is to induce a specific configuration to approximately appear at each stage of the IRV process, essentially independently of the initial candidatelocations. We make IRV winnow the candidates down according to a fixed timeevolution of the distribution, and with the survivors artificially uniformlyspaced. (The point of uniform spacing is small changes tip the balance; it gives us more control for fewer "bucks expended.") Eventually C will become the IRV winner, entirely due to voters raising the old IRVwinner X in their votes with no other change. Suppose at the stage where there are M survivors, all the intercandidate spacings are uniformly const/M but with additive "noise" of order ±(N/M)^{P}/M for some constant exponent P with 0<P<1. (I think this all will work with P=2/3.) Then we ought to be able to control things to eliminate the evennumbered candidates, thus getting M/2 survivors also highlyuniformly spaced, while expending about const·(N/M)^{P} votealterations (i.e. raising X in that fraction of the votes).Before starting the above argument, we let the IRV process run normally (unperturbed) until it has winnowed down the field to, say, N^{4/5} survivors, and only after that point try to perturb the process. IRV naturally, since it eliminates candidates with small Voronoi regions, will tend to uniformize intercandidate spacings. Indeed, believe it can be shown as a lemma that (with probability→1 when N→∞), after IRV has winnowed down the field to O(N^{κ}) candidates with 0<κ<1, the spacings between the remaining candidates automatically will be uniform to within a constant factor. If we select a clump of all the candidates within, say N^{3/5} of the N^{1/5}spaced lattice points and make sure (via vote manipulation) there is one survivor in each clump then we get a situation with nearuniform spacing.
After that, the ballot alterations ("raising X above Z") can be done to manipulate each of the IRV process without affecting preceding stages. For example, suppose we have a ballot of the form "Q>Q>Q>Q>Z>C>...>X>..." where the Qs stand for candidates eliminated in preceding IRV rounds. We can then raise X above Z to get "Q>Q>Q>Q>X>Z>C>..." and all the preceding elimination decisions are unaffected. (We can pledge never to raise X to toprank; with probability→1 that will not sacrifice enough control to hurt anything.) This way, if we work out manipulations to make the field after IRV round #2^{57} (say) come out the way we want, we then can also add manipulations to make field #2^{58} come out as desired without destroying the preceding gambits that had manipulated the fields after rounds 2^{57}, 2^{56}, etc.
The total expenditure (i.e. #ballots altered) over the whole IRV process is then upperbounded by about const·∑_{k}(N/2^{k})^{P} which is a convergent geometric series with sum=const·N^{P}. This has been a sketch of (basically) an inductive proof that we can control the whole history of the IRV election (with success probability→1) by raising X within const·N^{P} ballots. In that case at the end of the day the IRV process will have winnowed it down to X versus C. Thereupon C will have a larger Voronoi region than X and hence will win because the extra votes X got, namely const·N^{P}, is sublinear and not enough to alter the conclusion that C is majoritypreferred over X (with success probability>1N^{P1} or so, anyhow – e.g. if X unluckily were located very near to 1/2 this scheme wouldn't work).
It appears than many (say at least N^{0.1}) choices of "C" would be possible in this argument, proving not only that nonmonotonicity almostsurely happens, but it also almost surely happens in a huge number of simultaneous ways. Q.E.D.
REMARK: The above conjecture and proof sketch will not work for WBSIRV and For BTRIRV instead of IRV, becasue they both always elect Condorcet winners, one exists with probability→1 in this model, and raisng a Condrocet winner in votes always still leaves him a Condorcet winner.
Quas had in mind a different proofargument for this conjecture [he gives it just before the start of section 3 of his paper]. His idea would be for Xraising voters to try to enhance the survival of candidates located near to X who would normally have been eliminated. Then at some point X will find itself surrounded by two surprisinglystillalive rivals, and hence will be eliminated. Quas notes the lemma that, in this 1D model, any candidate who, at any stage in the IRV process, is a "local minimum," i.e. whose neighbors each have more toprank votes than it does, cannot win. Thus, raising X in votes stops X winning.
Regardless of whether we can prove it, the conjecture seems to be true since Quas tells me he has "convincing computer evidence" for it.
Another open problem – what about "noshow paradoxes"? It is an open question whether an analogue of this "→100% failure theorem" holds not only for nonmonotonicity, but also for "participation failure." (Conjecturally, it does.) See puzzle#55c.
Probabilistic model  typeI failure prob  typeII failure prob  monotonicity failure prob 

Randomelections = Impartial culture  (12.157±0.001)%  (2.3119±0.0005)%  (14.469±0.002)% 
Dirichlet [uniform in 5simplex]  13/288=4.513888...%  (0.90935±0.0001)%  (5.4232±0.0001)% 
1D "Political spectrum"  5/72=6.9444...%  (2.7778±0.0003)%* or 0  (9.7222±0.0003)% or 5/72 
* Actually the 2.7778% is for "ballot reversal" strategy and not typeII nonmonotonicity. Strictly speaking the latter is impossible.
The best approximation of reality probably lies somewhere between the Dirichlet and Impartial Culture models... and Quas's "1D political spectrum" model evidently does.
Probabilistic model  typeI failure prob  monotonicity failure prob 

Randomelections = Impartial culture  →100%  →100% 
Dirichlet [uniform in (N!1)simplex]  →100%  →100% 
1D "Political spectrum"  →100% (conjectured)  →100% (conjectured) 
Also note that multiwinner STV elections can be approximately regarded as a sequence of IRV singlewinner elections, and multiwinner elections tend to have a lot of candidates. Therefore, it would seem practically guaranteed that such STV elections will exhibit nonmonotonic embarrassments – with more winners and more candidates making these embarrassments more likely.
Please notify warren.wds AT gmail.com of any improved/corrected values.