What happened when the Olympics decided not to use range voting

Normally, we think of Olympic judges as using range voting – the highest average score wins. (Albeit with some alterations, such as dismissing the two highest and lowest "outlier" scores for each performance, but the principle remains simple averaging.)

But at the 1995 World Figure Skating Championships, they used a different procedure. Here's journalist Lila Guterman's recounting:

The U.S. champion, Nicole Bobek, had skated into second place behind Chen Lu of China. In third place after her final performance was Surya Bonaly from France. Then, a relatively unknown skater, 14-year-old Michelle Kwan, took the audience and judges by storm with a performance that catapulted her into fourth place.

Ms. Kwan's skate did not alter any of the judges' scores for Ms. Bobek or Ms. Bonaly. But after the votes were tallied, their positions flipped. Ms. Bonaly went home with the silver, and Ms. Bobek won bronze!

That's because the scoring rules called for each judge to rank the skaters and then for the group to determine winners by using a modified plurality vote based on those rankings. Ms. Bobek had had more second-place rankings than Ms. Bonaly until Ms. Kwan skated, but got bumped out of second in enough judges' rankings to give the silver medal to Ms. Bonaly.

Moral: switch from just using scores and averaging (range voting) to using ranking (first, second, third)? Big mistake, which made the figure skaters' association look stupid.

[The following postscript comes from W. Poundstone's book Gaming The Vote:] Then something very similar happened at the 1997 European men's championships. The top three were Urmanov, Zagorodniuk, and Candeloro, in that order, and all three had finished skating. Then Vlascenko skated and came in sixth. Surprise! Vlascenko, just by skating badly, reversed Zagorodniuk versus Candeloro, causing Candeloro to end up with the silver medal. After that, the International Skating Union had had enough. Its chair Ottavio Cinquanta rolled out a new scoring system in 1998, promising "if you are in front of me, then you will stay in front of me!"

But that was a lie – counterexamples were constructed almost immediately. In fact, there was no need to construct a counterexample at all: any voting theorist could tell immediately, with almost no thought whatever, that it had to be a lie, because of Arrow's theorem.


Condorcet, IRV, and Borda are examples of voting systems in which removing or adding a candidate C can cause the relative order of two other candidates A and B to flip, even though not a single voter changes her relative order for A and B. With range voting, that can't happen.
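To make the flip concrete, here is a minimal Python sketch using the Borda count. The five-voter electorate, the candidate labels, and the helper names `borda_scores` and `mean_scores` are all invented for illustration; none of this comes from real skating data. The range-voting half assumes that voters do not rescale their scores for A and B when C appears, which is natural in skating because each skate is scored before later skaters perform.

```python
def borda_scores(ballots, candidates):
    """Borda count over `candidates` only.

    `ballots` is a list of (ranking, number_of_voters) pairs. On each ballot
    a candidate earns one point for every remaining rival ranked below it.
    """
    scores = {c: 0 for c in candidates}
    for ranking, count in ballots:
        kept = [c for c in ranking if c in candidates]
        for place, c in enumerate(kept):
            scores[c] += count * (len(kept) - 1 - place)
    return scores


def mean_scores(ratings, candidates):
    """Range voting: each candidate's score is the mean of its ratings."""
    total_voters = sum(count for _, count in ratings)
    return {
        c: sum(r[c] * count for r, count in ratings) / total_voters
        for c in candidates
    }


# Hypothetical electorate: nobody ever changes their mind about A versus B.
ballots = [(["A", "B", "C"], 3), (["B", "C", "A"], 2)]
ratings = [({"A": 10, "B": 6, "C": 0}, 3), ({"A": 0, "B": 10, "C": 5}, 2)]

without_c = borda_scores(ballots, ["A", "B"])
with_c = borda_scores(ballots, ["A", "B", "C"])
print("Borda without C:", without_c)  # A is ahead of B
print("Borda with C:   ", with_c)     # B is ahead of A

# Range voting: A's and B's means are identical whether or not C is scored.
print("Range without C:", mean_scores(ratings, ["A", "B"]))
print("Range with C:   ", mean_scores(ratings, ["A", "B", "C"]))
```

In this toy electorate, adding C reverses the Borda order of A and B, while the range-voting averages for A and B do not move at all.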


Sources

To learn more about the insane evolution of voting systems used for scoring skaters, we recommend

Maureen T. Carroll, Elyn K. Rykken, Jody M. Sorensen: The Canadians Should Have Won!?, MAA Math Horizons 10 (February 2003) 5-7 (pdf).

For skating, I personally would recommend the following "trimmed mean" system (which, from casually listening to TV commentators talking about Olympic skating, diving, and/or gymnastics, I had thought was the system already in use, although at least sometimes it wasn't):

  1. Judges rate skates on a 0-to-10 scale (using any real number from 0 to 10). Note that we do not support anonymous judges (secret ballots) – we want all scores to be public.
  2. The top K and bottom K scores for each skate are discarded (where K is some pre-agreed constant).
  3. The skater's score is the mean of her remaining (undiscarded) scores.
  4. The highest-scoring skater wins the gold.
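The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `trimmed_mean` and `podium`, the skater labels, and every score below are made up, and the bounds check on K simply ensures that at least one score survives the trimming.

```python
def trimmed_mean(scores, k):
    """Mean of `scores` after dropping the k lowest and k highest values."""
    n = len(scores)
    if not 0 <= k <= (n - 1) // 2:
        raise ValueError("k must satisfy 0 <= k <= (n - 1) // 2")
    kept = sorted(scores)[k:n - k]
    return sum(kept) / len(kept)


def podium(score_sheet, k):
    """Skaters ordered from highest to lowest trimmed-mean score."""
    totals = {name: trimmed_mean(s, k) for name, s in score_sheet.items()}
    return sorted(totals, key=totals.get, reverse=True)


# Hypothetical panel of 7 judges; names and numbers are invented.
score_sheet = {
    "Skater X": [9.0, 9.5, 2.0, 9.2, 9.4, 10.0, 9.1],
    "Skater Y": [9.3, 9.3, 9.4, 9.2, 9.3, 9.5, 9.1],
    "Skater Z": [8.0, 8.4, 8.1, 8.3, 10.0, 8.2, 8.0],
}
print(podium(score_sheet, k=1))  # trimmed mean with K = 1
print(podium(score_sheet, k=0))  # K = 0 is the plain mean (ordinary range voting)
```

In this made-up score sheet, the lone 2.0 given to Skater X and the lone 10.0 given to Skater Z are enough to put Z ahead of X under the plain mean; with K = 1 both outliers are dropped and X finishes ahead of Z.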

This system is exactly the same as range voting, except that the "trimmed mean" is used instead of the mean, i.e. "outlier" scores are discarded. It has these advantages:

  1. It's simple to explain. Olympic head Juan Antonio Samaranch was quoted as saying that systems not simple enough for the public to understand easily were unacceptable for the Olympics (leaving it a mystery how the heck the tremendously complicated OBO-with-random-ignored-judges system was accepted for use in 1998-2004).
  2. If all but at most K judges are honest (on each skate), then each skater's score must lie within the range of the honest judges' scores. I.e. the system is almost immune to "coalitions of ≤K colluding dishonest judges."
  3. All judges are treated a priori the same.
  4. If skater C is added or removed from the competition, or does well, or does badly, that cannot affect the relative rankings of any two other skaters A and B. International Skating Union head Ottavio Cinquanta was quoted as saying such embarrassing "flip-flops" were unacceptable (leaving it a mystery why he soon after chose a system which generates multitudes of them). I'd agree with Cinquanta that flip-flops are a public relations disaster.
  5. There is no "dictator" judge. Hence note this system obeys all three of Arrow's "impossible" desiderata. That is possible because Arrow's impossibility theorem applies only to voting systems based on rank orderings whereas the system we just described is based on numerical ratings.
  6. A caveat: by discarding ("trimming off") the 2K outlier scores for each skate, the "trimmed mean" system can unfortunately discard some important honest judgements. (Not all outliers represent corrupt judges.) We admit that is sad. However, in riposte, note that the trimmed mean system takes into account the actual numerical scores, e.g. it regards a 9.9 skate as worth far more than a 5.4 skate. Rival systems which ignore the numerics, and thus regard the advantage of a 9.9 over a 5.4 as the same as the advantage of a 5.5 over a 5.4, also discard a lot of information, and it seems probable that this has a much more injurious effect than the information-discarding done by trimmed mean.
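The collusion-resistance claim in point 2 can be checked on a toy panel. The sketch below is an illustration only: the five "honest" scores and the three colluding strategies are hypothetical, and `trimmed_mean` is a bare-bones helper written for this example. The underlying reasoning is that at most K scores can fall below the lowest honest score and at most K above the highest, so every score that survives the trimming lies within the honest range.

```python
def trimmed_mean(scores, k):
    """Mean of `scores` after dropping the k lowest and k highest values."""
    kept = sorted(scores)[k:len(scores) - k]
    return sum(kept) / len(kept)


honest = [8.8, 9.0, 9.1, 9.3, 9.4]  # hypothetical honest scores for one skate
k = 2                               # trim two from each end of a 7-judge panel

# Two colluding judges try three strategies: both low, both high, one of each.
results = {}
for colluders in ([0.0, 0.0], [10.0, 10.0], [0.0, 10.0]):
    panel = honest + colluders
    plain = sum(panel) / len(panel)
    results[tuple(colluders)] = (plain, trimmed_mean(panel, k))

for colluders, (plain, trimmed) in results.items():
    print(colluders, "plain mean:", round(plain, 3),
          "trimmed mean:", round(trimmed, 3))
```

The colluders can still nudge the result up or down inside the honest range (hence "almost immune"), but they cannot drag it outside that range the way they can with the plain mean.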

Note that being "immune to coalitions of K strategic voters" is totally antithetical to "democracy" (!!) and hence I do not recommend trimmed-mean range voting for political applications. (Imagine saying to black, or Jewish, or gay, or third-party voters "sorry, the voting system has decided to ignore your votes because you are 'outliers'.") But it does seem a good idea in the skating application. The value of K can be adjusted anywhere from 0 to ⌊(N-1)/2⌋ where there are N judges.

We give a fairly precise description of the 1995 BOM and 1998 OBO scoring systems here – with "flip-flop" examples for each from the 1998 Olympics.

For two popular articles about skating scoring difficulties, see

Lila Guterman: When Votes Don't Add Up, Chronicle of Higher Education 47,10 (3 Nov. 2000) page A18 (2 pages plus 2 photos).

Richard Monastersky: Mathematicians Find Problems With New System for Scoring Figure Skating, Chronicle of Higher Education 49,2 (31 Jan. 2003) page A16.

The Bobek-Bonaly "great flip-flop" is also mentioned in

Farhad Manjoo: Your presidential candidate: Hot or not?, for Salon.com.
