Kemeny-Young method

From Wikipedia, the free encyclopedia

The Voting series:

This series is part of the
Politics and the Election series

Politics Portal · edit

The Kemeny-Young method is a voting system that uses preferential ballots, pairwise comparison counts, and sequence scores to identify the most popular choice, and also identify the second-most popular choice, the third-most popular choice, and so on down to the least-popular choice.

The system was developed[1] by John Kemeny in 1959. In 1974, Peyton Young showed[2] that this method was the unique neutral and anonymous method satisfying Pareto efficiency, reinforcement, and local IIA. In 1991 Richard Fobes independently discovered a mathematically equivalent variation that forms the core of VoteFair ranking.[3]


Contents

[edit] Description

The Kemeny-Young method makes use of preferences expressed on order-of-preference ballots. A voter is allowed to rank more than one choice at the same preference level. A paper-based order-of-preference ballot looks like the type of preferential ballot in which ovals in different columns designate different preference levels. To minimize invalid hand-marked order-of-preference ballots, unmarked choices should be interpreted as least-preferred, and more than one preference level should be possible for the same choice, but only the highest-marked preference level (for each choice) should be used.

Kemeny-Young calculations are usually done in two steps. The first step is to create a matrix or table that counts pairwise voter preferences. The second step is to test all possible order-of-preference sequences, calculate a sequence score for each sequence, and compare the scores. In the VoteFair variation, each sequence score equals the sum of the pairwise counts that apply to the sequence, and the sequence with the highest score is identified as the overall ranking, from most popular to least popular. The original Kemeny method uses pairwise counts of voters who oppose each pairwise order, and the lowest sequence score is identified.

The following two paragraphs (from Ending The Hidden Unfairness In U.S. Elections[4], used with the author's permission, and available for public use) describe the Kemeny-Young method in a way that can be used in legal documents for small organizations:

After the candidates for the offices have been nominated, the officers of this organization shall be elected as follows. Eligible members shall be given ballots that contain the names of the candidates grouped according to their desired office. To the right of each name shall be markable locations, such as empty ovals, arranged in columns labeled First choice, Second choice, Third choice, and so on, progressing from left to right. Each voter shall mark these locations on their ballot to indicate their first choice, second choice, third choice, and so on for each office. The left-most mark among multiple marks given to the same candidate shall be used as the voter’s preference level. More than one candidate can be marked at the same preference level. The absence of a mark for a candidate indicates the lowest preference. VoteFair ranking, as explained below, shall be used to identify the most popular candidate for each office, and the most popular candidate for each office shall win the election for that office. If there is a tie for first place, the counting of votes and the VoteFair ranking shall be repeated. If the recount also indicates a tie, the outgoing Treasurer (or some other designated official) shall choose how to resolve the tie.

VoteFair ranking shall be done using software (such as accessible at www.VoteFair.org) that performs the following calculations. The preferences indicated in the ballots are counted to produce a tally table in which all the possible pairs of candidates are listed, one number for each pair indicates the number of voters who prefer one candidate in the pair over the other candidate in the pair, another number for each pair indicates the number of voters who have the opposite preference for these two candidates, and a third number for each pair indicates the number of voters who express no preference between the two candidates. Using a computer, each possible sequence of candidates is considered, where a sequence consists of one of the candidates being regarded as the most popular candidate, another candidate being regarded as the second-most popular candidate, and so on. For each such sequence the numbers in the tally table that apply to that sequence are added together to produce a sequence score for this sequence. The sequence that has the highest sequence score indicates the overall order of preference for the candidates. If there is more than one sequence that has the same highest score, the sequences with this score shall be analyzed to identify one or more ties at one or more preference levels.

[edit] Example

Tennesee's four cities are spread throughout the state

Imagine that the population of Tennessee, a state in the United States, is voting on the location of its capital. The population of Tennessee is concentrated around its four major cities, which are spread throughout the state. For this example, suppose that the entire electorate lives in one of these four cities, and that they would like the capital to be established as close to their city as possible.

The candidates for the capital are:

  • Memphis, the state's largest city, with 42% of the voters, but located far from the other cities
  • Nashville, with 26% of the voters
  • Knoxville, with 17% of the voters
  • Chattanooga, with 15% of the voters

The preferences of the voters would be divided like this:

42% of voters
(close to Memphis)
26% of voters
(close to Nashville)
15% of voters
(close to Chattanooga)
17% of voters
(close to Knoxville)
  1. Memphis
  2. Nashville
  3. Chattanooga
  4. Knoxville
  1. Nashville
  2. Chattanooga
  3. Knoxville
  4. Memphis
  1. Chattanooga
  2. Knoxville
  3. Nashville
  4. Memphis
  1. Knoxville
  2. Chattanooga
  3. Nashville
  4. Memphis

This matrix summarizes the corresponding pairwise comparison counts:

Memphis Nashville Chattanooga Knoxville
Memphis - 42% 42% 42%
Nashville 58% - 68% 68%
Chattanooga 58% 32% - 83%
Knoxville 58% 32% 17% -


The Kemeny-Young method arranges the pairwise comparison counts in the following tally table:

All possible pairs
of choice names
Number of votes with indicated preference
Prefer X over Y Equal preference Prefer Y over X
X = Memphis
Y = Nashville
42% 0 58%
X = Memphis
Y = Chattanooga
42% 0 58%
X = Memphis
Y = Knoxville
42% 0 58%
X = Nashville
Y = Chattanooga
68% 0 32%
X = Nashville
Y = Knoxville
68% 0 32%
X = Chattanooga
Y = Knoxville
83% 0 17%

The sequence score for the sequence Memphis first, Nashville second, Chattanooga third, and Knoxville fourth equals (the unit-less number) 345, which is the sum of the following annotated numbers.

42% (of the voters) prefer Memphis over Nashville
42% prefer Memphis over Chattanooga
42% prefer Memphis over Knoxville
68% prefer Nashville over Chattanooga
68% prefer Nashville over Knoxville
83% prefer Chattanooga over Knoxville

The following table lists all the sequence scores.

First
choice
Second
choice
Third
choice
Fourth
choice
Sequence
score
Memphis Nashville Chattanooga Knoxville 345
Memphis Nashville Knoxville Chattanooga 279
Memphis Chattanooga Nashville Knoxville 309
Memphis Chattanooga Knoxville Nashville 273
Memphis Knoxville Nashville Chattanooga 243
Memphis Knoxville Chattanooga Nashville 207
Nashville Memphis Chattanooga Knoxville 361
Nashville Memphis Knoxville Chattanooga 295
Nashville Chattanooga Memphis Knoxville 377
Nashville Chattanooga Knoxville Memphis 393
Nashville Knoxville Memphis Chattanooga 311
Nashville Knoxville Chattanooga Memphis 327
Chattanooga Memphis Nashville Knoxville 325
Chattanooga Memphis Knoxville Nashville 289
Chattanooga Nashville Memphis Knoxville 341
Chattanooga Nashville Knoxville Memphis 357
Chattanooga Knoxville Memphis Nashville 305
Chattanooga Knoxville Nashville Memphis 321
Knoxville Memphis Nashville Chattanooga 259
Knoxville Memphis Chattanooga Nashville 223
Knoxville Nashville Memphis Chattanooga 275
Knoxville Nashville Chattanooga Memphis 291
Knoxville Chattanooga Memphis Nashville 239
Knoxville Chattanooga Nashville Memphis 255

The highest sequence score is 393, and this score is associated with the following sequence, so this is the winning preference order.

Preference
order
Choice
First Nashville
Second Chattanooga
Third Knoxville
Fourth Memphis

If a single winner is needed, the first choice, Nashville, is chosen. (In this example Nashville is the Condorcet winner.)

[edit] Characteristics

In all cases that do not result in an exact tie, the Kemeny-Young method identifies a most-popular choice, second-most popular choice, and so on.

A tie can occur at any preference level. Except in some cases where circular ambiguities are involved, the Kemeny-Young method only produces a tie at a preference level when the number of voters with one preference exactly matches the number of voters with the opposite preference.

If all the sequence scores are calculated, the Kemeny-Young calculations require significantly more computer-calculation time than other voting methods. However, not every sequence score needs to be calculated to identify the maximum sequence score. For each added choice, the time required to calculate every sequence score increases by an amount that exceeds what can be expressed as a polynomial of the number of choices. Historically the full calculation time for cases that involve more than about six choices has been a barrier to the use of the Kemeny-Young method.

The following table summarizes the fairness criteria achieved by the Kemeny-Young method. The ability of the Kemeny-Young method to achieve fairness criteria in situations that do not involve circular ambiguity is indicated because situations that involve electing a person to an office (such as in an organization) seldom involve circular ambiguity in ways that affect who is identified as the most popular.


Criterion Description Satisfied?
Universality Identifies the overall order of preference for all the choices. The method does this for all possible sets of voter preferences, involves no randomness, and always produces the same result for the same set of voter preferences. Universality is a significant advantage over voting systems that only attempt to identify a single winner. Yes
Condorcet criterion If there is a choice that wins all pairwise contests, then this choice wins. Yes
Majority criterion If a majority of voters strictly prefer choice X to every other choice, then choice X is identified as the most popular. Yes
Pareto efficiency Any pairwise preference expressed by every voter results in the preferred choice being ranked higher than the less-preferred choice. Yes
Non-imposition There are voter preferences that can yield every possible overall order-of-preference result, including ties at any combination of preference levels. Yes
Monotonicity If voters increase a choice's preference level, the ranking result either does not change or the promoted choice increases in overall popularity. Yes
Consistency If all the ballots are divided into separate races and choice X is identified as the most popular in every such race, then choice X is not necessarily the most popular when all the ballots are combined. No
Independence of irrelevant alternatives Adding or withdrawing choice X does not change a result in which choice Y is identified as most popular.[5] No
Independence of clones Offering a larger number of similar choices, instead of offering only a single such choice, decreases the probability that one of these choices is identified as most popular (fratricide). No
Invulnerability to burying A voter can displace a choice from most popular by giving the choice an insincerely low ranking. No
Later-no-harm Ranking an additional choice (that was otherwise unranked) can displace a choice from being identified as the most popular. No
Invulnerability to compromising A voter can cause a choice to become the most popular by giving the choice an insincerely high ranking. No
Invulnerability to push-over A voter can cause choice X to become the most popular by giving choice Y an insincerely high ranking. No
Participation Adding ballots that rank choice X over choice Y can cause choice Y, instead of choice X, to become most popular. No
Schwartz The choice identified as most popular may be outside the Schwartz set. No
Non-dictatorship A single voter cannot control the outcome in all cases. Yes
Polynomial runtime The full calculation method can determine the most popular choice in a runtime that is polynomial in the number of choices. No

 An example of a case involving both circular ambiguity and a dependence on irrelevant alternatives is when 5 voters prefer A then B then C, and 4 voters prefer B then C then A, and 2 voters prefer C then A then B, because withdrawing choice B causes C to win instead of A.

[edit] References

  1.  John Kemeny, "Mathematics without numbers", Daedalus, Vol. 88 (1959), pp. 571--591.
  2.  H. Peyton Young, "Optimal voting rules", Journal of Economic Perspectives, Vol. 9 No. 1 (1995), pp. 51–64.
  3.  Richard Fobes, Ending The Hidden Unfairness In U.S. Elections (2006) describes VoteFair ranking in full, provides illustrated examples that demonstrate its use, and argues that VoteFair ranking can be used to improve the fairness of government elections and other voting situations.
  4.  Richard Fobes, The Creative Problem Solver's Toolbox (1993) contains the first published description (on pages 223-225) of what evolved into VoteFair ranking.
  5. Vincent Conitzer, Andrew Davenport, and Jayant Kalagnanam, "Improved bounds for computing Kemeny rankings, 2006.

[edit] External links

  • www.VoteFair.org A website that provides free access to VoteFair ranking calculations. For comparison the calculations also identify the winning choice according to plurality, Condorcet, Borda-count, and other voting methods.
  • www.FullRanking.com A free online tool for using VoteFair ranking to rank priorities, budget categories, and more.