This is the document Jeanne Eisenstadt submitted to the Colorado legislature in 2008. We respond to it here.

SOME THOUGHTS ABOUT RANGE VOTING

Jeanne Eisenstadt

PhD, Social Psychology

Range Voting (RV) vs Instant Runoff Voting (IRV).

Runoff voting is already familiar to voters, and IRV would make only a small change. RV introduces a totally different technique of assessment, and trying to implement it would face many more individual and political obstacles.

J.C. Nunnally, in his 1967 book Psychometric Theory (a brief extract on the validity of rating scales is available on the web), offers several arguments for preferring comparative judgments (of which ranks are one kind) to absolute judgments (of which RV is one kind).

He argues that making comparative judgments is a natural process, known and used by the time we graduate from kindergarten – we can easily say who is bigger, older, smarter, etc. Converting these comparative assessments into absolute numbers does not happen naturally and is not how we make our choices in "real life."

Furthermore, he finds that comparative methods are far more accurate, reliable and consistent than absolute methods.

He states that,

"Whereas people are notoriously inaccurate when judging [absolutely], they are notoriously accurate in making comparative judgments." (emphasis added)

The Rating Scale as a Form of Assessment

In many ways, range voting is equivalent to providing a separate rating for each candidate. Therefore, I looked for what I could find out about rating scales.

Number Of Categories

In the article "Communication Validity and Rating Scales" (1996), W. Lopez states that:

"Each category is intended to increase the discrimination of the rating scale and so to increase the information in all responses. But confrontation by too many response alternatives muddles respondents. Respondents rarely make stable discriminations among more than 6 levels. Sometimes 2 or 4 levels are all they can negotiate. Excess categories introduce more noise than information by forcing respondents to make their fine choices idiosyncratically, such as by preference for even or odd numbering."

(It seems likely that voters under range voting would all be pushed to use the most extreme ratings, which would further nullify the ostensible value of having a large number of values to choose from.)

Missing Zero Point; Unequal Intervals, Or Lack Of Calibration

There is general agreement among psychometricians that raw scores on a rating scale do not correspond to equal and interchangeable units of the quality being measured, even when the scale consists of numbers that appear to be equally distant from each other. Therefore, NO ARITHMETIC OPERATIONS ARE LEGITIMATE. This includes the simple process of adding scores from different people to get a single total. A vote of 6 points is not twice as big as a vote of 3 points, and they cannot be added together to get a total of 9 points.
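To make concrete which operation is in dispute, here is a minimal sketch of the summation that range voting performs. The candidates, the 0 to 9 scale, the ballots, and the function name `range_tally` are all hypothetical, chosen only for illustration; they do not come from any actual election or from the sources cited here.

```python
# Hypothetical ballots: each voter gives every candidate a score from 0 to 9.
# These names and numbers are invented for illustration only.
ballots = [
    {"A": 6, "B": 3, "C": 0},
    {"A": 3, "B": 6, "C": 9},
    {"A": 6, "B": 3, "C": 0},
]


def range_tally(ballots):
    """Add up each candidate's raw scores across all ballots."""
    totals = {}
    for ballot in ballots:
        for candidate, score in ballot.items():
            totals[candidate] = totals.get(candidate, 0) + score
    return totals


totals = range_tally(ballots)
# The candidate with the largest sum is declared the winner.
winner = max(totals, key=totals.get)
print(totals, winner)
```

The addition inside `range_tally` treats one voter's 6 as interchangeable with another voter's two 3s, which is exactly the step the psychometric objection above calls illegitimate.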

In response to this problem, testers have created 5-point and 7-point scales, with 0 in the middle and including descriptions of the meaning of each point, for instance, strongly agree, agree, 0, disagree, strongly disagree. This reduces the difficulty, but does not eliminate it.

A global rating, with no specification of what the numbers mean, has maximum ambiguity and volatility: it is neither reliable (repeatable from one time to another) nor valid (an accurate representation of what, supposedly, is being measured).

In RV, one essentially is reduced to the "operational definition" that degree of preference for a candidate does not exist apart from the operation of assigning points on the ballot.

Ambiguity About How To Express Neutral/Negative Response

Range voting is ambiguous with respect to neutral or negative votes. Because 5- and 7-point scales like those above are so common in other settings, some voters might treat RV as a scale in which the middle number is neutral and the lower numbers are negative. Others might treat it as only the positive part of the scale, with neutral and negative votes undifferentiated, mixed together, and reported as either a 1 – the lowest score – or a 0 – no approval at all. Also included in this heterogeneous mix would be people who want to opt out of the choosing process altogether.

Rater Bias Distorts Ratings

Assertiveness; extremism vs moderation.

Raters – and voters – differ in their habitual preference for using ratings at the extremes, or keeping to the middle/moderate parts of a scale.

Mood, fatigue, and habit may incline a person either toward the positive or toward the negative end of ratings.
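The concern about rating habits can be illustrated with a small sketch. The two ballots below are hypothetical: both voters are assumed to hold the same order of preference, but one habitually uses the ends of a 0 to 9 scale and the other stays near the middle. The helper names `rank_order` and `score_gap` are likewise invented for this example.

```python
# Two hypothetical voters who rank the candidates identically (A, then B, then C)
# but differ in how much of the 0-9 scale they habitually use.
extreme_rater = {"A": 9, "B": 1, "C": 0}
moderate_rater = {"A": 6, "B": 5, "C": 4}


def rank_order(ballot):
    """Return candidates from highest score to lowest."""
    return sorted(ballot, key=ballot.get, reverse=True)


def score_gap(ballot, first, second):
    """Return how many points separate two candidates on one ballot."""
    return ballot[first] - ballot[second]


same_ranking = rank_order(extreme_rater) == rank_order(moderate_rater)
extreme_gap = score_gap(extreme_rater, "A", "B")
moderate_gap = score_gap(moderate_rater, "A", "B")
print(same_ranking, extreme_gap, moderate_gap)
```

Under these assumed ballots, the two voters agree on the ranking, yet the first contributes a much larger point margin between A and B than the second, so a difference in rating style alone changes the totals.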

Citations

J.C. Nunnally, PSYCHOMETRIC THEORY, New York: McGraw Hill, 1967. 2nd Edition, 1978.

W. Lopez, Communication Validity and Rating Scales. Rasch Measurement Transactions, 10:1, p. 481. 1996.

B. D. Wright and G. N. Masters, RATING SCALE ANALYSIS. Mesa Press, Chicago, 1982.
