On Ranking Merit by Waterloo Institute for Complexity and Innovation

On ranking merit: applying the page-rank algorithm to the electoral process
Robert Spekkens, Perimeter Institute for Theoretical Physics
October 19, 2010, WICI seminar

The pagerank algorithm provides a way of ranking the members of a community by merit, using the aggregate opinions of the community and without any prior ranking. Call this the merit-rank algorithm.
This should be considered a module to be incorporated into broader systems for collective decision-making.
Ex: Appointment of the most meritorious members of a community to a particular set of offices
• the most trustworthy to decision-makers
• the most fair to jurors
• the most expert to policy-makers

Outline
• Shortcomings of current schemes
• How the merit-rank algorithm works
• Case study: Google’s search engine
• Case study: Citation networks
• Criticisms and Possible failure modes
• Beyond pagerank

Two schemes for identifying merit and their shortcomings

By authority

- Requires a prior notion of who is best qualified to judge merit
- Susceptible to corruption
- Doesn’t scale well
- Each authority has a short horizon of deep familiarity

By majority vote

- Popular opinion may be less reliable than that of a better-qualified minority (the pitfalls of rule by referendum)
- Each voter has a short horizon of deep familiarity

Merit-rank can hope to avoid some of these shortcomings

How the merit-rank algorithm works

Pagerank as a slogan: important webpages are those that are linked to by other important webpages

Merit-rank as a slogan: meritorious individuals are those who are judged to have merit by other meritorious individuals

What kinds of merit will the algorithm work for?
• Auto-indicating merit: an individual having merit is better able to assess merit in others
Or equivalently,
• Merit that is transitive: if Alice esteems Bob, then she would also esteem those who are esteemed by Bob.

From a majority vote system to the merit-rank algorithm

Ranking a slate of candidates

Ranking the entire community

One vote per person

One unit of voting power per person

Either:
- split equally among targets
- split arbitrarily among targets
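As a minimal sketch of this one-step tally (the function name and the toy ballots are illustrative assumptions, not taken from the talk): each person holds one unit of voting power, which is either split equally among the people they name or split by explicit weights that are normalised to sum to one.

```python
from collections import defaultdict


def tally(ballots):
    """Return the total voting power received by each person.

    ballots maps a voter to either a list of targets (equal split)
    or a dict {target: weight} (arbitrary split, normalised to 1).
    """
    received = defaultdict(float)
    for voter, choice in ballots.items():
        received[voter] += 0.0  # make sure every voter appears in the result
        if not choice:
            continue  # abstention: this unit of voting power is not cast
        if isinstance(choice, dict):
            total = sum(choice.values())
            shares = {t: w / total for t, w in choice.items()}
        else:
            shares = {t: 1.0 / len(choice) for t in choice}
        for target, share in shares.items():
            received[target] += share
    return dict(received)


# Hypothetical ballots, for illustration only.
ballots = {
    "alice": ["bob", "carol"],       # equal split: 0.5 each
    "bob": {"carol": 3, "dave": 1},  # arbitrary split: 0.75 and 0.25
    "carol": ["dave"],
    "dave": [],                      # casts no vote
}
scores = tally(ballots)
```

The result is each person's weighted in-degree in the voting network, which is the quantity the recursive scheme takes as its starting point.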

[Figure: example voting network, with each node labelled by the total vote it receives]

Beyond majority vote: adding recursion
Primitive version of merit-rank algorithm
Iterate the calculation of individual ranks
At step 0, everyone has equal merit-rank
Alice’s merit-rank at step k = Bob’s merit-rank at step k-1 × Fraction of Bob’s vote cast for Alice + Charlie’s merit-rank at step k-1 × Fraction of Charlie’s vote cast for Alice + …
If the calculation converges, final ranking = merit-rank
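The update rule above can be sketched in a few lines. This is an illustrative sketch only: the function name, the data layout (`votes[b][a]` is the fraction of b's vote cast for a) and the three-person network are assumptions made for the example.

```python
def primitive_merit_rank(votes, steps):
    """Run the primitive update for a fixed number of steps.

    votes[b][a] is the fraction of b's vote cast for a; everyone who
    receives a vote is assumed to appear as a key of `votes`.
    """
    people = sorted(votes)
    rank = {p: 1.0 for p in people}  # step 0: everyone has equal merit-rank
    for _ in range(steps):
        new_rank = {p: 0.0 for p in people}
        for voter, shares in votes.items():
            for target, fraction in shares.items():
                # target earns voter's previous merit-rank times the
                # fraction of voter's vote cast for target
                new_rank[target] += rank[voter] * fraction
        rank = new_rank
    return rank


# Hypothetical three-person community in which everyone casts a full vote.
votes = {
    "alice": {"bob": 1.0},
    "bob": {"alice": 0.5, "carol": 0.5},
    "carol": {"alice": 1.0},
}
step1 = primitive_merit_rank(votes, 1)
final = primitive_merit_rank(votes, 200)
```

In this toy network the total merit-rank stays at 3 and the iteration settles at alice = bob = 1.2, carol = 0.6. Convergence is not guaranteed for an arbitrary network.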

[Figures: merit-rank of each node in the example network at steps 0, 1, 2 and 3 of the primitive iteration]

Problems with primitive version
People who earn but do not cast any votes are sinks for merit-rank
Solution: Uniformly distribute their vote

Problems with primitive version
People who earn but cast no votes other than to themselves, and groups who earn but cast no votes other than to their own membership, are sinks for merit-rank
Solution: Uniformly distribute a fraction of their vote

Problems with primitive version
People who earn no votes are left with no voting power after the first step
Solution: Uniformly distribute a fraction of every vote

The merit-rank algorithm: “taxing votes for the common good”
Fraction X of vote uniformly distributed
Fraction 1-X of vote distributed at voter’s discretion (uniform if unspecified)
Standard choice: X=0.15
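A minimal sketch of the taxed update follows, under stated assumptions: the function name, the stopping tolerance and the four-person network are illustrative, and a voter with no stated preferences is treated as spreading the discretionary share uniformly over everyone, themselves included.

```python
def merit_rank(votes, tax=0.15, tol=1e-12, max_steps=1000):
    """Iterate the taxed update until the ranks stop changing.

    votes[b][a] is the weight b gives to a (normalised per voter).
    A fraction `tax` (X) of every vote is shared uniformly; the
    remaining 1 - X follows the voter's own choices.
    """
    people = sorted(set(votes) | {t for s in votes.values() for t in s})
    n = len(people)
    rank = {p: 1.0 for p in people}  # step 0: equal merit-rank
    for _ in range(max_steps):
        new_rank = {p: 0.0 for p in people}
        for voter in people:
            # uniform if unspecified (empty or missing ballot)
            shares = votes.get(voter) or {p: 1.0 for p in people}
            total = sum(shares.values())
            for target, weight in shares.items():
                new_rank[target] += (1 - tax) * rank[voter] * weight / total
        common = tax * sum(rank.values()) / n  # the uniformly shared "tax"
        for p in people:
            new_rank[p] += common
        if max(abs(new_rank[p] - rank[p]) for p in people) < tol:
            return new_rank
        rank = new_rank
    return rank


# Hypothetical network: carol casts no vote, dave earns none.
votes = {
    "alice": {"bob": 1.0},
    "bob": {"carol": 1.0},
    "carol": {},
    "dave": {"carol": 1.0},
}
ranks = merit_rank(votes)
```

With the tax in place, carol no longer absorbs all of the merit-rank and dave keeps a small but non-zero amount of voting power, while the total stays equal to the number of people.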

[Figures: the example network with the vote tax applied at each step]
Merit-rank at step k = 0.15 × (uniform redistribution of the step k-1 merit-rank) + 0.85 × (redistribution of the step k-1 merit-rank along the votes cast)
Iterating from step 0 (equal merit-rank) through steps 1, 2, … gives the final merit-rank

The algorithm always converges to a unique solution

Compare: weighted in-degree vs. final merit-rank
[Figure: the example network labelled with each node’s weighted in-degree alongside its final merit-rank]

Case study: Google’s search engine

Sergey Brin and Lawrence Page (1998), "The anatomy of a large-scale hypertextual Web search engine," at http://www-db.stanford.edu/~backrub/google.html
Webpages vote for one another by linking to one another
Every webpage has a unit of voting power which is divided equally among the webpages to which it links

The random surfer picture
With probability 0.85: follows a random link from the webpage she is currently on; or
with probability 0.15: “teleports” to a completely random webpage.
The long-time probability of ending up on a given webpage converges to a fixed value = the pagerank of that webpage
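The random surfer picture can be checked with a small Monte Carlo simulation. This is a sketch under assumptions: the four-page link map is hypothetical, every linked page must itself be a key of `links`, and a page with no outgoing links is handled by always teleporting.

```python
import random


def random_surfer(links, steps=200_000, teleport=0.15, seed=0):
    """Estimate pagerank as the fraction of time spent on each page."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    pages = sorted(links)
    visits = {p: 0 for p in pages}
    page = rng.choice(pages)
    for _ in range(steps):
        visits[page] += 1
        if links[page] and rng.random() >= teleport:
            page = rng.choice(links[page])  # follow a random link (prob. 0.85)
        else:
            page = rng.choice(pages)        # teleport to a random page
    return {p: count / steps for p, count in visits.items()}


# Hypothetical four-page web, for illustration only.
links = {
    "home": ["docs", "blog"],
    "docs": ["home"],
    "blog": ["home", "docs"],
    "orphan": ["home"],  # nobody links to this page
}
freq = random_surfer(links)
```

In this toy web the page that nobody links to is reached only by teleporting, so its long-run share is close to 0.15/4, while "home" collects the largest share.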

Google’s dominance over other search engines is perhaps the strongest recommendation of pagerank

Case study: Citation networks

Applying pagerank to a citation network
P. Chen, H. Xie, S. Maslov, S. Redner, “Finding Scientific Gems with Google,” J. Informetrics 1, 8-15 (2007)
353,268 nodes = all publications in the Physical Review family of journals from 1893–2003
3,110,839 links = all citations to Physical Review articles from other Physical Review articles
A value X>0 is required to prevent all votes from sinking to the oldest papers. Chosen value: X=0.5.

Strong correlation between # of citations and pagerank

However, outliers constitute exceptional papers

Benefits of merit-rank
• Identifies a set of individuals that are more exceptional than the set that majority vote would identify
• Completely democratic yet gives more weight to the opinions of the best qualified
• Plays to our strengths by permitting us to assess only those we know well

Criticisms and Possible failure modes

Unrecognized merit
Not necessarily a problem: the algorithm actually ranks people by their degree of vetted merit

Disenfranchisement

Proportional representation
See: Xie, Yan and Maslov, “Optimal ranking in networks with community structure”, arXiv:physics/0510107

Merit-ranking of groups
If the algorithm works for individuals, it should work for groups

Voting based on ideology rather than on merit
It is not necessarily a failure of the algorithm if an individual chooses to judge merit primarily in terms of ideology
The network may partition into ideologically homogeneous groups
Still, we have proportional representation for different ideologies

The celebrity failure mode
A large imbalance in degree of recognition can trump considerations of merit
Note: Merit-rank fares better than majority vote (consider the difficulty of gaming Google)
Possible fix: A weighting factor in proportion to the depth of a relationship

Are the relevant kinds of merit really auto-indicating?
- The trustworthy can be naive
- Experts can fall prey to groupthink
Response: Unreliability of assessments of merit increases in proportion to superficiality of the relationship
Possible fix: A weighting factor in proportion to the depth of a relationship

Few will understand the algorithm
How can a community that doesn’t understand an algorithm ever come to endorse it?
Answer: Trust based on past performance

No secret ballot
The algorithm needs to know how everyone voted
But how can one trust the institution that calculates the outcome without making public all of the information and thereby opening the door to bribery and coercion?
Possible fix: A cryptographic scheme

Beyond pagerank

The HITS algorithm – Hubs and Authorities
Jon Kleinberg, "Authoritative sources in a hyperlinked environment," Journal of the ACM 46 (5): 604–632 (1999).

Recall: Pagerank as a slogan
Important webpages are those that are linked to by other important webpages
HITS algorithm as a slogan
Hubs are webpages that link to authorities, authorities are webpages that are linked to by hubs

The HITS algorithm – Hubs and Authorities
The HITS algorithm returns two numbers for a webpage:
• Authority value = the value of the content of the page
• Hub value = the value of its links to other pages.
Start with authority value = in-degree, hub value = out-degree
Weight in-degree by hub values
Weight out-degree by authority values
Iterate.
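The iteration described above can be sketched as follows. The function name and the toy link map are assumptions made for illustration. The rescaling of both score vectors to unit length on each pass is not stated on the slide; it is the usual normalisation in Kleinberg's formulation and only keeps the numbers bounded.

```python
import math


def hits(links, iterations=50):
    """Return (hub, authority) scores for a dict {page: [linked pages]}."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    incoming = {p: [] for p in pages}
    for src, targets in links.items():
        for t in targets:
            incoming[t].append(src)
    # Start with authority value = in-degree, hub value = out-degree.
    auth = {p: float(len(incoming[p])) for p in pages}
    hub = {p: float(len(links.get(p, ()))) for p in pages}
    for _ in range(iterations):
        # Weight in-degree by the hub values of the linking pages.
        auth = {p: sum(hub[s] for s in incoming[p]) for p in pages}
        # Weight out-degree by the authority values of the linked pages.
        hub = {p: sum(auth[t] for t in links.get(p, ())) for p in pages}
        # Rescale both vectors to unit length (assumed normalisation).
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Hypothetical graph: h1..h3 only link out, a1 and a2 are only linked to.
links = {"h1": ["a1", "a2"], "h2": ["a1", "a2"], "h3": ["a1"]}
hub, auth = hits(links)
```

In this example a1 ends up with the highest authority value, h1 and h2 tie as the best hubs, pages that only link out have authority 0, and pages that are only linked to have hub value 0.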

Using a HITS-like algorithm to rank expertise
Suppose some nodes have only incoming links (pure authorities) and others only outgoing links (pure hubs)
Let Pure Hubs be the experts
Let Pure Authorities be the beliefs of the experts
Experts are the people who have the right beliefs. The right beliefs are the ones believed by the experts.
Such a scheme can overcome the problem of unrecognized merit

Outlook
Consider more complicated schemes (ex: negative votes, combination of HITS and pagerank, etc.)
Numerical tests of the various failure modes
Find a good forum for a real-world trial
- online gaming community?
- Facebook applications?