Tuesday, June 12, 2018

Success randomness in chess

This is a surprising entry. It's not about hockey at all. But maybe it's good to write surprising entries after a long silence like this. This post was inspired by the research on randomness in playoff results in the four major sports that was discussed on Twitter by @StatsByLopez, @StatsInTheWild and @BaumerBen.

Chess is regarded as the least random sport, i.e. most of the time the top seeds succeed and the bottom seeds trail in the standings. Naturally, there is some randomness due to humans making errors as well as the error margins in seeding. But I decided to put this to the test and analyze the closest analogs of playoff competitions in chess: the Candidates competitions between 1950 and 1983 and the FIDE World Cup/KO championships between 1998 and 2017. From 1985 to 1997, first because of the prolonged competition between Karpov and Kasparov and later because of Kasparov's breakaway from FIDE, the playoff competition systems were unstable, so I decided to leave them out of this research. Maybe I'll take a deeper look and include them later.

I. Candidates Tournaments (1950-1983)
These were the events where most of the top chess players competed to determine who would challenge the World Champion. In these events there were no literal seedings, but we can assign seeds by the historical Elo ratings of the chess players at the start of the event, as provided by the wonderful chess statistics site ChessMetrics. Naturally, the World Champion, most often ranked #1 in the world, did not take part in the competition.

Twelve tournaments were examined, and here are the outcomes for the top 4 placements at the end:

Seed placement in the top 4 finishers
Year # Pl System 1st 2nd 3rd 4th
1950 10 DRR 2 8 1 9
1953 15 DRR 2 6 7 1
1956 10 DRR 1 2 3 9
1959 8 QRR 1 4 3 2
1962 8 QRR 1 6 7 4
1965 8 P/O 2 1 7 5
1968 8 P/O 2 1 4 3
1971 8 P/O 1 4 3 2
1974 8 P/O 1 2 5 6
1977 8 P/O 1 8 3 7
1980 8 P/O 1 6 4 2
1983 8 P/O 1 5 4 6
Systems: DRR - Double Round-Robin. QRR - Quadruple Round-Robin. P/O - 8-game or longer head-to-head matches.

In the knock-out playoff format, the opponent seedings were random rather than rating based!

Now if we find the average value of seeds, there are quite big surprises both ways!

Place 1st 2nd 3rd 4th
Avg. seed 1.33 4.41 4.25 4.25

The value for the column "1st", i.e. for winners, is outrageously, uber-surprisingly good. An average seed of 1.33 means that 2/3 of the time the top seed won the competition and the remainder of the time the 2nd seed won it. By the way, the first seed never finished outside the top 4 (8 wins, twice runner-up, and once each in 3rd and 4th). That indicates minimal randomness in the outcome. What is baffling, though, is the average seed of the runner-up: 4.41, almost completely random, and higher than the average seeds for both 3rd and 4th places. I don't have a good explanation for this that would stem from game theory, so I'll leave this exercise to my readers.
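For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the per-place averages and the top seed's finishing record from the rows of the Candidates table as printed. The names `candidates`, `average_seeds` and `top_seed_places` are made up for this illustration. Since the rows are typed in by hand, a recomputed value that differs from the quoted summary would point to a typo either in a row or in the summary line.

```python
# Seeds of the top-4 finishers (1st, 2nd, 3rd, 4th), copied row by row
# from the Candidates table above.
candidates = {
    1950: (2, 8, 1, 9),
    1953: (2, 6, 7, 1),
    1956: (1, 2, 3, 9),
    1959: (1, 4, 3, 2),
    1962: (1, 6, 7, 4),
    1965: (2, 1, 7, 5),
    1968: (2, 1, 4, 3),
    1971: (1, 4, 3, 2),
    1974: (1, 2, 5, 6),
    1977: (1, 8, 3, 7),
    1980: (1, 6, 4, 2),
    1983: (1, 5, 4, 6),
}


def average_seeds(results):
    """Return the mean seed for each finishing place, 1st through 4th."""
    columns = zip(*results.values())  # transpose rows into per-place columns
    return [sum(column) / len(results) for column in columns]


def top_seed_places(results):
    """Count how often seed 1 finished 1st, 2nd, 3rd and 4th."""
    counts = [0, 0, 0, 0]
    for row in results.values():
        if 1 in row:
            counts[row.index(1)] += 1
    return counts


averages = average_seeds(candidates)
print("Average seed per place:", [round(a, 2) for a in averages])
print("Top seed finishes (1st..4th):", top_seed_places(candidates))
```

The same two helpers work on any table of this shape, so they can be reused for other events.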

II. World Knock-out tournaments (1998-2017)
At the end of the 1990s the World Chess Federation (FIDE) tried a completely new format: 128 players (100 in the first three editions), who qualified through various criteria, played a massive 7-round playoff tournament. Most of the rounds consisted of just two games (with rapid and blitz tiebreaks if necessary), and only the final stages were played as longer matches, and even these matches became shorter as the tournament developed. The tournament served as a World Championship between 1998 and 2004, and has served as a World Cup (a World Championship qualifier) since 2005.

Although quite frequently some of the top players in the world (sometimes more than half of the actual top 10) did not take part in the event, the stability of the seedings was ensured because all 100 or 128 participants were ranked by their Elo ratings prior to the event.

The summary, similar to the table posted above, turns out like this:

Seed placement in the top 4 finishers
Year # pl 1st 2nd 3rd 4th
1998 100 2 9 18 8
1999 100 36 31 46 5
2000 100 1 4 3 46
2002 128 19 4 15 1
2004 128 28 3 1 18
2005 128 3 9 2 4
2007 128 11 5 17 10
2009 128 1 7 22 12
2011 128 9 6 2 4
2013 128 3 21 23 32
2015 128 11 16 26 4
2017 128 5 11 8 2
We immediately see the high element of randomness in the top placements. In both 1999 and 2000, the 46th seed out of 100 made it into the semifinals! The average seed summary table comes out as follows:

Place 1st 2nd 3rd 4th
Avg. seed 10.75 10.5 15.25 12.17
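The knock-out averages can be recomputed the same way. This is a small Python sketch over the rows of the table as printed; `knockout` and `years_with_seed` are names invented for the example.

```python
# Seeds of the top-4 finishers (1st, 2nd, 3rd, 4th), copied row by row
# from the World Knock-out table above.
knockout = {
    1998: (2, 9, 18, 8),
    1999: (36, 31, 46, 5),
    2000: (1, 4, 3, 46),
    2002: (19, 4, 15, 1),
    2004: (28, 3, 1, 18),
    2005: (3, 9, 2, 4),
    2007: (11, 5, 17, 10),
    2009: (1, 7, 22, 12),
    2011: (9, 6, 2, 4),
    2013: (3, 21, 23, 32),
    2015: (11, 16, 26, 4),
    2017: (5, 11, 8, 2),
}


def years_with_seed(results, seed):
    """List the years in which the given seed finished in the top 4."""
    return [year for year, row in results.items() if seed in row]


# Mean seed for each finishing place: transpose the rows, then average.
knockout_averages = [
    sum(column) / len(knockout) for column in zip(*knockout.values())
]

print("Average seed per place:", [round(a, 2) for a in knockout_averages])
print("Years with the 46th seed in the top 4:", years_with_seed(knockout, 46))
```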

Interestingly, both the consistency and the inconsistency of Part I (an extremely predictable winner and a rather random runner-up) disappear. A top 20 player is expected to either win or finish second, and the semifinalists are expected to be seeded not much lower. This confirms once again that the length of the series between two competitors is key to eliminating randomness.
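The effect of series length can be illustrated with a toy model. It is only a sketch under strong assumptions, not a description of real chess matches: every game is decisive (draws are ignored, although they are very common in chess), the stronger player wins each game independently with the same probability, and the match is won by whoever takes the majority of an odd number of games. The per-game edge of 0.55 is a hypothetical value chosen for illustration and is not derived from the tables.

```python
from math import comb


def match_win_probability(p, games):
    """Probability that the stronger player wins a majority of `games` games.

    Toy model: every game is decisive and is won by the stronger player
    with probability p, independently of the other games. `games` must be
    odd so that the match cannot end in a tie.
    """
    if games % 2 == 0:
        raise ValueError("use an odd number of games to avoid ties")
    needed = games // 2 + 1  # smallest number of wins that is a majority
    return sum(
        comb(games, wins) * p**wins * (1 - p) ** (games - wins)
        for wins in range(needed, games + 1)
    )


P_STRONGER = 0.55  # hypothetical per-game edge, not taken from the data

for games in (1, 3, 9, 25):
    print(games, round(match_win_probability(P_STRONGER, games), 3))
```

Under these assumptions the favorite's chance of winning the match grows with the number of games, which is the direction of the effect seen when comparing the two-game knock-out rounds with the long Candidates matches.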

However, the average seed of the runner-up in the 1950-1983 Candidates remains a mystery to me.