Measuring Harmony With Algebra: Success randomness in chess

This is a surprising entry: it's not about hockey at all. But maybe it's good to write surprising entries after a long silence like this. This post was inspired by research into the randomness of playoff results in the four major sports, which was discussed on Twitter by @StatsByLopez, @StatsInTheWild and @BaumerBen.

Chess is regarded as the least random sport, i.e. most frequently the top seeds succeed and the bottom seeds trail in the standings. Naturally, there is some randomness, both because humans make errors and because the seedings themselves have error margins. But I decided to put this to the test and analyze the closest analogs of playoff competitions in chess: the Candidates competitions between 1950 and 1983 and the FIDE World Cup/KO championships between 1998 and 2017. From 1985 to 1997, first because of the prolonged rivalry between Karpov and Kasparov and later because of Kasparov's breakaway from FIDE, the playoff systems were unstable, so I decided to leave them out of this research. Maybe I'll take a deeper look and include them later.

I. Candidates Tournaments (1950-1983)
These were the events where most of the top chess players competed to determine who would challenge the World Champion. There were no literal seedings in these events, but we can assign seeds by the historical Elo ratings of the players at the start of the event, as provided by the wonderful chess statistics site ChessMetrics. Naturally, the World Champion, most often ranked #1 in the world, did not take part in the competition.
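
Assigning seeds from ratings amounts to sorting the field by rating and numbering it from the top. A minimal Python sketch of that step follows; the player names and ratings are hypothetical placeholders, not ChessMetrics data, and `assign_seeds` is an illustrative name.

```python
def assign_seeds(ratings):
    """Map each player to a seed: 1 for the highest rating, 2 for the next, and so on."""
    ranked = sorted(ratings, key=ratings.get, reverse=True)
    return {player: seed for seed, player in enumerate(ranked, start=1)}


# Hypothetical ratings, for illustration only
hypothetical_ratings = {"Player A": 2710, "Player B": 2745, "Player C": 2690}
print(assign_seeds(hypothetical_ratings))
# {'Player B': 1, 'Player A': 2, 'Player C': 3}
```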

Twelve tournaments were examined, and here are the outcomes for the top 4 placements at the end:

Seed placement in the top 4 finishers
Year	# Pl	System	1st	2nd	3rd	4th
1950	10	DRR	2	8	1	9
1953	15	DRR	2	6	7	1
1956	10	DRR	1	2	3	9
1959	8	QRR	1	4	3	2
1962	8	QRR	1	6	7	4
1965	8	P/O	2	1	7	5
1968	8	P/O	2	1	4	3
1971	8	P/O	1	4	3	2
1974	8	P/O	1	2	5	6
1977	8	P/O	1	8	3	7
1980	8	P/O	1	6	4	2
1983	8	P/O	1	5	4	6

Systems: DRR - Double Round-Robin. QRR - Quadruple Round-Robin. P/O - Playoff of head-to-head matches of 8 games or longer.
In the knock-out playoff format, the pairings were drawn at random rather than based on rating!

Now if we compute the average seed for each place, there are quite big surprises both ways!

Place	1st	2nd	3rd	4th
Avg. seed	1.33	4.41	4.25	4.25

The value in the "1st" column, i.e. for the winners, is outrageously, uber-surprisingly good. An average seed of 1.33 means that 2/3 of the time the top seed won the competition, and the remainder of the time the 2nd seed won it. By the way, the first seed never finished outside the top 4 (8 wins, twice runner-up, and once each in 3rd and 4th). That indicates minimal randomness in the outcome. But the average seed of the runner-up is baffling: 4.41, almost completely random, and bigger than the average seeds for both 3rd and 4th places. I don't have a good explanation for this that would stem from game theory, so I'll leave this exercise to my readers.
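
The top seed's record can be tallied directly from the Candidates table with a short Python sketch. The `CANDIDATES` list is copied from that table; the function names are illustrative. The sketch also computes a baseline that is an assumption for comparison only: if the finishing order were purely random, the expected seed in any given place would be (n + 1) / 2 for a field of n players.

```python
from collections import Counter

# (year, number of players, seeds that finished 1st, 2nd, 3rd, 4th)
CANDIDATES = [
    (1950, 10, (2, 8, 1, 9)),
    (1953, 15, (2, 6, 7, 1)),
    (1956, 10, (1, 2, 3, 9)),
    (1959, 8, (1, 4, 3, 2)),
    (1962, 8, (1, 6, 7, 4)),
    (1965, 8, (2, 1, 7, 5)),
    (1968, 8, (2, 1, 4, 3)),
    (1971, 8, (1, 4, 3, 2)),
    (1974, 8, (1, 2, 5, 6)),
    (1977, 8, (1, 8, 3, 7)),
    (1980, 8, (1, 6, 4, 2)),
    (1983, 8, (1, 5, 4, 6)),
]


def top_seed_places(results):
    """Count how often seed 1 finished in each of the top four places."""
    tally = Counter()
    for _year, _players, seeds in results:
        if 1 in seeds:
            tally[seeds.index(1) + 1] += 1
    return tally


def random_baseline(results):
    """Average seed expected in any place if the finishing order were uniformly random."""
    return sum((players + 1) / 2 for _year, players, _seeds in results) / len(results)


print(dict(sorted(top_seed_places(CANDIDATES).items())))  # {1: 8, 2: 2, 3: 1, 4: 1}
print(round(random_baseline(CANDIDATES), 2))              # 4.96
```

Under that assumed baseline of about 4.96, an average runner-up seed of 4.41 does sit close to what a random draw would give.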

II. World Knock-out tournaments (1998-2017)
At the end of the 1990s the World Chess Federation (FIDE) tried a completely new format: 128 players (100 in the first three editions), who qualified through various criteria, played a massive 7-round playoff tournament. Most of the rounds consisted of just two games (with rapid and blitz tiebreaks if necessary), and only the final stages were played as longer matches, and even these matches became shorter as the tournament developed. The tournament served as the World Championship between 1998 and 2004, and has served as the World Cup (a World Championship qualifier) since 2005.

Although quite frequently some of the top players in the world (sometimes more than half of the actual top 10) did not take part in the event, the stability of the seedings was ensured because all 100 or 128 participants were ranked by their Elo ratings prior to the event.

The summary, similar to the table posted above, turns out like this:

Seed placement in the top 4 finishers
Year	# pl	1st	2nd	3rd	4th
1998	100	2	9	18	8
1999	100	36	31	46	5
2000	100	1	4	3	46
2002	128	19	4	15	1
2004	128	28	3	1	18
2005	128	3	9	2	4
2007	128	11	5	17	10
2009	128	1	7	22	12
2011	128	9	6	2	4
2013	128	3	21	23	32
2015	128	11	16	26	4
2017	128	5	11	8	2

We immediately see a high element of randomness in the top placements. In both 1999 and 2000, the 46th seed out of 100 made it into the semifinals! The average seed summary table comes out as follows:

Place	1st	2nd	3rd	4th
Avg. seed	10.75	10.5	15.25	12.17
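
These averages can be reproduced from the knock-out table with a few lines of Python. This is only a sketch for checking the arithmetic; the `KNOCKOUT` dictionary is copied from the table above and the function name is illustrative.

```python
# Seeds that finished 1st, 2nd, 3rd, 4th in each knock-out event
KNOCKOUT = {
    1998: (2, 9, 18, 8),
    1999: (36, 31, 46, 5),
    2000: (1, 4, 3, 46),
    2002: (19, 4, 15, 1),
    2004: (28, 3, 1, 18),
    2005: (3, 9, 2, 4),
    2007: (11, 5, 17, 10),
    2009: (1, 7, 22, 12),
    2011: (9, 6, 2, 4),
    2013: (3, 21, 23, 32),
    2015: (11, 16, 26, 4),
    2017: (5, 11, 8, 2),
}


def average_seed_by_place(results):
    """Return the mean seed for each finishing place, in order 1st to 4th."""
    columns = zip(*results.values())  # one tuple of seeds per finishing place
    return [sum(column) / len(results) for column in columns]


averages = average_seed_by_place(KNOCKOUT)
for place, average in zip(("1st", "2nd", "3rd", "4th"), averages):
    print(f"{place}: {average:.2f}")
# 1st: 10.75
# 2nd: 10.50
# 3rd: 15.25
# 4th: 12.17
```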

Interestingly, both the consistency and the inconsistency of Part I (an extremely predictable winner and a rather random runner-up) disappear. A top 20 player is expected to either win or finish second, and the semifinalists are expected to be seeded not much lower. This confirms once again that the length of the series between two competitors is key to eliminating randomness.
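
The effect of series length can be illustrated with a toy model. The sketch below assumes every game is decisive (no draws, which is a strong simplification for chess), that the games are independent, and that the favorite wins each game with the same probability `p`. None of this is estimated from the tournaments above; the function name and the value 0.6 are purely illustrative.

```python
from math import comb


def favorite_wins_match(p, games):
    """Probability that the favorite wins more than half of `games` decisive games."""
    if games % 2 == 0:
        raise ValueError("use an odd number of games so the match cannot end in a tie")
    needed = games // 2 + 1
    return sum(
        comb(games, wins) * p**wins * (1 - p) ** (games - wins)
        for wins in range(needed, games + 1)
    )


# A favorite with a 60% chance per game, over matches of increasing length
for games in (1, 3, 9, 21):
    print(games, round(favorite_wins_match(0.6, games), 3))
```

With a 60% chance per game, the favorite takes a single game 60% of the time and a three-game match 64.8% of the time, and the probability keeps growing as games are added. Within this toy model that is the sense in which two-game knock-out rounds leave more room for upsets than long Candidates matches.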

However, the average seed as runner-up in the 1950-1983 Candidates remains a mystery to me.

Tuesday, June 12, 2018
