Thursday, March 30, 2017

On the NHL Scoring System - Part III

Part I
Part II


Once again, I am driven by the idea that if you want to encourage goal scoring, you need to reward goal scoring in the standings directly, not indirectly through winning. Building on the idea of a fellow hockey fan and blogger, a new suggestion was born in my mind.

Not so long ago I was involved in another discussion on the subject on Twitter, where an interesting alternative, 2-1-0-0, was described. The idea is that you still get two points for a win in regulation, just one point for a win in OT, but nothing if you lose, and, this is the key, both teams get nothing if the game is still tied after overtime (shootouts are abolished). This is a very sharp idea, but for me something felt very wrong, and then it crystallized:

It's not fair to reward a hard fought 5-5 tie with zero points, just like a lazy-skated 1-1. We still want to encourage goal scoring, and the simple 2-1-0-0 just unbalances the game. And so it dawned on me. We should reward goals with extra standings points!

The formula that first came to mind, and which seemed fair: give each goal 0.1 point in the standings, while the win-scoring system shall be 2-1-0-0. If you or your database have an aversion to decimals, assign 20 points for a regulation win, 10 points for an OT win, and 1 extra point for each goal scored. This will encourage goal scoring in any situation, and for both sides, including the games that go into garbage time pretty quickly. So, a 7-2 win will give the winner 2.7 points, and the loser 0.2 points. A 2-0 win will give the winner 2.2 points, the loser 0. A 4-3 OT win will give the winner 1.4 points, the loser 0.3 points. A 5-5 OT tie will give each side 0.5 points.
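To make the arithmetic concrete, here is a minimal Python sketch of the per-game scoring, using the integer scale (20 / 10 / 1) so that no decimals are involved. The function name `game_points` and the `overtime` flag are my own labels for illustration, not part of any existing tool.

```python
# Integer "tenths" scale from the text: 20 per regulation win,
# 10 per overtime win, 1 per goal scored.
REGULATION_WIN = 20
OVERTIME_WIN = 10
PER_GOAL = 1


def game_points(goals_for, goals_against, overtime=False):
    """Return standings points (in tenths) for one team in one game.

    `overtime` marks a game that went past regulation; a game that is
    still level after overtime is a tie and earns only the goal points.
    """
    points = PER_GOAL * goals_for
    if goals_for > goals_against:
        points += OVERTIME_WIN if overtime else REGULATION_WIN
    return points


# The examples from the paragraph above, in tenths of a point:
print(game_points(7, 2), game_points(2, 7))    # 27 2
print(game_points(2, 0), game_points(0, 2))    # 22 0
print(game_points(4, 3, overtime=True), game_points(3, 4, overtime=True))  # 14 3
print(game_points(5, 5, overtime=True))        # 5
```

Dividing the result by 10 gives the decimal version (2.7, 0.2 and so on).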

Wait, there's a caveat.

Imagine a situation where a team needs just 0.1 point to pass another one in the standings for the playoff spot. They are playing an opponent whose number of points in the standings does not have any effect on them. In such a situation, the team would play without a goaltender at all, because they don't care how much they lose, they just need that goal. Now, this is not really hockey, so to prevent this kind of play a restriction needs to be introduced:

Any goal scored without a goaltender on the ice, when not on a delayed penalty, and when trailing by more than two goals shall not yield any standings points.
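Read literally, the restriction is a conjunction of three conditions, which can be sketched as a small predicate. The function and parameter names are hypothetical, and I am assuming "trailing by more than two goals" refers to the deficit at the moment just before the goal is scored.

```python
def goal_earns_points(goalie_on_ice, delayed_penalty, deficit_before_goal):
    """Return True if a goal should yield its standings bonus.

    deficit_before_goal: how many goals the scoring team trailed by
    when it scored (0 or negative when level or leading).
    """
    empty_net_by_choice = not goalie_on_ice and not delayed_penalty
    return not (empty_net_by_choice and deficit_before_goal > 2)


# Goalie pulled while down by three: the goal yields no standings points.
print(goal_earns_points(False, False, 3))   # False
# Goalie pulled while down by two (a normal late push): the goal counts.
print(goal_earns_points(False, False, 2))   # True
```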

Here is an example of what today's standings would look like under the suggested system:

Team                           W  OW T  L  GF  GA  P
Boston Bruins                  34 04 04 34 216 201 93.6
Montreal Canadiens             31 09 05 31 205 186 91.5
Ottawa Senators                32 04 08 31 191 191 87.1
--------------------------------------------------------
Washington Capitals            41 08 07 20 246 165 114.6
Columbus Blue Jackets          38 09 04 24 233 170 108.3
Pittsburgh Penguins            37 06 08 25 256 211 105.6
--------------------------------------------------------
New York Rangers               38 05 06 28 242 203 105.2
Toronto Maple Leafs            29 06 09 31 229 213 86.9
--------------------------------------------------------
New York Islanders             28 05 06 36 217 224 82.7
Tampa Bay Lightning            27 06 07 35 206 207 80.6
Carolina Hurricanes            28 04 07 36 198 208 79.8
Buffalo Sabres                 24 06 08 39 191 215 73.1
Philadelphia Flyers            22 07 11 36 193 218 70.3
Florida Panthers               21 07 11 37 192 210 68.2
New Jersey Devils              18 06 06 46 171 221 59.1
Detroit Red Wings              16 07 08 45 181 224 57.1
--------------------------------------------------------
Chicago Blackhawks             36 09 05 27 230 197 104.0
Minnesota Wild                 37 04 05 30 241 193 102.1
St. Louis Blues                35 06 02 33 213 200 97.3
--------------------------------------------------------
San Jose Sharks                35 06 03 32 204 185 96.4
Anaheim Ducks                  37 02 06 31 200 183 96.0
Edmonton Oilers                33 05 09 29 221 191 93.1
--------------------------------------------------------
Nashville Predators            33 04 06 33 224 206 92.4
Calgary Flames                 30 09 06 32 208 206 89.8
--------------------------------------------------------
Winnipeg Jets                  29 03 04 41 226 243 83.6
Dallas Stars                   27 04 02 43 207 240 78.7
Los Angeles Kings              23 11 06 36 183 185 75.3
Vancouver Canucks              19 07 06 44 169 221 61.9
Arizona Coyotes                17 04 08 48 176 245 55.6
Colorado Avalanche             14 06 01 55 150 257 49.0
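The P column above can be reproduced from the W, OW and GF columns alone. Here is a small sketch; `season_points` is a name I made up, and it ignores the empty-net restriction, since the table does not carry that information.

```python
def season_points(wins, ot_wins, goals_for):
    """Season total under 2-1-0-0 plus 0.1 per goal, computed in tenths."""
    tenths = 20 * wins + 10 * ot_wins + goals_for
    return tenths / 10


print(season_points(34, 4, 216))   # Boston Bruins row: 93.6
print(season_points(41, 8, 246))   # Washington Capitals row: 114.6
```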

Naturally, these would not be the same standings if the system were indeed implemented, but why not take a look. And once again, try it in the AHL first, it won't hurt anyone.

Monday, March 13, 2017

On Buchholz and Sonneborn-Berger coefficients - Part II

Part I

2. The Sonneborn-Berger coefficient.
This stranger beast is a metric extensively used for tie-breaks in chess round-robins and as an auxiliary tie-break tool to the Buchholz coefficient in non-round-robin events. Let's start with the definition.

$$SB = \sum_{n=1}^{N} f(R_n, P_n)$$

where Rn is the result against the n-th opponent, and Pn is the opponent's points score.
The function  f(Rn, Pn) is defined as:

f(Win, Pn)  = Pn
f(Tie, Pn)  = Pn/2
f(Loss, Pn) = 0

The resulting value evaluates whether the participant performed better against stronger or weaker opposition. Actually, I do have a problem with this criterion as a tie-breaker: in my opinion ALL points are created equal, and it doesn't matter if they came from a contender or a bottom feeder. However, this metric does answer notorious statements like "This team only shows up for big games" and "This team is only good against garbage opposition."

So, first of all, for the NHL application, we will modify the function f(Rn, Pn) to:

f(Win, Pn) = Pn
f(OW, Pn)  = 2*Pn/3
f(OL, Pn)  = Pn/3
f(L, Pn)   = 0

to account for the overtime point.
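As a rough sketch, the modified function and the sum from the definition translate into a few lines of Python. The result codes follow the notation above; the sample game list is invented purely for illustration.

```python
# Weights from the modified f(R, P): the share of the opponent's
# points credited for each result.
WEIGHTS = {"W": 1.0, "OW": 2 / 3, "OL": 1 / 3, "L": 0.0}


def sonneborn_berger(games):
    """Sum f(R_n, P_n) over (result, opponent_points) pairs."""
    return sum(WEIGHTS[result] * opp_points for result, opp_points in games)


# Hypothetical four-game stretch: (result, opponent's points score).
games = [("W", 1.2), ("OW", 0.9), ("OL", 1.5), ("L", 1.4)]
print(round(sonneborn_berger(games), 3))   # 2.3
```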

Then, we can calculate the minimal possible value SBmin for a team with its schedule so far this season by assigning the Wins to the weakest teams played, and the OW/OL to the weakest of the remainder, until the W, OW and OL points add up to the number of points the team currently has.

Similarly we shall calculate the maximal possible SBmax value by assigning Wins to be against the strongest teams played, and the OW/OL against the strongest of the remainder, assuming OT wins are about 1/4 of the whole.

Then, depending on whether the actual SB is closer to SBmin or to SBmax, we may be able to say whether the team is more successful against the bottom feeders or against the top guns, or whether it earns its points across the whole spectrum available.

Here is the table describing how this season's teams have their SB positioned between SBmin and SBmax.

Team                   PPG   SBmin  SBopt  SB     SBmax
Pittsburgh Penguins    1.40  44.28  46.48  46.24  53.06
Washington Capitals    1.40  44.70  46.74  47.77  52.89
Minnesota Wild         1.37  42.25  44.36  46.63  50.66
Columbus Blue Jackets  1.37  43.10  45.36  46.44  52.15
Chicago Blackhawks     1.34  41.61  43.90  43.79  50.80
San Jose Sharks        1.31  40.68  42.97  44.16  49.84
New York Rangers       1.30  41.25  43.67  45.55  50.92
Ottawa Senators        1.25  37.84  40.07  41.79  46.78
Montreal Canadiens     1.25  39.37  41.74  41.05  48.87
Anaheim Ducks          1.19  36.86  39.43  40.12  47.15
Calgary Flames         1.18  35.97  38.49  38.20  46.05
Edmonton Oilers        1.16  35.86  38.32  37.43  45.70
Boston Bruins          1.15  34.73  37.23  37.74  44.72
Nashville Predators    1.13  33.28  36.14  38.04  44.72
Toronto Maple Leafs    1.13  34.64  36.99  35.66  44.02
St. Louis Blues        1.12  34.69  37.14  38.52  44.50
New York Islanders     1.12  34.36  36.94  37.94  44.71
Tampa Bay Lightning    1.09  32.62  34.98  35.41  42.06
Los Angeles Kings      1.07  32.10  34.66  33.56  42.34
Philadelphia Flyers    1.04  31.26  33.56  32.01  40.48
Florida Panthers       1.03  30.89  33.12  30.95  39.82
Carolina Hurricanes    1.00  29.43  31.78  32.41  38.85
Buffalo Sabres         0.99  30.09  32.49  33.43  39.68
Winnipeg Jets          0.96  27.55  30.35  31.48  38.75
Vancouver Canucks      0.96  28.48  30.91  29.02  38.21
Dallas Stars           0.94  28.05  30.62  31.16  38.34
Detroit Red Wings      0.94  29.12  31.12  30.02  37.13
New Jersey Devils      0.91  27.78  30.15  28.63  37.27
Arizona Coyotes        0.84  25.13  27.24  25.86  33.56
Colorado Avalanche     0.61  17.90  19.74  19.98  25.25

Once again, we use points-per-game values because the teams and their opponents have played a different number of games at most moments within a season.

We would dare to take one more step forward and claim that a team whose SB is closer to SBmax seems to have a coach problem (notable differences highlighted in green in the table above): the roster is there to compete against the best, but the points aren't trickling in at a good enough pace against the fodder. Similarly, a team whose SB is closer to SBmin is more likely to have a GM problem (notable differences highlighted in blue in the table above): its roster is not good enough to compete, but the coach is able to squeeze close to the maximum out of it. However, it is natural to win more games against the weaker teams, so we set the balance point at SBopt = (SBmax + 3*SBmin) / 4.
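The balance point is easy to check against the table. Below is a minimal sketch; `sb_opt` and `sb_offset` are illustrative names, and the signed offset is just one possible way to express how far a team sits from SBopt, not necessarily the measure behind the highlighting.

```python
def sb_opt(sb_min, sb_max):
    """Balance point from the text: SBopt = (SBmax + 3*SBmin) / 4."""
    return (sb_max + 3 * sb_min) / 4


def sb_offset(sb, sb_min, sb_max):
    """Signed distance of the actual SB from the balance point."""
    return sb - sb_opt(sb_min, sb_max)


# Pittsburgh Penguins row from the table above.
print(round(sb_opt(44.28, 53.06), 2))
print(round(sb_offset(46.24, 44.28, 53.06), 2))
```

A positive offset means the team leans toward SBmax, a negative one toward SBmin. Since the table values are already rounded, recomputed SBopt values may differ from the printed ones by about 0.01.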

Wrapping up the talk about the Buchholz and the Sonneborn-Berger coefficients, we would like to state that these values are almost entirely descriptive, without any predictive capability, with the small exception of the Buchholz-based remaining schedule strength metric. And even then, it's sort of a 'descriptive prediction'.

Please see more Buchholz and Berger-Sonneborn data on the website!

Sunday, March 12, 2017

On Buchholz and Sonneborn-Berger coefficients.


The practice of chess tournaments provides two traditional metrics that are used to rank participants beyond their mere scoring. Their names are the Buchholz coefficient and the Sonneborn-Berger coefficient (often called just Berger). They are frequently used as tie-breakers in chess events; however, I arrived at a completely different application for them: the National Hockey League seasons.

1. The Buchholz coefficient

The Buchholz coefficient is simply the sum of the points of your opponents.

$$B = \sum_{n=1}^{N} P_n$$

So, if you played five games, and your opponents currently have 5, 3, 8, 6 and 6 points, your Buchholz value will be 28. Please note that the current number of points is always used, not the number of points at the moment of meeting. The outcome of the game does not matter (for that one see the Sonneborn-Berger).

At first, the usefulness of such a criterion may raise an eyebrow. Indeed, it's not used in round-robin (all-play-all) tournaments as a final tie-break because, naturally, the coefficient would be the same for all tied parties. It's used in a special format of chess events called the Swiss tournament, which is not very popular outside the realm of board games for purely logistical reasons. But then consider, first, an NFL season. The lists of opponents that teams play there over the 16-game season may be quite different. And whoever ends up with a larger Buchholz coefficient clearly has had stronger opposition along the way.

Now let's go back to hockey. First of all, at the end of the season, although everyone has played everyone, they did so a different number of times. Thus, the sum of opponents' points at the end of the season could be different between teams, including within the same division, if they had a different schedule. So, this could still be a very valid tiebreak. Secondly, the season is so long (82 games, unlike a chess Swiss, which is rarely longer than 11 rounds) that it gives us a lot of midway points in time when the all-play-all has not been completed yet! Here the Buchholz coefficient can clearly show who has had the stronger opposition up until a certain moment.

Then, if we look at the remainder of the schedule for each team, and for every game we add the opponent's points we get an excellent remaining schedule strength estimator.

Wait... there's a caveat.

Unlike in a chess tournament, where every round occurs for everyone at the same time and, barring very rare circumstances, every participant has played an equal number of games at any point of the tournament, in the NHL there may be a significant difference in the number of games played by different teams, so summing the opponents' points up will not work very well. And these opponents have also played a different number of games, so their total number of points is not a very good indicator.

Fortunately, it's not a big deal. Instead of totals, let's operate with per-game numbers. So the NHL Buchholz Coefficient for a team after N games becomes:

$$B = \frac{1}{N} \sum_{n=1}^{N} PPG_n$$

The same applies to the remaining schedule strength, where the per-game numbers of the remaining opposition are summed and averaged.

So, if the team played three games against opponents who currently are:
A) 6 points in 4 games, B) 3 points in 3 games, C) 2 points in 5 games, then the team's Buchholz value would be (6/4 + 3/3 + 2/5) / 3 = 2.9/3 ~ 0.967pts.
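The same calculation as a short Python sketch. `nhl_buchholz` is a name chosen for illustration; I am assuming each game played contributes one entry, so an opponent faced twice appears twice in the list.

```python
def nhl_buchholz(opponents):
    """Average the opponents' current points-per-game.

    opponents: one (points, games_played) pair per game played.
    """
    ppg = [points / games for points, games in opponents]
    return sum(ppg) / len(ppg)


# The three-game example above: opponents A, B and C.
print(round(nhl_buchholz([(6, 4), (3, 3), (2, 5)]), 3))   # 0.967
```

Feeding it the list of remaining opponents instead of the ones already played gives the remaining schedule strength.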

Here are the current (Mar 12th 2017) Buchholz coefficients and remaining schedule strengths for all 30 teams (and note how the Blues stand out with plenty of matchups vs Colorado and Arizona remaining).

+-----------------------+-----------+-------+-------+
| Team Name             | PPG       | Buch  | RStr  |
+-----------------------+-----------+-------+-------+
| Washington Capitals   | 1.4179105 | 1.119 | 1.133 |
| Pittsburgh Penguins   | 1.4029851 | 1.117 | 1.127 |
| Minnesota Wild        | 1.3939394 | 1.090 | 1.070 |
| Columbus Blue Jackets | 1.3731343 | 1.125 | 1.132 |
| Chicago Blackhawks    | 1.3283582 | 1.088 | 1.096 |
| San Jose Sharks       | 1.2985075 | 1.106 | 1.106 |
| New York Rangers      | 1.2941176 | 1.120 | 1.184 |
| Ottawa Senators       | 1.2537313 | 1.105 | 1.169 |
| Montreal Canadiens    | 1.2352941 | 1.122 | 1.097 |
| Edmonton Oilers       | 1.1791044 | 1.121 | 1.040 |
| Anaheim Ducks         | 1.1764706 | 1.102 | 1.150 |
| Calgary Flames        | 1.1764706 | 1.099 | 1.140 |
| Boston Bruins         | 1.1470588 | 1.115 | 1.151 |
| Toronto Maple Leafs   | 1.1343284 | 1.114 | 1.150 |
| Nashville Predators   | 1.1323529 | 1.105 | 1.116 |
| St. Louis Blues       | 1.1194030 | 1.144 | 0.943 |
| New York Islanders    | 1.1194030 | 1.142 | 1.103 |
| Tampa Bay Lightning   | 1.0895522 | 1.121 | 1.134 |
| Los Angeles Kings     | 1.0746269 | 1.118 | 1.104 |
| Philadelphia Flyers   | 1.0447761 | 1.122 | 1.179 |
| Florida Panthers      | 1.0298507 | 1.118 | 1.175 |
| Carolina Hurricanes   | 1.0000000 | 1.138 | 1.136 |
| Buffalo Sabres        | 0.9855072 | 1.127 | 1.158 |
| Winnipeg Jets         | 0.9565217 | 1.110 | 1.143 |
| Vancouver Canucks     | 0.9558824 | 1.115 | 1.152 |
| Dallas Stars          | 0.9552239 | 1.119 | 1.100 |
| Detroit Red Wings     | 0.9545455 | 1.151 | 1.059 |
| New Jersey Devils     | 0.9117647 | 1.148 | 1.132 |
| Arizona Coyotes       | 0.8358209 | 1.133 | 1.098 |
| Colorado Avalanche    | 0.6119403 | 1.128 | 1.164 |
+-----------------------+-----------+-------+-------+

In the next installment we're going to talk about the application of the Sonneborn-Berger coefficient to the NHL regular season.


Thursday, March 2, 2017

On schedule - played and remaining

Here I would like to present a visualization of the teams' schedules, played and remaining. This is actually a graphic representation of the Buchholz/Sonneborn and team Elo tables I present on the website.

First, let's start with the played games and points.


Naturally, most of the squares above the X-diagonal indicate more points than the ones below; however, we can see interesting anomalies, such as BUF-OTT, TOR-BOS, ARI-SJS, WPG-CHI and, probably the most intriguing, NYR-WSH (an expected first-round meeting).

Another unusual thing is that the Sharks are only playing Colorado twice this season rather than the regular 3-4 intraconference games.

Now let's take a look at the remaining games and the expected points.


We can see that STL may expect a big boost from having to play Colorado four(!) more times this season as well as Arizona three times, and that Ottawa has its two biggest season series mostly unresolved, against MTL and BOS. The expected points are calculated from the teams' Elo ratings:

$$xPts = \frac{N_{games}}{1 + 10^{(Elo_{opp} - Elo_{team})/400}}$$

However, for the sake of precision this number should've been scaled by 2 (since it produces an outcome between 0 and 1, with 0.5 for a "tie") and also by the OT factor, i.e. the probability of a team getting an OT point, around 1.125. But for visualization purposes this does not matter.
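For reference, here is the expected-points formula as a small Python sketch. The function name, the `scale` parameter and the ratings used are all hypothetical, and I am reading "scaled by" as plain multiplication, which is an assumption on my part.

```python
def expected_points(n_games, elo_team, elo_opp, scale=1.0):
    """Elo expectation over n_games, times an optional scale factor."""
    expectation = 1 / (1 + 10 ** ((elo_opp - elo_team) / 400))
    return scale * n_games * expectation


# Two evenly rated teams with four games left: the unscaled value used
# for the visualization, then the value scaled by 2 and by 1.125.
print(expected_points(4, 1500, 1500))               # 2.0
print(expected_points(4, 1500, 1500, 2 * 1.125))    # 4.5
```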

There are also nice patterns indicating travels through California and Western Canada.