After completing the first part of the lemma research (the penalty box), the second part was shorter and easier, but just as useful. I decided to find out the share of time teams spend, on average, at even strength, on the power play or shorthanded, and with an empty net. Given these shares and the number of goals scored in each situation, I was able to calculate the frequency of EVG/PPG/SHG/ENG, or its inverse, which I called the difficulty of such a goal.
I scanned the database of all games from the 1999/00 season to today, along with all the goals extracted from those games. Penalty shot goals were ignored, whether scored during the game itself or in a post-game shootout. The EN time was calculated as total game time minus goaltender TOI. PP/SH time was deduced from the recorded PP TOI of the players. The EV time then becomes the total game time minus EN minus PP of both teams.
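The time split described above can be sketched in Python for a single game. This is only an illustration: the function name, the argument layout and the sample numbers are hypothetical, and it assumes that the EN and PP times of both teams are subtracted from the game time to get the EV time.

```python
def split_game_time(total_seconds, goalie_toi, pp_toi):
    """Split one game's clock into EV / PP / EN time, in seconds.

    total_seconds: length of the game
    goalie_toi:    {team: goaltender time on ice}   (hypothetical layout)
    pp_toi:        {team: recorded power-play time} (hypothetical layout)
    """
    # Empty-net time per team: game time minus that team's goaltender TOI.
    en = {team: total_seconds - toi for team, toi in goalie_toi.items()}
    # One team's power-play time is treated as the other team's shorthanded time.
    pp_total = sum(pp_toi.values())
    # Assumption: EV time is what remains after removing EN and PP of both teams.
    ev = total_seconds - sum(en.values()) - pp_total
    return {"EV": ev, "PP": pp_total, "EN": en}


# Made-up 60-minute game: team B pulls its goalie for the last 90 seconds.
example = split_game_time(
    3600,
    goalie_toi={"A": 3600, "B": 3510},
    pp_toi={"A": 240, "B": 360},
)
print(example)
```

Summing these per-game values over a season gives the TOI totals used in the difficulty formula.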
Then I calculated the difficulty of scoring a goal in each of these situations through the following formula:
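$$\mathrm{Diff}_{\mathrm{TYPE}} = \frac{\mathrm{GOALS}_{\mathrm{EV}}}{\mathrm{GOALS}_{\mathrm{TYPE}}} \times \frac{\mathrm{TOI}_{\mathrm{TYPE}}}{\mathrm{TOI}_{\mathrm{EV}}}$$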
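where TYPE is one of EV, PP, SH or EN, so the difficulty of an EV goal is 1 by definition. A value below 1 means goals of that type come more easily than at even strength, and a value above 1 means they come harder. Here are the combined results of the difficulties in a table: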
Season | EV | PP | SH | EN |
---|---|---|---|---|
1999 | 1.000 | 0.502 | 3.506 | 0.162 |
2000 | 1.000 | 0.473 | 3.387 | 0.146 |
2001 | 1.000 | 0.492 | 3.635 | 0.153 |
2002 | 1.000 | 0.468 | 3.585 | 0.167 |
2003 | 1.000 | 0.445 | 3.127 | 0.221 |
2005 | 1.000 | 0.535 | 4.247 | 0.272 |
2006 | 1.000 | 0.506 | 4.000 | 0.228 |
2007 | 1.000 | 0.458 | 3.597 | 0.187 |
2008 | 1.000 | 0.438 | 3.745 | 0.183 |
2009 | 1.000 | 0.456 | 4.044 | 0.192 |
2010 | 1.000 | 0.450 | 3.517 | 0.177 |
2011 | 1.000 | 0.460 | 3.568 | 0.169 |
2012 | 1.000 | 0.430 | 3.890 | 0.169 |
2013 | 1.000 | 0.453 | 3.209 | 0.198 |
2014 | 1.000 | 0.427 | 3.564 | 0.171 |
2015 | 1.000 | 0.415 | 3.284 | 0.158 |
2016 | 1.000 | 0.419 | 3.252 | 0.178 |
2017 | 1.000 | 0.427 | 3.052 | 0.168 |
If you divide 1 by these values, you get the frequency of goals scored in each situation per unit of ice time, relative to even strength.
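As a minimal sketch, the formula and its reciprocal look like this in Python. The function names are my own for illustration, and the goal and TOI counts in the first example are invented round numbers, not values from the dataset. The last line uses the 2017 PP difficulty from the table.

```python
def difficulty(goals_ev, goals_type, toi_type, toi_ev):
    """Difficulty of a goal type relative to even strength (EV = 1)."""
    return (goals_ev / goals_type) * (toi_type / toi_ev)


def relative_frequency(diff):
    """Scoring rate per unit of ice time, relative to even strength."""
    return 1.0 / diff


# Invented numbers: half as many goals as at EV, scored in a quarter of the time.
d = difficulty(goals_ev=100, goals_type=50, toi_type=10, toi_ev=40)
print(d)                      # 0.5
print(relative_frequency(d))  # 2.0

# 2017 PP difficulty taken from the table above.
print(round(relative_frequency(0.427), 3))  # 2.342
```

Read this way, a 2017 power-play goal came roughly 2.3 times as often per unit of ice time as an even-strength goal.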
The dataset containing this data is available on the website, on the Request Analysis page.
So why did I need these two lemmas? That blog post won't be ready any time soon, and I'd better resume the "Page A Day" series.