The evolution of hockey statistics – an ongoing story
Bruce McCurdy Analytics, Big Data, and the Cloud 2012 April 25
Traditional game summaries
1967-68 Plus/minus formally introduced, as well as individual shots on goal / Shooting %
1983-84 Goaltender save percentage added Grant Fuhr
1998-99 Time on ice published, opening the door for rate stats Chris Pronger
1998: NHL introduces Zone Time
… but turfs it in 2002. Why?!
1998: NHL starts to (sporadically) maintain Real Time Scoring System (RTSS)
… but there remain huge problems due to lack of standardization & rink bias
Oilers have twice as many giveaways as Florida … or do they?
• Ranking teams by their home RTSS totals and again by their away totals yields orderings that might as well be random for giveaways and takeaways, and very nearly so for hits and blocked shots.
• The same exercise for Goals For, by contrast, yields a crudely similar ordering home and away.
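One way to put a number on "might as well be randomized" is a rank correlation between each team's home and away totals. The sketch below uses Spearman's rho, which is an assumption on my part since the slides do not say which comparison was used, and the five-team totals are made up purely for illustration. A value near 1 means the home and away orderings agree; a value near 0 means they are unrelated.

```python
def ranks(values):
    """Return 1-based ranks (1 = largest value); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(home, away):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    return pearson(ranks(home), ranks(away))


# Hypothetical totals for five teams (NOT real NHL data).
goals_home = [130, 120, 110, 100, 90]
goals_away = [118, 121, 99, 101, 85]
giveaways_home = [400, 350, 300, 250, 200]
giveaways_away = [210, 260, 180, 270, 230]

print(round(spearman(goals_home, goals_away), 2))          # 0.8
print(round(spearman(giveaways_home, giveaways_away), 2))  # -0.3
```

With real data, each list would hold one entry per team, in the same team order for the home and away lists.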
• Significant home scorer bias in turnover stats. 45% more giveaways and 33% more takeaways by home teams league-wide!
• As a result RTSS is highly unreliable, serving to rank players within a given team but almost useless for comparing players from different clubs.
2002-03: NHL introduces play-by-play reports
… though problems remain with accuracy of some data, e.g. shot distance
“Stripping” of PxP data allows detailed on-ice analysis of individual players
Even-strength shots / Fenwick / Corsi from timeonice.com
Head-to-head match-ups (timeonice.com)
Customizable, sortable stats from behindthenet.ca
Available stats:
• Even strength / powerplay / shorthanded
• Scoring per 60 minutes
• On/off ice plus/minus per 60
• On/off ice shots / Fenwick / Corsi per 60
• On-ice Sh% / Sv% / PDO
• QualComp / QualTeam
• Penalties drawn / taken
• ZoneStart / ZoneFinish
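For readers new to these stats, here is a minimal sketch of how the on-ice shot metrics are commonly defined: Corsi counts all shot attempts (on goal, missed, blocked), Fenwick drops the blocked ones, PDO adds on-ice shooting percentage to on-ice save percentage, and "per 60" scales a count by ice time. The `OnIce` record and its field names are hypothetical, the sample numbers are invented, and this is not a description of how behindthenet.ca computes its tables.

```python
from dataclasses import dataclass


@dataclass
class OnIce:
    """Events recorded while one player is on the ice (hypothetical layout)."""
    toi_minutes: float
    goals_for: int
    goals_against: int
    sog_for: int          # shots on goal, goals included
    sog_against: int
    missed_for: int
    missed_against: int
    blocked_for: int      # own team's attempts that were blocked
    blocked_against: int  # opponents' attempts that were blocked


def per60(count, toi_minutes):
    """Scale a raw count to a rate per 60 minutes of ice time."""
    return count * 60 / toi_minutes


def fenwick(s):
    """Unblocked shot attempts for, minus unblocked shot attempts against."""
    return (s.sog_for + s.missed_for) - (s.sog_against + s.missed_against)


def corsi(s):
    """All shot attempts for, minus all shot attempts against."""
    return fenwick(s) + s.blocked_for - s.blocked_against


def pdo(s):
    """On-ice Sh% plus on-ice Sv%, as a fraction (about 1.000 is average)."""
    shooting_pct = s.goals_for / s.sog_for
    save_pct = 1 - s.goals_against / s.sog_against
    return shooting_pct + save_pct


# Invented season line, for illustration only.
player = OnIce(toi_minutes=1000, goals_for=40, goals_against=35,
               sog_for=500, sog_against=450,
               missed_for=200, missed_against=180,
               blocked_for=150, blocked_against=170)

print(corsi(player), fenwick(player))            # 50 70
print(per60(corsi(player), player.toi_minutes))  # 3.0
print(round(pdo(player), 3))                     # 1.002
```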
• Many stats need to be parsed in terms of positive / negative / neutral game states, e.g.:
  • Leading / trailing / tied (score effects are HUGELY important)
  • PP / PK / EV
  • O-zone / D-zone / neutral zone
• Taken in isolation without context, modern stats will be distorted; e.g. “soft minutes” players used in offensive situations should be expected to have positive numbers in things like Relative Corsi
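To make the point about game states concrete, the sketch below splits shot attempts by score state before computing a Corsi share. The event layout (a state label plus a for/against flag) and the counts are hypothetical, chosen only to show how a flat overall number can hide opposite results when leading and when trailing.

```python
from collections import defaultdict


def corsi_share_by_state(events):
    """Share of shot attempts taken by 'our' team, split by score state.

    events: iterable of (state, is_for) pairs, where state is a label such
    as "leading", "trailing" or "tied" and is_for is True for our attempts.
    """
    tallies = defaultdict(lambda: [0, 0])  # state -> [attempts for, against]
    for state, is_for in events:
        tallies[state][0 if is_for else 1] += 1
    return {state: f / (f + a) for state, (f, a) in tallies.items()}


# Invented attempt counts: outshot when leading, outshooting when trailing.
events = (
    [("leading", True)] * 40 + [("leading", False)] * 60
    + [("trailing", True)] * 60 + [("trailing", False)] * 40
    + [("tied", True)] * 50 + [("tied", False)] * 50
)

shares = corsi_share_by_state(events)
overall = sum(1 for _, is_for in events if is_for) / len(events)

print(shares)   # leading 0.4, trailing 0.6, tied 0.5
print(overall)  # 0.5
```

The same grouping works for PP / PK / EV or zone labels by swapping in a different state key.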
Scoring chances
"A chance is counted any time a team directs a shot cleanly on-net from within home-plate. Shots on goal and misses are counted, but blocked shots are not (unless the player who blocks the shot is “acting like a goaltender”). Generally speaking, we are more generous with the boundaries of home-plate if there is dangerous puck movement immediately preceding the scoring chance, or if the scoring chance is screened. If you want to get a visual handle on home-plate, check this image."
One weakness of the current method is that “home plate” isn’t the best template for the scoring area
Another is that scoring chances are just 1’s and 0’s – no extra weight for first class chances as suggested by heat map colour coding
Actually, scoring areas …which vary for different types of shots and manpower situations. Scoring chance model is greatly simplified from this reality.
Common SC errors and outcomes
• NHL data doesn’t properly record on-ice players: +1 or -1 for selected players
• Scoring chance improperly credited (or missed): +1 or -1 for 10 players
• Scoring chance recorded at wrong game time: +1 or -1 for up to 20 players
• Scoring chance recorded but for wrong team: +2 or -2 for 10 players
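The "+2 or -2 for 10 players" outcome follows from simple bookkeeping: a chance credited to the wrong team takes away a +1 that five skaters earned and hands them a -1 instead, and does the reverse for the other five. The sketch below works through that arithmetic with placeholder player labels (H1..H5, A1..A5); the data layout is an assumption for illustration only.

```python
from collections import Counter


def chance_differential(chances):
    """On-ice scoring chance differential per player.

    chances: list of (attacking_skaters, defending_skaters) tuples, one per
    recorded scoring chance.
    """
    diff = Counter()
    for attackers, defenders in chances:
        for player in attackers:
            diff[player] += 1
        for player in defenders:
            diff[player] -= 1
    return diff


home = [f"H{i}" for i in range(1, 6)]  # placeholder skater labels
away = [f"A{i}" for i in range(1, 6)]

truth = chance_differential([(home, away)])     # chance really was the home team's
recorded = chance_differential([(away, home)])  # scorer credited the wrong team

errors = {p: recorded[p] - truth[p] for p in home + away}
print(errors)  # every home skater is off by -2, every away skater by +2
```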
Neilson Numbers
• Based on ideas of Roger Neilson
• Assignment of individual responsibility on scoring chances for and against
• Requires an extra degree of qualitative judgement over and above deciding whether a scoring chance has occurred
• Eliminates false positives/negatives; however, individual numbers don’t reconcile to team totals
• Fewer recording errors than on-ice scoring chances as players are identified as part of the process
• Same system can be used to assign unofficial assists on GF or errors on GA
• Reliant on a knowledgeable scorer, but as with other scoring chance systems, would work better if 3 or 5 scorers worked independently, then pooled results.
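The slides do not say how independent scorers would pool their results. One simple option, assumed here purely for illustration, is a majority vote: keep a chance only if more than half of the scorers logged it. Events are keyed by (game seconds, team) and the sample logs are invented; a real implementation would also need to treat chances logged a few seconds apart as the same event.

```python
from collections import Counter


def pool_by_majority(logs):
    """Keep the events recorded by a strict majority of scorers.

    logs: one iterable of hashable events per scorer. Majority voting is an
    assumed pooling rule, not one taken from the presentation.
    """
    votes = Counter()
    for log in logs:
        votes.update(set(log))  # each scorer votes at most once per event
    needed = len(logs) // 2 + 1
    return {event for event, n in votes.items() if n >= needed}


# Invented logs from three scorers: (game_seconds, team credited).
scorer_a = [(125, "EDM"), (340, "EDM"), (610, "FLA")]
scorer_b = [(125, "EDM"), (610, "FLA"), (900, "FLA")]
scorer_c = [(125, "EDM"), (340, "EDM"), (900, "EDM")]

pooled = pool_by_majority([scorer_a, scorer_b, scorer_c])
print(sorted(pooled))  # [(125, 'EDM'), (340, 'EDM'), (610, 'FLA')]
```

Note how the disputed chance at 900 seconds drops out: two scorers saw something, but they disagreed on the team, so neither version reaches a majority.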
Zone Start: fad or trend?
Possession • “Hockey is a transition game: offense to defense, defense to offense, one team to another. Hundreds of tiny fragments of action, some leading somewhere, most going nowhere. Only one thing is clear. A fragmented game must be played in fragments. Grand designs do not work. … Before offense turns to defense, or defense to offense, there is a moment of disequilibrium when a defense is vulnerable, when a game’s sudden, unexpected swings can be turned to advantage. It is what you do at this moment, when possession changes, that makes the difference.” • – Ken Dryden, The Game
• “It is noteworthy that in general … our teamwork was considerably above our main contenders. In the game against the Canadian team, the players of the USSR squad made 110 passes, while the Canadians made 60 passes; in the game against Czechoslovakia we made 106 passes, they made 70; in the game against Sweden we made 49 more passes than they did. … This is an indication of quite stable habits and a high culture of playing, a correct understanding of the game by the Soviet players.” • -- Anatoli Tarasov, Road to Olympus
Tarasov Numbers
Good pass: plus. Bad pass: minus.
Good clearance: plus. Bad clearance: minus.
Good rush: plus. Bad rush: minus.
Good shoot in: plus. Bad shoot in: minus.
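The plus/minus scheme above amounts to a running tally per player. A minimal sketch, assuming each tracked play is logged as a (player, action, good) record; the record layout and the sample plays are hypothetical, and judging whether a play was "good" is left to the scorer.

```python
from collections import defaultdict

ACTIONS = {"pass", "clearance", "rush", "shoot in"}


def tarasov_totals(events):
    """Net plus/minus per player: +1 for each good play, -1 for each bad one."""
    totals = defaultdict(int)
    for player, action, good in events:
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action}")
        totals[player] += 1 if good else -1
    return dict(totals)


# Invented shift log, for illustration only.
events = [
    ("Player A", "pass", True),
    ("Player A", "pass", True),
    ("Player A", "clearance", False),
    ("Player B", "rush", True),
    ("Player B", "shoot in", False),
    ("Player B", "pass", False),
]

print(tarasov_totals(events))  # {'Player A': 1, 'Player B': -1}
```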
…and many more advanced ideas
• Goals Versus Threshold (GVT)
• Defence Independent Goalie Rating (DIGR)
• Shot Quality (SQF / SQA)
• Predicted Goals Scored (PGS)
• Zone Start Adjusted Corsi (ZSAC)
• Etc. …
No time to do them all justice here
Thanks for listening!