So I decided to start blogging on a whim this weekend. Partially
because I didn’t want anyone to think I’m the michaelbarba.blogspot.com that
offered unthoughtful opinions in 2009 on legalizing weed as a solution to
boost economic growth, and partially because I had opinions beyond 140
characters on things I read on the internet where the things I’m interested in
– such as hockey, economics, statistics, technology and coffee – intersect.
There was an SAP-sponsored paper that was presented at the
MIT Sloan Sports Analytics Conference in Boston last weekend that caught my
attention titled “Total Hockey Rating (THoR): A comprehensive statistical rating of National Hockey League forwards and defensemen based upon all on-ice events”. What the paper tries to
present is a “reliable methodology that can quantify the impact of players in
creating and preventing goals for both forwards and defenseman” (Schuckers and
Curro 1, 2013). After that, the paper identified nine on-ice action events to
quantify THoR, including shots, turnovers and hits. The paper smartly isolated
biases for several different considerations – how takeaways / giveaways are
calculated (with home scorekeepers tending to favor the former statistic), offensive
vs. defensive zone starts, talent level of goalies and other players on the ice,
and shot location selection (providing higher weights to higher percentage
shots). Overall, it’s an extremely well-orchestrated study, but there are a few
major criticisms I have for THoR, in addition to the grammatical mistake from
the paper’s abstract I quoted above (defensemen should be plural). Chiefly: the
inclusion of hits in THoR’s methodology.
The original purpose of measuring hits as a statistic was to
treat it as a turnover metric – for example, if Shea Weber knocks Patrick Kane
off the puck and Roman Josi gains possession as a result, Weber is rewarded
with a hit. However, the stat has become a subjective tool that the
scorekeepers can reward regardless of outcome. For example, say Kane and
Patrick Sharp are on a 2 on 1 vs. Weber and Weber hits Kane right after Kane
sets up Sharp for a goal. The scorekeeper can award Sharp the goal, Kane the
assist and Weber the hit.
The statistic is further diluted by home scorekeeping bias.
Consider this random Red Wings – Blue Jackets game from February 2012 in
Columbus. Although the Blue Jackets lost 5-2, they managed to “out hit” the Red Wings
33-2. Take a look at team hits for the 2011-12 season; the Phoenix Coyotes were
fifth in the league at home in hits, but only nineteenth on the road. Either the
Coyotes are extra fired up to play in an arena where announced attendance was 72.5% of capacity last season, or (the more obvious answer) the scorekeepers
are biased towards the home team.
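To put a rough number on that kind of gap, here is a minimal Python sketch that compares a team's hits per game at home against its hits per game on the road. The function name and the season totals are made up for illustration and are not actual 2011-12 figures; a ratio well above 1 would be consistent with (though not proof of) a generous home scorekeeper.

```python
def home_road_hit_ratio(home_hits, home_games, road_hits, road_games):
    """Return hits per home game divided by hits per road game."""
    if home_games <= 0 or road_games <= 0:
        raise ValueError("game counts must be positive")
    if road_hits <= 0:
        raise ValueError("road hits must be positive to form a ratio")
    home_rate = home_hits / home_games
    road_rate = road_hits / road_games
    return home_rate / road_rate


# Hypothetical season totals (NOT real data): 41 home games, 41 road games.
ratio = home_road_hit_ratio(home_hits=1025, home_games=41,
                            road_hits=820, road_games=41)
print(f"home/road hit ratio: {ratio:.2f}")
```

Comparing every team's ratio side by side would show whether a handful of buildings stand out, which is the pattern you would expect from scorekeeper bias rather than from teams simply playing harder at home.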
Biased home scorekeeping can also extend to stats beyond
hits, which the authors did not account for. In the Sabres - Rangers game last weekend at MSG, Drew Stafford scored a rare goal directly off a
Mikhail Grigorenko faceoff that the Rangers “won”. The book Scorecasting finds
that the “home court” advantage a team maintains is reflective of a) the away
team’s travel schedule (in the NBA, MLB and NHL) and b) subjective decisions
made by referees. In theory, the ability to draw penalties and get on the power
play favors the home team as a result of the referee bias. The referee bias can
also extend to non-calls favoring the home team, a la this Matt Duchene goal against the Predators last month that was just a bit offside (Side note: in fairness, this memorable non-call rewarded the away team).
In a more extreme case of biased home scorekeeping, Jeff
Marek from Sportsnet in Canada recounted a story on the Marek vs. Wyshynski
podcast a few weeks ago from the 1997-98 season when Glen Sather was trying to trade
defenseman Dan McGillis. To boost McGillis’ trade value, Sather had the Oilers
scorekeeper tally extra hits for McGillis to make potential suitors think he
was a hitting machine – at the trade deadline, the Philadelphia Flyers acquired
McGillis and a second round pick from the Oilers for Janne Niinimaa, who gave
the Oilers five productive seasons and an All-Star game appearance in 2001. (Side
note: In fairness, while McGillis was not the hitting machine the Flyers
thought they had acquired on their blue line, McGillis did give the Flyers five quality
seasons and won the Barry Ashbee Trophy awarded to Philadelphia’s top
defenseman in 2001).
Hits’ lack of meaningfulness is not necessarily the study’s
fault, but including them in THoR considerably weakens its usefulness. However, what is not accounted for in
THoR’s methodology hurts the study just as much as including hits. Part of this
is beyond the study’s control – x-y coordinates for where every single player
is on the ice during a goal, which undoubtedly affect the probability a shot
will go in, are not recorded. But there are a couple of measurable oversights
that weaken THoR – dummy variables for power plays, and who is coaching the
team.
Including dummy variables for power play situations (whether
it be 5 on 4, 5 on 3, or 4 on 3) is a bit of an academic point. The study
factors in time spent on the power play and multiplies it against the league
average; but as of this writing, the Anaheim Ducks are more than twice as
likely to score with a man advantage as the Buffalo Sabres are (29.2% vs.
12.2%). Because power play situations are mutually exclusive with regular
situations, the effect can be easily isolated with a dummy variable.
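As a minimal sketch of what that dummy variable would look like, assume each observation is a shot tagged with whether it went in and whether it came on the power play. This is not the paper's model, just an illustration: with a single 0/1 regressor, ordinary least squares has a closed form in which the intercept is the even-strength scoring rate and the dummy's coefficient is the extra scoring rate on the power play. The function name and the shot data are hypothetical.

```python
def fit_power_play_dummy(shots):
    """Fit goal = intercept + pp_effect * on_power_play by least squares.

    `shots` is a list of (goal, on_power_play) pairs, with goal as 0 or 1.
    For one binary regressor the OLS solution is just the two group means:
    intercept = even-strength mean, pp_effect = power-play mean - intercept.
    """
    even = [goal for goal, on_pp in shots if not on_pp]
    power_play = [goal for goal, on_pp in shots if on_pp]
    if not even or not power_play:
        raise ValueError("need observations in both situations")
    intercept = sum(even) / len(even)
    pp_effect = sum(power_play) / len(power_play) - intercept
    return intercept, pp_effect


# Hypothetical shots (NOT real data): 2 goals on 20 even-strength shots,
# 2 goals on 10 power-play shots.
shots = ([(1, False)] * 2 + [(0, False)] * 18 +
         [(1, True)] * 2 + [(0, True)] * 8)
intercept, pp_effect = fit_power_play_dummy(shots)
print(f"even-strength rate: {intercept:.2f}, power-play bump: {pp_effect:.2f}")
```

Fitting this separately for each team (or interacting the dummy with a team indicator) would let the power-play bump differ between a team like Anaheim and a team like Buffalo instead of assuming the league average for everyone.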
Though no mention of it is made in the study, it should be
notable that Alexander Ovechkin (who scored 70 goals over the course of the two
observed seasons) is completely missing from the study’s Top 50 players list (just
as it should be notable that Tyler Kennedy is Number 3 behind Alexander Steen and
Pavel Datsyuk). But wouldn’t the Capitals firing their run-and-gun
head coach Bruce Boudreau midway through the 2011-12 season in favor of the
defensive-minded Dale Hunter have a negative effect on Ovechkin and other
Washington Capitals that didn’t make the list such as Nicklas Backstrom, Mike
Green and Brooks Laich? (Side note: Alexander Semin was the only Capital to
make the list at #49, and is evidently not too missed in the Washington locker room these days). Notably, THoR treats players as separate if they switched teams over the course of the study; this logic should also apply when their coach changes.
As well, measuring players purely by goals for and goals
against is a sharp departure from what Corsi (arguably the most
popular advanced hockey metric) chiefly uses to rank individual players
– shot attempt differential (which is also used as a proxy for puck
control). Of course, there would
be little to no difference in findings between THoR and Corsi if they both
measured the same metric. But by measuring players on creating and preventing
goals alone, you can’t control for luck’s contribution as well as you can with
shot differential, chiefly because goals are not independent of the goalie. Whether
Jonathan Quick (who led all goalies with 40 or more starts last
season with a 1.95 GAA) or Vesa Toskala (who is Vesa Toskala) is in net will make
a difference in goal probability. But Quick still let in 133 goals last year;
without insanely detailed information, you would have to go back and look at
all 133 goals scored on Quick to determine if Player X created the goal, or
scored as a result of luck (such as a good rebound, a defensive letdown or poor positioning on the goalie's part).
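For a sense of why the two approaches can disagree, here is a small Python sketch that computes both a Corsi-style shot attempt differential and a goal differential from the same list of on-ice events. The event labels, the tuple layout and the sample shift are assumptions made up for illustration; in particular, each event is assumed to be tagged with the team that took the shot attempt.

```python
# Event types counted as shot attempts: goals, saved shots, misses, and
# attempts that were blocked (assumed to be credited to the shooting team).
SHOT_ATTEMPTS = {"goal", "shot", "miss", "block"}


def differentials(events, team):
    """Return (shot_attempt_differential, goal_differential) for `team`.

    `events` is a list of (event_type, shooting_team) pairs recorded while
    the player of interest was on the ice.
    """
    attempts = 0
    goals = 0
    for event_type, shooting_team in events:
        sign = 1 if shooting_team == team else -1
        if event_type in SHOT_ATTEMPTS:
            attempts += sign
        if event_type == "goal":
            goals += sign
    return attempts, goals


# Hypothetical shift (NOT real data): team A controls play but gets unlucky.
shift_events = [
    ("shot", "A"), ("miss", "A"), ("goal", "B"), ("block", "A"),
    ("shot", "A"), ("shot", "B"), ("shot", "A"),
]
attempt_diff, goal_diff = differentials(shift_events, "A")
print(f"shot attempt differential: {attempt_diff:+d}, goal differential: {goal_diff:+d}")
```

In this made-up shift, team A out-attempts team B five to two yet comes out a goal behind, which is exactly the kind of result a goals-only rating would hold against the players on the ice.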
Ultimately, any methodology will have its shortcomings. At
the very least, this paper has presented an alternate way to think about what
makes up a top NHL player, and is a good first pass at the “reliable methodology” it sets out to develop. But unless you’re Tyler Kennedy or
his agent, I imagine this study will have little more influence on shaping any
business decisions in the NHL than this blog will.