"Tilted Ice", the only hockey paper accepted for Sloan, will present an insightful look at how teams (although not individual players) change their on-ice behavior in the third period of games. There will also be a panel on Friday dubbed “Hockey Analytics: Out of the Ice Age” that will discuss the use of analytical judgment in individual player evaluation. Compared with recent developments in basketball, however, hockey is still a ways away from a warm period.
Take, for example, EPV (short for Expected Possession Value), a methodology to be presented in the Sloan research paper "POINTWISE" that uses optical tracking data from the SportVU cameras the NBA installed earlier this year in collaboration with STATS LLC to evaluate player performance and decision making. Another Sloan 2014 research paper, "The Hot Hand", also uses the NBA's optical tracking data to evaluate a different aspect of performance (streakiness). While the use of optical tracking cameras in the NBA is old news, the NBA's D-League announced a few weeks ago that four teams will pilot small, wearable devices for tracking player movement and other bio-related indicators.
For a hockey fan, the NBA’s use of "big data" collection and analysis techniques is a woeful reminder of just how far hockey analytics lags behind its peers. How should big data be defined? For tech companies, the phrase is generally used to describe data whose size exceeds what traditional databases and analytics tools can hold and process – a bit of a silly definition, since it just means today’s big data analysis tools are tomorrow’s ordinary data analysis tools. Perhaps a better definition comes from the 2013 Financial Times Book of the Year finalist "Big Data", which suggests the revolution is not in the tools themselves. While increasingly sophisticated business intelligence and analytics tools help make big data analysis more economically feasible for businesses and individual statisticians alike, the real revolution is the world’s shift toward capturing far more streams of data – all the data – and analyzing it empirically.
The NBA’s pioneering use of machine-generated data (as opposed to human-generated data from sources such as emails, photos and tweets) highlights what I believe will be the most common way for sports to harness new data sources. While POINTWISE and The Hot Hand will spark discussions at Sloan about additional use cases in basketball, the papers should also spark discussions about the use of optical tracking cameras in other fluid, fast-paced sports, namely hockey.
The Next Wave of Hockey Datafication
If I were a betting man, I would wager that the hockey analytics panel at Sloan will only briefly touch on the potential of optical tracking cameras for the NHL, and only if the question is raised. Based on the panel description, the discussion will focus more on expanding the adoption and usage of current hockey analytics tools (such as Corsi, Fenwick and PDO, popularized by Behind the Net and Extra Skater) among hockey decision makers. However, using machine-generated data to create a hockey equivalent of the EPV model with "big data" is an intriguing idea for the NHL to pilot.
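For readers new to the stats named above, here is a minimal sketch of how they are commonly computed: Corsi counts all shot attempts (on goal, missed and blocked), Fenwick drops the blocked attempts, and PDO adds on-ice shooting percentage to on-ice save percentage. The `OnIceTotals` class, its field names and the sample numbers are made up for illustration and do not come from any real data feed.

```python
from dataclasses import dataclass


@dataclass
class OnIceTotals:
    """Hypothetical event counts recorded while a team (or player) is on the ice."""
    goals_for: int
    goals_against: int
    shots_on_goal_for: int       # shots on goal, goals included
    shots_on_goal_against: int
    missed_for: int              # own attempts that missed the net
    missed_against: int
    blocked_for: int             # own attempts blocked by the opponent
    blocked_against: int         # opponent attempts blocked by this team


def _share(for_count: int, against_count: int) -> float:
    """Return for / (for + against), refusing to divide by zero."""
    total = for_count + against_count
    if total == 0:
        raise ValueError("no events recorded")
    return for_count / total


def corsi_for_pct(t: OnIceTotals) -> float:
    """Share of all shot attempts (on goal + missed + blocked)."""
    cf = t.shots_on_goal_for + t.missed_for + t.blocked_for
    ca = t.shots_on_goal_against + t.missed_against + t.blocked_against
    return _share(cf, ca)


def fenwick_for_pct(t: OnIceTotals) -> float:
    """Share of unblocked shot attempts (on goal + missed)."""
    ff = t.shots_on_goal_for + t.missed_for
    fa = t.shots_on_goal_against + t.missed_against
    return _share(ff, fa)


def pdo(t: OnIceTotals) -> float:
    """On-ice shooting percentage plus on-ice save percentage."""
    if t.shots_on_goal_for == 0 or t.shots_on_goal_against == 0:
        raise ValueError("need shots on goal for both sides")
    shooting_pct = t.goals_for / t.shots_on_goal_for
    save_pct = 1 - t.goals_against / t.shots_on_goal_against
    return shooting_pct + save_pct


# Made-up totals for a single game.
sample = OnIceTotals(
    goals_for=3, goals_against=2,
    shots_on_goal_for=30, shots_on_goal_against=25,
    missed_for=12, missed_against=10,
    blocked_for=8, blocked_against=5,
)
print(f"Corsi For %:   {corsi_for_pct(sample):.3f}")
print(f"Fenwick For %: {fenwick_for_pct(sample):.3f}")
print(f"PDO:           {pdo(sample):.3f}")
```

All three are built from counted events, which is why they work only as proxies for possession rather than direct measurements of it.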
The idea of an EPV-like metric for hockey isn't new. At Sloan last year, a research paper presented a methodology, dubbed Total Hockey Rating (THoR), that aimed to evaluate NHL players based on the idea that each player contributes to the probability that a goal is scored or prevented. Of course, a study suggesting that Tyler Kennedy was the third most valuable player in the NHL from 2010 to 2012 should raise some eyebrows. While THoR pushes hockey analytics forward with a sound methodology, its use of hits reflects the real weakness facing hockey analytics: the stats that are easily measurable don’t necessarily reflect a player's true value away from when and where a notable hockey play happens, which is something machine-generated data could readily capture.
As in POINTWISE, an EPV algorithm leveraging machine-generated data could be devised for hockey that tracks the probability of scoring at every moment after a player enters the offensive zone (or whenever a player has the puck in any zone, if Ondrej Pavelec is in net). EPV also presents a framework to calculate the value of “entry passes, dribble drives and double-teams” in basketball; in hockey, the same could be done to evaluate pieces of hockey strategy, be it dump-and-chases, shot selection or line chemistry (I’m looking your way, Chris Kunitz). One could imagine a world where a trailblazing hockey coach plugs players into a zone model similar to the half-court model provided by POINTWISE co-author Kirk Goldsberry in this Grantland piece and, leveraging billions of rows of machine-generated data, calculates success probabilities for player selection and formation on power plays and penalty kills. Models for the antithesis of EPV – perhaps expected defensive value – could be developed to identify the players best at limiting high-percentage shot opportunities and creating turnovers. Not only will machine-generated data help overcome the need to use Corsi and Fenwick as proxies for puck possession, but it will also help identify more accurately which players correlate with both possession and takeaway ability (not to mention turnover liability).
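To make the idea concrete, here is a toy sketch of an expected value calculation for a single moment of a zone entry. It is not the POINTWISE model, which is far richer. It only illustrates the underlying arithmetic: weight the value of each option by how likely the puck carrier is to choose it. The action names, probabilities and goal values below are invented for illustration.

```python
from typing import Dict, Tuple

# action -> (probability the puck carrier chooses it,
#            expected goals for the possession if he does)
Option = Tuple[float, float]


def expected_possession_value(options: Dict[str, Option]) -> float:
    """Probability-weighted average of the value of each available action."""
    total_probability = sum(p for p, _ in options.values())
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("action probabilities must sum to 1")
    return sum(p * value for p, value in options.values())


# Invented numbers for a winger carrying the puck over the blue line.
zone_entry = {
    "shoot":        (0.20, 0.08),
    "pass_to_slot": (0.30, 0.12),
    "carry_wide":   (0.35, 0.04),
    "dump_in":      (0.15, 0.02),
}

epv = expected_possession_value(zone_entry)
print(f"EPV at this moment: {epv:.3f} expected goals")
```

With tracking data, the probabilities and values would be re-estimated many times per second from player and puck locations, and a player's decisions could be judged by whether they raise or lower that running number.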
While the potential of machine-generated data to help devise coaching strategy and support GMs in player transaction decisions seems unlimited, its application by coaches "on the fly" seems impractical given the need for prompt decision making (e.g., the home team has an 8-second limit on making line changes between whistles; the away team has just 5 seconds). The arrival of iPads and other tablets behind hockey benches, as has become commonplace among baseball managers such as Joe Maddon, is probably around the corner. However, their use will probably be reserved for drawing plays with styluses and streaming instant video replay rather than plugging in variables to calculate probabilities, which sports such as baseball and football have greater use for given the longer lag time between plays.
Data derived from wearable devices such as those the D-League is testing would also help hockey organizations derive new insights on players, given that models to evaluate speed, acceleration, endurance and other bio-related data in basketball are directly transferable to hockey. As noted in Zach Lowe’s article, the collection of bio-related data would raise concerns for the union, and any application of optical tracking cameras or wearable devices would have to be meticulously negotiated by Donald Fehr and the NHLPA. However, wearable devices, and optical tracking cameras for that matter, remain a ways off without further buy-in for more sophisticated analytics among hockey decision makers in general.
Hockey Analytics Today
For the hockey analytics world, the good news is that acceptance and adoption are gradually coming to the sport, even if it is lagging its peers. Most notably, the Penguins detailed at a predictive analytics conference in Toronto last year how they’re working with the Sports Analytics Institute to create a player evaluation system that leverages player location and shot probabilities to build predictive systems for goals for/against and lifetime value. The model's first application in practice came in 2011, when the Penguins acquired James Neal from the Dallas Stars in what is arguably one of the most lopsided trades in recent memory. In January, the New Jersey Devils threw their hat into the analytics ring by announcing that they would hire a Director of Analytics who will report directly to Lou Lamoriello (although it remains to be seen how or if the old-school Lamoriello will leverage the person ultimately hired for the position).
Old-school instinct and analytics need not be at odds. Take New York City, for example, which has embraced big data analysis techniques to identify illegal, over-occupied buildings that are more prone to deaths in the case of fires. In the aforementioned "Big Data" book, New York City’s first “Director of Analytics”, Mike Flowers, recounts an exchange with a senior fire chief concerning an apartment building with multiple red flags according to his team's algorithm; the chief's gut told him the building was likely passable because the brick exterior was new. Instead of brushing off the old guard’s hunch and sticking with the existing algorithm, Flowers’ team took note of the senior chief’s insight and quantified brick exterior investments through city building permits.
Ultimately, the datafication of hockey presents an exciting opportunity for blogs like this one to grow with the infusion of hockey-related data both on and off the ice, and maybe get a hockey decision maker or two to listen along the way. While currently available analytics deliver deeper insights for evaluating player and team performance than +/-, it is important to recognize their limitations and develop methodologies to better analyze "the coolest game on Earth".
At the very least, it's more entertaining than watching the
Sabres for the foreseeable future.