by Thor Olavsrud

NHL seeks to grab fans with data analytics

Mar 25, 2015 · 7 mins
Analytics, Big Data, Data Mining

The NHL wants to bring in hardcore fans and casual fans alike with a multi-phase plan to offer 'enhanced statistics' along with advanced visualizations on 100 years of historical data and its real-time scoring data.

Credit: REUTERS/Brian Snyder

Data analytics aren’t just for identifying potential cost savings or driving new insights into customers. They can also be used to create new data products.

Case in point: The National Hockey League (NHL), which is digitizing and repackaging its statistics — some of them going back nearly 100 years — to create a new enhanced statistics offering that it hopes will appeal to hardcore hockey fans while drawing in more casual fans.

Last month, the NHL announced a multi-year North American partnership with SAP SE. The partnership will help the league give fans, broadcasters and media the ability to analyze official NHL, team and player stats, including advanced visualizations that will “tell stories,” as Steve McArdle, executive vice president of Digital Media and Strategic Planning at the NHL, puts it.

“There is fan appetite to be able to derive insight around what happens on the ice as computing power increases and fans are starting to understand more about the strategy and tactics and action on the ice,” he says.

Analytics are changing on the fly

Hockey has always been an old-school sort of sport, and many clubs and coaches have been slow to embrace analytics — much like baseball before Sabermetrics exploded into the wider public consciousness. And it has been argued that hockey’s rapid and fluid nature, with players entering and leaving the ice on-the-fly as play continues, makes it resistant to the sort of modeling that can be done on baseball with its stately pace and one-on-one matchups.

But as with baseball’s Sabermetrics, a niche core of devoted hockey fans embraced the challenge of modeling hockey and created their own language to describe it. You don’t have to go very far down the rabbit hole to encounter some of their metrics:

  • Corsi. Corsi, named for Buffalo Sabres goaltending coach Jim Corsi, is the sum of shots on goal, missed shots and blocked shots, essentially the number of shot attempts. It can be expressed as a differential or percentage by comparing shot attempts for and against a team. The idea is to approximate puck possession. You can use the stat to understand how well a team controls the puck, but you can also apply it to players by calculating a team’s shot attempts for and against while a player is on the ice. While controlling puck possession doesn’t guarantee victory, teams with better puck possession have a higher probability of success in the long term. Hockey blogger Kent Wilson notes that most players and teams have Corsi ratios between 40 and 60 percent, with the elite coming in at 55 percent or better.
  • Fenwick. Fenwick is a variation on Corsi named for Calgary Flames blogger Matt Fenwick. It counts only shots on goal and missed shots, excluding blocked shots. Wilson says it tends to have a stronger correlation with scoring chances, though he also says the difference between Corsi and Fenwick is negligible in the long run.
  • PDO. PDO looks like it’s an abbreviation but it’s not; it’s taken from the Internet handle of Brian King, who first proposed it. PDO is the sum of a team’s even-strength shooting percentage and save percentage. It can also be used to quantify individual players by summing even-strength shooting percentage and save percentage when a particular player is on the ice. The idea is to quantify “puck luck” and therefore determine whether a team is over-performing due to good luck or under-performing due to bad luck. As Staff Writer Evan Sporer notes, PDO can help explain how the Washington Capitals fared so poorly last season despite having wunderkind forward Alexander Ovechkin. Ovechkin managed to find the net on 8.97 percent of his own shots, but the team only scored on 5.84 percent of shots when Ovechkin was on the ice. That gives him the fifth-lowest ranked PDO last season among skaters who skated at least 1,000 minutes.
  • Zone starts. The descriptively named zone starts stat is used to adjust a player’s Corsi stat. It is the ratio between offensive zone faceoffs and defensive zone faceoffs at even strength. It can be used to offset the fact that players with a high zone start ratio (meaning they start more frequently in the offensive zone) will naturally tend to have a higher Corsi than players with low zone start ratios, who tend to start in the defensive zone.
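
For readers who want to see the arithmetic, the four metrics above can be sketched in a few lines of Python. This is a minimal illustration, not the NHL's or any blogger's official calculation. The `ShotCounts` container, the function names and the sample numbers are all hypothetical, and expressing zone starts as the offensive share of offensive-plus-defensive faceoffs is an assumption about how the ratio is reported.

```python
from dataclasses import dataclass


@dataclass
class ShotCounts:
    """Even-strength shot events for one side (hypothetical container)."""
    goals: int
    shots_on_goal: int  # includes goals
    missed: int         # attempts that missed the net
    blocked: int        # attempts blocked before reaching the net

    @property
    def corsi(self) -> int:
        # Corsi: shots on goal + missed shots + blocked shots
        return self.shots_on_goal + self.missed + self.blocked

    @property
    def fenwick(self) -> int:
        # Fenwick: Corsi without the blocked shots
        return self.shots_on_goal + self.missed


def share(count_for: int, count_against: int) -> float:
    """Percentage of events that went in favor of one side."""
    total = count_for + count_against
    return 100.0 * count_for / total if total else 0.0


def pdo(team: ShotCounts, opponent: ShotCounts) -> float:
    """Shooting percentage plus save percentage, both in percent."""
    shooting_pct = 100.0 * team.goals / team.shots_on_goal
    saves = opponent.shots_on_goal - opponent.goals
    save_pct = 100.0 * saves / opponent.shots_on_goal
    return shooting_pct + save_pct


def zone_start_pct(offensive_faceoffs: int, defensive_faceoffs: int) -> float:
    """Share of non-neutral-zone faceoffs taken in the offensive zone (assumed form)."""
    return share(offensive_faceoffs, defensive_faceoffs)


# Made-up sample counts, for illustration only.
team = ShotCounts(goals=3, shots_on_goal=30, missed=12, blocked=8)
opponent = ShotCounts(goals=2, shots_on_goal=25, missed=10, blocked=15)

print("Corsi share:", share(team.corsi, opponent.corsi))
print("Fenwick share:", share(team.fenwick, opponent.fenwick))
print("PDO:", pdo(team, opponent))
print("Zone start %:", zone_start_pct(12, 8))
```

The same functions work for an individual player if the counts are restricted to the time that player is on the ice, which is how the player-level versions described above are built.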

There are others as well. While McArdle notes that fans who argue about these stats online are extremely engaged, he says the obscure jargon can be impenetrable to more casual fans and the conversations are happening away from

“This cottage industry has popped up around our statistics about applying enhanced analytics or statistics around our games,” he says. “We strive to make the home for all official conversations around hockey at the NHL level. That was kind of the impetus. We have the data. We have the official statistics around the game and our real-time scoring system. We get the data faster than anyone else.”

A new way to analyze the game

“Goal number one is to put these stats on,” says Chris Foster, director of Digital Business Development at the NHL, who is leading the charge to digitize all of the league’s game records going back to its first season in 1917-1918. “We know that’s where the majority of casual fans come to find us. We know there’s going to be a learning curve, but we want to teach our fans a new way to look at the game and analyze the game.”

To start, the NHL has renamed the stats. Corsi is now Shot Attempts (SAT), Fenwick is now Unblocked Shot Attempts (USAT), PDO has become Shooting Percentage Plus Save Percentage (SPSV%), etc.

“There will be quite a bit of commentary about how or what these statistics are and how to interpret these statistics,” McArdle says. “We’ve given a lot of thought, even to the names we’ve given to these statistics, to kind of chip away at the mystique. We want to make sure that the casual fan understands what the stat is, to be as descriptive as possible.”

That’s part of the first phase of the redesign, along with more than 30 new extended statistics like first and second assists (1st A, 2nd A), goals by time (G/20), penalties (taken and drawn) by time (Pen/20, PenDr/20) and average shot length (ASL).
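
The "by time" stats lend themselves to a short sketch. The article does not spell out the formula, so the version below assumes that "/20" means a count scaled to 20 minutes of ice time; the function name and sample figures are hypothetical.

```python
def per_20(count: int, ice_time_seconds: float) -> float:
    """Scale a raw count to a per-20-minutes rate (assumed meaning of '/20')."""
    if ice_time_seconds <= 0:
        raise ValueError("ice time must be positive")
    return count * 20 * 60 / ice_time_seconds


# Made-up example: 30 goals in 1,500 minutes of ice time.
goals_per_20 = per_20(30, 1500 * 60)
print(f"G/20: {goals_per_20:.2f}")
```

Normalizing by ice time lets a fourth-line player with limited minutes be compared with a star who plays far more, which raw totals cannot do.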

Phase 2 is planned to roll out in time for the 2015 Stanley Cup Playoffs. It will feature analysis for every Stanley Cup Playoffs game and series using an algorithm that incorporates 37 variables, including road record, goals against, special teams statistics, etc. Advanced filtering will allow users to do game, season and career comparisons between players, teams and games delivered in dynamic line graphs.

In Phase 3, planned for the start of the 2015-2016 season, the NHL will incorporate new metrics, visualizations, active player comparison tools, player performance prediction tools, pre-season rankings and team power indexes.

Phase 4 will come in 2016 to coincide with the NHL’s Centennial Celebration. It will feature the entire official statistical history of the NHL, including every box score dating back to the league’s inaugural 1917-1918 season. With 100 years of stats fully integrated into the package, the NHL will roll out new tools and functionality, including advanced filtering and visualizations that can be applied to the entire history of the league.

“The historical data is certainly not as in-depth as what we’re collecting today,” McArdle says. “What the data going back to 1917 allows you to do is really tell stories.”

For the first time, he says, you will be able to have that argument at the bar about whether Gordie Howe, Wayne Gretzky or Sidney Crosby is the greatest player of all time and use visualizations to back up your claim.

“You’ll be able to do team comparisons across eras,” he says. “You’ll be able to ask ‘who was better at X, Y or Z.’ It will allow fans, broadcast partners, bloggers, analysts and commenters to tap into that data.”

The package will leverage the SAP HANA Enterprise Cloud service, along with various SAP solutions and technologies.
