Seven Stats: Playing with goalie numbers

An important caveat about hockey stats is that the people who developed them aren't satisfied with what they have. Partly because hockey statistics are in their infancy and partly because hockey is a decidedly non-linear sport, the statisticians are aware that they haven't captured the real information they want with the stats they've got.

This is nowhere more true than with goaltending statistics. For much of the history of the NHL, the key statistics to judge goaltenders were wins and Goals Against Average. In fact, the award named for Georges Vezina was originally given to the primary goaltender on the team with the lowest GAA for the season.

But neither wins nor GAA even pretend to separate the goalie's contributions from those of his teammates. They are--wins, especially--statistics that measure a team's performance, not a player's. And in a salary cap world, measuring a player's individual performance is paramount. Over time, especially after shot data became available to the public in the 1980s, new ways of looking at a goalie's performance have been developed.

The big one, of course, is Save Percentage [Saves/Shots Against]. The league started tracking Save Percentage in the early 1980s. At about the same time, the criterion for the Vezina Award was changed from best GAA to "goaltender adjudged best at his position." Today the statistic is so embedded in how we think of goalie performance as to be virtually unquestioned. But there are still problems.

SV% does begin the process of isolating individual performance, but it has some flaws and it's not standing up terribly well to the demands of hockey statisticians. One of the things that we do know about save percentage is that power plays have a disproportionate effect on that metric. Every goalie sees a different amount of time in various power play situations. There's just no way, when looking at total SV%, to know how much the power play or penalty kill has changed their stats. So, a lot of people have begun to use even-strength save percentage as the most reliable way of judging goaltender performance. It has a lot of data to back it up, and is relatively easy to calculate once you have the raw data.
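To make the difference concrete, here is a minimal sketch of both calculations. The season totals are made up for illustration, and the function and variable names are my own, not anything official.

```python
def save_percentage(saves: int, shots_against: int) -> float:
    """Saves divided by shots against; returns 0.0 if no shots were faced."""
    if shots_against == 0:
        return 0.0
    return saves / shots_against


# Hypothetical season totals, split by game situation (made-up numbers).
situations = {
    "even_strength": {"saves": 920, "shots": 1000},
    "penalty_kill": {"saves": 170, "shots": 200},
}

total_saves = sum(s["saves"] for s in situations.values())
total_shots = sum(s["shots"] for s in situations.values())

# Total SV% blends every situation together.
overall_sv = save_percentage(total_saves, total_shots)

# Even-strength SV% uses only the even-strength rows.
ev = situations["even_strength"]
ev_sv = save_percentage(ev["saves"], ev["shots"])

print(f"Overall SV%: {overall_sv:.3f}")
print(f"Even-strength SV%: {ev_sv:.3f}")
```

In this invented example the penalty-kill shots drag the total number below the even-strength number, which is exactly the effect that total SV% hides.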

There are some other statistics that are in the process of being developed and which are intriguing but unproven. First is point-shares, which aims to calculate the number of a team's points in the standings that can be attributed to that player. It uses the difference between the league's averages and the player's personal statistics, the amount of time the player has been on the ice, and a league-wide ratio of goals to points. But because it uses flawed stats, it's going to give you a flawed result. It's a shorthand way of looking at where a specific player fits into a team's system.

Then there's Defense Independent Goalie Rating, which takes an actual algorithm to calculate. It's not something you can sit around and figure out on a napkin. It aims to determine how a specific goaltender would fare against the kind of shots faced by all goaltenders in the league and so take the team's play out of the equation. So far, the creators have run only a single season with this, although there are efforts to continue to add data and see whether this has any predictive or evaluative value over the long haul. I will say that this is the only hockey statistic I have ever seen touted by an ecologist. (Whether Jared Diamond has any authority on hockey goaltending is another question.)

Quality Starts, created by Robert Vollman, are intended to isolate those games where a goalie performs well enough to give his team a chance to win, regardless of the actual outcome of the game. The stat essentially tries to separate offense from goaltending. A Quality Start (QS) is awarded when a goalie stops more than the league-average percentage of shots (currently 0.914) or gives up fewer than three goals while stopping 88.5% or more of the shots he faces. "Quality starts," said Vollman, "have resulted in an actual winning percentage of 0.775, while non-Quality Starts have a winning percentage of only 0.325."
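If you want to check a single game against that definition, the rule is simple enough to write down. This is only a sketch of the definition as described here; the function name and argument names are mine, and the 0.914 default is the league average quoted above, which changes from season to season.

```python
LEAGUE_AVG_SV = 0.914  # league-average save percentage quoted in this post


def is_quality_start(shots_against: int, goals_against: int,
                     league_avg_sv: float = LEAGUE_AVG_SV) -> bool:
    """Return True if a start meets the Quality Start definition above."""
    if shots_against <= 0:
        return False  # assumption: a start with no shots is not counted
    sv = (shots_against - goals_against) / shots_against
    # Condition 1: better than league-average save percentage.
    beats_average = sv > league_avg_sv
    # Condition 2: fewer than three goals allowed and at least .885.
    low_goals_ok = goals_against < 3 and sv >= 0.885
    return beats_average or low_goals_ok


print(is_quality_start(30, 2))  # 28 saves on 30 shots
print(is_quality_start(30, 3))  # 27 saves on 30 shots, three goals allowed
```

The first call beats the league average outright. The second sits at .900 with three goals allowed, so it misses on both conditions.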

According to Cam Charron, good goalies do this about 55% of the time. 60% is a very good rate. The higher your goalie's QS rate is, the more reliable they are. For the record, Mathieu Garon has a QS rate this season of 47.7% (21 of 44 starts), while Dwayne Roloson has a QS rate of 36% (8/22). Dustin Tokarski has none so far, but two games is only two games.

The converse to the QS is the Bail-Out, also developed by Vollman. It's defined as a win awarded to a goalie who didn't earn a QS. That is, the goalie gets the win while stopping less than 88.5% of the shots, or while giving up 3 or more goals and stopping between 88.5 and 91.3% of the shots he faced. Garon has had 8 bail-outs this season, for a rate of 18% (8/44), which seems high, but I don't know what the "normal" range would be. Roloson has 2 bail-outs for a rate of 9.1% (2/22). So, high bail-out rate + low QS rate = a goalie being carried by his team.
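Here is a rough sketch of how you might tally QS and bail-out rates from a game log. The four starts are invented, the `Start` record and function names are hypothetical, and the 0.914 league average is the figure quoted earlier in this post. Rates are per start, matching the 8/44 and 2/22 arithmetic above.

```python
from dataclasses import dataclass


@dataclass
class Start:
    shots_against: int
    goals_against: int
    won: bool


def is_quality_start(start: Start, league_avg_sv: float = 0.914) -> bool:
    """Quality Start test as described in this post."""
    if start.shots_against <= 0:
        return False  # assumption: ignore starts with no shots faced
    sv = (start.shots_against - start.goals_against) / start.shots_against
    return sv > league_avg_sv or (start.goals_against < 3 and sv >= 0.885)


def is_bail_out(start: Start) -> bool:
    """A bail-out is a win in a game that was not a Quality Start."""
    return start.won and not is_quality_start(start)


# Made-up game log for illustration only.
starts = [
    Start(30, 2, True),   # .933, win: quality start
    Start(25, 4, True),   # .840, win: bail-out
    Start(28, 3, False),  # .893 with three goals, loss: neither
    Start(35, 1, False),  # .971, loss: quality start
]

qs_rate = sum(is_quality_start(s) for s in starts) / len(starts)
bail_out_rate = sum(is_bail_out(s) for s in starts) / len(starts)

print(f"QS rate: {qs_rate:.1%}, bail-out rate: {bail_out_rate:.1%}")
```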

Vollman also created a third measure that he called Really Bad Start (RBS), which counted the times a goalie stopped less than 85% of the shots he faced. (Vollman said this left a team with a 10% chance of winning.) Cam Charron and Thomas Drance have refined this measure and re-christened it a "Blow-Up." They award a BU when a goalie stops less than 85% of the shots or gives up 5 or more goals on 39 or fewer shots (that is, a save percentage between .850 and .871). This season, Garon has a Blow-Up rate of 18% and Roloson's rate is 41%. According to Charron, they think that "the most reliable of keepers keep it to within 10-12%." Neither of Tokarski's starts qualifies as a Blow-Up.
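The Blow-Up test, as I read the description above, can be sketched the same way. The function name is mine, and this is my reading of the rule rather than Charron and Drance's own code.

```python
def is_blow_up(shots_against: int, goals_against: int) -> bool:
    """Return True if a start counts as a Blow-Up under the rule above."""
    if shots_against <= 0:
        return False  # assumption: a start with no shots is not counted
    sv = (shots_against - goals_against) / shots_against
    # Original Really Bad Start condition: under .850.
    under_850 = sv < 0.850
    # Refinement: five or more goals allowed on 39 or fewer shots.
    five_on_39 = goals_against >= 5 and shots_against <= 39
    return under_850 or five_on_39


print(is_blow_up(20, 4))  # .800
print(is_blow_up(39, 5))  # about .872, caught by the five-goal rule
print(is_blow_up(40, 5))  # .875 on 40 shots
```

The second case is the one the refinement was added for: the save percentage clears .850, but five goals on 39 shots still counts.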

Quality Starts, Bail-Outs, and Blow-Ups can't replace save percentage, which is still the best single metric we have for goalies (though it really shouldn't be used in isolation, either). In fact, they were intended to replace the "Wins" stat. Still, they can be used to understand something about a goaltender's reliability, since they don't fluctuate nearly as much as shot-based metrics do. One other caveat, though, is that no one knows how much these stats really change from season to season or whether they are in any way predictive of a goalie's long-term success.

Finally, there's Goals Versus Threshold, which is, in the words of the man who developed it, Tom Awad, "the value of a player, in goals, above what a replacement player [essentially a top AHL, non-prospect call-up--in other words, a Curtis Sanford rather than a Jacob Markstrom] would have contributed." In other words, GVT "measures a player's contribution to his team's goal differential." Awad has developed GVT formulas for offense, defense, and goaltending, and they are fairly complex. Goaltending GVT takes into account league average shots against and save percentages, league average goals per game, and certain other factors. The formula for goalie GVT is:

League Avg SV% - (League Avg GAA * .04) = Threshold SV%
Raw GVT = individual saves - (shots against * Threshold SV%)
GGVT = Raw GVT * .75

Why .04 and .75? Those numbers are more or less arbitrary. The .04 is an assumption that a replacement goalie will give up 4% more goals than the league average. The .75 is a "Goalie Responsibility" adjustment that assumes that a team's play accounts for about 25% of a goalie's save percentage. It's not that those things aren't based on statistical observations. It is observable that replacement level goalies tend to give up more than average numbers of goals and that team play affects save percentage. The problem is that there's no real indication of exactly where those levels fall.
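For anyone who wants to play with the numbers, here is a direct transcription of the three formula lines above into code. It follows the formula exactly as written in this post, not Awad's full method, and the league and goalie figures in the example call are invented. The .04 and .75 constants are exposed as parameters (with names I made up) so you can see how sensitive the result is to them.

```python
def threshold_sv(league_avg_sv: float, league_avg_gaa: float,
                 replacement_factor: float = 0.04) -> float:
    """Threshold SV% = League Avg SV% - (League Avg GAA * .04)."""
    return league_avg_sv - league_avg_gaa * replacement_factor


def goalie_gvt(saves: int, shots_against: int,
               league_avg_sv: float, league_avg_gaa: float,
               replacement_factor: float = 0.04,
               goalie_responsibility: float = 0.75) -> float:
    """GGVT = (saves - shots against * Threshold SV%) * .75."""
    threshold = threshold_sv(league_avg_sv, league_avg_gaa, replacement_factor)
    raw_gvt = saves - shots_against * threshold
    return raw_gvt * goalie_responsibility


# Hypothetical inputs, for illustration only.
example = goalie_gvt(saves=1840, shots_against=2000,
                     league_avg_sv=0.914, league_avg_gaa=2.70)
print(f"GGVT (formula as written): {example:.1f}")
```

Try changing `replacement_factor` or `goalie_responsibility` slightly and rerunning it. The swing in the output is a good illustration of why the arbitrary constants matter.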

GVT has some serious problems, among them the arbitrary assignment of adjustment values, the reliance on flawed metrics as the basis for further calculations, and the fact that it values goaltending above all other positions. In actuality, GVT numbers work best when you compare goalies with goalies, defensemen with defensemen, and forwards with forwards. So keep that in mind. If you're interested in GVT, I'd recommend exploring the posts on it at "The Puck Stops Here" first. They're easier to understand than the original HockeyProspectus posts. And as always, a Google search will turn up a lot of material.