NHL Draft Efficiency Analysis

Evaluating NHL team draft performance from 2005-2017

1. Project Introduction

The goal of this project is to evaluate the drafting effectiveness of NHL teams.

Specifically, it aims to answer the question:

Which NHL teams draft the most effectively?

This analysis examines drafts from 2005 through 2017, a period that captures the modern salary cap era while also allowing sufficient time for drafted players to develop and establish their NHL careers.

As a fan of the New York Rangers, a team that has often struggled to generate strong value from early draft selections, I became interested in whether team drafting performance could be quantified and compared across the league. This project explores that idea by measuring draft outcomes relative to draft position.

How It Works (Quick Summary)

  • Each player is assigned a Performance Score based on career games played and points
  • Each draft pick has an expected value based on historical averages at that pick
  • Each selection is evaluated as: Draft Value = Actual Performance − Expected Performance
  • Team scores are calculated by summing draft value across all picks

This framework allows teams to be evaluated not by raw outcomes, but by how effectively they extract value from their draft positions.

2. Data Overview

The project uses draft recap data from hockeydb.com.

The dataset includes draft information such as:

  • Round and overall selection number
  • Drafting team
  • Player name and position
  • Junior or youth team

It also includes each player’s NHL career totals, including:

  • Games played
  • Goals
  • Assists
  • Points
  • Penalty minutes
  • Final NHL season

These statistics allow player careers to be compared and evaluated relative to their draft position.

Screenshot of the draft recap table on hockeydb.com

The draft data was programmatically collected from hockeydb.com using a Python web scraping script with the BeautifulSoup library. The extracted data was cleaned and stored in a SQLite database, which was then used for subsequent analysis in SQL and Python.
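
The storage half of that pipeline can be sketched as follows. This is a minimal illustration, not the project's actual script: the table name `draft_picks`, the column names, the `to_int` helper, and the sample rows are all assumptions, and the HTML parsing step is left out because it depends on the page markup.

```python
import sqlite3

# Hypothetical rows as they might look right after parsing one draft table.
# Statistic cells arrive as text and are blank for players with no NHL games.
# (draft_year, round, overall, team, player, position, gp, goals, assists, points, pim)
rows = [
    (2005, 1, 1, "Team A", "Player One", "C", "1000", "400", "600", "1000", "500"),
    (2005, 1, 2, "Team B", "Player Two", "D", "", "", "", "", ""),
]


def to_int(text):
    """Convert a scraped cell to int; blank cells become None (SQL NULL)."""
    text = text.strip()
    return int(text) if text else None


conn = sqlite3.connect(":memory:")  # a file path would be used for a persistent database
conn.execute(
    """
    CREATE TABLE draft_picks (
        draft_year INTEGER, round INTEGER, overall INTEGER,
        team TEXT, player TEXT, position TEXT,
        games_played INTEGER, goals INTEGER, assists INTEGER,
        points INTEGER, pim INTEGER,
        PRIMARY KEY (draft_year, overall)
    )
    """
)

# Keep the six identifying fields as-is and clean the five statistic fields.
cleaned = [row[:6] + tuple(to_int(cell) for cell in row[6:]) for row in rows]
conn.executemany("INSERT INTO draft_picks VALUES (?,?,?,?,?,?,?,?,?,?,?)", cleaned)
conn.commit()

stored = conn.execute(
    "SELECT player, games_played FROM draft_picks ORDER BY overall"
).fetchall()
```

Declaring `(draft_year, overall)` as the primary key is one way to make the database itself reject duplicate picks at load time.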

3. Exploratory Data Analysis

Before analysis could begin, the dataset was examined to validate its structure and integrity. The full EDA process is documented in the EDA notebook.

The key findings were:

  • The dataset contains 2,766 players across 13 draft years, with pick totals consistent with a 7-round draft in a 30- to 31-team league.
  • No duplicate entries or missing values were found in any key identifying columns. The 1,364 players with no games-played statistics are players who were drafted but never appeared in an NHL regular-season game.
  • One player, Kyell Henegan, was found to have no position listed. Research confirmed he is a defenseman, and his position was updated accordingly in the cleaned dataset.
  • Pick counts per round showed minor variation, which was investigated and fully explained by real-world NHL events, including the 2005 post-lockout CBA, the 2017 Vegas expansion, two forfeited picks, and compensatory second-round picks awarded to teams that failed to sign their former first-round selections.
  • The cleaned dataset was saved separately from the raw data and used for all subsequent analysis.
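
Checks of this kind can be expressed as a handful of SQL queries. The sketch below runs them against a tiny in-memory table; the table and column names and the sample rows are assumptions for illustration, and the real checks live in the EDA notebook.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE draft_picks (draft_year INTEGER, round INTEGER, overall INTEGER, "
    "team TEXT, player TEXT, position TEXT, games_played INTEGER)"
)
conn.executemany(
    "INSERT INTO draft_picks VALUES (?,?,?,?,?,?,?)",
    [
        (2005, 1, 1, "Team A", "Player One", "C", 900),
        (2005, 1, 2, "Team B", "Player Two", None, None),  # no position, never played
        (2006, 1, 1, "Team B", "Player Three", "D", 50),
    ],
)

# 1. Duplicate picks: the same (draft_year, overall) pair appearing more than once.
duplicates = conn.execute(
    "SELECT draft_year, overall, COUNT(*) FROM draft_picks "
    "GROUP BY draft_year, overall HAVING COUNT(*) > 1"
).fetchall()

# 2. Missing values in an identifying column (position, in this example).
missing_position = conn.execute(
    "SELECT player FROM draft_picks WHERE position IS NULL OR position = ''"
).fetchall()

# 3. Players with no games-played value, i.e. drafted but never appeared in the NHL.
never_played = conn.execute(
    "SELECT COUNT(*) FROM draft_picks WHERE games_played IS NULL"
).fetchone()[0]

# 4. Pick counts per year and round, to compare against league size.
picks_per_round = conn.execute(
    "SELECT draft_year, round, COUNT(*) FROM draft_picks "
    "GROUP BY draft_year, round ORDER BY draft_year, round"
).fetchall()
```

An empty `duplicates` result and an empty `missing_position` result are what a clean dataset would return; any rows that do come back point to the records that need manual research.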

4. Draft Success Evaluation Methodology

4.1. The Core Idea

The goal of this analysis is to measure how effectively NHL teams draft relative to the value of the picks they are given.

Not all draft positions carry the same expectations. A 1st overall pick is expected to produce significantly more value than a late-round selection. Because of this, teams should not be evaluated based on raw outcomes alone, but on how their selections perform relative to what is typical for each pick.

This project evaluates drafting by comparing each player’s actual NHL career performance to the historical expectation of their draft slot.

4.2. Defining Expected Value

To establish a fair baseline, we calculate the expected value of every draft pick.

For each pick number (e.g., 1st overall, 50th overall), we compute the average career performance of all players selected at that position across the 2005–2017 drafts.

This creates a stable expectation curve:

  • Early picks have high expected value
  • Mid-round picks have moderate expected value
  • Late-round picks have low expected value

This approach allows all draft selections to be evaluated on a consistent scale across years.
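
The per-pick average can be sketched in a few lines. The record layout, the function name `expected_value_by_pick`, and the sample scores are illustrative assumptions. The sketch also assumes that goalies are marked with position "G" and that players who never reached the NHL carry a score of 0 and stay in the average.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical picks pooled across several draft years.
picks = [
    {"overall": 1, "position": "C", "score": 300.0},
    {"overall": 1, "position": "D", "score": 100.0},
    {"overall": 1, "position": "G", "score": 40.0},   # goalie: excluded from the baseline
    {"overall": 50, "position": "L", "score": 0.0},   # never played: counts as 0
    {"overall": 50, "position": "D", "score": 60.0},
]


def expected_value_by_pick(picks):
    """Average Performance Score of skaters taken at each overall pick number."""
    scores = defaultdict(list)
    for pick in picks:
        if pick["position"] == "G":
            continue  # goalies are left out of the expectation (assumed "G" code)
        scores[pick["overall"]].append(pick["score"])
    return {overall: mean(values) for overall, values in scores.items()}


expected = expected_value_by_pick(picks)
```

With 13 drafts in the sample, each pick number is averaged over at most 13 players, so individual slots can be noisy even when the overall curve declines as expected.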

4.3. Measuring Draft Value

Each draft selection is evaluated using the following metric:

Draft Value = Player Performance Score − Expected Performance at That Pick

  • A positive value indicates the team exceeded expectations
  • A negative value indicates the team fell short of expectations

By summing these values across all picks and all drafts, we obtain a single score for each team:

  • Positive total → strong drafting performance
  • Negative total → weaker drafting performance

This framework captures both the magnitude and consistency of a team’s drafting outcomes.
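
As a rough sketch of the aggregation, assuming an `expected` lookup keyed by overall pick number and hypothetical team names and scores:

```python
from collections import defaultdict

# Assumed expected Performance Score per overall pick number.
expected = {1: 200.0, 50: 30.0}

# Hypothetical selections with their actual Performance Scores.
picks = [
    {"team": "Team A", "overall": 1, "score": 300.0},
    {"team": "Team B", "overall": 1, "score": 100.0},
    {"team": "Team A", "overall": 50, "score": 0.0},
    {"team": "Team B", "overall": 50, "score": 60.0},
]


def team_draft_value(picks, expected):
    """Sum (actual - expected) over every selection a team made."""
    totals = defaultdict(float)
    for pick in picks:
        totals[pick["team"]] += pick["score"] - expected[pick["overall"]]
    return dict(totals)


totals = team_draft_value(picks, expected)

# Rank teams from best to worst total draft value.
ranking = sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

When the expectations are averaged from the same picks that are being scored, as in this sketch, the draft values sum to zero across the league, so a team's total reads directly as value gained or lost relative to the other teams.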

4.4. The Performance Score

To evaluate players, we first need a single metric that captures NHL career value.

The dataset provides two key statistics:

  • Career games played
  • Career points (the sum of career goals and career assists)

Each captures a different dimension of performance:

  • Games played reflects longevity and reliability
  • Points reflect production and impact

To balance these, we define:

Performance Score = games_played + (points × 2.29)

The coefficient 2.29 is derived directly from the data as the value at which games played and points contribute equally on average:

  • avg_games_played = avg_points × X
  • X = avg_games_played / avg_points

This ensures that neither component disproportionately drives the metric.
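
The derivation can be reproduced in a few lines. The sample careers below are made up and chosen so that X comes out to exactly 2.0; on the project's data the same calculation is what yields 2.29. The function names are illustrative.

```python
from statistics import mean

# Hypothetical (games_played, points) career totals.
careers = [(400, 250), (100, 30), (0, 0), (300, 120)]


def balance_coefficient(careers):
    """X such that avg_games_played == avg_points * X."""
    avg_games_played = mean(gp for gp, _ in careers)
    avg_points = mean(pts for _, pts in careers)
    return avg_games_played / avg_points


def performance_score(games_played, points, x):
    """Performance Score = games_played + (points * X)."""
    return games_played + points * x


x = balance_coefficient(careers)
score = performance_score(400, 250, x)
```

Because X is a ratio of two averages over the same players, it equals total games played divided by total points, so including or excluding players with no NHL games does not change it.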

Validation

Two checks were performed before finalizing the coefficient:

  • Positional fairness: At X = 2.29, forwards are overrepresented in the top 30 by ~4 percentage points (≈1 player per draft), which was considered acceptable
  • Sensitivity analysis: Adjusting X by ±1 results in minimal ranking changes, confirming the metric is stable

The full coefficient investigation is documented in formula.ipynb.
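
One simple way to run a sensitivity check of this kind is to re-rank under a shifted coefficient and record the largest change in position. This sketch is an assumption about how such a check could look, not the notebook's code; the player labels, career totals, and helper names are hypothetical.

```python
def rank_players(careers, x):
    """Player labels ordered by Performance Score, best first."""
    ordered = sorted(careers, key=lambda c: c[1] + c[2] * x, reverse=True)
    return [name for name, _, _ in ordered]


def max_rank_shift(base, other):
    """Largest number of places any player moves between two rankings."""
    position = {name: i for i, name in enumerate(other)}
    return max(abs(i - position[name]) for i, name in enumerate(base))


# Hypothetical (label, games_played, points) career totals.
careers = [
    ("P1", 900, 700),
    ("P2", 1000, 300),
    ("P3", 400, 350),
    ("P4", 500, 100),
    ("P5", 600, 60),
]

x = 2.29
base = rank_players(careers, x)

# Re-rank with X shifted by -1 and +1 and measure the largest movement.
shifts = {
    delta: max_rank_shift(base, rank_players(careers, x + delta))
    for delta in (-1.0, 1.0)
}
```

In this toy sample only the two lowest-ranked players swap places when X rises by 1. The same comparison applied to team totals instead of individual players would test the stability of the final rankings.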

4.5. Limitations

Goalies are excluded from this analysis.

Because goalies accumulate very few points regardless of their actual impact, including them would systematically undervalue their performance and unfairly penalize teams that drafted them early.

As a result, goalie selections are omitted from both the expected value calculations and the team scores. This is a limitation, but it keeps the evaluation consistent across skater positions.

5. Tableau Visualizations

Overview of the dashboards and charts used to explore the results.

6. Discussion and Conclusions

List known limitations.

Key findings and insights from the analysis.