Connor Young | NHL Line Analysis

Project Overview

In this project, I investigated how heavily each NHL team depended on its "top line" during the 2020-2021 and 2021-2022 seasons, and what that dependence suggests about roster construction and coaching strategy.

Key Questions Explored

How does top line usage correlate with team performance?
Do successful teams have more balanced line distributions?
How do different coaching strategies manifest in line usage patterns?
Can we identify optimal line usage strategies based on roster composition?

A typical NHL team uses four "lines" of forwards and rotates them frequently throughout the game. Each line is a set of three players. The "top" or "1st" line generally receives the largest share of playing time, and the share decreases with each subsequent line.

Comparing the percentage of total playing time that each team gives its top line with overall team performance highlights the tradeoff between leaning on star players and building a "deep" roster in which the lower lines share more of the playing time.
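As a rough illustration of the metric described above, the sketch below computes a top line's share of forward ice time for one game. It is not the project's actual code: the function name, the team labels, and the ice times are made up for the example, and it assumes the "top line" is simply the line with the most ice time. It uses plain Python so it runs without any dependencies.

```python
def top_line_share(toi_by_line):
    """Return the fraction of forward ice time taken by the most-used line.

    Assumption: the "top line" is whichever line logged the most ice time.
    """
    total = sum(toi_by_line.values())
    if total <= 0:
        raise ValueError("total ice time must be positive")
    return max(toi_by_line.values()) / total


# Made-up ice times in seconds for two hypothetical teams (not project data).
teams = {
    "Team A": {"L1": 1260, "L2": 1020, "L3": 780, "L4": 540},
    "Team B": {"L1": 1020, "L2": 960, "L3": 870, "L4": 750},
}

for name, lines in teams.items():
    print(f"{name}: top line share = {top_line_share(lines):.1%}")
```

With these made-up numbers, Team A gives its top line 35.0% of forward ice time and Team B gives 28.3%, so Team B would count as the "deeper" roster under this metric.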

All data used for this project was scraped from the public NHL API, with analysis performed using Python and associated libraries including pandas, matplotlib, and seaborn.

Key Findings & Visualizations

Line Usage Distribution

Analysis of how teams distribute ice time across their forward lines.

Performance Correlation

Relationship between top line usage and overall team success.

Situational Deployment

How line usage changes based on game situations and score.

Coaching Strategy Comparison

Different approaches to line management across coaching styles.

Methodology & Data Sources

This analysis was conducted using Python, leveraging pandas for data manipulation, matplotlib and seaborn for visualization, and scikit-learn for statistical modeling. The dataset was compiled by accessing the NHL's public API endpoints.

Data Collection

Game events and shift data were collected for every regular season game for each team (56 games in the shortened 2020-2021 season and 82 games in 2021-2022).
Line combinations were identified using shift overlaps and on-ice events.
Time on ice was calculated from precise shift start and end timestamps.
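The steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: the Shift record, the function names, and the sample shifts are hypothetical, timestamps are assumed to be already converted to seconds elapsed in the game, and only forwards are included. Time on ice is the sum of shift durations, and a line is detected as any trio of forwards on the ice at the same time.

```python
from collections import Counter
from typing import NamedTuple


class Shift(NamedTuple):
    player: str
    start: int  # seconds elapsed in the game when the shift begins
    end: int    # seconds elapsed in the game when the shift ends


def time_on_ice(shifts):
    """Sum shift durations per player."""
    toi = Counter()
    for s in shifts:
        toi[s.player] += s.end - s.start
    return toi


def trio_time(shifts):
    """Accumulate the seconds each trio of forwards spends on the ice together."""
    # Every shift start or end is a moment when the on-ice group can change.
    boundaries = sorted({t for s in shifts for t in (s.start, s.end)})
    shared = Counter()
    for lo, hi in zip(boundaries, boundaries[1:]):
        # Players whose shift covers the whole interval [lo, hi].
        on_ice = frozenset(s.player for s in shifts if s.start <= lo and s.end >= hi)
        if len(on_ice) == 3:
            shared[on_ice] += hi - lo
    return shared


# Hypothetical forward shifts (not real data).
shifts = [
    Shift("A", 0, 45), Shift("B", 0, 40), Shift("C", 5, 45),
    Shift("D", 45, 90), Shift("E", 40, 90), Shift("F", 45, 90),
]

print(time_on_ice(shifts))
for trio, seconds in trio_time(shifts).most_common():
    print(sorted(trio), seconds)
```

In this sample, D, E, and F share 45 seconds, A, B, and C share 35 seconds, and the staggered change produces a 5-second overlap of A, C, and E. Short overlaps like that one are the kind of in-game line change a real identification step would need to filter or merge.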

Analysis Approach

Line identification algorithms were developed to account for in-game line changes and special teams play.
Statistical correlations were measured between line usage metrics and team performance indicators.
Multilevel modeling was used to account for team-specific effects while identifying league-wide patterns.
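As a simple illustration of the correlation step only (the multilevel modeling is not shown), the sketch below computes a Pearson correlation coefficient between top line share and a team performance indicator. The function name and the input values are made up for the example and are not results from this project. Points percentage is assumed here as the performance indicator.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Placeholder values for five hypothetical teams (not project results).
top_line_shares = [0.35, 0.33, 0.31, 0.29, 0.28]
points_pct = [0.55, 0.61, 0.58, 0.66, 0.60]

print(f"r = {pearson_r(top_line_shares, points_pct):.3f}")
```

A coefficient near -1 would suggest that teams leaning harder on their top line tend to perform worse, a value near +1 the opposite, and a value near 0 no linear relationship. A league-wide coefficient like this ignores team-specific effects, which is the gap the multilevel model is meant to cover.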

Project Resources

GitHub Repository

Research Paper
