Project Overview
We built a predictive model that daily fantasy sports (DFS) players can use to select optimized lineups for fantasy competitions. It improves on DraftKings' baseline projections by 8.58%.
Project Highlights
- Trained on 5 seasons of NBA game data
- Predicts player fantasy scores based on opponent, venue, and historical performance
- Ensemble approach using 7 independent models for different statistical categories
- 8.58% improvement over DraftKings' baseline projections
Our model predicts a player's fantasy score in upcoming games based on multiple factors including the opposing team, home/away status, and performance trends over both recent and extended timeframes.
We compared random forest and XGBoost models to find the best performer. The winning approach was not a single model but an ensemble of 7 independent models, each predicting one of the 7 statistical categories that factor into fantasy scoring.
By applying the DraftKings fantasy formula to these individual predictions and integrating them with salary cap optimization, we achieved significant performance improvements in contest lineups.
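As a rough illustration of the scoring step, the sketch below combines seven per-category predictions into a single fantasy score. The weights are the per-stat values from DraftKings' NBA Classic scoring as commonly published and should be checked against the current rules. The double-double and triple-double bonuses are left out, and the names `DK_WEIGHTS`, `fantasy_score`, and `example_prediction` are hypothetical, not taken from the project code.

```python
# Per-stat weights from DraftKings NBA Classic scoring (assumed; verify
# against the current rules). Double-double and triple-double bonuses
# are deliberately omitted from this sketch.
DK_WEIGHTS = {
    "points": 1.0,
    "three_pointers": 0.5,
    "rebounds": 1.25,
    "assists": 1.5,
    "steals": 2.0,
    "blocks": 2.0,
    "turnovers": -0.5,
}


def fantasy_score(predicted_stats):
    """Combine per-category predictions into one fantasy score."""
    missing = set(DK_WEIGHTS) - set(predicted_stats)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return sum(DK_WEIGHTS[k] * predicted_stats[k] for k in DK_WEIGHTS)


# Hypothetical outputs of the seven category models for one player.
example_prediction = {
    "points": 20.0,
    "three_pointers": 2.0,
    "rebounds": 8.0,
    "assists": 4.0,
    "steals": 1.0,
    "blocks": 1.0,
    "turnovers": 2.0,
}

print(fantasy_score(example_prediction))
```

Keeping the weights in one dictionary means each category model can be trained and evaluated on its own, with the scoring rule applied only at the end.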
Results & Implementation
Model Performance
Comparative analysis of prediction accuracy across different model architectures.
Research Findings
Detailed documentation of methodology and experimental results.
Model Implementation
Key parts of the codebase showing the model architecture and training process.
Feature Engineering
Data processing and feature extraction techniques used in the model.
Technical Approach & Methodology
We compared multiple machine learning models for fantasy score prediction. Specifically, we evaluated random forest and XGBoost models, testing each with two methods: predicting fantasy points directly, and predicting the component statistics and combining them afterward.
Data Preparation
- Collected 5 seasons of NBA player statistics and game outcomes
- Generated rolling statistical averages over different time windows (5, 10, 20 games)
- Incorporated contextual features like opponent defensive ratings, home/away status, and rest days
- Normalized features and applied feature selection to identify key predictors
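The rolling-average features described above could be computed along these lines. This is a minimal standard-library sketch, not the project's code: the function name `rolling_averages` and the sample game log are hypothetical, and excluding the current game from each window (so a feature never sees the value it is meant to predict) is an assumption the source does not confirm.

```python
def rolling_averages(values, windows=(5, 10, 20)):
    """For each game i, average the previous `w` games, excluding game i.

    Returns {window: [avg_or_None, ...]}. None marks games that have
    fewer than `w` prior games available.
    """
    features = {}
    for w in windows:
        column = []
        for i in range(len(values)):
            if i < w:
                column.append(None)  # not enough history yet
            else:
                prior = values[i - w:i]  # the w games before game i
                column.append(sum(prior) / w)
        features[w] = column
    return features


# Hypothetical points-per-game log for one player, oldest game first.
points_by_game = [10, 12, 14, 16, 18, 20, 22]
feats = rolling_averages(points_by_game, windows=(5,))
print(feats[5])
```

Short windows track recent form while longer ones smooth out single-game noise, which matches the recent and extended timeframes mentioned earlier.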
Model Architecture
- Developed 7 separate models to predict individual statistical categories: points, rebounds, assists, steals, blocks, turnovers, and three-pointers
- Combined predictions using the DraftKings fantasy scoring formula
- Implemented hyperparameter tuning using grid search with cross-validation
- Integrated predicted scores with a linear optimization algorithm for salary cap-constrained lineup selection
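To make the final step concrete, here is a small sketch of salary cap-constrained lineup selection. It uses exhaustive search over combinations as a stand-in for the linear optimization algorithm the project used, so it is only practical for small player pools. It also ignores positional requirements, and the function name `best_lineup`, the player pool, the cap, and the roster size are all hypothetical.

```python
from itertools import combinations


def best_lineup(players, salary_cap, roster_size):
    """Return (total_projection, names) maximizing projected points under the cap.

    `players` is a list of (name, salary, projected_points) tuples.
    Exhaustive search: a stand-in for a linear optimizer, small pools only.
    """
    best_total, best_combo = float("-inf"), None
    for combo in combinations(players, roster_size):
        salary = sum(p[1] for p in combo)
        if salary > salary_cap:
            continue  # lineup breaks the salary cap
        total = sum(p[2] for p in combo)
        if total > best_total:
            best_total, best_combo = total, combo
    if best_combo is None:
        raise ValueError("no lineup fits under the salary cap")
    return best_total, [p[0] for p in best_combo]


# Hypothetical pool: (name, salary, projected fantasy points).
pool = [
    ("A", 9000, 45.0),
    ("B", 7000, 38.0),
    ("C", 6000, 30.0),
    ("D", 4000, 22.0),
    ("E", 3500, 18.0),
]
total, lineup = best_lineup(pool, salary_cap=17000, roster_size=3)
print(total, lineup)
```

With a full slate of players the number of combinations grows too quickly for this approach, which is why an integer or linear programming solver is the usual choice for real contests.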