Alex Trommer

Data Science · University of Michigan · alextrommer@gmail.com · GitHub

xG Dashboard

An XGBoost expected goals model and interactive dashboard covering the top 5 European leagues, built on ~257,000 shots from Understat (2020–2025). Three situation-specific models handle open play, corners, and set pieces separately, with isotonic calibration. Data refreshes daily via GitHub Actions.

257k Shots
0.792 ROC-AUC
0.074 Brier score
5 Leagues
3 Specialist models

Models

Each shot is routed to a situation-specific XGBoost classifier based on how the chance was created. All three classifiers are wrapped in CalibratedClassifierCV(method="isotonic") and tuned independently via GridSearchCV (3-fold cross-validation, scored on Brier score). Penalties are fixed at 0.76 xG.

Model | Situations | Key features
OpenPlay | Open play, counter-attacks | Distance, angle, counter-attack proxy, throughball, rebound
FromCorner | Corner kicks | Header interactions, centrality, weak-angle header
SetPiece | Direct & indirect free kicks | Distance, angle, shot type
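
The routing described above can be sketched as a small dispatcher. This is an illustrative sketch, not the project's actual code: the situation labels ("OpenPlay", "FromCorner", "SetPiece", "DirectFreekick", "Penalty") are assumed to follow Understat's naming, and predict_xg, SITUATION_TO_MODEL, and the stub predictors are hypothetical names introduced here. Only the 0.76 penalty value and the three model names come from the text.

```python
from typing import Callable, Dict, Mapping

# Fixed penalty value stated in the text.
PENALTY_XG = 0.76

# ASSUMPTION: Understat-style situation labels mapped onto the three models.
SITUATION_TO_MODEL: Dict[str, str] = {
    "OpenPlay": "OpenPlay",
    "FromCorner": "FromCorner",
    "SetPiece": "SetPiece",
    "DirectFreekick": "SetPiece",
}

# A predictor takes one shot record and returns a goal probability.
Predictor = Callable[[Mapping[str, object]], float]


def predict_xg(shot: Mapping[str, object], models: Mapping[str, Predictor]) -> float:
    """Route a shot to its situation-specific model and return its xG."""
    situation = shot["situation"]
    if situation == "Penalty":
        # Penalties skip the classifiers and receive the fixed value.
        return PENALTY_XG
    if situation not in SITUATION_TO_MODEL:
        raise ValueError(f"Unknown situation: {situation!r}")
    return float(models[SITUATION_TO_MODEL[situation]](shot))


# Hypothetical stand-ins for the fitted models. In a real pipeline each
# entry would wrap a calibrated classifier's predict_proba output.
stub_models: Dict[str, Predictor] = {
    "OpenPlay": lambda shot: 0.10,
    "FromCorner": lambda shot: 0.05,
    "SetPiece": lambda shot: 0.07,
}
```

Keeping the penalty branch outside the model lookup means the fixed value cannot be overridden by a mislabelled model key, and an unrecognised situation fails loudly instead of silently falling back to the open-play model.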

Features

The models use 24 features spanning geometry (distance, angle, coordinates), shot-type flags (header, foot, penalty), interaction terms, zone context, and proxy variables. Because Understat labels counter-attacks as open play, a fast_break proxy is engineered from the preceding action type to capture counter-attack context without direct tagging.
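
As an illustration of the geometry features, the sketch below derives distance and goal-mouth angle from a shot location. It rests on several assumptions that are not stated in the text: coordinates are normalized to [0, 1] with the attacked goal centred at (1.0, 0.5), the pitch is 105 m by 68 m, and the goal is 7.32 m wide. The function name shot_geometry is hypothetical, and the project may define its features differently.

```python
import math

# ASSUMPTIONS: standard pitch and goal dimensions in metres.
PITCH_LENGTH_M = 105.0
PITCH_WIDTH_M = 68.0
GOAL_WIDTH_M = 7.32


def shot_geometry(x: float, y: float) -> tuple[float, float]:
    """Return (distance in metres, goal-mouth angle in radians).

    x and y are normalized coordinates in [0, 1]; the goal being
    attacked is assumed to sit at x = 1.0, centred on y = 0.5.
    """
    dx = (1.0 - x) * PITCH_LENGTH_M   # distance to the goal line
    dy = (y - 0.5) * PITCH_WIDTH_M    # lateral offset from goal centre
    distance = math.hypot(dx, dy)

    # Angle subtended by the two posts as seen from the shot location.
    half_goal = GOAL_WIDTH_M / 2.0
    angle = math.atan2(dy + half_goal, dx) - math.atan2(dy - half_goal, dx)
    return distance, angle
```

Under these assumptions, a shot from the penalty spot (11 m out, central) is 11 m from goal, and a central shot always sees a wider goal mouth than a wide shot taken from the same depth.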

Performance

Metric | This model | Understat
ROC-AUC | 0.792 | 0.805
Brier score | 0.074 | 0.072

The small gap vs Understat is expected: commercial models incorporate freeze-frame data (exact defender positions at the moment of the shot), which Understat does not expose via its public API.
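
For readers unfamiliar with the two metrics, here is a minimal pure-Python sketch of how each is defined. It is meant to show the definitions only and is not the project's evaluation code; in practice, scikit-learn's roc_auc_score and brier_score_loss compute the same quantities far more efficiently. The function names below are hypothetical.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted xG and the 0/1 outcome.

    Lower is better; 0.0 would be a perfect forecast.
    """
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random goal is scored above a random non-goal.

    Ties count as half a win. This pairwise version is O(n^2) and is
    only suitable for small illustrative inputs.
    """
    goals = [p for y, p in zip(y_true, y_prob) if y == 1]
    misses = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not goals or not misses:
        raise ValueError("need at least one goal and one non-goal")
    wins = sum(
        1.0 if g > m else 0.5 if g == m else 0.0
        for g in goals
        for m in misses
    )
    return wins / (len(goals) * len(misses))
```

ROC-AUC measures only how well shots are ranked, while the Brier score also penalizes probabilities that are systematically too high or too low, which is why calibration matters for the latter.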