Only one team wins each league per season, and its forwards tend to record markedly more goals and shots on target than forwards at other clubs. This project trains a K-Nearest Neighbors (KNN) classifier to predict whether a given forward's team won the league that season, using only two features: goals scored and shots on target. Trained on 88,310 player-season records across seven European leagues and five seasons, the model achieves 95.45% accuracy.
The data consists of player-season records from the Premier League, La Liga, Serie A, Bundesliga, Ligue 1, Eredivisie, and Primeira Liga. The model is restricted to forwards with at least five 90-minute appearances, the group in which the goal-scoring signal is strongest. The 2019–20 season is excluded because the Eredivisie was abandoned mid-season with no champion declared.
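The filtering rules above could be sketched roughly as follows. This is a minimal illustration, not the project's actual preprocessing: the field names (`position`, `season`, `nineties`), the `"FW"` position code, the season string format, and the sample rows are all hypothetical, and "five 90-minute appearances" is assumed here to mean a "90s played" value of at least 5.

```python
# Hypothetical record layout; field names and values are illustrative only.
records = [
    {"player": "A", "position": "FW", "season": "2018-19", "nineties": 12.4},
    {"player": "B", "position": "MF", "season": "2018-19", "nineties": 20.0},
    {"player": "C", "position": "FW", "season": "2019-20", "nineties": 15.0},
    {"player": "D", "position": "FW", "season": "2020-21", "nineties": 3.1},
    {"player": "E", "position": "FW", "season": "2020-21", "nineties": 5.0},
]

EXCLUDED_SEASONS = {"2019-20"}  # season with no Eredivisie champion
MIN_NINETIES = 5.0              # assumed reading of "five 90-minute appearances"


def keep(record):
    """Return True for forwards with enough playing time in an included season."""
    return (
        record["position"] == "FW"
        and record["nineties"] >= MIN_NINETIES
        and record["season"] not in EXCLUDED_SEASONS
    )


filtered = [r for r in records if keep(r)]
kept_players = [r["player"] for r in filtered]
```

Keeping the thresholds as named constants makes it easy to check how sensitive the results are to the playing-time cutoff.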
League-winning teams dominate possession and create more chances, so their forwards accumulate noticeably more goals and shots on target than forwards on lower-finishing sides. On average, forwards on winning teams score roughly twice as many goals (13.59 vs. 6.60), a gap large enough for a simple classifier to pick up reliably.
| Group | Avg Goals | Avg Shots on Target |
|---|---|---|
| Non-winners | 6.60 | 16.96 |
| Winners | 13.59 | 29.37 |
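
Group averages like those in the table can be computed with a few lines of standard-library Python. The rows below are made-up values for illustration only, not the project's data, and the tuple layout is an assumption of this sketch.

```python
from statistics import mean

# Made-up rows for illustration: (goals, shots_on_target, team_won_league)
rows = [
    (4, 12, False),
    (8, 20, False),
    (6, 16, False),
    (12, 28, True),
    (16, 32, True),
]


def group_means(rows, won):
    """Return (average goals, average shots on target) for one group."""
    group = [(goals, shots) for goals, shots, label in rows if label == won]
    avg_goals = mean(goals for goals, _ in group)
    avg_shots = mean(shots for _, shots in group)
    return avg_goals, avg_shots


non_winner_avgs = group_means(rows, won=False)
winner_avgs = group_means(rows, won=True)
```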
KNN classifies each player by finding the 11 most similar player-seasons in the training data (by goals and shots on target) and taking a majority vote on whether those neighbors' teams won the league. Hyperparameters were tuned with GridSearchCV over 150 parameter combinations, each evaluated with 5-fold cross-validation.
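
The voting step can be illustrated with a small pure-Python sketch. It is not the project's implementation, which relies on scikit-learn: the unweighted Euclidean distance on raw features is an assumption (the tuned metric and weighting are not stated here), and the training points are made-up values.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=11):
    """Majority vote among the k training points closest to `query`.

    `train` is a list of ((goals, shots_on_target), won_league) pairs.
    Unweighted Euclidean distance is an assumption of this sketch.
    """
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Made-up training data: 8 forwards on non-winning teams, 7 on winners.
non_winners = [(3, 9), (4, 12), (5, 14), (6, 15), (6, 17), (7, 18), (8, 19), (8, 21)]
winners = [(12, 26), (13, 28), (13, 30), (14, 29), (15, 31), (15, 33), (17, 35)]
train = [(p, False) for p in non_winners] + [(p, True) for p in winners]

high_scorer = knn_predict(train, (14, 30))  # 14 goals, 30 shots on target
low_scorer = knn_predict(train, (5, 13))    # 5 goals, 13 shots on target
```

With two classes, an odd `k` such as 11 guarantees that the vote cannot end in a tie.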