Predicting Breakout Forwards with Bayesian Models
Every season, a few U21 forwards go from “who’s that?” to “where have you been?” (think Molde-era Haaland or early Saka). So how can we use data to spot true breakouts?
With young talents progressing so rapidly, early breakouts can come seemingly out of nowhere. To move beyond hot takes and reactionary opinions from brief highlight reels, we need a systematic way to quantify and model these breakout seasons, capturing both a player’s raw output and the context that shapes it. By bringing data and a Bayesian framework together, we can *technically* forecast which young forwards are close to hitting the next level.
We looked at every U23 forward in the Premier League and Championship who played at least five 90s (450 minutes) in 2024–25. To combine goals, assists, dribbling and carries into one simple score, we built a Breakout Index:
Scoring Chance Quality (35%) – Expected goals (xG) per 90
Assists Threat (25%) – Expected assists (xA) per 90
Effective Carries (15%) – Progressive carries per 90
Box Entries (15%) – Carries into the penalty area per 90
1-on-1 Skill (10%) – Take-on success rate (%)
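The weighting above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the published scores: the metric key names, the min-max scaling step, and the sample numbers are all assumptions made here, since the post only specifies the five components and their weights.

```python
# Weights taken from the component list above; they sum to 1.0.
WEIGHTS = {
    "xg_per90": 0.35,            # Scoring Chance Quality
    "xa_per90": 0.25,            # Assists Threat
    "prog_carries_per90": 0.15,  # Effective Carries
    "box_carries_per90": 0.15,   # Box Entries
    "take_on_success": 0.10,     # 1-on-1 Skill
}


def min_max_scale(values):
    """Rescale numbers to [0, 1]; a constant column maps to 0.0.

    Assumption: the post does not say how metrics are put on a common
    scale, so min-max scaling is used here purely for illustration.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def breakout_index(players):
    """Return {name: weighted index} for a list of per-player metric dicts."""
    scaled = {
        metric: min_max_scale([p[metric] for p in players])
        for metric in WEIGHTS
    }
    return {
        p["name"]: sum(WEIGHTS[m] * scaled[m][i] for m in WEIGHTS)
        for i, p in enumerate(players)
    }


# Made-up numbers for hypothetical players (not real data).
sample = [
    {"name": "Player A", "xg_per90": 0.45, "xa_per90": 0.25,
     "prog_carries_per90": 4.0, "box_carries_per90": 2.0, "take_on_success": 60.0},
    {"name": "Player B", "xg_per90": 0.30, "xa_per90": 0.15,
     "prog_carries_per90": 3.0, "box_carries_per90": 1.2, "take_on_success": 48.0},
    {"name": "Player C", "xg_per90": 0.10, "xa_per90": 0.05,
     "prog_carries_per90": 1.5, "box_carries_per90": 0.4, "take_on_success": 35.0},
]

scores = breakout_index(sample)
```

Because every metric is rescaled before weighting, a player who leads the sample in all five components scores close to 1 and one who trails in all five scores 0, which keeps the index easy to read regardless of each metric's original units.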
Our Bayesian approach weaves together multiple layers of information to give each young forward a balanced, context-aware breakout score. First, it acknowledges that not all goals and assists are created equal, so it factors in team quality (using Elo ratings from clubelo.com) as a baseline expectation.
A player at a bigger, high-performing side like Manchester City is judged differently from one at a mid-table side, so that we don’t conflate a player’s contributions with their team’s broader success. At the same time, the model learns an age effect: very young players often need time to adapt, so it gently tempers early performances until there’s enough evidence of genuine growth.
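To picture what a team-quality baseline means, here is a deliberately simplified sketch: fit a straight line of index against team Elo, then look at how far each player sits above or below that line. This is ordinary least squares, not the Bayesian model itself, and it leaves out the age effect entirely. The function names and the numbers are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def above_baseline(team_elo, raw_scores):
    """Residuals: each player's score minus what team strength alone predicts."""
    intercept, slope = fit_line(team_elo, raw_scores)
    return [y - (intercept + slope * e) for e, y in zip(team_elo, raw_scores)]


# Made-up Elo ratings and index values for four hypothetical players.
elo = [1950.0, 1800.0, 1700.0, 1600.0]
raw = [0.80, 0.62, 0.60, 0.45]

residuals = above_baseline(elo, raw)
```

A positive residual means the player is outperforming what their club's strength would suggest; a negative one means the raw number may owe more to the team around them.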
Crucially, through a process called “partial pooling,” the model automatically “shrinks” extreme scores toward each team’s average whenever data is sparse. A player with few matches won’t be unfairly hyped on the back of a few lucky moments, while someone with consistent, standout performances gets to shine. In essence, this framework blends individual stats, age curves, and team context in an attempt to spotlight true breakout talent rather than statistical flukes.
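The shrinkage behind partial pooling can be illustrated with the closed-form posterior mean of a normal-normal model. This is a simplification, not the full hierarchical model: a real fit would estimate the two variances from the data, whereas here `noise_var` and `between_var` are hypothetical fixed values, as are the function name and the example inputs.

```python
def shrink_toward_team(raw_score, nineties, team_mean,
                       noise_var=0.04, between_var=0.01):
    """Posterior mean of a player's score under a normal-normal model.

    raw_score   : index observed from the player's own minutes
    nineties    : number of 90s played (acts as the sample size)
    team_mean   : average score at the player's team (the pooling target)
    noise_var   : assumed per-90 sampling variance (hypothetical value)
    between_var : assumed spread of true player levels within a team
                  (hypothetical value)
    """
    # More 90s played -> weight closer to 1 -> less shrinkage.
    weight = nineties / (nineties + noise_var / between_var)
    return weight * raw_score + (1 - weight) * team_mean


# Same raw score and team average, different amounts of evidence.
small_sample = shrink_toward_team(raw_score=0.9, nineties=5, team_mean=0.5)
large_sample = shrink_toward_team(raw_score=0.9, nineties=30, team_mean=0.5)
```

Both players post the same raw 0.9, but the one with only five 90s is pulled much further toward the team average of 0.5 than the one with thirty, which is exactly the "don't overhype a small sample" behaviour described above.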
Here are the top 20 players the model thinks are most likely to take things up a notch next season:
Notably, some young players who have already established themselves, such as Jeremy Doku and Bukayo Saka, appear in the results and probably should have been omitted. According to the model, Callum Hudson-Odoi is predicted to take the long-awaited step to the next level in his career, and Arsenal’s developing star Ethan Nwaneri is expected to cement his name in the league. QPR’s Rayan Kolli leads the way among Championship players, ranking higher than several Premier League youngsters. Ryan Longman’s evaluation also helps explain why Wrexham were so keen to sign him during the winter transfer window.
When we plot each forward’s predicted breakout score alongside their age, two clear patterns emerge. The histogram shows most prospects clustering in the mid-range, around a score of 0.6: solid performers who might yet take the next step. A handful of truly elite talents break well above the rest. When we map those same scores against age in a scatter plot, we see that the youngest players (under 20) tend to have wider uncertainty, with Ethan Nwaneri standing out as the outlier. In other words, while most breakout candidates occupy a reliable middle ground, the real outliers, especially those already shining before they turn 20, are the ones to watch most closely.
While still a prototype, this model serves as a predictive indicator for players on the verge of breaking through, giving scouts, clubs, and fans a data-driven way to validate their hunches or uncover hidden gems. For full details and code, visit here.