Clustering Premier League Midfielders 24/25
Present-day Premier League midfielders vary so drastically in their actions that the term "midfielder" has become a surface-level blanket term. So how can we redefine these roles?
The demands placed on a midfielder often vary drastically from team to team, making current labels and definitions of midfield positions outdated and vague. These labels lead to dissimilar players being discussed as if they were alike simply because they fall under the umbrella of a “midfielder”, despite their roles varying. This can damage football discourse and severely understates the complexity of these roles. To address this, K-means clustering and minimum spanning tree analysis were used to map Premier League midfielders objectively based on their attributes and performance data.
For the analysis, the following key metrics (per 90 minutes) from FBref were selected to determine the clusters:
Chance Creation: Expected Assists (xA), Key Passes (KP), Shot-Creating Actions (SCA)
Ball Carrying: Progressive Carries (PrgC), Carries into Final Third (1/3C), Carries into Penalty Area (CPA)
Passing Range: Short, Medium and Long Pass Completions (ShortCmp, MedCmp, LongCmp), Progressive Passes (PrgP), Passes into Final Third (1/3P), Crosses into Penalty Area (CrsPA)
Attacking Threat: Successful Take-ons (SuccTkon), Progressive Passes Received (PrgR)
Defensive Contribution: Tackles Won (TklW), Interceptions (Int)
These metrics seek to provide a balanced, holistic view of a midfielder's potential qualities, encompassing creativity, progression, control, attacking intent and defensive acumen.
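The article does not say whether the metrics were rescaled before clustering. Since K-means and the later similarity measure both rely on Euclidean distance, a high-volume metric such as short pass completions could otherwise drown out a low-volume one such as xA. The sketch below shows one common way to handle this, z-score standardisation, and it is an assumption rather than the method used here. The player names and values are hypothetical.

```python
from statistics import mean, pstdev

def standardise(rows):
    """Convert each column of `rows` to z-scores (mean 0, standard deviation 1)."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stdevs = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stdevs)]
        for row in rows
    ]

# Hypothetical per-90 values in the order [PrgP, Int, xA]
players = {
    "Player A": [8.0, 1.5, 0.10],
    "Player B": [4.0, 2.5, 0.05],
    "Player C": [6.0, 0.5, 0.30],
}
scaled = dict(zip(players, standardise(list(players.values()))))
```

After scaling, every metric contributes on the same footing, so a difference of one standard deviation in xA counts as much as one standard deviation in progressive passes.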
K-means clustering on these metrics then grouped all players into five distinct profiles:
Cluster 0 (Possession Retainers): Reliable and supportive second-phase midfielders who can act as a source of both progression and chance creation, but who also retain possession through higher volumes of passes across all ranges.
Cluster 1 (Anchors): Less creative yet consistent passers with an aptitude for reading the game in order to win tackles and intercept passes.
Cluster 2 (Role Players): All-round, balanced midfielders with no specialisation, often dependable squad options or younger players.
Cluster 3 (Progressive Carriers): Excellent dribblers who often receive the ball in more advanced areas of the pitch, with an outstanding number of successful take-ons completed. This archetype of midfielder rarely uses passing to progress the ball, opting to carry the ball up instead.
Cluster 4 (Elite Creators): Outstanding attacking midfielders who often bear the brunt of creative responsibilities for their team, excelling in high-risk, high-reward actions such as passes and crosses into the penalty area.
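For readers unfamiliar with how K-means arrives at groups like the five above, here is a minimal sketch of the standard assign-then-update loop (Lloyd's algorithm). It is not the code behind this analysis. The two-dimensional vectors are hypothetical, three clusters are used instead of five to keep the toy data small, and the deterministic "farthest-first" seeding is a simplification chosen so the result is reproducible; libraries typically use randomised seeding instead.

```python
from math import dist

def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then keep adding the point
    that is farthest from every centroid chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, max_iter=100):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its members."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical standardised profiles: [chance creation, defensive contribution]
profiles = [
    [2.0, -1.0], [1.8, -0.8],   # creative, little defending
    [-1.0, 2.0], [-0.8, 1.8],   # defensive, little creation
    [0.0, 0.0], [0.1, -0.1],    # average on both
]
labels, centroids = kmeans(profiles, k=3)
```

The cluster numbers that come out are arbitrary identifiers. Names such as "Anchors" or "Elite Creators" are assigned afterwards by inspecting which metrics each centroid is high or low on.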
A Minimum Spanning Tree (MST) was generated from the clusters to visually represent how midfielders with similar profiles can be mapped in close relation to one another. Each node represents a player, while the length of each edge represents the Euclidean distance between two players' metric profiles, so shorter edges indicate greater similarity.
A direct connection between two nodes indicates that the players are similar in playing style, output and statistics. Two players in the same cluster who are not directly connected share the same role but differ slightly in one or two metrics, which may reflect differences in tactical setup. Outliers and players on the fringes are distinctive and either excel or underperform uniquely within their cluster.
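To make the construction concrete, the sketch below builds a minimum spanning tree with Prim's algorithm, treating every pair of players as connected by an edge whose weight is the Euclidean distance between their metric vectors. The article does not say which MST algorithm or library was used, so this is only one possible implementation, and the players and coordinates are hypothetical.

```python
from math import dist

def minimum_spanning_tree(vectors):
    """Prim's algorithm on the complete graph of players.

    `vectors` maps a player name to a list of metric values.
    Returns a list of (player_in_tree, new_player, distance) edges.
    """
    names = list(vectors)
    in_tree = [names[0]]
    remaining = names[1:]
    edges = []
    while remaining:
        # Pick the shortest edge that links the tree to a player outside it.
        u, v = min(
            ((a, b) for a in in_tree for b in remaining),
            key=lambda e: dist(vectors[e[0]], vectors[e[1]]),
        )
        edges.append((u, v, dist(vectors[u], vectors[v])))
        in_tree.append(v)
        remaining.remove(v)
    return edges

# Hypothetical two-metric profiles
toy = {
    "A": [0.0, 0.0],
    "B": [1.0, 0.0],
    "C": [1.0, 2.0],
    "D": [5.0, 2.0],
}
tree = minimum_spanning_tree(toy)
total_length = sum(weight for _, _, weight in tree)
```

A tree over n players always has n - 1 edges, so most pairs of similar players will not be directly linked. Each player is joined to the tree by the cheapest available connection, which is why a direct link is a strong signal of similarity while the absence of one is only a weak signal of difference.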
Some key insights can be drawn from the MST across all clusters. Cluster 0 appears to be the most central, with connections to all other clusters, signifying that these Possession Retainers contain further sub-types within the cluster and host many hybrid archetypes. Mateo Kovačić and Enzo Fernández, although belonging to the same cluster, are directly linked to Thomas Partey and Bruno Fernandes respectively, displaying the variety within this all-round midfield profile.
Cluster 1 forms the largest subgroup, highlighting the importance and abundance of steady, reliable, defensively minded midfielders within the Premier League. Moisés Caicedo and Casemiro in particular appear as some of the best performers in this role.
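The idea of a "central" cluster can also be checked numerically rather than by eye. One simple approach, sketched below as an assumption about how it might be done, is to count the MST edges whose endpoints sit in different clusters and see which clusters each one touches. The edge list and cluster assignments here are hypothetical placeholders, not the article's data.

```python
from collections import Counter

def cross_cluster_links(edges, cluster_of):
    """Count MST edges that join players from two different clusters."""
    links = Counter()
    for u, v in edges:
        a, b = cluster_of[u], cluster_of[v]
        if a != b:
            links[tuple(sorted((a, b)))] += 1
    return links

def neighbouring_clusters(links, cluster):
    """Clusters that share at least one MST edge with `cluster`."""
    return {other for pair in links if cluster in pair for other in pair} - {cluster}

# Hypothetical MST edges and cluster labels
edges = [("P1", "P2"), ("P2", "P3"), ("P2", "P4"), ("P4", "P5"), ("P5", "P6")]
cluster_of = {"P1": 1, "P2": 0, "P3": 3, "P4": 0, "P5": 4, "P6": 4}

links = cross_cluster_links(edges, cluster_of)
touching_zero = neighbouring_clusters(links, 0)
```

In this toy example cluster 0 borders clusters 1, 3 and 4, which is the pattern one would expect from a hub of hybrid profiles.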
Cluster 2 hosts very close-knit and similar nodes, suggesting interchangeability and a balance in ability with no strong outliers aside from, surprisingly, Marshall Munetsi.
Cluster 3 is the most isolated cluster, suggesting that the profile of Progressive Carriers is a rather unique one. This is reflected in the inclusion of skilful players such as Eberechi Eze, Amad Diallo and Matheus Cunha within the cluster.
Cluster 4 is the smallest cluster, comprising only 8 players, and is also the most spread out. This highlights just how few and far between these Elite Creators are within the Premier League, to the point where even the players within this cluster vary drastically in how they accomplish consistent chance creation for their teams. Players such as Kevin De Bruyne and Bruno Fernandes sit distinctly in their own space, with no equivalents in playstyle, while both providing an abundance of creativity.
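How "spread out" a cluster is can be expressed as a single number. A minimal sketch, assuming spread is measured as the mean Euclidean distance from each member to its cluster centroid (the article reads spread off the MST visually, so this measure is an illustrative choice), using hypothetical vectors:

```python
from math import dist

def cluster_spread(vectors, labels):
    """Mean Euclidean distance from each member to its cluster's centroid."""
    groups = {}
    for vec, lab in zip(vectors, labels):
        groups.setdefault(lab, []).append(vec)
    spread = {}
    for lab, members in groups.items():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        spread[lab] = sum(dist(m, centroid) for m in members) / len(members)
    return spread

# Hypothetical profiles: cluster 0 is tight, cluster 1 is loose
vectors = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 8.0]]
labels = [0, 0, 1, 1]
spread = cluster_spread(vectors, labels)
```

A larger value means the members of that cluster sit further from their shared average profile, which is the sense in which a small group of creators can still be the most varied.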
To conclude, this MST visualisation of the K-means clustering of Premier League midfielders provides a detailed map of the tactical nuances and relationships between players. This analysis can pave the way for more targeted player evaluations, such as identifying which profiles work best together, scouting for specific archetypes of midfielder, or even mapping how young players may develop through and into these clusters and player roles.