In this post, we take a closer look at how SciSports developed an Estimated Transfer Value (ETV) model to support accurate transfer fee predictions.
The Estimated Transfer Value (ETV) is a Machine Learning (ML) model that learns patterns from thousands of past transfers. By training the ML model on previous transfers, it learns to accurately estimate the ETV of every player in our database based on features like the SciSkill, Potential and contract duration.
For practical use cases of the values, see this article.
Estimated Transfer Value (ETV)
A machine learning model was trained on ±10,000 historical paid transfers to find patterns in the fees paid on the transfer market. For all these transfers we collect the SciSkill, Potential, recent performance, contract duration, age and position of the player at the moment of the transfer. In that way, the machine learning model can find patterns, for example that players with a higher SciSkill tend to have a higher transfer fee, whereas players nearing the end of their contract tend to have lower transfer fees. Refer to Input features for more details about the variables that are fed into the ETV model.
As the transfer market is constantly developing, and economic inflation causes prices to rise by 0-10% per transfer window, we have to make sure the model remains aligned with the market. Therefore, the model is automatically retrained and updated after every transfer window, to include the latest information and account for changes in the market.
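To see why retraining after every window matters, consider how per-window inflation compounds. The sketch below is an illustration only and is not part of the ETV model; the function name `inflate_fee`, the constant 5% rate and the four-window horizon are assumptions (the text only states a range of 0-10% per window).

```python
def inflate_fee(fee_mln: float, rate_per_window: float, windows: int) -> float:
    """Compound a fee forward by a constant inflation rate per transfer window."""
    return fee_mln * (1.0 + rate_per_window) ** windows


# Assumed for illustration: a €10 mln fee paid four transfer windows ago,
# with prices rising by a constant 5% per window since then.
old_fee_mln = 10.0
todays_equivalent = inflate_fee(old_fee_mln, rate_per_window=0.05, windows=4)
print(round(todays_equivalent, 2))
```

Under these assumptions the same transfer would cost roughly €12.2 mln today, so a model that is never refreshed would systematically underestimate current fees.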
Broadly, the information taken into account when predicting transfer fees can be grouped as follows:
Information group | Short description | Influence |
Player current and potential skill | SciSkill, Potential, resistance factor at time of transfer | In general, the higher the SciSkill and/or Potential, the higher the predicted transfer fee. |
Player age | Age of player at time of transfer | In general, the lower the age, the higher the predicted transfer fee. |
Contract duration | Number of days left and indications of whether the contract ends within a year, 2 years or longer. | In general, the longer the contract duration, the higher the predicted transfer fee. |
Player position | Offensive share of the player, which is higher the further forward the player plays | In general, the higher the offensive share, the higher the predicted transfer fee. This means that, in general, forwards get higher predicted transfer fees than goalkeepers and defenders. |
Player experience on different levels | All corrected for age | In general, the more experience on a higher level, the higher the predicted transfer fee. |
Player recent playing minutes on different levels | In the last 0.5 year, 1 year and 2 years | In general, the more minutes played on a higher level more recently, the higher the predicted transfer fee. When a player didn’t play much in the previous season, his fee will likely drop. |
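The contract duration group combines a raw count of days left with indicators for the remaining term. A minimal sketch of such an encoding follows; the function name, the dictionary keys and the 365/730-day cut-offs are assumptions made for illustration, not the actual feature definitions used by SciSports.

```python
from datetime import date


def contract_features(contract_end: date, transfer_date: date) -> dict:
    """Encode remaining contract time as days left plus mutually exclusive flags."""
    # An expired contract is treated as zero days left.
    days_left = max((contract_end - transfer_date).days, 0)
    return {
        "days_left": days_left,
        "ends_within_1y": days_left <= 365,
        "ends_in_1_to_2y": 365 < days_left <= 730,
        "ends_after_2y": days_left > 730,
    }


# Hypothetical player: contract until 30 June 2026, transfer on 15 January 2025.
features = contract_features(date(2026, 6, 30), date(2025, 1, 15))
print(features)
```

Keeping both the day count and the flags lets a model pick up gradual effects as well as step changes, such as a sharp drop in fee once a player enters the final year of his contract.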
Two-Stage ETV Prediction
The best talents are rare and thus very expensive, and this quickly becomes clear from the distribution of fees paid on the transfer market. Most players are sold for fees in the range of €0-5 mln, while only a very small subset of players is sold in the range of €50-200 mln. As a result, there is no one-size-fits-all approach to ETV prediction, as different factors matter for the small group of expensive players and the (bigger) group of less expensive players. Furthermore, the supply-demand balance differs between the two groups, which further changes how the ETV is made up for each of them.
SciSports has developed a two-stage ETV prediction approach to account for these differences:
We predict if a player will likely sell for a fee above or below €5 mln.
We trained ETV models specific to a certain price range (€0-5 mln and >€5 mln), and based on the prediction in step one we use the best-suited model for ETV prediction.
If we are unsure about the group a player belongs to (as he is most likely to be valued at ±€5 mln), we take the average of both models to represent his ETV.
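The routing logic described above can be sketched as follows. This is a minimal illustration under stated assumptions: the stage-one classifier is assumed to return a probability that the fee exceeds €5 mln, `low_model` and `high_model` are stand-ins for the two price-range models, and the 0.4-0.6 "unsure" band is a hypothetical threshold that the article does not specify.

```python
from typing import Callable, Mapping, Tuple

Features = Mapping[str, float]
Model = Callable[[Features], float]


def two_stage_etv(
    features: Features,
    prob_above_5mln: Model,   # stage one: assumed to output P(fee > €5 mln)
    low_model: Model,         # stage two: model for the €0-5 mln range
    high_model: Model,        # stage two: model for the >€5 mln range
    unsure_band: Tuple[float, float] = (0.4, 0.6),  # hypothetical band
) -> float:
    """Return an ETV in € mln using a classify-then-regress scheme."""
    p = prob_above_5mln(features)
    lower, upper = unsure_band
    if p < lower:
        return low_model(features)
    if p > upper:
        return high_model(features)
    # The classifier is unsure: average the two range-specific estimates.
    return (low_model(features) + high_model(features)) / 2.0


# Toy stand-ins for the trained models, for illustration only.
player = {"sciskill": 95.0}
etv = two_stage_etv(
    player,
    prob_above_5mln=lambda f: 0.5,
    low_model=lambda f: 4.0,
    high_model=lambda f: 7.0,
)
print(etv)  # 5.5
```

Averaging in the uncertain band avoids a sharp jump in a player's ETV when the classifier's decision flips around the €5 mln boundary.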
Input features
This model uses information about performance, potential, contract duration and position to predict the ETV. More precisely, it uses the following information:
SciSkill;
Potential;
Age;
Contract Duration;
Recent (last 52 weeks) playing minutes;
Playing Level;
Player Development in the past 6 months;
Current League.
Model Accuracy
Our suite of ETV models is capable of accurately estimating a player's transfer value, as can be derived from the following evaluation metrics:
We can predict a player to sell for over or under €5 mln with >90% accuracy.
For players <€5 mln, we can predict the ETV of the player with an absolute error margin of ±€0.4 mln.
For players >€5 mln, we can predict the ETV of the player with an absolute error margin of ±€3 mln.
Please note that, although €3 million is a substantial amount, the average player valued at over €5 million typically sells for €15-20 million. Since many factors influence the actual fee paid, and we can only account for performance- and contract-related attributes, this margin of error still represents a very accurate prediction: approximately 25% closer to the actual value than, for example, the market value reported on Transfermarkt.com.
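The two kinds of metric quoted above can be computed as in the sketch below. It assumes that "accuracy" means the share of correct above/below €5 mln calls and that the "absolute error margin" is a mean absolute error; the article does not define either term precisely, and all sample values are made up for illustration.

```python
from typing import Sequence


def classification_accuracy(actual: Sequence[bool], predicted: Sequence[bool]) -> float:
    """Share of players whose above/below €5 mln label was predicted correctly."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between paid and predicted fees (€ mln)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up fees in € mln, for illustration only.
actual_fees = [1.0, 3.0, 12.0, 20.0]
predicted_fees = [1.5, 6.0, 10.0, 23.0]
actual_above = [fee > 5.0 for fee in actual_fees]
predicted_above = [False, True, True, True]  # hypothetical stage-one output

print(classification_accuracy(actual_above, predicted_above))  # 0.75
print(mean_absolute_error(actual_fees, predicted_fees))        # 2.125
```

Put in relative terms, an error of €3 mln on a typical fee of €15-20 mln corresponds to roughly 15-20% of the fee.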
ETV Patterns
As a rule of thumb, the following patterns can be recognized in the model’s results, and help to understand why certain players are worth more than others:
The most valuable players are young, high potential players. They represent future value, and the market expects them to only increase in value and thus clubs are willing to pay large fees for this group.
Staying on top is even harder than getting there: we all know the household names of elite football. Haaland, Mbappé, Bellingham or Rodri: they are believed to be the best of the best, and are typically rewarded with high ETVs. However, once they are out of form, stall in their development or lack playing minutes, their values start to drop.
Older players will quickly decrease in value. On the transfer market, players are investment assets that represent value. When a player goes past his prime, his value will decrease with time, and once players get older their return on investment will quickly diminish, as will their ETV.
Model Safety
As with all our scouting and recruitment models, safety and trustworthiness are paramount to ensuring fairness and eliminating bias toward specific subgroups of players. To guarantee valid output, a secure and unbiased model, and to maximize trustworthiness, we have implemented multiple guardrails within the model. These ensure that players cannot rise or fall beyond defined boundaries or experience sudden or inexplicable peaks or dips in their ETV. Additionally, questionable ETV predictions are automatically flagged by our system for human review, establishing a 'human-in-the-loop' AI system.
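A guardrail of this kind could look like the sketch below. The 30% limit per update, the function name and the review flag are hypothetical; the article does not disclose the actual boundaries or the criteria used for flagging.

```python
from typing import Tuple


def apply_guardrail(
    previous_etv: float,
    proposed_etv: float,
    max_rel_change: float = 0.30,  # hypothetical limit per update
) -> Tuple[float, bool]:
    """Clamp a new ETV to a band around the previous one and flag large moves.

    Returns the (possibly clamped) ETV and whether a human should review it.
    """
    lower = previous_etv * (1.0 - max_rel_change)
    upper = previous_etv * (1.0 + max_rel_change)
    clamped = min(max(proposed_etv, lower), upper)
    needs_review = clamped != proposed_etv
    return clamped, needs_review


# A proposed jump from €10 mln to €18 mln exceeds the assumed 30% band,
# so the value is capped and the prediction is flagged for review.
value, flagged = apply_guardrail(previous_etv=10.0, proposed_etv=18.0)
print(value, flagged)
```

Returning the flag alongside the clamped value keeps the automated limit and the human review step separate, so a reviewer can still approve a genuine jump in value.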