To validate a metric, it is important to know what it aims to measure, as well as for what purpose and by whom it is used.
As described in this article, the SciSkill is designed to "provide an indication of a player's skill level for scouting purposes", while the Potential measures "a football player's peak ability". Both metrics are designed to help football scouts determine whether a player has the current or future skill level to play for their team.
For both the SciSkill and Potential, we validate and tune our algorithms based on three different aspects:
Comparison with common lists
Client feedback
Data analysis
Comparison with common lists
First of all, we frequently check the outcomes of the SciSkill and Potential algorithms with lists of players that are widely accepted in the football industry.
For the SciSkill Index, we mostly use scouting shortlists from football clubs for this comparison. Some scouts at football clubs keep lists of the "Top X players" per position per region. We use these lists to review their overlap with the outcomes of the SciSkill Index (after filtering on the corresponding positions and regions). Because the SciSkill metric is designed to "provide an indication of a player's skill level", this exercise gives us insight into how effective the metric remains in doing so.
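The overlap check itself is straightforward. Below is a minimal sketch in Python; the data structures and field names are hypothetical and only illustrate the idea, they are not our actual pipeline:

```python
# Minimal sketch (hypothetical data model): overlap between a scout's shortlist
# and the top-X players by SciSkill for the same position and region.

def top_x_by_sciskill(players, position, region, x):
    """Names of the top-x players by SciSkill after filtering on position and region."""
    pool = [p for p in players if p["position"] == position and p["region"] == region]
    pool.sort(key=lambda p: p["sciskill"], reverse=True)
    return {p["name"] for p in pool[:x]}

def shortlist_overlap(shortlist, players, position, region):
    """Fraction of the scout's shortlist that also appears in our own top-X ranking."""
    ranked = top_x_by_sciskill(players, position, region, x=len(shortlist))
    return len(set(shortlist) & ranked) / len(shortlist)
```

A high overlap indicates that the SciSkill ranking and the scouts' judgement largely agree; structural mismatches for a specific position or region are a trigger for further research.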
The Potential metric is closely related to the SciSkill metric, as it indicates a player's estimated peak SciSkill. The Potential metric is also validated with player lists from scouts at football clubs, but with lists that contain the most promising youngsters (per position per region) according to the scout.
Besides scouting lists, we also look at rankings such as the Golden Boy award to evaluate our SciSkill and Potential metrics. In this blog post, we discuss the 2017 Golden Boy award with the help of these metrics.
Client feedback
Obviously, our clients are an important source of input for improvements to our metrics. We continuously gather feedback on the usability of our platform, as well as on the effectiveness of our metrics. Once we spot patterns in the feedback we receive from different clients on our metrics, we research the cause and define improvements.
For example, in 2019 we received several requests from clients to review the 3rd and 4th levels in Germany, as well as the 2nd and 3rd levels in Spain. We therefore performed an analysis of the SciSkill and Potential metrics as they were at the time, in order to improve the "balance of strengths" between leagues, with a special focus on the leagues mentioned by our clients. In this analysis we used data from international matches (Champions League, Europa League, etc.) to calibrate the strength of each league and subsequently determined more accurate SciSkill and Potential values. This eventually resulted in an improved SciSkill Index in one of the releases later that year.
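The exact calibration is part of our model, but the idea can be illustrated with a simple Elo-style update on league-strength offsets that uses only cross-league results. The sketch below is an illustration under that assumption, not the actual SciSkill algorithm; the rating scale and update factor are placeholder values:

```python
# Minimal sketch (not the actual SciSkill calibration): Elo-style estimation of
# league-strength offsets from cross-league results such as Champions League
# and Europa League matches.

def expected_home_score(home_strength, away_strength):
    """Expected score for the team from the 'home' league (logistic curve)."""
    return 1.0 / (1.0 + 10 ** ((away_strength - home_strength) / 400.0))

def calibrate_league_strengths(cross_league_matches, k=10.0, sweeps=25):
    """cross_league_matches: (home_league, away_league, home_points), points 1, 0.5 or 0."""
    strengths = {}
    for _ in range(sweeps):
        for home_league, away_league, home_points in cross_league_matches:
            home = strengths.setdefault(home_league, 1500.0)
            away = strengths.setdefault(away_league, 1500.0)
            delta = k * (home_points - expected_home_score(home, away))
            strengths[home_league] = home + delta
            strengths[away_league] = away - delta
    return strengths
```

Once the relative strength of each league is estimated from such matches, the player values within those leagues can be re-anchored accordingly.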
Data analysis
Finally, we periodically run analyses to measure the effectiveness of our metrics. For the SciSkill Index, we do so by predicting match outcomes and benchmarking these predictions against those of bookmakers. Based on the SciSkill of the players in the expected line-ups of the teams, our predictions outperform the bookmakers'.
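As an illustration of how such a benchmark can be set up (the logistic mapping and its scale parameter below are assumptions for the sketch, not our production model):

```python
# Minimal sketch: map team SciSkill totals to a home-win probability and
# benchmark it against the bookmaker's implied probability via the Brier score.
import math

def home_win_probability(home_sciskill_total, away_sciskill_total, scale=25.0):
    """Map the SciSkill difference of the expected line-ups to a home-win probability."""
    diff = home_sciskill_total - away_sciskill_total
    return 1.0 / (1.0 + math.exp(-diff / scale))

def implied_home_win_probability(odds_home, odds_draw, odds_away):
    """Bookmaker's implied home-win probability from decimal odds, margin removed."""
    raw = [1.0 / odds_home, 1.0 / odds_draw, 1.0 / odds_away]
    return raw[0] / sum(raw)

def brier_score(predicted_probs, outcomes):
    """Mean squared error between predicted probabilities and outcomes (1 = home win, else 0)."""
    return sum((p - o) ** 2 for p, o in zip(predicted_probs, outcomes)) / len(outcomes)
```

A lower Brier score over a large set of matches than the one computed from the bookmakers' implied probabilities indicates that the SciSkill-based predictions are the more accurate of the two.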
The effectiveness of the Potential metric is monitored by looking at the SciSkill history and historical Potential predictions of (older or retired) players. At any point in time, the accuracy of a player's Potential prediction depends on the number of matches the player has played: the more matches a player has played, the more accurate our estimate of the player's peak SciSkill becomes. This accuracy is also reflected in the visualisation of a player's SciSkill development (part of the summer 2020 release).
Naturally, the Potential prediction for players who have played fewer than 5 matches is still quite inaccurate. For these players, we are quite sure that they will eventually reach a SciSkill within +/- 35% of their Potential prediction. The more matches we have for a player, the narrower this range becomes. For players who have played more than 100 matches, we are quite sure that their peak SciSkill will end up within +/- 17% of the Potential prediction.
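For players whose careers are (largely) behind them, such ranges can be monitored with a check like the one below. The linear interpolation between the two quoted band widths is an assumption for illustration; only the +/- 35% and +/- 17% anchor points come from the figures above:

```python
# Minimal sketch: check whether a player's realised peak SciSkill fell inside
# the uncertainty band around the historical Potential prediction.

def band_width(matches_played):
    """Relative band width: +/-35% below 5 matches, +/-17% beyond 100 matches."""
    if matches_played < 5:
        return 0.35
    if matches_played >= 100:
        return 0.17
    # Hypothetical linear interpolation between the two quoted anchor points.
    fraction = (matches_played - 5) / (100 - 5)
    return 0.35 - fraction * (0.35 - 0.17)

def within_band(peak_sciskill, potential_prediction, matches_played):
    """True if the realised peak SciSkill lies within the predicted band."""
    width = band_width(matches_played)
    lower = potential_prediction * (1 - width)
    upper = potential_prediction * (1 + width)
    return lower <= peak_sciskill <= upper
```

Running this over the historical predictions of retired players shows how often the realised peak SciSkill indeed landed inside the band, and whether the bands narrow as intended when more matches become available.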