Tracking Data processing

How do we get from raw tracking data to interpretable physical variables?

Written by Jurre van Laarhoven
Updated over a week ago

Tracking data to physical variables

There are multiple ways to process raw tracking data (position data) into physical variables for individual players or team totals. To help you interpret these variables correctly, this article describes how the Performance Center processes the data.

Which variables to use?

There is a lack of consensus, both in the literature and among professionals, on the thresholds and parameters to use for physical variables. This sometimes makes it difficult to compare data from different sources or providers. We have done extensive research on the best-practice variables used to indicate the volume and intensity of the physical demands of the game, and we hope the resulting set of variables is both useful and well underpinned. Find more information about speed zones and thresholds here.

Find the set of our variables here.


Data Quality

All post-match data coming from the tracking provider should by definition already be post-processed. Post-processing typically involves filling in missing data points and removing duplicates and clear errors. Ideally, missing data should not be a problem and the data should already be filtered. However, this is not always the case, and therefore SciSports has implemented processes to improve the quality of the data.


Data (pre-)processing

Before making decisions on which variables to use there are some more technical choices to make in the process from raw data to the interpretable outcomes such as total distance or number of sprints. These ‘pre-processing’ methods aim to improve the validity (accuracy) and reliability (precision) of the position data.

  1. Downsampling & smoothing

  2. Filtering

  3. Calculating speed & acceleration

1. Downsampling & smoothing

The data has been smoothed and possible drifting is removed by the tracking provider.

Data is downsampled from 25 Hz to 10 Hz by averaging the values within each 100 ms timeframe. Because 25 Hz yields 2.5 samples per 100 ms, each average is by definition taken over either 2 or 3 data points.

The result is the position of 22 players and the ball 10 times per second.
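The averaging step can be illustrated with a minimal sketch. It assumes samples arrive as (timestamp in ms, x, y) tuples and that each output frame is labelled with the start of its 100 ms bin; the function name and data layout are hypothetical and not taken from the Performance Center code.

```python
from collections import defaultdict


def downsample_to_10hz(samples, bin_ms=100):
    """Average (t_ms, x, y) samples that fall into the same 100 ms bin.

    Returns one tuple per bin: (bin_start_ms, mean_x, mean_y, n_samples).
    Labelling a frame with the start of its bin is an assumption.
    """
    bins = defaultdict(list)
    for t_ms, x, y in samples:
        bins[int(t_ms // bin_ms)].append((x, y))

    frames = []
    for b in sorted(bins):
        pts = bins[b]
        n = len(pts)
        mean_x = sum(p[0] for p in pts) / n
        mean_y = sum(p[1] for p in pts) / n
        frames.append((b * bin_ms, mean_x, mean_y, n))
    return frames


# One second of 25 Hz input: a sample every 40 ms, moving 0.2 m per sample along x.
raw = [(i * 40, 0.2 * i, 0.0) for i in range(25)]
frames = downsample_to_10hz(raw)
counts = [f[3] for f in frames]  # number of raw samples behind each 10 Hz frame
```

With a 40 ms sample interval, the bins alternate between 3 and 2 raw samples, which is the "either 2 or 3 data points" mentioned above.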

2. Filtering

All data with irrelevant IDs (referees), and all data without an ID that is not ball data, is filtered out of the positions data frame. Data with x,y coordinates outside the expected range for players and ball is left in, as it cannot be determined whether the data is incorrect or the player or ball is truly out of bounds. All data that falls within the half-time break is filtered out. All speed data supplied by the data provider for players and ball is removed, as speed is recomputed more accurately from the positions.
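A minimal sketch of these filtering rules is shown below. The row layout (dictionaries with "id", "is_ball", "t", "x", "y" and "speed" keys), the referee ID set and the half-time boundaries are all hypothetical placeholders chosen for illustration.

```python
def filter_positions(rows, referee_ids, halftime_start, halftime_end):
    """Apply the four filtering rules to a list of position rows (dicts)."""
    kept = []
    for row in rows:
        is_ball = row.get("is_ball", False)
        obj_id = row.get("id")

        # Rule 1: drop referees and rows without an ID, unless the row is ball data.
        if not is_ball and (obj_id is None or obj_id in referee_ids):
            continue

        # Rule 2: out-of-range coordinates are deliberately NOT filtered.

        # Rule 3: drop everything recorded during the half-time break.
        if halftime_start <= row["t"] < halftime_end:
            continue

        # Rule 4: discard the provider's speed value; it is recomputed later.
        kept.append({k: v for k, v in row.items() if k != "speed"})
    return kept


rows = [
    {"id": "p7", "t": 10.0, "x": 12.0, "y": 3.0, "speed": 4.1},      # kept
    {"id": "ref1", "t": 10.0, "x": 0.0, "y": 0.0, "speed": 2.0},     # referee
    {"id": None, "t": 10.0, "x": 5.0, "y": 5.0, "speed": 1.0},       # no ID, not ball
    {"id": None, "is_ball": True, "t": 10.0, "x": -60.0, "y": 0.0,
     "speed": 9.0},                                                  # ball, out of bounds: kept
    {"id": "p7", "t": 2800.0, "x": 1.0, "y": 1.0, "speed": 0.5},     # half-time
]
result = filter_positions(rows, referee_ids={"ref1"},
                          halftime_start=2700.0, halftime_end=3600.0)
```

Only the first player row and the out-of-bounds ball row survive, and neither keeps its "speed" field.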

3. Speed and acceleration

Speed and acceleration can theoretically be computed in a very simple manner based on eq. 1 (speed) and eq. 2 (acceleration):

  1. Vt = sqrt(dx**2 + dy**2) / dt

  2. At = dv/dt

However, position data from optical tracking providers is inherently noisy, and differentiating a noisy signal amplifies the errors, so one should be cautious when applying these equations. Therefore, for every provider, we investigate the best solution for computing velocity - and subsequently acceleration - initially comparing the following three methods:

  • raw speed with dt = 100 ms → this is the most exact way of computing speed, as it simply computes speed from the displacement over the last 100 ms. Every longer window is a smoothed approximation and thereby a less exact estimate, but the 100 ms method is also the most error-prone.

  • raw speed with dt = 500 ms → this approximates speed at t using the displacement over the last 500 ms (0.5 seconds).

  • raw speed with dt = 1000 ms → this approximates speed at t using the displacement over the last 1000 ms (1 second).
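The three methods differ only in how many 10 Hz frames the displacement spans. The sketch below assumes "displacement" means the straight-line distance between the position now and the position dt ago; the function name and the synthetic test signal are illustrative only.

```python
import math

FRAME_S = 0.1  # one frame at 10 Hz


def speed_over_window(xs, ys, lag_frames):
    """Speed at frame t from the displacement over the last `lag_frames` frames."""
    dt = lag_frames * FRAME_S
    speeds = []
    for t in range(lag_frames, len(xs)):
        dx = xs[t] - xs[t - lag_frames]
        dy = ys[t] - ys[t - lag_frames]
        speeds.append(math.hypot(dx, dy) / dt)
    return speeds


# Synthetic player: a true 5 m/s along x, with +/- 5 cm of alternating jitter in y.
xs = [0.5 * i for i in range(21)]
ys = [0.05 if i % 2 == 0 else -0.05 for i in range(21)]

s100 = speed_over_window(xs, ys, lag_frames=1)    # dt = 100 ms
s500 = speed_over_window(xs, ys, lag_frames=5)    # dt = 500 ms
s1000 = speed_over_window(xs, ys, lag_frames=10)  # dt = 1000 ms

# Largest deviation from the true 5 m/s for each method.
err100 = max(abs(s - 5.0) for s in s100)
err500 = max(abs(s - 5.0) for s in s500)
err1000 = max(abs(s - 5.0) for s in s1000)
```

On this toy signal the same positional jitter produces a much larger speed error at dt = 100 ms than at 500 ms or 1000 ms, because the error is divided by a smaller dt.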

We have performed tests to compare the various methods. For the raw signals of most tracking providers, we found that the 500 ms and 1000 ms methods provide a smoothed approximation, whereas the 100 ms signal is noisy.

To correct for the noise, we compute the speed and acceleration of players and the ball over a rolling window of 10 frames (1 second): speed as a rolling sum of the Euclidean displacement, and acceleration from the derivative of the Euclidean displacement. This removes the need to apply additional filtering techniques*.

*Common filtering techniques for spatiotemporal data are Gaussian filters and Butterworth low-pass or band-pass filters.
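A minimal sketch of the rolling computation follows. Summing the frame-to-frame Euclidean displacement over 10 frames gives the distance covered in the last second, which is numerically the average speed in m/s over that second. The acceleration step is one possible reading of the description above: it is assumed here to be the change in rolling speed over the same 1-second window. Function names are hypothetical.

```python
import math

WINDOW = 10    # frames, i.e. 1 second at 10 Hz
FRAME_S = 0.1  # seconds per frame


def rolling_speed(xs, ys, window=WINDOW, frame_s=FRAME_S):
    """Rolling sum of per-frame Euclidean displacement, expressed in m/s."""
    steps = [math.hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1])
             for i in range(1, len(xs))]
    window_s = window * frame_s
    return [sum(steps[i - window:i]) / window_s
            for i in range(window, len(steps) + 1)]


def rolling_acceleration(speeds, window=WINDOW, frame_s=FRAME_S):
    """Assumed definition: change in rolling speed across the window, in m/s^2."""
    window_s = window * frame_s
    return [(speeds[i] - speeds[i - window]) / window_s
            for i in range(window, len(speeds))]


# Synthetic player accelerating from rest at 2 m/s^2 along x: x(t) = t**2.
xs = [(FRAME_S * i) ** 2 for i in range(41)]
ys = [0.0] * 41

speeds = rolling_speed(xs, ys)
accels = rolling_acceleration(speeds)
```

For this uniformly accelerating signal every value in `accels` comes out at about 2 m/s^2. Note that the rolling speed lags the instantaneous speed by roughly half a window, since it is an average over the last second.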


Tracking providers

SciSports

SciSports offers a tracking solution based on Optical Tracking. Currently, this solution is used for the Keuken Kampioen Divisie in the Performance Center.

The SciSports optical tracking solution converts video into tracking data with comprehensive coverage of all players, match officials, and the ball during every match. This data delivery solution captures positions 25 times per second.

SciSports delivers the tracking data files according to the Electronic Performance and Tracking Systems (EPTS) Standard Data Format. Detailed information on this format can be found on the FIFA website:
https://www.fifa.com/technical/football-technology/standards/epts/research-development-epts-standard-data-format

Example files can be found here:
Raw position data.txt

Tracab

There are two versions of the TRACAB® solution. The 4th generation TRACAB system (Gen4) consists of two multi-camera units (stereo pairs) at two locations on either side of the halfway line, whereas the 5th generation TRACAB system (Gen5) is a distributed camera system.

It should be noted that optical tracking data is prone to errors introduced by occlusions and ID-swaps, which is why one human operator is required to correct these errors during measurement. For this reason, three types of tracking data can be distinguished:

  1. Real-time data which is available immediately (<300ms), e.g. for live broadcasting, and therefore potentially contains several tracking errors.

  2. Live-delayed data which is made available with approximately 15s delay and has fewer errors due to further processing and human interventions.

  3. Post-processed data which is the final data, made available within several hours after the recording. This final step comprises complete data correction and post-processing.

Read more about ‘live data’ here.


If you have any questions or suggestions, please send us your feedback.

Use the chat icon on the right to directly get in touch with us.
