Using Machine Learning to Predict Tactics from Spatiotemporal Data in Football

During my studies in Human Technologies in Sports and Medicine at the German Sport University Cologne, I wrote my master’s thesis (2021–2022) on applying machine learning to automate tactical analysis in soccer. Building on this work, I co-authored a publication in the journal Data Mining and Knowledge Discovery, available at https://link.springer.com/article/10.1007/s10618-025-01092-9
The aim of my thesis was to investigate how well machine learning models can predict tactical maneuvers from spatiotemporal data of the players and the ball, collected from 15 first-division Bundesliga matches. A key part of the work was building a feature-engineered dataset from manual expert annotations of match phases and validating the consistency of these annotations. Several supervised learning techniques, including logistic regression, random forests, and neural networks, were then trained to predict these tactical maneuvers.
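To make the idea of feature engineering on positional data more concrete, here is a minimal sketch that turns the raw (x, y) positions of one team in a single frame into a few team-shape descriptors. The feature names (centroid, length, width, spread) and the function `team_shape_features` are assumptions chosen for illustration; they are not necessarily the features used in the thesis or the publication.

```python
from math import hypot
from statistics import mean


def team_shape_features(positions):
    """Compute simple team-shape features for one frame.

    positions: list of (x, y) tuples in metres, one per player.
    The feature set is hypothetical and only illustrates the idea of
    replacing raw coordinates with domain-informed descriptors.
    """
    xs = [x for x, _ in positions]
    ys = [y for _, y in positions]
    cx, cy = mean(xs), mean(ys)
    # Mean distance of the players to the team centroid.
    spread = mean(hypot(x - cx, y - cy) for x, y in positions)
    return {
        "centroid_x": cx,
        "centroid_y": cy,
        "length": max(xs) - min(xs),  # extent along the x-axis
        "width": max(ys) - min(ys),   # extent along the y-axis
        "spread": spread,
    }


# Toy frame with four players (made-up coordinates, not match data).
frame = [(10.0, 20.0), (20.0, 20.0), (10.0, 40.0), (20.0, 40.0)]
features = team_shape_features(frame)
```

A vector of such descriptors per frame (or per time window) could then serve as model input instead of the raw coordinates.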
The findings indicated that feature engineering significantly improved model performance compared to using raw positional data alone: models trained on engineered features achieved an F1-score increase of about 0.38 ± 0.15. Furthermore, a sequential neural network outperformed a baseline logistic regression model by an average of 2.77 ± 2.26 percentage points, suggesting that more complex models can offer small but meaningful improvements.
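For readers less familiar with the metric, the following sketch shows how a binary F1-score is computed and how a "mean ± standard deviation" improvement between two models could be summarized. It is an illustration only: the labels and per-fold scores are toy values, the helper names are hypothetical, and the use of the sample standard deviation over evaluation folds is an assumption, since the text above does not state how the reported spreads were obtained.

```python
from statistics import mean, stdev


def f1_score(y_true, y_pred, positive=1):
    """Binary F1-score: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denominator = 2 * tp + fp + fn
    # Return 0.0 when there are no positive labels or predictions at all.
    return 2 * tp / denominator if denominator else 0.0


def mean_and_std(differences):
    """Summarize paired score differences as (mean, sample std)."""
    return mean(differences), stdev(differences)


# Toy labels and predictions (not from the study).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
toy_f1 = f1_score(y_true, y_pred)

# Toy per-fold F1-scores for two hypothetical models.
baseline_scores = [0.60, 0.70, 0.65]
network_scores = [0.62, 0.75, 0.66]
differences = [n - b for n, b in zip(network_scores, baseline_scores)]
avg_gain, gain_std = mean_and_std(differences)
```

Reporting the spread alongside the mean matters here: when the standard deviation is close to the mean gain, the improvement is small relative to its variability across evaluations.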
Overall, the results supported the use of machine learning, and neural networks in particular, as a promising approach for automating tactical analysis in soccer. However, the findings also showed that domain knowledge, captured through thoughtful feature engineering, remained crucial for achieving strong performance. The study suggested that while machine learning can aid decision-makers and analysts, careful data preparation and expert involvement in modeling remain essential.
Deep Learning, Data Augmentation, Feature-Engineering, Statistics, Visualization, Python, Computer Vision
2022-01-10 00:00