Conference Agenda
Session: Statistics in sports

Presentations

The Best of Both Worlds: Predicting Coverage Schemes in American Football with Supervised and Unsupervised Learning
1 TU Dortmund; 2 WU Vienna; 3 Bielefeld University

Choosing between man and zone coverage is one of the most critical strategic decisions a defensive coordinator must make before each offensive play in American football. In simple terms, in man coverage each defender is assigned to guard a specific offensive player, while zone coverage requires defenders to protect designated areas of the field. This choice fundamentally shapes how the defense reacts to offensive formations and movements. Traditionally, experienced offensive coordinators and quarterbacks rely on visual cues, such as defenders’ alignment or pre-snap motion, to infer these defensive schemes. However, with the increasing availability of high-resolution player tracking data, statistical models can now uncover such tactical patterns quantitatively rather than relying solely on expert intuition. In this project, we first employ an elastic net and an XGBoost classifier to predict whether a defense is in man or zone coverage based on all players’ positions once both teams are set before the snap. The models thus capture spatial configurations that often reveal underlying defensive intentions. In a second step, we incorporate dynamic information from pre-snap player movements. Finally, in a third step, we employ features derived from a hidden Markov model (HMM). Specifically, we use an HMM to represent defenders’ movement trajectories over time. The hidden states correspond to potential offensive players being covered by each defender. From the decoded state sequences, we extract summary statistics, such as the number of state (defender) switches. Including these HMM-based features in the aforementioned models significantly enhances the models’ predictive accuracy. Beyond the pure classification performance, our approach also enables deeper tactical analyses.
For instance, it allows us to explore how pre-snap motion helps offenses identify defensive coverages more effectively. Comparing the predicted coverage probabilities before and after pre-snap motion provides insight into how well offensive movements reveal defensive strategies. Overall, this framework demonstrates how modern machine learning techniques in combination with a statistical model can provide quantitative insights into complex team sports tactics. While developed within an American football context, the methodology may generalize to other sports where spatial positioning and interaction dynamics play similarly crucial roles.

Modelling momentum in tennis: A latent-state approach to point outcomes and rally lengths
1 Bielefeld University, Germany; 2 TU Dortmund, Germany

Tennis matches are often characterised by momentum shifts (i.e., changes in match dynamics over time), marked by transitions between phases where either player 1 or player 2 dominates. While dominance is clearly reflected in a player’s point wins, rally lengths provide additional valuable information for modelling momentum; short rallies suggest strong momentum, whereas long rallies and point losses indicate pressure. To effectively model momentum shifts, we hence propose considering both the outcomes of the points and the rally lengths. These sequentially observed outcomes reflect the current dynamics of the match (i.e., the level of pressure a player exerts on their opponent), which we regard as an unobserved state process. Thus, we employ a latent-state approach to investigate these momentum shifts. Specifically, we model the outcomes of server wins and rally lengths jointly using Markov-modulated marked Poisson processes (MMMPPs). This flexible framework allows us to relate the events (server wins or loses the point) and the event times (rally length) to an underlying latent state process, modelled as a continuous-time Markov chain.
Its states determine the distribution of the outcomes and can be interpreted as proxies for the players’ momentum. For data from all Grand Slam tournaments from 2016 to 2024, we identify momentum shifts within tennis matches using MMMPPs with two latent states, accounting for player- and match-specific effects such as player rankings and court surfaces.

The Accuracy–Complexity Trade-Off in the Expected Threat model for Football
1 TU Delft, The Netherlands; 2 AFC Ajax, The Netherlands

The Expected Threat model is a possession value model in football (soccer) with a Markov chain structure that allows for interpretation and visualization. To create a Markov chain, the pitch is discretized into different Markov states. However, selecting the right discretization of the pitch is still a challenging design choice. A model with more game states can better distinguish between different scenarios, but has fewer samples per state when estimating the Markov chain. This creates a trade-off between the model complexity in terms of the number of Markov states and the accuracy of the probability estimates. Theoretical analysis of the model gives error bounds, but interpretation of the results indicates that these might be on the conservative side. Simulations provide a more accurate characterization of the model’s error, which is indeed smaller than the theoretical bound suggests. Finally, these insights are converted into a practical rule of thumb to help practitioners choose the right balance between the number of Markov states and accuracy of the probability estimates of the Expected Threat model.
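
For readers unfamiliar with the Expected Threat model, the following is a minimal sketch of how zone values can be computed from a discretized pitch. It is not taken from the abstract: it assumes the commonly used formulation in which the value of a zone is the probability of scoring directly from a shot plus the expected value of the zone reached after a successful move. The three-zone pitch, all probabilities, and the function name `expected_threat` are made up for illustration.

```python
def expected_threat(shoot_prob, goal_prob, move_prob, transition, n_iter=200):
    """Fixed-point iteration for xT(z) = s_z * g_z + m_z * sum_k T[z][k] * xT(k).

    shoot_prob[z]    : probability of shooting from zone z (assumed input)
    goal_prob[z]     : probability that a shot from zone z scores
    move_prob[z]     : probability of moving the ball from zone z
    transition[z][k] : probability that a move from z ends in zone k;
                       rows may sum to less than 1 (the rest is a turnover)
    """
    n = len(shoot_prob)
    xt = [0.0] * n
    for _ in range(n_iter):
        xt = [
            shoot_prob[z] * goal_prob[z]
            + move_prob[z] * sum(transition[z][k] * xt[k] for k in range(n))
            for z in range(n)
        ]
    return xt


# Hypothetical three-zone pitch: 0 = own half, 1 = midfield, 2 = final third.
shoot_prob = [0.0, 0.1, 0.4]
goal_prob = [0.0, 0.05, 0.3]
move_prob = [1.0, 0.9, 0.6]
transition = [
    [0.5, 0.3, 0.0],  # each row sums to 0.8; 0.2 is lost possession
    [0.2, 0.4, 0.2],
    [0.0, 0.3, 0.5],
]

xt = expected_threat(shoot_prob, goal_prob, move_prob, transition)
for zone, value in enumerate(xt):
    print(f"zone {zone}: xT = {value:.4f}")
```

In practice every entry of these inputs would be estimated from event counts in each zone. A finer grid means more entries to estimate from the same data, which is the source of the accuracy–complexity trade-off the abstract describes.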

