Conference Agenda

Overview and details of the sessions of this conference.

Session Overview
Session: Causal Inference: Estimation and General Considerations
Time: Wednesday, 01/Oct/2025, 1:30pm - 3:00pm
Session Chair: Christoph Jindra
Location: Raum L 113


Presentations

MAGMA – A multi-group matching algorithm and a framework to evaluate balance

Julian Urban1,2, Markus Daniel Feuchter3, Franzis Preckel1

1Trier University; 2GESIS - Leibniz Institute for the Social Sciences; 3LIfBI - Leibniz Institute for Educational Trajectories

Observational studies and field research often lack the possibility of exerting experimental control. Under these conditions, covariates may threaten the causal interpretation of group comparisons. Statistical techniques such as matching (Rosenbaum & Rubin, 1985; Thoemmes & Kim, 2011) can help to strengthen causal interpretations (for an overview, see, e.g., Sedlmeier & Renkewitz, 2013). Matching attempts to adjust for covariate effects by pairing individuals from different groups who have similar values on these covariates and extracting them into a reduced matched sample for further analysis. Consequently, group differences on the covariates are reduced in this matched sample, and the groups ideally differ only in the independent variable (e.g., a treatment or intervention administered). However, available matching techniques are limited in their breadth of application (e.g., only two groups can be matched), ease of use (e.g., competing matching solutions must be extracted and compared individually), and balance evaluation (e.g., relying solely on pairwise effect comparisons). To address these limitations, we developed MAGMA (MAny Group MAtching) and the associated statistical package MAGMA.R (Urban et al., 2024). MAGMA extends matching to designs with up to four groups and provides a framework for parsimoniously evaluating balance based on four balance criteria (Feuchter et al., 2022). In two real-data examples, we show that (1) MAGMA produces solutions as well balanced as those of the most widely used matching package MatchIt (Ho et al., 2023) in a two-group example, and (2) MAGMA can reduce imbalance in a multi-group design.
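
As a rough illustration of the basic idea behind matching (not of the MAGMA algorithm or the MAGMA.R interface), the following Python sketch performs 1:1 nearest-neighbor propensity-score matching for two groups on simulated data and compares covariate balance before and after matching; the data, models, and the smd helper are hypothetical.

```python
# Generic 1:1 nearest-neighbor propensity-score matching on simulated data.
# This is NOT the MAGMA algorithm; it only illustrates the basic idea of
# pairing units from two groups that have similar covariate values.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))                              # two covariates
group = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * X[:, 0] - 0.5 * X[:, 1]))))

# 1) Estimate propensity scores P(group = 1 | covariates).
ps = LogisticRegression().fit(X, group).predict_proba(X)[:, 1]

# 2) For each unit in group 1, take the not-yet-used unit from group 0
#    with the closest propensity score (matching without replacement).
treated = np.where(group == 1)[0]
controls = list(np.where(group == 0)[0])
pairs = []
for t in treated:
    if not controls:
        break
    j = int(np.argmin(np.abs(ps[controls] - ps[t])))
    pairs.append((t, controls.pop(j)))

# 3) Evaluate balance via standardized mean differences before/after matching.
def smd(x, g):
    pooled_sd = np.sqrt((x[g == 1].var() + x[g == 0].var()) / 2)
    return (x[g == 1].mean() - x[g == 0].mean()) / pooled_sd

matched = np.array([i for pair in pairs for i in pair])
for k in range(X.shape[1]):
    print(f"covariate {k}: SMD before = {smd(X[:, k], group):+.3f}, "
          f"after = {smd(X[matched, k], group[matched]):+.3f}")
```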



Doubly Protective or Doubly Fragile? A Comparison of Doubly Robust Approaches for Estimating Average Treatment Effects

Jingyu Zhang1, Oliver Lüdtke1,2, Alexander Robitzsch1,2

1IPN - Leibniz Institute for Science and Mathematics Education, Kiel, Germany; 2Centre for International Student Assessment, Kiel, Germany

Estimating the average treatment effect (ATE) from nonexperimental data is challenging due to potential confounding covariates. Accurate estimation requires that all confounders are measured and that the statistical model for covariate adjustment is correctly specified. Two standard approaches—outcome regression and propensity score weighting—can fail when their respective models are misspecified. Doubly robust methods address this issue by combining both approaches, ensuring unbiased ATE estimation if at least one model is correctly specified.
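
For reference, the standard textbook estimators involved can be written as follows (notation assumed here: treatment T, outcome Y, covariates X, outcome regressions m_t(X) = E[Y | T = t, X], propensity score e(X) = P(T = 1 | X)).

```latex
% ATE tau = E[Y(1) - Y(0)] under unconfoundedness, estimated by
% outcome regression (OR), inverse probability weighting (IPW),
% and their doubly robust combination (AIPW).
\begin{align}
\hat{\tau}_{\mathrm{OR}}   &= \frac{1}{n}\sum_{i=1}^{n}
  \bigl[\hat{m}_1(X_i) - \hat{m}_0(X_i)\bigr],\\
\hat{\tau}_{\mathrm{IPW}}  &= \frac{1}{n}\sum_{i=1}^{n}
  \left[\frac{T_i\,Y_i}{\hat{e}(X_i)}
        - \frac{(1 - T_i)\,Y_i}{1 - \hat{e}(X_i)}\right],\\
\hat{\tau}_{\mathrm{AIPW}} &= \frac{1}{n}\sum_{i=1}^{n}
  \left[\hat{m}_1(X_i) - \hat{m}_0(X_i)
        + \frac{T_i\,\bigl(Y_i - \hat{m}_1(X_i)\bigr)}{\hat{e}(X_i)}
        - \frac{(1 - T_i)\,\bigl(Y_i - \hat{m}_0(X_i)\bigr)}{1 - \hat{e}(X_i)}\right].
\end{align}
```

The augmented estimator in the last line remains consistent if either the outcome regressions or the propensity score model is correctly specified, which is the doubly robust property described above.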

This study reviews four doubly robust methods that are well-established in the causal inference literature but underutilized in psychological research: augmented inverse probability weighting, outcome regression weighted by the inverse propensity score, regression incorporating the inverse propensity score as a clever covariate, and calibrated propensity score weights. Through two simulation studies, we compare these methods with traditional regression and inverse probability weighting estimators. Results suggest that doubly robust methods—particularly regression weighted by the inverse propensity score—offer superior protection against bias from model misspecification. We also discuss practical considerations, such as weight normalization, propensity score truncation, and potential efficiency losses due to overfitting. Finally, we illustrate these methods using an empirical data example.
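
As a minimal sketch of one of these methods, the following Python code computes the AIPW estimator on simulated data with plain linear and logistic nuisance models; the data-generating process, the truncation threshold for the propensity scores, and the model choices are illustrative assumptions, not the simulation design of this study.

```python
# Minimal sketch of the augmented inverse probability weighting (AIPW)
# estimator of the ATE, one of the four doubly robust methods named above.
# The simulated data and the nuisance-model choices are illustrative only.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 3))                              # confounders
e_true = 1 / (1 + np.exp(-X[:, 0]))                      # true propensity score
T = rng.binomial(1, e_true)                              # treatment indicator
Y = 0.5 * T + X @ np.array([0.3, -0.2, 0.1]) + rng.normal(size=n)  # true ATE = 0.5

# Nuisance models: outcome regressions m_1, m_0 and propensity score e.
m1 = LinearRegression().fit(X[T == 1], Y[T == 1]).predict(X)
m0 = LinearRegression().fit(X[T == 0], Y[T == 0]).predict(X)
e = LogisticRegression().fit(X, T).predict_proba(X)[:, 1]
e = np.clip(e, 0.01, 0.99)                               # propensity truncation

# AIPW: outcome-model prediction plus inverse-probability-weighted residual.
aipw = m1 - m0 + T * (Y - m1) / e - (1 - T) * (Y - m0) / (1 - e)
print(f"AIPW estimate of the ATE: {aipw.mean():.3f} (true effect: 0.5)")
```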



Targeted Maximum Likelihood Estimation for Causal Inference with Observational Data - The Example of Private Tutoring

Christoph Jindra, Karoline A. Sachse

IQB, Humboldt Universität zu Berlin, Germany

State-of-the-art causal inference methods for observational data promise to relax assumptions that threaten valid causal inference. Targeted maximum likelihood estimation (TMLE), for example, is a template for constructing doubly robust, semiparametric, efficient substitution estimators, providing consistent estimates if either the outcome or the treatment model is correctly specified. Compared to standard approaches, it reduces the risk of misspecification bias by allowing (nonparametric) machine-learning techniques, including super learning, to estimate the relevant components of the data distribution. We provide a brief introduction to TMLE and demonstrate its use by estimating the effects of private tutoring in mathematics during Year 7 on mathematics proficiency and grades, using observational data from starting cohort 3 of the National Educational Panel Study (N = 4,167). We contrast TMLE estimates with those from ordinary least squares, the parametric G-formula, and the augmented inverse probability weighting estimator. Our findings reveal close agreement between the methods for end-of-year grades. However, differences emerge when examining mathematics proficiency as the outcome, highlighting that substantive conclusions may depend on the analytical approach. The results underscore the importance of employing advanced causal inference methods, such as TMLE, when navigating the complexities of observational data, and highlight the nuanced impact of methodological choices on the interpretation of study outcomes.
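
The following Python sketch illustrates the core TMLE steps (initial outcome model, treatment model, clever covariate, one-dimensional fluctuation, plug-in) for a binary outcome on simulated data. It uses plain logistic regressions in place of super learning and is not the authors' analysis of the NEPS data; all variable names, data, and model choices are assumptions for illustration.

```python
# Deliberately simplified TMLE sketch for the ATE with a binary outcome.
# Plain logistic regressions stand in for super learning; the data and
# all model choices are assumptions, not the analysis reported above.
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 2000
W = rng.normal(size=(n, 2))                              # covariates
A = rng.binomial(1, 1 / (1 + np.exp(-W[:, 0])))          # treatment (e.g., tutoring)
p_y = 1 / (1 + np.exp(-(0.4 * A + 0.5 * W[:, 0] - 0.3 * W[:, 1])))
Y = rng.binomial(1, p_y)                                 # binary outcome

def logit(p):
    return np.log(p / (1 - p))

def expit(x):
    return 1 / (1 + np.exp(-x))

# Step 1: initial outcome model Q(A, W); super learning would go here.
XA = np.column_stack([A, W])
q_fit = LogisticRegression().fit(XA, Y)
Q_A = np.clip(q_fit.predict_proba(XA)[:, 1], 1e-3, 1 - 1e-3)
Q_1 = np.clip(q_fit.predict_proba(np.column_stack([np.ones(n), W]))[:, 1], 1e-3, 1 - 1e-3)
Q_0 = np.clip(q_fit.predict_proba(np.column_stack([np.zeros(n), W]))[:, 1], 1e-3, 1 - 1e-3)

# Step 2: treatment (propensity score) model g(W) = P(A = 1 | W).
g = np.clip(LogisticRegression().fit(W, A).predict_proba(W)[:, 1], 0.01, 0.99)

# Step 3: clever covariate and one-dimensional logistic fluctuation of Q.
H_A = A / g - (1 - A) / (1 - g)
eps = sm.GLM(Y, H_A[:, None], family=sm.families.Binomial(),
             offset=logit(Q_A)).fit().params[0]

# Step 4: update the counterfactual predictions and plug in.
Q_1_star = expit(logit(Q_1) + eps / g)
Q_0_star = expit(logit(Q_0) - eps / (1 - g))
print(f"TMLE estimate of the ATE: {(Q_1_star - Q_0_star).mean():.3f}")
```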



A powerful arsenal of theory-building tools, or just hopelessly lost in 'modeling'? A look at structural equation models from a perspective of scientific logic.

Andreas Klein

Goethe University Frankfurt, Germany

Explanation, representation, and falsifiability are critical functions of a scientific model. Against the background of philosophy of science writings by Hempel, Carnap, and Suppes, we discuss the features of structural equation models and related statistical models. Which claims to scientific explanation can be made by applying such a model, and which claims are clearly unsubstantiated and hardly more than wishful thinking? Which aspects of a model's structure are verifiable and which are not, and what are the consequences for scientific explanation? To what degree are statements about the data-generating structure of the model verifiable or falsifiable? And, finally, from the standpoint of scientific explanation, to what degree does good statistical fit permit statements about the confirmation of a theory? The paper addresses several of these questions and closes with recommendations for the proper use and scientific interpretation of these models.



Conference: FGME Berlin 2025
Conference Software: ConfTool Pro 2.8.106+CC
© 2001–2025 by Dr. H. Weinreich, Hamburg, Germany