Conference Agenda

Overview and details of the sessions of this conference. Please select a date or location to show only the sessions held on that day or at that location. Please select a single session for a detailed view (with abstracts and downloads, if available).

 
 
Session Overview
Session
Symposium: New Approaches to Analysing Parameter Heterogeneity in Models of Response Processes and Temporal Dynamics
Time:
Tuesday, 30/Sept/2025:
9:00am - 10:30am

Session Chair: Thorsten Meiser
Location: Hörsaal 1a


Presentations

New Approaches to Analysing Parameter Heterogeneity in Models of Response Processes and Temporal Dynamics

Chair(s): Thorsten Meiser (University of Mannheim, Germany), Esther Ulitzsch (University of Oslo)

Over the last decades, a plethora of statistical models has been developed for the study of latent response processes in psychometrics and for describing temporal dynamics in longitudinal research. Whereas the original modelling frameworks usually maintained the assumption of homogeneity for the structural model parameters, with invariant parameters across individuals and stationary parameters over time, the focus of methodological and statistical research has recently shifted towards the analysis of parameter heterogeneity. This line of research includes methods to detect violations of parameter homogeneity, statistical tools to identify sources and correlates of parameter non-invariance, and model extensions that accommodate heterogeneous structural parameters. The analysis of parameter heterogeneity can thereby serve different objectives, such as avoiding biased parameter estimates by identifying heterogeneity as a nuisance factor, or investigating (sources of) heterogeneity in response processes and temporal dynamics as a research goal in its own right.

In this symposium, various new approaches to analysing parameter heterogeneity will be introduced and compared, including person-wise and item-wise discrete mixture models, constrained random effect models, and machine-learning procedures to detect moderation of parameters by observed covariates. The approaches are presented and discussed in the context of Item Response Theory, time-varying coefficient models, confirmatory factor analysis, and latent Markov factor models.

 

Presentations of the Symposium

 

A Systematic Comparison of Mixture Distribution Models and Score-Based Partitioning to Capture Response Process Heterogeneity with IRTree Models

Emre Alagöz1, Thorsten Meiser1, Rudolf Debelak2
1University of Mannheim, 2University of Zurich

IRTree models enable researchers to account for response style (RS) effects and debias trait estimates for valid inferences. However, traditional IRTrees assume that all respondents use the trait and RS dimensions in the same way, implying a single measurement model with identical trait and RS factor loadings across individuals. When this assumption is violated, RS effects are not properly corrected, and trait estimates and factor loadings remain biased. To address this issue, two extensions have been proposed: (1) the mixture IRTree (MixTree) model and (2) the score-based partitioning (SBPTree) algorithm. The MixTree identifies K latent subpopulations, each following a distinct measurement model with different trait and RS weights, capturing heterogeneous response strategies. Covariates can optionally be included to explain subpopulation membership through multinomial logistic regression. In contrast, the SBPTree directly uses covariates by sorting individual score contributions and splitting the sample if systematic changes along a covariate are detected. This process continues recursively, resulting in G subgroups, each with its own measurement model (i.e., response strategy). The MixTree’s strength lies in capturing unobserved heterogeneity without covariates, although covariates can be included through a logistic relationship. The SBPTree flexibly captures any form of covariate relationship but depends on the presence and quality of covariates. In a comprehensive simulation study, we compare the performance of these two approaches under varying conditions, such as the link, reliability, and effect size of the covariates, as well as sample size, test length, and class separation, to better understand the conditions under which each method is more appropriate.
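
The MixTree idea rests on generic finite-mixture machinery: persons belong to one of K latent classes, each with its own measurement model, and class membership is estimated from the data. The sketch below is not the MixTree model itself, but a minimal EM algorithm for a two-class Bernoulli mixture over binary pseudo-items (the node-level items an IRTree decomposition produces); all data and parameter values are simulated for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate binary pseudo-items for two latent classes with different
# response probabilities (a crude stand-in for different response strategies).
N, J = 500, 10
true_pi = 0.4                                   # class-1 mixing proportion
p1 = rng.uniform(0.2, 0.5, J)                   # class 1: lower endorsement rates
p2 = rng.uniform(0.5, 0.9, J)                   # class 2: higher endorsement rates
z = rng.random(N) < true_pi
Y = np.where(z[:, None], rng.random((N, J)) < p1,
             rng.random((N, J)) < p2).astype(float)

# EM for a two-class Bernoulli mixture: class-specific item parameters play
# the role of class-specific measurement models in a MixTree-type analysis.
pi_hat = 0.5
q1, q2 = np.full(J, 0.3), np.full(J, 0.7)
for _ in range(200):
    # E-step: posterior probability that each person belongs to class 1
    ll1 = Y @ np.log(q1) + (1 - Y) @ np.log(1 - q1)
    ll2 = Y @ np.log(q2) + (1 - Y) @ np.log(1 - q2)
    w = 1 / (1 + (1 - pi_hat) / pi_hat * np.exp(ll2 - ll1))
    # M-step: update mixing proportion and class-specific item probabilities
    pi_hat = w.mean()
    q1 = (w[:, None] * Y).sum(0) / w.sum()
    q2 = ((1 - w)[:, None] * Y).sum(0) / (1 - w).sum()

print(f"estimated mixing proportion: {pi_hat:.2f} (true value {true_pi})")
```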

 

Faking, Fast and Slow: A Response-Time-Based Latent Response Mixture Model to Account for Faking in High-Stakes Personality Assessments

Timo Seitz1, Esther Ulitzsch2
1University of Mannheim, 2University of Oslo

High-stakes personality assessments are often compromised by faking, where test-takers distort their responses in the direction of social desirability. Many previous models have accounted for faking by modeling an additional latent dimension that quantifies each test-taker’s degree of faking. Such models assume a homogeneous response strategy among all test-takers, with substantive traits and faking jointly influencing responses to all items. However, such models do not allow for changes in test-takers’ faking behavior over the course of the questionnaire. To address this limitation, we propose a response-time-based latent response mixture model. This model assumes that the item responses of each test-taker are a mixture of different latent classes, each tied to a different measurement model of substantive traits and faking. Taking into account item-level response times, the model enables the investigation of differences in the time test-takers require to respond honestly, edit a trait-based response in the direction of desirability, or fake a response entirely. Along with a simulation study examining parameter recovery, we illustrate the model using empirical data (N = 1824) from a real-life personnel selection setting.
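
A minimal sketch of the core likelihood idea, under simplifying assumptions: each (response, response time) pair is a mixture of an "honest" and a "faked" measurement model, with lognormal response times whose mean differs by class. The actual model distinguishes more strategies (honest, edited, and entirely faked responses); the function name and all parameter values below are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def item_mixture_loglik(y, log_rt, pi_fake, p_honest, p_fake, mu_h, mu_f, sigma):
    """Log-likelihood of one (response, log response time) pair under a
    two-class mixture of 'honest' vs. 'faked' responding.

    y          : observed response category (int)
    log_rt     : log of the observed response time
    pi_fake    : probability that the response was faked
    p_honest   : category probabilities under the honest measurement model
    p_fake     : category probabilities under the faking measurement model
    mu_h, mu_f : mean log response times for honest / faked responses
    sigma      : residual SD of log response times
    """
    lik_h = p_honest[y] * norm.pdf(log_rt, mu_h, sigma)
    lik_f = p_fake[y] * norm.pdf(log_rt, mu_f, sigma)
    return np.log((1 - pi_fake) * lik_h + pi_fake * lik_f)

# Hypothetical 5-point item: faking shifts mass to the top category and
# (in this example) comes with shorter response times.
p_honest = np.array([0.10, 0.20, 0.30, 0.25, 0.15])
p_fake = np.array([0.02, 0.05, 0.10, 0.28, 0.55])
print(item_mixture_loglik(y=4, log_rt=np.log(2.0), pi_fake=0.3,
                          p_honest=p_honest, p_fake=p_fake,
                          mu_h=np.log(3.5), mu_f=np.log(1.8), sigma=0.4))
```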

 

Variant Approaches to Assessing Measurement Invariance: A Comparison of MNLFA and SEM Trees for Detecting Differential Item Functioning

Leonie Hagitte, Andreas Brandmaier
Medical School Berlin

Ensuring the validity of psychological assessments is crucial, yet differential item functioning (DIF) can threaten measurement invariance (MI) when test items function differently across groups. Recent calls for improved DIF detection methods emphasize the need for more advanced statistical approaches.

Moderated nonlinear factor analysis (MNLFA) is a recent approach for assessing MI via parameter moderation within a single-group confirmatory factor analysis framework. MNLFA evaluates MI across multiple continuous and categorical covariates, and accounts for heteroskedasticity by modeling factor and residual variances as functions of these covariates. While MNLFA offers continuous moderation of factor loadings, it requires a priori specification of covariates and their functional relationships.
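
To make the moderation idea concrete, the sketch below shows, assuming linear moderation and a log link for variances, how an indicator's intercept, loading, and residual variance can each be written as functions of a person covariate, and how the model-implied mean and variance of the indicator follow. All parameter values are hypothetical; this illustrates the general MNLFA parameterization, not the authors' specification.

```python
import numpy as np

def moderated_params(x, nu0, nu1, lam0, lam1, logres0, logres1):
    """Item parameters as functions of a person covariate x, in the spirit
    of MNLFA: intercept, loading, and log residual variance each receive a
    linear moderation term (the log link keeps the variance positive)."""
    intercept = nu0 + nu1 * x
    loading = lam0 + lam1 * x
    resid_var = np.exp(logres0 + logres1 * x)
    return intercept, loading, resid_var

# Model-implied mean and variance of one indicator for a person with
# covariate value x, given a moderated factor mean and factor variance.
x = 1.0
nu, lam, theta = moderated_params(x, nu0=0.0, nu1=0.1, lam0=0.8, lam1=-0.2,
                                  logres0=np.log(0.5), logres1=0.15)
kappa = 0.3 * x                     # moderated factor mean
phi = np.exp(0.2 * x)               # moderated factor variance
print("implied mean:", nu + lam * kappa, "implied variance:", lam**2 * phi + theta)
```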

In contrast, structural equation modeling (SEM) trees and forests are data-driven, non-parametric methods that use recursive partitioning to identify latent subgroups in which model parameters differ, without assuming specific functional forms or predefined covariate effects. These approaches allow for nonlinear moderation of factor loadings and can reveal complex interaction effects, enabling the exploratory detection of DIF.
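
The partitioning step can be illustrated with a deliberately trivial "SEM" (a one-indicator normal model with a freely estimated mean and variance per node): search all cut points on a covariate and keep the split that most improves the summed log-likelihood of the child nodes. This is a sketch of the recursive-partitioning principle only; real SEM trees fit the full structural equation model in every candidate node and use proper significance or score-based split criteria.

```python
import numpy as np

def gauss_loglik(y):
    """Log-likelihood of y under its own ML-fitted normal model."""
    return -0.5 * len(y) * (np.log(2 * np.pi * y.var()) + 1)

def best_split(y, x, min_node=30):
    """Exhaustive search for the covariate cut point that most improves fit,
    mimicking one recursive-partitioning step of an SEM tree."""
    order = np.argsort(x)
    y_sorted, x_sorted = y[order], x[order]
    parent = gauss_loglik(y_sorted)
    best = (None, 0.0)
    for cut in range(min_node, len(y) - min_node):
        gain = gauss_loglik(y_sorted[:cut]) + gauss_loglik(y_sorted[cut:]) - parent
        if gain > best[1]:
            best = (x_sorted[cut], gain)
    return best  # (split point, log-likelihood improvement)

rng = np.random.default_rng(2)
x = rng.uniform(18, 70, 400)                            # e.g., age as covariate
y = np.where(x > 45, 1.0, 0.0) + rng.normal(0, 1, 400)  # mean shift at age 45
print(best_split(y, x))
```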

In this study, we conducted a Monte Carlo simulation to compare the performance of MNLFA and SEM trees and forests in detecting DIF and assessing MI under varying conditions. Specifically, we evaluated their effectiveness in identifying non-invariance, detecting relevant covariates, and adjusting for DIF. Our findings will inform best practices for selecting statistical techniques to ensure MI in psychological assessment.

 

Exploring Heterogeneity in Temporal Dynamics with Different Extensions of Time-Varying Coefficient Models

Esther Ulitzsch1, Oliver Lüdtke2, Steffen Nestler3, Therese Ruud Snuggerud1, Sverre Urnes Johnson1
1University of Oslo, 2IPN - Leibniz Institute for Science and Mathematics Education, 3University of Münster

Time-varying coefficient models (TVCMs) offer flexible tools for exploring dynamic changes in parameters over time by approximating them with continuous, non-parametric coefficient functions. In psychological research, TVCM applications typically either assume homogeneity in temporal dynamics (imposing invariant coefficient functions across individuals) or estimate person-specific functions independently, limiting generalizability and increasing susceptibility to overfitting. This study evaluates extensions of TVCMs that aim to balance pooled and individual-level estimation, positioning them along a continuum between fully aggregated and fully person-specific approaches. Specifically, we expand coefficient functions with (a) random intercepts, assuming that individuals differ in the initial strength of the parameters of interest but that their coefficient functions are parallel, (b) random effects for each of their constituent components, smoothing coefficient functions via partial pooling across individuals, or (c) equal shrinkage constraints across persons, assuming that fixed-effect coefficient functions exhibit the same degree of smoothness (or "wiggliness") across individuals. We study and discuss advantages and limitations of these model extensions using intensive longitudinal data collected from 16 patients with anxiety disorders over six weeks: two weeks each before, during, and after an attention training intervention. Our analysis examines interindividual differences in changes in both the level and variability of feeling nervous and being distracted by one's emotions, as well as in the dynamics in emotion-cognition coupling over time. To this end, we employ a simple bivariate normal distribution, introducing time-varying coefficient functions for its parameters, i.e., means, variances, and the correlation coefficient. Implications for exploring between-person heterogeneity in temporal dynamics are discussed.
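
Extension (a), random intercepts with parallel coefficient functions, can be sketched with penalized least squares: a spline basis carries the shared time-varying function, person dummies carry the individual shifts, and a ridge penalty on the spline weights controls smoothness. The code below illustrates this for a time-varying mean (the simplest TVCM, with the predictor fixed at one); the basis, penalty, and all values are illustrative choices, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy intensive longitudinal data: P persons, T occasions each.
P, T = 16, 60
t = np.tile(np.linspace(0, 1, T), P)
person = np.repeat(np.arange(P), T)
alpha = rng.normal(0, 0.5, P)                  # person-specific intercept shifts
y = alpha[person] + np.sin(2 * np.pi * t) + rng.normal(0, 0.3, P * T)

# Truncated-power spline basis for the shared time-varying mean function
knots = np.linspace(0.1, 0.9, 8)
B = np.column_stack([np.ones_like(t), t] +
                    [np.clip(t - k, 0, None) for k in knots])

# Person dummies implement extension (a): parallel coefficient functions,
# shifted per person (one dummy dropped for identification).
D = (person[:, None] == np.arange(P)[None, :]).astype(float)
X = np.column_stack([B, D[:, 1:]])

# Ridge-penalised least squares: the penalty on the spline weights controls
# the smoothness ("wiggliness") of the estimated coefficient function.
pen = np.zeros(X.shape[1])
pen[2:B.shape[1]] = 1.0                        # penalise only the spline terms
coef = np.linalg.solve(X.T @ X + np.diag(pen), X.T @ y)
print("estimated person shifts (first 3):", np.round(coef[B.shape[1]:][:3], 2))
```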

 

Latent Markov Factor Analysis for Detecting Dynamics in Attentive and Careless Responding in Intensive Longitudinal Data

Leonie V.D.E. Vogelsmeier1, Joran Jongerling1, Esther Ulitzsch2
1Tilburg University, 2University of Oslo

Intensive longitudinal data (ILD) collection methods like experience sampling methodology can place significant burdens on participants, potentially leading to careless responding, such as responding randomly or with a strong preference for certain scale positions. Such behavior can undermine the validity of inferences drawn from the data if not properly identified and addressed. In this presentation, I will introduce latent Markov factor analysis (LMFA) for modeling and understanding dynamics in attentive and careless responding. LMFA is a mixture modeling approach proposed to capture any parameter heterogeneity in the measurement model across individuals and time, thereby identifying measurement non-invariance in ILD. Specifically, LMFA combines a discrete- or continuous-time latent Markov model (that clusters observations into separate states, according to state-specific measurement models) with mixture factor analysis (that explores which measurement model applies for each state). Momentary correlates can be incorporated into the Markov model to identify potential reasons for changes in the measurement model. Until recently, LMFA has been assessed only for detecting changes in the attentive model. The method has the potential to distinguish between attentive and careless responding because careless responding should alter the measurement model (e.g., random responding should weaken item relationships and hence loadings), but its performance may be challenged because careless responding tends to entail violations of the model’s multivariate normality assumption (e.g., uniform distribution for random responding). I will present our recent simulation study results suggesting that LMFA nevertheless effectively distinguishes between different types of careless and attentive responding. I will conclude with extensions and limitations.
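
The state-classification machinery can be sketched as a hidden Markov model whose state-specific observation densities are the covariance structures implied by state-specific factor models: strong loadings in an attentive state, near-zero loadings in a careless (random-responding) state. The forward algorithm below computes a person's sequence log-likelihood under such a two-state model; all loadings, transition probabilities, and data are simulated for illustration, and the sketch omits the estimation and model-selection steps of a full LMFA.

```python
import numpy as np
from scipy.stats import multivariate_normal

# State-specific measurement models: attentive (substantial loadings) vs.
# careless (near-zero loadings, i.e., weakened item relationships).
J = 6
lam_att = np.full(J, 0.8)
lam_car = np.full(J, 0.05)
resid = np.eye(J) * 0.4
sigma = {0: np.outer(lam_att, lam_att) + resid,   # attentive implied covariance
         1: np.outer(lam_car, lam_car) + resid}   # careless implied covariance
mu = np.zeros(J)

# Discrete-time latent Markov model over the two states
P_trans = np.array([[0.9, 0.1],    # attentive -> attentive / careless
                    [0.3, 0.7]])   # careless  -> attentive / careless
init = np.array([0.8, 0.2])

def forward_loglik(Y):
    """Scaled forward algorithm: log-likelihood of a person's sequence of
    item response vectors under the two-state latent Markov factor model."""
    dens = np.column_stack([multivariate_normal.pdf(Y, mu, sigma[s])
                            for s in (0, 1)])
    alpha = init * dens[0]
    ll = 0.0
    for d in dens[1:]:
        c = alpha.sum()
        ll += np.log(c)
        alpha = (alpha / c @ P_trans) * d     # scale to avoid underflow
    return ll + np.log(alpha.sum())

rng = np.random.default_rng(4)
Y = rng.multivariate_normal(mu, sigma[0], size=20)   # attentive-only sequence
print(forward_loglik(Y))
```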

 

Score-Based Tests with Fixed Effects Person Parameters in Item Response Theory

Rudolf Debelak, Charles Driver
University of Zurich

Score-based tests are a method for checking the invariance of model parameters across populations defined by continuous, ordinal, or categorical covariates. We present an approximation to score-based tests for item response theory models that treat the person parameters as fixed effects, which is important for large numbers of persons and items, where typical approaches can be too slow. Our approximation allows computationally inexpensive tests for differential item functioning and other model misspecifications in sufficiently long tests, because score contributions are leveraged and no comparison models are required. We outline the theoretical framework and adapt it to two recently proposed consistent methods of parameter estimation: constrained joint maximum likelihood estimation and a joint maximum a posteriori approach. We use simulations to evaluate the new method for detecting violations of measurement invariance in the two-parameter logistic test model. The new approximation is sensitive to violations of measurement invariance while maintaining a sufficiently low Type I error rate in large samples and long tests. We demonstrate the new method on an empirical data set examining reading performance.
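
A rough illustration of the underlying logic, not the proposed approximation itself: with person parameters treated as fixed (and, for simplicity, known here), each person's contribution to the score of an item parameter has a closed form in the two-parameter logistic model; ordering these contributions by a covariate and inspecting their cumulative sum yields a Brownian-bridge-type process whose maximum signals DIF. All data below are simulated and the pooled estimate is obtained numerically.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(5)

# 2PL item with person parameters treated as fixed and, for this sketch, known.
N = 2000
theta = rng.normal(0, 1, N)
age = rng.uniform(18, 70, N)
a = 1.2
b_true = np.where(age > 45, 0.6, 0.0)          # DIF: item harder for older persons
y = (rng.random(N) < 1 / (1 + np.exp(-a * (theta - b_true)))).astype(float)

# Pooled (DIF-ignoring) ML estimate of difficulty b: root of the score sum
b_hat = brentq(lambda b: np.sum(y - 1 / (1 + np.exp(-a * (theta - b)))), -3, 3)

# Per-person score contributions for b: d log L_i / d b = -a * (y_i - p_i)
p_hat = 1 / (1 + np.exp(-a * (theta - b_hat)))
scores = -a * (y - p_hat)

# Order by the covariate and inspect the cumulative score process; under
# invariance it behaves like a Brownian bridge (the 5% critical value of the
# supremum statistic is roughly 1.36).
csum = np.cumsum(scores[np.argsort(age)]) / (scores.std() * np.sqrt(N))
print(f"max |cumulative score process|: {np.abs(csum).max():.2f}")
```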