Penalization Approaches for CML-Estimation - Performance and Applications within the PCM
Can Gürer, Clemens Draxler
UMIT Tirol, Austria
In this talk we present a covariate-based approach in which a linear combination of covariates represents the category parameters of the Partial Credit Model (PCM). The approach can be generalized to all Rasch models or, even more broadly, to multiparameter exponential families. The model parameters refer to items and covariates. With a suitable definition of the covariates, the model allows for the detection of uniform DIF, non-uniform DIF, local dependence between items, and other applications. Such an approach gives rise to a large number of parameters and hence to a large number of selection decisions. Conditional maximum likelihood (CML) estimation and penalization procedures are applied to mitigate the problem of cumulative errors. L1 penalization selects from all model parameters simultaneously in one estimation procedure, leading to a single decision for the entire parameter vector. The feasibility and performance of the approach are presented, and challenges for further applications are discussed.
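As a minimal sketch of how an L1 penalty selects parameters in a single estimation run, the following Python example applies proximal gradient steps with soft-thresholding. It is not the authors' CML procedure: a least-squares loss stands in for the negative conditional log-likelihood, and the names `soft_threshold`, `ista`, `lam`, and `step` as well as the toy data are assumptions made for illustration only.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink toward zero, set small values to zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(X, y, lam, step, n_iter=200):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by proximal gradient steps.

    The least-squares term is only a stand-in for a smooth negative
    (conditional) log-likelihood. `step` should not exceed 1 / L, where L is
    the largest eigenvalue of X'X.
    """
    n_obs, n_par = len(X), len(X[0])
    b = [0.0] * n_par
    for _ in range(n_iter):
        # Residuals X b - y and gradient X'(X b - y) of the smooth part.
        resid = [sum(X[i][j] * b[j] for j in range(n_par)) - y[i] for i in range(n_obs)]
        grad = [sum(X[i][j] * resid[i] for i in range(n_obs)) for j in range(n_par)]
        # Gradient step followed by soft-thresholding (the L1 proximal step).
        b = [soft_threshold(b[j] - step * grad[j], step * lam) for j in range(n_par)]
    return b


# Hypothetical toy data: with an identity design the solution is soft_threshold(y, lam).
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, -0.5, 1.5]
estimate = ista(X, y, lam=1.0, step=1.0)
print(estimate)  # prints [2.0, 0.0, 0.5]
```

The second parameter is set exactly to zero while the others are shrunk, which illustrates how one penalized fit yields a selection decision for the whole parameter vector.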
Optimal Designs for Thurstonian IRT Models Based on Linear Paired Comparisons
Heinz Holling
University of Münster, Germany
For many applications, such as personnel selection, Thurstonian IRT models based on paired comparisons are useful only if the paired comparisons consist of equally keyed items. However, previous studies have shown that an extremely large number of such paired comparisons must then be presented to obtain reasonably precise estimates of the trait scores. Using optimal design theory, we develop Thurstonian IRT models based on linear paired comparisons that require a minimum number of paired comparisons consisting only of equally keyed items. Optimal designs of item pairs are characterized by combinations of factor loading values that optimize predetermined criteria, such as the correlation between the estimated and true trait scores or the volume of the confidence ellipsoid of the trait scores. The present purpose requires the development of novel types of optimal designs. In particular, two requirements must be given special consideration: (a) the restriction of the design region, and (b) the constraint that the alternatives have to load on mutually distinct factors. In this talk, we present solutions to the optimal design problem that substantially outperform current methods in the literature in terms of precision and the number of paired comparisons. Based on our results, trait scores can be easily estimated even if only factor loadings are known.
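The role of the factor loadings in such a design can be illustrated with a small Python sketch of the D-criterion, the determinant of the information matrix, whose maximization corresponds to minimizing the volume of the confidence ellipsoid. The sketch rests on assumptions not stated in the abstract: two traits, a linear response model in which a pair with loadings `l1` and `l2` contributes the design vector `(l1, -l2)`, and unit error variance. The function names and loading values are hypothetical and do not reproduce the authors' designs.

```python
def information_matrix(pairs):
    """Sum of x x' over all pairs, with x = (l1, -l2) for a two-trait design."""
    m11 = sum(l1 * l1 for l1, _ in pairs)
    m22 = sum(l2 * l2 for _, l2 in pairs)
    m12 = sum(-l1 * l2 for l1, l2 in pairs)
    return [[m11, m12], [m12, m22]]


def d_criterion(pairs):
    """Determinant of the 2 x 2 information matrix (larger is better)."""
    m = information_matrix(pairs)
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Hypothetical designs with equally keyed (positively loading) items only.
equal_loadings = [(0.8, 0.8), (0.8, 0.8)]   # identical loadings in every pair
mixed_loadings = [(0.9, 0.3), (0.3, 0.9)]   # loadings vary across pairs

d_equal = d_criterion(equal_loadings)
d_mixed = d_criterion(mixed_loadings)
print(d_equal, d_mixed)
```

Under these assumptions the design with identical loadings has a singular information matrix (determinant zero), so the trait scores are not separately estimable, whereas varying the loadings across pairs gives a positive determinant.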
Combining the Answer-Until-Correct Response Format with Bayesian Adaptive Testing
Cosima Schenk, Tuo Liu, Aron Fink, Andreas Frey
Goethe-Universität Frankfurt, Germany
The answer-until-correct (AUC) response format allows examinees to make multiple attempts on an item until they answer it correctly. It thus provides a useful approach to scoring multiple-choice items polytomously, resulting in more precise measurement and an enhanced test experience due to the instantaneous feedback on given responses. The question of how to model this response format accurately has been addressed in previous literature, also in combination with computerized adaptive testing (AUC-CAT).
However, to date, methods for estimating person parameters and selecting items during AUC-CAT have relied solely on point estimates, without accounting for inherent uncertainty in the parameter estimates. Using a fully Bayesian approach may further enhance the estimation, since it allows uncertainty in item and person parameter estimates to be taken into account. In the case of AUC-CAT, this could lead to an even more accurate measurement.
The aim of our presentation is to use a Monte Carlo simulation to compare the precision of person parameter estimates obtained with previously available AUC-CAT approaches with those derived from a fully Bayesian AUC-CAT. The SIRT-MM model will be used as the measurement model; it accounts for guessing in the multiple-choice format through a fixed guessing parameter. We consider the RMSE and the bias of the final person parameter estimates as performance criteria.
The analyses are still ongoing, but will be completed by the time of the conference. We expect the Bayesian AUC-CAT to enhance the measurement accuracy of the resulting person parameter estimates.
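The two performance criteria named above, bias and RMSE of the final person parameter estimates, can be sketched in Python as follows. The function names and the toy values in `theta_true` and `theta_hat` are assumptions for illustration and are not results from the study.

```python
import math


def bias(estimates, true_values):
    """Mean signed error of the estimates."""
    return sum(e - t for e, t in zip(estimates, true_values)) / len(estimates)


def rmse(estimates, true_values):
    """Root mean squared error of the estimates."""
    squared = sum((e - t) ** 2 for e, t in zip(estimates, true_values))
    return math.sqrt(squared / len(estimates))


# Hypothetical true and estimated person parameters for four simulees.
theta_true = [-1.0, 0.0, 1.0, 2.0]
theta_hat = [-0.5, 0.5, 0.5, 1.5]

print(bias(theta_hat, theta_true))  # prints 0.0
print(rmse(theta_hat, theta_true))  # prints 0.5
```

In this toy case the errors cancel, so the bias is zero although the RMSE is not, which is why the two criteria are usually reported together.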
A Bayesian Non-compensatory Multidimensional Response Time Model
Christoph Koenig¹, Jochen Ranger²
¹Goethe University Frankfurt am Main; ²Martin Luther University Halle-Wittenberg
This contribution takes up the general idea of a non-compensatory multidimensional response time (RT) model developed by Ranger et al. (2024) and transfers it to a Bayesian context. The Bayesian non-compensatory multidimensional RT model is based on the Fenton-Wilkinson approximation for the sum of lognormal random variables and uses a different strategy for the identification of the person parameters. Both core aspects avoid a number of sampling issues associated with modeling the sum of lognormal random variables or with multidimensional models. The identification strategy involves explicitly standardizing the person parameters, instead of the common strategy of setting uni- or multidimensional prior distributions with means and variances fixed to zero and one, respectively. In a comprehensive simulation study, we investigate the feasibility of the Bayesian RT model by assessing parameter recovery in terms of bias and RMSE across relatively small sample sizes and test lengths with different proportions of multidimensional items. Moreover, we compare different implementations of the identification strategy, namely within-sampling and post-hoc standardization of the person parameters, to a model specification with the common identification strategy. Results indicate good parameter recovery for the within-sampling and post-hoc standardization strategies in sample sizes as small as N = 600, across different proportions of multidimensional items. Furthermore, bias and RMSE are virtually indistinguishable across identification strategies. LOO model comparisons also indicate that the three identification strategies are equivalent. We discuss the utility of the Bayesian non-compensatory multidimensional RT model in particular, and the broader utility of the identification strategy for modeling relatively complex RT and IRT models in general.
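The Fenton-Wilkinson approximation mentioned above replaces the sum of lognormal random variables by a single lognormal variable with the same mean and variance. The following Python sketch shows the moment matching under the assumption of independent components; the function name `fenton_wilkinson` and the parameter values are hypothetical, and the sketch does not reproduce the authors' model specification.

```python
import math


def fenton_wilkinson(mus, sigmas):
    """Return (mu, sigma) of the lognormal that matches the mean and variance
    of a sum of independent lognormal variables with log-scale parameters
    mus[i] and sigmas[i]."""
    mean = sum(math.exp(m + s * s / 2.0) for m, s in zip(mus, sigmas))
    var = sum((math.exp(s * s) - 1.0) * math.exp(2.0 * m + s * s)
              for m, s in zip(mus, sigmas))
    sigma2 = math.log(1.0 + var / mean ** 2)   # matched log-scale variance
    mu = math.log(mean) - sigma2 / 2.0         # matched log-scale mean
    return mu, math.sqrt(sigma2)


# Hypothetical example: two component times contributing to one response time.
mu_sum, sigma_sum = fenton_wilkinson([0.5, 1.0], [0.4, 0.3])
print(mu_sum, sigma_sum)
```

For a single component the function returns the original parameters, which gives a quick sanity check of the moment matching.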