DS@GT at CheckThat! 2025: Evaluating Context and Tokenization Strategies for Numerical Fact Verification

27.10.2025

Authors

Maximilian Heil, Aleksandar Pramov

Abstract

Numerical claims — statements involving quantities, comparisons, and temporal references — pose unique
challenges for automated fact-checking systems. In this study, we evaluate modeling strategies for veracity
prediction of such claims using the QuanTemp dataset and build our own evidence retrieval pipeline. We
investigate three key factors: (1) the impact of additional evidence with longer input context windows using
ModernBERT, (2) the effect of right-to-left (R2L) tokenization, and (3) their combined influence on classification
performance. Contrary to prior findings in arithmetic reasoning tasks, R2L tokenization does not boost natural
language inference (NLI) on numerical claims. A longer context window does not enhance veracity performance
either, highlighting evidence quality as the dominant bottleneck. Our best-performing system achieves a competitive
macro-average F1 score of 0.57 and places among the Top-4 submissions in Task 3 of CheckThat! 2025. Our
code is available at https://github.com/dsgt-arc/checkthat-2025-numerical.
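To make the R2L tokenization factor concrete: the idea is to segment a number's digit string into fixed-size groups starting from the right (mirroring how humans align digits for arithmetic), rather than the left-to-right splits most subword tokenizers produce. The sketch below is illustrative only; the function name and the 3-digit chunk size are assumptions for exposition, not the exact implementation evaluated in the paper.

```python
def r2l_chunks(digits: str, size: int = 3) -> list[str]:
    """Split a digit string into groups of `size` digits, grouping from the
    right, so that e.g. "1234567" becomes ["1", "234", "567"]."""
    chunks = []
    # Walk the string from its right end, taking `size` digits per step.
    for end in range(len(digits), 0, -size):
        chunks.append(digits[max(0, end - size):end])
    # Chunks were collected right-to-left; restore reading order.
    return list(reversed(chunks))


if __name__ == "__main__":
    print(r2l_chunks("1234567"))  # ['1', '234', '567']
    print(r2l_chunks("42"))       # ['42']
```

Under left-to-right grouping the same string would split as `["123", "456", "7"]`, misaligning place values across numbers of different lengths; R2L grouping keeps ones, thousands, and millions groups consistent.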