
Learning to Factorize Spatio-Temporal Foundation Models

Siru Zhong, Junjie Qiu, Yangyu Wu, Xingchen Zou, Bin Yang, Chenjuan Guo, Hao Xu, Yuxuan Liang# (# corresponding author)

Neural Information Processing Systems (NeurIPS) 2025 Spotlight

Spatio-Temporal (ST) Foundation Models (STFMs) promise cross-dataset generalization, yet joint ST pretraining is computationally costly and struggles with domain-specific spatial correlations. To address this, we propose FactoST, a factorized STFM that decouples universal temporal pretraining from ST adaptation. The first stage trains a space-agnostic backbone via multi-task learning to capture multi-frequency, cross-domain temporal patterns at low cost. The second stage attaches a lightweight adapter that rapidly adapts the backbone to specific ST domains via metadata fusion, interaction pruning, domain alignment, and memory replay. Extensive forecasting experiments show that in few-shot settings, FactoST reduces MAE by up to 46.4% versus UniST, uses 46.2% fewer parameters, achieves 68% faster inference than OpenCity, and remains competitive with expert models. This factorized view offers a practical, scalable path toward truly universal STFMs.
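To make the factorization concrete, below is a minimal sketch of the two-stage design described above: a space-agnostic temporal backbone that is pretrained and then frozen, plus a lightweight spatial adapter that fuses node metadata and models cross-node interactions. This is an illustrative assumption, not the authors' implementation; all module names, dimensions, and the pooling/attention choices are hypothetical, and the adapter's interaction pruning, domain alignment, and memory replay components are omitted.

```python
import torch
import torch.nn as nn

class TemporalBackbone(nn.Module):
    """Stage 1 (hypothetical): space-agnostic temporal encoder that treats
    every node's series independently, so it can be pretrained across domains."""
    def __init__(self, d_model=64, n_layers=2, n_heads=4):
        super().__init__()
        self.input_proj = nn.Linear(1, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, x):                          # x: (batch * nodes, time, 1)
        return self.encoder(self.input_proj(x))    # (batch * nodes, time, d_model)

class SpatialAdapter(nn.Module):
    """Stage 2 (hypothetical): lightweight adapter that fuses per-node metadata
    and adds spatial attention on top of the frozen temporal features."""
    def __init__(self, d_model=64, d_meta=8, n_heads=4):
        super().__init__()
        self.meta_proj = nn.Linear(d_meta, d_model)
        self.spatial_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, 1)

    def forward(self, h, meta):                    # h: (batch, nodes, d_model), meta: (nodes, d_meta)
        h = h + self.meta_proj(meta)               # metadata fusion (broadcast over batch)
        h, _ = self.spatial_attn(h, h, h)          # spatial interaction across nodes
        return self.head(h)                        # one-step forecast per node

# Toy forward pass: 2 samples, 10 nodes, 12 past steps, 8-dim node metadata.
backbone, adapter = TemporalBackbone(), SpatialAdapter()
for p in backbone.parameters():
    p.requires_grad = False                        # backbone frozen; only the adapter is tuned
x = torch.randn(2, 10, 12, 1)
meta = torch.randn(10, 8)
h = backbone(x.flatten(0, 1)).mean(dim=1)          # pool over time -> (2*10, d_model)
y_hat = adapter(h.view(2, 10, -1), meta)            # (2, 10, 1)
print(y_hat.shape)
```

Because only the adapter carries trainable parameters in this sketch, adapting the backbone to a new spatio-temporal domain would touch a small fraction of the total parameters, which is the efficiency argument the abstract makes.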
