Continual Learning for Foundation Models

Continual learning that drops the disjoint-task assumption.

Continual learning frameworks for VLMs and LLMs that detect semantic overlap across tasks, consolidate redundant experts via on-policy self-distillation, and route inputs through zero-parameter or GMM-based routers. They outperform the strongest baselines by 7–15 points across disjoint and overlapping benchmarks while reducing the number of deployed adapters.
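To make the overlap-detection and routing ideas concrete, here is a minimal sketch under stated assumptions. It is not the implementation behind the results above. It assumes each task is summarized by the mean of its L2-normalized input embeddings, flags two tasks as overlapping when their prototypes exceed a cosine-similarity threshold, and routes an input to the nearest prototype, which is one possible reading of a "zero-parameter" router since nothing is trained. The function names, the threshold value, and the toy embeddings are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit L2 norm."""
    x = np.asarray(x, dtype=float)
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def task_prototypes(features_by_task):
    """Summarize each task by the normalized mean of its normalized features.

    features_by_task: dict mapping task name -> array of shape (n, d).
    Hypothetical helper: the source does not say how tasks are summarized.
    """
    return {
        name: l2_normalize(l2_normalize(feats).mean(axis=0))
        for name, feats in features_by_task.items()
    }


def overlapping_pairs(prototypes, threshold=0.9):
    """Return task pairs whose prototype cosine similarity is >= threshold.

    The threshold of 0.9 is an illustrative assumption.
    """
    names = sorted(prototypes)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if float(prototypes[a] @ prototypes[b]) >= threshold:
                pairs.append((a, b))
    return pairs


def route(x, prototypes):
    """Pick the task whose prototype is most similar to the input embedding.

    No learned weights are involved: routing is a nearest-prototype lookup.
    """
    names = sorted(prototypes)
    stacked = np.stack([prototypes[n] for n in names])
    sims = stacked @ l2_normalize(x)
    return names[int(np.argmax(sims))]


# Toy 3-d embeddings: "charts" and "plots" nearly coincide, "ocr" does not.
toy_features = {
    "charts": np.array([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]),
    "plots": np.array([[1.0, 0.1, 0.0], [0.9, 0.0, 0.0]]),
    "ocr": np.array([[0.0, 0.0, 1.0], [0.0, 0.1, 1.0]]),
}
toy_prototypes = task_prototypes(toy_features)
```

In this sketch, a pair returned by `overlapping_pairs` would be a candidate for consolidation into a single expert. A GMM-based router could replace the single prototype per task with a fitted mixture and route by likelihood instead of cosine similarity.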

I created the first VLM continual-learning benchmark with controlled inter-task overlap, because the standard disjoint-task setup hides the failure mode that matters most in production: tasks that share concepts but differ in distribution.
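As an illustration of what "controlled inter-task overlap" could mean in practice, the sketch below builds a task sequence in which consecutive tasks share a fixed number of concepts. This is an assumption made for illustration, not the benchmark's actual construction: the sliding-window scheme, the function names, and the use of Jaccard similarity as the overlap measure are all hypothetical.

```python
def make_overlapping_tasks(concepts, num_tasks, concepts_per_task, overlap):
    """Slide a window over `concepts` so consecutive tasks share `overlap` items.

    overlap=0 recovers the standard disjoint-task setup.
    """
    if not 0 <= overlap < concepts_per_task:
        raise ValueError("overlap must satisfy 0 <= overlap < concepts_per_task")
    stride = concepts_per_task - overlap
    needed = concepts_per_task + stride * (num_tasks - 1)
    if needed > len(concepts):
        raise ValueError(f"need {needed} concepts, got {len(concepts)}")
    return [
        list(concepts[i * stride: i * stride + concepts_per_task])
        for i in range(num_tasks)
    ]


def jaccard(a, b):
    """Overlap between two concept sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


concepts = list(range(10))
disjoint = make_overlapping_tasks(concepts, num_tasks=2, concepts_per_task=4, overlap=0)
shared = make_overlapping_tasks(concepts, num_tasks=3, concepts_per_task=4, overlap=2)
```

Sweeping `overlap` from 0 up to `concepts_per_task - 1` gives a family of task sequences that differ only in how much neighboring tasks share, which is the kind of knob a controlled-overlap benchmark needs. Distribution shift within shared concepts would have to be introduced separately, for example by drawing each task's images from a different source.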

This is the model-side complement to the production agent system: the agent system addresses how to ship updates fast, while this work addresses how to ship them without forgetting. The thread connects to earlier domain-adaptation work (Xu et al., 2019; Zhou et al., 2020) that established stochastic neighborhood embedding for cross-domain transfer, and to recent advances in instruction grounding (Zong et al., 2025) and salient-concept-aware data augmentation (Zhao et al., 2025).

References

2025

  1. Ground-V: Teaching VLMs to Ground Complex Instructions in Pixels
    Yongshuo Zong, Qin Zhang, Dongsheng An, and 6 more authors
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
  2. Salient Concept-Aware Generative Data Augmentation
    Tianchen Zhao, Xuanbai Chen, Zhihua Li, and 5 more authors
    In Advances in Neural Information Processing Systems (NeurIPS), 2025

2020

  1. d-SNE: Domain Adaptation Using Stochastic Neighborhood Embedding
    Xiong Zhou, Xiang Xu, Ragav Venkatesan, and 2 more authors
    In Domain Adaptation in Computer Vision with Deep Learning (book chapter), 2020

2019

  1. d-SNE: Domain Adaptation Using Stochastic Neighborhood Embedding
    Xiang Xu, Xiong Zhou, Ragav Venkatesan, and 2 more authors
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019 (Oral)