An Analysis of Initial Training Strategies for Exemplar-Free Class-Incremental Learning

Published at the Winter Conference on Applications of Computer Vision (WACV), 2024

Real-world applications of machine learning (ML) often involve training models from data streams characterized by distributional changes and limited access to past data. This scenario challenges standard ML algorithms, which assume that all training data is available at once. Continual learning addresses this challenge by building models designed to incorporate new data while preserving previous knowledge. Class-incremental learning (CIL) is a type of continual learning that handles the case where the data stream is made up of batches of classes. It is particularly challenging in the exemplar-free case (EFCIL), i.e., when storing examples of previous classes is impossible due to memory or confidentiality constraints. CIL algorithms must strike a balance between knowledge retention (stability) and adaptation to new information (plasticity). Many existing EFCIL methods update the model at each incremental step using supervised fine-tuning combined with a distillation loss, and thus tend to favor plasticity over stability. Another line of work freezes the initial model and only updates the classifier. This approach has recently gained interest due to the availability of models pre-trained on large external datasets, often through self-supervision. While pre-trained models provide diverse and generic features, there are limits to their transferability, and these limits have not been studied in depth in the context of EFCIL.
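For readers less familiar with this family of methods, the sketch below illustrates how a fine-tuning-based EFCIL step typically combines a supervised loss on the new classes with a knowledge-distillation term computed against the model from the previous step. This is a generic LwF-style formulation given for intuition only, not the specific losses of the algorithms compared in the paper; the function and parameter names are ours.

```python
import torch.nn.functional as F

def incremental_loss(logits_new, logits_old, targets,
                     temperature=2.0, lambda_distill=1.0):
    """Generic EFCIL fine-tuning loss: cross-entropy on the current batch plus
    a distillation term that keeps the current model's predictions on old
    classes close to those of the frozen previous model."""
    # Standard supervised loss on the current (new) classes.
    ce = F.cross_entropy(logits_new, targets)

    # Distillation on the logits of previously seen classes, assuming the old
    # classes occupy the first positions of the new classification head.
    n_old = logits_old.size(1)
    log_p_new = F.log_softmax(logits_new[:, :n_old] / temperature, dim=1)
    p_old = F.softmax(logits_old / temperature, dim=1)
    distill = F.kl_div(log_p_new, p_old, reduction="batchmean") * temperature ** 2

    # A larger lambda_distill favors stability, a smaller one favors plasticity.
    return ce + lambda_distill * distill
```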

We propose a comprehensive analysis framework to disentangle the factors that influence EFCIL performance, with a focus on the strategies used to obtain the initial model of the incremental process. We consider the type of neural architecture, the training method, the depth of fine-tuning, the availability of external data, and the supervision mode used for this initial model. The initial training strategies are compared using three EFCIL algorithms, representative of the state of the art, on 16 target datasets, under 2 challenging CIL scenarios (the experimental grid is sketched below). The main findings are that: (1) pre-training with external data improves accuracy, (2) self-supervision in the initial step boosts incremental learning, particularly when the pre-trained model is fine-tuned on the initial classes, and (3) EFCIL algorithms based on transfer learning outperform their fine-tuning-based counterparts. No combination of an EFCIL algorithm and an initial training strategy is best in all cases, which makes it important to understand the contribution of the different factors influencing EFCIL performance. The insights brought by the proposed analysis could benefit both continual learning researchers and practitioners: the proposed framework can improve the evaluation and analysis of EFCIL methods, and practitioners can use the results of this study to better design their incremental learning systems.
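As an illustration of the analysis framework, the snippet below sketches how the study crosses initial training strategies with EFCIL algorithms, target datasets, and CIL scenarios. All concrete names and values are placeholders chosen for readability; the actual architectures, algorithms, and datasets are listed in the paper.

```python
from itertools import product

# Factors defining the initial training strategy (placeholder values).
architectures = ["resnet", "vit"]
training_methods = ["supervised", "self_supervised"]
fine_tuning_depths = ["frozen", "partial", "full"]
external_data = [False, True]

# Factors defining the incremental evaluation (placeholder values).
efcil_algorithms = ["transfer_based", "fine_tuning_based"]
datasets = [f"target_dataset_{i}" for i in range(16)]
scenarios = ["equal_splits", "large_initial_batch"]

def run_experiment(config):
    """Placeholder: train the initial model with the given strategy, then run
    the chosen EFCIL algorithm over the incremental steps and report accuracy."""
    raise NotImplementedError

grid = product(architectures, training_methods, fine_tuning_depths,
               external_data, efcil_algorithms, datasets, scenarios)
for arch, method, depth, ext, algo, dataset, scenario in grid:
    config = dict(architecture=arch, training=method, fine_tuning=depth,
                  external_data=ext, algorithm=algo, dataset=dataset,
                  scenario=scenario)
    # accuracy = run_experiment(config)  # enable once run_experiment is implemented
```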

The detailed description of the method is available in the paper.

If you find this work useful for your research, please cite it as follows:

@article{petit2023analysis,
  title={An Analysis of Initial Training Strategies for Exemplar-Free Class-Incremental Learning},
  author={Petit, Gr{\'e}goire and Soumm, Michael and Feillet, Eva and Popescu, Adrian and Delezoide, Bertrand and Picard, David and Hudelot, C{\'e}line},
  journal={arXiv preprint arXiv:2308.11677},
  year={2023}
}