Francisco Casacuberta Nolla (thesis supervisor)
María Inés Torres Barañano (committee secretary)
Felipe Sánchez Martínez (committee member)
Neural Machine Translation (NMT) faces significant challenges in adapting to the ever-evolving nature of human language. Traditional NMT models struggle to integrate new vocabulary terms and are prone to forgetting previously learned information when trained on new data. This work addresses these challenges by exploring methods to enable Continual Learning (CL) in NMT, focusing on the open-vocabulary problem and the Catastrophic Forgetting (CF) phenomenon.
To tackle the open-vocabulary problem, we introduce quasi-character-level vocabularies, which combine the flexibility of character-level models with the efficiency of higher-level encodings. This approach allows NMT models to represent any word, including unseen or rare ones, without excessively increasing sequence lengths, thereby improving model generalization and performance. Additionally, we propose incremental and continual vocabularies that leverage compositional embeddings to integrate new words seamlessly into our NMT models without the need for retraining.
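As a minimal sketch of the compositional idea, the snippet below builds a word embedding from hashed character n-gram vectors, so unseen words receive a representation without retraining. The class, bucket count, and n-gram size are illustrative assumptions, not the thesis's actual implementation.

```python
import torch
import torch.nn as nn

class CompositionalEmbedding(nn.Module):
    """Embed any word as the mean of its character n-gram vectors.

    Hypothetical sketch: a hashed n-gram table replaces a fixed,
    closed vocabulary, so new words never require retraining.
    """
    def __init__(self, num_buckets=100_000, dim=512, n=3):
        super().__init__()
        self.n = n
        self.num_buckets = num_buckets
        self.ngram_emb = nn.Embedding(num_buckets, dim)

    def ngrams(self, word):
        padded = f"<{word}>"  # boundary markers distinguish prefixes/suffixes
        return [padded[i:i + self.n] for i in range(len(padded) - self.n + 1)]

    def forward(self, word):
        ids = torch.tensor([hash(g) % self.num_buckets for g in self.ngrams(word)])
        # Averaging n-gram vectors yields a word embedding even for
        # words never seen during training.
        return self.ngram_emb(ids).mean(dim=0)

emb = CompositionalEmbedding()
vec = emb("transformerization")  # unseen word, still representable
print(vec.shape)  # torch.Size([512])
```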
To address the CF problem, we first investigate the influence of vocabulary domain and size on the model's retention capabilities. Next, we explore rehearsal strategies, demonstrating that minimal amounts of past data can significantly reduce forgetting during training. Furthermore, to improve knowledge retention without relying heavily on past data, we propose a regularization strategy that combines few-shot rehearsal with loss penalties, balancing the learning of new tasks against preserving performance on previous ones.
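The training step below sketches how few-shot rehearsal and a loss penalty might combine. The function name, `replay_buffer.sample`, and the plain L2 drift penalty are assumptions for illustration (the thesis's penalty may weight parameters differently, e.g. EWC-style importances).

```python
import torch

def continual_step(model, new_batch, replay_buffer, old_params,
                   loss_fn, optimizer, lam=0.1, k=4):
    """One update mixing few-shot rehearsal with a regularization penalty.

    Hypothetical sketch: `replay_buffer` holds a handful of past-task
    examples; `lam` weights an L2 term pulling parameters toward their
    pre-adaptation values. Names are illustrative, not the thesis's API.
    """
    src, tgt = new_batch
    loss = loss_fn(model(src), tgt)

    # Few-shot rehearsal: replay k stored examples from earlier tasks.
    for old_src, old_tgt in replay_buffer.sample(k):
        loss = loss + loss_fn(model(old_src), old_tgt)

    # Loss penalty: discourage drift from the previous task's weights.
    penalty = sum(((p - p_old) ** 2).sum()
                  for p, p_old in zip(model.parameters(), old_params))
    loss = loss + lam * penalty

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```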
Finally, we explore parameter-efficient adaptation methods to enable effective task-switching strategies in NMT. We find that, by training only a small number of parameters, these methods allow NMT models to adapt to new domains, styles, or even languages without substantial computational overhead or performance degradation on prior tasks. To complement this research, we also derive a gradient-based regularization technique for low-rank matrices that facilitates the integration of new knowledge while mitigating the CF problem.
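A minimal sketch of such a parameter-efficient adapter follows, assuming a LoRA-style low-rank update on a frozen linear layer: only the small matrices A and B are trained per task, so task switching amounts to swapping adapters. The rank, scaling, and class name are assumptions; the thesis's gradient-based regularizer for these matrices is not reproduced here.

```python
import torch
import torch.nn as nn

class LowRankAdapter(nn.Module):
    """LoRA-style trainable low-rank update over a frozen base layer.

    Hypothetical sketch: freezing the base weights preserves prior-task
    knowledge, while the rank-r delta (r << d) adapts to the new task.
    """
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # prior-task knowledge stays intact
        d_in, d_out = base.in_features, base.out_features
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))  # zero init: no drift at start
        self.scale = alpha / r

    def forward(self, x):
        # Base output plus the trainable low-rank correction.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LowRankAdapter(nn.Linear(512, 512))
out = layer(torch.randn(2, 512))  # (2, 512): frozen base + adapted delta
```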
Overall, this work advances the field of continual learning in NMT by providing practical and thoroughly validated solutions to the open-vocabulary and catastrophic forgetting problems, paving the way for more adaptive and efficient NMT systems capable of responding to the evolving demands of natural language translation tasks.