Exploring the Dilemma of Causal Incoherence: A Study on the Approaches and Limitations of Large Language Models in Natural Language Inference

  • Authors: Jon Felix Apaolaza Larraya, Begoña Altuna, Aitor Soroa Etxabe, Íñigo López Gazpio
  • Published in: Procesamiento del Lenguaje Natural, ISSN 1135-5948, No. 74, 2025, pp. 207-219
  • Language: English
  • Parallel titles:
    • Explorando el Dilema de la Incoherencia Causal: Un Estudio sobre los Enfoques y las Limitaciones de los LLMs en la Inferencia de Lenguaje Natural
  • Abstract
    • Spanish (translated)

      This research addresses the critical yet underappreciated problem faced by large language models (LLMs) known as the Reversal Curse (RC). The RC denotes an inherent limitation in inferring bidirectional relationships that undermines logical reasoning capabilities. Under the effects of the RC, LLMs cannot effectively infer bidirectional relationships, which limits their deductive reasoning ability. If an LLM is trained on a sentence of the form “A relates to B”, it does not automatically generalize to the reverse form, “B relates to A”. Through a systematic literature review and experimental analysis, we highlight the difficulties that state-of-the-art LLMs face in maintaining causal coherence. Recognizing the RC as a persistent problem across diverse architectures, we analyze mitigation strategies, including data augmentation techniques and innovative training objectives. We review recent advances and the root causes of the problem, offering lessons learned, a discussion of the approaches applied, and the limitations of the mitigation techniques. This work aims to contribute to the development of more reliable and coherent AI systems.

    • English

      This research addresses a critical yet underappreciated problem in state-of-the-art Large Language Models (LLMs) known as the Reversal Curse (RC). The RC denotes a failure to infer bidirectional relationships that undermines logical reasoning capabilities. Under the RC, LLMs cannot effectively infer bidirectional relationships, leading to logical errors in deductive reasoning. If a model is trained on a sentence of the form “A relates to B”, it does not automatically generalize to the reverse form, “B relates to A”. Through a systematic literature review and experimental analysis, we highlight the difficulties state-of-the-art LLMs face in maintaining causal coherence. Recognizing the RC as a persistent problem across architectures, we review mitigation strategies, including data augmentation and innovative training objectives, offer insights into the root causes, and discuss the limitations of these techniques. This work aims to contribute to the development of more reliable and coherent AI systems.
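
      The forward/reverse asymmetry described above can be probed directly on any open causal language model. The following is a minimal sketch, assuming the HuggingFace transformers library and using gpt2 purely as a stand-in model: it scores the same fact phrased in both directions, and under the Reversal Curse the reverse phrasing tends to receive a markedly lower log-likelihood when only the forward form was seen in training. The model choice and prompts are illustrative, not the authors' experimental setup.

      # Minimal Reversal Curse probe (sketch; gpt2 is a stand-in model).
      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer

      tok = AutoTokenizer.from_pretrained("gpt2")
      model = AutoModelForCausalLM.from_pretrained("gpt2")
      model.eval()

      def sequence_logprob(text: str) -> float:
          """Total log-probability the model assigns to `text`."""
          ids = tok(text, return_tensors="pt").input_ids
          with torch.no_grad():
              out = model(ids, labels=ids)
          # out.loss is the mean negative log-likelihood per predicted token,
          # so multiply by the number of predicted tokens and negate.
          return -out.loss.item() * (ids.shape[1] - 1)

      # Same fact, both directions (example entity from Berglund et al. 2023).
      forward = "Valentina Tereshkova was the first woman to travel to space."
      reverse = "The first woman to travel to space was Valentina Tereshkova."
      print("forward:", sequence_logprob(forward))
      print("reverse:", sequence_logprob(reverse))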

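      Among the mitigation strategies named in the abstract, the data-augmentation route can be illustrated with a toy version of reverse training (Golovneva et al., 2024): each training sentence is also presented in reversed order, so both directions of a relation are observed. This word-level sketch is a deliberate simplification; the published method also reverses at the entity and segment level.

      # Toy reverse-training augmentation (word-level simplification).
      def reverse_augment(corpus: list[str]) -> list[str]:
          augmented = []
          for sentence in corpus:
              augmented.append(sentence)
              # Append a word-reversed copy so the reverse direction
              # of each stated relation is also seen during training.
              augmented.append(" ".join(reversed(sentence.split())))
          return augmented

      for line in reverse_augment(["Tom Cruise's mother is Mary Lee Pfeiffer."]):
          print(line)
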
  • References
    • Achiam, J., S. Adler, S. Agarwal, L. Ahmad, I. Akkaya, F. L. Aleman, D. Almeida, J. Altenschmidt, S. Altman, S. Anadkat, et al. 2023. GPT-4...
    • Allen-Zhu, Z. and Y. Li. 2023. Physics of language models: Part 3.2, knowledge manipulation. arXiv preprint arXiv:2309.14402.
    • Berglund, L., M. Tong, M. Kaufmann, M. Balesni, A. C. Stickland, T. Korbak, and O. Evans. 2023. The reversal curse: LLMs trained on “A is...
    • Brown, T., B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss,...
    • Buchanan, B. G. and E. H. Shortliffe. 1984. Rule-based expert systems: The MYCIN experiments of the Stanford Heuristic Programming Project...
    • Chang, Y., X. Wang, J. Wang, Y. Wu, K. Zhu, H. Chen, L. Yang, X. Yi, C. Wang, Y. Wang, et al. 2023. A survey on evaluation of large language...
    • Chen, Z. and Q. Gao. 2024. Monotonicity Reasoning in the Age of Neural Foundation Models. Journal of Logic, Language and Information, 33(1):49–68.
    • Chu, Z., J. Chen, Q. Chen, W. Yu, T. He, H. Wang, W. Peng, M. Liu, B. Qin, and T. Liu. 2024. Navigate through Enigmatic Labyrinth: A Survey...
    • Dagan, I., B. Dolan, B. Magnini, and D. Roth. 2010. Recognizing textual entailment: Rational, evaluation and approaches – erratum. Natural Language...
    • Devlin, J., M.-W. Chang, K. Lee, and K. Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding....
    • Du, X., R. Zhu, Y. Li, and A. Anjum. 2019. Language model-based automatic prefix abbreviation expansion method for biomedical big data analysis....
    • Fluri, L., D. Paleka, and F. Tramèr. 2024. Evaluating superhuman models with consistency checks. In 2024 IEEE Conference on Secure and Trustworthy...
    • Golovneva, O., Z. Allen-Zhu, J. Weston, and S. Sukhbaatar. 2024. Reverse training to nurse the reversal curse. arXiv preprint arXiv:2403.13799.
    • Grosse, R., J. Bae, C. Anil, N. Elhage, A. Tamkin, A. Tajdini, B. Steiner, D. Li, E. Durmus, E. Perez, et al. 2023. Studying large language...
    • Guo, Q., R. Wang, J. Guo, X. Tan, J. Bian, and Y. Yang. 2024. Mitigating Reversal Curse in Large Language Models via Semantic-aware Permutation...
    • Hadi, M. U., R. Qureshi, A. Shah, M. Irfan, A. Zafar, M. Shaikh, N. Akhtar, J. Wu, and S. Mirjalili. 2023. A survey on large language models:...
    • Jiang, A. Q., A. Sablayrolles, A. Mensch, C. Bamford, D. S. Chaplot, D. d. l. Casas, F. Bressand, G. Lengyel, G. Lample, L. Saulnier, et al....
    • Joshi, N., A. Saparov, Y. Wang, and H. He. 2024. LLMs are prone to fallacies in causal inference. In Y. Al-Onaizan, M. Bansal, and Y.-N. Chen,...
    • Khalfa, J. 1994. What is intelligence? Cambridge University Press.
    • Kıcıman, E., R. Ness, A. Sharma, and C. Tan. 2023. Causal reasoning and large language models: Opening a new frontier for causality. arXiv...
    • Kitouni, O., N. Nolte, D. Bouchacourt, A. Williams, M. Rabbat, and M. Ibrahim. 2024. The Factorization Curse: Which Tokens You Predict Underlie...
    • Korb, K. B. and A. E. Nicholson. 2010. Bayesian artificial intelligence. CRC press.
    • Lacave, C. and F. J. Díez. 2002. A review of explanation methods for Bayesian networks. The Knowledge Engineering Review, 17(2):107–127.
    • Lai, V. D., N. Ngo, A. Pouran Ben Veyseh, H. Man, F. Dernoncourt, T. Bui, and T. H. Nguyen. 2023. ChatGPT beyond English: Towards a comprehensive...
    • Lin, S., J. Hilton, and O. Evans. 2022. TruthfulQA: Measuring How Models Mimic Human Falsehoods. In S. Muresan, P. Nakov, and A. Villavicencio,...
    • Liu, A., B. Feng, B. Xue, B. Wang, B. Wu, C. Lu, C. Zhao, C. Deng, C. Zhang, C. Ruan, et al. 2024. DeepSeek-V3 technical report. arXiv preprint...
    • Lu, Z., L. Jin, P. Li, Y. Tian, L. Zhang, S. Wang, G. Xu, C. Tian, and X. Cai. 2024. Rethinking the Reversal Curse of LLMs: a Prescription...
    • Lv, A., K. Zhang, S. Xie, Q. Tu, Y. Chen, J.-R. Wen, and R. Yan. 2024. An analysis and mitigation of the reversal curse. In Y. Al-Onaizan,...
    • Ma, J.-Y., J.-C. Gu, Z.-H. Ling, Q. Liu, and C. Liu. 2023. Untying the reversal curse via bidirectional language model editing. arXiv preprint...
    • MacCartney, B. and C. D. Manning. 2007. Natural logic for textual inference. In Proceedings of the ACL-PASCAL Workshop on Textual Entailment...
    • MacCartney, B. and C. D. Manning. 2009. An extended model of natural logic. In Proceedings of the eighth international conference on computational...
    • Mohler, M., R. Bunescu, and R. Mihalcea. 2011. Learning to grade short answer questions using semantic similarity measures and dependency...
    • OpenAI. 2024. OpenAI o1 System Card.
    • Press, O., M. Zhang, S. Min, L. Schmidt, N. Smith, and M. Lewis. 2023. Measuring and narrowing the compositionality gap in language models....
    • Radford, A., J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, et al. 2019. Language models are unsupervised multitask learners. OpenAI blog,...
    • Sainz, O., J. Campos, I. García-Ferrero, J. Etxaniz, O. L. de Lacalle, and E. Agirre. 2023. NLP Evaluation in trouble: On the Need to Measure...
    • Touvron, H., T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozière, N. Goyal, E. Hambro, F. Azhar, et al. 2023a....
    • Touvron, H., L. Martin, K. Stone, P. Albert, A. Almahairi, Y. Babaei, N. Bashlykov, S. Batra, P. Bhargava, S. Bhosale, et al. 2023b. Llama...
    • Wang, B. and A. Komatsuzaki. 2021. GPT-J-6B: A 6 billion parameter autoregressive language model.
    • Wu, D., J. Yang, and K. Wang. 2024. Exploring the reversal curse and other deductive logical reasoning in BERT and GPT-based large language...
    • Wu, S., Z. Peng, X. Du, T. Zheng, M. Liu, J. Wu, J. Ma, Y. Li, J. Yang, W. Zhou, et al. 2024. A comparative study on reasoning patterns of...
    • Xia, Y., R. Wang, X. Liu, M. Li, T. Yu, X. Chen, J. McAuley, and S. Li. 2024. Beyond chain-of-thought: A survey of chain-of-x paradigms for...
    • Xu, X., K. Kong, N. Liu, L. Cui, D. Wang, J. Zhang, and M. Kankanhalli. 2023. An LLM can Fool Itself: A Prompt-Based Adversarial Attack. arXiv...
    • Yang, Z. 2019. XLNet: Generalized Autoregressive Pretraining for Language Understanding. arXiv preprint arXiv:1906.08237.
    • Zeng, A., X. Liu, Z. Du, Z. Wang, H. Lai, M. Ding, Z. Yang, Y. Xu, W. Zheng, X. Xia, et al. 2022. GLM-130B: An Open Bilingual Pre-trained...
    • Zhao, W. X., K. Zhou, J. Li, T. Tang, X. Wang, Y. Hou, Y. Min, B. Zhang, J. Zhang, Z. Dong, et al. 2023. A Survey of Large Language Models....
    • Zheng, L., W.-L. Chiang, Y. Sheng, S. Zhuang, Z. Wu, Y. Zhuang, Z. Lin, Z. Li, D. Li, E. Xing, et al. 2023. Judging LLM-as-a-Judge with MT-Bench...
    • Zhu, H., B. Huang, S. Zhang, M. Jordan, J. Jiao, Y. Tian, and S. Russell. 2024. Towards a Theoretical Understanding of the ‘Reversal Curse’...
