Credit Risk Meets Large Language Models: Building a Risk Indicator from Loan Descriptions in P2P Lending

Mario Sanz Guerrero; Javier Arroyo Gallardo

Mario Sanz-Guerrero [1]; Javier Arroyo [1]
[1] Universidad Complutense de Madrid, Madrid, Spain
Published in: Inteligencia Artificial: Revista Iberoamericana de Inteligencia Artificial, e-ISSN 1988-3064, ISSN 1137-3601, Vol. 28, No. 75, 2025, pp. 220-247
Language: English
DOI: 10.4114/intartif.vol28iss75pp220-247
Abstract
- Peer-to-peer (P2P) lending connects borrowers and lenders through online platforms but suffers from significant information asymmetry, as lenders often lack sufficient data to assess borrowers’ creditworthiness. This paper addresses this challenge by leveraging BERT, a Large Language Model (LLM) known for its ability to capture contextual nuances in text, to generate a risk score based on borrowers’ loan descriptions using a dataset from the Lending Club platform. We fine-tune BERT to distinguish between defaulted and non-defaulted loans using the loan descriptions provided by the borrowers. The resulting BERT-generated risk score is then integrated as an additional feature into an XGBoost classifier used at the loan granting stage, where decision-makers have limited information available to guide their decisions. This integration enhances predictive performance, with improvements in balanced accuracy and AUC, highlighting the value of textual features in complementing traditional inputs. Moreover, we find that the incorporation of the BERT score alters how classification models utilize traditional input variables, with these changes varying by loan purpose. These findings suggest that BERT discerns meaningful patterns in loan descriptions, encompassing borrower-specific features, specific purposes, and linguistic characteristics. However, the inherent opacity of LLMs and their potential biases underscore the need for transparent frameworks to ensure regulatory compliance and foster trust. Overall, this study demonstrates how LLM-derived insights interact with traditional features in credit risk modeling, opening new avenues to enhance the explainability and fairness of these models.
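The abstract reports gains in balanced accuracy and AUC when a text-derived score is added to a granting model. The following is a minimal sketch of how those two metrics can be computed, using only the Python standard library. The labels and scores are hypothetical toy values chosen for illustration, not data or results from the paper, and the names `baseline_score`, `augmented_score`, and the 0.5 threshold are assumptions made for this example.

```python
from typing import List, Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of the recall on defaulted (1) and non-defaulted (0) loans."""
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / n_pos + tn / n_neg)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random defaulted loan gets a higher score than a
    random non-defaulted one (ties count one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold(scores: Sequence[float], cut: float = 0.5) -> List[int]:
    """Turn predicted default probabilities into 0/1 labels."""
    return [1 if s >= cut else 0 for s in scores]


# Hypothetical toy data: 1 = defaulted, 0 = non-defaulted.
y_true = [1, 0, 0, 1, 0, 0]
# Hypothetical predicted default probabilities from a model that uses only
# traditional features, and from one that also uses a text-derived score.
baseline_score = [0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
augmented_score = [0.8, 0.4, 0.6, 0.55, 0.2, 0.1]

for name, score in [("baseline", baseline_score), ("augmented", augmented_score)]:
    ba = balanced_accuracy(y_true, threshold(score))
    auc = roc_auc(y_true, score)
    print(f"{name}: balanced accuracy={ba:.3f}, AUC={auc:.3f}")
```

Balanced accuracy averages the per-class recall, so it is not dominated by the majority class when defaults are rare. AUC is threshold-free and measures only how well the scores rank defaulted loans above non-defaulted ones.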