Madrid, España
Peer-to-peer (P2P) lending connects borrowers and lenders through online platforms but suffers from significant information asymmetry, as lenders often lack sufficient data to assess borrowers’ creditworthiness. This paper addresses this challenge by leveraging BERT, a Large Language Model (LLM) known for its ability to capture contextual nuances in text, to generate a risk score based on borrowers’ loan descriptions using a dataset from the Lending Club platform. We fine-tune BERT to distinguish between defaulted and non-defaulted loans using the loan descriptions provided by the borrowers. The resulting BERT-generated risk score is then integrated as an additional feature into an XGBoost classifier used at the loan granting stage, where decision-makers have limited information available to guide their decisions. This integration enhances predictive performance, with improvements in balanced accuracy and AUC, highlighting the value of textual features in complementing traditional inputs. Moreover, we find that the incorporation of the BERT score alters how classification models utilize traditional input variables, with these changes varying by loan purpose. These findings suggest that BERT discerns meaningful patterns in loan descriptions, encompassing borrower-specific features, specific purposes, and linguistic characteristics. However, the inherent opacity of LLMs and their potential biases underscore the need for transparent frameworks to ensure regulatory compliance and foster trust. Overall, this study demonstrates how LLM-derived insights interact with traditional features in credit risk modeling, opening new avenues to enhance the explainability and fairness of these models.
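The two evaluation metrics named above, balanced accuracy and AUC, can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and chosen only for illustration. The helper `add_text_score` shows, under the same assumptions, where a text-derived risk score would be appended as one extra column to the tabular features before a classifier is fitted.

```python
from typing import List, Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of the recall on class 1 (default) and the recall on class 0."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Share of (positive, negative) pairs where the positive scores higher.

    Ties count as half a win. This pairwise form is quadratic in the number
    of samples, which is fine for a toy example.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def add_text_score(rows: Sequence[Sequence[float]],
                   text_scores: Sequence[float]) -> List[List[float]]:
    """Append one text-derived score to each row of tabular features."""
    if len(rows) != len(text_scores):
        raise ValueError("one score per row is required")
    return [list(row) + [score] for row, score in zip(rows, text_scores)]


# Hypothetical toy data: 1 = defaulted, 0 = non-defaulted.
y_true = [1, 0, 1, 0, 0, 1]
risk_scores = [0.9, 0.2, 0.4, 0.5, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in risk_scores]  # assumed 0.5 threshold

print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
print("AUC:", roc_auc(y_true, risk_scores))
```

Balanced accuracy averages the per-class recalls, so it is not inflated by the majority class when defaults are rare, while AUC summarizes ranking quality without fixing a threshold.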