Improving named entity recognition through the identification of entity roles

  • Author: Pablo Calleja
  • Thesis supervisors: Asunción Gómez Pérez (supervisor), Raul Garcia Castro (co-supervisor)
  • Defense: Universidad Politécnica de Madrid (Spain), 2020
  • Language: Spanish
  • Thesis committee: Oscar Corcho García (chair), Elena Montiel Ponsoda (secretary), Mariano Fernández López (member), Jorge Gracia del Río (member), Miriam Fernández Sánchez (member)
  • Full text not available
  • Abstract
    • The complexity of the natural language documents used in companies and organizations poses a challenge for information extraction tasks. Among these tasks, Named Entity Recognition (the identification of proper names of people, organizations, or locations) is essential. However, not all named entities that appear in such documents are relevant, nor do they share the same purpose and meaning. That purpose and meaning are defined by the role the entities play in the document, and retrieving every possible named entity is not useful in certain scenarios. Therefore, this thesis proposes a hierarchical classification of named entities according to their role, together with the models needed to identify them. The thesis also proposes a method for identifying named entities based on their role, using that hierarchy. Both contributions have been instantiated in a real use case in the legal domain, using leaked emails in a journalistic investigation. Finally, the thesis presents a software library that implements these contributions together with other information extraction tasks.

