Documat


Abstract of Spelling normalization of historical documents by using a machine translation approach

Miguel Domingo, Francisco Casacuberta Nolla

  • Because historical documents lack a spelling convention, their orthography varies with the author and the period in which each document was written. This is a problem for cultural heritage preservation, which aims to produce digital text versions of historical documents. To address it, we propose three approaches, based on statistical, neural and character-based machine translation, to adapt a document’s spelling to modern standards. We tested these approaches in different scenarios and obtained very encouraging results.

