Ester Boldrini, Alexandra Balahur Dobrescu, Patricio Martínez Barco, Andrés Montoyo Guijarro
Abstract: With the growing amount of subjective data on the Web 2.0, tools to manage and exploit such data have become essential. Our research focuses on the creation of EmotiBlog, a fine-grained annotation scheme for labelling subjectivity in non-traditional textual genres. We also present the EmotiBlog corpus, a collection of blog posts comprising 270,000 tokens about 3 topics in 3 languages: Spanish, English and Italian. Additionally, we carry out a series of experiments to check the robustness of the model and its applicability to Natural Language Processing tasks in the 3 languages. The experiments on inter-annotator agreement, as well as on feature selection, provided satisfactory results, which have encouraged us to continue working with the model and to extend the annotated corpus. To check its applicability, we tested different Machine Learning models, built using the EmotiBlog annotation, on other corpora in order to see whether the resulting annotation is domain and genre independent, and obtained positive results. Finally, we also applied EmotiBlog to Opinion Mining, showing that our resource improves the performance of systems built for this task.