AzterTest: Herramienta de Análisis Lingüístico y Estilístico de Código Abierto

  • Authors: Kepa Xabier Bengoetxea Kortazar, Amaia Aguirregoitia Martínez, Itziar González Dios
  • Published in: Procesamiento del lenguaje natural, ISSN 1135-5948, No. 64, 2020, pp. 61-68
  • Language: Spanish
  • Parallel titles:
    • AzterTest: Open source linguistic and stylistic analysis tool
  • Abstract
    • Spanish

      Text analysis is a useful procedure to help education professionals select the most suitable texts for their students. This task requires the analysis of several text features (e.g., syntactic complexity, word variety, etc.), which is mostly done manually. In this paper, we present AzterTest, an open-source tool for linguistic and stylistic analysis. AzterTest computes 153 features and achieves an accuracy of 90.09% when distinguishing three reading levels (elementary, intermediate, and advanced). AzterTest is also available as a web tool.

    • English

      Text analysis is a useful process to assist teachers in the selection of the most suitable texts for their students. This task demands the analysis of several text features (e.g., syntactic complexity, word variety, etc.), which is mostly done manually. In this paper, we present an open source tool for linguistic and stylistic analysis, called AzterTest. AzterTest calculates 153 features and obtains 90.09% accuracy when classifying texts into three reading levels (elementary, intermediate, and advanced). AzterTest is also available as a web tool.

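The abstract describes AzterTest as a feature-based readability classifier: it extracts 153 linguistic and stylistic features from a text and assigns one of three reading levels. As a rough, hypothetical illustration of that general approach (not AzterTest's actual feature set, code, or API), the following Python sketch computes two toy features and trains a small classifier with scikit-learn:

# Illustrative sketch only: AzterTest itself computes 153 linguistic and stylistic
# features; here we compute just two toy features (average sentence length and
# type-token ratio) and fit a small classifier on hypothetical labelled texts.
import numpy as np
from sklearn.linear_model import LogisticRegression

def toy_features(text):
    """Compute two simple readability features from raw text."""
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    words = text.lower().split()
    avg_sentence_len = len(words) / max(len(sentences), 1)   # average words per sentence
    type_token_ratio = len(set(words)) / max(len(words), 1)  # lexical variety
    return [avg_sentence_len, type_token_ratio]

# Hypothetical training texts labelled with reading levels
# (0 = elementary, 1 = intermediate, 2 = advanced).
texts = [
    "The cat sat. The dog ran. They played.",
    "The weather changed quickly, so the children decided to stay indoors and read.",
    "Notwithstanding the considerable methodological heterogeneity, the analysis "
    "revealed a statistically robust association between the constructs under scrutiny.",
]
labels = [0, 1, 2]

X = np.array([toy_features(t) for t in texts])
clf = LogisticRegression(max_iter=1000).fit(X, labels)

new_text = "The experiment shows that longer sentences and rarer words raise difficulty."
print(clf.predict(np.array([toy_features(new_text)])))  # predicted reading level

In the actual tool, the feature set is far richer (syntactic complexity, word variety, and so on, as the abstract notes), and the reported 90.09% accuracy over the three reading levels was obtained with a proper training corpus rather than toy examples like these.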
