A HMM text classification model with learning capacity

Eva Lorenzo Iglesias; María Lourdes Borrajo Diz; Rubén Romero González; Adrián Seara Vieira



IGLESIAS, Eva L. ^[1] ; BORRAJO, Lourdes ^[1] ; ROMERO, R. ^[1] ; A. Seara Vieira
1. [1] Universidade de Vigo, Vigo, Spain
Source: ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, e-ISSN 2255-2863, Vol. 3, No. 3, 2014, pp. 21-34
Language: English
DOI: 10.14201/ADCAIJ2014332134
Abstract
- In this paper, a method for classifying biomedical text documents based on a Hidden Markov Model (HMM) is proposed and evaluated. The method is integrated into a framework named BioClass. BioClass is composed of intelligent text classification tools and facilitates comparison between them by offering several views of the results. The main goal is to propose a content-based classifier that is more effective than current methods in this environment. To test the effectiveness of the proposed classifier, a set of experiments performed on the OHSUMED corpus is presented. Our model is tested both with and without learning capacity, and it is compared with other classification techniques. The results suggest that the adaptive HMM model is indeed more suitable for document classification.