Abstract
We study automatic classification for the diagnosis of Carpal Tunnel Syndrome (CTS), a disease frequently observed in occupational medicine. We apply different classification techniques to two real-life medical data sets related to a group of patients reporting the typical symptoms of this syndrome. We are particularly interested in the performance of “Box-Clustering” (BC), a method that favors readability and interpretation of the results by medical doctors, thanks to its “box-type” output, which naturally takes the form of a medical report. Preliminary results of a basic implementation of BC applied to different data sets already exist in the literature, and here we add more. In particular, we apply a recently developed (and specialized) implementation of BC and test it for the first time on real-life medical data related to CTS. Our purpose is to evaluate both the performance of BC for automatic diagnosis and its gain in explanation capability and interpretability. The latter is a crucial aspect in medical applications and generally represents a limitation of other well-known and powerful classification techniques.
Notes
We thank Doctor Gioia for the opportunity to apply BC to real-life data and for the useful discussions we had with him at the beginning of this work.
The two examinations were performed independently by two different doctors working at different sites in Italy: one at the Department of Neurophysiopathology of the “San Salvatore” Hospital in the city of L’Aquila, the other at the INAIL-Abruzzi Regional Polydiagnostic Center.
It was performed by a third medical doctor under a protocol in which he was informed neither of the diagnosis based on the other examination nor of the specific clinical picture of the subject under study.
It can be seen as a product of Boolean variables and of complemented Boolean variables.
Obviously, every pattern in a positive (or negative) theory takes the value \(0\) in every negative (resp. positive) point.
This definition can be easily extended to non-real but totally ordered variables.
We point out that these experiments are a selection from a wider experimental study carried out with the same BC implementation used in this paper (Spinelli 2014a).
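The remarks above on patterns (a product of Boolean variables and of complemented Boolean variables that takes the value 0 at every point of the opposite class) can be illustrated with a minimal Python sketch. The names `evaluate` and `vanishes_on`, the dictionary encoding of a pattern, and the toy points are hypothetical and chosen only for illustration; they are not taken from the BC implementation used in the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: a pattern maps a variable index to the literal used,
# True for the variable x_i and False for its complement (1 - x_i).
Pattern = Dict[int, bool]
Point = Tuple[int, ...]


def evaluate(pattern: Pattern, point: Point) -> int:
    """Return the product of the pattern's literals at a 0/1 point."""
    value = 1
    for index, uncomplemented in pattern.items():
        literal = point[index] if uncomplemented else 1 - point[index]
        value *= literal
    return value


def vanishes_on(pattern: Pattern, points: List[Point]) -> bool:
    """Check that the pattern takes the value 0 at every given point."""
    return all(evaluate(pattern, p) == 0 for p in points)


# Toy data (illustrative only, not from the CTS data sets).
positive_points = [(1, 0, 1), (1, 0, 0)]
negative_points = [(0, 0, 1), (1, 1, 0)]

# The pattern x_0 * (1 - x_1).
pattern = {0: True, 1: False}

# Positive points at which the pattern takes the value 1.
covered = [p for p in positive_points if evaluate(pattern, p) == 1]

# A pattern of a positive theory must vanish at every negative point.
is_positive_pattern = bool(covered) and vanishes_on(pattern, negative_points)
```

In this toy case the pattern equals 1 at both positive points and 0 at both negative points, so it qualifies as a positive pattern under the assumed encoding.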
References
Alexe G, Alexe S, Axelrod DE, Bonates TO, Lozina I, Reiss M, Hammer PL (2006) Breast cancer prognosis by combinatorial analysis of gene expression data. Breast Cancer Res 8:1–20
Alexe S, Blackstone EH, Hammer PL, Ishwaran H, Lauer MS, Snader CEP (2003) Coronary risk prediction by logical analysis of data. Ann Oper Res 119:15–42
Alexe G, Hammer PL, Kogan PL (2002) Comprehensive vs. comprehensible classifiers in logical analysis of data, RUTCOR Research Report, RRR 9/2002
Anthony M, Ratsaby J (2012) Using boxes and proximity to classify data into several categories. RUTCOR Research Reports, RRR 7/2012
Arlot S, Celisse A (2010) A survey of cross-validation procedures for model selection. Stat Surv 4:40–79
Barakat N, Diederich J (2005) Eclectic rule-extraction from support vector machines. Int J Comput Intell 2:59–62
Beekman R, Visser LH (2003) Sonography in the diagnosis of carpal tunnel syndrome: a critical review of the literature. Muscle Nerve 27:26–33
Bonates T, Hammer PL (2006) Logical analysis of data: from combinatorial optimization to medical applications. Ann Oper Res 148:203–225
Boros E, Hammer PL, Ibaraki T, Kogan A, Mayoraz E, Muchnik I (2000) An implementation of logical analysis of data. IEEE Trans Knowl Data Eng 12:292–306
Boros E, Ibaraki T, Shi L, Yagiura M (2000) Generating all good patterns in polynomial expected time. In: Lecture at the 6th International Symposium on Artificial Intelligence and Mathematics. Ft. Lauderdale, Florida
Boros E, Hammer PL, Ibaraki T, Kogan A (1997) A logical analysis of numerical data. Math Program 79:163–190
Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. Chapman & Hall, London
Cantatore FP, Dell’Accio F, Lapadula G (1997) Carpal Tunnel Syndrome: a review. Clin Rheumatol 16:596–603
Carrizosa E, Martin-Barragan B, Romero Morales D (2010) Binarized support vector machines. INFORMS J Comput 22:154–167
Carrizosa E, Martin-Barragan B, Romero Morales D (2011) Detecting relevant variables and interactions in supervised classification. Eur J Oper Res 213:260–269
Crama Y, Hammer PL, Ibaraki T (1988) Cause–effect relationships and partially defined Boolean functions. Ann Oper Res 16:299–325
Du D, Pardalos PM, Wang J (2000) Discrete mathematical problems with medical applications, DIMACS Series, vol 55. American Mathematical Society
Eckstein J, Hammer PL, Liu Y, Nediak M, Simeone B (2002) The maximum box problem and its application to data analysis. Comput Optim Appl 23:285–298
Ekin O, Hammer PL, Kogan A (1998) Convexity and logical analysis of data. RUTCOR Research Report, RRR 5/1998
Felici G, Spinelli V (2009) Genetic procedure for over-training control in logic mining. Istituto di Analisi dei Sistemi ed Informatica-IASI/CNR. Technical Report 19/2009
Felici G, Simeone B, Spinelli V (2008) Special issue on data mining. In: Sharda R, Voß S (eds) Classification techniques and error control in logic mining. Annals of Information Systems Series. Springer, New York
Felici G, Sun F-S, Truemper K (2006) Data mining and knowledge discovery approaches based on rule induction techniques. In: Felici G, Triantaphyllou E (eds) Learning logic formulas and related error distributions. Springer Science, New York
Felici G, Truemper K (2001) A MINSAT approach for learning in logic domains. INFORMS J Comput 13:1–17
Hall M, Eibe F, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. SIGKDD Explor 11:10–18
Hammer PL, Kogan A, Simeone B, Szedmák S (2001) Pareto-optimal patterns in logical analysis of data. RUTCOR Research Report, RRR 7/2001
Hammer PL, Liu Y, Simeone B, Szedmák S (2004) Saturated systems of homogeneous boxes and the logical analysis of numerical data. Discret Appl Math 144:103–109
Huang J, Ling CX (2005) Using AUC and accuracy in evaluating learning algorithms. IEEE Trans Knowl Data Eng 17:299–310
Isolani L, Bonfiglioli R, Raffi GB, Violante FS (2002) Different case definitions to describe the prevalence of occupational carpal tunnel syndrome in meat industry workers. Int Arch Occup Environ Health 75:229–234
Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, IJCAI 95. Montréal, Québec, Canada
Lee D, van Holsbeeck MT, Janevski PK, Ganos DL, Ditmars DM, Darian VB (1999) Diagnosis of Carpal Tunnel Syndrome: ultrasound versus electromyography. Radiol Clin N Am 37:859–872
Leffler CT, Gozani SN, Nguyen ZQ, Cros D (2000) An automated electrodiagnostic technique for detection of carpal tunnel syndrome. Neurol Clin Neurophysiol 3:2–10
Mitchell T (1997) Machine learning. McGraw-Hill, New York
Mugan J, Truemper K (2008) Discretization of rational data. In: Felici G, Vercellis C (eds) Mathematical methods for knowledge discovery and data mining, chapter I. Information Science Reference, Hershey
Pacek CA, Tang J, Goitz RJ, Kaufmann RA, Li ZM (2010) Morphological analysis of the carpal tunnel. Hand 5:77–81
Papanicolau GD, McCabe SJ, Firrell J (1987) The prevalence and characteristics of nerve compression symptoms in the general population. J Hand Surg 12:712–717
Pardalos PM, Hansen P (2008) Data mining and mathematical programming. American Mathematical Society, Providence
Pardalos PM, Romeijn E (2009) Handbook of optimization in medicine. Springer, New York
Sakai S, Togasaki M, Yamazaki K (2003) A note on greedy algorithms for the maximum independent set problem. Discret Appl Math 126:313–322
Simeone B, Boros E, Ricca F, Spinelli V (2011) Incompatibility graphs in data mining, Department of Statistical Sciences, Sapienza, University of Rome, Technical Report 10/2011 (submitted to Journal of Graph Theory)
Simeone B, Spinelli V (2007) The optimization problem framework for box clustering approach in logic mining. In: Proceedings of Euro XXII-22nd European Conference on Operational Research, Prague
Simeone B, Felici G, Spinelli V (2007) A graph coloring approach for box clustering techniques in logic mining. In: Proceedings of Euro XXII-22nd European Conference on Operational Research, Prague
Spinelli V (2009) Logic mining, box-clustering, and graphs. PhD Thesis, University of Rome, Sapienza
Spinelli V (2014) Classification and pruning in logic mining. Adv Data Anal Classif (submitted)
Spinelli V (2014) Problems and algorithms in Box-Clustering. Adv Data Anal Classif (submitted)
Wu S, Flach P (2005) A scored AUC metric for classifier evaluation and selection. In: Second Workshop on ROC Analysis in ML, Bonn, Germany
Zweig MH, Campbell G (1993) Receiver-Operating Characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clin Chem 39:561–577
Cite this article
Maravalle, M., Ricca, F., Simeone, B. et al. Carpal Tunnel Syndrome automatic classification: electromyography vs. ultrasound imaging. TOP 23, 100–123 (2015). https://doi.org/10.1007/s11750-014-0325-0
DOI: https://doi.org/10.1007/s11750-014-0325-0