An Introduction to Symbolic Data Analysis and its Application to the SODAS Project

  • Diday, Edwin [1]
    1. [1] Université de Paris IX – Dauphine, Place du Maréchal de Lattre de Tassigny, CEREMADE
  • Source: Revista de Matemática: Teoría y Aplicaciones, ISSN 2215-3373, e-ISSN 2215-3373, Vol. 7, No. 1-2, 2000, pp. 1-22
  • Language: English
  • DOI: 10.15517/rmta.v7i1-2.177
  • Abstract
    • English

      The data descriptions of units are called "symbolic" when they are more complex than standard ones because they contain internal variation and are structured. Symbolic data arise from many sources, for instance when huge relational databases are summarised by their underlying concepts. "Extracting knowledge" means obtaining explanatory results, which is why "symbolic objects" are introduced and studied in this paper. They model concepts and constitute an explanatory output for data analysis. Moreover, they can be used to define queries on a relational database and to propagate concepts between databases. We define "Symbolic Data Analysis" (SDA) as the extension of standard data analysis to symbolic data tables as input, in order to find symbolic objects as output. In this paper we give an overview of recent developments in SDA. We present some tools and methods of SDA and introduce the SODAS software prototype (the result of work by 17 teams from nine countries involved in a European project of EUROSTAT).
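
      The idea in the abstract can be illustrated with a small sketch. It is a deliberately simplified assumption, not the paper's formalism and not the SODAS software: the field names, the `SymbolicDescription` class and the helper functions are hypothetical. Individual records are summarised into a description that keeps internal variation (an interval and a set of categories), and that description is then reused as a query over the original records.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class SymbolicDescription:
    """Hypothetical description with internal variation:
    an interval for age and a set of admissible cities."""
    age: Tuple[float, float]
    city: FrozenSet[str]


def summarise(rows):
    """Generalise individual records into one symbolic description."""
    ages = [r["age"] for r in rows]
    return SymbolicDescription(
        age=(min(ages), max(ages)),
        city=frozenset(r["city"] for r in rows),
    )


def matches(desc, row):
    """True when a record falls inside the description."""
    low, high = desc.age
    return low <= row["age"] <= high and row["city"] in desc.city


def extent(desc, rows):
    """Use the description as a query: the records it covers."""
    return [r for r in rows if matches(desc, r)]


# Made-up records, for illustration only.
records = [
    {"name": "a", "group": "students", "age": 19, "city": "Paris"},
    {"name": "b", "group": "students", "age": 24, "city": "Lyon"},
    {"name": "c", "group": "retired", "age": 70, "city": "Paris"},
    {"name": "d", "group": "employed", "age": 22, "city": "Lyon"},
]

students = [r for r in records if r["group"] == "students"]
students_desc = summarise(students)
student_like = extent(students_desc, records)
```

      In this sketch the description of the "students" concept covers a record from another group as well (record "d"), which shows that a description used as a query can retrieve more units than those it was built from.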
