Multivariate exploratory data analysis for large databases: an application to modelling firms' innovation using CIS data

  • Juan C. Bou [1] ; Albert Satorra [2] [3]
    1. [1] Universitat Jaume I, Castellón, Spain
    2. [2] Universitat Pompeu Fabra, Barcelona, Spain
    3. [3] BI Norwegian Business School, Norway

  • Published in: Business Research Quarterly, ISSN 2340-9444, ISSN-e 2340-9436, Vol. 22, No. 4, 2019, pp. 275-293
  • Language: English
  • DOI: 10.1016/j.brq.2018.10.001
  • Parallel titles:
    • Análisis de datos exploratorios multivariados para grandes bases de datos: una aplicación para modelar la innovación de las empresas utilizando datos CIS
  • Abstract
    • English

      This paper argues that, when using a large database, organizational researchers would benefit from the use of specific multivariate exploratory data analysis (MEDA) before performing statistical modelling. Issues such as the representativeness of the database across domains (countries or sectors), assessment of confounding among categorical covariates, missing data, dimension reduction to produce performance indicators and/or remedy multi-collinearity problems are addressed by specific MEDA. The proposed MEDA is applied to data from the Community Innovation Survey (CIS), a large database commonly used to analyse firms’ innovation activities, prior to fitting ordered logit and Tobit regression models. A set of recommended practices involving MEDA is proposed throughout the paper.

