Ir al contenido

Documat


Index structures for distributed text databases

  • Autores: Mauricio Marin Caihuan Árbol académico
  • Localización: Journal of Computer Science and Technology, ISSN-e 1666-6038, Vol. 4, Nº. 1, 2004 (Ejemplar dedicado a: Tenth Issue), págs. 1-12
  • Idioma: inglés
  • Enlaces
  • Resumen
    • The Web has became an obiquitous resource for distributed computing making it relevant to investigate new ways of providing efficient access to services available at dedicated sites. Efficiency is an ever-increasing demand which can be only satisfied with the development of parallel algorithms which are efficient in practice. This tutorial paper focuses on the design, analysis and implementation of parallel algorithms and data structures for widely-used text database applications on the Web. In particular we describe parallel algorithms for inverted files and suffix arrays structures that are suitable for implementing search engines. Algorithmic design is effected on top of the BSP model of parallel computing. This model ensures portability across diverse parallel architectures ranging from clusters to super-computers.

  • Referencias bibliográficas
    • References [1] M.D. Araujo, G. Navarro, and N. Ziviani. Large text searching allowing errors. In Workshop on String Processing (WSP’97),...
    • [2] C. Badue, R. Baeza-Yates, B. Ribeiro, and N. Ziviani. Distributed query processing using partitioned inverted files. In Eighth Symposium...
    • [3] R. Baeza and B. Ribeiro. Modern Information Retrieval. Addison-Wesley., 1999.
    • [4] R. Baeza-Yates, A. Moffat, and G. Navarro. Handbook of Massive Data Sets. Kluwer Academic Publishers, 2002. Chapter 7, pages 195–243.
    • [5] N. Barghouti and G. Kaiser. Concurrency control in advanced database applications. ACM Computing Surveys, 23(3), September 1991.
    • [6] B. K. Bhargava. Concurrency control in database systems. Knowledge and Data Engineering, 11(1):3–16, 1999.
    • [7] S. Blott and H. Korth. An almost-serial protocol for transaction execution in main-memory database systems. In 28th International Conference...
    • [8] N. Brisaboa, E. Iglesias, G. Navarro, and J. Paramá. An efficient compression code for text databases. In Proceedings of the 25th European...
    • [9] S.H. Chung, H.C. Kwon, K.R. Ryu, H.K. Jang, J.H. Kim, and C.A. Choi. Parallel information retrieval on a sci-based pc-now. In Workshop...
    • [10] E. Silva de Moura, G. Navarro, N. Ziviani, and R. Baeza-Yates. Fast and flexible word searching on compressed text. ACM Transactions...
    • [11] Daniela Florescu, Alon Y. Levy, and Alberto O. Mendelzon. Database techniques for the world-wide web: A survey. SIGMOD Record, 27(3):59–74,...
    • [12] D.R. Jefferson. Virtual time. ACM Trans. Prog. Lang. and Syst., 7(3):404–425, July 1985.
    • [13] M. Kamath and K. Ramamritham. Efficient transaction management and query processing in massive digital databases. Technical Report UM-CS-1995-093,...
    • [14] J. Kitajima and G. Navarro. A fast distributed suffix array generation algorithm. In 6th International Symposium on String Processing...
    • [15] M. Marín. Time Warp on BSP Computers. In 12th European Simulation Multiconference, June 1998.
    • [16] M. Marín. Parallel text query processing using Composite Inverted Lists. In Second International Conference on Hybrid Intelligent Systems...
    • [17] M. Marín and G. Navarro. Distributed query processing using suffix arrays. In Int. Conf. on String Processing and Information Retrieval,...
    • [18] M. Marín and G. Navarro. Suffix arrays in parallel. In International Conference on Parallel Processing (EuroPar’03), Lecture Notes in...
    • [19] A.A. MacFarlane, J.A. McCann, and S.E. Robertson. Parallel search using partitioned inverted files. In 7th International Symposium on...
    • [20] W.F. McColl. General purpose parallel computing. In A.M. Gibbons and P. Spirakis, editors, Lectures on Parallel Computation, pages 337–391....
    • [21] A. Moffat. Word-based text compression. Software – practice and experience, 19(2):185–198, 1989.
    • [22] A. Moffat and A. Turpin. On the implementation of minimum-redundancy prefix codes. In Data Compression Conference, pages 170–179, 1996.
    • [23] G. Navarro, E. Silva de Moura, M. Neubert, N. Ziviani, and R. Baeza-Yates. Adding compression to block addressing inverted indexes. Information...
    • [24] G. Navarro, J. Kitajima, B. Ribeiro, and N. Ziviani. Distributed generation of suffix arrays. In 8th Annual Symposium on Combinatorial...
    • [25] M. Persin, J. Zobel, and R. Sacks-Davis. Filtered document retrieval with frequency-sorted indexes. Journal of the American Society for...
    • [26] B. Ribeiro-Neto, J. Kitajima, G. Navarro, C. Santana, and N. Ziviani. Parallel generation of inverted lists for distributed text collections....
    • [27] B.A. Ribeiro-Neto and R.A. Barbosa. Query performance for tightly coupled distributed digital libraries. In Third ACM Conference on Digital...
    • [28] S.Dandamudi and J.Jain. Architectures for parallel query processing on networks of workstation. In 1997 International Conference on Parallel...
    • [29] D.B. Skillicorn, J.M.D. Hill, and W.F. McColl. Questions and answers about BSP. Technical Report PRG-TR-15-96, Computing Laboratory,...
    • [30] D.B. Skillicorn and D. Talia. Models and languages for parallel computation. ACM Computing Surveys, 20(2):123–169, June 1998.
    • [31] T. Tamura, M. Oguchi, and M. Kitsuregawa. Parallel database processing on a 100 node pc cluster: Cases for decision support query processing...
    • [32] A. Tomasic and H. Garcia-Molina. Performance of inverted indices in shared-nothing distributed text document information retrieval systems....
    • [33] URL. BSP and Worldwide Standard, http://www.bsp-worldwide.org/.
    • [34] URL. BSP PUB Library at Paderborn University, http://www.uni-paderborn.de/bsp.
    • [35] L.G. Valiant. A bridging model for parallel computation. Comm. ACM, 33:103–111, Aug. 1990.
    • [36] C. Yu and C. Chang. Distributed query processing. ACM Computing Surveys, 16(4):399–433, December 1984

Fundación Dialnet

Mi Documat

Opciones de artículo

Opciones de compartir

Opciones de entorno