From the first distributed databases to modern replication systems, the research community has proposed multiple protocols to manage data distribution and replication, along with concurrency control algorithms to handle the transactions running at every system node. Many protocols are thus available, each with different features and performance, and each guaranteeing a different consistency level. To determine which replication protocol is the most appropriate, two aspects must be considered: the required level of consistency and isolation (i.e., the correctness criterion), and the properties of the system (i.e., the scenario), which determine the achievable performance.
Regarding correctness criteria, one-copy serializability is broadly accepted as the highest level of correctness. However, its definition allows different interpretations regarding replica consistency. In this thesis, we establish a correspondence between memory consistency models, as defined in the scope of distributed shared memory, and possible levels of replica consistency, thus defining new correctness criteria that correspond to the identified interpretations of one-copy serializability.
Once the correctness criterion has been selected, the achievable performance of a system depends heavily on the scenario, i.e., the combination of the system environment and the applications running on it. For the administrator to select a proper replication protocol, the available protocols must be thoroughly understood. A good description of each candidate is fundamental, but a common ground is also needed to compare the different options and to estimate their performance in the given scenario. This thesis proposes a precise characterization model that allows us to decompose algorithms into individual interactions between significant system elements, to define some underlying properties, and to associate each interaction with a specific policy that governs it. We later use this model as the basis for a historical study of the evolution of database replication techniques, thus providing an exhaustive survey of the principal existing systems.
Although a specific replication protocol may be the best option for a certain scenario, systems are dynamic and heterogeneous, so it is difficult for a single protocol to remain the proper choice: it may degrade or become unable to meet all requirements. In this thesis, we propose a metaprotocol that supports several replication protocols, which may follow different replication techniques and provide different isolation levels. With this metaprotocol, replication protocols can either work concurrently on the same data or be sequenced to adapt to dynamic environments.
Finally, we consider integrity constraints, which are extensively used in databases to define semantic properties of data but are often overlooked in replicated databases. We analyze the potential problems this may cause and provide simple guidelines for extending a protocol so that it detects and properly manages aborts due to integrity violations.