Scientific workflows have emerged as a technology that provides computational support for scientific experiments. On the one hand, a workflow can be seen as a high-level specification of a set of tasks and the dependencies among them, which indicate the steps that must be completed in order to accomplish a scientific experiment. On the other hand, a workflow can be seen as a computer program, and a workflow system as a specialised programming environment that reduces the programming effort required of scientists to undertake a computational science experiment.
Scientific workflows must be designed with the necessary levels of flexibility and dynamism, and must adapt to change, in order to address the requirements of scientific experiments and to cope with the challenges imposed by the infrastructures used. The context of a scientific experiment evolves continuously as the exploration process proceeds: a scientist may be interested in exploring a parameter space or in undertaking multiple what-if scenarios. Additionally, scientific experiments often exploit computational resources that are distributed, heterogeneous and autonomous, leading to highly unreliable environments. Although many different techniques have been adopted and scientific workflow systems support dynamism in several ways, the issue of adding new portions to a workflow at any time during execution has not been addressed satisfactorily.
In this thesis, we developed a scientific workflow engine, DVega, based on Reference nets, with special emphasis on the flexibility and dynamism requirements. Reference nets are a particular type of Petri net that can more effectively provide the abstractions needed to express and support hierarchical workflows and their dynamic adaptability. The architecture of the system is built upon service-oriented principles, so that the coupling between the system and the resources in the environment is minimised. Reference net-based workflows in DVega have a hierarchical structure and exploit the well-known operators sequence, parallel, choice and iteration for composing tasks. In order to cope with changes that arise in the environment, we proposed an exception handling technique which either propagates exceptions through the hierarchy or replaces a sub-workflow in the hierarchy with an alternative one, without affecting the rest of the workflow structure. The exception handling technique is also combined with a workflow checkpointing technique which is appropriate for data-intensive applications.
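The propagate-or-replace behaviour described above can be illustrated with a small sketch. This is not DVega's implementation and does not use Reference nets: it is a plain Python approximation in which the names `Node`, `TaskFailure`, `ok` and `fail` are hypothetical, only the sequence operator is modelled, and a failing node either re-raises to its parent or hands over to an alternative sub-workflow if one is attached.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


class TaskFailure(Exception):
    """Hypothetical exception: a task could not complete in the environment."""


@dataclass
class Node:
    """A workflow node: a leaf task (action set) or a sequence of children."""
    name: str
    action: Optional[Callable[[], str]] = None
    children: List["Node"] = field(default_factory=list)
    alternative: Optional["Node"] = None  # replacement sub-workflow, if any

    def run(self, log: List[str]) -> None:
        try:
            if self.action is not None:
                log.append(self.action())
            else:
                for child in self.children:
                    child.run(log)
        except TaskFailure:
            if self.alternative is None:
                raise  # no local handler: propagate to the parent node
            # Replace only this sub-workflow; siblings are left untouched.
            log.append(f"replace {self.name}")
            self.alternative.run(log)


def ok(label: str) -> Callable[[], str]:
    """Build a task that succeeds and returns its label."""
    return lambda: label


def fail(label: str) -> Callable[[], str]:
    """Build a task that always raises TaskFailure."""
    def _task() -> str:
        raise TaskFailure(label)
    return _task


analysis = Node(
    "analysis",
    children=[Node("fetch", action=fail("fetch"))],
    alternative=Node("analysis_alt", action=ok("fetch-mirror")),
)
root = Node("root", children=[
    Node("prepare", action=ok("prepare")),
    analysis,
    Node("report", action=ok("report")),
])

log: List[str] = []
root.run(log)
print(log)  # ['prepare', 'replace analysis', 'fetch-mirror', 'report']
```

In this sketch the failure raised by `fetch` propagates one level, is handled at `analysis` by switching to the alternative, and the remaining steps of `root` proceed as if nothing had happened.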
Different checkpointing schemes have been developed at various levels: task level and workflow level. At workflow level, the usual approach is to establish a checkpointing frequency in the system which determines the moment at which a global workflow checkpoint (a snapshot of the whole workflow enactment state during normal, failure-free execution) has to be taken. We developed an alternative workflow-level checkpointing scheme and its corresponding rollback recovery process for hierarchical scientific workflows, in which every workflow node in the hierarchy takes its own local checkpoint autonomously and in an uncoordinated way after its enactment.
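To make the idea of local, uncoordinated checkpoints concrete, the following is a minimal sketch under strong simplifying assumptions, not the scheme developed in the thesis. Each node stores its own result in a dictionary (`store`, standing in for persistent storage) once its enactment finishes, and recovery consists of re-running the workflow so that nodes with a checkpoint are skipped. The names `CheckpointedNode`, `store`, `expensive` and `flaky` are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence


class CheckpointedNode:
    """A workflow node that checkpoints its own result after enactment."""

    def __init__(self, name: str,
                 action: Optional[Callable[[], str]] = None,
                 children: Sequence["CheckpointedNode"] = ()):
        self.name = name
        self.action = action
        self.children = list(children)

    def run(self, store: Dict[str, List[str]], path: str = "") -> List[str]:
        key = f"{path}/{self.name}"
        if key in store:
            # Recovery: reuse the local checkpoint instead of re-enacting.
            return store[key]
        if self.action is not None:
            result = [self.action()]
        else:
            result = []
            for child in self.children:
                result.extend(child.run(store, key))
        # Local checkpoint, taken only after this node has completed,
        # without coordinating with any other node.
        store[key] = result
        return result


calls = {"expensive": 0, "flaky": 0}


def expensive() -> str:
    calls["expensive"] += 1
    return "data"


def flaky() -> str:
    calls["flaky"] += 1
    if calls["flaky"] == 1:
        raise RuntimeError("resource unavailable")  # fails on first attempt
    return "result"


root = CheckpointedNode("root", children=[
    CheckpointedNode("stage1", children=[CheckpointedNode("expensive", expensive)]),
    CheckpointedNode("flaky", flaky),
])

store: Dict[str, List[str]] = {}
try:
    root.run(store)          # first enactment fails at "flaky"
except RuntimeError:
    pass
output = root.run(store)     # recovery: "stage1" is restored, not re-run
print(output, calls)
```

After the failed first enactment, only the completed nodes hold checkpoints, so the second run restores `stage1` from the store and re-executes just the task that failed.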
Finally, workflow systems provide support for combining components to achieve a particular outcome. As components used within a workflow may be implemented by third parties, it is often necessary to determine the impact that a particular component composition will have on the overall execution of a workflow. This can also be useful for selecting an alternative sub-workflow when a replace action must be taken in response to an exception. In this thesis, a method for predicting the execution time of a given workflow, also based on Reference nets, is proposed.
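As a rough illustration of how an execution time estimate can follow the hierarchical structure of a workflow, the sketch below combines per-task durations using the four composition operators named earlier. It is a simple analytic approximation, not the Reference net-based method of the thesis, and it rests on assumptions that are not in the source: task durations are known constants, a sequence sums its parts, a parallel block takes as long as its slowest branch, a choice is weighted by assumed branch probabilities, and an iteration repeats its body a fixed number of times. The tuple encoding and the function name `estimate` are hypothetical.

```python
from typing import Any, Tuple

Workflow = Tuple[Any, ...]  # hypothetical encoding: (kind, payload...)


def estimate(node: Workflow) -> float:
    """Estimate execution time by recursing over the workflow hierarchy."""
    kind = node[0]
    if kind == "task":
        return float(node[1])                      # assumed known duration
    if kind == "sequence":
        return sum(estimate(child) for child in node[1])
    if kind == "parallel":
        return max(estimate(child) for child in node[1])
    if kind == "choice":
        # node[1] is a list of (probability, branch) pairs (assumed given).
        return sum(p * estimate(branch) for p, branch in node[1])
    if kind == "iteration":
        # node[1] is an assumed fixed repetition count, node[2] the body.
        return node[1] * estimate(node[2])
    raise ValueError(f"unknown workflow node kind: {kind!r}")


workflow = ("sequence", [
    ("task", 2.0),
    ("parallel", [("task", 3.0), ("task", 5.0)]),
    ("choice", [(0.25, ("task", 8.0)), (0.75, ("task", 4.0))]),
    ("iteration", 3, ("task", 1.0)),
])

total = estimate(workflow)
print(total)  # 15.0
```

The same function could be applied to each candidate sub-workflow to compare alternatives before a replacement is chosen. With variable task durations, taking the maximum of per-branch estimates only approximates the expected time of a parallel block.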