Institute on Grid Information, Resource and Workflow Monitoring Services
Institute leader: Norbert Meyer (meyer man.poznan.pl), PSNC

The primary objective of this institute is the development of general and scalable approaches to an information and monitoring infrastructure for large-scale heterogeneous Grids. The resources on a Grid are heterogeneous and under the control of different entities. Integrating these resources and their related services in a useful way is impossible without access to relevant information about their accessibility and state. This information must also be properly collected, merged, filtered, and delivered, whether to other services such as resource brokers or to end users and their programs. In addition to the resource- and service-oriented point of view, job-centred collection of information is important for a better understanding of the general state and behaviour of a Grid.

Current Grid information and monitoring frameworks have identifiable drawbacks: they are either too focused on specific aspects or do not scale well enough. The performance of these infrastructures is also unsatisfactory, especially when security and reliability are required. The institute will focus its research on understanding the reasons for these limitations and on finding models and frameworks to overcome them. In close collaboration with the institute on system architectures, appropriate architectures for scalable and dependable information and monitoring infrastructures will be investigated and deployed. The possible convergence of currently distinct approaches to information services and monitoring services will also be taken into account, with the aim of identifying a unified framework.

The information provided by the monitoring services will be used to gain a better understanding of Grid behaviour. However, the current lack of understanding of Grid performance in general, together with the absence of a generally accepted set of metrics for evaluating it, makes Grid evaluation and performance comparison impossible. The institute will therefore focus on the development of new Grid performance models that provide the means and tools for evaluating services deployed on the Grid.

Complex job workflows represent another challenge, as the monitoring information must be gathered synchronously from many different sources and appropriately processed to provide a coherent view (state information) of the whole workflow and its components. The job workflow itself must be extracted from programming models, and the monitoring and information services must be tightly coupled with job checkpointing and migration support to provide an environment in which even complex job workflows can be easily deployed, executed, and monitored. Models and methods for providing a virtualized end-user account system are a specific part of the combined job flow support and information services.

Roadmap version 3 on Grid Information, Resource and Workflow Monitoring Services
Publications related to the Institute on Grid Information, Resource and Workflow Monitoring Services
Research Group | Participants
Network Monitoring System | INFN, FORTH
Integration of TCKPT and PSNC checkpointer | PSNC, SZTAKI
Storage functionality for distributed checkpointing | UCO, PSNC
Workflow description languages using high-level Petri nets | FhG, WWU Muenster
Compatibility and conversion of different Grid workflow description languages | FhG, INFN, UNICAL, WWU Muenster
Fault tolerance in Grid workflow execution | SZTAKI, UoW
Extending the SEAGRIN semantic overlay Grid infrastructure with the collaborative workflow management support of the P-GRADE portal | MU, SZTAKI
User management and accounting framework | MU, PSNC
Latest Research Highlights
Availability-based Resource Selection Risk Analysis in the Grid, CoreGRID Technical Report TR-0169 [pdf]:
Resources in the Grid exhibit different availability properties and patterns over time, mainly due to their administrators' policies for the Grid and the different domains to which they belong, e.g. non-dedicated desktop Grids, on-demand systems, P2P systems, etc. This diversity in availability properties makes availability-aware resource selection, for applications with different fault tolerance capabilities, a challenging problem. To address this problem, we introduce new availability metrics for comparing resource availability. We further predict the availability of resources by considering their availability policies. We introduce a new resource availability predictor based on pattern matching, using availability pattern recognition and classification for resource instance and duration availability, and compare it with other methods. Notably, we are able to achieve an average accuracy of more than 80% in our predictions.

Dynamicity in Scientific Workflows, CoreGRID Technical Report TR-0162 [pdf]:
Dynamicity is a recurrent topic in traditional business workflow systems. The need for, and feasibility of, making changes to workflow process instances while they are being executed has been a major (and to a large extent still unsolved) challenge. More recently, the scientific workflow domain has also paid attention to this topic, and some current scientific workflow management systems offer some support for dynamicity. In general, there is common agreement that dynamicity is an intrinsic requirement for scientific workflows, but the understanding of the real needs and of the functionality to be provided remains unclear. This report aims to improve that understanding by analyzing dynamicity scenarios, requirements, and proposals in scientific workflows. First, five general scenarios involving different dynamicity needs are described through concrete examples. Then, these scenarios are used to identify a set of dynamicity requirements for scientific workflow support. Finally, a review of current well-known scientific workflow execution systems is presented, focusing on their proposals for supporting dynamicity.

Grid Checkpointing Service - integration of low-level checkpointing packages with the Grid environment, CoreGRID Technical Report TR-0159 [pdf]:
Job checkpointing is the technology that most significantly supports load-balancing and fault-tolerance capabilities in Grids. Nevertheless, contemporary Grid environments distinctly lack the ability to integrate with low-level process checkpointers. At present, some Grids support checkpointing only for applications that implement this functionality internally and additionally adhere to an imposed interface. In contrast, this paper describes the Grid Checkpointing Service (GCS), a prototype design and implementation of a Grid-level service that makes it possible to use low-level or legacy checkpointing packages in Grids. The presented service is a proof-of-concept implementation of one part of the Grid Checkpointing Architecture (GCA). However, the way the GCS was implemented allows it to be installed and used independently of the other parts of the GCA.