Parallel Applications Technologies Group

Workshop on Web and Grid Services for Scientific Data Analysis (WAGSSDA)

Held in conjunction with:
2005 International Conference on Parallel Processing (ICPP-2005)
Oslo, Norway
June 14, 2005

Abstracts:

Keynote:
Web Services for building Scientific Applications - Searching for White Dwarfs, Savas Parastatidis

The Web Services Grid Application Framework (WS-GAF) project (Jan 2004 - Jan 2005) aimed to demonstrate the value of using standard, widely-accepted, well-supported Web Services technologies for scientific and commercial Internet-scale (a.k.a. "Grid") applications. The scientific application developed is a tool aimed at astronomers who wish to combine and analyse information from the SuperCOSMOS (UK) and Sloan Digital Sky Survey (US) scientific archives. This presentation will discuss the WS-GAF approach to building Internet-scale applications, the steps followed in creating a tool for scientists, and the implementation challenges and solutions.

GridAssist, a User Friendly Grid-based Workflow Management Tool, Mark ter Linden, Hans de Wolf, Ruud Grim

This paper describes GridAssist, a user-friendly Grid-based workflow management tool that lets users execute workflows in a Grid environment while hiding the underlying technology. Two cases where the tool is now in use are described: processing of astronomy data and processing of Earth Observation data.

Web Services Composition for Distributed Data Mining, Ali Shaikh Ali, Omer F. Rana and Ian J. Taylor

A Web Services-based toolkit for supporting distributed data mining is presented. A workflow engine is provided within the toolkit to enable a user to compose Web Services to implement particular point solutions. Three types of Web Services are provided to implement data mining functions: (1) classifiers, (2) clustering algorithms, and (3) association rules. Additional capability is made available through GNUPlot and Mathematica to enable visualisation of the output. Data sets may be read from the local filespace, or streamed from a remote location (provided the algorithm being used has support for streaming). Two case studies are presented to illustrate the use of the toolkit.

Matchmaking, Datasets and Physics Analysis, Heinz Stockinger, Flavia Donno, Giulio Eulisse, Mirco Mazzucato, Conrad Steenberg

Grid-enabled physics analysis requires a Workload Management System (WMS) that finds suitable computing resources to execute data-intensive jobs. A typical example is the WMS available in the LCG2 (also referred to as EGEE-0) software system, used by several scientific experiments. Like many other current Grid systems, LCG2 provides file-level granularity for accessing and analysing data. However, application scientists such as high-energy physicists often require a higher level of abstraction for accessing data, i.e. they prefer to use datasets rather than files in their physics analysis.

We have improved the current WMS (in particular the Matchmaker) to allow physicists to express their analysis job requirements in terms of datasets. This required modifications to the WMS and its interface to potential data catalogues. As a result, we propose a simple Data Location Interface that is based on a web service approach and allows for interoperability of the WMS with new dataset and file catalogues. We took a particular High Energy Physics experiment as the source for our study and show that physics analysis can be improved by our modifications to the current Grid system.
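
As a rough illustration of matching jobs on datasets rather than individual files, the sketch below resolves a dataset name to the sites that hold all of its files and then filters a list of candidate sites. Every name in it (`DataLocationInterface`, `list_sites`, `InMemoryCatalogue`, `match_sites`, and the sample dataset) is hypothetical and chosen for illustration; none is taken from the LCG2 WMS or from the interface proposed in the paper.

```python
from typing import Protocol


class DataLocationInterface(Protocol):
    """Hypothetical catalogue-facing interface (an assumption, not the paper's API)."""

    def list_sites(self, dataset: str) -> set[str]:
        """Return the sites that hold a complete copy of the dataset."""
        ...


class InMemoryCatalogue:
    """Toy catalogue: maps dataset -> file name -> sites holding a replica."""

    def __init__(self, replicas: dict[str, dict[str, set[str]]]) -> None:
        self._replicas = replicas

    def list_sites(self, dataset: str) -> set[str]:
        files = self._replicas.get(dataset, {})
        if not files:
            return set()
        # A site qualifies only if it holds every file of the dataset.
        return set.intersection(*(set(sites) for sites in files.values()))


def match_sites(dli: DataLocationInterface, dataset: str,
                candidate_sites: list[str]) -> list[str]:
    """Keep the candidate sites that already hold the requested dataset."""
    holders = dli.list_sites(dataset)
    return sorted(site for site in candidate_sites if site in holders)


# Hypothetical sample data: one dataset made of two files.
catalogue = InMemoryCatalogue({
    "run42": {"f1.root": {"siteA", "siteB"}, "f2.root": {"siteB", "siteC"}},
})
matched = match_sites(catalogue, "run42", ["siteA", "siteB", "siteC"])
```

Because the matchmaker in this sketch depends only on the small interface, a different dataset or file catalogue could be substituted without changing `match_sites`, which is the kind of interoperability the abstract describes.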

How to Run Scientific Applications Over Web Services, Diego Puppin, Nicola Tonellotto, and Domenico Laforenza

Today, the task of running and coordinating a scientific application across several administrative domains is extremely complex. As an example, the most popular tool for scientific applications, MPI, is not designed to address firewall limitations or data heterogeneity, even if its extensions deal with some of these problems.

In this paper, we design a new approach to run a scientific application in a distributed environment, when data and computing power are scattered across the Web: Web Services can be used to tunnel computation and data migration.

We show that a very simple mapping exists between MPI primitives and the Web Service infrastructure. We are currently designing a framework, based on Web Services, which will implement the main MPI primitives: this way an MPI application could be run on any platform supporting Web Services.
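
The abstract does not spell out the mapping, so the following is only a minimal sketch of one way point-to-point send and receive could be expressed as calls on a relay service. `MailboxService`, `Rank`, `post`, and `fetch` are hypothetical names, the service is simulated in-process rather than exposed over HTTP or SOAP, and the receive here is non-blocking, unlike a standard blocking MPI receive.

```python
from collections import defaultdict, deque
from typing import Optional


class MailboxService:
    """In-process stand-in for a Web Service endpoint that relays messages."""

    def __init__(self) -> None:
        # (dest, source, tag) -> FIFO queue of pending payloads
        self._queues = defaultdict(deque)

    def post(self, source: int, dest: int, tag: int, payload: bytes) -> None:
        # In a real deployment this would be a remote service request.
        self._queues[(dest, source, tag)].append(payload)

    def fetch(self, dest: int, source: int, tag: int) -> Optional[bytes]:
        queue = self._queues[(dest, source, tag)]
        return queue.popleft() if queue else None


class Rank:
    """Hypothetical process handle exposing send/recv on top of the service."""

    def __init__(self, rank: int, service: MailboxService) -> None:
        self.rank = rank
        self.service = service

    def send(self, payload: bytes, dest: int, tag: int = 0) -> None:
        self.service.post(self.rank, dest, tag, payload)

    def recv(self, source: int, tag: int = 0) -> Optional[bytes]:
        # Returns None when nothing is pending (non-blocking in this sketch).
        return self.service.fetch(self.rank, source, tag)


service = MailboxService()
rank0, rank1 = Rank(0, service), Rank(1, service)
rank0.send(b"hello", dest=1)
received = rank1.recv(source=0)
```

Since all traffic passes through the service endpoint, neither rank needs a direct connection to the other, which is the firewall-related motivation the abstract gives.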

Heterogeneous Relational Databases for a Grid-enabled Analysis Environment, Arshad Ali, Ashiq Anjum, Tahir Azim, Julian Bunn, Saima Iqbal, Richard McClatchey, Harvey Newman, S. Yousaf Shah, Tony Solomonides, Conrad Steenberg, Michael Thomas, Frank van Lingen, Ian Willers

Grid based systems require a database access mechanism that can provide seamless homogeneous access to the requested data through a virtual data access system, i.e. a system which can take care of tracking the data that is stored in geographically distributed heterogeneous databases. This system provides an integrated view of the data that is stored in the different repositories by using a virtual data access mechanism, i.e. a mechanism that can hide the heterogeneity of the backend databases from the client applications.

This paper focuses on accessing the data which is stored in disparate relational databases through a web service interface and exploits the state-of-the-art features of a Data Warehouse and Data Marts. We present a middleware that enables applications to access data that is stored in geographically distributed relational databases without being aware of their physical locations and underlying schema. A web service interface is provided to enable applications to access this middleware in a language and platform independent way. A plug-in for the Java Analysis Studio (JAS) was developed to submit complex queries for accessing the data and visualizing the retrieved results as histograms. A prototype implementation was created based on Clarens [4], Unity [7] and POOL [8]. Intensive tests were carried out against the proposed features of the middleware. This ability to access the data stored in the distributed relational databases transparently is likely to be a very powerful one for Grid users, especially for the scientific community wishing to collate and analyze data distributed over the Grid.

The Clarens Web Service Framework for Distributed Scientific Analysis in Grid Projects, Frank van Lingen, Conrad Steenberg, Michael Thomas, Ashiq Anjum, Tahir Azim, Harvey Newman, Arshad Ali, Julian Bunn, Iosif Legrand

Large scientific collaborations are moving towards service-oriented architectures for implementation and deployment of globally distributed systems. Clarens is a lightweight, high-performance, easy-to-deploy web service framework that supports the construction of such globally distributed systems. This article discusses some of the core functionality of Clarens that the authors believe is important for building distributed systems, and how Clarens is used within several projects aimed at supporting scientific analysis for large collaborations.

Resource Management Services for a Grid Analysis Environment, Arshad Ali, Ashiq Anjum, Tahir Azim, Julian Bunn, Atif Mehmood, Richard McClatchey, Harvey Newman, Waqas ur Rehman, Conrad Steenberg, Michael Thomas, Frank van Lingen, Ian Willers, Muhammad Adeel Zafar

Selecting optimal resources for submitting jobs on a computational Grid or accessing data from a data grid is one of the most important tasks of any Grid middleware. Most modern Grid software takes on this responsibility and makes a best-effort attempt at it. Almost all decisions regarding scheduling and data access are made by the software automatically, giving users little or no control over the process and disregarding any user preferences. To solve this problem, a more interactive set of services and middleware is desired that provides users with more information about Grid weather and gives them more control over the decision-making process. This paper presents a set of services that have been developed to provide more interactive resource management capabilities within the Grid Analysis Environment (GAE) being developed collaboratively by Caltech, NUST and several other institutes. These include a steering service, a job monitoring service and an estimation service that have been designed and written on top of a common Grid-enabled Web Services framework named Clarens. The paper also presents a performance analysis of the developed services to show that they have indeed resulted in a more interactive and powerful system for user-centric Grid-enabled physics analysis.

Flexible Authentication and Authorization Architecture for Grid Computing, Hyunjoon Jung, Hyuck Han, Hyungsoo Jung, Heon Y. Yeom

The Globus Toolkit makes it easy and convenient for grid users to develop and deploy grid services. As for security, however, the current Globus Toolkit provides only static authentication and a coarse-grained authorization mechanism. In this paper we address the limitations of the current security mechanism in the Globus Toolkit and propose a new architecture that provides a fine-grained and flexible security mechanism. To implement this without modifying existing components, we make use of the Aspect-Oriented Programming technique.
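
To illustrate the general idea of adding an authorization check around an existing operation without editing it, the sketch below uses a Python decorator as a loose analogy for aspect weaving. It is not the architecture proposed in the paper and does not use any Globus Toolkit API; `POLICY`, `authorized`, `submit_job`, and the user names are all hypothetical.

```python
import functools

# Hypothetical fine-grained policy: the (user, operation) pairs that are allowed.
POLICY = {("alice", "submit_job"), ("bob", "query_status")}


class AuthorizationError(Exception):
    """Raised when a user is not permitted to perform an operation."""


def authorized(operation):
    """Wrap a function with a per-operation authorization check (the 'advice')."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(user, *args, **kwargs):
            if (user, operation) not in POLICY:
                raise AuthorizationError(f"{user} may not perform {operation}")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


def submit_job(user, job):
    """Existing component: contains no security logic of its own."""
    return f"{job} submitted by {user}"


# The check is attached from outside; the body of submit_job is left unchanged.
submit_job = authorized("submit_job")(submit_job)
result = submit_job("alice", "job-1")
```

Changing who may call which operation then means editing `POLICY` only, which is one way to read the "flexible" and "fine-grained" goals stated in the abstract.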

This page, http://pat.jpl.nasa.gov/public/WAGSSDA/abstracts.html, is maintained by Daniel S. Katz and was last modified Tuesday, 17-May-2005 14:45:16 PDT.