DEISA Services from the User's Perspective

The purpose of the DEISA Unified Services is to enable developers and users of challenging supercomputing applications to manage their applications, compute jobs, and input and output data on the Europe-wide distributed HPC infrastructure in nearly the same way as they are used to at a single national HPC site. As far as possible, these services hide the underlying heterogeneous computing, storage and administrative infrastructure.

Data Management Service

The backbone of the Data Management Service is currently IBM's Multicluster General Parallel File System (MC-GPFS), which is distributed over various HPC sites in Europe and uses the high-speed network provided by GEANT2. Almost all DEISA sites are integrated into this DEISA-wide service, which enables users to access and organize their data directories transparently from every DEISA site.

Because the DEISA computing environment evolves continuously (hardware architectures are replaced and system software is upgraded), new compute platforms cannot always be integrated into the MC-GPFS infrastructure, at least not immediately after going into production. The Globus GridFTP service is therefore provided as an alternative for transferring huge data volumes efficiently between the HPC sites over the DEISA internal network. Usage of the GridFTP service has been unified in the sense that each DEISA site lets users address the GridFTP server at another site through a common interface. Hence, the user does not need to know details such as host names or port numbers.
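The idea of addressing a remote GridFTP server by site name rather than by host and port can be sketched as follows. This is only an illustration, not the DEISA interface itself: the site names, host names and helper functions are hypothetical, and port 2811 is simply the conventional GridFTP default. The sketch builds a `globus-url-copy` command line without running it.

```python
# Hypothetical site registry: the host names are placeholders, not real
# DEISA endpoints. 2811 is the conventional default GridFTP port.
GRIDFTP_SERVERS = {
    "siteA": ("gridftp.site-a.example.org", 2811),
    "siteB": ("gridftp.site-b.example.org", 2811),
}


def gridftp_url(site: str, path: str) -> str:
    """Translate a (site, absolute path) pair into a gsiftp:// URL."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    host, port = GRIDFTP_SERVERS[site]
    return f"gsiftp://{host}:{port}{path}"


def transfer_command(src_site: str, src_path: str,
                     dst_site: str, dst_path: str,
                     streams: int = 4) -> list[str]:
    """Build (but do not execute) a globus-url-copy invocation.

    The -p option requests the given number of parallel data streams.
    """
    return [
        "globus-url-copy",
        "-p", str(streams),
        gridftp_url(src_site, src_path),
        gridftp_url(dst_site, dst_path),
    ]


if __name__ == "__main__":
    print(" ".join(transfer_command("siteA", "/data/run1/out.dat",
                                    "siteB", "/data/run1/out.dat")))
```

With such a lookup in place, the user only ever names the sites; the registry carries the host and port details.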

As an integral part of the Data Management Service, every DEISA user has a personal file space on the MC-GPFS that is accessible from every DEISA compute node. The site-specific path to this file space is specified in a unified manner by means of a naming convention and environment variables, which are defined either when the user logs in interactively or when the job starts executing at a site. Besides the DEISA-wide shared home and data file spaces, which are addressable in a unified way, users can also stage their data into a temporary file space at the job execution site whenever I/O efficiency needs to be considered.
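A job script can rely on such variables instead of hard-coding site-specific paths. The following is a minimal sketch; the variable names `DEISA_HOME`, `DEISA_DATA` and `DEISA_SCRATCH` and the example path are assumptions made for illustration, since the text above only states that a naming convention and environment variables exist.

```python
import os
from pathlib import Path

# Assumed variable names (not confirmed by the text above).
FILE_SPACE_VARS = {
    "home": "DEISA_HOME",        # shared home file space
    "data": "DEISA_DATA",        # shared data file space
    "scratch": "DEISA_SCRATCH",  # temporary space at the execution site
}


def file_space(kind: str, env=None) -> Path:
    """Return the path of a file space as defined by the environment."""
    env = os.environ if env is None else env
    var = FILE_SPACE_VARS[kind]
    try:
        return Path(env[var])
    except KeyError:
        raise RuntimeError(f"{var} is not set in this environment") from None


if __name__ == "__main__":
    # Example with an explicit, made-up environment instead of os.environ.
    demo_env = {"DEISA_DATA": "/deisa/siteA/data/alice"}
    print(file_space("data", demo_env) / "input.dat")
```

Because the script only refers to the variable names, the same script can run unchanged at any site that defines them.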

The unified view and usage of the Data Management Service is supported by the Common Production Environment Service.

Job Management Service

The Job Management Service comprises mainly two services:

  • submitting jobs to distributed compute resources and job scheduling
  • workflow management

 

Job Submission and Scheduling

Users can submit compute jobs using the Grid middleware UNICORE, DESHL and Globus. Alternatively, users can work interactively on the HPC system at either their home site or the (remote) execution site, submitting jobs to the corresponding batch system. The batch systems of currently four IBM AIX clusters within DEISA (see the figure below) are closely coupled to form a common batch system. Based on IBM's Multicluster LoadLeveler (LL-MC), these sites establish a homogeneous super-cluster that enables job migration at the batch system level.

deisa2-infra-supercluster-550.png

Schematic view of the homogeneous AIX super-cluster which is an aggregation of currently four AIX (IBM Power) based supercomputers that allow remote job submission and job migration by means of the IBM Multicluster LoadLeveler. For example, a job that was locally submitted at CINECA can be rerouted to and executed at RZG.


The super-cluster offers the advantage that jobs prepared to run on any of the four IBM Power architectures can be rerouted or migrated in order to optimize the resource utilization at each of the involved sites.

In the general heterogeneous case, compute jobs may need to be submitted to any DEISA site. Besides the possibility of submitting jobs interactively on each system via the site-specific batch system, DEISA offers Grid middleware as a unified access method. The following figure sketches the DEISA Unicore infrastructure, a meshed infrastructure of Unicore gateways, Unicore Network Job Supervisors (NJS) and the HPC target systems. With Unicore, a user who is used to working at a specific home site can submit compute jobs to any of the other DEISA sites to which the Unicore gateway is connected.

deisa2-infra-unicore-550.png

Sketch of the DEISA Unicore infrastructure. For simplicity, only connections from two of the site gateways are drawn (blue and red).

The example of the LRZ user (thick red line in the figure above) shows a job submission via the Unicore gateway from a non-AIX site to the AIX super-cluster. The DEISA Global File System gives every compute node in DEISA transparent access to the software and data that the user has previously prepared at the home site.

Workflow Management

Workflows can be executed using the Unicore workflow engine which enables an orchestrated execution of multiple interdependent jobs on different compute platforms.
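The notion of orchestrating interdependent jobs can be illustrated with a small dependency graph. The sketch below does not use or reproduce the Unicore workflow engine; the job names are hypothetical, and Python's standard `graphlib` module stands in for the orchestration logic. It groups jobs into stages so that every job runs only after the jobs it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
WORKFLOW = {
    "preprocess": set(),
    "simulate": {"preprocess"},
    "analyse": {"simulate"},
    "visualise": {"simulate"},
    "archive": {"analyse", "visualise"},
}


def execution_stages(dependencies):
    """Group jobs into stages; jobs within one stage are independent."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose prerequisites are done
        stages.append(ready)
        sorter.done(*ready)                 # mark the stage as finished
    return stages


if __name__ == "__main__":
    for number, stage in enumerate(execution_stages(WORKFLOW), start=1):
        print(number, stage)
```

Jobs that land in the same stage, such as `analyse` and `visualise` here, could in principle run concurrently on different compute platforms.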

User Management Service

The coordinated DEISA User Management enables users to access the whole supercomputing infrastructure immediately after being registered at a single DEISA site. This site acts as the user's home site and is responsible for providing first-level support for any questions or issues concerning the DEISA infrastructure and services. In general, the user's home site is also responsible for registering and activating the computing projects to which users have to be assigned. The DEISA user administration system is based on a distributed network of LDAP servers, which contain vital administrative information and propagate it from one site to the others. DEISA uses specific LDAP schemas to describe the necessary information on users, groups and projects.

Another component of the User Management Service is the common accounting system of DEISA that allows users and project managers to monitor the resource consumption status on the DEISA infrastructure. This service is unified in the sense that the accounting information gathered from all DEISA sites is structured according to a standard.
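Once accounting records from all sites share one structure, totals per project can be computed without site-specific handling. The sketch below is purely illustrative: the record fields, site names and figures are invented for the example and do not reflect the actual DEISA accounting standard.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical common record layout shared by all sites."""
    site: str
    project: str
    user: str
    cpu_hours: float


def usage_by_project(records):
    """Sum the consumed CPU hours per project across all sites."""
    totals = defaultdict(float)
    for record in records:
        totals[record.project] += record.cpu_hours
    return dict(totals)


if __name__ == "__main__":
    # Made-up sample data for demonstration only.
    sample = [
        UsageRecord("siteA", "proj1", "alice", 100.0),
        UsageRecord("siteB", "proj1", "alice", 50.0),
        UsageRecord("siteB", "proj2", "bob", 25.0),
    ]
    print(usage_by_project(sample))
```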

By agreement, the registration of users and projects, the mutual acceptance of user administration records entered into the DEISA user administration system, and user support (e.g., support in getting a user certificate) follow specific rules.

Common Production Environment Service

DEISA provides users with a defined software stack comprising compilers, shells, tools, libraries, middleware components and applications. The organization of this software is usually site and system specific. The Common Production Environment Service offers a unified interface for accessing this software. It allows compilers, libraries, tools and applications to be accessed at any site in a coherent way, using a mechanism based on Environment Modules. DEISA provides a system of such modules, designed according to the requirements of the DEISA user communities. Appropriate maintenance of the DEISA environment modules at each affected site ensures that the software access interface remains consistent when system software is upgraded or the DEISA software stack evolves.

Science Gateways

The science gateways are tailored to the common needs of particular scientific disciplines. They facilitate the use of the supercomputing resources by hiding complexity from the users. In particular, the Life Sciences Portal allows users from any life-science-oriented Virtual Community to run complex applications on supercomputing facilities via a dedicated Web interface. For these users, the Web interface represents the single contact point with DEISA. The portal takes care of authentication, authorization and accounting on behalf of the user. After the user has selected an application and chosen its parameters, the portal initiates the corresponding computations at one of the supercomputing sites. Finally, the gathered results, together with accounting information, are presented to the user via the portal.

Application Support Service

The Application Support Service is provided on different levels. Early adopters of the DEISA infrastructure from different scientific communities (e.g., Materials Science, Cosmology, Plasma Physics, Life Sciences, Engineering and Industry) are supported individually.
