Distributed Computing Laboratory

Emory University





Project summary

DOE has a significant need for high-end computing and is making substantial investments to support it. However, the diversity of high-end computing architectures and the associated software complexity often pose challenges that slow or hamper scientific discovery. Application scientists expend considerable time and effort on development, deployment, and runtime interfacing activities that differ significantly from one high-end platform to the next. Traditionally, efforts to improve the efficient use of high-end computational resources have focused only on application algorithms, numerical kernels, and internal optimization. This proposal addresses the complementary arena, in which substantial benefits can be derived by optimizing the development and deployment processes and by facilitating software reuse.

We propose basic research in two innovative software environments, together termed the Harness workbench, that will help enhance the overall productivity of application science on diverse high-performance computing (HPC) platforms. The first is a virtualized command toolkit (VCT) for application building, deployment, and execution that provides a common view across diverse HPC systems, in particular the DOE leadership computing platforms (Cray, IBM, SGI, and clusters). Our research on VCT will investigate a software backplane architecture that presents a uniform, extensible interface and is capable of interoperating with existing and third-party toolkits. It will interface to native platform functionality via plugin modules that encapsulate vendor-specific knowledge and are further configurable at the system and application levels.

The second component of the Harness workbench is a unified runtime environment (RTE) that similarly consolidates access to runtime services via an adaptive framework for execution-time and post-processing activities. Our RTE research will seek to virtualize HPC runtime environments in an open and extensible manner, and will support the native RTE of target systems as well as the Open RTE and Harness/H2O runtimes. As new architectures (early systems) emerge, pluggable modules corresponding to their special and novel RTE features will be developed and incorporated into the Harness workbench.

For VCT, we propose a scheme in which the toolkit front end translates command-line (or GUI) service invocations into requests for a particular capability of the implementation component. The back-end plugins, complemented by user-overridable configuration profiles supplied by local administrators and computer science specialists, would translate such requests into appropriate actions, such as invoking platform-specific compilers with a particular set of command-line switches or issuing appropriate invocations to the RTE.
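
The front-end and back-end split described above can be illustrated with a minimal Python sketch. It is not the Harness workbench implementation: every name in it (`Request`, `CompilePlugin`, `merge_profiles`, `dispatch`, the platform key `platform_a`, the compiler name `vendor-a-cc`, and all flags) is a hypothetical placeholder chosen for illustration. It assumes that a profile is a flat dictionary and that a user profile overrides the administrator profile, which in turn overrides the plugin's built-in defaults.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Platform-neutral capability request produced by the front end."""
    capability: str                      # e.g. "compile"
    options: dict = field(default_factory=dict)


class CompilePlugin:
    """Hypothetical back-end plugin holding vendor-specific defaults."""

    def __init__(self, compiler, default_flags):
        self.compiler = compiler
        self.default_flags = default_flags

    def translate(self, request, profile):
        # Profile entries, when present, override the plugin's defaults.
        compiler = profile.get("compiler", self.compiler)
        flags = list(profile.get("flags", self.default_flags))
        if request.options["optimize"]:
            flags.append(profile.get("optimize_flag", "-O2"))
        return [compiler, *flags,
                "-o", request.options["output"],
                *request.options["sources"]]


def merge_profiles(*profiles):
    """Merge flat profiles; later ones win (administrator first, user last)."""
    merged = {}
    for profile in profiles:
        merged.update(profile)
    return merged


def front_end(argv):
    """Parse a uniform command line into a Request."""
    capability, *rest = argv
    options = {"optimize": False, "output": "a.out", "sources": []}
    args = iter(rest)
    for arg in args:
        if arg == "--optimize":
            options["optimize"] = True
        elif arg == "-o":
            options["output"] = next(args)
        else:
            options["sources"].append(arg)
    return Request(capability, options)


# Placeholder registry: one plugin per (platform, capability) pair.
PLUGINS = {
    "platform_a": {"compile": CompilePlugin("vendor-a-cc", ["--placeholder-flag"])},
}


def dispatch(platform, argv, admin_profile=None, user_profile=None):
    """Route a uniform command to the plugin for the given platform."""
    request = front_end(argv)
    plugin = PLUGINS[platform][request.capability]
    profile = merge_profiles(admin_profile or {}, user_profile or {})
    return plugin.translate(request, profile)
```

Under these assumptions, `dispatch("platform_a", ["compile", "--optimize", "-o", "app", "main.c"])` yields an argument list that a caller could hand to a process launcher. Supporting another platform would mean registering another plugin, leaving the front end unchanged.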