Project Motivation

The convergence of traditional High Performance Computing (HPC) and new simulation, analysis, and data science approaches provides unprecedented opportunities for discovery but also creates new application and infrastructure challenges. Several ECP workflows exemplify this new reality: a heterogeneous combination of applications, models, and "glue" code, running on heterogeneous compute nodes with machine learning in the middle, and a scalable workflow infrastructure orchestrating the entire process. At extreme scale, these workflows will almost certainly require specialized workflow management software, which is within the reach of only large and specialized interdisciplinary teams. Historically, such approaches have tended to result in complex, integrated, and stovepiped software systems. Further, there are now hundreds of moribund workflow systems, which indicates that many teams, small and large, elect to build their own custom workflow solution rather than adopt, or build upon, an integrated system.

What is ExaWorks?

ExaWorks represents a new approach: developing a multi-level SDK that will enable teams to produce scalable and portable workflows for a wide range of exascale applications. ExaWorks does not aim to replace the many workflow solutions already deployed and used by scientists, but rather to provide a robust SDK and work with the community to identify well-defined and scalable component interfaces that can be leveraged by new and existing workflows. Most importantly, this SDK will enable a sustainable software infrastructure for workflows so that the software artifacts produced by teams will be easier to port, modify, and utilize long after projects end. SDK components will be usable by many other workflow management systems (WMS), thus facilitating software convergence in the workflows community.

Project Goals

  1. Create the ExaWorks SDK following a community-oriented process. We will first define a community policy for inclusion in the SDK. Based on application requirements, we will then assemble a set of vertical workflow technologies (level 0), integrate those technologies via existing native interfaces and shim layers (level 1), and collaborate to identify common component interfaces that can be used for interoperation between systems.
  2. Impact ECP applications. Working closely with ECP applications to design and develop workflows, supporting adoption of the ExaWorks SDK, deriving lessons and best practices, and sharing these approaches with the community via documentation, tutorials, and hackathons.
  3. Engage the community. Working with the ECP applications, workflows, and facility communities to harmonize the disparate and stovepiped workflow landscape, ensure the ExaWorks SDK operates on exascale systems, and create resources to support ECP applications and workflow users.

Funding

ExaWorks is supported by the DOE Exascale Computing Project.

Team

Dan Laney
Lawrence Livermore National Laboratory
Kyle Chard
Argonne National Laboratory
Shantenu Jha
Brookhaven National Laboratory
Rafael Ferreira da Silva
Oak Ridge National Laboratory
Todd Munson
Argonne National Laboratory
Aymen Alsaadi
Brookhaven National Laboratory
Yadu Babuji
Argonne National Laboratory
Ben Clifford
Argonne National Laboratory
James Corbett
Lawrence Livermore National Laboratory
Mihael Hategan
Argonne National Laboratory
Ketan Maheshwari
Oak Ridge National Laboratory
Andre Merzky
Brookhaven National Laboratory
Zeke Morton
Lawrence Livermore National Laboratory
Mikhail Titov
Brookhaven National Laboratory
Matteo Turilli
Brookhaven National Laboratory
Andreas Wilke
Argonne National Laboratory
Tom Uram
Argonne National Laboratory
Justin M. Wozniak
Argonne National Laboratory
Pascal Aschwanden
Lawrence Livermore National Laboratory

Previous Contributors

Dong H. Ahn
Lawrence Livermore National Laboratory
Stephen Herbein
Lawrence Livermore National Laboratory