DIGITAL INFRASTRUCTURES for RESEARCH 2018 | Serving the user base

Obstacles and solutions for large-scale research workflows: towards effective combinations of infrastructure

Friday, September 30, 2016, 11:30 to 13:00

As research collaborations evolve and use more sophisticated, complex resources, many research workflows have become ad hoc, piecewise, or specific to one project or collaboration. Composing experimental instruments and sensor data with high performance computing (HPC) over high-speed networks is appealing but complex, requiring end-to-end orchestration of many resources. In this session, we examine research workflows, including all software, middleware, algorithms, and resources used in the production and processing of research data until it reaches its final published state. Large-scale research workflows seeking high-end capability resources will benefit from a shared methodology that fits these pieces together.

This session will present projects that work towards combining infrastructure for research workflows in a way that is seamless from the end-users’ perspective. We will discuss obstacles and lessons learned, and will present several reusable technologies. The presentations cover three topics: light source practices and workflows for research, a proof of concept that combines infrastructures to enable high-resolution climate modeling, and a proof-of-concept system that lets users combine different pieces of e-Infrastructure efficiently.

The session will finish with a panel discussion to identify the main obstacles that need to be overcome and any actionable outcomes on which we can work together as a community to bring these large-scale research workflows to end-users for the benefit of science. The session aims to illuminate the needs of large-scale research projects as well as the opportunities for e-Infrastructure organisations to better support these and similar projects.


Session structure

This session consists of three presentations followed by a panel discussion. The panel will focus on the questions and obstacles posed in the presentations and is intended to draw out audience participation, feedback, and input.


Target audience

The session aims to bring together representatives from the various parts of a workflow pipeline: the researchers seeking the data, the scientists running the science facilities or institutions, and the HPC services that help optimize algorithms for processing the data. We are seeking audience participation from (inter)national e-Infrastructure providers, universities, science facilities, and research institutions.


Benefits for the audience

Examples of research projects which could benefit from collaboration of infrastructure providers.
Understanding possibilities to improve support for research projects.

Contact Organisation URL
Sylvia Kuijpers SURFnet  
David Skinner LBNL  
Jason Maassen NLeSC

“On the complexities of utilizing large-scale lightpath-connected distributed cyberinfrastructure”, J. Maassen et al., Concurrency and Computation: Practice and Experience, May 2016, Wiley.

Niels Drost NLeSC  



20 min: Challenges and Opportunities in Digital Infrastructure at DOE User Facilities (Dr. David Skinner)

20 min: Distributed supercomputing for climate modeling? (Dr. Jason Maassen)

20 min: Enabling Dynamic Services (Dr. Niels Drost)

30 min: Panel Discussion and Q&A (All, moderator Sylvia Kuijpers)


This session is part of the general conference topics scheme, topic 1: Challenges facing users and service providers