DIGITAL INFRASTRUCTURES for RESEARCH 2018 | Serving the user base

Data processing approaches

Thursday, September 29, 2016 - 11:30

Chair: Paolo Manghi

In the data processing stack, major challenges lie at the level of the hardware resources involved, e.g. data storage, computing hardware, and bandwidth (moving big data around). These resources are "expensive", so their cost must be shared among stakeholders based on an economy of scale. Most importantly, their proper usage and optimisation should be as transparent as possible to scientists, who are not necessarily IT people. Scientists should be able to use their data processing services without being responsible for how storage and computation are elastically optimised at the iron level, or for how big data is moved across the Internet to support experiments as efficiently as possible. This session includes three experiences touching on these aspects: how an orchestrated and autonomic usage of Cloud services can help in the construction of data pipelines and guarantee QoS, and how scientists can be served with transparent "share & sync" functionalities that optimise the movement of data across the Australian national network of data repositories.

Scheduled presentations: 


EUBra-BIGSEA: cloud services with QoS guarantees for Big Data Analytics (Ignacio Blanquer)
Dynamic creation of data pipelines in clouds (Peter Kacsuk)
Australian Data Lifecycle project: giving users a "data pump" (Guido Aben)