Civil-Comp Proceedings
ISSN 1759-3433
CCP: 95
PROCEEDINGS OF THE SECOND INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, GRID AND CLOUD COMPUTING FOR ENGINEERING
Paper 45

High Performance Communication Framework for Large Scale Workflows

X. Wang1, U. Küster1, M. Resch1 and E. Focht2

1High Performance Computing Centre Stuttgart, University of Stuttgart, Germany
2Research and Development, NEC High Performance Computing Europe, Stuttgart, Germany

Full Bibliographic Reference for this paper
, "High Performance Communication Framework for Large Scale Workflows", in , (Editors), "Proceedings of the Second International Conference on Parallel, Distributed, Grid and Cloud Computing for Engineering", Civil-Comp Press, Stirlingshire, UK, Paper 45, 2011. doi:10.4203/ccp.95.45
Keywords: scientific workflow, workflow management, large scale, high performance communication, data streaming, coupling.

Summary
HREF="#wang:3">3,4] . The main focus of this paper is on efficient and scalable data transferring technologies for large scale scientific workflows. Large scale simulations are increasingly important in many scientific areas. Hundreds of terabytes of I/O data are generated in such workflows. Although the I/O throughput of 1.5 Gbps is mostly available on nowadays file systems, it is strongly affected by many factors, e.g. the amount of concurrently accesses. Our framework automates the streaming of intermediate data through workflows. The inter-connections among computational nodes in PC-Clusters are used instead of disk-involved file I/O.

Our system consists of a global task scheduler that maps and manages the execution of inter-dependent tasks, an interaction model that manages the connections and data streaming between tasks, a special I/O library that replaces normal I/O calls with remote procedure calls (RPCs), and a message transfer layer that enables communication over the network. A key simple-to-use feature of our framework is that users are not required to modify their applications. For dynamically linked executables under Linux, I/O system calls are captured by pre-loading the system call interception layer and are then replaced by the implemented RPCs. Other executables are supported through Filesystem in Userspace (FUSE) [5] as the basic client library: FUSE intercepts system calls and invokes the RPCs implemented in our special I/O library. Therefore, everything from in-house codes to closed-source commercial software can be easily integrated into our framework. In addition, multiple network protocols are supported, which means the system can be easily ported to clusters with different network types.
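The interception mechanism for dynamically linked executables can be pictured with the following minimal sketch; it is written under assumptions and is not the authors' library. The real write() is resolved with dlsym(RTLD_NEXT, ...), and forward_as_rpc() is a hypothetical placeholder standing in for the RPCs of the special I/O library.

    /* intercept.c: build with   gcc -shared -fPIC -o libintercept.so intercept.c -ldl
       and run an unmodified application as   LD_PRELOAD=./libintercept.so ./app      */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <sys/types.h>
    #include <unistd.h>

    typedef ssize_t (*write_fn)(int, const void *, size_t);

    /* Hypothetical stub: a real implementation would ship the buffer to a remote
       streaming service via an RPC here instead of touching the file system.     */
    static ssize_t forward_as_rpc(int fd, const void *buf, size_t count)
    {
        write_fn real_write = (write_fn)dlsym(RTLD_NEXT, "write");
        return real_write(fd, buf, count);   /* fall back so the sketch stays runnable */
    }

    /* Overrides libc write(); the dynamic linker picks this up via LD_PRELOAD. */
    ssize_t write(int fd, const void *buf, size_t count)
    {
        if (fd > 2)                          /* leave stdin/stdout/stderr untouched */
            return forward_as_rpc(fd, buf, count);
        write_fn real_write = (write_fn)dlsym(RTLD_NEXT, "write");
        return real_write(fd, buf, count);
    }

For binaries where pre-loading is not applicable, a FUSE client achieves the same effect by implementing such operations as file system callbacks, which is the role FUSE plays in the framework described above.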

Results from executing a single experiment of a bone implant workflow show that I/O-intensive applications can benefit from our framework, with an improvement in I/O rate of 88%. We expect the approach to scale and to deliver much higher aggregate throughput when hundreds of thousands of experiments are executed concurrently.

References
1
K. Ranganathan, I. Foster, "Identifying Dynamic Replication Strategies for a High-Performance Data Grid", in "Proceedings of the Second Workshop on Grid Computing", Denver, CO, USA, 2001. doi:10.1007/3-540-45644-9_8
2
H.S. Kim, I.S. Cho, H.Y. Yeom, "A Task Pipelining Framework for e-Science Workflow Management Systems", in "Eighth IEEE International Symposium on Cluster Computing and the Grid", Lyon, France, 2008. doi:10.1109/CCGRID.2008.47
3
V. Bhat et al., "High Performance Threaded Data Streaming for Large Scale Simulations", in "Proceedings of the Fifth IEEE/ACM International Workshop on Grid Computing", Pittsburgh, USA, 2004. doi:10.1109/GRID.2004.36
4
J.D. Blower, A.B. Harrison, K. Haines, "Styx Grid Services: Lightweight Middleware for Efficient Scientific Workflows", Scientific Programming, 14(3-4), 209-216, 2006.
5
"Filesystem in userspace", http://fuse.sourceforge.net
