Computational & Technology Resources
an online resource for computational,
engineering & technology publications
Computational Science, Engineering & Technology Series
ISSN 1759-3158 CSETS: 34
PATTERNS FOR PARALLEL PROGRAMMING ON GPUS
Edited by: F. Magoulès
Chapter 5
Optimization methodology for Parallel Programming of Homogeneous or Hybrid Clusters
S. Vialle (1,3) and S. Contassot-Vivier (2,3)
(1) Supelec & UMI GT-CNRS 2958, Metz, France
S. Vialle, S. Contassot-Vivier, "Optimization methodology for Parallel Programming of Homogeneous or Hybrid Clusters", in F. Magoulès, (Editor), "Patterns for Parallel Programming on GPUs", Saxe-Coburg Publications, Stirlingshire, UK, Chapter 5, pp 111-148, 2014. doi:10.4203/csets.34.5
Keywords: message passing, multithreading on multicore, vectorization on GPU, communication-computation overlapping, computing kernel optimization, deployment.
Abstract
This chapter proposes a study of the optimization process of parallel applications to be run on modern architectures (multi-core CPU nodes with GPUs). Different optimization schemes are proposed for overlapping computations with communications, and for computation kernels.
Development methodologies are introduced to reach different degrees of optimization, and specific criteria are defined to help developers choose the most suitable degree for a given application and parallel system. Drawing on our experience in industrial collaborations, we analyze both the performance and the increase in code complexity. The latter is an important issue, especially in industry, because it directly affects development and maintenance costs. Complete experiments are performed to evaluate the different variants of a benchmark application consisting of a dense matrix product. In those experiments, different runtime parameters and cluster configurations are tested. The results are then analyzed to assess the value of the different optimization degrees and to validate the proposed optimization methodology.
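The idea of overlapping computations with communications mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the chapter: it assumes a dense matrix product A x B in which B arrives as column blocks, and it uses a hypothetical `fetch_block` function as a stand-in for a communication step. In a real cluster code that step would typically be a non-blocking message-passing call; here a single background thread only illustrates the schedule (request the next block, then compute on the current one).

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_block(b_blocks, step):
    """Hypothetical stand-in for a communication step: return the block of B for `step`."""
    return b_blocks[step]


def multiply_block(a, b_block):
    """Local kernel: multiply A (n x k) by one column block of B (k x w)."""
    inner = len(b_block)
    width = len(b_block[0])
    return [[sum(row[t] * b_block[t][j] for t in range(inner)) for j in range(width)]
            for row in a]


def overlapped_product(a, b_blocks):
    """Compute A x B, requesting block step+1 before computing on block step."""
    partials = []
    with ThreadPoolExecutor(max_workers=1) as comm:
        pending = comm.submit(fetch_block, b_blocks, 0)
        for step in range(len(b_blocks)):
            current = pending.result()  # wait for the block already in flight
            if step + 1 < len(b_blocks):
                # Start the next "transfer" before computing on the current block.
                pending = comm.submit(fetch_block, b_blocks, step + 1)
            partials.append(multiply_block(a, current))
    # Concatenate the column blocks of the result, row by row.
    return [sum((p[i] for p in partials), []) for i in range(len(a))]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b_blocks = [[[5], [7]], [[6], [8]]]  # B = [[5, 6], [7, 8]] split by column
    print(overlapped_product(a, b_blocks))
```

The sketch keeps only one transfer in flight at a time, which is the simplest overlapping scheme; whether the extra code complexity pays off depends on the ratio of communication time to computation time, which is the kind of trade-off the abstract describes.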