Computational Technology Resources - CCP

Keywords: parallel distributed algorithms, cluster computing, performance evaluation and prediction, multilevel numerical software.

Summary

This paper presents results from modelling the performance of practical engineering numerical software, such as [1], across different clusters of processors with both multicore and heterogeneous resources. The goal of the model is to allow reliable predictions of the execution time of a given parallel code on a large number of processors of a given parallel system, by benchmarking the code only on a small number of processors. When extra memory and processors are available, parallel multilevel implementations are able to solve problems numerically on finer meshes, so as to achieve greater accuracy than would otherwise be possible. The methodology described permits us to estimate the performance prior to actually running with these very fine meshes on a large number of processors. This is of great potential value in making decisions concerning the scheduling of jobs within a grid environment.

We describe some recent related work on performance modelling for parallel numerical software, and then introduce the new idea upon which our methodology is based, which is similar to that presented in [2]. Our goal here is to exploit this methodology for predicting the performance of practical multilevel software that does not necessarily scale linearly with the size of the physical problem.

We model the parallel execution time as the sum of two terms: the computational time and the parallel overhead, the latter being primarily due to inter-processor communications. We first describe the methodology used to model the computational time. In order to predict it accurately, care needs to be taken to preserve the geometric shape of the subdomains for the parallel runs. We also have to consider, and exploit, any heterogeneity and/or multicore features associated with the processors and parallel architectures used. We then describe the methodology used to predict the parallel overheads. In order to predict the communication patterns for large parallel runs, we use only information based on the patterns observed for sequences of runs across small numbers of processors.
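The two-term model above can be illustrated with a minimal sketch. It is not the method of the paper: the logarithmic form assumed for the overhead, the function names `fit_log_overhead` and `predict_time`, and all numerical values are hypothetical and chosen only to show how measurements on a few processors might be extrapolated to many.

```python
import math


def fit_log_overhead(procs, overheads):
    """Least-squares fit of overhead(p) = a + b * log2(p).

    The logarithmic form is an assumption made for this sketch only.
    """
    xs = [math.log2(p) for p in procs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(overheads) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, overheads))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_time(t_comp, a, b, target_procs):
    """Predicted parallel time = computational time + modelled overhead."""
    return t_comp + a + b * math.log2(target_procs)


# Made-up illustrative measurements, not results from the paper.
small_procs = [2, 4, 8]                 # small benchmark runs
measured_overheads = [1.0, 1.5, 2.0]    # overhead in seconds for each run

# Computational time in seconds, assumed to be measured on a small run whose
# subdomain has the same size and shape as a subdomain in the target run.
t_comp = 10.0

a, b = fit_log_overhead(small_procs, measured_overheads)
predicted = predict_time(t_comp, a, b, 64)
print(f"predicted time on 64 processors: {predicted:.2f} s")
```

In this sketch the computational term is measured rather than fitted, which reflects the requirement to preserve the subdomain shape; only the overhead term is extrapolated from the small runs.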

Finally, a selection of numerical results for our methodology is presented and discussed. We demonstrate that the methodology is robust and accurate across multicore and inhomogeneous parallel architectures.

References

1: C.E. Goodyer, M. Berzins, "Parallelization and Scalability issues of a Multilevel Elastohydrodynamic Lubrication Solver", Concurrency and Computation: Practice and Experience, 19, 369-396, 2007. doi:10.1002/cpe.1103
2: G. Romanazzi, P.K. Jimack, "Parallel Performance Prediction for Multigrid Codes on Distributed Memory Architectures", in R. Perrott et al., (Editors), "High Performance Computing and Communications (HPCC-07)", LNCS 4782, Springer, 647-658, 2007. doi:10.1007/978-3-540-75444-2_61

Civil-Comp Proceedings, ISSN 1759-3433
CCP: 89
PROCEEDINGS OF THE SIXTH INTERNATIONAL CONFERENCE ON ENGINEERING COMPUTATIONAL TECHNOLOGY
Edited by: M. Papadrakakis and B.H.V. Topping

Paper 4
Reliable Performance Prediction for Parallel Scientific Software in a Multi-Cluster Grid Environment
G. Romanazzi, P.K. Jimack and C.E. Goodyer
School of Computing, University of Leeds, United Kingdom
doi:10.4203/ccp.89.4

Full Bibliographic Reference for this paper
G. Romanazzi, P.K. Jimack, C.E. Goodyer, "Reliable Performance Prediction for Parallel Scientific Software in a Multi-Cluster Grid Environment", in M. Papadrakakis, B.H.V. Topping, (Editors), "Proceedings of the Sixth International Conference on Engineering Computational Technology", Civil-Comp Press, Stirlingshire, UK, Paper 4, 2008. doi:10.4203/ccp.89.4