Civil-Comp Proceedings
ISSN 1759-3433 CCP: 80
PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON ENGINEERING COMPUTATIONAL TECHNOLOGY
Edited by: B.H.V. Topping and C.A. Mota Soares
Paper 88
Parallel Implementation for Direct Numerical Simulation of Turbulent Flow using OpenMP
H.V. Truong and J.C. Wells
Department of Civil and Environmental Systems Engineering, Graduate School of Science and Engineering, Ritsumeikan University, Shiga, Japan

H.V. Truong, J.C. Wells, "Parallel Implementation for Direct Numerical Simulation of Turbulent Flow using OpenMP", in B.H.V. Topping, C.A. Mota Soares, (Editors), "Proceedings of the Fourth International Conference on Engineering Computational Technology", Civil-Comp Press, Stirlingshire, UK, Paper 88, 2004. doi:10.4203/ccp.80.88
Keywords: numerical method, OpenMP, parallelization.
Summary
In this paper, we present parallelization methods for a direct numerical simulation of a turbulent open channel flow using OpenMP, a relatively new set of compiler directives for shared-memory computers [1,2]. OpenMP can produce high-performance parallel codes with significantly less coding time than the older message-passing libraries MPI and PVM, which are based on a distributed-memory model and require explicit send/receive calls. Starting from the sequential Direct Numerical Simulation (DNS) program, we analyse the data dependencies in order to propose three parallelization models for the Poisson solver and two parallelization models for the remaining parts of the calculation. Results obtained on our 16-node PC cluster illustrate the advantages and drawbacks of each approach.
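As a minimal sketch of the shared-memory style that OpenMP permits (the array names, sizes and update formula below are hypothetical and not taken from the DNS code), a single work-sharing directive distributes a typical explicit field update over the threads, with no explicit send/receive calls of the kind MPI or PVM would require:

    #include <omp.h>

    #define NX 64
    #define NY 64
    #define NZ 64

    /* Illustrative field arrays; the real DNS uses its own data layout. */
    static double u_new[NX][NY][NZ], u_old[NX][NY][NZ], rhs[NX][NY][NZ];

    void update_field(double dt)
    {
        /* The directive splits the outer loop among the available threads;
           the loop indices declared in the for statements are private.    */
        #pragma omp parallel for
        for (int i = 1; i < NX - 1; i++)
            for (int j = 1; j < NY - 1; j++)
                for (int k = 1; k < NZ - 1; k++)
                    u_new[i][j][k] = u_old[i][j][k] + dt * rhs[i][j][k];
    }

Even with such a simple directive, the placement of the arrays across the cluster nodes determines how much remote memory traffic the loop generates, which is what the data-distribution analysis below addresses.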
We perform DNS of turbulent flow in an open channel. The no-slip boundary condition is applied at the wall and the zero-stress condition at the free surface. To keep the problem tractable, we follow the standard practice of applying periodic boundary conditions in the streamwise and spanwise directions. The temporal discretization of the incompressible Navier-Stokes equations is based on the fractional step method. Such "projection schemes" require solving a Poisson equation for pressure, which typically accounts for most of the computational time. The convective terms are advanced with a second-order Adams-Bashforth scheme, and a fourth-order central differencing scheme is adopted for the spatial discretization. Details of the numerical method can be found in [3]. The OpenMP code is compiled with the Omni OpenMP compiler and executed on the SCASH software distributed shared-memory environment, which provides a "PC-cluster enabled OpenMP" [4].

To parallelize efficiently, it is crucial to distribute data among the nodes in a manner that minimizes remote data access during the calculation. For the parts outside the Poisson solver, we propose two parallelization models. The first is obtained by distributing data and parallelizing intuitively, while in the second, data distribution and parallelism are guided by solving the optimization problem associated with a Data Mapping Parallelism Graph using 0-1 integer programming [5]. Experimental results show that the speed-up of the second model is better than that of the first and continues to improve as the number of processors increases.

For the Poisson solver, we propose three parallelization models: static, dynamic and pipelined. The static data distribution model requires neither data redistribution nor data transposition, whereas the dynamic data distribution model requires data transposition so that the sequential solver can be reused on each subset of the data. In the pipelined model, we implement a genuinely parallel solver for septa-diagonal linear systems to avoid explicit data transposition and redistribution. Comparing the speed-ups of these models, the pipelined model performs better than the static model, while the dynamic model performs worst because it spends a large amount of time on data transposition.
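Written schematically (with the caveat that the exact treatment of the viscous and pressure terms in the present code is not reproduced here), a second-order Adams-Bashforth fractional-step update has the form

\[
\mathbf{u}^{*} = \mathbf{u}^{n}
  + \Delta t\left[\tfrac{3}{2}\,\mathbf{H}(\mathbf{u}^{n})
  - \tfrac{1}{2}\,\mathbf{H}(\mathbf{u}^{n-1})\right],
\qquad
\nabla^{2}\phi = \frac{1}{\Delta t}\,\nabla\cdot\mathbf{u}^{*},
\qquad
\mathbf{u}^{n+1} = \mathbf{u}^{*} - \Delta t\,\nabla\phi,
\]

where \(\mathbf{H}\) collects the explicitly advanced terms and \(\phi\) is the projection (pressure-like) variable. The middle relation is the Poisson equation whose solution dominates the run time and is therefore the target of the three parallelization models described above.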
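To make the data movement behind the dynamic model concrete, the sketch below shows the kind of global transposition it implies before the sequential line solver can be reused on each subset of the data; the array names and extents are hypothetical and the real code may organise this step differently:

    #include <omp.h>

    #define NX 64
    #define NY 64
    #define NZ 64

    static double p_xyz[NX][NY][NZ];   /* layout used by the rest of the code */
    static double p_zyx[NZ][NY][NX];   /* layout wanted by the line solver    */

    /* Transpose so that each thread (or node) owns complete lines in the
       sweep direction of the banded solver.  Each thread writes a disjoint
       slab of p_zyx, so the implicit barrier at loop exit is the only
       synchronisation required.                                            */
    void transpose_for_line_solver(void)
    {
        #pragma omp parallel for
        for (int k = 0; k < NZ; k++)
            for (int j = 0; j < NY; j++)
                for (int i = 0; i < NX; i++)
                    p_zyx[k][j][i] = p_xyz[i][j][k];
    }

On a software distributed shared-memory system such as SCASH, many of the reads of p_xyz in this loop touch pages owned by other nodes, which is consistent with the observation above that the dynamic model loses its advantage to transposition time; the static and pipelined models are designed to avoid this global reshuffle.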