Civil-Comp Proceedings
ISSN 1759-3433
CCP: 107
PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, GRID AND CLOUD COMPUTING FOR ENGINEERING
Paper 44

Three-Dimensional Navier-Stokes Flow Simulation using the Finite Element Method on GPUs

N.S.C. Kao1 and T.W.H. Sheu2

1Department of Engineering Science and Ocean Engineering, National Taiwan University, Taiwan
2Department of Mathematics, National Taiwan University, Taiwan

Full Bibliographic Reference for this paper
N.S.C. Kao, T.W.H. Sheu, "Three-Dimensional Navier-Stokes Flow Simulation using the Finite Element Method on GPUs", in , (Editors), "Proceedings of the Fourth International Conference on Parallel, Distributed, Grid and Cloud Computing for Engineering", Civil-Comp Press, Stirlingshire, UK, Paper 44, 2015. doi:10.4203/ccp.107.44
Keywords: CUDA, GPU, finite element, element-by-element.

Summary
This paper presents a new finite element model for solving the steady-state three-dimensional incompressible Navier-Stokes equations. To circumvent convective instability at high Reynolds numbers, the proposed streamline upwinding finite element model minimizes the numerical wavenumber error of the convection term. A mixed finite element formulation is adopted, and the resulting unsymmetric and indefinite matrix equations are solved iteratively. To avoid Lanczos or pivoting breakdown, the matrix equation is normalized. The conjugate gradient solver can then be applied to the resulting symmetric and positive definite matrix equation to obtain an unconditionally convergent solution. Because normalization slows convergence, a Jacobi preconditioner is used to reduce the condition number. The time-consuming preconditioned conjugate gradient (PCG) solver is executed on a GPU platform. The matrix-vector product in the PCG solver is implemented using element-by-element and mesh coloring techniques. Shared memory usage and coalesced global memory access are arranged to optimize the speedup. The code, implemented on a single GPU card, is verified by solving a problem that admits an analytical solution. The lid-driven cavity flow problem is then solved to assess the speedup. Finally, the flow in a 90-degree bent square duct is investigated.
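
The solver strategy in the summary can be illustrated with a minimal CPU sketch. It assumes that "normalized" means forming the normal equations A^T A x = A^T b, which the summary does not spell out, and it uses small dense matrices in plain Python rather than the element-by-element GPU product of the paper. The function name jacobi_pcg_normal, the helper names, the tolerance and the test matrix are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only, not the authors' code.
# Jacobi-preconditioned conjugate gradient applied to the normal equations
# A^T A x = A^T b (assumed meaning of "normalized" in the summary).

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def jacobi_pcg_normal(A, b, tol=1e-10, max_iter=500):
    """Solve A x = b for a square nonsingular A via PCG on A^T A x = A^T b."""
    At = transpose(A)
    # diag(A^T A) equals the squared column norms of A (rows of At).
    diag = [dot(col, col) for col in At]
    rhs = matvec(At, b)
    x = [0.0] * len(At)
    r = rhs[:]                                   # residual for x = 0
    z = [ri / di for ri, di in zip(r, diag)]     # Jacobi preconditioning
    p = z[:]
    rz = dot(r, z)
    rhs_norm = dot(rhs, rhs) ** 0.5
    for it in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * rhs_norm:
            return x, it
        # A^T A is never formed; two matrix-vector products are applied.
        q = matvec(At, matvec(A, p))
        alpha = rz / dot(p, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

if __name__ == "__main__":
    # Small unsymmetric test system (hypothetical data).
    A = [[4.0, 1.0, 0.0], [-1.0, 3.0, 1.0], [0.0, -2.0, 5.0]]
    b = [6.0, 8.0, 11.0]
    x, iterations = jacobi_pcg_normal(A, b)
    print(x, iterations)
```

In this sketch the two matvec calls inside the loop are where almost all of the work occurs. They correspond to the matrix-vector product that the summary says is carried out element by element with mesh coloring on the GPU.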
