Civil-Comp Proceedings
ISSN 1759-3433, CCP: 101
Proceedings of the Third International Conference on Parallel, Distributed, Grid and Cloud Computing for Engineering
Paper 6
A Comparison of FETI Natural Coarse Space Projector Implementation Strategies
V. Hapla1,2 and D. Horak1,2
1Department of Applied Mathematics, VSB-Technical University of Ostrava, Ostrava, Czech Republic
V. Hapla, D. Horak, "A Comparison of FETI Natural Coarse Space Projector Implementation Strategies", in , (Editors), "Proceedings of the Third International Conference on Parallel, Distributed, Grid and Cloud Computing for Engineering", Civil-Comp Press, Stirlingshire, UK, Paper 6, 2013. doi:10.4203/ccp.101.6
Keywords: domain decomposition, FETI, Total FETI, TFETI, FLLOP, PETSc, parallel direct solver, pseudoinverse, natural coarse space matrix, coarse problem.
Summary
In domain decomposition methods, the natural approach for large problems solved on massively parallel computers is to maximize the number of subdomains, so that the subdomain stiffness matrices shrink and the primal operations are accelerated. The negative effect of this, on the other hand, is the growing dimension of the objects that couple the subdomains, i.e. the operators whose domain or image is either the dual space (whose dimension is the number of Lagrange multipliers on the subdomain interfaces) or the kernel of the stiffness matrix.
We found that the application of the projector onto the natural coarse space can become a severe bottleneck of the FETI method, in particular the part called the coarse problem (CP) solution. For very large problems these operations can start to dominate the computation time, destroy scalability, or even cause out-of-memory errors if they are not parallelized. According to our observations, the matrix-vector and matrix-transpose-vector multiplications take approximately the same time for different distributions of the coarse space matrix, so the action time and the amount of communication depend primarily on the implementation of the CP solution. The CP cannot be solved sequentially because of local memory requirements, but, on the other hand, using all processes leads to a substantial communication overhead. Furthermore, the CP must be solved to a much higher precision than the required precision of the dual solution, otherwise the top-level FETI dual solver diverges.

In this paper, we compare the effect of the choice of the parallel direct LU solver, from the set available in PETSc on HECToR (MUMPS, SuperLU), on the performance of the CP solution using two strategies:
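For concreteness, the projector onto the natural coarse space has the form P = I - G^T (G G^T)^{-1} G, so each application amounts to one multiplication by G, one coarse problem solve with the matrix G G^T, and one multiplication by G^T. The sketch below shows, in plain PETSc, how such a projector application with the CP handled by a parallel direct LU solver (MUMPS or SuperLU_DIST) might look. It is only an illustration under assumptions: it uses the error-checking macros and solver-selection call of a recent PETSc release, and the names (CoarseProjector, CoarseProjectorSetUp, ...) are hypothetical, not the FLLOP API used in the paper.

    /* Sketch (assumed, not the paper's FLLOP code): application of the
     * natural coarse space projector  P x = x - G^T (G G^T)^{-1} G x,
     * with the coarse problem solved by a parallel direct LU solver
     * (MUMPS or SuperLU_DIST) selected through PETSc. */
    #include <petscksp.h>

    typedef struct {
      Mat G;       /* natural coarse space matrix G             */
      Vec Gx, z;   /* work vectors of coarse (kernel) dimension */
      KSP ksp_cp;  /* coarse problem solver: LU factor of G G^T */
    } CoarseProjector;

    /* Assemble G G^T explicitly and factorize it once. */
    PetscErrorCode CoarseProjectorSetUp(CoarseProjector *p)
    {
      Mat Gt, GGt;
      PC  pc;

      PetscFunctionBeginUser;
      PetscCall(MatTranspose(p->G, MAT_INITIAL_MATRIX, &Gt));
      PetscCall(MatMatMult(p->G, Gt, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &GGt));

      PetscCall(KSPCreate(PetscObjectComm((PetscObject)p->G), &p->ksp_cp));
      PetscCall(KSPSetOperators(p->ksp_cp, GGt, GGt));
      PetscCall(KSPSetType(p->ksp_cp, KSPPREONLY));            /* direct solve only        */
      PetscCall(KSPGetPC(p->ksp_cp, &pc));
      PetscCall(PCSetType(pc, PCLU));
      PetscCall(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); /* or MATSOLVERSUPERLU_DIST */
      PetscCall(KSPSetUp(p->ksp_cp));                          /* triggers factorization   */

      PetscCall(MatCreateVecs(p->G, NULL, &p->Gx));
      PetscCall(VecDuplicate(p->Gx, &p->z));
      PetscCall(MatDestroy(&Gt));
      PetscCall(MatDestroy(&GGt));
      PetscFunctionReturn(PETSC_SUCCESS);
    }

    /* y = P x = x - G^T (G G^T)^{-1} G x */
    PetscErrorCode CoarseProjectorApply(CoarseProjector *p, Vec x, Vec y)
    {
      PetscFunctionBeginUser;
      PetscCall(MatMult(p->G, x, p->Gx));           /* G x                            */
      PetscCall(KSPSolve(p->ksp_cp, p->Gx, p->z));  /* coarse problem (G G^T) z = G x */
      PetscCall(MatMultTranspose(p->G, p->z, y));   /* y = G^T z                      */
      PetscCall(VecAYPX(y, -1.0, x));               /* y = x - y                      */
      PetscFunctionReturn(PETSC_SUCCESS);
    }

In this sketch the KSP, and hence the parallel LU factorization of G G^T, lives on the same communicator as G; the strategies compared in the paper differ precisely in how the coarse problem data and its solver are distributed across the processes.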