Computational Technology Resources - CSETS

Keywords: non-linear computations, parallel strategies, algorithms, large scale problems, load balancing, industrial environment.

Summary

Reducing the time and cost of mechanical design requires taking material and geometrical non-linearities into account when simulating the behaviour of structures. Moreover, realistic models have to be used in order to obtain accurate numerical predictions, especially to satisfy the safety design constraints that are increasingly required in the high-tech nuclear, space and naval industries. Unfortunately, such simulations, which combine fine meshes involving a large number of elements, highly non-linear material behaviour with softening and localisation, and complex loading histories requiring a large number of time steps, often lead to numerical costs too high for widespread industrial use. The joint use of powerful algorithms and parallel computers is necessary to greatly reduce the cost of these complex simulations [1,2]. The aim of this research project was to extend the possibilities of the finite element code CAST3M (developed at CEA, France), whose purpose is to facilitate the development of new algorithms. To strongly reduce the elapsed time for solving complex non-linear problems, it is important to take into account the possibilities of different parallel computers, and in particular efficient and economical configurations of multicore 64-bit PCs. Moreover, in order to let the programmer focus on the program design, which is a critical aspect of application efficiency, the parallel environment has to free the programmer from parallel programming intricacies (management of data, coherence of data, ...). The challenge is to merge these advanced features with the traditional requirements of an industrial code: robustness and flexibility, ease of use, and predictability of computational resource usage.

The purpose of this paper is to present a parallel approach suited to the simulation of a wide class of non-linear problems with quasi-static response. The starting point is to exploit the mechanical properties of the different types of equations to be solved in order to distribute computations over the processors of a parallel computer. The approach is based on two domain decompositions, with the goal of balancing the computational load over the processors while limiting the redistribution of tasks. Good load balancing and communications kept as low as possible are key to an effective parallel algorithm.
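To make the load balancing objective concrete, the following minimal sketch assigns subdomains with unequal costs to processors using a greedy "largest cost first" heuristic. It is only an illustration under stated assumptions: the cost values, the function names `balance` and `imbalance`, and the heuristic itself are hypothetical and are not taken from CAST3M or from the strategy described in the chapter.

```python
import heapq


def balance(costs, n_procs):
    """Greedy assignment: give the most expensive remaining task to the
    currently least loaded processor (hypothetical helper)."""
    heap = [(0.0, p) for p in range(n_procs)]  # (current load, processor id)
    assignment = {p: [] for p in range(n_procs)}
    # Visit tasks from the most to the least expensive.
    for task, cost in sorted(enumerate(costs), key=lambda tc: -tc[1]):
        load, p = heapq.heappop(heap)
        assignment[p].append(task)
        heapq.heappush(heap, (load + cost, p))
    loads = [sum(costs[t] for t in assignment[p]) for p in range(n_procs)]
    return assignment, loads


def imbalance(loads):
    """Ratio of the maximum load to the mean load (1.0 is a perfect balance)."""
    return max(loads) / (sum(loads) / len(loads))


# Assumed per-subdomain costs, e.g. measured CPU time of the previous step.
costs = [8, 7, 6, 5, 4, 3, 2, 1]
assignment, loads = balance(costs, n_procs=2)

# A naive contiguous split of the same tasks, for comparison.
naive_loads = [sum(costs[:4]), sum(costs[4:])]
```

With these assumed costs, the contiguous split leaves one processor with far more work than the other, whereas the greedy assignment equalises the two loads. A practical scheme would also have to weigh the cost of moving data between processors, which this sketch ignores.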

The implementation of this algorithm starts from an extension of GIBIANE, the user language of the code CAST3M. We have created a parallel language environment that eases the development of parallel algorithms at either the programming level or the user level. It is based on the development environment of the finite element code CAST3M. The parallel language, which is built on an object-based virtual shared memory system, offers the user the view of a single global address space over the individual memories [2]. It ensures data coherence and hides data exchanges between processors, so that a large part of the sequential code can be reused. The proposed system can be implemented on most parallel computers, as it is developed with machine-independent programming techniques, and it is worth noting that the underlying concepts can be used in other object-based parallel languages.

Non-linear problems are usually solved by means of Newton methods, which mainly lead to the solution of two types of sub-problems. The proposed parallel strategy uses the mechanical properties of these sub-problems. On the one hand, a domain decomposition technique with a direct solution of the condensed problem is proposed to solve the linear global problems, in order to be compatible with BFGS-type convergence acceleration. One can also use a parallel direct solver associated with a "nested dissection" ordering, which limits fill-in during the factorization of the matrix [3]. In fact, this strategy is similar to a domain decomposition technique and gives nearly optimal performance on shared memory computers. On the other hand, it is almost impossible to predict the spatial evolution of the CPU time spent integrating the constitutive laws. Therefore, in order to obtain a well-balanced load without communication, we propose the use of a second domain decomposition [4]. Optimizing the communications between the two domain decompositions is necessary to obtain good performance.
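The two types of sub-problems can be seen in a small Newton iteration. The sketch below solves a chain of identical non-linear springs, fixed at one end and loaded at the other. Each iteration has a local stage, where the constitutive law is evaluated independently for every element, and a global stage, where one linear system is solved. The saturating `tanh` law, the parameter values and all function names are assumptions chosen for illustration; the sketch is sequential and does not reproduce the CAST3M implementation.

```python
import math


def constitutive(strain, k=100.0, s_max=10.0):
    """Hypothetical saturating law; returns (force, tangent stiffness)."""
    t = math.tanh(k * strain / s_max)
    return s_max * t, k * (1.0 - t * t)


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] = K[i][i-1], upper[i] = K[i][i+1]."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / m
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def newton_chain(n_elems, load, tol=1e-10, max_iter=50):
    """Displacements of the free nodes of a spring chain under an end load."""
    u = [0.0] * n_elems
    for iteration in range(max_iter):
        # Local sub-problems: one independent evaluation per element.
        strains = [u[e] - (u[e - 1] if e > 0 else 0.0) for e in range(n_elems)]
        local = [constitutive(s) for s in strains]
        # Assemble the residual and the tridiagonal tangent matrix.
        residual = [0.0] * n_elems
        diag, lower, upper = [0.0] * n_elems, [0.0] * n_elems, [0.0] * n_elems
        for e, (force, tangent) in enumerate(local):
            residual[e] -= force
            diag[e] += tangent
            if e > 0:
                residual[e - 1] += force
                diag[e - 1] += tangent
                upper[e - 1] -= tangent
                lower[e] -= tangent
        residual[-1] += load
        if max(abs(r) for r in residual) < tol:
            return u, iteration
        # Global sub-problem: one linear solve with the tangent matrix.
        du = solve_tridiagonal(lower, diag, upper, residual)
        u = [a + b for a, b in zip(u, du)]
    raise RuntimeError("Newton iteration did not converge")


u, n_iter = newton_chain(n_elems=4, load=5.0)
```

In this toy problem every spring carries the applied load, so the end displacement can be checked against the closed-form elongation of one spring multiplied by the number of springs. The local stage needs no communication between elements, while the global solve couples all unknowns, which is why the two stages can call for different distributions of the work.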

Various recent developments were carried out to facilitate the parallel solution of a wide class of problems in a way that is transparent for the user while remaining efficient. The asynchronous execution of calculations at the user level was simplified, on the one hand, by a "container" object gathering the decompositions of the objects used by a calculation and, on the other hand, by an operator that distributes the tasks over the processors starting from these decompositions. Moreover, the data structure proposed for implementing this "data parallel" technique, which distributes the data and calculations over the processors of a parallel machine, ensures compatibility with sequential simulations. The solution of large scale problems requires intensive use of virtual memory (swapping unused objects to disk). The management of objects that can be shared between applications has been optimised in order to ensure data coherence and to limit the blocking phases of the parallel applications. A version based on standard POSIX threads ("pthread") was initially developed to ensure the performance of the parallel programming environment and to allow code portability across shared memory computers. The extension of this strategy to shared-distributed memory computers is under way.
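The container and distribution operator described above can be pictured with a short sketch. It uses Python's standard thread pool as a stand-in for the pthread-based environment; the names `Container`, `decompose`, `distribute` and `stress_task`, as well as the data, are hypothetical and do not correspond to actual GIBIANE operators.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(values, n_parts):
    """Split a list into at most n_parts contiguous chunks."""
    size = -(-len(values) // n_parts)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


class Container:
    """Hypothetical container: gathers, for each subdomain, the piece of
    every object that a calculation needs."""

    def __init__(self, n_parts, **objects):
        parts = {name: decompose(vals, n_parts) for name, vals in objects.items()}
        n_chunks = len(next(iter(parts.values())))
        self.pieces = [
            {name: chunks[i] for name, chunks in parts.items()}
            for i in range(n_chunks)
        ]


def distribute(task, container, n_workers):
    """Hypothetical operator: run one task per subdomain asynchronously
    and return the results in subdomain order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(task, **piece) for piece in container.pieces]
        return [f.result() for f in futures]


def stress_task(strains, moduli):
    """The same routine serves a whole model or a single subdomain."""
    return [e * s for e, s in zip(moduli, strains)]


strains = [0.001 * i for i in range(10)]
moduli = [200.0 + i for i in range(10)]

container = Container(3, strains=strains, moduli=moduli)
chunks = distribute(stress_task, container, n_workers=3)
parallel = [s for chunk in chunks for s in chunk]
sequential = stress_task(strains, moduli)
```

Because `stress_task` is unchanged between the sequential call and the distributed one, the concatenated parallel result matches the sequential result, which mirrors the compatibility with sequential simulations mentioned in the text. Python threads are used here only to show the structure of the calls; they are not intended as a performance model.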

Numerical examples for large scale industrial problems with material and geometrical non-linearities are presented to validate the proposed parallel approach.

References

[1]: A.K. Noor, S.L. Venneri, D.B. Paul, M.A. Hopkins, "Structure technology for future aerospace systems", Computers & Structures, 74, 507-519, 2000. doi:10.1016/S0045-7949(99)00067-X
[2]: J.Y. Cognard, F. Thomas, P. Verpeaux, "An integrated approach to solving mechanical problems on parallel computers", Advances in Engineering Software, 31, 885-899, 2000. doi:10.1016/S0965-9978(00)00063-6
[3]: M.T. Heath, P. Raghavan, "A cartesian parallel nested dissection algorithm", SIAM J. Matrix Anal. Appl., 16, 235-253, 1995. doi:10.1137/S0895479892238270
[4]: J.Y. Cognard, A. Poulhalec, F. Thomas, P. Verpeaux, "A parallel environment and associated strategies in structural non-linear analysis", Progress in Engineering Computational Technology, Eds. B.H.V. Topping & C.A. Mota Soares, Saxe-Coburg Publications, Chapter 14, 323-352, 2004.

Computational Science, Engineering & Technology Series
ISSN 1759-3158
CSETS: 21
PARALLEL, DISTRIBUTED AND GRID COMPUTING FOR ENGINEERING
Edited by: B.H.V. Topping, P. Iványi

Chapter 21
A Parallel Approach for Solving a Wide Class of Structural Non-Linear Problems
J.Y. Cognard¹ and P. Verpeaux²
¹Brest Laboratory of Mechanics and Systems, ENSIETA - University of Brest - ENIB, France
²DMT/SEMT, CEA Saclay, Gif sur Yvette, France
doi:10.4203/csets.21.21

Full Bibliographic Reference for this chapter
J.Y. Cognard, P. Verpeaux, "A Parallel Approach for Solving a Wide Class of Structural Non-Linear Problems", in B.H.V. Topping, P. Iványi, (Editors), "Parallel, Distributed and Grid Computing for Engineering", Saxe-Coburg Publications, Stirlingshire, UK, Chapter 21, pp 455-482, 2009. doi:10.4203/csets.21.21
©Civil-Comp Limited 2023