Civil-Comp Proceedings
ISSN 1759-3433 CCP: 80
PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON ENGINEERING COMPUTATIONAL TECHNOLOGY Edited by: B.H.V. Topping and C.A. Mota Soares
Paper 131
On Training Sample Selection for Artificial Neural Networks using Number-Theoretic Methods F. Tong+ and X.L. Liu*
+Department of Civil Engineering, Tsinghua University, Beijing, China
F. Tong, X.L. Liu, "On Training Sample Selection for Artificial Neural Networks using Number-Theoretic Methods", in B.H.V. Topping, C.A. Mota Soares, (Editors), "Proceedings of the Fourth International Conference on Engineering Computational Technology", Civil-Comp Press, Stirlingshire, UK, Paper 131, 2004. doi:10.4203/ccp.80.131
Keywords: artificial neural networks, number-theoretic methods (NTMs), NT-net, discrepancy, good lattice points (GLP-net), Hammersley-net.
Summary
Flexibility in generalization is always the goal when an artificial neural network (ANN) model is set up. To this end, this paper seeks to improve the quality of the "teacher", i.e., to ensure the uniformity of the training-sample distribution by means of Number-Theoretic Methods (NTMs). NTMs are a family of deterministic number-theoretic algorithms that generate points scattered uniformly over the s-dimensional unit cube. Since ANN prediction is essentially nonlinear interpolation, uniformly distributed samples help to keep errors small on new samples unseen during training.
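As an illustration of how such an NT-net can be generated, the sketch below (a minimal Python illustration, not the authors' code) builds a Halton-net from radical-inverse (van der Corput) sequences in coprime bases:

```python
def van_der_corput(n, base):
    # Radical inverse of the integer n in the given base:
    # reverse the base-b digits of n about the radix point.
    q, bk = 0.0, 1.0 / base
    while n > 0:
        n, r = divmod(n, base)
        q += r * bk
        bk /= base
    return q

def halton_net(num_points, primes=(2, 3)):
    # Halton-net: one van der Corput sequence per dimension,
    # using pairwise coprime bases (here the primes 2 and 3).
    return [tuple(van_der_corput(i, p) for p in primes)
            for i in range(1, num_points + 1)]

points = halton_net(8)   # 8 points in the unit square
```

The same radical-inverse construction with index i/N prepended as an extra coordinate yields a Hammersley-net.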
Within the NTM theoretical framework, discrepancy is defined as a quantitative measure of the uniformity of a set of points: the smaller the discrepancy, the more uniformly the samples are distributed. In effect, discrepancy describes how well a set of points represents the uniform distribution on the unit cube. This paper introduces the GLP-net, the Halton-net and the Hammersley-net as typical NT-nets.

Training samples are prepared by the GLP-net and the Hammersley-net, respectively, and compared with equally spaced samples in terms of their discrepancy values. Trained on these three types of samples, the ANN models show quite different performance in computational precision and stability: the ANNs trained on NTM-based samples generalize more flexibly, as demonstrated through an engineering case study in this paper.

In conclusion, good uniformity of the training samples, rather than indiscriminately piling up more and more data, genuinely enhances an ANN's generalization performance, and NTMs are mathematically proven to yield more uniformly scattered samples than equally spaced sampling.
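The summary does not state which discrepancy definition the paper uses. One commonly used computable variant is the centered L2-discrepancy (Hickernell's closed form); the sketch below is an illustration under that assumption, not the paper's own measure:

```python
from math import sqrt

def centered_l2_discrepancy(points):
    # Centered L2-discrepancy of a point set in the s-dimensional
    # unit cube, via Hickernell's closed-form expression.
    n = len(points)
    s = len(points[0])
    term1 = (13.0 / 12.0) ** s
    term2 = 0.0
    for x in points:
        prod = 1.0
        for xk in x:
            a = abs(xk - 0.5)
            prod *= 1.0 + 0.5 * a - 0.5 * a * a
        term2 += prod
    term2 *= 2.0 / n
    term3 = 0.0
    for x in points:
        for y in points:
            prod = 1.0
            for xk, yk in zip(x, y):
                prod *= (1.0 + 0.5 * abs(xk - 0.5)
                         + 0.5 * abs(yk - 0.5)
                         - 0.5 * abs(xk - yk))
            term3 += prod
    term3 /= n * n
    return sqrt(term1 - term2 + term3)
```

Evaluating this on an NT-net and on an equally spaced grid of the same size gives a single number per set with which the uniformity of candidate training samples can be compared, smaller being more uniform.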