Civil-Comp Proceedings
ISSN 1759-3433
CCP: 92
PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON SOFT COMPUTING TECHNOLOGY IN CIVIL, STRUCTURAL AND ENVIRONMENTAL ENGINEERING
Edited by: B.H.V. Topping and Y. Tsompanakis
Paper 10

Assessing Organics Removal in Water Treatment with Data Mining and Artificial Neural Networks

M. Bieroza (1), A. Baker (2) and J. Bridgeman (1)

(1) School of Civil Engineering, (2) School of Geography, Earth and Environmental Sciences,
University of Birmingham, United Kingdom

Full Bibliographic Reference for this paper
M. Bieroza, A. Baker, J. Bridgeman, "Assessing Organics Removal in Water Treatment with Data Mining and Artificial Neural Networks", in B.H.V. Topping, Y. Tsompanakis, (Editors), "Proceedings of the First International Conference on Soft Computing Technology in Civil, Structural and Environmental Engineering", Civil-Comp Press, Stirlingshire, UK, Paper 10, 2009. doi:10.4203/ccp.92.10
Keywords: multivariate analysis, pattern recognition, artificial neural networks, disinfection by-products, fluorescence spectroscopy, organic matter removal.

Summary
In this paper, the use of several robust data mining methods is assessed and compared for the first time. In particular, multi-way analysis, principal components analysis (PCA), parallel factor analysis (PARAFAC), and an unsupervised artificial neural network (ANN) algorithm, the self-organizing map (SOM), were tested for their efficacy in the exploratory analysis and calibration of fluorescence excitation-emission matrix (EEM) data characterising organic matter removal efficiency at sixteen water treatment works in the UK.

Insufficient removal of organic matter before drinking water is disinfected leads to the formation of harmful disinfection by-products (DBPs), which arise from the chemical reaction of organic compounds with the disinfectant. A rapid and accurate assessment of organic matter removal in the initial stages of water treatment is therefore crucial to predicting DBP formation potential. In the work presented here, fluorescence spectroscopy was used for the quantitative and qualitative characterisation of organic matter during water treatment. As fluorescence measurements are rapid and non-invasive, fluorescence spectroscopy can provide an accurate assessment of organic matter removal efficiency and facilitate on-line prediction of DBP formation potential.

Here, different multivariate analysis techniques were applied for pattern recognition and calibration of fluorescence EEMs. Fluorescence data were collected from sixteen water treatment works, for raw and partially-treated (clarified) water. The decrease in fluorescence intensity between raw and clarified samples was correlated with actual organic matter removal measured as total organic carbon (TOC). Prior to calibration of the fluorescence data with TOC concentrations, different decomposition algorithms were used to analyse the EEMs, extract information on the organic matter constituents, and reduce the dimensionality of the data to enhance the efficiency of the calibration methods. In particular, robust data analysis methods (PCA, PARAFAC, and SOM) were tested. A comparison of the unsupervised pattern recognition methods, PCA and SOM, showed that SOM provided better discrimination between water treatment sites on the basis of the spectral properties of organic matter. Moreover, the SOM analysis enabled organic matter removal efficiency to be correlated with fluorescence properties. In the PARAFAC analysis a three-component model was found to be valid; however, visual inspection of the modelled emission and excitation spectra indicated that more components were present. The PARAFAC models with a higher number of components were unstable, as shown by the core consistency diagnostic and model validation.
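
The decomposition step described above can be illustrated with a minimal sketch of PCA applied to unfolded EEMs. It is not the authors' implementation: the data are synthetic, and the wavelength grids, the two Gaussian "fluorophore" peaks, the sample count, the noise level, and the function names (`gaussian`, `make_synthetic_eems`, `pca_scores`) are all assumptions made for illustration. Each EEM (excitation x emission) is flattened into one row, the matrix is mean-centred, and the principal components are taken from a singular value decomposition.

```python
import numpy as np

def gaussian(axis, centre, width):
    """Gaussian band shape on a wavelength axis (illustrative only)."""
    return np.exp(-0.5 * ((axis - centre) / width) ** 2)

def make_synthetic_eems(n_samples=32, seed=0):
    """Build synthetic EEMs from two assumed fluorophore peaks plus noise."""
    rng = np.random.default_rng(seed)
    ex = np.linspace(200.0, 400.0, 20)   # hypothetical excitation grid, nm
    em = np.linspace(300.0, 500.0, 30)   # hypothetical emission grid, nm
    peaks = np.stack([
        np.outer(gaussian(ex, 340.0, 20.0), gaussian(em, 440.0, 25.0)),
        np.outer(gaussian(ex, 280.0, 20.0), gaussian(em, 350.0, 25.0)),
    ])                                   # shape (2, n_ex, n_em)
    conc = rng.uniform(0.5, 2.0, size=(n_samples, 2))
    eems = np.einsum("sk,kij->sij", conc, peaks)
    eems += rng.normal(0.0, 0.005, size=eems.shape)
    return eems                          # shape (n_samples, n_ex, n_em)

def pca_scores(eems, n_components=2):
    """Unfold EEMs to (samples, ex*em), mean-centre, and run PCA via SVD."""
    X = eems.reshape(eems.shape[0], -1)
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = S ** 2 / np.sum(S ** 2)  # fraction of variance per component
    return scores, Vt[:n_components], explained[:n_components]

eems = make_synthetic_eems()
scores, loadings, explained = pca_scores(eems)
print(scores.shape, loadings.shape, explained)
```

Each row of `loadings` can be reshaped back to the excitation x emission grid to inspect which spectral regions drive a component, and the much smaller `scores` matrix is what would be passed on to a calibration model. PARAFAC and SOM would replace `pca_scores` at the same point in the workflow but are not shown here.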

The fluorescence data decomposed with PCA, PARAFAC, and SOM were used as input to several models calibrated against measured TOC concentrations: stepwise regression (SR), partial least squares (PLS), multiple linear regression (MLR), and a neural network with the back-propagation algorithm (BPNN). The most accurate calibrations of fluorescence against TOC removal were obtained with the PLS and BPNN techniques, whereas the SR and MLR models were less accurate.
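
As a rough illustration of the calibration step, the sketch below fits only the simplest of the listed models, MLR, and scores it by leave-one-out cross-validation. Everything numeric is an assumption: the predictors are synthetic fractional decreases in two hypothetical fluorescence peaks between raw and clarified water, the TOC removal values are generated from made-up coefficients, and the helper names (`fit_mlr`, `predict_mlr`, `loo_rmse`) are not from the paper. It does not reproduce the reported comparison between models.

```python
import numpy as np

rng = np.random.default_rng(1)
n_works = 16  # sixteen works as in the text; all values below are synthetic

# Hypothetical predictors: fractional decrease in two fluorescence peak
# intensities between raw and clarified water.
raw = rng.uniform(1.0, 2.0, size=(n_works, 2))
removed_fraction = rng.uniform(0.0, 1.0, size=(n_works, 2))
clarified = raw * (1.0 - removed_fraction)
X = (raw - clarified) / raw

# Synthetic response: TOC removal (%) from made-up coefficients plus noise.
true_coef = np.array([10.0, 40.0, 20.0])  # intercept, slope 1, slope 2
toc_removal = true_coef[0] + X @ true_coef[1:] + rng.normal(0.0, 1.0, n_works)

def fit_mlr(X, y):
    """Ordinary least squares with an intercept column."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict the response from fitted intercept and slopes."""
    return coef[0] + X @ coef[1:]

def loo_rmse(X, y):
    """Leave-one-out cross-validated root mean squared error."""
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef = fit_mlr(X[keep], y[keep])
        errors.append(predict_mlr(coef, X[i:i + 1])[0] - y[i])
    return float(np.sqrt(np.mean(np.square(errors))))

coef = fit_mlr(X, toc_removal)
rmse = loo_rmse(X, toc_removal)
print("coefficients:", coef, "LOO RMSE:", rmse)
```

With only sixteen sites, a cross-validated error of this kind is a more honest basis for comparing calibration models than the fit on the training data. A PLS or back-propagation network model could be compared by swapping in its own fit and predict functions inside `loo_rmse`.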
