Civil-Comp Proceedings
ISSN 1759-3433
CCP: 108
PROCEEDINGS OF THE FIFTEENTH INTERNATIONAL CONFERENCE ON CIVIL, STRUCTURAL AND ENVIRONMENTAL ENGINEERING COMPUTING
Edited by: J. Kruis, Y. Tsompanakis and B.H.V. Topping
Paper 273

Detecting Concepts in Construction Project Documents using Statistical Measures for Semantic Similarity

D. Nedeljkovic and M. Kovacevic

Department of Construction Project Management, Faculty of Civil Engineering, Belgrade, Serbia

Full Bibliographic Reference for this paper
D. Nedeljkovic, M. Kovacevic, "Detecting Concepts in Construction Project Documents using Statistical Measures for Semantic Similarity", in J. Kruis, Y. Tsompanakis, B.H.V. Topping, (Editors), "Proceedings of the Fifteenth International Conference on Civil, Structural and Environmental Engineering Computing", Civil-Comp Press, Stirlingshire, UK, Paper 273, 2015. doi:10.4203/ccp.108.273
Keywords: document management, information retrieval, automatic concept detection, pointwise mutual information, semantic similarity.

Summary
This paper addresses the problem of automatic concept detection in construction project documentation. The aim is to make real-time information retrieval more efficient for all stakeholders in cases where documents lack previously defined metadata or where semantic knowledge is not taken into account. Introducing significant concepts in a user-specific problem domain would improve the retrieval of relevant documents. Concepts, represented as word pairs, were ranked using different statistical measures for semantic similarity, each of which compares the observed co-occurrence of the two words with the co-occurrence expected under a null model. Experiments suggested that using the statistical measures in different combinations yielded better performance than using them individually. The proposed approach was tested on several data sets compiled from documents originating from a smelting project in Bor, Republic of Serbia. Common information retrieval measures, precision and recall, were calculated for different combinations of word span, context scope and applied statistical measures, and are discussed taking into account the complexity and specificity of the observed construction project documentation.
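To make the ranking idea concrete, the following is a minimal sketch of scoring word pairs with pointwise mutual information (one of the keywords above) and evaluating the result with precision and recall. It is an illustration under stated assumptions, not the authors' implementation: the function names `pmi_scores` and `precision_recall`, the `span` and `min_count` parameters, the independence null model, and the toy token lists are all hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def pmi_scores(docs, span=1, min_count=1):
    """Rank unordered word pairs by pointwise mutual information (PMI).

    docs      : list of token lists (assumed already tokenised)
    span      : how many following words count as co-occurring (assumption)
    min_count : drop pairs seen fewer times than this (assumption)
    """
    word_counts = Counter()
    pair_counts = Counter()
    total_words = 0
    total_pairs = 0
    for tokens in docs:
        word_counts.update(tokens)
        total_words += len(tokens)
        for i, w in enumerate(tokens):
            # Pair each word with the next `span` words in the same document.
            for v in tokens[i + 1 : i + 1 + span]:
                if v != w:
                    pair_counts[tuple(sorted((w, v)))] += 1
                    total_pairs += 1

    scores = {}
    for (w, v), n in pair_counts.items():
        if n < min_count:
            continue
        observed = n / total_pairs
        # Null model: the two words occur independently of each other.
        expected = (word_counts[w] / total_words) * (word_counts[v] / total_words)
        scores[(w, v)] = math.log2(observed / expected)
    # Highest PMI first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def precision_recall(retrieved, relevant):
    """Precision and recall of retrieved pairs against a reference set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data, invented for illustration only.
docs = [
    ["cooling", "tower", "foundation"],
    ["cooling", "tower", "inspection"],
    ["foundation", "inspection", "report"],
]
ranked = pmi_scores(docs, span=1, min_count=2)
top_pairs = [pair for pair, _ in ranked]
```

The `min_count` threshold is a design choice in this sketch: PMI tends to overrate pairs built from rare words, so a frequency cut-off is a common way to keep such pairs out of the ranking. Calling `precision_recall(top_pairs, reference_pairs)` with a manually prepared set of reference concepts would then give the two retrieval measures for one setting of `span`.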
