Management of Knowledge Sources Supported by Domain Ontologies: Building and Construction Case Study

Author: Celson Lima, Ruben Costa
Date: 01 January 2015
DOI: http://doi.org/10.1002/isaf.1361
Published date: 01 January 2015
MANAGEMENT OF KNOWLEDGE SOURCES SUPPORTED BY
DOMAIN ONTOLOGIES: BUILDING AND CONSTRUCTION CASE
STUDY
RUBEN COSTA a* AND CELSON LIMA b
a Centre of Technology and Systems, UNINOVA, Monte de Caparica, Portugal
b Federal University of Western Pará, UFOPA, Santarém, Brazil
SUMMARY
This paper introduces a novel conceptual framework to support the creation of knowledge representations based on
enriched semantic vectors, using the classical vector space model approach extended with ontological support. This
work is focused on collaborative engineering projects where knowledge plays a key role in the process.
Collaboration is the arena, engineering projects are the target and knowledge is the currency used to provide
harmony into the arena since it can potentially support innovation and, hence, a successful collaboration. The
test bed for the assessment of the approach comes from the Building and Construction sector, which is challenged
with significant problems for exchanging, sharing and integrating information among actors. Semantic gaps or lack
of meaning definition at the conceptual and technical levels, for example, are problems fundamentally originated
through the employment of representations to map the ‘world’ into models in an endeavour to anticipate other
actors’ views, vocabulary and even motivations. One of the primary research challenges addressed in this work
relates to the process of formalization and representation of document contents, where most existing approaches are limited and
only take into account the explicit, word-based information in the document. The research described in this paper
explores how traditional knowledge representations can be enriched through incorporation of implicit information
derived from the complex relationships (semantic associations) modelled by domain ontologies with the addition of
information presented in documents, by providing a baseline for facilitating knowledge interpretation and sharing
between humans and machines. Preliminary results were collected using a clustering algorithm for document
classification, and they indicate that the proposed approach improves the precision and recall of classifications.
Future work and open issues are also discussed. Copyright © 2015 John Wiley & Sons, Ltd.
Keywords: knowledge sharing; semantic interoperability; ontology engineering; unsupervised document
classification; vector space model
1. INTRODUCTION
Engineering companies are project oriented, and successful projects are their way to keep market share
as well as to win new markets. By their very nature, engineering projects are normally knowledge
intensive, relying on individuals (or groups) holding the appropriate knowledge (a combination of existing,
recycled and brand new knowledge) to provide the required breakthrough. ‘Knowledge’ is considered the
key asset of modern organizations; as such, industry and academia have been working to provide the
appropriate support to leverage this asset (Firestone & McElroy, 2003). A few examples of this
work are: the extensive work on knowledge models and knowledge management tools, the rise of
the so-called knowledge engineering area, the myriad of projects around controlled vocabularies
(such as ontologies, taxonomies, dictionaries and thesauri) and the academic offer of knowledge-centred
courses (graduation, master and doctoral).
* Correspondence to: Ruben Costa, Centre of Technology and Systems, UNINOVA, Monte de Caparica, Portugal. E-mail: rddc@uninova.pt
Copyright © 2015 John Wiley & Sons, Ltd.
INTELLIGENT SYSTEMS IN ACCOUNTING, FINANCE AND MANAGEMENT
Intell. Sys. Acc. Fin. Mgmt. 22, 29–64 (2015)
Published online in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/isaf.1361
Like many information retrieval tasks, knowledge representation (KR) and classification techniques
depend on using content-independent metadata (e.g. author, creation date) and/or content-dependent
metadata (e.g. words in the document). However, such approaches tend to be inherently limited by
the information that is explicit in the documents, which introduces a problem. For instance, in the
situation where words like ‘architect’ and ‘design’ do not co-occur frequently, statistical techniques
will fail to make any correlation between them (Nagarajan et al., 2007). Furthermore, existing
information retrieval techniques are based upon indexing keywords extracted from documents and then
creating a vector of terms. Unfortunately, keywords or index terms alone often do not adequately
capture the document contents, resulting in poor retrieval and indexing performance. Keyword
indexing is still widely used in commercial systems because it is by far the most viable way to
process large amounts of text, despite the high computational power and cost required to update and
maintain the indexes.
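The limitation of purely word-based vectors can be illustrated with a minimal sketch. The snippets, the whitespace tokenizer and the function names below are assumptions made for illustration only and are not taken from the paper. Two texts on related topics that share no words obtain a cosine similarity of zero, regardless of how closely their subjects are related.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector for a whitespace-tokenised text."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical snippets: doc_a and doc_b are topically related but share no words.
doc_a = term_vector("architect reviews building plans")
doc_b = term_vector("design of the new facade")
doc_c = term_vector("architect approves building plans")

similarity_ab = cosine(doc_a, doc_b)  # no overlapping terms
similarity_ac = cosine(doc_a, doc_c)  # three of four terms shared
```

In this toy setting only the pair with literal word overlap is recognised as similar, which is the gap that external semantic knowledge is meant to close.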
Such challenges motivate the following question: how can we intuitively alter and add contents to a
document’s term vector using semantic knowledge available in domain ontologies, and thereby provide
classifiers with more richness than is directly found in the document?
An ontology is used to represent knowledge in the form of concept hierarchies (taxonomies),
interrelationships between concepts, and axioms (Noy & Hafner, 1997; Noy & McGuinness, 2002). Axioms, along with
the hierarchical structure and relationships, define the semantics, that is, the meaning of the concepts. Ontologies are
thus the foundation of content-based information access and semantic interoperability over the Web. Here,
we propose to use knowledge available in domain ontologies in order to support the process of representing
knowledge sources (KSs; e.g. project reports, meeting minutes, description of problems/solutions), picking a
case study focused on the Building and Construction (B&C) sector, thus improving the classification of such
KSs. Our hypothesis is that semantic knowledge from domain ontologies can be used to enrich statistical
term vectors. Therefore, the main contribution of this work is not to develop new classification
algorithms or improve any of the current ones, but to modify the document term vectors in a
way that lets us measure the effect of such semantic enrichment on existing classifiers.
The information contained in ontologies can be incorporated into many representation schemes and
algorithms. In this paper we focus on a particular representation scheme based on vector space models
(Salton et al., 1975), which represents documents as a vector of their most relevant terms, considered to
be the best discriminators for each document space. The main aim is to understand how useful external
domain knowledge is to the process of KR: what the trade-offs may be and when it makes sense to
bring in such knowledge. In order to do this, we intuitively alter basic tf–idf (term frequency–inverse
document frequency) (Salton & Buckley, 1988) weighted document term vectors (statistical term vectors)
with the help of a domain ontology to generate new semantic term vectors for all documents to be
represented.
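As an illustration of the idea, and not of the enrichment process itself (which the paper details in Section 3), the following sketch builds tf–idf weighted term vectors and then adds weight for concepts that a toy ontology relates to the terms already present. The corpus, the `related` mapping, the `boost` factor and the additive rule are all hypothetical assumptions; tf is taken here as the raw count divided by document length and idf as the natural logarithm of the number of documents over the document frequency, which is only one of several common variants.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one tf-idf weighted term vector (dict) per tokenised document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def enrich(vector, related, boost=0.5):
    """Add weight to ontology-related concepts (illustrative rule only).

    `related` maps a term to the concepts an ontology links it to; each
    linked concept receives `boost` times the weight of the source term.
    """
    enriched = dict(vector)
    for term, weight in vector.items():
        for concept in related.get(term, ()):
            enriched[concept] = enriched.get(concept, 0.0) + boost * weight
    return enriched


# Hypothetical toy corpus and ontology relations (not from the paper).
docs = [
    "architect reviews building plans".split(),
    "design of the new facade".split(),
    "meeting minutes of the project".split(),
]
related = {"architect": ["design"], "design": ["architect"]}

statistical = tf_idf_vectors(docs)                     # statistical term vectors
semantic = [enrich(v, related) for v in statistical]   # enriched term vectors
```

After enrichment, the first two documents share the dimensions ‘architect’ and ‘design’ even though neither word co-occurs in the original texts, so any existing classifier operating on the vectors can pick up the relation without being modified.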
The performance of the proposed approach is evaluated using an unsupervised document classification
algorithm. Document clustering has become one of the main techniques for organizing large volumes
of documents into a small number of meaningful clusters (Chen et al., 2010). However, several
challenges for document clustering still exist, such as high dimensionality, scalability, accuracy,
meaningful cluster labels, overlapping clusters and extracting semantics from the texts. Additionally,
performance is directly related to the quantity and quality of information within the knowledge base (KB) it
runs upon. Until, if ever, ontologies and metadata (and the Semantic Web itself) become a global
commodity, the lack or incompleteness of available ontologies and KBs is a limitation likely to have to be
lived with in the mid-term (Castells et al., 2007).
This paper is structured as follows. Section 2 discusses the conceptual model supporting this work,
including knowledge modelling and representation, and the semantic referential. Section 3 explains the
semantic enrichment process. Section 4 describes the experimental scenario used to validate and assess
the work. Section 5 presents the experiment conducted and the assessment itself. Section 6 discusses
the results achieved and the related work. Finally, Section 7 concludes the paper and points
to the future work to be carried out.
2. KNOWLEDGE MODELLING AND REPRESENTATION
The conceptual foundations of the work presented here are grounded on collaboration, knowledge and
semantics (Figure 1). Collaboration is related to the work performed by a group of actors in the context
of development of engineering projects. Knowledge is the ‘currency’ exchanged among actors
collaborating within a project. Semantics is represented by the use of text mining techniques with the support
of a domain ontology, which guarantees that knowledge generated during each project is captured,
transformed and mined in order to support actors in having a common understanding of the KSs that
are exchanged.
The overall aim of the semantic enrichment process addressed by this work is to specify, develop and
evaluate, with the support of knowledge experts, a set of capabilities that promotes effective and
consistent KRs (including capturing, indexing and classifying) across corporate domain knowledge,
expressed by a domain ontology, within collaborative construction environments.
Distributed knowledge workers and teams lack proactive system support for seamless and natural
collaboration on applications like problem solving, conflict resolution, knowledge sharing and
receiving expert advice on demand. The ambition is to have innovative solutions to establish effective
partnerships that enable collaboration, drive creativity, improve productivity, and enable one to take a
holistic approach to implement project phases. Such collaborative working environments of the future
will be based on enhanced communication, advanced simulation services, improved visualization,
natural interaction and especially knowledge.
Figure 1. Conceptual foundations of the work
