Multi-view Augmented Concept in Support of Geospatial Data Retrieval

Source: Journal of Earth Science and Engineering
  Received: October 22, 2011 / Accepted: December 26, 2011 / Published: January 20, 2012.
  Abstract: Retrieving relevant geospatial data has become increasingly critical because of the growing volume of geospatial data made available to various users through distributed environments. In this context, the semantics of geospatial data is also critical, because it allows the user to understand the meaning of shared data, and the system to automatically identify and resolve semantic heterogeneity of data. However, geospatial data often lack explicit semantics, which can lead to low performance of search engines, and misinterpretation or misuse of retrieved data. In particular, the complexity of geospatial data increases the importance of explicit semantics; we have identified a lack of semantics with respect to contexts of concepts, spatiotemporal semantics, and dependencies between concepts’ features. A solution to poor semantics of geospatial data is semantic enrichment. In this paper, we propose an approach to geospatial data retrieval based on enrichment of geospatial data semantics, which contributes to solving the identified retrieval problems caused by lack of semantics. The proposed approach is based on a semantically augmented representation of the concept. A semantic enrichment system generates enriched concepts with semantic reasoning engines and data mining techniques. Then, a semantic mapping system determines the semantic correspondences between the users’ queries and the enriched concepts of databases’ ontologies. More specifically, this retrieval system is able to compute context-dependent semantic mappings; to consider spatiotemporal semantics when comparing spatiotemporal features of concepts; and to use dependencies between features to identify “missing mappings” that could not be detected otherwise. As a result, and as illustrated in a case study, the identification of relevant data sets by the retrieval system is improved, and the system is able to point out semantic heterogeneity problems that could lead to misinterpretation of data.
  Key words: Data mining, geospatial data retrieval, geospatial semantics, knowledge extraction, semantic enrichment.
   databases are meant for different purposes and developed independently; therefore, the same reality is abstracted differently. To resolve semantic heterogeneity, this meaning should be made available to machines in an explicit representation, so that it can be processed automatically. However, geospatial data often lack explicit semantics [10], which can lead to low performance of search engines, and misinterpretation or misuse of retrieved data. A popular solution to this problem is the development of ontologies, which are explicit and formal specifications of shared conceptualizations [11]. An ontology provides a vocabulary to describe a domain of interest (universe of discourse) and a specification of the meaning of the terms used in that vocabulary [12]. Ontologies are used to provide a formal
  2. Running Example to Demonstrate the Problems Caused by Poor Semantics
  Geospatial data retrieval aims at finding relevant geospatial data sets over distributed and heterogeneous data sources. The main challenges related to geospatial data retrieval are representing data semantics and optimizing the matching of the user’s query against those semantics. We first give an overview of representative approaches in geospatial data retrieval.
  The Bremen University Semantic Translator for Enhanced Retrieval (BUSTER) approach of Vögele et al. [1] proposes an information broker middleware. In this approach, each source’s semantics is formalized with a Description Logics (DL) ontology. Each source’s ontology is developed using a common vocabulary defined in a global ontology. The user can select the query concept from one of the ontologies or specify a query with necessary conditions (in terms of properties and ranges of properties). The RACER and FaCT reasoning engines are used to retrieve the concepts that are subsumed by the query concept. While the global ontology makes the different sources’ ontologies comparable to each other, assuming that local ontologies can be developed from a global ontology is not always feasible in an open and dynamic environment, where sources are developed independently.
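  The idea of retrieving concepts subsumed by a query can be illustrated with a minimal sketch. It is not the BUSTER implementation and does not use RACER or FaCT. It assumes, for illustration only, that each concept is a set of necessary conditions mapping a property name to a numeric range, and that a concept is subsumed by the query when it constrains every query property to an equal or narrower range. The names `Concept`, `is_subsumed_by` and `retrieve`, and the sample data, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A concept described by necessary conditions: property -> (min, max)."""
    name: str
    conditions: dict = field(default_factory=dict)


def is_subsumed_by(concept, query):
    """Return True if `concept` satisfies every condition of `query`.

    Simplifying assumption: subsumption holds when the concept restricts
    each query property to a range contained in the query's range.
    """
    for prop, (q_lo, q_hi) in query.conditions.items():
        if prop not in concept.conditions:
            return False  # the concept says nothing about a required property
        c_lo, c_hi = concept.conditions[prop]
        if c_lo < q_lo or c_hi > q_hi:
            return False  # the concept's range is not contained in the query's
    return True


def retrieve(query, concepts):
    """Return the names of all concepts subsumed by the query concept."""
    return [c.name for c in concepts if is_subsumed_by(c, query)]


# Hypothetical source ontology concepts and a query built from conditions.
concepts = [
    Concept("NavigableRiver", {"width_m": (50, 500), "depth_m": (2, 30)}),
    Concept("Stream", {"width_m": (1, 20)}),
    Concept("Lake", {"area_km2": (1, 100)}),
]
query = Concept("WideWatercourse", {"width_m": (10, 1000)})

print(retrieve(query, concepts))
```

  In this toy setting only `NavigableRiver` is returned: `Stream` allows widths below the query's lower bound, and `Lake` does not constrain `width_m` at all. A DL reasoner handles far richer constructs than range containment, so this sketch conveys only the retrieval pattern.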
  Lutz and Klien [2] proposed a similar approach for the discovery and the retrieval of geographic information in spatial data infrastructures (SDIs). Their approach is also based on annotations of geographic feature types with DL concepts. The DL
   of knowledge are not limited to some category, meaning that they can be linguistic knowledge, thesauri, the web, instances of concepts, documents, metadata, etc.
  Within the larger domain of information systems, several approaches for semantic enrichment in support of semantic mapping and semantic interoperability have been proposed. As reported by Su [18], existing semantic enrichment techniques for semantic mapping use a variety of resources, including shared thesauri such as WordNet, linguistic knowledge, and extensional knowledge (instances of the concepts of an ontology). For example, Tun [21] makes the assumption that “the more explicit semantics is specified in ontologies, the feasibility of matching will be greater”. In his approach, the semantics of concepts are enriched by adding concept-level knowledge, which is called “meta-knowledge”, according to a “MetaOntoModel”. The enrichment technique is integrated into a multi-system ontology matching architecture; the enrichment process is user-driven (the user has to provide the knowledge). In another semantic enrichment approach for ontology mapping, Su [18] used instances of the ontology to enrich the original ontology. In this case, instances correspond to documents that are associated with the concept; for each concept, a feature vector composed of the terms extracted from these documents is built using Natural Language Processing (NLP) techniques. The lexical database WordNet plays the role of a global ontology that provides synonyms. The architecture of the approach is composed of a text categorizer, a feature vector constructor, a mapper and a mapping refiner. One of the limitations of feature vectors is that they are only unstructured sets of words; therefore, we cannot distinguish whether one of those words corresponds, for example, to a role of the concept, a spatial relation, the description of a localization, etc. In other words, there is still a lack of knowledge on the nature of those words and how they should be considered when comparing two concepts.
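  The feature-vector idea, and its limitation, can be made concrete with a small sketch. It is not Su's system: it assumes plain term counts as features, cosine similarity as the comparison measure, and a tiny hand-written synonym table standing in for a WordNet lookup. The names `SYNONYMS`, `feature_vector` and `cosine_similarity` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for a thesaurus; a real system might query WordNet.
SYNONYMS = {"stream": "river", "watercourse": "river"}


def feature_vector(documents):
    """Build a term-count vector from the documents attached to a concept."""
    counts = Counter()
    for doc in documents:
        for token in re.findall(r"[a-z]+", doc.lower()):
            # Map each term to a canonical synonym before counting.
            counts[SYNONYMS.get(token, token)] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two concepts from different ontologies, each with one associated document.
river_a = feature_vector(["A river flows near the bridge"])
river_b = feature_vector(["The stream flows under a bridge"])

print(cosine_similarity(river_a, river_b))
```

  The vectors record only which words occur and how often. Nothing in them indicates that "near" or "under" expresses a spatial relation while "bridge" names another concept, which is the lack of structure noted above.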
   performance of geospatial data retrieval engines. Despite the widespread use of ontologies to formalize the semantics of geospatial data, it cannot be assumed, as demonstrated in this paper, that the provided semantics are sufficient to ensure understanding of shared data by users and to resolve more complex semantic heterogeneity problems, such as heterogeneity of spatiotemporal semantics. We have proposed a new approach and architecture for geospatial data retrieval which addresses the specific problems caused by the lack of geospatial data semantics.
  The multi-view augmented concept model (MVAC) was used as a basis for the approach. This model is an enrichment of the traditional, property-based representation of concepts, to which it adds contexts, and associated contextual views of the concept, spatiotemporal semantics, as well as dependencies between features of concepts. These additional features were useful to improve geospatial data retrieval by identifying more semantic mappings between the query and the concepts of databases’