Although XML Document Type Definitions provide a mechanism for specifying, in machine-readable form, the syntax of an XML markup language, there is no comparable mechanism for specifying the semantics of an XML vocabulary. That is, there is no way to characterize the meaning of XML markup so that the facts and relationships represented by the occurrence of XML constructs can be explicitly, comprehensively, and mechanically identified. This has serious practical and theoretical consequences. On the positive side, XML constructs can be assigned arbitrary semantics and used in application areas not foreseen by the original designers. On the less positive side, both content developers and application engineers must rely upon prose documentation or, worse, conjectures about the intention of the markup language designer. This process is time-consuming, error-prone, incomplete, and unverifiable, even when the language designer properly documents the language. In addition, the lack of a substantial body of research in markup semantics means that digital document processing is undertheorized as an engineering application area. Although some related projects are underway (XML Schema, RDF, the Semantic Web) and provide relevant results, none of these projects directly and comprehensively addresses the core problems of XML markup semantics.
This paper (i) summarizes the history of the concept of markup meaning, (ii) characterizes the specific problems that motivate the need for a formal semantics for XML, and (iii) describes an ongoing research project, the BECHAMEL Markup Semantics Project, that is attempting to develop such a semantics.