Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/9220

Title: Optimizing Schema Languages for XML: Numerical Constraints and Interleaving
Authors: Gelade, Wouter; Martens, Wim; Neven, Frank
Issue Date: 2009
Citation: SIAM Journal on Computing, 38(5). p. 2021-2043
Abstract: The presence of a schema offers many advantages in processing, translating, querying, and storage of XML data. Basic decision problems such as equivalence, inclusion, and nonemptiness of intersection of schemas form the basic building blocks for schema optimization and integration, and algorithms for static analysis of transformations. It is thereby paramount to establish the exact complexity of these problems. Most common schema languages for XML can be adequately modeled by some kind of grammar with regular expressions at right-hand sides. In this paper, we observe that, apart from the usual regular operators of union, concatenation, and Kleene-star, schema languages also allow numerical occurrence constraints and interleaving operators. Although the expressiveness of these operators remains within the regular languages, the presence or absence of these operators has a significant impact on the complexity of the basic decision problems. We present a complete overview of the complexity of the basic decision problems for DTDs, XSDs, and Relax NG with regular expressions incorporating numerical occurrence constraints and interleaving. We also discuss chain regular expressions and the complexity of the schema simplification problem incorporating the new operators.
URI: http://hdl.handle.net/1942/9220
DOI: 10.1137/070697367
ISI #: 000264353000015
ISSN: 0097-5397
Category: A1
Type: Journal Contribution
Validation: ecoom, 2010
Appears in Collections: Research publications

Files in This Item:

Description                     Size        Format
Published version               337.05 kB   Adobe PDF
Peer-reviewed author version    333.68 kB   Adobe PDF
