|Title: ||Hidden Markov Modellen voor het infereren van XSDs (Hidden Markov Models for Inferring XSDs)|
|Authors: ||Fonteyn, Dominique|
|Advisors: ||NEVEN, Frank|
|Issue Date: ||2011|
|Publisher: ||tUL Diepenbeek|
|Abstract: ||XML is one of the most popular languages for storing data on the web. Schemas specify the structure of XML documents, and the presence of a schema enables automatic validation. However, half of the online XML fragments do not refer to a schema, and about two-thirds of the XSDs are not valid with respect to the W3C specifications. We therefore look for algorithms that infer an XSD from a set of XML fragments. In this thesis we explore such inference techniques. The problem boils down to inferring regular expressions. However, the class of all regular expressions cannot be learned from positive data alone, so we restrict ourselves to single occurrence regular expressions (SOREs). We present iXSD, an algorithm for inferring local SOXSDs. Next, we consider k-occurrence regular expressions (kOREs), which are harder to infer. We focus on hidden Markov models (HMMs) to infer kOREs with iDRegEx. We combine these algorithms to infer local k-OXSDs. We present a similarity measure for two XSDs, which we use to evaluate the experimental results. The results show that the combined approach does not perform well on precision and generalisation, but performs rather well on similarity and runtime.|
|Notes: ||Master in Computer Science (Databases)|
|Type: ||Theses and Dissertations|
|Appears in Collections: ||Master theses|