Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/805

Title: Topological aspects of information retrieval
Authors: EGGHE, Leo; ROUSSEAU, Ronald
Issue Date: 1998
Publisher: Wiley
Citation: Journal of the American Society for Information Science, 49(13). p. 1144-1160
Abstract: Let (DS, QS, sim) be a retrieval system consisting of a document space DS, a query space QS, and a function sim expressing the similarity between a document and a query. Following D. M. Everett and S. C. Cater (1992), we introduce topologies on the document space. These topologies are generated by the similarity function sim and the query space QS. Three topologies are studied: the retrieval topology, the similarity topology, and the (pseudo-)metric topology. It is shown that the retrieval topology is the coarsest of the three, while the (pseudo-)metric topology is the finest. These three topologies are generally different, reflecting distinct topological aspects of information retrieval. We present necessary and sufficient conditions for these topologies to be equal. Several examples of topological retrieval systems are presented. One of these examples is a vector space model that simplifies the Everett-Cater model yet has a more diversified spectrum of topological properties. Finally, it is shown that information retrieval based on Boolean operators is an intrinsic part of the general topological model. This is a major motivation for introducing topologies in theoretical IR models.
URI: http://hdl.handle.net/1942/805
DOI: 10.1002/(SICI)1097-4571(1998)49:13<1144::AID-ASI2>3.0.CO;2-Z
ISI #: 000076489900002
ISSN: 0002-8231
Type: Journal Contribution
Validation: ecoom, 1999
Appears in Collections: Research publications

Files in This Item:

Description                    Size       Format
Published version              144.7 kB   Adobe PDF
Peer-reviewed author version   841.17 kB  Adobe PDF
