Title: On the law of Zipf-Mandelbrot for multi-word phrases
Authors: EGGHE, Leo
Issue Date: 1999
Citation: Journal of the American Society for Information Science, 50(3). p. 233-241
Abstract: The paper studies the probabilities of the occurrence of m-word phrases (m = 2, 3, ...) in relation to the probabilities of occurrence of the single words. It is well known that, in the latter case, the law of Zipf is valid (i.e. a power law). We prove that in the case of m-word phrases (m ≥ 2) this is not the case. We present two independent proofs of this. We furthermore show that, in case we want to approximate the found distribution by Zipf's law, we obtain exponents β_m in this power law for which the sequence (β_m)_{m∈ℕ} is strictly decreasing. This explains experimental findings of Smith and Devine, Hilberg and Meyer. Our results should be compared with a heuristic finding of Rousseau, who states that the law of Zipf-Mandelbrot is valid for multi-word phrases. He, however, uses other, less classical, assumptions than we do.
URI: http://hdl.handle.net/1942/7405
DOI: 10.1002/(SICI)1097-4571(1999)50:3<233::AID-ASI6>3.0.CO;2-8
ISI #: 000078555800006
ISSN: 0002-8231
Type: Journal Contribution
Validation: ecoom, 2000
