Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/24959

Title: Clustering multiply imputed multivariate high-dimensional longitudinal profiles
Authors: Bruckers, Liesbeth; Molenberghs, Geert; Dendale, Paul
Issue Date: 2017
Publisher: WILEY
Citation: BIOMETRICAL JOURNAL, 59(5), p. 998-1015
Abstract: We propose a method to cluster multivariate functional data with missing observations. Analysis of functional data often relies on dimension reduction techniques such as principal component analysis (PCA), which require complete data matrices. The data are therefore completed by means of multiple imputation, and each imputed data set is then submitted to a cluster procedure. The final partition of the data, which summarizes the partitions obtained for the imputed data sets, is obtained by means of ensemble clustering. The uncertainty in cluster membership due to missing data is characterized by the agreement between the members of the ensemble and by the fuzziness of the consensus clustering. The method is illustrated on heart failure (HF) data. Daily measurements of four biomarkers (heart rate, diastolic blood pressure, systolic blood pressure, and weight) were used to cluster the patients. To normalize the distributions of the longitudinal outcomes, the data were transformed with the natural logarithm. A cubic spline basis with 69 basis functions was employed to smooth the profiles. The proposed algorithm indicates the existence of a latent structure and divides the HF patients into two clusters that show different evolutions in blood pressure and weight. In general, cluster results are sensitive to the choices made. The same holds for the proposed approach: alternative choices for the distance measure, the procedure to optimize the objective function, the scree-test threshold, or the number of principal components used in the approximation of the surrogate density could all influence the final partition. For the HF data set, the final partition depends on the number of principal components used in the procedure.
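Illustration (not part of the published record): the ensemble step described in the abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes each imputed data set has already been clustered, and it swaps in a simple co-association (evidence accumulation) consensus with a hypothetical 0.5 threshold in place of the paper's consensus procedure. The function names and the toy partitions are invented for illustration.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    m = len(partitions)
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def consensus_partition(co, threshold=0.5):
    """Connected components of the graph linking pairs with co-association > threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def membership_agreement(co, consensus):
    """Per item: mean co-association with the other members of its consensus cluster."""
    n = len(co)
    agreement = []
    for i in range(n):
        peers = [j for j in range(n) if j != i and consensus[j] == consensus[i]]
        if not peers:
            agreement.append(1.0)  # a singleton trivially agrees with itself
        else:
            agreement.append(sum(co[i][j] for j in peers) / len(peers))
    return agreement


# Toy example: cluster labels for six patients from three imputed data sets.
# Labels are arbitrary per imputation (note the label switch in the second one).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],  # the third patient changes cluster in this imputation
]

co = co_association(partitions)
consensus = consensus_partition(co)
agreement = membership_agreement(co, consensus)

print("consensus:", consensus)
print("agreement:", [round(a, 3) for a in agreement])
```

Because the co-association matrix only records whether two patients share a cluster, it is unaffected by label switching between imputations. A patient whose agreement falls well below 1 is one whose cluster membership depends on the imputed values.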
Notes: [Bruckers, Liesbeth; Molenberghs, Geert; Dendale, Paul] Univ Hasselt, I BioStat, Agoralaan, B-3590 Diepenbeek, Belgium. [Molenberghs, Geert] Katholieke Univ Leuven, I BioStat, B-3000 Leuven, Belgium. [Dendale, Paul] Jessa Hosp, Heart Ctr Hasselt, B-3500 Hasselt, Belgium.
URI: http://hdl.handle.net/1942/24959
DOI: 10.1002/bimj.201500027
ISI #: 000408988700020
ISSN: 0323-3847
Category: A1
Type: Journal Contribution
Appears in Collections: Research publications

Files in This Item:

Description | Size | Format
Published version | 755.58 kB | Adobe PDF
Peer-reviewed author version | 556.42 kB | Adobe PDF
