Entrepôts, Représentation et Ingénierie des Connaissances
Laboratory publications

- Model-based co-clustering for mixed type data

Author(s): Selosse M., Jacques J., Biernacki C.

(Document without bibliographic reference)

Ref HAL: hal-01893457_v1

For decades, many studies have shown the value of clustering for revealing groups of observations. More recently, with the emergence of high-dimensional datasets containing a very large number of features, co-clustering techniques have been proposed to produce groups of observations and groups of features simultaneously. By summarizing the dataset in blocks (each block being the crossing of a row cluster and a column cluster), co-clustering can sometimes give a better summary of the data and their inherent structure. The Latent Block Model (LBM) is a well-known model for co-clustering. However, datasets with features of different types (here called mixed type datasets) are becoming more common, and the LBM is not directly applicable to them. The present work extends the usual LBM to the so-called Multiple Latent Block Model (MLBM), which can handle mixed type datasets. Inference is carried out with a Stochastic EM algorithm embedding a Gibbs sampler, and a model selection criterion is defined to choose the number of row and column clusters. The method was successfully applied to simulated and real datasets.
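
To make the notion of a block concrete, the following minimal sketch summarizes a small matrix by averaging the entries of each block, where a block is the crossing of a row cluster and a column cluster. It is only an illustration and not the authors' method: the function name `block_means`, the toy matrix, and the fixed partitions are assumptions made for this example. In the LBM and MLBM the partitions are latent and would be estimated (in this work, with the Stochastic EM algorithm embedding a Gibbs sampler), and for mixed type data each feature type would need its own block distribution, which the sketch does not cover.

```python
from collections import defaultdict


def block_means(data, row_labels, col_labels):
    """Average the entries of each block (row cluster k, column cluster l).

    Hypothetical helper for illustration: the partitions are taken as given,
    whereas a co-clustering model would estimate them from the data.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            key = (row_labels[i], col_labels[j])
            sums[key] += value
            counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Toy 4x4 binary matrix; the partitions below are assumed, not estimated.
data = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]
row_labels = [0, 0, 1, 1]  # row cluster of each observation
col_labels = [0, 0, 1, 1]  # column cluster of each feature

summary = block_means(data, row_labels, col_labels)
for (k, l), mean in sorted(summary.items()):
    print(f"block (row cluster {k}, column cluster {l}): mean = {mean:.2f}")
```

The sixteen cells are reduced to four block-level values, which is the kind of compact summary the abstract refers to.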