Entrepôts, Représentation et Ingénierie des Connaissances
- Topic modeling and hypergraph mining to analyze the EGC conference history

Author(s): Truică C.-O.

Conference: Conférence sur l'Extraction et la Gestion des Connaissances (Reims, FR, 2016-01-18)
Conference proceedings: Actes de la 16ème Conférence sur l'Extraction et la Gestion des Connaissances, vol. p.383-394 (2016)

HAL ref: hal-01442858_v1

Each year, the EGC conference gathers researchers and practitioners from the knowledge discovery and management domain to present their latest advances. This year’s edition features an open challenge that encourages participants to leverage the rich EGC anthology, which spans 2004 to 2015. The ultimate goal is to highlight the dynamics of the conference history and to get a glimpse of the coming years. In this context, we first describe our methodology for inferring the latent topics that pervade this corpus using non-negative matrix factorization. Based on the discovered topics and other properties of the articles (e.g., authors, affiliations), we shed light on interesting facts about both the topical and collaborative structures of the EGC society. Second, we employ a hypergraph itemset extraction process to discover existing but latent relations between authors or between topics. We also propose topic-author and author-author recommendations using a content-based approach. Lastly, we describe a Web interface for browsing this collection of articles, complemented with the discovered knowledge.

Comments: Best paper award - «Challenge» session