Entrepôts, Représentation et Ingénierie des Connaissances
Laboratory publications

- A Study of Synthetic Oversampling for Twitter Imbalanced Sentiment Analysis

Author(s): Ah-Pine J., Soriano-Morales Edmundo-Pavel

Conference: Workshop on Interactions between Data Mining and Natural Language Processing (DMNLP 2016) (Riva del Garda, IT, 2016-09-23)

Ref HAL: hal-01504684_v1

The majority of Twitter sentiment analysis systems implicitly assume that the class distribution is balanced, whereas in practice it is usually skewed. We argue that Twitter opinion mining using learning methods should be addressed in the framework of imbalanced learning. In this work, we present a study of synthetic oversampling techniques for tweet-polarity classification. The experiments we conducted on three publicly available datasets show that these methods can improve the recognition of the minority class as well as the geometric mean criterion.
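
As an illustration of the two ideas named in the abstract, the sketch below generates synthetic minority examples by interpolation and computes a geometric mean score. It is not the authors' implementation: the abstract does not say which oversampling methods were studied, so SMOTE-style interpolation between nearest minority neighbours is an assumption here, as is the common definition of the geometric mean as the geometric mean of per-class recalls. The function names `smote_like` and `geometric_mean` and the toy feature vectors are hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority point and one of its k nearest minority neighbours.
    Assumption: SMOTE-style interpolation, not the paper's exact method."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x among the other minority points
        others = [p for p in minority if p is not x]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


def geometric_mean(y_true, y_pred, labels):
    """Geometric mean of per-class recalls (assumed definition).
    Every label in `labels` must occur at least once in y_true."""
    recalls = []
    for c in labels:
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return math.prod(recalls) ** (1 / len(recalls))


# Toy usage: three minority feature vectors, five synthetic ones requested.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_like(minority, n_new=5)

# A classifier that misses half of the minority class scores well on
# accuracy (5 of 6 correct) but is penalised by the geometric mean.
y_true = ["pos", "pos", "neg", "neg", "neg", "neg"]
y_pred = ["pos", "neg", "neg", "neg", "neg", "neg"]
score = geometric_mean(y_true, y_pred, labels=["pos", "neg"])
```

Because the score is a product of recalls, a classifier that ignores the minority class entirely gets a geometric mean of zero regardless of its accuracy, which is why the criterion suits skewed class distributions.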